Modeling and predicting chiral stationary phase enantioselectivity: An efficient random forest classifier using an optimally balanced training dataset and an aggregation strategy

EarlyView Article

  • Published: Mar 2, 2018
  • Author: Patrick Piras, Robert Sheridan, Edward C. Sherer, Wes Schafer, Christopher J. Welch, Christian Roussel
  • Journal: Journal of Separation Science


Predicting whether a chiral column will be effective is a daily task for many analysts. Moreover, finding the best chiral column for separating a particular racemic compound is mostly a matter of trial and error, which may take up to a week in some cases. In this study, we developed a novel prediction approach that combines a random forest classifier with an optimized discretization method for handling enantioselectivity as a continuous variable. Using the optimization results, models were trained on data sets divided into four enantioselectivity classes. The best model performance was achieved by over‐sampling the minority classes (α ≤ 1.10 and α ≥ 2.00), down‐sampling the majority class (1.2 ≤ α < 2.0), and aggregating multicategory predictions into binary classifications. We tested our method on 41 chiral stationary phases using layered fingerprints as descriptors. Experimental results show that this learning methodology was successful in terms of average area under the receiver operating characteristic curve, Kappa indices, and F‐measure for structure‐based prediction of the enantioselective behavior of 34 chiral columns.
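
The data preparation and aggregation steps described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The abstract names only three of the four class ranges, so the boundary set `BOUNDS` (in particular the interval between 1.10 and 1.2 as a fourth class) is an assumption, as are the helper names `discretize`, `rebalance`, and `aggregate_binary`, the fixed per-class `target` size, and the choice of which classes count as "separated" in the binary aggregation. The random forest itself is omitted; any classifier that returns four class probabilities could sit between the last two steps.

```python
import random
from collections import defaultdict

# Assumed cut points. The abstract mentions alpha <= 1.10, 1.2 <= alpha < 2.0
# and alpha >= 2.00; treating 1.10 < alpha < 1.2 as the fourth class is a guess.
BOUNDS = (1.10, 1.20, 2.00)


def discretize(alpha):
    """Map a continuous enantioselectivity value to a class index 0..3."""
    if alpha <= BOUNDS[0]:
        return 0
    if alpha < BOUNDS[1]:
        return 1
    if alpha < BOUNDS[2]:
        return 2
    return 3


def rebalance(samples, target, seed=0):
    """Resample every class to exactly `target` items.

    samples: list of (features, class_index) pairs.
    Classes larger than `target` are down-sampled without replacement;
    smaller classes are over-sampled by drawing duplicates with replacement.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item in samples:
        by_class[item[1]].append(item)

    balanced = []
    for cls in sorted(by_class):
        items = by_class[cls]
        if len(items) >= target:
            balanced.extend(rng.sample(items, target))
        else:
            balanced.extend(items)
            balanced.extend(rng.choices(items, k=target - len(items)))
    return balanced


def aggregate_binary(class_probs, first_positive_class=2):
    """Collapse four class probabilities into one 'separated' probability.

    Which classes count as positive is an assumption of this sketch.
    """
    return sum(class_probs[first_positive_class:])
```

For instance, `aggregate_binary` applied to the four-class probabilities predicted for one compound yields a single score that can be thresholded to decide whether a column is worth trying.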

Copyright © 2018 John Wiley & Sons, Inc. All Rights Reserved