ClassCK: Classifier Construction Kit
Brian T. Luke (lukeb@ncifcrf.gov)

Goal: Omics investigations (proteomics, metabolomics, metabonomics, ...) generate a large amount of data on a relatively small number of samples. There is growing interest in using these data to distinguish one Class of samples from another (e.g., healthy versus diseased, or prostate cancer versus kidney cancer). The Classifier Construction Kit (ClassCK) is a collection of routines that allows users to construct classification models from their own or publicly available datasets.

A classifier, as constructed here, uses a small number of features, a distance metric, and a procedure that predicts the Class of an unknown sample by comparing it to a group of known samples with similar feature values. Each classifier is given a score based on how well it determines the Class of a set of training samples, on how compact the resulting clusters are (when a clustering classification method is used), or on a combination of the two. Given a distance metric and a classification method, ClassCK uses a modified Evolutionary Programming method to search for the best set of features.

Nine different distance metrics are available.
In addition, there are six classification methods available.
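The ingredients described above (a small feature subset, a distance metric, a neighbour-based prediction rule, a training-set score, and an evolutionary search over feature subsets) can be sketched as follows. This is an illustration only, not ClassCK's code: the Euclidean metric, the leave-one-out 1-nearest-neighbour rule, the single-swap mutation, and every function name here are assumptions chosen for brevity, and the search loop is a generic evolutionary scheme rather than ClassCK's modified Evolutionary Programming method.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def project(sample, features):
    """Keep only the selected feature columns of one sample."""
    return [sample[i] for i in features]

def loo_accuracy(samples, labels, features, metric=euclidean):
    """Score a feature subset: leave-one-out 1-nearest-neighbour accuracy."""
    reduced = [project(s, features) for s in samples]
    correct = 0
    for i, query in enumerate(reduced):
        # Nearest *other* training sample under the chosen metric.
        nearest = min((j for j in range(len(reduced)) if j != i),
                      key=lambda j: metric(query, reduced[j]))
        correct += labels[nearest] == labels[i]
    return correct / len(samples)

def mutate(features, n_total, rng):
    """Swap one selected feature for an unselected one (needs n_select < n_total)."""
    child = list(features)
    unused = [f for f in range(n_total) if f not in child]
    child[rng.randrange(len(child))] = rng.choice(unused)
    return tuple(sorted(child))

def search(samples, labels, n_select=2, pop_size=4, generations=10, seed=0):
    """Toy evolutionary search: each parent yields one mutated offspring,
    then the best pop_size subsets of parents plus offspring survive."""
    rng = random.Random(seed)
    n_total = len(samples[0])

    def score(features):
        return loo_accuracy(samples, labels, features)

    population = sorted({tuple(sorted(rng.sample(range(n_total), n_select)))
                         for _ in range(pop_size)})
    for _ in range(generations):
        offspring = [mutate(p, n_total, rng) for p in population]
        candidates = sorted(set(population) | set(offspring))
        candidates.sort(key=score, reverse=True)  # stable sort, best first
        population = candidates[:pop_size]
    best = population[0]
    return best, score(best)
```

Calling `search(samples, labels)` with `samples` as a list of equal-length numeric rows and `labels` as the matching list of Class labels returns the best feature-index tuple found and its score. A different distance metric can be tried by passing another two-argument function as `metric` to `loo_accuracy`.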
For a given distance metric and classification method, the Evolutionary Programming driver searches for the best set of features. At the conclusion of the feature search, the top classifiers can be examined by one or more of the following methods.
In addition, ClassCK is able to produce PostScript(tm) files that contain the following plots:
To learn more about ClassCK, including how to download the program, consult the menu on the left.