Feature table file

Feature tables are tab-separated text files specifying feature vectors and categories for a set of observations.

In OTU analysis, features are OTUs and observations are samples (see random forests).

Feature tables are similar to OTU tables except (1) rows are samples and columns are OTUs, (2) there is an additional column specifying a metadata category for each sample, and (3) values may be floating-point (OTU table values must be integers).
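
The three differences above amount to a transpose plus one extra column. The following Python is a minimal sketch of that conversion, not part of any tool described here. The function name otu_table_to_feature_rows, the in-memory list-of-rows representation and the sample_categories dictionary are all assumptions made for illustration.

```python
def otu_table_to_feature_rows(otu_table, sample_categories, category_name="Category"):
    """Transpose an OTU table into feature-table rows.

    otu_table: list of rows. The first row is the header
               [label, sample1, sample2, ...]; later rows are
               [otu_name, value1, value2, ...].
    sample_categories: dict mapping sample name -> category string
                       (hypothetical input). Samples missing from the
                       dict get the placeholder "Unknown".
    """
    header, *otu_rows = otu_table
    sample_names = header[1:]
    otu_names = [row[0] for row in otu_rows]

    # Header: ignored first field, then feature labels, then category name.
    feature_rows = [["Sample", *otu_names, category_name]]

    # One output row per sample (a column of the OTU table).
    for col, sample in enumerate(sample_names, start=1):
        values = [row[col] for row in otu_rows]
        category = sample_categories.get(sample, "Unknown")
        feature_rows.append([sample, *values, category])
    return feature_rows


otu_table = [
    ["OTU", "SamA", "SamB", "SamC"],
    ["Otu1", 12, 103, 77],
    ["Otu2", 99, 4, 8],
    ["Otu3", 3, 11, 12],
]
categories = {"SamA": "Sick", "SamB": "Sick", "SamC": "Healthy"}

feature_rows = otu_table_to_feature_rows(otu_table, categories)
for row in feature_rows:
    print("\t".join(str(field) for field in row))
```

Values are copied through unchanged, so integer counts stay integers and frequencies stay floating-point.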

Rows are observations (e.g. samples).

Columns are features (e.g. OTUs), except that the last column contains the category. The category is used for supervised training, i.e. training of a classifier on known categories. If the category is not known, this column must still be included; in that case the values are ignored and can be arbitrary placeholders, e.g. Unknown.

The first row is a header line. The first field of the header is an arbitrary string which is ignored. The subsequent fields are feature labels, e.g. OTU names. The last field is the name of the category, e.g. Category, State, Acidity, Time, etc.

Subsequent rows are observations, e.g. samples. The first field in a row is the name of the observation, e.g. SampleA. Subsequent values are floating-point numbers giving the count or frequency of each OTU in the sample. The last field is a string specifying the category value, e.g. Healthy, Sick, HighAcid, LowAcid, Morning, Evening, etc.

Values are floating-point numbers. They may be integer counts.

Example

Sample   Otu1    Otu2    Otu3    Category
SamA     12      99      3       Sick
SamB     103     4       11      Sick
SamC     77      8       12      Healthy
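
The layout described above can be read with a few lines of Python. This is a minimal sketch under stated assumptions, not the parser used by any particular tool: the function name read_feature_table and its return shape are hypothetical, and the example text is the table above with the fields separated by tab characters.

```python
import csv
import io


def read_feature_table(handle):
    """Parse a tab-separated feature table from an open text handle.

    Returns (feature_labels, category_name, observations), where
    observations is a list of (name, values, category) tuples.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)

    # The first header field is ignored; the last one names the category.
    feature_labels = header[1:-1]
    category_name = header[-1]

    observations = []
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(header):
            raise ValueError(
                f"expected {len(header)} fields, got {len(fields)}: {fields}"
            )
        name = fields[0]
        # Values may be integer counts or frequencies; read all as floats.
        values = [float(v) for v in fields[1:-1]]
        category = fields[-1]
        observations.append((name, values, category))
    return feature_labels, category_name, observations


example = (
    "Sample\tOtu1\tOtu2\tOtu3\tCategory\n"
    "SamA\t12\t99\t3\tSick\n"
    "SamB\t103\t4\t11\tSick\n"
    "SamC\t77\t8\t12\tHealthy\n"
)

labels, category_name, observations = read_feature_table(io.StringIO(example))
for name, values, category in observations:
    print(name, values, category)
```

The category is kept as a plain string, so placeholder values such as Unknown pass through untouched and can be ignored by the caller.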