Model feature weights selected by Lasso for eleven different training data sets. The chromosomes from which the 1,000,000 random nucleotide positions were drawn are given at the bottom. Correlation coefficients, shown in the middle, were computed on a test set that excludes every random nucleotide position used in the training set. The top panel zooms in on the 16 features that received a nonzero weight in more than half of the eleven runs. Weights do not directly reflect a feature's importance or the proportion of the data it explains, because features are unit-normalized prior to analysis and can have dissimilar distributions.
Tillo and Hughes BMC Bioinformatics 2009 10:442 doi:10.1186/1471-2105-10-442
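
The "more than half of the runs" criterion in the caption can be illustrated with a minimal sketch. This is not the authors' code: the function name, the feature names, and the weights below are hypothetical, and a feature is assumed to count as "weighted" in a run when its Lasso coefficient is nonzero.

```python
from typing import Dict, List, Sequence


def features_selected_in_majority(
    weights_by_run: Sequence[Dict[str, float]],
    tol: float = 0.0,
) -> List[str]:
    """Return the features with a nonzero weight in more than half of the runs.

    weights_by_run holds one mapping per training run, from feature name to
    the weight assigned in that run (hypothetical input layout).
    tol is an assumed threshold below which a weight is treated as zero.
    """
    n_runs = len(weights_by_run)
    counts: Dict[str, int] = {}
    for run in weights_by_run:
        for name, weight in run.items():
            if abs(weight) > tol:
                counts[name] = counts.get(name, 0) + 1
    # "More than half" is a strict majority: 2 * count > n_runs.
    return sorted(name for name, c in counts.items() if 2 * c > n_runs)


# Hypothetical toy data: three runs with made-up feature names and weights.
runs = [
    {"feat_a": 0.8, "feat_b": 0.0, "feat_c": -0.1},
    {"feat_a": 0.7, "feat_b": 0.2, "feat_c": 0.0},
    {"feat_a": 0.9, "feat_b": 0.0, "feat_c": -0.3},
]
selected = features_selected_in_majority(runs)
print(selected)  # ['feat_a', 'feat_c']
```

With eleven runs, the same strict-majority rule keeps a feature only if it is weighted in at least six of them. The sketch counts how often a feature is selected and deliberately ignores weight magnitude, in line with the caption's caveat that weights are not a direct measure of importance.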