|
Figure 1.
NEXUS and the character-state data model. Some relevant terms and concepts are illustrated with a graphical view of a small
family of RAD23-related protein-coding genes (KOG0011 data provided in Additional
file 1). In molecular sequence analyses, operational taxonomic units (OTUs) are often labelled with a token fusing a species
name with a database ID. In a NEXUS file, such OTU labels are declared in a TAXA block.
The TREES block may contain one or more trees relating the OTUs, each tree optionally
having a label (e.g., "Coelomate") and a numeric weight (e.g., a probability). Trees
may contain branch lengths and support values. In the matrix of amino acid character
data shown here, the 4th character (i.e., the 4th alignment column) has the states "V", "I", "I", "L", "V", "I", "L", and "I". Such
character-state data are stored in a NEXUS CHARACTERS block, which defines the type
of data and the meaning of a gap symbol such as "-". The ASSUMPTIONS block of a NEXUS
file provides the means to store a weight or other numeric value for each character,
such as the column-wise alignment scores shown here. Many other types of information
not shown here can be stored in NEXUS files.
Hladish et al. BMC Bioinformatics 2007 8:191 doi:10.1186/1471-2105-8-191 |
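
The relationship between OTU labels, characters, and states described in the caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' software: the OTU labels and sequences in `ALIGNMENT` are invented placeholders (only the 4th column is chosen to match the states quoted above), and the helper names `column_states` and `to_nexus` are hypothetical. The sketch writes only TAXA and CHARACTERS blocks and omits TREES and ASSUMPTIONS.

```python
# Hypothetical placeholder alignment: labels and residues are invented for
# illustration and are NOT the KOG0011 data. Only column 4 is chosen to match
# the states quoted in the caption (V, I, I, L, V, I, L, I).
ALIGNMENT = {
    "otu_1": "MKTVA",
    "otu_2": "MKTIA",
    "otu_3": "MK-IA",
    "otu_4": "MKTLA",
    "otu_5": "MKTVA",
    "otu_6": "MRTIA",
    "otu_7": "MKTLA",
    "otu_8": "MKTIA",
}


def column_states(alignment, position):
    """Return the states of one character (1-based alignment column), in OTU order."""
    return [sequence[position - 1] for sequence in alignment.values()]


def to_nexus(alignment, gap="-"):
    """Serialize the alignment as a NEXUS string with TAXA and CHARACTERS blocks."""
    lengths = {len(sequence) for sequence in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    nchar = lengths.pop()
    labels = list(alignment)
    lines = [
        "#NEXUS",
        "BEGIN TAXA;",
        f"  DIMENSIONS NTAX={len(labels)};",
        "  TAXLABELS " + " ".join(labels) + ";",
        "END;",
        "BEGIN CHARACTERS;",
        f"  DIMENSIONS NCHAR={nchar};",
        # DATATYPE names the kind of data; GAP declares the gap symbol.
        f"  FORMAT DATATYPE=PROTEIN GAP={gap};",
        "  MATRIX",
    ]
    lines += [f"    {label} {sequence}" for label, sequence in alignment.items()]
    lines += ["  ;", "END;"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(column_states(ALIGNMENT, 4))
    print(to_nexus(ALIGNMENT))
```

Keeping the labels as dictionary keys mirrors the way a TAXA block declares OTUs once, while each column of the matrix is one character whose states are read down the rows.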