Table 5. Orthographic Features.

Orthographic Feature    Regular Expression
--------------------    ------------------
Init Caps               [A-Z].*
Init Caps Alpha         [A-Z][a-z]*
All Caps                [A-Z]+
Caps Mix                [A-Za-z]+
Has Digit               .*[0-9].*
Single Digit            [0-9]
Double Digit            [0-9][0-9]
Natural Number          [0-9]+
Real Number             [-\+][[0-9]+[\.,]+[0-9].,]+
Alpha-Numeric           [A-Za-z0-9]+
Roman                   [ivxdlcm]+|[IVXDLCM]+
Has Dash                .*-.*
Init Dash               -.*
End Dash                .*-
Punctuation             [,\.;:\?!-\+"]
Greek                   (alpha|beta|...|omega)
Has Greek               .*\b(alpha|beta|...|omega)\b.*
Mutation Pattern        \w*\d+-*\D+

Orthographic features and their corresponding regular expressions used in the experiments.

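
To illustrate how orthographic features of this kind can be computed, the following is a minimal sketch in Python, not the authors' implementation. It assumes that each expression must match the whole token (hence `re.fullmatch`) and that Python's `re` syntax is close enough to the regex dialect used in the experiments; neither point is stated in the table. The names `ORTHOGRAPHIC_PATTERNS` and `orthographic_features` are hypothetical. Real Number, Punctuation, Greek, and Has Greek are left out: the Greek alternation is elided in the table, and the other two expressions are not reinterpreted here.

```python
import re

# Patterns copied verbatim from the table above.
# Assumption: a feature fires only if the pattern matches the entire token.
ORTHOGRAPHIC_PATTERNS = {
    "Init Caps": r"[A-Z].*",
    "Init Caps Alpha": r"[A-Z][a-z]*",
    "All Caps": r"[A-Z]+",
    "Caps Mix": r"[A-Za-z]+",
    "Has Digit": r".*[0-9].*",
    "Single Digit": r"[0-9]",
    "Double Digit": r"[0-9][0-9]",
    "Natural Number": r"[0-9]+",
    "Alpha-Numeric": r"[A-Za-z0-9]+",
    "Roman": r"[ivxdlcm]+|[IVXDLCM]+",
    "Has Dash": r".*-.*",
    "Init Dash": r"-.*",
    "End Dash": r".*-",
    "Mutation Pattern": r"\w*\d+-*\D+",
}

# Compile once so repeated calls over a corpus do not re-parse the patterns.
COMPILED = {name: re.compile(p) for name, p in ORTHOGRAPHIC_PATTERNS.items()}


def orthographic_features(token):
    """Return the names of all features whose pattern matches the full token."""
    return [name for name, rx in COMPILED.items() if rx.fullmatch(token)]


if __name__ == "__main__":
    for tok in ["IL-2", "IV", "V600E", "42"]:
        print(tok, orthographic_features(tok))
```

Under these assumptions, the token `IL-2` yields `Init Caps`, `Has Digit`, and `Has Dash`, while `V600E` additionally triggers `Mutation Pattern`. Several features can fire for the same token, so the result is a list, not a single label.
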
Bundschus et al. BMC Bioinformatics 2008 9:207 doi:10.1186/1471-2105-9-207