Open Access Research article

Large-scale analysis of full-length cDNAs from the tomato (Solanum lycopersicum) cultivar Micro-Tom, a reference system for the Solanaceae genomics

Koh Aoki1*, Kentaro Yano2, Ayako Suzuki2, Shingo Kawamura2, Nozomu Sakurai1, Kunihiro Suda1, Atsushi Kurabayashi1, Tatsuya Suzuki3, Taneaki Tsugane3, Manabu Watanabe3, Kazuhide Ooga1, Maiko Torii1, Takanori Narita4, Tadasu Shin-i4, Yuji Kohara4, Naoki Yamamoto2, Hideki Takahashi5, Yuichiro Watanabe6, Mayumi Egusa7, Motoichiro Kodama7, Yuki Ichinose8, Mari Kikuchi9, Sumire Fukushima9, Akiko Okabe9, Tsutomu Arie9, Yuko Sato10, Katsumi Yazawa10, Shinobu Satoh10, Toshikazu Omura11, Hiroshi Ezura11 and Daisuke Shibata1

Author Affiliations

1 Kazusa DNA Research Institute, 2-6-7 Kazusa-Kamatari, Kisarazu, 292-0818, Japan

2 Meiji University, 1-1-1 Higashi-mita, Tama-ku, Kawasaki, 214-8571, Japan

3 Chiba Prefectural Agriculture and Forestry Research Center, 808 Daizenno-cho, Midori-ku, Chiba, 266-0006, Japan

4 National Institute of Genetics, Yata 1111, Mishima, 411-8540, Japan

5 Tohoku University, 1-1 Amamiya-machi, Tsutsumidori, Aoba-ku, Sendai, 981-8555, Japan

6 The University of Tokyo, Komaba, Meguro-ku, 153-8902, Japan

7 Tottori University, 4-101 Koyama-minami, Tottori, 680-8553, Japan

8 Okayama University, 1-1-1 Tsushima-naka, Kita-ku, Okayama, 700-8530, Japan

9 Tokyo University of Agriculture and Technology, 3-5-8 Saiwai-cho, Fuchu, 183-8509, Japan

10 Institute of Biological Sciences, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, 305-8571, Japan

11 Gene Research Center, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, 305-8571, Japan


BMC Genomics 2010, 11:210  doi:10.1186/1471-2164-11-210

Published: 30 March 2010

Abstract

Background

The Solanaceae family includes several economically important vegetable crops. The tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. Recently, a number of tomato resources have been developed in parallel with the ongoing tomato genome sequencing project. In particular, the miniature cultivar Micro-Tom is regarded as a model system in tomato genomics, and an international alliance has established a number of genomics resources in the Micro-Tom background, such as ESTs and mutagenized lines.

Results

To accelerate progress in tomato genomics, we developed a collection of 13,227 fully sequenced Micro-Tom full-length cDNAs. After checking for redundant sequences, coding sequences, and chimeric sequences, we generated a set of 11,502 non-redundant full-length cDNAs (nrFLcDNAs). Analysis of untranslated regions demonstrated that tomato has longer 5'- and 3'-untranslated regions than most other plants except rice. Functional classification of the proteins predicted from the coding sequences demonstrated that the nrFLcDNAs covered a broad range of functions. A comparison of the nrFLcDNAs with genes of sixteen plants facilitated the identification of tomato genes that are not found in other plants, most of which did not have known protein domains. Mapping the nrFLcDNAs onto currently available tomato genome sequences facilitated prediction of exon-intron structure. Introns of tomato genes were longer than those of Arabidopsis and rice. Based on a comparison of exon sequences between the nrFLcDNAs and the tomato genome sequences, the frequency of nucleotide mismatch in exons between Micro-Tom and the genome-sequencing cultivar (Heinz 1706) was estimated to be 0.061%.
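As an illustration of how an exon mismatch frequency of this kind can be computed, the following minimal Python sketch counts mismatched positions over all aligned exon bases. It is not the authors' pipeline: the function name `exon_mismatch_percent`, the toy sequences, the assumption of gap-free alignments of equal length, and the choice to skip ambiguous `N` bases are all hypothetical.

```python
def exon_mismatch_percent(aligned_exons):
    """Return the percentage of mismatched positions over all aligned exon bases.

    aligned_exons: iterable of (cdna_seq, genome_seq) string pairs.
    Assumption of this sketch: each pair is already aligned without gaps,
    so both strings have the same length.
    """
    mismatches = 0
    total = 0
    for cdna_seq, genome_seq in aligned_exons:
        if len(cdna_seq) != len(genome_seq):
            raise ValueError("aligned exon sequences must have equal length")
        for a, b in zip(cdna_seq.upper(), genome_seq.upper()):
            # Assumption: ambiguous bases are excluded from the comparison.
            if a == "N" or b == "N":
                continue
            total += 1
            if a != b:
                mismatches += 1
    if total == 0:
        raise ValueError("no comparable bases")
    return 100.0 * mismatches / total


# Toy data (not from the study): 20 aligned bases with one mismatch.
toy_exons = [
    ("ATGGCTAGCT", "ATGGCTAGCT"),
    ("GATTACAGGA", "GATTACTGGA"),
]
print(f"{exon_mismatch_percent(toy_exons):.3f}%")
```

On the same scale, the reported estimate of 0.061% corresponds to about 6.1 mismatches per 10,000 exon bases, or roughly one mismatch every 1,600 bases.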

Conclusion

The collection of Micro-Tom nrFLcDNAs generated in this study will serve as a valuable genomic tool for plant biologists to bridge the gap between basic and applied studies. The nrFLcDNA sequences will help annotate the tomato whole-genome sequence and aid tomato functional genomics and molecular breeding. Full-length cDNA sequences and their annotations are provided in the database KaFTom (http://www.pgb.kazusa.or.jp/kaftom/) via the website of the National Bioresource Project Tomato (http://tomato.nbrp.jp).