This article is part of the supplement: Italian Society of Bioinformatics (BITS): Annual Meeting 2007
SNPLims: a data management system for genome wide association studies
1 Consorzio Interuniversitario Lombardo per l'Elaborazione Automatica, Via Sanzio Raffaello 4, 20090 Segrate (MI), Italy
2 Dipartimento di Scienze e Tecnologie Biomediche, Università degli Studi di Milano, Via Fratelli Cervi 93, 20090 Segrate (MI), Italy
3 Istituto di Tecnologie Biomediche, Consiglio Nazionale delle Ricerche, Via Fratelli Cervi 93, 20090 Segrate (MI), Italy
BMC Bioinformatics 2008, 9(Suppl 2):S13 doi:10.1186/1471-2105-9-S2-S13
Published: 26 March 2008
Recent progress in genotyping technologies allows the generation of high-density genetic maps using hundreds of thousands of genetic markers for each DNA sample. The availability of this large amount of genotypic data facilitates the whole-genome search for the genetic basis of diseases.
We need a suitable information management system to efficiently manage the data flow produced by whole genome genotyping and to make it available for further analyses.
We have developed an information system devoted mainly to the storage and management of SNP genotype data produced by the Illumina platform, loading the raw genotyping outputs into a relational database.
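As a rough illustration of how genotype calls might be laid out relationally, the sketch below uses Python's built-in sqlite3 module. The table names, column names, and the `load_genotypes` helper are hypothetical and are not taken from SNPLims; the sample and SNP identifiers are made up.

```python
import sqlite3

# Hypothetical minimal schema; names are illustrative, not the SNPLims schema.
SCHEMA = """
CREATE TABLE sample (sample_id TEXT PRIMARY KEY);
CREATE TABLE snp (snp_id TEXT PRIMARY KEY, chromosome TEXT, position INTEGER);
CREATE TABLE genotype (
    sample_id TEXT NOT NULL REFERENCES sample(sample_id),
    snp_id    TEXT NOT NULL REFERENCES snp(snp_id),
    allele1   TEXT,
    allele2   TEXT,
    PRIMARY KEY (sample_id, snp_id)
);
"""

def load_genotypes(conn, rows):
    """Insert (sample_id, snp_id, chromosome, position, allele1, allele2) tuples."""
    with conn:  # commit on success, roll back on error
        for sample_id, snp_id, chrom, pos, a1, a2 in rows:
            conn.execute("INSERT OR IGNORE INTO sample VALUES (?)", (sample_id,))
            conn.execute("INSERT OR IGNORE INTO snp VALUES (?, ?, ?)",
                         (snp_id, chrom, pos))
            conn.execute("INSERT OR REPLACE INTO genotype VALUES (?, ?, ?, ?)",
                         (sample_id, snp_id, a1, a2))

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up rows standing in for parsed genotyping output.
rows = [
    ("S1", "rs0001", "1", 1000, "A", "G"),
    ("S2", "rs0001", "1", 1000, "A", "A"),
    ("S1", "rs0002", "2", 2500, "C", "T"),
]
load_genotypes(conn, rows)
n_genotypes = conn.execute("SELECT COUNT(*) FROM genotype").fetchone()[0]
n_snps = conn.execute("SELECT COUNT(*) FROM snp").fetchone()[0]
```

Keeping samples, markers, and calls in separate tables means each marker's annotation is stored once, however many samples are genotyped.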
The relational database can be accessed to import any existing data and to export data in user-defined formats compatible with many different genetic analysis programs.
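A user-defined export can be thought of as a choice of columns, delimiter, and missing-value code. The following sketch assumes genotype records are available as dictionaries; the `export_genotypes` function, its parameters, and the sample data are hypothetical and do not describe the actual SNPLims export code.

```python
import csv
import io

def export_genotypes(records, columns, delimiter="\t", missing="0"):
    """Render records as delimited text using caller-chosen columns.

    Empty or absent values are replaced by the `missing` code.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    for record in records:
        writer.writerow([record.get(col) or missing for col in columns])
    return buffer.getvalue()

# Made-up records; the second sample has no call for this marker.
records = [
    {"sample_id": "S1", "snp_id": "rs0001", "allele1": "A", "allele2": "G"},
    {"sample_id": "S2", "snp_id": "rs0001", "allele1": None, "allele2": None},
]
text = export_genotypes(
    records, ["sample_id", "snp_id", "allele1", "allele2"], delimiter=" "
)
```

Different analysis programs expect different column orders and missing-data codes, so exposing these as parameters keeps one export routine reusable across targets.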
Once family-based or case-control association analyses have been computed, their results can be imported into SNPLims. One of the main features is that the user can rapidly identify and annotate statistically relevant polymorphisms within the large volume of data analyzed. Results can easily be visualized graphically or exported as ASCII comma-separated files, which can be used as input for further analyses.
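The step of picking out statistically relevant polymorphisms and writing them to a comma-separated file could look roughly like the sketch below. The result records, field names, and the p-value cutoff are assumptions chosen for illustration; the source does not state which threshold or fields SNPLims uses.

```python
import csv
import io

def significant_snps(results, p_threshold):
    """Return results with p_value below the threshold, most significant first."""
    hits = [r for r in results if r["p_value"] < p_threshold]
    return sorted(hits, key=lambda r: r["p_value"])

def to_csv(rows, fieldnames):
    """Serialize a list of dictionaries as comma-separated text with a header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Made-up association results for three markers.
results = [
    {"snp_id": "rs0001", "chromosome": "1", "p_value": 3e-8},
    {"snp_id": "rs0002", "chromosome": "2", "p_value": 0.2},
    {"snp_id": "rs0003", "chromosome": "2", "p_value": 1e-9},
]
hits = significant_snps(results, p_threshold=1e-7)  # illustrative cutoff only
csv_text = to_csv(hits, ["snp_id", "chromosome", "p_value"])
```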
The proposed infrastructure makes it possible to manage a relatively large number of genotypes for each sample and an arbitrary number of samples and phenotypes. Moreover, it enables users to control the quality of the data, to perform the most common screening analyses, and to identify candidate genes for the disease under consideration.
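One common quality-control measure in genotyping studies is the per-sample call rate, the fraction of markers with a non-missing genotype. The text does not specify which checks SNPLims performs, so the sketch below is only an example of this kind of screen; the function names, the cutoff, and the data are hypothetical.

```python
def call_rate(genotypes):
    """Fraction of genotype calls that are not missing (None)."""
    if not genotypes:
        return 0.0
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes)

def low_quality_samples(genotypes_by_sample, min_call_rate):
    """Return, in sorted order, the samples whose call rate is below the cutoff."""
    return sorted(
        sample
        for sample, genotypes in genotypes_by_sample.items()
        if call_rate(genotypes) < min_call_rate
    )

# Made-up calls for two samples across four markers; None marks a failed call.
genotypes_by_sample = {
    "S1": ["AG", "AA", "CT", "CC"],
    "S2": ["AG", None, None, "CC"],
}
flagged = low_quality_samples(genotypes_by_sample, min_call_rate=0.9)
```

Here sample S2 has calls for only two of four markers, so it falls below the illustrative 0.9 cutoff and is flagged.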