Additional File 1. Full data set. Table S1 may be found online as an Excel file at: http://lifesci/~pattersonlab/publication/MBCDBpaper/supplementary.htm Please feel free to use any annotation in these tables, provided that both the original source of the data and this collection of annotations are cited. Table S1 contains the data for all of the genes on the microarray. Raw data in unprocessed, MIAME-compliant format are available at: http://genome-www5.stanford.edu/. Column A, "SMD gene name", is the name given to the probe on the microarray. Each probe is a PCR product from a pair of primers designed to amplify the ORF named in this column (Reinke, V., Smith, H. E., Nance, J., Wang, J., Van Doren, C., Begley, R., Jones, S. J. M., Davis, E., Scherer, S., Ward, S., & Kim, S. K. (2000). Mol. Cell 6, 605–616). Many of these ORF predictions have since changed; the current match is given in column B. Column B, "wormbase gene name", gives the current gene prediction that is complementary to the probe on the array. Columns C-K are annotations downloaded from WormBase between April and June 2003. The column headings are defined in WormBase. Columns L and M list the positions of optimal DAF-16 binding sites, defined as the sequence TTGTTTAC. Negative numbers are sites upstream of the start of translation, and positive numbers are sites downstream of the stop codon. These sites were predicted in Fall 2002 using data and software from http://rsat.ulb.ac.be/rsat/ (van Helden, J., André, B., & Collado-Vides, J. (2000). Yeast 16, 177–187). All sites within 2000 bp upstream and 300 bp downstream are shown, except for a few (<2%) that we were not able to match because the annotation in our database differs from that in the RSAT database. Columns N-O are our annotation of function, based on sequence similarity, annotation from WormBase, and published reports. 
For the following groups, we have annotated all genes in the group, to the best of our knowledge: G protein-coupled receptors, glutathione S-transferases, cytochrome P450s, heat shock proteins, peroxidases, UGTs, epoxide hydrolases, collagens, cuticlins, NRF6 related, scl-1 family (aka CRISP family), signaling proteins, and transcription factors. For the following groups, we have annotated only a subset of genes in the group: amine oxidases, ribosomal proteins, amino acid catabolism, and lipid metabolism. Column P, "mountains", lists groups of putatively co-regulated genes from Kim, S. K., Lund, J., Kiraly, M., Duke, K., Jiang, M., Stuart, J. M., Eizinger, A., Wylie, B. N., & Davidson, G. S. (2001). Science 293, 2087–2092. Columns Q and R list the average ratio of signal from daf-c genotypes to wild type, using all experiments. Column Q is the average expressed as a base 2 logarithm, and column R is the average expressed as a fold change (for both columns, a positive number means the expression was higher in daf-c, and a negative number means the expression was higher in N2). Column S is the p value for the values in columns Q and R, from a t test of whether, given the standard deviation of the data, the value is significantly different from a ratio of 1 (or 0 for the base 2 log). For these calculations we used the local standard deviation and the global standard deviation as described in Jiang, M., Ryu, J., Kiraly, M., Duke, K., Reinke, V., & Kim, S. K. (2001). Proc. Natl. Acad. Sci. 98, 218–223. Column T is the number of successful experiments for each spot on the array. The numbers vary because the data for a particular spot may or may not be of acceptable quality in a given experiment. Some genes have a number greater than 10 (the total number of independent experiments) because the same PCR product was placed on the array in two different locations. 
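As an illustration of how the values in columns Q, R, S, and T relate to one another, the following is a minimal sketch, not the calculation used to produce the table. It assumes that the fold change in column R is signed (2^x for a non-negative mean log ratio x, and -(2^-x) for a negative one), and it computes a plain one-sample t statistic against 0 from the sample standard deviation; it does not reproduce the local and global standard deviations of Jiang et al. (2001). The function name and the returned keys are hypothetical.

```python
import math
import statistics


def summarize_log2_ratios(log2_ratios):
    """Summarize per-experiment log2(daf-c / wild-type) ratios for one spot.

    None entries stand for flagged or missing measurements and are skipped,
    which is why the count of successful experiments can vary between spots.
    """
    values = [r for r in log2_ratios if r is not None]
    n = len(values)
    if n == 0:
        return None

    mean_log2 = statistics.fmean(values)

    # Assumed sign convention for the fold change:
    # positive = higher in daf-c, negative = higher in wild type (N2).
    if mean_log2 >= 0:
        signed_fold = 2.0 ** mean_log2
    else:
        signed_fold = -(2.0 ** -mean_log2)

    # Plain one-sample t statistic for "mean log2 ratio differs from 0".
    # Undefined with fewer than two values or with zero spread.
    t_stat = None
    if n >= 2:
        sd = statistics.stdev(values)
        if sd > 0:
            t_stat = mean_log2 / (sd / math.sqrt(n))

    return {"n": n, "mean_log2": mean_log2, "signed_fold": signed_fold, "t": t_stat}
```

Under this assumed convention, a mean log ratio of -2 corresponds to a fold change of -4, that is, fourfold higher in N2. Turning the t statistic into a p value would additionally need the t distribution with n - 1 degrees of freedom, which is left out here.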
Columns U through BH are the data for each spot in each experiment. The column labeled "CH1D_MEAN" is the data for the wild-type sample, the column labeled "CH2D_MEAN" is the data for the daf-c mutant sample, the column labeled "CORR" is the correlation coefficient for the average ratio of channel 1 to channel 2 calculated pixel by pixel, and the column labeled "Flag" indicates the data quality. Any Flag value other than 0 indicates that the data had problems and were not used for analysis. Columns BL to CG give the data broken down by genotype (for each of the three daf-c genotypes used in this study). For each genotype, the first three or four columns give the base 2 log of the ratio of channel 2 (from the daf-c genotype) to channel 1 (from wild type) for each experiment. Blank cells indicate bad data that were not used in the analysis. The five-digit number in the headings of these columns is the experiment ID used to catalog data at the Stanford Microarray Database. The columns that follow give, in order, the average ratio, the standard deviation, the number of successful experiments, and the p value from a test of whether, given the standard deviation of the data, the value is significantly different from a ratio of 1 (or 0 for the base 2 log). Format: CSV Size: 9.9MB Liu et al. BMC Developmental Biology 2004 4:11 doi:10.1186/1471-213X-4-11