The FTO (fat mass and obesity associated) gene codes for a novel member of the non-heme dioxygenase superfamily

Sanchez-Pulido, Luis; Andrade-Navarro, Miguel A

doi:10.1186/1471-2091-8-23

Research article
Open access
Published: 08 November 2007

Luis Sanchez-Pulido^1 & Miguel A Andrade-Navarro^2,3,4

BMC Biochemistry volume 8, Article number: 23 (2007)

Abstract

Background

Genetic variants in the FTO (fat mass and obesity associated) gene have been associated with an increased risk of obesity. However, the function of its protein product has not been experimentally studied and previously reported sequence similarity analyses suggested the absence of homologs in existing protein databases. Here, we present the first detailed computational analysis of the sequence and predicted structure of the protein encoded by FTO.

Results

We performed a sequence similarity search using the human FTO protein as query and then generated a profile using the multiple sequence alignment of the homologous sequences. Profile-to-sequence and profile-to-profile based comparisons identified remote homologs of the non-heme dioxygenase family.

Conclusion

Our analysis suggests that human FTO is a member of the non-heme dioxygenase (Fe(II)- and 2-oxoglutarate-dependent dioxygenases) superfamily. Amino acid conservation patterns support this hypothesis and indicate that both 2-oxoglutarate and iron should be important for FTO function. This computational prediction of the function of FTO should suggest further steps for its experimental characterization and help to formulate hypotheses about the mechanisms by which it relates to obesity in humans.

Background

Two recent reports [1, 2] characterized the strong association of a number of single nucleotide polymorphisms (SNPs) in intron 1 of the human FTO gene with an increased risk of obesity, marked by an increase in body mass index due to fat mass rather than lean mass that is seen in children as early as age seven [2].

However, the mechanisms by which this genetic variability relates to obesity remain obscure. These publications indicate that the function of FTO is unknown [2] and that its protein has no identified structural domain or link to other proteins that could be used to predict its function [1]. Knowledge of the function of FTO is crucial to guide the search for a mechanism relating this gene to obesity.

Here we report evidence obtained by computational analysis indicating that the protein coded by FTO is a member of the non-heme dioxygenase (Fe(II)- and 2-oxoglutarate-dependent dioxygenases) superfamily.

Results and Discussion

In the course of the computational characterization of the FTO family (see Methods) we identified sequences homologous to human FTO in different eukaryote groups including vertebrates (from fish to mammals), green algae (Ostreococcus) and diatoms (Phaeodactylum and Thalassiosira) (see Figure 1 and Table 1).

Table 1. Additional details for lanes in Figure 1 and FTO close homologous sequences in Figures 1 and 2.

Using sequence profiles of the N-terminal conserved region of the FTO family (corresponding to the human FTO sequence amino acid positions 57–324), we identified members of the non-heme dioxygenase family. Additionally, the secondary structure predictions of the FTO family showed high similarity with the known structures of AlkB, a member of the non-heme dioxygenase family [3–5]. We were not able to find significant homology between the C-terminal region of the FTO family and other genes.

To investigate whether fold recognition analysis would generate supporting results, we submitted the FTO N-terminal region as a query to an independent fold assignment system based on profile-profile comparisons (see Methods). The profiles generated for the human and E. coli AlkB proteins (PDB entries 2iuw and 2fdi) matched the FTO N-terminal region with E-values of 3.2 × 10^-21 and 3.1 × 10^-12, respectively (estimated error rate < 3%), despite their low level of sequence identity to the human FTO protein (approximately 17%). The next match corresponded to the hypothetical protein TM0957 from Thermotoga maritima; however, it was considered unreliable given its short length (28 amino acids) and high E-value (0.02).
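A figure such as "approximately 17%" identity depends on how identity is counted. As a minimal sketch (not the authors' procedure), the Python function below computes percent identity over the columns of a pairwise alignment in which both sequences have a residue. The function name and the toy sequences are hypothetical; other conventions, such as dividing by the full alignment length, would give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are not counted (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned fragments (hypothetical; not taken from FTO or AlkB).
# 7 columns are compared and 5 of them are identical.
print(percent_identity("MKT-AYIAK", "MRTGAY-SK"))
```

At identity levels this low, pairwise sequence comparison alone is usually not informative, which is why the analysis relies on profile-based methods.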

Given the E-values of the HMMer searches, the reliability of secondary structure predictions, and the fold assignment results, we are confident that the proteins of the FTO family (including the protein coded by the FTO human gene) are members of the non-heme dioxygenase superfamily.

Proteins of this superfamily catalyze different oxidative reactions on multiple substrates producing varied biological effects [6] and are characterized by a number of conserved amino acids involved in the binding of iron and 2-oxoglutarate (as a cofactor and co-substrate, respectively). We found these amino acids in human FTO and in its homologs (see Figure 1 and Table 1), suggesting that 2-oxoglutarate and iron are essential for the normal function of the FTO protein.

The FTO family is not a unique case as other families of the non-heme dioxygenase superfamily are also very divergent and their detection required non-trivial computational analysis [3]. Due to the divergence of the FTO family from already known non-heme dioxygenases, we were unable to predict the target of the family's catalytic action.

The ubiquitous expression of FTO throughout many human tissues [1] indicates that it has an important function. The phylogenetic distribution of FTO homologs (consistently present in organisms from fish to mammals) suggests that this gene appeared during the evolution of vertebrates. Intriguingly, FTO homologs can be found in the green alga Ostreococcus and in diatoms, whereas they are apparently absent in insects, worms and fungi (see Figure 1 and Table 1). The most parsimonious explanation of this fact is the existence of independent events of horizontal gene transfer from vertebrates to protists. Horizontal gene transfer has previously been related to the evolution of several eukaryotic regulatory systems that function in development, differentiation and apoptosis [7]. In short, horizontal transfer of FTO indicates that the FTO protein has a function that confers a selective advantage but that it is not indispensable, which agrees with a possible regulatory role.

For comparison, the hypoxia-inducible factor (HIF), a known member of the non-heme dioxygenase family, also has a wide phylogenetic distribution (from worms to mammals) and is ubiquitously expressed in all human tissues. HIF acts as a sensor of oxygen level and affects the expression of over one hundred genes [8]. This molecule performs its activity by shuttling between the cytoplasm in normoxic conditions and the nucleus in hypoxic conditions [9].

To investigate if FTO could be acting in a similar manner, we studied its sequence using an algorithm for prediction of protein cellular localization (WolfPSORT; [10]). The results suggested with similar scores a cytoplasmic and a nuclear-cytoplasmic localization for this protein. This is consistent with human FTO's possible function as a metabolic sensor and nuclear effector.

The human FTO gene product has a predicted molecular mass of 50 kDa. With that mass it would need a Nuclear Localization Signal (NLS) [11] in order to act in the nucleus. Analysis of FTO's sequence using an algorithm that includes the prediction of NLS (PSORTII; [10]) suggested a 17 amino acid long bipartite NLS from positions 2 to 18 (Figure 2A), noted previously [12] but not experimentally verified. Further analysis of the family indicated that this region stands out as a K/R rich region in comparison to the rest of the sequence, and that it is located in an N-terminal extension that is conserved in close homologs of human FTO from fish to mammals but not in the other FTO homologs we found in algae or diatoms.
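A length of 17 residues is what a commonly described bipartite pattern gives: two basic residues, a 10-residue spacer, and a 5-residue stretch containing at least three basic residues. The Python sketch below scans a sequence for that pattern and reports the K/R fraction of a region. It is an illustration of this kind of rule under stated assumptions, not PSORT II's actual implementation; the function names are hypothetical and the example sequence is synthetic, not FTO.

```python
import re

BASIC = "KR"
# Lookahead so that overlapping candidates are all reported.
# Group 1: the whole 17-residue candidate; group 2: the final 5 residues.
CANDIDATE = re.compile(r"(?=([KR]{2}.{10}(.{5})))")


def find_bipartite_nls(seq: str):
    """Return (start, end, segment) tuples using 1-based inclusive positions."""
    hits = []
    for m in CANDIDATE.finditer(seq.upper()):
        tail = m.group(2)
        if sum(c in BASIC for c in tail) >= 3:  # at least 3 of 5 basic
            hits.append((m.start() + 1, m.start() + 17, m.group(1)))
    return hits


def basic_fraction(seq: str) -> float:
    """Fraction of K or R residues in a sequence (0.0 for an empty string)."""
    return sum(c in BASIC for c in seq.upper()) / len(seq) if seq else 0.0


# Synthetic example sequence (hypothetical, built to contain one match).
toy = "M" + "KR" + "A" * 10 + "KKAKA" + "GGGG"
for start, end, segment in find_bipartite_nls(toy):
    print(start, end, segment, round(basic_fraction(segment), 2))
```

Comparing `basic_fraction` for a candidate region against the value for the whole sequence is one simple way to express the "K/R rich" observation quantitatively.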

In light of these computational results we hypothesize that FTO is a sensor of the cell's metabolic state and, when dysfunctional, can result in an obese phenotype. We identify the N-terminal region of human FTO as having a high likelihood of determining its cellular localization, which could be verified by mutational analysis.

Conclusion

Here we have provided valuable information about FTO by indicating its possible catalytic function, and we have pointed to the amino acids involved in cofactor (Fe) and co-substrate (2-oxoglutarate) binding in human FTO as well as in its homologous proteins in other organisms, which could be used as models for the study of the human disease. This insight should help to guide experiments to clarify the mechanisms by which FTO relates to obesity and to accelerate the discovery of novel molecular therapies for this condition.

Methods

We first performed BLAST sequence similarity searches [13] using the human FTO protein as query against different sequence database resources: NCBI [14], ENSEMBL [15] and JGI [16]. Multiple sequence alignments of protein sequences homologous to human FTO were generated with the program T-Coffee [17] using default parameters, slightly refined manually and visualized with the Belvu program (Figure 1. Top) [18].

Profiles of the alignment as global hidden Markov models (HMMs) were generated using HMMer [19]. Profile-based sequence searches were performed against the Uniref50 and Uniref90 protein sequence databases [20] using HMMsearch [21]. We used NAIL [22] to view and analyze the HMMsearch results, which provided a formatted view with hyperlinks to related web resources and coloring related to taxonomic information, thus facilitating the interpretation of the results.
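To make the idea of a profile concrete, here is a toy sketch in Python. It builds an ungapped position-specific scoring matrix from a small alignment and slides it along a sequence. This is a deliberate simplification: HMMer's profile HMMs also model insertions and deletions and use calibrated statistics, none of which is reproduced here. All names, the pseudocount, the uniform background, and the toy alignment are assumptions for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(alignment, pseudocount=1.0):
    """Per-column log2-odds scores against a uniform background."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned rows must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    profile = []
    for col in range(length):
        # Gap characters and non-standard letters are ignored when counting.
        counts = Counter(row[col] for row in alignment if row[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def best_window(profile, sequence):
    """Slide the profile along the sequence; return (best_score, start)."""
    width = len(profile)
    best = (float("-inf"), -1)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col.get(aa, 0.0) for col, aa in zip(profile, window))
        best = max(best, (score, start))
    return best


# Toy alignment (hypothetical): conserved H, D and H with a variable position.
toy_profile = build_profile(["HLDH", "HIDH", "HVDH"])
print(best_window(toy_profile, "GGGHLDHGGG"))
```

Because the profile rewards residues that are conserved across the family, it can score a distant relative higher than any single family member would in a pairwise comparison.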

Fold recognition analyses were performed using profile-to-profile comparisons of the HMM profile of the FTO family to profiles generated for each sequence of known structure with its homologues (HHpred server; [23, 24]). The significance of sequence-to-sequence, profile-to-sequence, and profile-to-profile matches was evaluated in terms of an E-value, which estimates the number of matches with an equal or better score expected by chance. Secondary structure predictions were performed using the PredictProtein Server [25, 26]. AlkB active center illustrations (Figure 1. Bottom) were generated with Pymol [27].
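The relation between an E-value and a probability can be written out explicitly. The block below is a sketch that assumes Karlin-Altschul statistics as used for BLAST-type searches, where m and n are the query and database lengths, S is the raw score, and K and λ are scaling parameters; HMMer and HHpred calibrate their E-values with their own procedures, so only the second relation (which assumes chance matches are Poisson distributed) is meant to carry over.

```latex
\begin{align}
E &= K\, m\, n\, e^{-\lambda S} \\
P(\text{at least one chance match with score} \ge S) &= 1 - e^{-E} \\
E = 0.02 \;&\Rightarrow\; P = 1 - e^{-0.02} \approx 0.0198 \\
E \ll 1 \;&\Rightarrow\; P \approx E
\end{align}
```

For very small values such as 3.2 × 10^-21 the E-value and the probability are therefore practically the same number, while for larger E-values they diverge.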

Abbreviations

2OG: 2-oxoglutarate
AlkB: Alkylated DNA repair protein
ESTs: Expressed sequence tags
FGENESH: Find Genes using HMM
FTO: fat mass and obesity associated
HIF: Hypoxia-inducible factor
HMMs: Hidden Markov Models
JGI: Joint Genome Institute
NCBI: National Center for Biotechnology Information
NLS: Nuclear Localization Signal
SNPs: Single nucleotide polymorphisms

References

1. Dina C, Meyre D, Gallina S, Durand E, Korner A, Jacobson P, Carlsson LM, Kiess W, Vatin V, Lecoeur C, et al.: Variation in FTO contributes to childhood obesity and severe adult obesity. Nat Genet. 2007, 39 (6): 724-726. 10.1038/ng2048.
2. Frayling TM, Timpson NJ, Weedon MN, Zeggini E, Freathy RM, Lindgren CM, Perry JR, Elliott KS, Lango H, Rayner NW, et al.: A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science. 2007, 316 (5826): 889-894. 10.1126/science.1141634.
3. Aravind L, Koonin EV: The DNA-repair protein AlkB, EGL-9, and leprecan define new families of 2-oxoglutarate- and iron-dependent dioxygenases. Genome Biol. 2001, 2 (3): RESEARCH0007. 10.1186/gb-2001-2-3-research0007.
4. Sundheim O, Vagbo CB, Bjoras M, Sousa MM, Talstad V, Aas PA, Drablos F, Krokan HE, Tainer JA, Slupphaug G: Human ABH3 structure and key residues for oxidative demethylation to reverse DNA/RNA damage. Embo J. 2006, 25 (14): 3389-3397. 10.1038/sj.emboj.7601219.
5. Yu B, Edstrom WC, Benach J, Hamuro Y, Weber PC, Gibney BR, Hunt JF: Crystal structures of catalytic complexes of the oxidative DNA/RNA repair enzyme AlkB. Nature. 2006, 439 (7078): 879-884. 10.1038/nature04561.
6. Ozer A, Bruick RK: Non-heme dioxygenases: cellular sensors and regulators jelly rolled into one?. Nat Chem Biol. 2007, 3 (3): 144-153. 10.1038/nchembio863.
7. Iyer LM, Aravind L, Coon SL, Klein DC, Koonin EV: Evolution of cell-cell signaling in animals: did late horizontal gene transfer from bacteria have a role?. Trends Genet. 2004, 20 (7): 292-299. 10.1016/j.tig.2004.05.007.
8. Semenza GL: Targeting HIF-1 for cancer therapy. Nat Rev Cancer. 2003, 3 (10): 721-732. 10.1038/nrc1187.
9. Kallio PJ, Okamoto K, O'Brien S, Carrero P, Makino Y, Tanaka H, Poellinger L: Signal transduction in hypoxic cells: inducible nuclear translocation and recruitment of the CBP/p300 coactivator by the hypoxia-inducible factor-1alpha. EMBO J. 1998, 17 (22): 6573-6586. 10.1093/emboj/17.22.6573.
10. Horton P, Park KJ, Obayashi T, Fujita N, Harada H, Adams-Collier CJ, Nakai K: WoLF PSORT: protein localization predictor. Nucleic Acids Res. 2007, 35 (Web Server issue): W585-W587. 10.1093/nar/gkm259.
11. Lusk CP, Blobel G, King MC: Highway to the inner nuclear membrane: rules for the road. Nat Rev Mol Cell Biol. 2007, 8 (5): 414-420. 10.1038/nrm2165.
12. Peters T, Ausmeier K, Ruther U: Cloning of Fatso (Fto), a novel gene deleted by the Fused toes (Ft) mouse mutation. Mammalian Genome. 1999, 10 (10): 983-986. 10.1007/s003359901144.
13. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25 (17): 3389-3402. 10.1093/nar/25.17.3389.
14. NCBI's BLAST server. [http://www.ncbi.nlm.nih.gov/BLAST/]
15. Ensembl. [http://www.ensembl.org/index.html]
16. DOE Joint Genome Institute. [http://www.jgi.doe.gov/]
17. Notredame C, Higgins DG, Heringa J: T-Coffee: A novel method for fast and accurate multiple sequence alignment. J Mol Biol. 2000, 302 (1): 205-217. 10.1006/jmbi.2000.4042.
18. Sonnhammer EL, Hollich V: Scoredist: a simple and robust protein sequence distance estimator. BMC Bioinformatics. 2005, 6: 108. 10.1186/1471-2105-6-108.
19. Eddy SR: Profile hidden Markov models. Bioinformatics. 1998, 14 (9): 755-763. 10.1093/bioinformatics/14.9.755.
20. Suzek BE, Huang H, McGarvey P, Mazumder R, Wu CH: UniRef: comprehensive and non-redundant UniProt reference clusters. Bioinformatics. 2007, 23 (10): 1282-1288. 10.1093/bioinformatics/btm098.
21. Janelia farm Hmmer web site. [http://hmmer.janelia.org/]
22. Sanchez-Pulido L, Yuan YP, Andrade MA, Bork P: NAIL-Network Analysis Interface for Linking HMMER results. Bioinformatics. 2000, 16 (7): 656-657. 10.1093/bioinformatics/16.7.656.
23. HHpred web server. [http://toolkit.tuebingen.mpg.de/hhpred]
24. Soding J: Protein homology detection by HMM-HMM comparison. Bioinformatics. 2005, 21 (7): 951-960. 10.1093/bioinformatics/bti125.
25. PredictProtein web server. [http://www.predictprotein.org/]
26. Rost B: PHD: predicting one-dimensional protein structure by profile-based neural networks. Methods Enzymol. 1996, 266: 525-539.
27. Pymol web site. [http://pymol.sourceforge.net/]
28. Holm L, Sander C: Dali: a network tool for protein structure comparison. Trends Biochem Sci. 1995, 20 (11): 478-480. 10.1016/S0968-0004(00)89105-7.
29. Supplementary material web server. [http://www.pdg.cnb.uam.es/FTO]
30. Huska MR, Buschmann H, Andrade-Navarro MA: BiasViz: Visualization of amino acid biased regions in protein alignments. Bioinformatics. 2007, 23 (22): 3093-3094. 10.1093/bioinformatics/btm489.
31. Wheeler DL, Barrett T, Benson DA, Bryant SH, Canese K, Chetvernin V, Church DM, DiCuccio M, Edgar R, Federhen S: Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2007, 35 (Database issue): D5-12. 10.1093/nar/gkl1031.

Acknowledgements

MAA is a recipient of a Canada Research Chair in Bioinformatics.

Author information

Authors and Affiliations

Centro Nacional de Biotecnologia, CSIC, Madrid, Spain
Luis Sanchez-Pulido
Molecular Medicine, Ottawa Health Research Institute, Ottawa, Canada
Miguel A Andrade-Navarro
Faculty of Medicine, University of Ottawa, Ottawa, Canada
Miguel A Andrade-Navarro
Max Delbrück Center for Molecular Medicine, Berlin, Germany
Miguel A Andrade-Navarro

Corresponding author

Correspondence to Luis Sanchez-Pulido.

Additional information

Authors' contributions

LSP carried out the initial sequence and structural analysis of the domain. LSP and MAA interpreted the data and prepared the manuscript. All authors read and approved the final manuscript.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Sanchez-Pulido, L., Andrade-Navarro, M.A. The FTO (fat mass and obesity associated) gene codes for a novel member of the non-heme dioxygenase superfamily. BMC Biochem 8, 23 (2007). https://doi.org/10.1186/1471-2091-8-23

Received: 27 July 2007
Accepted: 08 November 2007
Published: 08 November 2007
DOI: https://doi.org/10.1186/1471-2091-8-23
