Open Access Highly Accessed Research article

Sequence-dependent DNA helical rise and nucleosome stability

Francesco Pedone* and Daniele Santoni

Author Affiliations

Dept. of Genetics and Molecular Biology, 'Sapienza' University, P.le A. Moro 3, 00161 Rome, Italy

BMC Molecular Biology 2009, 10:105  doi:10.1186/1471-2199-10-105

The electronic version of this article is the complete one and can be found online at: http://www.biomedcentral.com/1471-2199/10/105


Received: 26 June 2009
Accepted: 27 November 2009
Published: 27 November 2009

© 2009 Pedone and Santoni; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background

Nucleosomes are the basic structural units of eukaryotic chromatin and play a key role in the regulation of gene expression. Resolution of the nucleosome structure revealed the bipartite nature of this particle and disclosed, on the histone surface, a symmetric distribution of positive charges able to interact with the negatively charged DNA phosphates.

Results

We analyzed helical steps in known nucleosomal DNA sequences, observing a significant relationship between their symmetric distribution and nucleosome stability. Synthetic DNA sequences able to form stable nucleosomes were used to compare distances on the left and on the right side of the nucleosomal dyad axis, where DNA phosphates and charged residues of the (H3H4)2-tetramer interact. We observed a linear relationship between coincidence of distances and nucleosome stability, i.e., the more symmetric these distances, the more stable the nucleosome.

Conclusion

Curves related to this symmetric distribution along the DNA sequence identify preferential sites for positioning of the dyad axis, which we termed palinstases. The comparison of our data with known nucleosome positions in archaeal and eukaryotic sequences shows many coincidences of location. Sequences that impair nucleosome formation and DNase I hypersensitive sites yield curves with a lower degree of symmetry. Analysis performed on DNA tracts of promoters close to the transcription start and termination sites identified peculiar patterns: in particular low affinity for nucleosome binding at the transcription start site and a high affinity exactly at the transcription termination site, suggesting a major role of nucleosomes in the termination of transcription.

Background

The role played by the DNA sequence in determining preferred positions of individual nucleosomes has been studied using both experimental and theoretical approaches. Several global assessments of nucleosome positioning have been described in yeast [1-4], in Caenorhabditis elegans [5,6], in Drosophila [7] and in humans [8-13]. Experimental mapping of nucleosomes has been performed mainly by micrococcal nuclease digestion followed either by ligation-mediated PCR analysis or by DNA microarray-based methods. Theoretical models used for nucleosome-positioning prediction include probabilistic models [1], the comparative genomics approach [14], the support vector machine classifier [3], energy landscapes [15] and DNA physical properties [16]. During nucleosome formation, 60 bp in the central region of nucleosomal DNA become primarily associated with the (H3H4)2-tetramer [17]. The histone particle presents, on its surface, a distribution of positive charges able to interact with the negatively charged DNA phosphates. These charges are symmetrically distributed with respect to the pseudo-dyad axis of the nucleosome and constitute a 'mask' of distances that remained constant during evolution [18]. It is usually assumed that DNA length is the same for any DNA sequence of the same size and that the helical rise of any dinucleotide step does not deviate to a large extent from the mean value of about 3.4 Å. More recent results, obtained by X-ray analysis of DNA crystals, suggest helical rise values around 2.83 ± 0.36 Å for A-DNA and 3.29 ± 0.21 Å for B-DNA [19]. We observed that DNA oligomers having the same number of base pairs, as reported in X-ray and NMR databases, show different lengths; for example, the length of dodecamers varies from 32 to 37 Å.

We hypothesized that nucleosome positioning is related to a symmetric distribution of distances along the DNA sequence upstream and downstream of the presumed dyad-axis location. In order to measure the length of DNA sequences as the sum of helical steps, we collected from the literature the available helical rise values of the 136 possible tetranucleotide steps of DNA.

Results and Discussion

Helical rise values of tetranucleotides

A tetranucleotide code for the 136 possible tetrads was obtained by collecting data from available databases of resolved DNA structures (see Methods, table 1). We referred each value to the central dinucleotide helical step, taking into account the first two flanking bases. For instance, we assigned a value of 3.40 Å to the ACTG tetrad, meaning that the central dinucleotide CT has this value when it occurs with an A and a G as flanking bases.
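The figure of 136 can be checked by enumeration. The sketch below assumes that a tetranucleotide and its reverse complement count as one step, which is what the paired entries quoted in the text (CGCA/TGCG, ATGA/TCAT) suggest; the helper names are illustrative and not taken from the article.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(tetrad: str) -> str:
    """Pick one representative for a tetrad and its reverse complement."""
    return min(tetrad, reverse_complement(tetrad))


all_tetrads = ["".join(p) for p in product("ACGT", repeat=4)]
unique_steps = {canonical(t) for t in all_tetrads}
# Tetrads that are their own reverse complement have no distinct partner.
self_complementary = [t for t in all_tetrads if t == reverse_complement(t)]

print(len(all_tetrads), len(self_complementary), len(unique_steps))
# prints: 256 16 136
```

Of the 256 ordered tetrads, 16 are self-complementary and the remaining 240 pair up, giving 16 + 120 = 136 distinct steps.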

Table 1. Helical rise values of the 136 possible tetranucleotides.

Data reported in table 1 show a distribution of helical rise values with a mean of 3.2 Å and maximal and minimal values of 4.46 Å (step 114, CGCA/TGCG) and 2.36 Å (step 96, ATGA/TCAT), respectively, a remarkable difference of 2.1 Å. Thirteen of the values reported in the table were calculated by averaging values for tetranucleotides containing the same central dinucleotide step. For these tetranucleotides and 39 additional ones, whose helical rise values were derived from a single DNA oligomer, rmsd values are absent. Therefore, the table needs to be refined as new resolved structures become available.

It is remarkable that tetranucleotides whose rmsd values are higher than 0.3 Å have central dinucleotides that can be stacked, in the DNA helix, into two different conformations, which is why they are termed 'bistable'. Hunter [20] reports evidence of bistability in DNA bp, mainly in the pyrimidine-purine CG and TA steps, but also in CC/GG and AG/CT. A re-classification of bistability was performed by Gardiner et al. [21] in a study on structural parameters of DNA oligomers, and, in tetranucleotides, the bistability turned out to depend on the central step in the order GG, CG, CA > GC > TA > AG, GA, AC, AT, AA. Therefore, we conclude that the high variability of helical rise values for some of the tetranucleotides in table 1 is due to the presence of a central bistable dinucleotide step, which exhibits a high sensitivity to neighboring base pairs.

Lu and Olson [19] have shown that the variation of helical rise values in dinucleotide steps is related to coupling of roll and slide values, i.e., when these parameters are both positive or both negative, DNA either lengthens or becomes shorter. We confirmed this behavior and noticed as well that the influence of adjacent nucleotides on helical rise extends over the tetranucleotide, as shown in the two examples reported in table 2. In the first example, tetranucleotide no. 23 (GAGC) is reported; in the second one, tetranucleotide no. 86 (CTGC). In GAGC tetranucleotide, derived from sample 3kbd, we observed an increase in helical rise due to the substitution of an A with a G at the right terminal end and a corresponding change in roll from positive to negative. A similar increase is observed in the CTGC step derived from sample 1bdz. This result suggests that a hexanucleotide code would be more adequate for the evaluation of helical rise than a tetranucleotide code.

Table 2. Dependence of helical rise on neighboring bases.

Symmetric elements in the (H3H4)2-tetramer

The central 57 bp of the nucleosomal particle NCP147 and the positions of primary bound DNA phosphates interacting with the (H3H4)2-tetramer, mapped by Richmond and Davey [18], are reported in figure 1. The two DNA strands are labeled W and C, and the twelve distances Li (i = 1, 2, ..., 6) and Ri (i = 1, 2, ..., 6) delimit the DNA segments we measured to assess the presence of a symmetric distribution of lengths. The maximal degree of symmetry Δls (see Methods) is obtained when Li = Ri (i = 1, 2, ..., 6). We used the twelve segments reported in figure 1 and the corresponding distances as a mask to compute the probability for each base of a given DNA sequence to represent a dyad axis. The mask in figure 1 covers 57 bp in the sequence, and two additional bp are required to calculate the helical rise of the first and the last dinucleotide step of the mask, because the tetranucleotide code is used. Given, for example, a 100-bp sequence, we compute, using the mask, 41 values starting from the 30th bp up to the 70th bp of the sequence. The translation of the mask by one bp along a DNA sequence implies its rotation through about 36°, which is required to follow the helical path of the bases. Under these conditions, in our analysis of nucleosome positioning, translational and rotational phasing are coupled.
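The 100-bp example can be generalized as a short sketch. It simply encodes the counts stated in the text (a 57-bp mask plus two flanking bp, first value at bp 30, n - 59 values in total); the function name and the extension to other sequence lengths are illustrative assumptions.

```python
MASK_BP = 57   # base pairs covered by the mask (from the text)
FLANK_BP = 2   # extra bp needed by the tetranucleotide code (from the text)


def dyad_candidates(n_bp: int) -> range:
    """1-based positions that receive a value, following the 100-bp example."""
    first = (MASK_BP + FLANK_BP + 1) // 2    # bp 30
    count = n_bp - (MASK_BP + FLANK_BP)      # n - 59, as stated in Methods
    return range(first, first + max(count, 0))


positions = dyad_candidates(100)
print(positions[0], positions[-1], len(positions))
# prints: 30 70 41
```

Sequences of 59 bp or fewer yield no values under this rule.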

Figure 1. Central 57 bp of NCP147 nucleosome. The thick black arrow, from the central, boldfaced adenine, represents the dyad axis. The sequence is divided into 6 symmetric tracts to the left (L1 to L6) and 6 to the right (R1 to R6), marked by thin arrows. W and C mark the two DNA strands at the DNA-phosphate interaction points with histones in the (H3H4)2-tetramer. This frame is referred to as the mask in the text.

Our purpose is to discover DNA sequences in which equal distances are repeatedly inverted, such as in inverted repeats of nucleotides that occur in palindromes. We term these kinds of DNA sequences "palinstases", based on the ancient Greek word "diastasis" meaning distance. It is evident that palindromic sequences are palinstasic, but the number of palinstases is expected to exceed the number of palindromes, due to the larger number of possible combinations for 136 helical rise values of tetranucleotides, when compared with the 4 possible DNA nucleotides.

Symmetric patterns of nucleosomal DNA sequences

We calculated the Δls values (see Methods) for eleven synthetic DNA sequences previously analyzed by Fitzgerald and Anderson [22] in a study of nucleosome translational positioning. These authors mapped nucleosomal positions and determined nucleosomal stability ΔG for each of their samples, including the ΔG value of the nucleosome located on 5S rDNA of Lytechinus variegatus [23]. Results from this analysis are reported in figure 2A. Δls curves exhibit either single, V-shaped, profiles with a minimum pointing towards mapped nucleosomal locations or profiles with multiple minimal Δls values. Such minimal Δls values are always located, with an uncertainty of 10 bp in a few cases, above the positions of mapped nucleosomes, suggesting that their positioning is favored by a symmetric distribution of the distances in relation to the topology of the (H3H4)2-tetramer. The variability of Δls values was tested on a sample (s601) originally selected from a pool of synthetic random DNA sequences [24] for its strong nucleosome positioning ability. The presence of two sub-populations of nucleosomes in s601 and their relative abundance in solution, assessed by single-pair fluorescence resonance energy transfer, was previously reported [25]. In figure 2B the two mapped nucleosomal positions in s601 are reported. Δls values (black line) exhibit a minimum that coincides with the first nucleosomal position on the left and several minimal values that are not coincident with the second nucleosomal position on the right. In the center of s601 sequence, at step 83, we noticed the ACGT tetranucleotide (number 110 in table 1), which accommodates the central bistable dinucleotide step CG and has the highest rmsd value (1.35 Å). For this step we substituted the mean value of the helical rise (code value) with minimal and maximal values obtained from the available structures we collected; then we reported the newly generated Δls profiles (green and red line, respectively). 
This substitution causes changes of about 2 Å in Δls value at the first nucleosomal position and a shift of about 10 bp for the minimum at the second nucleosomal position; this minimum, in new profiles, coincides with the mapped position. These changes can explain minimal Δls values of s67 and s77, whose displacement from mapped dyad positions is probably due to uncertainty characterizing some helical rise values in table 1.

Figure 2. Nucleosome stability and positioning of synthetic DNA sequences. A: Δls profiles (black line) of the eleven stable nucleosome-forming DNA sequences reported elsewhere [22]. In the panels, the name of the samples and their ΔG are reported. Dots with horizontal bars (146-bp long) mark positions of mapped nucleosomes. B: Δls profile of s601-sample at different values of helical rise for the step at position 83 of the sequence. The minimal value corresponds to the green curve, the mean value to the black curve and the maximal value to the red curve. C: Scatter plot of nucleosomal stability ΔG vs. degree of symmetry Δls.

Minimal Δls values in figure 2A, averaged over curves characterized by multiple positions, were plotted as a function of the ΔG value (figure 2C) and a linear relationship, with a correlation coefficient R = 0.89, was obtained. This result indicates that the stability of nucleosomes depends on Δls in a linear fashion and that an increase in Δls destabilizes nucleosomes. Interaction points on the (H3H4)2-tetramer and interaction points along the DNA-phosphate backbone can be more or less coincident. When they do not coincide, DNA can stretch in order to reach a distant interaction point, it can increase its curvature in order to interact with a back point, or bridging water molecules can be inserted. In fact, X-ray analysis of nucleosomal structure at high resolution showed that, inside the minor groove of DNA strands, up to 121 water-mediated hydrogen bonds can form [26]. It is evident that the substitution of an electrostatic bond with the weaker hydrogen bond of a bridging water molecule substantially destabilizes the nucleosome.
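A correlation coefficient of this kind can be computed as in the minimal sketch below. The ΔG and Δls values behind figure 2C are not listed in the text, so the numbers used here are made-up placeholders that only exercise the function; they are not the article's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


# Placeholder values, NOT data from the article.
placeholder_delta_ls = [1.0, 2.0, 3.0, 4.0]
placeholder_delta_g = [2.0, 4.1, 5.9, 8.0]
r = pearson_r(placeholder_delta_ls, placeholder_delta_g)
print(round(r, 3))
```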

Thåström et al. [27] reported a sixfold increase in affinity for selected synthetic sequences when compared with most natural nucleosome-positioning sequences. We obtained a similar variation of Δls values (figure 2B), ranging from 0.7 Å for the most stable synthetic sample up to 5.5 Å for 5S rDNA, which represents a stable natural nucleosome-forming sequence.

The symmetric length-distribution in a given DNA sequence cannot be identified in a textual way. For example, the sequence G40T30G40 is fully symmetric as text and might be expected to have Δls = 0 at the central TT step. This result would seem to contrast with the observed low nucleosome positioning affinity of poly-(A/T) tracts. Actually, the Δls profile calculated for this sequence yields a minimum of 2.3 Å, due to differences in helical rise between the GGGT, GGTT and GTTT tetranucleotides on the left side and the TTTG, TTGG and TGGG tetranucleotides on the right side of the central TT step. It must be mentioned that a Δls = 0 value can be attained by a sequence such as G40T59G40, and the minimum will be located at the central T(30).
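The G40T30G40 example can be made concrete with a short check. The sketch assumes, as the paired entries quoted for table 1 (CGCA/TGCG, ATGA/TCAT) suggest, that one helical rise value is shared by a tetranucleotide and its reverse complement; the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(tetrad: str) -> str:
    """One key per tetrad and its reverse complement (assumed table layout)."""
    return min(tetrad, reverse_complement(tetrad))


seq = "G" * 40 + "T" * 30 + "G" * 40
is_mirror = seq == seq[::-1]                     # symmetric as text
is_palindrome = seq == reverse_complement(seq)   # symmetric as duplex DNA

left = ["GGGT", "GGTT", "GTTT"]
right = [t[::-1] for t in left]                  # TGGG, TTGG, TTTG
for l, r in zip(left, right):
    # Mirror-image tetrads map to different table entries.
    print(l, r, canonical(l), canonical(r), canonical(l) == canonical(r))

print(is_mirror, is_palindrome)
# last line prints: True False
```

Under this assumption the junction tetranucleotides on the two sides map to different table entries, so nothing forces their helical rise values to be equal, which is consistent with the non-zero minimum reported for this sequence.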

We observed very low Δls values (0.3 - 0.6 Å) for synthetic DNA sequences, 150-bp long and characterized by the repetition of the (A/T)3NN(G/C)3NN motif, as well as for (CTG)50 bp repeats. It has been shown that these sequences form stable nucleosomes [28,29].

We calculated Δls for the sequences of two well characterized nucleosomal particles, NCP147 [18] and NCP146 [30] (figure 3). NCP146 is a 146-bp long palindromic sequence derived from human α satellite DNA, which yields an X-ray structure resolved at 2.8 Å. NCP147 differs from NCP146 by a substitution at position 21 (T → A) and at the corresponding palindromic position 127 (A → T) and by an insertion of a G at position 73. Its X-ray structure is resolved at 1.9 Å. This substantial improvement of X-ray diffraction data was obtained upon increasing the DNA length by just one bp, and it was observed that the distribution of interaction points between DNA-phosphates and the histone core was the same for both NCP146 and NCP147. NCP147 is not a perfect palindrome because two dinucleotide steps, AA and AT, located at position -1 and 1, respectively, are not reverse complementary (see figure 1). The sequence becomes palindromic at -2 and 2. Furthermore, by measuring each dinucleotide step with a tetranucleotide code, reverse complementarity is observed only starting from the -3W and 3C positions. The measurement of Δls values (see Methods) uses the convention of dividing the central segment, from position -3 to 3, into two identical halves in order to make equal the segments L1 and R1 of the mask shown in figure 1. The degree of symmetry of a sequence is, therefore, mainly based on contacts between phosphates and histones beyond positions -3 and 3.

Figure 3. Symmetric nucleosomal DNA profiles. Δls profiles of NCP147 (filled circles) and NCP146 (empty circles) as a function of superhelix location (SHL).

Δls values calculated for the two samples form very similar curves that are symmetric with respect to the superhelix location (SHL) 0 of the nucleosomal dyad axis (figure 3). Due to the difference between the two sequences at positions 21 and 127, there are small differences on the left and the right side of the dyad, while a more relevant change occurs at SHL = 0, where NCP146 exhibits two positions with the same Δls value of 1.1 Å, whereas NCP147 has a single Δls value close to zero. This difference in the distribution of symmetric distances correlates with the different resolution of the X-ray structures obtained for the two particles. The lower Δls value found for NCP147 suggests a higher degree of symmetry and a tighter structure in comparison to NCP146.

Archaeal nucleosomes

To further characterize sequences forming stable nucleosomes according to the distribution of Δls values, 89 synthetic sequences, selected for their ability to form very stable archaeal nucleosomes, were analyzed [31]. These particles are usually made of 58 DNA bp bound to archaeal histones and resistant to micrococcal nuclease digestion. Archaeal histones are characterized by the same fold as eukaryotic ones and their quaternary structures resemble (H3H4)2-tetramers [32]. The analyzed sequences are 110-bp long and are formed by a variable 60-bp central core and by two identical lateral sequences of 25 bp. We show (figure 4, left column) the calculated Δls values for 4 representative samples, numbered according to authors' convention [31]. All samples were grouped together into four types: 22 samples named (a), with a symmetric profile and a low Δls value located at the center; 20 samples termed (b), with minimal Δls values displaced about 10 bp from the center; 31 samples termed (c), with a region of constant and low Δls values; and 16 samples termed (d), with minimal Δls values displaced by more than 10 bp from the center of the sequence. We report (figure 4, central column) the four mean profiles (a), (b), (c) and (d), obtained by grouping the 89 curves derived from the samples belonging to the same type. The bp number 55 corresponds to SHL 0. The 22 (a)- and the 20 (b)- samples exhibit single symmetrical profiles centered at SHL 0 and -10, respectively. The pattern of the 31 (c)- samples shows tracts that have almost constant Δls values and, consequently, multiple positions for the dyad axis. The 16 (d)-samples have two positions corresponding to minimal Δls values, displaced about 20 bp from the SHL 0 position. Nucleosomal stabilities of all the 89 reported sequences are similar, probably because their recovery was performed at the same step of the purification process [31]. 
The four mean curves reported in figure 4 (central column) exhibit a common minimal Δls value around 5 Å, which, according to data reported in figure 2, suggests a mean ΔG value typical of stable nucleosome positioning sequences such as 5S rDNA. Distributions of minimal Δls values for each of the four mean curves are shown on the right of figure 4. It is remarkable that minimal Δls values gather around 2 Å.

Figure 4. A: Archaeal nucleosomes. Left column: Δls profiles (black line). Samples are numbered according to authors' convention [31] and separated into four types: (a), (b), (c) and (d). Central column: Mean Δls profiles (black line) of the four types of archaeal nucleosomes. Right column: Distributions of minimal Δls values for corresponding archaeal profiles.

Eukaryotic nucleosomes

We analyzed 99 sequences of eukaryotic nucleosomes, 146-bp long, cited in scientific literature as strong nucleosome-positioning sequences and with dyad positions mapped at the center of the sequence, as previously reported [1]. We show (figure 5, left column) the calculated Δls values for 4 representative samples, numbered according to authors' convention [1]. All samples were grouped together into four types, according to the same characterization used for archaeal sequences (figure 4). As previously observed for archaeal sequences, minimal Δls values are mainly localized above the center of the sequences. Due to the greater length of these samples compared to archaeal ones, up to two minimal Δls values, which could represent potential dyad-axis sites, can be observed in the curves at a distance of about 30 bp. The four mean curves (figure 5, central column) exhibit very flat profiles when compared to the analogous archaeal curves (figure 4, central column). This is probably due to the uncertainty of ± 20 bp reported in literature for this set of samples [1]. Distributions of minimal Δls values (figure 5, right column) vary, in the four profiles, within the range 1-2 Å; therefore, we consider these low values as representative of high stability in the considered nucleosomal eukaryotic sequences.

Figure 5. Nucleosome-forming sequences from literature. Left column: Δls profiles (black line). Samples are numbered according to authors' convention [1] and separated into four types: (a), (b), (c) and (d). Central column: Mean Δls profiles (black line) of the four types of nucleosomes. Right column: Distributions of minimal Δls values for corresponding nucleosomal profiles.

Asymmetric DNA sequences

We analyzed 40 synthetic DNA sequences selected as refractory to nucleosome formation [33]. These samples range from 86 to 126 bp and the corresponding Δls profiles are reported in figure 6(a)-(c). A further 40 human DNA sequences, 110-bp long, selected for the presence of DNase I hypersensitive sites [34], were analyzed. The associated Δls profiles are reported in figure 6(d)-(f). The studied sequences were grouped according to curve shape: panels (a) and (d) show profiles with an ascending slope, panels (b) and (e) show curves with a descending slope, and panels (c) and (f) show curves with no slope. Most of these sequences differ from those forming stable nucleosomes, which always exhibited minimal Δls values in their center. In the 80 samples examined, we observed, in the central part of the Δls curves, values ranging from 5 to 30 Å. These values and profiles characterize regions having low affinity for nucleosomes.

Figure 6. Nucleosome-free sequences. Δls profiles (black line) for sequences that impair nucleosome formation are shown in panels (a), (b) and (c), while those for sequences accommodating DNase I hypersensitive sites are shown in panels (d), (e) and (f).

Nucleosomal stability at promoters

In figure 7 we report the Δls profile obtained from the DNA sequence of 5S rRNA [35], with published nucleosome positions shown as black dots with 146-bp long horizontal bars. Nucleosome mapping was made with micrococcal nuclease and dyad positions are affected by an uncertainty of ± 20 bp. We took the minimal Δls values as possible dyad positions of nucleosomes and marked them with blue dots having 146-bp long horizontal bars in order to visualize the extension of the sequence covered by the nucleosomes. We mapped the nucleosomes starting from the most stable Δls value of 0.8 Å, found at position 664 bp. Several other adjacent minima with similar values around this position were excluded from the mapping, so that the two experimental nucleosomes at positions 595 and 750 cannot be correctly predicted. We located the second most stable nucleosome, with a Δls value of 1.1 Å, at position 404 bp and the third, with a Δls value of 1.8 Å, at position 53 bp. We therefore obtained two coincidences with published positions in the 5S rRNA sequence with an uncertainty lower than 20 bp; a fourth and last nucleosome can be inserted either at position 209 or at position 251 bp, with an uncertainty of 1 bp or 41 bp, respectively. We must recall the variability of the Δls profile reported in figure 2B for the s601 sample, where minimal Δls values could vary by up to 2 Å and shift by about 10 bp in the presence of bistable dinucleotide steps. This reasonably precludes accurate mapping of nucleosomes by use of minimal Δls values. Relative Δls values may instead be taken as reliable indicators of nucleosomal stability when their measurements are based on a statistical approach. The Δls profile in figure 7 shows, upstream of the transcription start site (TSS), a region of about 200 bp with high Δls values, and it is known that nucleosome-free regions are always found in proximity of the TSS.

Figure 7. Localization and stability of 5S rRNA nucleosomes. Δls profiles (black line) of 869-bp long 5S rDNA sequence from Xenopus borealis. Nucleosomal positions mapped with micrococcal nuclease [35] (black dots in the center of 146-bp long horizontal bars) and mapped according to minimal Δls values (blue dots in the center of 146-bp long horizontal bars) are reported with specific abscissa and ordinate values marked in blue and black digits respectively. The arrow marks the position of the transcription start site.

We analyzed 2126 experimentally identified DNA sequences from promoter regions, each 3 kb in length. They belong to chromosomal genes of vertebrates and represent a set of not-closely-related sequences. For comparison, we generated 2126 random DNA sequences that were processed in the same way. Mean Δls profiles of promoter and random sequences are presented in figure 8A. A region showing high Δls values is present in the promoter profile when compared to random DNA. Almost 50% of the analyzed promoters originate from the human genome; hence the nucleosome positions designated -1 and +1 in human promoters [13] are reported for comparison. Nucleosome-free regions in proximity of the transcription start site have been reported in yeast [36], Drosophila [7], and humans [8-13] and, in particular, a lower stability of the -1 nucleosome, when compared to the nucleosome at the +1 position, has been observed [13]. According to our model, the highest point of the curve (figure 8A) is the best candidate for representing a nucleosome-free region and is located between the -1 and +1 nucleosomes at position -37 with respect to the start codon. The region upstream of this point shows a profile with a descending slope and can be considered a weaker nucleosome-forming region. The +1 nucleosome is located downstream of the maximal Δls value and positioned under the center of a V-shaped and symmetric profile. The latter resembles many profiles we previously reported as preferred for nucleosome formation. These results indicate a preserved sequence-specificity of nucleosome binding in promoters of vertebrates. The agreement between our results and the experimental mapping obtained by DNA microarray-based methods or computational algorithms supports the consistency of our approach, based on a very simple computation of symmetry.
We also analyzed DNA sequences belonging to 35650 human promoters [37], spanning 500 bp upstream and downstream of both the transcription start site and the transcription termination site (TTS). Using the same approach reported above, a mean profile for each nucleotide position was obtained and is reported in figure 8B.
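The averaging step can be sketched as follows, assuming that every profile has the same length and is already aligned at the reference site. The text does not say how the random control sequences were generated, so drawing the four bases uniformly is an assumption, and the function names are illustrative.

```python
import random


def random_dna(length: int, rng: random.Random) -> str:
    """Random sequence with uniform base composition (an assumption)."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def mean_profile(profiles):
    """Position-wise mean of equal-length, pre-aligned profiles."""
    if not profiles:
        raise ValueError("no profiles given")
    length = len(profiles[0])
    if any(len(p) != length for p in profiles):
        raise ValueError("profiles must have the same length")
    return [sum(column) / len(profiles) for column in zip(*profiles)]


# Toy numbers that only illustrate the averaging, not article data.
toy_mean = mean_profile([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
print(toy_mean)
# prints: [2.0, 3.0, 4.0]

# Three 3-kb control sequences, seeded for reproducibility.
controls = [random_dna(3000, random.Random(seed)) for seed in range(3)]
```

Each control sequence would then be converted to a Δls profile in the same way as a promoter before averaging.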

Figure 8. Nucleosomal stability at promoters. A: Scatter plot of the mean Δls value averaged over 2126 promoter (black line) and 2126 random (green line) sequences, 3 kbp long and aligned at the transcription start site. Black dots with horizontal bars mark mapped nucleosomal positions in the human genome [13]. B: Scatter plot of the mean Δls value averaged over 35650 human promoter sequences at transcription start (black line) and termination (red line) sites.

For the region close to the TSS, the result shows the same profile, identifying a low affinity for nucleosomes about 100 bp upstream of the TSS.

A completely different scenario is found for the region around the TTS. The plot clearly shows a very high affinity for nucleosomes exactly at the predicted TTS. It is remarkable that the V-shaped part of the plot, corresponding to the potential nucleosome identified, extends over about 150 bp, the length of DNA covered by a nucleosome.

Conclusion

We hypothesized that symmetric distributions of DNA lengths could be related to nucleosome formation and suggested two novel ideas to test this hypothesis. First we used a tetranucleotide code in order to measure DNA length and then we searched for symmetric distributions of lengths according to the frame inherent to the concept of palinstase. The results reported above show a linear relationship between nucleosome stability and symmetry measured by Δls values of known nucleosome-forming sequences. Minimal Δls values in the profiles of several analyzed DNA sequences were consistent with preferential nucleosome formation. The presence of many contiguous minimal Δls values (4-5 every 200 bp) and of flat Δls profiles severely limits the use of our results for obtaining genome-wide maps of nucleosome positions. Δls values may instead be taken as reliable indicators of nucleosomal stability when their measurements are based on a statistical approach. In human promoters we observed low affinity for nucleosome binding at the transcription start site and a high affinity exactly at the transcription termination site. Pending the acquisition of more experimental data on DNA helical rise values, we consider our results a preliminary assessment of the weight of DNA length in nucleosome positioning.

Methods

We drew on structural databases deposited at http://ndbserver.rutgers.edu/atlas to find helical rise values of naked DNA oligomers obtained by NMR analysis, since this technique applies to samples in the liquid phase, which represents the state of DNA in living organisms more reliably than the crystalline phase.

Samples found in the database were selected by discarding those studied in aqueous dilute liquid crystalline phase, which is typically used to resolve long-range structures (> 10 Å), but yields a poor resolution at distances such as those found for helical rise. 99 values of tetranucleotide helical rise, out of the 136 possible ones, were derived this way. 14 further values were found by searching the database for samples of DNA oligomers accommodating one modified base, when the tetranucleotide sequence of interest was at least two steps away from the modified base. In these samples, we verified that the presence of the modified base does not change the overall structure of the double helix and checked the similarity between helical rise values found in the modified samples and in the normal ones (data not shown). 10 of the missing helical rise values were taken from the X-ray database and the remaining 13 were calculated by averaging values for tetranucleotides containing the same central dinucleotide step, which accounts for all 136 tetranucleotides (99 + 14 + 10 + 13).

To express the DNA sequence as a linear array of consecutive helical steps, we read the first tetranucleotide of the sequence and derive, from Table 1, the first helical rise value, which relates to the dinucleotide step between the second and the third bp. The second tetranucleotide of the sequence yields the helical rise between the third and the fourth bp, and so on up to the end of the sequence. Given a sequence of n bp, the number of elements in the array of helical rise values is n-3. To allow positions to be compared between different DNA sequences, base-pair numbering coincides with helical-step numbering, but the first helical rise value and the last two are missing. The use of the mask (Figure 1), which covers 56 helical rise values, further reduces the original number n; the final number of data points is therefore n-59.
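The sliding tetranucleotide reading described above can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the names `helical_rise_array` and `PLACEHOLDER_TABLE` are hypothetical, the table holds a uniform placeholder of 3.4 Å rather than the values of Table 1, and the lookup assumes that a tetranucleotide and its reverse complement share one rise value.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def canonical(seq):
    """Representative of a sequence and its reverse complement
    (assumption: the rise is the same read from either strand)."""
    return min(seq, seq.translate(COMPLEMENT)[::-1])

def helical_rise_array(sequence, rise_table):
    """Map a DNA sequence of n bp to its n-3 helical rise values.

    Element i (0-based) is looked up from the tetranucleotide starting
    at base i and describes the step between bp i+2 and bp i+3
    (1-based), i.e. the central step of that tetranucleotide.
    """
    seq = sequence.upper()
    return [rise_table[canonical(seq[i:i + 4])] for i in range(len(seq) - 3)]

# Placeholder table: every step set to 3.4 Å. These are NOT the values
# of Table 1 and serve only to make the sketch runnable.
PLACEHOLDER_TABLE = {
    canonical("".join(p)): 3.4 for p in product("ACGT", repeat=4)
}

rises = helical_rise_array("ACGTTGCAAT", PLACEHOLDER_TABLE)
print(len(rises))  # a 10-bp sequence yields 10 - 3 = 7 values
```

Replacing the placeholder table with measured tetranucleotide values would give the array to which the mask of Figure 1 is then applied.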

We compute the rate of symmetry of helical rise distribution for each base pair of any DNA sequence according to the following equation:

(1)

where Li and Ri correspond to the lengths shown in Figure 1.

Δls values for the two tracts L1 and R1 are always equal, owing to the convention of dividing the central segment from -3 to 3 into two identical halves. The minimal Δls value obtained represents the maximum degree of symmetry.

DNA sequences from Archaeal nucleosomes are available on request from:

John N. Reeve at reeve.2@osu.edu

DNA sequences from the literature were retrieved from:

http://genie.weizmann.ac.il/pubs/nucleosomes06/segal06_data.html

DNA sequences that impair nucleosome formation are available on request from:

Mikael.Kubista@bcbp.chalmers.se

DNA sequences from DNase I hypersensitive sites were retrieved from:

http://www.research.nhgri.nih.gov/DNaseHS/May2005/

DNA promoter sequences of vertebrates were retrieved from the EPD database:

http://www.epd.isb-sib.ch/seq_download.html

DNA promoter sequences of the human genome were retrieved from:

http://genome.ucsc.edu/ENCODE/encode.hg17.html

Authors' contributions

FP led the project, designed the computational analysis and drafted the initial manuscript. DS performed the computational analysis and helped to draft the manuscript. Both authors read and approved the final manuscript.

Acknowledgements

We wish to thank Prof. Paola Ballario for useful comments and suggestions.
