<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1471-2164-13-S1-S5</ui>
   <ji>1471-2164</ji>
   <fm>
      <dochead>Proceedings</dochead>
      <bibl>
         <title>
            <p>Learning generative models of molecular dynamics</p>
         </title>
         <aug>
            <au id="A1"><snm>Razavian</snm><mnm>Sharif</mnm><fnm>Narges</fnm><insr iid="I1"/></au>
            <au id="A2"><snm>Kamisetty</snm><fnm>Hetunandan</fnm><insr iid="I2"/></au>
            <au ca="yes" id="A3"><snm>Langmead</snm><mi>J</mi><fnm>Christopher</fnm><insr iid="I3"/><insr iid="I4"/><email>cjl@cs.cmu.edu</email></au>
         </aug>
         <insg>
            <ins id="I1"><p>Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA</p></ins>
            <ins id="I2"><p>Department of Biochemistry, University of Washington, Seattle, WA 98195, USA</p></ins>
            <ins id="I3"><p>Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA</p></ins>
            <ins id="I4"><p>Lane Center for Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA</p></ins>
         </insg>
         <source>BMC Genomics</source>
         
         
         <supplement><title><p>Selected articles from the Tenth Asia Pacific Bioinformatics Conference (APBC 2012)</p></title><editor>Yi-Ping Phoebe Chen and Peer Bork</editor><note>Proceedings</note></supplement><conference><title><p>The Tenth Asia Pacific Bioinformatics Conference (APBC 2012)</p></title><location>Melbourne, Australia</location><date-range>17-19 January 2012</date-range><url>http://homepage.cs.latrobe.edu.au/ypchen/APBC2012/</url></conference><issn>1471-2164</issn>
         <pubdate>2012</pubdate>
         <volume>13</volume>
         <issue>Suppl 1</issue>
         <fpage>S5</fpage>
         <url>http://www.biomedcentral.com/1471-2164/13/S1/S5</url>
         <xrefbib><pubidlist><pubid idtype="pmpid">22369071</pubid><pubid idtype="doi">10.1186/1471-2164-13-S1-S5</pubid></pubidlist></xrefbib>
      </bibl>
      <history><pub><date><day>17</day><month>1</month><year>2012</year></date></pub></history>
      <cpyrt><year>2012</year><collab>Razavian et al.; licensee BioMed Central Ltd.</collab><note>This is an open access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note></cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <p>We introduce three algorithms for learning generative models of molecular structures from molecular dynamics simulations. The first algorithm learns a Bayesian-optimal undirected probabilistic model over user-specified covariates (e.g., fluctuations, distances, angles, etc.). <it>L</it><sub>1 </sub>regularization is used to ensure sparse models and thus reduce the risk of over-fitting the data. The topology of the resulting model reveals important couplings between different parts of the protein, thus aiding in the analysis of molecular motions. The generative nature of the model makes it well-suited to making predictions about the global effects of local structural changes (e.g., the binding of an allosteric regulator). Additionally, the model can be used to sample new conformations. The second algorithm learns a time-varying graphical model where the topology and parameters change smoothly along the trajectory, revealing the conformational sub-states. The last algorithm learns a Markov Chain over undirected graphical models which can be used to study and simulate kinetics. We demonstrate our algorithms on multiple molecular dynamics trajectories.</p>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Introduction</p>
         </st>
         <p>The three dimensional structures of proteins and other molecules vary in time according to the laws of thermodynamics. Each molecule visits an ensemble of states which can be partitioned into distinct <it>conformational sub-states </it><abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr></abbrgrp> consisting of similar structures. The study of these conformational sub-states remains an active area of research <abbrgrp><abbr bid="B3">3</abbr><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr></abbrgrp> and has provided valuable insights into biological function, such as enzyme catalysis <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B6">6</abbr><abbr bid="B7">7</abbr></abbrgrp> and energy transduction <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>.</p>
         <p>Molecular dynamics (MD) simulations are often used to characterize conformational dynamics <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. These simulations are performed by numerically integrating Newton's laws of motion for a set of atoms. Conformational frames are written to disk into a <it>trajectory </it>for subsequent analysis. Until recently, MD simulations were limited to time-scales of several tens of nanoseconds (<it>ns </it>= 10<sup>-9 </sup>sec.). Recent advances in hardware and software (e.g., <abbrgrp><abbr bid="B10">10</abbr><abbr bid="B11">11</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr><abbr bid="B14">14</abbr></abbrgrp>) make it possible to investigate conformational dynamics on microsecond (<it>&#956;s </it>= 10<sup>-6 </sup>sec.) and millisecond (<it>ms </it>= 10<sup>-3 </sup>sec.) time-scales. Such long simulations are especially well-suited to identifying and studying the conformational sub-states relevant to biological function. Unfortunately, the corresponding trajectories are often difficult to analyze and interpret due to their size and complexity. Thus, there is a need for algorithms for analyzing such long timescale trajectories. The primary goal of this paper is to introduce new algorithms to do so.</p>
         <p>Our approach to analyzing MD data is to learn generative models known as Markov Random Fields (MRF). This is the first time MRFs have been used to model MD data. A MRF is an undirected probabilistic graphical model that encodes the joint probability distribution over a set of user-specified variables. In this paper those variables correspond to the positional fluctuations of the atoms, but the technique can be easily extended to other quantities, such as pairwise distances or angles. The generative nature of the model means that new conformations can be sampled and, perhaps more importantly, that users can make structural alterations to one part of the model (e.g., modeling the binding of a ligand) and then perform inference to predict how the rest of the system will respond.</p>
         <p>We present three closely related algorithms. The first algorithm learns a single model from the data. Both the topology and the parameters of the model are learned. The topology of the learnt graph reveals which variables are directly coupled and which correlations are indirect. Alternative methods, such as constructing a covariance matrix, cannot distinguish between direct and indirect correlations. Our algorithm is guaranteed to produce an optimal model. Regularization is used to reduce the tendency to over-fit the data. The second algorithm learns a time-varying model where the topology and parameters of the MRF change smoothly over time. Time-varying models reveal the different conformational sub-states visited by the molecule and the features of the energy barriers that separate them. The final algorithm learns a Markov Chain over MRFs which can be used to generate new trajectories and to study kinetics.</p>
      </sec>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <sec>
            <st>
               <p>Molecular dynamics simulation</p>
            </st>
            <p>Molecular Dynamics simulations involve integrating Newton's laws of motion for a set of atoms. Briefly, given a set of <it>n </it>atomic coordinates <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i1"><m:mi mathvariant="bold">X</m:mi>
<m:mo class="MathClass-rel">=</m:mo>
<m:mrow>
   <m:mo class="MathClass-open">{</m:mo>
   <m:mrow>
      <m:msub>
         <m:mrow>
            <m:mover accent="true">
               <m:mrow>
                  <m:mi>X</m:mi>
               </m:mrow>
               <m:mo class="MathClass-op">&#8594;</m:mo>
            </m:mover>
         </m:mrow>
         <m:mrow>
            <m:mn>1</m:mn>
         </m:mrow>
      </m:msub>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:mo class="MathClass-punc">.</m:mo>
      <m:mo class="MathClass-punc">.</m:mo>
      <m:mo class="MathClass-punc">.</m:mo>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:msub>
         <m:mrow>
            <m:mover accent="true">
               <m:mrow>
                  <m:mi>X</m:mi>
               </m:mrow>
               <m:mo class="MathClass-op">&#8594;</m:mo>
            </m:mover>
         </m:mrow>
         <m:mrow>
            <m:mi>n</m:mi>
         </m:mrow>
      </m:msub>
      <m:mo class="MathClass-punc">:</m:mo>
      <m:msub>
         <m:mrow>
            <m:mover accent="true">
               <m:mrow>
                  <m:mi>X</m:mi>
               </m:mrow>
               <m:mo class="MathClass-op">&#8594;</m:mo>
            </m:mover>
         </m:mrow>
         <m:mrow>
            <m:mi>i</m:mi>
         </m:mrow>
      </m:msub>
      <m:mo class="MathClass-rel">&#8712;</m:mo>
      <m:msup>
         <m:mrow>
            <m:mi>&#8477;</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mn>3</m:mn>
         </m:mrow>
      </m:msup>
   </m:mrow>
   <m:mo class="MathClass-close">}</m:mo>
</m:mrow>
</m:math></inline-formula> and the corresponding velocity vectors <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i2"><m:mi mathvariant="bold">V</m:mi>
<m:mo class="MathClass-rel">=</m:mo>
<m:mrow>
   <m:mo class="MathClass-open">{</m:mo>
   <m:mrow>
      <m:msub>
         <m:mrow>
            <m:mover accent="true">
               <m:mrow>
                  <m:mi>V</m:mi>
               </m:mrow>
               <m:mo class="MathClass-op">&#8594;</m:mo>
            </m:mover>
         </m:mrow>
         <m:mrow>
            <m:mn>1</m:mn>
         </m:mrow>
      </m:msub>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:mo class="MathClass-punc">.</m:mo>
      <m:mo class="MathClass-punc">.</m:mo>
      <m:mo class="MathClass-punc">.</m:mo>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:msub>
         <m:mrow>
            <m:mover accent="true">
               <m:mrow>
                  <m:mi>V</m:mi>
               </m:mrow>
               <m:mo class="MathClass-op">&#8594;</m:mo>
            </m:mover>
         </m:mrow>
         <m:mrow>
            <m:mi>n</m:mi>
         </m:mrow>
      </m:msub>
      <m:mo class="MathClass-punc">:</m:mo>
      <m:msub>
         <m:mrow>
            <m:mover accent="true">
               <m:mrow>
                  <m:mi>V</m:mi>
               </m:mrow>
               <m:mo class="MathClass-op">&#8594;</m:mo>
            </m:mover>
         </m:mrow>
         <m:mrow>
            <m:mi>i</m:mi>
         </m:mrow>
      </m:msub>
      <m:mo class="MathClass-rel">&#8712;</m:mo>
      <m:msup>
         <m:mrow>
            <m:mi>&#8477;</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mn>3</m:mn>
         </m:mrow>
      </m:msup>
   </m:mrow>
   <m:mo class="MathClass-close">}</m:mo>
</m:mrow>
</m:math></inline-formula>, MD updates the positions and velocities of each atom according to an energy potential. The updates are performed via numerical integration, resulting in a conformational <it>trajectory</it>. The size of the time step for the numerical integration is normally on the order of 1-2 femtoseconds (<it>fs </it>= 10<sup>-15 </sup>sec), meaning that a 1 microsecond simulation requires on the order of one billion integration steps. In most circumstances, every 1000th to 10000th conformation is written to disk as an ordered series of <it>frames</it>.</p>
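            <p>As a rough illustration of the integrate-and-save loop described above, the following Python sketch applies the velocity Verlet scheme to a single particle in a one-dimensional harmonic potential and keeps every 1000th position as a frame. The toy potential, the function names simulate and integration_steps, and all parameter values are assumptions made for this example only; production MD packages use far more elaborate force fields and integrators. The last lines reproduce the step-count arithmetic for a 1 microsecond simulation.</p>
            <p>
```python
def simulate(x0, v0, n_steps, dt, stride, k=1.0, mass=1.0):
    """Velocity Verlet for a 1-D harmonic oscillator (toy potential U = k x^2 / 2).

    Returns the saved positions ("frames"), one every `stride` steps.
    """
    x, v = x0, v0
    force = -k * x
    frames = []
    for step in range(1, n_steps + 1):
        # Half-step velocity update, then full-step position update.
        v_half = v + 0.5 * dt * force / mass
        x = x + dt * v_half
        # Recompute the force at the new position and finish the velocity update.
        force = -k * x
        v = v_half + 0.5 * dt * force / mass
        # Only every `stride`-th conformation is kept as a frame.
        if step % stride == 0:
            frames.append(x)
    return frames


def integration_steps(simulation_fs, time_step_fs):
    """Number of integration steps, with both durations given in femtoseconds."""
    return simulation_fs // time_step_fs


# Toy run: 10,000 steps, saving every 1,000th conformation.
frames = simulate(x0=1.0, v0=0.0, n_steps=10_000, dt=0.001, stride=1000)
n_frames = len(frames)

# 1 microsecond = 10**9 fs, so a 1 fs step needs 10**9 integration steps
# and a 2 fs step needs half as many.
steps_1fs = integration_steps(10**9, 1)
steps_2fs = integration_steps(10**9, 2)
```
            </p>
            <p>With unit mass and unit force constant the exact solution is the cosine of the elapsed time, which gives a simple check on the integrator in this toy setting.</p>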
            <p>Traditional methods for analyzing MD data either monitor the dynamics of global statistics (e.g., the radius of gyration, total energy, etc.), or else identify sub-states by clustering the frames <abbrgrp><abbr bid="B15">15</abbr><abbr bid="B16">16</abbr><abbr bid="B17">17</abbr></abbrgrp> or through Principal Components Analysis (PCA) and closely related methods (e.g., <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr><abbr bid="B22">22</abbr></abbrgrp>). Clustering-based methods do not produce generative models and generally rely on pairwise comparisons between frames, and thus run in quadratic time with respect to the number of frames in the trajectory. Our algorithms produce generative models and only perform linear work in the number of frames. This complexity difference is especially important for long timescale simulations. PCA-based methods implicitly assume that the data are drawn from a multivariate Gaussian distribution. Our method makes the same assumption but differs from PCA in two important ways. First, PCA projects the data onto an orthogonal basis. Our method involves no change of basis, making the resulting model easier to interpret. Second, we employ <it>L</it><sub>1</sub> regularization when learning the parameters of our model. Regularization is a common strategy for reducing the tendency to over-fit data by, informally, penalizing overly complicated models. We use <it>L</it><sub>1</sub> regularization because it has desirable statistical properties. Specifically, it leads to consistent models (that is, given enough data our algorithm learns the true model) while enjoying high efficiency (that is, the number of samples needed to achieve the true model is small).</p>
            <p>More recently, Lange and Grubm&#252;ller introduced full correlation analysis <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>, which can capture both linear and non-linear correlated motions from MD simulations. The algorithms in this paper are limited to linear models, but we note that they can be easily extended to more complex forms by using non-Gaussian random variables (e.g., <abbrgrp><abbr bid="B24">24</abbr><abbr bid="B25">25</abbr></abbrgrp>). Our final algorithm produces models that resemble Markov State Models (MSMs) <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> but are different in that they are fully generative.</p>
         </sec>
         <sec>
            <st>
               <p>Markov Random Fields</p>
            </st>
            <p>A Markov Random Field <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i3"><m:mi mathvariant="script">M</m:mi>
<m:mo class="MathClass-rel">=</m:mo>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:mi mathvariant="script">G</m:mi>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:mi>&#920;</m:mi>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula> consists of an undirected graph <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i4"><m:mi mathvariant="script">G</m:mi>
</m:math></inline-formula> over a set of random variables <it>X </it>= {<it>X</it><sub>1</sub>, ..., <it>X</it><sub><it>n</it></sub>} and a set of functions &#920; over the nodes and edges of <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i4"><m:mi mathvariant="script">G</m:mi></m:math></inline-formula>. Together, they define the joint distribution <it>P</it>(<b>X</b>). The topology of the graph determines the set of <it>conditional independencies </it>between the variables. In particular, the <it>i</it>th random variable is conditionally independent of the remaining variables, given its neighbors in the graph. Informally, if variables <it>X</it><sub><it>i </it></sub>and <it>X</it><sub><it>j </it></sub>are not connected by an edge in the graph, then any correlation between them is indirect. By 'indirect' we mean that the correlation between <it>X</it><sub><it>i </it></sub>and <it>X</it><sub><it>j </it></sub>(if any) can be explained in terms of a pathway of correlations (e.g., <it>X</it><sub><it>i </it></sub>&#8594; <it>X</it><sub><it>k </it></sub>&#8594; &#183;&#183;&#183; &#8594; <it>X</it><sub><it>j</it></sub>). Conversely, if <it>X</it><sub><it>i </it></sub>and <it>X</it><sub><it>j </it></sub>are connected by an edge, then the correlation is direct. Our algorithm automatically detects these conditional independencies and learns the sparsest possible model, subject to fitting the data.</p>
         </sec>
         <sec>
            <st>
               <p>Gaussian Graphical Models</p>
            </st>
            <p>A <it>Gaussian Graphical Model </it>(GGM) or <it>Gaussian Markov Random Field </it>is simply a MRF where each variable is normally distributed. Thus, a GGM encodes a multivariate Gaussian distribution. A GGM has parameters <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i5"><m:mi mathvariant="script">M</m:mi>
<m:mo class="MathClass-rel">=</m:mo>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:mover accent="true">
         <m:mrow>
            <m:mi>h</m:mi>
         </m:mrow>
         <m:mo class="MathClass-op">&#8594;</m:mo>
      </m:mover>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:msup>
         <m:mrow>
            <m:mo class="MathClass-op">&#8721;</m:mo>
         </m:mrow>
         <m:mrow>
            <m:mo class="MathClass-bin">-</m:mo>
            <m:mn>1</m:mn>
         </m:mrow>
      </m:msup>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula> where &#931;<sup>-1 </sup>is an <it>n </it>&#215; <it>n </it>matrix (known as the <it>precision matrix</it>) and <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i6"><m:mover accent="true">
   <m:mrow>
      <m:mi>h</m:mi>
   </m:mrow>
   <m:mo class="MathClass-op">&#8594;</m:mo>
</m:mover>
</m:math></inline-formula> is a <it>n </it>&#215; 1 vector. The non-zero elements of &#931;<sup>-1 </sup>reveal the edges in the MRF. The inverse of the precision matrix, denoted by &#931;, is the covariance matrix for a multivariate Gaussian distribution with mean <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i7"><m:mover accent="true">
   <m:mrow>
      <m:mi>&#956;</m:mi>
   </m:mrow>
   <m:mo class="MathClass-op">&#8594;</m:mo>
</m:mover>
<m:mo class="MathClass-rel">=</m:mo>
<m:msup>
   <m:mrow>
      <m:mover accent="true">
         <m:mrow>
            <m:mi>h</m:mi>
         </m:mrow>
         <m:mo class="MathClass-op">&#8594;</m:mo>
      </m:mover>
   </m:mrow>
   <m:mrow>
      <m:mi>T</m:mi>
   </m:mrow>
</m:msup>
<m:mo class="MathClass-op"> &#8721;</m:mo>
</m:math></inline-formula>.</p>
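            <p>To make the link between the precision matrix and the graph concrete, the following Python sketch builds a hypothetical three-variable chain (variable 1 coupled to variable 2, and variable 2 coupled to variable 3) and inverts its precision matrix exactly. The numerical values and the helper name invert are assumptions chosen for illustration and are not taken from any trajectory. The precision matrix has a zero in position (1, 3), so there is no edge between variables 1 and 3, yet the corresponding covariance entry is non-zero: the two variables are correlated only indirectly, through variable 2.</p>
            <p>
```python
from fractions import Fraction


def invert(matrix):
    """Invert a small square matrix exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment with the identity: [A | I].
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a non-zero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Normalize the pivot row, then eliminate the column elsewhere.
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in aug]


# Hypothetical precision matrix for the chain 1 - 2 - 3 (illustrative values).
precision = [[2, -1, 0],
             [-1, 2, -1],
             [0, -1, 2]]
covariance = invert(precision)

# Edges of the graph are the non-zero off-diagonal precision entries.
edges = [(i + 1, j + 1)
         for i in range(3) for j in range(i + 1, 3)
         if precision[i][j] != 0]

# Variables 1 and 3 share no edge, but their covariance is not zero.
indirect_covariance = covariance[0][2]
```
            </p>
            <p>Here edges contains only the pairs (1, 2) and (2, 3), while indirect_covariance is non-zero, which is the distinction between direct and indirect correlation that a covariance matrix alone cannot reveal.</p>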
            <p>Gaussian distributions have a number of desirable properties including the availability of analytic expressions for a variety of quantities. For example, the probability of observing <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i8"><m:mover accent="true">
   <m:mrow>
      <m:mi>x</m:mi>
   </m:mrow>
   <m:mo class="MathClass-op">&#8594;</m:mo>
</m:mover>
<m:mo class="MathClass-rel">=</m:mo>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:msub>
         <m:mrow>
            <m:mi>x</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mn>1</m:mn>
         </m:mrow>
      </m:msub>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:mo class="MathClass-punc">.</m:mo>
      <m:mo class="MathClass-punc">.</m:mo>
      <m:mo class="MathClass-punc">.</m:mo>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:msub>
         <m:mrow>
            <m:mi>x</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>n</m:mi>
         </m:mrow>
      </m:msub>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula> is:</p>
            <p>
               <display-formula id="M1">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i9"><m:mrow>
   <m:mi>P</m:mi>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi mathvariant="bold">x</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mfrac>
      <m:mrow>
         <m:mn>1</m:mn>
      </m:mrow>
      <m:mrow>
         <m:mi>Z</m:mi>
      </m:mrow>
   </m:mfrac>
   <m:mo class="qopname">exp</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">{</m:mo>
      <m:mrow>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:mfrac>
            <m:mrow>
               <m:mn>1</m:mn>
            </m:mrow>
            <m:mrow>
               <m:mn>2</m:mn>
            </m:mrow>
         </m:mfrac>
         <m:msup>
            <m:mrow>
               <m:mrow>
                  <m:mo class="MathClass-open">(</m:mo>
                  <m:mrow>
                     <m:mover accent="true">
                        <m:mrow>
                           <m:mi>x</m:mi>
                        </m:mrow>
                        <m:mo class="qopname">&#8594;</m:mo>
                     </m:mover>
                     <m:mo class="MathClass-bin">-</m:mo>
                     <m:mover accent="true">
                        <m:mrow>
                           <m:mi>&#956;</m:mi>
                        </m:mrow>
                        <m:mo class="qopname">&#8594;</m:mo>
                     </m:mover>
                  </m:mrow>
                  <m:mo class="MathClass-close">)</m:mo>
               </m:mrow>
            </m:mrow>
            <m:mrow>
               <m:mi>T</m:mi>
            </m:mrow>
         </m:msup>
         <m:mover class="msup">
            <m:mrow>
               <m:mo mathsize="big">&#8721;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mo class="MathClass-bin">-</m:mo>
               <m:mn>1</m:mn>
            </m:mrow>
         </m:mover>
         <m:mrow>
            <m:mo class="MathClass-open">(</m:mo>
            <m:mrow>
               <m:mover accent="true">
                  <m:mrow>
                     <m:mi>x</m:mi>
                  </m:mrow>
                  <m:mo class="qopname">&#8594;</m:mo>
               </m:mover>
               <m:mo class="MathClass-bin">-</m:mo>
               <m:mover accent="true">
                  <m:mrow>
                     <m:mi>&#956;</m:mi>
                  </m:mrow>
                  <m:mo class="qopname">&#8594;</m:mo>
               </m:mover>
            </m:mrow>
            <m:mo class="MathClass-close">)</m:mo>
         </m:mrow>
      </m:mrow>
      <m:mo class="MathClass-close">}</m:mo>
   </m:mrow>
   <m:mo class="MathClass-punc">,</m:mo>
</m:mrow>
</m:math>
               </display-formula>
            </p>
            <p>where <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i10"><m:mi>Z</m:mi>
<m:mo class="MathClass-rel">=</m:mo>
<m:msqrt>
   <m:mrow>
      <m:msup>
         <m:mrow>
            <m:mrow>
               <m:mo class="MathClass-open">(</m:mo>
               <m:mrow>
                  <m:mn>2</m:mn>
                  <m:mi>&#960;</m:mi>
               </m:mrow>
               <m:mo class="MathClass-close">)</m:mo>
            </m:mrow>
         </m:mrow>
         <m:mrow>
            <m:mi>n</m:mi>
         </m:mrow>
      </m:msup>
      <m:mo class="MathClass-rel">&#8739;</m:mo>
      <m:mo class="MathClass-op"> &#8721;</m:mo>
      <m:mo class="MathClass-rel">&#8739;</m:mo>
   </m:mrow>
</m:msqrt>
</m:math></inline-formula> is the partition function and |&#931;| denotes the determinant of &#931;. Other quantities of interest can be computed as well, such as the free energy of the model, - ln <it>Z</it>, its differential entropy:</p>
            <p>
               <display-formula id="M2">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i11"><m:mrow>
   <m:mfrac>
      <m:mrow>
         <m:mn>1</m:mn>
      </m:mrow>
      <m:mrow>
         <m:mn>2</m:mn>
      </m:mrow>
   </m:mfrac>
   <m:mo class="qopname">ln</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">[</m:mo>
      <m:mrow>
         <m:msup>
            <m:mrow>
               <m:mrow>
                  <m:mo class="MathClass-open">(</m:mo>
                  <m:mrow>
                     <m:mn>2</m:mn>
                     <m:mi>&#960;</m:mi>
                     <m:mi>e</m:mi>
                  </m:mrow>
                  <m:mo class="MathClass-close">)</m:mo>
               </m:mrow>
            </m:mrow>
            <m:mrow>
               <m:mi>n</m:mi>
            </m:mrow>
         </m:msup>
         <m:mo class="MathClass-rel">&#8739;</m:mo>
         <m:mo mathsize="big">&#8721;</m:mo>
         <m:mo class="MathClass-rel">&#8739;</m:mo>
      </m:mrow>
      <m:mo class="MathClass-close">]</m:mo>
   </m:mrow>
</m:mrow>
</m:math>
               </display-formula>
            </p>
            <p>or the KL-divergence between two different models:</p>
            <p>
               <display-formula id="M3">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i12"><m:mrow>
   <m:mi>K</m:mi>
   <m:mi>L</m:mi>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:msub>
            <m:mrow>
               <m:mi mathvariant="script">M</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mn>0</m:mn>
            </m:mrow>
         </m:msub>
         <m:mo class="MathClass-rel">&#8741;</m:mo>
         <m:msub>
            <m:mrow>
               <m:mi mathvariant="script">M</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mn>1</m:mn>
            </m:mrow>
         </m:msub>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mn>1</m:mn>
   <m:mo class="MathClass-bin">&#8725;</m:mo>
   <m:mn>2</m:mn>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi>t</m:mi>
         <m:mi>r</m:mi>
         <m:mi>a</m:mi>
         <m:mi>c</m:mi>
         <m:mi>e</m:mi>
         <m:mrow>
            <m:mo class="MathClass-open">(</m:mo>
            <m:mrow>
               <m:munderover accent="false" accentunder="false">
                  <m:mrow>
                     <m:mo mathsize="big">&#8721;</m:mo>
                  </m:mrow>
                  <m:mrow>
                     <m:mn>1</m:mn>
                  </m:mrow>
                  <m:mrow>
                     <m:mo class="MathClass-bin">-</m:mo>
                     <m:mn>1</m:mn>
                  </m:mrow>
               </m:munderover>
               <m:munder class="msub">
                  <m:mrow>
                     <m:mo mathsize="big">&#8721;</m:mo>
                  </m:mrow>
                  <m:mrow>
                     <m:mn>0</m:mn>
                  </m:mrow>
               </m:munder>
            </m:mrow>
            <m:mo class="MathClass-close">)</m:mo>
         </m:mrow>
         <m:mo class="MathClass-bin">+</m:mo>
         <m:msup>
            <m:mrow>
               <m:mrow>
                  <m:mo class="MathClass-open">(</m:mo>
                  <m:mrow>
                     <m:msub>
                        <m:mrow>
                           <m:mover accent="true">
                              <m:mrow>
                                 <m:mi>&#956;</m:mi>
                              </m:mrow>
                              <m:mo>&#8594;</m:mo>
                           </m:mover>
                        </m:mrow>
                        <m:mrow>
                           <m:mn>1</m:mn>
                        </m:mrow>
                     </m:msub>
                     <m:mo class="MathClass-bin">-</m:mo>
                     <m:msub>
                        <m:mrow>
                           <m:mover accent="true">
                              <m:mrow>
                                 <m:mi>&#956;</m:mi>
                              </m:mrow>
                              <m:mo>&#8594;</m:mo>
                           </m:mover>
                        </m:mrow>
                        <m:mrow>
                           <m:mn>0</m:mn>
                        </m:mrow>
                     </m:msub>
                  </m:mrow>
                  <m:mo class="MathClass-close">)</m:mo>
               </m:mrow>
            </m:mrow>
            <m:mrow>
               <m:mi>T</m:mi>
            </m:mrow>
         </m:msup>
         <m:msubsup>
            <m:mrow>
               <m:mo mathsize="big"> &#8721;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mn>1</m:mn>
            </m:mrow>
            <m:mrow>
               <m:mo class="MathClass-bin">-</m:mo>
               <m:mn>1</m:mn>
            </m:mrow>
         </m:msubsup>
         <m:mrow>
            <m:mo class="MathClass-open">(</m:mo>
            <m:mrow>
               <m:msub>
                  <m:mrow>
                     <m:mover accent="true">
                        <m:mrow>
                           <m:mi>&#956;</m:mi>
                        </m:mrow>
                        <m:mo>&#8594;</m:mo>
                     </m:mover>
                  </m:mrow>
                  <m:mrow>
                     <m:mn>1</m:mn>
                  </m:mrow>
               </m:msub>
               <m:mo class="MathClass-bin">-</m:mo>
               <m:msub>
                  <m:mrow>
                     <m:mover accent="true">
                        <m:mrow>
                           <m:mi>&#956;</m:mi>
                        </m:mrow>
                        <m:mo>&#8594;</m:mo>
                     </m:mover>
                  </m:mrow>
                  <m:mrow>
                     <m:mn>0</m:mn>
                  </m:mrow>
               </m:msub>
            </m:mrow>
            <m:mo class="MathClass-close">)</m:mo>
         </m:mrow>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:mo class="qopname"> ln</m:mo>
         <m:mrow>
            <m:mo class="MathClass-open">(</m:mo>
            <m:mrow>
               <m:mo class="MathClass-rel">&#8739;</m:mo>
               <m:munder class="msub">
                  <m:mrow>
                     <m:mo mathsize="big">&#8721;</m:mo>
                  </m:mrow>
                  <m:mrow>
                     <m:mn>0</m:mn>
                  </m:mrow>
               </m:munder>
               <m:mo class="MathClass-rel">&#8739;</m:mo>
               <m:mo class="MathClass-bin">&#8725;</m:mo>
               <m:mo class="MathClass-rel">&#8739;</m:mo>
               <m:munder class="msub">
                  <m:mrow>
                     <m:mo mathsize="big">&#8721;</m:mo>
                  </m:mrow>
                  <m:mrow>
                     <m:mn>1</m:mn>
                  </m:mrow>
               </m:munder>
               <m:mo class="MathClass-rel">&#8739;</m:mo>
            </m:mrow>
            <m:mo class="MathClass-close">)</m:mo>
         </m:mrow>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:mi>n</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-punc">.</m:mo>
</m:mrow>
</m:math>
               </display-formula>
            </p>
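            <p>The closed-form quantities above can be evaluated directly. The following Python sketch does so for the bivariate case only, where the determinant and inverse of the covariance matrix can be written out by hand; the restriction to two variables, the function names, and the example parameters are all assumptions made to keep the example self-contained, and a general implementation would use a linear algebra library instead.</p>
            <p>
```python
import math


def det2(s):
    """Determinant of a 2x2 matrix."""
    return s[0][0] * s[1][1] - s[0][1] * s[1][0]


def inv2(s):
    """Inverse of a 2x2 matrix."""
    d = det2(s)
    return [[s[1][1] / d, -s[0][1] / d],
            [-s[1][0] / d, s[0][0] / d]]


def quad_form(u, a, v):
    """u^T A v for length-2 vectors and a 2x2 matrix."""
    return sum(u[i] * a[i][j] * v[j] for i in range(2) for j in range(2))


def log_partition(sigma, n=2):
    """ln Z with Z = sqrt((2 pi)^n |Sigma|); the free energy is -ln Z."""
    return 0.5 * (n * math.log(2 * math.pi) + math.log(det2(sigma)))


def density(x, mu, sigma):
    """P(x) = exp(-0.5 (x - mu)^T Sigma^-1 (x - mu)) / Z."""
    d = [x[i] - mu[i] for i in range(2)]
    return math.exp(-0.5 * quad_form(d, inv2(sigma), d) - log_partition(sigma))


def entropy(sigma, n=2):
    """Differential entropy: 0.5 * ln[(2 pi e)^n |Sigma|]."""
    return 0.5 * (n * math.log(2 * math.pi * math.e) + math.log(det2(sigma)))


def kl(mu0, sigma0, mu1, sigma1, n=2):
    """KL(M0 || M1) between two bivariate Gaussians."""
    p1 = inv2(sigma1)
    trace = sum(p1[i][k] * sigma0[k][i] for i in range(2) for k in range(2))
    d = [mu1[i] - mu0[i] for i in range(2)]
    log_ratio = math.log(det2(sigma0) / det2(sigma1))
    return 0.5 * (trace + quad_form(d, p1, d) - log_ratio - n)


# Illustrative parameters (not taken from any simulation).
mu0, sigma0 = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
mu1, sigma1 = [1.0, 0.0], [[2.0, 0.0], [0.0, 2.0]]

peak_density = density(mu0, mu0, sigma0)   # density of M0 at its own mean
entropy_0 = entropy(sigma0)
kl_01 = kl(mu0, sigma0, mu1, sigma1)
kl_00 = kl(mu0, sigma0, mu0, sigma0)       # a model compared with itself
```
            </p>
            <p>The KL-divergence is not symmetric in its arguments, so swapping the two models in the call to kl generally gives a different value.</p>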
            <p>A GGM can also be used to manipulate a subset of variables and then compute the marginal densities for the remaining variables. For example, let <b>V </b>&#8834; <b>X </b>be an arbitrary subset of variables <b>X </b>and let <b>W </b>be the complement set. We can condition the model by setting variables <b>V </b>to some particular value, <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i13"><m:mover accent="true">
   <m:mrow>
      <m:mi>v</m:mi>
   </m:mrow>
   <m:mo class="MathClass-op">&#8594;</m:mo>
</m:mover>
</m:math></inline-formula>. The conditional distribution over <b>W </b>given <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i13"><m:mover accent="true"><m:mrow><m:mi>v</m:mi></m:mrow><m:mo class="MathClass-op">&#8594;</m:mo></m:mover></m:math></inline-formula> is a multivariate Gaussian with parameters <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i14"><m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:msub>
         <m:mrow>
            <m:mover accent="true">
               <m:mrow>
                  <m:mi>&#956;</m:mi>
               </m:mrow>
               <m:mo class="MathClass-op">&#8594;</m:mo>
            </m:mover>
         </m:mrow>
         <m:mrow>
            <m:mi>W</m:mi>
            <m:mo class="MathClass-rel">&#8739;</m:mo>
            <m:mover accent="true">
               <m:mrow>
                  <m:mi>v</m:mi>
               </m:mrow>
               <m:mo class="MathClass-op">&#8594;</m:mo>
            </m:mover>
         </m:mrow>
      </m:msub>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:msub>
         <m:mrow>
            <m:mo class="MathClass-op">&#8721;</m:mo>
         </m:mrow>
         <m:mrow>
            <m:mi>W</m:mi>
         </m:mrow>
      </m:msub>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula> where</p>
            <p>
               <display-formula id="M4">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i15"><m:mrow>
   <m:msub>
      <m:mrow>
         <m:mover accent="true">
            <m:mrow>
               <m:mi>&#956;</m:mi>
            </m:mrow>
            <m:mo class="MathClass-op">&#8594;</m:mo>
         </m:mover>
      </m:mrow>
      <m:mrow>
         <m:mi>W</m:mi>
         <m:mo class="MathClass-rel">&#8739;</m:mo>
         <m:mover accent="true">
            <m:mrow>
               <m:mi>v</m:mi>
            </m:mrow>
            <m:mo class="MathClass-op">&#8594;</m:mo>
         </m:mover>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:msub>
      <m:mrow>
         <m:mover accent="true">
            <m:mrow>
               <m:mi>&#956;</m:mi>
            </m:mrow>
            <m:mo class="MathClass-op">&#8594;</m:mo>
         </m:mover>
      </m:mrow>
      <m:mrow>
         <m:mi>W</m:mi>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-bin">+</m:mo>
   <m:msub>
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mi>W</m:mi>
         <m:mi>V</m:mi>
      </m:mrow>
   </m:msub>
   <m:msubsup>
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mi>V</m:mi>
         <m:mi>V</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:msubsup>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mover accent="true">
            <m:mrow>
               <m:mi>v</m:mi>
            </m:mrow>
            <m:mo>&#8594;</m:mo>
         </m:mover>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:msub>
            <m:mrow>
               <m:mover accent="true">
                  <m:mrow>
                     <m:mi>&#956;</m:mi>
                  </m:mrow>
                  <m:mo>&#8594;</m:mo>
               </m:mover>
            </m:mrow>
            <m:mrow>
               <m:mi>V</m:mi>
            </m:mrow>
         </m:msub>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
</m:mrow>
</m:math>
               </display-formula>
            </p>
            <p>
               <display-formula id="M5">
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i16"><m:mrow>
   <m:msub>
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mi>W</m:mi>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:msub>
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mi>W</m:mi>
         <m:mi>W</m:mi>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-bin">-</m:mo>
   <m:msub>
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mi>W</m:mi>
         <m:mi>V</m:mi>
      </m:mrow>
   </m:msub>
   <m:msubsup>
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mi>V</m:mi>
         <m:mi>V</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:msubsup>
   <m:msubsup>
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mi>W</m:mi>
         <m:mi>V</m:mi>
      </m:mrow>
      <m:mrow>
         <m:mi>T</m:mi>
      </m:mrow>
   </m:msubsup>
</m:mrow>
</m:math>
               </display-formula>
            </p>
            <p>Here, <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i17"><m:mrow>
   <m:mo mathsize="big">&#8721;</m:mo>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mfenced close=")" open="(" separators="">
      <m:mrow>
         <m:mtable class="array" columnlines="none none none none none none none none none none none none none none none none none none none" equalcolumns="false" equalrows="false">
            <m:mtr>
               <m:mtd class="array" columnalign="center">
                  <m:msub>
                     <m:mrow>
                        <m:mo mathsize="big">&#8721;</m:mo>
                     </m:mrow>
                     <m:mrow>
                        <m:mi>W</m:mi>
                        <m:mi>W</m:mi>
                     </m:mrow>
                  </m:msub>
               </m:mtd>
               <m:mtd class="array" columnalign="center">
                  <m:msub>
                     <m:mrow>
                        <m:mo mathsize="big">&#8721;</m:mo>
                     </m:mrow>
                     <m:mrow>
                        <m:mi>W</m:mi>
                        <m:mi>V</m:mi>
                     </m:mrow>
                  </m:msub>
               </m:mtd>
            </m:mtr>
            <m:mtr>
               <m:mtd class="array" columnalign="center">
                  <m:msubsup>
                     <m:mrow>
                        <m:mo mathsize="big">&#8721;</m:mo>
                     </m:mrow>
                     <m:mrow>
                        <m:mi>W</m:mi>
                        <m:mi>V</m:mi>
                     </m:mrow>
                     <m:mrow>
                        <m:mi>T</m:mi>
                     </m:mrow>
                  </m:msubsup>
               </m:mtd>
               <m:mtd class="array" columnalign="center">
                  <m:msub>
                     <m:mrow>
                        <m:mo mathsize="big">&#8721;</m:mo>
                     </m:mrow>
                     <m:mrow>
                        <m:mi>V</m:mi>
                        <m:mi>V</m:mi>
                     </m:mrow>
                  </m:msub>
               </m:mtd>
            </m:mtr>
         </m:mtable>
      </m:mrow>
   </m:mfenced>
</m:mrow>
</m:math></inline-formula>. Thus, inference can be performed analytically via matrix operations. In this way, users can predict the conformational changes induced by local perturbations or, more generally, study the couplings between arbitrarily chosen subsets of variables.</p>
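            <p>The conditioning step can be sketched with a few matrix operations. The example below is an illustrative sketch, not the authors' implementation: it assumes the block layout of &#931; given above, with &#931;<sub>WV</sub> denoting the block whose rows index <b>W</b> and whose columns index <b>V</b>, and it computes the conditional mean as &#956;<sub>W</sub> + &#931;<sub>WV</sub>&#931;<sub>VV</sub><sup>-1</sup>(v - &#956;<sub>V</sub>) and the conditional covariance as &#931;<sub>WW</sub> - &#931;<sub>WV</sub>&#931;<sub>VV</sub><sup>-1</sup>&#931;<sub>WV</sub><sup>T</sup>. The function name <b>condition_gaussian</b> and its arguments are hypothetical.</p>

```python
import numpy as np


def condition_gaussian(mu, cov, v_idx, v_value):
    """Condition N(mu, cov) on the variables in v_idx taking the value v_value.

    Returns (w_idx, mu_cond, cov_cond), where w_idx lists the remaining
    variables. Illustrative sketch; cov is assumed positive-definite.
    """
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    v_idx = np.asarray(v_idx, dtype=int)
    v_value = np.asarray(v_value, dtype=float)

    # W is the complement of V.
    w_idx = np.setdiff1d(np.arange(mu.shape[0]), v_idx)

    cov_ww = cov[np.ix_(w_idx, w_idx)]
    cov_wv = cov[np.ix_(w_idx, v_idx)]
    cov_vv = cov[np.ix_(v_idx, v_idx)]

    # gain = cov_wv @ inv(cov_vv), obtained from a linear solve
    # (valid because cov_vv is symmetric).
    gain = np.linalg.solve(cov_vv, cov_wv.T).T

    mu_cond = mu[w_idx] + gain @ (v_value - mu[v_idx])
    cov_cond = cov_ww - gain @ cov_wv.T
    return w_idx, mu_cond, cov_cond


# Example: two coupled variables; fix variable 1 at the value 2.0.
w_idx, mu_cond, cov_cond = condition_gaussian(
    mu=[0.0, 0.0],
    cov=[[1.0, 0.5], [0.5, 2.0]],
    v_idx=[1],
    v_value=[2.0],
)
```

            <p>In this small example the conditional mean of the free variable shifts from 0 to 0.5 and its variance shrinks from 1 to 0.875, which illustrates how a local perturbation propagates through the couplings encoded in &#931;.</p>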
         </sec>
      </sec>
      <sec>
         <st>
            <p>Algorithms</p>
         </st>
         <p>We now present three algorithms for learning various kinds of generative models from MD data.</p>
         <p><b>Input </b>The input to all three algorithms is a time-series of vectors <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i18"><m:mi mathvariant="bold">D</m:mi>
<m:mo class="MathClass-rel">=</m:mo>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:msub>
         <m:mrow>
            <m:mover accent="true">
               <m:mrow>
                  <m:mi mathvariant="bold">d</m:mi>
               </m:mrow>
               <m:mo class="MathClass-op">&#8594;</m:mo>
            </m:mover>
         </m:mrow>
         <m:mrow>
            <m:mn>1</m:mn>
         </m:mrow>
      </m:msub>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:mo class="MathClass-punc">.</m:mo>
      <m:mo class="MathClass-punc">.</m:mo>
      <m:mo class="MathClass-punc">.</m:mo>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:msub>
         <m:mrow>
            <m:mover accent="true">
               <m:mrow>
                  <m:mi mathvariant="bold">d</m:mi>
               </m:mrow>
               <m:mo class="MathClass-op">&#8594;</m:mo>
            </m:mover>
         </m:mrow>
         <m:mrow>
            <m:mi>t</m:mi>
         </m:mrow>
      </m:msub>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula> where <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i19"><m:msub>
   <m:mrow>
      <m:mover accent="true">
         <m:mrow>
            <m:mi mathvariant="bold">d</m:mi>
         </m:mrow>
         <m:mo class="MathClass-op">&#8594;</m:mo>
      </m:mover>
   </m:mrow>
   <m:mrow>
      <m:mi>i</m:mi>
   </m:mrow>
</m:msub>
</m:math></inline-formula> is an <it>n </it>&#215; 1 vector of covariates (e.g., positional and/or angular deviations) and <it>t </it>is the number of snapshots in the MD trajectory.</p>
         <sec>
            <st>
               <p>Algorithm 1</p>
            </st>
            <p><b>Output </b>The first algorithm produces a Gaussian Graphical Model <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i20"><m:mi mathvariant="script">M</m:mi>
<m:mo class="MathClass-rel">=</m:mo>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:mover accent="true">
         <m:mrow>
            <m:mi>h</m:mi>
         </m:mrow>
         <m:mo class="MathClass-op">&#8594;</m:mo>
      </m:mover>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:msup>
         <m:mrow>
            <m:mo class="MathClass-op">&#8721;</m:mo>
         </m:mrow>
         <m:mrow>
            <m:mo class="MathClass-bin">-</m:mo>
            <m:mn>1</m:mn>
         </m:mrow>
      </m:msup>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula>. The first step is to compute the sample mean <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i21"><m:mover accent="true">
   <m:mrow>
      <m:mi>&#956;</m:mi>
   </m:mrow>
   <m:mo class="MathClass-op">&#8594;</m:mo>
</m:mover>
<m:mo class="MathClass-rel">=</m:mo>
<m:mn>1</m:mn>
<m:mo class="MathClass-bin">&#8725;</m:mo>
<m:mi>t</m:mi>
<m:msubsup>
   <m:mrow>
      <m:mo class="MathClass-op">&#8721;</m:mo>
   </m:mrow>
   <m:mrow>
      <m:mi>i</m:mi>
      <m:mo class="MathClass-rel">=</m:mo>
      <m:mn>1</m:mn>
   </m:mrow>
   <m:mrow>
      <m:mi>t</m:mi>
   </m:mrow>
</m:msubsup>
<m:msub>
   <m:mrow>
      <m:mover accent="true">
         <m:mrow>
            <m:mi mathvariant="bold">d</m:mi>
         </m:mrow>
         <m:mo class="MathClass-op">&#8594;</m:mo>
      </m:mover>
   </m:mrow>
   <m:mrow>
      <m:mi>i</m:mi>
   </m:mrow>
</m:msub>
</m:math></inline-formula>. Then it computes the regularized precision matrix &#931;<sup>-1 </sup>(see below). Finally, <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i6"><m:mover accent="true"><m:mrow><m:mi>h</m:mi></m:mrow><m:mo class="MathClass-op">&#8594;</m:mo></m:mover></m:math></inline-formula> is computed as follows: <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i22"><m:mrow>
   <m:mover accent="true">
      <m:mrow>
         <m:mi>h</m:mi>
      </m:mrow>
      <m:mo class="MathClass-op">&#8594;</m:mo>
   </m:mover>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:msup>
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:msup>
   <m:mover accent="true">
      <m:mrow>
         <m:mi>&#956;</m:mi>
      </m:mrow>
      <m:mo class="MathClass-op">&#8594;</m:mo>
   </m:mover>
</m:mrow>
</m:math></inline-formula>.</p>
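            <p>The three steps of Algorithm 1 can be outlined as follows. This is a hedged skeleton under several assumptions: the trajectory is stored as a <it>t </it>&#215; <it>n </it>array with one snapshot per row, the regularized precision estimator is passed in as a callable (<b>estimate_precision</b>, a hypothetical name), and the helper <b>ridge_inverse</b> is only a stand-in used to make the sketch runnable. It is not the <it>L</it><sub>1</sub>-regularized estimator described in the text.</p>

```python
import numpy as np


def fit_ggm(frames, estimate_precision):
    """Return (h, precision) for a trajectory stored as a t x n array.

    estimate_precision is a hypothetical callable that maps a sample
    covariance matrix to a regularized precision matrix.
    """
    frames = np.asarray(frames, dtype=float)
    t = frames.shape[0]

    # Step 1: sample mean over the t snapshots.
    mu = frames.mean(axis=0)

    # Step 2: regularized precision matrix, estimated from the sample covariance.
    centered = frames - mu
    sample_cov = centered.T @ centered / t
    precision = estimate_precision(sample_cov)

    # Step 3: h = precision @ mu.
    h = precision @ mu
    return h, precision


def ridge_inverse(sample_cov, eps=1e-3):
    """Stand-in only: inverts S + eps * I. NOT the L1-regularized estimator."""
    n = sample_cov.shape[0]
    return np.linalg.inv(sample_cov + eps * np.eye(n))
```

            <p>Passing the estimator as an argument keeps the skeleton independent of how the regularized precision matrix is obtained.</p>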
            <p>The algorithm produces the sparsest precision matrix that still fits the data (see below). It also guarantees that &#931;<sup>-1 </sup>is positive-definite, which means it can be inverted to produce the regularized covariance matrix (as opposed to the sample covariance, which is trivial to compute). This is important because Eqs. 1-3 require the covariance matrix, &#931;. We further note that a sparse precision matrix does not imply that the corresponding covariance matrix is sparse, nor does a sparse covariance imply that the corresponding precision matrix is sparse. That is, our algorithm is not equivalent to simply thresholding the sample covariance matrix and then inverting it.</p>
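            <p>The distinction between sparsity in the precision matrix and sparsity in the covariance matrix can be checked on a toy example. The sketch below uses an arbitrary, made-up tridiagonal precision matrix (a five-variable chain in which only neighbours are directly coupled); the numerical values are assumptions chosen for illustration only.</p>

```python
import numpy as np

# Tridiagonal (sparse) precision matrix for a 5-variable chain.
n = 5
precision = 2.0 * np.eye(n) - 0.9 * (np.eye(n, k=1) + np.eye(n, k=-1))

# The corresponding covariance matrix is the inverse of the precision matrix.
covariance = np.linalg.inv(precision)

# Count the entries that are numerically zero in each matrix.
zeros_in_precision = int(np.sum(np.isclose(precision, 0.0)))
zeros_in_covariance = int(np.sum(np.isclose(covariance, 0.0)))

print("zero entries in precision: ", zeros_in_precision)
print("zero entries in covariance:", zeros_in_covariance)
```

            <p>Here the precision matrix has 12 zero entries while its inverse has none: variables that are not directly coupled are still correlated through the chain.</p>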
            <sec>
               <st>
                  <p>Learning regularized precision matrices</p>
               </st>
               <p>A straightforward way of learning a GGM is to find the parameters <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i23"><m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:mover accent="true">
         <m:mrow>
            <m:mi>&#956;</m:mi>
         </m:mrow>
         <m:mo class="MathClass-op">&#8594;</m:mo>
      </m:mover>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:mo class="MathClass-op">&#8721;</m:mo>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula> that maximize the likelihood of the data (i.e., by finding parameters that maximize <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i24"><m:msubsup>
   <m:mrow>
      <m:mo class="MathClass-op">&#8721;</m:mo>
   </m:mrow>
   <m:mrow>
      <m:mi>i</m:mi>
      <m:mo class="MathClass-rel">=</m:mo>
      <m:mn>1</m:mn>
   </m:mrow>
   <m:mrow>
      <m:mi>t</m:mi>
   </m:mrow>
</m:msubsup>
<m:mi>P</m:mi>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:msub>
         <m:mrow>
            <m:mover accent="true">
               <m:mrow>
                  <m:mi>d</m:mi>
               </m:mrow>
               <m:mo class="MathClass-op">&#8594;</m:mo>
            </m:mover>
         </m:mrow>
         <m:mrow>
            <m:mi>i</m:mi>
         </m:mrow>
      </m:msub>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula>. It is known that a maximum likelihood model can be produced by setting the pair <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i23"><m:mrow><m:mo class="MathClass-open">(</m:mo><m:mrow><m:mover accent="true"><m:mrow><m:mi>&#956;</m:mi></m:mrow><m:mo class="MathClass-op">&#8594;</m:mo></m:mover><m:mo class="MathClass-punc">,</m:mo><m:mo class="MathClass-op">&#8721;</m:mo></m:mrow><m:mo class="MathClass-close">)</m:mo></m:mrow></m:math></inline-formula> to the sample mean and covariance matrices, respectively. Unfortunately, maximum likelihood estimates can be prone to over-fitting. This is not surprising because the covariance matrix alone contains <it>m </it>= <it>O</it>(<it>n</it><sup>2</sup>) parameters, each of which must be estimated from the data. This is relevant because the number of <it>independent </it>samples needed to obtain a statistically robust estimate of &#931; grows polynomially in <it>m</it>. We note that while modern MD simulations do produce large numbers of samples (i.e., frames), these samples are <it>not </it>independent (because they form a time-series), and so the effective sample size is much smaller than the number of frames in the trajectory.</p>
               <p>Our algorithm addresses the problem of over-fitting by maximizing the following objective function:</p>
               <p>
                  <display-formula id="M6">
                     <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i25"><m:mrow>
   <m:mi>l</m:mi>
   <m:mi>l</m:mi>
   <m:mo stretchy="false">(</m:mo>
   <m:msup>
      <m:mo>&#8721;</m:mo>
      <m:mrow>
         <m:mo>&#8722;</m:mo>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:msup>
   <m:mo>|</m:mo>
   <m:mstyle mathsize="normal" mathvariant="bold">
      <m:mi>D</m:mi>
   </m:mstyle>
   <m:mo stretchy="false">)</m:mo>
   <m:mo>=</m:mo>
   <m:mstyle displaystyle="true">
      <m:munderover>
         <m:mo>&#8721;</m:mo>
         <m:mrow>
            <m:mi>k</m:mi>
            <m:mo>=</m:mo>
            <m:mn>1</m:mn>
         </m:mrow>
         <m:mi>t</m:mi>
      </m:munderover>
      <m:mrow>
         <m:mi>log</m:mi>
         <m:mi>P</m:mi>
         <m:mo stretchy="false">(</m:mo>
         <m:msub>
            <m:mover accent="true">
               <m:mi>d</m:mi>
               <m:mo>&#8594;</m:mo>
            </m:mover>
            <m:mi>k</m:mi>
         </m:msub>
         <m:mo stretchy="false">)</m:mo>
         <m:mo>&#8722;</m:mo>
         <m:mi>&#955;</m:mi>
         <m:mo class="MathClass-rel">&#8741;</m:mo>
         <m:msup>
            <m:mo>&#8721;</m:mo>
            <m:mrow>
               <m:mo>&#8722;</m:mo>
               <m:mn>1</m:mn>
            </m:mrow>
         </m:msup>
      </m:mrow>
   </m:mstyle>
   <m:msub>
      <m:mo>&#8741;</m:mo>
      <m:mn>1</m:mn>
   </m:msub>
   <m:mo>.</m:mo>
</m:mrow>
</m:math>
                  </display-formula>
               </p>
               <p>Here, ||&#931;<sup>-1</sup>||<sub>1 </sub>is the <it>L</it><sub>1 </sub>norm of the precision matrix. The <it>L</it><sub>1 </sub>norm is defined as the sum of the absolute values of the matrix elements. It can be interpreted as a measure of the complexity of the model. In particular, each non-zero element of &#931;<sup>-1 </sup>corresponds to a parameter in the model and must be estimated from the data. Thus, Eq. 6 establishes a tradeoff between the log likelihood of the data (the first term) and the complexity of the model (the second term). The scalar value &#955; controls this tradeoff such that higher values produce sparser precision matrices. This is our algorithm's only parameter. Its value can be computed analytically <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> from the number of frames in the trajectory and the number of variables. Alternatively, users may elect to adjust &#955; to obtain precision matrices of desired sparsity.</p>
               <p>Our algorithm maximizes Eq. 6 indirectly, by defining and then solving a convex optimization problem. Using the functional form of <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i26"><m:mrow>
   <m:mi>P</m:mi>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mover accent="true">
            <m:mrow>
               <m:mi>d</m:mi>
            </m:mrow>
            <m:mo class="MathClass-op">&#8594;</m:mo>
         </m:mover>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
</m:mrow>
</m:math></inline-formula> according to Eq. 1, the log-likelihood of &#931;<sup>-1 </sup>can be rewritten as:</p>
               <p>
                  <display-formula>
                     <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i27"><m:mrow>
   <m:mi>l</m:mi>
   <m:mi>l</m:mi>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:msup>
            <m:mrow>
               <m:mo mathsize="big">&#8721;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mo class="MathClass-bin">-</m:mo>
               <m:mn>1</m:mn>
            </m:mrow>
         </m:msup>
         <m:mo class="MathClass-rel">&#8739;</m:mo>
         <m:mi mathvariant="bold">D</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mo class="MathClass-bin">-</m:mo>
   <m:mo class="qopname">log</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mo class="MathClass-rel">&#8739;</m:mo>
         <m:mo mathsize="big">&#8721;</m:mo>
         <m:mo class="MathClass-rel">&#8739;</m:mo>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-bin">-</m:mo>
   <m:munderover accent="false" accentunder="false">
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mi>k</m:mi>
         <m:mo class="MathClass-rel">=</m:mo>
         <m:mn>1</m:mn>
      </m:mrow>
      <m:mrow>
         <m:mi>t</m:mi>
      </m:mrow>
   </m:munderover>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:msub>
            <m:mrow>
               <m:mover accent="true">
                  <m:mrow>
                     <m:mi>d</m:mi>
                  </m:mrow>
                  <m:mo>&#8594;</m:mo>
               </m:mover>
            </m:mrow>
            <m:mrow>
               <m:mi>k</m:mi>
            </m:mrow>
         </m:msub>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:mover accent="true">
            <m:mrow>
               <m:mi>&#956;</m:mi>
            </m:mrow>
            <m:mo>&#8594;</m:mo>
         </m:mover>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:msup>
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:msup>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:msub>
            <m:mrow>
               <m:mover accent="true">
                  <m:mrow>
                     <m:mi>d</m:mi>
                  </m:mrow>
                  <m:mo>&#8594;</m:mo>
               </m:mover>
            </m:mrow>
            <m:mrow>
               <m:mi>k</m:mi>
            </m:mrow>
         </m:msub>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:mover accent="true">
            <m:mrow>
               <m:mi>&#956;</m:mi>
            </m:mrow>
            <m:mo>&#8594;</m:mo>
         </m:mover>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-bin">-</m:mo>
   <m:mi>&#955;</m:mi>
   <m:mo class="MathClass-rel">&#8741;</m:mo>
   <m:msup>
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:msup>
   <m:msub>
      <m:mrow>
         <m:mo class="MathClass-rel">&#8741;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-punc">.</m:mo>
</m:mrow>
</m:math>
                  </display-formula>
               </p>
               <p>Noting that <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i28"><m:mo class="MathClass-rel">&#8739;</m:mo>
<m:mo class="MathClass-op">&#8721;</m:mo>
<m:mo class="MathClass-rel">&#8739;</m:mo>
<m:mo class="MathClass-rel">=</m:mo>
<m:mfrac>
   <m:mrow>
      <m:mn>1</m:mn>
   </m:mrow>
   <m:mrow>
      <m:mo class="MathClass-rel">&#8739;</m:mo>
      <m:msup>
         <m:mrow>
            <m:mo class="MathClass-op">&#8721;</m:mo>
         </m:mrow>
         <m:mrow>
            <m:mo class="MathClass-bin">-</m:mo>
            <m:mn>1</m:mn>
         </m:mrow>
      </m:msup>
      <m:mo class="MathClass-rel">&#8739;</m:mo>
   </m:mrow>
</m:mfrac>
</m:math></inline-formula> and that <it>trace</it>(<b>ABC</b>) = <it>trace</it>(<b>CAB</b>), the log-likelihood of &#931;<sup>-1 </sup>can then be rewritten as:</p>
               <p>
                  <display-formula>
                     <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i29"><m:mrow>
   <m:mi>l</m:mi>
   <m:mi>l</m:mi>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:msup>
            <m:mrow>
               <m:mo mathsize="big">&#8721;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mo class="MathClass-bin">-</m:mo>
               <m:mn>1</m:mn>
            </m:mrow>
         </m:msup>
         <m:mo class="MathClass-rel">&#8739;</m:mo>
         <m:mi mathvariant="bold">D</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mo class="qopname"> log</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mo class="MathClass-rel">&#8739;</m:mo>
         <m:msup>
            <m:mrow>
               <m:mo mathsize="big">&#8721;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mo class="MathClass-bin">-</m:mo>
               <m:mn>1</m:mn>
            </m:mrow>
         </m:msup>
         <m:mo class="MathClass-rel">&#8739;</m:mo>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-bin">-</m:mo>
   <m:mi>t</m:mi>
   <m:mi>r</m:mi>
   <m:mi>a</m:mi>
   <m:mi>c</m:mi>
   <m:mi>e</m:mi>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mrow>
            <m:mo class="MathClass-open">(</m:mo>
            <m:mrow>
               <m:mi mathvariant="bold">D</m:mi>
               <m:mo class="MathClass-bin">-</m:mo>
               <m:mover accent="true">
                  <m:mrow>
                     <m:mi>&#956;</m:mi>
                  </m:mrow>
                  <m:mo class="qopname">&#8594;</m:mo>
               </m:mover>
            </m:mrow>
            <m:mo class="MathClass-close">)</m:mo>
         </m:mrow>
         <m:msup>
            <m:mrow>
               <m:mo mathsize="big">&#8721;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mo class="MathClass-bin">-</m:mo>
               <m:mn>1</m:mn>
            </m:mrow>
         </m:msup>
         <m:mrow>
            <m:mo class="MathClass-open">(</m:mo>
            <m:mrow>
               <m:mi mathvariant="bold">D</m:mi>
               <m:mo class="MathClass-bin">-</m:mo>
               <m:mover accent="true">
                  <m:mrow>
                     <m:mi>&#956;</m:mi>
                  </m:mrow>
                  <m:mo class="qopname">&#8594;</m:mo>
               </m:mover>
            </m:mrow>
            <m:mo class="MathClass-close">)</m:mo>
         </m:mrow>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-bin">-</m:mo>
   <m:mi>&#955;</m:mi>
   <m:mo class="MathClass-rel">&#8741;</m:mo>
   <m:msup>
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:msup>
   <m:msub>
      <m:mrow>
         <m:mo class="MathClass-rel">&#8741;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-punc">.</m:mo>
</m:mrow>
</m:math>
                  </display-formula>
               </p>
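               <p>The trace rewriting used in this step can be verified numerically. The following sketch uses synthetic random data as a stand-in for MD covariates and an arbitrary positive-definite matrix in place of &#931;<sup>-1</sup>; all names and values are assumptions made for illustration. It checks that the sum of per-frame quadratic forms equals the trace of the scatter matrix multiplied by the precision matrix, which follows from the cyclic property of the trace.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
t, n = 50, 4

# Synthetic stand-in for a trajectory: t frames, n covariates per frame.
frames = rng.normal(size=(t, n))
mu = frames.mean(axis=0)
centered = frames - mu

# Arbitrary symmetric positive-definite matrix playing the role of the precision.
a = rng.normal(size=(n, n))
precision = a @ a.T + n * np.eye(n)

# Left side: sum over frames of (d_k - mu)^T P (d_k - mu).
quad_sum = sum(row @ precision @ row for row in centered)

# Right side: trace(scatter @ P), where scatter = sum_k (d_k - mu)(d_k - mu)^T.
scatter = centered.T @ centered
trace_form = np.trace(scatter @ precision)

print(np.isclose(quad_sum, trace_form))
```

               <p>Dividing the scatter matrix by <it>t </it>gives the sample covariance, so the same quantity can be written as <it>t </it>times the trace of the sample covariance multiplied by the precision matrix.</p>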
               <p>Next, using the definition of the sample covariance matrix,</p>
               <p>
                  <display-formula>
                     <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i30"><m:mrow>
   <m:mi mathvariant="bold">S</m:mi>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mfenced close="&#9002;" open="&#9001;" separators="">
      <m:mrow>
         <m:mrow>
            <m:mo class="MathClass-open">(</m:mo>
            <m:mrow>
               <m:mi mathvariant="bold">D</m:mi>
               <m:mo class="MathClass-bin">-</m:mo>
               <m:mfenced close="&#9002;" open="&#9001;" separators="">
                  <m:mrow>
                     <m:mi mathvariant="bold">D</m:mi>
                  </m:mrow>
               </m:mfenced>
            </m:mrow>
            <m:mo class="MathClass-close">)</m:mo>
         </m:mrow>
         <m:msup>
            <m:mrow>
               <m:mrow>
                  <m:mo class="MathClass-open">(</m:mo>
                  <m:mrow>
                     <m:mi mathvariant="bold">D</m:mi>
                     <m:mo class="MathClass-bin">-</m:mo>
                     <m:mfenced close="&#9002;" open="&#9001;" separators="">
                        <m:mrow>
                           <m:mi mathvariant="bold">D</m:mi>
                        </m:mrow>
                     </m:mfenced>
                  </m:mrow>
                  <m:mo class="MathClass-close">)</m:mo>
               </m:mrow>
            </m:mrow>
            <m:mrow>
               <m:mi>T</m:mi>
            </m:mrow>
         </m:msup>
      </m:mrow>
   </m:mfenced>
   <m:mo class="MathClass-punc">,</m:mo>
</m:mrow>
</m:math>
                  </display-formula>
               </p>
               <p>we can define the matrix &#931;<sup>-1 </sup>that maximizes Eq. 6 as the solution to the following optimization problem:</p>
               <p>
                  <display-formula id="M7">
                     <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i31"><m:mrow>
   <m:mo class="qopname">arg</m:mo>
   <m:munder class="msub">
      <m:mrow>
         <m:mo class="qopname">max</m:mo>
      </m:mrow>
      <m:mrow>
         <m:msup>
            <m:mrow>
               <m:mo mathsize="big">&#8721;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mo class="MathClass-bin">-</m:mo>
               <m:mn>1</m:mn>
            </m:mrow>
         </m:msup>
         <m:mo class="MathClass-rel">&#8827;</m:mo>
         <m:mn>0</m:mn>
      </m:mrow>
   </m:munder>
   <m:mspace class="tmspace" width="2.77695pt"/>
   <m:mo class="qopname"> log</m:mo>
   <m:mo class="MathClass-rel">&#8739;</m:mo>
   <m:msup>
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:msup>
   <m:mo class="MathClass-rel">&#8739;</m:mo>
   <m:mo class="MathClass-bin">-</m:mo>
   <m:mi>t</m:mi>
   <m:mi>r</m:mi>
   <m:mi>a</m:mi>
   <m:mi>c</m:mi>
   <m:mi>e</m:mi>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi mathvariant="bold">S</m:mi>
         <m:msup>
            <m:mrow>
               <m:mo mathsize="big">&#8721;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mo class="MathClass-bin">-</m:mo>
               <m:mn>1</m:mn>
            </m:mrow>
         </m:msup>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-bin">-</m:mo>
   <m:mi>&#955;</m:mi>
   <m:mo class="MathClass-rel">&#8741;</m:mo>
   <m:msup>
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:msup>
   <m:msub>
      <m:mrow>
         <m:mo class="MathClass-rel">&#8741;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-punc">.</m:mo>
</m:mrow>
</m:math>
                  </display-formula>
               </p>
               <p>We note that <it>L</it><sub>1 </sub>regularization is equivalent to maximizing the likelihood under a Laplace prior, so the solution to Eq. 7 is a <it>maximum a posteriori </it>(MAP) estimate of the true precision matrix rather than a maximum likelihood estimate. In that sense, our algorithm is a Bayesian method. Moreover, the use of <it>L</it><sub>1 </sub>regularization ensures additional desirable properties, including <it>consistency </it>(given enough data, the learning procedure learns the true model) and high statistical <it>efficiency </it>(the number of samples needed to achieve this guarantee is small).</p>
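               <p>As a hedged illustration of the objective in Eq. 7, the following Python sketch evaluates the penalized log-likelihood for a candidate precision matrix. The function names (<it>log_det</it>, <it>penalized_log_likelihood</it>) and the 2 x 2 example values are assumptions introduced here for illustration; they are not taken from the original implementation.</p>

```python
import math


def log_det(x):
    """Log-determinant of a positive definite matrix via Gaussian elimination."""
    a = [list(row) for row in x]
    n = len(a)
    total = 0.0
    for col in range(n):
        pivot = a[col][col]  # stays positive when the input is positive definite
        total += math.log(pivot)
        for r in range(col + 1, n):
            f = a[r][col] / pivot
            a[r] = [v - f * c for v, c in zip(a[r], a[col])]
    return total


def penalized_log_likelihood(precision, s, lam):
    """Evaluate log|X| - trace(S X) - lam * ||X||_1 for X = precision."""
    n = len(s)
    trace = sum(s[i][k] * precision[k][i] for i in range(n) for k in range(n))
    l1 = sum(abs(v) for row in precision for v in row)  # element-wise L1 norm
    return log_det(precision) - trace - lam * l1


# Illustrative sample covariance (made-up values, not data from the paper).
S = [[1.0, 0.5], [0.5, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
print(penalized_log_likelihood(identity, S, lam=0.1))
```

               <p>Any candidate that scores higher than another under this function is preferred by Eq. 7; the optimization described next searches for the maximizer over all positive definite matrices.</p>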
               <p>We now show that the optimization problem defined in Eq. 7 is convex and has a smooth dual, and can therefore be solved optimally. First, we consider the dual form of the objective. To obtain the dual, we first rewrite the <it>L</it><sub>1</sub>-norm as:</p>
               <p>
                  <display-formula>
                     <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i32"><m:mrow>
   <m:mo class="MathClass-rel">&#8741;</m:mo>
   <m:mi mathvariant="bold">X</m:mi>
   <m:msub>
      <m:mrow>
         <m:mo class="MathClass-rel">&#8741;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:msub>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:munder class="msub">
      <m:mrow>
         <m:mo class="MathClass-op"> max</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mo class="MathClass-rel">&#8741;</m:mo>
         <m:mi mathvariant="bold">U</m:mi>
         <m:msub>
            <m:mrow>
               <m:mo class="MathClass-rel">&#8741;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8734;</m:mi>
            </m:mrow>
         </m:msub>
         <m:mo class="MathClass-rel">&#8804;</m:mo>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:munder>
   <m:mi>t</m:mi>
   <m:mi>r</m:mi>
   <m:mi>a</m:mi>
   <m:mi>c</m:mi>
   <m:mi>e</m:mi>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi mathvariant="bold">X</m:mi>
         <m:mi mathvariant="bold">U</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
</m:mrow>
</m:math>
                  </display-formula>
               </p>
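               <p>where ||<b>U</b>||<sub>&#8734; </sub>denotes the largest absolute value of any element of the matrix <b>U</b>. Given this change of formulation, and absorbing the factor &#955; and the minus sign into the constraint on <b>U </b>(so that the inner <it>max </it>becomes a <it>min</it>), the primal form of the optimization problem can be rewritten as:</p>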
               <p>
                  <display-formula id="M8">
                     <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i33"><m:mrow>
   <m:munder class="msub">
      <m:mrow>
         <m:mo class="MathClass-op">max</m:mo>
      </m:mrow>
      <m:mrow>
         <m:msup>
            <m:mrow>
               <m:mo mathsize="big">&#8721;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mo class="MathClass-bin">-</m:mo>
               <m:mn>1</m:mn>
            </m:mrow>
         </m:msup>
         <m:mo class="MathClass-rel">&#8827;</m:mo>
         <m:mn>0</m:mn>
      </m:mrow>
   </m:munder>
   <m:munder class="msub">
      <m:mrow>
         <m:mo class="MathClass-op"> min</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mo class="MathClass-rel">&#8741;</m:mo>
         <m:mi mathvariant="bold">U</m:mi>
         <m:msub>
            <m:mrow>
               <m:mo class="MathClass-rel">&#8741;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8734;</m:mi>
            </m:mrow>
         </m:msub>
         <m:mo class="MathClass-rel">&#8804;</m:mo>
         <m:mi>&#955;</m:mi>
      </m:mrow>
   </m:munder>
   <m:mo class="qopname"> log</m:mo>
   <m:mo class="MathClass-rel">&#8739;</m:mo>
   <m:msup>
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:msup>
   <m:mo class="MathClass-rel">&#8739;</m:mo>
   <m:mo class="MathClass-bin">-</m:mo>
   <m:mi>t</m:mi>
   <m:mi>r</m:mi>
   <m:mi>a</m:mi>
   <m:mi>c</m:mi>
   <m:mi>e</m:mi>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:msup>
            <m:mrow>
               <m:mo mathsize="big">&#8721;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mo class="MathClass-bin">-</m:mo>
               <m:mn>1</m:mn>
            </m:mrow>
         </m:msup>
         <m:mrow>
            <m:mo class="MathClass-open">(</m:mo>
            <m:mrow>
               <m:mi mathvariant="bold">S</m:mi>
               <m:mo class="MathClass-bin">+</m:mo>
               <m:mi mathvariant="bold">U</m:mi>
            </m:mrow>
            <m:mo class="MathClass-close">)</m:mo>
         </m:mrow>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-punc">.</m:mo>
</m:mrow>
</m:math>
                  </display-formula>
               </p>
               <p>That is, the optimal &#931;<sup>-1 </sup>is the one that maximizes the worst case log likelihood over all additive perturbations of the covariance matrix.</p>
               <p>Next, we exchange the <it>min </it>and <it>max </it>in Eq. 8. The inner <it>max </it>in the resulting problem can now be solved analytically by setting the gradient with respect to &#931;<sup>-1 </sup>to zero. The remaining problem over <b>U </b>can thus be written as:</p>
               <p>
                  <display-formula>
                     <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i34"><m:mrow>
   <m:mi mathvariant="bold">U</m:mi>
   <m:mo class="MathClass-bin">*</m:mo>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mo class="qopname"> arg</m:mo>
   <m:munder class="msub">
      <m:mrow>
         <m:mo class="MathClass-op"> min</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mo class="MathClass-rel">&#8741;</m:mo>
         <m:mi mathvariant="bold">U</m:mi>
         <m:msub>
            <m:mrow>
               <m:mo class="MathClass-rel">&#8741;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8734;</m:mi>
            </m:mrow>
         </m:msub>
         <m:mo class="MathClass-rel">&#8804;</m:mo>
         <m:mi>&#955;</m:mi>
      </m:mrow>
   </m:munder>
   <m:mo class="MathClass-bin">-</m:mo>
   <m:mo class="qopname"> log</m:mo>
   <m:mo class="MathClass-rel">&#8739;</m:mo>
   <m:mi mathvariant="bold">S</m:mi>
   <m:mo class="MathClass-bin">+</m:mo>
   <m:mi mathvariant="bold">U</m:mi>
   <m:mo class="MathClass-rel">&#8739;</m:mo>
   <m:mo class="MathClass-bin">-</m:mo>
   <m:mi>n</m:mi>
   <m:mo class="MathClass-punc">,</m:mo>
</m:mrow>
</m:math>
                  </display-formula>
               </p>
               <p>such that &#931;<sup>-1 </sup>= (<b>S </b>+ <b>U</b>*)<sup>-1</sup>.</p>
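               <p>The analytic step can be spelled out as follows. This is a sketch of the standard argument for a fixed <b>U</b>, written in LaTeX; the symbol <it>f </it>is introduced here only for illustration, and <it>n </it>is the dimension of the matrices.</p>

```latex
\begin{gather*}
f(\Sigma^{-1}) = \log\lvert\Sigma^{-1}\rvert
  - \mathrm{trace}\bigl(\Sigma^{-1}(\mathbf{S}+\mathbf{U})\bigr), \\
\nabla_{\Sigma^{-1}} f = \Sigma - (\mathbf{S}+\mathbf{U}) = 0
  \;\Longrightarrow\; \Sigma^{-1} = (\mathbf{S}+\mathbf{U})^{-1}, \\
f\bigl((\mathbf{S}+\mathbf{U})^{-1}\bigr)
  = -\log\lvert\mathbf{S}+\mathbf{U}\rvert - \mathrm{trace}(\mathbf{I}_n)
  = -\log\lvert\mathbf{S}+\mathbf{U}\rvert - n .
\end{gather*}
```

               <p>The second line uses the fact that the gradient of the log-determinant of a matrix is its inverse; the last line is the quantity minimized over <b>U</b>.</p>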
               <p>After one last change of variables, <b>W </b>= <b>S </b>+ <b>U</b>, the dual form of Eq. 7 can now be defined as:</p>
               <p>
                  <display-formula id="M9">
                     <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i35"><m:mrow>
   <m:mo mathsize="big">&#8721;</m:mo>
   <m:mo class="MathClass-bin">*</m:mo>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mo class="qopname"> arg</m:mo>
   <m:mo class="qopname">max</m:mo>
   <m:mrow>
      <m:mo class="MathClass-open">{</m:mo>
      <m:mrow>
         <m:mo class="qopname">log</m:mo>
         <m:mo class="MathClass-rel">&#8739;</m:mo>
         <m:mi mathvariant="bold">W</m:mi>
         <m:mo class="MathClass-rel">&#8739;</m:mo>
         <m:mo class="MathClass-punc">:</m:mo>
         <m:mo class="MathClass-rel">&#8741;</m:mo>
         <m:mi mathvariant="bold">W</m:mi>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:mi mathvariant="bold">S</m:mi>
         <m:msub>
            <m:mrow>
               <m:mo class="MathClass-rel">&#8741;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mi>&#8734;</m:mi>
            </m:mrow>
         </m:msub>
         <m:mo class="MathClass-rel">&#8804;</m:mo>
         <m:mi>&#955;</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">}</m:mo>
   </m:mrow>
</m:mrow>
</m:math>
                  </display-formula>
               </p>
               <p>Eq. 9 is smooth and convex, and for small values of <it>n </it>it can be solved by standard convex multivariate optimization techniques, such as the interior point method. For larger values of <it>n </it>we use Block Coordinate Descent <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> instead.</p>
            </sec>
            <sec>
               <st>
                  <p>Block Coordinate Descent</p>
               </st>
               <p>Given a matrix <b>A</b>, let <b>A</b><sub>\<it>k</it>\<it>j </it></sub>denote the matrix produced by removing column <it>k </it>and row <it>j </it>of the matrix. Let <b>A</b><sub><it>j </it></sub>denote column <it>j </it>with the diagonal element <b>A</b><sub><it>jj </it></sub>removed. The Block Coordinate Descent algorithm <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> (Algorithm 1) proceeds by optimizing one row and one column of the variable matrix <b>W </b>at a time. The algorithm iteratively optimizes all columns until a convergence criterion is met. The <b>W</b>s produced in each iteration are strictly positive definite, and so the regularized covariance matrix &#931; = <b>W </b>is invertible.</p>
               <p><b>Algorithm 1 </b>Block Coordinate Descent</p>
               <p><b>Require</b>: Tolerance parameter &#949;, sample covariance <b>S</b>, and regularization parameter &#955;.</p>
               <p>&#160;&#160;&#160;Initialize <b>W</b><sup>(0)</sup>:= <b>S </b>+ &#955;<b>I </b>where <b>I </b>is the identity matrix.</p>
               <p>&#160;&#160;&#160;<b>repeat</b></p>
               <p>&#160;&#160;&#160;&#160;&#160;&#160;<b>for </b><it>j </it>= 1, ..., <it>n </it><b>do</b></p>
               <p>&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;<inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i36"><m:mi>y</m:mi>
<m:mo class="MathClass-bin">*</m:mo>
<m:mo class="MathClass-rel">=</m:mo>
<m:mo class="qopname"> arg</m:mo>
<m:munder class="msub">
   <m:mrow>
      <m:mo class="qopname">min</m:mo>
   </m:mrow>
   <m:mrow>
      <m:mi>y</m:mi>
   </m:mrow>
</m:munder>
<m:mrow>
   <m:mo class="MathClass-open">{</m:mo>
   <m:mrow>
      <m:msup>
         <m:mrow>
            <m:mi>y</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>T</m:mi>
         </m:mrow>
      </m:msup>
      <m:msup>
         <m:mrow>
            <m:mo class="MathClass-open">(</m:mo>
            <m:msubsup>
               <m:mrow>
                  <m:mi mathvariant="bold">W</m:mi>
               </m:mrow>
               <m:mrow>
                  <m:mo class="MathClass-bin">\</m:mo>
                  <m:mi>j</m:mi>
                  <m:mo class="MathClass-bin">\</m:mo>
                  <m:mi>j</m:mi>
               </m:mrow>
               <m:mrow>
                  <m:mrow>
                     <m:mo class="MathClass-open">(</m:mo>
                     <m:mrow>
                        <m:mi>j</m:mi>
                        <m:mo class="MathClass-bin">-</m:mo>
                        <m:mn>1</m:mn>
                     </m:mrow>
                     <m:mo class="MathClass-close">)</m:mo>
                  </m:mrow>
               </m:mrow>
            </m:msubsup>
            <m:mo class="MathClass-close">)</m:mo>
         </m:mrow>
         <m:mrow>
            <m:mo class="MathClass-bin">-</m:mo>
            <m:mn>1</m:mn>
         </m:mrow>
      </m:msup>
      <m:mi>y</m:mi>
      <m:mo class="MathClass-punc">:</m:mo>
      <m:mo class="MathClass-rel">&#8741;</m:mo>
      <m:mi>y</m:mi>
      <m:mo class="MathClass-bin">-</m:mo>
      <m:msub>
         <m:mrow>
            <m:mi mathvariant="bold">S</m:mi>
         </m:mrow>
         <m:mrow>
            <m:mi>j</m:mi>
         </m:mrow>
      </m:msub>
      <m:msub>
         <m:mrow>
            <m:mo class="MathClass-rel">&#8741;</m:mo>
         </m:mrow>
         <m:mrow>
            <m:mi>&#8734;</m:mi>
         </m:mrow>
      </m:msub>
      <m:mo class="MathClass-rel">&#8804;</m:mo>
      <m:mi>&#955;</m:mi>
   </m:mrow>
   <m:mo class="MathClass-close">}</m:mo>
</m:mrow>
</m:math></inline-formula> {//Here, <b>W</b><sup>(<it>j</it>-1) </sup>denotes the current iterate.}</p>
               <p>&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;&#160;Set <b>W</b><sup>(<it>j</it>) </sup>to <b>W</b><sup>(<it>j</it>-1) </sup>such that <b>W</b><sub><it>j </it></sub>is replaced by <it>y</it>*.</p>
               <p>&#160;&#160;&#160;&#160;&#160;&#160;<b>end for</b></p>
               <p>&#160;&#160;&#160;&#160;&#160;&#160;Set <b>W</b><sup>(0) </sup>= <b>W</b><sup>(<it>n</it>)</sup></p>
               <p>&#160;&#160;&#160;<b>until </b><it>trace</it>((<b>W</b><sup>(0)</sup>)<sup>-1</sup><b>S</b>) - <it>n </it>+ &#955;||(<b>W</b><sup>(0)</sup>)<sup>-1</sup>||<sub>1 </sub>&#8804; &#949;.</p>
               <p>&#160;&#160;&#160;<b>return W</b><sup>(0)</sup></p>
               <p>The time complexity of this algorithm is <it>O</it>(<it>n</it><sup>4.5</sup>/&#949;) <abbrgrp><abbr bid="B27">27</abbr></abbrgrp> when converging to a solution within &#949; &gt; 0 of the optimum. In its dependence on <it>n</it>, this complexity is better than <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i37"><m:mrow>
   <m:mi>O</m:mi>
   <m:mfenced close=")" open="(" separators="">
      <m:mrow>
         <m:msup>
            <m:mrow>
               <m:mi>n</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mn>6</m:mn>
            </m:mrow>
         </m:msup>
         <m:mo class="MathClass-bin">&#8725;</m:mo>
         <m:mo class="qopname">log</m:mo>
         <m:mfenced close=")" open="(" separators="">
            <m:mrow>
               <m:mfrac>
                  <m:mrow>
                     <m:mn>1</m:mn>
                  </m:mrow>
                  <m:mrow>
                     <m:mi>&#949;</m:mi>
                  </m:mrow>
               </m:mfrac>
            </m:mrow>
         </m:mfenced>
      </m:mrow>
   </m:mfenced>
</m:mrow>
</m:math></inline-formula>, which would have been achieved using the interior point method on the dual form <abbrgrp><abbr bid="B28">28</abbr></abbrgrp>.</p>
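               <p>The steps of Algorithm 1 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the original implementation: following the formulation in <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, the quadratic form uses the inverse of <b>W</b><sub>\<it>j</it>\<it>j</it></sub>; the inner box-constrained problem is solved here by clipped coordinate descent, which is our choice of inner solver; and all function names and the example matrix are hypothetical.</p>

```python
def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def box_qp(a, lo, hi, y, sweeps=200):
    """Minimize y^T a y with each y[i] kept inside [lo[i], hi[i]].

    Uses clipped coordinate descent (an assumed inner solver); a must be
    positive definite.
    """
    y = [min(max(v, l), h) for v, l, h in zip(y, lo, hi)]
    for _ in range(sweeps):
        for i in range(len(y)):
            off = sum(a[i][k] * y[k] for k in range(len(y)) if k != i)
            y[i] = min(max(-off / a[i][i], lo[i]), hi[i])
    return y


def duality_gap(w, s, lam):
    """Stopping quantity of Algorithm 1: trace(W^-1 S) - n + lam * ||W^-1||_1."""
    x = inverse(w)
    n = len(s)
    trace = sum(x[i][k] * s[k][i] for i in range(n) for k in range(n))
    return trace - n + lam * sum(abs(v) for row in x for v in row)


def block_coordinate_descent(s, lam, tol=1e-8, max_sweeps=100):
    """Return the regularized covariance W for sample covariance s."""
    n = len(s)
    # Initialize W := S + lam * I.
    w = [[s[i][k] + (lam if i == k else 0.0) for k in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        for j in range(n):
            rest = [i for i in range(n) if i != j]
            a = inverse([[w[r][c] for c in rest] for r in rest])
            lo = [s[r][j] - lam for r in rest]
            hi = [s[r][j] + lam for r in rest]
            y = box_qp(a, lo, hi, [w[r][j] for r in rest])
            for r, v in zip(rest, y):
                w[r][j] = w[j][r] = v  # update column j and row j together
        if tol >= duality_gap(w, s, lam):
            break
    return w


# Illustrative input (made-up values, not data from the paper).
S = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.0]]
W = block_coordinate_descent(S, lam=0.1)
```

               <p>The precision matrix is then obtained as the inverse of the returned <b>W</b>. The dense inversions make this sketch suitable only for small <it>n</it>; it is meant to show the control flow rather than to be efficient.</p>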
               <p>In summary, the algorithm produces a time-averaged model of the data by computing the sample mean and then constructing the optimal regularized &#931; by solving Eq. 9 using Block Coordinate Descent. The regularized covariance matrix &#931; is guaranteed to be invertible, which means we can always compute the precision matrix, &#931;<sup>-1</sup>, which can be interpreted as a graph over the variables revealing the direct and indirect correlations between the variables.</p>
            </sec>
         </sec>
         <sec>
            <st>
               <p>Algorithm 2</p>
            </st>
            <p>The second algorithm is a straightforward extension of the first. Instead of producing a time-averaged model, it produces a time-varying model: <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i38"><m:mi mathvariant="script">M</m:mi>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:mi>&#964;</m:mi>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
<m:mo class="MathClass-rel">=</m:mo>
<m:mrow>
   <m:mo class="MathClass-open">(</m:mo>
   <m:mrow>
      <m:mover accent="true">
         <m:mrow>
            <m:mi>h</m:mi>
         </m:mrow>
         <m:mo class="MathClass-op">&#8594;</m:mo>
      </m:mover>
      <m:mrow>
         <m:mo class="MathClass-open">(</m:mo>
         <m:mrow>
            <m:mi>&#964;</m:mi>
         </m:mrow>
         <m:mo class="MathClass-close">)</m:mo>
      </m:mrow>
      <m:mo class="MathClass-punc">,</m:mo>
      <m:msup>
         <m:mrow>
            <m:mo class="MathClass-op">&#8721;</m:mo>
         </m:mrow>
         <m:mrow>
            <m:mo class="MathClass-bin">-</m:mo>
            <m:mn>1</m:mn>
         </m:mrow>
      </m:msup>
      <m:mrow>
         <m:mo class="MathClass-open">(</m:mo>
         <m:mrow>
            <m:mi>&#964;</m:mi>
         </m:mrow>
         <m:mo class="MathClass-close">)</m:mo>
      </m:mrow>
   </m:mrow>
   <m:mo class="MathClass-close">)</m:mo>
</m:mrow>
</m:math></inline-formula>. Here, <it>&#964; </it>&#8804; <it>t </it>indexes over sequentially ordered windows of frames in the trajectory. The width of the window, <it>w</it>, is a parameter and may be adjusted to learn time-varying models at a particular time-scale. Naturally, a separate time-averaged model could be learned for each window. Instead, the second algorithm applies a simple smoothing kernel so that the parameters of the <it>&#964;</it>th window include information from neighboring windows too. In this way, the algorithm ensures that the parameters of the time-varying model evolve as smoothly as possible, subject to fitting the data.</p>
            <p>Let <b>D</b><sup>(<it>&#964;</it>) </sup>&#8838; <b>D </b>denote the subset of frames in the MD trajectory that correspond to the &#964;th window, 1 &#8804; <it>&#964; </it>&#8804; <it>T</it>. The second algorithm solves the following optimization problem for each 1 &#8804; <it>&#964; </it>&#8804; <it>T</it>:</p>
            <p>
               <display-formula>
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i39"><m:mrow>
   <m:msup>
      <m:mrow>
         <m:mo mathsize="big">&#8721;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mo class="MathClass-bin">-</m:mo>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:msup>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi>&#964;</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mo class="qopname"> arg</m:mo>
   <m:munder class="msub">
      <m:mrow>
         <m:mo class="qopname">max</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mstyle mathvariant="bold">
            <m:mi>X</m:mi>
         </m:mstyle>
         <m:mo class="MathClass-rel">&#8827;</m:mo>
         <m:mn>0</m:mn>
      </m:mrow>
   </m:munder>
   <m:mo class="qopname"> log</m:mo>
   <m:mo class="MathClass-rel">&#8739;</m:mo>
   <m:mi mathvariant="bold">X</m:mi>
   <m:mo class="MathClass-rel">&#8739;</m:mo>
   <m:mo class="MathClass-bin">-</m:mo>
   <m:mi>t</m:mi>
   <m:mi>r</m:mi>
   <m:mi>a</m:mi>
   <m:mi>c</m:mi>
   <m:mi>e</m:mi>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi mathvariant="bold">S</m:mi>
         <m:mrow>
            <m:mo class="MathClass-open">(</m:mo>
            <m:mrow>
               <m:mi>&#964;</m:mi>
            </m:mrow>
            <m:mo class="MathClass-close">)</m:mo>
         </m:mrow>
         <m:mi mathvariant="bold">X</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-bin">-</m:mo>
   <m:mi>&#955;</m:mi>
   <m:mo class="MathClass-rel">&#8741;</m:mo>
   <m:mi mathvariant="bold">X</m:mi>
   <m:msub>
      <m:mrow>
         <m:mo class="MathClass-rel">&#8741;</m:mo>
      </m:mrow>
      <m:mrow>
         <m:mn>1</m:mn>
      </m:mrow>
   </m:msub>
</m:mrow>
</m:math>
               </display-formula>
            </p>
            <p>Here, <b>S</b>(<it>&#964;</it>) is the <it>weighted covariance matrix</it>, and is calculated as follows:</p>
            <p>
               <display-formula>
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i40"><m:mrow>
   <m:mi mathvariant="bold">S</m:mi>
   <m:mrow>
      <m:mo class="MathClass-open">(</m:mo>
      <m:mrow>
         <m:mi>&#964;</m:mi>
      </m:mrow>
      <m:mo class="MathClass-close">)</m:mo>
   </m:mrow>
   <m:mo class="MathClass-rel">=</m:mo>
   <m:mfrac>
      <m:mrow>
         <m:msubsup>
            <m:mrow>
               <m:mo mathsize="big">&#8721;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mi>k</m:mi>
               <m:mo class="MathClass-rel">=</m:mo>
               <m:mi>&#964;</m:mi>
               <m:mo class="MathClass-bin">-</m:mo>
               <m:mi>&#954;</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#964;</m:mi>
               <m:mo class="MathClass-bin">+</m:mo>
               <m:mi>&#954;</m:mi>
            </m:mrow>
         </m:msubsup>
         <m:msub>
            <m:mrow>
               <m:mi>w</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>k</m:mi>
            </m:mrow>
         </m:msub>
         <m:mfenced close="&#9002;" open="&#9001;" separators="">
            <m:mrow>
               <m:mrow>
                  <m:mo class="MathClass-open">(</m:mo>
                  <m:mrow>
                     <m:msup>
                        <m:mrow>
                           <m:mi mathvariant="bold">D</m:mi>
                        </m:mrow>
                        <m:mrow>
                           <m:mrow>
                              <m:mo class="MathClass-open">(</m:mo>
                              <m:mrow>
                                 <m:mi>k</m:mi>
                              </m:mrow>
                              <m:mo class="MathClass-close">)</m:mo>
                           </m:mrow>
                        </m:mrow>
                     </m:msup>
                     <m:mo class="MathClass-bin">-</m:mo>
                     <m:mfenced close="&#9002;" open="&#9001;" separators="">
                        <m:mrow>
                           <m:msup>
                              <m:mrow>
                                 <m:mi mathvariant="bold">D</m:mi>
                              </m:mrow>
                              <m:mrow>
                                 <m:mrow>
                                    <m:mo class="MathClass-open">(</m:mo>
                                    <m:mrow>
                                       <m:mi>k</m:mi>
                                    </m:mrow>
                                    <m:mo class="MathClass-close">)</m:mo>
                                 </m:mrow>
                              </m:mrow>
                           </m:msup>
                        </m:mrow>
                     </m:mfenced>
                  </m:mrow>
                  <m:mo class="MathClass-close">)</m:mo>
               </m:mrow>
               <m:msup>
                  <m:mrow>
                     <m:mrow>
                        <m:mo class="MathClass-open">(</m:mo>
                        <m:mrow>
                           <m:msup>
                              <m:mrow>
                                 <m:mi mathvariant="bold">D</m:mi>
                              </m:mrow>
                              <m:mrow>
                                 <m:mrow>
                                    <m:mo class="MathClass-open">(</m:mo>
                                    <m:mrow>
                                       <m:mi>k</m:mi>
                                    </m:mrow>
                                    <m:mo class="MathClass-close">)</m:mo>
                                 </m:mrow>
                              </m:mrow>
                           </m:msup>
                           <m:mo class="MathClass-bin">-</m:mo>
                           <m:mfenced close="&#9002;" open="&#9001;" separators="">
                              <m:mrow>
                                 <m:msup>
                                    <m:mrow>
                                       <m:mi mathvariant="bold">D</m:mi>
                                    </m:mrow>
                                    <m:mrow>
                                       <m:mrow>
                                          <m:mo class="MathClass-open">(</m:mo>
                                          <m:mrow>
                                             <m:mi>k</m:mi>
                                          </m:mrow>
                                          <m:mo class="MathClass-close">)</m:mo>
                                       </m:mrow>
                                    </m:mrow>
                                 </m:msup>
                              </m:mrow>
                           </m:mfenced>
                        </m:mrow>
                        <m:mo class="MathClass-close">)</m:mo>
                     </m:mrow>
                  </m:mrow>
                  <m:mrow>
                     <m:mi>T</m:mi>
                  </m:mrow>
               </m:msup>
            </m:mrow>
         </m:mfenced>
      </m:mrow>
      <m:mrow>
         <m:msubsup>
            <m:mrow>
               <m:mo mathsize="big">&#8721;</m:mo>
            </m:mrow>
            <m:mrow>
               <m:mi>k</m:mi>
               <m:mo class="MathClass-rel">=</m:mo>
               <m:mi>&#964;</m:mi>
               <m:mo class="MathClass-bin">-</m:mo>
               <m:mi>&#954;</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>&#964;</m:mi>
               <m:mo class="MathClass-bin">+</m:mo>
               <m:mi>&#954;</m:mi>
            </m:mrow>
         </m:msubsup>
         <m:msub>
            <m:mrow>
               <m:mi>w</m:mi>
            </m:mrow>
            <m:mrow>
               <m:mi>k</m:mi>
            </m:mrow>
         </m:msub>
      </m:mrow>
   </m:mfrac>
</m:mrow>
</m:math>
               </display-formula>
            </p>
            <p>where <it>k </it>indexes over windows <it>&#964; </it>- <it>&#954; </it>to <it>&#964; </it>+ <it>&#954;</it>, <it>&#954; </it>is a user-specified kernel width, and the weights <it>w</it><sub><it>k </it></sub>are defined by a nonnegative kernel function. The choice of kernel function is specified by the user. In our experiments, the kernel mixed the current window and the previous window, with the current window having twice the weight of the previous one. The time-varying model is then constructed by solving Eq. 9 for each <b>S</b>(<it>&#964;</it>). That is, the primary difference between the time-averaged and time-varying versions of the algorithm is the kernel function.</p>
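            <p>A minimal Python sketch of the weighted covariance computation is given below. It is an illustration under assumptions: the function names are hypothetical, each per-window covariance is normalized by the number of frames, and windows that fall outside the trajectory are dropped with the remaining weights renormalized (the text does not specify boundary handling). The kernel shown gives the current window twice the weight of the previous one, as in the experiments.</p>

```python
def covariance(frames):
    """Covariance of one window; frames is a list of equal-length vectors."""
    n, d = len(frames), len(frames[0])
    mean = [sum(f[i] for f in frames) / n for i in range(d)]
    return [[sum((f[i] - mean[i]) * (f[j] - mean[j]) for f in frames) / n
             for j in range(d)] for i in range(d)]


def weighted_covariance(window_covs, tau, kernel):
    """Kernel-weighted average of per-window covariances around window tau.

    kernel maps an offset (k - tau) to a nonnegative weight w_k.
    """
    d = len(window_covs[0])
    total = [[0.0] * d for _ in range(d)]
    weight_sum = 0.0
    for offset, w in kernel.items():
        k = tau + offset
        if k not in range(len(window_covs)):
            continue  # assumed boundary rule: skip windows outside the trajectory
        weight_sum += w
        for i in range(d):
            for j in range(d):
                total[i][j] += w * window_covs[k][i][j]
    return [[v / weight_sum for v in row] for row in total]


# Current window weighted twice as much as the previous window.
KERNEL = {0: 2.0, -1: 1.0}
```

            <p>Each resulting matrix would then be passed to the same solver used for the time-averaged model.</p>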
         </sec>
         <sec>
            <st>
               <p>Algorithm 3</p>
            </st>
            <p>The final algorithm builds on the second algorithm. Recall that the second algorithm learns <it>T </it>sequentially ordered models over windows of the trajectory. Moreover, recall that each model encodes a multivariate Gaussian (Eq. 1) and that the KL-divergence between multivariate Gaussians can be computed analytically via Eq. 3. The KL-divergence (also known as information gain or relative entropy) is a non-negative measure of the difference between two probability distributions. It is zero if and only if the two distributions are identical. It is not, however, a distance metric because it is not symmetric. That is, <it>D</it>(<it>P</it>||<it>Q</it>) &#8800; <it>D</it>(<it>Q</it>||<it>P</it>), in general. However, it is common to define a symmetric KL-divergence by simply summing the two directions: <it>KL</it><sub><it>sym </it></sub>= <it>D</it>(<it>P</it>||<it>Q</it>) + <it>D</it>(<it>Q</it>||<it>P</it>). We can thus cluster the models using any standard clustering algorithm, such as k-means or a hierarchical approach. In our experiments we used complete linkage clustering, an agglomerative method that minimizes the maximum distance between elements when merging clusters.</p>
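            <p>The symmetric KL-divergence between two Gaussian models can be sketched as below. This is an illustration, assuming the standard closed form for multivariate Gaussians and a mean and covariance parametrization; the function names are hypothetical. When the two directions are summed, the log-determinant terms cancel, so only matrix inverses are needed.</p>

```python
def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def symmetric_kl(mu_p, cov_p, mu_q, cov_q):
    """D(P||Q) + D(Q||P) for Gaussians P = N(mu_p, cov_p), Q = N(mu_q, cov_q)."""
    n = len(mu_p)
    idx = range(n)
    prec_p, prec_q = inverse(cov_p), inverse(cov_q)
    trace_pq = sum(prec_q[i][k] * cov_p[k][i] for i in idx for k in idx)
    trace_qp = sum(prec_p[i][k] * cov_q[k][i] for i in idx for k in idx)
    diff = [a - b for a, b in zip(mu_p, mu_q)]
    # Mahalanobis-type term using the sum of both precision matrices.
    quad = sum(diff[i] * (prec_p[i][k] + prec_q[i][k]) * diff[k]
               for i in idx for k in idx)
    return 0.5 * (trace_pq + trace_qp - 2 * n + quad)
```

            <p>A matrix of these pairwise values over the <it>T </it>window models could then be handed to any clustering routine that accepts precomputed dissimilarities.</p>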
            <p>Let <it>S </it>be the set of clusters returned by a clustering algorithm. Our final algorithm treats those clusters as states in a Markov chain. The prior probability of being in each state can be estimated using free energy calculations <abbrgrp><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr></abbrgrp> for each cluster, or according to the relative sizes of the clusters. It then estimates the transition probabilities between states <it>i </it>and <it>j </it>by counting the number of times a model assigned to cluster <it>i </it>is followed by a model assigned to cluster <it>j</it>. This simple approach creates a model that can be used to generate new trajectories by first sampling states from the Markov chain and then sampling conformations from the models associated with each sampled state.</p>
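            <p>The counting step and the state sampling can be sketched as follows. This is a minimal illustration: the function names are hypothetical, the cluster labels are assumed to be integers listed in window order, and a state with no observed outgoing transition is given a uniform row, which is our assumption rather than something stated in the text.</p>

```python
import random


def transition_matrix(labels, n_states):
    """Estimate transition probabilities from a sequence of cluster labels."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(labels, labels[1:]):
        counts[a][b] += 1  # window in cluster a followed by window in cluster b
    probs = []
    for row in counts:
        total = sum(row)
        if total:
            probs.append([c / total for c in row])
        else:
            probs.append([1.0 / n_states] * n_states)  # assumed fallback
    return probs


def sample_states(probs, start, length, rng):
    """Sample a state sequence of the given length from the Markov chain."""
    states = [start]
    for _ in range(length - 1):
        nxt = rng.choices(range(len(probs)), weights=probs[states[-1]])[0]
        states.append(nxt)
    return states


# Illustrative label sequence (made-up cluster assignments).
P = transition_matrix([0, 0, 1, 0, 1, 1], n_states=2)
path = sample_states(P, start=0, length=10, rng=random.Random(0))
```

            <p>Conformations would then be drawn from the Gaussian model attached to each sampled state; that step is omitted here.</p>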
         </sec>
      </sec>
      <sec>
         <st>
            <p>Experiments</p>
         </st>
         <p>We applied our algorithms to several molecular dynamics simulation trajectories. In this section, we illustrate some of the results obtained through this analysis. The algorithms were implemented in MATLAB and run on a dual-core Intel T9600 processor running at 2.8 GHz. The wall-clock runtimes for all the experiments ranged from seconds to about 10 minutes, depending on the size of the data set and parameter settings.</p>
         <sec>
            <st>
               <p>Algorithm 1: application to the early events of HIV entry</p>
            </st>
            <p>We applied the first algorithm to simulations of a complex (Figure <figr fid="F1">1</figr>-left) consisting of gp120 (a glycoprotein on the surface of the HIV envelope) and the CD4 receptor (a glycoprotein expressed on the surface of T helper cells). The binding of gp120 to CD4 receptors is among the first events involved in HIV's entry into helper T cells. We performed two simulations using NAMD <abbrgrp><abbr bid="B31">31</abbr></abbrgrp>. The first simulation was the gp120-CD4 complex in explicit solvent at 310 K. The second simulation was the same complex bound to Ibalizumab (Figure <figr fid="F1">1</figr>-right), a humanized monoclonal antibody that binds to CD4 and inhibits the viral entry process <abbrgrp><abbr bid="B32">32</abbr></abbrgrp>. Each trajectory was 2 ns long and contained 4500 frames.</p>
            <fig id="F1"><title><p>Figure 1</p></title><caption><p>(Left) gp120 (blue) bound to CD4 (green)</p></caption><text>
   <p>(Left) gp120 (blue) bound to CD4 (green). (Right) The same complex bound to Ibalizumab (yellow and purple), a monoclonal antibody HIV entry inhibitor. Notice that Ibalizumab does not bind to gp120.</p>
</text><graphic file="1471-2164-13-S1-S5-1"/></fig>
            <p>Ibalizumab's mechanism of action is poorly understood. As can be seen in Figure <figr fid="F1">1</figr>, Ibalizumab does not prevent gp120 from binding to CD4, nor does it directly bind to gp120 itself, suggesting that its inhibitory action occurs via an allosteric mechanism. To investigate this phenomenon, we applied our first algorithm to the two trajectories and then compared the resulting models. The variables in the models corresponded to the positional fluctuations of the C-<it>&#945; </it>atoms, relative to the initial frame of the simulation.</p>
            <sec>
               <st>
                  <p>Correlation networks</p>
               </st>
               <p>Figure <figr fid="F2">2</figr> illustrates the correlation networks learned from the drug-free (left) and drug-bound (right) simulations. The same lambda value (250) was used in each case. In each panel, a black dot indicates that residue <it>i </it>is connected to residue <it>j </it>in the graphical model. The residues corresponding to gp120 and CD4 are labeled on the left-hand side. Edges exist between both spatially proximal and distant residues. For these panels, only the data from the gp120 and CD4 atoms were modeled. Nevertheless, the effects of the drug are evident. In the drug-free case the direct correlations are largely intra-molecular, with inter-molecular correlations limited to the binding interface. The drug-bound model, in contrast, exhibits many more inter-molecular edges. Moreover, the drug-bound gp120 has far fewer intra-molecular edges. That is, Ibalizumab not only modulates the interactions between gp120 and CD4, it also changes the internal correlation structure of gp120, despite the fact that the drug only binds to CD4. This is consistent with the hypothesis that Ibalizumab's inhibitory action occurs via an allosteric mechanism.</p>
               <fig id="F2"><title><p>Figure 2</p></title><caption><p>gp120-CD4 correlation networks learned with Algorithm 1</p></caption><text>
   <p><b>gp120-CD4 correlation networks learned with Algorithm 1.</b> (Left) Edges learned by algorithm for the drug-free simulation. (Right) Edges learned by algorithm for the drug-bound simulation.</p>
</text><graphic file="1471-2164-13-S1-S5-2"/></fig>
               <p>The probabilistic nature of the model means that it is possible to compute the likelihood of each data set under both models. Table <tblr tid="T1">1</tblr> presents the log-likelihoods of both data sets under both models. As expected, the log-likelihood of the unbound data is larger (i.e., more likely) under the unbound model than it is under the bound model, and vice versa. That is, the models are capturing statistical differences between the simulations.</p>
               <tbl id="T1"><title><p>Table 1</p></title><caption><p>Log-likelihood <inline-formula><m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i41"><m:mrow><m:mrow><m:mo class="MathClass-open">(</m:mo><m:mrow><m:mi mathvariant="bold-script">L</m:mi><m:mi mathvariant="bold-script">L</m:mi></m:mrow><m:mo class="MathClass-close">)</m:mo></m:mrow></m:mrow></m:math></inline-formula> of the gp120-CD4 simulations under both models</p></caption><tblbdy cols="3">
      <r>
         <c ca="center">
            <p>
               <b>
                  <it>Data</it>
               </b>
            </p>
         </c>
         <c ca="center">
            <p>
               <inline-formula>
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i42">
                     <m:mi mathvariant="bold-script">L</m:mi>
                     <m:mi mathvariant="bold-script">L</m:mi>
                  </m:math>
               </inline-formula>
               <b>(<it>Data</it>|<it>Unbound Model</it>)</b>
            </p>
         </c>
         <c ca="center">
            <p>
               <inline-formula>
                  <m:math xmlns:m="http://www.w3.org/1998/Math/MathML" name="1471-2164-13-S1-S5-i42">
                     <m:mi mathvariant="bold-script">L</m:mi>
                     <m:mi mathvariant="bold-script">L</m:mi>
                  </m:math>
               </inline-formula>
               <b>(<it>Data</it>|<it>Drug </it>- <it>Bound Model</it>)</b>
            </p>
         </c>
      </r>
      <r>
         <c cspan="3">
            <hr/>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>Unbound</p>
         </c>
         <c ca="center">
            <p>-0.03</p>
         </c>
         <c ca="center">
            <p>-0.19</p>
         </c>
      </r>
      <r>
         <c ca="center">
            <p>Bound</p>
         </c>
         <c ca="center">
            <p>-0.04</p>
         </c>
         <c ca="center">
            <p>-0.29</p>
         </c>
      </r>
   </tblbdy></tbl>
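               <p>The cross-likelihood comparison reported in Table 1 can be sketched as follows. This is a minimal illustration, not the authors' code: it uses the standard multivariate Gaussian log-density averaged over frames, fits each toy model with an unregularized sample covariance as a stand-in for the regularized estimate, and runs on synthetic two-variable data. The function name <it>avg_log_likelihood</it> is hypothetical, and the normalization used for the values in Table 1 is not specified here, so the numbers are not expected to match.</p>

```python
import numpy as np

def avg_log_likelihood(X, mu, precision):
    """Mean Gaussian log-density of the rows of X under N(mu, precision^-1)."""
    n = X.shape[1]
    centered = X - mu
    # log-determinant of the precision matrix, computed stably
    sign, logdet = np.linalg.slogdet(precision)
    if not sign > 0:
        raise ValueError("precision matrix must have a positive determinant")
    # Quadratic form (x - mu)^T Theta (x - mu) for every frame
    quad = np.einsum("ti,ij,tj->t", centered, precision, centered)
    return float(np.mean(0.5 * (logdet - quad - n * np.log(2.0 * np.pi))))

def fit_gaussian(X):
    """Toy stand-in for model learning: sample mean and inverse sample covariance."""
    return X.mean(axis=0), np.linalg.inv(np.cov(X, rowvar=False))

# Synthetic "trajectories" with opposite correlation structure (assumed data)
rng = np.random.default_rng(0)
X_a = rng.multivariate_normal(np.zeros(2), [[1.0, 0.8], [0.8, 1.0]], size=2000)
X_b = rng.multivariate_normal(np.zeros(2), [[1.0, -0.8], [-0.8, 1.0]], size=2000)

mu_a, theta_a = fit_gaussian(X_a)
mu_b, theta_b = fit_gaussian(X_b)

ll_aa = avg_log_likelihood(X_a, mu_a, theta_a)  # data A under model A
ll_ab = avg_log_likelihood(X_a, mu_b, theta_b)  # data A under model B
ll_bb = avg_log_likelihood(X_b, mu_b, theta_b)  # data B under model B
ll_ba = avg_log_likelihood(X_b, mu_a, theta_a)  # data B under model A
print(ll_aa, ll_ab, ll_bb, ll_ba)
```

               <p>In this toy setting each data set scores higher under the model fitted to it, which is the pattern the text describes for models that capture genuine statistical differences.</p>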
               <p>Figure <figr fid="F3">3</figr> illustrates the correlation networks learned for all three molecules in the drug-bound simulation. A red box encompasses edges between the drug and the V5 loop of gp120. These particular couplings are interesting because it is known that mutations to the V5 loop can cause resistance to Ibalizumab <abbrgrp><abbr bid="B33">33</abbr></abbrgrp>. Future simulations of such mutants might provide further insights into the mechanism of resistance.</p>
               <fig id="F3"><title><p>Figure 3</p></title><caption><p>gp120-CD4-Ibalizumab correlation networks learned with Algorithm 1</p></caption><text>
   <p><b>gp120-CD4-Ibalizumab correlation networks learned with Algorithm 1.</b> Edges learned by algorithm for the drug-bound simulation. Here, all three molecules are shown.</p>
</text><graphic file="1471-2164-13-S1-S5-3"/></fig>
            </sec>
            <sec>
               <st>
                  <p>Comparison to sub-optimal models</p>
               </st>
               <p>Our method is guaranteed to return an optimal model. Here we compare the models returned by our algorithm to those obtained by a reasonable, but nevertheless sub-optimal algorithm for generating sparse networks. For comparison, we inverted the <it>sample </it>covariance matrices for each data set. The resulting sample precision matrices were then thresholded so that they had the same number of edges as the ones produced via our method. We find that while the resulting models have similar fits to the data (-0.02 log-likelihood for the unbound trajectory; -0.03 log-likelihood for the bound trajectory), the <it>L</it><sub>1 </sub>penalty is much larger in each case (0.86 vs 15.1 for unbound; 0.75 vs 12.9 for bound). The difference in <it>L</it><sub>1 </sub>penalties is due to the radically different choices of edges each method makes. Only 41% (resp. 31%) of the unbound (resp. bound) edges match the ones identified by our algorithm. Moreover, the thresholded sample precision matrices (Figure <figr fid="F4">4</figr>) lack the kind of structure seen in Figure <figr fid="F2">2</figr>. Thus, in addition to maximizing Eq. 6, the models produced by our method are potentially easier to interpret.</p>
               <fig id="F4"><title><p>Figure 4</p></title><caption><p>Thresholded precision matrix models</p></caption><text>
   <p><b>Thresholded precision matrix models.</b> (Left) Edges produced by thresholding inverse of sample covariance matrix for the drug-free simulation. (Right) Edges produced by thresholding inverse of sample covariance matrix for the drug-bound simulation. Notice that the edges lack the kind of structure seen in Figures <figr fid="F2">2</figr> and <figr fid="F3">3</figr>.</p>
</text><graphic file="1471-2164-13-S1-S5-4"/></fig>
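               <p>The baseline described above (invert the sample covariance, then keep a fixed number of edges) can be sketched as follows. This is an illustrative reading under stated assumptions: edges are ranked by the absolute value of the off-diagonal precision entries, the <it>L</it><sub>1 </sub>penalty is taken to be the sum of absolute entries, and the helper names and random data are hypothetical. Note that a thresholded matrix is not guaranteed to remain positive definite.</p>

```python
import numpy as np

def threshold_precision(X, n_edges):
    """Invert the sample covariance and keep the n_edges largest off-diagonal entries."""
    precision = np.linalg.inv(np.cov(X, rowvar=False))
    p = precision.shape[0]
    iu = np.triu_indices(p, k=1)
    magnitudes = np.abs(precision[iu])
    # Indices of the strongest candidate edges, largest magnitude first
    keep = np.argsort(magnitudes)[::-1][:n_edges]
    sparse = np.diag(np.diag(precision))
    rows, cols = iu[0][keep], iu[1][keep]
    sparse[rows, cols] = precision[rows, cols]
    sparse[cols, rows] = precision[cols, rows]
    return sparse

def l1_penalty(theta):
    """Sum of absolute entries (assumed form of the penalty)."""
    return float(np.abs(theta).sum())

def edge_overlap(theta_a, theta_b):
    """Fraction of the edges in theta_a that are also edges in theta_b."""
    iu = np.triu_indices(theta_a.shape[0], k=1)
    a = theta_a[iu] != 0
    b = theta_b[iu] != 0
    return float(np.logical_and(a, b).sum() / a.sum())

# Toy data: 200 frames of 6 variables (assumed, for illustration only)
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))
theta_thr = threshold_precision(X, n_edges=4)
print(l1_penalty(theta_thr), edge_overlap(theta_thr, theta_thr))
```

               <p>Calling <it>edge_overlap</it> on the thresholded matrix and a regularized estimate would give the kind of edge-agreement percentage quoted in the text.</p>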
            </sec>
            <sec>
               <st>
                  <p>Perturbation analysis</p>
               </st>
               <p>Next, we demonstrate the use of inference to quantify the sensitivity of gp120 to structural perturbations in the drug. We conditioned the model learned from the trajectory with gp120, CD4 and Ibalizumab on the structure of the drug and then performed inference (Eq. 4) to compute the most likely configuration of the remaining variables (i.e., those corresponding to gp120 and CD4). This was repeated for each frame in the trajectory. The residues with the highest average displacement are illustrated as red spheres in Figure <figr fid="F5">5</figr>. As expected, the residues that form the binding interface between CD4 and Ibalizumab are sensitive to Ibalizumab's motions. Interestingly, a number of gp120 residues are also sensitive, including residues in the vicinity of the V5 loop.</p>
               <fig id="F5"><title><p>Figure 5</p></title><caption><p>Sensitivity to perturbations</p></caption><text>
   <p><b>Sensitivity to perturbations.</b> The red spheres mark the residues that are most sensitive to perturbations in the drug.</p>
</text><graphic file="1471-2164-13-S1-S5-5"/></fig>
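               <p>The conditioning step can be illustrated with the standard closed form for a Gaussian model, in which the most likely configuration of the unobserved variables equals their conditional mean. The sketch below assumes the model is stored as a mean vector and a precision matrix; the function name <it>conditional_mean</it> and the three-variable example are hypothetical and are not taken from the simulations.</p>

```python
import numpy as np

def conditional_mean(mu, precision, observed_idx, observed_values):
    """Most likely values of the unobserved variables given the observed ones.

    Uses E[x_f | x_o] = mu_f - Theta_ff^{-1} Theta_fo (x_o - mu_o),
    where f indexes the free (unobserved) variables and o the observed ones.
    """
    p = mu.shape[0]
    observed_idx = np.asarray(observed_idx)
    free_idx = np.setdiff1d(np.arange(p), observed_idx)
    theta_ff = precision[np.ix_(free_idx, free_idx)]
    theta_fo = precision[np.ix_(free_idx, observed_idx)]
    delta = observed_values - mu[observed_idx]
    # Solve a linear system rather than forming an explicit inverse
    shift = np.linalg.solve(theta_ff, theta_fo @ delta)
    return free_idx, mu[free_idx] - shift

# Toy model: variable 2 plays the role of the "drug" (assumed example)
cov = np.array([[2.0, 0.6, 0.2],
                [0.6, 1.0, 0.3],
                [0.2, 0.3, 1.5]])
mu = np.zeros(3)
theta = np.linalg.inv(cov)

free_idx, x_map = conditional_mean(mu, theta, [2], np.array([1.0]))
print(free_idx, x_map)
```

               <p>Repeating this for the drug coordinates of every frame and averaging the displacement of each free variable from its mean would give a per-residue sensitivity score of the kind described above.</p>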
            </sec>
         </sec>
         <sec>
            <st>
               <p>Algorithm 2: application to a 1 microsecond simulation of the engrailed homeodomain</p>
            </st>
            <p>We applied the second algorithm to a simulation of the engrailed homeodomain (Figure <figr fid="F6">6</figr>), a 54-residue DNA binding domain. The DNA-binding domains of the homeotic proteins, called homeodomains (HD), play an important role in the development of all metazoans <abbrgrp><abbr bid="B34">34</abbr></abbrgrp> and certain mutations to HDs are known to cause disease in humans <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. Homeodomains fold into a highly conserved structure consisting of three alpha-helices wherein the C-terminal helix makes sequence-specific contacts in the major groove of DNA <abbrgrp><abbr bid="B36">36</abbr></abbrgrp>. The Engrailed Homeodomain (En-HD) is an ultra-fast folding protein that is predicted to exhibit significant amounts of helical structure in the denatured state ensemble <abbrgrp><abbr bid="B37">37</abbr></abbrgrp>. Moreover, the experimentally determined unfolding rate is 1.1 &#215; 10<sup>3</sup>/sec <abbrgrp><abbr bid="B38">38</abbr></abbrgrp>, which is also fast. Taken together, these observations suggest that the protein may exhibit substantial conformational fluctuations at equilibrium.</p>
            <fig id="F6"><title><p>Figure 6</p></title><caption><p>Engrailed homeodomain</p></caption><text>
   <p>
      <b>Engrailed homeodomain.</b>
   </p>
</text><graphic file="1471-2164-13-S1-S5-6"/></fig>
            <p>We performed three 50-microsecond simulations of the protein at 300, 330, and 350 K. These simulations were performed on A<monospace>NTON</monospace> <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, a special-purpose supercomputer designed to perform long-timescale simulations. Each simulation had more than 500,000 frames. In this paper, we learned a time-varying model of the first microsecond of the 300 K trajectory, modeling the fluctuations of the alpha carbons. The window size was 2 ns, and a sawtooth smoothing kernel was applied such that the <it>i</it>th model is built from the data from windows <it>i</it>, <it>i </it>- 1, and <it>i </it>- 2 with kernel weights 0.57, 0.29, and 0.14, respectively. A total of 500 models were learned from the first microsecond of the trajectory.</p>
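            <p>A 2 ns window over a 1 microsecond trajectory gives the 500 models mentioned above. The kernel weighting can be sketched as follows. The quoted weights are consistent with ratios of 4:2:1 normalized to sum to one, but that reading, the way the weights are applied to the covariance estimate, and the function names are assumptions made for illustration.</p>

```python
import numpy as np

def kernel_weights(raw=(4.0, 2.0, 1.0)):
    """Normalize raw kernel ratios (assumed to be 4:2:1) so they sum to one."""
    w = np.asarray(raw, dtype=float)
    return w / w.sum()

def weighted_covariance(windows, weights):
    """Kernel-weighted mean and covariance.

    windows: list of (frames, variables) arrays ordered i, i-1, i-2.
    Each window's weight is shared equally among its frames.
    """
    frame_w = np.concatenate(
        [np.full(len(X), w / len(X)) for X, w in zip(windows, weights)]
    )
    data = np.vstack(windows)
    mean = frame_w @ data
    centered = data - mean
    cov = (centered * frame_w[:, None]).T @ centered
    return mean, cov

weights = kernel_weights()
print(np.round(weights, 2))

# Toy windows of 100 frames and 3 variables each (assumed data)
rng = np.random.default_rng(2)
windows = [rng.normal(size=(100, 3)) for _ in range(3)]
mean_i, cov_i = weighted_covariance(windows, weights)
```

            <p>The weighted covariance would then take the place of the sample covariance when the regularized model for window <it>i </it>is estimated.</p>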
            <p>Figure <figr fid="F7">7-A</figr> plots the differential entropy (Eq. 2) of the 500 models. We see that the curve has a variety of peaks and valleys that can be used to segment the trajectory into putative sub-states. Figures <figr fid="F7">7-B</figr> and <figr fid="F7">7-C</figr> illustrate the correlation networks obtained from the models with the smallest and largest differential entropies, respectively. As can be seen, the simulation visits sub-states that have radically different correlation structures.</p>
            <fig id="F7"><title><p>Figure 7</p></title><caption><p>(A) Differential entropy of the 500 models learned from the engrailed trajectory</p></caption><text>
   <p>(A) Differential entropy of the 500 models learned from the engrailed trajectory. (B) Correlation network of the model with the smallest differential entropy (model 42). (C) Correlation network of the model with the largest differential entropy (model 342).</p>
</text><graphic file="1471-2164-13-S1-S5-7"/></fig>
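            <p>For a Gaussian model the differential entropy has a closed form that depends only on the determinant of the covariance (equivalently, of the precision matrix). The sketch below uses the textbook expression, which is assumed to correspond to Eq. 2; the function name and the toy sequence of models are hypothetical.</p>

```python
import numpy as np

def differential_entropy(precision):
    """Entropy (in nats) of a Gaussian with the given precision matrix.

    H = 0.5 * (n * ln(2 * pi * e) - ln|Theta|)
    """
    n = precision.shape[0]
    sign, logdet = np.linalg.slogdet(precision)
    if not sign > 0:
        raise ValueError("precision matrix must have a positive determinant")
    return 0.5 * (n * np.log(2.0 * np.pi * np.e) - logdet)

# Toy sequence of models: the same 2-variable Gaussian at different widths
scales = [1.0, 0.5, 2.0, 4.0]
models = [np.eye(2) / s for s in scales]  # precision = inverse covariance
entropies = np.array([differential_entropy(theta) for theta in models])

# Indices of the most and least constrained models in the sequence
print(int(np.argmin(entropies)), int(np.argmax(entropies)))
```

            <p>Plotting such entropies against the model index produces a curve of the kind shown in panel A, with the minimum and maximum identifying the models shown in panels B and C.</p>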
            <p>Figure <figr fid="F8">8-A</figr> plots the average log-likelihood of the frames from the (<it>i </it>+ 1)st window under the <it>i</it>th model. Sharp drops in the likelihood can also be used to segment the trajectory into possible sub-states and to pin-point the moment when the system transitions between them. Figure <figr fid="F8">8-B</figr> shows the log-likelihood of each of the frames under each of the 500 models. Figure <figr fid="F8">8-C</figr> shows the first 50 rows and the first 2,000 columns of Figure <figr fid="F8">8-B</figr>. The block structure of this matrix more clearly illustrates the sub-states visited by the simulation.</p>
            <fig id="F8"><title><p>Figure 8</p></title><caption><p>(A) Average log-likelihood of the frames from the (<it>i </it>+ 1)st window under the <it>i</it>th model</p></caption><text>
   <p>(A) Average log-likelihood of the frames from the (<it>i </it>+ 1)st window under the <it>i</it>th model. Sudden drops in likelihood mark the transition between sub-states. (B) Log-likelihoods for each frame under each of the 500 models. (C) The first 50 rows and first 2,000 columns of the matrix from panel B. The block-structure illustrates the sub-states visited in the first 2,000 frames.</p>
</text><graphic file="1471-2164-13-S1-S5-8"/></fig>
            <p>Figure <figr fid="F9">9-A</figr> plots the symmetric version of the KL-divergence (Eq. 3) between sequential models. Once again, spikes in this curve can be used to segment the trajectory.</p>
            <fig id="F9"><title><p>Figure 9</p></title><caption><p>(A) KL-divergence between sequential models</p></caption><text>
   <p>(A) KL-divergence between sequential models. (B) Pairwise KL-divergences between models.</p>
</text><graphic file="1471-2164-13-S1-S5-9"/></fig>
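            <p>The KL-divergence between two Gaussian models also has a closed form. The sketch below uses the textbook expression and symmetrizes it by summing the two directions; whether Eq. 3 uses the sum or the average is not stated here, so that choice, like the function names, is an assumption.</p>

```python
import numpy as np

def kl_gaussian(mu_p, cov_p, mu_q, cov_q):
    """KL(P || Q) in nats for Gaussians P = N(mu_p, cov_p), Q = N(mu_q, cov_q)."""
    n = mu_p.shape[0]
    theta_q = np.linalg.inv(cov_q)
    diff = mu_q - mu_p
    logdet_p = np.linalg.slogdet(cov_p)[1]
    logdet_q = np.linalg.slogdet(cov_q)[1]
    return 0.5 * (
        np.trace(theta_q @ cov_p) + diff @ theta_q @ diff - n + logdet_q - logdet_p
    )

def symmetric_kl(mu_p, cov_p, mu_q, cov_q):
    """Symmetrized divergence: KL(P || Q) + KL(Q || P) (assumed convention)."""
    return kl_gaussian(mu_p, cov_p, mu_q, cov_q) + kl_gaussian(mu_q, cov_q, mu_p, cov_p)

# One-dimensional check: two unit-variance Gaussians whose means differ by 1
mu0, mu1 = np.array([0.0]), np.array([1.0])
unit = np.eye(1)
d01 = symmetric_kl(mu0, unit, mu1, unit)
d00 = symmetric_kl(mu0, unit, mu0, unit)
print(d01, d00)
```

            <p>Evaluating this function on consecutive models gives a curve like panel A; evaluating it on all pairs gives a matrix like panel B.</p>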
         </sec>
         <sec>
            <st>
               <p>Algorithm 3: application to a 1 microsecond simulation of the engrailed homeodomain</p>
            </st>
            <p>Using the 500 models learned in the previous section, we computed the symmetric KL-divergence between all pairs of models. Recall that the KL-divergence (Eq. 3) is a measure of the difference between distributions. Figure <figr fid="F9">9-B</figr> plots the pairwise KL divergences between the 500 models.</p>
            <p>We then applied complete linkage clustering to the KL-divergence matrix. Complete linkage clustering minimizes the maximum distance between elements when merging clusters. We selected a total of 7 clusters based on the assumption that the number of sub-states visited by a sequence of <it>m </it>models is proportional to the logarithm of <it>m</it>. The intuition behind this assumption is that different sub-states are separated by energy barriers and the probability of surmounting an energy barrier is exponentially small in the height of the barrier. Figure <figr fid="F10">10</figr> shows representative structures from the two largest clusters. As can be seen, the primary difference between the two structures is the N-terminal loop.</p>
            <fig id="F10"><title><p>Figure 10</p></title><caption><p>Representative structures for states 4 (green) and 6 (magenta)</p></caption><text>
   <p>
      <b>Representative structures for states 4 (green) and 6 (magenta).</b>
   </p>
</text><graphic file="1471-2164-13-S1-S5-10"/></fig>
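            <p>The clustering step can be sketched with SciPy's hierarchical clustering routines. Rounding the natural logarithm of 500 up to the next integer reproduces the 7 clusters used above, but the base of the logarithm and the rounding are assumptions, as are the function name <it>cluster_models</it> and the toy distance matrix.</p>

```python
import math

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def cluster_models(divergence, n_clusters):
    """Complete-linkage clustering of a square matrix of pairwise divergences."""
    # Enforce exact symmetry and a zero diagonal before condensing
    sym = 0.5 * (divergence + divergence.T)
    np.fill_diagonal(sym, 0.0)
    condensed = squareform(sym, checks=False)
    tree = linkage(condensed, method="complete")
    # Labels are integers starting at 1
    return fcluster(tree, t=n_clusters, criterion="maxclust")

# Assumed rule for the number of sub-states: ceil(ln m) with m = 500 models
n_states = math.ceil(math.log(500))
print(n_states)

# Toy example: 8 "models" forming 3 well-separated groups on a line
positions = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 10.0, 10.1, 10.2])
toy_divergence = np.abs(positions[:, None] - positions[None, :])
labels = cluster_models(toy_divergence, n_clusters=3)
print(labels)
```

            <p>Applied to the 500 by 500 matrix of pairwise divergences, the same call would assign each model to one of the 7 putative sub-states.</p>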
            <p>Finally, we estimated the parameters of a Markov chain over the 7 clusters by counting the number of times a model from the <it>i</it>th cluster was followed by a model from the <it>j</it>th cluster. The resulting state-transition matrix is shown in Figure <figr fid="F11">11</figr>. The matrix indicates that state 4 is the dominant state, but inter-converts with states 6 and 7. This state-transition matrix and the graphical models associated with each state encapsulate the statistics of the trajectory.</p>
            <fig id="F11"><title><p>Figure 11</p></title><caption><p>State-transition matrix</p></caption><text>
   <p><b>State-transition matrix.</b> The color indicates the log of the number of times state <it>i </it>transitions to state <it>j</it>.</p>
</text><graphic file="1471-2164-13-S1-S5-11"/></fig>
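            <p>Estimating the Markov chain by counting amounts to tallying consecutive pairs of cluster labels. The sketch below is a minimal version with hypothetical function names and a made-up label sequence; it assumes labels are zero-based integers, so labels that start at 1 would need to be shifted first.</p>

```python
import numpy as np

def transition_counts(labels, n_states):
    """Count how often state a is immediately followed by state b."""
    counts = np.zeros((n_states, n_states), dtype=int)
    for a, b in zip(labels[:-1], labels[1:]):
        counts[a, b] += 1
    return counts

def transition_matrix(counts):
    """Row-normalize counts into probabilities; rows with no data stay zero."""
    totals = counts.sum(axis=1, keepdims=True)
    return np.divide(
        counts,
        totals,
        out=np.zeros(counts.shape, dtype=float),
        where=totals != 0,
    )

# Made-up sequence of cluster assignments for consecutive models
labels = [0, 0, 1, 0, 2, 2]
counts = transition_counts(labels, n_states=3)
probs = transition_matrix(counts)
print(counts)
print(probs)
```

            <p>Taking the logarithm of the nonzero counts gives a matrix in the form described in the caption of Figure 11.</p>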
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>Many existing techniques for analyzing MD data are closely related to, or direct applications of, Principal Components Analysis (PCA). Quasi-Harmonic Analysis (QHA) <abbrgrp><abbr bid="B18">18</abbr><abbr bid="B19">19</abbr></abbrgrp>, for example, is PCA applied to a mass-weighted covariance matrix of atomic fluctuations. PCA-based methods diagonalize the covariance matrix and thus produce a set of eigenvectors and corresponding eigenvalues. Each eigenvector can be interpreted as one of the principal modes of vibration within the system or, equivalently, as a normally distributed random variable with zero mean and variance proportional to the corresponding eigenvalue. That is, PCA-based methods model the data in terms of a multivariate Gaussian distribution. Our methods also build multivariate Gaussian models of the data but do so over the real-space variables, not the eigen-space variables.</p>
         <p>PCA-based methods generally project the data onto a low-dimensional subspace spanned by the eigenvectors corresponding to the largest eigenvalues. This is done to simplify the data and because lower dimensional models tend to be more robust (i.e., less likely to over-fit the data). Our methods, in contrast, use regularization when estimating the parameters of the model to achieve the same goals.</p>
         <p>The eigenvectors produced by PCA-based methods contain useful information about how different regions of the system move in a coordinated fashion. In particular, the components of each vector quantify the degree of coupling between the covariates in that mode. However, the eigenvectors make no distinction between direct and indirect couplings. Moreover, eigenvectors are an inherently global description of dynamics. Our methods, in contrast, do not perform a change of basis and instead model the data in terms of a network of correlations. The resulting model, therefore, reveals which correlations are direct and which are indirect. Pathways in these networks may provide mechanistic insights into important phenomena, such as allosteric regulation. Our models can also be used to investigate motions that are localized to specific regions of the system.</p>
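         <p>The distinction between direct and indirect couplings can be seen in a three-variable toy example, constructed here for illustration and not taken from the simulations. Variables 0 and 2 interact only through variable 1, so the precision matrix has a zero in position (0, 2), yet the covariance matrix and the leading principal component both couple all three variables.</p>

```python
import numpy as np

# Chain-structured precision matrix: 0 -- 1 -- 2, no direct 0 -- 2 edge
theta = np.array([[2.0, -1.0, 0.0],
                  [-1.0, 2.0, -1.0],
                  [0.0, -1.0, 2.0]])
cov = np.linalg.inv(theta)

# Partial correlations: -Theta_ij / sqrt(Theta_ii * Theta_jj)
d = np.sqrt(np.diag(theta))
partial = -theta / np.outer(d, d)
np.fill_diagonal(partial, 1.0)

# PCA on the covariance: eigh returns eigenvalues in ascending order
eigvals, eigvecs = np.linalg.eigh(cov)
leading = eigvecs[:, -1]

print("covariance(0, 2):         ", cov[0, 2])      # nonzero: indirect coupling
print("partial correlation(0, 2):", partial[0, 2])  # zero: no direct edge
print("leading eigenvector:      ", leading)        # every component is nonzero
```

         <p>The covariance and the eigenvector alone do not reveal that the coupling between variables 0 and 2 is mediated entirely by variable 1, whereas the zero pattern of the precision matrix does.</p>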
         <p>Finally, we note that because our first algorithm produces a regularized estimate of the true covariance matrix, &#931;, it could potentially be used as a pre-processing step for PCA-based methods, which normally take as input the sample covariance matrix.</p>
      </sec>
      <sec>
         <st>
            <p>Conclusions and future work</p>
         </st>
         <p>We have introduced three novel methods for analyzing Molecular Dynamics simulation data. Our algorithms learn regularized graphical models of the data which can then be used to: (i) investigate the networks of correlations in the data; (ii) sample novel configurations; or (iii) perform <it>in silico </it>perturbation studies. We note that our methods are complementary to existing analysis techniques, and are not intended to replace them.</p>
         <p>There are a number of important areas for future research. Gaussian Graphical Models have a number of limitations, most notably that they encode uni-modal distributions and are best suited to modeling harmonic motions. Boltzmann distributions, in contrast, are usually multi-modal. Our third algorithm partially addresses this problem by creating a Markov chain over GGMs, but the motions are still harmonic. Discrete distributions could be used to model anharmonic motions (e.g., by adapting the algorithm in <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>). Gaussian distributions are also best suited to modeling variables defined on the real line. Angular variables, naturally, are best modeled with circular distributions, like the von Mises. We have recently developed an algorithm for learning multivariate von Mises graphical models <abbrgrp><abbr bid="B25">25</abbr></abbrgrp> which could be used to model distributions over bond and dihedral angles.</p>
      </sec>
      <sec>
         <st>
            <p>List of abbreviations used</p>
         </st>
         <p>GGM: Gaussian Graphical Model; KL: Kullback-Leibler; MAP: maximum a posteriori; MD: Molecular dynamics; MRF: Markov Random Field; MSM: Markov State Model; PCA: Principal Components Analysis; QHA: Quasi-Harmonic Analysis.</p>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>The authors declare that they have no competing interests.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>All three authors contributed to the creation and implementation of the algorithms and writing the manuscript. N.S.R. and C.J.L. performed the experiments and analysis.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>This work is supported in part by US NSF grant IIS-0905193. Use of the Anton machine was provided through an allocation from the National Resource for Biomedical Supercomputing at the Pittsburgh Supercomputing Center via US NIH RC2GM093307.</p>
            <p>This article has been published as part of <it>BMC Genomics </it>Volume 13 Supplement 1, 2012: Selected articles from the Tenth Asia Pacific Bioinformatics Conference (APBC 2012). The full contents of the supplement are available online at <url>http://www.biomedcentral.com/1471-2164/13?issue=S1</url>.</p>
         </sec>
      </ack>
      <refgrp><bibl id="B1"><title><p>Temperature-dependent X-ray diffraction as a probe of protein structural dynamics</p></title><aug><au><snm>Frauenfelder</snm><fnm>H</fnm></au><au><snm>Petsko</snm><fnm>GA</fnm></au><au><snm>Tsernoglou</snm><fnm>D</fnm></au></aug><source>Nature</source><pubdate>1979</pubdate><volume>280</volume><issue>5723</issue><fpage>558</fpage><lpage>563</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/280558a0</pubid><pubid idtype="pmpid">460437</pubid></pubidlist></xrefbib></bibl><bibl id="B2"><title><p>Conformational substates in proteins</p></title><aug><au><snm>Frauenfelder</snm><fnm>H</fnm></au><au><snm>Parak</snm><fnm>F</fnm></au><au><snm>Young</snm><fnm>RD</fnm></au></aug><source>Annu Rev Biophys Biophys Chem</source><pubdate>1988</pubdate><volume>17</volume><fpage>451</fpage><lpage>479</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1146/annurev.bb.17.060188.002315</pubid><pubid idtype="pmpid">3293595</pubid></pubidlist></xrefbib></bibl><bibl id="B3"><title><p>Dynamic personalities of proteins</p></title><aug><au><snm>Henzler-Wildman</snm><fnm>K</fnm></au><au><snm>Kern</snm><fnm>D</fnm></au></aug><source>Nature</source><pubdate>2007</pubdate><volume>450</volume><fpage>964</fpage><lpage>972</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature06522</pubid><pubid idtype="pmpid" link="fulltext">18075575</pubid></pubidlist></xrefbib></bibl><bibl id="B4"><title><p>The role of dynamic conformational ensembles in biomolecular recognition</p></title><aug><au><snm>Boehr</snm><fnm>DD</fnm></au><au><snm>Nussinov</snm><fnm>R</fnm></au><au><snm>Wright</snm><fnm>PE</fnm></au></aug><source>Nat Chem Biol</source><pubdate>2009</pubdate><volume>5</volume><issue>11</issue><fpage>789</fpage><lpage>796</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nchembio.232</pubid><pubid idtype="pmcid">2916928</pubid><pubid idtype="pmpid" link="fulltext">19841628</pubid></pubidlist></xrefbib></bibl><bibl id="B5"><title><p>Hidden alternative 
structures of proline isomerase essential for catalysis</p></title><aug><au><snm>Fraser</snm><fnm>J</fnm></au><au><snm>Clarkson</snm><fnm>M</fnm></au><au><snm>Degnan</snm><fnm>S</fnm></au><au><snm>Erion</snm><fnm>R</fnm></au><au><snm>Kern</snm><fnm>D</fnm></au><au><snm>Alber</snm><fnm>T</fnm></au></aug><source>Nature</source><pubdate>2009</pubdate><volume>462</volume><issue>7273</issue><fpage>669</fpage><lpage>673</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature08615</pubid><pubid idtype="pmcid">2805857</pubid><pubid idtype="pmpid" link="fulltext">19956261</pubid></pubidlist></xrefbib></bibl><bibl id="B6"><title><p>Enzyme dynamics during catalysis</p></title><aug><au><snm>Eisenmesser</snm><fnm>EZ</fnm></au><au><snm>Bosco</snm><fnm>DA</fnm></au><au><snm>Akke</snm><fnm>M</fnm></au><au><snm>Kern</snm><fnm>D</fnm></au></aug><source>Science</source><pubdate>2002</pubdate><volume>295</volume><issue>5559</issue><fpage>1520</fpage><lpage>1523</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1126/science.1066176</pubid><pubid idtype="pmpid">11859194</pubid></pubidlist></xrefbib></bibl><bibl id="B7"><title><p>Intrinsic dynamics of an enzyme underlies catalysis</p></title><aug><au><snm>Eisenmesser</snm><fnm>EZ</fnm></au><au><snm>Millet</snm><fnm>O</fnm></au><au><snm>Labeikovsky</snm><fnm>W</fnm></au><au><snm>Korzhnev</snm><fnm>D</fnm></au><au><snm>M</snm><fnm>WW</fnm></au><au><snm>Bosco</snm><fnm>D</fnm></au><au><snm>Skalicky</snm><fnm>J</fnm></au><au><snm>Kay</snm><fnm>L</fnm></au><au><snm>Kern</snm><fnm>D</fnm></au></aug><source>Nature</source><pubdate>2005</pubdate><volume>438</volume><fpage>117</fpage><lpage>121</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nature04105</pubid><pubid idtype="pmpid" link="fulltext">16267559</pubid></pubidlist></xrefbib></bibl><bibl id="B8"><title><p>Energy flow in proteins</p></title><aug><au><snm>Leitner</snm><fnm>DM</fnm></au></aug><source>Annu Rev Phys 
Chem</source><pubdate>2008</pubdate><volume>59</volume><fpage>233</fpage><lpage>259</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1146/annurev.physchem.59.032607.093606</pubid><pubid idtype="pmpid" link="fulltext">18393676</pubid></pubidlist></xrefbib></bibl><bibl id="B9"><title><p>Molecular dynamics simulations of biomolecules</p></title><aug><au><snm>Karplus</snm><fnm>M</fnm></au><au><snm>McCammon</snm><fnm>JA</fnm></au></aug><source>Nat Struct Biol</source><pubdate>2002</pubdate><volume>9</volume><fpage>646</fpage><lpage>652</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1038/nsb0902-646</pubid><pubid idtype="pmpid" link="fulltext">12198485</pubid></pubidlist></xrefbib></bibl><bibl id="B10"><title><p>Scalable molecular dynamics with NAMD</p></title><aug><au><snm>Phillips</snm><fnm>JC</fnm></au><au><snm>Braun</snm><fnm>R</fnm></au><au><snm>Wang</snm><fnm>W</fnm></au><au><snm>Gumbart</snm><fnm>J</fnm></au><au><snm>Tajkhorshid</snm><fnm>E</fnm></au><au><snm>Villa</snm><fnm>E</fnm></au><au><snm>Chipot</snm><fnm>C</fnm></au><au><snm>Skeel</snm><fnm>RD</fnm></au><au><snm>Kale</snm><fnm>LV</fnm></au><au><snm>Schulten</snm><fnm>K</fnm></au></aug><source>J Comput Chem</source><pubdate>2005</pubdate><volume>26</volume><issue>16</issue><fpage>1781</fpage><lpage>1802</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/jcc.20289</pubid><pubid idtype="pmcid">2486339</pubid><pubid idtype="pmpid" link="fulltext">16222654</pubid></pubidlist></xrefbib></bibl><bibl id="B11"><title><p>Scalable algorithms for molecular dynamics simulations on commodity 
clusters</p></title><aug><au><snm>Bowers</snm><fnm>KJ</fnm></au><au><snm>Chow</snm><fnm>E</fnm></au><au><snm>Xu</snm><fnm>H</fnm></au><au><snm>Dror</snm><fnm>RO</fnm></au><au><snm>Eastwood</snm><fnm>MP</fnm></au><au><snm>Gregersen</snm><fnm>BA</fnm></au><au><snm>Klepeis</snm><fnm>JL</fnm></au><au><snm>Kolossv&#225;ry</snm><fnm>I</fnm></au><au><snm>Moraes</snm><fnm>MA</fnm></au><au><snm>Sacerdoti</snm><fnm>FD</fnm></au><au><snm>Salmon</snm><fnm>JK</fnm></au><au><snm>Shan</snm><fnm>Y</fnm></au><au><snm>Shaw</snm><fnm>DE</fnm></au></aug><source>SC '06: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing.</source><publisher>New York, NY, USA: ACM</publisher><pubdate>2006</pubdate><fpage>84</fpage><lpage>96</lpage><url>http://dx.doi.org/10.1145/1188455.1188544</url><xrefbib><pubid idtype="pmpid">22247925</pubid></xrefbib></bibl><bibl id="B12"><title><p>Atomistic protein folding simulations on the submillisecond time scale using worldwide distributed computing</p></title><aug><au><snm>Pande</snm><fnm>VS</fnm></au><au><snm>Baker</snm><fnm>I</fnm></au><au><snm>Chapman</snm><fnm>J</fnm></au><au><snm>Elmer</snm><fnm>SP</fnm></au><au><snm>Khaliq</snm><fnm>S</fnm></au><au><snm>Larson</snm><fnm>SM</fnm></au><au><snm>Rhee</snm><fnm>YM</fnm></au><au><snm>Shirts</snm><fnm>MR</fnm></au><au><snm>Snow</snm><fnm>C</fnm></au><au><snm>Sorin</snm><fnm>EJ</fnm></au><au><snm>Zagrovic</snm><fnm>B</fnm></au></aug><source>Biopolymers</source><pubdate>2003</pubdate><volume>68</volume><fpage>91</fpage><lpage>109</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/bip.10219</pubid><pubid idtype="pmpid" link="fulltext">12579582</pubid></pubidlist></xrefbib></bibl><bibl id="B13"><title><p>Accelerating molecular modeling applications with graphics 
processors</p></title><aug><au><snm>Stone</snm><fnm>JE</fnm></au><au><snm>Phillips</snm><fnm>JC</fnm></au><au><snm>Freddolino</snm><fnm>PL</fnm></au><au><snm>Hardy</snm><fnm>DJ</fnm></au><au><snm>Trabuco</snm><fnm>LG</fnm></au><au><snm>Schulten</snm><fnm>K</fnm></au></aug><source>J Comput Chem</source><pubdate>2007</pubdate><volume>28</volume><fpage>2618</fpage><lpage>2640</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/jcc.20829</pubid><pubid idtype="pmpid" link="fulltext">17894371</pubid></pubidlist></xrefbib></bibl><bibl id="B14"><title><p>Anton, a special-purpose machine for molecular dynamics simulation</p></title><aug><au><snm>Shaw</snm><fnm>DE</fnm></au><au><snm>Deneroff</snm><fnm>MM</fnm></au><au><snm>Dror</snm><fnm>RO</fnm></au><au><snm>Kuskin</snm><fnm>JS</fnm></au><au><snm>Larson</snm><fnm>RH</fnm></au><au><snm>Salmon</snm><fnm>JK</fnm></au><au><snm>Young</snm><fnm>C</fnm></au><au><snm>Batson</snm><fnm>B</fnm></au><au><snm>Bowers</snm><fnm>KJ</fnm></au><au><snm>Chao</snm><fnm>JC</fnm></au><au><snm>Eastwood</snm><fnm>MP</fnm></au><au><snm>Gagliardo</snm><fnm>J</fnm></au><au><snm>Grossman</snm><fnm>JP</fnm></au><au><snm>Ho</snm><fnm>CR</fnm></au><au><snm>Ierardi</snm><fnm>DJ</fnm></au><au><snm>Kolossv&#225;ry</snm><fnm>I</fnm></au><au><snm>Klepeis</snm><fnm>JL</fnm></au><au><snm>Layman</snm><fnm>T</fnm></au><au><snm>McLeavey</snm><fnm>C</fnm></au><au><snm>Moraes</snm><fnm>MA</fnm></au><au><snm>Mueller</snm><fnm>R</fnm></au><au><snm>Priest</snm><fnm>EC</fnm></au><au><snm>Shan</snm><fnm>Y</fnm></au><au><snm>Spengler</snm><fnm>J</fnm></au><au><snm>Theobald</snm><fnm>M</fnm></au><au><snm>Towles</snm><fnm>B</fnm></au><au><snm>Wang</snm><fnm>SC</fnm></au></aug><source>SIGARCH Comput Archit News</source><pubdate>2007</pubdate><volume>35</volume><fpage>1</fpage><lpage>12</lpage></bibl><bibl id="B15"><title><p>Clustering molecular dynamics trajectories: 1. 
Characterizing the performance of different clustering algorithms</p></title><aug><au><snm>Shao</snm><fnm>J</fnm></au><au><snm>Tanner</snm><fnm>S</fnm></au><au><snm>Thompson</snm><fnm>N</fnm></au><au><snm>Cheatham</snm><fnm>T</fnm></au></aug><source>J Chem Theory Comput</source><pubdate>2007</pubdate><volume>3</volume><issue>6</issue><fpage>2312</fpage><lpage>2334</lpage><xrefbib><pubid idtype="doi">10.1021/ct700119m</pubid></xrefbib></bibl><bibl id="B16"><title><p>Efficient evaluation of sampling quality of molecular dynamics simulations by clustering of dihedral torsion angles and Sammon mapping</p></title><aug><au><snm>Frickenhaus</snm><fnm>S</fnm></au><au><snm>Kannan</snm><fnm>S</fnm></au><au><snm>Zacharias</snm><fnm>M</fnm></au></aug><source>J Comput Chem</source><pubdate>2009</pubdate><volume>30</volume><issue>3</issue><fpage>479</fpage><lpage>492</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/jcc.21076</pubid><pubid idtype="pmpid" link="fulltext">18680215</pubid></pubidlist></xrefbib></bibl><bibl id="B17"><title><p>Folding-unfolding thermodynamics of a beta-heptapeptide from equilibrium simulations</p></title><aug><au><snm>Daura</snm><fnm>X</fnm></au><au><snm>van Gunsteren</snm><fnm>WF</fnm></au><au><snm>Mark</snm><fnm>AE</fnm></au></aug><source>Proteins</source><pubdate>1999</pubdate><volume>34</volume><issue>3</issue><fpage>269</fpage><lpage>280</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/(SICI)1097-0134(19990215)34:3&lt;269::AID-PROT1&gt;3.0.CO;2-3</pubid><pubid idtype="pmpid" link="fulltext">10024015</pubid></pubidlist></xrefbib></bibl><bibl id="B18"><title><p>Method for estimating the configurational entropy of macro-molecules</p></title><aug><au><snm>Karplus</snm><fnm>M</fnm></au><au><snm>Kushick</snm><fnm>JN</fnm></au></aug><source>Macromolecules</source><pubdate>1981</pubdate><volume>14</volume><issue>2</issue><fpage>325</fpage><lpage>332</lpage><xrefbib><pubid idtype="doi">10.1021/ma50003a019</pubid></xrefbib></bibl><bibl 
id="B19"><title><p>Quasi-harmonic method for studying very low frequency modes in proteins</p></title><aug><au><snm>Levy</snm><fnm>RM</fnm></au><au><snm>Srinivasan</snm><fnm>AR</fnm></au><au><snm>Olson</snm><fnm>WK</fnm></au><au><snm>McCammon</snm><fnm>JA</fnm></au></aug><source>Biopolymers</source><pubdate>1984</pubdate><volume>23</volume><fpage>1099</fpage><lpage>1112</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/bip.360230610</pubid><pubid idtype="pmpid">6733249</pubid></pubidlist></xrefbib></bibl><bibl id="B20"><title><p>Collective protein dynamics in relation to function</p></title><aug><au><snm>Berendsen</snm><fnm>HJ</fnm></au><au><snm>Hayward</snm><fnm>S</fnm></au></aug><source>Curr Opin Struct Biol</source><pubdate>2000</pubdate><volume>10</volume><issue>2</issue><fpage>165</fpage><lpage>169</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/S0959-440X(00)00061-0</pubid><pubid idtype="pmpid" link="fulltext">10753809</pubid></pubidlist></xrefbib></bibl><bibl id="B21"><title><p>An online approach for mining collective behaviors from molecular dynamics simulations</p></title><aug><au><snm>Ramanathan</snm><fnm>A</fnm></au><au><snm>Agarwal</snm><fnm>PK</fnm></au><au><snm>Kurnikova</snm><fnm>M</fnm></au><au><snm>Langmead</snm><fnm>CJ</fnm></au></aug><source>J Comput Biol</source><pubdate>2010</pubdate><volume>17</volume><issue>3</issue><fpage>309</fpage><lpage>324</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1089/cmb.2009.0167</pubid><pubid idtype="pmpid" link="fulltext">20377447</pubid></pubidlist></xrefbib></bibl><bibl id="B22"><title><p>On-the-fly identification of conformational sub-states from molecular dynamics simulations</p></title><aug><au><snm>Ramanathan</snm><fnm>A</fnm></au><au><snm>Yoo</snm><fnm>J</fnm></au><au><snm>Langmead</snm><fnm>C</fnm></au></aug><source>J Chem Theory Comput</source><pubdate>2011</pubdate><volume>7</volume><issue>3</issue><fpage>778</fpage><lpage>789</lpage><xrefbib><pubid 
idtype="doi">10.1021/ct100531j</pubid></xrefbib></bibl><bibl id="B23"><title><p>Full correlation analysis of conformational protein dynamics</p></title><aug><au><snm>Lange</snm><fnm>OF</fnm></au><au><snm>Grubm&#252;ller</snm><fnm>H</fnm></au></aug><source>Proteins</source><pubdate>2008</pubdate><volume>70</volume><issue>4</issue><fpage>1294</fpage><lpage>1312</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/prot.21618</pubid><pubid idtype="pmpid" link="fulltext">17876828</pubid></pubidlist></xrefbib></bibl><bibl id="B24"><title><p>Learning generative models for protein fold families</p></title><aug><au><snm>Balakrishnan</snm><fnm>S</fnm></au><au><snm>Kamisetty</snm><fnm>H</fnm></au><au><snm>Carbonell</snm><fnm>JG</fnm></au><au><snm>Lee</snm><fnm>SI</fnm></au><au><snm>Langmead</snm><fnm>CJ</fnm></au></aug><source>Proteins</source><pubdate>2011</pubdate><volume>79</volume><issue>4</issue><fpage>1061</fpage><lpage>1078</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/prot.22934</pubid><pubid idtype="pmpid" link="fulltext">21268112</pubid></pubidlist></xrefbib></bibl><bibl id="B25"><title><p>The von Mises graphical model: regularized structure and parameter learning</p></title><aug><au><snm>Razavian</snm><fnm>N</fnm></au><au><snm>Kamisetty</snm><fnm>H</fnm></au><au><snm>Langmead</snm><fnm>C</fnm></au></aug><source>Tech Rep CMU-CS-11-108, Carnegie Mellon University, Department of Computer Science</source><pubdate>2011</pubdate></bibl><bibl id="B26"><title><p>Progress and challenges in the automated construction of Markov state models for full protein systems</p></title><aug><au><snm>Bowman</snm><fnm>GR</fnm></au><au><snm>Beauchamp</snm><fnm>KA</fnm></au><au><snm>Boxer</snm><fnm>G</fnm></au><au><snm>Pande</snm><fnm>VS</fnm></au></aug><source>J Chem Phys</source><pubdate>2009</pubdate><volume>131</volume><issue>12</issue><fpage>124101</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1063/1.3216567</pubid><pubid idtype="pmcid">2766407</pubid><pubid 
idtype="pmpid" link="fulltext">19791846</pubid></pubidlist></xrefbib></bibl><bibl id="B27"><title><p>Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data</p></title><aug><au><snm>Banerjee</snm><fnm>O</fnm></au><au><snm>El Ghaoui</snm><fnm>L</fnm></au><au><snm>d'Aspremont</snm><fnm>A</fnm></au></aug><source>J Mach Learn Res</source><pubdate>2008</pubdate><volume>9</volume><fpage>485</fpage><lpage>516</lpage></bibl><bibl id="B28"><title><p>Determinant maximization with linear matrix inequality constraints</p></title><aug><au><snm>Vandenberghe</snm><fnm>L</fnm></au><au><snm>Boyd</snm><fnm>S</fnm></au><au><snm>Wu</snm><fnm>SP</fnm></au></aug><source>SIAM Journal on Matrix Analysis and Applications</source><pubdate>1998</pubdate><volume>19</volume><fpage>499</fpage><lpage>533</lpage><xrefbib><pubid idtype="doi">10.1137/S0895479896303430</pubid></xrefbib></bibl><bibl id="B29"><title><p>Free energy estimates of all-atom protein structures using generalized belief propagation</p></title><aug><au><snm>Kamisetty</snm><fnm>H</fnm></au><au><snm>Xing</snm><fnm>EP</fnm></au><au><snm>Langmead</snm><fnm>CJ</fnm></au></aug><source>J Comput Biol</source><pubdate>2008</pubdate><volume>15</volume><issue>7</issue><fpage>755</fpage><lpage>766</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1089/cmb.2007.0131</pubid><pubid idtype="pmpid" link="fulltext">18662103</pubid></pubidlist></xrefbib></bibl><bibl id="B30"><title><p>Accounting for conformational entropy in predicting binding free energies of protein-protein interactions</p></title><aug><au><snm>Kamisetty</snm><fnm>H</fnm></au><au><snm>Ramanathan</snm><fnm>A</fnm></au><au><snm>Bailey-Kellogg</snm><fnm>C</fnm></au><au><snm>Langmead</snm><fnm>C</fnm></au></aug><source>Proteins</source><pubdate>2011</pubdate><volume>79</volume><issue>2</issue><fpage>444</fpage><lpage>462</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/prot.22894</pubid><pubid idtype="pmpid" 
link="fulltext">21120864</pubid></pubidlist></xrefbib></bibl><bibl id="B31"><title><p>Scalable molecular dynamics with NAMD</p></title><aug><au><snm>Phillips</snm><fnm>JC</fnm></au><au><snm>Braun</snm><fnm>R</fnm></au><au><snm>Wang</snm><fnm>W</fnm></au><au><snm>Gumbart</snm><fnm>J</fnm></au><au><snm>Tajkhorshid</snm><fnm>E</fnm></au><au><snm>Villa</snm><fnm>E</fnm></au><au><snm>Chipot</snm><fnm>C</fnm></au><au><snm>Skeel</snm><fnm>RD</fnm></au><au><snm>Kal&#233;</snm><fnm>L</fnm></au><au><snm>Schulten</snm><fnm>K</fnm></au></aug><source>J Comput Chem</source><pubdate>2005</pubdate><volume>26</volume><fpage>1781</fpage><lpage>1802</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/jcc.20289</pubid><pubid idtype="pmcid">2486339</pubid><pubid idtype="pmpid" link="fulltext">16222654</pubid></pubidlist></xrefbib></bibl><bibl id="B32"><title><p>Safety, pharmacokinetics, and antiretroviral activity of multiple doses of ibalizumab (formerly TNX-355), an anti-CD4 monoclonal antibody, in human immunodeficiency virus type 1-infected adults</p></title><aug><au><snm>Jacobson</snm><fnm>JM</fnm></au><au><snm>Kuritzkes</snm><fnm>DR</fnm></au><au><snm>Godofsky</snm><fnm>E</fnm></au><au><snm>DeJesus</snm><fnm>E</fnm></au><au><snm>Larson</snm><fnm>JA</fnm></au><au><snm>Weinheimer</snm><fnm>SP</fnm></au><au><snm>Lewis</snm><fnm>ST</fnm></au></aug><source>Antimicrob Agents Chemother</source><pubdate>2009</pubdate><volume>53</volume><issue>2</issue><fpage>450</fpage><lpage>457</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1128/AAC.00942-08</pubid><pubid idtype="pmcid">2630626</pubid><pubid idtype="pmpid" link="fulltext">19015347</pubid></pubidlist></xrefbib></bibl><bibl id="B33"><title><p>Loss of asparagine-linked glycosylation sites in variable region 5 of human immunodeficiency virus type 1 envelope is associated with resistance to CD4 antibody 
ibalizumab</p></title><aug><au><snm>Toma</snm><fnm>J</fnm></au><au><snm>Weinheimer</snm><fnm>SP</fnm></au><au><snm>Stawiski</snm><fnm>E</fnm></au><au><snm>Whitcomb</snm><fnm>JM</fnm></au><au><snm>Lewis</snm><fnm>ST</fnm></au><au><snm>Petropoulos</snm><fnm>CJ</fnm></au><au><snm>Huang</snm><fnm>W</fnm></au></aug><source>J Virol</source><pubdate>2011</pubdate><volume>85</volume><issue>8</issue><fpage>3872</fpage><lpage>3880</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1128/JVI.02237-10</pubid><pubid idtype="pmcid">3126132</pubid><pubid idtype="pmpid" link="fulltext">21289125</pubid></pubidlist></xrefbib></bibl><bibl id="B34"><title><p>Homeodomain proteins</p></title><aug><au><snm>Gehring</snm><fnm>W</fnm></au><au><snm>Affolter</snm><fnm>M</fnm></au><au><snm>Burglin</snm><fnm>T</fnm></au></aug><source>Annu Rev Biochem</source><pubdate>1994</pubdate><volume>63</volume><fpage>487</fpage><lpage>526</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1146/annurev.bi.63.070194.002415</pubid><pubid idtype="pmpid" link="fulltext">7979246</pubid></pubidlist></xrefbib></bibl><bibl id="B35"><title><p>Missense mutations of human homeoboxes: a review</p></title><aug><au><snm>D'Elia</snm><fnm>AV</fnm></au><au><snm>Tell</snm><fnm>G</fnm></au><au><snm>Paron</snm><fnm>I</fnm></au><au><snm>Pellizzari</snm><fnm>L</fnm></au><au><snm>Lonigro</snm><fnm>R</fnm></au><au><snm>Damante</snm><fnm>G</fnm></au></aug><source>Hum Mutat</source><pubdate>2001</pubdate><volume>18</volume><fpage>361</fpage><lpage>374</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1002/humu.1207</pubid><pubid idtype="pmpid" link="fulltext">11668629</pubid></pubidlist></xrefbib></bibl><bibl id="B36"><title><p>Homeodomain-DNA 
recognition</p></title><aug><au><snm>Gehring</snm><fnm>W</fnm></au><au><snm>Qian</snm><fnm>Y</fnm></au><au><snm>Billeter</snm><fnm>M</fnm></au><au><snm>Furukubotokunaga</snm><fnm>K</fnm></au><au><snm>Schier</snm><fnm>A</fnm></au><au><snm>Resendezperez</snm><fnm>D</fnm></au><au><snm>Affolter</snm><fnm>M</fnm></au><au><snm>Otting</snm><fnm>G</fnm></au><au><snm>Wuthrich</snm><fnm>K</fnm></au></aug><source>Cell</source><pubdate>1994</pubdate><volume>78</volume><fpage>211</fpage><lpage>223</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/0092-8674(94)90292-5</pubid><pubid idtype="pmpid" link="fulltext">8044836</pubid></pubidlist></xrefbib></bibl><bibl id="B37"><title><p>The denatured state of engrailed homeodomain under denaturing and native conditions</p></title><aug><au><snm>Mayor</snm><fnm>U</fnm></au><au><snm>Grossmann</snm><fnm>JG</fnm></au><au><snm>Foster</snm><fnm>NW</fnm></au><au><snm>Freund</snm><fnm>SM</fnm></au><au><snm>Fersht</snm><fnm>AR</fnm></au></aug><source>J Mol Biol</source><pubdate>2003</pubdate><volume>333</volume><fpage>977</fpage><lpage>991</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.jmb.2003.08.062</pubid><pubid idtype="pmpid" link="fulltext">14583194</pubid></pubidlist></xrefbib></bibl><bibl id="B38"><title><p>Protein folding and unfolding in microseconds to nanoseconds by experiment and simulation</p></title><aug><au><snm>Mayor</snm><fnm>U</fnm></au><au><snm>Johnson</snm><fnm>CM</fnm></au><au><snm>Daggett</snm><fnm>V</fnm></au><au><snm>Fersht</snm><fnm>AR</fnm></au></aug><source>Proc Natl Acad Sci U S A</source><pubdate>2000</pubdate><volume>97</volume><fpage>13518</fpage><lpage>13522</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1073/pnas.250473497</pubid><pubid idtype="pmcid">17607</pubid><pubid idtype="pmpid" link="fulltext">11087839</pubid></pubidlist></xrefbib></bibl></refgrp>
   </bm>
</art>