<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
	<ui>1471-2288-12-156</ui>
	<ji>1471-2288</ji>
	<fm>
		<dochead>Research article</dochead>
		<bibl>
			<title>
				<p>Combining directed acyclic graphs and the change-in-estimate procedure as a novel approach to adjustment-variable selection in epidemiology</p>
			</title>
			<aug>
				<au id="A1" ca="yes"><snm>Evans</snm><fnm>David</fnm><insr iid="I1"/><insr iid="I2"/><insr iid="I3"/><insr iid="I4"/><email>davidwevans1@gmail.com</email></au>
				<au id="A2"><snm>Chaix</snm><fnm>Basile</fnm><insr iid="I1"/><insr iid="I3"/><email>chaix@u707.jussieu.fr</email></au>
				<au id="A3"><snm>Lobbedez</snm><fnm>Thierry</fnm><insr iid="I4"/><insr iid="I5"/><email>lobbedez-t@wanadoo.fr</email></au>
				<au id="A4"><snm>Verger</snm><fnm>Christian</fnm><insr iid="I4"/><email>c.verger@wanadoo.fr</email></au>
				<au id="A5"><snm>Flahault</snm><fnm>Antoine</fnm><insr iid="I1"/><insr iid="I2"/><email>Antoine.Flahault@ehesp.fr</email></au>
			</aug>
			<insg>
				<ins id="I1"><p>Inserm UMR-S 707, Paris, France</p></ins>
				<ins id="I2"><p>EHESP School of Public Health, Rennes-Sorbonne Paris Cit&#233;, Paris, France</p></ins>
				<ins id="I3"><p>UPMC-Sorbonne Universit&#233;, Paris, France</p></ins>
				<ins id="I4"><p>Registre de Dialyse P&#233;riton&#233;ale de Langue Fran&#231;aise, Pontoise, France</p></ins>
				<ins id="I5"><p>Nephrology Department, CHU Clemenceau, Ca&#235;n, France</p></ins>
			</insg>
			<source>BMC Medical Research Methodology</source>
			<section><title><p>Data analysis, statistics and modelling</p></title></section><issn>1471-2288</issn>
			<pubdate>2012</pubdate>
			<volume>12</volume>
			<issue>1</issue>
			<fpage>156</fpage>
			<url>http://www.biomedcentral.com/1471-2288/12/156</url>
			<xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2288-12-156</pubid><pubid idtype="pmpid">23058038</pubid></pubidlist></xrefbib>
		</bibl>
		<history><rec><date><day>12</day><month>3</month><year>2012</year></date></rec><acc><date><day>1</day><month>10</month><year>2012</year></date></acc><pub><date><day>11</day><month>10</month><year>2012</year></date></pub></history>
		<cpyrt><year>2012</year><collab>Evans et al.; licensee BioMed Central Ltd.</collab><note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note></cpyrt>
		<kwdg>
			<kwd>Directed acyclic graph</kwd>
			<kwd>Adjustment-variable selection</kwd>
			<kwd>Change-in-estimate</kwd>
			<kwd>Peritoneal dialysis</kwd>
		</kwdg>
		<abs>
			<sec>
				<st>
					<p>Abstract</p>
				</st>
				<sec>
					<st>
						<p>Background</p>
					</st>
					<p>Directed acyclic graphs (DAGs) are an effective means of presenting expert-knowledge assumptions when selecting adjustment variables in epidemiology, whereas the change-in-estimate procedure is a common statistics-based approach. As DAGs imply specific empirical relationships which can be explored by the change-in-estimate procedure, it should be possible to combine the two approaches. This paper proposes such an approach which aims to produce well-adjusted estimates for a given research question, based on plausible DAGs consistent with the data at hand, combining prior knowledge and standard regression methods.</p>
				</sec>
				<sec>
					<st>
						<p>Methods</p>
					</st>
					<p>Based on the relationships laid out in a DAG, researchers can predict how a collapsible estimator (e.g. risk ratio or risk difference) for an effect of interest should change when adjusted on different variable sets. Implied and observed patterns can then be compared to detect inconsistencies and so guide adjustment-variable selection.</p>
				</sec>
				<sec>
					<st>
						<p>Results</p>
					</st>
					<p>The proposed approach involves i. drawing up a set of plausible background-knowledge DAGs; ii. starting with one of these DAGs as a working DAG, identifying a minimal variable set, S, sufficient to control for bias on the effect of interest; iii. estimating a collapsible estimator adjusted on S, then adjusted on S plus each variable not in S in turn (&#8220;add-one pattern&#8221;), and then adjusted on S minus each variable in S in turn (&#8220;minus-one pattern&#8221;); iv. checking the observed add-one and minus-one patterns against the patterns implied by the working DAG and the other prior DAGs; v. reviewing the DAGs, if needed; and vi. presenting the initial and all final DAGs with estimates.</p>
				</sec>
				<sec>
					<st>
						<p>Conclusion</p>
					</st>
					<p>This approach to adjustment-variable selection combines background-knowledge and statistics-based approaches using methods already common in epidemiology and communicates assumptions and uncertainties in a standardized graphical format. It is probably best suited to areas where there is considerable background knowledge about plausible variable relationships. Researchers may use this approach as an additional tool for selecting adjustment variables when analyzing epidemiological data.</p>
				</sec>
			</sec>
		</abs>
	</fm>
	<bdy>
		<sec>
			<st>
				<p>Background</p>
			</st>
			<p>Methods for adjustment-variable selection in epidemiology can be broadly grouped into background-knowledge and statistics-based approaches. Directed acyclic graphs (DAGs) have become a core tool in the background-knowledge approach as they allow researchers to present assumed relationships between variables graphically and, based on these assumptions, to identify variables to adjust for confounding and other biases 
				<abbrgrp>
					<abbr bid="B1">1</abbr>
					<abbr bid="B2">2</abbr>
					<abbr bid="B3">3</abbr>
				</abbrgrp>. There is, however, no guarantee that the assumptions in such a prior DAG align with the patterns in the data. Stepwise selection based on p-values or the change-in-estimate are common statistics-based approaches 
				<abbrgrp>
					<abbr bid="B4">4</abbr>
				</abbrgrp>. In contrast to the background-knowledge approach, these allow patterns in the data to decide the final adjustment variables, but the risks of such data-driven approaches have been highlighted 
				<abbrgrp>
					<abbr bid="B5">5</abbr>
				</abbrgrp>.</p>
			<p>To our knowledge, only one methodological article in epidemiology to date has explicitly looked at combining background knowledge in DAGs with a statistical selection procedure for variable selection 
				<abbrgrp>
					<abbr bid="B6">6</abbr>
				</abbrgrp>. However, this article only considered stepwise deletion from an adjustment set defined from a prior DAG without checking whether the data supported the starting adjustment set. DAG-discovery algorithms, such as the PC and other algorithms in the TETRAD suite 
				<abbrgrp>
					<abbr bid="B7">7</abbr>
				</abbrgrp>, combine background knowledge with statistical selection rules to discover DAG structures but they have proven controversial 
				<abbrgrp>
					<abbr bid="B8">8</abbr>
				</abbrgrp> and have not yet crossed over into epidemiological research. In fact, empirical articles 
				<abbrgrp>
					<abbr bid="B9">9</abbr>
					<abbr bid="B10">10</abbr>
					<abbr bid="B11">11</abbr>
					<abbr bid="B12">12</abbr>
					<abbr bid="B13">13</abbr>
					<abbr bid="B14">14</abbr>
					<abbr bid="B15">15</abbr>
				</abbrgrp> reporting DAGs for variable selection usually report using prior DAGs only, sometimes with subsequent stepwise deletion, but apparently without checking the starting assumptions against the data. Since the performance of these approaches depends on the appropriateness of the starting assumptions, a simple method for checking DAGs against the data may be valuable.</p>
			<p>In this article, we propose an approach to adjustment-variable selection which aims to produce well-adjusted estimates for a given research question based on plausible DAGs which are also consistent with the data at hand, and to clearly communicate assumptions and uncertainties underlying the estimates in DAG format. It asks researchers to lay out prior assumptions about variable relationships in one or more prior DAGs, uses the change-in-estimate patterns in the data to refine and revise these DAGs, and presents the prior and final DAGs with corresponding estimates. The approach is based on recent theoretical results regarding confounding equivalence (c-equivalence) 
				<abbrgrp>
					<abbr bid="B16">16</abbr>
				</abbrgrp> and work on the collapsibility of estimates over different DAG structures 
				<abbrgrp>
					<abbr bid="B17">17</abbr>
				</abbrgrp>. To be pragmatic, the approach focuses on an exposure-outcome relationship of interest and uses regression models and the change-in-estimate procedure familiar to epidemiologists.</p>
		</sec>
		<sec>
			<st>
				<p>Methods</p>
			</st>
			<sec>
				<st>
					<p>DAGs and minimally sufficient adjustment variable sets</p>
				</st>
				<p>In this article, we assume that the reader is familiar with the terminology of and rules for reading DAGs. There are now many introductions to DAGs for epidemiologists [
					<abbrgrp>
						<abbr bid="B1">1</abbr>
						<abbr bid="B2">2</abbr>
						<abbr bid="B17">17</abbr>
						<abbr bid="B18">18</abbr>
						<abbr bid="B19">19</abbr>
						<abbr bid="B20">20</abbr>
					</abbrgrp>, appendix in 
					<abbrgrp>
						<abbr bid="B21">21</abbr>
					</abbrgrp>], including applications to specific areas of epidemiology 
					<abbrgrp>
						<abbr bid="B20">20</abbr>
						<abbr bid="B22">22</abbr>
					</abbrgrp>. DAGs are a graphical description of the joint probability distribution of a set of random variables, showing marginal and conditional (in)dependencies between variables 
					<abbrgrp>
						<abbr bid="B3">3</abbr>
						<abbr bid="B7">7</abbr>
						<abbr bid="B23">23</abbr>
						<abbr bid="B24">24</abbr>
					</abbrgrp>. We follow standard practice in epidemiology and give the arrows causal meaning, thereby interpreting a DAG as a causal diagram. We only address total associations in this article but the approach can be extended to direct and indirect effects based on graphical criteria for their identification 
					<abbrgrp>
						<abbr bid="B25">25</abbr>
						<abbr bid="B26">26</abbr>
						<abbr bid="B27">27</abbr>
					</abbrgrp>.</p>
				<p>DAGs allow the identification of the variable set or sets sufficient to adjust for confounding and other biases, based on the variable relationships shown. Greenland et al. 
					<abbrgrp>
						<abbr bid="B1">1</abbr>
					</abbrgrp> give conditions for this: a variable set is sufficient if i. there is no unblocked backdoor path joining the exposure and outcome which does not contain a variable in the set, and ii. there is no unblocked path joining the exposure and outcome induced by adjustment on the set which does not contain a variable in the set. This second condition means that if a collider is in the set and if adjusting on the collider unblocks the path between the exposure and outcome, then another variable on the path must also be in the set to ensure that the path remains blocked. No variable in the set can be a descendant of the exposure or outcome 
					<abbrgrp>
						<abbr bid="B1">1</abbr>
					</abbrgrp>. (See 
					<abbrgrp>
						<abbr bid="B28">28</abbr>
					</abbrgrp> for a more recent formalization.) In practice, these conditions mean that the only unblocked paths joining exposure and outcome after conditioning on the adjustment variables can be mediating paths. A minimally sufficient adjustment set is a sufficient adjustment set which would no longer be sufficient if any variable were removed 
					<abbrgrp>
						<abbr bid="B2">2</abbr>
						<abbr bid="B29">29</abbr>
					</abbrgrp>. Minimally sufficient adjustment sets can be identified by manual 
					<abbrgrp>
						<abbr bid="B1">1</abbr>
						<abbr bid="B18">18</abbr>
					</abbrgrp> or computer 
					<abbrgrp>
						<abbr bid="B30">30</abbr>
						<abbr bid="B31">31</abbr>
					</abbrgrp> algorithms but a visual inspection is frequently sufficient.</p>
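				<p>As an illustration only, the sufficiency conditions above can be checked mechanically. The following minimal Python sketch tests d-separation through the moralized ancestral graph and then lists the minimally sufficient sets by brute force. The edge list and all function names are our own assumptions for illustration: the edges are loosely modelled on the relationships the text attributes to Figure 1 and are not a transcription of that figure, and the sketch is not the algorithm of the cited software.</p>

```python
from itertools import combinations

# Hypothetical edge list (parent, child); an assumption loosely modelled on the
# relationships the text attributes to Figure 1, not a transcription of it.
EDGES = [("C1", "A"), ("C1", "Y"), ("C2", "A"), ("C2", "C3"), ("C3", "Y"),
         ("A", "C4"), ("C4", "Y"), ("A", "C5"), ("Y", "C5"), ("A", "Y")]


def parents(edges, node):
    return {p for p, c in edges if c == node}


def ancestors(edges, nodes):
    """Return the given nodes together with all of their ancestors."""
    found, stack = set(nodes), list(nodes)
    while stack:
        for p in parents(edges, stack.pop()):
            if p not in found:
                found.add(p)
                stack.append(p)
    return found


def descendants(edges, node):
    """Return all strict descendants of a node."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for p, c in edges:
            if p == current and c not in found:
                found.add(c)
                stack.append(c)
    return found


def d_separated(edges, x, y, z):
    """Test whether z d-separates x and y using the moralized ancestral graph."""
    keep = ancestors(edges, {x, y} | set(z))
    sub = [(p, c) for p, c in edges if p in keep and c in keep]
    links = {frozenset(e) for e in sub}
    for node in keep:  # "marry" every pair of parents that share a child
        for p, q in combinations(sorted(parents(sub, node)), 2):
            links.add(frozenset((p, q)))
    # Drop links touching the conditioning set, then look for an x-y connection.
    links = [l for l in links if l.isdisjoint(z)]
    reached, stack = {x}, [x]
    while stack:
        current = stack.pop()
        for l in links:
            if current in l:
                (other,) = l - {current}
                if other not in reached:
                    reached.add(other)
                    stack.append(other)
    return y not in reached


def is_sufficient(edges, exposure, outcome, s):
    """No descendant of exposure or outcome in s, and s blocks every path left
    once the arrows leaving the exposure have been removed."""
    forbidden = descendants(edges, exposure) | descendants(edges, outcome)
    if not forbidden.isdisjoint(s):
        return False
    backdoor = [(p, c) for p, c in edges if p != exposure]
    return d_separated(backdoor, exposure, outcome, s)


def minimal_sufficient_sets(edges, exposure, outcome, candidates):
    """Brute-force search; only practical for a small number of candidates."""
    sufficient = [set(s) for k in range(len(candidates) + 1)
                  for s in combinations(sorted(candidates), k)
                  if is_sufficient(edges, exposure, outcome, s)]
    # Keep a set only if it is not a proper superset of another sufficient set.
    return [s for s in sufficient if not any(s > t for t in sufficient)]


minimal = minimal_sufficient_sets(EDGES, "A", "Y", ["C1", "C2", "C3", "C4", "C5"])
```

				<p>Under the assumed edges, the search returns the two sets {C<sub>1</sub>,C<sub>2</sub>} and {C<sub>1</sub>,C<sub>3</sub>}. For larger graphs, the dedicated algorithms cited above are likely to be a better choice than exhaustive enumeration.</p>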
			</sec>
			<sec>
				<st>
					<p>Drawing up prior DAGs</p>
				</st>
				<p>The first step is preparing a set of DAGs which encode prior, expert knowledge about variable relationships and show the major prior uncertainties. These DAGs should include</p>
				<p indent="1">1. all measured variables considered relevant, including those routinely used for adjustment in the research area (e.g. sex) even if not thought <it>a priori</it> to be associated with other variables on the graph;</p>
				<p indent="1">2. plausible proxy and measurement error relations;</p>
				<p indent="1">3. plausible unmeasured parents with two or more children in the DAG; and</p>
				<p indent="1">4. participation or selection variables conditioned upon during data-collection, including voluntary participation by subjects and restriction of the study to particular groups, such as hospitalized patients.</p>
				<p>In most cases, more than one prior DAG will be needed to show the main uncertainties in variable relationships, including the presence or absence of arrows between variables, arrow direction, and the presence of unmeasured variables.</p>
				<p>It is important to consider the source population of the data in preparing the prior DAG or DAGs. As much prior knowledge will come from research in other contexts, there will be cases when a researcher judges that an association between variables found in other studies does not apply in his or her dataset. For example, socioeconomic status may have an association with access to healthcare in systems with large out-of-pocket payments but not in well-functioning nationalized systems. In this case, the researcher needs to explain why he or she has chosen not to connect two variables which other researchers would connect, based on knowledge about source populations. Possible differences in source populations should also be borne in mind when revising the DAG, as discussed below.</p>
			</sec>
			<sec>
				<st>
					<p>Using minimally sufficient adjustment sets to compare a DAG with data</p>
				</st>
				<p>For any given DAG, a researcher can identify the minimally sufficient adjustment set or sets for the effect of interest. Once done, he or she can identify the changes expected in this estimate when adjusting on different variable sets according to the DAG. To do this, we need to assume compatibility, faithfulness 
					<abbrgrp>
						<abbr bid="B32">32</abbr>
					</abbrgrp>, and correct model specification. We also need to use a collapsible estimator (e.g. risk ratio (RR), risk difference (RD)), as the non-collapsible estimators (e.g. conditional odds ratio) can change upon adjusting on a variable which is strongly related with the outcome but is not, in fact, a confounder 
					<abbrgrp>
						<abbr bid="B33">33</abbr>
						<abbr bid="B34">34</abbr>
						<abbr bid="B35">35</abbr>
					</abbrgrp>. The RR and RD are therefore recommended and can now be readily estimated by regression 
					<abbrgrp>
						<abbr bid="B36">36</abbr>
						<abbr bid="B37">37</abbr>
						<abbr bid="B38">38</abbr>
						<abbr bid="B39">39</abbr>
					</abbrgrp>.</p>
				<p>Given the above, a collapsible effect estimate conditional on a minimally sufficient adjustment set will not change when estimated on this set plus the variables excluded from the set, provided that the excluded variables are not mediators (or ancestors or descendants of mediators) lying on an open path or colliders (or descendants of colliders) which, if conditioned upon, would open the path on which they lie. Conversely, a collapsible effect estimate conditional on a minimally sufficient adjustment set should change when estimated on this set minus any variable in the set. This allows a researcher to identify the change-in-estimate pattern implied by the DAG and so compare it with the observed pattern from the data.</p>
				<p>Practically, we propose the following steps for this. Sample R code is in Additional file 
					<supplr sid="S1">1</supplr> (web appendix):</p>
				<p indent="1">1. Draw up the DAGs encoding prior, expert knowledge and the main prior uncertainties as described above and select an initial working DAG from this set (the most plausible DAG);</p>
				<p indent="1">2. From the working DAG, identify a minimally sufficient adjustment set, S, for the effect of interest (A&#8594;Y);</p>
				<p indent="1">3. Using a collapsible estimator, estimate A&#8594;Y conditional on S;</p>
				<p indent="1">4. Re-estimate A&#8594;Y conditional on S plus each of the variables not included in S in turn (&#8220;add-one pattern&#8221;);</p>
				<p indent="1">5. Plot each estimate on a single graph, thereby showing differences in the estimates between the models;</p>
				<p indent="1">6. Repeat steps 4 and 5 but deleting each variable in turn from S (&#8220;minus-one pattern&#8221;);</p>
				<p indent="1">7. Determine whether the add-one and minus-one patterns found are consistent with the working DAG;</p>
				<p indent="1">8. If the patterns are consistent with the working DAG, check to see if any of the other prior DAGs give the same expected patterns. Take all prior DAGs with consistent patterns as the revised working DAGs and move to step 11;</p>
				<p indent="1">9. If the patterns are not consistent with the working DAG, check to see if any of the other prior DAGs imply the patterns as observed. Take all such consistent prior DAGs as the revised working DAGs and move to step 11;</p>
				<p indent="1">10. If the patterns are not consistent with the working DAG or with any of the other prior DAGs, undertake an <it>ad hoc</it> revision (see web appendix) to create a new working DAG;</p>
				<p indent="1">11. Repeat steps 2 to 10 for each revised working DAG, moving to step 12 when there are no inconsistent add-one and minus-one patterns;</p>
				<p indent="1">12. Present the prior and all final DAGs with corresponding effect estimates.</p>
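				<p>Steps 3, 4 and 6 above can be sketched in a few lines of Python. This is a minimal illustration rather than the sample R code of the web appendix: the function names are ours, and <it>estimate</it> is a hypothetical user-supplied function that returns a collapsible effect estimate (for example an adjusted RD) for a given adjustment set.</p>

```python
def add_one_sets(s, all_covariates):
    """Adjustment sets formed by adding each covariate outside s, one at a time."""
    s = set(s)
    return {v: s | {v} for v in sorted(set(all_covariates) - s)}


def minus_one_sets(s):
    """Adjustment sets formed by removing each member of s, one at a time."""
    s = set(s)
    return {v: s - {v} for v in sorted(s)}


def change_patterns(estimate, s, all_covariates):
    """Return the estimate adjusted on s and the change from it for every
    add-one and minus-one adjustment set.

    `estimate` is a hypothetical callable: adjustment set -> effect estimate.
    """
    baseline = estimate(set(s))
    add_one = {v: estimate(adj) - baseline
               for v, adj in add_one_sets(s, all_covariates).items()}
    minus_one = {v: estimate(adj) - baseline
                 for v, adj in minus_one_sets(s).items()}
    return baseline, add_one, minus_one
```

				<p>The two dictionaries of changes can then be plotted against the estimate adjusted on S (step 5) and compared with the patterns implied by the working DAG (step 7).</p>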
				<suppl id="S1">
					<title>
						<p>Additional file 1</p>
					</title>
					<text>
						<p>
							<b>(Reviewing a DAG when implied and observed patterns are incompatible; Additional information on the empirical example; Sample R code for the add-one and minus-one graphs).</b></p>
					</text>
					<file name="1471-2288-12-156-S1.docx">
   <p>Click here for file</p>
</file>
				</suppl>
				<p>The key to step 7 is recognizing when the observed patterns are consistent with the patterns implied by the DAG. If S is minimally sufficient, the add-one pattern is consistent if the only meaningful changes arise when conditioning on mediators lying on open paths from A to Y or when conditioning on colliders which open a path from A to Y. All variables in S should show meaningful minus-one changes, but this may not always be the case in practice because of incidental cancellations (see Discussion). Once familiar with the rules of DAGs, it is straightforward for a researcher to identify the expected changes for any adjustment set for a given DAG: for example, if adjusting on {C<sub>1</sub>,C<sub>3</sub>} in Figure 
					<figr fid="F1">1</figr>, the implied add-one pattern is no change for C<sub>2</sub> and a change for C<sub>4</sub> and C<sub>5</sub>. The implied minus-one pattern is a change for C<sub>1</sub> and C<sub>3</sub>.</p>
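				<p>The comparison in step 7 amounts to checking, variable by variable, whether a meaningful change was observed where the DAG implies one. A minimal Python sketch follows. The implied patterns are those stated above for adjustment on {C<sub>1</sub>,C<sub>3</sub>}; the observed changes and the threshold are invented numbers used purely for illustration, and the function names are ours.</p>

```python
def observed_pattern(changes, threshold):
    """Flag each variable whose absolute change in the estimate exceeds the threshold."""
    return {v: abs(delta) > threshold for v, delta in changes.items()}


def inconsistencies(implied, observed):
    """Variables whose observed change status differs from the status implied by the DAG."""
    return sorted(v for v in implied if implied[v] != observed[v])


# Implied patterns stated in the text for adjustment on {C1, C3}.
implied_add_one = {"C2": False, "C4": True, "C5": True}
implied_minus_one = {"C1": True, "C3": True}

# Hypothetical observed changes in a risk difference (illustrative numbers only).
observed_add_one = observed_pattern(
    {"C2": 0.001, "C4": -0.030, "C5": 0.025}, threshold=0.01)
observed_minus_one = observed_pattern(
    {"C1": 0.040, "C3": 0.002}, threshold=0.01)

flag_add = inconsistencies(implied_add_one, observed_add_one)        # []
flag_minus = inconsistencies(implied_minus_one, observed_minus_one)  # ['C3']
```

				<p>In this invented example the add-one pattern agrees with the DAG, whereas the absence of a minus-one change for C<sub>3</sub> would be flagged for review.</p>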
				<fig id="F1"><title><p>Figure 1</p></title><caption><p>Directed acyclic graph showing putative relationships between variables A, Y, C1, C2, C3, C4, and C5</p></caption><text>
   <p>
      <b>Directed acyclic graph showing putative relationships between variables A, Y, C1, C2, C3, C4, and C5.</b>
   </p>
</text><graphic file="1471-2288-12-156-1"/></fig>
				<p>Importantly, DAGs will commonly have more than one minimally sufficient adjustment set. In this case, the researcher should also compare the effects estimated on each minimally sufficient set in steps 8 and 9 above. These adjusted effect estimates should not differ, meaning that any observed differences can help distinguish between the different working DAGs in these steps.</p>
			</sec>
			<sec>
				<st>
					<p>Defining a meaningful change</p>
				</st>
				<p>A key decision is defining the change in the estimate sufficient to warrant reviewing the DAG. The first issue here is the size of the change. For this, a researcher could choose to follow (and defend) the commonly used threshold of a 10% relative difference in the starting estimate 
					<abbrgrp>
						<abbr bid="B4">4</abbr>
						<abbr bid="B40">40</abbr>
					</abbrgrp>. Although standard practice in epidemiology, the relative nature of this rule means that the chance of declaring a change meaningful will differ with the magnitude of the starting estimate (see empirical example below). An alternative to consider is therefore using absolute change, which, given arguments that the absolute RD is particularly relevant to decision-making 
					<abbrgrp>
						<abbr bid="B37">37</abbr>
					</abbrgrp>, also has the benefit of allowing a researcher to determine the threshold based on judgements of clinical or public-health relevance 
					<abbrgrp>
						<abbr bid="B36">36</abbr>
					</abbrgrp>. For example, the threshold could be the difference in mortality or in non-persistence to a prescribed treatment which would warrant a clinical or public-health reaction. If no consensus threshold is available for certain questions, the researcher will need to propose (and defend) a reasonable value. Although arbitrary, this approach has the benefit of transparently communicating the decision rule and its rationale to other researchers, who can adopt or challenge it. The choice of estimator and of the meaningful threshold therefore clearly depend on the research question but should be defined and justified before analysis.</p>
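				<p>The dependence of the relative rule on the starting estimate can be seen in a small numerical sketch. The Python functions and the RD values below are our own illustrative assumptions, not values from the empirical example.</p>

```python
def relative_change(start, new):
    """Absolute change expressed as a fraction of the starting estimate."""
    return abs(new - start) / abs(start)


def is_meaningful(start, new, threshold, scale="relative"):
    """Apply either a relative or an absolute change-in-estimate threshold."""
    if scale == "relative":
        change = relative_change(start, new)
    else:
        change = abs(new - start)
    return change > threshold


# The same absolute shift of about 0.01 in a risk difference, two starting values.
small_start = is_meaningful(0.02, 0.03, threshold=0.10)  # True: a 50% relative change
large_start = is_meaningful(0.20, 0.21, threshold=0.10)  # False: a 5% relative change

# With an absolute threshold of 0.02, both shifts are treated alike.
small_abs = is_meaningful(0.02, 0.03, threshold=0.02, scale="absolute")  # False
large_abs = is_meaningful(0.20, 0.21, threshold=0.02, scale="absolute")  # False
```

				<p>An absolute threshold treats the two shifts identically, which is why it may be easier to tie to a judgement of clinical or public-health relevance.</p>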
				<p>The second issue here is variability in the change in estimate because of sampling error or other problems such as unstable models. In this case, a researcher may inappropriately revise (or not revise) a prior DAG because the observed patterns have failed to align with the patterns in the source population by chance. We note, however, that this is also the case for the change-in-estimate procedure as currently practised, as it uses only the point-estimate change to guide covariable selection.</p>
				<p>To incorporate variability into the proposed approach, we suggest estimating the expected proportion of times the add-one and minus-one patterns would lead to a revision of the DAG under resampling and using this information in a sensitivity analysis. This can be done by bootstrap, calculating the proportion of resampled estimates lying beyond the meaningful change threshold for each variable during the add-one and minus-one steps. The researcher should report these proportions for the prior working and final DAGs. We also suggest undertaking a sensitivity analysis by revising the prior working DAG considering only variables with &gt;50% of resampled add-one changes outside the meaningful threshold as showing meaningful changes. Although this will mean presenting several final DAGs, it has the merit of communicating uncertainty in the assumptions used for the final models. In contrast, for the minus-one step we suggest only reporting the proportion of resampled estimates without undertaking the sensitivity analysis for the reasons outlined in the Discussion.</p>
				<p>There are two important caveats here. First, the proposed 50% cut-off for the add-one changes is arbitrary and further studies should explore the performance of different cut-off values. Second, inflated variance estimates because of unstable regression models (e.g. small sample size, collinearity) would also lead to a high estimated variability of the changes, highlighting the importance of routine model checking in the approach.</p>
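				<p>The bootstrap proportion described above could be computed along the following lines. This Python sketch rests on several assumptions of ours: the data are simulated, all variables are binary, the RD is standardized by simple stratification rather than estimated by regression, and every function name is hypothetical. It is intended only to make the resampling logic concrete.</p>

```python
import random


def risk(rows):
    return sum(r["Y"] for r in rows) / len(rows)


def standardized_rd(rows, adjust):
    """Risk difference standardized over strata of the adjustment variables."""
    strata = {}
    for r in rows:
        strata.setdefault(tuple(r[v] for v in adjust), []).append(r)
    total, weight = 0.0, 0
    for members in strata.values():
        exposed = [r for r in members if r["A"] == 1]
        unexposed = [r for r in members if r["A"] == 0]
        if exposed and unexposed:  # skip strata lacking one exposure group
            total += len(members) * (risk(exposed) - risk(unexposed))
            weight += len(members)
    return total / weight if weight else float("nan")


def bootstrap_proportion(data, change_fn, threshold, n_boot=100, seed=0):
    """Proportion of resamples in which the absolute change exceeds the threshold."""
    rng = random.Random(seed)
    n = len(data)
    exceed = 0
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        if abs(change_fn(resample)) > threshold:
            exceed += 1
    return exceed / n_boot


def simulate(n, seed=1):
    """Simulated data: C1 confounds the A-Y association, C2 is unrelated noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        c1 = int(rng.random() > 0.5)
        c2 = int(rng.random() > 0.5)
        a = int(rng.random() > 0.7 - 0.4 * c1)
        y = int(rng.random() > 0.8 - 0.2 * a - 0.3 * c1)
        rows.append({"A": a, "Y": y, "C1": c1, "C2": c2})
    return rows


data = simulate(400)


def add_c2(rows):
    # Add-one change for C2, starting from adjustment on {C1}.
    return standardized_rd(rows, ["C1", "C2"]) - standardized_rd(rows, ["C1"])


def drop_c1(rows):
    # Minus-one change for C1, starting from adjustment on {C1}.
    return standardized_rd(rows, []) - standardized_rd(rows, ["C1"])


p_add_c2 = bootstrap_proportion(data, add_c2, threshold=0.02)
p_drop_c1 = bootstrap_proportion(data, drop_c1, threshold=0.02)
```

				<p>Under the sensitivity analysis suggested above, an add-one proportion above 0.5 would lead to the variable being treated as showing a meaningful change, whereas the minus-one proportion would only be reported.</p>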
			</sec>
			<sec>
				<st>
					<p>Reviewing the DAG</p>
				</st>
				<p>An important issue in reviewing the working DAG (steps 7 to 10 above) is that, as numerous DAGs can be constructed around the same variables, there is a risk of revision <it>a posteriori</it> to fit the observed empirical pattern. To mitigate this, we suggest first addressing the prior uncertainties as represented by the set of alternative, prior DAGs. If these DAGs do not include a graph consistent with the observed patterns, the researcher will need to consider other possible misspecification of confounding, mediating, and collision pathways, measurement error, and bias amplification as outlined in the Results. A structured approach to working through these possibilities is in Additional file 
					<supplr sid="S1">1</supplr> (web appendix). However, given the risk of <it>post hoc</it> fitting the DAG to the data at this stage, the researcher should state that none of the prior DAGs was consistent with the observed patterns. Note that model misspecification, another reason to consider, is not addressed in this article for reasons of space. As noted, usual methods for model checking clearly apply.</p>
			</sec>
		</sec>
		<sec>
			<st>
				<p>Results</p>
			</st>
			<p>We now run through a theoretical example to illustrate the approach before presenting an empirical example from clinical epidemiology.</p>
			<sec>
				<st>
					<p>Confounding, mediation, collision</p>
				</st>
				<p>Take the (as yet unknown) best-working DAG in Figure 
					<figr fid="F1">1</figr>, the prior DAG in Figure 
					<figr fid="F2">2</figr> as the preferred initial working DAG, and the DAGs in Figures 
					<figr fid="F1">1</figr>, 
					<figr fid="F3">3</figr>, and 
					<figr fid="F4">4</figr> as prior alternative DAGs. These figures are also available in Additional file 
					<supplr sid="S2">2</supplr> in slide format to follow the changes by flicking back and forth between figures. From Figure 
					<figr fid="F2">2</figr>, a researcher identifies a putative minimally sufficient adjustment set of {C<sub>1</sub>}. The implied add-one pattern for Figure 
					<figr fid="F2">2</figr> when adjusting on {C<sub>1</sub>} is a change for C<sub>4</sub> and C<sub>5</sub> and no change for C<sub>2</sub> or C<sub>3</sub>; the implied minus-one pattern is a change for C<sub>1</sub>. He or she estimates the A&#8594;Y effect adjusted on {C<sub>1</sub>} and the add-one and minus-one patterns. Graphing this (step 5 above) gives a pattern as in Figure 
					<figr fid="F5">5</figr>, where the dashed horizontal lines represent the pre-defined thresholds for a meaningful change. The changes on adding C<sub>4</sub> and C<sub>5</sub> and on removing C<sub>1</sub> are consistent with Figure 
					<figr fid="F2">2</figr>. In contrast, the changes for adding C<sub>2</sub> and C<sub>3</sub> are not consistent with Figure 
					<figr fid="F2">2</figr>, flagging the need to reconsider them.</p>
				<suppl id="S2">
					<title>
						<p>Additional file 2</p>
					</title>
					<text>
						<p>
							<b>(Figures containing DAGs as Powerpoint slides).</b>
						</p>
					</text>
					<file name="1471-2288-12-156-S2.ppt">
   <p>Click here for file</p>
</file>
				</suppl>
				<fig id="F2"><title><p>Figure 2</p></title><caption><p>Directed acyclic graph showing alternative putative relationships between variables A, Y, C1, C2, C3, C4, and C5</p></caption><text>
   <p>
      <b>Directed acyclic graph showing alternative putative relationships between variables A, Y, C1, C2, C3, C4, and C5.</b>
   </p>
</text><graphic file="1471-2288-12-156-2"/></fig>
				<fig id="F3"><title><p>Figure 3</p></title><caption><p>Directed acyclic graph showing one set of alternative putative relationships between variables A, Y, C1, C2, C3, C4, and C5</p></caption><text>
   <p>
      <b>Directed acyclic graph showing one set of alternative putative relationships between variables A, Y, C1, C2, C3, C4, and C5.</b>
   </p>
</text><graphic file="1471-2288-12-156-3"/></fig>
				<fig id="F4"><title><p>Figure 4</p></title><caption><p>Directed acyclic graph showing another set of alternative putative relationships between variables A, Y, C1, C2, C3, C4, and C5</p></caption><text>
   <p>
      <b>Directed acyclic graph showing another set of alternative putative relationships between variables A, Y, C1, C2, C3, C4, and C5.</b>
   </p>
</text><graphic file="1471-2288-12-156-4"/></fig>
				<p>During preparation of the prior DAGs, our researcher flagged the possible confounding pathways in Figures 
					<figr fid="F1">1</figr> or 
					<figr fid="F3">3</figr> and C<sub>2</sub> as a collider in Figure 
					<figr fid="F4">4</figr>. Both Figures 
					<figr fid="F1">1</figr> and 
					<figr fid="F3">3</figr> have the same implied add-one and minus-one patterns when adjusting on C<sub>1</sub> only, namely add-one changes for C<sub>2</sub>, C<sub>3</sub>, C<sub>4</sub>, and C<sub>5</sub> and minus-one changes for C<sub>1</sub>. These are consistent with the patterns observed in Figure 
					<figr fid="F5">5</figr>. The implied patterns for Figure 
					<figr fid="F4">4</figr> when adjusting on C<sub>1</sub> only are add-one changes for C<sub>2</sub>, C<sub>4</sub>, and C<sub>5</sub>; no add-one change for C<sub>3</sub>; and a minus-one change for C<sub>1</sub>. These do not correspond to those observed in Figure 
					<figr fid="F5">5</figr> (under Figure <figr fid="F4">4</figr>, adding C<sub>3</sub> should produce no meaningful change, yet a change is observed). Consequently, the researcher can discount the DAG in Figure 
					<figr fid="F4">4</figr> and focus on Figures 
					<figr fid="F1">1</figr> and 
					<figr fid="F3">3</figr>.</p>
				<fig id="F5"><title><p>Figure 5</p></title><caption><p>Add-one and minus-one patterns for a starting adjustment-variable set of {C1} based on DAG in Figure <figr fid="F2">2</figr>, taking the associations in the DAG in Figure <figr fid="F1">1</figr> as the unknown best working DAG</p></caption><text>
   <p><b>Add</b>-<b>one and minus</b>-<b>one patterns for a starting adjustment</b>-<b>variable set of</b><b>{</b><b>C</b><sub><b>1</b></sub><b>}</b><b>based on DAG in Figure</b><figr fid="F2">2</figr>, <b>taking the associations in the DAG in Figure</b><figr fid="F1">1</figr><b>as the unknown best working DAG.</b> The solid horizontal line is the RD estimate adjusted on the putative minimally sufficient set {C<sub>1</sub>}. The dashed horizontal lines are the pre-defined meaningful change thresholds in the RD estimate. The add-one section shows the RD upon adding each variable listed to the adjustment-variable set in turn. The minus-one section shows the RD upon removing each variable listed from the adjustment-variable set in turn.</p>
</text><graphic file="1471-2288-12-156-5"/></fig>
				<p>The researcher should reapply the above steps to each of Figures 
					<figr fid="F1">1</figr> and 
					<figr fid="F3">3</figr>. In Figure 
					<figr fid="F3">3</figr>, the minimally sufficient adjustment set is {C<sub>1</sub>,C<sub>2</sub>,C<sub>3</sub>}. The implied patterns when adjusting on this set are an add-one change for C<sub>4</sub> and C<sub>5</sub> and a minus-one change for C<sub>1</sub>, C<sub>2</sub>, and C<sub>3</sub>. As Figure 
					<figr fid="F1">1</figr> is the still unknown best working DAG, the observed pattern will have no minus-one change for C<sub>2</sub> and C<sub>3</sub>. In contrast, re-running the steps on Figure 
					<figr fid="F1">1</figr> will obviously give consistent add-one and minus-one patterns. This favours Figure 
					<figr fid="F1">1</figr>. The researcher can go further, noting that both {C<sub>1</sub>,C<sub>2</sub>} and {C<sub>1</sub>,C<sub>3</sub>} are minimally sufficient adjustment sets in Figure 
					<figr fid="F1">1</figr>. The effect estimate adjusted on each of these sets does not change, consistent with Figure 
					<figr fid="F1">1</figr> as the final working DAG based on these prior starting DAGs.</p>
				<p>Alternatively, the researcher may have pre-identified uncertain mediation paths involving C<sub>2</sub> and C<sub>3</sub>, for example a single mediating path (A&#8594;C<sub>2</sub>&#8594;C<sub>3</sub>&#8594;Y) or two separate mediating paths (A&#8594;C<sub>2</sub>&#8594;Y and A&#8594;C<sub>3</sub>&#8594;Y) (not shown but easily constructed by replacing A&#8592;C<sub>2</sub> with A&#8594;C<sub>2</sub> in Figures 
					<figr fid="F1">1</figr> and 
					<figr fid="F3">3</figr> and A&#8592;C<sub>3</sub> with A&#8594;C<sub>3</sub> in Figure 
					<figr fid="F3">3</figr>). The same approach as for the confounding scenarios will help distinguish between these, although, as discussed below, background knowledge is required to decide on the confounding vs. mediating direction of the arrows.</p>
			</sec>
			<sec>
				<st>
					<p>Measurement error</p>
				</st>
				<p>Measurement error can also cause an estimate to change when adding or deleting variables to or from the adjustment set, even though this would not be the case had the variables been measured perfectly. To see why, consider Figure 
					<figr fid="F6">6</figr>, which is Figure 
					<figr fid="F1">1</figr> with measurement error of C<sub>2</sub> and C<sub>3</sub>. Following 
					<abbrgrp>
						<abbr bid="B41">41</abbr>
					</abbrgrp>, we define C* as the measured variable, and U<sub>C</sub> as representing all factors affecting measurement of C. Adjusting on C<sub>2</sub>* only partially blocks A&#8592;C<sub>2</sub>&#8594;C<sub>3</sub>&#8594;Y at C<sub>2</sub>; similarly, adjusting on C<sub>3</sub>* only partially blocks this pathway at C<sub>3</sub>; consequently the estimate adjusted on {C<sub>1</sub>,C<sub>2</sub>*} will not equal that adjusted on {C<sub>1</sub>,C<sub>2</sub>*,C<sub>3</sub>*} even though they would have been the same if we could have adjusted on {C<sub>1</sub>,C<sub>2</sub>} and {C<sub>1</sub>,C<sub>2</sub>,C<sub>3</sub>}.</p>
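				<p>The consequence of mismeasurement described above can be illustrated with a minimal simulation sketch. It is not part of the analyses reported here and rests on assumptions introduced purely for illustration: continuous variables with unit-variance Gaussian noise, no effect of A on Y, C<sub>1</sub> omitted for brevity, a linear-regression coefficient standing in for the RD, and hypothetical helper names (<it>ols_coefs</it>, <it>simulate</it>, <it>effect_of_a</it>, <it>add_one_change</it>).</p>

```python
import random


def ols_coefs(rows, y):
    """Least-squares coefficients from the normal equations (Gauss-Jordan)."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting, then eliminate this column from all other rows.
        pivot = max(range(col, k), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(k):
            if i != col:
                f = m[i][col] / m[col][col]
                m[i] = [u - f * v for u, v in zip(m[i], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]


def simulate(n=10000, seed=0):
    """Hypothetical data following C2 -> A and C2 -> C3 -> Y, with A having no effect on Y."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        c2 = rng.gauss(0, 1)
        c3 = c2 + rng.gauss(0, 1)   # C2 -> C3
        a = c2 + rng.gauss(0, 1)    # C2 -> A
        y = c3 + rng.gauss(0, 1)    # C3 -> Y
        data.append({"a": a, "y": y, "c2": c2, "c3": c3,
                     "c2_star": c2 + rng.gauss(0, 1),   # C2 measured with error
                     "c3_star": c3 + rng.gauss(0, 1)})  # C3 measured with error
    return data


def effect_of_a(data, adjust):
    """Coefficient of A in a linear model of Y adjusted on the listed variables."""
    rows = [[1.0, d["a"]] + [d[v] for v in adjust] for d in data]
    return ols_coefs(rows, [d["y"] for d in data])[1]


def add_one_change(data, base, extra):
    """Change in the estimate when `extra` is added to the adjustment set `base`."""
    return effect_of_a(data, base + [extra]) - effect_of_a(data, base)
```

				<p>Under these assumptions, add_one_change(data, ["c2"], "c3") should be close to zero, whereas add_one_change(data, ["c2_star"], "c3_star") should be clearly non-zero, because neither mismeasured variable fully blocks the confounding pathway.</p>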
				<fig id="F6"><title><p>Figure 6</p></title><caption><p>Directed acyclic graph showing alternative putative relationships between variables A, Y, C1, C2, C3, C4, and C5 in which C2 and C3 are measured with error (measured variables are C2* and C3* and variables affecting their measurement are UC2 and UC3)</p></caption><text>
   <p>
      <b>Directed acyclic graph showing alternative putative relationships between variables A, Y, C1, C2, C3, C4, and C5 in which C2 and C3 are measured with error (measured variables are C2* and C3* and variables affecting their measurement are UC2 and UC3).</b>
   </p>
</text><graphic file="1471-2288-12-156-6"/></fig>
				<p>To see how measurement error fits into the proposed approach, consider the case of Figure 
					<figr fid="F6">6</figr> as the (unknown) best working DAG, Figure 
					<figr fid="F1">1</figr> as a researcher&#8217;s initial working prior DAG, and measurement error of C<sub>2</sub> and C<sub>3</sub> in Figure 
					<figr fid="F6">6</figr> as an alternative prior DAG. Running through the above steps on Figure 
					<figr fid="F1">1</figr> using a minimally sufficient adjustment set of {C<sub>1</sub>,C<sub>2</sub>} will give add-one and minus-one patterns as in Figure 
					<figr fid="F7">7</figr>. These are inconsistent for C<sub>3</sub> in Figure 
					<figr fid="F1">1</figr>, since adding C<sub>3</sub> to the {C<sub>1</sub>,C<sub>2</sub>} adjustment set should not change the estimate. In contrast, this pattern is consistent with the measurement error in Figure 
					<figr fid="F6">6</figr>. Although, intuitively, the &#8220;best&#8221; adjustment set is expected to be {C<sub>1</sub>,C<sub>2</sub>*,C<sub>3</sub>*}, adjusting on a mismeasured confounder may increase bias under certain conditions 
					<abbrgrp>
						<abbr bid="B42">42</abbr>
						<abbr bid="B43">43</abbr>
					</abbrgrp> such as the presence of a qualitative interaction between exposure and confounder if the confounder is binary 
					<abbrgrp>
						<abbr bid="B43">43</abbr>
					</abbrgrp>. Even in conditions for which adjustment on {C<sub>1</sub>,C<sub>2</sub>*,C<sub>3</sub>*} will be bias reducing, arguably common in epidemiological research 
					<abbrgrp>
						<abbr bid="B43">43</abbr>
						<abbr bid="B44">44</abbr>
						<abbr bid="B45">45</abbr>
					</abbrgrp>, this will not be a sufficient adjustment set as it only partially blocks the A&#8592;C<sub>2</sub>&#8594;C<sub>3</sub>&#8594;Y pathway. Regardless of the direction of the bias, the proposed change-in-estimate approach should flag the need to review the associations involving the mismeasured variables in the DAG.</p>
				<fig id="F7"><title><p>Figure 7</p></title><caption><p>Add-one and minus-one patterns for a starting adjustment-variable set of {C<sub>1</sub>, C<sub>2</sub>} based on DAG in Figure 1, taking the associations in the DAG in Figure 6 as the unknown best working DAG</p></caption><text>
   <p><b>Add</b>-<b>one and minus</b>-<b>one patterns for a starting adjustment</b>-<b>variable set of</b> {<b>C</b><sub><b>1</b></sub>, <b>C</b><sub><b>2</b></sub>} <b>based on DAG in Figure</b> <figr fid="F1">1</figr>, <b>taking the associations in the DAG in Figure</b> <figr fid="F6">6</figr> <b>as the unknown best working DAG.</b> Note that the variables listed as C<sub>2</sub> and C<sub>3</sub> are actually these variables measured with error, i.e. C<sub>2</sub>* and C<sub>3</sub>* in Figure 
							<figr fid="F6">6</figr>. The solid horizontal line is the RD estimate adjusted on the putative minimally sufficient set {C<sub>1</sub>,C<sub>2</sub>}. The dashed horizontal lines are the pre-defined meaningful change thresholds in the RD estimate. The add-one section shows the RD upon adding each variable listed to the adjustment-variable set in turn. The minus-one section shows the RD upon removing each variable listed from the adjustment-variable set in turn.</p>
</text><graphic file="1471-2288-12-156-7"/></fig>
			</sec>
			<sec>
				<st>
					<p>Bias amplification</p>
				</st>
				<p>Recent work has shown that residual bias can be amplified by adjustment on instrument-like variables 
					<abbrgrp>
						<abbr bid="B46">46</abbr>
						<abbr bid="B47">47</abbr>
					</abbrgrp>, a finding which, although its quantitative relevance is still under debate 
					<abbrgrp>
						<abbr bid="B48">48</abbr>
						<abbr bid="B49">49</abbr>
					</abbrgrp>, has potentially major implications for adjustment-variable selection in epidemiology. Such bias amplification can also lead to a change in the effect estimate when adjusting on different variable sets, so researchers should consider it when reviewing a DAG based on the add-one and minus-one patterns. Note that &#8220;instrument-like&#8221; refers to variables which are strong predictors of the exposure but can also be associated with the outcome (see 
					<abbrgrp>
						<abbr bid="B46">46</abbr>
					</abbrgrp> for detailed discussion and estimate of the ratio of two associations). Confounders can therefore be instrument-like, depending on the relative strength of their relationships with the exposure and the outcome. This is not to be confused with standard instrumental variables which, by definition, are associated only with the exposure and which have bias-reducing properties in appropriate analyses (see 
					<abbrgrp>
						<abbr bid="B50">50</abbr>
					</abbrgrp> for this) and bias-amplifying effects in other analyses 
					<abbrgrp>
						<abbr bid="B46">46</abbr>
					</abbrgrp>.</p>
				<p>Consider Figure 
					<figr fid="F1">1</figr> as a prior DAG, Figure 
					<figr fid="F8">8</figr> as the unknown best working DAG, and major residual confounding, shown by the pathway A&#8592;Z<sub>U</sub>&#8594;Y in Figure 
					<figr fid="F8">8</figr>, as a prior uncertainty. In the absence of residual confounding (Figure 
					<figr fid="F1">1</figr>), a collapsible estimate adjusted on {C<sub>1</sub>,C<sub>2</sub>}, {C<sub>1</sub>,C<sub>3</sub>}, and {C<sub>1</sub>,C<sub>2</sub>,C<sub>3</sub>} should not differ. However, with residual confounding (Figure 
					<figr fid="F8">8</figr>), these estimates will differ because C<sub>2</sub> and C<sub>3</sub> have different &#8220;instrument strengths&#8221; (i.e. relative to C<sub>3</sub>, C<sub>2</sub> is more strongly associated with the exposure A) and so amplify the residual bias differently 
					<abbrgrp>
						<abbr bid="B16">16</abbr>
					</abbrgrp>. Consequently, a researcher starting with a minimally sufficient adjustment set of {C<sub>1</sub>,C<sub>2</sub>} (based on Figure 
					<figr fid="F1">1</figr>) will find add-one and minus-one patterns similar to those shown in Figure 
					<figr fid="F7">7</figr>. These patterns are inconsistent with Figure 
					<figr fid="F1">1</figr> but are consistent with the alternative DAG in Figure 
					<figr fid="F8">8</figr>. The question again becomes which adjustment set to choose to minimize bias. Until further theoretical and simulation work is available on bias amplification, a conservative strategy is to adjust on {C<sub>1</sub>,C<sub>3</sub>}, as C<sub>3</sub> should be a weaker instrument than C<sub>2</sub>, but also to present the estimates adjusted on {C<sub>1</sub>,C<sub>2</sub>} and {C<sub>1</sub>,C<sub>2</sub>,C<sub>3</sub>}.</p>
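				<p>A minimal simulation sketch may help to show why the two adjustment sets can give different estimates once residual confounding is present. The set-up is an assumption made for illustration only: continuous variables, no effect of A on Y, C<sub>1</sub> omitted, an unmeasured confounder z affecting both A and Y, C<sub>2</sub> affecting A strongly (coefficient 2), and a linear-regression coefficient standing in for the RD. The function names <it>cov</it>, <it>adjusted_coef</it>, and <it>simulate</it> are hypothetical.</p>

```python
import random


def cov(x, y):
    """Population-style covariance of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((u - mx) * (v - my) for u, v in zip(x, y)) / len(x)


def adjusted_coef(a, y, c):
    """Coefficient of a in a linear regression of y on a and one covariate c."""
    num = cov(a, y) * cov(c, c) - cov(a, c) * cov(c, y)
    den = cov(a, a) * cov(c, c) - cov(a, c) ** 2
    return num / den


def simulate(n=10000, residual_confounding=True, seed=0):
    """Hypothetical data: C2 -> A (strong), C2 -> C3 -> Y, optional Z -> A and Z -> Y."""
    rng = random.Random(seed)
    cols = {"a": [], "y": [], "c2": [], "c3": []}
    for _ in range(n):
        z = rng.gauss(0, 1) if residual_confounding else 0.0  # unmeasured confounder
        c2 = rng.gauss(0, 1)
        c3 = c2 + rng.gauss(0, 1)
        a = 2.0 * c2 + z + rng.gauss(0, 1)   # C2 is a strong predictor of A
        y = c3 + z + rng.gauss(0, 1)         # A has no effect on Y
        for name, value in (("a", a), ("y", y), ("c2", c2), ("c3", c3)):
            cols[name].append(value)
    return cols
```

				<p>In this set-up, adjusting on c2 or on c3 gives an estimate near the true value of zero when residual_confounding is False. When it is True, the estimate adjusted on c2 is further from zero than the estimate adjusted on c3, in line with the stronger association between C<sub>2</sub> and the exposure.</p>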
				<fig id="F8"><title><p>Figure 8</p></title><caption><p>Directed acyclic graph showing alternative putative relationships between variables A, Y, C1, C2, C3, C4, C5, and an unmeasured variable ZU</p></caption><text>
   <p>
      <b>Directed acyclic graph showing alternative putative relationships between variables A, Y, C1, C2, C3, C4, C5, and an unmeasured variable ZU.</b>
   </p>
</text><graphic file="1471-2288-12-156-8"/></fig>
			</sec>
			<sec>
				<st>
					<p>Presenting more than one final DAG</p>
				</st>
				<p>In many instances, the researcher will need to present more than one final DAG with implied add-one and minus-one patterns consistent with the patterns observed. Sometimes the adjusted estimate will be the same across these DAGs because they imply the same minimally sufficient adjustment set. An example is removing the C<sub>5</sub>&#8594;Y arrow and adding a C<sub>5</sub>&#8592;C<sub>3</sub> arrow in Figure 
					<figr fid="F2">2</figr>. This DAG has implied patterns similar to those of the current Figure 
					<figr fid="F2">2</figr> and so, if matching the observed patterns, both would need to be presented amongst the final DAGs. The minimally sufficient adjustment set in both is {C<sub>1</sub>} and so the adjusted effect estimate will be the same. However, in some cases the minimally sufficient adjustment sets will be different, so that an estimate for each DAG will need to be presented. One example of this involves the confounding vs. mediating pathways mentioned above, if both types of relationship were identified as plausible during the preparation of the prior DAGs (e.g. the DAG in Figure 
					<figr fid="F4">4</figr> and the DAG created by replacing A&#8592;C<sub>2</sub>&#8594;Y with A&#8594;C<sub>2</sub>&#8594;Y in Figure 
					<figr fid="F4">4</figr>).</p>
			</sec>
			<sec>
				<st>
					<p>Empirical example</p>
				</st>
				<p>We now consider an empirical example to illustrate the approach. We compare mortality 5 years after peritoneal-dialysis (PD) initiation amongst patients with polycystic kidney disease (PKD) versus other nephropathies, using data from the French Language Peritoneal Dialysis Registry (RDPLF) (details in Additional file 
					<supplr sid="S1">1</supplr> (web appendix); see also 
					<abbrgrp>
						<abbr bid="B51">51</abbr>
					</abbrgrp> for background). We estimate the RD by linear regression with robust standard errors 
					<abbrgrp>
						<abbr bid="B52">52</abbr>
					</abbrgrp> and treat a &#177;0.01 absolute change in the point estimate of the RD as meaningful, considering that a difference of this magnitude in the cumulative incidence of death would warrant attention from clinical or public health decision-makers. To compare the absolute and relative scales, we also show a &#177;10% change in the RD. Using 2000 non-parametric bootstrap samples, we calculated the proportion of resampled estimates lying outside the &#177;0.01 absolute change threshold.</p>
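				<p>The bootstrap calculation can be sketched as follows. This is a minimal illustration on simulated data, not the registry analysis: the variables a, c1, c4, and y, the data-generating probabilities, and the function names are all assumptions introduced here. The sketch also assumes that the change in the RD is recomputed within each bootstrap sample, and it omits the robust standard errors because only point estimates are compared.</p>

```python
import random


def ols_coefs(rows, y):
    """Least-squares coefficients from the normal equations (Gauss-Jordan)."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(k):
            if i != col:
                f = m[i][col] / m[col][col]
                m[i] = [u - f * v for u, v in zip(m[i], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]


def simulate(n=1000, seed=0):
    """Hypothetical binary data: c1 confounds a and y; c4 is unrelated to both."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        c1 = float(rng.random() > 0.5)
        c4 = float(rng.random() > 0.5)
        a = float(rng.random() > 0.8 - 0.6 * c1)             # P(a=1) = 0.2 + 0.6*c1
        y = float(rng.random() > 0.9 - 0.1 * a - 0.4 * c1)   # P(y=1) = 0.1 + 0.1*a + 0.4*c1
        data.append({"a": a, "c1": c1, "c4": c4, "y": y})
    return data


def risk_difference(data, adjust):
    """RD for exposure a from a linear model of y adjusted on the listed variables."""
    rows = [[1.0, d["a"]] + [d[v] for v in adjust] for d in data]
    return ols_coefs(rows, [d["y"] for d in data])[1]


def prop_meaningful(data, start_set, changed_set, threshold=0.01, n_boot=200, seed=1):
    """Proportion of bootstrap samples whose change in RD exceeds the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_boot):
        sample = rng.choices(data, k=len(data))  # resample rows with replacement
        change = risk_difference(sample, changed_set) - risk_difference(sample, start_set)
        if abs(change) > threshold:
            hits += 1
    return hits / n_boot
```

				<p>For a starting set of ["c1"], prop_meaningful(data, ["c1"], []) gives the minus-one proportion for c1 and prop_meaningful(data, ["c1"], ["c1", "c4"]) gives the add-one proportion for c4. The default of 200 resamples is chosen only to keep the sketch fast; n_boot=2000 would match the number used in the text.</p>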
				<p>The DAG in Figure 
					<figr fid="F9">9</figr> illustrates prior assumptions regarding variable relationships. Type of peritoneal dialysis refers to the two modalities of treatment, namely continuous ambulatory peritoneal dialysis and automated peritoneal dialysis. The other variables are self-explanatory. Figure 
					<figr fid="F9">9</figr> shows, for example, that we assume that <it>Type of peritoneal dialysis</it> and <it>Sex</it> have no direct association with <it>Death</it> and that both <it>PKD vs. other nephropathies</it> and <it>Comorbidity index</it> are associated with the <it>Peritoneal dialysis vs. haemodialysis</it> participation variable. The square around this latter variable shows that it has been conditioned upon during data collection, since only PD patients are included in the registry. Our prior uncertainties are the absence of the <it>Type of assistance</it>&#8594;<it>Death</it> arrow (Figure 
					<figr fid="F10">10</figr>), absence of the <it>Sex</it>&#8594;<it>Type of assistance</it> arrow (Figure 
					<figr fid="F11">11</figr>), and whether <it>Comorbidity index</it> and <it>Type of assistance</it> are better considered as proxies for two unmeasured variables, <it>Major concurrent illnesses</it> and <it>Frailty</it>, respectively (Figure 
					<figr fid="F12">12</figr>). In this last case, we consider <it>Frailty</it> also to be associated with the <it>Peritoneal dialysis vs. haemodialysis</it> collider and with <it>Death</it>.</p>
				<fig id="F9"><title><p>Figure 9</p></title><caption><p>Directed acyclic graph showing prior assumptions about relationships between variables in the empirical example</p></caption><text>
   <p>
      <b>Directed acyclic graph showing prior assumptions about relationships between variables in the empirical example.</b>
   </p>
</text><graphic file="1471-2288-12-156-9"/></fig>
				<fig id="F10"><title><p>Figure 10</p></title><caption><p>Directed acyclic graph showing prior uncertainty about variable relationships in the empirical example (absence of Type of Assistance -> Death arrow)</p></caption><text>
   <p>
      <b>Directed acyclic graph showing prior uncertainty about variable relationships in the empirical example (absence of Type of Assistance -> Death arrow).</b>
   </p>
</text><graphic file="1471-2288-12-156-10"/></fig>
				<fig id="F11"><title><p>Figure 11</p></title><caption><p>Directed acyclic graph showing prior uncertainty about variable relationships in the empirical example (absence of Sex -> Type of Assistance)</p></caption><text>
   <p>
      <b>Directed acyclic graph showing prior uncertainty about variable relationships in the empirical example (absence of Sex -> Type of Assistance).</b>
   </p>
</text><graphic file="1471-2288-12-156-11"/></fig>
				<fig id="F12"><title><p>Figure 12</p></title><caption><p>Directed acyclic graph showing prior uncertainty about variable relationships in the empirical example (showing Comorbidity index and Type of assistance as proxy variables for Major concurrent illnesses and Frailty, respectively)</p></caption><text>
   <p>
      <b>Directed acyclic graph showing prior uncertainty about variable relationships in the empirical example (showing Comorbidity index and Type of assistance as proxy variables for Major concurrent illnesses and Frailty, respectively).</b>
   </p>
</text><graphic file="1471-2288-12-156-12"/></fig>
				<p>There is only one minimally sufficient adjustment set in the prior DAG (Figure 
					<figr fid="F9">9</figr>), simply {<it>Age</it>, <it>Comorbidity index</it>}. Figure 
					<figr fid="F13">13</figr> shows the add-one and minus-one patterns for this adjustment set. The dotted lines are the &#177;0.01 threshold; the dashed lines are the &#177;10% relative change in the RD. The add-one pattern shows a meaningful change for <it>Type of assistance</it> (i.e. the estimate lies outside the dotted lines in Figure 
					<figr fid="F13">13</figr>), inconsistent with the implied pattern from Figure 
					<figr fid="F9">9</figr>, whereas the minus-one pattern shows a meaningful change for both variables in the set, consistent with Figure 
					<figr fid="F9">9</figr>. The proportions of bootstrapped estimates lying outside of the meaningful threshold are in Table 
					<tblr tid="T1">1</tblr>: only <it>Type of assistance</it> had &gt;50% of the add-one estimates outside of the meaningful threshold.</p>
				<fig id="F13"><title><p>Figure 13</p></title><caption><p>Add-one and minus-one patterns for an adjustment-variable set of {<it>Age</it>, <it>Comorbidity index</it>} based on DAG in Figure <figr fid="F9">9</figr></p></caption><text>
   <p><b>Add</b>-<b>one and minus</b>-<b>one patterns for an adjustment</b>-<b>variable set of</b> <b>{</b><b><it>Age</it></b><b>, </b><b><it>Comorbidity index</it></b><b>}</b> <b>based on DAG in Figure</b> <figr fid="F9">9</figr><b>.</b> The solid horizontal line is the RD estimate adjusted on this set. The dotted horizontal lines are the pre-defined meaningful change thresholds for an absolute change of &#177;0.01 in the RD. The dashed horizontal lines are a relative change of &#177;10% of the starting RD. The add-one section shows the RD upon adding each variable listed to the adjustment-variable set in turn. The minus-one section shows the RD upon removing each variable listed from the adjustment-variable set in turn.</p>
</text><graphic file="1471-2288-12-156-13"/></fig>
				<table id="T1">
					<title>
						<p>Table 1</p>
					</title>
					<caption>
						<p>
							<b>Percentage of bootstrapped risk difference estimates representing a meaningful change</b> (&#177; <b>0</b>.<b>01 change</b>) <b>for each variable in the empirical example</b>
						</p>
					</caption>
					<tgroup align="left" cols="4">
						<colspec align="left" colname="c1" colnum="1" colwidth="1*"/>
						<colspec align="char" colname="c2" colnum="2" colwidth="1*"/>
						<colspec align="left" colname="c3" colnum="3" colwidth="1*"/>
						<colspec align="char" colname="c4" colnum="4" colwidth="1*"/>
						<thead>
							<row rowsep="1">
								<entry colname="c1"/>
								<entry colname="c2"/>
								<entry colname="c3"/>
								<entry colname="c4"/>
							</row>
						</thead>
						<tbody>
							<row rowsep="1">
								<entry colname="c1">
									<p>For Figure 
										<figr fid="F13">13</figr>
									</p>
								</entry>
								<entry colname="c2"/>
								<entry colname="c3"/>
								<entry colname="c4"/>
							</row>
							<row rowsep="1">
								<entry colname="c1" nameend="c2" namest="c1">
									<p>Add-one variables</p>
								</entry>
								<entry align="center" colname="c1" nameend="c4" namest="c3">
									<p>Minus-one variables</p>
								</entry>
							</row>
							<row rowsep="1">
								<entry colname="c1">
									<p>Sex</p>
								</entry>
								<entry align="char" char="." colname="c2">
									<p>28.4%</p>
								</entry>
								<entry colname="c3">
									<p>Age</p>
								</entry>
								<entry align="char" char="." colname="c4">
									<p>95.3%</p>
								</entry>
							</row>
							<row rowsep="1">
								<entry colname="c1">
									<p>Type of peritoneal dialysis</p>
								</entry>
								<entry align="char" char="." colname="c2">
									<p>37.1%</p>
								</entry>
								<entry colname="c3">
									<p>Comorbidity index</p>
								</entry>
								<entry align="char" char="." colname="c4">
									<p>98.6%</p>
								</entry>
							</row>
							<row rowsep="1">
								<entry colname="c1">
									<p>Type of assistance</p>
								</entry>
								<entry align="char" char="." colname="c2">
									<p>99.6%</p>
								</entry>
								<entry colname="c3"/>
								<entry colname="c4"/>
							</row>
							<row rowsep="1">
								<entry colname="c1">
									<p>For Figure 
										<figr fid="F14">14</figr>
									</p>
								</entry>
								<entry colname="c2"/>
								<entry colname="c3"/>
								<entry colname="c4"/>
							</row>
							<row rowsep="1">
								<entry colname="c1" nameend="c2" namest="c1">
									<p>Add-one variables</p>
								</entry>
								<entry colname="c1" nameend="c4" namest="c3">
									<p>Minus-one variables</p>
								</entry>
							</row>
							<row rowsep="1">
								<entry colname="c1">
									<p>Type of peritoneal dialysis</p>
								</entry>
								<entry align="char" char="." colname="c2">
									<p>15.2%</p>
								</entry>
								<entry colname="c3">
									<p>Age</p>
								</entry>
								<entry align="char" char="." colname="c4">
									<p>38.3%</p>
								</entry>
							</row>
							<row rowsep="1">
								<entry colname="c1"/>
								<entry colname="c2"/>
								<entry colname="c3">
									<p>Comorbidity index</p>
								</entry>
								<entry align="char" char="." colname="c4">
									<p>58.8%</p>
								</entry>
							</row>
							<row rowsep="1">
								<entry colname="c1"/>
								<entry colname="c2"/>
								<entry colname="c3">
									<p>Sex</p>
								</entry>
								<entry align="char" char="." colname="c4">
									<p>75.9%</p>
								</entry>
							</row>
							<row rowsep="1">
								<entry colname="c1"/>
								<entry colname="c2"/>
								<entry colname="c3">
									<p>Type of assistance</p>
								</entry>
								<entry align="char" char="." colname="c4">
									<p>100.0%</p>
								</entry>
							</row>
						</tbody>
					</tgroup>
				</table>
				<p>We therefore need to review the DAG, focusing on <it>Type of assistance</it>. Looking at the prior uncertainties, dropping the <it>Type of assistance</it>&#8594;<it>Death</it> (Figure 
					<figr fid="F10">10</figr>) or the <it>Sex</it>&#8594;<it>Type of assistance</it> arrows (Figure 
					<figr fid="F11">11</figr>) does not change the implied patterns compared with Figure 
					<figr fid="F9">9</figr>. In contrast, specifying the proxy relations in Figure 
					<figr fid="F12">12</figr> changes the adjustment set. (Note that there is no sufficient adjustment set (of measured variables) according to this DAG as the paths <it>PKD vs. other nephropathies</it>&#8592;<it>Major concurrent illnesses</it>&#8594;<it>Death</it>, <it>PKD vs. other nephropathies</it>&#8592;<it>Major concurrent illnesses</it>&#8594;<it>Frailty</it>&#8594;<it>Death</it>, <it>PKD vs. other nephropathies</it>&#8592;<it>Major concurrent illnesses</it>&#8594;<it>Peritoneal dialysis vs. haemodialysis</it>&#8592;<it>Frailty</it>&#8594;<it>Death</it>, and <it>PKD vs. other nephropathies</it>&#8594;<it>Peritoneal dialysis vs. haemodialysis</it>&#8592;<it>Frailty</it>&#8594;<it>Death</it> remain partially open at <it>Major concurrent illnesses</it> and <it>Frailty</it>.) The implied add-one pattern for a starting adjustment set of {<it>Age</it>, <it>Comorbidity index</it>} in Figure 
					<figr fid="F12">12</figr> is therefore a meaningful change for <it>Type of assistance</it>, <it>Sex</it>, and <it>Type of peritoneal dialysis</it>.</p>
				<p>Now using Figure 
					<figr fid="F12">12</figr> as our revised working DAG, the best adjustment set is {<it>Age</it>, <it>Comorbidity index</it>, <it>Type of assistance</it>, <it>Sex</it>}. The last three variables are included as descending or ascending proxies of the two unmeasured variables. We did not include <it>Type of peritoneal dialysis</it> in this set as its net bias-reducing effect is unclear: it will contribute to partially conditioning on the unmeasured <it>Frailty</it> variable but will also open biasing pathways, e.g. <it>PKD vs. other nephropathies</it>&#8594;<it>Type of peritoneal dialysis</it>&#8592;<it>Frailty</it>&#8594;<it>Death</it>. The RD adjusted on the final set did not show a meaningful change in the add-one pattern (proportion of bootstrapped estimates outside the threshold &lt;50%, shown in Table 
					<tblr tid="T1">1</tblr>) and the minus-one pattern showed a meaningful change for all adjustment variables except <it>Age</it> (Figure 
					<figr fid="F14">14</figr>). <it>Age</it> also had &lt;50% of bootstrapped estimates lying outside of the meaningful threshold (Table 
					<tblr tid="T1">1</tblr>). We maintain <it>Age</it> in the adjustment set as this pattern is coherent with the DAG, since the other adjustment variables, <it>Comorbidity index</it> and <it>Type of assistance</it>, may already condition effectively on <it>Age</it> owing to a strong correlation. However, we note that <it>Age</it> may be dropped if it improves the efficiency of the estimate (see 
					<abbrgrp>
						<abbr bid="B6">6</abbr>
					</abbrgrp>). We would therefore present our prior working DAG (Figure 
					<figr fid="F9">9</figr>) with an RD of &#8722;0.07 (95% CI: &#8722;0.14, 0.00) and our final working DAG (Figure 
					<figr fid="F12">12</figr>) with an RD of &#8722;0.02 (95% CI: &#8722;0.10, 0.05).</p>
				<fig id="F14"><title><p>Figure 14</p></title><caption><p>Add-one and minus-one patterns for an adjustment-variable set of {<it>Age</it>, <it>Comorbidity index</it>, <it>Type of assistance</it>, <it>Sex</it>} based on DAG in Figure 12</p></caption><text>
   <p><b>Add</b>-<b>one and minus</b>-<b>one patterns for an adjustment</b>-<b>variable set of </b><b>{<it>Age</it></b>, <b><it>Comorbidity index</it></b>, <b><it>Type of assistance</it></b>, <b><it>Sex</it>} </b><b>based on DAG in Figure</b> <figr fid="F12">12</figr><b>.</b> The solid horizontal line is the RD estimate adjusted on this set. The dotted horizontal lines are the pre-defined meaningful change thresholds for an absolute change of &#177;0.01 in the RD. The dashed horizontal lines are a relative change of &#177;10% of the starting RD. The add-one section shows the RD upon adding each variable listed to the adjustment-variable set in turn. The minus-one section shows the RD upon removing each variable listed from the adjustment-variable set in turn.</p>
</text><graphic file="1471-2288-12-156-14"/></fig>
				<p>As an aside, Figures 
					<figr fid="F13">13</figr> and 
					<figr fid="F14">14</figr> show the difference between using relative and absolute scales as the threshold for a meaningful change. In Figure 
					<figr fid="F13">13</figr>, the starting RD is &#8722;0.07 and so the width of the relative change (dashed lines) is close to that of the absolute change (dotted lines). In Figure 
					<figr fid="F14">14</figr>, the starting RD is considerably smaller, at &#8722;0.02, and so the width of the relative change is much smaller than that of the absolute change.</p>
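				<p>The arithmetic behind this comparison can be written out in a few lines. The sketch below is illustrative only; the function names <it>change_bands</it> and <it>is_meaningful</it> are hypothetical, and the defaults simply restate the &#177;0.01 absolute and &#177;10% relative thresholds used in the text.</p>

```python
def change_bands(start_rd, abs_delta=0.01, rel_frac=0.10):
    """Return (lower, upper) bands for absolute and relative meaningful changes."""
    rel_delta = rel_frac * abs(start_rd)
    return {
        "absolute": (start_rd - abs_delta, start_rd + abs_delta),
        "relative": (start_rd - rel_delta, start_rd + rel_delta),
    }


def is_meaningful(start_rd, new_rd, scale="absolute"):
    """True if new_rd lies outside the chosen band around start_rd."""
    lower, upper = change_bands(start_rd)[scale]
    return not (upper >= new_rd >= lower)
```

				<p>For starting RDs of &#8722;0.07 and &#8722;0.02, the relative half-widths are 0.007 and 0.002 respectively, against a fixed absolute half-width of 0.01. A given change can therefore count as meaningful on the relative scale but not on the absolute scale when the starting RD is small.</p>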
			</sec>
		</sec>
		<sec>
			<st>
				<p>Discussion</p>
			</st>
			<p>We have presented an approach to selecting adjustment variables which combines prior knowledge expressed in a DAG with results from analysis of the data. The approach is pragmatic in that it focuses only on the effect of interest (also emphasized by others 
				<abbrgrp>
					<abbr bid="B5">5</abbr>
				</abbrgrp>); uses regression models and the change-in-estimate procedure familiar to epidemiologists; and can incorporate real-data problems such as measurement error and residual bias. It aims at producing a plausible, best working DAG or set of DAGs for a given research question, given the data at hand, and at communicating the assumptions underlying variable selection in the initial and final models using a standardized, graphical form 
				<abbrgrp>
					<abbr bid="B3">3</abbr>
				</abbrgrp>. The approach also communicates the uncertainties in the assumptions in the final models by presenting all the DAGs identified by the researcher which are consistent with the observed change-in-estimate patterns. This aims to help other research teams to focus on the areas of uncertainty and corroborate or refute the DAGs, based on the analysis of different datasets in an iterative way.</p>
			<p>The approach depends on recent theoretical work on c- (confounding-) equivalence 
				<abbrgrp>
					<abbr bid="B16">16</abbr>
				</abbrgrp> and collapsibility of estimates over different DAG structures 
				<abbrgrp>
					<abbr bid="B17">17</abbr>
				</abbrgrp>. Pearl and Paz 
				<abbrgrp>
					<abbr bid="B16">16</abbr>
				</abbrgrp> have developed conditions for c-equivalence which apply to any subsets of the variables in a DAG. Our approach uses two of their results: that all sufficient adjustment sets are c-equivalent and that failure to find c-equivalence of putative sufficient adjustment sets rules out a DAG implying such c-equivalence 
				<abbrgrp>
					<abbr bid="B3">3</abbr>
				</abbrgrp>. The approach also uses Pearl and Paz&#8217;s insights into bias amplification, in which they note that bias amplification will lead to changes in associations conditional on different variables even if the variables block the same path. In a recent, detailed review of collapsibility (i.e. equivalence) of different estimators over different DAGs 
				<abbrgrp>
					<abbr bid="B17">17</abbr>
				</abbrgrp>, Greenland and Pearl noted that regression coefficients may be used to check collapsibility over different covariable sets, an approach which we develop here for applied work.</p>
			<p>To our knowledge, only one other article in the epidemiology literature to date has looked at adjustment variable selection by explicitly combining DAGs and a statistical selection procedure 
				<abbrgrp>
					<abbr bid="B6">6</abbr>
				</abbrgrp>. This article addressed deletion of variables from an adjustment set defined from a prior DAG using the change-in-estimate procedure, but considered only odds ratios from simulations of case&#8211;control studies and explicitly excluded colliders. Our approach is therefore broader as it addresses whether the data support the initial DAG which defines the starting adjustment set, applies to any collapsible estimator, and covers the range of possible relationships between variables. Interestingly, this article found the largest bias (using simulated data) when including covariables associated only with the outcome in the adjustment set and suggested that non-collapsibility of the odds ratio may have been involved 
				<abbrgrp>
					<abbr bid="B6">6</abbr>
				</abbrgrp>. This reinforces our insistence on collapsible estimators.</p>
			<p>The proposed approach has some potential advantages over other variable-selection methods. It can reduce the &#8220;black-box&#8221; nature of using the p-value or the change-in-estimate alone to select variables, as it lays out the rationale for adjustment-variable choice graphically. It will also frequently lead to a more parsimonious model than selection based on p-values since it chooses variables by relevance to the exposure-outcome association, rather than the association with the outcome alone. The approach also extends background-knowledge methods by checking starting assumptions against the data and requiring researchers to justify mismatches or adapt assumptions appropriately. The approach complements the recently proposed method of adjusting on all assumed parents of exposure and outcome 
				<abbrgrp>
					<abbr bid="B21">21</abbr>
				</abbrgrp> as it can incorporate adjustment decisions when parent variables are measured with error and can achieve a more parsimonious model by excluding parent variables which do not lie on biasing pathways. Of course, sensitivity analyses to explore the impact of possible unmeasured confounding 
				<abbrgrp>
					<abbr bid="B53">53</abbr>
				</abbrgrp> remain important.</p>
			<p>An important point concerns the possibility of incidental cancellations and small effects. Finding a meaningful difference in the add-one pattern for a variable <it>when no difference is implied by the DAG</it> indicates the need to review the variable&#8217;s relationships. However, finding no meaningful difference in the add-one or minus-one patterns <it>when a difference is implied</it> is not, strictly speaking, inconsistent with the DAG. This is because of the possibilities of incidental cancellations across pathways and of changes which simply do not exceed the pre-defined meaningful threshold. For this reason, we suggest that the researcher maintain such arrows (thereby assuming &#8220;weak faithfulness&#8221; rather than faithfulness (see 
				<abbrgrp>
					<abbr bid="B32">32</abbr>
				</abbrgrp> p.190)), but label these arrows for other research teams to examine with different datasets.</p>
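			<p>The asymmetry described in this paragraph can be expressed as a simple decision rule. The sketch below is one possible reading, not a prescribed algorithm; the function name <it>classify</it> and the three labels it returns are hypothetical.</p>

```python
def classify(observed_change, implied_change, threshold=0.01):
    """Compare one observed add-one or minus-one change with the DAG-implied pattern.

    observed_change: change in the estimate when the variable is added or removed.
    implied_change:  True if the DAG implies a change for this variable.
    """
    meaningful = abs(observed_change) > threshold
    if meaningful and not implied_change:
        # A change appears where the DAG implies none: review the DAG.
        return "review"
    if implied_change and not meaningful:
        # Possible cancellation or sub-threshold change: keep the arrow, label it.
        return "keep-and-label"
    return "consistent"
```

			<p>For example, classify(0.03, False) returns "review", whereas classify(0.004, True) returns "keep-and-label", reflecting that an absent change does not by itself contradict the DAG.</p>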
			<p>A potential criticism of the approach is that it does not eliminate background knowledge from adjustment-variable selection. Indeed, the examples include instances of needing background knowledge to distinguish between DAGs giving the same add-one and minus-one patterns (e.g. confounding- vs. mediating-pathway examples, measurement-error vs. bias-amplification examples). It is well known that different DAGs can imply the same statistical relationships 
				<abbrgrp>
					<abbr bid="B3">3</abbr>
					<abbr bid="B7">7</abbr>
					<abbr bid="B54">54</abbr>
				</abbrgrp>, making an appeal to background knowledge unavoidable when using DAGs in applied work. We do not consider this a limitation, however: we see background knowledge as valid information which should rarely be overruled by any single dataset but, rather, reviewed in light of the patterns in the data. This is particularly appropriate in clinical epidemiology, where we frequently know quite a lot about likely relationships between variables. In contrast, the approach is unlikely to be well suited to datasets for which researchers have very little background knowledge; in such cases, alternative approaches such as DAG-discovery algorithms (below) may be used.</p>
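			<p>A small simulation can make the confounding- vs. mediating-pathway case concrete. The sketch below assumes linear models with hypothetical coefficients and a 10% relative-change threshold; the function names are illustrative only. The same variable m is generated once as a confounder and once as a mediator, and the add-one comparison is applied to each dataset.</p>

```python
import numpy as np

def exposure_coef(y, x, covars):
    """OLS coefficient of exposure x in a linear model of y on x and covars."""
    design = np.column_stack([np.ones(len(y)), x] + list(covars))
    return np.linalg.lstsq(design, y, rcond=None)[0][1]

def relative_change(y, x, m):
    """Relative change in the exposure estimate when m is added."""
    crude = exposure_coef(y, x, [])
    adjusted = exposure_coef(y, x, [m])
    return abs(adjusted - crude) / abs(crude)

rng = np.random.default_rng(1)
n = 50_000

# DAG A: m is a confounder (m -> x and m -> y).
m_a = rng.normal(size=n)
x_a = 0.8 * m_a + rng.normal(size=n)
y_a = 0.5 * x_a + 0.6 * m_a + rng.normal(size=n)

# DAG B: m is a mediator (x -> m -> y).
x_b = rng.normal(size=n)
m_b = 0.8 * x_b + rng.normal(size=n)
y_b = 0.5 * x_b + 0.6 * m_b + rng.normal(size=n)

threshold = 0.10  # assumed meaningful threshold
flag_confounder = relative_change(y_a, x_a, m_a) > threshold
flag_mediator = relative_change(y_b, x_b, m_b) > threshold
print(flag_confounder, flag_mediator)
```

			<p>Both datasets produce a meaningful add-one change for m, so the pattern alone cannot say whether adjustment removes confounding or blocks part of the effect; that decision has to come from background knowledge.</p>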
			<p>Another potential criticism is that the approach only addresses variable relationships relevant to the effect of interest, remaining agnostic about other regions of the DAG. This restriction is deliberate: it keeps the focus on the research question at hand and minimizes the risk of &#8220;getting lost&#8221; in trying to explore all possible associations in the DAG, many of which do not directly affect the selected exposure-outcome estimate. A researcher wishing to explore the full DAG could apply a DAG-discovery algorithm (e.g. the PC, GES, or FCI algorithms; see the TETRAD project&#8217;s website and 
				<abbrgrp>
					<abbr bid="B7">7</abbr>
				</abbrgrp>). Such algorithmic approaches use statistical tests or scoring rules to identify edges between variables and can incorporate background knowledge such as the temporal ordering of variables or the forced inclusion or exclusion of arrows. However, they have proven controversial 
				<abbrgrp>
					<abbr bid="B8">8</abbr>
				</abbrgrp> and have not yet crossed over into applied epidemiologic research. Nonetheless, recent applications of these algorithms in the biomedical literature for data with many variables and little background knowledge have been interesting 
				<abbrgrp>
					<abbr bid="B55">55</abbr>
				</abbrgrp>. In the approach proposed in this article, a researcher could use these algorithms to explore additional starting DAGs. In our experience, however, these algorithms currently present challenges, including handling datasets that mix continuous and categorical variables and dealing with issues such as measurement error and bias amplification.</p>
			<p>We wish to highlight several additional limitations of the proposed approach. Like the change-in-estimate procedure, the approach is <it>ad hoc</it> and informal as it depends on arbitrary thresholds and is not founded on well-defined statistical tests with appropriate theoretical properties. In addition, as discussed above, different DAG structures can give the same implied add-one and minus-one patterns and so more than one DAG will be consistent with the observed patterns. For this reason, the researcher should present all identified DAGs with implied patterns consistent with those observed; further, researchers should always remember that other DAGs (not identified) will also be consistent with the patterns.</p>
			<p>Several extensions to the approach are possible, should it appeal to epidemiologists working on applied questions. One is to determine how best to address sampling variability in the patterns, for example by comparing the performance of different rules based on the proportion of bootstrap samples which fall outside the meaningful threshold. Another potential extension concerns precision in choosing the adjustment set. We note that a researcher may wish to adjust for additional variables to improve precision 
				<abbrgrp>
					<abbr bid="B56">56</abbr>
				</abbrgrp> and may wish to delete variables from the final adjustment set based on precision of estimates, as concluded in 
				<abbrgrp>
					<abbr bid="B6">6</abbr>
				</abbrgrp>. Researchers should of course bear in mind that, as with any <it>a posteriori</it> variable selection, estimates from a revised DAG will tend to be over-precise. Finally, it may be possible to extend the approach to include recent advances in DAG theory, including selection variables to encode differences between populations (and so uncertainty about arrows) 
				<abbrgrp>
					<abbr bid="B57">57</abbr>
				</abbrgrp>, signed DAGs which specify assumptions about the positive or negative direction of paths 
				<abbrgrp>
					<abbr bid="B58">58</abbr>
				</abbrgrp>, and interactions using sufficient causation DAGs 
				<abbrgrp>
					<abbr bid="B59">59</abbr>
				</abbrgrp>.</p>
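			<p>One way the bootstrap-based rule mentioned above might look is sketched below. This is an assumption-laden illustration rather than a tested procedure: it uses a linear outcome model, simulated data, a 10% relative-change threshold, and a hypothetical function bootstrap_exceedance that returns the proportion of bootstrap resamples in which adding a covariate shifts the exposure estimate by more than the threshold.</p>

```python
import numpy as np

def exposure_coef(y, x, covars):
    """OLS coefficient of exposure x in a linear model of y on x and covars."""
    design = np.column_stack([np.ones(len(y)), x] + list(covars))
    return np.linalg.lstsq(design, y, rcond=None)[0][1]

def bootstrap_exceedance(y, x, z, threshold=0.10, n_boot=200, seed=0):
    """Proportion of bootstrap resamples in which adding z changes the
    exposure estimate by more than the relative threshold."""
    rng = np.random.default_rng(seed)
    n = len(y)
    exceed = 0
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        crude = exposure_coef(y[idx], x[idx], [])
        adjusted = exposure_coef(y[idx], x[idx], [z[idx]])
        if abs(adjusted - crude) / abs(crude) > threshold:
            exceed += 1
    return exceed / n_boot

# Simulated data (hypothetical coefficients).
rng = np.random.default_rng(42)
n = 2_000
z = rng.normal(size=n)   # confounder: z -> x and z -> y
v = rng.normal(size=n)   # variable unrelated to exposure and outcome
x = 0.8 * z + rng.normal(size=n)
y = 1.0 * x + 1.0 * z + rng.normal(size=n)

prop_z = bootstrap_exceedance(y, x, z)
prop_v = bootstrap_exceedance(y, x, v)
print(prop_z, prop_v)
```

			<p>A decision rule could then compare such proportions against a cut-off (for example, flagging a variable when most resamples exceed the threshold); which cut-off performs best is exactly the open question raised above.</p>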
		</sec>
		<sec>
			<st>
				<p>Conclusions</p>
			</st>
			<p>In summary, we have proposed a novel approach to adjustment-variable selection in epidemiology which combines existing knowledge-based and statistics-based methods. It requires a researcher to present background-knowledge assumptions in a DAG, to compare these against patterns in the data, and to review assumptions accordingly. It also ensures clear communication of assumptions and uncertainties to other researchers and readers in a standardized graphical format. As the approach requires background knowledge, it is probably best suited to areas such as clinical epidemiology where researchers know quite a lot about <it>a priori</it> plausible variable relationships. Researchers can use this approach as an additional tool for selecting adjustment variables when analyzing epidemiological data.</p>
		</sec>
		<sec>
			<st>
				<p>Competing interests</p>
			</st>
			<p>The authors declare that they have no competing interests.</p>
		</sec>
		<sec>
			<st>
				<p>Authors&#8217; contributions</p>
			</st>
			<p>DE, BC, and AF conceived the idea through their interests in confounder selection and directed acyclic graphs. CV and TL were responsible for the peritoneal dialysis data and contributed to the development and interpretation of the empirical example. DE did the analyses and drafted the manuscript. All authors critically reviewed the drafts and approved the final version.</p>
		</sec>
	</bdy>
	<bm>
		<refgrp><bibl id="B1"><title><p>Causal diagrams for epidemiologic research</p></title><aug><au><snm>Greenland</snm><fnm>S</fnm></au><au><snm>Pearl</snm><fnm>J</fnm></au><au><snm>Robins</snm><fnm>JM</fnm></au></aug><source>Epidemiology</source><pubdate>1999</pubdate><volume>10</volume><fpage>37</fpage><lpage>48</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1097/00001648-199901000-00008</pubid><pubid idtype="pmpid" link="fulltext">9888278</pubid></pubidlist></xrefbib></bibl><bibl id="B2"><title><p>Causal diagrams</p></title><aug><au><snm>Glymour</snm><fnm>M</fnm></au><au><snm>Greenland</snm><fnm>S</fnm></au></aug><source>Modern epidemiology</source><publisher>Philadelphia, PA: Lippincott Williams &amp;Wilkins</publisher><edition>3rd</edition><pubdate>2008</pubdate><fpage>183</fpage><lpage>209</lpage></bibl><bibl id="B3"><aug><au><snm>Pearl</snm><fnm>J</fnm></au></aug><source>Causality: models, reasoning, and inference</source><publisher>Cambridge: Cambridge University Press</publisher><edition>2nd</edition><pubdate>2009</pubdate></bibl><bibl id="B4"><title><p>Modeling and variable selection in epidemiologic analysis</p></title><aug><au><snm>Greenland</snm><fnm>S</fnm></au></aug><source>Am J Public Health</source><pubdate>1989</pubdate><volume>79</volume><fpage>340</fpage><lpage>349</lpage><xrefbib><pubidlist><pubid idtype="doi">10.2105/AJPH.79.3.340</pubid><pubid idtype="pmcid">1349563</pubid><pubid idtype="pmpid">2916724</pubid></pubidlist></xrefbib></bibl><bibl id="B5"><title><p>On model selection and model misspecification in causal inference</p></title><aug><au><snm>Vansteelandt</snm><fnm>S</fnm></au><au><snm>Bekaert</snm><fnm>M</fnm></au><au><snm>Claeskens</snm><fnm>G</fnm></au></aug><source>Stat Methods Med Res</source><pubdate>2012</pubdate><volume>21</volume><fpage>7</fpage><lpage>30</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1177/0962280210387717</pubid><pubid idtype="pmpid" link="fulltext">21075803</pubid></pubidlist></xrefbib></bibl><bibl 
id="B6"><title><p>Methods of covariate selection: directed acyclic graphs and the change-in-estimate procedure</p></title><aug><au><snm>Weng</snm><fnm>HY</fnm></au><au><snm>Hsueh</snm><fnm>YH</fnm></au><au><snm>Messam</snm><fnm>LLM</fnm></au><au><snm>Hertz-Picciotto</snm><fnm>I</fnm></au></aug><source>Am J Epidemiol</source><pubdate>2009</pubdate><volume>169</volume><fpage>1182</fpage><lpage>1190</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/aje/kwp035</pubid><pubid idtype="pmpid" link="fulltext">19363102</pubid></pubidlist></xrefbib></bibl><bibl id="B7"><aug><au><snm>Spirtes</snm><fnm>P</fnm></au><au><snm>Glymour</snm><fnm>C</fnm></au><au><snm>Scheines</snm><fnm>R</fnm></au></aug><source>Causation, prediction, and search, second edition</source><publisher>Cambridge: The MIT Press</publisher><edition>2nd</edition><pubdate>2001</pubdate></bibl><bibl id="B8"><title><p>Rejoinder to glymour and spirtes</p></title><source>Computation, causation, and discovery</source><publisher>Cambridge MA: AAAI Press/The MIT Press</publisher><editor>Glymour C, Cooper G</editor><pubdate>1999</pubdate><fpage>333</fpage><lpage>342</lpage></bibl><bibl id="B9"><title><p>Management practices and risk of occupational blood exposure in U.S. 
Paramedics: Non-intact skin exposure</p></title><aug><au><snm>Leiss</snm><fnm>JK</fnm></au></aug><source>Ann Epidemiol</source><pubdate>2009</pubdate><volume>19</volume><fpage>884</fpage><lpage>890</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.annepidem.2009.08.006</pubid><pubid idtype="pmpid" link="fulltext">19944350</pubid></pubidlist></xrefbib></bibl><bibl id="B10"><title><p>Prevalence of and risk factors for anal human papillomavirus infection in Men Who have Sex with women: a cross national study</p></title><aug><au><snm>Nyitray</snm><fnm>AG</fnm></au><au><snm>Smith</snm><fnm>D</fnm></au><au><snm>Villa</snm><fnm>L</fnm></au><au><snm>Lazcano Ponce</snm><fnm>E</fnm></au><au><snm>Abrahamsen</snm><fnm>M</fnm></au><au><snm>Papenfuss</snm><fnm>M</fnm></au><au><snm>Giuliano</snm><fnm>AR</fnm></au></aug><source>J Infect Dis</source><pubdate>2010</pubdate><volume>201</volume><fpage>1498</fpage><lpage>1508</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1086/652187</pubid><pubid idtype="pmcid">2856726</pubid><pubid idtype="pmpid" link="fulltext">20367457</pubid></pubidlist></xrefbib></bibl><bibl id="B11"><title><p>Sleep disturbances and cause-specific mortality: results from the GAZEL cohort study</p></title><aug><au><snm>Rod</snm><fnm>NH</fnm></au><au><snm>Vahtera</snm><fnm>J</fnm></au><au><snm>Westerlund</snm><fnm>H</fnm></au><au><snm>Kivimaki</snm><fnm>M</fnm></au><au><snm>Zins</snm><fnm>M</fnm></au><au><snm>Goldberg</snm><fnm>M</fnm></au><au><snm>Lange</snm><fnm>T</fnm></au></aug><source>Am J Epidemiol</source><pubdate>2010</pubdate><volume>173</volume><fpage>300</fpage><lpage>309</lpage><xrefbib><pubidlist><pubid idtype="pmcid">3105272</pubid><pubid idtype="pmpid" link="fulltext">21193534</pubid></pubidlist></xrefbib></bibl><bibl id="B12"><title><p>The effect of highly active antiretroviral therapy on the survival of HIV-infected children in a resource-deprived setting: a cohort 
study</p></title><aug><au><snm>Edmonds</snm><fnm>A</fnm></au><au><snm>Yotebieng</snm><fnm>M</fnm></au><au><snm>Lusiama</snm><fnm>J</fnm></au><au><snm>Matumona</snm><fnm>Y</fnm></au><au><snm>Kitetele</snm><fnm>F</fnm></au><au><snm>Napravnik</snm><fnm>S</fnm></au><au><snm>Cole</snm><fnm>SR</fnm></au><au><snm>Van Rie</snm><fnm>A</fnm></au><au><snm>Behets</snm><fnm>F</fnm></au></aug><source>PLoS Med</source><pubdate>2011</pubdate><volume>8</volume><fpage>e1001044</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pmed.1001044</pubid><pubid idtype="pmcid">3114869</pubid><pubid idtype="pmpid" link="fulltext">21695087</pubid></pubidlist></xrefbib></bibl><bibl id="B13"><title><p>Assessing perceived risk and STI prevention behavior: a national population-based study with special reference to HPV</p></title><aug><au><snm>Leval</snm><fnm>A</fnm></au><au><snm>Sundstr&#246;m</snm><fnm>K</fnm></au><au><snm>Ploner</snm><fnm>A</fnm></au><au><snm>Arnheim Dahlstr&#246;m</snm><fnm>L</fnm></au><au><snm>Widmark</snm><fnm>C</fnm></au><au><snm>Spar&#233;n</snm><fnm>P</fnm></au></aug><source>PLoS One</source><pubdate>2011</pubdate><volume>6</volume><fpage>e20624</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1371/journal.pone.0020624</pubid><pubid idtype="pmcid">3107227</pubid><pubid idtype="pmpid" link="fulltext">21674050</pubid></pubidlist></xrefbib></bibl><bibl id="B14"><title><p>Whole grains Are associated with serum concentrations of high sensitivity C-reactive protein among premenopausal women</p></title><aug><au><snm>Gaskins</snm><fnm>AJ</fnm></au><au><snm>Mumford</snm><fnm>SL</fnm></au><au><snm>Rovner</snm><fnm>AJ</fnm></au><au><snm>Zhang</snm><fnm>C</fnm></au><au><snm>Chen</snm><fnm>L</fnm></au><au><snm>Wactawski-Wende</snm><fnm>J</fnm></au><au><snm>Perkins</snm><fnm>NJ</fnm></au><au><snm>Schisterman</snm><fnm>EF</fnm></au><au><cnm>for the BioCycle Study Group</cnm></au></aug><source>J 
Nutr</source><pubdate>2010</pubdate><volume>140</volume><fpage>1669</fpage><lpage>1676</lpage><xrefbib><pubidlist><pubid idtype="doi">10.3945/jn.110.124164</pubid><pubid idtype="pmcid">2924598</pubid><pubid idtype="pmpid" link="fulltext">20668255</pubid></pubidlist></xrefbib></bibl><bibl id="B15"><title><p>Effect of daily fiber intake on reproductive function: the BioCycle study</p></title><aug><au><snm>Gaskins</snm><fnm>AJ</fnm></au><au><snm>Mumford</snm><fnm>SL</fnm></au><au><snm>Zhang</snm><fnm>CL</fnm></au><au><snm>Wactawski-Wende</snm><fnm>J</fnm></au><au><snm>Hovey</snm><fnm>KM</fnm></au><au><snm>Whitcomb</snm><fnm>BW</fnm></au><au><snm>Howards</snm><fnm>PP</fnm></au><au><snm>Perkins</snm><fnm>NJ</fnm></au><au><snm>Yeung</snm><fnm>E</fnm></au><au><snm>Schisterman</snm><fnm>EF</fnm></au></aug><source>Am J Clin Nutr</source><pubdate>2009</pubdate><volume>90</volume><fpage>1061</fpage><lpage>1069</lpage><xrefbib><pubidlist><pubid idtype="doi">10.3945/ajcn.2009.27990</pubid><pubid idtype="pmcid">2744625</pubid><pubid idtype="pmpid" link="fulltext">19692496</pubid></pubidlist></xrefbib></bibl><bibl id="B16"><title><p>Confounding equivalence in observational studies (or, when are two measurements equally valuable for effect estimation?)</p></title><aug><au><snm>Pearl</snm><fnm>J</fnm></au><au><snm>Paz</snm><fnm>A</fnm></au></aug><source>Proceedings of the twenty-sixth conference on uncertainty in artificial intelligence</source><publisher>Corvallis: AUAI</publisher><pubdate>2010</pubdate><fpage>433</fpage><lpage>441</lpage></bibl><bibl id="B17"><title><p>Adjustments and their consequences - Collapsibility analysis using graphical models</p></title><aug><au><snm>Greenland</snm><fnm>S</fnm></au><au><snm>Pearl</snm><fnm>J</fnm></au></aug><source>Int Stat Rev</source><pubdate>2011</pubdate><volume>79</volume><fpage>401</fpage><lpage>426</lpage><xrefbib><pubid idtype="doi">10.1111/j.1751-5823.2011.00158.x</pubid></xrefbib></bibl><bibl id="B18"><title><p>Reducing bias 
through directed acyclic graphs</p></title><aug><au><snm>Shrier</snm><fnm>I</fnm></au><au><snm>Platt</snm><fnm>RW</fnm></au></aug><source>BMC Med Res Methodol</source><pubdate>2008</pubdate><volume>8</volume><fpage>70</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2288-8-70</pubid><pubid idtype="pmcid">2601045</pubid><pubid idtype="pmpid" link="fulltext">18973665</pubid></pubidlist></xrefbib></bibl><bibl id="B19"><title><p>Using directed acyclic graphs to guide analyses of neighbourhood health effects: an introduction</p></title><aug><au><snm>Fleischer</snm><fnm>NL</fnm></au><au><snm>Diez Roux</snm><fnm>AV</fnm></au></aug><source>J Epidemiol Community Health</source><pubdate>2008</pubdate><volume>62</volume><fpage>842</fpage><lpage>846</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1136/jech.2007.067371</pubid><pubid idtype="pmpid" link="fulltext">18701738</pubid></pubidlist></xrefbib></bibl><bibl id="B20"><title><p>Using directed acyclic graphs to consider adjustment for socioeconomic status in occupational cancer studies</p></title><aug><au><snm>Richiardi</snm><fnm>L</fnm></au><au><snm>Barone-Adesi</snm><fnm>F</fnm></au><au><snm>Merletti</snm><fnm>F</fnm></au><au><snm>Pearce</snm><fnm>N</fnm></au></aug><source>J Epidemiol Community Health</source><pubdate>2008</pubdate><volume>62</volume><fpage>e14</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1136/jech.2007.065581</pubid><pubid idtype="pmpid" link="fulltext">18572430</pubid></pubidlist></xrefbib></bibl><bibl id="B21"><title><p>A New criterion for confounder selection</p></title><aug><au><snm>Vanderweele</snm><fnm>TJ</fnm></au><au><snm>Shpitser</snm><fnm>I</fnm></au></aug><source>Biometrics</source><pubdate>2011</pubdate><volume>67</volume><fpage>1406</fpage><lpage>1413</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1111/j.1541-0420.2011.01619.x</pubid><pubid idtype="pmcid">3166439</pubid><pubid idtype="pmpid" link="fulltext">21627630</pubid></pubidlist></xrefbib></bibl><bibl 
id="B22"><title><p>Methodological considerations, such as directed acyclic graphs, for studying &#8220;acute on chronic&#8221; disease epidemiology: chronic obstructive pulmonary disease example</p></title><aug><au><snm>Tsai</snm><fnm>C-L</fnm></au><au><snm>Camargo</snm><fnm>CA</fnm><suf>Jr</suf></au></aug><source>J Clin Epidemiol</source><pubdate>2009</pubdate><volume>62</volume><fpage>982</fpage><lpage>990</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.jclinepi.2008.10.005</pubid><pubid idtype="pmpid" link="fulltext">19211222</pubid></pubidlist></xrefbib></bibl><bibl id="B23"><title><p>Beware of the DAG!</p></title><aug><au><snm>Dawid</snm><fnm>AP</fnm></au></aug><source>Journal of Machine Learning Research Workshop and Conference Proceedings</source><pubdate>2010</pubdate><volume>6</volume><fpage>59</fpage><lpage>86</lpage></bibl><bibl id="B24"><title><p>Influence diagrams for causal modelling and inference</p></title><aug><au><snm>Dawid</snm><fnm>AP</fnm></au></aug><source>Int Stat Rev</source><pubdate>2002</pubdate><volume>70</volume><fpage>161</fpage><lpage>189</lpage></bibl><bibl id="B25"><title><p>Estimation of direct causal effects</p></title><aug><au><snm>Petersen</snm><fnm>ML</fnm></au><au><snm>Sinisi</snm><fnm>SE</fnm></au><au><snm>van der Laan</snm><fnm>MJ</fnm></au></aug><source>Epidemiology</source><pubdate>2006</pubdate><volume>17</volume><fpage>276</fpage><lpage>284</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1097/01.ede.0000208475.99429.2d</pubid><pubid idtype="pmpid" link="fulltext">16617276</pubid></pubidlist></xrefbib></bibl><bibl id="B26"><title><p>Identifiability and exchangeability for direct and indirect effects</p></title><aug><au><snm>Robins</snm><fnm>JM</fnm></au><au><snm>Greenland</snm><fnm>S</fnm></au></aug><source>Epidemiology</source><pubdate>1992</pubdate><volume>3</volume><fpage>143</fpage><lpage>155</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1097/00001648-199203000-00013</pubid><pubid 
idtype="pmpid">1576220</pubid></pubidlist></xrefbib></bibl><bibl id="B27"><title><p>A complete graphical criterion for the adjustment formula in mediation analysis</p></title><aug><au><snm>Shpitser</snm><fnm>I</fnm></au><au><snm>Vanderweele</snm><fnm>TJ</fnm></au></aug><source>Int J Biostat</source><pubdate>2011</pubdate><volume>7</volume><fpage>16</fpage><xrefbib><pubidlist><pubid idtype="doi">10.2202/1557-4679.1297</pubid><pubid idtype="pmcid">3083137</pubid><pubid idtype="pmpid" link="fulltext">21556286</pubid></pubidlist></xrefbib></bibl><bibl id="B28"><aug><au><snm>Shpitser</snm><fnm>I</fnm></au><au><snm>VanderWeele</snm><fnm>TJ</fnm></au><au><snm>Robins</snm><fnm>JM</fnm></au></aug><source>On the validity of covariate adjustment for estimating causal effects. Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI-10)</source><publisher>Corvallis: AUAI</publisher><pubdate>2010</pubdate><fpage>527</fpage><lpage>536</lpage></bibl><bibl id="B29"><title><p>Confounding and collapsibility in causal inference</p></title><aug><au><snm>Greenland</snm><fnm>S</fnm></au><au><snm>Robins</snm><fnm>JM</fnm></au><au><snm>Pearl</snm><fnm>J</fnm></au></aug><source>Stat Sci</source><pubdate>1999</pubdate><volume>14</volume><fpage>29</fpage><lpage>46</lpage><xrefbib><pubid idtype="doi">10.1214/ss/1009211805</pubid></xrefbib></bibl><bibl id="B30"><title><p>A suite of R functions for directed acyclic graphs</p></title><aug><au><snm>Breitling</snm><fnm>L</fnm></au></aug><source>Epidemiology</source><pubdate>2010</pubdate><volume>21</volume><fpage>586</fpage><lpage>587</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1097/EDE.0b013e3181e09112</pubid><pubid idtype="pmpid" link="fulltext">20539116</pubid></pubidlist></xrefbib></bibl><bibl id="B31"><title><p>DAG program: identifying minimal sufficient adjustment 
sets</p></title><aug><au><snm>Knueppel</snm><fnm>S</fnm></au><au><snm>Stang</snm><fnm>A</fnm></au></aug><source>Epidemiology</source><pubdate>2010</pubdate><volume>21</volume><fpage>159</fpage><xrefbib><pubid idtype="pmpid" link="fulltext">20010223</pubid></xrefbib></bibl><bibl id="B32"><aug><au><snm>Rothman</snm><fnm>K</fnm></au><au><snm>Greenland</snm><fnm>S</fnm></au><au><snm>Lash</snm><fnm>T</fnm></au></aug><source>Modern epidemiology</source><publisher>Philadelphia: Lippincott Williams &amp; Wilkins</publisher><edition>3rd</edition><pubdate>2008</pubdate></bibl><bibl id="B33"><title><p>Marginalia: comparing adjusted effect measures</p></title><aug><au><snm>Kaufman</snm><fnm>JS</fnm></au></aug><source>Epidemiology</source><pubdate>2010</pubdate><volume>21</volume><fpage>490</fpage><lpage>493</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1097/EDE.0b013e3181e00730</pubid><pubid idtype="pmpid" link="fulltext">20539110</pubid></pubidlist></xrefbib></bibl><bibl id="B34"><title><p>Absence of confounding does not correspond to collapsibility of the rate ratio or rate difference</p></title><aug><au><snm>Greenland</snm><fnm>S</fnm></au></aug><source>Epidemiology</source><pubdate>1996</pubdate><volume>7</volume><fpage>498</fpage><lpage>501</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1097/00001648-199609000-00007</pubid><pubid idtype="pmpid" link="fulltext">8862980</pubid></pubidlist></xrefbib></bibl><bibl id="B35"><title><p>Confounding - essence and detection</p></title><aug><au><snm>Miettinen</snm><fnm>OS</fnm></au><au><snm>Cook</snm><fnm>EF</fnm></au></aug><source>Am J Epidemiol</source><pubdate>1981</pubdate><volume>114</volume><fpage>593</fpage><lpage>603</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">7304589</pubid></xrefbib></bibl><bibl id="B36"><title><p>A tutorial on methods to estimating clinically and policy-meaningful measures of treatment effects in prospective observational studies: a 
review</p></title><aug><au><snm>Austin</snm><fnm>PC</fnm></au><au><snm>Laupacis</snm><fnm>A</fnm></au></aug><source>Int J Biostat</source><pubdate>2011</pubdate><volume>7</volume><issue>1</issue><fpage>6</fpage><xrefbib><pubidlist><pubid idtype="pmcid">3404554</pubid><pubid idtype="pmpid" link="fulltext">22848188</pubid></pubidlist></xrefbib></bibl><bibl id="B37"><title><p>Absolute risk reductions, relative risks, relative risk reductions, and numbers needed to treat can be obtained from a logistic regression model</p></title><aug><au><snm>Austin</snm><fnm>PC</fnm></au></aug><source>J Clin Epidemiol</source><pubdate>2010</pubdate><volume>63</volume><fpage>2</fpage><lpage>6</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.jclinepi.2008.11.004</pubid><pubid idtype="pmpid" link="fulltext">19230611</pubid></pubidlist></xrefbib></bibl><bibl id="B38"><title><p>Logistic regression was preferred to estimate risk differences and numbers needed to be exposed adjusted for covariates</p></title><aug><au><snm>Gehrmann</snm><fnm>U</fnm></au><au><snm>Kuss</snm><fnm>O</fnm></au><au><snm>Wellmann</snm><fnm>J</fnm></au><au><snm>Bender</snm><fnm>R</fnm></au></aug><source>J Clin Epidemiol</source><pubdate>2010</pubdate><volume>63</volume><fpage>1223</fpage><lpage>1231</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.jclinepi.2010.01.011</pubid><pubid idtype="pmpid" link="fulltext">20430578</pubid></pubidlist></xrefbib></bibl><bibl id="B39"><title><p>Estimating the relative risk in cohort studies and clinical trials of common outcomes</p></title><aug><au><snm>McNutt</snm><fnm>L-A</fnm></au><au><snm>Wu</snm><fnm>C</fnm></au><au><snm>Xue</snm><fnm>X</fnm></au><au><snm>Hafner</snm><fnm>JP</fnm></au></aug><source>Am J Epidemiol</source><pubdate>2003</pubdate><volume>157</volume><fpage>940</fpage><lpage>943</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/aje/kwg074</pubid><pubid idtype="pmpid" link="fulltext">12746247</pubid></pubidlist></xrefbib></bibl><bibl 
id="B40"><title><p>Simulation study of confounder-selection strategies</p></title><aug><au><snm>Maldonado</snm><fnm>G</fnm></au><au><snm>Greenland</snm><fnm>S</fnm></au></aug><source>Am J Epidemiol</source><pubdate>1993</pubdate><volume>138</volume><fpage>923</fpage><lpage>936</lpage><xrefbib><pubid idtype="pmpid" link="fulltext">8256780</pubid></xrefbib></bibl><bibl id="B41"><title><p>Invited Commentary: causal diagrams and measurement bias</p></title><aug><au><snm>Hern&#225;n</snm><fnm>MA</fnm></au><au><snm>Cole</snm><fnm>SR</fnm></au></aug><source>Am J Epidemiol</source><pubdate>2009</pubdate><volume>170</volume><fpage>959</fpage><lpage>962</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/aje/kwp293</pubid><pubid idtype="pmcid">2765368</pubid><pubid idtype="pmpid" link="fulltext">19755635</pubid></pubidlist></xrefbib></bibl><bibl id="B42"><title><p>Bias due to non-differential misclassification of polytomous confounders</p></title><aug><au><snm>Brenner</snm><fnm>H</fnm></au></aug><source>J Clin Epidemiol</source><pubdate>1993</pubdate><volume>46</volume><fpage>57</fpage><lpage>63</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/0895-4356(93)90009-P</pubid><pubid idtype="pmpid" link="fulltext">8433115</pubid></pubidlist></xrefbib></bibl><bibl id="B43"><title><p>On the nondifferential misclassification of a binary confounder</p></title><aug><au><snm>Ogburn</snm><fnm>EL</fnm></au><au><snm>VanderWeele</snm><fnm>TJ</fnm></au></aug><source>Epidemiology</source><pubdate>2012</pubdate><volume>23</volume><fpage>433</fpage><lpage>439</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1097/EDE.0b013e31824d1f63</pubid><pubid idtype="pmpid" link="fulltext">22450692</pubid></pubidlist></xrefbib></bibl><bibl id="B44"><title><p>Intuitions, simulations, theorems: the role and limits of 
methodology</p></title><aug><au><snm>Greenland</snm><fnm>S</fnm></au></aug><source>Epidemiology</source><pubdate>2012</pubdate><volume>23</volume><fpage>440</fpage><lpage>442</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1097/EDE.0b013e31824e278d</pubid><pubid idtype="pmpid" link="fulltext">22475828</pubid></pubidlist></xrefbib></bibl><bibl id="B45"><title><p>Theorems, proofs, examples, and rules in the practice of epidemiology</p></title><aug><au><snm>Vanderweele</snm><fnm>TJ</fnm></au><au><snm>Ogburn</snm><fnm>EL</fnm></au></aug><source>Epidemiology</source><pubdate>2012</pubdate><volume>23</volume><fpage>443</fpage><lpage>445</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1097/EDE.0b013e31824e2d4e</pubid><pubid idtype="pmpid" link="fulltext">22475829</pubid></pubidlist></xrefbib></bibl><bibl id="B46"><title><p>On a class of bias-amplifying covariates that endanger effect estimates</p></title><aug><au><snm>Pearl</snm><fnm>J</fnm></au></aug><source>Proceedings of the twenty-sixth conference on uncertainty in artificial intelligence</source><publisher>Corvallis: AUAI</publisher><pubdate>2010</pubdate><fpage>417</fpage><lpage>424</lpage><note>Technical report (R-356)</note></bibl><bibl id="B47"><title><p>Should instrumental variables be used as matching variables?</p></title><aug><au><snm>Wooldridge</snm><fnm>J</fnm></au></aug><source>Tech. Rep. 
Michigan state university</source><pubdate>2006</pubdate></bibl><bibl id="B48"><title><p>Effects of adjusting for instrumental variables on bias and precision of effect estimates</p></title><aug><au><snm>Myers</snm><fnm>JA</fnm></au><au><snm>Rassen</snm><fnm>JA</fnm></au><au><snm>Gagne</snm><fnm>JJ</fnm></au><au><snm>Huybrechts</snm><fnm>KF</fnm></au><au><snm>Schneeweiss</snm><fnm>S</fnm></au><au><snm>Rothman</snm><fnm>KJ</fnm></au><au><snm>Joffe</snm><fnm>MM</fnm></au><au><snm>Glynn</snm><fnm>RJ</fnm></au></aug><source>Am J Epidemiol</source><pubdate>2011</pubdate><volume>174</volume><fpage>1213</fpage><lpage>1222</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/aje/kwr364</pubid><pubid idtype="pmcid">3254160</pubid><pubid idtype="pmpid" link="fulltext">22025356</pubid></pubidlist></xrefbib></bibl><bibl id="B49"><title><p>Invited commentary: understanding bias amplification</p></title><aug><au><snm>Pearl</snm><fnm>J</fnm></au></aug><source>Am J Epidemiol</source><pubdate>2011</pubdate><volume>174</volume><fpage>1223</fpage><lpage>1227</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/aje/kwr352</pubid><pubid idtype="pmcid">3224255</pubid><pubid idtype="pmpid" link="fulltext">22034488</pubid></pubidlist></xrefbib></bibl><bibl id="B50"><title><p>Instrumental variables I: instrumental variables exploit natural variation in nonexperimental data to estimate causal relationships</p></title><aug><au><snm>Rassen</snm><fnm>JA</fnm></au><au><snm>Brookhart</snm><fnm>MA</fnm></au><au><snm>Glynn</snm><fnm>RJ</fnm></au><au><snm>Mittleman</snm><fnm>MA</fnm></au><au><snm>Schneeweiss</snm><fnm>S</fnm></au></aug><source>J Clin Epidemiol</source><pubdate>2009</pubdate><volume>62</volume><fpage>1226</fpage><lpage>1232</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.jclinepi.2008.12.005</pubid><pubid idtype="pmcid">2905668</pubid><pubid idtype="pmpid" link="fulltext">19356901</pubid></pubidlist></xrefbib></bibl><bibl id="B51"><title><p>Peritoneal dialysis in 
polycystic kidney disease patients. Report from the French peritoneal dialysis registry (RDPLF)</p></title><aug><au><snm>Lobbedez</snm><fnm>T</fnm></au><au><snm>Touam</snm><fnm>M</fnm></au><au><snm>Evans</snm><fnm>D</fnm></au><au><snm>Ryckelynck</snm><fnm>J-P</fnm></au><au><snm>Knebelman</snm><fnm>B</fnm></au><au><snm>Verger</snm><fnm>C</fnm></au></aug><source>Nephrol Dial Transplant</source><pubdate>2011</pubdate><volume>26</volume><fpage>2332</fpage><lpage>2339</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/ndt/gfq712</pubid><pubid idtype="pmpid" link="fulltext">21115669</pubid></pubidlist></xrefbib></bibl><bibl id="B52"><title><p>A modified least-squares regression approach to the estimation of risk difference</p></title><aug><au><snm>Cheung</snm><fnm>YB</fnm></au></aug><source>Am J Epidemiol</source><pubdate>2007</pubdate><volume>166</volume><fpage>1337</fpage><lpage>1344</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/aje/kwm223</pubid><pubid idtype="pmpid" link="fulltext">18000021</pubid></pubidlist></xrefbib></bibl><bibl id="B53"><title><p>Quantitative assessment of unobserved confounding is mandatory in nonrandomized intervention studies</p></title><aug><au><snm>Groenwold</snm><fnm>RHH</fnm></au><au><snm>Hak</snm><fnm>E</fnm></au><au><snm>Hoes</snm><fnm>AW</fnm></au></aug><source>J Clin Epidemiol</source><pubdate>2009</pubdate><volume>62</volume><fpage>22</fpage><lpage>28</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1016/j.jclinepi.2008.02.011</pubid><pubid idtype="pmpid" link="fulltext">18619797</pubid></pubidlist></xrefbib></bibl><bibl id="B54"><title><p>Data, design, and background knowledge in etiologic inference</p></title><aug><au><snm>Robins</snm><fnm>JM</fnm></au></aug><source>Epidemiology</source><pubdate>2001</pubdate><volume>12</volume><fpage>313</fpage><lpage>320</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1097/00001648-200105000-00011</pubid><pubid idtype="pmpid" 
link="fulltext">11338312</pubid></pubidlist></xrefbib></bibl><bibl id="B55"><title><p>Understanding human functioning using graphical models</p></title><aug><au><snm>Kalisch</snm><fnm>M</fnm></au><au><snm>Fellinghauer</snm><fnm>BAG</fnm></au><au><snm>Grill</snm><fnm>E</fnm></au><au><snm>Maathuis</snm><fnm>MH</fnm></au><au><snm>Mansmann</snm><fnm>U</fnm></au><au><snm>B&#252;hlmann</snm><fnm>P</fnm></au><au><snm>Stucki</snm><fnm>G</fnm></au></aug><source>BMC Med Res Methodol</source><pubdate>2010</pubdate><volume>10</volume><fpage>14</fpage><xrefbib><pubidlist><pubid idtype="doi">10.1186/1471-2288-10-14</pubid><pubid idtype="pmcid">2831907</pubid><pubid idtype="pmpid" link="fulltext">20149230</pubid></pubidlist></xrefbib></bibl><bibl id="B56"><title><p>Some surprising results about covariate adjustment in logistic-regression models</p></title><aug><au><snm>Robinson</snm><fnm>LD</fnm></au><au><snm>Jewell</snm><fnm>NP</fnm></au></aug><source>Int Stat Rev</source><pubdate>1991</pubdate><volume>59</volume><fpage>227</fpage><lpage>240</lpage><xrefbib><pubid idtype="doi">10.2307/1403444</pubid></xrefbib></bibl><bibl id="B57"><title><p>Transportability across studies: a formal approach</p></title><aug><au><snm>Pearl</snm><fnm>J</fnm></au><au><snm>Bareinboim</snm><fnm>E</fnm></au></aug><source>Technical report R-372</source><pubdate>2011</pubdate></bibl><bibl id="B58"><title><p>Signed directed acyclic graphs for causal inference</p></title><aug><au><snm>VanderWeele</snm><fnm>TJ</fnm></au><au><snm>Robins</snm><fnm>JM</fnm></au></aug><source>Journal of the Royal Statistical Society Series B-Statistical Methodology</source><pubdate>2009</pubdate><volume>72</volume><fpage>111</fpage><lpage>127</lpage></bibl><bibl id="B59"><title><p>Directed acyclic graphs, sufficient causes, and the properties of conditioning on a common effect</p></title><aug><au><snm>VanderWeele</snm><fnm>TJ</fnm></au><au><snm>Robins</snm><fnm>JM</fnm></au></aug><source>Am J 
Epidemiol</source><pubdate>2007</pubdate><volume>166</volume><fpage>1096</fpage><lpage>1104</lpage><xrefbib><pubidlist><pubid idtype="doi">10.1093/aje/kwm179</pubid><pubid idtype="pmpid" link="fulltext">17702973</pubid></pubidlist></xrefbib></bibl></refgrp>
	<sec><st><p>Pre-publication history</p></st><p>The pre-publication history for this paper can be accessed here:</p><p><url>http://www.biomedcentral.com/1471-2288/12/156/prepub</url></p></sec></bm>
</art>