Modelling continuous data
=========================

* Murray M. Finkelstein

Gilbert Welch and colleagues have written an interesting essay contrasting continuous and categorical approaches to modelling the relation between exposure and health effects.[1] They suggest that assuming a continuous relation between exposure and outcome may produce misleading results and state that this assumption “is less the result of a considered decision than a practice born out of convention and convenience. The convention is that biologic relations ought to be smooth. … The convenience is … [in summarizing] the relation between multiple levels of exposure and the outcome in a parsimonious manner.”

The authors have not mentioned one of the more compelling reasons for using a continuous relation in modelling: the ability to smooth out sampling variability among the discrete categories. The data they used for illustrative purposes are from population-based samples. As such, there will be sampling variability associated with each category, which is conventionally presented as 95% confidence intervals on the outcome in each category. The authors have completely ignored sampling variability; they have assumed that the observed outcomes are 100% precise. Had they included error bars in their graphs of outcome data, they would probably have observed substantial uncertainty about the point estimates in each of the categories and would have been less inclined to state, for example, that “[in Fig. 1] there is a slight increase in mortality between the moderate and high adherence categories [for men].”

A well-conducted regression analysis using continuous data will include model checking to confirm that the model captures the appropriate functional form (e.g., linear v. quadratic) and is not distorted by outliers. One must bear in mind, and not be fooled by, random fluctuations from category to category.

## Footnotes

* We received no response from Dr. Welch to our invitation to reply to this letter.

## Reference

1. Welch HG, Schwartz LM, Woloshin S. The exaggerated relations between diet, body weight and mortality: the case for a categorical data approach [editorial]. CMAJ 2005;172(7):891-5. [FREE Full Text](http://www.cmaj.ca/lookup/ijlink/YTozOntzOjQ6InBhdGgiO3M6MTQ6Ii9sb29rdXAvaWpsaW5rIjtzOjU6InF1ZXJ5IjthOjQ6e3M6ODoibGlua1R5cGUiO3M6NDoiRlVMTCI7czoxMToiam91cm5hbENvZGUiO3M6NDoiY21haiI7czo1OiJyZXNpZCI7czo5OiIxNzIvNy84OTEiO3M6NDoiYXRvbSI7czoyMjoiL2NtYWovMTczLzcvNzMzLjEuYXRvbSI7fXM6ODoiZnJhZ21lbnQiO3M6MDoiIjt9)