Comparison of diagnostic decision rules and structured data collection in assessment of acute ankle injury
==========================================================================================================

* Afina S. Glas
* Bas A.C.M. Pijnenburg
* Jeroen G. Lijmer
* Kjell Bogaard
* Marnix de Roos
* Johannes N. Keeman
* Rudolf M.J.M. Butzelaar
* Patrick M.M. Bossuyt

## Abstract

**Background:** Ankle decision rules help to determine which patients with ankle injuries should undergo radiography. However, these rules are limited by imperfect generalizability and sensitivity. The judgement of physicians, aided by structured data collection, is a potential alternative. We compared the diagnostic performance of 2 decision rules with the performance of physicians, aided by structured data collection, in ruling out fracture in patients with acute ankle injury.

**Methods:** Consecutive patients with acute ankle injury who visited the emergency department of a teaching community hospital in Amsterdam were included in the study. After taking the patient's history and performing a physical examination, the surgical resident in each case completed a specially developed structured data form incorporating all of the variables in the Ottawa and Leiden ankle rules, as well as some additional variables. The form then asked whether the resident thought radiography was necessary. Each patient then underwent ankle and midfoot radiography. The films were independently interpreted by a radiologist and a trauma surgeon, who were both blinded to the information on the data form. Sensitivity, specificity and the percentage of patients for whom radiography was recommended were the main outcome measures.

**Results:** Of 690 consecutive patients, 647 met the inclusion criteria. Fractures were observed in 74 (11%) of these patients.
Sensitivity was 89% (95% confidence interval [CI] 80% to 95%) for the Ottawa ankle rules, 80% (95% CI 69% to 88%) for the Leiden ankle rule and 82% (95% CI 72% to 90%) for physicians' judgement. Specificity was 26% (95% CI 23% to 30%), 59% (95% CI 55% to 63%) and 68% (95% CI 64% to 71%) respectively. Radiography was recommended in 76% (95% CI 72% to 79%), 46% (95% CI 42% to 50%) and 38% (95% CI 34% to 42%) of cases respectively. The Ottawa rules missed 8 fractures, of which 1 was clinically significant; the Leiden rule missed 15 fractures, of which 5 were clinically significant; and the residents missed 13 fractures, of which 1 was clinically significant.

**Interpretation:** Physicians' judgement, aided by structured data collection, was similar to existing international and local decision rules in terms of sensitivity in identifying cases requiring radiography and may outperform these prediction rules in terms of minimizing radiographic examinations for patients with ankle trauma.

It is not uncommon for physicians to routinely order radiography for patients with ankle injury, although fewer than 15% of such patients actually have fractures.1,2 This policy is safe, in that it ensures that no fractures are missed, but it entails a high use of resources. Stiell and colleagues3 suggested that guidelines might help physicians to identify patients who did not have fractures. In these cases radiography could be safely withheld and the associated costs avoided.
Empirical evidence for the claim that decision rules perform better than physicians' judgement came from 2 studies by Stiell and colleagues, both published in 1992, which used a set of rules now known as the Ottawa ankle rules.1,3 Shortly afterward, Stiell and colleagues expanded the Ottawa ankle rules.4 Some hospitals have constructed and implemented locally developed decision rules, whereas others have modified the Ottawa rules to suit their particular systems.5,6,7,8 The Ottawa ankle rules are among the most validated and most widely implemented clinical prediction rules. Yet they cannot escape from imperfect generalizability (the ability to perform as well in other populations as in the population for which they were originally developed9), a common problem with prediction rules.10 The Ottawa rules have proved unsuccessful in some populations.7,11,12 A possible explanation for this problem might be more or less severe ankle injury in different populations because of various thresholds for seeking medical assistance. Conversely, it might be questioned whether physicians fare so poorly in making recommendations about radiography for patients with ankle injury that formal decision rules and explicit calculations are needed. There is evidence that structured data collection (completion of a form with various questions relevant to a specific decision process and an explicit question about the outcome of the decision) might also improve the diagnostic performance of physicians.13,14,15 Such a form might be an alternative to formal decision rules. We hypothesized that even relatively inexperienced surgical residents in the emergency department can make accurate judgements in one of the most common situations encountered in this setting. 
In this prospective study, we compared the diagnostic performance of physicians aided by structured data collection with the performance of the Ottawa ankle rules and a local decision rule (the Leiden ankle rule) for patients with acute ankle injury.

## Methods

This prospective comparative study was performed at a medium-sized teaching community hospital in Amsterdam, the Netherlands. Consecutive patients presenting to the emergency department with acute ankle injury between January 1998 and April 1999 were eligible. Acute ankle injury was defined as any case of painful ankle resulting from trauma. The ankle was defined as the malleolar area and the midfoot area, both of which are commonly involved in twisting injuries. Patients were excluded if they were under 18 years of age, if they were pregnant, if they had been referred with radiographs from another hospital or a general practitioner, if the ankle injury had occurred more than 5 days previously, if they had returned for reassessment or if they had experienced multiple trauma. No changes in clinical management were made as a result of this study, so approval was not sought from the local ethics committee, nor were the patients asked to provide informed consent.

Two sets of decision rules for ankle injury were used. The Ottawa ankle rules, developed in 1992 and refined in 1993,1,4 consist of a foot section and an ankle section, each comprising 3 unweighted variables. If any one of these variables is positive, radiography is indicated (Table 1). The rules were designed to have 100% sensitivity in detecting clinically significant fractures.

[Table 1](http://www.cmaj.ca/content/166/6/727/T1)

The Leiden ankle rule was developed at the university hospital of the city of Leiden in 1991.6 It consists of 7 rows, each containing one or more variables.
If at least one variable in a row is positive, that row receives its stated (weighted) score; the exception is the last row, whose score depends on the age of the patient. If two variables in the same row are positive, the score is not doubled. For example, if both deformity and crepitation are positive, the score for that row is 5. The final score is the sum of the row scores (Table 2). The developers of the rule reported a sensitivity of 100% in detecting clinically significant fractures at this cutoff level.6

[Table 2](http://www.cmaj.ca/content/166/6/727/T2)

The attending physicians in this study were junior surgical and orthopedic residents. For each case, the resident was asked to complete a structured data collection form after taking the history and performing a physical examination. The one-page form, developed specifically for this study, incorporated the criteria for the Ottawa ankle rules and the Leiden ankle rule, along with some additional variables (Fig. 1, translated from Dutch). At the end of the form, the resident was asked to indicate whether radiographic examination was necessary. In addition, the resident was asked to estimate the likelihood of fracture on a probability scale ranging from 0% to 100%. The residents were instructed in scoring the various items on the form. They were told that the study involved a comparison between the Ottawa ankle rules and a local decision rule, but the rules as such were not discussed. Although some residents may have been aware of the rules, it was assumed that they did not use any specific rules in making the decision to request radiography or in estimating the probability of fracture.

![Figure1](https://www.cmaj.ca/content/cmaj/166/6/727/F1.medium.gif) [Figure1](http://www.cmaj.ca/content/166/6/727/F1)

**Fig.
1: Structured data collection form encompassing variables from the Ottawa ankle rules (Table 1), the Leiden ankle rule (Table 2) and other variables from the history and physical examination. Pallor = a noticeable difference in skin colour compared with the contralateral side; pulseless or weakened posterior tibial artery = a marked difference from the contralateral side; pain on axial compression = pain when plantar pressure is applied on the heel in the direction of the knee. [The form is a translation of a Dutch original.]** After the resident had completed the form, the patient underwent a radiographic series of both foot and ankle, regardless of the resident's assessment of the need for radiography. A radiologist and trauma surgeon (JNK) interpreted the radiographs independently. Both were blinded as to the contents of the structured data collection form and any treatment given. Disagreement was resolved by consensus. A fracture was defined as any fresh fracture line. A clinically significant fracture was defined as an apparent dislocation of more than 2 mm and a fracture line more than 3 mm across. Patients with clinically significant fractures, as diagnosed by the resident, received operative treatment or cast immobilization. For each patient, the scores for the Ottawa and Leiden ankle rules were calculated from the relevant variables on the structured data collection form. The radiographic series was used as the reference standard. For both the Ottawa and the Leiden ankle rules, we calculated the sensitivity, specificity, percentage of missed fractures and percentage of patients for whom radiography would be indicated, on the basis of the established cutoff scores for these rules. The percentage of patients for whom radiography would be indicated was calculated as the percentage of true positives and false positives for the whole patient group. 
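The outcome measures described above all follow from a 2x2 table that cross-classifies each decision (rule or resident) against the radiographic reference standard. The following is a minimal Python sketch under stated assumptions: the counts are hypothetical and are not the study data, "missed fractures" is taken here as the share of fractures that were not flagged, and the Wilson score interval is one common choice for a confidence interval around a proportion (the article does not state which interval method was used).

```python
import math


def diagnostic_summary(tp, fn, fp, tn):
    """Summarise a 2x2 table of decision vs. radiographic reference standard.

    tp/fn: fractures flagged / not flagged for radiography.
    fp/tn: non-fractures flagged / not flagged for radiography.
    """
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # Assumption: expressed as a share of all fractures.
        "missed_fractures": fn / (tp + fn),
        # True positives plus false positives over the whole patient group.
        "radiography_recommended": (tp + fp) / total,
    }


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a proportion (z = 1.96 gives about 95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Hypothetical counts for illustration only.
summary = diagnostic_summary(tp=45, fn=5, fp=150, tn=300)
low, high = wilson_interval(45, 50)  # interval around the sensitivity
print(summary)
print(f"sensitivity 95% CI: {low:.3f} to {high:.3f}")
```

The same two functions can be applied unchanged to each rule and to the residents' yes/no answers, since all three produce one binary decision per patient.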
We calculated the same items as determined by residents' judgement; for the residents, the percentage of patients for whom radiography would be recommended was based on answers to the question of whether radiographic examination was deemed necessary (see Fig. 1). Receiver operating characteristic (ROC) curves were created for the 2 sets of decision rules. To draw the ROC curve for the Ottawa rules, sensitivity and specificity were calculated on the basis of combined foot and ankle criteria for the following 5 thresholds: 0 items positive through 5 items positive. The area under each ROC curve (AUC) was subsequently calculated. The AUC expresses the performance of a diagnostic tool in distinguishing patients with the target condition from those without it for all possible cutoff values. We also calculated the AUC for the residents' estimates of the probability of fracture. A calibration curve was constructed for the probability estimates to examine the reliability of the residents' predictions.16 For this purpose the probability estimates were divided into deciles. For each decile the actual fracture rate was calculated and compared with the mean probability estimate for all patients in that decile. With well-calibrated probability estimates, the average probability estimate should correspond to the actual fracture rate in each decile. The McNemar test for paired samples was used to test for significant differences between the rules and the residents in terms of the percentage of patients for whom radiography was recommended. The method of Hanley and McNeil was used to test whether differences between the AUCs were statistically significant.17 Confidence intervals (CI) around estimates were calculated when appropriate. The sample size needed for this study was estimated with the McNemar test. 
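The McNemar test mentioned above compares two yes/no decisions made on the same patients, using only the discordant pairs (patients for whom one method recommends radiography and the other does not). A minimal sketch, assuming hypothetical paired decisions and the uncorrected chi-square form of the test; the article does not say whether a continuity correction or an exact version was used.

```python
import math


def mcnemar_test(decisions_a, decisions_b):
    """Uncorrected McNemar chi-square test for paired yes/no decisions.

    Returns (chi2, p_value); the p-value uses the chi-square
    distribution with 1 degree of freedom.
    """
    pairs = list(zip(decisions_a, decisions_b))
    only_a = sum(1 for a, b in pairs if a and not b)
    only_b = sum(1 for a, b in pairs if b and not a)
    discordant = only_a + only_b
    if discordant == 0:
        return 0.0, 1.0  # the two methods never disagree
    chi2 = (only_a - only_b) ** 2 / discordant
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical decisions for 100 patients (not the study data):
# 30 flagged by the rule only, 10 by the resident only,
# 40 flagged by both, 20 flagged by neither.
rule = [True] * 30 + [False] * 10 + [True] * 40 + [False] * 20
resident = [False] * 30 + [True] * 10 + [True] * 40 + [False] * 20
chi2, p = mcnemar_test(rule, resident)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Concordant pairs do not enter the statistic, which is why the required sample size depends on how often the two methods are expected to disagree.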
We considered as clinically important a 5% difference between the decision rules and the residents' answers to the yes/no question in terms of patients for whom radiography was recommended. We calculated that to achieve a power of at least 90%, with a 2-tailed type 1 error of 5%, a (paired) sample size of at least 609 subjects was necessary.

## Results

Twenty-four residents gathered data during the study period. A total of 690 patients presented with acute ankle trauma, of whom 647 were included in the study (Fig. 2). Mean age was 35 (range 18 to 92, standard deviation 14) years. Half of the subjects, 324, were female. Fractures were observed radiographically for 74 (11%) of the patients, and 41 of these were considered clinically significant. Nineteen of the patients underwent operative treatment, and the other 22 underwent cast immobilization. The 33 patients with insignificant fractures and all patients without a fracture were treated with tape or bandage.

![Figure2](https://www.cmaj.ca/content/cmaj/166/6/727/F2.medium.gif) [Figure2](http://www.cmaj.ca/content/166/6/727/F2)

**Fig. 2: Distribution of patients. Patients were excluded for the following reasons: age less than 18 years (29 patients), referred by another hospital or a general practitioner (4), pregnant (3), reason for exclusion unknown (3), injured more than 5 days before (2), lack of insurance (1) and head injury (1). The following types of fracture were observed: fracture of fifth metatarsal (16 cases), Weber B fracture (8), trimalleolar fracture (7), Weber A fracture (7), Weber C fracture (7), fracture of calcaneus (6), fracture of cuboid (6), fracture of medial malleolus (5), bimalleolar fracture (3), avulsion fracture of fibula (3), fracture of talus (3), fracture of navicular bone (1), fracture of first to fourth metatarsal (1), and Volkmann fracture (1).
Fractures are defined as follows: Weber A — fibular fracture distal from the tibiofibular syndesmosis; Weber B — fibular fracture at the level of the tibiofibular syndesmosis; Weber C — fibular fracture proximal of the tibiofibular syndesmosis; and Volkmann – triplane fracture of dorsal tibia. A clinically significant fracture was defined as an apparent dislocation of more than 2 mm and a fracture line more than 3 mm across.** The Ottawa ankle rules identified 66 of the 74 fractures (sensitivity 89% and specificity 26%), the Leiden ankle rule identified 59 (sensitivity 80% and specificity 59%), and the residents identified 61 (sensitivity 82% and specificity 68%) (Table 3). The Ottawa rules missed 8 fractures, of which one was clinically significant (a Weber B fracture of the fibula, which was treated with a cast). The Leiden rule missed 15 fractures, of which 5 were clinically significant (2 metatarsal fractures, 1 anterior fracture of the calcaneus, 1 Weber A fracture and 1 Weber B fracture, all of which were treated with a cast). The residents missed 13 fractures, of which 1 was clinically significant (a Weber A fracture with dislocation that needed repositioning and cast treatment). The Ottawa rules and the Leiden rule recommended radiography in 76% and 46% of the cases, respectively. The residents considered radiography necessary in 38% of the cases. View this table: [Table3](http://www.cmaj.ca/content/166/6/727/T3) Table 3. McNemar tests showed a significant difference between both sets of decision rules and the residents' opinions in terms of the percentage of cases in which radiography was recommended (*p* < 0.001 for all 2-way comparisons). The ROC curves are shown in Fig. 3. The AUC for the Ottawa rules was 0.69, significantly lower than the AUC for both the Leiden rule (0.77) and the residents' probability estimates (0.80) (*p* < 0.05). There were no significant differences between the Leiden rule and residents' probability estimates. 
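The AUC values above can be read as the probability that a randomly chosen patient with a fracture receives a higher score (or probability estimate) than a randomly chosen patient without one, with ties counting one half. A minimal sketch of that ranking interpretation, assuming hypothetical probability estimates rather than the study data; the function name `auc_by_ranking` is illustrative.

```python
def auc_by_ranking(scores_with_fracture, scores_without_fracture):
    """AUC as the share of (fracture, no-fracture) pairs ranked correctly.

    A pair counts 1 if the fracture patient has the higher score,
    0.5 if the scores are tied, and 0 otherwise.
    """
    wins = 0.0
    for s_pos in scores_with_fracture:
        for s_neg in scores_without_fracture:
            if s_pos > s_neg:
                wins += 1.0
            elif s_pos == s_neg:
                wins += 0.5
    n_pairs = len(scores_with_fracture) * len(scores_without_fracture)
    return wins / n_pairs


# Hypothetical probability estimates (0 to 1) for illustration only.
with_fracture = [0.9, 0.6, 0.3]
without_fracture = [0.5, 0.3, 0.1, 0.0]
auc = auc_by_ranking(with_fracture, without_fracture)
print(f"AUC = {auc:.3f}")
```

The pairwise count is quadratic in the number of patients, which is acceptable for a sample of a few hundred; it yields the same value as the area under the empirical ROC curve built from the same scores.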
![Figure3](https://www.cmaj.ca/content/cmaj/166/6/727/F3.medium.gif) [Figure3](http://www.cmaj.ca/content/166/6/727/F3)

**Fig. 3: Receiver operating characteristic curves for the 2 sets of ankle rules and the probability estimates for physicians' judgement. The area under these curves expresses the performance of a diagnostic tool in distinguishing patients with the target condition from those without it for all possible cutoff values. The AUC can be interpreted as the probability that a test correctly ranks two individuals, of whom one has the disease and one does not. The AUC takes values between 0 and 1, with higher values indicating better overall performance.**

The calibration curve shows 8 point estimates, covering a total of 10 deciles (Fig. 4). The first point, at zero, comprises 3 deciles representing 243 patients, for all of whom the estimated probability of fracture was zero. The observed proportion of fractures in these patients was 4.1%. The calibration curve of the residents' probability estimates was nearly perfect for the low probability estimates, as indicated by the fact that estimates in the range from zero to about 10% are close to the dashed line. Above 10% estimated probability the calibration curve shows increasing overestimation: in the highest decile, the mean estimated probability of fracture was 85%, which contrasts sharply with the 65% observed fracture rate.

![Figure4](https://www.cmaj.ca/content/cmaj/166/6/727/F4.medium.gif) [Figure4](http://www.cmaj.ca/content/166/6/727/F4)

**Fig. 4: Calibration curve comparing physicians' performance in diagnosing ankle fracture with observed fractures. The probability estimates for physicians' judgement were divided into deciles, and the actual fracture rate in each decile was compared with the mean probability estimate for all patients in the decile. The first point (probability estimate of zero) covers 3 deciles (a total of 243 patients).
For each point estimate, the 95% confidence interval is given. The dashed line indicates perfect calibration between observed fractures and physicians' diagnostic performance.**

## Interpretation

In this study the diagnostic performance of residents using structured data collection was compared with 2 sets of diagnostic decision rules for patients with acute ankle injury. These rules were designed for determining whether radiographic assessment is necessary to exclude fracture. The sensitivity of diagnostic performance by relatively inexperienced surgical and orthopedic residents, after structured data collection, was similar to the sensitivity for the Ottawa rules and the local Leiden rule. In addition, the residents outperformed both sets of rules in terms of minimizing the percentage of patients for whom radiography was deemed necessary.

The results of this study may have been biased by a Hawthorne effect, whereby participants (in this case, the residents examining the patients) tend to perform better simply because of their awareness that they are participating in a comparative study.18 We feel that the Hawthorne effect in this study was probably not large, in that the residents were informed that the study involved a comparison between decision rules, not an assessment of their own performance.

To rule out verification bias, every consecutive patient with ankle injury underwent standard radiography of both ankle and foot. This could have affected the residents' decisions, because they knew in advance that radiographs would be obtained regardless of their clinical judgement. The residents were probably less careful than they would have been in a nonstudy situation. This form of bias might have led to more false-negative and fewer false-positive judgements and thus to underestimation of sensitivity and overestimation of the potential reduction in radiographic assessment.
We instructed the residents to focus on distinguishing fractures from nonfracture injury; there was no focus on potential treatment. The residents may nevertheless have based their clinical judgement on possible treatment options and, consequently, shifted their diagnostic frame of reference to more severe cases. For example, if a significant fracture could be ruled out, radiography would have been deemed unnecessary, since the treatment for a small fracture is the same as that for no fracture. Such a shift in the frame of reference might lead to an underestimation of sensitivity due to missing small fractures.

To our knowledge only 4 studies on the use of structured data collection are available, all focusing on the diagnosis of acute abdominal pain. These studies reported better diagnostic performance by physicians who used structured data collection compared with standard policy that was not aided by decision-making tools.13,14,15,19 Structured data collection enables consistent clinical assessment and as such might induce a better thought-out decision, which would be particularly beneficial for junior (less experienced) residents.14

A small number of previous studies have compared ankle rules or guidelines with physicians' performance, but none of these studies used explicit structured data collection. Stiell and colleagues have published 2 studies on this subject. The first concerned estimates of the probability of clinically significant fractures.1 The authors concluded that if the threshold for requesting radiographs was less than 10%, the physicians performed reasonably well, missing only 2 of 145 significant fractures (sensitivity 99% [95% CI 95% to 100%]).1 We used the probability estimate as an intermediate measure, to conceptualize the decision process of the physician without dichotomizing it. Nevertheless, applying a threshold of 10% for the data in this study would have yielded a sensitivity of 90% (95% CI 77% to 97%) for detecting significant fractures.
In the study by Stiell and colleagues1 no ROC analysis was performed to compare overall diagnostic performance. A second publication by the same authors focused mainly on physicians' performance in detecting clinically significant fractures.3 For comparative purposes, we created the ROC curve for clinically significant fractures only in our study; the AUC was 0.89, almost equal to the AUC of 0.88 for the probability estimates reported by Stiell and colleagues.3

In our population the 2 sets of ankle rules and the residents missed some fractures. Some might argue that it is unethical to allow any fractures to be missed, even if they are small and quality of care is not affected. However, both sets of rules and the residents missed clinically important fractures, which would affect quality of care. A possible explanation for the lack of perfect sensitivity for the Ottawa rules in this study may be the limited experience of the physicians. Most of them had just finished their medical training and were much less experienced than the well-trained physicians in the studies of Stiell and colleagues. Unknown population differences might also be a factor, although there were no obvious reasons why the populations would not be comparable.

Another disadvantage of prediction rules is the lack of transportability, the ability to use the rules in related populations.9 For example, neither the Ottawa rules nor the Leiden rule apply to patients with specific comorbidity, such as neuropathic disorders.20

We believe that procedures that continuously challenge the decision-making skills of physicians are in the long run more accurate than rigid guidelines. We have shown that the clinical skills of relatively inexperienced physicians, supported by structured data collection and explicit questioning, allowed them to perform as well as validated decision rules in judging the need for radiography in patients with acute ankle injuries.
Explicit ankle rules are not necessary to make a decision about the need for radiography to rule out fracture in a patient with a painful ankle after trauma. Even more may be at stake: these results indicate that even greater reductions in the use of radiography can be attained, without compromising quality of care, by relying on physicians' judgement rather than on formalized decision rules.

## Footnotes

* *This article has been peer reviewed.*

*Contributors:* Afina Glas contributed to the analysis and interpretation of data and to writing the manuscript. Bas Pijnenburg, Kjell Bogaard and Marnix de Roos contributed to the conception and design of the study, collected the data and assisted in both the interpretation of the data and writing of the manuscript. Patrick Bossuyt and Jeroen Lijmer were the initial developers of the design of the study and supervised and assisted the statistical analysis as well as the interpretation of the results. They also contributed to revision of the manuscript for important intellectual content. Johannes Keeman and Rudolf Butzelaar assisted in the data collection, gave administrative and technical support and critically revised the manuscript.

*Competing interests:* None declared.

## References

1. Stiell IG, Greenberg GH, McKnight RD, Nair RC, McDowell I, Worthington JR. A study to develop clinical decision rules for the use of radiography in acute ankle injuries. Ann Emerg Med 1992;21(4):384-90. doi:10.1016/S0196-0644(05)82656-3
2. Dunlop MG, Beattie TF, White GK, Raab GM, Doull RI. Guidelines for selective radiological assessment of inversion ankle injuries. BMJ (Clin Res Ed) 1986;293(6547):603-5.
3. Stiell IG, McDowell I, Nair RC, Aeta H, Greenberg G, McKnight RD, et al. Use of radiography in acute ankle injuries: physicians' attitudes and practice. CMAJ 1992;147(11):1671-8.
4. Stiell IG, Greenberg GH, McKnight RD, Nair RC, McDowell I, Reardon M, et al. Decision rules for the use of radiography in acute ankle injuries. Refinement and prospective validation. JAMA 1993;269(9):1127-32. doi:10.1001/jama.1993.03500090063034
5. Leddy JJ, Smolinski RJ, Lawrence J, Snyder JL, Priore RL. Prospective evaluation of the Ottawa ankle rules in a university sports medicine center. With a modification to increase specificity for identifying malleolar fractures. Am J Sports Med 1998;26(2):158-65.
6. Kievit JVW, Dijkgraaf PB, Zwetsloot-Schonk JHN, Tholen B. Rapport AZL-CBO. Sturing van zorgverlening, kwaliteit en informatie. Resultaten en conclusies van twee uitgevoerde beleidswijzigingen in de praktijk. [Academic Hospital Leiden - Central Steering Institute. Steering of care, quality and information. Results and conclusions of two changes in the care process.] Utrecht: Nationaal Ziekenhuis Instituut; 1991. p. 19-35.
7. Tay SY, Thoo FL, Sitoh YY, Seow E, Wong HP. The Ottawa ankle rules in Asia: validating a clinical decision rule for requesting x-rays in twisting ankle and foot injuries. J Emerg Med 1999;17(6):945-7. doi:10.1016/S0736-4679(99)00120-1
8. van Riet YEA, van der Schouw YT, van de Werken C. Minder Rontgenfoto's en toch goede klinische zorg door geprotocolleerde fysische diagnostiek bij enkelletsels [Fewer x-rays while maintaining quality of clinical care using clinical protocols for physical diagnosis of ankle injuries]. Ned Tijdschr Geneeskd 2000;144(5):224-8.
9. Justice AC, Covinsky KE, Berlin JA. Assessing the generalizability of prognostic information. Ann Intern Med 1999;130(6):515-24. doi:10.7326/0003-4819-130-6-199903160-00016
10. Diamond GA. Future imperfect: the limitations of clinical prediction models and the limits of clinical prediction. J Am Coll Cardiol 1989;14(3 Suppl A):12A-22A. doi:10.1016/0735-1097(89)90157-5
11. Perry S, Raby N, Grant PT. Prospective survey to verify the Ottawa ankle rules. J Accid Emerg Med 1999;16(4):258-60.
12. Kelly AM, Richards D, Kerr L, Grant J, O'Donovan P, Basire K, et al. Failed validation of a clinical decision rule for the use of radiography in acute ankle injury. N Z Med J 1994;107(982):294-5.
13. Hancock DM, Heptinstall M, Old JM, Lobo FX, Contractor BR, Chaturvedi S, et al. Computer aided diagnosis of acute abdominal pain. The practical impact of a “theoretical” exercise. Theor Surg 1987;2(3):99-105.
14. Korner H, Sondenaa K, Soreide JA, Andersen E, Nysted A, Lende TH. Structured data collection improves the diagnosis of acute appendicitis. Br J Surg 1998;85(3):341-4. doi:10.1046/j.1365-2168.1998.00627.x
15. de Dombal FT, Dallos V, McAdam WA. Can computer aided teaching packages improve clinical care in patients with acute abdominal pain? BMJ 1991;302(6791):1495-7.
16. Diamond GA. What price perfection? Calibration and discrimination of clinical prediction models. J Clin Epidemiol 1992;45(1):85-9. doi:10.1016/0895-4356(92)90192-P
17. Hanley JA, McNeil BJ. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 1983;148(3):839-43. doi:10.1148/radiology.148.3.6878708
18. De Amici D, Klersy C, Ramajoli F, Brustia L, Politi P. Impact of the Hawthorne effect in a longitudinal clinical study: the case of anesthesia. Control Clin Trials 2000;21(2):103-14. doi:10.1016/S0197-2456(99)00054-9
19. Hallan S, Asberg A, Edna TH. Estimating the probability of acute appendicitis using clinical criteria of a structured record sheet: the physician against the computer. Eur J Surg 1997;163(6):427-32.
20. McLaughlin SA, Binder DS, Sklar DP. Ottawa ankle rules and the diabetic foot [letter]. Ann Emerg Med 1998;32(4):518.