

Handbook of Statistical Methods for Case-Control Studies
Ørnulf Borgan, Norman Breslow, Nilanjan Chatterjee, Mitchell H. Gail, Alastair Scott, Chris J. Wild.
Book Description
Handbook of Statistical Methods for Case-Control Studies is written by leading researchers in the field. It provides an in-depth treatment of up-to-date and currently developing statistical methods for the design and analysis of case-control studies, as well as a review of classical principles and methods. The handbook is designed to serve as a reference text for biostatisticians and quantitatively-oriented epidemiologists who are working on the design and analysis of case-control studies or on related statistical methods research. Though not specifically intended as a textbook, it may also be used as a backup reference text for graduate level courses.
Book Sections
- Classical designs and causal inference, measurement error, power, and small-sample inference
- Designs that use full-cohort information
- Time-to-event data
- Genetic epidemiology
About the Editors
- Ørnulf Borgan is Professor of Statistics, University of Oslo. His book with Andersen, Gill and Keiding on counting processes in survival analysis is a world classic.
- Norman E. Breslow was, at the time of his death, Professor Emeritus in Biostatistics, University of Washington. For decades, his book with Nick Day has been the authoritative text on case-control methodology.
- Nilanjan Chatterjee is Bloomberg Distinguished Professor, Johns Hopkins University. He leads a broad research program in statistical methods for modern large scale biomedical studies.
- Mitchell H. Gail is a Senior Investigator at the National Cancer Institute. His research includes modeling absolute risk of disease, intervention trials, and statistical methods for epidemiology.
- Alastair Scott was, at the time of his death, Professor Emeritus of Statistics, University of Auckland. He was a major contributor to using survey sampling methods for analyzing case-control data.
- Chris J. Wild is Professor of Statistics, University of Auckland. His research includes nonlinear regression and methods for fitting models to response-selective data.
Table of Contents
Introduction. Introduction. Origins. Classical Case-Control Studies. Design issues in case-control studies. Basic concepts and methods of analysis. Matched samples. Beyond logistic regression. Small sample methods. Multiple case or control groups. Power and sample size. Causal inference. Misclassification and measurement error. Analysis of secondary phenotype under case-control design. Sampling from a Defined Cohort. Two and three (or multi) phase sampling designs. Calibration and estimation of sampling weights. Maximum likelihood. Re-use of case-control samples. Misspecification. Case-control studies with complex sampling. Cohort sampling for time to event data. Case-cohort designs and analyses. Design options and partial likelihood analyses of nested case-control data. Inverse probability weighting in nested case-control studies. Multiple imputation. Maximum likelihood. Self controlled case series. Genetic Epidemiology. Basic design and association analysis of population-based case-control studies. Analysis of gene-environment interactions. Screening methods for detecting genetic association and interactions under case-control design. Analysis of family-based case-control studies. Fitting mixed model to case-control genome-wide association studies. Analysis of secondary phenotype under case-control design.

Statistical analyses of case-control studies

Introduction.
A case-control study is used to determine whether an exposure is linked to a particular outcome (i.e., a disease or condition of interest). Case-control research is retrospective by definition, since it starts with an outcome and then looks back at exposures. The investigator already knows each participant's outcome when participants are enrolled in their respective groups. This, and not the frequent use of previously gathered data, is what makes case-control studies retrospective. This article discusses statistical analysis in case-control studies.
Advantages and Disadvantages of Case-Control Studies

Study Design
Participants in a case-control study are selected according to their outcome status. Some individuals have the outcome of interest (cases), while others do not (controls). The investigator then assesses exposure in both groups. Consequently, in case-control research the outcome must already have occurred in at least some individuals. Thus, as shown in Figure 1, some study participants have the outcome at enrolment and others do not.

Figure 1. Example of a case-control study [1]
Selection of case
The cases should be defined as precisely as feasible by the investigator. A disease’s definition may be based on many criteria at times; hence, all aspects should be fully specified in the case definition.
Selection of a control
Controls that are comparable to the cases in a variety of ways should be chosen. The matching criteria are the parameters (e.g., age, sex, and hospitalization time) used to establish how controls and cases should be similar. For instance, it would be unfair to compare patients with elective intraocular surgery to a group of controls with traumatic corneal lacerations. Another key feature of a case-control study is that the exposure in both cases and controls should be measured equally.
Although controls should resemble cases in many respects, it is possible to over-match. Over-matching can make it harder to find enough controls. Furthermore, once a variable is used for matching, it cannot be analyzed as a risk factor. Enrolling more than one control per case is an effective way to increase the power of a study. However, including more than two controls per case adds little statistical value.
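The diminishing return from extra controls can be illustrated with a common rule of thumb that is not taken from this article: near the null hypothesis, a design with m controls per case has roughly m / (m + 1) of the efficiency of a design with an unlimited number of controls. The function name below is hypothetical and the formula is an approximation, so treat this as a sketch only.

```python
def relative_efficiency(m):
    """Approximate efficiency of m controls per case relative to an
    unlimited number of controls, valid near the null: m / (m + 1)."""
    if m < 1:
        raise ValueError("m must be at least 1 control per case")
    return m / (m + 1)

# Each additional control buys less than the previous one.
for m in (1, 2, 3, 4, 5, 10):
    print(f"{m:>2} controls per case: relative efficiency {relative_efficiency(m):.3f}")
```

Under this approximation, going from one to two controls per case raises efficiency from one half to two thirds, while each further control adds progressively less.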
Data collection
Once the cases and controls have been precisely identified, decide on the data to be gathered; the same data must be obtained in the same way from both groups. If the search for primary risk factors is not conducted objectively, the study may suffer from researcher bias, especially because the outcome is already known. It is important to try to conceal the outcome from the person collecting risk factor data or interviewing patients, even though this is not always practicable. Patients may be asked about past exposures (such as smoking history, diet, use of conventional eye medications, and so on). For some people, recalling all of this information accurately may be difficult.
Furthermore, patients who have the outcome (cases) are more likely than controls to recall details of unfavourable experiences. This phenomenon is called recall bias. Any effort the researcher makes to reduce this form of bias benefits the study.
In the analysis, the frequency of each measured variable is computed in each of the two groups. Case-control studies use the odds ratio to measure the strength of the association between exposure and outcome. The odds ratio is the ratio of the odds of exposure in the case group to the odds of exposure in the control group. Calculating a confidence interval for each odds ratio is critical. A confidence interval that includes 1.0 indicates that the association between the exposure and the outcome could have arisen by chance alone and is not statistically significant. Without a confidence interval, an odds ratio is not particularly useful. These computations are typically done with statistical software. Because participants are selected by outcome rather than sampled from the population, case-control studies cannot give any information about the incidence or prevalence of a disease.
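As a minimal sketch of the calculation described above, the following computes an odds ratio from a 2x2 table together with a log-based (Woolf) 95 % confidence interval. The function name and the counts are hypothetical, and the Woolf interval is only one of several intervals in use; it is chosen here because it needs nothing beyond the four cell counts.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a: exposed cases      b: unexposed cases
    c: exposed controls   d: unexposed controls
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Woolf interval")
    or_hat = (a / b) / (c / d)  # odds of exposure in cases / odds of exposure in controls
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    lower = math.exp(math.log(or_hat) - z * se_log)
    upper = math.exp(math.log(or_hat) + z * se_log)
    return or_hat, lower, upper

# Hypothetical counts, for illustration only.
or_hat, lower, upper = odds_ratio_ci(a=40, b=60, c=20, d=80)
excludes_one = not (lower <= 1.0 <= upper)
print(f"OR = {or_hat:.2f}, 95% CI {lower:.2f} to {upper:.2f}, excludes 1.0: {excludes_one}")
```

If the interval excludes 1.0, the association is statistically significant at the corresponding level; if it includes 1.0, it is not.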
Risk Factors and Sampling
Case-control studies can also be used to investigate risk factors for a rare disease. Cases might be obtained from hospital records. Patients who present to the hospital, however, may not be typical of the general community. Selecting an appropriate control group can also be challenging. Patients from the same hospital who do not have the outcome are a common source of controls. However, hospitalized patients may not reflect the broader population; they are more likely to have health problems and to use the healthcare system.
Recent research on case-control studies using statistical analyses
i) Risk factors related to multiple sclerosis in Kuwait
This matched case-control study in Kuwait examined the relationship between multiple sclerosis (MS) risk and several variables: family history, stressful life events, tobacco smoke exposure, vaccination history, and comorbidity [2]. Cases were recruited from Ibn Sina Hospital’s neurology clinics and the Dasman Diabetes Institute’s MS clinic. Controls were chosen from among Kuwait University’s faculty and students. A generalized questionnaire was used to collect data on socio-demographic, possibly genetic, and environmental factors from each patient and his/her pair-matched control. Descriptive statistics were produced, including means and standard deviations for quantitative variables and frequencies for qualitative variables. Variables that were significantly (p ≤ 0.15) associated with MS status in the univariable conditional logistic regression analysis were evaluated for inclusion in the final multivariable conditional logistic regression model. Of the 112 MS patients invited, 110 (98.2 %) agreed to participate. Therefore, 110 MS patients and 110 controls were enrolled, with controls individually matched to cases (1:1) on age (±5 years), gender, and nationality (Fig. 1). The findings showed that a family history of MS was significantly associated with an increased risk of developing MS. In contrast, vaccination against influenza A and B viruses was associated with a significantly lower risk of MS.
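For 1:1 matched pairs and a single binary exposure, conditional logistic regression reduces to a simple calculation on the discordant pairs, which may help make the method concrete. The sketch below uses hypothetical counts (not the Kuwait study's data) and a hypothetical function name; a real analysis with several covariates would fit a conditional logistic model in statistical software.

```python
import math

def matched_pair_or(n_case_only, n_control_only, z=1.96):
    """Conditional odds ratio estimate for 1:1 matched pairs, binary exposure.

    n_case_only:    pairs in which only the case was exposed
    n_control_only: pairs in which only the control was exposed
    Concordant pairs carry no information about the odds ratio.
    """
    if n_case_only <= 0 or n_control_only <= 0:
        raise ValueError("both discordant counts must be positive")
    or_hat = n_case_only / n_control_only
    se_log = math.sqrt(1 / n_case_only + 1 / n_control_only)
    lower = math.exp(math.log(or_hat) - z * se_log)
    upper = math.exp(math.log(or_hat) + z * se_log)
    # McNemar chi-square statistic (1 degree of freedom, no continuity correction)
    mcnemar = (n_case_only - n_control_only) ** 2 / (n_case_only + n_control_only)
    return or_hat, lower, upper, mcnemar

# Hypothetical discordant-pair counts, for illustration only.
or_hat, lower, upper, mcnemar = matched_pair_or(n_case_only=30, n_control_only=10)
print(f"matched OR = {or_hat:.2f}, 95% CI {lower:.2f} to {upper:.2f}, McNemar = {mcnemar:.1f}")
```

Ignoring the matching and analysing the pairs as two independent groups would generally bias the odds ratio toward 1, which is why matched designs call for a conditional analysis.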

Figure 1. Flow chart on the enrollment of the MS cases and controls [1]
ii) Relation between periodontitis and COVID-19 infection
COVID-19 is associated with a heightened inflammatory response, which can be fatal. Periodontitis is characterized by systemic inflammation. In Qatar, patients with COVID-19 were identified from the national electronic health records of Hamad Medical Corporation (HMC). Patients with COVID-19 complications (death, ICU admission, or assisted ventilation) were classified as cases, while COVID-19 patients discharged without major complications were classified as controls. Controls were not matched to cases because all available controls were included in the analysis. Periodontal status was assessed using dental radiographs from the same database. The associations between periodontitis and COVID-19 complications were investigated using logistic regression models adjusted for demographic, medical, and behavioural variables. Of the 568 participants, 258 had periodontitis. COVID-19 complications occurred in 33 of the 258 patients with periodontitis, compared with 7 of the 310 patients without periodontitis. Table 2 shows the unadjusted and adjusted odds ratios and 95 % confidence intervals for the association between periodontitis and COVID-19 complications. Periodontitis was significantly associated with a higher risk of COVID-19 complications, including ICU admission, the need for assisted ventilation, and death, as well as with higher blood levels of markers linked to a poor COVID-19 outcome, such as D-dimer, WBC, and CRP.
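The counts quoted above are enough to rebuild the crude 2x2 table, taking them as 258 patients with periodontitis (33 with complications) and 310 without (7 with complications), which is what the totals of 568 and 258 imply. The sketch below computes the crude odds ratio and a two-sided Fisher exact p-value from that table. It is only an illustration of the unadjusted association and does not reproduce the adjusted estimates reported in the paper's Table 2; the helper function is hypothetical.

```python
from math import comb

# Counts as read from the text (assumed: 258 with periodontitis, 310 without).
a, b = 33, 258 - 33   # periodontitis: complications, no complications
c, d = 7, 310 - 7     # no periodontitis: complications, no complications

crude_or = (a * d) / (b * c)

def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value: total probability of all tables with
    the same margins that are no more likely than the observed table."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k
        return comb(row1, k) * comb(row2, col1 - k) / total

    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    return sum(p for p in map(prob, range(k_min, k_max + 1)) if p <= p_obs * (1 + 1e-9))

p_value = fisher_two_sided(a, b, c, d)
print(f"crude OR = {crude_or:.2f}, Fisher exact p = {p_value:.2g}")
```

Because this calculation ignores the demographic, medical, and behavioural covariates, it should be expected to differ from the adjusted odds ratios obtained by logistic regression.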
Table 2. Associations between periodontal condition and COVID-19 complications [3]

iii) Menstrual, reproductive and hormonal factors and thyroid cancer
This study investigated the relationships between menstrual, reproductive, and hormonal factors and thyroid cancer incidence in a population of Chinese women. A 1:1 matched hospital-based case-control study was conducted in 7 counties of Zhejiang Province to investigate the associations of diabetes mellitus and other variables with thyroid cancer. Cases were eligible if they were diagnosed with primary thyroid cancer for the first time in a hospital between July 2015 and December 2017. The patients and controls were chosen at random. At enrollment, interviewers gathered all essential information face-to-face using a customized questionnaire. Baseline characteristics of the female participants were summarized using frequencies and percentages. Univariate conditional logistic regression models were used to examine the associations between individual variables and thyroid cancer. Four multivariable conditional logistic regression models, each adjusted for covariates, were then used to examine the relationships between menstrual, reproductive, and hormonal factors and thyroid cancer. In all, 2937 pairs of participants took part. The findings showed that a later age at first pregnancy and a longer duration of breastfeeding were significantly associated with a lower occurrence of thyroid cancer, which might shed light on the aetiology, monitoring, and prevention of thyroid cancer in Chinese women [4].
It is important to note that the term “case-control study” is commonly misunderstood. It is sometimes applied to a study that starts with a group of people exposed to something and a comparison group (control group) who have not been exposed, and then follows both groups over time to see what occurs. However, this is not a case-control study. Case-control studies are frequently seen as less valuable because they are retrospective. They can, however, be a highly effective way of detecting an association between an exposure and an outcome. In addition, they are sometimes the only ethical way to study an association. Case-control studies can provide useful information if definitions, controls, and the potential for bias are carefully considered.
[1] Setia, Maninder Singh. “Methodology Series Module 2: Case-control Studies.” Indian journal of dermatology vol. 61,2 (2016): 146-51. doi:10.4103/0019-5154.177773
[2] El-Muzaini, H., Akhtar, S. & Alroughani, R. A matched case-control study of risk factors associated with multiple sclerosis in Kuwait. BMC Neurol 20, 64 (2020). https://doi.org/10.1186/s12883-020-01635-1
[3] Marouf, Nadya, Wenji Cai, Khalid N. Said, Hanin Daas, Hanan Diab, Venkateswara Rao Chinta, Ali Ait Hssain, Belinda Nicolau, Mariano Sanz, and Faleh Tamimi. “Association between periodontitis and severity of COVID‐19 infection: A case–control study.” Journal of clinical periodontology 48, no. 4 (2021): 483-491.
[4] Wang, Meng, Wei-Wei Gong, Qing-Fang He, Ru-Ying Hu, and Min Yu. “Menstrual, reproductive and hormonal factors and thyroid cancer: a hospital-based case-control study in China.” BMC Women’s Health 21, no. 1 (2021): 1-8.

Valid statistical inference methods for a case-control study with missing data
Affiliations.
- 1 Department of Mathematics, South University of Science and Technology of China, Shenzhen City, Guangdong, P. R. China.
- 2 Department of Statistics and Actuarial Science, The University of Hong Kong, Pokfulam Road, Hong Kong, P. R. China.
- PMID: 27199234
- DOI: 10.1177/0962280216649619
The main objective of this paper is to derive the valid sampling distribution of the observed counts in a case-control study with missing data under the assumption of missing at random by employing the conditional sampling method and the mechanism augmentation method. The proposed sampling distribution, called the case-control sampling distribution, can be used to calculate the standard errors of the maximum likelihood estimates of parameters via the Fisher information matrix and to generate independent samples for constructing small-sample bootstrap confidence intervals. Theoretical comparisons of the new case-control sampling distribution with two existing sampling distributions exhibit a large difference. Simulations are conducted to investigate the influence of the three different sampling distributions on statistical inferences. One finding is that the conclusion by the Wald test for testing independency under the two existing sampling distributions could be completely different (even contradictory) from the Wald test for testing the equality of the success probabilities in control/case groups under the proposed distribution. A real cervical cancer data set is used to illustrate the proposed statistical methods.
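To make the idea of a bootstrap confidence interval concrete, the sketch below builds a percentile interval for the odds ratio by resampling cases and controls separately, with group sizes held fixed. It assumes complete data and simple binomial resampling, so it is not the case-control sampling distribution derived in the paper and does not handle missing values; the function name, counts, and seed are all hypothetical.

```python
import random

def bootstrap_or_ci(exp_cases, n_cases, exp_controls, n_controls,
                    n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the odds ratio, resampling cases and
    controls separately (group sizes fixed, as in case-control sampling)."""
    rng = random.Random(seed)
    p_case = exp_cases / n_cases
    p_ctrl = exp_controls / n_controls

    def odds_ratio(x1, n1, x0, n0):
        # 0.5 is added to every cell (Haldane-Anscombe correction) so the
        # ratio stays finite even if a resampled table contains a zero.
        return ((x1 + 0.5) * (n0 - x0 + 0.5)) / ((n1 - x1 + 0.5) * (x0 + 0.5))

    draws = []
    for _ in range(n_boot):
        x1 = sum(rng.random() < p_case for _ in range(n_cases))
        x0 = sum(rng.random() < p_ctrl for _ in range(n_controls))
        draws.append(odds_ratio(x1, n_cases, x0, n_controls))
    draws.sort()
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = int((1 - alpha / 2) * n_boot) - 1
    point = odds_ratio(exp_cases, n_cases, exp_controls, n_controls)
    return point, draws[lo_idx], draws[hi_idx]

# Hypothetical complete data: 60 of 100 cases and 40 of 100 controls exposed.
point, lower, upper = bootstrap_or_ci(60, 100, 40, 100)
print(f"OR = {point:.2f}, bootstrap 95% CI {lower:.2f} to {upper:.2f}")
```

The choice of sampling distribution from which the bootstrap samples are drawn is exactly what the paper argues matters once data are missing, so this complete-data version should be read only as background.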
Keywords: Bootstrap methods; Wald test; case–control study; missing at random; the mechanism augmentation method.
Statistical methods for case-control and case-cohort studies with possibly correlated failure time data
- March 21, 2019
- Affiliation: Gillings School of Global Public Health, Department of Biostatistics
- In large cohort studies, the major effort and cost typically arise from assembling covariate measurements. Case-control and case-cohort designs are widely used to reduce this cost while achieving the same goals, especially when the disease rate is low. In this dissertation, we consider analyzing the multivariate failure time data arising from case-control and case-cohort studies. First, we consider a case-control within cohort study with correlated failure times. A retrospective dental study was conducted to evaluate the effect of pulpal involvement on tooth survival (Caplan and Weintraub, 1997; Caplan et al., 2005). Due to the clustering of teeth, the survival times of the matched teeth within subjects could be correlated, and thus the statistical methods for conventional case-control studies cannot be directly applied. We study the marginal proportional hazards regression model for data from this type of study. Second, we consider a case-cohort study with multiple disease outcomes. A case-cohort design was implemented in the Busselton Health Study (Cullen, 1972), and it was of interest to study the relationship between serum ferritin and coronary heart disease and stroke events. Since times to coronary heart disease and stroke events observed in the same subject could be correlated, a valid statistical method needs to take this correlation into account. To this end, we consider a marginal proportional hazards regression model. Third, we consider a marginal additive hazards regression model for case-cohort studies with multiple disease outcomes. Most modern analyses of survival data focus on multiplicative models for relative risk using proportional hazards models. The additive hazards model, which models risk differences, has often been suggested as an alternative to the proportional hazards model.
In each of the three cases, we propose a weighted estimating equation approach for model parameter estimation, with different types of weights to enhance efficiency. The asymptotic properties of the proposed estimators are derived and their finite sample properties are assessed via simulation studies. The proposed methods are applied to the aforementioned dental study and the Busselton Health Study for illustration.
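For readers unfamiliar with the two model families named in the abstract, their usual textbook forms are sketched below. The notation is an assumption made here, not taken from the dissertation: i indexes subjects (or clusters), k indexes the failure type or cluster member, Z_{ik}(t) is the covariate vector, and \lambda_{0k}(t) is an unspecified baseline hazard. The dissertation's exact specification (for example, whether baselines are shared across k) may differ.

```latex
% Marginal proportional hazards model: covariates act multiplicatively,
% so exp(beta) is interpreted as a hazard ratio (relative risk).
\lambda_{ik}(t \mid Z_{ik}) = \lambda_{0k}(t)\,\exp\{\beta^{\top} Z_{ik}(t)\}

% Marginal additive hazards model: covariates act additively,
% so beta is interpreted as a hazard (risk) difference.
\lambda_{ik}(t \mid Z_{ik}) = \lambda_{0k}(t) + \beta^{\top} Z_{ik}(t)
```

In both cases the models are "marginal" in the sense that each failure time is modelled on its own, with the correlation between failure times from the same subject left unspecified and accounted for in the variance estimation.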
- August 2007
- https://doi.org/10.17615/yzjx-t288
- Dissertation
- In Copyright
- Cai, Jianwen
- University of North Carolina at Chapel Hill
- Open access

Methoden der Statistik und Informatik in Epidemiologie und Diagnostik, pp 97–109
Statistical Methods for Cohort and Case-Control Studies
- N. E. Breslow
- Conference paper
Part of the Medizinische Informatik und Statistik book series (MEDINFO,volume 40)
Traditional methods of occupational cohort analysis have used the standardized mortality ratio (SMR) as the fundamental measure of association between risk factor and disease. The SMR is shown here to result from maximum likelihood estimation in a multiplicative statistical model involving known national death rates. The same model permits regression analysis of variations in the SMR according to the intensity, type, or duration of exposure to environmental agents.
A second method of analysis (Cox, 1972) results when the underlying death rates are treated as an unknown nuisance function. Case-control sampling from the “risk sets” formed during analysis leads to a third technique which is computationally more efficient than the other two.
All three methods yield roughly equivalent measures of the relative risk of respiratory cancer associated with arsenic trioxide exposure among a cohort of Montana smelter workers. Questions of efficiency, bias and cost in the selection of a method of analysis are discussed.
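The first method in the abstract can be sketched numerically. If the observed number of deaths D is modelled as Poisson with mean θ·E, where E is the number of deaths expected from known reference rates, the maximum likelihood estimate of θ is D / E, which is the SMR. The strata, person-years, rates, and death count below are hypothetical and are not the Montana smelter data.

```python
import math

# Hypothetical strata (e.g. age bands): cohort person-years and known
# reference death rates per person-year. Illustrative numbers only.
person_years = [1200.0, 2500.0, 1800.0]
reference_rates = [0.001, 0.004, 0.010]
observed_deaths = 45

# Expected deaths if the cohort experienced the reference rates.
expected_deaths = sum(py * r for py, r in zip(person_years, reference_rates))
smr = observed_deaths / expected_deaths

def poisson_loglik(theta, d=observed_deaths, e=expected_deaths):
    """Log-likelihood of D ~ Poisson(theta * E), dropping the constant -log(d!)."""
    return d * math.log(theta * e) - theta * e

print(f"expected deaths = {expected_deaths:.1f}, SMR = {smr:.2f}")
print("log-likelihood at SMR is the largest of:",
      [round(poisson_loglik(t), 3) for t in (0.9 * smr, smr, 1.1 * smr)])
```

Evaluating the log-likelihood on either side of the SMR shows that it peaks there, which is the sense in which the SMR arises from maximum likelihood estimation in a multiplicative model.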
Research supported in part by USPHS grant 1 K07 CA00723 and the Alexander von Humboldt Foundation
Baker RJ and Nelder JA (1978). The GLIM System: Release 3, Oxford: Numerical Algorithms Group.
Berry G, Gilson JC, Holmes S, Lewisohn HC and Roach SA (1979). Asbestosis: a study of dose-response relationship in an asbestos textile factory. British Journal of Industrial Medicine 36, 98–112.
Breslow NE and Day NE (1980). Statistical Methods in Cancer Research I: The Analysis of Case-Control Studies. Lyon: IARC.
Cox DR (1972). Regression models and life tables (with discussion). Journal of the Royal Statistical Society Series B 34, 187–220.
Enterline PE (1976). Pitfalls in epidemiological research: an examination of the asbestos literature. Journal of Occupational Medicine 18, 150–156.
Fox AJ and Collier PF (1976). Low mortality rates in industrial cohort studies due to selection for work and survival in the industry. British Journal of Preventive and Social Medicine 30, 225–230.
Kalbfleisch JD and Prentice RL (1980). The Statistical Analysis of Failure Time Data. New York: Wiley.
Knox EG (1973). Computer simulation of industrial hazards. British Journal of Industrial Medicine 30, 54–63.
Lee AM and Fraumeni JF (1969). Arsenic and respiratory cancer in man. Journal of the National Cancer Institute 42, 1045–1052.
Lubin JH and Breslow NE (1983). Application of survival data methodology to occupational mortality studies. (Unpublished manuscript).
Mancuso TF and El-Attar AA (1967). Mortality pattern in a cohort of asbestos workers. Journal of Occupational Medicine 9, 147–162.
Mosteller F and Tukey JW (1977). Data Analysis and Regression. Reading: Addison-Wesley.
Prentice RL and Breslow NE (1978). Retrospective studies and failure time models. Biometrika 65, 153–158.
Rao CR (1965). Linear Statistical Inference and its Applications. New York: Wiley.
Yule GU (1934). On some points relating to vital statistics, more especially statistics of occupational mortality. Journal of the Royal Statistical Society 94, 1–84.
Author information
N. E. Breslow
- Department of Biostatistics, University of Washington, Seattle, USA
- Institute for Documentation, Information, and Statistics, German Cancer Research Center, Heidelberg, Germany
Editor information
J. Berger and K. H. Höhne
- Universitäts-Krankenhaus Eppendorf, Institut für Mathematik und Datenverarbeitung in der Medizin, Universität Hamburg, Martinistraße 52, 2000 Hamburg 20, Germany
Additional information
Dedicated to Professor Dr. Otto Westphal on the occasion of his 70th birthday.
© 1983 Springer-Verlag Berlin Heidelberg
Cite this paper.
Breslow, N.E. (1983). Statistical Methods for Cohort and Case-Control Studies. In: Berger, J., Höhne, K.H. (eds) Methoden der Statistik und Informatik in Epidemiologie und Diagnostik. Medizinische Informatik und Statistik, vol 40. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-81938-4_12
DOI: https://doi.org/10.1007/978-3-642-81938-4_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-12007-0
Online ISBN: 978-3-642-81938-4
eBook Packages: Springer Book Archive
Handbook of Statistical Methods for Case-Control Studies
Publisher description.
Handbook of Statistical Methods for Case-Control Studies is written by leading researchers in the field. It provides an in-depth treatment of up-to-date and currently developing statistical methods for the design and analysis of case-control studies, as well as a review of classical principles and methods. The handbook is designed to serve as a reference text for biostatisticians and quantitatively-oriented epidemiologists who are working on the design and analysis of case-control studies or on related statistical methods research. Though not specifically intended as a textbook, it may also be used as a backup reference text for graduate level courses. Book Sections Classical designs and causal inference, measurement error, power, and small-sample inference Designs that use full-cohort information Time-to-event data Genetic epidemiology About the Editors Ørnulf Borgan is Professor of Statistics, University of Oslo. His book with Andersen, Gill and Keiding on counting processes in survival analysis is a world classic. Norman E. Breslow was, at the time of his death, Professor Emeritus in Biostatistics, University of Washington. For decades, his book with Nick Day has been the authoritative text on case-control methodology. Nilanjan Chatterjee is Bloomberg Distinguished Professor, Johns Hopkins University. He leads a broad research program in statistical methods for modern large scale biomedical studies. Mitchell H. Gail is a Senior Investigator at the National Cancer Institute. His research includes modeling absolute risk of disease, intervention trials, and statistical methods for epidemiology. Alastair Scott was, at the time of his death, Professor Emeritus of Statistics, University of Auckland. He was a major contributor to using survey sampling methods for analyzing case-control data. Chris J. Wild is Professor of Statistics, University of Auckland. His research includes nonlinear regression and methods for fitting models to response-selective data.
More Books by Ornulf Borgan, Norman Breslow, Nilanjan Chatterjee, Mitchell H. Gail, Alastair Scott & Chris J. Wild
Other books in this series.
- - Google Chrome
Intended for healthcare professionals
- My email alerts
- BMA member login
- Username * Password * Forgot your log in details? Need to activate BMA Member Log In Log in via OpenAthens Log in via your institution

Search form
- Advanced search
- Search responses
- Search blogs
- Analysis of matched...
Analysis of matched case-control studies
- Neil Pearce, professor 1 2
- 1 Department of Medical Statistics and Centre for Global NCDs, Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London WC1E 7HT, UK
- 2 Centre for Public Health Research, Massey University, Wellington, New Zealand
- neil.pearce{at}lshtm.ac.uk
- Accepted 30 December 2015
There are two common misconceptions about case-control studies: that matching in itself eliminates (controls) confounding by the matching factors, and that if matching has been performed, then a “matched analysis” is required. However, matching in a case-control study does not control for confounding by the matching factors; in fact it can introduce confounding by the matching factors even when it did not exist in the source population. Thus, a matched design may require controlling for the matching factors in the analysis. However, it is not the case that a matched design requires a matched analysis. Provided that there are no problems of sparse data, control for the matching factors can be obtained, with no loss of validity and a possible increase in precision, using a “standard” (unconditional) analysis, and a “matched” (conditional) analysis may not be required or appropriate.
Summary points
Matching in a case-control study does not control for confounding by the matching factors
A matched design may require controlling for the matching factors in the analysis
However, it is not the case that a matched design requires a matched analysis
A “standard” (unconditional) analysis may be most valid and appropriate, and a “matched” (conditional) analysis may not be required or appropriate
Matching on factors such as age and sex is commonly used in case-control studies. 1 This can be done for convenience (eg, choosing a control admitted to hospital on the same day as the case), to improve study efficiency by improving precision (under certain conditions) when controlling for the matching factors (eg, age, sex) in the analysis, or to enable control in the analysis of unquantifiable factors such as neighbourhood characteristics (eg, by choosing neighbours as controls and then controlling for neighbourhood in the analysis). The increase in efficiency occurs because it ensures similar numbers of cases and controls in confounder strata. For example, in a study of lung cancer, if controls are sampled at random from the source population, their age distribution will be much younger than that of the lung cancer cases. Thus, when age is controlled in the analysis, the young age stratum may contain mostly controls and few cases, whereas the old age stratum may contain mostly cases and fewer controls. Thus, statistical precision may be improved if controls are age matched to ensure roughly equal numbers of cases and controls in each age stratum.
There are two common misconceptions about case-control studies: that matching in itself eliminates confounding by the matching factors; and that if matching has been performed, then a “matched analysis” is required.
Matching in the design does not control for confounding by the matching factors. In fact, it can introduce confounding by the matching factors even when it did not exist in the source population. 1 The reasons for this are complex and will only be discussed briefly here. In essence, the matching process makes the controls more similar to the cases not only for the matching factor but also for the exposure itself. This introduces a bias that needs to be controlled in the analysis. For example, suppose we were conducting a case-control study of poverty and death (from any cause), and we chose siblings as controls (that is, for each person who died, we matched on family or residence by choosing a sibling who was still alive as a control). In this situation, since poverty runs in families we would tend to select a disadvantaged control for each disadvantaged person who had died and a wealthy control for each wealthy person who had died. We would find roughly equal percentages of disadvantaged people among the cases and controls, and we would find little association between poverty and mortality. The matching has introduced a bias, which fortunately (as we will illustrate) can be controlled by controlling for the matching factor in the analysis.
Thus, a matched design will (almost always) require controlling for the matching factors in the analysis. However, this does not necessarily mean that a matched analysis is required or appropriate, and it will often be sufficient to control for the matching factors using simpler methods. Although this is well recognised in both recent 2 3 and historical 4 5 texts, other texts 6 7 8 9 do not discuss this issue and present the matched analysis as the only option for analysing matched case-control studies. In fact, the more standard analysis may not only be valid but may be much easier in practice, and yield better statistical precision.
In this paper I explore and illustrate these problems using a hypothetical pair matched case-control study.
Options for analysing case-control studies
Unmatched case-control studies are typically analysed using the Mantel-Haenszel method 10 or unconditional logistic regression. 4 The former involves the familiar method of producing a 2×2 (exposure-disease) stratum for each level of the confounder (eg, if there are five age groups and two sex groups, then there will be 10 2×2 tables, each showing the association between exposure and disease within a particular stratum), and then producing a summary (average) effect across the strata. The Mantel-Haenszel estimates are robust and not affected by small numbers in specific strata (provided that the overall numbers of exposed or non-exposed cases or controls are adequate), although it can be difficult or impossible to control for factors other than the matching factors if some strata involve small numbers (eg, just one case and one control). Furthermore, the Mantel-Haenszel approach works well when there are only a few confounder strata, but will experience problems of small numbers (eg, strata with only cases and no controls) if there are too many confounders to adjust for. In this situation, logistic regression may be preferred, since this uses maximum likelihood methods, which enable the adjustment (given certain assumptions) of more confounders.
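For reference, the Mantel-Haenszel summary odds ratio described above can be written compactly. The notation is an assumption introduced here rather than taken from the article: in stratum i, a_i and b_i are the exposed and unexposed cases, c_i and d_i are the exposed and unexposed controls, and n_i = a_i + b_i + c_i + d_i.

```latex
\widehat{OR}_{MH} \;=\; \frac{\sum_i a_i d_i / n_i}{\sum_i b_i c_i / n_i}
```

When every stratum is a single case-control pair (n_i = 2), only pairs that are discordant for exposure contribute, and the estimate reduces to the number of pairs in which only the case is exposed divided by the number in which only the control is exposed.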
Suppose that for each case we have chosen a control who is in the same five year age group (eg, if the case is aged 47 years, then a control is chosen who is aged 45-49 years). We can then perform a standard analysis, which adjusts for the matching factor (age group) by grouping all cases and controls into five year age groups and using unconditional logistic regression 4 (or the Mantel-Haenszel method 10 ); if there are eight age groups then this analysis will just have eight strata (represented by seven age group dummy variables), each with multiple cases and controls. Alternatively we can perform a matched analysis (that is, retaining the pair matching of one control for each case) using conditional logistic regression (or the matched data methods, which are equivalent to the Mantel-Haenszel method); if there are 100 case-control pairs, this analysis will then have 100 strata.
The main reason for using conditional (rather than unconditional) logistic regression is that when the analysis strata are very small (eg, with just one case and one control for each stratum), problems of sparse data will occur with unconditional methods. 11 For example, if there are 100 strata, this requires 99 dummy variables to represent them, even though there are only 200 study participants. In this extreme situation, unconditional logistic regression is biased and produces an odds ratio estimate that is the square of the conditional (true) estimate of the odds ratio. 5 12
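The squaring effect can be checked numerically. The following is a minimal sketch, assuming a single binary exposure, 1:1 matched pairs, and an unconditional logistic model with one intercept per pair; the discordant pair counts (40 and 20) are hypothetical, and the golden-section maximiser is a helper written for this illustration only. Concordant pairs are omitted because, once their own intercept is maximised out, they contribute a constant that does not depend on the exposure coefficient.

```python
import math

def log_expit(x):
    """Log of the logistic function, log(1 / (1 + exp(-x)))."""
    return -math.log1p(math.exp(-x))

def golden_max(f, lo, hi, iters=80):
    """Return the argmax of a unimodal function f on [lo, hi]."""
    g = (math.sqrt(5) - 1) / 2
    x1, x2 = hi - g * (hi - lo), lo + g * (hi - lo)
    f1, f2 = f(x1), f(x2)
    for _ in range(iters):
        if f1 < f2:                      # maximum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + g * (hi - lo)
            f2 = f(x2)
        else:                            # maximum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - g * (hi - lo)
            f1 = f(x1)
    return (lo + hi) / 2

def pair_profile(beta, case_exposed):
    """Log-likelihood of one discordant pair, maximised over its own intercept."""
    if case_exposed:   # case exposed, control unexposed
        f = lambda alpha: log_expit(alpha + beta) + log_expit(-alpha)
    else:              # case unexposed, control exposed
        f = lambda alpha: log_expit(alpha) + log_expit(-(alpha + beta))
    return f(golden_max(f, -10.0, 10.0))

def unconditional_or(n_case_only, n_control_only):
    """Odds ratio from the unconditional fit with one intercept per pair."""
    profile = lambda beta: (n_case_only * pair_profile(beta, True)
                            + n_control_only * pair_profile(beta, False))
    return math.exp(golden_max(profile, -5.0, 5.0))

n_case_only, n_control_only = 40, 20           # hypothetical discordant pairs
conditional = n_case_only / n_control_only     # conditional (matched) estimate
unconditional = unconditional_or(n_case_only, n_control_only)
print(conditional, unconditional)
```

Under these assumptions the unconditional estimate comes out as the square of the conditional one, which is the sparse data bias described in the text.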
Example of age matching
Table 1 gives an example of age matching in a population based case-control study, and shows the “true” findings for the total population, the findings for the corresponding unmatched case-control study, and the findings for an age matched case-control study using the standard analysis. Table 2 presents the findings for the same age matched case-control study using the matched analysis. All analyses were performed using the Mantel-Haenszel method, but this yields similar results to the corresponding (unconditional or conditional) logistic regression analyses.
Table 1. Hypothetical study population and case-control study with unmatched and matched standard analyses
Table 2. Hypothetical matched case-control study with matched analysis
Table 1 shows that the crude odds ratio in the total population is 0.86 (0.70 to 1.05), but this changes to 2.00 (1.59 to 2.51) when the analysis is adjusted for age (using the Mantel-Haenszel method). This occurs because there is strong confounding by age: the cases are mostly old, and old people have a lower exposure than young people. Overall, there are 390 cases, and when 390 controls are selected at random from the non-cases in the total population (which is half exposed and half not exposed), this yields the same crude (0.86) and adjusted (2.00) odds ratios, but with wider confidence intervals, reflecting the smaller numbers of non-cases (controls) in the case-control study.
Why matching factors need to be controlled in the analysis
Now suppose that we reconduct the case-control study, matching for age, using two very broad age groups: old and young (table 1). The numbers of cases and controls in each age group are now equal. However, the crude odds ratio (1.68, 1.25 to 2.24) is different from both the crude (0.86) and the adjusted (2.00) odds ratios in the total population. In contrast, the adjusted odds ratio (2.00) is the same as that in the total population and in the unmatched case-control study (both of these adjusted odds ratios were estimated using the standard approach). Thus, matching has not removed age confounding and it is still necessary to control for age (this occurs because the matching process in a case-control study changes the association between the matching factor and the outcome and can create an association even if there were none before the matching was conducted). However, there is a small increase in precision in the matched case-control study compared with the unmatched case-control study (95% confidence intervals of 1.42 to 2.81 compared with 1.38 to 2.89) because there are now equal numbers of cases and controls in each age group (table 1).
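The pattern described above (a crude odds ratio that changes under matching while the age adjusted odds ratio stays put) can be reproduced with a small calculation. This is a sketch using hypothetical expected counts, constructed so that the odds ratio is 2 within each age group; they are not taken from table 1, although they give crude values close to those quoted in the text. The function names are introduced here for illustration.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from exposed/unexposed cases (a, b) and controls (c, d)."""
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio over a list of (a, b, c, d) strata."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

def collapse(strata):
    """Add the strata together, ignoring the stratifying factor (crude table)."""
    return tuple(sum(column) for column in zip(*strata))

# Hypothetical counts per age group, in the order
# (exposed cases, unexposed cases, exposed controls, unexposed controls).
# Unmatched: 390 controls drawn at random (half young, half old).
unmatched = [(80, 10, 156, 39),    # young
             (100, 200, 39, 156)]  # old
# Age matched: as many controls as cases within each age group.
matched = [(80, 10, 72, 18),       # young
           (100, 200, 60, 240)]    # old

for label, strata in (("unmatched", unmatched), ("matched", matched)):
    crude = odds_ratio(*collapse(strata))
    adjusted = mantel_haenszel_or(strata)
    print(f"{label}: crude {crude:.2f}, age adjusted {adjusted:.2f}")
```

In both designs the age adjusted estimate is 2, while the crude estimate depends on how the controls were sampled, which is why the matching factor still has to be controlled in the analysis.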
A pair matched study does not necessarily require a pair matched analysis
However, control for simple matching factors such as age does not require a pair matched analysis. Table 2 gives the findings that would have been obtained from a pair matched analysis (this is created by assuming that in each age group, and for each case, the control was selected at random from all non-cases in the same age group). The standard adjusted (Mantel-Haenszel) analysis (table 1) yields an odds ratio of 2.00 (95% confidence interval 1.42 to 2.81); the matched analysis (table 2) yields the same odds ratio (2.00) but with a slightly wider confidence interval (1.40 to 2.89).
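A pair matched analysis of a single binary exposure uses only the exposure-discordant pairs. The sketch below assumes the usual large-sample (Wald) interval on the log scale, which may not be the interval method used for table 2, and the pair counts are hypothetical rather than taken from that table.

```python
import math

def matched_pair_or(case_only_exposed, control_only_exposed, z=1.96):
    """Conditional odds ratio and Wald interval from discordant pair counts.

    case_only_exposed: pairs in which only the case is exposed.
    control_only_exposed: pairs in which only the control is exposed.
    Concordant pairs carry no information and are not needed.
    """
    if case_only_exposed <= 0 or control_only_exposed <= 0:
        raise ValueError("both discordant pair counts must be positive")
    or_hat = case_only_exposed / control_only_exposed
    se_log = math.sqrt(1 / case_only_exposed + 1 / control_only_exposed)
    lower = math.exp(math.log(or_hat) - z * se_log)
    upper = math.exp(math.log(or_hat) + z * se_log)
    return or_hat, lower, upper

# Hypothetical example: 60 pairs with only the case exposed, 30 with only the control.
or_hat, lower, upper = matched_pair_or(60, 30)
print(f"odds ratio {or_hat:.2f} ({lower:.2f} to {upper:.2f})")
```

Because concordant pairs drop out, the width of the interval depends only on the number of discordant pairs, which is one way to see why the pair matched interval can be wider than the interval from the standard stratified analysis.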
Advantages of the standard analysis
So for many matched case-control studies, we have a choice of doing a standard analysis or a matched analysis. In this situation, there are several possible advantages of using the standard approach.
The standard analysis can actually yield slightly better statistical precision. 13 This may apply, for example, if two or more cases and their matched controls all have identical values for their matching factors; combining them into a single stratum then produces an estimator with lower variance and no less validity 14 (as indicated by the slightly narrower confidence interval for the standard adjusted analysis (table 1) compared with the pair matched analysis (table 2)). This occurs in particular because combining strata with identical values for the matching factors (eg, if two case-control pairs all concern women aged 55-59 years) may mean that fewer data are discarded (that is, do not contribute to the analysis) because of strata where the case and control have the same exposure status. Further gains in precision may be obtained if combining strata means that cases with no corresponding control (or controls without a corresponding case) can be included in the analysis. When such strata are combined, a conditional analysis may still be required if the resulting strata are still “small,” 13 but an unconditional analysis will be valid and yield similar findings if the resulting strata are sufficiently large. This may often be the case when matching has only been performed on standard factors such as sex and age group.
The standard analysis may also enhance the clarity of the presentation, particularly when analysing subgroups of cases and controls selected for variables on which they were not matched, since it involves standard 2×2 tables for each subgroup. 15
A further advantage of the standard analysis is that it makes it easier to combine different datasets that have involved matching on different factors (eg, if some have matched for age, some for age and sex, and some for nothing, then all can be combined in an analysis adjusting for age, sex, and study centre). In contrast, one multicentre study 16 (of which I happened to be a coauthor) attempted to (unnecessarily) perform a matched analysis across centres. Because not all centres had used pair matching, this involved retrospective pair matching in those centres that had not matched as part of the study design. This resulted in the unnecessary discarding of the unmatched controls, thus resulting in a likely loss of precision.
Conclusions
If matching is carried out on a particular factor such as age in a case-control study, then controlling for it in the analysis must be considered. This control should involve just as much precision as was used in the original matching 14 (eg, if exact age in years was used in the matching, then exact age in years should be controlled for in the analysis), although in practice such rigorous precision may not always be required (eg, five year age groups may suffice to control confounding by age, even if age matching was done more precisely than this). In some circumstances, this control may make no difference to the main exposure effect estimate—eg, if the matching factor is unrelated to exposure. However, if there is an association between the matching factor and the exposure, then matching will introduce confounding that needs to be controlled for in the analysis.
So when is a pair matched analysis required? The answer is, when the matching was genuinely at (or close to) the individual level. For example, if siblings have been chosen as controls, then each stratum would have just one case and the sibling control; in this situation, an unconditional logistic regression analysis would suffer from problems of sparse data, and conditional logistic regression would be required. Similar situations might arise if controls were neighbours or from the same general practice (if each general practice only had one or a few cases), or if matching was performed on many factors simultaneously so that most strata (in the standard analysis) had just one case and one control.
Provided, however, that there are no problems of sparse data, such control for the matching factors can be obtained using an unconditional analysis, with no loss of validity and a possible increase in precision.
Thus, a matched design will (nearly always) require controlling for the matching factors in the analysis. It is not the case, however, that a matched design requires a matched analysis.
I thank Simon Cousens, Deborah Lawlor, Lorenzo Richiardi, and Jan Vandenbroucke for their comments on the draft manuscript. The Centre for Global NCDs is supported by the Wellcome Trust Institutional Strategic Support Fund, 097834/Z/11/B.
Competing interests: I have read and understood the BMJ policy on declaration of interests and declare the following: none.
Provenance and peer review: Not commissioned; externally peer reviewed.
This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/ .
1. Rothman KJ, Greenland S, Lash TL, eds. Design strategies to improve study accuracy. Modern epidemiology. 3rd ed. Lippincott Williams & Wilkins, 2008.
2. Rothman KJ. Epidemiology: an introduction. Oxford University Press, 2012.
3. Rothman KJ, Greenland S, Lash TL, eds. Modern epidemiology. 3rd ed. Lippincott Williams & Wilkins, 2008.
4. Breslow NE, Day NE. Statistical methods in cancer research. Vol I: the analysis of case-control studies. IARC, 1980.
5. Kleinbaum DG, Kupper LL, Morgenstern H. Epidemiologic research: principles and quantitative methods. Lifetime Learning Publications, 1982.
6. Dos Santos Silva I. Cancer epidemiology: principles and methods. IARC, 1999.
7. Keogh RH, Cox DR. Case-control studies. Cambridge University Press, 2014. doi:10.1017/CBO9781139094757.
8. Lilienfeld DE, Stolley PD. Foundations of epidemiology. 3rd ed. Oxford University Press, 1994.
9. MacMahon B, Trichopoulos D. Epidemiology: principles and methods. 2nd ed. Little Brown, 1996.
10. Mantel N, Haenszel W. Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst 1959;22:719-48. PMID 13655060.
11. Robins J, Greenland S, Breslow NE. A general estimator for the variance of the Mantel-Haenszel odds ratio. Am J Epidemiol 1986;124:719-23. PMID 3766505.
12. Pike MC, Hill AP, Smith PG. Bias and efficiency in logistic analyses of stratified case-control studies. Int J Epidemiol 1980;9:89-95. doi:10.1093/ije/9.1.89. PMID 7419334.
13. Brookmeyer R, Liang KY, Linet M. Matched case-control designs and overmatched analyses. Am J Epidemiol 1986;124:693-701. PMID 3752063.
14. Greenland S. Applications of stratified analysis methods. In: Rothman KJ, Greenland S, Lash TL, eds. Modern epidemiology. 3rd ed. Lippincott Williams & Wilkins, 2008.
15. Vandenbroucke JP, Koster T, Briët E, Reitsma PH, Bertina RM, Rosendaal FR. Increased risk of venous thrombosis in oral-contraceptive users who are carriers of factor V Leiden mutation. Lancet 1994;344:1453-7. doi:10.1016/S0140-6736(94)90286-0. PMID 7968118.
16. Cardis E, Richardson L, Deltour I, et al. The INTERPHONE study: design, epidemiological methods, and description of the study population. Eur J Epidemiol 2007;22:647-64. doi:10.1007/s10654-007-9152-z. PMID 17636416.
17. Mansournia MA, Hernán MA, Greenland S. Matched designs and causal diagrams. Int J Epidemiol 2013;42:860-9. doi:10.1093/ije/dyt083. PMID 23918854.
18. Hernán MA, Hernández-Díaz S, Robins JM. A structural approach to selection bias. Epidemiology 2004;15:615-25. doi:10.1097/01.ede.0000135174.63482.43. PMID 15308962.
Handbook of Statistical Methods for Case-Control Studies (Chapman & Hall/CRC Handbooks of Modern Statistical Methods) 1st Edition
- ISBN-10 0367571374
- ISBN-13 978-0367571375
- Edition 1st
- Publisher Routledge
- Publication date June 30, 2020
- Part of series Chapman & Hall/CRC Handbooks of Modern Statistical Methods
- Language English
- Dimensions 7.01 x 1.25 x 10 inches
- Print length 536 pages
Editorial Reviews
"This book is essential reading and reference for any statistical methodologist with interest in case-control studies...This book is a very good place to start on the next leg of our statistical journey in this field." ~Nicholas P. Jewell , ISCB Newsletter
" . . . as a handbook, it is designed to address specific methodological issues, more like a toolbox. And this is done well. All chapters come with an introduction and a worked example using sample data, with ample reference to further details. Occasional chapters on unconventional study designs provide food for thought. Overall, the book is well written and very comprehensive; it provides help for many situations, and for situations of greater complexity it points to further references." ~Anika Hüsing, Biometrical Journal
Handbook of Statistical Methods for Case-Control Studies
The Handbook of Statistical Methods for Case-Control Studies is written by leading researchers in the field and published by Chapman & Hall/CRC Press (2018). The handbook provides an in-depth treatment of up-to-date and currently developing statistical methods for the design and analysis of case-control studies, as well as a review of classical principles and methods. The handbook is designed to serve as a reference text for biostatisticians and quantitatively-oriented epidemiologists who are working on the design and analysis of case-control studies or on related statistical methods research.
This website provides supplementary materials for some of the chapters of the handbook. The website is maintained by Ørnulf Borgan.

SUPPLEMENTARY MATERIALS
Chapter 8: Small Sample Methods
Chapter 12: Multi-Phase Sampling
Chapter 13: Calibration in Case-Control Studies
Chapter 17: Survival Analysis of Case-Control Data: A Sample Survey Approach
Chapter 18: Nested Case-Control Studies: A Counting Process Approach
Chapter 19: Inverse Probability Weighting in Nested Case-Control Studies
Chapter 20: Multiple Imputation for Sampled Cohort Data
Chapter 21: Maximum Likelihood Estimation for Case-Cohort and Nested Case-Control Studies
Chapter 22: The Self-Controlled Case Series Method
- Share on Facebook
- Share on Twitter

- Try the new Google Books
- Advanced Book Search
Get this book in print
- Barnes&Noble.com
- Books-A-Million
- Find in a library
- All sellers »
What people are saying - Write a review
Other editions - view all, about the author (2020).
Ørnulf Borgan is Professor of Statistics, University of Oslo. His book with Andersen, Gill and Keiding on counting processes in survival analysis is a world classic.
Norman E. Breslow was, at the time of his death, Professor Emeritus in Biostatistics, University of Washington. For decades, his book with Nick Day has been the authoritative text on case-control methodology.
Nilanjan Chatterjee is Bloomberg Distinguished Professor, Johns Hopkins University. He leads a broad research program in statistical methods for modern large scale biomedical studies.
Mitchell H. Gail is a Senior Investigator at the National Cancer Institute. His research includes modeling absolute risk of disease, intervention trials, and statistical methods for epidemiology.
Alastair Scott was, at the time of his death, Professor Emeritus of Statistics, University of Auckland. He was a major contributor to using survey sampling methods for analyzing case-control data.
Chris J. Wild is Professor of Statistics, University of Auckland. His research includes nonlinear regression and methods for fitting models to response-selective data.
Bibliographic information
Uh-oh, it looks like your Internet Explorer is out of date. For a better shopping experience, please upgrade now.
Javascript is not enabled in your browser. Enabling JavaScript in your browser will allow you to experience all the features of our site. Learn how to enable JavaScript on your browser

Handbook of Statistical Methods for Case-Control Studies

- Ship This Item — Qualifies for Free Shipping
About the Editors
List of Contributors
I Introduction
1 Origins of the Case-Control Study. Norman E. Breslow, Noel Weiss
2 Design Issues in Case-Control Studies. Duncan C. Thomas
II Classical Case-Control Studies
3 Basic Concepts and Analysis. Barbara McKnight
4 Matched Case-Control Studies. Barbara McKnight
5 Multiple Case or Control Groups. Barbara McKnight
6 Causal Inference from Case-Control Studies. Vanessa Didelez, Robin J. Evans
7 The Case-Crossover Study Design in Epidemiology. Joseph A. "Chris" Delaney, Samy Suissa
8 Small Sample Methods. Jinko Graham, Brad McNeney, Robert Platt
9 Power and Sample Size for Case-Control Studies. Mitchell H. Gail, Sebastien Haneuse
10 Measurement Error and Case-Control Studies. Raymond J. Carroll
III Case-Control Studies that Use Full-Cohort Information
11 Alternative Formulation of Models in Case-Control Studies. William E. Barlow, John B. Cologne
12 Multi-Phase Sampling. Gustavo Amorim, Alastair J. Scott, Chris J. Wild
13 Calibration in Case-Control Studies. Thomas Lumley
14 Secondary Analysis of Case-Control Data. Chris J. Wild
15 Response Selective Study Designs Using Existing Longitudinal Cohorts. Paul J. Rathouz, Jonathan S. Schildcrout, Leila R. Zelnick, Patrick J. Heagerty
IV Case-Control Studies for Time-to-Event Data
16 Cohort Sampling for Time-to-Event Data: An Overview. Ørnulf Borgan, Sven Ove Samuelsen
17 Survival Analysis of Case-Control Data: A Sample Survey Approach. Norman E. Breslow, Jie Kate Hu
18 Nested Case-Control Studies: A Counting Process Approach. Ørnulf Borgan
19 Inverse Probability Weighting in Nested Case-Control Studies. Sven Ove Samuelsen, Nathalie Støer
20 Multiple Imputation for Sampled Cohort Data. Ruth H. Keogh
21 Maximum Likelihood Estimation for Case-Cohort and Nested Case-Control Studies. Donglin Zeng, Dan-Yu Lin
22 The Self-Controlled Case Series Method. Paddy Farrington, Heather Whitaker
V Case-Control Studies in Genetic Epidemiology
23 Case-Control Designs for Modern Genome-Wide Association Studies: Basic Principles and Overview. Nilanjan Chatterjee
24 Analysis of Gene-Environment Interactions. Summer S. Han, Raymond J. Carroll, Nilanjan Chatterjee
25 Two-Stage Testing for Genome-Wide Gene-Environment Interactions. James Y. Dai, Li Hsu, Charles Kooperberg
26 Family-Based Case-Control Approaches to Study the Role of Genetics. Clarice R. Weinberg, Min Shi, David M. Umbach
27 Mixed Models for Case-Control Genome-Wide Association Studies: Major Challenges and Partial Solutions. David Golan, Saharon Rosset
28 Analysis of Secondary Phenotype Data under Case-Control Designs. Guoqing Diao, Donglin Zeng, Dan-Yu Lin

Statistical methods for biomarker data pooled from multiple nested case–control studies
Abigail Sloan
1 Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
Stephanie A Smith-Warner
2 Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
2a Department of Nutrition, Harvard T. H. Chan School of Public Health, Boston, MA, USA
Regina G Ziegler
3 Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
4a Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
4b Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, and Harvard Medical School, Boston, MA, USA
Pooling biomarker data across multiple studies allows for examination of a wider exposure range than generally possible in individual studies, evaluation of population subgroups and disease subtypes with more statistical power, and more precise estimation of biomarker-disease associations. However, circulating biomarker measurements often require calibration to a single reference assay prior to pooling due to assay and laboratory variability across studies. We propose several methods for calibrating and combining biomarker data from nested case–control studies when reference assay data are obtained from a subset of controls in each contributing study. Specifically, we describe a two-stage calibration method and two aggregated calibration methods, named the internalized and full calibration methods, to evaluate the main effect of the biomarker exposure on disease risk and whether that association is modified by a potential covariate. The internalized method uses the reference laboratory measurement in the analysis when available and otherwise uses the estimated value derived from calibration models. The full calibration method uses calibrated biomarker measurements for all subjects, including those with reference laboratory measurements. Under the two-stage method, investigators complete study-specific analyses in the first stage followed by meta-analysis in the second stage. Our results demonstrate that the full calibration method is the preferred aggregated approach to minimize bias in point estimates. We also observe that the two-stage and full calibration methods provide similar effect and variance estimates but that their variance estimates are slightly larger than those from the internalized approach. 
As an illustrative example, we apply the three methods in a pooling project of nested case–control studies to evaluate (i) the association between circulating vitamin D levels and risk of stroke and (ii) how body mass index modifies the association between circulating vitamin D levels and risk of cardiovascular disease.
1. Introduction
Combining data from multiple studies to maximize sample size has become a common strategy to quantify exposure-disease associations, including those where the exposure is a biomarker. Increased sample sizes facilitate subgroup and tumor subtype analyses, allow more precise estimation of the biomarker exposure effect over a wider range of biomarker measurements, and avoid issues related to data sparsity (Key and others, 2010; Smith-Warner and others, 2006). The increase in the use of pooling consortia over time reflects the availability and advantages of big data in epidemiology and its promise to improve quantification of disease risk factors. Note that we use the term pooling throughout this article to refer to the combination of data from individual participants and not the physical combination of specimens. Here, we define biomarkers as measurable indicators of health at the molecular, biochemical, or cellular level (Key and others, 2010). Examples include proteins, antibodies, hormones, and lipids. Many consortia have analyzed biomarker-disease associations, including the Endogenous Hormones, Nutritional Biomarkers, and Prostate Cancer Collaborative Group (Key and others, 2010), the COPD Biomarkers Qualification Consortium Database (Tabberer and others, 2017), and the Circulating Biomarkers and Breast and Colorectal Cancer Consortium (McCullough and others, 2018), among others. The participating cohorts in many consortia have employed nested case–control studies using individual or frequency matching to improve efficiency.
An important consideration when conducting pooled analyses of biomarker measurements from different studies is whether the measurements differ across studies due to real differences in the underlying populations or due to the use of different assays, kits, or laboratories in some or all studies. This consideration is particularly important when samples in the pooled analysis were assayed at different laboratories with different assays at different times. Examples of biomarkers with highly variable measurements across assays and laboratories include estradiol, testosterone, and insulin-like growth factor 1 (Key and others, 2010). Measurements of circulating 25-hydroxyvitamin D (25(OH)D) also vary by up to 40% between laboratories and assays (Lai and others, 2012).
For consortial projects of biomarkers that do not use a single assay and laboratory, investigators must address potential between-study variation in biomarker measurements. Critically, to quantify risk associated with per-unit increases in the biomarker, a common metric for the biomarker data must be used in each of the contributing studies. One strategy used to harmonize biomarker measurements involves study-specific calibration models. In this method, a random subset of biospecimens from each study is re-assayed at a designated reference laboratory. A study-specific calibration model is estimated in each study between the original “local” laboratory measurements and reference laboratory measurements. The resulting calibration equation is then used to estimate the reference laboratory biomarker measurement from the local laboratory measurement for all cases and controls in the individual study. Following the calibration procedure, the harmonized biomarker measurements can be modeled continuously or using categories defined by absolute concentrations or consortium-wide quantiles. In practice, re-assayed biospecimens are typically selected at random from controls in each study owing to concerns about the availability of case biospecimens (Sloan and others, 2019).
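The calibration step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' R implementation: it assumes a simple linear calibration fitted by ordinary least squares within one study, and the function names (`fit_calibration`, `calibrate`) and all data values are hypothetical.

```python
import numpy as np

def fit_calibration(local, reference):
    """Fit reference = a + b * local by least squares in the calibration subset.

    Returns the intercept a and slope b (hypothetical helper, one study at a time).
    """
    local = np.asarray(local, dtype=float)
    reference = np.asarray(reference, dtype=float)
    b, a = np.polyfit(local, reference, deg=1)  # polyfit returns the slope first
    return a, b

def calibrate(local, a, b):
    """Estimate reference-laboratory values from local-laboratory values."""
    return a + b * np.asarray(local, dtype=float)

# Made-up calibration subset (controls re-assayed at the reference laboratory).
local_subset = [40.0, 55.0, 62.0, 71.0, 90.0]
reference_subset = [46.0, 62.5, 70.2, 80.1, 101.0]

a, b = fit_calibration(local_subset, reference_subset)

# Apply the fitted equation to every case and control in the same study.
all_local = [35.0, 48.0, 77.0]
calibrated = calibrate(all_local, a, b)
```

Each contributing study would get its own `(a, b)` pair, so that all studies end up on the reference laboratory's scale before pooling.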
Two major classes of methods exist for analyzing data pooled from multiple studies, namely the two-stage approach and the aggregated approach (Debray and others, 2013; Smith-Warner and others, 2006). Under the two-stage method, investigators complete study-specific analyses using standardized criteria in the first stage followed by meta-analysis in the second stage. In the aggregated approach, investigators combine harmonized data from all studies into a single dataset before performing statistical analyses on the combined dataset. Sloan and others (2019) developed pooling methodology for cohort studies and subdivided the aggregated approach into the internalized and full calibration approaches. The internalized method uses the reference laboratory measurement in the analysis when available and the calibrated measurement otherwise. In contrast, the full calibration method uses calibrated biomarker measurements exclusively for all subjects regardless of the availability of reference laboratory measurements. In this article, we derive these approaches under the paradigm of nested case–control studies, allowing the potential inclusion of a biomarker–covariate interaction term.
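The difference between the internalized and full calibration methods comes down to which exposure value enters the analysis for subjects who have a reference laboratory measurement. The sketch below illustrates only that choice. It assumes a linear calibration with a known intercept `a` and slope `b` (in practice these would be estimated per study), and it marks a missing reference measurement with NaN; the function names and numbers are hypothetical.

```python
import numpy as np

def full_calibration_exposure(local, a, b):
    """Use the calibrated value a + b * local for every subject."""
    return a + b * np.asarray(local, dtype=float)

def internalized_exposure(local, reference, a, b):
    """Use the reference measurement when available, else the calibrated value."""
    reference = np.asarray(reference, dtype=float)
    calibrated = full_calibration_exposure(local, a, b)
    return np.where(np.isnan(reference), calibrated, reference)

# Made-up data: only the second subject was re-assayed at the reference lab.
local = [50.0, 60.0, 70.0]
reference = [np.nan, 66.0, np.nan]
a, b = 2.0, 1.1  # assumed calibration intercept and slope

x_full = full_calibration_exposure(local, a, b)
x_internalized = internalized_exposure(local, reference, a, b)
```

The two exposure vectors differ only for the re-assayed subject, which is why a controls-only calibration subset can lead to cases and controls being treated differently under the internalized method.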
The methods developed here are agnostic to the type of assay being used or biomarker being measured so long as investigators have access to reference assay measurements for a subset of individuals at each local laboratory and can model the relationship between reference laboratory measurements and local laboratory measurements. Variation in the assays or laboratories is captured in these study-specific models.
We can equivalently view pooled and calibrated biomarker data as a covariate measurement error problem. If we treat the reference and local laboratory measurements as the true and surrogate biomarker values, respectively (Carroll and others, 2006), we can envision each study-specific calibration model as a different measurement error model. We leverage an existing strategy in the measurement error literature, namely regression calibration (Carroll and others, 2006; Rosner and others, 1990), to form the basis of our methods. Although each of our methods is classified as a two-stage or aggregated approach, each utilizes concepts underlying regression calibration.
In this article, we propose calibration methods for pooled biomarker data from nested case–control studies that allow inference on the main effect of the biomarker in addition to biomarker–covariate interaction terms. Section 2 presents the models and statistical methods. Section 3 compares the methods via simulation and considers the inclusion of a covariate–biomarker interaction term. Section 4 illustrates the methods in examples involving 25(OH)D data pooled from the Nurses’ Health Study I (NHS1), Nurses’ Health Study II (NHS2), and Health Professionals Follow-up Study (HPFS) for stroke and cardiovascular disease (CVD) outcomes. Section 5 discusses our results.
2.1. Model and approximate conditional likelihood
To estimate the biomarker exposure effect under the aggregated approach, we develop a likelihood-based method. The conditional logistic regression model for the biomarker–disease association is
Under aggregation, the likelihood contribution from a stratum with only local laboratory biomarker measurements is
2.2. Calibration model
We assume a linear relationship between the reference and local laboratory measurements among the matched cases and controls such that
2.3. Parameter estimation under aggregated approach
2.4. Two-stage approach
The two-stage approach for pooled data uses regression calibration, a broadly applicable method initially developed in the measurement error literature, to adjust for calibration in the first-stage study-specific analyses (Carroll and others, 1990 is not cited here; see Carroll and others, 2006; Rosner and others, 1990; Spiegelman and others, 1997, 2001). The second stage combines these estimates using fixed effects meta-analysis.
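The second stage can be illustrated with a standard inverse-variance fixed effects combination. This is a generic sketch of that technique, not the authors' code: it assumes each study contributes a log odds ratio estimate together with its variance, and the function name and input values are hypothetical.

```python
def fixed_effects_pool(estimates, variances):
    """Inverse-variance weighted average of study-specific estimates.

    Returns the pooled estimate and its variance.
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * est for w, est in zip(weights, estimates)) / total_weight
    pooled_variance = 1.0 / total_weight
    return pooled, pooled_variance

# Made-up study-specific log odds ratios and variances from the first stage.
study_estimates = [0.2, 0.4]
study_variances = [0.01, 0.04]

pooled, pooled_variance = fixed_effects_pool(study_estimates, study_variances)
```

More precise studies (smaller variances) receive more weight, so the pooled estimate here sits closer to the first study's value.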
2.5. Two-stage method for models with an interaction term
3. Simulations
3.1. Model without an interaction term
We also performed simulations that fixed the total sample size at 1000 participants while varying the calibration subset size between 30, 50, and 150 subjects (or 3%, 5%, and 15% participation rates, respectively). As shown in Figure 1 , at all calibration study sizes, the full calibration method offered nearly unbiased point estimates. With larger calibration study sizes, the MSEs decreased as a result of the improvement in efficiency. However, the internalized method estimates experienced increasing downward bias as the proportion of subjects participating in the calibration subset increased owing to increasingly differential calibration of cases and controls. As calibration study size increased, the two-stage method point estimates were increasingly less biased with decreasing MSEs owing to the improved bias and efficiency of calibration parameters.

Figure 1. Comparison of methods as number of participants in the calibration study increases. The number of subjects in each study remains fixed at 1000, or equivalently, 500 case–control pairs. The calibration study participation rates considered are 3%, 5%, and 15%, or 30, 50, and 150 individuals, respectively. Panels a-c depict the percent bias of the parameter estimate while panels d-f display the MSE of the estimate.
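For readers less familiar with the two performance measures, the sketch below shows how percent bias and MSE are conventionally computed from repeated simulation estimates. The definitions used here are the usual ones and are an assumption about what the simulations report; the function names and numbers are hypothetical.

```python
def percent_bias(estimates, truth):
    """100 * (mean estimate - true value) / true value."""
    mean_estimate = sum(estimates) / len(estimates)
    return 100.0 * (mean_estimate - truth) / truth

def mean_squared_error(estimates, truth):
    """Average squared distance between the estimates and the true value."""
    return sum((est - truth) ** 2 for est in estimates) / len(estimates)

# Made-up estimates from four simulation replicates with a true value of 1.0.
simulated = [0.9, 1.1, 1.0, 1.2]
bias_pct = percent_bias(simulated, truth=1.0)
mse = mean_squared_error(simulated, truth=1.0)
```

Because MSE combines squared bias and variance, a method can lower its MSE either by reducing bias or by becoming more efficient.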
3.2. Model with an interaction term
4. Applied example
We completed two data examples to illustrate the methods. In the first example, we investigate the impact of circulating 25-hydroxyvitamin D (25(OH)D) levels on risk of stroke. In the second example, we investigate the impact of 25(OH)D levels and their interaction with a dichotomized body mass index (BMI) term on the risk of a composite outcome, fatal or nonfatal stroke, or myocardial infarction (henceforth referred to as the CVD endpoint). In both examples, we match each case to a single control based on sex and age at blood draw.
We applied the two aggregated methods (i.e. full calibration and internalized), two-stage, and naive methods to data combined from three large prospective cohort studies in the United States, including the HPFS (Wu and others, 2011), the NHS1 (Eliassen and others, 2016), and the NHS2 (Eliassen and others, 2011). The HPFS began enrollment in 1986 and includes 51 529 male health professionals aged 40–75 years at baseline. The NHS1 enrolled 121 701 female nurses aged 30–55 years at baseline in 1976. The NHS2, a younger counterpart to the NHS1, was established in 1989 with the enrollment of 116 671 female nurses, aged 25–42 years at baseline. In each cohort, participants completed biennial questionnaires providing information about medical history, diet, and lifestyle conditions. Between 1989 and 1997, each study completed laboratory assays on blood samples for a host of biomarkers, including 25(OH)D, from a subset of participants. Subjects with a previous cancer diagnosis were not eligible for random selection. Individuals were excluded from the pooled analysis if they did not have 25(OH)D measurements available or stroke or myocardial infarction outcome data.
Each study obtained calibration data among a subset of controls by re-assaying their blood samples at Heartland Assays, LLC between 2011 and 2013. Circulating 25(OH)D levels were modeled continuously and reported using 20 nmol/L increments. Table 4 of the supplementary material available at Biostatistics online lists information about the main studies and the calibration subsets, including the parameter estimates of the study-specific calibration models.
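Reporting a continuous exposure effect per 20 nmol/L is a simple rescaling of the fitted coefficient. As a worked illustration, assuming a logistic-type model whose coefficient `beta` is the log odds ratio per 1 nmol/L, the odds ratio for a 20 nmol/L increase is exp(20 × beta). The value of `beta` below is made up and is not a result from the paper.

```python
import math

def odds_ratio_per_increment(beta_per_unit, increment):
    """Odds ratio for an `increment`-unit increase, given a per-unit log odds ratio."""
    return math.exp(increment * beta_per_unit)

beta = -0.005  # hypothetical log odds ratio per 1 nmol/L
or_per_20 = odds_ratio_per_increment(beta, increment=20.0)
```

The same rescaling applies to the confidence limits, since they are also computed on the log odds scale.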
5. Discussion
In this work, we proposed statistical methods for analyzing calibrated biomarker data pooled across multiple nested case–control studies. Our methods facilitate inference on the main effect of the biomarker as well as a biomarker–covariate interaction term. Keeping with common practice, we estimated study-specific calibration models from subsets of controls reassayed at the reference lab. The methods developed here can also be used to contend with exposure measurement error when pooling data from multiple studies with internal validation subsets.
Naive estimates were typically quite biased and illustrated the risk of failing to implement a calibration step when necessary. More problematically, the naive estimates were sometimes biased toward the alternative, resulting in an inflated type I error rate.
Although this article focuses on the common scenario of a controls-only calibration study, all the methods discussed also apply if the calibration subset includes both cases and controls. Furthermore, both the full calibration and internalized methods work for nonlinear calibration models. If necessary, one could include nonlinear terms in the calibration model when applying the full calibration and/or internalized methods. Note however that the two-stage method does require the linear calibration model in (2.2).
Regarding inclusion of covariates, if covariates are correlated with the biomarker and not the outcome, they may be included in the calibration model but not in the conditional logistic regression model. Covariates that are correlated with both the biomarker and the outcome can be included in both models.
Although the aggregated and two-stage methods are equally viable and valid options for analyzing outcome–exposure relationships in pooled data, logistical considerations may dictate the preferred approach for the statistical analysis. For instance, aggregated methods often lend themselves better to subgroup analyses because they reduce issues resulting from sparse data for a single study in specific strata. If the main exposure effect and at least some covariate effects are homogeneous, the aggregated method may also offer efficiency gains in covariate estimation relative to the two-stage method (Lin and Zeng, 2010). However, the two-stage method may be more appealing than the aggregated methods at times for its intuitive and simple implementation, and its robustness to these covariate homogeneity assumptions.
6. Software
Functions in the form of R code (with an example) are available at the first author’s GitHub account https://github.com/agsloan/PoolingBiomarkerData and the last author’s website https://www.hsph.harvard.edu/molin-wang/software .
Acknowledgments
We are grateful to Tao Hou and Shiaw-Shyuan (Sherry) Yaun for their assistance in accessing the data. We also thank the Circulating Biomarkers and Breast and Colorectal Cancer Consortium team (R01CA152071, PI: Stephanie Smith-Warner; Intramural Research Program, Division of Cancer Epidemiology and Genetics, National Cancer Institute: Regina Ziegler) for conducting the calibration study in the vitamin D examples. Conflict of Interest: None declared.
Supplementary material
Supplementary material is available at http://biostatistics.oxfordjournals.org .
This work was supported by the NIH (T32-NS048005 to A.S.) and by the NIH/NCI (R03CA212799 to M.W.).
- Breslow, N. E., Day, N. E., Halvorsen, K. T., Prentice, R. L. and Sabai, C. (1978). Estimation of multiple relative risk functions in matched case-control studies. American Journal of Epidemiology 108(4), 299–307.
- Carroll, R., Ruppert, D., Stefanski, L. and Crainiceanu, C. (2006). Measurement error in nonlinear models: a modern perspective, 2nd ed., Monographs on Statistics and Applied Probability. Boca Raton, FL: Chapman and Hall.
- Debray, T. P. A., Moons, K. G. M., Abo-Zaid, G. M. A., Koffijberg, H. and Riley, R. D. (2013). Individual participant data meta-analysis for a binary outcome: one-stage or two-stage? PLoS One 8(4), e60650.
- Eliassen, A. H., Spiegelman, D., Hollis, B. W., Horst, R. L., Willett, W. C. and Hankinson, S. E. (2011). Plasma 25-hydroxyvitamin D and risk of breast cancer in the Nurses’ Health Study II. Breast Cancer Research 13(3), R50.
- Eliassen, A. H., Warner, E. T., Rosner, B., Collins, L. C., Beck, A. H., Quintana, L. M., Tamimi, M. and Hankinson, S. E. (2016). Plasma 25-hydroxyvitamin D and risk of breast cancer in women followed over 20 years. Cancer Research 76(18), 5423–5430.
- Gail, M. H., Wu, J., Wang, M., Yaun, S., Cook, N. R., Eliassen, A. H., McCullough, M. L., Yu, K., Zeleniuch-Jacquotte, A., Smith-Warner, S. A., Ziegler, R. G. and others (2016). Calibration and seasonal adjustment for matched case–control studies of vitamin D and cancer. Statistics in Medicine 35(13), 2133–2148.
- Gong, G. and Samaniego, F. J. (1981). Pseudo maximum likelihood estimation: theory and applications. The Annals of Statistics 9(4), 861–869.
- Guolo, A. and Brazzale, A. R. (2008). A simulation-based comparison of techniques to correct for measurement error in matched case–control studies. Statistics in Medicine 27(19), 3755–3775.
- Key, T. J., Appleby, P. N., Allen, N. E. and Reeves, G. K. (2010). Pooling biomarker data from different studies of disease risk, with a focus on endogenous hormones. Cancer Epidemiology and Prevention Biomarkers 19(4), 960–965.
- Lai, J. K. C., Lucas, R. M., Banks, E. and Ponsonby, A. (2012). Variability in vitamin D assays impairs clinical assessment of vitamin D status. Internal Medicine Journal 42(1), 43–50.
- Levi-Vardi, R. and Yagil, Y. (2017). Vitamin D, hypertension, and ischemic stroke: an unresolved relationship. American Heart Association 70(3), 496–498.
- Lin, D. Y. and Zeng, D. (2010). On the relative efficiency of using summary statistics versus individual-level data in meta-analysis. Biometrika 97(2), 321–332.
- McCullough, M. L., Zoltick, E. S., Weinstein, S. J., Fedirko, V., Wang, M., Cook, N. R., Eliassen, A. H., Zeleniuch-Jacquotte, A., Agnoli, C., Albanes, D. and others (2018). Circulating vitamin D and colorectal cancer risk: an international pooling project of 17 cohorts. JNCI: Journal of the National Cancer Institute 111(2), 158–169.
- McShane, L. M., Midthune, D. N., Dorgan, J. F., Freedman, L. S. and Carroll, R. J. (2001). Covariate measurement error adjustment for matched case–control studies. Biometrics 57(1), 62–73.
- Prentice, R. L. and Breslow, N. E. (1978). Retrospective studies and failure time models. Biometrika 65(1), 153–158.
- Rosner, B., Spiegelman, D. and Willett, W. C. (1990). Correction of logistic regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. American Journal of Epidemiology 132(4), 734–745.
- Sloan, A., Song, Y., Gail, M. H., Betensky, R., Rosner, B., Ziegler, R. G., Smith-Warner, S. A. and Wang, M. (2019). Design and analysis considerations for combining data from multiple biomarker studies. Statistics in Medicine 38(8), 1303–1320.
- Smith-Warner, S. A., Spiegelman, D., Ritz, J., Albanes, D., Beeson, W. L., Bernstein, L., Berrino, F., Van Den Brandt, P. A., Buring, J. E. and Cho, E. (2006). Methods for pooling results of epidemiologic studies: the pooling project of prospective studies of diet and cancer. American Journal of Epidemiology 163(11), 1053–1064.
- Spiegelman, D., Carroll, R. J. and Kipnis, V. (2001). Efficient regression calibration for logistic regression in main study/internal validation study designs with an imperfect reference instrument. Statistics in Medicine 20(1), 139–160.
- Spiegelman, D., McDermott, A. and Rosner, B. (1997). Regression calibration method for correcting measurement-error bias in nutritional epidemiology. The American Journal of Clinical Nutrition 65(4), 1179S–1186S.
- Sun, Q., Pan, A., Hu, F., Manson, J. and Rexrode, K. (2012). 25-hydroxyvitamin D levels and the risk of stroke: a prospective study and meta-analysis. Stroke 43(6), 1470–1477.
- Tabberer, M., Benson, V. S., Gelhorn, H., Wilson, H., Karlsson, N., Mullerova, H., Menjoge, S., Rennard, S. I., Tal-Singer, R. and Merrill, D. (2017). The COPD biomarkers qualification consortium database: baseline characteristics of the St. George’s respiratory questionnaire dataset. Chronic Obstructive Pulmonary Diseases: Journal of the COPD Foundation 4(2), 112.
- Wu, K., Feskanich, D., Fuchs, C. S., Chan, A. T., Willett, W. C., Hollis, B. W., Pollak, M. N. and Giovannucci, E. (2011). Interactions between plasma levels of 25-hydroxyvitamin D, insulin-like growth factor (IGF)-1 and C-peptide with risk of colorectal cancer. PLoS One 6(12), e28520.
- Published: 08 March 2023
Periodontics
Is there evidence of a relationship between pre-eclampsia and periodontitis?
- Lauren Crowder 1
Evidence-Based Dentistry (2023)
- Dental treatments
- Periodontitis
Data sources
The review searched several databases, including Medline (from 1950), PubMed (from 1946), Embase (from 1949), Lilacs, Cochrane Controlled Clinical Trial Register, CINAHL, ClinicalTrials.gov and Google Scholar (from 1990).
Study selection
Two of the authors (LD and HN) independently assessed the eligibility of studies by looking at the titles, abstracts and methods. If there was a disagreement, a third reviewer (QA) was consulted for a decision.
Data extraction and synthesis
A data extraction form was created and used. Data collected included: the first author’s name; publication year; study design; number of cases; number of controls; total sample size; country; national income group; mean age; the risk estimates or data used to calculate them; confidence intervals (CI) or data used to generate CI. To assess socioeconomic status and its role as a possible influential factor, the World Bank classification by Gross National Income per capita was used to determine the income level (low-income, lower-middle-income, upper-middle-income, high-income) of each country. All authors cross-checked all data and resolved disagreements through discussion. Statistical software ‘RevMan’ was used to input data. Pooled odds ratios, mean difference, and 95% CI were calculated for the association between periodontitis and pre-eclampsia using a random-effects model. A significance level of 0.05 was used for pooled effect. Forest plots for primary analysis and subgroup analysis show the raw data, odds ratio and CIs, means and SDs for the chosen effect, heterogeneity statistic (I²), total number of participants per group, overall odds ratio and mean difference. Groups were divided for subgroup analysis by: study design (case-control and cohort); the studies’ definition of periodontitis (defined by pocket depth [PD] and/or clinical attachment loss [CAL]); and national income (high-income or middle-income or low-income countries). Cochran’s Q statistic and I² statistic were used to determine heterogeneity and degree of heterogeneity, respectively. For publication bias, Egger’s regression model and the fail-safe number were used.
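The pooling and heterogeneity quantities named above can be made concrete with a short sketch. It assumes the DerSimonian-Laird estimator, a common choice for random-effects models; the summary does not state which estimator the review used, and the function name and inputs are hypothetical.

```python
import math

def dersimonian_laird(log_odds_ratios, standard_errors):
    """Random-effects pooling of log odds ratios with Cochran's Q and I-squared."""
    variances = [se ** 2 for se in standard_errors]
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)

    # Fixed-effect estimate, needed to compute Cochran's Q.
    fixed = sum(w * y for w, y in zip(weights, log_odds_ratios)) / total_weight
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_odds_ratios))
    df = len(log_odds_ratios) - 1

    # Between-study variance (tau squared), truncated at zero.
    c = total_weight - sum(w ** 2 for w in weights) / total_weight
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I-squared: percentage of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau squared to each study's variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, log_odds_ratios)) / sum(re_weights)
    pooled_se = math.sqrt(1.0 / sum(re_weights))

    return {
        "pooled_log_or": pooled,
        "pooled_or": math.exp(pooled),
        "pooled_se": pooled_se,
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }

# Made-up inputs: two studies with log odds ratios 0 and 2, each with SE 1.
result = dersimonian_laird([0.0, 2.0], [1.0, 1.0])
```

When the studies agree closely, Q falls below its degrees of freedom, tau squared is set to zero, and the random-effects result coincides with the fixed-effect one.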
Results
Thirty articles and 9650 women were included in total. Six of the studies were cohort studies (2840 participants overall) and 24 were case-control studies. Pre-eclampsia was defined the same across all studies, whereas the definition of periodontitis differed. There was a significant association between periodontitis and pre-eclampsia (OR 3.18, 95% CI 2.26–4.48, p < 0.00001). In subgroup analysis of just cohort studies, the estimated association was stronger (OR 4.19, 95% CI 2.23–7.87, p < 0.00001). It was stronger still in lower-middle-income countries (OR 6.70, 95% CI 2.61–17.19, p < 0.0001).
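A reported odds ratio and its 95% CI are enough to recover the standard error on the log scale, which is a useful consistency check when reading results like these. The sketch below applies the standard relationship (the CI is symmetric around the log odds ratio) to the overall estimate quoted above; the function name is hypothetical and the calculation is illustrative, not part of the review.

```python
import math
from statistics import NormalDist

def log_or_standard_error(lower, upper, level=0.95):
    """Standard error of a log odds ratio recovered from its confidence limits."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% CI
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

# Overall association reported in the summary: OR 3.18, 95% CI 2.26 to 4.48.
se = log_or_standard_error(2.26, 4.48)

# The geometric midpoint of the limits should sit close to the reported OR.
midpoint_or = math.exp((math.log(2.26) + math.log(4.48)) / 2.0)

# Wald z statistic for the null hypothesis of no association (OR = 1).
z_stat = math.log(3.18) / se
```

A z statistic of this size corresponds to a very small p-value, in line with the reported p < 0.00001.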
Conclusions
Periodontitis in pregnancy is a risk factor for pre-eclampsia. The data suggest that this association is more pronounced in lower-middle-income subgroups. Further research could explore the possible mechanisms and whether prevention or adequate treatment can reduce the risk of pre-eclampsia, thereby improving maternal health outcomes.
Author information
Authors and affiliations.
Paediatric Dentistry Department, Dundee Dental Hospital, Dundee, Scotland
Lauren Crowder
Corresponding author
Correspondence to Lauren Crowder .
Ethics declarations
Competing interests.
The author declares no competing interests.
About this article
Cite this article.
Crowder, L. Is there evidence of a relationship between pre-eclampsia and periodontitis?. Evid Based Dent (2023). https://doi.org/10.1038/s41432-023-00870-y
Received : 06 January 2023
Accepted : 12 January 2023
Published : 08 March 2023
DOI : https://doi.org/10.1038/s41432-023-00870-y
Spatial Differentiation Characteristics of Rural Areas Based on Machine Learning and GIS Statistical Analysis—A Case Study of Yongtai County, Fuzhou City
- 1. Introduction
- 2. Machine Learning and Rural Spatial Differentiation Characteristics
- 2.1. GIS Technology
- 2.2. Spatial Differentiation Characteristics
- 2.3. Machine Learning
- 3. Experiments on Rural Spatial Differentiation Characteristics
- 3.1. Data Sources
- 3.2. Data Preprocessing
- 3.3. Determination of Evaluation Indicators
- 3.4. Spatial Analysis Modeling
- 3.5. Feature Pattern Classification
- 3.6. Rural Regional Function Evaluation
- 3.7. Rural Regional Function Orientation
- 4. Analysis of the Characteristics of Regional Rural Spatial Differentiation
- 4.1. GIS Statistical Analysis
- 4.2. Spatial Differentiation Feature Analysis
- 5. Conclusions
- Data Availability Statement
- Conflicts of Interest
Wang, Z. Spatial Differentiation Characteristics of Rural Areas Based on Machine Learning and GIS Statistical Analysis—A Case Study of Yongtai County, Fuzhou City. Sustainability 2023 , 15 , 4367. https://doi.org/10.3390/su15054367
Methods of analysis of results from case-control studies have evolved considerably since the 1950s. These methods have helped to improve the validity of the conclusions drawn from case-control research and have helped to ensure that the available data are utilized to their fullest extent.
Handbook of Statistical Methods for Case-Control Studies (Chapman & Hall/CRC Handbooks of Modern Statistical Methods). Ørnulf Borgan, Norman Breslow, Nilanjan Chatterjee, Mitchell H. Gail, Alastair Scott, Chris J. Wild. ISBN 9781498768580.
Provided that it is read and used together with such a comprehensive epidemiological text, this new Handbook of Statistical Methods for Case-Control Studies is a valuable and important book, which will be useful for seminars and courses on the developments in statistical theory that have occurred since the publication of Breslow and Day in 1980.
Handbook of Statistical Methods for Case-Control Studies. Edited by Ørnulf Borgan, Norman Breslow, Nilanjan Chatterjee, Mitchell H. Gail, Alastair Scott, Chris J. Wild. Copyright year 2018. ISBN 9780367571375. Published June 30, 2020 by Chapman & Hall. 554 pages.
Case-control studies use the odds ratio to measure the strength of the association between exposure and outcome. The odds ratio is the ratio of the odds of exposure in the case group to the odds of exposure in the control group. Each odds ratio should be reported with a confidence interval.
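As a rough illustration, the odds ratio from a 2×2 table and an approximate confidence interval can be computed as in the sketch below. It assumes the standard log-scale (Woolf) interval, which is only one of several available methods and is not taken from the handbook itself. The function name `odds_ratio_ci` and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate CI from a 2x2 case-control table.

    a: exposed cases       b: unexposed cases
    c: exposed controls    d: unexposed controls
    z: normal quantile (1.96 gives an approximate 95% interval)

    Uses the log-scale (Woolf) standard error; all four counts must be > 0.
    """
    odds_cases = a / b          # odds of exposure among cases
    odds_controls = c / d       # odds of exposure among controls
    or_hat = odds_cases / odds_controls
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_hat) - z * se_log_or)
    upper = math.exp(math.log(or_hat) + z * se_log_or)
    return or_hat, lower, upper


# Hypothetical counts, chosen only for illustration.
or_hat, lower, upper = odds_ratio_ci(a=40, b=60, c=20, d=80)
print(f"OR = {or_hat:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With sparse tables (any cell near zero) this approximation breaks down, and exact or conditional methods are usually preferred.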
The main objective of this paper is to derive the valid sampling distribution of the observed counts in a case-control study with missing data under the assumption of missing at random by employing the conditional sampling method and the mechanism augmentation method. The proposed sampling distribut …
analyze the nested case-control data. The latter approach essentially involves analyzing the whole set of cohort data and using multiple imputation for those variables that were only collected in the case-control subset. There are also excellent chapters on the self-controlled case series method, and various methods for case-control studies of ...
Due to the clustering of teeth, the survival times of the matched teeth within subjects could be correlated, and thus the statistical methods for conventional case-control studies cannot be directly applied. We study the marginal proportional hazards regression model for data from this type of study. Second, we consider a case-cohort study ...
Traditional methods of occupational cohort analysis have used the standardized mortality ratio (SMR) as the fundamental measure of association between risk factor and disease. The SMR is shown here to result from maximum likelihood estimation in a multiplicative statistical model involving known national death rates.
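A minimal sketch of the SMR calculation follows, assuming the usual definition: observed deaths divided by the deaths expected if known reference rates applied to the cohort's person-years. Under a Poisson model in which the cohort's rates are a constant multiple of the reference rates, this ratio is also the maximum likelihood estimate of that multiple. The function name, strata, and numbers are hypothetical.

```python
def standardized_mortality_ratio(observed_deaths, person_years, reference_rates):
    """SMR = observed deaths / expected deaths.

    person_years:    person-years at risk in each stratum (e.g. age band)
    reference_rates: known reference death rate per person-year in each stratum
    """
    if len(person_years) != len(reference_rates):
        raise ValueError("person_years and reference_rates must align by stratum")
    expected = sum(py * rate for py, rate in zip(person_years, reference_rates))
    return observed_deaths / expected


# Hypothetical cohort with three age strata.
person_years = [1000, 2000, 1500]
reference_rates = [0.001, 0.002, 0.004]   # deaths per person-year
smr = standardized_mortality_ratio(22, person_years, reference_rates)
print(f"SMR = {smr:.2f}")
```

An SMR above 1 indicates more deaths than expected from the reference rates; here the expected count is about 11, so 22 observed deaths give an SMR of about 2.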
Sampling methodology: The source population has to be (at least partially) defined by specified criteria so that appropriate controls can be selected. Such criteria will often be geographic, although they need not necessarily be so.
Matching on factors such as age and sex is commonly used in case-control studies. 1 This can be done for convenience (eg, choosing a control admitted to hospital on the same day as the case), to improve study efficiency by improving precision (under certain conditions) when controlling for the matching factors (eg, age, sex) in the analysis, or …
In the case-control literature, it is known that the marginal benefit to statistical power from increasing the number of controls per case beyond 4 or 5 is small (e.g., Ury 1975; Song and Chung ...
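The diminishing return can be illustrated with a commonly quoted approximation: with m controls per case, the efficiency relative to an unlimited number of controls is roughly m / (m + 1) when the true effect is near the null. The sketch below assumes that approximation, which is not stated in the text above, and the function name is hypothetical.

```python
def relative_efficiency(m):
    """Approximate efficiency of m controls per case relative to
    unlimited controls per case, using the m / (m + 1) rule of thumb
    (valid near the null hypothesis)."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return m / (m + 1)


# Efficiency gained by each additional control per case.
efficiencies = {m: relative_efficiency(m) for m in range(1, 11)}
for m, eff in efficiencies.items():
    print(f"{m:2d} controls per case: {eff:.3f}")
```

Under this approximation, moving from one to four controls per case raises the relative efficiency from 0.5 to 0.8, while moving from four to ten adds only about another 0.11.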
Handbook of Statistical Methods for Case-Control Studies is written by leading researchers in the field. It provides an in-depth treatment of up-to-date and currently developing statistical methods for the design and analysis of case-control studies, as well as a review of classical principles and methods.
The Handbook of Statistical Methods for Case-Control Studies is written by leading researchers in the field and published by Chapman & Hall/CRC Press (2018). The handbook provides an in-depth treatment of up-to-date and currently developing statistical methods for the design and analysis of case-control studies, as well as a review of classical principles and methods. The handbook is designed ...
III Case-control Studies that Use Full-Cohort Information
- 11 Alternative Formulation of Models in Case-Control Studies (William E. Barlow, John B. Cologne)
- 12 Multi-Phase Sampling (Gustavo Amorim, Alastair J. Scott, Chris J. Wild)
- 13 Calibration in Case-Control Studies (Thomas Lumley)
- 14 Secondary Analysis of Case-Control Data (Chris ...
In this work, we proposed statistical methods for analyzing calibrated biomarker data pooled across multiple nested case-control studies. Our methods facilitate inference on the main effect of the biomarker as well as a biomarker-covariate interaction term. In keeping with common practice, we estimated study-specific calibration models from ...
With the development of machine learning and GIS (geographic information systems) technology, it is possible to combine them to mine the knowledge rules behind massive spatial data. GIS is a comprehensive discipline that combines geography and cartography and has been widely used in different fields. It is a computer system for inputting ...