statistical methods for case control studies

Handbook of Statistical Methods for Case-Control Studies

Ørnulf Borgan, Norman Breslow, Nilanjan Chatterjee, Mitchell H. Gail, Alastair Scott, Chris J. Wild.

Book Description

Handbook of Statistical Methods for Case-Control Studies is written by leading researchers in the field. It provides an in-depth treatment of up-to-date and currently developing statistical methods for the design and analysis of case-control studies, as well as a review of classical principles and methods. The handbook is designed to serve as a reference text for biostatisticians and quantitatively-oriented epidemiologists who are working on the design and analysis of case-control studies or on related statistical methods research. Though not specifically intended as a textbook, it may also be used as a backup reference text for graduate level courses.

Book Sections

Classical designs and causal inference, measurement error, power, and small-sample inference

Designs that use full-cohort information

Time-to-event data

Genetic epidemiology

About the Editors

Ørnulf Borgan is Professor of Statistics, University of Oslo. His book with Andersen, Gill and Keiding on counting processes in survival analysis is a world classic.

Norman E. Breslow was, at the time of his death, Professor Emeritus in Biostatistics, University of Washington. For decades, his book with Nick Day has been the authoritative text on case-control methodology.

Nilanjan Chatterjee is Bloomberg Distinguished Professor, Johns Hopkins University. He leads a broad research program in statistical methods for modern large scale biomedical studies.

Mitchell H. Gail is a Senior Investigator at the National Cancer Institute. His research includes modeling absolute risk of disease, intervention trials, and statistical methods for epidemiology.

Alastair Scott was, at the time of his death, Professor Emeritus of Statistics, University of Auckland. He was a major contributor to using survey sampling methods for analyzing case-control data.

Chris J. Wild is Professor of Statistics, University of Auckland. His research includes nonlinear regression and methods for fitting models to response-selective data.

Table of Contents

Introduction. Introduction. Origins. Classical Case-Control Studies. Design issues in case-control studies. Basic concepts and methods of analysis. Matched samples. Beyond logistic regression. Small sample methods. Multiple case or control groups. Power and sample size. Causal inference. Misclassification and measurement error. Analysis of secondary phenotype under case-control design. Sampling from a Defined Cohort. Two and three (or multi) phase sampling designs. Calibration and estimation of sampling weights. Maximum likelihood. Re-use of case-control samples. Misspecification. Case-control studies with complex sampling. Cohort sampling for time to event data. Case-cohort designs and analyses. Design options and partial likelihood analyses of nested case-control data. Inverse probability weighting in nested case-control studies. Multiple imputation. Maximum likelihood. Self controlled case series. Genetic Epidemiology. Basic design and association analysis of population-based case-control studies. Analysis of gene-environment interactions. Screening methods for detecting genetic association and interactions under case-control design. Analysis of family-based case-control studies. Fitting mixed model to case-control genome-wide association studies. Analysis of secondary phenotype under case-control design.

Statistical analyses of case-control studies

Introduction.

A case-control study is used to determine whether an exposure is associated with a particular outcome (i.e., a disease or condition of interest). Case-control research is retrospective by definition, since it starts with the outcome and then looks back at exposures. The investigator already knows each participant's outcome when participants are enrolled in their respective groups. It is this, and not the frequent use of previously collected data, that makes case-control studies retrospective. This article discusses statistical analysis in case-control studies.

Study Design

Participants in a case-control study are selected according to their outcome status. Some individuals have the outcome of interest (the cases), while others do not (the controls). The investigator then assesses exposure in both groups. Consequently, in case-control research the outcome must already have occurred in at least some individuals. Thus, as shown in Figure 1, some study participants have the outcome at enrolment and others do not.

Figure 1. Example of a case-control study [1]

Selection of case

The investigator should define the cases as precisely as possible. The definition of a disease may rest on several criteria, so every criterion should be fully specified in the case definition.

Selection of a control

Controls should be comparable to the cases in a number of respects. The matching criteria are the characteristics (e.g., age, sex, and time of hospitalization) on which controls and cases should be similar. For instance, it would be inappropriate to compare patients undergoing elective intraocular surgery with a group of controls with traumatic corneal lacerations. Another key requirement of a case-control study is that exposure should be measured in the same way in cases and controls.

Although controls need to be similar to cases in many respects, it is possible to over-match. Over-matching can make it harder to find enough controls. Furthermore, once a variable has been used for matching, it cannot be analyzed as a risk factor. Enrolling more than one control per case is an effective way to increase the power of a study. However, the gain from each additional control shrinks quickly, and including more than two controls per case adds comparatively little statistical value.
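
The diminishing return from extra controls can be illustrated with a common rule of thumb: under the null hypothesis, a design with m controls per case has roughly m / (m + 1) of the efficiency of a design with unlimited controls. The sketch below is only an illustration of that approximation; the function name is hypothetical and the rule is not an exact power calculation.

```python
def relative_efficiency(controls_per_case: int) -> float:
    """Approximate efficiency of an m:1 design relative to unlimited controls.

    Uses the rule of thumb m / (m + 1), which holds approximately under the
    null hypothesis. It is an approximation, not an exact power calculation.
    """
    if controls_per_case < 1:
        raise ValueError("need at least one control per case")
    return controls_per_case / (controls_per_case + 1)


# Print the approximate efficiency for 1 to 5 controls per case.
for m in range(1, 6):
    print(f"{m} control(s) per case: {relative_efficiency(m):.2f}")
```

Going from one to two controls per case raises the approximate efficiency from 0.50 to 0.67, while each further control adds progressively less.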

Data collection

Decide which data to collect once the cases and controls have been precisely defined; the same data must be obtained in the same way from both groups. If the search for primary risk factors is not conducted objectively, the study may suffer from researcher bias, especially because the outcome is already known. It is important to try to conceal the outcome from the person collecting risk factor data or interviewing patients, even though this is not always practicable. Patients may be asked about past exposures (such as smoking history, diet, use of conventional eye medications, and so on). Some people may find it difficult to recall all of this information accurately.

Furthermore, patients who have the outcome (cases) are more likely than controls to recall details of adverse experiences. This phenomenon is known as recall bias. Any effort by the researcher to reduce this form of bias will benefit the study.

In the analysis, the frequency of each measured variable is computed in each of the two groups. Case-control studies use the odds ratio to measure the strength of the association between exposure and outcome. The odds ratio is the ratio of the odds of exposure in the case group to the odds of exposure in the control group. It is essential to calculate a confidence interval for each odds ratio. A confidence interval that includes 1.0 indicates that the association between the exposure and the outcome could have arisen by chance alone and is not statistically significant. Without a confidence interval, an odds ratio is not particularly useful. These computations are usually done with statistical software. Because the measurements are not made in a population-based sample, case-control studies cannot provide any information about the incidence or prevalence of a disease.
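
As a minimal sketch of the calculation described above, the following computes an odds ratio and an approximate 95 % confidence interval from a 2x2 table using the log (Woolf) method. The function name and the counts are hypothetical and chosen only for illustration; other interval methods exist and may be preferable for small counts.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio with an approximate (Woolf, log-based) confidence interval."""
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for this method")
    # Odds of exposure among cases divided by odds of exposure among controls.
    estimate = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of log(OR): square root of the summed reciprocal cell counts.
    se_log = math.sqrt(sum(1 / n for n in cells))
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts: 40 of 100 cases and 20 of 100 controls were exposed.
estimate, lower, upper = odds_ratio_ci(40, 60, 20, 80)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
if lower <= 1.0 <= upper:
    print("The interval includes 1.0: not statistically significant.")
else:
    print("The interval excludes 1.0: statistically significant.")
```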

Risk Factors and Sampling

Case-control studies can also be used to investigate risk factors for a rare disease. Cases may be identified from hospital records. Patients who present to hospital, however, may not be representative of the general community. Selecting an appropriate control group can be challenging. Patients from the same hospital who do not have the outcome are a common source of controls. However, hospitalized patients do not always reflect the broader population; they are more likely to have health problems and to have access to the healthcare system.

Recent research on case-control studies using statistical analyses

i) Risk factors related to multiple sclerosis in Kuwait

This matched case-control study in Kuwait examined the relationship between multiple sclerosis (MS) risk and several variables: family history, stressful life events, tobacco smoke exposure, vaccination history, and comorbidity. Cases were recruited from the neurology clinics of Ibn Sina Hospital and the MS clinic of the Dasman Diabetes Institute. Controls were selected from among the faculty and students of Kuwait University. A generalized questionnaire was used to collect data on socio-demographic, possibly genetic, and environmental factors from each patient and his/her pair-matched control. Descriptive statistics were produced, including means and standard deviations for quantitative variables and frequencies for qualitative variables. Variables associated with MS status at p ≤ 0.15 in the univariable conditional logistic regression analysis were considered for inclusion in the final multivariable conditional logistic regression model. Of the 112 MS patients invited to participate, 110 (98.2 %) agreed. Therefore, 110 MS cases were enrolled together with 110 controls, who were individually matched to the cases (1:1) on age (within 5 years), gender, and nationality (Fig. 1). The findings showed that a family history of MS was significantly associated with an increased risk of developing MS. In contrast, vaccination against influenza A and B viruses was significantly protective against MS.

Figure 1. Flow chart on the enrollment of the MS cases and controls [1]
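
For a single binary exposure in a 1:1 matched design such as the one above, conditional logistic regression reduces to a simple calculation on the discordant pairs: the odds ratio estimate is the number of pairs in which only the case was exposed divided by the number in which only the control was exposed. The sketch below illustrates this special case. The function name and the pair counts are hypothetical and are not taken from the Kuwait study.

```python
import math


def matched_pair_odds_ratio(case_only_exposed, control_only_exposed, z=1.96):
    """Conditional odds ratio for 1:1 matched pairs with a binary exposure.

    Only discordant pairs contribute. Concordant pairs (both exposed or
    both unexposed) carry no information about the odds ratio.
    """
    if case_only_exposed <= 0 or control_only_exposed <= 0:
        raise ValueError("both discordant counts must be positive")
    estimate = case_only_exposed / control_only_exposed
    se_log = math.sqrt(1 / case_only_exposed + 1 / control_only_exposed)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    # McNemar chi-square statistic (1 degree of freedom, no continuity correction).
    total_discordant = case_only_exposed + control_only_exposed
    mcnemar = (case_only_exposed - control_only_exposed) ** 2 / total_discordant
    return estimate, lower, upper, mcnemar


# Hypothetical discordant pairs: 30 with only the case exposed,
# 10 with only the control exposed.
pair_or, pair_lower, pair_upper, pair_chi2 = matched_pair_odds_ratio(30, 10)
print(f"matched OR = {pair_or:.2f}, 95% CI {pair_lower:.2f} to {pair_upper:.2f}")
print(f"McNemar chi-square = {pair_chi2:.1f}")
```

With several covariates, as in the multivariable models the study reports, dedicated conditional logistic regression software is needed; this sketch covers only the one-exposure case.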

ii) Relation between periodontitis and COVID-19 infection

COVID-19 is associated with a heightened inflammatory response, which can be fatal. Periodontitis is characterized by systemic inflammation. In Qatar, patients with COVID-19 were identified from the national electronic health records of Hamad Medical Corporation (HMC). Patients with COVID-19 complications (death, ICU admission, or assisted ventilation) were classified as cases, while COVID-19 patients discharged without major complications were classified as controls. Controls were not matched to cases because all controls were included in the analysis. Periodontal condition was assessed using dental radiographs from the same database. The associations between periodontitis and COVID-19 complications were investigated using logistic regression models adjusted for demographic, medical, and behavioural variables. Of the 568 participants, 258 had periodontitis. COVID-19 complications occurred in 33 of the 258 patients with periodontitis, compared with only 7 of the 310 patients without periodontitis. Table 2 shows the unadjusted and adjusted odds ratios and 95 % confidence intervals for the association between periodontitis and COVID-19 complications. Periodontitis was significantly associated with a higher risk of COVID-19 complications, including ICU admission, the need for assisted ventilation, and death, as well as with higher blood levels of markers linked to a poor COVID-19 outcome, such as D-dimer, WBC, and CRP.

Table 2. Associations between periodontal condition and COVID-19 complications [3]

iii) Menstrual, reproductive and hormonal factors and thyroid cancer

This study investigated the relationships between menstrual, reproductive, and hormonal factors and thyroid cancer incidence in a population of Chinese women. A 1:1 matched hospital-based case-control study was conducted in 7 counties of Zhejiang Province to investigate the associations of diabetes mellitus and other variables with thyroid cancer. Cases were eligible if they were diagnosed with primary thyroid cancer for the first time in a hospital between July 2015 and December 2017. The patients and controls in this research were chosen at random. At enrolment, the interviewer collected all essential information face-to-face using a customized questionnaire. Descriptive statistics (frequencies and percentages) were used to summarize the baseline characteristics of the female participants. Univariate conditional logistic regression models were used to investigate the associations between the variables and thyroid cancer. Four multivariable conditional logistic regression models, each adjusted for covariates, were then used to investigate the relationships between menstrual, reproductive, and hormonal factors and thyroid cancer. In all, 2937 pairs of participants took part in the case-control study. The findings showed that a later age at first pregnancy and a longer duration of breastfeeding were significantly associated with a lower occurrence of thyroid cancer, which might shed light on the aetiology, monitoring, and prevention of thyroid cancer in Chinese women [4].

It is important to note that the term "case-control study" is commonly misunderstood. It is sometimes applied to a study that starts with a group of people exposed to something and a comparison group (control group) who have not been exposed, and then follows both groups over time to see what occurs. Such a study, however, is a cohort study, not a case-control study. Case-control studies are frequently seen as less valuable because they are retrospective. They can, however, be a highly effective way of detecting an association between an exposure and an outcome. In addition, they are sometimes the only ethical way to investigate an association. Case-control studies can provide useful information if definitions, controls, and the potential for bias are carefully considered.

[1] Setia, M. S. "Methodology Series Module 2: Case-control Studies." Indian Journal of Dermatology 61, no. 2 (2016): 146-151. doi:10.4103/0019-5154.177773

[2] El-Muzaini, H., Akhtar, S., and Alroughani, R. "A matched case-control study of risk factors associated with multiple sclerosis in Kuwait." BMC Neurology 20 (2020): 64. https://doi.org/10.1186/s12883-020-01635-1

[3] Marouf, N., Cai, W., Said, K. N., Daas, H., Diab, H., Chinta, V. R., Ait Hssain, A., Nicolau, B., Sanz, M., and Tamimi, F. "Association between periodontitis and severity of COVID-19 infection: A case-control study." Journal of Clinical Periodontology 48, no. 4 (2021): 483-491.

[4] Wang, M., Gong, W.-W., He, Q.-F., Hu, R.-Y., and Yu, M. "Menstrual, reproductive and hormonal factors and thyroid cancer: a hospital-based case-control study in China." BMC Women's Health 21, no. 1 (2021): 1-8.

Valid statistical inference methods for a case-control study with missing data

The main objective of this paper is to derive the valid sampling distribution of the observed counts in a case-control study with missing data, under the assumption of missing at random, by employing the conditional sampling method and the mechanism augmentation method. The proposed sampling distribution, called the case-control sampling distribution, can be used to calculate the standard errors of the maximum likelihood estimates of parameters via the Fisher information matrix and to generate independent samples for constructing small-sample bootstrap confidence intervals. Theoretical comparisons of the new case-control sampling distribution with two existing sampling distributions reveal a large difference. Simulations are conducted to investigate the influence of the three different sampling distributions on statistical inferences. One finding is that the conclusion of the Wald test of independence under the two existing sampling distributions can be completely different from (even contradictory to) that of the Wald test of equality of the success probabilities in the control and case groups under the proposed distribution. A real cervical cancer data set is used to illustrate the proposed statistical methods.

Keywords: Bootstrap methods; Wald test; case–control study; missing at random; the mechanism augmentation method.

Statistical methods for case-control and case-cohort studies with possibly correlated failure time data

Methoden der Statistik und Informatik in Epidemiologie und Diagnostik, pp 97–109

Statistical Methods for Cohort and Case-Control Studies

Part of the Medizinische Informatik und Statistik book series (MEDINFO,volume 40)

Traditional methods of occupational cohort analysis have used the standardized mortality ratio (SMR) as the fundamental measure of association between risk factor and disease. The SMR is shown here to result from maximum likelihood estimation in a multiplicative statistical model involving known national death rates. The same model permits regression analysis of variations in the SMR according to the intensity, type, or duration of exposure to environmental agents.
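
As a rough sketch of the measure described above: if the death rate in each stratum of the cohort is assumed to equal a common multiplier times the known reference rate, and deaths are treated as Poisson counts, the maximum likelihood estimate of the multiplier is observed deaths divided by expected deaths, which is the SMR. The function name, strata, person-years, and rates below are hypothetical illustrations and are not data from the paper.

```python
def standardized_mortality_ratio(observed_deaths, person_years, reference_rates):
    """SMR = observed deaths / deaths expected at the reference rates.

    person_years and reference_rates hold one entry per stratum
    (for example, per age group and calendar period).
    """
    if len(person_years) != len(reference_rates):
        raise ValueError("one reference rate is needed per stratum")
    expected = sum(py * rate for py, rate in zip(person_years, reference_rates))
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return observed_deaths / expected, expected


# Hypothetical cohort with three strata.
person_years = [1000.0, 2000.0, 1500.0]
reference_rates = [0.001, 0.005, 0.02]   # deaths per person-year
smr, expected = standardized_mortality_ratio(82, person_years, reference_rates)
print(f"expected deaths = {expected:.1f}, SMR = {smr:.2f}")
```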

A second method of analysis (Cox, 1972) results when the underlying death rates are treated as an unknown nuisance function. Case-control sampling from the “risk sets” formed during analysis leads to a third technique which is computationally more efficient than the other two.

All three methods yield roughly equivalent measures of the relative risk of respiratory cancer associated with arsenic trioxide exposure among a cohort of Montana smelter workers. Questions of efficiency, bias and cost in the selection of a method of analysis are discussed.
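
The idea of case-control sampling from risk sets can be sketched as follows: at each failure time, the case is compared with a few controls drawn at random from the cohort members still under observation at that time. This is a simplified illustration, not the procedure used in the paper. It assumes that all subjects enter follow-up at time zero, and the function name, the tuple layout, and the small cohort are hypothetical.

```python
import random


def sample_risk_set_controls(cohort, controls_per_case, seed=0):
    """Draw controls for each case from its risk set.

    cohort: list of (subject_id, exit_time, is_case) tuples.
    A subject is at risk at time t if its exit_time is at least t.
    Returns a list of (case_id, failure_time, sampled_control_ids).
    """
    rng = random.Random(seed)
    sampled = []
    for case_id, failure_time, is_case in cohort:
        if not is_case:
            continue
        # Everyone still under observation at the failure time, except the case.
        at_risk = [sid for sid, exit_time, _ in cohort
                   if exit_time >= failure_time and sid != case_id]
        k = min(controls_per_case, len(at_risk))
        sampled.append((case_id, failure_time, rng.sample(at_risk, k)))
    return sampled


# Hypothetical cohort: subjects "a" and "d" fail, the others are censored.
cohort = [("a", 2.0, True), ("b", 5.0, False), ("c", 1.0, False),
          ("d", 3.0, True), ("e", 4.0, False)]
risk_set_samples = sample_risk_set_controls(cohort, controls_per_case=2, seed=1)
for case_id, failure_time, controls in risk_set_samples:
    print(case_id, failure_time, sorted(controls))
```

Note that a subject who later becomes a case (here "d") can serve as a control for an earlier case, because it is still at risk at that earlier time.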

Research supported in part by USPHS grant 1 K07 CA00723 and the Alexander von Humboldt Foundation

Baker RJ and Nelder JA (1978). The GLIM System: Release 3. Oxford: Numerical Algorithms Group.

Berry G, Gilson JC, Holmes S, Lewisohn HC and Roach SA (1979). Asbestosis: a study of dose-response relationship in an asbestos textile factory. British Journal of Industrial Medicine 36, 98–112.

Breslow NE and Day NE (1980). Statistical Methods in Cancer Research I: The Analysis of Case-Control Studies. Lyon: IARC.

Cox DR (1972). Regression models and life tables (with discussion). Journal of the Royal Statistical Society Series B 34, 187–220.

Enterline PE (1976). Pitfalls in epidemiological research: an examination of the asbestos literature. Journal of Occupational Medicine 18, 150–156.

Fox AJ and Collier PF (1976). Low mortality rates in industrial cohort studies due to selection for work and survival in the industry. British Journal of Preventive and Social Medicine 30, 225–230.

Kalbfleisch JD and Prentice RL (1980). The Statistical Analysis of Failure Time Data. New York: Wiley.

Knox EG (1973). Computer simulation of industrial hazards. British Journal of Industrial Medicine 30, 54–63.

Lee AM and Fraumeni JF (1969). Arsenic and respiratory cancer in man. Journal of the National Cancer Institute 42, 1045–1052.

Lubin JH and Breslow NE (1983). Application of survival data methodology to occupational mortality studies. (Unpublished manuscript).

Mancuso TF and El-Attar AA (1967). Mortality pattern in a cohort of asbestos workers. Journal of Occupational Medicine 9, 147–162.

Mosteller F and Tukey JW (1977). Data Analysis and Regression. Reading: Addison-Wesley.

Prentice RL and Breslow NE (1978). Retrospective studies and failure time models. Biometrika 65, 153–158.

Rao CR (1965). Linear Statistical Inference and its Applications. New York: Wiley.

Yule GU (1934). On some points relating to vital statistics, more especially statistics of occupational mortality. Journal of the Royal Statistical Society 94, 1–84.

Author information

Authors and affiliations.

Department of Biostatistics, University of Washington, Seattle, USA

N. E. Breslow

Institute for Documentation, Information, and Statistics, German Cancer Research Center, Heidelberg, Germany

Editor information

Editors and affiliations.

Universitäts-Krankenhaus Eppendorf, Institut für Mathematik und Datenverarbeitung in der Medizin, Universität Hamburg, Martinistraße 52, 2000, Hamburg 20, Deutschland

J. Berger & K. H. Höhne

Additional information

Dedicated to Professor Dr. Otto Westphal on the occasion of his 70th birthday.

Copyright information

© 1983 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper.

Breslow, N.E. (1983). Statistical Methods for Cohort and Case-Control Studies. In: Berger, J., Höhne, K.H. (eds) Methoden der Statistik und Informatik in Epidemiologie und Diagnostik. Medizinische Informatik und Statistik, vol 40. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-81938-4_12

Publisher Name : Springer, Berlin, Heidelberg

Print ISBN : 978-3-540-12007-0

Online ISBN : 978-3-642-81938-4

Analysis of matched case-control studies

There are two common misconceptions about case-control studies: that matching in itself eliminates (controls) confounding by the matching factors, and that if matching has been performed, then a “matched analysis” is required. However, matching in a case-control study does not control for confounding by the matching factors; in fact it can introduce confounding by the matching factors even when it did not exist in the source population. Thus, a matched design may require controlling for the matching factors in the analysis. However, it is not the case that a matched design requires a matched analysis. Provided that there are no problems of sparse data, control for the matching factors can be obtained, with no loss of validity and a possible increase in precision, using a “standard” (unconditional) analysis, and a “matched” (conditional) analysis may not be required or appropriate.

Summary points

Matching in a case-control study does not control for confounding by the matching factors

A matched design may require controlling for the matching factors in the analysis

However, it is not the case that a matched design requires a matched analysis

A “standard” (unconditional) analysis may be most valid and appropriate, and a “matched” (conditional) analysis may not be required or appropriate

Matching on factors such as age and sex is commonly used in case-control studies. 1 This can be done for convenience (eg, choosing a control admitted to hospital on the same day as the case), to improve study efficiency by improving precision (under certain conditions) when controlling for the matching factors (eg, age, sex) in the analysis, or to enable control in the analysis of unquantifiable factors such as neighbourhood characteristics (eg, by choosing neighbours as controls and then controlling for neighbourhood in the analysis). The increase in efficiency occurs because it ensures similar numbers of cases and controls in confounder strata. For example, in a study of lung cancer, if controls are sampled at random from the source population, their age distribution will be much younger than that of the lung cancer cases. Thus, when age is controlled in the analysis, the young age stratum may contain mostly controls and few cases, whereas the old age stratum may contain mostly cases and fewer controls. Thus, statistical precision may be improved if controls are age matched to ensure roughly equal numbers of cases and controls in each age stratum.

There are two common misconceptions about case-control studies: that matching in itself eliminates confounding by the matching factors; and that if matching has been performed, then a “matched analysis” is required.

Matching in the design does not control for confounding by the matching factors. In fact, it can introduce confounding by the matching factors even when it did not exist in the source population. 1 The reasons for this are complex and will only be discussed briefly here. In essence, the matching process makes the controls more similar to the cases not only for the matching factor but also for the exposure itself. This introduces a bias that needs to be controlled in the analysis. For example, suppose we were conducting a case-control study of poverty and death (from any cause), and we chose siblings as controls (that is, for each person who died, we matched on family or residence by choosing a sibling who was still alive as a control). In this situation, since poverty runs in families we would tend to select a disadvantaged control for each disadvantaged person who had died and a wealthy control for each wealthy person who had died. We would find roughly equal percentages of disadvantaged people among the cases and controls, and we would find little association between poverty and mortality. The matching has introduced a bias, which fortunately (as we will illustrate) can be controlled by controlling for the matching factor in the analysis.

Thus, a matched design will (almost always) require controlling for the matching factors in the analysis. However, this does not necessarily mean that a matched analysis is required or appropriate, and it will often be sufficient to control for the matching factors using simpler methods. Although this is well recognised in both recent 2 3 and historical 4 5 texts, other texts 6 7 8 9 do not discuss this issue and present the matched analysis as the only option for analysing matched case-control studies. In fact, the more standard analysis may not only be valid but may be much easier in practice, and yield better statistical precision.

In this paper I explore and illustrate these problems using a hypothetical pair matched case-control study.

Options for analysing case-control studies

Unmatched case-control studies are typically analysed using the Mantel-Haenszel method 10 or unconditional logistic regression. 4 The former involves the familiar method of producing a 2×2 (exposure-disease) stratum for each level of the confounder (eg, if there are five age groups and two sex groups, then there will be 10 2×2 tables, each showing the association between exposure and disease within a particular stratum), and then producing a summary (average) effect across the strata. The Mantel-Haenszel estimates are robust and not affected by small numbers in specific strata (provided that the overall numbers of exposed or non-exposed cases or controls are adequate), although it can be difficult or impossible to control for factors other than the matching factors if some strata involve small numbers (eg, just one case and one control). Furthermore, the Mantel-Haenszel approach works well when there are only a few confounder strata, but will experience problems of small numbers (eg, strata with only cases and no controls) if there are too many confounders to adjust for. In this situation, logistic regression may be preferred, since this uses maximum likelihood methods, which enable the adjustment (given certain assumptions) of more confounders.
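As a rough illustration of the stratified calculation described above, the following sketch computes a crude odds ratio and a Mantel-Haenszel summary odds ratio from a list of 2×2 tables. The counts are hypothetical (they are not the data of table 1), and the function names are illustrative only.

```python
def crude_odds_ratio(strata):
    """Odds ratio after collapsing all strata into a single 2x2 table."""
    a = sum(s[0] for s in strata)  # exposed cases
    b = sum(s[1] for s in strata)  # unexposed cases
    c = sum(s[2] for s in strata)  # exposed controls
    d = sum(s[3] for s in strata)  # unexposed controls
    return (a * d) / (b * c)


def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel summary odds ratio across confounder strata.

    Each stratum contributes a*d/n to the numerator and b*c/n to the
    denominator, where n is the stratum total.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Hypothetical strata, each given as
# (exposed cases, unexposed cases, exposed controls, unexposed controls).
# Every stratum has a stratum-specific odds ratio of 2.
strata = [
    (10, 20, 30, 120),
    (20, 20, 20, 40),
    (40, 10, 40, 20),
]

crude = crude_odds_ratio(strata)
adjusted = mantel_haenszel_odds_ratio(strata)
print(f"crude OR = {crude:.2f}, Mantel-Haenszel OR = {adjusted:.2f}")
```

With these made-up counts the crude and adjusted estimates differ because the stratifying factor is associated with both exposure and case status; the summary estimate recovers the common stratum-specific odds ratio.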

Suppose that for each case we have chosen a control who is in the same five year age group (eg, if the case is aged 47 years, then a control is chosen who is aged 45-49 years). We can then perform a standard analysis, which adjusts for the matching factor (age group) by grouping all cases and controls into five year age groups and using unconditional logistic regression 4 (or the Mantel-Haenszel method 10 ); if there are eight age groups then this analysis will just have eight strata (represented by seven age group dummy variables), each with multiple cases and controls. Alternatively we can perform a matched analysis (that is, retaining the pair matching of one control for each case) using conditional logistic regression (or the matched data methods, which are equivalent to the Mantel-Haenszel method); if there are 100 case-control pairs, this analysis will then have 100 strata.

The main reason for using conditional (rather than unconditional) logistic regression is that when the analysis strata are very small (eg, with just one case and one control for each stratum), problems of sparse data will occur with unconditional methods. 11 For example, if there are 100 strata, this requires 99 dummy variables to represent them, even though there are only 200 study participants. In this extreme situation, unconditional logistic regression is biased and produces an odds ratio estimate that is the square of the conditional (true) estimate of the odds ratio. 5 12
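The squaring of the odds ratio can be checked numerically for 1:1 matched pairs. The sketch below uses hypothetical counts of discordant pairs (`n10`: case exposed, control unexposed; `n01`: the reverse). The conditional estimate is the ratio of discordant pairs. For the unconditional model with one intercept per pair, each intercept is profiled out analytically and the remaining score equation is solved by bisection; the function names and the bisection approach are illustrative choices, not taken from the cited texts.

```python
import math


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def conditional_or(n10, n01):
    """Conditional ML estimate for 1:1 matched pairs: ratio of discordant pairs."""
    return n10 / n01


def unconditional_or(n10, n01, lo=-20.0, hi=20.0, tol=1e-12):
    """Unconditional ML estimate when every pair has its own intercept.

    For a discordant pair the intercept that maximises the likelihood is
    -beta/2, and concordant pairs carry no information about beta. The
    profile score in beta is n10*(1 - p) - n01*p with p = expit(beta/2),
    which is decreasing in beta, so bisection finds its root.
    """
    def score(beta):
        p = expit(beta / 2.0)
        return n10 * (1.0 - p) - n01 * p

    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.exp((lo + hi) / 2.0)


# Hypothetical discordant-pair counts.
n10, n01 = 40, 20
or_conditional = conditional_or(n10, n01)
or_unconditional = unconditional_or(n10, n01)
print(or_conditional, round(or_unconditional, 6))
```

In this example the unconditional estimate equals the square of the conditional one, which is the sparse data bias described above.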

Example of age matching

Table 1 gives an example of age matching in a population based case-control study, and shows the “true” findings for the total population, the findings for the corresponding unmatched case-control study, and the findings for an age matched case-control study using the standard analysis. Table 2 presents the findings for the same age matched case-control study using the matched analysis. All analyses were performed using the Mantel-Haenszel method, but this yields similar results to the corresponding (unconditional or conditional) logistic regression analyses.

Table 1: Hypothetical study population and case-control study with unmatched and matched standard analyses

Table 2: Hypothetical matched case-control study with matched analysis

Table 1 shows that the crude odds ratio in the total population is 0.86 (0.70 to 1.05), but this changes to 2.00 (1.59 to 2.51) when the analysis is adjusted for age (using the Mantel-Haenszel method). This occurs because there is strong confounding by age: the cases are mostly old, and old people have a lower exposure than young people. Overall, there are 390 cases, and when 390 controls are selected at random from the non-cases in the total population (which is half exposed and half not exposed), this yields the same crude (0.86) and adjusted (2.00) odds ratios, but with wider confidence intervals, reflecting the smaller numbers of non-cases (controls) in the case-control study.

Why matching factors need to be controlled in the analysis

Now suppose that we reconduct the case-control study, matching for age, using two very broad age groups: old and young (table 1). The numbers of cases and controls in each age group are now equal. However, the crude odds ratio (1.68, 1.25 to 2.24) is different from both the crude (0.86) and the adjusted (2.00) odds ratios in the total population. In contrast, the adjusted odds ratio (2.00) is the same as that in the total population and in the unmatched case-control study (both of these adjusted odds ratios were estimated using the standard approach). Thus, matching has not removed age confounding and it is still necessary to control for age (this occurs because the matching process in a case-control study changes the association between the matching factor and the outcome and can create an association even if there were none before the matching was conducted). However, there is a small increase in precision in the matched case-control study compared with the unmatched case-control study (95% confidence intervals of 1.42 to 2.81 compared with 1.38 to 2.89) because there are now equal numbers of cases and controls in each age group (table 1).
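The same pattern can be reproduced with a small calculation. The sketch below starts from a hypothetical source population with two age strata (the counts are invented and are not those of table 1), forms the expected 2×2 tables when one control per case is drawn at random from the non-cases in the same age stratum, and compares the crude and Mantel-Haenszel odds ratios. All names are illustrative.

```python
def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)


def mh_odds_ratio(tables):
    """Mantel-Haenszel summary odds ratio across strata."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


# Hypothetical source population, per age stratum:
# (exposed cases, unexposed cases, exposed non-cases, unexposed non-cases).
# The odds ratio is 2 in each stratum; the old have more cases and less exposure.
population = {
    "young": (30, 10, 180, 120),
    "old": (50, 150, 20, 120),
}


def expected_matched_tables(population):
    """Expected tables when controls are age matched 1:1 to the cases."""
    tables = []
    for a, b, c, d in population.values():
        n_controls = a + b           # matching: as many controls as cases
        p_exposed = c / (c + d)      # exposure prevalence among non-cases
        tables.append((a, b, n_controls * p_exposed, n_controls * (1 - p_exposed)))
    return tables


population_crude = odds_ratio(*(sum(col) for col in zip(*population.values())))
matched = expected_matched_tables(population)
matched_crude = odds_ratio(*(sum(col) for col in zip(*matched)))
matched_adjusted = mh_odds_ratio(matched)
print(f"population crude OR = {population_crude:.2f}")
print(f"matched crude OR    = {matched_crude:.2f}")
print(f"matched adjusted OR = {matched_adjusted:.2f}")
```

In this invented example the crude odds ratio in the matched data differs from both the population crude and the stratum-specific value, while the age adjusted estimate recovers the stratum-specific odds ratio, which is the behaviour described in the paragraph above.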

A pair matched study does not necessarily require a pair matched analysis

However, control for simple matching factors such as age does not require a pair matched analysis. Table 2 gives the findings that would have been obtained from a pair matched analysis (this is created by assuming that in each age group, and for each case, the control was selected at random from all non-cases in the same age group). The standard adjusted (Mantel-Haenszel) analysis (table 1) yields an odds ratio of 2.00 (95% confidence interval 1.42 to 2.81); the matched analysis (table 2) yields the same odds ratio (2.00) but with a slightly wider confidence interval (1.40 to 2.89).

Advantages of the standard analysis

So for many matched case-control studies, we have a choice of doing a standard analysis or a matched analysis. In this situation, there are several possible advantages of using the standard approach.

The standard analysis can actually yield slightly better statistical precision. 13 This may apply, for example, if two or more cases and their matched controls all have identical values for their matching factors; then combining them into a single stratum produces an estimator with lower variance and no less validity 14 (as indicated by the slightly narrower confidence interval for the standard adjusted analysis (table 1) compared with the pair matched analysis (table 2)). This particularly occurs because combining strata with identical values for the matching factors (eg, if two case-control pairs all concern women aged 55-59 years) may mean that fewer data are discarded (that is, do not contribute to the analysis) because of strata where the case and control have the same exposure status. Further gains in precision may be obtained if combining strata means that cases with no corresponding control (or controls without a corresponding case) can be included in the analysis. When such strata are combined, a conditional analysis may still be required if the resulting strata are still “small,” 13 but an unconditional analysis will be valid and yield similar findings if the resulting strata are sufficiently large. This may often be the case when matching has only been performed on standard factors such as sex and age group.

The standard analysis may also enhance the clarity of the presentation, particularly when analysing subgroups of cases and controls selected for variables on which they were not matched, since it involves standard 2×2 tables for each subgroup. 15

A further advantage of the standard analysis is that it makes it easier to combine different datasets that have involved matching on different factors (eg, if some have matched for age, some for age and sex, and some for nothing, then all can be combined in an analysis adjusting for age, sex, and study centre). In contrast, one multicentre study 16 (of which I happened to be a coauthor) attempted to (unnecessarily) perform a matched analysis across centres. Because not all centres had used pair matching, this involved retrospective pair matching in those centres that had not matched as part of the study design. This resulted in the unnecessary discarding of the unmatched controls, thus resulting in a likely loss of precision.

Conclusions

If matching is carried out on a particular factor such as age in a case-control study, then controlling for it in the analysis must be considered. This control should involve just as much precision as was used in the original matching 14 (eg, if exact age in years was used in the matching, then exact age in years should be controlled for in the analysis), although in practice such rigorous precision may not always be required (eg, five year age groups may suffice to control confounding by age, even if age matching was done more precisely than this). In some circumstances, this control may make no difference to the main exposure effect estimate—eg, if the matching factor is unrelated to exposure. However, if there is an association between the matching factor and the exposure, then matching will introduce confounding that needs to be controlled for in the analysis.

So when is a pair matched analysis required? The answer is, when the matching was genuinely at (or close to) the individual level. For example, if siblings have been chosen as controls, then each stratum would have just one case and the sibling control; in this situation, an unconditional logistic regression analysis would suffer from problems of sparse data, and conditional logistic regression would be required. Similar situations might arise if controls were neighbours or from the same general practice (if each general practice only had one or a few cases), or if matching was performed on many factors simultaneously so that most strata (in the standard analysis) had just one case and one control.

Provided, however, that there are no problems of sparse data, such control for the matching factors can be obtained using an unconditional analysis, with no loss of validity and a possible increase in precision.

Thus, a matched design will (nearly always) require controlling for the matching factors in the analysis. It is not the case, however, that a matched design requires a matched analysis.

I thank Simon Cousens, Deborah Lawlor, Lorenzo Richiardi, and Jan Vandenbroucke for their comments on the draft manuscript. The Centre for Global NCDs is supported by the Wellcome Trust Institutional Strategic Support Fund, 097834/Z/11/B.

Competing interests: I have read and understood the BMJ policy on declaration of interests and declare the following: none.

Provenance and peer review: Not commissioned; externally peer reviewed.

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/ .


Handbook of Statistical Methods for Case-Control Studies (Chapman & Hall/CRC Handbooks of Modern Statistical Methods) 1st Edition


Handbook of Statistical Methods for Case-Control Studies is written by leading researchers in the field. It provides an in-depth treatment of up-to-date and currently developing statistical methods for the design and analysis of case-control studies, as well as a review of classical principles and methods. The handbook is designed to serve as a reference text for biostatisticians and quantitatively-oriented epidemiologists who are working on the design and analysis of case-control studies or on related statistical methods research. Though not specifically intended as a textbook, it may also be used as a backup reference text for graduate level courses.

About the Editors

Ørnulf Borgan is Professor of Statistics, University of Oslo. His book with Andersen, Gill and Keiding on counting processes in survival analysis is a world classic.

Norman E. Breslow was, at the time of his death, Professor Emeritus in Biostatistics, University of Washington. For decades, his book with Nick Day has been the authoritative text on case-control methodology.

Nilanjan Chatterjee is Bloomberg Distinguished Professor, Johns Hopkins University. He leads a broad research program in statistical methods for modern large scale biomedical studies.

Mitchell H. Gail is a Senior Investigator at the National Cancer Institute. His research includes modeling absolute risk of disease, intervention trials, and statistical methods for epidemiology.

Alastair Scott was, at the time of his death, Professor Emeritus of Statistics, University of Auckland. He was a major contributor to using survey sampling methods for analyzing case-control data.

Chris J. Wild is Professor of Statistics, University of Auckland. His research includes nonlinear regression and methods for fitting models to response-selective data.


Editorial Reviews

"This book is essential reading and reference for any statistical methodologist with interest in case-control studies...This book is a very good place to start on the next leg of our statistical journey in this field." ~Nicholas P. Jewell , ISCB Newsletter

" . . . as a handbook, it is designed to address specific methodological issues, more like a toolbox. And this is done well. All chapters come with an introduction and a worked example using sample data, with ample reference to further details. Occasional chapters on unconventional study designs provide food for thought. Overall, the book is well written and very comprehensive; it provides help for many situations, and for situations of greater complexity it points to further references." ~Anika Hüsing, Biometrical Journal


Handbook of Statistical Methods for Case-Control Studies

The Handbook of Statistical Methods for Case-Control Studies is written by leading researchers in the field and published by Chapman & Hall/CRC Press (2018). The handbook provides an in-depth treatment of up-to-date and currently developing statistical methods for the design and analysis of case-control studies, as well as a review of classical principles and methods. The handbook is designed to serve as a reference text for biostatisticians and quantitatively-oriented epidemiologists who are working on the design and analysis of case-control studies or on related statistical methods research.

This website provides supplementary materials for some of the chapters of the handbook. The website is maintained by Ørnulf Borgan.


SUPPLEMENTARY MATERIALS 

Chapter 8: Small Sample Methods

Chapter 12: Multi-Phase Sampling

Chapter 13: Calibration in Case-Control Studies

Chapter 17: Survival Analysis of Case-Control Data: A Sample Survey Approach

Chapter 18: Nested Case-Control Studies: A Counting Process Approach

Chapter 19: Inverse Probability Weighting in Nested Case-Control Studies

Chapter 20: Multiple Imputation for Sampled Cohort Data

Chapter 21: Maximum Likelihood Estimation for Case-Cohort and Nested Case-Control Studies

Chapter 22: The Self-Controlled Case Series Method


Handbook of Statistical Methods for Case-Control Studies


Table of contents

About the Editors

List of Contributors

I Introduction

1 Origins of the Case-Control Study (Norman E. Breslow, Noel Weiss)

2 Design Issues in Case-Control Studies (Duncan C. Thomas)

II Classical Case-Control Studies

3 Basic Concepts and Analysis (Barbara McKnight)

4 Matched Case-Control Studies (Barbara McKnight)

5 Multiple Case or Control Groups (Barbara McKnight)

6 Causal Inference from Case-Control Studies (Vanessa Didelez, Robin J. Evans)

7 The Case-Crossover Study Design in Epidemiology (Joseph A. "Chris" Delaney, Samy Suissa)

8 Small Sample Methods (Jinko Graham, Brad McNeney, Robert Platt)

9 Power and Sample Size for Case-Control Studies (Mitchell H. Gail, Sebastien Haneuse)

10 Measurement Error and Case-Control Studies (Raymond J. Carroll)

III Case-Control Studies that Use Full-Cohort Information

11 Alternative Formulation of Models in Case-Control Studies (William E. Barlow, John B. Cologne)

12 Multi-Phase Sampling (Gustavo Amorim, Alastair J. Scott, Chris J. Wild)

13 Calibration in Case-Control Studies (Thomas Lumley)

14 Secondary Analysis of Case-Control Data (Chris J. Wild)

15 Response Selective Study Designs Using Existing Longitudinal Cohorts (Paul J. Rathouz, Jonathan S. Schildcrout, Leila R. Zelnick, Patrick J. Heagerty)

IV Case-Control Studies for Time-to-Event Data

16 Cohort Sampling for Time-to-Event Data: An Overview (Ørnulf Borgan, Sven Ove Samuelsen)

17 Survival Analysis of Case-Control Data: A Sample Survey Approach (Norman E. Breslow, Jie Kate Hu)

18 Nested Case-Control Studies: A Counting Process Approach (Ørnulf Borgan)

19 Inverse Probability Weighting in Nested Case-Control Studies (Sven Ove Samuelsen, Nathalie Støer)

20 Multiple Imputation for Sampled Cohort Data (Ruth H. Keogh)

21 Maximum Likelihood Estimation for Case-Cohort and Nested Case-Control Studies (Donglin Zeng, Dan-Yu Lin)

22 The Self-Controlled Case Series Method (Paddy Farrington, Heather Whitaker)

V Case-Control Studies in Genetic Epidemiology

23 Case-Control Designs for Modern Genome-Wide Association Studies: Basic Principles and Overview (Nilanjan Chatterjee)

24 Analysis of Gene-Environment Interactions (Summer S. Han, Raymond J. Carroll, Nilanjan Chatterjee)

25 Two-Stage Testing for Genome-Wide Gene-Environment Interactions (James Y. Dai, Li Hsu, Charles Kooperberg)

26 Family-Based Case-Control Approaches to Study the Role of Genetics (Clarice R. Weinberg, Min Shi, David M. Umbach)

27 Mixed Models for Case-Control Genome-Wide Association Studies: Major Challenges and Partial Solutions (David Golan, Saharon Rosset)

28 Analysis of Secondary Phenotype Data under Case-Control Designs (Guoqing Diao, Donglin Zeng, Dan-Yu Lin)


Statistical methods for biomarker data pooled from multiple nested case–control studies

Abigail Sloan

1 Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA

Stephanie A Smith-Warner

2 Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA

2a Department of Nutrition, Harvard T. H. Chan School of Public Health, Boston, MA, USA

Regina G Ziegler

3 Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA

4a Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA

4b Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, and Harvard Medical School, Boston, MA, USA


Pooling biomarker data across multiple studies allows for examination of a wider exposure range than generally possible in individual studies, evaluation of population subgroups and disease subtypes with more statistical power, and more precise estimation of biomarker-disease associations. However, circulating biomarker measurements often require calibration to a single reference assay prior to pooling due to assay and laboratory variability across studies. We propose several methods for calibrating and combining biomarker data from nested case–control studies when reference assay data are obtained from a subset of controls in each contributing study. Specifically, we describe a two-stage calibration method and two aggregated calibration methods, named the internalized and full calibration methods, to evaluate the main effect of the biomarker exposure on disease risk and whether that association is modified by a potential covariate. The internalized method uses the reference laboratory measurement in the analysis when available and otherwise uses the estimated value derived from calibration models. The full calibration method uses calibrated biomarker measurements for all subjects, including those with reference laboratory measurements. Under the two-stage method, investigators complete study-specific analyses in the first stage followed by meta-analysis in the second stage. Our results demonstrate that the full calibration method is the preferred aggregated approach to minimize bias in point estimates. We also observe that the two-stage and full calibration methods provide similar effect and variance estimates but that their variance estimates are slightly larger than those from the internalized approach. 
As an illustrative example, we apply the three methods in a pooling project of nested case–control studies to evaluate (i) the association between circulating vitamin D levels and risk of stroke and (ii) how body mass index modifies the association between circulating vitamin D levels and risk of cardiovascular disease.

1. Introduction

Combining data from multiple studies to maximize sample size has become a common strategy to quantify exposure-disease associations, including those where the exposure is a biomarker. Increased sample sizes facilitate subgroup and tumor subtype analyses, allow more precise estimation of the biomarker exposure effect over a wider range of biomarker measurements, and avoid issues related to data sparsity (Key and others, 2010; Smith-Warner and others, 2006). The increase in the use of pooling consortia over time reflects the availability and advantages of big data in epidemiology and its promises to improve quantification of disease risk factors. Note that we use the term pooling throughout this article to refer to combination of data from individual participants and not physical specimen combination. Here, we define biomarkers as measurable indicators of health at the molecular, biochemical, or cellular level (Key and others, 2010). Examples include proteins, antibodies, hormones, and lipids. Many consortia have analyzed biomarker-disease associations, including the Endogenous Hormones, Nutritional Biomarkers, and Prostate Cancer Collaborative Group (Key and others, 2010), the COPD Biomarkers Qualification Consortium Database (Tabberer and others, 2017), and the Circulating Biomarkers and Breast and Colorectal Cancer Consortium (McCullough and others, 2018), among others. The participating cohorts in many consortia have employed nested case–control studies using individual or frequency matching to improve efficiency.

An important consideration when conducting pooled analyses of biomarker measurements from different studies is whether the measurements differ across studies due to real differences in the underlying populations or due to usage of different assays, kits, or laboratories in some or all studies. This consideration is particularly important when samples in the pooled analysis were assayed at different laboratories with different assays at different times. Examples of biomarkers with highly variable measurements across assays and laboratories include estradiol, testosterone, and insulin-like growth factor 1 (Key and others, 2010). Measurements of circulating 25-hydroxyvitamin D (25(OH)D) also vary up to 40% between laboratories and assays (Lai and others, 2012).

For consortial projects of biomarkers that do not use a single assay and laboratory, investigators must address potential between-study variation in biomarker measurements. Critically, to quantify risk associated with per-unit increases in the biomarker, a common metric for the biomarker data must be used in each of the contributing studies. One strategy used to harmonize biomarker measurements involves study-specific calibration models. In this method, a random subset of biospecimens from each study is reassayed at a designated reference laboratory. A study-specific calibration model is estimated in each study between the original “local” laboratory measurements and reference laboratory measurements. The resulting calibration equation is then used to estimate the reference laboratory biomarker measurement from the local laboratory measurement for all cases and controls in the individual study. Following the calibration procedure, the harmonized biomarker measurements can be modeled using categories defined by absolute concentrations, consortium-wide quantiles, or continuously. In practice, re-assayed biospecimens are typically selected at random from controls in each study owing to concerns about the availability of case biospecimens (Sloan and others, 2019).
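A minimal sketch of a study-specific calibration step follows. It assumes, purely for illustration, that the calibration model is a simple linear regression of the reference laboratory measurement on the local laboratory measurement, fitted by least squares in the re-assayed subset; the text above does not fix the functional form. The measurements and function names are hypothetical.

```python
def fit_calibration(local, reference):
    """Least-squares line: reference ~ intercept + slope * local.

    `local` and `reference` are paired measurements from the subset of
    biospecimens that were re-assayed at the reference laboratory.
    """
    n = len(local)
    mean_x = sum(local) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in local)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(local, reference))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def calibrate(local_values, intercept, slope):
    """Estimated reference-laboratory value for each local measurement."""
    return [intercept + slope * x for x in local_values]


# Hypothetical re-assayed controls from one study (arbitrary units).
local_subset = [40.0, 50.0, 60.0, 70.0]
reference_subset = [45.0, 54.0, 63.0, 72.0]

intercept, slope = fit_calibration(local_subset, reference_subset)

# Apply the calibration equation to every case and control in that study.
all_local = [35.0, 55.0, 80.0]
calibrated = calibrate(all_local, intercept, slope)
print(intercept, slope, calibrated)
```

A separate intercept and slope would be estimated for each contributing study, so that all studies are expressed on the reference laboratory scale before pooling.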

Two major classes of methods exist for analyzing data pooled from multiple studies, namely the two-stage approach and the aggregated approach (Debray and others, 2013; Smith-Warner and others, 2006). Under the two-stage method, investigators complete study-specific analyses using standardized criteria in the first stage, followed by meta-analysis in the second stage. In the aggregated approach, investigators combine harmonized data from all studies into a single dataset before performing statistical analyses on the combined dataset. Sloan and others (2019) developed pooling methodology for cohort studies and subdivided the aggregated approach into the internalized and full calibration approaches. The internalized method uses the reference laboratory measurement in the analysis when available and the calibrated measurement otherwise. In contrast, the full calibration method uses calibrated biomarker measurements exclusively for all subjects regardless of the availability of reference laboratory measurements. In this article, we derive these approaches under the paradigm of nested case–control studies, allowing the potential inclusion of a biomarker–covariate interaction term.
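The difference between the two aggregated approaches comes down to which exposure value enters the analysis for each subject. The sketch below illustrates this under the assumption of a linear calibration; the measurements and the calibration estimates `a_hat` and `b_hat` are hypothetical, and missing reference measurements are coded as NaN purely for illustration.

```python
import numpy as np

def full_calibration(local, intercept, slope):
    """Use the calibrated value for every subject."""
    return intercept + slope * np.asarray(local, dtype=float)

def internalized(local, reference, intercept, slope):
    """Use the reference measurement when available (not NaN),
    and the calibrated value otherwise."""
    reference = np.asarray(reference, dtype=float)
    calibrated = intercept + slope * np.asarray(local, dtype=float)
    return np.where(np.isnan(reference), calibrated, reference)

# Hypothetical subjects; two of the four were re-assayed at the reference lab.
local = np.array([40.0, 55.0, 70.0, 85.0])
reference = np.array([np.nan, 52.0, np.nan, 80.0])
a_hat, b_hat = 5.0, 0.9  # illustrative calibration estimates

x_full = full_calibration(local, a_hat, b_hat)
x_internalized = internalized(local, reference, a_hat, b_hat)
```

The two exposure vectors differ only for subjects in the calibration subset, which is why the choice matters more as that subset grows.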

The methods developed here are agnostic to the type of assay being used or biomarker being measured, so long as investigators have access to reference assay measurements for a subset of individuals at each local laboratory and can model the relationship between reference laboratory measurements and local laboratory measurements. Variation in the assays or laboratories is captured in these study-specific models.

We can equivalently view pooled and calibrated biomarker data as a covariate measurement error problem. If we treat the reference and local laboratory measurements as the true and surrogate biomarker values, respectively (Carroll and others, 2006), we can envision each study-specific calibration model as a different measurement error model. We leverage an existing strategy in the measurement error literature, namely regression calibration (Carroll and others, 2006; Rosner and others, 1990), to form the basis of our methods. Although each of our methods is classified as either a two-stage or an aggregated approach, each utilizes concepts underlying regression calibration.

In this article, we propose calibration methods for pooled biomarker data from nested case–control studies that allow inference on the main effect of the biomarker in addition to biomarker–covariate interaction terms. Section 2 presents the models and statistical methods. Section 3 compares the methods via simulation and considers the inclusion of a covariate–biomarker interaction term. Section 4 illustrates the methods in examples involving 25(OH)D data pooled from the Nurses’ Health Study I (NHS1), Nurses’ Health Study II (NHS2), and Health Professionals Follow-up Study (HPFS) for stroke and cardiovascular disease (CVD) outcomes. Section 5 discusses our results.

2.1. Model and approximate conditional likelihood

equation M1

To estimate the biomarker exposure effect under the aggregated approach, we develop a likelihood-based method. The conditional logistic regression model for the biomarker–disease association is

equation M17

Under aggregation, the likelihood contribution from a stratum with only local laboratory biomarker measurements is

equation M31
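For orientation, the textbook form of a conditional likelihood contribution under regression calibration is sketched here. This is a generic illustration for 1:1 matching with a linear calibration and is not a reconstruction of the article's own equations; the symbols $x_{jk}$, $w_{jk}$, $\hat{a}_s$, and $\hat{b}_s$ are notation assumed for this sketch.

```latex
% Standard 1:1 matched conditional likelihood contribution for stratum j,
% with x_{j1} the case's and x_{j0} the control's reference-scale value.
L_j(\beta) = \frac{\exp(\beta x_{j1})}{\exp(\beta x_{j1}) + \exp(\beta x_{j0})}

% Regression calibration substitution when only local measurements w are
% observed, using calibration estimates \hat{a}_s and \hat{b}_s for study s:
\hat{x}_{jk} = \hat{a}_s + \hat{b}_s w_{jk}, \qquad k \in \{0, 1\}

% Resulting approximate contribution (the intercept cancels within a stratum
% when case and control come from the same study):
L_j(\beta) \approx \frac{\exp(\beta \hat{b}_s w_{j1})}
                        {\exp(\beta \hat{b}_s w_{j1}) + \exp(\beta \hat{b}_s w_{j0})}
```

Because the calibration intercept cancels within a matched set drawn from one study, only the calibration slope affects the biomarker coefficient in this simplified form.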

2.2. Calibration model

equation M45

We assume a linear relationship between the reference and local laboratory measurements among the matched cases and controls such that

equation M57

2.3. Parameter estimation under aggregated approach

equation M94

2.4. Two-stage approach

The two-stage approach for pooled data uses regression calibration, a broadly applicable method initially developed in the measurement error literature, to adjust for calibration in the first-stage study-specific analyses (Carroll and others, 1990, 2006; Rosner and others, 1990; Spiegelman and others, 1997, 2001). The second stage combines these estimates using fixed-effects meta-analysis.
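The two stages can be sketched as follows. This is a simplified illustration, assuming a linear calibration with a single biomarker term, a delta-method variance for the corrected estimate, and inverse-variance weighting in the second stage; the function names and the numeric inputs are hypothetical and do not come from the source.

```python
import numpy as np

def regression_calibration(beta_naive, var_naive, slope, var_slope):
    """Stage 1: rescale a naive log odds ratio (per unit of the local
    measurement) by the calibration slope, with a delta-method variance."""
    beta = beta_naive / slope
    var = var_naive / slope**2 + beta_naive**2 * var_slope / slope**4
    return beta, var

def fixed_effects_pool(betas, variances):
    """Stage 2: inverse-variance weighted fixed-effects estimate."""
    betas = np.asarray(betas, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    pooled = np.sum(weights * betas) / np.sum(weights)
    return pooled, 1.0 / np.sum(weights)

# Hypothetical inputs for three studies:
# (naive beta, its variance, calibration slope, slope variance).
studies = [(-0.010, 4e-6, 0.90, 1e-3),
           (-0.008, 6e-6, 1.10, 2e-3),
           (-0.012, 5e-6, 0.95, 1e-3)]
corrected = [regression_calibration(*s) for s in studies]
pooled_beta, pooled_var = fixed_effects_pool([b for b, _ in corrected],
                                             [v for _, v in corrected])
```

Studies with more precise corrected estimates receive more weight, and uncertainty in each calibration slope propagates into the corresponding study's variance.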

equation M122

2.5. Two-stage method for models with an interaction term

equation M136

3. Simulations

3.1. Model without an interaction term

equation M157

We also performed simulations that fixed the total sample size at 1000 participants while varying the calibration subset size among 30, 50, and 150 subjects (3%, 5%, and 15% participation rates, respectively). As shown in Figure 1, at all calibration study sizes, the full calibration method offered nearly unbiased point estimates. With larger calibration study sizes, the MSEs decreased as a result of the improvement in efficiency. However, the internalized method estimates showed increasing downward bias as the proportion of subjects participating in the calibration subset increased, owing to increasingly differential calibration of cases and controls. As calibration study size increased, the two-stage point estimates became less biased and their MSEs decreased, owing to less biased and more efficient estimates of the calibration parameters.
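The two performance measures reported for these simulations, percent bias and MSE, can be computed from the estimates across simulation replicates. The sketch below uses the usual definitions; the replicate values and the true parameter are made up for illustration and are not the article's results.

```python
import numpy as np

def percent_bias(estimates, truth):
    """Percent bias: 100 * (mean estimate - truth) / truth."""
    return 100.0 * (np.mean(estimates) - truth) / truth

def mse(estimates, truth):
    """Mean squared error of the estimates around the true value."""
    return np.mean((np.asarray(estimates, dtype=float) - truth) ** 2)

# Hypothetical replicate estimates of a log odds ratio whose true value is 1.0.
replicates = [0.8, 0.9, 1.0, 1.1]
bias_pct = percent_bias(replicates, 1.0)
error = mse(replicates, 1.0)
```

A negative percent bias corresponds to attenuation toward zero when the true coefficient is positive.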

Figure 1. Comparison of methods as the number of participants in the calibration study increases. The number of subjects in each study remains fixed at 1000, or equivalently, 500 case–control pairs. The calibration study participation rates considered are 3%, 5%, and 15%, or 30, 50, and 150 individuals, respectively. Panels a–c depict the percent bias of the parameter estimate, while panels d–f display the MSE of the estimate.

3.2. Model with an interaction term

equation M222

4. Applied example

We completed two data examples to illustrate the methods. In the first example, we investigate the impact of circulating 25-hydroxyvitamin D (25(OH)D) levels on risk of stroke. In the second example, we investigate the impact of 25(OH)D levels and their interaction with a dichotomized body mass index (BMI) term on the risk of a composite outcome of fatal or nonfatal stroke or myocardial infarction (henceforth referred to as the CVD endpoint). In both examples, we match each case to a single control based on sex and age at blood draw.

We applied the two aggregated methods (i.e. full calibration and internalized), two-stage, and naive methods to data combined from three large prospective cohort studies in the United States, including the HPFS (Wu and others, 2011), the NHS1 (Eliassen and others, 2016), and the NHS2 (Eliassen and others, 2011). The HPFS began enrollment in 1986 and includes 51 529 male health professionals aged 40–75 years at baseline. The NHS1 enrolled 121 701 female nurses aged 30–55 years at baseline in 1976. The NHS2, a younger counterpart to the NHS1, was established in 1989 with the enrollment of 116 671 female nurses, aged 25–42 years at baseline. In each cohort, participants completed biennial questionnaires providing information about medical history, diet, and lifestyle conditions. Between 1989 and 1997, each study completed laboratory assays on blood samples for a host of biomarkers, including 25(OH)D, from a subset of participants. Subjects with a previous cancer diagnosis were not eligible for random selection. Individuals were excluded from the pooled analysis if they did not have 25(OH)D measurements available or stroke or myocardial infarction outcome data.

Each study obtained calibration data among a subset of controls by re-assaying their blood samples at Heartland Assays, LLC between 2011 and 2013. Circulating 25(OH)D levels were modeled continuously and reported using 20 nmol/L increments. Table 4 of the supplementary material available at Biostatistics online lists information about the main studies and the calibration subsets, including the parameter estimates of the study-specific calibration models.
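Reporting a continuous exposure per 20 nmol/L amounts to rescaling the per-unit log odds ratio before exponentiating. The sketch below shows that conversion with a Wald-type 95% interval; the function name and the coefficient and standard error used in the example are hypothetical, not estimates from the study.

```python
import math

def odds_ratio_per_increment(beta_per_unit, se_per_unit, increment=20.0, z=1.96):
    """Convert a per-unit log odds ratio and its standard error into an
    odds ratio (with a Wald interval) per `increment` units of the biomarker."""
    estimate = math.exp(increment * beta_per_unit)
    lower = math.exp(increment * (beta_per_unit - z * se_per_unit))
    upper = math.exp(increment * (beta_per_unit + z * se_per_unit))
    return estimate, lower, upper

# Hypothetical per-nmol/L coefficient and standard error.
or_20, lo_20, hi_20 = odds_ratio_per_increment(-0.005, 0.002)
```

The interval is computed on the log scale and then exponentiated, so it is not symmetric around the odds ratio.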

equation M366

5. Discussion

In this work, we proposed statistical methods for analyzing calibrated biomarker data pooled across multiple nested case–control studies. Our methods facilitate inference on the main effect of the biomarker as well as a biomarker–covariate interaction term. Keeping with common practice, we estimated study-specific calibration models from subsets of controls reassayed at the reference lab. The methods developed here can also be used to contend with exposure measurement error when pooling data from multiple studies with internal validation subsets.

equation M371

Naive estimates were typically quite biased and illustrated the risk of failing to implement a calibration step when necessary. More problematically, the naive estimates were sometimes biased toward the alternative, resulting in an inflated type I error rate.

Although this article focuses on the common scenario of a controls-only calibration study, all the methods discussed also apply if the calibration subset includes both cases and controls. Furthermore, both the full calibration and internalized methods work for nonlinear calibration models. If necessary, one could include nonlinear terms in the calibration model when applying the full calibration and/or internalized methods. Note however that the two-stage method does require the linear calibration model in (2.2).

Regarding inclusion of covariates, if covariates are correlated with the biomarker and not the outcome, they may be included in the calibration model but not in the conditional logistic regression model. Covariates that are correlated with both the biomarker and the outcome can be included in both models.

Although the aggregated and two-stage methods are equally viable and valid options for analyzing outcome–exposure relationships in pooled data, logistical considerations may dictate the preferred approach for the statistical analysis. For instance, aggregated methods often lend themselves better to subgroup analyses because they reduce issues resulting from sparse data for a single study in specific strata. If the main exposure effect and at least some covariate effects are homogeneous, the aggregated method may also offer efficiency gains in covariate estimation relative to the two-stage method (Lin and Zeng, 2010). However, the two-stage method may at times be more appealing than the aggregated methods for its intuitive and simple implementation and its robustness to these covariate homogeneity assumptions.

6. Software

Functions in the form of R code (with an example) are available at the first author’s Github account https://github.com/agsloan/PoolingBiomarkerData and last author’s website https://www.hsph.harvard.edu/molin-wang/software .

Supplementary Material

Kxz051_supplementary_data

Acknowledgments

We are grateful to Tao Hou and Shiaw-Shyuan (Sherry) Yaun for their assistance in accessing the data. We also thank the Circulating Biomarkers and Breast and Colorectal Cancer Consortium team (R01CA152071, PI: Stephanie Smith-Warner; Intramural Research Program, Division of Cancer Epidemiology and Genetics, National Cancer Institute: Regina Ziegler) for conducting the calibration study in the vitamin D examples.

Conflict of Interest: None declared.

Supplementary material

Supplementary material is available at http://biostatistics.oxfordjournals.org .

Funding

This work was supported by the NIH (T32-NS048005 to A.S.) and by the NIH/NCI (R03CA212799 to M.W.).

Periodontics

Is there evidence of a relationship between pre-eclampsia and periodontitis?

Evidence-Based Dentistry (2023)

Data sources

The review searched several databases which included Medline (from 1950), Pubmed (from 1946), Embase (from 1949), Lilacs, Cochrane Controlled Clinical Trial Register, CINAHL, ClinicalTrials.gov and Google Scholar (from 1990).

Study selection

Two of the authors (LD and HN) independently assessed the eligibility of studies by reviewing the titles, abstracts and methods. If there was a disagreement, a third reviewer (QA) was consulted for a decision.

Data extraction and synthesis

A data extraction form was created and used. Data collected included: the first author’s name; publication year; study design; number of cases; number of controls; total sample size; country; national income group; mean age; the risk estimates or data used to calculate the risk estimates; and confidence intervals (CI) or data used to generate CIs. To assess socioeconomic status and its role as a possible influential factor, the World Bank classification by Gross National Income per capita was used to determine each country’s income level (low-income, lower-middle-income, upper-middle-income, high-income). All authors cross-checked all data, and disagreements were resolved through discussion. The statistical software ‘RevMan’ was used to input data. Pooled odds ratios, mean differences, and 95% CIs were calculated for the association between periodontitis and pre-eclampsia using a random-effects model. A significance level of 0.05 was used for the pooled effect. Forest plots for the primary analysis and subgroup analyses show the raw data, odds ratios and CIs, means and SDs for the chosen effect, the heterogeneity statistic (I²), total number of participants per group, overall odds ratio and mean difference. Subgroup analyses were defined by: study design (case-control and cohort); the studies’ definition of periodontitis (defined by pocket depth [PD] and/or clinical attachment loss [CAL]); and national income (high-income, middle-income or low-income countries). Cochran’s Q statistic and the I² statistic were used to determine the presence and degree of heterogeneity, respectively. For publication bias, Egger’s regression model and the fail-safe number were used.
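The quantities named above (a random-effects pooled odds ratio, Cochran’s Q, and I²) can be illustrated with a short sketch. It assumes the DerSimonian-Laird estimator of between-study variance, which is one common choice; the review summary does not state which estimator was used, and the study-level log odds ratios and standard errors below are invented for illustration.

```python
import numpy as np

def dersimonian_laird(log_or, se):
    """Random-effects pooling of log odds ratios with Cochran's Q and I^2,
    using the DerSimonian-Laird estimate of between-study variance."""
    y = np.asarray(log_or, dtype=float)
    v = np.asarray(se, dtype=float) ** 2
    if y.size < 2:
        raise ValueError("at least two studies are required")
    w = 1.0 / v
    fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - fixed) ** 2)           # Cochran's Q
    df = y.size - 1
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)              # between-study variance
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    w_star = 1.0 / (v + tau2)
    pooled = np.sum(w_star * y) / np.sum(w_star)
    se_pooled = np.sqrt(1.0 / np.sum(w_star))
    return {"pooled_or": float(np.exp(pooled)),
            "ci": (float(np.exp(pooled - 1.96 * se_pooled)),
                   float(np.exp(pooled + 1.96 * se_pooled))),
            "Q": float(q), "I2": float(i2), "tau2": float(tau2)}

# Hypothetical study-level log odds ratios and standard errors.
result = dersimonian_laird([0.9, 1.4, 0.6, 1.8], [0.30, 0.45, 0.25, 0.60])
```

When Q does not exceed its degrees of freedom, the between-study variance is set to zero and the result coincides with a fixed-effects analysis.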

Results

Thirty articles and 9650 women were included in total. Six of the studies were cohort studies (2840 participants overall) and 24 were case-control studies. Pre-eclampsia was defined the same way across all studies, whereas the definition of periodontitis differed. There was a significant association between periodontitis and pre-eclampsia (OR 3.18, 95% CI 2.26–4.48, p < 0.00001). In the subgroup analysis of cohort studies only, the pooled odds ratio was higher (OR 4.19, 95% CI 2.23–7.87, p < 0.00001). It was higher still for lower-middle-income countries (OR 6.70, 95% CI 2.61–17.19, p < 0.0001).

Conclusions

Periodontitis in pregnancy is a risk factor for pre-eclampsia. The data suggest that this association is more prominent in the lower-middle-income subgroup. Further research could explore the possible mechanisms and whether prevention or adequate treatment can reduce the risk of pre-eclampsia, thereby improving maternal health outcomes.

Author information

Authors and affiliations.

Paediatric Dentistry Department, Dundee Dental Hospital, Dundee, Scotland

Lauren Crowder

Corresponding author

Correspondence to Lauren Crowder .

Ethics declarations

Competing interests.

The author declares no competing interests.

About this article

Cite this article.

Crowder, L. Is there evidence of a relationship between pre-eclampsia and periodontitis? Evid Based Dent (2023). https://doi.org/10.1038/s41432-023-00870-y

Received: 06 January 2023

Accepted: 12 January 2023

Published: 08 March 2023

DOI: https://doi.org/10.1038/s41432-023-00870-y

Spatial Differentiation Characteristics of Rural Areas Based on Machine Learning and GIS Statistical Analysis—A Case Study of Yongtai County, Fuzhou City

1. Introduction
2. Machine learning and rural spatial differentiation characteristics
2.1. GIS technology
2.2. Spatial differentiation characteristics
2.3. Machine learning
3. Experiments on rural spatial differentiation characteristics
3.1. Data sources
3.2. Data preprocessing
3.3. Determination of evaluation indicators
3.4. Spatial analysis modeling
3.5. Feature pattern classification
3.6. Rural regional function evaluation
3.7. Rural regional function orientation
4. Analysis of the characteristics of regional rural spatial differentiation
4.1. GIS statistical analysis
4.2. Spatial differentiation feature analysis
5. Conclusions
Data availability statement
Conflicts of interest

Wang, Z. Spatial Differentiation Characteristics of Rural Areas Based on Machine Learning and GIS Statistical Analysis—A Case Study of Yongtai County, Fuzhou City. Sustainability 2023, 15, 4367. https://doi.org/10.3390/su15054367


COMMENTS

  1. Statistical analysis of case-control studies

    Methods of analysis of results from case-control studies have evolved considerably since the 1950s. These methods have helped to improve the validity of the conclusions drawn from case-control research and have helped to ensure that the available data are utilized to their fullest extent.

  2. Handbook of Statistical Methods for Case-Control Studies (Chapman

    Handbook of Statistical Methods for Case-Control Studies (Chapman & Hall/CRC Handbooks of Modern Statistical Methods): Borgan, Ørnulf, Breslow, Norman, Chatterjee, Nilanjan, Gail, Mitchell H., Scott, Alastair, Wild, Chris J.: 9781498768580

  3. Handbook of Statistical Methods for Case-Control Studies

    He was a major contributor to using survey sampling methods for analyzing case-control data. Chris J. Wild is Professor of Statistics, University of Auckland. His research includes nonlinear...

  4. Handbook of Statistical Methods for Case-Control Studies

    Provided that it is read and used together with such a comprehensive epidemiological text, this new Handbook of Statistical Methods for Case-Control Studies is a valuable and important book, which will be useful for seminars and courses on the developments in statistical theory that have occurred since the publication of Breslow and Day in 1980.

  5. Handbook of Statistical Methods for Case-Control Studies

    Handbook of Statistical Methods for Case-Control Studies. Edited by Ørnulf Borgan, Norman Breslow, Nilanjan Chatterjee, Mitchell H. Gail, Alastair Scott, Chris J. Wild. Copyright Year 2018. ISBN 9780367571375. Published June 30, 2020 by Chapman & Hall. 554 pages.

  6. Statistical analyses of case-control studies

    Case-control studies produce the odds ratio to measure the strength of the link between exposure and the outcome. An odds ratio is the ratio of the odds of exposure in the case group to the odds of exposure in the control group. Calculating a confidence interval for each odds ratio is critical.

  7. Valid statistical inference methods for a case-control study with

    The main objective of this paper is to derive the valid sampling distribution of the observed counts in a case-control study with missing data under the assumption of missing at random by employing the conditional sampling method and the mechanism augmentation method. The proposed sampling distribut …

  8. Handbook of Statistical Methods for Case-Control Studies

    analyze the nested case-control data. The latter approach essentially involves analyzing the whole set of cohort data and using multiple imputation for those variables that were only collected in the case-control subset. There are also excellent chapters on the self-controlled case series method, and various methods for case-control studies of ...

  9. Statistical methods for case-control and case-cohort studies with

    Due to the clustering of teeth, the survival times of the matched teeth within subjects could be correlated and thus the statistical methods for conventional case-control studies cannot be directly applied. We study the marginal proportional hazards regression model for data from this type of study. Second, we consider a case-cohort study ...

  10. Statistical Methods for Cohort and Case-Control Studies

    Traditional methods of occupational cohort analysis have used the standardized mortality ratio (SMR) as the fundamental measure of association between risk factor and disease. The SMR is shown here to result from maximum likelihood estimation in a multiplicative statistical model involving known national death rates.

  11. Handbook of Statistical Methods for Case-Control Studies

    Handbook of Statistical Methods for Case-Control Studies is written by leading researchers in the field. It provides an in-depth treatment of up-to-date and currently developing statistical methods for the design and analysis of case-control studies, as well as a review of classical principles and m…

  12. How to: Choose Case-control Statistical designs

    Case-control designs. Sampling methodology: The source population has to be (at least partially) defined by specified criteria so that appropriate controls can be selected. Such criteria will often be geographic, although they need not necessarily be so.

  13. Analysis of matched case-control studies

    Matching on factors such as age and sex is commonly used in case-control studies. 1 This can be done for convenience (eg, choosing a control admitted to hospital on the same day as the case), to improve study efficiency by improving precision (under certain conditions) when controlling for the matching factors (eg, age, sex) in the analysis, or …

  14. Handbook of Statistical Methods for Case-Control Studies

    In the case-control literature, it is known that the marginal benefit to statistical power from increasing the number of controls per case beyond 4 or 5 is small (e.g., Ury 1975;Song and Chung ...

  15. Handbook of Statistical Methods for Case-Control Studies (Chapman

    Handbook of Statistical Methods for Case-Control Studies is written by leading researchers in the field. It provides an in-depth treatment of up-to-date and currently developing statistical methods for the design and analysis of case-control studies, as well as a review of classical principles and methods.

  16. Handbook of Statistical Methods for Case-Control Studies

    The Handbook of Statistical Methods for Case-Control Studies is written by leading researchers in the field and published by Chapman & Hall/CRC Press (2018). The handbook provides an in-depth treatment of up-to-date and currently developing statistical methods for the design and analysis of case-control studies, as well as a review of classical principles and methods. The handbook is designed ...

  17. Handbook of Statistical Methods for Case-Control Studies

    Handbook of Statistical Methods for Case-Control Studies is written by leading researchers in the field. It provides an in-depth treatment of up-to-date and currently developing statistical...

  18. Handbook of Statistical Methods for Case-Control Studies

    III Case-control Studies that Use Full-Cohort Information 205. 11 Alternative Formulation of Models in Case-Control Studies William E. Barlow John B. Cologne 207. 12 Multi-Phase Sampling Gustavo Amorim Alastair J. Scott Chris J. Wild 219. 13 Calibration in Case-Control Studies Thomas Lumley 239. 14 Secondary Analysis of Case-Control Data Chris ...

  19. Statistical methods for biomarker data pooled from multiple nested case

    In this work, we proposed statistical methods for analyzing calibrated biomarker data pooled across multiple nested case-control studies. Our methods facilitate inference on the main effect of the biomarker as well as a biomarker-covariate interaction term. Keeping with common practice, we estimated study-specific calibration models from ...

  20. Is there evidence of a relationship between pre-eclampsia and

    Six of the studies were cohort studies (2840 participants overall) and 24 were case-control studies. Pre-eclampsia was defined the same across all studies, whereas periodontitis differed.

  21. Sustainability

    With the development of machine learning and GIS (geographic information systems) technology, it is possible to combine them to mine the knowledge rules behind massive spatial data. GIS, also known as geographic information systems, is a comprehensive discipline, which combines geography and cartography and has been widely used in different fields. It is a computer system for inputting ...