Study design depends greatly on the nature of the research question. In other words, knowing what kind of information the study should collect is a first step in determining how the study will be carried out (also known as the methodology).
Let’s say we want to investigate the relationship between daily walking and cholesterol levels in the body. One of the first things we’d have to determine is the type of study that will tell us the most about that relationship. Do we want to compare cholesterol levels among different populations of walkers and non-walkers at the same point in time? Or, do we want to measure cholesterol levels in a single population of daily walkers over an extended period of time?
The first approach is typical of a cross-sectional study. The second requires a longitudinal study. To make our choice, we need to know more about the benefits and purpose of each study type.
Both the cross-sectional and the longitudinal studies are observational studies. This means that researchers record information about their subjects without manipulating the study environment. In our study, we would simply measure the cholesterol levels of daily walkers and non-walkers along with any other characteristics that might be of interest to us. We would not influence non-walkers to take up that activity, or advise daily walkers to modify their behaviour. In short, we’d try not to interfere.
The defining feature of a cross-sectional study is that it can compare different population groups at a single point in time. Think of it in terms of taking a snapshot. Findings are drawn from whatever fits into the frame.
To return to our example, we might choose to measure cholesterol levels in daily walkers across two age groups, over 40 and under 40, and compare these to cholesterol levels among non-walkers in the same age groups. We might even create subgroups for gender. However, we would not consider past or future cholesterol levels, for these would fall outside the frame. We would look only at cholesterol levels at one point in time.
The benefit of a cross-sectional study design is that it allows researchers to compare many different variables at the same time. We could, for example, look at age, gender, income and educational level in relation to walking and cholesterol levels, with little or no additional cost.
However, cross-sectional studies may not provide definite information about cause-and-effect relationships. This is because such studies offer a snapshot of a single moment in time; they do not consider what happens before or after the snapshot is taken. Therefore, we can’t know for sure if our daily walkers had low cholesterol levels before taking up their exercise regimes, or if the behaviour of daily walking helped to reduce cholesterol levels that previously were high.
A longitudinal study, like a cross-sectional one, is observational. So, once again, researchers do not interfere with their subjects. However, in a longitudinal study, researchers conduct several observations of the same subjects over a period of time, sometimes lasting many years.
The benefit of a longitudinal study is that researchers are able to detect developments or changes in the characteristics of the target population at both the group and the individual level. The key here is that longitudinal studies extend beyond a single moment in time. As a result, they can establish sequences of events.
To return to our example, we might choose to look at the change in cholesterol levels among women over 40 who walk daily for a period of 20 years. The longitudinal study design would account for cholesterol levels at the onset of a walking regime and as the walking behaviour continued over time. Therefore, a longitudinal study is more likely to suggest cause-and-effect relationships than a cross-sectional study by virtue of its scope.
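The difference between the two designs can be sketched in a few lines of Python. This is only an illustration: the participants, years and cholesterol values below are invented, and the function names `snapshot_means` and `individual_changes` are hypothetical. The first function takes the cross-sectional "snapshot" of one year; the second follows each person across all of their measurements.

```python
from statistics import mean

# Hypothetical records: (participant_id, year, walks_daily, cholesterol in mmol/L).
# All values are invented for illustration only.
records = [
    ("P1", 2000, True, 6.1), ("P1", 2010, True, 5.6), ("P1", 2020, True, 5.2),
    ("P2", 2000, True, 5.8), ("P2", 2010, True, 5.5), ("P2", 2020, True, 5.1),
    ("P3", 2000, False, 5.9), ("P3", 2010, False, 6.0), ("P3", 2020, False, 6.2),
]


def snapshot_means(rows, year):
    """Cross-sectional view: mean cholesterol by walking status in a single year."""
    groups = {}
    for _pid, yr, walks, chol in rows:
        if yr == year:
            groups.setdefault(walks, []).append(chol)
    return {walks: mean(values) for walks, values in groups.items()}


def individual_changes(rows):
    """Longitudinal view: last-minus-first cholesterol for each participant."""
    by_person = {}
    for pid, yr, _walks, chol in rows:
        by_person.setdefault(pid, []).append((yr, chol))
    changes = {}
    for pid, series in by_person.items():
        series.sort()  # order each person's measurements by year
        changes[pid] = series[-1][1] - series[0][1]
    return changes
```

Calling `snapshot_means(records, 2020)` compares walkers and non-walkers at one moment, while `individual_changes(records)` shows whether each person's level rose or fell, which is the sequence-of-events information a snapshot cannot supply.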
In general, the research should drive the design. But sometimes, the progression of the research helps determine which design is most appropriate. Cross-sectional studies can be done more quickly than longitudinal studies. That’s why researchers might start with a cross-sectional study to first establish whether there are links or associations between certain variables. Then they would set up a longitudinal study to study cause and effect.
Source: At Work, Issue 81, Summer 2015: Institute for Work & Health, Toronto
This column updates a previous column describing the same term, originally published in 2009.
1Department of Thoracic Surgery, Papworth Hospital, Cambridge, UK; 2Research and Development, CTBI, Papworth Hospital, Cambridge, UK
Correspondence to: Piergiorgio Solli. Papworth Hospital NHS Foundation Trust, Papworth Everard Cambridgeshire, CB23 3RE, UK. Email: firstname.lastname@example.org.
Received 2015 Sep 19; Accepted 2015 Oct 9.
Copyright 2015 Journal of Thoracic Disease. All rights reserved.
Longitudinal studies employ continuous or repeated measures to follow particular individuals over prolonged periods of time, often years or decades. They are generally observational in nature, with quantitative and/or qualitative data being collected on any combination of exposures and outcomes, without any external influence being applied. This study type is particularly useful for evaluating the relationship between risk factors and the development of disease, and the outcomes of treatments over different lengths of time. Similarly, because data is collected for given individuals within a predefined group, appropriate statistical testing may be employed to analyse change over time for the group as a whole, or for particular individuals (1).
In contrast, cross-sectional analysis is another study type that may analyse multiple variables at a given instance, but provides no information with regard to the influence of time on the variables measured, being static by its very nature. It is thus generally less valid for examining cause-and-effect relationships. Nonetheless, cross-sectional studies require less time to be set up, and may be considered for preliminary evaluations of association prior to embarking on cumbersome longitudinal-type studies.
Longitudinal study designs
Longitudinal research may take numerous different forms. These are generally observational, but may also be experimental. Some are briefly discussed below:
Repeated cross-sectional studies where study participants are largely or entirely different on each sampling occasion;
Prospective studies where the same participants are followed over a period of time. These may include:
Cohort panels wherein some or all individuals in a defined population with similar exposures or outcomes are considered over time;
Representative panels where data is regularly collected for a random sample of a population;
Linked panels wherein data collected for other purposes is tapped and linked to form individual-specific datasets.
Retrospective studies, which are designed after at least some participants have already experienced events that are of relevance, with data for potential exposures in the identified cohort being collected and examined retrospectively.
Advantages and disadvantages
Longitudinal cohort studies, particularly when conducted prospectively in their pure form, offer numerous benefits. These include:
The ability to identify and relate events to particular exposures, and to further define these exposures with regard to presence, timing and chronicity;
Establishing sequence of events;
Following change over time in particular individuals within the cohort;
Excluding recall bias in participants, by collecting data prospectively and prior to knowledge of a possible subsequent event occurring; and
Ability to correct for the “cohort effect”, that is, allowing for analysis of the individual time components of cohort (range of birth dates), period (current time), and age (at point of measurement), and to account for the impact of each individually.
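The three time components in the final point are tied together by a simple identity: age at measurement equals the period (year of measurement) minus the birth year that defines the cohort. The short Python sketch below, with hypothetical function names and invented years, illustrates why a single survey cannot separate age from cohort, whereas repeated waves observe the same cohort at several ages.

```python
def birth_cohort(period_year, age):
    """Birth year implied by the year of measurement and the age at measurement."""
    return period_year - age


def ages_observed(birth_year, wave_years):
    """Ages at which one birth cohort is measured across repeated waves."""
    return [year - birth_year for year in wave_years]


# Hypothetical single survey in 2015: each age corresponds to exactly one cohort,
# so age and cohort cannot be told apart within this one wave.
single_wave_cohorts = {age: birth_cohort(2015, age) for age in (40, 50, 60)}

# Hypothetical follow-up of the 1965 birth cohort: the same cohort is seen
# at several ages, in several periods.
followed_ages = ages_observed(1965, [1995, 2005, 2015])
```

In the single wave, knowing a participant's age fixes their cohort; only the repeated waves let the cohort be held constant while age and period vary.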
Numerous challenges are implicit in the study design, particularly because it runs over protracted time periods. We briefly consider these below:
Incomplete and interrupted follow-up of individuals, and attrition with loss to follow-up over time, with notable threats to the representative nature of the dynamic sample if the loss potentially results from a particular exposure or occurrence that is of relevance;
Difficulty in separating the reciprocal impact of exposure and outcome, in view of the potentiation of one by the other, particularly where the induction period between exposure and occurrence is prolonged;
The potential for inaccurate conclusions if adopting statistical techniques that fail to account for the intra-individual correlation of measures; and
Generally increased temporal and financial demands associated with this approach.
Embarking on a longitudinal study
Conducting longitudinal research is demanding in that it requires an appropriate infrastructure that is sufficiently robust to withstand the test of time, for the actual duration of the study. It is essential that the methods of data collection and recording are identical across the various study sites, as well as being standardised and consistent over time. Data must be classified according to the interval of measure, with all information pertaining to particular individuals also being linked by means of unique coding systems. Recording is facilitated, and accuracy increased, by adopting recognised classification systems for individual inputs (2).
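As a rough illustration of linking records by a unique code, the Python sketch below stores each measurement with a participant identifier, site, interval of measure and a standardised measure code, and then groups the records into one ordered history per individual. The field names, the `Observation` class and the `link_by_participant` helper are assumptions made for this example, not a prescribed schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    participant_id: str  # unique code linking all records for one individual
    site: str            # study site where the measurement was recorded
    wave: int            # interval of measure, e.g. 0 for baseline
    measure: str         # code from an agreed classification system
    value: float


def link_by_participant(observations):
    """Group observations by participant and order each history by wave."""
    linked = defaultdict(list)
    for obs in observations:
        linked[obs.participant_id].append(obs)
    for history in linked.values():
        history.sort(key=lambda o: o.wave)
    return dict(linked)
```

Because every record carries the same identifier regardless of site or wave, a participant who moves between sites still ends up with a single continuous history.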
Numerous variables are to be considered, and adequately controlled, when embarking on such a project. These include factors related to the population being studied, and their environment, wherein stability in terms of geographical mobility and distribution, coupled with an ability to continue follow-up remotely in case of displacement, are key. It is furthermore essential to appropriately weigh the various measures, and classify these accordingly so as to facilitate the allocation of effort at the data collection stage, and also guide the use of possibly limited funds (3). Additionally, the engagement and commitment of organisations contributing to the project is essential, and should be maintained and facilitated by means of regular training, communication and inclusion where possible.
The frequency and degree of sampling should vary according to the specific primary endpoints; and whether these are based primarily on absolute outcome or variation over time. Ethical and consent considerations are also specific to this type of research. All effort should be made to ensure maximal retention of participants; with exit interviews offering useful insight as to the reason for uncontrolled departures (3).
The Critical Appraisal Skills Programme (CASP) (4) offers a series of tools and checklists that are designed to facilitate the evaluation of scientific quality of given literature. This may be extrapolated to critically assess a proposed study design. Additional depth of quality assessment is available through the use of various tools developed alongside the Consolidated Standards of Reporting Trials (CONSORT) guidelines, including a structured 33-point checklist proposed by Tooth et al. in 2005 (5).
Following adequate design, the launch and implementation of longitudinal research projects may itself require a significant amount of time, particularly if being conducted at multiple remote sites. Time invested in this initial period will improve the accuracy of data eventually received, and contribute to the validity of the results. Regular monitoring of outcome measures, and focused review of any areas of concern, are essential (3). These studies are dynamic, and necessitate regular updating of procedures and retraining of contributors, as dictated by events.
The statistical testing of longitudinal data necessitates the consideration of numerous factors. Central amongst these are (I) the linked nature of the data for an individual, despite separation in time; (II) the co-existence of fixed and dynamic variables; (III) potential for differences in time intervals between data instances, and (IV) the likely presence of missing data (6).
Commonly applied approaches (7) are discussed below: (I) univariate (ANOVA) and multivariate (MANOVA) analyses of variance are often adopted for longitudinal analysis. Note, in both cases, the assumption of equal interval lengths and normal distribution in all groups, and that only means are compared, sacrificing individual-specific data; (II) mixed-effect regression models (MRM) focus specifically on individual change over time, whilst accounting for variation in the timing of repeated measures, and for missing or unequal data instances; and (III) generalised estimating equation (GEE) models rely on the independence of individuals within the population to focus primarily on regression data (6).
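To give a feel for what "individual change over time" means when measurement times are unequal and some values are missing, the Python sketch below fits an ordinary least-squares slope to each individual's own observations. This is a deliberately simplified two-stage summary and not a mixed-effect regression model or a GEE; the function names and the convention of marking missing values with `None` are assumptions made for the example.

```python
from statistics import mean


def ols_slope(points):
    """Least-squares slope of value against time for (time, value) pairs."""
    times = [t for t, _ in points]
    values = [v for _, v in points]
    t_bar, v_bar = mean(times), mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("need at least two distinct time points")
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in points)
    return sxy / sxx


def individual_slopes(data):
    """Map each participant to their own rate of change.

    `data` maps a participant id to a list of (time, value) pairs, where
    a value of None marks a missed measurement. Participants with fewer
    than two distinct observed time points are left out.
    """
    slopes = {}
    for pid, series in data.items():
        observed = [(t, v) for t, v in series if v is not None]
        if len({t for t, _ in observed}) >= 2:
            slopes[pid] = ols_slope(observed)
    return slopes
```

Each participant keeps their own timing and their own gaps, which is the individual-specific information that a comparison of group means at fixed intervals gives up. A real analysis would normally use a dedicated statistical package rather than this hand-rolled summary.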
With ever-growing computational abilities, the repertoire of statistical tests is ever expanding. In-depth understanding and appropriate selection are increasingly important to ensure meaningful results.
Inaccuracies in the analysis of longitudinal research are rampant, and most commonly arise when repeated hypothesis testing is applied to the data, as it would be for cross-sectional studies. This leads to an underutilisation of available data, an underestimation of variability, and an increased likelihood of type II statistical error (false negative) (8).
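One simple way to see how a cross-sectional-style test can waste longitudinal data is to compare two t statistics on the same two-visit dataset: one that treats the visits as independent groups, and one that uses the within-individual differences. The Python sketch below is an illustration under stated assumptions only: the cholesterol values are invented, and the two-visit paired comparison stands in for the more general problem described above.

```python
from math import sqrt
from statistics import mean, stdev


def unpaired_t(before, after):
    """t statistic treating the two visits as independent groups (Welch form)."""
    se = sqrt(stdev(before) ** 2 / len(before) + stdev(after) ** 2 / len(after))
    return (mean(after) - mean(before)) / se


def paired_t(before, after):
    """t statistic based on each individual's own change between visits."""
    diffs = [a - b for b, a in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Hypothetical cholesterol values (mmol/L) for the same five people at two visits.
before = [5.0, 5.5, 6.0, 6.5, 7.0]
after = [4.8, 5.2, 5.8, 6.2, 6.7]
```

With these invented numbers every individual's level falls by a similar small amount, so `paired_t(before, after)` is far from zero, while `unpaired_t(before, after)` is close to zero because the spread between people swamps the change. Ignoring the link between a person's measurements in this way is one route to a false negative.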
Example: the Framingham Heart Study
The mid-20th century saw a steady increase in cardiovascular-associated morbidity and mortality after efforts in improving sanitation along with the introduction of penicillin in the 1940s resulted in a significant decline in communicable disease. A drive to identify the risk factors for cardiovascular disease gave birth to the Framingham Heart Study in 1948 (9).
Numerous predisposing factors were postulated to align together to produce cardiovascular disease, with increasing age being considered a central determinant. These formed the basis for the hypothesis that underpinned this longitudinal study.
The Framingham study is widely recognised as the quintessential longitudinal study in the history of medical research. An original cohort of 5,209 subjects from Framingham, Massachusetts, aged between 30 and 62 years, was recruited and followed up for 20 years. A number of hypotheses were generated and described by Dawber et al. (10) in 1980, listing various presupposed risk factors such as increasing age, increased weight, tobacco smoking, elevated blood pressure, elevated blood cholesterol and decreased physical activity. It is widely quoted as a successful longitudinal study owing to the fact that a large proportion of the exposures chosen for analysis were indeed found to correlate closely with the development of cardiovascular disease.
A number of biases exist within the Framingham Heart Study. Firstly, it was a study carried out in a single population in a single town, bringing into question the generalisability and applicability of this data to different groups. However, Framingham was sufficiently diverse both in ethnicity and socio-economic status to mitigate this bias to a degree. Despite the initial intent of random selection, the investigators needed the addition of over 800 volunteers to reach the pre-defined target of 5,000 subjects, thus reducing the randomisation. They also found that their cohort of patients was uncharacteristically healthy.
The Framingham Heart Study has given us invaluable data pertaining to the incidence of cardiovascular disease, and has further confirmed a number of risk factors. The success of this study was further potentiated by the absence of treatments or modifiers, such as statin therapy and anti-hypertensives. This has enabled this study to more clearly delineate the natural history of this complex disease process.
Longitudinal methods may provide a more comprehensive approach to research that allows an understanding of the degree and direction of change over time. One should carefully consider the cost and time implications of embarking on such a project, whilst ensuring complete and proven clarity in design and process, particularly in view of the protracted nature of such an endeavour, and noting the peculiarities for consideration at the interpretation stage.
Conflicts of Interest: The authors have no conflicts of interest to declare.
1. Van Belle G, Fisher L, Heagerty PJ, et al. Biostatistics: A Methodology for the Health Sciences. Longitudinal Data Analysis. New York, NY: John Wiley and Sons, 2004.
2. van Weel C. Longitudinal research and data collection in primary care. Ann Fam Med 2005;3 Suppl 1:S46-51.
3. Newman AB. An overview of the design, implementation, and analyses of longitudinal studies on aging. J Am Geriatr Soc 2010;58 Suppl 2:S287-91.
4. 12 questions to help you make sense of cohort study. Critical Appraisal Skills Programme (CASP) Cohort Study Checklist. Available online: http://media.wix.com/ugd/dded87_e37a4ab637fe46a0869f9f977dacf134.pdf
5. Tooth L, Ware R, Bain C, et al. Quality of reporting of observational longitudinal research. Am J Epidemiol 2005;161:280-8.
6. Edwards LJ. Modern statistical techniques for the analysis of longitudinal data in biomedical research. Pediatr Pulmonol 2000;30:330-44.
7. Nakai M, Ke W. Statistical Models for Longitudinal Data Analysis. Applied Mathematical Sciences 2009;3:1979-89.
8. Liu C, Cripe TP, Kim MO. Statistical issues in longitudinal data analysis for treatment efficacy studies in the biomedical sciences. Mol Ther 2010;18:1724-30.
9. Dawber TR, Kannel WB, Lyell LP. An approach to longitudinal studies in a community: the Framingham Study. Ann N Y Acad Sci 1963;107:539-56.
10. Dawber TR. The Framingham Study: The Epidemiology of Atherosclerotic Disease. Cambridge, Mass: Harvard University Press, 1980.
Articles from Journal of Thoracic Disease are provided here courtesy of AME Publications