Introduction

Whilst most people recover fully from severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) infection, some develop long-COVID. With increasing numbers of people having been infected since the start of the pandemic, attention is shifting from managing the acute infection to understanding long-COVID in order to inform the health and social care response. The WHO defined long-COVID as “a history of probable or confirmed SARS-CoV-2 infection … with symptoms that last for at least 2 months and cannot be explained by an alternative diagnosis”1. The imprecision of this, and other, definitions reflects our poor understanding of the nature of long-COVID and its underlying mechanisms.

A meta-analysis of 63 studies pooled symptom prevalence following laboratory-confirmed SARS-CoV-2 infection2. The sample size ranged from 58 to 1733 (16,336 in total), with most studies containing fewer than 200 participants. Overall, 53% of people reported one or more symptoms beyond 12 weeks following infection. The most common were fatigue, pain/discomfort, shortness of breath, cognitive impairment, and mental health problems. Eighteen studies were judged to be at high risk of bias, and the remaining 45 were at moderate risk, due to convenience sampling and unrepresentative study populations. Half of the studies investigated hospitalised cohorts. The only two studies with more than 1000 participants were a hospitalised cohort and a cohort of health-care workers with mild infection. In a subsequent meta-analysis of 18 studies with at least 12 months follow-up, study samples ranged from 51 to 2433 (8591 in total) and 28% of participants reported fatigue/weakness, 22% anxiety, 18% breathlessness, 19% memory loss, 18% concentration difficulties and 12% insomnia3. The authors highlighted a small sample size, lack of representativeness and low response rate as limitations. Whilst many long-COVID studies have focused on patients hospitalised with more severe infections, a meta-analysis of 9 studies reported persistent symptoms in up to one-third of people following mild SARS-CoV-2 infection4.

This study aimed to address some of the limitations of existing studies by determining the frequency, nature, determinants and impact of long-COVID in the general population. We used a large-scale, nationwide study that included people who had severe, mild and asymptomatic infections and a never-infected comparison group, and measured serial self-reported outcomes as well as outcomes obtained from linkage to routine health records. Here we show that the long-term sequelae of COVID-19 are wide-ranging and affect all aspects of daily living, are specific to symptomatic infections, and are more likely following severe infections requiring hospitalisation and in older, female and more deprived patients with pre-existing health problems, but vaccination appears to confer some protection.

Results

Cohort characteristics

Overall, 102,473 (16%) of the 638,125 people invited consented and completed at least one questionnaire: 33,281 (20%) of the 162,957 people who had a positive test, and 69,192 (15%) of the 475,168 invited following only negative tests. Completion rates were 14% (90,578/625,315), 9% (21,963/242,412) and 11% (934/8625) for the 6-, 12- and 18-month follow-up questionnaires, respectively (Supplementary Table 1). Compared with those who did not provide consent, responders were more likely to be female (60.9% vs 51.2%; p-value < 0.001), were older (>40 years 59.5% vs 46.0%; p-value < 0.001) and less deprived (most deprived SIMD quintile 24.0% vs 27.0%; p-value < 0.001).

We excluded 6235 participants who had only negative tests recorded but self-reported they had tested positive. Therefore, the study cohort comprised 96,238 participants. Their median age at baseline was 45 (IQR 31–56) years, 39% were male, 91% white, 30% had at least one pre-existing health condition and 16% at least two; 4% had received at least one COVID-19 vaccination dose prior to their index test (Table 1).
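The cohort arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch using the counts quoted in the text; the variable names are ours, and the size of the never-infected group is derived here on the assumption that all 6235 exclusions came from respondents with only negative tests on record, as the exclusion criterion implies.

```python
# Counts quoted in the text; dictionary keys are illustrative names.
invited = {"positive": 162_957, "negative_only": 475_168}
responded = {"positive": 33_281, "negative_only": 69_192}


def pct(numerator: int, denominator: int) -> int:
    """Return a percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


total_invited = sum(invited.values())
total_responded = sum(responded.values())

# Response rate per invitation group and overall.
response_rates = {group: pct(responded[group], invited[group]) for group in invited}
overall_rate = pct(total_responded, total_invited)

# Exclusion: only negative tests recorded, but a positive test self-reported.
excluded = 6_235
study_cohort = total_responded - excluded

# Derived (assumption): the exclusions all come from the negative-only group.
never_infected = responded["negative_only"] - excluded
```

Under these assumptions the sketch reproduces the invited, responding and final cohort totals given in the text.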

Table 1 Characteristics of study participants by infection status and symptomatology

Symptoms during acute infection

Among the 33,281 participants who had a positive test, 31,486 (95%) were symptomatic at the time of infection; 1208 (4%) had one symptom, 1999 (6%) had two, 2493 (8%) had three, and 25,786 (82%) had more than three. Overall, 83% reported fatigue at the time of acute infection, 64% headache, 63% change in taste, 63% myalgia, 60% change in smell, 54% cough, 52% fever, 45% breathlessness, 41% loss of appetite, 38% joint pain, 31% sore throat, 23% diarrhea, 21% chest pain, 20% runny nose, 15% abdominal pain, 13% confusion, 13% hoarse voice, 9% hair loss, 8% ear pain, 2% reduced consciousness, and 0.3% seizures. Of those who reported symptom duration, 7259 (23%) reported <1 week, 13,710 (44%) 1–4 weeks, and 10,489 (33%) >4 weeks.

Outcomes

Of the 31,486 people who had had symptomatic infections, 1856 (6%) reported that they had not recovered at all on their most recent follow-up questionnaire, and 13,350 (42%) that they had only partially recovered. Among the 1342 people whose infection required hospitalization, the figures were 217 (16%) and 797 (59%), respectively, and among the 30,096 managed in the community, they were 1639 (5%) and 12,553 (42%), respectively.

For the 3941 with serial questionnaire data there was little change in the overall breakdown; at their first follow-up, 316 (8%) had not at all recovered and 1866 (47%) had only partially recovered, compared to 324 (8%) and 1806 (46%), respectively, at their most recent follow-up. However, there was some cross-over between groups; 1453 (37%) remained fully recovered, 1372 (35%) remained partially recovered, and 175 (4%) continued to report no recovery, while 494 (13%) reported delayed recovery (improvement over time), and 447 (11%) reported relapse (deterioration over time).

The results were similar when specific follow-up periods were compared. In the sub-group of 3744 participants who had symptomatic infections and who completed questionnaires at both 6 and 12 months follow-up, the breakdown by no, partial and full recovery was 295 (8%), 1766 (47%) and 1683 (45%) at 6 months and 303 (8%), 1705 (46%) and 1736 (46%) at 12 months (Supplementary Table 2). Similarly, in the 197 participants who completed questionnaires at both 12 and 18 months follow-up, the figures were 21 (11%), 100 (51%) and 76 (39%), and 21 (11%), 101 (51%) and 75 (38%) at 12 and 18 months, respectively.

Of the 21,525 people with ongoing symptoms following symptomatic infection, the most common were tiredness, headache and muscle aches/weakness (Table 2). However, symptoms were also common among people never infected. Compared with the latter, people who had previous symptomatic infection were significantly more likely to report 24 of the 26 symptoms at follow-up after adjusting for potential confounders (Table 3). After changes in smell and taste, the largest effect sizes were observed for symptoms that were potentially cardiovascular in origin (breathlessness, chest pain and palpitations) and confusion (Table 3). People with previous symptomatic infection were also more likely to have multiple (≥3) symptoms than people never infected (14,236 (45%) vs 19,613 (31%)). There was weak evidence of clustering of musculoskeletal and neuropsychological symptoms following previous symptomatic infection (Supplementary Fig. 2). Among the participants who completed serial questionnaires, there was evidence of improvements in taste and smell between 6 and 12 months follow-up but increased reporting of a dry or productive cough between 6 and 18 months (Supplementary Table 3).

Table 2 Crude outcomes of participants by infection status and symptomatology
Table 3 Univariate and multivariate binary logistic regression analyses of the associations between previous symptomatic SARS-CoV-2 infection and current symptoms, referent to people never infected

Routine data were available until January 2022, providing a median (IQR) of 7 (6–8) months follow-up. People who had previous symptomatic infection were not at significantly increased risk of all-cause hospitalization (fully adjusted OR 1.02, 95% CI 0.97–1.07, p = 0.386), ICU admission (fully adjusted OR 1.21, 95% CI 0.86–1.70, p = 0.267) or all-cause mortality (fully adjusted OR 0.64, 95% CI 0.39–1.05, p = 0.076). However, they had a median EQ-5D score of 75 (IQR 60–89) in their most recent follow-up questionnaire compared with 80 (IQR 63–90) for people never infected (p < 0.001). People who had previous symptomatic infection had significantly lower EQ-5D scores in both the univariate Poisson model (coefficient 0.96, 95% CI 0.96–0.96, p < 0.001) and in the fully adjusted model (coefficient 0.95, 95% CI 0.95–0.95, p < 0.001). Similarly, people who had had symptomatic infection were significantly more likely to report impaired mobility, housework/chores, working/studying, washing/dressing, exercise/sport, hobbies and relationships after adjusting for potential confounders (Table 4). Asymptomatic SARS-CoV-2 infection was not associated with increased risk of current symptoms, impaired daily activities, reduced quality of life, hospitalization, ICU admission or death.

Table 4 Univariate and multivariate binary logistic regression analyses of the associations between previous symptomatic SARS-CoV-2 infection and current difficulties in activities of daily living, referent to people never infected

Factors associated with outcomes

Following previous symptomatic infection, lack of complete recovery was associated with more severe (hospitalized) initial infection, older age, female sex, deprivation, white ethnicity, and pre-existing health conditions, including respiratory disease and depression (Table 5). Compared to unvaccinated people, people vaccinated prior to symptomatic infection were less likely to report persistent change in smell (OR 0.58, 95% CI 0.45–0.76), change in taste (OR 0.61, 95% CI 0.46–0.79), problems hearing (OR 0.60, 95% CI 0.44–0.83), poor appetite (OR 0.72, 95% CI 0.53–0.99), balance problems (OR 0.71, 95% CI 0.53–0.95), confusion/difficulty concentrating (OR 0.72, 95% CI 0.58–0.89), and anxiety/depression (OR 0.76, 95% CI 0.64–0.92) at their latest follow-up after adjustment for potential confounders.

Table 5 Multivariate logistic regression analysis of the factors associated with no or partial recovery, referent to full recovery, among people with previous symptomatic SARS-CoV-2 infection

Discussion

Between 6 and 18 months following symptomatic SARS-CoV-2 infection, almost half of those infected reported no, or incomplete, recovery. Whilst recovery status remained constant over follow-up for most, 13% reported improvement over time and 11% deterioration. Symptomatic SARS-CoV-2 infection was associated with a wide range of persistent symptoms, impaired daily activities and reduced health-related quality of life, independent of sociodemographic factors and pre-existing health conditions. The strongest associations were observed for symptoms that were potentially cardiovascular in origin (breathlessness, chest pain and palpitations) and confusion. Lack of recovery was associated with more severe infection, older age, female sex, white ethnicity, deprivation, pre-existing respiratory disease and multimorbidity, but pre-infection vaccination was associated with reduced risk of some persistent symptoms. We found no evidence of sequelae following asymptomatic infection.

Our finding of impaired daily activities is consistent with previous studies. In a meta-analysis of 12 studies covering 4828 participants previously infected by SARS-CoV-2, 35% had problems with mobility, 8% with personal care, 42% pain/discomfort, and 38% anxiety/depression5. Similarly, in a global social media survey of 1020 confirmed and 2742 suspected previous COVID-19 cases, 45% reported an ongoing impact on their ability to work6. In line with our findings, previous studies have reported that women, older and more deprived people, and those with pre-existing health problems were less likely to recover completely from COVID-197,8,9.

Of eight studies that assessed the effectiveness of pre-infection vaccination on long-COVID10, six (two cohort, two case-control, two cross-sectional) reported fewer symptoms 1–6 months following infection among those fully vaccinated11,12,13,14,15,16, including fatigue, headache, muscle weakness/pain, breathlessness, dizziness, and change in smell13,16. Our findings differ from previous studies12,13,14,16 in suggesting possible protection against persistent symptoms from even partial vaccination. Three studies have suggested that, among unvaccinated people, post-infection vaccination may also reduce the risk or severity of long-COVID15,17,18, especially if given early following infection15.

As a general population study, our findings provide a better indication of the overall risk and burden of long-COVID than hospitalised cohorts. The inclusion of asymptomatic infections enabled us to demonstrate that long-COVID is specific to people with symptomatic infections. Incomplete ascertainment of all cases of COVID-19 was inevitable due to the lack of PCR testing at the beginning of the pandemic, followed by a gradual increase in testing capacity and therefore changes in testing criteria. The risk of misclassification was reduced by using a composite definition of previous infection, requiring both laboratory confirmation and self-report, and by excluding people who reported a previous positive test that was not on the database. The never-infected group will include some people with asymptomatic infection or symptomatic infection prior to testing becoming available. The latter could potentially lead to under-estimation of the true magnitude of association between SARS-CoV-2 infection and ongoing health problems.

Our cohort included a large sample (n = 33,281) of people previously infected, and the response rate of 16% overall and 20% among people who had a positive test was consistent with previous studies that have used SMS text invitations as the sole method of recruitment. For example, in a study on the impact of COVID-19, an SMS text invitation was sent to 7911 patients attending a rheumatology outpatient clinic, of whom 21% responded and 20% completed the questionnaire19. In another study, SMS text messages were sent to 350 cases recorded on a COVID-19 contact tracing database. Of these, 24% responded and 1% provided the requested information on their contacts20. In our study, recruitment may have been lower in some sub-groups. For example, the questionnaire was written in English and accessed via a web-based app and therefore may have been inaccessible to people without internet access or without English as their first language. Formal and informal carers were permitted to assist respondents in completing the questionnaire. It is possible that people with persistent health problems may have been more motivated to participate. Whilst reporting of current symptoms will not be subject to recall bias, a potential study limitation is that symptoms at the time of acute infection were recalled at follow-up and, therefore, were potentially subject to recall bias.

Symptoms are common in the general population. The three most common symptoms among people previously infected were also reported by 16–32% of people never infected. Therefore, the inclusion of an uninfected comparison group enabled us to demonstrate that the outcomes were not due to confounding. This is a major strength of our study compared with other population cohort studies9,18. We matched our uninfected comparison group 3:1 at the time of invitation, allowing for an anticipated lower response rate among uninfected individuals and attrition due to subsequent infection. Serial outcome measurements in a sub-group of 3941 respondents enabled us to investigate the trajectory of long-COVID over time. Very few people had been fully vaccinated pre-infection. However, sufficient numbers had received a single dose to demonstrate some evidence of protection against persistent symptoms. The non-significant lower risk of death following symptomatic infection is likely to reflect the fact that our study recruited people who had survived at least six months following infection.

Our finding of higher rates of recovery among black and South Asian participants is consistent with a previous study that observed a lower probability of persistent symptoms among Asian participants9. Apparently better long-term prognosis contrasts with the known higher risk of black and South Asian individuals during acute infection with SARS-CoV-221 and may reflect survival bias, since participants needed to have survived at least 6 months following their acute infection to participate in our study. The Scottish population is 96% white22. Therefore, it is important that ethnic-specific outcomes are reported by other long-COVID studies with more ethnically diverse populations.

In conclusion, 6–18 months following symptomatic SARS-CoV-2 infection, adults were at greater risk of a diverse group of symptoms, poorer quality of life and wide-ranging impairment of their daily activities, which could not be explained by confounding. Sequelae were more likely following severe infection and were not observed following asymptomatic infection, and pre-infection vaccination may be protective.

Methods

Study design

Long-COVID in Scotland Study (Long-CISS) is an ambidirectional, general population cohort. Every adult (>16 years) in Scotland with a positive PCR test for SARS-CoV-2 from April 2020 was invited to participate along with a comparison group who had had a negative test but never had a positive test, matched 3:1 by age, sex, and area-based socioeconomic deprivation quintile. The National Health Service (NHS) Scotland platform that provides PCR result notifications identified eligible participants and invited them via automated SMS text messages. The study commenced in May 2021 and recruited participants both retrospectively and prospectively, based on existing and new test results, respectively. People in the comparison group were reallocated to the infected group if, and when, they had a positive test. The cohort included people with asymptomatic SARS-CoV-2 infection detected, for example, during occupational or travel-related screening. Participants provided electronic consent and study approval was obtained from the West of Scotland Research Ethics Committee (ref. 21/WS/0020) and the Public Benefit and Privacy Panel (ref. 2021-0180).
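To illustrate what 3:1 matching on age, sex and deprivation quintile can look like in practice, the following is a minimal Python sketch. It is not the procedure run on the NHS Scotland platform, which is not described in detail here. The record fields (`id`, `age`, `sex`, `simd_quintile`), the 10-year age bands and the random selection within strata are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratum(person: dict) -> tuple:
    """Matching key: (10-year age band, sex, deprivation quintile). Bands are assumed."""
    return (person["age"] // 10, person["sex"], person["simd_quintile"])


def match_controls(cases: list, controls: list, ratio: int = 3, seed: int = 0) -> dict:
    """Assign up to `ratio` controls from the same stratum to each case, without reuse."""
    rng = random.Random(seed)
    pool = defaultdict(list)
    for person in controls:
        pool[stratum(person)].append(person)
    for members in pool.values():
        rng.shuffle(members)  # random order within each stratum

    matched = {}
    for case in cases:
        available = pool[stratum(case)]
        n_take = min(ratio, len(available))
        # pop() removes each chosen control so it cannot be matched twice
        matched[case["id"]] = [available.pop()["id"] for _ in range(n_take)]
    return matched


# Hypothetical records, for illustration only.
cases = [{"id": 1, "age": 45, "sex": "F", "simd_quintile": 2}]
controls = [
    {"id": 10, "age": 41, "sex": "F", "simd_quintile": 2},
    {"id": 11, "age": 48, "sex": "F", "simd_quintile": 2},
    {"id": 12, "age": 44, "sex": "F", "simd_quintile": 2},
    {"id": 13, "age": 49, "sex": "F", "simd_quintile": 2},
    {"id": 14, "age": 45, "sex": "M", "simd_quintile": 2},
]
matches = match_controls(cases, controls)
```

In this sketch a case with fewer than three eligible controls simply receives fewer matches; a real implementation would need an explicit rule for that situation.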

Data sources

A self-completed online questionnaire (Supplementary Fig. 1) collected information on pre-existing health conditions at the time of the index test (first positive test or, for the comparison group, most recent negative test) as well as current symptoms, limitations in daily activities and quality of life. Those who had tested positive also provided information on symptoms during the initial infection and current recovery status. Questionnaires were completed 6, 12 and 18 months after the index test. Additional data were obtained through linkage to electronic health records both five years prior to their index test and subsequent to the test (up to January 2022) on hospitalizations (Scottish Morbidity Record 01/04), dispensed prescriptions (Prescribing Information System), vaccinations, and death certificates (General Registrar Office). Long-CISS is ongoing and the findings in this manuscript relate to index tests performed up to May 2021 and follow-up questionnaires up to November 2021.

Definitions

Infection was defined as a positive PCR recorded on the national database and categorised as symptomatic or asymptomatic based on self-report. Severe infection was defined as an admission to hospital with ICD-10 code U07.1 on a date occurring between 1 day prior to the test and 2 weeks after. Respondents who reported having had a positive test that was not recorded on the database were excluded from the study as we could not determine whether they were incorrect or had taken the test outside Scotland.
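The severe-infection rule can be written as a small date-window check. The Python sketch below assumes that admissions are available as (admission date, ICD-10 code) pairs and that both ends of the window are inclusive, with "2 weeks" taken as 14 days; neither the data layout nor the boundary handling is specified in the text, and the function name is ours.

```python
from datetime import date, timedelta

COVID_CODE = "U07.1"  # ICD-10 code named in the definition above


def is_severe(test_date: date, admissions: list) -> bool:
    """True if any U07.1 admission falls from 1 day before to 14 days after the test.

    `admissions` is a list of (admission_date, icd10_code) tuples (assumed layout);
    window boundaries are treated as inclusive (assumption).
    """
    window_start = test_date - timedelta(days=1)
    window_end = test_date + timedelta(days=14)
    return any(
        code == COVID_CODE and window_start <= admitted <= window_end
        for admitted, code in admissions
    )


# Hypothetical example: admitted three days after a positive test.
index_test = date(2021, 1, 10)
example_severe = is_severe(index_test, [(date(2021, 1, 13), "U07.1")])
```

An admission with a different code, or a U07.1 admission outside the window, would leave the infection classified as non-severe under this sketch.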

Socioeconomic deprivation was obtained from postcode of residence using the Scottish Index of Multiple Deprivation derived from aggregated data on: income, employment, education, health, access to services, crime and housing23. SARS-CoV-2 variants were defined as dominant if they accounted for ≥95% of cases genotyped that week (https://sars2.cvr.gla.ac.uk/cog-uk/). Pre-existing health conditions were ascertained from self-report, previous hospitalizations and dispensed prescriptions. Respiratory disease was defined as International Classification of Diseases 10 (ICD10) codes J40-J47, J98.2 or J98.3, or bronchodilators, inhaled corticosteroids, cromoglycate, leukotriene or phosphodiesterase type-4 inhibitor (British National Formulary (BNF) 3.1–3.3), or self-report. Coronary heart disease was defined as ICD10 codes I11.0, I13.0, I13.2, I20-I25 (excluding I24.1), I50, T82.2, or Z95.5, or self-report. Depression was defined as ICD10 codes F30-F33, or anti-depressant, hypnotic or anxiolytic (BNF 4.1;4.3), or self-report. Diabetes was defined as ICD10 codes E10-E14, G590, G632, H280, H360, M142, N083, O240-O243 or self-report24. The total number of self-reported health conditions was categorised as 0, 1, 2–3 or ≥4.
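Code-list definitions such as these are usually applied by matching recorded ICD-10 codes against ranges and exact codes. The Python sketch below shows one possible way to do this for the respiratory disease code list and for the categorisation of the condition count. It covers only the hospitalization-code part of the definition (not prescriptions or self-report), and the function names and the handling of dotted versus undotted codes are assumptions.

```python
def normalise(code: str) -> str:
    """Strip dots and whitespace so 'J98.2' and 'J982' compare equal (assumed formats)."""
    return code.replace(".", "").strip().upper()


# Respiratory disease code list from the definition above.
RESPIRATORY_RANGE = ("J40", "J47")   # three-character categories J40 to J47
RESPIRATORY_EXACT = ("J982", "J983")  # J98.2 and J98.3, dots removed


def is_respiratory_code(code: str) -> bool:
    """True if an ICD-10 code falls within J40-J47 or is J98.2/J98.3."""
    c = normalise(code)
    category = c[:3]
    in_range = RESPIRATORY_RANGE[0] <= category <= RESPIRATORY_RANGE[1]
    return in_range or c.startswith(RESPIRATORY_EXACT)


def condition_count_category(n_conditions: int) -> str:
    """Categorise the number of self-reported conditions as 0, 1, 2-3 or >=4."""
    if n_conditions < 0:
        raise ValueError("count cannot be negative")
    if n_conditions <= 1:
        return str(n_conditions)
    return "2-3" if n_conditions <= 3 else ">=4"
```

For example, `is_respiratory_code("J45.9")` and `is_respiratory_code("J982")` both match, whereas `is_respiratory_code("J98.1")` does not.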

Outcomes

The outcomes measured were 26 symptoms (harmonised with the ISARIC questionnaire)25, limitations across 7 activities of daily living, health-related quality of life (using EQ-5D), hospitalization, admission to an intensive care unit (ICU), and all-cause mortality in the whole cohort, as well as self-reported recovery status (full, partial or none) in the symptomatic infection group. Hospitalization (as an outcome) was defined as admission to hospital on a date at least two weeks following the index test, to exclude admissions related to the acute infection. Delayed recovery was defined as participants with previous symptomatic infection who reported no or partial recovery at their first follow-up but an improvement at subsequent follow-up; either from no to partial or full recovery, or from partial to full recovery. Relapse was defined as participants with previous symptomatic infections who reported full or partial recovery at their first follow-up but a deterioration at subsequent follow-up; either full to partial or no recovery, or from partial to no recovery.
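The definitions of delayed recovery and relapse amount to comparing two ordered recovery states, and the hospitalization outcome to a simple date threshold. The Python sketch below is one way to express them. It assumes that the first follow-up is compared with the most recent one, that "two weeks" means 14 days, and that the labels and function names are ours rather than the study's.

```python
from datetime import date, timedelta

# Ordered recovery states: a higher value means better recovery.
RECOVERY_LEVEL = {"none": 0, "partial": 1, "full": 2}


def trajectory(first: str, latest: str) -> str:
    """Classify change in self-reported recovery between two questionnaires."""
    before, after = RECOVERY_LEVEL[first], RECOVERY_LEVEL[latest]
    if after > before:
        return "delayed recovery"  # none -> partial/full, or partial -> full
    if after < before:
        return "relapse"           # full -> partial/none, or partial -> none
    return "unchanged"


def is_outcome_admission(index_test: date, admitted: date) -> bool:
    """True if an admission counts as an outcome: at least 14 days after the index test."""
    return admitted >= index_test + timedelta(days=14)


# All nine possible first/latest combinations, for inspection.
all_trajectories = {
    (first, latest): trajectory(first, latest)
    for first in RECOVERY_LEVEL
    for latest in RECOVERY_LEVEL
}
```

Because improvement is only possible from "none" or "partial" and deterioration only from "full" or "partial", the single ordering covers every case listed in the definitions.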

Statistical analyses

Baseline characteristics and crude outcomes broken down by infection status (symptomatic, asymptomatic or never infected) were summarized using frequencies/percentages and medians/inter-quartile ranges for categorical and continuous variables and compared using chi-square tests and Mann–Whitney U tests, respectively. A correlation matrix of current symptoms was used to produce a heat map of symptom clustering at follow-up. Separate binary logistic regression models were run in the whole cohort to determine the association between infection status and the outcomes of individual symptoms, limitations in daily activities, hospitalization, ICU admission and death, using never infected as the referent group. Poisson regression models were run for the outcome of EQ-5D because it was a numeric variable and did not satisfy the assumptions required for linear regression. All models were run univariately, then adjusted incrementally for: socioeconomic factors (age, sex, ethnic group, deprivation) and stage of follow-up (6, 12 or 18 months); pre-existing health conditions (count, respiratory and coronary heart disease, depression, diabetes); vaccination status; and dominant SARS-CoV-2 variant. In the symptomatic infection group, the same models were run to determine the factors associated with these outcomes and recovery status. All analyses were performed using Stata v16.
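For a single binary exposure, the odds ratio from a univariate binary logistic regression equals the cross-product ratio of the corresponding 2×2 table, with a Wald confidence interval computed on the log scale. The Python sketch below applies this to the counts reported in the Results for multiple (≥3) symptoms. It is a crude illustration only, not a reproduction of the study's Stata models or of any adjusted estimate, and the never-infected denominator is derived here (69,192 respondents with only negative tests minus 6235 exclusions) rather than quoted from the text.

```python
import math


def odds_ratio_with_ci(exposed_yes, exposed_total, ref_yes, ref_total, z=1.96):
    """Crude odds ratio and Wald 95% CI from a 2x2 table (referent = unexposed)."""
    a = exposed_yes                  # exposed, outcome present
    b = exposed_total - exposed_yes  # exposed, outcome absent
    c = ref_yes                      # referent, outcome present
    d = ref_total - ref_yes          # referent, outcome absent
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


symptomatic_total = 31_486             # symptomatic infections (from the text)
never_infected_total = 69_192 - 6_235  # derived denominator (assumption)

# >=3 symptoms: 14,236 after symptomatic infection vs 19,613 never infected.
crude_or, ci_low, ci_high = odds_ratio_with_ci(
    14_236, symptomatic_total, 19_613, never_infected_total
)
```

The adjusted models described above additionally condition on sociodemographic factors, follow-up stage, pre-existing conditions, vaccination status and dominant variant, so their estimates cannot be recovered from crude counts alone.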

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.