Using CPRD primary care data

Learning objectives

By the end of this module, the reader will have learnt:

How is healthcare managed in the UK?

Because CPRD data is collected from healthcare providers in the UK, researchers need to understand the structure of the National Health Service (NHS) and the roles of primary care General Practitioners (GPs) and secondary care providers in order to use CPRD data effectively.

Some conditions are treated mainly in primary care and others mainly in secondary care. Researchers should therefore liaise with UK clinicians to understand how their conditions of interest are treated in the UK, and so assess whether CPRD primary care data, linked data, or both can answer their research question.

The NHS was launched in the UK in 1948 and has since expanded to become the largest publicly funded health service in the world.

The NHS is managed separately within each country: England, Northern Ireland, Scotland, and Wales. This module focuses on the NHS in England. For more about the health service in other parts of the UK, additional information is available at www.nhs.uk.

There are useful resources available online for researchers interested in learning more about the NHS, including its history and structure, and therefore how healthcare is managed within the UK. One such resource is the online course developed by The King's Fund (www.kingsfund.org.uk/health-care-explained/online-course) with helpful videos linked at the end of the page.

The UK healthcare system is generally divided into different levels of services.

Primary care

With the exception of emergencies, primary care is generally the first point of contact for people in need of healthcare. These services are provided by a range of healthcare professionals including GPs, pharmacists, and dentists. Services include:

  • Day-to-day medical treatment and advice,
  • Long-term care for patients with chronic conditions such as hypertension and diabetes,
  • Referrals to secondary care.

Secondary care

These services are provided by specialist consultant doctors and other health professionals. Patients are either ‘admitted’ and stay in hospital, or receive treatment at day surgeries as ‘outpatients’. Secondary care is often referred to as ‘hospital care’, although it also includes community service providers who treat patients in their homes. Services include:

  • Accident and emergency care such as treatment for a fracture,
  • Acute healthcare for conditions requiring a short stay in hospital, for example appendicitis or significant injury. Admissions are either arranged by GP referral or via the Accident and Emergency (A&E) department,
  • Elective care for the treatment of planned, non-emergency admissions such as knee replacements, cataract operations, or bariatric surgery,
  • Planned specialist care, for example childbirth and hysterectomy.  

What is the role of the GP practice in the UK?

GPs in the UK provide a wide range of care in comparison with their counterparts in many other countries. From a research perspective this provides a valuable source of information.

Patient records include clinical details (such as diagnoses and prescribed medication), in addition to information on preventative care and lifestyle choices.

These topics are often of key interest to researchers and this anonymised information is available from GP practices that contribute to CPRD.

GPs are the primary point of contact for healthcare in the UK and account for the majority of all consultations.

Most people in the UK are registered with a GP, although there are a few exceptions, which are described in the ‘Who will not be represented?’ section below. It is only possible to be registered with one GP at any time.

GP practices have a broad range of responsibilities in terms of the care and services they provide:

  • Treatment and care of patients with acute and chronic conditions – the GP offers a range of treatments including prescriptions for medication and therapies such as counselling. GPs are also responsible for referring patients to consultants and hospitals for secondary care. (In general, GPs in the UK are responsible for the long-term care of patients with chronic conditions such as diabetes. In the US, the same patient is far more likely to be monitored and treated by an endocrinology or diabetes and metabolism specialist.)
  • Preventative care – immunisations to prevent diseases such as tetanus, regular vaccinations such as the flu vaccine, and screening for diseases including diabetes, cervical cancer, and bowel cancer.
  • Health education and promotion – advice relating to alcohol and smoking cessation, diet and exercise, contraception, and protection against sexually transmitted diseases. 

How does CPRD work with GP practices? 

CPRD works with the major GP practice software providers in the UK to establish methods of secure and efficient data transfer, so that the coded clinical data recorded by GPs in their Electronic Healthcare Record (EHR) systems can be sent to CPRD. CPRD operates in compliance with the National Data Opt-out Policy for patients in England.

Individual GP practices opt in to working with CPRD and submit a formal joining form in order to take part. Information for GP practices is available at www.cprd.com/join-growing-network-practices-contributing-cprd, including the measures taken to safeguard patient data (www.cprd.com/safeguarding-patient-data) and information about the data that CPRD holds (www.cprd.com/data-protection-and-processing-notice).

Once a GP practice joins CPRD, CPRD liaises with the practice and its software provider to enable secure data sharing. An initial batch of anonymised EHR clinical data, including historical records to date, is transferred from the GP practice to CPRD. From this point on, the practice’s software system manages incremental, automated data transfers to CPRD for as long as the practice remains with CPRD.

CPRD primary care data contains anonymised patient registration information and all coded care events that general practice staff record to support the ongoing clinical care and management of their patients. This includes:

  • Demographic information (year of birth, sex, weight, etc.),
  • Records of clinical events (medical diagnoses, signs, and symptoms),
  • Referrals to specialists and other secondary care settings,
  • Prescriptions issued in primary care (drugs and medical devices),
  • Records of immunisations/vaccinations,
  • Diagnostic testing,
  • Lifestyle information (body mass index (BMI), smoking, and alcohol consumption),
  • And all other types of care administered as part of routine GP practice.

CPRD never receives patient identifiers from a GP practice, such as patient name, address, NHS number, or full date of birth, nor does it receive free text medical notes.

What are CPRD GOLD and CPRD Aurum?

Through this work with GP practices and their software providers, CPRD offers access to the following primary care databases:

  • CPRD GOLD which contains data contributed by practices using Vision® GP practice software,
  • CPRD Aurum which contains data contributed by practices using EMIS Web® GP practice software.

Whilst the general information available in both databases is similar, each has a different file structure and coding system because the data is collected from different GP practice software systems. These differences are described in the data specifications for each database, available at www.cprd.com/primary-care-data-public-health-research. Researchers must consider how the data in each database is structured and coded in order to find the information they require, design their research study, and define their patient cohorts.

Researchers may wish to use only CPRD GOLD primary care data, or only CPRD Aurum primary care data, or a combination of both for their research. The CPRD Aurum FAQs available at www.cprd.com/primary-care-data-public-health-research describe the similarities and differences between the two databases, and what to consider when using CPRD data.  

The CPRD GOLD database is updated monthly and the CPRD Aurum database is updated quarterly. Each update is known as a database build. Data is collected from contributing practices on a regular, ongoing basis, and a ‘snapshot’ of the database is then taken and released as a database build. Researchers should use the latest database build to ensure that the most recent patient opt-outs are reflected. Release notes are published with each build, showing the data metrics for that database version. Each primary care database build has its own Digital Object Identifier (DOI), published at www.cprd.com/digital-object-identifiers-dois-datasets.

How is CPRD primary care data structured and provided for research? 

As mentioned above, the structure for each primary care database is different due to the differences in the GP software used to input the data.

For each patient selected for extraction, CPRD provides all of the associated primary care data as a series of text files.

[Figure: Structure of data files in CPRD GOLD]

[Figure: Structure of data files in CPRD Aurum]

The full description of each file and field in the CPRD primary care databases is provided in the data specifications available at www.cprd.com/primary-care-data-public-health-research.

What coding systems are used in CPRD primary care data?

CPRD GOLD and CPRD Aurum code dictionaries are provided as text files that can be imported into standard statistical software to enable code searching. The medical and product dictionary files are provided in tab-delimited text format.
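As a rough illustration of searching a tab-delimited dictionary file, the Python sketch below filters rows by keyword. The column names (medcode, term), the codes, and the sample rows are invented for this example and are not taken from the CPRD dictionaries; the real column names are given in the data specifications.

```python
import csv
import io

# Invented stand-in for a tab-delimited dictionary file. Real dictionary
# files have more columns; these names and codes are assumptions.
SAMPLE_DICTIONARY = (
    "medcode\tterm\n"
    "1001\tType 2 diabetes mellitus\n"
    "1002\tDiabetic retinopathy\n"
    "1003\tEssential hypertension\n"
    "1004\tGestational diabetes mellitus\n"
)


def search_dictionary(handle, keywords, term_column="term"):
    """Return rows whose term contains any keyword (case-insensitive)."""
    reader = csv.DictReader(handle, delimiter="\t")
    lowered = [keyword.lower() for keyword in keywords]
    return [
        row for row in reader
        if any(keyword in row[term_column].lower() for keyword in lowered)
    ]


# With a real file, pass open(path, newline="", encoding="utf-8") instead.
matches = search_dictionary(io.StringIO(SAMPLE_DICTIONARY), ["diabet"])
for row in matches:
    print(row["medcode"], row["term"])
```

A keyword search of this kind is only a starting point: the candidate codes it returns (here including gestational diabetes, which a study of type 2 diabetes might exclude) would normally be reviewed with a clinician before being used.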

The NHS as a whole has moved to SNOMED Clinical Terms (SNOMED CT) coding, but individual healthcare provider systems have made the transition at different rates.

CPRD has developed a code browser tool specific to the medical codes and product codes used in CPRD GOLD and CPRD Aurum. Researchers should explore the code browser tool to see whether their conditions or treatments of interest have codes that GPs record in their patients’ records. The code dictionaries are updated with each CPRD primary care database release to include any new codes introduced by the GP software systems. The code browser tools can be provided upon request. Please contact enquiries@cprd.com so that download credentials can be set up for you (credentials expire within 7 days).

More details about how GPs code information in patient EHRs can be found in the Defining your study population module.

How do GPs enter patient data? 

Researchers need to consider how GPs may enter data for their specific patients of interest. For example, for conditions that are mainly treated in secondary care, the patient’s primary care EHR may show only the following:

  • The initial symptoms,
  • Potential tests or a diagnosis,
  • Referrals to secondary care,
  • General information fed back from secondary care, which may be added to the EHR,
  • Repeat prescriptions, which may also be recorded and provided by the GP.

Further details may be recorded in secondary care data, for example:

  • The dates and symptoms recorded for a visit to hospital,
  • Any procedures carried out while in the hospital.

Multiple coding systems are used in healthcare data: for example, Read codes and SNOMED CT for recording medical observations in primary care data, Gemscript and dm+d for recording drug and product prescriptions in primary care data, and ICD-10 or OPCS codes in secondary care data. Any particular condition or diagnosis might have several related medical codes.

CPRD recommends that researchers liaise with UK clinicians to understand how their patients of interest are treated and managed, and how their data is recorded, within the UK healthcare system. This helps researchers find the data they require and define their study population. For example, considering how a GP might search for and select a code when recording information helps in building an effective code list to define a study population.
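Once a code list has been agreed, applying it to coded records is conceptually simple. The Python sketch below, with invented field names (patid, medcode, obsdate), codes, and records, finds each patient's earliest record matching a code list. It is an illustration under those assumptions, not CPRD's method, and the real file layouts are described in the data specifications.

```python
from datetime import date

# Invented code list and records; field names, codes, and dates are
# assumptions made for this example only.
CODE_LIST = {"1001", "1002"}

RECORDS = [
    {"patid": "A", "medcode": "1001", "obsdate": date(2015, 3, 2)},
    {"patid": "A", "medcode": "1003", "obsdate": date(2014, 1, 9)},
    {"patid": "B", "medcode": "1002", "obsdate": date(2018, 7, 21)},
    {"patid": "B", "medcode": "1001", "obsdate": date(2016, 5, 30)},
    {"patid": "C", "medcode": "1003", "obsdate": date(2019, 11, 4)},
]


def first_qualifying_event(records, code_list):
    """Map each patient to the earliest date a listed code was recorded."""
    first = {}
    for record in records:
        if record["medcode"] not in code_list:
            continue  # code is not part of the study definition
        patid = record["patid"]
        if patid not in first or record["obsdate"] < first[patid]:
            first[patid] = record["obsdate"]
    return first


cohort = first_qualifying_event(RECORDS, CODE_LIST)
for patid, event_date in sorted(cohort.items()):
    print(patid, event_date.isoformat())
```

Patient C has no record matching the code list and so does not enter the cohort. A code list that is too narrow would exclude patients in the same way, which is why clinical input on how conditions are coded matters.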

How GPs record information in a patient’s medical record during a consultation depends on the GP software system being used, e.g. Vision® or EMIS Web®. The data required for research may therefore be found in different files and coded using different systems. When using primary care data for research, researchers should take these different data entry systems into account: while the general approach and methodology for conducting a study will be similar in both databases, the details of the study protocol may differ. The CPRD Aurum FAQ document available at www.cprd.com/primary-care-data-public-health-research describes the differences and similarities between CPRD GOLD and CPRD Aurum and offers points to consider.

Who will not be represented?

Most people in the UK are registered with a GP, although there are a few exceptions:

  • Prisoners - not registered. Healthcare for prisoners is provided through offender health services commissioned by the NHS.
  • Members of the Armed Forces - not registered. Military GPs provide primary care for members of the Army, Navy, and Royal Air Force.
  • Private patients - may be registered. In the UK a patient can be registered with both a private and an NHS GP.
  • Homeless people - may be registered. Homeless people are 40 times more likely not to be registered with a GP than the general population. Their healthcare generally falls to secondary care settings.

Could external factors affect data recording?

Yes. External factors can affect the data recorded by GPs, and researchers should consider them when using CPRD data for research. For example:

  • The Quality and Outcomes Framework (QOF) was a voluntary scheme introduced in 2004 that encouraged and rewarded GPs in England for implementing good practice in specific domains, e.g.:
    • Clinical – how well are patients who are suffering from chronic diseases (e.g. asthma, diabetes) managed?
    • Public health – how well are preventative measures relating to blood pressure, cervical screening, contraception, obesity, cardiovascular disease, and smoking implemented?

    QOF led to better record keeping and improved data quality in these areas of CPRD primary care data from specific time points, as shown in the CPRD Data Resource Profile publication. More information about QOF is available at www.qof.digital.nhs.uk.

  • The COVID-19 pandemic has affected the number of patients attending GP practices for consultations, and will therefore have a follow-on effect on the data recorded in patient EHRs.

Next module: Using linked data
