body properly. Current ways of classifying heart failure do not accurately predict how the disease is likely
to progress.
For the study, published in The Lancet Digital Health, researchers looked at detailed anonymised patient data from more than 300,000 people aged 30 years or older who were diagnosed with heart failure in the UK over a span of 20 years.
Using several machine learning methods, they identified five subtypes: early onset, late onset, atrial
fibrillation related (atrial fibrillation is a condition causing an irregular heart rhythm), metabolic (linked to obesity but with a low rate of cardiovascular disease), and cardiometabolic (linked to obesity and
cardiovascular disease).
The researchers found differences between the subtypes in patients’ risk of dying in the year after
diagnosis. The all-cause mortality risks at one year were: early onset (20%), late onset (46%), atrial
fibrillation related (61%), metabolic (11%), and cardiometabolic (37%).
The research team also developed an app that clinicians could use to determine which subtype a person with heart failure has, which may improve predictions of future risk and inform discussions with patients.
Lead author Professor Amitava Banerjee (UCL Institute of Health Informatics) said: “We sought to
improve how we classify heart failure, with the aim of better understanding the likely course of disease
and communicating this to patients. Currently, how the disease progresses is hard to predict for
individual patients. Some people will be stable for many years, while others get worse quickly.
“Better distinctions between types of heart failure may also lead to more targeted treatments and may
help us to think in a different way about potential therapies.
“In this new study, we identified five robust subtypes using multiple machine learning methods and
multiple datasets.
“The next step is to see if this way of classifying heart failure can make a practical difference to patients –
whether it improves predictions of risk and the quality of information clinicians provide, and whether it
changes patients’ treatment. We also need to know if it would be cost effective. The app we have
designed needs to be evaluated in a clinical trial or further research, but could help in routine care.”
To avoid bias from a single machine learning method, the researchers used four separate methods to
group cases of heart failure. They applied these methods to data from two large UK primary care
datasets, which were representative of the UK population as a whole and were also linked to hospital
admissions and death records. (The datasets were Clinical Practice Research Datalink (CPRD) and The
Health Improvement Network (THIN), covering the years 1998 to 2018.)
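As a rough illustration of why several methods help, one simple way to check whether two grouping methods agree is to count how often they place the same pair of patients together or apart (the Rand index). The sketch below is not taken from the study: the function name, the measure chosen and the six-patient labels are all hypothetical, and the researchers may have compared their four methods differently.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of patient pairs on which two groupings agree.

    A pair counts as agreement when both methods put the two patients in
    the same group, or both put them in different groups. Label names do
    not matter, only who is grouped with whom.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same patients")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two patients")
    agree = 0
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:
            agree += 1
    return agree / len(pairs)


# Hypothetical group assignments for six patients (illustrative only).
method_1 = [0, 0, 1, 1, 2, 2]
method_2 = [1, 1, 0, 0, 2, 2]  # same grouping as method_1, different label names
method_3 = [0, 0, 0, 1, 1, 2]  # a partly different grouping

print(rand_index(method_1, method_2))
print(rand_index(method_1, method_3))
```

A value of 1 means the two methods produce identical groupings; lower values mean they disagree on more pairs of patients. Subtypes that keep appearing across methods are less likely to be an artefact of any one algorithm.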
The research team trained the machine learning tools on segments of the data and, once they had
selected the most robust subtypes, they validated these groupings using a separate dataset.
The subtypes were established on the basis of 87 (of a possible 635) factors including age, symptoms, the
presence of other conditions, the medications the patient was taking, and the results of tests (e.g., of
blood pressure) and assessments (e.g., of kidney function).
The team also looked at genetic data from 9,573 individuals with heart failure from the UK Biobank study.
They found a link between particular subtypes of heart failure and higher polygenic risk scores (scores of
overall risk due to genes as a whole) for conditions such as hypertension and atrial fibrillation.
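A polygenic risk score is commonly calculated as a weighted sum: for each genetic variant, the number of risk alleles a person carries (0, 1 or 2) is multiplied by a weight reflecting that variant's estimated effect, and the results are added up. The sketch below shows only this general idea. The variant names, weights and patient data are made up, treating an unmeasured variant as zero is a simplifying assumption, and none of it reflects the specific scores used in the study.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele counts over the variants in `weights`.

    dosages: mapping of variant name to risk-allele count (0, 1 or 2).
    weights: mapping of variant name to effect-size weight.
    Variants missing from `dosages` are treated as 0 (an assumption made
    here for simplicity, not a recommendation).
    """
    score = 0.0
    for variant, weight in weights.items():
        score += weight * dosages.get(variant, 0)
    return score


# Hypothetical weights and genotype (illustrative only).
weights = {"variant_A": 0.10, "variant_B": 0.25, "variant_C": -0.05}
patient = {"variant_A": 2, "variant_B": 1, "variant_C": 0}

print(polygenic_risk_score(patient, weights))
```

In practice such scores are usually compared across a population rather than read as absolute values, so a "higher" score means higher relative to other people scored in the same way.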
The study was supported by the BigData@Heart Consortium from the European Union Innovative
Medicines Initiative-2.