Scientists tried to replicate a provocative gene-editing paper in real time, and documented it on Twitter
A study linking an edited CCR5 gene with dying young didn't pass the smell test
Last year, scientist Jiankui He shocked the world when he used the revolutionary gene editing tool CRISPR-Cas9 on two human babies in China during their embryonic development. The edits made deletions in CCR5, a gene that plays an important role in helping HIV infect immune cells. The hope was that the edits would make the babies immune to HIV.
The move rocked the scientific community, sparking debate about the bioethical implications of human genome editing and the potential unknowns of using the technology to create immunity against diseases.
In June, an article published in Nature Medicine reported that although a deletion in the CCR5 gene can lead to immunity against HIV, it is also associated with a 21% higher mortality rate in individuals with two copies of the altered gene. The findings suggested that one of the twins edited by Jiankui He has a higher chance of dying young than her sibling, raising serious questions about the long-term health effects that could result from human genome editing.
Human chromosomes, the super-packed carriers of our genetic information, come in pairs, so each of us holds two copies of nearly every gene in the human genome, one passed down from your mother and one from your father. There is a chance that the two copies of the CCR5 gene you inherit differ, each with slight changes in genetic information. This is referred to as heterozygosity. Or you might inherit the same genetic information for this gene from both parents, which means that you have two identical copies, called homozygosity.
Of the two babies who were edited with the CRISPR system, one was homozygous for the deletion in the CCR5 gene, and one was heterozygous. The findings of the Nature Medicine study, led by biologist Rasmus Nielsen at the University of California, Berkeley, suggested that the baby with the homozygous deletion in CCR5 has a potentially higher risk of mortality than her sibling due to the intentional editing of her genome. This finding appeared to confirm the worries of many in the scientific community about genome editing, and rocketed the paper to international headlines.
Yet despite the new research's widespread publicity, some scientists had doubts about the results. Within 24 hours of the paper's publication, and the resulting hailstorm of international headlines, scientists around the world were already attempting to replicate the study's analyses and come to their own conclusions, documenting their processes in near-real time on Twitter.
The researchers encountered a number of roadblocks that prevented them from being able to replicate the results. This lack of reproducibility led many to question the accuracy of the new findings, raising concerns that they were possibly false or over-inflated. And because the new study was making international headlines, some worried that the public was being seriously misinformed by unverifiable science.
Statistician Sean Harrison of the Institute for Epidemiology at the University of Bristol, who had access to the UK Biobank dataset used in the original paper, practiced “live science” by publicly attempting to replicate the paper's findings on Twitter. His long thread laid out every assumption about the analysis that he could glean from the original publication, along with the computer code he wrote and the data manipulations he performed (details that were not reported for the original paper). His results were markedly less significant than those published. “Basic things we do to genetic data weren’t done,” Harrison commented. “There’s massive gaps in the methods.”
Nielsen's group calculated the survival rate for individuals with each variant of the CCR5 deletion, both homozygous and heterozygous, using data from over 400,000 individuals whose genetic information is held in the UK Biobank. This data is not publicly available, but can be accessed with approval from the institution. This restricted access was the first roadblock to replication. Without the dataset in hand, there was no way for fellow researchers to complete their own analyses.
Those few researchers who did have access to the dataset, including Sean Harrison, ran into another issue: incomplete reporting of various components of the original authors' statistical approach.
Multiple analyses were used to calculate the study's major result, the 21% increase in mortality for individuals with homozygous mutations. However, the exact parameters that Nielsen and his team used to calculate this value were not clearly stated within the published manuscript or available supplemental material, making the analyses nearly impossible to replicate.
Cecile Janssens, an epidemiologist at Emory University, also publicly attempted to replicate the results reported in the paper, but stopped far earlier than Harrison. “The paper is too confusing,” she wrote on Twitter, “with essential data unreported.”
With so many doubts cast upon this highly publicized paper, it’s no wonder that Rasmus Nielsen took to Twitter to publicly respond to the naysayers. He then performed a follow-up analysis that adopted Harrison’s methods for adjusting samples for genetic relatedness, along with some of the suggested changes to the analytic setup. According to Nielsen, this analysis yielded the same results as the original paper. “If you, or anybody else, have questions about the paper or the analyses, please feel free to contact us,” Nielsen tweeted. “We will be happy to help, and we apologize if some of the Methods section is difficult to parse.”
There are still a few problems with the paper, the most obvious being its reliance on the UK Biobank itself. The database relies on data contributed by its subjects on a volunteer basis, which can skew the analysis through the “healthy volunteer effect.” This happens because healthy people are more likely to volunteer their data than unhealthy ones, potentially resulting in a lower mortality rate in the sample than in the general population.
In this case, the death rate of the UK Biobank sample was 46-58% lower than the national average, meaning it is possible that the true change in mortality rate for people with the mutation is larger than reported. Additionally, the healthy volunteer effect could mean that groups for whom the CCR5 mutation could be advantageous (populations with higher rates of HIV) were not included in the Biobank sample, which would also skew the calculated death rate.
In the end, the main takeaway of Nielsen's paper for the scientific community wasn’t just the finding that genome editing of CCR5 can lead to a higher mortality rate in homozygous persons. It also underscored a growing problem: the reproducibility of science is declining, which can lead to wrong information being spread not only within the scientific community, but to the public as well.
A 2016 poll reported in Nature showed that over 1,000 scientists out of 1,500 had tried and failed to reproduce at least one other scientist’s experiment, and this problem is not getting better. In a follow-up to his Twitter thread, Sean Harrison further commented on why he thinks his public dismantling of the CCR5 science was important.
“This paper has received an incredible amount of attention,” he writes. “If it is flawed, then poor science is being heavily promoted.”