This afternoon, FDA granted Accelerated Approval to tofersen, an antisense oligonucleotide drug for SOD1 ALS. Tofersen is now the best proof-of-concept yet that an antisense oligonucleotide can potentially do what we want it to do in prion disease.
We partnered with Ionis Pharmaceuticals in 2014 to work on developing an ASO to lower PrP in prion disease. While clinical trials have been delayed, the company says it is still committed to the program, and we still hope that an ASO could prove effective. When I first blogged about this effort in 2018, I noted that the FDA-approved drug nusinersen for spinal muscular atrophy was the best clinical proof of concept so far that ASOs could be effective in neurological disease. But nusinersen is not a perfect analogy to what we want to do in prion disease, for at least two reasons. First, nusinersen is a splice-modulating ASO — it changes how its target RNA is cut and pasted together, rather than lowering the amount of RNA; this means it has both a different chemistry and a different mechanism of action (if you’re confused, read this backgrounder). Second, spinal muscular atrophy causes the worst pathology in the spinal cord, which is also the region best reached by ASOs delivered via a lumbar puncture, whereas prion disease is a whole brain disease.
Thus, when we got bad news about tominersen, the ASO for Huntington’s disease, then more bad news about BIIB078 for C9orf72 ALS, there was room to wonder how much we should temper our optimism for a prion disease ASO. The results for tofersen, a drug to lower causal SOD1 protein in SOD1 ALS, on the other hand, were not as easy to interpret. The drug missed its primary endpoint in a Phase III trial, but results trended slightly in the right direction, and the drug significantly lowered plasma neurofilament light (NfL), considered a marker of neuronal damage. This led some observers to speculate that the drug had actually yielded its intended benefit on the molecular level and on the cellular level, but just hadn’t had enough time to turn the ship around at the whole human being level.
FDA’s decision today takes exactly that view. Specifically, Accelerated Approval means that FDA considers the decrease in plasma NfL to be “reasonably likely” to predict that the drug is beneficial, while not yet being enough evidence to grant the drug full approval. Thus, Biogen still has to proceed with the open-label extension of the original VALOR clinical trial, as well as conduct the ATLAS trial (NCT04856982) [Benatar 2022], which is testing tofersen in pre-symptomatic at-risk SOD1 mutation carriers. The hope is that these studies will provide additional evidence that tofersen works. If they don’t, then there is a possibility the provisional approval granted today could be withdrawn or limited.
evaluating the evidence for tofersen
The slides I blogged about last time were released by Biogen in October 2021, and reflected a July 2021 data freeze. The trial investigators subsequently published a paper based on a further January 2022 data freeze [Miller 2022] (full text here). To the best of my understanding, FDA’s decision is based on the January 2022 data freeze, so it is worth unpacking those data here. Ironically, the slides based on the July 2021 freeze show further-out timepoints for functional measures, up to 80 weeks, versus just 52 weeks for the January 2022 data in the paper. It turns out that not all participants had that much follow-up in the original data freeze; the January 2022 data freeze was designed to include 52 weeks of data for everyone (a year from when the last participant was randomized). The trial was originally randomized and blinded for 28 weeks, tofersen versus placebo; at week 28, the placebo participants were crossed over onto tofersen, and the whole cohort was followed out to 52 weeks, comparing the early-start tofersen group to the originally-placebo, now late-start tofersen group. The double blind is maintained to this day, adding credibility to the comparison between the early- and late-start groups. The primary endpoint concerned the fast-progressing SOD1 mutations, which make up the majority of study participants, so almost all of the analysis in the paper is focused on that group.
Like the original slides, the published paper contains a data visualization gotcha. The error bars for the CSF biomarkers, SOD1 and NfL, in Figure 1 are 95% confidence intervals, while the error bars for functional measures in Figure 2 are single standard errors of the mean. Usually, when I see single standard errors, I try to mentally double the error bars to see how large the overlap truly is. Doing that mental exercise, it would appear that the confidence intervals in Figure 2 would indeed overlap, and thus that the differences are not even nominally significant at 52 weeks. However, the confidence intervals are stated in the text as well as in Table S5 and they do not overlap zero, implying they are nominally significant. For example, for ALSFRS-R, which was the primary endpoint, the difference is 3.5, confidence interval 0.4 to 6.7, in favor of early-start tofersen. How can this be? Reading the page-and-a-half long Statistical Analysis section of the paper, it would appear to me that this is because they used a joint rank test on these open-label extension endpoints. That means that when testing whether the ranks of ALSFRS-R are better in the early-start group than the late-start group at 52 weeks, people who died before 52 weeks are ranked at the bottom. Probably, the error bars in Figure 2 are based solely on the people still alive at each timepoint, whereas the 95% confidence interval in Table S5 is based on this joint rank method which accounts for deaths. If so, then for all of these key measures — ALSFRS-R, percentage of predicted slow vital capacity, and handheld dynamometry score — accounting for deaths makes a substantial difference in how good the outcome looks for early-start tofersen.
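To make the joint rank idea concrete, here is a toy sketch in Python. All of the patients, scores, and group sizes below are made up for illustration, and real joint rank procedures are more involved than this simple ordering; the point is only to show how ranking deaths below all survivors changes a between-group comparison.

```python
# Toy illustration of a joint rank comparison (hypothetical data).
# Real joint rank analyses are more sophisticated; this just shows the
# ordering principle: deaths rank below all survivors.

def joint_rank_key(patient):
    died, weeks, score = patient
    # Deaths sort below survivors; among deaths, earlier death ranks worse.
    # Among survivors, higher ALSFRS-R (better function) ranks higher.
    return (0, weeks) if died else (1, score)

# (died_before_week_52, weeks_survived, ALSFRS-R at week 52 if alive)
early = [(False, 52, 30), (False, 52, 25), (True, 40, None)]   # hypothetical
late  = [(False, 52, 22), (True, 30, None), (True, 20, None)]  # hypothetical

everyone = [(p, "early") for p in early] + [(p, "late") for p in late]
everyone.sort(key=lambda pg: joint_rank_key(pg[0]))  # worst outcomes first

ranks = {"early": [], "late": []}
for rank, (p, group) in enumerate(everyone, start=1):
    ranks[group].append(rank)

mean_early = sum(ranks["early"]) / len(ranks["early"])  # higher = better
mean_late = sum(ranks["late"]) / len(ranks["late"])
```

Under this ordering, a group can come out ahead on average rank even when the scores of its survivors alone would not clearly separate it from the other group, which is consistent with how accounting for deaths could widen the apparent treatment difference relative to the survivor-only error bars in Figure 2.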
Survival itself then becomes key to everything: it is the one metric not affected by this particular method of accounting, and it, in turn, underpins everything else. We should therefore turn our attention to the analysis of time to death or permanent ventilation as of January 2022. The hazard ratio is 0.36, 95% confidence interval 0.14 to 0.94, for early-start tofersen. That means that people who got tofersen earlier died only about one-third as rapidly as people who got tofersen later, and while we’re not very confident in that point estimate, we’re quite confident that a difference in survival exists. What’s more, the median survival of everyone with SOD1 A4V (the most common fast-progressing mutation) was already 1.7 years at that data freeze (Figure S4), higher than all available estimates of median survival for A4V (which center on ~1.2 years, Figure S5). The authors point out that only 2 people have ever been reported in the literature to survive longer than 3 years with A4V, whereas one trial participant is already at 3.7 years and another two are closing in on 3 years. The open-label extension is designed to remain blinded for a total of 3.5 years of follow-up for each participant. That would mean that if people continue to survive, the “final final” trial results should become available in July 2024.
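For intuition about what a hazard ratio of 0.36 would mean, here is a back-of-envelope calculation assuming a constant-hazard (exponential) survival model. The trial made no such modeling claim, and the 1.2-year input is just the ballpark A4V figure mentioned above; the numbers are purely illustrative.

```python
import math

hr = 0.36            # hazard ratio, early-start vs. late-start tofersen
median_late = 1.2    # illustrative median survival (years) for the comparison group

# Under an exponential model, S(t) = exp(-lambda * t), so the median
# survival is log(2) / lambda, and a hazard ratio simply rescales lambda.
lam_late = math.log(2) / median_late    # hazard for the comparison group
lam_early = hr * lam_late               # early-start hazard is 36% as large
median_early = math.log(2) / lam_early  # implied median: median_late / hr
```

Under that simplifying assumption, median survival scales by 1/HR, so a hazard ratio of 0.36 would correspond to roughly a tripling of median survival, which is at least directionally consistent with the long-surviving A4V participants described above.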
How much can we believe these sorts of comparisons of survival in a trial versus natural history in the literature? Might the trial enrollment criteria have biased towards long survivors, or might the fact that everyone in the trial now knows they are on active drug make the results not comparable to the prior published literature? Both concerns are valid: there are certainly biases out there, and there are good reasons why people do randomized trials rather than always comparing to natural history. But at some point the survival does become evidence of efficacy.
Priors matter, and the supporting data matter too. Let me contrast tofersen with a drug that I don’t think works. In prion disease, I have not been convinced that pentosan polysulfate extended survival, despite data showing that the four longest-lived vCJD patients ever were all treated with the drug [Newman 2014]. Those patients spent those years in a vegetative state. In animal models, pentosan polysulfate only worked at pre-symptomatic timepoints. In these human patients with very advanced disease, my best guess is that pentosan was not actually effective, but simply selected for families committed to keeping patients alive by any means necessary. This would be consistent with evidence that family decisions around end-of-life care impact survival in prion disease [Nagoshi 2011, McNiven 2019]. What makes the tofersen data different is that we’re seeing survival longer than the natural history data not in isolation, but in conjunction with a nominally significant survival benefit in early- vs. late-start drug treatment, plus favorable differences in multiple functional measures, plus a biomarker that may suggest neuronal rescue.
In designing clinical trials, you want to be as rigorous as possible, hence pre-specification of statistical analysis plans, hence planned data freezes, hence primary endpoints, and so on. But at some point, you also want to be able to use your common sense. Of course, calling it “common sense” doesn’t do it justice, because the data are genuinely complicated. But given that tofersen missed the primary endpoint while looking overall favorable, at some point you have to ask yourself, do I think that it is more likely to work, or more likely not to work? While I approached the topic with initial skepticism, after reading the paper carefully I started to feel that yes, tofersen probably does work. It may not work as well as one might hope — we still need better SOD1 ALS drugs. And it probably doesn’t work as fast as one might hope; some people think the failed primary endpoint was just due to too short a randomization period. But after sitting with it for a while, I thought, if Sonia had a SOD1 mutation instead of a PRNP mutation, would I want this drug to be approved? Yes. I am not certain that it’s safe and effective, we could all be wrong, but the data make me think that tofersen is more likely than not to be beneficial. Indeed, the data look good enough that I actually find myself wondering whether it is still ethical to randomize at-risk SOD1 carriers in the ATLAS trial, when there is a probably-safe-and-effective drug that could help them. Three of the nine FDA Advisory Committee members recommended not just Accelerated Approval but full approval, perhaps feeling the same way.
implications for prion disease
I said earlier there were two limitations that made nusinersen an imperfect analogy to the PrP-lowering ASO we want to develop. Tofersen now addresses one of the two: like the drug we want to develop, and unlike nusinersen, tofersen is designed to lower the amount of a target RNA molecule. The other limitation still stands: both nusinersen and tofersen are approved for neurological indications where spinal cord pathology predominates. ASOs dosed intrathecally — via a lumbar puncture — achieve better concentration in the spinal cord than in many brain regions [Jafar-nejad & Powers 2021]. How well an ASO can treat a whole brain disease remains to be established clinically. In addition, we still don’t know why the ASOs for Huntington’s and C9orf72 not only failed but seemed to make patients worse — was it the therapeutic hypothesis, on-target toxicity, or off-target toxicity? And if the answer is off-target, how many other ASOs will have similar off-target effects, versus how many will look relatively clean (though certainly not devoid of adverse events) like tofersen? In other words, there’s still plenty of uncertainty about whether an ASO for prion disease will work, and that’s why there will be trials. But after what’s been a rough few years for ASO therapies in neurodegenerative disease, today’s tofersen news is a reason for an uptick in optimism.