These are my notes for week 1 of Harvard’s Genetics 228: Genetics in Medicine from Bench to Bedside course. I added the class late so this is just a quick review of the paper discussed in class.

Week 1 of this class focused on Huntington disease, and the class discussed the paper “CAG repeat expansion in Huntington disease determines age at onset in a fully dominant fashion” [Lee 2012].

Huntington disease had been mapped to chromosome 4p by linkage [Gusella 1983] for an entire decade before the causal variant was determined to be a CAG repeat expansion in the first exon of a gene now called HTT [MacDonald 1993]. One of the original lines of evidence that the CAG expansion was the causal variant, and not just another linked marker, was that the repeat length was inversely correlated with age of onset [MacDonald 1993].

In the 22 years since then, many research groups studying Huntington disease have proposed models to quantify exactly how CAG repeat length and age of onset are related. One such effort was undertaken by [Aziz 2009]. Aziz’s model asserted that the mutant (longer) CAG length, the wild-type (shorter) CAG length, and an interaction term between the two were all significant predictors of age of onset.

The presently considered paper [Lee 2012] is framed as a response to [Aziz 2009]. The authors analyze CAG length and age of onset data in >4,000 individuals and find that the association with normal (shorter) CAG length is driven by a single outlier with a 120 CAG mutant allele and an 11 CAG wild-type allele. Because the vast majority of individuals with HD have CAG repeat lengths in the 40-50 range, an outlier with a CAG length of 120 can exert a huge amount of leverage on the model. Once this point was excluded, neither the shorter CAG length nor the interaction between shorter and longer CAG length was significant. Indeed, the study included 10 individuals in whom both alleles of HTT had CAG length ≥36 (the lowest repeat length typically observed in HD patients), and these individuals had ages of onset consistent with their longer alleles.
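The pull that one extreme individual has on a regression can be quantified as leverage, the diagonal of the hat matrix. Here is a minimal sketch on purely synthetic data (not the data from [Lee 2012]): the cohort size, the ranges of the repeat lengths, and the function name `leverages` are my own assumptions for illustration. It builds a design matrix with an intercept, the longer allele, the shorter allele, and their interaction, then compares the leverage of a single 120/11 individual to the average.

```python
import numpy as np

def leverages(X):
    """Diagonal of the hat matrix H = X (X'X)^-1 X', computed via a thin QR."""
    Q, _ = np.linalg.qr(X)
    return (Q ** 2).sum(axis=1)

rng = np.random.default_rng(0)
n = 4000  # hypothetical cohort size, roughly the scale described in the text

# Synthetic repeat lengths; these ranges are illustrative assumptions only.
long_cag = rng.integers(40, 51, size=n).astype(float)   # longer allele, 40-50
short_cag = rng.integers(15, 31, size=n).astype(float)  # shorter allele, 15-30

# Append one extreme individual: 120 CAG longer allele, 11 CAG shorter allele.
long_cag = np.append(long_cag, 120.0)
short_cag = np.append(short_cag, 11.0)

# Design matrix: intercept, longer allele, shorter allele, interaction term.
X = np.column_stack([np.ones(n + 1), long_cag, short_cag, long_cag * short_cag])

h = leverages(X)
outlier_leverage = h[-1]
mean_leverage = h.mean()  # equals (number of columns) / (number of rows)

print(f"outlier leverage: {outlier_leverage:.3f}")
print(f"mean leverage:    {mean_leverage:.5f}")
print(f"ratio:            {outlier_leverage / mean_leverage:.0f}x")
```

Leverage depends only on the predictors, not on age of onset, so this says nothing about whether the point is wrong, only that the fitted coefficients are unusually sensitive to it.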

The authors frame this as a positive result: Huntington disease is purely dominant, not co-dominant. HDBuzz explained this finding to its followers by saying that this makes things simpler and is more consistent with what we know about mutant huntingtin. Because it had the largest sample size reported to date, this study has also become the go-to citation for the relationship between CAG and age of onset; for instance, Stanley Prusiner always cites it in his annual keynote at the Prion conference when he argues that huntingtin is a prion.

A limitation of this study is that with only 10 people with both CAGs ≥36, there is not a lot of statistical power to determine whether having a second allele in this “mutant” range influences age of onset. Another limitation is that this and almost all studies of Huntington disease age of onset use log-linear models, which may not provide the best fit to the underlying data. Finally, models of age of onset reported in studies like this can be misleading if used in genetic counseling without conditioning on the patient’s current age: a person who is still asymptomatic at 50 necessarily has an expected age of onset later than 50, whatever the unconditional prediction says. Many of the difficulties in predicting age of onset will ultimately only be resolved through prospective studies of asymptomatic individuals harboring causal mutations, and these studies will take decades to complete. Still, for a rigorous actuarial dissection of age of onset in Huntington disease, taking into account some of the issues I have noted above, it is worth reading [Langbehn 2010].
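To make the point about conditioning on current age concrete, here is a small sketch. It assumes, purely for simplicity, that age of onset for a given CAG length is normally distributed; that is my assumption, not the model used in any of the papers cited here, and the mean of 45 and standard deviation of 8 are made-up numbers. The function name `expected_onset_given_unaffected` is likewise hypothetical. It uses the standard formula for the mean of a normal distribution truncated from below.

```python
from statistics import NormalDist

def expected_onset_given_unaffected(mean_onset, sd_onset, current_age):
    """E[onset | onset > current_age], assuming a normal onset distribution.

    Uses the truncated-normal mean: mu + sigma * pdf(z) / (1 - cdf(z)),
    where z = (current_age - mu) / sigma.
    """
    std = NormalDist()  # standard normal, mean 0 and sd 1
    z = (current_age - mean_onset) / sd_onset
    survival = 1.0 - std.cdf(z)  # probability of still being unaffected
    return mean_onset + sd_onset * std.pdf(z) / survival

# Hypothetical parameters, for illustration only.
MEAN_ONSET = 45.0
SD_ONSET = 8.0

for age in (20, 35, 45, 50, 55):
    conditional = expected_onset_given_unaffected(MEAN_ONSET, SD_ONSET, age)
    print(f"unaffected at {age}: expected onset {conditional:.1f} "
          f"(unconditional {MEAN_ONSET:.1f})")
```

For a young person the conditional and unconditional expectations are nearly identical, but the gap widens as the person approaches and passes the unconditional mean, which is exactly the situation in which quoting the unconditional number in counseling would be most misleading.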