Today I’m blogging from the Center for Human Genetic Research 2013 annual retreat.  These notes will be pretty unpolished and there are more notes on some speakers than others.

Speaker 1: John Rosand – Shared risk loci between different brain phenotypes

Dr. Rosand’s area of study is cerebrovascular disease – when blood clots block circulation to part of the brain (i.e. ischemic stroke) or when bleeding occurs in the brain (i.e. intracerebral hemorrhage).

The Trial of Org-10172 in Acute Stroke Treatment (TOAST) tried to create six phenotypic categories for cerebrovascular disease, in the hope that a patient’s phenotypic category (along with genetics) would help predict drug response.  When they did their genotype-phenotype association study, they found that the strongest risk locus had already been identified as associated with white matter hyperintensity on MRI.  Another risk locus, which held up on replication, had already been associated with cardiovascular disease.  There’s also a connection with Alzheimer’s.  Hemorrhagic stroke can result from amyloid buildup in arteries – does this phenotype share risk factors with amyloid buildup in Alzheimer’s disease?  Indeed, APOE E2 and E4 showed up as very significant risk factors for hemorrhagic stroke – but with opposite effect directions (E4 is protective and E2 is harmful, whereas for Alzheimer’s E4 is harmful and E2 is protective).

After finding how many shared loci there are, they decided to launch a grand cross-disorder analysis, cleverly titled “BRAINSTORM”, of genetic risk factors in a variety of neurological and psychiatric disorders, in the hope of identifying shared mechanisms.

Q&A: Small infarcts (normally associated with stroke) are also very common in Alzheimer’s disease.  We are still wondering why the APOE E2 and E4 effect directions are opposite.  May have to do with Aβ drainage: E2 may facilitate Aβ drainage, clearing it from the neurons but clogging the vessels.

Speaker 2: Alysa Doyle – Translating emerging schizophrenia genetics to a child clinical cohort

Dr. Doyle focuses on schizophrenia because it’s common: there’s a large public health impact and it is easy to gather large cohorts for studies.  So far what we know about schizophrenia genetics is there are lots of tiny effect size SNPs identified through GWAS, as well as (often de novo) microdeletions and microduplications.  What we know about the clinical course is that there are subtle motor, cognition, language and social impacts long before the first ‘episode’.

We’d like to be able to intervene early.  On the other hand, treating kids for psychiatric stuff is really controversial so we’ll need to have very strong evidence in order to proceed.  In the psychiatry field there is a move towards recognizing the ‘dimensional’ nature of disorders and adopting a ‘dimensional’ diagnostic schema (i.e. multiple quantitative traits) rather than discrete diagnostic categories.  Therefore our clinical translational efforts must be able to deal with quantitative traits.

Cognition is a valuable quantitative phenotype here – it cuts across bipolar, ADHD, autism, OCD – and cognitive differences are often seen in ‘asymptomatic’ relatives.  It is hypothesized that cognition may underlie the liability for a lot of different disorders.  Therefore: the newly proposed longitudinal study of genetic influences on cognition (LOGIC) will follow a deeply phenotyped youth cohort over time.

Speaker 3: Chris Newton-Cheh – Blood pressure genetics

High blood pressure is a risk factor for stroke, heart failure, heart attacks, and end-stage kidney disease.  Natriuretic peptides are peptides (very short proteins) that cause sodium to be excreted through the urine.  Chris studies two major ones: atrial natriuretic peptide (ANP, released in the upper heart chambers which are called ‘atria’) and brain (or B-type) natriuretic peptide (BNP, named for its discovery in pig brains but mostly released in the heart in humans).  SNPs in the UTRs of the ANP and BNP genes are associated with the extent of ANP/BNP response to salt intake, and thus with blood pressure.  How can an untranslated SNP have this phenotypic effect?

The answer is differential binding of microRNAs.  Pankaj Arora identified three microRNAs that are predicted to bind differentially to the NPPA (gene which codes for ANP) 3′UTR depending on these SNPs.  Two of these microRNAs are expressed in cardiac tissue.  A luciferase assay identified one miR that suppresses the major allele but not minor allele of ANP.  Another assay showed that an antisense RNA to suppress that miR increases major allele but not minor allele expression.  And finally, an experiment on human cardiomyocytes showed that exogenous introduction of that miR suppresses ANP release.  This confirms miR binding as a mechanism for this SNP’s effect.

They have not yet found mechanisms for the effect of the other SNPs, and they think those SNPs are almost certainly just tagging other variants via linkage disequilibrium.

Speaker 4 – Jeremiah Scharf – Tourette syndrome

Tourette syndrome (TS) is characterized by tics.

Tourette’s frequently co-occurs with OCD and ADHD.  It has an estimated heritability of 60-80% based on studies of identical and fraternal twins, but at present we know almost nothing about the genetic architecture of this heritability, and there are no convincing Mendelian forms of the syndrome.  GWAS have identified some hits that are close to genome-wide significance, but not quite there.  One is SLITRK5, and Slitrk5 knockout mice have OCD-like behaviors [perhaps he refers to Shmelkov 2010?] so that looks like a strong candidate.
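
As an aside on where a twin-based figure like 60-80% can come from: one classic approach is Falconer's formula, which doubles the difference between the identical-twin (MZ) and fraternal-twin (DZ) trait correlations.  The talk did not say which method produced the TS estimate, so treat this as a generic sketch; the two correlations below are made-up values chosen only for illustration.

```python
def falconer_heritability(r_mz: float, r_dz: float) -> float:
    """Falconer's estimate of heritability: h^2 = 2 * (r_MZ - r_DZ)."""
    for r in (r_mz, r_dz):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlations must lie in [-1, 1]")
    return 2.0 * (r_mz - r_dz)


# Hypothetical twin correlations, NOT taken from any Tourette syndrome study.
h2 = falconer_heritability(r_mz=0.75, r_dz=0.40)
print(f"estimated heritability: {h2:.2f}")
```

The intuition is that MZ twins share essentially all of their segregating variation and DZ twins about half, so the excess MZ similarity reflects roughly half of the additive genetic variance.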

They are currently embarking on GWAS as well as rare variant studies combining TS and OCD to maximize power.

Speaker 5 – Susan Cotman – Neuronal ceroid lipofuscinosis (NCL)

Neuronal ceroid lipofuscinosis includes a bunch of different diseases including Batten disease, which is basically the juvenile form of NCL. (Ran out of battery during this talk).

Speaker 6 (Keynote) – David Reich – Evidence for human interbreeding with neanderthals

Three challenges in getting neanderthal DNA from ancient bones:

  1. Getting enough DNA at all
  2. Of the DNA you extract from bone, only 0.1 – 3.0% of it is from the neanderthal – the rest is from microbes, fungi etc that colonized their bodies after death.
  3. It is very easy to contaminate the neanderthal DNA with modern human DNA during handling (by archaeologists, lab techs, etc.) and since the genomes are almost identical (~1 variant in 600b, so 100b sequencing reads will be mostly identical to human genome) it is hard to tell when contamination has happened.
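
To put a number on challenge 3: if differences occur about once every 600 bases, the chance that a 100-base read happens to cover none of them can be estimated as below.  This is a back-of-the-envelope sketch that assumes differences fall independently and uniformly along the genome, which is my simplification and not something stated in the talk.

```python
def prob_read_uninformative(read_len: int, variant_spacing: float) -> float:
    """Chance that a read covers no human/neanderthal difference,
    assuming differences occur independently at rate 1/variant_spacing per base."""
    per_base = 1.0 / variant_spacing
    return (1.0 - per_base) ** read_len


p = prob_read_uninformative(read_len=100, variant_spacing=600)
print(f"{p:.1%} of 100-base reads would carry no distinguishing variant")
```

Under these assumptions roughly 85% of reads look exactly like modern human sequence, which is why contamination is so hard to spot read by read.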

Today’s solutions: extract DNA from center of bone (less chance for human contamination), prepare libraries in a ‘clean room’ and ligate barcodes in the clean room that will rule out later contamination.

Early studies used mitochondrial DNA because there are ~1000 mitochondria per cell, so it’s easier to get more mtDNA than nuclear DNA.  Based on mtDNA it could be shown that neanderthals were always more different from humans than humans were from each other.  This suggested that neanderthals and humans never interbred.

However, non-Africans match neanderthal SNPs more often than Africans do.  Africans and non-Africans split a long time ago and Africans are more diverse.  So we looked for anomalous genomic sites where non-Africans exhibited more diversity than Africans do.  Of those, the non-African variant matched neanderthals in 80% of cases.  This provides really strong evidence for ‘gene flow’ due to neanderthals interbreeding with non-Africans after the out of Africa migration.
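
One common way to formalize "non-Africans match neanderthals more often than Africans do" is the ABBA-BABA test (the D statistic).  I am not certain this is exactly the analysis described in the talk, so take it as a sketch of the general idea; the toy sites are invented, and the outgroup (e.g. chimpanzee) is assumed to carry the ancestral allele.

```python
def d_statistic(sites):
    """sites: iterable of (african, non_african, neanderthal, outgroup) alleles.

    ABBA = non-African shares the derived allele with the neanderthal.
    BABA = African shares the derived allele with the neanderthal.
    D = (ABBA - BABA) / (ABBA + BABA); D > 0 means excess non-African sharing.
    """
    abba = baba = 0
    for p1, p2, p3, out in sites:
        # Keep only biallelic sites where the neanderthal carries the derived allele.
        if len({p1, p2, p3, out}) != 2 or p3 == out:
            continue
        if p1 == out and p2 == p3:
            abba += 1
        elif p2 == out and p1 == p3:
            baba += 1
    if abba + baba == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / (abba + baba)


# Invented toy data: 6 ABBA sites, 4 BABA sites, 5 uninformative sites.
toy_sites = (
    [("A", "G", "G", "A")] * 6
    + [("G", "A", "G", "A")] * 4
    + [("A", "A", "G", "A")] * 5
)
d = d_statistic(toy_sites)
print(f"D = {d:.2f}")
```

With no gene flow the two patterns should be equally common and D should sit near zero; a consistently positive D is the signature of extra sharing between non-Africans and neanderthals.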

We now think 1-4% of DNA in non-Africans is from interbreeding with neanderthals.  Based on linkage disequilibrium we think the interbreeding occurred within the last 85,000 years, which establishes the interbreeding as having occurred outside of Africa.
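
The LD-based dating rests on the fact that recombination breaks up introgressed segments a little more every generation, so admixture-induced LD between two sites separated by genetic distance d (in Morgans) decays roughly as exp(-t·d) after t generations.  Below is a minimal sketch that recovers t from a decay curve with a log-linear fit.  It assumes a single pulse of admixture, uses synthetic noise-free data, and the 29 years per generation is my assumption rather than a figure from the talk.

```python
import math


def generations_since_admixture(distances_morgans, ld_values):
    """Least-squares slope of ln(LD) against distance, for LD ~ A * exp(-t * d)."""
    xs = list(distances_morgans)
    ys = [math.log(v) for v in ld_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx  # the slope is -t, so flip the sign


# Synthetic curve generated with t = 2000 generations (illustrative only).
dist = [0.0002, 0.0004, 0.0006, 0.0008, 0.0010]
ld = [0.05 * math.exp(-2000 * d) for d in dist]

t = generations_since_admixture(dist, ld)
years = t * 29  # assumed years per generation
print(f"{t:.0f} generations, about {years:,.0f} years")
```

Real data are noisy and the decay curve also reflects demography, so actual estimates come with wide confidence intervals.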

Neanderthal ancestry is especially strong in human immune genes esp. HLA; other human genomic regions have almost no neanderthal ancestry left.  Perhaps neanderthals were better adapted to the geographic regions (Europe, Asia) that humans migrated into and so variants helpful for surviving in those regions were selected for.

Topic 2: admixture in Indian history.  [Reich 2009, and a forthcoming paper by Priya Moorjani].  Indians are less studied in genomics because the Indian government makes it very hard to get genetic material out of the country.  Only now are South Asians (people from Pakistan and Bangladesh, and Indians living in the U.S., U.K., etc.) finally making it into 1000 Genomes and similar projects.

When you MDS plot Indians, you find “the Indian Cline” – Indians are on a spectrum between peoples west (Persians, Europeans) and east (Andamanese, East & Southeast Asian).   Or if you MDS plot Europeans vs. Chinese you find PC1 separates them; if you then add Indians, they are spread along PC1.  Whereas if you plot Indians vs. Chinese, then add Europeans, you don’t see any cline.

Explanation consistent with history is that Indians are admixed from ancestral north Indians (‘ANI’) and ancestral south Indians (‘ASI’).  Different groups in India vary from 20% ‘northern’ and 80% ‘southern’ to 70%/30%.  Dravidian speakers and lower castes are associated with more ASI ancestry.  Some disease traits – esp. cardiac risk – associated with Indians come from the ASI ancestry.  Linkage disequilibrium size suggests that the ANI-ASI admixture occurred less than 3500 years ago with the rise of Vedic religion (Hinduism).  This represents a major demographic transformation long after the advent of agriculture and is different from more recent Indian history where endogamy (marrying people of same region, caste etc) has been expected.

Different Indian groups have different levels of endogamy which can be detected by linkage disequilibrium.  In Vysya, huge LD size suggests a recent (~2000 years) founder event and less than 1 in 100 marriages to outsiders per generation.  Medical implication: most recessive genetic diseases in Indians are due to the close genetic relationships within whole regions / caste groups, not to recent consanguinity (marriage between second cousins, etc.).

An ancient sample from Luxembourg is ‘more European than Europeans’, suggesting that Europeans are also an admixture of ancient Europeans with Middle Easterners.

Q&A: Hybrid incompatibility issues that lead to speciation tend to arise first on the X chromosome. We do see plenty of neanderthal SNPs on modern human X chromosomes [perhaps referring to Yotova 2011??]

Q&A: The ANI admixture into India cannot be entirely, or maybe even at all, explained by Indo-European speakers’ migration into India as told by the Veda.  There are at least two different admixture events among upper caste North Indians and we don’t know if either corresponds to the Indo-European migration.

Speaker 7 – Rakesh Karmacharya – Psychiatric genetics

Schizophrenia is a psychosis disorder and has both ‘positive symptoms’ = things they have that you don’t (e.g. hallucinations) and ‘negative symptoms’ = things they don’t have that you do, as well as cognitive symptoms.  Bipolar is a mood disorder: manic and depressive.  But there is actually a spectrum between these two where patients have different amounts of mood and psychosis symptoms.  Both disorders have high heritability (identical twin concordance = 50%).

Treatment has not changed much.  Even today, the first line of treatment in bipolar is lithium (discovered as a drug in 1949); for schizophrenia the first line is still dopamine antagonists based on chlorpromazine (discovered 1952), though chlorpromazine itself is not used as much.  All of these drugs have low tolerability and efficacy.  Our study of them is hampered by lack of access to relevant tissue (i.e. the brain).

Goal: identify molecular ‘signatures’ of bipolar and schizophrenia, so that we can then screen for small molecules that modify those signatures.  Open question: will cellular deficits / molecular signatures be present at baseline or only under specific environmental conditions, and if the latter, can we simulate relevant environmental factors in vitro?

Currently looking for molecular or morphological signatures in patient-derived iPSCs differentiated into neurons in culture.  Looking at different differentiation conditions to find relevant cell types where the differences we see are due to genetic background and not just artifacts of the differentiation process.

Overall vision is to perform ‘an unbiased image-based and gene expression-based profiling of disease and control cells in the presence of small molecule perturbations.’  Procedure:

  • Obtain iPSC from schizophrenia/bipolar patients and healthy controls
  • Grow and differentiate cells in 384-well plates
  • Treat with a battery of 320 different small molecules
  • Stain with 6 stains for different parts of cell (nuclei, lysosomes, etc.)
  • Use CellProfiler to segment images and generate about 500 features on each object in the image
  • Search these 500 features for any that correlate with disease vs. healthy genotype
  • Also do gene expression profiling on the wells and look for gene expression signature of the disease(s).

Currently in data analysis stage.  The goal is to find a morphological or gene expression ‘signature’ of schizophrenia / bipolar which can then be used in high throughput screens for therapeutics.
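
The feature-search step in the procedure above could look roughly like this: for each image feature, compare patient-derived wells against control wells and rank features by standardized effect size.  The talk did not say which statistic the group uses, so Cohen's d here is a stand-in, and the feature names and values are hypothetical rather than real CellProfiler output.

```python
import math
from statistics import mean, stdev


def cohens_d(case, control):
    """Standardized mean difference between two groups, using a pooled SD."""
    n1, n2 = len(case), len(control)
    pooled_var = (
        (n1 - 1) * stdev(case) ** 2 + (n2 - 1) * stdev(control) ** 2
    ) / (n1 + n2 - 2)
    return (mean(case) - mean(control)) / math.sqrt(pooled_var)


def rank_features(case_wells, control_wells):
    """Each argument maps a feature name to a list of per-well values.
    Returns (feature, d) pairs sorted by absolute effect size, largest first."""
    scores = {f: cohens_d(case_wells[f], control_wells[f]) for f in case_wells}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical feature names and made-up measurements.
case = {"nucleus_area": [10.0, 11.0, 12.0], "lysosome_count": [5.0, 6.0, 7.0]}
control = {"nucleus_area": [10.0, 11.0, 12.5], "lysosome_count": [1.0, 2.0, 3.0]}

ranked = rank_features(case, control)
for feature, d in ranked:
    print(f"{feature}: d = {d:+.2f}")
```

With around 500 features per object, a real analysis would also need multiple-testing correction and would have to account for plate and batch effects before trusting any single feature.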

Speaker 8 – Mike Talkowski – Translating genomics into the clinic

The challenge: for clinical purposes (as opposed to research purposes), n = 1.  You need to interpret one person’s genome and advise accordingly.  Our ‘annotation of the morbid genome’ – i.e. our knowledge of what variations cause disease – is still very poor.

Many of the same genes that cause developmental abnormalities if disrupted through chromosomal translocations turn out to also be highly enriched for SNPs that are risk factors for ADHD, schizophrenia, etc.  But even that is just looking at phenotypes relatively early in life.  The next step is to look at whether they are also enriched for late-onset conditions such as dementia, e.g. Alzheimer’s.

An open question is how to educate clinicians and patients on how to interpret high-resolution genetic information when even researchers don’t know enough.  However, remember: these medical decisions are currently being made without any high-resolution information.  The standard in the field is that if a fetus has an abnormal karyotype, the parents are told that there is a 6.1% risk of an ‘untoward outcome.’  This is an extremely crude estimate based on a study from decades ago.

Speaker 9 – Daniel MacArthur – Genomic approaches to finding causal mutations for Mendelian diseases

5-8% of U.S. population has a rare disease.  Many of these cannot be definitively diagnosed yet.  Many are genetically and phenotypically heterogeneous.

Exome sequencing identifies loads of variants – how do we filter for the causal variant?

  • Functional annotation – missense, nonsense, splice, etc.
  • Transmission – variant must be shared between all affected individuals in a family
  • Frequency – look only at rarer variants
  • Expression – gene must be expressed in affected tissue type
  • Protein-protein interaction
  • Replication – variant or disease gene turns up in multiple families
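
Several of the filters above are mechanical enough to sketch in code.  This is a toy illustration, not the speaker's actual pipeline: the `Variant` fields, the 1% frequency cutoff and the gene names are all assumptions of mine, and the protein-protein interaction and replication criteria are left out because they need external data.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    consequence: str           # e.g. "missense", "nonsense", "splice", "synonymous"
    population_freq: float     # allele frequency in some reference panel
    carriers: frozenset        # IDs of family members who carry the variant
    expressed_in_tissue: bool  # is the gene expressed in the affected tissue?


DAMAGING = {"missense", "nonsense", "splice"}


def candidate_variants(variants, affected, max_freq=0.01):
    """Keep variants that pass the annotation, frequency, transmission
    and expression filters.  max_freq is an assumed cutoff."""
    keep = []
    for v in variants:
        if v.consequence not in DAMAGING:
            continue  # functional annotation
        if v.population_freq > max_freq:
            continue  # frequency: rare variants only
        if not affected <= v.carriers:
            continue  # transmission: every affected relative must carry it
        if not v.expressed_in_tissue:
            continue  # expression in the affected tissue
        keep.append(v)
    return keep


# Hypothetical family and variants.
affected = frozenset({"child1", "child2"})
variants = [
    Variant("GENE_A", "nonsense", 0.0001, frozenset({"child1", "child2"}), True),
    Variant("GENE_B", "synonymous", 0.0001, frozenset({"child1", "child2"}), True),
    Variant("GENE_C", "missense", 0.20, frozenset({"child1", "child2"}), True),
    Variant("GENE_D", "missense", 0.0005, frozenset({"child1"}), True),
    Variant("GENE_E", "splice", 0.0005, frozenset({"child1", "child2"}), False),
]

hits = candidate_variants(variants, affected)
print([v.gene for v in hits])
```

Even after filters like these, a typical exome leaves multiple candidates, which is why the replication criterion matters so much.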

Politics of research make the ‘replication’ difficult for very rare diseases.  If one researcher has a possible causal gene in one family but has no other families, they can’t reach out to other researchers to share data without fear of being scooped.  Two possible solutions:

1. Reach out directly to the public rather than to other researchers.  This is what Rare Genomics Institute does – it encourages families to come in and use crowdsourcing to support their own exome sequencing.

2. Need a forum for ‘micropublications’ where researchers can publish uncertain works in progress so that their name is associated with it at an early stage and then researchers can find each other.