Check out the pangenome, the graph of us all

The new pangenome reference is a collection of different genomes against which an individual genome sequence can be compared. Like a map of the subway system, the pangenome graph has many possible routes for a sequence to take, represented by the different colors. Credit: Darryl Leja, NHGRI

We used to imagine DNA as the book of life, the code, the Rosetta stone of Homo sapiens. But the repertoire of metaphors needs updating. Today, our species portrait has taken on the appearance of a network of nodes and relationships. Welcome to the age of the pangenome: the collective genome (pan in Greek means everything) that aspires to become more and more complex, plural, cosmopolitan and inclusive.

The Human Pangenome Project, an initiative launched by the US National Institutes of Health, lands today in Nature with three original papers and in Nature Biotechnology with a fourth study. Among its authors is Italian bioinformatician Giulio Formenti, who told us what this paradigm shift consists of. After the draft genome in 2001 came the near-complete genome in 2003; last year it was the turn of the end-to-end, telomere-to-telomere (T2T) genome, and now here is the pangenome. Do we have to move beyond the logic of the finish line? “Yes, it’s a work in progress, the last transition is just beginning,” says this 35-year-old talent, who teaches at Rockefeller University in New York.

Let’s try to understand what distinguishes the latest result from the grandiose announcement of the standard sequence at the time of the Clinton presidency. The Human Genome Project had worked on a handful of samples donated by volunteers, recruited in Buffalo through an advertisement in a local newspaper. All the sequences obtained since then have been compared with that patchwork genome, which was treated as the standard even though it was far from representative of the world.

“The pioneer of human genetic diversity was Luigi Luca Cavalli-Sforza. The pangenome picks up his legacy by bringing it into the era of precision sequences and assembly super-algorithms,” the Milan-born scientist tells us. Technologies make a difference here, because enormous computing power is required to catalogue and annotate, letter by letter, sequence by sequence, every variation between individuals and populations.

If we open the double helix, DNA is a one-dimensional string of letters, but a set of human genetic variations must be layered, must acquire depth. What does it look like? “In jargon, it is called a graph. It is basically a network whose nodes represent shared and divergent regions. The structure of the data makes it possible to analyse it in an innovative way, see how the variations are distributed and make new discoveries.”
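For readers who like to see the idea in code, here is a minimal sketch of such a graph. It is a toy under loud assumptions: the node numbers, DNA letters and sample names are invented for illustration, and the real pangenome is built with far more elaborate tools. Each node holds a stretch of sequence, and each genome copy is a route through the nodes, much like a line on the subway map.

```python
from collections import defaultdict

# Toy sequence graph. All ids, letters and sample names are hypothetical.
nodes = {
    1: "ACGT",  # stretch carried by every genome copy
    2: "G",     # one version of a variable site
    3: "T",     # the alternative version of the same site
    4: "CCA",   # stretch carried by every genome copy
}

# Each genome copy (haplotype) is an ordered walk through the nodes.
paths = {
    "sample1_copy1": [1, 2, 4],
    "sample1_copy2": [1, 3, 4],
    "sample2_copy1": [1, 2, 4],
}


def spell(path):
    """Rebuild the linear DNA string by following a path through the graph."""
    return "".join(nodes[node_id] for node_id in path)


def node_usage(all_paths):
    """Map each node id to the set of genome copies that pass through it."""
    usage = defaultdict(set)
    for name, path in all_paths.items():
        for node_id in path:
            usage[node_id].add(name)
    return usage


def shared_nodes(all_paths):
    """Nodes that every genome copy passes through."""
    usage = node_usage(all_paths)
    return sorted(n for n, names in usage.items() if len(names) == len(all_paths))


def divergent_nodes(all_paths):
    """Nodes that only some genome copies pass through."""
    usage = node_usage(all_paths)
    return sorted(n for n, names in usage.items() if len(names) < len(all_paths))
```

Calling `spell(paths["sample1_copy2"])` rebuilds that copy's one-dimensional string, while `shared_nodes(paths)` and `divergent_nodes(paths)` separate the stretches everyone carries from the sites where the copies differ. A single linear reference cannot hold that distinction.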

So far, the pangenome contains only 94 genomes, belonging to 47 people (each of us has two copies). “This may not seem much, as there are projects with many thousands of genomes. But the pangenome requires perfect sequences; quality is as important as quantity,” Formenti explains. Another peculiarity concerns the updates to be made as new data arrives: each time, the pangenome has to be reworked from scratch. The third difficulty concerns the medical community, which will have to learn to compare patients’ DNA with a graph instead of a simple reference sequence. “This will take time, but it’s worth doing, because then it will be possible to assess each individual variation and tailor treatments, for example in the field of oncology.”

By mid-2024, the pangenome will contain 700 genomes belonging to 350 individuals, and the consortium is discussing what criteria to follow in selecting samples. Italy will be represented thanks to samples sequenced at the Human Technopole by Alessandro Raveane and researchers from the Centre for Medical and Population Genomics, directed by Nicole Soranzo, in collaboration with the University of Florence. “Our country has been a crossroads of migrations, so it has great variability; think, for example, of Tuscans and Sardinians. It’s good to be included in the pangenome for everyone’s benefit,” says Formenti.

As the pangenome becomes more inclusive, it will be a better mirror in which to look at ourselves as a species. What image will it give back? “Today you read in books that we are all 99.9 per cent the same; these percentages will have to be recalculated. We will find confirmation that there is no such thing as race in the sense of rigid, separate categories, and human diversity will appear greater than has been documented so far.”

(translated from Corriere della Sera)
