The software of life

The aim was to understand diseases such as cancer, Alzheimer’s and Parkinson’s. Thirty years ago, the project to decode the human genome was launched. But the human blueprint is still far from being understood.

It was a project on the scale of the moon landing: mapping the genetic blueprint of humans by identifying all the genes on the DNA strand, roughly 3.2 billion letters long, that is spread across the 23 chromosomes. The human genome is written with only four letters: adenine (A), cytosine (C), guanine (G) and thymine (T). They form the software of life, so to speak.

Start of work 30 years ago

Molecular biologist Robert Sinsheimer, chancellor of the University of California, Santa Cruz, and as such accustomed to raising large sums for physics and astronomy, got the ball rolling in 1985. In 1988, the Human Genome Organization (HUGO) was founded as an association of scientists and research institutions to coordinate the participating working groups.

The actual Human Genome Project (HGP) started on 14 September 1990 as a public, predominantly American, large-scale research project. It quickly became a loose network of working groups from more than 30 countries. Various centers in the USA took on around 60 percent of the work. British scientists accounted for a quarter of the task. Genome researchers from France, Japan, China and Germany worked on the remaining sequences. According to the plan, the work was to be finished by 2005. The total cost: around 3 billion dollars.

Race for money and honor

The unity did not last long, however. U.S. scientist Craig Venter, initially part of the research project, announced in 1998 that he would decode the genome on his own with his company Celera Genomics, using a technique called the shotgun method that was much faster but, according to many scientists, inaccurate and incomplete.

Venter relied on the greatest possible automation and the concentrated power of his computers. He broke the DNA down into snippets not with enzymes but with mechanical force (ultrasound); the snippets were then analyzed and reassembled with the help of immense computing power. It became a race for money and honor, in which Celera had the advantage of access to the competition's data. Venter, for his part, did not share his findings.
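The idea of cutting DNA into snippets and letting a computer put them back together can be illustrated with a toy sketch. This is not Celera's actual software: the function names `shear`, `overlap` and `greedy_assemble` are hypothetical, the snippets are cut at fixed positions rather than randomly as ultrasound would do, and the example assumes error-free reads and a sequence without repeats.

```python
def shear(genome: str, read_len: int, step: int) -> list[str]:
    """Cut the sequence into overlapping snippets (fixed windows, an assumption)."""
    return [genome[i:i + read_len]
            for i in range(0, len(genome) - read_len + 1, step)]


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0  # no overlap of at least min_len letters


def greedy_assemble(reads: list[str], min_len: int = 3) -> str:
    """Repeatedly merge the two snippets that overlap the most."""
    reads = list(reads)
    while len(reads) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # the remaining pieces cannot be joined
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return max(reads, key=len)  # longest reconstructed piece


genome = "ATGCCGTAAGTCTGA"            # made-up 15-letter sequence
snippets = shear(genome, read_len=7, step=4)
assembled = greedy_assemble(snippets[::-1])  # order of snippets does not matter here
print(assembled == genome)
```

Real genomes contain long repeated stretches and sequencing errors, so a simple greedy merge like this one can join the wrong pieces; that is one way to picture why critics considered the method error-prone.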

"Working version" of the human genome

In the end, the two sides partially worked together again. In June 2000, the "working version" of the human genome was announced, and on 12 February 2001 it was published: an unimaginably long "text" that would fill about 3,000 books, each with 1,000 pages of 1,000 letters.

It turned out that this "text" is 99.9 percent identical in all people. The researchers were also able to read off approximately how many genes humans have. The result was a surprise: humans have only about 20,000 to 25,000 genes, only twice as many as a fruit fly, for example. So how can the complexity of Homo sapiens be explained?

Human genome considered completely deciphered

Since 2003, the human genome has been considered completely deciphered. Today, fast computers can read the genome of any human being in a few hours. Researchers have identified the genes responsible for some diseases; for example, knowledge of the genetic components of Alzheimer’s and diabetes has led to better understanding and more targeted treatments.

However, diseases caused by a single defective gene are rare. It will probably take decades before the interplay of genes is understood. One door has been unlocked, but behind it are many new closed doors, says chemist Friederike Fehr of the Max Planck Institute in Göttingen, describing the current state of research. The text of the genome is known. But to understand it, more is needed. How are genes regulated, and what do the proteins they produce do? What information lies between the genes? What was initially called data garbage or junk is now considered an additional layer of information.

US President Bill Clinton said on 6 April 2000: "Now we learn the language God used to create life." Today it is clear that many vocabulary words are still missing.

