New AI project aims to solve mysteries of rare childhood diseases

Illumina, D3b to analyze 100,000 whole genomes to help understand causes

Written by Marisa Wexler, MS

Genetic sequencing company Illumina is teaming up with the Center for Data-Driven Discovery in Biomedicine (D3b) on a new project that aims to leverage genomic data from thousands of children to better understand the biological underpinnings of childhood rare diseases and cancers.

With this initiative, D3b, a center at the Children’s Hospital of Philadelphia focused on translating scientific discoveries into medical treatments for patients, will use Illumina’s software and artificial intelligence (AI)-ready platforms to analyze whole genomes from 100,000 children with pediatric health conditions. A whole genome encompasses a person’s entire genetic code, totaling upwards of 6 billion individual base pairs (the building blocks, or letters, of the genetic code).

These genetic datasets have been collected through federally funded programs such as the Gabriella Miller Kids First Data Resource Center and the Children’s Brain Tumor Network.

“Genomic datasets like these give researchers powerful insight for precision medicine. Through advances in data, software, and AI, we are moving toward a future where genomic insights drive faster research breakthroughs,” James Han, vice president of bioinformatics at Illumina, said in a press release from the company.

Project features one of the largest unified genomic datasets

Childhood rare diseases, such as AADC deficiency, are largely caused by genetic mutations. Many childhood cancers are also associated with mutations present at birth. But while AADC deficiency is known to be caused by mutations in the DDC gene, researchers have struggled to pinpoint a specific cause for many other rare diseases.

Part of the problem is that, to draw definitive conclusions about rare genetic diseases, researchers often need to comb through data from thousands of individual genomes. This lets scientists zero in on disease-causing mutations among the innumerable genetic changes that are a healthy part of normal human variability. Often, these datasets end up being siloed, with researchers at each institution relying on only a small amount of data gathered at their own center.

This new project is looking to address that issue by pooling thousands of genomes into a single unified database. According to Illumina, this new compilation of genomes is one of the largest unified genomic datasets ever assembled.

“We’re excited to apply leading-edge software technology against some of the toughest challenges in pediatric cancer and congenital conditions,” said Allison Heath, director of data technology and innovation at D3b. “Our goal is to empower researchers to uncover new biological signals and to bring genomic insights into routine clinical decision-making, creating a new standard of care.”