OICR leads more than 700 researchers from around the world in an unprecedented investigation into the dark matter of the human cancer genome.
Three billion letters of code make up our complete genetic blueprint, yet everything we know about cancer to date comes from only one per cent of those letters.
What about the other 99 per cent? Could those regions be holding clues to new cancer solutions and cures? What could we find if we looked into this dark matter? Dr. Lincoln Stein wanted to find out – and he wasn’t alone.
In the fall of 2015, more than 700 investigators from the International Cancer Genome Consortium (ICGC) expressed interest in exploring these uncharted regions. Four years and hundreds of terabytes of data analysis later, they’ve found ways to map the evolutionary history of cancer, identified traces of the disease long before it is diagnosed, and elevated the world’s standards for genomics data sharing and research.
“When this project was first announced, we were delighted by the overwhelming interest,” says Jennifer Jennings, Senior Project Manager of the ICGC. She says that was when the scientific leadership of ICGC realized that a concerted effort was needed to address common computational and logistical challenges, leverage the strengths of collaborators and develop shared infrastructure to achieve the ultimate goals of this research.
They named this project PCAWG, the Pan-Cancer Analysis of Whole Genomes Project, which would soon become the largest-ever pan-cancer analysis of whole genomes and one of the largest coordinated cancer research endeavours to date.
Stein and a small group of scientific leaders took on the challenge of synchronizing research groups with similar research goals, strategically rearranging expertise and coordinating collaboration on an international scale.
“Organizing and bringing these researchers together was the greatest challenge,” says Stein, who is the Head of Adaptive Oncology at OICR. “Working with others may be slower at first and the benefits aren’t always evident, but the rigour of the resulting science and the progress made is greater than what any of us could do on our own.”
PCAWG researchers went on to investigate more than 2,800 cancer whole genomes from ICGC patient donors across more than 20 primary disease sites such as the pancreas and the brain. They created the computational tools and established the necessary infrastructure to process and analyze more than 800 terabytes of genomic data in a standardized, accurate and timely fashion.
Powered by these tools, they ordered the progression of genetic changes that lead to certain types of cancer and showed that these events may occur decades before diagnosis.
“For exceptional cases like in certain ovarian cancers, we were able to see these early events happening 10 to 20 years before the patient has any symptoms,” says Stein. “This opens up a much larger window of opportunity for earlier detection and treatment than we thought possible.”
Understanding the order of genetic changes that lead to cancer – or the probability that one will occur after another – may allow researchers to outsmart how a tumour evolves. This knowledge could help devise new strategies to treat these changes as they occur or prevent them from occurring in the first place, Stein says.
PCAWG researchers have also discovered common patterns in the distribution of genetic mutations that may point to new causes of cancer. Similar to the common genetic signatures associated with smoking and ultraviolet radiation, these patterns may point to unknown environmental or behavioural causes that, once fully understood, could be used to change course and help prevent cancer.
“The biological insights discovered through PCAWG have tremendously advanced our understanding of cancer genomics and we’re approaching a place where we know all the molecular pathways involved with cancer,” says Stein. “We’ve discovered the causes of two thirds of cancers that were previously unexplained — but this is just the beginning.”
This July, PCAWG data were officially made available to the scientific community as a resource for future cancer research. More than 15 scholarly papers relying on these data have already been published, and 40 are expected to be published within the next year alone.
PCAWG methodologies are now the world’s gold standard for whole genome data processing and analysis. They will continue to be used for years to come as more patient samples are collected and sequenced around the world. All related computational tools, including the data exploration and discovery tools, have been made publicly available.
“We made both the genomic data and the computational pipelines to analyze it free to use for the global cancer research community,” says Stein. “Now, others can analyze these data – or new data – at the same level as we have in the pursuit of new cancer research discoveries.”