Data mining creates new cancer classification system

Published October 7, 2014 | Suzanne Elvidge

Mining data from more than 3,500 tissue samples has yielded a way to classify cancers into 11 subtypes, based on characteristics shared between tumours that arise in different tissues. The findings could help doctors predict patients’ outcomes and choose treatments, and could guide researchers in developing new therapeutics and stratifying patients for clinical trials.

Cancer is a complex disease. Cancers are initially classified by the tissue in which they originate, but sequencing the genomes of different cancer samples has shown that tumours arising from the same tissue of origin fall into different subtypes. A team of researchers from Europe and North America has created a new classification system based on molecular subtypes. The research involved analysing molecular data from 3,527 specimens from patients with 12 different cancer types, the most comprehensive and diverse collection of tumours ever analysed by systematic genomic methods. The team analysed each tumour type using five genome-wide platforms and one proteomic platform. The work was part of the Pan-Cancer Initiative of The Cancer Genome Atlas (TCGA).

The research team first analysed the data from each platform separately and then combined them in an integrated cross-platform analysis. The analyses from the six platforms and the integrated analysis all revealed that the tumours could be divided into 11 major subtypes. This agreement gave the researchers confidence in the subtypes they identified, and also suggests that different kinds of data can be used to classify a tumour.
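To make the idea of combining per-platform results concrete, here is a minimal Python sketch. It is an illustration only and not necessarily the method the researchers used: it assumes each platform has already assigned every sample a cluster label, measures how often two samples land in the same cluster across platforms, and groups samples that agree on most platforms. The function names, the `min_agreement` threshold, and the toy labels are all hypothetical.

```python
def co_association(labels_by_platform):
    """Fraction of platforms on which each pair of samples shares a cluster."""
    label_lists = list(labels_by_platform.values())
    n_samples = len(label_lists[0])
    if any(len(labels) != n_samples for labels in label_lists):
        raise ValueError("every platform must label the same samples")
    matrix = [[0.0] * n_samples for _ in range(n_samples)]
    for i in range(n_samples):
        for j in range(n_samples):
            shared = sum(labels[i] == labels[j] for labels in label_lists)
            matrix[i][j] = shared / len(label_lists)
    return matrix


def integrated_subtypes(labels_by_platform, min_agreement=0.6):
    """Group samples linked by pairwise agreement >= min_agreement.

    Assumption: a simple connected-components grouping stands in for
    whatever integrated clustering a real study would apply.
    """
    matrix = co_association(labels_by_platform)
    n_samples = len(matrix)
    subtype = [None] * n_samples
    next_id = 0
    for start in range(n_samples):
        if subtype[start] is not None:
            continue
        subtype[start] = next_id
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n_samples):
                if subtype[j] is None and matrix[i][j] >= min_agreement:
                    subtype[j] = next_id
                    stack.append(j)
        next_id += 1
    return subtype


# Hypothetical toy data: six samples labelled by three made-up platforms.
toy_labels = {
    "platform_a": [0, 0, 0, 1, 1, 1],
    "platform_b": ["x", "x", "y", "y", "z", "z"],
    "platform_c": [5, 5, 5, 7, 7, 7],
}
print(integrated_subtypes(toy_labels))  # prints [0, 0, 0, 1, 1, 1]
```

In the toy data, one platform splits the samples differently from the other two, yet the integrated grouping still recovers two subtypes. This mirrors the point above that agreement across different kinds of data lends confidence to the classification.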
