Strong, Michael J.; Blanchard, Eugene; Lin, Zhen; Morris, Cindy A.; Baddoo, Melody; Taylor, Christopher M.; Ware, Marcus L.; Flemington, Erik K.
Next generation sequencing (NGS) can globally interrogate the genetic composition of biological samples in an unbiased yet sensitive manner. The objective of this study was to use NGS to investigate the reported association between glioblastoma multiforme (GBM) and human cytomegalovirus (HCMV). A large-scale, comprehensive virome assessment was performed on publicly available sequencing datasets from The Cancer Genome Atlas (TCGA), including RNA-seq datasets from primary GBM (n = 157), recurrent GBM (n = 13), low-grade gliomas (LGG; n = 514), recurrent low-grade gliomas (n = 17), and normal brain (n = 5), and whole genome sequencing (WGS) datasets from primary GBM (n = 51), recurrent GBM (n = 10), and normal matched blood samples (n = 20). In addition, RNA-seq datasets from MRI-guided biopsies (n = 92) and glioma stem-like cell cultures (n = 9) were analyzed. Sixty-four DNA-seq datasets from 11 meningiomas and their corresponding blood control samples were also analyzed. Finally, three primary GBM tissue samples were obtained, sequenced using RNA-seq, and analyzed. After in-depth analysis, the most robust virus findings were the detection of human papillomavirus (HPV) and hepatitis B virus reads in occasional LGG samples (4 samples and 1 sample, respectively). In addition, low numbers of virus reads were detected in several datasets, but detailed investigation of these reads suggests that these findings likely represent artifacts or non-pathological infections. For example, all of the sporadic low-level HCMV reads mapped to the immediate early promoter, suggesting that they likely originated from laboratory expression vector contamination. Low numbers of Epstein-Barr virus reads were detected in some samples, but these likely originated from infiltrating B-cells.
Finally, reads aligning to human herpesvirus 6 and 7 were identified in all DNA-seq and a few RNA-seq datasets, but detailed analysis indicated that these were likely derived from homologous human telomeric-like repeats. Other low-abundance viral reads were detected in some samples, but for most viruses the reads likely represent artifacts or incidental infections. This analysis argues against associations between most known viruses and GBM or meningiomas. Nevertheless, there may be a low-percentage association between HPV and/or hepatitis B virus and LGGs.