Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data





BMC Medical Genomics, 6:2

First Online: 29 January 2013; Received: 18 May 2012; Accepted: 25 January 2013

Abstract

Background

Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set analysis (GSA) methods for chemical treatment identification, for pharmacological mechanism elucidation, and for comparing compound toxicity profiles.

Methods

We created 30,211 chemical response-specific gene sets for human and mouse by next-gen TM, and derived 1,189 human and 588 mouse gene sets from the Comparative Toxicogenomics Database (CTD). We tested for significant differential expression (SDE; false discovery rate-corrected p-values < 0.05) of the next-gen TM-derived gene sets and the CTD-derived gene sets in gene expression (GE) data sets of five chemicals from experimental models. We tested for SDE of gene sets for six fibrates in a peroxisome proliferator-activated receptor alpha (PPARA) knock-out GE dataset and compared the results to those from the Connectivity Map (cMap). We tested for SDE of 319 next-gen TM-derived gene sets for environmental toxicants in three GE data sets of triazoles, and tested for SDE of 442 gene sets associated with embryonic structures. We compared the gene sets to triazole effects seen in the Whole Embryo Culture (WEC), and used principal component analysis (PCA) to discriminate triazoles from other chemicals.
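The significance criterion above (false discovery rate-corrected p-values < 0.05) can be illustrated with a small sketch. The abstract does not say which FDR procedure was used, so the Benjamini-Hochberg step-up adjustment here is an assumption, and the function names, gene set names, and p-values are hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: the abstract only says "false discovery rate-corrected";
    Benjamini-Hochberg is used here as one common choice.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_gene_sets(p_by_set, alpha=0.05):
    """Keep gene sets whose adjusted p-value is below alpha."""
    names = list(p_by_set)
    adjusted = benjamini_hochberg([p_by_set[name] for name in names])
    return {name: q for name, q in zip(names, adjusted) if q < alpha}


# Hypothetical per-gene-set p-values from a gene set analysis run.
example_p_values = {
    "set_a": 0.001,
    "set_b": 0.02,
    "set_c": 0.04,
    "set_d": 0.30,
}
sde_sets = significant_gene_sets(example_p_values)
print(sorted(sde_sets))
```

In this toy input, `set_c` has a raw p-value below 0.05 but does not pass after adjustment, which is the practical effect of applying the correction across many gene sets at once.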

Results

Next-gen TM-derived gene sets matching the chemical treatment were significantly altered in three GE data sets, and the corresponding CTD-derived gene sets were significantly altered in five GE data sets. Six next-gen TM-derived and four CTD-derived fibrate gene sets were significantly altered in the PPARA knock-out GE dataset. None of the fibrate signatures in the Connectivity Map (cMap) scored as significant against the PPARA GE signature. Thirty-three environmental toxicant gene sets were significantly altered in the triazole GE data sets, and 21 of these toxicants had a toxicity pattern similar to that of the triazoles. We confirmed embryotoxic effects, and discriminated triazoles from other chemicals.

Conclusions

Gene set analysis with next-gen TM-derived chemical response-specific gene sets is a scalable method for identifying similarities in gene responses to other chemicals, from which one may infer potential mode of action and/or toxic effect.

Keywords

Text mining, Toxicogenomics, Gene set analysis

Electronic supplementary material

The online version of this article (doi:10.1186/1755-8794-6-2) contains supplementary material, which is available to authorized users.




Author: Kristina M Hettne - André Boorsma - Dorien A M van Dartel - Jelle J Goeman - Esther de Jong - Aldert H Piersma - Rob 

Source: https://link.springer.com/






