PathCluster: a framework for hierarchical gene set clustering
Identify key molecular functions or annotation categories and the relationship between them in large-scale expression profiles
Gene clustering and knowledge-based gene set analysis have been widely used
to infer useful biological insights and rich descriptions for large-scale gene expression
profiles. However, the conventional strategy based on a
posteriori mapping of biological knowledge (i.e., functional annotation of
genes) onto gene clusters has several limitations, calling for a more integrative
and comprehensive method. To address this issue, I propose a simple but effective
solution by directly interrogating the expression profiles with available
knowledge (in terms of gene sets) using hierarchical clustering. The method, PathCluster, generates
an ordered list of gene sets in a dendrogram, in which the relationships between gene sets
or annotation categories can be visually investigated (e.g., putative
interaction between molecular functions or possible synergism between regulatory
sequence motifs). Key signatures, as well as the relationships between
them, can be identified, providing relevant and testable hypotheses in the
context of expression datasets. The use of extended biological
databases (e.g., functional annotation, the presence
of regulatory sequence motifs corresponding to transcription factor binding
sites or miRNA, and literature-based signatures and drug signatures representing
specific experimental settings or perturbation by drugs, respectively) can enhance the applicability
as well as the impact of the method. The software
package of PathCluster provides an easy-to-follow user interface as well as a
graphical interface to display the results.
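
The idea of clustering gene sets directly against the expression profiles can be illustrated with a minimal Python sketch. It is not the PathCluster implementation: the summary of each gene set by the mean expression of its member genes, the distance (1 minus Pearson correlation), the average linkage, and all function and variable names (gene_set_profiles, cluster_gene_sets, the toy genes and sets) are assumptions made for illustration only.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: a flat profile is treated as uncorrelated
    return cov / (vx * vy) ** 0.5


def gene_set_profiles(expression, gene_sets):
    """Summarize each gene set as the per-sample mean of its measured genes."""
    profiles = {}
    for name, genes in gene_sets.items():
        rows = [expression[g] for g in genes if g in expression]
        if rows:  # skip gene sets with no measured member gene
            profiles[name] = [mean(col) for col in zip(*rows)]
    return profiles


def cluster_gene_sets(profiles):
    """Average-linkage agglomerative clustering of gene set profiles.

    Returns (tree, order): a nested-tuple dendrogram and the leaf order.
    Expects at least one profile.
    """
    names = sorted(profiles)
    dist = {
        frozenset((a, b)): 1 - pearson(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }
    # Each cluster is (subtree, member names); start with one leaf per gene set.
    clusters = [(n, (n,)) for n in names]

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        return mean(dist[frozenset((a, b))] for a in c1[1] for b in c2[1])

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0]


# Toy data (hypothetical): five genes measured in four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [1.2, 2.1, 2.9, 4.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [3.8, 3.1, 2.2, 0.9],
    "geneE": [1.0, 3.0, 1.0, 3.0],
}
gene_sets = {
    "set_up_1": ["geneA", "geneB"],
    "set_up_2": ["geneA"],
    "set_down": ["geneC", "geneD"],
    "set_mixed": ["geneE"],
}

tree, order = cluster_gene_sets(gene_set_profiles(expression, gene_sets))
print(order)  # leaf order of the dendrogram
print(tree)   # nested tuples: gene sets merged early sit deepest
```

In this sketch, gene sets whose summary profiles rise and fall together across samples are merged first, so closely related sets end up adjacent in the ordered list, which is the kind of relationship the dendrogram is meant to expose.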
- PathCluster software package for the Windows platform - Download
- After downloading, run SETUP.msi to install the PathCluster package
(trial expression datasets and default functional gene set data are included)
- PathCluster manual document (requires Acrobat Reader) - Download
Send e-mail to developer