Gene Set Utilities

From GeneWeaver Wiki



GeneWeaver tools operate on a weighted bipartite adjacency matrix: a table of Association Scores in tab-delimited text format, with one row per Gene and one column per GeneSet. For many GeneSets, the scores are binary. To create sample GeneWeaver data for development or offline analysis:
  1. Perform a database query using the search field.
  2. Add the GeneSets to a project.
  3. Go to the "Analyze GeneSets" page.
  4. Select the project or specific GeneSets from projects.
  5. Run the "Combine" tool under "Project Utilities" in the left side bar.
  6. Save the file to your computer.
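
Once the file is saved, it can be loaded for offline analysis. The following is a minimal Python sketch, not the tool's own code. It assumes the first row is a header whose first cell labels the gene column and whose remaining cells are GeneSet identifiers, and that a blank or zero score means "no association"; check these assumptions against your exported file. The gene and GeneSet names in the sample are hypothetical.

```python
import csv
import io


def read_association_matrix(handle):
    """Parse a tab-delimited Gene x GeneSet table into {geneset: {gene: score}}.

    Assumed layout (not confirmed by the export format): a header row whose
    first cell labels the gene column, then one row per gene.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    geneset_ids = header[1:]
    matrix = {gs: {} for gs in geneset_ids}
    for row in reader:
        if not row:
            continue  # skip blank lines
        gene, scores = row[0], row[1:]
        for gs, score in zip(geneset_ids, scores):
            value = float(score) if score else 0.0
            if value != 0.0:  # assumption: zero or blank means no association
                matrix[gs][gene] = value
    return matrix


# Hypothetical sample data standing in for a saved "Combine" output file.
sample = "gene\tGS1\tGS2\nDrd2\t1\t0\nOprm1\t1\t1\nBdnf\t0\t1\n"
matrix = read_association_matrix(io.StringIO(sample))
```

For a real file, pass an open file handle (for example `open(path, newline="")`) instead of the `io.StringIO` sample.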

Anchored Biclique of Biomolecular Associations (ABBA)

Given a set of interesting genes, do other genes have similar relationships to known sets of genes? For example, given a set of genes known to be related to drug abuse, what other genes share similar expression patterns in drug abuse gene sets? Answering this question makes it possible to identify under-studied or overlooked genes that may play a role in complex phenotypes.

We have developed a new GeneWeaver tool to address this question, which we call Anchored Biclique of Biomolecular Associations (ABBA). This tool takes advantage of the large volume of collected data and cross-species integration to find new genes for investigation.

The search begins with a user-provided list of genes of interest, such as highly-studied genes with known pathways and relationships. The database then finds any gene sets that contain at least N of the genes in the provided list. From the resulting list of gene sets, ABBA then isolates any genes that occur in at least M GeneSets but not in the initial list. These resulting genes share similar gene set overlap with the original input set, but may not have been previously considered in relation to the gene set of interest.
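
The two-threshold search described above can be sketched with plain Python sets. This is an illustrative sketch, not GeneWeaver's implementation: the function name `abba`, the dictionary input format, and the example gene and GeneSet names are all hypothetical, and cross-species gene mapping is ignored.

```python
from collections import Counter


def abba(genes_of_interest, genesets, n, m):
    """Return (anchored gene sets, candidate genes) for thresholds n and m.

    genesets maps a GeneSet name to the set of genes it contains
    (hypothetical input format chosen for this sketch).
    """
    seeds = set(genes_of_interest)
    # Step 1: keep gene sets that contain at least n of the input genes.
    anchored = {
        name: set(members)
        for name, members in genesets.items()
        if len(seeds & set(members)) >= n
    }
    # Step 2: for each gene not in the input list, count the anchored
    # gene sets in which it occurs.
    counts = Counter(
        gene for members in anchored.values() for gene in members - seeds
    )
    # Step 3: keep genes that occur in at least m anchored gene sets.
    candidates = {gene: c for gene, c in counts.items() if c >= m}
    return anchored, candidates


# Hypothetical toy data: A, B, C are the genes of interest.
genesets = {
    "GS1": {"A", "B", "X", "Y"},
    "GS2": {"A", "C", "X"},
    "GS3": {"B", "C", "X", "Z"},
    "GS4": {"A", "Q"},
}
anchored, candidates = abba({"A", "B", "C"}, genesets, n=2, m=2)
```

In this toy example, GS4 is dropped because it contains only one input gene, and only gene X occurs in at least two of the remaining gene sets.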

ABBA 1.png
ABBA applied to a set of 4 genes of interest. Lighter nodes indicate less overlap. Using N=2 produces a collection of 37 GeneSets as of 7 July 2010. For brevity, only the top 5 results are shown above. With M=15, the following table lists genes in the result having similar relationships to the input set.

ABBA 2.png

Without reasonable thresholds, the results quickly become overwhelming. As of this writing, a simple set of 4 genes of interest results in 555 GeneSets and over 38,000 genes in the candidate list. Increasing the input set to 7 genes of interest results in 983 GeneSets and almost 40,000 genes. Simply requiring each GeneSet to contain at least 3 of the genes of interest significantly reduces the search space, to 11 and 37 GeneSets, respectively.

ABBA 3.png

Boolean Algebra

The Boolean Algebra utility enables users to select two or more GeneSets and combine them using set logic. There are three set logic options: Union, Intersection, and Except.
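
The three options correspond to standard set operations, illustrated here with Python's built-in `set` type. The helper names and example genes are hypothetical. "Except" is interpreted here as set difference (genes in the first GeneSet that appear in none of the others); that interpretation is an assumption, and the tool's exact semantics for more than two GeneSets may differ.

```python
def union(*genesets):
    """Genes that appear in at least one GeneSet."""
    return set().union(*genesets)


def intersection(first, *rest):
    """Genes that appear in every GeneSet."""
    return set(first).intersection(*rest)


def except_(first, *rest):
    """Genes in the first GeneSet and in none of the others.

    Assumption: "Except" means set difference; the tool may define it
    differently.
    """
    return set(first).difference(*rest)


# Hypothetical example GeneSets.
a = {"A", "B", "C"}
b = {"B", "C", "D"}
c = {"C", "E"}

all_genes = union(a, b, c)
shared = intersection(a, b, c)
only_in_a = except_(a, b, c)
```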

Emphasis Genes

The Emphasis Genes utility enables users to select individual genes, or an entire set of genes, to be highlighted in analysis results.

There are two ways to set emphasis genes.

  1. Choose genes one at a time using the emphasis genes utility in the lower left tool bar of the Analyze Gene Sets page.
  2. Open a GeneSet detail page and click the "Add all genes to emphasis gene set" link. You will need to create a GeneSet if your list is not already in the database.

To modify your emphasis genes, you can remove genes one at a time using the "remove" link next to each gene. To clear the entire list, click the "clear" link at the bottom of the page.

To use emphasis genes in your analyses, on the Analyze Gene Sets page, mark the check box next to "use" below the emphasis genes utility in the tool bar.
