From GeneWeaver Wiki
Revision as of 16:08, 9 November 2015 by Jason.bubier (Talk | contribs)


Introduction to web-based analysis with GeneWeaver

This is a guided tour of GeneWeaver.org designed to highlight the use of major tools. Steps and explanations are given to generate the results shown. Additional tools, features, usage hints and approaches are marked “Tip.” You can download a PDF version of the document (File:Tutorial.pdf) and a set of questions to answer with the tutorial (File:Questions for a Guided GW Tour.pdf).

A beta version of the 2015 tutorial is available: File:Tutorial2015.pdf

Getting started

GeneWeaver is a web-based gene-centered database with integrated tools. It can combine diverse data sets from multiple species and experiment types, and allows simple and powerful data sharing publicly or across collaborative groups.

Point your browser to www.geneweaver.org.

Tutorial 1.png


Data in GeneWeaver is organized into sets of genes, or GeneSets. These sets contain some simple metadata, such as a name, description, and publication info, along with the set of genes. All of this information can be searched to find a set of interest. Searching GeneWeaver for data sets is possible through the Quick Search (found on the front page) or via the “Search” page found in the menu.

Tutorial 2a.png
Full Search Page

Boolean and Advanced Search Options

Tip: You can restrict your search to specific species, collaboration groups, or curation level (i.e. user-submitted or expert curated data). You can also narrow the search scope by searching only the GeneSet metadata (name, description), genes or microarray probesets, or directly by PubMed ID.

Tip: Creating powerful queries with Boolean search is also possible using “AND”, “OR”, “NOT” and parentheses to craft a precise query.

Search for “Nicotine hippocampus” now to find all the public GeneSets relating to nicotine studies in the hippocampus that have already been uploaded to GeneWeaver.

Tutorial 3.png
Advanced Search

View GeneSet Details

The results returned include a few gene expression experiments in mouse and rat which have been uploaded by our curators for analysis.

Tip: From the search results page you can click on the plus sign to show a little more detail about the set, such as species and publication info. You can also click on the GeneSet name to see the full details and the list of genes included in the set on a separate page.

Click on the first result to see an example GeneSet detail page.

Tutorial 4.png

Link to Publications and Gene Information

From the GeneSet details page, you can read more about the set, including a link to the provided publication in PubMed, if available. On the second half of the page, you will find a list of the genes in the GeneSet, along with a set of linkouts to other sites, and the score associated with each gene (the type of score value depends on the source).

Tip: Using the “Display using” drop-down box, you can change the type of identifier used in the list. You can also export the displayed genes to a tab-separated file for use in other software.
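If you want to work with an exported file in a script, the following is a minimal sketch of reading it in Python. It assumes the export has the gene identifier in the first column and the score in the second, which may not match the actual export layout. The function name `parse_gene_export` and the sample scores are hypothetical, made up for illustration.

```python
import csv
import io

def parse_gene_export(text):
    """Parse tab-separated 'identifier<TAB>score' rows into a dict.

    Assumes column 1 is the gene identifier and column 2 is a numeric
    score; this layout is an assumption, not the documented format.
    """
    genes = {}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank or incomplete rows
        gene, score = row[0].strip(), row[1].strip()
        try:
            genes[gene] = float(score)
        except ValueError:
            continue  # skip header rows and non-numeric scores
    return genes

# Made-up sample content; real exports will contain your own genes and scores.
sample = "Gene\tScore\nMobp\t1.8\nKcnk1\t-2.1\n"
print(parse_gene_export(sample))
```

To read a saved file instead, pass the result of `open(path, encoding="utf-8").read()` to the function.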

Tutorial 5a.png
Tutorial 5b.png

Click your browser’s “Back” button to go back to the search results now.

Create a Project and Add GeneSets

On the search results page, you can use the checkboxes and drop-down menu to add GeneSets to a new project. Projects allow you to run tools and analyze sets of GeneSets.

Click “Select All” to highlight all the Nicotine GeneSets and then use the drop-down box to select “Create new Project…” to add the GeneSets to a new project. When prompted, supply an informative name for the new project and click OK.

Tutorial 6a.png

Analyze GeneSets Page

When a new Project is created, you will be taken to the “Analyze GeneSets” page, where you can see the Project, its GeneSets, and the analysis tools available. Projects can be wholly or piecewise selected for analysis. Each analysis tool, listed on the left, is shown with a graphical depiction of its results to help you choose.

Tip: GeneSets can be easily removed from a project by clicking the “remove” icon on the right side. If you no longer wish to keep a project, you can completely clear it out by clicking the “delete” icon, which will delete the project but not the GeneSets it contains.

Tip: Most of the analysis tools have options to tweak their output. These options are available by clicking the plus sign next to the tool.

Tutorial 7.png

Run an Analysis Tool on Selected GeneSets

To run an analysis tool, simply select Projects and/or GeneSets and then click the button on the left representing the tool you wish to run.

Tip: Projects are very useful for organizing similar studies. For example, by keeping experimental nicotine data in one project, and morphine data in another project, you can simply select both projects at once to run a comparative analysis, while keeping the collections distinct.

Select your nicotine project and then run the “HiSim Graph” tool. You will be shown a status page while the tool is running (or waiting to run) which keeps you informed of the progress of the analysis.

Tutorial 8a.png
Tutorial 8b.png

Tip: If you close the window, all is not lost! Simply go to Analyze->Results to find your analysis history. Tools will not stop running if you close the page, so you can always come back to them.

Alternate way to run an Analysis Tool on Selected GeneSets

Alternatively, you could have clicked “Select All”, then “Analyze”, and then the tool you wish to use (HiSim Graph in this case) without creating a project.


HiSim Graph

The HiSim Graph tool organizes multi-set intersections into a hierarchical directed acyclic graph (DAG). This organization infers an ontological relationship directly from the empirical data present in the original input sets. Genes in nodes at the top of the graph are found in multiple gene sets.
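To get a feel for the idea, here is a rough sketch in Python. It is not GeneWeaver's implementation, only a simplified illustration: each gene is grouped by the exact combination of sets that contain it, each distinct combination becomes a node, and a node points to every node whose combination is a proper subset of its own. The function names and the toy sets `A`, `B`, `C` with genes `g1` to `g4` are hypothetical.

```python
from collections import defaultdict

def intersection_nodes(genesets):
    """Map each observed combination of sets to the genes found in exactly that combination."""
    membership = defaultdict(set)
    for name, genes in genesets.items():
        for gene in genes:
            membership[gene].add(name)
    nodes = defaultdict(set)
    for gene, names in membership.items():
        nodes[frozenset(names)].add(gene)
    return dict(nodes)

def dag_edges(nodes):
    """Link each node to every node whose set combination is a proper subset of its own.

    This yields all ancestor-to-descendant pairs, not a reduced graph.
    """
    return [(a, b) for a in nodes for b in nodes if b < a]

# Toy input: names and memberships are invented for illustration.
genesets = {"A": {"g1", "g2", "g3"}, "B": {"g2", "g3"}, "C": {"g3", "g4"}}
nodes = intersection_nodes(genesets)
# Print nodes from the largest set combination (top of the graph) downward.
for combo in sorted(nodes, key=len, reverse=True):
    print(sorted(combo), sorted(nodes[combo]))
```

In this toy input, the node for the combination of all three sets holds the gene shared by every set, which corresponds to the topmost node in the graph.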

Click on the topmost node in the map to examine the genes it contains more closely.


GeneSet Intersection List

The GeneSet Intersection List shows all the sets in consideration, and a matrix of the genes associated with them. When a single species-specific gene is shown, a green circle is visible, but when multi-species data are compared, homology clusters are indicated with brown circles. In the example below, you could read it as “genes with homology to Kcnk1 are found in all 4 mouse and rat GeneSets.”
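The matrix layout can be sketched in a few lines of Python. This is only an illustration under the assumption that gene identifiers have already been mapped to a common identifier (or homology cluster) across species; the function name `membership_matrix` and the set names are hypothetical.

```python
def membership_matrix(genesets):
    """Return (set names, rows) where each row is (gene, [True/False per set]).

    Rows are ordered so that genes shared by the most sets come first.
    """
    names = sorted(genesets)
    all_genes = sorted(set().union(*genesets.values()))
    rows = []
    for gene in all_genes:
        flags = [gene in genesets[name] for name in names]
        rows.append((gene, flags))
    rows.sort(key=lambda r: (-sum(r[1]), r[0]))
    return names, rows

# Invented memberships for illustration only.
genesets = {"mouse_1": {"Kcnk1", "Mobp"}, "rat_1": {"Kcnk1"}}
names, rows = membership_matrix(genesets)
print("gene", *names, sep="\t")
for gene, flags in rows:
    print(gene, *("x" if f else "" for f in flags), sep="\t")
```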

Tip: Linkouts for more information are also available (as with the GeneSet details page).

Tip: This page can also be exported to a CSV-file (readable by Excel) for other uses by clicking the link at the bottom of the page.

Tip: If you have the GAGGLE Firegoose extension installed in Firefox, the various overlap levels of genes on this page can be exported as a list to other tools such as DAVID (for GO enrichment), STRING (for pairwise associations), R, MeV, or other supported tools.

Tutorial 10.png

Create or Join a Group

Let’s say we want to share our Nicotine project with a group of collaborators. First, we need to create or join a group. You can do this from your profile page, which is accessed by clicking “Groups” in the top right corner.

Tutorial 11.png

Type a group name in the Group box. To create a new group, click Create, or to join an existing group, click Join. For now, join the group “tutorial”.

Tip: To see other members of your groups, click on “[list]” to view their email addresses.

Tip: You can also update your email address and/or password from this page.

Sharing a Project

Let’s go back to the “Analyze GeneSets” page.

Shared Projects provide a useful way to share analyses with collaboration groups. Any project you have can be shared by clicking the plus sign to expand it, then using the “Share with” drop-down menu to select a group to share it with.

Tutorial 12a.png

Share your project with the “tutorial” group now.

Tip: Shared Projects can only be shared with one group at a time. To share the same project with multiple groups, you can simply make a copy of the project and share it separately.

Tutorial 12b.png
To view your Shared Projects, go to Analyze GeneSets -> Shared Projects in the menu.

Modify a Project

Shared Projects are read-only for everyone but the owner, so you will notice that there are fewer manipulations available on this page. If you want to modify a Project, you have to copy it to your own Projects. You can do this by selecting GeneSets and using the drop-down menu provided.

You should see other users’ nicotine projects, along with a small example project called “Tutorial Example.” Select this project and copy it to a new Project of your own.

Tip: You can run tools directly from this page, as long as you don’t need your own projects.

Tutorial 13.png

View a GeneSet Graph

Tutorial 14a.png

Let’s browse the genes in this new project. Select it and run the “View GeneSets” tool. This tool simply draws a node for every gene, and one for every GeneSet, and connects them with a line if they are associated. To aid comprehension, genes are arranged by connectedness (degree) from left to right (lower-higher).

You’ll notice that the resulting image is very tall and hard to read (and this was only 4 GeneSets!). This is an example of when it is good to change tool options. Options can be changed from the “Analyze GeneSets” page, or from the results page by clicking “Show Tool Options.” (shown below)

The image at left has a MinDegree of 2. Let’s increase that to 3 and then re-run the tool.
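The effect of MinDegree can be sketched as follows, assuming that a gene's degree is simply the number of GeneSets it appears in and that MinDegree hides genes below that count. The function names and the placeholder genes (`geneA`, `geneB`, `geneC`) are hypothetical; only Mobp appearing in all 4 sets mirrors the tutorial result.

```python
from collections import Counter

def gene_degrees(genesets):
    """Count how many GeneSets each gene appears in."""
    return Counter(gene for genes in genesets.values() for gene in set(genes))

def filter_by_min_degree(genesets, min_degree):
    """Return (gene, degree) pairs with degree >= min_degree, lowest degree first."""
    degrees = gene_degrees(genesets)
    kept = [g for g, d in degrees.items() if d >= min_degree]
    kept.sort(key=lambda g: (degrees[g], g))
    return [(g, degrees[g]) for g in kept]

# Placeholder memberships, invented for illustration.
genesets = {
    "GS1": {"Mobp", "geneA", "geneB"},
    "GS2": {"Mobp", "geneA", "geneC"},
    "GS3": {"Mobp", "geneA", "geneB"},
    "GS4": {"Mobp", "geneC"},
}
print(filter_by_min_degree(genesets, 2))
print(filter_by_min_degree(genesets, 3))
```

Raising the threshold from 2 to 3 drops the genes found in only two sets, which is why the re-run graph is smaller and easier to read.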

Tutorial 14b.png

Modify GeneSet Graph Using Tool Options

Tutorial 15.png

That’s better. From this result, we can see that Mobp is connected to all 4 GeneSets and might warrant further study. Some of the 3-way genes might be interesting as well.

Tip: You can click on any gene in this image to search GeneWeaver for other GeneSets that contain that gene.

Let’s try out one more tool before we move on. Using the same Project, run the “Jaccard Similarity” tool.

Jaccard Similarity Tool

The Jaccard Similarity results give a large-scale, pairwise view of GeneSet overlaps. The Jaccard Similarity coefficient is a positive match score for the similarity of set-set composition. Using this tool, you can quickly see when sets are highly overlapping or completely disjoint, and refine projects with more informative GeneSets.
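For reference, the standard Jaccard coefficient of two sets A and B is the size of their intersection divided by the size of their union, so it ranges from 0 (disjoint) to 1 (identical). A minimal Python sketch of the pairwise calculation follows. The function names are hypothetical, and treating two empty sets as having similarity 0 is a choice made here, not something the tutorial specifies.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard coefficient: |A intersect B| / |A union B| (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def pairwise_jaccard(genesets):
    """Compute the coefficient for every unordered pair of named sets."""
    return {
        (x, y): jaccard(genesets[x], genesets[y])
        for x, y in combinations(sorted(genesets), 2)
    }

# Two of the four distinct genes are shared, so the coefficient is 2 / 4.
print(jaccard({"a", "b", "c"}, {"b", "c", "d"}))  # 0.5
```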

Tip: Click on any intersection to bring up the GeneSet Intersection page discussed earlier in this tutorial.

Tutorial 16.png

Upload Your Own GeneSets

Working with other people’s data is fun and all, but what’s in it for me?

Uploading your own data to GeneWeaver is fast and easy! Let’s get started by going to “Manage GeneSets” -> “Upload GeneSet”. This brings us to the upload page, where we need to provide a little bit of information:

Tutorial 17.png

Name – Shown in GeneSet Lists and Projects. Short but descriptive is best.

Label – Used to label nodes in results. Less than 24 characters recommended.

Description – Used to describe the experiment and selection criteria for the set. Should probably be similar to the Table caption for published results.

Access Restrictions – Describes who can access this set: everyone (Public), just you (Private), or groups of collaborators (Groups). GeneSets can be shared with multiple groups.

Reference Info – Allow others to find and cite you when they use your data. Publication info can be automatically fetched from PubMed, or manually entered.

Uploading GeneSets: Case Study

We’ll go through these steps using an existing publication, but feel free to use your own.

Kuntz-Melcavage et al. Gene expression changes following extinction testing in a heroin behavioral incubation model. BMC Neurosci. 2009 Aug 7;10:95.

(PMID 19664213)

Data from Additional file 1: Changed genes on array. The data provided are the complete list of 66 genes that were identified to have changed expression at the p < 0.02 level of significance.

Additional file 1

Tutorial 18.png

Format Data for Upload

Tutorial 19a.png

We need to do a little data cleanup before uploading. Your uploads should be simple, 2-column input with gene or probe identifiers on the left and scores on the right. So, for this file we need to delete a bunch of columns. We also should remove the blanks for good measure.

Once it is cleaned up, we can save it as a Tab Delimited Text file, which can be directly uploaded to GeneWeaver.
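If you prefer to script the cleanup rather than do it in a spreadsheet, the steps above can be sketched in Python. The column positions, the function name `to_two_column`, and the sample rows are hypothetical; adjust the indices to wherever the identifier and score sit in your own file.

```python
import csv
import io

def to_two_column(rows, id_col, score_col):
    """Keep only the identifier and score columns, dropping rows with blanks.

    Returns tab-delimited text with one 'identifier<TAB>score' pair per line.
    """
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in rows:
        if max(id_col, score_col) >= len(row):
            continue  # row is too short to hold both columns
        gene, score = row[id_col].strip(), row[score_col].strip()
        if gene and score:
            writer.writerow([gene, score])
    return out.getvalue()

# Invented rows standing in for a multi-column supplementary table.
rows = [
    ["probe1", "GeneA", "some description", "1.5"],
    ["probe2", "", "no gene symbol", "2.0"],
    ["probe3", "GeneB", "missing score", ""],
]
print(to_two_column(rows, id_col=1, score_col=3))
```

The returned text can be written to a `.txt` file for upload or pasted into the gene list box.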

For this tutorial though, we can save a step. Highlight all the data and copy it (Edit --> Copy). Then go back to your browser and the GeneWeaver upload page.

Tip: If you’re lazy but still want to follow along, the result of this data wrangling can be downloaded here: http://geneweaver.org/docs/tutorial_example_data.txt

Tutorial 19b.png

Provide GeneSet Metadata and Gene List

Paste the data into the gene list text box at the bottom. Don’t forget to pick the species (so the genes match correctly). Then, make sure all the other fields on the page are filled in. Providing a PubMed ID enables GeneWeaver to store the abstract and other info for searches. When you’re finished, click “Upload GeneSet”.

Tutorial 20a.png
Tutorial 20b.png

View Your New GeneSet Details Page

When upload completes, you will be taken to your new GeneSet details page. From here, you can add it to Projects and do further analysis.

Tutorial 21a.png
Tutorial 21b.png

Sample HiSim Graph

This is an example of the HiSim Graph results for the Tutorial Example project and the newly uploaded set. Notice that both Mouse and Rat data are integrated and overlapping in the results due to GeneWeaver’s usage of homology.

Tutorial 22.png
HiSim Graph

HiSim Graph Display Options

Many features of the HiSim Graph can be modified by clicking on ‘Show tool options’ or ‘Display Options’.


Additional Analysis Tools and Project Utilities Available

Other tools that were not covered in this tutorial are also available:

- Hypergeometric Tests look very similar to the Jaccard Similarity tool, but show the results of Fisher’s exact test on all pairwise GeneSet intersections.

- Clustering uses hierarchical clustering to organize GeneSets by their similarity to each other. (Beta)

- Project Utilities allow you to manipulate your projects at a high level, creating new projects from the intersection of multiple projects, collapsing GeneSets in a project into a single new GeneSet to use later, filtering out GeneSets by similarity, and more.

- Our new ABBA tool (Anchored Biclique of Biomolecular Associations), found in the menu under “Search” -> ”for Genes”, allows you to query the entire database for genes with similar relationships to genes from a list of interest.
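To illustrate what the Hypergeometric Tests tool in the list above reports, here is a sketch of a one-sided Fisher’s exact test on the overlap of two sets, computed as the upper tail of the hypergeometric distribution. The function name and the `universe` parameter (the number of genes in the background) are assumptions; the tutorial does not say how GeneWeaver chooses the background or whether it uses a one-sided test. It requires Python 3.8 or later for `math.comb`.

```python
from math import comb

def overlap_p_value(universe, size_a, size_b, overlap):
    """Probability of seeing at least `overlap` shared genes by chance.

    Models drawing `size_b` genes from `universe` genes, of which
    `size_a` belong to the first set (hypergeometric upper tail).
    """
    total = comb(universe, size_b)
    upper = min(size_a, size_b)
    tail = sum(
        comb(size_a, i) * comb(universe - size_a, size_b - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total

# Toy numbers: two 5-gene sets drawn from 10 genes, sharing all 5.
print(overlap_p_value(universe=10, size_a=5, size_b=5, overlap=5))
```

A small value suggests the overlap is larger than expected by chance for the assumed background.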

Tutorial 23a.png
Hypergeometric Tests
Tutorial 23b.png

Provide Feedback

This tutorial was intended to provide an introduction to practical approaches to the tools in GeneWeaver.

There are many tools and approaches that can be combined to create particular workflows to address a variety of genetics questions. Additional tutorials and documentation can be found on the website.

If you have any suggestions, comments, or questions please use the “Feedback” link located on every page.

Tutorial 24.png