Introduction to web-based analysis with GeneWeaver
This is a guided tour of GeneWeaver.org designed to highlight the use of major tools. Steps and explanations are given to generate the results shown. Additional tools, features, usage hints and approaches are marked “Tip.” A PDF version of this document is available (File:Tutorial.pdf), along with a set of questions to answer as you work through the tutorial (File:Questions for a Guided GW Tour.pdf).
A beta version of the 2015 tutorial is also available (File:Tutorial2015.pdf).
GeneWeaver is a web-based gene-centered database with integrated tools. It can combine diverse data sets from multiple species and experiment types, and allows simple and powerful data sharing publicly or across collaborative groups.
Point your browser to www.geneweaver.org.
Data in GeneWeaver is organized into sets of genes, or GeneSets. These sets contain some simple metadata, such as a name, description, and publication info, along with the set of genes. All of this information can be searched to find a set of interest. Searching GeneWeaver for data sets is possible through the Quick Search (found on the front page) or via the “Search” page found in the menu.
Boolean and Advanced Search Options
Tip: You can restrict your search to specific species, collaboration groups, or curation level (e.g. user-submitted or expert-curated data). You can also narrow the search scope by searching only the GeneSet metadata (name, description), genes or microarray probesets, or directly by PubMed ID.
Tip: Creating powerful queries with Boolean search is also possible using “AND”, “OR”, “NOT” and parentheses to craft a precise query.
Search for “Nicotine hippocampus” now to find all the public GeneSets relating to nicotine studies in the hippocampus that have already been uploaded to GeneWeaver.
View GeneSet Details
The results returned include a few gene expression experiments in mouse and rat that have been uploaded by our curators for analysis.
Tip: From the search results page you can click on the plus sign to show a little more detail about the set, such as species and publication info. You can also click on the GeneSet name to see the full details and the list of genes included in the set on a separate page.
Click on the first result to see an example GeneSet detail page.
Link to Publications and Gene Information
From the GeneSet details page, you can read more about the set, including a link to the provided publication in PubMed, if available. On the second half of the page, you will find a list of the genes in the GeneSet, along with a set of linkouts to other sites and the score associated with each gene (the type of score value depends on the source).
Tip: Using the “Display using” drop-down box, you can change the type of identifier used in the list. You can also export the displayed genes to a tab-separated file for use in other software.
Click your browser’s “Back” button to go back to the search results now.
Create a Project and Add GeneSets
On the search results page, you can use the checkboxes and drop-down menu to add GeneSets to a new project. Projects allow you to run tools and analyze sets of GeneSets.
Click “Select All” to highlight all the Nicotine GeneSets and then use the drop-down box to select “Create new Project…” to add the GeneSets to a new project. When prompted, supply an informative name for the new project and click OK.
Analyze GeneSets Page
When a new Project is created, you will be taken to the “Analyze GeneSets” page, where you can see the Project, its GeneSets, and the analysis tools available. Projects can be wholly or piecewise selected for analysis. Each analysis tool, listed on the left, is shown with a graphical depiction of its results to help you choose.
Tip: GeneSets can be easily removed from a project by clicking the “remove” icon on the right side. If you no longer wish to keep a project, you can completely clear it out by clicking the “delete” icon, which will delete the project but not the GeneSets it contains.
Tip: Most of the analysis tools have options to tweak their output. These options are available by clicking the plus sign next to the tool.
Run an Analysis Tool on Selected GeneSets
To run an analysis tool, simply select Projects and/or GeneSets and then click the button on the left representing the tool you wish to run.
Tip: Projects are very useful for organizing similar studies. For example, by keeping experimental nicotine data in one project, and morphine data in another project, you can simply select both projects at once to run a comparative analysis, while keeping the collections distinct.
Select your nicotine project and then run the “HiSim Graph” tool. You will be shown a status page while the tool is running (or waiting to run) which keeps you informed of the progress of the analysis.
Tip: If you close the window, all is not lost! Simply go to Analyze->Results to find your analysis history. Tools will not stop running if you close the page, so you can always come back to them.
Alternate way to run an Analysis Tool on Selected GeneSets
Alternatively, you could have clicked “Select All”, then “Analyze”, and then the tool you wish to use (HiSim Graph in this case), without creating a project.
The HiSim Graph tool organizes multi-set intersections into a hierarchical directed acyclic graph (DAG). This organization infers an ontological relationship directly from the empirical data present in the original input sets. Genes in nodes at the top of the graph are found in multiple gene sets.
Click on the topmost node in the map to examine the genes it contains more closely.
GeneSet Intersection List
The GeneSet Intersection List shows all the sets in consideration, and a matrix of the genes associated with them. When a single species-specific gene is shown, a green circle is visible, but when multi-species data are compared, homology clusters are indicated with brown circles. In the example below, you could read it as “genes with homology to Kcnk1 are found in all 4 mouse and rat GeneSets.”
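The matrix idea behind the intersection list can be sketched in a few lines of Python. This is only an illustration, not GeneWeaver's implementation: the function name `membership_matrix`, the toy set names, and the gene identifiers are all made up, and homology clustering across species is ignored.

```python
def membership_matrix(genesets):
    """Build a gene-by-GeneSet presence/absence matrix.

    `genesets` maps a (hypothetical) GeneSet name to a set of gene IDs.
    Returns the sorted set names and a dict mapping each gene to a list
    of booleans, one per set, in the same order as the names.
    """
    set_names = sorted(genesets)
    all_genes = sorted(set().union(*genesets.values()))
    rows = {
        gene: [gene in genesets[name] for name in set_names]
        for gene in all_genes
    }
    return set_names, rows


# Toy data, invented for illustration only.
toy_sets = {
    "mouse_study": {"gene1", "gene2"},
    "rat_study": {"gene2", "gene3"},
}

names, rows = membership_matrix(toy_sets)
print("gene", *names, sep="\t")
for gene, present in rows.items():
    # A filled circle in the web page corresponds to True here.
    print(gene, *["x" if p else "." for p in present], sep="\t")
```

A gene whose row is all `True` is one that appears in every set under consideration.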
Create or Join a Group
Let’s say we want to share our Nicotine project with a group of collaborators. First, we need to create or join a group. You can do this from your profile page, which is accessed by clicking “Groups” in the top right corner.
Type a group name in the Group box. To create a new group, click Create, or to join an existing group, click Join. For now, join the group “tutorial”.
Tip: To see other members of your groups, click on “[list]” to view their email addresses.
Tip: You can also update your email address and/or password from this page.
Sharing a Project
Let’s go back to the “Analyze GeneSets” page.
Shared Projects provide a useful way to share analyses with collaboration groups. Any project you have can be shared by clicking the plus sign to expand it, then using the “Share with” drop-down menu to select a group to share it with.
Share your project with the “tutorial” group now.
Tip: Shared Projects can only be shared with one group at a time. To share the same project with multiple groups, you can simply make a copy of the project and share it separately.
Modify a Project
Shared Projects are read-only for everyone but the owner, so you will notice that there are fewer manipulations available on this page. If you want to modify a Project, you have to copy it to your own Projects. You can do this by selecting GeneSets and using the drop-down menu provided.
You should see anyone else’s nicotine projects, along with a small example project called “Tutorial Example.” Select this project and copy it to a new Project of your own.
Tip: You can run tools directly from this page, as long as you don’t need your own projects.
View a GeneSet Graph
You’ll notice that the resulting image is very tall and hard to read (and this was only 4 GeneSets!). This is an example of when it is good to change tool options. Options can be changed from the “Analyze GeneSets” page, or from the results page by clicking “Show Tool Options.” (shown below)
The image at left has a MinDegree of 2. Let’s increase that to 3 and then re-run the tool.
Modify GeneSet Graph Using Tool Options
That’s better. From this result, we can see that Mobp is connected to all 4 GeneSets and might warrant further study. Some of the 3-way genes might be interesting as well.
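To make the effect of the option concrete, here is a minimal Python sketch. It assumes that MinDegree means the minimum number of GeneSets a gene must appear in to be drawn; that reading is inferred from the results described above, not taken from GeneWeaver's documentation. The function name `genes_by_degree` and the toy data are hypothetical.

```python
from collections import Counter


def genes_by_degree(genesets, min_degree=2):
    """Return {gene: number of GeneSets containing it}, keeping only
    genes that occur in at least `min_degree` sets.

    Assumption: this mirrors what a MinDegree filter does; the real
    tool may define degree differently.
    """
    counts = Counter(
        gene for genes in genesets.values() for gene in set(genes)
    )
    return {gene: n for gene, n in counts.items() if n >= min_degree}


# Toy data, invented for illustration only.
toy_sets = {
    "set1": {"gene1", "gene2", "gene3"},
    "set2": {"gene1", "gene2"},
    "set3": {"gene1", "gene2"},
    "set4": {"gene1", "gene3"},
}

print(genes_by_degree(toy_sets, min_degree=2))
print(genes_by_degree(toy_sets, min_degree=3))
```

Raising the threshold from 2 to 3 drops the gene shared by only two sets, which is why the re-run graph is smaller and easier to read.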
Let’s try out one more tool before we move on. Using the same Project, run the “Jaccard Similarity” tool.
Jaccard Similarity Tool
The Jaccard Similarity results give a large-scale, pairwise view of GeneSet overlaps. The Jaccard Similarity coefficient scores how similar two sets are in composition: the number of genes the two sets share divided by the number of genes found in either set, ranging from 0 (completely disjoint) to 1 (identical). Using this tool, you can quickly see when sets are highly overlapping or completely disjoint, and refine projects with more informative GeneSets.
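For readers who want to see the arithmetic, the coefficient can be computed by hand as the size of the intersection divided by the size of the union. The sketch below uses the standard definition; the function names and toy gene sets are invented for illustration and say nothing about how GeneWeaver computes or displays its results.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two collections: |A ∩ B| / |A ∪ B|.

    Two empty sets are treated as having similarity 0.0 here; that is
    a convention chosen for this sketch.
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def pairwise_jaccard(genesets):
    """Score every unordered pair of named sets."""
    return {
        (n1, n2): jaccard(genesets[n1], genesets[n2])
        for n1, n2 in combinations(sorted(genesets), 2)
    }


# Toy data, invented for illustration only.
toy_sets = {
    "set_a": {"gene1", "gene2", "gene3"},
    "set_b": {"gene2", "gene3", "gene4"},
    "set_c": {"gene5"},
}

for pair, score in pairwise_jaccard(toy_sets).items():
    print(pair, round(score, 2))
```

Here `set_a` and `set_b` share two of four distinct genes, giving 0.5, while `set_c` is disjoint from both and scores 0.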
Tip: Click on any intersection to bring up the GeneSet Intersection page discussed earlier in this tutorial.
Upload Your Own GeneSets
Working with other people’s data is fun and all, but what’s in it for me?
Uploading your own data to GeneWeaver is fast and easy! Let’s get started by going to “Manage GeneSets” -> “Upload GeneSet”. This brings us to the upload page, where we need to provide a little bit of information:
Name – Shown in GeneSet Lists and Projects. Short but descriptive is best.
Uploading GeneSets: Case Study
We’ll go through these steps using an existing publication, but feel free to use your own.
Kuntz-Melcavage et al. Gene expression changes following extinction testing in a heroin behavioral incubation model. BMC Neurosci. 2009 Aug 7;10:95.
Data from Additional file 1: Changed genes on array. The data provided are the complete list of 66 genes that were identified to have changed expression at the p < 0.02 level of significance.
Format Data for Upload
We need to do a little data cleanup before uploading. Your uploads should be simple, 2-column input with gene or probe identifiers on the left and scores on the right. So, for this file we need to delete a bunch of columns. We also should remove the blanks for good measure.
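If you prefer to script the cleanup rather than edit the file by hand, something like the following Python sketch would do it. It assumes the supplementary table has been saved as tab-delimited text with one header row; the column positions, the function name `to_two_column`, and the tab separator in the output are assumptions made for this example, so check the upload page for the exact format it expects.

```python
import csv
import io


def to_two_column(text, id_col=0, score_col=1, delimiter="\t", skip_header=True):
    """Reduce a delimited table to 'identifier<TAB>score' lines.

    Rows that are too short, or that have a blank identifier or a
    blank score, are dropped.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    lines = []
    for i, row in enumerate(reader):
        if skip_header and i == 0:
            continue
        if len(row) <= max(id_col, score_col):
            continue  # blank or truncated row
        gene = row[id_col].strip()
        score = row[score_col].strip()
        if not gene or not score:
            continue  # drop rows with a missing identifier or score
        lines.append(f"{gene}\t{score}")
    return "\n".join(lines)


# Toy table, invented for illustration only (not the published data).
raw = (
    "Symbol\tDescription\tFoldChange\n"
    "geneA\tsome description\t1.5\n"
    "\t\t\n"
    "geneB\tanother description\t\n"
    "geneC\tthird description\t-2.0\n"
)

print(to_two_column(raw, id_col=0, score_col=2))
```

The printed lines can then be pasted into the gene list box during upload.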
Provide GeneSet Metadata and Gene List
Paste the data into the gene list text box at the bottom. Don’t forget to pick the species (so the genes match correctly). Then, make sure all the other fields on the page are filled in. Providing a PubMed ID enables GeneWeaver to store the abstract and other info for searches. When you’re finished, click “Upload GeneSet”.
View Your New GeneSet Details Page
When upload completes, you will be taken to your new GeneSet details page. From here, you can add it to Projects and do further analysis.
Sample HiSim Graph
An example of the HiSim Graph results for the Tutorial Example project and the newly uploaded set. Notice that both Mouse and Rat data are integrated and overlapping in the results due to GeneWeaver’s usage of homology.
HiSim Graph Display Options
Many features of the HiSim Graph can be modified by clicking on ‘Show tool options’ or ‘Display Options’.
Additional Analysis Tools and Project Utilities Available
Other tools that were not covered in this tutorial are also available:
- Clustering uses hierarchical clustering to organize GeneSets by their similarity to each other. (Beta)
- Project Utilities allow you to manipulate your projects at a high level, creating new projects from the intersection of multiple projects, collapsing GeneSets in a project into a single new GeneSet to use later, filtering out GeneSets by similarity, and more.
- Our new ABBA tool (Anchored Biclique of Biomolecular Associations), found in the menu under “Search” -> “for Genes”, allows you to query the entire database for genes with similar relationships to genes from a list of interest.
This tutorial was intended to provide an introduction to practical approaches to the tools in GeneWeaver.
There are many tools and approaches that can be combined to create particular workflows to address a variety of genetics questions. Additional tutorials and documentation can be found on the website.
If you have any suggestions, comments, or questions, please use the “Feedback” link located on every page.