Chapter 2: Performing Integrative Functional Genomics Analysis in GeneWeaver.org by Jeremy J. Jay and Elissa J. Chesler in Gene Function Analysis, Methods in Molecular Biology vol 1101
You can download a PDF version of the document here (File:13.pdf).
What would you like to do?
How to get your data set from your computer into GeneWeaver.
If your data has not yet been uploaded to GeneWeaver, please use the Upload / Share my data link first.
Prepare your data for upload
GeneWeaver takes a plain text format; learn how to prepare your files here.
- 1. Columnar format: Gene, Value
- First, make sure that your file has a header and only 2 columns of data: the gene identifier, followed by the value or score for that gene. For example:
Ensembl ID    Correlation
Gene1         0.25
Gene2         0.90
- If you don't have any scores, the system will automatically put a 1 as the score. If you have more than 2 columns, you will have to delete or combine the extras before uploading. For more information on supported Gene Identifiers or score types, see here.
- 2. Tab-separated plain text
- Save your file as tab-separated plain text. This option should be available in most software. For example, in Excel 2007, use “Save As...” and in the dialog that pops up, next to “Save as type...” pick “Text (Tab delimited)”
- 3. That's it
- Save the file somewhere you can get to it later. Continue to the next step.
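The preparation steps above can be sketched in Python. This is only an illustration, assuming the two-column, tab-separated layout with a header row described above; the function name `write_geneset_tsv` is hypothetical and not part of GeneWeaver. A missing score is written as 1 here to mirror the default the system is described as applying.

```python
import csv
import io


def write_geneset_tsv(rows, out, header=("Gene", "Value")):
    """Write (gene_id, score) pairs as tab-separated text with a header row.

    A score of None is written as 1, mirroring the default described above
    for data that has no scores. (Hypothetical helper, not a GeneWeaver API.)
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    for gene_id, score in rows:
        writer.writerow([gene_id, 1 if score is None else score])


# Build the example from the text, plus one gene without a score.
buffer = io.StringIO()
write_geneset_tsv(
    [("Gene1", 0.25), ("Gene2", 0.90), ("Gene3", None)],
    buffer,
    header=("Ensembl ID", "Correlation"),
)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with `open(path, "w", newline="")` in place of the `StringIO` buffer.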
How to get your prepared data uploaded to GeneWeaver.
- 1. Go to the GeneSet upload page
- 2. GeneSet Metadata
- Information about your data.
- 1. GeneSet Name
- This is the first thing people will see when they look at your data. It should be short but descriptive.
- 2. GeneSet Figure Label
- This is an even shorter identifier which will be used in images, when the full name would be too big. Abbreviations and acronyms are highly encouraged.
- 3. GeneSet Description
- Describe the data in a few sentences.
- 4. Availability
- Who is allowed to access this data? “Public” gives anyone access to your results, whereas “Private” gives no one access but yourself. “Group” lets you share your data with people who are in the same groups as you.
- 5. Groups
- When “Group” is selected under “Availability,” you can select any or all of the groups you are a member of. Hold <ctrl> when clicking to select more than one.
Learn more about using groups.
- 3. Input File
- Information about your data file's contents.
- 1. Species
- Select the source species for your data so we know which version of genes to use.
- 2. Gene ID Type
- What type of identifier is your data using? If it is probes from a microarray, you can simply select the array from the list. If not, you'll have to select the identifiers used. If you aren't sure, the following examples may help:
Entrez                12345
Ensembl Gene          ENSMUSG00001234
Ensembl Protein       ENSMUSP00001234
Ensembl Transcript    ENSMUST00001234
Unigene               Mm.12345
MGI                   MGI:12345
RGD                   RGD12345
HGNC                  12345
ZFIN                  ZDB-GENE-12345-678
FlyBase               FBgn0001234
- 3. Gene Score Type
- This tells GeneWeaver what type of scores you are providing in the right-hand column. This helps determine the direction to threshold.
Binary: Value of 1 to include or 0 to exclude the gene. Multiple gene matches are averaged.
Correlation: Values between -1.0 and 1.0, where larger absolute values are more significant.
p-value or q-value: Values between 0.0 and 1.0, with smaller values being more significant.
Effect Size: Any numeric range.
- 4. Threshold
- Specifies the threshold for inclusion in the set: a gene whose score falls within the specified range is included; otherwise it is ignored in analyses.
- 5. Input File
- Here is where you upload the file prepared earlier. This must be a tab-separated, plain text file. Alternatively, use the “copy/paste genes” button to paste the text directly instead of uploading a file; the pasted text must use the same format as the file.
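If you are unsure which Gene ID Type your file uses, a rough check like the following may help. It is a sketch only: the patterns are inferred solely from the example identifiers listed above, real identifiers may differ in length or prefix, and `guess_id_type` is a hypothetical helper rather than anything GeneWeaver provides. Note that the Entrez and HGNC examples are both bare numbers, so they cannot be told apart by pattern alone.

```python
import re

# Patterns inferred only from the example identifiers in this section
# (an assumption); they are not an official specification of any ID scheme.
ID_PATTERNS = [
    ("Ensembl Gene", re.compile(r"ENS[A-Z]*G\d+")),
    ("Ensembl Protein", re.compile(r"ENS[A-Z]*P\d+")),
    ("Ensembl Transcript", re.compile(r"ENS[A-Z]*T\d+")),
    ("Unigene", re.compile(r"[A-Z][a-z]+\.\d+")),
    ("MGI", re.compile(r"MGI:\d+")),
    ("RGD", re.compile(r"RGD\d+")),
    ("ZFIN", re.compile(r"ZDB-GENE-\d+-\d+")),
    ("FlyBase", re.compile(r"FBgn\d+")),
    # Bare numbers match both the Entrez and the HGNC examples.
    ("Entrez or HGNC", re.compile(r"\d+")),
]


def guess_id_type(identifier):
    """Return the name of the first pattern matching the whole identifier, or None."""
    text = identifier.strip()
    for name, pattern in ID_PATTERNS:
        if pattern.fullmatch(text):
            return name
    return None


for example in ["ENSMUSG00001234", "Mm.12345", "MGI:12345", "12345", "Gene1"]:
    print(example, "->", guess_id_type(example))
```

An identifier that matches no pattern returns `None`, which is a hint to check the source of the data rather than to guess.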
- 4. Reference Info / Ontologies
- These two are optional, but help others find and use your data more easily.
- 1. Reference Info
- If the data has been published, simply provide the PubMed ID and click [Retrieve Info from PM] to automatically fill in the rest of the information. Otherwise, you can fill in the information yourself so others know what to reference when using the data.
- 2. Ontologies
- Use the select box at the top to pick different ontology sources, and the tree browser to select terms which describe the phenotype of the data. You can select as many terms as you like, from as many sources as you like. All the selected terms will show up in the space to the left.
- 5. Upload
- Once everything has been filled in, hit “Submit.” GeneWeaver will check your data and let you know of any problems it finds.
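How the Gene Score Type determines the direction of the Threshold can be illustrated with a short sketch. The rules below are assumptions drawn from the score type descriptions above (smaller p-values and larger absolute correlations are more significant); the function names, the type labels, and the handling of boundary values are hypothetical, and GeneWeaver's own behavior may differ.

```python
def passes_threshold(score, score_type, threshold=None):
    """Decide whether one score is included, under the assumptions stated above."""
    if score_type == "binary":
        # 1 includes the gene, 0 excludes it; no threshold is needed.
        return score == 1
    if score_type in ("p-value", "q-value"):
        # Smaller values are more significant: keep scores at or below the cutoff.
        return 0.0 <= score <= threshold
    if score_type == "correlation":
        # Larger absolute values are more significant.
        return abs(score) >= threshold
    if score_type == "effect":
        # Any numeric range: threshold is a (low, high) pair.
        low, high = threshold
        return low <= score <= high
    raise ValueError(f"unknown score type: {score_type!r}")


def apply_threshold(scores, score_type, threshold=None):
    """Return only the genes whose scores pass the threshold."""
    return {
        gene: score
        for gene, score in scores.items()
        if passes_threshold(score, score_type, threshold)
    }


correlations = {"Gene1": 0.25, "Gene2": 0.90}
print(apply_threshold(correlations, "correlation", 0.5))
```

With the example file from earlier and a correlation threshold of 0.5, only Gene2 would be kept under these assumed rules.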
Find existing data with particular gene(s) or keyword(s)
- 1. Click “Search” in the menu bar.
- 2. Species-Specific
- If you only want data from a particular species, select it in the box labeled “Species.”
- 3. Gene-Specific
- If you only want to search the empirical gene data, select “Gene IDs” in the box labeled “Search.” Sometimes the descriptions or abstract can reference a gene symbol as well, so you may want to leave this set to “All Fields” for the best results.
- 4. Search
- Enter your gene or keyword into the text box and click “Search.”
- 5. Result
- The result list will include a note after the GeneSet Name to let you know where the search matched. You can click on the GeneSet Name to read more about the set and its genes, or use the checkboxes to add sets to your projects.
Compare data sets
How to use the GeneWeaver Tools
A quick introduction to the analysis pipeline of GeneWeaver.
- 1. Analyze GeneSets
- Click “Analyze GeneSets” in the menu bar to view the Tools available in GeneWeaver, along with your own Project.
- 2. Projects
- 3. Add GeneSets to a Project
- Check the checkboxes next to your GeneSets of interest, and then find the drop box at the bottom of the list labelled “Add to Project...” and select “Create new Project.”
- 4. Name your Project
- A box will pop up asking for the name of your newly created project. Name it something relevant to the subject of the GeneSets, such as "alcohol." You will be brought back to the “Analyze GeneSets” page after your Project is created.
- 5. Adding GeneSets to existing projects
- Once a relevant Project has been created, you can quickly add to it by selecting it in the “Add to Project...” drop box, which lists all of your existing Projects.
- 6. Analyzing a Project with a Tool
- Check any combination of Projects or GeneSets from the “Analyze GeneSets” page, and then click any of the Tool buttons at the top of the page to send those GeneSets to the Tool. This will bring you to a status page while the data is analyzed and, when the analysis is complete, will direct you to the results for the Tool you selected. Learn more about our available Tools.
HiSim Graph Diagram
The HiSim Graph Diagram uses the gene intersections of multiple sets to create a hierarchy of gene sets.
The GeneSet Graph illustrates relationships between genes and GeneSets.
The Jaccard Similarity shows how much overlap there is between pairs of sets.
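As a minimal illustration of the measure itself (not of GeneWeaver's implementation), the Jaccard similarity of two gene sets is the size of their intersection divided by the size of their union. The function name and the choice to return 0.0 for two empty sets are assumptions made for this sketch.

```python
def jaccard_similarity(set_a, set_b):
    """Return |A & B| / |A | B|, or 0.0 when both sets are empty."""
    a, b = set(set_a), set(set_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


first = {"Gene1", "Gene2", "Gene3"}
second = {"Gene2", "Gene3", "Gene4"}
# Two shared genes out of four distinct genes overall.
print(jaccard_similarity(first, second))  # 0.5
```

A value of 1.0 means the two sets are identical, and 0.0 means they share no genes.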
The Hypergeometric Tests give the probability of an overlap between pairs of sets occurring randomly.
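One common way to compute such a probability is the upper tail of the hypergeometric distribution: the chance of seeing at least the observed overlap when one set is drawn at random from a background of genes. The sketch below uses only the Python standard library (`math.comb`, Python 3.8+). The function name, the parameter names, and the choice of background population size are assumptions for illustration; the exact formulation used by GeneWeaver is not stated here.

```python
from math import comb


def hypergeom_overlap_pvalue(population, size_a, size_b, overlap):
    """P(X >= overlap) when size_b genes are drawn at random from a
    population that contains size_a genes belonging to set A."""
    total = comb(population, size_b)
    max_overlap = min(size_a, size_b)
    tail = sum(
        comb(size_a, k) * comb(population - size_a, size_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 background genes, sets of 100 and 150 genes,
# 10 genes shared between them.
print(hypergeom_overlap_pvalue(20000, 100, 150, 10))
```

Smaller values indicate that an overlap of that size is less likely to arise by chance under this model.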
This tool uses the Jaccard distance to cluster gene sets which have genes in common.
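To make the idea of clustering on Jaccard distance concrete, here is a deliberately simple sketch. It links any two gene sets whose Jaccard distance (1 minus the Jaccard similarity) is at or below a cutoff and reports the connected groups, which amounts to single-linkage clustering cut at one level. The algorithm, the function names, and the `max_distance` parameter are all assumptions chosen for illustration; the tool's actual clustering method is not described here.

```python
def jaccard_distance(a, b):
    """Return 1 - |A & B| / |A | B|; two empty sets are treated as distance 1.0."""
    union = a | b
    if not union:
        return 1.0
    return 1.0 - len(a & b) / len(union)


def cluster_genesets(genesets, max_distance=0.5):
    """Group gene set names whose pairwise distance chains stay within max_distance."""
    names = list(genesets)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, first in enumerate(names):
        for second in names[i + 1:]:
            distance = jaccard_distance(set(genesets[first]), set(genesets[second]))
            if distance <= max_distance:
                parent[find(first)] = find(second)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


example = {
    "SetA": {"Gene1", "Gene2", "Gene3"},
    "SetB": {"Gene2", "Gene3", "Gene4"},
    "SetC": {"Gene9"},
}
print(cluster_genesets(example, max_distance=0.5))
```

In this example SetA and SetB share two of four genes (distance 0.5) and fall into one group, while SetC shares nothing and stays on its own.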