Self-Curating Private GeneSets

For users new to GeneWeaver: the system contains sets of genes created via automated, semi-automated, and manual processes. A large number of GeneSets are created by private users or groups who wish to explore how their data sets of interest relate to our larger collated sets of genes. While these sets are unofficial (Tier V in our curation guidelines), they may be submitted to our curators for examination and possible promotion to the public, standardized data tiers.

Required Metadata

GeneSet creation is relatively easy. It requires only a few things from the user:
  • Summarizing Statement
  • Keywords
  • Description
  • PubMed ID
  • Tab-separated gene data, entered directly OR uploaded as a file in the same format. You can find a description of how to upload GeneSets here.
When uploading new data sets from scratch, you will have the opportunity to add any or all of the above metadata to your GeneSet. This metadata is critical for GeneWeaver to function correctly: it allows us to identify other GeneSets of possible interest, duplications, or alternative curations. Strict adherence to the curation standards also allows you and others to make sense of the resulting graphics and association labels. The GeneSet upload page on the GeneWeaver site is available here. Notice in Figure 1 below that you can label your GeneSet as private at the time of upload. If you wish to change it from Private to Public at a later time, you will need to visit the My GeneSets tab. See Figure 2.
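Before uploading, it can help to check a draft locally. The following is a minimal sketch, not part of GeneWeaver itself: the field names in REQUIRED_FIELDS, the two-column layout (gene identifier, then value), and both function names are assumptions made for illustration. Consult the upload instructions for the format GeneWeaver actually accepts.

```python
import csv
import io

# Hypothetical field names mirroring the required metadata list above;
# the actual upload form may label these differently.
REQUIRED_FIELDS = ("summary", "keywords", "description", "pubmed_id")


def parse_gene_data(text):
    """Parse tab-separated text into (gene identifier, value) pairs.

    Assumes two columns per line; this layout is an illustration only.
    """
    rows = []
    for record in csv.reader(io.StringIO(text), delimiter="\t"):
        if not record or not any(cell.strip() for cell in record):
            continue  # skip blank lines
        if len(record) != 2:
            raise ValueError(f"expected 2 tab-separated columns, got {record!r}")
        rows.append((record[0].strip(), record[1].strip()))
    return rows


def missing_metadata(metadata):
    """Return the required fields that are absent or empty."""
    return [f for f in REQUIRED_FIELDS if not str(metadata.get(f, "")).strip()]
```

For example, calling missing_metadata on a draft that has only a summary and keywords returns the names of the description and PubMed ID fields, which is a quick reminder of what still needs to be filled in.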

Figure 2 Curation.png

Figure 1: In order to make GeneSets public they must adhere to our curation standards.

Figure 3 Curation.png

Figure 2: From here, find the GeneSet you want to modify and hit the Edit button. You'll find yourself back in the Upload GeneSet page where you can change from Private to Public.

Curation Process

Curation works by taking published materials and annotating them with appropriate metadata to enable large-scale data integration. This means all materials must go through some form of quality assurance. To support this process, GeneWeaver provides a set of curation tools designed to identify publications of interest, generate a workflow queue, and allow group members to curate GeneSets of interest simultaneously.

One of these tools is the Stub Generator, which gathers publications of interest.

Figure 4 Curation.png

Figure 3: The stub generator.

From the Stub Generator, users can run keyword-based searches to create stubs, which are empty containers for GeneSets. From this point on, the user can add GeneSets to the stubs and then send them in for curation.

Quality assurance covers several details, including but not limited to the following:

  • Appropriate naming conventions.
  • Concise, accurate descriptions.
  • Consistent data: all included GeneSets relate to the topic material.
  • Value in the GeneSets: no GeneSet may contain only one gene.
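Some of these checks can be approximated mechanically before sending GeneSets in for curation. The sketch below is an illustration under stated assumptions, not the curators' actual procedure: the dictionary keys (name, genes, description), the function name, and the 100-word description limit are all hypothetical. Naming conventions and topical consistency still need human review.

```python
def precheck_genesets(genesets, max_description_words=100):
    """Return a list of human-readable problems found in draft GeneSets.

    Each GeneSet is a dict with hypothetical keys: name, genes, description.
    The word limit is an arbitrary stand-in for "concise".
    """
    problems = []
    for gs in genesets:
        name = gs.get("name", "").strip()
        label = name or "<unnamed>"
        if not name:
            problems.append(f"{label}: missing name")
        # A GeneSet must contain more than one gene.
        if len(set(gs.get("genes", []))) < 2:
            problems.append(f"{label}: contains fewer than two distinct genes")
        if len(gs.get("description", "").split()) > max_description_words:
            problems.append(
                f"{label}: description longer than {max_description_words} words"
            )
    return problems
```

An empty result does not mean a submission will pass curation; it only means these few mechanical checks found nothing to flag.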

Figure 5 Curation.png

Figure 4: A good example of proper submission.

This will result in a GeneSet tailored to your input!

