Dragon Database for Methylated Genes and Diseases  
How to search the database?
What is the score that is displayed next to sentences?
What are the benefits of registering for an account? Is it required?
How to submit data to the database?
Can I edit the submitted data?
What is the gene ID, disease ID and species ID?
Some associations do not include the species, PubMed ID, or PubMed Central ID! Why?
What is the format of the downloaded files that include the search results?
What is the meaning of gene expression and disease progression columns?
Who should I contact for technical questions?


How to search the database?
Please follow these steps to search the database:
  1. Go to the Home page.

  2. Select the genes, diseases and/or species. You can select more than one of each. You can type gene names in the text box or select the genes from the list. You can also filter the diseases by disease category, disease type and disease sub-type. Any of the gene, disease and species fields can be left empty. To retrieve associations for all genes, diseases and species contained in the database, click the Submit button directly without selecting anything.

  3. You will then get a page that shows some statistics about your query. Please note that if the query is very long, the statistics page may not be displayed, in order to speed up processing.

  4. Select the associations in which you are interested, and click Submit.

  5. Finally, you will get a page that shows color-highlighted evidence sentences for the associations, gene expression, disease progression, confidence scores, and links to other databases.


What is the score that is displayed next to sentences?
The database contains associations that were extracted automatically by the DEMGD system. The system assigns each association a confidence score between 0 and 1 that indicates how likely the association is to be correct. A score close to 1 means the association is very likely correct. A score close to 0 means the association is likely incorrect.
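
If you work with downloaded results, the score can be used to keep only the more reliable associations. The following is a minimal sketch in Python, not part of the database itself. The function name, the record layout and the cut-off of 0.8 are assumptions chosen for illustration; the database does not prescribe a threshold.

```python
from typing import Iterable


def filter_by_score(associations: Iterable[dict], min_score: float = 0.8) -> list:
    """Keep associations whose confidence score is at least min_score.

    min_score is an illustrative cut-off, not a value recommended by the database.
    """
    if not 0.0 <= min_score <= 1.0:
        raise ValueError("min_score must lie between 0 and 1")
    return [a for a in associations if float(a["score"]) >= min_score]


# Hypothetical records, for illustration only (not real database content).
examples = [
    {"gene": "GENE_A", "disease": "disease X", "score": 0.95},
    {"gene": "GENE_B", "disease": "disease Y", "score": 0.12},
]
high_confidence = filter_by_score(examples, min_score=0.8)
```

A stricter cut-off keeps fewer but more reliable associations; a looser one keeps more associations at the cost of more incorrect ones.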

What are the benefits of registering for an account? Is it required?
Registering for an account is not required. If you do not have an account, you can still search the database and download the search results. However, it is recommended to register for an account. If you have an account, you will be able to save search results in your account, and submit, view and edit all your submitted associations.

How to submit data to the database?
Please follow these steps to submit new associations to the database:
  1. Go to the Submit page. You must log in or sign up for an account.

  2. Enter the gene, disease and methylation type. These fields are required; the species is optional. Make sure that the gene, disease, and methylation word are written exactly as they appear in the sentence.

  3. Then enter the PubMed ID (PMID) and/or PubMed Central ID (PMCID). You must enter at least one of them.

  4. Finally, enter the sentence in which the association is mentioned.

  5. If there are several associations in the sentence, you can make a separate submission for each association.

We check newly submitted associations periodically. Because this project aims to provide a database that does not necessarily involve manual curation, we do not manually check or evaluate new submissions immediately. However, we implemented three mechanisms to minimize human error during submission. First, the evidence sentence and the PubMed ID and/or PubMed Central ID are required, which enables quality assessment of the submitted information. Second, to ensure that the user entered the genes and diseases correctly, the system checks whether the submitted gene and disease names appear in the submitted sentence, and displays a warning message if they do not. Third, users who signed up for accounts can edit their submissions if they find any errors.
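
The first two checks can be sketched as follows. This is a minimal illustration in Python, not the code that runs on the server. The function name and messages are hypothetical, and the exact, case-sensitive substring match is an assumption based on the instruction to write names exactly as they appear in the sentence.

```python
def check_submission(sentence, gene, disease, pmid="", pmcid=""):
    """Return a list of problems found in a submission (empty if none)."""
    problems = []
    # Check 1: an evidence sentence and at least one identifier are required.
    if not sentence.strip():
        problems.append("An evidence sentence is required.")
    if not (pmid.strip() or pmcid.strip()):
        problems.append("Enter a PubMed ID, a PubMed Central ID, or both.")
    # Check 2: the gene and disease must appear in the sentence as written.
    # Assumption: an exact, case-sensitive substring match.
    for label, name in (("gene", gene), ("disease", disease)):
        name = name.strip()
        if not name or name not in sentence:
            problems.append(
                f"Warning: the {label} '{name}' does not appear in the sentence."
            )
    return problems


# Made-up sentence and placeholder PMID, for illustration only.
sentence = "Hypermethylation of BRCA1 was observed in breast cancer samples."
ok = check_submission(sentence, "BRCA1", "breast cancer", pmid="00000000")
bad = check_submission(sentence, "BRCA2", "breast cancer", pmid="00000000")
```

Here `ok` is an empty list, while `bad` contains one warning because BRCA2 is not mentioned in the sentence.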

Can I edit the submitted data?
If you submitted data to the database and need to edit it, please follow these steps:
  1. Log in to your account.

  2. Get the Assc ID of the association you want to edit. You can find it in the "view submissions" page.

  3. In the left panel, you will find a link "Edit Submissions".

  4. Enter the Assc ID in the box.

  5. Finally, edit the association, and click Save.

What is the gene ID, disease ID and species ID?
Because genes, diseases and species can be mentioned in different ways, we use gene IDs, disease IDs, and species IDs to normalize them. These IDs come from the NCBI Entrez Gene database, the Comparative Toxicogenomics Database (CTD) and the NCBI Taxonomy database, respectively.
For the gene ID, we also considered the species, because the same gene can have different gene IDs depending on the species with which it is associated. For example, if an association includes BRCA1 in human, the gene ID is 672, but in mice the gene ID is 12189. However, since the species was not identified for many associations, we used the human gene ID in those cases, because humans are the most commonly studied species. For some genes, no gene ID was identified because there is no match for the gene in humans.
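
The species-dependent lookup with a human default can be sketched as follows. This is a minimal Python illustration and not the actual DDMGD implementation. The table, the function name and the species labels are hypothetical; only the two BRCA1 IDs are taken from the text above.

```python
from typing import Optional

# Tiny illustrative table keyed by (gene symbol, species). The two BRCA1 IDs
# are the ones quoted above; a real table would be built from NCBI data.
GENE_IDS = {
    ("BRCA1", "human"): 672,
    ("BRCA1", "mouse"): 12189,
}


def resolve_gene_id(gene: str, species: Optional[str] = None) -> Optional[int]:
    """Return the gene ID for a gene in a species, defaulting to human.

    Returns None when the table has no matching entry.
    """
    # When the species was not identified, fall back to human.
    species_key = (species or "human").lower()
    return GENE_IDS.get((gene.upper(), species_key))


human_id = resolve_gene_id("BRCA1")            # species unknown
mouse_id = resolve_gene_id("BRCA1", "mouse")   # species identified
unknown = resolve_gene_id("NOT_A_GENE")        # no match in the table
```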
Additionally, some of the diseases in DDMGD are very specific, and CTD does not contain disease IDs for them. In these cases, we provide links to the parent diseases. Some other diseases in DDMGD are not covered by CTD at all, so they have no disease ID.

Some associations do not include the species, PubMed ID, or PubMed Central ID! Why?
Species are usually mentioned less frequently than genes, diseases and methylation words. Therefore, for some associations, the species could not be identified. In such cases, the system still extracts the association without the species.
Because some PubMed abstracts do not mention the PubMed Central ID, DEMGD included only the PubMed ID for sentences that were extracted from these abstracts. Similarly, the PubMed ID could not be extracted from some of the downloaded PubMed Central full-text articles, so DEMGD included only the PubMed Central ID for sentences that were extracted from these articles. For the rest of the abstracts and full-text articles, both the PubMed ID and the PubMed Central ID are available.

What is the format of the downloaded files that include the search results?
The downloaded file is a comma-separated values (CSV) file that includes the following columns: gene, gene ID, disease, disease ID, species, species ID, methylation word, gene expression, disease progression, PubMed Central ID, PubMed ID, score, sentence, and email of the user who submitted the information. If the data was generated by our system, you will find DDMGD in the email column.
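
A downloaded file can be read with any CSV library. The following is a minimal sketch using Python's standard csv module. It assumes that the file starts with a header row whose names match the column list above; the exact header spelling in your download may differ, so adjust the names if needed. The sample row is placeholder data, not real database content.

```python
import csv
import io

# Assumed header names, in the order listed above.
COLUMNS = [
    "gene", "gene ID", "disease", "disease ID", "species", "species ID",
    "methylation word", "gene expression", "disease progression",
    "PubMed Central ID", "PubMed ID", "score", "sentence", "email",
]


def read_results(handle):
    """Parse search results into a list of dicts, one per association."""
    rows = []
    for row in csv.DictReader(handle):
        # Convert the confidence score to a number when it is present.
        row["score"] = float(row["score"]) if row["score"] else None
        # "DDMGD" in the email column marks system-generated associations.
        row["system_generated"] = row["email"] == "DDMGD"
        rows.append(row)
    return rows


# In-memory sample standing in for a downloaded file (placeholder values).
sample = io.StringIO(
    ",".join(COLUMNS) + "\n"
    'BRCA1,672,example disease,,human,9606,methylation,,,,00000000,0.9,'
    '"Placeholder sentence, with a comma.",DDMGD\n'
)
results = read_results(sample)
```

To read a real download, open it with `open(path, newline="", encoding="utf-8")` and pass the file object to `read_results`. Using a CSV parser rather than splitting on commas matters here because evidence sentences often contain commas themselves.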

What is the meaning of gene expression and disease progression columns?
The gene expression and disease progression columns describe how gene methylation is associated with gene expression and disease progression, respectively. This information was extracted automatically from the text. For example, the gene expression column may contain "increase expression", which means that gene methylation is associated with an increase in gene expression. Similarly, the disease progression column may contain "involved in disease progression", which means that gene methylation is involved in disease progression. However, the majority of sentences in the processed text do not include information about gene expression or disease progression, so no such information was extracted for them. Instead, we provide links to other databases that provide gene expression information. We could not find resources that provide similar information for disease progression.

Who should I contact for technical questions?
For any comments or technical questions, please contact: arwa.binres@kaust.edu.sa




© 2013 King Abdullah University of Science and Technology. All rights reserved