DRAF



1. What is DRAF?

DRAF is a transcription factor (TF) binding site (TFBS) prediction tool. It takes a DNA sequence supplied by the user and predicts the binding sites of a specific TF in that sequence. DRAF uses models based on TFBS DNA sequences and TF amino acid properties to provide accurate TFBS predictions.

The DRAF models were constructed as in the following figure:

The figure above shows the input data, the training procedure, and the use of the DRAF models for predicting TF-TFBS links. In each TF-TFBS link, the TF sequence is represented by its physicochemical properties and the TFBS sequence by a binary representation. A DRAF model was then constructed for each group of TFs, with TFs grouped by TFBS length. Finally, the DRAF models were tested on the holdout test dataset and on a separate set of ChIP-seq peak data with their associated background datasets. The DRAF models aim to predict which TF-TFBS links correspond to a valid TFBS for a particular TF.
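To make the idea of a binary representation of a TFBS concrete, here is a minimal sketch in Python. It assumes a one-hot scheme (four bits per nucleotide); this page does not state which binary encoding DRAF actually uses, so the mapping `ONE_HOT` and the function `encode_site` are illustrative only.

```python
from typing import List

# Assumption: one-hot encoding, 4 bits per base. DRAF's real encoding may differ.
ONE_HOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}


def encode_site(site: str) -> List[int]:
    """Return a flat binary vector of length 4 * len(site)."""
    bits: List[int] = []
    for base in site.upper():
        if base not in ONE_HOT:
            raise ValueError(f"unsupported base: {base!r}")
        bits.extend(ONE_HOT[base])
    return bits


print(encode_site("ACGT"))
# [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
```

Under this scheme every site of the same length yields a vector of the same size, which is one plausible reason for building a separate model per TFBS length.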


2. Input file format

DRAF accepts DNA sequences in FASTA format. The sequences can be pasted directly into the text area of the DRAF web page or uploaded as a file. The following is an example of input sequences for DRAF:

>Seq1
AATTGCGGCACGCAACTTAGTTGACCGTTTGACTTAAATTGGGCCAATTGGCCATCACGCAACGGTTCCGGAGAGTACGTCACGCGAT
>Seq2
AATTGCGGCACGCAACTGTTGCGTGCTAGTTGACCGTTTGACTTAAATTGGGCCAATTGGCCATCACGCAACGGTTCCGGAGAGTACGTCACGCGAT
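
Before submitting, it can help to check locally that a file parses as FASTA. The following is a minimal sketch, not part of DRAF: `parse_fasta` and `invalid_bases` are hypothetical helper names, and the accepted alphabet (`ACGTN`) is an assumption, since this page does not list which characters DRAF allows.

```python
from typing import Dict, Iterable, Optional, Set

# Assumption: the alphabet DRAF accepts is not documented on this page.
VALID_BASES = set("ACGTN")


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA lines into {record_id: sequence}, upper-casing the bases."""
    records: Dict[str, str] = {}
    current_id: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].split()
            if not header:
                raise ValueError("empty FASTA header")
            current_id = header[0]
            if current_id in records:
                raise ValueError(f"duplicate record id: {current_id}")
            records[current_id] = ""
        else:
            if current_id is None:
                raise ValueError("sequence data found before the first '>' header")
            records[current_id] += line.upper()
    return records


def invalid_bases(seq: str) -> Set[str]:
    """Return the characters in seq that fall outside VALID_BASES."""
    return set(seq) - VALID_BASES


example = """>Seq1
AATTGCGGCACGCAACTTAGTTGACCGTTTGACTTAAATTGGGCCAATTGGCCATCACGCAACGGTTCCGGAGAGTACGTCACGCGAT
>Seq2
AATTGCGGCACGCAACTGTTGCGTGCTAGTTGACCGTTTGACTTAAATTGGGCCAATTGGCCATCACGCAACGGTTCCGGAGAGTACGTCACGCGAT
"""

records = parse_fasta(example.splitlines())
for name, seq in records.items():
    print(name, len(seq), sorted(invalid_bases(seq)))
```

For each record the script prints the identifier, the sequence length, and any characters outside the assumed alphabet; an empty list means the sequence looks clean.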


3. Output file format

DRAF outputs its predictions in GFF3 format.
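
GFF3 is a tab-separated format with nine columns (seqid, source, type, start, end, score, strand, phase, attributes), so the results can be read with a few lines of Python. The sketch below is a generic GFF3 reader, not a DRAF-specific one: this page does not say what DRAF writes in each column, and the sample record is hypothetical, made up only to exercise the parser.

```python
from typing import Dict, List, NamedTuple, Optional


class Gff3Feature(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int              # 1-based, inclusive (GFF3 convention)
    end: int                # 1-based, inclusive
    score: Optional[float]  # None when the column is "."
    strand: str
    phase: str
    attributes: Dict[str, str]


def parse_gff3(text: str) -> List[Gff3Feature]:
    """Parse GFF3 text, skipping blank lines and '#' comment/directive lines."""
    features: List[Gff3Feature] = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
        attributes: Dict[str, str] = {}
        if cols[8] != ".":
            for pair in cols[8].split(";"):
                if pair:
                    key, _, value = pair.partition("=")
                    attributes[key] = value
        score = None if cols[5] == "." else float(cols[5])
        features.append(
            Gff3Feature(cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]),
                        score, cols[6], cols[7], attributes)
        )
    return features


# Hypothetical record for illustration only; not real DRAF output.
sample = (
    "##gff-version 3\n"
    "Seq1\tDRAF\tTF_binding_site\t12\t21\t0.87\t+\t.\tName=TF_A\n"
)

features = parse_gff3(sample)
for f in features:
    print(f.seqid, f.start, f.end, f.strand, f.score, f.attributes)
```

Coordinates in GFF3 are 1-based and inclusive, so the hypothetical feature above spans 10 bases.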


4. DRAF running steps

To use DRAF, please follow these steps:

Step 1: Select one or more TFs from the drop-down list, then select a sensitivity threshold between 0.1 and 0.9 from the sensitivity drop-down list. For each selected TF, DRAF predicts DNA binding sites using the model that corresponds to that TF and the predefined thresholds that gave the selected sensitivity level on that model's training data. If the selected TF is one of the 58 TFs for which we retrieved ChIP-seq datasets from ENCODE, DRAF uses the thresholds that were optimized to achieve the selected sensitivity level on the ChIP-seq datasets associated with that TF.
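
To illustrate how a score threshold can be tied to a sensitivity level, here is a minimal sketch. It is a generic procedure under stated assumptions, not DRAF's documented method: it assumes each known binding site has a model score, that a site is predicted when its score is greater than or equal to the threshold, and it uses made-up scores. The function name `threshold_for_sensitivity` is hypothetical.

```python
import math
from typing import Sequence


def threshold_for_sensitivity(positive_scores: Sequence[float],
                              sensitivity: float) -> float:
    """Return the highest threshold at which at least `sensitivity` of the
    known positive sites score >= threshold."""
    if not positive_scores:
        raise ValueError("need at least one positive score")
    if not 0.0 < sensitivity <= 1.0:
        raise ValueError("sensitivity must be in (0, 1]")
    ranked = sorted(positive_scores, reverse=True)
    # Number of positives that must be recovered; the small epsilon guards
    # against floating-point noise such as 0.7 * 10 == 7.000000000000001.
    needed = max(1, math.ceil(sensitivity * len(ranked) - 1e-9))
    return ranked[needed - 1]


# Made-up scores for ten known binding sites (illustration only).
scores = [0.95, 0.90, 0.80, 0.75, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]

for level in (0.5, 0.9):
    print(level, threshold_for_sensitivity(scores, level))
# 0.5 0.6
# 0.9 0.2
```

A higher sensitivity level lowers the threshold, so more true sites are recovered at the cost of more predictions overall.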

Step 2: Provide the DNA sequence in FASTA format. You can paste the sequence into the text area on the submission page or upload it as a file (maximum size: 20 MB).
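
A quick local check of the file size can save a failed upload. This is a small sketch, not part of DRAF; the helper `within_upload_limit` is hypothetical, and the limit is taken as 20,000,000 bytes, the stricter reading of "20 MB", because this page does not say whether the limit is decimal or binary.

```python
import os
import tempfile

# Assumption: "20 MB" read as 20 * 10^6 bytes (the stricter interpretation).
MAX_UPLOAD_BYTES = 20 * 1000 * 1000


def within_upload_limit(path: str, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit


# Demonstration on a small temporary FASTA file.
with tempfile.NamedTemporaryFile("w", suffix=".fa", delete=False) as handle:
    handle.write(">Seq1\nAATTGCGGCACGCAACTTAG\n")
    demo_path = handle.name

print(within_upload_limit(demo_path))
os.remove(demo_path)
```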

Step 3: Click the Run DRAF button. The system will start the run and provide a link to the results page. Please bookmark that page and check it later for the results.


5. How long does it take to run DRAF?

In our experiments, the average run time for one model on 500 ChIP-seq peak sequences (each 500 bp long) was up to several minutes.