DEEPre

DEEPre is a sequence-based enzyme function predictor built on deep learning. It predicts the function of an enzyme by predicting its Enzyme Commission (EC) number. It uses sequence one-hot encoding, the position-specific scoring matrix (PSSM), and functional domain encoding as raw feature encodings. These raw encodings are fed to a deep learning model with a novel structure that performs dimensionality uniformization, feature selection, and classification model training simultaneously. For more details, see the paper: https://www.ncbi.nlm.nih.gov/pubmed/29069344.
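As a rough illustration of the first raw encoding mentioned above, a sequence one-hot encoding could be sketched as follows. The alphabet ordering and the names AMINO_ACIDS, AA_INDEX, and one_hot_encode are assumptions made for this example; they are not taken from the DEEPre implementation.

```python
# Illustrative sketch only: the alphabet order below is an assumption,
# not the ordering used inside DEEPre.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acid letters
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence):
    """Return an L x 20 matrix (list of lists) with exactly one 1 per row."""
    matrix = []
    for residue in sequence.upper():
        row = [0] * len(AMINO_ACIDS)
        # A non-standard letter raises KeyError here.
        row[AA_INDEX[residue]] = 1
        matrix.append(row)
    return matrix
```

Each residue becomes a 20-dimensional indicator vector, so a sequence of length L yields an L x 20 matrix whose size varies with the sequence length.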

Input Sequence Requirements

  1. The length of the input sequence should be between 50 and 5000 amino acids (AA). Lines that contain '>' and all whitespace characters are removed automatically.
  2. The input sequence should not contain letters other than the 20 AA letters.
  3. PSI-BLAST should report at least one hit against SwissProt for the input sequence. Our server can still run in the extreme no-hit case; if you encounter that case, please contact the developer: yu.li@kaust.edu.sa.
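The first two requirements can be checked locally before submitting. The sketch below is a minimal example, not the server's own code: the function names, the inclusive length bounds, and the conversion to uppercase are assumptions. The PSI-BLAST requirement is not checked here.

```python
VALID_AA = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 amino acid letters
MIN_LEN, MAX_LEN = 50, 5000             # bounds assumed to be inclusive


def clean_sequence(raw_text):
    """Drop every line containing '>' and remove all whitespace."""
    kept = [line for line in raw_text.splitlines() if ">" not in line]
    # str.split() with no argument splits on any run of whitespace.
    return "".join("".join(kept).split()).upper()  # uppercasing is an assumption


def check_sequence(raw_text):
    """Return (cleaned_sequence, problems); problems is empty if the input is valid."""
    seq = clean_sequence(raw_text)
    problems = []
    if not MIN_LEN <= len(seq) <= MAX_LEN:
        problems.append(f"length {len(seq)} is outside {MIN_LEN}-{MAX_LEN}")
    invalid = sorted(set(seq) - VALID_AA)
    if invalid:
        problems.append("invalid letters: " + "".join(invalid))
    return seq, problems
```

An empty problems list means the cleaned sequence satisfies requirements 1 and 2.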

Input File Format

For each query, provide two lines: one stating whether you are sure the sequence is an enzyme, and one containing the sequence. The first line must start with '>'. If you are sure the sequence is an enzyme, put 'Yes' after the '>'. If you are not sure, leave the rest of the line empty, but the '>' must always be present. Put the whole sequence on the next line. Here is an example: Sample File. Please follow the format strictly. We do not remove whitespace characters from the input file because it may contain multiple entries.
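The two-line format described above can be read with a small parser like the one below. This is a sketch under stated assumptions: the function name parse_input_file is hypothetical, blank lines between entries are skipped, and only an exact 'Yes' is treated as a confirmed enzyme.

```python
def parse_input_file(text):
    """Return a list of (is_known_enzyme, sequence) tuples, one per query."""
    # Assumption: blank lines between entries are ignored.
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) % 2 != 0:
        raise ValueError("expected one header line and one sequence line per query")
    queries = []
    for header, sequence in zip(lines[0::2], lines[1::2]):
        if not header.startswith(">"):
            raise ValueError(f"header must start with '>': {header!r}")
        if sequence.startswith(">"):
            raise ValueError("a header line must be followed by a sequence line")
        # Assumption: only the exact text 'Yes' marks a confirmed enzyme.
        is_known_enzyme = header[1:].strip() == "Yes"
        queries.append((is_known_enzyme, sequence))
    return queries
```

For example, parse_input_file(">Yes\nMKVLA\n>\nACDEF\n") yields one confirmed-enzyme query followed by one unconfirmed query.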

Sample Input

Here are some sample inputs. Phrases after '>' indicate the ground truth for that sequence.

>Non-enzyme
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

>Non-enzyme
MALTSFLPAPTQLSQDQLEAEEKARSQRSRQTSLVSSRREPPPYGYRKGWIPRLLEDFGDGGAFPEIHVAQYPLDMGRKKKMSNALAIQVDSEGKIKYDAIARQGQSKDKVIYSKYTDLVPKEVMNADDPDLQRPDEEAIKEITEKTRVALEKSVSQKVAAAMPVRAADKLAPAQYIRYTPSQQGVAFNSGAKQRVIRMVEMQKDPMEPPRFKINKKIPRGPPSPPAPVMHSPSRKVTVKEQQEWKIPPCISNWKNAKGYTIPLDKRLAADGRGLQTVHINENFAKLAEALYIADRKAREAVEMRAQVERKMAQKEKEKHEEKLREMAQKARERRAGIKTHVEKEDGEARERDEIRHDRRKERQHDRNLSRAAPDKRSKLQRNENRDISEVIALGVPNPRTSNEVQYDQRLFNQSKGMDSGFAGGEDEIYNVYDQAWRGGKDMAQSIYRPSKNLDKDMYGDDLEARIKTNRFVPDKEFSGSDRRQRGREGPVQFEEDPFGLDKFLEEAKQHGGSKRPSDSSRPKEHEHEGKKRRKE

>1.1.1.100,Oxidoreductases,Acting on the CH-OH group of donors,With NAD or NADP as acceptor
MHYLPVAIVTGATRGIGKAICQKLFQKGLSCIILGSTKESIERTAIDRGQLQSGLSYQRQCAIAIDFKKWPHWLDYESYDGIEYFKDRPPLKQKYSTLFDPCNKWSNNERRYYVNLLINCAGLTQESLSVRTTASQIQDIMNVNFMSPVTMTNICIKYMMKSQRRWPELSGQSARPTIVNISSILHSGKMKVPGTSVYSASKAALSRFTEVLAAEMEPRNIRCFTISPGLVKGTDMIQNLPVEAKEMLERTIGASGTSAPAEIAEEVWSLYSRTALET

>2.7.4.13,Transferases,Transferring Phosphorus-Containing Groups,Phosphotransferases with a phosphate group as acceptor
MELIFLSGIKRSGKDTTADYINSNFKSIKYQLAYPIKDALAIAWERKHAENPDVFTELKYEYFEGIGYDRETPLNLNKLDVIELMEETLIYLQRQYLPINGVNILSSLEGGYSYLDIKPYEALREAINNINDTWSIRRLMQALGTDVVVNLFDRMYWVKLFALNYMDYIGSDFDYYVVTDTRQVHEMETARAMGATVIHVVRSGTESTDKHITEAGLPIEEGDLVITNDGSLEELYSKIEKILR

>3.4.21.53,Hydrolases,Peptidase,Serine proteases
MLARALIRRRQAVTTLAAPSRARSTRSRALLDELGAGAVAAEGVGRARGSSKNAFVRATTANGNETLASAGDGGSTSSASSSSTTSGGIMVSAAHPSSHPQVLAVPLPRRPLMPGIIMPVKVTDEKLIAELEDMRNRGQAYVGAFLMRSEGSSSSSAAGKEEDAFDALTKRTVASVGLDGEEEEGADPSDHMHDIGTFAQVHNIVRLPADSPNGEESATLLLLGHRRLRKLGTMKRDPLVVQVEHLKDEKFDANDDIIKATTNEVVATIKDLLKTNPLHKETLQYFAQNFNDFQDPPKLADLGASMCSADDAQLQRVLELLSVKDRLDATLELLKKEVEIGKLQADIGKKVEDKISGDQRRYFLMEQLKSIKKELGMERDDKTALIEKFTKRFEPKRKSVPEETVKVIDEELQKLSGLEPSSSEFNVTRNYLEWLTSLPWGVCGDEKLDIAHAQEVLDADHYGLEDVKDRILEFIAVGQLLGTTQGKIITMVGPPGVGKTSIGQSIAKALGRKFYRFSVGGMSDVAEIKGHRRTYVGAMPGKLIQCLKSTGVCNPVVLIDEIDKLGRGYQGDPASALLELLDPEQNGTFLDHYLDVPVDLSKVLFVCTANVLDTIPGPLLDRMEVVRLSGYITDEKVQIARTYLEKAAKGKSGLSDFDATITDEAMSKLIGDYCREAGVRNLQKHLEKVYRKVALKVARAKSTDTTLDPIVIDVDDLVDYVGQPPFQTDRIYDETPPGVVTGLAWTAMGGSTLYIECTSVESGEGKGSLKTTGQLGDVMKESSAIAHTFTRGFLQSKDPGNDFLQKTSLHVHVPAGATPKDGPSAGVTITTSLLSLAMDKPVKPNLAMTGELTLTGRVLPVGGIKEKTIAARRSGVKTIIFPQGNKKDYDELSEDIREGLEACFVSTYDEVYRHALDWDR
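
The ground-truth phrases in the sample headers above follow a comma-separated pattern: the EC number first, then a description for each level. The sketch below splits such a header into its parts. The function name parse_label and the dictionary keys are assumptions made for this example.

```python
def parse_label(header):
    """Split a sample header such as '>1.1.1.100,Oxidoreductases,...' into parts."""
    text = header.lstrip(">").strip()
    if text == "Non-enzyme":
        return {"is_enzyme": False, "ec_number": None, "ec_digits": [], "names": []}
    # The first comma-separated field is the EC number; the rest are descriptions.
    ec_number, *names = text.split(",")
    return {
        "is_enzyme": True,
        "ec_number": ec_number,
        "ec_digits": ec_number.split("."),  # e.g. ['3', '4', '21', '53']
        "names": names,
    }
```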