The Wheeler lab has been awarded an NIH R15 grant from the National Institute of General Medical Sciences to develop “Improved protein-DNA models for translated sequence search with profile Hidden Markov models”. The grant is for $426K over three years, beginning April 1, 2017.
Fast and sensitive sequence database search is fundamental to modern molecular biology. The funded research will make the annotation of protein-coding content in sequenced genomes and metagenomic datasets more accurate. The research builds on established sequence database search software that uses probabilistic models to increase sensitivity, both through greater statistical power and through a better ability to model the complexity of sequence families. The probabilistic models are called profile hidden Markov models (profile HMMs), and the software is HMMER.
Dr. Wheeler’s group will develop new models that account for frameshifting mutations or errors that obscure the protein-coding nature of a sequence, and for splice sites that break genes or domains into distant fragments on the genome. By combining new algorithms with existing approaches, these models will be fast enough to use for large-scale annotation, such as in the EMBL European Bioinformatics Institute’s Metagenomics Portal.
(See the press release: here)