Results
Isocitrate dehydrogenase 1 (IDH1)
BLAST
When the five IDH1 sequences were run through BLAST, most of the matches were IDH1 genes from other organisms, and these were the hits most similar to the input sequences. Some of the other matches were isopropylmalate dehydrogenases and tartrate dehydrogenases, which are known to be evolutionarily related to IDH. The last significant category of matches consisted of other types of IDH: the mitochondrial NADP+ dependent and NAD+ dependent forms. All these matches aligned with the complete IDH input sequences, so the similarity could not be narrowed down to specific parts of the input sequences.
Each input sequence had some unique hits which did not belong to any of the above classes, but there was no other common class of hits.
ScanPROSITE
When the "Exclude patterns with a high probability of occurrence" box was checked, ScanPROSITE found only one pattern in the IDH1 sequences -- the isocitrate and isopropylmalate dehydrogenases signature. The pattern was found in the latter third of the protein, e.g. amino acids 303-322 in the E. coli sequence.
When the "Exclude patterns with a high probability of occurrence" box was not checked, there were 7 matching patterns for each input IDH sequence. All were short (< 8 amino acids) except for the one found in the first run.
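ScanPROSITE matches protein sequences against PROSITE patterns. As a rough illustration of what such a pattern match involves, the sketch below translates PROSITE pattern syntax into a regular expression and reports 1-based match positions. It is not the ScanPROSITE implementation; the function names, the toy pattern and the toy sequence are made up for illustration, and the pattern is not the real dehydrogenase signature.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Handles: '-' separators, 'x' (any residue), [ABC] (allowed residues),
    {ABC} (forbidden residues), (n) and (n,m) repeats, '<' and '>' anchors.
    """
    pattern = pattern.rstrip(".")  # PROSITE patterns may end with a period
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # anchored to the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # anchored to the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"
        # [ABC] and single residue letters are already valid regex.
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def scan(sequence: str, pattern: str):
    """Return (start, end, matched_text) tuples using 1-based positions."""
    # A lookahead lets overlapping matches be reported as well.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [
        (m.start() + 1, m.start() + len(m.group(1)), m.group(1))
        for m in regex.finditer(sequence)
    ]


# Toy example (hypothetical pattern and sequence, for illustration only).
TOY_PATTERN = "G-x(2)-[ST]-{P}-H"
TOY_SEQUENCE = "MAGKLSAHGGRRTQHP"
hits = scan(TOY_SEQUENCE, TOY_PATTERN)
```

A short pattern like the toy one matches often by chance, which is consistent with the extra short matches that appeared when high-probability patterns were not excluded.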
PHI-BLAST
Using the isocitrate and isopropylmalate dehydrogenase signature pattern, I ran a PHI-BLAST search with the different IDH sequences. The results were as expected -- the matches were all members of the isocitrate and isopropylmalate dehydrogenase family of enzymes -- and included different IDH genes, isopropylmalate dehydrogenases and some tartrate dehydrogenases.
PSI-BLAST
The first iteration of PSI-BLAST returned the same hits as BLAST did. Choosing any one class of matches -- isopropylmalate dehydrogenases (and tartrate dehydrogenases), mitochondrial NADP+ dependent IDH and mitochondrial NAD+ dependent IDH -- for the next iteration simply generated more matches of the same class, along with the usual list of spurious matches. The search results using each class of enzymes for the next iteration overlapped considerably as well. Once again, the same problem was encountered as in the BLAST results; the alignments were almost always whole-protein, making it impossible to narrow down a consensus sequence to any one part of the input protein sequences.
DAB
The results of running DAB with each of the input sequences are linked in Table 1.
Table 1. Results of DAB with each IDH input sequence.
The important matches for each sequence are summarized below.
E. coli
- Sub-sequence 4 (amino acids 31 to 50) showed significant similarity to isopropylmalate dehydrogenase from two thermophilic bacteria, Thermus thermophilus and Thermus aquaticus.
- Sub-sequence 9 (amino acids 81 to 100) matched epidermal growth factor receptor and mutant epidermal growth factor receptor in Drosophila melanogaster.
- Sub-sequence 10 (amino acids 91 to 110) showed similarity to heat shock proteins 89 and 90 from various species.
- Sub-sequence 23 (amino acids 221 to 240) had several hits with NAD+ dependent isocitrate dehydrogenase, also called IDH3.
- Sub-sequence 27 (amino acids 261 to 280) gave two matches of interest -- one with an ADP-glucose phosphorylase and the other with leucyl tRNA synthetase.
A. thaliana
- Sub-sequences 23 and 24 (amino acids 221 to 250) were found to be similar to phosphatidylinositol 4-kinase from several species.
- Sub-sequence 24 (amino acids 231 to 250) matched different heavy chains of myosin from Drosophila melanogaster.
- Sub-sequences 27 and 28 (amino acids 261 to 290) showed similarity to ubiquitin transferase and ligase from Schizosaccharomyces pombe, or fission yeast.
- Sub-sequence 28 (amino acids 271 to 290) was found to be similar to MHC class I chain-related protein and its precursor in some primate species, including humans.
- There was a single hit for histidinol dehydrogenase (from Saccharomyces cerevisiae) in sub-sequence 35 (amino acids 341 to 360).
- Sub-sequence 39 (amino acids 381 to 400) showed similarity to arylamine N-acetyltransferase from several species including Homo sapiens and Mus musculus.
S. cerevisiae
- Sub-sequence 3 (amino acids 21 to 40) showed significant similarity to ribulose-1,5-bisphosphate carboxylase in several species of red algae.
- Sub-sequence 5 (amino acids 41 to 60) had one match with dissimilatory sulfite reductase from Desulfococcus multivorans, a sulfur-reducing bacterium.
- Sub-sequence 8 (amino acids 71 to 90) was similar to threonyl-tRNA synthetase.
- Sub-sequences 27 and 28 (amino acids 261 to 290) were similar to ubiquitin transferase and ligase from Schizosaccharomyces pombe.
M. musculus
- Sub-sequence 4 (amino acids 31 to 50) was similar to an ATP-binding cassette in humans.
- Sub-sequence 13 (amino acids 121 to 140) matched NAD dependent malate oxidoreductases and NADP dependent quinone oxidoreductase.
- Sub-sequence 16 (amino acids 151 to 170) was similar to a heterodisulfide reductase subunit from Aquifex aeolicus, a thermophilic bacterium.
- Sub-sequence 28 (amino acids 271 to 290) matched ubiquitin transferase and ligase from Schizosaccharomyces pombe.
- Sub-sequence 36 (amino acids 351 to 370) was similar to phosphoglycerate kinase from several species, including Homo sapiens and several marsupials.
H. sapiens
- Sub-sequence 4 (amino acids 31 to 50) was similar to an ATP-binding cassette in humans.
- Sub-sequence 12 (amino acids 111 to 130) matched an ATP-binding cassette multidrug transporter from Emericella nidulans, a fungus.
- Sub-sequence 27 (amino acids 261 to 280) showed similarity to ubiquitin transferase and ligase from Schizosaccharomyces pombe.
- Sub-sequences 35 and 36 (amino acids 341 to 370) were similar to phosphoglycerate kinase from several species.
Consensus
- Sub-sequence 21 (amino acids 201 to 220) was similar to isopropylmalate dehydrogenase from two bacterial species, Haemophilus influenzae and Bacillus caldotenax.
- Sub-sequence 30 (amino acids 291 to 310) showed similarity to ubiquitin transferase and ligase from Schizosaccharomyces pombe.
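The sub-sequence numbers in the lists above follow a regular pattern: each sub-sequence is 20 amino acids long and starts 10 residues after the previous one, so sub-sequence 4 covers amino acids 31 to 50. The sketch below reproduces that numbering. The window length and step are inferred from the listed ranges rather than taken from DAB itself, the function names are hypothetical, and dropping any incomplete window at the end of the sequence is an assumption.

```python
def subsequence_range(n: int, length: int = 20, step: int = 10):
    """Return the 1-based (first, last) residue numbers of sub-sequence n."""
    first = (n - 1) * step + 1
    return first, first + length - 1


def split_into_subsequences(sequence: str, length: int = 20, step: int = 10):
    """Cut a sequence into overlapping windows, numbered from 1.

    Assumption: only full-length windows are produced; how the real tool
    treats a shorter final window is not known from the results.
    """
    starts = range(0, len(sequence) - length + 1, step)
    return [
        (n, sequence[start:start + length])
        for n, start in enumerate(starts, start=1)
    ]


# Ranges quoted in the lists above.
print(subsequence_range(4))   # sub-sequence 4
print(subsequence_range(27))  # sub-sequence 27
```

Because consecutive windows overlap by 10 residues, a single similar region can show up in two adjacent sub-sequences, as with sub-sequences 27 and 28 (amino acids 261 to 290).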
Chime
The results obtained with the above tools were summarized in interactive web presentations using Chime scripting. The first presentation shows the IDH1 protein from E. coli, complexed with Mg2+ and isocitrate. The second presentation shows the same molecule complexed with its NADP cofactor and Ca2+.
Caspase 3
BLAST
BLASTing the two caspase 3 sequences yielded matches mostly with other caspases. The highest scoring matches were with caspase 3 genes from other organisms. Caspases 7 and 6 scored highly as well, in that order. The rest of the matches were with several different caspases, including caspase 2, caspase 1 and caspase 8. The alignments, especially for the lower-similarity hits, tended to fall in the region after the first 40 amino acids.
ScanPROSITE
With the "Exclude patterns with a high probability of occurrence" box checked, there were two main patterns detected: a histidine active site and a cysteine active site. The positions of these pattern matches were at amino acids 108-122 and 154-165 respectively, for both the human and the mouse caspase 3 sequences.
As with IDH, unchecking the "Exclude patterns with a high probability of occurrence" box yielded more pattern matches. There were nine matches, including the two active site matches found in the first search. Once again, all matches other than those found in the first search were short (< 7 amino acids long).
PHI-BLAST
Using the two active site patterns found in the ScanPROSITE search, I ran PHI-BLAST with the two caspase 3 sequences. The results were similar to the BLAST results and caspase 3 (from other organisms), caspase 7 and caspase 6 featured prominently. The alignments were similar to the BLAST results as well, with most of the matches aligning after the first 40 amino acids of the input caspase 3 sequences.
PSI-BLAST
Since the first iteration of BLAST yielded no surprises, running PSI-BLAST was not very useful. Using any one caspase for the iteration generated more matches for that caspase type. For example, when I chose all the caspase 6 matches for my iteration, the results were similar to running BLAST on a caspase 6 sequence. The alignments were identical to those obtained with BLAST.
DAB
The results of running DAB with each of the caspase 3 sequences are linked in Table 2 below:
Table 2. Results of DAB with each caspase 3 input sequence.
The important matches for each sequence are summarized below.
Xenopus laevis
- Sub-sequence 4 (amino acids 31 to 50) was similar to a type I signal peptidase from Bacillus amyloliquefaciens.
- Sub-sequence 11 (amino acids 101 to 120) matched a hypothetical protein from Aquifex aeolicus and a self-splicing hypothetical protein from Methanococcus jannaschii.
- Sub-sequence 15 (amino acids 141 to 160) was similar to a capsid protein (or its precursor) from three species of calicivirus.
- Sub-sequence 22 (amino acids 211 to 230) was similar to a conserved protein of unknown function from Methanobacterium thermoautotrophicum.
Gallus gallus
- Sub-sequence 6 (amino acids 51 to 70) matched beta-lactamase precursor from Bacteroides vulgatus.
- Sub-sequence 14 (amino acids 131 to 150) was similar to a hypothetical protein from Pyrococcus abyssi, which is an archaeal organism.
- Sub-sequence 17 (amino acids 161 to 180) had several matches for interleukin-1 beta converting enzyme (also called caspase 1) from various species.
- Sub-sequence 22 (amino acids 211 to 230) showed similarity to acetylcholinesterase from three species of cattle tick.
- Sub-sequence 25 (amino acids 241 to 260) matched an open reading frame with unknown gene product from Neisseria gonorrhoeae.
Rattus norvegicus
- Sub-sequence 7 (amino acids 61 to 80) was similar to cathepsin S from Homo sapiens. Cathepsin S is a cysteine protease. This sub-sequence also matched an ATP-dependent CLP protease from Mycobacterium tuberculosis.
- Sub-sequence 13 (amino acids 121 to 140) matched interleukin-1 beta converting enzyme (caspase 1).
Mus musculus
- Sub-sequence 10 (amino acids 91 to 110) was similar to a conserved protein of unknown function from Methanobacterium thermoautotrophicum, different from the match in the X. laevis sequence.
- Sub-sequence 14 (amino acids 131 to 150) was similar to a hypothetical protein from Thermotoga maritima, a thermophile.
- Sub-sequence 16 (amino acids 151 to 170) matched interleukin-1 beta converting enzyme (caspase 1).
- Sub-sequence 18 (amino acids 171 to 190) was similar to a 20S proteasome subunit from Streptomyces coelicolor.
Homo sapiens
- Sub-sequence 9 (amino acids 81 to 100) matched a 20S proteasome subunit and a multicatalytic endopeptidase from Arabidopsis thaliana.
- Sub-sequence 16 (amino acids 151 to 170) matched interleukin-1 beta converting enzyme (caspase 1).
Consensus
- Sub-sequence 11 (amino acids 101 to 120) was similar to vitamin D receptor from several species. This sub-sequence was also similar to a conserved protein of unknown function from Methanobacterium thermoautotrophicum, the same match seen with the M. musculus sequence.
- Sub-sequence 17 (amino acids 161 to 180) matched interleukin-1 beta converting enzyme (caspase 1).
- Sub-sequence 19 (amino acids 181 to 200) was similar to a 20S proteasome subunit from Streptomyces coelicolor, the same match that was obtained for the M. musculus sequence.
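The ScanPROSITE positions reported earlier for the human and mouse sequences (amino acids 108-122 and 154-165) can be related to the DAB sub-sequence numbers with a small calculation. The sketch below assumes the same 20-residue window and 10-residue step inferred from the listed ranges; the function name is hypothetical, and the result says only which windows overlap a residue range, not why a given window produced a particular match. It does not apply to the consensus sequence, whose numbering may differ.

```python
def windows_overlapping(first: int, last: int, length: int = 20, step: int = 10):
    """Return the numbers of the sub-sequences that overlap residues first..last.

    Residue numbers are 1-based and inclusive. Window n is assumed to cover
    residues (n - 1) * step + 1 through (n - 1) * step + length.
    """
    numbers = []
    n = 1
    while (n - 1) * step + 1 <= last:  # stop once windows start past the range
        start = (n - 1) * step + 1
        end = start + length - 1
        if end >= first:
            numbers.append(n)
        n += 1
    return numbers


# Active-site pattern positions reported by ScanPROSITE for caspase 3.
histidine_site_windows = windows_overlapping(108, 122)
cysteine_site_windows = windows_overlapping(154, 165)
print(histidine_site_windows, cysteine_site_windows)
```

Under these assumptions, the cysteine active site pattern falls within sub-sequences 15 to 17, which include the sub-sequence 16 windows listed above for the mouse and human sequences.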
Chime
The results obtained with the above tools were summarized in an interactive web presentation using Chime.
Comments? Questions? Suggestions? Please e-mail rakarnik@davidson.edu.
Copyright 2000 Rahul Karnik.