*This page is part of an undergraduate assignment for Molecular Biology at Davidson College.
By Elizabeth Shafer
Orthologs
I ran a BLASTp search with the first 40 amino acids of IDP2 to look for structural homologs.
Below, I give the nucleotide sequence and amino acid sequence of five IDP2 orthologs and reproduce a sequence alignment among them.
Isocitrate Dehydrogenase from E. coli (1BL5).
Escherichia coli is an enterobacterium. It has a gene that encodes isocitrate dehydrogenase.
Nucleotide Sequence CDS 1251 bp
1 atggaaagta aagtagttgt tccggcacaa ggcaagaaga tcaccctgca aaacggcaaa
61 ctcaacgttc ctgaaaatcc gattatccct tacattgaag gtgatggaat cggtgtagat
121 gtaaccccag ccatgctgaa agtggtcgac gctgcagtcg agaaagccta taaaggcgag
181 cgtaaaatct cctggatgga aatttacacc ggtgaaaaat ccacacaggt ttatggtcag
241 gacgtctggc tgcctgctga aactcttgat ctgattcgtg aatatcgcgt tgccattaaa
301 ggtccgctga ccactccggt tggtggcggt attcgctctc tgaacgttgc cctgcgccag
361 gaactggatc tctacatctg cctgcgtccg gtacgttact atcagggcac tccaagcccg
421 gttaaacacc ctgaactgac cgatatggtt atcttccgtg aaaactcgga agacatttat
481 gcgggtatcg aatggaaagc agactctgcc gacgccgaga aagtgattaa attcctgcgt
541 gaagagatgg gggtgaagaa aattcgcttc ccggaacatt gtggtatcgg tattaagccg
601 tgttcggaag aaggcaccaa acgtctggtt cgtgcagcga tcgaatacgc aattgctaac
661 gatcgtgact ctgtgactct ggtgcacaaa ggcaacatca tgaagttcac cgaaggagcg
721 tttaaagact ggggctacca gctggcgcgt gaagagtttg gcggtgaact gatcgacggt
781 ggcccgtggc tgaaagttaa aaacccgaac actggcaaag agatcgtcat taaagacgtg
841 attgctgatg cattcctgca acagatcctg ctgcgtccgg ctgaatatga tgttatcgcc
901 tgtatgaacc tgaacggtga ctacatttct gacgccctgg cagcgcaggt tggcggtatc
961 ggtatcgccc ctggtgcaaa catcggtgac gaatgcgccc tgtttgaagc cacccacggt
1021 actgcgccga aatatgccgg tcaggacaaa gtaaatcctg gctctattat tctctccgct
1081 gagatgatgc tgcgccacat gggttggacc gaagcggctg acttaattgt taaaggtatg
1141 gaaggcgcaa tcaacgcgaa aaccgtaacc tatgacttcg agcgtctgat ggatggcgct
1201 aaactgctga aatgttcaga gtttggtgac gcgatcatcg aaaacatgta a
Amino Acid Sequence 416 aa
MESKVVVPAQGKKITLQNGKLNVPENPIIPYIEGDGIGVDVTPA
MLKVVDAAVEKAYKGERKISWMEIYTGEKSTQVYGQDVWLPAETLDLIREYRVAIKGP
LTTPVGGGIRSLNVALRQELDLYICLRPVRYYQGTPSPVKHPELTDMVIFRENSEDIY
AGIEWKADSADAEKVIKFLREEMGVKKIRFPEHCGIGIKPCSEEGTKRLVRAAIEYAI
ANDRDSVTLVHKGNIMKFTEGAFKDWGYQLAREEFGGELIDGGPWLKVKNPNTGKEIV
IKDVIADAFLQQILLRPAEYDVIACMNLNGDYISDALAAQVGGIGIAPGANIGDECAL
FEATHGTAPKYAGQDKVNPGSIILSAEMMLRHMGWTEAADLIVKGMEGAINAKTVTYD
FERLMDGAKLLKCSEFGDAIIENM
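The CDS length and the protein length listed above are linked by the genetic code: each residue takes three bases, and the last codon is a stop. The sketch below is only an illustration under that assumption, using the standard genetic code; the helper names `translate` and `expected_protein_length` are made up for this example and are not part of any tool used for the assignment. It translates the first line of the E. coli CDS and checks that a 1251 bp CDS corresponds to 416 residues.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(listing: str) -> str:
    """Drop position numbers and spaces from a numbered nucleotide listing."""
    return "".join(ch for ch in listing.upper() if ch in "ACGT")


def translate(listing: str) -> str:
    """Translate a CDS (or its 5' part) in frame 1, stopping at a stop codon."""
    seq = clean(listing)
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def expected_protein_length(cds_bp: int) -> int:
    """Residue count for a complete CDS that ends in one stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("a complete CDS should be a multiple of 3 bp")
    return cds_bp // 3 - 1


# First line of the E. coli CDS as listed above.
first_line = "1 atggaaagta aagtagttgt tccggcacaa ggcaagaaga tcaccctgca aaacggcaaa"
print(translate(first_line))          # MESKVVVPAQGKKITLQNGK
print(expected_protein_length(1251))  # 416
```

The translated fragment matches the first 20 residues of the amino acid sequence given for E. coli.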
Isocitrate Dehydrogenase from Bacillus coagulans (2AYQ_B)
Bacillus coagulans is a bacterium of the genus Bacillus and a moderate thermophile. Its homolog of isocitrate dehydrogenase is a 3-isopropylmalate dehydrogenase, which is involved in leucine biosynthesis at high temperatures.
For more information check:
Tsuchiya D, Sekiguchi T, Takenaka A. Crystal structure of 3-isopropylmalate dehydrogenase from the moderate facultative thermophile, Bacillus coagulans: two strategies for thermostabilization of protein structures.
J Biochem (Tokyo). 1997 Dec;122(6):1092-104.
CDS 1101 bp
1 atgaaaatga aactggccgt actgcccggc gatgggatcg ggccggaagt gatggatgca
61 gcgatccgcg ttttaaaaac agtgttggac aatgacgggc atgaagccgt ttttgaaaat
121 gcgctgattg ggggcgccgc cattgatgaa gcggggacgc ccctaccgga agaaacgctt
181 gacatttgcc gcaggagcga tgccattttg ctcggcgcgg taggggggcc gaaatgggat
241 cataacccgg cttccctccg cccggaaaaa ggcctgctcg ggctccggaa agaaatgggg
301 ctgtttgcga acctgcgccc ggttaaagca tatgccacac ttttaaacgc atcgccttta
361 aaacgggaac gtgtggaaaa cgtcgatctt gttattgtcc gcgaactgac gggcggcctc
421 tattttgggc gcccgagtga aaggcgcggg ccgggcgaga atgaagtggt agacacgctt
481 gcctatacaa gggaagagat tgaaagaatt attgagaaag cattccagct tgcccaaatc
541 agaagaaaaa aactggcatc cgtcgataag gcgaatgtgc tggaatcaag cagaatgtgg
601 cgcgaaattg cggaagaaac cgcgaaaaag tatccggacg tggaattgag ccatatgctt
661 gtcgactcaa cttcgatgca gctgattgca aatccgggcc aatttgatgt cattgtaaca
721 gagaatatgt tcggcgatat tttaagcgat gaagcgtccg tgattaccgg cagcctcggc
781 atgttgccat ccgcaagcct ccgttccgac cggttcggca tgtatgaacc ggtccacggc
841 tccgcgccgg atattgccgg gcagggaaaa gccaacccgc tcgggacagt gctgtcagcg
901 gctttgatgc tccgttattc gttcgggctt gagaaagaag cggcggccat tgaaaaagca
961 gtggatgatg tgcttcaaga cggctattgt acaggcgatt tgcaggtggc aaacggaaaa
1021 gtggtcagta caattgagct cacagaccgg ctgatcgaaa aattaaataa cagcgcagcc
1081 ggtccgcgca tttttcaata a
Amino Acid Sequence 366 aa
1 mkmklavlpg dgigpevmda airvlktvld ndgheavfen aliggaaide agtplpeetl
61 dicrrsdail lgavggpkwd hnpaslrpek gllglrkemg lfanlrpvka yatllnaspl
121 krervenvdl vivreltggl yfgrpserrg pgenevvdtl aytreeieri iekafqlaqi
181 rrkklasvdk anvlessrmw reiaeetakk ypdvelshml vdstsmqlia npgqfdvivt
241 enmfgdilsd easvitgslg mlpsaslrsd rfgmyepvhg sapdiagqgk anplgtvlsa
301 almlrysfgl ekeaaaieka vddvlqdgyc tgdlqvangk vvstieltdr lieklnnsaa
361 rprifq
Isocitrate Dehydrogenase from Thermus thermophilus (1XAA).
Thermus thermophilus is an extremophilic bacterium. Its homolog of isocitrate dehydrogenase is a 3-isopropylmalate dehydrogenase, which is involved in leucine biosynthesis at high temperatures.
CDS 1488 bp
1 atgcccctga tcaccacgga aaccggcaag aagatgcacg ttctcgagga cgggcgcaag
61 ctcatcaccg tcatccccgg agacggcatc gggcccgagt gcgtggaggc taccctcaag
121 gtcctagagg cggccaaggc ccccctggcc tacgaggtgc gagaggcggg ggcgagcgtc
181 ttccggcggg gcatcgcctc gggcgttccc caggagacca ttgagtccat ccgcaagacc
241 cgggtggtcc tgaagggtcc cctggaaacc ccggtgggct acggggagaa gagcgccaac
301 gtcaccctaa ggaagctctt tgagacctac gccaacgtcc gccccgtgcg ggagttcccc
361 aacgtcccca ccccctatgc gggccggggc attgacctcg tggtggtgcg ggagaacgtg
421 gaggacctct acgccgggat tgagcacatg cagaccccga gcgtggccca gaccctcaag
481 ctcatctcct ggaagggatc ggagaagatc gtccgcttcg cctttgagct ggcccgggcc
541 gaggggcgga agaaggtcca ctgcgccacc aagtccaaca tcatgaagct cgccgaagga
601 cccaagcggg cctttgagca ggtggcccag gagtaccccg acatagaagc ggtccacatc
661 atcgtggaca acgctgccca ccagctggtg aaaaggcccg agcagtttga ggtgatcgtc
721 accaccaaca tgaacggaga catcctctcc gacctcacct cggggctcat tgggggcctg
781 ggcttcgctc cctcggccaa catcggcaac gaggtggcca tctttgaggc cgtccacggt
841 tccgccccca agtacgccgg gaagaacgtc atcaacccca ccgcggtcct cctctcggcg
901 gtgatgatgc tccgctacct ggaggagttc gccacggcgg accttataga gaacgccctc
961 ctctacaccc tcgaggaggg ccgggtcctc acgggggacg tggtgggcta cgaccggggg
1021 gccaagacca cggagtacac cgaggccatc atccagaacc tgggcaagac cccaaggaag
1081 acccaggtgc ggggctacaa gcccttccgc ctgccccagg tggacggggc catcgccccc
1141 atcgtcccta ggagccgccg ggttgtgggg gtggacgtct tcgtggaaac caacctcctg
1201 cccgaggccc tgggaaaggc cctggaggac cttgccgcgg gcaccccctt ccggctcaag
1261 atgatctcca accggggcac ccaggtctac ccccccaccg gcgggctcac ggacctggtg
1321 gaccactacc gctgccgctt cctctacacg ggggaggggg aggctaagga cccggagatc
1381 ctggacctcg taagccgggt ggcaagccgc ttccgctgga tgcacctgga gaagctccag
1441 gaatttgacg gcgagcccgg cttcaccaag gcccaagggg aagactaa
Amino Acid Sequence 345 aa
1 mkvavlpgdg igpevteaal kvlraldeae glglayevfp fggaaidafg epfpeptrkg
61 veeaeavllg svggpkwdgl prkirpetgl lslrksqdlf anlrpakvfp glerlsplke
121 eiargvdvli vreltggiyf geprgmseae awnteryskp evervarvaf eaarkrrkhv
181 vsvdkanvle vgefwrktve evgrgypdva lehqyvdama mhlvrsparf dvvvtgnifg
241 dilsdlasvl pgslgllpsa slgrgtpvfe pvhgsapdia gkgianptaa ilsaammleh
301 afglvelark vedavakall etpppdlggs agteaftatv lrhla
Isocitrate Dehydrogenase from Brassica napus (gi 126201)
Brassica napus is a flowering plant. Its homolog of isocitrate dehydrogenase is a 3-isopropylmalate dehydrogenase, which is synthesized as a chloroplast precursor.
For more information check:
Ellerstrom M, Josefsson LG, Rask L, Ronne H. Cloning of a cDNA for rape chloroplast 3-isopropylmalate dehydrogenase by genetic complementation in yeast.
Plant Mol Biol. 1992 Feb;18(3):557-66.
CDS 1221 bp
1 atggcggcgg ctctgcagac taacatccga ccggttaagt ttccggctac gttgagagct
61 ctcaccaaac aatcttctcc agcacccttt agagtgagat gcgccgctgc ttcccccggg
121 aaaaagagat acaatatcac tctccttccc ggcgatggaa tcggtccgga ggtcatctcc
181 atcgctaaaa atgtgcttca gcaagctggt tccttggaag gtctggagtt tagcttccag
241 gaaatgcctg taggaggagc tgctttggat ttggtcggag tgcctttgcc tgaggagacc
301 gtctcggctg ctaaagaatc agatgctgtg cttcttggag ccattggagg gtacaaatgg
361 gataagaatg aaaaacattt gaagcctgag actgggttac ttcaacttcg ggctggtctt
421 aaagtctttg ctaatctgag acctgctaca gttcttccac agttagtgga tgcttcgacc
481 ttgaagagag aggttgcaga aggtgttgat ctgatggttg ttagggagct tacaggaggt
541 atttactttg gagtgccaag gggcattaag actaatgaaa atggtgagga agttgggtat
601 aataccgagg tctatgctgc tcacgagatt gatagaattg ctcgtgttgc cttcgagact
661 gctcggaaac ggcgtggcaa gctgtgttct gttgacaaag ctaatgtctt agatgcctcg
721 attttatgga ggagacgagt aacagcacta gctgctgaat atccggatgt tgaactgtca
781 catatgtatg ttgacaatgc tgccatgcag cttgttcgtg accctaaaca gtttgacacc
841 attgttacaa acaacatttt tggtgatata ttatccgatg aagcgtcgat gatcacagga
901 agcatcggca tgcttccctc tgctagtctc agtgattcgg gacctggact ctttgaacct
961 atacatggtt ctgcacctga tattgctgga caggataaag caaacccgtt ggcaaccatc
1021 ctcagcgctg caatgcttct gaaatacgga ctcggagagg agaaggcagc taagagaatc
1081 gaagacgctg tgttgggtgc tctgaacaaa ggattcagaa caggagacat ctactccgca
1141 ggaactaaac ttgtgggctg caaggagatg ggagaggaag ttctgaagtc agtggattcc
1201 cacgttcaag cttctgttta a
Amino Acid Sequence 406 aa
1 maaalqtnir pvkfpatlra ltkqsspapf rvrcaaaspg kkrynitllp gdgigpevis
61 iaknvlqqag sleglefsfq empvggaald lvgvplpeet vsaakesdav llgaiggykw
121 dknekhlkpe tgllqlragl kvfanlrpat vlpqlvdast lkrevaegvd lmvvreltgg
181 iyfgvprgik tnengeevgy ntevyaahei driarvafet arkrrgklcs vdkanvldas
241 ilwrrrvtal aaeypdvels hmyvdnaamq lvrdpkqfdt ivtnnifgdi lsdeasmitg
301 sigmlpsasl sdsgpglfep ihgsapdiag qdkanplati lsaamllkyg lgeekaakri
361 edavlgalnk gfrtgdiysa gtklvgckem geevlksvds hvqasv
Isocitrate Dehydrogenase from Neurospora crassa (gi_462502)
Neurospora crassa is a eukaryotic fungus. Its homolog of isocitrate dehydrogenase is a 3-isopropylmalate dehydrogenase, which is involved in leucine biosynthesis.
For more information check:
Li Q, Jarai G, Yaghmai B, Marzluf GA. The leu-1 gene of Neurospora crassa: nucleotide and deduced amino acid sequence comparisons. Gene 1993 Dec 22;136(1-2):301-5
CDS 1107 bp
1 atggctactc ataacattgt tgtgttcggt ggtgaccact gcggtcccga ggttgttctc
61 gaggccatca aggtcctcaa ggcgatcgag accaacagcc cttcggcgtg caagttcaac
121 ctccagaacc acctccttgg cggtgcctcc atcgacaagc acaatgaccc cctcaccgat
181 gaggccctca acgccgccaa ggctgccgat gccgtccttc tcggtgccat tggcggtccc
241 gaatggggca cctcttccac cgtccgcccc gagcaaggtc tcctgaagct ccgcaaggag
301 ctcggcacct atggcaacct tcgcccttgc aactttgctt ccgagtccct cgtcgacagc
361 tctcccctca aggccgaggt ctgccgcggc actgacttca ttgtcgtccg tgagcttacc
421 ggtggtatct actttggtga ccgcaccgag gatgacggct ccggctacgc ctgcgatacc
481 gagccctaca gccgcgccga gatcgtgcgc atcgccagac tcgccggctt cctcgccctg
541 gccaagaact ctcccgccaa ggtctggtct ctggacaagg ccaacgtgct cgccaccagc
601 cgcctctggc gcaagactgt gaccgacgtc attagcaagg agtgccccca gcttcagctc
661 gagcaccagc tcatcgacag cgccgccatg ctgctcgtca agaacccccg tgccctcaac
721 ggcgtcgtca ttaccagcaa cctctttggc gacatcatct cggacgaggc ctcggtcatc
781 cccggctcca tcggcctgct cccttccgcc agctggggcg gaatccccga cggcaaggtc
841 aagtgcaacg gcatttacga gcccatccac ggctccgctc ccgatatttc gggtaagggc
901 atcgtcaacc ccgtctgtac cattctctcc gtcgccatga tgctccgcta ctcgctcaac
961 ctccccaagg aggccgatgc cgttgaggct gctgtcaagg cagccattga caacggtacc
1021 aagaccaagg accttggcgg caacgctact acttcggata tgggtaacgc tgtagttgct
1081 gagttggaga agatccttaa ggcttaa
Amino Acid Sequence 368 aa
1 mathnivvfg gdhcgpevvl eaikvlkaie tnspsackfn lqnhllggas idkhndpltd
61 ealnaakaad avllgaiggp ewgtsstvrp eqgllklrke lgtygnlrpc nfaseslvds
121 splkaevcrg tdfivvrelt ggiyfgdrte ddgsgyacdt epysraeivr iarlagflal
181 aknspakvws ldkanvlats rlwrktvtdv iskecpqlql ehqlidsaam llvknpraln
241 gvvitsnlfg diisdeasvi pgsigllpsa swggipdgkv kcngiyepih gsapdisgkg
301 ivnpvctils vammlrysln lpkeadavea avkaaidngt ktkdlggnat tsdmgnavva
361 elekilka
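The numbered listings above put a position index at the start of each line, followed by blocks of ten residues. A minimal sketch of how such a listing could be collapsed into one plain sequence and counted is shown below; the function name `parse_numbered_listing` is hypothetical, and the approach assumes that every purely numeric field is a position index rather than sequence.

```python
def parse_numbered_listing(text: str) -> str:
    """Join the residue blocks of a numbered listing into one uppercase string."""
    blocks = []
    for line in text.splitlines():
        for field in line.split():
            if field.isdigit():
                continue  # assumed to be a position index, not sequence
            blocks.append(field)
    return "".join(blocks).upper()


# Neurospora crassa amino acid listing, copied from above.
listing = """\
1 mathnivvfg gdhcgpevvl eaikvlkaie tnspsackfn lqnhllggas idkhndpltd
61 ealnaakaad avllgaiggp ewgtsstvrp eqgllklrke lgtygnlrpc nfaseslvds
121 splkaevcrg tdfivvrelt ggiyfgdrte ddgsgyacdt epysraeivr iarlagflal
181 aknspakvws ldkanvlats rlwrktvtdv iskecpqlql ehqlidsaam llvknpraln
241 gvvitsnlfg diisdeasvi pgsigllpsa swggipdgkv kcngiyepih gsapdisgkg
301 ivnpvctils vammlrysln lpkeadavea avkaaidngt ktkdlggnat tsdmgnavva
361 elekilka
"""

protein = parse_numbered_listing(listing)
print(len(protein))  # 368
```

The count of 368 residues agrees with the 1107 bp CDS, which holds 369 codons including the stop codon.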
Amino Acid Sequence Alignment:
                       10        20        30        40        50        60
               ....*....|....*....|....*....|....*....|....*....|....*....|
consensus    1 KIALLPGDGIGPEVTEAALKVLKAALEKAP---LEFEFEEYLVGGAAIDATG--EPLPDE 55
1BL5        26 IIPYIEGDGIGVDVTPAMLKVVDAAVEKAYkgeRKISWMEIYTGEKSTQVYGqdVWLPAE 85
2AYQ_B       4 KLAVLPGDGIGPEVMDAAIRVLKTVLDNDG---HEAVFENALIGGAAIDEAG--TPLPEE 58
1XAA         2 KVAVLPGDGIGPEVTEAALKVLRALDEAEG---LGLAYEVFPFGGAAIDAFG--EPFPEP 56
gi 126201   45 NITLLPGDGIGPEVISIAKNVLQQAGSLEG---LEFSFQEMPVGGAALDLVG--VPLPEE 99
gi 462502    5 NIVVFGGDHCGPEVVLEAIKVLKAIETNSPs-aCKFNLQNHLLGGASIDKHN--DPLTDE 61

                       70        80        90       100       110       120
               ....*....|....*....|....*....|....*....|....*....|....*....|
consensus   56 TLEACRKADAVLKGAVGGPKWd--pGEVRPENglLALRKELDLYANLRPVKVYp-aLGDK 112
1BL5        86 TLDLIREYRVAIKGPLTTPVG----GGIRSLN--VALRQELDLYICLRPVRYY---QGTP 136
2AYQ_B      59 TLDICRRSDAILLGAVGGPKwdhnpASLRPEKglLGLRKEMGLFANLRPVKAYa-tLLNA 117
1XAA        57 TRKGVEEAEAVLLGSVGGPKwdglpRKIRPETglLSLRKSQDLFANLRPAKVFp-gLERL 115
gi 126201  100 TVSAAKESDAVLLGAIGGYKwdkneKHLKPETglLQLRAGLKVFANLRPATVLp-qLVDA 158
gi 462502   62 ALNAAKAADAVLLGAIGGPEWgt-sSTVRPEQglLKLRKELGTYGNLRPCNFAsesLVDS 120

                      130       140       150       160       170       180
               ....*....|....*....|....*....|....*....|....*....|....*....|
consensus  113 SPLKNEvvEGVDIVIVRELTGGIYFGIEKGIDG-----------------SGN-GEEVAV 154
1BL5       137 SPVKHP--ELTDMVIFRENSEDIYAGIEWKADSadaekvikflreemgvkKIRfPEHCGI 194
2AYQ_B     118 SPLKRErvENVDLVIVRELTGGLYFGRPSER-------------------RGPgENE-VV 157
1XAA       116 SPLKEEiaRGVDVLIVRELTGGIYFGEPRGMS-----------------------EAEAW 152
gi 126201  159 STLKREvaEGVDLMVVRELTGGIYFGVPRGIKT-----------------NEN-GEEVGY 200
gi 462502  121 SPLKAEvcRGTDFIVVRELTGGIYFGDRTEDDG-----------------SG-----YAC 158

                      190       200       210       220       230       240
               ....*....|....*....|....*....|....*....|....*....|....*....|
consensus  155 DTKLYSRDEIERIARAAFELARKRG-RKKVTSVDKANVLKSSD----LWREIVAEVAAK- 208
1BL5       195 GIKPCSEEGTKRLVRAAIEYAIAND-RDSVTLVHKGNIMKFTEgafkDWGYQLAREEFGg 253
2AYQ_B     158 DTLAYTREEIERIIEKAFQLAQIR--RKKLASVDKANVLESSR----MWREIAEETAK-- 209
1XAA       153 NTERYSKPEVERVARVAFEAARKR--RKHVVSVDKANVLEVGE----FWRKTVEEVGR-- 204
gi 126201  201 NTEVYAAHEIDRIARVAFETARKR--RGKLCSVDKANVLDASI----LWRRRVTALAA-- 252
gi 462502  159 DTEPYSRAEIVRIARLAGFLALAKNsPAKVWSLDKANVLATSR----LWRKTVTDVISK- 213

                      250       260       270       280       290       300
               ....*....|....*....|....*....|....*....|....*....|....*....|
consensus  209 ---------------EYPDIELEHMLVDNAAMQLVKNP-KQFDVIVTPNLFGDILSDEAS 252
1BL5       254 elidggpwlkvknpnTGKEIVIKDVIADAFLQQILLRP-AEYDVIACMNLNGDYISDALA 312
2AYQ_B     210 ---------------KYPDVELSHMLVDSTSMQLIANP-GQFDVIVTENMFGDILSDEAS 253
1XAA       205 ---------------GYPDVALEHQYVDAMAMHLVRSP-ARFDVVVTGNIFGDILSDLAS 248
gi 126201  253 ---------------EYPDVELSHMYVDNAAMQLVRDP-KQFDTIVTNNIFGDILSDEAS 296
gi 462502  214 ---------------ECPQLQLEHQLIDSAAMLLVKNPrALNGVVITSNLFGDIISDEAS 258

                      310       320       330       340       350       360
               ....*....|....*....|....*....|....*....|....*....|....*....|
consensus  253 MLTGSLGMLPSASLGPDGf------ALFEPVHGSAPDIAGKDKANPIATILSAAMMLRHs 306
1BL5       313 AQVGGIGIAPGANIGDEC-------ALFEATHGTAPKYAGQDKVNPGSIILSAEMMLRH- 364
2AYQ_B     254 VITGSLGMLPSASLRSdr------fGMYEPVHGSAPDIAGQGKANPLGTVLSAALMLRYs 307
1XAA       249 VLPGSLGLLPSASLGrg-------tPVFEPVHGSAPDIAGKGIANPTAAILSAAMMLEHa 301
gi 126201  297 MITGSIGMLPSASLSDsg------pGLFEPIHGSAPDIAGQDKANPLATILSAAMLLKYg 350
gi 462502  259 VIPGSIGLLPSASWGGIPdgkvkcnGIYEPIHGSAPDISGKGIVNPVCTILSVAMMLRYs 318

                      370       380       390       400
               ....*....|....*....|....*....|....*....|....*..
consensus  307 LGLED-AaDAIEAAVLKTLEAGIRTKDLAGNAD--KYVSTSEFGDAV 350
1BL5       365 MGWTEaA-DLIVKGMEGAINAKTVTYDFERLMDgaKLLKCSEFGDAI 410
2AYQ_B     308 FGLEK-EaAAIEKAVDDVLQDGYCTGDLQVANG--KVVSTIELTDRL 351
1XAA       302 FGLVE-LaRKVEDAVAKALLETP-PPDLGGSAG--TEAFTATVLRHL 344
gi 126201  351 LGEEK-AaKRIEDAVLGALNKGFRTGDIYSAGT--KLVGCKEMGEEV 394
gi 462502  319 LNLPK-EaDAVEAAVKAAIDNGTKTKDLGGNA------TTSDMGNAV 358
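One simple way to summarize an alignment like this is the percent identity between two of its rows. The sketch below is an illustration only, not the method used to build the alignment: the function name `percent_identity` is hypothetical, columns where either row has a gap (`-`) are skipped, and lowercase letters are compared as ordinary residues, which is an assumption about how to read this alignment format.

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-") -> float:
    """Percent of gap-free columns in which two aligned rows carry the same residue."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either row
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# First 60 alignment columns of two rows, copied from the alignment above.
row_2ayq_b = "KLAVLPGDGIGPEVMDAAIRVLKTVLDNDG---HEAVFENALIGGAAIDEAG--TPLPEE"
row_1xaa = "KVAVLPGDGIGPEVTEAALKVLRALDEAEG---LGLAYEVFPFGGAAIDAFG--EPFPEP"
print(round(percent_identity(row_2ayq_b, row_1xaa), 1))
```

Concatenating all seven blocks of a row before the comparison would give the identity over the whole aligned region rather than over the first 60 columns.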
Created by: Elizabeth Shafer. Email questions to lishafer@davidson.edu