MacDNAsis Analysis of Human Hexokinase
MacDNAsis is a computer program that was used to analyze
the sequence of the human hexokinase cDNA. In this
analysis MacDNAsis is used to determine the largest open reading frame,
the molecular weight, a hydropathy plot, an antigenicity plot, the secondary
structure, a multiple sequence alignment for five species with hexokinase,
and a phylogenetic tree using the same five species.
__________________________________________________________________________________________________________________
Largest Open Reading Frame
The largest open reading frame (ORF) between a start codon and a stop codon of
the human hexokinase cDNA was determined and assumed to be the coding sequence
for human hexokinase. The coding region extends from nucleotide 82
to nucleotide 2835. The original human cDNA from which this sequence was
derived can be seen by clicking here.
Fig. 1: This shows the three forward reading frames (frame 1 on the top and
frame 3 on the bottom) of the cDNA for human hexokinase. The red triangles
represent start codons and the green lines represent stop codons. The largest
open reading frame is indicated by the black box extending from nucleotide
82 to 2835, for a total of 2754 nucleotides (counting both endpoints).
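The ORF search described above can be sketched in a few lines of Python. This is only an illustration of the idea, not MacDNAsis's own method: the function name `longest_orf` is made up here, only the three forward frames are scanned, and the reported end position is assumed to include the stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_orf(dna):
    """Return (start, end, frame) of the longest ATG...stop ORF.

    Positions are 1-based and inclusive; `end` is the last base of the
    stop codon (an assumption of this sketch). Returns None if no
    complete ORF is found on the forward strand.
    """
    dna = dna.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon after the previous stop
            elif start is not None and codon in STOP_CODONS:
                length = i + 3 - start
                if best is None or length > best[1] - best[0] + 1:
                    best = (start + 1, i + 3, frame + 1)
                start = None  # look for the next ORF in this frame
    return best

# Toy sequence (not the hexokinase cDNA): ATG starts at base 3 in frame 3.
print(longest_orf("CCATGGCCGCCGCCTAAG"))
```

Running the same function on the full cDNA would give coordinates that can be compared with the black box in Fig. 1.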
________________________________________________________________________________________________________________________
Determination of Molecular Weight
First the DNA of the ORF was translated into the corresponding amino acid
sequence using the genetic code. Then the weights of the amino acids
were added together to obtain the total weight of the protein in daltons.
The weight was 102497.74 daltons.
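The two steps above (translate, then sum the amino acid weights) can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the standard genetic code and average residue masses, adds one water for the free ends of the chain, and the names `translate` and `molecular_weight` are chosen here for illustration. MacDNAsis may use a slightly different mass table, so the result need not match 102497.74 exactly.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528

def translate(orf):
    """Translate an ORF (starting at ATG) up to the first stop codon.

    Raises KeyError on ambiguous bases such as N.
    """
    protein = []
    for i in range(0, len(orf) - 2, 3):
        aa = CODON_TABLE[orf[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

def molecular_weight(protein):
    """Sum the residue masses and add one water for the chain termini."""
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER

print(translate("ATGGCCTAA"))
print(round(molecular_weight("GA"), 3))
```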
______________________________________________________________________________________________________________________
Kyte and Doolittle Hydropathy Plot
Here MacDNAsis was used to prepare a hydropathy plot of
hexokinase. A hydropathy plot, or Kyte and Doolittle plot, shows the
hydrophobicity of a protein along the y axis and the amino acid
position along the x axis. The Kyte and Doolittle plot
is used to predict whether a protein is a transmembrane protein.
Peaks above 2 indicate strongly hydrophobic regions, which make the protein a
strong candidate for a transmembrane protein. A transmembrane protein
must be hydrophobic in some regions in order to be compatible with the
hydrophobic interior of a membrane, which lies between the cytoplasmic
side and the extracellular side, or between the lumen side and the
cytoplasmic side.
Fig. 2: Pictured here is a Kyte and Doolittle hydropathy plot.
The average is -0.12, so the protein is slightly hydrophilic overall. There
are, however, 10 peaks that rise above 2, at approximately amino acids 50,
110, 230, 420, 480, 540, 595, 610, 670, and 740. The most hydrophobic
region is around amino acid 50, where the plot rises almost to 4. There is a
fairly even stretch of peaks from the 420th amino acid to the end, which seems
to indicate a fairly hydrophobic region. Taken together, these data suggest
that this protein is a transmembrane protein.
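A hydropathy profile of this kind is a sliding-window average of per-residue Kyte and Doolittle values. The sketch below shows the calculation; the window size of 9 is an assumption (the window MacDNAsis used is not stated here), and the function names are illustrative.

```python
# Kyte and Doolittle hydropathy values per amino acid.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(protein, window=9):
    """Return [(position, mean hydropathy)] for each full odd-sized window.

    `position` is the 1-based residue at the center of the window.
    """
    half = window // 2
    profile = []
    for center in range(half, len(protein) - half):
        segment = protein[center - half:center + half + 1]
        score = sum(KD[aa] for aa in segment) / window
        profile.append((center + 1, score))
    return profile

def mean_hydropathy(protein):
    """Average hydropathy over the whole sequence."""
    return sum(KD[aa] for aa in protein) / len(protein)

def positions_above(profile, threshold=2.0):
    """Positions whose windowed score exceeds the threshold."""
    return [pos for pos, score in profile if score > threshold]

# Toy peptide: a hydrophobic stretch flanked by charged residues.
toy = "KKDEILLVVAILLKKDE"
print(positions_above(hydropathy_profile(toy, window=5)))
```

Applied to the translated hexokinase sequence, `mean_hydropathy` corresponds to the average quoted in Fig. 2 and `positions_above` to the peaks above 2, although the exact values depend on the window size.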
_______________________________________________________________________________________________________________________
Hopp and Woods Antigenicity Plot
In this section an antigenicity plot was made. These plots show hydrophilicity
instead of hydrophobicity like the Kyte and Doolittle plot. A more
hydrophilic region is more likely to be exposed on the protein surface and
is therefore a better place for an antibody to bind. Thus
an antigenicity plot helps to determine where on a protein a monoclonal antibody
would bind well. Again the x axis represents the individual amino
acids, but the y axis represents the hydrophilicity.
Fig. 3: This is a Hopp and Woods hydrophilicity plot. The protein seems
to have a fairly even distribution of hydrophilic and hydrophobic regions,
but there are several hydrophilic peaks. There are relatively high
peaks, at about 2, at approximately amino acids 150, 290, and 650. A few
additional peaks come very close to 2 at approximately amino acids 250,
350, and 550.
These are the most hydrophilic sites, which would be best for antibody binding.
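The peak picking described for Fig. 3 can be sketched with the Hopp and Woods hydrophilicity scale. The window of 6 residues follows the usual Hopp and Woods convention and is an assumption about what MacDNAsis used; the function names are illustrative.

```python
# Hopp and Woods hydrophilicity values per amino acid.
HW = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}

def hydrophilicity_profile(protein, window=6):
    """Return [(start, mean hydrophilicity)] for each full window.

    `start` is the 1-based position of the first residue in the window.
    """
    profile = []
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        profile.append((start + 1, sum(HW[aa] for aa in segment) / window))
    return profile

def top_hydrophilic(protein, n=3, window=6):
    """Return the n windows with the highest mean hydrophilicity."""
    profile = hydrophilicity_profile(protein, window)
    return sorted(profile, key=lambda item: item[1], reverse=True)[:n]

# Toy peptide: a charged stretch in the middle of a hydrophobic one.
print(top_hydrophilic("LLLLLLKKKKKKLLLLLL", n=1))
```

The highest-scoring windows are the candidate antibody-binding regions, in the same sense as the peaks near 2 in Fig. 3.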
_________________________________________________________________________________________________________________________
Prediction of Protein Secondary Structure
The Chou, Fasman, and Rose plot can be used to
predict the secondary structure of hexokinase. Protein secondary structure
is due to hydrogen bonding between backbone N-H and C=O groups, whose
nitrogen and oxygen atoms are quite electronegative.
These hydrogen bonds are individually weak, but together they stabilize
different conformations within the protein, including alpha helices, beta
pleated sheets, turns, and coiled regions.
Fig. 4: This figure shows the locations of alpha helices (blue),
beta pleated sheets (red), turns (green), and coils (black). This protein
of 918 amino acids seems to have many alpha helices and beta pleated sheets
throughout, with 8 turns and a few coil regions. This image
can be compared to the RasMol image of hexokinase here.
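To give a feel for how a propensity-based prediction works, here is a heavily simplified sketch in the spirit of Chou and Fasman. It is not the full algorithm (there is no nucleation, extension, or turn prediction) and not what MacDNAsis computes: it averages helix and sheet propensities over a small window and applies the usual 1.03 and 1.05 cutoffs. The propensity values are the commonly tabulated Chou and Fasman numbers and should be checked against a published table before any real use.

```python
# (helix propensity, sheet propensity) per amino acid, as commonly tabulated.
PROPENSITY = {
    "A": (1.42, 0.83), "R": (0.98, 0.93), "N": (0.67, 0.89),
    "D": (1.01, 0.54), "C": (0.70, 1.19), "Q": (1.11, 1.10),
    "E": (1.51, 0.37), "G": (0.57, 0.75), "H": (1.00, 0.87),
    "I": (1.08, 1.60), "L": (1.21, 1.30), "K": (1.16, 0.74),
    "M": (1.45, 1.05), "F": (1.13, 1.38), "P": (0.57, 0.55),
    "S": (0.77, 0.75), "T": (0.83, 1.19), "W": (1.08, 1.37),
    "Y": (0.69, 1.47), "V": (1.06, 1.70),
}

def predict_secondary(protein, window=5):
    """Label each residue H (helix), E (sheet), or C (coil/other).

    Simplification: the label comes from the mean propensities of the
    residues in a window centered on the position (truncated at the ends).
    """
    half = window // 2
    labels = []
    for i in range(len(protein)):
        segment = protein[max(0, i - half):i + half + 1]
        helix = sum(PROPENSITY[aa][0] for aa in segment) / len(segment)
        sheet = sum(PROPENSITY[aa][1] for aa in segment) / len(segment)
        if helix >= 1.03 and helix > sheet:
            labels.append("H")
        elif sheet >= 1.05 and sheet > helix:
            labels.append("E")
        else:
            labels.append("C")
    return "".join(labels)

print(predict_secondary("EEEEEEEEGGGGGGVVVVVVVV"))
```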
_________________________________________________________________________________________________________________________
Multiple Sequence Alignment
In this multiple sequence alignment the hexokinase protein was compared
in five species for amino acid sequence similarity. The five species
that were examined were:
Homo sapiens (previously examined in this MacDNAsis analysis)
Bos taurus
Yarrowia lipolytica
Arabidopsis thaliana
Mus musculus
Click on the species to see the DNA and protein sequence from my GenBank
search.
Fig. 5: This is a small portion of the diagram comparing the
entire protein sequences of hexokinases in 5 species. This small
segment compares amino acids 201 to 250. The row marked protein stands
for Homo sapiens. The row marked mus stands for Mus musculus.
The row marked bos taurus stands for Bos taurus. The row marked yarrow
stands for Yarrowia lipolytica. The row marked arab stands for Arabidopsis
thaliana. The highlighted regions indicate amino acids that are
similar in different species. Plain blue letters are amino acids that don't
seem to correlate with the other species. Dashes indicate amino acids that
are absent in that particular species but present in another. These
dashes are used in order to best align all the sequences. There seems
to be a great deal of similarity in amino acids among the human, mouse, and
Bos taurus sequences, which seems reasonable because all three species are
mammals. Yarrowia lipolytica and Arabidopsis thaliana (plant)
seem to correlate with the human and other mammals less closely, probably
because they are not very closely related to mammals. Between amino
acids 201 and about 240 these two species seem to be lacking many amino
acids that the mammals have. Then between 240 and 250 all five species
seem to have the greatest number of matching amino acids.
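Once the sequences are aligned with dashes for gaps, similarity can be quantified column by column. The sketch below computes percent identity between two aligned rows and lists the columns where every row agrees. The function names are illustrative, the toy rows are made up (they are not the hexokinase alignment), and counting a gap against a residue as a mismatch is one convention among several.

```python
def percent_identity(row_a, row_b):
    """Percent of aligned columns where two rows carry the same residue.

    Columns where both rows have a gap are ignored; a gap opposite a
    residue counts as a mismatch (an assumption of this sketch).
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = matches = 0
    for x, y in zip(row_a, row_b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

def conserved_columns(rows):
    """1-based column numbers where all rows share one non-gap residue."""
    return [i + 1 for i, column in enumerate(zip(*rows))
            if column[0] != "-" and len(set(column)) == 1]

# Hypothetical aligned rows for illustration only.
rows = ["MK-LV", "MKALI", "MR-LV"]
print(percent_identity(rows[0], rows[1]))
print(conserved_columns(rows))
```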
______________________________________________________________________________________________________________________
Phylogenetic Tree
This is a representation of the overall conservation of amino acids
in hexokinase for the same five species as used above.
This diagram shows the degree of amino acid conservation over time.
Fig. 6: This figure shows a phylogenetic tree for the same five
species compared above, and the same abbreviations apply. It looks
as though the human and the mouse have very similar hexokinase
proteins, with overall compatibility at 89.2%. Bos taurus also
seems to be very similar in amino acid sequence to these two species,
at 88.5%. Yarrowia lipolytica and Arabidopsis thaliana (plant)
are not very similar to each other, at 20%, and the overall compatibility
of all five species is 14.5%. It seems logical that all of the mammals would
have highly conserved sequences, since they all evolved from a recent
common ancestor. Yarrowia lipolytica and Arabidopsis thaliana (plant), on
the other hand, show little amino acid sequence conservation between
themselves and with the other species, probably because they are related
through a more distant ancestor.
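A tree like this can be built from pairwise distances by repeatedly joining the two closest groups (UPGMA clustering). The sketch below is an illustration only: it is not necessarily the method MacDNAsis uses, and the distance matrix is an assumption constructed by taking 100 minus the percentages quoted in Fig. 6 and treating each percentage as the similarity at a branch point.

```python
from itertools import combinations

def upgma(names, dist):
    """Cluster by average linkage and return a nested, Newick-like string.

    `dist` maps frozenset({a, b}) to the distance between a and b.
    """
    sizes = {name: 1 for name in names}  # cluster label -> number of leaves
    d = dict(dist)
    while len(sizes) > 1:
        # Join the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = f"({a},{b})"
        na, nb = sizes.pop(a), sizes.pop(b)
        # Distance to the new cluster is the size-weighted average.
        for c in sizes:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes))

# Assumed distances (100 - percent compatibility), using the row labels
# from the alignment figure.
names = ["human", "mouse", "bos", "yarrow", "arab"]
dist = {frozenset(("human", "mouse")): 10.8,
        frozenset(("human", "bos")): 11.5,
        frozenset(("mouse", "bos")): 11.5,
        frozenset(("yarrow", "arab")): 80.0}
for mammal in ("human", "mouse", "bos"):
    for other in ("yarrow", "arab"):
        dist[frozenset((mammal, other))] = 85.5

print(upgma(names, dist))
```

With these assumed distances the mammals group first (human with mouse, then Bos taurus), and Yarrowia lipolytica and Arabidopsis thaliana join each other before joining the mammals, which matches the grouping described for Fig. 6.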
_________________________________________________________________________________________________________________________
Send comments, questions, and suggestions to:
Sabrautigam@davidson.edu