This webpage was produced as an assignment for an undergraduate course at Davidson College

Proteomic Information for DOT1 and YDR458c

As discussed in the previous assignment, DOT1 has been identified as a histone-lysine N-methyltransferase. While expression data from DNA microarrays confirmed this function, it is also important to investigate the protein itself when examining function. To do this, I mined several databases for information about DOT1; below you will find information about the function, structure, and location of the DOT1 protein.

The BIND database is very helpful because it consolidates the various function annotations for the DOT1 protein, so that we can see where they differ. For DOT1, the functions listed above are all related, so together they point to one general function for the protein.

Here is some basic information on the DOT1 protein; its accession number in the NCBI database is NP_010728:

This sequence information is from the UniProt database and shows us the length and molecular weight of the DOT1 protein.

Another important fact about the protein is that its isoelectric point (pI) is approximately 9.4. This value can be found at the PROWL website under sequence analysis of the protein. The isoelectric point is the 'settling point' of the protein: the pH at which the protein has no net charge. Since the isoelectric point is higher than neutral pH (7.0), we would expect the protein to contain more basic amino acids than acidic ones. The SGD database, shown above, lists a slightly different pI, but the two values are close enough to be considered not significantly different.
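To illustrate why an excess of basic residues pushes the isoelectric point above 7, here is a minimal Python sketch that estimates the net charge of a peptide at a given pH and searches for the pH at which that charge is zero. The pKa values are approximate textbook figures that I am assuming here (published tables differ slightly), the two peptides are made-up toy sequences rather than DOT1, and PROWL and SGD may well use different parameters, so this should not be expected to reproduce their numbers.

```python
# Approximate pKa values (assumed; published tables differ slightly).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0, "N_TERM": 9.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "C_TERM": 2.0}


def net_charge(sequence, ph):
    """Henderson-Hasselbalch estimate of the net charge of a peptide at a given pH."""
    counts = {aa: sequence.count(aa) for aa in "KRHDECY"}
    counts["N_TERM"] = 1
    counts["C_TERM"] = 1
    # Fraction of each basic group that is protonated (charge +1).
    positive = sum(counts[g] / (1 + 10 ** (ph - pka)) for g, pka in POSITIVE_PKA.items())
    # Fraction of each acidic group that is deprotonated (charge -1).
    negative = sum(counts[g] / (1 + 10 ** (pka - ph)) for g, pka in NEGATIVE_PKA.items())
    return positive - negative


def isoelectric_point(sequence, tolerance=0.001):
    """Bisect between pH 0 and 14 for the pH where the net charge crosses zero."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged, so the pI lies at a higher pH
        else:
            high = mid
    return round((low + high) / 2, 2)


# Toy peptides (hypothetical, not the DOT1 sequence).
basic_peptide = "MKKRKAGKRL"
acidic_peptide = "MDEEDAGDEL"
print("lysine/arginine-rich peptide pI:", isoelectric_point(basic_peptide))
print("aspartate/glutamate-rich peptide pI:", isoelectric_point(acidic_peptide))
```

The peptide rich in lysine and arginine comes out with a pI well above 7 and the one rich in aspartate and glutamate well below 7, which is the same reasoning applied to DOT1 above.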
The amino acid sequence of a protein can sometimes give us clues about its function, based on domains conserved in other proteins. I therefore used the Swiss-Prot database to analyze the features of the amino acid sequence of the DOT1 protein.

The images above are from the Swiss-Prot database, and it is very interesting to note that a single mutation at amino acid 401 can abolish the silencing function of the protein. Since DOT stands for Disruptor of Telomeric Silencing, a single mutation that is able to change this function is very significant. This leads me to believe that the amino acid at site 401 could be in the middle of an active site, or part of an important structural element of the protein. This idea is supported by the fact that mutations at residues 399-403 cause loss of methyltransferase activity, one of the main functions of the protein! This tells me I should look at the structure of the protein to see where these amino acids lie in the molecule, as this may give us a clue to their importance.

To do this, I searched the PDB (Protein Data Bank), which contains the structures of many proteins that have been crystallized.

This image is the yeast version of DOT1, and as you can see it has three units, all of which are identical.

If you select glycine and highlight all the glycines, you can find the G at site 401 and notice that it is very close to three other glycines. Although they are in the middle of the protein, they are close to the cavity that forms between the three protein units, and they are arranged along a curve, so perhaps they are important for structure. Since the subunits appear to form a cavity between them, we might guess that this is the location of the active site. I would imagine that the protein interacting with DOT1 enters the cavity to bind the active site of DOT1.

One of the most interesting things to examine when looking at the function of a protein is which other proteins it interacts with. The DIP (Database of Interacting Proteins) is an amazing resource that displays protein interactions as networks, out to the second degree of interaction. Because the information is compiled from many sources, some interactions are better supported than others, but this is still a very useful way to visualize the interactions of your protein.

The graph above shows DOT1 as the red circle in the center of the diagram; the lines connecting it to other circles represent its interactions with those proteins. The orange circles indicate first-degree nodes, and the yellow circles indicate second-degree nodes. Thus DOT1 interacts physically only with the proteins represented by orange circles. Green lines indicate confirmed interactions, and red lines indicate unverified results. Thicker lines indicate that more data support the connection.
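The first-degree and second-degree layout of such a diagram can be sketched with a breadth-first search over a list of pairwise interactions. In this minimal Python example only the DOT1-HHT1 pair is taken from this page; every `PARTNER_*` name is a placeholder I made up for illustration, and this is not how DIP itself stores or draws its data.

```python
from collections import deque

# Hypothetical interaction list: only DOT1-HHT1 comes from this page,
# the PARTNER_* proteins are placeholders.
INTERACTIONS = [
    ("DOT1", "HHT1"),
    ("DOT1", "PARTNER_A"),
    ("HHT1", "PARTNER_B"),
    ("PARTNER_A", "PARTNER_C"),
    ("PARTNER_C", "PARTNER_D"),
]


def build_graph(edges):
    """Turn a list of undirected pairs into an adjacency dictionary."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def neighbors_by_degree(graph, start, max_degree=2):
    """Breadth-first search returning {degree: sorted list of proteins}."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        protein = queue.popleft()
        if distance[protein] == max_degree:
            continue  # do not expand beyond the outermost shell
        for partner in graph.get(protein, ()):
            if partner not in distance:
                distance[partner] = distance[protein] + 1
                queue.append(partner)
    shells = {d: [] for d in range(1, max_degree + 1)}
    for protein, d in distance.items():
        if d > 0:
            shells[d].append(protein)
    return {d: sorted(names) for d, names in shells.items()}


shells = neighbors_by_degree(build_graph(INTERACTIONS), "DOT1")
print("first degree (orange):", shells[1])
print("second degree (yellow):", shells[2])
```

Proteins in the first shell correspond to the orange circles and those in the second shell to the yellow circles; `PARTNER_D` sits three steps away and so would not be drawn.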
Since only one of the connections from the DOT1 protein has been confirmed, it is the interaction most likely to occur. I was able to determine that the protein linked to DOT1 is HHT1, and its function as described by the Saccharomyces Genome Database is shown below:

Although this does not appear to be directly related at first, when we closely examine the data from UniProt, we see the following comments:

As you can see, the functions are closely related, which supports the idea of guilt by association: these two proteins are associated with each other and have similar functions as well, since both are involved in telomere silencing. Perhaps HHT1 is the protein that interacts with the active site of DOT1, or HHT1 could assist in localizing DOT1 to the end of the telomere. It could also be the telomere itself that is inserted into the space between the three subunits of DOT1.

It is interesting to note, however, that another database lists an entirely different set of proteins as interacting with DOT1, and this set does not include HHT1. From the YeastGrid we see that the following proteins have interactions with DOT1p.

However, none of these proteins appears to have a function similar to that of DOT1. Also, none of them is located in the nucleus, where the DOT1 protein is found, so it is doubtful that these proteins would interact with DOT1. The HHT1 protein, however, is located in the nucleus. This makes it a much more plausible partner, because two proteins located in the same part of the cell are more likely to encounter one another.
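This co-localization argument amounts to a simple filter: keep only the candidate partners that share a compartment with DOT1. A minimal Python sketch follows. The nuclear localization of DOT1 and HHT1 comes from this page, but the `CANDIDATE_*` names and their compartments are placeholders I invented to stand in for the YeastGrid proteins, which are described here only as non-nuclear.

```python
# DOT1 and HHT1 localizations are from this page; CANDIDATE_* entries are
# hypothetical placeholders for the non-nuclear YeastGrid partners.
LOCALIZATION = {
    "DOT1": {"nucleus"},
    "HHT1": {"nucleus"},
    "CANDIDATE_1": {"cytoplasm"},
    "CANDIDATE_2": {"mitochondrion"},
}


def colocalized_partners(bait, candidates, localization):
    """Return the candidates that share at least one compartment with the bait."""
    bait_compartments = localization.get(bait, set())
    return [
        name for name in candidates
        if localization.get(name, set()) & bait_compartments
    ]


plausible = colocalized_partners(
    "DOT1", ["HHT1", "CANDIDATE_1", "CANDIDATE_2"], LOCALIZATION
)
print(plausible)  # ['HHT1']
```

A filter like this only ranks plausibility. A protein annotated to a different compartment could still meet DOT1 if its annotation is incomplete, so a failed match is a reason for suspicion rather than proof.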

No information could be found for this protein in the Yeast Two Hybrid database or the Swiss 2D-Page database, and this protein was not included in the figures from Benno et al. or Schwikowski.
For a more in-depth analysis of the function of DOT1 in relation to the several other DOT genes, please see the paper by Singer et al., 1998.


In the previous assignment, linked on my Genomics Homepage, I made several guesses as to the function of the YDR458c gene using sequence analysis and microarray expression data. Now I will examine the role of the protein by checking the same databases used for the DOT1 protein. Since these databases allowed me to confirm the activity of the DOT1 protein, I hope they will also give me an accurate description of the function of the YDR458c protein.


The BIND database once again has consolidated the various GO annotations for the YDR458c protein.

As we can see from the symbols at the top of this image, the function of the protein according to the BIND database is cell multiplication. This agrees with the theory that the protein is involved in mitosis. The interesting new fact is that the protein may bind ATP. As ATP is an energy source for the cell, perhaps YDR458c controls the energy put into mitosis, which could affect the speed with which the cell undergoes mitosis. Or perhaps the protein merely needs to bind ATP in order to use energy for separating the sister chromatids. As the protein is integral to the membrane, it is difficult to believe that it could move around the nucleus, so the segregation of the chromatids would have to be mediated by YDR458c while it is immobilized in the membrane. Perhaps YDR458c is an anchor for the spindles that separate the chromatids.


My first search was done with the UniProt database, where I found a match for YDR458c and learned that the protein has the name SRC1 and has been confirmed to be involved in sister chromatid separation. However, the name SRC1 does not seem to be in use in any other database, so I will continue to refer to this protein as YDR458cp.
To read an in-depth paper about the SRC1 protein, please see Rodriguez-Navarro et al., 2002.

As we can see from other data found at the UniProt database, this protein is 834 amino acids long and has a molecular weight of 95.5 kDa. Again, to find the isoelectric point, I used the PROWL database, which gives an isoelectric point of 8.1. The protein therefore has no net charge at a pH slightly above neutral, and it is not as basic as DOT1p. This isoelectric point is supported by the SGD, where we see the following information about the protein. Again, the two values, 7.8 and 8.1, are not significantly different.

Swiss-Prot

As we can see from the image above, the YDR458c protein has what is most likely a transmembrane domain. This was already suggested by the Kyte-Doolittle plot in the previous analysis. Therefore this protein is most likely an integral membrane protein. From what I could find, poly-serine domains appear to act as sites for transcription-related events such as splicing. Perhaps, then, this protein has alternative forms for which this amino acid region is the splicing signal.
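For readers who have not seen how a Kyte-Doolittle plot is built, here is a minimal Python sketch of the sliding-window calculation, together with a simple search for a serine run. The hydropathy values are the standard Kyte-Doolittle scale. The 19-residue window and 1.6 cutoff are the settings usually quoted for spotting membrane-spanning segments, and I am assuming them here because the settings of the earlier plot are not restated on this page. The test sequence is a made-up toy, not YDR458c.

```python
# Standard Kyte-Doolittle hydropathy scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean Kyte-Doolittle score for every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_windows(sequence, window=19, threshold=1.6):
    """Start positions (0-based) of windows whose mean score exceeds the threshold."""
    profile = hydropathy_profile(sequence, window)
    return [i for i, score in enumerate(profile) if score > threshold]


def longest_serine_run(sequence):
    """Length of the longest uninterrupted stretch of serine residues."""
    best = current = 0
    for aa in sequence:
        current = current + 1 if aa == "S" else 0
        best = max(best, current)
    return best


# Toy sequence (hypothetical): polar stretch, hydrophobic stretch, serine run.
toy = "DKERNQSDKE" + "LIVALLIVAFLLIVALIVA" + "KDSSSSSSSK"
print("candidate transmembrane windows start at:", candidate_tm_windows(toy))
print("longest serine run:", longest_serine_run(toy))
```

A window that clears the cutoff is only a hint of a membrane-spanning segment, which is why the text above says "most likely" rather than treating the domain as proven.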

Unfortunately, the PDB did not have YDR458c in its database, which suggests that the structure of this protein has not yet been determined by crystallization. The protein also had no known interactions in the DIP database, so other avenues had to be explored.


Another place to search for protein-protein interactions is MIPS. Here we find that YDR458c has one interaction; however, it is not a physical interaction.

Below we can see the information known about CDC8 from the UniProt database, with the information from the SGD database directly below that.

The information from these two databases shows us that CDC8 is involved in mitosis and meiosis. This is interesting because one of the hypotheses for the function of YDR458c is that it is involved in mitosis. Note also that the UniProt entry gives the localization of the protein as nucleus and cytoplasm. This may indicate that the protein crosses the nuclear membrane during its normal function. As a transmembrane protein that is localized to the nucleus, perhaps YDR458c assists CDC8 in moving from the nucleus to the cytoplasm. However, the MIPS database indicated that this is a genetic interaction rather than a physical one, so perhaps YDR458c senses the concentration of CDC8 and, through transcriptional regulation, helps control the movement of CDC8 between the nucleus and the cytoplasm. This could explain the poly-serine site at one end of the protein: when the concentration of CDC8 is too high during mitosis, the structure of YDR458c is altered through the splicing site, which then leads to a change in the transcription or movement of CDC8.


From the information gathered above, we can see that while some information on the YDR458c protein is available, we are far from understanding its function well. Several experiments could improve our knowledge of this protein. For example, there was no information on YDR458c in the Yeast Two-Hybrid database, so a Y2H screen could be done to find proteins with which YDR458c interacts physically. If no interacting proteins were found, it would be interesting to test whether YDR458c interacts directly with DNA rather than with other proteins. Also, the protein has not been crystallized: if X-ray crystallography could be done on this protein, we might find more information hidden in the structure of the molecule. For example, where exactly is the poly-serine domain? Is it found inside the nucleus, or on the other side of the membrane? Also, does the function of YDR458c change if you remove the section that comes before the serine-rich area? To better test the theory that YDR458c is involved in mitosis, a protein macroarray could be done during mitosis to see whether the protein is repressed or induced. If it is induced, this would suggest that the protein is necessary for mitosis. Such an experiment would also allow the same kind of analysis that was done on the DNA microarray data: proteins with similar expression patterns could be followed up, as they may have similar functions.

While there are many experiments that can be done on YDR458c to determine its function, we must keep in mind that thousands of other proteins are understood no better than YDR458c. It may therefore be some time before its function is fully known, especially if it is not found to be crucial for any specific process.




Miriam S. Singer, Alon Kahana, Alexander J. Wolf, Lia L. Meisinger, Suzanne E. Peterson, Colin Goggin, Maureen Mahowald, and Daniel E. Gottschling, 1998. "Identification of High-Copy Disruptors of Telomeric Silencing in Saccharomyces cerevisiae". Genetics: 150, 613-632.

Rodriguez-Navarro S, Igual JC, Perez-Ortin JE, 2002. "SRC1: an intron-containing yeast gene involved in sister chromatid segregation." Yeast: 19, 43-54.

Genomics Home Page

Send questions or comments to: Megan McDonald