This web page was produced as an assignment for an undergraduate course at Davidson College.

My Favorite Yeast Genes:

PDI1 & non-annotated YCL047C
 
 
Introduction
 

PDI1 is an annotated gene located on Saccharomyces cerevisiae chromosome 3. PDI1 encodes a protein disulfide isomerase, present in the endoplasmic reticulum, which plays an essential role in the formation of disulfide bonds in secretory and cell-surface proteins (Dolinski 2004). While the function of PDI1 is well known, the neighboring hypothetical ORF YCL047C is entirely uncharacterized. This ORF’s cellular component, molecular function, and biological process are still unknown (Dolinski 2004).

 

 
 

My Favorite Annotated Yeast Gene: PDI1

 

Chromosomal Location

PDI1 is located within a small region of Saccharomyces cerevisiae chromosome 3, as shown below.

 

Figure 1 Chromosomal Location of PDI1. This image shows the location of PDI1 on the Crick strand of yeast chromosome 3 (coordinates 50221-48653). Also visible is the non-annotated YCL047C ORF (left). Image from: http://db.yeastgenome.org/ (Cherry 1997). Permission pending.

 

Biological Process

PDI1 encodes the essential protein disulfide isomerase. The PDI1 protein product catalyzes the formation of native disulfide bonds in secretory proteins, an important factor leading to the overall structure of these proteins (Givol 1995).

 

Molecular Function

The PDI1 protein contains two thioredoxin-like domains (see figure 2), each with a copy of the active site sequence motif CGHC (Norgaard 2001).

 

Figure 2 Thioredoxin-like domains of PDI1. This image depicts the relative position and extent of the thioredoxin-like domains (open boxes) within the PDI1 protein. CGHC active site sequence motifs, which form an intramolecular disulfide bond capable of converting a pair of sulfhydryl groups in a polypeptide substrate into a disulfide bond, are also shown. Image from: Norgaard 2001. Permission pending.

 

The two cysteines within these active sites possess the ability to cycle between a reduced and an oxidized state. In the oxidized state, the two cysteines form an intramolecular disulfide bond that “in principle, enables PDI to convert a pair of sulfhydryl groups in a polypeptide substrate into a disulfide bond” (Norgaard 2001).
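As a small illustration of how the CGHC active-site motifs described above could be located in a protein sequence, the following Python sketch performs a simple pattern search. The function name `find_motif` and the toy sequence are hypothetical and introduced only for illustration; the toy sequence is not the real PDI1 sequence, which would need to be retrieved from the Saccharomyces Genome Database.

```python
import re


def find_motif(sequence, motif="CGHC"):
    """Return the 1-based start position of every occurrence of `motif`.

    A lookahead pattern is used so that overlapping occurrences
    would also be reported.
    """
    pattern = re.compile("(?=" + re.escape(motif) + ")")
    return [match.start() + 1 for match in pattern.finditer(sequence.upper())]


# Hypothetical toy sequence (NOT the real PDI1 protein sequence).
toy_sequence = "MKAAWCGHCKALAPEYEKAADLLVEFYAPWCGHCKNLAPTY"

for start in find_motif(toy_sequence):
    # The two active-site cysteines are the first and last residues of CGHC.
    print(f"CGHC motif at residues {start}-{start + 3}; "
          f"cysteines at {start} and {start + 3}")
```

Run on the real PDI1 sequence, a search of this kind would be expected to report two hits, one in each thioredoxin-like domain.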

 

Figure 3 Reaction Cycle for the Oxidation of a Nascent Polypeptide Catalyzed by PDI. This figure illustrates an oxidation reaction in which the intramolecular disulfide bond of the CGHC motif is transferred to a pair of sulfhydryls in a substrate polypeptide. Image from: (Holst 1997). Permission pending.

 

Cellular Component

The folding of secretory proteins occurs within the lumen of the endoplasmic reticulum. Consequently, the yeast PDI1 protein is also found in this compartment (Norgaard 2001).

 

Related Genetic Disorders

Within the Saccharomyces cerevisiae genome, there are at least four other genes with homology to PDI1 (figure 4) that produce PDI-like proteins.

 

Figure 4 PDI-like Proteins. This figure illustrates the results of a BLASTp search, using the PDI1 protein as the query sequence. The underlined results represent proteins with PDI-like properties. Image from: Gish 2004. Permission pending.

The underlined genes in figure 4 also produce proteins with two thioredoxin-like domains; however, these homologues are not interchangeable with PDI1. In fact, a yeast cell producing normal quantities of all PDI-like proteins, but carrying a PDI1 deletion, will die (Norgaard 2001). On the other hand, a yeast cell producing normal quantities of PDI1, but no PDI-like proteins, will function normally (Norgaard 2001). The lethality of the PDI1 deletion demonstrates the importance of PDI1 to Saccharomyces cerevisiae.

 

More Protein Information from the Web

The PDI1 gene is 1569 bp long. To view the complete sequence, click http://db.yeastgenome.org/.

 

Protein disulfide isomerase (PDI) is composed of 522 amino acids. To view the complete sequence, click http://www.ncbi.nlm.nih.gov/.

 

PDI1 interacts with a number of other proteins. The following table lists eight proteins with which PDI1 is known to associate.
 

Figure 5 Proteins with which PDI1 is Known to Associate. This table illustrates the variety of proteins, representing a spectrum of processes, with which PDI1 associates. This variety suggests that PDI1 may have broad-reaching effects, not all of which are yet fully understood. Image from: http://biodata.mshri.on.ca/. Permission pending.

 

A conserved domain search for the PDI1 sequence resulted in hits for thioredoxin, thiol-disulfide isomerase, and thiol:disulfide interchange domains (figure 6).

 

Figure 6 Conserved Domain Search for PDI1 Protein. This image shows conservation in the thioredoxin-like regions. Image from: http://www.ncbi.nlm.nih.gov/. Permission pending.

The Kyte-Doolittle hydropathy plot results (shown below) indicate the possibility of two transmembrane regions within the protein.

 

Figure 7 Kyte-Doolittle Hydropathy Plot of PDI1. This plot shows one region (near amino acid 20) with a strong likelihood of being a transmembrane domain and a second region (near amino acid 150) where a transmembrane domain is possible but less likely. Image from: http://occawlonline.pearsoned.com/bookbind/pubbooks/bc_mcampbell_genomics_1/medialib/activities/kd/kyte-doolittle.htm. Permission granted by AM Campbell, PhD.
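To make the idea behind a hydropathy plot concrete, here is a minimal Python sketch of a Kyte-Doolittle sliding-window calculation. The hydropathy values are the published Kyte-Doolittle scale. The window of 19 residues and the cutoff of 1.6 are a commonly used convention for flagging possible transmembrane segments, and it is an assumption that the web tool used for the figure applies the same settings. The function names and the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Return (center_position, mean_hydropathy) for each full window.

    Positions are 1-based and refer to the residue at the window center.
    """
    scores = [KD_SCALE[residue] for residue in sequence.upper()]
    half = window // 2
    profile = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        profile.append((start + half + 1, mean))
    return profile


def candidate_tm_positions(sequence, window=19, threshold=1.6):
    """Return window centers whose mean hydropathy exceeds the threshold."""
    return [pos for pos, score in hydropathy_profile(sequence, window)
            if score > threshold]


# Hypothetical toy sequence: a hydrophobic stretch flanked by charged residues.
toy_sequence = "D" * 10 + "L" * 19 + "K" * 10
print(candidate_tm_positions(toy_sequence))
```

A peak above the cutoff only indicates a hydrophobic stretch long enough to span a membrane; it does not by itself prove that the region is a transmembrane domain.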

 

The PREDATOR in silico prediction for the secondary structure of PDI1 is shown below (figure 8).

 

Figure 8 PREDATOR prediction for PDI1. Red areas indicate extended strand regions. Blue areas indicate alpha helices. Purple regions indicate areas of random coil. It is interesting to note that both active site CGHC sequence motifs occur in areas of random coil. Image from: http://npsa-pbil.ibcp.fr/. Permission pending.

 

While there was no match for PDI1 itself within the Protein Data Bank, a PDB-Homology search yielded 1mek, a human protein disulfide isomerase with a 42% identity match and a 33% positive score <http://db.yeastgenome.org>. This protein is shown below.

Figure 9 Human Protein Disulfide Isomerase (1mek). This protein, expressed within humans, functions in the same way as PDI1. 1mek demonstrates a 42% identity match and a 33% positive score with PDI1. Image from: http://www.rcsb.org/. Permission pending.

 

 
 

My Favorite Non-Annotated Yeast ORF: YCL047C

 

Chromosomal Location

The YCL047C ORF is located within a small region of Saccharomyces cerevisiae chromosome 3, as shown below.

 

Figure 10 Chromosomal Location of YCL047C.  This image shows the location of YCL047C on the Crick strand of yeast chromosome 3 (coordinates 44437-43661).  Also visible is the annotated PDI1 gene (right).  Image from: http://db.yeastgenome.org/ (Cherry 1997).  Permission pending.

 

Protein Information from the Web

The YCL047C ORF is 777 bp long.  To view the complete sequence, click http://db.yeastgenome.org/.

 

The YCL047C hypothetical protein is composed of 258 amino acids and has a molecular weight of 29,673 Daltons.  To view the complete amino acid sequence, click http://db.yeastgenome.org/.
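The lengths quoted on this page can be cross-checked against the chromosomal coordinates given in the figure captions. The Python sketch below does this arithmetic, assuming that each ORF is intron-free and that its coordinates include the stop codon; the function names are hypothetical.

```python
def feature_length_bp(start, end):
    """Length in base pairs of a feature with inclusive coordinates.

    Works for either strand, since Crick-strand features are listed
    with the larger coordinate first.
    """
    return abs(end - start) + 1


def protein_length_aa(orf_length_bp):
    """Number of amino acids encoded by an intron-free ORF.

    Assumes the ORF length includes the stop codon, which is not translated.
    """
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_length_bp // 3 - 1


# Coordinates as given in the chromosomal location figure captions.
pdi1_bp = feature_length_bp(50221, 48653)
ycl047c_bp = feature_length_bp(44437, 43661)

print("PDI1:", pdi1_bp, "bp,", protein_length_aa(pdi1_bp), "amino acids")
print("YCL047C:", ycl047c_bp, "bp,", protein_length_aa(ycl047c_bp), "amino acids")
```

Under these assumptions the coordinates agree with the lengths stated in the text: 1569 bp and 522 amino acids for PDI1, and 777 bp and 258 amino acids for YCL047C.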

 

BLASTp results for the hypothetical protein sequence yield a number of results (the best hits are shown below).

 

Figure 11 YCL047C-like Proteins. This figure illustrates the results of a BLASTp search, using the YCL047C hypothetical protein as the query sequence. Image from: http://www.ncbi.nlm.nih.gov/. Permission Pending.

 

While most of the strongest hits correspond to hypothetical proteins (in a variety of animals), the third hit is for the AFR721Wp protein in Eremothecium gossypii.  Within this protein, there is a cytidylyltransferase conserved domain.  Cytidylyltransferases form the critical intermediates in the biosynthesis of lipids and complex carbohydrates (Weber 1999).

 

A conserved domain search of the YCL047C sequence resulted in only one hit – the nicotinic acid mononucleotide adenylyltransferase domain, used in coenzyme metabolism.  These results show a 98% alignment between the NadD region and that of YCL047C.

 

Figure 12 Conserved Domain Search for YCL047C hypothetical ORF.  This image shows conservation of the nicotinic acid mononucleotide adenylyltransferase domain.  Image from: http://www.ncbi.nlm.nih.gov/.  Permission pending.

 

It is also interesting to note that YCL047C’s closest homolog (excluding hemiascomycetes) is a hypothetical protein within the Schizosaccharomyces pombe genome, called SPAC694.03 (figure 13).  However, as this is also a hypothetical protein, the study of this homolog provides few insights.  Yet, it is important to note that SPAC694.03 also contains a cytidylyltransferase conserved domain.

 

Figure 13 SPAC694.03, YCL047C's Closest Homolog. This figure illustrates that the SPAC694.03 hypothetical protein shares 31.5% identity with our YCL047C hypothetical protein. This makes SPAC694.03, found in Schizosaccharomyces pombe, the closest homolog of the YCL047C hypothetical protein. Image from: http://mips.gsf.de/. Permission Pending.
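For readers unfamiliar with how a percent identity figure is obtained, the following Python sketch computes it from a pairwise alignment. The two aligned strings are hypothetical toy data, not the YCL047C and SPAC694.03 sequences. Dividing by the total number of alignment columns is only one of several conventions in use, so it is an assumption that the database behind the figure calculates identity this way.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences share a residue.

    Both strings must come from the same alignment and have equal length.
    Columns containing a gap never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b)
        if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment (nine columns, five identical residues).
toy_a = "MKT-AYLLG"
toy_b = "MKSQAY-IG"
print(round(percent_identity(toy_a, toy_b), 2))
```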

 

The Kyte-Doolittle hydropathy plot results (shown below) indicate the possibility of two transmembrane regions within YCL047C.

 

Figure 14 Kyte-Doolittle Hydropathy Plot of YCL047C.  This plot predicts the presence of one transmembrane domain near amino acid 65 and a second transmembrane domain near amino acid 108. Image from: http://occawlonline.pearsoned.com/bookbind/pubbooks/bc_mcampbell_genomics_1/medialib/activities/kd/kyte-doolittle.htm. Permission granted by AM Campbell, PhD.

The PREDATOR in silico prediction for the secondary structure of the YCL047C hypothetical protein is shown below.

Figure 15 PREDATOR prediction for YCL047C.  Red areas indicate extended strand regions.  Blue areas indicate alpha helices.  Purple regions indicate areas of random coil.  According to the PREDATOR prediction, 50% of this molecule would show a random coil pattern, 37.21% would form alpha helices, and the remaining 12.79% would form regions of extended strand.  Image from: http://npsa-pbil.ibcp.fr/.  Permission pending.
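The percentages in the caption above can be reproduced from a per-residue prediction string. The Python sketch below assumes a one-letter code per residue (h for helix, e for extended strand, c for random coil); that coding and the function name are assumptions made for illustration. The residue counts of 96, 33, and 129 are back-calculated from the reported percentages and the 258-residue length, not read from the PREDATOR output itself.

```python
from collections import Counter


def secondary_structure_composition(prediction, states="hec"):
    """Return the percentage of residues assigned to each state.

    `prediction` is a string with one state letter per residue.
    Percentages are rounded to two decimal places.
    """
    if not prediction:
        raise ValueError("prediction string is empty")
    counts = Counter(prediction.lower())
    total = len(prediction)
    return {state: round(100.0 * counts[state] / total, 2) for state in states}


# Toy prediction string with back-calculated counts (order is arbitrary).
toy_prediction = "h" * 96 + "e" * 33 + "c" * 129
print(secondary_structure_composition(toy_prediction))
```

The fact that whole-residue counts reproduce the reported figures exactly is consistent with a protein of 258 amino acids.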

 

Disorders

Disruptions of YCL047C have not been studied extensively.  The only information available is the result of a high-throughput experiment which deleted the entire YCL047C ORF.  This study showed the YCL047C-deletion mutant to be “viable” in high salt, sorbitol, galactose, pH 8, minimal medium and nystatin treatments (Giaever 2002).

 

Predictions
Based on the conserved domain information above, I would predict that YCL047C encodes a functional protein involved in coenzyme metabolism. Based on the Kyte-Doolittle hydropathy plot, I would also wager that this protein is integrated into a membrane within Saccharomyces cerevisiae. However, unlike PDI1, the YCL047C protein product would not be essential to yeast viability, since the deletion mutant is viable.
 
 

 

References

Cherry JM, et al.  Genetic and physical maps of Saccharomyces cerevisiae.  Nature 1997 May 29; 387: 67-73.

Dolinski K, et al.  2004.  Saccharomyces Genome Database.  <http://www.yeastgenome.org>.  Accessed 2004 Oct 4.

Giaever G, et al.  Functional profiling of the Saccharomyces cerevisiae genome.  Nature 2002 Jul 25; 418 (6896): 387-91.

Gish W.  2004.  WU-Blast.  <http://blast.wustl.edu>.  Accessed 2004 Oct 4.

Givol D, et al.  Disulfide interchange and the three-dimensional structure of proteins.  Proceedings of the National Academy of Sciences USA 1995; 53: 676-684.

Heyer L, Johnson S, McCord RP, and Robinson L. Kyte Doolittle Hydropathy Plot. <http://occawlonline.pearsoned.com/bookbind/pubbooks/bc_mcampbell_genomics_1/medialib/activities/kd/kyte-doolittle.htm>. Accessed 2004 Oct 4.

Holst B, et al.  Active Site Mutations in Yeast Protein Disulfide Isomerase Cause Dithiothreitol Sensitivity and a Reduced Rate of Protein Folding in the Endoplasmic Reticulum.  Journal of Cell Biology 1997 Sep 22; 138(6): 1229-1238.

[IBCP] Institut de Biologie et Chimie des Proteines. 2001 Jan 23. Pole BioInformatique Lyonnais: Network Protein Sequence Analysis. <http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_preda.html>. Accessed 2004 Oct 7.

[MIPS] Munich Information Center for Protein Sequences. 2003 Nov 3. CYGD: Comprehensive Yeast Genome Database. <http://mips.gsf.de/genre/proj/yeast/index.jsp>. Accessed 2004 Oct 7.

Mount Sinai Hospital. 2003. Yeast Grid. <http://biodata.mshri.on.ca:80/yeast_grid/servlet/SearchPage>. Accessed 2004 Oct 7.

[NCBI] National Center for Biotechnology Information. 2004 Sep 8. NCBI Homepage. <http://www.ncbi.nih.gov>. Accessed 2004 Oct 7.

Norgaard P, et al.  Functional differences in yeast protein disulfide isomerases.  The Journal of Cell Biology 2001 Feb 5; 152(3): 553-562.

[RCSB] Research Collaboratory for Structural BioInformatics. 2004 Oct 5. RCSB PDB: Protein Data Bank. <http://www.rcsb.org/pdb>. Accessed 2004 Oct 7.

Weber CH, et al.  A prototypical cytidylyltransferase: CTP:glycerol-3-phosphate cytidylyltransferase from Bacillus subtilis.  Structure 1999; 7(9):1113-1124.

 
 
 
 
