Bio343: Laboratory Methods In Genomics

Fall, 2008

A. Malcolm Campbell


Davidson students will be working with the Joint Genome Institute (funded by DOE) to annotate the Halorhabdus utahensis AX-2, DSM 12940 genome (See one publication). Davidson students will decode this genome, which has never been analyzed before. Their work will be added to a database, with the possibility of publishing their results.

This will be a stand-alone lab course consisting primarily of data analysis by computer. DNA microarrays are fading fast, but DNA sequencing will be used more each year. I want Davidson students to be skilled in an area with long-term research potential.

I am very excited about this new course. Only 15 schools in the world get to participate in a pilot program run through JGI and we are the only school taking on an entire genome solo. It will be a lot of fun to do real genomics research on a species which is poorly understood. Our species is supposed to have an "energy component" to its metabolism which is one reason DOE is interested.


Tentative Syllabus: Bio 343 Laboratory Methods in Genomics

Student Collaborators

Learning Outcomes

1) Understand what a gene is through in-depth analysis of a genome.

2) Determine how genomes are organized.

3) Generate species-specific metabolic maps.

4) Recognize that automated annotation is imperfect and many judgment calls are necessary.

5) Evaluate evolutionary paths as revealed in novel genomes.

6) Gain a real research experience and all that comes with it.

7) Develop computer skills used in modern genomics.

8) Excel in collaborative learning and research.

 

Required Readings

1) Genome: the autobiography of a species in 23 chromosomes. Matt Ridley. HarperCollins Publisher. Available at bookstores and Amazon.com.

2) Online Tools (require the Firefox browser to work)

3) Research publications on genomes (PDFs distributed during semester).


Tentative Weekly Schedule

Week of Semester
Subject Matter and Assignments Due
Week 1:
Aug 26 & 28

Pre-Semester Assessment

Discuss Semester-Long Research Plans & Set Educational Goals

Discuss Domains of Life, Genome Sequencing, JGI and our species

Halorhabdus utahensis AX-2, DSM 12940 web site

Wiki Terminology and Online Glossary

Report Information: Word File Template (One Gene per File) (excel version)

Strategy for Coverage, Quality Control (QC), and Triage

Read: Armbrust et al. 2004. The Genome of the Diatom Thalassiosira pseudonana: Ecology, Evolution, and Metabolism. Science 306: 79 - 86.

Start with RNA genes and identify the species-specific Shine-Dalgarno sequence

Amino Acids Table (learn 1 letter code)

Genetic Code

Week 2:
Sep 2 & Sep 4

Examine 50 RNA gene results:

 

 

Identify Shine-Dalgarno sequence for our genome. Look in large ribosome subunit genes (LSU), DNA polymerase subunits, RNA polymerase subunits. Collect consensus results.

If Shine-Dalgarno is missing, is the gene part of an operon?
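The Shine-Dalgarno search described above can be sketched in a few lines of Python. This is only an illustrative starting point, not the class SOP: the motif list (GGAGG and shorter variants), the 4 to 12 base spacing window, and the function names find_sd and tally_motifs are all assumptions made for this example. The real species-specific motif and spacing are exactly what the consensus exercise is meant to discover. The input is assumed to be the DNA immediately upstream of the called start codon, already written 5' to 3' on the coding strand (reverse-complement minus-strand genes first).

```python
from collections import Counter

# Assumed candidate motifs, longest first; the true consensus for
# H. utahensis is what the class is trying to determine.
MOTIFS = ("GGAGG", "GGAG", "GAGG", "AGGA")


def find_sd(upstream, min_gap=4, max_gap=12):
    """Return (motif, gap) for an SD-like motif near the start codon, or None.

    `upstream` ends right before the start codon. `gap` is the number of
    bases between the end of the motif and the start codon. The spacing
    window is an assumption, not a measured value for this genome.
    """
    upstream = upstream.upper()
    for motif in MOTIFS:                      # prefer the longest motif
        for gap in range(min_gap, max_gap + 1):
            end = len(upstream) - gap
            start = end - len(motif)
            if start >= 0 and upstream[start:end] == motif:
                return motif, gap
    return None


def tally_motifs(upstreams):
    """Count which motif (or 'none') was found across many genes."""
    counts = Counter()
    for seq in upstreams:
        hit = find_sd(seq)
        counts[hit[0] if hit else "none"] += 1
    return counts


if __name__ == "__main__":
    # Made-up upstream regions, for illustration only.
    examples = ["ACTTGGAGGTTCAACC", "ACTTCTCTCTTTCAACC", "CCCCGGAGGTTCAACC"]
    for seq in examples:
        print(seq, find_sd(seq))
    print(tally_motifs(examples))
```

Genes that come back as "none" are the ones worth checking by eye for a start codon called in the wrong place or for membership in an operon.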

Try these genes out to see challenges:

1) Verify start codon:
Neighborhood six frame translation with putative ORF's shown below for gene_oid=2500587809 197587..197745(-)
DNA-directed RNA polymerase, subunit N (EC 2.7.7.6) (IMGterm).
Vs
Halorubrum lacusprofundi ATCC 49239 ID 641184591

2) Verify Shine-Dalgarno sequences:

Neighborhood six frame translation with putative ORF's shown below for gene_oid=2500587812 198625..198975(-)
LSU ribosomal protein L18AE (IMGterm).

And

Neighborhood six frame translation with putative ORF's shown below for gene_oid=2500588579 976173..976517(+)
LSU ribosomal protein L12AE (IMGterm).

And

Neighborhood six frame translation with putative ORF's shown below for gene_oid=2500588578 975105..976157(+)
LSU ribosomal protein L10P (IMGterm).
** has SD but it is not called by software

And

Neighborhood six frame translation with putative ORF's shown below for gene_oid=2500588577 974467..975105(+)
LSU ribosomal protein L1P (IMGterm).

BUT, this one does not seem to have one

Neighborhood six frame translation with putative ORF's shown below for gene_oid=2500588575 971692..972180(+)
LSU ribosomal protein L11P (IMGterm).

See examples from a different species (Ammonifex)
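For readers who want to see what a six-frame translation does under the hood, here is a minimal Python sketch. It is not the IMG tool's implementation. It assumes the standard codon assignments and treats ATG, GTG, and TTG as possible start codons, which is a common convention for prokaryotic gene calling; the function names and the short test sequences are invented for illustration.

```python
from itertools import product

# Standard codon assignments, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumption: candidate start codons to consider when checking a gene call.
START_CODONS = ("ATG", "GTG", "TTG")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Return translations for the three forward and three reverse frames."""
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


def in_frame_starts(orf, starts=START_CODONS):
    """List 0-based positions of in-frame candidate start codons in an ORF."""
    orf = orf.upper()
    return [i for i in range(0, len(orf) - 2, 3) if orf[i:i + 3] in starts]


if __name__ == "__main__":
    for name, protein in six_frames("ATGGCCTAA").items():
        print(name, protein)
    print(in_frame_starts("ATGGTGAAATTGTAA"))
```

Listing every in-frame candidate start is useful when the called start looks doubtful: each candidate can then be compared against the ortholog alignment and checked for an upstream Shine-Dalgarno sequence.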

Standard Operating Procedure - step 1.

3 Teams Presentations + 1 option by Dr. C.

Controlled vocabulary

Problems to be addressed: Pseudogenes, transposons, horizontal gene transfer, orthologs, paralogs, homology, hypothetical genes, unknown function, quality of data for annotation.

Establish SOP (standard operating procedures) for genes.

Databases and Tools: BLAST, CDD, KEGG, EcoCyc, Tcoffee, EC numbers, and phylogenetic trees

Week 3:
Sep 9 & 11

Report from Programmers - compiling information

Report from G6 on goals and tools

H. utahensis publication

Status of JCVI and EcoCyc requests

Work on annotation projects

Continue gene annotation
Week 4:
Sep 16 & 18

G6 Determines

  • list the information we want in order of priority
  • decision tree for programmers to use
  • SOP

Programmers start collecting information on every gene for a web site

WikiPathways for community-based annotation

The SEED for automated annotation and viewing

Manatee for automated annotation (JCVI)

 

Read Glass et al. 2000. The complete sequence of the mucosal pathogen Ureaplasma urealyticum. Nature. 407: 757 - 762.

Discuss end goals and methods for accomplishing this

Week 5:
Sep 23 & 25

First group of glossary entries (graded by Dr. C.)

Continue gene annotation
Week 6:
Sep 30 & Oct 2

Work on 3-way comparisons

Focus on research questions related to databases

Continue 3-way comparisons

Continue gene annotations

Week 7:
Oct 7 & 9

First methodology tutorial Due (graded by Dr. C.)

Conclude gene annotations

Oral presentation of your favorite gene (graded by Dr. C.)
(peer-critiqued in class)

Week 8:
Oct 14 & 16

Fall Break

Dana 146 from 10-11am

Scientific Heat about Cold Hits
Dr. Keith Devlin
How do you calculate the significance of a DNA profile match in a “Cold Hit” case, where the match is the result of a search through a DNA database? What statistical information about the database identification may be presented in court as evidence? These (mathematical) questions form the basis of a major and ongoing and sometimes heated debate in law enforcement, the legal profession, and academia, with several capital cases currently stalled in the appeals courts until the issues are resolved. I’ve been asked to provide testimony in a number of those appeals cases. The research I’ve done for my involvement has demonstrated just how subtle the issues are, and why, with so much at stake, the courts are worried that they may not be getting the calculations right.

Week 9:
Oct 21 & 23

Read Turnbaugh et al. 2006. An obesity-associated gut microbiome with increased capacity for energy harvest. Nature. 444: 1027 - 1031.

Continue gene annotation and start pathways

Pathways and Structures: Assembly and Annotations

 

Finish gene story for paper and start pathways

Pathways and Structures: Assembly and Annotations

Week 10:
Oct 28 & 30

SOP for Metabolic Pathways

Continue pathway annotations
Week 11:
Nov 4 & 6

Continue pathway annotations

Finalize Pathways and Experimental Testing
Week 12:
Nov 11 & 13

Develop pathways stories

Develop pathway tutorials

Second tutorial Due (graded by Dr. C.)

Dr. Gary Stormo talks about genomics in the future

Week 13:
Nov 18 & 20

Assess Status and Agree on Endgame

Write the final paper

Week 14:
Nov 25
Dec 2 & 4

Your Favorite Pathway Oral Presentation (graded by Dr. C.)

First draft of final paper (web site) due
Bring Hard Copy to collect comments
Peer review of draft paper (P/F graded)

Finish final paper
Week 15:
Dec 9

Final draft of final paper (web site) due
Slide Show of Germany and Kenya
Fellowships or Teacher Training
Survey

Course Evaluations

No final exam


Grading
Grades will be based on: glossary entries (10%); online tutorials for annotation process (20%); peer review (10%); final research paper (20%); lab notebook for genes and pathways (20%); and oral presentations and class participation (20%). The exact nature of the final paper cannot be determined at this point. You will use the online lab notebook to track your daily progress. Keep in mind that your work will be the foundation that investigators will use for subsequent research.

Grading Scale:

Conversion of Percentages to Letter Grades
A = 100 - 95 A- = 94 - 92
B+ = 91 - 89 B = 88 - 86 B- = 85 - 83
C+ = 82 - 80 C = 79 - 77 C- = 76 - 74
D+ = 73 - 71 D = 70 - 68
F = < 67
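As a worked illustration of how the weights and cutoffs above combine, here is a small Python sketch. It is not an official grade calculator. The category keys are invented labels for the six graded components, fractional percentages are compared against the cutoffs without rounding, and anything below 68 is treated as an F (the table leaves a score of exactly 67 unassigned, so that choice is an assumption).

```python
# Weights (in percent) taken from the Grading paragraph; key names are
# illustrative labels only.
WEIGHTS = {
    "glossary": 10,
    "tutorials": 20,
    "peer_review": 10,
    "final_paper": 20,
    "lab_notebook": 20,
    "presentations_participation": 20,
}

# Lower bound of each letter grade, highest first.
CUTOFFS = [
    (95, "A"), (92, "A-"),
    (89, "B+"), (86, "B"), (83, "B-"),
    (80, "C+"), (77, "C"), (74, "C-"),
    (71, "D+"), (68, "D"),
]


def course_percentage(scores):
    """Weighted average of per-category scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(percentage):
    """Map a percentage to a letter; below 68 is assumed to be an F."""
    for cutoff, letter in CUTOFFS:
        if percentage >= cutoff:
            return letter
    return "F"


if __name__ == "__main__":
    scores = {name: 90 for name in WEIGHTS}   # made-up example scores
    total = course_percentage(scores)
    print(total, letter_grade(total))
```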





© Copyright 2009 Department of Biology, Davidson College, Davidson, NC 28035
Send comments, questions, and suggestions to: macampbell@davidson.edu