BLAST Search

Discovery Questions

The investigators who cloned the HD gene in 1993 used a novel approach that led them to the correct gene. However, since the human genome sequence is now freely available online, we will use this resource to home in on the HD gene. Rather than using the Genome Browser as we did for CFTR, we will use a sequence search engine called BLAST. We will use two types of BLAST searches to find the HD gene: one for protein sequences (BLASTp) and one for nucleotide sequences (BLASTn).

Go to the National Center for Biotechnology Information (NCBI) BLAST web site. From the BLAST web page, follow these directions:
1) Click on “Protein-protein BLAST [blastp]”, which will allow you to compare protein sequences against the protein sequences in the public databases. The goal is to submit a sequence (called a query sequence) and find the database sequences that match it. The longer your query sequence is, the longer the search will take, but the more likely you are to find meaningful results.

2) We are going to perform BLASTn or BLASTp searches with seven sequences. Each one is listed below. You should perform each search separately, and record the name of the gene, the abbreviated name of each gene, and a short description of the gene or protein’s role in cells.

3) A small section of cDNA was cloned and sequenced from a person who had died from HD. Brain tissue was removed and cDNAs were produced. Many cDNAs were sequenced, but these seven are of particular interest to us. Six of the seven cDNAs allowed the investigators to deduce amino acid sequences using the genetic code. Below are some of the deduced amino acid sequences. See if you can figure out which one might be the one that causes HD. Copy and paste each sequence into the large blank "search" space and then click on the “BLAST!” button. Notice that the default settings include a "Do CD-Search" option, which you can turn off. The conserved domain (CD) search finds functional domains within any proteins that match your query sequence.

4) On the intermediate results page, click on the “Format!” button. You may have to wait a while for the results, depending on when you submit this BLASTp search. You will get a visual result that shows some of the hits (or database matches).

5) Click on the first human hit (Homo sapiens). For each of these protein fragments, read the short description and decide whether it might be the cause of HD. Remember, we are looking for a dominant disease with a loss of mental function.

Write down the names, the abbreviated names, and the accession numbers of all 7 sequences that you find by this search. An accession number is a unique identifier given to each entry in the database. Also, jot down a short description for each protein or gene you locate in your BLAST searches.

Amino Acid Sequence #1

Amino Acid Sequence #2

Amino Acid Sequence #3

Amino Acid Sequence #4

For sequence #5, investigators were not able to determine the proper reading frame for deducing an amino acid sequence. Therefore, submit a BLASTn search by going back to the BLAST page and clicking on “Standard nucleotide-nucleotide BLAST [blastn]”.
cDNA sequence #5
cttgcctgac atcggtttcc cctcccccac ggtcccaaga tggttgtgga catccaatct cacagcagag tcatctccta tgcaggctgc ctgactcaga tgtctccctt tgccattttt
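
To see why the reading frame matters for sequence #5, here is a minimal Python sketch that translates the cDNA in each of the three forward reading frames using the standard genetic code. The function names (clean, translate) are made up for this illustration and are not part of BLAST; the sketch ignores the three reverse-strand frames, and it is not the method the investigators used.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq):
    """Remove spaces and line breaks from a pasted sequence and upper-case it."""
    return "".join(seq.split()).upper()


def translate(seq, frame=0):
    """Translate one forward reading frame (0, 1, or 2).

    '*' marks a stop codon; 'X' marks a codon that is not in the table.
    """
    seq = clean(seq)
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# cDNA sequence #5, copied from the text above.
cdna5 = (
    "cttgcctgac atcggtttcc cctcccccac ggtcccaaga tggttgtgga catccaatct "
    "cacagcagag tcatctccta tgcaggctgc ctgactcaga tgtctccctt tgccattttt"
)

for frame in range(3):
    print("frame", frame + 1, translate(cdna5, frame))
```

Each frame gives a different string of amino acids, which is why a BLASTn search on the nucleotide sequence itself is the safer choice when the frame is unknown.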

Go back to the BLAST web site and perform BLASTp searches for:
Amino Acid Sequence #6

Amino Acid Sequence #7

6) By now, you have figured out which gene/protein is the right one because it is well documented in the database. Click on the link "gi|90903231|ref|NP_002102.4|", which is the first human huntingtin hit. On this page, click on "GeneID:3064", which appears just above the amino acid sequence. How long is the HD gene? How long is the mRNA? What is the common name for this protein? On which chromosome is HD located?

7) Let’s determine the genomic location of the seven genes you found in your BLAST searches. Where is each of the seven genes located? You can search quickly at MapViewer. Enter each gene’s name in the “Search for” box and then hit the “Find” button.
Abbreviated names tend to work better than full names.

8) Click on the OMIM (Online Mendelian Inheritance in Man(kind)) link from the MapViewer HD results. What is the cause of Huntington’s disease? In other words, what does this gene/protein look like when a person has HD? How does it differ from most people’s alleles? You can use your browser's find function to search this page for the term "IT15" to get you started. Go to the "Allelic Variants" section to see the normal and the diseased range of (CAG)n trinucleotide repeats.

9) Go back to OMIM and search for "huntingtin". How many OMIM hits are there? Scroll down and click on the link above the gene called “HUNTINGTIN-INTERACTING PROTEIN 1; HIP1”. Read about HIP1 under the "GENE FUNCTION" heading. What sort of protein interactions happen between HD and HIP1?
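
The (CAG)n trinucleotide repeats from step 8 can also be counted directly in a nucleotide sequence. The following is a minimal Python sketch, assuming all we want is the longest uninterrupted run of CAG triplets. The function name longest_cag_run and the example sequence are invented for illustration; the example is not real HD sequence, and this is not how OMIM or BLAST reports repeat lengths.

```python
import re


def longest_cag_run(seq):
    """Return the largest number of consecutive CAG triplets found in seq."""
    seq = "".join(seq.split()).upper()  # drop spaces and line breaks
    runs = re.findall(r"(?:CAG)+", seq)  # every uninterrupted stretch of CAGs
    return max((len(run) // 3 for run in runs), default=0)


# Made-up example: five CAGs, an interrupting CCG, then two more CAGs.
example = "ATG" + "CAG" * 5 + "CCG" + "CAG" * 2
print(longest_cag_run(example))  # prints 5
```

Compare the number you get from a real sequence with the normal and diseased ranges listed in the "Allelic Variants" section of OMIM.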



© Copyright 2006 Department of Biology, Davidson College, Davidson, NC 28035