Introduction to Bioinformatics: CMSC54610-1, Spring quarter 2006.
Lecture 2
Some things I forgot from last time:
Chapter 2: Data searches and pairwise alignments
- Dot plots: a visual guide to local alignments
- The dotmatcher program is explained in this introduction
- What does it do? It shows where the restricted edit distance is small.
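To make the window/threshold idea concrete, here is a minimal sketch of a dot plot in Python. It is not the dotmatcher implementation: it assumes a plain identity count inside each window instead of a scoring matrix, and the names dot_plot, render, window and threshold are chosen here for illustration only.

```python
def dot_plot(seq_a, seq_b, window=3, threshold=3):
    """Return the set of (i, j) such that the windows starting at
    seq_a[i] and seq_b[j] agree in at least `threshold` positions."""
    hits = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            # Count identical characters along the diagonal inside the window.
            matches = sum(seq_a[i + k] == seq_b[j + k] for k in range(window))
            if matches >= threshold:
                hits.add((i, j))
    return hits


def render(seq_a, seq_b, hits):
    """Draw the plot as text: seq_a runs across the top, seq_b down the side."""
    lines = ["  " + seq_a]
    for j, ch in enumerate(seq_b):
        row = "".join("*" if (i, j) in hits else "." for i in range(len(seq_a)))
        lines.append(ch + " " + row)
    return "\n".join(lines)


if __name__ == "__main__":
    a, b = "GATTACA", "TTAC"
    print(render(a, b, dot_plot(a, b, window=3, threshold=3)))
```

Runs of dots along a diagonal mark candidate local alignments. A larger window with a threshold below the window size tolerates mismatches, while a small window with a strict threshold shows more noise.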
- Edit distance notes
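As a small illustration of edit distance, the sketch below computes the usual unit-cost version (each insertion, deletion, or substitution costs 1). That cost model is an assumption made here; the definition used in the notes may differ in detail, and the function name edit_distance is just illustrative.

```python
def edit_distance(a, b):
    """Unit-cost edit distance between strings a and b,
    computed row by row with dynamic programming."""
    # prev[j] holds the distance between the prefix of a seen so far and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if the characters match
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

Only two rows of the table are kept at a time, which is enough for the distance itself. Recovering an actual alignment needs the full table (or the arrows) for a traceback.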
- Distance versus score
- Scoring matrices:
BLOSUM, PAM, etc. are often defined by
aligning similar sequences
- Dynamic programming to compute the extrema (min distance <==> max score)
- Global alignment: Needleman-Wunsch
- Local alignment: Smith-Waterman (omit negative scores: any cell that would go negative is reset to zero)
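The difference between the two dynamic programs can be sketched in one function. This is an illustration under assumed scores (match +1, mismatch -1, linear gap penalty -1), not the scoring used in the book or in any particular program. It returns only the optimal score, with no traceback, and the name alignment_score is hypothetical.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1, local=False):
    """Optimal alignment score of a and b.
    local=False: global alignment in the style of Needleman-Wunsch.
    local=True:  local alignment in the style of Smith-Waterman."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment pays for gaps at the ends of the sequences.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                cell = max(cell, 0)  # a negative running score restarts at zero
            score[i][j] = cell
            best = max(best, cell)
    # Global: score of the full alignment. Local: best cell anywhere in the table.
    return best if local else score[-1][-1]


if __name__ == "__main__":
    print(alignment_score("GATTACA", "GATACA"))
    print(alignment_score("CCCGATTACACCC", "TTGATTACATT", local=True))
```

The only changes for the local version are the zero first row and column, the clamp at zero, and reading the answer from the best cell instead of the bottom-right corner.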
- Database searches
- BLAST: find short seeds and grow
- FASTA: look at differences between the positions of short matching words (the diagonals of the dot plot)
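The "find short seeds and grow" idea can be sketched as follows. This is a simplified illustration, not the BLAST algorithm itself: it assumes exact word matches as seeds and extends them only while characters stay identical, without gaps or a scoring matrix. The names find_seeds, extend_seed and the word length k are chosen here for illustration.

```python
from collections import defaultdict


def find_seeds(query, subject, k=4):
    """Return all (i, j) with query[i:i+k] == subject[j:j+k]."""
    # Index every word of length k in the subject by its start position.
    index = defaultdict(list)
    for j in range(len(subject) - k + 1):
        index[subject[j:j + k]].append(j)
    return [(i, j)
            for i in range(len(query) - k + 1)
            for j in index.get(query[i:i + k], [])]


def extend_seed(query, subject, i, j, k):
    """Grow an exact seed in both directions while characters match.
    Returns (query_start, subject_start, length) of the ungapped match."""
    start_i, start_j = i, j
    while start_i > 0 and start_j > 0 and query[start_i - 1] == subject[start_j - 1]:
        start_i -= 1
        start_j -= 1
    end_i, end_j = i + k, j + k
    while end_i < len(query) and end_j < len(subject) and query[end_i] == subject[end_j]:
        end_i += 1
        end_j += 1
    return start_i, start_j, end_i - start_i


if __name__ == "__main__":
    q, s = "CCGATTACAGG", "TTGATTACATT"
    for i, j in find_seeds(q, s, k=4):
        # i - j identifies the diagonal of the dot plot that the seed lies on.
        print(i, j, i - j, extend_seed(q, s, i, j, 4))
```

Seeds on the same diagonal (equal i - j) grow into the same match, so a real search would merge them before extending.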
- Multiple sequence alignments
- Homework problems:
- (2.1) From the book: Chapter 2, problem 2.5: give at least
two optimal alignments for this sequence (see the solution at the
back of the book for support, but you still need to find all the
arrows!)
- (2.2) From the book: Chapter 2, problem 2.6
- (2.3) use dotmatcher to examine the alignment of the sequences
in the handout:
- AATTGCCGCCGTCGTTTTCAGCAGTTATGTCAGATC
- TCCCAGTTATGTCAGGGGACACGAGCATGCAGAGAC
Try different window sizes and `threshold' values to see which
settings show the local alignment best. Explain your reasoning and
include plots to back up your claims. You should exhibit at least
two areas of local alignment of length six or more. Label your plots so
it is clear which sequence is on which axis. Also label the major
local alignments that emerge in the plots.
- Homework is due by the beginning of the next class.
- Preferred: Homework should be turned in physically before the start of the class
- Emergency: send it to me electronically in a standard format.