Announcements
 (4/26) Because of my snafu, I will hold a rerun of Sunday's study session on Tuesday (2/28) during my normal office hours (46pm).
 (4/26) I messed up. The study session is from 35pm!
 (4/22) A reminder that I will hold a study session for the final exam on Sunday 4/26 in Sitterson SN011 from 5pm7pm. I will return all remaining graded materials at this session.
 (4/12) Problem Set 4 notebook from the study session. Her is the genotype file.
 (4/10) Problem Set #3 is now posted.
 (4/6) The due date for problem set #2 has been extended until Wednesday (4/8), as announced on Twitter. I will hold a study session on Tuesday (4/7) from 5pm7pm.
 (4/1) There is a scheduling conflict with the classroom tomorrow. As a result the study session will take place in SN014 rather than SN011 (our normal classroom). I may also be a little late (~5:30) depending on traffic.
 (2/24) Problem Sets can now be submitted online at this link. I have also added a link to the notebook that I showed in class to implement the hint .
 (2/21) Problem Set #1 Hint: Many people have asked me for a hint on how to approach problem #4 of the first problem set. Consider that we have framentsize sets from the three digestions A, B, and AB. There is a simple way to reduce the problem from O(AB!) to O(2^{AB}). Both are exponential, but the latter is far smaller. Heres the idea. Every fragment in sets A and B must be formed by some subset of fragments from AB. Thus, if you enumerate every subset of fragments from AB and add their lengths you should eventaully see every fragment size in sets A an B. This will require a loop of 2^{20} for our case, but that is a manageable amount of work (Took just a few seconds on my computer). Keep in mind, there may be multiple subsets that lead to the same A or B fragment size. You'll need to keep track of everyone. Now that you know which subsets of fragments add up to make the A and B fragments, you need to consider how these subsets overlap. This leads to a simple graph problem where each solution is merely a path.
 (1/21) I have decided to stick with the old textbook:
An Introduction to Bioinformatics Algorithms
by Neil C. Jones and Pavel A. Pevzner
MIT Press © 2004, ISBN: 0262101068.
I intended to include a few topics from the new book, but when I do so, I will provide pdfs for any additional reading materials.
 (1/7) First class meeting in SN011. See you there
Course Description
Computational methods are fueling a revolution in the biological sciences. Computers are already nearly as indispensable as microscopes for analyzing and interpreting biological data. As a result, two new multidisciplinary fields, bioinformatics and computational biology, have emerged. This course will explore the computational methods and algorithmic principles driving this revolution. It will cover basic topics in molecular biology, genetics, and proteomics. The course also addresses basic computational theory and algorithms including asymptotic notation, recursion, divideandconquer approaches, graph algorithms, dynamic programming, and greedy algorithms. These fundamental concepts from computer science will be taught within the context of motivating problems drawn from contemporary biology. Example biological topics include sequence alignment, motif finding, gene rearrangement, DNA sequencing, protein peptide sequencing, phylogeny, and gene expression analysis.
This course is suitable for both computer science and biology students at both undergraduate and graduate levels. Students who wish to take this course should have some programming experience in a modern language. Knowledge of data structures, algorithm design, and biology is helpful but not required. There will be 5 problem sets with short programming assignments, a midterm, and a final exam.
A syllabus for this offering of Comp555 can be downloaded from here.
Book, Course Information, and Prerequisites
This semester I will be making a transition from the old version of the textbook to a new version. I largely intend to follow the organization of the old book while incorporating materials and exercises from the new one. At this moment, I would hold off purchasing either one, until I resolve which will be more beneficial. I expect to do this by the end of the second week of classes.
Here is the old book:
An Introduction to Bioinformatics Algorithms
by Neil C. Jones and Pavel A. Pevzner
MIT Press © 2004, ISBN: 0262101068.
And here is the new one:
Bioinformatics Algorithms: An Active Learning Approach
by Phillip Compeau and Pavel Pevzner
Active Learning Publishers © 2014, ISBN: 9780990374602.
Credit Hours: 
3 
Location: 
SN014 
Time: 
MW 1:252:40PM 
URL: 
http://www.csbio.unc.edu/mcmillan/?run=Courses.Comp555S15 
Prerequisites: 
COMP 410, Math 381, or equivalents

Course Instructors
Instructor: 
Leonard McMillan 

Office: 
SN311 
email: 
mcmillan@unc.edu 
Office Hours: 
TBA 
Schedule
Date 
Lecture Topic 
Reading Assignment 
Homework 
January 7 
Introduction (pdf) 
Chapter 1 

January 12 
HighThroughput Biology (pdf) 
Chapter 3 

January 14 
Algorithms and Complexity (pdf) (Notebook html) 
Chapter 2 

January 19 
MLK Holiday (No class) 


January 14 
DNA Restriction Mapping (pdf) (Notebook html) 
Chapter 4.14.3 

January 26 
Motif Finding (pdf) (Notebook html) 
Chapter 4.44.9 

January 28 
Greedy Algorithms (pdf) (Notebook html) 
Chapter 4.44.9 

February 2 
Genome Rearrangements (pdf) 
Chapter 5.35.5 
Problem Set #1 (Notebook) (Hint) 
February 4 
Dynamic Programming Preliminaries (pdf) (Notebook html) 
Chapter 6.16.3 

February 9 
Sequence Alignments (pdf) (Notebook html) 
Chapter 6.46.8 

February 11 
Local Alignments (pdf) 
Chapter 6.96.10 

February 16,18 
Gene Prediction (pdf) 
Chapter 6.116.14 
Problem Set #2 
February 23 
Divide and Conquer Algorithms (pdf) 
Chapter 7.17.4 

February 23 
Graph Algorithms (pdf) 
Chapter 8.18.8 

March 2 
Genome Assembly (pdf) 
Chapter 8.9 

March 4 
Midterm Exam (Covers to Lecture 12) 
Open Book and Notes. No computer. 
March 9,11 
Spring Break 
March 16 
Returned and discussed midterm 


March 18 
Protein Sequencing (pdf) 
Chapter 8.108.15 

March 23, 25 
Pattern Matching (pdf) 
Chapter 9.19.5 

March 30 
Suffix Arrays (pdf) (Notebook html) 
Not in book 

April 1 
BurrowsWheeler Transforms (pdf) 
Not in book 

April 5 
Clustering and Evolution (pdf) 
Chapter 10.410.8 

April 7 
Imperfect Tree Construction (pdf) 
Chapter 10.910.11 
Problem Set #3 
April 13 
Perfect Phylogeny (pdf) 
Not in book 

April 15 
HMMs (pdf) 
Chapter 11 

April 20 
Randomized Algorithms (pdf) 
Chapter 12 
