Logged in as: guest Log in


  • (4/26) Because of my snafu, I will hold a rerun of Sunday's study session on Tuesday (2/28) during my normal office hours (4-6pm). 
  • (4/26) I messed up. The study session is from 3-5pm!
  • (4/22) A reminder that I will hold a study session for the final exam on Sunday 4/26 in Sitterson SN011 from 5pm-7pm. I will return all remaining graded materials at this session.
  • (4/12) Problem Set 4 notebook from the study session. Her is the genotype file.
  • (4/10) Problem Set #3 is now posted.
  • (4/6) The due date for problem set #2 has been extended until Wednesday (4/8), as announced on Twitter. I will hold a study session on Tuesday (4/7) from 5pm-7pm. 
  • (4/1) There is a scheduling conflict with the classroom tomorrow. As a result the study session will take place in SN014 rather than SN011 (our normal classroom). I may also be a little late (~5:30) depending on traffic.
  • (2/24) Problem Sets can now be submitted online at this link. I have also added a link to the notebook that I showed in class to implement the hint .
  • (2/21) Problem Set #1 Hint: Many people have asked me for a hint on how to approach problem #4 of the first problem set. Consider that we have frament-size sets from the three digestions A, B, and AB. There is a simple way to reduce the problem from O(|AB|!) to O(2|AB|). Both are exponential, but the latter is far smaller. Heres the idea. Every fragment in sets A and B must be formed by some subset of fragments from AB. Thus, if you enumerate every subset of fragments from AB and add their lengths you should eventaully see every fragment size in sets A an B. This will require a loop of 220 for our case, but that is a manageable amount of work (Took just a few seconds on my computer). Keep in mind, there may be multiple subsets that lead to the same A or B fragment size. You'll need to keep track of everyone.  Now that you know which subsets of fragments add up to make the A and B fragments, you need to consider how these subsets overlap. This leads to a simple graph problem where each solution is merely a path.
  • (1/21) I have decided to stick with the old textbook:
      An Introduction to Bioinformatics Algorithms
      by Neil C. Jones and Pavel A. Pevzner 
      MIT Press © 2004, ISBN: 0262101068.
    I intended to include a few topics from the new book, but when I do so, I will provide pdfs for any additional reading materials.
  • (1/7) First class meeting in SN011. See you there

Course Description

Computational methods are fueling a revolution in the biological sciences. Computers are already nearly as indispensable as microscopes for analyzing and interpreting biological data. As a result, two new multidisciplinary fields, bioinformatics and computational biology, have emerged. This course will explore the computational methods and algorithmic principles driving this revolution. It will cover basic topics in molecular biology, genetics, and proteomics. The course also addresses basic computational theory and algorithms including asymptotic notation, recursion, divide-and-conquer approaches, graph algorithms, dynamic programming, and greedy algorithms. These fundamental concepts from computer science will be taught within the context of motivating problems drawn from contemporary biology. Example biological topics include sequence alignment, motif finding, gene rearrangement, DNA sequencing, protein peptide sequencing, phylogeny, and gene expression analysis.

This course is suitable for both computer science and biology students at both undergraduate and graduate levels. Students who wish to take this course should have some programming experience in a modern language. Knowledge of data structures, algorithm design, and biology is helpful but not required. There will be 5 problem sets with short programming assignments, a midterm, and a final exam.

A syllabus for this offering of Comp555 can be downloaded from here.

Book, Course Information, and Prerequisites

This semester I will be making a transition from the old version of the textbook to a new version. I largely intend to follow the organization of the old book while incorporating materials and exercises from the new one. At this moment, I would hold off purchasing either one, until I resolve which will be more beneficial. I expect to do this by the end of the second week of classes.

Here is the old book:

bioalgorithmsBook.jpg An Introduction to Bioinformatics Algorithms
by Neil C. Jones and Pavel A. Pevzner
MIT Press © 2004, ISBN: 0262101068.

And here is the new one:

Bioinformatics Algorithms Bioinformatics Algorithms: An Active Learning Approach
by Phillip Compeau and Pavel Pevzner
Active Learning Publishers © 2014, ISBN: 978-0-9903746-0-2.

Credit Hours: 3
Location: SN014
Time: MW 1:25-2:40PM
URL: http://www.csbio.unc.edu/mcmillan/?run=Courses.Comp555S15
Prerequisites: COMP 410, Math 381, or equivalents

Course Instructors

Instructor: Leonard McMillan Leonard's Mug
Office: SN311
email: mcmillan@unc.edu
Office Hours: TBA


Date Lecture Topic Reading Assignment Homework
January 7 Introduction (pdf) Chapter 1  
January 12 High-Throughput Biology (pdf) Chapter 3  
January 14 Algorithms and Complexity (pdf) (Notebook html) Chapter 2  
January 19 MLK Holiday (No class)    
January 14 DNA Restriction Mapping (pdf) (Notebook html) Chapter 4.1-4.3  
January 26 Motif Finding (pdf) (Notebook html) Chapter 4.4-4.9  
January 28 Greedy Algorithms (pdf) (Notebook html) Chapter 4.4-4.9  
February 2 Genome Rearrangements (pdf) Chapter 5.3-5.5 Problem Set #1 (Notebook) (Hint)
February 4 Dynamic Programming Preliminaries (pdf) (Notebook html) Chapter 6.1-6.3  
February 9 Sequence Alignments (pdf) (Notebook html) Chapter 6.4-6.8  
February 11 Local Alignments (pdf) Chapter 6.9-6.10  
February 16,18 Gene Prediction (pdf) Chapter 6.11-6.14 Problem Set #2
February 23 Divide and Conquer Algorithms (pdf) Chapter 7.1-7.4  
February 23 Graph Algorithms (pdf) Chapter 8.1-8.8  
March 2 Genome Assembly (pdf) Chapter 8.9  
March 4 Midterm Exam (Covers to Lecture 12) Open Book and Notes. No computer.
March 9,11 Spring Break
March 16 Returned and discussed midterm    
March 18 Protein Sequencing (pdf) Chapter 8.10-8.15  
March 23, 25 Pattern Matching (pdf) Chapter 9.1-9.5  
March 30 Suffix Arrays (pdf) (Notebook html) Not in book  
April 1 Burrows-Wheeler Transforms (pdf) Not in book  
April 5 Clustering and Evolution (pdf) Chapter 10.4-10.8  
April 7 Imperfect Tree Construction (pdf) Chapter 10.9-10.11 Problem Set #3
April 13 Perfect Phylogeny (pdf) Not in book  
April 15 HMMs (pdf) Chapter 11  
April 20 Randomized Algorithms (pdf) Chapter 12  

Site built using pyWeb version 1.10
© 2010 Leonard McMillan, Alex Jackson and UNC Computational Genetics