Next we will consider tools for analyzing high-throughput sequence data.
We will make use of the "Pysam" Python package, which you can install as follows:
$ sudo easy_install pysam
password: *
-- or --
$ sudo pip install pysam
password: *
Pysam is a wrapper around the popular SAMtools used for processing SAM and BAM files. You will find documentation for Pysam here.
import os
import pysam
The Pysam package provides direct access to the SAMtools command-line functionality. SAMtools commands are executed as follows:
pysam.command(arg1, arg2, ...)
is equivalent to typing
$ samtools command arg1 arg2 ...
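To make the correspondence concrete, here is a minimal sketch of how the arguments of a call line up with the shell command. The helper `samtools_argv` is hypothetical (it is not part of Pysam) and only builds the argument list; it does not run SAMtools.

```python
def samtools_argv(command, *args):
    """Return the argument list for `samtools command arg1 arg2 ...`.

    Hypothetical helper for illustration only; Pysam does not expose it.
    """
    return ["samtools", command] + [str(a) for a in args]

# pysam.index("FF0683F.bam") corresponds to this command line:
print(" ".join(samtools_argv("index", "FF0683F.bam")))
```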
Most of the Pysam routines assume that the supplied BAM file has a precreated index file. The following code segment creates an index file if one does not already exist. However, the SAMtools "index" function assumes that the supplied BAM file is sorted by genomic position, which is not handled in this fragment. If needed, you can replace the body of the if with:
pysam.sort("FF0683F.bam", "FF0683F.sorted")   # writes FF0683F.sorted.bam
pysam.index("FF0683F.sorted.bam")             # index (and then open) the sorted file
# create the index first so that it is available when the file is opened
if not os.path.exists("FF0683F.bam.bai"):
    pysam.index("FF0683F.bam")
bamfile = pysam.Samfile("FF0683F.bam", "rb")
By the way, this BAM file has reads from an RNA-sequencing experiment. Let's tally the reads by sequence length.
sizeCount = {}
for read in bamfile.fetch():
    sizeCount[len(read.seq)] = sizeCount.get(len(read.seq), 0) + 1
N = 0
for key, count in sizeCount.iteritems():
    print "%3d %10d" % (key, count)
    N += count
print "Total: %d reads" % N
100   72399228
Total: 72399228 reads
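The same tally can be written more compactly with `collections.Counter`. The sketch below uses a handful of made-up sequences in place of `read.seq`, so it runs without a BAM file; with real data the generator would iterate over `bamfile.fetch()` instead.

```python
from collections import Counter

# Toy stand-ins for read.seq values (invented for this example).
sequences = ["ACGT", "ACGTA", "TTGA", "GGCCA", "ACGT"]

# Count how many sequences there are of each length.
size_count = Counter(len(seq) for seq in sequences)

for length, count in sorted(size_count.items()):
    print("%3d %10d" % (length, count))
print("Total: %d reads" % sum(size_count.values()))
```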
Let's see which reference sequences are listed in the BAM file's header. Usually, there will be one per chromosome. We can also get the length of each.
print bamfile.references
print bamfile.lengths
('chr1', 'chr10', 'chr11', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17', 'chr18', 'chr19', 'chr2', 'chr3', 'chr4', 'chr5', 'chr6', 'chr7', 'chr8', 'chr9', 'chrM', 'chrX')
(197195432L, 129993255L, 121843856L, 121257530L, 120284312L, 125194864L, 103494974L, 98319150L, 95272651L, 90772031L, 61342430L, 181748087L, 159599783L, 155630120L, 152537259L, 149517037L, 152524553L, 131738871L, 124076172L, 16299L, 166650296L)
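The two tuples are parallel: the i-th length belongs to the i-th reference name. One convenient way to use them together is to zip them into a dictionary. The sketch below hard-codes the first three entries from the output above so that it runs without the BAM file; the name `length_of` is just a label chosen for this example.

```python
# First three entries copied from the output above.
references = ('chr1', 'chr10', 'chr11')
lengths = (197195432, 129993255, 121843856)

# Pair each reference name with its length.
length_of = dict(zip(references, lengths))

print(length_of['chr10'])
```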
Let's count the number of reads overlapping a given region. This region coincides with a single-exon gene named Hist1h1c.
N = 0
for read in bamfile.fetch('chr13', 23830650, 23832250):
    N += 1
print N, "reads in region"
1347 reads in region
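Pysam documents its coordinates as 0-based, half-open intervals, so the region above runs from position 23830650 up to, but not including, 23832250. As a rough illustration of what "overlapping" means under that convention, the sketch below tests a few intervals by hand. The function `overlaps` and the toy read coordinates are invented for this example and are not part of Pysam.

```python
def overlaps(read_start, read_end, region_start, region_end):
    """True if two 0-based, half-open intervals share at least one base."""
    return read_start < region_end and read_end > region_start

region = (23830650, 23832250)
toy_reads = [
    (23830500, 23830600),  # ends before the region starts
    (23830600, 23830700),  # straddles the left edge
    (23831000, 23831100),  # fully inside
    (23832249, 23832349),  # covers only the last base of the region
    (23832250, 23832350),  # starts exactly at the excluded end
]

n = sum(1 for start, end in toy_reads if overlaps(start, end, *region))
print("%d reads in region" % n)
```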
Rather than processing the file by reads (sorted by their alignment positions), we can instead process the BAM file by genomic position using a "pileup" iterator. A pileup yields one column per genomic position, and each column holds the set of reads that overlap that position. You can further iterate through this set.
consistent = 0
for column in bamfile.pileup('chr13', 23830650, 23832250):
    pos = column.pos
    counts = {base: 0 for base in "ACGTN"}
    for read in column.pileups:
        if not read.is_del:
            base = read.alignment.seq[read.qpos]
            counts[base] += 1
    # print out columns with any noise
    if max(counts.itervalues()) != sum(counts.values()):
        print pos, sorted(counts.items()), sum(counts.itervalues()) - max(counts.itervalues())
    else:
        consistent += 1
print "%d consistent columns" % consistent
23830729 [('A', 0), ('C', 2), ('G', 48), ('N', 0), ('T', 0)] 2 23830743 [('A', 1), ('C', 61), ('G', 0), ('N', 0), ('T', 0)] 1 23830746 [('A', 0), ('C', 1), ('G', 0), ('N', 0), ('T', 63)] 1 23830749 [('A', 0), ('C', 66), ('G', 1), ('N', 0), ('T', 0)] 1 23830752 [('A', 0), ('C', 1), ('G', 68), ('N', 0), ('T', 0)] 1 23830756 [('A', 1), ('C', 71), ('G', 0), ('N', 0), ('T', 0)] 1 23830758 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 72)] 1 23830760 [('A', 0), ('C', 73), ('G', 1), ('N', 0), ('T', 0)] 1 23830766 [('A', 75), ('C', 0), ('G', 4), ('N', 0), ('T', 0)] 4 23830770 [('A', 0), ('C', 1), ('G', 79), ('N', 0), ('T', 0)] 1 23830773 [('A', 0), ('C', 1), ('G', 0), ('N', 0), ('T', 80)] 1 23830777 [('A', 81), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23830780 [('A', 86), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23830789 [('A', 0), ('C', 1), ('G', 91), ('N', 0), ('T', 0)] 1 23830793 [('A', 88), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23830800 [('A', 1), ('C', 1), ('G', 0), ('N', 0), ('T', 78)] 2 23830812 [('A', 1), ('C', 78), ('G', 0), ('N', 0), ('T', 0)] 1 23830817 [('A', 78), ('C', 0), ('G', 0), ('N', 0), ('T', 1)] 1 23830821 [('A', 69), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23830826 [('A', 0), ('C', 1), ('G', 67), ('N', 0), ('T', 0)] 1 23830827 [('A', 68), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23830835 [('A', 0), ('C', 1), ('G', 1), ('N', 0), ('T', 68)] 2 23830837 [('A', 0), ('C', 0), ('G', 3), ('N', 0), ('T', 69)] 3 23830851 [('A', 0), ('C', 74), ('G', 0), ('N', 0), ('T', 1)] 1 23830857 [('A', 0), ('C', 1), ('G', 0), ('N', 0), ('T', 71)] 1 23830860 [('A', 1), ('C', 0), ('G', 72), ('N', 0), ('T', 0)] 1 23830861 [('A', 0), ('C', 1), ('G', 71), ('N', 0), ('T', 0)] 1 23830874 [('A', 74), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23830884 [('A', 0), ('C', 72), ('G', 1), ('N', 0), ('T', 0)] 1 23830886 [('A', 0), ('C', 0), ('G', 2), ('N', 0), ('T', 70)] 2 23830888 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 69)] 1 23830912 [('A', 0), ('C', 73), ('G', 1), ('N', 
0), ('T', 0)] 1 23830920 [('A', 0), ('C', 74), ('G', 0), ('N', 0), ('T', 1)] 1 23830926 [('A', 1), ('C', 0), ('G', 2), ('N', 0), ('T', 71)] 3 23830934 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 70)] 1 23830942 [('A', 80), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23830943 [('A', 77), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23830976 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 78)] 1 23830979 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 76)] 1 23830983 [('A', 1), ('C', 78), ('G', 0), ('N', 0), ('T', 0)] 1 23830988 [('A', 1), ('C', 0), ('G', 77), ('N', 0), ('T', 0)] 1 23830996 [('A', 0), ('C', 1), ('G', 75), ('N', 0), ('T', 0)] 1 23830997 [('A', 0), ('C', 2), ('G', 0), ('N', 0), ('T', 74)] 2 23831002 [('A', 76), ('C', 3), ('G', 0), ('N', 0), ('T', 0)] 3 23831003 [('A', 1), ('C', 77), ('G', 0), ('N', 0), ('T', 0)] 1 23831005 [('A', 81), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831007 [('A', 1), ('C', 0), ('G', 81), ('N', 0), ('T', 0)] 1 23831008 [('A', 0), ('C', 0), ('G', 82), ('N', 0), ('T', 1)] 1 23831020 [('A', 1), ('C', 0), ('G', 0), ('N', 0), ('T', 92)] 1 23831032 [('A', 94), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831033 [('A', 94), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831040 [('A', 0), ('C', 90), ('G', 1), ('N', 0), ('T', 0)] 1 23831047 [('A', 1), ('C', 0), ('G', 80), ('N', 0), ('T', 0)] 1 23831066 [('A', 73), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831072 [('A', 72), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831073 [('A', 0), ('C', 1), ('G', 71), ('N', 1), ('T', 0)] 2 23831080 [('A', 71), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831081 [('A', 71), ('C', 0), ('G', 0), ('N', 0), ('T', 1)] 1 23831083 [('A', 0), ('C', 0), ('G', 70), ('N', 0), ('T', 1)] 1 23831084 [('A', 1), ('C', 71), ('G', 0), ('N', 0), ('T', 0)] 1 23831094 [('A', 0), ('C', 73), ('G', 1), ('N', 0), ('T', 0)] 1 23831099 [('A', 0), ('C', 68), ('G', 1), ('N', 0), ('T', 0)] 1 23831100 [('A', 2), ('C', 0), ('G', 65), ('N', 0), ('T', 0)] 2 23831102 [('A', 65), ('C', 0), 
('G', 2), ('N', 0), ('T', 0)] 2 23831103 [('A', 0), ('C', 0), ('G', 65), ('N', 0), ('T', 2)] 2 23831105 [('A', 62), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831109 [('A', 0), ('C', 62), ('G', 2), ('N', 0), ('T', 0)] 2 23831111 [('A', 1), ('C', 64), ('G', 1), ('N', 0), ('T', 0)] 2 23831123 [('A', 59), ('C', 0), ('G', 2), ('N', 0), ('T', 0)] 2 23831129 [('A', 0), ('C', 61), ('G', 0), ('N', 0), ('T', 1)] 1 23831130 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 62)] 1 23831131 [('A', 63), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831137 [('A', 0), ('C', 0), ('G', 67), ('N', 0), ('T', 1)] 1 23831139 [('A', 0), ('C', 1), ('G', 67), ('N', 0), ('T', 0)] 1 23831140 [('A', 64), ('C', 1), ('G', 0), ('N', 1), ('T', 0)] 2 23831145 [('A', 0), ('C', 1), ('G', 0), ('N', 0), ('T', 68)] 1 23831148 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 66)] 1 23831149 [('A', 0), ('C', 0), ('G', 66), ('N', 0), ('T', 1)] 1 23831152 [('A', 62), ('C', 1), ('G', 1), ('N', 0), ('T', 0)] 2 23831154 [('A', 62), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831182 [('A', 81), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831185 [('A', 79), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831188 [('A', 0), ('C', 1), ('G', 76), ('N', 0), ('T', 0)] 1 23831192 [('A', 87), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831195 [('A', 86), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831196 [('A', 1), ('C', 0), ('G', 86), ('N', 0), ('T', 0)] 1 23831213 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 86)] 1 23831216 [('A', 0), ('C', 87), ('G', 0), ('N', 0), ('T', 1)] 1 23831225 [('A', 1), ('C', 1), ('G', 2), ('N', 0), ('T', 75)] 4 23831229 [('A', 1), ('C', 85), ('G', 0), ('N', 0), ('T', 0)] 1 23831230 [('A', 83), ('C', 0), ('G', 3), ('N', 0), ('T', 0)] 3 23831245 [('A', 0), ('C', 1), ('G', 104), ('N', 0), ('T', 1)] 2 23831252 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 109)] 1 23831255 [('A', 0), ('C', 113), ('G', 1), ('N', 0), ('T', 0)] 1 23831256 [('A', 0), ('C', 114), ('G', 1), ('N', 0), ('T', 0)] 1 23831257 
[('A', 113), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831259 [('A', 0), ('C', 1), ('G', 110), ('N', 0), ('T', 0)] 1 23831260 [('A', 1), ('C', 111), ('G', 0), ('N', 0), ('T', 0)] 1 23831262 [('A', 0), ('C', 111), ('G', 0), ('N', 0), ('T', 1)] 1 23831265 [('A', 0), ('C', 0), ('G', 112), ('N', 1), ('T', 0)] 1 23831266 [('A', 108), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831270 [('A', 0), ('C', 1), ('G', 1), ('N', 1), ('T', 107)] 3 23831275 [('A', 112), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831276 [('A', 0), ('C', 1), ('G', 114), ('N', 0), ('T', 0)] 1 23831281 [('A', 0), ('C', 1), ('G', 0), ('N', 0), ('T', 113)] 1 23831286 [('A', 112), ('C', 0), ('G', 0), ('N', 0), ('T', 1)] 1 23831291 [('A', 0), ('C', 2), ('G', 0), ('N', 0), ('T', 98)] 2 23831293 [('A', 102), ('C', 0), ('G', 0), ('N', 0), ('T', 1)] 1 23831300 [('A', 107), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831303 [('A', 1), ('C', 108), ('G', 0), ('N', 0), ('T', 0)] 1 23831315 [('A', 123), ('C', 1), ('G', 2), ('N', 0), ('T', 0)] 3 23831318 [('A', 1), ('C', 0), ('G', 1), ('N', 0), ('T', 126)] 2 23831319 [('A', 1), ('C', 0), ('G', 0), ('N', 0), ('T', 133)] 1 23831320 [('A', 0), ('C', 0), ('G', 133), ('N', 0), ('T', 1)] 1 23831322 [('A', 0), ('C', 136), ('G', 0), ('N', 0), ('T', 1)] 1 23831324 [('A', 133), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831326 [('A', 1), ('C', 0), ('G', 137), ('N', 0), ('T', 0)] 1 23831327 [('A', 0), ('C', 131), ('G', 1), ('N', 0), ('T', 1)] 2 23831335 [('A', 2), ('C', 0), ('G', 136), ('N', 0), ('T', 1)] 3 23831339 [('A', 1), ('C', 132), ('G', 0), ('N', 0), ('T', 0)] 1 23831343 [('A', 1), ('C', 131), ('G', 0), ('N', 0), ('T', 0)] 1 23831348 [('A', 130), ('C', 0), ('G', 0), ('N', 0), ('T', 2)] 2 23831349 [('A', 130), ('C', 0), ('G', 0), ('N', 0), ('T', 2)] 2 23831353 [('A', 2), ('C', 0), ('G', 0), ('N', 0), ('T', 134)] 2 23831359 [('A', 0), ('C', 1), ('G', 135), ('N', 0), ('T', 0)] 1 23831363 [('A', 0), ('C', 133), ('G', 1), ('N', 0), ('T', 1)] 2 23831366 [('A', 0), ('C', 
133), ('G', 1), ('N', 0), ('T', 0)] 1 23831369 [('A', 131), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831375 [('A', 2), ('C', 121), ('G', 0), ('N', 0), ('T', 0)] 2 23831376 [('A', 122), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831378 [('A', 118), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831382 [('A', 1), ('C', 117), ('G', 0), ('N', 0), ('T', 0)] 1 23831383 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 118)] 1 23831384 [('A', 0), ('C', 117), ('G', 0), ('N', 0), ('T', 1)] 1 23831386 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 118)] 1 23831394 [('A', 1), ('C', 115), ('G', 0), ('N', 0), ('T', 0)] 1 23831396 [('A', 110), ('C', 1), ('G', 0), ('N', 0), ('T', 2)] 3 23831399 [('A', 109), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831405 [('A', 107), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831420 [('A', 0), ('C', 2), ('G', 0), ('N', 0), ('T', 80)] 2 23831454 [('A', 59), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831469 [('A', 1), ('C', 76), ('G', 0), ('N', 0), ('T', 0)] 1 23831470 [('A', 75), ('C', 1), ('G', 0), ('N', 0), ('T', 1)] 2 23831471 [('A', 0), ('C', 79), ('G', 2), ('N', 0), ('T', 0)] 2 23831475 [('A', 0), ('C', 81), ('G', 1), ('N', 0), ('T', 0)] 1 23831477 [('A', 0), ('C', 79), ('G', 1), ('N', 0), ('T', 0)] 1 23831484 [('A', 1), ('C', 2), ('G', 0), ('N', 0), ('T', 70)] 3 23831485 [('A', 0), ('C', 1), ('G', 72), ('N', 0), ('T', 1)] 2 23831486 [('A', 1), ('C', 1), ('G', 71), ('N', 0), ('T', 0)] 2 23831488 [('A', 68), ('C', 0), ('G', 2), ('N', 0), ('T', 0)] 2 23831492 [('A', 0), ('C', 1), ('G', 80), ('N', 0), ('T', 0)] 1 23831493 [('A', 0), ('C', 1), ('G', 3), ('N', 0), ('T', 77)] 4 23831496 [('A', 0), ('C', 1), ('G', 78), ('N', 0), ('T', 0)] 1 23831502 [('A', 1), ('C', 0), ('G', 96), ('N', 0), ('T', 0)] 1 23831503 [('A', 0), ('C', 1), ('G', 0), ('N', 0), ('T', 94)] 1 23831510 [('A', 1), ('C', 99), ('G', 0), ('N', 0), ('T', 0)] 1 23831512 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 100)] 1 23831513 [('A', 100), ('C', 0), ('G', 1), ('N', 0), ('T', 
0)] 1 23831514 [('A', 0), ('C', 0), ('G', 100), ('N', 0), ('T', 1)] 1 23831515 [('A', 101), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831519 [('A', 0), ('C', 0), ('G', 105), ('N', 0), ('T', 1)] 1 23831527 [('A', 0), ('C', 1), ('G', 0), ('N', 0), ('T', 101)] 1 23831533 [('A', 0), ('C', 0), ('G', 90), ('N', 0), ('T', 1)] 1 23831540 [('A', 108), ('C', 0), ('G', 0), ('N', 0), ('T', 1)] 1 23831543 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 108)] 1 23831547 [('A', 109), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831552 [('A', 0), ('C', 107), ('G', 0), ('N', 0), ('T', 1)] 1 23831556 [('A', 101), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831557 [('A', 98), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831558 [('A', 0), ('C', 97), ('G', 1), ('N', 0), ('T', 0)] 1 23831559 [('A', 92), ('C', 2), ('G', 0), ('N', 0), ('T', 2)] 4 23831565 [('A', 1), ('C', 94), ('G', 0), ('N', 0), ('T', 0)] 1 23831566 [('A', 0), ('C', 100), ('G', 0), ('N', 0), ('T', 1)] 1 23831570 [('A', 1), ('C', 0), ('G', 106), ('N', 0), ('T', 0)] 1 23831577 [('A', 0), ('C', 1), ('G', 106), ('N', 0), ('T', 0)] 1 23831579 [('A', 109), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831580 [('A', 0), ('C', 0), ('G', 110), ('N', 0), ('T', 1)] 1 23831589 [('A', 120), ('C', 2), ('G', 0), ('N', 0), ('T', 0)] 2 23831590 [('A', 114), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831593 [('A', 1), ('C', 118), ('G', 0), ('N', 0), ('T', 0)] 1 23831594 [('A', 1), ('C', 2), ('G', 1), ('N', 0), ('T', 115)] 4 23831596 [('A', 0), ('C', 0), ('G', 119), ('N', 1), ('T', 0)] 1 23831604 [('A', 0), ('C', 1), ('G', 1), ('N', 0), ('T', 106)] 2 23831619 [('A', 111), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831620 [('A', 114), ('C', 0), ('G', 0), ('N', 0), ('T', 1)] 1 23831623 [('A', 1), ('C', 0), ('G', 1), ('N', 0), ('T', 121)] 2 23831625 [('A', 124), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831626 [('A', 0), ('C', 0), ('G', 2), ('N', 0), ('T', 122)] 2 23831629 [('A', 0), ('C', 0), ('G', 126), ('N', 0), ('T', 1)] 1 23831632 
[('A', 124), ('C', 0), ('G', 3), ('N', 0), ('T', 0)] 3 23831633 [('A', 1), ('C', 1), ('G', 124), ('N', 0), ('T', 0)] 2 23831634 [('A', 0), ('C', 1), ('G', 0), ('N', 0), ('T', 120)] 1 23831635 [('A', 0), ('C', 1), ('G', 0), ('N', 0), ('T', 119)] 1 23831638 [('A', 1), ('C', 0), ('G', 114), ('N', 0), ('T', 0)] 1 23831642 [('A', 0), ('C', 1), ('G', 1), ('N', 0), ('T', 112)] 2 23831643 [('A', 0), ('C', 0), ('G', 115), ('N', 0), ('T', 1)] 1 23831645 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 114)] 1 23831646 [('A', 0), ('C', 1), ('G', 0), ('N', 0), ('T', 117)] 1 23831649 [('A', 0), ('C', 2), ('G', 117), ('N', 0), ('T', 0)] 2 23831656 [('A', 0), ('C', 123), ('G', 1), ('N', 0), ('T', 1)] 2 23831659 [('A', 124), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831664 [('A', 0), ('C', 0), ('G', 122), ('N', 0), ('T', 1)] 1 23831667 [('A', 1), ('C', 117), ('G', 0), ('N', 0), ('T', 0)] 1 23831668 [('A', 1), ('C', 120), ('G', 0), ('N', 0), ('T', 1)] 2 23831672 [('A', 121), ('C', 1), ('G', 1), ('N', 0), ('T', 0)] 2 23831677 [('A', 122), ('C', 0), ('G', 0), ('N', 0), ('T', 1)] 1 23831679 [('A', 1), ('C', 0), ('G', 0), ('N', 0), ('T', 121)] 1 23831680 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 120)] 1 23831692 [('A', 0), ('C', 1), ('G', 0), ('N', 0), ('T', 116)] 1 23831695 [('A', 1), ('C', 0), ('G', 111), ('N', 0), ('T', 0)] 1 23831706 [('A', 117), ('C', 0), ('G', 0), ('N', 1), ('T', 0)] 1 23831708 [('A', 0), ('C', 1), ('G', 116), ('N', 0), ('T', 0)] 1 23831709 [('A', 0), ('C', 1), ('G', 0), ('N', 0), ('T', 116)] 1 23831715 [('A', 1), ('C', 116), ('G', 0), ('N', 0), ('T', 0)] 1 23831716 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 116)] 1 23831724 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 109)] 1 23831738 [('A', 0), ('C', 106), ('G', 0), ('N', 0), ('T', 1)] 1 23831741 [('A', 0), ('C', 0), ('G', 105), ('N', 0), ('T', 1)] 1 23831742 [('A', 1), ('C', 0), ('G', 99), ('N', 0), ('T', 0)] 1 23831743 [('A', 0), ('C', 98), ('G', 1), ('N', 0), ('T', 0)] 1 23831751 [('A', 0), ('C', 
0), ('G', 2), ('N', 0), ('T', 92)] 2 23831752 [('A', 0), ('C', 1), ('G', 94), ('N', 0), ('T', 0)] 1 23831753 [('A', 0), ('C', 0), ('G', 94), ('N', 0), ('T', 1)] 1 23831754 [('A', 0), ('C', 95), ('G', 0), ('N', 0), ('T', 2)] 2 23831755 [('A', 0), ('C', 1), ('G', 0), ('N', 0), ('T', 93)] 1 23831756 [('A', 1), ('C', 94), ('G', 0), ('N', 0), ('T', 0)] 1 23831757 [('A', 1), ('C', 96), ('G', 0), ('N', 0), ('T', 0)] 1 23831759 [('A', 96), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831765 [('A', 92), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831766 [('A', 0), ('C', 0), ('G', 95), ('N', 0), ('T', 2)] 2 23831768 [('A', 0), ('C', 86), ('G', 0), ('N', 0), ('T', 1)] 1 23831774 [('A', 0), ('C', 1), ('G', 0), ('N', 0), ('T', 84)] 1 23831780 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 83)] 1 23831786 [('A', 0), ('C', 1), ('G', 0), ('N', 0), ('T', 83)] 1 23831792 [('A', 0), ('C', 1), ('G', 2), ('N', 0), ('T', 85)] 3 23831803 [('A', 1), ('C', 0), ('G', 88), ('N', 0), ('T', 0)] 1 23831804 [('A', 0), ('C', 0), ('G', 3), ('N', 0), ('T', 86)] 3 23831809 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 78)] 1 23831813 [('A', 0), ('C', 0), ('G', 81), ('N', 0), ('T', 1)] 1 23831815 [('A', 1), ('C', 84), ('G', 0), ('N', 0), ('T', 0)] 1 23831825 [('A', 0), ('C', 85), ('G', 0), ('N', 0), ('T', 1)] 1 23831832 [('A', 1), ('C', 86), ('G', 0), ('N', 0), ('T', 1)] 2 23831834 [('A', 1), ('C', 0), ('G', 0), ('N', 0), ('T', 89)] 1 23831836 [('A', 89), ('C', 0), ('G', 0), ('N', 0), ('T', 1)] 1 23831837 [('A', 93), ('C', 0), ('G', 0), ('N', 0), ('T', 2)] 2 23831842 [('A', 98), ('C', 0), ('G', 0), ('N', 0), ('T', 1)] 1 23831847 [('A', 0), ('C', 103), ('G', 1), ('N', 0), ('T', 0)] 1 23831848 [('A', 104), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831849 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 103)] 1 23831854 [('A', 1), ('C', 0), ('G', 1), ('N', 0), ('T', 96)] 2 23831861 [('A', 1), ('C', 104), ('G', 0), ('N', 0), ('T', 0)] 1 23831872 [('A', 0), ('C', 0), ('G', 111), ('N', 0), ('T', 1)] 1 
23831877 [('A', 1), ('C', 115), ('G', 0), ('N', 0), ('T', 0)] 1 23831882 [('A', 0), ('C', 1), ('G', 1), ('N', 1), ('T', 129)] 3 23831889 [('A', 0), ('C', 1), ('G', 134), ('N', 0), ('T', 0)] 1 23831891 [('A', 131), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831894 [('A', 1), ('C', 136), ('G', 0), ('N', 0), ('T', 0)] 1 23831897 [('A', 1), ('C', 0), ('G', 130), ('N', 0), ('T', 0)] 1 23831898 [('A', 125), ('C', 0), ('G', 0), ('N', 0), ('T', 1)] 1 23831904 [('A', 126), ('C', 0), ('G', 0), ('N', 1), ('T', 0)] 1 23831911 [('A', 0), ('C', 134), ('G', 2), ('N', 0), ('T', 0)] 2 23831914 [('A', 0), ('C', 136), ('G', 1), ('N', 0), ('T', 0)] 1 23831919 [('A', 128), ('C', 0), ('G', 3), ('N', 0), ('T', 1)] 4 23831925 [('A', 0), ('C', 0), ('G', 131), ('N', 0), ('T', 1)] 1 23831932 [('A', 1), ('C', 0), ('G', 0), ('N', 0), ('T', 130)] 1 23831935 [('A', 0), ('C', 128), ('G', 1), ('N', 0), ('T', 0)] 1 23831936 [('A', 0), ('C', 128), ('G', 1), ('N', 0), ('T', 0)] 1 23831937 [('A', 121), ('C', 1), ('G', 1), ('N', 0), ('T', 0)] 2 23831938 [('A', 0), ('C', 130), ('G', 1), ('N', 0), ('T', 0)] 1 23831944 [('A', 0), ('C', 0), ('G', 0), ('N', 1), ('T', 132)] 1 23831945 [('A', 0), ('C', 0), ('G', 135), ('N', 0), ('T', 1)] 1 23831947 [('A', 134), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23831949 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 135)] 1 23831950 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 138)] 1 23831951 [('A', 0), ('C', 0), ('G', 2), ('N', 0), ('T', 137)] 2 23831953 [('A', 0), ('C', 0), ('G', 137), ('N', 0), ('T', 1)] 1 23831956 [('A', 0), ('C', 0), ('G', 2), ('N', 0), ('T', 137)] 2 23831958 [('A', 1), ('C', 0), ('G', 138), ('N', 0), ('T', 0)] 1 23831960 [('A', 132), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23831968 [('A', 128), ('C', 1), ('G', 1), ('N', 0), ('T', 0)] 2 23831969 [('A', 0), ('C', 127), ('G', 1), ('N', 0), ('T', 0)] 1 23831971 [('A', 127), ('C', 0), ('G', 0), ('N', 0), ('T', 1)] 1 23831973 [('A', 0), ('C', 129), ('G', 0), ('N', 0), ('T', 3)] 3 23831977 [('A', 
0), ('C', 0), ('G', 1), ('N', 0), ('T', 127)] 1 23831979 [('A', 1), ('C', 0), ('G', 0), ('N', 0), ('T', 119)] 1 23831985 [('A', 105), ('C', 1), ('G', 1), ('N', 0), ('T', 0)] 2 23832005 [('A', 91), ('C', 1), ('G', 0), ('N', 0), ('T', 0)] 1 23832010 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 89)] 1 23832011 [('A', 92), ('C', 0), ('G', 0), ('N', 0), ('T', 1)] 1 23832013 [('A', 1), ('C', 91), ('G', 0), ('N', 0), ('T', 0)] 1 23832019 [('A', 1), ('C', 98), ('G', 1), ('N', 0), ('T', 0)] 2 23832020 [('A', 98), ('C', 1), ('G', 1), ('N', 0), ('T', 0)] 2 23832026 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 107)] 1 23832034 [('A', 101), ('C', 0), ('G', 0), ('N', 0), ('T', 1)] 1 23832039 [('A', 0), ('C', 0), ('G', 92), ('N', 1), ('T', 0)] 1 23832040 [('A', 91), ('C', 0), ('G', 0), ('N', 1), ('T', 0)] 1 23832046 [('A', 83), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23832047 [('A', 0), ('C', 1), ('G', 0), ('N', 0), ('T', 81)] 1 23832050 [('A', 1), ('C', 0), ('G', 0), ('N', 0), ('T', 77)] 1 23832053 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 77)] 1 23832065 [('A', 1), ('C', 0), ('G', 76), ('N', 0), ('T', 0)] 1 23832071 [('A', 1), ('C', 75), ('G', 0), ('N', 0), ('T', 0)] 1 23832078 [('A', 0), ('C', 0), ('G', 1), ('N', 0), ('T', 65)] 1 23832081 [('A', 1), ('C', 65), ('G', 0), ('N', 0), ('T', 0)] 1 23832086 [('A', 0), ('C', 0), ('G', 65), ('N', 0), ('T', 1)] 1 23832088 [('A', 0), ('C', 0), ('G', 68), ('N', 0), ('T', 1)] 1 23832108 [('A', 0), ('C', 2), ('G', 1), ('N', 0), ('T', 63)] 3 23832111 [('A', 0), ('C', 1), ('G', 0), ('N', 0), ('T', 60)] 1 23832114 [('A', 2), ('C', 57), ('G', 0), ('N', 0), ('T', 0)] 2 23832117 [('A', 52), ('C', 1), ('G', 0), ('N', 0), ('T', 1)] 2 23832124 [('A', 44), ('C', 0), ('G', 0), ('N', 0), ('T', 1)] 1 23832137 [('A', 36), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23832141 [('A', 34), ('C', 1), ('G', 2), ('N', 0), ('T', 0)] 3 23832153 [('A', 30), ('C', 0), ('G', 1), ('N', 0), ('T', 0)] 1 23832156 [('A', 0), ('C', 1), ('G', 30), ('N', 0), ('T', 
0)] 1 23832167 [('A', 0), ('C', 1), ('G', 0), ('N', 0), ('T', 22)] 1 23832186 [('A', 0), ('C', 1), ('G', 1), ('N', 0), ('T', 8)] 2 23832190 [('A', 1), ('C', 1), ('G', 0), ('N', 0), ('T', 8)] 2 23832192 [('A', 1), ('C', 0), ('G', 0), ('N', 0), ('T', 6)] 1
1360 consistent columns
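To see the column logic in isolation, here is a small sketch that builds pileup-like columns from three invented reads, with no BAM file or Pysam involved. It assumes gapless alignments (no insertions or deletions), which real data does not guarantee. As in the code above, a column is "consistent" when every read shows the same base, and the mismatch count is the total depth minus the count of the most common base.

```python
# Toy alignments: (0-based start position, read sequence), no indels.
toy_reads = [
    (100, "ACGTAC"),
    (102, "GTACGG"),
    (103, "AACGGT"),  # first base disagrees with the other two reads
]

# Build one counts dictionary per reference position, like a pileup column.
columns = {}
for start, seq in toy_reads:
    for offset, base in enumerate(seq):
        counts = columns.setdefault(start + offset, dict.fromkeys("ACGTN", 0))
        counts[base] += 1

consistent = 0
noisy = []
for pos in sorted(columns):
    counts = columns[pos]
    # depth minus the majority base count = number of disagreeing reads
    mismatches = sum(counts.values()) - max(counts.values())
    if mismatches:
        noisy.append((pos, mismatches))
    else:
        consistent += 1

print(noisy)
print("%d consistent columns" % consistent)
```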
%matplotlib inline
import matplotlib.pyplot as plot
import numpy
x = []
y = []
for column in bamfile.pileup('chr13', 23830650, 23832250):
    x.append(column.pos)
    n = 0
    for read in column.pileups:
        if not read.is_del:
            n += 1
    y.append(n)
plot.figure(figsize=(15, 5))
plot.plot(x, y, 'b')
plot.plot([x[0], x[-1]], [numpy.mean(y[50:-50]), numpy.mean(y[50:-50])], ':r')
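The plotting code computes a depth per column and then averages `y[50:-50]`, that is, everything except the first and last 50 columns, presumably because depth tapers off near the ends of the region. The sketch below reproduces that calculation on a few invented read intervals, using a trim of 2 instead of 50 to suit the tiny example. All names and coordinates are made up for illustration.

```python
# Toy read intervals as 0-based, half-open (start, end) pairs.
toy_reads = [(0, 4), (1, 5), (2, 6), (2, 6), (4, 8)]

region_start, region_end = 0, 8
depth = [0] * (region_end - region_start)

# Add one to every position covered by each read, clipped to the region.
for start, end in toy_reads:
    for pos in range(max(start, region_start), min(end, region_end)):
        depth[pos - region_start] += 1

trim = 2  # the notebook code trims 50 columns from each end
interior = depth[trim:-trim]
mean_depth = sum(interior) / float(len(interior))

print(depth)
print("mean interior depth: %.2f" % mean_depth)
```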