# Welcome to Jupyter

Jupyter is an integrated, literate, and web-based programming environment. In Jupyter, you can author executable *Notebooks* that combine code, output, and documentation. It is dynamic and interpreted, and it encourages exploration and experimentation.

A Jupyter notebook is divided into *cells*. Each cell can be executed and edited independently. You can also save the contents of cells to produce stand-alone programs if you wish. Jupyter supports many programming languages, such as Python, R, and Julia. In Comp555 we will primarily use Python 3.

In [None]:
for i in range(0,91,7):
    if (i == 42):
        print(i, "cool")
    else:
        print(i)

In [None]:
# list comprehensions

x = [base for base in "ABCDEFG"]
print(x)
x = [base for i, base in enumerate("ABCDEFGHIJKLM") if i % 3 == 0]
print(x)
x = {base: i+1 for i, base in enumerate("ABCDEFGHIJKLM") if i % 2 == 1}
print(x)
x = list(x)
print(x)
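If comprehensions are new to you, it may help to read the filtered list comprehension above as shorthand for an explicit loop. The following is a minimal sketch of that equivalence; the names `letters`, `looped`, and `comprehended` are illustrative and do not appear elsewhere in the notebook.

```python
letters = "ABCDEFGHIJKLM"

# Explicit loop: keep every third letter, starting at index 0.
looped = []
for i, base in enumerate(letters):
    if i % 3 == 0:
        looped.append(base)

# The same selection written as a list comprehension.
comprehended = [base for i, base in enumerate(letters) if i % 3 == 0]

print(looped)
print(looped == comprehended)
```

Note also that calling `list()` on a dictionary, as the last two lines of the cell above do, returns only its keys.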

In [None]:
# In Python, the remainder of a line after a "pound sign" (a.k.a. hashtag) is a comment.
N = 16
x = range(-N,N+1,2)    # range(N) -> [0, 1,..., N-1]; range(N,M) -> [N, N+1, ..., M-2, M-1]
                       # range(N,M,S) -> [N, N+S, N+2S, ..., N+kS < M]

print(list(x), len(x))
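The comments in the cell above describe how `range` treats its stop value. As a small sketch of those rules (the names `evens` and `r` are illustrative), the following checks that the stop value is exclusive and that a `range` can be measured and indexed without first converting it to a list.

```python
N = 16

# The stop value is exclusive, so N + 1 is needed to include N itself.
evens = list(range(-N, N + 1, 2))
print(evens[0], evens[-1], len(evens))

# A range is lazy: it supports len() and indexing without building a list.
r = range(-N, N + 1, 2)
print(len(r), r[3])
```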

In [None]:
y = [v*v for v in x]
z = [100-3*v for v in x]
print(y)
print(z)

In [None]:
%matplotlib inline
import matplotlib.pyplot as plot

result = plot.plot(x,y)
result = plot.plot(x,z)

# Next we'll consider some string manipulations

In [None]:
import random

bases = ["ACGT"[random.randint(0,3)] for i in range(100)]
dna = ''.join(bases)

print(dna)

In [None]:
dna[::-2]

In [None]:
for i in range(10,len(dna),10):
    print(dna[:i], dna[i:])
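Because `dna` is random, the slicing results above change on every run. The sketch below uses a short fixed string (`s`, chosen only for illustration) so that the behavior of each slice can be checked by hand.

```python
s = "ACGTACGTAC"

# A negative step walks the string backwards.
print(s[::-1])   # every character, reversed
print(s[::-2])   # every second character, starting from the last one

# s[:i] and s[i:] always split the string into two pieces that rejoin exactly.
i = 4
print(s[:i], s[i:])
print(s[:i] + s[i:] == s)
```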

In [None]:
def kmers(seq, k):
    return [seq[i:i+k] for i in range(len(seq)-k+1)]

def reverseComp(seq):
    return ''.join([{'A':'T','C':'G','G':'C','T':'A'}[c] for c in reversed(seq)])

kmerList = kmers(dna, 4)
print(dna)
print(kmerList)

palindromes = [kmer for kmer in kmerList if reverseComp(kmer) == kmer]
print(set(palindromes))

repeats = [kmer for kmer in kmerList if kmerList.count(kmer) > 1]
print(repeats)

unique = [kmer for kmer in kmerList if kmerList.count(kmer) == 1]
print(unique)
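The `repeats` and `unique` lists above call `kmerList.count` once per k-mer, which rescans the whole list each time. One possible alternative, sketched here under the assumption that only the distinct k-mers are of interest, is to tally everything in a single pass with `collections.Counter`. The sequence `seq` and the names `counts`, `repeated`, and `unique_kmers` are illustrative.

```python
from collections import Counter

def kmers(seq, k):
    # Same helper as in the notebook: all overlapping substrings of length k.
    return [seq[i:i+k] for i in range(len(seq) - k + 1)]

seq = "ACGTACGTTT"  # short illustrative sequence, not the random dna above
counts = Counter(kmers(seq, 4))

# Each distinct k-mer appears once in these lists, unlike the notebook's
# `repeats`, which lists a repeated k-mer once per occurrence.
repeated = sorted(kmer for kmer, n in counts.items() if n > 1)
unique_kmers = sorted(kmer for kmer, n in counts.items() if n == 1)

print(repeated)
print(unique_kmers)
```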

In [None]:
def reverseCompV2(seq):
    return ''.join(reversed(seq.translate(str.maketrans("ACGT", "TGCA"))))

assert reverseComp(dna) == reverseCompV2(dna)

%timeit reverseComp(dna)
%timeit reverseCompV2(dna)
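`%timeit` is an IPython magic and is only available inside Jupyter or IPython. If you save a cell as a stand-alone program, a rough equivalent can be sketched with the standard library `timeit` module. The function names and the fixed test sequence below are illustrative, and the timings will vary by machine.

```python
import timeit

def reverse_comp_dict(seq):
    # Dictionary lookup per base, as in reverseComp.
    comp = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
    return ''.join(comp[c] for c in reversed(seq))

def reverse_comp_translate(seq):
    # Translation table, as in reverseCompV2.
    return ''.join(reversed(seq.translate(str.maketrans("ACGT", "TGCA"))))

seq = "ACGT" * 25  # fixed 100-base sequence for repeatable input

# Total seconds for 1000 calls of each version.
t_dict = timeit.timeit(lambda: reverse_comp_dict(seq), number=1000)
t_trans = timeit.timeit(lambda: reverse_comp_translate(seq), number=1000)

print("dict lookup:", t_dict)
print("translate:  ", t_trans)
```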

# For more challenges let's look at [Rosalind](http://rosalind.info)

Let's try [problem 1](http://rosalind.info/problems/dna/).

In [None]:
dna = "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC"

print(" ".join([str(dna.count(base)) for base in "ACGT"]))

## That was easy

Let's try [problem 2](http://rosalind.info/problems/rna/).

In [None]:
dna = "GATGGAACTTGACTACGTAAATT"

print(dna.replace('T','U'))

In [None]:
dna = "GATGGAACTTGACTACGTAAATT"

print(''.join(['U' if (base == 'T') else base for base in dna]))

## Also simple

Let's try [problem 3](http://rosalind.info/problems/revc/).

In [None]:
dna = "AAAACCCGGT"

print(''.join([{'A':'T','C':'G','G':'C','T':'A'}[base] for base in reversed(dna)]))

## Will they ever require more than one line?

Let's try [problem 4](http://rosalind.info/problems/fib/).

In [None]:
def rabbits(generations, pairsPerLitter):
    sequence = [0,1]
    while (len(sequence) - 1 < generations):
        sequence.append(sequence[-2]*pairsPerLitter + sequence[-1])
    return sequence[-1]

print(rabbits(5, 3))
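The `rabbits` function above builds the sequence defined by F(n) = F(n-1) + k·F(n-2), with F(1) = F(2) = 1, where k is the number of pairs per litter. Since only the last two terms are ever needed, the same value can be computed without storing the whole list. This is a sketch of that variation, not a replacement for the version above; the name `rabbit_pairs` is illustrative.

```python
def rabbit_pairs(generations, pairs_per_litter):
    # previous and current hold F(n-1) and F(n); start at F(0) = 0, F(1) = 1.
    previous, current = 0, 1
    for _ in range(generations - 1):
        previous, current = current, current + pairs_per_litter * previous
    return current

print(rabbit_pairs(5, 3))
```

With one pair per litter the recurrence reduces to the ordinary Fibonacci sequence, which gives a quick sanity check.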

# Now try one on your own

Let's try [problem 5](http://rosalind.info/problems/gc/).

In [None]:
dna = "CCACCCTCGTGGTATGGCTAGGCATTCAGGAACCGGAGAACGCTTCAGACCAGCCCGGACTGGGAACCTGCGGGCAGTAGGTGGAAT"

100.0*(dna.count('C') + dna.count('G'))/len(dna)