Compgen Tool Suite

Review session - BCNF decomposition
jbrian / Version 1
consider a relation ABCD, A is key, there is an FD B -> C

the algorithm for bcnf decomposition:
for each fd:
If X --> Y violates BCNF,
decompose R into R - Y and XY.
repeat for all fds and all child tables

in this example, B is X, C is Y

R - Y = ABCD - C = ABD
XY = BC
the schema ABD, BC is in BCNF.

-----
def decompose(r, fds):
    for each fd:
        if fd x->y violates bcnf in r:
            split r into r-y and xy
            recur on each, merge the results
    # if we get here, no fd was a violation, so just return the original table
    return [r]
    
print decompose("CSJDPQV", [("SD", "P"), ("J", "S"), ("JP", "C")])
print
print decompose("ABCDEFGH", [("A", "B"), ("ACD", "E"), ("EF", "GH")])
    
# x -> y violates bcnf if x does not contain a key for R, 
    AND R contains x, AND union(R, Y) is not empty
# x is a superkey of R iff x -> R
    # (tangent: x is a key of R iff x is a superkey, and no subset of x is a superkey.)
    
attribute closure: section 19.3.2:
attribute closure of some set of attributes X:
closure = X
repeat until there is no change:
    for each U->V
        if closure contains U:
            closure = union(closure, V)
-------
example 2:
"ABCDEFGH", [("A", "B"), ("ACD", "E"), ("EF", "GH")]
first fd violates bcnf, split into:
AB  ACDEFGH
suppose we also have an FD E->B:
neither child table
    * contains all of the left hand size (E), and
    * contains at least 1 attribhte from the right hand side (B)
Thus, we've already fixed that.
-------

# recommend you convert your strings to sets, as they have many useful built in functions
>>> set("ABCD") - set("C")
set(['A', 'B', 'D'])
# others useful operators: | (union), & (intersection), >=, <=, >, < (superset, subset, strict superset, strict subset)

# converting a set to a string
>>> ''.join(set("ABCD") - set("C"))
'ABD'

# tangent: the join method:
>>> '-'.join(["AB", "CD", "EF"])
'AB-CD-EF'