consider a relation ABCD, A is key, there is an FD B -> C
the algorithm for bcnf decomposition: for each fd: If X --> Y violates BCNF, decompose R into R - Y and XY. repeat for all fds and all child tables
in this example, B is X, C is Y
R - Y = ABCD - C = ABD XY = BC the schema ABD, BC is in BCNF.
----- def decompose(r, fds): for each fd: if fd x->y violates bcnf in r: split r into r-y and xy recur on each, merge the results # if we get here, no fd was a violation, so just return the original table return [r]
# x -> y violates bcnf if x does not contain a key for R, AND R contains x, AND union(R, Y) is not empty # x is a superkey of R iff x -> R # (tangent: x is a key of R iff x is a superkey, and no subset of x is a superkey.)
attribute closure: section 19.3.2: attribute closure of some set of attributes X: closure = X repeat until there is no change: for each U->V if closure contains U: closure = union(closure, V) ------- example 2: "ABCDEFGH", [("A", "B"), ("ACD", "E"), ("EF", "GH")] first fd violates bcnf, split into: AB ACDEFGH suppose we also have an FD E->B: neither child table * contains all of the left hand size (E), and * contains at least 1 attribhte from the right hand side (B) Thus, we've already fixed that. -------
# recommend you convert your strings to sets, as they have many useful built in functions >>> set("ABCD") - set("C") set(['A', 'B', 'D']) # others useful operators: | (union), & (intersection), >=, <=, >, < (superset, subset, strict superset, strict subset)
# converting a set to a string >>> ''.join(set("ABCD") - set("C")) 'ABD'
# tangent: the join method: >>> '-'.join(["AB", "CD", "EF"]) 'AB-CD-EF'