Lossless join decomposition
In database design, a lossless join decomposition is a decomposition of a relation [math]\displaystyle{ R }[/math] into relations [math]\displaystyle{ R_1, R_2 }[/math] such that a natural join of the two smaller relations yields back the original relation. This is central in removing redundancy safely from databases while preserving the original data.[1]
Criteria
Lossless join can also be called nonadditive.[2]
If [math]\displaystyle{ R }[/math] is split into [math]\displaystyle{ R_1 }[/math] and [math]\displaystyle{ R_2 }[/math], for this decomposition to be lossless (i.e., [math]\displaystyle{ R_1 \bowtie R_2 = R }[/math]) then at least one of the two following criteria should be met.
Check 1: Verify join explicitly
Projecting on [math]\displaystyle{ R_1 }[/math] and [math]\displaystyle{ R_2 }[/math], and joining them back, results in the relation you started with.[3][unreliable source?]
Check 2: Via functional dependencies
Let [math]\displaystyle{ R }[/math] be a relation schema.
Let F be a set of functional dependencies on [math]\displaystyle{ R }[/math].
Let [math]\displaystyle{ R_1 }[/math] and [math]\displaystyle{ R_2 }[/math] form a decomposition of [math]\displaystyle{ R }[/math] .
The decomposition is lossless if one of the sub-relations (i.e. [math]\displaystyle{ R_1 }[/math] or [math]\displaystyle{ R_2 }[/math]) is a subset of the closure of their intersection. In other words, the decomposition is a lossless-join decomposition of [math]\displaystyle{ R }[/math] if at least one of the following functional dependencies are in F+ (where F+ stands for the closure for every attribute or attribute sets in F):[4]
- [math]\displaystyle{ R_1 \cap R_2 \rightarrow R_1 }[/math]
- [math]\displaystyle{ R_1 \cap R_2 \rightarrow R_2 }[/math]
Examples
- Let [math]\displaystyle{ R = (A, B, C, D) }[/math] be the relation schema, with attributes A, B, C and D.
- Let [math]\displaystyle{ F = \{ A \rightarrow BC \} }[/math] be the set of functional dependencies.
- Decomposition into [math]\displaystyle{ R_1 = (A, B, C) }[/math] and [math]\displaystyle{ R_2 = (A, D) }[/math] is lossless under F because [math]\displaystyle{ R_1 \cap R_2 = A) }[/math]. A is a superkey in [math]\displaystyle{ R_1 }[/math], meaning we have a functional dependency [math]\displaystyle{ \{A \rightarrow BC\} }[/math]. In other words, now we have proven that [math]\displaystyle{ (R_1 \cap R_2 \rightarrow R_1) \in F^+ }[/math].
References
- ↑ Pohler, K (2015). "Lossless-Join Decomposition: applications in quantitative computing metrics". International Journal of Applied Computer Science 21 (4): 190–212.
- ↑ Elmasri, Ramez (2016). Fundamentals of database systems (Seventh ed.). Hoboken, NJ: Pearson. p. 461. ISBN 978-0133970777.
- ↑ "Lossless Join Property". https://stackoverflow.com/questions/5771810/lossless-join-property.
- ↑ "Lossless Join Decomposition". University at Buffalo (Jan Chomicki). http://www.cse.buffalo.edu/~chomicki/560/handout-design.pdf.
- ↑ "Lossless-Join Decomposition". http://www.cs.sfu.ca/CourseCentral/354/zaiane/material/notes/Chapter7/node7.html.
- ↑ "www.data-e-education.com - Lossless Join Decomposition". http://www.data-e-education.com/E121_Lossless_Join_Decomposition.html.
Original source: https://en.wikipedia.org/wiki/Lossless join decomposition.
Read more |