Join dependency

In database theory, a join dependency is a constraint on the set of legal relations over a database scheme. A table [math]\displaystyle{ T }[/math] is subject to a join dependency if [math]\displaystyle{ T }[/math] can always be recreated by joining multiple tables, each having a subset of the attributes of [math]\displaystyle{ T }[/math]. If one of the tables in the join has all the attributes of the table [math]\displaystyle{ T }[/math], the join dependency is called trivial.

The join dependency plays an important role in the fifth normal form, also known as project-join normal form, because it can be proven that if a scheme [math]\displaystyle{ R }[/math] is decomposed into tables [math]\displaystyle{ R_1 }[/math] to [math]\displaystyle{ R_n }[/math], the decomposition is a lossless-join decomposition if the legal relations on [math]\displaystyle{ R }[/math] are restricted to those satisfying the join dependency [math]\displaystyle{ *(R_1,R_2,\ldots,R_n) }[/math] on [math]\displaystyle{ R }[/math].

Another way to describe a join dependency is to say that the relationships in the join dependency are independent of each other.

Unlike in the case of functional dependencies, there is no sound and complete axiomatization for join dependencies,[1] though axiomatizations exist for more expressive dependency languages such as full typed dependencies.[2]:Chapter 8 However, implication of join dependencies is decidable.[2]:Theorem 8.4.12

Formal definition

Let [math]\displaystyle{ R }[/math] be a relation schema and let [math]\displaystyle{ R_1, R_2, \ldots, R_n }[/math] be a decomposition of [math]\displaystyle{ R }[/math].

The relation [math]\displaystyle{ r(R) }[/math] satisfies the join dependency

[math]\displaystyle{ *(R_1,R_2,\ldots,R_n) }[/math] if [math]\displaystyle{ \bowtie_{i = 1}^n \Pi_{R_i}(r) = r. }[/math]

A join dependency is trivial if one of the [math]\displaystyle{ R_i }[/math] is [math]\displaystyle{ R }[/math] itself.[3]
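The definition can be checked mechanically on a small relation instance. The following is a minimal sketch, assuming a relation is given as a list of dictionaries mapping attribute names to values; the helper names project, natural_join and satisfies_jd and the attributes A, B, C are hypothetical and chosen only for illustration.

```python
from functools import reduce


def project(rows, attrs):
    """Projection onto attrs; using a set removes duplicate tuples."""
    return {frozenset((a, row[a]) for a in attrs) for row in rows}


def natural_join(left, right):
    """Natural join of two sets of tuples, each a frozenset of (attr, value)."""
    result = set()
    for l in left:
        l_values = dict(l)
        for r in right:
            # Two tuples combine when they agree on every shared attribute.
            if all(l_values.get(a, v) == v for a, v in r):
                result.add(l | r)
    return result


def satisfies_jd(rows, components):
    """True if joining the projections onto each component gives back rows.

    Assumes components is non-empty and together covers all attributes.
    """
    whole = {frozenset(row.items()) for row in rows}
    joined = reduce(natural_join, (project(rows, c) for c in components))
    return joined == whole


# Illustrative data: the join of the AB and AC projections has four tuples.
r_small = [
    {"A": 1, "B": 1, "C": 1},
    {"A": 1, "B": 2, "C": 2},
]
r_full = r_small + [
    {"A": 1, "B": 1, "C": 2},
    {"A": 1, "B": 2, "C": 1},
]

print(satisfies_jd(r_small, [("A", "B"), ("A", "C")]))
print(satisfies_jd(r_full, [("A", "B"), ("A", "C")]))
print(satisfies_jd(r_small, [("A", "B", "C"), ("A", "B")]))
```

The first relation violates *(AB, AC) because the join produces two tuples that are not in it, while the second contains all four combinations. The last call uses a trivial join dependency, which holds for any instance. Note that this tests a single instance only; a join dependency as a constraint must hold for every legal relation over the scheme.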

Binary (2-ary) join dependencies are called multivalued dependencies, a historical artifact of the fact that they were studied before the general case. More specifically, if U is a set of attributes and R a relation over it, then R satisfies [math]\displaystyle{ X \twoheadrightarrow Y }[/math] if and only if R satisfies [math]\displaystyle{ *(X\cup Y, X\cup(U-Y)). }[/math]
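The equivalence can be illustrated on small instances by testing the multivalued dependency directly and comparing the result with the binary join dependency. This is a sketch under stated assumptions: relations are lists of dictionaries, the multivalued dependency is tested with the usual tuple-swapping definition, and the function names and the course/teacher/book attributes are hypothetical.

```python
from itertools import product


def holds_mvd(rows, x, y):
    """Check X ->> Y: for tuples t1, t2 agreeing on X, the tuple taking
    X and Y from t1 and all remaining attributes from t2 must be present."""
    attrs = set(rows[0]) if rows else set()
    rest = attrs - set(x) - set(y)
    present = {frozenset(r.items()) for r in rows}
    for t1, t2 in product(rows, repeat=2):
        if all(t1[a] == t2[a] for a in x):
            mixed = {a: (t2[a] if a in rest else t1[a]) for a in attrs}
            if frozenset(mixed.items()) not in present:
                return False
    return True


def holds_binary_jd(rows, x, y):
    """Check *(X u Y, X u (U - Y)) by projecting and joining again."""
    attrs = set(rows[0]) if rows else set()
    left_attrs = set(x) | set(y)
    right_attrs = set(x) | (attrs - set(y))
    shared = left_attrs & right_attrs
    left = {frozenset((a, r[a]) for a in left_attrs) for r in rows}
    right = {frozenset((a, r[a]) for a in right_attrs) for r in rows}
    joined = {
        l | r
        for l in left
        for r in right
        if all(dict(l)[a] == dict(r)[a] for a in shared)
    }
    return joined == {frozenset(r.items()) for r in rows}


# Illustrative data: teachers and books of a course vary independently
# only in the second relation.
partial = [
    {"course": "db", "teacher": "smith", "book": "ullman"},
    {"course": "db", "teacher": "jones", "book": "date"},
]
complete = partial + [
    {"course": "db", "teacher": "smith", "book": "date"},
    {"course": "db", "teacher": "jones", "book": "ullman"},
]

for rel in (partial, complete):
    print(holds_mvd(rel, ("course",), ("teacher",)),
          holds_binary_jd(rel, ("course",), ("teacher",)))
```

On both relations the two checks agree, as the stated equivalence requires: both fail on the first relation and both succeed on the second.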

Example

Consider a pizza chain that models purchases in a table Order = {order-number, customer-name, pizza-name, courier}. The following dependencies can be derived:

  • customer-name depends on order-number
  • pizza-name depends on order-number
  • courier depends on order-number

Since these relationships are independent of each other, the following join dependency holds: *((order-number, customer-name), (order-number, pizza-name), (order-number, courier)).

If each customer has their own courier, however, there can be a join dependency like this: *((order-number, customer-name), (order-number, pizza-name), (order-number, courier), (customer-name, courier)), but *((order-number, customer-name, courier), (order-number, pizza-name)) would be valid as well. This shows that just having a join dependency is not enough to normalize a database scheme, since more than one decomposition may be valid.
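The join dependencies of this example can be tried out on sample data. The sketch below assumes a small, made-up Order table in which each customer always has the same courier; the rows, the helper names and the final lossy decomposition are illustrative additions, not part of the original example.

```python
from functools import reduce

# Hypothetical sample data: each customer always gets the same courier.
ORDERS = [
    {"order-number": 1, "customer-name": "Ada", "pizza-name": "Margherita", "courier": "Kim"},
    {"order-number": 2, "customer-name": "Ada", "pizza-name": "Funghi", "courier": "Kim"},
    {"order-number": 3, "customer-name": "Bob", "pizza-name": "Margherita", "courier": "Lee"},
]


def project(rows, attrs):
    """Projection onto attrs, as a set of frozensets of (attr, value)."""
    return {frozenset((a, row[a]) for a in attrs) for row in rows}


def join(left, right):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = set()
    for l in left:
        seen = dict(l)
        for r in right:
            if all(seen.get(a, v) == v for a, v in r):
                out.add(l | r)
    return out


def recreates(rows, components):
    """True if joining the projections onto components gives back rows."""
    whole = {frozenset(row.items()) for row in rows}
    return reduce(join, (project(rows, c) for c in components)) == whole


three_way = [
    ("order-number", "customer-name"),
    ("order-number", "pizza-name"),
    ("order-number", "courier"),
]
with_courier_pair = three_way + [("customer-name", "courier")]
two_way = [
    ("order-number", "customer-name", "courier"),
    ("order-number", "pizza-name"),
]
# Added for contrast: joining on customer-name mixes up Ada's two orders.
lossy = [
    ("customer-name", "pizza-name"),
    ("order-number", "customer-name", "courier"),
]

for name, comps in [("three_way", three_way),
                    ("with_courier_pair", with_courier_pair),
                    ("two_way", two_way),
                    ("lossy", lossy)]:
    print(name, recreates(ORDERS, comps))
```

On this data the three decompositions from the example all recreate the table, while the contrasting decomposition produces spurious rows. Since several decompositions are valid at once, the check alone does not say which one to prefer.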

References

  1. Petrov, S. V. (1989). "Finite axiomatization of languages for representation of system properties". Information Sciences 47: 339–372. doi:10.1016/0020-0255(89)90006-6. 
  2. Abiteboul; Hull; Vianu (1995). Foundations of databases. Addison-Wesley. ISBN 9780201537710. https://archive.org/details/foundationsofdat0000abit.
  3. Silberschatz, Korth. Database System Concepts (1st ed.).