Monotonic query

From HandWiki

In database theory and systems, a monotonic query is one that does not lose any tuples it previously made output, with the addition of new tuples in the database. Formally, a query q over a schema R is monotonic if and only if for every two instances I, J of R, [math]\displaystyle{ I \subseteq J \Rightarrow q(I) \subseteq q(J) }[/math] (q must be a monotonic function). [1]

An example of a monotonic query is a select-project-join query containing only conditions of equality (also known as conjunctive queries). Examples of non-monotonic queries are aggregation queries, or queries with set difference.

Identifying whether a query is monotonic can be crucial for query optimization, especially in view maintenance and data stream management. Since the answer set for a monotonic query can only grow as more tuples are added to the database, query processing may be optimized by executing only the new portions of the database and adding the new results to the existing answer set.

Applications

Unnesting Queries

Monotonic queries are important in the topic of unnesting SQL queries. If a query is monotonic, it implies that a nested query can actually be unnested.

Data streams

A data stream is a real-time, continuous, ordered (implicitly by arrival time or explicitly by timestamp) sequence of items. The number of items is considered to be infinite and therefore cannot feasibly be stored in its entirety. Queries over data streams are often called continuous or long-running queries, and are mostly run over a limited window of tuples in the stream. To evaluate a continuous query, one can simply reevaluate the query over newly arrived tuples, and append the new tuples to the existing result set. More formally, let A(Q, t) be the answer set of a continuous query Q at time t, τ be the current time, and 0 the start time. Then, if Q is monotonic, its result set at time τ is

[math]\displaystyle{ A(Q, \tau) =\bigcup_{t = 1}^{\tau} (A(Q,t) - A(Q, t-1)) \cup A(Q,0) }[/math]

In contrast, non-monotonic queries have the following answer semantics:

[math]\displaystyle{ A(Q, \tau) =\bigcup_{t = 0}^{\tau} A(Q,t) }[/math]

[2]

References

  1. Abiteboul, Serge; Richard Hull; Victor Vianu (1994). Foundations Of Databases. Addison-Wesley. ISBN 9780201537710. https://archive.org/details/foundationsofdat0000abit. 
  2. Golab, Lukasz; M. Tamer Ozsu (June 2003). "Issues in Data Stream Management". SIGMOD Record 32 (2): 5–14. doi:10.1145/776985.776986.