Space hierarchy theorem

In computational complexity theory, the space hierarchy theorems are separation results that show that both deterministic and nondeterministic machines can solve more problems in (asymptotically) more space, subject to certain conditions. For example, a deterministic Turing machine can solve more decision problems in space n log n than in space n. The somewhat weaker analogous theorems for time are the time hierarchy theorems. The foundation for the hierarchy theorems lies in the intuition that with either more time or more space comes the ability to compute more functions (or decide more languages). The hierarchy theorems are used to demonstrate that the time and space complexity classes form a hierarchy where classes with tighter bounds contain fewer languages than those with more relaxed bounds. Here we define and prove the space hierarchy theorem.

The space hierarchy theorems rely on the concept of space-constructible functions. The deterministic and nondeterministic space hierarchy theorems state that for all space-constructible functions f(n),

[math]\displaystyle{ \mathsf{SPACE}\left(o(f(n))\right) \subsetneq \mathsf{SPACE}(f(n)) }[/math],

where SPACE stands for either DSPACE or NSPACE, and o refers to the little o notation.

Statement

Formally, a function [math]\displaystyle{ f:\mathbb{N} \longrightarrow \mathbb{N} }[/math] is space-constructible if [math]\displaystyle{ f(n) \ge \log~n }[/math] and there exists a Turing machine which computes the function [math]\displaystyle{ f(n) }[/math] in space [math]\displaystyle{ O(f(n)) }[/math] when starting with an input [math]\displaystyle{ 1^n }[/math], where [math]\displaystyle{ 1^n }[/math] represents a string of n consecutive 1s. Most common functions are space-constructible, including polynomials, exponentials, and logarithms.
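
As a concrete illustration (a hedged sketch, not part of the original article): the function [math]\displaystyle{ f(n) = \lceil \log_2 n \rceil }[/math] is space-constructible because a machine can count the symbols of [math]\displaystyle{ 1^n }[/math] in binary, and the counter occupies only [math]\displaystyle{ O(\log n) }[/math] worktape cells. The Python below is only an analogy for such a machine; the function name and the string-based tape are illustrative choices.

```python
# Illustrative sketch only: a Turing machine would keep `counter` in binary
# on its worktape, which takes O(log n) cells on input 1^n.
def constructible_log(input_tape: str) -> int:
    """Given the unary input 1^n, return roughly log2(n): the number of
    worktape cells the machine could mark off within its O(log n) budget."""
    assert set(input_tape) <= {"1"}, "input is expected to have the form 1^n"
    counter = 0
    for _ in input_tape:   # one left-to-right pass over the read-only input
        counter += 1       # the binary counter never needs more than O(log n) bits
    return max(counter.bit_length(), 1)
```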

For every space-constructible function [math]\displaystyle{ f:\mathbb{N} \longrightarrow \mathbb{N} }[/math], there exists a language L that is decidable in space [math]\displaystyle{ O(f(n)) }[/math] but not in space [math]\displaystyle{ o(f(n)) }[/math].

Proof

The goal is to define a language that can be decided in space [math]\displaystyle{ O(f(n)) }[/math] but not in space [math]\displaystyle{ o(f(n)) }[/math]. The language L is defined as follows:

[math]\displaystyle{ L = \{~ (\langle M \rangle, 10^k): M \mbox{ uses space } \le f(|\langle M \rangle, 10^k|) \mbox{ and time } \le 2^{f(|\langle M \rangle, 10^k|)} \mbox{ and } M \mbox{ does not accept } (\langle M \rangle, 10^k) ~ \} }[/math]

For any machine M that decides a language in space [math]\displaystyle{ o(f(n)) }[/math], L differs from the language of M in at least one place. Namely, for some large enough k, M uses space at most [math]\displaystyle{ f(|\langle M \rangle, 10^k|) }[/math] on input [math]\displaystyle{ (\langle M \rangle, 10^k) }[/math]; and since M halts while using that little space, it can pass through only boundedly many configurations, so for large enough k it also respects the time bound [math]\displaystyle{ 2^{f(|\langle M \rangle, 10^k|)} }[/math]. By the definition of L, the two languages therefore disagree on that input.

On the other hand, L is in [math]\displaystyle{ \mathsf{SPACE}(f(n)) }[/math]. The algorithm for deciding the language L is as follows:

  1. On an input x, compute [math]\displaystyle{ f(|x|) }[/math] using space-constructibility, and mark off [math]\displaystyle{ f(|x|) }[/math] cells of tape. Whenever an attempt is made to use more than [math]\displaystyle{ f(|x|) }[/math] cells, reject.
  2. If x is not of the form [math]\displaystyle{ \langle M \rangle, 10^k }[/math] for some TM M, reject.
  3. Simulate M on input x for at most [math]\displaystyle{ 2^{f(|x|)} }[/math] steps (using [math]\displaystyle{ f(|x|) }[/math] space). If the simulation tries to use more than [math]\displaystyle{ f(|x|) }[/math] space or more than [math]\displaystyle{ 2^{f(|x|)} }[/math] operations, then reject.
  4. If M accepted x during this simulation, then reject; otherwise, accept.

Note on step 3: Execution is limited to [math]\displaystyle{ 2^{f(|x|)} }[/math] steps in order to avoid the case where M does not halt on the input x, that is, the case where M uses only [math]\displaystyle{ O(f(|x|)) }[/math] space as required but runs forever.
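
The following Python sketch mirrors the four steps above; it is an illustration only. The helpers f, decode and simulate are hypothetical stand-ins for the space-constructibility routine, for the check that the input has the form ⟨M⟩10^k, and for the space- and time-bounded simulation of M; none of them is specified by the article.

```python
from typing import Callable, Optional, Tuple

# Hedged sketch of the diagonalizing decider. Assumed (hypothetical) interfaces:
#   f(m)                  -- the space-constructible bound, evaluated at |x|
#   decode(x)             -- returns (M, k) if x has the form <M>10^k, else None
#   simulate(M, x, s, t)  -- runs M on x and returns "accept", "reject", or
#                            "out-of-bounds" if M needs more than s space
#                            or more than t steps

def diagonal_decider(
    x: str,
    f: Callable[[int], int],
    decode: Callable[[str], Optional[Tuple[object, int]]],
    simulate: Callable[[object, str, int, int], str],
) -> bool:
    bound = f(len(x))                  # step 1: mark off f(|x|) cells
    parsed = decode(x)
    if parsed is None:                 # step 2: x is not of the form <M>10^k
        return False
    machine, _k = parsed
    step_limit = 2 ** bound            # step 3: cap the run at 2^{f(|x|)} steps
    outcome = simulate(machine, x, bound, step_limit)
    if outcome == "out-of-bounds":     # exceeded the space or step budget
        return False
    return outcome != "accept"         # step 4: invert M's answer (diagonalization)
```

In the nondeterministic variant described below, only the last line changes: the simulated machine's answer is no longer inverted.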

The above proof establishes the theorem for deterministic space. For nondeterministic space, some changes need to be made. The crucial point is that, while acceptance and rejection can be inverted on a deterministic TM (which is crucial for step 4), this is not possible on a nondeterministic machine.

For the nondeterministic case, L needs to be redefined first:

[math]\displaystyle{ L = \{~ (\langle M \rangle, 10^k): M \mbox{ uses space } \le f(|\langle M \rangle, 10^k|) \mbox{ and } M \mbox{ accepts } (\langle M \rangle, 10^k) ~ \} }[/math]

Now, the algorithm needs to be changed to accept L by modifying step 4 to:

  • If M accepted x during this simulation, then accept; otherwise, reject.

L cannot be decided by a TM using [math]\displaystyle{ o(f(n)) }[/math] cells. Suppose, for contradiction, that L can be decided by some TM M using [math]\displaystyle{ o(f(n)) }[/math] cells. Then, by the Immerman–Szelepcsényi theorem, the complement [math]\displaystyle{ \overline L }[/math] can also be decided by a TM (call it [math]\displaystyle{ \overline M }[/math]) using [math]\displaystyle{ o(f(n)) }[/math] cells. This leads to a contradiction, so the assumption must be false:

  1. If [math]\displaystyle{ w = (\langle \overline M \rangle, 10^k) }[/math] (for some large enough k) is not in [math]\displaystyle{ \overline L }[/math], then [math]\displaystyle{ \overline M }[/math] rejects w (because [math]\displaystyle{ \overline M }[/math] decides [math]\displaystyle{ \overline L }[/math]); hence, by the definition of L, w is not in L, so w is in [math]\displaystyle{ \overline L }[/math] (contradiction).
  2. If [math]\displaystyle{ w = (\langle \overline M \rangle, 10^k) }[/math] (for some large enough k) is in [math]\displaystyle{ \overline L }[/math], then [math]\displaystyle{ \overline M }[/math] accepts w; since for large enough k [math]\displaystyle{ \overline M }[/math] stays within space [math]\displaystyle{ f(|w|) }[/math], the definition of L puts w in L, so w is not in [math]\displaystyle{ \overline L }[/math] (contradiction).

Comparison and improvements

The space hierarchy theorem is stronger than the analogous time hierarchy theorems in several ways:

  • It only requires s(n) to be at least log n instead of at least n.
  • It can separate classes with any asymptotic difference, whereas the time hierarchy theorem requires them to be separated by a logarithmic factor.
  • It only requires the function to be space-constructible, not time-constructible.

It seems to be easier to separate classes in space than in time. Indeed, whereas the time hierarchy theorem has seen little remarkable improvement since its inception, the nondeterministic space hierarchy theorem has seen at least one important improvement by Viliam Geffert in his 2003 paper "Space hierarchy theorem revised". This paper made several generalizations of the theorem:

  • It relaxes the space-constructibility requirement. Instead of merely separating the union classes [math]\displaystyle{ \mathsf{DSPACE}(O(s(n))) }[/math] and [math]\displaystyle{ \mathsf{DSPACE}(o(s(n))) }[/math], it separates [math]\displaystyle{ \mathsf{DSPACE}(f(n)) }[/math] from [math]\displaystyle{ \mathsf{DSPACE}(g(n)) }[/math] where [math]\displaystyle{ f(n) }[/math] is an arbitrary [math]\displaystyle{ O(s(n)) }[/math] function and [math]\displaystyle{ g(n) }[/math] is a computable [math]\displaystyle{ o(s(n)) }[/math] function. These functions need not be space-constructible or even monotone increasing.
  • It identifies a unary language, or tally language, which is in one class but not the other. In the original theorem, the separating language was arbitrary.
  • It does not require [math]\displaystyle{ s(n) }[/math] to be at least log n; it can be any nondeterministically fully space-constructible function.

Refinement of space hierarchy

If space is measured as the number of cells used regardless of alphabet size, then [math]\displaystyle{ \mathsf{SPACE}(f(n)) = \mathsf{SPACE}(O(f(n))) }[/math] because one can achieve any linear compression by switching to a larger alphabet. However, by measuring space in bits, a much sharper separation is achievable for deterministic space. Instead of being defined up to a multiplicative constant, space is now defined up to an additive constant. However, because any constant amount of external space can be saved by storing the contents into the internal state, we still have [math]\displaystyle{ \mathsf{SPACE}(f(n)) = \mathsf{SPACE}(f(n)+O(1)) }[/math].
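
For instance (a sketch of the standard tape-compression argument behind the first equality, not spelled out in the article): grouping blocks of c worktape cells into single symbols over the enlarged alphabet [math]\displaystyle{ \Gamma^c }[/math] turns a machine that uses f(n) cells into an equivalent machine that uses

[math]\displaystyle{ \lceil f(n)/c \rceil + O(1) }[/math]

cells, which is why any constant factor in the cell count can be compressed away.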

In what follows, assume that f is space-constructible and that SPACE denotes deterministic space.

  • For a wide variety of sequential computational models, including Turing machines, SPACE(f(n)-ω(log(f(n)+n))) ⊊ SPACE(f(n)). This holds even if SPACE(f(n)-ω(log(f(n)+n))) is defined using a different computational model than [math]\displaystyle{ \mathsf{SPACE}(f(n)) }[/math], because the different models can simulate each other with [math]\displaystyle{ O(\log(f(n)+n)) }[/math] space overhead.
  • For certain computational models, we even have SPACE(f(n)-ω(1)) ⊊ SPACE(f(n)). In particular, this holds for Turing machines if we fix the alphabet, the number of heads on the input tape, and the number of heads on the worktape (using a single worktape), and add delimiters for the visited portion of the worktape (that can be checked without increasing space usage). SPACE(f(n)) does not depend on whether the worktape is infinite or semi-infinite. We can also have a fixed number of worktapes if f(n) is either a SPACE-constructible tuple giving the per-tape space usage, or a SPACE(f(n)-ω(log(f(n))))-constructible number giving the total space usage (not counting the overhead for storing the length of each tape).

The proof is similar to the proof of the space hierarchy theorem, but with two complications: The universal Turing machine has to be space-efficient, and the reversal has to be space-efficient. One can generally construct universal Turing machines with [math]\displaystyle{ O(\log(space)) }[/math] space overhead, and under appropriate assumptions, just [math]\displaystyle{ O(1) }[/math] space overhead (which may depend on the machine being simulated). For the reversal, the key issue is how to detect if the simulated machine rejects by entering an infinite (space-constrained) loop. Simply counting the number of steps taken would increase space consumption by about [math]\displaystyle{ f(n) }[/math]. At the cost of a potentially exponential time increase, loops can be detected space-efficiently as follows:[1]

Modify the machine to erase everything and go to a specific configuration A on success. Use depth-first search to determine whether A is reachable in the space bound from the starting configuration. The search starts at A and goes over configurations that lead to A. Because of determinism, this can be done in place and without going into a loop.
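
A hedged Python sketch of this backward search follows. The configuration type, the step successor function, and the explicit list of in-bound configurations are hypothetical abstractions; a genuinely space-efficient implementation regenerates configurations in place rather than materializing them, but the structure of the search is the same.

```python
from typing import Callable, Hashable, Iterable, List, Optional

Config = Hashable  # abstract stand-in for a machine configuration

def reaches_accepting_config(
    configs: Iterable[Config],
    step: Callable[[Config], Optional[Config]],
    start: Config,
    target: Config,
) -> bool:
    """Backward depth-first search from the canonical accepting
    configuration `target` (the configuration A above).

    Assumptions for this sketch: `configs` lists every configuration that
    fits in the space bound, and `step` is the deterministic one-step
    successor (None once the machine has halted). Since `target` is a
    halting configuration, it lies on no cycle, so the configurations
    leading to it form a tree and the search cannot loop.
    """
    all_configs: List[Config] = list(configs)

    def predecessors(c: Config) -> Iterable[Config]:
        # A space-efficient simulation would enumerate candidates in place;
        # here we simply scan the explicit list.
        return (p for p in all_configs if step(p) == c)

    def dfs(c: Config) -> bool:
        if c == start:
            return True
        return any(dfs(p) for p in predecessors(c))

    return dfs(target)
```

Determinism is what makes this work: every configuration has exactly one successor, so it is a predecessor of at most one configuration and the search visits each configuration at most once.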

It can also be determined whether the machine exceeds a space bound (as opposed to looping within the space bound) by iterating over all configurations about to exceed the space bound and checking (again using depth-first search) whether the initial configuration leads to any of them.

Corollaries

Corollary 1

For any two functions [math]\displaystyle{ f_1 }[/math], [math]\displaystyle{ f_2: \mathbb{N} \longrightarrow \mathbb{N} }[/math], where [math]\displaystyle{ f_1(n) }[/math] is [math]\displaystyle{ o(f_2(n)) }[/math] and [math]\displaystyle{ f_2 }[/math] is space-constructible, [math]\displaystyle{ \mathsf{SPACE}(f_1(n)) \subsetneq \mathsf{SPACE}(f_2(n)) }[/math].

This corollary lets us separate various space complexity classes. For any natural number k, the function [math]\displaystyle{ n^k }[/math] is space-constructible. Therefore for any two natural numbers [math]\displaystyle{ k_1 \lt k_2 }[/math] we can prove [math]\displaystyle{ \mathsf{SPACE}(n^{k_1}) \subsetneq \mathsf{SPACE}(n^{k_2}) }[/math].

Corollary 2

NL ⊊ PSPACE.

Proof

Savitch's theorem shows that [math]\displaystyle{ \mathsf{NL} \subseteq \mathsf{SPACE}(\log^2n) }[/math], while the space hierarchy theorem shows that [math]\displaystyle{ \mathsf{SPACE}(\log^2n) \subsetneq \mathsf{SPACE}(n) }[/math]. Together these give the corollary; it follows in turn that TQBF ∉ NL, since TQBF is PSPACE-complete.
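
Written out as a single chain of inclusions (the first by Savitch's theorem, the strict middle one by the space hierarchy theorem, the last because linear space is in particular polynomial space):

[math]\displaystyle{ \mathsf{NL} \subseteq \mathsf{SPACE}(\log^2 n) \subsetneq \mathsf{SPACE}(n) \subseteq \mathsf{PSPACE}. }[/math]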

This could also be proven using the non-deterministic space hierarchy theorem to show that NL ⊊ NPSPACE, and using Savitch's theorem to show that PSPACE = NPSPACE.

Corollary 3

PSPACE ⊊ EXPSPACE.

This last corollary shows the existence of decidable problems that are intractable. In other words, their decision procedures must use more than polynomial space.
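
One way to see this, using the class notation from the statement of the theorem and taking the space-constructible bound [math]\displaystyle{ f(n) = 2^n }[/math] (a sketch):

[math]\displaystyle{ \mathsf{PSPACE} \subseteq \mathsf{SPACE}\left(o(2^n)\right) \subsetneq \mathsf{SPACE}(2^n) \subseteq \mathsf{EXPSPACE}. }[/math]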

Corollary 4

There are problems in PSPACE requiring arbitrarily large exponents to solve; therefore PSPACE does not collapse to [math]\displaystyle{ \mathsf{DSPACE}(n^k) }[/math] for any constant k: by Corollary 1, [math]\displaystyle{ \mathsf{DSPACE}(n^k) \subsetneq \mathsf{DSPACE}(n^{k+1}) \subseteq \mathsf{PSPACE} }[/math] for every k.

Corollary 5

SPACE(n) ≠ PTIME.

To see this, assume the contrary, i.e. that [math]\displaystyle{ \mathsf{SPACE}(n) = \mathsf{P} }[/math]. Then every problem decided in space [math]\displaystyle{ O(n) }[/math] is decided in time [math]\displaystyle{ O(n^c) }[/math] for some constant c (depending on the problem). Now let [math]\displaystyle{ L }[/math] be any problem decided in space [math]\displaystyle{ O(n^b) }[/math]. By a padding argument (pad each input of length n up to length [math]\displaystyle{ n^b }[/math]), the padded version of [math]\displaystyle{ L }[/math] is decidable in linear space, hence by assumption in time [math]\displaystyle{ O(n^c) }[/math] for some c; removing the padding shows that [math]\displaystyle{ L }[/math] itself is decided in time [math]\displaystyle{ O((n^b)^c)=O(n^{bc}) }[/math]. Since [math]\displaystyle{ \mathsf{P}:=\bigcup_{k\in\mathbb N}\mathsf{DTIME}(n^k) }[/math] is closed under such a polynomial change of bound, [math]\displaystyle{ L\in\mathsf{P} }[/math]. This implies that for all [math]\displaystyle{ b }[/math], [math]\displaystyle{ \mathsf{SPACE}(n^b)\subseteq\mathsf{P}\subseteq\mathsf{SPACE}(n) }[/math], but the space hierarchy theorem implies that [math]\displaystyle{ \mathsf{SPACE}(n^2)\not\subseteq\mathsf{SPACE}(n) }[/math], and Corollary 5 follows.

Note that this argument proves neither [math]\displaystyle{ \mathsf{P}\not\subseteq\mathsf{SPACE}(n) }[/math] nor [math]\displaystyle{ \mathsf{SPACE}(n)\not\subseteq\mathsf{P} }[/math]: to reach the contradiction we used the negation of both statements, that is, both inclusions, so we can only deduce that at least one of them fails. It is currently unknown which one fails, but it is conjectured that both do, that is, that [math]\displaystyle{ \mathsf{SPACE}(n) }[/math] and [math]\displaystyle{ \mathsf{P} }[/math] are incomparable, at least for deterministic space.[2] This question is related to the time complexity of (nondeterministic) linear bounded automata, which accept the complexity class [math]\displaystyle{ \mathsf{NSPACE}(n) }[/math] (also known as the class of context-sensitive languages, CSL); so, by the above, it is not known whether CSL is decidable in polynomial time (see also Kuroda's two problems on LBA).
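
Compactly, the assumed equality together with the padding step would force the impossible chain

[math]\displaystyle{ \mathsf{SPACE}(n^2) \subseteq \mathsf{P} \subseteq \mathsf{SPACE}(n) \subsetneq \mathsf{SPACE}(n^2). }[/math]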

See also

References