3SUM



Unsolved problem in computer science:
Is there an algorithm to solve the 3SUM problem in time [math]\displaystyle{ O(n^{2-\epsilon}) }[/math], for some [math]\displaystyle{ \epsilon > 0 }[/math]?

In computational complexity theory, the 3SUM problem asks if a given set of [math]\displaystyle{ n }[/math] real numbers contains three elements that sum to zero. A generalized version, k-SUM, asks the same question on k numbers. 3SUM can be easily solved in [math]\displaystyle{ O(n^2) }[/math] time, and matching [math]\displaystyle{ \Omega(n^{\lceil k/2 \rceil}) }[/math] lower bounds are known in some specialized models of computation (Erickson 1999).

It was conjectured that any deterministic algorithm for 3SUM requires [math]\displaystyle{ \Omega(n^2) }[/math] time. In 2014, the original 3SUM conjecture was refuted by Allan Grønlund and Seth Pettie, who gave a deterministic algorithm that solves 3SUM in [math]\displaystyle{ O(n^2 / ({\log n} / {\log \log n})^{2/3}) }[/math] time.[1] Additionally, Grønlund and Pettie showed that the 4-linear decision tree complexity of 3SUM is [math]\displaystyle{ O(n^{3/2}\sqrt{\log n}) }[/math]. These bounds were subsequently improved.[2][3][4] The current best known algorithm for 3SUM runs in [math]\displaystyle{ O(n^2 (\log \log n)^{O(1)} / {\log^2 n}) }[/math] time.[4] Kane, Lovett, and Moran showed that the 6-linear decision tree complexity of 3SUM is [math]\displaystyle{ O(n{\log^2 n}) }[/math].[5] The latter bound is tight (up to a logarithmic factor). It is still conjectured that 3SUM is unsolvable in [math]\displaystyle{ O(n^{2-\Omega(1)}) }[/math] expected time.[6]

When the elements are integers in the range [math]\displaystyle{ [-N, \dots, N] }[/math], 3SUM can be solved in [math]\displaystyle{ O(n + N\log N) }[/math] time by representing the input set [math]\displaystyle{ S }[/math] as a bit vector, computing the set [math]\displaystyle{ S+S }[/math] of all pairwise sums as a discrete convolution using the fast Fourier transform, and finally checking whether this set contains the negation of any element of [math]\displaystyle{ S }[/math].[7]
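A minimal Python sketch of this approach (not taken from the cited source) is shown below; the shift by N keeps all indices non-negative, and, for simplicity, the sketch ignores the requirement that the three elements be distinct (it allows the same element to serve as both summands of the pair):

import numpy as np

def three_sum_bounded(S, N):
    # Characteristic vector of S, with value v stored at index v + N.
    bits = np.zeros(2 * N + 1)
    for x in S:
        bits[x + N] = 1.0
    # Self-convolution via FFT: entry k counts ordered pairs (a, b) with
    # (a + N) + (b + N) = k, i.e. a + b = k - 2N.
    size = 4 * N + 1                      # length of the full linear convolution
    f = np.fft.rfft(bits, size)
    pair_counts = np.rint(np.fft.irfft(f * f, size))
    # A zero triple exists exactly when some pairwise sum a + b equals -c
    # for an element c of S.
    return any(pair_counts[-c + 2 * N] > 0 for c in S)

# three_sum_bounded([-25, -10, -7, -3, 2, 4, 8, 10], 25) -> True (8 + 2 - 10 = 0)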

Quadratic algorithm

Suppose the input array is [math]\displaystyle{ S[0..n-1] }[/math]. In integer (word RAM) models of computing, 3SUM can be solved in [math]\displaystyle{ O(n^2) }[/math] time on average by inserting each number [math]\displaystyle{ S[i] }[/math] into a hash table, and then, for each pair of indices [math]\displaystyle{ i }[/math] and [math]\displaystyle{ j }[/math], checking whether the hash table contains the integer [math]\displaystyle{ -(S[i]+S[j]) }[/math].
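A minimal Python sketch of this hash-based approach; as in the set formulation above, the input values are assumed to be distinct:

def three_sum_hash(S):
    seen = set(S)                          # hash table of the input values
    n = len(S)
    for i in range(n):
        for j in range(i + 1, n):
            target = -(S[i] + S[j])
            # The two extra comparisons make sure the third element differs
            # from S[i] and S[j] (the values are assumed distinct).
            if target in seen and target != S[i] and target != S[j]:
                return (S[i], S[j], target)
    return None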

It is also possible to solve the problem in the same time in a comparison-based model of computing or real RAM, for which hashing is not allowed. The algorithm below first sorts the input array and then tests all possible pairs in a careful order that avoids the need to binary search for the pairs in the sorted list, achieving worst-case [math]\displaystyle{ O(n^2) }[/math] time, as follows.[8]

sort(S);
for i = 0 to n - 2 do
    a = S[i];
    start = i + 1;
    end = n - 1;
    while (start < end) do
        b = S[start];
        c = S[end];
        if (a + b + c == 0) then
            output a, b, c;
            // Continue searching for other triples that sum to zero.
            // Since the values are distinct, both start and end must move to find another such triple for this a.
            start = start + 1;
            end = end - 1;
        else if (a + b + c > 0) then
            end = end - 1;
        else
            start = start + 1;
    end
end

The following example shows this algorithm's execution on a small sorted array. The current value of a is shown in square brackets; the current values of b and c are shown in parentheses.

 [-25] (-10)  -7   -3   2   4    8  (10)   (a+b+c==-25)
 [-25]  -10  (-7)  -3   2   4    8  (10)   (a+b+c==-22)
  . . .
 [-25]  -10   -7   -3   2   4   (8) (10)   (a+b+c==-7)
  -25  [-10] (-7)  -3   2   4    8  (10)   (a+b+c==-7)
  -25  [-10]  -7  (-3)  2   4    8  (10)   (a+b+c==-3)
  -25  [-10]  -7   -3  (2)  4    8  (10)   (a+b+c==2)
  -25  [-10]  -7   -3  (2)  4   (8)  10    (a+b+c==0)

The correctness of the algorithm can be seen as follows. Suppose the array contains a solution a + b + c = 0 with a ≤ b ≤ c. Since the pointers only move in one direction, we can run the algorithm until the leftmost pointer points to a. Then run it until one of the two remaining pointers reaches b or c, whichever occurs first. From then on the computed sum is too large (if the left pointer reached b first) or too small (if the right pointer reached c first), so only the other pointer moves, until it reaches the remaining term and the triple is reported.
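For reference, a direct Python rendering of the pseudocode above, assuming distinct values and reporting every triple found:

def three_sum_sorted(S):
    S = sorted(S)
    n = len(S)
    triples = []
    for i in range(n - 2):
        a = S[i]
        start, end = i + 1, n - 1
        while start < end:
            b, c = S[start], S[end]
            total = a + b + c
            if total == 0:
                triples.append((a, b, c))
                # With distinct values, moving only one pointer cannot give
                # another zero triple for this a, so advance both.
                start += 1
                end -= 1
            elif total > 0:
                end -= 1
            else:
                start += 1
    return triples

# The example above: three_sum_sorted([-25, -10, -7, -3, 2, 4, 8, 10])
# returns [(-10, 2, 8), (-7, -3, 10)].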

Variants

Non-zero sum

Instead of looking for numbers whose sum is 0, it is possible to look for numbers whose sum is any constant C. The simplest way would be to modify the original algorithm to search the hash table for the integer [math]\displaystyle{ (C -(S[i]+S[j])) }[/math].

Another method:

  • Subtract C/3 from all elements of the input array.
  • In the modified array, find 3 elements whose sum is 0.

For example, if A = [1,2,3,4] and C = 4, subtract 4/3 from every element of A and solve the resulting instance with any standard 3SUM algorithm, since [math]\displaystyle{ (a-C/3) + (b-C/3) + (c-C/3) = 0 }[/math] exactly when [math]\displaystyle{ a + b + c = C }[/math].
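A sketch of the second method, assuming integer inputs; zero_sum_solver is a stand-in for any ordinary 3SUM routine (such as the sketches above) that returns a zero-sum triple or None, and exact rational arithmetic avoids rounding problems with C/3:

from fractions import Fraction

def three_sum_constant(S, C, zero_sum_solver):
    shift = Fraction(C, 3)
    shifted = [Fraction(x) - shift for x in S]
    result = zero_sum_solver(shifted)          # solve the shifted zero-sum instance
    if result is None:
        return None
    # Adding the shift back gives the original values, which are integers again.
    return tuple(int(v + shift) for v in result)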

Three different arrays

Instead of searching for the 3 numbers in a single array, we can search for them in 3 different arrays. I.e., given three arrays X, Y and Z, find three numbers a ∈ X, b ∈ Y, c ∈ Z, such that [math]\displaystyle{ a+b+c=0 }[/math]. Call the 1-array variant 3SUM×1 and the 3-array variant 3SUM×3.

Given a solver for 3SUM×1, the 3SUM×3 problem can be solved in the following way (assuming all elements are integers):

  • For every element in X, Y and Z, set: [math]\displaystyle{ X[i] \gets X[i]*10+1 }[/math], [math]\displaystyle{ Y[i] \gets Y[i]*10+2 }[/math], [math]\displaystyle{ Z[i] \gets Z[i]*10-3 }[/math].
  • Let S be a concatenation of the arrays X, Y and Z.
  • Use the 3SUM×1 oracle to find three elements [math]\displaystyle{ a' \in S,\ b' \in S,\ c' \in S }[/math] such that [math]\displaystyle{ a'+b'+c'=0 }[/math].
  • Return [math]\displaystyle{ a \gets (a'-1)/10,\ b \gets (b'-2)/10,\ c \gets (c'+3)/10 }[/math].

By the way we transformed the arrays, it is guaranteed that a ∈ X, b ∈ Y, c ∈ Z: modulo 10, the only way three of the added offsets 1, 2 and −3 can sum to 0 is by using each of them exactly once.[9]
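A Python sketch of this reduction, assuming integer inputs; three_sum_x1 is a stand-in for any single-array solver that returns a zero-sum triple of values, or None:

def three_sum_x3(X, Y, Z, three_sum_x1):
    # Encode the source array of each element in its residue modulo 10.
    S = [10 * x + 1 for x in X] + [10 * y + 2 for y in Y] + [10 * z - 3 for z in Z]
    result = three_sum_x1(S)
    if result is None:
        return None
    # Modulo 10, the only combination of the offsets 1, 2 and -3 that sums to
    # 0 uses each offset exactly once, so the triple has one element per array.
    a = next(v for v in result if v % 10 == 1)
    b = next(v for v in result if v % 10 == 2)
    c = next(v for v in result if v % 10 == 7)     # -3 mod 10 is 7
    return ((a - 1) // 10, (b - 2) // 10, (c + 3) // 10)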

Convolution sum

Instead of looking for arbitrary elements of the array such that:

[math]\displaystyle{ S[k]=S[i]+S[j] }[/math]

the convolution 3sum problem (Conv3SUM) looks for elements in specific locations:[10]

[math]\displaystyle{ S[i+j]=S[i]+S[j] }[/math]

Reduction from Conv3SUM to 3SUM

Given a solver for 3SUM, the Conv3SUM problem can be solved in the following way.[10]

  • Define a new array T, such that for every index i: [math]\displaystyle{ T[i]=2n S[i]+i }[/math] (where n is the number of elements in the array, and the indices run from 0 to n-1).
  • Solve 3SUM on the array T.

Correctness proof:

  • If in the original array there is a triple with [math]\displaystyle{ S[i+j]=S[i]+S[j] }[/math], then [math]\displaystyle{ T[i+j]=2n S[i+j]+i+j = (2n S[i] + i) + (2n S[j] + j)=T[i]+T[j] }[/math], so this solution will be found by 3SUM on T.
  • Conversely, if in the new array there is a triple with [math]\displaystyle{ T[k]=T[i]+T[j] }[/math], then [math]\displaystyle{ 2n S[k] + k = 2n(S[i]+S[j]) + (i+j) }[/math]. Because [math]\displaystyle{ i+j < 2n }[/math], necessarily [math]\displaystyle{ S[k] = S[i]+S[j] }[/math] and [math]\displaystyle{ k=i+j }[/math], so this is a valid solution for Conv3SUM on S.
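A sketch of this reduction in Python; pair_sum_solver is a stand-in for a 3SUM-style oracle that, given an array T, returns values (T[i], T[j], T[k]) with T[i] + T[j] = T[k], or None:

def conv3sum_via_3sum(S, pair_sum_solver):
    n = len(S)
    T = [2 * n * S[i] + i for i in range(n)]
    found = pair_sum_solver(T)
    if found is None:
        return None
    ti, tj, tk = found
    # Each index is recovered as its value modulo 2n, since 2n*S[i] is a
    # multiple of 2n and 0 <= i < 2n.
    i, j, k = ti % (2 * n), tj % (2 * n), tk % (2 * n)
    return (i, j, k)          # k == i + j and S[k] == S[i] + S[j]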

Reduction from 3SUM to Conv3SUM

Given a solver for Conv3SUM, the 3SUM problem can be solved in the following way.[6][10]

The reduction uses a hash function. As a first approximation, assume that we have a linear hash function, i.e. a function h such that:

[math]\displaystyle{ h(x+y)=h(x)+h(y) }[/math]

Suppose that all elements are integers in the range 0...N-1, and that the function h maps each element to an index in the smaller range 0...n-1. Create a new array T and send each element of S to its hash value in T, i.e., for every [math]\displaystyle{ x \in S }[/math]:

[math]\displaystyle{ T[h(x)] = x }[/math]

Initially, suppose that the mappings are unique (i.e. each cell in T accepts only a single element from S). Solve Conv3SUM on T. Now:

  • If there is a solution for 3SUM: [math]\displaystyle{ z=x+y }[/math], then: [math]\displaystyle{ T[h(z)]=T[h(x)]+T[h(y)] }[/math] and [math]\displaystyle{ h(z)=h(x)+h(y) }[/math], so this solution will be found by the Conv3SUM solver on T.
  • Conversely, if a Conv3SUM is found on T, then obviously it corresponds to a 3SUM solution on S since T is just a permutation of S.
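A sketch of this idealized case; conv3sum is a stand-in for an oracle for the convolution problem, and h for the assumed linear, collision-free hash function into 0...n-1 (as explained next, such a function does not exist in general, which is why the actual reduction is randomized):

def three_sum_via_conv3sum_idealized(S, conv3sum, h):
    n = len(S)
    T = [0] * n
    for x in S:
        T[h(x)] = x            # by assumption, every element gets its own cell
    # T is then a permutation of S, so any Conv3SUM solution on T is a 3SUM
    # solution (in the form z = x + y) on S, and vice versa.
    return conv3sum(T)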

This idealized solution doesn't work, because any hash function might map several distinct elements of S to the same cell of T. The trick is to create an array [math]\displaystyle{ T^* }[/math] by selecting a single random element from each cell of T, and run Conv3SUM on [math]\displaystyle{ T^* }[/math]. If a solution is found, then it is a correct solution for 3SUM on S. If no solution is found, then create a different random [math]\displaystyle{ T^* }[/math] and try again. Suppose there are at most R elements in each cell of T. Then the probability of finding a solution (if a solution exists) is the probability that the random selection will select the correct element from each cell, which is at least [math]\displaystyle{ (1/R)^3 }[/math]. By running Conv3SUM [math]\displaystyle{ O(R^3) }[/math] times, the solution will be found with constant probability, and with high probability after [math]\displaystyle{ O(R^3 \log n) }[/math] runs.

Unfortunately, we do not have linear perfect hashing, so we have to use an almost linear hash function, i.e. a function h such that:

[math]\displaystyle{ h(x+y)=h(x)+h(y) }[/math] or
[math]\displaystyle{ h(x+y)=h(x)+h(y)+1 }[/math]

This requires duplicating the elements of S when copying them into T, i.e., putting every element [math]\displaystyle{ x\in S }[/math] both in [math]\displaystyle{ T[h(x)] }[/math] (as before) and in [math]\displaystyle{ T[h(x)-1] }[/math]. So each cell will contain at most 2R elements, and we will have to run Conv3SUM [math]\displaystyle{ (2R)^3 }[/math] times.

3SUM-hardness

A problem is called 3SUM-hard if solving it in subquadratic time implies a subquadratic-time algorithm for 3SUM. The concept of 3SUM-hardness was introduced by Gajentaan and Overmars, who proved that a large class of problems in computational geometry is 3SUM-hard, including the following ones. (The authors acknowledge that many of these problems were contributed by other researchers.)

  • Given a set of lines in the plane, are there three that meet in a point?
  • Given a set of non-intersecting axis-parallel line segments, is there a line that separates them into two non-empty subsets?
  • Given a set of infinite strips in the plane, do they fully cover a given rectangle?
  • Given a set of triangles in the plane, compute their measure.
  • Given a set of triangles in the plane, does their union have a hole?
  • A number of visibility and motion planning problems, e.g.,
    • Given a set of horizontal triangles in space, can a particular triangle be seen from a particular point?
    • Given a set of non-intersecting axis-parallel line segment obstacles in the plane, can a given rod be moved by translations and rotations between given start and finish positions without colliding with the obstacles?

By now there are a multitude of other problems that fall into this category. An example is the decision version of X + Y sorting: given sets of numbers X and Y of n elements each, are there n² distinct values x + y for x ∈ X, y ∈ Y?[11]

Notes

  1. Grønlund & Pettie 2014.
  2. Freund 2017.
  3. Gold & Sharir 2017.
  4. Chan 2018.
  5. Kane, Lovett & Moran 2018.
  6. Kopelowitz, Tsvi; Pettie, Seth; Porat, Ely (2014). "3SUM Hardness in (Dynamic) Data Structures". arXiv:1407.6756 [cs.DS].
  7. Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2009) [1990]. Introduction to Algorithms (3rd ed.). MIT Press and McGraw-Hill. ISBN 0-262-03384-4.  Ex. 30.1–7, p. 906.
  8. Visibility Graphs and 3-Sum by Michael Hoffmann
  9. For a reduction in the other direction, see Variants of the 3-sum problem.
  10. Pătrașcu, Mihai (2010). "Towards polynomial lower bounds for dynamic problems". Proceedings of the 42nd ACM Symposium on Theory of Computing (STOC '10). p. 603. doi:10.1145/1806689.1806772. ISBN 9781450300506.
  11. Demaine, Erik; Erickson, Jeff; O'Rourke, Joseph (20 August 2006). "Problem 41: Sorting X + Y (Pairwise Sums)". http://cs.smith.edu/~orourke/TOPP/P41.html. 

References