Finger search

From HandWiki
Example of finger search on treaps.

In computer science, a finger search on a data structure is an extension of any search operation that structure supports, where a reference (finger) to an element in the data structure is given along with the query. While the search time for an element is most frequently expressed as a function of the number of elements in a data structure, finger search times are a function of the distance between the element and the finger.

In a set of n elements, the distance d(x,y) (or simply d when unambiguous) between two elements x and y is their difference in rank. If x and y are the ith and jth largest elements in the structure, then the difference in rank is |i - j|. If a normal search in some structure would normally take O(f(n)) time, then a finger search for x with finger y should ideally take O(f(d)) time. Remark that since dn, it follows that in the worst case, finger search is only as bad as normal search. However, in practice these degenerate finger searches do more work than normal searches. For instance, if f(n) is log(n), and finger search does twice as many comparisons as normal search in the worst case, it follows that finger search is slower when d > n. Therefore, finger search should only be used when one can reasonably expect the target to actually be close to the finger.

Implementations

Some popular data structures support finger search with no additional changes to the actual structure. In structures where searching for an element x is accomplished by narrowing down an interval over which x can be found, finger search from y is typically accomplished by reversing the search process from y until the search interval is large enough to contain x, at which point search proceeds as normal.

Sorted linked lists

In a linked list, one normally searches linearly for an element by walking from one end to the other. If the linked list is sorted, and we have a reference to some node containing y, then we may find x in O(d) time by starting our search from y.

Sorted arrays

In a sorted array A, one normally searches for an element x in A with a binary search. Finger search is accomplished by performing a one-sided search from A[j] = y. While binary search halves the search space after each comparison, one-sided search doubles the search space after each comparison. Specifically, on the kth iteration of one-sided search (assuming x > y), the interval under consideration is A[j, j+2k-1]. Expansion halts as soon as A[j + 2k-1] ≥ x, at which point this interval is binary searched for x.

If one-sided search takes k iterations to find an interval that contains x, then it follows that d > 2k-2. Binary searching this range will also take another k iterations. Therefore, finger search for x from y takes O(k) = O(log d) time.

Skip lists

In a skip list, one can finger search for x from a node containing the element y by simply continuing the search from this point. Note that if x < y, then search proceeds backwards, and if x > y, then search proceeds forwards. The backwards case is symmetric to normal search in a skip list, but the forward case is actually more complex. Normally, search in a skiplist is expected to be fast because the sentinel at the start of the list is as tall as the tallest node. However, our finger could be a node of height 1. Because of this, we may occasionally climb while trying to search; something which never occurs normally. However, even with this complication, we can achieve O(log d) expected search time.[1]

Treaps

A treap is a randomized binary search tree (BST). Searching in a treap is the same as searching for an element in any other BST. Treaps however have the property that the expected path length between two elements of distance d is O(log d). Thus, to finger search from the node containing y for x, one can climb the tree from y until an ancestor of x is found, at which point normal BST search proceeds as usual. While determining if a node is the ancestor of another is non-trivial, one may augment the tree to support queries of this form to give expected O(log d) finger search time.[1]

Ropes and trees

Implementations of the rope (data structure) typically implement a cord position iterator to traverse the string. The iterator can be seen as a finger that points at some specific character of the string. Like most balanced trees, ropes require O(log(n)) time to retrieve data in one leaf of the tree when given only the root of the tree. Reading every leaf of the tree would seem to require O(n*log(n)) time. However, by storing a little extra information, the iterator can be used to read "the next" leaf in only O(1) time, and every leaf of a tree in only O(n) time. Implementations of ropes typically cache "extra information" about the entire the path from the root to the current node position in the iterator. Other implementations of tree data structures sometimes store "extra information" in the tree itself, storing a pointer in each node to its parent or its successor (in addition to the normal pointers in each node to its children), and storing only the current node position in the iterator.[2] [3]

Generalizations

If one can perform finger search in an iterative manner in O(f(d)) time, where each iteration takes O(1) time, then by providing c different fingers, one can perform finger search in O(c min{d(x, y1), ..., d(x, yc)}) time. This is accomplished by starting finger search for all c fingers, and iterating them forward one step each until the first one terminates.

Given any sequence A = [a1, ..., am] of m accesses, a structure is said to have the static finger property for a fixed finger f, if the time to perform A is [math]\displaystyle{ O\left(\sum_{i=1}^{m}\log d(f,a_i)\right) }[/math]. Splay trees have this property for any choice of f with no additional processing on sufficiently large sequences of accesses.[4]

Applications

Finger search can be used to re-use work already done in previous searches. For instance, one way to iterate over the elements in a data structure is to simply finger search for them in order, where the finger for one query is the location of the result of the last. One may optimize their data structure by doing this internally if it is known that searches are frequently near the last search.

A finger search tree is a type of data structure that allows fingers to be specified such that all or some of their supported operations are faster when they access or modify a location near a finger. In contrast to the finger searches described in this article, these fingers are not provided at query time, but are separately specified so that all future operations use these fingers. One benefit of this is that the user needs not manipulate or store internal references to the structure, they may simply specify an element in the structure.

References