Substring

Short description: Contiguous part of a sequence of symbols

"string" is a substring of "substring"

In formal language theory and computer science, a substring is a contiguous sequence of characters within a string.^{[citation needed]} For instance, "the best of" is a substring of "It was the best of times". In contrast, "Itwastimes" is a subsequence of "It was the best of times", but not a substring.

Prefixes and suffixes are special cases of substrings. A prefix of a string [math]\displaystyle{ S }[/math] is a substring of [math]\displaystyle{ S }[/math] that occurs at the beginning of [math]\displaystyle{ S }[/math]; likewise, a suffix of a string [math]\displaystyle{ S }[/math] is a substring that occurs at the end of [math]\displaystyle{ S }[/math].

The substrings of the string "apple" are: "a", "ap", "app", "appl", "apple", "p", "pp", "ppl", "pple", "pl", "ple", "l", "le", "e", "" (note the empty string at the end).
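As an illustration, the list above can be reproduced by slicing the string at every pair of start and end positions. This is a minimal sketch; the function name all_substrings is chosen here for illustration and does not come from any library.

```python
def all_substrings(s: str) -> set[str]:
    """Return the set of distinct substrings of s, including the empty string."""
    # s[i:j] with 0 <= i <= j <= len(s); the case i == j yields the empty string.
    return {s[i:j] for i in range(len(s) + 1) for j in range(i, len(s) + 1)}


subs = all_substrings("apple")
# Print shortest first, ties broken by ordinary string comparison.
print(sorted(subs, key=lambda x: (len(x), x)))
print(len(subs))
```

Because the result is a set, "p" is counted once even though it occurs at two positions in "apple".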

Substring

A string [math]\displaystyle{ u }[/math] is a substring (or factor)^[1] of a string [math]\displaystyle{ t }[/math] if there exist two strings [math]\displaystyle{ p }[/math] and [math]\displaystyle{ s }[/math] such that [math]\displaystyle{ t = pus }[/math]. In particular, the empty string is a substring of every string.

Example: The string [math]\displaystyle{ u= }[/math]ana is equal to substrings (and subsequences) of [math]\displaystyle{ t= }[/math]banana at two different offsets:

banana
 |||||
 ana||
   |||
   ana

The first occurrence is obtained with [math]\displaystyle{ p= }[/math]b and [math]\displaystyle{ s= }[/math]na, while the second occurrence is obtained with [math]\displaystyle{ p= }[/math]ban and [math]\displaystyle{ s }[/math] being the empty string.
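The two occurrences in the banana example can be recovered programmatically. The sketch below uses Python's built-in str.find and, for each offset, reconstructs the strings p and s from the definition t = pus; the helper name occurrences is an assumption made for this example.

```python
def occurrences(u: str, t: str) -> list[int]:
    """Return every offset i such that t[i:i + len(u)] == u (overlaps allowed)."""
    offsets = []
    i = t.find(u)
    while i != -1:
        offsets.append(i)
        i = t.find(u, i + 1)  # advance by one so overlapping matches are found
    return offsets


t, u = "banana", "ana"
for i in occurrences(u, t):
    p, s = t[:i], t[i + len(u):]
    assert p + u + s == t  # the defining decomposition t = pus
    print(i, repr(p), repr(s))
```

Advancing the search by one position rather than by the length of u is what lets the overlapping second occurrence be reported.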

A substring of a string is a prefix of a suffix of the string, and equivalently a suffix of a prefix; for example, nan is a prefix of nana, which is in turn a suffix of banana. If [math]\displaystyle{ u }[/math] is a substring of [math]\displaystyle{ t }[/math], it is also a subsequence, which is a more general concept. The occurrences of a given pattern in a given string can be found with a string searching algorithm. Finding the longest string which is equal to a substring of two or more strings is known as the longest common substring problem. In the mathematical literature, substrings are also called subwords (in America) or factors (in Europe).^{[citation needed]}
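As a rough illustration of the longest common substring problem for two strings, the following sketch uses a standard dynamic-programming table in which each entry stores the length of the longest common suffix of two prefixes. It is one simple quadratic-time approach among several, not the only or the fastest method, and the function name is chosen here for illustration.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return one longest string that is a substring of both a and b."""
    best_len, best_end = 0, 0
    # prev[j] = length of the longest common suffix of a[:i-1] and b[:j]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


print(longest_common_substring("ABABC", "BABCA"))
```

When several common substrings share the maximal length, this sketch returns the one that ends earliest in the first argument.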

Prefix

A string [math]\displaystyle{ p }[/math] is a prefix^[1] of a string [math]\displaystyle{ t }[/math] if there exists a string [math]\displaystyle{ s }[/math] such that [math]\displaystyle{ t = ps }[/math]. A proper prefix of a string is not equal to the string itself;^[2] some sources^[3] in addition restrict a proper prefix to be non-empty. A prefix can be seen as a special case of a substring.

Example: The string ban is equal to a prefix (and substring and subsequence) of the string banana:

banana
|||
ban

The square subset symbol is sometimes used to indicate a prefix, so that [math]\displaystyle{ p \sqsubseteq t }[/math] denotes that [math]\displaystyle{ p }[/math] is a prefix of [math]\displaystyle{ t }[/math]. This defines a binary relation on strings, called the prefix relation, which is a particular kind of prefix order.
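The prefix definitions can be tried out directly with Python's built-in str.startswith. In this sketch the helper names are illustrative, and is_proper_prefix follows the convention under which the empty string still counts as a proper prefix; sources that also require non-emptiness would add a further check.

```python
def prefixes(t: str) -> list[str]:
    """Return all prefixes of t, from the empty string up to t itself."""
    return [t[:k] for k in range(len(t) + 1)]


def is_prefix(p: str, t: str) -> bool:
    """True if t = p + s for some string s."""
    return t.startswith(p)


def is_proper_prefix(p: str, t: str) -> bool:
    """True if p is a prefix of t and p != t (the empty prefix is allowed here)."""
    return is_prefix(p, t) and p != t


print(prefixes("ban"))
print(is_prefix("ban", "banana"), is_prefix("nan", "banana"))
```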

Suffix

A string [math]\displaystyle{ s }[/math] is a suffix^[1] of a string [math]\displaystyle{ t }[/math] if there exists a string [math]\displaystyle{ p }[/math] such that [math]\displaystyle{ t = ps }[/math]. A proper suffix of a string is not equal to the string itself. A more restricted interpretation is that it is also not empty.^[1] A suffix can be seen as a special case of a substring.

Example: The string nana is equal to a suffix (and substring and subsequence) of the string banana:

banana
  ||||
  nana

A suffix tree for a string is a trie data structure that represents all of its suffixes. Suffix trees have many applications in string algorithms. The suffix array is a simplified version of this data structure that lists the start positions of the suffixes in lexicographically sorted order; it has many of the same applications.
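To make the suffix array concrete, the sketch below builds one naively by sorting the start positions by the suffix that begins there. This is meant only to show what the structure contains: it is far slower than dedicated construction algorithms, and it relies on Python's default string ordering, which compares code points.

```python
def suffix_array(t: str) -> list[int]:
    """Return the start positions of the non-empty suffixes of t in sorted order."""
    return sorted(range(len(t)), key=lambda i: t[i:])


t = "banana"
for i in suffix_array(t):
    print(i, t[i:])
```

For "banana" the sorted suffixes are a, ana, anana, banana, na, nana, so the array is [5, 3, 1, 0, 4, 2].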

Border

A border of a string is a string that is both a prefix and a suffix of it, e.g. "bab" is a border of "babab" (and also of "baboon eating a kebab").^{[citation needed]}
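The border examples can be checked by comparing each prefix with the suffix of the same length. The sketch below lists only non-empty borders shorter than the string itself; whether the empty string and the whole string also count as borders is a matter of convention and is an assumption of this example, as is the function name.

```python
def borders(t: str) -> list[str]:
    """Return the non-empty borders of t that are shorter than t, shortest first."""
    return [t[:k] for k in range(1, len(t)) if t[:k] == t[len(t) - k:]]


print(borders("babab"))
print(borders("baboon eating a kebab"))
```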

Superstring

A superstring of a finite set [math]\displaystyle{ P }[/math] of strings is a single string that contains every string in [math]\displaystyle{ P }[/math] as a substring. For example, [math]\displaystyle{ \text{bcclabccefab} }[/math] is a superstring of [math]\displaystyle{ P = \{\text{abcc}, \text{efab}, \text{bccla}\} }[/math], and [math]\displaystyle{ \text{efabccla} }[/math] is a shorter one. Concatenating all members of [math]\displaystyle{ P }[/math], in arbitrary order, always yields a trivial superstring of [math]\displaystyle{ P }[/math]. Finding superstrings whose length is as small as possible is a more interesting problem.
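The superstring property is easy to verify with Python's in operator, which tests for substring containment on strings. The sketch below checks the two superstrings from the example and the trivial one obtained by concatenation; it does not attempt to find a shortest superstring, and the helper name is illustrative.

```python
def is_superstring(w: str, P: list[str]) -> bool:
    """True if every string in P occurs in w as a substring."""
    return all(p in w for p in P)


P = ["abcc", "efab", "bccla"]
trivial = "".join(P)  # concatenation in any order is always a superstring

for w in ("bcclabccefab", "efabccla", trivial):
    print(w, len(w), is_superstring(w, P))
```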

A string that contains every possible permutation of a specified character set as a substring is called a superpermutation.
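A candidate superpermutation can be checked by brute force for small character sets, as in the sketch below. It uses itertools.permutations from the standard library; the function name is hypothetical, and the approach is only practical for a handful of symbols because the number of permutations grows factorially.

```python
from itertools import permutations


def is_superpermutation(w: str, symbols: str) -> bool:
    """True if every permutation of the given symbols occurs in w as a substring."""
    return all("".join(p) in w for p in permutations(symbols))


print(is_superpermutation("123121321", "123"))
print(is_superpermutation("123123", "123"))
```

The string "123123" fails the check because, for instance, "132" does not occur in it.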

References

[1] Lothaire, M. (1997). Combinatorics on words. Cambridge: Cambridge University Press. ISBN 0-521-59924-5.
[2] Kelley, Dean (1995). Automata and Formal Languages: An Introduction. London: Prentice-Hall International. ISBN 0-13-497777-7.
[3] Gusfield, Dan (1999). Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. US: Cambridge University Press. ISBN 0-521-58519-8.

Original source: https://en.wikipedia.org/wiki/Substring.
