ProbCons

In bioinformatics and proteomics, ProbCons is open-source software for probabilistic consistency-based multiple alignment of amino acid sequences. It is regarded as one of the most accurate protein multiple sequence alignment programs, since it has repeatedly demonstrated a statistically significant advantage in accuracy over similar tools, including Clustal and MAFFT.^[1]^[2]

Algorithm

The following describes the basic outline of the ProbCons algorithm.^[3]

Step 1: Reliability of an alignment edge

For every pair of sequences $x$ and $y$, compute the posterior probability that letters $x_{i}$ and $y_{j}$ are paired in $a^{*}$, an alignment generated by the model:

$\begin{aligned} P (x_{i} \sim y_{j} | x, y) \overset{\text{def}}{=} & \Pr [x_{i} \sim y_{j} \text{ in some } a | x, y] \\ = & \sum_{\substack{\text{alignment } a \\ \text{with } x_{i} \sim y_{j}}} \Pr [a | x, y] \\ = & \sum_{\text{alignment } a} \mathbf{1} \{x_{i} \sim y_{j} \in a\} \Pr [a | x, y] \end{aligned}$

(Here $\mathbf{1} \{x_{i} \sim y_{j} \in a\}$ is equal to 1 if $x_{i}$ and $y_{j}$ are paired in the alignment $a$ and 0 otherwise.)
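The sum over all alignments does not have to be enumerated explicitly; it can be computed with a forward-backward recursion. The following is a minimal sketch, not the ProbCons implementation: it assumes a simplified single-state scoring model in which each alignment's weight is the product of a per-column weight (the hypothetical parameters `match`, `mismatch` and `gap`), and $\Pr [a | x, y]$ is that weight divided by the total over all alignments. ProbCons itself uses a pair hidden Markov model with separate match and insertion states.

```python
def match_posteriors(x, y, match=4.0, mismatch=1.0, gap=0.5):
    """Return P[i][j] = posterior probability that x[i] is paired with y[j].

    Toy model (an assumption of this sketch): the weight of an alignment is
    the product of `match`/`mismatch` for paired letters and `gap` for every
    letter aligned to a gap; probabilities are weights normalised by Z.
    """
    n, m = len(x), len(y)

    def w(a, b):
        return match if a == b else mismatch

    # F[i][j]: total weight of all alignments of the prefixes x[:i], y[:j].
    F = [[0.0] * (m + 1) for _ in range(n + 1)]
    F[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0:
                F[i][j] += F[i - 1][j] * gap
            if j > 0:
                F[i][j] += F[i][j - 1] * gap
            if i > 0 and j > 0:
                F[i][j] += F[i - 1][j - 1] * w(x[i - 1], y[j - 1])

    # B[i][j]: total weight of all alignments of the suffixes x[i:], y[j:].
    B = [[0.0] * (m + 1) for _ in range(n + 1)]
    B[n][m] = 1.0
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i < n:
                B[i][j] += gap * B[i + 1][j]
            if j < m:
                B[i][j] += gap * B[i][j + 1]
            if i < n and j < m:
                B[i][j] += w(x[i], y[j]) * B[i + 1][j + 1]

    Z = F[n][m]  # total weight of all alignments of x and y
    # Weight of all alignments that pair x[i] with y[j], divided by Z.
    return [[F[i][j] * w(x[i], y[j]) * B[i + 1][j + 1] / Z
             for j in range(m)] for i in range(n)]


if __name__ == "__main__":
    for row in match_posteriors("ACGT", "AGT"):
        print(["%.3f" % p for p in row])
```

Because a letter can be paired with at most one letter of the other sequence in any alignment, every row and every column of the returned matrix sums to at most 1.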

Step 2: Maximum expected accuracy

The accuracy of an alignment $a^{*}$ with respect to another alignment $a$ is defined as the number of common aligned pairs divided by the length of the shorter sequence.

Calculate the expected accuracy of a candidate alignment $a^{*}$, taken over all alignments $a$ generated by the model:

$\begin{aligned} E_{\Pr [a | x, y]} (acc (a^{*}, a)) & = \sum_{a} \Pr [a | x, y] \, acc (a^{*}, a) \\ & = \frac{1}{\min (| x |, | y |)} \cdot \sum_{a} \sum_{x_{i} \sim y_{j} \in a^{*}} \mathbf{1} \{x_{i} \sim y_{j} \in a\} \Pr [a | x, y] \\ & = \frac{1}{\min (| x |, | y |)} \cdot \sum_{x_{i} \sim y_{j} \in a^{*}} P (x_{i} \sim y_{j} | x, y) \end{aligned}$

This yields a maximum expected accuracy (MEA) alignment:

$E (x, y) = \arg \max_{a^{*}} E_{\Pr [a | x, y]} (acc (a^{*}, a))$
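Since the expected accuracy is a sum of posterior match probabilities over the pairs in $a^{*}$, the maximising alignment can be found with a Needleman-Wunsch style dynamic programme in which a paired column scores its posterior probability and a gap scores zero. The sketch below illustrates this under the assumption that the posterior matrix is given as a non-empty list of lists; the function name `mea_alignment` is made up for this example.

```python
def mea_alignment(P):
    """Maximum expected accuracy alignment from a posterior matrix.

    P[i][j] is the posterior probability that x[i] is paired with y[j].
    Returns (pairs, expected_accuracy), where pairs is a list of (i, j)
    index pairs in increasing order. Assumes both sequences are non-empty.
    """
    n, m = len(P), len(P[0])

    # S[i][j]: best achievable sum of posteriors for prefixes x[:i], y[:j].
    S = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            S[i][j] = max(S[i - 1][j],                         # x[i-1] gapped
                          S[i][j - 1],                         # y[j-1] gapped
                          S[i - 1][j - 1] + P[i - 1][j - 1])   # paired

    # Trace back; gaps are preferred on ties so that zero-probability
    # pairs are never reported.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if S[i][j] == S[i - 1][j]:
            i -= 1
        elif S[i][j] == S[i][j - 1]:
            j -= 1
        else:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
    pairs.reverse()

    return pairs, S[n][m] / min(n, m)


if __name__ == "__main__":
    posterior = [[0.9, 0.1],
                 [0.2, 0.8]]
    print(mea_alignment(posterior))
```

The second return value is the expected accuracy of the reported alignment, which is the quantity used later as the pairwise similarity score.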

Step 3: Probabilistic Consistency Transformation

The match probabilities of all pairs of sequences $x, y$ from the set of all sequences $\mathcal{S}$ are now re-estimated, using every sequence $z$ as an intermediate:

$P^{'} (x_{i} \sim y_{j} | x, y) = \frac{1}{| \mathcal{S} |} \sum_{z \in \mathcal{S}} \sum_{1 \leq k \leq | z |} P (x_{i} \sim z_{k} | x, z) \cdot P (z_{k} \sim y_{j} | z, y)$

This step can be iterated.
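For a fixed intermediate sequence $z$, the inner sum over $k$ is a matrix product of the two posterior matrices, so the transformation amounts to averaging matrix products. The following sketch shows one way to write this in plain Python. The data layout (a dictionary `post` keyed by ordered name pairs and a dictionary `lengths`) is an assumption of this example, as is the convention that a sequence aligned with itself has the identity matrix as its posterior, which covers the terms $z = x$ and $z = y$.

```python
def _matmul(A, B):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def _transpose(A):
    return [list(col) for col in zip(*A)]


def consistency_transform(post, lengths):
    """One round of the probabilistic consistency transformation.

    post[(a, b)] is the |a| x |b| posterior matrix for sequences a and b,
    stored once per unordered pair; lengths[a] is the length of a.
    Returns a new dictionary with the same keys.
    """
    names = sorted(lengths)

    def get(a, b):
        if a == b:  # assumed convention: self-alignment is the identity
            n = lengths[a]
            return [[1.0 if i == j else 0.0 for j in range(n)]
                    for i in range(n)]
        if (a, b) in post:
            return post[(a, b)]
        return _transpose(post[(b, a)])

    new_post = {}
    for (x, y) in post:
        total = [[0.0] * lengths[y] for _ in range(lengths[x])]
        for z in names:
            prod = _matmul(get(x, z), get(z, y))  # sum over positions k of z
            for i in range(lengths[x]):
                for j in range(lengths[y]):
                    total[i][j] += prod[i][j]
        new_post[(x, y)] = [[v / len(names) for v in row] for row in total]
    return new_post


if __name__ == "__main__":
    post = {("a", "b"): [[0.5]], ("a", "c"): [[1.0]], ("b", "c"): [[1.0]]}
    print(consistency_transform(post, {"a": 1, "b": 1, "c": 1}))
```

In the usage example, the weak direct evidence for pairing the letters of `a` and `b` is raised because both are confidently paired with the same letter of `c`. Iterating the step means calling the function again on its own output.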

Step 4: Computation of guide tree

Construct a guide tree by hierarchical clustering, using the expected accuracy of the MEA alignment as the similarity score between two sequences. The similarity between clusters is defined as a weighted average over the pairwise sequence similarities.
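The clustering can be sketched as a greedy agglomerative procedure: repeatedly merge the two most similar clusters and recompute the similarity of the merged cluster to all others. The example below weights the average by cluster size (as in UPGMA), which is only one possible reading of "weighted average" and is an assumption of this sketch rather than the exact ProbCons rule. The function name and the `frozenset`-keyed similarity table are likewise illustrative.

```python
from itertools import combinations


def guide_tree(names, sim):
    """Greedy hierarchical clustering into a nested-tuple guide tree.

    names: list of sequence names.
    sim:   dict mapping frozenset({a, b}) to the similarity of a and b,
           for example the expected accuracy of their MEA alignment.
    """
    sizes = {name: 1 for name in names}  # cluster -> number of sequences
    s = dict(sim)                        # working copy, extended per merge

    while len(sizes) > 1:
        # Pick the most similar pair of current clusters.
        a, b = max(combinations(sizes, 2), key=lambda p: s[frozenset(p)])
        na, nb = sizes.pop(a), sizes.pop(b)
        merged = (a, b)
        # Size-weighted average similarity to every remaining cluster
        # (an assumed weighting, see the note above).
        for c in sizes:
            s[frozenset((merged, c))] = (
                na * s[frozenset((a, c))] + nb * s[frozenset((b, c))]
            ) / (na + nb)
        sizes[merged] = na + nb

    return next(iter(sizes))


if __name__ == "__main__":
    similarity = {
        frozenset(("x", "y")): 0.9,
        frozenset(("x", "z")): 0.4,
        frozenset(("y", "z")): 0.2,
    }
    print(guide_tree(["x", "y", "z"], similarity))
```

The nested tuples give the merge order: the innermost pair is aligned first, and each enclosing tuple joins two previously built groups.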

Step 5: Compute MSA

Finally, compute the MSA by progressive alignment along the guide tree, or by iterative alignment.

References

[1] "PROBCONS: Probabilistic Consistency-based Multiple Sequence Alignment". Genome Research 15 (2): 330–340. 2005. doi:10.1101/gr.2821705. PMID 15687296.
[2] Roshan, Usman (2014-01-01). "Multiple Sequence Alignment Using Probcons and Probalign". In Russell, David J (ed.). Multiple Sequence Alignment Methods. Methods in Molecular Biology. 1079. Humana Press. pp. 147–153. doi:10.1007/978-1-62703-646-7_9. ISBN 9781627036450.
[3] Lecture "Bioinformatics II" at University of Freiburg.

External links

Official website

Original source: https://en.wikipedia.org/wiki/ProbCons
