
Full record view

Lücking, R./ B. P. Hodkinson/ A. Stamatakis/ R. A. Cartwright 2011: PICS-Ord: Unlimited Coding of Ambiguous Regions by Pairwise Identity and Cost Scores Ordination. - BMC Bioinformatics 12: 10. [RLL List # 224 / Rec.# 33102]
Abstract: Background: We present a novel method to encode ambiguously aligned regions in fixed multiple sequence alignments by 'Pairwise Identity and Cost Scores Ordination' (PICS-Ord). The method works via ordination of sequence identity or cost scores matrices by means of Principal Coordinates Analysis (PCoA). After identification of ambiguous regions, the method computes pairwise distances as sequence identities or cost scores, ordinates the resulting distance matrix by means of PCoA, and encodes the principal coordinates as ordered integers. Three biological and 100 simulated datasets were used to assess the performance of the new method.
Results: Including ambiguous regions coded by means of PICS-Ord increased topological accuracy, resolution, and bootstrap support in real biological and simulated datasets compared to the alternative of excluding such regions from the analysis a priori. In terms of accuracy, PICS-Ord performs equal to or better than previously available methods of ambiguous region coding (e.g., INAASE), with the advantage of a practically unlimited alignment size and increased analytical speed and the possibility of PICS-Ord scores to be analyzed together with DNA data in a partitioned maximum likelihood model.
Conclusions: Advantages of PICS-Ord over step matrix-based ambiguous region coding with INAASE include a practically unlimited number of OTUs and seamless integration of PICS-Ord codes into phylogenetic datasets, as well as the increased speed of phylogenetic analysis. Contrary to word- and frequency-based methods, PICS-Ord maintains the advantage of pairwise sequence alignment to derive distances, and the method is flexible with respect to the calculation of distance scores. In addition to distance and maximum parsimony, PICS-Ord codes can be analyzed in a Bayesian or maximum likelihood framework. RAxML (version 7.2.6 or higher that was developed for this study) allows up to 32-state ordered or unordered characters. A GTR, MK, or ORDERED model can be applied to analyse the PICS-Ord codes partition, with GTR performing slightly better than MK and ORDERED.
Availability: An implementation of the PICS-Ord algorithm is available from http://scit.us/projects/ngila/wiki/PICS-Ord. It requires both the statistical software R (http://www.r-project.org) and the alignment software Ngila (http://scit.us/projects/ngila).
– doi:10.1186/1471-2105-12-10

Notes: Sequence data from Graphidaceae and Physciaceae, in addition to simulated data, were analyzed to assess the new methodology.

URL: http://www.biomedcentral.com/1471-2105/12/10
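
Illustration: the three steps named in the abstract (pairwise distances, PCoA, coding of principal coordinates as ordered integers) can be sketched roughly as follows. This is not the published implementation, which relies on R and Ngila. All function names here are hypothetical. The distance is a stand-in based on Python's difflib similarity ratio rather than Ngila cost scores, the eigenvectors are found by a simple shifted power iteration instead of a library eigen-solver, and the equal-width binning into integer states is an assumption, not necessarily the scheme used in the paper.

```python
import math
from difflib import SequenceMatcher


def pairwise_distances(fragments):
    """Stand-in distance: 1 - difflib similarity ratio (NOT an Ngila cost score)."""
    n = len(fragments)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            ratio = SequenceMatcher(None, fragments[i], fragments[j],
                                    autojunk=False).ratio()
            dist[i][j] = dist[j][i] = 1.0 - ratio  # mirror to keep it symmetric
    return dist


def double_center(dist):
    """Gower centering of squared distances: B = -1/2 * J * D^2 * J."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
             for j in range(n)] for i in range(n)]


def top_eigenpair(b, iters=500):
    """Largest eigenvalue and unit eigenvector of symmetric b (shifted power iteration)."""
    n = len(b)
    # Gershgorin bound: adding it to the diagonal makes every eigenvalue non-negative.
    shift = max(sum(abs(x) for x in row) for row in b)
    # Centered start vector, so it has no component along the all-ones direction.
    v = [i - (n - 1) / 2.0 for i in range(n)]
    for _ in range(iters):
        w = [sum(b[i][j] * v[j] for j in range(n)) + shift * v[i] for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v


def pcoa(dist, n_axes=2, tol=1e-9):
    """Principal coordinates for up to n_axes axes with positive eigenvalues."""
    b = double_center(dist)
    n = len(b)
    axes = []
    for _ in range(n_axes):
        lam, v = top_eigenpair(b)
        if lam <= tol:
            break
        if v[0] > 0:  # axis orientation is arbitrary; fix it for reproducibility
            v = [-x for x in v]
        axes.append([x * math.sqrt(lam) for x in v])
        # Deflate: remove the axis just found before searching for the next one.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return axes


def encode_axis(coords, n_states=10):
    """Equal-width binning of one axis into ordered states 0..n_states-1 (assumed scheme)."""
    lo, hi = min(coords), max(coords)
    if hi == lo:
        return [0] * len(coords)
    scale = (n_states - 1) / (hi - lo)
    return [int(round((c - lo) * scale)) for c in coords]


if __name__ == "__main__":
    fragments = ["ACGT", "ACGT", "TTTTTT"]  # toy ambiguous-region fragments
    for axis in pcoa(pairwise_distances(fragments)):
        print(encode_axis(axis))
```

Each returned axis would become one ordered character column appended to the alignment. Because the characters are ordered, reversing an axis does not change the information it carries, which is why the sketch fixes the orientation arbitrarily.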
