Title: A test for the consecutive ones property on noisy data - Application to physical mapping and sequence assembly
Authors: Lu, WF
Hsu, WL
Department of Computer Science
Keywords: consecutive ones property;physical mapping;DNA sequence assembly;probe hybridization;clustering algorithm
Publication date: 2003
Abstract: A (0,1)-matrix satisfies the consecutive ones property (COP) for the rows if there exists a column permutation such that the ones in each row of the resultant matrix are consecutive. The consecutive ones test is useful for physical mapping and DNA sequence assembly, for example, in the STS content mapping of a YAC library and in Bactig assembly based on STS as well as EST markers. The linear-time algorithm by Booth and Lueker (1976) for this problem has a serious drawback: the data must be error free. However, laboratory work is never flawless. We devised a new iterative clustering algorithm for this problem, which has the following advantages: 1. If the original matrix satisfies the COP, then the algorithm will produce a column ordering realizing it without any fill-in. 2. Under moderate assumptions, the algorithm can accommodate the following four types of errors: false negatives, false positives, nonunique probes (NPs), and chimeric clones. Note that in some cases (low-quality EST marker identification), NPs occur because of repeat sequences. 3. If some local data is too noisy, the algorithm is likely to detect this and suggest additional lab work to reduce the degree of ambiguity in that part. 4. A unique feature of the algorithm is that, rather than forcing all probes to be included and ordered in the final arrangement, it deletes some noisy probes. Thus, it can produce more than one contig. The gaps are created mostly by noisy probes.
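The COP definition in the abstract can be illustrated with a minimal Python sketch. This is a brute-force check over all column permutations, written only to make the definition concrete; it is neither the linear-time Booth and Lueker algorithm nor the clustering algorithm of the paper, and it is practical only for very small matrices. The function names and the example matrices are hypothetical.

```python
from itertools import permutations


def rows_consecutive(matrix, order):
    """Return True if every row's ones are contiguous when columns follow `order`."""
    for row in matrix:
        # Positions (in the permuted layout) of the columns where this row has a 1.
        positions = [i for i, col in enumerate(order) if row[col] == 1]
        if positions and positions[-1] - positions[0] + 1 != len(positions):
            return False
    return True


def find_cop_ordering(matrix):
    """Return a column order realizing the COP, or None if none exists.

    Brute force over all permutations (factorial time); illustration only.
    """
    n_cols = len(matrix[0]) if matrix else 0
    for order in permutations(range(n_cols)):
        if rows_consecutive(matrix, order):
            return list(order)
    return None


# Hypothetical error-free data: rows are clones, columns are probes.
clean = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 1],
]

# Hypothetical data with no valid ordering: the three rows form a cycle
# on three columns, as a single spurious 1 or 0 might cause.
cyclic = [
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
]

print(find_cop_ordering(clean))
print(find_cop_ordering(cyclic))
```

On the second matrix no column ordering exists, which shows why an exact COP test alone is of little use on noisy laboratory data: one bad entry can turn a valid instance into a rejected one.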
URI: http://hdl.handle.net/11536/28257
http://dx.doi.org/10.1089/106652703322539051
ISSN: 1066-5277
DOI: 10.1089/106652703322539051
Journal: JOURNAL OF COMPUTATIONAL BIOLOGY
Volume: 10
Issue: 5
Start page: 709
End page: 735
Appears in Collections: Articles