Title: Design of a mathematical expression understanding system
Authors: Lee, HJ
Wang, JS
Published under NCTU affiliation:
National Chiao Tung University
Department of Computer Science
Keywords: character segmentation;character recognition;expression formation;error correction
Issue Date: 1-Mar-1997
Abstract: A scientific document usually consists of text and mathematical expressions. In this paper, we present a system for segmenting and understanding text and mathematical expressions in a document. The system is divided into six stages: page segmentation and labeling, character segmentation, feature extraction, character recognition, expression formation, and error correction and expression extraction. After extracting all text lines in a document, we separate the symbols in each text line and calculate direction-feature vectors and aspect ratios for those symbols. A nearest-neighbor algorithm then recognizes the characters. In the expression formation stage, we build a symbol relation tree for each text line that represents the relationships among the symbols in that line. Each text line is decomposed into a collection of primitive tokens: operands, operators, and separators. Heuristic rules based on these primitive tokens are used to correct text recognition errors. Finally, we extract all mathematical expressions according to basic expression forms. Several pages of documents were scanned to test the method. All mathematical expressions were understood, although a few symbols in the generated expressions were misrecognized. The average recognition rate was 96.16%. (C) 1997 Elsevier Science B.V.
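The recognition and tokenization steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the prototype vectors, the use of Euclidean distance, the four-value direction feature, and the symbol sets below are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

# Hypothetical prototypes: (label, feature vector). Each vector holds four
# made-up direction-feature values followed by an aspect ratio (width / height).
PROTOTYPES = [
    ("1", [0.05, 0.85, 0.05, 0.05, 0.30]),
    ("-", [0.90, 0.02, 0.04, 0.04, 4.00]),
    ("x", [0.10, 0.10, 0.40, 0.40, 1.00]),
]

# Hypothetical symbol sets for two of the three primitive token types;
# anything else is treated as an operand.
OPERATORS = {"+", "-", "=", "/"}
SEPARATORS = {",", ";", "(", ")"}


def recognize(features, prototypes=PROTOTYPES):
    """Return the label of the prototype nearest to `features`.

    Euclidean distance is an assumption; the abstract only says
    "nearest-neighbor".
    """
    label, _ = min(prototypes, key=lambda p: math.dist(features, p[1]))
    return label


def token_type(symbol):
    """Map a recognized symbol to operand, operator, or separator."""
    if symbol in OPERATORS:
        return "operator"
    if symbol in SEPARATORS:
        return "separator"
    return "operand"


unknown = [0.85, 0.05, 0.05, 0.05, 3.50]  # wide shape, mostly horizontal strokes
symbol = recognize(unknown)
print(symbol, token_type(symbol))
```

In this toy setup the aspect ratio dominates the distance, which hints at why a real system would likely normalize or weight the features before comparing them.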
URI: http://hdl.handle.net/11536/695
ISSN: 0167-8655
Journal: PATTERN RECOGNITION LETTERS
Volume: 18
Issue: 3
Start page: 289
End page: 298
Appears in Collections: Articles


Files in This Item:

  1. A1997WZ62900008.pdf