Title: Automatic 2D Virtual Face Generation by 3D Model Transformation Techniques and Applications
Authors: Yi-Fan Chang, Wen-Hsiang Tsai
Institute of Multimedia Engineering
Keywords: Virtual face; Talking cartoon face; Sentence utterance segmentation; Facial expression simulation; Head movement simulation; Lip movement synthesis; Speech synchronization
Issue Date: 2006
Abstract:
In this study, a system for the automatic generation of talking cartoon faces is proposed, comprising four processes: cartoon face creation, speech analysis, facial expression and lip movement synthesis, and animation generation. A face model with 72 facial feature points is adopted. A method for constructing a 3D local coordinate system for the cartoon face is proposed, and a knowledge-based method is used to transform between the global and local coordinate systems. A 3D rotation technique is applied to the cartoon face model, augmented with some additional points, to draw the face in different poses. Control points assigned to selected feature points animate the cartoon face with different facial expressions, and a statistical method is proposed to simulate the timing of facial expressions and natural head movements during speech. For lip synchronization, a sentence utterance segmentation algorithm is proposed and a syllable alignment technique is applied; twelve basic mouth shapes for Mandarin speech are defined to synthesize lip movements. A frame interpolation method is utilized to generate the animation. Finally, an editable and open XML-based vector graphics format, Scalable Vector Graphics (SVG), is used to render the cartoon face and synchronize it with speech. Two interesting applications are implemented, and good experimental results show the feasibility and applicability of the proposed methods.
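The pose-generation step described above — rotating the 3D face model and projecting it back to 2D — can be sketched as follows. This is a minimal illustration, not the thesis's actual implementation: the two feature points and the 30-degree turn are hypothetical, standing in for the full 72-point model.

```python
import math

def rotate_y(point, angle_deg):
    """Rotate a 3D point about the vertical (y) axis by angle_deg degrees,
    simulating a head turn in the face-centered coordinate system."""
    x, y, z = point
    a = math.radians(angle_deg)
    return (x * math.cos(a) + z * math.sin(a),
            y,
            -x * math.sin(a) + z * math.cos(a))

def project_2d(point):
    """Orthographic projection onto the image (x, y) plane,
    yielding the 2D cartoon-face drawing coordinates."""
    x, y, _ = point
    return (x, y)

# Hypothetical feature points in a face-centered 3D coordinate system.
nose_tip = (0.0, 0.0, 1.0)
left_eye = (-0.5, 0.5, 0.5)

# Turning the head 30 degrees shifts the projected x-coordinates,
# producing the face in a new pose.
pose = [project_2d(rotate_y(p, 30.0)) for p in (nose_tip, left_eye)]
```

The same rotation can be applied about the other axes (nodding, tilting) to cover the range of poses the statistical head-movement model selects.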
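The frame interpolation mentioned in the abstract can be sketched as linear interpolation between the control points of two mouth shapes; the two-point shapes and frame count below are illustrative assumptions, not the thesis's actual mouth-shape definitions.

```python
def interpolate_frames(shape_a, shape_b, n_frames):
    """Generate n_frames intermediate mouth shapes between shape_a and
    shape_b, each a list of (x, y) control points, by linear interpolation."""
    frames = []
    for i in range(1, n_frames + 1):
        t = i / (n_frames + 1)  # interpolation parameter, strictly between 0 and 1
        frames.append([(ax + t * (bx - ax), ay + t * (by - ay))
                       for (ax, ay), (bx, by) in zip(shape_a, shape_b)])
    return frames

# Hypothetical control points for two basic mouth shapes.
mouth_closed = [(0.0, 0.0), (1.0, 0.0)]
mouth_open = [(0.0, -0.2), (1.0, -0.2)]

# Three in-between frames smooth the transition between the two shapes.
frames = interpolate_frames(mouth_closed, mouth_open, 3)
```

In the SVG output, each interpolated frame would redraw the mouth path at the corresponding time step, keeping the animation synchronized with the segmented speech.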
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009457518
http://hdl.handle.net/11536/82239
Appears in Collections:Thesis


Files in This Item:
  751801.pdf