
 The Delphion Integrated View

Title: US6366885: Speech driven lip synthesis using viseme based hidden markov models

Country: US United States of America

10 pages

Inventor: Basu, Sankar; Tenafly, NJ
Faruquie, Tanveer Afzal; Munirka, India
Neti, Chalapathy V.; Yorktown Heights, NY
Rajput, Nitendra; New Delhi, India
Senior, Andrew William; New York, NY
Subramaniam, L. Venkata; New Delhi, India
Verma, Ashish; New Delhi, India

Assignee: International Business Machines Corporation, Armonk, NY

Published / Filed: 2002-04-02 / 1999-08-27

Application Number: US1999000384763

IPC Code: Advanced: G10L 21/06; G11B 27/031; G11B 27/10;
IPC-7: G10L 15/14; G10L 21/06; G11B 27/00;

ECLA Code: G11B27/031; G11B27/10; S10L21/10L;

U.S. Class: 704/270; 704/235; 704/258;

Field of Search: 704/270,258,235 348/345,576 345/473 352/087 707/500.1

Priority Number:
1999-08-27  US1999000384763

Abstract:     A method of speech driven lip synthesis that applies viseme based training models to units of visual speech. The audio data is grouped into a small number of visually distinct visemes rather than the larger number of phonemes. These visemes then form the basis for a Hidden Markov Model (HMM) state sequence or the output nodes of a neural network. During the training phase, audio and visual features are extracted from input speech, which is then aligned according to the apparent viseme sequence, with the corresponding audio features being used to calculate the HMM state output probabilities or the output of the neural network. During the synthesis phase, the acoustic input is aligned with the most likely viseme HMM sequence (in the case of an HMM based model) or with the nodes of the network (in the case of a neural network based system), which is then used for animation.
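The many-to-one grouping of phonemes into visually distinct visemes described in the abstract can be sketched as follows. Note this is an illustrative sketch only: the specific phoneme groupings and viseme names below are assumptions for demonstration, not the patent's actual mapping table.

```python
# Illustrative many-to-one mapping from phonemes to visually distinct
# visemes. These groupings are assumed for demonstration purposes.
PHONEME_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",   # lips pressed together
    "f": "labiodental", "v": "labiodental",              # lower lip to teeth
    "uw": "rounded", "ow": "rounded", "w": "rounded",    # rounded lips
    "aa": "open", "ae": "open", "ah": "open",            # open mouth
}

def phonemes_to_visemes(phonemes):
    """Collapse a phoneme sequence into a viseme sequence, merging
    consecutive identical visemes (only visual changes matter)."""
    out = []
    for p in phonemes:
        v = PHONEME_TO_VISEME.get(p, "neutral")
        if not out or out[-1] != v:
            out.append(v)
    return out

print(phonemes_to_visemes(["p", "ah", "m", "b", "f"]))
# ['bilabial', 'open', 'bilabial', 'labiodental']
```

Because several phonemes map to one viseme, the viseme-based models train on more data per class than phoneme-based models would, which is the efficiency the abstract claims.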

Attorney, Agent or Firm: Whitman, Curtis & Christofferson, P.C. ; Kaufman, Stephen C. ;

Primary / Asst. Examiners: Dorvil, Richemond; Nolan, Daniel A.


Family: None

First Claim (of 16 claims):
Having thus described our invention, what we claim as new and desire to secure by Letters Patent is as follows:     1. A computer implemented method of synthesizing lip movements from speech acoustics, comprising the steps of:
  • developing a direct correspondence between audio data and distinct visemes;
  • applying said correspondence to new audio data and generating an output viseme sequence corresponding to said new audio data.
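The two claimed steps might be sketched as below, under the simplifying assumption of a single mean feature vector (centroid) per viseme standing in for the patent's HMM state output distributions or neural network; nearest-centroid classification here is only a stand-in for Viterbi alignment against viseme HMMs.

```python
def train_correspondence(frames, viseme_labels):
    """Step 1 (training): develop a correspondence between audio feature
    frames and visemes by estimating one mean vector per viseme.
    The patent's HMM / neural-network models are richer; this is a sketch."""
    sums, counts = {}, {}
    for x, v in zip(frames, viseme_labels):
        if v not in sums:
            sums[v] = [0.0] * len(x)
            counts[v] = 0
        sums[v] = [a + b for a, b in zip(sums[v], x)]
        counts[v] += 1
    return {v: [s / counts[v] for s in vec] for v, vec in sums.items()}

def apply_correspondence(frames, centroids):
    """Step 2 (synthesis): apply the learned correspondence to new audio,
    emitting for each frame the viseme whose centroid is nearest."""
    def dist2(x, c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return [min(centroids, key=lambda v: dist2(x, centroids[v])) for x in frames]

centroids = train_correspondence(
    [[0.0], [0.1], [1.0], [0.9]], ["closed", "closed", "open", "open"])
print(apply_correspondence([[0.2], [0.8]], centroids))
# ['closed', 'open']
```

The output viseme sequence would then drive the facial animation, as the abstract describes.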


Forward References: 17 U.S. patents reference this one

U.S. References (backward, 5):

Patent     Pub. Date  Inventor           Assignee                                           Title
US5657426  1997-08    Waters et al.      Digital Equipment Corporation                      Method and apparatus for producing audio-visual synthetic speech
US5880788  1999-03    Bregler            Interval Research Corporation                      Automated synchronization of video image sequences to new soundtracks
US5884267  1999-03    Goldenthal et al.  Digital Equipment Corporation                      Automated speech alignment for image synthesis
US6052132  2000-04    Christian et al.   Digital Equipment Corporation                      Technique for providing a computer generated face having coordinated eye and head movement
US6208356  2001-03    Breen et al.       British Telecommunications public limited company  Image synthesis
Foreign References: None

Other References:
  • Chen et al., "Audio-Visual Integration in Multimodal Communication," Proceedings of the IEEE, vol. 86, no. 5, May 1998.*
  • Goldschen et al., "Rationale for Phoneme-Viseme Mapping and Feature Selection in Visual Speech Recognition," Aug. 28-Sep. 8, 1995.

    Thomson Reuters Copyright © 1997-2014 Thomson Reuters 