Title: US5875426: Recognizing speech having word liaisons by adding a phoneme to reference word models
Country: US United States of America

Inventor: Bahl, Lalit Rai; Amawalk, NY
De Gennaro, Steven Vincent; Pawling, NY
deSouza, Peter Vincent; San Jose, CA
Epstein, Edward Adam; Putnam Valley, NY
Le Roux, Jean-Michel; Elmsford, NY
Lewis, Burn Lewin; Ossining, NY
Waast-Richard, Claire; Paris, France

Assignee: International Business Machines Corporation, Armonk, NY
Published / Filed: 1999-02-23 / 1996-06-12

Application Number: US1996000662407

IPC Code: Advanced: G10L 15/08; G10L 15/18;
IPC-7: G10L 5/06;

ECLA Code: G10L15/08; G10L15/18; S10L15/187;

U.S. Class: Current: 704/255; 704/252; 704/E15.014; 704/E15.018;
Original: 704/255; 704/252;

Field of Search: 395/2.64,2.66,2.52,2.61,2.6,2.4,2.09,2.53,2.63,2.62

Priority Number:
1996-06-12  US1996000662407

Abstract:     A method and system for recognizing speech. The method and system perform a fast match on a word in the string of speech to be recognized, generating a fast match list of the words in a system vocabulary that most likely match the current word to be recognized. Next, the method and system perform a detailed match on the words in the fast match list and generate a detailed match list of the words that most likely match the current word. Then, for each word in the detailed match list that can accept a liaison phoneme from a preceding word (a liaison receptor), the method and system add to the detailed match list a form of the liaison receptor in which the liaison phoneme is added to the receptor. The result is a modified detailed match list that includes the added liaison-receptor forms. Finally, the method and system output the word in the modified detailed match list that has the highest probability of matching the word to be recognized.
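The last two steps of the abstract (adding liaison forms to the detailed match list, then outputting the best entry) can be illustrated with a minimal Python sketch. It is not the patented implementation: `Candidate`, `LIAISON_RECEPTORS`, `add_liaison_forms`, `best_word`, and the `rescore` callback are hypothetical names introduced here, the phoneme labels are toy ASCII symbols, and the scores merely stand in for whatever probability the detailed match would assign.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Phonemes = Tuple[str, ...]

@dataclass(frozen=True)
class Candidate:
    word: str           # vocabulary word
    phonemes: Phonemes  # phoneme sequence of the reference word model
    score: float        # assumed convention: higher means a better match

# Hypothetical toy set of words that can accept a liaison phoneme.
LIAISON_RECEPTORS = {"amis", "enfants"}

def add_liaison_forms(
    detailed: List[Candidate],
    liaison_phoneme: str,
    rescore: Callable[[Phonemes], float],
) -> List[Candidate]:
    """Return the detailed match list plus a liaison form of each receptor."""
    modified = list(detailed)  # keep every original entry
    for cand in detailed:
        if cand.word in LIAISON_RECEPTORS:
            # Liaison form: the liaison phoneme placed before the word model.
            phonemes = (liaison_phoneme,) + cand.phonemes
            modified.append(Candidate(cand.word, phonemes, rescore(phonemes)))
    return modified

def best_word(modified: List[Candidate]) -> Candidate:
    """Pick the entry with the highest score from the modified list."""
    return max(modified, key=lambda c: c.score)
```

In this sketch the unmodified entries stay in the list, so a receptor spoken without liaison can still win on its original score; only the extra liaison forms are rescored.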

Attorney, Agent or Firm: Tassinari, Jr., Robert P.

Primary / Asst. Examiners: Hudspeth, David R.; Storm, Donald L.

We claim:     1. A method for recognizing speech, comprising:
  • inputting an utterance, said utterance includes at least one word in a system vocabulary;
  • representing the utterance as a temporal sequence of frames, each frame representing acoustic parameters of the utterance at one of a succession of brief time periods;
  • generating a first match list of most probable matches between a sequence of one or more of the frames and words in the system vocabulary;
  • analyzing the first match list to output a second match list, where the second match list establishes a ranking of the most probable matches between the sequence of one or more of the frames and words in the system vocabulary;
  • selecting each match in the second match list that can accept a liaison phoneme from an immediately preceding word, where the selected match in the second match list represents a current word;
  • determining whether the immediately preceding word can generate a liaison, wherein a liaison generator is a word that ends with an unpronounced consonant phoneme when followed by a word beginning with a consonant phoneme, and ends with a pronounced phoneme when followed by a word with a beginning selected from the group consisting of a vowel and a vowel-like phoneme;
  • amending the second match list by adding a word that represents a placement of the liaison phoneme at the beginning of the current word that creates a third match list;
  • selecting a word from the third match list having the highest ranking of the most probable match to the sequence of frames.
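The liaison-generator test and the placement of the liaison phoneme described in the claim can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the claimed implementation: the lexicon `LIAISON_GENERATORS`, the phoneme sets `VOWELS` and `VOWEL_LIKE`, and the function names are all hypothetical, and the phoneme labels are toy ASCII symbols rather than a real phonetic alphabet.

```python
from typing import Optional, Tuple

Phonemes = Tuple[str, ...]

# Toy phoneme classes (assumed for illustration only).
VOWELS = {"a", "e", "i", "o", "u"}
VOWEL_LIKE = {"w", "j"}  # assumption: glides treated as vowel-like

# Hypothetical lexicon: word -> final consonant that is silent before a
# consonant but pronounced before a vowel or vowel-like phoneme.
LIAISON_GENERATORS = {"les": "z", "petit": "t"}

def liaison_phoneme(prev_word: str) -> Optional[str]:
    """Return the latent final consonant if prev_word is a liaison generator."""
    return LIAISON_GENERATORS.get(prev_word)

def starts_vowel_like(phonemes: Phonemes) -> bool:
    """True if the word begins with a vowel or a vowel-like phoneme."""
    return bool(phonemes) and phonemes[0] in (VOWELS | VOWEL_LIKE)

def liaison_form(prev_word: str, current: Phonemes) -> Optional[Phonemes]:
    """Prefix the liaison phoneme to the current word, or return None
    when no liaison can occur between the two words."""
    ph = liaison_phoneme(prev_word)
    if ph is None or not starts_vowel_like(current):
        return None
    return (ph,) + tuple(current)
```

For instance, under these toy tables `liaison_form("les", ("a", "m", "i"))` yields a form beginning with `"z"`, while a consonant-initial current word or a preceding word outside the generator table yields `None`, so no extra entry would be added to the match list.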

