

 The Delphion Integrated View

Title: US6266637: Phrase splicing and variable substitution using a trainable speech synthesizer


Country: US United States of America

Pages: 13

 
Inventor: Donovan, Robert E.; Mt. Kisco, NY
Franz, Martin; Yorktown Heights, NY
Roukos, Salim E.; Scarsdale, NY
Sorensen, Jeffrey; Seymour, CT

Assignee: International Business Machines Corporation, Armonk, NY

Published / Filed: 2001-07-24 / 1998-09-11

Application Number: US1998000152178

IPC Code: Advanced: G10L 13/06;
Core: G10L 13/00;
IPC-7: G10L 13/00;

ECLA Code: G10L13/06;

U.S. Class: Current: 704/258; 704/E13.009;
Original: 704/258;

Field of Search: 704/258,260,265

Priority Number:
1998-09-11  US1998000152178

Abstract:     In accordance with the present invention, a method for providing generation of speech includes the steps of: providing input to be acoustically produced; comparing the input to training data or application-specific splice files to identify one of words and word sequences corresponding to the input for constructing a phone sequence; using a search algorithm to identify a segment sequence to construct output speech according to the phone sequence; and concatenating segments and modifying characteristics of the segments to be substantially equal to requested characteristics. Application-specific data is advantageously used to make pertinent information available to synthesize both the phone sequence and the output speech. Also described is a system for performing operations in accordance with the disclosure.
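
The segment search step in the abstract can be illustrated with a small dynamic-programming sketch. This is not the patented algorithm: the record above does not say which search is used or how candidates are scored, so the function name `best_segment_sequence` and the `target_cost` and `join_cost` callbacks are hypothetical. The sketch only assumes that each phone has a list of candidate recorded segments and that the cheapest path through those candidates is wanted.

```python
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")  # a candidate segment; its concrete type is left to the caller


def best_segment_sequence(
    candidates: Sequence[Sequence[S]],
    target_cost: Callable[[int, S], float],
    join_cost: Callable[[S, S], float],
) -> List[S]:
    """Pick one candidate per phone position, minimizing total cost.

    Both cost functions are assumptions made for illustration only.
    """
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every phone position needs at least one candidate")

    # cost[k] = cheapest total cost of any path ending in candidates[t][k]
    cost = [target_cost(0, s) for s in candidates[0]]
    back: List[List[int]] = []  # back[t - 1][k] = best predecessor index

    for t in range(1, len(candidates)):
        new_cost: List[float] = []
        pointers: List[int] = []
        for s in candidates[t]:
            prev = min(
                range(len(cost)),
                key=lambda k: cost[k] + join_cost(candidates[t - 1][k], s),
            )
            pointers.append(prev)
            new_cost.append(
                cost[prev] + join_cost(candidates[t - 1][prev], s) + target_cost(t, s)
            )
        cost = new_cost
        back.append(pointers)

    # Trace back from the cheapest final candidate.
    k = min(range(len(cost)), key=cost.__getitem__)
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return [candidates[t][k] for t, k in enumerate(path)]
```

With a join cost that is zero for segments that were adjacent in the same recording, a search of this kind tends to pick long contiguous stretches of recorded speech, which is the effect phrase splicing is after. The concatenation and prosody modification step is not sketched here.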

Attorney, Agent or Firm: F. Chau & Associates, LLP ;

Primary / Asst. Examiners: Zele, Krista; Opsasnick, Michael N.


Family: None

First Claim (of 27):
What is claimed is:
1. A method for providing generation of speech comprising the steps of:
  • providing splice phrases including recorded human speech to be employed in synthesizing speech;
  • constructing a splice file dictionary including every word and every word sequence for the splice phrases and including a phone sequence associated with every word and every word sequence for the splice phrases;
  • providing input to be acoustically produced;
  • comparing the input to training data in the splice file dictionary to identify one of words and word sequences corresponding to the input for constructing a phone sequence;
  • comparing the input to a pronunciation dictionary when the input is not found in the training data of the splice file dictionary;
  • identifying a segment sequence using a first search algorithm to construct output speech according to the phone sequence; and
  • concatenating segments of the segment sequence and modifying characteristics of the segments to be substantially equal to requested characteristics.
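
The dictionary steps of the claim (building the splice file dictionary, looking up the input, and falling back to a pronunciation dictionary) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the names `build_splice_dictionary` and `phone_sequence`, the data layout, and the greedy longest-match strategy are all hypothetical choices that the claim text does not specify.

```python
from typing import Dict, List, Sequence, Tuple

Phones = List[str]
SpliceDict = Dict[Tuple[str, ...], Phones]


def build_splice_dictionary(
    phrases: Sequence[Sequence[Tuple[str, Phones]]]
) -> SpliceDict:
    """Index every word and every contiguous word sequence of each phrase.

    Each phrase is assumed to be a list of (word, phones) pairs.
    """
    table: SpliceDict = {}
    for phrase in phrases:
        for start in range(len(phrase)):
            words: List[str] = []
            phones: Phones = []
            for word, word_phones in phrase[start:]:
                words.append(word)
                phones = phones + word_phones  # new list, so stored entries stay fixed
                table.setdefault(tuple(words), phones)
    return table


def phone_sequence(
    words: Sequence[str],
    splice: SpliceDict,
    pronunciations: Dict[str, Phones],
) -> Phones:
    """Greedy longest match in the splice dictionary, else per-word fallback."""
    out: Phones = []
    i = 0
    while i < len(words):
        for j in range(len(words), i, -1):
            key = tuple(words[i:j])
            if key in splice:
                out.extend(splice[key])
                i = j
                break
        else:
            # Not found in the splice data: use the pronunciation dictionary.
            # An unknown word raises KeyError in this sketch.
            out.extend(pronunciations[words[i]])
            i += 1
    return out
```

In this sketch a variable slot such as a name is simply a word that is absent from the splice phrases, so it is resolved through the pronunciation dictionary while the surrounding carrier phrase is matched as a whole.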


Forward References: 65 U.S. patents reference this one

U.S. References (15 backward references):

Patent | Pub. Date | Inventor | Assignee | Title | Pages
US4692941 | 1987-09 | Jacks et al. | First Byte | Real-time text-to-speech conversion system | 18
US4882759 | 1989-11 | Bahl et al. | International Business Machines Corporation | Synthesizing word baseforms used in speech recognition | 23
US5202952 | 1993-04 | Gillick et al. | Dragon Systems, Inc. | Large-vocabulary continuous speech prefiltering and processing system | 11
US5333313 | 1994-07 | Heising | Franklin Electronic Publishers, Incorporated | Method and apparatus for compressing a dictionary database by partitioning a master dictionary database into a plurality of functional parts and applying an optimum compression technique to each part | 10
US5384893 | 1995-01 | Hutchins | Emerson & Stern Associates, Inc. | Method and apparatus for speech synthesis based on prosodic analysis | 28
US5502791 | 1996-03 | Nishimura et al. | International Business Machines Corporation | Speech recognition by concatenating fenonic allophone hidden Markov models in parallel among subwords | 13
US5513298 | 1996-04 | Stanford et al. | International Business Machines Corporation | Instantaneous context switching for speech recognition systems | 11
US5526463 | 1996-06 | Gillick et al. | Dragon Systems, Inc. | System for processing a succession of utterances spoken in continuous or discrete form | 11
US5706397 | 1998-01 | Chow | Apple Computer, Inc. | Speech recognition system with multi-level pruning for acoustic matching | 14
US5839105 | 1998-11 | Ostendorf et al. | ATR Interpreting Telecommunications Research Laboratories | Speaker-independent model generation apparatus and speech recognition apparatus each equipped with means for splitting state having maximum increase in likelihood | 25
US5884261 | 1999-03 | DeSouza et al. | Apple Computer, Inc. | Method and apparatus for tone-sensitive acoustic modeling | 28
US5937385 | 1999-08 | Zarozny et al. | International Business Machines Corporation | Method and apparatus for creating speech recognition grammars constrained by counter examples | 17
US5983180 | 1999-11 | Robinson | SoftSound Limited | Recognition of sequential data using finite state sequence models organized in a tree structure | 39
US6032111 | 2000-02 | Mohri | AT&T Corp. | Method and apparatus for compiling context-dependent rewrite rules and input strings | 33
US6038533 | 2000-03 | Buchsbaum et al. | Lucent Technologies Inc. | System and method for selecting training text | 17
       
Foreign References: None

Other References:
  • E-Speech web page; http://www.espeech.com/NaturalSynthesis.htm.
  • Bahl et al., (1993), Context Dependent Vector Quantization for Continuous Speech Recognition, Proc. ICASSP 93, Minneapolis, vol. 2, pp. 632-635.
  • Donovan, R.E., (1996), Trainable Speech Synthesis, Ph.D. Thesis, Cambridge University Engineering Department.
  • Donovan et al., (1998), The IBM Trainable Speech Synthesis System, Proc. ICSLP 98, Sydney.
  • Moulines et al., (1990), Pitch-Synchronous Waveform Processing Techniques for Text-to-Speech Synthesis Using Diphones, Speech Communication, 9, pp. 453-467.
  • Yi, Jon Rong-Wei, (1998), Natural-Sounding Speech Synthesis Using Variable-Length Units, Massachusetts Institute of Technology, pp. 1-121.

