CA2233728A1 - Multiple models integration for multi-environment speech recognition

Info

Publication number
CA2233728A1
Authority
CA
Canada
Prior art keywords
speech recognition
models
multiple models
environment speech
integration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CA002233728A
Other languages
French (fr)
Other versions
CA2233728C (en)
Inventor
Mazin G. Rahim
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
AT&T Intellectual Property II LP
Original Assignee
AT&T Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by AT&T Corp
Publication of CA2233728A1
Application granted
Publication of CA2233728C
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G10L15/142 Hidden Markov Models [HMMs]
    • G10L15/144 Training of HMMs
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/226 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
    • G10L2015/228 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise

Abstract

A speech recognition system which effectively recognizes unknown speech from multiple acoustic environments includes a set of secondary models, each associated with one or more particular acoustic environments, integrated with a base set of recognition models. The speech recognition system is trained by making a set of secondary models in a first stage of training, and integrating the set of secondary models with a base set of recognition models in a second stage of training.
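
The two training stages named in the abstract can be sketched in code. This is a minimal illustration only, not the patented method: it assumes each base recognition model is a single mean vector per word, each secondary model is an additive feature offset for one acoustic environment, and integration simply shifts the base means by that offset. All function names (train_secondary_models, integrate, recognize) and the toy data are hypothetical.

```python
from statistics import fmean

def train_secondary_models(env_data, base_models):
    """Stage 1 (assumed form): estimate one additive offset per environment.

    env_data maps an environment name to a list of (label, feature_vector).
    """
    secondary = {}
    for env, samples in env_data.items():
        dim = len(samples[0][1])
        # Average residual between observed features and the base mean
        # of the labelled word, computed per feature dimension.
        secondary[env] = [
            fmean(x[d] - base_models[label][d] for label, x in samples)
            for d in range(dim)
        ]
    return secondary

def integrate(base_models, offset):
    """Stage 2 (assumed form): shift every base mean by an environment offset."""
    return {
        label: [m + o for m, o in zip(mean, offset)]
        for label, mean in base_models.items()
    }

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def recognize(x, base_models, secondary):
    """Return the (environment, label) pair whose integrated model is closest to x."""
    _, env, label = min(
        (sq_dist(x, mean), env, label)
        for env, offset in secondary.items()
        for label, mean in integrate(base_models, offset).items()
    )
    return env, label

# Hypothetical toy data: two words, two environments.
base_models = {"yes": [0.0, 0.0], "no": [4.0, 4.0]}
env_data = {
    "clean": [("yes", [0.0, 0.0]), ("no", [4.0, 4.0])],
    "car": [("yes", [1.0, 1.0]), ("no", [5.0, 5.0])],
}
secondary = train_secondary_models(env_data, base_models)
print(recognize([4.9, 5.1], base_models, secondary))
```

In this toy setup the "car" environment shifts every feature by one unit, so an utterance near [5, 5] is matched to the word "no" under the "car" offset rather than to an unshifted base model. A real system would use HMMs and a learned transformation in place of mean vectors and offsets.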

Application CA002233728A, priority date 1997-05-27, filing date 1998-03-31: Multiple models integration for multi-environment speech recognition. Status: Expired - Lifetime. Granted as CA2233728C (en).

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US08/863,927 US5960397A (en) 1997-05-27 1997-05-27 System and method of recognizing an acoustic environment to adapt a set of based recognition models to the current acoustic environment for subsequent speech recognition
US08/863,927 1997-05-27

Publications (2)

Publication Number Publication Date
CA2233728A1 (en) 1998-11-27
CA2233728C (en) 2002-10-15

Family

ID=25342132

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002233728A Expired - Lifetime CA2233728C (en) 1997-05-27 1998-03-31 Multiple models integration for multi-environment speech recognition

Country Status (4)

Country Link
US (1) US5960397A (en)
EP (2) EP1526504B1 (en)
CA (1) CA2233728C (en)
DE (2) DE69838189T2 (en)

Families Citing this family (97)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998028733A1 (en) * 1996-12-24 1998-07-02 Koninklijke Philips Electronics N.V. A method for training a speech recognition system and an apparatus for practising the method, in particular, a portable telephone apparatus
JP3584458B2 (en) * 1997-10-31 2004-11-04 ソニー株式会社 Pattern recognition device and pattern recognition method
US6327565B1 (en) * 1998-04-30 2001-12-04 Matsushita Electric Industrial Co., Ltd. Speaker and environment adaptation based on eigenvoices
US6980952B1 (en) * 1998-08-15 2005-12-27 Texas Instruments Incorporated Source normalization training for HMM modeling of speech
US6411930B1 (en) * 1998-11-18 2002-06-25 Lucent Technologies Inc. Discriminative gaussian mixture models for speaker verification
US6275800B1 (en) * 1999-02-23 2001-08-14 Motorola, Inc. Voice recognition system and method
DE60018696T2 (en) * 1999-07-01 2006-04-06 Koninklijke Philips Electronics N.V. ROBUST LANGUAGE PROCESSING OF CHARACTERED LANGUAGE MODELS
US6691089B1 (en) * 1999-09-30 2004-02-10 Mindspeed Technologies Inc. User configurable levels of security for a speaker verification system
US7016835B2 (en) * 1999-10-29 2006-03-21 International Business Machines Corporation Speech and signal digitization by using recognition metrics to select from multiple techniques
US20020055844A1 (en) * 2000-02-25 2002-05-09 L'esperance Lauren Speech user interface for portable personal devices
DE60120949T2 (en) 2000-04-04 2007-07-12 Gn Resound A/S A HEARING PROSTHESIS WITH AUTOMATIC HEARING CLASSIFICATION
WO2001022790A2 (en) * 2001-01-05 2001-04-05 Phonak Ag Method for operating a hearing-aid and a hearing aid
DE10041456A1 (en) * 2000-08-23 2002-03-07 Philips Corp Intellectual Pty Method for controlling devices using voice signals, in particular in motor vehicles
US6560755B1 (en) 2000-08-24 2003-05-06 Cadence Design Systems, Inc. Apparatus and methods for modeling and simulating the effect of mismatch in design flows of integrated circuits
US7219058B1 (en) * 2000-10-13 2007-05-15 At&T Corp. System and method for processing speech recognition results
US7457750B2 (en) * 2000-10-13 2008-11-25 At&T Corp. Systems and methods for dynamic re-configurable speech recognition
JP4244514B2 (en) * 2000-10-23 2009-03-25 セイコーエプソン株式会社 Speech recognition method and speech recognition apparatus
WO2002056303A2 (en) * 2000-11-22 2002-07-18 Defense Group Inc. Noise filtering utilizing non-gaussian signal statistics
WO2001020965A2 (en) 2001-01-05 2001-03-29 Phonak Ag Method for determining a current acoustic environment, use of said method and a hearing-aid
US6804647B1 (en) * 2001-03-13 2004-10-12 Nuance Communications Method and system for on-line unsupervised adaptation in speaker verification
US7239324B2 (en) 2001-03-23 2007-07-03 Microsoft Corporation Methods and systems for merging graphics for display on a computing device
US7038690B2 (en) * 2001-03-23 2006-05-02 Microsoft Corporation Methods and systems for displaying animated graphics on a computing device
US6933856B2 (en) * 2001-08-02 2005-08-23 Halliburton Energy Services, Inc. Adaptive acoustic transmitter controller apparatus and method
WO2004013997A1 (en) * 2001-08-02 2004-02-12 Halliburton Energy Service, Inc. Adaptive acoustic transmitter controller apparatus and method
US20030033143A1 (en) * 2001-08-13 2003-02-13 Hagai Aronowitz Decreasing noise sensitivity in speech processing under adverse conditions
US7437289B2 (en) * 2001-08-16 2008-10-14 International Business Machines Corporation Methods and apparatus for the systematic adaptation of classification systems from sparse adaptation data
US6778957B2 (en) * 2001-08-21 2004-08-17 International Business Machines Corporation Method and apparatus for handset detection
US6862359B2 (en) 2001-12-18 2005-03-01 Gn Resound A/S Hearing prosthesis with automatic classification of the listening environment
US7072834B2 (en) * 2002-04-05 2006-07-04 Intel Corporation Adapting to adverse acoustic environment in speech processing using playback training data
US7804973B2 (en) * 2002-04-25 2010-09-28 Gn Resound A/S Fitting methodology and hearing prosthesis based on signal-to-noise ratio loss data
GB2389217A (en) * 2002-05-27 2003-12-03 Canon Kk Speech recognition system
US7398209B2 (en) 2002-06-03 2008-07-08 Voicebox Technologies, Inc. Systems and methods for responding to natural language speech utterance
US7693720B2 (en) 2002-07-15 2010-04-06 Voicebox Technologies, Inc. Mobile systems and methods for responding to natural language speech utterance
US20040024599A1 (en) * 2002-07-31 2004-02-05 Intel Corporation Audio search conducted through statistical pattern matching
JP4352790B2 (en) * 2002-10-31 2009-10-28 セイコーエプソン株式会社 Acoustic model creation method, speech recognition device, and vehicle having speech recognition device
TWI245259B (en) 2002-12-20 2005-12-11 Ibm Sensor based speech recognizer selection, adaptation and combination
US20040181409A1 (en) * 2003-03-11 2004-09-16 Yifan Gong Speech recognition using model parameters dependent on acoustic environment
JP4033299B2 (en) * 2003-03-12 2008-01-16 株式会社エヌ・ティ・ティ・ドコモ Noise model noise adaptation system, noise adaptation method, and speech recognition noise adaptation program
US7292982B1 (en) * 2003-05-29 2007-11-06 At&T Corp. Active labeling for spoken language understanding
US7516071B2 (en) * 2003-06-30 2009-04-07 International Business Machines Corporation Method of modeling single-enrollment classes in verification and identification tasks
US9240188B2 (en) 2004-09-16 2016-01-19 Lena Foundation System and method for expressive language, developmental disorder, and emotion assessment
US10223934B2 (en) 2004-09-16 2019-03-05 Lena Foundation Systems and methods for expressive language, developmental disorder, and emotion assessment, and contextual feedback
US8938390B2 (en) 2007-01-23 2015-01-20 Lena Foundation System and method for expressive language and developmental disorder assessment
US9355651B2 (en) 2004-09-16 2016-05-31 Lena Foundation System and method for expressive language, developmental disorder, and emotion assessment
US8078465B2 (en) * 2007-01-23 2011-12-13 Lena Foundation System and method for detection and analysis of speech
US7729909B2 (en) * 2005-03-04 2010-06-01 Panasonic Corporation Block-diagonal covariance joint subspace tying and model compensation for noise robust automatic speech recognition
US20060245641A1 (en) * 2005-04-29 2006-11-02 Microsoft Corporation Extracting data from semi-structured information utilizing a discriminative context free grammar
US20070033027A1 (en) * 2005-08-03 2007-02-08 Texas Instruments, Incorporated Systems and methods employing stochastic bias compensation and bayesian joint additive/convolutive compensation in automatic speech recognition
US7640160B2 (en) 2005-08-05 2009-12-29 Voicebox Technologies, Inc. Systems and methods for responding to natural language speech utterance
US7620549B2 (en) 2005-08-10 2009-11-17 Voicebox Technologies, Inc. System and method of supporting adaptive misrecognition in conversational speech
US7949529B2 (en) * 2005-08-29 2011-05-24 Voicebox Technologies, Inc. Mobile systems and methods of supporting natural language human-machine interactions
EP1934971A4 (en) * 2005-08-31 2010-10-27 Voicebox Technologies Inc Dynamic speech sharpening
US7729911B2 (en) * 2005-09-27 2010-06-01 General Motors Llc Speech recognition method and system
US8509563B2 (en) * 2006-02-02 2013-08-13 Microsoft Corporation Generation of documents from images
US7813926B2 (en) * 2006-03-16 2010-10-12 Microsoft Corporation Training system for a speech recognition application
US8073681B2 (en) 2006-10-16 2011-12-06 Voicebox Technologies, Inc. System and method for a cooperative conversational voice user interface
US20080147411A1 (en) * 2006-12-19 2008-06-19 International Business Machines Corporation Adaptation of a speech processing system from external input that is not directly related to sounds in an operational acoustic environment
CA2676380C (en) 2007-01-23 2015-11-24 Infoture, Inc. System and method for detection and analysis of speech
US7818176B2 (en) 2007-02-06 2010-10-19 Voicebox Technologies, Inc. System and method for selecting and presenting advertisements based on natural language processing of voice-based input
US8762143B2 (en) 2007-05-29 2014-06-24 At&T Intellectual Property Ii, L.P. Method and apparatus for identifying acoustic background environments based on time and speed to enhance automatic speech recognition
KR100911429B1 (en) * 2007-08-22 2009-08-11 한국전자통신연구원 Apparatus and Method for generating noise adaptive acoustic model including Discriminative noise adaptive training for environment transfer
US8548791B2 (en) * 2007-08-29 2013-10-01 Microsoft Corporation Validation of the consistency of automatic terminology translation
US8180637B2 (en) * 2007-12-03 2012-05-15 Microsoft Corporation High performance HMM adaptation with joint compensation of additive and convolutive distortions
US8140335B2 (en) 2007-12-11 2012-03-20 Voicebox Technologies, Inc. System and method for providing a natural language voice user interface in an integrated voice navigation services environment
US8468019B2 (en) * 2008-01-31 2013-06-18 Qnx Software Systems Limited Adaptive noise modeling speech recognition system
US8725492B2 (en) * 2008-03-05 2014-05-13 Microsoft Corporation Recognizing multiple semantic items from single utterance
US8589161B2 (en) 2008-05-27 2013-11-19 Voicebox Technologies, Inc. System and method for an integrated, multi-modal, multi-device natural language voice services environment
US9305548B2 (en) 2008-05-27 2016-04-05 Voicebox Technologies Corporation System and method for an integrated, multi-modal, multi-device natural language voice services environment
US8326637B2 (en) 2009-02-20 2012-12-04 Voicebox Technologies, Inc. System and method for processing multi-modal device interactions in a natural language voice services environment
US8386251B2 (en) * 2009-06-08 2013-02-26 Microsoft Corporation Progressive application of knowledge sources in multistage speech recognition
US9026444B2 (en) 2009-09-16 2015-05-05 At&T Intellectual Property I, L.P. System and method for personalization of acoustic models for automatic speech recognition
US9502025B2 (en) 2009-11-10 2016-11-22 Voicebox Technologies Corporation System and method for providing a natural language content dedication service
US9171541B2 (en) 2009-11-10 2015-10-27 Voicebox Technologies Corporation System and method for hybrid processing in a natural language voice services environment
WO2011071484A1 (en) * 2009-12-08 2011-06-16 Nuance Communications, Inc. Guest speaker robust adapted speech recognition
WO2011116514A1 (en) * 2010-03-23 2011-09-29 Nokia Corporation Method and apparatus for determining a user age range
GB2480085B (en) * 2010-05-05 2012-08-08 Toshiba Res Europ Ltd A speech processing system and method
KR20120054845A (en) * 2010-11-22 2012-05-31 삼성전자주식회사 Speech recognition method for robot
US8756062B2 (en) * 2010-12-10 2014-06-17 General Motors Llc Male acoustic model adaptation based on language-independent female speech data
US8630860B1 (en) * 2011-03-03 2014-01-14 Nuance Communications, Inc. Speaker and call characteristic sensitive open voice search
US8738376B1 (en) * 2011-10-28 2014-05-27 Nuance Communications, Inc. Sparse maximum a posteriori (MAP) adaptation
US9263040B2 (en) 2012-01-17 2016-02-16 GM Global Technology Operations LLC Method and system for using sound related vehicle information to enhance speech recognition
US9418674B2 (en) * 2012-01-17 2016-08-16 GM Global Technology Operations LLC Method and system for using vehicle sound information to enhance audio prompting
US9934780B2 (en) 2012-01-17 2018-04-03 GM Global Technology Operations LLC Method and system for using sound related vehicle information to enhance spoken dialogue by modifying dialogue's prompt pitch
US8484025B1 (en) * 2012-10-04 2013-07-09 Google Inc. Mapping an audio utterance to an action using a classifier
US9653070B2 (en) 2012-12-31 2017-05-16 Intel Corporation Flexible architecture for acoustic signal processing engine
US9552825B2 (en) * 2013-04-17 2017-01-24 Honeywell International Inc. Noise cancellation for voice activation
WO2016044290A1 (en) 2014-09-16 2016-03-24 Kennewick Michael R Voice commerce
US9898459B2 (en) 2014-09-16 2018-02-20 Voicebox Technologies Corporation Integration of domain information into state transitions of a finite state transducer for natural language processing
JP6464650B2 (en) * 2014-10-03 2019-02-06 日本電気株式会社 Audio processing apparatus, audio processing method, and program
EP3207467A4 (en) 2014-10-15 2018-05-23 VoiceBox Technologies Corporation System and method for providing follow-up responses to prior natural language inputs of a user
US10431214B2 (en) 2014-11-26 2019-10-01 Voicebox Technologies Corporation System and method of determining a domain and/or an action related to a natural language input
US10614799B2 (en) 2014-11-26 2020-04-07 Voicebox Technologies Corporation System and method of providing intent predictions for an utterance prior to a system detection of an end of the utterance
CN105976827B (en) * 2016-05-26 2019-09-13 南京邮电大学 A kind of indoor sound localization method based on integrated study
US10331784B2 (en) 2016-07-29 2019-06-25 Voicebox Technologies Corporation System and method of disambiguating natural language processing requests
WO2019113477A1 (en) 2017-12-07 2019-06-13 Lena Foundation Systems and methods for automatic determination of infant cry and discrimination of cry from fussiness
US20190251428A1 (en) * 2018-02-09 2019-08-15 Oath Inc. System and method for query to ad matching using deep neural net based query embedding
US11783826B2 (en) * 2021-02-18 2023-10-10 Nuance Communications, Inc. System and method for data augmentation and speech processing in dynamic acoustic environments

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4720802A (en) * 1983-07-26 1988-01-19 Lear Siegler Noise compensation arrangement
US4897878A (en) * 1985-08-26 1990-01-30 Itt Corporation Noise compensation in speech recognition apparatus
GB2216320B (en) * 1988-02-29 1992-08-19 Int Standard Electric Corp Apparatus and methods for the selective addition of noise to templates employed in automatic speech recognition systems
US5761639A (en) * 1989-03-13 1998-06-02 Kabushiki Kaisha Toshiba Method and apparatus for time series signal recognition with signal variation proof learning
US5794194A (en) * 1989-11-28 1998-08-11 Kabushiki Kaisha Toshiba Word spotting in a variable noise level environment
DE4131387A1 (en) * 1991-09-20 1993-03-25 Siemens Ag METHOD FOR RECOGNIZING PATTERNS IN TIME VARIANTS OF MEASURING SIGNALS
DE69322894T2 (en) * 1992-03-02 1999-07-29 At & T Corp Learning method and device for speech recognition
US5473728A (en) * 1993-02-24 1995-12-05 The United States Of America As Represented By The Secretary Of The Navy Training of homoscedastic hidden Markov models for automatic speech recognition
DE4325404C2 (en) * 1993-07-29 2002-04-11 Tenovis Gmbh & Co Kg Procedure for determining and classifying noise types
AU7802194A (en) * 1993-09-30 1995-04-18 Apple Computer, Inc. Continuous reference adaptation in a pattern recognition system
US5572624A (en) * 1994-01-24 1996-11-05 Kurzweil Applied Intelligence, Inc. Speech recognition system accommodating different sources
US5590242A (en) * 1994-03-24 1996-12-31 Lucent Technologies Inc. Signal bias removal for robust telephone speech recognition
US5727124A (en) * 1994-06-21 1998-03-10 Lucent Technologies, Inc. Method of and apparatus for signal recognition that compensates for mismatching
JP2768274B2 (en) * 1994-09-08 1998-06-25 日本電気株式会社 Voice recognition device
JP3652753B2 (en) * 1994-10-28 2005-05-25 三菱電機株式会社 Speech modified speech recognition apparatus and speech recognition method
US5742928A (en) * 1994-10-28 1998-04-21 Mitsubishi Denki Kabushiki Kaisha Apparatus and method for speech recognition in the presence of unnatural speech effects
US5721808A (en) * 1995-03-06 1998-02-24 Nippon Telegraph And Telephone Corporation Method for the composition of noise-resistant hidden markov models for speech recognition and speech recognizer using the same
JP2780676B2 (en) * 1995-06-23 1998-07-30 日本電気株式会社 Voice recognition device and voice recognition method
US5806029A (en) * 1995-09-15 1998-09-08 At&T Corp Signal conditioned minimum error rate training for continuous speech recognition

Also Published As

Publication number Publication date
EP0881625B1 (en) 2005-08-10
DE69831114D1 (en) 2005-09-15
DE69838189D1 (en) 2007-09-13
DE69838189T2 (en) 2008-04-30
EP0881625A2 (en) 1998-12-02
US5960397A (en) 1999-09-28
CA2233728C (en) 2002-10-15
EP1526504B1 (en) 2007-08-01
EP0881625A3 (en) 1999-07-28
DE69831114T2 (en) 2006-05-18
EP1526504A1 (en) 2005-04-27

Similar Documents

Publication Publication Date Title
CA2233728A1 (en) Multiple models integration for multi-environment speech recognition
AU2001275991A1 (en) System and method for voice recognition with a plurality of voice recognition engines
AU4887696A (en) Speech recognition
DE68912397D1 (en) Speech recognition with speaker adaptation through learning process.
EP0651372A3 (en) Automatic speech recognition (ASR) processing using confidence measures.
CA2235364A1 (en) Automated meaningful phrase clustering
WO1999016052A3 (en) Speech recognition system for recognizing continuous and isolated speech
CA2210887A1 (en) Method and apparatus for speech recognition adapted to an individual speaker
EP0655732A3 (en) Soft decision speech recognition.
AU2001288808A1 (en) System and method for automatic voice recognition using mapping
DE69827586D1 (en) Technology for the adaptation of hidden Markov models for speech recognition
AU3751695A (en) Speech recognition
DE69432570D1 (en) voice recognition
EP0566884A3 (en) Context-dependent speech recognizer using estimated next word context.
EP1400523A3 (en) Intermediate compounds
CA2202663A1 (en) Voice-operated services
AU1067900A (en) Network and language models for use in a speech recognition system
CA2204866A1 (en) Signal conditioned minimum error rate training for continuous speech recognition
CA2228948A1 (en) Pattern recognition
GB2236232B (en) Voice information service system utilizing approximately matched input character string and key word,and the method for the approximate matching thereof
DE59607861D1 (en) Speech recognition system
GB2307582A (en) System for recognizing spoken sounds from continuous speech and method of using same
EP0775963A3 (en) Indexing a database by finite-state transducer
AU1694101A (en) A ventilating device for ventilating through a ridge
CA2247512A1 (en) Automatic speech recognition

Legal Events

Date Code Title Description
EEER Examination request
MKEX Expiry
Effective date: 2018-04-03