DE69719236T2 - Method and system for speech recognition using continuous density hidden Markov models

Info

Publication number
DE69719236T2
Authority
DE
Germany
Prior art keywords
models
speech recognition
continuous output
output probabilities
markoff
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
DE69719236T
Other languages
English (en)
Other versions
DE69719236D1 (de)
Inventor
Xuedong D Huang
Milind V Mahajan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Corp
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1996-05-01
Filing date
1997-04-29
Publication date
2003-09-18
Application filed by Microsoft Corp
Publication of DE69719236D1
Application granted
Publication of DE69719236T2
Anticipated expiration
Expired - Lifetime

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/08 - Speech classification or search
    • G10L15/14 - Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G10L15/142 - Hidden Markov Models [HMMs]
    • G10L15/144 - Training of HMMs
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/29 - Graphical models, e.g. Bayesian networks
    • G06F18/295 - Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/08 - Speech classification or search
    • G10L15/18 - Speech classification or search using natural language modelling
    • G10L15/183 - Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/187 - Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
DE69719236T 1996-05-01 1997-04-29 Method and system for speech recognition using continuous density hidden Markov models Expired - Lifetime DE69719236T2 (de)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US08/655,273 US5937384A (en) 1996-05-01 1996-05-01 Method and system for speech recognition using continuous density hidden Markov models

Publications (2)

Publication Number Publication Date
DE69719236D1 DE69719236D1 (de) 2003-04-03
DE69719236T2 DE69719236T2 (de) 2003-09-18

Family

ID=24628243

Family Applications (1)

Application Number Title Priority Date Filing Date
DE69719236T Expired - Lifetime DE69719236T2 (de) 1996-05-01 1997-04-29 Method and system for speech recognition using continuous density hidden Markov models

Country Status (5)

Country Link
US (1) US5937384A (de)
EP (1) EP0805434B1 (de)
JP (1) JP3933750B2 (de)
CN (1) CN1112669C (de)
DE (1) DE69719236T2 (de)

Families Citing this family (89)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6567778B1 (en) * 1995-12-21 2003-05-20 Nuance Communications Natural language speech recognition using slot semantic confidence scores related to their word recognition confidence scores
US6807537B1 (en) * 1997-12-04 2004-10-19 Microsoft Corporation Mixtures of Bayesian networks
US6418431B1 (en) * 1998-03-30 2002-07-09 Microsoft Corporation Information retrieval and speech recognition based on language models
US6574597B1 (en) * 1998-05-08 2003-06-03 At&T Corp. Fully expanded context-dependent networks for speech recognition
ATE263997T1 (de) * 1998-09-29 2004-04-15 Lernout & Hauspie Speechprod Inter-word connection phonemic models
US6571210B2 (en) 1998-11-13 2003-05-27 Microsoft Corporation Confidence measure system using a near-miss pattern
US7082397B2 (en) 1998-12-01 2006-07-25 Nuance Communications, Inc. System for and method of creating and browsing a voice web
US6570964B1 (en) 1999-04-16 2003-05-27 Nuance Communications Technique for recognizing telephone numbers and other spoken information embedded in voice messages stored in a voice messaging system
US7058573B1 (en) 1999-04-20 2006-06-06 Nuance Communications Inc. Speech recognition system to selectively utilize different speech recognition techniques over multiple speech recognition passes
US7181399B1 (en) * 1999-05-19 2007-02-20 At&T Corp. Recognizing the numeric language in natural spoken dialogue
US6539353B1 (en) * 1999-10-12 2003-03-25 Microsoft Corporation Confidence measures using sub-word-dependent weighting of sub-word confidence scores for robust speech recognition
US6529866B1 (en) * 1999-11-24 2003-03-04 The United States Of America As Represented By The Secretary Of The Navy Speech recognition system and associated methods
US6751621B1 (en) * 2000-01-27 2004-06-15 Manning & Napier Information Services, Llc. Construction of trainable semantic vectors and clustering, classification, and searching using trainable semantic vectors
US6633845B1 (en) * 2000-04-07 2003-10-14 Hewlett-Packard Development Company, L.P. Music summarization system and method
US6662158B1 (en) 2000-04-27 2003-12-09 Microsoft Corporation Temporal pattern recognition method and apparatus utilizing segment and frame-based models
US6629073B1 (en) * 2000-04-27 2003-09-30 Microsoft Corporation Speech recognition method and apparatus utilizing multi-unit models
US7912868B2 (en) * 2000-05-02 2011-03-22 Textwise Llc Advertisement placement method and system using semantic analysis
US6865528B1 (en) * 2000-06-01 2005-03-08 Microsoft Corporation Use of a unified language model
US7031908B1 (en) 2000-06-01 2006-04-18 Microsoft Corporation Creating a language model for a language processing system
WO2002001549A1 (en) * 2000-06-15 2002-01-03 Intel Corporation Speaker adaptation using weighted feedback
US6684187B1 (en) 2000-06-30 2004-01-27 At&T Corp. Method and system for preselection of suitable units for concatenative speech
US6505158B1 (en) 2000-07-05 2003-01-07 At&T Corp. Synthesis-based pre-selection of suitable units for concatenative speech
US6728674B1 (en) 2000-07-31 2004-04-27 Intel Corporation Method and system for training of a classifier
US6999926B2 (en) * 2000-11-16 2006-02-14 International Business Machines Corporation Unsupervised incremental adaptation using maximum likelihood spectral transformation
DE60113787T2 (de) * 2000-11-22 2006-08-10 Matsushita Electric Industrial Co., Ltd., Kadoma Method and apparatus for text input by speech recognition
US7587321B2 (en) * 2001-05-08 2009-09-08 Intel Corporation Method, apparatus, and system for building context dependent models for a large vocabulary continuous speech recognition (LVCSR) system
US6928409B2 (en) * 2001-05-31 2005-08-09 Freescale Semiconductor, Inc. Speech recognition using polynomial expansion and hidden markov models
ES2190342B1 (es) * 2001-06-25 2004-11-16 Universitat Pompeu Fabra Method for identifying audio sequences
US7324945B2 (en) * 2001-06-28 2008-01-29 Sri International Method of dynamically altering grammars in a memory efficient speech recognition system
US7711570B2 (en) * 2001-10-21 2010-05-04 Microsoft Corporation Application abstraction with dialog purpose
US8229753B2 (en) * 2001-10-21 2012-07-24 Microsoft Corporation Web server controls for web enabled recognition and/or audible prompting
US6990445B2 (en) * 2001-12-17 2006-01-24 Xl8 Systems, Inc. System and method for speech recognition and transcription
US20030115169A1 (en) * 2001-12-17 2003-06-19 Hongzhuan Ye System and method for management of transcribed documents
US7050975B2 (en) * 2002-07-23 2006-05-23 Microsoft Corporation Method of speech recognition using time-dependent interpolation and hidden dynamic value classes
US7752045B2 (en) 2002-10-07 2010-07-06 Carnegie Mellon University Systems and methods for comparing speech elements
US7200559B2 (en) 2003-05-29 2007-04-03 Microsoft Corporation Semantic object synchronous understanding implemented with speech application language tags
US8301436B2 (en) * 2003-05-29 2012-10-30 Microsoft Corporation Semantic object synchronous understanding for highly interactive interface
US7650282B1 (en) * 2003-07-23 2010-01-19 Nexidia Inc. Word spotting score normalization
US7280967B2 (en) * 2003-07-30 2007-10-09 International Business Machines Corporation Method for detecting misaligned phonetic units for a concatenative text-to-speech voice
US8160883B2 (en) * 2004-01-10 2012-04-17 Microsoft Corporation Focus tracking in dialogs
US7406416B2 (en) 2004-03-26 2008-07-29 Microsoft Corporation Representation of a deleted interpolation N-gram language model in ARPA standard format
US7478038B2 (en) 2004-03-31 2009-01-13 Microsoft Corporation Language model adaptation using semantic supervision
EP1741092B1 (de) * 2004-04-20 2008-06-11 France Télécom Speech recognition by contextual modeling of speech units
TWI276046B (en) * 2005-02-18 2007-03-11 Delta Electronics Inc Distributed language processing system and method of transmitting medium information therefore
US7970613B2 (en) 2005-11-12 2011-06-28 Sony Computer Entertainment Inc. Method and system for Gaussian probability data bit reduction and computation
US7778831B2 (en) 2006-02-21 2010-08-17 Sony Computer Entertainment Inc. Voice recognition with dynamic filter bank adjustment based on speaker categorization determined from runtime pitch
US8010358B2 (en) * 2006-02-21 2011-08-30 Sony Computer Entertainment Inc. Voice recognition with parallel gender and age normalization
KR100845428B1 (ko) * 2006-08-25 2008-07-10 한국전자통신연구원 Speech recognition system for a portable terminal
US20080103772A1 (en) * 2006-10-31 2008-05-01 Duncan Bates Character Prediction System
JP4322934B2 (ja) * 2007-03-28 2009-09-02 株式会社東芝 Speech recognition apparatus, method, and program
US9129599B2 (en) * 2007-10-18 2015-09-08 Nuance Communications, Inc. Automated tuning of speech recognition parameters
US8352265B1 (en) 2007-12-24 2013-01-08 Edward Lin Hardware implemented backend search engine for a high-rate speech recognition system
US8639510B1 (en) * 2007-12-24 2014-01-28 Kai Yu Acoustic scoring unit implemented on a single FPGA or ASIC
US8463610B1 (en) 2008-01-18 2013-06-11 Patrick J. Bourke Hardware-implemented scalable modular engine for low-power speech recognition
US20100057452A1 (en) * 2008-08-28 2010-03-04 Microsoft Corporation Speech interfaces
US9484019B2 (en) * 2008-11-19 2016-11-01 At&T Intellectual Property I, L.P. System and method for discriminative pronunciation modeling for voice search
US8442833B2 (en) * 2009-02-17 2013-05-14 Sony Computer Entertainment Inc. Speech processing with source location estimation using signals from two or more microphones
US8442829B2 (en) * 2009-02-17 2013-05-14 Sony Computer Entertainment Inc. Automatic computation streaming partition for voice recognition on multiple processors with limited memory
US8788256B2 (en) * 2009-02-17 2014-07-22 Sony Computer Entertainment Inc. Multiple language voice recognition
EP2238899B1 (de) * 2009-04-06 2016-10-05 GN Resound A/S Efficient evaluation of hearing ability
US8606578B2 (en) * 2009-06-25 2013-12-10 Intel Corporation Method and apparatus for improving memory locality for real-time speech recognition
JP2012108748A (ja) * 2010-11-18 2012-06-07 Sony Corp Data processing device, data processing method, and program
US8688453B1 (en) * 2011-02-28 2014-04-01 Nuance Communications, Inc. Intent mining via analysis of utterances
CN102129860B (zh) * 2011-04-07 2012-07-04 南京邮电大学 Text-dependent speaker recognition method based on an infinite-state hidden Markov model
US8972260B2 (en) * 2011-04-20 2015-03-03 Robert Bosch Gmbh Speech recognition using multiple language models
WO2013003772A2 (en) * 2011-06-30 2013-01-03 Google Inc. Speech recognition using variable-length context
US10339214B2 (en) * 2011-11-04 2019-07-02 International Business Machines Corporation Structured term recognition
US9785613B2 (en) * 2011-12-19 2017-10-10 Cypress Semiconductor Corporation Acoustic processing unit interface for determining senone scores using a greater clock frequency than that corresponding to received audio
US9153235B2 (en) 2012-04-09 2015-10-06 Sony Computer Entertainment Inc. Text dependent speaker recognition with long-term feature based on functional data analysis
US9224384B2 (en) * 2012-06-06 2015-12-29 Cypress Semiconductor Corporation Histogram based pre-pruning scheme for active HMMS
US9514739B2 (en) * 2012-06-06 2016-12-06 Cypress Semiconductor Corporation Phoneme score accelerator
US9508045B2 (en) * 2012-08-17 2016-11-29 Raytheon Company Continuous-time baum-welch training
US9336771B2 (en) * 2012-11-01 2016-05-10 Google Inc. Speech recognition using non-parametric models
US9240184B1 (en) 2012-11-15 2016-01-19 Google Inc. Frame-level combination of deep neural network and gaussian mixture models
KR101905827B1 (ko) * 2013-06-26 2018-10-08 한국전자통신연구원 Continuous speech recognition apparatus and method
US9711148B1 (en) * 2013-07-18 2017-07-18 Google Inc. Dual model speaker identification
GB2523353B (en) * 2014-02-21 2017-03-01 Jaguar Land Rover Ltd System for use in a vehicle
US10014007B2 (en) 2014-05-28 2018-07-03 Interactive Intelligence, Inc. Method for forming the excitation signal for a glottal pulse model based parametric speech synthesis system
US10255903B2 (en) 2014-05-28 2019-04-09 Interactive Intelligence Group, Inc. Method for forming the excitation signal for a glottal pulse model based parametric speech synthesis system
US9858922B2 (en) 2014-06-23 2018-01-02 Google Inc. Caching speech recognition scores
US9299347B1 (en) 2014-10-22 2016-03-29 Google Inc. Speech recognition using associative mapping
CA3004700C (en) * 2015-10-06 2021-03-23 Interactive Intelligence Group, Inc. Method for forming the excitation signal for a glottal pulse model based parametric speech synthesis system
US10229672B1 (en) 2015-12-31 2019-03-12 Google Llc Training acoustic models using connectionist temporal classification
KR102434604B1 (ko) * 2016-01-05 2022-08-23 한국전자통신연구원 Speech recognition terminal, speech recognition server, and speech recognition method for performing personalized speech recognition
US10665243B1 (en) * 2016-11-11 2020-05-26 Facebook Technologies, Llc Subvocalized speech recognition
US10706840B2 (en) 2017-08-18 2020-07-07 Google Llc Encoder-decoder models for sequence to sequence mapping
US11211065B2 (en) * 2018-02-02 2021-12-28 Genesys Telecommunications Laboratories, Inc. System and method for automatic filtering of test utterance mismatches in automatic speech recognition systems
US11783818B2 (en) * 2020-05-06 2023-10-10 Cypress Semiconductor Corporation Two stage user customizable wake word detection
CN116108391B (zh) * 2023-04-12 2023-06-30 江西珉轩智能科技有限公司 Human posture classification and recognition system based on unsupervised learning

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4587670A (en) * 1982-10-15 1986-05-06 At&T Bell Laboratories Hidden Markov model speech recognition arrangement
US4783803A (en) * 1985-11-12 1988-11-08 Dragon Systems, Inc. Speech recognition apparatus and method
JPS62231993A (ja) * 1986-03-25 1987-10-12 International Business Machines Corporation Speech recognition method
US4866778A (en) * 1986-08-11 1989-09-12 Dragon Systems, Inc. Interactive speech recognition apparatus
US4817156A (en) * 1987-08-10 1989-03-28 International Business Machines Corporation Rapidly training a speech recognizer to a subsequent speaker given training data of a reference speaker
US5027406A (en) * 1988-12-06 1991-06-25 Dragon Systems, Inc. Method for interactive speech recognition and training
US5268990A (en) * 1991-01-31 1993-12-07 Sri International Method for recognizing speech using linguistically-motivated hidden Markov models
US5267345A (en) * 1992-02-10 1993-11-30 International Business Machines Corporation Speech recognition apparatus which predicts word classes from context and words from word classes
US5293584A (en) * 1992-05-21 1994-03-08 International Business Machines Corporation Speech recognition system for natural language translation
US5333236A (en) * 1992-09-10 1994-07-26 International Business Machines Corporation Speech recognizer having a speech coder for an acoustic match based on context-dependent speech-transition acoustic models
EP0602296A1 (de) * 1992-12-17 1994-06-22 International Business Machines Corporation Adaptive method for generating domain-dependent models for intelligent systems
US5627939A (en) * 1993-09-03 1997-05-06 Microsoft Corporation Speech recognition system and method employing data compression
US5621859A (en) * 1994-01-19 1997-04-15 Bbn Corporation Single tree method for grammar directed, very large vocabulary speech recognizer
US5642519A (en) * 1994-04-29 1997-06-24 Sun Microsystems, Inc. Speech interpreter with a unified grammer compiler
JP3581401B2 (ja) * 1994-10-07 2004-10-27 キヤノン株式会社 Speech recognition method
US5710866A (en) * 1995-05-26 1998-01-20 Microsoft Corporation System and method for speech recognition using dynamically adjusted confidence measure

Also Published As

Publication number Publication date
EP0805434B1 (de) 2003-02-26
DE69719236D1 (de) 2003-04-03
EP0805434A2 (de) 1997-11-05
EP0805434A3 (de) 1998-08-26
US5937384A (en) 1999-08-10
JPH1063291A (ja) 1998-03-06
CN1171592A (zh) 1998-01-28
CN1112669C (zh) 2003-06-25
JP3933750B2 (ja) 2007-06-20

Similar Documents

Publication Publication Date Title
DE69719236D1 (de) Method and system for speech recognition using continuous density hidden Markov models
DE59809609D1 (de) Method for speech recognition with language model adaptation
DE59801560D1 (de) Method for speech recognition with language model adaptation
DE69725106D1 (de) Method and apparatus for speech recognition with noise adaptation
DE69625950T2 (de) Method and apparatus for speech recognition, and translation system
DE69717899T2 (de) Method and apparatus for speech recognition
DE69726235D1 (de) Method and apparatus for speech recognition
DE69518705D1 (de) Method and apparatus for speech recognition
DE69524829D1 (de) Method and apparatus for speech recognition
DE59707384D1 (de) Method and apparatus for speech recognition
DE69828141D1 (de) Method and apparatus for speech recognition
DE69806557T2 (de) Method and apparatus for speech recognition
DE69631728D1 (de) Method and apparatus for speech coding
DE69324428D1 (de) Method for speech shaping and apparatus for speech recognition
DE69727895D1 (de) Method and apparatus for speech coding
DE69613646D1 (de) Method for speech detection in the presence of strong ambient noise
DE69220825T2 (de) Method and system for speech recognition
DE69830017D1 (de) Method and apparatus for speech recognition
DE69607913T2 (de) Method and apparatus for speech recognition based on new word models
DE69028842D1 (de) Method and device for word modeling using composite Markov models
DE69614937D1 (de) Method and system for speech recognition with reduced recognition time, taking changes in background noise into account
DE69618408D1 (de) Method and apparatus for speech coding
DE69613644D1 (de) Method for generating a language model, and speech recognition apparatus
DE69937854D1 (de) Method and apparatus for speech recognition using phonetic transcriptions
DE69517829T2 (de) Apparatus and method for speech recognition

Legal Events

Date Code Title Description
8364 No opposition during term of opposition