DE69719236T2 - Method and system for speech recognition using hidden Markov models with continuous output probabilities

Info

Publication number: DE69719236T2
Authority: DE; Germany
Prior art keywords: models; speech recognition; continuous output; output probabilities; markoff
Prior art date: 1996-05-01
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.): Expired - Lifetime

Application number

DE69719236T

Other languages

English (en)

Other versions

DE69719236D1 (de)

Inventor

Xuedong D Huang

Milind V Mahajan

Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)

Microsoft Corp

Original Assignee

Microsoft Corp

Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)

1996-05-01

Filing date

1997-04-29

Publication date

2003-09-18

1997-04-29 Application filed by Microsoft Corp

2003-04-03 Publication of DE69719236D1

2003-09-18 Application granted

2003-09-18 Publication of DE69719236T2

2017-04-30 Anticipated expiration

Status: Expired - Lifetime

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
- G10L15/144—Training of HMMs
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
- G06F18/295—Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/187—Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams

DE69719236T 1996-05-01 1997-04-29 Method and system for speech recognition using hidden Markov models with continuous output probabilities Expired - Lifetime DE69719236T2 (de)

Applications Claiming Priority (1)

Application Number	Priority Date	Filing Date	Title
US08/655,273 US5937384A (en)	1996-05-01	1996-05-01	Method and system for speech recognition using continuous density hidden Markov models

Publications (2)

Publication Number	Publication Date
DE69719236D1 (de)	2003-04-03
DE69719236T2 (de)	2003-09-18

Family

ID=24628243

Family Applications (1)

Application Number	Priority Date	Filing Date	Title
DE69719236T Expired - Lifetime DE69719236T2 (de)	1996-05-01	1997-04-29	Method and system for speech recognition using hidden Markov models with continuous output probabilities

Country Status (5)

Country	Link
US (1)	US5937384A (de)
EP (1)	EP0805434B1 (de)
JP (1)	JP3933750B2 (de)
CN (1)	CN1112669C (de)
DE (1)	DE69719236T2 (de)

Families Citing this family (89)

* Cited by examiner, † Cited by third party
Publication number	Priority date	Publication date	Assignee	Title
US6567778B1 (en) *	1995-12-21	2003-05-20	Nuance Communications	Natural language speech recognition using slot semantic confidence scores related to their word recognition confidence scores
US6807537B1 (en) *	1997-12-04	2004-10-19	Microsoft Corporation	Mixtures of Bayesian networks
US6418431B1 (en) *	1998-03-30	2002-07-09	Microsoft Corporation	Information retrieval and speech recognition based on language models
US6574597B1 (en) *	1998-05-08	2003-06-03	At&T Corp.	Fully expanded context-dependent networks for speech recognition
ATE263997T1 (de) *	1998-09-29	2004-04-15	Lernout & Hauspie Speechprod	Inter-word connection phonemic models
US6571210B2 (en)	1998-11-13	2003-05-27	Microsoft Corporation	Confidence measure system using a near-miss pattern
US7082397B2 (en)	1998-12-01	2006-07-25	Nuance Communications, Inc.	System for and method of creating and browsing a voice web
US6570964B1 (en)	1999-04-16	2003-05-27	Nuance Communications	Technique for recognizing telephone numbers and other spoken information embedded in voice messages stored in a voice messaging system
US7058573B1 (en)	1999-04-20	2006-06-06	Nuance Communications Inc.	Speech recognition system to selectively utilize different speech recognition techniques over multiple speech recognition passes
US7181399B1 (en) *	1999-05-19	2007-02-20	At&T Corp.	Recognizing the numeric language in natural spoken dialogue
US6539353B1 (en) *	1999-10-12	2003-03-25	Microsoft Corporation	Confidence measures using sub-word-dependent weighting of sub-word confidence scores for robust speech recognition
US6529866B1 (en) *	1999-11-24	2003-03-04	The United States Of America As Represented By The Secretary Of The Navy	Speech recognition system and associated methods
US6751621B1 (en) *	2000-01-27	2004-06-15	Manning & Napier Information Services, Llc.	Construction of trainable semantic vectors and clustering, classification, and searching using trainable semantic vectors
US6633845B1 (en) *	2000-04-07	2003-10-14	Hewlett-Packard Development Company, L.P.	Music summarization system and method
US6662158B1 (en)	2000-04-27	2003-12-09	Microsoft Corporation	Temporal pattern recognition method and apparatus utilizing segment and frame-based models
US6629073B1 (en) *	2000-04-27	2003-09-30	Microsoft Corporation	Speech recognition method and apparatus utilizing multi-unit models
US7912868B2 (en) *	2000-05-02	2011-03-22	Textwise Llc	Advertisement placement method and system using semantic analysis
US6865528B1 (en) *	2000-06-01	2005-03-08	Microsoft Corporation	Use of a unified language model
US7031908B1 (en)	2000-06-01	2006-04-18	Microsoft Corporation	Creating a language model for a language processing system
WO2002001549A1 (en) *	2000-06-15	2002-01-03	Intel Corporation	Speaker adaptation using weighted feedback
US6684187B1 (en)	2000-06-30	2004-01-27	At&T Corp.	Method and system for preselection of suitable units for concatenative speech
US6505158B1 (en)	2000-07-05	2003-01-07	At&T Corp.	Synthesis-based pre-selection of suitable units for concatenative speech
US6728674B1 (en)	2000-07-31	2004-04-27	Intel Corporation	Method and system for training of a classifier
US6999926B2 (en) *	2000-11-16	2006-02-14	International Business Machines Corporation	Unsupervised incremental adaptation using maximum likelihood spectral transformation
DE60113787T2 (de) *	2000-11-22	2006-08-10	Matsushita Electric Industrial Co., Ltd., Kadoma	Method and apparatus for text input by speech recognition
US7587321B2 (en) *	2001-05-08	2009-09-08	Intel Corporation	Method, apparatus, and system for building context dependent models for a large vocabulary continuous speech recognition (LVCSR) system
US6928409B2 (en) *	2001-05-31	2005-08-09	Freescale Semiconductor, Inc.	Speech recognition using polynomial expansion and hidden markov models
ES2190342B1 (es) *	2001-06-25	2004-11-16	Universitat Pompeu Fabra	Method for identifying audio sequences
US7324945B2 (en) *	2001-06-28	2008-01-29	Sri International	Method of dynamically altering grammars in a memory efficient speech recognition system
US7711570B2 (en) *	2001-10-21	2010-05-04	Microsoft Corporation	Application abstraction with dialog purpose
US8229753B2 (en) *	2001-10-21	2012-07-24	Microsoft Corporation	Web server controls for web enabled recognition and/or audible prompting
US6990445B2 (en) *	2001-12-17	2006-01-24	Xl8 Systems, Inc.	System and method for speech recognition and transcription
US20030115169A1 (en) *	2001-12-17	2003-06-19	Hongzhuan Ye	System and method for management of transcribed documents
US7050975B2 (en) *	2002-07-23	2006-05-23	Microsoft Corporation	Method of speech recognition using time-dependent interpolation and hidden dynamic value classes
US7752045B2 (en)	2002-10-07	2010-07-06	Carnegie Mellon University	Systems and methods for comparing speech elements
US7200559B2 (en)	2003-05-29	2007-04-03	Microsoft Corporation	Semantic object synchronous understanding implemented with speech application language tags
US8301436B2 (en) *	2003-05-29	2012-10-30	Microsoft Corporation	Semantic object synchronous understanding for highly interactive interface
US7650282B1 (en) *	2003-07-23	2010-01-19	Nexidia Inc.	Word spotting score normalization
US7280967B2 (en) *	2003-07-30	2007-10-09	International Business Machines Corporation	Method for detecting misaligned phonetic units for a concatenative text-to-speech voice
US8160883B2 (en) *	2004-01-10	2012-04-17	Microsoft Corporation	Focus tracking in dialogs
US7406416B2 (en)	2004-03-26	2008-07-29	Microsoft Corporation	Representation of a deleted interpolation N-gram language model in ARPA standard format
US7478038B2 (en)	2004-03-31	2009-01-13	Microsoft Corporation	Language model adaptation using semantic supervision
EP1741092B1 (de) *	2004-04-20	2008-06-11	France Télécom	Speech recognition by contextual modelling of speech units
TWI276046B (en) *	2005-02-18	2007-03-11	Delta Electronics Inc	Distributed language processing system and method of transmitting medium information therefore
US7970613B2 (en)	2005-11-12	2011-06-28	Sony Computer Entertainment Inc.	Method and system for Gaussian probability data bit reduction and computation
US7778831B2 (en)	2006-02-21	2010-08-17	Sony Computer Entertainment Inc.	Voice recognition with dynamic filter bank adjustment based on speaker categorization determined from runtime pitch
US8010358B2 (en) *	2006-02-21	2011-08-30	Sony Computer Entertainment Inc.	Voice recognition with parallel gender and age normalization
KR100845428B1 (ko) *	2006-08-25	2008-07-10	한국전자통신연구원	Speech recognition system for a portable terminal
US20080103772A1 (en) *	2006-10-31	2008-05-01	Duncan Bates	Character Prediction System
JP4322934B2 (ja) *	2007-03-28	2009-09-02	株式会社東芝	Speech recognition apparatus, method, and program
US9129599B2 (en) *	2007-10-18	2015-09-08	Nuance Communications, Inc.	Automated tuning of speech recognition parameters
US8352265B1 (en)	2007-12-24	2013-01-08	Edward Lin	Hardware implemented backend search engine for a high-rate speech recognition system
US8639510B1 (en) *	2007-12-24	2014-01-28	Kai Yu	Acoustic scoring unit implemented on a single FPGA or ASIC
US8463610B1 (en)	2008-01-18	2013-06-11	Patrick J. Bourke	Hardware-implemented scalable modular engine for low-power speech recognition
US20100057452A1 (en) *	2008-08-28	2010-03-04	Microsoft Corporation	Speech interfaces
US9484019B2 (en) *	2008-11-19	2016-11-01	At&T Intellectual Property I, L.P.	System and method for discriminative pronunciation modeling for voice search
US8442833B2 (en) *	2009-02-17	2013-05-14	Sony Computer Entertainment Inc.	Speech processing with source location estimation using signals from two or more microphones
US8442829B2 (en) *	2009-02-17	2013-05-14	Sony Computer Entertainment Inc.	Automatic computation streaming partition for voice recognition on multiple processors with limited memory
US8788256B2 (en) *	2009-02-17	2014-07-22	Sony Computer Entertainment Inc.	Multiple language voice recognition
EP2238899B1 (de) *	2009-04-06	2016-10-05	GN Resound A/S	Efficient evaluation of hearing ability
US8606578B2 (en) *	2009-06-25	2013-12-10	Intel Corporation	Method and apparatus for improving memory locality for real-time speech recognition
JP2012108748A (ja) *	2010-11-18	2012-06-07	Sony Corp	Data processing device, data processing method, and program
US8688453B1 (en) *	2011-02-28	2014-04-01	Nuance Communications, Inc.	Intent mining via analysis of utterances
CN102129860B (zh) *	2011-04-07	2012-07-04	南京邮电大学	Text-dependent speaker recognition method based on an infinite-state hidden Markov model
US8972260B2 (en) *	2011-04-20	2015-03-03	Robert Bosch Gmbh	Speech recognition using multiple language models
WO2013003772A2 (en) *	2011-06-30	2013-01-03	Google Inc.	Speech recognition using variable-length context
US10339214B2 (en) *	2011-11-04	2019-07-02	International Business Machines Corporation	Structured term recognition
US9785613B2 (en) *	2011-12-19	2017-10-10	Cypress Semiconductor Corporation	Acoustic processing unit interface for determining senone scores using a greater clock frequency than that corresponding to received audio
US9153235B2 (en)	2012-04-09	2015-10-06	Sony Computer Entertainment Inc.	Text dependent speaker recognition with long-term feature based on functional data analysis
US9224384B2 (en) *	2012-06-06	2015-12-29	Cypress Semiconductor Corporation	Histogram based pre-pruning scheme for active HMMS
US9514739B2 (en) *	2012-06-06	2016-12-06	Cypress Semiconductor Corporation	Phoneme score accelerator
US9508045B2 (en) *	2012-08-17	2016-11-29	Raytheon Company	Continuous-time baum-welch training
US9336771B2 (en) *	2012-11-01	2016-05-10	Google Inc.	Speech recognition using non-parametric models
US9240184B1 (en)	2012-11-15	2016-01-19	Google Inc.	Frame-level combination of deep neural network and gaussian mixture models
KR101905827B1 (ko) *	2013-06-26	2018-10-08	한국전자통신연구원	Apparatus and method for continuous speech recognition
US9711148B1 (en) *	2013-07-18	2017-07-18	Google Inc.	Dual model speaker identification
GB2523353B (en) *	2014-02-21	2017-03-01	Jaguar Land Rover Ltd	System for use in a vehicle
US10014007B2 (en)	2014-05-28	2018-07-03	Interactive Intelligence, Inc.	Method for forming the excitation signal for a glottal pulse model based parametric speech synthesis system
US10255903B2 (en)	2014-05-28	2019-04-09	Interactive Intelligence Group, Inc.	Method for forming the excitation signal for a glottal pulse model based parametric speech synthesis system
US9858922B2 (en)	2014-06-23	2018-01-02	Google Inc.	Caching speech recognition scores
US9299347B1 (en)	2014-10-22	2016-03-29	Google Inc.	Speech recognition using associative mapping
CA3004700C (en) *	2015-10-06	2021-03-23	Interactive Intelligence Group, Inc.	Method for forming the excitation signal for a glottal pulse model based parametric speech synthesis system
US10229672B1 (en)	2015-12-31	2019-03-12	Google Llc	Training acoustic models using connectionist temporal classification
KR102434604B1 (ko) *	2016-01-05	2022-08-23	한국전자통신연구원	Speech recognition terminal, speech recognition server, and speech recognition method for performing personalized speech recognition
US10665243B1 (en) *	2016-11-11	2020-05-26	Facebook Technologies, Llc	Subvocalized speech recognition
US10706840B2 (en)	2017-08-18	2020-07-07	Google Llc	Encoder-decoder models for sequence to sequence mapping
US11211065B2 (en) *	2018-02-02	2021-12-28	Genesys Telecommunications Laboratories, Inc.	System and method for automatic filtering of test utterance mismatches in automatic speech recognition systems
US11783818B2 (en) *	2020-05-06	2023-10-10	Cypress Semiconductor Corporation	Two stage user customizable wake word detection
CN116108391B (zh) *	2023-04-12	2023-06-30	江西珉轩智能科技有限公司	Human body posture classification and recognition system based on unsupervised learning

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number	Priority date	Publication date	Assignee	Title
US4587670A (en) *	1982-10-15	1986-05-06	At&T Bell Laboratories	Hidden Markov model speech recognition arrangement
US4783803A (en) *	1985-11-12	1988-11-08	Dragon Systems, Inc.	Speech recognition apparatus and method
JPS62231993A (ja) *	1986-03-25	1987-10-12	インタ−ナシヨナル　ビジネス　マシ−ンズ　コ−ポレ−シヨン	Speech recognition method
US4866778A (en) *	1986-08-11	1989-09-12	Dragon Systems, Inc.	Interactive speech recognition apparatus
US4817156A (en) *	1987-08-10	1989-03-28	International Business Machines Corporation	Rapidly training a speech recognizer to a subsequent speaker given training data of a reference speaker
US5027406A (en) *	1988-12-06	1991-06-25	Dragon Systems, Inc.	Method for interactive speech recognition and training
US5268990A (en) *	1991-01-31	1993-12-07	Sri International	Method for recognizing speech using linguistically-motivated hidden Markov models
US5267345A (en) *	1992-02-10	1993-11-30	International Business Machines Corporation	Speech recognition apparatus which predicts word classes from context and words from word classes
US5293584A (en) *	1992-05-21	1994-03-08	International Business Machines Corporation	Speech recognition system for natural language translation
US5333236A (en) *	1992-09-10	1994-07-26	International Business Machines Corporation	Speech recognizer having a speech coder for an acoustic match based on context-dependent speech-transition acoustic models
EP0602296A1 (de) *	1992-12-17	1994-06-22	International Business Machines Corporation	Adaptive method for generating domain-dependent models for intelligent systems
US5627939A (en) *	1993-09-03	1997-05-06	Microsoft Corporation	Speech recognition system and method employing data compression
US5621859A (en) *	1994-01-19	1997-04-15	Bbn Corporation	Single tree method for grammar directed, very large vocabulary speech recognizer
US5642519A (en) *	1994-04-29	1997-06-24	Sun Microsystems, Inc.	Speech interpreter with a unified grammer compiler
JP3581401B2 (ja) *	1994-10-07	2004-10-27	キヤノン株式会社	Speech recognition method
US5710866A (en) *	1995-05-26	1998-01-20	Microsoft Corporation	System and method for speech recognition using dynamically adjusted confidence measure

1996
- 1996-05-01 US: application US08/655,273, patent US5937384A (en), Expired - Lifetime
1997
- 1997-04-29 EP: application EP97107116A, patent EP0805434B1 (de), Expired - Lifetime
- 1997-04-29 DE: application DE69719236T, patent DE69719236T2 (de), Expired - Lifetime
- 1997-04-30 CN: application CN97114917A, patent CN1112669C (zh), Expired - Lifetime
- 1997-05-01 JP: application JP14838597A, patent JP3933750B2 (ja), Expired - Lifetime

Also Published As

Publication number	Publication date
EP0805434B1 (de)	2003-02-26
DE69719236D1 (de)	2003-04-03
EP0805434A2 (de)	1997-11-05
EP0805434A3 (de)	1998-08-26
US5937384A (en)	1999-08-10
JPH1063291A (ja)	1998-03-06
CN1171592A (zh)	1998-01-28
CN1112669C (zh)	2003-06-25
JP3933750B2 (ja)	2007-06-20

Legal Events

Date	Code	Title	Description
2004-03-25	8364	No opposition during term of opposition

Publication	Publication Date	Title
DE69719236D1 (de)	2003-04-03	Method and system for speech recognition using hidden Markov models with continuous output probabilities
DE59809609D1 (de)	2003-10-23	Method for speech recognition with language model adaptation
DE59801560D1 (de)	2001-10-31	Method for speech recognition with language model adaptation
DE69725106D1 (de)	2003-10-30	Method and apparatus for speech recognition with noise adaptation
DE69625950T2 (de)	2003-12-24	Method and apparatus for speech recognition, and translation system
DE69717899T2 (de)	2003-08-21	Method and apparatus for speech recognition
DE69726235D1 (de)	2003-12-24	Method and apparatus for speech recognition
DE69518705D1 (de)	2000-10-12	Method and apparatus for speech recognition
DE69524829D1 (de)	2002-02-07	Method and apparatus for speech recognition
DE59707384D1 (de)	2002-07-11	Method and apparatus for speech recognition
DE69828141D1 (de)	2005-01-20	Method and apparatus for speech recognition
DE69806557T2 (de)	2003-01-30	Method and apparatus for speech recognition
DE69631728D1 (de)	2004-04-08	Method and apparatus for speech coding
DE69324428D1 (de)	1999-05-20	Method for speech shaping and device for speech recognition
DE69727895D1 (de)	2004-04-08	Method and apparatus for speech coding
DE69613646D1 (de)	2001-08-09	Method for speech detection in high ambient noise
DE69220825T2 (de)	1998-02-19	Method and system for speech recognition
DE69830017D1 (de)	2005-06-09	Method and apparatus for speech recognition
DE69607913T2 (de)	2000-10-05	Method and apparatus for speech recognition based on new word models
DE69028842D1 (de)	1996-11-14	Method and device for word modelling using composite Markov models
DE69614937D1 (de)	2001-10-11	Method and system for speech recognition with reduced recognition time, taking into account changes in background noise
DE69618408D1 (de)	2002-02-14	Method and apparatus for speech coding
DE69613644D1 (de)	2001-08-09	Method for generating a language model, and speech recognition apparatus
DE69937854D1 (de)	2008-02-14	Method and apparatus for speech recognition using phonetic transcriptions
DE69517829T2 (de)	2001-03-08	Apparatus and method for speech recognition