
Available online at www.sciencedirect.com

ScienceDirect

Procedia Computer Science 128 (2018) 32–37

International Conference on Natural Language and Speech Processing, ICNLSP 2015

Automatic Speech Recognition Errors Detection and Correction: A Review

Rahhal Errattahi*, Asmaa El Hannani, Hassan Ouahmane

Laboratory of Information Technologies, National School of Applied Sciences, University of Chouaib Doukkali, El Jadida, Morocco

Abstract

Even though Automatic Speech Recognition (ASR) has matured to the point of commercial applications, high error rates in some speech recognition domains remain one of the main impediments to the wide adoption of speech technology, especially for continuous large vocabulary speech recognition applications. The persistent presence of ASR errors has intensified the need to find alternative techniques to automatically detect and correct such errors. The correction of transcription errors is crucial not only to improve speech recognition accuracy, but also to avoid the propagation of errors to subsequent language processing modules such as machine translation. In this paper, the basic principles of ASR evaluation are first summarized, and then the state of current ASR error detection and correction research is reviewed. We focus on emerging techniques using the word error rate metric.

Keywords: Automatic Speech Recognition; ASR Error Detection; ASR Error Correction; ASR Evaluation

© 2018 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/). Selection and peer-review under responsibility of the scientific committee of the International Conference on Natural Language and Speech Processing. doi: 10.1016/j.procs.2018.03.005

* Corresponding author. Tel.: +212-523-344-822; fax: +212-523-394-915. E-mail address: errattahi.r@ucd.ac.ma

1. Introduction

Automatic Speech Recognition (ASR) systems aim at converting a speech signal into a sequence of words, either for text-based communication purposes or for device control. The purpose of evaluating ASR systems is to simulate human judgement of the performance of the systems, in order to measure their usefulness and assess the remaining difficulties, especially when comparing systems. The standard metric of ASR evaluation is the Word Error Rate, which is defined as the proportion of word errors to words processed. ASR has matured to the point of commercial applications by providing transcription with an acceptable level of performance, which allows integration into many applications. In general, ASR systems are effective when the conditions are well controlled. Nevertheless, they are strongly dependent on the task being performed, and the results are far from ideal, especially for Large Vocabulary Continuous Speech Recognition (LVCSR) applications. The latter is still one of the most challenging tasks in the field, due to a number of factors, including poor articulation, variable …

A key practical issue in the calculation of ASR evaluation metrics is finding the word alignment between the reference and the automatic transcription, which constitutes the first step in the evaluation procedure. In other words, the reference and recognised words are matched in order to decide which words have been deleted or inserted, and which reference–recognised string pairs have been aligned to each other, where an aligned pair may result in a hit or a substitution. This is normally done by using the Viterbi Edit Distance [17] to efficiently select the alignment of the reference and recognised word sequences for which the weighted error score is minimized. The Edit Distance usually assigns identical weights (1 for the Levenshtein distance) to all three operations: insertion, substitution and deletion. Yet, uniform weights leave the choice of the best alignment path ambiguous when several paths have the same score. To avoid this problem, Morris et al. [12] suggest using different weights, such that substitution is favoured over insertion and deletion. In general, it is recommended to set WI = WD and WS < WI + WD, where WI, WS and WD are respectively the weights of insertion, substitution, and deletion.

2.3. ASR Evaluation Metrics

According to McCowan et al. [11], an ideal ASR evaluation metric should be: (i) direct, measuring the ASR component independently of the ASR application; (ii) objective, so that the measure can be calculated in an automated manner; (iii) interpretable, so that the absolute value of the measure gives an idea of the performance; and (iv) modular, so that the evaluation measure is general enough to allow thorough application-dependent analysis.

Word Error Rate (WER) is the most popular metric for ASR evaluation; it measures the percentage of incorrect words (substitutions (S), insertions (I), deletions (D)) relative to the total number of words processed. It is defined as

    WER = (S + D + I) / N1 = (S + D + I) / (H + S + D)    (1)

where I = total number of insertions, D = total number of deletions, S = total number of substitutions, H = total number of hits, and N1 = total number of input words. Despite being the most commonly used metric, WER has many shortcomings [10]. First of all, WER is not a true percentage because it has no upper bound, so it does not tell you how good a system is, but only that one system is better than another. Moreover, WER is not D/I symmetric, so in noisy conditions WER can exceed 100%, owing to the fact that it gives far more weight to insertions than to deletions. WER is still effective for speech recognition applications where errors can be corrected by typing, such as dictation. However, for almost any other type of speech recognition system, where the goal is more than transcription, it is necessary to look for an alternative, or additional, evaluation framework.

Many researchers have proposed alternative measures to address the evident limitations of WER. In [12], Morris et al. introduced two information-theoretic measures of word information communicated. The first one, named Relative Information Lost (RIL), is based on Mutual Information (I, or MI) [7], which measures the statistical dependence between the input words X and the output words Y, and is calculated using the Shannon entropy H as follows:

    RIL = H(Y|X) / H(Y)    (2)

with

    H(Y) = − Σ_{i=1}^{n} P(y_i) log P(y_i)    (3)

and

    H(Y|X) = − Σ_{i,j} P(x_i, y_j) log P(y_j | x_i)    (4)

Nevertheless, the RIL is still too far from an ad…
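The weighted alignment just described can be sketched in Python. This is an illustration, not code from the paper: the weight values 1.0 and 1.5 are arbitrary choices that satisfy the recommendation WI = WD and WS < WI + WD, and the dynamic program simply tracks the hit/substitution/deletion/insertion counts of the minimum-weight path.

```python
# Weighted edit-distance alignment between a reference and a hypothesis
# word sequence. Weights are illustrative: W_I = W_D and W_S < W_I + W_D.
W_INS = 1.0   # insertion weight
W_DEL = 1.0   # deletion weight
W_SUB = 1.5   # substitution weight, favoured over one insertion plus one deletion

def align(ref, hyp):
    """Return (hits, substitutions, deletions, insertions) for the
    minimum-weight alignment of two word lists."""
    n, m = len(ref), len(hyp)
    # dp[i][j] = (cost, hits, subs, dels, inss) for ref[:i] vs hyp[:j]
    dp = [[None] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = (0.0, 0, 0, 0, 0)
    for i in range(1, n + 1):                  # first column: all deletions
        c = dp[i - 1][0]
        dp[i][0] = (c[0] + W_DEL, c[1], c[2], c[3] + 1, c[4])
    for j in range(1, m + 1):                  # first row: all insertions
        c = dp[0][j - 1]
        dp[0][j] = (c[0] + W_INS, c[1], c[2], c[3], c[4] + 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:       # hit: zero cost
                best = (c[0], c[1] + 1, c[2], c[3], c[4])
            else:                              # substitution
                best = (c[0] + W_SUB, c[1], c[2] + 1, c[3], c[4])
            c = dp[i - 1][j]                   # deletion of a reference word
            cand = (c[0] + W_DEL, c[1], c[2], c[3] + 1, c[4])
            if cand[0] < best[0]:
                best = cand
            c = dp[i][j - 1]                   # insertion of a recognised word
            cand = (c[0] + W_INS, c[1], c[2], c[3], c[4] + 1)
            if cand[0] < best[0]:
                best = cand
            dp[i][j] = best
    _, h, s, d, ins = dp[n][m]
    return h, s, d, ins
```

For example, aligning the reference "the cat sat" against the hypothesis "the cat sat down" yields three hits and one insertion.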
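Given the alignment counts, the WER definition in Eq. (1) is a one-line computation. The sketch below is an illustration (not code from the paper); note how an insertion-heavy hypothesis pushes the value above 1.0, reflecting WER's lack of an upper bound discussed in the text.

```python
def wer(hits, subs, dels, inss):
    """Word Error Rate per Eq. (1): (S + D + I) / N1, where N1 = H + S + D
    is the number of reference (input) words."""
    n1 = hits + subs + dels
    return (subs + dels + inss) / n1
```

For instance, a 3-word reference recognised with one extra inserted word gives WER = 1/3, while a 2-word reference with three insertions gives WER = 1.5, i.e. 150%.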
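A minimal estimator for RIL can be written from Eqs. (2)–(4). This is a hedged sketch, not code from the paper: it assumes the joint distribution P(x, y) is estimated by counting aligned (input word, output word) pairs from a one-to-one alignment, and it uses base-2 logarithms.

```python
import math
from collections import Counter

def ril(pairs):
    """Relative Information Lost per Eqs. (2)-(4): RIL = H(Y|X) / H(Y),
    estimated from a list of aligned (input_word, output_word) pairs."""
    n = len(pairs)
    joint = Counter(pairs)                       # counts for P(x, y)
    px = Counter(x for x, _ in pairs)            # counts for P(x)
    py = Counter(y for _, y in pairs)            # counts for P(y)
    # H(Y) = -sum_i P(yi) log P(yi)
    h_y = -sum(c / n * math.log2(c / n) for c in py.values())
    # H(Y|X) = -sum_{i,j} P(xi, yj) log P(yj | xi), P(yj|xi) = P(xi,yj)/P(xi)
    h_y_given_x = -sum(
        c / n * math.log2((c / n) / (px[x] / n)) for (x, y), c in joint.items()
    )
    return h_y_given_x / h_y if h_y > 0 else 0.0
```

A perfect recogniser (output always determined by the input) gives RIL = 0, since H(Y|X) = 0; statistically independent input and output give RIL = 1, i.e. all word information is lost.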
