
New Demands, New Resources, and New Techniques in Natural Language Processing

Dong Zhendong (董振東), Dong Qiang (董強)
e-mail: dzddong@public.bta.net.cn
http://www.keenage.com
Tel: (8610) 6287-5641
Tel: (8610) 6676-8816
Intel China Research Forum, Beijing, 2000/10/11

Outline
  • Gaps and reflections -- achievements are in the past tense, shortcomings are in the future tense
  • HowNet (《知網》) -- a new resource -- new resources drive new techniques

Gaps and reflections
  • Shallow level of analysis
  • Narrow context
  • Small knowledge granularity
  • The limits of prevailing techniques
  • Areas covered: pinyin-to-character conversion, information filtering, text analysis, speech recognition, text classification, disambiguation, OCR, information retrieval, text understanding, grammar checking, automatic summarization, machine translation

Pinyin-to-character conversion systems (input methods)
  • Output: 上這攤某被立即送往醫院,但終因史學過多,不止身亡。
    Intended: 傷者譚某被立即送往醫院,但終因失血過多,不治身亡。 (The injured man, Tan, was rushed to hospital at once, but died of excessive blood loss despite treatment.)
  • Output: 上這里某被立即送往醫院,但終因留學過多,不止身亡。
    Intended: 傷者李某被立即送往醫院,但終因流血過多,不治身亡。 (The injured man, Li, was rushed to hospital at once, but died of excessive bleeding despite treatment.)
  • Output: 唐非情此獲準。
    Intended: 唐飛請辭獲準。 (Tang Fei's request to resign was approved.)

Grammar checking
  • "In a few years' time, there will be no Internet companies -- there will just be companies -- and all companies that are going to operate in the economics of a few years, in the future, are going to be Internet companies." -- Andrew S. Grove
    Checker suggestions: (1) are
  • "The world will little note, nor long remember what we say here, but it can never forget what they did here." -- Abraham Lincoln
    Checker suggestions: (1) world wills / worlds will (2) will little neither note, nor / will no little note, or
  • "It would be something that we actually will almost take for granted and wonder how business was done before we incorporated this in a very deep way" -- Bill Gates
    Checker suggestions: (1) wonders

Information retrieval
  • Query 華人 (ethnic Chinese) also matches 中華人民共和國 (People's Republic of China) and 新華人壽保險 (New China Life Insurance)
  • Query 北大 (Peking University) also matches 北大西洋 (North Atlantic), 北大荒 (the Great Northern Wilderness), and 臺北大學 (Taipei University)
  • Query 葡萄 (grape) also matches 葡萄牙 (Portugal) and 葡萄糖 (glucose)

Automatic summarization

Advances in Automatic Text Summarization

With the rapid growth of the World Wide Web and electronic information services, information is becoming available on-line at an incredible rate. One result is the oft-decried information overload. No one has time to read everything, yet we often have to make critical decisions based on what we are able to assimilate. The technology of automatic text summarization is becoming indispensable for dealing with this problem. Text summarization is the process of distilling the most important information from a source to produce an abridged version for a particular user or task. Until now there has been no state-of-the-art collection of the most important writings in automatic text summarization. This book presents the key developments in the field in an integrated framework and suggests future research areas. The book is organized into six sections: Classical Approaches, Corpus-Based Approaches, Exploiting Discourse Structure, Knowledge-Rich Approaches, Evaluation Methods, and New Summarization Problem Areas. (55%)

Machine translation

Source text: Do you think I could stay here and become nothing to you? Do you think, because I am poor, and obscure and plain, that I am soulless and heartless? I have as much soul as you and fully as much heart. And if God had gifted me with wealth and beauty I should have made it as hard for you to leave me as it is now for me to leave you. There I have spoken my heart and let me go.

Chinese output of a system released in 1988: 你想我可以在這里停留并且適合沒有任何東西成為你嗎?你想,因為我貧窮,并且昏暗和簡明,我是卑鄙和無情的嗎?我有和靈魂AS多你和充分作為許多心臟。同時如果上帝有我天才財富以及美麗我作為應該使它努力為你因為它現在為我。我已在那里說我的心臟并且讓我去。

Chinese output of a system released in 1999: 你認為我能這里留下和變成對你沒有什么嗎?你因為我是可憐和不引人注目和清楚,認為我是沒有靈魂的和無情嗎?我有同樣多作為你精神和完全同樣多心臟.和如果上帝有有天賦有財富的我和美,我應該已使它變得你同樣地難以現在我留下你讓我保持現在的樣子.那里我說我的心臟已和讓我去。

HowNet -- a new resource
  • Recent development and applications of HowNet
  • The key to HowNet

What does HowNet stand for?
  • How knowledge is represented and acquired?
  • How meaning can be formalized and calculated?
  • How meaning is expressed and conveyed?

Recent development of HowNet

Differences between HowNet 2000 and HowNet 1999 (columns in the source: 2000 edition, 1999 edition)
  • Languages: Chinese (GB)-English bilingual; GB + Big5
  • Functions: browser + data maintenance
  • Basic data (knowledge dictionary): 110,000 records; 60,000 records
  • Additions and revisions: more than 10,000; added senses for polysemous Chinese words; examples
  • Content extensions: event relation and role-shifting database; HowNet Chinese information structure database

Recent applications of HowNet
  • Resource extension: semantic annotation, building relation networks
  • Information-processing applications: semantic analysis, disambiguation, bidirectional English-Chinese and Chinese-English machine translation, information filtering

The key to HowNet
  • The soul of HowNet is relations -- the dynamic, multi-level manifestation of relations is embedded in static, isolated descriptions
  • Meaning is formalized and computable
  • The key to applying HowNet -- introducing new techniques: large context (possible and encouraged) and a meta-rule mechanism

Example:

我上星期把自行車賣了。今天一大早買主來找我,他說那車的車身有過硬傷,他要退貨。 (I sold my bicycle last week. Early this morning the buyer came to see me; he said the body of the vehicle had suffered serious damage and he wanted to return it.)
  • 買主 -- the buyer
  • 車 -- bicycle (bike)? or car?
  • 車身 -- the body of

sell|賣[agent,possession,target,cost]

buy|買(X) <=> sell|賣(Y) [mutual implication];
agent OF buy|買=target OF sell|賣;
source OF buy|買=agent OF sell|賣;
possession OF buy|買=possession OF sell|賣;
cost OF buy|買=cost OF sell|賣.

W_C=買主
G_C=N
E_C=
W_E=buyer
G_E=N
E_E=
DEF=human|人,#commercial|商,*buy|買

W_C=自行車
G_C=N
E_C=
W_E=bicycle
G_E=N
E_E=
DEF=LandVehicle|車

車 -- [last] query

W_C=車身
G_C=N
E_C=
W_E=body of a vehicle
G_E=N
E_E=
DEF=part|部件,%LandVehicle|車,body|身

唐飛請辭獲準。 (Tang Fei's request to resign was approved.)

SYN_S=V --> V
SEM_S=(event, act, causative/preventive) --> [result event] (event)
Examples: 請-示, 請-轉, 請-來, 請-教, 請-吃, 請-喝, 請-提意見

SYN_S=V <-- V
SEM_S=(event) [succession] <-- (event)
Examples: 舉槍-射擊, 拔槍-射擊, 拔刀-相助, 拜師-學藝, 打擊-報復, 討論-決定, 討論-通過, 立案-偵查, 報到-上班, 掛號-交費, 細嚼-慢咽, 登臺-獻藝, 握手-告別, 前來-報到, 列隊-歡迎, 泛濫-成災, 撥號-接通, 離家-出走, 縱火-焚燒, 改惡-從善, 走私-販私, 出席-作陪, 防火-護林, 封山-育林, 退耕-還林, 抗洪-救災, 團結-互助, 團結-
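The buy/sell mutual-implication rule in the bicycle example lends itself to a small illustration. The following is only a sketch under stated assumptions: the role names (agent, source, target, possession, cost) come from the slide, while the dictionaries, the function names `implied_buy_event` and `resolve_buyer`, and the reading of `*buy|買` as "agent of a buying event" are hypothetical and are not HowNet's actual data format or API.

```python
# Hypothetical sketch of the buy/sell role mapping shown on the slide.
# Role names are taken from the slide; everything else is an assumption.

# Each entry reads: <role> OF buy|買 = <role> OF sell|賣.
BUY_TO_SELL = {
    "agent": "target",            # agent OF buy = target OF sell
    "source": "agent",            # source OF buy = agent OF sell
    "possession": "possession",   # possession OF buy = possession OF sell
    "cost": "cost",               # cost OF buy = cost OF sell
}
# Inverse direction, used to derive the buying event from a selling event.
SELL_TO_BUY = {sell: buy for buy, sell in BUY_TO_SELL.items()}


def implied_buy_event(sell_event: dict) -> dict:
    """Return the roles of the buying event implied by a selling event."""
    return {
        SELL_TO_BUY[role]: filler
        for role, filler in sell_event.items()
        if role in SELL_TO_BUY
    }


def resolve_buyer(sell_event: dict, mention: str = "買主") -> dict:
    """Attach a 'buyer' mention to the selling event.

    Assumption: the mention denotes the agent of buy|買, so it fills
    whichever sell role the agent of buy maps to (the target).
    """
    filled = dict(sell_event)
    filled[BUY_TO_SELL["agent"]] = mention
    return filled


# First sentence: the speaker sold a bicycle; the buyer is not mentioned yet.
sell_event = {"agent": "speaker", "possession": "自行車"}

buy_event = implied_buy_event(sell_event)
completed = resolve_buyer(sell_event)

print(buy_event)   # roles of the implied buying event
print(completed)   # selling event with the buyer filled in as target
```

Under these assumptions, the later mention 車 can be tied to the `possession` already recorded for the event (自行車), which is how the wider context favors "bicycle" over "car".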
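The information-retrieval examples (華人 matching 中華人民共和國, and so on) illustrate what happens when a query is matched as a raw character substring. A minimal sketch of that effect follows; the document 海外華人 and the hand-written word segmentations are assumptions added for illustration and do not come from the slides or from any real segmenter.

```python
# Sketch: raw substring matching versus matching on word boundaries.
# The segmentations below are hand-written assumptions, not segmenter output.

SEGMENTED_DOCS = {
    "中華人民共和國": ["中華人民共和國"],
    "新華人壽保險": ["新華", "人壽", "保險"],
    "海外華人": ["海外", "華人"],   # illustrative document, not from the slides
}


def substring_hits(query: str) -> list[str]:
    """Documents that contain the query as a raw character substring."""
    return [doc for doc in SEGMENTED_DOCS if query in doc]


def word_hits(query: str) -> list[str]:
    """Documents in which the query appears as a whole segmented word."""
    return [doc for doc, words in SEGMENTED_DOCS.items() if query in words]


print(substring_hits("華人"))  # includes the two false matches from the slide
print(word_hits("華人"))       # only the document where 華人 is a word
```

Word-level matching only moves the difficulty into segmentation, which itself needs the kind of lexical and contextual knowledge the talk argues for.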
