Foreign Literature Translation: File System Virtualization and Service for Grid Data Management

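<p>  File System Virtualization and Service for Grid Data Management</p><p>  Abstract: Computations are becoming increasingly large in scale, in terms of both size and geographical and administrative distribution. Examples include scientific grids, which harness resources among several institutions for coordinated problem solving, and enterprise information systems, which aggregate efforts from multiple sites for collaborative development. Common to these systems is that applications and data are distributed on resources across administrative boundaries and wide-area networks. Such environments can be referred to as "grid-style" environments.</p><p>  Keywords: data; applications; programs</p>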

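<p><b>  1. Introduction</b></p><p>  1.1 Characteristics: Grid-style environments have the following distinctive characteristics.</p><p>  Heterogeneity: A wide variety of applications and resources exist in a grid-style environment. The resources typically have different hardware configurations (e.g., CPU speed and architecture, memory size, disk bandwidth and capacity) and software setups (e.g., operating systems and libraries), while the applications also have diverse characteristics (e.g., data access patterns) and needs (e.g., desired data access performance, security, and reliability).</p><p>  Dynamism: Systems deployed in a grid-style environment are highly dynamic. Machine and network failures can happen at any time, and non-dedicated resources may dynamically join and leave the system. Meanwhile, applications are started and terminated on demand, and their workloads also vary over time.</p><p>  Scale: Large amounts of resources can be aggregated in a grid-style environment. They are distributed across different institutions and connected by wide-area networks, providing the computing power and storage capacity to support the execution of many applications.</p><p>  1.2 Focus: This dissertation focuses on two specific aspects of data management in distributed systems: data provisioning (providing applications running on the computing resources with remote access to their data stored on the storage resources) and the management of data provisioning (the establishment, configuration, and termination of remote data access). The heterogeneous, dynamic, and large-scale nature of applications and resources in a grid-style computing environment poses unique challenges to these tasks. First, the diversity of applications and resources motivates a data provisioning solution that can be transparently deployed, without modifying the existing operating systems (O/Ss) or changing the application source code or binaries. Second, the wide-area, cross-domain environments necessitate application-tailored optimizations for data access to address the inefficiency (long network delay, limited network bandwidth), insecurity (insecure resources, limited mutual trust between different domains), and unreliability (unreliable machines and networks) that are typical in such environments. Last but not least, the management of data provisioning in a large, dynamic system also calls for flexible control and automatic optimization of remote data access, in order to deal with the complexity of providing data to many applications, to adapt agilely to changing environments, and to deliver application-desired performance, security, and reliability. To address these challenges, this dissertation presents a two-level data management system in which file system virtualization provides application-tailored grid-wide data access, and service-based middleware enables autonomic management of the data provisioning. In particular, this system has made several contributions.</p><p>  2. The data management system proposed in this dissertation is architected to address three important questions</p><p>  2.1 Application-transparent grid-wide data access: The first question is how to provide application-transparent grid-wide data access. Grids differ from traditional distributed computing environments because of their distinct characteristics, e.g., wide-area networking, heterogeneous end systems, and disjoint administrative domains. These differences bring new challenges to data management systems, and the technologies that are successful in local-area networks (LANs), e.g., LAN file systems, cannot be directly applied in a grid environment. Instead, grid data management needs to specifically address these unique issues. Existing solutions allow applications to access grid data through specialized APIs or libraries. However, the required modifications to application sources or binaries often place a burden on end users and developers, and present a hurdle for applications that cannot be easily modified. Therefore, application transparency is desirable to facilitate the deployment of a wide range of applications on grids, where grid-enabling should be the responsibility of the grid middleware rather than of the application users or developers.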

This dissertation presents a user-level DFS virtualization, namely the Grid Virtual File System (GVFS), for application-transparent grid data access. Because GVFS preserves the well-known DFS interface and presents it to applications, no modifications are required to their source code, libraries, or binaries. In addition, the approach is based on user-level virtualization techniques, which require no changes to existing O/Ss and can be conveniently deployed on</p>

<p>  FILE SYSTEM VIRTUALIZATION AND SERVICE FOR GRID DATA MANAGEMENT</p><p>  1. INTRODUCTION </p><p>  Computations are becoming increasingly large in scale, in terms of both size and geographical and administrative distribution. Examples include scientific grids [1], which harness resources among several institutions for coordinated problem solving, and enterprise information systems, which aggregate efforts from multiple sites for collaborative development. Common to these systems is that applications and data are distributed on resources across administrative boundaries and wide-area networks. Such environments can be referred to as "grid-style" environments, which have the following distinctive characteristics:</p><p>  Heterogeneity: There exists a wide variety of applications and resources in a grid-style environment.

The resources typically have different hardware configurations (e.g., CPU speed and architecture, memory size, disk bandwidth and capacity) and software setups (e.g., operating systems and libraries); the applications also have diverse characteristics (e.g., data access patterns) and needs (e.g., desired data access performance, security, and reliability).</p><p>  Dynamism: Systems deployed in a grid-style environment are highly dynamic. Failures of machines and networks can happen at any time, and non-dedicated resources may dynamically join and leave the system. Meanwhile, applications are started and terminated on demand, and their workloads also vary over time.</p><p>

  Scale: Large amounts of resources can be aggregated in a grid-style environment. They are distributed across different institutions and connected by wide-area networks, providing the computing power and storage capacity to support the execution of many applications.</p><p>  This dissertation focuses on two specific aspects of data management in distributed systems: data provisioning (providing applications running on the computing resources with remote access to their data stored on the storage resources) and the management of data provisioning (the establishment, configuration, and termination of remote data access). Computing in a grid-style environment poses unique challenges to these tasks because of the heterogeneous, dynamic, and large-scale nature of applications and resources described above.</p><p>  First, the diversity of applications and resources motivates a data provisioning solution that can be transparently deployed, without modifying the existing operating systems (O/Ss) or changing the application source code or binaries. Second, the wide-area, cross-domain environments necessitate application-tailored optimizations for data access to address the inefficiency (long

network delay, limited network bandwidth), insecurity (insecure resources, limited mutual trust between different domains), and unreliability (unreliable machines and networks) that are typical in such environments. Last but not least, the management of data provisioning in a large, dynamic system also calls for flexible control and automatic optimization of remote data access, in order to deal with the complexity of providing data to many applications, to adapt agilely to changing environments, and to deliver application-desired performance, security, and reliability.</p><p>  To address these challenges, this dissertation presents a two-level data management system in which file system virtualization provides application-tailored grid-wide data access, and service-based middleware enables autonomic management of the data provisioning. In particular, this system has made the following contributions:</p><p>  It provides on-demand, cross-domain data access transparently for unmodified applications and O/Ss, based on user-level virtualization of widely available O/S-level distributed file systems (DFSs).</p><p>  It supports application-tailored enhancements designed for grid-style environments on several important aspects of remote data access, including performance, consistency, security, and reliability.</p><p>  It employs middleware services to achieve flexible and interoperable management of grid-scale data provisioning, which is capable of controlling the lifecycles and configurations of dynamic data sessions based on application needs.</p><p>  It develops autonomic functions to automatically optimize the data management according to high-level objectives, in order to reduce the complexity of managing data sessions and adapt them promptly to changing environments.</p><p>  Finally, the proposed system has been demonstrated, with thorough experimental evaluation,

that it is effective and can significantly outperform conventional DFS-based approaches in grid-style environments; it has also been successfully deployed in a production grid system [2][3] for several years, supporting scientific tools and users from many disciplines.</p><p>  The data management system proposed in this dissertation is architected to address three important questions, which are discussed in the following subsections.</p><p>  2.1 Application-Transparent Grid-Wide Data Access </p><p>  The first question is: how to provide application-transparent grid-wide data access?</p><p>  Grids differ from traditional distributed computing environments because of their distinct characteristics, e.g., wide-area networking, heterogeneous end systems, and disjoint administrative domains. These differences bring new challenges to data management systems, and the technologies that are successful in local-area networks (LANs), e.g., LAN file systems, cannot be directly applied in a grid environment. Instead, grid data management

needs to specifically address these unique issues.</p><p>  Existing solutions allow applications to access grid data through specialized APIs or libraries. However, the required modifications to application sources or binaries often place a burden on end users and developers, and present a hurdle for applications that cannot be easily modified. Therefore, application transparency is desirable to facilitate the deployment of a wide range of applications on grids, where grid-enabling should be the responsibility of the grid middleware rather than of the application users or developers.</p><p>  This dissertation presents a user-level DFS virtualization, namely the Grid Virtual File System (GVFS), for application-transparent grid data access. Because GVFS preserves the well-known DFS interface and presents it to applications, no modifications are required to their source code, libraries, or binaries. In addition, the proposed approach is based on user-level virtualization techniques, which require no changes to existing O/Ss and can be conveniently deployed on grid resources. Furthermore, user-level enhancements designed for grid-style environments are built upon the virtualization layer to

enable data provisioning with application-desired characteristics.</p><p>  In short, the proposed GVFS approach answers the first question by providing transparent grid-wide data access for unmodified applications and O/Ss through user-level DFS virtualization.</p><p>  2.2 Application-Tailored Grid Data Provisioning </p><p>  The second question is: how to provide data with application-tailored optimizations?</p><p>  Typical O/Ss are designed to support general-purpose applications, but it is often the case that "one size does not fit all". Applications have diverse characteristics and requirements in terms of, for example, data access patterns, acceptable caching and consistency policies, security concerns, and fault tolerance requirements. To provide the desired performance, security, and reliability to a grid application, data provisioning needs

to be optimized according to the application's behaviors and needs.</p><p>  Because an optimization tailored for one application (e.g., aggressive prefetching of file contents) may degrade performance for others (e.g., sparse files, databases), application-tailored features are typically not implemented in general-purpose O/S kernels. In addition, kernel-level modifications are difficult to port and deploy, notably in shared environments. Toolkit-based solutions typically give users powerful APIs to program remote data access with desired behaviors, but few programmers have the skills to make effective use of such APIs.</p><p>  To solve this problem, user-level DFS customizations are proposed to support application-tailored GVFS data sessions. In particular, enhancements designed for grid-style environments are provided upon the

virtualization layer in GVFS, which include customizable disk caching and multithreading for high-performance data access, efficient consistency protocols for application-desired data coherence, strong and grid-compatible security for secure grid-wide data access, and reliability protocols supporting application-transparent failure detection and recovery. Based on GVFS, data sessions can be created on demand on a per-application basis, where each session can apply and configure these enhancements independently to address its application's needs.</p><p>  Therefore, the answer to the second question is to use the application-tailored enhancements enabled by GVFS to provide grid-wide data sessions with application-desired performance, consistency, security, and reliability.</p><p>  2.3 Service-Based Autonomic Data Management </p><p>  The third question is: how to manage data provisioning in a grid-scale system with dynamically changing environments?</p><p>  Based on the GVFS approach, data sessions can be started on demand and independently customized for applications. However, in a large-scale system, managing many dynamic data sessions is another challenging task because of its complexity. Data sessions need to be dynamically established and destroyed based on the lifecycles of applications and the locations of their instantiations and data storage. Customization of data sessions also requires considering various relevant factors and tuning many parameters, in accordance with the desired behaviors and the surrounding environments. Dynamically changing application workloads and resource availability further require continuous monitoring of data sessions and timely adaptation of their configurations.</p><p>  These requirements are often beyond the capability of end users and even system administrators. Yet the goals of users and administrators are rather simple and explicit. For example, from an application user's point of view, job execution should be fast, secure, and reliable; from a resource provider's point of view, resource use should be healthy and profitable. Therefore, this dissertation presents a novel service-based autonomic data management approach to automatically manage and optimize the data provisioning according to such high-level objectives.</p><p>  This dissertation proposes a set of data management services to manage

the per-application GVFS sessions, enforce isolation among the independent sessions, and apply the desired customization for each session. They support flexible control over the lifecycles and configurations of data sessions, and can exploit knowledge of applications (e.g., data access patterns, data sharing scenarios, and service quality requirements) to customize their data sessions' use of performance, consistency, security, and reliability enhancements. These services also provide interoperable interfaces that allow direct interaction with other grid middleware services and automated execution of data provisioning tasks.</p><p>  To further reduce human intervention in managing data sessions and enable them to adapt promptly to changing environments, autonomic functions are built into the data management services to make them capable of automatically monitoring, analyzing, and optimizing the distributed entities of grid-wide data sessions, and of working together cooperatively to achieve the desired data provisioning and resource usage goals. Such autonomic management is applied to several important aspects of data sessions, including cache configuration, data replication, and session redirection.</p><p>  In summary, the GVFS-based data management system addresses the last question by employing autonomic services to provide automatic man</p>
