Network Working Group                                       T. Dreibholz
Internet-Draft                              University of Duisburg-Essen
Intended status: Informational                           January 5, 2010
Expires: July 9, 2010


   Applicability of Reliable Server Pooling for Real-Time Distributed
                               Computing
            draft-dreibholz-rserpool-applic-distcomp-08.txt

Abstract

   This document describes the applicability of the Reliable Server
   Pooling architecture to manage real-time distributed computing pools
   and access the resources of such pools.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on July 9, 2010.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

Dreibholz                 Expires July 9, 2010                  [Page 1]

Internet-Draft     RSerPool for Distributed Computing       January 2010
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the BSD
   License.

Table of Contents

   1.  Introduction
     1.1.  Scope
     1.2.  Terminology
   2.  Distributed Computing using RSerPool
     2.1.  Requirements
     2.2.  Architecture
     2.3.  Limitations
   3.  Reference Implementation
   4.  Security Considerations
   5.  IANA Considerations
   6.  References
     6.1.  Normative References
     6.2.  Informative References
   Author's Address

1.  Introduction

   Reliable Server Pooling (RSerPool) defines protocols for providing
   highly available services. The services are located in a pool of
   redundant servers; if a server fails, another server takes over.
   The only requirement put on the servers belonging to the pool is
   that, if state is maintained by the server, this state must be
   transferred to the server taking over. The goal is to provide
   server-based redundancy. Transport and network level redundancy are
   handled by the transport and network layer protocols. The
   application may choose to distribute its traffic over the servers of
   the pool according to a certain policy.

1.1.  Scope

   This document explains how Reliable Server Pooling mechanisms can be
   used to manage and access pools of Distributed Computing resources.

1.2.  Terminology

   The terms used in this document are commonly identified in related
   work and can be found in the Aggregate Server Access Protocol and
   Endpoint Handlespace Redundancy Protocol Common Parameters document
   [RFC5354].

2.  Distributed Computing using RSerPool

2.1.  Requirements

   The application scenario for Distributed Computing is defined as
   follows:

   o  Clients generate large computation jobs. Jobs have to be
      processed by servers as soon as possible (real-time). That is,
      unlike in concepts such as SETI@home [SETIatHome], it is not
      possible to let clients fetch a job, process it later, and maybe
      upload the result some day.

   o  Jobs may be partitionable, i.e., they can be split up into smaller
      pieces which can be processed independently, and the partial
      results can be concatenated to form the processing result of the
      complete job. Jobs have to be processed by servers.

   o  Servers may be unreliable, i.e., user computers may be temporarily
      added to the pool of computing resources and may be withdrawn
      when their owners use them again. Furthermore, they may simply
      disappear because of broken network connections (modems, etc.) or
      because their power is turned off.

   o  The processing power of servers in a pool of computing resources
      may be very heterogeneous, e.g., a few supercomputers and many
      low-end user PCs.

   Maintaining a Distributed Computing pool for the scenario described
   above gives rise to the following requirements for the pool
   management:

   o  It must be possible to manage large server pools, e.g., up to some
      hundreds or even thousands of servers.

   o  Due to heterogeneous processing resources within a pool, it must
      be possible to use appropriate server selection procedures to
      meaningfully utilize the available resources.
   o  It must be possible to dynamically add and remove servers.

   o  Servers may be unreliable, especially when the servers are
      represented by user PCs. Failover mechanisms are required to
      continue an interrupted computation session.

2.2.  Architecture

   All requirements for pool and session management of the Distributed
   Computing scenario defined in the previous section can be fulfilled
   by the Reliable Server Pooling architecture:

   o  An efficient implementation of the handlespace management
      structures allows pools to contain thousands of elements.
      Handlespace management structures have been proposed, implemented
      and analyzed in [IJHIT2008], [Contel2005] and [Dre2006].

   o  RSerPool allows server selection rules to be specified by pool
      member selection policies [RFC5356]. A set of adaptive and
      non-adaptive policies is already defined. To fulfill the
      requirements of new applications, it is also possible to define
      new policies. Research has already been conducted on the load
      distribution efficiency of pool policies in Distributed Computing
      scenarios; see [LCN2005], [Dre2006], [Tencon2005],
      [Euromicro2007] and [ICN2005] for details.

   o  Dynamic addition and removal of Pool Elements (PEs) is a feature
      of RSerPool [RFC5352].

   o  The control/data channel concept [RFC5351] of RSerPool realizes a
      session layer. That is, RSerPool already handles the main task of
      maintaining and monitoring connections between Pool Users (PUs)
      and PEs; the only task left to the application layer to provide
      full failover functionality is to realize an
      application-dependent failover procedure. By using client-based
      state synchronization [LCN2002], [Euromicro2005] in the form of
      ASAP Cookies, a failover may be fully transparent to the PU,
      while only a state restoration is necessary on the PE side. A
      demo application [RSerPoolPage] using the RSerPool session layer
      in a Distributed Computing application is described in
      [Infocom2005].

2.3.  Limitations

   When RSerPool is applied to distributed computing applications, the
   duties of the RSerPool architecture remain limited to the management
   of pools and independent sessions only. In particular, it is a
   non-goal to provide functionalities like data synchronization among
   sessions, user authentication, accounting or the support for more
   than one administrative domain. Such functionalities are considered
   to be application-specific and are therefore out of the scope of
   RSerPool.

3.  Reference Implementation

   The RSerPool reference implementation RSPLIB, including example
   Distributed Computing applications, can be found at [RSerPoolPage].
   It supports the functionalities defined by [RFC5351], [RFC5352],
   [RFC5353], [RFC5354] and [RFC5355] as well as the options
   [I-D.dreibholz-rserpool-asap-hropt],
   [I-D.dreibholz-rserpool-enrp-takeover] and
   [I-D.dreibholz-rserpool-delay]. An introduction to this
   implementation is provided in [Dre2006].

4.  Security Considerations

   The protocols used in the Reliable Server Pooling architecture only
   try to increase the availability of the servers in the network.
   RSerPool protocols do not contain any protocol mechanisms which are
   directly related to user message authentication, integrity and
   confidentiality functions. For such features, RSerPool depends on
   the IPsec protocols or on Transport Layer Security (TLS) protocols
   for its own security and on the architecture and/or security
   features of its user protocols.

   The RSerPool architecture allows the use of different transport
   protocols for its application and control data exchange. These
   transport protocols may have mechanisms for reducing the risk of
   blind denial-of-service attacks and/or masquerade attacks. If such
   measures are required by the applications, then it is advised to
   check the SCTP (see [RFC4960]) applicability statement [RFC3257] for
   guidance on this issue.

5.  IANA Considerations

   This document introduces no additional considerations for IANA.

6.  References

6.1.  Normative References

   [RFC5351]  Lei, P., Ong, L., Tuexen, M., and T. Dreibholz, "An
              Overview of Reliable Server Pooling Protocols", RFC 5351,
              September 2008.

   [RFC5352]  Stewart, R., Xie, Q., Stillman, M., and M. Tuexen,
              "Aggregate Server Access Protocol (ASAP)", RFC 5352,
              September 2008.

   [RFC5353]  Xie, Q., Stewart, R., Stillman, M., Tuexen, M., and A.
              Silverton, "Endpoint Handlespace Redundancy Protocol
              (ENRP)", RFC 5353, September 2008.

   [RFC5354]  Stewart, R., Xie, Q., Stillman, M., and M. Tuexen,
              "Aggregate Server Access Protocol (ASAP) and Endpoint
              Handlespace Redundancy Protocol (ENRP) Parameters",
              RFC 5354, September 2008.

   [RFC5355]  Stillman, M., Gopal, R., Guttman, E., Sengodan, S., and M.
              Holdrege, "Threats Introduced by Reliable Server Pooling
              (RSerPool) and Requirements for Security in Response to
              Threats", RFC 5355, September 2008.

   [RFC5356]  Dreibholz, T. and M. Tuexen, "Reliable Server Pooling
              Policies", RFC 5356, September 2008.

   [RFC3257]  Coene, L., "Stream Control Transmission Protocol
              Applicability Statement", RFC 3257, April 2002.

   [RFC4960]  Stewart, R., "Stream Control Transmission Protocol",
              RFC 4960, September 2007.

6.2.  Informative References

   [RSerPoolPage]
              Dreibholz, T., "Thomas Dreibholz's RSerPool Page",
              URL: http://tdrwww.iem.uni-due.de/dreibholz/rserpool/.

   [Dre2006]  Dreibholz, T., "Reliable Server Pooling -- Evaluation,
              Optimization and Extension of a Novel IETF Architecture",
              Ph.D. Thesis, University of Duisburg-Essen, Faculty of
              Economics, Institute for Computer Science and Business
              Information Systems, URL:
              http://duepublico.uni-duisburg-essen.de/servlets/DerivateServlet/Derivate-16326/Dre2006-final.pdf,
              March 2007.

   [LCN2005]  Dreibholz, T. and E. Rathgeb, "On the Performance of
              Reliable Server Pooling Systems", Proceedings of the 30th
              IEEE Local Computer Networks Conference, November 2005.

   [Tencon2005]
              Dreibholz, T. and E. Rathgeb, "The Performance of
              Reliable Server Pooling Systems in Different Server
              Capacity Scenarios", Proceedings of the IEEE TENCON,
              November 2005.

   [LCN2002]  Dreibholz, T., "An efficient approach for state sharing
              in server pools", Proceedings of the 27th IEEE Local
              Computer Networks Conference, October 2002.

   [Euromicro2005]
              Dreibholz, T. and E. Rathgeb, "RSerPool -- Providing
              Highly Available Services using Unreliable Servers",
              Proceedings of the 31st IEEE EuroMicro Conference on
              Software Engineering and Advanced Applications,
              August 2005.

   [Euromicro2007]
              Dreibholz, T., Zhou, X., and E. Rathgeb, "A Performance
              Evaluation of RSerPool Server Selection Policies in
              Varying Heterogeneous Capacity Scenarios", Proceedings of
              the 33rd IEEE EuroMicro Conference on Software
              Engineering and Advanced Applications, August 2007.

   [ICN2005]  Dreibholz, T., Rathgeb, E., and M. Tuexen, "Load
              Distribution Performance of the Reliable Server Pooling
              Framework", Proceedings of the 4th IEEE International
              Conference on Networking, April 2005.

   [Infocom2005]
              Dreibholz, T. and E. Rathgeb, "An Application
              Demonstration of the Reliable Server Pooling Framework",
              Proceedings of the 24th IEEE Infocom, March 2005.

   [Contel2005]
              Dreibholz, T. and E. Rathgeb, "Implementing the Reliable
              Server Pooling Framework", Proceedings of the 8th IEEE
              International Conference on Telecommunications,
              June 2005.

   [IJHIT2008]
              Dreibholz, T. and E. Rathgeb, "An Evaluation of the Pool
              Maintenance Overhead in Reliable Server Pooling Systems",
              International Journal of Hybrid Information Technology
              (IJHIT) Volume 1, Number 2, April 2008.
   [SETIatHome]
              "SETI@home: Search for Extraterrestrial Intelligence at
              home", URL: http://setiathome.ssl.berkeley.edu.

   [I-D.dreibholz-rserpool-asap-hropt]
              Dreibholz, T., "Handle Resolution Option for ASAP",
              draft-dreibholz-rserpool-asap-hropt-04 (work in
              progress), January 2009.

   [I-D.dreibholz-rserpool-enrp-takeover]
              Dreibholz, T. and X. Zhou, "Takeover Suggestion Flag for
              the ENRP Handle Update Message",
              draft-dreibholz-rserpool-enrp-takeover-01 (work in
              progress), January 2009.

   [I-D.dreibholz-rserpool-delay]
              Dreibholz, T. and X. Zhou, "Definition of a Delay
              Measurement Infrastructure and Delay-Sensitive Least-Used
              Policy for Reliable Server Pooling",
              draft-dreibholz-rserpool-delay-03 (work in progress),
              January 2009.

Author's Address

   Thomas Dreibholz
   University of Duisburg-Essen, Institute for Experimental Mathematics
   Ellernstrasse 29
   45326 Essen, Nordrhein-Westfalen
   Germany

   Phone: +49-201-1837637
   Fax:   +49-201-1837673
   Email: dreibh@iem.uni-due.de
   URI:   http://www.iem.uni-due.de/~dreibh/
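
   The failover procedure outlined in Section 2.2 can be illustrated
   with a small simulation. The following Python sketch is not part of
   RSPLIB and does not use its API; PoolElement, select_least_used,
   process_job, crash_after and cookie_interval are hypothetical names
   chosen for this example only. It assumes a partitionable job
   (summing a list of numbers), a simplified Least Used selection based
   on a static load value, and a PE that hands a checkpoint (standing
   in for an ASAP Cookie) to the PU after every few items. When a PE
   disappears, the PU selects another PE, which restores the state from
   the last cookie and continues the job.

```python
from dataclasses import dataclass


@dataclass
class PoolElement:
    """Hypothetical stand-in for a PE; not an RSPLIB type."""
    name: str
    load: float          # static load in [0, 1]; a real PE would update it
    alive: bool = True


def select_least_used(pool):
    """Simplified Least Used selection: the live PE with the lowest load."""
    candidates = [pe for pe in pool if pe.alive]
    if not candidates:
        raise RuntimeError("no pool element available")
    return min(candidates, key=lambda pe: pe.load)


def process_job(pool, items, crash_after=None, cookie_interval=2):
    """Sum `items` on PEs from `pool`, resuming from the last cookie.

    crash_after maps a PE name to the number of items after which that
    PE disappears (simulation only). Returns (result, PEs used in order).
    """
    crash_after = crash_after or {}
    cookie = (0, 0)      # (index of next item, partial sum), kept by the PU
    served_by = []
    while cookie[0] < len(items):
        pe = select_least_used(pool)
        served_by.append(pe.name)
        next_item, partial = cookie          # PE restores state from cookie
        budget = crash_after.get(pe.name)
        processed = 0
        while next_item < len(items):
            if budget is not None and processed == budget:
                pe.alive = False             # simulated PE failure
                break                        # work since last cookie is lost
            partial += items[next_item]
            next_item += 1
            processed += 1
            # The PE sends a fresh cookie every cookie_interval items
            # and once more when the job is complete.
            if next_item % cookie_interval == 0 or next_item == len(items):
                cookie = (next_item, partial)
    return cookie[1], served_by


if __name__ == "__main__":
    pool = [
        PoolElement("supercomputer", load=0.9),
        PoolElement("pc-1", load=0.1),
        PoolElement("pc-2", load=0.4),
    ]
    total, served_by = process_job(pool, [1, 2, 3, 4, 5, 6, 7],
                                   crash_after={"pc-1": 3})
    print(total, served_by)   # prints: 28 ['pc-1', 'pc-2']
```

   In this run, pc-1 fails after three items. The third item was not
   yet covered by a cookie, so pc-2 processes it again. A shorter
   cookie interval reduces such rework at the price of more cookie
   traffic between PE and PU.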