PPSP Y. Hu Internet Draft Y. Xia Intended status: Informational NEC Labs China Expires: April 2010 October 26, 2009 Tracker vs. DHT Performance Comparison for P2P Streaming draft-hu-ppsp-tracker-dht-performance-comparison-01.txt Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt. The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html. This Internet-Draft will expire on March 20, 2010. Copyright Notice Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document (http://trustee.ietf.org/license-info). Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Hu Expires April 26 2010 [Page 1] Internet-Draft Tracker vs. DHT Performance Comparison October 2009 Abstract The draft makes performance analysis in two kinds of resource exchange and discovery methods, tracker based and DHT based architectures in P2P streaming. Our analysis shows that centralized tracker solution for resource discovery (and chunk lookup) has much shorter response time than highly distributed DHT approach. Hu Expires April 26, 2010 [Page 2] Internet-Draft Tracker vs. DHT Performance Comparison October 2009 Table of Contents 1. Introduction.................................................4 2. Resource discovery...........................................4 2.1. Lookup efficiency.......................................4 2.2. Network traffic.........................................5 2.3. Host requirement........................................6 3. Chunk discovery..............................................7 3.1. Lookup efficiency.......................................8 3.2. Network traffic.........................................8 3.3. Host requirement........................................9 4. Conclusion..................................................11 5. References..................................................12 5.1. Normative References...................................12 5.2. Informative References.................................12 Hu Expires April 26, 2010 [Page 3] Internet-Draft Tracker vs. DHT Performance Comparison October 2009 1. Introduction Assume there are D resources shared by N peers in a P2P streaming system, there are different kinds of methods for a given peer to locate or discover a specific resource [PPSP]. One is tracker-based method, where a peer reports the resource(s) it has to a centralized server (i.e. the tracker). When the tracker receives a resource request from a peer, it returns to the requesting peer with a list of peers which have the requested resource. Another method is a DHT-based (such as [Chord]), fully-distributed lookup protocol, where resource information is stored by all the peers in the P2P network. We estimate the performance of the two methods. For P2P streaming usage scenarios, N is the number of active users in a P2P streaming software (such as PPLive, PPStream), D is the number of channels (for live streaming) or videos (for VoD) the software provides. For a popular P2P streaming software, there are about 10 million active users and about 100 thousand resources. 2. Resource discovery P2P streaming is usually divided into chunks. A chunk is a basic unit of partitioned streaming, which is used for the purpose of storage and advertising to neighbors what parts of a movie a peer holds [Sigcomm:P2P streaming]. In this part of comparison, we only consider the discovery of the coarse information (resource information), and compare the discovery of the grain information (chunk information) in the next part. In other words, we compare the following two methods. In tracker- based method, the tracker stores the resource information (given a resource Ri, the list of peers [Pi] that have Ri), while the chunk information (Pi has which chunks) is exchanged using peer gossip. Similarly, in DHT-based method, only resource information is obtained using DHT method, and chunk information is also exchanged using peer gossip after the requesting peer gets the peer list that has the requested resource. One assumption in DHT-based method is that all the nodes are geographically distributed on the whole Internet. 2.1. Lookup efficiency In the estimation, we assume average RTT in the network is 200ms, N = 10,000,000 and D = 100,000. In tracker-based method, assume one tracker is used, resource information is stored using dictionary: {Ri: [Pi]}, then: Hu Expires April 26, 2010 [Page 4] Internet-Draft Tracker vs. DHT Performance Comparison October 2009 (a) Lookup message: O(1) (b) Lookup operations: O(1) Lookup operation is to find one record in N records in the tracker. (c) Lookup latency: O(1)*RTT = 200ms The delay caused by lookup operations is negligible compared with the network delay. In DHT-based method: (a) Lookup message: O(log(N)) = 23 (b) Lookup operations: log(N)*O(1) = 23 Each node maintains information about O(log(N)) other nodes. Lookup operation is done by routing through O(log(N)) other nodes toward the destination. In each node, lookup operation is to find one record in O(log(N)) records. (c) Lookup latency: O(log(N))*RTT = 23*200ms = 4.6s The delay caused by lookup operations is negligible compared with the network delay. Summary: Tracker-based method is much faster than DHT-based method. The 4.6s lookup latency is relatively high in P2P streaming applications. 2.2. Network traffic Assume on average each peer requests new channels or videos every T seconds, then totally there are N/T requests per second in the network. Assume on average the size of a message is S Bytes. In the following estimation, we assume T = 60 sec. The size of the response message is usually larger than the request message. For simplicity, we assume S = 1K Bytes. In tracker-based method: (a) Number of messages per second: N/T*2 = 10,000,000/60*2 = 3.3*100,000 Hu Expires April 26, 2010 [Page 5] Internet-Draft Tracker vs. DHT Performance Comparison October 2009 There are N/T channel/video requests per second in the network, and each request takes 2 messages (one request and one response) (b) Size of messages per second: N/T*2*S = 3.3*100,000*1K = 3.3 *100,000,000 = 0.33GBytes (c) Number of messages in node join/leave: O(1) In DHT-based method: (a) Number of messages per second: N/T*2*log(N) = 10,000,000/60*2*23 = 7.7*1,000,000 There are N/T channel/video requests per second in the network, and each request takes 2*log(N) messages. (b) Size of messages per second: N/T*2*log(N) *S = 7.7*1,000,000*1K = 7.7*1,000,000,000 = 7.7GBytes (c) Number of messages in node join/leave: O((logN)*(logN)) = 541 Summary: Tracker-based method has much smaller network traffic overhead than DHT-based method, but both methods are acceptable in P2P streaming applications. 2.3. Host requirement In tracker-based method: (a) Memory requirement: Information of D resources is stored in the tracker. If on average one peer has C resources (the user watches the video or stores the video for other peers), then the average length of the peer list for each resource is N*C/D. Assume each peer is represented by P Bytes, then the total size of this table is D* (N*C/D)*P = N*C*P Bytes. Assume C = 10, P = 20 Bytes, then memory requirement is N*C*P = 10,000,000*10*20 = 2GBytes (b) Number of received requests per second: N/T = 10,000,000/60 = 1.67*100,000 Assume on average each peer requests new channels or videos every T seconds, then totally there are N/T requests per second to the tracker. Hu Expires April 26, 2010 [Page 6] Internet-Draft Tracker vs. DHT Performance Comparison October 2009 (c) Size of request/response messages per second: N/T*2*S =3.33*100,000*1K = 0.33GBytes S is the average size of request/response message, and we assume S = 1K Bytes. In DHT-based method: (a) Memory requirement: Information of D resources is stored in N peers, so on average each peer stores information for D/N = 100,000/10,000,000 = 0.01 resource. For those peers that store resource information, the average length of the peer list for each resource is N*C/D. Assume each peer is represented by P=20 Bytes, then the memory requirement is (N*C/D)*P = 10,000,000*10/100,000*20 Bytes = 20KBytes. In addition, each node maintains information about O(log(N)) other nodes, assume the size of the neighbor information is E Bytes, this part of memory requirement is O(log(N))*E Bytes = 23*10 = 230 Bytes (assume E = 10 Bytes). (b) Number of received requests per second: There are N/T channel/video requests per second in the network, and each request takes log(N) request messages. So on average the number of requests one peer receives per second is: N/T*log(N)/N = log(N)/T = 23/60 = 0.4 (c) Size of request/response messages per second: 2*log(N)/T*S = 2*0.4*1K = 0.8 KBytes S is the average size of request/response message, and assume S = 1K Bytes. Summary: DHT-based method has much less host resources requirement than tracker-based method. For robustness and performance considerations, multiple trackers can be used. 3. Chunk discovery In this part, we compare the following two methods. In tracker-based method, the tracker stores the resource information, and the chunk information is exchanged using peer gossip. While in DHT-based method, both resource information and chunk information are obtained using Hu Expires April 26, 2010 [Page 7] Internet-Draft Tracker vs. DHT Performance Comparison October 2009 DHT method. For the DHT-based method, we use the first solution in [PPSPChunk]. For the tracker-based method, since the tracker only stores the resource information, the performance analysis is the same as the previous part. In this part, we also add the performance analysis for the peer gossip. We assume each peer gossips with M neighbors. 3.1. Lookup efficiency In the following estimation, we assume average RTT in the network is 200ms, N = 10,000,000, D = 100,000 and M = 20. Tracker side: (a) Lookup message: O(1) (b) Lookup operations: O(1) (c) Lookup latency: O(1)*RTT = 200ms Peer side: (a) Lookup message: M*O(1) = 20 Each peer send a lookup message to M neighbors. (b) Lookup operations: O(1) (c) Lookup latency: O(1)*RTT = 200ms In DHT-based method: (a) Lookup message: O(log(N)) = 23 (b) Lookup operations: log(N)*O(1) = 23 (c) Lookup latency: O(log(N))*RTT = 23*200ms = 4.6s Summary: Tracker-based method is much faster than DHT-based method. The 4.6s lookup latency is relatively high in P2P streaming applications. 3.2. Network traffic In tracker-based method: Hu Expires April 26, 2010 [Page 8] Internet-Draft Tracker vs. DHT Performance Comparison October 2009 Tracker side: (a) Number of messages per second: N/T*2 = 10,000,000/60*2 = 3.3*100,000 (b) Size of messages per second: N/T*2*S = 3.3*100,000*1K = 3.3 *100,000,000 =0.33GBytes (c) Number of messages in node join/leave: O(1) Peer side: Assume each peer sends gossip message every I seconds, I = 10 sec. (a) Number of messages per second: M*N/I*2 = 20*10,000,000/10*2 = 4*10,000,000 (b) Size of messages per second: M*N/I*2*S = 4*10,000,000*1K = 40GBytes In DHT-based method: Assume the rate of the video is R KBytes/sec, the size of the chunk is Z Kbytes, then the number of chunk requested per second is R/Z. Assume R = 32 Kbytes/sec, Z = 16 Kbytes, then R/Z = 2. (a) Number of messages per second: N*(R/Z)*2log(N) = 10,000,000*2*2*23 = 1,000,000,000 There are N*(R/Z) chunk requests per second in the network, and each request takes 2*log(N) messages. (b) Size of messages per second: N*(R/Z)*2log(N)*S = 1,000,000,000*1K = 1TBytes (c) Number of messages in node join/leave: O((logN)*(logN)) = 541 Summary: Tracker-based method has much smaller network traffic overhead than DHT-based method, but both methods are acceptable in P2P streaming applications. 3.3. Host requirement In tracker-based method: Tracker side: (a) Memory requirement: N*C*P = 10,000,000*10*20 = 2GBytes (b) Number of received requests per second: N/T = 10,000,000/60 = 1.7*100,000 Hu Expires April 26, 2010 [Page 9] Internet-Draft Tracker vs. DHT Performance Comparison October 2009 (c) Size of request/response messages per second: N/T*2*S = 3.3*100,000*1K = 0.33GBytes Peer side: Assume the size of the bitmap is Bm Bytes, Bm = 1K. (a) Memory requirement: M*Bm = 20*1K = 20KBytes (b) Number of received requests per second: M/I = 20/10 = 2 (c) Size of request/response messages per second: M/I*2*S = 2*2*1K = 4KBytes In DHT-based method: (a) Memory requirement: Assume the number of chunks in one resource is H, H = 10000. D*H chunk information is stored in N peers, so on average each peer stores information for D*H/N = 100,000*10,000/10,000,000 = 100 chunks. The average length of the peer list for each chunk is N*C/D. Assume each peer is represented by P Bytes, then the memory requirement is (N*C/D)*P*(100 chunks) = 20KBytes*(100 chunks) = 2MBytes. In addition, each node maintains information about O(log(N)) other nodes, this part of memory requirement is O(log(N))*E Bytes = 23*10 = 230 Bytes (assume E = 10 Bytes). (b) Number of received requests per second: There are N*(R/Z)=2N chunk requests per second in the network, and each request takes log(N) request messages. So on average the number of requests one peer receives per second is: 2*log(N) = 2*23 = 46 (c) Size of request/response messages per second: 2*log(N)*2*S = 46*2*1K = 92 KBytes S is the average size of request/response message, and assume S = 1K Bytes. Summary: DHT-based method has much less host resources requirement than tracker-based method. For robustness and performance considerations, multiple trackers can be used. Hu Expires April 26, 2010 [Page 10] Internet-Draft Tracker vs. DHT Performance Comparison October 2009 4. Conclusion Centralized tracker solution for resource discovery (and chunk lookup) has much shorter response time than highly distributed DHT approach. In a large network, the DHT approach's response time can be very long (on the order of seconds), which is not suitable for delay sensitive applications. On the other hand, the per-host requirement of the tracker is higher than that of DHT nodes, but is still within reach of a $1000 commodity PC. Because the lookup messages have business value and should be stored for latter analysis, using a small number of trackers will make the job much easier than a highly distributed approach. Hu Expires April 26, 2010 [Page 11] Internet-Draft Tracker vs. DHT Performance Comparison October 2009 5. References 5.1. Normative References [1] [PPSP] Problem Statement of P2P Streaming Protocol (PPSP), Y. Zhang et al., Internet Draft draft-zhang-ppsp-problem-statement-05.txt [2] [Chord] Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications, Ion Stoica et al., In SIGCOMM 01. [3] [Sigcomm:P2P streaming] Challenges, Design and Analysis of a Large-scale P2P-VoD System, Yan Huang et al., In SIGCOMM 08. [4] [PPSPChunk] Chunk Discovery for P2P Streaming, N. Zong, Internet Draft draft-zong-ppsp-chunk-discovery-00.txt 5.2. Informative References Author's Addresses Yan Hu NEC Labs China Email: huyan@research.nec.com.cn Yong Xia NEC Labs China Email: xiayong@research.nec.com.cn Hu Expires April 26, 2010 [Page 12]