Traffic Measurement for Big Network Data by Shigang Chen, Min Chen, Qingjun Xiao

This book presents several compact and fast methods for online traffic measurement of big network data. It describes the challenges of online traffic measurement, discusses the state of the field, and provides an overview of potential solutions to the major problems.
The authors introduce the problem of per-flow size measurement for big network data and present a fast and scalable counter architecture, called Counter Tree, which leverages a two-dimensional counter sharing scheme to achieve far better memory efficiency and significantly extend the estimation range.
Unlike traditional approaches to cardinality estimation problems, which allocate a separate data structure (called an estimator) for each flow, this book takes a different design path by viewing all the flows together as a whole: each flow is allocated a virtual estimator, and these virtual estimators share a common memory space. A framework of virtual estimators is designed to apply the idea of sharing to an array of cardinality estimation solutions, achieving far better memory efficiency than the best existing work.
To conclude, the authors discuss persistent spread estimation in high-speed networks. They offer a compact data structure called multi-virtual bitmap, which can estimate the cardinality of the intersection of an arbitrary number of sets. Using multi-virtual bitmaps, they present an implementation that can deliver high estimation accuracy under a very tight memory space.
The results of these experiments will surprise both professionals in the field and advanced-level students interested in the topic. By providing both an overview and the results of specific experiments, this book is useful for those new to online traffic measurement and for experts on the topic.
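The sharing idea described above (one virtual estimator per flow, all drawn from a common memory space) can be illustrated with a bitmap estimator. The sketch below is only an illustration under stated assumptions, not the book's framework: the class name `SharedVirtualBitmaps`, the helper `_h`, the SHA-1 based hashing, and the default sizes `m` and `s` are all hypothetical. Each flow's virtual bitmap consists of `s` positions selected by hashing from a shared array of `m` bits, and the estimate subtracts the noise contributed by other flows using the zero fraction of the whole array.

```python
import hashlib
import math


def _h(*parts):
    """Deterministic 64-bit hash of the given parts (hypothetical helper)."""
    data = "|".join(map(str, parts)).encode()
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big")


class SharedVirtualBitmaps:
    """All flows share one bit array; each flow sees s pseudo-randomly chosen bits."""

    def __init__(self, m=1 << 16, s=512):
        self.m, self.s = m, s
        self.bits = bytearray(m)  # one byte per bit, for clarity rather than compactness

    def _position(self, flow, j):
        # Location of the j-th bit of this flow's virtual bitmap in the shared array.
        return _h("slot", flow, j) % self.m

    def record(self, flow, element):
        # The element picks one bit of the flow's virtual bitmap and sets it.
        j = _h("elem", element) % self.s
        self.bits[self._position(flow, j)] = 1

    def estimate(self, flow):
        zeros_all = self.m - sum(self.bits)
        zeros_virtual = sum(
            1 for j in range(self.s) if self.bits[self._position(flow, j)] == 0
        )
        if zeros_all == 0 or zeros_virtual == 0:
            return float("inf")  # saturated: no estimate possible
        v_m = zeros_all / self.m       # zero fraction of the shared array (noise level)
        v_s = zeros_virtual / self.s   # zero fraction of the virtual bitmap
        return self.s * (math.log(v_m) - math.log(v_s))
```

Recording the same element twice sets the same bit, so duplicates do not change the estimate. The memory saving comes from the fact that small flows touch only a few shared bits, while each flow still gets an `s`-bit estimation range.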

(b) 96 bits per flow, 3 registers of 32 bits each. (c) 32 bits per flow, 1 register of 32 bits.

3 LogLog and HyperLogLog

LogLog [5] and HyperLogLog [9] were designed to compress the size of each register from 32 bits to 5 bits for the same estimation range of $2^{32}$. Their performance is presented in Figs. 4. The estimation accuracy of LogLog and HyperLogLog (HLL) is much improved as compared with PCSA, because smaller registers mean there are more of them under the same memory constraint, which drives the estimation variance down.
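To make the register compression concrete, here is a minimal HyperLogLog sketch. It is an illustration under stated assumptions rather than the implementation evaluated in the book: the function name `hll_estimate`, the parameter `p`, and the choice of SHA-1 truncated to 32 bits as the hash are hypothetical, and the large-range correction is omitted. Each register stores only the largest rank seen (the position of the leftmost 1-bit), which never exceeds 32, so 5 bits per register suffice.

```python
import hashlib
import math


def hll_estimate(items, p=10):
    """Estimate the number of distinct items with m = 2**p registers."""
    m = 1 << p
    registers = [0] * m
    for item in items:
        # 32-bit hash value (assumption: SHA-1 truncated to 4 bytes).
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:4], "big")
        idx = h >> (32 - p)                # first p bits select a register
        rest = h & ((1 << (32 - p)) - 1)   # remaining 32 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (32 - p) - rest.bit_length() + 1
        registers[idx] = max(registers[idx], rank)
    alpha = 0.7213 / (1 + 1.079 / m)       # bias constant, valid for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros > 0:
        return m * math.log(m / zeros)     # small-range correction (linear counting)
    return raw
```

With `p = 10` the sketch uses 1024 registers, and the standard error of HyperLogLog is roughly 1.04 / sqrt(m), about 3% here. This is the variance reduction the paragraph above attributes to having more, smaller registers.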

For comparison, we conduct the same experiments on E-CTE, and the results are depicted in Fig. 18. Owing to the status bits in E-CTE, the increase of h only slightly degrades the performance of E-CTE. ... h = 6.

Fig. 15 Impact of r on the performance of CTE, where M = 0.5 MB, b = 4, and d = 2 (x-axis: s, in units of 10^3). (a) Shows estimation results of CTE when r = 50. (b) Shows estimation results of CTE when r = 100. (c) Shows estimation results of CTE when r = 200.

... 12, and 295,451 packets, respectively.

6 Counter Tree-Based Maximum Likelihood Estimation

In this section, we provide and analyze another estimator for flow sizes called Counter Tree-based Maximum Likelihood Estimation (CTM). ... 11), the probability of $Z_i = z_i$ is

$$\mathrm{Prob}\{Z_i = z_i\} = \binom{n}{z_i} \left(\frac{k}{m}\right)^{z_i} \left(1 - \frac{k}{m}\right)^{n - z_i}.$$

The value of n is known from the Counter Tree. The values of m and k are determined by the prescribed system parameters M, b, d, and h. ... Hence, the likelihood function for observing $X_0 = x_0$, $X_1 = x_1$, ...
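As a small numerical aid, the probability above can be evaluated in log space. This sketch assumes the probability is the binomial form Binomial(n, k/m) in the symbols n, m, k, and z_i used in the text. The function name `log_noise_pmf` is hypothetical, and the code covers only this single probability, not the full CTM likelihood, which is truncated in the excerpt.

```python
import math


def log_noise_pmf(z, n, m, k):
    """Natural log of Prob{Z_i = z} under the assumed Binomial(n, k/m) model."""
    p = k / m
    # log C(n, z) via log-gamma, which avoids overflow when n is in the millions.
    log_comb = math.lgamma(n + 1) - math.lgamma(z + 1) - math.lgamma(n - z + 1)
    return log_comb + z * math.log(p) + (n - z) * math.log1p(-p)
```

Working with logarithms matters here because a likelihood function multiplies many such probabilities. Summing their logs keeps the computation within floating-point range.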

