An accurate evaluation of machine learning algorithms for flow-based P2P traffic detection

Soysal, Murat
Schmidt, Şenan Ece
Today, peer-to-peer (P2P) traffic consumes the largest fraction of network bandwidth. The files shared over P2P connections are mostly copyright protected, and P2P traffic raises issues of Quality of Service (QoS) support and billing. Hence, scalable and accurate detection of P2P traffic is a significant problem for network service providers. Flow-based detection methods use characteristics of data flows, such as the number of packets per flow, to classify traffic as P2P or non-P2P. They thereby avoid the problems of port-based and signature-based detection: P2P applications that use dynamic ports, the need to keep the signature database up to date, and encrypted packets. In this paper, a comparative evaluation of several flow-based P2P traffic detection methods that employ machine learning (ML) techniques is presented. Unlike previous work, our evaluation takes the effect of network parameters into consideration. Furthermore, a new verification approach based on custom-made data is presented. It circumvents the accuracy problems of previous verification methods, which rely on port-based or signature-based techniques to establish the reference labels for accuracy evaluation.
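
As an illustration of what flow-based classification involves, the following minimal sketch groups packets into flows by their 5-tuple, derives per-flow features such as the number of packets per flow, and labels a flow with a 1-nearest-neighbor rule. It is not one of the methods evaluated in the paper: the packet tuple layout, the feature set (packet count, mean packet size, duration), the function names, and the choice of classifier are all assumptions made for this example.

```python
from collections import defaultdict
from math import dist

# Hypothetical packet layout (an assumption for this sketch):
# (src_ip, src_port, dst_ip, dst_port, protocol, timestamp_s, size_bytes)


def flow_features(packets):
    """Group packets by 5-tuple and return
    {flow_key: (packet_count, mean_packet_size, duration_s)}."""
    flows = defaultdict(list)
    for src, sport, dst, dport, proto, ts, size in packets:
        flows[(src, sport, dst, dport, proto)].append((ts, size))

    features = {}
    for key, pkts in flows.items():
        times = [ts for ts, _ in pkts]
        sizes = [size for _, size in pkts]
        features[key] = (
            len(pkts),                # number of packets in the flow
            sum(sizes) / len(sizes),  # mean packet size in bytes
            max(times) - min(times),  # flow duration in seconds
        )
    return features


def nearest_neighbor_label(sample, labeled):
    """Return the label of the labeled feature vector closest to `sample`
    (1-nearest neighbor, Euclidean distance on unscaled features)."""
    return min(labeled, key=lambda item: dist(sample, item[0]))[1]
```

Here `labeled` is a list of `(feature_vector, label)` pairs obtained from traffic whose class is already known. The sketch leaves the features unscaled for brevity; in practice the features would normally be normalized so that no single one dominates the distance.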