An Algorithm for Privacy-Preserving Distributed User Statistics
Tschorsch, F., & Scheuermann, B.
Elsevier Computer Networks, 57(14), 2775-2787
In this paper, we propose a privacy-preserving method to determine the number of distinct users who connected to one or more entry points of a distributed Internet service with multiple service operators. The problem is motivated by the anonymization network Tor and the difficulties that arise when estimating the number of Tor users. We present a way to perform distributed user counting with accurate estimates and a high level of privacy protection, based on a probabilistic data structure. We start from a relatively naive approach and analyze the level of privacy protection that it provides. We then improve on this baseline mechanism, building on the insights gained. To assess the privacy properties of the discussed techniques, we use a novel probabilistic analysis approach that compares an attacker’s a priori and a posteriori knowledge.