Nataliia Bielova: Detecting third-party tracking and GDPR violations in Web applications

November 7, 2019 @ 3:00 pm - 4:30 pm

Third-party Web tracking has been extensively studied over the last decade. In our work, we focus on two problems: the detection of third-party tracking in Web applications and the impact of the GDPR on trackers.

First, we study the detection of third-party tracking. Most previous studies and user tools rely on filter lists. However, there has always been a suspicion that these lists miss many trackers. In this work, we propose an alternative method to detect trackers, inspired by analyzing the behavior of invisible pixels. By crawling 84,658 webpages from 8,744 domains, we find that third-party invisible pixels are widely deployed: they are present on more than 94.51% of domains and constitute 35.66% of all third-party images. We propose a fine-grained behavioral classification of tracking based on the analysis of invisible pixels. We use this classification to detect new categories of tracking and uncover new collaborations between domains on a dataset of 4.2M third-party requests. We demonstrate that two popular methods to detect tracking, based on the EasyList & EasyPrivacy lists and on the Disconnect list, miss 25.22% and 30.34% of the trackers that we detect, respectively. Moreover, we find that even if we combine all three lists, 379,245 requests originating from 8,744 domains still track users on 68.70% of websites.
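To make the measurement idea concrete, here is a minimal Python sketch of how the share of invisible pixels among third-party images could be computed from crawl records. It is an illustration only, not the speakers' implementation. The `ImageResponse` record, the rule that an invisible pixel is an image of at most 1x1 or with an empty body, and the "last two hostname labels" test for third parties are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class ImageResponse:
    """Hypothetical record of one image response observed while crawling a page."""
    page_url: str        # URL of the visited page
    image_url: str       # URL the image was fetched from
    width: int           # rendered width in pixels
    height: int          # rendered height in pixels
    content_length: int  # size of the response body in bytes


def site(url: str) -> str:
    """Crude site key: the last two labels of the hostname.

    Assumption for illustration only; it is wrong for suffixes such as
    co.uk, where a Public Suffix List lookup would be needed.
    """
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def is_third_party(resp: ImageResponse) -> bool:
    """An image is third-party if it is served from a different site than the page."""
    return site(resp.page_url) != site(resp.image_url)


def is_invisible_pixel(resp: ImageResponse) -> bool:
    """Assumed rule: at most 1x1 pixels, or an image with no content at all."""
    return (resp.width <= 1 and resp.height <= 1) or resp.content_length == 0


def invisible_pixel_share(responses: list[ImageResponse]) -> float:
    """Fraction of third-party images that are invisible pixels (0.0 if none)."""
    third_party = [r for r in responses if is_third_party(r)]
    if not third_party:
        return 0.0
    return sum(is_invisible_pixel(r) for r in third_party) / len(third_party)
```

Identifying such pixels is only the first step. Deciding whether the request behind a pixel actually tracks the user would require looking at its behavior (for example, which identifiers it carries), which this sketch does not attempt.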

Second, we analyse the impact of the GDPR on third-party trackers. The GDPR defines rights for data subjects (users) and obligations for data controllers (trackers), but it is unclear how subjects and controllers interact concretely. We investigate whether it is safe for a data subject to exercise the right of access to her own data by analysing how subject access request procedures are implemented in third-party tracking services. We observe that some trackers use unsafe or doubtful procedures to authenticate data subjects: the most common flaw is the use of authentication based on a copy of the subject’s national identity card transmitted over an insecure channel.


