Navigating Cookie Compliance Around the Globe
May 31, 2024
Information
Authors
Brian Tang, Duc Bui, Kang G. Shin
Conference
Under submission at USENIX
Blog
Intro
Online services increasingly provide users with cookie consent settings to accept or reject the cookies placed on their web browsers. Despite this increased adoption, little has been done to detect and understand the behavior of users’ cookie consent from a global perspective. These cookie consent mechanisms are notoriously inconsistent in their behavior, often violating users’ specified consent. To remedy this important oversight, we propose an end-to-end automated system, called ConsentChk, that detects and analyzes inconsistencies between a website’s cookie usage and users’ cookie consent preferences. ConsentChk uses a formal model to systematically categorize the types of cookie consent violations. It detects and analyzes cookie usage and consent preferences even on websites that do not display cookie banners for new visitors. We have conducted large-scale studies in 8 English-speaking regions across the world and analyzed discrepancies in cookie consent behavior across 1,458 globally popular websites. We find that location is one of the largest root causes of cookie consent inconsistencies: across these 1,458 sites, regions outside the GDPR contain up to 26,149 more cookie consent violations than regions within it. On average, this results in up to 20 additional cookie inconsistencies per website. Our evaluation reveals that specific details in regulations and consent management platform (CMP) implementations significantly impact cookie consent behavior. Our work has uncovered various root causes behind these persistent cookie consent inconsistencies, indicating that cookie consent is inconsistently implemented and enforced with respect to regional privacy laws. The resulting implementations produce misleading, or even deceptive, cookie consent management, highlighting the need to improve consent libraries and the clarity of privacy laws.
Design Overview
The design of ConsentChk consists of (1) a cookie consent exerciser (Appendix A) and (2) a cookie settings button detector (Section 3.3).
Automatic (Un)consent Tool: To analyze a specific cookie setting instance, its UI controls need to be mapped to the components in the analysis framework. The main manual effort is to map the HTML elements to the corresponding cookie consent categories. For each CMP, we analyze its UI variants in the cookie setting dataset. We use the Chrome DevTools to identify CSS selectors that uniquely identify UI elements on the layout. Although the identification of the mapping is done manually, we need the manual mapping only once for each of the limited number of cookie settings layouts.
Cookie Consent Preference Extractor: After setting the consent on the UI, to extract the consent of each cookie recorded by the cookie library, we extract the consent for each category and the list of cookies for each category. Combining these two lists, we get the consent for each individual cookie. For example, OneTrust stores consents of categories in the OptanonConsent cookie and the lists of cookies per cookie category in en.json.
Methodology
From the top 20k global websites in the Tranco list from November 2023 (ID: 5Y3LN), we select the 10,436 websites that have an English homepage and loaded successfully for further analysis with ConsentChk. We use the most up-to-date list at measurement time to avoid domains that no longer exist. Some websites in the list failed to load due to various issues; for example, some URLs are ad-serving domains rather than websites. The language of each website is determined by a neural-network-based language detector after converting the web pages to plain text.
Of the selected 10.4k websites, ConsentChk successfully analyzed the flow-to-consent consistency of 1,458 (13.97%) websites. OneTrust and Cookiebot are the most popular CMPs, appearing on 1,340 and 110 sites, respectively. Termly was detected on only 8 sites. From the analyzed websites, ConsentChk extracted 254,790 cookie declarations. The number of cookie declarations per website ranges from 1 to 1,601, with an average of 174.8 (SD 191.0).
We evaluate the detection performance in regions with privacy regulations that generally require user consent before data collection. We select Ireland, the UK, California, Michigan, Canada, South Africa, Singapore, and Australia as our eight measurement locations. These locations were selected because (1) the websites are displayed in English and (2) the location has a privacy framework requiring notices prior to data collection (except Michigan, which serves as a control for a US state without CCPA-like privacy laws). We measured the websites from IP addresses in these locations by using proxies running on AWS, a major cloud provider.
Discussion
The United States has the Worst Cookie Compliance. Even in California, the US state with the strictest privacy laws, the prevalence of rejected cookie usage and consent choice omissions on sites is 3.64% to 4.92% higher than in the EU. Worse still, the number of cookies involved in rejected cookie usage and consent choice omission violations is almost twice that of the UK or Ireland. This significant increase in cookie consent violations can be attributed to the fact that the CCPA does not directly state requirements for cookie consent in its articles. In contrast to the EU and UK, where fines have been imposed for GDPR violations, regulation and enforcement in California focus on the “Do Not Sell” consent mechanisms. Thus, websites load many more cookies on browsers located in the US. Additionally, website developers are less inclined to rigorously monitor the proper functionality of their cookie consent mechanisms. When accessing websites from California, ConsentChk found rejected cookie usage on 86.22% of websites and cookies omitted from the cookie consent settings on 93.47% of websites. Table 6 shows the comparison across the 8 regions.
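The two violation types named above can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the paper's formal model: cookies are matched by name only, "rejected cookie usage" means a cookie was observed although the user rejected it, and "consent choice omission" means a cookie was observed but does not appear in the consent settings at all. The `Finding` and `classify` names are hypothetical.

```python
from enum import Enum


class Finding(Enum):
    CONSISTENT = "consistent"
    REJECTED_COOKIE_USAGE = "rejected cookie usage"
    CONSENT_CHOICE_OMISSION = "consent choice omission"


def classify(observed: set[str], consent: dict[str, bool]) -> dict[str, Finding]:
    """Label each observed cookie against the user's per-cookie consent.

    `consent` maps declared cookie names to True (accepted) or False
    (rejected). Cookies absent from `consent` were never offered as a choice.
    """
    findings: dict[str, Finding] = {}
    for name in sorted(observed):
        if name not in consent:
            findings[name] = Finding.CONSENT_CHOICE_OMISSION
        elif not consent[name]:
            findings[name] = Finding.REJECTED_COOKIE_USAGE
        else:
            findings[name] = Finding.CONSISTENT
    return findings


if __name__ == "__main__":
    # Hypothetical cookie names for illustration only.
    seen = {"_ga", "session_id", "_fbp"}
    choices = {"_ga": False, "session_id": True}
    for cookie, finding in classify(seen, choices).items():
        print(f"{cookie}: {finding.value}")
```

A real check would also need to account for cookie domains and paths, which this name-only comparison ignores.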
Cookie Consent Violations are Much More Prevalent in the Non-GDPR Regions. Interestingly, non-European countries like Singapore and South Africa have more violations than the EU and UK, but fewer violations than the US. For example, Michigan contains up to 26,149 more cookie consent violations than regions covered by the GDPR, whereas Singapore and South Africa contain 11,019 and 14,002 additional cookie consent violations, respectively. This gap in cookie compliance is likely a result of more first- and third-party cookies being loaded outside of the EU, due to geolocation-based compliance features on CMPs.
We have developed a browser extension, called ConsentEnforcer, to help end-users audit and enforce their cookie preferences on the websites they visit. The extension comprises a cookie auditor and a consent enforcer. The auditor collects and displays the detected flow-to-preference inconsistencies and the categories of the cookie receivers on a website. The enforcer applies the user’s rejection by removing rejected cookies from HTTP(S) requests and responses. ConsentEnforcer blocks only the cookies within the scope of the user’s consent and leaves the behavior of all other cookies unchanged, to avoid disrupting the user experience. The cookie flows are extracted in real time and compared with the consent library’s cookie declarations to detect cookie consent violations. Only the cookies that are declared but incorrectly enforced by the website are blocked. Rather than deleting cookies from the browser, the extension intercepts any cookies requested or sent in the network traffic.
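To make the header-level enforcement concrete, here is a minimal Python sketch of stripping rejected cookies from a `Cookie` request header and dropping the matching `Set-Cookie` response headers. It is not ConsentEnforcer's code: the actual extension runs in the browser, whereas this sketch only illustrates the rewriting logic on plain strings. The function names and sample cookies are hypothetical, and cookies are again matched by name only.

```python
def strip_request_cookies(cookie_header: str, rejected: set[str]) -> str:
    """Return the Cookie header value with rejected cookies removed."""
    kept = []
    for pair in cookie_header.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        name = pair.split("=", 1)[0].strip()
        if name not in rejected:
            kept.append(pair)
    return "; ".join(kept)


def filter_set_cookie(set_cookie_headers: list[str], rejected: set[str]) -> list[str]:
    """Drop Set-Cookie header values that would set a rejected cookie.

    Each Set-Cookie value starts with "name=value", followed by attributes.
    """
    kept = []
    for header in set_cookie_headers:
        name = header.split("=", 1)[0].strip()
        if name not in rejected:
            kept.append(header)
    return kept


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    rejected = {"_ga"}
    print(strip_request_cookies("session_id=abc; _ga=GA1.2; theme=dark", rejected))
    print(filter_set_cookie(["_ga=GA1.2; Path=/; Secure", "session_id=abc; HttpOnly"], rejected))
```

Because only the headers are rewritten, cookies already stored in the browser remain untouched, which matches the description above of intercepting traffic rather than deleting cookies.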
Citation
@inproceedings{tang2025cookie,
title={Navigating Cookie Compliance Around the Globe},
author={Tang, Brian and Bui, Duc and Shin, Kang G.},
booktitle={25th Privacy Enhancing Technologies Symposium},
year={2025},
}