Back to basics: what is a cookie?

Given the Irish DPA’s recent investigation into whether Yahoo!’s use of cookies complies with GDPR, MOW believes it is important for people to better understand the common web technology at the heart of the investigation.

Cookies have a functional role in data-driven business processes but are also fundamental to the architecture of the web. Whilst most of us are familiar with cookies and their use in digital advertising, the detail is mysterious to most, and misinformation and misunderstanding abound.

Just what is a “cookie” and how is it used by digital publishers to operate and grow their business? 

What are cookies?

Cookies are a web technology that enables remote servers to store a small file (called a cookie file) on a user’s local device and to read it back on later visits.

Web standards define how a cookie is set and sent back to the server, but not what it contains. The contents are specific to each organization that sets, reads, updates or deletes these files.
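To make the mechanics concrete, here is a minimal sketch in Python of how a server might set a cookie and read it back. It uses the standard library's http.cookies module. The cookie name visitor_id, the 30-day lifetime and the helper function names are assumptions chosen for illustration, not something any particular publisher is known to use.

```python
import secrets
from http.cookies import SimpleCookie
from typing import Optional, Tuple


def build_set_cookie_header(name: str = "visitor_id") -> Tuple[str, str]:
    """Return (identifier, value for a Set-Cookie response header)."""
    # A random identifier: it carries no name, email or other personal detail.
    identifier = secrets.token_hex(16)
    cookie = SimpleCookie()
    cookie[name] = identifier
    cookie[name]["path"] = "/"
    cookie[name]["max-age"] = 60 * 60 * 24 * 30  # hypothetical 30-day lifetime
    cookie[name]["secure"] = True
    cookie[name]["httponly"] = True
    return identifier, cookie[name].OutputString()


def read_cookie(cookie_header: str, name: str = "visitor_id") -> Optional[str]:
    """Extract one cookie value from the Cookie request header, if present."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(name)
    return morsel.value if morsel is not None else None
```

The server sends the second return value in a Set-Cookie response header; on later requests the browser returns the name and value in a Cookie request header, which read_cookie parses.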

Most frequently the cookies contain random identifiers not linked to a specific individual’s identity, just like those used by Apple in its Siri and News+ services (“Apple News delivers personalised content without knowing who you are. The content you read is associated with a random identifier, not your Apple ID”, Privacy – Apple).

The identifiers stored in cookies enable businesses to remember information associated with a user’s web-enabled device to facilitate necessary consumer-facing business functions such as remembering items added to a shopping basket, providing access to and displaying an email only for a specific registered user, or personalizing content.

The same identifiers stored in the same cookie files also enable businesses to conduct necessary business-facing functions such as fraud detection, frequency capping of advertising, attributing credit of advertising to business outcomes and billing.

The information linked to the random identifiers is most frequently stored on each business’s own server. This means the end-user’s device is not burdened with unnecessary processing costs.
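As a rough illustration of this split, the sketch below keeps a shopping basket on the server, keyed only by the random identifier held in the cookie. The BasketStore class and its method names are hypothetical, and a real service would use a database rather than an in-memory dictionary.

```python
from collections import defaultdict
from typing import Dict, List


class BasketStore:
    """Server-side baskets keyed by the random identifier from the cookie."""

    def __init__(self) -> None:
        self._baskets: Dict[str, List[str]] = defaultdict(list)

    def add_item(self, cookie_id: str, item: str) -> None:
        # The device only holds cookie_id; the basket itself lives here.
        self._baskets[cookie_id].append(item)

    def items(self, cookie_id: str) -> List[str]:
        # Return a copy so callers cannot mutate the stored basket.
        return list(self._baskets.get(cookie_id, []))
```

The browser stores nothing but the identifier, so the basket survives page reloads without the device doing any of the bookkeeping.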

What is their function in online advertising?

Marketers want to focus their limited advertising budget to place their paid messaging in front of audiences that are most likely to respond. Data is collected and sold for this purpose. 

The sale of data is never about the “purchase of cookies to gain insight into websites you visit” as was stated in a Techzine article reporting on Yahoo!’s violations. Marketers have no interest in tracking every website you visit.

Prior online activity is used for limiting exposure (aka “frequency capping”, which limits the number of ads shown to the same random identifier), billing purposes (to ensure marketers pay publishers the appropriate amount for the total ads delivered), and detecting publisher fraud (to ensure they do not pay for ads delivered to robot scripts that request publisher pages).
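Frequency capping in particular is easy to sketch. The example below counts impressions per random identifier and campaign and stops serving once a limit is reached. The FrequencyCapper class, the cap of three and the campaign identifiers are assumptions for illustration only; production ad servers typically also apply time windows.

```python
from collections import Counter
from typing import Tuple


class FrequencyCapper:
    """Limit how often one campaign is shown to one random identifier."""

    def __init__(self, max_impressions: int = 3) -> None:
        self.max_impressions = max_impressions
        self._counts: Counter = Counter()

    def should_serve(self, cookie_id: str, campaign_id: str) -> bool:
        key: Tuple[str, str] = (cookie_id, campaign_id)
        if self._counts[key] >= self.max_impressions:
            return False  # cap reached: do not show this ad again
        self._counts[key] += 1  # record the impression being served
        return True

    def impressions(self, cookie_id: str, campaign_id: str) -> int:
        # The same counts can feed billing: total ads actually delivered.
        return self._counts[(cookie_id, campaign_id)]
```

Note that nothing in the counter requires knowing who the user is; the identifier only has to be stable across requests.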

Importantly, marketers do not need to know user identity to tailor advertising to potentially interested users. As outlined above, cookies contain random identifiers not linked to a specific individual’s identity and are used by Apple in Apple News to “deliver personalised content without knowing who you are” (Privacy – Apple).  If random identifiers are privacy secure enough for Apple, they must be privacy secure enough for everyone else, right?

Are cookies unsafe?

The most common and extreme misconception about cross-site information matching using cookies is that the user identity is akin to personal identity. As stated, marketers do not need or want to collect identifiable personal information about users. Personal information is safeguarded by a process of pseudonymisation, whereby information relating to an individual’s identity is replaced or removed.
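One possible way to pseudonymise a record, shown here only as a sketch, is to replace the identifying field with a keyed hash and drop other direct identifiers. The field names email and name, the pseudonymise function and the choice of HMAC-SHA256 are assumptions for illustration; the article does not say which technique any given business uses.

```python
import hashlib
import hmac
from typing import Dict, Iterable


def pseudonymise(
    record: Dict[str, str],
    secret_key: bytes,
    id_field: str = "email",
    drop_fields: Iterable[str] = ("name",),
) -> Dict[str, str]:
    """Replace the identifying field with a keyed hash and remove the rest."""
    dropped = set(drop_fields) | {id_field}
    # Keep only the fields that do not directly identify the person.
    result = {k: v for k, v in record.items() if k not in dropped}
    # Replace the identifier with a pseudonym derived from a secret key.
    result["pseudonym"] = hmac.new(
        secret_key, record[id_field].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return result
```

The same input and key always yield the same pseudonym, so records can still be matched to each other, while anyone without the key cannot easily work back to the original value.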

The second misconception is that first-party, or in-house, data handling is safer than data exchanges between third parties using cookies. The UK ICO and CMA have rejected this distinction on several occasions – see the CMA/ICO joint statement (19 May 2021), paragraphs 77-79. The UK’s Supreme Court came to the same conclusion in Lloyd v Google (ironically, Google argued in this case that there is nothing inherently bad about a cookie from a third party). See the judgment linked here. It is also worth noting that the largest fines for violating an individual’s privacy rights under GDPR have been handed to “first parties”. See our post on Google’s continuing privacy woes.

Indeed, it could be argued that it is far simpler to regulate ad-driven data exchanges between third parties than information handling within a single domain. In the latter case, uncovering evidence of data misuse requires an investigation and orders to force open company books, a far less efficient process than simply monitoring information exchanges to ensure the necessary safeguards are in place to protect user identity. Google’s $391.5 million settlement came following a hugely labour-intensive four-year investigation, for instance.

So, in short, cookies are part of the web architecture, without which the internet would not properly function. Cookies are not in themselves harmful. Furthermore, contrary to propaganda peddled by those who have the most to gain from reduced interoperation, the data exchanges that they facilitate are far easier to scrutinise than supposedly “safer” first-party data.


Header image courtesy of public domain pictures (licensed for free under the public domain pictures license).