Private Click Measurement

Draft Community Group Report

This version:
https://privacycg.github.io/private-click-measurement/
Issue Tracking:
GitHub
Inline In Spec
Editor:
(Apple Inc.)

Abstract

This specification defines a privacy-preserving way to attribute a conversion, such as a purchase or a sign-up, to a previous ad click.

Status of this document

This specification is intended to be merged into the HTML Living Standard. It is neither a WHATWG Living Standard nor is it on the standards track at W3C.

It was published by the Privacy Community Group. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.

1. Introduction

This section is non-normative.

A popular business model for the web is to get attribution and payment for conversions, for instance purchases or sign-ups, which result from a click on an ad. Traditionally, such attribution has been facilitated by user-identifying cookies sent in third-party HTTP requests to the click source. However, the same technology can be and has been used for privacy-invasive cross-site tracking of users.

The technology described in this document is intended to allow for ad click attribution while disallowing arbitrary cross-site tracking.

1.1. Goals

1.2. Terminology

ad click

This document uses the term “ad click” for any kind of user gesture on an ad that invokes the navigation to a link destination, such as a click, a tap, or activation through an accessibility tool.

conversion

A user activity that is notable such as a purchase, a sign-up to a service, or the submission of personal information such as an email address.

The four parties involved in this technology are:

user

They click on an ad, end up on a destination website, and perform what’s deemed to be a conversion, such as a purchase.

user agent

The web browser that acts on behalf of the user and facilitates ad click attribution.

ad click source

The first-party website where the user clicks on the ad.

ad click destination

The destination website where the conversion happens.

The data consumed by the user agent to support ad click attribution is:

ad campaign id

A six-bit decimal value for an ad campaign associated with the ad click destination. This means support for 64 concurrent ad campaigns per ad click destination on the ad click source. Example: merchant.example can run up to 64 concurrent ad campaigns on search.example. The valid decimal values are 00 to 63.

ad attribution data

A six-bit decimal value encoding the details of the attribution. This data may contain things like specific steps in a sales funnel or the value of the sale in buckets, such as less than $10, between $10 and $50, between $51 and $200, above $200, and so on. The valid decimal values are 00 to 63.

ad attribution priority

An optional six-bit decimal value encoding the priority of the attribution. The priority is only intended for the user agent to be able to pick the most important attribution request if there are multiple. One such case may be after the user has taken steps 1 through 3 in a sales funnel and the third step is the most important to get attribution for. The valid decimal values are 00 to 63.
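As an illustration of how an ad click destination might pack a sale value into ad attribution data, the following minimal sketch maps the example buckets above onto the values 00 to 03. The function name and the bucket-to-value mapping are assumptions made for this example; the specification does not prescribe any particular encoding, and the treatment of amounts between $50 and $51 is likewise a choice made here.

```python
def sale_value_bucket(amount_usd: float) -> str:
    """Map a sale value to a two-digit string in the range "00" to "03".

    Hypothetical encoding based on the example buckets in the text:
    less than $10, $10 to $50, above $50 up to $200, and above $200.
    """
    if amount_usd < 10:
        index = 0
    elif amount_usd <= 50:
        index = 1
    elif amount_usd <= 200:
        index = 2
    else:
        index = 3
    # Six-bit decimal values are always written with two digits.
    return f"{index:02d}"
```

With four buckets only 4 of the 64 available values are used, which leaves room to combine the bucket with other details, such as a sales funnel step.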

1.3. A High Level Scenario

A high level example of a scenario where the described technology is intended to come into play is this:

  1. A user searches for something on search.example's website.

  2. The user is shown an ad for a product and clicks it.

  3. The ad click source, search.example, informs the user agent of the ad campaign id and the intended ad click destination (see § 2 Ad Click Source Link Format).

  4. The user agent navigates and takes note that the user landed on the intended ad click destination.

  5. The user’s activity on the ad click destination leads to a conversion.

  6. A third-party HTTP request is made on the ad click destination website to https://search.example/.well-known/ad-click-attribution

  7. The user agent checks for pending ad click attributions for the ad click source-destination pair and if there’s a hit, makes or schedules an HTTP request to https://search.example/.well-known/ad-click-attribution with the ad attribution data.

    One thing to consider here is whether there should be an option to send the ad attribution data to the ad click destination too.

2. Ad Click Source Link Format

This specification adds two attributes to the HTMLAnchorElement interface. Authors can use these attributes in HTML content like so (where 17 is an ad campaign id and https://destination.example/ is an ad click destination):

<a adcampaignid="17" addestination="https://destination.example/">

Formally:

partial interface HTMLAnchorElement {
    [CEReactions] attribute DOMString adCampaignId;
    [CEReactions] attribute USVString adDestination;
};

The IDL attributes adCampaignId and adDestination must reflect the adcampaignid and addestination content attributes, respectively.

Should these attributes be on HTMLHyperlinkElementUtils instead? <https://github.com/privacycg/private-click-measurement/issues/1>

If an element with such attributes triggers a top frame navigation that lands, possibly after HTTP redirects, on the ad click destination, the user agent stores the request for ad click attribution as the tuple ( ad click source, ad click destination, ad campaign id ). If any of the conditions do not hold, such as the ad campaign id not being a valid six-bit decimal value, the request for ad click attribution is ignored.
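The storage step described above can be sketched as follows. This is a non-normative illustration: the names ClickRequest, host_of, and record_ad_click are hypothetical, and comparing hostnames exactly is a simplification standing in for whatever site comparison a user agent actually performs.

```python
import re
from typing import NamedTuple, Optional, Set
from urllib.parse import urlsplit

# Two ASCII digits in the range 00 to 63.
SIX_BIT = re.compile(r"(?:[0-5][0-9]|6[0-3])")


class ClickRequest(NamedTuple):
    source: str
    destination: str
    campaign_id: str


def host_of(url: str) -> Optional[str]:
    """Return the hostname of a URL (a simplified stand-in for its site)."""
    return urlsplit(url).hostname


def record_ad_click(store: Set[ClickRequest], source_url: str,
                    addestination: str, adcampaignid: str,
                    landed_url: str) -> Optional[ClickRequest]:
    """Store (source, destination, campaign id), or ignore the request."""
    # The ad campaign id must be a valid six-bit decimal value.
    if SIX_BIT.fullmatch(adcampaignid) is None:
        return None
    # The navigation, after any redirects, must land on the declared destination.
    destination = host_of(addestination)
    if destination is None or destination != host_of(landed_url):
        return None
    request = ClickRequest(host_of(source_url), destination, adcampaignid)
    store.add(request)
    return request
```

Returning None models "the request for ad click attribution is ignored": nothing is stored and the navigation itself is unaffected.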

3. Legacy Triggering of Ad Click Attribution

Triggering of attribution is what happens when there is a conversion.

Existing ad click attribution relies on third-party HTTP requests to the ad click source and these requests are typically the result of invisible image elements or "tracking pixels" placed in the DOM solely to fire HTTP GET requests. To allow for a smooth transition from these old pixel requests to the new Ad Click Attribution technology, we propose a server-side redirect to a well-known location as a legacy triggering mechanism. [WELL-KNOWN]

To trigger an ad click attribution request, the top frame context of an ad click destination page needs to perform the following:

  1. An HTTP GET request to the ad click source. This HTTP request may be the result of an HTTP redirect, such as an HTTP 302 redirect from searchUK.example to search.example. The use of HTTP GET is intentional in that existing “pixel requests” can be repurposed for this and in that the HTTP request should be idempotent.

  2. A secure HTTP GET redirect to the URL returned by generating an ad click attribution URL for ad click source with ad attribution data and ad attribution priority. This ensures that the ad click source is in control of who can trigger click attribution on its behalf and optionally what the priority of the attribution is. If the user agent gets such an HTTP request, it will check its stored requests for click attribution, and if there’s a match for (ad click source, ad click destination), it will make or schedule a secure HTTP POST request to the URL returned by generating an ad click attribution URL for ad click source with ad attribution data and ad campaign id, with the Referer header set to ad click destination. The use of HTTP POST is intentional in that it differs from the HTTP GET redirect used to trigger the attribution and in that it is not expected to be idempotent. If any of the conditions do not hold, such as the ad attribution data not being a valid six-bit decimal value, the request for ad click attribution is ignored.

    ISSUE: We may have to add a nonce to the HTTP POST request to prohibit double counting in cases where the user agent decides to retry the request.

If there are multiple ad click attribution requests for the same ad click source-destination pair, the one with the highest ad attribution priority will be the one sent and the rest discarded.
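A minimal sketch of how a user agent might process the redirects described in the two steps above and pick the attribution to report. The names Trigger, Report, parse_trigger, and build_report are hypothetical, sites are represented by bare hostnames, and treating an absent priority as 0 is an assumption, since the text only says the priority is optional.

```python
import re
from typing import Dict, List, NamedTuple, Optional, Tuple

SIX_BIT = r"(?:[0-5][0-9]|6[0-3])"
# Path of the well-known redirect: attribution data, then an optional priority.
TRIGGER_PATH = re.compile(
    rf"/\.well-known/ad-click-attribution/({SIX_BIT})(?:/({SIX_BIT}))?"
)


class Trigger(NamedTuple):
    attribution_data: str
    priority: int


class Report(NamedTuple):
    method: str
    url: str
    referer: str


def parse_trigger(path: str) -> Optional[Trigger]:
    """Parse a redirect path; invalid six-bit values cause it to be ignored."""
    match = TRIGGER_PATH.fullmatch(path)
    if match is None:
        return None
    # Assumption: a missing priority is treated as the lowest priority, 0.
    return Trigger(match.group(1), int(match.group(2) or "0"))


def build_report(pending: Dict[Tuple[str, str], str], source: str,
                 destination: str, trigger_paths: List[str]) -> Optional[Report]:
    """Return the single POST to send for a source-destination pair, if any."""
    campaign_id = pending.get((source, destination))
    if campaign_id is None:
        return None  # No stored ad click for this pair.
    triggers = [t for t in map(parse_trigger, trigger_paths) if t is not None]
    if not triggers:
        return None
    # Only the highest-priority trigger is sent; the rest are discarded.
    best = max(triggers, key=lambda t: t.priority)
    url = (f"https://{source}/.well-known/ad-click-attribution/"
           f"{best.attribution_data}/{campaign_id}")
    return Report("POST", url, f"https://{destination}/")
```

Note that the priority appears only in the redirect URL that the user agent intercepts. The outgoing POST carries the ad attribution data and the ad campaign id.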

This needs to be reworked to monkeypatch HTML’s "follows a hyperlink" algorithm.

4. Ad Click Attribution URLs

Clients generate an ad click attribution URL for source with attribution data and additional data by following these steps:

  1. Let url be the result of concatenating the strings « ".well-known", "ad-click-attribution", attribution data, additional data » using the separator U+002F (/).

  2. Return the result of calling URL(url, base) with url url and base source.
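The two steps above can be sketched with Python's standard library. Here urljoin stands in for the URL(url, base) constructor, and the function name is chosen for this example. The sketch assumes source is an origin-style URL with a trailing slash, such as https://search.example/.

```python
from urllib.parse import urljoin


def generate_ad_click_attribution_url(source: str, attribution_data: str,
                                      additional_data: str) -> str:
    """Build the well-known attribution URL for an ad click source."""
    # Step 1: join the four strings with U+002F (/) to form a relative URL.
    url = "/".join(
        [".well-known", "ad-click-attribution", attribution_data, additional_data]
    )
    # Step 2: resolve the relative URL against the source as base.
    return urljoin(source, url)
```

For example, with source https://search.example/, attribution data "20", and ad campaign id "17" as the additional data, the result is https://search.example/.well-known/ad-click-attribution/20/17.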

5. Click source/destination pairs

An ad click source-destination pair is a tuple of two sites: (source, destination).

6. Six-bit decimal values

A six-bit decimal value is a string for which the extract a six-bit decimal value algorithm does not return failure.

The strings "00" and "63" are both six-bit decimal values, whereas "7", "98", and "!!11one" are not.

Clients extract a six-bit decimal value from a string string by running these steps:

  1. If string’s length is not 2, return failure.

  2. Let tens be the code unit at position 0 within string, and ones be the code unit at position 1.

  3. If tens is less than U+0030 (0) or greater than U+0036 (6), return failure.

  4. If ones is less than U+0030 (0) or greater than U+0039 (9), return failure.

  5. If tens is U+0036 (6) and ones is greater than U+0033 (3), return failure.

  6. Return (tens − 0x0030) × 10 + (ones − 0x0030).
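The extraction steps translate almost line for line into code. In this minimal sketch, failure is represented by None, which is a choice made for the example. Python strings are indexed by code point rather than by UTF-16 code unit, but the outcome is the same because any character outside the ASCII digits leads to failure either way.

```python
from typing import Optional


def extract_six_bit_decimal_value(string: str) -> Optional[int]:
    """Return the integer 0 to 63 encoded by string, or None for failure."""
    # Step 1: the string must be exactly two characters long.
    if len(string) != 2:
        return None
    # Step 2: read the two characters as numeric code values.
    tens, ones = ord(string[0]), ord(string[1])
    # Step 3: tens must be in U+0030 (0) to U+0036 (6).
    if tens < 0x30 or tens > 0x36:
        return None
    # Step 4: ones must be in U+0030 (0) to U+0039 (9).
    if ones < 0x30 or ones > 0x39:
        return None
    # Step 5: values 64 to 69 do not fit in six bits.
    if tens == 0x36 and ones > 0x33:
        return None
    # Step 6: subtract U+0030 from each digit and combine.
    return (tens - 0x30) * 10 + (ones - 0x30)
```

Applied to the examples above, "00" and "63" yield 0 and 63, while "7", "98", and "!!11one" all yield failure.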

7. Modern Triggering of Ad Click Attribution

We envision a JavaScript API that is called on an ad click destination page as a modern means to trigger attribution at a conversion. This API call removes the need for third-party "pixels", which benefits ad click sources that do not want to be third-party resources.

8. Privacy Considerations

The total entropy in ad click attribution HTTP requests is 12 bits (6+6), which means 4096 unique values can be managed for each ad click source-destination pair.

With no other means of cross-site tracking, neither the ad click source nor the ad click destination will know whether the user has clicked an associated ad or not when a conversion happens. This restricts the entropy under control to 6 bits at any moment.

Even if the ad click source and/or the ad click destination were to be in control of both six-bit decimal values, the total is 12 bits or 4096 unique values.
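For reference, the arithmetic behind these figures, written out as a short worked calculation (an illustration only, not part of the normative text):

```latex
\begin{align*}
\text{one six-bit decimal value:}\quad
  & 2^{6} = 64 \text{ values (00 to 63)} \\
\text{ad campaign id and ad attribution data together:}\quad
  & 2^{6} \cdot 2^{6} = 2^{6+6} = 2^{12} = 4096 \text{ values}
\end{align*}
```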

We believe these restrictions avoid general cross-site tracking while still providing useful ad click attribution at web scale.

In the interest of user privacy, user agents are encouraged to deploy the following restrictions to when and how they make secure HTTP POST requests to an Ad Click Attribution URL:

9. Performance Considerations

The user agent may want to limit the amount of stored ad click attribution data. Limitations can be set per ad click source, per ad click destination, and on the total amount of ad click attribution data.

10. IANA considerations

10.1. The ad-click-attribution well-known URI

This document defines the “.well-known” URI ad-click-attribution. This registration will be submitted to the IESG for review, approval, and registration with IANA using the template defined in [WELL-KNOWN] as follows:

URI suffix

ad-click-attribution

Change controller

W3C

Specification document(s)

This document is the relevant specification. (See § 4 Ad Click Attribution URLs.)

Related information:

None.

11. Related Work

The Improving Web Advertising Business Group has related work that started in January 2019. It similarly uses a .well-known path with no cookies. [METRICS]

Brave published a security and privacy model for ad confirmations in March 2019. [CONFIRMATIONS]

Google Chrome published an explainer document on May 22, 2019, for a very similar technology. They cross-reference this spec in its earlier form on the WebKit wiki. [EVENT-LEVEL]

12. Acknowledgements

Thanks to Brent Fulgham, Ehsan Akhgari, Erik Neuenschwander, Jason Novak, Maciej Stachowiak, Mark Xue, and Steven Englehardt for their feedback on this proposal.

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[DOM]
Anne van Kesteren. DOM Standard. Living Standard. URL: https://dom.spec.whatwg.org/
[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[INFRA]
Anne van Kesteren; Domenic Denicola. Infra Standard. Living Standard. URL: https://infra.spec.whatwg.org/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[URL]
Anne van Kesteren. URL Standard. Living Standard. URL: https://url.spec.whatwg.org/
[WebIDL]
Boris Zbarsky. Web IDL. 15 December 2016. ED. URL: https://heycam.github.io/webidl/
[WELL-KNOWN]
M. Nottingham. Well-Known Uniform Resource Identifiers (URIs). May 2019. Proposed Standard. URL: https://tools.ietf.org/html/rfc8615

Informative References

[CONFIRMATIONS]
Security and privacy model for ad confirmations. URL: https://github.com/brave/brave-browser/wiki/Security-and-privacy-model-for-ad-confirmations
[EVENT-LEVEL]
Click Through Conversion Measurement Event-Level API Explainer. URL: https://github.com/WICG/conversion-measurement-api
[METRICS]
Privacy protecting metrics for web audience measurement. URL: https://github.com/w3c/web-advertising/blob/master/admetrics.md

IDL Index

partial interface HTMLAnchorElement {
    [CEReactions] attribute DOMString adCampaignId;
    [CEReactions] attribute USVString adDestination;
};

Issues Index

One thing to consider here is whether there should be an option to send the ad attribution data to the ad click destination too.
Should these attributes be on HTMLHyperlinkElementUtils instead? <https://github.com/privacycg/private-click-measurement/issues/1>
This needs to be reworked to monkeypatch HTML’s "follows a hyperlink" algorithm.