Friday, April 10, 2020

Leakage paths for the Apple / Google Bluetooth tracing system

Overview of the tracing system

I read through Apple’s docs for their contact tracing partnership with Google. This article summarizes how the system works and where it could leak data. This isn’t totally my area, so send me corrections if I’m wrong.

The tracing system works using Bluetooth advertising packets. The page describing them says:

Advertising allows devices to broadcast information defining their intentions.

The A/G tracing system populates these packets with a unique identifier called a Rolling Proximity Identifier (henceforth RPI).

To make it hard to track someone over time, the RPIs are opaque and don’t contain any kind of user ID. They’re created using a one-way hash from a Daily Tracing Key, or DTK.

The DTK is in turn created via a key derivation function from a root Tracing Key (TK) that’s unique to your device and never leaves it. (I’m not sure why DTKs are derived rather than random.)

To summarize:

TK -> DTK (daily) -> RPI (every 10 minutes)
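The chain above can be sketched in a few lines of Python. This is a rough sketch under assumptions, not a reference implementation: as I recall the draft cryptography spec, the DTK comes from HKDF and the RPI from a truncated HMAC, both over SHA-256, but the labels `CT-DTK` and `CT-RPI`, the byte layouts and the 16-byte lengths used here should be treated as assumptions and checked against the spec. The function names are made up for illustration.

```python
import hashlib
import hmac
import os
import struct


def hkdf_sha256(key: bytes, info: bytes, length: int) -> bytes:
    """HKDF (RFC 5869) with SHA-256 and an empty salt, for length <= 32."""
    assert length <= 32
    # Extract step: an absent salt is treated as 32 zero bytes.
    prk = hmac.new(b"\x00" * 32, key, hashlib.sha256).digest()
    # Expand step: one block is enough for outputs up to 32 bytes.
    return hmac.new(prk, info + b"\x01", hashlib.sha256).digest()[:length]


def daily_tracing_key(tk: bytes, day_number: int) -> bytes:
    # Assumed layout: label followed by the day number as a
    # little-endian 32-bit integer.
    info = b"CT-DTK" + struct.pack("<I", day_number)
    return hkdf_sha256(tk, info, 16)


def rolling_proximity_id(dtk: bytes, interval: int) -> bytes:
    # interval counts 10-minute windows within the day: 0..143.
    msg = b"CT-RPI" + bytes([interval])
    return hmac.new(dtk, msg, hashlib.sha256).digest()[:16]


tk = os.urandom(32)                            # root key, stays on the device
dtk = daily_tracing_key(tk, day_number=18362)  # arbitrary example day
rpis = [rolling_proximity_id(dtk, j) for j in range(144)]
```

The point to notice is the direction of the arrows: anyone holding a DTK can regenerate all 144 RPIs for that day, but an RPI on its own reveals neither the DTK nor the TK.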

If someone tests positive, they upload their recent DTKs to a database. Other devices periodically download the last N days of positive keys, run them against their own list of observed RPIs with timestamps (because DTK + time + 1-way hash = RPI), and get a list of observed RPIs that belong to sick people.
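Here is a minimal sketch of that matching step, assuming the RPI is a truncated HMAC of the DTK and a 10-minute interval number (the `CT-RPI` label and the helper names are assumptions for illustration). It ignores timestamps and simply regenerates every identifier a published key could have produced.

```python
import hashlib
import hmac


def rpis_for_key(dtk: bytes) -> set:
    """All 144 ten-minute identifiers derivable from one published daily key."""
    return {
        hmac.new(dtk, b"CT-RPI" + bytes([j]), hashlib.sha256).digest()[:16]
        for j in range(144)
    }


def exposures(observed: set, published_dtks: list) -> set:
    """Return the locally observed RPIs that belong to any published key."""
    hits = set()
    for dtk in published_dtks:
        hits |= observed & rpis_for_key(dtk)
    return hits


# Hypothetical data: one sick device and one healthy device were nearby.
sick_dtk, healthy_dtk = b"S" * 16, b"H" * 16
sick_rpi = sorted(rpis_for_key(sick_dtk))[0]
healthy_rpi = sorted(rpis_for_key(healthy_dtk))[0]
observed = {sick_rpi, healthy_rpi}

matches = exposures(observed, [sick_dtk])
```

Only the sick device's identifier ends up in `matches`. Nothing in this step requires being a phone running the official app: anyone who logged RPIs and can download the published keys can run the same comparison, which is what the leakage paths below rely on.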

As far as I know, nobody is talking about making this mandatory: either to participate in tracing or to report when you’re sick. I think that’s a good thing. As with all non-mandatory systems, the most effective legislative path to making them mandatory is to normalize them first, convince a majority, and then make them mandatory later.

Also, there are leakage paths. Read on:

Find out if someone specific is sick

If I’m targeting an individual, I can capture their RPIs pretty easily and get notified that they’re sick.

If I operate an office building, I can pretty easily narrow down an RPI to the N people entering the building at once. I don’t know if there are Bluetooth scrapers for employers; I’d be shocked if there aren’t. I think this is what the Estimote guy, Steve Cheney, is building this month.

If this continues for a while, and if sickness status is worth any money (the latter is a big if), we’ll see darkweb marketplaces where you can buy an individual’s RPIs.

My point is that even if this is fine for emergencies (another big if), don’t make the mistake of letting it be normalized for non-epidemic times or seasonal flu.

Use stationary beacons to track someone’s travel path

Let’s say I had one iPhone per subway entrance in NYC, just sitting there collecting everyone’s RPIs. When someone tests positive and publishes their keys, I can then track their travel path. I won’t know who they are, but I can at least grab aggregate information about where coronavirus getters travel.

Does this sound like a bridge too far? It isn’t: passive Bluetooth observation stations are already ubiquitous.
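To make the stationary-beacon scenario concrete, here is a small sketch of what the observer's post-processing could look like. Everything in it is hypothetical: the `Sighting` record, the station labels and the times are invented, and the RPI derivation (truncated HMAC with a `CT-RPI` label) is an assumption about the spec rather than a copy of it.

```python
import hashlib
import hmac
from typing import NamedTuple


class Sighting(NamedTuple):
    station: str   # hypothetical label for where the listener sits
    minute: int    # minutes since midnight when the packet was heard
    rpi: bytes     # identifier copied from the advertising packet


def rpi(dtk: bytes, interval: int) -> bytes:
    msg = b"CT-RPI" + bytes([interval])
    return hmac.new(dtk, msg, hashlib.sha256).digest()[:16]


def travel_path(log, dtk):
    """Stations where a published key's identifiers were heard, in time order."""
    mine = {rpi(dtk, j) for j in range(144)}
    ordered = sorted(log, key=lambda s: s.minute)
    return [(s.minute, s.station) for s in ordered if s.rpi in mine]


sick, other = b"S" * 16, b"O" * 16
log = [
    Sighting("Union Sq", 545, rpi(sick, 54)),
    Sighting("Bedford Av", 521, rpi(sick, 52)),
    Sighting("Union Sq", 530, rpi(other, 53)),
]

path = travel_path(log, sick)
```

Before the key is published, the two sightings of the sick device look like unrelated random identifiers. After publication they link up into an ordered path, while the other device's sighting stays unlinkable.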

Increased hit rate of stationary / marketing beacons

If everyone has Bluetooth on all the time for health reasons, this is like duck season for companies that already operate consumer surveillance platforms targeting Bluetooth / Wi-Fi. It’s a bait ball.

These companies aren’t signatory to any special privacy rules that affect this emergency, and in fact have relatively few privacy obligations generally because they don’t have a contract with the owners of the phones they’re targeting.

Not only will their data be much richer, not only can they now merge in people’s epidemiological data, not only do they have expertise in de-anonymizing Bluetooth traces, but the data that they collect now will enrich their database for a long time. Understanding what their normally sparse DB looks like at, say, 80% population adherence will allow them to beef up their inference models. And this is a capability that will only slowly decay as consumer behavior and devices switch out.

Leakage of information when someone isn’t sick

It seems like this isn’t possible given their spec, except:

  • You still have to phone home once a day to get a list of sick people’s tokens
  • The system can encourage somebody to go to a hospital and get tested, at which point an institution can collect a DNA sample.

Am I paranoid? Maybe, but if the question is ‘can you use this system to make someone think they need to go to the hospital based on approximate location’, the answer is yes.

Fraud resistance

This isn’t a leakage path, but I’m wondering what stops someone from sending fake positive results that cause overloads of our testing capacity as a low-grade form of terrorism.

Will we offer a ‘signed testing payload’ from labs? Will I share my DTKs with labs? The spec doesn’t say.
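One shape a ‘signed testing payload’ could take is sketched below. This is purely illustrative and not something the spec describes: it assumes the lab and the key server share a secret and uses an HMAC as a stand-in for a real signature, and the function names are made up. A real design would more likely use public-key signatures so the server never holds a lab's signing key.

```python
import hashlib
import hmac


def lab_sign(lab_secret: bytes, dtks: list) -> bytes:
    """Lab side: authenticate the exact set of 16-byte keys to be uploaded."""
    digest = hashlib.sha256(b"".join(sorted(dtks))).digest()
    return hmac.new(lab_secret, digest, hashlib.sha256).digest()


def server_accepts(lab_secret: bytes, dtks: list, tag: bytes) -> bool:
    """Server side: reject uploads whose tag the lab did not produce."""
    return hmac.compare_digest(lab_sign(lab_secret, dtks), tag)


# Hypothetical values for illustration only.
secret = b"shared-between-lab-and-server"
patient_keys = [b"A" * 16, b"B" * 16]
tag = lab_sign(secret, patient_keys)

ok = server_accepts(secret, patient_keys, tag)
forged = server_accepts(secret, [b"X" * 16], tag)
```

Note the cost: in this sketch the lab sees the patient's DTKs, which is exactly the ‘will I share my DTKs with labs’ question. A variant where the lab signs only a one-time token would avoid that, at the price of not binding the token to specific keys.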

Every product that supports anonymous use needs to plan for fraud.

The docs I read don’t say whether the API will be locked down in any way, but I’m guessing that even if it is on iOS, it won’t be on Android.

Conclusions

I think there is information that could help us answer the ‘do we need this’ question:

  • Are you less likely to transmit the disease outdoors than in close quarters?
  • Do masks work effectively to prevent transmission by a sick person? If yes, where do they need to be worn?
  • How important are asymptomatic spreaders as a vector?
  • How effective is fever as a screening tool?

I’m not a doctor and don’t know the answers to these questions. I think the knowledge-base here is evolving and doctors may not know the answers to these questions yet.

I don’t understand the supply chain / lab capacity questions affecting test availability.

As best I understand the public policy question, it’s ‘when do we open, how do we prevent a huge wave, and how do we prepare for it’.

Given all those unknowns, I shouldn’t express an opinion on ‘do we need this’ and so I won’t.

Separately from questions of necessity, I’ll say that I hate:

  • All forms of centralized location tracking, mandatory or otherwise, because there is never transparency about what is collected or how it’s used (Google has had multiple ‘mea culpas’ over confusing or totally ignored location settings)
  • Apple and Google collaborating on data collection. I think Apple lovers are saying ‘Apple keeps Google honest here’. I think that’s true in the web standards space, where they’re not collaborating but competing. I think it wasn’t true in the giant Silicon Valley hiring scandal, where ‘collaboration’ was actually collusion.
  • Having Bluetooth or Wi-Fi turned on outdoors – they leak

We should make it illegal to collect trace data for purposes other than personal testing decisions for the duration of this crisis.

All that said, we should do what we have to to stay healthy.



