Aggregated statistics have a comforting property: they name nobody. A counter that says "31 people in this area this week" reads like the safest possible output a location feature can have. Our internal red-teaming exercise set out to break that assumption — and did, repeatedly, using nothing but the aggregates themselves.
This article generalizes what we found. As always on RED: lessons, not blueprints. The attacks are described at the level of class, not payload, and the numbers are illustrative.
The three attack classes that matter
Differencing across time. If a cell reports 31 on Monday and 32 on Tuesday, the net change is one person. In dense urban cells that delta dissolves into noise. In a rural cell with single-digit churn, a watcher who already knows when one specific person travels can match those trips against the published changes and confirm that person's presence with high confidence after a handful of observations.
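The density effect can be made concrete with a toy audit from the defender's side. The sketch below uses invented counts and hypothetical function names; it only measures how many published updates move by exactly one, which is the case where a change is attributable to a single person.

```python
def day_over_day_deltas(counts):
    """Return the change between each pair of consecutive published counts."""
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


def single_person_updates(counts):
    """Count the updates whose net change is exactly one person."""
    return sum(1 for delta in day_over_day_deltas(counts) if abs(delta) == 1)


# Illustrative numbers only (assumptions, not measurements).
rural = [4, 5, 4, 5, 4]
urban = [4812, 4790, 4833, 4801, 4779]

rural_exposed = single_person_updates(rural)
urban_exposed = single_person_updates(urban)
```

With these made-up series, every rural update is a one-person change, while none of the urban updates is. An audit of this kind could plausibly help decide which cells need a slower cadence or a merge, though the real criterion would have to account for side knowledge the code cannot see.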
Differencing across space. Overlapping or adjacent cells leak at their boundaries. Two queries whose areas differ by one building yield the population of that building. Any API that lets a client choose the query region — even coarsely — hands the attacker the scalpel.
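A toy calculation shows why a minimum count on each query is not enough once clients can shape the region. The building labels, occupancy figures, and function name below are all invented for illustration.

```python
# Hypothetical occupancy: building label -> active users inside it.
occupancy = {"A": 12, "B": 9, "C": 1, "D": 14}


def region_count(buildings):
    """Aggregate count for a region given as a set of building labels."""
    return sum(occupancy[b] for b in buildings)


# Two client-chosen regions that differ by exactly one building.
region_with_c = {"A", "B", "C", "D"}
region_without_c = {"A", "B", "D"}

count_with = region_count(region_with_c)
count_without = region_count(region_without_c)

# Each answer looks safely large, yet their difference isolates building C.
difference = count_with - count_without
```

Both answers would pass a per-query threshold of, say, 30, and the subtraction still recovers the occupancy of one building. This is the reason the defense has to live in the geometry rather than in the size of any single answer.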
Cohort intersection. Aggregates filtered by attribute ("users interested in X, in area Y, active this week") multiply quickly. Three innocuous filters can intersect down to a cohort of one. The aggregate then isn't a statistic anymore; it's a profile with a number attached.
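The shrinkage is easy to underestimate, so here is a minimal sketch with six invented users. The `User` fields and the `cohort_size` helper are assumptions made for the example, not a description of any real schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    interest: str
    cell: str
    active_this_week: bool


# Invented population for illustration.
users = [
    User("X", "Y", True),
    User("X", "Z", True),
    User("X", "Y", False),
    User("W", "Y", True),
    User("W", "Z", True),
    User("W", "Z", False),
]


def cohort_size(population, *predicates):
    """Count users who satisfy every predicate."""
    return sum(1 for u in population if all(p(u) for p in predicates))


def likes_x(u):
    return u.interest == "X"


def in_cell_y(u):
    return u.cell == "Y"


def is_active(u):
    return u.active_this_week


size_interest = cohort_size(users, likes_x)
size_cell = cohort_size(users, in_cell_y)
size_active = cohort_size(users, is_active)
size_all = cohort_size(users, likes_x, in_cell_y, is_active)
```

Each filter alone matches three or four of the six users, and the three together match one. A threshold therefore has to be checked on the final intersected cohort, not on the individual filters.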
Design rules we now treat as floors
- Minimum cohort thresholds, enforced server-side. No aggregate is ever computed, cached, or logged below the threshold. Not "rounded up", not "fuzzed" — refused.
- Net-delta updates on a fixed cadence. Publishing only smoothed net changes on a slow clock destroys most timing side-channels. Real-time freshness is a product preference, not a requirement, and in privacy terms it is the single most expensive thing you can give away.
- Fixed, non-overlapping geometry. Clients never choose query regions. Cells are pre-defined, sized by population density rather than area, and merged automatically when a region's cohort shrinks.
- Suppression must fail closed. When the threshold check errors, times out, or is misconfigured, the correct output is nothing. We found that the failure path is where most real-world deployments leak — the happy path gets reviewed, the exception handler ships a raw count.
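The first and last rules can be sketched together as one publish path that refuses below the threshold and returns nothing on any failure. This is a minimal sketch under stated assumptions: `MIN_COHORT`, `publish_count`, and the `count_cohort` callback are hypothetical names, and the threshold value is a placeholder rather than a recommendation. It covers only the publish step; keeping sub-threshold values out of caches and logs needs separate enforcement.

```python
from typing import Callable, Optional

MIN_COHORT = 20  # placeholder value; an assumption for this sketch


def publish_count(
    cell_id: str,
    count_cohort: Callable[[str], int],
    min_cohort: int = MIN_COHORT,
) -> Optional[int]:
    """Return the count for a cell, or None if it must be suppressed.

    Every failure mode (bad threshold, backend error, malformed result)
    ends in None, so the exception path can never ship a raw count.
    """
    try:
        # A missing or nonsensical threshold counts as misconfiguration.
        if isinstance(min_cohort, bool) or not isinstance(min_cohort, int):
            return None
        if min_cohort < 2:
            return None

        n = count_cohort(cell_id)

        # Reject anything that is not a plain integer count.
        if isinstance(n, bool) or not isinstance(n, int):
            return None

        # Refused outright: no rounding up, no fuzzing.
        if n < min_cohort:
            return None
        return n
    except Exception:
        # Fail closed: an error in the check yields no output at all.
        return None


def failing_backend(cell_id: str) -> int:
    """Stand-in for a backend that times out or errors."""
    raise TimeoutError("backend unavailable")
```

A caller would treat `None` as "publish nothing for this cell", for example `publish_count("cell-7", failing_backend)` yields `None` rather than an exception or a partial value. The broad `except Exception` is deliberate here: on this path, swallowing an error is the safer outcome, though a real deployment would presumably also raise an alert without recording the count.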
The uncomfortable conclusion
The threat scales inversely with density: aggregates are most dangerous exactly where queer users are most isolated and most at risk. A design that is safe for a capital city is not safe for a village, and "average-case privacy" is the wrong bar when the worst case is a person. Size your thresholds for the emptiest cell you will ever serve, and let the dense cells enjoy the slack.
// Published under CC BY 4.0. Take the patterns, cite the source.