If you read one article on RED before the others, make it this one — most of our location and aggregation work builds on it.
k-anonymity says: in this dataset, every person is indistinguishable from at least k−1 others on the quasi-identifying attributes. It's an attractive property because it's simple to state, simple to test, and gives product specs a number to put in a box. "All cohorts are k≥25" reads like a guarantee.
It isn't. It's a floor — the minimum below which you are certainly unsafe — routinely mistaken for a ceiling.
Three ways k-anonymous data still outs people
Homogeneity. If all 25 people in a cohort share the sensitive attribute, learning that someone is in the cohort reveals the attribute with k-anonymity perfectly intact. A cohort of "25 users in this town interested in men" outs all 25 the moment membership leaks — the indistinguishability protects nobody when the secret is the cohort definition itself.
Composition. k-anonymity holds per release. Two releases, each individually k-anonymous, can intersect to a cohort of one. Real systems publish continuously — every update, every filter variant, every time window is another release, and the guarantee quietly evaporates across them.
Background knowledge. The attacker who matters isn't a stranger staring at a spreadsheet. It's someone who already knows the target's town, schedule, or device — and needs the dataset to confirm only one bit. k-anonymity's threat model assumes the attacker knows the quasi-identifiers and nothing else; real adversaries know more, and every extra fact divides k.
What we use it for instead
We treat k as a refusal threshold, not a privacy claim: no aggregate, cohort, or export is computed below it, ever, and the check fails closed. Above the floor, the real analysis starts — composition across releases, homogeneity of the cohort, the background knowledge our actual adversaries hold. Those questions don't reduce to one number, which is exactly why the one number keeps getting used in their place.
The test we apply to any "it's k-anonymous" claim is one sentence: what does an attacker who already knows everything except one bit learn from this release? If the answer is "that bit," k was never the protection. It was the floor you were standing on while you lost it.
// Published under CC BY 4.0 — take the patterns, cite the source. · ← All articles