this post was submitted on 13 Dec 2023
40 points (93.5% liked)

[–] [email protected] 3 points 11 months ago (1 children)

you can't "anonymize" data

ask the people outed as lgbt by netflix's anonymized data set

[–] [email protected] 1 points 11 months ago (1 children)

You absolutely can anonymise data.

However, it's also true that if you don't do it correctly, users can be identified. Sounds like Netflix didn't do it properly. I don't know, do you have a link I could look at?

[–] [email protected] 3 points 11 months ago (1 children)

anonymising data is a treadmill problem

what might work now won't hold up to the de-anonymising techniques of a few years from now

so no, you can't really

[–] [email protected] 1 points 11 months ago (1 children)

Create an anonymous UUID, store interactions against it in a separate table, and ensure PII is removed prior to storing. So instead of "Max Reboo has purchased a subscription to jugs and hooters", it's "user 12345678901234576 has purchased jugs and hooters". How can a future treadmill de-anonymise this? For sure, if the storage is done badly then you can track back to a particular user.
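For what it's worth, here is a rough sketch of the scheme described above, assuming the interactions arrive as plain dicts. The function name `pseudonymise`, the field names, and the sample records are all made up for illustration; a real system would keep the name-to-UUID mapping in separate, locked-down storage or discard it entirely.

```python
import uuid

def pseudonymise(events):
    """Swap each user's name for a random UUID and drop direct identifiers.

    Returns (mapping, stripped). The mapping is the sensitive part: whoever
    holds it can reverse the whole exercise.
    """
    mapping = {}    # name -> pseudonym (hypothetical; store apart or destroy)
    stripped = []
    for e in events:
        pid = mapping.setdefault(e["name"], str(uuid.uuid4()))
        # Only the pseudonym and the interaction itself are kept.
        stripped.append({"user": pid, "item": e["item"], "date": e["date"]})
    return mapping, stripped

# Made-up sample data.
events = [
    {"name": "Max Reboo", "email": "max@example.invalid",
     "item": "subscription A", "date": "2023-12-01"},
    {"name": "Max Reboo", "email": "max@example.invalid",
     "item": "subscription B", "date": "2023-12-05"},
    {"name": "Sam Example", "email": "sam@example.invalid",
     "item": "subscription A", "date": "2023-12-02"},
]

mapping, stripped = pseudonymise(events)
for row in stripped:
    print(row)
```

Note that what survives is still one row per interaction, keyed by a stable ID, with the item and date intact.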

Also, once again, can you link to the Netflix issue you mentioned above, please? Thanks.

[–] [email protected] 3 points 11 months ago* (last edited 11 months ago)

Create anonymous UUID, store interactions against this in a separate table, ensure PII is removed prior to storing

which is more or less exactly what netflix did -> the whole thing's not that hard to find on google

but you need something to distinguish users at least a bit or the data's equivalent to sales figures

you combine that "not-quite-pii" with other independent data sources that have similar "not-quite-pii" and build a complete picture
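a toy sketch of that kind of linkage, assuming the "anonymised" set keeps (item, date) pairs per pseudonym and some independent public source (say, reviews posted under real names) has overlapping pairs. everything here is hypothetical: the `link` function, the field names, the data, and the `min_overlap` threshold. real attacks use fuzzier matching than exact set overlap.

```python
def link(pseudonymous, public, min_overlap=2):
    """Guess which pseudonym belongs to which public name by counting
    shared (item, date) pairs between the two data sets."""
    by_pid = {}
    for r in pseudonymous:
        by_pid.setdefault(r["user"], set()).add((r["item"], r["date"]))
    by_name = {}
    for r in public:
        by_name.setdefault(r["name"], set()).add((r["item"], r["date"]))

    matches = {}
    for pid, seen in by_pid.items():
        # Score every public name by how many pairs it shares with this pseudonym.
        scored = [(len(seen & pairs), name) for name, pairs in by_name.items()]
        best = max(scored, default=(0, None))
        if best[0] >= min_overlap:
            matches[pid] = best[1]
    return matches

# made-up "anonymised" release: no names, just pseudonyms
released = [
    {"user": "u1", "item": "film A", "date": "2023-11-02"},
    {"user": "u1", "item": "film B", "date": "2023-11-09"},
    {"user": "u1", "item": "film C", "date": "2023-11-20"},
    {"user": "u2", "item": "film A", "date": "2023-11-02"},
    {"user": "u2", "item": "film D", "date": "2023-11-15"},
]
# made-up independent public source with real names attached
reviews = [
    {"name": "Max Reboo", "item": "film B", "date": "2023-11-09"},
    {"name": "Max Reboo", "item": "film C", "date": "2023-11-20"},
    {"name": "Sam Example", "item": "film D", "date": "2023-11-15"},
]

matches = link(released, reviews)
print(matches)
```

with the default threshold only `u1` gets linked, since `u2` shares just one pair with anyone. lower `min_overlap` to 1 and `u2` falls out as well, at the cost of more false matches.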

the treadmill effect comes from active research in this exact area: people trying to de-anonymise data sets keep finding new techniques that get around the old protections