this post was submitted on 19 Jan 2024
285 points (97.3% liked)

Piracy: ꜱᴀɪʟ ᴛʜᴇ ʜɪɢʜ ꜱᴇᴀꜱ


I know this isn't strictly piracy related, I apologise, but I think it is tangentially related in that piracy protects you from data theft by avoiding the services the biggest thieves operate. Also, I feel like people here might be very interested in this take.

Apparently, the "legal" data brokerage industry was worth $319 billion in 2021, and is predicted to be worth $545 billion in 2028.[^1]

Meanwhile, in 2021 there were only 7.9 billion people in the world[^2] - many of whom do not have internet access or have very little data being traded. If we generously assume 6 billion people have equal volumes of data being traded, that means each person's data is worth $53.17 per year on the market.

Data is effectively stolen from people. We do not get anything in return for it. We may be offered access to a website free of charge, but that is a separate transaction - it is not appropriate for another transaction to be hidden in the fine print of the terms and conditions. When you buy insurance, the key terms have to be front and centre - you pay x, you get y service. Not "You can have y for free!!! (But also you give us x for free.)" You're supposed to be able to compare the value of the things being traded.

Bearing in mind that this is merely data brokerage, not actual processing or deriving any value from the data, a simple profit margin can be applied. They simply collect the data - easily and at low cost through automated processes - and then sell it. If businesses still took a very generous 30% markup (rather than a ludicrous infinite and pure profit), then the value of an average person's data that they are owed is $53.17 / 1.3, or around $40 per year.
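For anyone who wants to check the arithmetic, here's a rough Python sketch of the 2021 numbers. It assumes the "30% profit" is a 30% markup on whatever the user would be paid (so the market price is divided by 1.3); the variable and function names are just mine for illustration.

```python
# Figures from the post: 2021 market size and an assumed 6 billion data subjects.
MARKET_2021_USD = 319e9
DATA_SUBJECTS = 6e9
MARKUP = 0.30  # assumption: "30% profit" read as a 30% markup over what the user is paid


def per_person_market_value(market_usd: float, people: float) -> float:
    """Average yearly market value of one person's data."""
    return market_usd / people


def owed_to_person(market_value: float, markup: float) -> float:
    """What the person would be paid if the broker resells at cost * (1 + markup)."""
    return market_value / (1 + markup)


market_value = per_person_market_value(MARKET_2021_USD, DATA_SUBJECTS)
owed = owed_to_person(market_value, MARKUP)

print(f"market value per person: ${market_value:.2f}")
print(f"owed to the person:      ${owed:.2f}")
```

Under those assumptions it gives $53.17 on the market and about $40.90 owed to the person, which is where the "around $40" comes from.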


To run the other numbers to check, the global population in 2028 is predicted to be 8.4 billion - a growth of 6.329%. So our 6 billion population would become 6.38 billion, and with the $545 billion market value an individual's data would be worth $85.43 on the market, or $65.71 to the individual. The value of user data is predicted to rise.

Obviously that 6 billion population figure I used is an approximation - a blind one at that. To give a worst case valuation for 2021, if we assume all 7.9 billion people equally have data being traded, then an individual's data is worth $40.38 on the market, and $31.06 to the user. These are the minimum values, averaged evenly across the entire global population.
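The different scenarios can be run side by side with a short Python sketch. Same caveats as the rest of the post: the 6 billion figure is a guess, the 30% is treated as a markup (market price divided by 1.3), and the scenario labels are just mine.

```python
# Inputs taken from the post and its two sources.
POP_2021 = 7.9e9
POP_2028 = 8.4e9
MARKET_2021_USD = 319e9
MARKET_2028_USD = 545e9
MARKUP = 0.30  # assumption: 30% markup over what the user is paid

growth = POP_2028 / POP_2021  # roughly 1.06329, i.e. 6.329% growth

# Each scenario is (market size in USD, number of people whose data is traded).
scenarios = {
    "2021, 6bn data subjects": (MARKET_2021_USD, 6e9),
    "2028, 6bn scaled by population growth": (MARKET_2028_USD, 6e9 * growth),
    "2021, whole population (worst case)": (MARKET_2021_USD, POP_2021),
}

results = {}
for name, (market_usd, people) in scenarios.items():
    market_value = market_usd / people
    owed = market_value / (1 + MARKUP)
    results[name] = (market_value, owed)
    print(f"{name}: ${market_value:.2f} on the market, ${owed:.2f} to the user")
```

This reproduces the $85.43 / $65.71 figures for 2028 and the $40.38 / $31.06 worst case for 2021.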


When Google and Facebook started out, data had very little value - there was no market for it. Thus it seemed reasonable to let them just take it, even if maybe it could be worth something. The service they offered was new and novel, a shiny new toy for everyone to play with. They then used this data to become some of the wealthiest businesses in the world. Now, even big players like Microsoft have joined in, in spite of the fact that their main products are paid products.

One form of bank fraud is where the criminal takes pennies out of multiple accounts, the idea being that people won't notice such a small debit, and banks might write it off as some kind of error. This has been legislated against and proven illegal - yet these assholes take $40 each from everyone and get away with it!

[^1]: https://www.knowledge-sourcing.com/report/global-data-broker-market
Edit: lmao we broke it https://web.archive.org/web/20240107042301/https://www.knowledge-sourcing.com/report/global-data-broker-market ...or did they maybe take it down?? /tinfoil
Edit2: it's back up lol

[^2]: https://www.populationpyramid.net/world/2021/

top 33 comments
[–] [email protected] 88 points 10 months ago (3 children)

Also, I'm really happy I finally found a genuine excuse to show off Lemmy's citation feature lol

[–] [email protected] 10 points 10 months ago

Kudos, very nice write-up!

[–] [email protected] 5 points 10 months ago (1 children)

Oh interesting, I didn't know there was a citation feature but can't see them in Thunder. PR time perhaps.

[–] [email protected] 2 points 10 months ago

Yeah it's really not very well known, also the links don't actually work properly.

[–] [email protected] 4 points 10 months ago (1 children)

Although it doesn't work for me. On browser, when I click the citation, it just opens the post again.

[–] [email protected] 3 points 10 months ago

Yes I've noticed that as well, the links have always been borked. Doubt it'll get fixed any time soon, but at least the ground work is there and it makes it ever so slightly easier to make the formatting.

[–] [email protected] 42 points 10 months ago (3 children)

$53 is a lot less than I would have expected honestly. But I guess that's a mean average figure. It's going to be practically worthless for poorer people. And since wealth is not evenly distributed, and since personal data of people with disposable income is worth a lot more, the average internet user's data is probably worth a lot more.

[–] [email protected] 28 points 10 months ago (1 children)

Sure, it's not the hundreds of dollars I'd estimated previously. In the past I've said "the data brokerage industry is a multi-trillion dollar industry" and come up with figures ranging from $100-$700 per year owed to the user.

However, it should be said that this is just data brokerage. Not all businesses sell the data they collect, instead they keep it proprietary and use it themselves. Google, for example, sells advertising, not user data.

So I think my estimations here have been very conservative overall, and the real value may well be much higher.

Also, it's not just about it being a small amount from an individual, it's the fact that they're robbing everyone blind that really gets on my wick. No one really understands the value of user data, not intuitively, and the whole transaction is done in a deceptive manner to abuse this fact.

[–] [email protected] 6 points 10 months ago (1 children)

It’s true that a lot of data isn’t sold, but a large chunk of the figure you quote also seems to include business data — stuff that contains zero personal information but is still hugely valuable to companies and investors (look at how much this report costs, for example, or consider that a Bloomberg terminal costs around $25k/yr).

And remember, those investment buyers make up a big chunk of the consumer data market too and are only interested in aggregated insights to inform trading strategies. They don’t care about personal info or targeted ads.

[–] [email protected] 1 points 10 months ago* (last edited 10 months ago)

Damn lmao did we kill my first source? It won't load anymore for me to double check what is included.

With regards to consumer data being aggregated insights, rather than personal info or targeted ads, that still doesn't mean they should get it for free, though. Furthermore, I'd argue that all info is personal info, given that it is so easy to identify a person with very few data points.

Edit: You're right, it includes business data. However I'd expect much of that data is paid for down to the data subject, excluding the stuff that's public domain.

It's not reasonable that business data should be fairly paid for, while consumer data isn't.

[–] [email protected] 9 points 10 months ago (1 children)

The way I see it, that number is a baseline figure for what their services would be offered for in exchange. If someone came up to me and said "here, I'll give you $53 and in exchange you'll let me surveil you for a year" I'd say no, but maybe someone else would've said yes. Then, as an experiment, maybe we can let the market take it from there, now that there's a price and some form of discovery mechanism.

[–] [email protected] 3 points 10 months ago

Exactly. Also, the main point I'm trying to make here is that data does not have a completely trivial value - it's not pennies per year, even with a conservative estimate.

[–] [email protected] 7 points 10 months ago

6 bil is a stretch.
I'd go with 60-70% of the general population being perpetually online +10% of the older folks only having a smartphone for whatsapp and some other stuff.
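To see how much the per-person figure moves with that assumption, here's a rough Python sketch. It reads the shares as fractions of the 2021 world population (7.9 billion, from the post) and tries 60%, 70% and 80%; that reading and the chosen shares are my assumptions.

```python
# 2021 figures from the post.
MARKET_2021_USD = 319e9
POP_2021 = 7.9e9

# Assumed shares of the world population whose data is being traded.
ONLINE_SHARES = [0.60, 0.70, 0.80]

per_person = {}
for share in ONLINE_SHARES:
    people = POP_2021 * share
    per_person[share] = MARKET_2021_USD / people
    print(f"{share:.0%} online ({people / 1e9:.2f}bn people): "
          f"${per_person[share]:.2f} per person per year")
```

A smaller online population pushes the per-person value up, from roughly $50 at 80% to roughly $67 at 60%.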

[–] [email protected] 26 points 10 months ago

All of this is so some dipshits can try to sell you something that you don't even need.

[–] [email protected] 24 points 10 months ago (2 children)

this service claims:

"Ad based search engines make almost $300 a year off their users.

Google generated $76 billion in US ad revenue in 2023. Google had 274 million unique visitors in the US as of February 2023.

To estimate the revenue per user, we can divide the 2023 US ad revenue by the 2023 number of users: $76 billion / 274 million = $277 revenue per user in the US or $23 USD per month, on average! That means there is someone, somewhere, a third party and a complete stranger, an advertiser, paying $23 per month for your searches."

https://help.kagi.com/kagi/why-kagi/why-pay-for-search.html
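The division in that quote is easy to check with a couple of lines of Python, using only the two figures quoted above (I haven't verified the figures themselves):

```python
# Figures as quoted from the Kagi help page.
US_AD_REVENUE_2023_USD = 76e9
US_UNIQUE_VISITORS_2023 = 274e6

revenue_per_user_year = US_AD_REVENUE_2023_USD / US_UNIQUE_VISITORS_2023
revenue_per_user_month = revenue_per_user_year / 12

print(f"per user per year:  ${revenue_per_user_year:.0f}")
print(f"per user per month: ${revenue_per_user_month:.0f}")
```

That comes out at about $277 a year, or $23 a month. Note it is revenue per user, not profit.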

[–] [email protected] 6 points 10 months ago

Would that factor in the [unknown] costs of that revenue? Running all the servers (incl. YouTube), offices and staff ain't cheap. So more likely someone is paying enough to leave 23 USD on top of massive costs.

[–] [email protected] 3 points 10 months ago

That's very interesting! I'd also read somewhere that data collection was a trillion dollar industry, however the figure I found here is purely data brokerage so does not include Google per se - Google sell advertising, the data they collect is kept to themselves, so it's much harder to pin down a value.

It also stands to reason that an American's data is worth more on the market than, say, a North Korean's - users who use the internet more will have more data being traded.

[–] [email protected] 14 points 10 months ago (4 children)

To buy weed, my state requires folks hand over their ID, and the shop records the person’s info to make sure they’re not selling to a minor.
For someone that doesn’t want their info anywhere, I’m mildly annoyed by this, but I understand it.

My weed shop had a loyalty program where (because obviously they have to track your purchases because of state law), you got points based on how much you spent. It was automatic. No opting in or out or whatever. They had to collect the data, and figured they’d reward their customers for coming back.

Last week, they told me they were discontinuing the existing rewards program, and spinning up a new one that customers have to sign up for.
To me, that means they’re not just handling the data they’re required to maintain in house, but need me to opt in to something or otherwise waive my right to privacy in some fashion. I scanned the QR code they referenced and the page (off-site from their actual website) wouldn’t even load unless I disabled tracking protection/ad-blocking.
I closed the tab and am now wondering if I need a different weed shop.

[–] [email protected] 11 points 10 months ago

You need to hand a twenty to a dude on the corner. That’s privacy. We used to have it.

[–] [email protected] 7 points 10 months ago

> I closed the tab and am now wondering if I need a different weed shop.

The answer is yes. Make sure to also let management know exactly why so they know how bad they fucked up

[–] [email protected] 5 points 10 months ago (2 children)

Go olde skool and find a local weed guy. I assume they must still exist.

[–] [email protected] 7 points 10 months ago

I don't know if the local weed guy has the good stuff that dispensaries do.

[–] [email protected] 6 points 10 months ago

I don’t smoke, though. I’m a gummies kinda guy, and those are hard to get right unless you’re like, an operation, you know?

Dispensary gummies are lab tested. Although there’s a bit of a problem with lab shopping here, they’re going to be pretty consistent in terms of dosage. I won’t wind up accidentally couch-locked because the dose was too high or the gummies had an unexpected activation time.

[–] [email protected] 5 points 10 months ago

Yeah I really hate that kind of thing. I went into a gas station once, and at the registers it had a tiny little label saying they had CCTV with facial recognition, for crime prevention and "legitimate interest" - the GDPR term that websites always hide and sneak in pre-ticked, even when you think the main points are completely unchecked. There wasn't even a clear way to opt out either, just a QR code you could scan. I didn't scan, I've avoided that place since.

I also vote for finding a new weed shop, but ideally do tell them your reason why.

[–] [email protected] 14 points 10 months ago (1 children)

Perfectly relevant. Thank you.

I enjoyed reaching the gist of your meaning: Legislation needs to be written.

So let’s hope that can happen.

And on a personal level, what have you heard about people who intentionally make their data useless?

This has been my strategy.

I never buy what’s recommended.

I purchase items with cash when I can so they are not added to my “profile”.

Have you heard much about this strategy? How might it work if everyone used it? Generally thoughts for how we can defy their machine and protect ourselves?

[–] [email protected] 6 points 10 months ago (1 children)

> I enjoyed reaching the gist of your meaning: Legislation needs to be written.
>
> So let’s hope that can happen.

Agreed. The first step I think is education, letting people know the value, pointing out that it is a pandemic problem that affects everyone, then convincing politicians that they are being robbed too. If a lawmaker thinks they're a victim, then they might actually pull their finger out.

> And on a personal level, what have you heard about people who intentionally make their data useless?
>
> This has been my strategy.

I do that to some degree, with some things. Like with captcha, I play a game of getting things wrong, but just enough to get through. Not every attempt though, I want it to still think I'm a human that's smarter than the machine, then when I think it's giving me a genuine training screen I spoil it.

I don't use cash as much as I maybe should, I prefer it in some regards, but contactless card purchases are just so easy. I've never used Google or Apple Pay, though, but that's more because I run custom firmware. Also, I've since learned that when you use your phone to pay it's equivalent to chip and PIN. You are authorising the transaction and taking responsibility for it, whereas if you use a contactless debit/credit card it is processed as "cardholder not present", whereby the seller assumes more responsibility if you dispute it. This method of transaction isn't new, it's how catalogue or telephone purchases were always done, as well as online purchases. But if you use your card with chip and PIN, or if you use your phone, you will have a much harder time disputing any transaction.

> Have you heard much about this strategy? How might it work if everyone used it? Generally thoughts for how we can defy their machine and protect ourselves?

In terms of user data protection, really I think the cat is long since out of the bag. There's no putting it back in - and in many ways we shouldn't, as data is useful and has benefits to society. I think it should go one of two ways:

  1. Allow businesses to continue their free data collection, but force them to make the raw data public. Any processing they do can be private, but the raw data doesn't belong to them.
  2. Have businesses start paying the data subject for their data.

In the meantime, one way a user can limit data collection is by using restrictive privacy settings in the browser. For my personal PCs, I not only run uBlock Origin but also uMatrix - a deprecated extension made by the same author. This has similar functionality to uBlock Origin when you set it to author mode, where it can selectively block different domains, but uMatrix presents it as a matrix which also allows you to select the type of content as well as domains. By default, it blocks all 3rd party frames, audio/video media, scripts, XHR, and "other", so quite often it leaves websites broken on first load, but then I pick through and enable the bare minimum of content to get it working. This isn't for everyone, of course, as it can be a hassle sometimes - particularly with payment processors which are all done on multiple 3rd party servers. However, it does highlight to me how endemic Google are with captcha, even when it doesn't give you a captcha prompt. I can't log into some of my online banking without enabling connections to Google, which is sickening. This is an example of what uMatrix looks like:

The extension doesn't get updates anymore, so my lists are out of date compared to uBlock Origin. I'm pretty sure I could update them manually, but since I run uBO as well I don't really feel the need. I've tried running just uMatrix, but uBO has its own array of special lists and without those YouTube ad blocking doesn't work.

[–] [email protected] 2 points 10 months ago (1 children)

uBlock Origin and uMatrix are entirely new to me - of course I can’t have missed references to them in discussions but I just don’t have the time to chase down every good idea.

Simply getting engaged with the Fediverse has been adventure enough.

Here’s hoping more and more people start valuing their privacy more and more.

[–] [email protected] 3 points 10 months ago* (last edited 10 months ago)

uBlock Origin is essential. Firefox (or a hardened fork) with uBlock Origin is the bare minimum protection, IMO. Definitely don't use Chrome or any derivative (which is basically all of them these days, e.g. Microsoft Edge, Brave).

uMatrix is deprecated and breaks websites by default. I love it, but it's not for everyone. I don't use it on all my devices, though.

[–] [email protected] 6 points 10 months ago (1 children)

This is a more believable figure than the trillions people were throwing around

[–] [email protected] 4 points 10 months ago

Those people were me lol

[–] [email protected] 3 points 9 months ago (1 children)

Disclosure: I'm affiliated with this company.

There's a platform where you can add personal data in the form of questionnaires, documents, and integrations that pull profile data from social media, and then sell that data to buyers at your discretion. The platform does not own your data, does not access it, and simply acts as a broker directly between you and the buyer. Not a ton of activity on it at the moment, but it's picking up as clients shift spending from big tech to pay users for their own acquisition.

https://tartle.co

[–] [email protected] 2 points 9 months ago (1 children)

Glad to see there's someone trying to make the process more legitimate.

However, I feel like the better solution is to require that all raw data be publicly available - no one pays for it, but everyone can access it. Then, when people process the data, they can keep their methods and results secret. I think this is perhaps a more practical solution as the cat is already out of the bag, you're not going to get the likes of Facebook paying users appropriately.

[–] [email protected] 1 points 9 months ago

You are right about not getting Facebook to pay for the data, but each time a company pays you $2 to be referred to their site, that's $2 Facebook didn't receive. Anything you earn on TARTLE comes directly out of the purse of big tech.