this post was submitted on 05 Nov 2024
130 points (98.5% liked)


The developers of the Manjaro Linux distribution, which is based on Arch Linux and aimed at beginners, have announced testing of a new service, MDD (Manjaro Data Donor), designed to collect statistics about the system and send them to an external project server. The author of MDD intended to enable telemetry by default (opt-out), but the decision has not yet been approved. Judging by the objections of some developers and users, telemetry will likely be offered as an option requiring the user's prior consent (the proposal is to add a prompt for enabling telemetry to the welcome screen shown after the first boot).

The report includes data such as host name, kernel version, desktop component versions, detailed information about hardware and drivers involved, screen size and resolution information, network device MAC addresses, disk serial numbers, disk partition data, information about the number of running processes and installed packages, versions of basic packages such as systemd, gcc, bash and PipeWire.

The sent data is stored on the project server in the ClickHouse database and visualized using the Grafana platform. The IP addresses of users are not stored, and the hash from the /etc/machine-id file is used as the system identifier.

According to the code (https://github.com/manjaro/mdd/blob/master/mdd.py#L40), it sends everything.

[–] [email protected] 19 points 2 days ago

data such as host name,

Okay, why do they need to know that? Why do they need to know whether the computer is called "Melissa's Laptop" or "Workstation 15, Internal security division"? If stolen, this kind of data could be misused, and it has minimal legitimate purpose IMO: anyone can put anything as a host name, and while in organizations it often corresponds to the machine's use, it doesn't have to for individuals. Someone could call their machine "Mack's Porn Rig" and only use it for banking and a little coding.

kernel version, desktop component versions, detailed information about hardware and drivers involved, screen size and resolution information,

This all seems legitimate enough. It would be helpful for understanding the hardware their users run on and for targeting features or bug fixes.

network device MAC addresses,

Not great, but there is an argument for it. They could just grab and send the first 3 octets (the OUI), which would give them the info they need on manufacturers without collecting uniquely identifiable data that, along with some of this other stuff, is concerning for fingerprinting.
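
To make that concrete, here is a minimal sketch of sending only the vendor part of a MAC address. `mac_to_oui` is a hypothetical helper written for this comment, not something taken from mdd.py, and it assumes the address arrives as a plain 48-bit MAC string.

```python
import string


def mac_to_oui(mac: str) -> str:
    """Return only the first three octets (the vendor prefix) of a MAC address."""
    # Accept both ':' and '-' separators and normalise to lowercase.
    octets = mac.strip().lower().replace("-", ":").split(":")
    valid = len(octets) == 6 and all(
        len(octet) == 2 and all(ch in string.hexdigits for ch in octet)
        for octet in octets
    )
    if not valid:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    # Drop the last three octets, which identify the individual device.
    return ":".join(octets[:3])


print(mac_to_oui("00:1A:2B:3C:4D:5E"))  # 00:1a:2b
```

The device-specific half never leaves the machine, so the value is shared by every adapter from the same vendor block instead of being unique to one computer.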

disk serial numbers,

Okay, what the fuck. Why do they need disk serial numbers? What possible use is there for that? Those are used for warranty claims and could be used as part of uniquely fingerprinting a computer and person. Not cool.

disk partition data,

This is vague. One could read it as just info about the partitions in use (say, whether there's also an NTFS partition that looks like a Windows install), which would be useful. On the other hand, "disk partition data" could also be read as the data within a partition, which would nefariously give them access to all your data. "Partition layout, partition labels, and file systems used on disks available to the system" would be a clearer way to put this and erase any doubt.

information about the number of running processes and installed packages, versions of basic packages such as systemd, gcc, bash and PipeWire.

All this is also fine, just technical data stuff.

[–] [email protected] 81 points 3 days ago (3 children)

Opt-out? Seriously? What are the Manjaro devs smoking?

[–] [email protected] 25 points 3 days ago

Whatever they can get their hands on, including your unique hardware identifiers

[–] [email protected] 11 points 2 days ago* (last edited 2 days ago)

Ad firm money.

Maybe I'm just cynical, but my first instinct when I see stuff like this is they have a secret contract with an advertiser and are selling this information.

[–] [email protected] 69 points 3 days ago* (last edited 3 days ago) (4 children)

enable telemetry by default ... MAC addresses, disk serial numbers

Another reason to not use Manjaro. Just use Endeavour instead.

Edit: I'm not against telemetry per se. I have the KDE feedback enabled, for example, but that was opt-in and sends no unique data.

[–] [email protected] 28 points 3 days ago (8 children)

It's all about trust. Manjaro has given me reasons to distrust them.

[–] [email protected] 45 points 3 days ago* (last edited 3 days ago) (1 children)

Opt-out? I see it's time for the seasonal Manjaro fuck up.

[–] [email protected] 16 points 3 days ago

They'll find some way to make this change break the AUR again

[–] [email protected] 54 points 3 days ago (13 children)

network device MAC addresses, disk serial numbers

That's enough. I'm calling it evil from now on.

[–] [email protected] 26 points 3 days ago

I thought it was probably fine after reading the title, but this shit isn't fine. What the fuck.

[–] [email protected] 36 points 3 days ago

Why do they need information about the hostname? Is it really valuable for them to know how many systems are named daves-pc?

[–] [email protected] 26 points 3 days ago (1 children)

Why on earth do they need to know hostname? MAC addresses?

[–] [email protected] 18 points 2 days ago

And disk serial numbers 😟

[–] 0x0 28 points 3 days ago (2 children)

I get the usefulness of technical telemetry such as kernel version, RAM, disk space, processor type, etc... but NIC MAC? HDD serial? WTF?

[–] [email protected] 12 points 2 days ago (2 children)

Those are absolutely ways of covertly identifying your device while technically not counting as "personal information" under privacy laws.

[–] Fijxu 11 points 2 days ago* (last edited 2 days ago) (5 children)

Yeah that makes no sense lol. Who needs MAC addresses to debug and fix bugs? No one.

[–] notprogrammer 30 points 3 days ago

The report includes data such as host name, kernel version, desktop component versions, detailed information about hardware and drivers involved, screen size and resolution information, network device MAC addresses, disk serial numbers, disk partition data, information about the number of running processes and installed packages, versions of basic packages such as systemd, gcc, bash and PipeWire.

That's insane

[–] [email protected] 15 points 2 days ago (1 children)

I've defended Manjaro many a time, despite the mistakes they've made. The main reason for this is that Manjaro is the most stable Linux distro I've used.

However, the main reason I ditched Windows as my primary OS was telemetry (and bloat). If Manjaro introduce this, it absolutely must be opt-in.

I actually contribute to the Steam hardware survey as I want to ensure Valve, but more so hardware manufacturers, are aware desktop Linux systems for gaming and creative work are viable. But it's my choice to contribute.

If Manjaro don't implement this as an opt-in then I'll be installing Arch. It will be a pain to configure my software again but needs must.

[–] [email protected] 9 points 2 days ago (2 children)

If Manjaro is the most stable distro you’ve used, you can’t have used a lot.

[–] [email protected] 3 points 2 days ago (1 children)

I mostly used Ubuntu-based desktop distros and frequently had issues with the six-monthly update cycle. Problems with Fedora too. I have not had a single update issue with Manjaro. I often have different distros running in VMs, and whilst Arch has been the most reliable, most are not.

I also set up loads of Linux servers in the I.T. job I used to have, so I have plenty of experience.

The bottom line is Manjaro desktop has been ridiculously reliable for me. Therefore other people's hate of it washes over me and is meaningless.

[–] [email protected] 2 points 1 day ago

Yeah, besides some Nvidia driver problems, Manjaro was stable for me as well.

I chose it because it was fast to set up and the base configuration wasn't too far off from my liking.

But by now I'm considering switching.

[–] [email protected] 2 points 2 days ago* (last edited 2 days ago)

Yeah, the Manjaro devs have a long history of gaffes, not to mention the infamous one with PGP keys requiring users to reset their system clock.

[–] [email protected] 12 points 2 days ago (1 children)

Once again proven right that EndeavourOS is the superior downstream Arch distro

[–] [email protected] 22 points 3 days ago (1 children)

I just don't see a good reason to use Manjaro and many reasons not to.

[–] [email protected] 12 points 3 days ago (2 children)

Friends don't let friends use Manjaro

[–] [email protected] 17 points 2 days ago (1 children)

Manjaro is already less stable than Arch, and now it collects your data involuntarily? Fucking wild how anyone can use it.

[–] [email protected] 6 points 2 days ago

clown distro makes clown decision

[–] [email protected] 14 points 2 days ago* (last edited 2 days ago)

That list of the data they're collecting is longer than my high school essay.

[–] [email protected] 10 points 2 days ago

Glad I said fuck it and went straight to actual Arch when I wanted to try Arch-based. Literally like 9/10 times I hear Manjaro brought up, it's not going to be in praise. Ffs lol

[–] [email protected] 18 points 3 days ago

It amazes me it's still as popular as it is and still own goaling at least once a year.

[–] [email protected] 8 points 2 days ago (2 children)

With archinstall, anybody can install Arch in 10 minutes nowadays. Why use Manjaro?

[–] [email protected] 8 points 2 days ago

Hostname? MAC address? Serial numbers? Does "partition data" also include names and GUIDs?

Why would they need these? What is wrong with them??

[–] [email protected] 3 points 2 days ago

Dammit, Manjaro. Why you gotta be WEIRD?! I used to love their branding, but they keep doing crazy things that would clearly alienate the userbase that's left...

[–] [email protected] 13 points 3 days ago* (last edited 3 days ago) (5 children)

This may be illegal in EU if they don't use opt in. ~~Even then it may be illegal for under 18 year olds to collect MAC addresses and disk serial numbers, as those can potentially be used for identification.~~

The data is anonymized, and the IP is NOT stored. So I'm not sure this violates GDPR?

From the code we can see the machine ID is anonymized, sending only a SHA256 checksum.

import hashlib
import uuid

def get_hashed_device_id():
    # Read the machine ID
    with open("/etc/machine-id", "r") as f:
        machine_id = f.read().strip()

    # Hash the machine ID using SHA-256 to anonymize it
    hashed_id = hashlib.sha256(machine_id.encode()).digest()

    # Convert the first 16 bytes of the hash to a UUID (version 5 UUID format)
    return str(uuid.UUID(bytes=hashed_id[:16], version=5))

This makes it somewhat a nothingburger IMO.

[–] [email protected] 10 points 3 days ago* (last edited 2 days ago)

That's not anonymous, that's pseudonymous.

What is the point of this? The machine-id already looks to be some unique random number, so you're calculating another unique random-looking number from that. You might as well use the original number.

You can't glean any useful information from a unique random-looking number that would help with developing Manjaro. You can't calculate any statistics from that. The only use is tracking.
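
A small sketch of why the hash is still an identifier: it re-implements the hashing step from the snippet quoted above as a function that takes the ID as an argument (`hash_machine_id` is my own name for it), and feeds it made-up machine IDs instead of reading /etc/machine-id.

```python
import hashlib
import uuid


def hash_machine_id(machine_id: str) -> str:
    """Hash a machine ID the same way as the snippet quoted above."""
    digest = hashlib.sha256(machine_id.encode()).digest()
    return str(uuid.UUID(bytes=digest[:16], version=5))


# Made-up machine IDs, for illustration only.
machine_a = "0123456789abcdef0123456789abcdef"
machine_b = "fedcba9876543210fedcba9876543210"

report_1 = hash_machine_id(machine_a)
report_2 = hash_machine_id(machine_a)  # same machine, a later report
report_3 = hash_machine_id(machine_b)

print(report_1 == report_2)  # True: one machine always yields the same value
print(report_1 == report_3)  # False: a different machine yields a different value
```

Every report from one install carries the same value, so the reports can be linked to each other. That is what makes it a pseudonym and not anonymous data.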

Edit: And as mentioned in my other comment, reversing the MAC SHA by brute force is trivial, so that one at least (and possibly the other hardware serial numbers they collect) shouldn't even be considered pseudonymous.
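
Rough sketch of the brute-force point, under loud assumptions: it assumes a MAC is hashed as the SHA-256 of its lowercase colon-separated string (I have not checked that against mdd.py), and that the attacker guesses the vendor prefix. `recover_mac` and the example MAC are made up for this comment.

```python
import hashlib
from typing import Optional


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()


def recover_mac(target_hash: str, oui: str, limit: int = 2**24) -> Optional[str]:
    """Try every device suffix under one vendor prefix until the hash matches."""
    for suffix in range(limit):
        # Format the 24-bit suffix as three colon-separated octets.
        tail = f"{suffix:06x}"
        candidate = f"{oui}:{tail[0:2]}:{tail[2:4]}:{tail[4:6]}"
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None


# Made-up MAC address; the suffix is small so the demo finishes quickly.
secret_mac = "00:1a:2b:00:01:f4"
leaked_hash = sha256_hex(secret_mac)
print(recover_mac(leaked_hash, "00:1a:2b"))  # 00:1a:2b:00:01:f4
```

Per vendor prefix there are only 2^24 (about 16.7 million) suffixes to try, which is a small search space for a fast unsalted hash.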

[–] [email protected] 11 points 3 days ago

Why do they need half that data for a derivative of a distro? Fuck off. I don’t care if someone collects the model number of my GPU or whatever but that sounds like personally identifiable tracking data, not basic “telemetry” data to set development priorities or whatever.

[–] [email protected] 10 points 3 days ago (1 children)
  • users can be identified
  • probably Opt-out (still in discussion)

Two nogos combined make nonogogos. Why do they need the host name, MAC addresses and disk serial numbers? Why can't people set how much they want to send, like KDE Plasma does? Will the data be shown to the user before it's sent? Steam does that perfectly (it shows the data and it's opt-in), and that is even a proprietary application. Telemetry is okay if it's done right: without user identification, opt-in, and not hiding what's sent, preferably with multiple levels of what is being sent.

I used Manjaro before and switched to EndeavourOS because I was not happy. Now I am. Manjaro can't stop being stupid (not the users, I'm not attacking any user here, only the maintainers or developers of Manjaro).
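
Something like this is what I mean by levels plus a preview. It is only a sketch: the `Level` names, the field names and the allow-list are all invented for illustration and are not how MDD, KDE or Steam actually implement it.

```python
import json
from enum import IntEnum


class Level(IntEnum):
    OFF = 0       # default: nothing is sent until the user opts in
    BASIC = 1     # software versions only
    HARDWARE = 2  # adds non-unique hardware details


# Hypothetical field names; each field becomes available at the given level.
FIELD_LEVELS = {
    "kernel_version": Level.BASIC,
    "desktop_version": Level.BASIC,
    "gpu_model": Level.HARDWARE,
    "screen_resolution": Level.HARDWARE,
}


def build_report(collected: dict, level: Level) -> dict:
    """Keep only fields that are allow-listed for the chosen level."""
    return {
        key: value
        for key, value in collected.items()
        if key in FIELD_LEVELS and FIELD_LEVELS[key] <= level
    }


def preview(report: dict) -> str:
    """Render exactly what would be sent, so the user can review it first."""
    return json.dumps(report, indent=2, sort_keys=True)


collected = {
    "kernel_version": "6.6.0",
    "gpu_model": "Example GPU",
    "hostname": "daves-pc",              # never allow-listed, so never sent
    "mac_address": "00:1a:2b:3c:4d:5e",  # never allow-listed, so never sent
}

print(preview(build_report(collected, Level.BASIC)))
```

Because the filter is an allow-list, identifiers like the host name or MAC address are dropped at every level unless someone deliberately adds them, and the user sees the exact payload before anything leaves the machine.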
