The developers of the Manjaro Linux distribution, built on the basis of Arch Linux and aimed at beginners, announced the beginning of testing a new service MDD (Manjaro Data Donor), designed to collect statistics about the system and send it to the external server of the project. The author of the MDD intended to enable telemetry by default (opt-out), but the decision has not yet been approved and, judging by the objections of some developers and users, it is likely that telemetry will be offered as an option requiring prior consent of the user (a request to enable telemetry is proposed to be added to the greeting interface after the first download).
The report includes data such as host name, kernel version, desktop component versions, detailed information about hardware and drivers involved, screen size and resolution information, network device MAC addresses, disk serial numbers, disk partition data, information about the number of running processes and installed packages, versions of basic packages such as systemd, gcc, bash and PipeWire.
The sent data is stored on the project server in the ClickHouse database and visualized using the Grafana platform. The IP addresses of users are not stored, and the hash from the /etc/machine-id
file is used as the system identifier.
Аccording to the code https://github.com/manjaro/mdd/blob/master/mdd.py#L40 sends everything.
That’s not anonymous, that’s pseudonymous.
What is the point of this? The machine-id already looks to be some unique random number, so you’re calculating another unique random-looking number from that, might as well use the original number.
You can’t glean any useful information from a unique random-looking number that would help with developing Manjaro. You can’t calculate any statistics from that. The only use is tracking.
Edit: And as mentioned in my other comment, reversing the MAC SHA by brute force is trivial, so that one at least (and possibly the other hardware serial numbers they collect) shouldn’t even be considered pseudonymous.