F30: System-Wide Change proposal: DNF UUID


== Summary ==
Right now, we estimate installed Fedora systems by counting unique IP
addresses which show up in our updates mirror statistics. We need
better data than that. There are some proposals for more complicated
systems, but a quick thing we can do is implement a per-system UUID
(unique identifier) and count that instead of IP addresses.

This is what openSUSE does — see <a href="" title=""></a> for
live stats. See also
[ ... at lists dot
this previous Fedora Council discussion] for background.

== Owner ==

* Name: [[User:mattdm|Matthew Miller]]
* Email: mattdm

== Detailed Description ==

=== The problem ===

* A. Currently, we can only count Fedora OS use by observing IP
addresses. This is subject to undercounting due to NAT — and to
overcounting due to short DHCP leases and laptops moving between work
or school and home or coffee shop.
* B. We can count what releases are observed, but we can’t distinguish variants.
* C. We can’t count quickly because various logs are copied back to a
central server and data is not consistent for several days.

=== Constraints ===
* The Fedora community cares about privacy and is adverse to tracking
measures. We don't want to track; just count.
* For this reason, we don’t want to use any identifier like
/etc/machine-id which may be used for other purposes.
* And, also for that reason, there needs to be a relatively easy way to opt out.
* This needs to work with Yum/DNF, MicroDNF, PackageKit, Cockpit,
rpm-ostree, GNOME Software, Muon, Apper, and software update
mechanisms used in other spins.
* We need to be able to distinguish between short-lived instances
(like temporary containers or test machines) and actual installations.

=== Non-Goals ===
* We don’t want to track users, just count systems.
* Except for distinguishing temporary installations from “real” use,
we don’t need to track systems over time. We just want a daily or
weekly moment-in-time count.
* Being able to see how systems are upgraded over time might be
interesting but isn’t as important as privacy concerns.

=== Other Elements ===
* VARIANT_ID will be set in /etc/os-release. See [[Changes/Label Our
Variants]]. We want that, plus VERSION_ID and machine architecture.
* We may also want each report to contain a boolean flag showing
whether the system has been in use for at least 24 hours to help
separately categorize test and other throw-away instances.
* openSUSE already uses a UUID in zypper; this is ground already traveled.
* Yum and DNF have built-in support for fileset variables which can be
‘removed’ to deal with privacy issues.
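
A minimal sketch of how a client could gather the three fields named above (VERSION_ID, VARIANT_ID, and machine architecture). The function names and the sample text are illustrative assumptions, not part of any DNF API; a real client would read the contents of /etc/os-release instead of the inline sample.

```python
import platform
import shlex


def parse_os_release(text: str) -> dict[str, str]:
    """Parse os-release style KEY=value lines into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # shlex.split removes the optional shell-style quoting around the value
        parts = shlex.split(raw)
        fields[key] = parts[0] if parts else ""
    return fields


def report_fields(os_release_text: str, arch: str) -> dict[str, str]:
    """Return only the fields the proposal wants reported (hypothetical helper)."""
    info = parse_os_release(os_release_text)
    return {
        "version": info.get("VERSION_ID", ""),
        "variant": info.get("VARIANT_ID", ""),
        "arch": arch,
    }


# Illustrative sample; not copied from a real system.
SAMPLE = 'NAME="Fedora"\nVERSION_ID=30\nVARIANT_ID=workstation\n'
print(report_fields(SAMPLE, platform.machine()))
```

A spin that does not set VARIANT_ID would report an empty variant here, which is why the Scope section asks spin maintainers to set it.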

See <a href="" title=""></a>

== Benefit to Fedora ==

* Better metrics overall
* Public stats page updated automatically
* Better knowledge of relative use of different variants
* Insight into Fedora's use in short-lived test systems and temporary
containers vs. longer-term installations

== Scope ==
* Proposal owners: work with DNF team and infrastructure to implement
the UUID feature and corresponding backend data collection
* DNF team: feature work
* Maintainers of other package management tools: make sure feature
works in these cases as well
* Other developers: Spin maintainers should make sure that VARIANT_ID
is being set in /etc/os-release
** List of deliverables: affects all deliverables
* Policies and guidelines: none
* Trademark approval: none

== Upgrade/compatibility impact ==
Older versions will not have the UUID counting enabled; we will keep
collecting stats in the traditional way for those systems.

== How To Test ==
Once the system is in place, we will see data collected.

== User Experience ==
User experience will not change. Users who wish to opt out of counting
will have an easy way to do so.

== Dependencies ==
Package managers.

== Contingency Plan ==
* Contingency mechanism: continue counting the old way
* Contingency deadline: does not block release; we can ship with the
feature incomplete, although it would certainly be most useful to have
it available at GA
* Blocks release? No
* Blocks product? No

== Documentation ==
Release notes need to be written, and documentation describing how to opt out.

== Release Notes ==
This needs to be written but depends on exact implementation.


Re: F30: System-Wide Change proposal: DNF UUID

By Miroslav Suchý at 01/08/2019 - 04:30

On 07. 01. 19 at 17:34, Ben Cotton wrote:
How should Mock handle this? DNF executed by Mock cannot send the VERSION_ID and VARIANT_ID of the chroot(ed) environment
because they are not known yet.

I think the question in general is - how to put tracking of build systems aside?


Re: F30: System-Wide Change proposal: DNF UUID

By Matthew Miller at 01/08/2019 - 09:43

On Tue, Jan 08, 2019 at 09:30:56AM +0100, Miroslav Suchý wrote:
Possibly it could use a special VARIANT_ID reserved for this case?

Re: F30: System-Wide Change proposal: DNF UUID

By Christopher at 01/08/2019 - 01:29

A few concerns/comments (inline):

"Counts are estimates" is not necessarily a problem. Please explain why this is a problem. Also, why not use statistical modeling to try to improve the estimates based on these known behaviors?

Distinguishing between variants is reasonable, I think.

Why is eventual consistency not sufficient? What's wrong with waiting several days? Why would better counting be so time-sensitive/urgent?

I think you mean "averse", and yes, I think you're right about privacy. Who is "we"? I don't think you're speaking for the entire Fedora community.

If this must happen, please make it opt-in. You can use the opt-in data as a sample set statistical model for the rest of the data (those who didn't opt-in), so you don't need to have many people opt-in.

At the very least, the installer should have a dedicated full screen page explaining the feature with the ability to opt-out. The opt-out should not be buried in a menu, or some post-install step. dnf system-upgrades should be opted-out by default.

I think you're using the word "need" here, when "want" is more accurate. Either way, why do you want/need to do this?

A lot of the reasoning for this proposal seems to be based on "interesting", which I agree isn't as important as privacy concerns. Perhaps this is just my opinion, but I don't think a case has adequately been made for getting more accurate counts of Fedora installs.

It's not clear how these things benefit Fedora. Are they benefits for their own sake or do they serve some larger purpose? Who looks at these metrics or stats pages and needs them to be more accurate or more regularly/automatically updated? What benefit do these insights bring to the Fedora community?

Documentation would certainly be necessary, but not sufficient. A good (prominent) UI for opting out is needed, or make it opt-in.

Re: F30: System-Wide Change proposal: DNF UUID

By Stephen John Smoogen at 01/08/2019 - 12:22

On Tue, 8 Jan 2019 at 00:30, Christopher Tubbs < ... at fedoraproject dot org>

Currently the statistics are done off of the http logs from the proxies
which just see a basic set of information. Due to the fact that the
proxy/cache boxes are remote we wait for the rsync to take N days to
complete, then merge all the logs and then do a simple processing on an 8
year old 24 GB server.

The data in this merged log file is noisy due to dnf/yum trying to be
as resilient as possible. A single 'dnf update' or 'yum update' may show up as
multiple requests for the same data on different proxies because something
didn't look right.. or it might just show up once. However I don't know if
I have 10 systems behind a firewall or just 1. At the moment I assume that
I have 1 by just saving the tuple (date,ip=x,arch=x,rel=y) once per day.

Trying to count the number of times that tuple occurred was very, very noisy:
specific IP addresses that I knew had N systems behind them would show up
as either N hundred systems or none, depending on the vagaries of the
internet and whatever the systems decided to do that day.
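
A minimal sketch of the "save the tuple once per day" approach described above, assuming the merged log has already been parsed into (date, ip, arch, release) tuples. The function name and the sample records are made up for illustration; this is not the actual processing script.

```python
from collections import Counter


def daily_estimate(records):
    """Count each (date, ip, arch, release) tuple once, then tally per day/arch/release.

    records: iterable of (date, ip, arch, release) tuples, one per logged request.
    """
    unique = set(records)  # repeated requests from the same IP collapse to one
    return Counter((date, arch, rel) for date, _ip, arch, rel in unique)


# Illustrative records using documentation-range IP addresses.
requests = [
    ("2019-01-08", "192.0.2.10", "x86_64", "29"),
    ("2019-01-08", "192.0.2.10", "x86_64", "29"),  # retry seen on a second proxy
    ("2019-01-08", "192.0.2.11", "x86_64", "29"),
    ("2019-01-08", "192.0.2.11", "aarch64", "29"),
]
print(daily_estimate(requests))
```

Ten machines behind one NAT address still collapse to a single tuple here, which is the undercounting the proposal describes.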

Re: F30: System-Wide Change proposal: DNF UUID

By Kevin Kofler at 01/07/2019 - 18:09

Ben Cotton wrote:
Please no! This is an inherent privacy violation. I hate software doing this
and I always opt out of it. I find it especially worrying that Free Software
is now doing this more and more often, this used to be something only
privacy-violating proprietary software would do.

You will never be able to reliably count all Fedora installations. Any UUID
you introduce can be opted out of, bypassed, etc. Installations using local
mirrors for updates will never send you a UUID to begin with. All numbers
will always be estimates, no matter how deeply you invade our privacy in an
attempt to get a supposedly better count.

I also don't see why it is so important to have an absolute count of Fedora
users. IMHO, data like the relative download frequency of the different
Fedora deliverables is much more interesting (though you have to keep in
mind that the download count does not necessarily reflect the true user
preferences because deliverables that you advertise more prominently will
necessarily get downloaded more often than those hidden behind several
clicks from the download page).

But sending a UUID inherently also allows tracking the machine. There is no
way for the user to be sure that the UUID will not be used to track them.
Even if the software on the Fedora infrastructure is completely open and
audited, there might still be some proxy in the middle, some mirror
operator, etc. abusing the UUID for tracking purposes. And besides, the user
would in all cases have to trust that Fedora really runs the published code
and only the published code on the infrastructure servers.

The only reliable way to ensure that users will not be tracked by a UUID is
to not send a UUID to begin with!

I don't think using an identifier different from /etc/machine-id will really
help all that much. Whatever identifier you use can be abused for tracking.

Such a tracking feature must be opt-in, not opt-out! See also the EU GDPR.

But I think that such a privacy invasion is incompatible with the Fedora
project's goals to begin with, even if it is opt-in.

Apper is no longer shipped in Fedora. The KDE Spin uses plasma-pk-updates as
its official updater, but Discover and Dnfdragora (which are both shipped
for different purposes) can also be used to update the system.

But if you require an explicit opt out in more than one place (e.g., once
for DNF and once for PackageKit), that makes this feature all the more
dangerous and scary.

And how would you accomplish that? Other than an "I am a test installation"
checkbox in the installer, I don't see at all how it could be done.

Again, there is no way you can guarantee that. We would have to take your
word for it. This is not acceptable.

So this is even more data that you would be collecting behind the user's back.

The installation would also only end up recognized as permanent after the 24
hours pass. And who says a test installation cannot last more than 24 hours?
I think it can last at least a week, but that also means that it would take
a whole week until you can reasonably assume that an installation is
probably permanent.

Without being able to magically predict the future and without asking the
user, I don't think you can ever be able to make this distinction reliably.

Kevin Kofler

Re: F30: System-Wide Change proposal: DNF UUID

By Matthew Miller at 01/07/2019 - 23:00

On Mon, Jan 07, 2019 at 11:09:48PM +0100, Kevin Kofler wrote:
Since there is no personal information attached, I don't see how on the face
of it this is a privacy violation. I want to take this concern seriously,
but I need more to go on than "this is inherent". Can you elaborate?

It's true that it will always be an estimate. I think this scheme gives a
reasonably better estimate.

The download count is *really* noisy. There are an order of magnitude more
bot and automatic downloads than there are ones that seem initiated by a
human. Maybe this is due to automated systems, but I suspect it is basically
just the horrible nature of the internet. Unless we were to gate downloads
with a captcha or registration (which, uh, we don't want, just to be clear),
I don't see any way to make those numbers useful.

Like I said, tracking is a non-goal. And, we want a design that is resistant
to tracking -- but I don't think we need to go overboard.

This will be reviewed by lawyers. And, I do note that what I am proposing is
nothing more than what openSUSE already does.

One method: separate UUIDs which only show up on a single day. (This is why
a UUID is better than just a ping.)
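
A rough sketch of that method, assuming the server has a list of (day, uuid) sightings: UUIDs seen on exactly one day are set aside as probable short-lived instances. The function name and sample data are hypothetical.

```python
from collections import defaultdict


def split_by_lifetime(sightings):
    """Split UUIDs into those seen on one day only and those seen on several days.

    sightings: iterable of (day, uuid) pairs.
    Returns (single_day, multi_day) as two sets of UUID strings.
    """
    days_seen = defaultdict(set)
    for day, uuid in sightings:
        days_seen[uuid].add(day)
    single = {u for u, days in days_seen.items() if len(days) == 1}
    multi = set(days_seen) - single
    return single, multi


sightings = [
    ("2019-01-07", "aaaa"),
    ("2019-01-08", "aaaa"),
    ("2019-01-08", "bbbb"),  # appears once: likely a container or test machine
    ("2019-01-08", "bbbb"),  # same day again, still a single day
]
single, multi = split_by_lifetime(sightings)
print(len(single), len(multi))
```

Note that this requires the server to keep per-UUID history across days, which is exactly the retention the replies in this thread object to.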

Sure, it's a threshold and we'd have to set a balance.

Re: F30: System-Wide Change proposal: DNF UUID

By Nicolas Mailhot at 01/08/2019 - 05:15

On 2019-01-08 04:00, Matthew Miller wrote:
That's not how you need to think of it.

Basically, the European definition is that if it can be correlated to
personal information, it *is* personal information, regardless of what
the original info is in isolation.

That means that if you want it not to be personal info, you need to make
bloody sure it is not shared with data aggregators, you protect against
leaking to systems that allow correlation, and every time there is an
advance in big data processing that enables more kinds of correlation,
that automatically restricts what you can safely collect.


Re: F30: System-Wide Change proposal: DNF UUID

By Zbigniew Jędrzejewski-Szmek at 01/08/2019 - 05:10

On Mon, Jan 07, 2019 at 10:00:25PM -0500, Matthew Miller wrote:
I'm not a lawyer, but GDPR is something that affects all of us. Going
by the wiki page and GDPR announcements from the European Commission:

Lawful basis for processing:
(b)-(e) obviously don't apply

We could argue [1] that reliably collecting the number of individual
installations is a "legitimate interest", for example because it
allows us to decide what parts of Fedora are most used and direct our
efforts there. I think it's pretty obvious that knowing the number of
users is a valid interest for any software project. Then we could use
point (f).

Otherwise, we have to use point (a) which is only satisfied by a clearly
worded, and specific, opt-*in* dialogue.

[1] <a href="" title=""></a>


Re: F30: System-Wide Change proposal: DNF UUID

By Miroslav Suchý at 01/08/2019 - 06:04

On 08. 01. 19 at 10:10, Zbigniew Jędrzejewski-Szmek wrote:
IANAL but I disagree. With an IP address, I can very easily guess your town/village. With more effort I can track you to
an individual house and individual device.
You cannot say the same about a UUID.


Re: F30: System-Wide Change proposal: DNF UUID

By Peter Robinson at 01/08/2019 - 07:06

I agree with this, I think even the machine ID would be more anonymous
than an IP address in most cases.


Re: F30: System-Wide Change proposal: DNF UUID

By Panu Matilainen at 01/08/2019 - 07:24

On 1/8/19 1:06 PM, Peter Robinson wrote:
In *isolation*, yes. The problem is that here it'll be associated with
an IP address and together they reveal things that neither of them do alone.

Re: F30: System-Wide Change proposal: DNF UUID

By Miroslav Suchý at 01/08/2019 - 06:17

On 08. 01. 19 at 11:04, Miroslav Suchý wrote:
I just checked and UUID definitely falls under ‘pseudonymisation’ - see Article 4 - Definitions:

<a href="" title=""></a>

And pseudonymisation is actually encouraged for statistical purposes - (156) in the preamble.


Re: F30: System-Wide Change proposal: DNF UUID

By Nicolas Mailhot at 01/08/2019 - 06:35

On 2019-01-08 11:17, Miroslav Suchý wrote:
Only if it cannot be correlated back (Recital 156)

“The further processing of personal data for archiving purposes in the
public interest, scientific or historical research purposes or
statistical purposes is to be carried out when the controller has
assessed the feasibility to fulfil those purposes by processing data
*which* *do* *not* *permit* *or* *no* *longer* *permit* *the*
*identification* *of* *data* *subjects*, provided that appropriate
safeguards exist (such as, for instance, pseudonymisation of the data)”

Otherwise it is considered personal data (Recital 26)

“Personal data which have undergone pseudonymisation, which could be
attributed to a natural person by the use of additional information
should be considered to be information on an identifiable natural
person. ”

Re: F30: System-Wide Change proposal: DNF UUID

By Miroslav Suchý at 01/08/2019 - 07:33

On 08. 01. 19 at 11:35, Nicolas Mailhot wrote:
How do you identify a data subject solely from a UUID?


Re: F30: System-Wide Change proposal: DNF UUID

By Nicolas Mailhot at 01/08/2019 - 08:38

On 2019-01-08 12:33, Miroslav Suchý wrote:
Recital 26 makes it pretty clear that reversing must take into account all
the other data that can be associated with the pseudonymisation (either
because it is available at the same time or can be associated with it
some other way). So you don’t get to play the solely card.


Re: F30: System-Wide Change proposal: DNF UUID

By Benjamin Berg at 01/08/2019 - 08:15

On Tue, 2019-01-08 at 12:33 +0100, Miroslav Suchý wrote:
You also inherently collect information such as the IP and the
timestamp of the request which in principle permits identification. You
could for example collect the IP from Fedora account logins and one of
these pings. This way you can de-anonymise the data collected for the


Re: F30: System-Wide Change proposal: DNF UUID

By Owen Taylor at 01/08/2019 - 10:59

On Tue, Jan 8, 2019 at 7:17 AM Benjamin Berg < ... at redhat dot com> wrote:
We can certainly implement a setup that does not collect or store the
UUID together with the IP address or timestamp. Send the UUID as a
HTTP header, don't log it, send the UUID off to a counting service
(*). If we make sure the UUID is protected in transit, sent only to
our own servers (or servers configured by the user), and not collected
or stored in a personally identifiable way, I suspect that we're
meeting our obligations under the GDPR, though we'd need to
double-check any selected solution carefully.

That being said, certainly some users might still have an issue with
having a UUID sent to Fedora servers even if we are meeting our legal
obligations. What we say we are doing with the data might not
correspond to reality in case of a security breach or court order. For
this reason, the first_time_this_week=1 option that Lennart and
Benjamin mentioned has some appeal to me - it would avoid the need for
extra opt-in/out screens, confusing text, etc. It would also allow any
yum repository to do counting the same way - not just our own


(*) implementation left to your imagination. Store a hash of the UUID
for a week then discard. Use HyperLogLog. Etc.
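
One way the "store a hash of the UUID for a week then discard" idea could look, as a sketch only. The class name and the per-week random salt are assumptions added for illustration; a HyperLogLog estimator would replace the set with a fixed-size structure and is not shown.

```python
import hashlib
import secrets


class WeeklyCounter:
    """Counts distinct UUIDs per week without keeping the raw UUIDs (hypothetical)."""

    def __init__(self):
        # Fresh random salt per week, so digests cannot be linked across weeks.
        self._salt = secrets.token_bytes(16)
        self._seen = set()

    def observe(self, uuid: str) -> None:
        """Record a salted hash of the UUID; the UUID itself is never stored."""
        digest = hashlib.sha256(self._salt + uuid.encode()).digest()
        self._seen.add(digest)

    def close_week(self) -> int:
        """Return this week's distinct count, then discard all hashes and the salt."""
        count = len(self._seen)
        self._seen.clear()
        self._salt = secrets.token_bytes(16)
        return count


counter = WeeklyCounter()
for u in ["uuid-a", "uuid-a", "uuid-b"]:
    counter.observe(u)
print(counter.close_week())
```

This only limits what is stored; the UUID still crosses the wire together with the client IP address, which is the concern raised in the replies.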

Re: F30: System-Wide Change proposal: DNF UUID

By Benjamin Berg at 01/08/2019 - 15:38

On Tue, 2019-01-08 at 09:59 -0500, Owen Taylor wrote:
You are right that it is possible to immediately discard or obfuscate
the information.

But, as Nicolas pointed out, the argument here is that the UUID itself
likely needs to be considered "personal data" in the GDPR sense. And
even doing something as minimal as that seems to imply "processing"[1]
the data in the GDPR sense.


[1] The definition of "processing" reads:
‘processing’ means any operation or set of operations which is
performed on personal data or on sets of personal data, whether or not
by automated means, such as collection, recording, organisation,
structuring, storage, adaptation or alteration, retrieval,
consultation, use, disclosure by transmission, dissemination or
otherwise making available, alignment or combination, restriction,
erasure or destruction;

Re: F30: System-Wide Change proposal: DNF UUID

By Tomasz Torcz at 01/09/2019 - 08:38

On Tue, Jan 08, 2019 at 08:38:01PM +0100, Benjamin Berg wrote:
Nb. “UUID” sounds terribly technical. Can we use some term which
is already known and understood by users, e.g. Advertising ID?

Re: F30: System-Wide Change proposal: DNF UUID

By Matthew Miller at 01/09/2019 - 11:44

On Wed, Jan 09, 2019 at 01:38:07PM +0100, Tomasz Torcz wrote:
Well, it very much is not an "advertising ID", so not that.

But I think we're going to explore the non-uuid "countme" flag option
instead, which makes that irrelevant.

Re: F30: System-Wide Change proposal: DNF UUID

By Bruno Wolff III at 01/08/2019 - 03:42

On Mon, Jan 07, 2019 at 22:00:25 -0500,
Matthew Miller < ... at fedoraproject dot org> wrote:
From the user's point of view, they can't tell if IP addresses are tracked
along with UUIDs. Some IP addresses can be tied to specific users, and now
with UUIDs, the same machine can be seen to use different IP addresses so
that a person can now be seen to be using multiple IP addresses that couldn't
be as easily correlated before. Some of these IP addresses may have been
hard to associate with the person previously.
Users can defend against this relatively easily by being selective about when
they do updates, as long as updates are the only thing using this UUID.

If you care about that level of not revealing usage, Fedora is probably not
the best distribution in the first place. A number of packages do not make
a priority of limiting networking requests. For example it is common for
web browsers in Fedora to refer to a network version of a Fedora web page as
their default start page rather than using a local copy of this page that
might be a bit out of date. So I don't know if IP address correlation is
likely to be of big concern to many Fedora users. I would prefer that Fedora
make different privacy / convenience trade-offs than it does, but I'm pretty
sure I'm in a small minority and I'm able to do workarounds on my end for
this for cases where I want to spend the effort.

Re: F30: System-Wide Change proposal: DNF UUID

By Kevin Kofler at 01/07/2019 - 23:45

Matthew Miller wrote:
I detailed it further down my message: my concern is that the UUID can
theoretically be used to track users, to build personas out of them from the
packages downloaded by the UUID, and in the extreme case even to identify
the person owning the UUID by name (e.g., if a package downloaded by the
UUID is downloaded only by 1 person and you find some bug report for it in
Bugzilla). I don't care that you promise that you won't do it, the fact is
that you *can*. And possibly others can too, depending on how exactly this
is implemented.

If you take privacy seriously, you have to assume the worst. It is always
safer to send less data rather than more.

Kevin Kofler

Re: F30: System-Wide Change proposal: DNF UUID

By Stephen John Smoogen at 01/08/2019 - 08:49

On Mon, 7 Jan 2019 at 22:47, Kevin Kofler <kevin. ... at chello dot at> wrote:
Currently we can't see what packages a client requested. All the
Fedora mirror proxies see is

- - [31/Dec/2018:09:07:21 +0000] "GET /metalink?repo=fedora-28&arch=x86_64 HTTP/1.1" 200 62200 "-"

The additional information could be

- - [31/Dec/2018:09:07:21 +0000] "GET HTTP/1.1" 200 62200 "-" "dnf/2.7.5"

Individual mirrors do see what packages the person requested but do
not see the uuid=<blah>, edition=<blah> data

- - [31/Dec/2018:06:44:46 +0000] "GET HTTP/1.1" 200 3312 "-" "dnf/2.7.5"
- - [31/Dec/2018:06:44:46 +0000] "GET HTTP/1.1" 200 448854 "-" "dnf/2.7.5"
- - [31/Dec/2018:06:45:21 +0000] "GET HTTP/1.1" 404 299 "-" "dnf/2.7.5"

Re: F30: System-Wide Change proposal: DNF UUID

By Lennart Poettering at 01/08/2019 - 11:22

If all you want to do is count, then it should be entirely sufficient
to do it like this:

GET /metalink?repo=fedora-28&arch=x86_64&edition=<blah>&countme=1 HTTP/1.1

the first time within each one-week window and a simple

GET /metalink?repo=fedora-28&arch=x86_64&edition=<blah> HTTP/1.1

all other times.

Then, sum up how many "countme=1" GET requests we get per week, and
you have a good count, without tracking individual clients, without
inventing new uuids¹.
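
A minimal client-side sketch of this scheme, assuming the client remembers only the index of the last week window in which it sent countme=1. The function names, the window definition (fixed 7-day windows since the Unix epoch), and where that one integer is persisted are all assumptions, not DNF behavior.

```python
from urllib.parse import urlencode

WEEK_SECONDS = 7 * 24 * 3600


def week_window(timestamp: float) -> int:
    """Index of the fixed 7-day window containing the timestamp."""
    return int(timestamp // WEEK_SECONDS)


def metalink_query(repo, arch, edition, now, last_counted_window):
    """Build the metalink path; add countme=1 only on the first request of a window.

    Returns (path, new_last_counted_window). The caller persists the second value.
    """
    params = {"repo": repo, "arch": arch, "edition": edition}
    window = week_window(now)
    if last_counted_window != window:
        params["countme"] = 1
        last_counted_window = window
    return "/metalink?" + urlencode(params), last_counted_window


path, state = metalink_query("fedora-28", "x86_64", "workstation", 0.0, None)
print(path)
path, state = metalink_query("fedora-28", "x86_64", "workstation", 3600.0, state)
print(path)
```

The only local state is a single week index, so nothing that identifies the machine is generated or sent.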

Such a form of counting is so minimal that I think you don't even have
to query the user whether he agrees with that in the installer UI. And
the user knows that with the one additional bit of info he grants you
every week there's very little you can do you couldn't do in the
status quo ante.

Moreover, doing accumulation as proposed also makes things
extremely simple to account for, as you don't have to store per-client
info in a huge database on the server. Instead it's entirely
sufficient to have a single counter for each subset of distro you want
to count.
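
On the server side, the "single counter for each subset" idea could be sketched as below, assuming the request paths for one week have been extracted from the logs. The function name and sample paths are illustrative only.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit


def tally_countme(request_paths):
    """Sum countme=1 requests per (repo, arch, edition); no per-client state is kept."""
    counts = Counter()
    for path in request_paths:
        query = parse_qs(urlsplit(path).query)
        if query.get("countme") == ["1"]:
            key = (
                query.get("repo", [""])[0],
                query.get("arch", [""])[0],
                query.get("edition", [""])[0],
            )
            counts[key] += 1
    return counts


week_of_requests = [
    "/metalink?repo=fedora-28&arch=x86_64&edition=workstation&countme=1",
    "/metalink?repo=fedora-28&arch=x86_64&edition=workstation",
    "/metalink?repo=fedora-28&arch=x86_64&edition=server&countme=1",
    "/metalink?repo=fedora-28&arch=x86_64&edition=workstation&countme=1",
]
print(tally_countme(week_of_requests))
```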

In the interest of privacy the valid desire to have statistics
about the use of our distro needs to be implemented with data
frugality in mind. Keeping a full database of all uuids of all clients
on a Fedora server somewhere is definitely not data frugality if all
you want is count. Even if Fedora wouldn't misuse the data, somebody
might exploit the server and steal the database and there you go. Not
even having the database is hence the much better approach, and you
really need neither the database nor the uuid concept to do proper

So yeah, in the interest of privacy and simplicity, please don't go
the uuid way, there are simpler and better approaches.


(Footnote: ¹ if you are concerned that not every client is updated
every week, then you could even slightly extend this and maybe submit
countme=2 the first time within each 4 week period, and countme=3
within each 52 week period, so that you catch even those, though it
will take a bit longer for them to accumulate)

Re: F30: System-Wide Change proposal: DNF UUID

By Roberto Ragusa at 01/11/2019 - 17:36

As an additional improvement, is it really needed to count every machine?
We can subsample a lot, and let only some specific machines show
up for counting.

That is, apply the logic above only if(hash(machine_id)%1000==0)
(this becomes a poll instead of a referendum, results must then be multiplied by 1000)

Or, to avoid having somebody constantly be counted and others constantly ignored,
the rule could be if(hash(machine_id)%1000==hash(weekofthecentury)%1000)

With this setup I know that 99.9% of the weeks I'm not reporting anything at all.

Of course 1000 is a constant that may be tuned, but looks a good choice
to me if the expected total number is on the order of 1 million.
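
A sketch of the rotating rule above, under stated assumptions: hashlib is used because Python's built-in hash() is randomized per process for strings, and the week number is passed in as a plain integer. The function names are hypothetical.

```python
import hashlib

RATE = 1000  # one machine in RATE reports in any given week (tunable)


def bucket(value: str) -> int:
    """Stable hash of a string into the range 0..RATE-1."""
    digest = hashlib.sha256(value.encode()).digest()
    return int.from_bytes(digest[:8], "big") % RATE


def reports_this_week(machine_id: str, week_number: int) -> bool:
    """True when the machine's bucket matches this week's bucket."""
    return bucket(machine_id) == bucket(str(week_number))


def estimate_total(reports_received: int) -> int:
    """Scale the sampled count back up to an estimate of all machines."""
    return reports_received * RATE


print(estimate_total(1234))
```

The machine_id here never leaves the client; it is only used locally to decide whether to report at all.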


Re: F30: System-Wide Change proposal: DNF UUID

By Nico Kadel-Garcia at 01/12/2019 - 23:24

On Fri, Jan 11, 2019 at 4:37 PM Roberto Ragusa < ... at robertoragusa dot it> wrote:
The difficulty is not the counting. Safe counting and
aggregation by the server is a requirement that no server or
intermediate server or proxy needs to follow, and would require
configuration or filtering control of a server that is outside of
client hands. It's not legally or technologically mandated. The great
use for the data is tracking hosts, metadata that is saleable and
likely to help provide a new form of tracking information.

Writing this into the dnf behavior is typical, but it's not
beneficial to the clients. It's beneficial to the mirrors, who are
likely to sell the data. While it may be that infamous problem, a
"Simple Matter Of Programming(tm)" to sanitize the data, there are
strong motivations to collect it and sell it, and I'd expect various
mirrors to start doing so within moments of the activation of the

Re: F30: System-Wide Change proposal: DNF UUID

By Stephen John Smoogen at 01/13/2019 - 12:53

1. The mirrors do not see this.
2. We aren't talking about UUIDs anymore and just a countme variable being
sent periodically. If a countme is going to be too much data to send, then
clients are probably already sending way too much data already.

Re: F30: System-Wide Change proposal: DNF UUID

By Nico Kadel-Garcia at 01/13/2019 - 13:46

On Sun, Jan 13, 2019 at 11:54 AM Stephen John Smoogen < ... at gmail dot com> wrote:
If it's not available to the mirrors, then anyone who hardcodes a
mirror's URL into the local "baseurl" settings is not going to be
counted this way, and we're back at the "we don't know how many
clients there are" problem. If only the "mirrorlist" hosts see the
UUID, "countme" or any other identical client ID.

Then can we change the title of the thread?

If the "countme" variable is unique and sent only to the host
providing the mirrorlist, it's tracking data. That host becomes
responsible for anonymization, and it is *too late* unless the data is
encrypted at the client, say with the GPG key of the relevant
repository, and that starts requiring GPG private keys on the host
providing the mirrorlist. If it's going across the wire, even with
SSL, man-in-the-middle is an old, old problem.

Whether the mirrorlist back end software is promised to be sanitized,
it's tracking data. Sadly, I've been through this in other venues. The
data was considered "safe" because it was "anonymized". Except that the
original web traffic was tappable, along with IP addresses and unique
client information. A subpoena, a Patriot Act request, or even a
foreign worker with an H1-B visa reporting back to foreign
intelligence or a technology competitor could obtain a great deal of
trackable data.

Am I paranoid? Yes. Am I paranoid *enough*? I'm not so sure, we've
seen assembly of pseudonymous data and metadata throughout the history
of intelligence work. Demanding it, and handling it safely, is often
an exercise in people claiming "no one would do that!", "no one would
bother to investigate that", and people misusing it as a matter of
course. I'd suggest it's not even worth the effort to demand or to
collect with such concerns.

Nico Kadel-Garcia < ... at gmail dot com>

Re: F30: System-Wide Change proposal: DNF UUID

By Stephen John Smoogen at 01/13/2019 - 15:15

Once a time period (day, week, month), an update would just add a countme=1
to it.

There is no more client id. There is no data other than that. We would just
count all the countme=1 and get an idea of what was going on. It isn't an
exact number but it puts some amount of solid-ness in the fuzzy cloud. The
more complicated version which mattdm is wanting is that countme gets
incremented by the week after install. Nothing else. No data from the
/etc/machine-id, no data from /var/yum/uuid etc.
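
One possible reading of the "incremented by the week after install" variant, as a sketch: the value sent is the number of whole weeks since installation, starting at 1. The exact encoding was still being drafted at this point in the thread, so the function below is an assumption, not the agreed design.

```python
WEEK_SECONDS = 7 * 24 * 3600


def countme_value(install_timestamp: float, now: float) -> int:
    """Hypothetical encoding: 1 in the first week after install, 2 in the second, etc."""
    whole_weeks = int((now - install_timestamp) // WEEK_SECONDS)
    return max(whole_weeks, 0) + 1  # clamp in case the clock moved backwards


print(countme_value(0.0, 3 * WEEK_SECONDS + 10))
```

Only the age in weeks is derived; nothing from /etc/machine-id or any stored UUID is involved.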

Re: F30: System-Wide Change proposal: DNF UUID

By Matthew Miller at 01/14/2019 - 13:58

On Sun, Jan 13, 2019 at 02:15:19PM -0500, Stephen John Smoogen wrote:
I have a draft update to the change <a href="" title=""></a>
which I'm waiting to hear back from the DNF team on. Once I do, Ben will
post that as a new thread with a new title.

Re: F30: System-Wide Change proposal: DNF UUID

By Samuel Sieb at 01/13/2019 - 01:44

On 1/12/19 7:24 PM, Nico Kadel-Garcia wrote:
Except that you've missed the point that's been made several times that
the mirrors do not see this information ever. It's only the mirror
managers that would see it and those are not managed by the public.

Re: F30: System-Wide Change proposal: DNF UUID

By John Harris at 01/11/2019 - 19:48

On Friday, January 11, 2019 4:36:54 PM EST Roberto Ragusa wrote:
If this is done, the likelihood of invalid data for the given Spin is pretty
high. For example, Workstation could show as being more popular than all of
the other spins combined, just because it's more popular than any given spin
(likely because it's advertised prominently, while other spins are hidden
behind a link at the middle of the download page).

Re: F30: System-Wide Change proposal: DNF UUID

By Adam Williamson at 01/12/2019 - 03:27

On Fri, 2019-01-11 at 18:48 -0500, John Harris wrote:
Just as a note, Workstation isn't a spin, it's a Fedora Edition:

<a href="" title=""></a>

framing it as if it's "just another spin" is a bit off. Its prominence
is quite intentional and the whole / editions thing was
precisely about picking some specific 'flavors' of Fedora and giving
them prominence over the others.

Re: F30: System-Wide Change proposal: DNF UUID

By John Harris at 01/12/2019 - 05:37

On Saturday, January 12, 2019 2:27:33 AM EST Adam Williamson wrote:
Really, the issue there is specifically that it isn't "just another spin", but
I'm sure you knew that's what I was getting at. Fedora's aggressive marketing
of specifically GNOME, while hiding other Spins, would be an interesting
factor in review of metrics of spins.

Re: F30: System-Wide Change proposal: DNF UUID

By Stephen John Smoogen at 01/12/2019 - 11:50

Side note, I was at a loss as to what you were getting at. There are several
ways it could be interpreted, and it has been used by people in the past to
mean different things.

The problem is that there is an inherent conflict of resources here. When
we put everything on the download pages, everyone including the spin owners
say it was too confusing. But choosing which things get put on a special
page or not ends up getting the opposite "You're oppressing me" or "Oh it's
ok if you drop everyone but MY spin". The opposite catch-22 is that the
spin may only have 1-2 people to handle issues, but they aren't getting more
people because they don't have more than 1-2 people on it. This leads
to multiple spins only getting looked at in the beta where someone sees "oh
it won't get in the next release.. ok I will see if I can get time to fix

It is a complicated problem and doing the basic hand-waving of "it is
because Fedora markets specifically GNOME they suck" just makes people
pissed off and entrenched versus coming up with a workable solution.

Re: Editions vs. Spins (was: Re: F30: System-Wide Change proposa

By Kevin Kofler at 01/12/2019 - 14:51

Stephen John Smoogen wrote:
I think John's statement was pretty clear: The artificial distinction
between "Editions" and "Spins" needs to go away.

What Spin owners have you asked? While I cannot officially speak for the KDE
SIG, I am almost certain (as a former member and a current passive follower
of the KDE SIG) that the KDE SIG has never said such a thing. To the best of
my knowledge, the KDE SIG's proposals to improve the download page have
always suggested listing ALL Spins, not just KDE/Plasma and
Workstation/Desktop/GNOME. (My suggestion has always been to order them by
decreasing download counts.)

So if that is your concern, the solution would be to define some minimum
formal requirements for a Spin to be listed on the get.fp.o front page. But
then those requirements should also apply to the 3 "Editions": if they don't
fit the criteria, they should be kicked out as well. (I could see that
possibly happening for Server or Atomic/Silverblue at some point. The
Fedora user base is clearly desktop-centric. But I am NOT saying that they
should necessarily be delisted, just that they should be held to the same
maintenance standards as the Spins.)

That said, I am pretty sure that if the Spins were more prominently
advertised, they would be more likely to attract helping hands. As it stands
now, users not yet familiar with Fedora might not even realize that the
Spins even exist.

I would propose this mockup (mix of HTML and ASCII art, sorry – each '#'
sign stands for a nice colored icon, e.g., a notebook icon, an upstream
desktop project logo, etc.):

# fedora <h1>Welcome to Fedora, a GNU/Linux distribution entirely composed
of free and open source software, downloadable at no cost.</h1>

Fedora is an operating system that you can use, share, distribute, and
modify as you like, all completely for free. <a href=…>More information</a>.

What hardware (physical or virtual) do you want to install Fedora on?
<a href="#workstation"> | <a href="#server"> | <a href="#container">
# Desktop | # Server | # Container
# Notebook/Laptop | # VPS | # Docker
# Workstation | # Server VM | # Kubernetes
</a> | </a> | </a>

<a name="workstation"><h2>Desktop, Notebook/Laptop, Workstation</h2></a>

Fedora for the workstation: One operating system, many faces.

You can select between several different desktop workspace environments with
different looks, feels, and user experiences, while always being able to use
the full set of applications included with or shipped by third parties for
Fedora. <a href=…>More information</a>.

# GNOME – The default desktop environment in Fedora, recommended for new
users. <a href=…>Download now (x86_64 ISO image)</a>
# KDE Plasma – [description] <a href=…>Download now (x86_64 ISO image)</a>
# Xfce – [description] <a href=…>Download now (x86_64 ISO image)</a>
[all other Spins – the whole list should be ordered by decreasing download
counts]

Fedora also offers convenient Labs for some niche use cases, to save you the
trouble of manually installing your niche applications on one of the above
# Astronomy (based on: # KDE Plasma) – [description] <a href=…>Download now
(x86_64 ISO image)</a>
# Design Suite (based on: # GNOME) – [description] <a href=…>Download now
(x86_64 ISO image)</a>
[all other Labs – the whole list should be ordered by decreasing download
counts]

<a name="server"><h2>Server, VPS, Server VM</h2></a>

Fedora for the server: […]

# Server – [description] <a href=…>Download now (x86_64 ISO image)</a>

<a name="container"><h2>Container, Docker, Kubernetes</h2></a>

Fedora for containers: […]

# Silverblue – [description] <a href=…>Download now (x86_64 […] image)</a>

[end mockup]

This mockup can easily be extended with more columns in the hardware table
(and the corresponding linked-to page sections), e.g., a fourth column for ARM
mobile devices.

Kevin Kofler

Re: Editions vs. Spins (was: Re: F30: System-Wide Change proposa

By Matthew Miller at 01/14/2019 - 13:56

On Sat, Jan 12, 2019 at 07:51:34PM +0100, Kevin Kofler wrote:
It's not an artificial distinction. Editions are particular solutions
targeting particular key use cases identified by the Fedora Board (and now
Council). This is different from a desktop Spin, which is focused on
delivering that particular technology, or from Labs, which are focused on
more niche use cases.

Since this is an offshoot of a thread about metrics, I want to emphasize
that by all the metrics we have, this has been *very* successful. Fedora
numbers were flat-to-decreasing when we started this, and now they're
steeply up and growing.

There *are* "some minimum formal requirements". An Edition is a Fedora
solution made by a formal Fedora Working Group in response to a strategic
use case identified by the community through the Fedora Council. The WG
needs formal membership, needs to meet regularly, and needs to have a
regularly-refreshed requirements document.

I really, really, strongly encourage the team behind each spin to advertise
more prominently. The Council is even willing to allocate funds as necessary
to help do that.

Fedora is a Project. That Project makes an operating system platform and
various operating system and platform solutions.


Your "choose your Fedora adventure" page is interesting, but not new. We
talked about this with the design team and they're really not in favor of
that as the primary user experience for people who don't know what they
want. It can be overwhelming and potentially full of traps.

I think it's better to not focus so much on the central page or on the
"getfedora" brochure site, and to instead make the page for each particular
solution more useful and more discoverable.

Re: Editions vs. Spins (was: Re: F30: System-Wide Change proposa

By Kevin Kofler at 01/14/2019 - 16:12

Matthew Miller wrote:
This is a political/marketing distinction and not a technical one.

For Editions vs. Spins, the Editions are in practice all focusing on a
particular technology: Workstation on GNOME, Server on server software, and
Atomic/Silverblue on atomic updates. Workstation in particular attempts to
simultaneously cater to very different "key use cases": web developers,
gamers using proprietary graphics drivers, etc., so it is pretty much a
general-purpose deliverable and not optimized for any particular use case;
the only set point (not up for discussion) that I can see is that it is
based on GNOME.

For Editions vs. Labs, the distinction between a "key" use case and a
"niche" use case is purely subjective. (The only objective distinction that
I see is that the Labs are actually much more tuned to their use cases than
the Editions, which use them mostly as an alibi.) An ordering by decreasing
download count would suffice to make the distinction between "key" and
"niche" purely objectively (and without having to draw a clear line where
"key" ends and "niche" starts).

I can see the point of the distinction between Spins and Labs (at least as a
terminology – the processes are essentially the same for both anyway), but
Editions claim to be use-case-centric like Labs while really being
technology-centric like Spins. So the marketing is pretty deceptive.

But the setup I propose has never been tried. The pre-"Fedora.Next"
iterations of the Fedora download page were also heavily biased towards
GNOME (or "Desktop" as the GNOME-based deliverable used to be called). So
you do not have any usable metrics for comparison.

That is not a formal requirement, it's a subjective committee decision. (See
also what happened when the KDE SIG tried to create a science-centered
Edition based on KDE Plasma, capitalizing on the many scientific KDE
(kdeedu) and Qt applications and on the work done by the KDE Scientific and
KDE Astronomy Labs. The Board/Council was just not interested for purely
political reasons.)

These are reasonable criteria for being listed (though I'd also add some
technical usability criteria, to make sure that the WG is actually producing
a usable deliverable), but they should be the same for all
Spins/Labs/Editions independently of whether the Council subjectively
believes that that particular work deserves being an "Edition" or not.

No amount of advertising we can do is going to be as prominent as the
getfedora download page. All users are driven to that page.

The only option would be to completely rebrand the Spin to an independent
Remix with its own name and domain (so searches for the new name would go
directly to the new domain and not to getfedora), but even then, it would be
very tough to even come close to the brand recognition Fedora has.

Oh no, not the KDE rebranding fiasco here too!

Almost everyone still calls "KDE Plasma" just "KDE", despite all the
insistence that "KDE" is not a particular piece of software (anymore), but a
community. Trying to do the same to the "Fedora" brand is going to flop
exactly the same way.

The Design team is doubly biased in that several key members are involved
with the GNOME community, which:
1. gives them an incentive to promote the GNOME Workstation at the expense
of all other deliverables (conflict of interest), and
2. means they come from an environment where it is desired to offer as few
options as possible. GNOME is well known in the community for hardcoding
everything and reducing configuration options to a minimum.
Together, these biases led to the current design of promoting only the GNOME

And since it was apparently requested from above that the other options also
show up SOMEWHERE, they were hidden with all possible tricks (below the
scrolling horizon, even with grayed-out icons!). The only thing still
missing is the "Beware of the leopard!" sign.

But getfedora is the one discoverable place that all new users are being
focused on.

Kevin Kofler

Re: Editions vs. Spins (was: Re: F30: System-Wide Change proposa

By Matthew Miller at 01/14/2019 - 19:48

On Mon, Jan 14, 2019 at 09:12:51PM +0100, Kevin Kofler wrote:

This is not new. In Mo's blog post about the history of the Fedora logo,
there are separate logos for "Fedora Project" and for "Fedora Core" — the
OS deliverable. Merging Core and Extras into one thing was absolutely the
right thing to do for the project, but not having a unique name for the
resulting OS was a mistake and leads to this. Ah well.

I'm not going to go out of my way to crusade about this by tracking down
people who Say It Wrong On The Internet, but I think as a project we can at
least attempt to be internally consistent, and I think there are huge
benefits in making sure Fedora (the project) isn't tied to one particular

Re: Editions vs. Spins (was: Re: F30: System-Wide Change proposa

By John Harris at 01/14/2019 - 20:58

On Monday, January 14, 2019 6:48:47 PM EST Matthew Miller wrote:
In your opinion, is the purpose of the Fedora Project something other than the
creation and maintenance of the distribution known as Fedora?

Fedora's purpose [was Re: Editions vs. Spins...] DNF UUID)

By Matthew Miller at 01/15/2019 - 12:49

On Mon, Jan 14, 2019 at 07:58:43PM -0500, John Harris wrote:
It has always been broader than that.

Way back in history, the project was created by the merger of the
(short-lived) "Red Hat Linux Project", which had a narrow
distro-producing mission, and, which had the goal of
publishing what it described as "third-party software" _on top of_ that
distribution. When these projects merged, they nominally took on the
Red Hat Linux Project mission, but in practice, the effort remained
wider — for example, the Fedora Legacy effort to provide security
updates for non-Fedora Red Hat Linux 8 and 9.

Take a look at the "Objectives of Fedora" list from back in 2008
<a href=";oldid=2124" title=";oldid=2124">;oldid=2124</a>
... actually this is even older but we lost wiki history from before

Building a distro is *one* of the objectives, but they're not really
all *just* about that. This was reflected in the 2010 mission statement
"to lead the advancement of free and open source software and content
as a collaborative community."

At that time, the above page was expanded (see
<a href=";oldid=157737" title=";oldid=157737">;oldid=157737</a>)
and included these top level things:

* "Creating a Free (as in Freedom) distribution"
* "Building open source software communities"
* "Developing the science and practice of building communities"

When the Council sat down to review this two years ago, we felt like in
some ways the ambition there exceeded our practical *actual* abilities,
and we chose to dial back the scope a bit and to focus on platform
building. (That resulted in the current mission statement: "Fedora
creates an innovative platform for hardware, clouds, and containers
that enables software developers and community members to build
tailored solutions for their users.")

That platform, still, is broader than what is commonly understood as
"the distribution known as Fedora". Notably, it includes EPEL, which,
by the numbers, is used on many more systems than the Fedora OS
distribution itself. (In many ways, I think EPEL is the natural
successor to the part of our heritage.) It also includes
CoreOS, and Silverblue, and the IoT thing (which needs a catchy name).
These are built from the same bits but are in many ways different from
our traditional distribution.

In the future, we should also be open to building and including open
source software in non-RPM formats and considering that all under the
Fedora umbrella as well. If we scoped our mission to just maintaining
the distro as it exists today, we might not feel like that's even
possible. We shouldn't limit ourselves in that way.

Likewise, Kevin mentioned earlier in this thread the view that "the
Fedora user base is clearly desktop-centric". I don't think that's
actually completely true (see EPEL and CoreOS, but also the many
people who came into the project from a server/sysadmin background).
But even if we take it as true, that view leads inevitably to an
overall narrowing. It's only a small step to "the Fedora user base is
clearly GNOME-centric" and so on down.

Now, we *could* take the strategic direction that it'd be better to cut
all that stuff away and really focus on making that desktop GNOME OS
_only_. We could probably do that very successfully, even. But to me
that doesn't feel right for the project's heritage. We've decided to go
the other direction. Rather than picking one narrow thing and saying
"this OS offering *is* Fedora", we want to enable *lots* of different
solutions and offerings under the Fedora Project umbrella. If an idea
fits with our core values and you want to work on it in Fedora,

So, although I disagree about Editions and the web site,
I *do* want to see Fedora KDE Plasma Desktop, Fedora Cinnamon Desktop,
Fedora Astronomy Lab, Fedora Jam, and all the rest get more promotion
and support. That's totally within the project's mission, and I
totally support the teams behind those efforts. The Mindshare Committee
and the Council can't promise *people* to do things, but we can
allocate funds to help drive towards subproject goals.

Re: Editions vs. Spins

By J.C. Cleaver at 01/14/2019 - 22:28

On 1/14/2019 4:58 PM, John Harris wrote:
For better or worse, EPEL is under the aegis of the Fedora Project.

If "Fedora Workstation" became "Fedora", and the Fedora Project became
something... else... , which reflected a broader purpose and set of
stakeholders, I suspect a good amount of tension on cadence and policy
might be resolved.


Re: Editions vs. Spins (was: Re: F30: System-Wide Change proposa

By John Harris at 01/14/2019 - 14:05

On Monday, January 14, 2019 12:56:30 PM EST Matthew Miller wrote:
The easiest way to make any of the Spins more accessible, for them to have any
chance comparable to the prominent advertising of Workstation and similar
options, would be to make them more prominent on the "getfedora" index. This
would also have a huge effect on SEO.

Right now, in DuckDuckGo:

"download fedora" returns: <a href="" title=""></a>
first, and <a href="" title=""></a> next.

"get fedora" returns <a href="" title=""></a> first, and <a href="" title=""></a>
en/workstation/download/ next.

Re: Editions vs. Spins (was: Re: F30: System-Wide Change proposa

By Michael Catanzaro at 01/14/2019 - 19:19

On Mon, Jan 14, 2019 at 12:05 PM, John Harris < ... at splentity dot com>
So the reason spins are not very visible -- and ought to stay not very
visible -- is that they don't get the same level of attention as the
main products, and we don't really want anybody to download those
unless they know in advance what they are doing. In particular, we
really don't want Fedora to be judged by the quality of its spins and
labs. There are a lot of them, and it's just not plausible to keep up
with quality control for every one.

The Plasma spin is perhaps an exception here. I could totally see that
one being elevated to the level of Fedora product: "Fedora Plasma" or
something like that. I wouldn't really mind having two desktop
products, myself. We just can't create Fedora products for every single
desktop out there, or the download page is going to become way too hard
to navigate, and users will become less likely to wind up with the
versions of Fedora that we want to promote. So if we promote KDE to a
product, I'd say we'd have to draw the line there, and I'd argue that
would make sense due to KDE's outsized importance to the Fedora
community relative to other spins, and the QA it already receives
(especially its blocker bug eligibility). I assume we fear branding
difficulties if we have multiple UIs for Fedora? Perhaps it'd be a huge
mistake. But the potential benefits of attracting more KDE users and
developers to Fedora might well outweigh the cost! It's at least worth
seriously considering.


Re: Editions vs. Spins (was: Re: F30: System-Wide Change proposa

By Zbigniew Jędrzejewski-Szmek at 01/13/2019 - 14:19

On Sat, Jan 12, 2019 at 07:51:34PM +0100, Kevin Kofler wrote:
The issue of which spins/editions are promoted is orthogonal to the
issue of counting. After all, counting just reflects the actual
frequency of installations, not the reasons for it.

But counting may provide a fresh look at this issue. We'll have much
better data on which spins/editions are used. If it turns out that KDE is
more popular than previous statistics showed, or that KDE has a higher
retention rate (the number of short-lived installations is low
suggesting that users "like it if they see it"), this would be a
strong argument to make the KDE spin more visible.
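
As a rough illustration of the retention idea above, here is a minimal Python sketch. It assumes each report can be sorted into a hypothetical "new" bucket (first week after install) or "ongoing" bucket (any later week), tallied per spin. The function name, the dictionary layout, and the numbers are all invented for the example; nothing here reflects how Fedora's mirror statistics are actually stored.

```python
def ongoing_share(new_reports: int, ongoing_reports: int) -> float:
    """Fraction of a week's reports that came from installs older than one week."""
    total = new_reports + ongoing_reports
    if total == 0:
        raise ValueError("no reports for this spin")
    return ongoing_reports / total


# Made-up tallies for illustration only; these are not real Fedora statistics.
weekly_reports = {
    "spin-a": {"new": 25, "ongoing": 75},
    "spin-b": {"new": 60, "ongoing": 40},
}

# A higher share suggests fewer short-lived installations for that spin.
shares = {
    name: ongoing_share(counts["new"], counts["ongoing"])
    for name, counts in weekly_reports.items()
}
```

A single week's share is only a crude proxy for retention, since it also depends on how many new installs happened that week; comparing it across several weeks would be more informative.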


Re: Editions vs. Spins

By J.C. Cleaver at 01/14/2019 - 14:47

On 1/13/2019 10:19 AM, Zbigniew Jędrzejewski-Szmek wrote:
Relying on use-data here seems counter-intuitive. Shouldn't the decision
be made based on project first principles or community goals first, with
visibility and marketing effort being put in subsequent to that to meet
those goals?


Re: Editions vs. Spins

By Dan Book at 01/14/2019 - 14:52

On Mon, Jan 14, 2019 at 1:48 PM Japheth Cleaver < ... at terabithia dot org>

It is important for the community/project goals to align with what people
actually want.


Editions vs. Spins (was: Re: F30: System-Wide Change proposal: D

By Kevin Kofler at 01/12/2019 - 09:37

John Harris wrote:

This pointless artificial distinction between "Editions" and "Spins" needs
to stop (because there is no technical difference whatsoever between the 2
concepts), as does the unfair advertising ("Editions" as shiny logos above
the scrolling horizon and with one-click links directly to the ISO vs.
"Spins" hidden beyond the scrolling horizon, grayed out, with no names and
no description, and requiring at least 2 clicks to get them). But the people
in power still refuse to do anything about it.

The grayed out logos are particularly outrageous because they are doing to
the upstream logos exactly the kind of things explicitly forbidden in the
Fedora logo guidelines (changing the colors and even reducing them to 2).
(It so happens that the new Fedora logo will likely allow this kind of
usage, but have you ever asked the upstreams whether THEY are OK with those
unilateral changes to their logos?) I really don't see why, whereas all
other icons on get.fp.o are colored, the ones for the Spins (and ONLY those)
have to be grayed out. Yet <a href="" title=""></a> was closed as
"fixed" without any actual fix having been deployed, ever.

I also find it funny that the argument for the one-click direct ISO download
for GNOME "Workstation" (or formerly "Desktop") has always been that choices
confuse users. But now there is a "Workstation"/"Server"/"Atomic" choice.
While "Workstation" vs. "Server" is something that makes sense to most
users, "Atomic" is definitely not (and the description full of technical
jargon such as "Docker" and "Kubernetes" won't help either). Yet, Fedora
still refuses to show the full list of choices there and shows only those 3
arbitrarily picked ones.

Kevin Kofler

Re: F30: System-Wide Change proposal: DNF UUID

By Matthew Miller at 01/08/2019 - 11:49

On Tue, Jan 08, 2019 at 04:22:39PM +0100, Lennart Poettering wrote:
I do like this idea!

And, if there's not an associated UUID, it's more comfortable to do
"countme=2" the second week and onward -- this would make it easy to
distinguish systems which are short-lived. (Or "countme=new" and
"countme=ongoing" or something?)

Hmmmm. How comfortable would people be with reporting an incrementing count
*every* week (again, without a UUID attached)? That'd give a new axis into
the data which I can imagine being quite useful.
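
To make the two variants discussed above concrete, here is a minimal Python sketch of how a client might derive the value to report without any UUID. It assumes the client can determine its own install date locally (how it does so is left open), and the function names `countme_value` and `countme_bucket` are hypothetical, not part of DNF.

```python
from datetime import date


def countme_value(install_date: date, today: date) -> int:
    """Incrementing variant: 1 during the first week after install, then 2, 3, ..."""
    if today < install_date:
        raise ValueError("today is before the install date")
    weeks_elapsed = (today - install_date).days // 7
    return weeks_elapsed + 1


def countme_bucket(install_date: date, today: date) -> str:
    """Two-bucket variant: "new" in the first week, "ongoing" afterwards."""
    return "new" if countme_value(install_date, today) == 1 else "ongoing"
```

For example, a system installed on 2019-01-01 would report 1 through 2019-01-07 and 2 starting 2019-01-08. Only the week count leaves the machine; the install date itself is never sent.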