DevHeads.net

Removal of GCC from the buildroot

As per Changes/Remove GCC from BuildRoot
<https://fedoraproject.org/wiki/Changes/Remove_GCC_from_BuildRoot>, I'm
going to automatically add BuildRequires: gcc and/or BuildRequires: gcc-c++
to packages which fail to build with common error messages (such as "gcc:
command not found"; autotools, cmake and meson failures are also supported).

I'm going to do this tomorrow.

After that, I'm going to ask rel-eng to finally remove GCC from the buildroot.
This will happen before the mass rebuild. Stay tuned.
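
As a rough illustration of the kind of automated spec change described above, the following Python sketch adds a BuildRequires line to the text of a spec file if it is not already there. It is not the script that was actually used. The function name `add_build_requires` and the placement rule (after the last existing BuildRequires tag, otherwise just before %description) are assumptions made for this example, and BuildRequires hidden inside conditionals or macros are not handled.

```python
import re

# Matches a BuildRequires tag and captures everything after the colon.
_BR_TAG = re.compile(r"^BuildRequires:\s*(.+)$", re.IGNORECASE)


def add_build_requires(spec_text: str, package: str) -> str:
    """Return spec_text with 'BuildRequires: <package>' added if missing.

    Hypothetical helper, for illustration only.
    """
    lines = spec_text.splitlines()
    last_br = None
    for index, line in enumerate(lines):
        match = _BR_TAG.match(line)
        if match is None:
            continue
        # One tag may list several dependencies, separated by commas or spaces.
        if package in re.split(r"[,\s]+", match.group(1).strip()):
            return spec_text  # already declared, nothing to do
        last_br = index

    new_line = f"BuildRequires:  {package}"
    if last_br is not None:
        position = last_br + 1
    else:
        # No BuildRequires yet: fall back to just before %description,
        # or to the end of the file if that section is absent too.
        position = next(
            (i for i, line in enumerate(lines) if line.startswith("%description")),
            len(lines),
        )
    lines.insert(position, new_line)
    result = "\n".join(lines)
    return result + "\n" if spec_text.endswith("\n") else result
```

Calling the function a second time with the same package returns the text unchanged, so re-running it over already fixed packages is harmless.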

Comments

Re: Removal of GCC from the buildroot

By Michael Catanzaro at 07/20/2018 - 07:42

On Sun, Jul 8, 2018 at 1:46 PM, Igor Gnatenko
< ... at fedoraproject dot org> wrote:
I just got four bug reports for Vala projects that are failing due to
missing GCC:

https://bugzilla.redhat.com/show_bug.cgi?id=1603972
https://bugzilla.redhat.com/show_bug.cgi?id=1604143
https://bugzilla.redhat.com/show_bug.cgi?id=1604150
https://bugzilla.redhat.com/show_bug.cgi?id=1604352

The error message is:

configure: error: in `/builddir/build/BUILD/five-or-more-3.28.0':
configure: error: no acceptable C compiler found in $PATH

Michael

Re: Removal of GCC from the buildroot

By Michael Catanzaro at 07/23/2018 - 10:03

Another one, this time without any Vala:

https://bugzilla.redhat.com/show_bug.cgi?id=1606043

Re: Removal of GCC from the buildroot

By Igor Gnatenko at 07/23/2018 - 11:22

Thanks a lot for your input! I'm going to block the Change tracking bug and
fix them automatically within a few days.

https://bugzilla.redhat.com/show_bug.cgi?id=1606043

Re: Removal of GCC from the buildroot

By =?UTF-8?B?TWlyb... at 07/13/2018 - 05:19

On 8.7.2018 20:46, Igor Gnatenko wrote:
I've clicked randomly through failures during the mass rebuild at [1].

I see quite a lot of "command not found" errors for gcc, cc, c++...

I think the maintainers should add them and that's fine, but as I understood
it, you said you would add those as part of this change. Did it happen?

[1] https://kojipkgs.fedoraproject.org/mass-rebuild/f29-failures.html

Removal of GCC from the buildroot

By R P Herrold at 07/13/2018 - 13:45

This list seems to only cover packages starting with an
uppercase letter, or a letter before lowercase 'i'

also, it only lists one maintainer, and omits co-maintainers

Would it be possible for a full list to be produced, and once
done, mentioned here?

thank you

-- Russ herrold

Re: Removal of GCC from the buildroot

By Jason L Tibbitts III at 07/13/2018 - 14:19

RPH> This list seems to only cover packages starting with an uppercase
RPH> letter, or a letter before lowercase 'i'

Well, the list is incomplete because the mass rebuild is not complete.
Upper case letters and digits were submitted first. Currently packages
in the "perl" range are being submitted and with the exception of a few
which seem to have hung, most builds through the 'e' range have
completed.

So you will have to wait a while longer if you insist on having a
complete list of failures.

- J<

Re: Removal of GCC from the buildroot

By Igor Gnatenko at 07/13/2018 - 06:39

Yes, I've pushed over 2k commits adding those, however the regexp might not
have caught all possible cases. I would appreciate it if you would link such
packages so that I can fix them. Or maintainers can do it themselves.

[1] https://kojipkgs.fedoraproject.org/mass-rebuild/f29-failures.html

Re: Removal of GCC from the buildroot

By =?ISO-8859-1?Q?... at 07/16/2018 - 13:09

On Fri, 2018-07-13 at 12:39 +0200, Igor Gnatenko wrote:
releng's debconf-1.5.63-4.fc29 failed to build
man2html-1.6-22.g.fc29
noip-2.1.9-26.fc29
p7zip-16.02-13.fc29 failed to build
perl-File-FcntlLock-0.22-13.fc29
perl-Mail-Transport-Dbx
pngquant-2.12.1-2.fc29
subdownloader-2.0.19-8.fc29
python-bitarray-0.8.3-2.fc29
rawstudio-2.1-0.19.20170414.g003dd4f_rawspeed.20161119.gfa23d1c.fc29
virtualbox-guest-additions-5.2.14-2.fc29
tetrinetx-1.13.16-21.fc29

I already fixed unar and dpkg. If you fix some of those, I'll be
grateful.

Re: Removal of GCC from the buildroot

By Till Maas at 07/16/2018 - 09:36

Can you maybe add:

g?cc: [Cc]ommand not found

to the script, re-grep, and fix the resulting packages? I just checked 3
of 12 of my failed pkgs and they had error messages like the following:

| make[1]: gcc: Command not found
| sh: gcc: command not found
| make: cc: Command not found
|
| https://kojipkgs.fedoraproject.org//work/tasks/215/28320215/build.log
| https://kojipkgs.fedoraproject.org//work/tasks/9137/28229137/build.log
| https://kojipkgs.fedoraproject.org//work/tasks/4199/28314199/build.log

If you have the tools ready, it would make it easier to just re-run it
IMHO.
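
A log check along these lines could look like the Python sketch below. It is only an illustration, not the script under discussion: the function name `missing_compilers` is made up, the C pattern is written with `[Cc]ommand` so that both capitalizations seen in the quoted logs match, and the C++ and configure patterns are assumptions based on other messages quoted in this thread ("g++: command not found", "no acceptable C compiler found").

```python
import re

# Messages that point at a missing C compiler.
C_PATTERNS = [
    re.compile(r"\bg?cc: [Cc]ommand not found"),
    re.compile(r"no acceptable C compiler found"),
]

# Messages that point at a missing C++ compiler.
CXX_PATTERNS = [
    re.compile(r"\b[gc]\+\+: [Cc]ommand not found"),
]


def missing_compilers(log_text: str) -> set:
    """Return the BuildRequires a build log suggests are missing."""
    needed = set()
    if any(p.search(log_text) for p in C_PATTERNS):
        needed.add("gcc")
    if any(p.search(log_text) for p in CXX_PATTERNS):
        needed.add("gcc-c++")
    return needed
```

Fed the three log lines quoted above, the function reports gcc for each of them; a log containing only the g++ message would report gcc-c++ instead.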

Kind regards
Till

Re: Removal of GCC from the buildroot

By Federico Bruni at 07/16/2018 - 01:11

On Fri, 13 Jul 2018 at 12:39, Igor Gnatenko
< ... at fedoraproject dot org> wrote:
I've just added and pushed the needed BuildRequires for my package
(extractpdfmark).
I didn't bump the release version. Will you do it when you make a new
mass rebuild?

I see that on the 3rd of July the build was successful (despite the missing
gcc-c++ requirement):
https://koji.fedoraproject.org/koji/buildinfo?buildID=1102851

Does it mean that the change in koji was implemented only recently?

Re: Removal of GCC from the buildroot

By Igor Gnatenko at 07/16/2018 - 04:51

Yes, it was implemented on the 10th or thereabouts. You need to bump the
release and rebuild as usual.

Re: Removal of GCC from the buildroot

By Rolf Fokkens at 07/14/2018 - 08:16

This bit the bcache-tools package too, which I fixed.

On 07/13/2018 12:39 PM, Igor Gnatenko wrote:

Re: Removal of GCC from the buildroot

By =?UTF-8?B?TWlyb... at 07/13/2018 - 08:52

On 13.7.2018 12:39, Igor Gnatenko wrote:
Sorry, it was just random browsing and I cannot seem to find them again
except bionetgen, which Zbyszek already took care of.

Re: Removal of GCC from the buildroot

By Zbigniew Jędrzejewski-Szmek at 07/13/2018 - 07:34

On Fri, Jul 13, 2018 at 12:39:55PM +0200, Igor Gnatenko wrote:
bionetgen was one. It was failing with "/bin/sh: g++: command not found".
It is my package, I took care of that already. (Now it's failing on something
unrelated.)

Zbyszek

Re: Removal of GCC from the buildroot

By Kevin Kofler at 07/10/2018 - 11:44

Igor Gnatenko wrote:
I still think that this change is absolutely counterproductive, because it
will actually INCREASE local mock build times for all C/C++ programs for all
packagers, because gcc and gcc-c++ will no longer be included in the root
cache.

It is also yet another pointless mass change to a huge number of packages,
right after the %defattr one.

Kevin Kofler

Re: Removal of GCC from the buildroot

By Igor Gnatenko at 07/10/2018 - 12:03

However, it will DECREASE local mock build times for all non-C/C++
programs. And now we will know which packages actually need C and/or C++
compiler.

A lot of packages in 2018 are not written in C/C++, welcome to the XXI century!

It came up multiple times and we are pretty much in agreement that we *need*
such cleanups.

Re: Removal of GCC from the buildroot

By Kevin Kofler at 07/11/2018 - 10:37

Igor Gnatenko wrote:
… and this is the problem that needs fixing.

It is just a PITA to have packages dragging in more and more interpreters
and/or language runtimes. The slowness and lack of compile-time type safety
of interpreted languages are also a big problem.

Kevin Kofler

Re: Removal of GCC from the buildroot

By Josh Stone at 07/11/2018 - 12:26

On 07/11/2018 07:37 AM, Kevin Kofler wrote:
(donning a Rust Evangelism cape)
So I hear you like compile-time safety...

No, I don't seriously want to get into a language comparison here,
except to say that it's reasonable for the world to expand beyond C/C++,
even for compiled languages.

And back on topic, rustc currently requires cc as a linker anyway.

Re: Removal of GCC from the buildroot

By Jan Kratochvil at 07/11/2018 - 13:01

On Wed, 11 Jul 2018 18:26:23 +0200, Josh Stone wrote:
There is no C/C++ language. There are two orthogonal languages, C and C++.
(And some people say C++11 and C++03 are also orthogonal.)

Jan Kratochvil

Re: Removal of GCC from the buildroot

By Kevin Kofler at 07/11/2018 - 13:16

Jan Kratochvil wrote:
Yes, C and C++ are divergent languages (I wouldn't call them "orthogonal",
but they are definitely different things), but gcc-c++ currently Requires
gcc, so if we have C++ support in the default buildroot (which I think we
should), we automatically also have C support.

Kevin Kofler

Re: Removal of GCC from the buildroot

By Josh Stone at 07/11/2018 - 13:10

On 07/11/2018 10:01 AM, Jan Kratochvil wrote:
If you're going to be pedantic, know that "/" can be shorthand for "or":
https://en.wikipedia.org/wiki/Slash_(punctuation)#Connecting_alternatives

Re: Removal of GCC from the buildroot

By Adam Williamson at 07/11/2018 - 12:19

On Wed, 2018-07-11 at 16:37 +0200, Kevin Kofler wrote:
Unless you think Fedora can somehow "fix" this "problem", then whether
you think it's a "problem" or not, it's the reality of the world Fedora
lives in.

Re: Removal of GCC from the buildroot

By Zbigniew Jędrzejewski-Szmek at 07/10/2018 - 16:13

On Tue, Jul 10, 2018 at 06:03:33PM +0200, Igor Gnatenko wrote:
Yes.

Also, we'll have a mass rebuild tomorrow. If it turns out to be slower
than the previous one, we can easily re-add gcc to the koji buildroot.

Zbyszek

Re: Removal of GCC from the buildroot

By Andrew Lutomirski at 07/11/2018 - 12:37

On Tue, Jul 10, 2018 at 1:13 PM, Zbigniew Jędrzejewski-Szmek <

From a design perspective, minimizing the contents of the buildroot is a
good idea, I think, but I think it would be great if the runtime
installation of dependencies during the package build process were sped up
dramatically.

(Hmm. Some future version of rpm/dnf could get really fancy and *reflink*
package contents into the build chroot rather than untarring them every
time.)

Re: Removal of GCC from the buildroot

By Mikolaj Izdebski at 07/11/2018 - 13:08

On 07/11/2018 06:37 PM, Andrew Lutomirski wrote:
Koji gets repodata and packages from HTTP servers, through caching
proxies located in the same datacenters as builders. Most often used
packages are cached in memory, so download speeds are not a problem. At
least for non-s390x builders. Accessing packages directly from NFS would
be slower.

The slowest part of setting up the chroot is writing packages to disk,
synchronously. This part can be sped up a lot by enabling nosync in
site-defaults.cfg mock config on Koji builders, setting cache=unsafe on
kvm buildvms, or both. These settings are safe because builders upload
all results to hubs upon task completion. With these settings chroot
setup can take about 30 seconds.
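
For reference, mock configuration files are Python, so the nosync switch is a one-line assignment. The fragment below is a minimal sketch that assumes the `nosync` option name as it appears in the site-defaults.cfg shipped with mock; check the comments in your installed copy before relying on it. The first assignment is only a stand-in so that the fragment runs on its own.

```python
# Stand-in so this fragment can run by itself. mock provides config_opts
# when it loads the real file, so leave this line out of site-defaults.cfg.
config_opts = {}

# Ask mock to use the nosync library when it runs the package manager
# to populate the chroot. The nosync package must be installed on the host.
config_opts['nosync'] = True
```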

Once this is optimized, another slow part is loading repodata into
memory - uncompressing it, parsing and creating internal libsolv data
structures. This could be sped up by including solv/solvx files in
repodata, but I think that would require some code changes.

Re: Removal of GCC from the buildroot

By Andrew Lutomirski at 07/11/2018 - 13:31

On Wed, Jul 11, 2018 at 10:08 AM, Mikolaj Izdebski < ... at redhat dot com> wrote:
I wonder if the time taken to decompress everything is relevant.
Fedora currently uses xz, which isn't so fast. zchunk is zstd under
the hood, which should be a lot faster to decompress, especially on ARM
builders.

I don't suppose this could get done?

Hmm. On my system, there are lots of .solv and .solvx files in
/var/cache/dnf. I wonder if it would be straightforward to have a
daily job that updates the builder filesystem by just having dnf
refresh metadata and generate the .solv/.solvx files? There wouldn't
be any dnf changes needed AFAICT -- just some management on the
builder infrastructure. This would at least avoid a bunch of
duplicate work on most builds.

Re: Removal of GCC from the buildroot

By Mikolaj Izdebski at 07/11/2018 - 13:53

On 07/11/2018 07:31 PM, Andrew Lutomirski wrote:
Repodata consumed by dnf is gzip-compressed, which is quickly
decompressible. But decompression is done in the same thread as XML
parsing and creating pool data structures, so it affects repodata
loading times to some degree.
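
To make the "same thread" point concrete, here is a small Python sketch in which decompression and XML parsing are interleaved in a single loop, so every decompressed chunk is paid for on the parsing thread. It is only an analogy: dnf and libsolv are not written in Python, and the tiny `<metadata>/<package>/<name>` document in the usage note is a made-up stand-in for real repodata.

```python
import gzip
import io
import xml.etree.ElementTree as ET


def load_package_names(gz_bytes: bytes) -> list:
    """Stream-parse gzip-compressed XML and collect <name> element texts.

    Decompression happens lazily as the parser pulls data from the stream,
    so both steps run in the calling thread.
    """
    names = []
    with gzip.open(io.BytesIO(gz_bytes), "rb") as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            if elem.tag == "name":
                names.append(elem.text)
            elem.clear()  # drop parsed content to keep memory use flat
    return names
```

For example, `load_package_names(gzip.compress(b"<metadata><package><name>gcc</name></package></metadata>"))` yields a one-element list containing `gcc`.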

I proposed this a few years ago, but the answer was "no".

That wouldn't save much time (and would still require Koji code changes
as dnf uses different cache directories for each task). Just like
caching chroots is not effective, so Koji disables it. Most repos simply
change too often, and there are a lot (over 150) of builders. What would
help is generating solv/solvx during repo generation - builders would
download them and load very quickly. But that requires code changes and
would only save a few seconds per build.

Re: Removal of GCC from the buildroot

By Kevin Fenzi at 07/11/2018 - 15:26

On 07/11/2018 10:53 AM, Mikolaj Izdebski wrote:
I think the reason why releng didn't want to do that is because we don't
want to trade speed for reliability. True, we don't care if a machine
crashes in the middle of a build (because another one will take it after
the crashed one comes back), but we don't want to change anything that
might affect the actual build artifacts.

So, are we sure that nosync (disabling all fsync calls) doesn't change
the builds being made? What about test suites for packages that
specifically call fsync? They would always pass even if there was a
problem? We could try this in staging I suppose and have koschei run a
ton of builds to see what breaks...

I don't see the cache=unsafe anywhere (although the name sure makes me
want to enable it for official builds let me tell ya. ;) Can you point
out more closely where it is or docs for it?

kevin

Re: Removal of GCC from the buildroot

By Zbigniew Jędrzejewski-Szmek at 07/11/2018 - 16:27

On Wed, Jul 11, 2018 at 12:26:01PM -0700, Kevin Fenzi wrote:
The effects of fsync are impossible to see unless you hard-reboot the
machine. (OK, strictly speaking, you can time the fsync call, but let's
ignore that). I'd be more worried about some side-effects of the way
that nosync is implemented with a LD_PRELOAD. I wonder if it wouldn't be
more robust to use nspawn's syscall filter to filter the fsync calls.
(If nspawn is already used by koji, not sure.)

Zbyszek

Re: Removal of GCC from the buildroot

By Mikolaj Izdebski at 07/12/2018 - 12:00

On 07/11/2018 10:27 PM, Zbigniew Jędrzejewski-Szmek wrote:
Koji does not use systemd-nspawn. It uses plain old chroot.

Re: Removal of GCC from the buildroot

By =?ISO-8859-1?Q?... at 07/19/2018 - 08:58

On 12.7.2018 at 18:00, Mikolaj Izdebski wrote:
Actually, it would be nice if somebody wanted to implement this for us
who use mock with systemd-nspawn :)

V.

Re: Removal of GCC from the buildroot

By Petr Pisar at 07/12/2018 - 03:32

On 2018-07-11, Zbigniew Jędrzejewski-Szmek < ... at in dot waw.pl> wrote:
Are you sure non-fsynced changes are guaranteed to be visible on
block cache level? E.g. if you mix read/write and mmaped I/O from
different processes?

Can the syscall filter fake a success of the syscall return value?
Correctly written applications check fsync() return value and forward
the error.
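
To illustrate what checking the result of fsync() looks like in practice, here is a minimal Python sketch; the helper name `write_durably` is hypothetical. In Python a failing fsync does not return an error code but raises OSError, so a filter that made the call fail (for example with EPERM) would surface as an exception in a program written this way.

```python
import os


def write_durably(path: str, data: bytes) -> None:
    """Write data to path and ask the kernel to push it to storage.

    Any failure reported by fsync() propagates to the caller as OSError.
    """
    with open(path, "wb") as handle:
        handle.write(data)
        handle.flush()             # move Python's buffer into the kernel
        os.fsync(handle.fileno())  # raises OSError if the kernel reports failure
```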

-- Petr

Re: Removal of GCC from the buildroot

By Chris Adams at 07/12/2018 - 08:54

Once upon a time, Petr Pisar < ... at redhat dot com> said:
fsync() has nothing to do with that - it is purely a request to push the
buffer to disk. There is nothing defined about fsync() that would
affect inter-process I/O.

http://pubs.opengroup.org/onlinepubs/009695299/functions/fsync.html

Re: Removal of GCC from the buildroot

By Simo Sorce at 07/12/2018 - 07:26

On Thu, 2018-07-12 at 07:32 +0000, Petr Pisar wrote:
In Linux, file writes and memory writes all hit the unified page cache,
so there is no difference at all; only direct I/O skips the page cache
IIRC (but it should also invalidate it, so again no issues for
applications).

fsync only really makes sure that what's in memory is pushed down to
disk and is safely on permanent storage (which is a lie with some
storage, but that is a different problem).

No, nspawn's filter just uses seccomp filters, which return
EINVAL/EPERM (IIRC) on blocked arguments/syscalls.

So it is indeed not appropriate to use nspawn's filters to block
fsync().

Simo.

Re: Removal of GCC from the buildroot

By Andrew Lutomirski at 07/13/2018 - 14:47

Seccomp can be used to block a syscall and fake a return value of 0
(success) or any error code chosen by the filter. I assume systemd
exposes this functionality.

Re: Removal of GCC from the buildroot

By Zbigniew Jędrzejewski-Szmek at 07/12/2018 - 07:22

On Thu, Jul 12, 2018 at 07:32:19AM +0000, Petr Pisar wrote:
Block cache — no, I don't think so. But do we have packages that do
anything like this during build? It'd require low-level fs support
and would be probably pretty fragile anyway.

It can, e.g. something like systemd-nspawn --system-call-filter='~sync:0 fsync:0'
should be a good start.

Zbyszek

Re: Removal of GCC from the buildroot

By Zbigniew Jędrzejewski-Szmek at 07/11/2018 - 16:29

On Wed, Jul 11, 2018 at 08:27:22PM +0000, Zbigniew Jędrzejewski-Szmek wrote:
Oh, I saw Mikołaj's answer just now. So yeah, if nosync is only used
for dnf then we should really enable it by default.

Zbyszek

Re: Removal of GCC from the buildroot

By Mikolaj Izdebski at 07/11/2018 - 15:57

On 07/11/2018 09:26 PM, Kevin Fenzi wrote:
nosync is used by mock only for running dnf(/yum). It's not used for
rpmbuild nor runroot, so it won't affect package tests. It could
theoretically affect scriptlets run during package installation, but I've
been using nosync in all my Koji instances for a few years and I didn't
see any problems. Nosync is used in Copr and I didn't get any reports
about it breaking anything. Recently, to test the change in subject,
Igor Gnatenko did a few Fedora rebuilds in a Koji set up by me, of course
with nosync enabled, and I didn't see any problems related to nosync either.

I would really like that.

cache=unsafe is documented at [1]. (Basically, in virt_install_command
you append ",cache=unsafe" to --disk parameter, next to "bus=virtio".)
It makes buildvmhost cache all disk operations and ignore sync
operations. It is similar to nosync, but it does not work on buildhw, works
at the virthost level, and applies to all operations, not just dnf.

[1]
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html-single/virtualization_tuning_and_optimization_guide/index#sect-Virtualization_Tuning_Optimization_Guide-BlockIO-Caching

Re: Removal of GCC from the buildroot

By Kevin Fenzi at 07/11/2018 - 16:37

On 07/11/2018 12:57 PM, Mikolaj Izdebski wrote:
I'd say open a releng ticket on it and we can track it there?
This sounds like it might be worth doing...

Ah, I see, at the VM level. Yeah, I don't think this would be very much
of a win for us. The x86_64 buildvms have all their storage on iSCSI,
the arm ones have their storage on SSDs. I suppose it could help the
ppc64{le} ones, they are on 10k SAS drives. I'm pretty leery of enabling
anything called 'unsafe' though.

kevin

Re: Removal of GCC from the buildroot

By Cole Robinson at 07/12/2018 - 14:10

On 07/11/2018 04:37 PM, Kevin Fenzi wrote:
I think it's unsafe only in the case of on-disk consistency, so across
VM reboots. I _think_ over a single run of a VM it's safe, which may
describe koji usage.

I know rjones has looked deeply at qemu caching methods for use in
libguestfs so maybe he can comment, CC'd

- Cole

Re: Removal of GCC from the buildroot

By Richard W.M. Jones at 07/12/2018 - 16:17

On Thu, Jul 12, 2018 at 02:10:37PM -0400, Cole Robinson wrote:
I cover caching modes about half way down here:

https://rwmj.wordpress.com/2013/09/02/new-in-libguestfs-allow-cache-mode-to-be-selected/

First off, cache=unsafe really does improve performance greatly, I
measured around 25% on a disk-heavy workload.

Does each build start with its own fresh VM? Do you care about the
data in that build VM if either qemu or the host crashes? If the
answers are 'Yes' and 'No' respectively to these questions then IMHO
this is the ideal situation for cache=unsafe.

The caveats:

If qemu or the host crashes, the disk image underlying these VMs will
(like 99.9% certainty) be corrupted. Even 'sync' inside the VM will
not do what you expect, it is just ignored. It's NOT a good idea on
VMs which are used for long periods when the host might reboot during
that time. It's NOT a good idea if you deeply care about the data in
the disk image.

It should only be used when the VM data can be recreated from scratch.

In libguestfs we use cachemode.*unsafe in a few places, carefully
chosen, when the above conditions apply.
https://github.com/libguestfs/libguestfs/search?q=cachemode+unsafe&unscoped_q=cachemode+unsafe

Rich.

Re: Removal of GCC from the buildroot

By Daniel P. Berrange at 07/16/2018 - 12:10

On Thu, Jul 12, 2018 at 09:17:41PM +0100, Richard W.M. Jones wrote:
FYI to augment what Rich's blog post says, it helps to understand the
difference between cache modes. The QEMU 'cache' setting actually
controls 3 separate tunables under the hood:

             │ cache.writeback   cache.direct   cache.no-flush
─────────────┼─────────────────────────────────────────────────
writeback    │ on                off            off
none         │ on                on             off
writethrough │ off               off            off
directsync   │ off               on             off
unsafe       │ on                off            on

IOW, changing from cache=none to cache=unsafe turns off O_DIRECT so data
is buffered in host RAM, and also turns off disk flushing, so QEMU never
requests it to be pushed out to disk. The latter change is what makes
it so catastrophic on host failure - even a journalling filesystem in
the guest won't save you because we're ignoring the flush requests that
are required to make the journal work safely.
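
The table above can be written down as a small lookup, which makes it easy to ask which modes drop flush requests or bypass the host page cache. This Python sketch just restates the table; the helper names are invented for the example and are not part of QEMU or libvirt.

```python
# The three tunables behind each QEMU cache mode, as listed in the table.
CACHE_MODES = {
    "writeback":    {"cache.writeback": True,  "cache.direct": False, "cache.no-flush": False},
    "none":         {"cache.writeback": True,  "cache.direct": True,  "cache.no-flush": False},
    "writethrough": {"cache.writeback": False, "cache.direct": False, "cache.no-flush": False},
    "directsync":   {"cache.writeback": False, "cache.direct": True,  "cache.no-flush": False},
    "unsafe":       {"cache.writeback": True,  "cache.direct": False, "cache.no-flush": True},
}


def ignores_flush(mode: str) -> bool:
    """True if guest flush requests are dropped in this mode."""
    return CACHE_MODES[mode]["cache.no-flush"]


def uses_host_page_cache(mode: str) -> bool:
    """True if I/O is buffered in host RAM (O_DIRECT is not used)."""
    return not CACHE_MODES[mode]["cache.direct"]
```

Only "unsafe" ignores flushes, and moving from "none" to "unsafe" also switches the host page cache back on, which is the combination described above.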

The combination of not using O_DIRECT and not honouring flush requests
means that all I/O operations on the guest complete pretty much immediately
without ever waiting for the host to do the real I/O.

The amount of RAM you have in the host though is pretty relevant here.
If the guest is doing I/O faster than the host OS can write it to disk
and there are never any flush requests to slow the guest down, you're
going to use an ever increasing amount of host RAM for caching I/O.
This could be a bad thing if you're contending on host RAM - it could
even push other important guests out to swap or trigger the OOM killer.

IOW, using O_DIRECT (cache=none or directsync) is a good thing if you
need predictable host RAM usage - the only RAM used for I/O cache is
that assigned to the guest OS itself.

With using cache=unsafe for Koji I'd be a little concerned about
whether a build could inflict a denial of service on host RAM either
intentionally or accidentally, as the guest is relatively untrustworthy
and/or unconstrained in what it is running.

Finally the issue of O_DIRECT vs host page cache *only* applies if your
QEMU process is using locally exposed storage. ie a plain file, or a
local device node in /dev. If QEMU is using iSCSI via its built-in
network client, then host page cache vs O_DIRECT is irrelevant. In
this latter case, using cache=unsafe might be OK from a host RAM
consumption POV - though I'm not entirely sure what the RAM usage
pattern of the QEMU iSCSI client is like.

Regards,
Daniel

Re: Removal of GCC from the buildroot

By Mikolaj Izdebski at 07/13/2018 - 10:05

On 07/12/2018 10:17 PM, Richard W.M. Jones wrote:
Thanks Richard, your expert opinion is appreciated.

The answers are 'No' and 'Not much'.

1. VMs are installed once and are running for weeks/months until they are
reinstalled. In the meantime guests and hosts are rebooted during
routine maintenance, to apply updates.

2. There would be no data loss in case of host or hypervisor crash.
Worst case, if the guest operating system was corrupted, sysadmins would
need to trigger a VM install.

We do run guests for a long time and reboot hosts. But I think there is no
danger if you ensure that the guest OS is shut down cleanly before host reboot.

This is the case of Koji builders. They don't contain any special data,
just operating system and configuration that can be recreated easily.

Re: Removal of GCC from the buildroot

By Richard W.M. Jones at 07/15/2018 - 11:47

On Fri, Jul 13, 2018 at 04:05:42PM +0200, Mikolaj Izdebski wrote:
In this case my preferred advice would be: DO NOT use cache=unsafe.

We've only tested scenarios for very short-lived build or temporary
VMs (for example when I was building RISC-V packages before we had
Koji, I used a script which created a VM per build and there it made
sense to use cache=unsafe).

I do not think it's a good idea to be using this for VMs which are in
any way long-lived as there could be unforeseen side effects which I'm
not aware of and certainly have never tested.

Host crash => yes you'd definitely need to reinstall that VM.

It's not a worst case, a host crash would near-definitely corrupt a VM
that was ignoring flush requests. It might even corrupt in an
undetectable way (eg. throwing away data while leaving metadata
intact).

Rich.

Re: Removal of GCC from the buildroot

By Andrew Lutomirski at 07/21/2018 - 17:54

Would it make sense to boot the builders with -snapshot and
cache=unsafe? After all, during normal operation, they don’t need to
persist anything.

It might even be reasonable to reboot the VMs after every single build.

Re: Removal of GCC from the buildroot

By Kevin Fenzi at 07/22/2018 - 14:04

On 07/21/2018 02:54 PM, Andrew Lutomirski wrote:
I don't think that's at all worth it for a slight bit of build speed.

Well, koji has no ability to do that currently, and note that some
builders can in fact be doing multiple builds at once, so you would need
to make sure all in progress builds were done and no new ones arrived, etc.

There was a project a while back to make koji builders more dynamic (I
think by making them cloud instances), but I am not sure what
happened with it.

kevin

Re: Removal of GCC from the buildroot

By Peter Robinson at 07/23/2018 - 04:36

On Sun, Jul 22, 2018 at 7:04 PM, Kevin Fenzi < ... at scrye dot com> wrote:
I seem to remember there was discussion of replacing mock with docker
containers as well, again I don't know what happened to that either.

Re: Removal of GCC from the buildroot

By Cole Robinson at 07/16/2018 - 09:27

On 07/15/2018 11:47 AM, Richard W.M. Jones wrote:
One other datapoint is that I _think_ openqa uses cache=unsafe, which is
used for Fedora automated install testing. I'm basing this largely on
cache=unsafe in the openqa sources.

- Cole

Re: Removal of GCC from the buildroot

By Adam Williamson at 07/16/2018 - 11:49

On Mon, 2018-07-16 at 09:27 -0400, Cole Robinson wrote:
That's mostly true, I think, except when doing multipath testing (where
it uses cache=none instead). However, openQA very much meets the
definition of 'short-lived / temporary' VMs - each openQA 'job' uses a
new VM, so the longest any one ever lasts is 2 hours (the hard limit on
an openQA job's lifetime). It also uses fresh disk images each time
(even when using a pre-created base disk image, it doesn't use it
directly but creates new scratch images based on the base image). I
don't know whether this is true of the Koji builder VMs.

Re: Removal of GCC from the buildroot

By Colin Walters at 07/11/2018 - 13:03

On Wed, Jul 11, 2018, at 12:37 PM, Andrew Lutomirski wrote:
Try `rpm-ostree ex container` today and see just how fast it is to construct
filesystem trees out of hardlinks from cached unpacked package trees imported
into an OSTree repository.

The main blocker right now actually is:
https://github.com/projectatomic/rpm-ostree/issues/1180
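
The idea of building a tree out of hardlinks to already unpacked files can be sketched in a few lines of Python. This is not how rpm-ostree is implemented; it is a simplified illustration with a made-up function name, and it assumes the cache and the destination are on the same filesystem and contain only directories and regular files.

```python
import os


def link_tree(cache_dir: str, dest_dir: str) -> None:
    """Recreate cache_dir under dest_dir using hardlinks instead of copies."""
    for root, _dirs, files in os.walk(cache_dir):
        relative = os.path.relpath(root, cache_dir)
        target_root = os.path.normpath(os.path.join(dest_dir, relative))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            # No file data is copied: both names refer to the same inode.
            os.link(os.path.join(root, name), os.path.join(target_root, name))
```

Because only directory entries are created, the cost depends on the number of files rather than on their size, which is where the speed comes from.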