Greetings everyone.
Many years ago mock introduced systemd-nspawn isolation and later made it the default, replacing chroot. Shortly after nspawn isolation was added, it was used for fedoraproject koji builds, but there were issues, and since then the fedoraproject koji has defaulted to chroot isolation.
Releng has had a ticket open for a long time to switch ( https://pagure.io/releng/issue/6967 )
I think the two items listed there (kernel bind mounts and loop devices) have long since been fixed, so I would like to propose we switch rawhide to using nspawn and see if any other issues show up.
If everyone is ok with it, I can enable it (just for rawhide) and we can look for issues with composes or any other items. At least then we would have a good list of things we would like to fix. If it turns out things just work ok we can leave it enabled.
I think moving to this will:
* More closely match developers' local test mock builds.
* Provide better isolation for builds.
* Help with resources, as systemd-nspawn is a lot more cgroup aware than chroot.
* Allow us to close a 5 year old ticket. ;)
Thoughts?
kevin
Yes, please! I'm looking forward to this!
On Thu, Aug 18, 2022, 5:02 PM Kevin Fenzi kevin@scrye.com wrote:
Thoughts?
kevin
_______________________________________________
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
On Thu, Aug 18, 2022 at 11:02 PM Kevin Fenzi kevin@scrye.com wrote:
Thoughts?
Go for it :) I think now is a good time to do it: Most people are focusing on getting F37 lined up and some rough waters in rawhide would be fine. F38 is still a long way out.
Do it!
On Thu, Aug 18, 2022, 4:01 PM Kevin Fenzi kevin@scrye.com wrote:
Thoughts?
kevin
* Kevin Fenzi:
I think the two items listed there (kernel bind mounts and loop devices) have long since been fixed, so I would like to propose we switch rawhide to using nspawn and see if any other issues show up.
What's the version of nspawn that will be used here? Presumably it's not the rawhide version, but the host version?
Thanks, Florian
On Fri, 19 Aug 2022 at 05:44, Florian Weimer fweimer@redhat.com wrote:
What's the version of nspawn that will be used here? Presumably it's not the rawhide version, but the host version?
Currently I think all builders are Fedora 36.
* Stephen Smoogen:
Currently I think all builders are Fedora 36.
Okay, I tried to reproduce this environment with the mock in Fedora 36 and the fedora-rawhide-x86_64 configuration. This tester:
#include <err.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>
#include <signal.h>

static void
noop_handler (int signo)
{
}

int
main (int argc, char **argv)
{
  if (argc != 3)
    {
      fprintf (stderr, "usage: %s FIRST-SYSCALL LAST-SYSCALL\n", argv[0]);
      return 1;
    }
  int first_syscall = atoi (argv[1]);
  if (first_syscall <= 0)
    errx (1, "invalid system call number: %s", argv[1]);
  int last_syscall = atoi (argv[2]);
  if (last_syscall <= 0)
    errx (1, "invalid system call number: %s", argv[2]);

  if (signal (SIGALRM, noop_handler) == SIG_ERR)
    err (1, "signal (SIGALRM)");

  volatile long int *results
    = mmap (NULL, 2 * sizeof (*results), PROT_READ | PROT_WRITE,
            MAP_ANONYMOUS | MAP_SHARED, -1, 0);
  if (results == MAP_FAILED)
    err (1, "mmap");

  for (int nr = first_syscall; nr <= last_syscall; ++nr)
    {
      results[0] = -1;
      results[1] = 0;
      pid_t pid = fork ();
      if (pid < 0)
        err (1, "fork");
      if (pid == 0)
        {
          errno = 0;
          results[0] = syscall (nr, 0, 0, 0, 0, 0, 0, 0);
          results[1] = errno;
          _exit (0);
        }

      alarm (1);
      int status;
      int waitpid_ret = waitpid (pid, &status, 0);
      int waitpid_error = errno;
      alarm (0);

      if (waitpid_ret < 0)
        {
          if (errno != EINTR)
            {
              errno = waitpid_error;
              err (1, "waitpid");
            }
          else
            printf ("%d: timeout\n", nr);
        }
      else if (results[1] != ENOSYS)
        {
          errno = results[1];
          printf("%d: %ld (errno %ld [%#m])\n", nr, results[0], results[1]);
        }
    }
}
Produces this output when run with arguments 330 800:
330: 1 (errno 0 [Success])
331: 0 (errno 0 [Success])
332: -1 (errno 14 [Bad address])
333: -1 (errno 22 [Invalid argument])
334: -1 (errno 22 [Invalid argument])
424: -1 (errno 9 [Bad file descriptor])
425: -1 (errno 14 [Bad address])
426: -1 (errno 95 [Operation not supported])
427: -1 (errno 95 [Operation not supported])
428: -1 (errno 14 [Bad address])
429: -1 (errno 14 [Bad address])
430: -1 (errno 14 [Bad address])
431: -1 (errno 22 [Invalid argument])
432: -1 (errno 22 [Invalid argument])
433: -1 (errno 14 [Bad address])
434: -1 (errno 22 [Invalid argument])
435: -1 (errno 22 [Invalid argument])
436: 0 (errno 0 [Success])
437: -1 (errno 22 [Invalid argument])
438: -1 (errno 9 [Bad file descriptor])
439: -1 (errno 14 [Bad address])
440: -1 (errno 9 [Bad file descriptor])
441: -1 (errno 22 [Invalid argument])
442: -1 (errno 22 [Invalid argument])
444: -1 (errno 14 [Bad address])
445: -1 (errno 77 [File descriptor in bad state])
446: -1 (errno 77 [File descriptor in bad state])
448: -1 (errno 9 [Bad file descriptor])
449: -1 (errno 22 [Invalid argument])
450: 0 (errno 0 [Success])
This looks very good, no problematic EPERM errors. So I don't expect this type of system call compatibility issue from the switch.
Thanks, Florian
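Scanning tester output like the above for EPERM can also be automated. The following is a minimal sketch and not part of Florian's program: the helper name line_reports_eperm is hypothetical, and it assumes the exact "NR: RET (errno E [...])" line format shown in the output.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper, not part of the tester above.  Parses one output
   line of the form "NR: RET (errno E [...])".  Returns 1 if E is EPERM,
   0 if E is any other value, and -1 if the line does not have that
   form (for example "NR: timeout").  */
static int
line_reports_eperm (const char *line)
{
  int nr;
  long ret, err;

  assert (line != NULL);
  if (sscanf (line, "%d: %ld (errno %ld", &nr, &ret, &err) != 3)
    return -1;
  return err == EPERM;
}
```

A caller could pass it each line read with fgets and flag any line for which it returns 1. None of the lines in the output above would be flagged.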
On Fri, Aug 19, 2022 at 04:50:07PM +0200, Florian Weimer wrote:
Okay, I tried to reproduce this environment with the mock in Fedora 36 and the fedora-rawhide-x86_64 configuration. This tester:
....snip...
This looks very good, no problematic EPERM errors. So I don't expect this type of system call compatibility issue from the switch.
Great! Thanks for testing. I seem to dimly recall that glibc was something that nspawn broke before, but like I said, it was only right after it landed that it was even attempted.
Since everyone seems positive on this, I'll look at switching it on Monday and see what breaks.
kevin
On Fri, 2022-08-19 at 09:46 -0700, Kevin Fenzi wrote:
Since everyone seems positive on this, I'll look at switching it on Monday and see what breaks.
Does this apply just to package builds, or to everything? i.e. are live image builds also going to use it?
It's not necessarily a problem if so, I just kinda want to know so I can adjust openQA's live image build test (which is meant to shadow the official builds as closely as possible).
Thanks!
On Sat, Aug 20, 2022 at 11:18:14PM -0400, Adam Williamson wrote:
Does this apply just to package builds, or to everything? i.e. are live image builds also going to use it?
It's not necessarily a problem if so, I just kinda want to know so I can adjust openQA's live image build test (which is meant to shadow the official builds as closely as possible).
It would be anything using f38-build... so yes, livemedia and images too.
kevin
I just edited the f38 tag at Mon Aug 22 08:29:51 PM UTC 2022 to switch to nspawn.
If you have detected an issue that seems like it might be related to this change, please file a releng ticket and we will dig into it.
Thanks,
kevin
* Kevin Fenzi:
I just edited the f38 tag at Mon Aug 22 08:29:51 PM UTC 2022 to switch to nspawn.
If you have detected an issue that seems like it might be related to this change, please file a releng ticket and we will dig into it.
Thanks, the glibc (nested) container tests started running during Koji builds, and promptly resulted in additional test failures (all glibc test bugs as far as I can tell).
Just to be clear, this nested container support is really helpful.
Florian
On Tue, Sep 13, 2022 at 05:01:57PM +0200, Florian Weimer wrote:
Thanks, the glibc (nested) container tests started running during Koji builds, and promptly resulted in additional test failures (all glibc test bugs as far as I can tell).
Just to be clear, this nested container support is really helpful.
Glad to hear it.
We have just one issue left that I know of... ostree_installers are failing. That's hopefully going to be worked around with a pungi tweak, and then we should be good (as far as I am aware).
https://bugzilla.redhat.com/show_bug.cgi?id=2123812
kevin
On 18. 08. 22 23:01, Kevin Fenzi wrote:
Thoughts?
Let's do it :)
On Thursday, August 18, 2022 4:01:39 PM CDT Kevin Fenzi wrote:
Thoughts?
kevin
Yes please!
devel@lists.stg.fedoraproject.org