I've just spent two days trying to upgrade our school's Fedora 27 FreeIPA servers to Fedora 28 and kept hitting multiple roadblocks. I finally found this post on the freeipa-users mailing list:
https://lists.fedorahosted.org/archives/list/freeipa-users@lists.fedorahosted.org/thread/BTMTZ4QULRAP6AZDDCXUWWVLMDIODXRP/
It basically says (and I've spent two days attesting to the fact) that FreeIPA isn't actually ready for production use on Fedora 28.
I would like to suggest that, for something as central to the Fedora Server story as FreeIPA is, we should have done at least one of the following:
1. Posted the above message to at least one of the Fedora users/devel/server mailing lists.
2. Put something like the above message in the Fedora 28 release notes.
3. Modularized FreeIPA, putting the current 4.6.90-pre series in a development module, and putting Fedora 27's 4.6.x series in a stable module.
It seems that we knew that FreeIPA wasn't ready well before Fedora 28 was released, so I think we really dropped the ball by not releasing this information sooner and not distributing it more widely.
Jonathan
P.S. For those who care, VM snapshots are wonderful. I restored our FreeIPA servers from snapshots, so any users who changed their password in the last 24 hours or so will have to change it again.
Jonathan,
your point 3 is not going to work. As I outlined in that email, so many component literally broke FreeIPA in Fedora 28 development timeline, that keeping Fedora 27's 4.6.x series was not possible.
I'm sorry for not putting out the message to more generic lists. I assumed wrongly that people interested in FreeIPA deployments would be on freeipa-users@ mailing list already.
I spent last four months trying to communicate the dire state FreeIPA will be in the Fedora 28 release to Fedora people. I failed, perhaps my style of "everything is on fire" was less than convincing.
On Fri, May 25, 2018 at 08:31:48PM -0000, Alexander Bokovoy wrote:
your point 3 is not going to work. As I outlined in that email, so many component literally broke FreeIPA in Fedora 28 development timeline, that keeping Fedora 27's 4.6.x series was not possible.
I'm sorry for not putting out the message to more generic lists. I assumed wrongly that people interested in FreeIPA deployments would be on freeipa-users@ mailing list already.
Just as one example: it's what I use as a KDC at home (mainly for my NFS/krb5 testing). I routinely upgrade my home server soon after a new Fedora version comes out and wouldn't have thought (or even really known how) to check whether some component might break on upgrade.
--b.
To be totally honest, I did not get this message either, Alex - my understanding was that once we finally got all the intended packages landing and the automated tests worked, you actually thought FreeIPA was in acceptable shape for real use. If it was known that it was not, we absolutely ought to have communicated this *far* more widely than on a niche mailing list: FreeIPA is supposed to be a key feature of Fedora Server which is itself a key edition of Fedora. This should have been up-front in the release notes and the release announcement, or frankly, should've caused us to rethink the release plans.
Obviously there was some sort of significant communication fail if enough people missed the message that this got totally whiffed on, so we should absolutely figure out what we can do better there.
Perhaps this also suggests our existing release criteria and test cases for FreeIPA are insufficient: if it can pass our existing tests and thus appear to meet our existing criteria, yet be in your judgment "not ready for production", that seems fundamentally wrong. How do we think we could address that? Can you give some kind of summary of the issues here, which we can use to think about how to extend the test cases and criteria?
For now I'd highly recommend we do our best to cut any losses here, but perhaps we should ask Matt to figure out the best way to do that? CCing Matt, in case he hasn't seen this - Matt, see Jonathan Dieter's initial post on server@ for context. Thanks!
On Fri, May 25, 2018 at 3:23 PM, Adam Williamson adamwill@fedoraproject.org wrote:
To be totally honest, I did not get this message either, Alex - my understanding was that once we finally got all the intended packages landing and the automated tests worked, you actually thought FreeIPA was in acceptable shape for real use. If it was known that it was not, we absolutely ought to have communicated this *far* more widely than on a niche mailing list: FreeIPA is supposed to be a key feature of Fedora Server which is itself a key edition of Fedora. This should have been up-front in the release notes and the release announcement, or frankly, should've caused us to rethink the release plans.
Obviously there was some sort of significant communication fail if enough people missed the message that this got totally whiffed on, so we should absolutely figure out what we can do better there.
Perhaps this also suggests our existing release criteria and test cases for FreeIPA are insufficient: if it can pass our existing tests and thus appear to meet our existing criteria, yet be in your judgment "not ready for production", that seems fundamentally wrong. How do we think we could address that? Can you give some kind of summary of the issues here, which we can use to think about how to extend the test cases and criteria?
For now I'd highly recommend we do our best to cut any losses here, but perhaps we should ask Matt to figure out the best way to do that? CCing Matt, in case he hasn't seen this - Matt, see Jonathan Dieter's initial post on server@ for context. Thanks!
Spitball flinger here...
What's the easiest way to quickly (today or tomorrow) prevent dnf system-upgrade from working for Fedora 26/27 -> 28? i.e. to just gracefully fail? Is there a way to insert some bogus dependency so that the upgrade fails, that could then be manually added to --exclude= if the user wants to proceed with the upgrade knowing the consequences to FreeIPA (or they don't use FreeIPA)?
I'm just wondering what the least bad option is here. And it sounds like a bunch of user@ messages that dnf system-upgrade isn't working for server is bad, but less bad than blowing up people's FreeIPA setups. But I'm not sure.
On Fri, May 25, 2018 at 3:34 PM, Chris Murphy lists@colorremedies.com wrote:
What's the easiest way to quickly (today or tomorrow) prevent dnf system-upgrade from working for Fedora 26/27 -> 28?
^Server Of course I don't mean all editions fail to update, just Server. And even that's imperfect, people could have non-Server edition and still use FreeIPA so what then? What about failing to do the upgrade if FreeIPA is installed?
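The "fail the upgrade if FreeIPA is installed" idea could be sketched roughly as below. This is only an illustration, not an existing dnf system-upgrade feature: the package list, the `acknowledged` override, and the function names are all assumptions. The only real tool involved is `rpm -q`, which exits with 0 when a package is installed.

```python
import subprocess
from typing import Callable, Iterable, List

# Assumption: these are the packages whose presence should block the upgrade.
FREEIPA_PACKAGES = ("freeipa-server", "freeipa-client")


def rpm_installed(package: str) -> bool:
    """Return True if `rpm -q` reports the package as installed."""
    try:
        result = subprocess.run(
            ["rpm", "-q", package],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except FileNotFoundError:
        # No rpm binary on this system; treat the package as not installed.
        return False
    return result.returncode == 0


def blocking_packages(
    packages: Iterable[str] = FREEIPA_PACKAGES,
    is_installed: Callable[[str], bool] = rpm_installed,
) -> List[str]:
    """List the packages from `packages` that are currently installed."""
    return [p for p in packages if is_installed(p)]


def may_upgrade(
    acknowledged: bool,
    is_installed: Callable[[str], bool] = rpm_installed,
) -> bool:
    """Allow the upgrade if nothing blocks it, or if the admin has
    explicitly acknowledged the consequences (hypothetical override)."""
    found = blocking_packages(is_installed=is_installed)
    return acknowledged or not found
```

The `is_installed` parameter exists so the decision logic can be exercised without touching the real RPM database; the `acknowledged` flag plays the role of the manual opt-in discussed above.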
Can't we just get a working version of FreeIPA into Fedora 28?
On Fri, May 25, 2018 at 6:02 PM, Neal Gompa ngompa13@gmail.com wrote:
Can't we just get a working version of FreeIPA into Fedora 28?
I've read some of the emails, but I can't tell if a working version is imminent. My questions above are from a triage perspective, the working assumption being that a working version is not imminent, and that the Ideal Situation ship has already sailed.
On Fri, May 25, 2018 at 5:37 PM Chris Murphy <lists(a)colorremedies.com>
Can't we just get a working version of FreeIPA into Fedora 28?
This question reflects complexity we have to deal with. What is in Fedora 28 release is good as a standalone server because that's tested. Upgrade of that standalone server from Fedora 27 is working relatively well. What's broken is anything where you have more than one server: a replicated environment is a majority of our users' deployments.
"Just" is never an easy option. 4.6.90.pre2 is in Fedora 28 stable updates now, since May 15th. It looks like it is stable enough but it took us more than two months of bugfixing to get there and Fedora 28 release deadline was well before we solved majority of the issues.
See my other answer (to Adam) for details.
On Sat, 2018-05-26 at 02:30 +0000, Alexander Bokovoy wrote:
"Just" is never an easy option. 4.6.90.pre2 is in Fedora 28 stable updates now, since May 15th. It looks like it is stable enough but it took us more than two months of bugfixing to get there and Fedora 28 release deadline was well before we solved majority of the issues.
Most of my problems were with 4.6.90.pre1, but I did eventually upgrade to 4.6.90.pre2 and was seeing https://pagure.io/freeipa/issue/7565 when trying to setup the replica, so I'm not sure if it should be considered stable yet.
Jonathan
On Fri, 2018-05-25 at 20:31 +0000, Alexander Bokovoy wrote:
To be totally honest, I did not get this message either, Alex - my understanding was that once we finally got all the intended packages landing and the automated tests worked, you actually thought FreeIPA was in acceptable shape for real use. If it was known that it was not, we absolutely ought to have communicated this *far* more widely than on a niche mailing list: FreeIPA is supposed to be a key feature of Fedora Server which is itself a key edition of Fedora. This should have been up-front in the release notes and the release announcement, or frankly, should've caused us to rethink the release plans.
I did tell that several times but the only real answer I've got: "these issues are not blocking criteria for Fedora Server". At some point you choose your own fights: fixing software or fixing release criteria. For Fedora 29 I'd like us to extend Fedora Server blocking criteria, now that majority of porting has been completed.
For us a push with Python3 migration (we have to migrate all Python base, not a selected module here or there), NSS to OpenSSL migration, mod_nss to mod_ssl migration, NSS default database format migration, Apache ignorance of its ecosystem (changes in ABI in mod_proxy in minor versions), modularity inconsistence through the course of year 2017, have killed a lot of the productive time.
Obviously there was some sort of significant communication fail if enough people missed the message that this got totally whiffed on, so we should absolutely figure out what we can do better there.
Perhaps this also suggests our existing release criteria and test cases for FreeIPA are insufficient: if it can pass our existing tests and thus appear to meet our existing criteria, yet be in your judgment "not ready for production", that seems fundamentally wrong. How do we think we could address that? Can you give some kind of summary of the issues here, which we can use to think about how to extend the test cases and criteria?
The issues were listed in the email referenced by Jonathan already.
- Replication failures should have been a blocker alone (they are for FreeIPA team) but Fedora Server criteria does not include them.
- Broken NSS sqldb defaults caused us several months working on fixes. The latest one, https://bodhi.fedoraproject.org/updates/FEDORA-2018-8cf042000b, was only pushed after Fedora 28 release. https://bugzilla.redhat.com/show_bug.cgi?id=1568271 was found in late April, after we did fight all the previous issues. We started with NSS sqldb adaptation in October 2017.
- Only on Thursday this week we've finally tracked down a nasty python-ldap bug that crashed the FreeIPA framework every time the --all option was used on a host or service entry with additional access controls defined. This is not part of Fedora Server criteria but kills FreeIPA use with delegated permissions to retrieve Kerberos credentials.
- We had to do a lot of Python 3 porting work for other projects. Time is not unlimited, especially when it comes to releases and blocking criteria.
- Dogtag had to work on Tomcat 8.5 adaptation where an existing API it depended on was removed.
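For readers unfamiliar with the NSS database format change mentioned in the list above: the legacy DBM format conventionally keeps its data in cert8.db and key3.db, while the SQLite format uses cert9.db and key4.db. A minimal sketch for checking which format a directory holds follows. The function names are hypothetical and not part of NSS or FreeIPA, and this is a file-name heuristic only, not a validation of the database contents.

```python
import os
from typing import Iterable, Set

# Conventional file names for the two NSS database formats.
LEGACY_DBM_FILES = {"cert8.db", "key3.db"}
SQLITE_FILES = {"cert9.db", "key4.db"}


def classify_nss_files(names: Iterable[str]) -> Set[str]:
    """Return the set of formats ("dbm", "sql") implied by the file names."""
    present = set(names)
    formats: Set[str] = set()
    if LEGACY_DBM_FILES <= present:
        formats.add("dbm")
    if SQLITE_FILES <= present:
        formats.add("sql")
    return formats


def nss_db_formats(directory: str) -> Set[str]:
    """Classify an on-disk directory by listing its files."""
    return classify_nss_files(os.listdir(directory))
```

A directory that reports both formats has likely been through a migration; one that reports only "dbm" would be affected by a change of the default format.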
So on May 15th we released https://www.freeipa.org/page/Releases/4.6.90.pre2 which is now in Fedora 28 stable updates. We consider it as one of closer candidates to being stable. Between 4.6.90.pre1 and pre2 are two months of hard work across several sizeable projects (freeipa, sssd, 389-ds, MIT Kerberos, dogtag, nss, authselect, gssproxy, to name a few).
We have a testing setup at FreeIPA upstream that allows us to test complex topologies. Only recently we were able to move to Fedora 28 testing there as we had issues with our components. There we test also what OpenQA is unable to test so far. I think 4.6.90.pre2 is in much better shape than what Fedora 28 had released. However, if we were to get it as a blocking release, Fedora 28 would have been delayed by at least a month.
As I said, we had no choice: a push of NSS sqldb defaults change forced us to work on both nss-related code and openssl migration at the same time. It made it impossible to keep FreeIPA from Fedora 27 and do our work in a separate module. This was known since autumn 2017 and was a well voiced situation.
On 05/26/2018 04:26 AM, Alexander Bokovoy wrote:
We have a testing setup at FreeIPA upstream that allows us to test complex topologies. Only recently we were able to move to Fedora 28 testing there as we had issues with our components. There we test also what OpenQA is unable to test so far. I think 4.6.90.pre2 is in much better shape than what Fedora 28 had released. However, if we were to get it as a blocking release, Fedora 28 would have been delayed by at least a month.
I think I speak for all of the AtomicCI team [0] when I say that we will continue working with you to get more tests into the Fedora workflow. Please reach out to me off-list if you have a set of good tests in mind that we could help you run in the Fedora workflow (and maybe even link them to run when other packages change, so they don't break FreeIPA) to prevent this happening again in Fedora 29 and onwards.
Thanks, -Dominik
On 05/26/2018 04:26 AM, Alexander Bokovoy wrote: I think I speak for all of the AtomicCI team [0] when I say that we will continue working with you to get more tests into the Fedora workflow. Please reach out to me off-list if you have a set of good tests in mind that we could help you run in the Fedora workflow (and maybe even link them to run when other packages change, so they don't break FreeIPA) to prevent this happening again in Fedora 29 and onwards.
Yes, Dominik. We need to find a way to get tests from FreeIPA PR CI to be run as a part of Fedora Bodhi tests.
Just to show how important it is to test replication: this is the latest NSS bug we found with FreeIPA/Dogtag lightweight CA replication today, May 28th: https://bugzilla.redhat.com/show_bug.cgi?id=1583027. It is a difference in NSS database backends behavior and shows how little upstream tests or cares for non-Firefox use cases.
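The idea of linking FreeIPA tests to run when other packages change could look something like the sketch below. It is purely illustrative: the component labels are taken from the list of projects named earlier in the thread, while the suite names, the mapping itself, and the function name are assumptions and do not describe any existing Bodhi or FreeIPA PR CI configuration.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical mapping from a changed component to the FreeIPA test
# suites that should be re-run when that component gets an update.
TRIGGERS: Dict[str, Set[str]] = {
    "freeipa": {"standalone-install", "replica-install", "upgrade"},
    "389-ds": {"replica-install"},
    "nss": {"replica-install", "lightweight-ca-replication"},
    "dogtag": {"lightweight-ca-replication"},
    "sssd": {"client-enrollment"},
}


def suites_for_update(changed_components: Iterable[str]) -> List[str]:
    """Return the sorted, de-duplicated suites to run for an update."""
    selected: Set[str] = set()
    for component in changed_components:
        # Components outside the mapping trigger nothing.
        selected |= TRIGGERS.get(component, set())
    return sorted(selected)
```

With a mapping like this, an update that touches only nss would still pull in the replication suites, which is the kind of coverage that would have caught the bug above before it reached users.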
On 05/26/2018 04:26 AM, Alexander Bokovoy wrote: I think I speak for all of the AtomicCI team [0] when I say that we will continue working with you to get more tests into the Fedora workflow. Please reach out to me off-list if you have a set of good tests in mind that we could help you run in the Fedora workflow (and maybe even link them to run when other packages change, so they don't break FreeIPA) to prevent this happening again in Fedora 29 and onwards.
I raised this internally and will come back to you in two weeks once I'm back from SambaXP conference (scheduled for next week) and a PTO afterwards.
Thank you, I'm looking forward to syncing up with you off-list regarding this.
If anyone else is interested in following the discussion, please let me know so I can loop you in.
Thanks, -Dominik
On 06/01/2018 02:46 PM, Alexander Bokovoy wrote:
On 05/26/2018 04:26 AM, Alexander Bokovoy wrote: I think I speak for all of the AtomicCI team [0] when I say that we will continue working with you to get more tests into the Fedora workflow. Please reach out to me off-list if you have a set of good tests in mind that we could help you run in the Fedora workflow (and maybe even link them to run when other packages change, so they don't break FreeIPA) to prevent this happening again in Fedora 29 and onwards.
I raised this internally and will come back to you in two weeks once I'm back from SambaXP conference (scheduled for next week) and a PTO afterwards.
On Fri, May 25, 2018 at 10:26 PM Alexander Bokovoy abbra@fedoraproject.org wrote:
On Fri, 2018-05-25 at 20:31 +0000, Alexander Bokovoy wrote:
To be totally honest, I did not get this message either, Alex - my understanding was that once we finally got all the intended packages landing and the automated tests worked, you actually thought FreeIPA was in acceptable shape for real use. If it was known that it was not, we absolutely ought to have communicated this *far* more widely than on a niche mailing list: FreeIPA is supposed to be a key feature of Fedora Server which is itself a key edition of Fedora. This should have been up-front in the release notes and the release announcement, or frankly, should've caused us to rethink the release plans.
I did tell that several times but the only real answer I've got: "these issues are not blocking criteria for Fedora Server". At some point you choose your own fights: fixing software or fixing release criteria. For Fedora 29 I'd like us to extend Fedora Server blocking criteria, now that majority of porting has been completed.
To be clear, what I understood prior to F28 release was that the creation of new FreeIPA Replicas was not working. Had I realized that the problem was more in-depth, I absolutely would have hit the big red button on the release. I may have been misunderstanding what you were telling me, but my impression of the issues I heard from you was that it did *not* in fact affect the set of things we were treating as blocking.
(For a bit of a history lesson, at the time when we first started shipping a separate Server Edition, we expressly did not include replicas as a blocking feature because we were trying to encourage people to use RHEL/CentOS for the sort of environments where replicas would be required. Also, replicas were MUCH harder to set up in those days than they are today. We absolutely should have this on the blocking criteria for Fedora 29).
For us a push with Python3 migration (we have to migrate all Python base, not a selected module here or there), NSS to OpenSSL migration, mod_nss to mod_ssl migration, NSS default database format migration, Apache ignorance of its ecosystem (changes in ABI in mod_proxy in minor versions), modularity inconsistence through the course of year 2017, have killed a lot of the productive time.
Obviously there was some sort of significant communication fail if enough people missed the message that this got totally whiffed on, so we should absolutely figure out what we can do better there.
Perhaps this also suggests our existing release criteria and test cases for FreeIPA are insufficient: if it can pass our existing tests and thus appear to meet our existing criteria, yet be in your judgment "not ready for production", that seems fundamentally wrong. How do we think we could address that? Can you give some kind of summary of the issues here, which we can use to think about how to extend the test cases and criteria?
The issues were listed in the email referenced by Jonathan already.
- Replication failures should have been a blocker alone (they are for FreeIPA team) but Fedora Server criteria does not include them.
Yeah, see above. That was due to a historical decision that is no longer appropriate as well as a misunderstanding on my part about the severity of the problem; I honestly did not know that the problems extended to existing replicas.
- Broken NSS sqldb defaults caused us several months working on fixes. The latest one, https://bodhi.fedoraproject.org/updates/FEDORA-2018-8cf042000b, was only pushed after Fedora 28 release. https://bugzilla.redhat.com/show_bug.cgi?id=1568271 was found in late April, after we did fight all the previous issues. We started with NSS sqldb adaptation in October 2017.
This might be fodder for a separate thread, but has the FreeIPA team considered dropping NSS as a crypto library entirely? It really seems that the NSS upstream cares only about Firefox and is perfectly happy to break all other consumers whenever they feel like it.
- Only on Thursday this week we've finally tracked down a nasty python-ldap bug that crashed the FreeIPA framework every time the --all option was used on a host or service entry with additional access controls defined. This is not part of Fedora Server criteria but kills FreeIPA use with delegated permissions to retrieve Kerberos credentials.
We do need to add HBAC rules to the criteria as well (and I thought we did have at least minimal testing for this), but I suspect this would *not* have risen to the status of blocker; it probably would have been a lively conversation at the blocker bug meetings.
- We had to do a lot of Python 3 porting work for other projects. Time is not unlimited, especially when it comes to releases and blocking criteria.
This is one place where I think the FreeIPA team needed to be more proactive. Presumably, this work was known about well before Final Freeze. Given FreeIPA's critical place in the Server Edition, it would have been grounds for approaching the Server WG and FESCo about an adjustment to the Fedora Schedule.
- Dogtag had to work on Tomcat 8.5 adaptation where an existing API it depended on was removed.
This is the sort of place where I think that modularity can help in the future. Tomcat regularly breaks backwards-compatibility and I think we in Fedora need to have a way to keep the known-working versions in the distribution, even if it is non-default.
So on May 15th we released https://www.freeipa.org/page/Releases/4.6.90.pre2 which is now in Fedora 28 stable updates. We consider it as one of closer candidates to being stable. Between 4.6.90.pre1 and pre2 are two months of hard work across several sizeable projects (freeipa, sssd, 389-ds, MIT Kerberos, dogtag, nss, authselect, gssproxy, to name a few).
We have a testing setup at FreeIPA upstream that allows us to test complex topologies. Only recently we were able to move to Fedora 28 testing there as we had issues with our components. There we test also what OpenQA is unable to test so far. I think 4.6.90.pre2 is in much better shape than what Fedora 28 had released. However, if we were to get it as a blocking release, Fedora 28 would have been delayed by at least a month.
As I said, we had no choice: a push of NSS sqldb defaults change forced us to work on both nss-related code and openssl migration at the same time. It made it impossible to keep FreeIPA from Fedora 27 and do our work in a separate module. This was known since autumn 2017 and was a well voiced situation.
It may have been known since autumn 2017, but it was not sufficiently voiced. As I said, I failed to understand the degree of trouble that FreeIPA was in. I suspect some of that was communication fatigue with failing to get me to understand, but your statement above that you basically abandoned trying to get me to understand isn't a good outcome either.
If nothing else, it might have been prudent to find another person to speak to (or a different person to speak *for* you who might have more success). Or perhaps to have at least proposed a voice conversation where I could have heard the urgency in your voice that was apparently missing from my reading of your IRC communiques.
On Tue, 29 May 2018, Stephen Gallagher wrote:
This might be fodder for a separate thread, but has the FreeIPA team considered dropping NSS as a crypto library entirely? It really seems that the NSS upstream cares only about Firefox and is perfectly happy to break all other consumers whenever they feel like it.
Yes, and we spent more than six months clearing the fallout. It is not completed yet and will not be completed any time soon because openssl is not a better crypto library, just a different beast.
Namely, it has issues with HSM support that prevent Dogtag from moving away from NSS. 389-ds also uses NSS for its server-side operations, implementing a hybrid mode where openldap libraries (compiled against openssl) are fed with certificates extracted from NSS database at runtime. There are numerous other issues in this migration path.
- Only on Thursday this week we've finally tracked down a nasty python-ldap bug that crashed the FreeIPA framework every time the --all option was used on a host or service entry with additional access controls defined. This is not part of Fedora Server criteria but kills FreeIPA use with delegated permissions to retrieve Kerberos credentials.
We do need to add HBAC rules to the criteria as well (and I thought we did have at least minimal testing for this), but I suspect this would *not* have risen to the status of blocker; it probably would have been a lively conversation at the blocker bug meetings.
HBAC rules are domain of SSSD. If that fails, existing OpenQA tests for domain clients will notice it even with a default allow_all rule.
- We had to do a lot of Python 3 porting work for other projects. Time is not unlimited, especially when it comes to releases and blocking criteria.
This is one place where I think the FreeIPA team needed to be more proactive. Presumably, this work was known about well before Final Freeze. Given FreeIPA's critical place in the Server Edition, it would have been grounds for approaching the Server WG and FESCo about an adjustment to the Fedora Schedule.
We've been saying that existing schedule is unrealistic for at least 3-4 Fedora releases now. I don't think it is productive to ask for extension every time. Let's be clear: Fedora puts unrealistic goals to make it possible to move forward over multiple releases. It is just unrealistic to tackle them within the same release for the kind of complex infrastructure arrangement we deal with.
However, I'm very grateful to the Python team at Red Hat who helped us enormously over the past two years with Python 3 migrations in a number of key components. I'm not talking about pure Python code, as in the majority of cases Fedora had faced. Samba has ~200K lines of generated C Python extension code that needs to be supported with both Python 2 and Python 3 at the same time.
- Dogtag had to work on Tomcat 8.5 adaptation where the existing API it depended on was removed.
This is the sort of place where I think that modularity can help in the future. Tomcat regularly breaks backwards-compatibility and I think we in Fedora need to have a way to keep the known-working versions in the distribution, even if it is non-default.
I don't think modularity could help here. Well, maybe with Tomcat, but it will not help with NSS and other low-level libraries.
We had also to help Dogtag guys who were heads down in Common Criteria work for about a year. Python 3 migration for their installer came out of this work in March. Without Python 3-enabled dogtag installer we weren't able to get rid of Python 2 in Fedora 28 at all.
So on May 15th we released https://www.freeipa.org/page/Releases/4.6.90.pre2 which is now in Fedora 28 stable updates. We consider it as one of closer candidates to being stable. Between 4.6.90.pre1 and pre2 are two months of hard work across several sizeable projects (freeipa, sssd, 389-ds, MIT Kerberos, dogtag, nss, authselect, gssproxy, to name a few).
We have a testing setup at FreeIPA upstream that allows us to test complex topologies. Only recently we were able to move to Fedora 28 testing there as we had issues with our components. There we test also what OpenQA is unable to test so far. I think 4.6.90.pre2 is in much better shape than what Fedora 28 had released. However, if we were to get it as a blocking release, Fedora 28 would have been delayed by at least a month.
As I said, we had no choice: a push of the NSS sqldb defaults change forced us to work on both nss-related code and openssl migration at the same time. It made it impossible to keep FreeIPA from Fedora 27 and do our work in a separate module. This was known since autumn 2017 and was a well voiced situation.
It may have been known since autumn 2017, but it was not sufficiently voiced. As I said, I failed to understand the degree of trouble that FreeIPA was in. I suspect some of that was communication fatigue with failing to get me to understand, but your statement above that you basically abandoned trying to get me to understand isn't a good outcome either.
If nothing else, it might have been prudent to find another person to speak to (or a different person to speak *for* you who might have more success). Or perhaps at least had proposed a voice conversation where at least I could have heard the urgency in your voice that was apparently missing from my reading of your IRC communiques.
No, this is not about you or me, Stephen, or anyone specific otherwise.
I think it is just a general issue with mandate-driven releases -- it does not work when a committee issuing a mandate has no involvement in the actual implementation. To me it looks like Fedora Server SIG is generally not interested in identity management area development (as opposed to being able to consume resulted features admin-wise) so we are left with our own effort. It is the same with my interest in other areas though, so people who do AI/ML stuff couldn't care less to take my advice either. ;)
On Tue, 2018-05-29 at 15:31 +0300, Alexander Bokovoy wrote:
To me it looks like Fedora Server SIG is generally not interested in identity management area development (as opposed to being able to consume resulted features admin-wise) so we are left with our own effort.
Well...that's sort of how it's supposed to be. Fedora is a distribution, it consumes its actual constituent applications and libraries and so on from upstreams. Anything that's constituted as part of Fedora is supposed to be involved in the business of turning upstream software projects into an operating system, not *developing* those upstream software projects. Obviously in practice sometimes those "upstream" projects are intimately related to the job of being an operating system (e.g. dnf and anaconda) and the people involved in maintaining them switch between their Upstream Hat and Downstream Hat constantly, but the separation persists. Developing FreeIPA - or postgres or Apache or Cockpit - is not Server SIG's job. Even developing rolekit was not technically Server SIG's job, only integrating it into Fedora Server, though that was definitely a case of sgallagh switching hats a lot.
What definitely is/should have been Server SIG's job was recognizing this impedance mismatch between the update of components on which FreeIPA depended and the ability of upstream FreeIPA development to accommodate those updates at a pace compatible with the Fedora release schedule; ultimately it really ought to have been our job to see there was a big problem there and try to manage it from the downstream perspective somehow. I'd have to check back through archives and stuff to see exactly what we did about that, but there are probably ways we could have done it better.
On Tue, 2018-05-29 at 08:03 -0400, Stephen Gallagher wrote:
I may have been misunderstanding what you were telling me, but my impression of the issues I heard from you was that it did *not* in fact affect the set of things we were treating as blocking.
Well, we don't currently explicitly block on anything to do with replication at all. This is one of the things we should perhaps reconsider.
On Tue, 2018-05-29 at 08:03 -0400, Stephen Gallagher wrote:
We do need to add HBAC rules to the criteria as well (and I thought we did have at least minimal testing for this),
We do, but it probably doesn't hit this. Note the role requirements include "The FreeIPA configuration web UI must be available and allow at least basic configuration of user accounts and permissions", I've been kinda interpreting 'permissions' there to include HBAC.
On Fri, May 25, 2018 at 10:23 PM, Adam Williamson adamwill@fedoraproject.org wrote:
On Fri, 2018-05-25 at 20:31 +0000, Alexander Bokovoy wrote:
Jonathan,
your point 3 is not going to work. As I outlined in that email, so many component literally broke FreeIPA in Fedora 28 development timeline, that keeping Fedora 27's 4.6.x series was not possible.
I'm sorry for not putting out the message to more generic lists. I assumed wrongly that people interested in FreeIPA deployments would be on freeipa-users@ mailing list already.
I spent last four months trying to communicate the dire state FreeIPA will be in the Fedora 28 release to Fedora people. I failed, perhaps my style of "everything is on fire" was less than convincing.
To be totally honest, I did not get this message either, Alex - my understanding was that once we finally got all the intended packages landing and the automated tests worked, you actually thought FreeIPA was in acceptable shape for real use. If it was known that it was not, we absolutely ought to have communicated this *far* more widely than on a niche mailing list: FreeIPA is supposed to be a key feature of Fedora Server which is itself a key edition of Fedora. This should have been up-front in the release notes and the release announcement, or frankly, should've caused us to rethink the release plans.
Obviously there was some sort of significant communication fail if enough people missed the message that this got totally whiffed on, so we should absolutely figure out what we can do better there.
Perhaps this also suggests our existing release criteria and test cases for FreeIPA are insufficient: if it can pass our existing tests and thus appear to meet our existing criteria, yet be in your judgment "not ready for production", that seems fundamentally wrong. How do you think we could address that? Can you give some kind of summary of the issues here, which we can use to think about how to extend the test cases and criteria?
For now I'd highly recommend we do our best to cut any losses here, but perhaps we should ask Matt to figure out the best way to do that? CCing Matt, in case he hasn't seen this - Matt, see Jonathan Dieter's initial post on server@ for context. Thanks!
We should at least get details into the common bugs page with recommendations as to what to do, maybe stick on F-27, plus a reference to a tracker bug.
On Fri, May 25, 2018 at 10:23 PM, Adam Williamson <adamwill(a)fedoraproject.org> wrote:
We should at least get details into the common bugs page with recommendations as to what to do, maybe stick on F-27, plus a reference to a tracker bug.
I added https://fedoraproject.org/wiki/Common_F28_bugs#FreeIPA_4.6.x_-.3E_4.6.90_inp...
We don't have a tracker bug because most things were fixed already in 4.6.90.pre2 which is in Fedora 28 updates stable. However, if people perform an upgrade using original Fedora Server media, they will have issues no matter how many fixes we'd provide.
On Fri, 2018-05-25 at 20:31 +0000, Alexander Bokovoy wrote:
Jonathan,
your point 3 is not going to work. As I outlined in that email, so many component literally broke FreeIPA in Fedora 28 development timeline, that keeping Fedora 27's 4.6.x series was not possible.
Fair enough.
I'm sorry for not putting out the message to more generic lists. I assumed wrongly that people interested in FreeIPA deployments would be on freeipa-users@ mailing list already.
I do really appreciate that you posted the message somewhere. Without it, I would have kept on fighting with my install, assuming that the problem was on my end (our FreeIPA installation has gone through multiple upgrades, and who knows what cruft has accumulated over the years).
Are the Fedora 28 release notes fixed, or could we amend them to include the essence of your original email? And then amend them again when you get a stable FreeIPA server in -updates?
Jonathan
On Fri, 2018-05-25 at 20:31 +0000, Alexander Bokovoy wrote:
Fair enough.
I do really appreciate that you posted the message somewhere. Without it, I would have kept on fighting with my install, assuming that the problem was on my end (our FreeIPA installation has gone through multiple upgrades, and who knows what cruft has accumulated over the years).
Are the Fedora 28 release notes fixed, or could we amend them to include the essence of your original email? And then amend them again when you get a stable FreeIPA server in -updates?
Did you try with 4.6.90.pre2 in updates stable? Because that's our current stable candidate. There are a few fixes that will be coming on top of that but aside from https://pagure.io/freeipa/issue/7565 and a few OTP-related investigations this is considered to be close to the 4.7.0 release.
On Mon, 2018-05-28 at 07:12 +0000, Alexander Bokovoy wrote:
Did you try with 4.6.90.pre2 in updates stable? Because that's our current stable candidate. There are a few fixes that will be coming on top of that but aside from https://pagure.io/freeipa/issue/7565 and a few OTP-related investigations this is considered to be close to the 4.7.0 release.
Yes, see my reply to your other message (I wish HyperKitty made it easy to create a link to a single message).
Basically, I hit https://pagure.io/freeipa/issue/7565 on 4.6.90-pre2.
Jonathan
On Mon, May 28, 2018 at 08:55:06PM +0300, Jonathan Dieter wrote:
On Mon, 2018-05-28 at 07:12 +0000, Alexander Bokovoy wrote:
Did you try with 4.6.90.pre2 in updates stable? Because that's our current stable candidate. There are a few fixes that will be coming on top of that but aside from https://pagure.io/freeipa/issue/7565 and a few OTP-related investigations this is considered to be close to the 4.7.0 release.
Yes, see my reply to your other message (I wish HyperKitty made it easy to create a link to a single message).
It does – at the upper right hand part of the message, under the date, you have two icons: uppercase A and a percent sign. The percent is labelled "permalink" and gives a link to specific message: https://lists.fedoraproject.org/archives/list/server@lists.fedoraproject.org...
It looks crappy, but it works.
On Mon, 2018-05-28 at 20:10 +0200, Tomasz Torcz wrote:
On Mon, May 28, 2018 at 08:55:06PM +0300, Jonathan Dieter wrote:
Yes, see my reply to your other message (I wish HyperKitty made it easy to create a link to a single message).
It does – at the upper right hand part of the message, under the date, you have two icons: uppercase A and a percent sign. The percent is labelled "permalink" and gives a link to specific message: https://lists.fedoraproject.org/archives/list/server@lists.fedoraproject.org/message/VOIG36CLJCKWWVGET3MI7IH6YPWQNG7N/
It looks crappy, but it works.
That's pretty awesome! Thanks for the pointer.
FTR, the message I was referencing was: https://lists.fedoraproject.org/archives/list/server@lists.fedoraproject.org...
Jonathan
On Mon, May 28, 2018 at 12:56 PM, Jonathan Dieter jdieter@gmail.com wrote:
On Mon, 2018-05-28 at 20:10 +0200, Tomasz Torcz wrote:
On Mon, May 28, 2018 at 08:55:06PM +0300, Jonathan Dieter wrote:
Yes, see my reply to your other message (I wish HyperKitty made it easy to create a link to a single message).
It does – at the upper right hand part of the message, under the date, you have two icons: uppercase A and a percent sign. The percent is labelled "permalink" and gives a link to specific message: https://lists.fedoraproject.org/archives/list/server@lists.fedoraproject.org/message/VOIG36CLJCKWWVGET3MI7IH6YPWQNG7N/
It looks crappy, but it works.
That's pretty awesome! Thanks for the pointer.
FTR, the message I was referencing was: https://lists.fedoraproject.org/archives/list/server@lists.fedoraproject.org...
Also each email has a header 'Archived-At:' so use "show original" or equivalent and you'll get a URL.
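For anyone who would rather script this than click "show original", here is a minimal sketch in Python of pulling the URL out of the 'Archived-At:' header. It assumes you already have the raw message text (for example saved from your mail client); the function name and the sample message with its example.org URL are made up for illustration and are not from a real archive.

```python
from email import message_from_string
from typing import Optional


def archived_at_url(raw_message: str) -> Optional[str]:
    """Return the URL from the Archived-At header, or None if it is absent."""
    msg = message_from_string(raw_message)
    value = msg.get("Archived-At")
    if value is None:
        return None
    # The header value is normally wrapped in angle brackets (RFC 5064);
    # strip them along with any surrounding whitespace.
    return value.strip().strip("<>").strip()


# Hypothetical raw message for illustration only; the URL is a placeholder.
SAMPLE = (
    "From: someone@example.org\n"
    "Subject: Re: FreeIPA on Fedora 28\n"
    "Archived-At: <https://lists.example.org/archives/list/demo@lists.example.org/message/ABC123/>\n"
    "\n"
    "Body text.\n"
)

print(archived_at_url(SAMPLE))
```

If the list software did not add the header, the function returns None, and the permalink icon in the web archive is the fallback.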