Hey,
Just forwarding it here so Python folks don't miss it on the main devel list.
Thanks, Mark.
-------- Forwarded Message --------
From: Mark McLoughlin markmc@redhat.com
Reply-to: Mark McLoughlin markmc@redhat.com
To: Development discussions related to Fedora devel@lists.fedoraproject.org
Subject: Python libraries and backwards compat [was Re: What would it take to make Software Collections work in Fedora?]
Date: Mon, 04 Mar 2013 22:51:31 +0000
On Thu, 2012-12-06 at 07:06 -0800, Adam Williamson wrote:
On Thu, 2012-12-06 at 15:30 +0100, Nicolas Mailhot wrote:
IMHO use of software collections is a symptom of a badly run organisation not devoting enough cycles to maintain the software it uses, and hoping (as in wishful thinking) no problem will go critical before the product they built on top of those collections is end-of-lifed
I completely fail to see how entities with that problem will manage to maintain the package number explosion creating software collections will induce.
On the one hand, I agree completely - I think the 'share all dependencies dynamically' model that Linux distros have traditionally embraced is the right one, and that we're a strong vector for spreading the gospel when it comes to that model, and it'd be a shame to compromise that.
On the other hand, we've been proselytizing the Java heretics for over a decade now, and the Ruby ones for a while, and neither shows any signs of conversion or just plain going away, so we may have to call it an ecumenical matter and deal with their models somehow. Sucky as it may be. I don't know, I'm a bit conflicted.
It's interesting that you call out Java and Ruby folks as being heretics. I guess that means all is kosher with Python?
OpenStack is getting burned by API instability in some Python packages, so I've started a thread on Python's distutils-sig to try and gauge the level of heresy amongst Python folks :)
It started here:
http://mail.python.org/pipermail/distutils-sig/2013-February/020030.html
and now we're talking about Software Collections here:
http://mail.python.org/pipermail/distutils-sig/2013-March/020074.html
Two things I'm picking up from the thread:
* A trend towards "semantic versioning" and, implicit in that, an acceptance of API breakages so long as the major number of a library version is incremented.
* Supporting the parallel installation of incompatible versions of libraries isn't seen as an issue because you can "just use virtual environments" ... which amounts to Python Software Collections.
The combination of those two things suggests to me that the Python world will start looking a lot less sane to packagers - i.e. library maintainers breaking API compatibility more often and assuming we can just use SC or similar to have multiple incompatible versions installed.
I can see OpenStack upstream reacting to this by "capping" its required version range for each library it depends on, so that if the library does release an incompatible version, OpenStack sticks with the latest compatible version.
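To make the "capping" idea concrete, here is a minimal sketch, assuming plain dotted numeric versions and the semantic versioning rule that only a major bump may break the API. The helper names (parse_version, within_cap, latest_compatible) and the version numbers are made up for illustration; in practice a project would more likely express the cap as a requirement range such as ">=1.2,<2" and let the installer resolve it.

```python
def parse_version(text):
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def within_cap(candidate, minimum):
    """True if candidate is at least minimum and has the same major number.

    Hypothetical helper: equivalent in spirit to a '>=minimum,<next major'
    requirement range.
    """
    cand = parse_version(candidate)
    low = parse_version(minimum)
    next_major = (low[0] + 1,)
    return low <= cand < next_major


def latest_compatible(available, minimum):
    """Pick the newest release inside the cap, or None if there is none."""
    compatible = [v for v in available if within_cap(v, minimum)]
    if not compatible:
        return None
    return max(compatible, key=parse_version)


if __name__ == "__main__":
    releases = ["1.2.0", "1.4.2", "1.10.0", "2.0.0"]
    # 2.0.0 is excluded by the cap, so the newest 1.x release is chosen.
    print(latest_compatible(releases, "1.2.0"))
```

The packaging consequence is the one described above: once 2.0.0 is what the distro ships system-wide, a capped application still needs some 1.x release installed in parallel.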
If that happens, I think OpenStack packagers will need to look seriously at using Software Collections. Basically, we'd look to package and maintain our own stack of all the Python libraries we need above the core Python libraries.
So, you'd have openstack-nova, openstack-glance, etc. all installed as normal in /usr, /var, etc. but they'd require python libraries from the openstack-grizzly SC like openstack-grizzly-python-eventlet which would be installed in /opt/fedora/openstack-grizzly/root/usr/lib/python.
I'd appreciate it if someone else with a Fedora Python packaging background could look into this and, hopefully, explain how the discussion on distutils-sig isn't so terrifying after all.
Cheers, Mark.
On 03/05/2013 08:53 AM, Mark McLoughlin wrote:
I'd appreciate it if someone else with a Fedora Python packaging background could look into this and, hopefully, explain how the discussion on distutils-sig isn't so terrifying after all.
As far as I can see, your concerns are valid in the near term, but manageable in the longer term. The Linux distros have tried to advocate system-wide dynamic linking for years, and ISVs have overwhelmingly responded by choosing not to support the platform, rather than by embracing dynamic linking. ISVs have instead voted with their feet by embracing Microsoft, Google and Apple, all OS vendors that explicitly encourage bundling of dependencies within an application. The Java and Ruby communities don't care, and the Python community doesn't actually care either (we're just a lot more conservative in general about making backwards incompatible changes in the first place). In the fight between easier security updates for system administrators and easier cross-platform and cross-OS-version support for developers, shared dependencies have lost, and lost comprehensively, amongst all but the largest software vendors.
It's *not* a coincidence that CPython publishes pre-built binaries for Windows and Mac OS X, but not for any Linux distro. Cross-distro packaging is just too hard, and not worth the effort, so we just publish the source archive and tell the individual distro communities "you figure it out" (and, to their credit, they generally do).
Even amongst the Linux community, the popularity of virtual appliances and virtual machines in general shows that people recognise the serious architectural problems created by coupling nominally independent components together by forcing them to share dependencies. The response to that has not been "virtual machines are evil, because they bundle dependencies" (even though that's exactly what they do - they even bundle the underlying OS!), it has been, "we need to create the appropriate tools to easily update all these virtual machines when it is time to deploy a security fix".
Language specific virtual environments are no different: the response should NOT be "virtual environments are evil, we need to discourage developers from using them", it should be "what tools do we need to create to make it easy to deploy security fixes to language specific virtual environments?". When dependencies are bundled without metadata, that is indeed truly evil, as there's no way to know what needs to be updated to install a security fix. Virtual environments aren't like that - the necessary dependency information is there, the deployment tools to manage security updates just need to be written (and perhaps the behaviour of some existing tools adjusted to make it easier to deploy those fixes).
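As a rough illustration of "the necessary dependency information is there", the sketch below reads the installed distribution metadata visible to the running interpreter and compares it against a list of known-bad versions. It assumes a present-day Python (3.8 or later) with importlib.metadata in the standard library, which postdates this thread; the advisory data and the needs_update helper are hypothetical, not an existing tool.

```python
from importlib import metadata


def installed_versions():
    """Map each distribution name visible on sys.path to its version string."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries with missing or broken metadata
            found[name.lower()] = dist.version
    return found


def needs_update(installed, advisories):
    """Return the sorted names whose installed version is listed as vulnerable.

    advisories is a hypothetical mapping: name -> set of affected versions.
    """
    return sorted(
        name
        for name, version in installed.items()
        if version in advisories.get(name, ())
    )


if __name__ == "__main__":
    # Made-up advisory data, for illustration only.
    advisories = {"example-lib": {"1.0", "1.1"}}
    print(needs_update(installed_versions(), advisories))
```

Run inside a virtual environment, installed_versions() reports what that environment bundles, which is the information a security update tool would need to decide what to rebuild or reinstall.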
We can either dig in our heels, demanding that every language include support for dynamic linking against a version of a library at runtime without the distros doing any extra work (and demand that each individual upstream project do the necessary work to select the correct version at runtime), or we can try to create shared dependency systems that *don't suck* from the point of view of a single-developer ISV that wants to easily support arbitrary Linux distros (who may all be shipping different versions of the application's dependencies), as well as Mac OS X and Windows (who probably aren't shipping shared versions of the application's dependencies at all).
Specifically in the context of Python's virtual environments, Fedora and other distros definitely have scope to make contributions that improve the security update handling for system administrators, *without* compromising on ease of deployment for cross-platform developers. For example:
* A Python venv may include *.pth files in the venv's site-packages directory that add more directories to sys.path. This means it is possible to enhance pip and other installers to use a shared version store, *without* needing to use anything other than virtual environments to expose the appropriate versions of the shared dependencies to a given application. This can be done purely through installation tools and Python's existing import infrastructure *without* the application needing to do anything special.
* Similar effects can also be achieved through symlinks (a *.pth based approach will likely be an easier sell upstream, though, since "doesn't work properly on Windows" is not an acceptable limitation when it comes to Python's packaging infrastructure)
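The *.pth mechanism can be tried out with nothing but the standard library. The sketch below is only a simulation under stated assumptions: the temporary directories stand in for a venv's site-packages and for a shared version store, the names example_lib, shared-store.pth and link_shared_version are made up, and site.addsitedir() is called by hand to mimic what the interpreter does for a real site-packages directory at startup.

```python
import importlib
import os
import site
import tempfile


def link_shared_version(site_packages, shared_dir, name="shared-store.pth"):
    """Write a .pth file so that shared_dir gets appended to sys.path."""
    pth_path = os.path.join(site_packages, name)
    with open(pth_path, "w") as handle:
        handle.write(shared_dir + "\n")
    return pth_path


def demo():
    root = tempfile.mkdtemp()
    # Stand-in for one version-specific directory in a shared store.
    shared_dir = os.path.join(root, "store", "example_lib-1.0")
    # Stand-in for the virtual environment's site-packages directory.
    site_packages = os.path.join(root, "venv-site-packages")
    os.makedirs(shared_dir)
    os.makedirs(site_packages)

    # A tiny hypothetical library that lives only in the shared store.
    with open(os.path.join(shared_dir, "example_lib.py"), "w") as handle:
        handle.write("VERSION = '1.0'\n")

    link_shared_version(site_packages, shared_dir)

    # Process the directory the way the site module treats site-packages:
    # every existing directory named in a .pth file is added to sys.path.
    site.addsitedir(site_packages)

    module = importlib.import_module("example_lib")
    return module.VERSION


if __name__ == "__main__":
    print(demo())
```

The application code only ever does a normal import; which version it gets is decided entirely by the .pth file the installer wrote, which is the point of the approach.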
Python upstream is heavily focused on cross platform development. If distros want to resolve sysadmin specific problems (such as handling security updates to shared dependencies), they need to bring a solution that doesn't make life harder for developers and also works on Mac OS X and Windows, or upstream will ignore you.
If you want me to flesh out the "shared versions for Python virtual environments" idea, so you can pitch it to the pip and virtualenv developers, I can certainly do that, but I don't have time to work on it, or advocate for it myself.
Regards, Nick.
On 03/05/2013 12:57 PM, Nick Coghlan wrote:
If you want me to flesh out the "shared versions for Python virtual environments" idea, so you can pitch it to the pip and virtualenv developers, I can certainly do that, but I don't have time to work on it, or advocate for it myself.
I ended up elaborating on this point over on distutils-sig: http://mail.python.org/pipermail/distutils-sig/2013-March/020081.html
Cheers, Nick.
python-devel@lists.fedoraproject.org