Hi,
Question: Shall Fedora Extras support ix86-optimized packages besides i386-compiled packages, or rather, under which circumstances shall Fedora Extras support such packages?
Background: Some folks have started to add i686-built application packages in addition to i386-built packages to Fedora Extras, claiming these i686-built "optimized packages" would result in much better performance ("up to factor 2").
I'd prefer all packages to be built with the RH/FC standard rpm compilation flags (-march=i386 -mcpu=i686; rpmbuild --target=i386), because this avoids a lot of the complexity and potential breakage of shipping such "optimized packages", and to ship non-i386 packages only where absolutely inevitable.
I.e. I'd like the Fedora Extras packaging policy to be extended to make RH/FC's compilation flags mandatory for packages and to allow building packages for other rpm targets only in very rare, exceptional cases.
For details of the discussion cf. https://bugzilla.fedora.us/show_bug.cgi?id=2033
Ralf
On Tue, 2004-08-31 at 10:21, Ralf Corsepius wrote:
> Hi,
> Question: Shall Fedora Extras support ix86-optimized packages besides i386-compiled packages, or rather, under which circumstances shall Fedora Extras support such packages?
> Background: Some folks have started to add i686-built application packages in addition to i386-built packages to Fedora Extras, claiming these i686-built "optimized packages" would result in much better performance ("up to factor 2").
Those optimized packages aren't faster; at least I find it hard to believe... especially on P4 and Athlon CPUs, where cmov is no gain again ;)
On Tue, Aug 31, 2004 at 10:30:33AM +0200, Arjan van de Ven wrote:
> On Tue, 2004-08-31 at 10:21, Ralf Corsepius wrote:
> > Hi,
> > Question: Shall Fedora Extras support ix86-optimized packages besides i386-compiled packages, or rather, under which circumstances shall Fedora Extras support such packages?
> > Background: Some folks have started to add i686-built application packages in addition to i386-built packages to Fedora Extras, claiming these i686-built "optimized packages" would result in much better performance ("up to factor 2").
> Those optimized packages aren't faster; at least I find it hard to believe... especially on P4 and Athlon CPUs, where cmov is no gain again ;)
Well, SSE/SSE2 can help for graphic/video/audio applications. But there .i686.rpm doesn't help you: either the application selects whether to use SSE/SSE2 at runtime, or the package can ship separate sse2 and normal libs in one package: /usr/lib/libfoo.so.1 and /usr/lib/sse2/libfoo.so.1
Jakub
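To make the "separate sse2 and normal libs in one package" idea concrete, here is a minimal Python sketch of the selection a loader would perform. It is a simplified model for illustration only, not glibc's actual implementation: the function names and the `installed` set are hypothetical, and the only paths used are the two from the message above.

```python
import posixpath


def candidate_paths(libdir, soname, cpu_flags):
    """Return the paths to try, most specific first (simplified model)."""
    paths = []
    if "sse2" in cpu_flags:
        # Only CPUs that report sse2 ever look into the sse2 subdirectory.
        paths.append(posixpath.join(libdir, "sse2", soname))
    # The plain library is always the fallback.
    paths.append(posixpath.join(libdir, soname))
    return paths


def pick_library(libdir, soname, cpu_flags, installed):
    """Pick the first candidate that is actually installed, or None."""
    for path in candidate_paths(libdir, soname, cpu_flags):
        if path in installed:
            return path
    return None


# Hypothetical package contents: both variants shipped in one i386 rpm.
installed = {"/usr/lib/libfoo.so.1", "/usr/lib/sse2/libfoo.so.1"}

print(pick_library("/usr/lib", "libfoo.so.1", {"fpu", "mmx", "sse", "sse2"}, installed))
print(pick_library("/usr/lib", "libfoo.so.1", {"fpu", "mmx"}, installed))
```

In this model the first call resolves to the sse2 variant and the second to the plain one, so a single package stays installable on every ix86 CPU.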
Jakub Jelinek wrote:
> Well, SSE/SSE2 can help for graphic/video/audio applications. But there .i686.rpm doesn't help you: either the application selects whether to use SSE/SSE2 at runtime, or the package can ship separate sse2 and normal libs in one package: /usr/lib/libfoo.so.1 and /usr/lib/sse2/libfoo.so.1
> Jakub
IMHO, Ralf's concerns are valid, but it really doesn't matter and this is not a big deal at all. In the majority of cases it does NOT break anything. The software itself is not broken, and our package management tools handle it fine.
Jakub's suggestion here is very good though, and inkscape should look into this for future updates if possible.
Just don't worry about it. We do things on a case-by-case basis. Sometimes it is worth trying even if just to experiment.
Warren Togami wtogami@redhat.com
Jakub Jelinek wrote :
> > > Background: Some folks have started to add i686-built application packages in addition to i386-built packages to Fedora Extras, claiming these i686-built "optimized packages" would result in much better performance ("up to factor 2").
> > Those optimized packages aren't faster; at least I find it hard to believe... especially on P4 and Athlon CPUs, where cmov is no gain again ;)
> Well, SSE/SSE2 can help for graphic/video/audio applications. But there .i686.rpm doesn't help you: either the application selects whether to use SSE/SSE2 at runtime, or the package can ship separate sse2 and normal libs in one package: /usr/lib/libfoo.so.1 and /usr/lib/sse2/libfoo.so.1
This is "the proper way" for sure, but there are quite a few (mostly multimedia) projects out there that hardcode MMX/SSE support at compile time, rather than enabling it at runtime when built for the x86 architecture :-(
Matthias
On Tue, 2004-08-31 at 16:25, Matthias Saou wrote:
> Jakub Jelinek wrote:
> > > > Background: Some folks have started to add i686-built application packages in addition to i386-built packages to Fedora Extras, claiming these i686-built "optimized packages" would result in much better performance ("up to factor 2").
> > > Those optimized packages aren't faster; at least I find it hard to believe... especially on P4 and Athlon CPUs, where cmov is no gain again ;)
> > Well, SSE/SSE2 can help for graphic/video/audio applications. But there .i686.rpm doesn't help you: either the application selects whether to use SSE/SSE2 at runtime, or the package can ship separate sse2 and normal libs in one package: /usr/lib/libfoo.so.1 and /usr/lib/sse2/libfoo.so.1
> This is "the proper way" for sure, but there are quite a few (mostly multimedia) projects out there that hardcode MMX/SSE support at compile time, rather than enabling it at runtime when built for the x86 architecture :-(
Can't you build the same tarball twice? Once with sse2 enabled, installing with LIBDIR=/usr/lib/sse2, and once the normal way with sse2 disabled.
Alexander Larsson, Red Hat, Inc
Alexander Larsson wrote :
> > > Well, SSE/SSE2 can help for graphic/video/audio applications. But there .i686.rpm doesn't help you: either the application selects whether to use SSE/SSE2 at runtime, or the package can ship separate sse2 and normal libs in one package: /usr/lib/libfoo.so.1 and /usr/lib/sse2/libfoo.so.1
> > This is "the proper way" for sure, but there are quite a few (mostly multimedia) projects out there that hardcode MMX/SSE support at compile time, rather than enabling it at runtime when built for the x86 architecture :-(
> Can't you build the same tarball twice? Once with sse2 enabled, installing with LIBDIR=/usr/lib/sse2, and once the normal way with sse2 disabled.
Is having the same library twice, the regular one in /usr/lib and the SSE2-optimized one in /usr/lib/sse2, then expected to "just work" at runtime? If so, I didn't know this existed, and will definitely look into it. What about MMX? Should one just simplify to SSE vs. non-SSE and put (non-runtime) MMX-optimized libs there too?
Matthias
On Tue, 2004-08-31 at 17:38, Matthias Saou wrote:
> Alexander Larsson wrote:
> > > > Well, SSE/SSE2 can help for graphic/video/audio applications. But there .i686.rpm doesn't help you: either the application selects whether to use SSE/SSE2 at runtime, or the package can ship separate sse2 and normal libs in one package: /usr/lib/libfoo.so.1 and /usr/lib/sse2/libfoo.so.1
> > > This is "the proper way" for sure, but there are quite a few (mostly multimedia) projects out there that hardcode MMX/SSE support at compile time, rather than enabling it at runtime when built for the x86 architecture :-(
> > Can't you build the same tarball twice? Once with sse2 enabled, installing with LIBDIR=/usr/lib/sse2, and once the normal way with sse2 disabled.
> Is having the same library twice, the regular one in /usr/lib and the SSE2-optimized one in /usr/lib/sse2, then expected to "just work" at runtime? If so, I didn't know this existed, and will definitely look into it. What about MMX? Should one just simplify to SSE vs. non-SSE and put (non-runtime) MMX-optimized libs there too?
That's what Jakub said in his email. It's similar to /usr/lib/tls, I guess. Dunno if there is a special MMX dir, but yeah, otherwise you could put those in sse2, I guess. (Only if the MMX performance difference actually matters, of course.)
Alexander Larsson, Red Hat, Inc
On Tue, Aug 31, 2004 at 05:38:31PM +0200, Matthias Saou wrote:
> Is having the same library twice, the regular one in /usr/lib and the SSE2-optimized one in /usr/lib/sse2, then expected to "just work" at runtime?
Yes, it will just work. Both the dynamic linker and ldconfig know how to handle it.
> If so, I didn't know this existed, and will definitely look into it. What about MMX? Should one just simplify to SSE vs. non-SSE and put (non-runtime) MMX-optimized libs there too?
ATM sse2 is the only "important" feature ld.so on IA-32 handles. Previously it used to be mmx, but every added feature slows down library loading when ld.so.cache is not used (e.g. when LD_LIBRARY_PATH or DT_RPATH is used; every feature doubles the number of stat'ed directories before the non-existing directory cache is filled), so it was simply changed to sse2 instead of adding sse2 alongside mmx. SSE2 was chosen because you can get quite a big speedup already by recompiling with -msse2 -mfpmath=sse.
Jakub
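The "every feature doubles the number of stat'ed directories" remark can be illustrated with a short Python sketch. This is a simplified model under stated assumptions: it treats each "important" feature as an optional subdirectory component and enumerates every combination. The exact naming and ordering that ld.so uses may differ, and the base path /opt/foo/lib is a hypothetical LD_LIBRARY_PATH entry.

```python
from itertools import combinations


def hwcap_subdirs(features):
    """All subdirectory combinations for the features, most specific first."""
    subdirs = []
    for size in range(len(features), -1, -1):
        for combo in combinations(features, size):
            subdirs.append("/".join(combo))  # empty string means the plain directory
    return subdirs


def search_dirs(base, features):
    """Directories probed under one search-path entry (simplified model)."""
    return [base + "/" + sub if sub else base for sub in hwcap_subdirs(features)]


# Hypothetical search-path entry; the feature lists mirror the message above.
for feats in ([], ["sse2"], ["sse2", "mmx"]):
    dirs = search_dirs("/opt/foo/lib", feats)
    print(len(dirs), dirs)
```

With no feature there is one directory to probe per search-path entry, with sse2 alone there are two, and with sse2 plus mmx there are four, which is the doubling that motivated replacing mmx with sse2 rather than supporting both.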
Jakub Jelinek wrote :
> On Tue, Aug 31, 2004 at 05:38:31PM +0200, Matthias Saou wrote:
> > Is having the same library twice, the regular one in /usr/lib and the SSE2-optimized one in /usr/lib/sse2, then expected to "just work" at runtime?
> Yes, it will just work. Both the dynamic linker and ldconfig know how to handle it.
> > If so, I didn't know this existed, and will definitely look into it. What about MMX? Should one just simplify to SSE vs. non-SSE and put (non-runtime) MMX-optimized libs there too?
> ATM sse2 is the only "important" feature ld.so on IA-32 handles. Previously it used to be mmx, but every added feature slows down library loading when ld.so.cache is not used (e.g. when LD_LIBRARY_PATH or DT_RPATH is used; every feature doubles the number of stat'ed directories before the non-existing directory cache is filled), so it was simply changed to sse2 instead of adding sse2 alongside mmx. SSE2 was chosen because you can get quite a big speedup already by recompiling with -msse2 -mfpmath=sse.
Thanks for this valuable insight. I'll dig into a few relevant multimedia packages and make a few "plain vs. optimized" tests of my own to see what gives.
Matthias
Jakub Jelinek wrote:
> On Tue, Aug 31, 2004 at 05:38:31PM +0200, Matthias Saou wrote:
> > Is having the same library twice, the regular one in /usr/lib and the SSE2-optimized one in /usr/lib/sse2, then expected to "just work" at runtime?
> Yes, it will just work. Both the dynamic linker and ldconfig know how to handle it.
> > If so, I didn't know this existed, and will definitely look into it. What about MMX? Should one just simplify to SSE vs. non-SSE and put (non-runtime) MMX-optimized libs there too?
> ATM sse2 is the only "important" feature ld.so on IA-32 handles. Previously it used to be mmx, but every added feature slows down library loading when ld.so.cache is not used (e.g. when LD_LIBRARY_PATH or DT_RPATH is used; every feature doubles the number of stat'ed directories before the non-existing directory cache is filled), so it was simply changed to sse2 instead of adding sse2 alongside mmx. SSE2 was chosen because you can get quite a big speedup already by recompiling with -msse2 -mfpmath=sse.
> Jakub
And what about -m3dnow on AMD systems?
On Tue, 2004-08-31 at 10:36, Jakub Jelinek wrote:
> On Tue, Aug 31, 2004 at 10:30:33AM +0200, Arjan van de Ven wrote:
> > On Tue, 2004-08-31 at 10:21, Ralf Corsepius wrote:
> > > Hi,
> > > Question: Shall Fedora Extras support ix86-optimized packages besides i386-compiled packages, or rather, under which circumstances shall Fedora Extras support such packages?
> > > Background: Some folks have started to add i686-built application packages in addition to i386-built packages to Fedora Extras, claiming these i686-built "optimized packages" would result in much better performance ("up to factor 2").
> > Those optimized packages aren't faster; at least I find it hard to believe... especially on P4 and Athlon CPUs, where cmov is no gain again ;)
> Well, SSE/SSE2 can help for graphic/video/audio applications.
If you say so, I don't have any reason for not believing you :-)
> But there .i686.rpm doesn't help you,
Can you explain?
It's the same approach RH applies to glibc and many 3rd party packagers apply to their packages.
My actual concern is less the technical side than the "policy side" of shipping "optimized packages" and its impact on "packaging"/"upgrading".
> either the application selects whether to use SSE/SSE2 at runtime, or the package can ship separate sse2 and normal libs in one package: /usr/lib/libfoo.so.1 and /usr/lib/sse2/libfoo.so.1
I.e. a partial multilib implementation.
Packaging-wise, this has several disadvantages:
1. You'd have to compile library packages twice.
2. Many packages contain both libraries and applications, so you'd have to apply special measures to ensure that applications still get compiled with -march=i386.
3. It would almost double the size of i386.rpms (these sse2 libs would have to be part of i386.rpms). Is it worth it?
However, I agree, it's a nice work-around suitable for libraries where special optimizations can be proven to have a "significant/noticeable" impact.
Finally, this doesn't cover applications. The initial spark that ignited this discussion was folks having entered "optimized applications" in FE (cf. https://bugzilla.fedora.us/show_bug.cgi?id=2033)
Ralf
On Wed, Sep 01, 2004 at 11:23:01AM +0200, Ralf Corsepius wrote:
> > But there .i686.rpm doesn't help you,
> Can you explain?
Because you can install .i686.rpm on CPUs that don't have SSE or SSE2.
> Packaging-wise, this has several disadvantages:
> 1. You'd have to compile library packages twice.
> 2. Many packages contain both libraries and applications, so you'd have to apply special measures to ensure that applications still get compiled with -march=i386.
> 3. It would almost double the size of i386.rpms (these sse2 libs would have to be part of i386.rpms). Is it worth it?
If it is worth having the SSE/SSE2 versions at all (i.e. the gains are big enough), then yes, it is worth it. You certainly can't put binaries/libraries requiring SSE or even SSE2, without any fallback for earlier CPUs, into either i386.rpm or i686.rpm.
Jakub
Ralf Corsepius wrote:
> > But there .i686.rpm doesn't help you,
> Can you explain?
> It's the same approach RH applies to glibc and many 3rd party packagers apply to their packages.
It is proven that a .i686 package for glibc has benefits. If you find some other package where this makes a real, measurable difference, I doubt that you'll find resistance. But you have to prove it first.
> My actual concern is less the technical side than the "policy side" of shipping "optimized packages" and its impact on "packaging"/"upgrading".
.i686 packages are not "optimized packages" if you cannot prove they are real improvements. Until then they are just doubling the QA effort without any benefit.
> Packaging-wise, this has several disadvantages:
> 1. You'd have to compile library packages twice.
And for a separate .i686 this isn't the case?
> 2. Many packages contain both libraries and applications, so you'd have to apply special measures to ensure that applications still get compiled with -march=i386.
Most of the time this is no problem. The application side is small; the majority of the code is in the DSOs.
> 3. It would almost double the size of i386.rpms (these sse2 libs would have to be part of i386.rpms). Is it worth it?
The size of the actual DSOs is not the only factor in the RPM size. This means that two RPMs are bigger than one RPM with two DSO versions.
> However, I agree, it's a nice work-around suitable for libraries where special optimizations can be proven to have a "significant/noticeable" impact.
This is no work-around, this is the preferred solution.
And once again: provide us some data about packages where special DSOs or even i686 versions are of benefit. Make this analysis based on modern hardware. For instance, the Northwood cores and earlier benefit more from special i686 rules than Prescott and Nocona. And the latter will be the main targets very soon, so adding something just for the benefit of "legacy hardware" is not very attractive.
You can believe me, we are looking for possible ways to improve the quality of the shipped code. But the processor makers really do a good job of having the processors execute plain i386 code as well as possible. Code generation changes have little effect, except when it comes to using SSE2 etc., and there we already ship code using it. Look at the gmp RPM.
On Wed, 01 Sep 2004 09:17:41 -0700, Ulrich Drepper drepper@redhat.com wrote:
> > 3. It would almost double the size of i386.rpms (these sse2 libs would have to be part of i386.rpms). Is it worth it?
> The size of the actual DSOs is not the only factor in the RPM size. This means that two RPMs are bigger than one RPM with two DSO versions.
Just playing Devil's Advocate here, but if the extra optimised libraries are in a separate directory, wouldn't it be trivial to define a subpackage for them?
Say we have libinfinite, which is a special library for executing infinite loops. There's an option to have an SSE2 optimised version of the library, which executes them even faster.
libinfinite-0.1-1.i386.rpm contains /usr/lib/libinfinite.so.0 (and other common docs, utils, etc)
A subpackage, libinfinite-sse2-0.1-1.i386.rpm, contains /usr/lib/sse2/libinfinite.so.0 (just the optimised version, depends on libinfinite)
No doubling up, installable easily at any time, and removable by users who need the disk space (without breaking anything).
On Thu, 2004-09-02 at 05:05, Mike Barnes wrote:
> On Wed, 01 Sep 2004 09:17:41 -0700, Ulrich Drepper drepper@redhat.com wrote:
> > > 3. It would almost double the size of i386.rpms (these sse2 libs would have to be part of i386.rpms). Is it worth it?
> > The size of the actual DSOs is not the only factor in the RPM size. This means that two RPMs are bigger than one RPM with two DSO versions.
> Just playing Devil's Advocate here, but if the extra optimised libraries are in a separate directory, wouldn't it be trivial to define a subpackage for them?
> Say we have libinfinite, which is a special library for executing infinite loops. There's an option to have an SSE2 optimised version of the library, which executes them even faster.
> libinfinite-0.1-1.i386.rpm contains /usr/lib/libinfinite.so.0 (and other common docs, utils, etc)
> A subpackage, libinfinite-sse2-0.1-1.i386.rpm, contains /usr/lib/sse2/libinfinite.so.0 (just the optimised version, depends on libinfinite)
There is one major drawback to this approach: the libinfinite-sse2*.rpm would not get automatically installed by apt/yum etc.
If /usr/lib/sse2/*.so.* were part of i386 rpms, they would get automatically installed on all ix86 systems and the dynamic linker would have to decide which library to use at run-time.
Ralf
Ralf Corsepius wrote:
> If /usr/lib/sse2/*.so.* were part of i386 rpms, they would get automatically installed on all ix86 systems and the dynamic linker would have to decide which library to use at run-time.
ldconfig normally makes the decisions, but yes. Using sub-packages would mean the installer has to know which of the subpackages to use and install them without the user requesting it, since otherwise it would never happen: which user has enough knowledge to explicitly request this? And then there are the people who change motherboards or simply move the disks from one machine to another. For them the selection might be wrong.
It's best to use multiple DSOs in one RPM. Exceptions are huge packages like glibc. If there ever should be i686/P4 variants for the gnome/kde libs, they'd probably also fall into the "huge package" category. But here we are talking about packages like gmp or video/audio encoders etc.
On Wed, 2004-09-01 at 18:17, Ulrich Drepper wrote:
> Ralf Corsepius wrote:
> > > But there .i686.rpm doesn't help you,
> > Can you explain?
> > My actual concern is less the technical side than the "policy side" of shipping "optimized packages" and its impact on "packaging"/"upgrading".
> .i686 packages are not "optimized packages" if you cannot prove they are real improvements. Until then they are just doubling the QA effort without any benefit.
Exactly, that's my point and intention (cf. below).
> > Packaging-wise, this has several disadvantages:
> > 1. You'd have to compile library packages twice.
> And for a separate .i686 this isn't the case?
There is a subtle difference:
* building twice within the same rpm.spec vs.
* building a package twice.
The former is more complicated and error-prone than the latter, e.g. in the latter case you can pick up RPM_OPT_FLAGS and apply rpm's standard %_*dir macros, while in the former case you'd have to set up most *FLAGS and dirs manually.
> > 2. Many packages contain both libraries and applications, so you'd have to apply special measures to ensure that applications still get compiled with -march=i386.
> Most of the time this is no problem. The application side is small; the majority of the code is in the DSOs.
This isn't necessarily true for normal packages.
For normal packages, you'd have to configure/make/install the package twice, using different --libdirs, CFLAGS, etc., and then sort out those files which have been built and installed twice.
> > 3. It would almost double the size of i386.rpms (these sse2 libs would have to be part of i386.rpms). Is it worth it?
> The size of the actual DSOs is not the only factor in the RPM size. This means that two RPMs are bigger than one RPM with two DSO versions.
Right, nevertheless it doubles the size of the rpms and doubles the required disk space. Instead of having to install one DSO, you'd have to install two.
It's the classical multilib dilemma known from embedded toolchains: Packaging simplicity and flexibility vs. space and bandwidth.
> > However, I agree, it's a nice work-around suitable for libraries where special optimizations can be proven to have a "significant/noticeable" impact.
> This is no work-around, this is the preferred solution.
Uhh? I considered Jakub's remark to be a proposal and recommendation, not a dictate.
> And once again: provide us some data about packages where special DSOs or even i686 versions are of benefit.
Sorry, you are barking up the wrong tree. I fully agree with you that shipping and building "optimized packages" in most cases probably does not result in performance gains.
The reason I started this thread is that people are already doing so in FE, and I am questioning their procedures.
I did ask and received this in return:
https://bugzilla.fedora.us/show_bug.cgi?id=2033#c8 The developers for scribus and inkscape believe that 686 optimization is a plus for their apps (requested 686 packages for their sites), due largely to mmx. The inkscape package disables mmx if built for 386, enables for 686. ...
https://bugzilla.fedora.us/show_bug.cgi?id=2033#c9 ... One of the Scribus devels has done quite a bit of optimization during the 1.1.x series, specifically some of these are helped by i686. This resulted in 50% speed ups in user visible functions like screen redraws, e.g. 5 seconds instead of 10 seconds to view a complex page. ...
Does this qualify as "proof of gain in performance"? I really don't know.
I.e. I am asking for FE's packaging policy to be changed to not ship "optimized packages" unless compelling facts about the influence of "optimization" on a particular package can be provided and reproduced.
Ralf
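The double build discussed above (configure/make/install twice with different --libdir and CFLAGS) can be sketched as a small Python helper that only prints the commands. This is a hedged illustration, not a recommended spec-file recipe: it assumes an autoconf/automake-style tarball that honours --libdir, CFLAGS and DESTDIR, the staging directory /tmp/libfoo-root is hypothetical, and the flags are the ones quoted in this thread.

```python
import shlex

# Flags quoted in this thread; whether a given package honours them is an assumption.
BASE_CFLAGS = "-march=i386 -mcpu=i686"
SSE2_CFLAGS = BASE_CFLAGS + " -msse2 -mfpmath=sse"


def build_steps(libdir, cflags, destdir):
    """Return the commands for one configure/make/install pass."""
    return [
        ["./configure", "--prefix=/usr", "--libdir=" + libdir, "CFLAGS=" + cflags],
        ["make"],
        ["make", "install", "DESTDIR=" + destdir],
    ]


def double_build_plan(buildroot):
    """Plain pass into /usr/lib, SSE2 pass into /usr/lib/sse2, separate staging dirs."""
    return {
        "plain": build_steps("/usr/lib", BASE_CFLAGS, buildroot + "/plain"),
        "sse2": build_steps("/usr/lib/sse2", SSE2_CFLAGS, buildroot + "/sse2"),
    }


if __name__ == "__main__":
    # Hypothetical staging root; each pass would run in its own clean source tree.
    for name, steps in double_build_plan("/tmp/libfoo-root").items():
        print("#", name)
        for cmd in steps:
            print(" ".join(shlex.quote(arg) for arg in cmd))
```

Only the shared libraries from the sse2 staging directory would end up in the final package; everything else (applications, headers, docs) would come from the plain pass, which is the "sorting out" step mentioned above.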