Attached is a revised version of my turboLiveInst patch to livecd-tools and
anaconda.
This version is more polished. I.e. bugs have been fixed, complexity removed,
and therefore should be easier to review.
I performed some anecdotal performance tests, on a sony vaio vgn-n250e. I used
a 30G destination volume for all tests, and when using usbstick media, it was
media that reported 8.5MB/s from hdparm -t. I did have selinux disabled, and
did not use the prelink option. I'd love to hear performance numbers from
differing test rigs.
The performance results I got were-
install from cdrom without turboLiveInst:
copy: 250s
postinst: 86s
install from cdrom with turboLiveInst:
copy: 299s
postinst: 84s
install from usb without turboLiveInst:
copy: 226s
postinst: 72s
install from usb with turboLiveInst:
copy: 175s
postinst: 72s
Conclusions:
On this testrig, installing from cdrom, turboLiveInst yielded a 20% speedup in
copy, which resulted in an end to end install speedup of 15%. Installing from
usb, turboLiveInst yielded a 29% speedup in copy, which resulted in an end to
end install speedup of 20%.
I did test copy-to-ram mode, and the resulting benefits were laughably huge.
But this is only because this laptop has 1G of ram, and very strange behaviour
occurs this near the threshold of having too little ram to use this feature.
Though this is still an argument in favor of turboLiveInst, in that somehow it
found itself on the better side of the threshold. I would expect that with 2G
of ram, the benefit would be on the order of 35-50% speedup, as the main thing
masking the benefit in the cdrom case, is the slow access to install media,
which hides the benefit of cutting the needed disk writes nearly in half.
The secondary benefit of turboLiveInst is that it removes the artificial
limitation that the target rootfs must be greater than 4.0G, instead of the 2.1G
actual uncompressed size of the contents of the LiveCD.
Jeremy has pushed back against this patch because of complexity. Hopefully this
round of polishing will make the patch much easier to read and understand. In
addition, since the cleanupDeleted patch which this one depends on has already
been merged, that should also make this a bit more palatable.
Jeremy also brought up the idea of doing a file level copy installation, rather
than the current block level mechanism. This _is_ a good idea, in my opinion,
in that it will more intelligently support situations such as a separate /usr
filesystem. In addition there is no way that turboLiveInst or the existing
block level mechanism can be made to support xfs or other non-ext3 destination
filesystems chosen by the user (should those options return).
But- I would argue that file level installations may suffer badly due to cdrom
seeking. I would also argue that there is no reason why turboLiveInst, could
not be the first choice for installation technique, with fallbacks to file level
copies for the seperate /usr and xfs type scenarios.
Ultimately I just hope that turboLiveInst gets serious consideration for F8, via
performance comparison with whatever other options may exist.
Finally, here are some notes on the architecture, which may help you to
understand the code-
As I mentioned in the original patch, the basic idea is to simply, at
livecd-creator build time, use a device mapper snapshot to generate a delta file
between the 4.0G filesystem housing 2.1G of data, and a 2.1G filesystem holding
the same data. Then at anaconda/liveinst time, that delta file is used to
recreate a virtual image of the 2.1G filesystem, so that it can be copied to the
installation target, rather than the 4.0G filesystem which includes 1.9G of
zeros that needn't be written to disk.
I ended up using /dev/loop118 in the initramfs init (mayflower generated) to
expose the delta file. /dev/loop118 was mknod'd already by mayflower, but not
actually used for anything. I used it to expose the delta file (osmin.tgz),
because /dev/loop121 was being used to expose the 4.0G os.img, and it seemed
simplest to use an identical mechanism to expose that data. Because by the time
anaconda runs, the original cdrom and squashfs filesystems have been lazy
unmounted, a simple cp at that time was not an option(?).
I chose to extract the delta file (osmin, a 16MB sparse file containing 1.2M of
data, compressed to 25kb on cdrom) into /dev/shm (i.e. ram) so that reads from
it would not try to go to the cdrom. This may not have been necessary.
I chose to calculate the size of the filesystem at livecd-creator time, and
include it with osmin as osmin.size(together forming osmin.tgz). This is less
complex than what I did in the first pass, which was to use dumpe2fs to
calculate it at ananconda time.
As always, questions, comments, criticisms, and especially testers are more than
welcome.
peace...
-dmc