Hi folks,
In the course of messing with various installer tools, I've ended up writing some python code that reads and writes an XML metadata file that is (basically) an easily-parsable superset of .discinfo.
A quick overview of the concepts involved here, from the top down:
A "compose" is, basically, an entire distribution - all the installable trees for all the various arches, plus iso sets, plus maybe some SRPMS and debuginfo packages that go along with them. I suppose I could rename this to "distro" but we've been using this terminology for so long that it's just stuck.
A "tree" is a directory layout with all the packages and images you need to install from. An "isoset" is (surprise!) a full set of isos that make up the distribution.
Okay, here's some examples. The tool that does the composing (pungi, distill, etc.) should create .compose.xml in the top-level of the compose dir. That file looks approximately like this:
<compose id="rawhide-20070122" time="1169485348">
  <debug arch="i386">development/i386/debug</debug>
  <debug arch="x86_64">development/x86_64/debug</debug>
  ...
  <source arch="i386">development/source/SRPMS</source>
  <source arch="x86_64">development/source/SRPMS</source>
  ...
  <tree arch="i386">development/i386/os</tree>
  <tree arch="x86_64">development/x86_64/os</tree>
</compose>
This defines the id of the compose (which should be unique for each compose, and would be nice if it was human-understandable like this one) and a timestamp that lets you know when it was created.
Really, this file just exists to point you to the actual contents of the compose and where it all lives - there are debuginfo packages here, sources here, trees here, and so on. Later we should also have:
<isoset arch="i386">development/i386/iso</isoset>
Each of these items points to a directory where another xml file will have further information - a "tree" directory will contain a file named ".tree.xml", ".isoset.xml" for isosets, etc.
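To make the lookup concrete, here is a minimal sketch (not the actual code I wrote) of reading .compose.xml with Python's standard xml.etree.ElementTree and working out where each tree's .tree.xml should live. The function name read_compose and the shortened sample document are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Shortened sample, based on the .compose.xml example above.
COMPOSE_XML = """\
<compose id="rawhide-20070122" time="1169485348">
  <debug arch="i386">development/i386/debug</debug>
  <source arch="i386">development/source/SRPMS</source>
  <tree arch="i386">development/i386/os</tree>
  <tree arch="x86_64">development/x86_64/os</tree>
</compose>
"""


def read_compose(xml_text):
    """Return (compose id, creation time, {kind: {arch: relative dir}})."""
    root = ET.fromstring(xml_text)
    contents = {}
    for child in root:
        # child.tag is "debug", "source" or "tree" (later maybe "isoset").
        contents.setdefault(child.tag, {})[child.get("arch")] = child.text.strip()
    return root.get("id"), int(root.get("time")), contents


compose_id, created, contents = read_compose(COMPOSE_XML)

# Each tree directory is expected to hold its own ".tree.xml".
tree_files = {arch: path + "/.tree.xml" for arch, path in contents["tree"].items()}
```

The paths stay relative to the top-level compose dir, so a caller would join them onto wherever the compose is mounted or mirrored.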
So here's an example of tree.xml:
<tree id="1169482851.57">
  <compose>rawhide-20070122</compose>
  <family>Fedora Core 6.89</family>
  <version>6.89</version>
  <time>1169482851.57</time>
  <arch>i386</arch>
  <file type="kernel">images/pxeboot/vmlinuz</file>
  <file type="initrd">images/pxeboot/initrd.img</file>
  <file type="boot.iso">images/boot.iso</file>
</tree>
Each tree has a unique ID. Like the composes, it can be any freeform string, but it must be unique among trees. (A better choice might be something like "rawhide-20070122.i386" - this is still open to change.)
In the XML structure we've got the name of the parent compose, the 'family' string (the second line of .discinfo), the version (as a floating point number), the timestamp of the tree (which I am using as the tree id because it's unique), and the tree's arch.
Finally there's a list of important files that other applications might like to know the location of. For my purposes, those three files are the ones I care about - other applications might want other file items to be included here.
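As a rough illustration of how an application might consume this, the sketch below pulls the simple fields and the file list out of the tree.xml example. It is only a sketch under the layout shown above; read_tree is a made-up name, and all values are left as strings rather than converted.

```python
import xml.etree.ElementTree as ET

TREE_XML = """\
<tree id="1169482851.57">
  <compose>rawhide-20070122</compose>
  <family>Fedora Core 6.89</family>
  <version>6.89</version>
  <time>1169482851.57</time>
  <arch>i386</arch>
  <file type="kernel">images/pxeboot/vmlinuz</file>
  <file type="initrd">images/pxeboot/initrd.img</file>
  <file type="boot.iso">images/boot.iso</file>
</tree>
"""


def read_tree(xml_text):
    """Return the tree's metadata plus a {type: relative path} file map."""
    root = ET.fromstring(xml_text)
    info = {"id": root.get("id")}
    for tag in ("compose", "family", "version", "time", "arch"):
        info[tag] = root.findtext(tag)
    # One entry per <file>, keyed by its "type" attribute.
    info["files"] = {f.get("type"): f.text for f in root.findall("file")}
    return info


tree = read_tree(TREE_XML)
kernel_path = tree["files"]["kernel"]
```

Keying the files by type assumes each type appears at most once per tree, which holds for the three entries here.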
Okay, so here's the questions:
1) Is this enough info to model trees and composes? What about iso sets?
2) Does anaconda have all the metadata that I'm writing out here?
3) Does this stuff look sane enough for inclusion in anaconda?
Let me know what you think. I'm still not completely sure how to deal with iso sets and such. I'm sure I'm missing some vital piece of information from .discinfo that wasn't needed for my purposes, so please tell me what this lacks.
Thanks in advance!
-w
On Fri, 2007-01-26 at 22:12 +0000, Will Woods wrote:
> In the course of messing with various installer tools, I've ended up writing some python code that reads and writes an XML metadata file that is (basically) an easily-parsable superset of .discinfo.
It's been pointed out that this might be overkill for .discinfo, and on reflection that's probably true.
I still think this extra metadata would be helpful for a *lot* of things, and it doesn't *require* killing off .discinfo. So think of this as an add-on rather than a replacement.
-w
Will Woods wrote:
> In the course of messing with various installer tools, I've ended up writing some python code that reads and writes an XML metadata file that is (basically) an easily-parsable superset of .discinfo.
There's quite a bit more here than in .discinfo, and it leaves out most of the things that are currently in .discinfo. .discinfo definitely needs to be reworked (for one thing, half of its lines aren't relevant with the yum backend, much less when backends can be generic), but I'm not entirely sure that this is the direction to go. For one thing, we're going to have to parse this in the loader; I'm not sure I want an XML parser there. The other thing is that it feels like it's modeling a bit too closely exactly how we currently do things, rather than some of the directions things are moving in with the land of many Fedora spins.
Jeremy
On Mon, 2007-01-29 at 11:08 -0500, Jeremy Katz wrote:
> Will Woods wrote:
> > In the course of messing with various installer tools, I've ended up writing some python code that reads and writes an XML metadata file that is (basically) an easily-parsable superset of .discinfo.
> There's quite a bit more here than in .discinfo, and it leaves out most of the things that are currently in .discinfo. .discinfo definitely needs to be reworked (for one thing, half of its lines aren't relevant with the yum backend, much less when backends can be generic), but I'm not entirely sure that this is the direction to go.
Hm. Can you elaborate on the backend stuff a bit? It seems obvious that (for instance) the 'pixmaps' entry is useless, but are there things that would be helpful for yum? What considerations are needed for other, future backends?
> For one thing, we're going to have to parse this in the loader; I'm not sure I want an XML parser there.
Hm, yeah. I certainly don't want to make the loader have to parse XML.
> The other thing is that it feels like it's modeling a bit too closely exactly how we currently do things, rather than some of the directions things are moving in with the land of many Fedora spins.
Right, this file format was designed just to handle junk we currently have. I'm hoping we can figure out a way to redesign it so it makes sense for the future as well.
So, upon reflection.. forget about the tree.xml altogether. Here's a different idea - a slightly improved version of discinfo that uses key-value pairs, and *also* contains all the info I wanted to put in tree.xml:
[wwoods@metroid os]$ cat .distinfo
family: Fedora
variant: Desktop
version: 6.90
arch: i386
timestamp: 1170104239.562016
composeid: 20070129.1
disc: 2/3
packagedir: Fedora
This is really just stuff that's already in .discinfo, just a bit easier to read/parse. A couple notes:
'composeid' is, again, just something that each of the trees in a compose will share. Mostly this is useful to correlate common information that the trees will share; for example, a typo in anaconda on i386 will probably show up in ppc and x86_64, if they're all from the same compose. Without a composeid, there seems to be no easy way (short of comparing package sets) to determine if tree A and tree B were built from a common set of packages.
The 'disc' item might need to be reconsidered by someone who understands that stuff better. Heh.
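For what it's worth, the composeid comparison could look something like the sketch below. It assumes the plain "key: value" layout shown above; parse_distinfo and same_compose are hypothetical helper names, and the second (ppc) tree is invented purely for the example.

```python
def parse_distinfo(text):
    """Parse 'key: value' lines into a dict, skipping blank lines."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        info[key.strip()] = value.strip()
    return info


def same_compose(a, b):
    """True if both trees carry the same, non-empty composeid."""
    cid = a.get("composeid")
    return bool(cid) and cid == b.get("composeid")


I386 = parse_distinfo("""\
family: Fedora
variant: Desktop
version: 6.90
arch: i386
timestamp: 1170104239.562016
composeid: 20070129.1
""")

# Hypothetical second tree from the same compose (values invented here).
PPC = parse_distinfo("""\
family: Fedora
variant: Desktop
version: 6.90
arch: ppc
composeid: 20070129.1
""")

print(same_compose(I386, PPC))
```

A tree with no composeid at all is treated as matching nothing, which seems safer than assuming two unlabeled trees belong together.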
Anyway, in addition to this basic info we can also add entries for things that applications will want to find:
boot.iso: images/boot.iso
stage2.img: images/stage2.img
minstg2.img: images/minstg2.img
isoboot: isolinux/vmlinuz isolinux/initrd.img
netboot: images/pxeboot/vmlinuz images/pxeboot/initrd.img
xen: images/xen/vmlinuz images/xen/initrd
For ppc trees we might have:
isoboot-ppc32: ppc/ppc32/vmlinuz ppc/ppc32/ramdisk.image.gz
isoboot-ppc64: ppc/ppc64/vmlinuz ppc/ppc64/ramdisk.image.gz
netboot-ppc32: images/netboot/ppc32.img
netboot-ppc64: images/netboot/ppc64.img
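Since some of these entries carry two paths and some only one, here is a small sketch of how a consumer might split them. It assumes paths never contain whitespace and that two-path entries list the kernel first and the initrd second, as in the examples above; parse_boot_entries is a hypothetical name.

```python
BOOT_ENTRIES = """\
isoboot: isolinux/vmlinuz isolinux/initrd.img
netboot: images/pxeboot/vmlinuz images/pxeboot/initrd.img
netboot-ppc32: images/netboot/ppc32.img
"""


def parse_boot_entries(text):
    """Map each key to the list of whitespace-separated paths after it."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(":")
        entries[key.strip()] = value.split()
    return entries


entries = parse_boot_entries(BOOT_ENTRIES)

# Two paths: kernel first, then initrd (assumed ordering).
kernel, initrd = entries["netboot"]

# A single image path, as in the ppc netboot entries.
(ppc32_image,) = entries["netboot-ppc32"]
```

The unpacking fails loudly with a ValueError if an entry has an unexpected number of paths, which is probably what a tool wants here.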
Does this make sense? It's got all the information *I* care about, but is this a useful improvement by other people's measures?
If not - what's missing?
-w
> So, upon reflection.. forget about the tree.xml altogether. Here's a different idea - a slightly improved version of discinfo that uses key-value pairs, and *also* contains all the info I wanted to put in tree.xml:
>
> [wwoods@metroid os]$ cat .distinfo
> family: Fedora
> variant: Desktop
> version: 6.90
> arch: i386
> timestamp: 1170104239.562016
> composeid: 20070129.1
> disc: 2/3
> packagedir: Fedora
Don't know if it would apply in this case, but a very easy format that is a bit more flexible is the old "win.ini" format, where you have section headers followed by key-value pairs (samba uses this config format), so that you could namespace things a bit if you wanted to. That said, namespacing can also be achieved in the variable name itself.
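As a quick sketch of what that could look like, Python's standard configparser module reads this style of file directly. The section names "general" and "images" below are made up for illustration, and the entries are borrowed from the .distinfo example earlier in the thread.

```python
import configparser

# Hypothetical sectioned layout; section names are assumptions.
DISTINFO_INI = """\
[general]
family = Fedora
variant = Desktop
version = 6.90
arch = i386
timestamp = 1170104239.562016
composeid = 20070129.1

[images]
boot.iso = images/boot.iso
netboot = images/pxeboot/vmlinuz images/pxeboot/initrd.img
"""

parser = configparser.ConfigParser()
parser.read_string(DISTINFO_INI)

# Values come back as strings; sections act as namespaces.
arch = parser["general"]["arch"]
kernel, initrd = parser["images"]["netboot"].split()
```

Note that configparser lowercases option names by default, so keys would need to stay lowercase or the parser's optionxform would have to be overridden.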
Cheers...james
> > [wwoods@metroid os]$ cat .distinfo
> > family: Fedora
> > variant: Desktop
> > version: 6.90
> > arch: i386
> > timestamp: 1170104239.562016
> > composeid: 20070129.1
> > disc: 2/3
> > packagedir: Fedora
>
> Don't know if it would apply in this case, but a very easy format that is a bit more flexible is the old "win.ini" format, where you have section headers followed by key-value pairs (samba uses this config format), so that you could namespace things a bit if you wanted to. That said, namespacing can also be achieved in the variable name itself.
We even already have code in rhpl to manipulate files of this format, though that doesn't help much in the loader.
- Chris
On Fri, 2007-01-26 at 22:12 +0000, Will Woods wrote:
> Finally there's a list of important files that other applications might like to know the location of. For my purposes, those three files are the ones I care about - other applications might want other file items to be included here.
This doesn't work for architectures where we have multiple kernels and compose hybrid boot images (ppc, possibly sparc) that work with all of the archs.
You might want to have some sort of flavour in the file section to allow multiple entries. Assuming this is meant to be used by the iso generating code - see the mk-images.* scripts in anaconda/scripts
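One way this could look, purely as a sketch: add an attribute to each file entry and group on it when reading. The attribute name "flavour", the tree id, and the files_by_flavour helper are all hypothetical; the ppc paths are the ones mentioned earlier in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical tree.xml fragment with a "flavour" attribute on each file.
TREE_XML = """\
<tree id="example-ppc">
  <arch>ppc</arch>
  <file type="kernel" flavour="ppc32">ppc/ppc32/vmlinuz</file>
  <file type="initrd" flavour="ppc32">ppc/ppc32/ramdisk.image.gz</file>
  <file type="kernel" flavour="ppc64">ppc/ppc64/vmlinuz</file>
  <file type="initrd" flavour="ppc64">ppc/ppc64/ramdisk.image.gz</file>
</tree>
"""


def files_by_flavour(xml_text):
    """Group <file> entries as {flavour: {type: relative path}}."""
    root = ET.fromstring(xml_text)
    grouped = {}
    for f in root.findall("file"):
        # Entries without the attribute fall into a "default" group.
        flavour = f.get("flavour", "default")
        grouped.setdefault(flavour, {})[f.get("type")] = f.text
    return grouped


grouped = files_by_flavour(TREE_XML)
ppc64_kernel = grouped["ppc64"]["kernel"]
```

Trees with a single kernel would not need the attribute at all, so existing single-flavour entries would keep working unchanged.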
The other really useful thing would be for the isoset to contain the total number of images in the set. At the moment we hard-code this, and if we intend to have lots of spins of Fedora with different package sets, that's a bad plan.
Paul
anaconda-devel@lists.stg.fedoraproject.org