With F8T1 fast approaching (it's currently scheduled to be released on 1 August 2007) we need to get cracking on a new VCS. I've been working on converting the CVS repositories to Git on some spare hardware that I've had lying around. I think that it's at a stage where input & testing from the community is needed to move the project forward. Therefore I'd like to request a test Xen host to move the repository over to:
http://fedoraproject.org/wiki/Infrastructure/RFR/GitPackageVCS
More information about the repository I've been setting up can be found here:
http://fedoraproject.org/wiki/JeffOllie/VCS/Git
Jeff
Jeffrey C. Ollie wrote:
With F8T1 fast approaching (it's currently scheduled to be released on 1 August 2007) we need to get cracking on a new VCS. I've been working on converting the CVS repositories to Git on some spare hardware that I've had lying around. I think that it's at a stage where input & testing from the community is needed to move the project forward. Therefore I'd like to request a test Xen host to move the repository over to:
http://fedoraproject.org/wiki/Infrastructure/RFR/GitPackageVCS
More information about the repository I've been setting up can be found here:
I'm glad this is started back up. One thing that amuses me is that back before the F7 launch it almost seemed assured that we would all go with Mercurial. That isn't so clear now; a lot of people have been using git. It seems our future is either going to be A) do nothing and continue with CVS or B) move to HG or Git.
One thing that will be tricky about this is that we'll be completely re-designing how we use our source management. It's not just removing CVS and plugging in some other technology in its place. I think Jesse and Jeremy may have had something particular in mind for this but I'm not sure. The hurdles as I see them are:
- ACLs
- Making sure checkouts and commits don't take forever
- Good tagging abilities
What else is there?
-Mike
On Wed, 2007-06-06 at 09:17 -0500, Mike McGrath wrote:
I'm glad this is started back up. One thing that amuses me is that back before the F7 launch it almost seemed assured that we would all go with Mercurial. That isn't so clear now; a lot of people have been using git. It seems our future is either going to be A) do nothing and continue with CVS or B) move to HG or Git.
Yeah, definitely time to start this back up.
One thing that will be tricky about this is that we'll be completely re-designing how we use our source management. It's not just removing CVS and plugging in some other technology in its place. I think Jesse and Jeremy may have had something particular in mind for this but I'm not sure.
Right. I really don't think we want to just take our current system, switch out CVS, and end up with all of the same workflows. The change should be more about how we improve workflows. That means thinking about things like:
* How do we make it easier for a maintainer to rebase their package to a newer upstream?
* How do we make it easier for a maintainer to develop, test, and create a patch to fix a problem that's being experienced in Fedora?
* How do we make it easy to send these patches to the upstream of the project being worked on?
* How do we enable downstreams to take our bits, track them and make changes as they need/want?
* How do we better enable a user who has a problem with something we ship to be able to fix it themselves and get the fix back to us?
That's an off-the-top-of-my-head list to give you an idea of the sort of things that really want to be thought about.
Because if we're just switching out CVS for {git,hg,bzr,svn,foobarbazl} and don't think about these things then we're putting all of our developers onto a learning curve to switch for what is likely to be little gain.
Jeremy
Jeremy Katz wrote:
Right. I really don't think we want to just take our current system, switch out CVS, and end up with all of the same workflows. The change should be more about how do we improve workflows. That means thinking about things like:
- How do we make it easier for a maintainer to rebase their package to a
newer upstream?
- How do we make it easier for a maintainer to develop, test, and create
a patch to fix a problem that's being experienced in Fedora?
- How do we make it easy to send these patches to the upstream of the
project being worked on?
- How do we enable downstreams to take our bits, track them and make
changes as they need/want?
- How do we better enable a user who has a problem with something we
ship to be able to fix it themselves and get the fix back to us?
* How do we do all that while keeping it simple :)
Well, I guess now we can all start thinking about it. Source control is supposed to be a tool to empower developers; let's make sure that what we end up with is just that.
-Mike
On Wed, 2007-06-06 at 10:31 -0400, Jeremy Katz wrote:
On Wed, 2007-06-06 at 09:17 -0500, Mike McGrath wrote:
I'm glad this is started back up. One thing that amuses me is that back before the F7 launch it almost seemed assured that we would all go with Mercurial. That isn't so clear now; a lot of people have been using git. It seems our future is either going to be A) do nothing and continue with CVS or B) move to HG or Git.
Yeah, definitely time to start this back up.
And just to make things clear, it's time to start up talking about it, investigating our options and getting some things rolling. But that _doesn't_ mean we should rush things to just get them done based on an arbitrary deadline. This is the sort of thing we're going to have to live with for a long while, so it's better to have it take an extra release cycle before rolling out and get it right. Otherwise, we'll have a revolt on our hands :-)
Jeremy
On Wed, 2007-06-06 at 10:44 -0400, Jeremy Katz wrote:
On Wed, 2007-06-06 at 10:31 -0400, Jeremy Katz wrote:
On Wed, 2007-06-06 at 09:17 -0500, Mike McGrath wrote:
I'm glad this is started back up. One thing that amuses me is that back before the F7 launch it almost seemed assured that we would all go with Mercurial. That isn't so clear now; a lot of people have been using git. It seems our future is either going to be A) do nothing and continue with CVS or B) move to HG or Git.
Yeah, definitely time to start this back up.
And just to make things clear, it's time to start up talking about it, investigating our options and getting some things rolling. But that _doesn't_ mean we should rush things to just get them done based on an arbitrary deadline. This is the sort of thing we're going to have to live with for a long while, so it's better to have it take an extra release cycle before rolling out and get it right. Otherwise, we'll have a revolt on our hands :-)
I agree and I disagree. Yes, we need to carefully consider our next step. On the other hand I think that we need to get off of CVS as soon as possible. From what I've seen while testing the conversion to Git there seems to be corruption in some of the CVS repositories. It's most noticeable in large/active packages (the kernel is a notable example) but sometimes small packages are affected. I don't think that it's had a major effect so far because I think that it's relatively rare that people go back and look at old revisions of the packages (probably because that's so difficult in CVS).
You can see the effect of the corruption in my git repos thusly:
git clone git://161.210.6.204/kernel
cd kernel
gitk devel origin/FC-6
If you scroll down (way way down) you'll see that the history of the FC-6 branch diverges much much sooner than it should at:
66b7d75c65cfc08a513b7e3a51b9e9661c79f793 2005-08-18 2.6.13-rc6-git10
rather than at:
a81d311b742ee08174abb017b6b0caabaa369867 2006-10-12 Initialize branch FC-6 for kernel
Compare that with:
gitk devel origin/F-7
where you can see the histories diverge at:
80cedc3f9b0705ef0e38565d69869d7871c5c8b8 2007-05-18 Initialize branch F-7 for kernel
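For what it's worth, you don't have to eyeball this in gitk; git can report a fork point directly with `git merge-base`. Here's a self-contained sketch against a toy repo (the commit messages below are illustrative, not the real kernel history):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
git config user.email test@example.com
git config user.name Test
trunk=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on git version
git commit -q --allow-empty -m "2.6.13-rc6-git10"
git commit -q --allow-empty -m "more devel work"
fork=$(git rev-parse HEAD)
git branch FC-6                          # like "Initialize branch FC-6 for kernel"
git commit -q --allow-empty -m "devel continues"
# merge-base reports the commit where the two histories diverge:
test "$(git merge-base "$trunk" FC-6)" = "$fork" && echo "FC-6 diverged at $fork"
```

Running the same merge-base query against the converted kernel repo would show the too-early 2005 divergence point immediately.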
Jeff
On Wed, 2007-06-06 at 11:09 -0500, Jeffrey C. Ollie wrote:
On Wed, 2007-06-06 at 10:44 -0400, Jeremy Katz wrote:
On Wed, 2007-06-06 at 10:31 -0400, Jeremy Katz wrote:
On Wed, 2007-06-06 at 09:17 -0500, Mike McGrath wrote:
I'm glad this is started back up. One thing that amuses me is that back before the F7 launch it almost seemed assured that we would all go with Mercurial. That isn't so clear now; a lot of people have been using git. It seems our future is either going to be A) do nothing and continue with CVS or B) move to HG or Git.
Yeah, definitely time to start this back up.
And just to make things clear, it's time to start up talking about it, investigating our options and getting some things rolling. But that _doesn't_ mean we should rush things to just get them done based on an arbitrary deadline. This is the sort of thing we're going to have to live with for a long while, so it's better to have it take an extra release cycle before rolling out and get it right. Otherwise, we'll have a revolt on our hands :-)
I agree and I disagree. Yes, we need to carefully consider our next step. On the other hand I think that we need to get off of CVS as soon as possible. From what I've seen while testing the conversion to Git there seems to be corruption in some of the CVS repositories. It's most noticeable in large/active packages (the kernel is a notable example) but sometimes small packages are affected. I don't think that it's had a major effect so far because I think that it's relatively rare that people go back and look at old revisions of the packages (probably because that's so difficult in CVS).
I wouldn't be entirely certain there -- for one thing, don't discount bugs in the conversion process. Also, there have been rare cases where things have been munged a bit directly which leads to things being ... not exactly as perhaps expected.
Jeremy
On Wed, 2007-06-06 at 10:31 -0400, Jeremy Katz wrote:
Right. I really don't think we want to just take our current system, switch out CVS, and end up with all of the same workflows. The change should be more about how do we improve workflows. That means thinking about things like:
- How do we make it easier for a maintainer to rebase their package to
a newer upstream?
- How do we make it easier for a maintainer to develop, test, and
create a patch to fix a problem that's being experienced in Fedora?
- How do we make it easy to send these patches to the upstream of the
project being worked on?
- How do we enable downstreams to take our bits, track them and make
changes as they need/want?
- How do we better enable a user who has a problem with something we
ship to be able to fix it themselves and get the fix back to us?
Awesome stuff. This is the right way to go about the conversation, for sure. I would love to add some stuff to this list:
o How do we bring developers and consumers of technology closer together? To put it in less market-speak: how do we let software developers (not just maintainers!) get directly involved and let them deal directly with the people who want to use it without the maintainer as a mediator?
o How do we deal with _more than just RPMs_ as a build and delivery mechanism? (Trust me, this is coming.)
o Do we want to move to a process where code is just in a repo and it's built automatically instead of source + patches + spec file?
Also, we need to think in use cases instead of abstract questions or about what technology we can use. For example:
"A developer has made a patch that he thinks fixes a bug for one of his users. He mails the user and says "here's a pointer to the patch - just click on this button to try a build on your system."
One of my goals that I've had for Fedora, one that's easy to understand is, "one click to try any patch."
What's required to make that happen? Realistically we probably need to move to a source control system where, when the developer commits (or pushes, in the git sense), the tag with that commit or change is available to apply. Then the build system can just pull that tag, build it and make it available to the user automatically.
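The "pull exactly that tag and build it" step is the easy part; a toy sketch with plain git (the tag name and file layout here are made up for illustration):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q upstream
cd upstream
git config user.email test@example.com
git config user.name Test
echo 'fixed code' > hal.c
git add hal.c
git commit -q -m "fix for the reported bug"
git tag hal-0_5_9-1                  # the tag the developer pushes
cd ..
# the build system checks out precisely that tag, then would hand the
# tree over to rpmbuild:
git clone -q --branch hal-0_5_9-1 upstream builddir
cd builddir
git describe --tags                  # confirms which tag is being built
```

Everything after the tag is pushed could be automated by the build system, which is the whole point of the one-click flow.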
I would like to compare this to the current work flow we have. Right now you report a bug, the developer looks at the bug, generates a patch. That patch is usually uploaded to bugzilla. Then a very advanced user might be able to take that patch, figure out how to apply it to a spec file, rebuild the rpm and then install and test it. Then that test information might be returned, it might not, but it's hard for you to share that build with other people that might want to test. Our current system creates artificial barriers between developers and users and requires a huge amount of cognitive load to get involved and to get things done. It's slowing things down and creating a bad experience. And worse, everyone doing Linux is doing the same thing.
So I think we need to take the next step. Think about how to build simple, easy to understand systems and connect people closer together.
So does using git for the package VCS actually help solve this? Not really. But does it mean that it's a step down the right path? Sure. But we need to make sure that we're thinking about the problems correctly instead of just trying to apply a technology to a problem without understanding which problem we're trying to solve. And I don't think our biggest problem is CVS. Our biggest problems are scale and how hard it is for people to be able to share information and be more effective.
--Chris
Christopher Blizzard wrote:
Awesome stuff. This is the right way to go about the conversation, for sure. I would love to add some stuff to this list:
Anyone know how our other friends are doing their package management? I know OpenSuSE doesn't really have a cvs at all; it's completely a lookaside cache (package/release/md5sum-filename) or some such thing. Anyone interested in doing some research on Debian and Ubuntu?
-Mike
On Wed, 2007-06-06 at 13:50 -0400, Christopher Blizzard wrote:
On Wed, 2007-06-06 at 10:31 -0400, Jeremy Katz wrote:
Right. I really don't think we want to just take our current system, switch out CVS, and end up with all of the same workflows. The change should be more about how do we improve workflows. That means thinking about things like:
- How do we make it easier for a maintainer to rebase their package to
a newer upstream?
- How do we make it easier for a maintainer to develop, test, and
create a patch to fix a problem that's being experienced in Fedora?
- How do we make it easy to send these patches to the upstream of the
project being worked on?
- How do we enable downstreams to take our bits, track them and make
changes as they need/want?
- How do we better enable a user who has a problem with something we
ship to be able to fix it themselves and get the fix back to us?
Awesome stuff. This is the right way to go about the conversation, for sure. I would love to add some stuff to this list:
o How do we bring developers and consumers of technology closer together? To put it in less market-speak: how do we let software developers (not just maintainers!) get directly involved and let them deal directly with the people who want to use it without the maintainer as a mediator?
o How do we deal with _more than just RPMs_ as a build and delivery mechanism? (Trust me, this is coming.)
o Do we want to move to a process where code is just in a repo and it's built automatically instead of source + patches + spec file?
When I first read these two posts, I thought "you guys are crazy", but then I thought about it a bit more and started thinking "Whoah! This could be really cool!" I think what is described here could certainly be done with Git (it'd probably work in another distributed SCM but I'm less familiar with those so I can't say for sure).
It also occurred to me that this would be a very big change in how we manage packages so maybe some kind of hybrid approach would make the transition easier.
So what about something like this:
1. We convert the package repository to a new SCM so that we can get off of CVS, but the process/workflow remains relatively unchanged. This, I think, we could definitely have in place by F8.
2. We set up a parallel package repository that enables our new workflow. When a package maintainer is ready to move a package to the new workflow, building from the old-style repository is disabled and the package is built from the new-style one.
Also, we need to think in use cases instead of abstract questions or about what technology we can use. For example:
"A developer has made a patch that he thinks fixes a bug for one of his users. He mails the user and says "here's a pointer to the patch - just click on this button to try a build on your system."
One of my goals that I've had for Fedora, one that's easy to understand is, "one click to try any patch."
What's required to make that happen? Realistically we probably need to move to a source control system where, when the developer commits (or pushes, in the git sense), the tag with that commit or change is available to apply. Then the build system can just pull that tag, build it and make it available to the user automatically.
I think that this is more of a Web UI issue rather than an SCM issue. Koji will already build a package based upon a tag in CVS; it wouldn't take a lot of work to extend that to Git or some other SCM (example patches for SVN are available in a ticket on Koji's bug tracker).
What you would need is a Web UI that would:
1. Let users browse the packages and tags that are available in the SCM.
2. Or users could be directed to a particular package/tag by a link in a bugzilla/mailing list post.
3. Be able to click on a button to request a build from Koji for a particular package/tag/release/arch. Since the build is going to take some time, users would need to supply an email address where a notification would be sent with a link to download the resulting packages. These packages would be cached for a while in case other people wanted to try out those packages as well. Packages built in this fashion wouldn't become part of rawhide or an update to a released package - that would still require action by the maintainer.
Jeff
On Wed, 2007-06-06 at 14:32 -0500, Jeffrey C. Ollie wrote:
When I first read these two posts, I thought "you guys are crazy",
No argument here ;-)
but then I thought about it a bit more and started thinking "Whoah! This could be really cool!" I think what is described here could certainly be done with Git (it'd probably work in another distributed SCM but I'm less familiar with those so I can't say for sure).
Yeah, once you start thinking about things like this it becomes pretty clear that a DVCS is almost a requirement rather than a "it'd be nice".
It also occurred to me that this would be a very big change in how we manage packages so maybe some kind of hybrid approach would make the transition easier.
Yes, it is a big change. A huge change. A groundbreaking change even. Something which shows that Fedora isn't just sitting off quietly, but innovating and doing so with toolsets that are entirely open and available for anyone to use.
So what about something like this:
- We convert the package repository to a new SCM so that we can get
off of CVS, but the process/workflow remains relatively unchanged. This, I think, we could definitely have in place by F8.
- We set up a parallel package repository that enables our new
workflow. When a package maintainer is ready to move a package to the new workflow, building from the old-style repository is disabled and the package is built from the new-style one.
The problem with a staged approach like this is two-fold: 1) Moving off of CVS is going to end up requiring a fair bit of relearning/retraining for people, even if we keep the workflow the same. So by having it as a two-step thing, people have to retrain themselves _twice_ rather than just once. 2) If you let some people move and not others, then it becomes very difficult to know what you have to do to make changes to a specific package. If you're the only person that works on something, that's not such a big deal... but we want to be encouraging collaboration and working together. Having two different ways of doing that at the same time is going to mean that everyone has to get over the hump _anyway_. So why not just take our lumps and get there in one go?
Jeremy
On Wed, 2007-06-06 at 16:16 -0400, Jeremy Katz wrote:
The problem with a staged approach like this is two-fold:
1) Moving off of CVS is going to end up requiring a fair bit of relearning/retraining for people, even if we keep the workflow the same. So by having it as a two-step thing, people have to retrain themselves _twice_ rather than just once. 2) If you let some people move and not others, then it becomes very difficult to know what you have to do to make changes to a specific package. If you're the only person that works on something, that's not such a big deal... but we want to be encouraging collaboration and working together. Having two different ways of doing that at the same time is going to mean that everyone has to get over the hump _anyway_. So why not just take our lumps and get there in one go?
So regarding 1: I would suggest that we leave "classic" packages in CVS. Learning another system is a big deal and we get almost no bang for that buck, so I don't see us moving off of CVS for our current repo setup any time soon.
I think that moving selectively is the option of the developer and/or maintainer and should reflect how the upstream project works. And it's only really required for stuff that's moving quickly or has a large community. Remember one of our primary goals: get as close to upstream as possible. If we're supporting them by using the same DVCS then they are more likely to assist us, not to mention how easy it gets to figure out what's different between repo a and repo b.
For example for the kernel, we might want to pull from a git repo. For people who use hg, we just use that. For projects that just release tarballs, we stick with what we have.
This might sound crazy (SUPPORT > 1 SYSTEM, ARE YOU CRAZY?). Well, yes, until you realize what you need to do here. To start with you only have to teach the rpm build side how to pull a specific tag from a specific repo. On the query side we need a browser for each kind, which is a bit of work, but something I think we need to do anyway. (i.e. "What would git do?")
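Teaching the build side to pull a tag from more than one kind of repo really is a small amount of code. A toy dispatcher that only prints the command it would run (the URL schemes and handlers here are illustrative, not a proposal for the real syntax):

```shell
# Map a repo URL + tag to the fetch command the build system would run.
fetch_tag() {
  url=$1
  tag=$2
  case $url in
    git://*) echo "git clone --branch $tag $url" ;;
    http://hg.*) echo "hg clone -u $tag $url" ;;
    *) echo "unsupported repo type: $url" >&2; return 1 ;;
  esac
}
fetch_tag git://example.org/kernel kernel-2_6_21-1
fetch_tag http://hg.example.org/hal hal-0_5_9-1
```

The per-VCS knowledge lives in one place; everything downstream of the checkout (rpmbuild, koji) never needs to care which system the source came from.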
Plus, to be honest, it completely avoids the whole "which damn system do we use" question. And I like focusing on the end user features instead of getting stuck in VCS discussion hell. We're not going to get everyone else to agree or even use the same system. So let's build something that supports both.
I understand the training costs here, but to be honest I think that local experts will continue to be local experts. And we want more of that not less.
--Chris
On Wed, 2007-06-06 at 16:53 -0400, Christopher Blizzard wrote:
This might sound crazy (SUPPORT > 1 SYSTEM, ARE YOU CRAZY?). Well, yes, until you realize what you need to do here. To start with you only have to teach the rpm build side how to pull a specific tag from a specific repo. On the query side we need a browser for each kind, which is a bit of work, but something I think we need to do anyway. (i.e. "What would git do?")
One more thought for people to chew on. I honestly believe that one of our roles needs to be to service developers. Right now we're set up to service packagers + maintainers. I want to bring developers into the fold and give them tools to be more productive. So what ends up being a little bit more work for us ends up making developers lives much much easier. And that's a total win because it builds our most important base - those developers.
--Chris
On Wed, 2007-06-06 at 16:53 -0400, Christopher Blizzard wrote:
On Wed, 2007-06-06 at 16:16 -0400, Jeremy Katz wrote:
The problem with a staged approach like this is two-fold:
1) Moving off of CVS is going to end up requiring a fair bit of relearning/retraining for people, even if we keep the workflow the same. So by having it as a two-step thing, people have to retrain themselves _twice_ rather than just once. 2) If you let some people move and not others, then it becomes very difficult to know what you have to do to make changes to a specific package. If you're the only person that works on something, that's not such a big deal... but we want to be encouraging collaboration and working together. Having two different ways of doing that at the same time is going to mean that everyone has to get over the hump _anyway_. So why not just take our lumps and get there in one go?
So regarding 1: I would suggest that we leave "classic" packages in CVS. Learning another system is a big deal and we get almost no bang for that buck, so I don't see us moving off of CVS for our current repo setup any time soon.
I think that moving selectively is the option of the developer and/or maintainer and should reflect how the upstream project works. And it's only really required for stuff that's moving quickly or has a large community. Remember one of our primary goals: get as close to upstream as possible. If we're supporting them by using the same DVCS then they are more likely to assist us, not to mention how easy it gets to figure out what's different between repo a and repo b.
For example for the kernel, we might want to pull from a git repo. For people who use hg, we just use that. For projects that just release tarballs, we stick with what we have.
At the same time, I think we still need to be able to very clearly separate out our changes from what upstream has. Just a git repo of the kernel very quickly gets out of control and you end up with bazillions of things that you never push back upstream because it's easier to just keep sitting on them. So I don't think that just a VCS repo of the source is what we want... we're going to end up wanting some integration with something quilt-like to get patches out; so like stgit or mq or ...
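The quilt-like shape Jeremy means can already be approximated with plain git, which is what stgit builds on: keep a branch marking where upstream ends, keep each Fedora change as its own commit on top, and export them with format-patch. A self-contained sketch (names and file contents made up):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q kernel
cd kernel
git config user.email test@example.com
git config user.name Test
echo 'upstream code' > sched.c
git add sched.c
git commit -q -m "upstream 2.6.21"
git branch upstream                  # marks where upstream ends
echo 'fedora tweak' >> sched.c
git commit -qam "fedora: fix scheduler oops"
# every downstream change falls out as its own mailable patch file:
git format-patch -o patches upstream..HEAD > /dev/null
ls patches
```

What stgit/mq add on top is the ability to reorder, refresh, and drop those patches as upstream moves, so the stack stays clean enough to actually send back.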
This might sound crazy (SUPPORT > 1 SYSTEM, ARE YOU CRAZY?). Well, yes, until you realize what you need to do here. To start with you only have to teach the rpm build side how to pull a specific tag from a specific repo. On the query side we need a browser for each kind, which is a bit of work, but something I think we need to do anyway. (i.e. "What would git do?")
So if I am the owner of the rpm package which has an upstream of hg and want to fix, test and commit a change to say (for the sake of argument) neon which is in git, I now have to know two different systems? You're crazy ;-)
To add to the craziness of this path, think about actually maintaining these packages across different distro releases... every VCS has its own unique and specially crack-ridden way of handling branching.
Or when you start to think about the "I'm a downstream of Fedora and need to change X, Y, and Z" case, you're then having them set up potentially 3 different VCS systems to do so.
Or the "it's time for a mass-rebuild; let's go and commit a version bump to all the packages so we can rebuild. Uhhm. Uh-oh.
Plus, to be honest, it completely avoids the whole "which damn system do we use." And I like focusing on the end user features instead of getting stuck in VCS dicussion hell. We're not going to get everyone else to agree or even use the same system. So let's build something that supports both.
So instead of picking _one_ answer, we now have to make sure that we implement all of the end user features for N systems? Seriously, this is losing.
Jeremy
On Wed, 2007-06-06 at 17:31 -0400, Jeremy Katz wrote:
At the same time, I think we still need to be able to very clearly separate out our changes from what upstream has. Just a git repo of the kernel very quickly gets out of control and you end up with bazillions of things that you never push back upstream because it's easier to just keep sitting on them. So I don't think that just a VCS repo of the source is what we want... we're going to end up wanting some integration with something quilt-like to get patches out; so like stgit or mq or ...
Like I said, I think it depends on what the package is and most importantly what the maintainer and developers want to do. I know that Dave Jones didn't want to use git to build the kernel, and that's fine. He doesn't have to use it.
One thing I'm trying to do here is break the maintainer model that we have today. I want developers to come and work directly in Fedora. And that means taking out that extra step if we can - going directly from a VCS to a package. Using HAL as an example - want to pull in the latest one? Just point your repo at the upstream one and catch up to that release. Then click to build. Or, even better, the developer can do it himself and push it into our release. Remember, in a lot of cases for us the maintainer _is_ the main developer! So why not draw that line as close as possible?
This might sound crazy (SUPPORT > 1 SYSTEM, ARE YOU CRAZY?). Well, yes, until you realize what you need to do here. To start with you only have to teach the rpm build side how to pull a specific tag from a specific repo. On the query side we need a browser for each kind, which is a bit of work, but something I think we need to do anyway. (i.e. "What would git do?")
So if I am the owner of the rpm package which has an upstream of hg and want to fix, test and commit a change to say (for the sake of argument) neon which is in git, I now have to know two different systems? You're crazy ;-)
No. If you happen to be a maintainer _and_ a developer and you have _chosen_ to use hg or git or whatever, then we make it easy for you. This is about adding options to bring us closer to the upstream developers and make their lives easier.
To add to the craziness of this path, think about actually maintaining these packages across different distro releases... every VCS has its own unique and specially crack-ridden way of handling branching.
Yes, but you only need to teach our systems about that once.
Or when you start to think about the "I'm a downstream of Fedora and need to change X, Y, and Z" case, you're then having them set up potentially 3 different VCS systems to do so.
Depends on what they want to do.
Or the "it's time for a mass-rebuild; let's go and commit a version bump to all the packages so we can rebuild. Uhhm. Uh-oh.
This is actually easier for those VCS packages than it is today. Right now we have to go in and edit every single spec file and commit. If we back away from having spec files like that and instead generate that info before compile time (so a "pristine source" is just a set of metadata, rules and a source tag), doing mass rebuilds is _easy_.
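For contrast, today's mass rebuild in miniature: a scripted walk over every package checkout, bumping the Release tag in each spec (and, in real life, committing each one). The layout and spec contents below are made up for illustration:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
mkdir -p pkgs/foo pkgs/bar
printf 'Name: foo\nRelease: 1%%{?dist}\n' > pkgs/foo/foo.spec
printf 'Name: bar\nRelease: 3%%{?dist}\n' > pkgs/bar/bar.spec
for spec in pkgs/*/*.spec; do
  # increment the leading integer of the Release: tag
  rel=$(sed -n 's/^Release: \([0-9][0-9]*\).*/\1/p' "$spec")
  sed -i "s/^Release: $rel/Release: $((rel + 1))/" "$spec"
done
grep '^Release:' pkgs/*/*.spec
```

With generated metadata instead of per-package spec files, that whole loop collapses into one change to the generator.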
Plus, to be honest, it completely avoids the whole "which damn system do we use." And I like focusing on the end user features instead of getting stuck in VCS dicussion hell. We're not going to get everyone else to agree or even use the same system. So let's build something that supports both.
So instead of picking _one_ answer, we now have to make sure that we implement all of the end user features for N systems? Seriously, this is losing.
This comment inspired me to make a very special graphic:
http://www.ideasuite.com/~blizzard/images/captain_no.png
So let me ask you this question: where do you want Fedora to go? Just keep adding packages? Business as usual? We got things outside of the firewall. That's nice. We will never rest on our laurels. What's next?
I have a pretty specific vision for where we should be two to three years down the road, and it involves innovating in these spaces including looking at having source repos to fix real problems with developer productivity. How can we make developers (not just maintainers) lives easier? How can we shorten the distance between them? I don't see you offering ideas, only saying others are bad.
--Chris
On Wed, 2007-06-06 at 22:46 -0400, Christopher Blizzard wrote:
On Wed, 2007-06-06 at 17:31 -0400, Jeremy Katz wrote:
At the same time, I think we still need to be able to very clearly separate out our changes from what upstream has. Just a git repo of the kernel very quickly gets out of control and you end up with bazillions of things that you never push back upstream because it's easier to just keep sitting on them. So I don't think that just a VCS repo of the source is what we want... we're going to end up wanting some integration with something quilt-like to get patches out; so like stgit or mq or ...
Like I said, I think it depends on what the package is and most importantly what the maintainer and developers want to do. I know that Dave Jones didn't want to use git to build the kernel, and that's fine. He doesn't have to use it.
One thing I'm trying to do here is break the maintainer model that we have today. I want developers to come and work directly in Fedora. And that means taking out that extra step if we can - going directly from a VCS to a package. Using HAL as an example - want to pull in the latest one? Just point your repo at the upstream one and catch up to that release. Then click to build. Or, even better, the developer can do it himself and push it into our release. Remember, in a lot of cases for us the maintainer _is_ the main developer! So why not draw that line as close as possible?
Because in many more cases, the maintainer _isn't_ the main upstream developer. And in a lot of cases, the main upstream developer (or developers) are very happy to have someone else take care of things that aren't directly "write new feature, fix bugs in my code, make my program kick ass". eg, they're really happy when someone volunteers to take over maintaining their project's website for them. Similarly with package maintenance. Especially because we're not going to get to a world where there's only one Linux distribution[1] and they're certainly not going to want to maintain packages for many distros.
So I guess that's the point where our basic disagreement lies... I have no problems if upstream developers _want_ to maintain things. But I don't know that it's the case to optimize for.
This might sound crazy (SUPPORT > 1 SYSTEM, ARE YOU CRAZY?). Well, yes, until you realize what you need to do here. To start with you only have to teach the rpm build side how to pull a specific tag from a specific repo. On the query side we need a browser for each kind, which is a bit of work, but something I think we need to do anyway. (i.e. "What would git do?")
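A hedged sketch of what "teach the rpm build side how to pull a specific tag from a specific repo" might look like: one small dispatch per VCS type. The function name, repo URLs and tags are hypothetical, and the command-line options are current git/hg syntax, not necessarily what existed at the time.

```shell
#!/bin/sh
# Hypothetical sketch: fetch an exact tagged tree for the build system,
# dispatching on VCS type. Only the checkout step differs per system.
fetch_source() {
    vcs=$1; repo=$2; tag=$3; dest=$4
    case $vcs in
        git)
            git clone --quiet "$repo" "$dest" &&
            git -C "$dest" checkout --quiet "$tag"
            ;;
        hg)
            hg clone --quiet --updaterev "$tag" "$repo" "$dest"
            ;;
        *)
            echo "unsupported VCS: $vcs" >&2
            return 1
            ;;
    esac
}

# Example (hypothetical repo and tag):
# fetch_source git git://example.org/hal.git HAL_0_5_9 hal
```

Adding a new VCS means adding one case arm; the rest of the build pipeline never sees the difference.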
So if I am the owner of the rpm package which has an upstream of hg and want to fix, test and commit a change to say (for the sake of argument) neon which is in git, I now have to know two different systems? You're crazy ;-)
No. If you happen to be a maintainer _and_ a developer and you have _chosen_ to use hg or git or whatever, then we make it easy for you. This is about adding options to bring us closer to the upstream developers and make their lives easier.
But we're making the lives of the upstream developers (who, more often than not, are more experienced and going to be better able to adapt to a different VCS and some workflow on top of it) easier at the cost of our contributors who maintain a lot of packages and _aren't_ the upstream developer.
To add to the craziness of this path, think about actually maintaining these packages across different distro releases... every VCS has its own unique and specially crack-ridden way of handling branching.
Yes, but you only need to teach our systems about that once.
And add every time a new upstream cares about the new VCS of the week. Or when $VCS changes its branching. And maintain and test across all of the permutations.
Or when you start to think about the "I'm a downstream of Fedora and need to change X, Y, and Z" case, where you're then having them set up potentially 3 different VCS systems to do so.
Depends on what they want to do.
I want to enable them to make any changes that they want. That would be one of the big points of a DVCS and also is one of the big points in making it easy for anyone to set up a build farm with koji. I want them to be able to track what goes on, adjust their changes accordingly and go on their way. I don't want the setup to involve doing the server set up of each of git, hg, bzr, svk, arch and monotone.
Or the "it's time for a mass-rebuild; let's go and commit a version bump to all the packages so we can rebuild. Uhhm. Uh-oh."
This is actually easier for those VCS packages than it is today. Right now we have to go in and edit every single spec file and commit. If we back away from having spec files like that and instead generate that info before compile time (so a "pristine source" is just a set of metadata, rules and a source tag) doing mass rebuilds is _easy_.
So where's the metadata/rules kept? Do I have to use two different systems -- one for the source and one for the metadata/rules? Because I'm always going to want to be able to deal with both. If they're not in different systems, then the rebuild case still has to deal with many systems.
And fundamentally, I think that the "pristine source + patches" bit is just as important today as it was 10 years ago. Because otherwise, you get into a situation where you basically encourage forking and not contributing things back upstream. Your response is going to be "but it's upstream doing it" -- at the same time, there are _always_ going to be distro specific changes that won't necessarily make sense for upstream. Being able to contain and track those and making it easier for changes to be pushed upstream as opposed to carried along in a distro-specific world forever is a good thing.
Plus, to be honest, it completely avoids the whole "which damn system do we use." And I like focusing on the end user features instead of getting stuck in VCS discussion hell. We're not going to get everyone else to agree or even use the same system. So let's build something that supports both.
So instead of picking _one_ answer, we now have to make sure that we implement all of the end user features for N systems? Seriously, this is losing.
So let me ask you this question: where do you want Fedora to go? Just keep adding packages? Situation as usual? We got things outside of the firewall. That's nice. We will never rest on our laurels. What's next?
Continuing to add packages is going to be important forever. Because the great thing is there's always more software :)
I have a pretty specific vision for where we should be two to three years down the road, and it involves innovating in these spaces including looking at having source repos to fix real problems with developer productivity. How can we make developers (not just maintainers) lives easier? How can we shorten the distance between them?
I want to make it easier for users who are interested in something to become involved in it within the context of Fedora. I see a neat application on gnomefiles and it's not in Fedora? How do we make it easier for that user to go from being just a user to being a contributor? And then, once that user is a maintainer it becomes a much easier path for them to get involved as a contributor in upstream projects. Because they've had a chance to sort of hone their skills and start off with something that's a little easier.
For all of the maintainers we have today, how do we make their lives easier? And one part of that is how do we make it easier for them to interact with the upstream of the software they maintain for Fedora. But another part is how do we make it easier for them to interact with other parts of Fedora that their packages depend on. Sure, some bugs are easily traced and just in the piece of software that you're the maintainer of. But plenty of bugs are caused by a problem in another library or something else -- how do we make it easier for them to be able to track that down and fix it themselves and then quite possibly get involved with another project.
But I think things are devolving a bit from the actual infrastructure discussion at this point... reply-to set appropriately
Jeremy
[1] The point at which there's one Linux distro is the point at which I start Jinux ;) The variety and competition within the Linux distro space is one of the things that continues to drive innovation and progress and I think that it's an incredibly important force
On Thu, Jun 07, 2007 at 10:25:11AM -0400, Jeremy Katz wrote:
On Wed, 2007-06-06 at 22:46 -0400, Christopher Blizzard wrote:
On Wed, 2007-06-06 at 17:31 -0400, Jeremy Katz wrote:
At the same time, I think we still need to be able to very clearly separate out our changes from what upstream has.
And fundamentally, I think that the "pristine source + patches" bit is just as important today as it was 10 years ago. Because otherwise, you get into a situation where you basically encourage forking and not contributing things back upstream. Your response is going to be "but
I think that maintaining "tarballs + patches" is not a good idea. This thing is *very good* for distribution in src.rpms, but I hate it in VCS. I'd like to see all code (including upstream code) in VCS.
You can still export all your changes from GIT as small patches. The solution is branches. You can use one branch for pristine upstream source code and another branch for upstream code + Fedora patches.
Finally, you can export pristine source and patches, compress the upstream code back to tarball and distribute the tarball + patches by src.rpm.
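Karel's two-branch layout can be demonstrated in a throwaway repository. The branch names "upstream" and "fedora", the package name, and the commit messages are illustrative, not any agreed convention.

```shell
#!/bin/sh
set -e
# Minimal demonstration of the two-branch layout in a temporary repo.
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.email demo@example.com
git config user.name demo

# "upstream" holds only pristine upstream source.
git checkout -q -b upstream
echo 'int main(void) { return 0; }' > main.c
git add main.c
git commit -q -m 'import upstream 1.0'

# "fedora" carries the Fedora-specific patches on top of it.
git checkout -q -b fedora upstream
echo '/* fedora build fix */' >> main.c
git commit -q -a -m 'fedora-build-fix'

# Export the delta as individual patches for the src.rpm ...
git format-patch upstream..fedora > /dev/null

# ... and regenerate the pristine tarball from the upstream branch.
git archive --format=tar --prefix=pkg-1.0/ upstream | gzip > pkg-1.0.tar.gz
```

After a new upstream import onto the "upstream" branch, the "fedora" branch would be rebased onto it, which is exactly the "git rebase instead of make prep" workflow described below.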
My wish is to always "git rebase" after upgrading to new upstream code. The current "make prep" is a nightmare...
See the Linux kernel. It's normal that people maintain their patches outside the official tree(s) for a pretty long time. A modern VCS is the right tool for this job.
"separate out our changes from what upstream has" .. is definitely not a problem.
Karel
Karel Zak (kzak@redhat.com) said:
My wish is to always "git rebase" after upgrading to new upstream code. The current "make prep" is a nightmare...
See the Linux kernel. It's normal that people maintain their patches outside the official tree(s) for a pretty long time. A modern VCS is the right tool for this job.
"separate out our changes from what upstream has" .. is definitely not a problem.
Well, it depends on what you're doing. DaveJ tried to move to git, pulling in the various things we need for our kernel - it ended up being a pretty big problem getting something sane out of it, and he ended up going back to patches.
Bill
On Thu, Jun 07, 2007 at 07:39:33PM -0400, Bill Nottingham wrote:
Karel Zak (kzak@redhat.com) said:
My wish is to always "git rebase" after upgrading to new upstream code. The current "make prep" is a nightmare...
See the Linux kernel. It's normal that people maintain their patches outside the official tree(s) for a pretty long time. A modern VCS is the right tool for this job.
"separate out our changes from what upstream has" .. is definitely not a problem.
Well, it depends on what you're doing. DaveJ tried to move to git, pulling in the various things we need for our kernel - it ended up being a pretty big problem getting something sane out of it, and he ended up going back to patches.
Yes, I remember a discussion about it. I'm not sure if this one attempt by DaveJ is enough to completely reject this idea ;-)
Karel
On 6/6/07, Christopher Blizzard blizzard@redhat.com wrote:
This might sound crazy (SUPPORT > 1 SYSTEM, ARE YOU CRAZY?). Well, yes, until you realize what you need to do here. To start with you only have to teach the rpm build side how to pull a specific tag from a specific repo. On the query side we need a browser for each kind, which is a bit of work, but something I think we need to do anyway. (i.e. "What would git do?")
So if I am the owner of the rpm package which has an upstream of hg and want to fix, test and commit a change to say (for the sake of argument) neon which is in git, I now have to know two different systems? You're crazy ;-)
No. If you happen to be a maintainer _and_ a developer and you have _chosen_ to use hg or git or whatever, then we make it easy for you. This is about adding options to bring us closer to the upstream developers and make their lives easier.
It's been a very, very long day, and I am running on fumes of fumes... but the only way I could see the workflow you are wanting to happen is if there are appropriate front-ends for the 'maintainer/developer/qa/nth-party' to work with. Basically the front-end does not give a rat's pee about what the backend is this week or next week. All it does is give, say, the worker a set of common commands and then interprets those for the backend systems. The tools would be meta-meta-tools and would probably be slower than a snail in a Canadian December on some systems.. but might be possible if you outline a simple choice selection and pluginability (you get 2 from column A, 1 from B.. anything else you write yourself).
$ smoogeit create project kernel uses git with head git.kernel.org # this creates both locally and, say, a Fedora repo and uses whatever common lower end system. maybe some extra files to edit.. it then talks to git.kernel.org in git
$ smoogeit add patch <insert file name here> # adds the git stuff silently behind the scenes
$ smoogeit create project smooge-firewall uses sccm with head local://blah
the final workflow might be something like:
$ smoogeit publish kernel branch F7
which pushes it in a way that the Fedora build systems might be able to handle stuff like
In this never to be written system.. the build/errata/vcs/support/bugzilla system is semi-integrated together. The developer can use git for his own project and then just tell the smoogeit to publish it for SmoogeOS. The maintainer can talk to whatever the developer has and push back or keep their own local tree without having to know 8 systems.
Now like I said.. it's shooting the moon to see this ever work.. and I would believe that it would require you to keep the smoogeit system to a limited number of commands and let any extras that a particular system has be done as plugins like yum. My take on it would be that there might be a 'hidden/common' system underneath all this to make the buildsystem sane to debug.. so while you check out to your laptop using git/cvs/mercurial/arch/svn/darcs/rcs/sccm/monotone/MicrosoftStuff and your upstream commits are done that way.. the build system stores it in a known format. Other people could choose to pull from either the build system or the upstream...
wow so many fricking choices these days and I used to have to deal with SCCM/RCS arguments.
Anyway going to sleep.
On Wednesday 06 June 2007 15:32:49 Jeffrey C. Ollie wrote:
- We convert the package repository to a new SCM so that we can get
off of CVS, but the process/workflow remains relatively unchanged. This I think we could definitely have in place by F8.
Please, for the love of god, let us get a bit settled in F8 and finish off the other tools that have really rough edges before we drop something like dist-git on people's laps? We have such a short runway for F8 as it is...
Jesse Keating wrote:
On Wednesday 06 June 2007 15:32:49 Jeffrey C. Ollie wrote:
- We convert the package repository to a new SCM so that we can get
off of CVS, but the process/workflow remains relatively unchanged. This I think we could definitely have in place by F8.
Please, for the love of god, let us get a bit settled in F8 and finish off the other tools that have really rough edges before we drop something like dist-git on people's laps? We have such a short runway for F8 as it is...
That's not to say that some good work couldn't get done now though. jcollie isn't that involved in polishing our current tools but he's got a git background. I say let him test and let us know what he comes up with.
-Mike
On Wednesday 06 June 2007 16:25:18 Mike McGrath wrote:
That's not to say that some good work couldn't get done now though. jcollie isn't that involved in polishing our current tools but he's got a git background. I say let him test and let us know what he comes up with.
Absolutely. In order to be successful in deploying during F9 time frame, we HAVE to have things up and testable during F8 timeframe. No question.
On 6/6/07, Christopher Blizzard blizzard@redhat.com wrote:
On Wed, 2007-06-06 at 10:31 -0400, Jeremy Katz wrote:
Right. I really don't think we want to just take our current system, switch out CVS, and end up with all of the same workflows. The change should be more about how do we improve workflows. That means thinking about things like:
- How do we make it easier for a maintainer to rebase their package to
a newer upstream?
- How do we make it easier for a maintainer to develop, test, and
create a patch to fix a problem that's being experienced in Fedora?
- How do we make it easy to send these patches to the upstream of the
project being worked on?
- How do we enable downstreams to take our bits, track them and make
changes as they need/want?
- How do we better enable a user who has a problem with something we
ship to be able to fix it themselves and get the fix back to us?
stuff snipped.
o Do we want to move to a process where code is just in a repo and it's built automatically instead of source + patches + spec file?
I am on fumes as I said.. but I do not see how the last 2 points above from Jeremy can be done with this one. Do you have an idea or is this something that is blindingly obvious?
Thanks.
On Thursday 07 June 2007 21:02:49 Stephen John Smoogen wrote:
- How do we enable downstreams to take our bits, track them and make
changes as they need/want?
- How do we better enable a user who has a problem with something we
ship to be able to fix it themselves and get the fix back to us?
stuff snipped.
o Do we want to move to a process where code is just in a repo and it's built automatically instead of source + patches + spec file?
I am on fumes as I said.. but I do not see how the last 2 points above from Jeremy can be done with this one. Do you have an idea or is this something that is blindingly obvious?
<strawman> We have two things for the upstream in our package SCM. We have the pristine tarball stashed away in a lookaside, and we have an exploded tree of the source. We use the exploded tree of the source to manage our patches to that source and to help with rebases. However the patches we manage always apply to the pristine point. At package build time, the patches we manage + the spec file + the pristine tarball stashed away are combined to make an srpm, and that is shoved through the build system. </strawman>
In this case, the exploded source serves us as a better way to manage our patches and to help with rebasing. It also provides a service to upstreams so that they can easily cherry pick our patches out of the exploded source, same with downstreams, and same with somebody playing at home.
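As a sketch of the build step in that strawman: the spec the build system assembles still consumes only pristine inputs. All file names here are hypothetical illustrations.

```spec
# Hypothetical spec fragment for the strawman. The exploded tree is only
# a patch-management aid; the build consumes pristine tarball + patches.
Source0: pkg-1.0.tar.gz                # pristine tarball from the lookaside
Patch0:  0001-fedora-build-fix.patch   # exported from the exploded tree

%prep
%setup -q
%patch0 -p1
```

So nothing downstream of the srpm changes; only how the patches are produced does.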
On Thu, 2007-06-07 at 21:18 -0400, Jesse Keating wrote:
<strawman> We have two things for the upstream in our package SCM. We have the pristine tarball stashed away in a lookaside, and we have an exploded tree of the source. We use the exploded tree of the source to manage our patches to that source and to help with rebases. However the patches we manage always apply to the pristine point. At package build time, the patches we manage + the spec file + the pristine tarball stashed away are combined to make an srpm, and that is shoved through the build system. </strawman>
So I see two ways to store patches:

vendor-branch
 |
 |-- Foo.patch branch
 |
 |-- Bar.patch branch
Foo.patch and Bar.patch both directly apply to the upstream vendor branch.
vendor-branch
 |
 |-- Foo.patch branch
      |
      |-- Bar.patch branch
Foo.patch is the first patch against vendor-branch. Bar.patch is committed to the combination of vendor-branch and Foo.patch.
At first I was hoping to do the first way as it makes it easier to cherrypick changes for upstream. However, it quickly became complex as we had to manage a separate merge branch that was equivalent to the second storage graph. Whenever we rebased we would potentially have to resolve conflicts in the second graph as well as the first.
So I decided that going directly to the second style was preferable. That does not have the enhanced cherrypicking benefits to upstream but it still allows us to work with individual patches within the VCS more easily than when they were simply patches stored in CVS.
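A throwaway demonstration of that second style, with Bar.patch committed on top of the Foo.patch branch rather than directly on the vendor branch. Branch and patch names are illustrative.

```shell
#!/bin/sh
set -e
# Demo of the "stacked" storage style in a temporary repo.
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.email demo@example.com
git config user.name demo

git checkout -q -b vendor
echo base > file.txt
git add file.txt
git commit -q -m 'vendor import'

git checkout -q -b foo vendor
echo foo >> file.txt
git commit -q -a -m 'Foo.patch'

git checkout -q -b bar foo        # stacked on foo, not on vendor
echo bar >> file.txt
git commit -q -a -m 'Bar.patch'

# The whole stack still exports as individual patches, in stack order:
git format-patch --stdout vendor..bar | grep '^Subject:'
```

The trade-off Toshio describes is visible here: upstream can't take Bar.patch in isolation without also resolving it against Foo.patch, but rebasing the stack onto a new vendor import stays a single operation.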
Do you see a way around this limitation?
-Toshio
Jeremy Katz (katzj@redhat.com) said:
Right. I really don't think we want to just take our current system, switch out CVS, and end up with all of the same workflows. The change should be more about how do we improve workflows. That means thinking about things like:
- How do we make it easier for a maintainer to rebase their package to a
newer upstream?
- How do we make it easier for a maintainer to develop, test, and create
a patch to fix a problem that's being experienced in Fedora?
- How do we make it easy to send these patches to the upstream of the
project being worked on?
- How do we enable downstreams to take our bits, track them and make
changes as they need/want?
- How do we better enable a user who has a problem with something we
ship to be able to fix it themselves and get the fix back to us?
That's the off the top of my head list to give you sort of the idea of things that really want to be thought about.
Because if we're just switching out CVS for {git,hg,bzr,svn,foobarbazl} and don't think about these things then we're putting all of our developers onto a learning curve to switch for what is likely to be little gain.
Moreover, there have been requests from developers to explicitly *NOT* significantly change the development methodology for F8 after the changes of F7.
Bill
On Wednesday 06 June 2007 14:29:41 Bill Nottingham wrote:
Moreover, there have been requests from developers to explicitly *NOT* significantly change the development methodology for F8 after the changes of F7.
I firmly believe that this is not something we can do by F8 release. This is something we need to discuss and strawman and put up proof of concepts and get more people thinking on it during the F8 cycle and try to implement during the F9 cycle if possible.
From an inwardly looking perspective, this still fits well as RHEL5 just shipped and it will be a bit before RHEL6 really starts hitting heavy, so making a change to internal infrastructure to mirror what's going on in Fedora during the F9 cycle is still good timing.
infrastructure@lists.fedoraproject.org