Hello all, I have been working on a proposal for how to handle Fedora's Docker Registry requirements so that we can deliver our future "docker-ized" content to users without creating a bottleneck where the entire user community hits a single download location. Below you will find that proposal, and I would greatly appreciate feedback. The good news is that I did a small PoC of the plan along with the workflow (simulating some steps in the process for brevity), and it all worked! The potentially bad news is that it introduces a fair number of net-new services inside the Fedora Infrastructure.
I originally sent this to the Fedora rel-eng list[0] for general high-level sanity checking of the solution before attempting to burden the Infrastructure team with any of this, so most of this email is verbatim from the original one. However, towards the end I go more in-depth on the net-new requirements this will impose on the Fedora Infrastructure Team. I want to make sure those things are taken into consideration, because I absolutely do not want to ask the Infra Team to just take on a bunch of new stuff and then ride off into the sunset while y'all deal with it all; instead, I want to make sure the things I am requesting are as reasonable as possible and are realistic expectations for everyone involved.
Without further ado, the pitch is below.
Quick vocab background for Docker stuff:
registry: a collection of docker image repositories
repository: named after an image; a collection of multiple tags of that image
tag: an arbitrary string assigned to a specific docker image (identified by the image's sha256 checksum)
NOTE: The "latest" tag is special and is assumed if no tag is provided. This also holds for a 'docker pull' operation, so an image tagged "latest" will be the default image pulled by users.
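To make the tag defaulting concrete, here is a minimal Python sketch of how a name and optional tag map onto a Registry v2 manifest request (the registry host below is a made-up placeholder, and this is not the docker client's actual code):

    def manifest_url(registry, repository, tag=None):
        # An omitted tag defaults to "latest", per the note above.
        tag = tag or "latest"
        return f"https://{registry}/v2/{repository}/manifests/{tag}"

    # 'docker pull example-registry.fedoraproject.org/fedora' asks for:
    print(manifest_url("example-registry.fedoraproject.org", "fedora"))
    # -> https://example-registry.fedoraproject.org/v2/fedora/manifests/latest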
Proposal:
Pulp[1] + Crane[2] + MirrorManager[3] + Docker Distribution[4]
Pulp[1] is a platform for managing repositories of content, such as software packages, and making it available to a large number of consumers. It is also capable of managing docker content.
Crane is a stand-alone python flask wsgi application written by the Pulp team to serve as an API entry point for the docker client; it is what answers a user's 'docker pull'. It does not, however, create content manifests or host docker image content itself. Instead it depends on someone creating the manifest metadata (or having Pulp publish it) and serves 302 redirects to the docker client pointing at where the docker images actually live.
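To show the shape of that behavior, here is a minimal flask sketch assuming a hard-coded content location; Crane itself builds this mapping from the metadata Pulp publishes, and the route and names below are illustrative, not Crane's actual code:

    from flask import Flask, redirect

    app = Flask(__name__)

    # Where the layer bytes actually live; hard-coded here, but in Crane
    # this comes from the published repo metadata.
    BLOB_BASE = "https://download.fedoraproject.org/pub/docker"

    @app.route("/v2/<path:repository>/blobs/<digest>")
    def blob(repository, digest):
        # Answer the docker client with a 302 instead of serving content.
        return redirect(f"{BLOB_BASE}/{repository}/{digest}", code=302)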
I'll assume everyone here knows their way around MirrorManager.
Docker Distribution is the de facto standard open source implementation of the Docker Registry V2 API spec[5]. It provides many features, but the ability to have its back-end storage provided by a "mirror network" like the one Fedora has at its disposal is not one of them. The reason we need it in place is that the mechanism by which you could push a docker image directly to Pulp in Docker Registry v1 no longer exists in v2, so we must instead perform a "sync" operation between the two. (This is a common problem for all known "third party" v2 registry implementations.)
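For a feel of what that sync step could look like, here is a rough sketch against Pulp's REST API; the endpoint shape follows the Pulp 2 documentation as I understand it, and the host, repo id, and credentials are placeholders:

    import requests

    PULP = "https://pulp.example.fedoraproject.org"
    REPO_ID = "fedora-docker-candidate"  # placeholder repo id

    # Ask Pulp to sync the repository's content in from docker-distribution.
    resp = requests.post(
        f"{PULP}/pulp/api/v2/repositories/{REPO_ID}/actions/sync/",
        json={"override_config": {}},  # fall back to the importer's stored config
        auth=("admin", "changeme"),    # placeholder credentials
    )
    resp.raise_for_status()
    print("sync task spawned:", resp.json())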
Workflow: OSBS will perform builds; as these builds complete they will be pushed to the docker-distribution (v2) registry and will be considered "candidate images". Pulp will sync and publish the candidate repository.
Testing will occur using the "candidate images" (details of how we want to handle that are outside the scope of this proposal).
A "candidate image" will be marked stable once it's criteria have been satisfied to do so. (This is vague because this is a topic of ongoing discussion and work to decide what criteria an image will need to abide by before being considered "stable" and promoted as such)
Once stable, Pulp will publish that repository's content to a directory; we will split that content and sync the image layers along with their metadata to the MirrorManager master mirror. We will also sync the repo metadata published by Pulp to somewhere Crane can pick it up. (This could, and likely will, be something that Bodhi triggers via the Pulp REST API.)
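A loose sketch of that split-and-sync step, assuming made-up paths and hosts (the layout Pulp publishes and the actual rsync targets would be pinned down during deployment):

    import subprocess

    PUBLISHED = "/var/lib/pulp/published/docker"  # assumed publish directory
    MASTER = "rsync://master.example.fedoraproject.org/docker"  # assumed master mirror
    CRANE = "crane01.example.fedoraproject.org:/var/lib/crane/"  # assumed metadata drop

    # Image layers and their metadata go out to the MirrorManager master...
    subprocess.run(["rsync", "-avH", f"{PUBLISHED}/content/", MASTER], check=True)
    # ...while the repo metadata Pulp publishes goes where Crane can pick it up.
    subprocess.run(["rsync", "-av", f"{PUBLISHED}/metadata/", CRANE], check=True)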
MirrorManager will distribute the image layers and their metadata to the mirrors.
Crane will get the new repository metadata and will serve redirects to the new content relative to download.fedoraproject.org, which will perform another redirect (via MirrorManager) to where the docker client, upon a "docker pull", will find its content.
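One way to eyeball that redirect chain (Crane 302, then download.fedoraproject.org, then a MirrorManager-chosen mirror) is to trace it by hand; the host and digest below are illustrative only:

    import requests

    # A blob URL of the sort Crane would answer; placeholder values.
    url = ("https://example-registry.fedoraproject.org"
           "/v2/fedora/blobs/sha256:0123abcd")

    resp = requests.get(url, allow_redirects=True)
    for hop in resp.history:
        print(hop.status_code, "->", hop.headers.get("Location"))
    print("final:", resp.url)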
I have put together an ASCII diagram in hopes that it will help in seeing the full picture.
https://maxamillion.fedorapeople.org/FedoraPulpDocker.txt
(I couldn't get it to paste cleanly into my email client)
Some more in depth technical items around this solution that I think the Fedora Infrastructure Team are likely interested in:
Pulp Requirements: - An AMQP message queue; currently qpid and rabbitmq are supported upstream. However, the requirement appears to stem from the use of Celery[6], and Celery upstream supports redis[7] as a broker backend, so I have requested that it be made available as a supported option in Pulp[8] (see the sketch after this list). This will obviously take some amount of dev time, but we can plan for that if adding a message queue to Fedora Infra is a show stopper.
- MongoDB. This is currently a hard requirement, but PostgreSQL is planned to replace MongoDB in the future[9] (probably a year-ish timeline on that). The question is: can we, from a Fedora Project standpoint, wait that long for the new feature before having a solution in place? I imagine some of this will need to be planned/scoped as time goes on and we learn more, but it's worth keeping in mind.
- Storage. I've been told Pulp likes a lot of storage. I don't know hard numbers for what we'd need since we're getting into uncharted territory, but I've heard that a few hundred GB is not uncommon in Pulp deployments when combining the MongoDB storage needs with all the artifacts in the repos.
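To give a sense of scale for the Celery change requested in [8]: on the Celery side, swapping brokers is just a different broker URL, so the dev time is on the Pulp side. A minimal sketch with illustrative settings (not Pulp's actual configuration):

    from celery import Celery

    # What Pulp effectively requires today (an AMQP broker, qpid or rabbitmq):
    # app = Celery("pulp_tasks", broker="amqp://guest@localhost//")

    # redis as the broker backend, which Celery upstream already supports[7]:
    app = Celery("pulp_tasks", broker="redis://localhost:6379/0")

    @app.task
    def sync_repo(repo_id):
        """Stand-in for the kind of work Pulp dispatches over the queue."""
        return f"synced {repo_id}"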
Crane Requirements: - Crane is just a small python wsgi app written in flask
A couple of things to note about maintenance and uptime considerations:
The Intermediate docker-distribution registry is needed for builds in koji+OSBS
Pulp will be required for "promotion" of builds from candidate to testing or stable
Crane will be required by end users out in the world in order to actually pull down Docker images from us.
The only service here that needs to be public end-user facing (i.e. wide open to the internet and not have access locked to a FAS group) is Crane. All other components should be able to be locked down like the "Fedora internal" components: koji (builders, etc.), bodhi (signing, etc.), and similar.
If there are any questions, comments, or feedback please let me know.
Thank you, -AdamM
[0] - https://lists.fedoraproject.org/archives/list/rel-eng@lists.fedoraproject.or...
[1] - http://www.pulpproject.org/
[2] - https://github.com/pulp/crane
[3] - https://github.com/fedora-infra/mirrormanager2/
[4] - https://github.com/docker/distribution/
[5] - https://docs.docker.com/registry/spec/api/
[6] - http://www.celeryproject.org/
[7] - http://redis.io/
[8] - https://pulp.plan.io/issues/1900
[9] - https://pulp.plan.io/issues/1803
On Fri, 6 May 2016 17:30:18 -0500 Adam Miller maxamillion@fedoraproject.org wrote:
...snip...
Proposal:
Pulp[1] + Crane[2] + MirrorManager[3] + Docker Distribution[4]
Are all of these packaged up? For EPEL? (aside mirrormanager).
...snip...
Workflow: OSBS will perform builds; as these builds complete they will be pushed to the docker-distribution (v2) registry and will be considered "candidate images". Pulp will sync and publish the candidate repository.
Testing will occur using the "candidate images" (details of how
we want to handle that are outside the scope of this proposal).
So, at this point the 'candidate image' is just in pulp? Or it's been published to a directory and mirrored out? I'm guessing it would be published and mirrored so people could test it?
...snip...
MirrorManager will distribute the image layers and their metadata to the mirrors.
This should work for mirrors to just rsync the directories, right?
...snip...
Some more in depth technical items around this solution that I think the Fedora Infrastructure Team are likely interested in:
Pulp Requirements: - An AMQP message queue; currently qpid and rabbitmq are supported upstream. However, the requirement appears to stem from the use of Celery[6], and Celery upstream supports redis[7] as a broker backend, so I have requested that it be made available as a supported option in Pulp[8]. This will obviously take some amount of dev time, but we can plan for that if adding a message queue to Fedora Infra is a show stopper.
Well, what needs to listen on/publish to this queue?
We have tried to avoid celery several times in the past and always managed to, but perhaps we can't this time. Is there any alternative to the celery use?
- MongoDB. This is currently a hard requirement, but PostgreSQL is planned to replace MongoDB in the future[9] (probably a year-ish timeline on that). The question is: can we, from a Fedora Project standpoint, wait that long for the new feature before having a solution in place? I imagine some of this will need to be planned/scoped as time goes on and we learn more, but it's worth keeping in mind.
well, OSBS already uses mongo (as does openstack), so I don't think this is a blocker; it would be nice to reuse the roles/mongodb for it tho
- Storage. I've been told Pulp likes a lot of storage. I don't know hard numbers for what we'd need since we're getting into uncharted territory, but I've heard that a few hundred GB is not uncommon in Pulp deployments when combining the MongoDB storage needs with all the artifacts in the repos.
ok. Can this storage be NFS? Or is there some fs requirement?
Crane Requirements: - Crane is just a small python wsgi app written in flask
Hurray!
A couple of things to note about maintenance and uptime considerations:
The Intermediate docker-distribution registry is needed for
builds in koji+OSBS
Pulp will be required for "promotion" of builds from candidate to
testing or stable
Crane will be required for end users out in the world to access
in order to actually pull down Docker images from us.
The only service here that needs to be public end-user facing
(i.e. wide open to the internet and not have access locked to a FAS group) is Crane. All other components should be able to be locked down similar to the "Fedora internal" components koji (builders, etc), bodhi (signing, etc) and similar.
What port(s) does crane need open? Is this something we could proxy and cache via varnish?
Can we/should we look at any HA with any of these parts? For example, if we wanted to apply a kernel update and reboot everything, how could we avoid any downtime that users would see? Would it be as easy as having 2 crane frontends or would downtime on the other internal components affect crane?
As far as backups of this, we would only need the pulp storage and the mongodb? Or are there other parts that need backups to restore the entire stack in case of doom?
I'm sure I will think of more, but that's all at the moment...
kevin
On Mon, May 09, 2016 at 01:49:26PM -0600, Kevin Fenzi wrote:
On Fri, 6 May 2016 17:30:18 -0500 Adam Miller maxamillion@fedoraproject.org wrote:
...snip...
Proposal:
Pulp[1] + Crane[2] + MirrorManager[3] + Docker Distribution[4]
Are all of these packaged up? For EPEL? (aside mirrormanager).
...snip...
Workflow: OSBS will perform builds; as these builds complete they will be pushed to the docker-distribution (v2) registry and will be considered "candidate images". Pulp will sync and publish the candidate repository.
Testing will occur using the "candidate images" (details of how
we want to handle that are outside the scope of this proposal).
So, at this point the 'candidate image' is just in pulp? Or it's been published to a directory and mirrored out? I'm guessing it would be published and mirrored so people could test it?
...snip...
MirrorManager will distribute the image layers and their metadata to the mirrors.
This should work for mirrors to just rsync the directories, right?
From a MirrorManager point of view it would be important that this does not result in thousands of new files. The atomic directory contains over 700,000 files, which massively increases sync time for the mirrors and crawl times on our side, as well as increasing database size. Depending on the number of files, this might be a candidate for a new rsync module.
Adrian
On 10 May 2016 at 03:02, Adrian Reber adrian@lisas.de wrote:
On Mon, May 09, 2016 at 01:49:26PM -0600, Kevin Fenzi wrote:
On Fri, 6 May 2016 17:30:18 -0500 Adam Miller maxamillion@fedoraproject.org wrote:
...snip...
Proposal:
Pulp[1] + Crane[2] + MirrorManager[3] + Docker Distribution[4]
Are all of these packaged up? For EPEL? (aside mirrormanager).
...snip...
Workflow: OSBS will perform builds; as these builds complete they will be pushed to the docker-distribution (v2) registry and will be considered "candidate images". Pulp will sync and publish the candidate repository.
Testing will occur using the "candidate images" (details of how
we want to handle that are outside the scope of this proposal).
So, at this point the 'candidate image' is just in pulp? Or it's been published to a directory and mirrored out? I'm guessing it would be published and mirrored so people could test it?
...snip...
MirrorManager will distribute the image layers and their metadata to the mirrors.
This should work for mirrors to just rsync the directories, right?
From a MirrorManager point of view it would be important that this does not result in thousands of new files. The atomic directory contains over 700,000 files, which massively increases sync time for the mirrors and crawl times on our side, as well as increasing database size. Depending on the number of files, this might be a candidate for a new rsync module.
I was going to bring up that I think atomic is a good candidate for its own sync module. The number of lookups that have to be done on the atomic directory has really increased the IOPS on data that only certain groups are interested in.
Adrian
On Tue, May 10, 2016 at 08:52:00AM -0400, Stephen John Smoogen wrote:
On 10 May 2016 at 03:02, Adrian Reber adrian@lisas.de wrote:
On Mon, May 09, 2016 at 01:49:26PM -0600, Kevin Fenzi wrote:
On Fri, 6 May 2016 17:30:18 -0500 Adam Miller maxamillion@fedoraproject.org wrote:
...snip...
Proposal:
Pulp[1] + Crane[2] + MirrorManager[3] + Docker Distribution[4]
Are all of these packaged up? For EPEL? (aside mirrormanager).
...snip...
Workflow: OSBS will perform builds; as these builds complete they will be pushed to the docker-distribution (v2) registry and will be considered "candidate images". Pulp will sync and publish the candidate repository.
Testing will occur using the "candidate images" (details of how
we want to handle that are outside the scope of this proposal).
So, at this point the 'candidate image' is just in pulp? Or it's been published to a directory and mirrored out? I'm guessing it would be published and mirrored so people could test it?
...snip...
MirrorManager will distribute the image layers and their metadata to the mirrors.
This should work for mirrors to just rsync the directories, right?
From a MirrorManager point of view it would be important that this does not result in thousands of new files. The atomic directory contains over 700,000 files, which massively increases sync time for the mirrors and crawl times on our side, as well as increasing database size. Depending on the number of files, this might be a candidate for a new rsync module.
I was going to bring up that I think atomic is a good candidate for its own sync module. The number of lookups that have to be done on the atomic directory has really increased the IOPS on data that only certain groups are interested in.
Agreed. I just did not want to go into more detail about the atomic tree in this thread. The important thing for me was to point out that we should not further increase the number of files in fedora/linux/.
Adrian
On Mon, May 9, 2016 at 2:49 PM, Kevin Fenzi kevin@scrye.com wrote:
On Fri, 6 May 2016 17:30:18 -0500 Adam Miller maxamillion@fedoraproject.org wrote:
...snip...
Proposal:
Pulp[1] + Crane[2] + MirrorManager[3] + Docker Distribution[4]
Are all of these packaged up? For EPEL? (aside mirrormanager).
Yes, packaged in Fedora; I would have to double-check on EPEL.
...snip...
Workflow: OSBS will perform builds; as these builds complete they will be pushed to the docker-distribution (v2) registry and will be considered "candidate images". Pulp will sync and publish the candidate repository.
Testing will occur using the "candidate images" (details of how
we want to handle that are outside the scope of this proposal).
So, at this point the 'candidate image' is just in pulp? Or it's been published to a directory and mirrored out? I'm guessing it would be published and mirrored so people could test it?
I was planning to have them in pulp and accessible to testers but wasn't sure if we publish testing stuff to the mirrors for rpms. If we do, then we could certainly follow suit here.
...snip...
MirrorManager will distribute the image layers and their metadata to the mirrors.
This should work for mirrors to just rsync the directories, right?
Correct.
...snip...
Some more in depth technical items around this solution that I think the Fedora Infrastructure Team are likely interested in:
Pulp Requirements: - An AMQP message queue; currently qpid and rabbitmq are supported upstream. However, the requirement appears to stem from the use of Celery[6], and Celery upstream supports redis[7] as a broker backend, so I have requested that it be made available as a supported option in Pulp[8]. This will obviously take some amount of dev time, but we can plan for that if adding a message queue to Fedora Infra is a show stopper.
Well, what needs to listen on/publish to this queue?
We have tried to avoid celery several times in the past and always managed to, but perhaps we can't this time. Is there any alternative to the celery use?
This is all isolated inside of Pulp, nothing outside of pulp would need to interact with the message bus and from what I understand, Pulp is heavily tied to celery so getting rid of it is not really an option.
- MongoDB. This is currently a hard requirement, but PostgreSQL is planned to replace MongoDB in the future[9] (probably a year-ish timeline on that). The question is: can we, from a Fedora Project standpoint, wait that long for the new feature before having a solution in place? I imagine some of this will need to be planned/scoped as time goes on and we learn more, but it's worth keeping in mind.
well, OSBS already uses mongo (as does openstack), so I don't think this is a blocker; it would be nice to reuse the roles/mongodb for it tho
OSBS does not use mongo; why do you think that it does?
- Storage. I've been told Pulp likes a lot of storage. I don't know hard numbers for what we'd need since we're getting into uncharted territory, but I've heard that a few hundred GB is not uncommon in Pulp deployments when combining the MongoDB storage needs with all the artifacts in the repos.
ok. Can this storage be NFS? Or is there some fs requirement?
To the best of my knowledge, NFS will be fine here.
Crane Requirements: - Crane is just a small python wsgi app written in flask
Hurray!
A couple of things to note about maintenance and uptime considerations:
The Intermediate docker-distribution registry is needed for
builds in koji+OSBS
Pulp will be required for "promotion" of builds from candidate to
testing or stable
Crane will be required for end users out in the world to access
in order to actually pull down Docker images from us.
The only service here that needs to be public end-user facing
(i.e. wide open to the internet and not have access locked to a FAS group) is Crane. All other components should be able to be locked down similar to the "Fedora internal" components koji (builders, etc), bodhi (signing, etc) and similar.
What port(s) does crane need open? Is this something we could proxy and cache via varnish?
The end users will hit https/443 and that should be the only open port we need; we can redirect port 80 to 443 if we like.
Can we/should we look at any HA with any of these parts? For example, if we wanted to apply a kernel update and reboot everything, how could we avoid any downtime that users would see? Would it be as easy as having 2 crane frontends or would downtime on the other internal components affect crane?
Having two crane frontends would be great; I was planning on having them hide behind haproxy. The other internal components are only needed to publish content for crane to serve; once content is published, they are no longer needed and can go up/down as we like. Crane just serves 302 redirects to where the content actually lives, which will be somewhere out in mirrormanager land.
As far as backups of this, we would only need the pulp storage and the mongodb? Or are there other parts that need backups to restore the entire stack in case of doom?
I haven't actually looked into that just yet. I'm not sure about disaster recovery for pulp or docker-distribution. Crane itself just needs the files backed up.
-AdamM
I'm sure I will think of more, but that's all at the moment...
kevin
On Tue, 10 May 2016 14:12:28 -0500 Adam Miller maxamillion@fedoraproject.org wrote:
On Mon, May 9, 2016 at 2:49 PM, Kevin Fenzi kevin@scrye.com wrote:
On Fri, 6 May 2016 17:30:18 -0500 Adam Miller maxamillion@fedoraproject.org wrote:
...snip...
Proposal:
Pulp[1] + Crane[2] + MirrorManager[3] + Docker Distribution[4]
Are all of these packaged up? For EPEL? (aside mirrormanager).
Yes, packaged in Fedora; I would have to double-check on EPEL.
ok. Either will work, but epel is nicer from a sysadmin perspective...
...snip...
I was planning to have them in pulp and accessible to testers but wasn't sure if we publish testing stuff to the mirrors for rpms. If we do, then we could certainly follow suit here.
We do. We push updates-testing rpms out about as often as (if not more often than) updates rpms.
This is all isolated inside of Pulp, nothing outside of pulp would need to interact with the message bus and from what I understand, Pulp is heavily tied to celery so getting rid of it is not really an option.
ok. Fair enough.
...snip...
OSBS does not use mongo; why do you think that it does?
https://infrastructure.fedoraproject.org/infra/ansible/files/openshift/mongo...
This is a leftover from some older version?
The end users will hit https/443 and that should be the only open port we need; we can redirect port 80 to 443 if we like.
Great.
Having two crane frontends would be great; I was planning on having them hide behind haproxy. The other internal components are only needed to publish content for crane to serve; once content is published, they are no longer needed and can go up/down as we like. Crane just serves 302 redirects to where the content actually lives, which will be somewhere out in mirrormanager land.
Excellent. Yes, we should make two frontends then.
As far as backups of this, we would only need the pulp storage and the mongodb? Or are there other parts that need backups to restore the entire stack in case of doom?
I haven't actually looked into that just yet. I'm not sure about disaster recovery for pulp or docker-distribution. Crane itself just needs the files backed up.
ok. We should figure that out at some point, but it sounds like if we have the files and crane we can at least keep serving the data we already serve in the case of a disaster.
kevin
On Fri, May 13, 2016 at 4:19 PM, Kevin Fenzi kevin@scrye.com wrote:
On Tue, 10 May 2016 14:12:28 -0500 Adam Miller maxamillion@fedoraproject.org wrote:
On Mon, May 9, 2016 at 2:49 PM, Kevin Fenzi kevin@scrye.com wrote:
On Fri, 6 May 2016 17:30:18 -0500 Adam Miller maxamillion@fedoraproject.org wrote:
...snip...
Proposal:
Pulp[1] + Crane[2] + MirrorManager[3] + Docker Distribution[4]
Are all of these packaged up? For EPEL? (aside mirrormanager).
Yes, packaged in Fedora; I would have to double-check on EPEL.
ok. Either will work, but epel is nicer from a sysadmin perspective...
...snip...
I was planning to have them in pulp and accessible to testers but wasn't sure if we publish testing stuff to the mirrors for rpms. If we do, then we could certainly follow suit here.
We do. We push updates-testing rpms out about as often as (if not more often than) updates rpms.
This is all isolated inside of Pulp, nothing outside of pulp would need to interact with the message bus and from what I understand, Pulp is heavily tied to celery so getting rid of it is not really an option.
ok. Fair enough.
...snip...
OSBS does not use mongo; why do you think that it does?
https://infrastructure.fedoraproject.org/infra/ansible/files/openshift/mongo...
This is a leftover from some older version?
Definitely yes; looking at the commit logs, that file is from 2012. OpenShift Architecture v2 required MongoDB but wasn't docker or kubernetes based. OpenShift Architecture v3, which has been "current" upstream for almost 2 years now (iirc), was a complete rewrite; it is built on top of kubernetes and docker and uses etcd instead of mongodb, but that's isolated to the OpenShift environment and managed as an embedded part of OpenShift.
The end users will hit https/443 and that should be the only open port we need; we can redirect port 80 to 443 if we like.
Great.
Having two crane frontends would be great; I was planning on having them hide behind haproxy. The other internal components are only needed to publish content for crane to serve; once content is published, they are no longer needed and can go up/down as we like. Crane just serves 302 redirects to where the content actually lives, which will be somewhere out in mirrormanager land.
Excellent. Yes, we should make two frontends then.
As far as backups of this, we would only need the pulp storage and the mongodb? Or are there other parts that need backups to restore the entire stack in case of doom?
I haven't actually looked into that just yet. I'm not sure about disaster recovery for pulp or docker-distribution. Crane itself just needs the files backed up.
ok. We should figure that out at some point, but it sounds like if we have the files and crane we can at least keep serving the data we already serve in the case of a disaster.
The backup docs from upstream seem pretty straightforward; we'll likely need to evaluate some things once we have a PoC in place and get all the systems tied together.
http://pulp.readthedocs.io/en/latest/user-guide/server.html
-AdamM
kevin