Greetings.
As previously announced, fedoraproject is moving many of its servers from one datacenter (phx2, near Phoenix, Arizona, USA) to another (iad2, near Arlington, Virginia, USA).
As we move from the old datacenter to the new, we will have a temporary reduction in capacity. The new datacenter has a smaller, less-redundant, lower-capacity version of our infrastructure. Over the next two weeks, we will migrate services to it so that we can finish moving out of the old datacenter.
After everything is moved from the old datacenter, many of the servers there will be shipped to the new datacenter and then re-added to bring us back to full redundancy and capacity.
Our detailed checklist for these migrations is available at https://hackmd.io/@fedorainfra2020/rJpsA4FLL
To summarize what we are moving when:
2020-06-03 wed: The fedoraproject master mirrors will move to IAD2. A very brief outage may be noticed as DNS changes propagate. There may be some mirroring slowdowns as we work out bugs.
2020-06-04 thu: Our internal ansible control host and the fedoraproject wiki will move. The wiki will be down for a few hours.
2020-06-05 fri: Our meeting minutes archive (https://meetbot.fedoraproject.org) and our freenode irc bot (zodbot) will move. These two services will see an outage of an hour or less.
2020-06-07 sun: For the following week, we will pause adding new packages and unretiring packages to avoid problems.
2020-06-08 mon: Our fedora-messaging bus and gateways to it (github2fedmsg, bugzilla2fedmsg), mirrormanager, product definition center (pdc), and our identity and authentication systems will move. Messages over our message bus may be slow or missing, and users may be unable to log in at various times as we migrate services over.
Additionally, we will be stopping services that will not be back until later in the month. These include:
- Fedocal
- Badges
- Nuancier
- koschei
- simple-koji-ci
- All staging services (*.stg.fedoraproject.org)
2020-06-09 tue: The build and packaging ecosystem will move. This includes koji, src.fedoraproject.org, osbs, odcs, container registries, and bodhi (updates system). On this day, maintainers should avoid builds/updates if at all possible, as they may or may not work at various times.
2020-06-10 wed: Various small apps (mdapi, anitya, waiverdb, greenwave, etc), mailman/lists.fedoraproject.org, and our datagrepper/datanommer services will move. Mailing lists will be down for several hours as data is migrated. Datagrepper will be down for most of the day as its database is moved. Other services will be down for short amounts of time while they are moved.
2020-06-11 thu: Various small site building apps (docs building, fedora websites building, reviewstats, blockerbugs) and elections will be moved. Elections will stay up until the currently running elections complete. (GO VOTE! https://elections.fedoraproject.org)
2020-06-12 fri: Catch up and fix issues day, along with re-enabling package unretirements/new packages, and other 'paused' items.
The week after this, servers will be shipped, and the week after that we expect to start setting them up and getting them re-added. During this time, we may have to make further changes to what services are available in order to deal with load changes.
If you have any questions or concerns, please file an infrastructure ticket (https://pagure.io/fedora-infrastructure) or come talk to us in #fedora-admin on irc.freenode.net.
Finally, I'd like to ask everyone to be patient as we do this move. I know that it's painful when you are unable to contribute something when you have time to do so, but rest assured that we are trying to migrate things as quickly and smoothly as we can.
Thanks.
kevin
On 02/06/20 18:40, Kevin Fenzi wrote:
2020-06-08 mon: Our fedora-messaging bus and gateways to it (github2fedmsg, bugzilla2fedmsg), mirrormanager, product definition center (pdc), and our identity and authentication systems. Messages over our message bus may be slow or missing and users may be unable to login at various times as we migrate services over.
Additionally, we will be stopping services that will not be back until later in the month. These include:
- Fedocal
- Badges
- Nuancier
- koschei
- simple-koji-ci
- All staging services (*.stg.fedoraproject.org)
2020-06-09 tue: The build and packaging ecosystem. This includes koji, src.fedoraproject.org, osbs, odcs, container registries, bodhi (updates system). During this day maintainers should avoid builds/updates if at all possible as they may or may not work at various times.
Since Bodhi relies heavily on messages on the fedora-messaging bus for nearly everything it does, I think it would be better to bring it down the day before. Missing messages would mean updates not being moved on, builds not being correctly tagged, etc., so it may be worth stopping users from creating new updates while fedora-messaging is down.
Mattia
On Wed, Jun 03, 2020 at 05:31:17AM +0000, Mattia Verga wrote:
Since Bodhi relies heavily on messages on the fedora-messaging bus for nearly everything it does, I think it would be better to bring it down the day before. Missing messages would mean updates not being moved on, builds not being correctly tagged, etc., so it may be worth stopping users from creating new updates while fedora-messaging is down.
It won't actually be down; it will just be moved to the new datacenter.
So, applications that connect to 'rabbitmq.fedoraproject.org' will get the iad2 cluster instead of the phx2 one. Since we will change that name when we move the cluster, everything should be using the same cluster, so it should still all work.
Does that make sense?
i.e., a build happens in phx2, the message goes to the _iad2_ rabbitmq cluster, and bodhi (listening there) acts on it.
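The hostname switch described above can be pictured with a small toy model. This is only a sketch under stated assumptions: the `Cluster` class, the `dns` dictionary standing in for real DNS, and the topic name are all hypothetical and do not reflect the actual rabbitmq or fedora-messaging configuration.

```python
from collections import defaultdict


class Cluster:
    """Toy stand-in for a message broker cluster (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.queues = defaultdict(list)  # topic -> pending message bodies

    def publish(self, topic, body):
        self.queues[topic].append(body)

    def consume(self, topic):
        # Hand back everything pending on the topic and empty the queue.
        messages, self.queues[topic] = self.queues[topic], []
        return messages


clusters = {"phx2": Cluster("phx2"), "iad2": Cluster("iad2")}

# Stand-in for DNS: clients only ever know the hostname.
dns = {"rabbitmq.fedoraproject.org": "phx2"}


def connect(hostname):
    """Return whichever cluster the hostname currently points at."""
    return clusters[dns[hostname]]


def migrate_broker():
    """Repoint the shared hostname at the new datacenter."""
    dns["rabbitmq.fedoraproject.org"] = "iad2"


migrate_broker()

# A builder still running in phx2 publishes by hostname...
producer = connect("rabbitmq.fedoraproject.org")
producer.publish("build.tagged", "build-1 tagged")  # topic name is made up

# ...and a consumer such as bodhi connects by the same hostname.
consumer = connect("rabbitmq.fedoraproject.org")
received = consumer.consume("build.tagged")
```

Because both sides resolve the same name, they end up on the same cluster no matter which datacenter each one runs in.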
The one wrinkle is fedmsg, which we can't easily switch over like that. So, we may well not have all our fedmsgs. ;(
kevin
On 04/06/20 03:26, Kevin Fenzi wrote:
It won't actually be down, it will just be moved to the new datacenter.
So, applications that connect to 'rabbitmq.fedoraproject.org' will get the iad2 cluster instead of the phx2 one. Since we will change that when we move it everything should be using the same cluster, so it should still all work.
Does that make sense?
ie, a build happens in phx2, the message goes to the _iad2_ rabbitmq cluster and bodhi (listening there) acts on it.
The one wrinkle is fedmsg, which we can't easily switch over like that. So, we may well not have all our fedmsgs. ;(
Yeah, my concerns were about losing messages, since Bodhi relies on them for moving things along, especially when creating new updates. When a new update is created, Bodhi, Koji and robosignatory work together, listening to messages, to push the update to pending testing. If we lose a lot of messages, we may end up with a lot of builds not correctly signed/tagged and a lot of updates stuck in pending.
I just don't know how many messages you expect to be lost during the migration.
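To make the concern concrete, here is a deliberately simplified sketch of a message-driven workflow. The event names and the two-state model are hypothetical; the real Bodhi/Koji/robosignatory flow has more steps and different names.

```python
# Hypothetical events an update must see before it can leave "pending".
REQUIRED_EVENTS = ("build_signed", "build_tagged")


def update_status(received_events):
    """Return (state, missing_events) for one update."""
    missing = [e for e in REQUIRED_EVENTS if e not in received_events]
    if missing:
        return "pending", missing
    return "testing", []


def stuck_updates(events_by_update):
    """Map each update still pending to the events it never received."""
    stuck = {}
    for update, events in events_by_update.items():
        state, missing = update_status(events)
        if state == "pending":
            stuck[update] = missing
    return stuck


events = {
    "update-1": {"build_signed", "build_tagged"},
    "update-2": {"build_signed"},  # the "build_tagged" message was lost
}
stuck = stuck_updates(events)
```

In this model a single lost message is enough to leave an update in pending until someone notices and re-triggers the missing step.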
Mattia
On Thu, Jun 04, 2020 at 05:55:59AM +0000, Mattia Verga wrote:
Yeah, my concerns were about losing messages, since Bodhi relies on them for moving things along, especially when creating new updates. When a new update is created, Bodhi, Koji and robosignatory work together, listening to messages, to push the update to pending testing. If we lose a lot of messages, we may end up with a lot of builds not correctly signed/tagged and a lot of updates stuck in pending.
I just don't know how many messages you expect to be lost during the migration.
Yeah, something to consider. However, bodhi, koji and robosignatory are all on fedora-messaging. I don't think we are going to lose many (if any) of those. fedmsgs are in more flux because each service connects to all the other ones, so if service a talks to service b and we migrate a, it likely won't be able to talk to b until b is also migrated.
That said, we have few things left on fedmsg, so I don't think the impact will be that big.
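The difference between the two topologies can be sketched as a toy reachability model. The assumptions here are loud ones: services are just labels, a direct fedmsg-style link is assumed to work only when both ends sit in the same datacenter, and a brokered link is assumed to work whenever both ends can reach the broker. None of this is taken from the real configuration.

```python
from itertools import permutations


def mesh_links(services, location):
    """Working (sender, receiver) pairs when services connect to each other directly.

    Assumption: a direct link works only if both ends are in the same datacenter.
    """
    return {
        (a, b)
        for a, b in permutations(services, 2)
        if location[a] == location[b]
    }


def broker_links(services, can_reach_broker):
    """Working pairs when every service talks through one central broker."""
    return {
        (a, b)
        for a, b in permutations(services, 2)
        if can_reach_broker[a] and can_reach_broker[b]
    }


services = ["a", "b", "c"]

# Service a has been migrated; b and c have not.
location = {"a": "iad2", "b": "phx2", "c": "phx2"}
mesh_during = mesh_links(services, location)

# Once b is also migrated, a and b can talk again.
location_after = {"a": "iad2", "b": "iad2", "c": "phx2"}
mesh_after = mesh_links(services, location_after)

# With a broker, location does not matter as long as the broker is reachable.
brokered = broker_links(services, {s: True for s in services})
```

Under these assumptions the mesh loses every link that crosses datacenters mid-migration, while the brokered setup keeps all six.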
kevin
infrastructure@lists.fedoraproject.org