On Fri, Jun 26, 2020 at 10:32:14AM +0100, David Kirwan wrote:
Hi all,
If we are moving towards OpenShift/Kubernetes-backed services, we should probably be sticking with containers rather than Vagrant. We can use CRC [1] (CodeReady Containers) or minikube [2] for most local dev work.
I'd be very much in favour of having an Infra-managed Prometheus instance (+ Grafana and Alertmanager on OpenShift); it's something I hoped to work on within CPE sustaining, in fact.
You know, I'm not in love with that stack. It could well be that I just haven't used it enough or know enough about it, but it seems just needlessly complex. ;(
I'd prefer we start out at a lower level... what are our requirements? Then, see how we can set up something to meet those.
Off the top of my head (I'm sure I can think of more):
* Ability to collect/gather rsyslog output from all our machines.
* Ability to generate reports of 'variances' from all that (i.e., what odd messages should a human look at?)
* Handle all the logs from OpenShift, possibly multiple clusters?
* Ability to easily drill down and look at some specific historical logs (e.g., show me the logs for the bodhi-web pods from last week when there was an issue).

Perhaps prometheus/grafana/alertmanager is the solution, but there are also tons of other open source projects out there that we might look into.
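To make the 'variances' item a bit more concrete, here is a rough sketch in Python of one way such a report could work: collapse the variable parts of each message (IPs, PIDs, ports) into placeholders, count the resulting templates, and surface the ones that almost never occur. The names `normalise`, `rare_messages` and `max_count` are made up for illustration, and the placeholder rules are an assumption, not a description of any existing tool.

```python
import re
from collections import Counter

# Assumed normalisation rules: replace variable fields with placeholders
# so that messages differing only in IPs or numbers share one template.
_RULES = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<ip>"),
    (re.compile(r"\d+"), "<n>"),
]


def normalise(message):
    """Collapse IPs and numbers in a log message into placeholders."""
    for pattern, placeholder in _RULES:
        message = pattern.sub(placeholder, message)
    return message


def rare_messages(lines, max_count=1):
    """Return the message templates seen at most max_count times, sorted."""
    counts = Counter(
        normalise(line.strip()) for line in lines if line.strip()
    )
    return sorted(
        template for template, count in counts.items() if count <= max_count
    )


if __name__ == "__main__":
    # Hypothetical sample input; real input would come from collected logs.
    sample = [
        "sshd[1234]: Accepted publickey for root from 10.0.0.1 port 22",
        "sshd[1235]: Accepted publickey for root from 10.0.0.2 port 22",
        "kernel: Out of memory: Killed process 4321",
    ]
    for template in rare_messages(sample):
        print(template)
```

Anything real would need much better normalisation and a baseline over time, but it shows the shape of the requirement: the common messages drop out and a human only looks at what is left.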
kevin --
On Fri, 26 Jun 2020 at 10:23, Luca BRUNO lucab@redhat.com wrote:
On Thu, 25 Jun 2020 15:59:44 -0700 Kevin Fenzi kevin@scrye.com wrote:
What else would we want in there?
Monitoring - we will likely get our nagios setup again soon just because it's mostly easy, but it's also not ideal.
On this one (or more broadly "observability") I'd still like to see an infra-managed Prometheus to internally cover and sanity-check the "openshift-apps" services. I remember this was on the "backlog" dashboard at Flock'19 but I don't know if it got translated to an actual action item/ticket in the end.
Ciao, Luca
_______________________________________________
infrastructure mailing list -- infrastructure@lists.fedoraproject.org
To unsubscribe send an email to infrastructure-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedorapro...
-- David Kirwan Software Engineer
Community Platform Engineering @ Red Hat
T: +(353) 86-8624108 IM: @dkirwan