Re: What is our technical debt?

26 Jun 2020

On Fri, Jun 26, 2020 at 10:32:14AM +0100, David Kirwan wrote:
...
Hi all,
If we are moving towards openshift/kubernetes backed services, we should
probably be sticking with containers rather than Vagrant. We can use CRC
[1] (Code Ready Containers) or minikube [2] for most local dev work.
I'd be very much in favour of having an Infra managed Prometheus instance
(+ grafana and alertmanager on Openshift); it's something I hoped to work on
within CPE sustaining, in fact.
You know, I'm not in love with that stack. It could well be that I just
haven't used it enough or don't know enough about it, but it seems just
needlessly complex. ;(
I'd prefer we start out at a lower level... what are our requirements?
Then, see how we can set up something to meet those.
Off the top of my head (I'm sure I can think of more):
* Ability to collect/gather rsyslog output from all our machines. 
* Ability to generate reports of 'variances' from all that (ie, what odd
messages should a human look at?)
* Handle all the logs from openshift, possibly multiple clusters?
* Ability to easily drill down and look at some specific historical logs
(ie, show me the logs for the bodhi-web pods from last week when there
was an issue).
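To make the 'variances' item above a bit more concrete, here is a rough
sketch of one way such a report could work: collapse each log message into
a template by masking numbers, count the templates, and surface the ones
that rarely occur. This is only an illustration under assumptions; the
names template and rare_messages, the masking regex, and the sample lines
are all hypothetical and not taken from any existing tool or from our setup.

```python
import re
from collections import Counter

# Hypothetical normalisation: mask standalone numbers, dotted numbers (IPs,
# versions) and 0x-prefixed hex values so that messages differing only in
# PIDs or addresses collapse into one template.
_VARIABLE = re.compile(r"\b(?:0x[0-9a-fA-F]+|\d+(?:\.\d+)*)\b")


def template(message: str) -> str:
    """Replace variable-looking tokens with a fixed placeholder."""
    return _VARIABLE.sub("<N>", message)


def rare_messages(lines, max_count=1):
    """Return (template, count) pairs seen at most max_count times, rarest first."""
    counts = Counter(template(line.strip()) for line in lines if line.strip())
    rare = [(t, c) for t, c in counts.items() if c <= max_count]
    return sorted(rare, key=lambda tc: (tc[1], tc[0]))


# Made-up sample lines, purely for illustration.
sample = [
    "sshd[1201]: Accepted publickey for user from 10.0.0.5",
    "sshd[1307]: Accepted publickey for user from 10.0.0.9",
    "kernel: EXT4-fs error on device sda1",
]

for text, count in rare_messages(sample):
    print(count, text)
```

The two sshd lines collapse into one template and are treated as routine,
while the kernel line is reported because it appears only once. A real
report would need a smarter notion of "odd" than a raw count.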
Perhaps prometheus/grafana/alertmanager is the solution, but there are
also tons of other open source projects out there that we might look
into.
kevin
--
...

[1] https://github.com/code-ready/crc
[2] https://minikube.sigs.k8s.io/docs/

On Fri, 26 Jun 2020 at 10:23, Luca BRUNO lucab@redhat.com wrote:
...
On Thu, 25 Jun 2020 15:59:44 -0700
Kevin Fenzi kevin@scrye.com wrote:
...
...
What else would we want in there?
Monitoring - we will likely get our nagios setup again soon just
because it's mostly easy, but it's also not ideal.
On this one (or more broadly "observability") I'd still like to see an
infra-managed Prometheus to internally cover and sanity-check the
"openshift-apps" services.
I remember this was on the "backlog" dashboard at Flock'19 but I don't
know if it got translated to an actual action item/ticket in the end.
Ciao, Luca
_______________________________________________
infrastructure mailing list -- infrastructure@lists.fedoraproject.org
To unsubscribe send an email to
infrastructure-leave@lists.fedoraproject.org
Fedora Code of Conduct:
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives:
https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedorapro...
-- 
David Kirwan
Software Engineer
Community Platform Engineering @ Red Hat
T: +(353) 86-8624108     IM: @dkirwan
...

