Yeah makes sense Kevin,
Hmm just threw a little POC together to show some of the basics of the Openshift monitoring stack.
- Sample configuration for the User Workload monitoring stack (which is in tech preview), e.g. data retention, persistent storage claim size, etc.
- Small ruby app that has a /metrics endpoint, with 2 gauge metrics being exported
- Prometheus ServiceMonitor to monitor the service
- Prometheus PrometheusRule to fire alerts based on those metrics
- WIP, but I'll add example Grafana GrafanaDashboards which graph the metrics at some future point
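To give a feel for the second bullet, here's a minimal sketch of what exporting gauges in the Prometheus text exposition format looks like. The metric names are made up for illustration (the real POC presumably uses the prometheus-client gem rather than hand-rolling the format):

```ruby
# Build the Prometheus text exposition body for a set of gauge metrics.
# Each entry maps a metric name to [help text, current value].
def prometheus_exposition(gauges)
  gauges.map do |name, (help, value)|
    [
      "# HELP #{name} #{help}",
      "# TYPE #{name} gauge",
      "#{name} #{value}"
    ].join("\n")
  end.join("\n") + "\n"
end

# Two hypothetical gauges such an app might export.
gauges = {
  "crypto_btc_price_usd" => ["Current BTC price in USD", 9132.55],
  "crypto_eth_price_usd" => ["Current ETH price in USD", 225.1]
}

body = prometheus_exposition(gauges)
puts body
```

Serve that body from a /metrics route (Sinatra, Rack, whatever) and Prometheus can scrape it once a ServiceMonitor points at the Service.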
https://github.com/davidkirwan/crypto_monitoring
On Wed, 1 Jul 2020 at 17:13, Kevin Fenzi kevin@scrye.com wrote:
On Sun, Jun 28, 2020 at 01:01:31AM +0100, David Kirwan wrote:
Hmm, the (prometheus, grafana, alertmanager) stack itself is pretty simple I would have said, but I agree it is certainly complex when installed/integrated on Openshift. (Most things are needlessly complex on Openshift tbh, and it's an order of magnitude worse on Openshift 4 with these operators added to the mix.)
Well, they may not be that complex... like I said, I haven't used them much, so I might be missing how they work.
It would be the obvious choice for me anyway, considering this stack is available by default on a fresh Openshift install. We could make use of this cluster monitoring stack, especially if we're also deploying our services on Openshift. I might throw a POC/demo together to show how "easy" it is to get your app hooked into the Openshift cluster monitoring stack, or the UserWorkload tech preview monitoring stack[1].
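For reference, enabling the tech preview user workload stack and tuning its retention/storage comes down to a pair of ConfigMaps, roughly like the following (field names as per the Openshift 4.4/4.5 docs; values are illustrative only, so double-check against the docs for the exact version in use):

```yaml
# Enable the tech-preview user workload monitoring stack.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    techPreviewUserWorkload:
      enabled: true
---
# Retention and persistent storage for the user workload Prometheus
# (example values only).
apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    prometheus:
      retention: 24h
      volumeClaimTemplate:
        spec:
          resources:
            requests:
              storage: 10Gi
```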
I agree it makes sense to use this for openshift apps. I am not sure at all we should use it for non openshift apps.
If we did use this stack it would add a little extra pain with regards to monitoring storage maintenance/pruning, but maybe far less than running/maintaining a whole separate monitoring stack outside the Openshift cluster. There are also efficiencies to be made when developers are already in the Openshift/Kubernetes mindset: creating an extra Service and ServiceMonitor is a minor thing, etc.
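To illustrate how minor those extra objects are, here's a sketch of a ServiceMonitor plus a matching PrometheusRule. The label, port name, metric name, and threshold are all made up for illustration:

```yaml
# Hypothetical ServiceMonitor: scrape /metrics from any Service in the
# project labeled app: crypto-monitoring, on its "web" port.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: crypto-monitoring
spec:
  selector:
    matchLabels:
      app: crypto-monitoring
  endpoints:
  - port: web
    path: /metrics
    interval: 30s
---
# Hypothetical PrometheusRule: fire an alert when a gauge crosses a threshold.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: crypto-monitoring-rules
spec:
  groups:
  - name: crypto.rules
    rules:
    - alert: BTCPriceHigh
      expr: crypto_btc_price_usd > 10000
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "BTC price above 10000 USD"
```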
Sure, but we have a lot of legacy stuff we want to monitor/review logs for too.
The right answer might be to just separate those two use cases with different solutions, but then we have 2 things to maintain. It's probably going to take some investigation and some proof of concept working.
kevin