On Sat, Jun 27, 2020 at 09:29:46AM +0000, Luca BRUNO wrote:
> I appreciate this reply because it contains very relevant technical points, but it highlights that we are looking at different problems at different levels.
Yeah, I mentioned that there would be more requirements... ;)
> Your list of requirements basically describes an "ingestion + storage + anomaly inference for logs" solution, where the infrastructure and the applications running on top of it are blended together. This would be handy to have and it's indeed a good problem to work on (I don't have specific answers/suggestions for this).
> But that's not the space I'm looking at, nor the one that Prometheus applies to. Instead, the gap/use case is narrower and more akin to "SNMP counters for containerized web services". Logs are surely useful to drill down into problems and investigate root causes, but that comes after being able to answer "are any of those web services experiencing non-transient issues?".
...snip...
Oh, I completely agree we need something for this too. Perhaps they can be the same stack/solution, or perhaps they can't.
Thanks very much for the really cool real-life example. :)
kevin