On Wed, 27 Feb 2019 at 16:05, Jim Perrin jperrin@redhat.com wrote:
How much heresy is involved in us using Amazon's elasticsearch service for this, so that we don't have yet-another-thing to maintain?
I was wondering how much data we are looking to shove there, whether that data needs to be 'protected', and how fast we need the back-and-forth with the cloud to be. The heresy side I don't have any say in.
On 2/27/19 4:19 AM, Stephen John Smoogen wrote:
On Tue, 26 Feb 2019 at 14:39, Clement Verna cverna@fedoraproject.org wrote:
Hi all,
The fedora-packages [0] code base is showing its age. The code base and the technology stack (Turbogears2 [1] web framework and the Moksha [2] middleware) are currently not ready for Python3, and I am not planning to do the work required to make them Python3 compatible, so the application will stop working when Fedora 29 is EOL.
In order to keep the service running, I have started a Proof Of Concept (fedora-search [3]) to replace the backend of the application. Fedora-search would be a REST API service offering a full-text search API. Such a service would then be available for other applications to use, and fedora-packages would become a frontend-only application using the service provided by fedora-search.
While the POC shows that this is a viable solution, I don't think that we should be proceeding that way, for the simple reason that this adds yet another code base to maintain. I think we should use this opportunity to consider using Elasticsearch instead of maintaining our own "search engine".
The main issues with getting elasticsearch working in the past were the following:
- The number of systems needed to make it work. There is a large difference between their 'proof-of-concept see how great this is' and 'ok you want to do anything with load' setups in everything from storage to number of search nodes to network speeds. [The amount of hardware for the data we have was to start with 5-8 dedicated Dell systems, some amount of shared fast storage, and N virtual machines with a 10-40GB backbone.. or throwing all of Fedora Infrastructure at once into the cloud.. because the feed from PHX2 to the cloud is expensive.]
- Packaging of elasticsearch was a mess. At the time we had rules that all packages needed to be packaged in Fedora and follow Fedora packaging rules. [This one has been relaxed.]
- Running elasticsearch was a large service in itself. It doesn't take care of itself and we would need one or more people who know it well to keep it running. [This goes down the ladder.. the logstash backends are also full services.. ] Most of that was written in Java, which no one on the team at the time had good experiences with.
- A kibana/elasticsearch query expert. Just like any database, most of the queries you can make are the worst kind. They will take a lot more CPU/memory/time than they should, making just grepping for data faster.
However, that was 3-5 years ago, so a lot has changed since then.
I think that Elasticsearch offers quite a few advantages :
- Powerful Query language
- Python bindings
- Javascript bindings
- Can be deployed in our infrastructure or used as a service
- Can be useful for other applications ( docs.fp.o, pagure, ??)
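To make the "query language" and "Python bindings" points a bit more concrete, here is a minimal sketch of what a full-text package search could look like as an Elasticsearch query DSL body built in plain Python. The field names (`name`, `summary`, `description`) and the helper `build_package_query` are assumptions for illustration only; they are not taken from fedora-search or fedora-packages.

```python
import json


def build_package_query(text, size=10):
    """Build an Elasticsearch query DSL body for a full-text package search.

    The field names below are hypothetical; a real index would define its own.
    """
    return {
        "size": size,  # maximum number of hits to return
        "query": {
            "multi_match": {
                "query": text,
                # "^3" boosts matches on the (assumed) package name field.
                "fields": ["name^3", "summary", "description"],
            }
        },
    }


body = build_package_query("text editor")
print(json.dumps(body, indent=2))
```

A body like this would be POSTed to an index's `_search` endpoint, whether the cluster runs in our own infrastructure or is hosted as a service, so the frontend code should not need to care which option is chosen.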
So what is the general feeling about using Elasticsearch in our infrastructure? Should we look at deploying a cluster in our infra, or should we approach the Council to see if we can get funding to have this service hosted by Elastic?
Thanks Clément
[0] - https://apps.fedoraproject.org/packages/
[1] - http://www.turbogears.org/
[2] - https://mokshaproject.github.io/mokshaproject.net/
[3] - https://github.com/fedora-infra/fedora-search
_______________________________________________
infrastructure mailing list -- infrastructure@lists.fedoraproject.org
To unsubscribe send an email to infrastructure-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedorapro...