On Tue, 26 Feb 2019 at 14:39, Clement Verna <cverna@fedoraproject.org> wrote:
Hi all,
The fedora-packages [0] code base is showing its age. The code base and its technology stack (the Turbogears2 [1] web framework and the Moksha [2] middleware) are currently not ready for Python 3, and I am not planning to do the work required to make them Python 3 compatible, so the application will stop working when Fedora 29 reaches EOL.
In order to keep the service running, I have started a proof of concept (fedora-search [3]) to replace the backend of the application. fedora-search would be a REST API service offering a full-text search API. Such a service would then be available for other applications to use, and fedora-packages would become a frontend-only application using the service provided by fedora-search.
While the POC shows that this is a viable solution, I don't think we should proceed that way, for the simple reason that it adds yet another code base to maintain. I think we should use this opportunity to consider using Elasticsearch instead of maintaining our own "search engine".
The main issues with getting Elasticsearch working in the past were the following:
1. The number of systems needed to make it work. There is a large difference between their 'proof-of-concept see how great this is' setups and their 'ok you want to do anything with load' setups, in everything from storage to number of search nodes to network speeds. [The amount of hardware for the data we have was to start with 5-8 dedicated Dell systems, some amount of shared fast storage, and N virtual machines with a 10-40GB backbone, or throwing all of Fedora Infrastructure into the cloud at once, because feeding it from PHX2 to the cloud is expensive.]
2. Packaging of elasticsearch was a mess. At the time we had rules that all packages needed to be packaged in Fedora and follow Fedora packaging rules. [This one has been relaxed.]
3. Running Elasticsearch was a large service in itself. It doesn't take care of itself, and we would need one or more people who know it well to keep it running. [This goes down the ladder: the logstash backends are also full services.] Most of that was written in Java, which no one on the team at the time had good experience with.
4. The need for a kibana/elasticsearch query expert. Just like with any database, most of the queries you can make are the worst kind. They will take a lot more CPU/memory/time than they should, making just grepping for the data faster.
However, that was 3-5 years ago, so a lot has changed since then.
I think that Elasticsearch offers quite a few advantages:
- Powerful query language
- Python bindings
- JavaScript bindings
- Can be deployed in our infrastructure or used as a service
- Can be useful for other applications (docs.fp.o, pagure, ??)
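To give a rough idea of what the query language looks like, here is a minimal sketch in Python that builds a full-text search request body using the Elasticsearch Query DSL "multi_match" query. The field names (name, summary, description), the boost on name, and the function name are assumptions made up for illustration; they are not taken from fedora-packages or fedora-search. The sketch only builds the JSON body and does not talk to a cluster.

```python
import json


def build_package_query(text, size=10):
    """Build an Elasticsearch Query DSL body for a full-text package search.

    The field names below are hypothetical; a real index would define
    its own mapping.
    """
    return {
        "size": size,  # maximum number of hits to return
        "query": {
            "multi_match": {
                "query": text,
                # "name^3" boosts matches on the (assumed) name field
                "fields": ["name^3", "summary", "description"],
            }
        },
    }


if __name__ == "__main__":
    # Print the body that would be sent to a search endpoint.
    print(json.dumps(build_package_query("python3 requests", size=5), indent=2))
```

The resulting dictionary is the kind of body that would be sent to an index's search endpoint, either through the Python bindings or as plain JSON over HTTP.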
So what is the general feeling about using Elasticsearch in our infrastructure? Should we look at deploying a cluster in our infra, or should we approach the Council to see if we can get funding to have this service hosted by Elastic?
Thanks,
Clément
[0] - https://apps.fedoraproject.org/packages/
[1] - http://www.turbogears.org/
[2] - https://mokshaproject.github.io/mokshaproject.net/
[3] - https://github.com/fedora-infra/fedora-search