Hi all,
I've given a lot of thought to our publishing situation. The current solution, using an unmaintained version of Publican on an unsupported Fedora release, is untenable. The site we have been maintaining in web.git is incompatible with current versions of publican, besides being massive and prone to breakage. We have been working towards a replacement, slowly.
The proposed successor to web.git is an RPM-based system where Publican creates SRPMs of the guides, Fedora's koji buildsystem builds packages from the SRPMs, and a VM instance installs the package to build the site. Creating the SRPM would be analogous to the `publican build --embedtoc ...` from the old process, and when the VM installs the package, it does the `publican install-book` step of the process. We currently have the koji infrastructure in place, a mostly viable way for newly built packages to automatically be installed on the VM, and a functional but mostly vanilla site frontend. Along the way, I've identified a few caveats to this method, to which I'll add some general observations:
- Each language of each release of each guide is a unique package. To publish anything new, we must coordinate with releng to have the package defined in koji. Releng is already overburdened.
- Each language must have a separately maintained revision history.
- The buildroot shared by koji and the VM required updated versions of publican and publican's dependencies, and now must be maintained there (which is currently not happening afaik).
- None of the breakage problems we've dealt with in web.git will go away. Unusual language codes, bad draft procedures, wonky *_Info.xml issues, and whatever black magic that ends with my manually running sqlite invocations against the site db - that's all still there.
- The site frontend needs to be completely redesigned from the ground up, and we have not demonstrated the motivation to see that through. There are intricacies here that we didn't discover from a straightforward following of the publican user's guide, and many subjectively frustrating surprises.
- A publican website can only publish publican-friendly content. To the best of my knowledge, that means docbook only. Our contributor base is declining, and prospective contributors have consistently demonstrated a lack of interest in writing docbook.
- A publican website does not provide an effective presentation of many smaller articles. I've been disagreed with on this point, and will concede that it's definitely possible to have 200 things under that F21 category, but I maintain that presentation in that way would be inimical to reader browsing.
With all this in mind, I've decided to be honest with myself, and you, in saying that I'm not motivated to make the publican website frontend work. Something must be done to deal with our site's incompatibility with current Publican, but from a usage perspective, I feel like there's a very high upfront investment - and a moderate continuing investment in maintaining the el6-docs koji repo - with no substantial gain in the product delivered. For normal usage, we've simply replaced `publican build;publican install-book` with `publican package;koji build`. Rudi had initially volunteered to do *all* of the setup to make this work, but he's a busy guy and it really isn't fair to expect him to deal with that level of implementation for us.
We've tossed around ideas in #fedora-docs, and in discussing it, I've developed a wishlist for an improved publishing platform:
- We use release-based branches to maintain content for a given Fedora release. Anything committed to these branches is intended for publication, so that should happen automatically.
- It should be simpler to make drafts available. If we can automate publishing on release branches, we can do the same with master.
- We don't do as good of a job as we should with pushing and pulling strings from translation. Pushes and pulls could be automated.
- There should be an effective way to publish and organize sundry articles.
- The publishing platform should be able to process arbitrary markup formats. I think docbook is great, and publican does a good job with it, but there will be room for more contributions if that isn't mandatory.
- The frontend of the site should be adaptable. docs.fp.o would ideally use the same design elements as other official Fedora sites for a more unified appearance.
- Some validation on commits would be nice.
The working theory at this point is to use a continuous integration system like Jenkins to automate builds. I've had a Jenkins instance running locally this week, with new builds triggered via fedmsg signals generated by our commits. This part works smoothly, with the exception of a fedwatch bug that I've crudely worked around, but triggers using python-fedmsg really wouldn't be too difficult. The infra folks are also working on a Jenkins setup, so there's potential to share experience and buildslaves. Turning that into a browseable frontend is more immediately viable than you might think; http://sourceforge.net/projects/jenkinscat/ is designed just for that and can be customized for our needs. Jenkins plugins to cycle strings from Zanata, or some other method, could come down the road. There's room for other enhancements, too.
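As a rough sketch of how commit-driven triggers might be decided, the Python function below takes an already-decoded message and picks a build job. The topic string, the message layout (`msg` -> `commit` -> `repo`/`branch`), the branch naming (`master`, `f21`), and the job names are all assumptions for illustration, not confirmed details of fedmsg or of the Jenkins instance described above. It deliberately does not import python-fedmsg, so it can be tried without a message bus.

```python
import re

# Assumed topic for git commit notifications; verify against real messages.
GIT_RECEIVE_TOPIC = "org.fedoraproject.prod.git.receive"

# Assumed branch naming: "master" for drafts, "f21"/"F21" style for releases.
RELEASE_BRANCH = re.compile(r"^[fF](\d+)$")


def pick_job(topic, message):
    """Return a hypothetical Jenkins job name for a commit message, or None."""
    if topic != GIT_RECEIVE_TOPIC:
        return None
    # Assumed message layout; missing keys simply mean "no build".
    commit = message.get("msg", {}).get("commit", {})
    repo = commit.get("repo")
    branch = commit.get("branch")
    if not repo or not branch:
        return None
    if branch == "master":
        # Commits to master feed the drafts site.
        return "draft-{}".format(repo)
    match = RELEASE_BRANCH.match(branch)
    if match:
        # Commits to a release branch are intended for publication.
        return "publish-{}-f{}".format(repo, match.group(1))
    return None  # work branches do not publish in this sketch
```

Whatever listens for messages would call `pick_job(topic, message)` for each one and start the named job only when the result is not `None`.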
The CI idea is something I'm excited about, and motivated to work on. I'd love to spend that time working with others, if you're interested.
--Pete
Hello Pete
I had an idea to test CI in one VM sending docs to another VM running a webserver, all in a local data centre. Not had the time yet. I would be interested to work with you if no one else picks up the gauntlet.
Regards
On 02/11/2015 12:49 PM, Stephen Wadeley wrote:
On 02/11/2015 03:53 PM, Pete Travis wrote:
Hi all, stuff.
Hello Pete
I had an idea to test CI in one VM sending docs to another VM running a webserver, all in a local data centre. Not had the time yet. I would be interested to work with you if no one else picks up the gauntlet.
Regards
The SOP for large, public-facing sites is to use a proxy. For docs.fp.o, that means that nine globally distributed servers are grabbing the built content in web.git via rsync once an hour, and the public is served the content from whichever proxy is appropriate and available.
It does mean we're limited to static content, no database backed CMS stuff - at least without a significantly larger engineering investment that I'd rather not get into. But, if we can produce a consumable site on a VM, the proxies can handle the web serving part of the job; the transition can be done by simply changing that one rsync invocation in ansible.
Hi,
On Feb 11, 2015, at 3:53 PM, Pete Travis <me@petetravis.com> wrote:
We've tossed around ideas in #fedora-docs, and in discussing it, I've developed a wishlist for an improved publishing platform:
- We use release-based branches to maintain content for a given Fedora
release. Anything committed to these branches is intended for publication, so that should happen automatically.
Yes.
- It should be simpler to make drafts available. If we can automate
publishing on release branches, we can do the same with master.
Addressed below with Jenkinscat :) I’d also like to see an easy way for individuals to see their non-master committed work in a stage environment. I don’t think the answer lies in local builds though as they aren’t shareable. Pulling this off will almost certainly require us to be willing to consider a modification to our branching strategy. I have ideas here, if there is interest.
- We don't do as good of a job as we should with pushing and pulling
strings from translation. Pushes and pulls could be automated.
Can this be tied to the publishable branching above?
- There should be an effective way to publish and organize sundry
articles.
I would encourage us to think about a single publication system here. Therefore the challenge is how to present the articles, not how to publish them. The articles can all live in a single repo and all be republished when one changes. The right publication means this isn’t a problem. I am doing this with some non-Fedora docs now and the runtime is trivial (markdown -> html).
- The publishing platform should be able to process arbitrary markup
formats. I think docbook is great, and publican does a good job with it, but there will be room for more contributions if that isn't mandatory.
This is the biggest challenge, after design, IMHO.
I have not yet found a great multi-input publishing mechanism. Pandoc comes close, but has some serious shortcomings in some input formats. Additionally, being written in Haskell, it is harder to find people who can extend it (in my experience). Additionally, even if that works you still have to deal with branding. Asciidoctor is supposed to be better, and rst via sphinx and others is also great. Lastly, there are engines like Jekyll which might be a good fit too.
Here I think we need to either make some arbitrary decisions (i.e. we support only x and y, or we only support this subset of markup z, etc.) or risk having to support a lot of conversion engines. My suggestion for today is to try to define the minimal markup needs for our publication chain and i18n and then choose two markups (Docbook and ???) and move forward.
As a start on minimal markup needs, it sounds like we need entity support, possibly xinclude-style insertions, and a way to flag material that shouldn’t be translated. We also need to decide whether we will allow non-semantic markup. If we can ensure strong reviews and/or gating on publication, non-semantic markup can be made to work.
I’d also like to see us consider a way to easily enable drive-by contribution and editing. I’ve been working on an architecture for a different project that has similar requirements. It currently exists mostly in my mind, but I hope to get it into writing and demo mode soon.
- The frontend of the site should be adaptable. docs.fp.o would
ideally use the same design elements as other official Fedora sites for a more unified appearance.
Yes. Anything we can do to offload branding and design is a huge massive win!
- Some validation on commits would be nice.
Take a look at https://github.com/emender; it is a small but growing validator for integration into CI. It isn’t commit level directly, but with the right kind of architecture could get you close.
The working theory at this point is to use a continuous integration system like Jenkins to automate builds. I've had a Jenkins instance running locally this week, with new builds triggered via fedmsg signals generated by our commits. This part works smoothly, with the exception of a fedwatch bug that I've crudely worked around, but triggers using python-fedmsg really wouldn't be too difficult.
This is fantastic. I read this as you suggesting we can do both stage and publish on this platform.
The infra folks are also working on a Jenkins setup, so there's potential to share experience and buildslaves. Turning that into a browseable frontend is more immediately viable than you might think; http://sourceforge.net/projects/jenkinscat/ is designed just for that and can be customized for our needs.
Leverage CI: +1
Jenkins plugins to cycle strings from Zanata, or some other method, could come down the road. There's room for other enhancements, too.
I’d love to hear from the i18n folks how they’d like to see this. Do they want continuous updates or to work on a cadence?
The CI idea is something I'm excited about, and motivated to work on. I'd love to spend that time working with others, if you're interested.
I’d like to work on this, after March 15. I am booked until then. I’d love to push some container tech in here because it is cool and probably a good fit, but that isn’t a requirement.
regards,
bex
On 02/11/2015 03:27 PM, Brian (bex) Exelbierd wrote:
Hi,
On Feb 11, 2015, at 3:53 PM, Pete Travis <me@petetravis.com> wrote:
We've tossed around ideas in #fedora-docs, and in discussing it, I've developed a wishlist for an improved publishing platform:
- We use release-based branches to maintain content for a given Fedora
release. Anything committed to these branches is intended for publication, so that should happen automatically.
Yes.
- It should be simpler to make drafts available. If we can automate
publishing on release branches, we can do the same with master.
Addressed below with Jenkinscat :) I’d also like to see an easy way for individuals to see their non-master committed work in a stage environment. I don’t think the answer lies in local builds though as they aren’t shareable. Pulling this off will almost certainly require us to be willing to consider a modification to our branching strategy. I have ideas here, if there is interest.
Interest, yes! I don't see a way around "release" branches, but I do like the idea of shared work branches, if that's what you're getting at.
- We don't do as good of a job as we should with pushing and pulling
strings from translation. Pushes and pulls could be automated.
Can this be tied to the publishable branching above?
Yes, it seems that we can simply add the appropriate commands/actions for the zanata client to the Jenkins build job. If the build passes, push the strings. Pull the strings, and if the build passes, merge the new strings into the release branch.
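The gating described here can be sketched in a few lines of Python. This is only an illustration of the ordering: the four callables are hypothetical hooks that a build job would supply (for example, wrappers around the build command and the zanata client), and none of the names come from Jenkins or Zanata.

```python
def sync_translations(build, push_strings, pull_strings, merge_strings):
    """Gate each translation step on a passing build.

    All four arguments are caller-supplied callables (hypothetical hooks);
    `build` returns True when the build passes.
    Returns the list of steps that actually ran.
    """
    steps = []
    # Only push source strings that come from a buildable tree.
    if not build():
        return steps
    push_strings()
    steps.append("push")
    # Pull translated strings, then rebuild before merging them.
    pull_strings()
    steps.append("pull")
    if build():
        merge_strings()
        steps.append("merge")
    return steps
```

If the second build fails, the pulled strings are left unmerged, so a broken translation never lands on the release branch.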
- There should be an effective way to publish and organize sundry
articles.
I would encourage us to think about a single publication system here. Therefore the challenge is how to present the articles, not how to publish them. The articles can all live in a single repo and all be republished when one changes. The right publication means this isn’t a problem. I am doing this with some non-Fedora docs now and the runtime is trivial (markdown -> html).
Right, the presentation is what needs the most work, the front end that users will browse through to get to the individual articles and guides. There are both tooling and design challenges there. But, erm... single repo? Like web.git? Maybe I'm misunderstanding you, but nobody wants to keep that around :P
- The publishing platform should be able to process arbitrary markup
formats. I think docbook is great, and publican does a good job with it, but there will be room for more contributions if that isn't mandatory.
This is the biggest challenge, after design, IMHO.
I have not yet found a great multi-input publishing mechanism. Pandoc comes close, but has some serious shortcomings in some input formats. Additionally, being written in Haskell, it is harder to find people who can extend it (in my experience). Additionally, even if that works you still have to deal with branding. Asciidoctor is supposed to be better, and rst via sphinx and others is also great. Lastly, there are engines like Jekyll which might be a good fit too.
Here I think we need to either make some arbitrary decisions (i.e. we support only x and y, or we only support this subset of markup z, etc.) or risk having to support a lot of conversion engines. My suggestion for today is to try to define the minimal markup needs for our publication chain and i18n and then choose two markups (Docbook and ???) and move forward.
As a start on minimal markup needs, it sounds like we need entity support, possibly xinclude-style insertions, and a way to flag material that shouldn’t be translated. We also need to decide whether we will allow non-semantic markup. If we can ensure strong reviews and/or gating on publication, non-semantic markup can be made to work.
I’d also like to see us consider a way to easily enable drive-by contribution and editing. I’ve been working on an architecture for a different project that has similar requirements. It currently exists mostly in my mind, but I hope to get it into writing and demo mode soon.
I really don't want to get hung up on support for additional formats at this stage. Given tooling to dynamically build a front end, it shouldn't be a problem to add support for whatever format. Jenkins can probably handle whatever we throw at it; we can work out the tooling to build other formats after the core solution is in place.
- The frontend of the site should be adaptable. docs.fp.o would
ideally use the same design elements as other official Fedora sites for a more unified appearance.
Yes. Anything we can do to offload branding and design is a huge massive win!
- Some validation on commits would be nice.
Take a look at https://github.com/emender; it is a small but growing validator for integration into CI. It isn’t commit level directly, but with the right kind of architecture could get you close.
The working theory at this point is to use a continuous integration system like Jenkins to automate builds. I've had a Jenkins instance running locally this week, with new builds triggered via fedmsg signals generated by our commits. This part works smoothly, with the exception of a fedwatch bug that I've crudely worked around, but triggers using python-fedmsg really wouldn't be too difficult.
This is fantastic. I read this as you suggesting we can do both stage and publish on this platform.
Sure, we could definitely have e.g. drafts.docs.fedoraproject.org that's built from master, or with navigable content from feature work branches, etc. For now, it is probably best to limit the scope to a public-facing solution, then iterate.
The infra folks are also working on a Jenkins setup, so there's potential to share experience and buildslaves. Turning that into a browseable frontend is more immediately viable than you might think; http://sourceforge.net/projects/jenkinscat/ is designed just for that and can be customized for our needs.
Leverage CI: +1
Jenkins plugins to cycle strings from Zanata, or some other method, could come down the road. There's room for other enhancements, too.
I’d love to hear from the i18n folks how they’d like to see this. Do they want continuous updates or to work on a cadence?
I wrote plugins, then actually started looking into it and talked to #zanata, and all we need to do is use the normal zanata cli tools as part of the Jenkins build job. The Jenkins git plugin has some features to merge-on-success we could leverage.
The loudest voice here has been Jerome Fenal from the very active French team, and he advocates a continuous flow. In a practical sense, we can set the master branch to push to master on Zanata, the F21 branch for F21 on zanata, etc; translators that want the continuous flow can work on master, those that want something more curated can work on the release branches. If Zanata's translation memory works like it did for Transifex, the translations for identical strings will automatically be available for all branches, no redundant translation required.
The CI idea is something I'm excited about, and motivated to work on. I'd love to spend that time working with others, if you're interested.
I’d like to work on this, after March 15. I am booked until then. I’d love to push some container tech in here because it is cool and probably a good fit, but that isn’t a requirement.
regards,
bex
Sounds great! Keep the theory coming, until you have time for implementation.
On Feb 12, 2015, at 1:38 AM, Pete Travis <me@petetravis.com> wrote:
On 02/11/2015 03:27 PM, Brian (bex) Exelbierd wrote:
On Feb 11, 2015, at 3:53 PM, Pete Travis <me@petetravis.com> wrote:
- It should be simpler to make drafts available. If we can automate
publishing on release branches, we can do the same with master.
Addressed below with Jenkinscat :) I’d also like to see an easy way for individuals to see their non-master committed work in a stage environment. I don’t think the answer lies in local builds though as they aren’t shareable. Pulling this off will almost certainly require us to be willing to consider a modification to our branching strategy. I have ideas here, if there is interest.
Interest, yes! I don't see a way around "release" branches, but I do like the idea of shared work branches, if that's what you're getting at.
It is :) Jenkinscat should be able to be easily extended in this direction …
I will try to draw up some of my ideas during a long train trip this weekend .. but no promises.
- There should be an effective way to publish and organize sundry
articles.
I would encourage us to think about a single publication system here. Therefore the challenge is how to present the articles, not how to publish them. The articles can all live in a single repo and all be republished when one changes. The right publication means this isn’t a problem. I am doing this with some non-Fedora docs now and the runtime is trivial (markdown -> html).
Right, the presentation is what needs the most work, the front end that users will browse through to get to the individual articles and guides.
I think the answer to this lies in having someone think about presentation without thinking at all about where the content comes from. Perhaps the folks in the Marketing Project can help us here?
There are both tooling and design challenges there. But, erm... single repo? Like web.git? Maybe I'm misunderstanding you, but nobody wants to keep that around :P
On this other project, we have a single repo that holds a small collection of markdown documents. These documents are all processed and published to a separate documentation area from our main documentation which is in DocBook (and typically in a 1:1 repo:book format).
This has the advantage of just feeling cleaner. I believe, but must confess not having paid enough attention back when I had free time, that web.git is the “master” repo for the website. That scares me too :)
- The publishing platform should be able to process arbitrary markup
formats. I think docbook is great, and publican does a good job with it, but there will be room for more contributions if that isn't mandatory.
This is the biggest challenge, after design, IMHO.
I have not yet found a great multi-input publishing mechanism. Pandoc comes close, but has some serious shortcomings in some input formats. Additionally, being written in Haskell, it is harder to find people who can extend it (in my experience). Additionally, even if that works you still have to deal with branding. Asciidoctor is supposed to be better, and rst via sphinx and others is also great. Lastly, there are engines like Jekyll which might be a good fit too.
Here I think we need to either make some arbitrary decisions (i.e. we support only x and y, or we only support this subset of markup z, etc.) or risk having to support a lot of conversion engines. My suggestion for today is to try to define the minimal markup needs for our publication chain and i18n and then choose two markups (Docbook and ???) and move forward.
As a start on minimal markup needs, it sounds like we need entity support, possibly xinclude-style insertions, and a way to flag material that shouldn’t be translated. We also need to decide whether we will allow non-semantic markup. If we can ensure strong reviews and/or gating on publication, non-semantic markup can be made to work.
I’d also like to see us consider a way to easily enable drive-by contribution and editing. I’ve been working on an architecture for a different project that has similar requirements. It currently exists mostly in my mind, but I hope to get it into writing and demo mode soon.
I really don't want to get hung up on support for additional formats at this stage. Given tooling to dynamically build a front end, it shouldn't be a problem to add support for whatever format. Jenkins can probably handle whatever we throw at it; we can work out the tooling to build other formats after the core solution is in place.
I agree that we start with DocBook. But I think we architect for 2 markups. That will force us to think through the ramifications of our choices.
The working theory at this point is to use a continuous integration system like Jenkins to automate builds. I've had a Jenkins instance running locally this week, with new builds triggered via fedmsg signals generated by our commits. This part works smoothly, with the exception of a fedwatch bug that I've crudely worked around, but triggers using python-fedmsg really wouldn't be too difficult.
This is fantastic. I read this as you suggesting we can do both stage and publish on this platform.
Sure, we could definitely have e.g. drafts.docs.fedoraproject.org that's built from master, or with navigable content from feature work branches, etc. For now, it is probably best to limit the scope to a public-facing solution, then iterate.
Alternately, we can build the drafts site and use that to iterate a design that works for the public site. However, I can’t help but wonder if we shouldn’t split the task. drafts.docs.fp.o is a stage for docs.fp.o. Nothing more. Jenkinscat provides all other staging. This way stage and prod never behave differently.
The infra folks are also working on a Jenkins setup, so there's potential to share experience and buildslaves. Turning that into a browseable frontend is more immediately viable than you might think; http://sourceforge.net/projects/jenkinscat/ is designed just for that and can be customized for our needs.
Leverage CI: +1
Jenkins plugins to cycle strings from Zanata, or some other method, could come down the road. There's room for other enhancements, too.
I’d love to hear from the i18n folks how they’d like to see this. Do they want continuous updates or to work on a cadence?
I wrote plugins, then actually started looking into it and talked to #zanata, and all we need to do is use the normal zanata cli tools as part of the Jenkins build job. The Jenkins git plugin has some features to merge-on-success we could leverage.
The loudest voice here has been Jerome Fenal from the very active French team, and he advocates a continuous flow. In a practical sense, we can set the master branch to push to master on Zanata, the F21 branch for F21 on zanata, etc; translators that want the continuous flow can work on master, those that want something more curated can work on the release branches. If Zanata's translation memory works like it did for Transifex, the translations for identical strings will automatically be available for all branches, no redundant translation required.
That makes sense. I’d like to see some idea of workflow from the translators about when to decide something is publishable. Regardless, it sounds like the workflow from our perspective is simple. We land final content in a branch in a repo. That repo is set to branch and copy as required by the translator workflow. We can help them implement but we don’t have to worry about the workflow as part of our solution.
Frankly, I know it is simple to say/think, but it really helps me to think of the workflow as having hard edges that are settled by negotiations at each hand off.
Writing (including reviews) -> candidacy for publication (potential material) -> submitted to translation | staged -> published
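The hand-offs above can be sketched as a small state machine. This is only an illustration: the stage names come from the line above, but the `Stage` enum, the `ALLOWED` table and the `hand_off` function are hypothetical, and reading "submitted to translation | staged" as two alternative next steps that both lead to publication is an assumption.

```python
from enum import Enum


class Stage(Enum):
    WRITING = "writing (including reviews)"
    CANDIDATE = "candidacy for publication"
    TRANSLATION = "submitted to translation"
    STAGED = "staged"
    PUBLISHED = "published"


# Assumption: a candidate may go to translation or to staging,
# and either of those may then be published.
ALLOWED = {
    Stage.WRITING: {Stage.CANDIDATE},
    Stage.CANDIDATE: {Stage.TRANSLATION, Stage.STAGED},
    Stage.TRANSLATION: {Stage.PUBLISHED},
    Stage.STAGED: {Stage.PUBLISHED},
    Stage.PUBLISHED: set(),
}


def hand_off(current: Stage, target: Stage) -> Stage:
    """Return the new stage, or raise if the hand-off skips a hard edge."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value!r} to {target.value!r}")
    return target
```

With this shape, "negotiating a hand-off" means agreeing on one entry in the table, and content can never jump from writing straight to published.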
regards,
bex
On 02/11/2015 03:27 PM, Brian (bex) Exelbierd wrote:
Hi,
On Feb 11, 2015, at 3:53 PM, Pete Travis me@petetravis.com wrote:
We've tossed around ideas in #fedora-docs, and in discussing it, I've developed a wishlist for an improved publishing platform:
- We use release-based branches to maintain content for a given Fedora release. Anything committed to these branches is intended for publication, so that should happen automatically.
Yes.
- It should be simpler to make drafts available. If we can automate publishing on release branches, we can do the same with master.
Addressed below with Jenkinscat :) I’d also like to see an easy way for individuals to see their non-master committed work in a stage environment. I don’t think the answer lies in local builds though as they aren’t shareable. Pulling this off will almost certainly require us to be willing to consider a modification to our branching strategy. I have ideas here, if there is interest.
Interest, yes! I don't see a way around "release" branches, but I do like the idea of shared work branches, if that's what you're getting at.
- We don't do as good of a job as we should with pushing and pulling strings from translation. Pushes and pulls could be automated.
Can this be tied to the publishable branching above?
Yes, it seems that we can simply add the appropriate commands/actions for the zanata client to the Jenkins build job. If the build passes, push the strings. Pull the strings, and if the build passes, merge the new strings into the release branch.
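A rough sketch of that gating logic, as a Jenkins build step might run it. The function and constant names are hypothetical, and the exact command lines are placeholders: `publican build` and the `zanata-cli push`/`pull` subcommands exist, but the options a real job needs (languages, project config) are assumptions here.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], bool]

# Placeholder command lines; a real job would add project-specific options.
BUILD = ["publican", "build", "--formats=html", "--langs=en-US"]
PUSH = ["zanata-cli", "push"]
PULL = ["zanata-cli", "pull"]


def run_command(cmd: Sequence[str]) -> bool:
    """Run a command and report whether it exited successfully."""
    return subprocess.run(list(cmd)).returncode == 0


def sync_translations(run: Runner = run_command) -> str:
    """Push strings only after a good build; approve a merge only after a good rebuild."""
    if not run(BUILD):
        return "build-failed"             # never push strings from a broken build
    if not run(PUSH):
        return "push-failed"
    if not run(PULL):
        return "pull-failed"
    if not run(BUILD):
        return "translated-build-failed"  # pulled strings broke the build: do not merge
    return "ready-to-merge"               # the merge into the release branch can proceed
```

The merge itself is left to the caller, for example to the merge-on-success feature of the Jenkins git plugin.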
- There should be an effective way to publish and organize sundry articles.
I would encourage us to think about a single publication system here. Therefore the challenge is how to present the articles, not how to publish them. The articles can all live in a single repo and all be republished when one changes. The right publication means this isn’t a problem. I am doing this with some non-Fedora docs now and the runtime is trivial (markdown -> html).
Right, the presentation is what needs the most work, the front end that users will browse through to get to the individual articles and guides. There are both tooling and design challenges there. But, erm... single repo? Like web.git? Maybe I'm misunderstanding you, but nobody wants to keep that around :P
- The publishing platform should be able to process arbitrary markup formats. I think docbook is great, and publican does a good job with it, but there will be room for more contributions if that isn't mandatory.
This is the biggest challenge, after design, IMHO.
I have not yet found a great multi-input publishing mechanism. Pandoc comes close, but has some serious shortcomings in some input formats. Because it is written in Haskell, it is also harder to find people who can extend it (in my experience). And even if that works, you still have to deal with branding. Asciidoctor is supposed to be better, and rst via sphinx and others is also great. Lastly, there are engines like Jekyll which might be a good fit too.
Here I think we need to either make some arbitrary decisions (i.e. we support only x and y, or we only support this subset of markup z, etc.) or risk having to support a lot of conversion engines. My suggestion for today is to try to define the minimal markup needs for our publication chain and i18n and then choose two markups (Docbook and ???) and move forward.
As a start on minimal markup needs, it sounds like we need entity support, possibly xinclude-style insertions, and a way to flag material that shouldn’t be translated. We also need to decide whether we will allow non-semantic markup. If we can ensure strong reviews and/or gating on publication, non-semantic markup can be made to work.
I’d also like to see us consider a way to easily enable drive-by contribution and editing. I’ve been working on an architecture for a different project that has similar requirements. It currently exists mostly in my mind, but I hope to get it into writing and demo mode soon.
I really don't want to get hung up on support for additional formats at this stage. Given tooling to dynamically build a front end, it shouldn't be a problem to add support for whatever format. Jenkins can probably handle whatever we throw at it; we can work out the tooling to build other formats after the core solution is in place.
- The frontend of the site should be adaptable. docs.fp.o would ideally use the same design elements as other official Fedora sites for a more unified appearance.
Yes. Anything we can do to offload branding and design is a huge massive win!
- Some validation on commits would be nice.
Take a look at emender (https://github.com/emender), a small but growing validator for integration into CI. It isn’t commit level directly, but with the right kind of architecture could get you close.
The working theory at this point is to use a continuous integration system like Jenkins to automate builds. I've had a Jenkins instance running locally this week, with new builds triggered via fedmsg signals generated by our commits. This part works smoothly, with the exception of a fedwatch bug that I've crudely worked around, but triggers using python-fedmsg really wouldn't be too difficult.
This is fantastic. I read this as you suggesting we can do both stage and publish on this platform.
Sure, we could definitely have e.g. drafts.docs.fedoraproject.org that's built from master, or with navigable content from feature work branches, etc. For now, it is probably best to limit the scope to a public-facing solution, then iterate.
The infra folks are also working on a Jenkins setup, so there's potential to share experience and buildslaves. Turning that into a browseable frontend is more immediately viable than you might think; http://sourceforge.net/projects/jenkinscat/ is designed just for that and can be customized for our needs.
Leverage CI: +1
Jenkins plugins to cycle strings from Zanata, or some other method, could come down the road. There's room for other enhancements, too.
I’d love to hear from the i18n folks how they’d like to see this. Do they want continuous updates or to work on a cadence?
I wrote plugins, then actually started looking into it and talked to #zanata, and all we need to do is use the normal zanata cli tools as part of the Jenkins build job. The Jenkins git plugin has some features to merge-on-success we could leverage.
The loudest voice here has been Jerome Fenal from the very active French team, and he advocates a continuous flow. In a practical sense, we can set the master branch to push to master on Zanata, the F21 branch for F21 on zanata, etc; translators that want the continuous flow can work on master, those that want something more curated can work on the release branches. If Zanata's translation memory works like it did for Transifex, the translations for identical strings will automatically be available for all branches, no redundant translation required.
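As a minimal sketch of that branch mapping: the helper below assumes Zanata versions are named exactly like the git branches ("master", "F21", and so on), which is the proposal above rather than an existing setup, and the function name is hypothetical.

```python
def zanata_version_for_branch(branch: str) -> str:
    """Map a git branch to the Zanata version of the same name."""
    # Tolerate refs such as "origin/F21" or "refs/heads/master".
    name = branch.rsplit("/", 1)[-1]
    # Assumption: only master and release branches (F<number>) go to translation.
    if name == "master" or (name.startswith("F") and name[1:].isdigit()):
        return name
    raise ValueError(f"branch {branch!r} is not pushed to translation")
```

Translators who want the continuous flow would then work on the `master` version, and those who prefer curated content on `F21` and later release versions.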
The CI idea is something I'm excited about, and motivated to work on. I'd love to spend that time working with others, if you're interested.
I’d like to work on this, after March 15. I am booked until then. I’d love to push some container tech in here because it is cool and probably a good fit, but that isn’t a requirement.
regards,
bex
Sounds great! Keep the theory coming, until you have time for implementation.
On Wed, Feb 11, 2015 at 11:27:59PM +0100, Brian (bex) Exelbierd wrote:
I’d also like to see us consider a way to easily enable drive-by contribution and editing. I’ve been working on an architecture for a different project that has similar requirements. It currently exists mostly in my mind, but I hope to get it into writing and demo mode soon.
I'd love to see this. Combined with active curation, this is a very powerful model.
(PS: thanks Pete for starting this conversation!)
A Documentation Workflow
There is a lot of discussion going on around changing or replacing parts or all of the toolchain used for Fedora documentation. Conversations like this are useful, however they seem to quickly become tool-only conversations. I believe that in order to build an effective toolchain we have to all have a common belief in the goals that the toolchain will enable. To that end, I'd like us to consider spending some time ensuring that we all think the work needs to be done the same way and to the same end.
To that end, I suggest we put this on the agenda for next Monday's docs meeting.
In the spirit of enabling people to edit, which is easier than creating, I propose the following workflow idea. As I mentioned above, this is deliberately not a tools-based document. Once we have agreement on what we want, we can then fit the tools into place that will create it.
Our workflow needs to meet some goals:
- Infrequent contributors and drive-by contributors should have the easiest possible entry to the documentation process.
- Users at every level should have the most flexibility possible in how they do their individual work. This means that the minimum number of requirements are set.
- Content that doesn't change from release to release should easily roll forward. Content that does vary from release to release should be able to be easily segregated and maintained. Content that has only minor variations from release to release should be able to easily be created across multiple releases.
- When necessary, content should be able to move quickly from creation to publication (e.g. CVEs). However, the process should also easily support allowing content to be held for future releases or held indefinitely pending review/conversation/revision.
- Documentation needs to be able to move cleanly from step to step in a process. It should not be ambiguous what content is in which step. This also means that unfinished work should be segregated both from finished work and other unfinished work.
- Each step should be optimized to have the least amount of friction for its highest consumption users or to create patterns of desired behavior through friction reduction.
- When a trade-off has to occur, complexity should be absorbed by the toolchain first and users in later steps second. This is based on the idea that there are fewer users impacted in later steps.
- Internationalization should not be a blocker for the English language release. Internationalization in one language shouldn't block another language.
These goals can be accomplished via the following steps. To make things clean, I have grouped the steps into units that are able to be designed independently. We will just need to define a firm input/output handoff.
- Creation: The creation of new content or editing of existing content. Included in this step is any SME or language review.
  - Steps
    1. Creation - self explanatory
    2. Review - an optional review by peers or SMEs for technical and language attributes. We need to decide if we require this. I believe that we should request it of new writers but not of experienced writers.
  - Output An easily manageable set of content changes or additions that can be readily identified for processing in the next stage.
- Consensus: The decision about whether completed content is to be published and if so, for which version. This is optional, however, I believe that we should have a small group of people who are empowered to move content to publication. I do not believe every writer should have this ability. I also don't think that this is the same as reviewing. We can combine them, but that creates a lot of work for these people.
  - Steps
    1. Approval
  - Output A clearly identifiable version of a document that consists of only publishable, complete content.
- Publication: Previewing content to verify it renders well and delivery to final location for usage by consumers. This can theoretically be almost 100% automated.
  - Steps
    1. Staging - placement of completed and approved documentation for visual review
    2. Publication - making completed and approved documentation available to consumers
  - Output Content delivered to consumers.
- Internationalization: Translation and transformation for non-English speaking audiences.
  - Steps
    1. Internationalization - self explanatory
    2. Staging - Internationalized versions need to be able to be verified by a qualified person
    3. Publication - this can use the same publishing mechanism as English
  - Output Content delivered to consumers.
I appreciate your feedback and look forward to a conversation.
regards,
bex
On Thu, 19 Feb 2015 14:20:56 +0100 "Brian (bex) Exelbierd" bex@pobox.com wrote:
A Documentation Workflow
There is a lot of discussion going on around changing or replacing parts or all of the toolchain used for Fedora documentation. Conversations like this are useful, however they seem to quickly become tool-only conversations. I believe that in order to build an effective toolchain we have to all have a common belief in the goals that the toolchain will enable. To that end, I'd like us to consider spending some time ensuring that we all think the work needs to be done the same way and to the same end.
To that end, I suggest we put this on the agenda for next Monday's docs meeting.
In the spirit of enabling people to edit, which is easier than creating, I propose the following workflow idea. As I mentioned above, this is deliberately not a tools-based document. Once we have agreement on what we want, we can then fit the tools into place that will create it.
Yes, I also agree that starting with workflow makes a lot of sense.
Our workflow needs to meet some goals:
- Infrequent contributors and drive-by contributors should have the easiest possible entry to the documentation process.
- Users at every level should have the most flexibility possible in how they do their individual work. This means that the minimum number of requirements are set.
- Content that doesn't change from release to release should easily roll forward. Content that does vary from release to release should be able to be easily segregated and maintained. Content that has only minor variations from release to release should be able to easily be created across multiple releases.
- When necessary, content should be able to move quickly from creation to publication (e.g. CVEs). However, the process should also easily support allowing content to be held for future releases or held indefinitely pending review/conversation/revision.
- Documentation needs to be able to move cleanly from step to step in a process. It should not be ambiguous what content is in which step. This also means that unfinished work should be segregated both from finished work and other unfinished work.
- Each step should be optimized to have the least amount of friction for its highest consumption users or to create patterns of desired behavior through friction reduction.
- When a trade-off has to occur, complexity should be absorbed by the toolchain first and users in later steps second. This is based on the idea that there are fewer users impacted in later steps.
- Internationalization should not be a blocker for the English language release. Internationalization in one language shouldn't block another language.
These goals can be accomplished via the following steps. To make things clean, I have grouped the steps into units that are able to be designed independently. We will just need to define a firm input/output handoff.
- Creation: The creation of new content or editing of existing content. Included in this step is any SME or language review.
- Steps
- Creation - self explanatory
- Review - an optional review by peers or SMEs for technical and language attributes. We need to decide if we require this. I believe that we should request it of new writers but not of experienced writers.
Agreed. I'm in favor of not making the whole review process too formal as this often gets in the way of easy entry for new or drive-by contributors.
- Output An easily manageable set of content changes or additions that can be readily identified for processing in the next stage.
- Consensus: The decision about whether completed content is to be published and if so, for which version. This is optional, however, I believe that we should have a small group of people who are empowered to move content to publication. I do not believe every writer should have this ability. I also don't think that this is the same as reviewing. We can combine them, but that creates a lot of work for these people.
Right. We've had a FAS group for people with permissions to publish content. I think we should keep that group around.
- Steps
- Approval
- Output A clearly identifiable version of a document that consists of only publishable, complete content.
- Publication: Previewing content to verify it renders well and delivery to final location for usage by consumers. This can theoretically be almost 100% automated.
- Steps
- Staging - placement of completed and approved documentation for visual review
- Publication - making completed and approved documentation available to consumers
- Output Content delivered to consumers.
It would be sweet to have a working documentation stage in Fedora docs. I'm unsure as to whether this stage should be completely segregated from the published docs site, though. In GNOME docs, we use the same doc site to deliver both the preview/devel and final versions of a document. By default, the user is redirected to a stable version when navigating through the site, e.g.:
https://help.gnome.org/admin/gdm/stable/
Only when they explicitly request a different version do they get a list of previous/unsupported or devel versions of the doc:
https://help.gnome.org/admin/gdm/
- Internationalization: Translation and transformation for non-English speaking audiences.
- Steps
- Internationalization - self explanatory
- Staging - Internationalized versions need to be able to be verified by a qualified person
I think this too should follow the same workflow as the English content. This means we should make sure that the whole process is translator-friendly too.
It would be good to reach out to translators and then decide whether we want doc owners/maintainers to publish translated content or whether translators should be part of that process.
- Publication - this can use the same publishing mechanism as English
- Output Content delivered to consumers.
I appreciate your feedback and look forward to a conversation.
These are some great ideas, thanks for sharing them, Brian. I think we should put these on the wiki. I can create a page for that. Or maybe reuse the one Pete already created?
Cheers, pk
On 02/26/2015 11:11 AM, Petr Kovar wrote:
On Thu, 19 Feb 2015 14:20:56 +0100 "Brian (bex) Exelbierd" bex@pobox.com wrote:
A Documentation Workflow
There is a lot of discussion going on around changing or replacing parts or all of the toolchain used for Fedora documentation. Conversations like this are useful, however they seem to quickly become tool-only conversations. I believe that in order to build an effective toolchain we have to all have a common belief in the goals that the toolchain will enable. To that end, I'd like us to consider spending some time ensuring that we all think the work needs to be done the same way and to the same end.
To that end, I suggest we put this on the agenda for next Monday's docs meeting.
In the spirit of enabling people to edit, which is easier than creating, I propose the following workflow idea. As I mentioned above, this is deliberately not a tools-based document. Once we have agreement on what we want, we can then fit the tools into place that will create it.
Yes, I also agree that starting with workflow makes a lot of sense.
+1000, thanks for starting the conversation. Although, I don't know that I'll be able to *completely* abstract tooling from my thinking on the subject... I've been focusing the discussion on tooling lately because conversations about process have often been too abstract to result in something actionable, but in this case, I like where you're steering it :)
Our workflow needs to meet some goals:
- Infrequent contributors and drive-by contributors should have the easiest possible entry to the documentation process.
I agree with this in spirit, but there's a tradeoff here, both in content quality and the contributor's sense of ownership and participation. We can enable truly drive-by submissions with mechanisms like concise, dedicated workflow documentation, bz or email templates, review queues, etc. Making fire-and-forget behavior easier is a bonus of having an active community with a clearly established workflow, not something we should explicitly model the tooling and workflow to accommodate.
- Users at every level should have the most flexibility possible in how they do their individual work. This means that the minimum number of requirements are set.
No argument here. That said, some coverage of recommended methods would help out new contributors a lot. (Here's how you create a publican book, here's how you create a ReStructuredText article, this is how I set up my editor.) Writing style might also fall into this category; I'd rather spend more time and coach people away from passive voice writing, excessive transitional prose, etc, and readers would benefit from our following some basic conventions [to be [re]defined?]
- Content that doesn't change from release to release should easily roll forward. Content that does vary from release to release should be able to be easily segregated and maintained. Content that has only minor variations from release to release should be able to easily be created across multiple releases.
This has come up a few times. I don't see any problem with having content dissociated from the release cycle, as long as it's suitable for that, there's a clear commitment for maintenance, and the tooling supports it. This should not enable outdated documentation to have the same presence as current documentation, even if there's nothing current on the topic.
- When necessary, content should be able to move quickly from creation to publication (e.g. CVEs). However, the process should also easily support allowing content to be held for future releases or held indefinitely pending review/conversation/revision.
- Documentation needs to be able to move cleanly from step to step in a process. It should not be ambiguous what content is in which step. This also means that unfinished work should be segregated both from finished work and other unfinished work.
- Each step should be optimized to have the least amount of friction for its highest consumption users or to create patterns of desired behavior through friction reduction.
- When a trade-off has to occur, complexity should be absorbed by the toolchain first and users in later steps second. This is based on the idea that there are fewer users impacted in later steps.
- Internationalization should not be a blocker for the English language release. Internationalization in one language shouldn't block another language.
These would go in the tooling requirements column, mostly.
These goals can be accomplished via the following steps. To make things clean, I have grouped the steps into units that are able to be designed independently. We will just need to define a firm input/output handoff.
- Creation: The creation of new content or editing of existing content. Included in this step is any SME or language review.
- Steps
- Creation - self explanatory
- Review - an optional review by peers or SMEs for technical and language attributes. We need to decide if we require this. I believe that we should request it of new writers but not of experienced writers.
Agreed. I'm in favor of not making the whole review process too formal as this often gets in the way of easy entry for new or drive-by contributors.
- Output An easily manageable set of content changes or additions that can be readily identified for processing in the next stage.
- Consensus: The decision about whether completed content is to be published and if so, for which version. This is optional, however, I believe that we should have a small group of people who are empowered to move content to publication. I do not believe every writer should have this ability. I also don't think that this is the same as reviewing. We can combine them, but that creates a lot of work for these people.
Right. We've had a FAS group for people with permissions to publish content. I think we should keep that group around.
- Steps
- Approval
- Output A clearly identifiable version of a document that consists of only publishable, complete content.
- Publication: Previewing content to verify it renders well and delivery to final location for usage by consumers. This can theoretically be almost 100% automated.
- Steps
- Staging - placement of completed and approved documentation for visual review
- Publication - making completed and approved documentation available to consumers
- Output Content delivered to consumers.
It would be sweet to have a working documentation stage in Fedora docs. I'm unsure as to whether this stage should be completely segregated from the published docs site, though. In GNOME docs, we use the same doc site to deliver both the preview/devel and final versions of a document. By default, the user is redirected to a stable version when navigating through the site, e.g.:
https://help.gnome.org/admin/gdm/stable/
Only when they explicitly request a different version do they get a list of previous/unsupported or devel versions of the doc:
https://help.gnome.org/admin/gdm/
- Internationalization: Translation and transformation for non-English speaking audiences.
- Steps
- Internationalization - self explanatory
- Staging - Internationalized versions need to be able to be verified by a qualified person
I think this too should follow the same workflow as the English content. This means we should make sure that the whole process is translator-friendly too.
It would be good to reach out to translators and then decide whether we want doc owners/maintainers to publish translated content or whether translators should be part of that process.
- Publication - this can use the same publishing mechanism as English
- Output Content delivered to consumers.
I appreciate your feedback and look forward to a conversation.
These are some great ideas, thanks for sharing them, Brian. I think we should put these on the wiki. I can create a page for that. Or maybe reuse the one Pete already created?
Cheers, pk
State tracking would need both process and tooling changes to support a defined incubation workflow. It sounds nice, especially for communicating to the group where we need work, but for many cases I worry this might be introducing unwarranted process and additional complexity for contribution. I think we should go forward with creating a documentation life cycle plan as part of the workflow effort, but keep it *very simple*. Progressing from one stage to the next, presuming viable content, shouldn't require more than a brief irc or list conversation.
The tooling I have in mind - and I *am* working on it - should be capable of creating a draft site as well as the 'production' site. I have ideas to share here, but don't want to distract the thread. Mostly, they depend on using git...
Our side of the localization process (pushing POs to Zanata, pulling POs from Zanata) can be entirely automated. If we want only reviewed strings, we can probably configure the Zanata client to only pull reviewed strings; it's up to each language's team to review the translations. FYI, review is optional on the platform, and many or most language teams have opted not to use the string approval features. If you're talking about automated commits of strings only if they build, that can be automated too.
Let's do use https://fedoraproject.org/wiki/Docs_Project_Focus for expanding on these ideas as well. It can be cleaned up later if need be, but for now we'll have one place to look.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 02/12/2015 12:53 AM, Pete Travis wrote:
Hi all,
I've given a lot of thought to our publishing situation. The current solution, using an unmaintained version of Publican on an unsupported Fedora release, is untenable. The site we have been maintaining in web.git is incompatible with current versions of publican, besides being massive and prone to breakage. We have been working towards a replacement, slowly.
The proposed successor to web.git is an RPM-based system where Publican creates SRPMs of the guides, Fedora's koji buildsystem builds packages from the SRPMs, and a VM instance installs the package to build the site. Creating the SRPM would be analogous to the `publican build --embedtoc ...` from the old process, and when the VM installs the package, it does the `publican install-book` step of the process. We currently have the koji infrastructure in place, a mostly viable way for newly built packages to automatically be installed on the VM, and a functional but mostly vanilla site frontend. Along the way, I've identified a few caveats to this method, to which I'll add some general observations:
- Each language of each release of each guide is a unique package. To publish anything new, we must coordinate with releng to have the package defined in koji. Releng is already overburdened.
- Each language must have a separately maintained revision history.
- The buildroot shared by koji and the VM required updated versions of publican and publican's dependencies, and now must be maintained there (which is currently not happening afaik).
- None of the breakage problems we've dealt with in web.git will go away. Unusual language codes, bad draft procedures, wonky *_Info.xml issues, and whatever black magic that ends with my manually running sqlite invocations against the site db - that's all still there.
- The site frontend needs to be completely redesigned from the ground up, and we have not demonstrated the motivation to see that through. There are intricacies here that we didn't discover from a straightforward following of the publican user's guide, and many subjectively frustrating surprises.
- A publican website can only publish publican-friendly content. To the best of my knowledge, that means docbook only. Our contributor base is declining, and prospective contributors have consistently demonstrated a lack of interest in writing docbook.
Not quite, you can ship pre-built content easily enough [1], but it doesn't integrate the web UI into it for obvious reasons.
1: https://bugzilla.redhat.com/show_bug.cgi?id=1081303
- A publican website does not provide an effective presentation of many smaller articles. I've been disagreed with on this point, and will concede that it's definitely possible to have 200 things under that F21 category, but I maintain that presentation in that way would be inimical to reader browsing.
With all this in mind, I've decided to be honest with myself, and you, in saying that I'm not motivated to make the publican website frontend work. Something must be done to deal with our site's incompatibility with current Publican, but from a usage perspective, I feel like there's a very high upfront investment - and a moderate continuing investment in maintaining the el6-docs koji repo - with no substantial gain in the product delivered. For normal usage, we've simply replaced `publican build;publican install-book` with `publican package;koji build`. Rudi had initially volunteered to do *all* of the setup to make this work, but he's a busy guy and it really isn't fair to expect him to deal with that level of implementation for us.
We've tossed around ideas in #fedora-docs, and in discussing it, I've developed a wishlist for an improved publishing platform:
- We use release-based branches to maintain content for a given Fedora release. Anything committed to these branches is intended for publication, so that should happen automatically.
- It should be simpler to make drafts available. If we can automate publishing on release branches, we can do the same with master.
- We don't do as good of a job as we should with pushing and pulling strings from translation. Pushes and pulls could be automated.
- There should be an effective way to publish and organize sundry articles.
- The publishing platform should be able to process arbitrary markup formats. I think docbook is great, and publican does a good job with it, but there will be room for more contributions if that isn't mandatory.
- The frontend of the site should be adaptable. docs.fp.o would ideally use the same design elements as other official Fedora sites for a more unified appearance.
- Some validation on commits would be nice.
The working theory at this point is to use a continuous integration system like Jenkins to automate builds. I've had a Jenkins instance running locally this week, with new builds triggered via fedmsg signals generated by our commits. This part works smoothly, with the exception of a fedwatch bug that I've crudely worked around, but triggers using python-fedmsg really wouldn't be too difficult. The infra folks are also working on a Jenkins setup, so there's potential to share experience and buildslaves. Turning that into a browseable frontend is more immediately viable than you might think; http://sourceforge.net/projects/jenkinscat/ is designed just for that and can be customized for our needs. Jenkins plugins to cycle strings from Zanata, or some other method, could come down the road. There's room for other enhancements, too.
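As a rough illustration of the fedmsg-triggered builds described above, here is a minimal Python sketch of the filtering step: given a message from the bus, decide whether it should start a published build, a draft build, or nothing. The topic suffix, the message layout (`commit.repo`, `commit.branch`), the repository names, and the `f21`-style branch naming are all assumptions made for the example, not details confirmed in this thread. The `watch()` loop relies on `fedmsg.tail_messages()` from python-fedmsg and only prints what it would do; hooking it up to an actual Jenkins job is left out.

```python
"""Decide which docs commits should trigger a build (illustrative sketch)."""

# Assumed topic suffix for git push notifications on the Fedora message bus.
GIT_RECEIVE_SUFFIX = ".git.receive"

# Hypothetical set of guide repositories watched by the build job.
WATCHED_REPOS = {"install-guide", "release-notes"}


def build_target(topic, msg):
    """Return 'publish', 'draft', or None for one bus message.

    `msg` is assumed to look like {"commit": {"repo": ..., "branch": ...}}.
    """
    if not topic.endswith(GIT_RECEIVE_SUFFIX):
        return None
    commit = msg.get("commit", {})
    if commit.get("repo") not in WATCHED_REPOS:
        return None
    branch = commit.get("branch", "")
    if branch == "master":
        # Commits to master are treated as drafts.
        return "draft"
    # Release branches are assumed to be named like "f21".
    if branch.startswith("f") and branch[1:].isdigit():
        return "publish"
    return None


def watch():
    """Listen on the bus and report builds; needs the python-fedmsg package."""
    import fedmsg  # imported here so build_target() works without fedmsg

    for _name, _endpoint, topic, message in fedmsg.tail_messages():
        target = build_target(topic, message.get("msg", {}))
        if target:
            print("would trigger a %s build for %s" % (target, topic))
```

Keeping the decision in a plain function means the branch rules can be checked without a live bus connection, which also fits the wishlist item about automating release branches and master separately.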
The CI idea is something I'm excited about, and motivated to work on. I'd love to spend that time working with others, if you're interested.
Don't forget that Publican offers you an exit strategy from XML!
It can convert your DocBook into Markdown** and then you *never* have to use XML *ever* again! \o/
** or in fact any of the formats at [2] if you install them!
2: http://search.cpan.org/search?m=dist&q=wikiconverter&s=1&n=50
Which means you can also consider switching to a real CMS like MediaWiki, MoinMoin, etc.
You could use publican (or pandoc???) to convert XML to MediaWiki format (by installing HTML::WikiConverter::MediaWiki [via cpanspec]) then import that into https://fedoraproject.org/wiki/
Unified web space FTW.
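For the pandoc route floated above, a minimal Python sketch of a DocBook to MediaWiki conversion might look like the following. It only wraps the pandoc command line (`--from docbook --to mediawiki`); the file names are hypothetical, and it assumes a single self-contained XML file, so a multi-file book would likely need extra handling that is not shown here. The Publican/HTML::WikiConverter route is not covered.

```python
import subprocess


def pandoc_command(source, output, from_format="docbook", to_format="mediawiki"):
    """Build the pandoc argument list for converting one file."""
    return ["pandoc", "--from", from_format, "--to", to_format,
            "--output", output, source]


def convert(source, output):
    """Run pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(source, output), check=True)
```

Calling `convert("Book.xml", "Book.wiki")` (hypothetical file names) would write wiki markup that could then be reviewed before importing anywhere.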
Cheers, Jeff.
-- Jeff Fearn Senior Software Engineer Hosted & Shared Services Red Hat Pty Ltd
It's not about getting rid of publican, or docbook. I'm very satisfied with the way publican renders markup, and would not like to see that go away. The books we have would still be written in docbook and built by publican, but jenkins would do the building instead of koji, because it can happen automatically that way, and the users would use something else to browse to the books. The publican website front end, however, doesn't do most of the extra stuff on my wishlist.
--Pete, using a different client because clearly F22/rawhide thunderbird is broken here
On 02/12/2015 10:55 AM, Pete Travis wrote:
It's not about getting rid of publican, or docbook.
Sure, but I hate XML so I mention exit strategies whenever I can :D
Cheers, Jeff.
-- Jeff Fearn Senior Software Engineer Hosted & Shared Services Red Hat Pty Ltd
I've played around with quite a few "text-based markup" documentation toolchains in the past year or so. For Pythonistas, I think Sphinx / reStructuredText is one of the more popular ones. There's also GitBook, which is built on Node.js. For R, there's RMarkdown, which is what I use most of the time as a front end to GitBook or Nikola.
For "vanilla" Markdown, AsciiDoc, reStructuredText, etc., Pandoc does a pretty good job of rendering to HTML, PDF and EPUB. Calibre can make MOBI ebooks if you're targeting Kindles.
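To make the Pandoc rendering step concrete, here is a small Python sketch that builds one pandoc invocation per output format for a single source file. It assumes pandoc chooses the writer from the output file's extension; the source file name is hypothetical, and PDF is left out of the defaults because it additionally needs a LaTeX engine installed.

```python
import os


def render_commands(source, formats=("html", "epub")):
    """Return one pandoc argument list per requested output format."""
    stem, _ext = os.path.splitext(source)
    return [
        ["pandoc", source, "--standalone", "--output", "%s.%s" % (stem, fmt)]
        for fmt in formats
    ]
```

Each returned list can be handed to `subprocess.run(..., check=True)` by a build job.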
Personally I'd recommend Sphinx and Pandoc because the whole toolchain is in the Fedora repositories. GitBook is a bit rough around the edges, but I'm using it installed via 'npm' and have it mostly under control.
On Feb 12, 2015, at 12:06 AM, Jeff Fearn jfearn@redhat.com wrote:
Which means you can also consider switching to a real CMS like MediaWiki, MoinMoin, etc.
A wiki has a lot of appeal when it comes to reduced friction for infrequent or drive-by contributors. However, it is my experience and belief that if you don’t have a strong community of wiki gardeners, it falls apart quickly. If we want to go in this direction, I would encourage us to adopt the approach of the PHP manual (as I understand it). Non-core maintainers add notes to pages. Those notes are periodically reviewed and incorporated into the main manual. At that point the notes disappear. I don’t work with that project, but I’d love to know more about how their process works if someone knows.
regards,
bex
I’ve done a bit more reading and found that a lot of information is at http://doc.php.net/tutorial/
They work in DocBook and have their own DocBook processing engine. I don’t know that it is perfect, but it may be worth reading up on their processes.
regards,
bex
On Wed, 11 Feb 2015 07:53:12 -0700 Pete Travis me@petetravis.com wrote:
...snip a bunch of stuff I agree with...
The working theory at this point is to use a continuous integration system like Jenkins to automate builds. I've had a Jenkins instance running locally this week, with new builds triggered via fedmsg signals generated by our commits. This part works smoothly, with the exception of a fedwatch bug that I've crudely worked around, but triggers using python-fedmsg really wouldn't be too difficult. The infra folks are also working on a Jenkins setup, so there's potential to share experience and buildslaves. Turning that into a browseable frontend is more immediately viable than you might think; http://sourceforge.net/projects/jenkinscat/ is designed just for that and can be customized for our needs. Jenkins plugins to cycle strings from Zanata, or some other method, could come down the road. There's room for other enhancements, too.
So, I have pretty big concerns with using Jenkins for anything release critical (like docs).
Infrastructure has been running one in our cloud for a while now:
https://fedoraproject.org/wiki/Jenkins@infra (Note the disclaimer there).
It works ok most of the time, but it's not packaged up nicely, and it pretty much always breaks on any upgrade of plugins or the core setup. Then you have to dig into Java and try to figure out what's going on or where they moved the functionality you were depending on. ;(
Jenkins is now packaged in Fedora, which is great... but it likely will never be in EPEL/RHEL (things are just too old for it).
So, I guess we could look at setting up a Fedora instance using the packaged Jenkins and see how much pain it is to keep up...
I don't suppose buildbot or taskotron would be options?
kevin