Monday, March 1, 2021

Exporting LibreOffice Guides to HTML (Part I)

LibreOffice is an open source office suite full of tricky secrets. One of my favorites is the possibility to export a text document to XHTML or HTML5, both of which are W3C standards supported by most modern web browsers.

But you, the reader, will certainly ask: if I already have the Guides in ODT and PDF formats, why do I need another format? Why spend energy adding another medium for the LibreOffice Guides?


There are advantages and drawbacks to the endeavor. On the plus side, the community gets a way to read the guides without actually downloading the PDF or ODT file, and the contents can be accessed with the browser's navigation tools (including bookmarking and more). One example is the current ODF Standard, whose files are exported to XHTML and available at the OASIS website.

A second advantage is that (X)HTML pages can be crawled and indexed by search engine robots, so the LibreOffice Guides can be found on the results pages of Bing, Google, DuckDuckGo and others.

Another exciting possibility for distributing the guides in (X)HTML format is that they could be installed on the intranets of schools, colleges and universities, and public libraries, as well as on community, public administration and private company websites. The files are static and don't need a server-side scripting language such as PHP or ASP. Distributing the rich contents of the LibreOffice Guides in a browser-readable format will add value to every LibreOffice migration project.

One critical factor in the success of a LibreOffice migration project is how quickly users can transition to the new software. The value of readily available, easily accessible documentation in different forms should not be underestimated.

How difficult is it to convert the Guides to an (X)HTML format?

My experience is that there is some work to do on the ODT side, and some work on the exported (X)HTML. The nice part is that these changes are small and can be partially automated.

LibreOffice has an interesting XHTML export filter. The developers did their best to preserve formatting and document fidelity between different rich text output formats. A second tool I tried is the nice writer2xhtml extension, which also has interesting features.

However, reading contents in a browser (or even on a tablet or a mobile phone) requires scrolling instead of the usual page turning of a printed book.

The layout of the document's content must therefore be adapted to the browser's navigation actions and adjusted for on-screen viewing. It is also worth adapting the contents to tablets and perhaps mobile phones.

Luckily, all the elements needed for navigation already exist in the ODT file; they are just in the wrong position when exported to XHTML. The approach is to wrap these elements in sections with specific names. After export to XHTML, these sections are mapped to <div id="name">...</div> elements and can be accessed by both CSS and JavaScript for pagination and layout.
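To make the idea of named sections more concrete, here is a minimal Python sketch of one way the exported file could be post-processed. It is only an illustration under several assumptions: the section names "toc", "content" and "footer" are hypothetical, the named <div> elements are assumed to be direct children of <body>, and the exported file is assumed to be well-formed XHTML without undeclared entities such as &nbsp;. The real export may differ on any of these points.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Serialize XHTML elements without an "ns0:" prefix.
ET.register_namespace("", XHTML_NS)

# Hypothetical section names; the real ones depend on how the
# sections are named in the ODT file.
LAYOUT_ORDER = ["toc", "content", "footer"]


def reorder_sections(xhtml_text, order=LAYOUT_ORDER):
    """Move the named <div> blocks to the end of <body>, in the given order.

    Assumes the named <div> elements are direct children of <body>.
    Any other body content keeps its relative position before them.
    """
    root = ET.fromstring(xhtml_text)
    body = root.find(f"{{{XHTML_NS}}}body")
    div_tag = f"{{{XHTML_NS}}}div"

    # Collect the divs whose id matches one of the section names.
    named = {
        div.get("id"): div
        for div in body.findall(div_tag)
        if div.get("id") in order
    }

    # Detach them, then re-attach in the desired layout order.
    for div in named.values():
        body.remove(div)
    for name in order:
        if name in named:
            body.append(named[name])

    return ET.tostring(root, encoding="unicode")
```

With the blocks in a predictable order and carrying stable ids, a stylesheet can then address each one with an id selector (for example #toc) to build the on-screen layout.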

Here is one layout among many alternatives, for a simple export of our Guides to a browser page layout.

Besides the existing sections in the chapter, we can add other blocks with content of interest, for example a donation section or a search form for either an external or internal search, such as Xapian and Omindex.

In the next post, I'll describe the changes needed in the Guide templates and discuss some of the alternate approaches for the task.

Stay tuned!

