Olivier Hallot

Monday, March 22, 2021

Exporting LibreOffice Guides to XHTML (Part II)

In the previous blog post I explained the reasons for exporting the LibreOffice Guides to the XHTML format, and some of the issues involved. Now it is time to give more technical details.

I chose to use the writer2xhtml extension, available on the Extensions website, because the HTML5 it produces looks less cluttered than the output of the native XHTML export. Nevertheless, it is still necessary to add a few extra HTML5 lines to load the CSS files and a Javascript file.

Invisible changes in chapter files that can go upstream

Some changes should go upstream, because they do not change the resulting PDF or ODT book layout.

Each image must be anchored “as character” in the document. The image then behaves like a character and must be the only content of its paragraph. The paragraph must be centered on the page, using a style that aligns to the center, for example the “Figure” paragraph style. The image caption paragraph must have the “Caption” style. The wrapping frame that holds the image and the caption must also be anchored “as character” in a paragraph with the “Figure” style. This arrangement is transparent when producing ODT, PDF and HTML5 documents.
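The anchoring rule can be checked before exporting. Below is a minimal sketch, not the script used for the Guides: it reads content.xml from the ODT file and lists every frame whose text:anchor-type attribute is not "as-char". The function names are hypothetical, and the sketch assumes the anchor type is written directly on each draw:frame element, which is what Writer normally saves.

```python
import zipfile
import xml.etree.ElementTree as ET

# ODF namespaces used in content.xml
DRAW = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def frames_not_as_char(content_xml):
    """Return (frame name, anchor type) for frames not anchored as character.

    Assumption: the anchor type is stored on the draw:frame element itself.
    A frame without the attribute is reported with anchor type None.
    """
    root = ET.fromstring(content_xml)
    offenders = []
    for frame in root.iter(f"{{{DRAW}}}frame"):
        anchor = frame.get(f"{{{TEXT}}}anchor-type")
        if anchor != "as-char":
            offenders.append((frame.get(f"{{{DRAW}}}name"), anchor))
    return offenders


def check_odt(path):
    """Open an ODT file (a zip archive) and check its content.xml."""
    with zipfile.ZipFile(path) as odt:
        return frames_not_as_char(odt.read("content.xml"))
```

Calling check_odt() on a working copy of a chapter returns an empty list when all images and wrapping frames are anchored as expected.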

Tip, Note and Caution headings use a graphic as the bullet. Many of these paragraphs have the bullet enabled by direct formatting. This is invisible to the user in LibreOffice, but it shows up as an ugly black circle when exporting to HTML5.

Recommended changes that should go upstream

Create a table style, or copy it from the table in the copyright section. Name the style as you want; custom table styles are stored in your user profile. Apply this style to all tables in the chapter. Open the table properties of each table, set the alignment to “Centered” and set the table width to 90%.

Remove cross references to pages, for example “See figure 12 on page 30”. If a position is needed, use “above” or “below” in the reference instead.

Add the sections described in steps 2, 3 and 4 below.

Changes in the working copy of the chapter file

The original chapter file is optimized for a book format. We need to prepare it for export to HTML5, where pagination is different and is mostly replaced by smooth scrolling. This involves several steps, but they do not change the chapter contents, only the layout and some formatting.

1. With the images anchored as explained above (a step that may require manual work), move the caption above the image. This can be done quickly by placing the cursor in the caption text and pressing Ctrl+Alt+Up Arrow, which swaps the caption paragraph with the image paragraph. Some chapters have dozens of images, so it would be nice to have a script to execute this swap in bulk.
2. Wrap the top LibreOffice logo in a section named “SEC_LOGO”.
3. Wrap the Guide name in a section named “SEC_GUIDE”.
4. Wrap the chapter title in a section named “SEC_TITLE”.
5. Delete the existing table of contents.
6. Select the text from the copyright heading to the end of the chapter and wrap it in a section named “SEC_DISPLAYAREA”. Ensure you leave some empty paragraphs after the end of the section.
7. At the bottom of the chapter, after the SEC_DISPLAYAREA section, add 6 new empty sections: SEC_TOC, SEC_BOOK_TOC, SEC_SEARCH, SEC_IMPRINT, SEC_DONATION, SEC_NAV. You can create these 6 sections in an empty document and store them as an AutoText entry, so all sections can be inserted in a single command using AutoText (Ctrl+F3).
8. Insert the chapter table of contents in the SEC_TOC section.
9. Change the template of the chapter to the provided template odf2htmlv2.ott.
10. Review the ordered and unordered lists in the chapter. The new template may highlight the spurious bullets and numbering inserted by direct formatting which, as explained above, are very hard to detect in the original ODT file. Some of this direct list formatting may also only be detected in the HTML5 output.
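Since the CSS and Javascript depend on the exact section names, it may help to verify them before saving. The following is a minimal sketch under the assumption that the sections are stored as text:section elements with a text:name attribute in content.xml; the function names are hypothetical and this is not part of the published tooling.

```python
import zipfile
import xml.etree.ElementTree as ET

TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Section names taken from the steps above
REQUIRED_SECTIONS = [
    "SEC_LOGO", "SEC_GUIDE", "SEC_TITLE", "SEC_DISPLAYAREA",
    "SEC_TOC", "SEC_BOOK_TOC", "SEC_SEARCH",
    "SEC_IMPRINT", "SEC_DONATION", "SEC_NAV",
]


def missing_sections(content_xml):
    """Return the required section names that are absent from content.xml."""
    root = ET.fromstring(content_xml)
    found = {
        section.get(f"{{{TEXT}}}name")
        for section in root.iter(f"{{{TEXT}}}section")
    }
    return [name for name in REQUIRED_SECTIONS if name not in found]


def check_odt(path):
    """Open the working copy (a zip archive) and report missing sections."""
    with zipfile.ZipFile(path) as odt:
        return missing_sections(odt.read("content.xml"))
```

An empty list means that all ten sections exist; a misspelled section name shows up as missing.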

Save your working copy.

Exporting to HTML5

The writer2xhtml extension adds a toolbar for exporting with a single click. The extension allows some customization, which is not used here. The export used the “original format” style and a 115% font size.

The export is very fast and gives no opportunity to change the export name: the exported file has the same file name with html as the file extension, and it overwrites any existing file with the same name and extension.

By default the exported file opens in the system browser for inspection. The result is not yet what we want: we must apply specific CSS and Javascript to rearrange the layout of the sections. These files are added to the HTML5 output just before the </head> closing tag.

<link href="guideposition.css" rel="Stylesheet" type="text/css">
<link href="guideformats.css" rel="Stylesheet" type="text/css">
<script type="text/javascript" src="GS70.js" defer></script>
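Because every export overwrites the HTML file, these three lines have to be added again after each export. A minimal sketch of how this could be scripted follows; the function names are hypothetical, and the sketch assumes the exported file is UTF-8 and contains a lowercase </head> tag.

```python
from pathlib import Path

# The three lines shown above, verbatim
HEAD_LINES = (
    '<link href="guideposition.css" rel="Stylesheet" type="text/css">\n'
    '<link href="guideformats.css" rel="Stylesheet" type="text/css">\n'
    '<script type="text/javascript" src="GS70.js" defer></script>\n'
)


def add_head_lines(html, extra=HEAD_LINES):
    """Insert `extra` just before the first </head>.

    Returns the text unchanged if the lines are already present.
    """
    if extra in html:
        return html
    before, closing_tag, after = html.partition("</head>")
    if not closing_tag:
        raise ValueError("no </head> closing tag found")
    return before + extra + closing_tag + after


def patch_file(path):
    """Patch an exported HTML file in place (assumes UTF-8 encoding)."""
    file = Path(path)
    file.write_text(add_head_lines(file.read_text(encoding="utf-8")),
                    encoding="utf-8")
```

Running patch_file() twice on the same file is harmless, because the second call finds the lines already in place.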

The CSS files

Two extra CSS files were created. The first, guideposition.css, manages the position of the sections in the page and has provisions for handling other screen sizes, such as tablets. The second, guideformats.css, contains rules that override some attributes of the sections, such as lists, fonts, font sizes, colors, margins, padding and more.

The Javascript file

This file fills the empty sections we added at the end of the chapter with contents for donation, the guide table of contents (to jump between chapters), a legal imprint, a search form (to be implemented) and more. The Javascript file is common to all chapters and is specific to the guide.

Conclusion

Exporting the LibreOffice Guides to HTML is another way to offer rich content to the public. Guides in HTML format can be installed on the servers of schools, libraries, colleges and corporations alongside a PDF copy, to support a migration project.

The rich feature set of Writer, while allowing the creation of wonderful documents, is also a source of concern, not only when exporting to formats that are less flexible than ODF, but also when managing the excess of freedom. The recommended changes and the detection of hidden direct formatting in lists are examples. It becomes clear that a set of sanitizing scripts can help to remove spurious formatting and unused legacy styles, detect unwanted extra styles and adjust the objects in the documents.

When handling the full set of guides it is easy to dream of an office suite that can execute some "wishes" like "anchor all images as characters and center in line", "change position of caption in all frames to top", "format all tables with style 'guides' and align to center"... but that is for an office suite of the next generation!

Partial results of all this work can be visualized below.

The getting started guide in HTML format.

The javascript file

The css files: css1 and css2

The odf2xhtml.ott template

The writer2xhtml extension

Happy documenting!!!!

Monday, March 1, 2021

Exporting LibreOffice Guides to HTML (Part I)

LibreOffice is an open source office suite full of tricky secrets. One of my favorites is the possibility to export a text document to XHTML or HTML5, both of which are W3C standards supported by most modern web browsers.

But you, the reader, will certainly ask: If I have the Guides in ODT and PDF file format why do I need another format? Why spend energy adding another medium for the LibreOffice Guides?

There are advantages and drawbacks to the endeavor. On the thumbs up side, the community gets a way to read the guides without actually downloading the PDF or ODT file, and the contents can be accessed with the browser's navigation tools (including bookmarking and more). One example is the current ODF Standard files exported to XHTML, available at the OASIS website.

A second advantage is that (X)HTML pages can be crawled and indexed by search engine robots, so the LibreOffice Guides can be found on the search results pages of Bing, Google, DuckDuckGo and others.

Another exciting possibility for distributing the guides in (X)HTML format is that they could be installed on the intranets of schools, colleges and universities, public libraries, and also on community, public administration and private company websites. The files are static and don't need a server-side scripting language such as PHP or ASP. Distributing the rich contents of the LibreOffice Guides in a browser-readable format will add value to every LibreOffice migration project.

One critical factor in the success of a LibreOffice migration project is how quickly users can transition to the new software, and the value of readily available, easily accessible documentation in different forms should not be underestimated.

How difficult is it to convert the Guides to an (X)HTML format?

My experience is that there is some work to do on the ODT side, and some work on the exported (X)HTML. The nice part is that these changes are small and can be partially automated.

LibreOffice has an interesting XHTML export filter. The developers did their best to preserve formatting and document fidelity between different rich text output formats. A second tool I tried is the nice extension writer2xhtml, which also has interesting features.

However, reading contents in a browser (or even on a tablet or a mobile phone) requires scrolling instead of the usual page turning of a printed book.

The layout of the document's content must be adapted to the browser's navigation actions. This requires the layout to be adjusted for on-screen viewing. Besides, it is interesting to also adapt the contents to tablets and perhaps mobile phones.

Luckily, all elements for navigation exist in the ODT file; they are just in the wrong position when exported to XHTML. The approach is to wrap these elements in sections with specific names. After being exported to XHTML, these sections are mapped to <div id="name">...</div> and can be accessed by both CSS and Javascript for pagination and layout.
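Whether the named sections survived the export can be verified on the resulting file. Here is a minimal sketch that collects the id of every <div> in the exported page using only the Python standard library. The class and function names are hypothetical, and the section names passed in are up to the author of the chapter.

```python
from html.parser import HTMLParser


class DivIdCollector(HTMLParser):
    """Collect the id attribute of every <div> start tag."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            for name, value in attrs:
                if name == "id" and value:
                    self.ids.add(value)


def missing_divs(html, expected):
    """Return the expected section names that have no <div id="..."> in html."""
    collector = DivIdCollector()
    collector.feed(html)
    collector.close()
    return [name for name in expected if name not in collector.ids]
```

For example, missing_divs(page_text, ["SEC_TOC", "SEC_NAV"]) returns the names that the CSS and Javascript would fail to find.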

Here is one layout among many alternatives, for a simple export of our Guides to a browser page layout.

Besides the existing sections in the chapter, we can add other blocks with content of interest, for example a donation section or a search form for either an external or an internal search, such as Xapian and Omindex.

In the next post, I'll describe the changes needed in the Guide templates and discuss some of the alternate approaches for the task.

Stay tuned!

Monday, July 29, 2019

A better LibreOffice than LibreOffice

Turning LibreOffice users into happy LibreOffice users.

After reading and taking advice from other IT directors, one of my customers, a strong supporter of LibreOffice, noticed that just switching programs and teaching users to avoid interoperability pitfalls is not enough for a smooth migration, and that a successful switch takes more than following the migration best practices. He then asked me to deliver a better LibreOffice than LibreOffice.

Challenge accepted. Together we started to investigate the needs of his organization, a civil company with strong military ties and with a significant part of the workforce serving the military. We discovered a set of employees with repetitive tasks, usually performed by reusing old documents and updating them. The straightforward solution was to define a set of document templates and deploy them on the users' computers.

But that was not enough. LibreOffice templates are reached through a series of clicks and dialogs, a sequence that needs to be memorized. Besides, the Templates dialog covers all document types, so more clicks are needed to narrow the selection. There had to be an easier way to get a brand new document from a corporate-controlled template. My customer also wanted to leave a fingerprint on the solution: it should bear the company logo whenever a user accesses it, including high-ranking military personnel.

The solution was to create an extension that added a new menu to LibreOffice, specially crafted to address the needs of the workforce and help them do their job as quickly as possible.

So we packed the templates and a Basic macro needed for their document handling into an extension. The Basic macro was used to create a simplified template dialog allowing a 2-click selection, along with other internal document handling macros. All new features are accessed from the top menu and a specific toolbar, with icons representing the company brand.

Thanks to the extension mechanism, the “better LibreOffice than LibreOffice” became a reality. The extension used all the nice features of versioning and updating, as well as ease of deployment and maintenance. A few days after the news spread, other departments such as engineering, legal, contracts, human resources and others asked to include their templates in the solution, making more and more users happy to use LibreOffice.

Happy extensioning!

Monday, April 29, 2019

New Help: Copy BASIC and PYTHON code to Clipboard on a Click

The next release of LibreOffice will have a small but handy improvement for every macro developer, whether experienced or beginner.

Hover the mouse over BASIC and Python code in the new Help pages and a tip shows that, when you click, the code excerpt is copied to the system clipboard. You can paste it in the BASIC IDE (Integrated Development Environment) or any other text application on your system.

With this little feature, you save the time of typing the excerpt to test it in your IDE or document. An alternative was to use a collateral file; however, collateral files with embedded macros are likely to trigger security warnings in most LibreOffice installations. Just copying the fragment is easier.

Happy Basic macro programming!
Happy Python macro programming!

Monday, January 14, 2019

Report on the New LibreOffice Help Pages Online Editor

The Online Help Editor is taking shape

I have improved the XHP editor, fixed a few things, and changed the page address:

https://newdesign.libreoffice.org/help_editor/index.html

The editor is still a work in progress, but it is starting to become useful for creating and editing Help pages.

What's new

Mike Saunders' implementation of XHP tag autocompletion for the CodeMirror editor.
The left and right panes are now fixed in the browser screen and scrollable.
The right pane uses 99% of the current Help transformation rendering, plus some visual debug information left intentionally to help authors adjust <embed>s, <image>s and <link>s.
You can now open a Help page directly from the interface.

The help page is normally source/text/AAA/BBB/myHelpPage.xhp
Type /AAA/BBB/myHelpPage.xhp in the text box and click Open File to load in the editor.
Press Render page to see it on the right.

A set of buttons with XHP snippets to shorten editing workload:

For <paragraph>s, <note>s, <heading>s, <emph>s, <menuitem>s, etc., select the raw text or contents and click the corresponding button. The raw text will be wrapped with the opening and closing tags. For paragraph-like contents, a unique id will be created automatically, a feature required for translations.
Other snippets build fragments of XHP tags, such as <table>s, <tablerow>s, <list>s, <section>s, and more.
Just play with them and do not forget to render the page on the right.
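The wrapping done by the snippet buttons can be pictured with a short sketch. This is an illustration in Python, not the editor's Javascript code: the function names, the "par_id" prefix and the 12-digit random id are all assumptions made for the example, and real XHP paragraphs carry more attributes than shown here.

```python
import secrets


def new_id(existing, prefix="par_id"):
    """Return an id that is not yet in `existing`, and record it there.

    The prefix and the 12-digit random suffix are assumptions for this sketch.
    """
    while True:
        candidate = f"{prefix}{secrets.randbelow(10**12):012d}"
        if candidate not in existing:
            existing.add(candidate)
            return candidate


def wrap(text, tag, existing=None):
    """Wrap the selected text in opening and closing tags.

    Pass a set of already used ids as `existing` for paragraph-like tags,
    so that a fresh unique id is added to the opening tag.
    """
    if existing is None:
        return f"<{tag}>{text}</{tag}>"
    return f'<{tag} id="{new_id(existing)}">{text}</{tag}>'
```

For instance, wrap("File", "menuitem") gives <menuitem>File</menuitem>, while wrap("Some text", "paragraph", used_ids) also adds an id and remembers it in used_ids so it is not handed out twice.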

Restrictions

The editor works with Firefox only. There are issues with Chrome and Edge. Other browsers are not yet tested.
Saving files is not implemented. However, you can copy the editor contents and finish the patch in your preferred editor.
More XHP checks are under development, especially id uniqueness and DTD checking.
If you get a blank page on the right, you have hit a bug in the browser transformation. Unfortunately, debugging the browser transformation is very hard and support is almost nonexistent.

Invitation for developers and testers

You are invited to test the editor, report bugs and suggest improvements.
The user interface is simple HTML and Javascript. If you have skills in these technologies, you are a potential developer for the editor, although we know that PHP will be the right tech choice in the near future.
The source code is in the dev-tools repository.

To clone the dev-tools repository:
git clone https://gerrit.libreoffice.org/dev-tools dev-tools
The editor is in dev-tools/help3/html/

If you have a web server working in your computer (Apache, Nginx, etc...) you can run the editor locally: create a link between the web server root and the editor. For example, under Debian-like Linux:

# adjust /dev-tools to the directory where you cloned the repository
cd /var/www/html
sudo ln -s /dev-tools/help3/html help-editor
and point your browser to http://localhost/help-editor

Seeking Help and discussion on the editor

Please use the documentation list, the developer list and our IRC channels to get in touch with the development of the editor.

Acknowledgements

The Javascript editor used is CodeMirror. It was carefully selected by Mike Saunders, who also set the initial configuration for working with XML and our XML dialect XHP, and configured the autocompletion features.

The XHP snippets were originally designed for the KDE Kate editor and ported to the online editor.

Thursday, October 25, 2018

Proposed XHP Extensions

After a significant amount of time spent writing and fixing LibreOffice Help pages (XHP), I came to the conclusion that the LibreOffice Help XML (XHP) is a powerful markup, but a bit too hard for newcomers to master, and that errors and mistakes slip into the files easily. Some of its complexity is not absolutely required, so I wrote a wiki page suggesting the implementation of XHP extensions, aiming to make it simpler to add and read XHP contents as text (markup).

Please note that the current markup is not affected in any way, so legacy contents and the current translations are preserved. For example, the new markup for a 'tip' paragraph would be

<tip id="123456" localize="true" xml-lang="en-US"></tip>

So it can replace

<paragraph id="123456" localize="true" role="tip" xml-lang="en-US"></paragraph>

Yes, it is a trivial change but I hope it will make reading easier for all.
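To show that the two forms carry the same information, here is a minimal sketch that rewrites the long form into the proposed short form. It is only an illustration in Python with a hypothetical function name: the actual handling is planned for the XSLT transformation, and only the 'tip' role from the example above is covered.

```python
import xml.etree.ElementTree as ET

# Roles that get a dedicated element; only 'tip' is taken from the example
SHORT_ROLES = {"tip"}


def shorten(xhp_fragment):
    """Rewrite <paragraph role="tip" ...> as <tip ...>, keeping other attributes."""
    root = ET.fromstring(xhp_fragment)
    for element in root.iter("paragraph"):
        role = element.get("role")
        if role in SHORT_ROLES:
            element.tag = role          # the role becomes the element name
            del element.attrib["role"]  # and is no longer needed as attribute
    return ET.tostring(root, encoding="unicode")
```

The id, localize and xml-lang attributes are left untouched, which is why existing translations would keep matching.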

Handling of the XHP extensions will be in the XSLT transformation and it will be patched accordingly, as well as the DTD, wiki documentation on XHP, and string extractors for Pootle.

The wiki page is

https://wiki.documentfoundation.org/Documentation/Proposed_Extensions_for_XHP

Comments are welcome: dos and don'ts, thumbs up or thumbs down.

Happy help writing.

Monday, September 10, 2018

Hovering icons in Help pages

Last week I implemented a feature in our help pages using a modern CSS (Cascading Style Sheets) technique for hovering the mouse pointer over icons displayed in the page: the hovered icon is enlarged to twice its size.

Almost every icon in the help pages is sized 0.22 in x 0.22 in (about 0.56 cm x 0.56 cm), which is sometimes a bit too small, especially with minimalist icons such as the Colibre icon family. Enlarging the icons helps users see them, and since it was implemented in CSS, there was no need to change the icon dimensions in the source help pages.

To see it working, please check this page and hover the mouse on icons.

Normal size

Enlarged on hover

Comments and suggestions are welcome.

Happy icon hovering!