The markup used to store the text is a logical markup. This mean that every convention used is aimed to represent the logical structure of the document, not the way it is rendered on a PDF or on an HTML page. If the output looks broken, it’s a problem which must be fixed on the software side, not on the text side.

As a matter of fact, in the default configuration, the PDF and the HTML render very differently. PDF uses serif fonts, while HTML sans. PDF has the paragraph indented, normally (unless some stretching is needed) they are not separated by white space, while HTML is without indentation and are separated by white space.

For the PDF the default layout of the KOMA-script class is used, with some minor adjustments (like always using serif fonts, instead of the default sans for the chapter and section titles).

It’s a quite common mistake to try to use the markup the way one would use a word-processor like Word, OpenOffice and similar programs, or trying to mimic the layout of the original document even when there is no need for that. Example: inserting everywhere a br tag to force a space between paragraphs. Or mark section titles using a strong tag. These are just errors.

For example, a section title is something more that a bold string. It shouldn’t happen at the bottom of the page. It should write an entry in the table of contents. It should have some consistent space around. A bold string is just an ordinary line in a bold character, an important line for sure, but nothing more.

So this kind of thing is an error:

<br>
**Title**
<br>

This should be the body

While instead it should be this way:

*** Title

This should be the body

In some contexts, the two chunk of texts will render more or less the same way, but in the first chunk we marked up the text visually, in the second, correct way, logically, as we explicitly told that it is section title. In this way we know that the logical structure of the text is preserved across possible (future) changes of style, and we are not bound to the way the text is rendered with the current fonts at the present moment in a specific format.

The br tag, to make another example, should be the exception, not the rule, and used only where needed by the logic of the document to force a break line, not abused to do something like “let’s make this line a bit shorter because the PDF is not rendering perfectly”. Because in this case, when the layout change (it happens from time to time), you will just have an inexplicably broken line which makes no sense.