5. Advanced Language Features

What you have learned up to now won't get you far if you intend to write anything but plain prose. This chapter introduces some of the more advanced language constructs that let you enrich your documents with illustrations, tables, structured lists, program listings and mathematical formulas.

5.1. Using Lists

ecromedos knows three types of lists:

  • ordered lists,
  • unordered lists (a.k.a. bullet lists),
  • and definition lists.

5.1.1. Bullet Lists and Ordered Lists

Bullet lists are set with the ul element and ordered lists with the ol element. List items are enclosed inside li tags. Both list types may be nested within each other up to four levels deep. Starting with ecromedos version 2, list items may contain arbitrary block elements. But please note that a block element inside a list must not bear a caption. The following listing shows an example of a nested list structure:

<ol>
    <li>
        <p>First paragraph in first list item</p>
        <p>Second paragraph in first list item</p>
    </li>
    <li>
        <p>Second list item</p>
        <ul>
            <li>
                <p>First subitem of second list item</p>
            </li>
            <li>
                <p>Second subitem of second list item</p>
            </li>
        </ul>
    </li>
    <li>
        <p>Third item in outer list</p>
        <ol type="i">
            <li>
                <p>Item one in ordered sublist</p>
            </li>
            <li>
                <p>Item two in ordered sublist</p>
            </li>
            <li>
                <p>Item three in ordered sublist</p>
            </li>
        </ol>
    </li>
</ol>

Ordered lists at different nesting levels receive different enumeration marks, such as Arabic numbers, Latin letters or Roman numerals, to reflect their position in the list hierarchy. The type of enumeration mark at a given level is selected automatically, but may be overridden by setting the list's type attribute as follows:

<ol type="1"> for Arabic numbers (1, 2, 3, ...)
<ol type="i"> for Roman numerals in lowercase (i, ii, iii, iv, ...)
<ol type="I"> for Roman numerals in uppercase (I, II, III, IV, ...)
<ol type="a"> for Latin letters in lowercase (a, b, c, ...)
<ol type="A"> for Latin letters in uppercase (A, B, C, ...)

5.1.2. Definition Lists

Definition lists are set with the dl element. An item in a definition list has two components: a term or expression to be defined and its respective definition. Take a look at the following example:

<dl>
    <dt>ecromedos</dt>
        <dd>
            A document publication system that allows generating
            different target formats from one document source.
        </dd>
    <dt>ECML</dt>
        <dd>
            The ecromedos Markup Language is an XML based markup
            language for describing the logical structure of
            standard text documents.
        </dd>
</dl>

While the dt element may contain only simple text and text-formatting elements, the dd element may contain arbitrary sequences of block elements.
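
As an illustration of block elements inside a definition, the following sketch places a paragraph and a bullet list in a dd element. The term and the wording are made up for this example, and the use of the b element inside dt assumes that it counts as one of the permitted text-formatting elements:

```xml
<dl>
    <dt><b>Bullet list</b></dt>
    <dd>
        <!-- a paragraph followed by a nested list, both block elements -->
        <p>An unordered list, set with the ul element.</p>
        <ul>
            <li><p>List items are enclosed inside li tags.</p></li>
            <li><p>Lists may be nested up to four levels deep.</p></li>
        </ul>
    </dd>
</dl>
```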

5.2. Defining Tables

Starting with version 2, ecromedos features a complete table model with table captions, cells that can span multiple columns, nested subtables and fine-grained control over the visibility of the table grid. The language elements for setting tables were largely borrowed from HTML. However, there are some subtle differences between the HTML and ECML table models, which will become apparent in the course of this section.

5.2.1. Basic Tables

Tables are likely the most complicated part of ECML. But once you know the ins and outs of the ECML table model, you will appreciate the ease with which you can create good-looking tables in your documents. To get started, take a look at the following example of a basic table:

<table print-width="100%" screen-width="600px"
    align="left" id="tbl:example_4x4">
    <caption>
        Example of a simple 2x2 table without frame borders
    </caption>
    <shortcaption>
        Example of a 2x2 table (continued)
    </shortcaption>
    <colgroup>
        <col width="45%"/>
        <col width="55%"/>
    </colgroup>
    <tr>
        <td>First column, first row  </td>
        <td>Second column, first row </td>
    </tr>
    <tr>
        <td>First column, second row </td>
        <td>Second column, second row</td>
    </tr>
</table>

The attributes print-width and screen-width determine the horizontal extent of the table. For printable output, you can specify the table width in centimeters (cm), points (pt) or as a percentage (%) of the overall width of the page's text area. For HTML output, you can use all units commonly used in Cascading Style Sheets (CSS), or the keyword auto to leave the calculation of the table's dimensions entirely to the browser. The align attribute controls the horizontal placement of the table and can take the values left, center and right. The id attribute gives the table a unique id, which can be referenced with the ref and pageref elements (see Section 4.3).

The optional caption element can be used to give the table a descriptive annotation. If you supply a shortcaption, it will be printed on the continuation pages when a table extends over more than one page. The colgroup element describes the column layout. For each column in the table, there must be a col element specifying the relative width of the column. Make sure that these widths add up to 100%, or you may experience strange effects!

The table may start with one or more header rows, distinguished by the th element, and end with one or more footer rows, distinguished by the tf element. Regular rows are set with the tr element, individual table cells with td. Table head and foot will be repeated on each page if the table extends across multiple pages. Apart from that, no special formatting will be applied to text in header or footer cells.
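
To make the row types more concrete, here is a minimal sketch of a table with one header row and one footer row. The id, caption and cell contents are invented for illustration, and the sketch assumes that tf rows contain td cells in the same way as th and tr rows do:

```xml
<table print-width="100%" screen-width="600px"
    align="left" id="tbl:example_head_foot">
    <caption>
        Example of a table with header and footer rows
    </caption>
    <colgroup>
        <col width="50%"/>
        <col width="50%"/>
    </colgroup>
    <!-- header row: repeated on every page the table extends to -->
    <th>
        <td><b>Item</b></td>
        <td><b>Quantity</b></td>
    </th>
    <tr>
        <td>Apples</td>
        <td>3</td>
    </tr>
    <tr>
        <td>Pears</td>
        <td>5</td>
    </tr>
    <!-- footer row: assumed to mirror the structure of th -->
    <tf>
        <td>Total</td>
        <td>8</td>
    </tf>
</table>
```

Since header cells receive no special formatting, the b element is used here to set the column titles in bold.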

5.2.2. Activating the Grid Rules

The table above does not have a visible grid. To draw a frame around your table, use the frame attribute on the table element and set it to an arbitrary combination of the keywords left, right, top and bottom, given as a comma-separated list. Each keyword turns on the corresponding line of the table's outer frame.

Using the keywords rowsep and colsep, you can activate the dividing lines between table cells. You can do this globally, by adding them to the table's frame attribute, or for individual rows and cells. Copy the following listing into an empty document and try adding and removing lines from the table grid to get a feel for it.

<table print-width="100%" screen-width="600px"
    align="left" id="tbl:example_grid" frame="top,bottom"
    print-rulewidth="1pt" screen-rulewidth="1px" rulecolor="#000000">
    <colgroup>
        <col width="25%"/>
        <col width="25%"/>
        <col width="25%"/>
        <col width="25%"/>
    </colgroup>
    <th frame="rowsep">
        <td colspan="4"><b>Header</b></td>
    </th>
    <tr frame="colsep">
        <td frame="rowsep">1</td><td>2</td><td>3</td><td>4</td>
    </tr>
    <tr frame="colsep">
        <td>5</td><td frame="rowsep">6</td><td>7</td><td>8</td>
    </tr>
    <tr frame="rowsep,colsep">
        <td>9</td><td>10</td><td>11</td><td frame="rowsep">12</td>
    </tr>
    <tr>
        <td>13</td><td>14</td><td>15</td><td>16</td>
    </tr>
</table>

The thickness of the grid rules may be specified with the attributes print-rulewidth and screen-rulewidth. The color of the lines can be controlled via the rulecolor attribute. Color values must be given as CSS-style RGB triplets in hexadecimal notation. So in this example, the table rules would be black, which is also the default.

5.2.3. Coloring Table Cells

You may color individual rows or cells by setting the color attribute on the corresponding tag. For example, to give the second cell in the first row from the previous example a gray background, you could write:

<tr frame="colsep">
    <td frame="rowsep">1</td><td color="#dddddd">2</td><td>3</td><td>4</td>
</tr>

Please note that colored cells may appear to overlap with dark grid rules when viewing PostScript or PDF documents on screen. Therefore, you should avoid using colored cells and grid rules together or instead use white rules when working with colored tables.

5.2.4. Text-Alignment in Table Cells

The vertical alignment of text in tables can be controlled only for entire rows, but not for individual cells. This is due to LaTeX's limited capabilities in this respect. To determine the vertical alignment of text in a table row, set the valign attribute on the corresponding row element to one of the specifiers top, middle or bottom.

Horizontal text alignment can be controlled per row or for each cell individually by setting the align attribute to left, center or right. Starting with ecromedos 2.0, text in tables can also be set justified. By default, text is set left-aligned.
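
The following sketch combines both kinds of alignment. The id and cell contents are invented, and the example assumes that an align attribute on a cell takes precedence over the one set on its row:

```xml
<table print-width="100%" screen-width="600px"
    align="left" id="tbl:example_alignment">
    <colgroup>
        <col width="30%"/>
        <col width="70%"/>
    </colgroup>
    <!-- valign applies to the entire row, align is the row-wide default -->
    <tr valign="top" align="right">
        <!-- this cell is assumed to override the row's alignment -->
        <td align="left">Short label</td>
        <td>
            A longer cell whose text wraps over several lines, so that
            the vertical alignment of the neighboring cell becomes visible.
        </td>
    </tr>
</table>
```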

5.2.5. Rows Spanning Multiple Columns

Sometimes it may be necessary to make a table cell stretch across multiple columns. You can achieve this by setting the colspan attribute on a cell to the number of columns that it should cover. Unfortunately, there is no corresponding rowspan attribute as in HTML. However, in most cases it should be possible to work around this limitation using subtables.
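
As a small sketch of a spanning cell, the table below uses a heading that covers the last two of three columns. The id and contents are hypothetical. Note that the cells in each row, counted with their colspan values, cover all three columns:

```xml
<table print-width="100%" screen-width="600px"
    align="left" id="tbl:example_colspan" frame="top,bottom,rowsep">
    <colgroup>
        <col width="50%"/>
        <col width="25%"/>
        <col width="25%"/>
    </colgroup>
    <tr>
        <td>Product</td>
        <!-- one cell covering the second and third column -->
        <td colspan="2" align="center">Price</td>
    </tr>
    <tr>
        <td>    </td><td>Net</td><td>Gross</td>
    </tr>
</table>
```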

5.2.6. Subtables

With ECML it is not possible to create tables with cells that span multiple rows, i.e. there is no rowspan attribute. Starting with version 2.0, you can use subtables to partially get around this limitation. A subtable is created by simply putting a subtable element in place of a td element. Here is an example:

<table print-width="100%" screen-width="600px"
    align="left" id="tbl:example_subtable" frame="top,left,right,bottom"
    print-rulewidth="1pt" screen-rulewidth="1px" rulecolor="#000000">
    <colgroup>
        <col width="25%"/>
        <col width="75%"/>
    </colgroup>
    <tr valign="middle">
        <td align="center" frame="colsep">January 2009</td>
        <subtable frame="colsep,rowsep">
            <colgroup>
                <col width="14%"/><col width="14%"/><col width="14%"/>
                <col width="14%"/><col width="14%"/><col width="15%"/>
                <col width="15%"/>
            </colgroup>
            <tr align="right">
                <td><b>Mon</b></td><td><b>Tue</b></td><td><b>Wed</b></td>
                <td><b>Thu</b></td><td><b>Fri</b></td><td><b>Sat</b></td>
                <td><b>Sun</b></td>
            </tr>
            <tr align="right">
                <td>    </td><td>    </td><td>    </td><td> 1  </td>
                <td> 2  </td><td> 3  </td><td> 4  </td>
            </tr>
            <tr align="right">
                <td> 5  </td><td> 6  </td><td> 7  </td><td> 8  </td>
                <td> 9  </td><td> 10 </td><td> 11 </td>
            </tr>
            <tr align="right">
                <td> 12 </td><td> 13 </td><td> 14 </td><td> 15 </td>
                <td> 16 </td><td> 17 </td><td> 18 </td>
            </tr>
            <tr align="right">
                <td> 19 </td><td> 20 </td><td> 21 </td><td> 22 </td>
                <td> 23 </td><td> 24 </td><td> 25 </td>
            </tr>
            <tr align="right">
                <td> 26 </td><td> 27 </td><td> 28 </td><td> 29 </td>
                <td> 30 </td><td> 31 </td><td>    </td>
            </tr>
        </subtable>
    </tr>
</table>

As you can see, the syntax for subtables is exactly the same as for regular tables, except that a subtable does not have an id or a caption and you cannot specify the table width, as it is fixed at 100%, stretching over the entire cell.

5.3. Embedding Graphics

Graphical figures are incorporated into a document via the figure element. You can give figures a caption and an id. A figure that carries an id attribute can be referenced via the ref and pageref elements (see Section 4.3). Take a look at the following example:

<figure align="center" id="fig:thebeach">
    <caption>The Beach</caption>
    <img src="thebeach.jpg" print-width="100%" screen-width="400px"/>
</figure>
<p>
    Figure <ref idref="fig:thebeach"/> shows a beautiful sunset at
    the Galveston Beach.
</p>

With the src attribute, you specify the location of the image on your hard disk. If the image's file format is not suitable for use with a particular output format, the document pre-processor will automatically convert it. For instance, when generating LaTeX output, raster images are automatically converted to Encapsulated PostScript. Make sure you supply images in a high enough resolution for proper representation in all target formats.

The attributes print-width and screen-width determine the width of the image in printable output and in XHTML output, respectively. For printable output this can be a value in points (pt) or centimeters (cm) or a percentage (%) of the overall width of the page's text area. The width for HTML output is specified in pixels (px).

The figure's horizontal alignment is controlled by setting the align attribute to left, center or right. If you would like a thin black border around your figure, set the border attribute to yes.
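
For instance, a left-aligned figure with a border might be written as in the following sketch. The id and caption are invented for this example; the image file is the one used above:

```xml
<!-- border="yes" requests a thin black border around the figure -->
<figure align="left" border="yes" id="fig:thebeach_framed">
    <caption>The Beach, with a border</caption>
    <img src="thebeach.jpg" print-width="50%" screen-width="300px"/>
</figure>
```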

There is experimental support for letting the text flow around figures. Simply place the figure element inside a paragraph like an inline element and make sure that you explicitly set the figure's alignment to left or right.
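
A minimal sketch of such a paragraph is shown below. The id is hypothetical, and placing the figure at the very beginning of the paragraph is merely one plausible choice, since the feature is experimental:

```xml
<p>
    <!-- explicit alignment is required for the text to flow around -->
    <figure align="right" id="fig:thebeach_wrapped">
        <caption>The Beach</caption>
        <img src="thebeach.jpg" print-width="40%" screen-width="200px"/>
    </figure>
    The text of this paragraph is meant to flow around the figure,
    which is pushed to the right-hand side of the text area.
</p>
```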

You may also load small images or icons into the running text using the img element as an inline element.
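
An inline image could look like the following sketch. The file name icon-save.png is a placeholder, and the example assumes that the width attributes work for inline images just as they do inside a figure:

```xml
<p>
    Click the <img src="icon-save.png" print-width="10pt" screen-width="12px"/>
    icon to save your changes.
</p>
```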

5.4. About Floating Objects

By default, figures and tables are placed exactly where specified in the source document. Imagine, though, that you are generating printable output and two thirds of the current page are already filled with text. The next thing to be inserted is a picture, but it needs more space than remains and thus has to be moved to the next page, leaving one third of the preceding page empty.

This is not only visually unpleasant, but also bloats your document unnecessarily. As a solution, you can turn figures or tables into floating objects by setting the float attribute on the main element to yes. Making an object float means that you give the formatting engine (i.e. LaTeX) permission to move it to a different location in the text in order to ensure optimal text flow across pages.
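
Applied to the figure from the previous section, a floating object might be declared as in this sketch. Only the float attribute is new; the id is changed to a hypothetical one to keep ids unique:

```xml
<!-- float="yes" allows the formatting engine to relocate the figure -->
<figure float="yes" align="center" id="fig:thebeach_floating">
    <caption>The Beach</caption>
    <img src="thebeach.jpg" print-width="100%" screen-width="400px"/>
</figure>
```

The same attribute can be set on a table element.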

5.5. Verbatim Text and Code Listings

You can use the verbatim element when you need to print scripts and want whitespace to be preserved. Text inside a verbatim tag will be printed in typewriter letters and whitespace will be displayed just as it appeared in your editor.
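
As a short sketch, a small shell script could be set like this. The script itself is made up for the example and deliberately contains no characters such as < or & that would need escaping in XML:

```xml
<verbatim>
#!/bin/sh
for f in *.xml; do
    echo "Processing $f"
done
</verbatim>
```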

For program code you should use the listing element, which has as a single child the code element. You can have your code syntax highlighted by specifying the name of the programming language or script in the syntax attribute. Here is an example for the classic “Hello World” in C:

<listing>
    <code syntax="c" colorscheme="borland" strip="yes"
        startline="1" linestep="100" tabspaces="2"><![CDATA[
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    printf("Hello World!\n");
    return 0;
}
    ]]></code>
</listing>

ecromedos internally uses the powerful Pygments syntax highlighter, which can lex a vast number of programming and scripting languages. Pygments also comes with a number of predefined coloring schemes (styles) that you may select with the colorscheme attribute. For a complete list of supported languages and available styles, run the command

pygmentize -L

If you specify a startline, the syntax highlighter will number each line in your code. The linestep attribute specifies the increment from one line to the next.

Setting the strip attribute on the verbatim or code elements to yes will result in whitespace being stripped from the beginning and end of your listing. You can override the background color of the selected coloring scheme with the bgcolor attribute.

By default, ecromedos converts each tab character inside a verbatim or code element to 4 spaces. You can override the number of spaces using the tabspaces attribute.

5.6. Mathematical Formulas

Mathematics is entered in TeX notation. Explaining TeX is beyond the scope of this document. For more information, please refer to appropriate literature, such as [LSHORT].

5.6.1. Inline Math

In order to set mathematical expressions inline, i.e. in the running paragraph, use the m element, as shown in this example:

<p>
    Einstein's law of equivalence of mass and energy is expressed
    as <m>E = mc^2</m>.
</p>

5.6.2. Formulas as Block Elements

Formulas can also be set as block elements. Simply enclose the m element in an equation element. To have your equation numbered, set the number attribute to yes. The following listing shows how to set the equation from the previous example as a block element:

<equation number="yes">
    <m>E = mc^2</m>
</equation>

Support for setting math is not extremely sophisticated. Future versions of ecromedos may provide better control over the alignment and grouping of formulas.