Appendices

Format of .leo files

This technical information may be of use to those wanting to process Leo files with special-purpose filters. Leo’s uses XML for its file format. The following sections describe this format in detail. Important: The actual read/write code in leoFileCommands.py is the authoritative guide. When in doubt about what Leo actually writes, look at an actual .leo file in another editor.

Here are the elements that may appear in Leo files. These elements must appear in this order.

<?xml>

Leo files start with the following line:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet>

An xml-stylesheet line is option. For example:

<?xml-stylesheet ekr_stylesheet?>
<leo_file>
The <leo_file> element opens an element that contains the entire file. </leo_file> ends the file.
<leo_header>

The <leo_header> element specifies version information and other information that affects how Leo parses the file. For example:

<leo_header file_format="2" tnodes="0" max_tnode_index="5725" clone_windows="0"/>

The file_format attribute gives the ‘major’ format number. It is ‘2’ for all 4.x versions of Leo. The tnodes and clone_windows attributes are no longer used. The max_tnode_index attribute is the largest tnode index.

<globals>

The globals element specifies information relating to the entire file. For example:

<globals body_outline_ratio="0.50">
    <global_window_position top="27" left="27" height="472" width="571"/>
    <global_log_window_position top="183" left="446" height="397" width="534"/>
</globals>
  • The body_outline_ratio attribute specifies the ratio of the height of the body pane to the total height of the Leo window. It initializes the position of the splitter separating the outline pane from the body pane.
  • The global_window_position and global_log_window_position elements specify the position of the Leo window and Log window in global coordinates:
<preferences>
This element is vestigial. Leo ignores the <preferences> element when reading. Leo writes an empty <preferences> element.
<find_panel_settings>
This element is vestigial. Leo ignores the <find_panel_settings> element when reading. Leo writes an empty <find_panel_settings> element.
<clone_windows>
This element is vestigial. Leo ignores the <clone_windows> element when reading. Leo no longer writes <clone_windows> elements.
<vnodes>
A single <vnodes> element contains nested <v> elements. <v> elements correspond to vnodes. The nesting of <v> elements indicates outline structure in the obvious way.
<v>

The <v> element represents a single vnode and has the following form:

<v...><vh>sss</vh> (zero or more nested v elements) </v>

The <vh> element specifies the headline text. sss is the headline text encoded with the usual XML escapes. As shown above, a <v> element may contain nested <v> elements. This nesting indicates outline structure in the obvious way. Zero or more of the following attributes may appear in <v> elements:

t=name.timestamp.n
a="xxx"

The t=”Tnnn” attribute specifies the <t> element associated with a <v> element. The a=”xxx” attribute specifies vnode attributes. The xxx denotes one or more upper-case letters whose meanings are as follows:

C       The vnode is a clone. (Not used in 4.x)
E       The vnode is expanded so its children are visible.
M       The vnode is marked.
T       The vnode is the top visible node.
V       The vnode is the current vnode.

For example, a=”EM” specifies that the vnode is expanded and is marked.

New in 4.0:

  • <v> elements corresponding to @file nodes now contain tnodeList attributes. The tnodeList attribute allows Leo to recreate the order in which nodes should appear in the outline. The tnodeList attribute is a list of gnx’s: global node indices. See Format of external files (4.x) for the format of gnx’s.
  • Plugins and scripts may add attributes to <v> and <t> elements. See Writing plugins for details.
<tnodes>
A single <tnodes> element contains a non-nested list of <t> elements.
<t>

The <t> element represents the body text of the corresponding <v> element. It has this form:

<t tx="<gnx>">sss</t>

The tx attribute is required. The t attribute of <v> elements refer to this tx attribute. sss is the body text encoded with the usual XML escapes.

New in 4.0: Plugins and scripts may add attributes to <v> and <t> elements. See Writing plugins for details.

Format of external files

This section describe the format of external files. Leo’s sentinel lines are comments, and this section describes those comments.

Files derived from @file use gnx’s in @+node sentinels. Such gnx’s permanently and uniquely identify nodes. Gnx’s have the form:

id.yyyymmddhhmmss
id.yyyymmddhhmmss.n

The second form is used if two gnx’s would otherwise be identical.

  • id is a string unique to a developer, e.g., a cvs id.
  • yyyymmddhhmmss is the node’s creation date.
  • n is an integer.

Here are the sentinels used by Leo, in alphabetical order. Unless otherwise noted, the documentation applies to all versions of Leo. In the following discussion, gnx denotes a gnx as described above.

@<<

A sentinel of the form @<<section_name>> represents a section reference.

If the reference does not end the line, the sentinel line ending the expansion is followed by the remainder of the reference line. This allows the Read code to recreate the reference line exactly.

@@

The @@ sentinel represents any line starting with @ in body text except @*whitespace*, @doc and @others. Examples:

@@nocolor
@@pagewidth 80
@@tabwidth 4
@@code
@afterref
Marks non-whitespace text appearing after a section references.
@+all
Marks the start of text generated by the @all directive.
@-all
Marks the end of text generated by the @all directive.

@at and @doc

The @+doc @+at sentinels indicate the start of a doc parts.

We use the following trailing whitespace convention to determine where putDocPart has inserted line breaks:

A line in a doc part is followed by an inserted newline
if and only if the newline if preceded by whitespace.

To make this convention work, Leo’s write code deletes the trailing whitespace of all lines that are followed by a “real” newline.

@+body (Leo 3.x only)
Marks the start of body text.
@-body (Leo 3.x only)
Marks the end of body text.
@delims
The @delims directive inserts @@delims sentinels into the external file. The new delimiter strings continue in effect until the next @@delims sentinel in the external file or until the end of the external file. Adding, deleting or changing @@delim sentinels will destroy Leo’s ability to read the external file. Mistakes in using the @delims directives have no effect on Leo, though such mistakes will thoroughly mess up a external file as far as compilers, HTML renderers, etc. are concerned.
@+leo

Marks the start of any external file. This sentinel has the form:

<opening_delim>@leo<closing_delim>

The read code uses single-line comments if <closing_delim> is empty. The write code generates single-line comments if possible.

The @+leo sentinel contains other information. For example:

<opening_delim>@leo-ver=4-thin<closing_delim>
@-leo
Marks the end of the Leo file. Nothing but whitespace should follow this directive.
@+middle (Leo 4.0 and later)
Marks the start of intermediate nodes between the node that references a section and the node that defines the section. Typically no such sentinels are needed: most sections are defined in a direct child of the referencing node.
@-middle (Leo 4.0 and later)
Marks the end of intermediate nodes between the node that references a section and the node that defines the section.
@+node

Mark the start and end of a node.

@+node:gnx:<headline>
@others
The @+others sentinel indicates the start of the expansion of an @+others directive, which continues until the matching @-others sentinel.
@verbatim
@verbatim indicates that the next line of the external file is not a sentinel. This escape convention allows body text to contain lines that would otherwise be considered sentinel lines.
@@verbatimAfterRef

@verbatimAfterRef is generated when a comment following a section reference would otherwise be treated as a sentinel. In Python code, an example would be:

<< ref >> #+others

Unicode reference

Leo uses unicode internally for all strings.

  1. Leo converts headline and body text to unicode when reading .leo files and external files. Both .leo files and external files may specify their encoding. The default is utf-8. If the encoding used in a external file is not “utf-8” it is represented in the @+leo sentinel line. For example:

    #@+leo-encoding=iso-8859-1.

    The utf-8 encoding is a “lossless” encoding (it can represent all unicode code points), so converting to and from utf-8 plain strings will never cause a problem. When reading or writing a character not in a “lossy” encoding, Leo converts such characters to ‘?’ and issues a warning.

  2. When writing .leo files and external files Leo uses the same encoding used to read the file, again with utf-8 used as a default.

  3. leoSettings.leo contains the following Unicode settings, with the defaults as shown:

    default_derived_file_encoding = UTF-8
    new_leo_file_encoding = UTF-8
    

    These control the default encodings used when writing external files and .leo files. Changing the new_leo_file_encoding setting is not recommended. See the comments in leoSettings.leo. You may set default_derived_file_encoding to anything that makes sense for you.

  4. The @encoding directive specifies the encoding used in a external file. You can’t mix encodings in a single external file.

Previous topic

White Papers

Next topic

Glossary