This technical information may be of use to those wanting to process Leo files with special-purpose filters. Leo uses XML for its file format. The following sections describe this format in detail. Important: The actual read/write code in leoFileCommands.py is the authoritative guide. When in doubt about what Leo actually writes, look at an actual .leo file in another editor.
Here are the elements that may appear in Leo files. These elements must appear in this order.
Leo files start with the following line:
<?xml version="1.0" encoding="UTF-8"?>
An xml-stylesheet line is optional. For example:
<?xml-stylesheet ekr_stylesheet?>
The <leo_header> element specifies version information and other information that affects how Leo parses the file. For example:
<leo_header file_format="2" tnodes="0" max_tnode_index="5725" clone_windows="0"/>
The file_format attribute gives the ‘major’ format number. It is ‘2’ for all 4.x versions of Leo. The tnodes and clone_windows attributes are no longer used. The max_tnode_index attribute is the largest tnode index.
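As a minimal sketch of how a filter might read these attributes, the following uses Python's standard xml.etree.ElementTree module on the example header shown above. It is an illustration only, not the code Leo itself uses.

```python
import xml.etree.ElementTree as ET

# The example <leo_header> element from the text above.
HEADER = '<leo_header file_format="2" tnodes="0" max_tnode_index="5725" clone_windows="0"/>'

header = ET.fromstring(HEADER)

# Attribute values are strings in XML; convert the ones that are still used.
file_format = int(header.get("file_format"))
max_tnode_index = int(header.get("max_tnode_index"))

print(file_format, max_tnode_index)
```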
The globals element specifies information relating to the entire file. For example:
<globals body_outline_ratio="0.50">
<global_window_position top="27" left="27" height="472" width="571"/>
<global_log_window_position top="183" left="446" height="397" width="534"/>
</globals>
The <v> element represents a single vnode and has the following form:
<v...><vh>sss</vh> (zero or more nested v elements) </v>
The <vh> element specifies the headline text. sss is the headline text encoded with the usual XML escapes. As shown above, a <v> element may contain nested <v> elements. This nesting indicates outline structure in the obvious way. Zero or more of the following attributes may appear in <v> elements:
t=name.timestamp.n
a="xxx"
The t="Tnnn" attribute specifies the <t> element associated with a <v> element. The a="xxx" attribute specifies vnode attributes. The xxx denotes one or more upper-case letters whose meanings are as follows:
C The vnode is a clone. (Not used in 4.x)
E The vnode is expanded so its children are visible.
M The vnode is marked.
T The vnode is the top visible node.
V The vnode is the current vnode.
For example, a="EM" specifies that the vnode is expanded and is marked.
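The letter codes above can be decoded with a simple lookup table. The sketch below is illustrative: the names VNODE_FLAGS and decode_vnode_attrs are hypothetical and do not come from Leo's source.

```python
# Meanings of the letters that may appear in a <v> element's a="..." attribute,
# taken from the list above.
VNODE_FLAGS = {
    "C": "clone",
    "E": "expanded",
    "M": "marked",
    "T": "top visible node",
    "V": "current vnode",
}

def decode_vnode_attrs(a):
    """Return the meanings of the flag letters in an a="..." value.

    Hypothetical helper; unknown letters are silently ignored.
    """
    return [VNODE_FLAGS[ch] for ch in a if ch in VNODE_FLAGS]

print(decode_vnode_attrs("EM"))
```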
New in 4.0:
The <t> element represents the body text of the corresponding <v> element. It has this form:
<t tx="<gnx>">sss</t>
The tx attribute is required. The t attribute of <v> elements refers to this tx attribute. sss is the body text encoded with the usual XML escapes.
New in 4.0: Plugins and scripts may add attributes to <v> and <t> elements. See Writing plugins for details.
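A special-purpose filter typically needs to pair each <v> element with its <t> element. The sketch below shows one way to do this with Python's standard xml.etree.ElementTree module. The sample document is an assumption made for illustration: the enclosing leo_file, vnodes and tnodes elements and the gnx values are not described in this section, and the function name outline is hypothetical. The code itself relies only on the <v>, <vh> and <t> elements and the t and tx attributes described above.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up sample. The wrapper elements and gnx values are assumptions.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<leo_file>
<leo_header file_format="2"/>
<vnodes>
<v t="ekr.20050101120000" a="E"><vh>Root &amp; friends</vh>
<v t="ekr.20050101120000.1"><vh>Child</vh></v>
</v>
</vnodes>
<tnodes>
<t tx="ekr.20050101120000">root body</t>
<t tx="ekr.20050101120000.1">child body</t>
</tnodes>
</leo_file>
"""

def outline(root):
    """Return (level, headline, body) tuples in outline order."""
    # Map each tx value to its body text.
    bodies = {t.get("tx"): (t.text or "") for t in root.iter("t")}
    # Any <v> that is a direct child of another <v> is not a top-level node.
    nested = {child for v in root.iter("v") for child in v.findall("v")}
    rows = []

    def walk(v, level):
        headline = v.findtext("vh", default="")
        rows.append((level, headline, bodies.get(v.get("t"), "")))
        for child in v.findall("v"):
            walk(child, level + 1)

    for v in root.iter("v"):
        if v not in nested:
            walk(v, 0)
    return rows

rows = outline(ET.fromstring(SAMPLE.encode("utf-8")))
for level, headline, body in rows:
    print("  " * level + headline + ": " + body)
```

The parser undoes the XML escapes, so the headline written as Root &amp;amp; friends in the file is returned as Root & friends.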
This section describes the format of external files. Leo’s sentinel lines are comments, and this section describes those comments.
Files derived from @file use gnx’s in @+node sentinels. Such gnx’s permanently and uniquely identify nodes. Gnx’s have the form:
id.yyyymmddhhmmss
id.yyyymmddhhmmss.n
The second form is used if two gnx’s would otherwise be identical.
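The two gnx forms can be split into their parts with a regular expression. This is a sketch under stated assumptions: the timestamp is always exactly 14 digits, and the id is whatever precedes it. The names GNX_RE and parse_gnx are hypothetical, and the sample gnx values are made up.

```python
import re
from datetime import datetime

# id, a 14-digit yyyymmddhhmmss timestamp, and an optional .n suffix.
GNX_RE = re.compile(r"^(?P<id>.+)\.(?P<stamp>\d{14})(?:\.(?P<n>\d+))?$")

def parse_gnx(gnx):
    """Return (id, timestamp, n) for a gnx, or None if it does not match.

    n is None for the first form (no .n suffix). A 14-digit stamp that is
    not a valid date raises ValueError from datetime.strptime.
    """
    m = GNX_RE.match(gnx)
    if m is None:
        return None
    stamp = datetime.strptime(m.group("stamp"), "%Y%m%d%H%M%S")
    n = int(m.group("n")) if m.group("n") else None
    return m.group("id"), stamp, n

print(parse_gnx("ekr.20050101120000"))
print(parse_gnx("ekr.20050101120000.1"))
```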
Here are the sentinels used by Leo, in alphabetical order. Unless otherwise noted, the documentation applies to all versions of Leo. In the following discussion, gnx denotes a gnx as described above.
A sentinel of the form @<<section_name>> represents a section reference.
If the reference does not end the line, the sentinel line ending the expansion is followed by the remainder of the reference line. This allows the Read code to recreate the reference line exactly.
The @@ sentinel represents any line starting with @ in body text except @*whitespace*, @doc and @others. Examples:
@@nocolor
@@pagewidth 80
@@tabwidth 4
@@code
@at and @doc
The @+doc and @+at sentinels indicate the start of a doc part.
We use the following trailing whitespace convention to determine where putDocPart has inserted line breaks:
A line in a doc part is followed by an inserted newline if and only if the newline is preceded by whitespace. To make this convention work, Leo’s write code deletes the trailing whitespace of all lines that are followed by a “real” newline.
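The reading side of this convention can be sketched as follows. This is not Leo's read code: the function name rejoin_doc_part is hypothetical, and keeping the trailing whitespace as the separator when an inserted newline is removed is an assumption.

```python
def rejoin_doc_part(lines):
    """Undo inserted line breaks in a doc part.

    lines: the text lines of the doc part, without their newlines and
    with any comment delimiters already stripped.
    """
    out = []
    for line in lines:
        if line and line[-1] in " \t":
            # Trailing whitespace: the newline was inserted, so drop it.
            out.append(line)
        else:
            # No trailing whitespace: the newline is "real", so keep it.
            out.append(line + "\n")
    return "".join(out)

text = rejoin_doc_part(["This is a long ", "paragraph.", "Next."])
print(repr(text))
```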
The @+leo sentinel marks the start of any external file. This sentinel has the form:
<opening_delim>@leo<closing_delim>
The read code uses single-line comments if <closing_delim> is empty. The write code generates single-line comments if possible.
The @+leo sentinel contains other information. For example:
<opening_delim>@leo-ver=4-thin<closing_delim>
The @+node and @-node sentinels mark the start and end of a node.
@+node:gnx:<headline>
@verbatimAfterRef is generated when a comment following a section reference would otherwise be treated as a sentinel. In Python code, an example would be:
<< ref >> #+others
Leo uses unicode internally for all strings.
Leo converts headline and body text to unicode when reading .leo files and external files. Both .leo files and external files may specify their encoding. The default is utf-8. If the encoding used in an external file is not “utf-8”, it is represented in the @+leo sentinel line. For example:
#@+leo-encoding=iso-8859-1.
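A filter that reads external files could pick the encoding out of the @+leo line along these lines. This is a sketch, not Leo's own parser: the function name sentinel_encoding is hypothetical, and it assumes the encoding name consists only of letters, digits, underscores and hyphens, so that the trailing period in the example ends it.

```python
import re

# Matches "-encoding=<name>" where <name> stops at the first character
# that is not a letter, digit, underscore or hyphen (an assumption).
ENCODING_RE = re.compile(r"-encoding=([A-Za-z0-9_\-]+)")

def sentinel_encoding(first_line, default="utf-8"):
    """Return the encoding named in an @+leo sentinel line, or the default."""
    m = ENCODING_RE.search(first_line)
    return m.group(1) if m else default

print(sentinel_encoding("#@+leo-encoding=iso-8859-1."))
print(sentinel_encoding("#@+leo-ver=4-thin"))
```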
The utf-8 encoding is a “lossless” encoding (it can represent all unicode code points), so converting to and from utf-8 plain strings will never cause a problem. When reading or writing a character that a “lossy” encoding cannot represent, Leo converts that character to ‘?’ and issues a warning.
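The difference between lossless and lossy encodings can be seen with Python's built-in str.encode. The sketch below mimics the behavior described above by using errors="replace", which substitutes '?' for characters the target encoding cannot represent. The function name to_encoding is hypothetical, and Leo's actual conversion code may differ.

```python
def to_encoding(s, encoding):
    """Encode s, replacing unrepresentable characters with '?'.

    Returns (encoded bytes, lossy), where lossy is True if any character
    had to be replaced. A caller could issue a warning when lossy is True.
    """
    try:
        return s.encode(encoding), False
    except UnicodeEncodeError:
        return s.encode(encoding, errors="replace"), True

text = "caf\u00e9 \u20ac"  # e-acute is in iso-8859-1; the euro sign is not

utf8_bytes, utf8_lossy = to_encoding(text, "utf-8")
latin1_bytes, latin1_lossy = to_encoding(text, "iso-8859-1")

print(utf8_lossy, utf8_bytes.decode("utf-8") == text)
print(latin1_lossy, latin1_bytes)
```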
When writing .leo files and external files Leo uses the same encoding used to read the file, again with utf-8 used as a default.
leoSettings.leo contains the following Unicode settings, with the defaults as shown:
default_derived_file_encoding = UTF-8
new_leo_file_encoding = UTF-8
These control the default encodings used when writing external files and .leo files. Changing the new_leo_file_encoding setting is not recommended. See the comments in leoSettings.leo. You may set default_derived_file_encoding to anything that makes sense for you.
The @encoding directive specifies the encoding used in an external file. You can’t mix encodings in a single external file.