tablo is a plain text interchange format for tabular data. It is more expressive than CSV while remaining easy for people to read and write.

tablo is designed to solve a number of ambiguities and shortcomings that frustrate users of delimited formats like CSV or TSV. In tablo, column headers are explicit and types are unambiguous. Additionally, cells may be styled with text and number formatting hints.

tablo is open source. We welcome contributions to tablo on GitHub.

This specification is provided under a Creative Commons Attribution–Share Alike License.

Implementations.

There are officially supported tablo implementations in Python and TypeScript/JavaScript. We welcome community-driven implementations in other languages as well.

In the interest of clarity, the tablo maintainers ask that the tablo name be used only with implementations that fully adhere to the specification described here.

Official Implementations

Examples.

The following example shows a tablo document with headers; data of string, date, and numeric types; and a format section declaring that the first column should be rendered in bold and the third row in red italic text.

Whitespace is optional between items in a row, and may be adjusted to improve legibility.

"Title"               , "Medium"                               , "Year" , "Width" , "Height"
=
"Gold Marilyn Monroe" , "Silkscreen ink and acrylic on canvas" , #1962  , 211.4   , 144.7
"Double Elvis"        , "Silkscreen ink on acrylic on canvas"  , #1963  , 210.8   , 134.6
"Flowers"             , "Offset lithograph"                    , #1964  ,  55.8   ,  55.7
"Cow"                 , "Screenprint"                          , #1966  , 116.7   ,  74.5
"Self-Portrait"       , "Screenprint"                          , #1966  ,  56     ,  52.8
"Mao"                 , "Silkscreen ink and acrylic on linen"  , #1973  ,  66.5   ,  55.9
*
A:A {bold}
3:3 {italic, red}
            

Additional examples can be found in the tablo repository on GitHub.

Specification.

A tablo file consists of three sections: an optional header, the table data, and an optional format section. The header is a single line that specifies column labels. Table data consists of individual lines of comma-separated values, with each line representing a single table row. The format section is a collection of declarations that apply text formatting rules to selected cells or groups of cells.
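As a non-normative illustration of this three-part layout, the following Python sketch splits a document into its header line, data lines, and format lines. It assumes, based only on the example document, that a line containing a single `=` ends the header and a line containing a single `*` begins the format section; the name `split_sections` is hypothetical and not part of any official implementation.

```python
def split_sections(text):
    """Split tablo text into (header, data_lines, format_lines).

    Assumption: a line holding only "=" ends the header and a line holding
    only "*" starts the format section, as in the example document.
    """
    header = None
    data = []
    fmt = []
    in_format = False
    # Split on "\n" only: str.splitlines() would also break on vertical
    # tabs, which this format treats as ordinary whitespace.
    for line in text.split("\n"):
        stripped = line.strip()
        if not stripped:
            continue  # ignore blank lines (an assumption of this sketch)
        if in_format:
            fmt.append(stripped)
        elif stripped == "*":
            in_format = True
        elif stripped == "=" and header is None and len(data) == 1:
            # The single line seen so far was the header, not a data row.
            header = data.pop()
        else:
            data.append(stripped)
    return header, data, fmt
```

A document without a header or format section simply yields `None` for the header and an empty list of format lines.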

Document

Header

The header is a comma-separated sequence of strings to be used as column labels. Header labels are optional descriptions of the table data to follow. If a column header should be empty, a hyphen may be used to indicate a null value.

Commas may be followed by optional whitespace, which consists of zero or more spaces or horizontal or vertical tabs. The final element of the header row must be followed immediately by a newline character.

Data

The data section consists of one or more data rows. A data row is a comma-separated sequence of values.

Rows may be separated into groups by inserting a line with a single tilde (~) followed by a newline. Group separators may not appear on consecutive lines.
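The grouping rule can be sketched in Python as follows. This is a minimal illustration, not the reference behavior: the function name `group_rows` is hypothetical, and the handling of a separator on the first or last line (which simply produces an empty group here) is an assumption, since the text above does not address it.

```python
def group_rows(lines):
    """Partition data lines into groups at lines holding a single "~"."""
    groups = [[]]
    previous_was_separator = False
    for line in lines:
        if line.strip() == "~":
            if previous_was_separator:
                # Two separators on consecutive lines are not allowed.
                raise ValueError("consecutive group separators")
            groups.append([])
            previous_was_separator = True
        else:
            groups[-1].append(line)
            previous_was_separator = False
    return groups
```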

Commas may be followed by optional whitespace, which consists of zero or more spaces or horizontal or vertical tabs. The final element of a data row must be followed immediately by a newline character.
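Because strings may themselves contain commas, a row cannot be split with a plain `split(",")`. The sketch below shows one way to tokenize a row while tracking whether the scanner is inside a quoted string. The name `split_row` is hypothetical, and trimming whitespace on both sides of each value is an assumption drawn from the example document, where commas are padded on both sides.

```python
def split_row(line):
    """Split one row into raw value tokens, honoring quoted strings."""
    tokens = []
    current = []
    in_string = False
    escaped = False
    for ch in line:
        if in_string:
            current.append(ch)
            if escaped:
                escaped = False      # this character was escaped; keep it
            elif ch == "\\":
                escaped = True       # next character is part of an escape
            elif ch == '"':
                in_string = False    # closing quotation mark
        elif ch == '"':
            in_string = True
            current.append(ch)
        elif ch == ",":
            # Trim spaces, horizontal tabs, and vertical tabs around a value.
            tokens.append("".join(current).strip(" \t\v"))
            current = []
        else:
            current.append(ch)
    if in_string:
        raise ValueError("unterminated string")
    tokens.append("".join(current).strip(" \t\v"))
    return tokens
```

The tokens are returned undecoded, so each one can then be interpreted according to its data type.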

Value

Values belong to one of five data types: string, number, datetime, boolean, or null.

String

Strings are always enclosed in quotation marks ("). The value between the quotation marks is a sequence of zero or more Unicode code points, encoded as UTF-8, with the exception of a backslash, a quotation mark, or a control character in the ranges U+0000 through U+001F and U+0080 through U+009F.

Otherwise prohibited code points may be written using the Unicode escape sequence \u{hex}, where hex represents between one and eight hexadecimal digits.

Additional two-character escape sequences are provided for newline, carriage return, form feed, horizontal tab, backspace, quotation mark, and backslash. A backslash followed by any character other than the defined escape characters is an error.
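The escape rules can be illustrated with a small decoder. This is a sketch under stated assumptions: the specific escape letters (`\n`, `\r`, `\f`, `\t`, `\b`) follow the usual C and JSON conventions and are not spelled out in the text above, and the name `decode_string` is hypothetical.

```python
import re

# Assumed two-character escapes, following common C/JSON conventions.
SIMPLE_ESCAPES = {
    "n": "\n", "r": "\r", "f": "\f", "t": "\t", "b": "\b",
    '"': '"', "\\": "\\",
}
HEX_ESCAPE = re.compile(r"\{([0-9A-Fa-f]{1,8})\}")


def decode_string(token):
    """Decode a quoted tablo string token into a Python str."""
    if len(token) < 2 or token[0] != '"' or token[-1] != '"':
        raise ValueError("string must be enclosed in quotation marks")
    body = token[1:-1]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == '"':
            raise ValueError("unescaped quotation mark")
        if ch != "\\":
            code = ord(ch)
            if code <= 0x1F or 0x80 <= code <= 0x9F:
                raise ValueError("unescaped control character")
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash")
        nxt = body[i + 1]
        if nxt in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[nxt])
            i += 2
        elif nxt == "u":
            match = HEX_ESCAPE.match(body, i + 2)
            if match is None:
                raise ValueError("malformed \\u{hex} escape")
            code = int(match.group(1), 16)
            if code > 0x10FFFF:
                raise ValueError("code point out of range")
            out.append(chr(code))
            i = match.end()
        else:
            # Any other character after a backslash is an error.
            raise ValueError("unknown escape: \\" + nxt)
    return "".join(out)
```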

Hex Sequence

A hex sequence consists of between one and eight hexadecimal digits.

Number

A number is a sequence of decimal digits that can be followed by an optional fraction part, an optional exponent, or both. Numbers may be prefixed with a minus to indicate values less than zero, or a plus, which has no effect on the value. Numbers may use underscores as digit grouping separators.

Additionally, a number may be written in hexadecimal by prefixing the digits with the characters 0x. Hexadecimal numbers may also be preceded by a plus or minus and may contain underscores.
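One possible reading of the number rules is sketched below with regular expressions. Several details are assumptions rather than statements of the specification: underscores are accepted only between digits, a fraction requires at least one digit after the decimal point, hexadecimal digits may be upper or lower case, and values without a fraction or exponent are returned as `int` while others are returned as `float`. The name `parse_number` is hypothetical.

```python
import re

# Digits with optional single underscores between digit runs (assumption).
DIGITS = r"[0-9]+(?:_[0-9]+)*"
DECIMAL = re.compile(rf"([+-]?)({DIGITS})(\.{DIGITS})?([eE][+-]?{DIGITS})?")
HEX = re.compile(r"([+-]?)0x([0-9A-Fa-f]+(?:_[0-9A-Fa-f]+)*)")


def parse_number(token):
    """Parse a tablo number token into an int or float."""
    match = HEX.fullmatch(token)
    if match:
        value = int(match.group(2).replace("_", ""), 16)
        return -value if match.group(1) == "-" else value
    match = DECIMAL.fullmatch(token)
    if match is None:
        raise ValueError("not a number: " + token)
    cleaned = token.replace("_", "")
    if match.group(3) or match.group(4):
        return float(cleaned)   # has a fraction and/or exponent
    return int(cleaned)
```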

Datetime

Instants in time are written as ISO 8601 date, time, or combined datetime strings preceded by a hash sign (#).
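As a hedged illustration, the following sketch interprets a datetime literal with Python's standard `fromisoformat` parsers. It covers only the ISO 8601 forms those parsers accept, which vary by Python version. The year-only form (as in `#1962` from the example document) is handled separately and returned as a plain integer, which is a design choice of this sketch. The name `parse_datetime_literal` and the `(kind, value)` return shape are hypothetical.

```python
import re
from datetime import date, datetime, time


def parse_datetime_literal(token):
    """Return (kind, value) for a '#'-prefixed ISO 8601 literal."""
    if not token.startswith("#"):
        raise ValueError("datetime literals start with '#'")
    body = token[1:]
    # Four digits alone are treated as a year (assumption), checked first
    # so they are never mistaken for a compact time such as 12:30.
    if re.fullmatch(r"[0-9]{4}", body):
        return ("year", int(body))
    parsers = (
        ("date", date.fromisoformat),
        ("datetime", datetime.fromisoformat),
        ("time", time.fromisoformat),
    )
    for kind, parser in parsers:
        try:
            return (kind, parser(body))
        except ValueError:
            continue
    raise ValueError("unsupported datetime literal: " + token)
```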

Boolean

The two boolean values are true and false.

Null

A hyphen represents a null value, or empty cell.
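Since each data type has a distinctive leading character, a reader can decide which type a raw token belongs to before parsing it fully. The sketch below shows that dispatch; note that a lone hyphen is null, while a hyphen followed by digits is a negative number. The function name `classify` is hypothetical, and the function only identifies the type without validating the rest of the token.

```python
def classify(token):
    """Name the data type of a raw value token by its leading character."""
    if token.startswith('"'):
        return "string"
    if token.startswith("#"):
        return "datetime"
    if token in ("true", "false"):
        return "boolean"
    if token == "-":
        return "null"          # a lone hyphen is an empty cell
    if token and token[0] in "0123456789+-":
        return "number"        # includes signed values such as -5
    raise ValueError("unrecognized value: " + repr(token))
```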

Format

Column

Column specifiers consist of one or more capital letters A through Z.

Row

Row specifiers consist of one or more decimal digits.

Cell

Cell specifiers are column-row pairs, where the column specifier is followed immediately by a row specifier.
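The three specifier forms can be recognized with one pattern, as in the sketch below. It assumes that column letters are numbered spreadsheet-style (A is 1, Z is 26, AA is 27) and that row numbers are used exactly as written; neither convention is stated in the text above. The names `column_number` and `parse_specifier` are hypothetical.

```python
import re

SPECIFIER = re.compile(r"([A-Z]+)?([0-9]+)?")


def column_number(letters):
    """Convert column letters to a 1-based number (A=1, Z=26, AA=27)."""
    number = 0
    for ch in letters:
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number


def parse_specifier(text):
    """Return (column, row); either part is None when absent."""
    match = SPECIFIER.fullmatch(text)
    if not text or match is None:
        raise ValueError("invalid specifier: " + repr(text))
    letters, digits = match.groups()
    column = column_number(letters) if letters else None
    row = int(digits) if digits else None
    return column, row
```

Under these assumptions, `parse_specifier("B12")` yields column 2 and row 12, while `"A"` and `"3"` yield a column-only and a row-only result respectively.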

Tag

Tags are user-assignable labels that can be associated with a format selector. A tag consists of a key followed by an optional whitespace-separated value. Tags have no bearing on the content of a file's data, but provide style and formatting hints to application renderers.
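A tag's key and optional value can be separated at the first run of whitespace, as in this minimal sketch. The function name `parse_tag` is hypothetical, and the `color` key used in the test is an invented illustration rather than a tag defined by this specification.

```python
def parse_tag(text):
    """Split a tag into (key, value); value is None when absent."""
    parts = text.split(None, 1)   # split once on any run of whitespace
    if not parts:
        raise ValueError("empty tag")
    key = parts[0]
    value = parts[1].strip() if len(parts) > 1 else None
    return key, value
```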

Key

Keys are unique identifiers within a format specifier. Valid characters in keys are upper- and lower-case Latin letters A–Z, the digits 0–9, underscore (_), and hyphen (-). Keys must start with a letter or underscore; the initial character may be followed by any number of valid characters.
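The key rules translate directly into a regular expression. The following sketch checks a single key; the name `is_valid_key` is hypothetical.

```python
import re

# First character: letter or underscore; then letters, digits, "_" or "-".
KEY = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*")


def is_valid_key(key):
    """Return True if key satisfies the character rules for tag keys."""
    return KEY.fullmatch(key) is not None
```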