Fairspec Table Schema
| Authors | Evgeny Karev |
|---|---|
| Profile | https://fairspec.org/profiles/latest/table-schema.json |
Fairspec Table Schema is a simple JSON-based format that defines Table Schema to describe a class of tabular data resources. Table Schema is structurally compatible with JSON Schema, but it doesn’t support all JSON Schema features. It adapts some features for the tabular context and extends JSON Schema with additional tabular features.
Language
The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in RFC 2119.
Descriptor
A Fairspec Table Schema is a JSON resource that MUST be an object compatible with the Table Schema structure outlined below.
Table Schema
A top-level descriptor object defining a schema of tabular data resources. It MAY have the following properties (all optional unless otherwise stated):
$schema
An External Path to one of the officially published Fairspec Table Schema profiles. The default value is https://fairspec.org/profiles/latest/table-schema.json.
For example, for version X.Y.Z of the profile:
{ "$schema": "https://fairspec.org/profiles/X.Y.Z/table-schema.json" }

title
An optional human-readable title for the table schema. It MUST be a string.
For example:
{ "title": "Experimental Measurements Dataset" }

description
An optional detailed description explaining the purpose and contents of the table schema. It MUST be a string.
For example:
{ "title": "Experimental Measurements Dataset", "description": "Temperature and pressure measurements collected during chemical reaction experiments performed in laboratory conditions." }

required
An optional list of column names that are required to be present. Each item MUST be a string matching a key in the properties object.
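The rule that every required name matches a key in properties can be checked mechanically. The following is a minimal Python sketch of that check; the function name `undeclared_required` is hypothetical and not part of any Fairspec tooling.

```python
def undeclared_required(schema: dict) -> list:
    """Return names listed in "required" that are not keys of "properties"."""
    properties = schema.get("properties", {})
    return [name for name in schema.get("required", []) if name not in properties]


# "pressure" is required but has no column definition.
schema = {
    "required": ["experiment_id", "temperature", "pressure"],
    "properties": {
        "experiment_id": {"type": "integer"},
        "temperature": {"type": "number"},
    },
}
print(undeclared_required(schema))
```

An empty result means the required list is consistent with the declared columns.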
For example, to require specific columns:
{ "required": ["experiment_id", "temperature", "pressure"], "properties": { "experiment_id": { "type": "integer" }, "temperature": { "type": "number" }, "pressure": { "type": "number" }, "notes": { "type": "string" } } }

properties
An object defining the schema for table columns. Each key represents a column name, and its value MUST be a valid Column definition.
For example, for a simple table with different column types:
{ "properties": { "id": { "type": "integer", "minimum": 1 }, "name": { "type": "string", "maxLength": 100 }, "email": { "type": "string", "format": "email" }, "active": { "type": "boolean" } } }

primaryKey
An optional array of column names that form the table’s primary key. The combination of values in these columns MUST uniquely identify each row. At least one column name MUST be specified if this property is present.
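To illustrate what "uniquely identify each row" means for a composite key, here is a small Python sketch that reports key combinations occurring more than once. The function name `duplicate_keys` and the row representation (a list of dictionaries) are assumptions made for this illustration; how missing values inside key columns are treated is not covered.

```python
from collections import Counter


def duplicate_keys(rows: list, primary_key: list) -> list:
    """Return primary-key value combinations that occur in more than one row."""
    if not primary_key:
        raise ValueError("primaryKey must name at least one column")
    counts = Counter(tuple(row[name] for name in primary_key) for row in rows)
    return [key for key, count in counts.items() if count > 1]


rows = [
    {"sample_id": "A", "timestamp": "2024-01-01T00:00:00Z", "value": 1.0},
    {"sample_id": "A", "timestamp": "2024-01-02T00:00:00Z", "value": 2.0},
    {"sample_id": "A", "timestamp": "2024-01-01T00:00:00Z", "value": 3.0},
]
print(duplicate_keys(rows, ["sample_id", "timestamp"]))
```

The same kind of check could be applied to each entry of uniqueKeys, since both constraints compare the combined values of the listed columns.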
For example, with a single-column primary key:
{ "primaryKey": ["experiment_id"], "properties": { "experiment_id": { "type": "integer" }, "measurement": { "type": "number" } } }
For example, with a composite primary key:
{ "primaryKey": ["sample_id", "timestamp"], "properties": { "sample_id": { "type": "string" }, "timestamp": { "type": "string", "format": "date-time" }, "value": { "type": "number" } } }

uniqueKeys
An optional array of unique key constraints. Each unique key is an array of column names whose combined values MUST be unique across all rows in the table. Each unique key array MUST contain at least one column name.
For example, with multiple unique constraints:
{ "uniqueKeys": [ ["email"], ["username"], ["department", "employee_number"] ], "properties": { "email": { "type": "string", "format": "email" }, "username": { "type": "string" }, "department": { "type": "string" }, "employee_number": { "type": "integer" } } }

foreignKeys
An optional array of foreign key constraints that define relationships between this table and other tables. Each foreign key specifies local columns and their reference to columns in another resource (identified by resource name in the dataset context).
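A foreign key can be read as "every local key combination must also exist in the referenced columns". The Python sketch below shows one way to test that. The function name `foreign_key_violations`, the `resources` mapping from resource name to rows, and the decision to skip keys containing `None` are all assumptions of this illustration rather than behavior stated by the specification.

```python
def foreign_key_violations(rows: list, foreign_key: dict, resources: dict) -> list:
    """Return local key tuples that have no match in the referenced rows."""
    reference = foreign_key["reference"]
    # A reference without "resource" points back at the same table.
    target_rows = resources[reference["resource"]] if "resource" in reference else rows
    known = {tuple(row[name] for name in reference["columns"]) for row in target_rows}
    local = (tuple(row[name] for name in foreign_key["columns"]) for row in rows)
    # Assumption: keys with a missing (None) component are not checked.
    return [key for key in local if None not in key and key not in known]


customers = [{"id": 1}, {"id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 5.0},
    {"order_id": 11, "customer_id": 3, "amount": 7.5},
]
foreign_key = {
    "columns": ["customer_id"],
    "reference": {"resource": "customers", "columns": ["id"]},
}
print(foreign_key_violations(orders, foreign_key, {"customers": customers}))
```

Tuples are compared position by position, so a composite key with local columns such as supplier_id and product_code lines up with the referenced columns in the order they are listed.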
For example, referencing another resource:
{ "foreignKeys": [ { "columns": ["customer_id"], "reference": { "resource": "customers", "columns": ["id"] } } ], "properties": { "order_id": { "type": "integer" }, "customer_id": { "type": "integer" }, "amount": { "type": "number" } } }
For example, with a self-reference (omitting resource):
{ "foreignKeys": [ { "columns": ["parent_id"], "reference": { "columns": ["id"] } } ], "properties": { "id": { "type": "integer" }, "parent_id": { "type": "integer" }, "name": { "type": "string" } } }
For example, with a composite foreign key:
{ "foreignKeys": [ { "columns": ["supplier_id", "product_code"], "reference": { "resource": "catalog", "columns": ["supplier_id", "code"] } } ] }

missingValues
An optional list of values that represent missing or null data across all columns in the table. Each item can be either a simple value or an object with value and label properties for documentation purposes. Each value MUST be a string or an integer.
For example, with simple values:
{ "missingValues": ["NA", "N/A", "", -999] }
For example, with labeled values:
{ "missingValues": [ { "value": "NA", "label": "Not Available" }, { "value": "NR", "label": "Not Recorded" }, { "value": -999, "label": "Sensor Error" } ] }
For example, with mixed values:
{ "missingValues": [ "NA", { "value": -999, "label": "Sensor Error" } ] }

Column
A column definition that specifies the data type, constraints, and metadata for a table column. The schema is routed based on the type property and optionally the format property to determine which specific column type applies.
type
The data type of the column. It MUST be one of the following values:
boolean - True/false values
integer - Whole numbers
number - Numeric values
string - Text values
array - Array/list values
object - Object/dictionary values
If a column allows missing values, the type can include null (order insensitive):
["boolean", "null"] - True/false values or missing values
["integer", "null"] - Whole numbers or missing values
["number", "null"] - Numeric values or missing values
["string", "null"] - Text values or missing values
["array", "null"] - Array/list values or missing values
["object", "null"] - Object/dictionary values or missing values
Any other value of the type property indicates that the column type is Unknown.
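The rules for the type property can be summarized as a small classification step. The Python sketch below follows a strict reading of the lists above: a single type name, or a two-item list pairing one type name with null in either order. The function name `resolve_type` and its return shape are hypothetical, and the sketch only resolves the base type, not the format-specific column type.

```python
BASE_TYPES = {"boolean", "integer", "number", "string", "array", "object"}


def resolve_type(column: dict) -> tuple:
    """Return (base type or "unknown", whether missing values are allowed)."""
    declared = column.get("type")
    if isinstance(declared, str) and declared in BASE_TYPES:
        return (declared, False)
    if isinstance(declared, list) and len(declared) == 2 and "null" in declared:
        other = [name for name in declared if name != "null"]
        if len(other) == 1 and isinstance(other[0], str) and other[0] in BASE_TYPES:
            return (other[0], True)
    # Anything else, including a missing type, is treated as Unknown.
    return ("unknown", False)


print(resolve_type({"type": "integer"}))
print(resolve_type({"type": ["null", "string"]}))
print(resolve_type({"title": "Column"}))
```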
Metadata example:
{ "properties": { "age": { "type": "integer" }, "title": { "type": ["string", "null"] } } }
Data example:
age
25
32
18

title
An optional human-readable title for the column. It MUST be a string.
Metadata example:
{ "properties": { "temp_c": { "type": "number", "title": "Temperature (Celsius)" } } }
Data example:
temp_c
23.5
-10.2
98.6

description
An optional detailed description of the column. It MUST be a string.
Metadata example:
{ "properties": { "pressure": { "type": "number", "description": "Atmospheric pressure measured in hectopascals (hPa) at the time of observation." } } }
Data example:
pressure
1013.25
1020.50
995.30

rdfType
An optional property that provides a richer, “semantic” description of the type of data in a column. The value MUST be the URI of an RDF Class, that is, an instance or subclass of the RDF Schema Class object.
Metadata example:
{ "properties": { "country": { "type": "string", "rdfType": "http://schema.org/Country" } } }
Data example:
country
US
UK
DE
FR

enum
An optional array of allowed values for the column. The values MUST match the column’s type.
For example, with string values:
{ "properties": { "status": { "type": "string", "enum": ["pending", "active", "completed", "cancelled"] } } }
For example, with integer values:
{ "properties": { "priority": { "type": "integer", "enum": [1, 2, 3, 4, 5] } } }
Data example:
status
pending
active
completed
cancelled

const
An optional constant value for the column. The value MUST match the column’s type.
For example, with string values:
{ "properties": { "status": { "type": "string", "const": "pending" } } }
For example, with integer values:
{ "properties": { "priority": { "type": "integer", "const": 1 } } }
Data example:
status
pending
pending

default
An optional default value for the column. The value MUST match the column’s type. This property is for documentation purposes only; it is not used to fill missing values.
Metadata example:
{ "properties": { "status": { "type": "string", "default": "pending", "missingValues": ["N/A"] } } }
Data example:
status
done
N/A

examples
An optional array of example values for the column. The values MUST match the column’s type and can be used for documentation, testing, or generating sample data.
Metadata example:
{ "properties": { "temperature": { "type": "number", "examples": [20.5, 25.3, 18.7] } } }
Data example:
temperature
20.5
25.3
18.7

missingValues
An optional column-level list of values that represent missing or null data for this column. Each item can be either a simple value or an object with value and label properties for documentation purposes. The type of each missing value MUST be:
"string" or "integer" for boolean, integer, and number columns
"string" for all other columns
If table-level missing values are provided, the effective missing values MUST include all the column-level values and all the compatible table-level values.
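The merge rule above can be sketched in a few lines of Python. The helper names `plain_values` and `effective_missing_values` are hypothetical, and "compatible" is interpreted here using the type rule stated just above: strings are accepted by every column, integers only by boolean, integer, and number columns.

```python
NUMERIC_LIKE = {"boolean", "integer", "number"}


def plain_values(items: list) -> list:
    """Unwrap {"value": ..., "label": ...} items into bare values."""
    return [item["value"] if isinstance(item, dict) else item for item in items]


def effective_missing_values(column_type: str, column_level: list, table_level: list) -> list:
    """Column-level values plus the table-level values compatible with the column type."""

    def compatible(value) -> bool:
        if isinstance(value, str):
            return True
        # bool is a subclass of int in Python, so it is excluded explicitly.
        is_integer = isinstance(value, int) and not isinstance(value, bool)
        return column_type in NUMERIC_LIKE and is_integer

    merged = plain_values(column_level)
    for value in plain_values(table_level):
        if compatible(value) and value not in merged:
            merged.append(value)
    return merged


table_level = ["NA", "", -999]
print(effective_missing_values("string", [{"value": "unknown", "label": "Not known"}], table_level))
print(effective_missing_values("number", [-1], table_level))
```

For the string column the table-level integer -999 is dropped, while the number column keeps it.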
Metadata example:
{ "properties": { "measurement": { "type": "number", "missingValues": [ { "value": -999, "label": "Sensor malfunction" }, { "value": "NA", "label": "Not measured" } ] } } }
Data example:
measurement
25.3
-999
NA
42.1

Column Types

Boolean
A column for true/false values. It MUST have type set to "boolean" and MUST NOT have a format property.
Metadata example:
{ "properties": { "is_active": { "type": "boolean" } } }
Data example:
is_active
true
false
true
Type properties:

Categorical
A column for categorical values. It MUST have type set to "integer" or "string" and format set to "categorical".
Metadata example:
{ "properties": { "severity": { "type": "integer", "format": "categorical", "categories": [ { "value": 1, "label": "Low" }, { "value": 2, "label": "Medium" }, { "value": 3, "label": "High" } ] } } }
Data example:
severity
1
2
3
1
Type properties:
In addition, type properties if type is "integer":
In addition, type properties if type is "string":

Integer
A column for whole number values. It MUST have type set to "integer" and MUST NOT have a format property.
Metadata example:
{ "properties": { "age": { "type": "integer" } } }
Data example:
age
25
32
18
Type properties:

Number
A column for numeric values including decimals. It MUST have type set to "number" and MUST NOT have a format property.
Metadata example:
{ "properties": { "temperature": { "type": "number" } } }
Data example:
temperature
23.5
-10.2
98.6
Type properties:

Decimal
A column for decimal values. It MUST have type set to "string" and format set to "decimal".
Metadata example:
{ "properties": { "price": { "type": "string", "format": "decimal" } } }
Data example:
price
19.99
5.50
123.45
Type properties:
minLength
maxLength
pattern
minimum
maximum
exclusiveMinimum
exclusiveMaximum
multipleOf
decimalChar
groupChar
withText

String
A column for text values. It MUST have type set to "string" and MUST NOT have a format property.
Metadata example:
{ "properties": { "name": { "type": "string" } } }
Data example:
name
Alice
Bob
Charlie
Type properties:

List
A column for delimited list values stored as strings. It MUST have type set to "string" and format set to "list".
Metadata example:
{ "properties": { "tags": { "type": "string", "format": "list" } } }
Data example:
tags
"red,blue,green"
"small,compact"
"new,sale,featured"
Type properties:

URL
A column for URLs with HTTP/HTTPS protocol. It MUST have type set to "string" and format set to "url".
Metadata example:
{ "properties": { "homepage": { "type": "string", "format": "url" } } }
Data example:
homepage
https://example.com
https://example.org/page
https://domain.net/path/to/resource
Type properties:

Email
A column for email addresses. It MUST have type set to "string" and format set to "email".
Metadata example:
{ "properties": { "contact_email": { "type": "string", "format": "email" } } }
Data example:
contact_email
alice@example.com
bob@company.org
charlie@domain.net
Type properties:

Date
A column for ISO 8601 date values. It MUST have type set to "string" and format set to "date".
Metadata example:
{ "properties": { "birth_date": { "type": "string", "format": "date" } } }
Data example:
birth_date
2023-12-01
1990-06-15
2005-03-20
Type properties:

Time
A column for ISO 8601 time values. It MUST have type set to "string" and format set to "time".
Metadata example:
{ "properties": { "start_time": { "type": "string", "format": "time" } } }
Data example:
start_time
14:30:00
09:45:30
18:00:00
Type properties:

DateTime
A column for ISO 8601 date with time values. It MUST have type set to "string" and format set to "date-time".
Metadata example:
{ "properties": { "created_at": { "type": "string", "format": "date-time" } } }
Data example:
created_at
2023-12-01T14:30:00Z
2024-01-15T09:45:30+00:00
2024-03-20T18:00:00-05:00
Type properties:

Duration
A column for ISO 8601 duration values. It MUST have type set to "string" and format set to "duration".
Metadata example:
{ "properties": { "elapsed_time": { "type": "string", "format": "duration" } } }
Data example:
elapsed_time
PT1H30M
P1DT12H
PT45M30S
Type properties:

WKT
A column for Well-Known Text (WKT) geometry data. It MUST have type set to "string" and format set to "wkt".
Metadata example:
{ "properties": { "geometry": { "type": "string", "format": "wkt" } } }
Data example:
geometry
"POINT (30 10)"
"LINESTRING (30 10, 10 30, 40 40)"
"POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))"
Type properties:

WKB
A column for Well-Known Binary (WKB) geometry data. It MUST have type set to "string" and format set to "wkb".
Metadata example:
{ "properties": { "geometry": { "type": "string", "format": "wkb" } } }
Data example:
geometry
0101000000000000000000000000000000000024400000000000003E40
0102000000030000000000000000003E400000000000002440
0103000000010000000500000000000000000024400000000000003E40
Type properties:

Hex
A column for hexadecimal encoded data. It MUST have type set to "string" and format set to "hex".
Metadata example:
{ "properties": { "color": { "type": "string", "format": "hex" } } }
Data example:
color
FF5733
00BFFF
32CD32
Type properties:

Base64
A column for Base64 encoded binary data. It MUST have type set to "string" and format set to "base64".
Metadata example:
{ "properties": { "thumbnail": { "type": "string", "format": "base64" } } }
Data example:
thumbnail
iVBORw0KGgoAAAANSUhEUgAAAAUA
R0lGODlhAQABAIAAAAAAAP///yH5
aGVsbG8gd29ybGQ=
Type properties:

Array
A column for array/list values. It MUST have type set to "array" and MUST NOT have a format property.
Metadata example:
{ "properties": { "coordinates": { "type": "array" } } }
Data example:
coordinates
"[1.5, 2.3]"
"[10, 20, 30]"
"[-5.2, 8.9, 12.1]"
Type properties:

Object
A column for object/dictionary values. It MUST have type set to "object" and MUST NOT have a format property.
Metadata example:
{ "properties": { "metadata": { "type": "object" } } }
Data example:
metadata
"{""author"": ""John"", ""version"": 1}"
"{""author"": ""Jane"", ""version"": 2}"
"{""author"": ""Bob"", ""version"": 1}"
Type properties:

GeoJSON
A column for GeoJSON geometry objects. It MUST have type set to "object" and format set to "geojson".
Metadata example:
{ "properties": { "location": { "type": "object", "format": "geojson" } } }
Data example:
location
"{""type"": ""Point"", ""coordinates"": [30, 10]}"
"{""type"": ""LineString"", ""coordinates"": [[30, 10], [10, 30], [40, 40]]}"
"{""type"": ""Polygon"", ""coordinates"": [[[30, 10], [40, 40], [20, 40], [10, 20], [30, 10]]]}"
Type properties:

TopoJSON
A column for TopoJSON geometry objects. It MUST have type set to "object" and format set to "topojson".
Metadata example:
{ "properties": { "topology": { "type": "object", "format": "topojson" } } }
Data example:
topology
"{""type"": ""Topology"", ""objects"": {""example"": {""type"": ""Point"", ""coordinates"": [0, 0]}}}"
"{""type"": ""Topology"", ""arcs"": [[[0, 0], [1, 1]]], ""objects"": {""line"": {""type"": ""LineString"", ""arcs"": [0]}}}"
"{""type"": ""Topology"", ""objects"": {""polygon"": {""type"": ""Polygon"", ""arcs"": [[0]]}}}"
Type properties:

Unknown
A column for values of unknown type. It MUST have a type that is not supported by the column types above.
Metadata example:
{ "properties": { "column": { "title": "Column", "description": "Column description" } } }
Data example:
column
a
1
false

Type Properties

format
An optional format qualifier that specifies a more specific subtype of the base type.
Metadata example:
{ "properties": { "email": { "type": "string", "format": "email" } } }
Data example:
email
alice@example.com
bob@company.org
charlie@domain.net

trueValues
An optional array of string values that SHOULD be interpreted as true when parsing data. It MUST be an array of strings.
Metadata example:
{ "properties": { "is_active": { "type": "boolean", "trueValues": ["yes", "true", "1", "Y"] } } }
Data example:
is_active
yes
true
1
Y

falseValues
An optional array of string values that SHOULD be interpreted as false when parsing data. It MUST be an array of strings.
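Taken together, trueValues and falseValues describe a lookup from raw cell text to a boolean. The Python sketch below shows one possible reading. The function name `parse_boolean` is hypothetical, and the fallback lists `["true"]` and `["false"]` are an assumption of this sketch, since the text above does not state defaults.

```python
from typing import Optional


def parse_boolean(text: str, column: dict) -> Optional[bool]:
    """Map a raw cell to True or False using the column's trueValues/falseValues."""
    # Assumed defaults; the specification text does not define them here.
    true_values = column.get("trueValues", ["true"])
    false_values = column.get("falseValues", ["false"])
    if text in true_values:
        return True
    if text in false_values:
        return False
    return None  # the cell is not recognised as a boolean


column = {"type": "boolean", "trueValues": ["yes", "Y"], "falseValues": ["no", "N"]}
print([parse_boolean(cell, column) for cell in ["yes", "N", "maybe"]])
```

Matching is exact and case-sensitive in this sketch; a real parser may choose to normalize whitespace or case first.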
Metadata example:
{ "properties": { "is_active": { "type": "boolean", "falseValues": ["no", "false", "0", "N"] } } }
Data example:
is_active
no
false
0
N

minimum
An optional minimum value constraint (inclusive). The type MUST match the column type.
Metadata example:
{ "properties": { "temperature": { "type": "number", "minimum": -273.15 } } }
Data example:
temperature
-200.5
25.3
100.0

maximum
An optional maximum value constraint (inclusive). The type MUST match the column type.
Metadata example:
{ "properties": { "temperature": { "type": "number", "maximum": 1000 } } }
Data example:
temperature
25.5
100.0
999.9

exclusiveMinimum
An optional minimum value constraint (exclusive). The type MUST match the column type.
Metadata example:
{ "properties": { "probability": { "type": "number", "exclusiveMinimum": 0 } } }
Data example:
probability
0.1
0.5
0.999

exclusiveMaximum
An optional maximum value constraint (exclusive). The type MUST match the column type.
Metadata example:
{ "properties": { "probability": { "type": "number", "exclusiveMaximum": 1 } } }
Data example:
probability
0.001
0.5
0.999

multipleOf
An optional constraint that values MUST be a multiple of this number. For integers, it MUST be a positive integer. For numbers, it MUST be a positive number.
Metadata example:
{ "properties": { "price": { "type": "number", "multipleOf": 0.01 } } }
Data example:
price
10.00
25.50
99.99

decimalChar
An optional single character used as the decimal separator in the data.
Metadata example:
{ "properties": { "price": { "type": "number", "decimalChar": "," } } }
Data example:
price
19,99
5,50
123,45

groupChar
An optional single character used as the thousands separator in the data. It MUST be a string of length 1.
Metadata example:
{ "properties": { "population": { "type": "integer", "groupChar": "," } } }
Data example:
population
1,234,567
890,123
45,678

withText
An optional boolean indicating whether numeric values may include non-numeric text that should be stripped during parsing.
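The three parsing hints decimalChar, groupChar, and withText can be combined in a single normalization step. The Python sketch below is one hedged interpretation: the function name `parse_number` is hypothetical, the default decimal separator "." is an assumption, and "stripping text" is implemented as removing every character that is not a digit, a sign, or one of the two separators.

```python
import re


def parse_number(text: str, column: dict) -> float:
    """Normalize a raw cell according to decimalChar, groupChar, and withText."""
    decimal_char = column.get("decimalChar", ".")  # assumed default
    group_char = column.get("groupChar")
    if column.get("withText"):
        # Keep digits, signs, and the separator characters; drop everything else.
        allowed = "0-9+\\-" + re.escape(decimal_char) + re.escape(group_char or "")
        text = re.sub(f"[^{allowed}]", "", text)
    if group_char:
        text = text.replace(group_char, "")
    return float(text.replace(decimal_char, "."))


print(parse_number("$19.99", {"withText": True}))
print(parse_number("1,234,567", {"groupChar": ","}))
print(parse_number("1.234,56 €", {"decimalChar": ",", "groupChar": ".", "withText": True}))
```

The group separator is removed before the decimal separator is converted, so a column that uses "." for grouping and "," for decimals is handled in the right order.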
Metadata example:
{ "properties": { "price": { "type": "number", "withText": true } } }
Data example:
price
$19.99
€25.50
£12.34

minLength
An optional minimum length constraint for string values. It MUST be a non-negative integer.
Metadata example:
{ "properties": { "username": { "type": "string", "minLength": 3 } } }
Data example:
username
alice
bob123
charlie

maxLength
An optional maximum length constraint for string values. It MUST be a non-negative integer.
Metadata example:
{ "properties": { "username": { "type": "string", "maxLength": 20 } } }
Data example:
username
alice
bob
charlie

pattern
An optional regular expression pattern that values MUST match. It MUST be a valid regex string.
Metadata example:
{ "properties": { "product_code": { "type": "string", "pattern": "^[A-Z]{3}-[0-9]{4}$" } } }
Data example:
product_code
ABC-1234
XYZ-5678
DEF-9012

categories
An optional array of categorical values with optional labels. Each item can be either a simple value or an object with value and label properties. The values MUST have the same type as the containing property, that is, "string" or "integer".
Metadata example:
{ "properties": { "severity": { "type": "integer", "format": "categorical", "categories": [ { "value": 1, "label": "Low" }, { "value": 2, "label": "Medium" }, { "value": 3, "label": "High" } ] } } }
Data example:
severity
1
2
3
1

withOrder
An optional boolean indicating that the categorical values in the column have a natural order.
Metadata example:
{ "properties": { "severity": { "type": "integer", "format": "categorical", "withOrder": true, "categories": [ { "value": 1, "label": "Low" }, { "value": 2, "label": "Medium" }, { "value": 3, "label": "High" } ] } } }
Data example:
severity
1
2
3
1

delimiter
An optional single character used to delimit items in a list column.
Metadata example:
{ "properties": { "tags": { "type": "string", "format": "list", "delimiter": ";" } } }
Data example:
tags
red;green;blue
alpha;beta
gamma;delta;epsilon

itemType
An optional type for items in a list column. It MUST be one of: string, integer, number, boolean, date-time, date, time.
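A list column can be parsed by splitting on the delimiter and converting each item according to itemType. The following Python sketch illustrates this under several assumptions: the names `ITEM_PARSERS` and `parse_list` are hypothetical, the delimiter falls back to "," as in the List examples, items default to strings, and booleans are assumed to be written as `true` or `false`.

```python
from datetime import date, datetime, time

# One converter per allowed itemType value (the converters are illustrative).
ITEM_PARSERS = {
    "string": str,
    "integer": int,
    "number": float,
    "boolean": lambda text: {"true": True, "false": False}[text],
    "date-time": datetime.fromisoformat,
    "date": date.fromisoformat,
    "time": time.fromisoformat,
}


def parse_list(text: str, column: dict) -> list:
    """Split a raw cell on the column delimiter and convert every item."""
    delimiter = column.get("delimiter", ",")  # assumed default
    parse_item = ITEM_PARSERS[column.get("itemType", "string")]
    return [parse_item(item) for item in text.split(delimiter)]


print(parse_list("1.5,2.3,4.7", {"itemType": "number"}))
print(parse_list("red;green;blue", {"delimiter": ";"}))
```

The length of the returned list is what minItems and maxItems would be compared against.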
Metadata example:
{ "properties": { "measurements": { "type": "string", "format": "list", "itemType": "number" } } }
Data example:
measurements
"1.5,2.3,4.7"
"10.2,15.8"
"3.14,2.71,1.41"

minItems
An optional minimum number of items for the column. It MUST be a non-negative integer.
Metadata example:
{ "properties": { "tags": { "type": "string", "format": "list", "minItems": 1 } } }
Data example:
tags
"red,blue,green"
"small,compact"
"new,sale,featured"

maxItems
An optional maximum number of items for the column. It MUST be a non-negative integer.
Metadata example:
{ "properties": { "tags": { "type": "string", "format": "list", "maxItems": 3 } } }
Data example:
tags
"red,blue,green"
"small,compact"
"new,sale,featured"

temporalFormat
An optional string specifying the temporal format pattern as per the strftime specification.
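As a rough illustration of how a temporalFormat pattern might be applied to a date column, the Python sketch below uses the standard library's `datetime.strptime`, whose directives resemble strftime patterns but are not guaranteed to match every implementation. The function name `parse_date` is hypothetical, and falling back to ISO 8601 parsing when no pattern is given is an assumption based on the Date column description.

```python
from datetime import date, datetime


def parse_date(text: str, column: dict) -> date:
    """Parse a date cell with the column's temporalFormat, or as ISO 8601 without one."""
    pattern = column.get("temporalFormat")
    if pattern is None:
        return date.fromisoformat(text)
    return datetime.strptime(text, pattern).date()


column = {"type": "string", "format": "date", "temporalFormat": "%m/%d/%Y"}
print(parse_date("01/15/2024", column))
print(parse_date("2023-12-01", {"type": "string", "format": "date"}))
```

A value that does not fit the pattern raises `ValueError`, which a validator could report as a cell error.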
Metadata example:
{ "properties": { "collection_date": { "type": "string", "format": "date", "temporalFormat": "%m/%d/%Y" } } }
Data example:
collection_date
01/15/2024
03/22/2024
12/31/2023

<jsonSchema>
For array and object column types, all properties from JSON Schema Draft 2020-12 are supported to define the structure and validation rules.
For example, with an array column:
{ "properties": { "coordinates": { "type": "array", "items": { "type": "number" }, "minItems": 2, "maxItems": 3 } } }
Data example:
coordinates
"[1.5, 2.3]"
"[10.2, 15.8, 20.5]"
"[3.14, 2.71]"
For example, with an object column:
{ "properties": { "metadata": { "type": "object", "properties": { "author": { "type": "string" }, "version": { "type": "integer" } }, "required": ["author"] } } }
Data example:
metadata
"{""author"": ""Alice"", ""version"": 1}"
"{""author"": ""Bob"", ""version"": 2}"
"{""author"": ""Charlie""}"

Common
Common properties shared by multiple entities in the descriptor.
External Path
It MUST be a string representing an HTTP or HTTPS URL to a remote file.
For example:
{ "data": "https://example.com/datasets/measurements.csv" }

Extension
Fairspec Table Schema does not support extensions.