Defining your schema
Overview
Our API allows you to extract data in the format that you need by defining a custom schema for each request. This guide explains how to structure your schema for optimal extraction results.
Schema Structure
A document schema is defined using a structured format that specifies the fields and their properties. Here’s a basic example:
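The exact schema syntax accepted by the API is not reproduced here, so the following is only a minimal sketch. It assumes a schema is a list of field definitions, and the key names `name`, `type`, and `description` are hypothetical placeholders rather than confirmed parameter names. The field names themselves are made up for illustration.

```python
import json

# Hypothetical layout: the keys "name", "type", and "description" are
# assumptions for illustration, not the API's confirmed format.
schema = [
    {"name": "customer_name", "type": "string",
     "description": "Full name of the customer"},
    {"name": "total_amount", "type": "number",
     "description": "Invoice total, e.g. 1250.00"},
    {"name": "invoice_date", "type": "date",
     "description": "Issue date, e.g. 2024-01-31"},
]

# Serialize the schema so it could be sent along with a request.
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```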
Primitive fields
- string: Generic text data
- number: Numeric values
- email: Email addresses
- phone: Phone numbers
- date: Date values
Objects and arrays
- object: Nested object containing additional fields, where each field is a primitive field.
- array: List of items where each element is an object. As with objects, each field within an element can be any one of the primitive fields.
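The type rules above can be checked before a schema is sent. The sketch below is a client-side helper, not part of the API. It assumes a hypothetical field layout in which each field is a dict with `name` and `type` keys, and in which objects and arrays list their nested fields under a `fields` key.

```python
# The five primitive types named in this guide.
PRIMITIVE_TYPES = {"string", "number", "email", "phone", "date"}


def validate_field(field):
    """Return a list of problems found in one field definition.

    The dict keys "name", "type", and "fields" are assumed for
    illustration; they are not confirmed API parameter names.
    """
    problems = []
    name = field.get("name", "<unnamed>")
    ftype = field.get("type")

    if ftype in PRIMITIVE_TYPES:
        return problems

    if ftype in ("object", "array"):
        children = field.get("fields", [])
        if not children:
            problems.append(f"{name}: {ftype} has no nested fields")
        for child in children:
            # Fields inside an object or an array element must be primitive.
            if child.get("type") not in PRIMITIVE_TYPES:
                child_name = child.get("name", "<unnamed>")
                problems.append(
                    f"{name}.{child_name}: nested fields must be primitive"
                )
        return problems

    problems.append(f"{name}: unknown type {ftype!r}")
    return problems
```

An empty list means the field passed these checks; each string in a non-empty list describes one violation.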
Working with Objects
Use objects when you need to group related fields together. Here’s how to structure an object type:
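As a rough illustration, an object field that groups the parts of an address might look like the sketch below. The keys `name`, `type`, `description`, and `fields` are assumed for illustration and may not match the API's actual syntax. The field names are hypothetical.

```python
# Hypothetical object field: key names are assumptions, not confirmed syntax.
address_field = {
    "name": "billing_address",
    "type": "object",
    "description": "Billing address printed on the document",
    "fields": [
        {"name": "street", "type": "string",
         "description": "Street name and number, e.g. 12 Main St"},
        {"name": "city", "type": "string",
         "description": "City or town"},
        {"name": "contact_phone", "type": "phone",
         "description": "Phone number listed with the address"},
    ],
}

# Every nested field uses a primitive type, as objects require.
nested_types = [f["type"] for f in address_field["fields"]]
print(nested_types)
```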
Working with Arrays
Use arrays when you need to extract repeating elements, such as line items in an invoice:
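A sketch of an array field for invoice line items follows. It assumes the same hypothetical layout as an object field, with the `fields` key describing the object found in each element of the list. None of these key names are confirmed by the API.

```python
# Hypothetical array field: "fields" describes each element of the list.
line_items_field = {
    "name": "line_items",
    "type": "array",
    "description": "One entry per row in the invoice's item table",
    "fields": [
        {"name": "item_description", "type": "string",
         "description": "Text describing the product or service"},
        {"name": "quantity", "type": "number",
         "description": "Number of units, e.g. 3"},
        {"name": "unit_price", "type": "number",
         "description": "Price per unit, e.g. 19.99"},
    ],
}

# Names of the fields expected in every extracted line item.
item_field_names = [f["name"] for f in line_items_field["fields"]]
print(item_field_names)
```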
Response Format
The extraction system will return results in the following format:
Best Practices
Field Names
- Use clear, descriptive names
- Use snake_case for consistency
- Avoid special characters
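The naming rules can be enforced with a small local check. This is a sketch of one reasonable definition of snake_case (lowercase letters and digits, words separated by single underscores, starting with a letter); the API itself may accept a wider set of names.

```python
import re

# One possible definition of snake_case; an assumption, not an API rule.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def is_snake_case(name: str) -> bool:
    """Return True if the whole name matches the snake_case pattern."""
    return SNAKE_CASE.fullmatch(name) is not None


for candidate in ["invoice_date", "InvoiceDate", "total-amount"]:
    print(candidate, is_snake_case(candidate))
```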
Descriptions
- Provide detailed descriptions
- Include format examples
- Specify any expected patterns
Nested Structures
- Keep nesting depth reasonable (max 3-4 levels)
- Use objects for logical grouping
- Use arrays for repeated structures
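Nesting depth can be measured locally before a schema is submitted. The helper below is a sketch that assumes nested fields are stored under a hypothetical `fields` key. How levels are counted is also an assumption: a flat list of primitive fields counts as depth 1, and each object or array level adds one.

```python
def nesting_depth(fields):
    """Return the number of nested levels in a list of field definitions.

    An empty list has depth 0. The "fields" key is an assumed layout,
    not confirmed API syntax.
    """
    if not fields:
        return 0
    return 1 + max(nesting_depth(f.get("fields", [])) for f in fields)


# Hypothetical schema: one primitive field and one object with a nested field.
example_schema = [
    {"name": "invoice_date", "type": "date"},
    {"name": "vendor", "type": "object",
     "fields": [{"name": "vendor_name", "type": "string"}]},
]

depth = nesting_depth(example_schema)
print(depth)
```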