GET /extract/{extraction_id}

Response structure

The response contains extracted data organized hierarchically. Each entry under extracted_data.fields is one of:

  • A primitive field (like invoice_number)
  • An object containing nested fields (like customer)
  • An array of objects (like line_items)

Each primitive field contains:

  • name: Field identifier
  • value: Extracted content
  • confidence: Confidence score for the extracted value, from 0 to 1
  • field_type: Data type (string, number, date, phone, email)
  • page_index: Page from which the value was extracted
  • source_context: Extraction context details

See below for an example with primitives, objects and an array:

{
  "schema_name": "string",
  "schema_description": "string",
  "extracted_data": {
    "name": "Invoice",
    "fields": {
      "invoice_number": {
        "name": "invoice_number",
        "value": "INV-12322",
        "field_type": "string",
        "confidence": 0.95,
        "source_context": "string",
        "page_index": 1
      },
      "customer": {
        "name": {
          "name": "name",
          "value": "Acme",
          "field_type": "string",
          "confidence": 0.95,
          "source_context": "string",
          "page_index": 1
        },
        "email": {
          "name": "email",
          "value": "acme@email.com",
          "field_type": "email",
          "confidence": 0.95,
          "source_context": "string",
          "page_index": 1
        }
      },
      "line_items": [
        {
          "description": {
            "name": "description",
            "value": "Services for SEO",
            "field_type": "string",
            "confidence": 0.95,
            "source_context": "string",
            "page_index": 1
          },
          "amount": {
            "name": "amount",
            "value": 200,
            "field_type": "number",
            "confidence": 0.95,
            "source_context": "string",
            "page_index": 1
          }
        }
      ]
    },
    "field_type": "object"
  }
}
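
Because primitives, objects and arrays can nest, client code usually has to walk the structure recursively. The following is a minimal Python sketch of one way to do that, not an official client. The helper names (is_primitive, iter_primitives, low_confidence) are hypothetical, and the check that a primitive is a dict whose "field_type" is a string and that has a "value" key is an assumption drawn from the example above.

```python
from typing import Any, Dict, Iterator, List, Tuple


def is_primitive(node: Any) -> bool:
    """Heuristic (assumption): a primitive field is a dict that has a
    'value' key and a string 'field_type', as in the example above."""
    return (
        isinstance(node, dict)
        and "value" in node
        and isinstance(node.get("field_type"), str)
    )


def iter_primitives(node: Any, path: str = "") -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (path, field) for every primitive field found under node."""
    if is_primitive(node):
        yield path, node
    elif isinstance(node, dict):
        # Nested object: descend into each named child.
        for key, child in node.items():
            yield from iter_primitives(child, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        # Array of objects: descend into each element by index.
        for index, child in enumerate(node):
            yield from iter_primitives(child, f"{path}[{index}]")


def low_confidence(fields: Any, threshold: float) -> List[str]:
    """Return the paths of primitive fields whose confidence is below threshold."""
    return [
        path
        for path, field in iter_primitives(fields)
        if field.get("confidence", 0.0) < threshold
    ]
```

With a parsed response body in a variable named response, iter_primitives(response["extracted_data"]["fields"]) would yield paths such as invoice_number, customer.email and line_items[0].amount, each paired with the corresponding field dict.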

Headers

Authorization
string
required

Bearer token for authentication

Path Parameters

extraction_id
string
required

ID of the extraction job
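
As an illustration of how the header and path parameter fit together, here is a minimal Python sketch using only the standard library. The base URL https://api.example.com is a placeholder, the function names are hypothetical, and the "Bearer <token>" header format is an assumption based on the description of the Authorization header above.

```python
import json
import urllib.request
from typing import Any, Dict
from urllib.parse import quote

# Placeholder (assumption): replace with the real API host.
BASE_URL = "https://api.example.com"


def build_request(extraction_id: str, token: str,
                  base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the GET /extract/{extraction_id} request without sending it."""
    # Percent-encode the ID so it is safe to place in the URL path.
    url = f"{base_url}/extract/{quote(extraction_id, safe='')}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def get_extraction(extraction_id: str, token: str,
                   base_url: str = BASE_URL) -> Dict[str, Any]:
    """Send the request and return the parsed JSON body."""
    request = build_request(extraction_id, token, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Keeping build_request separate from get_extraction makes it possible to inspect the URL and headers without any network access.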

Response

200 - application/json
extraction_id
string
required

ID of the extraction job

file_id
string
required

ID of the file being processed

status
enum<string>
required

Status of the extraction job

Available options: PENDING, PROCESSING, COMPLETED, FAILED

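The four status values suggest that a job moves from PENDING through PROCESSING to either COMPLETED or FAILED. Assuming clients poll this endpoint until a terminal status appears (the page does not state this), a minimal Python sketch of such a loop could look like the following. The function wait_for_extraction and its parameters are hypothetical; fetch stands for any callable that returns the parsed response body.

```python
import time
from typing import Any, Callable, Dict

# Assumption: COMPLETED and FAILED are the only terminal statuses.
TERMINAL_STATUSES = {"COMPLETED", "FAILED"}


def wait_for_extraction(
    fetch: Callable[[], Dict[str, Any]],
    interval: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch() until the job reaches a terminal status.

    Raises TimeoutError if no terminal status is seen within max_attempts.
    """
    for attempt in range(max_attempts):
        job = fetch()
        if job["status"] in TERMINAL_STATUSES:
            return job
        # Wait before the next poll, but not after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"job still not finished after {max_attempts} attempts")
```

The caller still has to check whether the returned status is COMPLETED or FAILED before reading extracted_data.
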
created_at
string
required

When the job was created

schema_name
string
required

Name of the schema used for extraction

schema_description
string
required

Description of the schema used for extraction

extracted_data
object
required

Object containing extracted data. See above for more information about the structure of this object.

processing_metadata
object
required