API Reference

Standardize Documents V2

Standardize a batch of documents, either by passing a list of Document IDs or by passing a dataset name. Pass a schemaId to standardize the documents using a specific structure, or leave it empty to let the AI create an ad-hoc structure as it sees fit. Standardization handles lists (arrays) by splitting documents into smaller sub-documents behind the scenes; the AI will do its best to decide how and when it is appropriate to split.

Advanced: You can specify the following parameters. By default they are left at auto, which lets the AI decide.

  1. displayMode - Controls how the AI sees the document. The options are:
    • auto - Automatically determine the best mode based on the document content.
    • spatial - Represent text in the document according to its spatial layout.
    • sections - Represent the document as a list of sections (paragraphs, tables, images, etc.) as seen in the web UX.
  2. splitMode - Controls how the AI splits the document into sub-documents. The options are:
    • auto - Automatically determine the best mode based on the document content.
    • all - Split the document into single-page sub-documents, so each page is handled separately.
    • never - Do not split the document at all, so the entire document is handled as a single unit. This can lead to poor performance for long documents, or documents with lots of dense data that needs to be extracted.
  3. effortLevel - Controls how much effort the AI puts into the standardization. The options are:
    • standard - Use the standard effort level.
    • high - Use the high effort level, which takes longer but can produce better results. Currently at no extra cost.
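
The options above can be combined into a single request body. The following is a minimal sketch, not the official client: the function name build_standardize_payload and the field names documentIds and datasetName are assumptions made for illustration, while schemaId, displayMode, splitMode, and effortLevel and their allowed values are taken from the descriptions above. The sketch only builds and validates the body; it does not send a request, since the endpoint URL and authentication are not covered here.

```python
import json
from typing import Optional, Sequence

# Allowed values, as listed in the parameter descriptions above.
DISPLAY_MODES = {"auto", "spatial", "sections"}
SPLIT_MODES = {"auto", "all", "never"}
EFFORT_LEVELS = {"standard", "high"}


def build_standardize_payload(
    document_ids: Optional[Sequence[str]] = None,
    dataset_name: Optional[str] = None,
    schema_id: Optional[str] = None,
    display_mode: str = "auto",
    split_mode: str = "auto",
    effort_level: Optional[str] = None,
) -> dict:
    """Build a JSON-serializable request body (hypothetical helper)."""
    # The batch is selected either by Document IDs or by dataset name.
    if bool(document_ids) == bool(dataset_name):
        raise ValueError("Pass exactly one of document_ids or dataset_name.")
    if display_mode not in DISPLAY_MODES:
        raise ValueError(f"Unknown displayMode: {display_mode!r}")
    if split_mode not in SPLIT_MODES:
        raise ValueError(f"Unknown splitMode: {split_mode!r}")
    if effort_level is not None and effort_level not in EFFORT_LEVELS:
        raise ValueError(f"Unknown effortLevel: {effort_level!r}")

    payload = {"displayMode": display_mode, "splitMode": split_mode}
    if document_ids:
        payload["documentIds"] = list(document_ids)  # assumed field name
    else:
        payload["datasetName"] = dataset_name  # assumed field name
    if schema_id:
        # Omitted when empty, so the AI creates an ad-hoc structure.
        payload["schemaId"] = schema_id
    if effort_level is not None:
        # Omitted when unset, so the service applies its own default.
        payload["effortLevel"] = effort_level
    return payload


if __name__ == "__main__":
    # Placeholder IDs; a long document is kept whole and given high effort.
    body = build_standardize_payload(
        document_ids=["doc_1", "doc_2"],
        split_mode="never",
        effort_level="high",
    )
    print(json.dumps(body, indent=2))
```

Validating the option values on the client side is a design choice in this sketch: it surfaces a typo such as splitMode "none" before the request is sent, rather than relying on the service to reject it.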