Take your first steps by uploading a document and getting its parsed text and tables.
DocuPanda is here to do one thing: convert complex documents into a consistent, structured payload that has the same fields and data types every time.
The flow comprises three parts:
- POST a document, which may include complex entities like tables, key-value pairs, checkboxes, signatures, etc. This triggers our AI processing and returns a documentId.
- GET that document by ID, and receive a plain text representation of that document that is easily readable for both humans and AI readers.
- POST a standardize request that takes a document ID and a schema, and generates a structured JSON.
This getting started guide will walk through one example. Let's say we have this PDF file on our local machine: example_document.pdf
Authentication
Every request to DocuPanda needs to include an API key. You may obtain your API key by signing up and going to this link, where your API key is visible at the bottom of the page.
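To keep the key out of source code, one option is to read it from an environment variable and build the request headers in one place. The following is only a minimal sketch: the environment variable name DOCUPANDA_API_KEY and the helper build_headers are illustrative assumptions, not part of the DocuPanda API. The header names themselves are the ones used by the requests in this guide.

```python
import os


def build_headers(api_key=None):
    """Return the headers used by the requests in this guide.

    DOCUPANDA_API_KEY is a hypothetical environment variable name
    chosen for this sketch; use whatever name fits your setup.
    """
    key = api_key or os.environ.get("DOCUPANDA_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {
        "accept": "application/json",
        "content-type": "application/json",
        "X-API-Key": key,
    }
```

The returned dictionary can be passed as the headers argument of requests.post or requests.get.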
Posting a Document
The first thing to do is take that file and post it to DocuPanda. Replace YOUR_API_KEY with your actual API key obtained in the previous step. Supported file formats are: PDF, images (JPG, PNG, WEBP), text files, and JSON files. Regardless of the format, use base64 encoding as shown below.
import base64
import requests

url = "https://app.docupanda.io/document"
api_key = "YOUR_API_KEY"

# Read the file and base64-encode its contents
with open("example_document.pdf", "rb") as f:
    contents = base64.b64encode(f.read()).decode()

payload = {"document": {"file": {
    "contents": contents,
    "filename": "example_document.pdf"
}}}
headers = {
    "accept": "application/json",
    "content-type": "application/json",
    "X-API-Key": api_key
}

response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()  # Stop here if the upload failed
document_id = response.json()['documentId']
const fetch = require('node-fetch');
const fs = require('fs');

// Replace with your actual DocuPanda API key
const api_key = "YOUR_API_KEY";
const url = "https://app.docupanda.io/document";

// Read and encode the file in base64
const filePath = "example_document.pdf";
const fileContents = fs.readFileSync(filePath);
const base64Content = Buffer.from(fileContents).toString('base64');

// Construct the JSON payload
const payload = {
  document: {
    file: {
      contents: base64Content,
      filename: filePath
    }
  }
};

// Make the POST request with JSON payload
fetch(url, {
  method: 'POST',
  headers: {
    "Accept": "application/json",
    "Content-Type": "application/json",
    "X-API-Key": api_key
  },
  body: JSON.stringify(payload)
})
  .then(response => response.json())
  .then(data => {
    const document_id = data.documentId;
    console.log(document_id); // Output the document ID
  })
  .catch(error => console.error('Error:', error));
If we print out the response body, we'll see it gives us an ID for the document, which we can later use to query for its AI- and human-readable result:
print(response.json())
=> {'documentId': '96dde1aa'}
What you get is essentially a pointer that you can then use to query for the document's results, which you do with a GET request. Typical processing time is about 5-15 seconds for an entire document, depending on its size.
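Because every supported format is uploaded the same way, the payload construction can be factored into a small helper. This is a sketch that assumes the payload shape shown in the upload example; build_document_payload is a hypothetical helper name, not a DocuPanda function.

```python
import base64


def build_document_payload(filename, data):
    """Build the JSON body for the document upload from raw file bytes.

    Mirrors the payload shape used in this guide; the function name is
    illustrative only.
    """
    encoded = base64.b64encode(data).decode()
    return {"document": {"file": {"contents": encoded, "filename": filename}}}
```

For example, build_document_payload("example_document.pdf", open("example_document.pdf", "rb").read()) produces the same body as the upload snippet, and the result can be passed as the json argument of requests.post.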
Retrieving Document Results
DocuPanda is a RESTful API, which means you append document_id to the URL path when you want to get the results.
url = f"https://app.docupanda.io/document/{document_id}"
headers = {
    "accept": "application/json",
    "X-API-Key": "YOUR_API_KEY"
}
response = requests.get(url, headers=headers)
print(response.json())
const getDocumentResults = (documentId) => {
  const url = `https://app.docupanda.io/document/${documentId}`;
  fetch(url, {
    method: 'GET',
    headers: {
      "accept": "application/json",
      "X-API-Key": "YOUR_API_KEY"
    }
  })
    .then(response => response.json())
    .then(data => console.log(data))
    .catch(error => console.error(error));
};
getDocumentResults(document_id); // Replace document_id with the ID returned by the POST request
If you call this immediately after the POST request, the document will be in a processing state:
{
  "documentId": "b214r0297-demo-id",
  "status": "processing",
  "result": null
}
If you wait a few seconds and query again, you will get the result below. Our parsed text is human-readable, but more importantly it is AI readable, which lets you reason about it and ask your favorite Large Language Model questions about its content.
{
  "documentId": "e95af17c",
  "status": "completed",
  "result": {
    "pagesText": ["plain text representation of page 1", "plain text representation of page 2"]
  }
}
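The wait-and-query-again step can be sketched as a small polling loop. This is a minimal sketch rather than an official client: it assumes that any status other than "processing" is final (only "processing" and "completed" appear in this guide), and the interval and timeout defaults are arbitrary choices. The fetch argument is any zero-argument callable that returns the parsed JSON body of the GET request.

```python
import time


def wait_for_document(fetch, interval=2.0, timeout=60.0, sleep=time.sleep):
    """Call fetch() until the returned status is no longer "processing".

    fetch: zero-argument callable returning the parsed JSON response.
    interval, timeout: seconds; arbitrary defaults chosen for this sketch.
    sleep: injectable so the loop can be exercised without real waiting.
    """
    deadline = time.monotonic() + timeout
    while True:
        body = fetch()
        # Assumption: anything other than "processing" is a final state.
        if body.get("status") != "processing":
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError("document is still processing")
        sleep(interval)
```

With the GET snippet shown earlier, a call could look like wait_for_document(lambda: requests.get(url, headers=headers).json()).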
Note that you can avoid polling for the document to finish processing by using webhooks, which you can read about in this guide.
Standardizing a Document - convert it to JSON
import requests

url = "https://app.docupanda.io/standardize/batch"
payload = {
    "documentIds": ["exampleDocumentId"],
    "schemaId": "exampleSchemaId"
}
headers = {
    "accept": "application/json",
    "content-type": "application/json",
    "X-API-Key": "YOUR_API_KEY_HERE"
}
response = requests.post(url, json=payload, headers=headers)
print(response.json())
const url = 'https://app.docupanda.io/standardize/batch';
const options = {
  method: 'POST',
  headers: {
    accept: 'application/json',
    'content-type': 'application/json',
    'X-API-Key': 'YOUR_API_KEY_HERE'
  },
  body: JSON.stringify({documentIds: ['exampleDocumentId'], schemaId: 'exampleSchemaId'})
};
fetch(url, options)
  .then(res => res.json())
  .then(json => console.log(json))
  .catch(err => console.error('error:' + err));
This will return a standardizationId, which you can again poll for using a GET request. Once processing completes, you get our JSON view of the document, which we call a standardization.
What that standardization will include depends on the exact schema used. A schema can let you specify exactly what you want to surface from each document.
As an example, maybe the input is a rental contract, and you set up a schema to extract some basic fields. The output may look like this:
{
  "monthlyAmount": 2000,
  "currency": "USD",
  "moveInDate": "2020-01-31",
  "depositAmount": 3000,
  "depositCurrency": "USD"
}
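Since the point of a standardization is that every document yields the same fields and data types, downstream code may want to verify that before using a record. The check below is a sketch based only on the example output above: the EXPECTED_TYPES mapping and the find_type_mismatches helper are illustrative assumptions, not part of the DocuPanda API, and a real schema may define different fields.

```python
# Field names and types taken from the rental-contract example above.
EXPECTED_TYPES = {
    "monthlyAmount": (int, float),
    "currency": str,
    "moveInDate": str,
    "depositAmount": (int, float),
    "depositCurrency": str,
}


def find_type_mismatches(record, expected=EXPECTED_TYPES):
    """Return the names of fields that are missing or have an unexpected type."""
    return [
        name
        for name, types in expected.items()
        if name not in record or not isinstance(record[name], types)
    ]
```

An empty list means the record has every expected field with an expected type. Any returned names point at fields to inspect before the record is stored or processed further.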
Using our schema creation dashboard, you can create very complex schemas that are specific to your use case. You can add an exact field for an annual payment for a rental contract, or have a field to describe whether tenants are likely allowed to keep a pet crocodile in the house. Schemas let you understand documents in a way that can be entirely unique to your use case.
There's plenty more to the DocuPanda API. You can:
- Classify documents by type
- Generate a visual review of a standardization to see exactly what pixels justify every decision made by our AI
- Use our prebuilt Workflow objects to automate a sequence of events (upload -> standardize -> classify -> apply schema)
- Build scalable, complex workflows, using Webhooks