Analysing documents for emissions data
Introduction
In carbon accounting scenarios, it is often important to extract emissions data from invoices, bills, and receipts. CarbonAPI can download an organisation's documents, scan them for pertinent emissions data, and return the results to developers.
Supported document categories
- Air Tickets
- Accommodation (Hotels)
- Fuel Invoices/Receipts
- Waste Pickups
- Mileage Claims
- Electricity Invoices
See platform coverage to understand if your country/region is supported. Please contact us if your region is not yet supported.
Accepted document formats
- image/jpeg
- application/pdf
Other document types will be rejected. You will not be billed for rejected document types.
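Since unsupported formats are rejected, it can be convenient to filter files on your side before submitting them. The following is a minimal sketch, assuming you already know each file's MIME type; the helper name isAcceptedFormat is hypothetical and not part of the CarbonAPI SDK.

```typescript
// The two formats listed above; other formats are rejected by CarbonAPI.
const ACCEPTED_FORMATS: readonly string[] = ["image/jpeg", "application/pdf"];

// Hypothetical helper (not part of the CarbonAPI SDK).
// Returns true when the given MIME type is one of the accepted formats.
function isAcceptedFormat(mimeType: string): boolean {
  // Drop any parameters (e.g. "; charset=binary") and normalise case before comparing.
  const essence = (mimeType.split(";")[0] ?? "").trim().toLowerCase();
  return ACCEPTED_FORMATS.includes(essence);
}
```

This is only a client-side pre-check; CarbonAPI still makes the final decision on whether a document is accepted.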
Accepted document types
- Invoices + Bills
- Pictures and scans of receipts
Pages in a document
You may submit a document with at most 50 pages. Note that, for billing purposes, a document is considered to have up to 3 pages. Pages beyond the first 3 are billed as separate documents. For example, a document with 9 pages will be billed as 3 documents.
Documents with over 50 pages will be rejected.
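The billing rule above can be sketched as a small calculation. This is an illustration only: the function name billedDocuments is hypothetical, and it assumes that a partial group of pages is rounded up (so 4 pages would count as 2 documents), which the text implies but does not state explicitly.

```typescript
const PAGES_PER_BILLED_DOCUMENT = 3; // a document covers up to 3 pages for billing
const MAX_PAGES = 50; // documents with more pages are rejected

// Hypothetical helper: estimates how many documents a submission is billed as.
function billedDocuments(pageCount: number): number {
  if (!Number.isInteger(pageCount) || pageCount < 1) {
    throw new RangeError("pageCount must be a positive integer");
  }
  if (pageCount > MAX_PAGES) {
    throw new RangeError(`Documents with over ${MAX_PAGES} pages are rejected`);
  }
  // Assumption: every started group of 3 pages counts as one billed document.
  return Math.ceil(pageCount / PAGES_PER_BILLED_DOCUMENT);
}

console.log(billedDocuments(9)); // 3, matching the example above
```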
File URLs/Uploads
At the time of writing, CarbonAPI supports file URLs, which it will download and analyse. Direct uploads may be supported in the future.
Don't leave personal documents exposed!
You should strongly consider sharing pre-signed URLs or SAS tokens with CarbonAPI rather than leaving private documents publicly accessible.
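A pre-signed URL only works until it expires, so it is worth checking that the URL you share will still be valid when CarbonAPI downloads the file. The sketch below reads the expiry from an Amazon S3 pre-signed URL using its X-Amz-Date and X-Amz-Expires query parameters. The helper name presignedUrlExpiry is hypothetical, and the sketch does not handle Azure SAS tokens or other providers.

```typescript
// Hypothetical helper: returns when an S3 pre-signed URL stops working,
// or null if the URL does not carry the expected query parameters.
function presignedUrlExpiry(fileUrl: string): Date | null {
  const params = new URL(fileUrl).searchParams;
  const signedAt = params.get("X-Amz-Date"); // e.g. "20130524T000000Z"
  const expires = params.get("X-Amz-Expires"); // lifetime in seconds
  if (!signedAt || !expires) return null;

  const match = /^(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})Z$/.exec(signedAt);
  const lifetimeSeconds = Number(expires);
  if (!match || !Number.isFinite(lifetimeSeconds)) return null;

  const [, year, month, day, hour, minute, second] = match;
  const signedAtMs = Date.UTC(
    Number(year),
    Number(month) - 1, // Date.UTC months are zero-based
    Number(day),
    Number(hour),
    Number(minute),
    Number(second),
  );
  return new Date(signedAtMs + lifetimeSeconds * 1000);
}
```

For example, you might refuse to submit a URL whose expiry is less than an hour away; the right margin depends on how long you expect processing to take.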
Step-by-step guide
First, we need to sign up for CarbonAPI. You can do this and get free evaluation credits by heading to the Account Portal and signing up.
For this guide, we'll use polling to check for the results of our call, but to reduce overhead, we strongly recommend using our Webhooks approach, which allows you to be notified once processing has completed.
Now, you can either review our API Reference, or drop in our TypeScript SDK, which will allow you to get started quickly.
Okay, so we're ready to start making calls. Let's submit some documents to get started:
import { CarbonAPIClient } from "@carbonapi/typescript-sdk";

// Initialize the client with your API key
const client = new CarbonAPIClient({
  apiKey: "your-api-key-here", // Get this from the Account Portal
});

// Submit our documents for processing
const batchId = await client.createDocumentBatch({
  type: "url",
  documents: [
    {
      fileUrl:
        "https://s3.eu-west-2.amazonaws.com/my-bucket/my-object?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIOSFODNN7EXAMPLE%2F20130524%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20130524T000000Z&X-Amz-Expires=86400&X-Amz-SignedHeaders=host&X-Amz-Signature=b59cfa898d8b4c4e4355b2f9a98fb8c145dda827c92d9ac34897f0554d1e5bf4",
    },
  ],
  countryCode: "NZ",
});

Good to know
CarbonAPI is powered by AI, which can sometimes make mistakes. It is important that you review the outputs of this service.
Okay, so we have now created a document analysis job. At the time of writing, only an asynchronous approach is supported, but we plan to add sync calls in the future.
Now, let's poll for changes in the batch. As mentioned, this is not the preferred approach and is shown here for demo purposes only. We strongly suggest using Webhook calls to reduce the complexity of your application.
Note that this is a queue-based service. Your request will be processed as soon as possible, but you may need to wait for a few minutes when we're busy. Again, a good reason to use Webhooks!
// The callback must be async so that we can await the SDK call inside it.
const poll = setInterval(async () => {
  const documentBatch = await client.getDocumentBatch(batchId);
  if (documentBatch.status === "Completed") {
    console.log(documentBatch.documents); // This will contain your categories and emissions information.
    clearInterval(poll); // Don't forget to do this.
  }
}, 10_000); // Call every 10 seconds.
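The interval above keeps polling forever if the batch never completes. As an alternative, here is a minimal sketch of a polling loop with a timeout. The helper waitForBatch and the BatchLike interface are hypothetical and not part of the SDK; the sketch relies only on the "Completed" status shown in this guide and takes the fetch call as a parameter, so it makes no further assumptions about the SDK.

```typescript
// Minimal shape this sketch relies on; only the "Completed" status comes from this guide.
interface BatchLike {
  status: string;
}

// Hypothetical helper: polls fetchBatch until the batch completes or the timeout elapses.
async function waitForBatch<T extends BatchLike>(
  fetchBatch: () => Promise<T>,
  intervalMs = 10_000, // wait between polls
  timeoutMs = 10 * 60_000, // give up after this long
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const batch = await fetchBatch();
    if (batch.status === "Completed") return batch;
    // Stop if the next poll would happen after the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Batch not completed within ${timeoutMs} ms (last status: ${batch.status})`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

With the client from earlier in this guide, this could be called as: const batch = await waitForBatch(() => client.getDocumentBatch(batchId)); Because each poll finishes before the next one starts, requests never overlap, which can happen with setInterval when a call is slow.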