# Readable Parser

> v1.0.0

AI is the big thing right now, and developers everywhere are using it to build all sorts of products, such as **Customer Service Chatbot**, **Document Helper**, **FAQ Assistant**, and **Data Analyst**. But there's a hitch: why aren't these AI responses as accurate as we'd like?

Let's dig into this issue. We give the AI a question and some documents retrieved through a search, and the AI tries to answer based on those snippets. A well-crafted question matters, but the real key is how relevant those documents are. How accurately the retrieved text matches the question, and how faithfully it preserves the logic of the source document, largely determine the quality of the answer.

So, we need **a document parser that can help us convert a variety of documents (PDFs, Docs, HTML, Excel, CSV, etc.) into more readable text (like Markdown)**. This transformation enables AI to produce better answers.

## 1. Introduction

<figure><img src="https://543652368-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZGXxzmub76iILU3XH4DB%2Fuploads%2FZwwe1hjQVY5Sss1bhIgz%2F1.png?alt=media&#x26;token=5fb5f3cd-74ec-41ed-9557-c537c077d7d9" alt=""><figcaption><p>extract tables from document</p></figcaption></figure>

We provide the following **Free APIs** to help you convert various documents into Markdown text.

* [**Request Parsing**](#request-parsing): You can use this API to parse your file or URL. The parsing process is asynchronous. After successful submission, you will receive a unique `task_id` that can be used to query the parsing status and fetch the parsed result.
* [**Check Parsing Status**](#check-parsing-status): With this API, you can check the parsing status of the document with the `task_id`.
* [**Fetch Parsed Result**](#fetch-parsed-result): After the parsing task is completed, you can fetch the markdown result through this API with the `task_id`.

| Document | Supported |
| :------: | :-------: |
|    PDF   |     ✅     |
|   Docx   |     ✅     |
|   HTML   |     ✅     |
|   Excel  |     ✅     |
|    CSV   |     ✅     |

For detailed usage of the API, please refer to the [API](#api) section below.

## 2. Auth Token

You can obtain an API Key on the [Account](https://chatof.ai/account) page.

<figure><img src="https://543652368-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FZGXxzmub76iILU3XH4DB%2Fuploads%2FaWkSXgyctSci4c3AEBLq%2Fimage.png?alt=media&#x26;token=8f46aac6-821d-43ff-bbfb-b89fe331a56e" alt=""><figcaption><p>Create Your ChatofAI API Key</p></figcaption></figure>

**Remember that your API key is a secret!** Do not share it with others or expose it in any client-side code (browsers, apps). Production requests must be routed through your backend server where your API key can be securely loaded from an environment variable or key management service.

All API requests should include your API key in an Authorization HTTP header as follows:

```
Authorization: Bearer CHATOFAI_API_KEY
```

Example curl command:

```
curl https://chatof.ai/path \
  -H "Authorization: Bearer $CHATOFAI_API_KEY"
```

Example with Python:

```
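import os

import requests

# Load the key from the environment instead of hard-coding it.
CHATOFAI_API_KEY = os.environ["CHATOFAI_API_KEY"]

url = "https://chatof.ai/path"
headers = {
    "Authorization": f"Bearer {CHATOFAI_API_KEY}"
}
resp = requests.get(url, headers=headers)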
```

## 3. API

**Base URL**

* Production Environment: <https://chatof.ai>

### 3.1 Request Parsing

```
POST /api-parser/file/async/parser
```

#### Request Parameters

<table><thead><tr><th width="159">Name</th><th width="116">Position</th><th width="139">Type</th><th width="131">Required</th><th>Description</th></tr></thead><tbody><tr><td>Authorization</td><td>header</td><td>string</td><td>yes</td><td>Token, see <a href="#auth-token">Auth Token</a> for details</td></tr><tr><td>body</td><td>body</td><td>object</td><td>no</td><td></td></tr><tr><td>» file</td><td>body</td><td>string(binary)</td><td>no</td><td>File, required when type is <code>pdf</code>, <code>docx</code>, <code>csv</code>, or <code>excel</code></td></tr><tr><td>» url</td><td>body</td><td>string</td><td>no</td><td>URL, required when type is <code>html</code></td></tr><tr><td>» type</td><td>body</td><td>string</td><td>yes</td><td>Parser type. Options: <code>pdf</code>, <code>docx</code>, <code>html</code>, <code>excel</code>, <code>csv</code></td></tr><tr><td>» options</td><td>body</td><td>string</td><td>no</td><td>JSON string, see the documentation for supported options</td></tr></tbody></table>

> Response

```json
{
  "code": 0,
  "msg": "success",
  "data": {
    "task_id": "avhwk001"
  }
}
```

#### Response Data Structure

Status Code **200**

| Name        | Type    | Description                                                                                       |
| ----------- | ------- | ------------------------------------------------------------------------------------------------- |
| » code      | integer |                                                                                                   |
| » msg       | string  |                                                                                                   |
| » data      | object  |                                                                                                   |
| »» task\_id | string  | For [Check Parsing Status](#check-parsing-status) and [Fetch Parsed Result](#fetch-parsed-result) |
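
As a rough illustration of calling this endpoint from Python, the sketch below builds the request and extracts the `task_id`. Several things here are assumptions rather than documented behavior: that the body is sent as `multipart/form-data` (the table lists `file` as binary but does not name the encoding), that a non-zero `code` signals an error, and that the key is stored in a `CHATOFAI_API_KEY` environment variable. The helper names `build_form` and `submit_parse` are hypothetical.

```python
import contextlib
import json
import os

PARSE_URL = "https://chatof.ai/api-parser/file/async/parser"
FILE_TYPES = {"pdf", "docx", "excel", "csv"}  # types that require `file`
ALL_TYPES = FILE_TYPES | {"html"}


def build_form(doc_type, url=None, options=None):
    """Build the non-file form fields for a parse request."""
    if doc_type not in ALL_TYPES:
        raise ValueError(f"unsupported type: {doc_type}")
    if doc_type == "html" and not url:
        raise ValueError("`url` is required when type is html")
    form = {"type": doc_type}
    if url:
        form["url"] = url
    if options is not None:
        # `options` is documented as a JSON string, so serialize the dict.
        form["options"] = json.dumps(options)
    return form


def submit_parse(session, doc_type, path=None, url=None, options=None):
    """Submit a document and return the task_id from the response.

    `session` is any object with a requests-style `post` method,
    for example `requests.Session()`.
    """
    if doc_type in FILE_TYPES and not path:
        raise ValueError(f"`path` is required when type is {doc_type}")
    headers = {"Authorization": f"Bearer {os.environ['CHATOFAI_API_KEY']}"}
    # (None, value) tuples make requests send plain multipart form fields.
    form = build_form(doc_type, url=url, options=options)
    parts = {name: (None, value) for name, value in form.items()}
    with contextlib.ExitStack() as stack:
        if path:
            parts["file"] = stack.enter_context(open(path, "rb"))
        resp = session.post(PARSE_URL, headers=headers, files=parts, timeout=30)
    resp.raise_for_status()
    body = resp.json()
    if body["code"] != 0:
        # Assumption: a non-zero `code` means the request was rejected.
        raise RuntimeError(body["msg"])
    return body["data"]["task_id"]
```

With the `requests` package installed, `submit_parse(requests.Session(), "pdf", path="report.pdf")` would upload a local file, and `submit_parse(requests.Session(), "html", url="https://example.com")` would submit a page by URL. Passing the session in keeps the helper easy to test with a stub.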

### 3.2 Check Parsing Status

```
GET /api-parser/file/async/parser/status
```

#### Request Parameters

<table><thead><tr><th width="159">Name</th><th width="128">Position</th><th width="112">Type</th><th>Required</th><th>Description</th></tr></thead><tbody><tr><td>task_id</td><td>query</td><td>string</td><td>yes</td><td>Task ID</td></tr><tr><td>Authorization</td><td>header</td><td>string</td><td>yes</td><td>Token, see <a href="#auth-token">Auth Token</a> for details</td></tr></tbody></table>

> Response

```json
{
  "code": 0,
  "msg": "success",
  "data": {
    "status": "success",
    "err_msg": ""
  }
}
```

#### Response Data Structure

<table><thead><tr><th width="254">Name</th><th width="278">Type</th><th width="204">Description</th></tr></thead><tbody><tr><td>» code</td><td>integer</td><td></td></tr><tr><td>» msg</td><td>string</td><td></td></tr><tr><td>» data</td><td>object</td><td></td></tr><tr><td>»» status</td><td>string</td><td><p>pending</p><p>processing</p><p>success</p><p>failed</p></td></tr><tr><td>»» err_msg</td><td>string</td><td></td></tr></tbody></table>
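
Because parsing is asynchronous, a client typically polls this endpoint until the status becomes `success` or `failed`. The following is a minimal polling sketch, not an official client: the function name `wait_for_task`, the 2-second interval, the cap of 30 polls, and the `CHATOFAI_API_KEY` environment variable are all assumptions chosen for illustration.

```python
import os
import time

STATUS_URL = "https://chatof.ai/api-parser/file/async/parser/status"


def wait_for_task(session, task_id, interval=2.0, max_polls=30,
                  sleep=time.sleep):
    """Poll until the task reaches `success` or `failed`.

    Returns the final `data` object from the response. Raises
    TimeoutError if the task is still pending or processing after
    `max_polls` checks. `session` is any object with a requests-style
    `get` method, for example `requests.Session()`.
    """
    headers = {"Authorization": f"Bearer {os.environ['CHATOFAI_API_KEY']}"}
    for _ in range(max_polls):
        resp = session.get(STATUS_URL, headers=headers,
                           params={"task_id": task_id}, timeout=30)
        resp.raise_for_status()
        data = resp.json()["data"]
        if data["status"] in ("success", "failed"):
            return data
        # Wait between polls to stay under the 1 request/second limit.
        sleep(interval)
    raise TimeoutError(f"task {task_id} not finished after {max_polls} polls")
```

Every poll counts toward the daily request quota described in the Limits section, so a longer interval with a modest `max_polls` is usually the safer choice.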

### 3.3 Fetch Parsed Result

```
GET /api-parser/file/async/parser/download
```

#### Request Parameters

<table><thead><tr><th width="152">Name</th><th width="157">Position</th><th>Type</th><th width="110">Required</th><th>Description</th></tr></thead><tbody><tr><td>task_id</td><td>query</td><td>string</td><td>yes</td><td>From <a href="#request-parsing">Request Parsing</a></td></tr><tr><td>Authorization</td><td>header</td><td>string</td><td>yes</td><td>Token, see <a href="#auth-token">Auth Token</a> for details</td></tr></tbody></table>

> Response

```json
{
  "code": 0,
  "msg": "success",
  "data": {
    "markdown": "### Title\n\nContent"
  }
}
```

#### Response Data Structure

| Name        | Type    | Description          |
| ----------- | ------- | -------------------- |
| » code      | integer |                      |
| » msg       | string  |                      |
| » data      | object  |                      |
| »» markdown | string  | Parsed Markdown Text |
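
Once the status check reports `success`, the Markdown can be downloaded and written to disk. The sketch below shows one way to do that. The helper names `fetch_markdown` and `save_markdown`, the `CHATOFAI_API_KEY` environment variable, and the treatment of a non-zero `code` as an error are assumptions, not documented behavior.

```python
import os
from pathlib import Path

DOWNLOAD_URL = "https://chatof.ai/api-parser/file/async/parser/download"


def fetch_markdown(session, task_id):
    """Return the parsed Markdown text for a finished task.

    `session` is any object with a requests-style `get` method,
    for example `requests.Session()`.
    """
    headers = {"Authorization": f"Bearer {os.environ['CHATOFAI_API_KEY']}"}
    resp = session.get(DOWNLOAD_URL, headers=headers,
                       params={"task_id": task_id}, timeout=30)
    resp.raise_for_status()
    body = resp.json()
    if body["code"] != 0:
        # Assumption: a non-zero `code` means the result is unavailable.
        raise RuntimeError(body["msg"])
    return body["data"]["markdown"]


def save_markdown(markdown, out_path):
    """Write the Markdown to `out_path` as UTF-8 and return the path."""
    path = Path(out_path)
    path.write_text(markdown, encoding="utf-8")
    return path
```

Writing the file as UTF-8 explicitly avoids platform-dependent default encodings, which matters for documents that contain non-ASCII text.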

## 4. Limits

Request frequency is restricted to prevent abuse. You are limited to **1 request per second** and **100 requests per day**; if you exceed either limit, your API calls will be restricted. The rate limit is applied per user rather than per endpoint, so requests to all endpoints count toward the same quota. You are also not allowed to make more than two simultaneous requests. If you need higher limits, please contact us at `support@chatof.ai`.
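
One way to stay within these limits is to throttle on the client side. The class below is a minimal sketch under stated assumptions: `Throttle` is a hypothetical helper, it tracks the budget only for the lifetime of the process, and it never resets the daily count because the reset window is not specified here.

```python
import time


class Throttle:
    """Space calls at least `min_interval` seconds apart, up to a budget."""

    def __init__(self, min_interval=1.0, daily_budget=100,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.remaining = daily_budget
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def acquire(self):
        """Block until the next request may be sent, then count it."""
        if self.remaining <= 0:
            raise RuntimeError("daily request budget exhausted")
        if self._last is not None:
            wait = self.min_interval - (self._clock() - self._last)
            if wait > 0:
                self._sleep(wait)
        self._last = self._clock()
        self.remaining -= 1
```

Calling `throttle.acquire()` immediately before each API request, and sharing one `Throttle` instance across all three endpoints, mirrors the per-user nature of the limit.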
