Printed character recognition API documentation
Interface description
This end-to-end text recognition system is based on a deep neural network model and iFlytek's self-developed, industry-leading optical character recognition technology. It recognizes printed text in images of scanned documents and complex natural scenes (from sources such as scanners or digital cameras) and converts it directly into editable text. It supports 32 languages, including Chinese, English, Hungarian, French, German, and Spanish.
When you use printed character recognition, please follow these requirements:
| Content | Description |
|---|---|
| Transfer method | http[s] |
| Request Address | https://me-east-1.aicloudapi.com/v1/ocr |
| Request Line | POST /v1/ocr HTTP/1.1 |
| Interface authentication | Signature mechanism. For details, see Authentication Description |
| Character encoding | UTF-8 |
| Response format | JSON |
| Image format | jpg, jpeg, png, bmp, webp, tiff |
| Image size | Minimum size: 1B; Maximum size: 10485760 B |
Authentication Description
When using the business interface, the requester needs to sign the request, and the server verifies the validity of the request through the signature.
Authentication Method
Add the authentication-related parameters after the request address. Note that the values that affect the authentication result are the URL, apiSecret, apiKey, and date. To debug authentication, use the values given in the example. The parameters are as follows:
Authentication parameters:
| Parameter | Type | Required | Description | Example |
|---|---|---|---|---|
| host | string | yes | requesting host | me-east-1.aicloudapi.com |
| date | string | yes | current timestamp, RFC1123 format | Wed, 07 Dec 2022 08:18:43 GMT |
| authorization | string | yes | Signature-related information, base64 encoded (the signature is calculated with hmac-sha256) | Refer to the detailed generation rules below |
• Format of authorization parameter generation:
1) Get the interface keys APIKey and APISecret.
After creating an account on the iFLYTEK open platform, visit the console page to obtain both keys, each a 32-character string.
2) The format of the authorization parameter before base64 encoding (authorization_origin) is as follows:
api_key="$api_key",algorithm="hmac-sha256",headers="host date request-line",signature="$signature"
Where api_key is the APIKey obtained on the console, algorithm is the signing algorithm (only hmac-sha256 is supported), and headers lists the parameters that participate in the signature (see the note below).
signature is the base64-encoded string obtained by signing those parameters with the algorithm. See below for details.
3) The signature origin field (signature_origin) is built as follows:
It is formed by concatenating the three parameters host, date, and request-line in a fixed format.
The format of the concatenation is (\n is a newline character, and each ':' is followed by a space):
host: $host\ndate: $date\n$request-line
For example, if
request URL = "https://me-east-1.aicloudapi.com/v1/ocr"
date = "Wed, 07 Dec 2022 08:18:43 GMT"
then the signature origin field (signature_origin) is:
host: me-east-1.aicloudapi.com
date: Wed, 07 Dec 2022 08:18:43 GMT
POST /v1/ocr HTTP/1.1
4) Use the hmac-sha256 algorithm, keyed with apiSecret, to sign signature_origin and obtain the signature digest signature_sha.
signature_sha=hmac-sha256(signature_origin,$apiSecret)
Where apiSecret is the APISecret obtained at the console.
5) Encode signature_sha with base64 encoding to get the final signature.
signature=base64(signature_sha)
If
APISecret = "apisecretXXXXXXXXXXXXXXXXXXXXXXX"
date = "Wed, 07 Dec 2022 08:18:43 GMT"
Then the signature is
signature="J0D7cz4s+6lQpzNtT03BiZN1QEIhqZrSBKvk6W6nK5s="
6) Assemble authorization_origin from the values above:
api_key="apikeyXXXXXXXXXXXXXXXXXXXXXXXXXX", algorithm="hmac-sha256", headers="host date request-line", signature="J0D7cz4s+6lQpzNtT03BiZN1QEIhqZrSBKvk6W6nK5s="
7) Finally, authorization_origin is base64 encoded to obtain the final authorization parameter.
authorization = base64(authorization_origin)
Example result is:
authorization=YXBpX2tleT0iYXBpa2V5WFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFgiLCBhbGdvcml0aG09ImhtYWMtc2hhMjU2IiwgaGVhZGVycz0iaG9zdCBkYXRlIHJlcXVlc3QtbGluZSIsIHNpZ25hdHVyZT0iSjBEN2N6NHMrNmxRcHpOdFQwM0JpWk4xUUVJaHFaclNCS3ZrNlc2bks1cz0i
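Steps 2) to 7) can be sketched in Python roughly as follows. This is a minimal illustration, not an official client: the function names are hypothetical, the comma-plus-space separators in authorization_origin follow the worked example in step 6), and passing host, date, and authorization as URL-encoded query parameters is an assumption based on "after the request address".

```python
import base64
import hashlib
import hmac
from email.utils import formatdate
from typing import Optional
from urllib.parse import urlencode, urlparse


def build_signature_origin(host: str, date: str, request_line: str) -> str:
    """Step 3: join host, date and request-line with newlines."""
    return f"host: {host}\ndate: {date}\n{request_line}"


def sign(signature_origin: str, api_secret: str) -> str:
    """Steps 4-5: HMAC-SHA256 keyed with APISecret, then base64."""
    digest = hmac.new(
        api_secret.encode("utf-8"),
        signature_origin.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def build_auth_url(url: str, api_key: str, api_secret: str,
                   date: Optional[str] = None) -> str:
    """Steps 2-7, then append the three parameters to the request address."""
    parsed = urlparse(url)
    host = parsed.netloc
    # RFC 1123 date in GMT, e.g. "Wed, 07 Dec 2022 08:18:43 GMT".
    date = date or formatdate(usegmt=True)
    request_line = f"POST {parsed.path} HTTP/1.1"
    signature = sign(build_signature_origin(host, date, request_line), api_secret)
    # Assumption: separators follow the worked example (comma + space).
    authorization_origin = (
        f'api_key="{api_key}", algorithm="hmac-sha256", '
        f'headers="host date request-line", signature="{signature}"'
    )
    authorization = base64.b64encode(
        authorization_origin.encode("utf-8")
    ).decode("ascii")
    # Assumption: the parameters are sent URL-encoded in the query string.
    query = urlencode({"host": host, "date": date, "authorization": authorization})
    return f"{url}?{query}"
```

Calling `build_auth_url("https://me-east-1.aicloudapi.com/v1/ocr", api_key, api_secret)` returns the address to which the request body is then posted. A fresh date is generated on each call, so the URL should be rebuilt for every request.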
Authentication Result
If authentication fails, an HTTP status code is returned according to the error type, together with an error description. The details are as follows:
| HTTP Code | Description | Error description | Solution |
|---|---|---|---|
| 401 | Missing authorization parameter | {"message":"Unauthorized"} | Check whether there is an authorization parameter. |
| 401 | Failed to resolve the signature parameters | {"message":"HMAC signature cannot be verified"} | Check whether the signature parameters are correct, especially whether the copied api_key is correct. |
| 401 | Failed to verify the signature | { "message": "HMAC signature does not match" } | Signature verification can fail for several reasons. 1. Check whether the api_key and api_secret are correct. 2. Check whether the host, date, and request-line parameters used to calculate the signature are concatenated as the protocol requires. 3. Check whether the base64 length of the signature is normal (normally 44 bytes). |
| 403 | Clock offset check failed | {"message":"HMAC signature cannot be verified, a valid date or x-date header is required for HMAC Authentication"} | Check whether the clock of the requesting machine is accurate. If it differs from standard time by more than 5 minutes, this error is reported. |
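The 403 case can be caught locally before sending a request. The sketch below parses an RFC 1123 date string and checks it against the 5-minute tolerance quoted in the table; the function name and the `now` parameter are hypothetical helpers for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

MAX_SKEW_SECONDS = 5 * 60  # the 403 row above cites a 5-minute limit


def date_within_skew(date_value: str, now: Optional[datetime] = None) -> bool:
    """Return True if an RFC 1123 GMT date is within 5 minutes of `now`."""
    now = now or datetime.now(timezone.utc)
    sent = parsedate_to_datetime(date_value)  # timezone-aware for "GMT" dates
    return abs((now - sent).total_seconds()) <= MAX_SKEW_SECONDS
```

Comparing the local clock against a trusted time source is still needed to find out which side is wrong; this check only detects a stale or mis-generated date value.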
Request Parameters
When calling a business interface, the following parameters must be set in the HTTP request body. The request data is a JSON string.
Example:
{
  "header": {
    "app_id": "your appid",
    "status": 3
  },
  "parameter": {
    "ocr": {
      "language": "language=de",
      "ocr_output_text": {
        "encoding": "utf8",
        "compress": "raw",
        "format": "json"
      }
    }
  },
  "payload": {
    "image": {
      "encoding": "jpg",
      "image": "iVBORw0KGg······",
      "status": 3
    }
  }
}
Request parameter description:
| Parameter | Type | Required | Description |
|---|---|---|---|
| header | object | yes | Used to upload platform parameters |
| header.app_id | string | yes | appid obtained from the iFLYTEK open platform |
| header.status | int | yes | Request status, value: 3 (one-time transmission) |
| parameter | object | yes | Used to upload service feature parameters |
| parameter.ocr | object | yes | service alias |
| parameter.ocr.language | string | yes | Language |
| parameter.ocr.ocr_output_text | object | yes | Expected format of the returned data, describing constraints such as the encoding of the result. Different data types have different constraint dimensions. This object corresponds to the object of the same name in the response. |
| parameter.ocr.ocr_output_text.encoding | string | no | Text encoding, optional values: utf8 (default), gb2312 |
| parameter.ocr.ocr_output_text.compress | string | no | Text compression format, optional values: raw (default), gzip |
| parameter.ocr.ocr_output_text.format | string | no | Text format, optional values: plain, json (default), xml |
| payload | object | yes | Used to upload the request data |
| payload.image | object | yes | Input data |
| payload.image.encoding | string | no | Image encoding, optional values: jpg (default), jpeg, png, bmp, webp, tiff |
| payload.image.image | string | yes | Image data, base64 encoding required, minimum size: 1B, maximum size: 10485760 B |
| payload.image.status | int | no | Data status, optional value: 3 (one-time transfer) |
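The request body described above can be assembled along these lines. This is a sketch under stated assumptions: `build_request_body` is a hypothetical helper, the size limit is applied to the raw image bytes (the table does not say whether it applies before or after base64 encoding), and the language value is passed through exactly as the caller supplies it.

```python
import base64
import json

MIN_IMAGE_BYTES = 1
MAX_IMAGE_BYTES = 10485760  # limit quoted in the tables above


def build_request_body(app_id: str, image_bytes: bytes, language: str,
                       image_encoding: str = "jpg") -> str:
    """Return the JSON request body for a one-time (status 3) upload."""
    # Assumption: the size limit applies to the raw, unencoded image.
    if not MIN_IMAGE_BYTES <= len(image_bytes) <= MAX_IMAGE_BYTES:
        raise ValueError("image size must be between 1 B and 10485760 B")
    body = {
        "header": {"app_id": app_id, "status": 3},
        "parameter": {
            "ocr": {
                "language": language,
                "ocr_output_text": {
                    "encoding": "utf8",
                    "compress": "raw",
                    "format": "json",
                },
            }
        },
        "payload": {
            "image": {
                "encoding": image_encoding,
                # The image field must carry base64-encoded data.
                "image": base64.b64encode(image_bytes).decode("ascii"),
                "status": 3,
            }
        },
    }
    return json.dumps(body)
```

For example, `build_request_body("your appid", open("page.png", "rb").read(), "language=de", "png")` produces a body shaped like the example above, where `page.png` is a placeholder file name.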
Return Result
Return parameter example:
{
"header": {
"code": 0,
"message": "success",
"sid": "ocr000e583f@hu1847a3af5cd05c2882"
},
"payload": {
"ocr_output_text": {
"compress": "raw",
"encoding": "utf8",
"format": "json",
"seq": "0",
"status": "3",
"text": "ewogICAiY2F......"
}
}
}
Returned parameter description:
| Parameter | Type | Description |
|---|---|---|
| header | object | Parameters used to describe platform characteristics |
| header.code | int | 0 indicates that the session call succeeded. It does not necessarily mean that the service call succeeded; whether the service succeeded is determined by the text field. |
| header.message | string | Description |
| header.sid | string | Unique ID of this session |
| payload | object | Data segment, used to carry the data of the response |
| payload.ocr_output_text | object | Response data block |
| payload.ocr_output_text.compress | string | Text compression format |
| payload.ocr_output_text.encoding | string | Text encoding |
| payload.ocr_output_text.format | string | Text Format |
| payload.ocr_output_text.text | string | Text data returned, which needs to be base64 decoded |
| payload.ocr_output_text.status | string | Status Code |
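A response shaped like the example above could be unpacked as sketched below. The helper name is hypothetical, and the decoding order (base64 first, then gzip when compress is "gzip", then the character encoding) is an assumption; the document only states that the text field must be base64 decoded.

```python
import base64
import gzip
import json


def decode_ocr_text(response: dict):
    """Return the decoded text field: a dict for format "json", else a str."""
    header = response["header"]
    if header["code"] != 0:
        raise RuntimeError(
            f"request failed: {header['code']} {header.get('message')}"
        )
    out = response["payload"]["ocr_output_text"]
    raw = base64.b64decode(out["text"])
    # Assumption: compressed results are gzipped before base64 encoding.
    if out.get("compress") == "gzip":
        raw = gzip.decompress(raw)
    text = raw.decode(out.get("encoding", "utf8"))
    return json.loads(text) if out.get("format") == "json" else text
```

Because header.code equal to 0 only confirms the session, callers should still inspect the decoded content, for example the per-page exception field.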
After base64 decoding, the payload.ocr_output_text.text field contains the following information. Please pay special attention to it:
| Parameter | Type | Description |
|---|---|---|
| version | string | Engine version number |
| category | string | Engine category |
| pages | array | Page Collection |
| pages.height | int | The height of the page, in pixels |
| pages.width | int | The width of the page, in pixels |
| pages.exception | int | Exception information, 0 (normal), -1 (exception) |
| pages.angle | float | Rotation Angle, Range [0,360], Clockwise |
| pages.lines | array | Text line, if not detected, the field does not exist |
| pages.tables | array | Form, if not detected, the field does not exist |
| pages.checkboxes | array | Check box, if not detected, the field does not exist |
| pages.seals | array | Seal, if not detected, the field does not exist |
| pages.fingerprints | array | Fingerprint area, if not detected, this field does not exist |
| pages.graphs | array | Illustration, if not detected, the field does not exist |
| pages.headers | array | Header, if not detected, the field does not exist |
| pages.footers | array | Footer, if not detected, this field does not exist |
| pages.blocks | array | Paragraph. This field does not exist if no line of text is detected or if chunking is not enabled. For structured resume and contract documents, output follows the block structure by default. |
| pages.page_numbers | array | Page number. If not detected, the field does not exist |
| pages.expressions | array | Formula, if not detected, the field does not exist |
| pages.barcodes | array | Barcode, if not detected, the field does not exist |
pages.lines field
| Parameter | Type | Description |
|---|---|---|
| id | int | Text line number, an integer whose value range is greater than or equal to 0 |
| coord | array | Location coordinates, at least 4 points |
| coord.x | int | Axis X |
| coord.y | int | Axis Y |
| angle | float | Text line angle, value range [0-360] degrees |
| type | string | Text line data type (handwriting, print) |
| exception | int | Exception information (0: normal, -1: exception return) |
| content | string | Recognition Result |
| words | array | Word-level recognition results |
| words.content | string | Recognition Result |
| words.coord | array | Location coordinates, at least 4 points |
| words.coord.x | int | Axis X |
| words.coord.y | int | Axis Y |
| word_units.content | string | Recognition Result |
| word_units.coord | array | Location coordinates, at least 4 points |
| word_units.coord.x | int | Axis X |
| word_units.coord.y | int | Axis Y |
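Reading text lines out of the decoded result might look like the sketch below. It assumes each coord entry is an object with x and y keys (as the coord.x and coord.y rows suggest) and that lines arrive in reading order; both helper names are hypothetical.

```python
from typing import List, Tuple


def bounding_box(coord: List[dict]) -> Tuple[int, int, int, int]:
    """Axis-aligned box (min_x, min_y, max_x, max_y) around a coord polygon."""
    xs = [point["x"] for point in coord]
    ys = [point["y"] for point in coord]
    return min(xs), min(ys), max(xs), max(ys)


def page_text(result: dict) -> List[str]:
    """Return one string per page, joining the content of each normal line."""
    pages = []
    for page in result.get("pages", []):
        # "lines" is absent when no text line was detected on the page.
        lines = page.get("lines", [])
        # Keep only lines whose exception field is 0 (normal).
        normal = [line for line in lines if line.get("exception", 0) == 0]
        pages.append("\n".join(line.get("content", "") for line in normal))
    return pages
```

Since coord holds at least four points and lines can be rotated (see angle), the axis-aligned box is only an approximation of the text region.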
pages.tables field
| Parameter | Type | Description |
|---|---|---|
| id | int | Table number, if the ID is the same, it means that they belong to the same table |
| coord | array | Location coordinates, at least 4 points |
| coord.x | int | Axis X |
| coord.y | int | Axis Y |
| cols | int | Number of columns divided by the table |
| rows | int | Number of rows divided by the table |
| height_set | array | The set of table cell heights, in pixels. |
| width_set | array | The set of table cell widths, in pixels. |
| cells | array | The set of cells in the table |
| cells.coord | array | Position coordinates, at least four points |
| cells.coord.x | int | Axis X |
| cells.coord.y | int | Axis Y |
| cells.col | int | Column number of the cell |
| cells.row | int | Row number of the cell |
| cells.colspan | int | Number of columns spanned by the cell |
| cells.rowspan | int | Number of rows spanned by the cell |
| cells.elements | array | The collection of features inserted into the cell |
| cells.elements.id | int | The number of the inserted element in the cell |
| cells.elements.type | string | Type of element inserted in the cell (table, graph, checkbox, seal, fingerprint, block) |
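The row, col, rowspan, and colspan fields are enough to lay cells out on a grid. The sketch below does this for one table object; `table_grid` is a hypothetical helper, and because the document does not state whether row and col numbering starts at 0 or 1, the starting index is exposed as the `index_base` parameter (0 by default, an assumption).

```python
from typing import List, Optional


def table_grid(table: dict, index_base: int = 0) -> List[List[Optional[dict]]]:
    """Place each cell in a rows x cols grid, repeating it across its span."""
    rows, cols = table["rows"], table["cols"]
    grid: List[List[Optional[dict]]] = [[None] * cols for _ in range(rows)]
    for cell in table.get("cells", []):
        top = cell["row"] - index_base
        left = cell["col"] - index_base
        for r in range(top, top + cell.get("rowspan", 1)):
            for c in range(left, left + cell.get("colspan", 1)):
                # Ignore positions that fall outside the declared grid.
                if 0 <= r < rows and 0 <= c < cols:
                    grid[r][c] = cell
    return grid
```

Each grid slot then holds the cell object, whose elements list gives the id and type of the blocks or other items inside it; slots left as None were not covered by any cell.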
pages.checkboxes field
| Parameter | Type | Description |
|---|---|---|
| id | int | Check box number |
| coord | object | Location coordinates, at least 4 points |
| coord.x | int | Axis X |
| coord.y | int | Axis Y |
| type | string | Check box state: tick, cross, blank |
pages.seals field
| Parameter | Type | Description |
|---|---|---|
| id | int | Seal number |
| coord | array | Target area location information, at least 4 points |
| coord.x | int | Axis X |
| coord.y | int | Axis Y |
| elements | array | Collection of elements inserted in the seal |
| elements.id | int | Number of the inserted element in the seal |
| elements.type | string | Type of element inserted in the seal (table, graph, checkbox, seal, fingerprint, block) |
pages.fingerprints field
| Parameter | Type | Description |
|---|---|---|
| id | int | Fingerprint number |
| coord | object | Location coordinates, at least 4 points |
| coord.x | int | Axis X |
| coord.y | int | Axis Y |
pages.graphs field
| Parameter | Type | Description |
|---|---|---|
| id | int | Illustration number |
| coord | array | Location information, at least 4 points |
| coord.x | int | Axis X |
| coord.y | int | Axis Y |
| elements | array | A collection of features inserted into an illustration |
| elements.id | float | Default: 1 |
| elements.type | string | The type of feature to insert into the illustration. Optional value: block. |
pages.headers field
| Parameter | Type | Description |
|---|---|---|
| id | int | Header Number |
| coord | array | Target area location information, at least 4 points |
| coord.x | int | Axis X |
| coord.y | int | Axis Y |
| elements | array | Set of features inserted in the header |
| elements.id | int | Number of the element inserted in the header |
| elements.type | string | Type of element inserted in the header (table, graph, checkbox, seal, fingerprint, block) |
pages.footers field
| Parameter | Type | Description |
|---|---|---|
| id | int | Footer Number |
| coord | array | Location coordinates, at least 4 points |
| coord.x | int | Axis X |
| coord.y | int | Axis Y |
| elements | array | Set of elements to be inserted into the footer. Value range: min: 10 ~ Max: 100 |
| elements.id | int | Number of the element inserted in the footer |
| elements.type | string | Type of element inserted in the footer (table, graph, checkbox, seal, fingerprint, block) |
pages.blocks field
| Parameter | Type | Description |
|---|---|---|
| id | int | Paragraph number. Text block areas that span columns share the same number. |
| coord | array | Location coordinates, at least 4 points |
| coord.x | int | Axis X |
| coord.y | int | Axis Y |
| line_ids | array | Lines of text in a paragraph, indexed by ID in lines |
| line_ids.level | int | Level: Currently, it only appears in resumes and document structures. Indicates the number of nesting levels to which the current block belongs in the resume. An integer whose value range is greater than or equal to 1. |
| line_ids.parent_id | int | Parent Node: Currently, it only appears in resumes and structured documents. The parent node for the current block. An integer whose value range is greater than or equal to -1 |
| line_ids.type | string | Category of the paragraph block (currently only appears in resumes and structured documents). Values: head, line |
pages.page_numbers field
| Parameter | Type | Description |
|---|---|---|
| id | int | Page Number |
| coord | array | Target area location information, at least 4 points |
| coord.x | int | Axis X |
| coord.y | int | Axis Y |
| elements | array | Set of features inserted in the page number |
| elements.id | int | Number of the element inserted in the page number |
| elements.type | string | The type of feature inserted in the page number |
pages.expressions field
| Parameter | Type | Description |
|---|---|---|
| id | int | Formula Number |
| coord | array | Target area location information, at least 4 points |
| coord.x | int | Axis X |
| coord.y | int | Axis Y |
pages.barcodes field
| Parameter | Type | Description |
|---|---|---|
| id | int | Bar code number |
| coord | array | Location coordinates, at least 4 points |
| coord.x | int | Axis X |
| coord.y | int | Axis Y |
| type | string | Type: barcode, qrcode |
| content | string | Default: 1 |
language feature parameter list:
| Language | Parameter | Language | Parameter | Language | Parameter |
|---|---|---|---|---|---|
| Chinese | language=ch_en | English | language=ch_en | Hungarian | language=hu |
| German | language=de | French | language=fr | Japanese | language=ja |
| Korean | language=ko | Spanish | language=es | Arabic | language=ar |
| Portuguese | language=pt | Hindi | language=hi | Indonesian | language=id |
| Italian | language=it | Malaysian | language=ms | Russian | language=ru |
| Thai | language=th | Turkish | language=tr | Vietnamese | language=vi |
| Bulgarian | language=bg | Czech | language=cs | Dutch | language=af |
| Greek | language=el | Polish | language=pl | Romanian | language=ro |
| Swedish | language=sv | Tamil | language=ta | Bengali | language=bn |
| Persian | language=fa | Urdu | language=ur | Danish | language=da |
| Finnish | language=fi | Norwegian | language=nb | | |
Frequently Asked Questions
What is the main function of printed character recognition?
Answer: It converts printed text in images into machine-encoded text.
What application platforms are supported for printed character recognition?
Answer: The Web API platform is currently supported.
Are there any requirements for pictures in printed character recognition?
Answer: Images must be in JPG, JPEG, PNG, BMP, webp, or tiff format, and the image file size must not exceed 4 MB after base64 encoding.