Interpreting Your JSON Results

Read this guide to understand the JSON output files generated by the Indico Agents and Workflows platform.

Introduction

The contents of the result file depend on several factors:

  • The type of agent
  • The default output version for your cluster
  • Whether optional functionalities are enabled, such as:
    • Review
    • Typed Answer Keys (TAK)
    • Linked Labels
    • Summarization

Output Overview

  • Classification agents, including Classify and Unbundle agents, output predicted classes with confidence scores.
  • Extraction agents output extracted text with confidence levels for each prediction.
  • Linked-Label transformers group linked labels using an index in the output dictionary.
  • Form agent output includes keys such as recognized_forms and form_version that identify the processed forms.
  • Review and Autoreview results indicate the review status of agent outputs.
  • Summarization output varies between result-file versions:
    • For version 1, results mirror single classification output results and contain summary text with citations.
    • For version 3, the output contains an array for each unbundled file in the submission.

Each file also features links to ETL (Extract, Transform, Load) and OCR (Optical Character Recognition) outputs, which offer detailed information about each page within a submission. To understand ETL and OCR output, read our Interpreting Your OCR and ETL Output document.

In addition to the agent type, Typed Answer Keys (Output Normalization) also affects your output: it adds a formatted or normalized key that holds the normalized field value.

📘

The spans Key

Agents like Extraction and Classify and Unbundle return labels containing a spans key, which represents semantically related text segments from the source document. The ctx_id field, if provided, tracks the parent span to organize related data through a workflow. If no ctx_id is provided, the agent ran on the entire document.
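
As a rough illustration of how spans can be used, the sketch below slices the document text covered by each span. It assumes that start and end are character offsets into the full document text and that end is exclusive; this matches the Linked Labels example later in this guide, where an 11-character value runs from 115 to 126, but it is an assumption rather than a documented guarantee. The function name span_texts and the sample text are hypothetical.

```python
def span_texts(full_text: str, spans: list[dict]) -> list[str]:
    """Return the slice of the document text covered by each span."""
    # Assumption: "start" and "end" are character offsets into the full
    # document text, and "end" is exclusive.
    return [full_text[span["start"]:span["end"]] for span in spans]


# Hypothetical two-page document; in practice the full text is reached
# through the ETL output linked from the result file.
full_text = "Annual Report 2023\nFinancial Disclosures"
spans = [
    {"start": 0, "end": 18, "page_num": 0},
    {"start": 19, "end": 40, "page_num": 1},
]

for span, text in zip(spans, span_texts(full_text, spans)):
    print(span["page_num"], text)
```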

Understanding Your Result Files

This guide will walk you through the structure of the results files, explain the significance of each element, and provide practical examples to help you effectively utilize this data in your applications.

File Hierarchy

There are three levels of result files that you get from a workflow:

  • A top-level result file, typically named submission_<submission id>_result.json.
  • A mid-level result file named etl_output.json, which provides links to the full text of the submission and to the low-level result files that contain the detailed results of the workflow.
  • Low-level detailed OCR result files:
    • page_info_<index>.json provides detailed OCR data for the page identified by the index in the file name.
    • JSON files that provide granular OCR data on characters, tokens, and blocks.
📘

To learn about the mid- and low-level output file contents, read Interpreting Your OCR and ETL Output.

File Versions

Indico currently offers two versions of results files, version 1 and version 3. The formats have slight variations.

The file version appears in the file_version key at the top of the JSON output file.

{
	"file_version": 3,
	...
}
🚧

It is a best practice to use version 3 whenever possible. Indico is sunsetting version 1. It is currently being supported for backward compatibility only.
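
Because the two versions are structured differently, it can help to check the version before reading anything else. The following is a minimal sketch, not an official client: the function name result_file_version is hypothetical, and treating a missing file_version key as version 1 is an assumption based on the legacy extraction example in this guide, which omits the key.

```python
import json


def result_file_version(result: dict) -> int:
    """Return the result-file version of a parsed result file."""
    # Assumption: a file without "file_version" is legacy version 1 output.
    return int(result.get("file_version", 1))


v3_result = json.loads('{"file_version": 3, "submission_id": 77924}')
legacy_result = json.loads('{"submission_id": 91, "results": {}}')

print(result_file_version(v3_result))      # 3
print(result_file_version(legacy_result))  # 1
```

Branching on this value once, at load time, keeps the rest of your parsing code free of version checks.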

Classification

For classification results of all varieties, nested under the name of the agent group, there is a dictionary that maps each document class name to the confidence level of the prediction. Confidence levels are between 0 and 1.

Single Classification

For single classification, results are present for every agent group as keys in the results dictionary.

{
    "file_version": 1,
    "submission_id": "12345",
    "etl_output": "indico-file:///storage/submission/ocr_output.json",
    "results": {
        "document": {
            "results": {
                "Test Model 1": {
                    "Class 1": 0.688137223450789,
                    "Class 2": 0.08451932419717022,
                    "Class 3": 0.06304424768016251,
                    "Class 4": 0.021362314937542252
                }
            }
        }
    }
}

{
    "file_version": 3,
    "submission_id": 77924,
    "modelgroup_metadata": {
        "24616": {
            "id": 24616,
            "task_type": "classification",
            "name": "class with two labels",
            "selected_model": {
                "id": 35369,
                "model_type": "tfidf_gbt"
            }
        },
        ...
    },
    "component_metadata": {
        ...
        "103423": {
            "id": 103423,
            "name": "Document Classification",
            "component_type": "model_group",
            "task_type": "classification"
        },
        ...
    },
    "submission_results": [
        {
            "submissionfile_id": 182135,
            "etl_output": "indico-file:///storage/submission/25803/77924/182135/etl_output.json",
            "input_filename": "97783.txt",
            "input_filepath": "indico-file:///storage/submission/25803/77924/182135.txt",
            "input_filesize": 8171,
            "model_results": {
                "ORIGINAL": {
                    "24616": [
                        {
                            "field_id": 10559356,
                            "confidence": {
                                "email": 0.8453857085865302,
                                "other insurance files": 0.15461429141346983
                            },
                            "label": "email"
                        }
                    ],
                    ...
                }
            },
            "component_results": {
                "ORIGINAL": {}
            },
            "rejected": {
                "models": {
                    "24616": [],
                    "24669": []
                },
                "components": {}
            }
        },
        ...
    ],
    "reviews": {},
    "errored_files": {}
}
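
The two single-classification examples above can be read with a few lines of Python. This is a sketch under the assumption that your files follow the same shapes as those examples; the helper names top_class_v1 and top_classes_v3 are hypothetical, and the sample data is trimmed from the examples.

```python
def top_class_v1(result: dict, agent_name: str) -> tuple[str, float]:
    """Version 1: classes map directly to confidences under the agent name."""
    scores = result["results"]["document"]["results"][agent_name]
    label = max(scores, key=scores.get)  # class with the highest confidence
    return label, scores[label]


def top_classes_v3(result: dict, model_id: str) -> list[tuple[str, float]]:
    """Version 3: one (label, confidence) pair per prediction in each file."""
    pairs = []
    for file_result in result["submission_results"]:
        predictions = file_result["model_results"]["ORIGINAL"].get(model_id, [])
        for pred in predictions:
            pairs.append((pred["label"], pred["confidence"][pred["label"]]))
    return pairs


v1_result = {
    "file_version": 1,
    "results": {"document": {"results": {"Test Model 1": {
        "Class 1": 0.688137223450789,
        "Class 2": 0.08451932419717022,
    }}}},
}
v3_result = {
    "file_version": 3,
    "submission_results": [{
        "submissionfile_id": 182135,
        "model_results": {"ORIGINAL": {"24616": [{
            "field_id": 10559356,
            "confidence": {
                "email": 0.8453857085865302,
                "other insurance files": 0.15461429141346983,
            },
            "label": "email",
        }]}},
    }],
}

print(top_class_v1(v1_result, "Test Model 1"))
print(top_classes_v3(v3_result, "24616"))
```

Note that version 3 keys results by model group ID (a string such as "24616"), while version 1 keys them by agent name.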

Multi Classification

For regular classification agents, class name and confidence are present for every document class in the agent.

For GenAI classification agents, only the highest probability document class has an associated confidence score.

{
    "file_version": 3,
    "submission_id": 97350,
    "modelgroup_metadata": {
        "4221": {
            "id": 4221,
            "task_type": "classification_multiple",
            "name": "multi_classify",
            "selected_model": {
                "id": 6766,
                "model_type": "tfidf_gbt"
            }
        },
        ...
    },
    "component_metadata": {
        ...
        "14735": {
            "id": 14735,
            "name": null,
            "component_type": "output_json_formatter",
            "task_type": null
        },
        "14737": {
            "id": 14737,
            "name": "Document Multi-Classification",
            "component_type": "model_group",
            "task_type": "classification_multiple"
        },
        ...
    },
    "submission_results": [
        {
            "submissionfile_id": 93642,
            "etl_output": "indico-file:///storage/submission/4008/97350/93642/etl_output.json",
            "input_filename": "leadership.pdf",
            "input_filepath": "indico-file:///storage/submission/4008/97350/93642.pdf",
            "input_filesize": 15248,
            "model_results": {
                "ORIGINAL": {
                    "4221": [
                        {
                            "field_ids": [
                                267290,
                                267290,
                                267290
                            ],
                            "confidence": {
                                "problem": 0.9999999961727768,
                                "leadership": 0.9999999922852744,
                                "tech": 0.6666662614633455,
                                "earnings": 0.3333331376906894,
                                "other": 2.872262090213859e-8,
                                "acquisition": 2.3355426017253765e-9,
                                "energy": 9.715453333083489e-10,
                                "financial": 4.161086406129549e-10,
                                "foodretail": 4.353135264147288e-11,
                                "product": 2.2489297185618299e-11,
                                "auto": 2.2232509679333238e-11,
                                "pharmaceutical": 1.824398818834385e-14
                            },
                            "label": [
                                "problem",
                                "leadership",
                                "tech"
                            ]
                        }
                    ],
                    ...
                }
            },
            "component_results": {
                "ORIGINAL": {}
            },
            "rejected": {
                "models": {
                    "4221": [],
                    "4227": []
                },
                "components": {}
            }
        }
    ],
    "reviews": {},
    "errored_files": {}
}
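
If you want to apply your own, stricter cut-off to a multi-classification prediction, one possible approach is sketched below. The helper confident_labels and the 0.9 threshold are illustrative assumptions, not platform settings. The lookup uses .get because, as noted above, GenAI classification agents may attach a confidence score only to the highest-probability class.

```python
def confident_labels(prediction: dict, min_confidence: float) -> list[str]:
    """Keep only the predicted labels whose confidence meets the cut-off."""
    labels = prediction["label"]
    if isinstance(labels, str):  # single classification stores one string
        labels = [labels]
    scores = prediction["confidence"]
    # A label without a score is treated as confidence 0.0 (assumption).
    return [name for name in labels if scores.get(name, 0.0) >= min_confidence]


# Trimmed from the multi-classification example above.
prediction = {
    "field_ids": [267290, 267290, 267290],
    "confidence": {
        "problem": 0.9999999961727768,
        "leadership": 0.9999999922852744,
        "tech": 0.6666662614633455,
        "earnings": 0.3333331376906894,
    },
    "label": ["problem", "leadership", "tech"],
}

print(confident_labels(prediction, 0.9))  # ['problem', 'leadership']
```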

Classify and Unbundle

{
    "file_version": 1,
    "submission_id": 191111,
    "etl_output": "indico-file:///storage/submission/13733/19302/11111/etl_output.json",
    "results": {
        "document": {
            "results": {
                "classify and unbundle invoices": [],
                "class filter for invoices": {
                    "field_id": 6866379,
                    "confidence": {
                        "Invoices": 0.999618701653154,
                        "Receipts": 0.0003806857502734332,
                        "Other": 6.125965725590266e-7
                    },
                    "label": "Other"
                }
            },
            "rejected": {
                "classify and unbundle invoices": [],
                "class filter for invoices": []
            }
        }
    }
}

{
    "file_version": 3,
    "submission_id": 97349,
    "modelgroup_metadata": {
        "5207": {
            "id": 5207,
            "task_type": "classification_unbundling",
            "name": "classify_unbundle_readapiv2_no_unpack",
            "selected_model": {
                "id": 8302,
                "model_type": "unbundle"
            }
        },
        ...
    },
    "component_metadata": {
        ...
        "17002": {
            "id": 17002,
            "name": "Classify & Unbundle",
            "component_type": "model_group",
            "task_type": "classification_unbundling"
        },
        ...
    },
    "submission_results": [
        {
            "submissionfile_id": 93641,
            "etl_output": "indico-file:///storage/submission/4903/97349/93641/etl_output.json",
            "input_filename": "bundled_doc-1.pdf",
            "input_filepath": "indico-file:///storage/submission/4903/97349/93641.pdf",
            "input_filesize": 361241,
            "model_results": {
                "ORIGINAL": {
                    ...
                    "5207": [
                        {
                            "label": "annual report",
                            "spans": [
                                {
                                    "start": 0,
                                    "end": 2426,
                                    "page_num": 0
                                },
                                {
                                    "start": 2427,
                                    "end": 2463,
                                    "page_num": 1
                                }
                            ],
                            "span_id": "93641:c:17002:idx:0",
                            "confidence": {
                                "annual report": 0.9949126243591309,
                                "avg annual report": 0.002969800028949976,
                                "financial disclosures": 0.002117517637088895
                            },
                            "field_id": 429947,
                            "location_type": "exact"
                        },
                        {
                            "label": "financial disclosures",
                            "spans": [
                                {
                                    "start": 2464,
                                    "end": 3659,
                                    "page_num": 2
                                },
                                {
                                    "start": 3660,
                                    "end": 5017,
                                    "page_num": 3
                                },
                                {
                                    "start": 5018,
                                    "end": 6477,
                                    "page_num": 4
                                },
                                {
                                    "start": 6478,
                                    "end": 8038,
                                    "page_num": 5
                                },
                                {
                                    "start": 8039,
                                    "end": 9392,
                                    "page_num": 6
                                },
                                {
                                    "start": 9393,
                                    "end": 10802,
                                    "page_num": 7
                                },
                                {
                                    "start": 10803,
                                    "end": 11939,
                                    "page_num": 8
                                },
                                {
                                    "start": 11940,
                                    "end": 13167,
                                    "page_num": 9
                                },
                                {
                                    "start": 13168,
                                    "end": 14544,
                                    "page_num": 10
                                },
                                {
                                    "start": 14545,
                                    "end": 15919,
                                    "page_num": 11
                                },
                                {
                                    "start": 15920,
                                    "end": 17441,
                                    "page_num": 12
                                },
                                {
                                    "start": 17442,
                                    "end": 18723,
                                    "page_num": 13
                                },
                                {
                                    "start": 18724,
                                    "end": 19341,
                                    "page_num": 14
                                },
                                {
                                    "start": 19342,
                                    "end": 22187,
                                    "page_num": 15
                                },
                                {
                                    "start": 22188,
                                    "end": 25429,
                                    "page_num": 16
                                },
                                {
                                    "start": 25430,
                                    "end": 27204,
                                    "page_num": 17
                                },
                                {
                                    "start": 27205,
                                    "end": 30796,
                                    "page_num": 18
                                }
                            ],
                            "span_id": "93641:c:17002:idx:1",
                            "confidence": {
                                "annual report": 0.0030741794034838678,
                                "avg annual report": 0.0038796598091721536,
                                "financial disclosures": 0.9930461049079895
                            },
                            "field_id": 429947,
                            "location_type": "exact"
                        },
                        {
                            "label": "avg annual report",
                            "spans": [
                                {
                                    "start": 30797,
                                    "end": 32809,
                                    "page_num": 19
                                },
                                {
                                    "start": 32810,
                                    "end": 36737,
                                    "page_num": 20
                                }
                            ],
                            "span_id": "93641:c:17002:idx:2",
                            "confidence": {
                                "annual report": 0.003948610741645098,
                                "avg annual report": 0.9928027987480164,
                                "financial disclosures": 0.003248531138524413
                            },
                            "field_id": 429947,
                            "location_type": "exact"
                        }
                    ]
                }
            },
            "component_results": {
                "ORIGINAL": {}
            },
            "rejected": {
                "models": {
                    "5208": [],
                    "5210": [],
                    "5209": [],
                    "5207": []
                },
                "components": {}
            }
        }
    ],
    "reviews": {},
    "errored_files": {}
}
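
In the version 3 example above, each prediction describes one unbundled document, with one span per page. A hedged sketch of turning those predictions into page ranges follows; the helper unbundled_documents is hypothetical, and page numbers are kept zero-based as they appear in the example.

```python
def unbundled_documents(predictions: list[dict]) -> list[dict]:
    """Summarize each unbundled document as a label plus a page range."""
    documents = []
    for pred in predictions:
        pages = sorted(span["page_num"] for span in pred["spans"])
        documents.append({
            "span_id": pred["span_id"],
            "label": pred["label"],
            "first_page": pages[0],
            "last_page": pages[-1],
            "confidence": pred["confidence"][pred["label"]],
        })
    return documents


# Trimmed from the Classify and Unbundle example above.
predictions = [
    {
        "label": "annual report",
        "spans": [
            {"start": 0, "end": 2426, "page_num": 0},
            {"start": 2427, "end": 2463, "page_num": 1},
        ],
        "span_id": "93641:c:17002:idx:0",
        "confidence": {"annual report": 0.9949126243591309},
    },
    {
        "label": "avg annual report",
        "spans": [
            {"start": 30797, "end": 32809, "page_num": 19},
            {"start": 32810, "end": 36737, "page_num": 20},
        ],
        "span_id": "93641:c:17002:idx:2",
        "confidence": {"avg annual report": 0.9928027987480164},
    },
]

for doc in unbundled_documents(predictions):
    print(f"{doc['label']}: pages {doc['first_page']}-{doc['last_page']}")
```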

Extraction

For extraction results, each extracted text (identified by its character start and end indexes and its text) carries a confidence score for each class in the agent, nested under the confidence key. The class with the highest confidence is identified as the label (under the label key). Confidence levels are between 0 and 1. You may notice <PAD> in the list of labels; <PAD> indicates the absence of a label.

{
    "submission_id": 91,
    "etl_output": "indico-file:///storage/submission/ocr_output.json",
    "errors": [],
    "results": {
        "document": {
            "results": {
                "Invoices Extraction": [
                    {
                        "start": 115,
                        "end": 126,
                        "label": "Invoice Number",
                        "text": "redacted",
                        "confidence": {
                            "Line Item": 4.491627958458366e-9,
                            "Total": 1.0310143494507429e-7,
                            "Vendor": 3.994096786641421e-8,
                            "<PAD>": 3.4748592270261724e-7,
                            "Invoice Number": 0.9999995231628418,
                            "Line Item Value": 1.2659886472476956e-8
                        }
                    },
                    {
                        "start": 978,
                        "end": 1004,
                        "label": "Vendor",
                        "text": "redacted",
                        "confidence": {
                            "Line Item": 2.7375634203963273e-8,
                            "Total": 1.2861304909961291e-8,
                            "Vendor": 1,
                            "<PAD>": 1.0245797099628362e-8,
                            "Invoice Number": 3.3023983547764146e-8,
                            "Line Item Value": 4.132349928909207e-8
                        }
                    },
                    {
                        "start": 1960,
                        "end": 2069,
                        "label": "Line Item",
                        "text": "redacted",
                        "confidence": {
                            "Line Item": 0.9999918937683105,
                            "Total": 7.112359412531077e-8,
                            "Vendor": 0.0000023915347355796257,
                            "<PAD>": 0.000003880942585965386,
                            "Invoice Number": 1.288364899210137e-7,
                            "Line Item Value": 0.0000016712019714759663
                        }
                    },
                    ...
                ]
            }
        }
    }
}

{
    "file_version": 3,
    "submission_id": 39855,
    "modelgroup_metadata": {
        "18584": {
            "id": 18584,
            "task_type": "annotation",
            "name": "model1",
            "selected_model": {
                "id": 23032,
                "model_type": "finetune"
            }
        },
        ...
    },
    "component_metadata": {
        "82564": {
            "id": 82564,
            "name": null,
            "component_type": "input_ocr_extraction"
        },
        "82565": {
            "id": 82565,
            "name": null,
            "component_type": "output_json_formatter"
        },
        "82567": {
            "id": 82567,
            "name": "Document Extraction",
            "component_type": "model_group"
        },
        "82569": {
            "id": 82569,
            "name": "Document Extraction",
            "component_type": "model_group"
        },
        "84000": {
            "id": 84000,
            "name": "Date and Price Group",
            "component_type": "link_label"
        },
        "91655": {
            "id": 91655,
            "name": "Standard Output",
            "component_type": "default_output"
        }
    },
    "submission_results": [
        {
            "submissionfile_id": 58851,
            "etl_output": "indico-file:///storage/submission/20869/39855/58851/etl_output.json",
            "input_filename": "redacted",
            "input_filepath": "indico-file:///storage/submission/20869/39855/58851.txt",
            "input_filesize": 2919,
            "model_results": {
                "ORIGINAL": {
                    "18584": [
                        {
                            "label": "person_name",
                            "spans": [
                                {
                                    "start": 80,
                                    "end": 98,
                                    "page_num": 0
                                }
                            ],
                            "span_id": "58851:c:82567:idx:1",
                            "confidence": {
                                "address": 0.0017796654719859362,
                                "category": 0.00037611479638144374,
                                "date": 0.0002551926299929619,
                                "email": 0.1306380033493042,
                                "person_name": 0.8642205595970154,
                                "phone": 0.00008364625682588667,
                                "price": 0.00075566116720438,
                                "unformatted_summary": 0.0008147378102876246,
                                "unformatted_text": 0.0003376395034138113
                            },
                            "field_id": 7995219,
                            "location_type": "exact",
                            "text": "redacted redacted",
                            "groupings": [],
                            "normalized": {
                                "text": "redacted redacted",
                                "start": 80,
                                "end": 98,
                                "structured": null,
                                "formatted": "redacted redacted",
                                "status": "SUCCESS",
                                "comparison_type": "string",
                                "comparison_value": "redacted redacted",
                                "validation": [
                                    {
                                        "validation_type": "TYPE_CONVERSION",
                                        "error_message": null,
                                        "validation_status": "SUCCESS"
                                    }
                                ]
                            }
                        },
                        ...
                    ],
                    ...
                }
            },
            "component_results": {
                "ORIGINAL": {}
            },
            "rejected": {
                "models": {
                    "18584": [
                        {
                            "label": "email",
                            "spans": [
                                {
                                    "start": 4,
                                    "end": 73,
                                    "page_num": 0
                                }
                            ],
                            "span_id": "58851:c:82567:idx:0",
                            "confidence": {
                                "address": 0.0001422955101588741,
                                "category": 0.00004625660221790895,
                                "date": 0.00002712739478738513,
                                "email": 0.995801568031311,
                                "person_name": 0.0005087707540951669,
                                "phone": 0.000011167575394210871,
                                "price": 0.00023257092107087374,
                                "unformatted_summary": 0.0017664311453700066,
                                "unformatted_text": 0.00026696716668084264
                            },
                            "field_id": 7995218,
                            "location_type": "exact",
                            "text": "\"redacted\" <redacted>",
                            "groupings": []
                        },
                        ...
                    ],
                    ...
                },
                "components": {}
            }
        },
        ...
    ],
    "reviews": {},
    "errored_files": {}
}
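
To work with version 3 extraction output, you might flatten each prediction into a simple row. The sketch below assumes predictions shaped like the example above; the helper extraction_rows is hypothetical. It reads the normalized block only when it is present, since the rejected prediction in the example does not include one.

```python
def extraction_rows(predictions: list[dict]) -> list[dict]:
    """Flatten extraction predictions into label/text/confidence rows."""
    rows = []
    for pred in predictions:
        label = pred["label"]
        normalized = pred.get("normalized") or {}  # absent on some predictions
        rows.append({
            "label": label,
            "text": pred["text"],
            "confidence": pred["confidence"][label],
            "formatted": normalized.get("formatted"),
        })
    return rows


# Trimmed from the version 3 extraction example above.
file_result = {
    "model_results": {"ORIGINAL": {"18584": [{
        "label": "person_name",
        "spans": [{"start": 80, "end": 98, "page_num": 0}],
        "confidence": {"email": 0.1306380033493042,
                       "person_name": 0.8642205595970154},
        "text": "redacted redacted",
        "groupings": [],
        "normalized": {"formatted": "redacted redacted", "status": "SUCCESS"},
    }]}},
    "rejected": {"models": {"18584": [{
        "label": "email",
        "spans": [{"start": 4, "end": 73, "page_num": 0}],
        "confidence": {"email": 0.995801568031311},
        "text": "\"redacted\" <redacted>",
        "groupings": [],
    }]}, "components": {}},
}

accepted = extraction_rows(file_result["model_results"]["ORIGINAL"]["18584"])
rejected = extraction_rows(file_result["rejected"]["models"]["18584"])
print(len(accepted), "accepted;", len(rejected), "rejected")
```

The same function works for both accepted and rejected predictions because the two lists share the same prediction shape in the example.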

Other JSON Output Styles

Linked Labels

Your JSON output document may look slightly different if your agent is downstream from a linked labels transformer in a workflow. The results document continues to follow the standard format, with the addition of a groupings key on each prediction. In the groupings list, each linked-label group is identified by its group_name and group_index. Each instance of a group has a unique index, so labels that share an index are in the same label group.

If a prediction went through the transformer but was determined not to be part of any group, it contains an empty groupings list (i.e., "groupings": []).

{
    "submission_id": 9,
    "etl_output": "indico-file:///storage/submission/ocr_output.json",
    "errors": [],
    "results": {
        "document": {
            "results": {
                "Invoices Extraction": [
                    {
                        "start": 115,
                        "end": 126,
                        "label": "Invoice Number",
                        "text": "10000023222",
                        "confidence": {
                            "Line Item Description": 4.491627958458366e-9,
                            "Total": 1.0310143494507429e-7,
                            "Vendor": 3.994096786641421e-8,
                            "<PAD>": 3.4748592270261724e-7,
                            "Invoice Number": 0.9999995231628418,
                            "Line Item Value": 1.2659886472476956e-8
                        },
                        "groupings": []
                    },
                    {
                        "start": 321,
                        "end": 340,
                        "label": "Line Item Description",
                        "text": "Hospitalization Level 2",
                        "confidence": {
                            "Invoice Number": 4.491627958458366e-9,
                            "Total": 1.0310143494507429e-7,
                            "Vendor": 3.994096786641421e-8,
                            "<PAD>": 3.4748592270261724e-7,
                            "Line Item Description": 0.9999995231628418,
                            "Line Item Value": 1.2659886472476956e-8
                        },
                        "groupings": [
                            {
                                "group_name": "Line Item",
                                "group_index": 1
                            }
                        ]
                    },
                    {
                        "start": 351,
                        "end": 359,
                        "label": "Line Item Value",
                        "text": "350.00",
                        "confidence": {
                            "Line Item Value": 0.9999995231628418,
                            "Total": 1.0310143494507429e-7,
                            "Vendor": 3.994096786641421e-8,
                            "<PAD>": 3.4748592270261724e-7,
                            "Invoice Number": 1.0310143494507429e-7,
                            "Line Item Description": 1.2659886472476956e-8
                        },
                        "groupings": [
                            {
                                "group_name": "Line Item",
                                "group_index": 1
                            }
                        ]
                    },
                    ...
                ]
            }
        }
    }
}

{
    "file_version": 3,
    "submission_id": 23111,
    "modelgroup_metadata": {
        "9111": {
            "id": 9111,
            "task_type": "classification_unbundling",
            "name": "multi_models_classify_unbundle",
            "selected_model": {
                "id": 11111,
                "model_type": "unbundle"
            }
        },
        "9112": {
            "id": 9112,
            "task_type": "classification",
            "name": "multi_models_classification_model",
            "selected_model": {
                "id": 11112,
                "model_type": "tfidf_gbt"
            }
        },
        "9113": {
            "id": 9113,
            "task_type": "annotation",
            "name": "multi_models_extraction_model",
            "selected_model": {
                "id": 11114,
                "model_type": "finetune"
            }
        }
    },
    "submission_results": [
        {
            "submissionfile_id": 21111,
            "etl_output": "indico-file:///storage/submission/11599/23532/2111/etl_output.json",
            "input_filename": "MultiClass.pdf",
            "input_filepath": "indico-file:///storage/submission/11599/23111/24607.pdf",
            "input_filesize": 15103,
            "model_results": {
                "ORIGINAL": {
                    "9111": [
                        {
                            "label": "financial disclosures",
                            "spans": [
                                {
                                    "start": 0,
                                    "end": 31,
                                    "page_num": 0
                                }
                            ],
                            "span_id": "24607:c:47034:idx:0",
                            "confidence": {
                                "annual report": 0.01469539012759924,
                                "avg annual report": 0.005166168324649334,
                                "financial disclosures": 0.9801384210586548
                            },
                            "field_id": 6403711
                        },
                        {
                            "label": "financial disclosures",
                            "spans": [
                                {
                                    "start": 32,
                                    "end": 89,
                                    "page_num": 1
                                }
                            ],
                            "span_id": "24607:c:47034:idx:1",
                            "confidence": {
                                "annual report": 0.011918464675545692,
                                "avg annual report": 0.004315620753914118,
                                "financial disclosures": 0.9837659001350403
                            },
                            ...
                        }
                    ],
                    ...
                    "9113": []
                }
            },
            "component_results": {
                "ORIGINAL": {}
            },
            "rejected": {
                "models": {
                    "9111": [],
                    "9112": [],
                    "9113": []
                },
                "components": {}
            }
        }
    ],
    "reviews": {}
}
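
A minimal Python sketch of how a version 3 result like the one above might be walked. It assumes the file has already been loaded into a dictionary (for example with json.load), that every key under model_results["ORIGINAL"] is a model group ID that also appears in modelgroup_metadata, and that the highest value in confidence is the class to report. The function name top_predictions and the trimmed sample dictionary are illustrative and not part of the platform.

```python
def top_predictions(result):
    """Yield (model_name, label, score, spans) for each v3 prediction."""
    metadata = result.get("modelgroup_metadata", {})
    for file_result in result.get("submission_results", []):
        original = file_result.get("model_results", {}).get("ORIGINAL", {})
        for group_id, predictions in original.items():
            # Fall back to the raw ID if the group is missing from the metadata.
            name = metadata.get(group_id, {}).get("name", group_id)
            for prediction in predictions:
                confidence = prediction.get("confidence", {})
                # Assumption: the top-scoring class is the one to report.
                if confidence:
                    label = max(confidence, key=confidence.get)
                else:
                    label = prediction.get("label")
                yield name, label, confidence.get(label), prediction.get("spans", [])


# Hypothetical sample, trimmed down from the example above.
sample = {
    "modelgroup_metadata": {
        "9111": {"id": 9111, "name": "multi_models_classify_unbundle"},
    },
    "submission_results": [
        {
            "model_results": {
                "ORIGINAL": {
                    "9111": [
                        {
                            "label": "financial disclosures",
                            "spans": [{"start": 0, "end": 31, "page_num": 0}],
                            "confidence": {
                                "annual report": 0.0147,
                                "financial disclosures": 0.9801,
                            },
                        }
                    ],
                    "9113": [],
                }
            }
        }
    ],
}

rows = list(top_predictions(sample))
for name, label, score, spans in rows:
    pages = sorted({span["page_num"] for span in spans})
    print(name, label, score, pages)
```

Model groups with an empty list, such as "9113" here, simply produce no rows.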

Forms Output

Output from a Forms agent contains a recognized_forms key for the agent, which details all the forms recognized in the output file, and a form_version key for each prediction.

{
    "file_version": 1,
    "submission_id": 11111,
    "etl_output": "indico-file:///storage/submission/13708/11111/1111/etl_output.json",
    "results": {
        "document": {
            "results": {
                "ACORD Model": [
                    {
                        "start": null,
                        "end": null,
                        "label": "Agency",
                        "confidence": {
                            "Agency": 1.0
                        },
                        "field_id": 6801111,
                        "top": 219,
                        "bottom": 530,
                        "left": 63,
                        "right": 1286,
                        "page_num": 0,
                        "type": "text",
                        "text": "My Insurance Group \n1234 Main St. \nBoston, MA 02111",
                        "normalized": {
                            "text": "Mediocre Insurance Group \n1234 Main St. \nBoston, MA 02111",
                            "start": null,
                            "end": null,
                            "structured": null,
                            "formatted": "Mediocre Insurance Group \n1234 Main St. \nBoston, MA 02111",
                            "status": "SUCCESS",
                            "validation": [
                                {
                                    "validation_type": "TYPE_CONVERSION",
                                    "error_message": null,
                                    "validation_status": "SUCCESS"
                                }
                            ]
                        }
                    },
{
    "file_version": 3,
    "submission_id": 19111,
    {
        "processed_file_name": "indico-blob:///storage/submission/0000/6803/0000.pdf",
        "recognized_forms": {
            "Acord-125-2016-03": [
                0
            ]
        },
        "etl_output_url": "indico-blob:///storage/submission/2851/6803/0000/etl_output.json"
    }
],
"pages": [
    {
        "template_name": "299485",
        "template_page_number": 0,
        "match_confidence": 0.98,
        "zones": [
            {
                "top": 119,
                "bottom": 231,
                "left": 2101,
                "right": 2485,
                "page_num": 0,
                "type": "text",
                "text": "",
                "label": "Date",
                "confidence": 100,
                "form_version": "Acord-125-2016-03",
                "value": ""
            },
            {
                "top": 219,
                "bottom": 530,
                "left": 63,
                "right": 1286,
                "page_num": 0,
                "type": "text",
                "text": "Mediocre Insurance Group \n1234 Main St. \nBoston, MA 02111",
                "label": "Agency",
                "confidence": 100,
                "form_version": "Acord-125-2016-03",
                "value": "Mediocre Insurance Group \n1234 Main St. \nBoston, MA 02111"
            },
            {
                "top": 219,
                "bottom": 331,
                "left": 1262,
                "right": 2275,
                "page_num": 0,
                "type": "text",
                "text": "Anonymous Insurance",
                "label": "Carrier",
                "confidence": 100,
                "form_version": "Acord-125-2016-03",
                "value": "Anonymous Insurance"
            },
            {
                "top": 219,
                "bottom": 331,
                "left": 2251,
                "right": 2485,
                "page_num": 0,
                "type": "text",
                "text": "",
                "label": "NAICCodePg1",
                "confidence": 100,
                "form_version": "Acord-125-2016-03",
                "value": ""
            },
            {
                "top": 325,
                "bottom": 431,
                "left": 1262,
                "right": 2200,
                "page_num": 0,
                "type": "text",
                "text": "",
                "label": "PolicyProgramName",
                "confidence": 100,
                "form_version": "Acord-125-2016-03",
                "value": ""
            },
            {
                "top": 319,
                "bottom": 431,
                "left": 2176,
                "right": 2485,
                "page_num": 0,
                "type": "text",
                "text": "",
                "label": "CompanyProductCode",
                "confidence": 100,
                "form_version": "Acord-125-2016-03",
                "value": ""
            },
            {
                "top": 419,
                "bottom": 531,
                "left": 1262,
                "right": 2485,
                "page_num": 0,
                "type": "text",
                "text": "0123456789",
                "label": "PolicyNumber",
                "confidence": 100,
                "form_version": "Acord-125-2016-03",
                "value": "0123456789"
            },
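
As a sketch, entries shaped like the ones above can be grouped by form_version to see which fields were actually filled in on each recognized form. The helper name filled_fields_by_form and the sample list are hypothetical. The code assumes each entry carries label, text, and form_version keys and that an empty text means the field was left blank.

```python
from collections import defaultdict


def filled_fields_by_form(zones):
    """Map each form_version to a {label: text} dict of non-empty fields."""
    forms = defaultdict(dict)
    for zone in zones:
        text = zone.get("text") or ""
        if not text.strip():
            continue  # Skip fields that were recognized but left blank.
        forms[zone.get("form_version", "unknown")][zone["label"]] = text
    return dict(forms)


# Hypothetical sample using values from the example above.
zones = [
    {"label": "Date", "text": "", "form_version": "Acord-125-2016-03"},
    {"label": "Carrier", "text": "Anonymous Insurance", "form_version": "Acord-125-2016-03"},
    {"label": "PolicyNumber", "text": "0123456789", "form_version": "Acord-125-2016-03"},
]

fields = filled_fields_by_form(zones)
print(fields)
```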

Normalization/Typed Answer Keys

If you use Typed Answer Keys (TAK) to normalize your output, the normalization appears in your result file. Each normalized prediction includes a normalized section, which holds the formatted value, the validation that was applied and whether it succeeded, and the original text.

🚧

A Note on Normalization with Autoreview

Autoreview users who have normalized their results will need to modify their autoreview scripts and their integration with downstream systems. Downstream systems and autoreview scripts should be updated to use the normalized values as the final value rather than the original text.

To update autoreview scripts: Change prediction["text"] to prediction["normalized"]["formatted"] in autoreview and post-processing code.

{
    "file_version": 1,
    "submission_id": 91111,
    "etl_output": "indico-file:///storage/submission/3106/93449/1111/etl_output.json",
    "results": {
        "document": {
            "results": {
                "Test Workflow": [
                    {
                        "start": 410,
                        "end": 432,
                        "label": "Income Amount",
                        "confidence": {
                            "Asset Value": 0.001380413887090981,
                            "Date of Appointment": 3.0055036859266693e-7,
                            "Department": 3.4668929060899245e-7,
                            "Income Amount": 0.7819252610206604,
                            "Liability Amount": 8.27995336294407e-6,
                            "Liability Type": 1.7472679019192583e-6,
                            "Name": 1.465798504796112e-6,
                            "Position": 7.679560098949878e-7,
                            "Previous Organization": 1.8361643014941365e-6,
                            "Previous Position": 1.587482643117255e-6
                        },
                        "field_id": 91111,
                        "page_num": 0,
                        "text": "00 6,575.91 6,575.91 0",
                        "normalized": {
                            "text": "00 6,575.91 6,575.91 0",
                            "start": 410,
                            "end": 412,
                            "structured": {
                                "currency": null,
                                "amount": 0.0,
                                "currency_symbol": null
                            },
                            "formatted": "$0.00",
                            "status": "SUCCESS",
                            "validation": [
                                {
                                    "validation_type": "TYPE_CONVERSION",
                                    "error_message": null,
                                    "validation_status": "SUCCESS"
                                }
                            ]
                        }
                    },
                    {
                        "start": 433,
                        "end": 436,
                        "label": "Income Amount",
                        "confidence": {
                            "Asset Value": 0.010080317035317421,
                            "Date of Appointment": 5.33476793407317e-7,
                            "Department": 5.270041469884745e-7,
                            "Income Amount": 0.6256295442581177,
                            "Liability Amount": 0.000012726128261419944,
                            "Liability Type": 5.120524292578921e-6,
                            "Name": 5.42804627912119e-7,
                            "Position": 7.942233537505672e-7,
                            "Previous Organization": 4.114456714887638e-6,
                            "Previous Position": 6.950397164473543e-6
                        },
                        "field_id": 91111,
                        "page_num": 0,
                        "text": "000",
                        "normalized": {
                            "text": "000",
                            "start": 433,
                            "end": 436,
                            "structured": {
                                "currency": null,
                                "amount": 0.0,
                                "currency_symbol": null
                            },
                            "formatted": "$0.00",
                            "status": "SUCCESS",
                            "validation": [
                                {
                                    "validation_type": "TYPE_CONVERSION",
                                    "error_message": null,
                                    "validation_status": "SUCCESS"
                                }
                            ]
                        }
                    },
{
    "file_version": 3,
    "submission_id": 25542,
    "modelgroup_metadata": {
        "12842": {
            "id": 12842,
            "task_type": "annotation",
            "name": "invoices",
            "selected_model": {
                "id": 18969,
                "model_type": "finetune"
            }
        }
    },
    "submission_results": [
        {
            "submissionfile_id": 26781,
            "etl_output": "indico-file:///storage/submission/10000/2000/20001/etl_output.json",
            "input_filename": "redacted",
            "input_filepath": "indico-file:///storage/submission/15200/25542/26781.pdf",
            "input_filesize": 105501,
            "model_results": {
                "ORIGINAL": {
                    "12842": [
                        {
                            "label": "vendor",
                            "spans": [
                                {
                                    "start": 29,
                                    "end": 54,
                                    "page_num": 0
                                }
                            ],
                            "span_id": "26781:c:62510:idx:0",
                            "confidence": {
                                "invoice": 0.00000904287207958987,
                                "vendor": 0.9999739527702332
                            },
                            "field_id": 7220809,
                            "text": "redacted",
                            "normalized": {
                                "text": "redacted",
                                "start": 29,
                                "end": 54,
                                "structured": null,
                                "formatted": "redacted",
                                "status": "SUCCESS",
                                "validation": [
                                    {
                                        "validation_type": "TYPE_CONVERSION",
                                        "error_message": null,
                                        "validation_status": "SUCCESS"
                                    }
                                ]
                            }
                        },
                        ...
                    ]
                }
            },
            "component_results": {
                "ORIGINAL": {}
            },
            "rejected": {
                "models": {
                    "12842": []
                },
                "components": {}
            }
        },
        ...
    ],
    "reviews": {}
}
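
The normalized block shown in the examples above can be read downstream with a small helper. This is a sketch under stated assumptions: the name final_value is hypothetical, and treating a status of "SUCCESS" as the only case where the formatted value should replace the original text is a design choice made here, not documented platform behavior.

```python
def final_value(prediction):
    """Return the normalized value when available, else the original text."""
    normalized = prediction.get("normalized") or {}
    formatted = normalized.get("formatted")
    # Assumption: only trust the formatted value when normalization succeeded.
    if normalized.get("status") == "SUCCESS" and formatted is not None:
        return formatted
    return prediction.get("text")


# Values taken from the version 1 example above.
with_normalization = {
    "label": "Income Amount",
    "text": "000",
    "normalized": {"text": "000", "formatted": "$0.00", "status": "SUCCESS"},
}
# Hypothetical prediction from a field without Typed Answer Keys.
without_normalization = {"label": "vendor", "text": "redacted"}

print(final_value(with_normalization))
print(final_value(without_normalization))
```

Falling back to text keeps the helper usable on result files where normalization is not enabled.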

Output with Review/Autoreview

Raw result files remain unaltered to guarantee that all reviewers can access and evaluate the original file consistently. Results that undergo review closely resemble standard results but include additional sections: pre_review, post_reviews, and final nested under each agent, and reviews_meta for the submission, as the version 1 example below shows. Confidence levels are not provided for results that incorporate reviewer corrections.

🚧

A Note on Normalization with Autoreview

Autoreview users who have normalized their results will need to modify their autoreview scripts and their integration with downstream systems. Downstream systems and autoreview scripts should be updated to use the normalized values as the final value rather than the original text.

To update autoreview scripts: Change prediction["text"] to prediction["normalized"]["formatted"] in autoreview and post-processing code, or, in autoreview only, set prediction["normalized"]["formatted"] in addition to prediction["text"].
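
One way to follow the second option in an autoreview script is to write a corrected value to both keys at once, so the two never drift apart. The sketch below is illustrative only: the function name apply_correction is hypothetical, and how an autoreview script receives and returns predictions is not covered here.

```python
import copy


def apply_correction(prediction, new_value):
    """Return a copy of the prediction with text and normalized value in sync."""
    updated = copy.deepcopy(prediction)
    updated["text"] = new_value
    # Only touch the normalized block when the prediction actually has one.
    if isinstance(updated.get("normalized"), dict):
        updated["normalized"]["formatted"] = new_value
    return updated


# Hypothetical prediction shaped like the normalized examples in this guide.
original = {
    "label": "Income Amount",
    "text": "000",
    "normalized": {"text": "000", "formatted": "$0.00", "status": "SUCCESS"},
}

corrected = apply_correction(original, "$6,575.91")
print(corrected["text"], corrected["normalized"]["formatted"])
```

Working on a deep copy leaves the original prediction untouched, which makes it easier to compare values before and after the correction.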

{
    "submission_id": 23,
    "etl_output": "foo_etl-output.json",
    "errors": [],
    "results": {
        "document": {
            "results": {
                "bar": {
                    "pre_review": [
                        {
                            "etl": "Lorem Ipsum"
                        }
                    ],
                    "post_reviews": [
                        [
                            {
                                "etl": "dolor sit amet"
                            }
                        ],
                        [
                            {
                                "etl": "consectetur adipiscing elit"
                            }
                        ]
                    ],
                    "final": [
                        {
                            "etl": "consectetur adipiscing elit"
                        }
                    ]
                }
            }
        }
    },
    "reviews_meta": [
        {
            "review_id": 2,
            "reviewer_id": 2,
            "review_notes": "Fooey",
            "review_rejected": false,
            "review_type": "manual"
        },
        {
            "review_id": 1,
            "reviewer_id": 1,
            "review_notes": null,
            "review_rejected": false,
            "review_type": "manual"
        }
    ],
    "file_version": 1,
    "review_id": 1,
    "reviewer_id": 1,
    "review_notes": null,
    "review_rejected": false,
    "review_type": "manual"
}
{
    "file_version": 3,
    "submission_id": 21111,
    "modelgroup_metadata": {
        "12111": {
            "id": 12111,
            "task_type": "annotation",
            "name": "invoices",
            "selected_model": {
                "id": 18111,
                "model_type": "finetune"
            }
        }
    },
    "submission_results": [
        {
            "submissionfile_id": 26111,
            "etl_output": "indico-file:///storage/submission/10000/2000/20001/etl_output.json",
            "input_filename": "redacted",
            "input_filepath": "indico-file:///storage/submission/15200/25542/1111.pdf",
            "input_filesize": 105501,
            "model_results": {
                "ORIGINAL": {
                    "11111": [
                        {
                            "label": "vendor",
                            "spans": [
                                {
                                    "start": 29,
                                    "end": 54,
                                    "page_num": 0
                                }
                            ],
                            "span_id": "26781:c:62510:idx:0",
                            "confidence": {
                                "invoice": 0.00000904287207958987,
                                "vendor": 0.9999739527702332
                            },
                            "field_id": 7220809,
                            "text": "redacted",
                            "normalized": {
                                "text": "redacted",
                                "start": 29,
                                "end": 54,
                                "structured": null,
                                "formatted": "redacted",
                                "status": "SUCCESS",
                                "validation": [
                                    {
                                        "validation_type": "TYPE_CONVERSION",
                                        "error_message": null,
                                        "validation_status": "SUCCESS"
                                    }
                                ]
                            }
                        },
                        ...
                    ]
                }
            },
            "component_results": {
                "ORIGINAL": {}
            },
            "rejected": {
                "models": {
                    "12842": []
                },
                "components": {}
            }
        }
    ],
    "reviews": {}
}
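
For the version 1 reviewed layout shown above, a short sketch of how the reviewed predictions might be pulled out per agent. It assumes the results.document.results path from the example, prefers final when it is present, and treats any value without a pre_review key as unreviewed output that can be passed through unchanged. The name final_predictions is hypothetical.

```python
def final_predictions(result):
    """Map each agent name to its reviewed predictions in a v1 result."""
    models = result["results"]["document"]["results"]
    output = {}
    for model_name, model_result in models.items():
        if isinstance(model_result, dict) and "pre_review" in model_result:
            # Reviewed layout: prefer final, fall back to pre_review.
            final = model_result.get("final")
            output[model_name] = final if final is not None else model_result["pre_review"]
        else:
            # Unreviewed layout: the value is already the prediction data.
            output[model_name] = model_result
    return output


# Sample trimmed from the version 1 example above.
reviewed = {
    "file_version": 1,
    "results": {
        "document": {
            "results": {
                "bar": {
                    "pre_review": [{"etl": "Lorem Ipsum"}],
                    "post_reviews": [
                        [{"etl": "dolor sit amet"}],
                        [{"etl": "consectetur adipiscing elit"}],
                    ],
                    "final": [{"etl": "consectetur adipiscing elit"}],
                }
            }
        }
    },
}

print(final_predictions(reviewed))
```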

Output with Summarization Enabled

  • Summarization (Version 1): In version 1, summarization output resembles the output of a single classification agent, with the addition of summary text and citation data. The summary, including its citation markers, is contained in the text key, and each agent’s output is structured as a dictionary rather than an array.
  • Summarization (Version 3): In version 3, the output is structured as an array with one entry per file in the submission. Each entry includes the summary text and citation data, which point back to segments in the source document.
📘

Citation formatting

The citations included in your output have two ranges:

  • One that points to the text in the source document
  • Another that points to the text in the generated summary

Each citation corresponds to a specific segment of the generated summary. For instance, "[1-2]" indicates that a section of the summary is based on two different parts of the source document. These citations replace full phrases or sentences in the summary text and are indicated by numbers such as "[0]" or similar.

"summary_model": {
    "field_id": 162,
    "confidence": {
        "foo": 1.0
    },
    "label": "foo",
    "text": "• This document ... the company [1-2]. \n• Confidential ... third parties [4][6]. \n• The ... [12-13]. \n• Furthermore, ... date [23].\n• In exchange for ... grant [31][36][39]. \n• The agreement ... Virginia [7]. \n• The ... employment [38][42].",
    "citations": [
        {
            "document": {
                "start": 0,
                "end": 394,
                "page_num": 0
            },
            "response": {
                "start": 188,
                "end": 191
            }
        },
        ...
    ]
}
},
...
"model_results": {
    "ORIGINAL": {
        "160": [
            {
                "field_id": 160,
                "confidence": {
                    "foo": 1.0
                },
                "label": "foo",
                "text": "• The document ... or boat [0][4].\n• ... wolves [1].\n• Gates ... years [6].",
                "citations": [
                    {
                        "document": {
                            "start": 16739,
                            "end": 17085,
                            "page_num": 6
                        },
                        "response": {
                            "start": 440,
                            "end": 443
                        }
                    },
                    ...
                ],
                "ctx_id": "75:c:489:idx:1"
            },
...
        ],
...
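
The two ranges in each citation can be resolved with plain string slicing. The sketch below assumes that start and end are character offsets with an exclusive end, which is consistent with the three-character response ranges in the examples above but is not confirmed by them. It also assumes that the full document text is available separately, for example from the ETL output. The name resolve_citations and the sample strings are hypothetical.

```python
def resolve_citations(summary_text, citations, document_text):
    """Pair each citation marker in the summary with its source passage."""
    resolved = []
    for citation in citations:
        response = citation["response"]
        document = citation["document"]
        resolved.append({
            # Assumption: offsets are character indices with an exclusive end.
            "marker": summary_text[response["start"]:response["end"]],
            "source": document_text[document["start"]:document["end"]],
            "page_num": document.get("page_num"),
        })
    return resolved


# Hypothetical miniature document and summary.
document_text = "Gates opened in 1990. Wolves live nearby."
summary_text = "Gates opened in 1990 [0]."
citations = [
    {
        "document": {"start": 0, "end": 21, "page_num": 0},
        "response": {"start": 21, "end": 24},
    }
]

resolved = resolve_citations(summary_text, citations, document_text)
print(resolved)
```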