class indico.queries.datasets.CreateDataset(name, files, wait=True, dataset_type='TEXT', from_local_images=False, image_filename_col='filename', batch_size=20, ocr_engine=None, omnipage_ocr_options=None, read_api_ocr_options=None)

Create a dataset and upload the associated files.

  • Parameters:
    • name (str) – Name of the dataset
    • files (List[str]) – List of pathnames to the dataset files

  • Options:
    • dataset_type (str, default='TEXT') – Type of dataset to create: TEXT, DOCUMENT, or IMAGE
    • wait (bool, default=True) – Block until the dataset has uploaded and finished processing

  • Returns:
    Dataset object
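
The following is a minimal sketch of creating a DOCUMENT dataset from a directory of files, not a definitive recipe. It assumes the query is submitted with `client.call(...)` on an `IndicoClient`; the helper names `collect_files` and `create_dataset_from_dir`, the `.pdf` filter, and the host and token path in the trailing comment are hypothetical.

```python
from pathlib import Path
from typing import List


def collect_files(directory: str, suffix: str = ".pdf") -> List[str]:
    """Return the sorted pathnames of regular files in `directory` ending in `suffix`."""
    return sorted(
        str(p)
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() == suffix
    )


def create_dataset_from_dir(client, name: str, directory: str):
    """Upload every matching file in `directory` as a new DOCUMENT dataset."""
    # Imported lazily so collect_files can be used without the client installed.
    from indico.queries.datasets import CreateDataset

    files = collect_files(directory)
    if not files:
        raise ValueError(f"no matching files found in {directory}")
    # wait=True (the default) blocks until the dataset is ready or has failed.
    return client.call(
        CreateDataset(name=name, files=files, dataset_type="DOCUMENT", wait=True)
    )


# Hypothetical usage (host and token path are placeholders):
#   from indico import IndicoClient, IndicoConfig
#   config = IndicoConfig(host="indico.example.com",
#                         api_token_path="./indico_api_token.txt")
#   dataset = create_dataset_from_dir(IndicoClient(config=config),
#                                     "invoices", "./invoices")
```

Collecting the paths in a separate function keeps the file-selection rule easy to check before anything is uploaded.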

class indico.queries.datasets.GetDataset(id)

Retrieve a dataset description object

  • Parameters:
    id (int) – id of the dataset to query
  • Returns:
    Dataset object

class indico.queries.datasets.GetDatasetStatus(id)

Get the status of a dataset

  • Parameters:
    id (int) – id of the dataset to query
  • Returns:
    COMPLETE or FAILED
  • Return type:
    status (str)
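
When a dataset is created with `wait=False`, the status has to be polled by hand. The sketch below shows one way to do that, assuming any value other than COMPLETE or FAILED means the dataset is still in progress; the helper names `wait_for_status` and `wait_for_dataset`, the polling interval, and the timeout are hypothetical choices, not part of the library.

```python
import time
from typing import Callable, Iterable


def wait_for_status(
    fetch_status: Callable[[], str],
    terminal: Iterable[str] = ("COMPLETE", "FAILED"),
    interval: float = 2.0,
    timeout: float = 300.0,
) -> str:
    """Call `fetch_status` until it returns a terminal value or `timeout` elapses."""
    terminal = set(terminal)
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"status still {status!r} after {timeout} s")
        time.sleep(interval)


def wait_for_dataset(client, dataset_id: int, **kwargs) -> str:
    """Poll GetDatasetStatus for `dataset_id` until it is COMPLETE or FAILED."""
    # Imported lazily so wait_for_status works without the client installed.
    from indico.queries.datasets import GetDatasetStatus

    return wait_for_status(
        lambda: client.call(GetDatasetStatus(id=dataset_id)), **kwargs
    )
```

Passing the status lookup in as a callable keeps the retry loop independent of the client, so the same loop can be reused for GetDatasetFileStatus by supplying a different set of terminal values.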

class indico.queries.datasets.GetDatasetFileStatus(id)

Get the status of dataset file upload

  • Parameters:
    id (int) – id of the dataset to query
  • Returns:
    DOWNLOADED or FAILED
  • Return type:
    status (str)

class indico.queries.datasets.ListDatasets(*, limit=100)

List all of your datasets

  • Options:
    • limit (int, default=100) – Max number of datasets to retrieve

  • Returns:
    List[Dataset]
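
As a sketch of a common use of this query, the helper below looks a dataset up by name. It assumes each returned Dataset object exposes a `name` attribute and that queries are submitted with `client.call(...)`; `find_by_name` and `get_dataset_by_name` are hypothetical helper names. Note that the signature makes `limit` keyword-only.

```python
from typing import Iterable, Optional


def find_by_name(datasets: Iterable, name: str) -> Optional[object]:
    """Return the first dataset whose `name` attribute equals `name`, else None."""
    return next((d for d in datasets if d.name == name), None)


def get_dataset_by_name(client, name: str, limit: int = 100):
    """List up to `limit` datasets and return the first one called `name`."""
    # Imported lazily so find_by_name can be used without the client installed.
    from indico.queries.datasets import ListDatasets

    # limit must be passed by keyword: the signature is ListDatasets(*, limit=100).
    return find_by_name(client.call(ListDatasets(limit=limit)), name)
```

If more datasets exist than `limit`, a match beyond the cutoff would not be found, so a `None` result does not prove that the dataset is absent.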

class indico.queries.datasets.DeleteDataset(id)

Delete a dataset

  • Parameters:
    id (int) – ID of the dataset
  • Returns:
    The success of the operation
  • Return type:
    success (bool)

class indico.queries.datasets.AddDatasetFiles(dataset_id, files, autoprocess=False, wait=True, batch_size=20)

Add files to a dataset.

  • Parameters:
    • dataset_id (int) – ID of the dataset
    • files (List[str]) – List of pathnames to the dataset files

  • Options:
    • autoprocess (bool, default=False) – Automatically process new dataset files
    • wait (bool, default=True) – Block while polling for status of files
    • batch_size (int, default=20) – Batch size for uploading files

  • Returns:
    Dataset
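
A minimal sketch of adding files to an existing dataset follows. Checking the paths locally first is a design choice of this example, not something the library requires; `split_existing` and `add_files` are hypothetical helper names, and the call assumes queries are submitted with `client.call(...)`.

```python
import os
from typing import Iterable, List, Tuple


def split_existing(paths: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Partition `paths` into (regular files that exist, everything else)."""
    found: List[str] = []
    missing: List[str] = []
    for p in paths:
        (found if os.path.isfile(p) else missing).append(p)
    return found, missing


def add_files(client, dataset_id: int, paths: Iterable[str]):
    """Add `paths` to dataset `dataset_id`, refusing to start if any path is missing."""
    # Imported lazily so split_existing can be used without the client installed.
    from indico.queries.datasets import AddDatasetFiles

    found, missing = split_existing(paths)
    if missing:
        raise FileNotFoundError(f"not regular files: {missing}")
    # autoprocess=True asks for the new files to be processed automatically;
    # wait=True blocks while the file statuses are polled.
    return client.call(
        AddDatasetFiles(
            dataset_id=dataset_id,
            files=found,
            autoprocess=True,
            wait=True,
            batch_size=20,
        )
    )
```

Failing before the upload starts avoids leaving a dataset with only part of a batch added.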

class indico.queries.datasets.RemoveDatasetFile(dataset_id, file_id)

Remove a file from a dataset by ID. To retrieve a list of files in a dataset,
see GetDatasetFileStatus.

  • Parameters:
    • dataset_id (int) – Dataset ID
    • file_id (int) – Datafile ID (returned by GetDatasetFileStatus)
  • Returns:
    Dataset object