Open Spreadsheet

Note: Users must procure and maintain the applicable open source tools to integrate this DE tool with the Istari Digital platform. Please contact your local IT administrator for assistance.

Supported Functions:

extract

Getting Started

The Open Spreadsheet integration allows users to extract data from .xlsx files.

Methods to Link to Istari Digital Platform

Upload: Yes

Link: No

Files Supported

The Istari Digital Platform can extract from the following file types: .xlsx
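Because only .xlsx files are supported, it can help to check a file locally before uploading it. The following is a minimal sketch, not part of the Istari Digital SDK: the helper name `looks_like_xlsx` is hypothetical, and it assumes the conventional .xlsx layout in which the file is a ZIP archive containing `xl/workbook.xml`.

```python
import zipfile
from pathlib import Path


def looks_like_xlsx(path):
    """Return True if `path` has an .xlsx suffix and looks like an .xlsx archive.

    Hypothetical helper: assumes the conventional layout where the workbook
    part is stored at xl/workbook.xml inside the ZIP container.
    """
    p = Path(path)
    # Reject wrong extensions and paths that are not regular files.
    if p.suffix.lower() != ".xlsx" or not p.is_file():
        return False
    # An .xlsx file is a ZIP archive; anything else is rejected here.
    if not zipfile.is_zipfile(p):
        return False
    with zipfile.ZipFile(p) as zf:
        return "xl/workbook.xml" in zf.namelist()
```

A call such as `looks_like_xlsx("example.xlsx")` can be run before `client.add_model(...)` to catch renamed .xls or .csv files early.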

Example Files

Setup for Administrators

Ensure that LibreOffice is installed on a Virtual Machine (VM) with Istari Digital Agent and appropriate Istari Digital Software. Verify that the installation is up to date with the latest updates from LibreOffice.

Version Compatibility

This software was tested with LibreOffice, and is intended to run in a Windows or Linux environment.

Function Coverage and Outputs

The Open Spreadsheet integration can produce a number of artifacts extracted from the .xlsx workbook. The table below lists each output artifact and its type.

Route	Coverage	Artifact Content Example
Extract Sheets - CSV	Yes
Named Cells - JSON	Yes
Worksheet Data - JSON	Yes
Extract workbook - PDF	Yes
Extract workbook - xlsx	Yes
Extract zipped_html_workbook - ZIP	Yes
Extract html_workbook - HTML	Yes
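Once artifacts have been downloaded, the formats in the table above can serve as a checklist. The sketch below is illustrative only: the helper `missing_formats` is hypothetical, and the expected extensions are taken from the table rather than from any SDK constant.

```python
from pathlib import Path

# File extensions implied by the routes in the table above.
EXPECTED_EXTENSIONS = {"csv", "json", "pdf", "xlsx", "zip", "html"}


def missing_formats(file_names):
    """Return the expected extensions that do not appear among `file_names`."""
    found = {Path(name).suffix.lower().lstrip(".") for name in file_names}
    return sorted(EXPECTED_EXTENSIONS - found)
```

For example, passing the names of the downloaded files returns an empty list when every format in the table is present at least once.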

Detailed SDK Reference

Prerequisite: Install Istari Digital SDK and initialize Istari Digital Client per instructions here

Step 1: Upload and Extract the File(s)

Upload the file as a model

model = client.add_model(
    path="example.xlsx",
    description="Excel example Model",
    display_name="Excel Model Name",
)
print(f"Uploaded base model with ID {model.id}")

Extract once you have the model ID

extraction_job = client.add_job(
    model_id=model.id,
    function="@istari:extract",
    tool_name="open_spreadsheet",
    tool_version="1.0.0",
    operating_system="Ubuntu 22.04",
)
print(f"Extraction started for model ID {model.id}, job ID: {extraction_job.id}")

The call above is only an example. Choose the tool_name, tool_version, and operating_system values that match your installation of this software.

Step 2: Check the Job Status

extraction_job.poll_job()

Step 3: Retrieve Results

Example

for artifact in model.artifacts:
    output_file_path = f"c:\\extracts\\{artifact.name}"

    if artifact.extension in ["txt", "csv", "md", "json", "html"]:
        with open(output_file_path, "w") as f:
            f.write(artifact.read_text())
    else:
        with open(output_file_path, "wb") as f:
            f.write(artifact.read_bytes())
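The example above writes to a hard-coded Windows path that must already exist. A more portable variant might look like the sketch below. It relies only on the artifact attributes used in the example (`name`, `extension`, `read_text()`, `read_bytes()`); the function name `save_artifacts` and the UTF-8 encoding are assumptions, not part of the SDK.

```python
from pathlib import Path

# Extensions written as text, matching the list in the example above.
TEXT_EXTENSIONS = {"txt", "csv", "md", "json", "html"}


def save_artifacts(artifacts, out_dir):
    """Write each artifact into `out_dir` and return the paths written."""
    out = Path(out_dir)
    # Create the output directory (and any parents) if it does not exist.
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for artifact in artifacts:
        # Keep only the final path component of the artifact name.
        target = out / Path(artifact.name).name
        if artifact.extension in TEXT_EXTENSIONS:
            # Assumption: text artifacts are stored as UTF-8.
            target.write_text(artifact.read_text(), encoding="utf-8")
        else:
            target.write_bytes(artifact.read_bytes())
        written.append(target)
    return written
```

It would be called as `save_artifacts(model.artifacts, "extracts")`, which works on both Windows and Linux hosts.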

Troubleshooting

1. For general Agent and software troubleshooting, click here.
2. Missing artifacts:
- 2.1 named_cells.json: Check whether the source file contains any named cells. If it does not, refer to the spreadsheet software's manual for how to define them.
- 2.2 Embedded images: The tool does not extract embedded images.
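To check the source file for named cells without opening it in a spreadsheet application, the workbook XML can be inspected directly. This is a minimal sketch using only the Python standard library, not how the tool itself performs extraction. The helper name `list_defined_names` is hypothetical, and it assumes the workbook part is stored at `xl/workbook.xml`. Defined names can also include entries such as print areas, so a non-empty result does not guarantee that every entry is a named cell.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace used by the workbook part of an .xlsx file.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def list_defined_names(xlsx_path):
    """Return the defined names declared in the workbook, or [] if none."""
    with zipfile.ZipFile(xlsx_path) as zf:
        # Assumption: the workbook part lives at the conventional location.
        root = ET.fromstring(zf.read("xl/workbook.xml"))
    return [
        element.get("name")
        for element in root.findall("m:definedNames/m:definedName", NS)
    ]
```

An empty list from `list_defined_names("example.xlsx")` suggests the workbook declares no named cells.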
