
I/O Ports of a Service

The I/O Ports represent the input and output interfaces of a Service, enabling its interaction with other Data Analytics System Assets (Datasets, other Services, Models) within a Workflow.

Types of I/O Ports

  • 🟢 Green Circle: Data Analytics System Dataset
  • 🔵 Blue Circle: Machine Learning / Mathematical Models
  • 🟩 Green Rectangle: Generic I/O (intermediate data)
  • 🟪 Purple Square: Streaming Data Flow
  • (Invisible): REST API Endpoint

How to Define an I/O Port

As explained in the tutorial on creating a Service, to define an I/O Port for a Service it is necessary to:

  1. Create a set of command-line arguments at the core program level.
  2. Define Service Properties for each I/O Port and/or configuration parameter.

This section addresses the first point, namely which command-line arguments must be handled by the core program. For the second point, refer to Service Registration.


Input Ports

For Input Ports, the core program must handle a set of arguments depending on:

  • Port Type
  • Datasource Type

Note

During execution, the Data Analytics System will pass a series of populated command-line arguments to the Service.
Ensure that the core program ignores arguments that are not intended to be processed.

For example, in Python this can be achieved using parse_known_args() from the argparse module.
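As a minimal sketch of this approach, the snippet below declares only the argument it needs and lets argparse collect everything else in a separate list. The function name parse_args and the sample values are illustrative assumptions, not part of the Data Analytics System.

```python
import argparse


def parse_args(argv=None):
    """Parse only the arguments this core program cares about."""
    # allow_abbrev=False is a precaution: it stops argparse from treating
    # one option as an abbreviation of a longer option with the same prefix.
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument("--input-dataset", required=True)
    # parse_known_args() returns (namespace, leftovers) instead of exiting
    # with an error when it meets an argument it does not recognise.
    args, unknown = parser.parse_known_args(argv)
    return args, unknown


if __name__ == "__main__":
    args, unknown = parse_args()
    print("dataset folder:", args.input_dataset)
    print("ignored arguments:", unknown)
```

Arguments that the core program does not declare end up in the second return value and can simply be discarded.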

For each port type, we will now provide a definition and further details regarding the corresponding arguments for the core program, grouped into two categories:

  • Base Arguments: Determine the port type. For these, the user must define the respective Service Properties during Service registration.
  • Datasource-Specific Arguments: Automatically generated, populated, and passed by the Data Analytics System to the core program based on:
    1. Base arguments
    2. Datasource

Dataset Type

Symbol: [image: service-with-single-input-dataset]
Definition: Accesses a location within one of the Batch Datasources defined in the Data Analytics System. Batch Datasources include:
  • Object Store
  • Tabular
  • Filesystem
Examples: Example 1, Example 3, Example 4

Base Arguments

--input-dataset
Populated with the path to the folder containing the Dataset (without trailing '/').
When implementing the core program, consider that this folder may contain one or more files.

--input-columns (special | optional)
This parameter is useful when handling tabular datasets. When specified, the Data Analytics System will:

  1. Display, in the UI, a checkbox for each column of the table when the port is clicked.
    [Figure: tabular dataset columns selection panel]

  2. Pass to the core program, as a comma-separated list, the names of the columns selected in the UI as the value of the --input-columns argument.
    Example:

    --input-columns=sepal_length,sepal_width,petal_length,petal_width,variety
    

The user can then use this list to implement column reduction for the dataset within the core program.
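One way to implement this column reduction is sketched below. It assumes, purely for illustration, that the dataset folder holds CSV files with a header row; the helper names parse_columns and read_selected_columns are hypothetical. It also covers the case where the folder contains more than one file.

```python
import csv
from pathlib import Path


def parse_columns(value):
    """Turn 'a,b,c' into ['a', 'b', 'c']; empty or missing means no filter."""
    if not value:
        return []
    return [name.strip() for name in value.split(",") if name.strip()]


def read_selected_columns(dataset_dir, columns):
    """Yield rows (dicts) from every CSV file in dataset_dir.

    Assumption: files are CSV with a header row. When `columns` is empty,
    all columns are kept.
    """
    for csv_path in sorted(Path(dataset_dir).glob("*.csv")):
        with csv_path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                if columns:
                    yield {name: row[name] for name in columns}
                else:
                    yield dict(row)
```

The value of --input-columns would be passed through parse_columns, and the folder received in --input-dataset would be passed as dataset_dir.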

Datasource-Specific Arguments

For an Object Store datasource, the following command-line arguments will be provided:

Argument | Value | Notes
--input-dataset.storage_type | minio | Fixed
--input-dataset.use_ssl | False or True | Based on the default value set by the user for "Secure" during datasource definition
--input-dataset.direction | input | Fixed
--input-dataset.id | Dataset ID as in the Data Analytics System catalogue |
--input-dataset.minio_bucket | Bucket name |
--input-dataset.minIO_URL | Object store URL |
--input-dataset.minIO_ACCESS_KEY | Object store access key |
--input-dataset.minIO_SECRET_KEY | Object store secret key |
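The following sketch shows one possible way to collect these arguments into a single object with argparse. The ObjectStoreConfig class, the parse_object_store function, and the sample values in the usage note are assumptions made for illustration; only the argument names come from the table above.

```python
import argparse
from dataclasses import dataclass


@dataclass
class ObjectStoreConfig:
    bucket: str
    url: str
    access_key: str
    secret_key: str
    use_ssl: bool


def parse_object_store(argv=None, prefix="--input-dataset"):
    """Read the object store arguments that share the given prefix."""
    # allow_abbrev=False matters here: otherwise a bare "--input-dataset"
    # would be seen as an ambiguous abbreviation of the options below.
    parser = argparse.ArgumentParser(allow_abbrev=False)
    # Explicit dest names avoid attribute names that contain a dot.
    parser.add_argument(f"{prefix}.minio_bucket", dest="bucket")
    parser.add_argument(f"{prefix}.minIO_URL", dest="url")
    parser.add_argument(f"{prefix}.minIO_ACCESS_KEY", dest="access_key")
    parser.add_argument(f"{prefix}.minIO_SECRET_KEY", dest="secret_key")
    # The value arrives as the text "True" or "False", so convert it by
    # hand; bool("False") would evaluate to True.
    parser.add_argument(f"{prefix}.use_ssl", dest="use_ssl",
                        type=lambda text: text.strip().lower() == "true",
                        default=False)
    args, _ = parser.parse_known_args(argv)
    return ObjectStoreConfig(args.bucket, args.url, args.access_key,
                             args.secret_key, args.use_ssl)
```

Because the prefix is a parameter, the same function could be called with prefix="--input-model" for a Model port.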

Model Type

Symbol: [image: service-with-single-input-model]
Definition: Accesses the location of a datasource containing a model.
Examples: Example 3, Example 4, Example 5

Base Arguments

Argument | Value
--input-model | Path to the folder containing the model

Datasource-Specific Arguments

Same as for the Dataset type, except that --input-dataset becomes --input-model.

Example:

Argument | Value | Notes
--input-model.storage_type | minio | Fixed
--input-model.use_ssl | False or True | Based on default or user-provided value
--input-model.direction | input | Fixed
--input-model.id | Model ID as in the Data Analytics System catalogue |
--input-model.minio_bucket | Bucket name |
--input-model.minIO_URL | Object store URL |
--input-model.minIO_ACCESS_KEY | Object store access key |
--input-model.minIO_SECRET_KEY | Object store secret key |

Generic I/O Type (for intermediate data)

Symbol: [image: service-with-single-generic-input]
Definition: Accesses a temporary volume fully managed by the Data Analytics System and decoupled from catalogue datasources.
Examples: Example 8

Base Arguments

Except for those related to Workflow Media, no additional arguments are required.

However, it is necessary to define an appropriate Service Property (see Service Registration). At that point, the core program will find, within the Docker container filesystem, a folder to read from or write data to.


Datasource-Specific Arguments

See Base Arguments above.


Streaming Type

Symbol: [image: service-with-single-streaming-input]
Definition: Accesses a topic from a Message Broker datasource (e.g., Kafka).
Examples: Example 2, Example 5

Base Arguments

Argument | Value
--input-dataset | Topic name

Datasource-Specific Arguments

Argument | Value
--input-dataset.kafka_brokers | List of broker URLs (e.g., Kafka cluster)
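A core program could read the topic and the broker list along the lines of the sketch below. The exact format of the broker list is not specified here, so the comma separator is an assumption, as are the function name parse_streaming_input and the sample values; connecting to the broker is left out.

```python
import argparse


def parse_streaming_input(argv=None):
    """Return (topic, brokers) for a streaming input port."""
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument("--input-dataset", dest="topic", required=True)
    parser.add_argument("--input-dataset.kafka_brokers", dest="brokers",
                        default="")
    args, _ = parser.parse_known_args(argv)
    # Assumption: brokers arrive as one comma-separated string.
    brokers = [item.strip() for item in args.brokers.split(",") if item.strip()]
    return args.topic, brokers
```

The returned values can then be handed to whichever Kafka client library the core program uses.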

REST API Type

The REST API port allows exposing a REST endpoint from the Service, making it accessible via a dedicated URL in the browser.

There are no specific arguments for the core program. It is sufficient to define a Service Property as indicated in Service Registration - REST API Port.

Once you have registered the Service Property, clicking on the Service in the Workflow Designer provides the option to access the endpoint by selecting Visit Now. The page displaying the endpoint response opens in a new tab.

[Figure: Visit Now button for a Service exposing an endpoint]

The endpoint will be accessible only to authenticated users with the necessary permissions to access the Workflow containing the Service.


Output Ports

The same conventions described for input ports apply to output ports.

For dataset, model, and streaming ports, --input becomes --output. For example:

  • --output-dataset, --output-model
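As an illustration of the output side, the sketch below writes a file into the folder received through --output-dataset. It assumes the argument carries a writable folder path, mirroring --input-dataset; the function name write_result and the file name result.txt are hypothetical.

```python
import argparse
from pathlib import Path


def write_result(argv=None, text="done\n"):
    """Write `text` to a file inside the output dataset folder."""
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument("--output-dataset", required=True)
    args, _ = parser.parse_known_args(argv)
    out_dir = Path(args.output_dataset)
    # Create the folder if it is missing (harmless if it already exists).
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / "result.txt"  # hypothetical file name
    target.write_text(text)
    return target
```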

Multiple Ports

It is also possible to specify multiple input or output ports. For example, to define N input ports of dataset type, define arguments as:

--input-dataset-<X> where X ranges from 1 to N incrementally.

Example

To define three ports, the following arguments must be specified:

  • --input-dataset-1 --input-dataset-2 --input-dataset-3

If the dataset is tabular, also add the corresponding --input-columns:

  • --input-columns-1 --input-columns-2 --input-columns-3

Note

Each `--input-columns-<X>` corresponds to the specific dataset port with the same `X` index.
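The indexed arguments can be declared in a loop rather than one by one. The sketch below pairs each dataset path with its column list; the function name parse_indexed_inputs, the count parameter, and the sample paths are assumptions for illustration.

```python
import argparse


def parse_indexed_inputs(argv=None, count=3):
    """Return a list of (dataset_path, columns) pairs for ports 1..count."""
    # allow_abbrev=False keeps "--input-dataset-1" from being confused
    # with longer names such as "--input-dataset-10".
    parser = argparse.ArgumentParser(allow_abbrev=False)
    for index in range(1, count + 1):
        parser.add_argument(f"--input-dataset-{index}",
                            dest=f"dataset_{index}")
        parser.add_argument(f"--input-columns-{index}",
                            dest=f"columns_{index}", default="")
    args, _ = parser.parse_known_args(argv)

    ports = []
    for index in range(1, count + 1):
        path = getattr(args, f"dataset_{index}")
        raw_columns = getattr(args, f"columns_{index}")
        columns = [name for name in raw_columns.split(",") if name]
        ports.append((path, columns))
    return ports
```

Each entry in the returned list keeps the dataset path and the column selection that share the same index.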

Similarly, arguments can be specified for multiple ports of other types:

Port Type | Argument Names | Notes | Effect on Service
Dataset | --input-dataset-1 --input-dataset-2 | Also specify: --input-columns-1 --input-columns-2 | [image: service-with-two-input-datasets]
Model | --input-model-1 --input-model-2 | N/A | [image: service-with-two-input-models]
Generic I/O | INPUT GENERIC I/O | N/A | [image: service-with-two-generic-inputs]
Streaming | N/A | Multiple streaming input ports are not supported | N/A
REST API | Coming soon... | Coming soon... | Coming soon...

Mixed Ports

It is also possible to mix ports of different types with varying cardinality:

Ports | Effect on Service
--input-dataset-1 --input-dataset-2 --input-columns-1 --input-columns-2 --input-model | [image: service-with-multiple-input-ports]

User-Configurable Execution Parameters

To define auxiliary parameters for the Service, add the corresponding command-line argument to the core program.

Valid names are those that do not use reserved prefixes:

  • --input-dataset
  • --input-columns
  • --input-model

Example

Valid names include: --n_clusters, --threshold, --n_rounds, etc.
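A minimal sketch of how such parameters might be declared next to the port arguments is shown below. The types and default values are assumptions chosen for illustration.

```python
import argparse


def parse_parameters(argv=None):
    """Parse user-configurable parameters and ignore everything else."""
    parser = argparse.ArgumentParser(allow_abbrev=False)
    # Types and defaults below are illustrative assumptions.
    parser.add_argument("--n_clusters", type=int, default=3)
    parser.add_argument("--threshold", type=float, default=0.5)
    parser.add_argument("--n_rounds", type=int, default=10)
    args, _ = parser.parse_known_args(argv)
    return args
```

Declaring a type lets argparse convert the text received on the command line into a number before the core program uses it.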