Getting Started

In this tutorial, you will create a simple Workflow to get familiar with Data Analytics System.
Specifically, you will:

  • Upload a new Dataset
  • Create a Workflow that reads the Dataset and applies the K-means algorithm
  • Execute the Workflow
  • Visualize the obtained results

1. Login

First, log in using the credentials provided by your system administrator.

2. Dataset Upload

Proceed with uploading the Dataset.
In Data Analytics System, a Dataset refers to a specific location within a Datasource (e.g., "bucket/prefix" in the case of a MinIO Object Store).
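
To make the "bucket/prefix" idea concrete, here is a minimal Python sketch that splits such a location into its two parts. The helper name `split_dataset_location` and the sample values are illustrative assumptions; they are not part of Data Analytics System, which may represent Dataset locations differently.

```python
from typing import Tuple


def split_dataset_location(location: str) -> Tuple[str, str]:
    """Split a "bucket/prefix" string into (bucket, prefix).

    Hypothetical helper for illustration only.
    """
    # Everything before the first "/" is the bucket; the rest is the prefix.
    bucket, _, prefix = location.strip("/").partition("/")
    if not bucket:
        raise ValueError("location must start with a bucket name")
    return bucket, prefix


# Assumed example values, not taken from a real Datasource.
print(split_dataset_location("my-bucket/Quickstart Iris"))
```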

2.1 Download the Dataset

Download the Iris dataset in CSV format to your computer from the following URL:

2.2 Access the Dataset Registration Page

Follow the steps illustrated in the figures:

  1. Open the side menu via the burger icon.
    [Figure: dashboard with the burger icon highlighted]
  2. Open the Datasets panel.
    [Figure: sidebar with Datasets highlighted]
  3. Open the Dataset registration form.
    [Figure: Datasets page]

2.3 Fill in the Registration Form

Complete the form as shown below (see also figures):

  1. Name: Quickstart Iris
  2. Select the Datasource where you want to upload the dataset: <your username>
  3. Open the Dataset upload panel by clicking the Upload Dataset button.
    [Figure: Dataset registration form with the first steps highlighted]
  4. In the opened panel, enter the name of the new folder to be created: Quickstart Iris
    (the uploaded file will be saved inside this folder).
  5. Upload the Iris dataset downloaded in step 2.1 by clicking the “Drop files or folders here” area.
    [Figure: Dataset upload panel]
    [Figure: file successfully uploaded popup]
  6. Click the newly created directory.
    [Figure: selecting the newly uploaded directory]
  7. Select the dataset path by clicking the Select Dataset Path icon.
    Once selected, the schema of the Dataset is displayed at the bottom of the same page.
    [Figure: Dataset path selection and the resulting schema details]
  8. Click Save.
    The detail page for the newly created Dataset will appear.

[Figure: Dataset detail page after upload]
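
The schema displayed in step 7 is derived from the uploaded file. As a rough local equivalent, the sketch below reads the header and first data row of a CSV and guesses a type for each column. The column names and sample row are assumptions about a typical Iris CSV, and the function `infer_schema` is hypothetical; the schema detection performed by Data Analytics System may work differently.

```python
import csv
import io


def infer_schema(csv_text: str) -> dict:
    """Guess "float" or "string" per column from the header and first data row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    first_row = next(reader)
    schema = {}
    for name, value in zip(header, first_row):
        try:
            float(value)
            schema[name] = "float"
        except ValueError:
            schema[name] = "string"
    return schema


# Assumed sample: column names vary between Iris CSV distributions.
sample = (
    "sepal_length,sepal_width,petal_length,petal_width,species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
)
print(infer_schema(sample))
```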


3. Workflow Creation

3.1 Access the Workflow Designer

Access the Designer for Workflow creation as follows:

  1. Access the Workflow catalog.
    [Figure: sidebar menu with the Workflows item highlighted]
  2. Create a new Workflow.
    [Figure: create Workflow button highlighted]

3.2 Configure the Workflow Settings

Fill out the Workflow Settings form as shown:

  • Access Level: Private
  • Default Data Source: <your.username>

Then click Confirm.

[Figure: Workflow Settings panel]

You have now entered the Workflow Designer, which lets you combine Datasets, Services, and Models to build your Workflow.

[Figure: Workflow Designer, first view]


3.4 Assemble the Workflow

Drag and drop the following onto the Designer canvas:

  1. The KMeans Service
  2. The Iris Dataset you previously uploaded (under the Datasets tab)

[Figure: dragging a Service onto the canvas]

3.5 Connect Components

Connect the “Quickstart Iris” Dataset to the “KMeans” Service by dragging the dataset’s output port (🟢) and dropping it onto the service’s input port (🟢):

[Figure: connecting the Service ports]
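
Conceptually, a connection like this can be thought of as a directed edge from the Dataset to the Service. The sketch below models that idea in Python. It is a mental model only: the `Workflow` class and its methods are hypothetical and do not reflect how Data Analytics System stores Workflows internally.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """Conceptual model only: components plus directed connections."""
    components: set = field(default_factory=set)
    edges: list = field(default_factory=list)  # (source, destination) pairs

    def add(self, component: str) -> None:
        self.components.add(component)

    def connect(self, source: str, destination: str) -> None:
        # Mirrors the tutorial order: drop components first, then connect them.
        for component in (source, destination):
            if component not in self.components:
                raise KeyError(f"unknown component: {component}")
        self.edges.append((source, destination))


wf = Workflow()
wf.add("Quickstart Iris")   # the Dataset
wf.add("KMeans")            # the Service
wf.connect("Quickstart Iris", "KMeans")  # output port -> input port
print(wf.edges)
```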

3.6 Assign a Workflow Name

From the right-hand menu, assign a name to your Workflow: Quickstart Iris

[Figure: assigning a name to the Workflow in the Designer]

3.7 Select the Execution Target

Click on the KMeans Service on the canvas and select the default Target Worker Nodes.

[Figure: selecting the default Target in the Workflow]

The Target determines the set of cluster nodes eligible to host the execution of the Service.

3.8 Save the Workflow

Click on an empty spot on the canvas and then select Create Workflow.

[Figure: saving the Workflow]

Note

In this short tutorial, we do not cover the many configuration features available for Workflows and individual Services in Data Analytics System.
For these, please refer to the corresponding sections of this manual.


4. Workflow Execution

4.1 Run the Workflow

Once the Workflow has been created, click Run on its detail page to execute it.

[Figure: Run button on the Workflow detail page]

4.2 Execution Status

When execution is complete, the status bar will automatically change to Completed.

[Figure: Workflow status bar showing Completed]


5. Viewing the Results

The KMeans Service generates two types of outputs:

  1. A set of graphical plots
  2. An output dataset identical to the input dataset except for one new column, which contains the cluster assigned to each row

5.1 View the Graphical Results

Remain on the Workflow detail page and click Workflow Media:

[Figure: media button highlighted]

A panel will appear on the right side, showing the plots generated by the Workflow.

[Figure: sample KMeans plots]

5.2 View the Output Dataset

  1. Scroll down the current Workflow detail page.
  2. Open the Datasets section.
  3. Click the Preview button next to the output dataset (the entry marked with the output dataset tag icon).

[Figure: output dataset area on the Workflow detail page]

A dialog will appear showing a preview of the resulting dataset.
Note the new cluster column, which contains the cluster assigned to each row.

[Figure: output dataset preview dialog]
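
To illustrate what the cluster column represents, the following is a minimal K-means sketch in plain Python. It is not the implementation used by the KMeans Service: the sample rows, the `k=2` setting, and the choice of the first k points as initial centroids are all assumptions made to keep the example small and deterministic.

```python
def kmeans(points, k, iterations=10):
    """Return one cluster index per point (Lloyd's algorithm)."""
    centroids = [list(p) for p in points[:k]]  # naive, deterministic init
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Two measurements per row (made-up values in the style of the Iris data).
rows = [(1.4, 0.2), (1.3, 0.2), (4.7, 1.4), (4.5, 1.5)]
labels = kmeans(rows, k=2)

# Append each label as a new trailing "cluster" column.
with_cluster = [row + (label,) for row, label in zip(rows, labels)]
print(with_cluster)
```

The cluster indices themselves are arbitrary identifiers; what matters is which rows share the same value.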


6. Next Steps

Congratulations! You have just created and executed your first Workflow.

Continue exploring Data Analytics System’s concepts and features:

  • Visit The Platform section to explore the Data Analytics System graphical interface
  • Visit the Assets section to learn more about Data Analytics System’s core components
  • Visit the Service section and the tutorial on creating custom Services