Dataset Query Service
The Dataset Query Service is a software component that provides access to, and querying of, files held in remote storage. It is intended primarily for use by other application systems through REST APIs, supporting automated integration between software services.
The main objective of the service is to allow systems to:
- explore the files available within a remote storage space
- locate specific files through name-based searches
- retrieve information contained within the files
Applications and digital workflows can therefore use the stored data dynamically, without anyone having to open the files manually.
Data Management in Storage
The datasets that can be queried by the service are stored in S3-compatible storage, implemented using MinIO.
Files stored in the system may include:
- Excel spreadsheets
- CSV files
- other structured formats used for data management
The service connects to the remote storage and allows application systems to query the contents through a standardized API-based interface.
The storage thus becomes a queryable data source that can be integrated into application workflows.
REST API Features
The service exposes a set of REST APIs for accessing the data held in the storage system.
The main functions are:
- retrieving the list of available files in the storage
- searching for files by name or partial name
- searching for specific values within files
The APIs therefore allow datasets to be used as queryable resources by other services or software components.
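As an illustration of how a client might address these three functions, the sketch below builds one request URL per function. The base URL, the endpoint paths (`/files`, `/files/search`, `/files/{name}/search`) and the parameter names (`name`, `value`, `column`) are all hypothetical; the actual API contract of the service may differ.

```python
from typing import Optional
from urllib.parse import urlencode

# Hypothetical base URL; the real deployment address is not specified here.
BASE_URL = "http://localhost:8000"


def build_url(path: str, **params: Optional[str]) -> str:
    """Build a request URL, dropping parameters that were not provided."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}{path}" + (f"?{query}" if query else "")


# One URL per function listed above (paths and parameters are assumptions).
list_url = build_url("/files")
name_url = build_url("/files/search", name="sales")
value_url = build_url("/files/sales_2024.csv/search", value="ACME", column="customer")
```

Any HTTP client can then issue a GET request against these URLs; the optional `column` parameter is simply omitted when the search should cover the whole dataset.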
Service Architecture
The service is designed using a layered application structure that separates responsibilities between API request handling, application logic, and data access.
The logical architecture can be represented as follows:
Application client -> API service -> Service layer -> MinIO storage
In this model:
- the application client sends HTTP requests to the service
- the API service receives and manages the requests
- the service layer performs operations on the data
- the MinIO storage contains the queried files
This separation allows the system to remain modular and easily extensible.
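The layering can be sketched in a few lines. This is a minimal illustration, not the service's actual code: the class and function names (`ObjectStore`, `InMemoryStore`, `DatasetService`, `handle_list_files`) are hypothetical, and an in-memory dictionary stands in for the MinIO-backed data access layer.

```python
from typing import Dict, List, Protocol


class ObjectStore(Protocol):
    """Data access layer: the only part that knows about the storage."""

    def list_names(self) -> List[str]: ...
    def read(self, name: str) -> bytes: ...


class InMemoryStore:
    """Stand-in for a MinIO-backed store, used here for illustration only."""

    def __init__(self, objects: Dict[str, bytes]) -> None:
        self._objects = objects

    def list_names(self) -> List[str]:
        return sorted(self._objects)

    def read(self, name: str) -> bytes:
        return self._objects[name]


class DatasetService:
    """Service layer: application logic, independent of HTTP and storage."""

    def __init__(self, store: ObjectStore) -> None:
        self._store = store

    def list_files(self) -> List[str]:
        return self._store.list_names()


def handle_list_files(service: DatasetService) -> dict:
    """API layer: turns a service result into a response payload."""
    return {"status": 200, "files": service.list_files()}


store = InMemoryStore({"sales.csv": b"a,b\n1,2\n", "hr.xlsx": b""})
response = handle_list_files(DatasetService(store))
```

Because the service layer depends only on the `ObjectStore` interface, the storage backend can be swapped (for example, for a test double) without touching the API or the application logic.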
File Querying
The service allows information to be retrieved from files through several querying methods.
Listing Available Files
The service can return the list of files available in the storage, so that application systems know which datasets are available for further processing or querying.
Searching Files by Name
Files can be searched by:
- the full file name
- a partial name
The service returns all files that match the search criteria, enabling quick identification of the desired dataset.
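A name search of this kind can be sketched as a substring match over the file names. The function name `search_by_name` is hypothetical, and two behaviors are assumptions rather than documented facts: matching ignores case, and an empty query returns no files.

```python
from typing import Iterable, List


def search_by_name(names: Iterable[str], query: str) -> List[str]:
    """Return the names that contain the query, ignoring case.

    A full file name matches itself, so full-name and partial-name
    searches share the same code path.
    """
    needle = query.strip().lower()
    if not needle:
        return []  # assumption: an empty query matches nothing
    return [name for name in names if needle in name.lower()]


files = ["sales_2023.csv", "sales_2024.csv", "Inventory.xlsx"]
partial = search_by_name(files, "sales")
full = search_by_name(files, "inventory.xlsx")
```

Here `partial` contains both sales files, while `full` contains only `Inventory.xlsx`.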
Searching for Values Inside Files
A more advanced feature queries the actual content of a file.
Operational flow:
- the requesting system specifies the file to analyze
- the service retrieves the file from the remote storage
- the file content is loaded into a tabular data structure
- a search for the requested value is performed within the data
- the results are returned to the requesting system
The search can be performed:
- across the entire dataset
- on a specific column, if specified in the request
This allows stored datasets to be used as a queryable source for analysis, control, or monitoring applications.
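The two search modes can be illustrated on a small CSV payload. This is a sketch under stated assumptions: it uses the standard `csv` module and a list of dictionaries as the tabular structure (the service itself may use a different library), it matches cell values exactly, and the names `load_csv` and `search_value` are hypothetical.

```python
import csv
import io
from typing import Dict, List, Optional

Row = Dict[str, str]


def load_csv(raw: bytes) -> List[Row]:
    """Parse CSV bytes into a list of rows keyed by column name."""
    text = raw.decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))


def search_value(rows: List[Row], value: str, column: Optional[str] = None) -> List[Row]:
    """Return rows where a cell equals `value`, in one column or in any column."""
    if column is not None:
        # Column-restricted search; an unknown column yields no matches.
        return [row for row in rows if row.get(column) == value]
    # Whole-dataset search: any cell in the row may match.
    return [row for row in rows if value in row.values()]


raw = b"id,customer,city\n1,ACME,Rome\n2,Globex,Milan\n3,Initech,ACME\n"
rows = load_csv(raw)
anywhere = search_value(rows, "ACME")
in_column = search_value(rows, "ACME", column="customer")
```

Searching the whole dataset returns rows 1 and 3, because "ACME" also appears in the `city` column of row 3; restricting the search to `customer` returns row 1 only.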
Dataset Query Operational Flow
The process of searching for information within datasets can be summarized as follows:
- An application system sends a request to the service via REST API
- The service validates the request parameters
- The requested file is retrieved from MinIO storage
- The file content is processed and converted into a tabular structure
- The service performs the requested value search
- The results are returned to the requesting system
This mechanism allows datasets to be queried without downloading the files manually or analyzing them locally.
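The steps above, including parameter validation, can be sketched end to end as follows. Everything named here is an assumption made for illustration: the `query_dataset` function, the in-memory `STORAGE` dictionary standing in for the MinIO bucket, the CSV-only parsing, and the HTTP-like status codes (400 for invalid parameters, 404 for a missing file).

```python
import csv
import io
from typing import Dict, Optional

# Hypothetical in-memory stand-in for the MinIO bucket.
STORAGE: Dict[str, bytes] = {"orders.csv": b"id,status\n1,open\n2,closed\n"}


def query_dataset(file: Optional[str], value: Optional[str],
                  column: Optional[str] = None) -> dict:
    """Run the flow described above and return a response-like dictionary."""
    # Validate the request parameters.
    if not file or not value:
        return {"status": 400, "error": "file and value are required"}
    # Retrieve the requested file from storage.
    raw = STORAGE.get(file)
    if raw is None:
        return {"status": 404, "error": f"file not found: {file}"}
    # Convert the content into a tabular structure (a list of row dicts).
    rows = list(csv.DictReader(io.StringIO(raw.decode("utf-8"))))
    if column is not None and rows and column not in rows[0]:
        return {"status": 400, "error": f"unknown column: {column}"}
    # Search for the value, in one column or across the whole row.
    if column is not None:
        matches = [row for row in rows if row.get(column) == value]
    else:
        matches = [row for row in rows if value in row.values()]
    # Return the results to the requesting system.
    return {"status": 200, "matches": matches}


result = query_dataset("orders.csv", "closed")
```

Rejecting invalid requests before the file is fetched avoids unnecessary reads from storage.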
Integration with Application Systems
The service is primarily designed for machine-to-machine scenarios, where software applications interact through APIs.
Thanks to this approach:
- datasets can be automatically queried by other systems
- stored data becomes part of digital information flows
- the service can be easily integrated into microservices-based architectures
The Dataset Query Service therefore transforms a file storage system into a queryable data source, accessible in a structured way and integrable into application processes.