Almeida-2021m

From IEETA

Article

Title BIcenter: A collaborative Web ETL solution based on a reflective software approach
Author João Almeida, Leonardo Coelho, José Luís Oliveira
Journal SoftwareX
Volume 16
Number
Pages 100892
Month December
Year 2021
DOI 10.1016/j.softx.2021.100892
Group Biomedical Informatics and Technologies
Group (before 2015)
Indexed by ISI Yes

Abstract: The continuous growth of new sources of information has led to an unprecedented increase in the data collected. The dimensionality and heterogeneity of these data require efficient strategies for searching, accessing and integrating data from multiple repositories. The techniques underlying this goal are usually known as Extraction, Transformation and Loading (ETL) pipelines, which aim to organise dispersed data into a common structure. However, despite their popularity and widespread use, these pipelines present a few drawbacks in specific scenarios. In clinical research, for instance, it is quite common to engage multiple researchers, institutions and datasets so that the study findings can have higher impact. This implies cooperation between several entities to design the workflow, even when these entities do not have permission to work directly with the source data, due to privacy and regulatory issues. Furthermore, extending the pipeline to other data sources requires adding new concepts and rules over time, which implies continuous updating of the ETL scripts. This paper presents a collaborative web-based ETL application that allows users to design, share and execute ETL pipelines across multiple centres. The system is supported by a user-friendly interface in which non-technical users can build the ETL pipelines without the need to grasp the ETL details and, most importantly, without having direct access to the data.