A Proposal of Methodology for Designing Big Data Warehouses

Francesco Di Tria; Ezio Lefons; Filippo Tangorra

doi:10.20944/preprints201806.0219.v1

Submitted: 13 June 2018

Posted: 13 June 2018

Abstract

Big Data warehouses are a new class of databases that largely use unstructured and volatile data for analytical purposes. Examples of such data sources are those coming from the Web, such as social networks and blogs, or from sensor networks, where huge amounts of data may be available only for short intervals of time. To manage massive data sources, a strategy must be adopted to define multidimensional schemas in the presence of fast-changing situations or even undefined business requirements. In this paper, we propose a design methodology that adopts agile and automatic approaches in order to reduce the time necessary to integrate new data sources and to include new business requirements on the fly. The data are immediately available for analysis, since the underlying architecture is based on a virtual data warehouse that does not require an importing phase. Examples of application of the methodology are presented throughout the paper to show the validity of this approach compared to a traditional one.

Keywords: Big data technology; Business intelligence; Data integration; System virtualization.

Subject: Computer Science and Mathematics - Information Systems

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
