In era of ever-expanding data and knowledge, we lack a centralized system that maps all the faculties to their research works. This problem has not been addressed in the past and it becomes challenging for students to connect with the right faculty of their domain. Since we have so many colleges and faculties this lies in the category of big data problem. In this paper, we present a model which works on the distributed computing environment to tackle big data. The proposed model uses apache spark as an execution engine and hive as database. The results are visualized with the help of Tableau that is connected to Apache Hive to achieve distributed computing.
Keywords:
Subject: Computer Science and Mathematics - Information Systems
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.