Choice of Cluster Computing System Hadoop and Apache Spark for Network Systems

Vasiliy Elagin; Vladislav Karpov; Aleksandr Kravchenko; Aleksandr Goldstein; Andrei Vladyko

doi:10.20944/preprints201904.0281.v1

Submitted: 24 April 2019

Posted: 25 April 2019

Preprint

Article

This version is not peer-reviewed.

Abstract

This article describes in detail two cluster computing technologies, Hadoop and Apache Spark. As an experimental task, logistic regression is run on both systems. The results of comparing the performance of Hadoop and Apache Spark are presented and substantiated.

Keywords: Cluster computing; Big Data; Spark; Hadoop

Subject: Computer Science and Mathematics - Information Systems

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.