Discovering the Arrow of Time in Machine Learning

J. Kasmire; Anran Zhao

doi:10.20944/preprints202108.0516.v1

Submitted: 26 August 2021

Posted: 27 August 2021


Abstract

Machine learning (ML) is increasingly useful as data grow in volume and accessibility, because it can perform tasks (e.g. categorisation, decision making, anomaly detection) through experience and without explicit instruction, even when the data are too vast, complex, highly variable, or full of errors to be analysed in other ways. Thus, ML is well suited to natural language, images, and other complex and messy data available in large and growing volumes. Selecting an ML algorithm depends on many factors, as algorithms vary in the supervision needed, tolerable error levels, and ability to account for order or temporal context, among many other things. Importantly, ML methods for explicitly ordered or time-dependent data struggle with errors or data asymmetry. Most data are at least implicitly ordered, potentially allowing a hidden 'arrow of time' to affect the performance of non-temporal ML methods. This research explores the interaction of ML and implicit order by training two ML algorithms on Twitter data before performing automatic classification tasks under conditions that balance volume and complexity of data. Results show that performance was affected, suggesting that researchers should carefully consider time when selecting appropriate ML algorithms, even when time is only implicitly included.

Keywords: machine learning; time; naive bayes classification; recurrent neural networks; Twitter; social media data; automatic classification

Subject: Computer Science and Mathematics - Artificial Intelligence and Machine Learning

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
