Preprint
Article

Automatic electronic invoice classification using machine learning models

A peer-reviewed article of this preprint also exists.

Submitted:

01 October 2020

Posted:

05 October 2020

Abstract
Electronic invoicing has been mandatory for Italian companies since January 2019. Invoices are structured according to a predefined XML template, from which the reported information can be easily extracted and analyzed. The main aim of this paper is to exploit the information structured in electronic invoices to build an intelligent system that can facilitate accountants' work. More precisely, this contribution shows how part of the accounting process can be automated: all invoices sent or received by a company are classified into specific codes that represent the economic nature of the financial transactions. To classify the data contained in the invoices, a multiclass machine learning classification problem is formulated, using the invoice information as input variables to predict two different target variables, the account code and the VAT code, which together compose a general ledger entry. Different approaches are compared in terms of prediction accuracy. The best performance is achieved when the hierarchical structure of the account codes is taken into account.
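The hierarchical idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the models or features used in the paper, so everything below is an assumption made for illustration: a tiny naive Bayes text classifier stands in for the actual models, the first two characters of an account code are assumed to identify its parent group, and the invoice descriptions and account codes are invented toy data. The sketch first predicts the parent group, then predicts the full account code among the codes belonging to that group.

```python
from collections import Counter, defaultdict
import math


class TinyNaiveBayes:
    """Multinomial naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, texts, labels):
        self.token_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        # Sorted iteration keeps tie-breaking deterministic.
        for label in sorted(self.label_counts):
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.label_counts[label] / total)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def parent_of(code):
    # Assumption: the first two characters of a code name its parent group.
    return code[:2]


class HierarchicalClassifier:
    """Predict the parent group first, then the code within that group."""

    def fit(self, texts, codes):
        parents = [parent_of(c) for c in codes]
        self.top = TinyNaiveBayes().fit(texts, parents)
        self.children = {}
        for parent in set(parents):
            idx = [i for i, p in enumerate(parents) if p == parent]
            self.children[parent] = TinyNaiveBayes().fit(
                [texts[i] for i in idx], [codes[i] for i in idx]
            )
        return self

    def predict(self, text):
        parent = self.top.predict(text)
        return self.children[parent].predict(text)


# Invented toy data: invoice line descriptions and hypothetical account codes.
texts = [
    "laptop purchase",
    "office chairs purchase",
    "electricity bill",
    "internet subscription bill",
]
codes = ["1001", "1002", "2001", "2002"]

model = HierarchicalClassifier().fit(texts, codes)
print(model.predict("electricity bill march"))
```

Splitting the prediction into two levels means each lower-level classifier only has to separate the codes inside one group, which is one plausible reason a hierarchical approach could help when there are many account codes. A second target such as the VAT code could be handled by fitting a separate classifier on the same inputs.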
Keywords: 
Subject: Business, Economics and Management  -   Accounting and Taxation
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
