Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

BBPE-AE: A Byte Pair Encoding-Based Auto Encoder for Password Guessing

Version 1: Received: 9 September 2024 / Approved: 10 September 2024 / Online: 11 September 2024 (10:49:15 CEST)

How to cite: Ghafari, S.; Safari, L.; Afsharchi, M. BBPE-AE: A Byte Pair Encoding-Based Auto Encoder for Password Guessing. Preprints 2024, 2024090834. https://doi.org/10.20944/preprints202409.0834.v1

Abstract

Password guessing techniques play an important role in both offensive and defensive security strategies. Passwords are a crucial line of defense for individuals and corporations, safeguarding sensitive systems and data, so assessing the strength of these credentials is a critical task. However, existing methods often struggle with limited training data and lengthy model training times. This paper introduces BBPE-AE, a novel autoencoder (AE) network designed for password guessing. BBPE-AE uses Byte-level Byte Pair Encoding (BBPE) to extract frequent tokens from password datasets without length restrictions, employing a dynamic window technique to capture complex patterns. Experimental results on the Hotmail and Myspace datasets show high similarity rates (BLEU-Unigram: 0.90, BLEU-Bigram: 0.82 for Hotmail; BLEU-Unigram: 0.90, BLEU-Bigram: 0.81 for Myspace). BBPE-AE generates realistic passwords that meet HIBP (Have I Been Pwned) standards while minimizing duplication. These findings highlight the effectiveness of BBPE-AE in generating realistic passwords, which can help safeguard sensitive systems and data.

Keywords

Password Guessing; Cybersecurity; Password Strength Assessment; Autoencoder Network; Long Short-Term Memory (LSTM) Networks; Byte-level Byte Pair Encoding

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
