Article
Version 1
Preserved in Portico. This version is not peer-reviewed.
Applying Swin Architecture to Diverse Sign Language Datasets
Received: 25 February 2024 / Approved: 26 February 2024 / Online: 27 February 2024 (13:27:05 CET)
A peer-reviewed article of this Preprint also exists.
Kumar, Y.; Huang, K.; Lin, C.-C.; Watson, A.; Li, J.J.; Morreale, P.; Delgado, J. Applying Swin Architecture to Diverse Sign Language Datasets. Electronics 2024, 13, 1509.
Abstract
In the era of Artificial Intelligence (AI), comprehending and responding to non-verbal communication is increasingly vital. This research extends AI's reach in bridging communication gaps, notably benefiting the American Sign Language (ASL) and Taiwan Sign Language (TSL) communities. It focuses on employing various AI models, especially the Hierarchical Vision Transformer with Shifted Windows (Swin), to recognize diverse sign language datasets. The study assesses the Swin architecture's adaptability to different sign languages, aiming to create a universal platform for Unvoiced communities. Using deep learning and transformer technologies, hybrid application prototypes have been developed for ASL-to-English translation and vice versa, with plans to extend them to multiple sign languages. The Swin models, trained on datasets of varying sizes, achieve considerable accuracy, indicating their flexibility and effectiveness. This research underscores major advances in sign language recognition and a commitment to inclusive communication in the digital era. Future work will focus on enhancing these models, broadening their scope to more sign languages, and integrating multimodality and Large Language Models (LLMs) to foster global inclusivity.
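The "Shifted Windows" in Swin's name refers to its core mechanism: self-attention is computed within non-overlapping local windows, and alternate layers cyclically shift the feature map by half a window so that neighboring windows exchange information. The following is a minimal NumPy sketch of that partitioning step only (not the authors' implementation); the function names, the toy 8×8 feature map, and the window size of 4 are illustrative assumptions.

```python
import numpy as np

def window_partition(x, window_size):
    """Split a feature map of shape (H, W, C) into non-overlapping
    square windows; returns (num_windows, window_size, window_size, C)."""
    H, W, C = x.shape
    x = x.reshape(H // window_size, window_size, W // window_size, window_size, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window_size, window_size, C)

def shifted_window_partition(x, window_size):
    """Cyclically shift the map by half a window before partitioning,
    mimicking Swin's shifted-window step so adjacent windows mix."""
    shift = window_size // 2
    shifted = np.roll(x, shift=(-shift, -shift), axis=(0, 1))
    return window_partition(shifted, window_size)

# Toy 8x8 feature map with 3 channels, window size 4.
feat = np.arange(8 * 8 * 3, dtype=np.float32).reshape(8, 8, 3)
regular = window_partition(feat, 4)          # 4 windows of shape (4, 4, 3)
shifted = shifted_window_partition(feat, 4)  # same count, shifted contents
print(regular.shape)  # (4, 4, 4, 3)
print(shifted.shape)  # (4, 4, 4, 3)
```

In the full architecture, windowed self-attention is applied within each of these windows, and the shift is undone (rolled back) afterward; hierarchical stages then merge patches to reduce resolution, which is what lets Swin scale to dense recognition tasks like sign language video frames.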
Keywords
Swin Transformer; ASL detection; Deep Learning; The Unvoiced
Subject
Computer Science and Mathematics, Computer Vision and Graphics
Copyright: This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.