Version 1: Received: 8 October 2023 / Approved: 9 October 2023 / Online: 9 October 2023 (11:04:10 CEST)
Version 2: Received: 1 November 2023 / Approved: 2 November 2023 / Online: 2 November 2023 (10:59:40 CET)
How to cite:
Hyun, S.; Son, Y.; Park, J. W. Korean Audio-Visual Dataset of Characters in 3D Animation: Construction and Validation. Preprints 2023, 2023100514. https://doi.org/10.20944/preprints202310.0514.v1
APA Style
Hyun, S., Son, Y., & Park, J. W. (2023). Korean Audio-Visual Dataset of Characters in 3D Animation: Construction and Validation. Preprints. https://doi.org/10.20944/preprints202310.0514.v1
Chicago/Turabian Style
Hyun, S., Yeongmin Son, and Jae Wan Park. 2023. "Korean Audio-Visual Dataset of Characters in 3D Animation: Construction and Validation." Preprints. https://doi.org/10.20944/preprints202310.0514.v1
Abstract
Characters are among the most important elements of digital animation. A character's appearance and voice should be designed to express its personality and values. However, it is not easy for animation producers to match a character's appearance and voice harmoniously. Advances in deep learning technology have made it possible to overcome this limitation, but doing so first requires an audio-visual dataset of characters. In this study, we construct and validate a Korean audio-visual dataset consisting of frontal face images of various characters and short voice clips. We developed an application that automatically extracts a character's frontal face image and a short voice clip from videos uploaded to YouTube. Using this application, we built a dataset of 1,522 face images and 7,999 seconds of voice clips covering 490 characters. Furthermore, we automatically label characters by gender and age to validate the dataset. The dataset built in this study is expected to be used in various deep learning fields, such as classification, generative adversarial networks, and speech synthesis.
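The per-character organization described in the abstract (frontal face images plus timed voice clips, with automatically assigned gender and age labels) could be represented by a record structure like the following minimal Python sketch. The names `CharacterSample` and `dataset_summary` are illustrative assumptions, not part of the authors' released tooling:

```python
from dataclasses import dataclass, field

@dataclass
class CharacterSample:
    """One animation character's entries in the audio-visual dataset."""
    character_id: str
    face_images: list = field(default_factory=list)   # paths to frontal face images
    voice_clips: list = field(default_factory=list)   # (path, duration in seconds) pairs
    gender: str = "unknown"                           # auto-labeled, e.g. "male" / "female"
    age_group: str = "unknown"                        # auto-labeled, e.g. "child" / "adult"

    def total_voice_seconds(self) -> float:
        """Sum the durations of this character's voice clips."""
        return sum(duration for _, duration in self.voice_clips)

def dataset_summary(samples: list) -> dict:
    """Aggregate counts of the kind reported in the abstract:
    number of characters, face images, and total voice-clip seconds."""
    return {
        "characters": len(samples),
        "face_images": sum(len(s.face_images) for s in samples),
        "voice_seconds": sum(s.total_voice_seconds() for s in samples),
    }
```

With records like these, the reported totals (490 characters, 1,522 face images, 7,999 seconds of audio) would simply be the output of `dataset_summary` over the full sample list.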
Keywords
anime character; 3D animation; audio-visual dataset
Subject
Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.