Version 1
Received: 19 October 2024 / Approved: 20 October 2024 / Online: 21 October 2024 (11:54:40 CEST)
How to cite:
Wan, Y.; Xiao, L.; Wu, X.; Yang, J.; He, L. Imaginique Expressions: Tailoring Personalized Short-Text to Image Generation Through Aesthetic Assessment and Human Insights. Preprints 2024, 2024101550. https://doi.org/10.20944/preprints202410.1550.v1
APA Style
Wan, Y., Xiao, L., Wu, X., Yang, J., & He, L. (2024). Imaginique Expressions: Tailoring Personalized Short-Text to Image Generation Through Aesthetic Assessment and Human Insights. Preprints. https://doi.org/10.20944/preprints202410.1550.v1
Chicago/Turabian Style
Wan, Y., L. Xiao, X. Wu, Jing Yang, and Liang He. 2024. "Imaginique Expressions: Tailoring Personalized Short-Text to Image Generation Through Aesthetic Assessment and Human Insights." Preprints. https://doi.org/10.20944/preprints202410.1550.v1
Abstract
Text-to-image generation has made significant progress, producing vivid images from detailed text descriptions. However, existing studies overlook scenarios in which the provided text is sparse, neglect human preferences, and fail to account for the diversity of aesthetic opinions. To address these issues, we propose a method for personalized short-text-to-image generation guided by aesthetic assessment and human insights. We develop a Personality Encoder (PE) to extract personal information and build a Big-Five personality-trait-based Image Aesthetic Assessment model (BFIAA) that predicts the aesthetic preferences of specific individuals. Using the BFIAA model, we fine-tune the Stable Diffusion model to align it more closely with human preferences. Our experiments show that the BFIAA model reflects human aesthetic preferences and that the adapted generation model produces personalized images that humans prefer.
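The idea of scoring images against an individual's personality profile can be illustrated with a small toy sketch. This is not the paper's PE or BFIAA architecture: the bilinear scoring rule, the function names `personalized_score` and `rank_candidates`, the two-dimensional image features, and all numeric values are assumptions chosen only to show how one set of candidate images might be ranked differently for two users with different Big-Five profiles.

```python
from typing import Dict, List, Sequence

# Order of the Big-Five traits in every trait vector (an assumed convention).
BIG_FIVE = ("openness", "conscientiousness", "extraversion",
            "agreeableness", "neuroticism")


def personalized_score(traits: Sequence[float],
                       image_features: Sequence[float],
                       weights: Sequence[Sequence[float]]) -> float:
    """Toy bilinear score: sum_i sum_j traits[i] * weights[i][j] * features[j]."""
    if len(traits) != len(BIG_FIVE):
        raise ValueError("expected one value per Big-Five trait")
    if len(weights) != len(traits):
        raise ValueError("weights needs one row per trait")
    if any(len(row) != len(image_features) for row in weights):
        raise ValueError("each weight row must match the feature length")
    return sum(
        t * sum(w * f for w, f in zip(row, image_features))
        for t, row in zip(traits, weights)
    )


def rank_candidates(traits: Sequence[float],
                    candidates: Dict[str, Sequence[float]],
                    weights: Sequence[Sequence[float]]) -> List[str]:
    """Return candidate names sorted from highest to lowest personalized score."""
    return sorted(
        candidates,
        key=lambda name: personalized_score(traits, candidates[name], weights),
        reverse=True,
    )


# Hypothetical image features: [colorfulness, symmetry].
candidates = {"vivid": [0.8, 0.2], "orderly": [0.2, 0.8]}

# Hypothetical weights: openness favors colorfulness,
# conscientiousness favors symmetry, other traits are ignored.
weights = [
    [1.0, 0.0],  # openness
    [0.0, 1.0],  # conscientiousness
    [0.0, 0.0],  # extraversion
    [0.0, 0.0],  # agreeableness
    [0.0, 0.0],  # neuroticism
]

user_a = [0.9, 0.1, 0.5, 0.5, 0.5]  # high openness
user_b = [0.1, 0.9, 0.5, 0.5, 0.5]  # high conscientiousness

ranking_a = rank_candidates(user_a, candidates, weights)
ranking_b = rank_candidates(user_b, candidates, weights)
print(ranking_a, ranking_b)
```

In this toy setting the two users receive opposite rankings of the same candidates. In a pipeline of the kind the abstract describes, a learned personalized score could plausibly serve as the preference signal when fine-tuning the generator, although the actual training procedure is not specified here.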
Keywords
text-to-image; personalized image aesthetic assessment; human feedback
Subject
Computer Science and Mathematics, Computer Vision and Graphics
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.