Preprint
Article

Integrating Multimodal Generative AI and Blockchain for Enhancing Generative Design in the Early Phase of Architectural Design Process

A peer-reviewed article of this preprint also exists.

This version is not peer-reviewed

Submitted:

17 July 2024

Posted:

17 July 2024

Abstract
Advances in AI are bringing generative design tools into architecture, providing architects with sophisticated design options. These tools enable the creation of intricate, high-performing projects by exploring diverse design possibilities with AI and algorithms. Together, generative AI and generative design empower architects to create better-performing, sustainable, and efficient design solutions. This paper leverages multimodal generative AI to enhance design creativity by combining textual and visual inputs. Blockchain technology converts design metadata into NFTs, ensuring secure, authentic, and traceable data storage. The framework addresses data ownership, legal adherence, and client-architect collaboration and is scalable for digital design authentication. This research exemplifies the pragmatic fusion of generative AI and blockchain technology applied in architectural design for more transparent, secure, and effective results. This study provides a strategy that uses generative AI technologies to achieve an efficient and creative workflow in the early stages of architectural design.
Keywords: 
Subject: Arts and Humanities  -   Architecture

1. Introduction

In recent years, artificial intelligence (AI) and machine learning (ML) have played a pivotal role in catalyzing creativity by supporting data analysis, design exploration, augmented creativity, and performance analysis [1,2,3]. By processing extensive data, AI algorithms can unveil valuable insights, establish correlations, and generate design suggestions. Additionally, Ben Dreith [4] specifically highlighted the potential of AI to transform the creation and conceptual stages of architectural and product design.
Generative AI applications like Midjourney, DALL-E, and Stable Diffusion, created by diverse technology firms, utilize text-to-image and image-to-image inputs to produce AI-generated images, prompting discussions about their forthcoming impact on design and architecture. Architects can use these AI applications to explore various design possibilities, enhance their creative abilities, and receive immediate feedback for iterative improvements. However, some argue that AI should complement and strengthen architects' skills and intuition, not replace them. Human interpretation and critical thinking remain essential in the creative process.
Traditional architectural design processes often involve iterative conceptualization, refinement, and implementation cycles, requiring significant time and resources. In comparison, generative AI has shown promise in automating aspects of design generation. The image generation process starts with collecting a varied dataset from online repositories. Consequently, challenges persist in ensuring the security, transparency, and traceability of design data and transactions throughout the architectural lifecycle. Moreover, the process of human-AI generative design introduces legal risks, particularly concerning intellectual property infringement and various security concerns, including issues related to data privacy and copyright [5]. Despite these hurdles, leveraging generative AI for architectural design presents innovative possibilities. It focuses on job augmentation and collaborative synergy between human designers and AI systems. However, achieving this synergy demands careful consideration of copyright interests and ethical implications, requiring ongoing research and dialogue [6].
Blockchain technology offers an alternative solution for dealing with challenges in creating images through generative AI. Blockchain provides a decentralized and immutable ledger system that securely stores metadata related to image datasets, training parameters, and model outputs. By immutably recording transactional data, blockchain ensures that the integrity and origin of training data are protected, preventing unauthorized modifications or data tampering. This characteristic enhances the credibility of the dataset, which is crucial for ensuring the reliability and reproducibility of the generative AI model's training process.
In this research paper, we investigate and propose a blockchain-integrated framework that enhances the authenticity and traceability of images produced with generative AI, so that the resulting image generation process can be trusted. Such trust could help turn AI-powered creativity into innovation. The main contribution of this framework is its use of multimodal generative AI, which combines text (prompts) and designs (images) to improve design creativity and refinement. Blockchain technology makes the data safer by converting metadata into Non-Fungible Tokens (NFTs), ensuring authenticity and traceability and guarding against unauthorized usage. This framework addresses data ownership and legal compliance challenges, improves client-architect collaboration, and is scalable for various projects, offering a comprehensive solution for creating authentic, traceable, and legally compliant digital designs.

2. Materials and Methods

This section delves into the fusion of generative AI and blockchain technology within the architectural design realm. We adopt a hypothetical scenario approach, presented as a framework outlining the generative design process flow. Our methodology encompasses two key techniques: multimodal generative AI and data storage on the blockchain system, illustrated comprehensively in Figure 1. This scenario-based approach reflects the rationale behind the workflow of this integration in real-life applications. Using this approach, we can investigate the possibilities, prerequisites, and constraints of the integration.
As part of the multimodal workflow shown in Table 1, we initiate the design process by outlining the building's intention through an initial sketch input into a generative AI application such as Midjourney. We then use the application's features to add or remove building elements within the generated images iteratively. In the architectural design context, several terms are essential for specifying how the building should look, from the initial concept through to the practical stages that give it shape. The first is the building typology, which describes the kind of project and its design themes, such as sustainability or cultural interpretation. Next, we provide design context: a specific architectural style (modern, post-modern, Renaissance, or another desired style) and contextual information such as site or location data and details of the surrounding area. The rendering style is then defined, and focus points, such as interior and lighting features, are described in more depth. The generative AI system is also given design parameters, such as aspect ratio, negative prompts, and level of detail in the case of 2D images, to guide it towards the desired output. We assemble a prompt accordingly, send it to the AI model, and receive an image representing our architectural concept. In the following design iteration, the generated image is analyzed and traced back to the intent of the final design concept.
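The prompt assembly described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the `DesignPrompt` class, its field names, and all example values are hypothetical, and the `--ar` and `--no` flags are assumed to follow Midjourney's parameter syntax for aspect ratio and negative prompts.

```python
from dataclasses import dataclass, field


@dataclass
class DesignPrompt:
    """Hypothetical container for the prompt ingredients described in the text."""
    typology: str
    style: str
    context: str
    rendering_style: str
    focus_points: list = field(default_factory=list)
    aspect_ratio: str = "16:9"
    negative_prompts: list = field(default_factory=list)

    def build(self) -> str:
        # Descriptive part: comma-separated phrases, skipping empty ones.
        parts = [self.typology, self.style, self.context, self.rendering_style]
        parts.extend(self.focus_points)
        text = ", ".join(p.strip() for p in parts if p.strip())
        # Parameter part: flags appended after the description
        # (flag spellings are an assumption, see the note above).
        flags = [f"--ar {self.aspect_ratio}"]
        if self.negative_prompts:
            flags.append("--no " + ", ".join(self.negative_prompts))
        return f"{text} {' '.join(flags)}"


# Example values are invented for illustration only.
prompt = DesignPrompt(
    typology="community library with a sustainable timber structure",
    style="modern architecture",
    context="urban corner site surrounded by low-rise housing",
    rendering_style="photorealistic exterior rendering",
    focus_points=["daylit reading hall", "warm evening lighting"],
    aspect_ratio="16:9",
    negative_prompts=["people", "cars"],
).build()
print(prompt)
```

Keeping the ingredients in separate fields makes each design iteration easy to record: changing one field and rebuilding the prompt leaves a clear trace of what was altered between generations.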
The elements integrated in generative AI applications, such as prompts, blending, upscales, variants, and remixes, are integral to the design process. These features give designers a wide range of design options to choose from. Each generation produces four variants of the design, allowing further exploration and refinement. Alternative design options can be created by remixing the image and including additional parameters. The resulting images play a role in the architectural design process, whether they become the final result or not.
Generative AI models gain the ability to create art by discerning statistical patterns within pre-existing artistic media. The generative AI process involves training algorithms on extensive datasets comprising various art forms like paintings and photographs. These datasets are a foundation for algorithms to learn underlying patterns and stylistic elements within the artistic media. During training, the algorithm iteratively processes the data, refining its understanding of patterns and gradually enhancing its ability to generate new art.
When generative AI models generate outputs based on training data, the ownership of that data can impact the generated content's legal, ethical, and regulatory implications. The training data may contain copyrighted material, and data ownership determines liability, attribution, and potential legal consequences. Moreover, data ownership plays a significant role in fostering innovation and promoting fair competition. For instance, when individuals or organizations invest time, effort, and resources into curating and creating high-quality training datasets, they should be able to derive value from their investments. It allows individuals and organizations to protect their data, influence usage, and assert their rights over the generated outputs. It also includes control and rights over the design images produced through generative AI applications, encompassing usage, modification, distribution, and monetization. It is essential to clarify ownership boundaries and evaluate the implications of utilizing specific design features to safeguard ownership claims in architectural design. By examining the role of these features in digital ownership, designers can navigate the complex landscape of control, rights, and responsibilities. This exploration can lead to frameworks and guidelines addressing the legal and ethical aspects of digital ownership and protection in architectural design.
This section will demonstrate the prompt data that could be stored within the blockchain system. Here, we utilize NFTs to store metadata, establishing data ownership for AI-generated images in the architecture design process. The application processing is available in our dataset [7].
The process begins with generating architectural images using generative AI and curating selected images. Subsequently, metadata for each image is created and stored as a .json file. The linkage between metadata and images is established, and the data is stored in cloud storage. To facilitate this, we develop a Java uploader application to store the images and their associated metadata in Firebase Storage by Google. Finally, the metadata is transformed into NFT metadata, and the entire process is deployed and executed in Remix using the ERC721 standard presented in Figure 2 and Figure 3.
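The metadata step above can be illustrated with a short Python sketch that builds one record and serializes it to JSON. The layout is an assumption rather than the authors' actual schema: `name`, `description`, and `image` follow the ERC-721 metadata JSON convention, the `attributes` list is a widely used marketplace convention, and the attribute names mirror the fields the paper reports (User ID, Job ID, Seed, Timestamp). The function name and every value are placeholders.

```python
import json


def build_nft_metadata(image_url, prompt, user_id, job_id, seed, timestamp):
    """Assemble a metadata record for one generated image (assumed layout)."""
    return {
        "name": f"Generated design {job_id}",
        "description": prompt,
        "image": image_url,
        "attributes": [
            {"trait_type": "User ID", "value": user_id},
            {"trait_type": "Job ID", "value": job_id},
            {"trait_type": "Seed", "value": seed},
            {"trait_type": "Timestamp", "value": timestamp},
        ],
    }


# Placeholder values, not taken from the paper's dataset.
metadata = build_nft_metadata(
    image_url="https://example.com/images/1.png",
    prompt="modern community library, photorealistic rendering",
    user_id="user-001",
    job_id="job-001",
    seed=12345,
    timestamp="2024-01-01T00:00:00Z",
)
serialized = json.dumps(metadata, indent=2)
print(serialized)
```

The serialized string is what would be saved as the `.json` file and uploaded alongside the image; the upload and minting steps themselves are outside the scope of this sketch.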

3. Results

This section presents the outcomes following the generative design procedure, stored as NFTs. These results encapsulate the prompt details as metadata linked to the resulting images. Table 2 displays pertinent information, including User ID, Job ID, Seed, and Timestamp. All data processing and implementation conducted as part of the pilot project are included in our research dataset [7].
The data in Table 3 indicate a successful transaction that deployed the smart contract, listing the unique identifiers of transactions and interactions within the Ethereum blockchain. Each Ethereum address represents an account or a contract on the Ethereum network. In this case, the contract address for this project is 0xb27A31f1b0AF2946B7F582768f03239b1eC07c2c. The transaction was successfully mined and executed (status 0x1) without generating additional output or logs. The contract deployment completed without errors, and the newly deployed contract is now available at the specified address on the Ethereum blockchain.
Following contract deployment, a transaction executed the creation of a token within the deployed smart contract. This action minted an NFT incorporating metadata (data from the generated image), with ID 1 as the identifier for this token. The transaction incurred a gas cost, representing the computational expense of executing the transaction on the Ethereum blockchain, with an execution cost of 183830 gas. Two logs were generated as a result of this transaction: the first indicating a transfer event, transferring ownership of the newly minted token to address 0x5B38Da6a701c568545dCfcB03FcB875f56beddC4, and the second representing a metadata update event, signifying the update of metadata associated with the token.
Subsequently, a call was made to the token URI function within the smart contract to retrieve the URI associated with token ID 1. The decoded output of this call provided the token URI, a URL pointing to the metadata stored on a Firebase storage bucket. In this instance, the token URI is "https://firebasestorage.googleapis.com/v0/b/genainft-ac24b.appspot.com/o/metadata%2F1.json?alt=media".
This metadata contains relevant information about the NFT, including its attributes, properties, and provenance. The token URI is a standardized method for accessing NFT metadata, enabling owners and users to retrieve detailed information about the digital asset. Functionally, token URIs enhance interoperability by providing consistent access to metadata across different NFT platforms and applications.
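To make the structure of the token URI concrete, the following Python sketch splits the URI reported above into its parts using only the standard library. The function name `split_token_uri` is hypothetical, and the path layout (`/v0/b/<bucket>/o/<encoded object>`) is inferred from this single URI rather than from any service documentation.

```python
from urllib.parse import urlparse, unquote, parse_qs

# Token URI exactly as reported for token ID 1.
TOKEN_URI = ("https://firebasestorage.googleapis.com/v0/b/"
             "genainft-ac24b.appspot.com/o/metadata%2F1.json?alt=media")


def split_token_uri(uri):
    """Break a storage-style token URI into host, bucket, object path, and query."""
    parsed = urlparse(uri)
    # Assumed path layout: /v0/b/<bucket>/o/<percent-encoded object name>
    prefix, _, encoded_object = parsed.path.partition("/o/")
    bucket = prefix.rsplit("/", 1)[-1]
    return {
        "host": parsed.netloc,
        "bucket": bucket,
        # %2F decodes to "/", giving the folder-style object path.
        "object_path": unquote(encoded_object),
        "query": parse_qs(parsed.query),
    }


parts = split_token_uri(TOKEN_URI)
print(parts["object_path"])
```

The decoded object path shows that the metadata file for token ID 1 sits in a `metadata` folder and is named after the token ID, which is what links the on-chain token to its off-chain record.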
The results of this study demonstrate the practical integration of Generative AI and blockchain technology in the architectural design process. The methodology and implementation details have been systematically explored, highlighting the benefits and challenges of this innovative approach. The following table summarises the key points considered in this integration:
This approach ensures transparency, security, and efficiency in the architectural design process, paving the way for future advancements in the field.

4. Discussion

4.1. AI Serves as a Creative Catalyst for Multimodal Design Generation

Multimodal generative AI has emerged as a groundbreaking approach in architectural design, offering the potential to revolutionize the design process through the integration of diverse modalities such as text, images, and videos. Golden et al. [8] delve into the system implications of multimodal generation, highlighting challenges and opportunities for text-to-image (TTI) and text-to-video (TTV) models. There are two main categories of generative AI (GAI) models: unimodal and multimodal [9]. Unimodal models take prompts from the same modality as the content they generate. In contrast, multimodal models can accept prompts from different modalities and produce results in multiple modalities, as shown in Figure 4.
Figure 4. The difference between unimodal and multimodal generative AI, adapted from [9].
Reflecting on the case study presented in this paper, particularly as shown in Table 4, consider the difference between utilizing a unimodal generative AI model driven solely by text input and integrating a multimodal approach using text and image prompts. When using a unimodal model, the architect's ability to convey nuanced design concepts may be limited by the constraints of text-only input.
By incorporating image prompts alongside textual descriptions, architects, as users, can communicate their design intent more effectively and explore a broader spectrum of creative possibilities. This seamless transition between image and written prompts enables flexible and dynamic design exploration, ultimately generating novel and innovative design solutions. Integrating multimodal generative AI empowers architects to harness visual and textual inputs, enriching the design process and producing more robust and sophisticated design outcomes.
The multimodal AI design process involves multiple iterations and adjustments based on initial sketches and continuous modifications. This iterative nature requires significant time and effort, mainly when refining designs to meet specific architectural requirements. Consequently, project timelines may experience delays, and architects and designers may face an increased workload.
The effectiveness of multimodal generative AI highly depends on the data it is trained on. If the training data lacks diversity or quality, the AI may produce repetitive or uninspired designs that do not meet the project's unique needs. Hence, there is a limitation in the AI's ability to generate creative and high-quality design options, potentially resulting in subpar architectural outcomes.
Based on the results and evaluation, particularly in the design process utilizing multimodal generative AI, we conclude that three aspects significantly impact the architectural design process: efficiency, accuracy, and user interface.
Table 5. Overview of the use of generative AI.
| Aspect | Summary | Specific generative AI applications or technology - Source |
| --- | --- | --- |
| Computational efficiency | AI enables rapid design generation, exploration, and iteration. | MidJourney¹ - [10,11,12]; NS² - [13,14]; Dall-E³ - [15] |
| | Designers can quickly produce, evaluate, and refine multiple options, leading to more innovative and optimized solutions. | NLP⁴ and MMAIR⁵ - [16]; NS - [17,18,19,20]; Dall-E - [21,22]; Dde⁶-GAN⁷ - [23]; CLIP⁸ - [24] |
| | Generative AI tools for architecture need high computational power and complex algorithms. | NS - [18,25]; ChatGPT⁹ - [26]; Bard AI¹⁰ - [26]; Neural Canvas¹¹ - [27] |
| | Ensuring efficiency and accessibility for all firms is challenging due to large datasets, diverse inputs, and multiple design constraints. | NS - [18,25]; ChatGPT - [26]; Bard AI - [26]; NS - [16,17,18,19,20]; LLMs¹² - [28] |
| | These demands can slow down processing and increase resource consumption. | NS - [18,25]; Neural Canvas - [27] |
| Accuracy | Significant improvement in imaging accuracy ensures high-fidelity imaging for precise applications. | CGANs¹³ - [29]; U-Net Arch¹⁴ - [29] |
| | Enhances reliability of multimodal communication and AI diagnostic processes. | GenAIVA¹⁵ and FER¹⁶ - [30]; ChatGPT - [31] |
| | Improves accuracy and transparency with visual explanations and textual analysis. | ChatGPT - [31] |
| | Maintaining high accuracy while optimizing resource usage and ensuring adaptability across diverse contexts is challenging. | LangChain LLM - [32] |
| | Ensuring consistent and reliable accuracy, generalizability, and efficient knowledge transfer in resource-limited environments is crucial. | 3DI¹⁷ - [33]; MML¹⁸ - [33]; GenAINet¹⁹ - [34] |
| | Making visual explanations and textual analyses both accurate and comprehensible is challenging. | ChatGPT - [31] |
| User Experience (UX) | AI tools, like chatbots, improve adaptability, responsiveness, and user interaction by managing tasks and information efficiently. | ChatGPT - [35] |
| | Enhanced visualization and engagement build trust in AI systems. | MidJourney - [36,37] |
| | Integrating text, image, and voice modalities into one tool is technically complex. | NS - [38] |
| | Generative AI tools require new skills and workflows, causing potential frustration and reduced productivity. | Dall-E - [36]; OpenAI - [39] |
| | Interoperability issues and variable AI output quality may need refinement. | MidJourney - [36]; Dall-E - [36] |
| | Limited customization can constrain designers' creativity. | NS - [38] |
| | Building user trust is challenging due to past unreliable performance and data privacy and security concerns. | GAIS²⁰ (IBM Watson) - [40] |

¹ Generative AI program to generate images using natural language descriptions; ² Non Specific; ³ Generative AI model developed by OpenAI; ⁴ Natural Language Processing; ⁵ Multimodal AI recognition; ⁶ Dde: Data-driven evaluator; ⁷ Generative Adversarial Network; ⁸ Contrastive Language-Image Pre-Training; ⁹ Generative Pre-trained Transformer; ¹⁰ Google's AI chatbot; ¹¹ AI comic generator; ¹² Large Language Models; ¹³ Conditional Generative Adversarial Network; ¹⁴ Convolutional neural network; ¹⁵ Generative AI for Virtual Avatar; ¹⁶ FER: Facial Expression Recognition; ¹⁷ 3D Invariant; ¹⁸ Multimodal Machine Learning; ¹⁹ Generative AI Networks; ²⁰ Generative AI System.
Table 5 summarises the advantages and challenges of applying multimodal generative AI in design, as highlighted by various research studies. As the case study reflects, multimodal generative AI has significantly enhanced computational efficiency, accuracy, and user experience across multiple applications. On the other hand, each aspect also presents challenges in maintaining high accuracy, consistency, and reliability while optimizing resource usage and ensuring adaptability across diverse contexts and environments, particularly in critical applications. Overall, generative AI is transforming the design process and becoming an indispensable tool in architectural design and other fields.

4.2. Blockchain Provides Methods of Verifying and Tracing the Authenticity of AI-Human Generative Design

Blockchain technology ensures ownership by offering a transparent ledger that records and validates ownership transactions. This technology enables the creation and management of assets through non-fungible tokens (NFTs), unique tokens representing ownership of a specific item or content piece.
Based on the case study in this research, illustrated in Figure 5, the "Transaction to execute createToken() Function within NFT Smart Contract" and the subsequent processes related to minting the NFT token and updating metadata can be considered part of the authentication process. This authentication involves verifying the transaction's validity and ensuring that the correct function is executed within the smart contract, ultimately leading to the creation and authentication of the NFT token. It empowers users to securely generate and possess digital assets on the blockchain. Furthermore, the "Token URI: Standardised Method for Retrieving Metadata Associated with NFT" process can be considered part of the traceability process. This step involves accessing the metadata associated with the NFT through a standardized token URI. By retrieving this metadata, users can trace and verify information about the digital asset, including its attributes, properties, and provenance.
Figure 5. The process of NFT smart contract: Verifying, validity and traceability.
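The authentication and traceability steps can be mimicked with a small in-memory model. The sketch below is a toy stand-in written in Python, not the deployed Solidity contract: the class `ToyNFTRegistry` and its method names are hypothetical, and the log label `MetadataUpdate` is simply our name for the metadata update event mentioned in the results. Only the owner address and token URI are taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNFTRegistry:
    """In-memory stand-in for an ERC-721-style contract (illustration only)."""
    owners: dict = field(default_factory=dict)
    uris: dict = field(default_factory=dict)
    logs: list = field(default_factory=list)
    next_id: int = 1

    def create_token(self, sender, token_uri):
        # Mint: assign the next ID, record the owner and the metadata link.
        token_id = self.next_id
        self.next_id += 1
        self.owners[token_id] = sender
        self.uris[token_id] = token_uri
        # Two log entries, mirroring the two logs reported for the mint.
        self.logs.append(("Transfer", None, sender, token_id))
        self.logs.append(("MetadataUpdate", token_id))
        return token_id

    def owner_of(self, token_id):
        # Authentication side: who holds the token.
        return self.owners[token_id]

    def token_uri(self, token_id):
        # Traceability side: where the token's metadata can be retrieved.
        if token_id not in self.uris:
            raise KeyError(f"token {token_id} does not exist")
        return self.uris[token_id]


registry = ToyNFTRegistry()
owner = "0x5B38Da6a701c568545dCfcB03FcB875f56beddC4"
uri = ("https://firebasestorage.googleapis.com/v0/b/"
       "genainft-ac24b.appspot.com/o/metadata%2F1.json?alt=media")
token_id = registry.create_token(owner, uri)
print(token_id, registry.owner_of(token_id), registry.token_uri(token_id))
```

A real contract enforces these rules on-chain and makes the logs tamper-evident; the toy version only shows how minting ties an owner and a metadata link to one token ID, and how that link is later read back.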
NFTs play a pivotal role in establishing ownership rights in architectural AI images by providing a unique digital representation of the asset on the blockchain. These tokens serve as a digital certificate of authenticity, verifying the ownership and provenance of the associated digital asset. In architectural AI images, NFTs can be used to tokenize specific designs, renderings, or other creative outputs generated through generative AI algorithms. By minting these assets as NFTs, architects and creators can assert ownership over their digital creations and establish a transparent chain of ownership on the blockchain. It ensures that the creator's rights are recognized and protected in the digital realm, enabling them to monetize their work, license it to others, or transfer ownership as desired. Additionally, NFTs can embed metadata that provides detailed information about the architectural AI image, further enhancing its value and utility for potential buyers or users. NFTs offer a secure and transparent mechanism for asserting ownership and managing intellectual property rights in architectural AI images, fostering trust and accountability in the digital ecosystem.
The case study approach provides detailed insights into real-world implementations, capturing multiple perspectives by examining these technologies and enhancing the reliability of the findings. After demonstrating the integration of generative AI and the implementation of the NFT blockchain, we concluded that three main aspects must be evaluated. In Table 6, we highlight the usage of NFTs from several papers, focusing on several key aspects: authenticity and ownership, integration in the creative process, and application in digital environments.
Integrating blockchain technology and NFTs forms a robust framework for ensuring the authenticity and ownership of digital assets within the generative AI design process. Features of generative AI, such as creating unique digital content, are enhanced by blockchain's ability to track and verify ownership securely. Legal aspects of data ownership are addressed through the immutable records provided by blockchain, ensuring clear attribution and reducing disputes. Blockchain also serves as a reliable digital asset storage solution, maintaining authenticity and ownership. Results from various studies demonstrate the successful implementation of these technologies, highlighting their potential to revolutionize digital asset management by providing secure, transparent, and verifiable ownership of AI-generated content.

4.3. Research Challenges, Limitations and Future Directions

This study addresses challenges in integrating generative AI and blockchain technology in architectural design. The table below summarises these challenges, the approaches to address them, the proposed solutions, and the existing research gaps that need further exploration.
Table 7. Overview of the research challenges.
| Aspect | Challenges | Addressing Challenges | Proposed Solution | Limitations |
| --- | --- | --- | --- | --- |
| Authenticity and traceability | Ensuring the authenticity and traceability of AI-generated images | Developing a blockchain-integrated framework that ensures the authenticity and traceability of the generative AI process. | A blockchain system can be used to store AI-generated images and their metadata as NFTs, ensuring secure and traceable data. | Scalability and performance of integrated systems in large-scale applications |
| Integration of technologies | Integrating multimodal generative AI and blockchain technologies seamlessly | Develop a structured framework for integration. | Combine multimodal generative AI and blockchain technology in a streamlined workflow for architectural design. | Interoperability between different generative AI tools and blockchain platforms |
| Data ownership and legal issues | Managing data ownership for AI-generated content | Addressing data ownership and regulatory issues by ensuring proper attribution and legal compliance through blockchain records. | Store AI-generated images and metadata in a blockchain system, ensuring data ownership and legal compliance through NFTs. | Comprehensive studies on legal and regulatory frameworks required to govern the use of AI and blockchain in architectural design |
| User experience and interaction | Improving design efficiency, accuracy, and user interaction | Utilizing detailed prompt engineering to ensure accurate and relevant AI-generated images that align with the intended architectural designs. | Use generative AI applications to create and refine architectural designs, ensuring user-friendly interaction and high-quality outputs. | User acceptance and trust in AI-generated designs and blockchain-based data management |
Future work should focus on expanding training datasets to include diverse architectural styles and exploring methods to disentangle style and organization within generative design algorithms. Additionally, we must develop user-friendly tools and interfaces that enable architects and designers to integrate Generative AI and blockchain into their design workflows easily. It could involve developing plug-ins for popular design software that streamline the process of generating AI-driven designs and recording them on the blockchain.
Collaboration between architects, AI researchers, and blockchain developers will be crucial for successfully implementing and adopting this integrated framework. Interdisciplinary efforts can drive innovation and ensure that the developed solutions meet the practical needs of the architectural community. Furthermore, establishing comprehensive regulatory frameworks that address the legal and ethical aspects of using generative AI and blockchain in design will be essential. These frameworks should promote innovation while protecting the rights and interests of all stakeholders involved, providing a secure environment for developing and applying these advanced technologies.

5. Conclusion

This study provides a strategy that uses generative AI technologies to achieve an efficient and creative workflow in the early stages of architectural design. Integrating generative AI and blockchain in architecture offers significant benefits, including streamlining the design process, protecting data ownership, and promoting authenticity and traceability. Incorporating generative AI platforms holds considerable potential for transforming the architecture field, enabling architects to streamline design exploration effectively. This capability highlights the role of AI as a creative catalyst for multimodal design generation, facilitating a deeper understanding of client intentions. By integrating blockchain technology and implementing NFT smart contracts, this study demonstrates how to enhance authenticity and traceability for generative AI designs as digital assets. NFT smart contracts function as digital certificates of authenticity, verifying ownership and provenance of associated digital assets.

Supplementary Materials

The following supporting information can be downloaded at: https://data.mendeley.com/datasets/d9zh352rf2/1

Author Contributions

JT made corrections to improve the paper; AF wrote the article; JT reviewed the whole text and made comments and suggestions to improve it. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Ministry of Science and Technology in Taiwan, grant No. MOST 111-2221-E-006-049-MY2.

Data Availability Statement

The datasets used and analyzed during the current study are available from the corresponding author upon reasonable request.

Acknowledgements

The authors thank the reviewers for their valuable comments and the editors for improving the manuscript.

Conflicts of Interest

The authors declare no competing interests.

References

  1. Davenport, T. H.; Mittal, N. How Generative AI Is Changing Creative Work, Harvard Business School Publishing, 2022. https://hbr.org/2022/11/how-generative-ai-is-changing-creative-work.
  2. Liu, Y.-C.; Liang, C. Design exploration predicts designer creativity: a deep learning approach, Cognitive Neurodynamics, 14(3), 2020, 291-300. 10.1007/s11571-020-09569-7.
  3. Fu, K.; Fuge, M.; Brown, D. C. Design creativity, Artificial Intelligence for Engineering Design, Analysis and Manufacturing, 32(4), 2018, 363-364. 10.1017/S089006041800015X https://www.cambridge.org/core/article/design-creativity/F169FA985C16C17B2DEBBADDCB25C1A5.
  4. Dreith, B. How AI software will change architecture and design, Dezeen Limited, 2022. https://www.dezeen.com/2022/11/16/ai-design-architecture-product/.
  5. Litan, A. Why trust and security are essential for the future of generative AI, Gartner Inc., 2023. https://www.gartner.com/en/newsroom/press-releases/2023-04-20-why-trust-and-security-are-essential-for-the-future-of-generative-ai.
  6. Samuelson, P. Generative AI meets copyright, 381(6654), 2023, 158-161.
  7. Fitriawijaya, A.; Taysheng, J. Multimodal Generative AI and NFT Metadata, Taipei, Taiwan, Release Date. 10.17632/d9zh352rf2.1.
  8. Golden, A.; Hsia, S.; Sun, F.; Acun, B.; Hosmer, B.; Lee, Y.; Devito, Z.; Johnson, J.; Wei, G.-Y.; Brooks, D. Generative AI Beyond LLMs: System Implications of Multimodal Generation, arXiv preprint arXiv:2312.14385, 2023. 10.48550/arXiv.2312.14385.
  9. Hariri, W. Unlocking the potential of ChatGPT: A comprehensive exploration of its applications, advantages, limitations, and future directions in natural language processing, arXiv preprint arXiv:2304.02017, 2023. 10.48550/arXiv.2304.02017.
  10. Del Castillo, A. P., AI: discovering the many faces of a faceless technology, ETUI aisbl, Brussels, Belgium, 2023 https://www.etui.org/sites/default/files/2023-05/AI-Guide-discovering%20the%20many%20faces%20of%20a%20faceless%20technology-2023.pdf.
  11. Ma, S. Y. Exploring ambiguity in generative AI images and its impact on collaborative design ideation, Master Student Thesis: Master, Industrial Engineering and Innovation Sciences, Eindhoven University of Technology, 2024 https://pure.tue.nl/ws/portalfiles/portal/320754968/MTP_thesis_report_Sherry_Ma.pdf.
  12. Jaruga-Rozdolska, A. Artificial intelligence as part of future practices in the architect's work: MidJourney generative tool as part of a process of creating an architectural form, Architectus, 3(71), 2022, 95-104. [CrossRef]
  13. Meeran, A. AI and Architecture: Image-based Machine Learning for early-stage design conceptualization, 2021.
  14. Basole, R. C.; Major, T. Generative AI for Visualization: Opportunities and Challenges, IEEE Computer Graphics and Applications, 44(2), 2024, 55-64. [CrossRef]
  15. Harreis, H.; Koullias, T.; Roberts, R.; Te, K. Generative AI: Unlocking the future of fashion, McKinsey Company, 2023. https://digital-humanai.io/wp-content/uploads/2023/03/Generative-AI-Unlocking-the-future-of-fashion.pdf.
  16. Zhong, C.; Yi'an Shi, L. H. C.; Wang, L. AI-enhanced performative building design optimization and exploration, presented at the 29th International Conference on Computer-Aided Architectural Design Research in Asia, CAADRIA 2024, The Association for Computer-Aided Architectural Design Research in Asia (CAADRIA), 1, 2024. https://papers.cumincad.org/data/works/att/caadria2024_15.pdf.
  17. Bstieler, L.; Noble, C. H., The PDMA Handbook of Innovation and New Product Development, John Wiley & Sons, 2023.
  18. Li, C.; Zhang, T.; Du, X.; Zhang, Y.; Xie, H. Generative AI for Architectural Design: A Literature Review, arXiv preprint arXiv:.01335, 2024. [CrossRef]
  19. Zhang, K.; Cai, S.; Yang, W.; Wu, W.; Shen, H. Exploring Optimal Combinations: The Impact of Sequential Multimodal Inspirational Stimuli in Design Concepts on Creativity, presented at the Proceedings of the 2024 ACM Designing Interactive Systems Conference, IT University of Copenhagen, Denmark Association for Computing Machinery, 2024 Published, 2788–2801. [CrossRef]
  20. Ochoa, K. S., Can Artificial Intelligence Mark the Next Architectural Revolution? Design Exploration in the Realm of Generative Algorithms and Search Engines, Springer, 2024. [CrossRef]
  21. Paananen, V.; Oppenlaender, J.; Visuri, A. J. I. J. O. a. C. Using text-to-image generation for architectural design ideation, 2023, 14780771231222783. [CrossRef]
  22. Albaghajati, Z. M.; Bettaieb, D. M.; Malek, R. B. Exploring text-to-image application in architectural design: insights and implications, Architecture, Structures and Construction, 3(4), 2023, 475-497. [CrossRef]
  23. Yuan, C.; Marion, T.; Moghaddam, M. Dde-gan: Integrating a data-driven design evaluator into generative adversarial networks for desirable and diverse concept generation, Journal of Mechanical Design, 145(4), 2023, 041407. [CrossRef]
  24. Salem, A. A.; Mansour, Y.; Eldaly, H. Generative vs. Non-Generative AI: Analyzing the Effects of AI on the Architectural Design Process, Engineering Research Journal, 53(2), 2024, 119-128. [CrossRef]
  25. Liao, W.; Lu, X.; Fei, Y.; Gu, Y.; Huang, Y. Generative AI design for building structures, Automation in Construction, 157, 2024, 105187. [CrossRef]
  26. Rane, N.; Choudhary, S.; Rane, J. Integrating ChatGPT, Bard, and leading-edge generative artificial intelligence in architectural design and engineering: applications, framework, and challenges, SSRN Electronic Journal, 2023. [CrossRef]
  27. Shen, Y.; Shen, Y.; Cheng, J.; Jiang, C.; Fan, M.; Wang, Z. Neural Canvas: Supporting Scenic Design Prototyping by Integrating 3D Sketching and Generative AI, presented at the Proceedings of the CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA Association for Computing Machinery, 2024 Published, Article 1056. [CrossRef]
  28. Makatura, L.; Michael Foshey; Bohan Wang; Felix Hähnlein; Pingchuan Ma; Bolei Deng; Megan Tjandrasuwita; Andrew Spielberg; Crystal Elaine Owens; Peter Yichen Chen; Allan Zhao; Amy Zhu; Wil J. Norton; Edward Gu; Joshua Jacob; Yifei Li; Adriana Schulz; Matusik, W. Large Language Models for Design and Manufacturing, An MIT Exploration of Generative AI, 2024. [CrossRef]
  29. Maqbool, J.; Hassan, S. T.; Cheema, M. I. Application of conditional generative adversarial networks toward time-efficient and high-fidelity imaging via multimode fibers, in AI and Optical Data Sciences V, SPIE, 12903, 2024, 69-73. [CrossRef]
  30. F, G.; N, M.; Khan, J.; S. H, K. M. Envisioning the interactive convergence of Generative AI and Facial Expression Recognition, in 2024 IEEE 9th International Conference for Convergence in Technology (I2CT), 2024, 1-5. 10.1109/I2CT61223.2024.10543745.
  31. Koga, S. Evaluating ChatGPT in pathology: towards multimodal AI in medical imaging, Journal of Clinical Pathology, 2024, jcp-2024-209483. [CrossRef]
  32. Micheal, A. A.; Prasanth, A.; Aswin, T.; Krisha, B. Advancing Educational Accessibility: The LangChain LLM Chatbot's Impact on Multimedia Syllabus-Based Learning, 2024. [CrossRef]
  33. Dollar, O. Deep Inverse Design, Discovery, and Optimization of Molecular Structure through 3D Invariant and Multimodal Machine Learning, 2023 http://hdl.handle.net/1773/50264.
  34. Zou, H.; Zhao, Q.; Bariah, L.; Tian, Y.; Bennis, M.; Lasaulce, S.; Debbah, M.; Bader, F. GenAINet: Enabling Wireless Collective Intelligence via Knowledge Transfer and Reasoning, arXiv preprint:2402.16631, 2024. [CrossRef]
  35. Zürcher, A. Developing a Chatbot for Internal Documents, Master, Business Information Technology, Haaga-Helia University of Applied Sciences, Finland, 2024 https://www.theseus.fi/bitstream/handle/10024/861594/Zurcher_Alexandre.pdf?sequence=2.
  36. Parati, I.; Zolotova, M. Using Future Thinking as a steering tool for Generative AI creative output: a case study aiming at rethink lighting in the next future, in Human Interaction & Emerging Technologies: Artificial Intelligence & Future Applications Proceedings of the 11th International Conference on Human Interaction and Emerging Technologies, IHIET-AI 2024, April 25–27, 2024, Lausanne, Switzerland, Technology & Engineering, AHFE International Open Access, 2024. [CrossRef]
  37. Nistler, J.; Pojeta, T. J. V. T.-E. I. Graphical use of AI, 65(4), 2023, 54-56. https://www.vtei.cz/wp-content/uploads/2023/08/6575-casopis-VTEI-4-23-EN-AI.pdf.
  38. Bagnato, V. P. Artificial Intelligence for Design: The Artificial Intelligence of Objects, Interdisciplinary Journal of Architecture and Built Environment, 2023, 30-35. https://www.researchgate.net/profile/Valerio-Perna/publication/379573623_FORUM_AP_27_Venturing_into_the_Age_of_AI_Insights_and_Perspectives/links/660fb14db839e05a20bd9cfb/FORUM-A-P-27-Venturing-into-the-Age-of-AI-Insights-and-Perspectives.pdf#page=31.
  39. Schraml N, T. (2023, 2023 August-September) 'You've Got All the Weapons You Need. Now Fight!'. Database Trends & Applications [Article]. 32. Available: https://link.gale.com/apps/doc/A766676216/AONE?u=anon~172811b0&sid=googleScholar&xid=b8f48060.
  40. Sharma, S. K.; Dwivedi, Y. K.; Metri, B.; Lal, B.; Elbanna, A., Transfer, Diffusion and Adoption of Next-Generation Digital Technologies: IFIP WG 8.6 International Working Conference on Transfer and Diffusion of IT, TDIT 2023, Nagpur, India, December 15–16, 2023, Proceedings, Part I, Springer Nature, 2023 https://link.springer.com/book/10.1007/978-3-031-50192-0.
  41. Tomaževič, N.; Ravšelj, D.; Aristovnik, A. Artificial Intelligence for human-centric society: The future is here, European Liberal Forum, 2023. https://liberalforum.eu/wp-content/uploads/2023/12/Artificial-Intelligence-for-human-centric-society.pdf.
  42. Mcnamara, T. Artificial intelligence and the emergence of co-creativism in contemporary art, INSAM Journal of Contemporary Music, Art.
  43. Technology, (11), 2023, 12-38. https://www.ceeol.com/search/article-detail?id=1210421.
  44. Zhang, B.; Chen, G.; Ooi, B. C.; Shou, M. Z.; Tan, K. L.; Tung, A. K. H.; Xiao, X.; Yip, J. W. L.; Zhang, M. Managing Metaverse Data Tsunami: Actionable Insights, IEEE Transactions on Knowledge and Data Engineering, 2024, 1-20. [CrossRef]
  45. Guo, Y.; Liu, Q.; Chen, J.; Xue, W.; Jensen, H.; Rosas, F.; Shaw, J.; Wu, X.; Zhang, J.; Xu, J. J. a. P. A. Pathway to Future Symbiotic Creativity, 2022. [CrossRef]
  46. Parra Pennefather, P. Prototyping with Generative AI, in Creative Prototyping with Generative AI: Augmenting Creative Workflows with Generative AI, Springer, 2023, 109-143. [CrossRef]
  47. Ioannıdıs, S.; Kontıs, A. P. The 4 Epochs of the Metaverse, Journal of Metaverse, 3(2), 2023, 152-165. [CrossRef]
  48. Kalpokas, I. J. P.; Criticism, S. Work of art in the Age of Its AI Reproduction, 2023, 01914537231184490. [CrossRef]
  49. Rudolf, I. Understanding the Influence of Artificial Intelligence Art on Transaction in the Art World, Master, School of Humanities, Social, Science, and Economics, International Hellenic University, Thessaloniki, Greece, 2024 https://repository.ihu.edu.gr/xmlui/bitstream/handle/11544/30356/Ion%20Rudolf.pdf?sequence=1.
  50. Gupta, R.; Pal, S. K. Introduction to Metaverse, Springer Books, 2023. [CrossRef]
  51. Popescu, A.-D. Non-fungible tokens (nft)–innovation beyond the craze, in 5th International Conference on Innovation in Business, Economics and Marketing Research, 32, 2021, 26-30. https://www.researchgate.net/publication/353973149_Non-Fungible_Tokens_NFT_-_Innovation_beyond_the_craze#fullTextFileContent.
  52. Sahu, B.; Chandramohan Jha, A. M. NFT Marketplaces: The Future of Digital Asset Trading, International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), 9(3), 2023, 513-519. [CrossRef]
  53. Han, Y.; Wang, C.; Wang, H. Research on Blockchain Cross-Chain Model Based on "NFT + Cross-Chain Bridge", IEEE Access, 12, 2024, 77065-77078. [CrossRef]
  54. Wang, Q.; Li, R.; Wang, Q.; Chen, S. Non-fungible token (NFT): Overview, evaluation, opportunities and challenges, arXiv preprint arXiv:.07447, 2021. [CrossRef]
  55. Moreaux, A. Visual content tracking, IPR management, & blockchain: from process abstraction to functional interoperability, Institut Polytechnique de Paris, 2023 https://theses.hal.science/tel-04418984/.
  56. Lu, W.; Wu, L. A blockchain-based deployment framework for protecting building design intellectual property rights in collaborative digital environments, Computers in Industry, 159, 2024, 104098. [CrossRef]
  57. Truong, V. T.; Le, L.; Niyato, D. Blockchain meets metaverse and digital asset management: A comprehensive survey, Ieee Access, 11, 2023, 26258-26288. [CrossRef]
  58. Park, A.; Kietzmann, J.; Pitt, L.; Dabirian, A. The evolution of non-fungible tokens: Complexity and novelty of NFT use-cases, IT Professional, 24(1), 2022, 9-14. [CrossRef]
  59. Huang, M.-H.; Rust, R. T. J. J. O. S. R. Artificial intelligence in service, 21(2), 2018, 155-172. [CrossRef]
  60. Morháč, D.; Valaštín, V.; Košťál, K.; Kotuliak, I. Cross-Chain Payments on Blockchain Networks: An Apartment Booking Use-Case, presented at the Proceedings of the 39th ACM/SIGAPP Symposium on Applied Computing, Avila, Spain Association for Computing Machinery, 2024 Published, 608–611. [CrossRef]
  61. Battah, A.; Madine, M.; Yaqoob, I.; Salah, K.; Hasan, H. R.; Jayaraman, R. Blockchain and NFTs for trusted ownership, trading, and access of AI models, IEEE Access, 10, 2022, 112230-112249. [CrossRef]
  62. Bhujel, S.; Rahulamathavan, Y. A survey: Security, transparency, and scalability issues of nft's and its marketplaces, Sensors, 22(22), 2022, 8833. [CrossRef]
  63. Khalil, U.; Uddin, M.; Malik, O. A.; Hong, O. W. A Novel NFT Solution for Assets Digitization and Authentication in Cyber-Physical Systems: Blueprint and Evaluation, IEEE Open Journal of the Computer Society, 5, 2024, 131-143. [CrossRef]
Figure 1. A framework for integrating Multimodal Generative AI and blockchain systems.
Figure 2. AI-generated images to NFT metadata: a streamlined process.
Figure 3. The screenshot depicts the deployment of NFT metadata using the ERC721 standard sourced from the dataset [7].
Table 1. An example of the architectural design process using multimodal generative AI (images generated by Midjourney).
| Input: description | Input: prompt | Output: job ID of the generated images |
| --- | --- | --- |
| Design intention: a micro library located in a rice field, with the building concept of a modern house with a pyramid roof. Produce a hand sketch depicting the intended shape of the building to ensure alignment with the desired design. | https://s.mj.run/mzXZOLMb-Xw people walking, sunny day, architectural rendering --s 750 | 09effa33-8139-42cc-8704-feeddebf8186 |
| Modify the building and add the environment to match the design intention. | modern stilt house building with long cube shape, with wooden material and perforated building facade | c4e63e2f-b027-4234-986b-ef2294080713 |
| Modify the building element. | trees and sky, --no building - Variations (Region) | 8084c539-22e0-4cec-8163-84eba9190947 |
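The prompts in Table 1 combine an optional image reference (the URL of the uploaded sketch), a text description, and parameters such as `--s 750` or `--no building`. As a minimal sketch of how such a prompt string might be assembled, the Python helper below is purely illustrative: the function `build_prompt` and its argument names are hypothetical and are not part of the authors' workflow or of any Midjourney API.

```python
from typing import Optional, Sequence


def build_prompt(
    description: str,
    image_url: Optional[str] = None,
    stylize: Optional[int] = None,
    exclude: Sequence[str] = (),
) -> str:
    """Assemble a Midjourney-style prompt string (illustrative helper only)."""
    parts = []
    if image_url:
        # Placing an image reference first makes the prompt multimodal.
        parts.append(image_url)
    parts.append(description.strip())
    if exclude:
        # "--no" lists elements to leave out, as in "--no building" in Table 1.
        parts.append("--no " + ", ".join(exclude))
    if stylize is not None:
        # "--s" sets the stylize value, as in "--s 750" in Table 1.
        parts.append(f"--s {stylize}")
    return " ".join(parts)


# Rebuild the first prompt of Table 1 from its components.
sketch_prompt = build_prompt(
    "people walking, sunny day, architectural rendering",
    image_url="https://s.mj.run/mzXZOLMb-Xw",
    stylize=750,
)
print(sketch_prompt)
```

Keeping the image reference, the description, and the parameters as separate arguments makes it easier to record each component later as its own metadata field.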
Table 2. NFT Metadata, source [7].
Link to the generative AI image in Firebase storage (referenced by the metadata):
https://firebasestorage.googleapis.com/v0/b/genainft-ac24b.appspot.com/o/image%2F1.png?alt=media&token=0aab1e85-3c98-4e04-ac79-db175ab7c82e
Metadata (.json file):
{
  "attributes": [
    {
      "username": "digicliffnotes",
      "user_id": "1084483399319822387",
      "job_id": "c4e63e2f-b027-4234-986b-ef2294080713",
      "creator_name": "Adam",
      "creation_date": "April 13th, 2024 11:26 pm",
      "creation_tool": "Midjourney",
      "prompt": "modern_stilt_house_building_with_long_cube_shape_with_wooden_material_and_perforated_building_facade",
      "image_link": "https://cdn.discordapp.com/attachments/1087237286707605535/1228727456182177862/digicliffnotes_modern_stilt_house_building_with_long_cube_shape_c4e63e2f-b027-4234-986b-ef2294080713.png?ex=662d189e&is=661aa39e&hm=e6931926dc71ae46f1f1966444fa2b5f344916e0a168a2b74e71688c1f69a478&"
    }
  ],
  "image": "https://firebasestorage.googleapis.com/v0/b/genainft-ac24b.appspot.com/o/image%2F1.png?alt=media&token=0aab1e85-3c98-4e04-ac79-db175ab7c82e",
  "name": "Design_option #1"
}
Link to the metadata in Firebase storage:
https://firebasestorage.googleapis.com/v0/b/genainft-ac24b.appspot.com/o/metadata%2F1.json?alt=media&token=c08b907d-b6f4-4653-9ceb-c8966e48e659
Deployment and transaction of the metadata using a smart contract:
The contract address: 0xb27A31f1b0AF2946B7F582768f03239b1eC07c2c
Token address: 0x5B38Da6a701c568545dCfcB03FcB875f56beddC4
Token URI: "https://firebasestorage.googleapis.com/v0/b/genainft-ac24b.appspot.com/o/metadata%2F1.json?alt=media"
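The metadata file in Table 2 can be thought of as a small dictionary that is serialized to JSON before upload. The Python sketch below shows one way such a record might be assembled. It is an illustration under stated assumptions, not the authors' tooling: the helper names `prompt_to_slug` and `build_metadata` are hypothetical, the image URL is a placeholder, and several fields from Table 2 (`username`, `user_id`, `creation_date`, `image_link`) are omitted for brevity. The slug rule (drop punctuation, join words with underscores) is inferred from the stored `prompt` value.

```python
import json
import re


def prompt_to_slug(prompt: str) -> str:
    """Drop punctuation and join the words with underscores."""
    words = re.findall(r"[A-Za-z0-9]+", prompt)
    return "_".join(words)


def build_metadata(name, image_url, job_id, prompt, creator_name, creation_tool):
    """Return a dictionary shaped like the metadata JSON in Table 2."""
    return {
        "attributes": [
            {
                "job_id": job_id,
                "creator_name": creator_name,
                "creation_tool": creation_tool,
                "prompt": prompt_to_slug(prompt),
            }
        ],
        "image": image_url,
        "name": name,
    }


metadata = build_metadata(
    name="Design_option #1",
    image_url="https://example.com/image/1.png",  # placeholder, not the real storage URL
    job_id="c4e63e2f-b027-4234-986b-ef2294080713",
    prompt=(
        "modern stilt house building with long cube shape, "
        "with wooden material and perforated building facade"
    ),
    creator_name="Adam",
    creation_tool="Midjourney",
)

# Serialize to the JSON text that would be uploaded alongside the image.
print(json.dumps(metadata, indent=2))
```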
Table 3. Key points of scenario.
| Key point | Description |
| --- | --- |
| Framework for integration | Combines Generative AI and blockchain for architectural design, using a hypothetical scenario framework to outline the design process flow and real-life applications. |
| Generative AI design process | Involves initial sketches, prompt engineering, and iterative refinement to generate accurate AI outputs. Key elements include building typology, site details, materials, spatial layout, and rendering style. |
| Features of Generative AI | Uses variations, upscaling, blending, remixing, and prompts to provide multiple design possibilities. AI models are trained on large datasets to identify artistic trends and stylistic components. |
| Data ownership and legal aspects | Emphasizes the importance of data ownership in AI training datasets, which has legal, moral, and regulatory implications. It includes rights to the usage, modification, distribution, and monetization of AI-generated outputs. |
| Blockchain for data storage | Uses blockchain to store prompt data and AI-generated images as NFTs, ensuring secure data ownership. The process includes generating images, producing metadata, storing data in Google Firebase, and converting metadata into NFT format. |
| Results and implementation | Showcases the outcomes of the generative design process, with metadata linked to the final images. Provides examples of metadata, storage links, and smart contract details for NFT deployment, demonstrating the practical application of the method. |
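Table 3 describes converting stored metadata into NFT format, and Figure 3 refers to the ERC721 standard. The Python sketch below is a toy, in-memory model of the bookkeeping that such a token contract performs: which address owns each token and which metadata URI it points to. It is not the authors' smart contract and does not interact with any blockchain. The class name, the placeholder addresses, and the example URI are hypothetical. The methods mirror the standard's `ownerOf`, `tokenURI`, and `transferFrom` functions, while `mint` is a common convention rather than part of the ERC-721 interface.

```python
class ToyTokenRegistry:
    """In-memory stand-in for an ERC-721-style contract (illustration only)."""

    def __init__(self):
        self._owners = {}      # token_id -> owner address
        self._token_uris = {}  # token_id -> metadata URI
        self._next_id = 1

    def mint(self, to: str, token_uri: str) -> int:
        """Create a new token owned by `to` that points at `token_uri`."""
        token_id = self._next_id
        self._owners[token_id] = to
        self._token_uris[token_id] = token_uri
        self._next_id += 1
        return token_id

    def owner_of(self, token_id: int) -> str:
        return self._owners[token_id]

    def token_uri(self, token_id: int) -> str:
        return self._token_uris[token_id]

    def transfer_from(self, sender: str, to: str, token_id: int) -> None:
        """Move a token to a new owner; only the current owner may send it."""
        if self._owners[token_id] != sender:
            raise PermissionError("sender does not own this token")
        self._owners[token_id] = to


registry = ToyTokenRegistry()

# The architect mints a token for one design option (placeholder values).
token_id = registry.mint("0xArchitect", "https://example.com/metadata/1.json")

# Ownership later passes to the client; the metadata URI stays the same.
registry.transfer_from("0xArchitect", "0xClient", token_id)
print(registry.owner_of(token_id), registry.token_uri(token_id))
```

In a real deployment these records live in contract storage on the chain, which is what makes the ownership history tamper-evident. The dictionary-based model here only illustrates the data relationships.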
Table 4. Comparing Uni-modal and Multimodal Generative AI in the architectural design phase.
| Mode | Item | Input | Output: image options | Output: selected image |
| --- | --- | --- | --- | --- |
| Uni-modal | Design objective | Design iteration: find a reference building by writing a prompt that develops a building shape suited to the design intention. | Job ID: 7c006bd2-07ad-4e8d-b9ef-47f00abc4732 | Job ID: f918bc31-bcff-4a1c-a49a-d98959652c05 |
| Uni-modal | Image | - | | |
| Uni-modal | Prompts | Create a prompt as a trigger to draw the environment: ["micro library, incorporating vernacular and contemporary architecture, combination of perforated metal panel and transparent muted glass wall as facade, mir rendering, perpspective view, located in the rice field near the village in taiwan, natural light"] | | |
| Multimodal | Design objective | Design iteration: combine the buildings using blend to obtain a wider range of design options. | Job ID: f4e81d0e-b2e2-4057-8ee3-42a59f97551c, seed 2798360210 | Job ID: 8ede5825-a45c-452b-ba8e-bb22e6c2d14a, seed 2798360210 |
| Multimodal | Image | Two input images | | |
| Multimodal | Prompts | No prompt | | |
Table 6. Overview of the aspect of using NFT and its integration.
| Aspect | Summary | Source |
| --- | --- | --- |
| Authenticity, certification and ownership | NFTs provide a robust method for certifying the authenticity of digital assets through blockchain technology. | [10,41,42,43,44,45,46,47] |
| Authenticity, certification and ownership | Blockchain's immutable nature ensures that the ownership records of NFTs remain tamper-proof and verifiable, thus guaranteeing the authenticity of AI-generated content. | [47,48,49] |
| Authenticity, certification and ownership | Proving ownership and authenticity in a decentralized NFT market can be complex. | [50,51] |
| Authenticity, certification and ownership | Ensuring the security of blockchain and NFTs against hacking, fraud, and other malicious activities is a significant concern that can impact the reliability of authenticity and ownership records. | [42,46,47] |
| Integration in the creative process | NFTs facilitate creating, owning, and distributing collaborative AI-human creations. This integration supports a new dimension of creativity where digital assets are co-created by humans and AI. | [42,44,48] |
| Integration in the creative process | Interoperability issues between different blockchain platforms can hinder seamless integration and data exchange. | [52,53,54] |
| Integration in the creative process | Complexity integrating NFT-based. | [55,56,57] |
| Scalability and traceability process | Blockchain-based NFTs simplify the registration, verification, and tracing of financial transactions related to digital assets, thus enhancing these transactions' overall security and transparency. | [43,46,49,58,59] |
| Scalability and traceability process | NFTs offer a transparent and secure means to track and verify ownership of digital assets. | [60,61] |
| Scalability and traceability process | Implementing blockchain solutions on a large scale can be complex and costly, limiting their practicality for verifying and tracking digital assets. | [55,56,62] |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.