Preprint (Review). This version is not peer-reviewed.

Applications of Deep Reinforcement Learning for Home Energy Management Systems: A Review

A peer-reviewed article of this preprint also exists.

Submitted: 11 November 2024 | Posted: 12 November 2024


Abstract
In the context of the increasing integration of renewable energy sources (RES) and smart devices in domestic applications, the implementation of Home Energy Management Systems (HEMS) is becoming a pivotal factor in optimizing energy usage and reducing costs. This review examines the role of reinforcement learning (RL) in the advancement of HEMS, presenting it as a powerful tool for the adaptive management of complex, real-time energy demands. The review is notable for its comprehensive examination of RL-based methods and tools in HEMS, encompassing demand response, load scheduling, and renewable energy integration. Furthermore, the integration of RL within distributed automation and Internet of Things (IoT) frameworks is emphasized as a means of facilitating autonomous, data-driven control. Despite the considerable potential of this approach, the authors identify a number of challenges that require further investigation, including the need for robust data security and scalable solutions. It is recommended that future research place greater emphasis on real-world applications and case studies, with the objective of bridging the gap between theoretical models and practical implementations, so as to achieve resilient and secure energy management in residential and prosumer buildings, in particular within local microgrids.

1. Introduction

The transition towards sustainable energy practices has led to an increased emphasis on Home Energy Management Systems (HEMS) as integral components of energy-efficient residential buildings. These systems are designed to monitor, control and optimize energy usage within residences, thereby aligning with broader goals of reduced environmental impact and operational cost savings [1,2]. As residential energy demands increase, driven by the growing utilization of smart devices, electric vehicles and distributed energy resources (DERs) such as solar panels and battery storage, HEMS are becoming indispensable components of modern homes [3,4,5,6]. The importance of HEMS is further emphasized by the advent of smart grids, as they not only facilitate the management of energy within the context of individual households but also provide support for grid stability and efficiency through the implementation of demand side management (DSM) and demand side response (DSR) mechanisms [6,7,8].
In conjunction with these technological developments, new regulatory frameworks, policies and standards are fostering the evolution of HEMS, thereby rendering energy management functions obligatory for residential and prosumer premises. The revised Energy Performance of Buildings Directive (EPBD 2024) [9] places an emphasis on energy efficiency within the building sector and mandates improvements in energy performance across EU member states. The directive’s objective is to achieve a highly energy-efficient and decarbonized building stock by 2050, which will necessitate updates to existing HEMS to meet the requisite energy-saving and emission-reducing targets [10]. Furthermore, the Smart Readiness Indicator (SRI), recently introduced in the EPBD 2024, assesses a building’s capacity to support a range of smart services, including the integration of renewable energy sources (RES) and storage solutions, the enabling of vehicle-to-grid (V2G) connections, and effective interfacing with the smart grid. The SRI is therefore of great importance in the promotion of buildings that are not only energy-efficient but also adaptable to future energy requirements [11,12,13]. New technical standards, such as ISO 52120 [14] for Building Automation and Control Systems (BACS), provide further specification of the functional and performance requirements for energy management. This standard provides comprehensive guidance for optimizing energy usage within residential and commercial buildings, requiring the integration of BACS with HEMS to achieve enhanced energy efficiency and responsiveness to grid signals. In combination, these regulatory changes are propelling HEMS towards more sophisticated functionalities that not only facilitate energy management at the household level but also actively contribute to grid stability [11,15,16].

1.1. Modern and Future Home Energy Management Systems – Complexity and Advancements

It is expected that modern and future HEMS will manage a wide array of interconnected devices, including HVAC systems, lighting, appliances, as well as RES and local prosumer microgrids [8,17]. These devices are often integrated with distributed control networks and Internet of Things (IoT) technologies that enable real-time monitoring and automated control [18,19,20]. The utilization of the IoT within smart homes provides a plethora of data, yet simultaneously introduces a multitude of complexities. The system must adeptly handle diverse data streams from a multitude of sources, process this data in real-time, and make optimal control decisions under varying conditions [21,22,23,24]. The heterogeneity of data, which encompasses weather forecasts, energy prices, user habits, and the operational status of devices, gives rise to a highly complex problem space. The complexity is further amplified in interconnected environments, where homes operate as nodes within a smart grid. Adaptive and coordinated responses are therefore essential to ensure grid stability and efficiency [25,26,27,28].
Bearing in mind all the discussed aspects, conventional optimization techniques for HEMS, such as rule-based algorithms, static scheduling and linear programming, frequently demonstrate shortcomings in addressing the dynamic characteristics of smart homes comprising interconnected devices and fluctuating demand profiles. Reinforcement Learning (RL) offers a promising approach for dynamically managing energy resources within complex environments. Its ability to learn and improve policies through interaction and reward makes it a compelling solution for these scenarios. In particular, RL enables HEMS to adapt continuously to changing conditions by adjusting control policies in response to real-time feedback [29,30,31,32]. As HEMS applications become increasingly complex and diverse, Deep Reinforcement Learning (DRL), a subfield of RL that employs deep neural networks to process high-dimensional data, has emerged as a powerful tool for managing these systems. The abilities of DRL algorithms to manage extensive data inputs and learn intricate relationships between variables make them particularly well-suited to smart home environments with extensive IoT devices and diverse energy sources [32,33,34,35].
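To make this learn-by-feedback loop concrete, the following minimal sketch shows how a tabular Q-learning agent of the kind discussed above could learn a run-or-defer policy for a flexible appliance under a toy time-of-use tariff. The state, action and reward definitions here are simplifying assumptions made for illustration, not a model taken from any of the surveyed papers.

```python
# Illustrative sketch only: tabular Q-learning for a hypothetical HEMS
# decision (run vs. defer a flexible appliance). The toy tariff and the
# reward shaping are assumptions, not taken from the cited literature.
import random
from collections import defaultdict

ACTIONS = ["run_now", "defer"]          # flexible-load control decisions
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1  # learning rate, discount, exploration

Q = defaultdict(float)                  # Q[(state, action)] -> value estimate

def reward(price_level: int, action: str) -> float:
    # Hypothetical reward: running at a high price incurs a large cost
    # penalty; deferring incurs a small, fixed comfort penalty.
    if action == "run_now":
        return -float(price_level)
    return -0.2

def choose_action(state) -> str:
    if random.random() < EPSILON:       # epsilon-greedy exploration
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

for episode in range(2000):
    for hour in range(24):
        price_level = 2 if 17 <= hour <= 21 else 1   # toy tariff: evening peak
        state = (hour, price_level)
        action = choose_action(state)
        r = reward(price_level, action)
        nxt_hour = (hour + 1) % 24
        next_state = (nxt_hour, 2 if 17 <= nxt_hour <= 21 else 1)
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning temporal-difference update
        Q[(state, action)] += ALPHA * (r + GAMMA * best_next - Q[(state, action)])

# Greedy policy after training: the agent defers during the evening peak.
print(max(ACTIONS, key=lambda a: Q[((18, 2), a)]))
```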
The implementation of RL and DRL in HEMS presents a number of benefits with regard to DSM and load management. To illustrate, DRL algorithms are capable of anticipating peak load periods based on historical data and current usage patterns, thereby enabling systems to preemptively shift loads, control appliance usage, or draw on stored energy to reduce strain on the grid [6,15,36,37]. Furthermore, the adaptability of RL-based methods enables HEMS to autonomously respond to DSM signals from utility providers, thereby adjusting loads in a manner that optimizes both homeowner cost savings and grid efficiency. This adaptability is crucial, as HEMS are increasingly expected to balance energy consumption and storage within the household while synchronizing with smart grid demands. This will contribute to broader goals of energy resilience and sustainability as well as support transactive energy model [38,39,40,41,42].
BACS and smart home systems play a pivotal role within HEMS by orchestrating the control of an array of home subsystems, including lighting, climate, and security, through centralized or distributed control platforms [43,44,45]. The integration of advanced BACS with RL and DRL models facilitates a more unified and effective operation of HEMS. This is achieved by providing a comprehensive view of the user’s activity, the home’s energy demands and environmental conditions, which enables the management of the building’s diverse systems in a coordinated manner. This integration supports an optimized approach to energy management, where, for instance, Heating, Ventilation and Air Conditioning (HVAC) systems can be programmed to adjust based on occupancy patterns detected by IoT sensors or energy pricing data, thereby reducing operational costs [5,22,46,47].
Despite the considerable potential of RL and DRL in the field of HEMS, a number of challenges remain. One significant constraint is the data-intensive nature of DRL models, which necessitate extensive training data to learn optimal policies, often resulting in high computational overheads. Furthermore, DRL models must guarantee a high degree of safety and reliability, as suboptimal decisions could result in increased energy costs or compromise user comfort. Moreover, the integration of DRL with IoT-based HEMS gives rise to concerns regarding data security and user privacy, given that these systems process sensitive personal data from smart home devices [26,48,49,50,51].

1.2. An Original Contribution and the Paper Structure

This review provides a comprehensive examination of the current applications of RL and DRL in HEMS, with a particular focus on the analysis of their benefits, limitations, and the state-of-the-art methodologies. By outlining the contributions of RL and DRL to energy demand forecasting, load scheduling, and peak load management, as well as their role in smart grid integration and DSM, this review aims to demonstrate the value of these technologies in advancing smart home capabilities. Furthermore, it discusses ongoing research and future directions, with particular emphasis on areas where RL and DRL have the potential to facilitate the development of autonomous, efficient, and sustainable home energy systems that will benefit both individual households and the broader energy grid.
Moreover, the review makes a novel contribution by exploring the intersection of RL with IoT technologies, DSR and DSM strategies, scheduling optimization, and the integration of RESs with energy storage systems. The focus on these interrelated domains represents a cutting-edge direction in smart building technologies development, especially in smart home automation and energy management, which remains underdeveloped in existing literature. The novelty of this review lies in the following key aspects:
  • Synergy between RL and IoT for real-time smart home systems. While RL and IoT have individually shown promise in home and building automation [43,52,53,54], this review is among the first to extensively analyze how RL can be leveraged with IoT networks to achieve real-time monitoring and adaptive control in energy management. Moreover, it demonstrates the potential for more efficient and autonomous building operations through the utilization of IoT sensors to feed RL systems with real-time data on energy usage, occupancy, and environmental conditions;
  • Innovative approaches to DSR optimization. This review identifies a novel application of RL in enhancing DSR programs, enabling homes, in particular prosumers, to dynamically respond to fluctuations in energy prices and grid conditions. By utilizing RL, homes and buildings can autonomously learn optimal strategies for shifting or reducing energy loads, contributing to grid stability and energy cost savings, particularly in the context of peak demand periods. The ability of RL to adapt to varying DR signals and building-specific constraints presents a significant advancement over traditional rule-based approaches;
  • Advanced scheduling for energy and resource optimization. A unique focus of this review is the application of RL in scheduling algorithms for home automation systems, particularly in relation to energy consumption, occupancy prediction, and appliance usage. This review explores how RL and DRL can optimize multiobjective scheduling problems, balancing comfort, energy efficiency, and operational costs. Such applications are critical for ensuring flexible home and prosumers systems, capable of responding to dynamic energy demands and varying occupant needs;
  • Integration of RL and DRL with RES and energy storage systems. One of the most novel aspects of this review is the examination of how RL and DRL techniques can be used to manage RESs, such as solar and wind, in conjunction with energy storage systems, especially important for modern and future prosumer applications. By enabling intelligent decision-making about when to store, use, or sell generated energy, RL and DRL algorithms can help maximize the self-consumption of renewables and ensure grid or microgrid independence. This is particularly important in homes and buildings aiming for net-zero energy performance, as RL-driven strategies can optimize the use of intermittent RES in real time;
  • Bridging the gap between theory and practice. While much of the existing research on RL in building automation remains theoretical or simulation-based, this review uniquely emphasizes the need for practical case studies and real-world implementations. It identifies key challenges such as scalability, data availability, and heterogeneous system integration, offering insights into how these challenges can be overcome when deploying RL-based systems in operational environments.
Section 2 sets forth the principal assumptions and methodology of the review, whereas Section 3 contains a synthetic analysis of selected publications. The following Section 4 provides detailed information and characterization of the application areas of RL and DRL techniques and algorithms in homes, along with the identification of gaps and challenges. On this basis, in Section 5, the authors identify potential opportunities and development trends, with particular attention to the role of RL and DRL in the organization of advanced energy management systems for homes and prosumers in local microgrids. In the final Section 6, the conclusions are presented, along with an outline of future work.

2. Methodology of the Review

The methodology of this review is informed by the latest developments in home and building automation systems, including an analysis of the potential for incorporating RL and DRL algorithms into the various functional areas of energy management, considering the guidelines for the SRI [55] first introduced by the EPBD 2018 [56]. Consequently, the initial stage of the selection process entailed verifying the number of publications addressing the subjects of building automation, home automation and reinforcement learning in the primary bibliographic databases of scientific and technical literature. The outcomes of this preliminary selection process are presented in Table 1.
It is notable that there is a paucity of scientific publications from the area that combine two threads: home and building automation and RL. In particular, the number of articles and individual review studies on the use of RL in home automation and smart homes is limited, indicating a relatively low level of interest among researchers and scientists in this application area for RL. Conversely, a substantial number of publications in these thematic areas were identified in the Google Scholar database, which encompasses a comprehensive range of academic materials, including those that may not meet traditional bibliographic and scientific standards. This suggests that ongoing processes of technological and technical development are occurring regarding solutions based on RL that are dedicated to applications in building and home automation.
However, it is important to note that the factors driving this development are relatively recent. They have emerged in recent years, largely driven by the necessity to align building infrastructure with new standards governing the smart grid, integration of RES, energy storage units and charging stations for electric vehicles. Furthermore, in Europe, the necessity to develop and implement novel, sophisticated energy and demand management mechanisms in residential and commercial buildings is driven by the requirements set forth in the aforementioned EPBD directive and the SRI indicator [11,13,57,58].
To gain insight into the current state of knowledge and latest developments in RL solutions and algorithms in home and building automation, a literature review was conducted. This involved selecting relevant publications from technical science publishers. These included Springer, ScienceDirect, MDPI, IEEE Xplore (journals and conferences), and additionally Taylor and Francis, the ACM Digital Library, and the Wiley Online Library. The findings are presented in Table 2.
A detailed examination of the data presented in Table 2 clearly demonstrates that the subject of RL implementation in the functional structures of home and building automation systems is a developing area of research and is being addressed in scientific publications by leading publishers. It should be noted, however, that most publications are not indexed in the Web of Science and Scopus databases. Furthermore, the authors of this review, who have been engaged with the subject of BACS and the impact of automation functions on the energy efficiency of buildings for several years [41,59,60], highlight the necessity for the development and implementation of new, efficient mechanisms for the processing of large amounts of data in such applications. This is a direct consequence of the continual expansion of the infrastructure of both residential and non-residential buildings, particularly in response to the evolving requirements and needs of their users and managers. Thus, there is an increasing number of diverse types of sensors, monitoring modules, and energy and media meters that generate an ever-increasing volume of data, as well as fieldbus-level controllers, which in turn process this data and, based on it, implement usage scenarios for rooms, buildings or entire building campuses, along with public spaces in their surroundings.
This is an area of application in which BACS and Building Management Systems (BMS) have been functioning effectively for a considerable period. They have been installed in a range of larger public, commercial and office buildings, among other locations. However, the growing trend of developing local energy microgrids and smart grid solutions has led to a requirement for the active inclusion of the infrastructure of these facilities, both commercial and residential, in the energy management network. This, in turn, necessitates novel approaches to organizing monitoring and automation functions together with data processing. One of the solutions in this domain is the utilization of external data servers and cloud computing resources [44,46,61]. Moreover, an essential aspect is also the provision of data services and effective processing at the local automation network level (fieldbus) [62,63], with the potential for implementing procedures directly in network nodes (IoT) [52,64] or local automation servers (edge computing) [44,50,65]. It is in this application area that the potential of machine learning mechanisms and algorithms, including RL and DRL with various learning models and data analysis approaches, is seen to be particularly promising. Accordingly, the authors selected several dozen scientific papers directly concerning the simulation and application studies of such algorithms in the field of building automation, with a particular focus on home automation. The decision to focus on home automation was motivated by the limited number of publications, particularly reviews, that simultaneously addressed RL and DRL as well as home automation. Additionally, the growing trend of this application field for RL and DRL further reinforced the choice of home automation as a key area of interest.

3. State of the Art and Practice

In the assumptions for the analysis of the existing state of knowledge in the area of the use of RL and DRL techniques and algorithms in home automation systems, the focus of this review was on threads supporting the implementation of energy management functions. This is a specific area of application in the context of home automation, as the focus so far has been mainly on ensuring comfort, convenience and security in the use of the home infrastructure [18,66,67,68,69]. On the other hand, the trends and changes identified in Sections 1 and 2 indicate the need to carry out research and development work on the integration of mechanisms in home automation systems to support effective energy management, the organization of the functioning of these facilities as prosumers, and so on [8,26,70,71,72,73,74,75,76]. The authors selected four key application areas that support integration, energy and device management, and the operation of prosumer installations in the home: the Internet of Things, Demand Response, Scheduling, and RES + Storage. As a result, Table 3 presents a unique collection of several dozen publications in the form of articles in journals and conference materials (without reviews) from the last five years, which constitute the main core of this review analysis, considering the issues indicated previously. In addition, the graph in Figure 1 shows the link between the research and development work on the application of RL and DRL methods discussed in these publications and the main publishers. The developmental nature of this work is evidenced by the large number of publications by the IEEE Xplore publisher, particularly in the form of conference proceedings and individual scientific articles.
The publications collected in the Table 3 have been divided into four groups according to the areas of application of RL and DRL algorithms identified in them. It should be emphasized that for each of the application groups, the authors of these publications analyzed the possibilities of using different RL and DRL algorithms and techniques. They also assumed slightly different objectives and methods for verifying the effectiveness of the algorithm implementation, which can be summarized as follows:
  • IoT applications
    • Algorithms used: DRL, Deep Q-learning, Q-learning, DDPG
    • Objectives: Focus on optimizing cost and comfort, with additional considerations for autonomy, personalization, and privacy
    • Verification: All experiments and models are verified through simulations
  • DSR applications
    • Algorithms used: A variety including MORL, Q-learning (and its variations with Fuzzy Reasoning), DQN, MARL, PPO, Actor-Critic methods, among others
    • Objectives: Primarily target cost and comfort optimization
    • Verification: Both simulations and some evaluations using real-world data or physical testing setups (e.g., MATLAB and Arduino Uno)
  • Scheduling applications
    • Algorithms used: Q-learning, DQN, PPO, MADDPG, among others
    • Objectives: Focus on cost and comfort optimization, with several entries solely targeting cost
    • Verification: Predominantly simulations, with some studies using practical data from real-world networks
  • RES + Storage applications
    • Algorithms used: TRPO, SAC, Q-learning, PPO, DDPG, and others
    • Objectives: Aimed at optimizing cost and comfort, with a specific focus on energy systems integrating renewable sources and storage
    • Verification: All studies verify their findings through simulations, with some using real-world data from energy markets and PV profiles.
It is noteworthy that the efficacy of RL and DRL algorithms has been evaluated through simulations conducted within the development environment for each analyzed case, and that only a limited number of the simulations included data sets derived from real-world objects. Furthermore, there is a notable absence of detailed descriptions of studies conducted on real-world objects in the subject literature, and a dearth of case studies that would provide insight into the tangible impact of RL and DRL techniques on the efficacy of automation systems, particularly in the context of energy management in residential and commercial buildings. This is one of the issues identified by the authors of this review as a gap and challenge for further research and development work. On this basis, the authors conducted a comprehensive analysis of the characteristics of the solutions and application areas for RL and DRL proposed in the literature, with a particular focus on smaller home automation applications.

4. Applications of Reinforcement Learning for Home Automation

As previously stated, the architectural framework of HEMS can be based on IoT modules, as they facilitate the collection of data on energy consumption, weather conditions, or user presence in real time and remotely, enabling the automatic control of receivers and energy flows [79]. The straightforward integration of IoT devices and the relatively simple control algorithms implemented in them prompt the consideration of more effective control methods, such as DRL [34,112]. In the context of HEMS applications, DRL algorithms are employed with the objective of rationalizing energy consumption in households. In particular, the application of these techniques is beneficial in the context of new home infrastructure related to RES, electric car charging systems and energy storage installations. In such circumstances, the incorporation of DRL mechanisms facilitates the formulation of intelligent decisions pertaining to energy management, with considerations of variables such as energy costs, environmental conditions and user preferences [77,78]. DRL agents learn through interactions with their environment, receiving rewards or penalties for their actions, thereby enabling them to gradually enhance their real-time decision-making capabilities [77]. Moreover, DRL models are employed to forecast future energy availability, consumption patterns, and storage levels by analyzing historical data, weather forecasts, and real-time sensor data. This enables the automation system and users to make decisions that are based on informed reasoning, thereby maximizing the use of renewable energy sources and minimizing dependence on external energy sources. RL-based HEMS systems are capable of adapting energy management strategies in accordance with individual user preferences and changing system conditions, including tariffs and DSM/DSR signals. This enables the achievement of greater energy savings and user comfort. Furthermore, these systems can provide users with recommendations regarding the optimal management of energy consumption [78].
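The agent-environment interaction described above can be made concrete with a minimal, self-contained environment sketch in which the state couples tariff, PV output and battery state of charge. The state variables, the constant household load, and the reward shaping below are illustrative assumptions made for this review, not a benchmark from the surveyed studies.

```python
# A minimal toy HEMS environment sketch. All dynamics (toy tariff, sinusoidal
# PV profile, constant 1.5 kW load) are illustrative assumptions.
import numpy as np

class ToyHEMSEnv:
    """One-day episode at hourly resolution; action = battery power setpoint (kW)."""
    def __init__(self, capacity_kwh: float = 10.0):
        self.capacity = capacity_kwh
        self.reset()

    def reset(self):
        self.hour, self.soc = 0, 0.5          # start at midnight, half-charged
        return self._state()

    def _state(self):
        price = 0.40 if 17 <= self.hour <= 21 else 0.15           # toy tariff
        pv = max(0.0, 4.0 * np.sin(np.pi * (self.hour - 6) / 12))  # kW, daylight
        return np.array([self.hour / 23, price, pv, self.soc], dtype=np.float32)

    def step(self, action_kw: float):
        state = self._state()
        price, pv = float(state[1]), float(state[2])
        load = 1.5                             # assumed constant household load (kW)
        # Positive action charges the battery, negative discharges it.
        charge = float(np.clip(action_kw, -3.0, 3.0))
        self.soc = float(np.clip(self.soc + charge / self.capacity, 0.0, 1.0))
        grid_import = max(0.0, load + charge - pv)
        reward = -price * grid_import          # minimize energy bought from the grid
        self.hour += 1
        return self._state(), reward, self.hour == 24

env = ToyHEMSEnv()
s, total, done = env.reset(), 0.0, False
while not done:                                # random policy as a cost baseline
    s, r, done = env.step(np.random.uniform(-3, 3))
    total += r
print(f"random-policy daily cost proxy: {-total:.2f} EUR")
```

A trained DRL agent replaces the random policy above, choosing the battery setpoint from the state so as to maximize the cumulative reward over the day.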
From the technical and functional organization point of view, energy consumption management through optimal work planning (i.e., switching on and off or changing the mode) of devices, particularly those included in the technical infrastructure of the building (e.g., heating, ventilation, air conditioning, etc.), represents a primary task of modern HEMS. The utilization of control algorithms employing DRL within HEMS facilitates the incorporation of dynamic planning (scheduling) functions. The use of DRL is important for the optimization of scheduling strategies in HEMS because DRL algorithms learn optimal energy management strategies through interactions with the environment, receiving rewards for actions that result in energy and cost savings. This is evidenced by the literature, which cites numerous studies on the subject [3,6,94,95,96,97,98,99,100,101,102,103].
The adaptability and flexibility of DRL algorithms enable management systems based on monitoring and control functions to adjust their operations in response to changing market and environmental conditions, including fluctuations in energy prices and changes in demand. Concurrently, they facilitate the optimal utilization of renewable energy and energy held in storage units, thereby rationalizing energy storage management processes. As previously stated, a key consideration for the implementation of scheduling in HEMS is the incorporation of user preferences, with the objective of aligning energy management strategies with individual requirements. This approach is essential for the acceptance of such a mode of operation by users. Such preferences can be collated through the utilization of user interfaces, which permit the configuration of the system in accordance with user specifications [6,94,95,98,99,101,103]. Alternatively, they can be employed to facilitate the adaptive realignment of the energy management strategy, thereby reducing instances of user dissatisfaction and enhancing overall system efficiency [94,96,98,99,102].
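A common way to encode such user preferences in the scheduling studies cited above is a weighted multi-objective reward trading cost against comfort. The quadratic comfort model and the parameter values in the following sketch are assumptions made here for illustration.

```python
# Sketch of a weighted multi-objective scheduling reward: cost vs. comfort,
# with a user-set preference weight. The quadratic comfort penalty is an
# assumption for illustration, not a model from a specific surveyed paper.
def scheduling_reward(energy_cost: float,
                      indoor_temp: float,
                      setpoint: float = 21.0,
                      comfort_weight: float = 0.5) -> float:
    """Higher reward = cheaper operation and smaller deviation from the setpoint."""
    comfort_penalty = (indoor_temp - setpoint) ** 2   # deviation from preference
    return -(1.0 - comfort_weight) * energy_cost - comfort_weight * comfort_penalty

# Raising comfort_weight makes the agent keep the temperature closer to the
# user's setpoint even when electricity is expensive.
print(scheduling_reward(energy_cost=0.8, indoor_temp=19.5, comfort_weight=0.7))
```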
It is becoming increasingly common to integrate RES with HEMS. These sources of energy include photovoltaic (PV) panels, small wind turbines and associated energy storage devices. The primary challenge facing HEMS is the rationalization of energy supplied to buildings by disparate sources, in addition to the effective integration of PV systems and energy storage devices. Subsequently, energy management strategies must be adjusted in a dynamic manner, with a view to ensuring the safety and efficiency of the systems. A significant number of research and technical teams have already indicated that the use of DRL may offer a potential solution to these issues. In the field of energy management optimization, DRL-based algorithms are being developed and implemented with the objective of managing the charging and discharging of energy storage devices in response to changing market conditions and energy demand. A variety of DRL algorithms are employed, including PPO and DDPG, often combined with LSTM networks for time-series forecasting. These facilitate the forecasting of prospective energy conditions and the real-time adaptation of energy management strategies. The primary objective of optimization in such procedures is to minimize operating costs and maximize savings and energy efficiency [104,108,109,110,111].
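As an illustration of the continuous-action setting handled by the DDPG family mentioned above, the sketch below defines a minimal actor network mapping a HEMS state to a battery charge/discharge setpoint. It is a structural sketch only: the critic, replay buffer and training loop of a complete agent are omitted, and the state dimensions and power limits are assumptions.

```python
# Structural sketch of a DDPG-style actor for continuous battery control.
# Dimensions and limits are assumptions; the network below is untrained.
import torch
import torch.nn as nn

class BatteryActor(nn.Module):
    def __init__(self, state_dim: int = 4, max_power_kw: float = 3.0):
        super().__init__()
        self.max_power = max_power_kw
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Tanh(),      # bounded output in (-1, 1)
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        # Scale to a physical setpoint: negative = discharge, positive = charge.
        return self.max_power * self.net(state)

actor = BatteryActor()
state = torch.tensor([[18 / 23, 0.40, 0.0, 0.6]])   # evening peak, no PV, SOC 0.6
with torch.no_grad():
    print(f"charge setpoint: {actor(state).item():+.2f} kW")  # untrained output
```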
In the context of the evolving energy storage market and its integration into residential settings, the dynamic adjustment of strategies for managing the energy stored in these systems represents a pivotal challenge for HEMS. Systems based on DRL can anticipate future price conditions and make optimal decisions regarding the management of energy storage and RES in accordance with these predictions. These approaches facilitate flexible and efficient energy management, which is of particular importance in the context of evolving market conditions and energy demand [105,106,107,110].
The implementation of advanced DRL algorithms is contingent upon the assurance of user data protection and the safeguarding of systems against potential cyber threats. Furthermore, challenges related to the scalability and interoperability of these systems necessitate the development of standards and communication protocols that will facilitate the integration of diverse devices and platforms. The issues of security and scalability were addressed in the article [111]. The utilization of DRL in the management of energy storage and RES in HEMS systems offers substantial benefits in terms of energy consumption optimization, dynamic adjustment of energy management strategies and integration of renewable energy sources.
The utilization of DRL in HEMS also facilitates the implementation of the DSR concept, which represents a promising solution for dynamic energy management. The implementation of DR in HEMS is achieved through a variety of techniques, including load shifting, peak load reduction, and energy storage management [76,80,81,85,87,91]. Furthermore, such systems gather data on energy consumption, energy prices, and environmental conditions, thereby facilitating the optimal management of energy consumption and the maximization of user comfort [82,91,92].
Consideration of user preferences in DR systems integrated with HEMS is essential to enhance user acceptance and satisfaction. Such preferences can be gathered via user interfaces, which permit the configuration of the system in accordance with user requirements [74,80,82,85,88,92]. Obtaining user feedback regarding satisfaction with the energy management strategy is equally important, as it enables the adaptive adjustment of the strategy and the minimization of user dissatisfaction, thus optimizing system efficiency. The deployment of RL algorithms, including Q-learning, DRL, PPO, and Primal-Dual DDPG, is central to the optimization of DR strategies in HEMS. These algorithms learn optimal energy management strategies through interactions with the environment, receiving rewards for actions that result in energy and cost savings [76,81,83,85,87,89,92]. This enables the systems to modify their operational procedures in response to alterations in conditions, such as fluctuations in energy prices and demand [35,74,81,82,88,90]. Mobile applications and other user interfaces afford users the ability to interact with the system, configure preferences, and receive recommendations for optimal energy management. The adaptability of DRL algorithms enables their effective utilization in response to changing market and environmental conditions, such as fluctuations in energy prices and changes in demand. These algorithms facilitate the optimization of renewable energy utilization and energy storage management, thereby minimizing costs and maximizing savings [31,83,84,87,90,93]. Furthermore, DRL models can be employed to forecast future energy requirements and to plan energy consumption in an optimal manner, thus facilitating more effective resource management and cost reduction [35,74,82,88,89].
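As a hedged illustration of how such DR signals can enter a DRL formulation, the sketch below exposes a utility peak flag in the observation of a toy hourly environment, wraps it in the gymnasium interface, and trains a PPO agent on it with stable-baselines3. Only the library APIs are real; the environment dynamics are illustrative assumptions.

```python
# Sketch: PPO on a toy DR environment. Observation = [hour, price, DR flag, SOC];
# action = battery setpoint in [-1, 1], scaled to kW. Dynamics are assumptions.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO

class DREnv(gym.Env):
    def __init__(self):
        self.observation_space = spaces.Box(0.0, 1.0, shape=(4,), dtype=np.float32)
        self.action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.hour, self.soc = 0, 0.5
        return self._obs(), {}

    def _obs(self):
        peak = 1.0 if 17 <= self.hour <= 21 else 0.0     # utility DR signal
        price = 0.40 if peak else 0.15
        return np.array([min(self.hour, 23) / 23, price, peak, self.soc],
                        dtype=np.float32)

    def step(self, action):
        price = float(self._obs()[1])
        charge = float(action[0]) * 3.0                  # kW setpoint
        self.soc = float(np.clip(self.soc + charge / 10.0, 0.0, 1.0))
        grid_import = max(0.0, 1.5 + charge)             # constant 1.5 kW load
        reward = -price * grid_import                    # discharging at peak pays off
        self.hour += 1
        return self._obs(), reward, self.hour == 24, False, {}

model = PPO("MlpPolicy", DREnv(), verbose=0)
model.learn(total_timesteps=20_000)                      # trains the DR policy
```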

4.1. Problems, Gaps and Challenges

As mentioned previously, the conjunction of IoT and DRL technologies in HEMS exhibits considerable promise. Nevertheless, numerous challenges and deficiencies necessitate attention. Further research is required to ascertain the feasibility of implementing these technologies in real-world settings, with particular attention paid to data security, scalability, and integration with existing energy infrastructure. Overcoming these challenges will facilitate the development of more efficient, safe, and widely available energy management systems in households.
The results of the application analyses indicate that HEMS implementing DRL-based scheduling show good results, particularly in simulations. However, several significant gaps and challenges need to be addressed. Firstly, there is a paucity of empirical research on the implementation of DRL applications in HEMS. To assess how these systems cope with changing conditions and user preferences, it is necessary to conduct tests in real-world conditions. Furthermore, additional research is required on data security, scalability and integration with existing infrastructure, as well as real economic analyses. The authors of numerous publications related to the research on the application possibilities of DR technology and advanced DRL algorithms in HEMS have highlighted these issues. In particular, the papers [31,80,81,82,85,88,90,92,93] indicate a requirement for economic analyses and the development of more cost-effective solutions to facilitate the scaling of the technology to a larger number of households. In this context, the research and technical teams also identify challenges from a technological and system security perspective. The collection and processing of substantial data sets pertaining to energy consumption and user preferences carries an inherent risk of privacy violations and vulnerability to cyber-attacks. Accordingly, as the authors of the papers [31,35,81,83,87,89,91,93] indicate, advanced security methods should be developed to ensure the protection of user data and the security of systems against potential threats. This remains a current and significant challenge, particularly considering the diversity of available communication technologies and the trend towards data processing in cloud applications. A further significant limitation is the scalability of DR systems and their interoperability with a range of devices and platforms. It is imperative to develop standards and communication protocols that will facilitate the seamless integration of diverse devices and systems [31,80,82,83,88,90,113].
Two further problematic issues emerge from the analysis of the literature. The first pertains to the integration of sophisticated HEMS incorporating DRL-based DR mechanisms with the existing energy infrastructure and smart grid monitoring and control systems. The authors of the papers [31,81,84,86,88,90,92,93] indicate that, given the current technical and standardization conditions, this process may prove to be both complicated and expensive. It therefore necessitates further investigation into methodologies that will facilitate seamless integration of novel technologies with existing energy management systems, thereby enabling the establishment of consistent standards within this domain. The second challenge is that the efficacy of DRL algorithms is dependent on the quality and quantity of data collected by DR systems. In the absence of sufficiently accurate data, suboptimal decisions and system actions may result. Consequently, research is required to develop methods to enhance data quality and algorithms to address the issue of missing or incomplete data, as suggested in papers [35,74,82,83,86,89,90,93].

5. Opportunities in Application of Reinforcement Learning in Home Automation and Home and Building Energy Management

Considering the identified gaps and challenges pertaining to the implementation of RL algorithmic methodologies in buildings, it is imperative to consider the diverse nature of such facilities when contemplating the prospective avenues for their advancement. While the principal objective is energy efficiency, there are several fundamental differences between smart home automation systems and larger building automation systems, which have an impact on the application and development of RL algorithms in each case. Home automation systems are often focused on the management of energy-consuming appliances, lighting, HVAC systems, and RES, such as rooftop solar PV panels. In some cases, they may also include home energy storage systems, i.e., batteries used to store energy generated from renewable sources. In comparison to large commercial or industrial buildings, these systems are generally smaller in scale, involve fewer interconnected devices, and have more predictable occupant behavior [94,99]. In contrast, building automation systems, particularly in commercial or multi-dwelling environments, manage a more complex array of equipment, including centralized HVAC systems, elevators, large-scale lighting networks, security systems, and RES integrated into microgrids [32,114,115].
Furthermore, recent scientific and technical publications have identified several key application areas, creating new opportunities for the development of the aforementioned RL and DRL mechanisms in energy management and infrastructure of homes and buildings. Table 4 provides a summary of the most important opportunities, assigned to these areas, divided into applications in homes and buildings.
DRL approaches demonstrate considerable potential for enhancing the efficiency of HEMS, thereby improving the energy efficiency of residential buildings under their monitoring and control. Another crucial domain is facilitating the integration of such buildings with smart grid platform functions and DSM/DSR mechanisms operated by distribution system operators (DSO).
Considering the aforementioned opportunities, the authors have identified key applications of RL and DRL in HEMS that facilitate effective energy management while maintaining the comfort and safety of residential buildings:
  • Adaptability and optimization. The utilization of DRL models in HEMS facilitates the dynamic realignment of energy management strategies (storage and consumption) in accordance with fluctuating market and weather conditions, as well as evolving user preferences. This approach has the potential to significantly enhance energy and cost savings [138,139];
  • Integration with renewable energy sources. RL-based HEMS systems facilitate the integration of renewable energy sources, such as photovoltaic panels and wind turbines, thereby enhancing energy independence and reducing reliance on the power grid [138];
  • Demand management and flexibility. The implementation of DRL models in HEMS can facilitate demand management by enabling a more flexible and responsive management of energy consumption, which is of paramount importance in the context of dynamic tariffs and demand response programs [75,139];
  • Data security and privacy. The implementation of DRL in HEMS systems requires the utilization of sophisticated data protection and security methodologies to guarantee the confidentiality of user data and the integrity of the system against the threat of cyberattacks [29].

6. Conclusions

This review has detailed the significant potential of RL and DRL-based algorithms and methods for improving the efficiency of HEMS, particularly in light of the growing complexities of home infrastructure and energy systems as well as their integration with smart grids. The capacity of RL and DRL for real-time, adaptive decision-making introduces novel approaches to energy management in smart homes, offering benefits over conventional, static control methods. The authors have made a distinctive contribution by examining the applications of RL and DRL methods across several key areas, including load scheduling, DSM/DSR, integration with IoT networks, and energy storage management. Furthermore, the authors emphasize that RL and DRL, through their capacity for continuous learning and data-driven control, can enhance flexibility and responsiveness of HEMS to factors such as fluctuating energy tariffs, renewable energy availability, and user preferences [140,141,142].
One of the key contributions of this review is the synthesis of RL and DRL techniques in the context of BACS and IoT frameworks, which demonstrates how interconnected systems can utilize real-time data to autonomously balance energy loads, optimize renewable energy use and adjust consumption patterns for cost savings [31,98]. This integration represents a future-oriented direction for HEMS development, supporting not only household energy optimization but also contributing to grid stability through DSR mechanisms.
Notwithstanding the aforementioned opportunities, the review also identifies challenges that require future research work. Most notable is the scalability of DRL solutions and their integration within diverse residential infrastructures. A crucial next stage in the development of DRL-based HEMS is the validation of the technology through case studies conducted in real-world applications and scenarios. This approach will facilitate a practical evaluation of the algorithms’ performance, economic impact, and user satisfaction. Furthermore, future research should address data analysis, processing and security concerns. Methods should be investigated to enhance algorithmic efficiency in processing high volumes of real-time data, along with its integration and availability. Such studies will facilitate the transition from theoretical models to practical implementations, thereby advancing RL and DRL-based HEMS towards scalable, resilient, and secure applications in residential energy management.

Author Contributions

Conceptualization, J.G. and A.O.; methodology, D.L.; validation, D.L., and J.G.; formal analysis, D.L. and J.G.; investigation, D.L. and J.G.; resources, D.L.; data curation, J.G.; writing—original draft preparation, J.G.; writing—review and editing, A.O.; supervision, J.G. and A.O. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Faculty of Engineering, Automatics, Computer Science and Biomedical Engineering of the AGH University of Krakow as part of a research subsidy for young scientists (Dean’s grants) for 2024. Application number 10.16.120.79990.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflicts of interest.

Nomenclature

A2C Advantage Actor-Critic
ANN Artificial Neural Networks
BACS Building Automation and Control Systems
BMS Building Management Systems
DDPG Deep Deterministic Policy Gradients
DDQN Double Deep Q-network
DERs Distributed Energy Resources
DQN Deep Q-network
DRL Deep Reinforcement Learning
DSM Demand Side Management
DSO Distribution System Operator
DSR Demand Side Response
DTA Dual Targeting Algorithm
EPBD Energy Performance of Buildings Directive
HEMS Home Energy Management Systems
HVAC Heating, Ventilation and Air Conditioning
IoT Internet of Things
LSTM Long Short-Term Memory
MADDPG Multi-agent Deep Deterministic Policy Gradient
MARL Multi-Agent Reinforcement Learning
MORL Multi-Objective Reinforcement Learning
PPO Proximal Policy Optimization
PV Photovoltaic
RES Renewable Energy Sources
RL Reinforcement Learning
SAC Soft Actor-Critic
SRI Smart Readiness Indicator
TD3 Twin Delayed Deep Deterministic Policy Gradient
TRPO Trust Region Policy Optimization
V2G Vehicle-to-Grid

References

  1. Filho, G.P.R.; Villas, L.A.; Gonçalves, V.P.; Pessin, G.; Loureiro, A.A.F.; Ueyama, J. Energy-Efficient Smart Home Systems: Infrastructure and Decision-Making Process. Internet of Things 2019, 5, 153–167. [Google Scholar] [CrossRef]
  2. Pratt, A.; Krishnamurthy, D.; Ruth, M.; Wu, H.; Lunacek, M.; Vaynshenk, P. Transactive Home Energy Management Systems: The Impact of Their Proliferation on the Electric Grid. IEEE Electrification Magazine 2016, 4, 8–14. [Google Scholar] [CrossRef]
  3. Diyan, M.; Silva, B.N.; Han, K. A Multi-Objective Approach for Optimal Energy Management in Smart Home Using the Reinforcement Learning. Sensors 2020, 20, 3450. [Google Scholar] [CrossRef] [PubMed]
  4. Pau, G.; Collotta, M.; Ruano, A.; Qin, J. Smart Home Energy Management. Energies (Basel) 2017, 10, 382. [Google Scholar] [CrossRef]
  5. Umair, M.; Cheema, M.A.; Afzal, B.; Shah, G. Energy Management of Smart Homes over Fog-Based IoT Architecture. Sustainable Computing: Informatics and Systems 2023, 39. [Google Scholar] [CrossRef]
  6. Deanseekeaw, A.; Khortsriwong, N.; Boonraksa, P.; Boonraksa, T.; Marungsri, B. Optimal Load Scheduling for Smart Home Energy Management Using Deep Reinforcement Learning. In Proceedings of the 2024 12th International Electrical Engineering Congress (iEECON); IEEE, March 6 2024; pp. 1–4. [Google Scholar]
  7. Ożadowicz, A.; Grela, J. An Event-Driven Building Energy Management System Enabling Active Demand Side Management. In Proceedings of the 2016 Second International Conference on Event-based Control, Communication, and Signal Processing (EBCCSP); IEEE, June 2016; pp. 1–8.
  8. Verschae, R.; Kato, T.; Matsuyama, T. Energy Management in Prosumer Communities: A Coordinated Approach. Energies (Basel) 2016, 9, 562. [Google Scholar] [CrossRef]
  9. European Parliament Directive (EU) 2024/1275 of the European Parliament and the Council on the Energy Performance of Buildings; EU: Strasbourg, France, 2024.
  10. European Commission. Energy Roadmap 2050; European Commission: Brussels, Belgium, 2012.
  11. Fokaides, P.A.; Panteli, C.; Panayidou, A. How Are the Smart Readiness Indicators Expected to Affect the Energy Performance of Buildings: First Evidence and Perspectives. Sustainability 2020, 12, 9496. [Google Scholar] [CrossRef]
  12. Märzinger, T.; Österreicher, D. Extending the Application of the Smart Readiness Indicator—A Methodology for the Quantitative Assessment of the Load Shifting Potential of Smart Districts. Energies (Basel) 2020, 13, 3507. [Google Scholar] [CrossRef]
  13. Ożadowicz, A. A Hybrid Approach in Design of Building Energy Management System with Smart Readiness Indicator and Building as a Service Concept. Energies (Basel) 2022, 15, 1432. [Google Scholar] [CrossRef]
  14. ISO 52120-1:2021. Energy Performance of Buildings - Contribution of Building Automation, Controls and Building Management - Part 1; ISO: Geneva, Switzerland, 2021.
  15. Favuzza, S.; Ippolito, M.; Massaro, F.; Musca, R.; Riva Sanseverino, E.; Schillaci, G.; Zizzo, G. Building Automation and Control Systems and Electrical Distribution Grids: A Study on the Effects of Loads Control Logics on Power Losses and Peaks. Energies (Basel) 2018, 11, 667. [Google Scholar] [CrossRef]
  16. Mahmood, A.; Baig, F.; Alrajeh, N.; Qasim, U.; Khan, Z.; Javaid, N. An Enhanced System Architecture for Optimized Demand Side Management in Smart Grid. Applied Sciences 2016, 6, 122. [Google Scholar] [CrossRef]
  17. Hou, P.; Yang, G.; Hu, J.; Douglass, P.J.; Xue, Y. A Distributed Transactive Energy Mechanism for Integrating PV and Storage Prosumers in Market Operation. Engineering 2022, 12, 171–182. [Google Scholar] [CrossRef]
  18. Kato, T.; Ishikawa, N.; Yoshida, N. Distributed Autonomous Control of Home Appliances Based on Event Driven Architecture. In Proceedings of the 2017 IEEE 6th Global Conference on Consumer Electronics (GCCE); IEEE, October 2017; pp. 1–2. [Google Scholar]
  19. Charbonnier, F.; Morstyn, T.; McCulloch, M.D. Scalable Multi-Agent Reinforcement Learning for Distributed Control of Residential Energy Flexibility. Appl Energy 2022, 314, 118825. [Google Scholar] [CrossRef]
  20. Delsing, J. Local Cloud Internet of Things Automation: Technology and Business Model Features of Distributed Internet of Things Automation Solutions. IEEE Industrial Electronics Magazine 2017, 11, 8–21. [Google Scholar] [CrossRef]
  21. Yassine, A.; Singh, S.; Hossain, M.S.; Muhammad, G. IoT Big Data Analytics for Smart Homes with Fog and Cloud Computing. Future Generation Computer Systems 2019, 91, 563–573. [Google Scholar] [CrossRef]
  22. Machorro-Cano, I.; Alor-Hernández, G.; Paredes-Valverde, M.A.; Rodríguez-Mazahua, L.; Sánchez-Cervantes, J.L.; Olmedo-Aguirre, J.O. HEMS-IoT: A Big Data and Machine Learning-Based Smart Home System for Energy Saving. Energies (Basel) 2020, 13, 1097. [Google Scholar] [CrossRef]
  23. Bawa, M.; Caganova, D.; Szilva, I.; Spirkova, D. Importance of Internet of Things and Big Data in Building Smart City and What Would Be Its Challenges. In Smart City 360°; Leon-Garcia, A., Lenort, R., Holman, D., Staš, D., Krutilova, V., Wicher, P., Cagáňová, D., Špirková, D., Golej, J., Nguyen, K., Eds.; Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering; Springer International Publishing: Cham, 2016; Vol. 166, pp. 605–616 ISBN 978-3-319-33680-0.
  24. Lawal, K.N.; Olaniyi, T.K.; Gibson, R.M. Leveraging Real-World Data from IoT Devices in a Fog–Cloud Architecture for Resource Optimisation within a Smart Building. Applied Sciences 2023, 14, 316. [Google Scholar] [CrossRef]
  25. Akter, M.N.; Mahmud, M.A.; Oo, A.M.T. A Hierarchical Transactive Energy Management System for Microgrids. In Proceedings of the 2016 IEEE Power and Energy Society General Meeting (PESGM); IEEE, July 2016; Vol. 2016-Novem; pp. 1–5. [Google Scholar]
  26. Taghizad-Tavana, K.; Ghanbari-Ghalehjoughi, M.; Razzaghi-Asl, N.; Nojavan, S.; Alizadeh, A. An Overview of the Architecture of Home Energy Management System as Microgrids, Automation Systems, Communication Protocols, Security, and Cyber Challenges. Sustainability 2022, 14, 15938. [Google Scholar] [CrossRef]
  27. Kiehbadroudinezhad, M.; Merabet, A.; Abo-Khalil, A.G.; Salameh, T.; Ghenai, C. Intelligent and Optimized Microgrids for Future Supply Power from Renewable Energy Resources: A Review. Energies (Basel) 2022, 15, 3359. [Google Scholar] [CrossRef]
  28. Chamana, M.; Schmitt, K.E.K.; Bhatta, R.; Liyanage, S.; Osman, I.; Murshed, M.; Bayne, S.; MacFie, J. Buildings Participation in Resilience Enhancement of Community Microgrids: Synergy Between Microgrid and Building Management Systems. IEEE Access 2022, 10, 100922–100938. [Google Scholar] [CrossRef]
  29. Al-Ani, O.; Das, S. Reinforcement Learning: Theory and Applications in HEMS. Energies (Basel) 2022, 15, 6392. [Google Scholar] [CrossRef]
  30. Wang, Z.; Hong, T. Reinforcement Learning for Building Controls: The Opportunities and Challenges. Appl Energy 2020, 269, 115036. [Google Scholar] [CrossRef]
  31. Benjamin, A.; Badar, A.Q.H. Reinforcement Learning Based Cost-Effective Smart Home Energy Management. In Proceedings of the 2023 IEEE 3rd International Conference on Sustainable Energy and Future Electric Transportation (SEFET); IEEE, August 9 2023; pp. 1–5. [Google Scholar]
  32. Yu, L.; Qin, S.; Zhang, M.; Shen, C.; Jiang, T.; Guan, X. A Review of Deep Reinforcement Learning for Smart Building Energy Management. IEEE Internet Things J 2021, 8, 12046–12063. [Google Scholar] [CrossRef]
  33. Wei, T.; Wang, Y.; Zhu, Q. Deep Reinforcement Learning for Building HVAC Control. In Proceedings of the Proceedings of the 54th Annual Design Automation Conference 2017. ACM: New York, NY, USA, June 18 2017; pp. 1–6.
  34. Yu, L.; Xie, W.; Xie, D.; Zou, Y.; Zhang, D.; Sun, Z.; Zhang, L.; Zhang, Y.; Jiang, T. Deep Reinforcement Learning for Smart Home Energy Management. IEEE Internet Things J 2020, 7, 2751–2762. [Google Scholar] [CrossRef]
  35. Kodama, N.; Harada, T.; Miyazaki, K. Home Energy Management Algorithm Based on Deep Reinforcement Learning Using Multistep Prediction. IEEE Access 2021, 9, 153108–153115. [Google Scholar] [CrossRef]
  36. Perez, K.X.; Baldea, M.; Edgar, T.F. Integrated Smart Appliance Scheduling and HVAC Control for Peak Residential Load Management. In Proceedings of the 2016 American Control Conference (ACC); IEEE, July 2016; Vol. 2016-July; pp. 1458–1463. [Google Scholar]
  37. Tekler, Z.D.; Low, R.; Yuen, C.; Blessing, L. Plug-Mate: An IoT-Based Occupancy-Driven Plug Load Management System in Smart Buildings. Build Environ 2022, 223, 109472. [Google Scholar] [CrossRef]
  38. Fambri, G.; Badami, M.; Tsagkrasoulis, D.; Katsiki, V.; Giannakis, G.; Papanikolaou, A. Demand Flexibility Enabled by Virtual Energy Storage to Improve Renewable Energy Penetration. Energies (Basel) 2020, 13, 5128. [Google Scholar] [CrossRef]
  39. Mancini, F.; Lo Basso, G.; de Santoli, L. Energy Use in Residential Buildings: Impact of Building Automation Control Systems on Energy Performance and Flexibility. Energies (Basel) 2019, 12, 2896. [Google Scholar] [CrossRef]
  40. Liu, Z.; Zhang, X.; Sun, Y.; Zhou, Y. Advanced Controls on Energy Reliability, Flexibility and Occupant-Centric Control for Smart and Energy-Efficient Buildings. Energy Build 2023, 297, 113436. [Google Scholar] [CrossRef]
  41. Babar, M.; Grela, J.; Ożadowicz, A.; Nguyen, P.; Hanzelka, Z.; Kamphuis, I. Energy Flexometer: Transactive Energy-Based Internet of Things Technology. Energies (Basel) 2018, 11, 568. [Google Scholar] [CrossRef]
  42. Chen, Y.; Yang, Y.; Xu, X. Towards Transactive Energy: An Analysis of Information-related Practical Issues. Energy Conversion and Economics 2022, 3, 112–121. [Google Scholar] [CrossRef]
  43. Sheshalani Balasingam; Zapiee, M. K.; Mohana, D. Smart Home Automation System Using IOT. International Journal of Recent Technology and Applied Science 2022, 4, 44–53. [Google Scholar] [CrossRef]
  44. Yar, H.; Imran, A.S.; Khan, Z.A.; Sajjad, M.; Kastrati, Z. Towards Smart Home Automation Using IoT-Enabled Edge-Computing Paradigm. Sensors 2021, 21, 4932. [Google Scholar] [CrossRef] [PubMed]
  45. Almusaylim, Z.A.; Zaman, N. A Review on Smart Home Present State and Challenges: Linked to Context-Awareness Internet of Things (IoT). Wireless Networks 2019, 25, 3193–3204. [Google Scholar] [CrossRef]
  46. Sun, H.; Yu, H.; Fan, G.; Chen, L. Energy and Time Efficient Task Offloading and Resource Allocation on the Generic IoT-Fog-Cloud Architecture. Peer Peer Netw Appl 2020, 13, 548–563. [Google Scholar] [CrossRef]
  47. García-Monge, M.; Zalba, B.; Casas, R.; Cano, E.; Guillén-Lambea, S.; López-Mesa, B.; Martínez, I. Is IoT Monitoring Key to Improve Building Energy Efficiency? Case Study of a Smart Campus in Spain. Energy Build 2023, 285, 112882. [Google Scholar] [CrossRef]
  48. Arif, S.; Khan, M.A.; Rehman, S.U.; Kabir, M.A.; Imran, M. Investigating Smart Home Security: Is Blockchain the Answer? IEEE Access 2020, 8, 117802–117816. [Google Scholar] [CrossRef]
  49. Graveto, V.; Cruz, T.; Simões, P. Security of Building Automation and Control Systems: Survey and Future Research Directions. Comput Secur 2022, 112, 102527. [Google Scholar] [CrossRef]
  50. Parikh, S.; Dave, D.; Patel, R.; Doshi, N. Security and Privacy Issues in Cloud, Fog and Edge Computing. Procedia Comput Sci 2019, 160, 734–739. [Google Scholar] [CrossRef]
  51. Abed, S.; Jaffal, R.; Mohd, B.J. A Review on Blockchain and IoT Integration from Energy, Security and Hardware Perspectives. Wirel Pers Commun 2023, 129, 2079–2122. [Google Scholar] [CrossRef]
  52. Ożadowicz, A. Generic IoT for Smart Buildings and Field-Level Automation—Challenges, Threats, Approaches, and Solutions. Computers 2024, 13, 45. [Google Scholar] [CrossRef]
  53. Yu, J.; Kim, M.; Bang, H.C.; Bae, S.H.; Kim, S.J. IoT as a Applications: Cloud-Based Building Management Systems for the Internet of Things. Multimed Tools Appl 2016, 75, 14583–14596. [Google Scholar] [CrossRef]
  54. Kastner, W.; Kofler, M.; Jung, M.; Gridling, G.; Weidinger, J. Building Automation Systems Integration into the Internet of Things. The IoT6 Approach, Its Realization and Validation. In Proceedings of the 2014 IEEE Emerging Technology and Factory Automation (ETFA); 2014; pp. 1–9.
  55. Verbeke, S.; Aerts, D.; Reynders, G.; Ma, Y.; Waide, P. Final Report on the Technical Support to the Development of a Smart Readiness Indicator for Buildings; Brussels, 2020.
  56. European Parliament. Directive (EU) 2018/844 of the European Parliament and of the Council on the Energy Performance of Buildings; EU, 2018.
  57. Ramezani, B.; da Silva, M.G.; Simões, N. Application of Smart Readiness Indicator for Mediterranean Buildings in Retrofitting Actions. Energy Build 2021, 249, 111173. [Google Scholar] [CrossRef]
  58. Janhunen, E.; Pulkka, L.; Säynäjoki, A.; Junnila, S. Applicability of the Smart Readiness Indicator for Cold Climate Countries. Buildings 2019, 9. [Google Scholar] [CrossRef]
  59. Ożadowicz, A.; Grela, J. Impact of Building Automation Control Systems on Energy Efficiency — University Building Case Study. In Proceedings of the 2017 22nd IEEE International Conference on Emerging Technologies and Factory Automation (ETFA); IEEE, September 2017; pp. 1–8. [Google Scholar]
  60. Ożadowicz, A.; Grela, J. Energy Saving in the Street Lighting Control System—a New Approach Based on the EN-15232 Standard. Energy Effic 2017, 10, 563–576. [Google Scholar] [CrossRef]
  61. Laroui, M.; Nour, B.; Moungla, H.; Cherif, M.A.; Afifi, H.; Guizani, M. Edge and Fog Computing for IoT: A Survey on Current Research Activities & Future Directions. Comput Commun 2021, 180, 210–231. [Google Scholar] [CrossRef]
  62. Genkin, M.; McArthur, J.J. B-SMART: A Reference Architecture for Artificially Intelligent Autonomic Smart Buildings. Eng Appl Artif Intell 2023, 121, 106063. [Google Scholar] [CrossRef]
  63. Seitz, A.; Johanssen, J.O.; Bruegge, B.; Loftness, V.; Hartkopf, V.; Sturm, M. A Fog Architecture for Decentralized Decision Making in Smart Buildings. In Proceedings of the 2nd International Workshop on Science of Smart City Operations and Platforms Engineering (SCOPE 2017), in partnership with the Global City Teams Challenge; Association for Computing Machinery, Inc; 2017; pp. 34–39. [Google Scholar]
  64. Mansour, M.; Gamal, A.; Ahmed, A.I.; Said, L.A.; Elbaz, A.; Herencsar, N.; Soltan, A. Internet of Things: A Comprehensive Overview on Protocols, Architectures, Technologies, Simulation Tools, and Future Directions. Energies (Basel) 2023, 16, 3465. [Google Scholar] [CrossRef]
  65. Yousefpour, A.; Fung, C.; Nguyen, T.; Kadiyala, K.; Jalali, F.; Niakanlahiji, A.; Kong, J.; Jue, J.P. All One Needs to Know about Fog Computing and Related Edge Computing Paradigms: A Complete Survey. Journal of Systems Architecture 2019, 98, 289–330. [Google Scholar] [CrossRef]
  66. Kastner, W.; Jung, M.; Krammer, L. Future Trends in Smart Homes and Buildings. In Industrial Communication Technology Handbook, Second Edition; Zurawski, R., Ed.; CRC Press Taylor & Francis Group, 2015; pp. 59-1–59-20. ISBN 978-1-4822-0732-3.
  67. Lobaccaro, G.; Carlucci, S.; Löfström, E. A Review of Systems and Technologies for Smart Homes and Smart Grids. Energies (Basel) 2016, 9, 1–33. [Google Scholar] [CrossRef]
  68. Bouchabou, D.; Nguyen, S.M.; Lohr, C.; LeDuc, B.; Kanellos, I. A Survey of Human Activity Recognition in Smart Homes Based on IoT Sensors Algorithms: Taxonomies, Challenges, and Opportunities with Deep Learning. Sensors 2021, 21, 6037. [Google Scholar] [CrossRef] [PubMed]
  69. Grela, J.; Ożadowicz, A. Building Automation Planning and Design Tool Implementing EN 15 232 BACS Efficiency Classes. In Proceedings of the 2016 IEEE 21st International Conference on Emerging Technologies and Factory Automation (ETFA); IEEE, September 2016; pp. 1–4. [Google Scholar]
  70. Sharda, S.; Sharma, K.; Singh, M. A Real-Time Automated Scheduling Algorithm with PV Integration for Smart Home Prosumers. Journal of Building Engineering 2021, 44, 102828. [Google Scholar] [CrossRef]
  71. Sangoleye, F.; Jao, J.; Faris, K.; Tsiropoulou, E.E.; Papavassiliou, S. Reinforcement Learning-Based Demand Response Management in Smart Grid Systems With Prosumers. IEEE Syst J 2023, 17, 1797–1807. [Google Scholar] [CrossRef]
  72. Ożadowicz, A. A New Concept of Active Demand Side Management for Energy Efficient Prosumer Microgrids with Smart Building Technologies. Energies (Basel) 2017, 10, 1771. [Google Scholar] [CrossRef]
  73. Sierla, S.; Pourakbari-Kasmaei, M.; Vyatkin, V. A Taxonomy of Machine Learning Applications for Virtual Power Plants and Home/Building Energy Management Systems. Autom Constr 2022, 136, 104174. [Google Scholar] [CrossRef]
  74. Razghandi, M.; Zhou, H.; Erol-Kantarci, M.; Turgut, D. Smart Home Energy Management: Sequence-to-Sequence Load Forecasting and Q-Learning. In Proceedings of the 2021 IEEE Global Communications Conference (GLOBECOM); IEEE, December 2021; pp. 1–6. [Google Scholar]
  75. Zhang, H.; Wu, D.; Boulet, B. A Review of Recent Advances on Reinforcement Learning for Smart Home Energy Management. In Proceedings of the 2020 IEEE Electric Power and Energy Conference (EPEC); IEEE, November 9 2020; pp. 1–6. [Google Scholar]
  76. Lu, R.; Hong, S.H.; Yu, M. Demand Response for Home Energy Management Using Reinforcement Learning and Artificial Neural Network. IEEE Trans Smart Grid 2019, 10, 6629–6639. [Google Scholar] [CrossRef]
  77. Radhamani, R.; Karthick, S.; Kishore Kumar, S.; Gokulraj, M. Deployment of an IoT-Integrated Home Energy Management System Employing Deep Reinforcement Learning. In Proceedings of the 2024 2nd International Conference on Artificial Intelligence and Machine Learning Applications Theme: Healthcare and Internet of Things (AIMLA); IEEE, March 15 2024; pp. 1–4. [Google Scholar]
  78. Dhayalan, V.; Raman, R.; Kalaivani, N.; Shrirvastava, A.; Reddy, R.S.; Meenakshi, B. Smart Renewable Energy Management Using Internet of Things and Reinforcement Learning. In Proceedings of the 2024 2nd International Conference on Computer, Communication and Control (IC4); IEEE, February 8 2024; pp. 1–5.
  79. Wang, Y.; Xiao, R.; Wang, X.; Liu, A. Constructing Autonomous, Personalized, and Private Working Management of Smart Home Products Based on Deep Reinforcement Learning. Procedia CIRP 2023, 119, 72–77. [Google Scholar] [CrossRef]
  80. Chen, S.-J.; Chiu, W.-Y.; Liu, W.-J. User Preference-Based Demand Response for Smart Home Energy Management Using Multiobjective Reinforcement Learning. IEEE Access 2021, 9, 161627–161637. [Google Scholar] [CrossRef]
  81. Angano, W.; Musau, P.; Wekesa, C.W. Design and Testing of a Demand Response Q-Learning Algorithm for a Smart Home Energy Management System. In Proceedings of the 2021 IEEE PES/IAS PowerAfrica; IEEE, August 23 2021; pp. 1–5. [Google Scholar]
  82. Amer, A.A.; Shaban, K.; Massoud, A.M. DRL-HEMS: Deep Reinforcement Learning Agent for Demand Response in Home Energy Management Systems Considering Customers and Operators Perspectives. IEEE Trans Smart Grid 2023, 14, 239–250. [Google Scholar] [CrossRef]
  83. Liu, W.; Wang, Y.; Jiang, F.; Cheng, Y.; Rong, J.; Wang, C.; Peng, J. A Real-Time Demand Response Strategy of Home Energy Management by Using Distributed Deep Reinforcement Learning. In Proceedings of the 2021 IEEE 23rd Int Conf on High Performance Computing & Communications; 7th Int Conf on Data Science & Systems; 19th Int Conf on Smart City; 7th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys); IEEE, December 2021; pp. 988–995.
  84. Alfaverh, F.; Denai, M.; Sun, Y. Demand Response Strategy Based on Reinforcement Learning and Fuzzy Reasoning for Home Energy Management. IEEE Access 2020, 8, 39310–39321. [Google Scholar] [CrossRef]
  85. Li, H.; Wan, Z.; He, H. A Deep Reinforcement Learning Based Approach for Home Energy Management System. In Proceedings of the 2020 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT); IEEE, February 2020; pp. 1–5. [Google Scholar]
  86. Mathew, A.; Roy, A.; Mathew, J. Intelligent Residential Energy Management System Using Deep Reinforcement Learning. IEEE Syst J 2020, 14, 5362–5372. [Google Scholar] [CrossRef]
  87. Ding, H.; Xu, Y.; Chew Si Hao, B.; Li, Q.; Lentzakis, A. A Safe Reinforcement Learning Approach for Multi-Energy Management of Smart Home. Electric Power Systems Research 2022, 210, 108120. [Google Scholar] [CrossRef]
  88. Chu, Y.; Wei, Z.; Sun, G.; Zang, H.; Chen, S.; Zhou, Y. Optimal Home Energy Management Strategy: A Reinforcement Learning Method with Actor-Critic Using Kronecker-Factored Trust Region. Electric Power Systems Research 2022, 212, 108617. [Google Scholar] [CrossRef]
  89. Lissa, P.; Deane, C.; Schukat, M.; Seri, F.; Keane, M.; Barrett, E. Deep Reinforcement Learning for Home Energy Management System Control. Energy and AI 2021, 3, 100043. [Google Scholar] [CrossRef]
  90. Liu, Y.; Zhang, D.; Gooi, H.B. Optimization Strategy Based on Deep Reinforcement Learning for Home Energy Management. CSEE Journal of Power and Energy Systems 2020, 6, 572–582. [Google Scholar] [CrossRef]
  91. Kumari, A.; Tanwar, S. Reinforcement Learning for Multiagent-Based Residential Energy Management System. In Proceedings of the 2021 IEEE Globecom Workshops (GC Wkshps); IEEE, December 2021; pp. 1–6. [Google Scholar]
  92. Kumari, A.; Kakkar, R.; Tanwar, S.; Garg, D.; Polkowski, Z.; Alqahtani, F.; Tolba, A. Multi-Agent-Based Decentralized Residential Energy Management Using Deep Reinforcement Learning. Journal of Building Engineering 2024, 87, 109031. [Google Scholar] [CrossRef]
  93. Amer, A.; Shaban, K.; Massoud, A. Demand Response in HEMSs Using DRL and the Impact of Its Various Configurations and Environmental Changes. Energies (Basel) 2022, 15, 8235. [Google Scholar] [CrossRef]
  94. Roslann, A.; Asuhaimi, F.A.; Ariffin, K.N.Z. Energy Efficient Scheduling in Smart Home Using Deep Reinforcement Learning. In Proceedings of the 2022 IEEE International Conference on Artificial Intelligence in Engineering and Technology (IICAIET); IEEE, September 13 2022; pp. 1–6. [Google Scholar]
  95. Xiong, L.; Tang, Y.; Liu, C.; Mao, S.; Meng, K.; Dong, Z.; Qian, F. Meta-Reinforcement Learning-Based Transferable Scheduling Strategy for Energy Management. IEEE Transactions on Circuits and Systems I: Regular Papers 2023, 70, 1685–1695. [Google Scholar] [CrossRef]
  96. Kahraman, A.; Yang, G. Home Energy Management System Based on Deep Reinforcement Learning Algorithms. In Proceedings of the 2022 IEEE PES Innovative Smart Grid Technologies Conference Europe (ISGT-Europe); IEEE, October 10 2022; pp. 1–5. [Google Scholar]
  97. Aldahmashi, J.; Ma, X. Real-Time Energy Management in Smart Homes Through Deep Reinforcement Learning. IEEE Access 2024, 12, 43155–43172. [Google Scholar] [CrossRef]
  98. Seveiche-Maury, Z.; Arrubla-Hoyos, W. Proposal of a Decision-Making Model for Home Energy Saving through Artificial Intelligence Applied to a HEMS. In Proceedings of the 2023 IEEE Colombian Caribbean Conference (C3); IEEE, November 22 2023; pp. 1–6. [Google Scholar]
  99. Wei, G.; Chi, M.; Liu, Z.-W.; Ge, M.; Li, C.; Liu, X. Deep Reinforcement Learning for Real-Time Energy Management in Smart Home. IEEE Syst J 2023, 17, 2489–2499. [Google Scholar] [CrossRef]
  100. Jiang, F.; Zheng, C.; Gao, D.; Zhang, X.; Liu, W.; Cheng, Y.; Hu, C.; Peng, J. A Novel Multi-Agent Cooperative Reinforcement Learning Method for Home Energy Management under a Peak Power-Limiting. In Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC); IEEE, October 11 2020; pp. 350–355.
  101. Diyan, M.; Khan, M.; Zhenbo, C.; Silva, B.N.; Han, J.; Han, K.J. Intelligent Home Energy Management System Based on Bi-Directional Long-Short Term Memory and Reinforcement Learning. In Proceedings of the 2021 International Conference on Information Networking (ICOIN); IEEE, January 13 2021; pp. 782–787. [Google Scholar]
  102. Zenginis, I.; Vardakas, J.; Koltsaklis, N.E.; Verikoukis, C. Smart Home’s Energy Management Through a Clustering-Based Reinforcement Learning Approach. IEEE Internet Things J 2022, 9, 16363–16371. [Google Scholar] [CrossRef]
  103. Haq, E.U.; Lyu, C.; Xie, P.; Yan, S.; Ahmad, F.; Jia, Y. Implementation of Home Energy Management System Based on Reinforcement Learning. Energy Reports 2022, 8, 560–566. [Google Scholar] [CrossRef]
  104. Thattai, K.; Ravishankar, J.; Li, C. Consumer-Centric Home Energy Management System Using Trust Region Policy Optimization- Based Multi-Agent Deep Reinforcement Learning. In Proceedings of the 2023 IEEE Belgrade PowerTech; IEEE, June 25 2023; pp. 1–6. [Google Scholar]
  105. Langer, L.; Volling, T. A Reinforcement Learning Approach to Home Energy Management for Modulating Heat Pumps and Photovoltaic Systems. Appl Energy 2022, 327, 120020. [Google Scholar] [CrossRef]
  106. Xiong, S.; Liu, D.; Chen, Y.; Zhang, Y.; Cai, X. A Deep Reinforcement Learning Approach Based Energy Management Strategy for Home Energy System Considering the Time-of-Use Price and Real-Time Control of Energy Storage System. Energy Reports 2024, 11, 3501–3508. [Google Scholar] [CrossRef]
  107. Lee, S.; Choi, D.-H. Reinforcement Learning-Based Energy Management of Smart Home with Rooftop Solar Photovoltaic System, Energy Storage System, and Home Appliances. Sensors 2019, 19, 3937. [Google Scholar] [CrossRef] [PubMed]
  108. Abedi, S.; Yoon, S.W.; Kwon, S. Battery Energy Storage Control Using a Reinforcement Learning Approach with Cyclic Time-Dependent Markov Process. International Journal of Electrical Power & Energy Systems 2022, 134, 107368. [Google Scholar] [CrossRef]
  109. Härtel, F.; Bocklisch, T. Minimizing Energy Cost in PV Battery Storage Systems Using Reinforcement Learning. IEEE Access 2023, 11, 39855–39865. [Google Scholar] [CrossRef]
  110. Xu, G.; Shi, J.; Wu, J.; Lu, C.; Wu, C.; Wang, D.; Han, Z. An Optimal Solutions-Guided Deep Reinforcement Learning Approach for Online Energy Storage Control. Appl Energy 2024, 361, 122915. [Google Scholar] [CrossRef]
  111. Wang, B.; Zha, Z.; Zhang, L.; Liu, L.; Fan, H. Deep Reinforcement Learning-Based Security-Constrained Battery Scheduling in Home Energy System. IEEE Transactions on Consumer Electronics 2024, 70, 3548–3561. [Google Scholar] [CrossRef]
  112. Markiewicz, M.; Skała, A.; Grela, J.; Janusz, S.; Stasiak, T.; Latoń, D.; Bielecki, A.; Bańczyk, K. The Architecture for Testing Central Heating Control Algorithms with Feedback from Wireless Temperature Sensors. Energies (Basel) 2023, 16, 5584. [Google Scholar] [CrossRef]
  113. Arun, S.L.; Selvan, M.P. Intelligent Residential Energy Management System for Dynamic Demand Response in Smart Buildings. IEEE Syst J 2017, 1–12. [Google Scholar] [CrossRef]
  114. Pan, Y.; Shen, Y.; Qin, J.; Zhang, L. Deep Reinforcement Learning for Multi-Objective Optimization in BIM-Based Green Building Design. Autom Constr 2024, 166, 105598. [Google Scholar] [CrossRef]
  115. Shaqour, A.; Hagishima, A. Systematic Review on Deep Reinforcement Learning-Based Energy Management for Different Building Types. Energies (Basel) 2022, 15, 8663. [Google Scholar] [CrossRef]
  116. Qi, T.; Ye, C.; Zhao, Y.; Li, L.; Ding, Y. Deep Reinforcement Learning Based Charging Scheduling for Household Electric Vehicles in Active Distribution Network. Journal of Modern Power Systems and Clean Energy 2023, 11, 1890–1901. [Google Scholar] [CrossRef]
  117. Deanseekeaw, A.; Khortsriwong, N.; Boonraksa, P.; Boonraksa, T.; Marungsri, B. Optimal Load Scheduling for Smart Home Energy Management Using Deep Reinforcement Learning. In Proceedings of the 2024 12th International Electrical Engineering Congress (iEECON); IEEE, March 6 2024; pp. 1–4. [Google Scholar]
  118. Jendoubi, I.; Bouffard, F. Multi-Agent Hierarchical Reinforcement Learning for Energy Management. Appl Energy 2023, 332, 120500. [Google Scholar] [CrossRef]
  119. Qin, Y.; Ke, J.; Wang, B.; Filaretov, G.F. Energy Optimization for Regional Buildings Based on Distributed Reinforcement Learning. Sustain Cities Soc 2022, 78, 103625. [Google Scholar] [CrossRef]
  120. Anvari-Moghaddam, A.; Rahimi-Kian, A.; Mirian, M.S.; Guerrero, J.M. A Multi-Agent Based Energy Management Solution for Integrated Buildings and Microgrid System. Appl Energy 2017, 203, 41–56. [Google Scholar] [CrossRef]
  121. Kumar Nunna, H.S.V.S.; Srinivasan, D. Multi-Agent Based Transactive Energy Framework for Distribution Systems with Smart Microgrids. IEEE Trans Industr Inform 2017, 3203, 1–1. [Google Scholar] [CrossRef]
  122. Vamvakas, D.; Michailidis, P.; Korkas, C.; Kosmatopoulos, E. Review and Evaluation of Reinforcement Learning Frameworks on Smart Grid Applications. Energies (Basel) 2023, 16, 5326. [Google Scholar] [CrossRef]
  123. Yu, L.; Xie, W.; Xie, D.; Zou, Y.; Zhang, D.; Sun, Z.; Zhang, L.; Zhang, Y.; Jiang, T. Deep Reinforcement Learning for Smart Home Energy Management. IEEE Internet Things J 2020, 7, 2751–2762. [Google Scholar] [CrossRef]
  124. Zhang, L.; Gao, Y.; Zhu, H.; Tao, L. A Distributed Real-Time Pricing Strategy Based on Reinforcement Learning Approach for Smart Grid. Expert Syst Appl 2022, 191, 116285. [Google Scholar] [CrossRef]
  125. Huang, X.; Zhang, D.; Zhang, X. Energy Management of Intelligent Building Based on Deep Reinforced Learning. Alexandria Engineering Journal 2021, 60, 1509–1517. [Google Scholar] [CrossRef]
  126. Wang, Z.; Xiao, F.; Ran, Y.; Li, Y.; Xu, Y. Scalable Energy Management Approach of Residential Hybrid Energy System Using Multi-Agent Deep Reinforcement Learning. Appl Energy 2024, 367, 123414. [Google Scholar] [CrossRef]
  127. Knap, P.; Gerding, E. Energy Storage in the Smart Grid: A Multi-Agent Deep Reinforcement Learning Approach. In Trends in Clean Energy Research: Selected Papers from the 9th International Conference on Advances on Clean Energy Research (ICACER 2024); Chen, L., Ed.; Springer Nature Switzerland: Cham, 2024; pp. 221–235. [Google Scholar]
  128. Sobhani, A.; Khorshidi, F.; Fakhredanesh, M. DeePLS: Personalize Lighting in Smart Home by Human Detection, Recognition, and Tracking. SN Comput Sci 2023, 4, 773. [Google Scholar] [CrossRef]
  129. Safaei, D.; Sobhani, A.; Kiaei, A.A. DeePLT: Personalized Lighting Facilitates by Trajectory Prediction of Recognized Residents in the Smart Home. International Journal of Information Technology 2024, 16, 2987–2999. [Google Scholar] [CrossRef]
  130. Manganelli, M.; Consalvi, R. Design and Energy Performance Assessment of High-Efficiency Lighting Systems. In Proceedings of the 2015 IEEE 15th International Conference on Environment and Electrical Engineering (EEEIC); IEEE, June 2015; pp. 1035–1040. [Google Scholar]
  131. Liu, J.; Chen, H.-M.; Li, S.; Lin, S. Adaptive and Energy-Saving Smart Lighting Control Based on Deep Q-Network Algorithm. In Proceedings of the 2021 6th International Conference on Control, Robotics and Cybernetics (CRC); IEEE, October 9 2021; pp. 207–211.
  132. Suman, S.; Rivest, F.; Etemad, A. Toward Personalization of User Preferences in Partially Observable Smart Home Environments. IEEE Transactions on Artificial Intelligence 2023, 4, 549–561. [Google Scholar] [CrossRef]
  133. Almilaify, Y.; Nweye, K.; Nagy, Z. SCALEX: SCALability EXploration of Multi-Agent Reinforcement Learning Agents in Grid-Interactive Efficient Buildings. In Proceedings of the 10th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation; ACM: New York, NY, USA, November 15 2023; pp. 261–264.
  134. Khan, M.A.; Saleh, A.M.; Waseem, M.; Sajjad, I.A. Artificial Intelligence Enabled Demand Response: Prospects and Challenges in Smart Grid Environment. IEEE Access 2023, 11, 1477–1505. [Google Scholar] [CrossRef]
  135. Gao, Y.; Li, S.; Xiao, Y.; Dong, W.; Fairbank, M.; Lu, B. An Iterative Optimization and Learning-Based IoT System for Energy Management of Connected Buildings. IEEE Internet Things J 2022, 9, 21246–21259. [Google Scholar] [CrossRef]
  136. Malagnino, A.; Montanaro, T.; Lazoi, M.; Sergi, I.; Corallo, A.; Patrono, L. Building Information Modeling and Internet of Things Integration for Smart and Sustainable Environments: A Review. J Clean Prod 2021, 312, 127716. [Google Scholar] [CrossRef]
  137. Anvari-Moghaddam, A.; Rahimi-Kian, A.; Mirian, M.S.; Guerrero, J.M. A Multi-Agent Based Energy Management Solution for Integrated Buildings and Microgrid System. Appl Energy 2017, 203, 41–56. [Google Scholar] [CrossRef]
  138. Pinthurat, W.; Surinkaew, T.; Hredzak, B. An Overview of Reinforcement Learning-Based Approaches for Smart Home Energy Management Systems with Energy Storages. Renewable and Sustainable Energy Reviews 2024, 202, 114648. [Google Scholar] [CrossRef]
  139. Sheng, R.; Mu, C.; Zhang, X.; Ding, Z.; Sun, C. Review of Home Energy Management Systems Based on Deep Reinforcement Learning. In Proceedings of the 2023 38th Youth Academic Annual Conference of Chinese Association of Automation (YAC); IEEE, 2023; pp. 1239–1244.
  140. Daneshvar, M.; Pesaran, M.; Mohammadi-ivatloo, B. 7 - Transactive Energy in Future Smart Homes. In The Energy Internet; Su, W., Huang, A.Q., Eds.; Woodhead Publishing, 2019; pp. 153–179 ISBN 978-0-08-102207-8.
  141. Rodrigues, S.D.; Garcia, V.J. Transactive Energy in Microgrid Communities: A Systematic Review. Renewable and Sustainable Energy Reviews 2023, 171, 112999. [Google Scholar] [CrossRef]
  142. Nizami, S.; Tushar, W.; Hossain, M.J.; Yuen, C.; Saha, T.; Poor, H.V. Transactive Energy for Low Voltage Residential Networks: A Review. Appl Energy 2022, 323, 119556. [Google Scholar] [CrossRef]
Figure 1. The number of selected publications in four key areas of RL and DRL applications for four major publishers (last 5 years).
Table 1. The literature review results from bibliometric databases.

| Database | Publication Type | Building Automation | Home Automation | Reinforcement Learning | Building Automation + Reinforcement Learning | Home Automation + Reinforcement Learning |
|---|---|---|---|---|---|---|
| Web of Science | Articles | 13,770 | 2,368 | 46,764 | 164 | 20 |
| Web of Science | Reviews | 888 | 179 | 2,007 | 13 | 3 |
| Scopus | Articles | 11,628 | 8,481 | 51,883 | 103 | 101 |
| Scopus | Reviews | 967 | 622 | 3,206 | 9 | 3 |
| Google Scholar | Any type | 3,150,000 | 3,170,000 | 4,680,000 | 250,000 | 204,000 |
| Google Scholar | Reviews | 172,000 | 191,000 | 63,200 | 24,600 | 21,500 |
Table 2. The literature review results from publisher databases.

| Database | Publication Type | Building Automation | Home Automation | Reinforcement Learning | Building Automation + Reinforcement Learning | Home Automation + Reinforcement Learning |
|---|---|---|---|---|---|---|
| Springer | Articles | 36,848 | 15,339 | 64,648 | 2,434 | 1,073 |
| Springer | Reviews | 2,898 | 1,297 | 5,232 | 462 | 173 |
| Science Direct | Articles | 70,247 | 25,541 | 83,149 | 3,760 | 1,312 |
| Science Direct | Reviews | 8,619 | 4,046 | 12,9646 | 1,323 | 579 |
| MDPI | Articles | 1,346 | 379 | 3,797 | 17 | 2 |
| MDPI | Reviews | 133 | 47 | 261 | 7 | 1 |
| IEEE Xplore | Conferences | 28,815 | 8,194 | 31,831 | 430 | 65 |
| IEEE Xplore | Journals | 5,861 | 1,032 | 876 | 202 | 24 |
| Taylor and Francis | Articles | 154,033 | 54,445 | 310,108 | 16,794 | 9,081 |
| Taylor and Francis | Reviews | 4,498 | 1,737 | 5,397 | 512 | 228 |
| ACM Digital Library | All type | 149,403 | 28,663 | 47,165 | 13,237 | 4,325 |
| ACM Digital Library | Reviews | 201 | 43 | 62 | 18 | 4 |
| Wiley Online Library | Journal | 236,565 | 71,939 | 223,645 | 18,307 | 10,196 |
| Wiley Online Library | Books | 45,956 | 18,855 | 36,651 | 5,294 | 3,001 |
Table 3. Recent applications of RL and DRL algorithms in HEMS.

| Reference / Year | Application | Algorithm / Method | Objectives | Verification |
|---|---|---|---|---|
| [77] 2024 | IoT | Deep Reinforcement Learning (DRL) | Cost and Comfort | Simulation |
| [78] 2024 | IoT | Deep Q-learning | Cost and Comfort | Simulation |
| [79] 2023 | IoT | Q-learning | Other (Autonomy, Personalization, and Privacy) | Simulation |
| [34] 2020 | IoT | Deep Deterministic Policy Gradient (DDPG) | Cost and Comfort | Simulation |
| [80] 2021 | Demand Response | Multi-Objective Reinforcement Learning (MORL) | Cost and Comfort | Simulation |
| [81] 2021 | Demand Response | Q-learning | Cost and Comfort | Real (physical system testing using MATLAB and Arduino Uno) |
| [82] 2023 | Demand Response | Deep Q-network (DQN) | Cost and Comfort | Simulation (evaluated using real-world data) |
| [83] 2021 | Demand Response | Multi-Agent Twin Delayed Deep Deterministic Policy Gradient (MATD3) | Cost and Comfort | Simulation (evaluated using real-world data) |
| [84] 2020 | Demand Response | Q-learning combined with fuzzy reasoning | Cost | Simulation |
| [76] 2019 | Demand Response | Multi-Agent Reinforcement Learning (MARL) combined with Artificial Neural Networks (ANN) | Cost and Comfort | Simulation |
| [85] 2020 | Demand Response | Proximal Policy Optimization (PPO) | Cost | Simulation |
| [31] 2023 | Demand Response | Q-learning combined with fuzzy reasoning | Cost and Comfort | Simulation |
| [35] 2021 | Demand Response | DDPG with Dual Targeting Algorithm (DTA) | Cost and Comfort | Simulation |
| [86] 2020 | Demand Response | DQN | Cost | Simulation |
| [87] 2022 | Demand Response | Primal-Dual Deep Deterministic Policy Gradient (PD-DDPG) | Cost | Simulation |
| [88] 2022 | Demand Response | Actor-Critic using Kronecker-Factored Trust Region (ACKTR) | Cost and Comfort | Simulation (evaluated using real-world data) |
| [89] 2021 | Demand Response | DRL | Cost and Comfort | Simulation |
| [90] 2020 | Demand Response | DQN and Double Deep Q-learning (DDQN) | Cost and Comfort | Simulation (validated using a real-world database combined with the household energy storage model) |
| [91] 2021 | Demand Response | Q-learning | Cost | Simulation |
| [92] 2024 | Demand Response | DQN | Cost and Comfort | Simulation |
| [74] 2021 | Demand Response | Q-learning | Cost | Simulation |
| [93] 2022 | Demand Response | DQN | Cost and Comfort | Simulation |
| [6] 2024 | Scheduling | DQN, Advantage Actor-Critic (A2C), and Proximal Policy Optimization (PPO) | Cost | Simulation |
| [94] 2022 | Scheduling | Q-learning | Cost and Comfort | Simulation |
| [95] 2023 | Scheduling | Meta-Reinforcement Learning (Meta-RL) with Long Short-Term Memory (LSTM) | Cost | Simulation (using practical data from Australia's electricity network) |
| [96] 2022 | Scheduling | DQN, DDPG, and Twin Delayed Deep Deterministic Policy Gradient (TD3) | Cost | Simulation |
| [97] 2024 | Scheduling | PPO | Cost | Simulation (using real-world datasets) |
| [98] 2023 | Scheduling | DQN | Cost and Comfort | Real (using real-time data from a test bench with household devices) |
| [99] 2023 | Scheduling | PPO | Cost and Comfort | Simulation (based on real-world data) |
| [100] 2020 | Scheduling | Multi-Agent Deep Deterministic Policy Gradient (MADDPG) | Cost | Simulation |
| [101] 2021 | Scheduling | Q-learning | Cost and Comfort | Simulation |
| [102] 2022 | Scheduling | DDPG | Cost and Comfort | Simulation |
| [103] 2022 | Scheduling | Q-learning | Cost and Comfort | Simulation |
| [3] 2020 | Scheduling | Q-learning | Cost and Comfort | Simulation |
| [104] 2023 | RES + Storage | Trust Region Policy Optimization (TRPO)-based Multi-Agent DRL | Cost and Comfort | Simulation (using real-world data from the Australian National Electricity Market and PV profiles) |
| [105] 2022 | RES + Storage | DDPG | Cost and Comfort | Simulation |
| [106] 2024 | RES + Storage | Soft Actor-Critic (SAC) | Cost | Simulation |
| [107] 2019 | RES + Storage | Q-learning | Cost and Comfort | Simulation |
| [108] 2022 | RES + Storage | Q-learning | Cost | Simulation |
| [109] 2023 | RES + Storage | PPO with LSTM networks | Cost | Simulation |
| [110] 2024 | RES + Storage | DRL (DDPG and PPO) | Cost | Simulation |
| [111] 2024 | RES + Storage | Actor-Critic-based RL with Distributional Critic Net | Cost | Simulation |
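To ground the entries in Table 3, the sketch below implements the simplest method family that appears there: tabular Q-learning used to defer a single shiftable appliance under a time-of-use tariff. It is a minimal, self-contained illustration only; the tariff levels, appliance energy, and comfort penalty are assumptions made for this example and are not taken from any of the cited studies.

```python
import numpy as np

# Minimal tabular Q-learning sketch for one deferrable appliance under a
# time-of-use tariff. All numbers (tariff, appliance energy, comfort
# penalty) are invented for illustration.

rng = np.random.default_rng(0)

HOURS = 24
hours = np.arange(HOURS)
price = np.where((hours >= 17) & (hours < 21), 0.40, 0.15)  # EUR/kWh, peak 17-21h
APPLIANCE_KWH = 1.5      # energy the appliance draws in the slot it runs
DEADLINE = 23            # the job must be finished by this hour
COMFORT_PENALTY = 2.0    # dissatisfaction cost if the job misses the deadline

# State: (hour, job_done); actions: 0 = wait, 1 = run now.
Q = np.zeros((HOURS, 2, 2))
alpha, gamma, eps = 0.1, 0.95, 0.2

def step(hour, done, action):
    """Simulate one hour; return (next_hour, next_done, reward)."""
    reward = 0.0
    if action == 1 and not done:
        reward -= price[hour] * APPLIANCE_KWH   # pay for the energy used
        done = True
    if hour == DEADLINE and not done:
        reward -= COMFORT_PENALTY               # deadline missed
    return hour + 1, done, reward

for _ in range(5000):                           # training episodes, one day each
    hour, done = 0, False
    while hour < HOURS:
        a = rng.integers(2) if rng.random() < eps else int(np.argmax(Q[hour, int(done)]))
        nxt, ndone, r = step(hour, done, a)
        target = r if nxt == HOURS else r + gamma * Q[nxt, int(ndone)].max()
        Q[hour, int(done), a] += alpha * (target - Q[hour, int(done), a])
        hour, done = nxt, ndone

# Greedy rollout: the learned policy runs the appliance in an off-peak slot.
hour, done = 0, False
while hour < HOURS:
    a = int(np.argmax(Q[hour, int(done)]))
    if a == 1 and not done:
        print(f"run appliance at hour {hour} (tariff {price[hour]:.2f} EUR/kWh)")
    hour, done, _ = step(hour, done, a)
```

In this toy setting the greedy policy learned after training moves the appliance out of the assumed 17:00-21:00 peak window, which is the qualitative behavior the demand-response studies in Table 3 pursue; the deep variants listed there (DQN, DDPG, PPO, SAC) replace the Q-table with neural function approximators so that continuous states such as indoor temperature, PV output, and battery level can be handled.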
Table 4. Comparison of opportunities: home vs. building applications of RL and DRL.

| Opportunity | Home Automation | Building Automation |
|---|---|---|
| Demand Response and Load Shifting | RL is used to shift energy-intensive activities to off-peak hours based on dynamic pricing or renewable energy availability [32]; methods such as PPO and A2C optimize the timing of energy use in home devices [116,117] | RL enables buildings to participate in demand response programs by shifting large loads (e.g., elevators, HVAC) to off-peak periods or times of high renewable generation [118]; more complex energy-balancing strategies are needed due to scale [30,119] |
| Integration with Renewable Energy | RL can optimize the use of rooftop solar panels and home batteries by learning when to store energy or sell it back to the grid; the key opportunity lies in coordinating solar generation with storage for maximum efficiency [107] | RL manages larger-scale renewable energy systems (e.g., building-integrated PV, wind turbines), optimizing when to use, store, or sell energy to the grid [120,121]; RL models handle interactions with smart grids and microgrids [122] |
| Energy Storage Management | RL optimizes home battery usage by learning when to store solar energy and when to discharge it during peak demand [89,106,123]; future opportunities include real-time adaptation to energy pricing and household consumption patterns [124,125] | Large buildings with energy storage systems require RL to balance stored energy against grid demand, renewable generation, and internal consumption [115,126]; RL agents coordinate across multiple storage units and energy systems [118,121,127] |
| Smart Lighting and Occupancy-based Control | RL-based lighting systems learn from occupancy sensors and adjust lighting schedules to save energy while maintaining comfort; personalized lighting control based on user habits is a key development area [128,129] | RL for adaptive lighting in large buildings reduces energy waste by adjusting lighting across zones based on occupancy [130]; deep Q-learning has been applied to energy-efficient lighting control in commercial spaces [131] |
| Scalability and Complexity | Home automation systems involve fewer devices and simpler control loops, making it easier to deploy RL models and obtain fast optimization results; future work will focus on personalization and adapting RL to individual preferences [3,132] | Building automation systems are more complex, requiring multi-agent RL to handle diverse, multi-zone environments [133]; scaling RL models to multi-objective optimization across large buildings remains an open research challenge [40] |
| Integration with Smart Grids and IoT | IoT devices in smart homes provide real-time data to RL systems for better energy optimization and appliance control [134]; RL agents can integrate with home microgrids, managing energy flows between renewable sources, storage, and consumption [19,107] | In large buildings, RL facilitates participation in smart grids by managing energy exchange, load balancing, and interactions with external energy markets [122,124]; enhanced IoT connectivity improves RL performance in coordinating building subsystems [135,136] |
| Renewable Energy Prosumers | Homes with solar panels and energy storage can act as "prosumers," with RL optimizing energy generation, consumption, and the sale of excess energy back to the grid [70,71,72] | Buildings with integrated renewable systems participate as prosumers in energy markets, with RL managing the building's contribution to local energy grids and microgrids [28,137] |
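As a complement to the Energy Storage Management row of Table 4, the following sketch applies the same tabular machinery to home battery arbitrage: the agent learns when to buy energy into the battery and when to serve the household load from it. Battery capacity, tariff, and the flat hourly load are illustrative assumptions rather than parameters from the cited works.

```python
import numpy as np

# Minimal tabular Q-learning sketch of home battery arbitrage under a
# time-of-use tariff. All parameters are invented for illustration.

rng = np.random.default_rng(1)

HOURS = 24
hours = np.arange(HOURS)
price = np.where((hours >= 17) & (hours < 21), 0.40, 0.15)  # EUR/kWh
LEVELS = 5               # discretized state of charge: 0..4 kWh
BASE_LOAD = 1.0          # flat household demand per hour (kWh)

# State: (hour, state of charge); actions: 0 = idle, 1 = charge 1 kWh,
# 2 = discharge 1 kWh to cover the household load.
Q = np.zeros((HOURS, LEVELS, 3))
alpha, gamma, eps = 0.1, 0.95, 0.2

def step(h, soc, a):
    """One hour of operation; reward is the negated electricity bill."""
    grid = BASE_LOAD                            # energy bought from the grid
    if a == 1 and soc < LEVELS - 1:
        grid += 1.0; soc += 1                   # buy extra energy into the battery
    elif a == 2 and soc > 0:
        grid -= min(1.0, BASE_LOAD); soc -= 1   # battery serves the load
    return h + 1, soc, -price[h] * grid

for _ in range(20000):
    h, soc = 0, 0
    while h < HOURS:
        a = rng.integers(3) if rng.random() < eps else int(np.argmax(Q[h, soc]))
        nh, nsoc, r = step(h, soc, a)
        target = r if nh == HOURS else r + gamma * Q[nh, nsoc].max()
        Q[h, soc, a] += alpha * (target - Q[h, soc, a])
        h, soc = nh, nsoc

# Greedy rollout: expect charging shortly before, and discharging during,
# the assumed 17-21h peak window.
h, soc = 0, 0
while h < HOURS:
    a = int(np.argmax(Q[h, soc]))
    print(f"hour {h:2d}  SoC={soc}  action={['idle', 'charge', 'discharge'][a]}")
    h, soc, _ = step(h, soc, a)
```

In practice, as the building-automation column of Table 4 notes, realistic storage control must additionally account for round-trip efficiency, battery degradation, PV forecasts, and coordination across multiple storage units, which is where the multi-agent and actor-critic methods cited above come in.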
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.