Edge computing has emerged as a significant computing trend alongside the rapid expansion of the Internet of Things (IoT). Computing operations at the network’s edge offer a solution to the high latency and service overload challenges often associated with cloud computing. Additionally, the Secure Hash Algorithm-3 (SHA-3) plays a crucial role in ensuring data integrity and is implemented for numerous applications in the IoT field. Therefore, integrating the SHA-3 accelerator into edge computing is essential. Moreover, recent studies about SHA-3 have primarily focused on achieving high performance and optimizing resource utilization for SHA-3. However, these studies have overlooked the crucial aspect of data transfer between the external memory and the hash function block, as transfer time plays a significant role. This paper proposes an efficient SHA-3 architecture designed for System-on-Chip (SoC) Field Programmable Gate Array (FPGA) to address the aforementioned challenges and be suitable for real edge computing. The architecture contains three key techniques. First, the harmony of padding and Direct Memory Access (DMA) for managing the data transfer process and enhancing performance efficiency based on Serial Input to Parallel Output (SIPO) and a barrel shifter. Secondly, implementing internal pipelining within the Round Function (RF) to reduce critical path delays and optimize resource utilization. Finally, designing four modes (SHA3-224, SHA3-256, SHA3-384, and SHA3-512) to cater to various applications. Our architecture is implemented and tested on the DE10-Standard Development Kit (Cyclone V SX SoC-5CSXFC6D6F31C6N), which is integrated into the FPGA and is controlled by a Dual-Core ARM Cortex-A9 processor. The result is up to 38.34 Gbps in throughput and 5.61 Mbps/ALM in efficiency for the RF computation, 28.02 Gbps in throughput, and 3.63 Mbps/ALM for the full proposed architecture.