3.1. Model design in Simulink and Vivado
In this study, we present four approaches to implementing the weighted sum of two neurons on an FPGA, using the Xilinx System Generator (SysGen) for MATLAB Simulink and VHDL coding in the Vivado Design Suite. The first approach uses Block Random Access Memory (BRAM) HDL blocks to implement the complex exponential function of the weighted sum of two neurons in MATLAB Simulink with SysGen. The second approach uses BRAM IP cores and VHDL coding to develop the model in the Vivado Design Suite. In the third approach, CORDIC HDL blocks from the Xilinx toolbox are used in MATLAB Simulink to implement the exponential function. Finally, the fourth approach uses VHDL coding and the CORDIC IP core to implement the weighted sum of two neurons in the Vivado Design Suite.
For each approach, three separate designs were implemented using 8-, 16-, and 32-bit fixed-point data formats.
The weighted sum of two neurons requires several input parameters, including , , , , , , and t. However, for the sake of simplicity, we kept all input parameters fixed except for t across all four approaches. Specifically, we used the following fixed values: , , , , , and . We varied the value of the input parameter t within the range of -4 to 4 to cover a complete oscillation cycle.
The outputs of all the approaches were compared with the MATLAB simulation output of equation (5). The accuracy, latency, and resource utilization of each approach were evaluated and compared to determine the most efficient approach for implementing the weighted sum of two neurons on an FPGA.
Vitis Core Development kit version 2021.1, which includes MATLAB R2021a, Xilinx System Generator, and Vivado 2021.1 (64-bit), is used to design, code, and simulate the projects.
3.1.1. BRAM-based design (using SysGen in MATLAB Simulink)
In the FPGA implementation of complex exponential neurons, a crucial task is to implement the complex exponential function. This is achieved by separately implementing its real and imaginary components, which are the cosine and the sine of the function's argument, respectively; the argument is treated as the input of the exponential function. In this project, we exploit the periodicity of the cosine and sine functions, which have a period of 2π: shifting any angle by 2π leaves its cosine and sine unchanged. To implement the exponential function in BRAM, we therefore restrict the input range to -3.14 to 3.14. The output range is -1 to 1, since both the real (cosine) and imaginary (sine) outputs lie between -1 and 1, inclusive. We set the input resolution to a step size of 0.01, resulting in 629 equally spaced values in this range. This approach simplifies the design and reduces the number of memory elements required to implement the exponential function.
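The table described above can be sketched in a few lines of Python. This is only an illustration of the stated parameters (range -3.14 to 3.14, step 0.01, 629 entries); the nearest-entry addressing function `address` and all names are assumptions made for the example, not the design's actual addressing logic.

```python
import math

STEP = 0.01        # input resolution stated in the text
X_MIN = -3.14      # lower end of the tabulated input range
N_ENTRIES = 629    # number of table entries stated in the text


def build_lut():
    """Return cosine and sine samples at X_MIN + k*STEP for k = 0..628."""
    cos_lut = [math.cos(X_MIN + k * STEP) for k in range(N_ENTRIES)]
    sin_lut = [math.sin(X_MIN + k * STEP) for k in range(N_ENTRIES)]
    return cos_lut, sin_lut


def address(x):
    """Map an input in [-3.14, 3.14] to the nearest table address.

    Hypothetical addressing scheme: nearest entry, clamped to the table.
    """
    k = round((x - X_MIN) / STEP)
    return min(max(k, 0), N_ENTRIES - 1)


cos_lut, sin_lut = build_lut()
```

With nearest-entry addressing, the lookup error before quantization is bounded by half a step times the largest slope of the function, that is, about 0.005 for both cosine and sine.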
In the BRAM-based design using SysGen, the complex exponential function is implemented by mapping a lookup table of input versus output data for the function to the Block Random Access Memory (BRAM) HDL block in MATLAB Simulink. The block is configured with a fixed-point data format. Other graphical HDL blocks, such as an adder, multiplexer, and buffer, are used to implement the weighted sum of two complex exponential neurons shown in Figure 3.
Once the design is complete, an IP core is generated using the Xilinx System Generator tool. This IP core is then instantiated in Vivado IDE to perform behavioral simulation and generate a hardware implementation report.
3.1.2. BRAM-based design in Vivado
In this approach, we used the Block Memory Generator (version 8.4) from Xilinx LogiCORE to configure the Block Random Access Memory (BRAM) that implements the complex exponential function in the Vivado IDE. To ensure the same level of precision and accuracy, we used a lookup table configuration similar to that of the BRAM-based design in MATLAB Simulink for the BRAM mapping.
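One common way to initialize a Block Memory Generator core is a COE file listing the memory contents. The sketch below assumes, purely for illustration, that the cosine table is stored as 16-bit words with 14 fractional bits and written in hexadecimal; the text does not state how the memory was initialized, and the helper names are hypothetical.

```python
import math


def to_twos_complement(value, frac_bits, total_bits):
    """Quantize to signed fixed point and return the raw unsigned bit pattern."""
    raw = round(value * (1 << frac_bits))
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    raw = min(max(raw, lo), hi)              # saturate to the signed range
    return raw & ((1 << total_bits) - 1)     # two's-complement bit pattern


def make_coe(samples, frac_bits=14, total_bits=16):
    """Build the text of a COE file with one hexadecimal word per sample."""
    digits = (total_bits + 3) // 4
    words = [
        format(to_twos_complement(s, frac_bits, total_bits), "0{}x".format(digits))
        for s in samples
    ]
    return (
        "memory_initialization_radix=16;\n"
        "memory_initialization_vector=\n" + ",\n".join(words) + ";\n"
    )


# 629 cosine samples on the grid described for the lookup table.
cos_samples = [math.cos(-3.14 + k * 0.01) for k in range(629)]
coe_text = make_coe(cos_samples)
```

The resulting string can be written to a `.coe` file and selected as the memory initialization file when customizing the core; a second file built from sine samples would serve the imaginary part.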
To implement the weighted sum of two neurons model, we used VHDL coding and a fixed-point data format to develop the necessary components, including the adder and multiplexer. These components were then connected to the BRAM to complete the implementation.
3.1.3. CORDIC-based design (using SysGen in MATLAB Simulink)
In the CORDIC-based design approach, we use the built-in Sin_and_Cos function provided by the Xilinx CORDIC 6.0 HDL block in MATLAB Simulink to implement the complex exponential function efficiently. Since the input of the CORDIC block is limited to the range -π to π, a wrapping subsystem is placed before the CORDIC block to ensure that the input signal falls within this range. This preserves the accuracy and precision of the final output.
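The behavior expected of the wrapping subsystem can be sketched as follows. This is a floating-point reference model only, assuming the angle is in radians; the name `wrap_to_pi` is hypothetical and the hardware subsystem itself is not reproduced here.

```python
import math


def wrap_to_pi(theta):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi


# Inputs at the ends of the -4..4 sweep fall outside [-pi, pi) and are
# shifted by one full period; inputs already inside the range are kept.
wrapped_high = wrap_to_pi(4.0)    # 4 - 2*pi
wrapped_low = wrap_to_pi(-4.0)    # -4 + 2*pi
```

Because the input sweep of -4 to 4 never leaves the range -3π to 3π, a hardware version would need at most one addition or subtraction of 2π per sample, which is presumably why a small comparator-based subsystem is sufficient.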
Once the complex exponential function is obtained, it is combined with other components, such as an adder, multiplexer, and buffer, to implement the weighted sum of two neurons. The fixed-point representation is used throughout the design to maintain precision and reduce the computational overhead.
After finishing the design, an IP core is created using the Xilinx System Generator tool. This IP core is then embedded in Vivado IDE, where behavioral simulation is performed and a hardware implementation report is generated.
3.1.4. CORDIC-based design in Vivado
In the CORDIC-based design approach in Vivado, we use the Xilinx CORDIC (6.0) IP core to implement the complex exponential function using its built-in Sin and Cos functions and parallel architecture.
Similar to the approach in Simulink, we implement a wrapping function, adder, multiplexer, and buffer in VHDL coding to implement the weighted sum of two neurons.
Figure 4 shows the schematic diagram of the CORDIC-based design in Vivado IDE as an example.
3.1.5. The fixed-point implementation
In the given project, the input range lies between -4 and 4, while the output range is between -1 and 1. To represent the integer part of the input value in binary, only three bits are required as 4 in binary can be represented as 100. One bit is used to represent the sign of the input value, and the remaining bits are reserved for the fractional part.
However, for internal operations such as addition and multiplication, the input values go beyond decimal 16. Therefore, five bits are reserved for the integer part to ensure accurate calculations.
Similarly, for the output values, one bit is used to represent the sign, and the remaining bits are used for the fractional part. Since the maximum and minimum output values are 1 and -1, one bit is reserved for decimal 1, and the remaining bits are used for the fractional part.
By utilizing this fixed-point representation technique, we can allocate more bits to the fractional part, thereby enhancing the precision and accuracy of the final output.
Based on the above conditions, for both the Simulink-based and Vivado-based designs, the 8-bit fixed-point implementation configures the input in the Fix8_5 data format, with one sign bit, two integer bits, and five fractional bits. The output uses the Fix8_6 data format, with one sign bit, one integer bit, and six fractional bits.
In the 16-bit fixed-point implementation, the input is configured in the Fix16_13 data format, with one sign bit, two integer bits, and 13 fractional bits. The output uses the Fix16_14 data format, with one sign bit, one integer bit, and 14 fractional bits.
In the 32-bit fixed-point implementation, the input is configured in the Fix32_29 data format, with one sign bit, two integer bits, and 29 fractional bits. The output uses the Fix32_30 data format, with one sign bit, one integer bit, and 30 fractional bits.
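The resolution and representable range implied by each format can be checked with a short Python sketch. It reads FixW_F as a signed two's-complement word of W bits with F fractional bits, and it assumes round-to-nearest with saturation for the quantizer; the rounding and overflow modes actually configured in the designs are not stated in the text, and the function names are hypothetical.

```python
def describe(total_bits, frac_bits):
    """Return resolution and representable range of a signed FixW_F format."""
    scale = 1 << frac_bits
    return {
        "resolution": 1 / scale,
        "min": -(1 << (total_bits - 1)) / scale,
        "max": ((1 << (total_bits - 1)) - 1) / scale,
    }


def quantize(value, total_bits, frac_bits):
    """Quantize to FixW_F, assuming round-to-nearest and saturation.

    Python's round() resolves exact ties to the even integer.
    """
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    raw = min(max(round(value * scale), lo), hi)
    return raw / scale


# (total bits, fractional bits) for the formats named in the text.
INPUT_FORMATS = {"Fix8_5": (8, 5), "Fix16_13": (16, 13), "Fix32_29": (32, 29)}
OUTPUT_FORMATS = {"Fix8_6": (8, 6), "Fix16_14": (16, 14), "Fix32_30": (32, 30)}

input_summary = {name: describe(*fmt) for name, fmt in INPUT_FORMATS.items()}
output_summary = {name: describe(*fmt) for name, fmt in OUTPUT_FORMATS.items()}
```

Under this reading, Fix8_5 has a resolution of 2^-5 = 0.03125 and covers -4 to 3.96875, so an input of exactly +4 would saturate to 3.96875, while Fix8_6 covers -2 to 1.984375 and represents both -1 and +1 exactly.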