3.1. Pulse Response of IDM MOS Capacitors
The pulse response characteristic of the IDM MOS capacitor, illustrated in Figure 3a, indicates that the flat-band voltage (V_fb) undergoes stable positive and negative shifts in response to the voltage polarity switch every 50 pulses. Here, the V_fb shift from the initial V_fb (ΔV_fb) is plotted. Both positive and negative V_fb shifts display clearly nonlinear responses: the change is substantial immediately after the polarity switch and is gradually suppressed as the pulse count increases. Utilizing the approximation formula for nonlinear characteristics [47,48], the estimated nonlinear parameter γ for both positive and negative V_fb shifts is approximately 6, indicating a nearly symmetrical response. The pulse voltage (V_p) dependence illustrated in Figure 3b indicates that both the modulation amplitude and γ increase with V_p, as depicted in Figure 3c. Thus, IDMs exhibit inherently nonlinear and near-symmetric responses, and the degree of nonlinearity varies with the operating conditions, so these characteristics must be considered in synaptic applications.
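To give a feel for what a nonlinear parameter of about 6 means, the following minimal sketch evaluates one commonly used exponential saturation form for the normalized response. Whether this is the exact formula of refs. [47,48] is an assumption, and the function name `normalized_response` is hypothetical; only the 50-pulse period and γ≈6 are taken from the text.

```python
import math

def normalized_response(pulse, n_pulses, gamma):
    """Normalized shift (0 to 1) after `pulse` out of `n_pulses` pulses.

    Assumed form: (1 - exp(-gamma * x)) / (1 - exp(-gamma)), with x = pulse / n_pulses.
    gamma -> 0 recovers a linear response; larger gamma saturates earlier.
    """
    x = pulse / n_pulses
    if gamma == 0:
        return x  # linear limit of the assumed form
    return (1.0 - math.exp(-gamma * x)) / (1.0 - math.exp(-gamma))

# With gamma = 6 and 50 pulses per polarity, most of the shift is reached
# within the first half of the pulse train.
halfway = normalized_response(25, 50, 6.0)
```

Under this assumed form, a response with γ=6 has completed well over 90% of its total shift after half of the pulses, which matches the qualitative description of large initial changes followed by gradual suppression.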
Next, we examine the reasons behind the nonlinear response. The V_p-dependent V_fb shift in Figure 3b incorporates information about both the IDM interface state and the response characteristics, which is useful for analyzing their relationship. ΔV_fb on the y-axis corresponds to the strength of the interface dipoles, as shown in Figure 1a. Here, we assume that the unit dipole switches between two states: large and small. In this scenario, the maximum V_fb shift occurs when all the unit dipoles at the HfO2/SiO2 interface switch due to the electric field, and the opposite maximum V_fb shift occurs when all the unit dipoles switch to the opposite state under the opposite electric field. Additionally, the maximum modulation width of the 2-IDM structure is 0.66 V, as previously reported. In this context, the ratio of un-switched unit dipoles, that is, switchable unit dipoles, is defined as θ_D. In the following discussion, θ_D is estimated from the experimentally obtained ΔV_fb based on the above assumptions. Meanwhile, the modulation rate, dV_fb/dt (V/s), can be estimated from the ΔV_fb shift per pulse, and the oxide electric field E_ox (V/cm) can be estimated from the relationship between the ideal C-V curve of the MOS structure and V_p. Consequently, we can establish the relationship between dV_fb/dt and E_ox, as shown in Figure 4a. It is essential to note that even if the switching rate of the unit dipole is constant, the modulation rate varies depending on θ_D. For instance, a change in θ_D from θ_D=0.5 to θ_D=0.65 or θ_D=0.35 predicts the characteristics (I) and (II) in Figure 4a. However, the experimental results show more significant changes that cannot be explained by a simple θ_D difference.
The experimentally obtained dV_fb/dt is considered to be proportional to the number of switchable unit dipoles. Therefore, the following relationship can be predicted: dV_fb/dt = ΔV_max·k·θ_D, where ΔV_max is the maximum V_fb modulation of 0.66 V, and k is the reaction rate of dipole modulation (s^-1), expressed by the following equation [21]:
$$k = \nu_0 \exp\left(-\frac{E_a - p_{eff} E_{ox}}{k_B T}\right)$$
where ν_0 is the molecular vibrational frequency, typically on the order of 10^13 s^-1, and T and k_B are the temperature (K) and Boltzmann's constant, respectively. From these relationships and the experimentally obtained E_ox dependence, we can estimate the zero-field activation energy E_a (eV) and the effective dipole moment p_eff (eÅ) for each θ_D. The θ_D dependence of E_a and p_eff for positive ΔV_fb shifts is summarized in Figure 4b, and the results estimated by the same analysis for negative ΔV_fb shifts are shown in Figure 4c. In both cases, E_a and p_eff increase when θ_D falls below 0.5. Here, p_eff reflects structural features such as the chemical bonding configuration, and E_a is the energy barrier for structural changes [50,51]. In other words, the bonding configuration contributing to IDM varies depending on θ_D. It has been proposed that IDM is caused by changes in the chemical bonding around the interface Ti atom, and a primitive pulse measurement similar to that in this study suggested that E_a is close to the value corresponding to the breakage of the Ti-O bond [21]. Meanwhile, studies on the dielectric breakdown of gate dielectrics have reported that electric field-induced chemical bond breakage is sensitive to the local bonding configuration [50,51]. Since IDM occurs at an amorphous oxide interface, it is natural that the chemical bonds of interfacial Ti atoms vary in bond length and bond angle. Therefore, it is reasonable to assume that the initial structural change starts from the bonds with low E_a. In addition, the structural change itself may affect neighboring bonds; that is, IDM itself may lead to structural variations with higher E_a. From the above experimental results and considerations, we conclude that the nonlinearity in the IDM response is an unavoidable feature caused by the amorphous oxide interface.
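The extraction of the activation energy and effective dipole moment from the field dependence can be sketched as a linear fit of the logarithm of the modulation rate against E_ox. This is only an illustration: it assumes a field-lowered Arrhenius rate, k = ν_0·exp(−(E_a − p_eff·E_ox)/(k_B·T)), with an attempt frequency of 10^13 s^-1, room temperature, and hypothetical synthetic data; the function names and the numerical values of E_a and p_eff used below are not from the measurements.

```python
import numpy as np

K_B = 8.617333262e-5   # Boltzmann constant (eV/K)
NU_0 = 1e13            # assumed attempt frequency (1/s), order of magnitude only
DV_MAX = 0.66          # maximum V_fb modulation (V), value quoted in the text
ANGSTROM_CM = 1e-8     # 1 angstrom in cm: p_eff [e*A] * E_ox [V/cm] * 1e-8 gives eV

def modulation_rate(e_ox, theta_d, e_a, p_eff, temp=300.0):
    """dV_fb/dt = DV_MAX * k * theta_D with the assumed field-lowered Arrhenius k."""
    barrier = e_a - p_eff * ANGSTROM_CM * np.asarray(e_ox, dtype=float)
    return DV_MAX * theta_d * NU_0 * np.exp(-barrier / (K_B * temp))

def extract_parameters(e_ox, rate, theta_d, temp=300.0):
    """Recover (e_a [eV], p_eff [e*A]) from rate-versus-field data at fixed theta_D."""
    field_mv = np.asarray(e_ox, dtype=float) * 1e-6          # MV/cm keeps the fit well scaled
    y = np.log(np.asarray(rate) / (DV_MAX * theta_d * NU_0))  # = -(e_a - p*E) / (k_B*T)
    slope, intercept = np.polyfit(field_mv, y, 1)
    e_a = -intercept * K_B * temp
    p_eff = slope * K_B * temp / (ANGSTROM_CM * 1e6)
    return e_a, p_eff

# Hypothetical synthetic data: E_a = 1.5 eV, p_eff = 10 e*A, theta_D = 0.5.
fields = np.linspace(4e6, 8e6, 9)                  # V/cm
rates = modulation_rate(fields, 0.5, 1.5, 10.0)
e_a_fit, p_eff_fit = extract_parameters(fields, rates, 0.5)
```

Because the rate is proportional to θ_D in this model, changing θ_D from 0.5 to 0.65 or 0.35 at fixed E_a and p_eff only rescales the curve by a factor of 1.3 or 0.7. Larger experimental changes therefore have to be absorbed by θ_D-dependent E_a and p_eff, which is the point made in the text.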
3.2. Pulse Response of IDMFETs
We can easily predict that converting the threshold voltage (V_th) shift induced by the IDM into a change in the channel current of the FET will result in a response characteristic that differs from the IDM response, since the channel current-gate voltage relationship of the FET is not ideally linear. That is, general I_d-V_g characteristics include at least a linear region and a sub-threshold region [52], representing the coexistence of linear and exponential responses. Before describing the synaptic characteristics of the IDMFET, we briefly discuss the fundamental DC I_d-V_g curve and the pulse-induced I_d change. The DC I_d-V_g curves shown in Figure 5a indicate that a hysteresis of approximately 1 V occurs with a sweeping voltage range of ±4.5 V. To convert the IDM-induced V_th shift into an I_d change, it is suitable to use a read V_g within this hysteresis range. Here, the sub-threshold swing was estimated to be approximately 100 mV/decade, suggesting that a 0.1-V V_th shift is expected to change I_d by an order of magnitude.
The amplitudes of the I_d modulations marked as (I), (II), and (III) in Figure 5a represent the pulse-induced I_d changes observed under different readout V_g voltages and the same pulse conditions. Here, the pulse voltage (V_p) and pulse width (t_p) were set to ±5.4 V and 800 μs, respectively, and the V_p polarity was switched every 300 pulses. The pulse response characteristics (I), (II), and (III) shown in Figure 5b reveal that the I_d increase and decrease exhibit opposite behavior regarding nonlinearity. As for the I_d increase, (I) exhibits a nonlinear response, (II) approaches a linear response, and (III) shows an inverted nonlinear response, meaning that the nonlinear coefficient (ν_+) changes from positive to negative. Regarding the I_d decrease, the nonlinear coefficient (ν_−) is always positive, and the nonlinearity becomes stronger in the order of (I), (II), and (III). Moreover, even with the same read V_g, the nonlinearity changes significantly depending on the pulse voltage V_p [Figure 5c,d]. In the lower graph of Figure 5d, we present the ratio of the nonlinear parameters for I_d increase and decrease (ν_+/ν_−) as an indicator of asymmetry. Here, a ν_+/ν_− ratio closer to 1 indicates a more symmetric response, and smaller V_p values give better symmetry. In summary, the nonlinearity and asymmetry of the IDMFET exhibit complex behavior dependent on the read and pulse conditions. A summary of the ν_+/ν_− ratios measured under various conditions [Figure 5e] shows the general tendency that asymmetry becomes stronger when aiming for a large current ratio (I_max/I_min). This implies that the nonlinearity of the I_d decrease becomes stronger at the same time.
The above behavior regarding the nonlinear and asymmetric response can be roughly understood in terms of basic FET operation as follows. When the I_d modulation lies in the linear region, or in the sub-threshold region with a sufficiently small I_max/I_min ratio, the nonlinear and near-symmetric IDM characteristics are directly reflected in the I_d response. On the other hand, when I_max/I_min is large and the device is operating in the sub-threshold region, the absolute I_d change for a given V_th shift becomes exponentially smaller as the current decreases. That is, during the I_d increase, the IDMFET is insensitive to the V_th shift in the initial stage and gradually becomes sensitive, so the nonlinear characteristics are weakened. Conversely, during the I_d decrease, the IDMFET is sensitive to the V_th shift in the initial stage and gradually becomes insensitive, so the nonlinearity of the FET operation is superimposed on the nonlinear IDM response. A similar effect is expected when I_max/I_min is large and the I_d modulation straddles the linear and sub-threshold regions. The ultimate goal of this study is to verify whether such a nonlinear and asymmetric IDMFET response can be applied to STDP learning.
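The argument above can be illustrated numerically with a minimal sketch. It assumes a purely exponential sub-threshold current with a 100 mV/decade swing, a total V_th shift of 0.1 V (a tenfold current ratio), and a symmetric, saturating IDM shift; the saturation constant `tau` and all function names are hypothetical choices for illustration.

```python
import numpy as np

def idm_shift_fraction(n_pulses=100, tau=20.0):
    """Assumed saturating IDM response: fraction of the total V_th shift after each pulse."""
    n = np.arange(n_pulses + 1)
    return (1.0 - np.exp(-n / tau)) / (1.0 - np.exp(-n_pulses / tau))

def subthreshold_current(vth_shift, i_ref=1.0, swing=0.1):
    """Sub-threshold current for a V_th shift (V); `swing` is in V/decade."""
    return i_ref * 10.0 ** (-np.asarray(vth_shift) / swing)

def early_fraction(values, n_early=10):
    """Share of the total change that is completed after the first `n_early` pulses."""
    return (values[n_early] - values[0]) / (values[-1] - values[0])

s = idm_shift_fraction()
total_shift = 0.1                                         # V, one decade of current
i_up = subthreshold_current(total_shift * (1.0 - s))      # V_th falls, I_d rises tenfold
i_down = subthreshold_current(total_shift * s)            # V_th rises, I_d falls tenfold

frac_idm = early_fraction(s)        # IDM (V_th) response itself
frac_up = early_fraction(i_up)      # I_d increase: early change is suppressed
frac_down = early_fraction(i_down)  # I_d decrease: early change is enhanced
```

With these assumed numbers, the first 10 pulses complete about 40% of the V_th shift, but only about 17% of the current increase and about 66% of the current decrease. The same symmetric IDM response therefore appears weakly nonlinear on the way up and strongly nonlinear on the way down, as described above.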
3.3. Double-Pulse-Controlled Synaptic Operation of IDMFETs
To update the I_d of IDMFETs based on the time difference between pre- and post-synaptic spikes, akin to synaptic weight (w) updates in biological STDP, it is crucial to carefully choose the pre-spike and post-spike waveforms. However, for compatibility with the digital circuits responsible for neuron information processing, it is preferable to avoid complex waveforms as much as possible. We adopted a simple bipolar rectangular waveform, as shown in Figure 6a. Pre- and post-synaptic spikes have waveforms of the same voltage (V_STDP) and pulse width (t_STDP) with a time difference Δt. Assuming that a superimposed waveform of the pre-synaptic and post-synaptic spikes is applied to the gate stack structure, the I_d modulation is expected to depend on Δt, because the period during which a voltage of twice V_STDP is applied coincides with Δt. The application period of V_STDP also changes, but since IDM has an exponential response to E_ox, its contribution is expected to be negligible when an appropriate V_STDP is set.
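The waveform superposition can be checked with a short sketch. The exact shape in Figure 6a is not reproduced here, so the following is an assumption: each spike is +V for one pulse width followed by −V for one pulse width, and the gate stack sees the difference between the pre- and post-synaptic waveforms. The function names are hypothetical, the values V_STDP=3.5 V and t_STDP=200 μs follow the text, and the sign convention of the stack voltage is arbitrary.

```python
import numpy as np

def bipolar_spike(t_us, start_us, width_us, v):
    """Assumed spike: +v for `width_us`, then -v for `width_us`, zero elsewhere."""
    t = np.asarray(t_us)
    first = (t >= start_us) & (t < start_us + width_us)
    second = (t >= start_us + width_us) & (t < start_us + 2 * width_us)
    return v * first.astype(float) - v * second.astype(float)

def stack_voltage(t_us, dt_us, v_stdp=3.5, t_stdp_us=200):
    """Voltage across the gate stack, taken as pre-spike minus delayed post-spike."""
    pre = bipolar_spike(t_us, 0, t_stdp_us, v_stdp)
    post = bipolar_spike(t_us, dt_us, t_stdp_us, v_stdp)
    return pre - post

def double_voltage_time(dt_us, v_stdp=3.5, t_stdp_us=200):
    """Time (us) spent at +2*V_STDP and at -2*V_STDP, sampled on a 1 us grid."""
    t = np.arange(-3 * t_stdp_us, 4 * t_stdp_us)
    v = stack_voltage(t, dt_us, v_stdp, t_stdp_us)
    positive = int(np.sum(np.isclose(v, 2 * v_stdp)))
    negative = int(np.sum(np.isclose(v, -2 * v_stdp)))
    return positive, negative

durations_post_after_pre = double_voltage_time(100)    # post-spike 100 us after pre-spike
durations_post_before_pre = double_voltage_time(-100)  # post-spike 100 us before pre-spike
```

Under these assumptions, the stack is driven at twice V_STDP for a time equal to |Δt| (for |Δt| up to t_STDP), and the polarity of that doubled voltage flips with the sign of Δt. This is the property the double-pulse scheme relies on.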
Figure 6b shows the measurement results in which the sign of Δt alternates every 500 spikes. An increase in I_d is observed at +Δt, and a decrease in I_d at −Δt, indicating the expected STDP-like response. This means that synaptic potentiation occurs when a post-spike is input after a pre-spike, and synaptic depression occurs at the opposite timing. Furthermore, as Δt approaches the t_STDP of 200 μs, the amplitude of the I_d modulation increases, which is a characteristic predicted from the above waveform superposition. On the other hand, we also find that the STDP operation exhibits obvious nonlinear and asymmetric potentiation/depression properties. For example, at Δt=±200 μs, the ν_+/ν_− ratio was estimated to be 0.2, showing an asymmetry similar to that of the previously discussed single-pulse IDMFET response.
To determine whether the pulse-timing-dependent I_d modulation obtained from the IDMFET can be applied to STDP learning, we need to compare the responses for different Δt values acquired within the same I_d range. Therefore, we performed a similar double-pulse measurement with a restricted I_d range, in which the sign of Δt is reversed when I_d leaves the range of 0.8 to 3.0 μA. Figure 7a compares the response characteristics for Δt=±200 μs and ±100 μs. For the latter, more pulses are required before the Δt sign reversal than for the former. Both results exhibit asymmetric response characteristics, and Δt has little effect on the ν_+/ν_− ratio. The ΔI_d-I_d characteristics in Figure 7b are obtained by converting the measured pulse-induced I_d change into an I_d change per pulse (ΔI_d). Here, the impact of the asymmetric response is apparent. Regarding the I_d increase, a slight ΔI_d persists even as I_d approaches 3 μA. However, in the case of the I_d decrease, ΔI_d approaches zero more closely as I_d approaches 0.8 μA. The experimentally obtained ΔI_d-I_d data were fitted with an approximate equation: ΔI_d = α(I_d−I_0) + β(I_d−I_0)^γ, where α, β, γ, and I_0 are constants. In the simulation study described later, the approximate equation of the ΔI_d-I_d data was converted to the synaptic weight, w, in the w range of 0-0.8. The Δw-Δt characteristics of STDP shown in Figure 7c are converted from the experimentally obtained ΔI_d-I_d data and reveal a significant impact of the nonlinear and asymmetric IDMFET response. Under conditions where w is close to zero, potentiation is larger than depression, and potentiation and depression reach equilibrium around w=0.4. As w increases further, depression becomes more prominent. In the following simulations, these nonlinear and asymmetric STDP characteristics are applied to unsupervised pattern learning.
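The fitted per-pulse update and its conversion to a synaptic weight can be sketched as follows. The power-law reading of the approximate equation, the linear mapping of the 0.8-3.0 μA current window onto w=0-0.8, and all parameter values in the example are assumptions for illustration; the fitted constants are not given in the text.

```python
import numpy as np

I_LOW, I_HIGH = 0.8, 3.0   # restricted I_d window (uA), from the text
W_MAX = 0.8                # upper end of the weight range, from the text

def delta_id(i_d, alpha, beta, gamma, i_0):
    """Per-pulse current change: alpha*(I_d - I_0) + beta*(I_d - I_0)**gamma.

    For a non-integer gamma, the base (I_d - I_0) must be non-negative.
    """
    x = np.asarray(i_d, dtype=float) - i_0
    return alpha * x + beta * np.power(x, gamma)

def id_to_weight(i_d):
    """Assumed linear map from the I_d window onto w in [0, W_MAX]."""
    return W_MAX * (np.asarray(i_d, dtype=float) - I_LOW) / (I_HIGH - I_LOW)

def delta_w(i_d, alpha, beta, gamma, i_0):
    """Weight update per pulse implied by delta_id under the same linear map."""
    return W_MAX * delta_id(i_d, alpha, beta, gamma, i_0) / (I_HIGH - I_LOW)

# Hypothetical constants, chosen only to exercise the functions.
example_update = delta_id(2.0, alpha=1.0, beta=2.0, gamma=2.0, i_0=1.0)
```

Separate constants would be needed for potentiation and depression and for each Δt, since the measured responses are asymmetric.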
On the other hand, an obvious variation is observed in the experimental ΔI_d-I_d data in Figure 7b. Figure 7d illustrates the difference between the approximation curve and the measured data across the entire I_d range for the Δt=±200 μs measurement. This variation originates from fluctuations of the IDM device itself and from measurement system noise. Regarding the former, the fluctuation of the IDM response itself and other V_th fluctuations, such as those due to oxide carrier traps, may contribute. In the subsequent simulations, STDP incorporating the distribution of the observed variations is applied.
In general, SNN learning requires an additional w update function that differs from STDP, for example, to set the initial w values and to optimize and adjust the synaptic learning conditions. In this study, an additional w update by FID is applied to adjust the STDP-based unsupervised learning, as described above. We propose a two-pulse-controlled modulation, shown in the inset of Figure 7e, which is highly compatible with our STDP operation. Positive and negative voltage pulses, serving as pre- and post-synaptic spikes, are input to the IDMFET, inducing the w depression shown in Figure 7e. The depression effect becomes stronger with increasing pulse voltage (V_FID) across the entire w range. This depression characteristic is incorporated into the SNN simulations using the same approximate equation as the STDP characteristics.
3.4. Unsupervised Synaptic Learning Based on IDMFET Characteristics
First, let us examine how unsupervised learning combining STDP and FID operates, using a network with N=100 as an example. In this simulation, when a hidden-layer neuron fires, the synapses connected to it are updated by STDP, and subsequently FID is applied to all synapses that underwent STDP (100% FID). For STDP, we used the approximate curve obtained from measurements at V_STDP=3.5 V, and for FID, the approximate curve with V_FID varied in the range of 3.15 to 3.5 V. Random variations drawn from the distributions estimated from the measurements were incorporated into both STDP and FID. The training dynamics in Figure 8a show the average classification accuracy over 10 training/classification cycles, with the shaded area indicating the spread between the maximum and minimum values. Compared with the result at V_FID=3.2 V, a higher V_FID of 3.5 V reaches the maximum accuracy faster but subsequently shows more significant accuracy degradation and fluctuation. Here, the number of training images required to reach 90% of the maximum average accuracy is defined as the learning efficiency (η). While V_FID does not significantly affect the maximum accuracy (Figure 8b), a larger V_FID is noticeably advantageous for learning efficiency (Figure 8c). This is presumably because a larger V_FID enhances the WTA effect, suppressing the probability of overlapping different digit patterns. However, as V_FID increases, the robustness after reaching the maximum accuracy deteriorates, as shown in Figure 8a, suggesting that a large V_FID degrades the information of patterns once learned. Based on the characteristics of the IDMFET obtained in this experiment, a V_FID of around 3.2 V is considered a balanced and favorable condition.
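The learning efficiency defined above can be computed from a training curve in a few lines. The sketch below assumes the curve is available as arrays of training-image counts and average accuracies; the function name and the example numbers are hypothetical and are not simulation results.

```python
import numpy as np

def learning_efficiency(n_images, accuracy, fraction=0.9):
    """Number of training images at which accuracy first reaches
    `fraction` of its maximum value along the curve."""
    n_images = np.asarray(n_images)
    accuracy = np.asarray(accuracy, dtype=float)
    target = fraction * accuracy.max()
    first_index = int(np.argmax(accuracy >= target))  # first True along the curve
    return int(n_images[first_index])

# Hypothetical training curve for illustration only.
images = [100, 200, 300, 400, 500]
avg_accuracy = [0.30, 0.60, 0.75, 0.80, 0.78]
eta = learning_efficiency(images, avg_accuracy)  # target is 0.9 * 0.80 = 0.72
```

Note that a smaller η means faster learning, so a condition that is "advantageous for learning efficiency" corresponds to a lower value of this quantity.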
From the perspective of reducing calculation costs, it is advantageous to minimize the number of FIDs. Figure 8a illustrates the training dynamics when FID pulses are induced randomly with a 50% probability, demonstrating that both the maximum accuracy and the learning efficiency are degraded compared to those with 100% FID. As depicted in Figure 8b,c, the V_FID dependence showed no clear benefit either. We also investigated various FID probabilities and concluded that FID is always required after STDP. This result suggests that FID is effective for properly operating WTA and accumulating the information of training patterns in appropriate synapses. It is worth mentioning that previously reported studies on STDP-based unsupervised learning did not incorporate additional pulses such as FID [32,33,34,37,38,39,40,41]. This difference is presumed to be due to the difference in spike waveforms. Generally, more complex spike waveforms are employed to balance potentiation and depression during STDP, for example, triangular waves and waveforms with different positive/negative shapes, voltages, and widths. In this study, emphasis was placed on the simplicity of spike waveforms and concurrent STDP learning. An important result of this study is that efficient unsupervised learning was achieved within these constraints by adding FID.
Next, we briefly mention the impact of the variation of IDMFETs. The training and classification calculation was also performed without the variation, but no significant differences in classification accuracy or learning efficiency were found (Figure 8b,c). We performed similar calculations with a wider distribution than the experimentally observed variation of IDMFETs and found a decline in learning performance. For example, if the variation is 10 times wider than that of the IDMFETs, the maximum accuracy drops to 70%. This means that while the current level of variation is acceptable, devices with excessive variation should be treated with caution.
Finally, let us discuss the impact of the feature neuron size. Figure 8e illustrates the training dynamics for different values of N, calculated at V_FID=3.25 V. Increasing N decreases the learning efficiency because of the increased number of synapses to be learned; we found a proportional relationship of η=68×N. On the other hand, increasing N can improve classification accuracy, as shown in Figure 8e, in which previously reported accuracy data obtained from similar networks with STDP-based unsupervised learning are compared [32,33,34]. It is important to note that the previous studies were not related to device characteristics or were not based on actual device dynamics. It is evident that even with the STDP characteristics of the IDMFET, introducing suitable FID operations can achieve accuracy equivalent to that of conventional SNNs. Based on these results, the IDMFET is considered a promising candidate as a synaptic device for unsupervised SNN learning. Particularly noteworthy is that, in typical SNN systems, the number of synapses is orders of magnitude larger than that of neurons; therefore, the implementation of high-density synaptic devices using IDMFETs is expected to be highly effective.