Submitted: 08 July 2024
Posted: 10 July 2024
Abstract
Keywords:
1. Introduction
- It provides a comprehensive and easy-to-follow description of state-of-the-art edge devices and their underlying architectures.
- It reviews the programming frameworks supported by the processors and general model compression techniques that enable edge computing.
- It analyzes the technical details of the processors for edge computing and provides charts of key hardware parameters.
2. Deep Learning Algorithms in Edge Application
i. Classification
ii. Detection
iii. Speech Recognition and Natural Language Processing
3. Model Compression
i. Quantization
ii. Pruning
iii. Knowledge Distillation
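Of the three compression approaches listed above, quantization is the one most directly supported by the edge toolchains surveyed later (TensorFlow Lite in particular). The following is a minimal, hedged sketch of post-training INT8 quantization with the TensorFlow Lite converter; the Keras model and the random calibration data are placeholders rather than anything evaluated in this survey.

```python
import numpy as np
import tensorflow as tf

# Placeholder model; any Keras model of interest could be used instead.
model = tf.keras.applications.MobileNetV2(weights=None, input_shape=(224, 224, 3))

def representative_dataset():
    # A few calibration samples let the converter estimate activation
    # ranges, which full-integer quantization needs. Random data is used
    # here purely as a stand-in for a real calibration set.
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8    # integer I/O suits INT8 edge accelerators
converter.inference_output_type = tf.int8
tflite_model = converter.convert()

with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
```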
4. Framework for Deep Learning Networks
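For context on how such frameworks are used on-device, below is a minimal inference sketch with the TensorFlow Lite interpreter; the "model_int8.tflite" file is the hypothetical output of the quantization sketch above, and the zero-valued input stands in for real sensor data.

```python
import numpy as np
import tensorflow as tf

# Load the (hypothetical) quantized model produced in the previous sketch.
interpreter = tf.lite.Interpreter(model_path="model_int8.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

# Feed one input tensor of the expected shape and integer dtype.
x = np.zeros(input_details["shape"], dtype=input_details["dtype"])
interpreter.set_tensor(input_details["index"], x)
interpreter.invoke()

logits = interpreter.get_tensor(output_details["index"])
print(logits.shape)
```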
5. Framework for Spiking Neural Networks
6. Edge Processors
i. Dataflow Edge Processor
| Company | Latest Chip | Power (W) | Process (nm) | Area (mm2) | Precision INT/FP | Performance (TOPS) | E. Eff. (TOPS/W) | Architecture | Reference |
|---|---|---|---|---|---|---|---|---|---|
| Apple | M1 | 10 | 5 | 119 | 64 | 11 | 1.1 | Dataflow | [159] | ||
| Apple | A14 | 6 | 5 | 88 | 64 | 11 | 1.83 | Dataflow | [242] | ||
| Apple | A15 | 7 | 5 | 64 | 15.8 | 2.26 | Dataflow | [242] | |||
| Apple | A16 | 5.5 | 4 | 64 | 17 | 3 | Dataflow | [157] | |||
| *AIStorm | AIStorm | 0.225 | 8 | 2.5 | 11 | Dataflow | [243] | ||||
| *AlphaIC | RAP-E | 3 | 8 | 30 | 10 | Dataflow | [244] | ||||
| aiCTX | Dynap-CNN | 0.001 | 22 | 12 | 1 | 0.0002 | 0.2 | Neuromorphic | [15,213] | ||
| *ARM | Ethos78 | 1 | 5 | 16 | 10 | 10 | Dataflow | [160,161] | |||
| *AIMotive | Apache5 IEP | 0.8 | 16 | 121 | 8 | 1.6-32 | 2 | Dataflow | [164,165] | ||
| *Blaize | Pathfinder, El Cano | 6 | 14 | 64, FP-8, BF16 | 16 | 2.7 | Dataflow | [162] |||
| *Bitmain | BM1880 | 2.5 | 28 | 93.52 | 8 | 2 | 0.8 | Dataflow | [245,246] | ||
| *BrainChip | Akida1000 | 2 | 28 | 225 | 1,2,4 | 1.5 | 0.75 | Neuromorphic | [147,148] | ||
| *Canaan | Kendryte K210 | 2 | 28 | 8 | 1.5 | 1.25 | Dataflow | [247,248] |||
| *CEVA | CEVA NeuPro-S | 16 | 2, 5, 8, 12, 16 | 12.7 | Dataflow | [134] |||||
| *CEVA | CEVA NeuPro-M | 0.83 | 16 | 2, 5, 8, 12, 16 | 20 | 24 | Dataflow | [135] |||
| *Cadence | DNA100 | 0.85 | 16 | 16 | 4.6 | 3 | Dataflow | [167,168] | |||
| *Deepvision | ARA-1 | 1.7 | 28 | 8,16 | 4 | 2.35 | Dataflow | [169] | |||
| *Deepvision | ARA-2 | 16 | Dataflow | [170] | |||||||
| *Eta | ECM3532 | 0.01 | 55 | 25 | 8 | 0.001 | 0.1 | Dataflow | [249] | ||
| *FlexLogic | InferX X1 | 13.5 | 7 | 54 | 8 | 7.65 | 0.57 | Dataflow | [250] | ||
| Edge TPU | 2 | 28 | 96 | 8, BF16 | 4 | 2 | Dataflow | [176,177] | |||
| *Gyrfalcon | Lightspeeur 2803S | 0.7 | 28 | 81 | 8 | 16.8 | 24 | PIM | [224] | ||
| *Gyrfalcon | Lightspeeur 5801 | 0.224 | 28 | 36 | 8 | 2.8 | 12.6 | PIM | [224] | ||
| *Gyrfalcon | Janux GS31 | 650/900 | 28 | 10457.5 | 8 | 2150 | 3.30 | PIM | [225] | ||
| *GreenWaves | GAP9 | 0.05 | 22 | 12.25 | FP-(8,16,32) | 0.05 | 1 | Dataflow | [180,181] | ||
| *Horizon | Journey 3 | 2.5 | 16 | 8 | 5 | 2 | Dataflow | [171] | |||
| *Horizon | Journey5/5P | 30 | 16 | 8 | 128 | 4.8 | Dataflow | [172,173] | |||
| *Hailo | Hailo 8 M2 | 2.5 | 28 | 225 | 4,8,16 | 26 | 2.8 | Dataflow | [174,175] | ||
| Intel | Loihi 2 | 0.1 | 7 | 31 | 8 | 0.3 | 3 | Neuromorphic | [9] | ||
| Intel | Loihi | 0.11 | 14 | 60 | 1-9 | 0.03 | 0.3 | Neuromorphic | [9,218] | ||
| *Intel | Intel® Movidius | 2 | 16 | 71.928 | 16 | 4 | 2 | Dataflow | [186] | ||
| IBM | TrueNorth | 0.065 | 28 | 430 | 8 | 0.0581 | 0.4 | Neuromorphic | [10,218] | ||
| IBM | NorthPole | 74 | 12 | 800 | 2,4,8 | 200 (INT8) | 2.7 | Dataflow | [299,304] | ||
| *Imagination | PowerVR Series3NX | FP-(8,16) | 0.60 | Dataflow | [182,183] | ||||||
| *Imagination | IMG 4NX MC1 | 0.417 | 4,16 | 12.5 | 30 | Dataflow | [184] | ||||
| *Imec | DIANA | 22 | 10.244 | 2 | 29.5 (A), 0.14 (D) | 14.4 | PIM+Digital | [222,223] | |||
| *Kalray | MPPA3 | 15 | 16 | 8,16 | 255 | 1.67 | Dataflow | [13] | |||
| *Kneron | KL720 AI | 1.56 | 28 | 81 | 8,16 | 1.4 | 0.9 | Dataflow | [191] | ||
| *Kneron | KL530 | 0.5 | 8 | 1 | 2 | Dataflow | [192] | ||||
| *Koniku | Konicore | Neuromorphic | [12] | ||||||||
| *LeapMind | Efficiera | 0.237 | 12 | 0.422 | 1,2,4,8,16,32 | 6.55 | 27.7 | Dataflow | [21] | ||
| Memryx | MX3 | 1 | -- | -- | 4,8,16 (W) BF16 | 5 | 5 | Dataflow | [297] | ||
| *Mythic | M1108 | 4 | 361 | 8 | 35 | 8.75 | PIM | [89] | |||
| *Mythic | M1076 | 3 | 40 | 294.5 | 8 | 25 | 8.34 | PIM | [18,88,90] | ||
| *MobileEye | EyeQ5 | 10 | 7 | 45 | 4,8 | 24 | 2.4 | Dataflow | [193,194,195] | ||
| *MobileEye | EyeQ6 | 40 | 7 | 4,8 | 128 | 3.2 | Dataflow | [196] |||
| *Mediatek | i350 | 14 | 0.45 | Dataflow | [251] | ||||||
| *NVIDIA | Jetson Nano B01 | 10 | 20 | 118 | FP16 | 1.88 | 0.188 | Dataflow | [197] | ||
| NVIDIA | AGX Orin | 60 | 7 | -- | 8 | 275 | 3.33 | Dataflow | [199] | ||
| *NXP | i.MX 8M+ | 14 | 196 | FP16 | 2.3 | Dataflow | [86,87] | ||||
| *NXP | i.MX9 | 4x10-6 | 12 | Dataflow | [85] | ||||||
| *Perceive | Ergo | 0.073 | 5 | 49 | 8 | 4 | 55 | Dataflow | [252] | ||
| TSU & Polar Bear Tech | QM930 | 12 | 12 | 1089 | 4,8,16 | 20 (INT8) | 1.67 | Dataflow | [302] | ||
| Qualcomm | QCS8250 | 7 | 157.48 | 8 | 15 | Dataflow | [200,201] | ||||
| Qualcomm | Snapdragon 888+ | 5 | 5 | FP32 | 32 | 6.4 | Dataflow | [202,203,204] | |||
| Qualcomm | Snapdragon 8 Gen2 | 4 | 4,8,16, FP16 | 51 | Dataflow | [303] | |||||
| *RockChip | rk3399Pro | 3 | 28 | 729 | 8, 16 | 3 | 1 | Dataflow | [253] | ||
| Rokid | Amlogic A311D | 12 | 5 | Dataflow | [254] | ||||||
| Samsung | Exynos 2100 | 5 | 26 | Dataflow | [205,206] | ||||||
| Samsung | Exynos 2200 | 4 | 8,16, FP16 | Dataflow | [255] | ||||||
| Samsung | HBM-PIM | 0.9 | 20 | 46.88 | 1.2 | 1.34 | PIM | [226,227] | |||
| SiMa.ai | MLSoC | 10 | 16 | 175.55 | 8 | 50 | 5 | Dataflow | [300,301] | ||
| Synopsys | EV7x | 16 | 8, 12, 16 | 2.7 | Dataflow | [209,210] | |||||
| *Syntiant | NDP100 | 0.00014 | 40 | 2.52 | 0.000256 | 20 | PIM | [228,229] | |||
| *Syntiant | NDP101 | 0.0002 | 40 | 25 | 1,2, 4,8 | 0.004 | 20 | PIM | [228,231] | ||
| *Syntiant | NDP102 | 0.0001 | 40 | 4.2921 | 1, 2, 4, 8 | 0.003 | 20 | PIM | [228,235] | ||
| *Syntiant | NDP120 | 0.0005 | 40 | 7.75 | 1,2,4,8 | 0.0019 | 3.8 | PIM | [228,234] | ||
| *Syntiant | NDP200 | 0.001 | 40 | 1,2,4,8 | 0.0064 | 6.4 | PIM | [228,232] | |||
| Think Silicon | NEMA®\|pico XS | 0.0003 | 28 | 0.11 | FP16,32 | 0.0018 | 6 | Dataflow | [256] | ||
| Tesla/Samsung | FSD Chip | 36 | 14 | 260 | 8, FP-8 | 73.72 | 2.04 | Dataflow | [211] | ||
| Videantis | TEMPO | Neuromorphic | [11] | |||||||||
| Verisilicon | VIP9000 | 16 | 16, FP16 | 0.5-100 | Dataflow | [207,208] | |||||
| Untether | TsunAImi | 400 | 16 | 8 | 2008 | 8 | PIM | [236,237] | |||
| UPMEM | UPMEM-PIM | 700 | 20 | 32, 64 | 0.149 | PIM | [238,239,240,241] | ||||
| Company | Product | Supported Neural Networks | Supported Frameworks | Application/benefits |
|---|---|---|---|---|
| Apple | Apple A14 | DNN | TFL | iPhone12 series |
| Apple | Apple A15 | DNN | TFL | iPhone13 series |
| aiCTX-Synsense | Dynap-CNN | CNN, RNN, Reservoir Computing | SNN | High-speed aircraft, IoT, security, healthcare, mobile |
| ARM | Ethos78 | CNN and RNN | TF, TFL,Caffe2,PyTorch, MXNet, ONNX | Automotive |
| AIMotive | Apache5 IEP | GoogleNet, VGG16, 19, Inception-v4, v2, MobileNet v1, ResNet50, Yolo v2 | Caffe2 | Automotives, pedestrian detection, vehicle detection, lane detection, driver status monitoring |
| Blaize | El Cano | CNN, YOLO v3 | TFL | Fit for industrial, retail, smart-city, and computer-vision systems |
| BrainChip | Akida1000 | CNN in SNN, Mobilenet | MetaTF | Online learning, data analytics, security |
| BrainChip | AKD500, 1500, 2000 | DNN | MetaTF | Smart home, Smart health, Smart City and smart transportation |
| CEVA | NeuPro-S | CNN, RNN | TFL | IoTs, smartphones, surveillance, automotive, robotics, medical |
| Cadence | Tensilica DNA100 | FCC, CNN, LSTM | ONNX, Caffe2, TensorFlow | IoT, smartphones, AR/VR, smart surveillance, Autonomous vehicle |
| Deepvision | ARA-1 | Deep Lab V3, Resnet-50, Resnet-152, MobileNet-SSD, YOLO V3, UNET | Caffe2, TFL, MXNET, PyTorch | Smart retail, robotics, industrial automation, smart cities, autonomous vehicles, and more |
| Deepvision | ARA-2 | Model in ARA-1 and LSTM, RNN, | TFL, Pytorch | Smart retail, robotics, industrial automation, smart cities, |
| Eta | ECM3532 | CNN, GRU, LSTM | --- | Smart home, consumer products, medical, logistics, smart industry |
| Gyrfalcon | LightSpeer 2803S | CNN based, VGG, ResNet, MobileNet; | TFL, Caffe2 | High performance audio and video processing |
| Gyrfalcon | LightSpeer 5801 | CNN based, ResNet, MobileNet and VGG16, | TFL, PyTorch & Caffe2 | Object Detection and Tracking,NLP, Visual Analysis |
| Gyrfalcon Edge Server | Janux GS31 | VGG, ResNet, MobileNet | TFL, Caffe2, PyTorch | Smart cities, surveillance, object detection, recognition |
| GreenWaves | GAP9 | CNN, mobileNet v1 | DSP application | |
| Horizon | Journey 3 | CNN, mobilenet v2, efficient net | TFL, Pytorch, ONNX, mxnet, Caffe2 | Automotive |
| Horizon | Journey5/5P | Resnet18, 50, MobileNet v1-v2, ShuffleNetv2, EfficientNet FasterRCNN, Yolov3 | TFL, Pytorch, ONNX, mxnet, Caffe2 | Automotive |
| Hailo | Hailo 8 M2 | YOLO 3, YOLOv4, CenterPose, CenterNet, ResNet-50 | ONNX, TFL | Edge vision applications |
| Intel | Loihi 2 | SNN based NN | Lava, TFL, Pytorch | Online learning, sensing, robotics, healthcare |
| Intel | Loihi | SNN based NN | Nengo | Online learning, robotics, healthcare and many more |
| Imagination | PowerVR Series3NX | Mobilenet v3, CNN | Caffe, TFL | smartphones, smart cameras, drones, automotives, wearables, |
| Imec & GF | DIANA | DNN | TFL, Pytorch | Analog computing in Edge inference |
| KoniKu | Konicore | synthetic Biology+Silicon | -- | Chemical detection, aviation, security |
| Kalray | MPPA3 | Deep network converted to KaNN | Kalray's KANN | Autonomous vehicles, surveillance, robotics, industry, 5G |
| Kneron | KL720 AI | CNN, RNN, LSTM | ONNX, TFL, Keras, Caffe2 | wide applications from automotive to home appliances |
| Kneron | KL520 | Vgg16, Resnet, GoogleNet, YOLO, Lenet, MobileNet, FCC | ONNX, TFL, Keras, Caffe2 | Automotive, home, industry and so on. |
| LeapMind | Efficiera | CNN, YOLO v3, Mobilenet-v2, Lmnet | Blueoil, Python & C++ API | Home, Industrial machinery, surveillance camera, robots |
| Memryx | MX3 | CNN | PyTorch, ONNX, TF, Keras | Automation, surveillance, agriculture, financial |
| Mythic | M1108 | CNN, large complex DNN, Resnet50, YOLO v3, Body25 | Pytorch, TFL, and ONNX | Machine Vision, Electronics, Smart Home, UAV/Drone, Edge Server |
| Mythic | M1076 | CNN, Complex DNN, Resnet50, YOLO v3 | Pytorch, TFL, and ONNX | Surveillance, Vision, voice, Smart Home, UAV, Edge Server |
| MobileEye | EyeQ5 | DNN | Autonomous driving | |
| MobileEye | EyeQ6 | DNN | Autonomous driving | |
| Mediatek | i350 | DNN | TFL | Vision and voice, Biotech and Bio-metric measurements |
| NXP | i.MX 8M+ | DNN | TFL, Arm NN, ONNX | Edge Vision |
| NXP | i.MX9 | CNN, Mobilenet v1 | TFL, Arm NN, ONNX | Graphics, image, display, audio |
| NVIDIA | AGX Orin | DNN | TF, TFL, Caffe, Pytorch | Robotics, Retail, Traffic, Manufacturing |
| Qualcomm | QCS8250 | CNN, GAN, RNN | TFL | smartphone, tablet, support 5G, video and image processing |
| Qualcomm | Snapdragon 888+ | DNN | TFL | Smartphone, tablet, 5G, gaming, video upscaling, image & video processing, |
| RockChip | rk3399Pro | VGG16, ResNet50, Inception4 | TFL, Caffe, mxnet, ONNX, darknet | Smart Home, City, and Industry; face recognition, driving monitoring, |
| Rokid | Amlogic A311D | Inception V3, YoloV2, YOLOV3 | TFL, Caffe2 Darknet | High-performance multimedia |
| Samsung | Exynos 2100 | CNN | TFL | Smartphone, tablet, advanced image signal processing (ISP), 5G |
| Samsung | HBM-PIM | DNN | Pytorch, TFL | Supercomputer and AI application |
| Synopsys | EV7x | CNN, RNN, LSTM | OpenCV, OpenVX and OpenCL C, TFL, Caffe2 | Robotics, autopilot car, vision, SLAM, and DSP algorithms |
| Syntiant | NDP100 | DNN | TFL | Mobile phones, hearing equipment, smartwatches, IoT, remote controls |
| Syntiant | NDP101 | CNN, RNN, GRU, LSTM | TFL | Mobile phones, smart homes, remote controls, smartwatches, IoT |
| Syntiant | NDP102 | CNN, RNN, GRU, LSTM | TFL | Mobile phones, smart homes, remote controls, smartwatches, IoT |
| Syntiant | NDP120 | CNN, RNN, GRU, LSTM | TFL | Mobile phones, smart home, wearables, PC, IoT endpoints, media streamers, AR/VR |
| Syntiant | NDP200 | FC, Conv, DSConv, RNN-GRU, LSTM | TFL | Mobile phones, smart homes, security cameras, video doorbells |
| Think Silicon | Nema PicoXS | DNN | ---- | Wearable and embedded devices |
| Tesla | FSD | CNN | Pytorch | Automotive |
| Verisilicon | VIP9000 | All modern DNN | TF, Pytorch, TFL, DarkNet, ONNX | Can perform as intelligent eye and intelligent ear at the edge |
| Untether | TsunAImi | DNN, ResNet-50, Yolo, Unet, RNN, BERT, TCNs, LSTMs | TFL, Pytorch | NLP, Inference at the edge server or data center |
| UPMEM | UPMEM-PIM | DNN | ----- | Sequence alignment: DNA or protein; Genome assembly: Metagenomic analysis |
ii. Neuromorphic Edge AI Processor
iii. PIM Processor
iv. Processors in Industrial Research
| Research Group | Name | Power (W) | Process (nm) | Area (mm2) | Precision INT/FP* | Performance (TOPS) | E. Eff. (TOPS/W) | Architecture | Reference |
|---|---|---|---|---|---|---|---|---|---|
| TSMC+ NTHU | 2.13E-03 | 22 | 6 | 2,4,8 | 4.18E-01 | 195.7 | PIM | [259] | |
| TSMC | 0.037 | 22 | 0.202 | 4,8,12,16 | 3.3 | 89 | PIM | [257] | |
| TSMC | 0.00142 | 7 | 0.0032 | 4 | 0.372 | 351 | PIM | [258] | |
| Samsung+GIT | FORMS | 66.36 | 32 | 89.15 | 8 | 0.0277 | PIM | [262] | |
| IBM + U Patra | HERMES | 9.61E-02 | 14 | 0.6351 | 8 | 2.1 | 21.9 | PIM | [260] |
| Samsung+ASU | PIMCA | 0.124 | 20.9 | 1,2 | 4.9 | 588 | PIM | [261] | |
| Intel+Cornell U | CAPE | 7 | 9 | 4 | PIM | [263] | |||
| SK Hynix | AiM | 6.08 | 1 | PIM | [264] | ||||
| TSMC | DCIM | 0.0116 | 5 | 0.0133 | 4,8 | 2.95 | 254 | PIM | [265] |
| Samsung | 0.3181 | 4 | 4.74 | 4,8,16, FP16 | 39.3 | 11.59 | Dataflow | [266] | |
| Alibaba + FU | 0.0212 | 28 | 8.7 | 3 | 0.97 | 32.9 | Dataflow | [267] | |
| Alibaba + FU | 0.072 | 65 | 8.7 | 3 | 1 | 8.6 | Dataflow | [267] | |
| Alibaba | 0.978 | 55 | 602.22 | 8 | Dataflow | [268] | |||
| TSMC+ NTHU | 0.00227 | 22 | 18 | 2,4,8 | 0.91 | 960.2 | PIM | [269] | |
| TSMC + NTHU | 0.00543 | 40 | 18 | 2,4,8 | 3.9 | 718 | PIM | [270] | |
| TSMC+GIT | 0.000350 | 40 | 0.027 | 0.0092 | 26.56 | PIM | [271] | ||
| TSMC+GIT | 0.131 | 40 | 25 | 1-8,1-8,32 | 7.989 | 60.64 | PIM | [272] | |
| Intel+UC | 0.0090 | 28 | 0.033 | 1,1 | 20 | 2219 | PIM | [273] | |
| Intel+UC | 0.0194 | 28 | 0.049 | 1-4,1 | 4.8 | 248 | PIM | [274] | |
| TSMC+ NTHU | nvCIM | 0.00398 | 22 | 6 | 2,4 | 5.12 | 1286.4 | PIM | [275] |
| Pi2star +NTHU | 0.00841 | 65 | 12 | 1-8 | 3.16 | 75.9 | PIM | [276] | |
| Pi2star +NTHU | 0.00652 | 65 | 9 | 4,8 | 2 | 35.8 | PIM | [277] | |
| Tsing+NTHU | 0.273 | 28 | 6.82 | 12 | 4.07 | 27.5 | Dataflow | [278] | |
| Samsung | 0.381 | 4 | 4.74 | 4,8,FP16 | 19.7 | 11.59 | Dataflow | [279] | |
| Renesas Electronics | 4.4 | 12 | 60.4 | 13.8 | Dataflow | [280] | |||
| IBM | 6.20 | 7 | 19.6 | 2,4,FP(8,16,32) | 102.4 | 16.5 | Dataflow | [281] | |
| Intel + IMTU | QNAP | 0.132 | 28 | 3.24 | 8 | 2.3 | 17.5 | Dataflow | [282] |
| Samsung | 0.794 | 5 | 5.46 | 8,16 | 29.4 | 13.6 | Dataflow | [283] | |
| Sony | 0.379 | 22 | 61.91 | 8,16,32 | 1.21 | 4.97 | Dataflow | [284] | |
| Mediatek | 1.05 | 7 | 3.04 | 3.6 | 13.32 | Dataflow | [285] | ||
| Pi2star | 0.099 | 65 | 12 | 8 | 1.32 | 13.3 | Dataflow | [286] | |
| Mediatek | 0.0012 | 12 | 0.102 | 86.24 | PIM | [287] | |||
| TSMC+NTHU | 0.10 | 22 | 8.6 | 8,8,8 | 6.96 | 68.9 | PIM | [288] | |
| TSMC+NTHU | 0.099 | 22 | 9.32 | 8,8,8 | 24.8 | 251 | PIM | [289] | |
| ARM+Harvard | 0.04 | 12 | FP4 | 0.734 | 18.1 | Dataflow | [290] | ||
| ARM+Harvard | 0.045 | 12 | FP8 | 0.367 | 8.24 | Dataflow | [291] | ||
| TSMC + NTHU | 0.0037 | 22 | 18 | 8,8,22 | 0.59 | 160.1 | Dataflow | [292] | |
| STMicroelectronics | 0.738 | 18 | 4.24 | 1,1 | 229 | 310 | Dataflow | [293] | |
| STMicroelectronics | 0.740 | 18 | 4.19 | 4,4 | 57 | 77 | Dataflow | [294] | |
| MediaTek | 0.711 | 12 | 1.37 | 12 | 16.5 | 23.2 | PIM | [309] | |
| TSMC+ NTHU | 16 | 8 | 98.5 | PIM | [308] | ||||
| Renesas Electronics | 5.06 | 14 | 8 | 130.55 | 23.9 | Dataflow | [310] |
7. Performance Analysis of the Edge Processors
- Performance: tera-operations per second (TOPS).
- Energy efficiency: TOPS/W.
- Power: Watt (W).
- Area: square millimeter (mm2).
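As a worked illustration of how these metrics relate, energy efficiency is simply peak throughput divided by power draw; the figures in the sketch below are taken from the Apple M1 row of the processor table above, not from new measurements.

```python
# Energy efficiency (TOPS/W) = peak throughput (TOPS) / power draw (W).
def energy_efficiency(tops: float, power_w: float) -> float:
    return tops / power_w

# Example using the Apple M1 row (11 TOPS at 10 W) from the table above.
print(energy_efficiency(11.0, 10.0))  # ~1.1 TOPS/W, matching the E. Eff. column
```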
ii. AI Edge Processor with PIM Architecture
iii. Edge Processors in Industrial Research
8. Summary
References
- M. Merenda et al., “Edge machine learning for AI-enabled IoT devices: A review”, Sensors, 20(9), p.2533, March 2020. DOI: 10.3390/s20092533. [CrossRef]
- M. P. Vestias et al., “Moving deep learning to the edge”, Algorithms, 13(5), p.125, March 2020. DOI: 10.3390/a13050125. [CrossRef]
- IBM, “Why organizations are betting on edge computing?”, May 2020, Accessed on: June 1, 2023. Available: https://www.ibm.com/thought-leadership/institute-business-value/report/edge-computing.
- W. Shi et al., “Edge computing: Vision and challenges”, IEEE internet of things journal, 3(5), pp.637-646. October 2016. DOI: 10.1109/JIOT.2016.2579198. [CrossRef]
- Statista. “IoT: Number of Connected Devices Worldwide 2015–2025”, November 2016, Accessed on: June 5, 2023. Available: https://www.statista.com/statistics/471264/iot-number-of-connected-devices-worldwide/.
- J. M. Chabas et al., “New demand, new markets: What edge computing means for hardware companies”, McKinsey & Company, New York, NY, USA, Tech. Rep. November 2018. Accessed on: August 02, 2023.
- Google, “Cloud TPU”, Available online: https://cloud.google.com/tpu. Accessed on: May 05, 2023.
- Accenture Lab. “Driving intelligence at the edge with neuromorphic computing”, 2021. Accessed on: June 3, 2023. Available: https://www.accenture.com/_acnmedia/PDF-145/Accenture-Neuromorphic-Computing-POV.pdf.
- Intel Labs. Technology Brief, “Taking Neuromorphic Computing to the Next Level with Loihi 2”, 2021. Accessed on: May 10, 2023. Available: https://www.intel.com/content/www/us/en/research/neuromorphic-computing-loihi-2-technology-brief.html.
- F. Akopyan et al., “TrueNorth: Design and tool flow of a 65 mW 1 million neuron programmable neurosynaptic chip”, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 34(10), pp.1537-1557, August 2015. DOI: 10.1109/TCAD.2015.2474396. [CrossRef]
- Videantis. July 2020. Accessed on: June 11, 2022. Available: https://www.videantis.com/videantis-processor-adopted-for-tempo-ai-chip.html.
- Konikore, 2021. Accessed on: July 10, 2022. “A living Breathing Machine”, Available: https://good-design.org/projects/konikore/.
- Kalray, May 2023. Accessed on: July 07, 2023. Available: https://www.kalrayinc.com/press-release/projet-ip-cube/.
- Brainchip, 2023, Accessed on: June 21, 2023. Available: https://brainchipinc.com/akida-neuromorphic-system-on-chip/.
- Synsense, May 2023, Accessed on: June 01, 2023. Available: https://www.synsense-neuromorphic.com/technology.
- Samsung, HBM-PIM, March 2023, Accessed on: July 25, 2023. Available:https://www.samsung.com/semiconductor/solutions/technology/hbm-processing-in-memory/.
- Upmem, Upmem-PIM, October 2019, Accessed on May 07, 2023. Available online: https://www.upmem.com/nextplatform-com-2019-10-03-accelerating-compute-by-cramming-it-into-dram/.
- Mythic, 2021. Accessed on: Feb. 05, 2022. Available: https://www.mythic-ai.com/product/m1076-analog-matrix-processor/.
- Gyrfalcon, Accessed on: March 03, 2023, Available: https://www.gyrfalcontech.ai/solutions/2803s/.
- Syntiant, January 2021. Accessed on: Feb. 07, 2023. Available: https://www.syntiant.com/ndp101;https://www.syntiant.com/post/the-growing-syntiant-core-family.
- Leapmind, Efficiera, July 2023. Accessed on: July 06, 2023, Available: https://leapmind.io/en/news/detail/230801/.
- K. M. Tarwani et al., “Survey on recurrent neural network in natural language processing”, Int. J. Eng. Trends Technol, 48, pp.301-304. June 2017. DOI: 10.14445/22315381/IJETT-V48P253. [CrossRef]
- Y. Goldberg, “A primer on neural network models for natural language processing,” Journal of Artificial Intelligence Research, 57, pp.345-420. October 2015. arXiv:1510.00726v1 [cs.CL].
- L. Yao et al., “An improved LSTM structure for natural language processing,” In 2018 IEEE International Conference of Safety Produce Informatization (IICSPI) (pp. 565-569). December 2018. DOI: 10.1109/IICSPI.2018.8690387. [CrossRef]
- S. Wang et al., “Learning natural language inference with LSTM,” December 2015. arXiv preprint arXiv:1512.08849.
- E. Azari et al., “An Energy-Efficient Reconfigurable LSTM Accelerator for Natural Language Processing,” 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, December 2019, pp. 4450-4459, DOI: 10.1109/BigData47090.2019.9006030. [CrossRef]
- W. Li et al., "Stance Detection of Microblog Text Based on Two-Channel CNN-GRU Fusion Network," in IEEE Access, vol. 7, pp. 145944-145952, Sept. 2019, DOI: 10.1109/ACCESS.2019.2944136. [CrossRef]
- M. Zulqarnain et al., “Efficient processing of GRU based on word embedding for text classification”, JOIV: International Journal on Informatics Visualization, 3(4), pp.377-383. Nov 2019. DOI: 10.30630/joiv.3.4.289. [CrossRef]
- Q. Liu et al., "Content-Guided Convolutional Neural Network for Hyperspectral Image Classification," in IEEE Transactions on Geoscience and Remote Sensing, vol. 58, no. 9, pp. 6124-6137, Sept. 2020, DOI: 10.1109/TGRS.2020.2974134. [CrossRef]
- Kumar et al., "MobiHisNet: A Lightweight CNN in Mobile Edge Computing for Histopathological Image Classification," in IEEE Internet of Things Journal, vol. 8, no. 24, pp. 17778-17789, Dec.15, 2021, DOI: 10.1109/JIOT.2021.3119520. [CrossRef]
- M. Wang, “Multi-path convolutional neural networks for complex image classification.”, Jun. 2015. arXiv preprint arXiv:1506.04701.
- H. Charlton, MacRumors, June 2023. “Apple reportedly planning to switch technology behind A17 bionic chip to cut cost next year”, Accessed on: July 05, 2023. Available: https://www.macrumors.com/2023/06/23/apple-to-switch-tech-behind-a17-to-cut-costs/.
- L. Wang, Taipei Times. “TSMC says new chips to be world’s most advanced”, May 2023, Accessed on: June 25, 2023. Available: https://www.taipeitimes.com/News/biz/archives/2023/05/12/2003799625.
- Samsung, Exynos, April 2022, Accessed on: February 06, 2023. Available: https://www.samsung.com/semiconductor/minisite/exynos/products/all-processors/.
- Z. Q. Lin et al., “Edgespeechnets: Highly efficient deep neural networks for speech recognition on the edge.” arXiv preprint, Nov. 2018. DOI:10.48550/arXiv.1810.08559. [CrossRef]
- T. Shen et al., “The analysis of intelligent real-time image recognition technology based on mobile edge computing and deep learning,” Journal of Real-Time Image Processing, 18(4), pp.1157-1166. Oct. 2021. DOI: 10.1007/s11554-020-01039-x. [CrossRef]
- P. Subramaniam et al., “Review of security in mobile edge computing with deep learning,” In 2019 Advances in Science and Engineering Technology International Conferences (ASET) (pp. 1-5). Mar. 2019. DOI: 10.1109/ICASET.2019.8714349. [CrossRef]
- A. Krizhevsky et al., “Imagenet classification with deep convolutional neural networks”, Advances in neural information processing systems, 25, pp.1097-1105, Jun. 2017. DOI: 10.1145/3065386. [CrossRef]
- J. Schneible et al., “Anomaly detection on the edge,”. In MILCOM 2017-2017 IEEE Military Communications Conference (MILCOM) (pp. 678-682). Oct. 2017. DOI: 10.1109/MILCOM.2017.8170817. [CrossRef]
- T. Sirojan et al., “Sustainable Deep Learning at Grid Edge for Real-Time High Impedance Fault Detection,” in IEEE Transactions on Sustainable Computing, vol. 7, no. 2, pp. 346-357, 1 April-June 2022, DOI: 10.1109/TSUSC.2018.2879960. [CrossRef]
- F. Wang et al., “Deep Learning for Edge Computing Applications: A State-of-the-Art Survey,” in IEEE Access, vol. 8, pp. 58322-58336, 2020, doi: 10.1109/ACCESS.2020.2982411. [CrossRef]
- M.Z. Alom et al., “A State-of-the-Art Survey on Deep Learning Theory and Architectures,” Electronics. 2019; 8(3):292. Jan. 2019. DOI: 10.3390/electronics8030292. [CrossRef]
- Sengupta et al., “Going deeper in spiking neural networks: VGG and residual architectures,” Frontiers in neuroscience, 13, p.95. Mar. 2019. DOI: 10.3389/fnins.2019.00095. [CrossRef]
- L Wen et al., “A transfer convolutional neural network for fault diagnosis based on ResNet-50,” Neural Comput & Applic 32, 6111–6124 (2020). DOI: 10.1007/s00521-019-04097-w. [CrossRef]
- Szegedy et al., “Going deeper with convolutions,” 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 2015, pp. 1-9, DOI: 10.1109/CVPR.2015.7298594. [CrossRef]
- DeepVision (Kinara), March 2022. Accessed on: Jan 08, 2023. Available: https://kinara.ai/about-us/.
- Kneron, Accessed on: Jan 13, 2023. Available: https://www.kneron.com/page/soc/.
- Q. Wang et al., “N3LDG: A Lightweight Neural Network Library for Natural Language Processing,” Beijing Da Xue Xue Bao 55, no. 1 (2019): 113-119. Jan. 2019. DOI: 10.13209/j.0479-8023.2018.065. [CrossRef]
- S. Desai et al., “Lightweight convolutional representations for on-device natural language processing,” arXiv preprint. Feb. 2020. DOI: 10.48550/arXiv.2002.01535. [CrossRef]
- M. Zhang et al., “Libn3l: a lightweight package for neural nlp,” In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pp. 225-229. May 2016. DOI: https://aclanthology.org/L16-1034.
- Y. Tay et al., “Lightweight and efficient neural natural language processing with quaternion networks,” arXiv preprint, June 2019. DOI: 10.48550/arXiv.1906.04393. [CrossRef]
- Gyrfalcon, “LightSpeur 5801S neural accelerator”, 2022, Accessed on: December 10, 2022. Available: https://www.gyrfalcontech.ai/solutions/lightspeeur-5801/.
- D. Liu et al., “Bringing AI to edge: From deep learning’s perspective”, Neurocomputing, Volume 485, pp. 297-320, 7 May 2022. DOI: 10.1016/j.neucom.2021.04.141. [CrossRef]
- H. Li, “Application of IOT deep learning in edge computing: a review,”. Academic Journal of Computing & Information Science. 31;4(5). Oct 2021. DOI: 10.25236/AJCIS.2021.040514. [CrossRef]
- S.S. Zaidi et al., “A survey of modern deep learning-based object detection models,”. Digital Signal Processing., 103514. Mar 2022. DOI: 10.1016/j.dsp.2022.103514. [CrossRef]
- J. Chen et al., "Deep learning with edge computing: A Review," in Proceedings of the IEEE, vol. 107, no. 8, pp. 1655-1674, Aug. 2019, DOI: 10.1109/JPROC.2019.2921977. [CrossRef]
- W. Rawat et al., “Deep Convolutional Neural Networks for Image Classification: A Comprehensive Review," in Neural Computation, vol. 29, no. 9, pp. 2352-2449, Sept. 2017, DOI: 10.1162/neco_a_00990. [CrossRef]
- A. M. Al-Saffar et al., “Review of deep convolution neural network in image classification”, 2017 International Conference on Radar, Antenna, Microwave, Electronics, and Telecommunications (ICRAMET), Jakarta, Indonesia, pp. 26-31, Oct. 2017. DOI: 10.1109/ICRAMET.2017.8253139. [CrossRef]
- F. N. Iandola et al., “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size”, arXiv preprint, Nov 2016. DOI: 10.48550/arXiv.1602.07360. [CrossRef]
- Elhassouny et al., “Trends in deep convolutional neural Networks architectures: a review”, 2019 International Conference of Computer Science and Renewable Energies (ICCSRE), Agadir, Morocco, pp. 1-8, Jul. 2019. DOI: 10.1109/ICCSRE.2019.8807741. [CrossRef]
- G. Howard et al., “Mobilenets: Efficient convolutional neural networks for mobile vision applications,” arXiv preprint, Apr. 2017. DOI: 10.48550/arXiv.1704.04861. [CrossRef]
- M. Sandler et al., “Mobilenetv2: Inverted residuals and linear bottlenecks”, ArXiv preprint, Jan 2018. 10.48550/arXiv.1801.04381. [CrossRef]
- Howard et al., “Searching for mobilenetv3”, In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1314-1324. 2019.
- X. Zhang et al., “Shufflenet: An extremely efficient convolutional neural network for mobile devices”, In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 6848-6856. 2018.
- M. Ningning et al., “Shufflenet v2: Practical guidelines for efficient cnn architecture design,” In Proceedings of the European conference on computer vision (ECCV), pp. 116-131. 2018.
- T. Mingxing et al., “Efficientnet: Rethinking model scaling for convolutional neural networks,” In International Conference on Machine Learning, pp. 6105-6114. PMLR, 2019.
- V. Niv, Hailo blog, “Object detection at the Edge: Making the right choice,” AI on the Edge: the Hailo Blog, Oct 2022, Accessed on: Jan 04, 2023. Available: https://hailo.ai/blog/object-detection-at-the-edge-making-the-right-choice/.
- Z.-Q. Zhao et al., “Object Detection with deep learning: A review”, IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 11, pp. 3212-3232, Nov. 2019. DOI: 10.1109/TNNLS.2018.2876865. [CrossRef]
- J. Chen and X. Ran, "Deep learning with edge computing: A review," in Proceedings of the IEEE, vol. 107, no. 8, pp. 1655-1674, Aug. 2019, DOI: 10.1109/JPROC.2019.2921977. [CrossRef]
- J. -M. Hung et al., "An 8-Mb DC-Current-Free Binary-to-8b Precision ReRAM Nonvolatile Computing-in-Memory Macro using Time-Space-Readout with 1286.4-21.6TOPS/W for Edge-AI Devices," 2022 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2022, pp. 1-3, DOI: 10.1109/ISSCC42614.2022.9731715. [CrossRef]
- J. Oruh et al., "Long Short-Term Memory Recurrent Neural Network for Automatic Speech Recognition," in IEEE Access, vol. 10, pp. 30069-30079, 2022, DOI: 10.1109/ACCESS.2022.3159339. [CrossRef]
- Liu et al., “Time delay recurrent neural network for speech recognition”, Journal of Physics: Conference Series. Vol. 1229. No. 1. IOP Publishing, 2019. DOI:10.1088/1742-6596/1229/1/012078. [CrossRef]
- Y. Zhao et al., “The SpeechTransformer for Large-scale Mandarin Chinese Speech Recognition,” ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 2019, pp. 7095-7099, DOI: 10.1109/ICASSP.2019.8682586. [CrossRef]
- J. Oruh et al., “Long Short-Term Memory Recurrent Neural Network for Automatic Speech Recognition,” IEEE Access, vol. 10, pp. 30069-30079, 2022, DOI: 10.1109/ACCESS.2022.3159339. [CrossRef]
- M. Omar et al., “Natural Language Processing: Recent Advances, Challenges, and Future Directions,” arXiv preprint, 2022 Jan 3, 10.48550/arXiv.2201.00768. [CrossRef]
- Z. Yuan et al., “14.2 A 65nm 24.7µJ/Frame 12.3mW Activation-Similarity-Aware Convolutional Neural Network Video Processor Using Hybrid Precision, Inter-Frame Data Reuse and Mixed-Bit-Width Difference-Frame Data Codec,” 2020 IEEE International Solid- State Circuits Conference - (ISSCC), San Francisco, CA, USA, 2020, pp. 232-234, DOI: 10.1109/ISSCC19947.2020.9063155. [CrossRef]
- G. Tate, “Advantages of BFloat16 for AI inference”, Oct 2019, Accessed on: Jan 07, 2023, Available: https://semiengineering.com/advantages-of-bfloat16-for-ai-inference/.
- OpenAI, GPT-4: Technical Report, 27 Mar 2023. 10.48550/arXiv.2303.08774.
- Radford et al., “Language models are unsupervised multi-task learners,” OpenAI blog. 24;1(8):9. Feb 2019.
- T. Brown et al., “Language models are few-shot learners”, Advances in neural information processing systems. 33:1877-901. 2020.
- W. Fedus, “Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity”. arXiv preprint, Jan 2021. DOI: 10.48550/arXiv.2101.03961. [CrossRef]
- Q. Cao et al., “DeFormer: Decomposing pre-trained transformers for faster question answering”, arXiv preprint, May 2020. DOI: 10.48550/arXiv.2005.00697. [CrossRef]
- Z. Sun et al., “Mobilebert: a compact task-agnostic bert for resource-limited devices,” arXiv preprint. 2020 Apr 2020. 10.48550/arXiv.2004.02984. [CrossRef]
- Garret, “The Syntiant journey and the pervasive NDP”, Blog Post, Processor, August 2021. Accessed on: May 5, 2022, Available: https://www.edge-ai-vision.com/2021/08/the-syntiant-journey-and-the-pervasive-ndp/#:~:text=In%20the%20summer%20of%202019,will%20capitalize%20on%20the%20momentum.
- NXP, “i.MX Application Processors”, Accessed on: July 10, 2023, Available: https://www.nxp.com/products/processors-and-microcontrollers/arm-processors/i-mx-applications-processors/i-mx-9-processors:IMX9-PROCESSORS.
- NXP, “i.MX 8M Plus-Arm Cortex-A53, Machine Learning Vision, Multimedia and Industrial IoT” Accessed on: June 17, 2023, Available:https://www.nxp.com/products/processors-and-microcontrollers/arm-processors/i-mx-applications-processors/i-mx-8-processors/i-mx-8m-plus-arm-cortex-a53-machine-learning-vision-multimedia-and-industrial-iot:IMX8MPLUS.
- NXP Datasheet, “i.MX 8M Plus SoM datasheet”, Accessed on: February 10, 2023. Available: https://www.solid-run.com/wp-content/uploads/2021/06/i.MX8M-Plus-Datasheet-2021-.pdf.
- Deleo, Cision, PRNewswire, “Mythic expands product lineup with new scalable, power-efficient analog matrix processor for edge AI applications”, Mythic 1076, Accessed on: May 10, 2023. Available: https://www.prnewswire.com/news-releases/mythic-expands-product-lineup-with-new-scalable-power-efficient-analog-matrix-processor-for-edge-ai-applications-301306344.html.
- S. Ward-Foxton, EETimes. “Mythic Launches Second AI Chip”, Accessed on: April 20, 2022. Available: https://www.eetasia.com/mythic-launches-second-ai-chip/.
- L. Fick et al., "Analog Matrix Processor for Edge AI Real-Time Video Analytics," 2022 IEEE International Solid- State Circuits Conference (ISSCC), 2022, pp. 260-262.
- Gyrfalcon, “PIM AI Accelerators”, Accessed on: August 01, 2023. Available: https://www.gyrfalcontech.ai/.
- Gyrfalcon Technology, “Lightspeeur 2803 Neural Accelerator”, Accessed on: August 02, 2023. Available: https://www.gyrfalcontech.ai/solutions/2803s/.
- Y. Cheng et al., "A survey of model compression and acceleration for deep neural networks." arXiv preprint, June 2020. DOI: 10.48550/arXiv.1710.09282. [CrossRef]
- L. Deng et al., "Model Compression and Hardware Acceleration for Neural Networks: A Comprehensive Survey," in Proceedings of the IEEE, vol. 108, no. 4, pp. 485-532, April 2020, DOI: 10.1109/JPROC.2020.2976475. [CrossRef]
- K. Nan et al., "Deep model compression for mobile platforms: A survey," in Tsinghua Science and Technology, vol. 24, no. 6, pp. 677-693, Dec. 2019, DOI: 10.26599/TST.2018.9010103. [CrossRef]
- Berthelier et al., “Deep Model Compression and Architecture Optimization for Embedded Systems: A Survey”. J Sign Process Syst 93, 863–878. August 2021. DOI:10.1007/s11265-020-01596-1. [CrossRef]
- J. Lei et al., “A Review of Deep Network Model Compression”, Journal of Software, 2018, 29(2): 251-266. Available: http://www.jos.org.cn/1000-9825/5428.htm.
- S. Han et al., "Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding." arXiv preprint, DOI: 10.48550/arXiv.1510.00149. [CrossRef]
- Q. Qin et al., "To compress, or not to compress: Characterizing deep learning model compression for embedded inference." In 2018 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom), pp. 729-736. IEEE, 2018. DOI: 10.1109/BDCloud.2018.00110. [CrossRef]
- B. Jacob et al., "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference," 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp. 2704-2713, DOI: 10.1109/CVPR.2018.00286. [CrossRef]
- C. Yuan and S. S. Agaian, "A comprehensive review of Binary Neural Network." arXiv preprint, Mar. 2023. DOI: 10.1007/s10462-023-10464-w. [CrossRef]
- H. Mo et al., "9.2 A 28nm 12.1TOPS/W Dual-Mode CNN Processor Using Effective-Weight-Based Convolution and Error-Compensation-Based Prediction," 2021 IEEE International Solid- State Circuits Conference (ISSCC), 2021, pp. 146-148. DOI: 10.1109/ISSCC42613.2021.9365943. [CrossRef]
- S. Yin et al., "PIMCA: A 3.4-Mb Programmable In-Memory Computing Accelerator in 28nm for On-Chip DNN Inference," 2021 Symposium on VLSI Circuits, 2021, pp. 1-2. DOI: 10.23919/VLSICircuits52068.2021.9492403. [CrossRef]
- H. Fujiwara et al., "A 5-nm 254-TOPS/W 221-TOPS/mm2 Fully Digital Computing-in-Memory Macro Supporting Wide-Range Dynamic-Voltage-Frequency Scaling and Simultaneous MAC and Write Operations," 2022 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2022, pp. 1-3, DOI: 10.1109/ISSCC42614.2022.9731754. [CrossRef]
- S. Wang and P. Kanwar, “BFloat16: The secret to high performance on Cloud TPUs”, Aug. 2019. Accessed on: Sept. 18, 2022, Available: https://cloud.google.com/blog/products/ai-machine-learning/bfloat16-the-secret-to-high-performance-on-cloud-tpus.
- G. Tate, “Advantages of BFloat16 for AI inference”, Oct 2019, Accessed on: Sept. 18, 2022. Available: https://semiengineering.com/advantages-of-bfloat16-for-ai-inference/.
- S. Lee et al., "A 1ynm 1.25V 8Gb, 16Gb/s/pin GDDR6-based Accelerator-in-Memory supporting 1TFLOPS MAC Operation and Various Activation Functions for Deep-Learning Applications," 2022 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2022, pp. 1-3, doi: 10.1109/ISSCC42614.2022.9731711. [CrossRef]
- Blaize, Stop Compromising Start Deploying, 2022, Accessed on: Jun 11, 2023. Available: https://www.blaize.com/products/ai-edge-computing-platforms/.
- M. Demler, “Blaize Ignites Edge-AI Performance”, Microprocessor Report, Sept. 2020. Accessed on: Jun. 2022. Available: https://www.blaize.com/wp-content/uploads/2020/09/Blaize-Ignites-Edge-AI-Performance.pdf.
- T. Liang et al., “Pruning and quantization for deep neural network acceleration: A survey." Neurocomputing 461 (2021): 370-403. Oct. 2021. DOI: 10.1016/j.neucom.2021.07.045. [CrossRef]
- M. M. Bejani and M. Ghatee, "A systematic review on overfitting control in shallow and deep neural networks." Artificial Intelligence Review, 1-48, Dec. 2021. DOI: 10.1007/s10462-021-09975-1. [CrossRef]
- H. Yang et al., "Soft filter pruning for accelerating deep convolutional neural networks." arXiv preprint. Aug. 2018. DOI: 10.48550/arXiv.1808.06866. [CrossRef]
- T. Hoefler et al., "Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks." arXiv preprint, Jan. 2021. DOI: 10.48550/arXiv.2102.00554. [CrossRef]
- V. Sanh et al., "Movement pruning: Adaptive sparsity by fine-tuning." arXiv preprint, Oct. 2020. DOI: 10.48550/arXiv.2005.07683. [CrossRef]
- C. Buciluă et al., "Model compression." In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 535-541, Aug. 2006. DOI: 10.1145/1150402.1150464. [CrossRef]
- J. Gou et al., "Knowledge distillation: A survey." International Journal of Computer Vision 129, no. 6: 1789-1819, May 2021. DOI: 10.48550/arXiv.2006.05525. [CrossRef]
- Y. Kim and A. M. Rush, "Sequence-level knowledge distillation." arXiv preprint, Sep. 2016. DOI: 10.48550/arXiv.1606.07947. [CrossRef]
- Z. Allen-Zhu and Y. Li, "Towards understanding ensemble, knowledge distillation and self-distillation in deep learning." arXiv preprint, Feb 2023. DOI: 10.48550/arXiv.2012.09816. [CrossRef]
- M. Huang et al., "Knowledge Distillation for Sequence Model." In Interspeech, pp. 3703-3707, Sep 2018. DOI: 10.21437/Interspeech.2018-1589. [CrossRef]
- J. H. Cho and B. Hariharan, "On the efficacy of knowledge distillation." In Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4794-4802, Oct 2019.
- T. Tambe et al., “EdgeBERT: Optimizing On-chip inference for multi-task NLP”, arXiv preprints. Nov 2020. DOI: 10.48550/arXiv.2011.14203. [CrossRef]
- Z. Sun et al., “Mobilebert: a compact task-agnostic bert for resource-limited devices”, arXiv preprint Apr 6, 2020. DOI: 10.48550/arXiv.2004.02984. [CrossRef]
- Tensorflow, “An end-to-end open-source machine learning platform”, Accessed on: May 01, 2023. Available: https://www.tensorflow.org/.
- S. Li, “TensorFlow Lite: On-Device Machine Learning Framework”, Journal of Computer Research and Development, 2020, 57(9): 1839-1853. DOI: 10.7544/issn1000-1239.2020.20200291. [CrossRef]
- Paszke et al., “Pytorch: An imperative style, high-performance deep learning library”, Advances in neural information processing systems, 32, pp.8026-8037. Dec 2019.
- Pytorch, Pytorch Mobile, “End to end workflow from training to deployment for iOS and android mobile devices”, Accessed on: Dec 20, 2022. Available: https://pytorch.org/mobile/home/.
- Keras, “Keras API References”, Accessed on: Dec 20, 2022. Available: Online link: https://keras.io/api/.
- Caffe2, “A new lightweight, modular, and scalable deep learning framework”, Accessed on: Dec 21, 2022. Available: https://research.facebook.com/downloads/caffe2/.
- Zelinsky, "Learning OpenCV---Computer Vision with the OpenCV Library (Bradski, G.R. et al.; 2008) [On the Shelf]," in IEEE Robotics & Automation Magazine, vol. 16, no. 3, pp. 100-100, September 2009, DOI: 10.1109/MRA.2009.933612. [CrossRef]
- ONNX, “Open Neural Network Exchange-the open standard for machine learning interoperability”, Accessed on: Dec 22, 2022. Available: https://onnx.ai/.
- MXNet, “A flexible and efficient library for deep learning”, Accessed on: Dec 22, 2022. Available: https://mxnet.apache.org/versions/1.9.0/.
- ONNX, “Meta AI”, Accessed on: Dec 23, 2022, Available: https://ai.facebook.com/tools/onnx/.
- P. Vajda, and Y. Jia, “Delivering real-time AI in the palm of your hand”, Accessed on: Dec 27, 2022. Available: https://engineering.fb.com/2016/11/08/android/delivering-real-time-ai-in-the-palm-of-your-hand/.
- CEVA, “Edge AI & deep learning”, Accessed on: July 10, 2023. Available: https://www.ceva-dsp.com/app/deep-learning/.
- M. Demler, “Ceva NeuPro Accelerates Neural Nets”, Microprocessor Report, Jan 2018. Available: https://www.ceva-dsp.com/wp-content/uploads/2018/02/Ceva-NeuPro-Accelerates-Neural-Nets.pdf.
- CEVA, “CEVA NeuPro-S On-device Computer Vision Processor Architecture”, Sep 2020, Accessed on: Jun 17, 2022. Available: https://www.ceva-dsp.com/wp-content/uploads/2020/11/09_11_20_NeuPro-S_Brochure_V2.pdf.
- P.A. Merolla et al., “A million spiking-neuron integrated circuit with a scalable communication network and interface”. Science. 8;345(6197):668-73. Aug 2014. DOI: 10.1126/science.1254642. [CrossRef]
- C. Yakopcic et al., "Solving Constraint Satisfaction Problems Using the Loihi Spiking Neuromorphic Processor," 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE), Grenoble, France, 2020, pp. 1079-1084, doi: 10.23919/DATE48585.2020.9116227. [CrossRef]
- T. Bohnstingl, “Neuromorphic hardware learns to learn”, Frontiers in neuroscience, 21;13:483. May 2019. DOI: 10.3389/fnins.2019.00483. [CrossRef]
- S.B. Shrestha et al., “Slayer: Spike layer error reassignment in time”, Advances in neural information processing systems,31. Sep 2018. DOI: 10.48550/arXiv.1810.08646. [CrossRef]
- S. Davidson, S. B. Furber, “Comparison of artificial and spiking neural networks on digital hardware”, Frontiers in Neuroscience, 15:345. Apr 2021. DOI: 10.3389/fnins.2021.651141. [CrossRef]
- P. Blouw et al., “Benchmarking keyword spotting efficiency on neuromorphic hardware”, In Proceedings of the 7th Annual Neuro-inspired Computational Elements Workshop, pp. 1-8. Mar 2019, DOI: 10.1145/3320288.3320304. [CrossRef]
- NengoLoihi, Accessed on: Nov 20, 2022. Available: https://www.nengo.ai/nengo-loihi/.
- Nengo, “Spinnaker backend for Nengo”, Accessed on: Nov 20, 2022. Available: https://nengo-spinnaker.readthedocs.io/en/latest/.
- NengoDL, Accessed on: Nov 20, 2022, Available: https://www.nengo.ai/nengo-dl/.
- Brainchip, MetaTF, Online link: https://brainchip.com/metatf-development-environment/.
- Brainchip, “Introducing the ADK1000 IP and NSOM for Edge AI IoT”, May 2020, Accessed on: Nov 22, 2022. Available: https://www.youtube.com/watch?v=EUGx45BCKlE.
- P. Clarke, eeNews, “Akida Spiking Neural Processor Could Head to FDSOI”, Aug 2, 2021, Accessed on: Nov 25, 2022. Available: https://www.eenewsanalog.com/news/akida-spiking-neural-processor-could-head-fdsoi.
- M. Demler, “Brainchip Akida is a faster learner”, Microprocessor Report, The Linley Group, Oct 28, 2019. Available: https://d1io3yog0oux5.cloudfront.net/brainchipinc/files/BrainChip+Akida+Is+a+Fast+Learner.pdf.
- Lava. “Lava software framework”, Accessed on: Nov 26, 2022. Available: https://lava-nc.org/.
- Reuther et al., "AI and ML Accelerator Survey and Trends," 2022 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, 2022, pp. 1-10, doi: 10.1109/HPEC55821.2022.9926331. [CrossRef]
- Y. Chen et al., “A survey of accelerator architectures for deep neural networks”, Engineering, 6(3), pp.264-274. Mar 2020. DOI: 10.1016/j.eng.2020.01.007. [CrossRef]
- W. Li and M. Liewig, “A survey of AI accelerators for edge environments”, In World Conference on Information Systems and Technologies (pp. 35-44). Springer, Cham. April 2020. DOI: 10.1007/978-3-030-45691-7_4. [CrossRef]
- M. S. Murshed et al., “Machine learning at the network edge: A survey”, ACM Computing Surveys (CSUR). 54(8):1-37. Oct 2021, DOI: 10.1145/3469029. [CrossRef]
- W. Lin et al., “Low-Power Ultra-Small Edge AI Accelerators for Image Recognition with Convolution Neural Networks: Analysis and Future Directions”, Electronics, 10(17), p.2048. Aug 2021. DOI: 10.3390/electronics10172048. [CrossRef]
- Reuther et al., "Survey of Machine Learning Accelerators," 2020 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, 2020, pp. 1-12, doi: 10.1109/HPEC43674.2020.9286149. [CrossRef]
- J. Cross. Macworld. “Apple’s A16 chip doesn’t live up to its ‘Pro’ price or expectations”, Accessed on: Jan 1, 2023. Available: https://www.macworld.com/article/1073243/a16-processor-cpu-gpu-lpddr5-memory-performance.html.
- Apple. Press Release, June 6, 2022. “Apple unveils M2, taking the breakthrough performance and capabilities of M1 even further”, Accessed on: July 10, 2022. Available: https://www.apple.com/newsroom/2022/06/apple-unveils-m2-with-breakthrough-performance-and-capabilities/.
- Apple. Press Release. Nov 10, 2020. “Apple Unleashes M1”, Accessed on: Dec 5, 2021. Available: https://www.apple.com/newsroom/2020/11/apple-unleashes-m1/.
- ARM, NPU, Ethos-N78, “Highly scalable and efficient second generation ML inference processor”, Accessed on: May 15, 2022. Available: https://www.arm.com/products/silicon-ip-cpu/ethos/ethos-n78.
- Frumusanu, “Arm Announces Ethos-N78: Bigger and More Efficient”, Anandtech, May 27, 2020. Accessed on: April 25, 2022. Available: https://www.anandtech.com/show/15817/arm-announces-ethosn78-npu-bigger-and-more-efficient.
- Blaize. “2022 best edge AI processor blaize Pathfinder P1600 embedded system on module”, Accessed on: Dec 05, 2022. Available: https://www.blaize.com/products/ai-edge-computing-platforms/.
- M. Demer, “Blaize Ignites Edge-AI Performance”, Microprocessor Report, Sep 2020, Accessed on: June 20, 2022. Available: https://www.blaize.com/wp-content/uploads/2020/09/Blaize-Ignites-Edge-AI-Performance.pdf.
- AIMotive. “Industry High 98% Efficiency Demonstrated Aimotive and Nextchip”, April 15, 2021, Accessed on: Mar 25, 2022. Available: https://aimotive.com/-/industry-high-98-efficiency-demonstrated-by-aimotive-and-nextchip.
- AIMotive. “NN acceleration for automotive AI”, Accessed on: May 25, 2022. Available: https://aimotive.com/aiware-apache5.
- N. Dahad, “Hardware Inference Chip Targets Automotive Applications”, December 24, 201. Accessed on: June 25, 2022. Available:https://www.embedded.com/hardware-inference-chip-targets-automotive-applications/..
- Cadence, “Tesilica AI Platform”, Accessed on: Dec 12, 2022. Available: https://www.cadence.com/en_US/home/tools/ip/tensilica-ip/tensilica-ai-platform.html.
- Cadence Newsroom, “Cadence Accelerates Intelligent SoC Development with Comprehensive On-Device Tensilica AI Platform”, Sep 13, 2021. Accessed on: Aug 15, 2022. Available: https://www.cadence.com/en_US/home/company/newsroom/press-releases/pr/2021/cadence-accelerates-intelligent-soc-development-with-comprehensi.html.
- M. Maxfield, “Say Hello to Deep Vision’s Polymorphic Dataflow Architecture”, EE Journal, Dec 24, 2020. Accessed on: Dec 05, 2022, Available: https://www.eejournal.com/article/say-hello-to-deep-visions-polymorphic-dataflow-architecture/.
- S. Ward-Foxton, “AI Startup Deepvision Raises Funds Preps Next Chip”, EETimes, Sep 15, 2021, Accessed on: Dec 05, 2022. Available: https://www.eetasia.com/ai-startup-deep-vision-raises-funds-preps-next-chip/.
- Horizon AI, “Efficient AI Computing for Automotive Intelligence”, Accessed on: Dec 06, 2022, Available: https://en.horizon.ai/.
- Horizon Robotics, “Horizon Robotics and BYD announce cooperation on BYD’s BEV perception solution powered by Journey 5 computing solution at Shanghai auto show 2023,” Cision PR Newswire, Apr 19, 2023. Accessed on: Jun 20, 2023. Available: https://www.prnewswire.com/news-releases/horizon-robotics-and-byd-announce-cooperation-on-byds-bev-perception-solution-powered-by-journey-5-computing-solution-at-shanghai-auto-show-2023-301802072.html.
- Zheng, “Horizon Robotics' AI chip with up to 128 TOPS computing power gets key certification”, Cnevpost, Jul 6, 2021. Accessed on June 16, 2022. Available: https://cnevpost.com/2021/07/06/horizon-robotics-ai-chip-with-up-to-128-tops-computing-power-gets-key-certification/.
- Hailo, “The World's Top Performance AI Processor for Edge Devices”, Accessed on: May 20, 2023. Available: https://hailo.ai/.
- Brown, “Hailo-8 NPU Ships on Linux-Powered Lanner Edge System”, Jun 1, 2021. Accessed on: Jul 10, 2022. Available: https://linuxgizmos.com/hailo-8-npu-ships-on-linux-powered-lanner-edge-systems/.
- Edge TPU. “Coral Technology”, Accessed on: May 20, 2022. Available: https://coral.ai/technology/.
- Coral, “USB Accelerator”, Accessed on: Jun 13, 2022. Available: https://coral.ai/products/accelerator/.
- N. P. Jouppi et al., “A domain-specific supercomputer for training deep neural networks”, Communications of the ACM.;63(7):67-78. Jun 2020. DOI: 10.1145/3360307.
- Google, “How Google Tensor powers up Pixel phones”, Accessed on: Jul 16, 2022. Available: https://store.google.com/intl/en/ideas/articles/google-tensor-pixel-smartphone/.
- GreenWaves, “GAP9 Processor for Hearables and Sensors”, Accessed on: Jun 18, 2023. Available: https://greenwaves-technologies.com/gap9_processor/.
- Deleo, GreenWaves. GAP9, “GreenWaves Unveils Groundbreaking Ultra-Low Power GAP9 IoT Application Processor for the Next Wave of Intelligence at the Very Edge”, Accessed on: Aug 08, 2023. Available: https://greenwaves-technologies.com/gap9_iot_application_processor/.
- Imagination, “PowerVR Series3NX, Advanced Compute and Neural Network Processors Enabling the Smart Edge”, Accessed on: Jun 10, 2022. Available: https://www.imaginationtech.com/vision-ai/powervr-series3nx/.
- B. Har-Even, “Separating the wheat from the chaff in embedded AI with PowerVR Series3NX”, Jan 24, 2019. Accessed on: Jul 25, 2022. Available: https://www.imaginationtech.com/blog/separating-the-wheat-from-the-chaff-in-embedded-ai/.
- Imagination, “The ideal single core solution for neural network acceleration”, Accessed on: June 16, 2022. Available: https://www.imaginationtech.com/product/img-4nx-mc1/.
- Wikichip, “Intel Nervana, Neural Network Processor (NNP)”, Accessed on: July 14, 2023, Available: https://en.wikichip.org/wiki/nervana/nnp.
- Carmelito, “Intel Neural Compute Stick 2 - Review”, Element14, Mar 8, 2021. Accessed on: Mar 24, 2023. Available: https://community.element14.com/products/roadtest/rv/roadtest_reviews/954/intel_neural_compute_3.
- L. Smith, “4th Gen Intel Xeon Scalable Processors Launched”, StorageReview, Jan 10, 2023. Accessed on: May 12, 2023. Available: https://www.storagereview.com/news/4th-gen-intel-xeon-scalable-processors-launched.
- J. Burns, and L. Chang, “Meet the IBM Artificial Intelligence Unit”, Oct 18, 2022. Accessed on: Dec 16, 2022. Available: https://research.ibm.com/blog/ibm-artificial-intelligence-unit-aiu.
- K. Gupta. “IBM Research Introduces Artificial Intelligence Unit (AIU): It’s First Complete System-on-Chip Designed to Run and Train Deep Learning Models Faster and More Efficiently than a General-Purpose CPU”, MarkTecPost, Oct 27, 2022. Accessed on: Dec 20, 2022. Available: https://www.marktechpost.com/2022/10/27/ibm-research-introduces-artificial-intelligence-unit-aiu-its-first-complete-system-on-chip-designed-to-run-and-train-deep-learning-models-faster-and-more-efficiently-than-a-general-purpose-cpu/.
- P. Clarke, “Startup launches near-binary neural network accelerator”, EENews, May 19, 2020. Accessed on: Dec 20, 2022. Available: https://www.eenewseurope.com/en/startup-launches-near-binary-neural-network-accelerator/.
- Kneron, “AI System on Chip (SoC)”, KL720 AI SoC, Accessed on: Jul 15, 2023. Available: https://www.kneron.com/page/soc/.
- Kneron, “AI System on Chip (SoC)”, KL530 AI SoC, Accessed on: Jul 15, 2023. Available: https://www.kneron.com/page/soc/.
- MobileEye, “One automotive-grade SoC, many mobility solutions”, Accessed on Aug 4, 2023. Available: https://www.mobileye.com/our-technology/evolution-eyeq-chip/.
- EyeQ5, Wikichip, March 2021. Accessed on: Jun 22, 2023. Available: https://en.wikichip.org/wiki/mobileye/eyeq/eyeq5.
- D. Casil, “Mobileye presents EyeQ Ultra, the chip that promises true level 4 autonomous driving in 2025”, Jul 01, 2022. Accessed on: Jun 05, 2023. Available: https://www.gearrice.com/update/mobileye-presents-eyeq-ultra-the-chip-that-promises-true-level-4-autonomous-driving-in-2025/.
- MobileEye, “Meet EyeQ6: Our most advanced driver-assistance chip yet”, May 25, 2022. Accessed on: May 27, 2023. Available: https://www.mobileye.com/blog/eyeq6-system-on-chip/.
- Nvidia, “Jetson Nano”, Accessed on: May 26, 2023. Available: https://elinux.org/Jetson_Nano#:~:text=Useful%20for%20deploying%20computer%20vision,5%2D10W%20of%20power%20consumption.
- NVIDIA Jetson Nano B01, “Deep learning with raspberry pi and alternatives” April 5, 2023. Accessed on Jul 03, 2023. Available: https://qengineering.eu/deep-learning-with-raspberry-pi-and-alternatives.html#Compare_Jetson.
- Nvidia, Jetson Orin, “The future of industrial-grade edge AI”, Accessed on: Jul 25, 2023. Available: https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/.
- L. Deleon, “Build enhanced video conference experiences”, Qualcomm. Mar 7, 2023. Accessed on: May 05, 2023, Available: https://developer.qualcomm.com/blog/build-enhanced-video-conference-experiences.
- Qualcomm, QCS8250, “Premium processor designed to help you deliver maximum performance for compute intensive camera, video conferencing and Edge AI applications with support Wi-Fi 6 and 5G for the Internet of Things (IoT)” Accessed on: Jul, 15, 2023, https://www.qualcomm.com/products/qcs8250.
- Snapdragon, “888+ 5G Mobile Platform”, Accessed on: May 24, 2023. Available: https://www.qualcomm.com/products/snapdragon-888-plus-5g-mobile-platform.
- Qualcomm, “Qualcomm Snapdragon 888 Plus, Benchmark, Test and Spec”, CPU monkey, Jun 16, 2023, Accessed on: Jul 15, 2023. Available: https://www.cpu-monkey.com/en/cpu-qualcomm_snapdragon_888_plus.
- Hsu, “Training ML Models at the Edge with Federated Learning”, Qualcomm, Jun 07, 2021. Accessed on: Jul 7, 2023. Available: https://developer.qualcomm.com/blog/training-ml-models-edge-federated-learning.
- Samsung, “The core that redefines your device”, Accessed on May 25, 2023.Available:https://www.samsung.com/semiconductor/minisite/exynos/products/all-processors/.
- GSMARENA, “Exynos 2100 Vs Snapdragon 888: Benchmarking the Samsung Galaxy S21 Ultra Versions”, GSMARENA, Feb 07, 2021. Accessed on: Jun 10, 2023.Available:https://www.gsmarena.com/exynos_2100_vs_snapdragon_888_benchmarking_the_samsung_galaxy_s21_ultra_performance-news-47611.php.
- M. Kong, “VeriSilicon VIP9000 NPU AI processor and ZSPNano DSP IP bring AI-Vision and AI-Voice to low power automotive image processing SoC”, VeriSilicon Press release, May 12, 2020. Accessed on: Jul 16, 2022. Available: https://www.verisilicon.com/en/PressRelease/VIP9000andZSPAdoptedbyiCatch.
- VeriSilicon, “VeriSilicon Launches VIP9000, New Generation of Neural Processor Unit IP”, VeriSilicon Press Release, Jul 8, 2019. Accessed on: May 25, 2022. Available: https://www.verisilicon.com/en/PressRelease/VIP9000.
- Synopsys, “DesignWare ARC EV Processors for Embedded Vision”, Accessed on: Jul 25, 2022. Available: https://www.synopsys.com/designware-ip/processor-solutions/ev-processors.html.
- Synopsys, “Synopsys EV7x vision processor”, Accessed on: May 25, 2023. Available: https://www.synopsys.com/dw/ipdir.php?ds=ev7x-vision-processors.
- Wikichip, “FSD Chip”, Wikichip, Accessed on: May 28, 2023. Available: https://en.wikichip.org/wiki/tesla_(car_company)/fsd_chip.
- Research and Markets, “Neuromorphic Chips: Global Strategic Business Report”, ID: 4805280. Accessed on: May 16, 2023. Available: https://www.researchandmarkets.com/reports/4805280/neuromorphic-chips-global-strategic-business.
- M. Ghilardi, “SynSense secures additional capital from strategic investors”, SynSense News, Apr 18, 2023. Accessed on: May 5, 2023. Available: https://www.venturelab.swiss/SynSense-secures-additional-capital-from-strategic-investors.
- GrAI VIP, “Life Ready AI Processors”, Accessed on: Jul 16, 2023. Available: https://www.graimatterlabs.ai/product.
- S. Cassidy et al., "Real-Time Scalable Cortical Computing at 46 Giga-Synaptic OPS/Watt with ~100× Speedup in Time-to-Solution and ~100,000× Reduction in Energy-to-Solution," SC '14: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, New Orleans, LA, USA, 2014, pp. 27-38, DOI: 10.1109/SC.2014.8. [CrossRef]
- S. Ward-Foxton, “Innatera unveils neuromorphic AI chip to accelerate spiking networks”, EETimes, Jul 7, 2021. Accessed on: May 25, 2023. Available: https://www.linleygroup.com/newsletters/newsletter_detail.php?num=6302&year=2021&tag=3.
- J.-L. Aufranc, “Innatera Neuromorphic AI Accelerator for Spiking Neural Networks Enables Sub-mW AI Inference”, CNX Software - Embedded Systems News, Jul 16, 2021. Accessed on: May 25, 2023. Available: https://www.cnx-software.com/2021/07/16/innatera-neuromorphic-ai-accelerator-for-spiking-neural-networks-snn-enables-sub-mw-ai-inference/.
- B. Rajendran et al., "Low-Power Neuromorphic Hardware for Signal Processing Applications: A Review of Architectural and System-Level Design Approaches," in IEEE Signal Processing Magazine, vol. 36, no. 6, pp. 97-110, Nov. 2019, DOI: 10.1109/MSP.2019.2933719. [CrossRef]
- P. Blouw, X. Choo, E. Hunsberger, and C. Eliasmith, “Benchmarking keyword spotting efficiency on neuromorphic hardware”, in Proceedings of the 7th Annual Neuro-inspired Computational Elements Workshop, Mar. 2019, pp. 1-8.
- A. Yousefzadeh et al., “SENeCA: Scalable energy-efficient neuromorphic computer architecture”, in 2022 IEEE 4th International Conference on Artificial Intelligence Circuits and Systems (AICAS), Jun. 2022, pp. 371-374.
- Konikore, “Technology that sniffs out danger”, Accessed on: May 26, 2023. Available: https://theindexproject.org/post/konikore.
- K. Ueyoshi et al., "DIANA: An End-to-End Energy-Efficient Digital and ANAlog Hybrid Neural Network SoC," 2022 IEEE International Solid- State Circuits Conference (ISSCC), 2022, pp. 1-3, doi: 10.1109/ISSCC42614.2022.9731716. [CrossRef]
- N. Flaherty, “Axelera shows DIANA analog in-memory computing chip”, EENews, Feb 21, 2022. Accessed on: Jul 22, 2023. Available: https://www.eenewseurope.com/en/axelera-shows-diana-analog-in-memory-computing-chip/.
- Gyrfalcon Technology, “PIM AI Accelerators”, Accessed on: Mar 25, 2022. Available: https://www.gyrfalcontech.ai/.
- SolidRun, “Janux GS31 AI Server”, Accessed on: Mar 25, 2022. Available: https://www.solid-run.com/embedded-networking/nxp-lx2160a-family/ai-inference-server/.
- Samsung, “Samsung Brings PIM Technology to Wider Applications”, Aug 24, 2021. Accessed on: May 18, 2023. Available: https://www.samsung.com/semiconductor/newsroom/news-events/samsung-brings-in-memory-processing-power-to-wider-range-of-applications/.
- J. H. Kim, S. H. Kang, S. Lee, H. Kim, W. Song, Y. Ro, S. Lee, D. Wang, H. Shin, B. Phuah, and J. Choi, “Aquabolt-XL: Samsung HBM2-PIM with in-memory processing for ML accelerators and beyond”, in 2021 IEEE Hot Chips 33 Symposium (HCS), Aug. 2021, pp. 1-26.
- Syntiant, “Making Edge AI a Reality: A new processor for deep learning”, Accessed on: Jun 28, 2023. Available: https://www.syntiant.com/.
- Syntiant, “NDP100 Neural Decision Processor - always-on speech recognition”, Accessed on: Jun 28, 2023. Available: https://www.syntiant.com/ndp100.
- Syntiant, “NDP200 Neural Decision Processor - always-on vision, sensor and speech recognition”, Accessed on: Jun 28, 2023. Available: https://www.syntiant.com/ndp200.
- N. Tyler, “Syntiant Introduces NDP102 Neural Decision Processor”, newelectronics, Sep 16, 2021. Accessed on: Jun 28, 2023. Available: https://www.newelectronics.co.uk/content/news/syntiant-introduces-ndp102-neural-decision-processor.
- G. Halfacree, “Syntiant's NDP200 Promises 6.4GOP/s of Edge AI Compute in a Tiny 1mW Power Envelope”, Hackster.io, 2021. Accessed on: Jun 29, 2023. Available: https://www.hackster.io/news/syntiant-s-ndp200-promises-6-4gop-s-of-edge-ai-compute-in-a-tiny-1mw-power-envelope-96590283ffbc.
- M. Demler, “Syntiant Knows All the Best Words, NDP10x Speech-Recognition Processors Consume Just 200uW”, Microprocessor Report, 2019. Accessed on: Jun 29, 2023. Available: https://www.syntiant.com/post/syntiant-knows-all-the-best-words.
- M. Demler, “Syntiant NDP120 Sharpens Its Hearing, Wake-Word Detector Combines Ultra-Low Power DLA with HiFi 3 DSP”, 2021. Available: https://www.linleygroup.com/mpr/article.php?id=12455.
- G. Medici, “Syntiant Introduces NDP102 Neural Decision Processor”, Syntiant, Sep 15, 2021. Accessed on: Jun 30, 2023. Available: https://www.newelectronics.co.uk/content/news/syntiant-introduces-ndp102-neural-decision-processor.
- Untether, “The most efficient AI computer engine available”, Accessed on: May 18, 2023. Available: https://www.untether.ai/press-releases/untether-ai-ushers-in-the-petaops-era-with-at-memory-computation-for-ai-inference-workloads.
- Untether, “Untether AI”, Accessed on: May 18, 2023. Available: https://www.colfax-intl.com/downloads/UntetherAI-tsunAImi-Product-Brief.pdf.
- Upmem, “The PIM reference platform”, Accessed on: May 19, 2023. Available: https://www.upmem.com/technology/.
- D. Lavenier, R. Cimadomo and R. Jodin, "Variant Calling Parallelization on Processor-in-Memory Architecture," 2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Seoul, Korea (South), 2020, pp. 204-207, DOI: 10.1109/BIBM49941.2020.9313351. [CrossRef]
- J. Gómez-Luna et al., “Benchmarking Memory-Centric Computing Systems: Analysis of Real Processing-in-Memory Hardware”, arXiv preprint arXiv:2110.01709, 2021. DOI: 10.48550/arXiv.2110.01709. [CrossRef]
- Ian Cutress, “Hot Chips 31 Analysis: In Memory Processing by Upmem”, Anandtech, Aug 18, 2019. Accessed on: May 20, 2023. Available: https://www.anandtech.com/show/14750/hot-chips-31-analysis-inmemory-processing-by-upmem.
- Nanoreview.net, “A14 Bionic vs. A15 Bionic”, Accessed on: Jun 16, 2023. Available: https://nanoreview.net/en/soc-compare/apple-a15-bionic-vs-apple-a14-bionic.
- R. Merritt, “Startup Accelerates AI at the Sensor”, EETimes, Feb 11, 2019. Accessed on: Jun 10, 2023. Available: https://www.eetimes.com/startup-accelerates-ai-at-the-sensor/.
- P. Clarke, “Indo-US Startup Preps Agent-based AI Processor”, EENews, Aug 26, 2018. Accessed on: Jun 20, 2023. Available: https://www.eenewsanalog.com/en/indo-us-startup-preps-agent-based-ai-processor-2/.
- B. Wheeler, “Bitmain SoC Brings AI to the Edge”, Accessed on: Jul 23, 2023. Available: https://www.linleygroup.com/newsletters/newsletter_detail.php?num=5975&year=2019&tag=3.
- W. Liang, “Get Started, Neural Network Stick”, Github, May 10, 2019. Accessed on: May 16, 2023. Available: https://github.com/BM1880-BIRD/bm1880-system-sdk/wiki/GET-STARTED.
- L. Gwennap, “Kendryte Embeds AI for Surveillance”, Accessed on: Jul 14, 2023. Available: https://www.linleygroup.com/newsletters/newsletter_detail.php?num=5992.
- Canaan, “Kendryte K210”, Accessed on: May 15, 2023. Available: https://canaan.io/product/kendryteai.
- Eta Compute, “Micropower AI vision platform”, Accessed on: May 15, 2023. Available: https://etacompute.com/tensai-flow/.
- FlexLogic, “Flexlogic announces InferX high performance IP for DSP and AI inference”, Apr 24, 2023. Accessed on: Jun 12, 2023. Available: https://flex-logix.com/inferx-ai/inferx-ai-hardware/.
- Mediatek, i350, “Mediatek introduces i350 edge AI platform designed for voice and vision processing applications”, Oct 14, 2020, Accessed on: May 16, 2023. Available: https://corp.mediatek.com/news-events/press-releases/mediatek-introduces-i350-edge-ai-platform-designed-for-voice-and-vision-processing-applications.
- Perceive, “Put high power intelligence in a low power device”, Accessed on: May 16, 2023. Available: https://perceive.io/product/ergo/.
- Yida, “Introducing the Rock Pi N10 RK3399Pro SBC for AI and Deep Learning”, Accessed on: May 17, 2023. Available: https://www.seeedstudio.com/blog/2019/12/04/introducing-the-rock-pi-n10-rk3399pro-sbc-for-ai-and-deep-learning/.
- GadgetVersus, “Amlogic A311D Processor Benchmarks and Specs”, Accessed on: May 16, 2023. Available: https://gadgetversus.com/processor/amlogic-a311d-specs/.
- Samsung, “Exynos 2200”, Accessed on: Jun 1, 2023. Available: https://semiconductor.samsung.com/us/processor/mobile-processor/exynos-2200/.
- Think Silicon, “Nema Pico XS”, Accessed on: May 23, 2023. Available: https://www.think-silicon.com/nema-pico-xs#features.
- Y. -D. Chih et al., "16.4 An 89TOPS/W and 16.3TOPS/mm2 All-Digital SRAM-Based Full-Precision Compute-In Memory Macro in 22nm for Machine-Learning Edge Applications," 2021 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2021, pp. 252-254, DOI: 10.1109/ISSCC42613.2021.9365766. [CrossRef]
- Q. Dong et al., "15.3 A 351TOPS/W and 372.4GOPS Compute-in-Memory SRAM Macro in 7nm FinFET CMOS for Machine-Learning Applications," 2020 IEEE International Solid- State Circuits Conference - (ISSCC), San Francisco, CA, USA, 2020, pp. 242-244, DOI: 10.1109/ISSCC19947.2020.9062985. [CrossRef]
- C. -X. Xue et al., "16.1 A 22nm 4Mb 8b-Precision ReRAM Computing-in-Memory Macro with 11.91 to 195.7TOPS/W for Tiny AI Edge Devices," 2021 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2021, pp. 245-247, DOI: 10.1109/ISSCC42613.2021.9365769. [CrossRef]
- R. Khaddam-Aljameh et al., "HERMES Core – A 14nm CMOS and PCM-based In-Memory Compute Core using an array of 300ps/LSB Linearized CCO-based ADCs and local digital processing," 2021 Symposium on VLSI Technology, Kyoto, Japan, 2021, pp. 1-2.
- S. Yin et al., "PIMCA: A 3.4-Mb Programmable In-Memory Computing Accelerator in 28nm for On-Chip DNN Inference," 2021 Symposium on VLSI Circuits, 2021, pp. 1-2. DOI: 10.23919/VLSICircuits52068.2021.9492403. [CrossRef]
- G. Yuan et al., "FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator," 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA), Valencia, Spain, 2021, pp. 265-278, DOI: 10.1109/ISCA52012.2021.00029. [CrossRef]
- H. Caminal et al., "CAPE: A Content-Addressable Processing Engine," 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA), Seoul, Korea (South), 2021, pp. 557-569, DOI: 10.1109/HPCA51647.2021.00054. [CrossRef]
- S. Lee et al., "A 1ynm 1.25V 8Gb, 16Gb/s/pin GDDR6-based Accelerator-in-Memory supporting 1 TFLOPS MAC Operation and Various Activation Functions for Deep-Learning Applications", SK hynix, ISSCC, Feb. 2022. DOI: 10.1109/ISSCC42614.2022.9731711. [CrossRef]
- H. Fujiwara et al., "A 5-nm 254-TOPS/W 221-TOPS/mm2 Fully Digital Computing-in-Memory Macro Supporting Wide-Range Dynamic-Voltage-Frequency Scaling and Simultaneous MAC and Write Operations," 2022 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2022, pp. 1-3, DOI: 10.1109/ISSCC42614.2022.9731754. [CrossRef]
- J. -S. Park et al., "A Multi-Mode 8K-MAC HW-Utilization-Aware Neural Processing Unit with a Unified Multi-Precision Datapath in 4nm Flagship Mobile SoC," 2022 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2022, pp. 246-248, DOI: 10.1109/ISSCC42614.2022.9731639. [CrossRef]
- H. Zhu et al., "COMB-MCM: Computing-on-Memory-Boundary NN Processor with Bipolar Bitwise Sparsity Optimization for Scalable Multi-Chiplet-Module Edge Machine Learning," 2022 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2022, pp. 1-3, DOI: 10.1109/ISSCC42614.2022.9731657. [CrossRef]
- D. Niu et al., "184QPS/W 64Mb/mm2 3D Logic-to-DRAM Hybrid Bonding with Process-Near-Memory Engine for Recommendation System," 2022 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2022, pp. 1-3, DOI: 10.1109/ISSCC42614.2022.9731694. [CrossRef]
- Y. -C. Chiu et al., "A 22nm 4Mb STT-MRAM Data-Encrypted Near-Memory Computation Macro with a 192GB/s Read-and-Decryption Bandwidth and 25.1-55.1TOPS/W 8b MAC for AI Operations," 2022 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2022, pp. 178-180, DOI: 10.1109/ISSCC42614.2022.9731621. [CrossRef]
- W. -S. Khwa et al., "11.3 A 40-nm, 2M-Cell, 8b-Precision, Hybrid SLC-MLC PCM Computing-in-Memory Macro with 20.5 - 65.0TOPS/W for Tiny-AI Edge Devices," 2022 IEEE International Solid- State Circuits Conference (ISSCC), 2022, pp. 1-3. DOI: 10.1109/ISSCC42614.2022.9731670. [CrossRef]
- S. D. Spetalnick et al., "A 40nm 64kb 26.56TOPS/W 2.37Mb/mm2 RRAM Binary/Compute-in-Memory Macro with 4.23x Improvement in Density and >75% Use of Sensing Dynamic Range," 2022 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2022, pp. 1-3, DOI: 10.1109/ISSCC42614.2022.9731725. [CrossRef]
- M. Chang et al., "A 40nm 60.64TOPS/W ECC-Capable Compute-in-Memory/Digital 2.25MB/768KB RRAM/SRAM System with Embedded Cortex M3 Microprocessor for Edge Recommendation Systems," 2022 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2022, pp. 1-3, DOI: 10.1109/ISSCC42614.2022.9731679. [CrossRef]
- D. Wang et al., "DIMC: 2219TOPS/W 2569F2/b Digital In-Memory Computing Macro in 28nm Based on Approximate Arithmetic Hardware," 2022 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2022, pp. 266-268, DOI: 10.1109/ISSCC42614.2022.9731659. [CrossRef]
- J. -M. Hung et al., "An 8-Mb DC-Current-Free Binary-to-8b Precision ReRAM Nonvolatile Computing-in-Memory Macro using Time-Space-Readout with 1286.4-21.6TOPS/W for Edge-AI Devices," 2022 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2022, pp. 1-3, DOI: 10.1109/ISSCC42614.2022.9731715. [CrossRef]
- J. Yue et al., "15.2 A 2.75-to-75.9TOPS/W Computing-in-Memory NN Processor Supporting Set-Associate Block-Wise Zero Skipping and Ping-Pong CIM with Simultaneous Computation and Weight Updating," 2021 IEEE International Solid- State Circuits Conference (ISSCC), 2021, pp. 238-240. DOI: 10.1109/ISSCC42613.2021.9365958. [CrossRef]
- J. Yue et al., "14.3 A 65nm Computing-in-Memory-Based CNN Processor with 2.9-to-35.8TOPS/W System Energy Efficiency Using Dynamic-Sparsity Performance-Scaling Architecture and Energy-Efficient Inter/Intra-Macro Data Reuse," 2020 IEEE International Solid- State Circuits Conference - (ISSCC), San Francisco, CA, USA, 2020, pp. 234-236, DOI: 10.1109/ISSCC19947.2020.9062958. [CrossRef]
- Y. Wang et al., "A 28nm 27.5TOPS/W Approximate-Computing-Based Transformer Processor with Asymptotic Sparsity Speculating and Out-of-Order Computing," 2022 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2022, pp. 1-3, DOI: 10.1109/ISSCC42614.2022.9731686. [CrossRef]
- J. -S. Park et al., "A Multi-Mode 8K-MAC HW-Utilization-Aware Neural Processing Unit with a Unified Multi-Precision Datapath in 4nm Flagship Mobile SoC," 2022 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2022, pp. 246-248, DOI: 10.1109/ISSCC42614.2022.9731639. [CrossRef]
- Matsubara et al., "4.2 A 12nm Autonomous-Driving Processor with 60.4TOPS, 13.8TOPS/W CNN Executed by Task-Separated ASIL D Control," 2021 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2021, pp. 56-58, DOI: 10.1109/ISSCC42613.2021.9365745. [CrossRef]
- Agrawal et al., "9.1 A 7nm 4-Core AI Chip with 25.6TFLOPS Hybrid FP8 Training, 102.4TOPS INT4 Inference and Workload-Aware Throttling," 2021 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2021, pp. 144-146, DOI: 10.1109/ISSCC42613.2021.9365791. [CrossRef]
- H. Mo et al., "9.2 A 28nm 12.1TOPS/W Dual-Mode CNN Processor Using Effective-Weight-Based Convolution and Error-Compensation-Based Prediction," 2021 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2021, pp. 146-148, doi: 10.1109/ISSCC42613.2021.9365943. [CrossRef]
- J. -S. Park et al., "9.5 A 6K-MAC Feature-Map-Sparsity-Aware Neural Processing Unit in 5nm Flagship Mobile SoC," 2021 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2021, pp. 152-154, DOI: 10.1109/ISSCC42613.2021.9365928. [CrossRef]
- R. Eki et al., "9.6 A 1/2.3inch 12.3Mpixel with On-Chip 4.97TOPS/W CNN Processor Back-Illuminated Stacked CMOS Image Sensor," 2021 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2021, pp. 154-156, DOI: 10.1109/ISSCC42613.2021.9365965. [CrossRef]
- C. -H. Lin et al., "7.1 A 3.4-to-13.3TOPS/W 3.6TOPS Dual-Core Deep-Learning Accelerator for Versatile AI Applications in 7nm 5G Smartphone SoC," 2020 IEEE International Solid- State Circuits Conference - (ISSCC), San Francisco, CA, USA, 2020, pp. 134-136, DOI: 10.1109/ISSCC19947.2020.9063111. [CrossRef]
- W. -H. Huang et al., "A Nonvolatile AI-Edge Processor with 4MB SLC-MLC Hybrid-Mode ReRAM Compute-in-Memory Macro and 51.4-251TOPS/W," 2023 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2023, pp. 15-17, DOI: 10.1109/ISSCC42615.2023.10067610. [CrossRef]
- T. Tambe et al., "22.9 A 12nm 18.1TFLOPs/W Sparse Transformer Processor with Entropy-Based Early Exit, Mixed-Precision Predication and Fine-Grained Power Management," 2023 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2023, pp. 342-344, doi: 10.1109/ISSCC42615.2023.10067817. [CrossRef]
- Y. -C. Chiu et al., "A 22nm 8Mb STT-MRAM Near-Memory-Computing Macro with 8b-Precision and 46.4-160.1TOPS/W for Edge-AI Devices," 2023 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2023, pp. 496-498, DOI: 10.1109/ISSCC42615.2023.10067563. [CrossRef]
- G. Desoli et al., "16.7 A 40-310TOPS/W SRAM-Based All-Digital Up to 4b In-Memory Computing Multi-Tiled NN Accelerator in FD-SOI 18nm for Deep-Learning Edge Applications," 2023 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2023, pp. 260-262, DOI: 10.1109/ISSCC42615.2023.10067422. [CrossRef]
- MemComputing, “MEMCPU”, Accessed on: Jul 1, 2023. Available: https://www.memcpu.com/.
- IniLabs, “IniLabs”, Accessed on: Jul 1, 2023, Available: https://inilabs.com/.
- Memryx, Accessed on: Aug 1, 2023, Available: https://memryx.com/products/.
- Tavanaei et al., “Deep learning in spiking neural networks”, Neural Networks, vol. 111, pp. 47-63, Mar. 2019. DOI: 10.1016/j.neunet.2018.12.002. [CrossRef]
- D. S. Modha et al., “IBM NorthPole neural inference machine”, HotChips Conference 2023, Aug 27-29, 2023, California, USA.
- S. Dhruvanarayan, V. Bittorf, “MLSoC™ – An Overview”, HotChips Conference 2023, California, USA, August 2023.
- SiMa.ai, Accessed on: Sept. 3, 2023. Available: https://sima.ai/.
- Z. Tan, Y. Wu, Y. Zhang, H. Shi, W. Zhang, K. Ma, “A scalable multi-chiplet deep learning accelerator with hub-side 2.5D heterogeneous integration”, HotChips Conference 2023, California, USA, August 2023.
- E. Mahurin, “Qualcomm Hexagon NPU”, HotChips Conference 2023, California, USA, August 2023.
- D. S. Modha et al., “Neural Inference at the Frontier of Energy, Space, and Time”, Science, vol. 382, pp. 329-335, October 2023. DOI: 10.1126/science.adh1174.
- B. Dally, “Hardware for Deep Learning”, NVIDIA Corporation, HotChips Conference 2023, California, USA, August 2023.
- J. H. Kim, Y. Ro, J. So, S. Lee, S.-H. Kang, Y. Cho, H. Kim, B. Kim, K. Kim, S. Park, J.-S. Kim, S. Cha, W.-J. Lee, J. Jung, J.-G. Lee, J. Lee, J. H. Song, S. Lee, J. Cho, J. Yu, and K. Sohn, “Samsung PIM/PNM for Transformer-based AI: Energy Efficiency on PIM/PNM Cluster”, HotChips Conference 2023, California, USA, August 2023.
- Ambarella, Accessed on: Mar 5, 2024. Available: https://www.ambarella.com/products/iot-industrial-robotics/.
- W-S Khwa, P-C Wu, J-J Wu, J-W Su, H-Y Chen, Z-E Ke, T-C Chiu, J-M Hsu, C-Y Cheng, Y-C Chen, C-C Lo, R-S Liu, C-C Hsieh, K-T Tang, M-F Chang, "A 16nm 96Kb Integer/Floating-Point Dual-Mode Gain-Cell Computing-in-Memory Macro Achieving 73.3-163.3TOPS/W and 33.2-91.2TFLOPS/W for AI-Edge Devices," 2024 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2024.
- M-E Shih, S-W Hsieh, P-Y Tsai, M-H Lin, P-K Tsung, E-J Chang, J Liang, S-H Chang, C-L Huang, Y-Y Nian, Z Wan, S Kumar, C-X Xue, G Jedhe, H Fujiwara, H Mori, C-W Chen, P-H Huang, C-F Juan, C-Y Chen, T-Y Lin, C-H Wang, C-C Chen, K Jou, "NVE: A 3nm 23.2TOPS/W 12b-Digital-CIM-Based Neural Engine for High-Resolution Visual-Quality Enhancement on Smart Devices," 2024 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2024.
- K Nose, T Fujii, K Togawa, S Okumura, K Mikami, D Hayashi, T Tanaka, T Toi, "A 23.9TOPS/W @ 0.8V, 130TOPS AI Accelerator with 16× Performance-Accelerable Pruning in 14nm Heterogeneous Embedded MPU for Real-Time Robot Applications," 2024 IEEE International Solid- State Circuits Conference (ISSCC), San Francisco, CA, USA, 2024.













| Architecture | Precision | Process (nm) | Metrics | Frameworks | Algorithm/Models | Applications |
|---|---|---|---|---|---|---|
| GPU, TPU, Neuromorphic, PIM, SoC, ASIC | FP-8/16/32, BF-16, INT-1/2/4/8/16 | 4, 5, 7, 10, 14, 16, 20, 22, 28, 40 | Area, Power, Throughput, Energy Efficiency | TensorFlow (TF), TF Lite, Caffe2, PyTorch, MXNet, ONNX, MetaTF, Lava, Nengo, OpenCV, DarkNet | SNN, MLP, CNN, VGG, ResNet, YOLO, Inception, MobileNet, RNN, GRU, BERT, LSTM | Defense, Healthcare, Cyber Security, Vehicles, Smartphones, Transportation, Robotics, Education, UAVs/Drones, Communication, Industry, Traffic Control |
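In practice, the frameworks and integer precisions summarized above are combined through post-training quantization before a model is handed to an edge accelerator. The following is a minimal, illustrative sketch using the standard TensorFlow Lite converter API to produce a fully INT8-quantized model; the SavedModel path `my_edge_model`, the 224×224×3 input shape, and the random calibration images are placeholder assumptions, not artifacts of any specific processor discussed in this survey.

```python
# Hedged sketch: post-training INT8 quantization with TensorFlow Lite.
# Assumes a trained SavedModel exists at "my_edge_model" with a
# (1, 224, 224, 3) float32 input; calibration data here is random stand-in data.
import numpy as np
import tensorflow as tf

saved_model_dir = "my_edge_model"  # hypothetical SavedModel path
rep_images = np.random.rand(100, 224, 224, 3).astype(np.float32)  # stand-in calibration set

def representative_dataset():
    # Yield calibration samples so the converter can estimate
    # activation ranges for INT8 quantization.
    for img in rep_images:
        yield [img[np.newaxis, ...]]

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Restrict ops to INT8 kernels so the model maps onto integer-only NPUs.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("my_edge_model_int8.tflite", "wb") as f:
    f.write(tflite_model)
```

Integer-only edge accelerators and their vendor toolchains generally expect a fully quantized model of this form; operations left in floating point would otherwise fall back to the host CPU.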
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).