Version 1
: Received: 7 July 2022 / Approved: 8 July 2022 / Online: 8 July 2022 (10:40:18 CEST)
Version 2
: Received: 8 June 2023 / Approved: 9 June 2023 / Online: 9 June 2023 (08:57:55 CEST)
Version 3
: Received: 26 October 2023 / Approved: 27 October 2023 / Online: 30 October 2023 (05:54:50 CET)
Version 4
: Received: 24 May 2024 / Approved: 27 May 2024 / Online: 27 May 2024 (05:47:24 CEST)
How to cite:
Zhang, Y.; Zhao, J.; Wu, W.; Muscoloni, A.; Cannistraci, C. V. Ultra-Sparse Network Advantage in Deep Learning via Cannistraci-Hebb Brain-Inspired Training With Hyperbolic Meta-Deep Community-Layered Epitopology. Preprints 2022, 2022070139. https://doi.org/10.20944/preprints202207.0139.v3
APA Style
Zhang, Y., Zhao, J., Wu, W., Muscoloni, A., & Cannistraci, C. V. (2023). Ultra-Sparse Network Advantage in Deep Learning via Cannistraci-Hebb Brain-Inspired Training With Hyperbolic Meta-Deep Community-Layered Epitopology. Preprints. https://doi.org/10.20944/preprints202207.0139.v3
Chicago/Turabian Style
Zhang, Y., Alessandro Muscoloni and Carlo Vittorio Cannistraci. 2023 "Ultra-Sparse Network Advantage in Deep Learning via Cannistraci-Hebb Brain-Inspired Training With Hyperbolic Meta-Deep Community-Layered Epitopology" Preprints. https://doi.org/10.20944/preprints202207.0139.v3
Abstract
Sparse training (ST) aims to ameliorate deep learning by replacing fully connected artificial neural networks (ANNs) with sparse or ultra-sparse ones, as brain networks are; therefore, it might be beneficial to borrow brain-inspired learning paradigms from complex network intelligence theory. Here, we launch the ultra-sparse advantage challenge, whose goal is to offer evidence on the extent to which ultra-sparse (around 1% of connections retained) topologies can achieve any learning advantage over fully connected ones. Epitopological learning is a field of network science and complex network intelligence that studies how to implement learning on complex networks by changing the shape of their connectivity structure (epitopological plasticity). One way to implement Epitopological (epi- means new) Learning is via link prediction: predicting the likelihood of nonobserved links to appear in the network. Cannistraci-Hebb learning theory inspired the CH3-L3 network automata rule, which is effective for general-purpose link prediction. Here, starting from CH3-L3, we propose Epitopological Sparse Meta-deep Learning (ESML) to apply Epitopological Learning to sparse training. In empirical experiments, we find that ESML learns ANNs with an ultra-sparse hyperbolic (epi-)topology in which a community layer organization emerges that is meta-deep (meaning that each layer also has an internal depth due to a power-law node hierarchy). Furthermore, we discover that ESML can in many cases automatically sparsify the neurons during training (leaving as few as 30% of the neurons in hidden layers); this process of dynamic node removal is called percolation. Starting from this network science evidence, we design Cannistraci-Hebb training (CHT), a 4-step training methodology that puts ESML at its heart. We conduct experiments on 6 datasets and 3 network structures (MLPs, VGG16, ResNet50), comparing CHT to state-of-the-art (SOTA) dynamic sparse training algorithms and to fully connected networks.
The results indicate that, with a mere 1% of links retained during training, CHT surpasses fully connected networks on VGG16 and ResNet50. This key finding is evidence of an ultra-sparse advantage and marks a milestone in deep learning. CHT acts akin to a gradient-free oracle that adopts CH3-L3-based epitopological learning to guide the placement of new links in the ultra-sparse network topology, facilitating sparse-weight gradient learning, which in turn reduces the convergence time of ultra-sparse training. Finally, CHT offers the first examples of parsimony dynamic sparse training because, on many datasets, it can retain network performance by percolating and significantly reducing the node network size.
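As a rough illustration of gradient-free link regrowth by link prediction, the sketch below scores each nonobserved link of a sparse bipartite layer by the number of length-three paths joining its two end nodes, then adds the top-scoring links. This is a simplified stand-in and not the CH3-L3 rule itself, whose exact formula is not given in this abstract; the function names `l3_scores` and `regrow`, the 0/1 mask representation, and the tie-breaking order are all assumptions made for the example.

```python
def l3_scores(mask):
    """Length-three walk counts between every input i and output j.

    mask[i][j] is 1 if input i is linked to output j, else 0.
    For a nonobserved link (mask[i][j] == 0) the value equals the number
    of simple paths i - j' - i' - j, which is what we rank on.
    """
    n_in, n_out = len(mask), len(mask[0])
    # shared[i][k]: number of output nodes that inputs i and k both link to
    shared = [[sum(mask[i][j] * mask[k][j] for j in range(n_out))
               for k in range(n_in)] for i in range(n_in)]
    return [[sum(shared[i][k] * mask[k][j] for k in range(n_in))
             for j in range(n_out)] for i in range(n_in)]


def regrow(mask, k):
    """Return a copy of mask with the k best-scoring nonobserved links added."""
    scores = l3_scores(mask)
    candidates = [(scores[i][j], i, j)
                  for i in range(len(mask))
                  for j in range(len(mask[0]))
                  if mask[i][j] == 0]
    # Highest score first; ties broken by (i, j) order (an arbitrary choice)
    candidates.sort(key=lambda t: (-t[0], t[1], t[2]))
    new_mask = [row[:] for row in mask]
    for _, i, j in candidates[:k]:
        new_mask[i][j] = 1
    return new_mask


# Hypothetical 3x3 layer topology, used only to exercise the functions
layer_mask = [[1, 1, 0],
              [0, 1, 1],
              [0, 0, 1]]
grown_mask = regrow(layer_mask, 1)
```

In a dynamic sparse training loop, a step like `regrow` would typically follow a link-removal step, so that the total number of links stays fixed; the scoring uses only the topology and no gradient information.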
Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Received:
30 October 2023
Commenter:
Carlo Vittorio Cannistraci
Commenter's Conflict of Interests:
Author
Comment: We revised the title. We added more datasets and tests. We present the first evidence for ultra-sparse advantage in deep artificial neural networks.