arXiv:2411.16061
Scaling Spike-driven Transformer with Efficient Spike Firing Approximation Training
Authors: Man Yao, Xuerui Qiu, Tianxiang Hu, Jiakui Hu, Yuhong Chou, Keyu Tian, Jianxing Liao, Luziwei Leng, Bo Xu, Guoqi Li
Submitted 24 November, 2024; originally announced November 2024.
Abstract: The ambition of brain-inspired Spiking Neural Networks (SNNs) is to become a low-power alternative to traditional Artificial Neural Networks (ANNs). This work addresses two major challenges in realizing this vision: the performance gap between SNNs and ANNs, and the high training costs of SNNs. We identify intrinsic flaws in spiking neurons caused by binary firing mechanisms and propose a Spike Firing Approximation (SFA) method using integer training and spike-driven inference. This optimizes the spike firing pattern of spiking neurons, enhancing efficient training, reducing power consumption, improving performance, enabling easier scaling, and better utilizing neuromorphic chips. We also develop an efficient spike-driven Transformer architecture and a spike-masked autoencoder to prevent performance degradation during SNN scaling. On ImageNet-1k, we achieve state-of-the-art top-1 accuracy of 78.5%, 79.8%, 84.0%, and 86.2% with models containing 10M, 19M, 83M, and 173M parameters, respectively. For instance, the 10M model outperforms the best existing SNN by 7.2% on ImageNet, with training time acceleration and inference energy efficiency improved by 4.5$\times$ and 3.9$\times$, respectively. We validate the effectiveness and efficiency of the proposed method across various tasks, including object detection, semantic segmentation, and neuromorphic vision tasks. This work enables SNNs to match ANN performance while maintaining the low-power advantage, marking a significant step towards SNNs as a general visual backbone.
arXiv:2410.18580
Spatial-Temporal Search for Spiking Neural Networks
Authors: Kaiwei Che, Zhaokun Zhou, Li Yuan, Jianguo Zhang, Yonghong Tian, Luziwei Leng
Submitted 24 October, 2024; originally announced October 2024.
Abstract: Spiking Neural Networks (SNNs) are considered a potential candidate for the next generation of artificial intelligence, with appealing characteristics such as sparse computation and inherent temporal dynamics. By adopting architectures of Artificial Neural Networks (ANNs), SNNs achieve competitive performance on benchmark tasks like image classification. However, successful architectures of ANNs are not optimal for SNNs. In this work, we apply Neural Architecture Search (NAS) to find suitable architectures for SNNs. Previous NAS methods for SNNs focus primarily on the spatial dimension, with a notable lack of consideration for the temporal dynamics that are of critical importance for SNNs. Drawing inspiration from the heterogeneity of biological neural networks, we propose a differentiable approach to optimize SNNs on both spatial and temporal dimensions. At the spatial level, we have developed a spike-based differentiable hierarchical search (SpikeDHS) framework, where spike-based operation is optimized on both the cell and the layer level under computational constraints. We further propose a differentiable surrogate gradient search (DGS) method to evolve local SG functions independently during training. At the temporal level, we explore an optimal configuration of diverse temporal dynamics on different types of spiking neurons by evolving their time constants, based on which we further develop hybrid networks combining SNN and ANN, balancing both accuracy and efficiency. Our methods achieve comparable classification performance on CIFAR10/100 and ImageNet with accuracies of 96.43%, 78.96%, and 70.21%, respectively.
arXiv:2410.17268
SPikE-SSM: A Sparse, Precise, and Efficient Spiking State Space Model for Long Sequences Learning
Authors: Yan Zhong, Ruoyu Zhao, Chao Wang, Qinghai Guo, Jianguo Zhang, Zhichao Lu, Luziwei Leng
Submitted 7 October, 2024; originally announced October 2024.
Comments: 23 pages, 5 figures
Abstract: Spiking neural networks (SNNs) provide an energy-efficient solution by utilizing the spike-based and sparse nature of biological systems. Since the advent of Transformers, SNNs have struggled to compete with artificial networks on long sequential tasks, until the recent emergence of state space models (SSMs), which offer superior computational efficiency and modeling capability. However, applying the highly capable SSMs to SNNs for long sequences learning poses three major challenges: (1) The membrane potential is determined by the past spiking history of the neuron, leading to reduced efficiency for sequence modeling in parallel computing scenarios. (2) Complex dynamics of biological spiking neurons are crucial for functionality but challenging to simulate and exploit effectively in large networks. (3) It is arduous to maintain high sparsity while achieving high accuracy for spiking neurons without resorting to dense computing, as utilized in artificial neuron-based SSMs. To address these, we propose a sparse, precise and efficient spiking SSM framework, termed SPikE-SSM. For (1), we propose a boundary compression strategy (PMBC) to accelerate the inference of the spiking neuron model, enabling parallel processing for long sequence learning. For (2), we propose a novel and concise neuron model incorporating a reset-refractory mechanism to leverage the inherent temporal dimension for dynamic computing with biological interpretability. For (3), we hierarchically integrate the proposed neuron model into the original SSM block, and enhance the dynamics of SPikE-SSM by incorporating trainable thresholds and refractory magnitudes to balance accuracy and sparsity.
arXiv:2408.14909
SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models
Authors: Shuaijie Shen, Chao Wang, Renzhuo Huang, Yan Zhong, Qinghai Guo, Zhichao Lu, Jianguo Zhang, Luziwei Leng
Submitted 27 August, 2024; originally announced August 2024.
Abstract: Known as low energy consumption networks, spiking neural networks (SNNs) have gained a lot of attention within the past decades. While SNNs are increasingly competitive with artificial neural networks (ANNs) for vision tasks, they are rarely used for long sequence tasks, despite their intrinsic temporal dynamics. In this work, we develop spiking state space models (SpikingSSMs) for long sequence learning by leveraging the sequence learning abilities of state space models (SSMs). Inspired by dendritic neuron structure, we hierarchically integrate neuronal dynamics with the original SSM block, meanwhile realizing sparse synaptic computation. Furthermore, to solve the conflict of event-driven neuronal dynamics with parallel computing, we propose a light-weight surrogate dynamic network which accurately predicts the after-reset membrane potential and is compatible with learnable thresholds, enabling orders of acceleration in training speed compared with conventional iterative methods. On the long range arena benchmark task, SpikingSSM achieves performance competitive with state-of-the-art SSMs while realizing on average 90% network sparsity.
arXiv:2408.08188
Scaling Up Natural Language Understanding for Multi-Robots Through the Lens of Hierarchy
Authors: Shaojun Xu, Xusheng Luo, Yutong Huang, Letian Leng, Ruixuan Liu, Changliu Liu
Submitted 15 August, 2024; originally announced August 2024.
Abstract: Long-horizon planning is hindered by challenges such as uncertainty accumulation, computational complexity, delayed rewards and incomplete information. This work proposes an approach to exploit the task hierarchy from human instructions to facilitate multi-robot planning. Using Large Language Models (LLMs), we propose a two-step approach to translate multi-sentence instructions into a structured language, Hierarchical Linear Temporal Logic (LTL), which serves as a formal representation for planning. Initially, LLMs transform the instructions into a hierarchical representation defined as Hierarchical Task Tree, capturing the logical and temporal relations among tasks. Following this, a domain-specific fine-tuning of LLM translates sub-tasks of each task into flat LTL formulas, aggregating them to form hierarchical LTL specifications. These specifications are then leveraged for planning using off-the-shelf planners. Our framework not only bridges the gap between instructions and algorithmic planning but also showcases the potential of LLMs in harnessing hierarchical reasoning to automate multi-robot task planning. Through evaluations in both simulation and real-world experiments involving human participants, we demonstrate that our method can handle more complex instructions compared to existing methods. The results indicate that our approach achieves higher success rates and lower costs in multi-robot task allocation and plan generation.
arXiv:2408.00280
Towards Scalable GPU-Accelerated SNN Training via Temporal Fusion
Authors: Yanchen Li, Jiachun Li, Kebin Sun, Luziwei Leng, Ran Cheng
Submitted 1 August, 2024; originally announced August 2024.
Comments: International Conference on Artificial Neural Networks (ICANN) 2024
Abstract: Drawing on the intricate structures of the brain, Spiking Neural Networks (SNNs) emerge as a transformative development in artificial intelligence, closely emulating the complex dynamics of biological neural networks. While SNNs show promising efficiency on specialized sparse-computational hardware, their practical training often relies on conventional GPUs. This reliance frequently leads to extended computation times when contrasted with traditional Artificial Neural Networks (ANNs), presenting significant hurdles for advancing SNN research. To navigate this challenge, we present a novel temporal fusion method, specifically designed to expedite the propagation dynamics of SNNs on GPU platforms, which serves as an enhancement to the current significant approaches for handling deep learning tasks with SNNs. This method underwent thorough validation through extensive experiments in both authentic training scenarios and idealized conditions, confirming its efficacy and adaptability for single and multi-GPU systems. Benchmarked against various existing SNN libraries/implementations, our method achieved accelerations ranging from $5\times$ to $40\times$ on NVIDIA A100 GPUs.
arXiv:2406.12552
Evolutionary Spiking Neural Networks: A Survey
Authors: Shuaijie Shen, Rui Zhang, Chao Wang, Renzhuo Huang, Aiersi Tuerhong, Qinghai Guo, Zhichao Lu, Jianguo Zhang, Luziwei Leng
Submitted 18 June, 2024; originally announced June 2024.
Journal ref: J Membr Comput (2024)
Abstract: Spiking neural networks (SNNs) are gaining increasing attention as potential computationally efficient alternatives to traditional artificial neural networks (ANNs). However, the unique information propagation mechanisms and the complexity of SNN neuron models pose challenges for adopting traditional methods developed for ANNs to SNNs. These challenges include both weight learning and architecture design. While surrogate gradient learning has shown some success in addressing the former challenge, the latter remains relatively unexplored. Recently, a novel paradigm utilizing evolutionary computation methods has emerged to tackle these challenges. This approach has resulted in the development of a variety of energy-efficient and high-performance SNNs across a wide range of machine learning benchmarks.
arXiv:2406.06626
Benchmarking Neural Decoding Backbones towards Enhanced On-edge iBCI Applications
Authors: Zhou Zhou, Guohang He, Zheng Zhang, Luziwei Leng, Qinghai Guo, Jianxing Liao, Xuan Song, Ran Cheng
Submitted 7 June, 2024; originally announced June 2024.
Abstract: Traditional invasive Brain-Computer Interfaces (iBCIs) typically depend on neural decoding processes conducted on workstations within laboratory settings, which prevents their everyday usage. Implementing these decoding processes on edge devices, such as wearables, introduces considerable challenges related to computational demands, processing speed, and maintaining accuracy. This study seeks to identify an optimal neural decoding backbone that boasts robust performance and swift inference capabilities suitable for edge deployment. We executed a series of neural decoding experiments involving nonhuman primates engaged in random reaching tasks, evaluating four prospective models, Gated Recurrent Unit (GRU), Transformer, Receptance Weighted Key Value (RWKV), and Selective State Space model (Mamba), across several metrics: single-session decoding, multi-session decoding, new session fine-tuning, inference speed, calibration speed, and scalability. The findings indicate that although the GRU model delivers sufficient accuracy, the RWKV and Mamba models are preferable due to their superior inference and calibration speeds. Additionally, RWKV and Mamba comply with the scaling law, demonstrating improved performance with larger data sets and increased model sizes, whereas GRU shows less pronounced scalability, and the Transformer model requires computational resources that scale prohibitively. This paper presents a thorough comparative analysis of the four models in various scenarios. The results are pivotal in pinpointing an optimal backbone that can handle increasing data volumes and is viable for edge implementation.
arXiv:2309.08892
Pour me a drink: Robotic Precision Pouring Carbonated Beverages into Transparent Containers
Authors: Feiya Zhu, Shuo Hu, Letian Leng, Alison Bartsch, Abraham George, Amir Barati Farimani
Submitted 19 September, 2023; v1 submitted 16 September, 2023; originally announced September 2023.
Comments: Supplementary materials will be available soon
Abstract: With the growing emphasis on the development and integration of service robots within household environments, we will need to endow robots with the ability to reliably pour a variety of liquids. However, liquid handling and pouring is a challenging task due to the complex dynamics and varying properties of different liquids, the exacting precision required to prevent spills and ensure accurate pouring, and the necessity for robots to adapt seamlessly to a multitude of containers in real-world scenarios. In response to these challenges, we propose a novel autonomous robotics pipeline that empowers robots to execute precision pouring tasks, encompassing both carbonated and non-carbonated liquids, as well as opaque and transparent liquids, into a variety of transparent containers. Our proposed approach maximizes the potential of RGB input alone, achieving zero-shot capability by harnessing existing pre-trained vision segmentation models. This eliminates the need for additional data collection, manual image annotations, or extensive training. Furthermore, our work integrates ChatGPT, facilitating seamless interaction between individuals without prior expertise in robotics and our pouring pipeline; this integration enables users to effortlessly request and execute pouring actions.
arXiv:2308.09946
Weakly-Supervised Action Localization by Hierarchically-structured Latent Attention Modeling
Authors: Guiqin Wang, Peng Zhao, Cong Zhao, Shusen Yang, Jie Cheng, Luziwei Leng, Jianxing Liao, Qinghai Guo
Submitted 25 September, 2023; v1 submitted 19 August, 2023; originally announced August 2023.
Comments: Accepted to ICCV 2023. arXiv admin note: text overlap with arXiv:2203.15187, arXiv:2003.12424, arXiv:2104.02967 by other authors
Abstract: Weakly-supervised action localization aims to recognize and localize action instances in untrimmed videos with only video-level labels. Most existing models rely on multiple instance learning (MIL), where the predictions of unlabeled instances are supervised by classifying labeled bags. The MIL-based methods are relatively well studied with cogent performance achieved on classification but not on localization. Generally, they locate temporal regions by the video-level classification but overlook the temporal variations of feature semantics. To address this problem, we propose a novel attention-based hierarchically-structured latent model to learn the temporal variations of feature semantics. Specifically, our model entails two components: the first is an unsupervised change-points detection module that detects change-points by learning the latent representations of video features in a temporal hierarchy based on their rates of change, and the second is an attention-based classification model that selects the change-points of the foreground as the boundaries. To evaluate the effectiveness of our model, we conduct extensive experiments on two benchmark datasets, THUMOS-14 and ActivityNet-v1.3.
arXiv:2308.00451
Physics-Driven Spectrum-Consistent Federated Learning for Palmprint Verification
Authors: Ziyuan Yang, Andrew Beng Jin Teoh, Bob Zhang, Lu Leng, Yi Zhang
Submitted 1 August, 2023; originally announced August 2023.
Abstract: Palmprint as biometrics has gained increasing attention recently due to its discriminative ability and robustness. However, existing methods mainly improve palmprint verification within one spectrum, which is challenging to verify across different spectrums. Additionally, in distributed server-client-based deployment, palmprint verification systems predominantly necessitate clients to transmit private data for model training on the centralized server, thereby engendering privacy apprehensions. To alleviate the above issues, in this paper, we propose a physics-driven spectrum-consistent federated learning method for palmprint verification, dubbed PSFed-Palm. PSFed-Palm draws upon the inherent physical properties of distinct wavelength spectrums, wherein images acquired under similar wavelengths display heightened resemblances. Our approach first partitions clients into short- and long-spectrum groups according to the wavelength range of their local spectrum images. Subsequently, we introduce anchor models for short- and long-spectrum, which constrain the optimization directions of local models associated with long- and short-spectrum images. Specifically, a spectrum-consistent loss that enforces the model parameters and feature representation to align with their corresponding anchor models is designed. Finally, we impose constraints on the local models to ensure their consistency with the global model, effectively preventing model drift. This measure guarantees spectrum consistency while protecting data privacy, as there is no need to share local data. Extensive experiments are conducted to validate the efficacy of our proposed PSFed-Palm approach. The proposed PSFed-Palm demonstrates compelling performance despite only a limited amount of training data.
arXiv:2307.12900
Automotive Object Detection via Learning Sparse Events by Spiking Neurons
Authors: Hu Zhang, Yanchen Li, Luziwei Leng, Kaiwei Che, Qian Liu, Qinghai Guo, Jianxing Liao, Ran Cheng
Submitted 10 June, 2024; v1 submitted 24 July, 2023; originally announced July 2023.
Comments: IEEE Transactions on Cognitive and Developmental Systems
Abstract: Event-based sensors, distinguished by their high temporal resolution of 1 $\mu\text{s}$ and a dynamic range of 120 $\text{dB}$, stand out as ideal tools for deployment in fast-paced settings like vehicles and drones. Traditional object detection techniques that utilize Artificial Neural Networks (ANNs) face challenges due to the sparse and asynchronous nature of the events these sensors capture. In contrast, Spiking Neural Networks (SNNs) offer a promising alternative, providing a temporal representation that is inherently aligned with event-based data. This paper explores the unique membrane potential dynamics of SNNs and their ability to modulate sparse events. We introduce an innovative spike-triggered adaptive threshold mechanism designed for stable training. Building on these insights, we present a specialized spiking feature pyramid network (SpikeFPN) optimized for automotive event-based object detection. Comprehensive evaluations demonstrate that SpikeFPN surpasses both traditional SNNs and advanced ANNs enhanced with attention mechanisms. Evidently, SpikeFPN achieves a mean Average Precision (mAP) of 0.477 on the GEN1 Automotive Detection (GAD) benchmark dataset, marking significant increases over the selected SNN baselines. Moreover, the efficient design of SpikeFPN ensures robust performance while optimizing computational resources, attributed to its innate sparse computation capabilities.
arXiv:2306.12465
Efficient Deep Spiking Multi-Layer Perceptrons with Multiplication-Free Inference
Authors: Boyan Li, Luziwei Leng, Shuaijie Shen, Kaixuan Zhang, Jianguo Zhang, Jianxing Liao, Ran Cheng
Submitted 26 April, 2024; v1 submitted 21 June, 2023; originally announced June 2023.
Comments: IEEE TNNLS
Abstract: Advancements in adapting deep convolution architectures for Spiking Neural Networks (SNNs) have significantly enhanced image classification performance and reduced computational burdens. However, the inability of Multiplication-Free Inference (MFI) to align with attention and transformer mechanisms, which are critical to superior performance on high-resolution vision tasks, imposes limitations on these gains. To address this, our research explores a new pathway, drawing inspiration from the progress made in Multi-Layer Perceptrons (MLPs). We propose an innovative spiking MLP architecture that uses batch normalization to retain MFI compatibility and introduces a spiking patch encoding layer to enhance local feature extraction capabilities. As a result, we establish an efficient multi-stage spiking MLP network that effectively blends global receptive fields with local feature extraction for comprehensive spike-based computation. Without relying on pre-training or sophisticated SNN training techniques, our network secures a top-1 accuracy of 66.39% on the ImageNet-1K dataset, surpassing the directly trained spiking ResNet-34 by 2.67%. Furthermore, we curtail computational costs, model parameters, and simulation steps. An expanded version of our network is comparable in performance to the spiking VGG-16 network, with a 71.64% top-1 accuracy, all while operating with a model capacity 2.1 times smaller. Our findings highlight the potential of our deep SNN architecture in effectively integrating global and local learning abilities. Interestingly, the trained receptive field in our network mirrors the activity patterns of cortical cells.
arXiv:2305.00044
Hedonic Prices and Quality Adjusted Price Indices Powered by AI
Authors: Patrick Bajari, Zhihao Cen, Victor Chernozhukov, Manoj Manukonda, Suhas Vijaykumar, Jin Wang, Ramon Huerta, Junbo Li, Ling Leng, George Monokroussos, Shan Wan
Submitted 28 April, 2023; originally announced May 2023.
Comments: Revised CEMMAP Working Paper (CWP08/23)
Abstract: Accurate, real-time measurements of price index changes using electronic records are essential for tracking inflation and productivity in today's economic environment. We develop empirical hedonic models that can process large amounts of unstructured product data (text, images, prices, quantities) and output accurate hedonic price estimates and derived indices. To accomplish this, we generate abstract product attributes, or "features," from text descriptions and images using deep neural networks, and then use these attributes to estimate the hedonic price function. Specifically, we convert textual information about the product to numeric features using large language models based on transformers, trained or fine-tuned using product descriptions, and convert the product image to numeric features using a residual network model. To produce the estimated hedonic price function, we again use a multi-task neural network trained to predict a product's price in all time periods simultaneously. To demonstrate the performance of this approach, we apply the models to Amazon's data for first-party apparel sales and estimate hedonic prices. The resulting models have high predictive accuracy, with $R^2$ ranging from $80\%$ to $90\%$. Finally, we construct the AI-based hedonic Fisher price index, chained at the year-over-year frequency.
arXiv:2304.11857
Accurate and Efficient Event-based Semantic Segmentation Using Adaptive Spiking Encoder-Decoder Network
Authors: Rui Zhang, Luziwei Leng, Kaiwei Che, Hu Zhang, Jie Cheng, Qinghai Guo, Jiangxing Liao, Ran Cheng
Submitted 2 August, 2024; v1 submitted 24 April, 2023; originally announced April 2023.
Comments: Accepted for publication in IEEE Transactions on Neural Networks and Learning Systems
Abstract: Spiking neural networks (SNNs), known for their low-power, event-driven computation and intrinsic temporal dynamics, are emerging as promising solutions for processing dynamic, asynchronous signals from event-based sensors. Despite their potential, SNNs face challenges in training and architectural design, resulting in limited performance in challenging event-based dense prediction tasks compared to artificial neural networks (ANNs). In this work, we develop an efficient spiking encoder-decoder network (SpikingEDN) for large-scale event-based semantic segmentation tasks. To enhance the learning efficiency from dynamic event streams, we harness the adaptive threshold which improves network accuracy, sparsity and robustness in streaming inference. Moreover, we develop a dual-path Spiking Spatially-Adaptive Modulation module, which is specifically tailored to enhance the representation of sparse events and multi-modal inputs, thereby considerably improving network performance. Our SpikingEDN attains a mean intersection over union (MIoU) of 72.57% on the DDD17 dataset and 58.32% on the larger DSEC-Semantic dataset, showing competitive results to the state-of-the-art ANNs while requiring substantially fewer computational resources. Our results shed light on the untapped potential of SNNs in event-based vision applications.
arXiv:2303.00914
Neuro-Modulated Hebbian Learning for Fully Test-Time Adaptation
Authors: Yushun Tang, Ce Zhang, Heng Xu, Shuoshuo Chen, Jie Cheng, Luziwei Leng, Qinghai Guo, Zhihai He
Submitted 10 March, 2023; v1 submitted 1 March, 2023; originally announced March 2023.
Comments: CVPR2023 accepted
Abstract: Fully test-time adaptation aims to adapt the network model based on sequential analysis of input samples during the inference stage to address the cross-domain performance degradation problem of deep neural networks. We take inspiration from the biological plausibility learning where the neuron responses are tuned based on a local synapse-change procedure and activated by competitive lateral inhibition rules. Based on these feed-forward learning rules, we design a soft Hebbian learning process which provides an unsupervised and effective mechanism for online adaptation. We observe that the performance of this feed-forward Hebbian learning for fully test-time adaptation can be significantly improved by incorporating a feedback neuro-modulation layer. It is able to fine-tune the neuron responses based on the external feedback generated by the error back-propagation from the top inference layers. This leads to our proposed neuro-modulated Hebbian learning (NHL) method for fully test-time adaptation. With the unsupervised feed-forward soft Hebbian learning being combined with a learned neuro-modulator to capture feedback from external responses, the source model can be effectively adapted during the testing process.
arXiv:2212.13466
General GAN-generated image detection by data augmentation in fingerprint domain
Authors: Huaming Wang, Jianwei Fei, Yunshu Dai, Lingyun Leng, Zhihua Xia
Submitted 9 April, 2023; v1 submitted 27 December, 2022; originally announced December 2022.
Abstract: In this work, we investigate improving the generalizability of GAN-generated image detectors by performing data augmentation in the fingerprint domain. Specifically, we first separate the fingerprints and contents of the GAN-generated images using an autoencoder based GAN fingerprint extractor, followed by random perturbations of the fingerprints. Then the original fingerprints are substituted with the perturbed fingerprints and added to the original contents, to produce images that are visually invariant but with distinct fingerprints. The perturbed images can successfully imitate images generated by different GANs to improve the generalization of the detectors, which is demonstrated by the spectra visualization. To our knowledge, we are the first to conduct data augmentation in the fingerprint domain. Our work explores a novel prospect that is distinct from previous works on spatial and frequency domain augmentation.
arXiv:2105.14422
Periodic-GP: Learning Periodic World with Gaussian Process Bandits
Authors: Hengrui Cai, Zhihao Cen, Ling Leng, Rui Song
Submitted 8 June, 2021; v1 submitted 29 May, 2021; originally announced May 2021.
Abstract: We consider the sequential decision optimization on the periodic environment, that occurs in a wide variety of real-world applications when the data involves seasonality, such as the daily demand of drivers in ride-sharing and dynamic traffic patterns in transportation. In this work, we focus on learning the stochastic periodic world by leveraging this seasonal law. To deal with the general action space, we use the bandit based on Gaussian process (GP) as the base model due to its flexibility and generality, and propose the Periodic-GP method with a temporal periodic kernel based on the upper confidence bound. Theoretically, we provide a new regret bound of the proposed method, by explicitly characterizing the periodic kernel in the periodic stationary model.
arXiv:2006.11099
Cortical oscillations implement a backbone for sampling-based computation in spiking neural networks
Authors: Agnes Korcsak-Gorzo, Michael G. Müller, Andreas Baumbach, Luziwei Leng, Oliver Julien Breitwieser, Sacha J. van Albada, Walter Senn, Karlheinz Meier, Robert Legenstein, Mihai A. Petrovici
Submitted 4 April, 2022; v1 submitted 19 June, 2020; originally announced June 2020.
Comments: 34 pages, 9 figures
Journal ref: PLoS Comput Biol 18(3): e1009753 (2022)
Abstract: Being permanently confronted with an uncertain world, brains have faced evolutionary pressure to represent this uncertainty in order to respond appropriately. Often, this requires visiting multiple interpretations of the available information or multiple solutions to an encountered problem. This gives rise to the so-called mixing problem: since all of these "valid" states represent powerful attractors, but between themselves can be very dissimilar, switching between such states can be difficult. We propose that cortical oscillations can be effectively used to overcome this challenge. By acting as an effective temperature, background spiking activity modulates exploration. Rhythmic changes induced by cortical oscillations can then be interpreted as a form of simulated tempering. We provide a rigorous mathematical discussion of this link and study some of its phenomenological implications in computer simulations.
arXiv:2002.01751
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making
Authors: Chengchun Shi, Runzhe Wan, Rui Song, Wenbin Lu, Ling Leng
Submitted 5 February, 2020; originally announced February 2020.
Abstract: The Markov assumption (MA) is fundamental to the empirical validity of reinforcement learning. In this paper, we propose a novel Forward-Backward Learning procedure to test MA in sequential decision making. The proposed test does not assume any parametric form on the joint distribution of the observed data and plays an important role for identifying the optimal policy in high-order Markov decision processes and partially observable MDPs.
arXiv:1807.02389
Accelerated physical emulation of Bayesian inference in spiking neural networks
Authors: Akos F. Kungl, Sebastian Schmitt, Johann Klähn, Paul Müller, Andreas Baumbach, Dominik Dold, Alexander Kugele, Nico Gürtler, Luziwei Leng, Eric Müller, Christoph Koke, Mitja Kleider, Christian Mauch, Oliver Breitwieser, Maurice Güttler, Dan Husmann, Kai Husmann, Joscha Ilmberger, Andreas Hartel, Vitali Karasenko, Andreas Grübl, Johannes Schemmel, Karlheinz Meier, Mihai A. Petrovici
Submitted 1 April, 2020; v1 submitted 6 July, 2018; originally announced July 2018.
Comments: This preprint has been published 2019 November 14. Please cite as: Kungl A. F. et al. (2019) Accelerated Physical Emulation of Bayesian Inference in Spiking Neural Networks. Front. Neurosci. 13:1201. doi: 10.3389/fnins.2019.01201
Journal ref: Frontiers in Neuroscience - Neuromorphic Engineering, 14 November 2019
Abstract: The massively parallel nature of biological information processing plays an important role for its superiority to human-engineered computing devices. In particular, it may hold the key to overcoming the von Neumann bottleneck that limits contemporary computer architectures. Physical-model neuromorphic devices seek to replicate not only this inherent parallelism, but also aspects of its microscopic dynamics in analog circuits emulating neurons and synapses. However, these machines require network models that are not only adept at solving particular tasks, but that can also cope with the inherent imperfections of analog substrates. We present a spiking network model that performs Bayesian inference through sampling on the BrainScaleS neuromorphic platform, where we use it for generative and discriminative computations on visual data. By illustrating its functionality on this platform, we implicitly demonstrate its robustness to various substrate-specific distortive effects, as well as its accelerated capability for computation.
arXiv:1709.08166
Spiking neurons with short-term synaptic plasticity form superior generative networks
Authors: Luziwei Leng, Roman Martel, Oliver Breitwieser, Ilja Bytschok, Walter Senn, Johannes Schemmel, Karlheinz Meier, Mihai A. Petrovici
Submitted 10 October, 2017; v1 submitted 24 September, 2017; originally announced September 2017.
Comments: corrected typo in abstract
Abstract: Spiking networks that perform probabilistic inference have been proposed both as models of cortical computation and as candidates for solving problems in machine learning. However, the evidence for spike-based computation being in any way superior to non-spiking alternatives remains scarce. We propose that short-term plasticity can provide spiking networks with distinct computational advantages compared to their classical counterparts. In this work, we use networks of leaky integrate-and-fire neurons that are trained to perform both discriminative and generative tasks in their forward and backward information processing paths, respectively. During training, the energy landscape associated with their dynamics becomes highly diverse, with deep attractor basins separated by high barriers. Classical algorithms solve this problem by employing various tempering techniques, which are both computationally demanding and require global state updates. We demonstrate how similar results can be achieved in spiking networks endowed with local short-term synaptic plasticity. Additionally, we discuss how these networks can even outperform tempering-based approaches when the training data is imbalanced. We thereby show how biologically inspired, local, spike-triggered synaptic dynamics based simply on a limited pool of synaptic resources can allow spiking networks to outperform their non-spiking relatives.