arXiv:2411.15611 [pdf, other]
cs.CV

Knowledge Transfer Across Modalities with Natural Language Supervision

Authors: Carlo Alberto Barbano, Luca Molinaro, Emanuele Aiello, Marco Grangetto

Abstract (arXiv:2411.15611): We present a way to learn novel concepts by only using their textual description. We call this method Knowledge Transfer. Similarly to human perception, we leverage cross-modal interaction to introduce new concepts. We hypothesize that in a pre-trained visual encoder there are enough low-level features already learned (e.g. shape, appearance, color) that can be used to describe previously unknown high-level concepts. Provided with a textual description of the novel concept, our method works by aligning the known low-level features of the visual encoder to its high-level textual description. We show that Knowledge Transfer can successfully introduce novel concepts in multimodal models, in a very efficient manner, by only requiring a single description of the target concept. Our approach is compatible with both separate textual and visual encoders (e.g. CLIP) and shared parameters across modalities. We also show that, following the same principle, Knowledge Transfer can improve concepts already known by the model.

Submitted 23 November, 2024; originally announced November 2024.
Comments: 21 pages, 7 figures, 17 tables
MSC Class: 68T45 (Primary) 68T50 (Secondary)
ACM Class: I.2.6

Abstract (arXiv:2411.10185): Learned progressive image compression is gaining momentum as it allows improved image reconstruction as more bits are decoded at the receiver. We propose a progressive image compression method in which an image is first represented as a pair of base-quality and top-quality latent representations. Next, a residual latent representation is encoded as the element-wise difference between the top and base representations. Our scheme enables progressive image compression with element-wise granularity by introducing a masking system that ranks each element of the residual latent representation from most to least important, dividing it into complementary components, which can be transmitted separately to the decoder in order to obtain different reconstruction quality. The masking system adds neither parameters nor complexity. At the receiver, any elements of the top latent representation excluded from the transmitted components can be independently replaced with the mean predicted by the hyperprior architecture, ensuring reliable reconstructions at any intermediate quality level. We also introduce Rate Enhancement Modules (REMs), which refine the estimation of entropy parameters using already decoded components. We obtain results competitive with state-of-the-art competitors, while significantly reducing computational complexity, decoding time, and number of parameters.

Submitted 26 November, 2024; v1 submitted 15 November, 2024; originally announced November 2024.
Comments: 9 pages.

arXiv:2410.02981 [pdf, other]
eess.IV cs.CV cs.LG

GABIC: Graph-based Attention Block for Image Compression

Authors: Gabriele Spadaro, Alberto Presta, Enzo Tartaglione, Jhony H. Giraldo, Marco Grangetto, Attilio Fiandrotti

Abstract (arXiv:2410.02981): While standardized codecs like JPEG and HEVC-intra represent the industry standard in image compression, neural Learned Image Compression (LIC) codecs represent a promising alternative. In detail, integrating attention mechanisms from Vision Transformers into LIC models has shown improved compression efficiency. However, extra efficiency often comes at the cost of aggregating redundant features. This work proposes a Graph-based Attention Block for Image Compression (GABIC), a method to reduce feature redundancy based on a k-Nearest Neighbors enhanced attention mechanism.

Submitted 3 October, 2024; originally announced October 2024.
Comments: 10 pages, 5 figures, accepted at ICIP 2024

Abstract (arXiv:2410.00807): In recent years, Graph Neural Networks (GNNs) have demonstrated strong adaptability to various real-world challenges, with architectures such as Vision GNN (ViG) achieving state-of-the-art performance in several computer vision tasks. However, their practical applicability is hindered by the computational complexity of constructing the graph, which scales quadratically with the image size. In this paper, we introduce a novel Windowed vision Graph neural Network (WiGNet) model for efficient image processing. WiGNet explores a different strategy from previous works by partitioning the image into windows and constructing a graph within each window. Therefore, our model uses graph convolutions instead of the typical 2D convolution or self-attention mechanism. WiGNet effectively manages computational and memory complexity for large image sizes. We evaluate our method on the ImageNet-1k benchmark dataset and test the adaptability of WiGNet using the CelebA-HQ dataset as a downstream task with higher-resolution images. In both of these scenarios, our method achieves competitive results compared to previous vision GNNs while keeping memory and computational complexity at bay. WiGNet offers a promising solution toward the deployment of vision GNNs in real-world applications.

arXiv:2410.00807 [pdf, other]
cs.CV cs.AI

WiGNet: Windowed Vision Graph Neural Network

Authors: Gabriele Spadaro, Marco Grangetto, Attilio Fiandrotti, Enzo Tartaglione, Jhony H. Giraldo

Abstract (arXiv:2410.00557): In end-to-end learned image compression, encoder and decoder are jointly trained to minimize a $R + \lambda D$ cost function, where $\lambda$ controls the trade-off between rate of the quantized latent representation and image quality. Unfortunately, a distinct encoder-decoder pair with millions of parameters must be trained for each $\lambda$, hence the need to switch encoders and to store multiple encoders and decoders on the user device for every target rate. This paper proposes to exploit a differentiable quantizer designed around a parametric sum of hyperbolic tangents, called STanH, that relaxes the step-wise quantization function. STanH is implemented as a differentiable activation layer with learnable quantization parameters that can be plugged into a pre-trained fixed rate model and refined to achieve different target bitrates.

Submitted 1 October, 2024; originally announced October 2024.

Abstract (arXiv:2408.07079): Deep Learning (DL) in neuroimaging has become increasingly relevant for detecting neurological conditions and neurodegenerative disorders. One of the most predominant biomarkers in neuroimaging is represented by brain age, which has been shown to be a good indicator for different conditions, such as Alzheimer's Disease. Using brain age for weakly supervised pre-training of DL models in transfer learning settings has also recently shown promising results, especially when dealing with data scarcity of different conditions. On the other hand, anatomical information of brain MRIs (e.g. cortical thickness) can provide important information for learning good representations that can be transferred to many downstream tasks. In this work, we propose AnatCL, an anatomical foundation model for brain MRIs that (i) leverages anatomical information in a weakly contrastive learning approach, and (ii) achieves state-of-the-art performances across many different downstream tasks. To validate our approach we consider 12 different downstream tasks for the diagnosis of different conditions such as Alzheimer's Disease, autism spectrum disorder, and schizophrenia. Furthermore, we also target the prediction of 10 different clinical assessment scores using structural MRI data. Our findings show that incorporating anatomical information during pre-training leads to more robust and generalizable representations.

arXiv:2408.07079 [pdf, other]
eess.IV cs.AI cs.CV cs.LG

Anatomical Foundation Models for Brain MRIs

Authors: Carlo Alberto Barbano, Matteo Brunello, Benoit Dufumier, Marco Grangetto

Abstract (arXiv:2407.10389): Since the introduction of NeRFs, considerable attention has been focused on improving their training and inference times, leading to the development of Fast-NeRF models. Despite demonstrating impressive rendering speed and quality, the rapid convergence of such models poses challenges for further improving reconstruction quality. Common strategies to improve rendering quality involve augmenting model parameters or increasing the number of sampled points. However, these computationally intensive approaches encounter limitations in achieving significant quality enhancements. This study introduces a model-agnostic framework inspired by Sparsely-Gated Mixture of Experts to enhance rendering quality without escalating computational complexity. Our approach enables specialization in rendering different scene components by employing a mixture of experts with varying resolutions. We present a novel gate formulation designed to maximize expert capabilities and propose a resolution-based routing technique to effectively induce sparsity and decompose scenes.

Submitted 29 November, 2024; v1 submitted 7 August, 2024; originally announced August 2024.
Comments: Updated version; added ablation study
MSC Class: 68T07
ACM Class: I.2.6

Abstract (arXiv:2406.02077): Traditional staining normalization approaches, e.g. Macenko, typically rely on the choice of a single representative reference image, which may not adequately account for the diverse staining patterns of datasets collected in practical scenarios. In this study, we introduce a novel approach that leverages multiple reference images to enhance robustness against stain variation. Our method is parameter-free and can be adopted in existing computational pathology pipelines with no significant changes. We evaluate the effectiveness of our method through experiments using a deep-learning pipeline for automatic nuclei segmentation on colorectal images.

arXiv:2406.02077 [pdf, other]
eess.IV cs.AI cs.CV

Multi-target stain normalization for histology slides

Authors: Desislav Ivanov, Carlo Alberto Barbano, Marco Grangetto

Abstract (arXiv:2406.00772): Contrastive Analysis (CA) regards the problem of identifying patterns in images that allow distinguishing between a background (BG) dataset (i.e. healthy subjects) and a target (TG) dataset (i.e. unhealthy subjects). Recent works on this topic rely on variational autoencoders (VAE) or contrastive learning strategies to learn the patterns that separate TG samples from BG samples in a supervised manner. However, the dependency on target (unhealthy) samples can be challenging in medical scenarios due to their limited availability. Also, the blurred reconstructions of VAEs lack utility and interpretability. In this work, we redefine the CA task by employing a self-supervised contrastive encoder to learn a latent representation encoding only common patterns from input images, using samples exclusively from the BG dataset during training, and approximating the distribution of the target patterns by leveraging data augmentation techniques. Subsequently, we exploit state-of-the-art generative methods, i.e. diffusion models, conditioned on the learned latent representation to produce a realistic (healthy) version of the input image encoding solely the common patterns. Thorough validation on a facial image dataset and experiments across three brain MRI datasets demonstrate that conditioning the generative process of state-of-the-art generative methods with the latent representation from our self-supervised contrastive encoder yields improvements in the generated image quality and in the accuracy of image classification.

arXiv:2406.00772 [pdf, other]
cs.CV

Unsupervised Contrastive Analysis for Salient Pattern Detection using Conditional Diffusion Models

Authors: Cristiano Patrício, Carlo Alberto Barbano, Attilio Fiandrotti, Riccardo Renzulli, Marco Grangetto, Luis F. Teixeira, João C. Neves

Abstract (arXiv:2405.11598): In this paper, we present the major results from the Covid Radiographic imaging System based on AI (Co.R.S.A.) project, which took place in Italy. This project aims to develop a state-of-the-art AI-based system for diagnosing Covid-19 pneumonia from Chest X-ray (CXR) images. The contributions of this work are manifold: the release of the public CORDA dataset, a deep learning pipeline for Covid-19 detection, and the clinical validation of the developed solution by expert radiologists. The proposed detection model is based on a two-step approach that, paired with state-of-the-art debiasing, provides reliable results. Most importantly, our investigation includes the actual usage of the diagnosis aid tool by radiologists, allowing us to assess the real benefits in terms of accuracy and time efficiency.

Submitted 4 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.
Comments: 18 pages, 11 figures

Abstract (arXiv:2404.15591): In Learned Image Compression (LIC), a model is trained to encode and decode images sampled from a source domain, often outperforming traditional codecs on natural images; yet its performance may be far from optimal on images sampled from different domains. In this work, we tackle the problem of adapting a pre-trained model to multiple target domains by plugging into the decoder an adapter module for each of them, including the source one. Each adapter improves the decoder performance on a specific domain, without the model forgetting about the images seen at training time. A gate network computes the weights to optimally blend the contributions from the adapters when the bitstream is decoded. We experimentally validate our method over two state-of-the-art pre-trained models, observing improved rate-distortion efficiency on the target domains without penalties on the source domain.

arXiv:2404.15591 [pdf, other]
cs.CV eess.IV

Domain Adaptation for Learned Image Compression with Supervised Adapters

Authors: Alberto Presta, Gabriele Spadaro, Enzo Tartaglione, Attilio Fiandrotti, Marco Grangetto

Abstract (arXiv:2403.18756): Aims. To develop a deep-learning based system for recognition of subclinical atherosclerosis on a plain frontal chest x-ray. Methods and Results. A deep-learning algorithm to predict coronary artery calcium (CAC) score (the AI-CAC model) was developed on 460 chest x-rays (80% training cohort, 20% internal validation cohort) of primary prevention patients (58.4% male, median age 63 [51-74] years) with available paired chest x-ray and chest computed tomography (CT) indicated for any clinical reason and performed within 3 months. The CAC score calculated on chest CT was used as ground truth. The model was validated on a temporally independent cohort of 90 patients from the same institution (external validation). The diagnostic accuracy of the AI-CAC model assessed by the area under the curve (AUC) was the primary outcome. Overall, median AI-CAC score was 35 (0-388) and 28.9% of patients had no AI-CAC. AUC of the AI-CAC model to identify a CAC>0 was 0.90 in the internal validation cohort and 0.77 in the external validation cohort. Sensitivity was consistently above 92% in both cohorts. In the overall cohort (n=540), among patients with AI-CAC=0, a single ASCVD event occurred, after 4.3 years. Patients with AI-CAC>0 had significantly higher Kaplan Meier estimates for ASCVD events (13.5% vs. 3.4%, log-rank=0.013). Conclusion. The AI-CAC model seems to accurately detect subclinical atherosclerosis on chest x-ray with elevated sensitivity, and to predict ASCVD events with elevated negative predictive value.

arXiv:2403.18756 [pdf]
cs.CV cs.AI cs.LG

Detection of subclinical atherosclerosis by image-based deep learning on chest x-ray

Authors: Guglielmo Gallone, Francesco Iodice, Alberto Presta, Davide Tore, Ovidio de Filippo, Michele Visciano, Carlo Alberto Barbano, Alessandro Serafini, Paola Gorrini, Alessandro Bruno, Walter Grosso Marra, James Hughes, Mario Iannaccone, Paolo Fonio, Attilio Fiandrotti, Alessandro Depaoli, Marco Grangetto, Gaetano Maria de Ferrari, Fabrizio D'Ascenzo

Abstract (arXiv:2211.08326): Building accurate Deep Learning (DL) models for brain age prediction is a very relevant topic in neuroimaging, as it could help better understand neurodegenerative disorders and find new biomarkers. To estimate accurate and generalizable models, large datasets have been collected, which are often multi-site and multi-scanner. This large heterogeneity negatively affects the generalization performance of DL models since they are prone to overfit site-related noise. Recently, contrastive learning approaches have been shown to be more robust against noise in data or labels. For this reason, we propose a novel contrastive learning regression loss for robust brain age prediction using MRI scans.

arXiv:2211.08326 [pdf, other]
eess.IV cs.CV cs.LG

Contrastive learning for regression in multi-site brain age prediction

Authors: Carlo Alberto Barbano, Benoit Dufumier, Edouard Duchesnay, Marco Grangetto, Pietro Gori

Abstract (arXiv:2211.05568): Many datasets are biased, namely they contain easy-to-learn features that are highly correlated with the target class only in the dataset but not in the true underlying distribution of the data. For this reason, learning unbiased models from biased data has become a very relevant research topic in the last years. In this work, we tackle the problem of learning representations that are robust to biases. We first present a margin-based theoretical framework that allows us to clarify why recent contrastive losses (InfoNCE, SupCon, etc.) can fail when dealing with biased data. Based on that, we derive a novel formulation of the supervised contrastive loss (epsilon-SupInfoNCE), providing more accurate control of the minimal distance between positive and negative samples. Furthermore, thanks to our theoretical framework, we also propose FairKL, a new debiasing regularization loss, that works well even with extremely biased data.

arXiv:2211.05568 [pdf, other]
cs.LG cs.CV stat.ML

Unbiased Supervised Contrastive Learning

Authors: Carlo Alberto Barbano, Benoit Dufumier, Enzo Tartaglione, Marco Grangetto, Pietro Gori

Abstract (arXiv:2208.09203): From the moment Neural Networks dominated the scene for image processing, the computational complexity needed to solve the targeted tasks skyrocketed: against such an unsustainable trend, many strategies have been developed, ambitiously targeting the preservation of performance. Promoting sparse topologies, for example, allows the deployment of deep neural network models on embedded, resource-constrained devices. Recently, Capsule Networks were introduced to enhance explainability of a model, where each capsule is an explicit representation of an object or its parts. These models show promising results on toy datasets, but their low scalability prevents deployment on more complex tasks. In this work, we explore sparsity besides capsule representations to improve their computational efficiency by reducing the number of capsules.

arXiv:2208.09203 [pdf, other]
cs.CV

Towards Efficient Capsule Networks

Authors: Riccardo Renzulli, Marco Grangetto

Abstract (arXiv:2207.09455): Recent advances in deep learning optimization showed that, with some a-posteriori information on fully-trained models, it is possible to match the same performance by simply training a subset of their parameters. Such a discovery has a broad impact from theory to applications, driving the research towards methods to identify the minimum subset of parameters to train without look-ahead information exploitation. However, the methods proposed do not match the state-of-the-art performance, and rely on unstructured sparsely connected models. In this work we shift our focus from the single parameters to the behavior of the whole neuron, exploiting the concept of neuronal equilibrium (NEq). When a neuron is in a configuration at equilibrium (meaning that it has learned a specific input-output relationship), we can halt its update; on the contrary, when a neuron is at non-equilibrium, we let its state evolve towards an equilibrium state, updating its parameters.

arXiv:2207.09455 [pdf, other]
cs.LG cs.AI

To update or not to update? Neurons at equilibrium in deep models

Authors: Andrea Bragagnolo, Enzo Tartaglione, Marco Grangetto

Abstract (arXiv:2207.02000): Deep learning models are nowadays broadly deployed to solve an incredibly large variety of tasks. However, little attention has been devoted to connected legal aspects. In 2016, the European Union approved the General Data Protection Regulation which entered into force in 2018. Its main rationale was to protect the privacy and data protection of its citizens in the operation of the so-called "Data Economy". As data is the fuel of modern Artificial Intelligence, it is argued that the GDPR can be partly applicable to a series of algorithmic decision making tasks before a more structured AI Regulation enters into force. In the meantime, AI should not allow undesired information leakage deviating from the purpose for which it is created. In this work we propose DisP, an approach for deep learning models disentangling the information related to some classes we desire to keep private, from the data processed by AI. In particular, DisP is a regularization strategy de-correlating the features belonging to the same private class at training time, hiding the information of private classes membership.

arXiv:2207.02000 [pdf, other]
cs.LG cs.AI cs.CR
doi: 10.1016/j.neucom.2023.126612

Disentangling private classes through regularization

Authors: Enzo Tartaglione, Francesca Gennari, Marco Grangetto

Deep neural networks often struggle to learn robust representations in the presence of dataset biases, leading to suboptimal generalization on unbiased datasets. This limitation arises because the models heavily depend on peripheral and confounding factors, inadvertently acquired during training. Existing approaches to address this problem typically involve explicit supervision of bias attributes or reliance on prior knowledge about the biases. In this study, we address the challenging scenario where no explicit annotations of bias are available, and there's no prior knowledge about its nature. We present a fully unsupervised debiasing framework with three key steps: firstly, leveraging the inherent tendency to learn malignant biases to acquire a bias-capturing model; next, employing a pseudo-labeling process to obtain bias labels; and finally, applying cutting-edge supervised debiasing techniques to achieve an unbiased model. Additionally, we introduce a theoretical framework for evaluating model biasedness and conduct a detailed analysis of how biases impact neural network training.

arXiv:2204.12941 [pdf, other]
cs.LG cs.CV
doi: 10.1109/TAI.2024.3514554

Unsupervised Learning of Unbiased Visual Representations

Authors: Carlo Alberto Barbano, Enzo Tartaglione, Marco Grangetto

Capsule Networks ambition is to build an explainable and biologically-inspired neural network model. One of their main innovations relies on the routing mechanism which extracts a parse tree: its main purpose is to explicitly build relationships between capsules. However, their true potential in terms of explainability has not surfaced yet: these relationships are extremely heterogeneous and difficult to understand. This paper proposes REM, a technique which minimizes the entropy of the parse tree-like structure, improving its explainability. We accomplish this by driving the model parameters distribution towards low entropy configurations, using a pruning mechanism as a proxy.

arXiv:2204.01298 [pdf, other]
cs.CV cs.AI

REM: Routing Entropy Minimization for Capsule Networks

Authors: Riccardo Renzulli, Enzo Tartaglione, Marco Grangetto

Submitted 4 April, 2022; originally announced April 2022.

We formulate the entropy of a quantized artificial neural network as a differentiable function that can be plugged as a regularization term into the cost function minimized by gradient descent. Our formulation scales efficiently beyond the first order and is agnostic of the quantization scheme. The network can then be trained to minimize the entropy of the quantized parameters, so that they can be optimally compressed via entropy coding. We experiment with our entropy formulation at quantizing and compressing well-known network architectures over multiple datasets. Our approach compares favorably over similar methods, enjoying the benefits of higher order entropy estimate, showing flexibility towards non-uniform quantization (we use Lloyd-max quantization), scalability towards any entropy order to be minimized and efficiency in terms of compression.

arXiv:2107.05298 [pdf, ps, other]
cs.LG cs.AI cs.IT
doi: 10.1016/j.neucom.2021.07.022

HEMP: High-order Entropy Minimization for neural network comPression

Authors: Enzo Tartaglione, Stéphane Lathuilière, Attilio Fiandrotti, Marco Cagnazzo, Marco Grangetto
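
The HEMP abstract above refers to the entropy of the quantized parameters as the quantity that bounds how well they can be entropy-coded. As a minimal sketch of that quantity only, the snippet below computes the plain first-order empirical entropy of uniformly quantized weights. It is not the differentiable, higher-order formulation the paper proposes; the uniform quantizer, the step size, and the names `quantize` and `first_order_entropy` are assumptions made for illustration (the abstract states that the authors use Lloyd-max quantization).

```python
import math
from collections import Counter


def quantize(weights, step):
    """Uniform quantizer (an illustrative assumption): nearest-level index per weight."""
    return [round(w / step) for w in weights]


def first_order_entropy(symbols):
    """Empirical first-order entropy in bits per symbol: H = -sum(p * log2(p))."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Toy weight vector (hypothetical values, not taken from any experiment).
weights = [0.01, -0.02, 0.26, 0.24, 0.49, 0.51, 0.0, 0.02]
symbols = quantize(weights, step=0.25)      # [0, 0, 1, 1, 2, 2, 0, 0]
entropy_bits = first_order_entropy(symbols)  # 1.5 bits per parameter
print(symbols, entropy_bits)
```

Here half of the weights fall on level 0 and a quarter on each of levels 1 and 2, so an ideal entropy coder would need about 1.5 bits per parameter instead of the 2 bits a fixed-length code for three levels requires. Lowering this number during training is the goal the abstract describes.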

Submitted 12 July, 2021; originally announced July 2021.

Artificial neural networks perform state-of-the-art in an ever-growing number of tasks, and nowadays they are used to solve an incredibly large variety of tasks. There are problems, like the presence of biases in the training data, which question the generalization capability of these models. In this work we propose EnD, a regularization strategy whose aim is to prevent deep models from learning unwanted biases. In particular, we insert an "information bottleneck" at a certain point of the deep neural network, where we disentangle the information about the bias, still letting the useful information for the training task forward-propagating in the rest of the model. One big advantage of EnD is that we do not require additional training complexity (like decoders or extra layers in the model), since it is a regularizer directly applied on the trained model.

arXiv:2103.02023 [pdf, other]
cs.CV cs.AI cs.LG
doi: 10.1109/CVPR46437.2021.01330

EnD: Entangling and Disentangling deep representations for bias correction

Authors: Enzo Tartaglione, Carlo Alberto Barbano, Marco Grangetto

Submitted 2 March, 2021; originally announced March 2021.

Colorectal cancer is a leading cause of cancer death for both men and women. For this reason, histopathological characterization of colorectal polyps is the major instrument for the pathologist in order to infer the actual risk for cancer and to guide further follow-up. Colorectal polyps diagnosis includes the evaluation of the polyp type, and more importantly, the grade of dysplasia. This latter evaluation represents a critical step for the clinical follow-up. The proposed deep learning-based classification pipeline is based on state-of-the-art convolutional neural network, trained using proper countermeasures to tackle WSI high resolution and very imbalanced dataset.

arXiv:2102.05498 [pdf, other]
eess.IV cs.CV cs.LG
doi: 10.1007/978-981-16-3880-0_34

Dysplasia grading of colorectal polyps through CNN analysis of WSI

Authors: Daniele Perlo, Enzo Tartaglione, Luca Bertero, Paola Cassoni, Marco Grangetto

Submitted 10 February, 2021; originally announced February 2021.

Deep neural networks include millions of learnable parameters, making their deployment over resource-constrained devices problematic. SeReNe (Sensitivity-based Regularization of Neurons) is a method for learning sparse topologies with a structure, exploiting neural sensitivity as a regularizer. We define the sensitivity of a neuron as the variation of the network output with respect to the variation of the activity of the neuron. The lower the sensitivity of a neuron, the less the network output is perturbed if the neuron output changes. By including the neuron sensitivity in the cost function as a regularization term, we are able to prune neurons with low sensitivity. As entire neurons are pruned rather than single parameters, practical network footprint reduction becomes possible.

arXiv:2102.03773 [pdf, other]
cs.LG cs.AI stat.ML
doi: 10.1109/TNNLS.2021.3084527

SeReNe: Sensitivity based Regularization of Neurons for Structured Sparsity in Neural Networks

Authors: Enzo Tartaglione, Andrea Bragagnolo, Francesco Odierna, Attilio Fiandrotti, Marco Grangetto

Submitted 7 February, 2021; originally announced February 2021.

Early screening of patients is a critical issue in order to assess immediate and fast responses against the spread of COVID-19. The use of nasopharyngeal swabs has been considered the most viable approach; however, the result is not immediate or, in the case of fast exams, sufficiently accurate. Using Chest X-Ray (CXR) imaging for early screening potentially provides faster and more accurate response; however, diagnosing COVID from CXRs is hard and we should rely on deep learning support, whose decision process is, on the other hand, "black-boxed" and, for such reason, untrustworthy. We propose an explainable two-step diagnostic approach, where we first detect known pathologies (anomalies) in the lungs, on top of which we diagnose the illness. Our approach achieves promising performance in COVID detection, compatible with expert human radiologists.

arXiv:2101.10223 [pdf, other]
eess.IV cs.CV
doi: 10.1007/978-3-031-06427-2_15

A two-step explainable approach for COVID-19 computer-aided diagnosis from chest x-ray images

Authors: Carlo Alberto Barbano, Enzo Tartaglione, Claudio Berzovini, Marco Calandri, Marco Grangetto

Submitted 25 January, 2021; originally announced January 2021.
Comments: 5 pages, 4 figures
ACM Class: I.2.0; I.2.6

Histopathological characterization of colorectal polyps allows to tailor patients' management and follow up with the ultimate aim of avoiding or promptly detecting an invasive carcinoma. Colorectal polyps characterization relies on the histological analysis of tissue samples to determine the polyps malignancy and dysplasia grade. Deep neural networks achieve outstanding accuracy in medical patterns recognition, however they require large sets of annotated training images. We introduce UniToPatho, an annotated dataset of 9536 hematoxylin and eosin (H&E) stained patches extracted from 292 whole-slide images, meant for training deep neural networks for colorectal polyps classification and adenomas grading.

arXiv:2101.09991 [pdf, other]
eess.IV cs.CV cs.LG
doi: 10.1109/ICIP42928.2021.9506198

UniToPatho, a labeled histopathological dataset for colorectal polyps classification and adenoma dysplasia grading

Authors: Carlo Alberto Barbano, Daniele Perlo, Enzo Tartaglione, Attilio Fiandrotti, Luca Bertero, Paola Cassoni, Marco Grangetto

Submitted 10 February, 2021; v1 submitted 25 January, 2021; originally announced January 2021.
Comments: 5 pages, 3 figures
ACM Class: I.2.0; I.2.6

Purpose: In this study we investigate whether a Convolutional Neural Network (CNN) can generate clinically relevant parametric maps from CT perfusion data in a clinical setting of patients with acute ischemic stroke. Methods: Training of the CNN was done on a subset of 100 perfusion data, while 15 samples were used as validation. All the data used for the training/validation of the network and to generate ground truth (GT) maps, using a state-of-the-art deconvolution-algorithm, were previously pre-processed using a standard pipeline. Validation was carried out through manual segmentation of infarct core and penumbra on both CNN-derived maps and GT maps. Concordance among segmented lesions was assessed using the Dice and the Pearson correlation coefficients across lesion volumes. Results: Mean Dice scores from two different raters and the GT maps were > 0.70 (good-matching). Inter-rater concordance was also high and strong correlation was found between lesion volumes of CNN maps and GT maps (0.99, 0.98). Conclusion: Our CNN-based approach generated clinically relevant perfusion maps that are comparable to state-of-the-art perfusion analysis methods based on deconvolution of the data.

arXiv:2101.05992 [pdf]
eess.IV cs.CV

Neural Network-derived perfusion maps: a Model-free approach to computed tomography perfusion in patients with acute ischemic stroke

Authors: Umberto A. Gava, Federico D'Agata, Enzo Tartaglione, Marco Grangetto, Francesca Bertolino, Ambra Santonocito, Edwin Bennink, Mauro Bergui
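
The abstract of this entry reports concordance between segmented lesions as Dice scores, with values above 0.70 treated as a good match. As a minimal sketch of how such a score is computed, the snippet below evaluates the standard Dice coefficient, 2|A∩B| / (|A| + |B|), on two flattened binary masks. The masks are made-up toy data, and the function name `dice` and the empty-mask convention are assumptions for illustration, not details taken from the paper.

```python
def dice(mask_a, mask_b):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two equal-length binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must cover the same number of voxels")
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Hypothetical flattened lesion masks (1 = voxel labelled as lesion).
cnn_lesion = [0, 1, 1, 1, 0, 0, 1, 0]
gt_lesion = [0, 1, 1, 0, 0, 0, 1, 1]
score = dice(cnn_lesion, gt_lesion)
print(score)  # 0.75: three shared voxels, four labelled in each mask
```

With three overlapping voxels and four lesion voxels in each mask, the score is 2·3 / (4 + 4) = 0.75, which would sit just above the 0.70 level quoted in the abstract.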

Submitted 15 January, 2021; originally announced January 2021.

LOBSTER (LOss-Based SensiTivity rEgulaRization) is a method for training neural networks having a sparse topology. Let the sensitivity of a network parameter be the variation of the loss function with respect to the variation of the parameter. Parameters with low sensitivity, i.e. having little impact on the loss when perturbed, are shrunk and then pruned to sparsify the network. Our method allows to train a network from scratch, i.e. without preliminary learning or rewinding.

arXiv:2011.09905 [pdf, other]
cs.LG
doi: 10.1016/j.neunet.2021.11.029

LOss-Based SensiTivity rEgulaRization: towards deep sparse neural networks

Authors: Enzo Tartaglione, Andrea Bragagnolo, Attilio Fiandrotti, Marco Grangetto
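
The LOBSTER abstract above defines the sensitivity of a parameter as the variation of the loss with respect to that parameter, and says that low-sensitivity parameters are shrunk and then pruned. The snippet below is a rough sketch of one update step in that spirit. It takes the absolute gradient as the sensitivity and shrinks each weight in proportion to `max(0, 1 - sensitivity)`. The exact update rule, the hyperparameters `lr`, `lam` and `prune_below`, and the function name `lobster_like_step` are all assumptions for illustration and may differ from what the paper actually uses.

```python
def lobster_like_step(weights, grads, lr=0.1, lam=0.5, prune_below=1e-2):
    """One illustrative update: gradient step, sensitivity-weighted shrinkage, pruning."""
    updated = []
    for w, g in zip(weights, grads):
        sensitivity = abs(g)                          # |dL/dw|, per the abstract's definition
        insensitivity = max(0.0, 1.0 - sensitivity)   # assumed form: zero for sensitive weights
        w_new = w - lr * g - lam * w * insensitivity  # shrink only where the loss barely reacts
        if abs(w_new) < prune_below:                  # magnitude pruning (threshold is hypothetical)
            w_new = 0.0
        updated.append(w_new)
    return updated


# Hypothetical weights and loss gradients for three parameters.
weights = [1.0, 0.001, -0.5]
grads = [2.0, 0.0, 0.5]
new_weights = lobster_like_step(weights, grads)
print(new_weights)
```

In this toy run the first weight has a large gradient, so it only receives the ordinary gradient step. The second has zero gradient, is shrunk, and falls under the pruning threshold. The third is partly shrunk because its sensitivity is below one.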

Submitted 16 November, 2020; originally announced November 2020.

Artificial neural networks perform state-of-the-art in an ever-growing number of tasks, nowadays they are used to solve an incredibly large variety of tasks. However, typical training strategies do not take into account lawful, ethical and discriminatory potential issues the trained ANN models could incur in. In this work we propose NDR, a non-discriminatory regularization strategy to prevent the ANN model to solve the target task using some discriminatory features like, for example, the ethnicity in an image classification task for human faces. In particular, a part of the ANN model is trained to hide the discriminatory information such that the rest of the network focuses in learning the given learning task.

arXiv:2008.01430 [pdf, other]
cs.CV cs.AI cs.LG
doi: 10.1109/TrustCom50675.2020.00126

A non-discriminatory approach to ethical deep learning

Authors: Enzo Tartaglione, Marco Grangetto

Submitted 4 August, 2020; originally announced August 2020.

Recently, a race towards the simplification of deep networks has begun, showing that it is effectively possible to reduce the size of these models with minimal or no performance loss. However, there is a general lack in understanding why these pruning strategies are effective. In this work, we are going to compare and analyze pruned solutions with two different pruning approaches, one-shot and gradual, showing the higher effectiveness of the latter. In particular, we find that gradual pruning allows access to narrow, well-generalizing minima, which are typically ignored when using one-shot approaches. In this work we also propose PSP-entropy, a measure to understand how a given neuron correlates to some specific learned classes.

arXiv:2004.14765 [pdf, ps, other]
cs.LG cond-mat.dis-nn cs.NE
doi: 10.1007/978-3-030-61616-8_6

Pruning artificial neural networks: a way to find well-generalizing, high-entropy sharp minima

Authors: Enzo Tartaglione, Andrea Bragagnolo, Marco Grangetto

Submitted 30 April, 2020; originally announced April 2020.

The possibility to use widespread and simple chest X-ray (CXR) imaging for early screening of COVID-19 patients is attracting much interest from both the clinical and the AI community. In this study we provide insights and also raise warnings on what is reasonable to expect by applying deep-learning to COVID classification of CXR images. We provide a methodological guide and critical reading of an extensive set of statistical results that can be obtained using currently available datasets. In particular, we take the challenge posed by current small size COVID data and show how significant can be the bias introduced by transfer-learning using larger public non-COVID CXR datasets. We also contribute by providing results on a medium size COVID CXR dataset, just collected by one of the major emergency hospitals in Northern Italy during the peak of the COVID pandemic. These novel data allow us to contribute to validate the generalization capacity of preliminary results circulating in the scientific community.

arXiv:2004.05405 [pdf, other]
eess.IV cs.CV cs.LG
doi: 10.3390/ijerph17186933

Unveiling COVID-19 from Chest X-ray with deep learning: a hurdles race with small data

Authors: Enzo Tartaglione, Carlo Alberto Barbano, Claudio Berzovini, Marco Calandri, Marco Grangetto

Submitted 11 April, 2020; originally announced April 2020.
Journal ref: Int. J. Environ. Res. Public Health 2020, 17(18), 6933

Improving generalization is one of the main challenges for training deep neural networks on classification tasks. In particular, a number of techniques have been proposed, aiming to boost the performance on unseen data: from standard data augmentation techniques to the $\ell_2$ regularization, dropout, batch normalization, entropy-driven SGD and many more. In this work we propose an elegant, simple and principled approach: post-synaptic potential regularization (PSP). We tested this regularization on a number of different state-of-the-art scenarios.

arXiv:1907.08544 [pdf, other]
cs.LG cs.NE stat.ML
doi: 10.1007/978-3-030-30484-3_16

Post-synaptic potential regularization has potential

Authors: Enzo Tartaglione, Daniele Perlo, Marco Grangetto

Submitted 19 July, 2019; originally announced July 2019.

Reed-Xiaoli detector (RXD) is recognized as the benchmark algorithm for image anomaly detection; however, it presents known limitations, namely the dependence over the image following a multivariate Gaussian model, the estimation and inversion of a high-dimensional covariance matrix, and the inability to effectively include spatial awareness in its evaluation. In this work, a novel graph-based solution to the image anomaly detection problem is proposed; leveraging the graph Fourier transform, we are able to overcome some of RXD's limitations while reducing computational cost at the same time.

arXiv:1802.09843 [pdf, other]
eess.IV cs.CV eess.SP
doi: 10.1007/s00138-020-01059-4

Graph Laplacian for Image Anomaly Detection

Authors: Francesco Verdoja, Marco Grangetto

Submitted 10 February, 2020; v1 submitted 27 February, 2018; originally announced February 2018.
Comments: Published in Machine Vision and Applications (Springer)
Journal ref: Machine Vision and Applications, vol. 31, no. 1, Feb. 2020

Network coding based peer-to-peer streaming represents an effective solution to aggregate user capacities and to increase system throughput in live multimedia streaming. Nonetheless, such systems are vulnerable to pollution attacks where a handful of malicious peers can disrupt the communication by transmitting just a few bogus packets which are then recombined and relayed by unaware honest nodes, further spreading the pollution over the network. Whereas previous research focused on malicious nodes identification schemes and pollution-resilient coding, in this paper we show pollution countermeasures which make a standard network coding scheme resilient to pollution attacks. Thanks to a simple yet effective analytical model of a reference node collecting packets by malicious and honest neighbors, we demonstrate that i) packets received earlier are less likely to be polluted and ii) short generations increase the likelihood to recover a clean generation. Therefore, we propose a recombination scheme where nodes draw packets to be recombined according to their age in the input queue, paired with a decoding scheme able to detect the reception of polluted packets early in the decoding process and short generations. The effectiveness of our approach is experimentally evaluated in a real system we developed and deployed on hundreds to thousands peers.

arXiv:1707.07546 [pdf, ps, other]
cs.NI cs.MM
doi: 10.1109/TMM.2015.2402516

Simple Countermeasures to Mitigate the Effect of Pollution Attack in Network Coding Based Peer-to-Peer Live Streaming

Authors: Attilio Fiandrotti, Rossano Gaeta, Marco Grangetto

Submitted 24 July, 2017; originally announced July 2017.
Journal ref: IEEE Transactions on Multimedia, Volume 17, Issue 4, April 2015, Pages 562 - 573

A key problem in random network coding (NC) lies in the complexity and energy consumption associated with the packet decoding processes, which hinder its application in mobile environments. Controlling and hence limiting such factors has always been an important but elusive research goal, since the packet degree distribution, which is the main factor driving the complexity, is altered in a non-deterministic way by the random recombinations at the network nodes. In this paper we tackle this problem proposing Band Codes (BC), a novel class of network codes specifically designed to preserve the packet degree distribution during packet encoding, recombination and decoding. BC are random codes over GF(2) that exhibit low decoding complexity, feature limited and controlled degree distribution by construction, and hence allow to effectively apply NC even in energy-constrained scenarios. In particular, in this paper we motivate and describe our new design and provide a thorough analysis of its performance. We provide numerical simulations of the performance of BC in order to validate the analysis and assess the overhead of BC with respect to a conventional NC scheme.

arXiv:1309.0316 [pdf, ps, other]
cs.MM cs.NI
doi: 10.1109/TMM.2013.2285518

Band Codes for Energy-Efficient Network Coding with Application to P2P Mobile Streaming

Authors: Attilio Fiandrotti, Valerio Bioglio, Marco Grangetto, Rossano Gaeta, Enrico Magli

Submitted 2 September, 2013; originally announced September 2013.
Comments: To be published in IEEE Transactions on Multimedia
ACM Class: H.5.1

arXiv:0712.0271 [pdf, ps, other]
cs.IT

Distributed Arithmetic Coding for the Asymmetric Slepian-Wolf problem

Authors: M. Grangetto, E. Magli, G. Olmo

Abstract: Distributed source coding schemes are typically based on the use of channel codes as source codes. In this paper we propose a new paradigm, termed "distributed arithmetic coding", which exploits the fact that arithmetic codes are good source as well as channel codes. In particular, we propose a distributed binary arithmetic coder for Slepian-Wolf coding with decoder side information, along with a soft joint decoder. The proposed scheme provides several advantages over existing Slepian-Wolf coders, especially its good performance at small block lengths, and the ability to incorporate arbitrary source models in the encoding process, e.g. context-based statistical models.

Submitted 11 November, 