<p class="list-title is-inline-block"><a href="">arXiv:2411.12173</a> <span>&nbsp;[<a href="">pdf</a>, <a href="">other</a>]&nbsp;</span> </p> <div class="tags is-inline-block"> <span class="tag is-small is-link tooltip is-tooltip-top" data-tooltip="Machine Learning">cs.LG</span> <span class="tag is-small is-grey tooltip is-tooltip-top" data-tooltip="Artificial Intelligence">cs.AI</span> </div> </div> <p class="title is-5 mathjax"> SkillTree: Explainable Skill-Based Deep Reinforcement Learning for Long-Horizon Control Tasks </p> <p class="authors"> <span class="search-hit">Authors:</span> <a href="/search/cs?searchtype=author&amp;query=Wen%2C+Y">Yongyan Wen</a>, <a href="/search/cs?searchtype=author&amp;query=Li%2C+S">Siyuan Li</a>, <a href="/search/cs?searchtype=author&amp;query=Zuo%2C+R">Rongchang Zuo</a>, <a href="/search/cs?searchtype=author&amp;query=Yuan%2C+L">Lei Yuan</a>, <a href="/search/cs?searchtype=author&amp;query=Mao%2C+H">Hangyu Mao</a>, <a href="/search/cs?searchtype=author&amp;query=Liu%2C+P">Peng Liu</a> </p> <p class="abstract mathjax"> <span class="has-text-black-bis has-text-weight-semibold">Abstract</span>: <span class="abstract-short has-text-grey-dark mathjax" id="2411.12173v1-abstract-short" style="display: inline;"> Deep reinforcement learning (DRL) has achieved remarkable success in various research domains. However, its reliance on neural networks results in a lack of transparency, which limits its practical applications. To achieve explainability, decision trees have emerged as a popular and promising alternative to neural networks. 
Nonetheless, due to their limited expressiveness, traditional decision tre&hellip; <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2411.12173v1-abstract-full').style.display = 'inline'; document.getElementById('2411.12173v1-abstract-short').style.display = 'none';">&#9661; More</a> </span> <span class="abstract-full has-text-grey-dark mathjax" id="2411.12173v1-abstract-full" style="display: none;"> Deep reinforcement learning (DRL) has achieved remarkable success in various research domains. However, its reliance on neural networks results in a lack of transparency, which limits its practical applications. To achieve explainability, decision trees have emerged as a popular and promising alternative to neural networks. Nonetheless, due to their limited expressiveness, traditional decision trees struggle with high-dimensional long-horizon continuous control tasks. In this paper, we propose SkillTree, a novel framework that reduces complex continuous action spaces into discrete skill spaces. Our hierarchical approach integrates a differentiable decision tree within the high-level policy to generate skill embeddings, which subsequently guide the low-level policy in executing skills. By making skill decisions explainable, we achieve skill-level explainability, enhancing the understanding of the decision-making process in complex tasks. Experimental results demonstrate that our method achieves performance comparable to skill-based neural networks in complex robotic arm control domains. Furthermore, SkillTree offers explanations at the skill level, thereby increasing the transparency of the decision-making process. 
<a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2411.12173v1-abstract-full').style.display = 'none'; document.getElementById('2411.12173v1-abstract-short').style.display = 'inline';">&#9651; Less</a> </span> </p> <p class="is-size-7"><span class="has-text-black-bis has-text-weight-semibold">Submitted</span> 18 November, 2024; <span class="has-text-black-bis has-text-weight-semibold">originally announced</span> November 2024. </p> </li> <li class="arxiv-result"> <div class="is-marginless"> <p class="list-title is-inline-block"><a href="">arXiv:2410.14577</a> <span>&nbsp;[<a href="">pdf</a>]&nbsp;</span> </p> <div class="tags is-inline-block"> <span class="tag is-small is-link tooltip is-tooltip-top" data-tooltip="Robotics">cs.RO</span> <span class="tag is-small is-grey tooltip is-tooltip-top" data-tooltip="Systems and Control">eess.SY</span> </div> </div> <p class="title is-5 mathjax"> Reimagining partial thickness keratoplasty: An eye mountable robot for autonomous big bubble needle insertion </p> <p class="authors"> <span class="search-hit">Authors:</span> <a href="/search/cs?searchtype=author&amp;query=Wang%2C+Y">Y. Wang</a>, <a href="/search/cs?searchtype=author&amp;query=Opfermann%2C+J+D">J. D. Opfermann</a>, <a href="/search/cs?searchtype=author&amp;query=Yu%2C+J">J. Yu</a>, <a href="/search/cs?searchtype=author&amp;query=Yi%2C+H">H. Yi</a>, <a href="/search/cs?searchtype=author&amp;query=Kaluna%2C+J">J. Kaluna</a>, <a href="/search/cs?searchtype=author&amp;query=Biswas%2C+R">R. Biswas</a>, <a href="/search/cs?searchtype=author&amp;query=Zuo%2C+R">R. Zuo</a>, <a href="/search/cs?searchtype=author&amp;query=Gensheimer%2C+W">W. Gensheimer</a>, <a href="/search/cs?searchtype=author&amp;query=Krieger%2C+A">A. Krieger</a>, <a href="/search/cs?searchtype=author&amp;query=Kang%2C+J+U">J. U. 
Kang</a> </p> <p class="abstract mathjax"> <span class="has-text-black-bis has-text-weight-semibold">Abstract</span>: <span class="abstract-short has-text-grey-dark mathjax" id="2410.14577v1-abstract-short" style="display: inline;"> Autonomous surgical robots have demonstrated significant potential to standardize surgical outcomes, driving innovations that enhance safety and consistency regardless of individual surgeon experience. Deep anterior lamellar keratoplasty (DALK), a partial thickness corneal transplant surgery aimed at replacing the anterior part of cornea above Descemet membrane (DM), would greatly benefit from an&hellip; <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2410.14577v1-abstract-full').style.display = 'inline'; document.getElementById('2410.14577v1-abstract-short').style.display = 'none';">&#9661; More</a> </span> <span class="abstract-full has-text-grey-dark mathjax" id="2410.14577v1-abstract-full" style="display: none;"> Autonomous surgical robots have demonstrated significant potential to standardize surgical outcomes, driving innovations that enhance safety and consistency regardless of individual surgeon experience. Deep anterior lamellar keratoplasty (DALK), a partial thickness corneal transplant surgery aimed at replacing the anterior part of cornea above Descemet membrane (DM), would greatly benefit from an autonomous surgical approach as it highly relies on surgeon skill with high perforation rates. In this study, we proposed a novel autonomous surgical robotic system (AUTO-DALK) based on a customized neural network capable of precise needle control and consistent big bubble demarcation on cadaver and live rabbit models. We demonstrate the feasibility of an AI-based image-guided vertical drilling approach for big bubble generation, in contrast to the conventional horizontal needle approach. 
Our system integrates an optical coherence tomography (OCT) fiber optic distal sensor into the eye-mountable micro robotic system, which automatically segments OCT M-mode depth signals to identify corneal layers using a custom deep learning algorithm. It enables the robot to autonomously guide the needle to targeted tissue layers via a depth-controlled feedback loop. We compared autonomous needle insertion performance and resulting pneumo-dissection using AUTO-DALK against 1) freehand insertion, 2) OCT sensor guided manual insertion, and 3) teleoperated robotic insertion, reporting significant improvements in insertion depth, pneumo-dissection depth, task completion time, and big bubble formation. Ex vivo and in vivo results indicate that the AI-driven AUTO-DALK system is a promising solution to standardize pneumo-dissection outcomes for partial thickness keratoplasty. <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2410.14577v1-abstract-full').style.display = 'none'; document.getElementById('2410.14577v1-abstract-short').style.display = 'inline';">&#9651; Less</a> </span> </p> <p class="is-size-7"><span class="has-text-black-bis has-text-weight-semibold">Submitted</span> 18 October, 2024; <span class="has-text-black-bis has-text-weight-semibold">originally announced</span> October 2024. 
</p> </li> <li class="arxiv-result"> <div class="is-marginless"> <p class="list-title is-inline-block"><a href="">arXiv:2406.11562</a> <span>&nbsp;[<a href="">pdf</a>, <a href="">other</a>]&nbsp;</span> </p> <div class="tags is-inline-block"> <span class="tag is-small is-link tooltip is-tooltip-top" data-tooltip="Machine Learning">cs.LG</span> <span class="tag is-small is-grey tooltip is-tooltip-top" data-tooltip="Robotics">cs.RO</span> </div> </div> <p class="title is-5 mathjax"> An Imitative Reinforcement Learning Framework for Autonomous Dogfight </p> <p class="authors"> <span class="search-hit">Authors:</span> <a href="/search/cs?searchtype=author&amp;query=Li%2C+S">Siyuan Li</a>, <a href="/search/cs?searchtype=author&amp;query=Zuo%2C+R">Rongchang Zuo</a>, <a href="/search/cs?searchtype=author&amp;query=Liu%2C+P">Peng Liu</a>, <a href="/search/cs?searchtype=author&amp;query=Zhao%2C+Y">Yingnan Zhao</a> </p> <p class="abstract mathjax"> <span class="has-text-black-bis has-text-weight-semibold">Abstract</span>: <span class="abstract-short has-text-grey-dark mathjax" id="2406.11562v1-abstract-short" style="display: inline;"> Unmanned Combat Aerial Vehicle (UCAV) dogfight, which refers to a fight between two or more UCAVs usually at close quarters, plays a decisive role on the aerial battlefields. With the evolution of artificial intelligence, dogfight progressively transits towards intelligent and autonomous modes. 
However, the development of autonomous dogfight policy learning is hindered by challenges such as weak e&hellip; <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2406.11562v1-abstract-full').style.display = 'inline'; document.getElementById('2406.11562v1-abstract-short').style.display = 'none';">&#9661; More</a> </span> <span class="abstract-full has-text-grey-dark mathjax" id="2406.11562v1-abstract-full" style="display: none;"> Unmanned Combat Aerial Vehicle (UCAV) dogfight, which refers to a fight between two or more UCAVs usually at close quarters, plays a decisive role on the aerial battlefields. With the evolution of artificial intelligence, dogfight progressively transits towards intelligent and autonomous modes. However, the development of autonomous dogfight policy learning is hindered by challenges such as weak exploration capabilities, low learning efficiency, and unrealistic simulated environments. To overcome these challenges, this paper proposes a novel imitative reinforcement learning framework, which efficiently leverages expert data while enabling autonomous exploration. The proposed framework not only enhances learning efficiency through expert imitation, but also ensures adaptability to dynamic environments via autonomous exploration with reinforcement learning. Therefore, the proposed framework can learn a successful dogfight policy of &#39;pursuit-lock-launch&#39; for UCAVs. To support data-driven learning, we establish a dogfight environment based on the Harfang3D sandbox, where we conduct extensive experiments. The results indicate that the proposed framework excels in multistage dogfight and significantly outperforms state-of-the-art reinforcement learning and imitation learning methods. 
Thanks to the ability of imitating experts and autonomous exploration, our framework can quickly learn the critical knowledge in complex aerial combat tasks, achieving up to a 100% success rate and demonstrating excellent robustness. <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2406.11562v1-abstract-full').style.display = 'none'; document.getElementById('2406.11562v1-abstract-short').style.display = 'inline';">&#9651; Less</a> </span> </p> <p class="is-size-7"><span class="has-text-black-bis has-text-weight-semibold">Submitted</span> 17 June, 2024; <span class="has-text-black-bis has-text-weight-semibold">originally announced</span> June 2024. </p> </li> <li class="arxiv-result"> <div class="is-marginless"> <p class="list-title is-inline-block"><a href="">arXiv:2405.00980</a> <span>&nbsp;[<a href="">pdf</a>, <a href="">other</a>]&nbsp;</span> </p> <div class="tags is-inline-block"> <span class="tag is-small is-link tooltip is-tooltip-top" data-tooltip="Computation and Language">cs.CL</span> <span class="tag is-small is-grey tooltip is-tooltip-top" data-tooltip="Computer Vision and Pattern Recognition">cs.CV</span> </div> </div> <p class="title is-5 mathjax"> A Hong Kong Sign Language Corpus Collected from Sign-interpreted TV News </p> <p class="authors"> <span class="search-hit">Authors:</span> <a href="/search/cs?searchtype=author&amp;query=Niu%2C+Z">Zhe Niu</a>, <a href="/search/cs?searchtype=author&amp;query=Zuo%2C+R">Ronglai Zuo</a>, <a href="/search/cs?searchtype=author&amp;query=Mak%2C+B">Brian Mak</a>, <a href="/search/cs?searchtype=author&amp;query=Wei%2C+F">Fangyun Wei</a> </p> <p class="abstract mathjax"> <span class="has-text-black-bis has-text-weight-semibold">Abstract</span>: <span class="abstract-short has-text-grey-dark mathjax" id="2405.00980v1-abstract-short" style="display: inline;"> This paper introduces TVB-HKSL-News, a new Hong Kong sign language (HKSL) dataset collected from a TV news program over a 
period of 7 months. The dataset is collected to enrich resources for HKSL and support research in large-vocabulary continuous sign language recognition (SLR) and translation (SLT). It consists of 16.07 hours of sign videos of two signers with a vocabulary of 6,515 glosses (for&hellip; <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2405.00980v1-abstract-full').style.display = 'inline'; document.getElementById('2405.00980v1-abstract-short').style.display = 'none';">&#9661; More</a> </span> <span class="abstract-full has-text-grey-dark mathjax" id="2405.00980v1-abstract-full" style="display: none;"> This paper introduces TVB-HKSL-News, a new Hong Kong sign language (HKSL) dataset collected from a TV news program over a period of 7 months. The dataset is collected to enrich resources for HKSL and support research in large-vocabulary continuous sign language recognition (SLR) and translation (SLT). It consists of 16.07 hours of sign videos of two signers with a vocabulary of 6,515 glosses (for SLR) and 2,850 Chinese characters or 18K Chinese words (for SLT). One signer has 11.66 hours of sign videos and the other has 4.41 hours. One objective in building the dataset is to support the investigation of how well large-vocabulary continuous sign language recognition/translation can be done for a single signer given a (relatively) large amount of his/her training data, which could potentially lead to the development of new modeling methods. Besides, most parts of the data collection pipeline are automated with little human intervention; we believe that our collection method can be scaled up to collect more sign language data easily for SLT in the future for any sign languages if such sign-interpreted videos are available. We also run a SOTA SLR/SLT model on the dataset and get a baseline SLR word error rate of 34.08% and a baseline SLT BLEU-4 score of 23.58 for benchmarking future research on the dataset. 
<a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2405.00980v1-abstract-full').style.display = 'none'; document.getElementById('2405.00980v1-abstract-short').style.display = 'inline';">&#9651; Less</a> </span> </p> <p class="is-size-7"><span class="has-text-black-bis has-text-weight-semibold">Submitted</span> 1 May, 2024; <span class="has-text-black-bis has-text-weight-semibold">originally announced</span> May 2024. </p> <p class="comments is-size-7"> <span class="has-text-black-bis has-text-weight-semibold">Comments:</span> <span class="has-text-grey-dark mathjax">Accepted by LREC-COLING 2024</span> </p> </li> <li class="arxiv-result"> <div class="is-marginless"> <p class="list-title is-inline-block"><a href="">arXiv:2401.05336</a> <span>&nbsp;[<a href="">pdf</a>, <a href="">other</a>]&nbsp;</span> </p> <div class="tags is-inline-block"> <span class="tag is-small is-link tooltip is-tooltip-top" data-tooltip="Computer Vision and Pattern Recognition">cs.CV</span> </div> </div> <p class="title is-5 mathjax"> Towards Online Continuous Sign Language Recognition and Translation </p> <p class="authors"> <span class="search-hit">Authors:</span> <a href="/search/cs?searchtype=author&amp;query=Zuo%2C+R">Ronglai Zuo</a>, <a href="/search/cs?searchtype=author&amp;query=Wei%2C+F">Fangyun Wei</a>, <a href="/search/cs?searchtype=author&amp;query=Mak%2C+B">Brian Mak</a> </p> <p class="abstract mathjax"> <span class="has-text-black-bis has-text-weight-semibold">Abstract</span>: <span class="abstract-short has-text-grey-dark mathjax" id="2401.05336v2-abstract-short" style="display: inline;"> Research on continuous sign language recognition (CSLR) is essential to bridge the communication gap between deaf and hearing individuals. Numerous previous studies have trained their models using the connectionist temporal classification (CTC) loss. 
During inference, these CTC-based models generally require the entire sign video as input to make predictions, a process known as offline recognition&hellip; <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2401.05336v2-abstract-full').style.display = 'inline'; document.getElementById('2401.05336v2-abstract-short').style.display = 'none';">&#9661; More</a> </span> <span class="abstract-full has-text-grey-dark mathjax" id="2401.05336v2-abstract-full" style="display: none;"> Research on continuous sign language recognition (CSLR) is essential to bridge the communication gap between deaf and hearing individuals. Numerous previous studies have trained their models using the connectionist temporal classification (CTC) loss. During inference, these CTC-based models generally require the entire sign video as input to make predictions, a process known as offline recognition, which suffers from high latency and substantial memory usage. In this work, we take the first step towards online CSLR. Our approach consists of three phases: 1) developing a sign dictionary; 2) training an isolated sign language recognition model on the dictionary; and 3) employing a sliding window approach on the input sign sequence, feeding each sign clip to the optimized model for online recognition. Additionally, our online recognition model can be extended to support online translation by integrating a gloss-to-text network and can enhance the performance of any offline model. With these extensions, our online approach achieves new state-of-the-art performance on three popular benchmarks across various task settings. 
Code and models are available at <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2401.05336v2-abstract-full').style.display = 'none'; document.getElementById('2401.05336v2-abstract-short').style.display = 'inline';">&#9651; Less</a> </span> </p> <p class="is-size-7"><span class="has-text-black-bis has-text-weight-semibold">Submitted</span> 22 September, 2024; <span class="has-text-black-bis has-text-weight-semibold">v1</span> submitted 10 January, 2024; <span class="has-text-black-bis has-text-weight-semibold">originally announced</span> January 2024. </p> <p class="comments is-size-7"> <span class="has-text-black-bis has-text-weight-semibold">Comments:</span> <span class="has-text-grey-dark mathjax">Accepted to EMNLP 2024</span> </p> </li> <li class="arxiv-result"> <div class="is-marginless"> <p class="list-title is-inline-block"><a href="">arXiv:2401.04730</a> <span>&nbsp;[<a href="">pdf</a>, <a href="">other</a>]&nbsp;</span> </p> <div class="tags is-inline-block"> <span class="tag is-small is-link tooltip is-tooltip-top" data-tooltip="Computer Vision and Pattern Recognition">cs.CV</span> </div> </div> <p class="title is-5 mathjax"> A Simple Baseline for Spoken Language to Sign Language Translation with 3D Avatars </p> <p class="authors"> <span class="search-hit">Authors:</span> <a href="/search/cs?searchtype=author&amp;query=Zuo%2C+R">Ronglai Zuo</a>, <a href="/search/cs?searchtype=author&amp;query=Wei%2C+F">Fangyun Wei</a>, <a href="/search/cs?searchtype=author&amp;query=Chen%2C+Z">Zenggui Chen</a>, <a href="/search/cs?searchtype=author&amp;query=Mak%2C+B">Brian Mak</a>, <a href="/search/cs?searchtype=author&amp;query=Yang%2C+J">Jiaolong Yang</a>, <a href="/search/cs?searchtype=author&amp;query=Tong%2C+X">Xin Tong</a> </p> <p class="abstract mathjax"> <span class="has-text-black-bis has-text-weight-semibold">Abstract</span>: <span class="abstract-short has-text-grey-dark mathjax" id="2401.04730v2-abstract-short" 
style="display: inline;"> The objective of this paper is to develop a functional system for translating spoken languages into sign languages, referred to as Spoken2Sign translation. The Spoken2Sign task is orthogonal and complementary to traditional sign language to spoken language (Sign2Spoken) translation. To enable Spoken2Sign translation, we present a simple baseline consisting of three steps: 1) creating a gloss-video&hellip; <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2401.04730v2-abstract-full').style.display = 'inline'; document.getElementById('2401.04730v2-abstract-short').style.display = 'none';">&#9661; More</a> </span> <span class="abstract-full has-text-grey-dark mathjax" id="2401.04730v2-abstract-full" style="display: none;"> The objective of this paper is to develop a functional system for translating spoken languages into sign languages, referred to as Spoken2Sign translation. The Spoken2Sign task is orthogonal and complementary to traditional sign language to spoken language (Sign2Spoken) translation. To enable Spoken2Sign translation, we present a simple baseline consisting of three steps: 1) creating a gloss-video dictionary using existing Sign2Spoken benchmarks; 2) estimating a 3D sign for each sign video in the dictionary; 3) training a Spoken2Sign model, which is composed of a Text2Gloss translator, a sign connector, and a rendering module, with the aid of the yielded gloss-3D sign dictionary. The translation results are then displayed through a sign avatar. As far as we know, we are the first to present the Spoken2Sign task in an output format of 3D signs. In addition to its capability of Spoken2Sign translation, we also demonstrate that two by-products of our approach, 3D keypoint augmentation and multi-view understanding, can assist in keypoint-based sign language understanding. 
Code and models are available at <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2401.04730v2-abstract-full').style.display = 'none'; document.getElementById('2401.04730v2-abstract-short').style.display = 'inline';">&#9651; Less</a> </span> </p> <p class="is-size-7"><span class="has-text-black-bis has-text-weight-semibold">Submitted</span> 3 July, 2024; <span class="has-text-black-bis has-text-weight-semibold">v1</span> submitted 9 January, 2024; <span class="has-text-black-bis has-text-weight-semibold">originally announced</span> January 2024. </p> <p class="comments is-size-7"> <span class="has-text-black-bis has-text-weight-semibold">Comments:</span> <span class="has-text-grey-dark mathjax">Accepted by ECCV 2024</span> </p> </li> <li class="arxiv-result"> <div class="is-marginless"> <p class="list-title is-inline-block"><a href="">arXiv:2310.14274</a> <span>&nbsp;[<a href="">pdf</a>, <a href="">other</a>]&nbsp;</span> </p> <div class="tags is-inline-block"> <span class="tag is-small is-link tooltip is-tooltip-top" data-tooltip="Machine Learning">cs.LG</span> </div> </div> <p class="title is-5 mathjax"> Robust Visual Imitation Learning with Inverse Dynamics Representations </p> <p class="authors"> <span class="search-hit">Authors:</span> <a href="/search/cs?searchtype=author&amp;query=Li%2C+S">Siyuan Li</a>, <a href="/search/cs?searchtype=author&amp;query=Wang%2C+X">Xun Wang</a>, <a href="/search/cs?searchtype=author&amp;query=Zuo%2C+R">Rongchang Zuo</a>, <a href="/search/cs?searchtype=author&amp;query=Sun%2C+K">Kewu Sun</a>, <a href="/search/cs?searchtype=author&amp;query=Cui%2C+L">Lingfei Cui</a>, <a href="/search/cs?searchtype=author&amp;query=Ding%2C+J">Jishiyu Ding</a>, <a href="/search/cs?searchtype=author&amp;query=Liu%2C+P">Peng Liu</a>, <a href="/search/cs?searchtype=author&amp;query=Ma%2C+Z">Zhe Ma</a> </p> <p class="abstract mathjax"> <span class="has-text-black-bis has-text-weight-semibold">Abstract</span>: <span 
class="abstract-short has-text-grey-dark mathjax" id="2310.14274v1-abstract-short" style="display: inline;"> Imitation learning (IL) has achieved considerable success in solving complex sequential decision-making problems. However, current IL methods mainly assume that the environment for learning policies is the same as the environment for collecting expert datasets. Therefore, these methods may fail to work when there are slight differences between the learning and expert environments, especially for c&hellip; <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2310.14274v1-abstract-full').style.display = 'inline'; document.getElementById('2310.14274v1-abstract-short').style.display = 'none';">&#9661; More</a> </span> <span class="abstract-full has-text-grey-dark mathjax" id="2310.14274v1-abstract-full" style="display: none;"> Imitation learning (IL) has achieved considerable success in solving complex sequential decision-making problems. However, current IL methods mainly assume that the environment for learning policies is the same as the environment for collecting expert datasets. Therefore, these methods may fail to work when there are slight differences between the learning and expert environments, especially for challenging problems with high-dimensional image observations. However, in real-world scenarios, it is rare to have the chance to collect expert trajectories precisely in the target learning environment. To address this challenge, we propose a novel robust imitation learning approach, where we develop an inverse dynamics state representation learning objective to align the expert environment and the learning environment. With the abstract state representation, we design an effective reward function, which thoroughly measures the similarity between behavior data and expert data not only element-wise, but also from the trajectory level. 
We conduct extensive experiments to evaluate the proposed approach under various visual perturbations and in diverse visual control tasks. Our approach can achieve a near-expert performance in most environments, and significantly outperforms the state-of-the-art visual IL methods and robust IL methods. <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2310.14274v1-abstract-full').style.display = 'none'; document.getElementById('2310.14274v1-abstract-short').style.display = 'inline';">&#9651; Less</a> </span> </p> <p class="is-size-7"><span class="has-text-black-bis has-text-weight-semibold">Submitted</span> 22 October, 2023; <span class="has-text-black-bis has-text-weight-semibold">originally announced</span> October 2023. </p> </li> <li class="arxiv-result"> <div class="is-marginless"> <p class="list-title is-inline-block"><a href="">arXiv:2306.14222</a> <span>&nbsp;[<a href="">pdf</a>, <a href="">other</a>]&nbsp;</span> </p> <div class="tags is-inline-block"> <span class="tag is-small is-link tooltip is-tooltip-top" data-tooltip="Computation and Language">cs.CL</span> <span class="tag is-small is-grey tooltip is-tooltip-top" data-tooltip="Artificial Intelligence">cs.AI</span> <span class="tag is-small is-grey tooltip is-tooltip-top" data-tooltip="Statistical Finance">q-fin.ST</span> </div> </div> <p class="title is-5 mathjax"> Unveiling the Potential of Sentiment: Can Large Language Models Predict Chinese Stock Price Movements? 
</p> <p class="authors"> <span class="search-hit">Authors:</span> <a href="/search/cs?searchtype=author&amp;query=Zhang%2C+H">Haohan Zhang</a>, <a href="/search/cs?searchtype=author&amp;query=Hua%2C+F">Fengrui Hua</a>, <a href="/search/cs?searchtype=author&amp;query=Xu%2C+C">Chengjin Xu</a>, <a href="/search/cs?searchtype=author&amp;query=Kong%2C+H">Hao Kong</a>, <a href="/search/cs?searchtype=author&amp;query=Zuo%2C+R">Ruiting Zuo</a>, <a href="/search/cs?searchtype=author&amp;query=Guo%2C+J">Jian Guo</a> </p> <p class="abstract mathjax"> <span class="has-text-black-bis has-text-weight-semibold">Abstract</span>: <span class="abstract-short has-text-grey-dark mathjax" id="2306.14222v2-abstract-short" style="display: inline;"> The rapid advancement of Large Language Models (LLMs) has spurred discussions about their potential to enhance quantitative trading strategies. LLMs excel in analyzing sentiments about listed companies from financial news, providing critical insights for trading decisions. However, the performance of LLMs in this task varies substantially due to their inherent characteristics. This paper introduce&hellip; <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2306.14222v2-abstract-full').style.display = 'inline'; document.getElementById('2306.14222v2-abstract-short').style.display = 'none';">&#9661; More</a> </span> <span class="abstract-full has-text-grey-dark mathjax" id="2306.14222v2-abstract-full" style="display: none;"> The rapid advancement of Large Language Models (LLMs) has spurred discussions about their potential to enhance quantitative trading strategies. LLMs excel in analyzing sentiments about listed companies from financial news, providing critical insights for trading decisions. However, the performance of LLMs in this task varies substantially due to their inherent characteristics. This paper introduces a standardized experimental procedure for comprehensive evaluations. 
We detail the methodology using three distinct LLMs, each embodying a unique approach to performance enhancement, applied specifically to the task of sentiment factor extraction from large volumes of Chinese news summaries. Subsequently, we develop quantitative trading strategies using these sentiment factors and conduct back-tests in realistic scenarios. Our results will offer perspectives about the performances of Large Language Models applied to extracting sentiments from Chinese news texts. <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2306.14222v2-abstract-full').style.display = 'none'; document.getElementById('2306.14222v2-abstract-short').style.display = 'inline';">&#9651; Less</a> </span> </p> <p class="is-size-7"><span class="has-text-black-bis has-text-weight-semibold">Submitted</span> 4 May, 2024; <span class="has-text-black-bis has-text-weight-semibold">v1</span> submitted 25 June, 2023; <span class="has-text-black-bis has-text-weight-semibold">originally announced</span> June 2023. 
</p> </li> <li class="arxiv-result"> <div class="is-marginless"> <p class="list-title is-inline-block"><a href="">arXiv:2303.12080</a> <span>&nbsp;[<a href="">pdf</a>, <a href="">other</a>]&nbsp;</span> </p> <div class="tags is-inline-block"> <span class="tag is-small is-link tooltip is-tooltip-top" data-tooltip="Computer Vision and Pattern Recognition">cs.CV</span> </div> </div> <p class="title is-5 mathjax"> Natural Language-Assisted Sign Language Recognition </p> <p class="authors"> <span class="search-hit">Authors:</span> <a href="/search/cs?searchtype=author&amp;query=Zuo%2C+R">Ronglai Zuo</a>, <a href="/search/cs?searchtype=author&amp;query=Wei%2C+F">Fangyun Wei</a>, <a href="/search/cs?searchtype=author&amp;query=Mak%2C+B">Brian Mak</a> </p> <p class="abstract mathjax"> <span class="has-text-black-bis has-text-weight-semibold">Abstract</span>: <span class="abstract-short has-text-grey-dark mathjax" id="2303.12080v1-abstract-short" style="display: inline;"> Sign languages are visual languages which convey information by signers&#39; handshape, facial expression, body movement, and so forth. Due to the inherent restriction of combinations of these visual ingredients, there exist a significant number of visually indistinguishable signs (VISigns) in sign languages, which limits the recognition capacity of vision neural networks. To mitigate the problem, we&hellip; <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2303.12080v1-abstract-full').style.display = 'inline'; document.getElementById('2303.12080v1-abstract-short').style.display = 'none';">&#9661; More</a> </span> <span class="abstract-full has-text-grey-dark mathjax" id="2303.12080v1-abstract-full" style="display: none;"> Sign languages are visual languages which convey information by signers&#39; handshape, facial expression, body movement, and so forth. 
Due to the inherent restriction of combinations of these visual ingredients, there exist a significant number of visually indistinguishable signs (VISigns) in sign languages, which limits the recognition capacity of vision neural networks. To mitigate the problem, we propose the Natural Language-Assisted Sign Language Recognition (NLA-SLR) framework, which exploits semantic information contained in glosses (sign labels). First, for VISigns with similar semantic meanings, we propose language-aware label smoothing by generating soft labels for each training sign whose smoothing weights are computed from the normalized semantic similarities among the glosses to ease training. Second, for VISigns with distinct semantic meanings, we present an inter-modality mixup technique which blends vision and gloss features to further maximize the separability of different signs under the supervision of blended labels. Besides, we also introduce a novel backbone, video-keypoint network, which not only models both RGB videos and human body keypoints but also derives knowledge from sign videos of different temporal receptive fields. Empirically, our method achieves state-of-the-art performance on three widely-adopted benchmarks: MSASL, WLASL, and NMFs-CSL. Codes are available at <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2303.12080v1-abstract-full').style.display = 'none'; document.getElementById('2303.12080v1-abstract-short').style.display = 'inline';">&#9651; Less</a> </span> </p> <p class="is-size-7"><span class="has-text-black-bis has-text-weight-semibold">Submitted</span> 21 March, 2023; <span class="has-text-black-bis has-text-weight-semibold">originally announced</span> March 2023. </p> <p class="comments is-size-7"> <span class="has-text-black-bis has-text-weight-semibold">Comments:</span> <span class="has-text-grey-dark mathjax">Accepted by CVPR 2023. 
Codes are available at</span> </p> </li> <li class="arxiv-result"> <div class="is-marginless"> <p class="list-title is-inline-block"><a href="">arXiv:2212.13023</a> <span>&nbsp;[<a href="">pdf</a>, <a href="">other</a>]&nbsp;</span> </p> <div class="tags is-inline-block"> <span class="tag is-small is-link tooltip is-tooltip-top" data-tooltip="Computer Vision and Pattern Recognition">cs.CV</span> </div> </div> <p class="title is-5 mathjax"> Improving Continuous Sign Language Recognition with Consistency Constraints and Signer Removal </p> <p class="authors"> <span class="search-hit">Authors:</span> <a href="/search/cs?searchtype=author&amp;query=Zuo%2C+R">Ronglai Zuo</a>, <a href="/search/cs?searchtype=author&amp;query=Mak%2C+B">Brian Mak</a> </p> <p class="abstract mathjax"> <span class="has-text-black-bis has-text-weight-semibold">Abstract</span>: <span class="abstract-short has-text-grey-dark mathjax" id="2212.13023v2-abstract-short" style="display: inline;"> Most deep-learning-based continuous sign language recognition (CSLR) models share a similar backbone consisting of a visual module, a sequential module, and an alignment module. However, due to limited training samples, a connectionist temporal classification loss may not train such CSLR backbones sufficiently. In this work, we propose three auxiliary tasks to enhance the CSLR backbones. The first&hellip; <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2212.13023v2-abstract-full').style.display = 'inline'; document.getElementById('2212.13023v2-abstract-short').style.display = 'none';">&#9661; More</a> </span> <span class="abstract-full has-text-grey-dark mathjax" id="2212.13023v2-abstract-full" style="display: none;"> Most deep-learning-based continuous sign language recognition (CSLR) models share a similar backbone consisting of a visual module, a sequential module, and an alignment module. 
However, due to limited training samples, a connectionist temporal classification loss may not train such CSLR backbones sufficiently. In this work, we propose three auxiliary tasks to enhance the CSLR backbones. The first task enhances the visual module, which is sensitive to insufficient training, from the perspective of consistency. Specifically, since the information in sign languages is mainly conveyed by signers&#39; facial expressions and hand movements, a keypoint-guided spatial attention module is developed to force the visual module to focus on informative regions, i.e., spatial attention consistency. Second, since the output features of both the visual and sequential modules represent the same sentence, a sentence embedding consistency constraint is imposed between the two modules to better exploit the backbone&#39;s power and enhance the representation power of both features. We call the CSLR model trained with the above auxiliary tasks consistency-enhanced CSLR, which performs well on signer-dependent datasets in which all signers appear during both training and testing. To make it more robust for the signer-independent setting, a signer removal module based on feature disentanglement is further proposed to remove signer information from the backbone. Extensive ablation studies are conducted to validate the effectiveness of these auxiliary tasks. More remarkably, with a transformer-based backbone, our model achieves state-of-the-art or competitive performance on five benchmarks: PHOENIX-2014, PHOENIX-2014-T, PHOENIX-2014-SI, CSL, and CSL-Daily. 
Code and Models are available at <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2212.13023v2-abstract-full').style.display = 'none'; document.getElementById('2212.13023v2-abstract-short').style.display = 'inline';">&#9651; Less</a> </span> </p> <p class="is-size-7"><span class="has-text-black-bis has-text-weight-semibold">Submitted</span> 11 January, 2024; <span class="has-text-black-bis has-text-weight-semibold">v1</span> submitted 26 December, 2022; <span class="has-text-black-bis has-text-weight-semibold">originally announced</span> December 2022. </p> <p class="comments is-size-7"> <span class="has-text-black-bis has-text-weight-semibold">Comments:</span> <span class="has-text-grey-dark mathjax">Accepted by ACM TOMM</span> </p> </li> <li class="arxiv-result"> <div class="is-marginless"> <p class="list-title is-inline-block"><a href="">arXiv:2211.07641</a> <span>&nbsp;[<a href="">pdf</a>, <a href="">other</a>]&nbsp;</span> </p> <div class="tags is-inline-block"> <span class="tag is-small is-link tooltip is-tooltip-top" data-tooltip="Neural and Evolutionary Computing">cs.NE</span> <span class="tag is-small is-grey tooltip is-tooltip-top" data-tooltip="Machine Learning">cs.LG</span> <span class="tag is-small is-grey tooltip is-tooltip-top" data-tooltip="Signal Processing">eess.SP</span> </div> </div> <p class="title is-5 mathjax"> Motif-topology improved Spiking Neural Network for the Cocktail Party Effect and McGurk Effect </p> <p class="authors"> <span class="search-hit">Authors:</span> <a href="/search/cs?searchtype=author&amp;query=Jia%2C+S">Shuncheng Jia</a>, <a href="/search/cs?searchtype=author&amp;query=Zhang%2C+T">Tielin Zhang</a>, <a href="/search/cs?searchtype=author&amp;query=Zuo%2C+R">Ruichen Zuo</a>, <a href="/search/cs?searchtype=author&amp;query=Xu%2C+B">Bo Xu</a> </p> <p class="abstract mathjax"> <span class="has-text-black-bis has-text-weight-semibold">Abstract</span>: <span class="abstract-short 
has-text-grey-dark mathjax" id="2211.07641v1-abstract-short" style="display: inline;"> Network architectures and learning principles are key in forming complex functions in artificial neural networks (ANNs) and spiking neural networks (SNNs). SNNs are considered the new-generation artificial networks by incorporating more biological features than ANNs, including dynamic spiking neurons, functionally specified architectures, and efficient learning paradigms. Network architect&hellip; <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2211.07641v1-abstract-full').style.display = 'inline'; document.getElementById('2211.07641v1-abstract-short').style.display = 'none';">&#9661; More</a> </span> <span class="abstract-full has-text-grey-dark mathjax" id="2211.07641v1-abstract-full" style="display: none;"> Network architectures and learning principles are key in forming complex functions in artificial neural networks (ANNs) and spiking neural networks (SNNs). SNNs are considered the new-generation artificial networks by incorporating more biological features than ANNs, including dynamic spiking neurons, functionally specified architectures, and efficient learning paradigms. Network architectures are also considered to embody the function of the network. Here, we propose a Motif-topology improved SNN (M-SNN) for efficient multi-sensory integration and the simulation of cognitive phenomena. The cognitive phenomena we simulate include the cocktail party effect and the McGurk effect, which have been discussed by many researchers. Our M-SNN is constituted by meta operators called network motifs. The 3-node network motif topologies are sourced from artificial ones pre-learned from the spatial or temporal dataset. In the single-sensory classification task, the results showed that the accuracy of M-SNN using network motif topologies was higher than that of the pure feedforward network topology without them. 
In the multi-sensory integration task, the performance of M-SNN using artificial network motif was better than the state-of-the-art SNN using BRP (biologically-plausible reward propagation). Furthermore, the M-SNN could better simulate the cocktail party effect and McGurk effect with lower computational cost. We think the artificial network motifs could be considered as some prior knowledge that would contribute to the multi-sensory integration of SNNs and provide more benefits for simulating the cognitive phenomenon. <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2211.07641v1-abstract-full').style.display = 'none'; document.getElementById('2211.07641v1-abstract-short').style.display = 'inline';">&#9651; Less</a> </span> </p> <p class="is-size-7"><span class="has-text-black-bis has-text-weight-semibold">Submitted</span> 12 November, 2022; <span class="has-text-black-bis has-text-weight-semibold">originally announced</span> November 2022. </p> </li> <li class="arxiv-result"> <div class="is-marginless"> <p class="list-title is-inline-block"><a href="">arXiv:2211.01367</a> <span>&nbsp;[<a href="">pdf</a>, <a href="">other</a>]&nbsp;</span> </p> <div class="tags is-inline-block"> <span class="tag is-small is-link tooltip is-tooltip-top" data-tooltip="Computer Vision and Pattern Recognition">cs.CV</span> </div> </div> <p class="title is-5 mathjax"> Two-Stream Network for Sign Language Recognition and Translation </p> <p class="authors"> <span class="search-hit">Authors:</span> <a href="/search/cs?searchtype=author&amp;query=Chen%2C+Y">Yutong Chen</a>, <a href="/search/cs?searchtype=author&amp;query=Zuo%2C+R">Ronglai Zuo</a>, <a href="/search/cs?searchtype=author&amp;query=Wei%2C+F">Fangyun Wei</a>, <a href="/search/cs?searchtype=author&amp;query=Wu%2C+Y">Yu Wu</a>, <a href="/search/cs?searchtype=author&amp;query=Liu%2C+S">Shujie Liu</a>, <a href="/search/cs?searchtype=author&amp;query=Mak%2C+B">Brian Mak</a> </p> <p class="abstract 
mathjax"> <span class="has-text-black-bis has-text-weight-semibold">Abstract</span>: <span class="abstract-short has-text-grey-dark mathjax" id="2211.01367v2-abstract-short" style="display: inline;"> Sign languages are visual languages using manual articulations and non-manual elements to convey information. For sign language recognition and translation, the majority of existing approaches directly encode RGB videos into hidden representations. RGB videos, however, are raw signals with substantial visual redundancy, leading the encoder to overlook the key information for sign language understa&hellip; <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2211.01367v2-abstract-full').style.display = 'inline'; document.getElementById('2211.01367v2-abstract-short').style.display = 'none';">&#9661; More</a> </span> <span class="abstract-full has-text-grey-dark mathjax" id="2211.01367v2-abstract-full" style="display: none;"> Sign languages are visual languages using manual articulations and non-manual elements to convey information. For sign language recognition and translation, the majority of existing approaches directly encode RGB videos into hidden representations. RGB videos, however, are raw signals with substantial visual redundancy, leading the encoder to overlook the key information for sign language understanding. To mitigate this problem and better incorporate domain knowledge, such as handshape and body movement, we introduce a dual visual encoder containing two separate streams to model both the raw videos and the keypoint sequences generated by an off-the-shelf keypoint estimator. To make the two streams interact with each other, we explore a variety of techniques, including bidirectional lateral connection, sign pyramid network with auxiliary supervision, and frame-level self-distillation. The resulting model is called TwoStream-SLR, which is competent for sign language recognition (SLR). 
TwoStream-SLR is extended to a sign language translation (SLT) model, TwoStream-SLT, by simply attaching an extra translation network. Experimentally, our TwoStream-SLR and TwoStream-SLT achieve state-of-the-art performance on SLR and SLT tasks across a series of datasets including Phoenix-2014, Phoenix-2014T, and CSL-Daily. Code and models are available at: <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2211.01367v2-abstract-full').style.display = 'none'; document.getElementById('2211.01367v2-abstract-short').style.display = 'inline';">&#9651; Less</a> </span> </p> <p class="is-size-7"><span class="has-text-black-bis has-text-weight-semibold">Submitted</span> 22 March, 2023; <span class="has-text-black-bis has-text-weight-semibold">v1</span> submitted 2 November, 2022; <span class="has-text-black-bis has-text-weight-semibold">originally announced</span> November 2022. </p> <p class="comments is-size-7"> <span class="has-text-black-bis has-text-weight-semibold">Comments:</span> <span class="has-text-grey-dark mathjax">Accepted by NeurIPS 2022. 
Code and models are available at:</span> </p> </li> <li class="arxiv-result"> <div class="is-marginless"> <p class="list-title is-inline-block"><a href="">arXiv:2202.06821</a> <span>&nbsp;[<a href="">pdf</a>, <a href="">other</a>]&nbsp;</span> </p> <div class="tags is-inline-block"> <span class="tag is-small is-link tooltip is-tooltip-top" data-tooltip="Neural and Evolutionary Computing">cs.NE</span> <span class="tag is-small is-grey tooltip is-tooltip-top" data-tooltip="Artificial Intelligence">cs.AI</span> </div> </div> <p class="title is-5 mathjax"> Motif-topology and Reward-learning improved Spiking Neural Network for Efficient Multi-sensory Integration </p> <p class="authors"> <span class="search-hit">Authors:</span> <a href="/search/cs?searchtype=author&amp;query=Jia%2C+S">Shuncheng Jia</a>, <a href="/search/cs?searchtype=author&amp;query=Zuo%2C+R">Ruichen Zuo</a>, <a href="/search/cs?searchtype=author&amp;query=Zhang%2C+T">Tielin Zhang</a>, <a href="/search/cs?searchtype=author&amp;query=Liu%2C+H">Hongxing Liu</a>, <a href="/search/cs?searchtype=author&amp;query=Xu%2C+B">Bo Xu</a> </p> <p class="abstract mathjax"> <span class="has-text-black-bis has-text-weight-semibold">Abstract</span>: <span class="abstract-short has-text-grey-dark mathjax" id="2202.06821v1-abstract-short" style="display: inline;"> Network architectures and learning principles are key in forming complex functions in artificial neural networks (ANNs) and spiking neural networks (SNNs). SNNs are considered the new-generation artificial networks by incorporating more biological features than ANNs, including dynamic spiking neurons, functionally specified architectures, and efficient learning paradigms. 
In this paper, we propose&hellip; <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2202.06821v1-abstract-full').style.display = 'inline'; document.getElementById('2202.06821v1-abstract-short').style.display = 'none';">&#9661; More</a> </span> <span class="abstract-full has-text-grey-dark mathjax" id="2202.06821v1-abstract-full" style="display: none;"> Network architectures and learning principles are key in forming complex functions in artificial neural networks (ANNs) and spiking neural networks (SNNs). SNNs are considered the new-generation artificial networks by incorporating more biological features than ANNs, including dynamic spiking neurons, functionally specified architectures, and efficient learning paradigms. In this paper, we propose a Motif-topology and Reward-learning improved SNN (MR-SNN) for efficient multi-sensory integration. MR-SNN contains 13 types of 3-node Motif topologies which are first extracted from independent single-sensory learning paradigms and then integrated for multi-sensory classification. The experimental results showed higher accuracy and stronger robustness of the proposed MR-SNN than other conventional SNNs without using Motifs. Furthermore, the proposed reward learning paradigm was biologically plausible and can better explain the cognitive McGurk effect caused by incongruent visual and auditory sensory signals. <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2202.06821v1-abstract-full').style.display = 'none'; document.getElementById('2202.06821v1-abstract-short').style.display = 'inline';">&#9651; Less</a> </span> </p> <p class="is-size-7"><span class="has-text-black-bis has-text-weight-semibold">Submitted</span> 10 February, 2022; <span class="has-text-black-bis has-text-weight-semibold">originally announced</span> February 2022. 
</p> </li> <li class="arxiv-result"> <div class="is-marginless"> <p class="list-title is-inline-block"><a href="">arXiv:2007.07796</a> <span>&nbsp;[<a href="">pdf</a>, <a href="">other</a>]&nbsp;</span> </p> <div class="tags is-inline-block"> <span class="tag is-small is-link tooltip is-tooltip-top" data-tooltip="Machine Learning">cs.LG</span> <span class="tag is-small is-grey tooltip is-tooltip-top" data-tooltip="Machine Learning">stat.ML</span> </div> </div> <p class="title is-5 mathjax"> Neural Topic Models with Survival Supervision: Jointly Predicting Time-to-Event Outcomes and Learning How Clinical Features Relate </p> <p class="authors"> <span class="search-hit">Authors:</span> <a href="/search/cs?searchtype=author&amp;query=Chen%2C+G+H">George H. Chen</a>, <a href="/search/cs?searchtype=author&amp;query=Li%2C+L">Linhong Li</a>, <a href="/search/cs?searchtype=author&amp;query=Zuo%2C+R">Ren Zuo</a>, <a href="/search/cs?searchtype=author&amp;query=Coston%2C+A">Amanda Coston</a>, <a href="/search/cs?searchtype=author&amp;query=Weiss%2C+J+C">Jeremy C. Weiss</a> </p> <p class="abstract mathjax"> <span class="has-text-black-bis has-text-weight-semibold">Abstract</span>: <span class="abstract-short has-text-grey-dark mathjax" id="2007.07796v2-abstract-short" style="display: inline;"> We present a neural network framework for learning a survival model to predict a time-to-event outcome while simultaneously learning a topic model that reveals feature relationships. In particular, we model each subject as a distribution over &#34;topics&#34;, where a topic could, for instance, correspond to an age group, a disorder, or a disease. 
The presence of a topic in a subject means that specific c&hellip; <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2007.07796v2-abstract-full').style.display = 'inline'; document.getElementById('2007.07796v2-abstract-short').style.display = 'none';">&#9661; More</a> </span> <span class="abstract-full has-text-grey-dark mathjax" id="2007.07796v2-abstract-full" style="display: none;"> We present a neural network framework for learning a survival model to predict a time-to-event outcome while simultaneously learning a topic model that reveals feature relationships. In particular, we model each subject as a distribution over &#34;topics&#34;, where a topic could, for instance, correspond to an age group, a disorder, or a disease. The presence of a topic in a subject means that specific clinical features are more likely to appear for the subject. Topics encode information about related features and are learned in a supervised manner to predict a time-to-event outcome. Our framework supports combining many different topic and survival models; training the resulting joint survival-topic model readily scales to large datasets using standard neural net optimizers with minibatch gradient descent. For example, a special case is to combine LDA with a Cox model, in which case a subject&#39;s distribution over topics serves as the input feature vector to the Cox model. We explain how to address practical implementation issues that arise when applying these neural survival-supervised topic models to clinical data, including how to visualize results to assist clinical interpretation. We study the effectiveness of our proposed framework on seven clinical datasets on predicting time until death as well as hospital ICU length of stay, where we find that neural survival-supervised topic models achieve competitive accuracy with existing approaches while yielding interpretable clinical topics that explain feature relationships. 
Our code is available at: <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('2007.07796v2-abstract-full').style.display = 'none'; document.getElementById('2007.07796v2-abstract-short').style.display = 'inline';">&#9651; Less</a> </span> </p> <p class="is-size-7"><span class="has-text-black-bis has-text-weight-semibold">Submitted</span> 4 June, 2024; <span class="has-text-black-bis has-text-weight-semibold">v1</span> submitted 15 July, 2020; <span class="has-text-black-bis has-text-weight-semibold">originally announced</span> July 2020. </p> <p class="comments is-size-7"> <span class="has-text-black-bis has-text-weight-semibold">Comments:</span> <span class="has-text-grey-dark mathjax">Accepted at the Artificial Intelligence in Medicine journal; preliminary conference version (see earlier arXiv draft) appeared in the International Conference on Artificial Intelligence in Medicine (AIME 2020)</span> </p> </li> <li class="arxiv-result"> <div class="is-marginless"> <p class="list-title is-inline-block"><a href="">arXiv:1901.10925</a> <span>&nbsp;[<a href="">pdf</a>, <a href="">ps</a>, <a href="">other</a>]&nbsp;</span> </p> <div class="tags is-inline-block"> <span class="tag is-small is-link tooltip is-tooltip-top" data-tooltip="Logic in Computer Science">cs.LO</span> </div> </div> <p class="title is-5 mathjax"> A Constructive Equivalence between Computation Tree Logic and Failure Trace Testing </p> <p class="authors"> <span class="search-hit">Authors:</span> <a href="/search/cs?searchtype=author&amp;query=Bruda%2C+S+D">Stefan D. Bruda</a>, <a href="/search/cs?searchtype=author&amp;query=Singh%2C+S">Sunita Singh</a>, <a href="/search/cs?searchtype=author&amp;query=Uddin%2C+A+F+M+N">A. F. M. 
Nokib Uddin</a>, <a href="/search/cs?searchtype=author&amp;query=Zhang%2C+Z">Zhiyu Zhang</a>, <a href="/search/cs?searchtype=author&amp;query=Zuo%2C+R">Rui Zuo</a> </p> <p class="abstract mathjax"> <span class="has-text-black-bis has-text-weight-semibold">Abstract</span>: <span class="abstract-short has-text-grey-dark mathjax" id="1901.10925v1-abstract-short" style="display: inline;"> The two major systems of formal verification are model checking and algebraic model-based testing. Model checking is based on some form of temporal logic such as linear temporal logic (LTL) or computation tree logic (CTL). One powerful and realistic logic being used is CTL, which is capable of expressing most interesting properties of processes such as liveness and safety. Model-based testing is b&hellip; <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('1901.10925v1-abstract-full').style.display = 'inline'; document.getElementById('1901.10925v1-abstract-short').style.display = 'none';">&#9661; More</a> </span> <span class="abstract-full has-text-grey-dark mathjax" id="1901.10925v1-abstract-full" style="display: none;"> The two major systems of formal verification are model checking and algebraic model-based testing. Model checking is based on some form of temporal logic such as linear temporal logic (LTL) or computation tree logic (CTL). One powerful and realistic logic being used is CTL, which is capable of expressing most interesting properties of processes such as liveness and safety. Model-based testing is based on some operational semantics of processes (such as traces, failures, or both) and its associated preorders. The most fine-grained preorder besides bisimulation (mostly of theoretical importance) is based on failure traces. We show that these two most powerful variants are equivalent; that is, we show that for any failure trace test there exists a CTL formula equivalent to it, and vice versa. 
All our proofs are constructive and algorithmic. Our result allows for parts of a large system to be specified logically while other parts are specified algebraically, thus combining the best of the two (logic and algebraic) worlds. <a class="is-size-7" style="white-space: nowrap;" onclick="document.getElementById('1901.10925v1-abstract-full').style.display = 'none'; document.getElementById('1901.10925v1-abstract-short').style.display = 'inline';">&#9651; Less</a> </span> </p> <p class="is-size-7"><span class="has-text-black-bis has-text-weight-semibold">Submitted</span> 30 January, 2019; <span class="has-text-black-bis has-text-weight-semibold">originally announced</span> January 2019. </p> <p class="comments is-size-7"> <span class="has-text-black-bis has-text-weight-semibold">Comments:</span> <span class="has-text-grey-dark mathjax">32 pages, 6 figures (all figures typeset with gastex)</span> </p> <p class="comments is-size-7"> <span class="has-text-black-bis has-text-weight-semibold">MSC Class:</span> 68Q60 </p> </li> </ol> <div 