Search results for: multi-agent reinforcement learning
Commenced in January 2007 | Frequency: Monthly | Edition: International | Paper Count: 7793

7793. Deep Reinforcement Learning Model for Autonomous Driving
Authors: Boumaraf Malak
Abstract: The development of intelligent transportation systems (ITS) and artificial intelligence (AI) is paving the way for the widespread adoption of autonomous vehicles (AVs), which opens new opportunities for smart roads, traffic safety, and mobility comfort. A highly intelligent decision-making system is essential for autonomous driving around dense, dynamic objects: it must handle complex road geometry and topology as well as complex multi-agent interactions, and closely follow higher-level commands such as routing information. Autonomous vehicles have become a very active research topic in recent years due to their potential to significantly reduce traffic accidents and personal injuries. New AI-based technologies handle important AV functions in scene understanding, motion planning, decision making, vehicle control, social behavior, and communication. This paper focuses only on deep reinforcement learning-based methods and does not cover the traditional planning techniques that were the subject of extensive research in the past, because reinforcement learning (RL) has become a powerful learning framework now capable of learning complex policies in high-dimensional environments. We review the DRL algorithms that have so far found solutions to the four main problems of autonomous driving, highlight the remaining challenges, and point to possible future research directions.
Keywords: deep reinforcement learning, autonomous driving, deep deterministic policy gradient, deep Q-learning
Procedia: https://publications.waset.org/abstracts/166548/deep-reinforcement-learning-model-for-autonomous-driving | PDF: https://publications.waset.org/abstracts/166548.pdf | Downloads: 85
7792. Metareasoning Image Optimization Q-Learning
Authors: Mahasa Zahirnia
Abstract: The purpose of this paper is to explore new and effective ways of optimizing satellite images using artificial intelligence, applying reinforcement learning to enhance the quality of the data captured within the image. Using Bellman's reinforcement learning equations, associated state diagrams, and multi-stage image processing, we were able to enhance image quality and to detect and define objects. Reinforcement learning is a differentiator in the area of artificial intelligence, and Q-learning relies on trial and error to achieve its goals. The reward system embedded in Q-learning allows the agent to evaluate its own performance and decide on the best course of action based on the current and future environment. Results show that within a simulated environment built on commercially available images, the detection rate was 40-90%. Reinforcement learning through the Q-learning algorithm is not just a desired but a required design criterion for image optimization and enhancement. The proposed methods are a cost-effective way of resolving uncertainty in the data, because reinforcement learning finds ideal policies to manage the process using a smaller sample of images.
Keywords: Q-learning, image optimization, reinforcement learning, Markov decision process
Procedia: https://publications.waset.org/abstracts/119650/metareasoning-image-optimization-q-learning | PDF: https://publications.waset.org/abstracts/119650.pdf | Downloads: 215
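The Bellman-style, trial-and-error update this abstract leans on has a standard tabular Q-learning form; for reference (generic notation, not taken from the paper):

$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha\left[r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t)\right]$$

Here $\alpha$ is the learning rate, $\gamma$ the discount factor, and $r_{t+1}$ the reward assigned by the embedded reward system; the $\max$ term is what lets the agent weigh both the current and future environment when deciding on a course of action.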
7791. Q-Learning of Bee-Like Robots Through Obstacle Avoidance
Authors: Jawairia Rasheed
Abstract: Modern robots are often used for search and rescue. A key challenge in such missions is learning complex environments, and a key methodology is reinforcement learning, in which robots learn to follow a path to the goal while avoiding obstacles. Q-learning, one of the most notable advances in reinforcement learning, is used to make the robots learn the path. Robots learn by interacting with the environment to reach the goal. In this paper, a simulation model of bee-like robots is implemented in NetLogo. The learning rate was low at the start and increased with the passage of time. The bees successfully learned to reach the goal while avoiding obstacles through the Q-learning technique.
Keywords: reinforcement learning for randomly placed obstacles, learning of bee-like robots for reaching the goal, obstacle avoidance through Q-learning, Q-learning for obstacle avoidance
Procedia: https://publications.waset.org/abstracts/155154/q-learning-of-bee-like-robots-through-obstacle-avoidance | PDF: https://publications.waset.org/abstracts/155154.pdf | Downloads: 101
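A minimal grid-world sketch of the kind of goal-reaching, obstacle-avoiding Q-learning this abstract describes; the grid size, obstacle placement, rewards, and exploration rate are illustrative assumptions, not the paper's NetLogo setup:

```python
import random
import numpy as np

# Illustrative 5x5 grid: agent starts at (0, 0), goal at (4, 4),
# with a few randomly placed obstacles (assumed layout).
SIZE, GOAL = 5, (4, 4)
OBSTACLES = {(1, 1), (2, 3), (3, 1)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

Q = np.zeros((SIZE, SIZE, len(ACTIONS)))
alpha, gamma, epsilon = 0.1, 0.95, 0.2

def step(state, a):
    r, c = state
    dr, dc = ACTIONS[a]
    nxt = (min(max(r + dr, 0), SIZE - 1), min(max(c + dc, 0), SIZE - 1))
    if nxt in OBSTACLES:
        return state, -10.0, False      # collision penalty; stay in place
    if nxt == GOAL:
        return nxt, 10.0, True          # goal reward
    return nxt, -1.0, False             # small step cost encourages short paths

for episode in range(500):
    state, done = (0, 0), False
    while not done:
        a = (random.randrange(len(ACTIONS)) if random.random() < epsilon
             else int(np.argmax(Q[state])))
        nxt, reward, done = step(state, a)
        # Tabular Q-learning update
        Q[state][a] += alpha * (reward + gamma * np.max(Q[nxt]) - Q[state][a])
        state = nxt
```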
7790. Decoding the Structure of Multi-Agent System Communication: A Comparative Analysis of Protocols and Paradigms
Authors: Gulshad Azatova, Aleksandr Kapitonov, Natig Aminov
Abstract: Multiagent systems have gained significant attention in various fields, such as robotics, autonomous vehicles, and distributed computing, where multiple agents cooperate and communicate to achieve complex tasks. Efficient communication among agents is a crucial aspect of these systems, as it directly impacts their overall performance and scalability. This scholarly work provides an exploration of essential communication elements and conducts a comparative assessment of diverse protocols utilized in multiagent systems. The emphasis lies in scrutinizing the strengths, weaknesses, and applicability of these protocols across various scenarios. The research also sheds light on emerging trends within communication protocols for multiagent systems, including the incorporation of machine learning methods and the adoption of blockchain-based solutions to ensure secure communication. These trends provide valuable insights into the evolving landscape of multiagent systems and their communication protocols.
Keywords: communication, multi-agent systems, protocols, consensus
Procedia: https://publications.waset.org/abstracts/178909/decoding-the-structure-of-multi-agent-system-communication-a-comparative-analysis-of-protocols-and-paradigms | PDF: https://publications.waset.org/abstracts/178909.pdf | Downloads: 74
7789. The AI Arena: A Framework for Distributed Multi-Agent Reinforcement Learning
Authors: Edward W. Staley, Corban G. Rivera, Ashley J. Llorens
Abstract: Advances in reinforcement learning (RL) have resulted in recent breakthroughs in the application of artificial intelligence (AI) across many different domains. An emerging landscape of development environments is making powerful RL techniques more accessible for a growing community of researchers. However, most existing frameworks do not directly address the problem of learning in complex operating environments, such as dense urban settings or defense-related scenarios, that incorporate distributed, heterogeneous teams of agents. To help enable AI research for this important class of applications, we introduce the AI Arena: a scalable framework with flexible abstractions for distributed multi-agent reinforcement learning. The AI Arena extends the OpenAI Gym interface to allow greater flexibility in learning control policies across multiple agents with heterogeneous learning strategies and localized views of the environment. To illustrate the utility of our framework, we present experimental results that demonstrate performance gains due to a distributed multi-agent learning approach over commonly used RL techniques in several different learning environments.
Keywords: reinforcement learning, multi-agent, deep learning, artificial intelligence
Procedia: https://publications.waset.org/abstracts/135925/the-ai-arena-a-framework-for-distributed-multi-agent-reinforcement-learning | PDF: https://publications.waset.org/abstracts/135925.pdf | Downloads: 157
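The abstract does not show the AI Arena API itself; the sketch below is a hypothetical Gym-style multi-agent interface of the general shape it describes, where observations, rewards, and done flags are keyed by agent id so each agent can run its own learning strategy on a localized view of the environment (all names and signatures here are assumptions, not the framework's actual API):

```python
from typing import Dict, Tuple
import numpy as np

class MultiAgentEnv:
    """Hypothetical Gym-style interface keyed by agent id: each agent gets a
    localized observation and its own reward, so heterogeneous policies can coexist."""

    def __init__(self, agent_ids):
        self.agent_ids = list(agent_ids)

    def reset(self) -> Dict[str, np.ndarray]:
        return {aid: np.zeros(4) for aid in self.agent_ids}

    def step(self, actions: Dict[str, int]
             ) -> Tuple[Dict[str, np.ndarray], Dict[str, float], Dict[str, bool]]:
        obs = {aid: np.random.randn(4) for aid in self.agent_ids}   # local views
        rewards = {aid: 0.0 for aid in self.agent_ids}              # per-agent rewards
        dones = {aid: False for aid in self.agent_ids}
        return obs, rewards, dones

env = MultiAgentEnv(["scout", "defender"])
obs = env.reset()
actions = {aid: 0 for aid in obs}   # in practice, each agent's own policy acts here
obs, rewards, dones = env.step(actions)
```

Keying every quantity by agent id, rather than stacking agents into one tensor, is what allows each agent a different learner and a different view, which is the flexibility the abstract emphasizes.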
7788. Curriculum-Based Multi-Agent Reinforcement Learning for Robotic Navigation
Authors: Hyeongbok Kim, Lingling Zhao, Xiaohong Su
Abstract: Deep reinforcement learning has been applied to various problems in robotics, such as autonomous driving and unmanned aerial vehicles. However, because the reward for a collision with obstacles during the navigation mission is sparse, the agent either fails to learn the optimal policy or requires a long time to converge. In this paper, we therefore present a curriculum-based boost learning method, using obstacles and enemy agents, to effectively train compound skills during multi-agent reinforcement learning. First, to enable the agents to solve challenging tasks, we gradually increased the learning difficulty by adjusting the reward shaping instead of constructing different learning environments. Then, in a benchmark environment with static obstacles and moving enemy agents, experimental results showed that the proposed curriculum learning strategy enhanced cooperative navigation and compound collision avoidance skills in uncertain environments while improving learning efficiency.
Keywords: curriculum learning, hard exploration, multi-agent reinforcement learning, robotic navigation, sparse reward
Procedia: https://publications.waset.org/abstracts/162478/curriculum-based-multi-agent-reinforcement-learning-for-robotic-navigation | PDF: https://publications.waset.org/abstracts/162478.pdf | Downloads: 92
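A minimal sketch of the mechanism described here, curriculum learning via reward shaping: the environment stays fixed while a difficulty coefficient gradually strengthens the collision penalty. The schedule and coefficients are illustrative assumptions, not the paper's values:

```python
def shaped_reward(base_reward, collided, difficulty):
    """Scale the collision penalty by a curriculum difficulty in [0, 1]."""
    return base_reward + (-5.0 * difficulty if collided else 0.0)

NUM_PHASES, EPISODES_PER_PHASE = 5, 200
for phase in range(NUM_PHASES):
    difficulty = (phase + 1) / NUM_PHASES        # 0.2, 0.4, ..., 1.0
    for episode in range(EPISODES_PER_PHASE):
        # Run one multi-agent episode; every step's reward is re-shaped:
        #   r = shaped_reward(r_env, collided, difficulty)
        pass
```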
7787. Predicting Shot Making in Basketball Learnt from Adversarial Multiagent Trajectories
Authors: Mark Harmon, Abdolghani Ebrahimi, Patrick Lucey, Diego Klabjan
Abstract: In this paper, we predict the likelihood of a player making a shot in basketball from multiagent trajectories. Previous approaches to similar problems center on hand-crafting features to capture domain-specific knowledge. Although intuitive, this approach, as recent work in deep learning has shown, is prone to missing important predictive features. To circumvent this issue, we present a convolutional neural network (CNN) approach in which we initially represent the multiagent behavior as an image. To encode the adversarial nature of basketball, we use a multichannel image, which we then feed into a CNN. Additionally, to capture the temporal aspect of the trajectories, we use "fading." We find that this approach is superior to a traditional feed-forward network (FFN) model. By using gradient ascent, we were able to discover what the CNN filters look for during training. Last, we find that a combined FFN+CNN is the best-performing network, with an error rate of 39%.
Keywords: basketball, computer vision, image processing, convolutional neural network
Procedia: https://publications.waset.org/abstracts/133743/predicting-shot-making-in-basketball-learnt-fromadversarial-multiagent-trajectories | PDF: https://publications.waset.org/abstracts/133743.pdf | Downloads: 153
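A sketch of the "fading" idea the abstract describes: trajectories are rasterized into separate image channels (e.g., offense, defense, ball), with older positions drawn at lower intensity so a CNN can infer direction of motion from a single frame. The grid size and decay rate are assumptions; the paper's exact rendering may differ:

```python
import numpy as np

def render_fading(trajectories, size=32, decay=0.9):
    """trajectories: list of per-channel sequences of (x, y) points in [0, 1],
    ordered oldest to newest. Returns an image of shape (channels, size, size)."""
    img = np.zeros((len(trajectories), size, size))
    for c, traj in enumerate(trajectories):
        intensity = 1.0
        for x, y in reversed(traj):          # newest point drawn brightest
            i, j = int(y * (size - 1)), int(x * (size - 1))
            img[c, i, j] = max(img[c, i, j], intensity)
            intensity *= decay               # fade older positions
    return img

# Channels might hold offense, defense, and ball trajectories separately.
offense = [(0.1, 0.5), (0.2, 0.5), (0.3, 0.55)]
ball = [(0.1, 0.5), (0.25, 0.5), (0.4, 0.6)]
image = render_fading([offense, ball])
```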
7786. Deep Reinforcement Learning with Leonard-Ornstein Processes Based Recommender System
Authors: Khalil Bachiri, Ali Yahyaouy, Nicoleta Rogovschi
Abstract: Improved user experience is a goal of contemporary recommender systems. Recommender systems are starting to incorporate reinforcement learning, since it naturally fits the goal of increasing a user's reward every session. In this paper, we examine the most effective reinforcement learning agent tactics on the MovieLens (1M) dataset, balancing precision and the variety of recommendations. The absence of variability in final predictions makes simplistic techniques, although able to optimize ranking-quality criteria, worthless for consumers of the recommendation system. Utilizing the stochasticity of Ornstein-Uhlenbeck processes, our suggested strategy encourages the agent to explore its surroundings. Our research demonstrates that raising the NDCG (Normalized Discounted Cumulative Gain) and HR (Hit Rate) criteria without lowering the Ornstein-Uhlenbeck process drift coefficient enhances the diversity of suggestions.
Keywords: recommender systems, reinforcement learning, deep learning, DDPG, Leonard-Ornstein process
Procedia: https://publications.waset.org/abstracts/157614/deep-reinforcement-learning-with-leonard-ornstein-processes-based-recommender-system | PDF: https://publications.waset.org/abstracts/157614.pdf | Downloads: 142
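The exploration machinery referenced here, Ornstein-Uhlenbeck noise added to a DDPG policy's actions, has a standard generic implementation; the parameter values below are common defaults, not the paper's:

```python
import numpy as np

class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated exploration noise that
    mean-reverts toward mu with drift rate theta and volatility sigma."""

    def __init__(self, dim, mu=0.0, theta=0.15, sigma=0.2, dt=1e-2):
        self.mu, self.theta, self.sigma, self.dt = mu, theta, sigma, dt
        self.x = np.full(dim, mu)

    def sample(self):
        dx = self.theta * (self.mu - self.x) * self.dt
        dx += self.sigma * np.sqrt(self.dt) * np.random.randn(*self.x.shape)
        self.x = self.x + dx
        return self.x

noise = OUNoise(dim=4)
action = np.zeros(4) + noise.sample()   # exploratory perturbation of a policy action
```

The drift coefficient theta pulls the noise back toward its mean, while sigma sets the exploration magnitude that the abstract links to recommendation diversity.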
7785. Leveraging Deep Q Networks in Portfolio Optimization
Authors: Peng Liu
Abstract: Deep Q networks (DQNs) represent a significant advancement in reinforcement learning, utilizing neural networks to approximate the optimal Q-value for guiding sequential decision processes. This paper presents a comprehensive introduction to reinforcement learning principles, delves into the mechanics of DQNs, and explores their application in portfolio optimization. By evaluating the performance of DQNs against traditional benchmark portfolios, we demonstrate their potential to enhance investment strategies. Our results underscore the advantages of DQNs in dynamically adjusting asset allocations, offering a robust portfolio management framework.
Keywords: deep reinforcement learning, deep Q networks, portfolio optimization, multi-period optimization
Procedia: https://publications.waset.org/abstracts/189031/leveraging-deep-q-networks-in-portfolio-optimization | PDF: https://publications.waset.org/abstracts/189031.pdf | Downloads: 32
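As background on how a DQN "approximates the optimal Q-value": in the standard formulation (generic notation, not the paper's), the online network $\theta$ is regressed toward a temporal-difference target computed with a slowly updated target network $\theta^-$:

$$y_t = r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a'; \theta^-)$$

with $\theta$ trained to minimize $\bigl(y_t - Q(s_t, a_t; \theta)\bigr)^2$ over minibatches drawn from a replay buffer. In a portfolio setting, $s_t$ would plausibly encode the market state and current holdings, and $a_t$ a rebalancing decision, though the paper's exact state and action design is not given in the abstract.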
7784. Reinforcement Learning for Self-Driving Racing Car Games
Authors: Adam Beaunoyer, Cory Beaunoyer, Mohammed Elmorsy, Hanan Saleh
Abstract: This research aims to create a reinforcement learning agent capable of racing in challenging simulated environments with a low collision count. We present a reinforcement learning agent that can navigate challenging tracks using both a Deep Q-Network (DQN) and a Soft Actor-Critic (SAC) method. A challenging track includes curves, jumps, and varying road widths. The environment, built from open-source code available on GitHub, is based on the 1995 racing game WipeOut. The proposed reinforcement learning agent can navigate challenging tracks rapidly while maintaining low completion times and collision counts. The results show that the SAC model outperforms the DQN model by a large margin. We also propose an alternative multiple-car model that can navigate the track without colliding with other vehicles on the track. The SAC model is the basis for the multiple-car model, which completes laps more quickly than the single-car model but collides with the track wall more often.
Keywords: reinforcement learning, soft actor-critic, deep q-network, self-driving cars, artificial intelligence, gaming
Procedia: https://publications.waset.org/abstracts/185804/reinforcement-learning-for-self-driving-racing-car-games | PDF: https://publications.waset.org/abstracts/185804.pdf | Downloads: 46
7783. A Fully Interpretable Deep Reinforcement Learning-Based Motion Control for Legged Robots
Authors: Haodong Huang, Zida Zhao, Shilong Sun, Chiyao Li, Wenfu Xu
Abstract: The control methods for legged robots based on deep reinforcement learning have seen widespread application; however, the inherent black-box nature of neural networks presents challenges in understanding the decision-making motives of the robots. To address this issue, we propose a fully interpretable deep reinforcement learning training method to elucidate the underlying principles of legged robot motion. We incorporate the dynamics of legged robots into the policy, where observations serve as inputs and actions as outputs of the dynamics model. By embedding the dynamics equations within the multi-layer perceptron (MLP) computation process and making the parameters trainable, we enhance interpretability. Additionally, Bayesian optimization is introduced to train these parameters. We validate the proposed fully interpretable motion control algorithm on a legged robot, opening new research avenues for motion control and learning algorithms for legged robots within the deep learning framework.
Keywords: deep reinforcement learning, interpretation, motion control, legged robots
Procedia: https://publications.waset.org/abstracts/189290/a-fully-interpretable-deep-reinforcement-learning-based-motion-control-for-legged-robots | PDF: https://publications.waset.org/abstracts/189290.pdf | Downloads: 21
7782. A Deep Reinforcement Learning-Based Secure Framework against Adversarial Attacks in Power System
Authors: Arshia Aflaki, Hadis Karimipour, Anik Islam
Abstract: Generative adversarial attacks (GAAs) threaten critical sectors, ranging from fingerprint recognition to industrial control systems. Existing deep learning (DL) algorithms are not robust enough against this kind of cyber-attack, and as one of the most critical industries in the world, the power grid is not an exception. In this study, a deep reinforcement learning-based (DRL) framework that assists a DL model in improving its robustness against generative adversarial attacks is proposed. We test our method on real-world smart grid stability data, an IIoT dataset, and improve the classification accuracy of a deep learning model from around 57 percent to 96 percent.
Keywords: generative adversarial attack, deep reinforcement learning, deep learning, IIoT, generative adversarial networks, power system
Procedia: https://publications.waset.org/abstracts/188908/a-deep-reinforcement-learning-based-secure-framework-against-adversarial-attacks-in-power-system | PDF: https://publications.waset.org/abstracts/188908.pdf | Downloads: 36
7781. Targeted Photoactivatable Multiagent Nanoconjugates for Imaging and Photodynamic Therapy
Authors: Shazia Bano
Abstract: Nanoconjugates that integrate photo-based therapeutics and diagnostics within a single platform promise great advances in revolutionizing cancer treatments. However, achieving high therapeutic efficacy is a great challenge: it requires designing functionally efficacious nanocarriers that tightly retain the drug, promoting selective drug localization and release, and validating the efficacy of the resulting nanoconjugates. Here we have designed smart, liposome-based, targeted photoactivatable multiagent nanoconjugates, doped with a photoactivatable chromophore, benzoporphyrin derivative (BPD), and labelled with the active targeting ligand cetuximab to target the EGFR receptor (overexpressed in various cancer cells), to deliver a combination of therapeutic agents. This study establishes a tunable nanoplatform for the delivery of photoactivatable multiagent nanoconjugates for tumor-specific accumulation and targeted destruction of cancer cells in a complex cancer model, enhancing the therapeutic index of the administered drugs.
Keywords: targeting, photodynamic therapy, photoactivatable, nanoconjugates
Procedia: https://publications.waset.org/abstracts/111067/targeted-photoactivatable-multiagent-nanoconjugates-for-imaging-and-photodynamic-therapy | PDF: https://publications.waset.org/abstracts/111067.pdf | Downloads: 142
7780. Deep Reinforcement Learning Model Using Parameterised Quantum Circuits
Authors: Lokes Parvatha Kumaran S., Sakthi Jay Mahenthar C., Sathyaprakash P., Jayakumar V., Shobanadevi A.
Abstract: With the evolution of technology, the need to solve complex computational problems like machine learning and deep learning has grown rapidly, yet even the most powerful classical supercomputers struggle to execute these tasks. With the recent development of quantum computing, researchers and tech giants strive for new quantum circuits for machine learning tasks, as present work on quantum machine learning (QML) promises lower memory consumption and fewer model parameters. But it is difficult to simulate classical deep learning models on existing quantum computing platforms due to the inflexibility of deep quantum circuits. As a consequence, it is essential to design viable quantum algorithms for QML on noisy intermediate-scale quantum (NISQ) devices. The proposed work explores variational quantum circuits (VQCs) for deep reinforcement learning by remodeling the experience replay and target network into a representation of VQCs. In addition, to reduce the number of model parameters, quantum information encoding schemes are used, achieving better results than classical neural networks. VQCs are employed to approximate the deep Q-value function for decision-making and policy-selection reinforcement learning with experience replay and the target network.
Keywords: quantum computing, quantum machine learning, variational quantum circuit, deep reinforcement learning, quantum information encoding scheme
Procedia: https://publications.waset.org/abstracts/152629/deep-reinforcement-learning-model-using-parameterised-quantum-circuits | PDF: https://publications.waset.org/abstracts/152629.pdf | Downloads: 133
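Whatever approximator sits underneath (a VQC here, a neural network classically), the experience replay the abstract mentions is conventional machinery; a minimal buffer sketch, with illustrative capacity and batch size:

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size buffer of (state, action, reward, next_state, done) tuples;
    uniform sampling breaks the temporal correlation in the agent's experience."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size=32):
        return random.sample(self.buffer, batch_size)

buf = ReplayBuffer()
for t in range(100):
    buf.push((t, 0, 0.0, t + 1, False))   # placeholder transitions
batch = buf.sample(32)                    # minibatch for a Q-function update
```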
7779. Distributed System Computing Resource Scheduling Algorithm Based on Deep Reinforcement Learning
Authors: Yitao Lei, Xingxiang Zhai, Burra Venkata Durga Kumar
Abstract: As the quantity and complexity of computing in large-scale software systems increase, distributed system computing becomes increasingly important. A distributed system achieves high-performance computing through collaboration between different computing resources. Without efficient resource scheduling, distributed computing can waste resources and incur high costs. Resource scheduling is usually an NP-hard problem, so no general solution exists; optimization algorithms such as genetic algorithms and ant colony optimization can be applied, but the large scale of distributed systems makes these traditional optimization algorithms challenging to use, so heuristic and machine learning algorithms are usually applied instead to ease the computing load. We therefore review traditional resource scheduling optimization algorithms and introduce a deep reinforcement learning method that combines the perceptual ability of neural networks with the decision-making ability of reinforcement learning. Using machine learning, we try to find the important factors that influence the performance of distributed system computing and help the distributed system schedule computing resources efficiently. This paper surveys the application of deep reinforcement learning to distributed system computing resource scheduling, proposes a deep reinforcement learning method that uses a recurrent neural network to optimize the resource scheduling, and lays out challenges and improvement directions for DRL-based resource scheduling algorithms.
Keywords: resource scheduling, deep reinforcement learning, distributed system, artificial intelligence
Procedia: https://publications.waset.org/abstracts/152538/distributed-system-computing-resource-scheduling-algorithm-based-on-deep-reinforcement-learning | PDF: https://publications.waset.org/abstracts/152538.pdf | Downloads: 111
7778. Reinforcement Learning for Classification of Low-Resolution Satellite Images
Authors: Khadija Bouzaachane, El Mahdi El Guarmah
Abstract: The classification of low-resolution satellite images has been a worthwhile and fertile field that attracts plenty of researchers due to its importance in monitoring geographical areas. It can serve several purposes, such as disaster management, military surveillance, and agricultural monitoring. The main objective of this work is to classify low-resolution satellite images efficiently and accurately using novel techniques from deep learning and reinforcement learning. The images include roads, residential areas, industrial areas, rivers, sea lakes, and vegetation. To achieve this goal, we carried out experiments on Sentinel-2 images, targeting both high accuracy and efficient classification. Our proposed model achieved 91% accuracy on the testing dataset along with good land-cover classification. In terms of per-class precision, we obtained 93% for river, 92% for residential, 97% for industrial, 96% for forest, 87% for annual crop, 84% for herbaceous vegetation, 85% for pasture, 78% for highway, and 100% for sea lake.
Keywords: classification, deep learning, reinforcement learning, satellite imagery
Procedia: https://publications.waset.org/abstracts/141097/reinforcement-learning-for-classification-of-low-resolution-satellite-images | PDF: https://publications.waset.org/abstracts/141097.pdf | Downloads: 213
7777. A Comparative Study of Mechanisms across Different Online Social Learning Types
Authors: Xinyu Wang
Abstract: In the context of the rapid development of Internet technology and the increasing prevalence of online social media, this study investigates the impact of digital communication on social learning. Through three behavioral experiments, we explore both affective and cognitive social learning in online environments. Experiment 1 manipulates the content of the experimental materials and two forms of feedback, emotional valence, sociability, and repetition, to verify whether individuals can achieve online affective social learning through reinforcement using two social learning strategies. Results reveal that both social learning strategies can assist individuals in affective social learning through reinforcement, with feedback-based learning strategies outperforming frequency-dependent strategies. Experiment 2 similarly manipulates the content of the experimental materials and two forms of feedback to verify whether individuals can achieve online knowledge-based social learning through reinforcement using the two social learning strategies. Results show that, as with online affective social learning, individuals adopt both social learning strategies to achieve cognitive social learning through reinforcement, with feedback-based strategies again outperforming frequency-dependent strategies. Experiment 3 observes online affective and cognitive social learning simultaneously by manipulating the content of the experimental materials and feedback at different levels of social pressure. Results indicate that online affective social learning exhibits different learning effects under different levels of social pressure, whereas online cognitive social learning remains unaffected by social pressure and shows more stable learning effects. Additionally, to explore the sustained effects of online social learning and differences in duration among the types, all three experiments incorporate two test time points. Results reveal significant differences between pre- and post-test scores for online social learning in Experiments 2 and 3, whereas differences are less apparent in Experiment 1. To measure the sustained effects accurately, we conducted a mini meta-analysis of all effect sizes of online social learning duration. Results indicate that although the overall effect size is small, the effect of online social learning weakens over time.
Keywords: online social learning, affective social learning, cognitive social learning, social learning strategies, social reinforcement, social pressure, duration
Procedia: https://publications.waset.org/abstracts/186019/a-comparative-study-of-mechanisms-across-different-online-social-learning-types | PDF: https://publications.waset.org/abstracts/186019.pdf | Downloads: 46
7776. Sampling Effects on Secondary Voltage Control of Microgrids Based on Network of Multiagent
Authors: M. J. Park, S. H. Lee, C. H. Lee, O. M. Kwon
Abstract: This paper studies a secondary voltage control framework for microgrids based on consensus over a multiagent communication network. The proposed control is designed for a communication network with one-way links, modeled as a directed graph. Sampling is considered as the communication constraint among the distributed generators in the microgrid. To analyze the sampling effects on secondary voltage control, Lyapunov theory and some mathematical techniques are used to establish a sufficient condition for the problem in terms of a linear matrix inequality (LMI). Finally, simulation results illustrate the necessity of accounting for sampling effects in the secondary voltage control of microgrids.
Keywords: microgrids, secondary control, multiagent, sampling, LMI
Procedia: https://publications.waset.org/abstracts/51477/sampling-effects-on-secondary-voltage-control-of-microgrids-based-on-network-of-multiagent | PDF: https://publications.waset.org/abstracts/51477.pdf | Downloads: 333
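For context, the consensus protocols this line of work builds on typically take the following textbook form over a directed graph (a generic formulation, not the paper's specific secondary voltage controller): each agent $i$ updates its state from the neighbors $\mathcal{N}_i$ it can hear over one-way links,

$$\dot{x}_i(t) = \sum_{j \in \mathcal{N}_i} a_{ij}\bigl(x_j(t_k) - x_i(t_k)\bigr), \qquad t \in [t_k, t_{k+1})$$

where $a_{ij}$ are the edge weights of the directed graph. Using the sampled states $x(t_k)$ rather than the continuous states $x(t)$ is precisely the sampling constraint whose effect on stability the paper's LMI condition is built to capture.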
7775. Machine Learning Approach for Mutation Testing
Authors: Michael Stewart
Abstract: Mutation testing is a type of software testing proposed in the 1970s in which program statements are deliberately changed to introduce simple errors, so that test cases can be validated by checking whether they detect those errors. Test cases are executed against the mutant code to determine whether one of them fails, detecting the error and confirming that the test suite can catch it. One major issue with this type of testing is that generating and testing all possible mutations of a complex program is computationally intensive. This paper used reinforcement learning and parallel processing, within the context of mutation testing, for the selection of mutation operators and test cases; this reduced the computational cost of testing and improved test suite effectiveness. Experiments were conducted using sample programs to determine how well the reinforcement learning-based algorithm performed with one live mutation, multiple live mutations, and no live mutations. The experiments, measured by mutation score, were used to update the algorithm and improve prediction accuracy. Performance was then evaluated on multiple-processor computers. With reinforcement learning, the mutation operators utilized were reduced by 50-100%.
Keywords: automated-testing, machine learning, mutation testing, parallel processing, reinforcement learning, software engineering, software testing
Procedia: https://publications.waset.org/abstracts/141195/machine-learning-approach-for-mutation-testing | PDF: https://publications.waset.org/abstracts/141195.pdf | Downloads: 198
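One simple way to cast operator selection as reinforcement learning, consistent with the abstract's description though not necessarily the paper's exact algorithm, is a bandit-style agent that learns a value per mutation operator from whether the test suite detects the mutants it generates (operator names and the reward scheme below are illustrative assumptions):

```python
import random

operators = ["arith_replace", "negate_cond", "delete_stmt", "swap_operands"]
value = {op: 0.0 for op in operators}    # running value estimate per operator
counts = {op: 0 for op in operators}
epsilon = 0.1

def run_mutation(op):
    """Placeholder: mutate the program with `op`, run the test suite,
    and return True if a test failed (mutant detected)."""
    return random.random() < 0.7

for trial in range(1000):
    if random.random() < epsilon:
        op = random.choice(operators)            # explore
    else:
        op = max(operators, key=value.get)       # exploit best-valued operator
    reward = float(run_mutation(op))
    counts[op] += 1
    value[op] += (reward - value[op]) / counts[op]   # incremental mean update
```

Running only the operators with high learned value is one way the reported 50-100% reduction in operators could be realized without exhaustively generating every mutation.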
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=automated-testing" title="automated-testing">automated-testing</a>, <a href="https://publications.waset.org/abstracts/search?q=machine%20learning" title=" machine learning"> machine learning</a>, <a href="https://publications.waset.org/abstracts/search?q=mutation%20testing" title=" mutation testing"> mutation testing</a>, <a href="https://publications.waset.org/abstracts/search?q=parallel%20processing" title=" parallel processing"> parallel processing</a>, <a href="https://publications.waset.org/abstracts/search?q=reinforcement%20learning" title=" reinforcement learning"> reinforcement learning</a>, <a href="https://publications.waset.org/abstracts/search?q=software%20engineering" title=" software engineering"> software engineering</a>, <a href="https://publications.waset.org/abstracts/search?q=software%20testing" title=" software testing"> software testing</a> </p> <a href="https://publications.waset.org/abstracts/141195/machine-learning-approach-for-mutation-testing" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/141195.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">198</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7774</span> Personalized Email Marketing Strategy: A Reinforcement Learning Approach</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Lei%20Zhang">Lei Zhang</a>, <a href="https://publications.waset.org/abstracts/search?q=Tingting%20Xu"> Tingting Xu</a>, <a href="https://publications.waset.org/abstracts/search?q=Jun%20He"> Jun He</a>, <a href="https://publications.waset.org/abstracts/search?q=Zhenyu%20Yan"> Zhenyu Yan</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Email marketing is one of the most important segments of online marketing. It has been proved to be the most effective way to acquire and retain customers. The email content is vital to customers. Different customers may have different familiarity with a product, so a successful marketing strategy must personalize email content based on individual customers’ product affinity. In this study, we build our personalized email marketing strategy with three types of emails: nurture, promotion, and conversion. Each type of email has a different influence on customers. We investigate this difference by analyzing customers’ open rates, click rates and opt-out rates. Feature importance from response models is also analyzed. The goal of the marketing strategy is to improve the click rate on conversion-type emails. To build the personalized strategy, we formulate the problem as a reinforcement learning problem and adopt a Q-learning algorithm with variations. The simulation results show that our model-based strategy outperforms the current marketer’s strategy. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=email%20marketing" title="email marketing">email marketing</a>, <a href="https://publications.waset.org/abstracts/search?q=email%20content" title=" email content"> email content</a>, <a href="https://publications.waset.org/abstracts/search?q=reinforcement%20learning" title=" reinforcement learning"> reinforcement learning</a>, <a href="https://publications.waset.org/abstracts/search?q=machine%20learning" title=" machine learning"> machine learning</a>, <a href="https://publications.waset.org/abstracts/search?q=Q-learning" title=" Q-learning"> Q-learning</a> </p> <a href="https://publications.waset.org/abstracts/152253/personalized-email-marketing-strategy-a-reinforcement-learning-approach" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/152253.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">194</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7773</span> Deep Reinforcement Learning and Generative Adversarial Networks Approach to Thwart Intrusions and Adversarial Attacks</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Fabrice%20Setephin%20Atedjio">Fabrice Setephin Atedjio</a>, <a href="https://publications.waset.org/abstracts/search?q=Jean-Pierre%20Lienou"> Jean-Pierre Lienou</a>, <a href="https://publications.waset.org/abstracts/search?q=Frederica%20F.%20Nelson"> Frederica F. Nelson</a>, <a href="https://publications.waset.org/abstracts/search?q=Sachin%20S.%20Shetty"> Sachin S. Shetty</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Malicious users exploit vulnerabilities in computer systems, significantly disrupting their performance and revealing the inadequacies of existing protective solutions. Even machine learning-based approaches, designed to ensure reliability, can be compromised by adversarial attacks that undermine their robustness. This paper addresses two critical aspects of enhancing model reliability. First, we focus on improving model performance and robustness against adversarial threats. To achieve this, we propose a strategy by harnessing deep reinforcement learning. Second, we introduce an approach leveraging generative adversarial networks to counter adversarial attacks effectively. Our results demonstrate substantial improvements over previous works in the literature, with classifiers exhibiting enhanced accuracy in classification tasks, even in the presence of adversarial perturbations. These findings underscore the efficacy of the proposed model in mitigating intrusions and adversarial attacks within the machine learning landscape. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=machine%20learning" title="machine learning">machine learning</a>, <a href="https://publications.waset.org/abstracts/search?q=reliability" title=" reliability"> reliability</a>, <a href="https://publications.waset.org/abstracts/search?q=adversarial%20attacks" title=" adversarial attacks"> adversarial attacks</a>, <a href="https://publications.waset.org/abstracts/search?q=deep-reinforcement%20learning" title=" deep-reinforcement learning"> deep-reinforcement learning</a>, <a href="https://publications.waset.org/abstracts/search?q=robustness" title=" robustness"> robustness</a> </p> <a href="https://publications.waset.org/abstracts/194008/deep-reinforcement-learning-and-generative-adversarial-networks-approach-to-thwart-intrusions-and-adversarial-attacks" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/194008.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">9</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7772</span> Efficient Subgoal Discovery for Hierarchical Reinforcement Learning Using Local Computations</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Adrian%20Millea">Adrian Millea</a> </p> <p class="card-text"><strong>Abstract:</strong></p> In hierarchical reinforcement learning, one of the main issues encountered is the discovery of subgoal states or options (which are policies reaching subgoal states) by partitioning the environment in a meaningful way. This partitioning usually requires an expensive global clustering operation or eigendecomposition of the Laplacian of the states graph. We propose a local solution to this issue, much more efficient than algorithms using global information, which successfully discovers subgoal states by computing a simple function, which we call heterogeneity for each state as a function of its neighbors. Moreover, we construct a value function using the difference in heterogeneity from one step to the next, as reward, such that we are able to explore the state space much more efficiently than say epsilon-greedy. The same principle can then be applied to higher level of the hierarchy, where now states are subgoals discovered at the level below. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=exploration" title="exploration">exploration</a>, <a href="https://publications.waset.org/abstracts/search?q=hierarchical%20reinforcement%20learning" title=" hierarchical reinforcement learning"> hierarchical reinforcement learning</a>, <a href="https://publications.waset.org/abstracts/search?q=locality" title=" locality"> locality</a>, <a href="https://publications.waset.org/abstracts/search?q=options" title=" options"> options</a>, <a href="https://publications.waset.org/abstracts/search?q=value%20functions" title=" value functions"> value functions</a> </p> <a href="https://publications.waset.org/abstracts/134077/efficient-subgoal-discovery-for-hierarchical-reinforcement-learning-using-local-computations" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/134077.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">171</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7771</span> Using Q-Learning to Auto-Tune PID Controller Gains for Online Quadcopter Altitude Stabilization</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Y.%20Alrubyli">Y. Alrubyli</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Unmanned Arial Vehicles (UAVs), and more specifically, quadcopters need to be stable during their flights. Altitude stability is usually achieved by using a PID controller that is built into the flight controller software. Furthermore, the PID controller has gains that need to be tuned to reach optimal altitude stabilization during the quadcopter’s flight. For that, control system engineers need to tune those gains by using extensive modeling of the environment, which might change from one environment and condition to another. As quadcopters penetrate more sectors, from the military to the consumer sectors, they have been put into complex and challenging environments more than ever before. Hence, intelligent self-stabilizing quadcopters are needed to maneuver through those complex environments and situations. Here we show that by using online reinforcement learning with minimal background knowledge, the altitude stability of the quadcopter can be achieved using a model-free approach. We found that by using background knowledge instead of letting the online reinforcement learning algorithm wander for a while to tune the PID gains, altitude stabilization can be achieved faster. In addition, using this approach will accelerate development by avoiding extensive simulations before applying the PID gains to the real-world quadcopter. Our results demonstrate the possibility of using the trial and error approach of reinforcement learning combined with background knowledge to achieve faster quadcopter altitude stabilization in different environments and conditions. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=reinforcement%20learning" title="reinforcement learning">reinforcement learning</a>, <a href="https://publications.waset.org/abstracts/search?q=Q-leanring" title=" Q-leanring"> Q-leanring</a>, <a href="https://publications.waset.org/abstracts/search?q=online%20learning" title=" online learning"> online learning</a>, <a href="https://publications.waset.org/abstracts/search?q=PID%20tuning" title=" PID tuning"> PID tuning</a>, <a href="https://publications.waset.org/abstracts/search?q=unmanned%20aerial%20vehicle" title=" unmanned aerial vehicle"> unmanned aerial vehicle</a>, <a href="https://publications.waset.org/abstracts/search?q=quadcopter" title=" quadcopter"> quadcopter</a> </p> <a href="https://publications.waset.org/abstracts/149493/using-q-learning-to-auto-tune-pid-controller-gains-for-online-quadcopter-altitude-stabilization" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/149493.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">173</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7770</span> Umbrella Reinforcement Learning – A Tool for Hard Problems</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Egor%20E.%20Nuzhin">Egor E. Nuzhin</a>, <a href="https://publications.waset.org/abstracts/search?q=Nikolay%20V.%20Brilliantov">Nikolay V. Brilliantov</a> </p> <p class="card-text"><strong>Abstract:</strong></p> We propose an approach for addressing Reinforcement Learning (RL) problems. It combines the ideas of umbrella sampling, borrowed from Monte Carlo technique of computational physics and chemistry, with optimal control methods, and is realized on the base of neural networks. This results in a powerful algorithm, designed to solve hard RL problems – the problems, with long-time delayed reward, state-traps sticking and a lack of terminal states. It outperforms the prominent algorithms, such as PPO, RND, iLQR and VI, which are among the most efficient for the hard problems. The new algorithm deals with a continuous ensemble of agents and expected return, that includes the ensemble entropy. This results in a quick and efficient search of the optimal policy in terms of ”exploration-exploitation trade-off” in the state-action space. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=umbrella%20sampling" title="umbrella sampling">umbrella sampling</a>, <a href="https://publications.waset.org/abstracts/search?q=reinforcement%20learning" title=" reinforcement learning"> reinforcement learning</a>, <a href="https://publications.waset.org/abstracts/search?q=policy%20gradient" title=" policy gradient"> policy gradient</a>, <a href="https://publications.waset.org/abstracts/search?q=dynamic%20programming" title=" dynamic programming"> dynamic programming</a> </p> <a href="https://publications.waset.org/abstracts/192151/umbrella-reinforcement-learning-a-tool-for-hard-problems" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/192151.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">21</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7769</span> Effectiveness of Reinforcement Learning (RL) for Autonomous Energy Management Solutions</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Tesfaye%20Mengistu">Tesfaye Mengistu</a> </p> <p class="card-text"><strong>Abstract:</strong></p> This thesis aims to investigate the effectiveness of Reinforcement Learning (RL) for Autonomous Energy Management solutions. The study explores the potential of Model Free RL approaches, such as Monte Carlo RL and Q-learning, to improve energy management by autonomously adjusting energy management strategies to maximize efficiency. The research investigates the implementation of RL algorithms for optimizing energy consumption in a single-agent environment. The focus is on developing a framework for the implementation of RL algorithms, highlighting the importance of RL for enabling autonomous systems to adapt quickly to changing conditions and make decisions based on previous experiences. Moreover, the paper proposes RL as a novel energy management solution to address nations' CO2 emission goals. Reinforcement learning algorithms are well-suited to solving problems with sequential decision-making patterns and can provide accurate and immediate outputs to ease the planning and decision-making process. This research provides insights into the challenges and opportunities of using RL for energy management solutions and recommends further studies to explore its full potential. In conclusion, this study provides valuable insights into how RL can be used to improve the efficiency of energy management systems and supports the use of RL as a promising approach for developing autonomous energy management solutions in residential buildings. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=artificial%20intelligence" title="artificial intelligence">artificial intelligence</a>, <a href="https://publications.waset.org/abstracts/search?q=reinforcement%20learning" title=" reinforcement learning"> reinforcement learning</a>, <a href="https://publications.waset.org/abstracts/search?q=monte%20carlo" title=" monte carlo"> monte carlo</a>, <a href="https://publications.waset.org/abstracts/search?q=energy%20management" title=" energy management"> energy management</a>, <a href="https://publications.waset.org/abstracts/search?q=CO2%20emission" title=" CO2 emission"> CO2 emission</a> </p> <a href="https://publications.waset.org/abstracts/167464/effectiveness-of-reinforcement-learning-rl-for-autonomous-energy-management-solutions" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/167464.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">83</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7768</span> Research on Knowledge Graph Inference Technology Based on Proximal Policy Optimization</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Yihao%20Kuang">Yihao Kuang</a>, <a href="https://publications.waset.org/abstracts/search?q=Bowen%20Ding"> Bowen Ding</a> </p> <p class="card-text"><strong>Abstract:</strong></p> With the increasing scale and complexity of knowledge graph, modern knowledge graph contains more and more types of entity, relationship, and attribute information. Therefore, in recent years, it has been a trend for knowledge graph inference to use reinforcement learning to deal with large-scale, incomplete, and noisy knowledge graph and improve the inference effect and interpretability. The Proximal Policy Optimization (PPO) algorithm utilizes a near-end strategy optimization approach. This allows for more extensive updates of policy parameters while constraining the update extent to maintain training stability. This characteristic enables PPOs to converge to improve strategies more rapidly, often demonstrating enhanced performance early in the training process. Furthermore, PPO has the advantage of offline learning, effectively utilizing historical experience data for training and enhancing sample utilization. This means that even with limited resources, PPOs can efficiently train for reinforcement learning tasks. Based on these characteristics, this paper aims to obtain better and more efficient inference effect by introducing PPO into knowledge inference technology. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=reinforcement%20learning" title="reinforcement learning">reinforcement learning</a>, <a href="https://publications.waset.org/abstracts/search?q=PPO" title=" PPO"> PPO</a>, <a href="https://publications.waset.org/abstracts/search?q=knowledge%20inference" title=" knowledge inference"> knowledge inference</a>, <a href="https://publications.waset.org/abstracts/search?q=supervised%20learning" title=" supervised learning"> supervised learning</a> </p> <a href="https://publications.waset.org/abstracts/173972/research-on-knowledge-graph-inference-technology-based-on-proximal-policy-optimization" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/173972.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">67</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7767</span> Analysis of Q-Learning on Artificial Neural Networks for Robot Control Using Live Video Feed</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Nihal%20Murali">Nihal Murali</a>, <a href="https://publications.waset.org/abstracts/search?q=Kunal%20Gupta"> Kunal Gupta</a>, <a href="https://publications.waset.org/abstracts/search?q=Surekha%20Bhanot"> Surekha Bhanot</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Training of artificial neural networks (ANNs) using reinforcement learning (RL) techniques is being widely discussed in the robot learning literature. The high model complexity of ANNs along with the model-free nature of RL algorithms provides a desirable combination for many robotics applications. There is a huge need for algorithms that generalize using raw sensory inputs, such as vision, without any hand-engineered features or domain heuristics. In this paper, the standard control problem of line following robot was used as a test-bed, and an ANN controller for the robot was trained on images from a live video feed using Q-learning. A virtual agent was first trained in simulation environment and then deployed onto a robot’s hardware. The robot successfully learns to traverse a wide range of curves and displays excellent generalization ability. Qualitative analysis of the evolution of policies, performance and weights of the network provide insights into the nature and convergence of the learning algorithm. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=artificial%20neural%20networks" title="artificial neural networks">artificial neural networks</a>, <a href="https://publications.waset.org/abstracts/search?q=q-learning" title=" q-learning"> q-learning</a>, <a href="https://publications.waset.org/abstracts/search?q=reinforcement%20learning" title=" reinforcement learning"> reinforcement learning</a>, <a href="https://publications.waset.org/abstracts/search?q=robot%20learning" title=" robot learning"> robot learning</a> </p> <a href="https://publications.waset.org/abstracts/70136/analysis-of-q-learning-on-artificial-neural-networks-for-robot-control-using-live-video-feed" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/70136.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">372</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7766</span> Gaits Stability Analysis for a Pneumatic Quadruped Robot Using Reinforcement Learning</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Soofiyan%20Atar">Soofiyan Atar</a>, <a href="https://publications.waset.org/abstracts/search?q=Adil%20Shaikh"> Adil Shaikh</a>, <a href="https://publications.waset.org/abstracts/search?q=Sahil%20Rajpurkar"> Sahil Rajpurkar</a>, <a href="https://publications.waset.org/abstracts/search?q=Pragnesh%20Bhalala"> Pragnesh Bhalala</a>, <a href="https://publications.waset.org/abstracts/search?q=Aniket%20Desai"> Aniket Desai</a>, <a href="https://publications.waset.org/abstracts/search?q=Irfan%20Siddavatam"> Irfan Siddavatam</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Deep reinforcement learning (deep RL) algorithms leverage the symbolic power of complex controllers by automating it by mapping sensory inputs to low-level actions. Deep RL eliminates the complex robot dynamics with minimal engineering. Deep RL provides high-risk involvement by directly implementing it in real-world scenarios and also high sensitivity towards hyperparameters. Tuning of hyperparameters on a pneumatic quadruped robot becomes very expensive through trial-and-error learning. This paper presents an automated learning control for a pneumatic quadruped robot using sample efficient deep Q learning, enabling minimal tuning and very few trials to learn the neural network. Long training hours may degrade the pneumatic cylinder due to jerk actions originated through stochastic weights. We applied this method to the pneumatic quadruped robot, which resulted in a hopping gait. In our process, we eliminated the use of a simulator and acquired a stable gait. This approach evolves so that the resultant gait matures more sturdy towards any stochastic changes in the environment. We further show that our algorithm performed very well as compared to programmed gait using robot dynamics. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=model-based%20reinforcement%20learning" title="model-based reinforcement learning">model-based reinforcement learning</a>, <a href="https://publications.waset.org/abstracts/search?q=gait%20stability" title=" gait stability"> gait stability</a>, <a href="https://publications.waset.org/abstracts/search?q=supervised%20learning" title=" supervised learning"> supervised learning</a>, <a href="https://publications.waset.org/abstracts/search?q=pneumatic%20quadruped" title=" pneumatic quadruped"> pneumatic quadruped</a> </p> <a href="https://publications.waset.org/abstracts/140524/gaits-stability-analysis-for-a-pneumatic-quadruped-robot-using-reinforcement-learning" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/140524.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">316</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7765</span> Cryptographic Resource Allocation Algorithm Based on Deep Reinforcement Learning</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Xu%20Jie">Xu Jie</a> </p> <p class="card-text"><strong>Abstract:</strong></p> As a key network security method, cryptographic services must fully cope with problems such as the wide variety of cryptographic algorithms, high concurrency requirements, random job crossovers, and instantaneous surges in workloads. Its complexity and dynamics also make it difficult for traditional static security policies to cope with the ever-changing situation. Cyber Threats and Environment. Traditional resource scheduling algorithms are inadequate when facing complex decision-making problems in dynamic environments. A network cryptographic resource allocation algorithm based on reinforcement learning is proposed, aiming to optimize task energy consumption, migration cost, and fitness of differentiated services (including user, data, and task security) by modeling the multi-job collaborative cryptographic service scheduling problem as a multi-objective optimized job flow scheduling problem and using a multi-agent reinforcement learning method, efficient scheduling and optimal configuration of cryptographic service resources are achieved. By introducing reinforcement learning, resource allocation strategies can be adjusted in real-time in a dynamic environment, improving resource utilization and achieving load balancing. Experimental results show that this algorithm has significant advantages in path planning length, system delay and network load balancing and effectively solves the problem of complex resource scheduling in cryptographic services. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=cloud%20computing" title="cloud computing">cloud computing</a>, <a href="https://publications.waset.org/abstracts/search?q=cryptography%20on-demand%20service" title=" cryptography on-demand service"> cryptography on-demand service</a>, <a href="https://publications.waset.org/abstracts/search?q=reinforcement%20learning" title=" reinforcement learning"> reinforcement learning</a>, <a href="https://publications.waset.org/abstracts/search?q=workflow%20scheduling" title=" workflow scheduling"> workflow scheduling</a> </p> <a href="https://publications.waset.org/abstracts/193329/cryptographic-resource-allocation-algorithm-based-on-deep-reinforcement-learning" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/193329.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">12</span> </span> </div> </div> <div class="card paper-listing mb-3 mt-3"> <h5 class="card-header" style="font-size:.9rem"><span class="badge badge-info">7764</span> Adhesion Performance According to Lateral Reinforcement Method of Textile</h5> <div class="card-body"> <p class="card-text"><strong>Authors:</strong> <a href="https://publications.waset.org/abstracts/search?q=Jungbhin%20You">Jungbhin You</a>, <a href="https://publications.waset.org/abstracts/search?q=Taekyun%20Kim"> Taekyun Kim</a>, <a href="https://publications.waset.org/abstracts/search?q=Jongho%20Park"> Jongho Park</a>, <a href="https://publications.waset.org/abstracts/search?q=Sungnam%20Hong"> Sungnam Hong</a>, <a href="https://publications.waset.org/abstracts/search?q=Sun-Kyu%20Park"> Sun-Kyu Park</a> </p> <p class="card-text"><strong>Abstract:</strong></p> Reinforced concrete has been mainly used in construction field because of excellent durability. However, it may lead to reduction of durability and safety due to corrosion of reinforcement steels according to damage of concrete surface. Recently, research of textile is ongoing to complement weakness of reinforced concrete. In previous research, only experiment of longitudinal length were performed. Therefore, in order to investigate the adhesion performance according to the lattice shape and the embedded length, the pull-out test was performed on the roving with parameter of the number of lateral reinforcement, the lateral reinforcement length and the lateral reinforcement spacing. As a result, the number of lateral reinforcement and the lateral reinforcement length did not significantly affect the load variation depending on the adhesion performance, and only the load analysis results according to the reinforcement spacing are affected. 
<p class="card-text"><strong>Keywords:</strong> <a href="https://publications.waset.org/abstracts/search?q=adhesion%20performance" title="adhesion performance">adhesion performance</a>, <a href="https://publications.waset.org/abstracts/search?q=lateral%20reinforcement" title=" lateral reinforcement"> lateral reinforcement</a>, <a href="https://publications.waset.org/abstracts/search?q=pull-out%20test" title=" pull-out test"> pull-out test</a>, <a href="https://publications.waset.org/abstracts/search?q=textile" title=" textile"> textile</a> </p> <a href="https://publications.waset.org/abstracts/67487/adhesion-performance-according-to-lateral-reinforcement-method-of-textile" class="btn btn-primary btn-sm">Procedia</a> <a href="https://publications.waset.org/abstracts/67487.pdf" target="_blank" class="btn btn-primary btn-sm">PDF</a> <span class="bg-info text-light px-1 py-1 float-right rounded"> Downloads <span class="badge badge-light">358</span> </span> </div> </div> <ul class="pagination"> <li class="page-item disabled"><span class="page-link">‹</span></li> <li class="page-item active"><span class="page-link">1</span></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=multi-agent%20reinforcement%20learning&page=2">2</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=multi-agent%20reinforcement%20learning&page=3">3</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=multi-agent%20reinforcement%20learning&page=4">4</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=multi-agent%20reinforcement%20learning&page=5">5</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=multi-agent%20reinforcement%20learning&page=6">6</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=multi-agent%20reinforcement%20learning&page=7">7</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=multi-agent%20reinforcement%20learning&page=8">8</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=multi-agent%20reinforcement%20learning&page=9">9</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=multi-agent%20reinforcement%20learning&page=10">10</a></li> <li class="page-item disabled"><span class="page-link">...</span></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=multi-agent%20reinforcement%20learning&page=259">259</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=multi-agent%20reinforcement%20learning&page=260">260</a></li> <li class="page-item"><a class="page-link" href="https://publications.waset.org/abstracts/search?q=multi-agent%20reinforcement%20learning&page=2" rel="next">›</a></li> </ul> </div> </main> <footer> <div id="infolinks" class="pt-3 pb-2"> <div class="container"> <div style="background-color:#f5f5f5;" class="p-3"> <div class="row"> <div class="col-md-2"> <ul class="list-unstyled"> About <li><a href="https://waset.org/page/support">About Us</a></li> <li><a href="https://waset.org/page/support#legal-information">Legal</a></li> <li><a target="_blank" rel="nofollow" 
href="https://publications.waset.org/static/files/WASET-16th-foundational-anniversary.pdf">WASET celebrates its 16th foundational anniversary</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Account <li><a href="https://waset.org/profile">My Account</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Explore <li><a href="https://waset.org/disciplines">Disciplines</a></li> <li><a href="https://waset.org/conferences">Conferences</a></li> <li><a href="https://waset.org/conference-programs">Conference Program</a></li> <li><a href="https://waset.org/committees">Committees</a></li> <li><a href="https://publications.waset.org">Publications</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Research <li><a href="https://publications.waset.org/abstracts">Abstracts</a></li> <li><a href="https://publications.waset.org">Periodicals</a></li> <li><a href="https://publications.waset.org/archive">Archive</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Open Science <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Science-Philosophy.pdf">Open Science Philosophy</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Science-Award.pdf">Open Science Award</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Open-Society-Open-Science-and-Open-Innovation.pdf">Open Innovation</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Postdoctoral-Fellowship-Award.pdf">Postdoctoral Fellowship Award</a></li> <li><a target="_blank" rel="nofollow" href="https://publications.waset.org/static/files/Scholarly-Research-Review.pdf">Scholarly Research Review</a></li> </ul> </div> <div class="col-md-2"> <ul class="list-unstyled"> Support <li><a href="https://waset.org/page/support">Support</a></li> <li><a href="https://waset.org/profile/messages/create">Contact Us</a></li> <li><a href="https://waset.org/profile/messages/create">Report Abuse</a></li> </ul> </div> </div> </div> </div> </div> <div class="container text-center"> <hr style="margin-top:0;margin-bottom:.3rem;"> <a href="https://creativecommons.org/licenses/by/4.0/" target="_blank" class="text-muted small">Creative Commons Attribution 4.0 International License</a> <div id="copy" class="mt-2">© 2024 World Academy of Science, Engineering and Technology</div> </div> </footer> <a href="javascript:" id="return-to-top"><i class="fas fa-arrow-up"></i></a> <div class="modal" id="modal-template"> <div class="modal-dialog"> <div class="modal-content"> <div class="row m-0 mt-1"> <div class="col-md-12"> <button type="button" class="close" data-dismiss="modal" aria-label="Close"><span aria-hidden="true">×</span></button> </div> </div> <div class="modal-body"></div> </div> </div> </div> <script src="https://cdn.waset.org/static/plugins/jquery-3.3.1.min.js"></script> <script src="https://cdn.waset.org/static/plugins/bootstrap-4.2.1/js/bootstrap.bundle.min.js"></script> <script src="https://cdn.waset.org/static/js/site.js?v=150220211556"></script> <script> jQuery(document).ready(function() { /*jQuery.get("https://publications.waset.org/xhr/user-menu", function (response) { jQuery('#mainNavMenu').append(response); });*/ jQuery.get({ url: "https://publications.waset.org/xhr/user-menu", cache: false }).then(function(response){ jQuery('#mainNavMenu').append(response); }); }); </script> </body> </html>