<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" > <channel> <title>Systems and Networking – Communications of the ACM</title> <atom:link href="https://cacm-acm-org-preprod.go-vip.net/category/systems-and-networking/feed/" rel="self" type="application/rss+xml" /> <link>https://cacm-acm-org-preprod.go-vip.net</link> <description></description> <lastBuildDate>Wed, 14 Feb 2024 20:56:33 +0000</lastBuildDate> <language>en-US</language> <sy:updatePeriod> hourly </sy:updatePeriod> <sy:updateFrequency> 1 </sy:updateFrequency> <generator>https://wordpress.org/?v=6.7.1</generator> <image> <url>https://cacm-acm-org-preprod.go-vip.net/wp-content/uploads/2023/11/cropped-cropped-cacm_favicon-1.png?w=32</url> <title>Systems and Networking – Communications of the ACM</title> <link>https://cacm-acm-org-preprod.go-vip.net</link> <width>32</width> <height>32</height> </image> <site xmlns="com-wordpress:feed-additions:1">212686555</site> <item> <title>Taming Algorithmic Priority Inversion in Mission-Critical Perception Pipelines</title> <link>https://cacm-acm-org-preprod.go-vip.net/research/taming-algorithmic-priority-inversion-in-mission-critical-perception-pipelines/</link> <comments>https://cacm-acm-org-preprod.go-vip.net/research/taming-algorithmic-priority-inversion-in-mission-critical-perception-pipelines/#respond</comments> <dc:creator><![CDATA[Shengzhong Liu]]></dc:creator> <pubDate>Wed, 14 Feb 2024 18:55:24 +0000</pubDate> <category><![CDATA[Artificial Intelligence and Machine Learning]]></category> <category><![CDATA[Systems and Networking]]></category> <guid isPermaLink="false">https://cacm-acm-org-preprod.go-vip.net/?post_type=digital-library&p=751299</guid> <description><![CDATA[This paper discusses algorithmic priority inversion in mission-critical machine inference pipelines used in modern neural-network-based perception subsystems and describes a solution to mitigate its effect.]]></description> <content:encoded><![CDATA[<article> <div class="body" lang="en"> <section class="sec"> <h2 class="heading"><span class="caption-label">1. </span>Introduction</h2> <p id="p-1"><i>Algorithmic priority inversion</i> plagues modern <i>mission-critical</i> machine inference pipelines such as those implementing perception modules in autonomous drones and self-driving cars. We describe an initial solution for removing such priority inversion from neural-network-based perception systems. This research was originally published in RTSS 2020.<a class="reference-link xref xref-bibr" href="#B17" data-jats-ref-type="bibr" data-jats-rid="B17"><sup>17</sup></a> While it is evaluated in the context of autonomous driving only, the design principles described below are expected to remain applicable in other contexts.</p> <p id="p-2">The application of artificial intelligence (AI) has revolutionized cyber-physical systems but has posed novel challenges in aligning computational resource consumption with mission-specific priority. Perception is one of the key components that enable system autonomy. 
It is also a major efficiency bottleneck that accounts for a considerable fraction of resource consumption.<a class="reference-link xref xref-bibr" href="#B3" data-jats-ref-type="bibr" data-jats-rid="B3"><sup>3</sup></a><sup>,</sup><a class="reference-link xref xref-bibr" href="#B12" data-jats-ref-type="bibr" data-jats-rid="B12"><sup>12</sup></a> In general, priority inversion occurs in computing systems when computations that are less critical (or that have longer deadlines) are performed together with or ahead of those that are more critical (or that have shorter deadlines). Current neural-network-based machine intelligence software suffers from a significant form of priority inversion on the path from perception to decision-making, because it processes input data sequentially in arrival order as opposed to processing important parts of a scene first. By resolving this problem, we significantly improve the system’s responsiveness to critical inputs at a lower platform cost. The work applies to intelligent systems that perceive their environment in real-time (using neural networks), such as self-driving vehicles,<a class="reference-link xref xref-bibr" href="#B1" data-jats-ref-type="bibr" data-jats-rid="B1"><sup>1</sup></a> autonomous delivery drones,<a class="reference-link xref xref-bibr" href="#B5" data-jats-ref-type="bibr" data-jats-rid="B5"><sup>5</sup></a> military defense systems,<a class="reference-link xref xref-bibr" href="#B2" data-jats-ref-type="bibr" data-jats-rid="B2"><sup>2</sup></a> and socially-assistive robotics.<a class="reference-link xref xref-bibr" href="#B8" data-jats-ref-type="bibr" data-jats-rid="B8"><sup>8</sup></a></p> <p id="p-3">To understand the present gap, observe that current deep perception networks perform many layers of manipulation of large multidimensional matrices (called <i>tensors</i>). The underlying neural network libraries (e.g., <i>TensorFlow</i>) are reminiscent of what used to be called the <i>cyclic executive</i><a class="reference-link xref xref-bibr" href="#B4" data-jats-ref-type="bibr" data-jats-rid="B4"><sup>4</sup></a> in early operating system literature. Cyclic executives, in contrast to priority-based scheduling,<a class="reference-link xref xref-bibr" href="#B11" data-jats-ref-type="bibr" data-jats-rid="B11"><sup>11</sup></a> processed all pieces of incoming data at the same <i>priority</i> and <i>fidelity</i> (e.g., as nested loops). Given incoming data frames (e.g., multicolor images or 3D LiDAR point clouds), modern neural network algorithms process all data rows and columns at the same priority and fidelity. Importance cues drive attention weights in AI computations, but not actual computational resource assignments.</p> <p id="p-4">This flat processing is in sharp contrast to the way <i>humans</i> process information. Human cognitive perception systems are good at partitioning the perceived scene into semantically meaningful partial regions in real-time, before allocating different degrees of attention (i.e., processing fidelity) and prioritizing the processing of important parts, to better utilize the limited cognitive resources. Given a complex scene, such as a freeway with multiple nearby vehicles, human drivers are good at understanding what to focus on to plan a valid path forward. In fact, human cognitive capacity is not sufficient to simultaneously absorb everything in their field of view. 
For example, if faced with an iMax screen partitioned into a dozen subdivisions, each playing an independent movie, humans would be fundamentally incapable of giving all such simultaneously playing movies sufficient attention. This suggests that GPUs that can, in fact, keep up with processing all pixels of the input scene are fundamentally and needlessly over-provisioned. They could be substantially smaller if endowed with a human-like capacity to focus on part of the scene only. The lack of prioritized allocation of processing resources to different parts of an input data stream (e.g., from a camera) is an instance of <i>algorithmic priority inversion</i>. As exemplified above, it results in significant resource waste, processing less important stimuli together with more important ones. To avoid wasting resources, the architecture described in this paper allows machine perception pipelines to partition the scene into regions of different criticality, prioritize the processing of important parts ahead of others, and provide higher processing fidelity on critical regions.</p> </section> <section class="sec"> <h2 class="heading"><span class="caption-label">2. </span>System Architecture</h2> <p id="p-5">Consider a simple pipeline composed of a camera that observes its physical environment, a neural network that processes the sampled frames, and a control unit that must react in real-time. <a class="xref xref-fig" href="#fig1" data-jats-ref-type="fig" data-jats-rid="fig1">Figure 1</a> contrasts the traditional design of such a machine inference pipeline to the proposed architecture. In the traditional design, the captured input data frames are processed sequentially by the neural network without preemption in execution.</p> <figure id="fig1" class="fig" data-jats-position="float"> <div class="image-container"><img decoding="async" class="graphic" title="Figure 1: " src="https://cacm-acm-org-preprod.go-vip.net/wp-content/uploads/2024/02/3610801_fig01.jpg" alt="" data-image-id="fig1" data-image-type="figure" /></div><figcaption><span class="caption-label">Figure 1: </span> <span class="p">Real-time Machine Inference Pipeline Architecture.</span></p> <div class="figcaption-footer"> </div> </figcaption></figure> <p id="p-7">Unfortunately, the multi-dimensional data frames captured by modern sensors (e.g., colored camera images and 3D LiDAR point clouds) carry information of different degrees of criticality in every frame<a class="footnote-link xref xref-fn" href="#FN1" data-jats-ref-type="fn" data-jats-rid="FN1"><sup>a</sup></a>. Data of different criticality may require a different processing latency. For example, processing parts of the image that represent faraway objects does not need to happen every frame, whereas processing nearby objects, such as a vehicle in front, needs to be done immediately because of their impact on immediate path planning. 
To accommodate these differences in input data criticality, our machine perception pipeline breaks the input frame processing into four steps:</p> <ul class="list" data-jats-list-type="bullet"> <li class="list-item"> <p id="p-8">Data slicing and priority allocation: This module breaks up newly arriving frames into smaller regions of different degrees of criticality based on simple heuristics (i.e., distance-based criticality).</p> </li> <li class="list-item"> <p id="p-9">Deduplication: This module drops redundant regions (i.e., ones that refer to the same physical objects) across successive arriving frames.</p> </li> <li class="list-item"> <p id="p-10">“Anytime” neural network: This neural network implements an imprecise computation model that allows execution to be preempted while yielding partial utility from the partially completed computation. The approach allows newly arriving critical data to preempt the processing of less critical data from older frames.</p> </li> <li class="list-item"> <p id="p-11">Batching and utility maximization: This module sits between the data slicing and deduplication modules on one end and the neural network on the other. With data regions broken by priority, it decides which regions to pass to the neural network for processing. Since multiple regions may be queued for processing, it also decides how best to benefit from batching (that improves processing efficiency).</p> </li> </ul> <p id="p-12">We refer to the subsystem shown in <a class="xref xref-fig" href="#fig1" data-jats-ref-type="fig" data-jats-rid="fig1">Figure 1</a> as the <i>observer</i>. The goal is to allow the observer to respond to more urgent stimuli ahead of less urgent ones. To make the observer concrete, we consider a video processing pipeline, where the input video frames get broken into regions of different criticality according to the distance information obtained from a ranging sensor (i.e., LiDAR). Different deadline-driven priorities are then assigned to the processing of these regions. We adopt an imprecise computation model for neural networks<a class="reference-link xref xref-bibr" href="#B21" data-jats-ref-type="bibr" data-jats-rid="B21"><sup>21</sup></a> to achieve a hierarchy of different processing fidelities. We further introduce a utility-optimizing scheduling algorithm for the resulting real-time workload to meet deadlines while maximizing a notion of global utility (to the mission). We implement the architecture on an NVIDIA Jetson Xavier platform and do a performance evaluation on the platform using real video traces collected from autonomous vehicles. The results show that the new algorithms significantly improve the average quality of machine inference, while nearly eliminating deadline misses, compared to a set of state-of-the-art baselines executed on the same hardware under the same frame rate.</p> <p id="p-13">For completeness, below we first describe all components of the observer, respectively. We then detail the batching and utility maximization algorithm used.</p> <section class="sec"> <h3 class="heading"><span class="caption-label">2.1. </span>Data slicing and priority allocation</h3> <p id="p-14">This module breaks up input data frames into regions that require different degrees of attention. Objects with a smaller <i>time-to-collision</i><a class="reference-link xref xref-bibr" href="#B18" data-jats-ref-type="bibr" data-jats-rid="B18"><sup>18</sup></a> should receive attention more urgently and be processed at a higher fidelity. 
We further assume that the observer is equipped with a <i>ranging</i> sensor. For example, in autonomous driving systems, a LiDAR sensor measures distances between the vehicle and other objects. LiDAR point cloud-based object localization techniques have been proposed<a class="reference-link xref xref-bibr" href="#B6" data-jats-ref-type="bibr" data-jats-rid="B6"><sup>6</sup></a> that provide a fast (i.e., over 200Hz) and accurate ranging and object localization capability. The computed object locations can then be projected onto the image obtained from the camera, allowing the extraction of regions (subareas of the image) that represent these localized objects, sorted by distance from the observer. For simplicity, we restrict those subareas to rectangular regions or <i>bounding boxes</i>. We define the priority (of bounding boxes) by time-to-collision, given the trajectory of the observer and the location of the object. Computing the time-to-collision is a well-studied topic and is not our contribution.<a class="reference-link xref xref-bibr" href="#B18" data-jats-ref-type="bibr" data-jats-rid="B18"><sup>18</sup></a></p> </section> <section class="sec"> <h3 class="heading"><span class="caption-label">2.2. </span>Deduplication</h3> <p id="p-15">The deduplication module eliminates redundant bounding boxes. Since the same objects generally persist across many frames, the same bounding boxes will be identified in multiple frames. The set of bounding boxes pertaining to the same object in different frames is called a <i>tubelet</i>. Since the best information is usually the most recent, only the most recent bounding box in a tubelet needs to be acted on. The deduplication module identifies boxes with large overlaps as redundant and stores the most recent box only. For efficiency reasons described later, we quantize the used bounding box sizes. The deduplication module uses the same box size for the same object throughout the entire tubelet. Note that, in a traditional neural network processing pipeline, each frame is processed in its entirety before the next one arrives. Thus, no deduplication module is used. The option to add this time-saving module to our architecture arises because our pipeline can postpone the processing of some objects until a later time. By that time, updated images of the same object may arrive. This enables savings by looking at the latest image only when the neural network eventually gets around to processing the object.</p> </section> <section class="sec"> <h3 class="heading"><span class="caption-label">2.3. </span>The anytime neural network</h3> <p id="p-16">A perfect <i>anytime</i> algorithm is one that can be terminated at any point, yielding utility that monotonically increases with the amount of processing performed. We approximate the optimal model with an imprecise computation model,<a class="reference-link xref xref-bibr" href="#B14" data-jats-ref-type="bibr" data-jats-rid="B14"><sup>14</sup></a><sup>–</sup><a class="reference-link xref xref-bibr" href="#B16" data-jats-ref-type="bibr" data-jats-rid="B16"><sup>16</sup></a> where the processing consists of two parts: a <i>mandatory part</i> and multiple <i>optional parts</i>. The optional parts, or a portion thereof, can be skipped to conserve resources. When at least one optional part is skipped, the task is said to produce an <i>imprecise</i> result. 
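</p> <p>To make the mandatory/optional split concrete, here is a minimal sketch of an anytime task loop. It is an illustration rather than the paper's implementation: the stage functions, the time budget, and the return convention are assumptions. The mandatory first stage always runs; optional stages keep refining the prediction until the budget runs out, and execution stops cleanly on a stage boundary.</p> <pre><code>import time
from typing import Callable, List, Tuple

# Each stage maps the features computed so far to (new_features, prediction).
Stage = Callable[[object], Tuple[object, object]]

def run_anytime_task(stages: List[Stage], x, budget_s: float):
    """Run the mandatory first stage, then optional stages while time remains."""
    start = time.monotonic()
    features, prediction = stages[0](x)      # mandatory part: always executed
    executed = 1
    for stage in stages[1:]:                 # optional parts: best effort
        if time.monotonic() - start >= budget_s:
            break                            # stop on a stage boundary
        features, prediction = stage(features)
        executed += 1
    return prediction, executed              # imprecise if optional stages were skipped
</code></pre> <p>Running more optional stages only refines the returned result, matching the monotone utility assumed for imprecise computations later in the paper.</p> <p>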
Deep neural networks (e.g., image recognition models)<a class="reference-link xref xref-bibr" href="#B10" data-jats-ref-type="bibr" data-jats-rid="B10"><sup>10</sup></a> are a concatenation of a large number of layers that can be divided into several stages, as we show in <a class="xref xref-fig" href="#fig2" data-jats-ref-type="fig" data-jats-rid="fig2">Figure 2</a>. Ordinarily, an output layer is used at the end to convert features computed by earlier layers into the output value (e.g., an object classification). Prior work has shown, however, that other output layers can be forked off of intermediate stages producing meaningful albeit imprecise outputs based on features computed up to that point.<a class="reference-link xref xref-bibr" href="#B20" data-jats-ref-type="bibr" data-jats-rid="B20"><sup>20</sup></a> <a class="xref xref-fig" href="#fig3" data-jats-ref-type="fig" data-jats-rid="fig3">Figure 3</a> shows the accuracy of ResNet-based classification applied to the ImageNet<a class="reference-link xref xref-bibr" href="#B7" data-jats-ref-type="bibr" data-jats-rid="B7"><sup>7</sup></a> dataset at the intermediate stages of neural network processing. The quality of outputs increases when the network executes more optional parts. We set the utility proportionally to <i>predictive confidence in result</i>; a low confidence output is less useful than a high confidence output. The proportionality factor itself can be set depending on task criticality, such that uncertainty in the output of more critical tasks is penalized more.</p> <figure id="fig2" class="fig" data-jats-position="float"> <div class="image-container"><img decoding="async" class="graphic" title="Figure 2: " src="https://cacm-acm-org-preprod.go-vip.net/wp-content/uploads/2024/02/3610801_fig02.jpg" alt="" data-image-id="fig2" data-image-type="figure" /></div><figcaption><span class="caption-label">Figure 2: </span> <span class="p">ResNet<a class="reference-link xref xref-bibr" href="#B10" data-jats-ref-type="bibr" data-jats-rid="B10"><sup>10</sup></a> architecture with multiple exits. On the left, we show the design of the basic bottleneck block of ResNet. <i>c</i> is the feature dimension. The classifier has a pooling layer and a fully connected layer.</span></p> <div class="figcaption-footer"> </div> </figcaption></figure> <figure id="fig3" class="fig" data-jats-position="float"> <div class="image-container"><img decoding="async" class="graphic" title="Figure 3: " src="https://cacm-acm-org-preprod.go-vip.net/wp-content/uploads/2024/02/3610801_fig03.jpg" alt="" data-image-id="fig3" data-image-type="figure" /></div><figcaption><span class="caption-label">Figure 3: </span> <span class="p">ResNet stage accuracy change on ImageNet<a class="reference-link xref xref-bibr" href="#B7" data-jats-ref-type="bibr" data-jats-rid="B7"><sup>7</sup></a> dataset.</span></p> <div class="figcaption-footer"> </div> </figcaption></figure> </section> <section class="sec"> <h3 class="heading"><span class="caption-label">2.4. </span>Batching and utility maximization</h3> <p id="p-19">This module decides the schedule of processing of all regions identified by the data slicing and prioritization module and that passes de-duplication. The data slicing module computes <i>bounding boxes</i> for objects detected, which constitute regions that require attention, each assigned a degree of criticality. The deduplication module groups boxes related to the same object into a tubelet. Only the latest box in the tubelet is kept. 
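</p> <p>A minimal sketch of this deduplication follows; it is an illustration rather than the paper's exact procedure, and the axis-aligned box format (x1, y1, x2, y2) and the overlap threshold are assumptions. A queued box is treated as redundant when a newer box overlaps it heavily, so only the most recent view of each object survives.</p> <pre><code>def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def deduplicate(queued_boxes, new_boxes, threshold=0.7):
    """Drop queued boxes that a newer box overlaps heavily; keep the rest."""
    kept = [old for old in queued_boxes
            if not any(iou(old, new) >= threshold for new in new_boxes)]
    return kept + list(new_boxes)            # the newest boxes always survive
</code></pre> <p>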
Each physical object gives rise to a separate neural network task to be scheduled. The input of that task is the bounding box for the corresponding object (cropped from the full scene).</p> </section> </section> <section class="sec"> <h2 class="heading"><span class="caption-label">3. </span>The Scheduling Problem</h2> <p id="p-20">In this section, we describe our task execution model, formulate the studied scheduling problem, and derive a near-optimal solution.</p> <section class="sec"> <h3 class="heading"><span class="caption-label">3.1. </span>The execution model</h3> <p id="p-21">As alluded to earlier, the scheduled tasks in our system constitute the execution of multi-layer deep neural networks (e.g., ResNet,<a class="reference-link xref xref-bibr" href="#B10" data-jats-ref-type="bibr" data-jats-rid="B10"><sup>10</sup></a> as shown in <a class="xref xref-fig" href="#fig2" data-jats-ref-type="fig" data-jats-rid="fig2">Figure 2</a>), each processing a different input data region (i.e., a bounding box). As shown in <a class="xref xref-fig" href="#fig2" data-jats-ref-type="fig" data-jats-rid="fig2">Figure 2</a>, tasks are broken into stages, where each stage includes multiple neural network layers. The unit of scheduling is a single stage, whose execution is non-preemptive, but tasks can be preempted on stage boundaries. A task arrives when a new object is detected by the ranging sensor (e.g., LiDAR) giving rise to a corresponding new bounding box in the camera scene. Let the arrival time of task <i>τ<sub>i</sub></i> be denoted by <i>a<sub>i</sub></i>. A deadline <i>d<sub>i</sub></i> > <i>a<sub>i</sub></i>, is assigned by the data slicing and priority assignment module denoting the time by which the task must be processed (e.g., the corresponding object classified). The data slicing and priority assignment module are invoked at frame arrival time. Therefore, both <i>a<sub>i</sub></i> and <i>d<sub>i</sub></i> are a multiple of frame inter-arrival time, <i>H</i>. No task can be executed after its deadline. Future object sizes, arrival times, and deadlines are unknown, which makes the scheduling problem an <i>online decision problem</i>. A combination of two aspects makes this real-time scheduling problem interesting: <i>batching</i> and <i>imprecise computations</i>. We describe these aspects below.</p> <section class="inline-headings-section"> <p data-jats-content-type="inline-heading"><strong>Batching.</strong> Stages of the neural network, in our architecture, are executed on a low-end embedded GPU. While such GPUs feature parallel execution, most require that the same kernel be executed on all GPU cores. This means that we can process different images concurrently on the GPU as long as we run the <i>same kernel</i> on all GPU cores. We call such concurrent execution, <i>batching</i>. Running the same kernel on all GPU cores means that we can only batch image processing tasks if both of the following apply: (i) they are executing <i>the same neural network stage</i>, and (ii) they <i>run on the same size inputs</i>. The latter condition is because the processing of different bounding box sizes requires instantiating different GPU kernels. Batching is advantageous because it allows us to better utilize the parallel processing capacity of GPU. To increase batching opportunities, we limit the size of possible bounding boxes to a finite set of options. 
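</p> <p>As an illustration of this size quantization (the concrete size menu below is an assumption, not the set used in the paper), each detected box can be snapped to the smallest allowed input size that contains it, so that tasks with matching quantized sizes become candidates for the same batch.</p> <pre><code># Hypothetical menu of allowed (width, height) input sizes, smallest first.
ALLOWED_SIZES = [(64, 64), (128, 128), (256, 256), (512, 512)]

def quantize_box(width: int, height: int):
    """Return the smallest allowed size that fits the detected box."""
    for w, h in ALLOWED_SIZES:
        if w >= width and h >= height:
            return (w, h)
    return ALLOWED_SIZES[-1]                 # oversized boxes use the largest size
</code></pre> <p>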
For a given bounding box size <i>k</i>, at most <i>B<sup>(k)</sup></i> tasks (processing inputs) can be batched together before overloading the GPU capacity. We call it the <i>batching limit</i> for the corresponding input size.</p> </section> <section class="inline-headings-section"> <p data-jats-content-type="inline-heading"><strong>Imprecise computations.</strong> Let the number of neural network stages for task <i>τ<sub>i</sub></i> be <i>L<sub>i</sub></i> (different input sizes may have different numbers of stages). We call the first stage <i>mandatory</i> and call the remaining stages <i>optional</i>. Following a recently developed imprecise computation model for deep neural networks (DNN),<a class="reference-link xref xref-bibr" href="#B21" data-jats-ref-type="bibr" data-jats-rid="B21"><sup>21</sup></a> tasks are written such that they can return an object classification result once the mandatory stage is executed. This result then improves with the execution of each optional stage. Earlier work presented an approach to estimate the expected confidence in the correctness of the results of future stages, ahead of executing these stages.<a class="reference-link xref xref-bibr" href="#B22" data-jats-ref-type="bibr" data-jats-rid="B22"><sup>22</sup></a> This estimation offers a basis for assessing the utility of future task stage execution. We denote the utility of task <i>τ<sub>i</sub></i> after executing <i>j</i> ≤ <i>L<sub>i</sub></i> stages by <i>R<sub>i,j</sub></i>, where <i>R<sub>i,j</sub></i> is set proportionately to the predicted confidence in correctness at the conclusion of stage <i>j</i>. Note that, the expected utility can be different among tasks (depending in part on input size), but it is computable, non-decreasing, and concave with respect to the network stage.<a class="reference-link xref xref-bibr" href="#B22" data-jats-ref-type="bibr" data-jats-rid="B22"><sup>22</sup></a></p> <p id="p-24">We denote by 𝒯(<i>t</i>) the set of <i>current tasks</i> at time <i>t</i>. A task, <i>τ<sub>i</sub></i>, is called <i>current</i> at time <i>t</i>, if <i>a<sub>i</sub></i> ≤ <i>t</i> < <i>d<sub>i</sub></i>, and the task has not yet completed its last stage, <i>L<sub>i</sub></i>. For task <i>τ<sub>i</sub></i> of input size, <i>k</i>, the execution time of the <i>j</i>-th stage is denoted by <span class="inline-formula"><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msubsup><mrow><mi>e</mi></mrow><mrow><mi>j</mi><mo>,</mo><mi>b</mi></mrow><mrow><mrow><mo>(</mo><mi>k</mi><mo>)</mo></mrow></mrow></msubsup></mrow></math></span>, where <i>b</i> is the number of batched tasks during the stage execution.</p> </section> </section> <section class="sec"> <h3 class="heading"><span class="caption-label">3.2. </span>Problem formulation</h3> <p id="p-25">We next formulate a new scheduling problem, called <i>BAtched Scheduling with Imprecise Computations (BASIC)</i>. 
The problem is simply to decide on the number of stages <i>l<sub>i</sub></i> ≤ <i>L<sub>i</sub></i> to execute for each task <i>τ<sub>i</sub></i> and to schedule the batched execution of those task stages on the GPU such that the total utility, <span class="inline-formula"><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mrow><munder><mo>∑</mo><mrow><mi>i</mi></mrow></munder><mrow><msub><mrow><mi>R</mi></mrow><mrow><mi>i</mi><mo>,</mo><msub><mrow><mi>l</mi></mrow><mrow><mi>i</mi></mrow></msub></mrow></msub></mrow></mrow></mrow></math></span>, of executed tasks is maximized, and batching constraints are met (i.e., all used GPU cores execute the same kernel at any given time, and that the batching limit is not exceeded). In summary:</p> <section class="inline-headings-section"> <p data-jats-content-type="inline-heading"><strong>The BASIC problem.</strong> <i>With online task arrivals, the objective of the BASIC problem is to derive a schedule x to maximize the aggregate system utility. The schedule decides three outputs: task stage execution order on the GPU, number of stages to execute for each task, and task batching decisions. For each scheduling period t, we use x<sub>t</sub>(i, j) ɛ {0, 1} to denote whether the j-th stage of task τ<sub>i</sub> is executed. Besides, we use P to denote a batch of tasks, where |P| denotes the number of tasks being batched. The mathematical formulation of the optimization problem is:</i></p> <p><span class="disp-formula"> <span class="article-label">(1)</span> <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> <mtable> <mtr> <mtd> <mi>B</mi> <mi>A</mi> <mi>S</mi> <mi>I</mi> <mi>C</mi> <mo>:</mo> </mtd> <mtd> <mi>max</mi> <munder> <mrow> <mi>Σ</mi> </mrow> <mrow> <msub> <mrow> <mi>x</mi> </mrow> <mrow> <mi>t</mi> </mrow> </msub> </mrow> </munder> <munder> <mrow> <mi>Σ</mi> </mrow> <mrow> <mi>i</mi> </mrow> </munder> <msub> <mrow> <mi>x</mi> </mrow> <mrow> <mi>t</mi> </mrow> </msub> <mrow> <mo>(</mo> <mi>i</mi> <mo>,</mo> <mi>j</mi> <mo>)</mo> <mrow> <mo>(</mo> <msub> <mrow> <mi>R</mi> </mrow> <mrow> <mi>i</mi> <mo>,</mo> <mi>j</mi> </mrow> </msub> <mo>−</mo> <msub> <mrow> <mi>R</mi> </mrow> <mrow> <mi>i</mi> <mo>,</mo> <mi>j</mi> <mo>−</mo> <mn>1</mn> </mrow> </msub> <mo>)</mo> </mrow> </mrow> <mo></mo> </mtd> </mtr> <mtr> <mtd> <mi>s</mi> <mn>.</mn> <mi>t</mi> <mn>.</mn> <msub> <mrow> <mi>x</mi> </mrow> <mrow> <mi>t</mi> </mrow> </msub> <mrow> <mrow> <mo>(</mo> <mi>i</mi> <mo>,</mo> <mi>j</mi> <mo>)</mo> </mrow> <mo>∈</mo> <mrow> <mrow> <mo>{</mo> <mn>0</mn> <mo>,</mo> <mn>1</mn> <mo>}</mo> </mrow> <munderover> <mrow> <mi>Σ</mi> </mrow> <mrow> <mi>t</mi> <mo>=</mo> <mn>1</mn> </mrow> <mrow> <mi>T</mi> </mrow> </munderover> <msub> <mrow> <mi>x</mi> </mrow> <mrow> <mi>t</mi> </mrow> </msub> <mrow> <mo>(</mo> <mi>i</mi> <mo>,</mo> <mi>j</mi> <mo>)</mo> <mo>≤</mo> <mn>1</mn> <mo>,</mo> <mo>∀</mo> <mi>i</mi> <mo>,</mo> <mi>j</mi> </mrow> </mrow> </mrow> </mtd> </mtr> </mtable> </mrow> </math> </span> <span class="disp-formula"> <span class="article-label">(2)</span> <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> <msub> <mrow> <mi>x</mi> </mrow> <mrow> <mi>t</mi> </mrow> </msub> <mrow> <mrow> <mo>(</mo> <mi>i</mi> <mo>,</mo> <mi>j</mi> <mo>)</mo> </mrow> <mrow> <mo>=</mo> <mn>0</mn> <mrow> <mo>∀</mo> <mi>t</mi> <mo>∉</mo> <mo>[</mo> <msub> <mrow> <mi>a</mi> </mrow> <mrow> <mi>i</mi> </mrow> </msub> <mo>,</mo> <msub> <mrow> <mi>d</mi> </mrow> <mrow> <mi>i</mi> </mrow> </msub> <mo>)</mo> </mrow> </mrow> </mrow> <mo>,</mo> <mo>∀</mo> 
<mi>i</mi> <mo>,</mo> <mi>j</mi> </mrow> </math> </span> <span class="disp-formula"> <span class="article-label">(3)</span> <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> <mtable> <mtr> <mtd> <munderover> <mrow> <mi>Σ</mi> </mrow> <mrow> <mi>t</mi> <mo>′</mo> <mo>=</mo> <mn>1</mn> </mrow> <mrow> <mi>t</mi> <mo>−</mo> <mn>1</mn> </mrow> </munderover> <msub> <mrow> <mi>x</mi> </mrow> <mrow> <mi>t</mi> <mo>′</mo> </mrow> </msub> <mrow> <mo>(</mo> <mi>i</mi> <mo>,</mo> <mi>j</mi> <mo>−</mo> <mn>1</mn> <mo>)</mo> <mo>−</mo> <msub> <mrow> <mi>x</mi> </mrow> <mrow> <mi>t</mi> </mrow> </msub> <mrow> <mo>(</mo> <mi>i</mi> <mo>,</mo> <mi>j</mi> <mo>)</mo> <mo>≥</mo> <mn>0</mn> <mo>,</mo> </mrow> </mrow> </mtd> </mtr> <mtr> <mtd> <mo>∀</mo> <mi>i</mi> <mo>,</mo> <mi>j</mi> <mo>></mo> <mn>1</mn> <mo>,</mo> <mi>t</mi> <mo>></mo> <mn>1</mn> </mtd> </mtr> </mtable> </mrow> </math> </span> <span class="disp-formula"> <span class="article-label">(4)</span> <math xmlns="http://www.w3.org/1998/Math/MathML"> <mrow> <mtable> <mtr> <mtd> <msub> <mrow> <mi>s</mi> </mrow> <mrow> <mi>i</mi> </mrow> </msub> <mo>=</mo> <msub> <mrow> <mi>s</mi> </mrow> <mrow> <mi>i</mi> <mo>′</mo> </mrow> </msub> <mo>=</mo> <mi>k</mi> <mo>,</mo> <msub> <mrow> <mi>l</mi> </mrow> <mrow> <mi>i</mi> </mrow> </msub> <mo>=</mo> <msub> <mrow> <mi>l</mi> </mrow> <mrow> <mi>i</mi> <mo>′</mo> </mrow> </msub> <mo>,</mo> <mrow> <mo>|</mo> <mi>P</mi> <mo>|</mo> </mrow> <mo>≤</mo> <msub> <mrow> <mi>b</mi> </mrow> <mrow> <mi>k</mi> </mrow> </msub> <mo>,</mo> </mtd> </mtr> <mtr> <mtd> <mo>∀</mo> <mi>i</mi> <mo>∈</mo> <mi>P</mi> <mo>,</mo> <mi>i</mi> <mo>′</mo> <mo>∈</mo> <mi>P</mi> <mo>,</mo> <mo>∃</mo> <mi>k</mi> <mo>∈</mo> <mi>S</mi> </mtd> </mtr> </mtable> </mrow> </math> </span></p> <p id="p-27"><i>The following constraints should be satisfied: (1) Each neural network stage can only be executed once; (2) No task can be executed after its deadline; (3) The execution of different stages of the same task must satisfy their precedence constraints; and (4) Only tasks with the same (image size, network stage) can be batched, and the number of batched tasks can not exceed the batching constraint of their image size</i>.</p> <p id="p-28">Only one batch (kernel) can be executed on the GPU at any time. However, multiple batches can be executed sequentially in one scheduling period, as long as the sum of their execution times does not exceed the period length, <i>H</i>.</p> </section> </section> <section class="sec"> <h3 class="heading"><span class="caption-label">3.3. </span>An online scheduling framework</h3> <p id="p-29">We derive an optimal dynamic programming-based solution for the BASIC scheduling problem and express its competitive ratio relative to a clairvoyant scheduler (that has full knowledge of all future task arrivals). We then derive a more efficient greedy algorithm that approximates the dynamic programming schedule. We define the clairvoyant scheduling problem as follows:</p> <div id="statement1" class="statement"><b class="statement-label"><span class="sc">Definition</span> 1 (<span class="sc">Clairvoyant Scheduling Problem</span>).</b></p> <p id="p-30"><i>Given information about all future tasks, the clairvoyant scheduling problem seeks to maximize the aggregate utility obtained from (stages of) tasks that are completed before their deadlines. 
The maximum aggregate utility is OPT</i>.</p> </div> <p id="p-31">With no future information, an online scheduling algorithm that achieves a competitive ratio of <i>c</i> (i.e., a utility <span class="inline-formula"><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mo>≥</mo><mfrac><mrow><mn>1</mn></mrow><mrow><mi>c</mi></mrow></mfrac><mo>⋅</mo><mi>O</mi><mi>P</mi><mi>T</mi></mrow></math></span>) is called <i>c</i>-competitive. A lower bound on the competitive ratio for online scheduling algorithms was shown to be 1.618.<a class="reference-link xref xref-bibr" href="#B9" data-jats-ref-type="bibr" data-jats-rid="B9"><sup>9</sup></a></p> <p id="p-32">Our scheduler is invoked upon frame arrivals, which is once every <i>H</i> unit of time. We thus call <i>H</i> the <i>scheduling period</i>. We assume that all task stage execution times are multiples of some basic time unit <i>δ</i>, thereby allowing us to express <i>H</i> by an integer value. We further call the problem of scheduling current tasks within the period between successive frame arrivals, the <i>local scheduling problem</i>:</p> <div id="statement2" class="statement"><b class="statement-label"><span class="sc">Definition</span> 2 (<span class="sc">Local Basic Problem</span>).</b></p> <p id="p-33"><i>Given the set of current tasks, 𝒯(t), within the scheduling period, t, the local BASIC problem seeks to maximize the total utility gained within this scheduling period only</i>.</p> </div> <p id="p-34">We proceed to show that an online scheduling algorithm that optimally solves the local scheduling problem within each period will have a good competitive ratio. Let <i>L</i> be the maximum number of stages in any task, and let <i>B</i> be the maximum batching size:</p> <div id="statement3" class="statement"><b class="statement-label"><span class="sc">Theorem</span> 1.</b></p> <p id="p-35"><i>If during each scheduling period, the local BASIC problem for that period is solved optimally, then the resulting online scheduling algorithm is</i> min{2 + <i>L</i>, 2<i>B</i> + 1}<i>-competitive with respect to a clairvoyant algorithm</i>.</p> </div> <p id="p-36">When no imprecise computation is considered, the competitive ratio is further reduced to:</p> <div id="statement4" class="statement"><b class="statement-label"><span class="sc">Corollary</span> 1.</b></p> <p id="p-37"><i>If each task is only one stage long, and if the online scheduling algorithm solved the local BASIC problem in each scheduling period optimally, then the online scheduling algorithm is</i> 3<i>-competitive with respect to a clairvoyant algorithm</i>.</p> </div> </section> <section class="sec"> <h3 class="heading"><span class="caption-label">3.4. </span>Local scheduling algorithms</h3> <p id="p-38">In this section, we propose two algorithms to solve the local BASIC problem. The first is a dynamic programming-based algorithm that optimally solves it but may have a higher computational overhead. The second is a greedy algorithm that is computationally efficient but may not optimally solve the problem.</p> <section class="inline-headings-section"> <p data-jats-content-type="inline-heading"><strong>Local dynamic programming scheduling.</strong> Since we only consider batching together on the GPU tasks that execute the same kernel (i.e., same stage on the same size input), we need to partition the scheduling interval, <i>H</i>, into sub-intervals where the above constraint is met. The challenge is to find optimal partitioning. 
This question is broken into three steps:</p> <ul class="list" data-jats-list-type="bullet"> <li class="list-item"> <p id="p-40">Step 1: Given an amount of time, <i>T<sub>j,k</sub></i> ≤ <i>H</i>, what is the maximum utility attainable by scheduling the same stage, <i>j</i>, of tasks that process an input of size <i>k</i>? The answer here simply depends on the maximum number of tasks that we can batch during <i>T<sub>j,k</sub></i> without violating the batching limit. If the time allows for more than one batch, dynamic programming is used to optimally size the batches. Let the maximum attainable utility thus found be denoted by <span class="inline-formula"><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msubsup><mrow><mi>U</mi></mrow><mrow><mi>j</mi><mo>,</mo><mi>k</mi></mrow><mrow><mo>*</mo></mrow></msubsup></mrow></math></span>.</p> </li> <li class="list-item"> <p id="p-41">Step 2: Given an amount of time, <i>T<sub>k</sub></i> ≤ <i>H</i>, what is the maximum utility attainable by scheduling (any number of stages of) tasks that process an input of size <i>k</i>? Let us call this maximum utility <span class="inline-formula"><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msubsup><mrow><mi>U</mi></mrow><mrow><mi>k</mi></mrow><mrow><mo>*</mo></mrow></msubsup></mrow></math></span>. Dynamic programming is used to find the best way to break interval <i>T<sub>k</sub></i> into non-overlapping intervals <i>T<sub>j,k</sub></i>, for which the total sum of utilities, <span class="inline-formula"><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msubsup><mrow><mi>U</mi></mrow><mrow><mi>j</mi><mo>,</mo><mi>k</mi></mrow><mrow><mo>*</mo></mrow></msubsup></mrow></math></span>, is maximum.</p> </li> <li class="list-item"> <p id="p-42">Step 3: Given the scheduling interval, <i>H</i>, what is the maximum utility attainable by scheduling tasks of different input sizes? Let us call this maximum utility <i>U</i>*. Dynamic programming is used to find the best way to break interval <i>H</i> into non-overlapping intervals <i>T<sub>k</sub></i>, for which the total sum of utilities, <span class="inline-formula"><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msubsup><mrow><mi>U</mi></mrow><mrow><mi>k</mi></mrow><mrow><mo>*</mo></mrow></msubsup></mrow></math></span>, is maximum.</p> </li> </ul> <p id="p-43">The resulting utility, <i>U</i>*, as well as the corresponding breakdown of the scheduling interval constitute the optimal solution. In essence, the solution breaks down the overall utility maximization problem into a utility maximization problem over time sub-intervals, where tasks process only a given input size. These sub-intervals are in turn broken into sub-intervals that process the same stage (and input size). The intuition is that the subintervals in question do not overlap. 
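</p> <p>The nesting of the three steps can be sketched as follows. This is a simplified illustration rather than the paper's Algorithms 1, 2, and 3: the function <code>best_stage_utility(k, j, t)</code> stands in for the Step 1 result (the best utility obtainable by batching stage <i>j</i> of size-<i>k</i> tasks within <i>t</i> time units), and all times are integer multiples of the basic time unit.</p> <pre><code>from functools import lru_cache

def local_dp_schedule(H, sizes, stages_per_size, best_stage_utility):
    """Hierarchical DP sketch for the local BASIC problem (utility value only)."""

    @lru_cache(maxsize=None)
    def best_for_size(k, j, t):
        # Step 2: split t time units among stages 1..j of input size k.
        if j == 0 or t == 0:
            return 0
        return max(best_for_size(k, j - 1, t - u) + best_stage_utility(k, j, u)
                   for u in range(t + 1))

    @lru_cache(maxsize=None)
    def best_overall(n, t):
        # Step 3: split t time units among the first n input sizes.
        if n == 0 or t == 0:
            return 0
        k = sizes[n - 1]
        return max(best_overall(n - 1, t - u) +
                   best_for_size(k, stages_per_size[k], u)
                   for u in range(t + 1))

    return best_overall(len(sizes), H)       # U*: the best utility for this period
</code></pre> <p>Recovering the actual schedule requires remembering the maximizing split at each step; only the optimal value is computed here, and Step 1, where the batching limit enters, is assumed to be given.</p> <p>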
We pose an <i>order preserving</i> assumption on task marginal utilities with the same image size.</p> <div id="statement5" class="statement"><b class="statement-label"><span class="sc">Assumption</span> 1 (<span class="sc">Order Preserving Assumption</span>).</b></p> <p id="p-44"><i>For two tasks</i><span class="inline-formula"><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><msub><mrow><mi>τ</mi></mrow><mrow><mi>i</mi></mrow></msub></mrow><mrow><mn>1</mn></mrow></msub></mrow></math></span><i>and</i><span class="inline-formula"><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><msub><mrow><mi>τ</mi></mrow><mrow><mi>i</mi></mrow></msub></mrow><mrow><mn>2</mn></mrow></msub></mrow></math></span><i>with the same size, if for one neural network stage j, we have</i><span class="inline-formula"><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><msub><mrow><mi>R</mi></mrow><mrow><mi>i</mi></mrow></msub></mrow><mrow><mn>1</mn><mo>,</mo><mi>j</mi></mrow></msub><mo>−</mo><msub><mrow><msub><mrow><mi>R</mi></mrow><mrow><mi>i</mi></mrow></msub></mrow><mrow><mn>1</mn><mo>,</mo><mi>j</mi><mo>−</mo><mn>1</mn></mrow></msub><mo>≥</mo><msub><mrow><msub><mrow><mi>R</mi></mrow><mrow><mi>i</mi></mrow></msub></mrow><mrow><mn>2</mn><mo>,</mo><mi>j</mi></mrow></msub><mo>−</mo><msub><mrow><msub><mrow><mi>R</mi></mrow><mrow><mi>i</mi></mrow></msub></mrow><mrow><mn>2</mn><mo>,</mo><mi>j</mi><mo>−</mo><mn>1</mn></mrow></msub></mrow></math></span>, <i>then it also holds</i><span class="inline-formula"><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><msub><mrow><mi>R</mi></mrow><mrow><mi>i</mi></mrow></msub></mrow><mrow><mn>1</mn><mo>,</mo><mi>j</mi><mo>+</mo><mn>1</mn></mrow></msub><mo>−</mo><msub><mrow><msub><mrow><mi>R</mi></mrow><mrow><mi>i</mi></mrow></msub></mrow><mrow><mn>1</mn><mo>,</mo><mi>j</mi></mrow></msub><mo>≥</mo><msub><mrow><msub><mrow><mi>R</mi></mrow><mrow><mi>i</mi></mrow></msub></mrow><mrow><mn>2</mn><mo>,</mo><mi>j</mi><mo>+</mo><mn>1</mn></mrow></msub><mo>−</mo><msub><mrow><msub><mrow><mi>R</mi></mrow><mrow><mi>i</mi></mrow></msub></mrow><mrow><mn>2</mn><mo>,</mo><mi>j</mi></mrow></msub></mrow></math></span>.</p> </div> <p id="p-45">Thus, the choice of the best subset of tasks to execute remains the same regardless of which stage is considered. Below, we describe the algorithm in more detail.</p> <p id="p-46">Step 1: For each object size <i>k</i> and stage <i>j</i>, we can use a dynamic programming algorithm to decide the maximum number of tasks <i>M</i> that can execute stage <i>j</i> in time 0 < <i>T<sub>j,k</sub></i> ≤ <i>H</i>. Observe that this computation can be done offline. The details are shown in <a class="xref xref-fig" href="#F1" data-jats-ref-type="fig" data-jats-rid="F1">Algorithm 1</a>. With the optimal number, <i>M</i>, computed for each, <span class="inline-formula"><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msub><mrow><mi>T</mi></mrow><mrow><mi>j</mi><mo>,</mo><mi>k</mi></mrow></msub><mo>,</mo><msubsup><mrow><mi>U</mi></mrow><mrow><mi>j</mi><mo>,</mo><mi>k</mi></mrow><mrow><mo>*</mo></mrow></msubsup></mrow></math></span> is simply the sum of utilities of the <i>M</i> highest-utility tasks that are ready to execute stage <i>j</i> on an input of size <i>k</i>.</p> <p id="p-47">Step 2: We solve this problem by two-dimensional dynamic programming, considering the considered network stages and the time, respectively. 
The recursive (induction) step takes the output of Step 1 as input to calculate the optimal utility from assigning some fraction of <i>T<sub>k</sub></i> to the first <i>j</i> − 1 stage and the remainder to stage <i>j</i>, and computes the best possible sum of the two, for each <i>T<sub>k</sub></i>. Once all stages are considered, the result is the optimal utility, <span class="inline-formula"><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><msubsup><mrow><mi>U</mi></mrow><mrow><mi>k</mi></mrow><mrow><mo>*</mo></mrow></msubsup></mrow></math></span>, from running tasks of input size <i>k</i> for a period <i>T<sub>k</sub></i>. The details are explained in <a class="xref xref-fig" href="#F2" data-jats-ref-type="fig" data-jats-rid="F2">Algorithm 2</a>.</p> <figure id="F1" class="fig" data-jats-position="float"> <div class="image-container"><img decoding="async" class="graphic" title="Algorithm 1: " src="https://cacm-acm-org-preprod.go-vip.net/wp-content/uploads/2024/02/3610801_fig04.jpg" alt="" data-image-id="F1" data-image-type="figure" /></div><figcaption><span class="caption-label">Algorithm 1: </span> <span class="p">Batching</span></p> <div class="figcaption-footer"> </div> </figcaption></figure> <figure id="F2" class="fig" data-jats-position="float"> <div class="image-container"><img decoding="async" class="graphic" title="Algorithm 2: " src="https://cacm-acm-org-preprod.go-vip.net/wp-content/uploads/2024/02/3610801_fig05.jpg" alt="" data-image-id="F2" data-image-type="figure" /></div><figcaption><span class="caption-label">Algorithm 2: </span> <span class="p">Stage Assignment</span></p> <div class="figcaption-footer"> </div> </figcaption></figure> <figure id="F3" class="fig" data-jats-position="float"> <div class="image-container"><img decoding="async" class="graphic" title="Algorithm 3: " src="https://cacm-acm-org-preprod.go-vip.net/wp-content/uploads/2024/02/3610801_fig06.jpg" alt="" data-image-id="F3" data-image-type="figure" /></div><figcaption><span class="caption-label">Algorithm 3: </span> <span class="p">Local DP Scheduling Algorithm</span></p> <div class="figcaption-footer"> </div> </figcaption></figure> <figure id="F4" class="fig" data-jats-position="float"> <div class="image-container"><img decoding="async" class="graphic" title="Algorithm 4: " src="https://cacm-acm-org-preprod.go-vip.net/wp-content/uploads/2024/02/3610801_fig07.jpg" alt="" data-image-id="F4" data-image-type="figure" /></div><figcaption><span class="caption-label">Algorithm 4: </span> <span class="p">Local Greedy Scheduling Algorithm</span></p> <div class="figcaption-footer"> </div> </figcaption></figure> <p id="p-52">Step 3: Similar to Step 2, we perform a standard dynamic programming procedure to decide the optimal time partitioning among tasks processing different input sizes. The details of this procedure, along with the integrated local dynamic programming scheduling algorithm are presented in <a class="xref xref-fig" href="#F3" data-jats-ref-type="fig" data-jats-rid="F3">Algorithm 3</a>.</p> <p id="p-53">The optimality of <a class="xref xref-fig" href="#F3" data-jats-ref-type="fig" data-jats-rid="F3">Algorithm 3</a> follows from the optimality of dynamic programming. Hence, the overall competitive ratio is 3 for single-stage task scheduling and min{<i>L</i>+2, 2<i>B</i>+1} for multi-stage task scheduling, according to Corollary 1 and Theorem 1, respectively. 
However, this algorithm may have a high computational overhead since <a class="xref xref-fig" href="#F2" data-jats-ref-type="fig" data-jats-rid="F2">Algorithms 2</a> and <a class="xref xref-fig" href="#F3" data-jats-ref-type="fig" data-jats-rid="F3">3</a> which need to be executed each scheduling period, are <i>O</i>(<i>KLH</i><sup>3</sup>). Next, we present a simpler local greedy algorithm, which has better time efficiency.</p> </section> <section class="inline-headings-section"> <p data-jats-content-type="inline-heading"><strong>Local Greedy scheduling.</strong> The greedy online scheduling algorithm solves the local BASIC scheduling problem following a simple greedy selection rule: Execute the (eligible) batch with the maximum utility next. The pseudo-code of the greedy scheduling algorithm is shown in <a class="xref xref-fig" href="#F4" data-jats-ref-type="fig" data-jats-rid="F4">Algorithm 4</a>. The greedy scheduling algorithm is simple to implement and has a very low computational overhead. We show that it achieves a comparable performance to the optimal algorithm in practice.</p> </section> </section> </section> <section class="sec"> <h2 class="heading"><span class="caption-label">4. </span>Evaluation</h2> <p id="p-55">In this section, we verify the effectiveness and efficiency of our proposed scheduling framework by comparing it with several state-of-the-art baselines on a large-scale self-driving dataset, Waymo Open Dataset.</p> <section class="sec"> <h3 class="heading"><span class="caption-label">4.1. </span>Experimental setup</h3> <section class="inline-headings-section"> <p data-jats-content-type="inline-heading"><strong>Hardware platform.</strong> All experiments are conducted on an NVIDIA Jetson AGX Xavier SoC, which is specifically designed for automotive platforms. It’s equipped with an 8-core Carmel Arm v8.2 64-bit CPU, a 512-core Volta GPU, and 32GB memory. Its mode is set as MAXN with maximum CPU/GPU/memory frequency budget, and all CPU cores are online.</p> </section> <section class="inline-headings-section"> <p data-jats-content-type="inline-heading"><strong>Dataset.</strong> Our experiment is performed on the Waymo Open Dataset,<a class="reference-link xref xref-bibr" href="#B19" data-jats-ref-type="bibr" data-jats-rid="B19"><sup>19</sup></a> which is a large-scale autonomous driving dataset collected by Waymo self-driving cars in diverse geographies and conditions. It includes driving video segments of the 20s each, collected by LiDARs and cameras at 10Hz. Only the front camera data is used in our experiment.</p> </section> <section class="inline-headings-section"> <p data-jats-content-type="inline-heading"><strong>Neural network training.</strong> We use ResNet proposed by He <i>et al</i>.<a class="reference-link xref xref-bibr" href="#B10" data-jats-ref-type="bibr" data-jats-rid="B10"><sup>10</sup></a> for object classification. The network is trained on a general-purpose object detection dataset, COCO.<a class="reference-link xref xref-bibr" href="#B13" data-jats-ref-type="bibr" data-jats-rid="B13"><sup>13</sup></a> It contains 80 object classes that cover Waymo classes.</p> </section> <section class="inline-headings-section"> <p data-jats-content-type="inline-heading"><strong>Scheduling load and evaluation metrics.</strong> We extract the distance between objects and the autonomous vehicle (AV) from the projected LiDAR point cloud. The deadlines of object classification tasks are set as the time to collision (TTC) with the AV. 
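</p> <p>As a concrete example of this deadline assignment (a simplification for illustration; the paper relies on the cited time-to-collision literature rather than the constant-closing-speed model assumed here), the deadline of a classification task could be derived from the projected LiDAR distance as follows.</p> <pre><code>def assign_deadline(distance_m, closing_speed_mps, arrival_time_s, frame_period_s):
    """Deadline = arrival time + estimated time-to-collision, on a frame boundary.

    Assumes a constant closing speed; a receding or stationary object gets an
    effectively unbounded deadline.
    """
    if closing_speed_mps > 0.0:
        ttc_s = distance_m / closing_speed_mps
        frames = max(1, int(ttc_s // frame_period_s))   # at least one period ahead
        return arrival_time_s + frames * frame_period_s
    return float("inf")                      # no finite deadline for this object
</code></pre> <p>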
To simulate different loads for the scheduling algorithms, we manually change the sampling period (i.e., frame rate) from 40ms to 160ms. We consider a task to miss its deadline if the scheduler fails to run the mandatory part of the task by the deadline. In the following evaluation, we present both the <i>normalized accuracy</i> and <i>deadline miss rate</i> for different algorithms. The normalized accuracy is defined as the ratio between achieved accuracy and the maximum accuracy when all neural network stages are finished for every object.</p> </section> </section> <section class="sec"> <h3 class="heading"><span class="caption-label">4.2. </span>Compared scheduling algorithms</h3> <p id="p-60">The following scheduling algorithms are compared.</p> <ul class="list" data-jats-list-type="bullet"> <li class="list-item"> <p id="p-61">OnlineDP: the online scheduling algorithm we proposed in Section 3. The local scheduling is conducted by the hierarchical dynamic programming algorithm.</p> </li> <li class="list-item"> <p id="p-62">Greedy: the online scheduling algorithm we proposed, with the local scheduling conducted by the greedy batching algorithm.</p> </li> <li class="list-item"> <p id="p-63">Greedy-NoBatch: It always executes the object with maximal marginal utility without batching.</p> </li> <li class="list-item"> <p id="p-64">EDF: It always chooses the task stage with the earliest deadline (without considering task utility).</p> </li> <li class="list-item"> <p id="p-65">Non-Preemptive EDF (NP-EDF): This algorithm does not allow preemption. It is included to understand the impact of allowing preemption on stage boundaries compared to not allowing it.</p> </li> <li class="list-item"> <p id="p-66">FIFO: It runs the task with the earliest arrival time first. All stages are performed as long as the deadline is not violated.</p> </li> <li class="list-item"> <p id="p-67">RR: Round-robin scheduling algorithm. Runs one stage of each task in a round-robin fashion.</p> </li> </ul> </section> <section class="sec"> <h3 class="heading"><span class="caption-label">4.3. </span>Slicing and batching</h3> <p id="p-68">We compare the inference time for <i>full frames</i> and <i>batched partial frames with/out deduplication</i>. In full-frame processing, we directly run the neural network on image-captured full images, whose size is 1920 × 1280. In <i>batched partial frames</i>, we do the slicing into bounding boxes within one frame first, then perform the deduplication (if applicable), and finally, batch execution of objects with the same size. Each frame is evaluated independently. No imprecise computation is considered. Our results show that the average latency for full frames is 350ms, while the average latency for (the sum of) batched partial frames is 105ms without deduplication, and 83ms with deduplication. Besides, the cumulative distributions of frame latencies for the three methods are shown in <a class="xref xref-fig" href="#fig4" data-jats-ref-type="fig" data-jats-rid="fig4">Figure 4</a>. Data slicing, batching, and deduplication steps, although induce extra processing delays, can effectively reduce the end-to-end latency. 
However, neither approach is fast enough compared to 100ms sampling period, so that the imprecise computation model and prioritization are needed.</p> <figure id="fig4" class="fig" data-jats-position="float"> <div class="image-container"><img decoding="async" class="graphic" title="Figure 4: " src="https://cacm-acm-org-preprod.go-vip.net/wp-content/uploads/2024/02/3610801_fig08.jpg" alt="" data-image-id="fig4" data-image-type="figure" /></div><figcaption><span class="caption-label">Figure 4: </span> <span class="p">Cumulative distribution comparison of end-to-end latency. The execution time for frame slicing, deduplication (if applicable), batching, and neural network inference are all counted.</span></p> <div class="figcaption-footer"> </div> </figcaption></figure> </section> <section class="sec"> <h3 class="heading"><span class="caption-label">4.4. </span>Scheduling policy comparisons</h3> <p id="p-70">Next, we evaluate the scheduling algorithms in terms of achieved classification accuracy and deadline miss rate. The scheduling results are presented in <a class="xref xref-fig" href="#fig5" data-jats-ref-type="fig" data-jats-rid="fig5">Figure 5</a>. The two proposed algorithms, OnlineDP and Greedy, clearly outperform all the baselines with a large margin in all metrics. The improvement comes for two reasons: First, the integration of the imprecise computation model into neural networks makes the scheduler more flexible. It makes the neural network partially preemptive at the stage level, and gives the scheduler an extra degree of freedom (namely, deciding how much of each task to execute). Second, the involvement of batching simultaneously improves the model performance and alleviates deadline misses. The batching mechanism enables the GPU to be utilized at its highest parallel capacity. The deadline miss rates of both OnlineDP and Greedy are pretty close to 0 under any task load. We find Greedy shows similar performance as OnlineDP, though they possess different theoretical results. One practical reason is that the utility prediction function can not perfectly predict the utility for all future stages, where the OnlineDP scheduling can be negatively impacted.</p> <figure id="fig5" class="fig" data-jats-position="float"> <div class="image-container"><img decoding="async" class="graphic" title="Figure 5: " src="https://cacm-acm-org-preprod.go-vip.net/wp-content/uploads/2024/02/3610801_fig09.jpg" alt="" data-image-id="fig5" data-image-type="figure" /></div><figcaption><span class="caption-label">Figure 5: </span> <span class="p">Accuracy and deadline miss rate comparisons on all objects.</span></p> <div class="figcaption-footer"> </div> </figcaption></figure> <p id="p-72">To evaluate scheduling performance in driving scenarios involving the aforementioned important subcases, we compare the metrics of different algorithms for the subset of “critical objects.” Critical objects are defined as objects whose time-to-collision (and hence processing deadline) fall within 1s from when they first appear in the scene. Results are shown in <a class="xref xref-fig" href="#fig6" data-jats-ref-type="fig" data-jats-rid="fig6">Figure 6</a>. We notice that the accuracy and deadline miss rates of FIFO and RR are much worse in this case (because severe priority inversion occurs in these two algorithms). The deadline-driven algorithms (NP-EDF and EDF) can effectively resolve this issue because objects with earlier deadlines are always executed first. 
However, their general performance is limited for a lack of utility optimization. The utility-based scheduling algorithms (Greedy, Greedy-NoBatch, and OnlineDP) are also effective in removing priority inversion, while at the same time achieving better confidence in results. These algorithms multiply a weight factor <i>a</i> > 1 to increase the utility of handling critical objects so that they are preferred by the algorithm over non-critical ones.</p> <figure id="fig6" class="fig" data-jats-position="float"> <div class="image-container"><img decoding="async" class="graphic" title="Figure 6: " src="https://cacm-acm-org-preprod.go-vip.net/wp-content/uploads/2024/02/3610801_fig10.jpg" alt="" data-image-id="fig6" data-image-type="figure" /></div><figcaption><span class="caption-label">Figure 6: </span> <span class="p">Accuracy and deadline miss rate comparisons on critical objects. Critical objects are defined as objects that have a deadline less than 1s.</span></p> <div class="figcaption-footer"> </div> </figcaption></figure> </section> </section> <section class="sec"> <h2 class="heading"><span class="caption-label">5. </span>Conclusion</h2> <p id="p-74">We presented a novel perception pipeline architecture and scheduling algorithm that resolve algorithmic priority inversion in mission-critical machine inference pipelines, prevalent in conventional FIFO-based AI workflows. To mitigate the impact of priority inversion, the proposed online scheduling architecture rests on two key ideas: (1) Prioritize parts of the incoming sensor data over others to enable a more timely response to more critical stimuli, and (2) Explore the maximum parallel capacity of the GPU by a novel task batching algorithm that improves both response speed and quality. An extensive evaluation, performed on a real-world driving dataset, validates the effectiveness of our framework.</p> </section> <section class="sec"> <h2 class="heading">Acknowledgments</h2> <p id="p-75">Research reported in this paper was sponsored in part by the Army Research Laboratory under Cooperative Agreement W911NF-17-20196, NSF CNS 18-15891, NSF CNS 18-15959, NSF CNS 19-32529, NSF CNS 20-38817, Navy N00014-17-1-2783, and the Boeing Company.</p> </section> </div> <footer class="back"></footer> </article> ]]></content:encoded> <wfw:commentRss>https://cacm-acm-org-preprod.go-vip.net/research/taming-algorithmic-priority-inversion-in-mission-critical-perception-pipelines/feed/</wfw:commentRss> <slash:comments>0</slash:comments> <dc:creator><![CDATA[Shuochao Yao]]></dc:creator> <dc:creator><![CDATA[Xinzhe Fu]]></dc:creator> <dc:creator><![CDATA[Rohan Tabish]]></dc:creator> <dc:creator><![CDATA[Simon Yu]]></dc:creator> <dc:creator><![CDATA[Ayoosh Bansal]]></dc:creator> <dc:creator><![CDATA[Heechul Yun]]></dc:creator> <dc:creator><![CDATA[Lui Sha]]></dc:creator> <dc:creator><![CDATA[Tarek Abdelzaher]]></dc:creator> <post-id xmlns="com-wordpress:feed-additions:1">751299</post-id> </item> <item> <title>Technical Perspective: Bridging AI with Real-Time Systems</title> <link>https://cacm-acm-org-preprod.go-vip.net/research/technical-perspective-bridging-ai-with-real-time-systems/</link> <comments>https://cacm-acm-org-preprod.go-vip.net/research/technical-perspective-bridging-ai-with-real-time-systems/#respond</comments> <dc:creator><![CDATA[Giorgio Buttazzo]]></dc:creator> <pubDate>Wed, 14 Feb 2024 18:53:45 +0000</pubDate> <category><![CDATA[Artificial Intelligence and Machine Learning]]></category> <category><![CDATA[Systems and Networking]]></category> <guid 
isPermaLink="false">http://cacm-acm-org-preprod.go-vip.net/?post_type=article&p=563954</guid> <description><![CDATA["Taming Algorithmic Priority Inversion in Mission-Critical Perception Pipelines," by Shengzhong Liu et al., proposes a new methodology for overcoming the limitations of current AI frameworks to enable the use of deep neural networks in mission-critical systems.]]></description> <content:encoded><![CDATA[<article> <div class="body" lang="en"> <section id="sec1" class="sec"> <p id="p-1">Artificial intelligence (AI) and machine learning models are making progress at an unprecedented rate and have achieved remarkable performance in several specific tasks such as image classification, object detection, automatic control, strategy games, some types of medical diagnoses, and music composition.</p> <p id="p-2">The exceptional performance of machine learning models in perception tasks makes them very attractive for being adopted in a large variety of autonomous systems, which must process sensory data to understand the environment and react in real time to accomplish a given task. Examples of such autonomous systems include self-driving cars, advanced robots operating in unknown environments, and interplanetary space probes. These systems must not only perceive the objects in the scene and their location with a high accuracy, but they also must predict their trajectories and plan proper actions within stringent timing constraints.</p> <p id="p-3">Consider, for instance, an autonomous car driving in an urban environment. Its onboard perception system is not only in charge of detecting the road, sidewalks, traffic lights, and road signs, but it is also responsible for identifying and recognizing moving objects, like pedestrians, bicycles, and other moving vehicles, while predicting their trajectories and planning proper actions to prevent possible impacts with them. In this context, a correct prediction produced too late could cause the system to fail. This example illustrates that guaranteeing a timely response in this type of system is as crucial as producing a correct prediction.</p> <p id="p-4">In a complex, highly dynamic scenario like the one considered for a self-driving car, however, not all computational tasks are equally important. For example, objects closer to the vehicle should receive a higher priority with respect to those located further away. Similarly, objects moving at higher speed should be processed at higher rates with respect to objects that are standing or moving at lower speed.</p> <p id="p-5">One problem with the current AI frameworks and hardware accelerators for deep neural networks is that they have been developed for non-critical applications where timing is not an issue. Consequently, when multiple neural models must be executed on the same platform, each model is normally executed non-preemptively (that is, without interruption) or, in the best case, using simple scheduling heuristics that do not take time requirements or task criticality into account.</p> <p id="p-6">This means if a highly critical task <i>H</i> is activated just after a low-critical task <i>L</i> has started its execution, <i>H</i> will experience a long delay, since it can only start executing after the completion of <i>L</i>. This phenomenon is referred to as a <i>priority inversion</i> and has been studied extensively in the field of real-time systems. 
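</p>
<p>A back-of-the-envelope example (with made-up numbers, for illustration only) shows how large this penalty can be: if a 300-ms inference on a low-criticality input cannot be interrupted, a critical task arriving 5 ms later sees its 20-ms job complete only after roughly 315 ms.</p>
<pre><code>
# Toy numbers illustrating the cost of non-preemptive execution: a critical
# task H arrives 5 ms after a 300 ms low-criticality inference L has started.
def response_time_non_preemptive(L_exec, H_exec, H_arrival):
    start = max(H_arrival, L_exec)        # H must wait for L to finish
    return start + H_exec - H_arrival

def response_time_preemptive(L_exec, H_exec, H_arrival):
    return H_exec                         # H preempts L immediately on arrival

L_exec, H_exec, H_arrival = 0.300, 0.020, 0.005   # seconds
print(response_time_non_preemptive(L_exec, H_exec, H_arrival))  # ~0.315 s
print(response_time_preemptive(L_exec, H_exec, H_arrival))      # 0.02 s
</code></pre>
<p>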
However, it represents a serious problem in current AI algorithms, preventing their use in safety-critical real-time systems, where timing and functional requirements are equally important.</p> <p id="p-7">The following paper proposes a new methodology for overcoming the limitations of current AI frameworks to enable the use of deep neural networks in mission-critical systems. The key idea is to split the perception process into multiple tasks associated with different objects and prioritize them to enable more timely response to more critical stimuli.</p> <p id="p-8">The system combine range data acquired from a light detection and ranging sensor (LiDAR) with images obtained from a camera. In particular, the 3D objects detected by the LiDAR (based on distances) are projected on the 2D-image plane of the camera. Then, bounding boxes are assigned a priority inversely proportional to their distance from the vehicle, so that closer objects will be attended first. In this way, depending on the overall workload, low-priority objects can be processed by a lower rate.</p> <p id="p-9">To overcome the limitation of non-preemptive execution, the deep neural network in charge of processing the cropped images is broken into stages, each consisting of multiple neural layers so the model inference can be preempted between stages. To add flexibility, multiple predictions with different precision are generated at the end of multiple stages to balance accuracy vs. execution time.</p> <p id="p-10">The overall system is then able to schedule the various perceptual tasks based on the assigned priority, while avoiding priority inversion during neural inference and enabling a more predictable execution of AI algorithms in mission-critical systems.</p> </section> </div> </article> ]]></content:encoded> <wfw:commentRss>https://cacm-acm-org-preprod.go-vip.net/research/technical-perspective-bridging-ai-with-real-time-systems/feed/</wfw:commentRss> <slash:comments>0</slash:comments> <post-id xmlns="com-wordpress:feed-additions:1">563954</post-id> </item> <item> <title>The State of the Metaverse</title> <link>https://cacm-acm-org-preprod.go-vip.net/news/the-state-of-the-metaverse/</link> <comments>https://cacm-acm-org-preprod.go-vip.net/news/the-state-of-the-metaverse/#respond</comments> <dc:creator><![CDATA[Esther Shein]]></dc:creator> <pubDate>Thu, 08 Feb 2024 15:27:33 +0000</pubDate> <category><![CDATA[Computing Applications]]></category> <category><![CDATA[Systems and Networking]]></category> <guid isPermaLink="false">https://cacm-acm-org-preprod.go-vip.net/article/the-state-of-the-metaverse/</guid> <description><![CDATA[After years and many attempts to build out a metaverse and make use of it, is there anything out there? ]]></description> <content:encoded><![CDATA[ <p>After its ascension as one of the tech darlings of 2022, the <a href="https://en.wikipedia.org/wiki/Metaverse">metaverse</a> took a major backseat to generative AI and ChatGPT last year. Yet, while still in a nascent phase, enterprise projects are occurring, and industry observers say the metaverse this year will bring more efficiencies and collaboration.</p> <p>The metaverse, a 3D virtual, immersive environment where users are represented by avatars to interact, has the potential to generate up to $5 trillion in value by 2030. 
This makes it too large for companies to ignore, according to <a href="https://www.mckinsey.com/capabilities/growth-marketing-and-sales/our-insights/value-creation-in-the-metaverse">McKinsey</a>, which also found that 15% of corporate revenue is expected to come from the metaverse in the next few years.</p> <p>Thanks in part to more sophisticated AI tools and digital twins, which simulate real-world objects in detail, now it is the metaverse’s time to shine.</p> <p>“Like all emerging technologies that have come before and will come after, the metaverse has exited out of the hype part of the hype cycle and now, real value creation is taking place using the metaverse to solve real business challenges,’’ says Domhnaill Hernon, global lead for Ernst & Young’s EY Metaverse Labs.</p> <p><strong>The industrial metaverse and other use cases.</strong></p> <p>The metaverse already has gained significant traction in the consumer space, specifically in online gaming, where it has “hundreds of millions of monthly active users,” Hernon notes.</p> <p>In a business context, the manufacturing industry has taken the lead in utilizing the metaverse. Ninety-two percent of companies are experimenting with or implementing at least one metaverse-related use case, and on average, they are running six or more, according to the 2023 report “Exploring the Industrial Metaverse in Manufacturing,” from Deloitte and the Manufacturing Leadership Council.</p> <p>The report also notes that amid tech layoffs, the implementation of the industrial metaverse in manufacturing is creating new opportunities for tech-based manufacturing jobs that may not have existed previously.</p> <p>John Coykendall, a vice chair at Deloitte and leader of its U.S. industrial products and construction practice, says strategic initiatives are underway in the industrial metaverse to enhance production, customer engagement, supply chain efficiency, and talent development. Citing the report, he notes that key metaverse initiatives rooted in smart factory concepts (modern technologies used to analyze data and drive efficiencies through automated processes) are making significant strides in areas including data analytics (91%), cloud computing (86%), Internet of Things (78%), 5G (59%), and artificial intelligence (62%).</p> <p>“Because of the scope of what the industrial metaverse can offer—connection to data-rich, immersive 3D environments from anywhere there is a broadband internet connection—its potential value stretches far beyond just the production ecosystem,’’ the Deloitte report observes.
In some instances, manufacturers are building on their smart factory momentum and implementing industrial metaverse use cases, the report says.</p> <p>The Siemens and MIT Technology Review report “<a href="https://wp.technologyreview.com/wp-content/uploads/2023/03/MITTR_Siemens_The-Emergent-Industrial-Metaverse.pdf">The emergent industrial metaverse</a>,” goes further, saying the industrial metaverse has arguably, the greatest potential for immersive, interactive spaces.</p> <p>Like Deloitte’s study, the report says that integrating technologies such as high-fidelity simulations, extended reality, AI, machine learning, the Internet of Things, blockchain, cloud, and 5G/6G will drive the industrial metaverse to offer fully immersive real-time and synchronous representations of the real world.</p> <p>Although it is still immature, momentum is growing in the enterprise metaverse, which “represents a shift in how employees communicate, collaborate, and co-create in online 3D digital spaces,’’ according to Hernon. “The biggest tech companies in the world are investing in technology to deliver the enterprise metaverse.”</p> <p>Health sciences and financial services are using the enterprise metaverse to either attract new talent or to engage people in different ways, Hernon says. “In the industrial metaverse, automotive, pharma, and energy are leveraging VR for training and prototyping, and also leveraging digital twins for improved productivity and efficiency.”</p> <p>The metaverse is also flourishing in pharma and air transportation, notes Mat Kuruvilla, chief innovation architect at global IT services firm UST. “We always believe the power lies in the enterprise to use the metaverse to conduct better collaboration, better design, and better training, especially when many people work virtually,’’ Kuruvilla says. “It’s really a great mechanism to interact and bring real-world experiences and synthetic world experiences together.”</p> <p>UST is developing a virtual pilot project for a major airline interested in rebranding its terminals at several hubs for better flow. There are simulations for signage placement and navigation through the check-in, baggage claim, and gate areas. The idea is for users to immerse themselves in the different scenarios travelers experience when they arrive at the airport to help make better redesign decisions, Kuruvilla explains.</p> <p>The project, dubbed “airline’s immersive,” can be experienced on a PC, headset, or in a 180-degree immersive room with a PC connected to a projection system on three walls “that all have the experience going on and it’s wrapping around you’’ in an anamorphic display that works with a gaming hand controller, he says.</p> <p>UST’s development team used <a href="https://unity.com/">Unity</a> as its virtual reality platform for development and Stable diffusion, a deep learning, text-to-image model.</p> <p><strong>Key concerns and considerations</strong></p> <p>But with increased implementation comes added risks. 
Some 72% of manufacturers are most concerned with the cybersecurity risks associated with implementing metaverse-enabling technologies, according to the Deloitte report.</p> <p>Kuruvilla says companies need to think about how to control access to various devices, especially given that “today, many of the devices don’t have good policy management.’’ Head-mounted devices, in particular, are a challenge, given that “the security model is still not very clear for headsets,’’ he says.</p> <p>In the industrial metaverse, organizations also will have to consider connectivity, computational power, interoperability, and how to create “extremely high-fidelity” digital twin models, according to the Siemens/MIT Technology Review report.</p> <p>“Blockchain may be a key metaverse ingredient to ensure greater security and privacy,’’ the report says.</p> <p>Yet, despite these issues, the Deloitte report notes that “most companies believe the value the industrial metaverse will deliver outweighs the risk, especially with the right mitigation strategies in place.”</p> <p>Additionally, because virtual and augmented reality are new technologies, there is a learning curve for those who are not gamers, Hernon points out. “Hence, we are very careful and purposeful about how we launch our metaverse experiences and how we onboard new users, taking great care to make them as comfortable as possible.”</p> <p><strong>Expect to see more investments in the metaverse</strong></p> <p>Some of EY’s biggest clients are investing in a range of metaverse use cases including attracting talent and new employee onboarding, as well as delivering better learning and development, Hernon says.</p> <p>Kuruvilla believes a lot of simulations will be done in the metaverse and content created around training this year. Apple’s Vision Pro, launched in February 2024, will dramatically improve how people work with a spatial computing device, he says.</p> <p>Hernon hopes that “the negative sentiment” around the metaverse—which includes concerns over potentially addictive behaviors as users become increasingly engrossed in virtual worlds, and new forms of harassment, bullying, and hate—will dissipate and that more and more examples of real value creation use cases emerge.</p> <p>“Every day I read a new article stating the ‘metaverse is dead,’ and those articles are written by people that don’t actually build, deploy, or invest in the technology,’’ he maintains. “Transitioning out of the hype cycle is part of the maturation process of any emerging technology.
The reality is that substantial investments are being made in this technology by all the largest companies in the world and we are seeing some consistent use cases emerge with strong ROI.”</p> <p><em><strong>Esther Shein </strong>is a freelance technology and business writer based in the Boston area.</em></p> ]]></content:encoded> <wfw:commentRss>https://cacm-acm-org-preprod.go-vip.net/news/the-state-of-the-metaverse/feed/</wfw:commentRss> <slash:comments>0</slash:comments> <post-id xmlns="com-wordpress:feed-additions:1">751068</post-id> </item> <item> <title>The Role of Autonomous Machine Computing in Shaping the Autonomy Economy</title> <link>https://cacm-acm-org-preprod.go-vip.net/blogcacm/the-role-of-autonomous-machine-computing-in-shaping-the-autonomy-economy/</link> <comments>https://cacm-acm-org-preprod.go-vip.net/blogcacm/the-role-of-autonomous-machine-computing-in-shaping-the-autonomy-economy/#respond</comments> <dc:creator><![CDATA[Shaoshan Liu]]></dc:creator> <pubDate>Thu, 01 Feb 2024 14:28:38 +0000</pubDate> <category><![CDATA[Architecture and Hardware]]></category> <category><![CDATA[Computing Applications]]></category> <category><![CDATA[Society]]></category> <category><![CDATA[Systems and Networking]]></category> <guid isPermaLink="false"></guid> <description><![CDATA[The role of Autonomous Machine Computing in fostering the Autonomy Economy.]]></description> <content:encoded><![CDATA[ <p>The Autonomy Economy represents a transformative phase in our society, driven by the integration of autonomous machines such as autonomous vehicles, delivery robots, drones, and more into the provision of goods and services. Central to this revolution is Autonomous Machine Computing (AMC), the computing technological backbone enabling these diverse autonomous systems. This post delves into AMC’s critical role in fostering the Autonomy Economy.</p> <p>The ascension of autonomous machines signifies a paradigm shift from the Digital Economy. Originally confined to basic robotics and industrial applications, these autonomous machines now permeate everyday life, signaling a move towards the Autonomy Economy era. For example, in China, when you check into a hotel, it is likely that a delivery robot is going to bring what you need to your room. When you order food and grocery delivery in some cities, it’s robots that deliver the food from the restaurant or the supermarket to your doorstep. At home, robotic vacuum cleaners have become standard home appliances [1].</p> <p>In past decades, the Digital Economy has significantly propelled economic growth. For instance, the Internet economy accounts for 21% of GDP growth in mature economies from 2005 to 2010, contributing $2.1 trillion to the U.S. economy in 2019 [2]. Compared to the Digital Economy, the Autonomy Economy is poised for an even more profound impact. A key example is the potential transformation of the $1.9-trillion U.S. transportation sector through the widespread adoption of autonomous vehicles, indicative of the sweeping changes across various industries.</p> <h3 class="wp-block-heading">Historical Insights – The Virtuous Cycle</h3> <p>At the heart of this transition to the Autonomy Economy is AMC. Similar to its predecessors—personal computing and mobile computing—AMC is the core technology stack that empowers a wide range of autonomous machine form factors, including intelligent vehicles, autonomous drones, delivery robots, home service robots, agriculture robots, industry robots and many more that we have yet to imagine. 
AMC involves sensing technologies, computing technologies, communication technologies, autonomous machine algorithms, reliability and security, and many other technical areas. As of today, AMC is still evolving and being defined.</p> <figure class="wp-block-image"><img decoding="async" src="https://cacm-acm-org-preprod.go-vip.net/system/assets/0004/7016/013124_Liu_Figure_1_Liu.jpg" alt=""/></figure> <p><strong>Figure 1: The Virtuous Cycle of Computing Technologies and Their Ecosystems</strong><br>(Credit: Shaoshan Liu)</p> <p>To elucidate the impact of AMC on the evolution of the Autonomy Economy, let us examine historical precedents. Figure 1 illustrates the correlation between the market sizes of ecosystems and their corresponding semiconductor markets in the realms of personal and mobile computing. Currently, the mobile processor market is valued at $35.1 billion, with the mobile computing ecosystem’s market size soaring to $800 billion—approximately 23 times larger. Similarly, the personal computing processor market stands at $55 billion, with its ecosystem valued at $900 billion, making it 16 times the processor market’s size [3]. This pattern indicates that, as a computing era matures, the semiconductor industry plays a crucial role, potentially fostering an ecosystem that is 15 to 25 times its own market size.</p> <p>These historical insights reveal a fundamental truth: the semiconductor industry acts as a cornerstone for modern economies. The growth of nascent sectors like AMC prompts the semiconductor industry to innovate and develop necessary technologies. These advancements, in turn, fuel further growth in the emerging sector, creating a self-reinforcing cycle of development and expansion. This virtuous cycle underscores the semiconductor industry’s pivotal role in enabling and propelling the growth of sectors critical to the future economy, such as AMC in the burgeoning Autonomy Economy.</p> <h3 class="wp-block-heading">Computing Power Allocation Dictates the Size of Ecosystem</h3> <p>The pivotal role of computing power allocation in ecosystem growth is evident in the transformation of the mobile computing industry, as illustrated in Figure 2: In the early 2000s, mobile phones, predominantly feature phones, were widespread yet offered limited functionality, focusing 90% of their computing power on basic communication tasks like encoding and decoding. With less than 10% of computing power allocated for applications, the scope for diverse applications was minimal, constraining the mobile computing ecosystem’s market size to approximately $10 billion.</p> <figure class="wp-block-image"><img decoding="async" src="https://cacm-acm-org-preprod.go-vip.net/system/assets/0004/7017/013124_Liu_Figure2_Liu.jpg" alt=""/></figure> <p><strong>Figure 2: the growth of mobile computing ecosystem</strong><br>(Credit: Shaoshan Liu)</p> <p>The advent of smartphones marked a paradigm shift, driving an insatiable demand for increased computing power to support a burgeoning array of mobile applications. This demand catalyzed the transformation of single embedded chips into sophisticated mobile systems-on-chip, integrating multi-core CPUs, mobile GPUs, mobile DSPs, and advanced power management systems. This technological leap forward has made 90% of computing power available for applications, such as YouTube, WhatsApp, Uber etc., and has expanded the mobile computing ecosystem’s market size to $800 billion today. 
This trajectory underscores the transformative power of semiconductor technology advancements. By enabling a broader range of applications, these technologies not only enhance existing markets but also pave the way for new ecosystems, scaling the market size to multiples of the semiconductor sector’s value.</p> <h3 class="wp-block-heading">Autonomous Machine Computing Challenge</h3> <p>The predominant challenge facing AMC today is the creation of a robust ecosystem, hampered by the limited computing power allocated for AMC applications. As depicted in Figure 3, existing designs of AMC systems heavily prioritizes basic operational functions: 50% of computing resources are allocated to perception, 20% to localization, and 25% to planning. Consequently, this distribution leaves a minimal 5% for application development and execution, significantly restricting the capability for autonomous machines to perform complex, intelligent tasks.</p> <p>Addressing this issue necessitates a paradigm shift towards an AMC processor architecture that embeds critical computing functions directly into hardware. This approach dramatically liberates computing resources, enabling software developers to devote more attention to creating advanced AMC applications. An optimized design would significantly adjust the current allocation of computing power by integrating the essential tasks of perception, localization, and planning into the hardware itself, thus requiring only a minimal portion of the computing resources—10%, 5%, and 5%, respectively. This reallocation strategy would make an impressive 80% of computing power available for intelligent applications, thereby substantially enhancing the functional capabilities of autonomous machine systems.</p> <p>This strategy is more than just a technical enhancement; it represents a leap forward in the way we conceptualize and develop autonomous machines. By offering an efficient hardware platform that eases the development of AMC applications, we aim to unlock the creative potential of software developers. This, in turn, is expected to spur the creation of a dynamic marketplace for AMC applications. Ultimately, our vision is to usher in a new era of innovation in autonomous machines, shifting the emphasis from executing basic tasks to achieving advanced, intelligent behaviors, thereby expanding the possibilities of what autonomous machines can accomplish.</p> <figure class="wp-block-image"><img decoding="async" src="https://cacm-acm-org-preprod.go-vip.net/system/assets/0004/7018/013124_Liu_Figure3_Liu.jpg" alt=""/></figure> <p><strong>Figure 3: challenges of autonomous machine computing</strong><br>(Credit: Shaoshan Liu)</p> <h3 class="wp-block-heading">Roadmap Forward</h3> <p>As illustrated in Figure 4, unlocking the full economic potential of the AMC ecosystem hinges on the development of advanced computing systems that are easy to program, this involves the following research directions [4]: First, establishing a common computing architecture for autonomous machines, akin to the ARM architecture in mobile computing. Over the past five years, we’ve identified three promising architectures—dataflow accelerator architecture (DAA), factor graph architecture (FGA), and learning-based architecture (LBA)—as candidates. 
Our objective is to integrate their best features into a unified architecture.</p> <p>Second, developing a programming and runtime system to enable developers to efficiently create AMC applications, overcoming the current challenges of system abstraction, real-time constraints, and reliability guarantees.</p> <p>Third, the advancement of semiconductor technologies, such as SPAD, 3D stacking, and InP/InGaAs that will further enhance AMC’s sensing, computing, and communication capabilities, leading to more innovative applications and a seamless integration of emerging technologies with AMC system architectures.</p> <figure class="wp-block-image"><img decoding="async" src="https://cacm-acm-org-preprod.go-vip.net/system/assets/0004/7019/013124_Liu_Figure4_Liu.jpg" alt=""/></figure> <p><strong>Figure 4: roadmap forward</strong><br>(Credit: Shaoshan Liu)</p> <h3 class="wp-block-heading"><strong>References:</strong></h3> <ol class="wp-block-list"> <li>Liu, S. China’s Rise As A Robotic Nation, <a href="https://www.forbes.com/sites/forbestechcouncil/2022/04/29/chinas-rise-as-a-robotic-nation/?sh=67f4e2d641ef"><em>Forbes</em></a>, April 2022.</li> <li>Liu, S. The Transition to the Autonomy Economy and China-U.S. Tech Competition, <a href="https://thediplomat.com/2023/08/the-transition-to-the-autonomy-economy-and-china-us-tech-competition/"><em>The Diplomat</em></a>, August 2023.</li> <li>Liu, S. Growth Of The Autonomous Machine Computing Ecosystem Driven By Semiconductor Innovations, <a href="https://www.forbes.com/sites/forbestechcouncil/2022/08/25/growth-of-the-autonomous-machine-computing-ecosystem-driven-by-semiconductor-innovations/?sh=1b5643c5155b"><em>Forbes</em></a>, August 2022.</li> <li>Liu, S. and Gaudiot, J.L., 2023, International Roadmap for Devices and Systems (IRDS) 2023 Autonomous Machine Computing, <a href="https://irds.ieee.org/images/files/pdf/2023/2023IRDS_WP_AMC.pdf"><em>IEEE</em></a>.<br> </li> </ol> <p><em><strong>Shaoshan Liu’s</strong> background is a unique combination of technology, entrepreneurship, and public policy. He is currently a member of the ACM U.S. Technology Policy Committee, and a member of U.S. National Academy of Public Administration’s Technology Leadership Panel Advisory Group. His educational background includes a Ph.D. in Computer Engineering from U.C. 
Irvine, and a Master of Public Administration (MPA) from Harvard Kennedy School.</em></p> ]]></content:encoded> <wfw:commentRss>https://cacm-acm-org-preprod.go-vip.net/blogcacm/the-role-of-autonomous-machine-computing-in-shaping-the-autonomy-economy/feed/</wfw:commentRss> <slash:comments>0</slash:comments> <post-id xmlns="com-wordpress:feed-additions:1">750879</post-id> </item> <item> <title>Governments Are Spying on Your Push Notifications</title> <link>https://cacm-acm-org-preprod.go-vip.net/news/governments-are-spying-on-your-push-notifications/</link> <comments>https://cacm-acm-org-preprod.go-vip.net/news/governments-are-spying-on-your-push-notifications/#respond</comments> <dc:creator><![CDATA[Logan Kugler]]></dc:creator> <pubDate>Tue, 30 Jan 2024 15:15:05 +0000</pubDate> <category><![CDATA[Computing Applications]]></category> <category><![CDATA[Systems and Networking]]></category> <guid isPermaLink="false">https://cacm-acm-org-preprod.go-vip.net/article/governments-are-spying-on-your-push-notifications/</guid> <description><![CDATA[Companies are being compelled to turn over potentially valuable push notification metadata to authorities.]]></description> <content:encoded><![CDATA[ <p>We all get push notifications on our smartphones, but we don’t all feel the same about them. Some of us find them helpful, like when we’re pinged that our Amazon package has arrived. Some of us get so irritated with inane app alerts that we disable them the moment we download new apps.</p> <p>Few of us, however, consider them dangerous, and it turns out they just might be. In December 2023, U.S. Senator Ron Wyden sent a <a href="https://www.documentcloud.org/documents/24191267-wyden_smartphone_push_notification_surveillance_letter_to_doj_-_signed">letter</a> to the U.S. Department of Justice that said certain governments were using push notifications to spy on users of Apple and Google devices.</p> <p>In the letter, Wyden said his office had received a tip in 2022 that unidentified “government agencies in foreign countries” were pressuring Apple and Google to turn over push notification records.</p> <p>The push notifications generated by nearly every app on your smartphone travel through Apple and Google servers on their way to users, and they leave behind records on those servers. According to the letter, this includes metadata “detailing which app received a notification and when, as well as the phone and associated Apple or Google account to which that notification was intended to be delivered.” In certain cases, it may also include “unencrypted content, which could range from backend directives for the app to the <em>actual text</em> displayed to a user in an app notification.”</p> <p>Wyden’s claims have been verified. <a href="https://www.reuters.com/technology/cybersecurity/governments-spying-apple-google-users-through-push-notifications-us-senator-2023-12-06/">Reuters</a> confirmed via a source that both the U.S. and foreign governments have been asking the two companies for metadata related to push notifications. In a statement, Apple revealed it has been pressured by governments to share push notification data for some time. The company also indicated it has previously been prohibited by the U.S. 
government from sharing this information.</p> <p>Now that the issue is public, Apple has updated its <a href="https://www.macrumors.com/2023/12/07/apple-updates-law-enforcement-guidelines/">Legal Process Guidelines</a> to disclose that it may share push notification data with government authorities when it has a legal obligation to do so via “a subpoena or greater legal process.”</p> <p>With enough push notification metadata, it is possible that government agencies could decode how someone has used a particular app. It also means governments may be able to associate which Apple and Google accounts have sent anonymous messages. In the case of the “actual text displayed to a user in an app notification” cited in Wyden’s letter, it also means governments might actually be able to see what you’ve sent in an anonymous messaging app.</p> <p>This metadata could have legitimate law enforcement applications when obtained through proper legal channels, but it also could have a devastating impact on user privacy, individual civil rights, and the overall security of anonymous messaging apps.</p> <p>That last target is particularly worrisome.</p> <p>Anonymous messaging apps often are used by citizens and dissidents under oppressive political regimes to communicate without suffering harassment, persecution, or oppression. Journalists regularly rely on anonymous messaging apps to communicate securely with sources revealing sensitive information. They also are used extensively in the financial services industry to safely raise awareness about unethical companies or practices, says <a href="https://www.cayebank.bz/luigi-wewege/">Luigi Wewege</a>, president of Belize-based Central American financial institution Caye International Bank.</p> <p>The Justice Department has not yet commented on Wyden’s letter or the issue at large. While Apple and Google have confirmed that governments are compelling them to share push notification metadata, it’s still unclear what type of data is being shared—or how often it’s being shared.</p> <p>“The hosting companies and app service providers should have full access to the push notifications, and this is how this technology is designed,” says <a href="https://sites.google.com/view/wudezhi/home">Dezhi Wu</a>, a professor at the University of South Carolina who does research on push notification design. “Depending on the agreement levels between tech companies and governments, there is the potential for governments to access the text of the notification itself.”</p> <p>Right now, no one seems to know if governments can specifically access the text of notifications themselves. If they can, every anonymous messaging app on the planet just became tremendously less anonymous.</p> <p>We also don’t know how long Apple and Google have had agreements with governments to turn over push notification metadata. However, we do know that these agreements aren’t new.</p> <p>In recent reporting, <a href="https://www.washingtonpost.com/technology/2023/12/06/push-notifications-surveillance-apple-google/">The Washington Post</a> found more than two dozen U.S. government requests for push notification data from Amazon, Apple, Google, and Microsoft, some of which were related to federal law enforcement’s investigation into the Capitol riots that took place on January 6, 2021. How this news will change the world of digital privacy remains to be seen.</p> <p></p> <p><strong><em>Logan Kugler</em></strong><em> is a freelance technology writer based in Tampa, Florida. 
He is a regular contributor to CACM and has written for nearly 100 major publications.</em></p> ]]></content:encoded> <wfw:commentRss>https://cacm-acm-org-preprod.go-vip.net/news/governments-are-spying-on-your-push-notifications/feed/</wfw:commentRss> <slash:comments>0</slash:comments> <post-id xmlns="com-wordpress:feed-additions:1">750836</post-id> </item> <item> <title>Protecting Life-Saving Medical Devices From Cyberattack</title> <link>https://cacm-acm-org-preprod.go-vip.net/opinion/protecting-life-saving-medical-devices-from-cyberattack/</link> <comments>https://cacm-acm-org-preprod.go-vip.net/opinion/protecting-life-saving-medical-devices-from-cyberattack/#respond</comments> <dc:creator><![CDATA[Alex Vakulov]]></dc:creator> <pubDate>Wed, 10 Jan 2024 22:02:13 +0000</pubDate> <category><![CDATA[Security and Privacy]]></category> <category><![CDATA[Systems and Networking]]></category> <guid isPermaLink="false">https://cacm-acm-org-preprod.go-vip.net/?post_type=digital-library&p=586533</guid> <description><![CDATA[Alex Vakulov ponders how to protect the Internet of Medical Things.]]></description> <content:encoded><![CDATA[<article> <div class="body" lang="en"> <section id="sec1" class="sec"> <p id="p-4" data-jats-content-type="noindent"><a class="ext-link" href="https://bit.ly/3tP74ae" data-jats-ext-link-type="uri">https://bit.ly/3tP74ae</a> <strong>September 12, 2023</strong></p> <p id="p-5" data-jats-content-type="noindent">Smart medical gadgets are crucial for keeping people alive and healthy. From wearables that keep an eye on your heart rate all day to heart pumps and big machines like ventilators and dialysis units, these devices often work non-stop.</p> <p id="p-6">However, the sad reality is that cybersecurity is not always top of mind when these devices are being created. Many are easily connected to the Internet, often have simple passwords, or sometimes do not even require passwords. This lack of security is a huge problem because it allows hackers to not only break into the devices themselves, but also to penetrate hospital systems and wreak havoc with harmful software. According to a 2021 <a class="ext-link" href="https://www.csoonline.com/article/571989/outdated-iot-healthcare-devices-pose-major-security-threats.html" data-jats-ext-link-type="uri">report by Cynerio</a>, ransomware attacks on healthcare facilities surged by 123%, with over 500 attacks costing more than $21 billion.</p> <p id="p-7">More and more manufacturers are beefing up their cybersecurity game by using modern <a class="ext-link" href="https://www.infoworld.com/article/3271126/what-is-cicd-continuous-integration-and-continuous-delivery-explained.html" data-jats-ext-link-type="uri">CI/CD workflows</a> to protect against the wave of attacks targeting their medical devices. New software tools are making it easier for healthcare organizations’ security teams to quickly address issues, even when the devices come from different manufacturers. These tools can translate various queries, rules, and filters, making it easier to spot vulnerabilities.</p> <p id="p-8">Now, let’s explore some typical security issues in the world of connected medical devices and go over some guidelines and best practices for securing them.</p> </section> <section id="sec2" class="sec"> <h2 class="heading">Understanding Security Concerns in IoMT Devices</h2> <p id="p-9">The Internet of Medical Things (IoMT) is basically a specialized branch of the broader Internet of Things (IoT). 
While IoT connects all sorts of devices like smartphones, wearables, and industrial sensors, IoMT focuses specifically on medical gadgets. Both use cloud-based storage and AI-powered communication to share data, but IoMT takes it a step further by helping healthcare professionals with tasks like assessing, diagnosing, treating, and tracking patients’ conditions.</p> <p id="p-10">Hackers usually target these devices and systems to get their hands on some pretty sensitive stuff, mainly personally identifiable and protected health information. Once they snatch this valuable data, they either <a class="ext-link" href="https://www.cisecurity.org/insights/blog/ransomware-in-the-healthcare-sector" data-jats-ext-link-type="uri">hold it for ransom</a> or try to sell it on the dark web.</p> <p>Security loopholes in medical devices make things too risky. They widen the attack surface, giving hackers more ways to break in. Some of the typical issues include:</p> <ul class="list" data-jats-list-type="bullet"> <li class="list-item"> <p id="p-12">Badly managed access controls</p> </li> <li class="list-item"> <p id="p-13">Weak network segmentation</p> </li> <li class="list-item"> <p id="p-14">Outdated, vulnerable systems</p> </li> <li class="list-item"> <p id="p-15">Missing security updates</p> </li> <li class="list-item"> <p id="p-16">A glut of unencrypted, raw data</p> </li> <li class="list-item"> <p id="p-17">Risky open-source software elements</p> </li> </ul> <p id="p-18">Lately, the healthcare sector has become a hot target for attacks focused on apps and APIs.</p> <p id="p-19">When devices are networked together, there is usually a weak link in the chain—a device with simpler, less-secure software. Hackers can break into that device and then use it as a steppingstone to move laterally across the whole network, hunting for valuable data. Everything from cloud databases and network services to firmware, specific gadgets, storage systems, servers, and web apps can either bolster security or become a potential weak point in the system’s defenses.</p> <p id="p-20">Manufacturers frequently treat security as an afterthought, rather than a built-in feature of medical devices. This lack of embedded cybersecurity measures, coupled with the absence of audit logs, amplifies the risks. In addition, human factor-related issues can have life-threatening outcomes in such a setup.</p> <p id="p-21">One crucial step in dodging these threats is to use proper data encryption. Alongside this, other measures like network segmentation, well-designed authorization protocols, and next-gen traffic filtering that operate across all layers of the <a class="ext-link" href="https://www.networkworld.com/article/3239677/the-osi-model-explained-and-how-to-easily-remember-its-7-layers.html" data-jats-ext-link-type="uri">OSI model</a> should be in place to minimize the risks associated with medical devices. AI technologies can also significantly enhance security measures, detecting potential threats more swiftly than traditional methods. By automating many aspects of IT operations, <a class="ext-link" href="https://www.atera.com/blog/the-advancements-ai-is-making-in-the-it-industry/" data-jats-ext-link-type="uri">AI in ITSM</a> can save significant operational costs and time.</p> <p id="p-22">The challenge in keeping IoMT devices secure is tied to the unique conditions under which they operate. 
Most of these devices need to run 24/7 without any interruptions, so <a class="ext-link" href="https://cybersecurity.att.com/blogs/security-essentials/patching-frequency-best-practices" data-jats-ext-link-type="uri">regular updates or patches</a>, which would require temporarily shutting down the device, are not just inconvenient; they can have financial costs and, more importantly, could endanger lives. Adding to the complexity, devices from different manufacturers may have their own timetables for updates and maintenance. This can mess with the functionality of other devices on the network. Plus, if the software is not compatible across the board, that opens up a whole new can of worms in terms of security risks.</p> </section> <section id="sec3" class="sec"> <h2 class="heading">Navigating FDA Guidelines for IoMT Device Cybersecurity</h2> <p id="p-23">A while back, <a class="ext-link" href="https://www.federalregister.gov/documents/2017/09/06/2017-18815/design-considerations-and-premarket-submission-recommendations-for-interoperable-medical-devices" data-jats-ext-link-type="uri">the FDA put out some guidelines</a> about design considerations and recommendations for both before and after medical devices hit the market. Unfortunately, these guidelines are not always followed as closely as they should be. The FDA places cybersecurity at the top of the priority list, and everyone involved—from manufacturers to healthcare providers and even patients—must play their part in ensuring IoMT devices are as secure as possible.</p> <p id="p-24">One way to prevent security mishaps is to have a solid cybersecurity risk management plan in place. This should cover both before and after the product is released. In plain terms, security should be baked into the device right from the design stage and should be a default feature that is fully supported technically. These security measures should be part of the device throughout its entire life, all the way to when it eventually becomes obsolete.</p> <p id="p-25">Before a medical device even hits the market, there are guidelines that focus on the design and development stage. These guidelines stress that manufacturers should clearly justify why they chose specific security controls during the device’s design process.</p> <p id="p-26">After the device is out there in the real world, there is another set of guidelines for managing its cybersecurity. These guidelines urge manufacturers to think about cybersecurity throughout the product’s entire life. This means having a system in place for managing security vulnerabilities. It is also crucial to follow the cybersecurity framework set out by the National Institute of Standards and Technology (<a class="ext-link" href="https://www.nist.gov/" data-jats-ext-link-type="uri">NIST</a>).</p> </section> <section id="sec4" class="sec"> <h2 class="heading">Essential Cybersecurity Practices for the Internet of Medical Things</h2> <p id="p-27">Now, I want to share several key principles that, in my opinion, could serve as the foundation for solid cybersecurity in the world of the Internet of Medical Things (IoMT). Adhering to the following guidelines can help maintain the safety, integrity, and reliable operation of IoMT devices and networks.</p> <ul class="list" data-jats-list-type="bullet"> <li class="list-item"> <p id="p-28"><b>Risk-Based Approach</b></p> <p id="p-29">Manufacturers are highly encouraged to figure out what their assets are, as well as identifying potential threats and weak spots. 
They should then assess how vulnerabilities and threats could compromise the device’s operation and affect the health and safety of users or patients. It is also crucial to gauge the likelihood of these threats actually happening, and set up suitable strategies to lessen those risks.</p> </li> <li class="list-item"> <p id="p-30"><b>Thorough Security Testing</b></p> <p id="p-31">All devices and systems should be rigorously tested to find any possible weak links. It is recommended that manufacturers engage in activities like <a class="ext-link" href="https://www.synopsys.com/glossary/what-is-penetration-testing.html" data-jats-ext-link-type="uri">penetration testing</a> and vulnerability scanning to make sure their security controls are up to snuff.</p> </li> <li class="list-item"> <p id="p-32"><b>Clear Labeling</b></p> <p id="p-33">The device’s labels should be straightforward about its security features and any safety steps of which users should be aware.</p> </li> <li class="list-item"> <p id="p-34"><b>Incident Response Plan</b></p> <p id="p-35">Once the device is out in the market, manufacturers must be ready to tackle any cybersecurity issues. This should include a well-thought-out plan for disclosing vulnerabilities and dealing with them effectively.</p> </li> </ul> </section> <section id="sec5" class="sec"> <h2 class="heading">Conclusion</h2> <p id="p-36">The healthcare world is changing fast, with an increasing number of organizations leaning on smart health gadgets that are part of the Internet of Medical Things. While IoMT offers cutting-edge ways to update medical practices and improve patient care, it is not without its risks. Lacking strong security measures makes these devices sitting ducks for potential cyberattacks.</p> <p id="p-37">To make sure we are covering all our bases, it is crucial to identify any and all possible security weak spots and threats. Once we know what we are up against, we can put solid protective measures in place.</p> <p id="p-38">Managing the attack surface—essentially the sum of all potential security risks—can make the network on which these IoMT devices operate much safer. And let’s not forget keeping patient data and electronic medical records secure is absolutely essential as this technology continues to evolve.</p> </section> </div> </article> ]]></content:encoded> <wfw:commentRss>https://cacm-acm-org-preprod.go-vip.net/opinion/protecting-life-saving-medical-devices-from-cyberattack/feed/</wfw:commentRss> <slash:comments>0</slash:comments> <post-id xmlns="com-wordpress:feed-additions:1">586533</post-id> </item> <item> <title>On Specifying for Trustworthiness</title> <link>https://cacm-acm-org-preprod.go-vip.net/research/on-specifying-for-trustworthiness/</link> <comments>https://cacm-acm-org-preprod.go-vip.net/research/on-specifying-for-trustworthiness/#respond</comments> <dc:creator><![CDATA[Dhaminda B. Abeywickrama]]></dc:creator> <pubDate>Mon, 08 Jan 2024 22:08:27 +0000</pubDate> <category><![CDATA[Architecture and Hardware]]></category> <category><![CDATA[Systems and Networking]]></category> <guid isPermaLink="false">https://cacm-acm-org-preprod.go-vip.net/?post_type=digital-library&p=586482</guid> <description><![CDATA[As autonomous systems increasingly become part of our lives, it is crucial to foster trust between humans and these systems, to ensure positive outcomes and mitigate harmful ones. 
]]></description> <content:encoded><![CDATA[<article> <div class="body" lang="en"> <section id="sec1" class="sec"> <p id="p-1">Autonomous systems (AS) are systems that involve software applications, machines, and people—that is, systems that can take action with little or no human supervision.<sup><a class="reference-link xref xref-bibr" href="#bib34" data-jats-ref-type="bibr" data-jats-rid="bib34">34</a></sup> Soon, AS will no longer be confined to safety-controlled industrial settings. Instead, they will increasingly become part of our daily lives, having matured across various domains, such as driverless cars, healthcare robotics, and uncrewed aerial vehicles (UAVs). As such, it is crucial that these systems are trusted and trustworthy. Trust may vary, as it can be gained and lost over time. Different research disciplines define trust in different ways. This article focuses on the notion of trust that concerns the relationship between humans and AS. AS are considered <i>trustworthy</i> when the design, engineering, and operation of these systems generates positive outcomes and mitigates potentially harmful outcomes.<sup><a class="reference-link xref xref-bibr" href="#bib35" data-jats-ref-type="bibr" data-jats-rid="bib35">35</a></sup> The trustworthiness of AS can depend on many factors, such as explainability, accountability, and understandability to different users; robustness of AS in dynamic and uncertain environments; assurance of their design and operation through verification and validation (V&V) activities; confidence in their ability to adapt functionality as required; security against attacks on the systems, users, and deployed environment; governance and regulation of their design and operation; and consideration of ethics and human values in their deployment and use.<sup><a class="reference-link xref xref-bibr" href="#bib35" data-jats-ref-type="bibr" data-jats-rid="bib35">35</a></sup></p> <blockquote class="disp-quote" data-jats-content-type="pull-quote"> <p id="p-2">Autonomous systems are considered trustworthy when their design, engineering, and operation generates positive outcomes and mitigates potentially harmful outcomes.</p> </blockquote> <p id="p-3">There are various techniques for demonstrating the trustworthiness of systems, such as synthesis, formal verification at design time, runtime verification or monitoring, and test-based methods. However, common to all these techniques is the need to formulate <i>specifications</i>. A specification is a detailed formulation that provides “a definitive description of a system for the purpose of developing or validating the system.”<sup><a class="reference-link xref xref-bibr" href="#bib13" data-jats-ref-type="bibr" data-jats-rid="bib13">13</a></sup> According to Kress-Gazit et al.,<sup><a class="reference-link xref xref-bibr" href="#bib29" data-jats-ref-type="bibr" data-jats-rid="bib29">29</a></sup> writing specifications that capture trust is challenging. A human will only trust an AS to perform in a safe manner (that is, nothing bad happens) if it clearly and demonstrably acts in such a manner. This requires the AS to not only be safe, but also to be seen as safe by the human. 
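</p>
<p>As a minimal illustration of such a specification and of the runtime monitoring that can check it (a sketch for exposition only; the property and the threshold are invented), consider a safety invariant of the 'nothing bad happens' form:</p>
<pre><code>
# Sketch of a runtime monitor for a safety specification of the
# "nothing bad happens" form; the property and threshold are invented
# purely for illustration.
MIN_GAP_M = 2.0   # the AS must never get closer than this to the object ahead

def monitor(trace):
    """Return the index of the first state violating the property, or None."""
    for i, state in enumerate(trace):
        if state["gap_ahead_m"] < MIN_GAP_M:
            return i
    return None

trace = [{"gap_ahead_m": g} for g in (12.0, 8.5, 4.2, 1.7, 0.9)]
violation = monitor(trace)
print("trace is safe" if violation is None else f"violated at step {violation}")  # violated at step 3
</code></pre>
<p>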
In the same manner, it is equally important to ensure that the AS trusts the human.<sup><a class="reference-link xref xref-bibr" href="#bib29" data-jats-ref-type="bibr" data-jats-rid="bib29">29</a></sup> To address this, specifications must go beyond typical functionality and safety aspects.</p> <p id="p-4">Engineering trustworthy and trusted AS involves different processes, technology, and skills than those required for traditional software solutions. Many practitioners in the AS or artificial intelligence (AI) domains have learned by accumulating experiences and failures across projects.<sup><a class="reference-link xref xref-bibr" href="#bib1" data-jats-ref-type="bibr" data-jats-rid="bib1">1</a></sup> Best practices have started to emerge. There is increasing evidence of the need for rigorous specification techniques for developing and deploying AI applications.<sup><a class="reference-link xref xref-bibr" href="#bib3" data-jats-ref-type="bibr" data-jats-rid="bib3">3</a></sup> Even when not life-critical, actions and decisions made by AS may have serious consequences. If we are to use them in our businesses, at doctor’s surgeries, on our roads, or in our homes, we must build AS that precisely satisfy the requirements of their stakeholders. However, specifying requirements for AS (AI in particular) remains more a craft than a science. For example, machine learning (ML) applications are often specified based on optimization and efficiency measures rather than well-specified quality requirements that relate to stakeholders needs<sup><a class="reference-link xref xref-bibr" href="#bib23" data-jats-ref-type="bibr" data-jats-rid="bib23">23</a></sup> and further research is needed.</p> <blockquote class="disp-quote" data-jats-content-type="pull-quote"> <p id="p-5">Engineering trustworthy and trusted AS involves different processes, technology, and skills than those required for traditional software solutions.</p> </blockquote> <p id="p-6">In the U.K. Research and Innovation (UKRI) Trustworthy Autonomous Systems (TAS) program, we conduct cross-disciplinary fundamental research to ensure that AS are safe, reliable, resilient, ethical, and trusted. TAS is organized around six research projects called Nodes and a Hub; each Node focuses on the individual aspects of trust in AS, such as resilience, trust, functionality, verifiability, security, and governance and regulation.</p> <p id="p-7">Undertaking a community approach, this roadmap article is the result of the “Specifying for Trustworthiness” workshop held during the September 2021 TAS All Hands Meeting, which gathered a diverse group of researchers from all parts of the TAS program. Co-authored by a representative sample of the AS community in the U.K., this article highlights the specification challenges for AS with illustrations from a representative set of domains currently being investigated within our community. The main contribution of this article is to identify key open research problems termed ‘intellectual challenges’ involved with specifying for trustworthiness in AS that cut across domains and are exacerbated by the inherent uncertainty involved with the environments in which AS need to operate. This article takes a broad view of specification, concentrating on top-level requirements including, but not limited, to functionality, safety, security, and other non-functional properties that contribute to the trustworthiness of AS. 
Also, a discussion on the formalization of these specifications has intentionally been left for the future, when the understanding of what is required to specify for trustworthiness will be more mature.</p> <p id="p-8">To motivate and present the research challenges associated with specifying for trustworthiness in AS, the rest of this article is divided into three parts. The next section discusses a number of AS domains, each with its unique specification challenges. Then, the article presents key intellectual challenges currently being investigated within our community. Finally, the article summarizes our findings.</p> </section> <section id="sec2" class="sec"> <h2 class="heading">Autonomous Systems Domains and Their Specification Challenges</h2> <p id="p-9">In this article, we classify AS domains based on two criteria: the number of autonomous agents (single or multiple) and whether humans are interacting with the AS as part of the system or the environment, following Schneiders et al.<sup><a class="reference-link xref xref-bibr" href="#bib38" data-jats-ref-type="bibr" data-jats-rid="bib38">38</a></sup> Accordingly, we divide AS domains into four categories:</p> <ul class="list" data-jats-list-type="bullet"> <li class="list-item"> <p id="p-10">A single autonomous agent (for example, automated driving, UAV)</p> </li> <li class="list-item"> <p id="p-11">A group of autonomous agents (for example, swarms)</p> </li> <li class="list-item"> <p id="p-12">An autonomous agent assisting a human (for example, AI in healthcare, human–robot interaction)</p> </li> <li class="list-item"> <p id="p-13">A group of autonomous agents collaborating with humans (for example, emergency situations, disaster relief).</p> </li> </ul> <p id="p-14">We discuss the specification challenges involved with AS using illustrations from a representative set of domains, as investigated within our community in TAS (see Table <a class="xref xref-table" href="#T1" data-jats-ref-type="table" data-jats-rid="T1">1</a>), rather than attempting to cover all possible AS domains.</p> <figure id="T1" class="table-wrap" data-jats-position="float"> <div class="caption"><span class="caption-label">Table 1.
</span></p> <div class="title">AS domains and their specification challenges.</div> </div> <div class="table-container"> <table class="table table-bordered table-condensed table-hover" data-jats-frame="hsides" data-jats-rules="rows"> <colgroup> <col align="left" valign="top" /> <col align="left" valign="top" /> <col align="left" valign="top" /> </colgroup> <thead> <tr> <th style="text-align: center;"><b>Category</b></th> <th style="text-align: center;"><b>Domain</b></th> <th style="text-align: center;"><b>Specification Challenge</b></th> </tr> </thead> <tbody> <tr> <td style="text-align: left;" rowspan="3">Single Autonomous Agent</td> <td style="text-align: left;" rowspan="2">Automated driving</td> <td style="text-align: left;">How to address the lack of machine-readable specifications that formally express acceptable driving behavior?</td> </tr> <tr> <td style="text-align: left;">How to specify the actions of other road users?</td> </tr> <tr> <td style="text-align: left;">UAV</td> <td style="text-align: left;">How to specify the ways the UAV should deal with situations that go beyond the limits of its training?</td> </tr> <tr> <td style="text-align: left;">Multiple Autonomous Agents</td> <td style="text-align: left;">Swarms</td> <td style="text-align: left;">How to specify the emergent behavior of a swarm that is a consequence of the interaction of individual agents with each other and the environment?</td> </tr> <tr> <td style="text-align: left;" rowspan="5">Autonomous Agent Assisting a Human</td> <td style="text-align: left;" rowspan="2">Human–robot Interaction</td> <td style="text-align: left;">How to specify the perceptual, reasoning, and behavioral processes of robot systems?</td> </tr> <tr> <td style="text-align: left;">How to infer human mental states interactively?</td> </tr> <tr> <td style="text-align: left;" rowspan="3">AI in healthcare</td> <td style="text-align: left;">How to specify ‘black box’ models?</td> </tr> <tr> <td style="text-align: left;">What is the role of explainability and faithfulness of the interpretation of semantics?</td> </tr> <tr> <td style="text-align: left;">What is the role of pre-trained models in pipelines?</td> </tr> <tr> <td style="text-align: left;" rowspan="2">Multiple Autonomous Agents Collaborating with Humans</td> <td style="text-align: left;" rowspan="2">Emergency situations and disaster relief</td> <td style="text-align: left;">How to specify collaboration between autonomous agents and different human agents in emergency settings?</td> </tr> <tr> <td style="text-align: left;">How to specify security where large amounts of data need to be collected, shared, and stored?</td> </tr> </tbody> </table> </div> </figure> <section id="sec3" class="inline-headings-section"> <h3 class="heading">Single autonomous agent: Automated driving, UAV.</h3> <p id="p-15" data-jats-content-type="inline-heading">Automated driving (self-driving) refers to a class of AS that varies in the extent to which they independently make decisions (SAE J3016 standard taxonomy). The higher levels of autonomy, levels 3-5, refer to functionality ranging from traffic jam chauffeur to completely hands-free driving in all conditions. Despite an explosion of activity in this domain in recent years, the majority of systems being considered for deployments depend on careful delineation of the operation design domain to make the specification of appropriate behavior tractable. Even so, the specification problem remains difficult for a number of reasons. 
Firstly, traffic regulations are written in natural language, ready for human interpretation. Although Highway Code rules are intended for legal enforcement, they are not specifications that are suitable for machines. There are typically many exceptions, context-dependent conflicting rules, and guidance of an ‘open nature,’ all of which require interpretation in context. Driving rules can often be vague or even conflicting, and may require background knowledge to interpret them in a specific context. The U.K. Highway Code Rule 163 states that after you have started an overtaking maneuver you should “move back to the left as soon as you can but do not cut in.”<sup><a class="reference-link xref xref-bibr" href="#bib6" data-jats-ref-type="bibr" data-jats-rid="bib6">6</a></sup> Translating rules of driving conduct (for example, Rule 163) into a more explicit, machine-interpretable specification that captures the appropriate behavior presents a challenge to this research area. When people are taught to perform this activity, a significant portion of the time is spent in elaborating these special cases, and much of the testing in the licensing regime is aimed at probing for uniformity of interpretation. How best to translate these human processes into the AS domain is important not only for achieving safety but also for acceptability. Secondly, driving in urban environments is an intrinsically interactive activity, involving several actors whose internal states may be opaque to the automated vehicle. As an example, the U.K. Highway Code asks drivers not to “pull out into traffic so as to cause another driver to slow down.” Without further constraint on what the other drivers could possibly do, specifying appropriate behavior becomes difficult, and any assumptions made in that process would call into question the safety of the overall system when those assumptions are violated. Thus, two key challenges in the area of automated driving are the lack of machine-readable specifications that formally express acceptable driving behavior and the need to specify the actions of other road users (Table <a class="xref xref-table" href="#T1" data-jats-ref-type="table" data-jats-rid="T1">1</a>). To some extent, these issues arise in all open environments. However, in automated driving, the task is so intricately coupled with the other actors that even the default assumptions may not be entirely clear, and the relative variation in behavior due to different modeling assumptions could be qualitatively significant.</p> <p id="p-16">A UAV or drone is a type of aerial vehicle that is capable of autonomous flight without a pilot on board. UAVs are increasingly being used in diverse applications, such as logistics services, agriculture, emergency response, and security. Specification of the operational environment of UAVs is often challenging due to the complexity and uncertainty of the environments that UAVs need to operate in. For instance, in parcel delivery using UAVs in urban environments, there can be uncertain flight conditions (for example, wind gradients), and highly dynamic and uncertain airspace (for instance, other UAVs in operation). Recent advances in ML offer the potential to increase the autonomy of UAVs in uncertain environments by allowing them to learn from experience.
For example, ML can be used to enable UAVs to learn novel maneuvers to achieve perched landings in uncertain windy conditions.<sup><a class="reference-link xref xref-bibr" href="#bib12" data-jats-ref-type="bibr" data-jats-rid="bib12">12</a></sup> In these contexts, a key challenge is specifying how the system should deal with situations that go beyond the limits of its training (Table <a class="xref xref-table" href="#T1" data-jats-ref-type="table" data-jats-rid="T1">1</a>).</p> </section> <section id="sec4" class="inline-headings-section"> <h3 class="heading">Multiple autonomous agents: Swarm robotics.</h3> <p id="p-17" data-jats-content-type="inline-heading">Swarm robotics provides an approach to the coordination of large numbers of robots, which is inspired by the observation of social insects.<sup><a class="reference-link xref xref-bibr" href="#bib37" data-jats-ref-type="bibr" data-jats-rid="bib37">37</a></sup> Three desirable properties in any swarm robotics system are robustness, flexibility, and scalability. The functionality of a swarm is emergent (for example, aggregation, coherent ad hoc networks, taxis, obstacle avoidance, and object encapsulation)<sup><a class="reference-link xref xref-bibr" href="#bib45" data-jats-ref-type="bibr" data-jats-rid="bib45">45</a></sup> and evolves based on the capabilities and number of robots used. The overall behaviors of a swarm are not explicitly engineered in the system, as they might be in a collection of centrally controlled robots, but they are an emergent consequence of the interaction of individual agents with each other and the environment. This emergent functionality poses a challenge for specification. The properties of individual robots can be specified in a conventional manner, yet it is the emergent behaviors of the swarm that determine the performance of the system as a whole. The challenge is to develop specification approaches that specify properties at the swarm level that can be used to develop, verify, and monitor swarm robotic systems.</p> </section> <section id="sec5" class="inline-headings-section"> <h3 class="heading">Autonomous agent assisting a human: Human–robot interaction, AI in healthcare.</h3> <p id="p-18" data-jats-content-type="inline-heading">Interactive robot systems aim to complete their tasks while explicitly considering the states, goals, and intentions of the human agents they collaborate with, and to calibrate the trust humans have in them to an appropriate level. This form of human-in-the-loop, real-time interaction is required in several application domains, including assistive robotics for activities of daily living,<sup><a class="reference-link xref xref-bibr" href="#bib15" data-jats-ref-type="bibr" data-jats-rid="bib15">15</a></sup> healthcare robotics, shared control of smart mobility devices,<sup><a class="reference-link xref xref-bibr" href="#bib40" data-jats-ref-type="bibr" data-jats-rid="bib40">40</a></sup> and collaborative manufacturing. Most specification challenges arise from the need to provide specifications for the perceptual, reasoning, and behavioral processes of robot systems that will need to acquire models of, and deal with, the high variability exhibited in human behavior.
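A common building block for such models is recursive Bayesian estimation of a human’s latent intent from noisy observations; the following is a minimal sketch, assuming a hypothetical intent set and a toy sensor model:</p> <pre><code>
# Illustrative sketch only: a tiny Bayesian update over a hypothetical set of human intents.
INTENTS = ["reach_cup", "reach_phone", "rest"]  # assumed intent set, for illustration

def update_belief(belief, observation, likelihood):
    """One Bayesian step: P(intent | obs) is proportional to P(obs | intent) * P(intent)."""
    posterior = {i: likelihood(observation, i) * belief[i] for i in INTENTS}
    total = sum(posterior.values())
    # If no intent explains the observation, keep the previous belief.
    return belief if total == 0 else {i: p / total for i, p in posterior.items()}

def toy_likelihood(obs, intent):
    # Hypothetical sensor model: how well the observed motion matches each intent.
    return obs.get(intent, 0.05)

belief = {i: 1 / len(INTENTS) for i in INTENTS}            # uniform prior
belief = update_belief(belief, {"reach_cup": 0.7}, toy_likelihood)
print(max(belief, key=belief.get))                          # most probable intent so far
</code></pre> <p>Even a model this simple raises specification questions: how confident must the belief be before the robot acts on it, and how is the model allowed to adapt as the person changes?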
While several human-in-the-loop systems employ mental state inference, the necessity for interactively performing such inference (including beliefs and intentions), typically through sparse and/or noisy sensor data from multimodal interfaces, imposes further challenges for the principled specification of human factors and data-driven adaptation processes in robots operating in close proximity to humans, where safety and reliability are of particular importance.</p> <p id="p-19">Healthcare is a broad application domain which already enjoys the many benefits arising from the use of AI and AI-enabled autonomy. This has ranged from more accurate and automated diagnostics to a greater degree of autonomy in robot surgery, as well as entirely new approaches to drug discovery and design. The use of AI in medical diagnosis has advanced to such an extent that in some settings, for example, mammography screening, automated interpretation seems to match human expertise in some trials. However, there remains a gap in test accuracy. It has been argued that the automated systems are not sufficiently specific to replace radiologist double reading in screening programs.<sup><a class="reference-link xref xref-bibr" href="#bib14" data-jats-ref-type="bibr" data-jats-rid="bib14">14</a></sup> These gaps also highlight the main specification challenges in this domain. Historically, the human expertise in this domain has not been explicitly codified, so it can be hard to enumerate desired characteristics. It is clear that the specifications must include notions of invariance to instrument and operator variations, coverage of condition and severity level, and so on. Beyond that, the semantics of the biological features used to make fine determinations are subject to ambiguity and informality, as well as to variability across experts and systems. Moreover, the use of deep learning to achieve automated interpretation brings with it the need for explainability. This manifests itself in the challenge of guarding against shortcuts,<sup><a class="reference-link xref xref-bibr" href="#bib4" data-jats-ref-type="bibr" data-jats-rid="bib4">4</a></sup> wherein the AI diagnostic system achieves high accuracy by exploiting irrelevant side variables instead of identifying the primary problem (for example, radiographic COVID-19 detection using AI).<sup><a class="reference-link xref xref-bibr" href="#bib4" data-jats-ref-type="bibr" data-jats-rid="bib4">4</a></sup> The specific challenge here is how to specify with respect to ‘black box’ models. In this regard, we can highlight the role of explainability and faithfulness of interpretation of semantics, and the role of pre-trained models in pipelines (see Table <a class="xref xref-table" href="#T1" data-jats-ref-type="table" data-jats-rid="T1">1</a>).</p> </section> <section id="sec6" class="inline-headings-section"> <h3 class="heading">Multiple autonomous agents collaborating with humans: Emergency situations, disaster relief.</h3> <p id="p-20" data-jats-content-type="inline-heading">Emergency situations evolve dynamically and can differ in terms of the type of incident, its magnitude, additional hazards, and the number and location of injured people. They are also characterized by urgency; they require a response in the shortest timeframe possible and call for a coordinated response of emergency services and supporting organizations, which are increasingly making use of AS.
This means that successful resolutions depend not only on effective collaboration between humans<sup><a class="reference-link xref xref-bibr" href="#bib25" data-jats-ref-type="bibr" data-jats-rid="bib25">25</a></sup> but also between humans and AS. Thus, there is a need to specify both functional requirements and the social, legal, ethical, empathic, and cultural (SLEEC) rules and norms that govern an emergency scenario. AS in emergency response contexts vary hugely; as such, the kinds of SLEEC issues pertaining to them must be incorporated into the design process rather than implemented afterward. This suggests a shift from a static design challenge toward the need to specify for adaptation to the diversity of emergency actors and complexity of emergency contexts, which are time-sensitive and involve states of exception not common in other open AS environments, such as autonomous vehicles. In addition, to enhance collaboration between autonomous agents and different human agents in emergencies, specifying human behavior remains one of the main challenges in emergency settings.</p> <p id="p-21">There are also challenges for specifying security in the context of disaster relief. A large part of this comes from the vast amounts of data that needs to be collected, shared, and stored between different agencies and individuals. Securing a collaborative information management system is divided between technical forms of security, such as firewalling and encryption, and social forms of security, such as trust. To provide security to a system, both aspects must be addressed in relation to each other within a specification.</p> </section> </section> <section id="sec7" class="sec"> <h2 class="heading">Intellectual Challenges for the Autonomous Systems Community</h2> <p id="p-22">The preceding section discussed specification challenges unique to a representative set of domains investigated within our community. Now we discuss 10 <i>intellectual challenges</i> involved with specifying for trustworthiness in AS that can cut across domains and are exacerbated by the inherent uncertainty involved with the environments in which AS need to operate. These challenges were identified during stimulating discussions among the speakers and participants of the breakout groups at the “Specifying for Trustworthiness” workshop.</p> <figure id="F1" class="fig" data-jats-position="float"> <div class="image-container"><a class="fresco" title="View the full image" href="https://cacm-acm-org-preprod.go-vip.net/wp-content/uploads/2024/01/3624699_fig01.jpg" data-fresco-caption="Figure 1.: Intellectual challenges for the AS community." data-fresco-group="figure" data-fresco-options="fit: 'width', ui: 'outside', thumbnails: false, loop: true, position: true, overflow: true, preload: false"> <img decoding="async" class="graphic" src="https://cacm-acm-org-preprod.go-vip.net/wp-content/uploads/2024/01/3624699_fig01.jpg" alt="Intellectual challenges for the AS community." data-image-id="F1" data-image-type="figure" /> </a></div><figcaption> <div class="title">Intellectual challenges for the AS community.</div> <div class="figcaption-footer"> </div> </figcaption></figure> <p id="p-23">Intellectual challenges 1–6 are in the six <i>focus areas</i> of trust in AS (that is, resilience, trust, functionality, verifiability, security, and governance and regulation), as identified by their respective speakers. 
Meanwhile, the remaining four challenges have either a <i>common</i> focus (7) across the TAS program, or they are <i>evolving</i> in nature (8–10) (see Figure <a class="xref xref-fig" href="#F1" data-jats-ref-type="fig" data-jats-rid="F1">1</a>). For each challenge we provide an overview, identify high-priority research questions, and suggest future directions.</p> <p id="p-24">Many of the specification challenges to be discussed are shared by systems such as multi-agent systems, cyber-physical-social systems, or AI-based systems. Autonomy is an important characteristic of these systems and so is the need for trustworthiness. Specification challenges have also received a lot of attention in ‘non-AS’, for example, safety-critical systems. Yet, many of the challenges are exacerbated in AS because of the inherent uncertainty of their operating environments: They are long-lived, continuously running systems that interact with the environment and humans in ways that can hardly be fully anticipated at design time and continuously evolve at runtime. In other words, while those challenges are not specific to AS, AS exacerbate them.</p> <section id="sec8" class="inline-headings-section"> <h3 class="heading"><span class="caption-label">1. </span>How to specify human behavior for human–AS cooperation?</h3> <p id="p-25" data-jats-content-type="inline-heading">How to model human behavior to enable cooperation with AS is challenging but crucial for the resilience of the system as a whole. It is the diversity in human enactment that drives uncertainty about what people do and do not do, and subsequently, the way human behavior can be specified. Knowing the mental state of others enables AS to steer a cooperation that is consistent with the needs of the AS, as well as to respond to the needs of human agents in an appropriate manner.</p> <p id="p-26">Different theories of human behavior explain diversity in human action in different ways, by identifying various determinants of human behavior. For example, a behaviorist approach suggests that every behavior is a response to a certain stimulus.<sup><a class="reference-link xref xref-bibr" href="#bib21" data-jats-ref-type="bibr" data-jats-rid="bib21">21</a></sup> Although true, this approach is restrictive in addressing the complexity of human behavior, as well as the different ways that human behavior develops during cooperation.
To grasp that humans are endowed with purposes and goals that affect each other, the concept of joint action can be introduced as “a social interaction whereby two or more individuals coordinate their actions in space and time to bring about change in the environment.”<sup><a class="reference-link xref xref-bibr" href="#bib39" data-jats-ref-type="bibr" data-jats-rid="bib39">39</a></sup> Adapting it to human–robot interaction, this approach suggests an interplay between humans and AS, such that what matters is not only how the AS understands the system but also how humans understand the way the autonomous agent behaves and is willing to cooperate.<sup><a class="reference-link xref xref-bibr" href="#bib17" data-jats-ref-type="bibr" data-jats-rid="bib17">17</a></sup> Thus, cooperation arises from a shared understanding between agents, which is a challenge to specify.</p> <p id="p-27">The social identity approach<sup><a class="reference-link xref xref-bibr" href="#bib41" data-jats-ref-type="bibr" data-jats-rid="bib41">41</a></sup> builds on this concept of a shared understanding by providing an explanation of human behavior that focuses on how social structures act upon cognition. It proposes that, alongside our personal identity (our personality, who we are), we also have multiple social identities based on social categories and groups. Previous research has shown that social identities influence people’s relationship with technology.<sup><a class="reference-link xref xref-bibr" href="#bib30" data-jats-ref-type="bibr" data-jats-rid="bib30">30</a></sup> Sharing a social identity initiates pro-social behaviors, such as helping behaviors in emergency situations.<sup><a class="reference-link xref xref-bibr" href="#bib7" data-jats-ref-type="bibr" data-jats-rid="bib7">7</a></sup> People adapt their behavior in line with their shared identities, which, in turn, enhances resilience. Specifying social identities to enable cooperation is challenging. It requires answering questions such as: How do we represent different identities and how do we reason about them? Following the social identity approach to specify identities for human-autonomous agent cooperation requires an investigation of how to operationalize social identity, a psychological state, into software embedded within AS.</p> </section> <section id="sec9" class="inline-headings-section"> <h3 class="heading"><span class="caption-label">2. </span>How to specify data-driven adaptation processes and human factors?</h3> <p id="p-28" data-jats-content-type="inline-heading">Specifying, designing, implementing, and deploying interactive robot systems that are trustworthy for use in scenarios where humans and robots collaborate in close proximity is challenging, given that safety and reliability in such scenarios are of particular importance. Example scenarios include assisting people with activities of daily living, such as mobility<sup><a class="reference-link xref xref-bibr" href="#bib40" data-jats-ref-type="bibr" data-jats-rid="bib40">40</a></sup> and dressing;<sup><a class="reference-link xref xref-bibr" href="#bib15" data-jats-ref-type="bibr" data-jats-rid="bib15">15</a></sup> rehabilitation robotics; adaptive assistance in intelligent vehicles; and robot assistants in care homes and hospital environments.
The intellectual challenge the AS community faces is the specification, design, and implementation of trustworthy perceptual, cognitive, and behavior-generation processes that explicitly incorporate parametrizable models of human skills, beliefs, and intentions.<sup><a class="reference-link xref xref-bibr" href="#bib5" data-jats-ref-type="bibr" data-jats-rid="bib5">5</a></sup> These models are necessary for interactive assistive systems since they need to decide not only how but also when to assist.<sup><a class="reference-link xref xref-bibr" href="#bib16" data-jats-ref-type="bibr" data-jats-rid="bib16">16</a></sup> Given the large variability of human behavior, the parameters of these user models need to be acquired interactively, typically from sparse and potentially noisy sensor data, a particularly challenging inverse problem. An additional challenge is introduced in the case of long-term human–robot interaction, where the assistive system needs to learn and take into consideration human developmental aspects, typically manifested in computational learning terms as model drift. As an example, consider an assistive mobility device for children with disabilities:<sup><a class="reference-link xref xref-bibr" href="#bib40" data-jats-ref-type="bibr" data-jats-rid="bib40">40</a></sup> as the child’s perceptual, cognitive, emotional, and motor skills develop over time, their requirements for the type, amount, and frequency of the provided assistance will need to evolve. Similarly, when assisting an elderly person or someone recovering from surgery, the distributions of the human data that the robot sensors collect will vary not only according to the context but also over time. Depending on the human participant, and their underlying time-varying physiological and behavioral particularities, model drift can be sudden, gradual, or recurring, posing significant challenges to the underlying modeling methods. Principled methods for incorporating long-term human factors into the specification, design, and implementation of assistive systems that adapt and personalize their behavior for the benefit of their human collaborator remain an open research challenge.</p> </section> <section id="sec10" class="inline-headings-section"> <h3 class="heading"><span class="caption-label">3. </span>What standards and assurance processes are needed for AS with evolving functionality?</h3> <p id="p-29" data-jats-content-type="inline-heading">AS with <i>evolving functionality</i>—the ability to adapt and change in function over time—pose significant challenges to current processes for specifying functionality. Most conventional processes for defining system requirements assume that these are fixed and can be defined in a complete and precise manner before the system goes into operation. Existing standards and regulations do not accommodate the adaptive nature of AS with evolving functionality. This is a key limitation<sup><a class="reference-link xref xref-bibr" href="#bib11" data-jats-ref-type="bibr" data-jats-rid="bib11">11</a></sup> preventing the deployment of promising applications, such as swarm robots that adapt through emergent behavior and UAVs with ML-based flight-control systems.</p> <p id="p-30">For airborne systems and in particular for UAVs, several industry standards and regulations have been introduced to specify requirements for system design and safe operation—for example, DO-178C, DO-254, ED279, ARP4761, NATO STANAG 4671, and CAP 722.
However, none of these standards or regulations cover the types of ML-based systems currently being developed to enable UAVs to operate autonomously in uncertain environments.</p> <p id="p-31">The ability to adapt and learn from experience is important for enabling AS to operate in real-world environments. When one considers existing industry standards, they are either implicitly or explicitly based on the V&V model, which moves from requirements through design into implementation, testing, and finally deployment.<sup><a class="reference-link xref xref-bibr" href="#bib26" data-jats-ref-type="bibr" data-jats-rid="bib26">26</a></sup> However, this model is unlikely to suit systems with the ability to adapt their functionality in operation (for example, through interaction with other agents and the environment, as is the case with swarms, or through experience-driven adaptation, as is the case with ML). AS with evolving functionality follow a different, more iterative life cycle. Thus, there is a need for new standards and assurance processes that extend beyond design time and allow continuous certification at runtime.</p> </section> <section id="sec11" class="inline-headings-section"> <h3 class="heading"><span class="caption-label">4. </span>How can AS be specified for verifiability?</h3> <p id="p-32" data-jats-content-type="inline-heading">For a system to be <i>verifiable</i>, a person or a tool needs to be able to check its correctness<sup><a class="reference-link xref xref-bibr" href="#bib13" data-jats-ref-type="bibr" data-jats-rid="bib13">13</a></sup> with respect to its requirements and specification. The main challenge is in specifying and designing the system in a way that makes this process as easy and intuitive as possible. For AS in particular, specific challenges include capturing and formalizing requirements, including functionality, safety, security, performance and, beyond these, any additional non-functional requirements purely needed to demonstrate trustworthiness; handling flexibility, adaptation, and learning; and managing the inherent complexity and heterogeneity of both the AS and the environment it operates in.</p> <p id="p-33">Specifications must represent the different aspects of the overall system in a way that is natural to domain experts, facilitates modeling and analysis, provides transparency about how the AS works, and offers insights into the reasons that motivate its decisions. To specify for verifiability, a specification framework will need to offer a variety of domain abstractions to represent the diverse, flexible, and possibly evolving requirements AS are expected to satisfy. Furthermore, the underlying verification framework should connect all these domain abstractions to allow an analysis of their interaction. This is a key challenge in specifying for verifiability in AS.</p> <p id="p-34">AS can be distinguished using two criteria: the degree of autonomy and adaptation, and the criticality of the application (which can range from harmless to safety-critical). We can consider which techniques or their combinations are needed for V&V at the different stages of the system life cycle. The need for runtime V&V emerges when AS operate in uncontrolled environments, where there is a need for autonomy, learning, and adaptation. There, a significant challenge is finding rigorous techniques for the specification and V&V of safety-critical AS, where requirements are often vague, flexible, and may contain uncertainty and fuzziness.
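One way such vagueness is sometimes handled is to replace a hard pass/fail check with a quantitative degree of satisfaction, in the spirit of robustness semantics for temporal logics; a minimal sketch, assuming a hypothetical distance-keeping requirement and threshold:</p> <pre><code>
# Illustrative sketch only: scoring a trace against the hypothetical requirement
# "always keep at least 2 m from any pedestrian" with a real-valued margin
# instead of a binary verdict.
SAFE_DISTANCE_M = 2.0  # assumed threshold, for illustration

def robustness(distances_m):
    """Worst-case margin over the trace: positive means satisfied with room to
    spare, negative means violated, and the magnitude says by how much."""
    return min(d - SAFE_DISTANCE_M for d in distances_m)

trace = [3.1, 2.6, 2.2, 1.8, 2.4]   # hypothetical minimum pedestrian distances per step
score = robustness(trace)
print("satisfied" if score >= 0 else f"violated by {-score:.1f} m")
</code></pre> <p>A margin of this kind can inform design-time testing, by searching for low-robustness traces, as well as runtime monitoring of the deployed system.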
V&V at design time can only provide a partial solution, and more research is needed to understand how best to specify and verify learning and adaptive systems by combining design-time with runtime techniques. Finally, identifying the design principles that enable V&V of AS is a key prerequisite for promoting verifiability to a first-class design goal alongside functionality, safety, security, and performance.</p> </section> <section id="sec12" class="inline-headings-section"> <h3 class="heading"><span class="caption-label">5. </span>How to specify security from a social perspective?</h3> <p id="p-35" data-jats-content-type="inline-heading">There are technical sides to security, but there are also social dimensions that matter when considering how an AS establishes its status as secure. In this context, security overlaps with trust. One can only be assured a system is secure if one trusts that system. Public trust is a complex issue, shot through with media, emotions, politics, and competing interests. How do we go about specifying security in a social sense?</p> <p id="p-36">On the technical side, there are fairly specific definitions for specification which can be grasped and measured. From the social perspective, the possibility of specification relies on a network of shared assumptions and beliefs that are difficult to unify. In fact, much of the value from engagement over social specifications derives from the diversity and difference. A predominant concern in social aspects of security is where data is shared between systems (social-material interactions)—that is, whenever an AS communicates with a human being or an aspect of the environment. Although these interactions have technical answers, finding answers that consider social science perspectives requires collaboration and agile methods to facilitate that collaboration.</p> <p id="p-37">The human dimension means that it is not enough to specify technical components. Specifications must also capture beliefs, desires, fears, and, at times, misinformation with respect to how those are understood, regarded, and perceived by the public. For example, in what ways can we regard pedestrians as passive users of automated vehicles? How are automated vehicles regarded by the public, and how are pedestrians involved in automated mobility?</p> <p id="p-38">The ethical challenges that emerge for AS security also relate to the legal and social ones. The difficulty centers on how to create regulations and specifications that work on a technical level, are also useful socially, and facilitate responsiveness to new technologies that are neither simply techno-phobic nor passively accepting. Doing so must involve both innovation and public input, so that the technology developed works for everyone. The ethical, legal, and social implications (ELSI) framework<sup><a class="reference-link xref xref-bibr" href="#bib8" data-jats-ref-type="bibr" data-jats-rid="bib8">8</a></sup> aims to engage designers, engineers, and public bodies in answering these questions. ELSI is an inherently cross-disciplinary set of approaches for tackling AS security and its many interrelated and entangled aspects. Specifying security requires connection, collaboration, and agile ethical methods.</p> </section> <section id="sec13" class="inline-headings-section"> <h3 class="heading"><span class="caption-label">6.
</span>How to establish environmental properties and real intent behind requirements in governance frameworks?</h3> <p id="p-39" data-jats-content-type="inline-heading">Computer scientists treat specifications as precise objects, often derived from requirements by purging features such that they are defined with respect to environment properties that can be relied on regardless of the machine’s behavior. Emerging AS applications in human-centered environments can challenge this way of thinking, particularly because the environment properties may not be fully understood or because it is hard to establish if the real intent behind a requirement can be verified. These gaps should be addressed in governance frameworks to engender trust.</p> <p id="p-40">For instance, in all the domains mentioned, we are increasingly seeing systems that are data-first and subject to continuous deployment. This has the interesting consequence that sometimes the task requirements cannot be explicitly stated. Instead, they are only given in terms of instances of observed human behavior,<sup><a class="reference-link xref xref-bibr" href="#bib42" data-jats-ref-type="bibr" data-jats-rid="bib42">42</a></sup> which represent positive examples. An example in medical diagnostics is when an AI-based AS has only a high-level label from the human radiologist, to be matched by the model, rather than detailed causal theories or justifications.<sup><a class="reference-link xref xref-bibr" href="#bib14" data-jats-ref-type="bibr" data-jats-rid="bib14">14</a></sup> We see this as a crucial area for future development, as existing workflows depend on human interpretation of rules in important ways, whereas when AS make the same decisions, there is scope for significant disruption of these workflows due to potential gaps that become exposed.</p> <p id="p-41">Furthermore, many emerging concerns, such as fairness, are not only difficult to formalize in the sense of software specification, but also their many definitions can be conflicting such that it is impossible to satisfy all of them in a given system.<sup><a class="reference-link xref xref-bibr" href="#bib36" data-jats-ref-type="bibr" data-jats-rid="bib36">36</a></sup></p> <blockquote class="disp-quote" data-jats-content-type="pull-quote"> <p id="p-42">Many emerging concerns, such as fairness, are not only difficult to formalize in the sense of software specification, but also their many definitions can be conflicting.</p> </blockquote> <p id="p-43">AS of the future will need a combination of informal and formal mechanisms for governance. In domains such as automated vehicles, system trustworthiness may require a complete ecosystem approach<sup><a class="reference-link xref xref-bibr" href="#bib27" data-jats-ref-type="bibr" data-jats-rid="bib27">27</a></sup> involving community-defined scenario libraries, enabling the greater use of simulation in verification, and independent audits by third parties.
This calls for developing new computational tools for performance and error characterization, systematic adversarial testing with respect to a range of different specification types, and causal explanations that address not only a single instance of a decision but better expose informational dependencies that are useful for identifying edge cases and delineating operational design domains.</p> <p id="p-44">In addition to these technical tools, there is a need to understand the human-machine context in a more holistic manner, as this is really the target of effective governance. People’s trust in an AS is not solely determined by technical reliability. Instead, the expectations of responsibility and accountability are associated with the human team involved in the system’s design and deployment and the organizational design behind the system. A vast majority of system failures arise from mistakes made in this ‘outer loop.’ Therefore, effective regulations must begin with a comprehensive mapping of responsibilities that must be governed, so that computational solutions can be tailored to address these needs. Furthermore, there is a need for ethnographic understanding of AS being used in context, which could help focus technical effort on the real barriers to trustworthiness.</p> </section> <section id="sec14" class="inline-headings-section"> <h3 class="heading"><span class="caption-label">7. </span>How can explainability by design contribute to AS specifications?</h3> <p id="p-45" data-jats-content-type="inline-heading">There are increasing calls for explainability in AS, with emerging frameworks and guidance<sup><a class="reference-link xref xref-bibr" href="#bib18" data-jats-ref-type="bibr" data-jats-rid="bib18">18</a></sup> pointing to the need for AI to provide explanations about decision making. A challenge with specifying such explainability is that existing frameworks and guidance are not prescriptive: What is an actual explanation and how should one be constructed? Furthermore, frameworks and guidance tend to be concerned with AI in general, not AS.</p> <p id="p-46">A case study addressing regulatory requirements on explainability of automated decisions in the context of a loan application<sup><a class="reference-link xref xref-bibr" href="#bib22" data-jats-ref-type="bibr" data-jats-rid="bib22">22</a></sup> provided foundations for a systematic approach. Within this context, explanations can act as external detective controls, as they provide specific information to justify the decision reached and help the user take corrective actions.<sup><a class="reference-link xref xref-bibr" href="#bib43" data-jats-ref-type="bibr" data-jats-rid="bib43">43</a></sup> But explanations can also act as internal detective controls, that is, a mechanism for organizations to demonstrate compliance with the regulatory frameworks they have to implement. The study and design of AS includes many facets: not only black-box or grey-box AI systems, but also the various software and hardware components of the system, the curation and cleansing of datasets used for training and validation, the governance of such systems, their user interface, and, crucially, the users of such systems, with a view to ensuring that they do not harm but benefit these users and society in general. There is typically a range of stakeholders involved, from the system designers to their hosts and/or owners, their users (consumers and operators), third parties, and, increasingly, regulators.
In this context, many questions related to trustworthy AS must be addressed holistically, including:</p> <ul class="list" data-jats-list-type="bullet"> <li class="list-item"> <p id="p-47">What is an actual explanation and how should one be constructed?</p> </li> <li class="list-item"> <p id="p-48">What is the purpose of an explanation?</p> </li> <li class="list-item"> <p id="p-49">Who is the audience of an explanation?</p> </li> <li class="list-item"> <p id="p-50">What information should it contain?<sup><a class="reference-link xref xref-bibr" href="#bib22" data-jats-ref-type="bibr" data-jats-rid="bib22">22</a></sup><sup>,</sup><sup><a class="reference-link xref xref-bibr" href="#bib43" data-jats-ref-type="bibr" data-jats-rid="bib43">43</a></sup></p> </li> </ul> <p id="p-51">It no longer suffices to focus on the explainability of a black-box decision system. Its behavior must be explained, in more or less detail, in the context of the overall AS. However, to adequately address these questions, explainability should not be seen as an afterthought, but as an integral part of the specification and design of a system, leading to explainability requirements being given the same level of importance as all other aspects of a system.</p> <p id="p-52">In the context of trustworthy AS, emerging AS regulations could be used to drive the socio-technical analysis of explainability. A particular emphasis would have to be on the autonomy and the handoff between systems and humans that characterizes trustworthy AS. The audience of explanations will also be critical, from users and consumers to businesses, organizations, and regulators. Finally, considerations for post-mortem explanations, in case of crash or disaster situations involving AS, should lead to adequate architectural design for explainability.</p> <blockquote class="disp-quote" data-jats-content-type="pull-quote"> <p id="p-53">In the context of trustworthy AS, emerging AS regulations could be used to drive the socio-technical analysis of explainability.</p> </blockquote> </section> <section id="sec15" class="inline-headings-section"> <h3 class="heading"><span class="caption-label">8. </span>How to evolve specifications?</h3> <p id="p-54" data-jats-content-type="inline-heading">Every typical AS undergoes changes over its lifetime that require going beyond an initially specified spectrum of operation—despite the observation that this spectrum is typically quite large for AS in the first place. The evolution of trustworthy AS may concern changes in the requirements of their functional or non-functional properties, changes of the environment that the AS operate in, and changes in the trust of users and third parties towards the AS.</p> <p id="p-55">Initial specifications of the AS may no longer reflect the desired properties of the system, or they may fail to accurately represent its environment. The evolution of specifications presents challenges in balancing the system’s autonomy.</p> <p id="p-56">While any non-trivial system requires evolution and maintenance,<sup><a class="reference-link xref xref-bibr" href="#bib33" data-jats-ref-type="bibr" data-jats-rid="bib33">33</a></sup> some challenges are exacerbated for trustworthy AS. As an example, observed changes in trust towards the AS might require changes to behavior specifications, even if the AS operations are perfectly safe. Conversely, required changes to specifications might have negative impacts on future trust toward the AS.
New methods will be required to efficiently deal with the various dimensions of trust in the evolution of specifications.</p> <p id="p-57">One dimension of trust relates to transparency toward developers of AS specifications. Approaches that compare evolving specifications on a syntactical level, as currently done for code, or based on metrics, as currently done for AI models, are unlikely to be sufficient for effective maintenance. Analysis will need to scale beyond syntactic differences to also include semantic differences<sup><a class="reference-link xref xref-bibr" href="#bib31" data-jats-ref-type="bibr" data-jats-rid="bib31">31</a></sup> and allow for efficient analysis of the impact of changes on the level of systems rather than artifacts. New techniques to compare specifications of AS are required that identify, present, and explain differences as well as their potential impact on the system’s trustworthiness.</p> </section> <section id="sec16" class="inline-headings-section"> <h3 class="heading"><span class="caption-label">9. </span>How to address incompleteness of specifications of AS?</h3> <p id="p-58" data-jats-content-type="inline-heading">Incompleteness is a common property of specifications. Only the use of suitable abstractions allows for coping with the complexity of systems.<sup><a class="reference-link xref xref-bibr" href="#bib28" data-jats-ref-type="bibr" data-jats-rid="bib28">28</a></sup> However, there is an important difference between the incompleteness introduced by abstraction (the process of eliminating unnecessary detail to focus, for example, on behavioral, structural, or security-related aspects of a system) and the incompleteness related to the purpose of the specification—that is, the faithful representation of the system in an abstraction.</p> <p id="p-59">On the one hand, if the purpose of creating and analyzing a specification is to examine an AS and to learn about possible constraints, then incompleteness of the AS representation in the specification is important as it allows for obtaining feedback with low investment in specification development<sup><a class="reference-link xref xref-bibr" href="#bib24" data-jats-ref-type="bibr" data-jats-rid="bib24">24</a></sup>—for example, for the reduction of ambiguities. On the other hand, if the purpose of the specification is to prove a property, then incompleteness of the AS representation may lead to incorrect analysis results manifesting in false positives or false negatives.
False positives are often treated by adding the missing knowledge to the specification of AS—for example, verifying the specification of an infusion pump reported a false positive due to incompleteness.<sup><a class="reference-link xref xref-bibr" href="#bib20" data-jats-ref-type="bibr" data-jats-rid="bib20">20</a></sup> The specification had to be changed to a “much more complex”<sup><a class="reference-link xref xref-bibr" href="#bib20" data-jats-ref-type="bibr" data-jats-rid="bib20">20</a></sup> one to remove the false positive.</p> <p id="p-60">One way to address incompleteness is with partial models,<sup><a class="reference-link xref xref-bibr" href="#bib9" data-jats-ref-type="bibr" data-jats-rid="bib9">9</a></sup><sup>,</sup><sup><a class="reference-link xref xref-bibr" href="#bib10" data-jats-ref-type="bibr" data-jats-rid="bib10">10</a></sup><sup>,</sup><sup><a class="reference-link xref xref-bibr" href="#bib44" data-jats-ref-type="bibr" data-jats-rid="bib44">44</a></sup> where models and analyses are extended with modalities qualifying their completeness. The various approaches provide analysis of either syntactic properties<sup><a class="reference-link xref xref-bibr" href="#bib9" data-jats-ref-type="bibr" data-jats-rid="bib9">9</a></sup> or behavior refinements.<sup><a class="reference-link xref xref-bibr" href="#bib10" data-jats-ref-type="bibr" data-jats-rid="bib10">10</a></sup><sup>,</sup><sup><a class="reference-link xref xref-bibr" href="#bib44" data-jats-ref-type="bibr" data-jats-rid="bib44">44</a></sup> Combinations and extensions to rich specification languages for AS are part of this research challenge.</p> <p id="p-61">In addition to analysis tasks, specifications are also used in synthesis tasks; this is where the incompleteness of AS specifications can manifest itself in the construction of biased or incorrect systems. As an example, consider the specification of a robot operating in a warehouse.<sup><a class="reference-link xref xref-bibr" href="#bib32" data-jats-ref-type="bibr" data-jats-rid="bib32">32</a></sup> The specification requires that the robot never hits a wall. With no assumptions about the environment, the synthesizer would take the worst-case view, that is, walls move and hit the robot, and consequently report that the specification is not realizable and no implementation exists. Adding the assumption that walls cannot move as an environment constraint changes the outcome of the synthesis. Interestingly, when formulating requirements for humans, common sense allows us to cope with this type of incompleteness. However, the automated analysis of specifications for AS brings with it the challenge of identifying and handling (all) areas of incompleteness.</p> </section> <section id="sec17" class="inline-headings-section"> <h3 class="heading"><span class="caption-label">10. </span>How to specify competing demands and other agents’ behavior?</h3> <p id="p-62" data-jats-content-type="inline-heading">Conventional approaches to V&V for AS may seek to attain coverage against a specification to demonstrate assurance of functionality and compliance with safety regulations or legal frameworks. Such properties may be derived from existing legal or regulatory frameworks, for example, the U.K. 
Highway Code for driving, which can then be converted into formal expressions for automatic checking.<sup><a class="reference-link xref xref-bibr" href="#bib19" data-jats-ref-type="bibr" data-jats-rid="bib19">19</a></sup></p> <p id="p-63">But optimal safety does not imply optimal trust, and just because an AS follows rules does not mean it will be accepted as a trustworthy system in human society. Other factors of trustworthiness should be considered, such as reliability, robustness, cooperation, and performance. We can also say that strictly following safety rules may even be detrimental to other trustworthiness properties, for example, performance. Consider an automated vehicle trying to make progress through a busy market square full of people slowly walking across the road, paying little heed to the usual conventions of road conduct. The <i>safest</i> option for the AS is to wait until the route ahead is completely clear before moving on, as this option does not endanger any other road user. However, <i>better performance</i> may be achieved by creeping forward to improve the likelihood of making progress. Driving, then, is much more than following safety rules, which makes this a particularly hard specification challenge. In this scenario an assertive driving style would make more progress than a risk-averse one.</p> <p id="p-64">In reality there will be significantly more considerations than just safety and performance, but this example illustrates the principle of conflicting demands between assessment standards. Consideration of other agents, such as properties of fairness or cooperation, would lead to a more trustworthy system. Additionally, the interaction of AS with people may require insight into social norms for which there is no written standard by which they can be judged. Will the task of specification first require a codex of social interaction norms to be drawn together to add to the standards by which trust can be measured? Specifications would need to be written with reference to these standards, regulations, and ethical principles, some of which do not currently exist, to ensure that any assessment captures the full spectrum of these trustworthiness criteria.</p> </section> </section> <section id="sec18" class="sec"> <h2 class="heading">Conclusion</h2> <p id="p-65">As autonomous systems play greater roles in our daily lives and interact more closely with humans, we need to build systems worthy of trust regarding safety, security, and other non-functional properties. In this article, we have first examined AS domains of different levels of maturity and then identified their specification challenges and related research directions. One of these challenges is the formalization of knowledge easily grasped by humans so that it becomes interpretable by machines. Prominent examples include the specification of driving regulations for AVs, and the specification of human expert knowledge in the context of AI-based medical diagnostics. How to specify and model human behavior, intent, and mental state is a further challenge common to all domains where humans interact closely with AS, such as in human-robot collaborative environments in smart manufacturing. Alternative approaches involve the specification of norms to characterize the desired behavior of AS, which regulate what the system should or should not do.
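Such a norm can often be phrased as a machine-checkable rule over the events an AS produces; a minimal sketch, using a hypothetical obligation invented purely for illustration:</p> <pre><code>
# Illustrative sketch only: checking an event trace against the hypothetical norm
# "after a request for help is detected, the AS must alert a human operator within 3 steps".
def complies(trace, trigger="help_request", obligation="alert_operator", within=3):
    pending = None  # step at which the obligation became active
    for step, event in enumerate(trace):
        if event == trigger and pending is None:
            pending = step
        if event == obligation:
            pending = None  # obligation discharged
        if pending is not None and step - pending >= within:
            return False    # obligation not discharged in time
    return pending is None

print(complies(["idle", "help_request", "plan", "alert_operator"]))  # True
print(complies(["idle", "help_request", "plan", "plan", "plan"]))    # False
</code></pre> <p>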
An emerging research direction is the design of monitors to observe the system and check compliance with norms.<sup><a class="reference-link xref xref-bibr" href="#bib2" data-jats-ref-type="bibr" data-jats-rid="bib2">2</a></sup> The example of swarm robotics raises the need, and the challenge, to specify behavior that emerges at the system level and relies on the interactions of the entities that form the system with each other and their environment.</p> <blockquote class="disp-quote" data-jats-content-type="pull-quote"> <p id="p-66">As autonomous systems play greater roles in our daily lives and interact more closely with humans, we need to build systems worthy of trust.</p> </blockquote> <p id="p-67">Beyond the technical aspects, across the specific AS domains, are research challenges related to governance and regulation for trustworthiness, requiring a holistic and human-centered approach to specification focused on responsibility and accountability, and enabling explainability from the outset. Fundamental to specifying for trustworthiness is a sound understanding of human behavior and expectations, as well as the social and ethical norms applicable when humans directly interact with AS. As for future work, an interesting extension of this article would be to produce a classification of properties to be specified for trustworthiness under the different intellectual challenges discussed—for example, socio-technical properties of explainability are purpose, audience, content, timing, and delivery mechanism of explanations.</p> <p id="p-68">We conclude that specifying for trustworthiness requires advances on the technical and engineering side, informed by new insights from social sciences and humanities research. Thus, tackling this specification challenge necessitates tight collaboration of engineers, roboticists, and computer scientists with experts from psychology, sociology, law, politics, economics, ethics, and philosophy. Most importantly, continuous engagement with regulators and the general public will be key to trustworthy AS.</p> <blockquote class="disp-quote" data-jats-content-type="pull-quote"> <p id="p-69">Specifying for trustworthiness requires advances on the technical and engineering side, informed by new insights from social sciences and humanities research.</p> </blockquote> </section> <section id="sec19" class="sec"> <h2 class="heading">Acknowledgments</h2> <p id="p-70">This work has been supported by the U.K. EPSRC under the grants: [EP/V026518/1], [EP/V026607/1], [EP/V026747/1], [EP/V026763/1], [EP/V026682/1], [EP/V026801/2], [EP/S027238/1] and [EP/V00784X/1]. A.B. and B.N. are also supported by EPSRC [EP/R013144/1] and SFI [13/RC/2094_P2]. Y.D. is also supported by a RAEng Chair in Emerging Technologies [CiET171846]. M.M.
is also supported by EPSRC [EP/Y005244/1].</p> </section> </div> <footer class="back"></footer> </article> ]]></content:encoded> <wfw:commentRss>https://cacm-acm-org-preprod.go-vip.net/research/on-specifying-for-trustworthiness/feed/</wfw:commentRss> <slash:comments>0</slash:comments> <dc:creator><![CDATA[Amel Bennaceur]]></dc:creator> <dc:creator><![CDATA[Greg Chance]]></dc:creator> <dc:creator><![CDATA[Yiannis Demiris]]></dc:creator> <dc:creator><![CDATA[Anastasia Kordoni]]></dc:creator> <dc:creator><![CDATA[Mark Levine]]></dc:creator> <dc:creator><![CDATA[Luke Moffat]]></dc:creator> <dc:creator><![CDATA[Luc Moreau]]></dc:creator> <dc:creator><![CDATA[Mohammad Reza Mousavi]]></dc:creator> <dc:creator><![CDATA[Bashar Nuseibeh]]></dc:creator> <dc:creator><![CDATA[Subramanian Ramamoorthy]]></dc:creator> <dc:creator><![CDATA[Jan Oliver Ringert]]></dc:creator> <dc:creator><![CDATA[James Wilson]]></dc:creator> <dc:creator><![CDATA[Shane Windsor]]></dc:creator> <dc:creator><![CDATA[Kerstin Eder]]></dc:creator> <post-id xmlns="com-wordpress:feed-additions:1">586482</post-id> </item> <item> <title>Achievement in Microarchitecture</title> <link>https://cacm-acm-org-preprod.go-vip.net/opinion/achievement-in-microarchitecture/</link> <comments>https://cacm-acm-org-preprod.go-vip.net/opinion/achievement-in-microarchitecture/#respond</comments> <dc:creator><![CDATA[Leah Hoffmann]]></dc:creator> <pubDate>Mon, 08 Jan 2024 21:08:53 +0000</pubDate> <category><![CDATA[Architecture and Hardware]]></category> <category><![CDATA[Systems and Networking]]></category> <guid isPermaLink="false">https://cacm-acm-org-preprod.go-vip.net/?post_type=digital-library&p=586418</guid> <description><![CDATA[David Papworth, a 30-year veteran of Intel, on what led to the P6 microprocessor and how that changed the microarchitectural paradigm.]]></description> <content:encoded><![CDATA[<article> <div class="body" lang="en"> <section id="sec1" class="sec"> <p id="p-1">David Papworth, recipient of the 2022 ACM Charles P. “Chuck” Thacker Breakthrough in Computing Award, accepted a big job in 1990 when he joined Intel’s P6 microprocessor team as lead designer. The P6—commercialized as the Pentium Pro—was intended to leapfrog microprocessor design, and it did. Thanks to Papworth’s broad understanding of the hardware-software interface and adroit leadership of more than 500 architects, designers, validators, and engineers, the P6 introduced a new microarchitectural paradigm that is still in use today. Here, Papworth recalls how it all went down.</p> <ul class="list list-simple" style="list-style-type: none; display: table;" data-jats-list-type="simple"> <li class="list-item" style="display: table-row;"> <div class="list-item-content" style="display: table-cell;"> <p id="p-2"><b>In the 1980s, before joining Intel, you worked at a startup called Multiflow, which pioneered Very Long Instruction Word (VLIW) architecture. VLIW exploits instruction-level parallelism by enabling the compiler to schedule pipelines of instructions across different functional units—a technique known as superscalar processing. How did VLIW influence your work on Intel’s P6 microprocessor?</b></p> </div> </li> <li class="list-item" style="display: table-row;"> <div class="list-item-content" style="display: table-cell;"> <p id="p-3">The main thing Multiflow did that was carried forward into the P6 was the idea of a very wide superscalar. But Multiflow was all about scheduling things in software and doing as little as possible in the hardware. 
By contrast, the predecessors of the Pentium Pro were more of the mindset that “We can build this, and the software will follow.”</p> <p id="p-4">There was a group of us—Bob Colwell (<a class="ext-link" href="https://bit.ly/3sEzgwc" data-jats-ext-link-type="uri">https://bit.ly/3sEzgwc</a>) and myself, in particular—who had experienced how effective it can be for hardware and software to work together. We had a pretty good sense of what software can do, and what it expects from the hardware. We also had ideas about how hardware could exploit parallelism and use run-time information to improve scheduling. We worked through the challenges of trading off between hardware and software while still maintaining compatibility with the PC software base.</p> </div> </li> <li class="list-item" style="display: table-row;"> <div class="list-item-content" style="display: table-cell;"> <p id="p-5"><b>One of the main innovations the P6 introduced is the paradigm of decomposing instructions into sequences of micro-operations. Can you explain how that works?</b></p> </div> </li> <li class="list-item" style="display: table-row;"> <div class="list-item-content" style="display: table-cell;"> <p id="p-6">The x86 instruction set has some very complex instructions. Imagine a “hyperbolic arc tangent instruction.” It’s easy to express the software intent as an instruction, but the required set of actions is way more than any practically realizable hardware can do in one step. That means it’s going to be a sequence of simpler things, whether you like it or not.</p> <p id="p-7">So, you have a complex instruction that does a load from memory and some sort of calculation: “Add to the BX Register the contents of this memory location over here.” In order to execute it, both of those values have to be available. That was no problem for the older Intel 386 and 486 pipelines, which were designed to execute everything in order.</p> <p id="p-8">What we did as part of P6 was to add out-of-order execution, which means we’ll do what we find in any order we feel like so long as the values are there. If they’re not there, we will just put it aside, move on to the next thing, and try to do that.</p> </div> </li> <li class="list-item" style="display: table-row;"> <div class="list-item-content" style="display: table-cell;"> <p id="p-9"><b>So you’re not just converting X86 to RISC instructions and executing them in sequence.</b></p> </div> </li> <li class="list-item" style="display: table-row;"> <div class="list-item-content" style="display: table-cell;"> <p id="p-10">Not at all. The essence of micro-operations is twofold. One is to decompose complex instructions into what the hardware can actually do. The other is to split them up into what, in software, are called data precedence arcs. So, you have the add operation, which is simple, and most machines can do that directly. There’s also a load that goes with it: “Get this value from memory and prepare it to be put on the other side of the adder.”</p> <p id="p-11">Instead of executing those two operations at the same time, we broke it up. We’ll do the load when we can, and oftentimes that’s well before the other side of the add is ready. And sometimes it’s not. Either way, you don’t want to sit there twiddling your thumbs. 
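</p> <p>To make the dataflow idea concrete, here is a small illustrative sketch in Python (with purely hypothetical micro-operation names and latencies; it is not a model of any real Intel pipeline). It splits an “add register, memory” instruction into a load micro-op and an add micro-op, then issues each micro-op as soon as its inputs are available:</p> <pre><code>
# Toy out-of-order issue: split a complex instruction into micro-ops and
# issue each one once its inputs are ready, regardless of program order.
# All names, registers, and latencies here are made up for illustration.

from collections import namedtuple

MicroOp = namedtuple("MicroOp", "name inputs output latency")

# Program order: the load feeds the add, but a later, independent
# increment of CX does not need to wait for either of them.
micro_ops = [
    MicroOp("load tmp0 = [mem]", inputs=["mem"],        output="tmp0", latency=3),
    MicroOp("add  BX = BX+tmp0", inputs=["BX", "tmp0"], output="BX",   latency=1),
    MicroOp("inc  CX = CX+1",    inputs=["CX"],         output="CX",   latency=1),
]

ready_at = {"mem": 0, "BX": 0, "CX": 0}  # cycle at which each value is available
pending = list(micro_ops)
cycle = 0
while pending:
    for op in list(pending):
        # Out-of-order issue: any micro-op whose inputs are all ready may go.
        if all(src in ready_at and cycle >= ready_at[src] for src in op.inputs):
            ready_at[op.output] = cycle + op.latency
            print(f"cycle {cycle}: issue {op.name}")
            pending.remove(op)
    cycle += 1
</code></pre> <p>In this toy trace the independent increment issues immediately, while the dependent add waits only for the load it needs, which is the data-driven scheduling Papworth describes.</p> <p>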
You can look ahead, find the next instructions, and do them.</p> </div> </li> <li class="list-item" style="display: table-row;"> <div class="list-item-content" style="display: table-cell;"> <p id="p-12"><b>And there are no paired pipelines for all these instructions, just a bunch of functional units to which operations are scheduled based on their availability.</b></p> </div> </li> <li class="list-item" style="display: table-row;"> <div class="list-item-content" style="display: table-cell;"> <p id="p-13">Right. Things can execute when their operands are ready and there’s a functional unit ready to handle them. This is controlled by the process of register renaming, which takes the data precedence graph expressed by the software and encountered at runtime and maps that onto resources capable of containing that result for as long as it’s needed.</p> </div> </li> <li class="list-item" style="display: table-row;"> <div class="list-item-content" style="display: table-cell;"> <p id="p-14"><b>You also introduced some important validation and testing protocols.</b></p> </div> </li> <li class="list-item" style="display: table-row;"> <div class="list-item-content" style="display: table-cell;"> <p id="p-15">When Intel launches a successful microprocessor, a couple years later, it will be selling a hundred million microprocessors a year. Let’s imagine you have to recall two years of production. That’s 200 million microprocessors, each of which costs on the order of $100 to service and replace. That’s $20 billion!</p> <p id="p-16">Now, you can’t sit paralyzed and not launch the thing. But unless you’re Google or Apple and can make the software work around your microprocessor, you’re just deathly afraid of that conundrum, so you do as much as you can to validate it pre-production.</p> </div> </li> <li class="list-item" style="display: table-row;"> <div class="list-item-content" style="display: table-cell;"> <p id="p-17"><b>Building something as complex as a microprocessor requires a lot of juggling when it comes to requirements and constraints. Can you talk about some of the design tradeoffs you made?</b></p> </div> </li> <li class="list-item" style="display: table-row;"> <div class="list-item-content" style="display: table-cell;"> <p id="p-18">I think the simplest example is 16-bit performance. The Intel 8088 was one of the company’s most influential microprocessors. It was a 16-bit computation machine, and it had lots of quirks. For example, it would clear the upper byte of a register, then load something into the low byte of that register and read it as a composite thing. That causes horrible violence to the way we built and executed our register rename table, and there’s really no reason to do it.</p> <p id="p-19">So we decided to deprecate it—to say, “We’ll make it work, but it doesn’t have to be fast.” Our reasoning was that the workstations that used the Pentium Pro would be set up to run with modern software, but their compilers could deal with lower performance in that area and still be compatible with a 20-year-old version of Lotus 1-2-3. We thought it would be fine to make that performance tradeoff, but the market taught us it wasn’t entirely fine. 
Because the first thing they did with the Pentium Pro is run all of these old DOS benchmarks, and some of them didn’t look very flattering.</p> </div> </li> <li class="list-item" style="display: table-row;"> <div class="list-item-content" style="display: table-cell;"> <p id="p-20"><b>Is that something you did differently in subsequent iterations of the Pentium?</b></p> </div> </li> <li class="list-item" style="display: table-row;"> <div class="list-item-content" style="display: table-cell;"> <p id="p-21">Yes. As an architect, you have to design machines that can run a lot of software. Perhaps you’d like to do floating point really, really well. Do the people running on a PC or even a workstation really care? Some do. It sells computers. You can say, “Hey, LINPACK gets this great number.” But at the end of the day, you pick, as best you can, a bunch of performance benchmarks, tailor the pipeline to do that, and then see how it works out.</p> </div> </li> <li class="list-item" style="display: table-row;"> <div class="list-item-content" style="display: table-cell;"> <p id="p-22"><b>After the Pentium Pro, you worked on the Pentium 2 and 3 and 4—at which point you launched a second career helping Intel’s legal department defend against patent cases. What was that like?</b></p> </div> </li> <li class="list-item" style="display: table-row;"> <div class="list-item-content" style="display: table-cell;"> <p id="p-23">The legal work wasn’t as stressful as building microprocessors. I found I could be extremely helpful to the lawyers in explaining complicated technology, and I liked using the other side of my brain. I was also good at staring down a plaintiff’s attorney during depositions and answering questions truthfully without giving them the sound bites they were looking for. In federal case depositions, it’s a nine-hour day, and you spend seven hours on the record. The lawyers are trying to catch just three extra words that they can take out of context and put in a brief. I was skilled at conveying complex technology in legally artful terms, and I had a steady stream of this work for many years.</p> <p id="p-24">However, the patent litigation landscape slowly changed over this time to be less favorable to plaintiffs and more favorable to defendants, particularly in difficult districts like Marshall, TX (<a class="ext-link" href="https://bit.ly/45xnTVg" data-jats-ext-link-type="uri">https://bit.ly/45xnTVg</a>). By 2019, the number of high-profile cases had fallen off dramatically, and then in 2020, COVID hit. At that point, I was 64 years old, and with the quarantines, it wasn’t a good time to get back into big microprocessor design projects. 
So I retired from Intel, and now I spend my days on my farm looking out over the fields and raising my grandson with my wife Katie.</p> </div> </li> </ul> </section> </div> </article> ]]></content:encoded> <wfw:commentRss>https://cacm-acm-org-preprod.go-vip.net/opinion/achievement-in-microarchitecture/feed/</wfw:commentRss> <slash:comments>0</slash:comments> <post-id xmlns="com-wordpress:feed-additions:1">586418</post-id> </item> <item> <title>Epigenomics Now</title> <link>https://cacm-acm-org-preprod.go-vip.net/news/epigenomics-now/</link> <comments>https://cacm-acm-org-preprod.go-vip.net/news/epigenomics-now/#respond</comments> <dc:creator><![CDATA[Don Monroe]]></dc:creator> <pubDate>Mon, 08 Jan 2024 18:40:35 +0000</pubDate> <category><![CDATA[Architecture and Hardware]]></category> <category><![CDATA[Systems and Networking]]></category> <guid isPermaLink="false">https://cacm-acm-org-preprod.go-vip.net/?post_type=digital-library&p=586388</guid> <description><![CDATA[Computer power helps biologists track the regulation of genetic information. ]]></description> <content:encoded><![CDATA[<article> <div class="body" lang="en"> <section id="sec1" class="sec"> <p id="p-1">For nearly a quarter-century, we have had a (mostly) complete listing of the human genome, the three-billion “letter” sequence of DNA, most of which is the same for all of us. This reference copy makes it much easier for scientists to understand biological processes and to identify the individual variations, such as mutations, that contribute to disease. Despite its central role and its extreme usefulness, however, the genome’s impact on health care has been smaller than many proponents had hoped.</p> <p id="p-2">Part of the reason is that while most of the cells in your body carry identical DNA, the biological activity of different regions varies widely over time and between different tissues. It is these differences in gene expression that orchestrate the intricate development of tissues and the unique features of various cell types, as well as much of the misbehavior of cells in disease.</p> <p id="p-3">Researchers have been refining techniques to map and analyze cellular features that affect gene expression, including chemical modifications of DNA and of the proteins that it wraps around. These changes, which are called “epigenetic” because they do not alter the genetic sequence, can persist through cell divisions and sometimes even transmit altered activity to offspring. (The now-widely-used “-omics” suffix refers to the comprehensive survey of individual “-etic” measurements.)</p> <p id="p-4">In contrast with traditional painstaking laboratory techniques, modern biology exploits robotic manipulation of samples and high-throughput data acquisition to assemble enormous data resources. Computer analysis plays a critical role in interpreting these data, at least as important as it has been for DNA sequence data, and the challenges and opportunities are growing.</p> </section> <section id="sec2" class="sec"> <h2 class="heading">Sequence Information</h2> <p id="p-5">The 3 billion DNA letters, chemically known as bases and denoted C, G, T, and A (for cytosine, guanine, thymine, and adenine), are distributed among 23 pairs of chromosomes, each of which contains tens of millions of bases. Mapping the sequence involves chopping each strand into short pieces, chemically determining the order of bases in each piece, and computationally matching up overlapping segments to reconstruct the full sequence. 
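</p> <p>As a rough illustration of that matching step, the following deliberately simplified sketch (in Python, with made-up reads; real assemblers rely on far more sophisticated, graph-based methods) greedily joins fragments along their longest overlaps:</p> <pre><code>
# Toy sequence assembly: repeatedly merge the pair of reads with the
# longest suffix/prefix overlap. Reads are invented for illustration only.

def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def assemble(reads):
    reads = list(reads)
    while len(reads) > 1:
        # Find the pair (i, j) with the largest overlap and merge it.
        n, i, j = max((overlap(a, b), i, j)
                      for i, a in enumerate(reads)
                      for j, b in enumerate(reads) if i != j)
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads[0]

print(assemble(["GATTACAGG", "ACAGGTTCA", "TTCAGGCAT"]))  # GATTACAGGTTCAGGCAT
</code></pre> <p>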
(This process conflates regions that have very similar sequences; only recent technology that sequences longer sections has completed the end-to-end genome.)</p> <p id="p-6">The sequence is not enough, however. British biologist C.H. Waddington coined the term “epigenetics” in 1942 to describe how local conditions go beyond genetics to determine cell characteristics. For example, as a single fertilized egg divides and develops, the embryo’s cells become increasingly committed to specialized cell-type identities with distinct patterns of gene expression.</p> <p id="p-7">The first step in expressing DNA is its “transcription” into another nucleic acid, RNA, which exactly mirrors the DNA sequence. RNA performs many cellular roles, most familiarly through later “translation” via the genetic code into proteins that form cellular structures or catalyze chemical reactions.</p> <p id="p-8">Some proteins, known as transcription factors, bind to target “promoter” sequences in the DNA to regulate further transcription of nearby genes, creating feedback loops that stabilize particular patterns of expression. Although this mechanism matches Waddington’s original concept, epigenetics is now more often used to describe processes that more directly and persistently regulate the activity of specific regions of the genome. The late Nobel medalist Joshua Lederberg once said that epigenetics had already become a “semantic morass” by the late 1950s, and scientists have since identified numerous relevant processes that further complicated the terminology.</p> <p id="p-9">One important mechanism is the direct methylation of DNA, the chemical attachment of a methyl group to a C in the DNA chain, which often suppresses expression of a nearby gene. Another epigenetic mechanism involves chemical modification of histone proteins, around which much of nuclear DNA is normally wound tightly. Some of these changes increase gene expression, others decrease it.</p> <p id="p-10">To map these modifications, researchers use antibodies that bind to particular “marks.” The DNA is then chopped up and the antibody-bound segments are sequenced. The sequences are then computationally matched to the reference genome to find out which locations were modified.</p> <p id="p-11">DNA information is central to its biological significance, preserved through the reliable incorporation of complementary bases (C with G and A with T) during DNA duplication to make new cells or new offspring. Epigenetic information is more fragile; it is largely maintained through specialized enzymes that mirror the marks on new complementary chains.</p> <p id="p-12">The epigenetic information and the corresponding patterns of gene expression can signify specific cell types, such as neurons or muscle cells. They also are known signatures for particular cancers (although DNA mutations also play an important role). Epigenomics thus informs cancer prognosis, as well as guiding researchers to potential therapeutic targets.</p> </section> <section id="sec3" class="sec"> <h2 class="heading">Going for the Code</h2> <p id="p-13">For many years, researchers have been developing computer tools to analyze various large-scale datasets built on the human genome. 
For example, in 2015 researchers adapted hidden Markov models to look for an “epigenetic code” that goes beyond the overall density of marks in particular regions for predicting DNA accessibility and transcription.</p> <p id="p-14">“There has been a long debate whether … a code exists in which the combination of these marks is actually something more complex and meaningful than just the sum of them,” said computational biologist Mattia Pelizzola of the Italian Institute of Technology in Milan. He notes that although some marks do have interacting effects, there is little support for a large-scale epigenetic code that had earlier been envisioned.</p> <p id="p-15">An important regulator of gene expression is the three-dimensional (3D) structure of the chromatin, the conglomeration of DNA and proteins in the nucleus. It was long clear that there are two distinct states of chromatin, as well as distinct locations in the nucleus, which differ in their gene activity. Recent research has shown much greater complexity, organized in part by the epigenetic modifications.</p> <p id="p-16">Researchers have developed tools for probing this 3D structure, for example by chemically cross-linking nearby regions of DNA, which need not even be on the same chromosome. Chopping up the DNA and sequencing the cross-linked segments then provides a comprehensive view of which parts of the DNA are brought into proximity at various length scales by the 3D folding.</p> <p id="p-17">These measurements and their computational analysis revealed the formation of loops, and the resulting “topologically associating domains” that bring together genes and regulatory elements. Understanding how epigenetic marks guide the 3D organization is an important, ongoing challenge that is well suited to artificial intelligence. For example, Jie Liu, assistant professor of computational medicine and bioinformatics at the University of Michigan, and his colleagues used a deep learning model to predict 3D structure from available data such as DNA and histone marks and accessibility.</p> <p id="p-18">Deep learning “no longer requires humans to annotate the features,” Liu said. “It has the convolution filters to identify the features automatically from the data.” Nonetheless, he thinks the features that emerge can be understood by people, which is important “so that people can trust the model.”</p> <p id="p-19">“There are tons of models like this in this domain,” Liu said. “We are trying to integrate everything into one framework for everything,” Liu said, using a transformer-based method inspired in part by large language models. “This is really similar.”</p> </section> <section id="sec4" class="sec"> <h2 class="heading">A Rich Future</h2> <p id="p-20">The methodologies for mapping DNA methylation and histone modifications are well established, as are techniques for mapping DNA accessibility and (to a lesser degree) 3D chromatin structure. However, researchers are also exploring other, less direct epigenetic mechanisms, which are also ripe for computational study.</p> <p id="p-21">Some RNA transcripts, for example, can directly influence gene expression without the need to be translated into protein. These RNA molecules, like transcription factors, can transmit their information to daughter cells. “It has been speculated that at the level of transcriptome, there is something that can be inherited and transmitted through cell division,” said Pelizzola. 
In his own, early-stage research, Pelizzola is also exploring the “epitranscriptome,” a “recently emerging layer of epi-information” involving chemical modifications of transcribed RNA.</p> <p id="p-22">Most epigenetic information is reset during reproduction, but there have been some reports that features can be conveyed through multiple generations. Some biologists suspect such preservation could affect long-term evolution by laying the groundwork for related genetic changes in DNA, but this idea is not universally accepted.</p> <p id="p-23">Computational methods will continue to play key roles in the ongoing explosion of biological understanding, Pelizzola said. For one thing, “Brute force computer power is more and more needed for dealing with huge and increasing amounts of data” produced by improved measurement tools. Second, he noted that computer science and multidisciplinary teams are critical for precise computational methods, such as those based on differential equations.</p> <p id="p-24">Finally, deep learning and artificial intelligence “are very likely to be very disruptive in terms of revealing unexpected connections,” he said. “The key is having enough data” for training.</p> </section> </div> <footer class="back"></footer> </article> ]]></content:encoded> <wfw:commentRss>https://cacm-acm-org-preprod.go-vip.net/news/epigenomics-now/feed/</wfw:commentRss> <slash:comments>0</slash:comments> <post-id xmlns="com-wordpress:feed-additions:1">586388</post-id> </item> <item> <title>Are You Confident in Your Backups?</title> <link>https://cacm-acm-org-preprod.go-vip.net/blogcacm/are-you-confident-in-your-backups/</link> <comments>https://cacm-acm-org-preprod.go-vip.net/blogcacm/are-you-confident-in-your-backups/#respond</comments> <dc:creator><![CDATA[Alex Vakulov]]></dc:creator> <pubDate>Mon, 08 Jan 2024 18:00:35 +0000</pubDate> <category><![CDATA[Computing Applications]]></category> <category><![CDATA[Computing Profession]]></category> <category><![CDATA[Data and Information]]></category> <category><![CDATA[Security and Privacy]]></category> <category><![CDATA[Systems and Networking]]></category> <guid isPermaLink="false">https://cacm-migration.alley.dev/?p=750233</guid> <description><![CDATA[Assessing data security. ]]></description> <content:encoded><![CDATA[ <p>The importance of data backups cannot be overestimated. Backups are essential for reducing the harm from hardware failures and lessening the effects of various hacker attacks, with <a href="https://cacm-acm-org-preprod.go-vip.net/news/257332-winning-the-war-on-ransomware/fulltext">ransomware</a> being the most dangerous. At the same time, given the widespread availability of cost-effective <a href="https://www.baculasystems.com/blog/best-enterprise-backup-solutions/">enterprise backup solutions</a> today, the encryption methods used by ransomware authors should not represent a significant threat. Implementing effective backup strategies is now easier and more affordable for organizations and individual users. However, significant challenges still persist in this area.</p> <p><strong>Why Backups Fail</strong></p> <p>Backups are often ineffective for several reasons, largely influenced by financial considerations. To cut IT costs, some companies do not back up all essential files or do so infrequently. 
Even organizations with extensive backup systems may fail to test them adequately, leading to difficulties in data restoration during crises.</p> <p>Another common error is storing backups on network drives, which are prime targets for sophisticated ransomware attacks, along with local drives.</p> <p>Human factors, like accidental or intentional deletion, also contribute to backup failures.</p> <p>Additionally, natural disasters or accidents at the datacenter or site, especially if it is in a different country, can hinder access to backups, further complicating the situation.</p> <p><strong>How Many Backups Do You Need?</strong></p> <p>It is often wise to be a bit paranoid when it comes to data backups. Thanks to affordable cloud storage and tailored solutions, organizations can now securely store large amounts of their confidential data. The balance between cost and protection has shifted significantly compared to 10 years ago.</p> <p>IT teams should schedule backups regularly to ensure they can recover the latest versions of critical files whenever needed. When setting up a backup system, two key measures are often considered: Recovery Point Objective (<a href="https://www.f5.com/services/resources/glossary/recovery-point-objective-rpo">RPO</a>) and Recovery Time Objective (<a href="https://docs.oracle.com/database/121/VLDBG/GUID-76435B39-EF7E-418B-9C33-0C43B088178A.htm#VLDBG1567">RTO</a>).</p> <p>RPO defines the maximum period during which data loss is acceptable for a company. In other words, if a data loss incident occurs, the company could lose data generated during this time frame. Therefore, the frequency of backups is adjusted according to this period.</p> <p>On the other hand, RTO specifies the duration that data or an IT system can be offline. After an incident, whether it is data, an app, a virtual machine, or an operating system, RTO is the timeframe within which these need to be restored.</p> <p>The RTO and RPO parameters are tailored for each organization, depending on the type of data, its importance to the business, the cost of its restoration, and whether it is an application, a virtual machine, or an <a href="https://en.wikipedia.org/wiki/Array_(data_structure)">array</a>.</p> <p>A common oversight in backup system management is failing to update the system's rules and tasks regularly. As a company grows and its IT infrastructure evolves, the volume and variety of internal services, data, and applications increase. But often, the backup policies, which might have been set up months or even years earlier, remain unchanged. This neglect can lead to data loss risks, data integrity issues, or excessive downtime of crucial IT systems.</p> <p><strong>Backing Up More Than Just Files</strong></p> <p>Today, when ransomware is a significant threat, simply backing up important files might not be enough. There might be a need to restore entire workstations and systems to their previous, uninfected state. Ransomware can paralyze various critical services, including email and print servers, CAD systems, payment terminals, employee training and payroll systems, potentially halting business operations. To counter this, it is advisable to maintain backups or 'images' of their systems, which can be quickly deployed if the original systems are compromised. It is not necessary to keep multiple backups of each system. 
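</p> <p>Before moving on, here is a minimal sketch (in Python, with purely hypothetical timestamps and targets) of how the RPO and RTO measures described above translate into a concrete check after an incident:</p> <pre><code>
# Toy RPO/RTO check against a small, invented backup catalog.
# Real backup tools track these figures per system and per job.

from datetime import datetime, timedelta

RPO = timedelta(hours=4)   # at most 4 hours of data may be lost
RTO = timedelta(hours=2)   # service must be restored within 2 hours

completed_backups = [
    datetime(2024, 1, 8, 0, 15),
    datetime(2024, 1, 8, 4, 10),
    datetime(2024, 1, 8, 8, 5),
]
incident = datetime(2024, 1, 8, 11, 30)             # when data was lost
estimated_restore = timedelta(hours=1, minutes=40)  # from a restore test

last_good = max(b for b in completed_backups if incident >= b)
data_loss_window = incident - last_good

print("Data at risk:", data_loss_window, "- RPO", "met" if RPO >= data_loss_window else "MISSED")
print("Restore time:", estimated_restore, "- RTO", "met" if RTO >= estimated_restore else "MISSED")
</code></pre> <p>If such a check reports a missed RPO, the usual remedy is more frequent backups; a missed RTO typically calls for faster restore paths or standby systems.</p> <p>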
Using <a href="https://aws.amazon.com/compare/the-difference-between-incremental-differential-and-other-backups/">incremental backup</a> solutions, which store only the changes made since the previous backup, can be an efficient way to ensure you have the necessary data to revert to a clean state.</p> <p><strong>A Multi-Layered Backup Strategy</strong></p> <p>To enhance the protection and reliability of your organization’s data, adopting a multi-layered backup strategy is recommended. Developing such a strategy involves a thorough assessment of your organization’s infrastructure to identify the data, systems, and files that require backup. This process includes establishing dependencies among the information systems earmarked for backup.</p> <p>The next step is to define the requirements for RTO and RPO, which are critical in shaping your backup strategy.</p> <p>Once the analysis is complete, you can determine the hardware requirements. This includes selecting appropriate storage systems, servers, tape libraries, and other infrastructure components. These decisions should be based on a detailed list of the information systems that need backup, along with considerations of storage location, the frequency of backups, and the types of backups required.</p> <p>A cornerstone of this strategy is the <a href="https://www.techtarget.com/searchdatabackup/definition/3-2-1-Backup-Strategy">3-2-1 rule</a>, which significantly reduces the risk of a single point of failure. The rule is simple yet effective: maintain at least three copies of your data, keep two copies on different types of media, and keep one backup copy offline. It is important to prioritize data, focusing on the most critical information. For offline backups, ensuring they contain the latest data version is vital.</p> <p>Adhering to the 3-2-1 backup rule minimizes the risk of data loss from ransomware, hardware failures, or internal threats like disgruntled employees. This approach prepares your organization for worst-case scenarios, enhancing resilience against disasters from any source.</p> <p>Further enhancing your backup strategy includes:</p> <ul class="wp-block-list"> <li>Regular testing of backups</li> </ul> <p>It is crucial to routinely test your backups to ensure they work as intended. This testing helps identify any issues in the backup process and confirms the reliability of data restoration.</p> <ul class="wp-block-list"> <li>Network segmentation and air gapping</li> </ul> <p>Segmenting your network and using <a href="https://www.youtube.com/watch?v=rh3yG8fsYA8">air gaps</a> (disconnecting backups from the network) can protect backup integrity. This reduces the risk of network-based attacks affecting your backups.</p> <ul class="wp-block-list"> <li>Encrypting backups</li> </ul> <p>Adding encryption to your backups provides an additional layer of security. It ensures that even if the data is accessed without authorization, it remains unreadable and secure.</p> <ul class="wp-block-list"> <li>Employee training</li> </ul> <p>Educating your staff about the importance of backups, best practices, and how to respond in case of data loss is vital. 
Properly trained employees play a key role in maintaining the integrity of your backup systems.</p> <p><em><a href="https://www.linkedin.com/in/alex-vakulov-security/"><strong>Alex Vakulov</strong></a> is a cybersecurity researcher with more than 20 years of experience in malware analysis and strong malware removal skills.</em></p> ]]></content:encoded> <wfw:commentRss>https://cacm-acm-org-preprod.go-vip.net/blogcacm/are-you-confident-in-your-backups/feed/</wfw:commentRss> <slash:comments>0</slash:comments> <post-id xmlns="com-wordpress:feed-additions:1">750233</post-id> </item> </channel> </rss>