<!doctype html> <html class="no-js aws-lng-en_US aws-with-target" lang="en-US" data-static-assets="https://a0.awsstatic.com" data-js-version="1.0.597" data-css-version="1.0.508"> <head> <meta http-equiv="Content-Security-Policy" content="default-src 'self' data: https://a0.awsstatic.com https://prod.us-east-1.ui.gcr-chat.marketing.aws.dev; base-uri 'none'; connect-src 'self' https://*.analytics.console.aws.a2z.com https://*.harmony.a2z.com https://*.marketing.aws.dev https://*.panorama.console.api.aws https://*.prod.chc-features.uxplatform.aws.dev https://*.us-east-1.prod.mrc-sunrise.marketing.aws.dev https://112-tzm-766.mktoresp.com https://112-tzm-766.mktoutil.com https://a0.awsstatic.com https://a0.p.awsstatic.com https://a1.awsstatic.com https://amazonwebservices.d2.sc.omtrdc.net https://amazonwebservicesinc.tt.omtrdc.net https://api-v2.builderprofile.aws.dev https://api.regional-table.region-services.aws.a2z.com https://api.us-west-2.prod.pricing.aws.a2z.com https://auth.aws.amazon.com https://aws.amazon.com https://aws.amazon.com/p/sf/ https://aws.demdex.net https://b0.p.awsstatic.com https://c0.b0.p.awsstatic.com https://calculator.aws https://chatbot-api.us-east-1.prod.mrc-sunrise.marketing.aws.dev https://chatbot-stream-api.us-east-1.prod.mrc-sunrise.marketing.aws.dev https://cm.everesttech.net https://csml-plc-prod.us-west-2.api.aws/plc/csml/logging https://d0.awsstatic.com https://d1.awsstatic.com https://d1fgizr415o1r6.cloudfront.net https://d2c.aws.amazon.com https://d3borx6sfvnesb.cloudfront.net https://dftu77xade0tc.cloudfront.net https://dpm.demdex.net https://edge.adobedc.net https://fls-na.amazon.com https://i18n-string.us-west-2.prod.pricing.aws.a2z.com https://iad.staging.prod.tv.awsstatic.com https://infra-api.us-east-1.prod.mrc-sunrise.marketing.aws.dev https://ingestion.aperture-public-api.feedback.console.aws.dev https://livechat-api.us-east-1.prod.mrc-sunrise.marketing.aws.dev https://pricing-table.us-west-2.prod.site.p.awsstatic.com 
https://prod-us-west-2.csp-report.marketing.aws.dev https://prod.log.shortbread.aws.dev https://prod.tools.shortbread.aws.dev https://prod.us-east-1.api.gcr-chat.marketing.aws.dev https://prod.us-east-1.rest-bot.gcr-chat.marketing.aws.dev https://prod.us-east-1.ui.gcr-chat.marketing.aws.dev https://prod2.clientlogger.cn-northwest-1.marketplace.aws.a2z.org.cn https://public.lotus.awt.aws.a2z.com https://s0.awsstatic.com https://s3.amazonaws.com/aws-messaging-pricing-information/ https://s3.amazonaws.com/public-pricing-agc/ https://spot-bid-advisor.s3.amazonaws.com https://t0.m.awsstatic.com https://target.aws.amazon.com https://token.us-west-2.prod.site.p.awsstatic.com https://tv.awsstatic.com https://view-stage.us-west-2.prod.pricing.aws.a2z.com https://view-staging.us-east-1.prod.plc1-prod.pricing.aws.a2z.com https://vs.aws.amazon.com https://webchat-aws.clink.cn https://wrp.aws.amazon.com https://www.youtube-nocookie.com https://xcxrmtkxx5.execute-api.us-east-1.amazonaws.com/prod/ wss://*.transport.connect.us-east-1.amazonaws.com wss://prod.us-east-1.wss-bot.gcr-chat.marketing.aws.dev wss://webchat-aws.clink.cn; font-src 'self' data: https://a0.awsstatic.com https://f0.awsstatic.com https://fonts.gstatic.com https://prod.us-east-1.ui.gcr-chat.marketing.aws.dev; frame-src 'self' https://*.widget.console.aws.amazon.com https://aws.demdex.net https://c0.b0.p.awsstatic.com https://calculator.aws https://conversational-experience-worker.widget.console.aws.amazon.com/lotus/isolatedIFrame https://dpm.demdex.net https://pricing-table.us-west-2.prod.site.p.awsstatic.com https://token.us-west-2.prod.site.p.awsstatic.com https://www.youtube-nocookie.com; img-src 'self' blob: data: https://*.vidyard.com https://*.ytimg.com https://a0.awsstatic.com https://amazonwebservices.d2.sc.omtrdc.net https://avatars.builderprofile.aws.dev https://aws-clink2-resource.s3.cn-northwest-1.amazonaws.com.cn https://aws-quickstart.s3.amazonaws.com https://aws.amazon.com https://aws.demdex.net 
https://awsmedia.s3.amazonaws.com https://chat.us-east-1.prod.mrc-sunrise.marketing.aws.dev https://cm.everesttech.net https://d1.awsstatic.com https://d1d1et6laiqoh9.cloudfront.net https://d2908q01vomqb2.cloudfront.net https://d2c.aws.amazon.com https://d2cpw7vd6a2efr.cloudfront.net https://d36cz9buwru1tt.cloudfront.net https://d7umqicpi7263.cloudfront.net https://docs.aws.amazon.com https://dpm.demdex.net https://fls-na.amazon.com https://google.ca https://google.co.in https://google.co.jp https://google.co.th https://google.co.uk https://google.com https://google.com.ar https://google.com.au https://google.com.br https://google.com.hk https://google.com.mx https://google.com.tr https://google.com.tw https://google.de https://google.es https://google.fr https://google.it https://google.nl https://google.pl https://google.ru https://iad.staging.prod.tv.awsstatic.com https://img.youtube.com https://marketingplatform.google.com https://media.amazonwebservices.com https://p.adsymptotic.com https://pages.awscloud.com https://prod.us-east-1.ui.gcr-chat.marketing.aws.dev https://s3.amazonaws.com/aws-quickstart/ https://ssl-static.libsyn.com https://static-cdn.jtvnw.net https://tv.awsstatic.com https://webchat-aws.clink.cn https://www.google.com https://www.linkedin.com https://yt3.ggpht.com; media-src 'self' blob: https://*.libsyn.com https://a0.awsstatic.com https://anchor.fm https://awsmedia.s3.amazonaws.com https://awspodcastsiberiaent.s3.eu-west-3.amazonaws.com https://chat.us-east-1.prod.mrc-sunrise.marketing.aws.dev https://chtbl.com https://content.production.cdn.art19.com https://d1.awsstatic.com https://d1hemuljm71t2j.cloudfront.net https://d1le29qyzha1u4.cloudfront.net https://d1oqpvwii7b6rh.cloudfront.net https://d1vo51ubqkiilx.cloudfront.net https://d1yyh5dhdgifnx.cloudfront.net https://d2908q01vomqb2.cloudfront.net https://d2a6igt6jhaluh.cloudfront.net https://d3ctxlq1ktw2nl.cloudfront.net https://d3h2ozso0dirfl.cloudfront.net 
https://dgen8gghn3u86.cloudfront.net https://dk261l6wntthl.cloudfront.net https://download.stormacq.com/aws/podcast/ https://dts.podtrac.com https://iad.staging.prod.tv.awsstatic.com https://media.amazonwebservices.com https://mktg-apac.s3-ap-southeast-1.amazonaws.com https://rss.art19.com https://s3-ap-northeast-1.amazonaws.com/aws-china-media/ https://tv.awsstatic.com https://www.buzzsprout.com; object-src 'none'; script-src 'sha256-PbryX5lQWCdSR48qR4OIWj6swmfTYkeWtICo76LVZTI=' 'nonce-DlUE4RvcnUDAlTka9GJOAFZbwvgtfIxT2D2mwbLcxGI=' 'self' blob: https://*.cdn.console.awsstatic.com/ https://*.cdn.uis.awsstatic.com/ https://*.us-east-1.prod.mrc-sunrise.marketing.aws.dev https://a.b.cdn.console.awsstatic.com https://a0.awsstatic.com https://amazonwebservicesinc.tt.omtrdc.net https://cdn.builderprofile.aws.dev https://d2c.aws.amazon.com https://googleads.g.doubleclick.net https://loader.us-east-1.prod.mrc-sunrise.marketing.aws.dev https://prod.us-east-1.ui.gcr-chat.marketing.aws.dev https://spot-price.s3.amazonaws.com https://static.doubleclick.net https://t0.m.awsstatic.com https://token.us-west-2.prod.site.p.awsstatic.com https://website.spot.ec2.aws.a2z.com https://www.google.com https://www.gstatic.com https://www.youtube.com/iframe_api https://www.youtube.com/s/player/; style-src 'self' 'unsafe-inline' https://a0.awsstatic.com https://prod.us-east-1.ui.gcr-chat.marketing.aws.dev https://t0.m.awsstatic.com https://token.us-west-2.prod.site.p.awsstatic.com" data-report-uri="https://prod-us-west-2.csp-report.marketing.aws.dev/submit"> <meta http-equiv="content-type" content="text/html; charset=UTF-8"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <link rel="preconnect" href="https://a0.awsstatic.com" crossorigin="anonymous"> <link rel="dns-prefetch" href="https://a0.awsstatic.com"> <link rel="dns-prefetch" href="https://d1.awsstatic.com"> <link rel="dns-prefetch" href="https://amazonwebservicesinc.tt.omtrdc.net"> <link rel="dns-prefetch" 
href="https://s0.awsstatic.com"> <link rel="dns-prefetch" href="https://t0.m.awsstatic.com"> <title>What is RLHF? - Reinforcement Learning from Human Feedback Explained - AWS</title> <meta name="description" content="What is Reinforcement Learning from Human Feedback how and why businesses use Reinforcement Learning from Human Feedback, and how to use Reinforcement Learning from Human Feedback with AWS."> <meta name="robots" content="index, follow"> <meta property="twitter:title" content="What is RLHF? - Reinforcement Learning from Human Feedback Explained - AWS"> <meta property="twitter:description" content="What is Reinforcement Learning from Human Feedback how and why businesses use Reinforcement Learning from Human Feedback, and how to use Reinforcement Learning from Human Feedback with AWS."> <meta property="og:title" content="What is RLHF? - Reinforcement Learning from Human Feedback Explained - AWS"> <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1"> <link rel="canonical" href="https://aws.amazon.com/what-is/reinforcement-learning-from-human-feedback/"> <link rel="alternate" href="https://aws.amazon.com/ar/what-is/reinforcement-learning-from-human-feedback/" hreflang="ar-sa"> <link rel="alternate" href="https://aws.amazon.com/de/what-is/reinforcement-learning-from-human-feedback/" hreflang="de-de"> <link rel="alternate" href="https://aws.amazon.com/es/what-is/reinforcement-learning-from-human-feedback/" hreflang="es-es"> <link rel="alternate" href="https://aws.amazon.com/fr/what-is/reinforcement-learning-from-human-feedback/" hreflang="fr-fr"> <link rel="alternate" href="https://aws.amazon.com/id/what-is/reinforcement-learning-from-human-feedback/" hreflang="id-id"> <link rel="alternate" href="https://aws.amazon.com/it/what-is/reinforcement-learning-from-human-feedback/" hreflang="it-it"> <link rel="alternate" href="https://aws.amazon.com/jp/what-is/reinforcement-learning-from-human-feedback/" hreflang="ja-jp"> <link rel="alternate" 
href="https://aws.amazon.com/ko/what-is/reinforcement-learning-from-human-feedback/" hreflang="ko-kr"> <link rel="alternate" href="https://aws.amazon.com/pt/what-is/reinforcement-learning-from-human-feedback/" hreflang="pt-br"> <link rel="alternate" href="https://aws.amazon.com/ru/what-is/reinforcement-learning-from-human-feedback/" hreflang="ru-ru"> <link rel="alternate" href="https://aws.amazon.com/th/what-is/reinforcement-learning-from-human-feedback/" hreflang="th-th"> <link rel="alternate" href="https://aws.amazon.com/tr/what-is/reinforcement-learning-from-human-feedback/" hreflang="tr-tr"> <link rel="alternate" href="https://aws.amazon.com/vi/what-is/reinforcement-learning-from-human-feedback/" hreflang="vi-vn"> <link rel="alternate" href="https://aws.amazon.com/cn/what-is/reinforcement-learning-from-human-feedback/" hreflang="zh-cn"> <link rel="alternate" href="https://aws.amazon.com/tw/what-is/reinforcement-learning-from-human-feedback/" hreflang="zh-tw"> <script src="https://a0.awsstatic.com/libra/1.0.597/csp/csp-report.js" async="true"></script> <meta property="twitter:card" content="summary"> <meta property="twitter:image" content="https://a0.awsstatic.com/libra-css/images/logos/aws_logo_smile_179x109.png"> <meta property="twitter:site" content="@awscloud"> <meta property="fb:pages" content="153063591397681"> <meta name="baidu-site-verification" content="pjxJUyWxae"> <meta name="360-site-verification" content="cbe5c6f0249e273e71fffd6d6580ce09"> <meta name="shenma-site-verification" content="79b94bb338f010af876605819a332e19_1617844070"> <meta name="sogou_site_verification" content="Ow8cCy3Hgq"> <link rel="icon" type="image/ico" href="https://a0.awsstatic.com/libra-css/images/site/fav/favicon.ico"> <link rel="shortcut icon" type="image/ico" href="https://a0.awsstatic.com/libra-css/images/site/fav/favicon.ico"> <link rel="apple-touch-icon" sizes="57x57" href="https://a0.awsstatic.com/libra-css/images/site/touch-icon-iphone-114-smile.png"> <link 
rel="apple-touch-icon" sizes="72x72" href="https://a0.awsstatic.com/libra-css/images/site/touch-icon-ipad-144-smile.png"> <link rel="apple-touch-icon" sizes="114x114" href="https://a0.awsstatic.com/libra-css/images/site/touch-icon-iphone-114-smile.png"> <link rel="apple-touch-icon" sizes="144x144" href="https://a0.awsstatic.com/libra-css/images/site/touch-icon-ipad-144-smile.png"> <meta property="og:type" content="company"> <meta property="og:url" content="https://aws.amazon.com/what-is/reinforcement-learning-from-human-feedback/"> <meta property="og:image" content="https://a0.awsstatic.com/libra-css/images/logos/aws_logo_smile_1200x630.png"> <meta property="og:site_name" content="Amazon Web Services, Inc."> <meta name="facebook-domain-verification" content="ucogvbvio3zpukhjxw4pcprci7qylr"> <meta name="google-site-verification" content="XHghG81ulgiW-3EylGcF48sG28tBW5EH0bNUhgo_DrU"> <meta name="msvalidate.01" content="6F92E52A288E266E30C2797ECB5FCCF3"> <link rel="stylesheet" href="https://a0.awsstatic.com/libra-css/css/1.0.508/style-awsm-base.css"> <link rel="stylesheet" href="https://a0.awsstatic.com/libra-css/css/1.0.508/style-awsm-components.css"> <script type="esms-options">{"noLoadEventRetriggers": true, "nonce":"DlUE4RvcnUDAlTka9GJOAFZbwvgtfIxT2D2mwbLcxGI="}</script> <script async src="https://a0.awsstatic.com/eb-csr/1.0.123/polyfills/es-module-shims/es-module-shims.js"></script> <script 
type="importmap">{"imports":{"react":"https://a0.awsstatic.com/eb-csr/1.0.123/react/react.js","react/jsx-runtime":"https://a0.awsstatic.com/eb-csr/1.0.123/react/jsx-runtime.js","react-dom":"https://a0.awsstatic.com/eb-csr/1.0.123/react/react-dom.js","react-dom/server":"https://a0.awsstatic.com/eb-csr/1.0.123/react/server-browser.js","react-dom-server-browser":"https://a0.awsstatic.com/eb-csr/1.0.123/react/react-dom-server-browser.js","sanitize-html":"https://a0.awsstatic.com/eb-csr/1.0.123/sanitize-html/index.js","video.js":"https://a0.awsstatic.com/eb-csr/1.0.123/videojs/video.js","videojs-event-tracking":"https://a0.awsstatic.com/eb-csr/1.0.123/videojs/videojs-event-tracking.js","videojs-hotkeys":"https://a0.awsstatic.com/eb-csr/1.0.123/videojs/videojs-hotkeys.js","@amzn/awsmcc":"https://a0.awsstatic.com/awsmcc/1.0.0/bundle/index.js"}}</script> <script type="application/json" id="aws-page-settings"> { "supportedLanguages": ["ar","cn","de","en","es","fr","id","it","jp","ko","pt","ru","th","tr","tw","vi"], "defaultLanguage": "en", "logDataSet": "LIVE:PROD", "logInstance": "PUB", "csdsEndpoint": "https://d2c.aws.amazon.com/", "framework": "v2", "g11nLibPath": "https://a0.awsstatic.com/g11n-lib/2.0.107", "i18nStringPath": "https://i18n-string.us-west-2.prod.pricing.aws.a2z.com", "libraCSSPath": "https://a0.awsstatic.com/libra-css/css/1.0.508", "libraCSSImagePath": "https://a0.awsstatic.com/libra-css/images", "isLoggingEnabled": true, "currentLanguage": "en-US", "currentStage": "Prod", "isBJS": false, "isMarketplace": false, "isRTL": false, "requireBaseUrl": "https://a0.awsstatic.com", "requirePackages":[ { "name": "libra", "location": "libra/1.0.597" } ], "requirePaths": { "directories": "https://a0.awsstatic.com/libra/1.0.597/directories", "libra-cardsui": "https://a0.awsstatic.com/libra/1.0.597/libra-cardsui", "librastandardlib": "https://a0.awsstatic.com/libra/1.0.597/librastandardlib", "aws-blog": "https://a0.awsstatic.com/aws-blog/1.0.80/js", "plc": 
"https://a0.awsstatic.com/plc/js/1.0.138/plc", "scripts": "libra/1.0.597/v1-polyfills/scripts", "libra-search": "https://a0.awsstatic.com/libra-search/1.0.19/js", "pricing-calculator": "https://a0.awsstatic.com/pricing-calculator/js/1.0.2", "pricing-savings-plan": "https://a0.awsstatic.com/pricing-savings-plan/js/1.0.23" }, "staticAssetPath": "https://a0.awsstatic.com", "jsAssetPath": "https://a0.awsstatic.com/libra/1.0.597", "awstvVideoAssetOrigin": "https://tv.awsstatic.com", "awstvVideoAPIOrigin": "//aws.amazon.com" } </script> <script src="https://a0.awsstatic.com/libra/1.0.597/libra-head.js"></script> <script src="https://a0.awsstatic.com/s_code/js/3.0/awshome_s_code.js"></script> <script src="https://d2c.aws.amazon.com/client/loader/v1/d2c-load.js"></script> <script async src="https://a0.awsstatic.com/da/js/1.0.51/aws-da.js"></script> <link rel="stylesheet" href="https://a0.awsstatic.com/eb-csr/1.0.123/orchestrate.css"> <script type="module" async="true" src="https://a0.awsstatic.com/eb-csr/1.0.123/orchestrate.js"></script> <script type="application/json" id="target-mediator">{"pageLanguage":"en","supportedLanguages":["ar","cn","de","en","es","fr","id","it","jp","ko","pt","ru","th","tr","tw","vi"],"offerOrigin":"https://s0.awsstatic.com"}</script> <script data-js-script="target-mediator" src="https://a0.awsstatic.com/target/1.0.123/aws-target-mediator.js" async="true"></script> </head> <body class="awsm"> <script id="awsc-panorama-bundle" type="text/javascript" src="https://prod.pa.cdn.uis.awsstatic.com/panorama-nav-init.js" data-config="{"appEntity":"aws-marketing","region":"us-west-1","service":"global-site","trackerConstants":{"cookieDomain":"aws.amazon.com"}}" async="true"></script> <a id="aws-page-skip-to-main" class="lb-sr-only lb-sr-only-focusable lb-bold lb-skip-el" href="#aws-page-content-main"> Skip to main content</a> <header id="aws-page-header" class="awsm m-page-header lb-with-mobile-subrow" role="banner"> <div id="m-nav" class="m-nav" 
role="navigation" aria-label="Global Navigation"> <div class="m-nav-header lb-clearfix" data-menu-url="https://s0.awsstatic.com/en_US/nav/v3/panel-content/desktop/index.html"> <div class="m-nav-logo"> <div class="lb-bg-logo aws-amazon_web_services_smile-header-desktop-en"> <a href="https://aws.amazon.com/?nc2=h_lg"><span>Click here to return to Amazon Web Services homepage</span></a> </div> </div> <nav class="m-nav-secondary-links" style="min-width: 620px" aria-label="Secondary Global Navigation"> <a href="/about-aws/?nc2=h_header">About AWS</a> <a href="/contact-us/?nc2=h_header">Contact Us</a> <a class="lb-txt-none lb-tiny-iblock lb-txt-13 lb-txt lb-has-trigger-indicator" href="#" data-mbox-ignore="true" data-lb-popover-trigger="popover-support-selector" role="button" aria-expanded="false" aria-label="Explore support options" id="popover-popover-support-selector-trigger" aria-controls="popover-support-selector" aria-haspopup="true"> Support   <svg viewbox="0 0 16 16" fill="none" xmlns="http://www.w3.org/2000/svg" class="icon-chevron-down lb-trigger-mount"> <path d="M1 4.5L8 11.5L15 4.5" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" /> </svg> </a> <a id="m-nav-language-selector" class="lb-tiny-iblock lb-txt lb-has-trigger-indicator" href="#" data-lb-popover-trigger="popover-language-selector" data-language="en" aria-label="Set site language" role="button" aria-controls="popover-language-selector" aria-expanded="false" aria-haspopup="true"> English   <svg viewbox="0 0 16 16" fill="none" xmlns="http://www.w3.org/2000/svg" class="icon-chevron-down lb-trigger-mount"> <path d="M1 4.5L8 11.5L15 4.5" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" /> </svg> </a> <a class="lb-tiny-iblock lb-txt lb-has-trigger-indicator" href="#" data-lb-popover-trigger="popover-my-account" aria-label="Access account options" role="button" aria-controls="popover-my-account" aria-expanded="false" aria-haspopup="true"> My Account   <svg viewbox="0 0 16 16" 
fill="none" xmlns="http://www.w3.org/2000/svg" class="icon-chevron-down lb-trigger-mount"> <path d="M1 4.5L8 11.5L15 4.5" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" /> </svg> </a> <div class="m-nav-cta-btn"> <div class="lb-mbox js-mbox" data-lb-comp="mbox" data-lb-comp-ignore="true" data-mbox="en_header_nav_cta"> <div class="lb-data-attr-wrapper data-attr-wrapper" data-da-type="so" data-da-so-type="viewport" data-da-so-language="en" data-da-so-category="monitoring" data-da-so-name="nav-buttons" data-da-so-version="sign-up-sign-in-all" data-da-so-url="nav"> <div class="data-attr-wrapper lb-tiny-iblock lb-none-pad lb-box" data-da-type="so" data-da-so-type="viewport" data-da-so-language="en" data-da-so-category="monitoring" data-da-so-name="nav-buttons" data-da-so-version="prospect-sign-in" data-da-so-url="all"> <a class="lb-txt-none lb-tiny-iblock lb-txt-13 lb-txt" style="padding-top:8px; padding-right:13px;" href="https://console.aws.amazon.com/console/home?nc2=h_ct&src=header-signin"> Sign In</a> </div> <div class="data-attr-wrapper lb-tiny-iblock lb-none-v-margin lb-btn" data-da-type="so" data-da-so-type="viewport" data-da-so-language="en" data-da-so-category="monitoring" data-da-so-name="nav-buttons" data-da-so-version="prospect-signup" data-da-so-url="all"> <a class="lb-btn-p-primary" href="https://portal.aws.amazon.com/gp/aws/developer/registration/index.html?nc2=h_ct&src=header_signup" data-trk-params="{"trkOverrideWithQs":true}" role="button"> <span> Create an AWS Account</span> </a> </div> </div> </div> <div class="lb-mbox js-mbox" data-lb-comp="mbox" data-lb-comp-ignore="true" data-mbox="en_header_desktop_nav_cta_test"> <div class="data-attr-wrapper lb-tiny-iblock lb-none-pad lb-box" style="padding-top:2px; padding-left:13px;" data-da-type="so" data-da-so-category="monitoring" data-da-so-language="en" data-da-so-name="builder-id-dropdown-button" data-da-so-type="viewport" data-da-so-version="main-button-clicks" data-da-so-url="desktop"> 
<div class="lb-tiny-iblock lb-box"> <div class="lb-tiny-iblock lb-micro-v-margin lb-btn lb-icon-only" data-myaws-auth-hidden-only="true"> <a class="lb-btn-da-primary-rounded" href="#" data-mbox-ignore="true" data-lb-popover-trigger="signed-out-options" role="button" aria-expanded="false" aria-label="AWS Builder Id options" id="popover-signed-out-options-trigger" aria-controls="signed-out-options" aria-haspopup="true"> <span> <i class="icon-user-o-aura lb-before"></i></span> </a> </div> <div class="lb-tiny-iblock lb-micro-v-margin m-no-auth lb-btn lb-icon-only" data-myaws-auth-only="true"> <a class="lb-btn-da-primary-rounded" href="#" data-mbox-ignore="true" data-lb-popover-trigger="signed-in-options" role="button" aria-expanded="false" aria-label="AWS Builder Id options" id="popover-signed-in-options-trigger" aria-controls="signed-in-options" aria-haspopup="true"> <span> <i class="icon-user-aura lb-before"></i></span> </a> </div> </div> <div class="lb-none-pad lb-popover lb-popover-rounded lb-popover-mid-small" style="padding-top:40px; padding-left:40px; padding-bottom:40px; padding-right:40px;" data-lb-comp="popover" data-id="signed-out-options" id="signed-out-options" aria-modal="false" aria-labelledby="popover-signed-out-options-trigger" data-action="hover" data-position="top"> <a class="lb-popover-close" role="button" tabindex="0" aria-label="Close" title="Close"> <span class="lb-sr-only">Close</span> </a> <div class="lb-tiny-align-center lb-txt-bold lb-txt-none lb-txt-20 lb-none-v-margin lb-txt"> Profile </div> <div class="lb-tiny-align-center lb-txt-none lb-none-v-margin lb-txt"> Your profile helps improve your interactions with select AWS experiences. 
</div> <div class="lb-none-pad lb-none-v-margin lb-box" style="margin-top:32px;"> <div class="lb-data-attr-wrapper data-attr-wrapper" data-da-type="so" data-da-so-category="monitoring" data-da-so-language="en" data-da-so-name="builder-id-dropdown-button" data-da-so-type="viewport" data-da-so-version="sign-in-button" data-da-so-url="desktop"> <div class="lb-xlarge-radius lb-border-p lb-none-pad lb-box" style="background-color:rgb(17,22,29); color:rgb(17,22,29); border-color:rgb(17,22,29);"> <a class="lb-tiny-align-center lb-txt-none lb-none-pad lb-none-v-margin lb-txt" style="padding-top:5px; color:#f5f5f5; padding-bottom:5px;" data-myaws-requested-url="true" href="https://auth.aws.amazon.com/sign-in"> Login</a> </div> </div> </div> </div> <div class="lb-none-pad lb-popover lb-popover-rounded lb-popover-mid-small" style="padding-top:40px; padding-left:40px; padding-bottom:40px; padding-right:40px;" data-lb-comp="popover" data-id="signed-in-options" id="signed-in-options" aria-modal="false" aria-labelledby="popover-signed-in-options-trigger" data-action="hover" data-position="top"> <a class="lb-popover-close" role="button" tabindex="0" aria-label="Close" title="Close"> <span class="lb-sr-only">Close</span> </a> <div class="lb-tiny-align-center lb-txt-bold lb-txt-none lb-txt-20 lb-none-v-margin lb-txt"> Profile </div> <div class="lb-tiny-align-center lb-txt-none lb-none-v-margin lb-txt"> Your profile helps improve your interactions with select AWS experiences. 
</div> <div class="lb-none-pad lb-none-v-margin lb-box" style="margin-top:32px;"> <div class="lb-data-attr-wrapper data-attr-wrapper" data-da-type="so" data-da-so-category="monitoring" data-da-so-language="en" data-da-so-name="builder-id-dropdown-button" data-da-so-type="viewport" data-da-so-version="view-profile" data-da-so-url="desktop"> <div class="lb-xlarge-radius lb-border-p lb-none-pad lb-box" style="color:rgb(17,22,29); border-color:rgb(17,22,29);"> <a class="lb-tiny-align-center lb-txt-none lb-none-v-margin lb-txt" style="padding-top:5px; color:rgb(17,22,29); padding-bottom:5px;" href="https://aws.amazon.com/profile"> View profile</a> </div> </div> <div class="lb-data-attr-wrapper data-attr-wrapper" data-da-type="so" data-da-so-category="monitoring" data-da-so-language="en" data-da-so-name="builder-id-dropdown-button" data-da-so-type="viewport" data-da-so-version="log-out" data-da-so-url="desktop"> <a class="lb-tiny-align-center lb-txt-none lb-none-v-margin lb-txt" style="color:rgb(17,22,29); margin-top:16px;" data-myaws-requested-url="true" href="https://auth.aws.amazon.com/sign-out"> Log out</a> </div> </div> </div> </div> </div> </div> </nav> <div class="m-nav-primary-group"> <nav class="m-nav-primary-links" aria-label="Primary Global Navigation"> <i class="m-nav-angle-left-icon" aria-hidden="true"></i> <ul> <li aria-expanded="false"><span><a href="https://aws.amazon.com/q/?nc2=h_ql_prod_l1_q" class="m-nav-featured">Amazon Q</a></span></li> <li aria-expanded="false"><span><a href="/products/?nc2=h_ql_prod" data-panel="m-nav-panel-products">Products</a></span></li> <li aria-expanded="false"><span><a href="/solutions/?nc2=h_ql_sol" data-panel="m-nav-panel-solutions">Solutions</a></span></li> <li aria-expanded="false"><span><a href="/pricing/?nc2=h_ql_pr" data-panel="m-nav-panel-pricing">Pricing</a></span></li> <li aria-expanded="false"><span><a href="https://aws.amazon.com/documentation-overview/?nc2=h_ql_doc_do" 
data-panel="m-nav-panel-documentation">Documentation</a></span></li> <li aria-expanded="false"><span><a href="/getting-started/?nc2=h_ql_le" data-panel="m-nav-panel-learn">Learn</a></span></li> <li aria-expanded="false"><span><a href="/partners/?nc2=h_ql_pn" data-panel="m-nav-panel-partner">Partner Network</a></span></li> <li aria-expanded="false"><span><a href="https://aws.amazon.com/marketplace/?nc2=h_ql_mp" data-panel="m-nav-panel-marketplace">AWS Marketplace</a></span></li> <li aria-expanded="false"><span><a href="/customer-enablement/?nc2=h_ql_ce" data-panel="m-nav-panel-customer">Customer Enablement</a></span></li> <li aria-expanded="false"><span><a href="/events/?nc2=h_ql_ev" data-panel="m-nav-panel-events">Events</a></span></li> <li aria-expanded="false"><span><a href="/contact-us/?nc2=h_ql_exm" data-panel="m-nav-panel-more">Explore More </a></span></li> </ul> <div class="m-nav-icon-group"> <i class="m-nav-angle-right-icon" aria-hidden="true"></i> <button class="m-nav-search-icon" tabindex="0" aria-expanded="false" aria-label="Search"> <svg viewbox="0 0 16 16" fill="none" xmlns="http://www.w3.org/2000/svg" class="icon-magnify"> <path d="M10.5 10.5L14.5 14.5" stroke-width="2" stroke-linejoin="round" /> <path d="M7 12.5C10.0376 12.5 12.5 10.0376 12.5 7C12.5 3.96243 10.0376 1.5 7 1.5C3.96243 1.5 1.5 3.96243 1.5 7C1.5 10.0376 3.96243 12.5 7 12.5Z" stroke-width="2" stroke-linejoin="round" /> </svg> </button> </div> </nav> <div id="m-nav-desktop-search" class="m-nav-search"> <form action="https://aws.amazon.com/search/" role="search"> <div class="m-typeahead" data-directory-id="typeahead-suggestions" data-lb-comp="typeahead"> <input class="m-nav-search-field" placeholder="Search" autocomplete="off" spellcheck="false" dir="auto" type="text" name="searchQuery"> </div> </form> <i class="m-nav-close-icon" role="button" aria-label="Close"></i> </div> </div> </div> <div class="lb-popover lb-popover-aui lb-popover-tiny" data-lb-comp="popover" 
data-id="popover-language-selector" id="popover-language-selector" aria-modal="false" aria-labelledby="popover-popover-language-selector-trigger" data-action="hover" data-position="top"> <a class="lb-popover-close" role="button" tabindex="0" aria-label="Close" title="Close"> <span class="lb-sr-only">Close</span> </a> <div class="lb-grid lb-row lb-row-max-large lb-snap"> <div class="lb-col lb-tiny-24 lb-mid-12"> <ul class="lb-txt-none lb-ul lb-list-style-none lb-tiny-ul-block"> <li lang="ar-SA" translate="no" data-language="ar"><a href="https://aws.amazon.com/ar/?nc1=h_ls">عربي</a></li> <li lang="id-ID" translate="no" data-language="id"><a href="https://aws.amazon.com/id/?nc1=h_ls">Bahasa Indonesia</a></li> <li lang="de-DE" translate="no" data-language="de"><a href="https://aws.amazon.com/de/?nc1=h_ls">Deutsch</a></li> <li lang="en-US" translate="no" data-language="en"><a href="https://aws.amazon.com/?nc1=h_ls">English</a></li> <li lang="es-ES" translate="no" data-language="es"><a href="https://aws.amazon.com/es/?nc1=h_ls">Español</a></li> <li lang="fr-FR" translate="no" data-language="fr"><a href="https://aws.amazon.com/fr/?nc1=h_ls">Français</a></li> <li lang="it-IT" translate="no" data-language="it"><a href="https://aws.amazon.com/it/?nc1=h_ls">Italiano</a></li> <li lang="pt-BR" translate="no" data-language="pt"><a href="https://aws.amazon.com/pt/?nc1=h_ls">Português</a></li> </ul> </div> <div class="lb-col lb-tiny-24 lb-mid-12"> <ul class="lb-txt-none lb-ul lb-list-style-none lb-tiny-ul-block"> <li lang="vi-VN" translate="no" data-language="vi"><a href="https://aws.amazon.com/vi/?nc1=f_ls">Tiếng Việt</a></li> <li lang="tr-TR" translate="no" data-language="tr"><a href="https://aws.amazon.com/tr/?nc1=h_ls">Türkçe</a></li> <li lang="ru-RU" translate="no" data-language="ru"><a href="https://aws.amazon.com/ru/?nc1=h_ls">Ρусский</a></li> <li lang="th-TH" translate="no" data-language="th"><a href="https://aws.amazon.com/th/?nc1=f_ls">ไทย</a></li> <li lang="ja-JP" 
translate="no" data-language="jp"><a href="https://aws.amazon.com/jp/?nc1=h_ls">日本語</a></li> <li lang="ko-KR" translate="no" data-language="ko"><a href="https://aws.amazon.com/ko/?nc1=h_ls">한국어</a></li> <li lang="zh-CN" translate="no" data-language="cn"><a href="https://aws.amazon.com/cn/?nc1=h_ls">中文 (简体)</a></li> <li lang="zh-TW" translate="no" data-language="tw"><a href="https://aws.amazon.com/tw/?nc1=h_ls">中文 (繁體)</a></li> </ul> </div> </div> </div> <div class="lb-popover lb-popover-aui lb-popover-tiny" data-lb-comp="popover" data-id="popover-my-account" id="popover-my-account" aria-modal="false" aria-labelledby="popover-popover-my-account-trigger" data-action="hover" data-position="top"> <a class="lb-popover-close" role="button" tabindex="0" aria-label="Close" title="Close"> <span class="lb-sr-only">Close</span> </a> <ul class="lb-txt-none lb-ul lb-list-style-none lb-tiny-ul-block"> <li class="m-no-auth" data-myaws-auth-only="true"><a href="/profile/?nc2=h_m_mc">My Profile</a></li> <li class="m-no-auth" data-myaws-auth-only="true"><a href="https://auth.aws.amazon.com/sign-out/?nc2=h_m_mc">Sign out of AWS Builder ID</a></li> <li><a href="https://console.aws.amazon.com/?nc2=h_m_mc">AWS Management Console</a></li> <li><a href="https://console.aws.amazon.com/billing/home#/account?nc2=h_m_ma">Account Settings</a></li> <li><a href="https://console.aws.amazon.com/billing/home?nc2=h_m_bc">Billing & Cost Management</a></li> <li><a href="https://console.aws.amazon.com/iam/home?nc2=h_m_sc#security_credential">Security Credentials</a></li> <li><a href="https://phd.aws.amazon.com/?nc2=h_m_sc">AWS Personal Health Dashboard</a></li> </ul> </div> <div class="lb-popover lb-popover-aui lb-popover-tiny" data-lb-comp="popover" data-id="popover-support-selector" id="popover-support-selector" aria-modal="false" aria-labelledby="popover-popover-support-selector-trigger" data-action="hover" data-position="top"> <a class="lb-popover-close" role="button" tabindex="0" aria-label="Close" 
title="Close"> <span class="lb-sr-only">Close</span> </a> <ul class="lb-txt-none lb-ul lb-list-style-none lb-tiny-ul-block"> <li><a href="https://console.aws.amazon.com/support/home/?nc2=h_ql_cu">Support Center</a></li> <li><a href="https://iq.aws.amazon.com/?utm=mkt.nav">Expert Help</a></li> <li><a href="https://repost.aws/knowledge-center/?nc2=h_m_ma">Knowledge Center</a></li> <li><a href="/premiumsupport/?nc2=h_m_bc">AWS Support Overview</a></li> <li><a href="https://repost.aws/">AWS re:Post</a></li> </ul> </div> <script type="text/x-handlebars-template" data-hbs-template-path="nav-desktop/suggestions" data-hbs-context="{"pricingText":"Pricing","documentationText":"Documentation","calculatorText":"Calculator"}"></script> <script type="text/x-handlebars-template" data-hbs-template-path="nav-desktop/products-head" data-hbs-context="{"productsText":"Products"}"></script> <script type="text/x-handlebars-template" data-hbs-template-path="nav-desktop/keypages-head" data-hbs-context="{"relatedPagesText":"Related Pages"}"></script> <script type="text/x-handlebars-template" data-hbs-template-path="nav-desktop/tutorials-head" data-hbs-context="{"tutorialsText":"Tutorials"}"></script> <script type="text/x-handlebars-template" data-hbs-template-path="nav-desktop/blogs-head" data-hbs-context="{"blogsText":"Blogs"}"></script> <script type="text/x-handlebars-template" data-hbs-template-path="nav-desktop/see-all" data-hbs-context="{"resultsText":"See more results for"}"></script> </div> <div id="m-nav-mobile" class="m-nav-mobile" role="navigation" aria-label="Global Navigation for Mobile"> <div id="m-nav-mobile-header" class="m-nav-mobile-header m-nav-mobile-with-sub-row" data-menu-url="https://s0.awsstatic.com/en_US/nav/v3/panel-content/mobile/index.html"> <div class="lb-bg-logo aws-amazon_web_services_smile-header-mobile-en"> <a href="https://aws.amazon.com/?nc2=h_lg"><span>Click here to return to Amazon Web Services homepage</span></a> </div> <div 
class="m-nav-mobile-button-group"> <button class="m-nav-mobile-button icon-search" tabindex="0" aria-expanded="false" aria-label="Search"> <svg viewbox="0 0 16 16" fill="none" xmlns="http://www.w3.org/2000/svg"> <path d="M10.5 10.5L14.5 14.5" stroke-width="2" stroke-linejoin="round" /> <path d="M7 12.5C10.0376 12.5 12.5 10.0376 12.5 7C12.5 3.96243 10.0376 1.5 7 1.5C3.96243 1.5 1.5 3.96243 1.5 7C1.5 10.0376 3.96243 12.5 7 12.5Z" stroke-width="2" stroke-linejoin="round" /> </svg> </button> <button class="m-nav-mobile-button icon-reorder" tabindex="0" aria-expanded="false" aria-label="Menu"> <svg viewbox="0 0 16 16" fill="none" xmlns="http://www.w3.org/2000/svg"> <path d="M15 3H1" stroke-width="2" stroke-linejoin="round" /> <path d="M15 8H1" stroke-width="2" stroke-linejoin="round" /> <path d="M15 13H1" stroke-width="2" stroke-linejoin="round" /> </svg> </button> <div class="lb-mbox js-mbox" data-lb-comp="mbox" data-lb-comp-ignore="true" data-mbox="en_header_mobile_nav_cta_test"> <div class="data-attr-wrapper lb-none-pad lb-none-v-margin lb-box" style="padding-top:0px; padding-left:0px; padding-bottom:0px; margin-top:10px; padding-right:0px;" data-da-type="so" data-da-so-category="monitoring" data-da-so-language="en" data-da-so-name="builder-id-dropdown-button" data-da-so-type="viewport" data-da-so-version="main-button-clicks" data-da-so-url="mobile"> <div class="data-attr-wrapper lb-none-v-margin lb-box" style="padding-top:0px; padding-left:10px; padding-bottom:0px; margin-top:10px; padding-right:27px;" data-da-type="so" data-da-so-category="monitoring" data-da-so-language="en" data-da-so-name="builder-id-dropdown-button" data-da-so-type="viewport" data-da-so-version="main-button-clicks" data-da-so-url="mobile"> <div class="lb-none-v-margin lb-btn lb-icon-only" data-myaws-auth-hidden-only="true"> <a class="lb-btn-da-primary-rounded" href="#" data-mbox-ignore="true" data-lb-popover-trigger="signed-out-options-mobile" role="button" aria-expanded="false" aria-label="AWS 
Builder Id options" id="popover-signed-out-options-mobile-trigger" aria-controls="signed-out-options-mobile" aria-haspopup="true"> <span> <i class="icon-user-o-aura lb-before"></i></span> </a> </div> <div class="lb-none-v-margin m-no-auth lb-btn lb-icon-only" data-myaws-auth-only="true"> <a class="lb-btn-da-primary-rounded" href="#" data-mbox-ignore="true" data-lb-popover-trigger="signed-in-options-mobile" role="button" aria-expanded="false" aria-label="AWS Builder Id options" id="popover-signed-in-options-mobile-trigger" aria-controls="signed-in-options-mobile" aria-haspopup="true"> <span> <i class="icon-user-aura lb-before"></i></span> </a> </div> <div class="lb-none-pad lb-popover lb-popover-rounded lb-popover-small" style="padding-top:40px; padding-left:40px; padding-bottom:40px; padding-right:40px;" data-lb-comp="popover" data-id="signed-out-options-mobile" id="signed-out-options-mobile" aria-modal="false" aria-labelledby="popover-signed-out-options-mobile-trigger" data-action="clickOnly" data-position="top"> <a class="lb-popover-close" role="button" tabindex="0" aria-label="Close" title="Close"> <span class="lb-sr-only">Close</span> </a> <div class="lb-tiny-align-center lb-txt-bold lb-txt-none lb-txt-20 lb-none-v-margin lb-txt"> Profile </div> <div class="lb-tiny-align-center lb-txt-none lb-none-v-margin lb-txt"> Your profile helps improve your interactions with select AWS experiences. 
</div> <div class="lb-none-pad lb-none-v-margin lb-box" style="margin-top:32px;"> <div class="lb-data-attr-wrapper data-attr-wrapper" data-da-type="so" data-da-so-category="monitoring" data-da-so-language="en" data-da-so-name="builder-id-dropdown-button" data-da-so-type="viewport" data-da-so-version="sign-in-button" data-da-so-url="mobile"> <div class="lb-xlarge-radius lb-border-p lb-none-pad lb-box" style="background-color:rgb(17,22,29); color:rgb(17,22,29); border-color:rgb(17,22,29);"> <a class="lb-tiny-align-center lb-txt-none lb-none-pad lb-none-v-margin lb-txt" style="padding-top:5px; color:#f5f5f5; padding-bottom:5px;" data-myaws-requested-url="true" href="https://auth.aws.amazon.com/sign-in"> Login</a> </div> </div> </div> </div> <div class="lb-none-pad lb-popover lb-popover-rounded lb-popover-small" style="padding-top:40px; padding-left:40px; padding-bottom:40px; padding-right:40px;" data-lb-comp="popover" data-id="signed-in-options-mobile" id="signed-in-options-mobile" aria-modal="false" aria-labelledby="popover-signed-in-options-mobile-trigger" data-action="clickOnly" data-position="top"> <a class="lb-popover-close" role="button" tabindex="0" aria-label="Close" title="Close"> <span class="lb-sr-only">Close</span> </a> <div class="lb-tiny-align-center lb-txt-bold lb-txt-none lb-txt-20 lb-none-v-margin lb-txt"> Profile </div> <div class="lb-tiny-align-center lb-txt-none lb-none-v-margin lb-txt"> Your profile helps improve your interactions with select AWS experiences. 
</div> <div class="lb-none-pad lb-none-v-margin lb-box" style="margin-top:32px;"> <div class="lb-data-attr-wrapper data-attr-wrapper" data-da-type="so" data-da-so-category="monitoring" data-da-so-language="en" data-da-so-name="builder-id-dropdown-button" data-da-so-type="viewport" data-da-so-version="view-profile" data-da-so-url="mobile"> <div class="lb-xlarge-radius lb-border-p lb-none-pad lb-box" style="color:rgb(17,22,29); border-width:2px; border-color:rgb(17,22,29);"> <a class="lb-tiny-align-center lb-txt-none lb-none-v-margin lb-txt" style="padding-top:5px; color:rgb(17,22,29); padding-bottom:5px;" href="https://aws.amazon.com/profile"> View profile</a> </div> </div> <div class="lb-data-attr-wrapper data-attr-wrapper" data-da-type="so" data-da-so-category="monitoring" data-da-so-language="en" data-da-so-name="builder-id-dropdown-button" data-da-so-type="viewport" data-da-so-version="log-out" data-da-so-url="mobile"> <a class="lb-tiny-align-center lb-txt-none lb-none-v-margin lb-txt" style="color:rgb(17,22,29); margin-top:16px;" data-myaws-requested-url="true" href="https://auth.aws.amazon.com/sign-out"> Log out</a> </div> </div> </div> </div> </div> </div> </div> <div id="m-nav-mobile-sub-row" class="m-nav-mobile-sub-row"> <div class="data-attr-wrapper lb-btn" data-da-type="so" data-da-so-category="monitoring" data-da-so-language="en" data-da-so-name="global-mobile-sticky-cta-buttons" data-da-so-type="viewport" data-da-so-version="get-started-for-free-cta" data-da-so-url="all"> <a class="lb-btn-p-primary" href="https://portal.aws.amazon.com/gp/aws/developer/registration/index.html?nc2=h_mobile" role="button"> <span> Get Started for Free</span> </a> </div> <div class="data-attr-wrapper lb-btn" data-da-type="so" data-da-so-category="monitoring" data-da-so-language="en" data-da-so-name="global-mobile-sticky-cta-buttons" data-da-so-type="viewport" data-da-so-version="contact-us"> <a class="lb-btn-p" href="https://aws.amazon.com/contact-us/?nc2=h_mobile" 
role="button"> <span> Contact Us</span> </a> </div> </div> </div> <div id="m-nav-mobile-search" class="m-nav-mobile-search"> <form action="https://aws.amazon.com/search" role="search"> <div class="m-typeahead"> <input class="m-nav-search-field" placeholder="Search" autocomplete="off" spellcheck="false" dir="auto" type="text" name="searchQuery"> </div> </form> </div> <nav id="m-nav-trimdown" aria-label="Condensed Global Navigation for Mobile"> <ul class="m-nav-mobile-menu-group"> <li> <a href="/products/?nc2=h_mo"> <span class="m-nav-link-title">Products</span> </a> </li> <li> <a href="/solutions/?nc2=h_mo"> <span class="m-nav-link-title">Solutions</span> </a> </li> <li> <a href="/pricing/?nc2=h_mo"> <span class="m-nav-link-title">Pricing</span> </a> </li> <li> <a href="/what-is-aws/?nc2=h_mo"> <span class="m-nav-link-title">Introduction to AWS</span> </a> </li> <li> <a href="/getting-started/?nc2=h_mo"> <span class="m-nav-link-title">Getting Started</span> </a> </li> <li> <a href="https://aws.amazon.com/documentation-overview/?nc2=h_mo"> <span class="m-nav-link-title">Documentation</span> </a> </li> <li> <a href="/training/?nc2=h_mo"> <span class="m-nav-link-title">Training and Certification</span> </a> </li> <li> <a href="/developer/?nc2=h_mo"> <span class="m-nav-link-title">Developer Center</span> </a> </li> <li> <a href="/solutions/case-studies/?nc2=h_mo"> <span class="m-nav-link-title">Customer Success</span> </a> </li> <li> <a href="/partners/?nc2=h_mo"> <span class="m-nav-link-title">Partner Network</span> </a> </li> <li> <a href="https://aws.amazon.com/marketplace/?nc2=h_mo"> <span class="m-nav-link-title">AWS Marketplace</span> </a> </li> <li> <a href="https://console.aws.amazon.com/support/home?nc2=h_ql_cu"> <span class="m-nav-link-title">Support</span> </a> </li> <li> <a href="https://repost.aws/"> <span class="m-nav-link-title">AWS re:Post</span> </a> </li> <li> <a href="https://console.aws.amazon.com/console/home"> <span class="m-nav-link-title">Log 
into Console</span> </a> </li> <li> <a href="/console/mobile/"> <span class="m-nav-link-title">Download the Mobile App</span> </a> </li> </ul> </nav> </div> </header> <div id="aws-page-content" class="lb-page-content" style="padding-top:0px; padding-bottom:0px;" data-page-alert-target="true"> <main id="aws-page-content-main" role="main" tabindex="-1"> <div data-eb-slot="what-is-header" data-eb-slot-meta="{'version':'1.0','slotId':'what-is-header','experienceId':'93f2c10b-57a0-4aac-a291-b4b33afe10b1','allowBlank':false,'hasAltExp':false,'isRTR':false,'filters':{'limit':1,'query':'id \u003d \'what-is-reinforcement-learning-from-human-feedback\''}}"> <div data-eb-tpl-n="awsm-what-is/what-is-header" data-eb-tpl-v="1.0.1" data-eb-ce="" data-eb-c-scope="what-is-header" data-eb-d-scope="DIRECTORIES" data-eb-locale="en-US" data-eb-99e83dc4="" data-eb-ssr-ce="" data-eb-tpl-ns="awsmWhatIs"> <style>[data-eb-99e83dc4] .eb-what-is-header{background-color:#1e2832;background-image:url("//d1.awsstatic.com/r2018/h/QuickSight Q/Site Merch/SiteMerch-QuickSightQ_Hero-BG.c455f708c1d1da51ca3520e7678b415423fd06a5.png")}[data-eb-99e83dc4] .eb-what-is-header .eb-headline{color:#fff;margin-top:0;margin-bottom:0}[data-eb-99e83dc4] .eb-what-is-header .eb-breadcrumbs{position:relative;margin:0;padding:0;list-style:none;color:#d1d5db}[data-eb-99e83dc4] .eb-what-is-header .eb-breadcrumbs-link{position:relative;margin-right:6px;padding-left:11px;color:#539fe5}[data-eb-99e83dc4] .eb-what-is-header .eb-breadcrumbs-link:hover{color:#89bdee}[data-eb-99e83dc4] .eb-what-is-header .eb-breadcrumbs-link:focus{text-decoration:none;outline-offset:2px;outline:#0972d3 solid 2px;border-radius:2px}[data-eb-99e83dc4] .eb-what-is-header .eb-breadcrumbs-link:before{position:absolute;top:-2px;left:0;color:#d1d5db;content:"/"}[data-eb-99e83dc4] .eb-what-is-header .eb-breadcrumbs-item{margin-bottom:0;display:inline-block}[data-eb-99e83dc4] .eb-what-is-header .eb-breadcrumbs-item:first-of-type 
.eb-breadcrumbs-link{padding-left:0}[data-eb-99e83dc4] .eb-what-is-header .eb-breadcrumbs-item:first-of-type .eb-breadcrumbs-link:before{content:none}</style> <script type="application/json">{"data":{"items":[{"fields":{"primaryCTAText":"Create an AWS Account","description":"<p>Reinforcement learning from human feedback (RLHF) is a machine learning (ML) technique that uses human feedback to optimize ML models to self-learn more efficiently. Reinforcement learning (RL) techniques train software to make decisions that maximize rewards, making their outcomes more accurate. RLHF incorporates human feedback in the rewards function, so the ML model can perform tasks more aligned with human goals, wants, and needs. RLHF is used throughout generative artificial intelligence (generative AI) applications, including in large language models (LLM).</p>","sortDate":"2023-12-07","headlineUrl":"https://aws.amazon.com/what-is/reinforcement-learning-from-human-feedback/?trk=faq_card","id":"faq-hub#what-is-reinforcement-learning-from-human-feedback","category":"Machine Learning","primaryCTA":"https://portal.aws.amazon.com/gp/aws/developer/registration/index.html?pg=what_is_header","headline":"What is RLHF?"},"metadata":{"tags":[{"id":"GLOBAL#tech-category#machine-learning","name":"Machine Learning","namespaceId":"GLOBAL#tech-category","description":"Machine Learning","metadata":{}},{"id":"GLOBAL#tech-category#gen-ai","name":"Generative AI","namespaceId":"GLOBAL#tech-category","description":"<p>Generative 
AI</p>","metadata":{}},{"id":"faq-hub#faq-type#what-is","name":"what-is","namespaceId":"faq-hub#faq-type","description":"<p>what-is</p>","metadata":{}}]}}]},"metadata":{"auth":{},"testAttributes":{}},"context":{"page":{"pageUrl":"https://aws.amazon.com/what-is/reinforcement-learning-from-human-feedback/"},"environment":{"stage":"prod","region":"us-east-1"},"sdkVersion":"1.0.129"},"refMap":{"manifest.js":"289765ed09","what-is-header.js":"2e0d22c000","what-is-header.rtl.css":"ccf4035484","what-is-header.css":"ce47058367","what-is-header.css.js":"004a4704e8","what-is-header.rtl.css.js":"f687973e4f"},"settings":{"templateMappings":{"category":"category","headline":"headline","primaryCTA":"primaryCTA","primaryCTAText":"primaryCTAText","primaryBreadcrumbText":"primaryBreadcrumbText","primaryBreadcrumbURL":"primaryBreadcrumbURL"}}}</script> <div data-eb-tpl-root="" data-reactroot=""> <div class="eb-what-is-header lb-bg-left-top-cover lb-mid-pad lb-none-v-margin lb-grid" data-eb-item-id="faq-hub#what-is-reinforcement-learning-from-human-feedback" data-eb-tags="[{"id":"GLOBAL#tech-category#machine-learning","name":"Machine Learning","namespaceId":"GLOBAL#tech-category","description":"Machine Learning","metadata":{}},{"id":"GLOBAL#tech-category#gen-ai","name":"Generative AI","namespaceId":"GLOBAL#tech-category","description":"<p>Generative AI</p>\n","metadata":{}},{"id":"faq-hub#faq-type#what-is","name":"what-is","namespaceId":"faq-hub#faq-type","description":"<p>what-is</p>\n","metadata":{}}]"> <script type="application/ld+json">{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"name":"What is Cloud Computing?","item":"https://aws.amazon.com/what-is-cloud-computing/"},{"@type":"ListItem","position":2,"name":"Cloud Computing Concepts Hub","item":"https://aws.amazon.com/what-is/"},{"@type":"ListItem","position":3,"name":"Generative 
AI","item":"https://aws.amazon.com/ai/generative-ai/"},{"@type":"ListItem","position":4,"name":"Machine Learning","item":"https://aws.amazon.com/ai/machine-learning/"}]}</script> <div class="lb-row lb-row-max-large lb-snap"> <div class="lb-col lb-tiny-24 lb-mid-24"> <div class="lb-txt-p-cobalt lb-rtxt"> <ul class="eb-breadcrumbs"> <li class="eb-breadcrumbs-item"><a class="eb-breadcrumbs-link" title="What is Cloud Computing?" href="https://aws.amazon.com/what-is-cloud-computing/">What is Cloud Computing?</a></li> <li class="eb-breadcrumbs-item"><a class="eb-breadcrumbs-link" title="Cloud Computing Concepts Hub" href="https://aws.amazon.com/what-is/">Cloud Computing Concepts Hub</a></li> <li class="eb-breadcrumbs-item"><a class="eb-breadcrumbs-tags eb-breadcrumbs-link" href="/ai/generative-ai/">Generative AI</a></li> <li class="eb-breadcrumbs-item"><a class="eb-breadcrumbs-tags eb-breadcrumbs-link" href="/ai/machine-learning/">Machine Learning</a></li> </ul> </div> <h1 class="eb-headline lb-txt-none lb-h1 lb-title">What is RLHF?</h1> <br> <div class="lb-small-show lb-mid-iblock lb-large-iblock lb-xlarge-iblock lb-btn"> <a class="lb-btn-p-primary" href="https://portal.aws.amazon.com/gp/aws/developer/registration/index.html?pg=what_is_header" role="button" rel="noopener" target="_blank"><span>Create an AWS Account</span></a> </div> <div class="lb-none-pad lb-none-v-margin lb-grid lb-row lb-row-max-large lb-snap" style="margin-top:10px;margin-bottom:0px"> <div class="lb-col lb-tiny-24 lb-mid-8"></div> <div class="lb-col lb-tiny-24 lb-mid-8"></div> <div class="lb-col lb-tiny-24 lb-mid-8"></div> </div> </div> </div> </div> </div> </div> </div> <div class="lb-tiny-hide lb-small-show lb-mid-show lb-large-show lb-xlarge-show lb-none-pad lb-none-v-margin lb-box"> <div class="lb-mbox js-mbox" data-lb-comp="mbox" data-lb-comp-ignore="true" data-mbox="en_what-is-editorial-upper"> <div class="lb-none-pad lb-small-v-margin lb-xb-grid-wrap" style="margin-bottom:0px;"> <div 
class="lb-xb-grid lb-row-max-large lb-xb-equal-height lb-snap lb-tiny-xb-1 lb-small-xb-2 lb-large-xb-4"> <div class="lb-xbcol"> <div class="lb-border-p data-attr-wrapper lb-box lb-has-link" data-da-type="ha" data-da-channel="ha" data-da-language="en" data-da-placement="ed1" data-da-campaign="aware_what-is-seo-pages" data-da-content="awssm-11373_aware" data-da-trk="db59842b-4cd4-487c-9dc9-2697ca5ac231~ha_awssm-11373_aware"> <a href="/free/machine-learning/?sc_icampaign=aware_what-is-seo-pages&sc_ichannel=ha&sc_icontent=awssm-11373_aware&sc_iplace=ed&trk=db59842b-4cd4-487c-9dc9-2697ca5ac231~ha_awssm-11373_aware"> <figure class="lb-none-v-margin lb-img"> <div> <img src="https://d1.awsstatic.com/Free-Tier_64.f14d1a130811a363bbea22de4bb589f9ab801dfb.png" alt=" " title=" " class="cq-dd-image"> </div> </figure> <div class="lb-txt-bold lb-txt-none lb-txt-blue-600 lb-small-v-margin lb-txt" style="margin-bottom:0px;"> Explore Free Machine Learning Offers </div> <div class="lb-txt-none lb-txt-squid lb-txt-13 lb-txt"> Build, deploy, and run machine learning applications in the cloud for free </div> </a> </div> </div> <div class="lb-xbcol"> <div class="lb-border-p data-attr-wrapper lb-box lb-has-link" data-da-type="ha" data-da-channel="ha" data-da-language="en" data-da-placement="ed2" data-da-campaign="aware_what-is-seo-pages" data-da-content="awssm-11373_aware" data-da-trk="9d314323-eb6b-4b2e-a2d5-0aa2f06297ea~ha_awssm-11373_aware"> <a href="/ai/machine-learning/?sc_icampaign=aware_what-is-seo-pages&sc_ichannel=ha&sc_icontent=awssm-11373_aware&sc_iplace=ed&trk=9d314323-eb6b-4b2e-a2d5-0aa2f06297ea~ha_awssm-11373_aware"> <figure class="lb-none-v-margin lb-img"> <div> <img src="https://d1.awsstatic.com/Machine-Learning_64.1408e2e83c4e428dd55ba4baaaaf769d14e0a9e1.png" alt=" " title=" " class="cq-dd-image"> </div> </figure> <div class="lb-txt-bold lb-txt-none lb-txt-blue-600 lb-small-v-margin lb-txt" style="margin-bottom:0px;"> Check out Machine Learning Services </div> <div 
class="lb-txt-none lb-txt-13 lb-txt"> Innovate faster with the most comprehensive set of ML services </div> </a> </div> </div> <div class="lb-xbcol"> <div class="lb-border-p data-attr-wrapper lb-box lb-has-link" data-da-type="ha" data-da-channel="ha" data-da-language="en" data-da-placement="ed3" data-da-campaign="aware_what-is-seo-pages" data-da-content="awssm-11373_aware" data-da-trk="4fefcf6d-2df2-4443-8370-8f4862db9ab8~ha_awssm-11373_aware"> <a href="/ai/learn/?sc_icampaign=aware_what-is-seo-pages&sc_ichannel=ha&sc_icontent=awssm-11373_aware&sc_iplace=ed&trk=4fefcf6d-2df2-4443-8370-8f4862db9ab8~ha_awssm-11373_aware"> <figure class="lb-none-v-margin lb-img"> <div> <img src="https://d1.awsstatic.com/Learn-More_64.dc6d454a262eb880a9dd0d8cb283dca5bc00cb18.png" alt=" " title=" " class="cq-dd-image"> </div> </figure> <div class="lb-txt-bold lb-txt-none lb-txt-blue-600 lb-small-v-margin lb-txt" style="margin-bottom:0px;"> Browse Machine Learning Trainings </div> <div class="lb-txt-none lb-txt-squid lb-txt-13 lb-txt"> Get started on machine learning training with content built by AWS experts </div> </a> </div> </div> <div class="lb-xbcol"> <div class="lb-border-p data-attr-wrapper lb-box lb-has-link" data-da-type="ha" data-da-channel="ha" data-da-language="en" data-da-placement="ed4" data-da-campaign="aware_what-is-seo-pages" data-da-content="awssm-11373_aware" data-da-trk="e1a89b6b-8d52-49cc-af66-b77d1302a5ff~ha_awssm-11373_aware"> <a href="/blogs/machine-learning/?sc_icampaign=aware_what-is-seo-pages&sc_ichannel=ha&sc_icontent=awssm-11373_aware&sc_iplace=ed&trk=e1a89b6b-8d52-49cc-af66-b77d1302a5ff~ha_awssm-11373_aware"> <figure class="lb-none-v-margin lb-img"> <div> <img src="https://d1.awsstatic.com/All-Products_64.78a4c2cdfdd82b7abc3fda6b44371491bdf5963e.png" alt=" " title=" " class="cq-dd-image"> </div> </figure> <div class="lb-txt-bold lb-txt-none lb-txt-blue-600 lb-small-v-margin lb-txt" style="margin-bottom:0px;"> Read Machine Learning Blogs </div> <div 
class="lb-txt-none lb-txt-squid lb-txt-13 lb-txt"> Read about the latest AWS Machine Learning product news and best practices </div> </a> </div> </div> </div> </div> </div> </div> <div class="lb-mid-pad lb-none-v-margin lb-grid"> <div class="lb-row lb-row-max-large lb-snap"> <div class="lb-col lb-tiny-24 lb-mid-24"> <div data-eb-slot="what-is-faq" data-eb-slot-meta="{'version':'1.0','slotId':'what-is-faq','experienceId':'6e591111-42de-4afc-8fa8-a8dab062f66f','allowBlank':false,'hasAltExp':false,'isRTR':false,'filters':{'limit':25,'query':'tag \u003d \'faq-collections#reinforcement-learning-from-human-feedback\''}}"> <div data-eb-tpl-n="awsm-rt/rt-faq" data-eb-tpl-v="1.0.0" data-eb-ce="" data-eb-c-scope="what-is-faq" data-eb-d-scope="DIRECTORIES" data-eb-locale="en-US" data-eb-73154b46="" data-eb-ssr-ce="" data-eb-tpl-ns="awsmRT" data-eb-hydrated="pending"> <style>[data-eb-73154b46] .eb-faq{display:grid;justify-content:center;grid-template-columns:100%;grid-gap:20px}@media only screen and (min-width:769px){[data-eb-73154b46] .eb-faq{grid-template-columns:250px 518px}}@media only screen and (min-width:980px){[data-eb-73154b46] .eb-faq{grid-template-columns:250px 650px}}@media only screen and (min-width:1200px){[data-eb-73154b46] .eb-faq{grid-template-columns:250px 870px}}[data-eb-73154b46] .eb-faq .eb-bg-dark{background-color:#fbfbfb}[data-eb-73154b46] .eb-faq .eb-sticky-sidebar{height:100%;display:none}@media only screen and (min-width:769px){[data-eb-73154b46] .eb-faq .eb-sticky-sidebar{display:block}}[data-eb-73154b46] .eb-faq .eb-sidebar-wrapper{position:sticky;top:130px;margin-top:30px;margin-bottom:30px}[data-eb-73154b46] .eb-faq .eb-sidebar-content{transition:opacity .2s ease-in .1s;opacity:1;padding:0 15px}[data-eb-73154b46] .eb-faq .eb-sidebar-link{font-family:AmazonEmberBold,Helvetica Neue Bold,Helvetica 
Neue,Helvetica,Arial,sans-serif;position:relative;color:#333;text-decoration:none;user-select:none;line-height:1.3;margin-top:15px;padding-left:30px;width:250px}[data-eb-73154b46] .eb-faq .eb-sidebar-link.eb-active{color:#0972d3}</style> <script type="application/json">{"data":{"items":[{"fields":{"faqQuestion":"What is RLHF?","faqAnswer":"<p>Reinforcement learning from human feedback (RLHF) is a machine learning (ML) technique that uses human feedback to optimize ML models to self-learn more efficiently. Reinforcement learning (RL) techniques train software to make decisions that maximize rewards, making their outcomes more accurate. RLHF incorporates human feedback in the rewards function, so the ML model can perform tasks more aligned with human goals, wants, and needs. RLHF is used throughout generative artificial intelligence (generative AI) applications, including in large language models (LLM).</p> \n<p><a href=\"https://aws.amazon.com/what-is/machine-learning/\" style=\"color:blue; text-decoration:underline\">Read about machine learning</a></p> \n<p><a href=\"https://aws.amazon.com/what-is/reinforcement-learning/\" style=\"color:blue; text-decoration:underline\">Read about reinforcement learning</a></p> \n<p><a href=\"https://aws.amazon.com/what-is/generative-ai/\" style=\"color:blue; text-decoration:underline\">Read about generative AI</a></p> \n<p><a href=\"https://aws.amazon.com/what-is/large-language-model/\" style=\"color:blue; text-decoration:underline\">Read about large language models</a></p>","id":"seo-faq-pairs#what-is-reinforcement-learning-from-human-feedback","customSort":"1"},"metadata":{"tags":[{"id":"seo-faq-pairs#faq-collections#reinforcement-learning-from-human-feedback","name":"reinforcement-learning-from-human-feedback","namespaceId":"seo-faq-pairs#faq-collections","description":"<p>reinforcement-learning-from-human-feedback</p>","metadata":{}}]}},{"fields":{"faqQuestion":"Why is RLHF important?","faqAnswer":"<p>The applications of 
artificial intelligence (AI) are broad-ranging, from self-driving cars to natural language processing (NLP), stock market predictors, and retail personalization services. No matter the given application, the goal of AI is ultimately to mimic human responses, behaviors, and decision-making. The ML model must encode human input as training data so that the AI mimics humans more closely when completing complex tasks.</p> \n<p>RLHF is a specific technique that is used in training AI systems to appear more human, alongside other techniques such as supervised and unsupervised learning. First, the model’s responses are compared to the responses of a human. Then a human assesses the quality of different responses from the machine, scoring which responses sound more human. The score can be based on innately human qualities, such as friendliness, the right degree of contextualization, and mood. </p> \n<p>RLHF is prominent in natural language understanding, but it’s also used across other generative AI applications.</p> \n<p><a href=\"https://aws.amazon.com/what-is/artificial-intelligence/\" style=\"color:blue; text-decoration:underline\">Read about artificial intelligence</a></p> \n<p><a href=\"https://aws.amazon.com/what-is/nlp/\" style=\"color:blue; text-decoration:underline\">Read about natural language processing</a></p> \n<p><a href=\"https://aws.amazon.com/compare/the-difference-between-machine-learning-supervised-and-unsupervised/\" style=\"color:blue; text-decoration:underline\">Read about the difference between supervised and unsupervised learning</a></p> \n<h3><strong>Enhances AI performance</strong></h3> \n<p>RLHF makes the ML model more accurate. 
Models can be trained on pregenerated human data, but having additional human feedback loops significantly enhances model performance compared to its initial state.</p> \n<p>For example, when text is translated from one language to another, a model might produce text that’s technically correct but sounds unnatural to the reader. A professional translator can first perform the translation, with the machine-generated translation scored against it, and then a series of machine-generated translations can be scored for quality. The addition of further training to the model makes it better at producing natural-sounding translations.</p> \n<h3><strong>Introduces complex training parameters</strong></h3> \n<p>In some instances in generative AI, it can be difficult to accurately train the model for certain parameters. For example, how do you define the mood of a piece of music? There might be technical parameters such as key and tempo that indicate a certain mood, but a musical piece’s spirit is more subjective and less well defined than just a series of technicalities. Instead, you can provide human guidance where composers create moody pieces, and then you can label machine-generated pieces according to their level of moodiness. This enables a machine to learn these parameters much more quickly.</p> \n<h3><strong>Enhances user satisfaction</strong></h3> \n<p>Although an ML model can be accurate, it might not appear human. RL is needed to guide the model toward the best, most engaging response for human users.</p> \n<p>For example, if you asked a <a href=\"https://aws.amazon.com/what-is/reinforcement-learning-from-human-feedback/#seo-faq-pairs#why-is-rlhf-important\">chatbot</a> what the weather is like outside, it might respond, “<em>It’s 30 degrees Celsius with clouds and high humidity,</em>” or it might respond, “<em>The temperature is around 30 degrees at the moment. 
It’s cloudy out and humid, so the air might seem thicker!</em>” Although both responses say the same thing, the second response sounds more natural and provides more context. </p> \n<p>As human users rate which model responses they prefer, you can use RLHF to collect that feedback and improve your model to best serve real people.</p>","id":"seo-faq-pairs#why-is-rlhf-important","customSort":"2"},"metadata":{"tags":[{"id":"seo-faq-pairs#faq-collections#reinforcement-learning-from-human-feedback","name":"reinforcement-learning-from-human-feedback","namespaceId":"seo-faq-pairs#faq-collections","description":"<p>reinforcement-learning-from-human-feedback</p>","metadata":{}}]}},{"fields":{"faqQuestion":"How does RLHF work?","faqAnswer":"<p>RLHF is performed in four stages before the model is considered ready. Here, we use the example of a language model, an internal company knowledge base chatbot, that uses RLHF for refinement.</p> \n<p>We give only an overview of the learning process. Training the model and refining its policy for RLHF involve significant mathematical complexity. However, the complex processes are well defined in RLHF and often have prebuilt algorithms that simply need your unique inputs.</p> \n<h3><strong>Data collection</strong></h3> \n<p>Before performing ML tasks with the language model, a set of human-generated prompts and responses is created for the training data. 
This set is used later in the model’s training process.</p> \n<p>For example, the prompts might be:</p> \n<ul> \n <li>“<em>Where is the location of the HR department in Boston?”</em></li> \n <li>“<em>What is the approval process for social media posts?</em>”</li> \n <li>“<em>What does the Q1 report indicate about sales compared to previous quarterly reports?</em>” </li> \n</ul> \n<p>A knowledge worker in the company then answers these questions with accurate, natural responses.</p> \n<h3><strong>Supervised fine-tuning of a language model</strong></h3> \n<p>You can use a commercial pretrained model as the base model for RLHF. You can fine-tune the model to the company’s internal knowledge base by using techniques such as retrieval-augmented generation (RAG). When the model is fine-tuned, you compare its response to the predetermined prompts with the human responses collected in the previous step. Mathematical techniques can calculate the degree of similarity between the two. </p> \n<p>For example, the machine-generated responses can be assigned a score between 0 and 1, with 1 being the most accurate and 0 being the least accurate. With these scores, the model now has a policy that is designed to form responses that score closer to human responses. This policy forms the basis of all future decision-making for the model.</p> \n<p><a href=\"https://aws.amazon.com/what-is/retrieval-augmented-generation/\" style=\"color:blue; text-decoration:underline\">Read about RAG</a></p> \n<h3><strong>Building a separate reward model</strong></h3> \n<p>The core of RLHF is training a separate AI <em>reward model</em> based on human feedback, and then using this model as a reward function to optimize policy through RL. Given a set of multiple responses from the model answering the same prompt, humans can indicate their preference regarding the quality of each response. 
You use these response-rating preferences to build the reward model that automatically estimates how high a human would score any given prompt response. </p> \n<h3><strong>Optimize the language model with the reward-based model</strong></h3> \n<p>The language model then uses the reward model to automatically refine its policy before responding to prompts. Using the reward model, the language model internally evaluates a series of responses and then chooses the response that is most likely to result in the greatest reward. This means that it meets human preferences in a more optimized manner.</p> \n<p>The following image shows an overview of the RLHF learning process.</p> \n<p><a href=\"https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2023/08/31/ML-14874_image001.jpg\" style=\"color:blue; text-decoration:underline\"><img alt=\"\" src=\"https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2023/08/31/ML-14874_image001.jpg\" style=\"height:311px; width:700px\"></a><br type=\"_moz\">  </p>","id":"seo-faq-pairs#how-does-rlhf-work","customSort":"3"},"metadata":{"tags":[{"id":"seo-faq-pairs#faq-collections#reinforcement-learning-from-human-feedback","name":"reinforcement-learning-from-human-feedback","namespaceId":"seo-faq-pairs#faq-collections","description":"<p>reinforcement-learning-from-human-feedback</p>","metadata":{}}]}},{"fields":{"faqQuestion":"How is RLHF used in the field of generative AI? ","faqAnswer":"<p>RLHF is recognized as the industry standard technique for ensuring that LLMs produce content that is truthful, harmless, and helpful. However, human communication is a subjective and creative process—and the helpfulness of LLM output is deeply influenced by human values and preferences. Each model is trained slightly differently and uses different human responders, so the outputs differ even between competitive LLMs. 
The degree to which each model involves human values is totally up to the creator.</p> \n<p>The applications of RLHF extend beyond the bounds of LLMs to other types of generative AI. Here are some examples:</p> \n<ul> \n <li>RLHF can be used in AI image generation: for example, gauging the degree of realism, technicality, or mood of artwork</li> \n <li>In music generation, RLHF can assist in creating music that matches certain moods and soundtracks to activities</li> \n <li>RLHF can be used in a voice assistant, guiding the voice to sound more friendly, inquisitive, and trustworthy</li> \n</ul>","id":"seo-faq-pairs#how-is-rlhf-used-in-the-field-of-generative-ai","customSort":"4"},"metadata":{"tags":[{"id":"seo-faq-pairs#faq-collections#reinforcement-learning-from-human-feedback","name":"reinforcement-learning-from-human-feedback","namespaceId":"seo-faq-pairs#faq-collections","description":"<p>reinforcement-learning-from-human-feedback</p>","metadata":{}}]}},{"fields":{"faqQuestion":"How can AWS help with your RLHF requirements?","faqAnswer":"<p><a href=\"https://aws.amazon.com/sagemaker/groundtruth/features/\" style=\"color:blue; text-decoration:underline\">Amazon SageMaker Ground Truth</a> offers the most comprehensive set of human-in-the-loop capabilities for incorporating human feedback across the ML lifecycle to improve model accuracy and relevancy. You can complete various human-in-the-loop tasks, from data generation and annotation to reward model generation, model review, and customization through a self-service or AWS managed offering.</p> \n<p>SageMaker Ground Truth includes a data annotator for RLHF capabilities. You can give direct feedback and guidance on output that a model has generated by ranking, classifying, or doing both for its responses for RL outcomes. The data, referred to as <em>comparison and ranking data</em>, is effectively a reward model or reward function, which is then used to train the model. 
You can use comparison and ranking data to customize an existing model for your use case or to fine-tune a model that you build from scratch.</p> \n<p>Get started with RLHF techniques on AWS by <a href=\"https://portal.aws.amazon.com/billing/signup\" style=\"color:blue; text-decoration:underline\">creating an account</a> today.</p>","id":"seo-faq-pairs#how-can-aws-help-with-your-rlhf-requirements","customSort":"5"},"metadata":{"tags":[{"id":"seo-faq-pairs#faq-collections#reinforcement-learning-from-human-feedback","name":"reinforcement-learning-from-human-feedback","namespaceId":"seo-faq-pairs#faq-collections","description":"<p>reinforcement-learning-from-human-feedback</p>","metadata":{}}]}}]},"metadata":{"auth":{},"pagination":{"empty":false,"present":true},"testAttributes":{}},"context":{"page":{"pageUrl":"https://aws.amazon.com/what-is/reinforcement-learning-from-human-feedback/"},"environment":{"stage":"prod","region":"us-east-1"},"sdkVersion":"1.0.129"},"refMap":{"manifest.js":"3dea65b485","rt-faq.js":"003db38f04","rt-faq.css":"b00bda11a1","rt-faq.css.js":"0af1d62724","rt-faq.rtl.css":"f26a77ea1d","rt-faq.rtl.css.js":"efb444c1ed"},"settings":{"templateMappings":{"question":"faqQuestion","answer":"faqAnswer"}}}</script> <div data-eb-tpl-root="" data-reactroot=""> <div class="eb-faq"> <script type="application/ld+json">{"@context":"https://schema.org","@type":"FAQPage","mainEntity":[[{"@type":"Question","name":"What is RLHF?","acceptedAnswer":{"@type":"Answer","text":"<p>Reinforcement learning from human feedback (RLHF) is a machine learning (ML) technique that uses human feedback to optimize ML models to self-learn more efficiently. Reinforcement learning (RL) techniques train software to make decisions that maximize rewards, making their outcomes more accurate. RLHF incorporates human feedback in the rewards function, so the ML model can perform tasks more aligned with human goals, wants, and needs. 
RLHF is used throughout generative artificial intelligence (generative AI) applications, including in large language models (LLM).</p> \n<p><a href=\"https://aws.amazon.com/what-is/machine-learning/\" style=\"color:blue; text-decoration:underline\">Read about machine learning</a></p> \n<p><a href=\"https://aws.amazon.com/what-is/reinforcement-learning/\" style=\"color:blue; text-decoration:underline\">Read about reinforcement learning</a></p> \n<p><a href=\"https://aws.amazon.com/what-is/generative-ai/\" style=\"color:blue; text-decoration:underline\">Read about generative AI</a></p> \n<p><a href=\"https://aws.amazon.com/what-is/large-language-model/\" style=\"color:blue; text-decoration:underline\">Read about large language models</a></p>"}},{"@type":"Question","name":"Why is RLHF important?","acceptedAnswer":{"@type":"Answer","text":"<p>The applications of artificial intelligence (AI) are broad-ranging, from self-driving cars to natural language processing (NLP), stock market predictors, and retail personalization services. No matter the given application, the goal of AI is ultimately to mimic human responses, behaviors, and decision-making. The ML model must encode human input as training data so that the AI mimics humans more closely when completing complex tasks.</p> \n<p>RLHF is a specific technique that is used in training AI systems to appear more human, alongside other techniques such as supervised and unsupervised learning. First, the model’s responses are compared to the responses of a human. Then a human assesses the quality of different responses from the machine, scoring which responses sound more human. The score can be based on innately human qualities, such as friendliness, the right degree of contextualization, and mood. 
</p> \n<p>RLHF is prominent in natural language understanding, but it’s also used across other generative AI applications.</p> \n<p><a href=\"https://aws.amazon.com/what-is/artificial-intelligence/\" style=\"color:blue; text-decoration:underline\">Read about artificial intelligence</a></p> \n<p><a href=\"https://aws.amazon.com/what-is/nlp/\" style=\"color:blue; text-decoration:underline\">Read about natural language processing</a></p> \n<p><a href=\"https://aws.amazon.com/compare/the-difference-between-machine-learning-supervised-and-unsupervised/\" style=\"color:blue; text-decoration:underline\">Read about the difference between supervised and unsupervised learning</a></p> \n<h3><strong>Enhances AI performance</strong></h3> \n<p>RLHF makes the ML model more accurate. Models can be trained on pregenerated human data, but having additional human feedback loops significantly enhances model performance compared to its initial state.</p> \n<p>For example, when text is translated from one language to another, a model might produce text that’s technically correct but sounds unnatural to the reader. A professional translator can first perform the translation, with the machine-generated translation scored against it, and then a series of machine-generated translations can be scored for quality. The addition of further training to the model makes it better at producing natural-sounding translations.</p> \n<h3><strong>Introduces complex training parameters</strong></h3> \n<p>In some instances in generative AI, it can be difficult to accurately train the model for certain parameters. For example, how do you define the mood of a piece of music? There might be technical parameters such as key and tempo that indicate a certain mood, but a musical piece’s spirit is more subjective and less well defined than just a series of technicalities. 
Instead, you can provide human guidance where composers create moody pieces, and then you can label machine-generated pieces according to their level of moodiness. This enables a machine to learn these parameters much more quickly.</p> \n<h3><strong>Enhances user satisfaction</strong></h3> \n<p>Although an ML model can be accurate, it might not appear human. RL is needed to guide the model toward the best, most engaging response for human users.</p> \n<p>For example, if you asked a <a href=\"https://aws.amazon.com/what-is/reinforcement-learning-from-human-feedback/#seo-faq-pairs#why-is-rlhf-important\">chatbot</a> what the weather is like outside, it might respond, “<em>It’s 30 degrees Celsius with clouds and high humidity,</em>” or it might respond, “<em>The temperature is around 30 degrees at the moment. It’s cloudy out and humid, so the air might seem thicker!</em>” Although both responses say the same thing, the second response sounds more natural and provides more context. </p> \n<p>As human users rate which model responses they prefer, you can use RLHF for collecting human feedback and improving your model to best serve real people.</p>"}},{"@type":"Question","name":"How does RLHF work?","acceptedAnswer":{"@type":"Answer","text":"<p>RLHF is performed in four stages before the model is considered ready. Here, we use the example of a language model—an internal company knowledge base chatbot—that uses RLHF for refinement.</p> \n<p>We only give an overview of the learning process. Significant mathematical complexity exists in training the model and its policy refinement for RLHF. However, the complex processes are well defined in RLHF and often have prebuilt algorithms that simply need your unique inputs.</p> \n<h3><strong>Data collection</strong></h3> \n<p>Before performing ML tasks with the language model, a set of human-generated prompts and responses are created for the training data. 
This set is used later in the model’s training process.</p> \n<p>For example, the prompts might be:</p> \n<ul> \n <li>“<em>Where is the location of the HR department in Boston?”</em></li> \n <li>“<em>What is the approval process for social media posts?</em>”</li> \n <li>“<em>What does the Q1 report indicate about sales compared to previous quarterly reports?</em>” </li> \n</ul> \n<p>A knowledge worker in the company then answers these questions with accurate, natural responses.</p> \n<h3><strong>Supervised fine-tuning of a language model</strong></h3> \n<p>You can use a commercial pretrained model as the base model for RLHF. You can fine-tune the model to the company’s internal knowledge base by using techniques such as retrieval-augmented generation (RAG). When the model is fine-tuned, you compare its response to the predetermined prompts with the human responses collected in the previous step. Mathematical techniques can calculate the degree of similarity between the two. </p> \n<p>For example, the machine-generated responses can be assigned a score between 0 and 1, with 1 being the most accurate and 0 being the least accurate. With these scores, the model now has a policy that is designed to form responses that score closer to human responses. This policy forms the basis of all future decision-making for the model.</p> \n<p><a href=\"https://aws.amazon.com/what-is/retrieval-augmented-generation/\" style=\"color:blue; text-decoration:underline\">Read about RAG</a></p> \n<h3><strong>Building a separate reward model</strong></h3> \n<p>The core of RLHF is training a separate AI <em>reward model</em> based on human feedback, and then using this model as a reward function to optimize policy through RL. Given a set of multiple responses from the model answering the same prompt, humans can indicate their preference regarding the quality of each response. 
You use these response-rating preferences to build the reward model that automatically estimates how high a human would score any given prompt response. </p> \n<h3><strong>Optimize the language model with the reward-based model</strong></h3> \n<p>The language model then uses the reward model to automatically refine its policy before responding to prompts. Using the reward model, the language model internally evaluates a series of responses and then chooses the response that is most likely to result in the greatest reward. This means that it meets human preferences in a more optimized manner.</p> \n<p>The following image shows an overview of the RLHF learning process.</p> \n<p><a href=\"https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2023/08/31/ML-14874_image001.jpg\" style=\"color:blue; text-decoration:underline\"><img alt=\"\" src=\"https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2023/08/31/ML-14874_image001.jpg\" style=\"height:311px; width:700px\"></a><br type=\"_moz\">  </p>"}},{"@type":"Question","name":"How is RLHF used in the field of generative AI? ","acceptedAnswer":{"@type":"Answer","text":"<p>RLHF is recognized as the industry standard technique for ensuring that LLMs produce content that is truthful, harmless, and helpful. However, human communication is a subjective and creative process—and the helpfulness of LLM output is deeply influenced by human values and preferences. Each model is trained slightly differently and uses different human responders, so the outputs differ even between competitive LLMs. The degree to which each model involves human values is totally up to the creator.</p> \n<p>The applications of RLHF extend beyond the bounds of LLMs to other types of generative AI. 
Here are some examples:</p> \n<ul> \n <li>RLHF can be used in AI image generation: for example, gauging the degree of realism, technicality, or mood of artwork</li> \n <li>In music generation, RLHF can assist in creating music that matches certain moods and soundtracks to activities</li> \n <li>RLHF can be used in a voice assistant, guiding the voice to sound more friendly, inquisitive, and trustworthy</li> \n</ul>"}},{"@type":"Question","name":"How can AWS help with your RLHF requirements?","acceptedAnswer":{"@type":"Answer","text":"<p><a href=\"https://aws.amazon.com/sagemaker/groundtruth/features/\" style=\"color:blue; text-decoration:underline\">Amazon SageMaker Ground Truth</a> offers the most comprehensive set of human-in-the-loop capabilities for incorporating human feedback across the ML lifecycle to improve model accuracy and relevancy. You can complete various human-in-the-loop tasks, from data generation and annotation to reward model generation, model review, and customization through a self-service or AWS managed offering.</p> \n<p>SageMaker Ground Truth includes a data annotator for RLHF capabilities. You can give direct feedback and guidance on output that a model has generated by ranking, classifying, or doing both for its responses for RL outcomes. The data, referred to as <em>comparison and ranking data</em>, is effectively a reward model or reward function, which is then used to train the model. 
You can use comparison and ranking data to customize an existing model for your use case or to fine-tune a model that you build from scratch.</p> \n<p>Get started with RLHF techniques on AWS by <a href=\"https://portal.aws.amazon.com/billing/signup\" style=\"color:blue; text-decoration:underline\">creating an account</a> today.</p>"}}]]}</script> <div class="eb-sticky-sidebar"> <div class="eb-sidebar-wrapper"> <div class="eb-sidebar-content"> <span data-eb-item-id="seo-faq-pairs#what-is-reinforcement-learning-from-human-feedback"><a class="eb-sidebar-link lb-txt-bold lb-txt-none lb-txt-16 lb-txt eb-active" href="#seo-faq-pairs#what-is-reinforcement-learning-from-human-feedback">What is RLHF?</a></span> <span data-eb-item-id="seo-faq-pairs#why-is-rlhf-important"><a class="eb-sidebar-link lb-txt-bold lb-txt-none lb-txt-16 lb-txt" href="#seo-faq-pairs#why-is-rlhf-important">Why is RLHF important?</a></span> <span data-eb-item-id="seo-faq-pairs#how-does-rlhf-work"><a class="eb-sidebar-link lb-txt-bold lb-txt-none lb-txt-16 lb-txt" href="#seo-faq-pairs#how-does-rlhf-work">How does RLHF work?</a></span> <span data-eb-item-id="seo-faq-pairs#how-is-rlhf-used-in-the-field-of-generative-ai"><a class="eb-sidebar-link lb-txt-bold lb-txt-none lb-txt-16 lb-txt" href="#seo-faq-pairs#how-is-rlhf-used-in-the-field-of-generative-ai">How is RLHF used in the field of generative AI? 
</a></span> <span data-eb-item-id="seo-faq-pairs#how-can-aws-help-with-your-rlhf-requirements"><a class="eb-sidebar-link lb-txt-bold lb-txt-none lb-txt-16 lb-txt" href="#seo-faq-pairs#how-can-aws-help-with-your-rlhf-requirements">How can AWS help with your RLHF requirements?</a></span> </div> </div> </div> <div class="eb-faq-content"> <div class="lb-none-v-margin lb-grid lb-small-pad lb-grid" data-eb-item-id="seo-faq-pairs#what-is-reinforcement-learning-from-human-feedback"> <div class="lb-row lb-row-max-large lb-snap eb-active"> <div class="lb-col lb-tiny-24 lb-mid-24"> <h2 class="lb-txt-bold lb-txt-none lb-txt-28 lb-h2 lb-title" id="seo-faq-pairs#what-is-reinforcement-learning-from-human-feedback">What is RLHF?</h2> <div class="lb-txt-14 lb-rtxt"> <p>Reinforcement learning from human feedback (RLHF) is a machine learning (ML) technique that uses human feedback to optimize ML models to self-learn more efficiently. Reinforcement learning (RL) techniques train software to make decisions that maximize rewards, making their outcomes more accurate. RLHF incorporates human feedback in the rewards function, so the ML model can perform tasks more aligned with human goals, wants, and needs. 
RLHF is used throughout generative artificial intelligence (generative AI) applications, including in large language models (LLM).</p> <p><a href="https://aws.amazon.com/what-is/machine-learning/">Read about machine learning</a></p> <p><a href="https://aws.amazon.com/what-is/reinforcement-learning/">Read about reinforcement learning</a></p> <p><a href="https://aws.amazon.com/what-is/generative-ai/">Read about generative AI</a></p> <p><a href="https://aws.amazon.com/what-is/large-language-model/">Read about large language models</a></p> </div> </div> </div> </div> <div class="lb-none-v-margin lb-grid lb-small-pad eb-bg-dark" data-eb-item-id="seo-faq-pairs#why-is-rlhf-important"> <div class="lb-row lb-row-max-large lb-snap"> <div class="lb-col lb-tiny-24 lb-mid-24"> <h2 class="lb-txt-bold lb-txt-none lb-txt-28 lb-h2 lb-title" id="seo-faq-pairs#why-is-rlhf-important">Why is RLHF important?</h2> <div class="lb-txt-14 lb-rtxt"> <p>The applications of artificial intelligence (AI) are broad-ranging, from self-driving cars to natural language processing (NLP), stock market predictors, and retail personalization services. No matter the given application, the goal of AI is ultimately to mimic human responses, behaviors, and decision-making. The ML model must encode human input as training data so that the AI mimics humans more closely when completing complex tasks.</p> <p>RLHF is a specific technique that is used in training AI systems to appear more human, alongside other techniques such as supervised and unsupervised learning. First, the model’s responses are compared to the responses of a human. Then a human assesses the quality of different responses from the machine, scoring which responses sound more human. The score can be based on innately human qualities, such as friendliness, the right degree of contextualization, and mood. 
</p> <p>RLHF is prominent in natural language understanding, but it’s also used across other generative AI applications.</p> <p><a href="https://aws.amazon.com/what-is/artificial-intelligence/">Read about artificial intelligence</a></p> <p><a href="https://aws.amazon.com/what-is/nlp/">Read about natural language processing</a></p> <p><a href="https://aws.amazon.com/compare/the-difference-between-machine-learning-supervised-and-unsupervised/">Read about the difference between supervised and unsupervised learning</a></p> <h3><strong>Enhances AI performance</strong></h3> <p>RLHF makes the ML model more accurate. Models can be trained on pregenerated human data, but having additional human feedback loops significantly enhances model performance compared to its initial state.</p> <p>For example, when text is translated from one language to another, a model might produce text that’s technically correct but sounds unnatural to the reader. A professional translator can first perform the translation, with the machine-generated translation scored against it, and then a series of machine-generated translations can be scored for quality. The addition of further training to the model makes it better at producing natural-sounding translations.</p> <h3><strong>Introduces complex training parameters</strong></h3> <p>In some instances in generative AI, it can be difficult to accurately train the model for certain parameters. For example, how do you define the mood of a piece of music? There might be technical parameters such as key and tempo that indicate a certain mood, but a musical piece’s spirit is more subjective and less well defined than just a series of technicalities. Instead, you can provide human guidance where composers create moody pieces, and then you can label machine-generated pieces according to their level of moodiness. 
This enables a machine to learn these parameters much more quickly.</p> <h3><strong>Enhances user satisfaction</strong></h3> <p>Although an ML model can be accurate, it might not appear human. RL is needed to guide the model toward the best, most engaging response for human users.</p> <p>For example, if you asked a <a href="https://aws.amazon.com/what-is/reinforcement-learning-from-human-feedback/#seo-faq-pairs#why-is-rlhf-important">chatbot</a> what the weather is like outside, it might respond, “<em>It’s 30 degrees Celsius with clouds and high humidity,</em>” or it might respond, “<em>The temperature is around 30 degrees at the moment. It’s cloudy out and humid, so the air might seem thicker!</em>” Although both responses say the same thing, the second response sounds more natural and provides more context. </p> <p>As human users rate which model responses they prefer, you can use RLHF for collecting human feedback and improving your model to best serve real people.</p> </div> </div> </div> </div> <div class="lb-none-v-margin lb-grid lb-small-pad lb-grid" data-eb-item-id="seo-faq-pairs#how-does-rlhf-work"> <div class="lb-row lb-row-max-large lb-snap"> <div class="lb-col lb-tiny-24 lb-mid-24"> <h2 class="lb-txt-bold lb-txt-none lb-txt-28 lb-h2 lb-title" id="seo-faq-pairs#how-does-rlhf-work">How does RLHF work?</h2> <div class="lb-txt-14 lb-rtxt"> <p>RLHF is performed in four stages before the model is considered ready. Here, we use the example of a language model—an internal company knowledge base chatbot—that uses RLHF for refinement.</p> <p>We only give an overview of the learning process. Significant mathematical complexity exists in training the model and its policy refinement for RLHF. 
However, the complex processes are well defined in RLHF and often have prebuilt algorithms that simply need your unique inputs.</p> <h3><strong>Data collection</strong></h3> <p>Before performing ML tasks with the language model, a set of human-generated prompts and responses is created for the training data. This set is used later in the model’s training process.</p> <p>For example, the prompts might be:</p> <ul> <li>“<em>Where is the location of the HR department in Boston?</em>”</li> <li>“<em>What is the approval process for social media posts?</em>”</li> <li>“<em>What does the Q1 report indicate about sales compared to previous quarterly reports?</em>” </li> </ul> <p>A knowledge worker in the company then answers these questions with accurate, natural responses.</p> <h3><strong>Supervised fine-tuning of a language model</strong></h3> <p>You can use a commercial pretrained model as the base model for RLHF. You can adapt the model to the company’s internal knowledge base by using techniques such as retrieval-augmented generation (RAG). When the model is fine-tuned, you compare its responses to the predetermined prompts with the human responses collected in the previous step. Mathematical techniques can calculate the degree of similarity between the two. </p> <p>For example, the machine-generated responses can be assigned a score between 0 and 1, with 1 being the most accurate and 0 being the least accurate. With these scores, the model now has a policy that is designed to form responses that score closer to human responses. This policy forms the basis of all future decision-making for the model.</p> <p><a href="https://aws.amazon.com/what-is/retrieval-augmented-generation/">Read about RAG</a></p> <h3><strong>Building a separate reward model</strong></h3> <p>The core of RLHF is training a separate AI <em>reward model</em> based on human feedback, and then using this model as a reward function to optimize policy through RL. 
Given a set of multiple responses from the model answering the same prompt, humans can indicate their preference regarding the quality of each response. You use these response-rating preferences to build the reward model that automatically estimates how high a human would score any given prompt response. </p> <h3><strong>Optimize the language model with the reward-based model</strong></h3> <p>The language model then uses the reward model to automatically refine its policy before responding to prompts. Using the reward model, the language model internally evaluates a series of responses and then chooses the response that is most likely to result in the greatest reward. This means that it meets human preferences in a more optimized manner.</p> <p>The following image shows an overview of the RLHF learning process.</p> <p><a href="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2023/08/31/ML-14874_image001.jpg"><img alt="" src="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2023/08/31/ML-14874_image001.jpg"></a><br>  </p> </div> </div> </div> </div> <div class="lb-none-v-margin lb-grid lb-small-pad eb-bg-dark" data-eb-item-id="seo-faq-pairs#how-is-rlhf-used-in-the-field-of-generative-ai"> <div class="lb-row lb-row-max-large lb-snap"> <div class="lb-col lb-tiny-24 lb-mid-24"> <h2 class="lb-txt-bold lb-txt-none lb-txt-28 lb-h2 lb-title" id="seo-faq-pairs#how-is-rlhf-used-in-the-field-of-generative-ai">How is RLHF used in the field of generative AI? </h2> <div class="lb-txt-14 lb-rtxt"> <p>RLHF is recognized as the industry standard technique for ensuring that LLMs produce content that is truthful, harmless, and helpful. However, human communication is a subjective and creative process—and the helpfulness of LLM output is deeply influenced by human values and preferences. Each model is trained slightly differently and uses different human responders, so the outputs differ even between competitive LLMs. 
How much weight each model gives to human values is up to its creator.</p> <p>The applications of RLHF extend beyond LLMs to other types of generative AI. Here are some examples:</p> <ul> <li>In AI image generation, RLHF can be used to gauge the degree of realism, technicality, or mood of artwork</li> <li>In music generation, RLHF can assist in creating music that matches certain moods or suits certain activities</li> <li>In a voice assistant, RLHF can guide the voice to sound more friendly, inquisitive, and trustworthy</li> </ul> </div> </div> </div> </div> <div class="lb-none-v-margin lb-grid lb-small-pad lb-grid" data-eb-item-id="seo-faq-pairs#how-can-aws-help-with-your-rlhf-requirements"> <div class="lb-row lb-row-max-large lb-snap"> <div class="lb-col lb-tiny-24 lb-mid-24"> <h2 class="lb-txt-bold lb-txt-none lb-txt-28 lb-h2 lb-title" id="seo-faq-pairs#how-can-aws-help-with-your-rlhf-requirements">How can AWS help with your RLHF requirements?</h2> <div class="lb-txt-14 lb-rtxt"> <p><a href="https://aws.amazon.com/sagemaker/groundtruth/features/">Amazon SageMaker Ground Truth</a> offers the most comprehensive set of human-in-the-loop capabilities for incorporating human feedback across the ML lifecycle to improve model accuracy and relevancy. You can complete various human-in-the-loop tasks, from data generation and annotation to reward model generation, model review, and customization, through a self-service or AWS managed offering.</p> <p>SageMaker Ground Truth includes a data annotator for RLHF. You can give direct feedback and guidance on output that a model has generated by ranking its responses, classifying them, or both, for RL outcomes. This data, referred to as <em>comparison and ranking data</em>, effectively serves as a reward model or reward function, which is then used to train the model. 
You can use comparison and ranking data to customize an existing model for your use case or to fine-tune a model that you build from scratch.</p> <p>Get started with RLHF techniques on AWS by <a href="https://portal.aws.amazon.com/billing/signup">creating an account</a> today.</p> </div> </div> </div> </div> </div> </div> </div> </div> </div> </div> </div> </div> <div class="lb-grid"> <div class="lb-row lb-row-max-large lb-snap"> <div class="lb-col lb-tiny-24 lb-mid-24"> <h2 id="Next_Steps_on_AWS" class="lb-txt-bold lb-txt-none lb-txt-28 lb-small-v-margin lb-h2 lb-title" style="margin-top:0px; margin-bottom:0px;"> Next Steps on AWS</h2> <div class="lb-none-pad lb-none-v-margin lb-grid lb-row lb-row-max-large lb-snap"> <div class="lb-col lb-tiny-24 lb-mid-8"> <figure class="lb-img"> <div style="padding-right:60px;"> <img src="https://d1.awsstatic.com/webteam/product-pages/Product-Page_Standard-Icons_01_Product-Features_SqInk.a8d5666758afc5121b4eb818ae18126031c4b61e.png" alt="" title="" class="cq-dd-image"> </div> </figure> <div class="lb-tiny-align-left lb-txt-bold lb-txt-none lb-txt-18 lb-txt" style="margin-top:0px; margin-bottom:15px;"> Check out additional product-related resources </div> <a class="lb-tiny-align-left lb-txt-bold lb-txt-none lb-txt-blue-600 lb-txt" href="/ai/generative-ai/services/" target="_blank" rel="noopener noreferrer" data-trk-params="{"trkOverrideWithQs":true}"> Innovate faster with AWS generative AI services <i class="icon-angle-double-right lb-after"></i></a> </div> <div class="lb-col lb-tiny-24 lb-mid-8"> <figure class="lb-img"> <div style="padding-right:60px;"> <img src="https://d1.awsstatic.com/webteam/product-pages/Product-Page_Standard-Icons_02_Sign-Up_SqInk.f43d5ddc9c43883eec6187f34c68155402b13312.png" alt="" title="" class="cq-dd-image"> </div> </figure> <div class="lb-tiny-align-left lb-txt-bold lb-txt-none lb-txt-18 lb-txt"> Sign up for a free account </div> <div class="lb-tiny-align-left lb-rtxt" style="margin-top:0px; 
margin-bottom:15px;"> <p>Get instant access to the AWS Free Tier.</p> </div> <a class="lb-tiny-align-left lb-txt-bold lb-txt-none lb-txt-blue-600 lb-txt" href="https://portal.aws.amazon.com/gp/aws/developer/registration/index.html" target="_blank" rel="noopener noreferrer" data-trk-params="{"trkOverrideWithQs":true}"> Sign up <i class="icon-angle-double-right lb-after"></i></a> </div> <div class="lb-col lb-tiny-24 lb-mid-8"> <figure class="lb-img"> <div style="padding-right:60px;"> <img src="https://d1.awsstatic.com/webteam/product-pages/Product-Page_Standard-Icons_03_Start-Building_SqInk.6a1ef4429a6604cda9b0857084aa13e2ee4eebca.png" alt="" title="" class="cq-dd-image"> </div> </figure> <div class="lb-tiny-align-left lb-txt-bold lb-txt-none lb-txt-18 lb-txt"> Start building in the console </div> <div class="lb-rtxt" style="margin-top:0px; margin-bottom:15px;"> <p>Get started building in the AWS Management Console.</p> </div> <a class="lb-tiny-align-left lb-txt-bold lb-txt-none lb-txt-blue-600 lb-txt" href="https://console.aws.amazon.com/" target="_blank" rel="noopener noreferrer" data-trk-params="{"trkOverrideWithQs":true}"> Sign in <i class="icon-angle-double-right lb-after"></i></a> </div> </div> </div> </div> </div> </main> </div> <footer id="aws-page-footer" class="m-page-footer" role="contentinfo"> <div class="data-attr-wrapper lb-none-v-margin lb-xb-grid-wrap" style="background-color:#141f2e;" data-da-type="so" data-da-so-type="viewport" data-da-so-language="en" data-da-so-category="monitoring" data-da-so-name="footer" data-da-so-version="a"> <div class="lb-xb-grid lb-row-max-large lb-snap lb-tiny-xb-1 lb-small-xb-3 lb-large-xb-5"> <div class="lb-xbcol"> <div class="data-attr-wrapper lb-small-hide lb-btn" data-da-type="so" data-da-so-type="viewport" data-da-so-language="en" data-da-so-category="monitoring" data-da-so-name="footer_buttons" data-da-so-url="all" data-da-so-version="footer_signin-mobile-default"> <a class="lb-btn-p-primary" 
href="https://console.aws.amazon.com/console/home?nc1=f_ct&src=footer-signin-mobile" role="button"> <span> Sign In to the Console</span> </a> </div> </div> </div> </div> <div class="lb-none-pad lb-none-v-margin lb-xb-grid-wrap" style="background-color:#EAEDED; padding-top:5px;"> <div class="lb-xb-grid lb-row-max-large lb-snap lb-tiny-xb-1"> <div class="lb-xbcol"> <div class="lb-mbox js-mbox" data-lb-comp="mbox" data-lb-comp-ignore="true" data-mbox="en_footer-legal-links"> <ul class="lb-txt-squid lb-none-v-margin lb-ul lb-list-style-none lb-li-none-v-margin lb-tiny-ul-iblock"> <li>© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.</li> </ul> </div> </div> </div> </div> </footer> <div id="aws-page-end"></div> <div id="lb-page-end"></div> </body> </html>