# Structured data classification with FeatureSpace

**Author:** [fchollet](https://twitter.com/fchollet)<br>
**Date created:** 2022/11/09<br>
**Last modified:** 2022/11/09<br>
**Description:** Classify tabular data in a few lines of code.

ⓘ This example uses Keras 3

[**View in Colab**](https://colab.research.google.com/github/keras-team/keras-io/blob/master/examples/structured_data/ipynb/structured_data_classification_with_feature_space.ipynb) • [**GitHub source**](https://github.com/keras-team/keras-io/blob/master/examples/structured_data/structured_data_classification_with_feature_space.py)

---

## Introduction

This example demonstrates how to do structured data classification (also known as tabular data classification), starting from a raw CSV file. Our data includes numerical features, integer categorical features, and string categorical features. We will use the utility [`keras.utils.FeatureSpace`](/api/utils/feature_space#featurespace-class) to index, preprocess, and encode our features.

The code is adapted from the example [Structured data classification from scratch](https://keras.io/examples/structured_data/structured_data_classification_from_scratch/).
While the previous example managed its own low-level feature preprocessing and encoding with Keras preprocessing layers, in this example we delegate everything to `FeatureSpace`, making the workflow extremely quick and easy.

### The dataset

[Our dataset](https://archive.ics.uci.edu/ml/datasets/heart+Disease) is provided by the Cleveland Clinic Foundation for Heart Disease. It's a CSV file with 303 rows. Each row contains information about a patient (a **sample**), and each column describes an attribute of the patient (a **feature**). We use the features to predict whether a patient has heart disease (**binary classification**).

Here's the description of each feature:

| Column   | Description                                            | Feature Type                 |
| -------- | ------------------------------------------------------ | ---------------------------- |
| Age      | Age in years                                           | Numerical                    |
| Sex      | (1 = male; 0 = female)                                 | Categorical                  |
| CP       | Chest pain type (0, 1, 2, 3, 4)                        | Categorical                  |
| Trestbpd | Resting blood pressure (in mm Hg on admission)         | Numerical                    |
| Chol     | Serum cholesterol in mg/dl                             | Numerical                    |
| FBS      | Fasting blood sugar > 120 mg/dl (1 = true; 0 = false)  | Categorical                  |
| RestECG  | Resting electrocardiogram results (0, 1, 2)            | Categorical                  |
| Thalach  | Maximum heart rate achieved                            | Numerical                    |
| Exang    | Exercise induced angina (1 = yes; 0 = no)              | Categorical                  |
| Oldpeak  | ST depression induced by exercise relative to rest     | Numerical                    |
| Slope    | Slope of the peak exercise ST segment                  | Numerical                    |
| CA       | Number of major vessels (0-3) colored by fluoroscopy   | Both numerical & categorical |
| Thal     | 3 = normal; 6 = fixed defect; 7 = reversible defect    | Categorical                  |
| Target   | Diagnosis of heart disease (1 = true; 0 = false)       | Target                       |

---

## Setup

```python
import os

os.environ["KERAS_BACKEND"] = "tensorflow"

import tensorflow as tf
import pandas as pd
import keras
from keras.utils import FeatureSpace
```

---

## Preparing the data

Let's download the data and load it into a Pandas dataframe:

```python
file_url = "http://storage.googleapis.com/download.tensorflow.org/data/heart.csv"
dataframe = pd.read_csv(file_url)
```
The dataset includes 303 samples with 14 columns per sample (13 features, plus the target label):

```python
print(dataframe.shape)
```

```
(303, 14)
```

Here's a preview of a few samples:

```python
dataframe.head()
```

|   | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal       | target |
|---|-----|-----|----|----------|------|-----|---------|---------|-------|---------|-------|----|------------|--------|
| 0 | 63  | 1   | 1  | 145      | 233  | 1   | 2       | 150     | 0     | 2.3     | 3     | 0  | fixed      | 0      |
| 1 | 67  | 1   | 4  | 160      | 286  | 0   | 2       | 108     | 1     | 1.5     | 2     | 3  | normal     | 1      |
| 2 | 67  | 1   | 4  | 120      | 229  | 0   | 2       | 129     | 1     | 2.6     | 2     | 2  | reversible | 0      |
| 3 | 37  | 1   | 3  | 130      | 250  | 0   | 0       | 187     | 0     | 3.5     | 3     | 0  | normal     | 0      |
| 4 | 41  | 0   | 2  | 130      | 204  | 0   | 2       | 172     | 0     | 1.4     | 1     | 0  | normal     | 0      |

The last column, "target", indicates whether the patient has a heart disease (1) or not (0).
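
Before splitting the data, it can be useful to check how balanced the two classes are. This quick check is not part of the original example, but it only takes one line of pandas:

```python
# Number of samples per class in the "target" column
# (run it to see the exact counts for this CSV).
print(dataframe["target"].value_counts())
```
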
Let's split the data into a training and validation set:

```python
val_dataframe = dataframe.sample(frac=0.2, random_state=1337)
train_dataframe = dataframe.drop(val_dataframe.index)

print(
    "Using %d samples for training and %d for validation"
    % (len(train_dataframe), len(val_dataframe))
)
```

```
Using 242 samples for training and 61 for validation
```

Let's generate [`tf.data.Dataset`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) objects for each dataframe:

```python
def dataframe_to_dataset(dataframe):
    dataframe = dataframe.copy()
    labels = dataframe.pop("target")
    ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), labels))
    ds = ds.shuffle(buffer_size=len(dataframe))
    return ds


train_ds = dataframe_to_dataset(train_dataframe)
val_ds = dataframe_to_dataset(val_dataframe)
```

Each `Dataset` yields a tuple `(input, target)` where `input` is a dictionary of features and `target` is the value `0` or `1`:

```python
for x, y in train_ds.take(1):
    print("Input:", x)
    print("Target:", y)
```

```
Input: {'age': <tf.Tensor: shape=(), dtype=int64, numpy=65>, 'sex': <tf.Tensor: shape=(), dtype=int64, numpy=1>, 'cp': <tf.Tensor: shape=(), dtype=int64, numpy=1>, 'trestbps': <tf.Tensor: shape=(), dtype=int64, numpy=138>, 'chol': <tf.Tensor: shape=(), dtype=int64, numpy=282>, 'fbs': <tf.Tensor: shape=(), dtype=int64, numpy=1>, 'restecg': <tf.Tensor: shape=(), dtype=int64, numpy=2>, 'thalach': <tf.Tensor: shape=(), dtype=int64, numpy=174>, 'exang': <tf.Tensor: shape=(), dtype=int64, numpy=0>, 'oldpeak': <tf.Tensor: shape=(), dtype=float64, numpy=1.4>, 'slope': <tf.Tensor: shape=(), dtype=int64, numpy=2>, 'ca': <tf.Tensor: shape=(), dtype=int64, numpy=1>, 'thal': <tf.Tensor: shape=(), dtype=string, numpy=b'normal'>}
Target: tf.Tensor(0, shape=(), dtype=int64)
```
Let's batch the datasets:

```python
train_ds = train_ds.batch(32)
val_ds = val_ds.batch(32)
```

---

## Configuring a `FeatureSpace`

To configure how each feature should be preprocessed, we instantiate a [`keras.utils.FeatureSpace`](/api/utils/feature_space#featurespace-class), and we pass to it a dictionary that maps the name of our features to a string that describes the feature type.

We have a few "integer categorical" features such as `"FBS"`, one "string categorical" feature (`"thal"`), and a few numerical features, which we'd like to normalize – except `"age"`, which we'd like to discretize into a number of bins.

We also use the `crosses` argument to capture *feature interactions* for some categorical features, that is to say, create additional features that represent value co-occurrences for these categorical features. You can compute feature crosses like this for arbitrary sets of categorical features – not just tuples of two features.
Because the resulting co-occurrences are hashed into a fixed-size vector, you don't need to worry about whether the co-occurrence space is too large.

```python
feature_space = FeatureSpace(
    features={
        # Categorical features encoded as integers
        "sex": "integer_categorical",
        "cp": "integer_categorical",
        "fbs": "integer_categorical",
        "restecg": "integer_categorical",
        "exang": "integer_categorical",
        "ca": "integer_categorical",
        # Categorical feature encoded as string
        "thal": "string_categorical",
        # Numerical features to discretize
        "age": "float_discretized",
        # Numerical features to normalize
        "trestbps": "float_normalized",
        "chol": "float_normalized",
        "thalach": "float_normalized",
        "oldpeak": "float_normalized",
        "slope": "float_normalized",
    },
    # We create additional features by hashing
    # value co-occurrences for the
    # following groups of categorical features.
    crosses=[("sex", "age"), ("thal", "ca")],
    # The hashing space for these co-occurrences
    # will be 32-dimensional.
    crossing_dim=32,
    # Our utility will one-hot encode all categorical
    # features and concat all features into a single
    # vector (one vector per sample).
    output_mode="concat",
)
```
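
To get an intuition for what a feature cross produces, here is a small standalone sketch using `keras.layers.HashedCrossing`, which applies the same hashing-trick idea as the `crosses` argument above. This is for illustration only; `FeatureSpace` sets all of this up for you, and the example values below are made up:

```python
# Illustration only: hash ("thal", "ca") value pairs into a small,
# fixed-size bucket space, then one-hot encode the bucket index.
crossing_layer = keras.layers.HashedCrossing(num_bins=32, output_mode="one_hot")
thal_values = tf.constant(["fixed", "normal", "reversible"])
ca_values = tf.constant([0, 3, 2])
crossed = crossing_layer((thal_values, ca_values))
print(crossed.shape)  # (3, 32): one 32-dimensional one-hot vector per sample
```
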

---

## Further customizing a `FeatureSpace`

Specifying the feature type via a string name is quick and easy, but sometimes you may want to further configure the preprocessing of each feature. For instance, in our case, our categorical features don't have a large set of possible values – it's only a handful of values per feature (e.g. `1` and `0` for the feature `"FBS"`), and all possible values are represented in the training set. As a result, we don't need to reserve an index to represent "out of vocabulary" values for these features – which would have been the default behavior. Below, we simply specify `num_oov_indices=0` for each of these features to tell the feature preprocessor to skip "out of vocabulary" indexing.

Other customizations you have access to include specifying the number of bins for discretizing features of type `"float_discretized"`, or the dimensionality of the hashing space for feature crossing.

```python
feature_space = FeatureSpace(
    features={
        # Categorical features encoded as integers
        "sex": FeatureSpace.integer_categorical(num_oov_indices=0),
        "cp": FeatureSpace.integer_categorical(num_oov_indices=0),
        "fbs": FeatureSpace.integer_categorical(num_oov_indices=0),
        "restecg": FeatureSpace.integer_categorical(num_oov_indices=0),
        "exang": FeatureSpace.integer_categorical(num_oov_indices=0),
        "ca": FeatureSpace.integer_categorical(num_oov_indices=0),
        # Categorical feature encoded as string
        "thal": FeatureSpace.string_categorical(num_oov_indices=0),
        # Numerical features to discretize
        "age": FeatureSpace.float_discretized(num_bins=30),
        # Numerical features to normalize
        "trestbps": FeatureSpace.float_normalized(),
        "chol": FeatureSpace.float_normalized(),
        "thalach": FeatureSpace.float_normalized(),
        "oldpeak": FeatureSpace.float_normalized(),
        "slope": FeatureSpace.float_normalized(),
    },
    # Specify feature cross with a custom crossing dim.
    crosses=[
        FeatureSpace.cross(feature_names=("sex", "age"), crossing_dim=64),
        FeatureSpace.cross(
            feature_names=("thal", "ca"),
            crossing_dim=16,
        ),
    ],
    output_mode="concat",
)
```

---

## Adapt the `FeatureSpace` to the training data

Before we start using the `FeatureSpace` to build a model, we have to adapt it to the training data. During `adapt()`, the `FeatureSpace` will:

- Index the set of possible values for categorical features.
- Compute the mean and variance for numerical features to normalize.
- Compute the value boundaries for the different bins for numerical features to discretize.

Note that `adapt()` should be called on a [`tf.data.Dataset`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) which yields dicts of feature values – no labels.

```python
train_ds_with_no_labels = train_ds.map(lambda x, _: x)
feature_space.adapt(train_ds_with_no_labels)
```

At this point, the `FeatureSpace` can be called on a dict of raw feature values, and will return a single concatenated vector for each sample, combining encoded features and feature crosses.

```python
for x, _ in train_ds.take(1):
    preprocessed_x = feature_space(x)
    print("preprocessed_x.shape:", preprocessed_x.shape)
    print("preprocessed_x.dtype:", preprocessed_x.dtype)
```

```
preprocessed_x.shape: (32, 138)
preprocessed_x.dtype: <dtype: 'float32'>
```

---

## Two ways to manage preprocessing: as part of the `tf.data` pipeline, or in the model itself

There are two ways in which you can leverage your `FeatureSpace`:

### Asynchronous preprocessing in `tf.data`

You can make it part of your data pipeline, before the model. This enables asynchronous parallel preprocessing of the data on CPU before it hits the model. Do this if you're training on GPU or TPU, or if you want to speed up preprocessing. This is almost always the right thing to do during training.

### Synchronous preprocessing in the model

You can make it part of your model.
This means that the model will expect dicts of raw feature values, and preprocessing will be done synchronously (in a blocking manner) before the rest of the forward pass. Do this if you want to have an end-to-end model that can process raw feature values – but keep in mind that your model will only be able to run on CPU, since most types of feature preprocessing (e.g. string preprocessing) are not GPU or TPU compatible.

Do not do this on GPU / TPU or in performance-sensitive settings. In general, you want to do in-model preprocessing when you do inference on CPU.

In our case, we will apply the `FeatureSpace` in the tf.data pipeline during training, but we will do inference with an end-to-end model that includes the `FeatureSpace`.

Let's create a training and validation dataset of preprocessed batches:

```python
preprocessed_train_ds = train_ds.map(
    lambda x, y: (feature_space(x), y), num_parallel_calls=tf.data.AUTOTUNE
)
preprocessed_train_ds = preprocessed_train_ds.prefetch(tf.data.AUTOTUNE)

preprocessed_val_ds = val_ds.map(
    lambda x, y: (feature_space(x), y), num_parallel_calls=tf.data.AUTOTUNE
)
preprocessed_val_ds = preprocessed_val_ds.prefetch(tf.data.AUTOTUNE)
```
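
As a quick sanity check (not part of the original example), you can inspect the element spec of the mapped dataset: the feature dict has been replaced by a single float32 vector per sample (138-dimensional in the run above), alongside the label.

```python
# The mapped dataset now yields (preprocessed_vector, label) batches
# rather than (feature_dict, label) batches.
print(preprocessed_train_ds.element_spec)
```
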

---

## Build a model

Time to build a model – or rather two models:

- A training model that expects preprocessed features (one sample = one vector)
- An inference model that expects raw features (one sample = dict of raw feature values)

```python
dict_inputs = feature_space.get_inputs()
encoded_features = feature_space.get_encoded_features()

x = keras.layers.Dense(32, activation="relu")(encoded_features)
x = keras.layers.Dropout(0.5)(x)
predictions = keras.layers.Dense(1, activation="sigmoid")(x)

training_model = keras.Model(inputs=encoded_features, outputs=predictions)
training_model.compile(
    optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"]
)

inference_model = keras.Model(inputs=dict_inputs, outputs=predictions)
```
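
If you want to compare the two architectures (output not shown in the original example), you can print their summaries: the training model starts from the 138-dimensional encoded feature vector, while the inference model starts from the dict of raw feature inputs.

```python
# Optional: compare the two models. `training_model` consumes the already
# preprocessed feature vector; `inference_model` consumes raw feature dicts.
training_model.summary()
inference_model.summary()
```
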

---

## Train the model

Let's train our model for 20 epochs. Note that feature preprocessing is happening as part of the tf.data pipeline, not as part of the model.

```python
training_model.fit(
    preprocessed_train_ds,
    epochs=20,
    validation_data=preprocessed_val_ds,
    verbose=2,
)
```

```
Epoch 1/20
8/8 - 3s - 352ms/step - accuracy: 0.5200 - loss: 0.7407 - val_accuracy: 0.6196 - val_loss: 0.6663
Epoch 2/20
8/8 - 0s - 20ms/step - accuracy: 0.5881 - loss: 0.6874 - val_accuracy: 0.7732 - val_loss: 0.6015
Epoch 3/20
8/8 - 0s - 19ms/step - accuracy: 0.6580 - loss: 0.6192 - val_accuracy: 0.7839 - val_loss: 0.5577
Epoch 4/20
8/8 - 0s - 19ms/step - accuracy: 0.7096 - loss: 0.5721 - val_accuracy: 0.7856 - val_loss: 0.5200
Epoch 5/20
8/8 - 0s - 18ms/step - accuracy: 0.7292 - loss: 0.5553 - val_accuracy: 0.7764 - val_loss: 0.4853
Epoch 6/20
8/8 - 0s - 19ms/step - accuracy: 0.7561 - loss: 0.5103 - val_accuracy: 0.7732 - val_loss: 0.4627
Epoch 7/20
8/8 - 0s - 19ms/step - accuracy: 0.7231 - loss: 0.5374 - val_accuracy: 0.7764 - val_loss: 0.4413
Epoch 8/20
8/8 - 0s - 19ms/step - accuracy: 0.7769 - loss: 0.4564 - val_accuracy: 0.7683 - val_loss: 0.4320
Epoch 9/20
8/8 - 0s - 18ms/step - accuracy: 0.7769 - loss: 0.4324 - val_accuracy: 0.7856 - val_loss: 0.4191
Epoch 10/20
8/8 - 0s - 19ms/step - accuracy: 0.7778 - loss: 0.4340 - val_accuracy: 0.7888 - val_loss: 0.4084
Epoch 11/20
8/8 - 0s - 19ms/step - accuracy: 0.7760 - loss: 0.4124 - val_accuracy: 0.7716 - val_loss: 0.3977
Epoch 12/20
8/8 - 0s - 19ms/step - accuracy: 0.7964 - loss: 0.4125 - val_accuracy: 0.7667 - val_loss: 0.3959
Epoch 13/20
8/8 - 0s - 18ms/step - accuracy: 0.8051 - loss: 0.3979 - val_accuracy: 0.7856 - val_loss: 0.3891
Epoch 14/20
8/8 - 0s - 19ms/step - accuracy: 0.8043 - loss: 0.3891 - val_accuracy: 0.7856 - val_loss: 0.3840
Epoch 15/20
8/8 - 0s - 18ms/step - accuracy: 0.8633 - loss: 0.3571 - val_accuracy: 0.7872 - val_loss: 0.3764
Epoch 16/20
8/8 - 0s - 19ms/step - accuracy: 0.8728 - loss: 0.3548 - val_accuracy: 0.7888 - val_loss: 0.3699
Epoch 17/20
8/8 - 0s - 19ms/step - accuracy: 0.8698 - loss: 0.3171 - val_accuracy: 0.7872 - val_loss: 0.3727
Epoch 18/20
8/8 - 0s - 18ms/step - accuracy: 0.8529 - loss: 0.3454 - val_accuracy: 0.7904 - val_loss: 0.3669
Epoch 19/20
8/8 - 0s - 17ms/step - accuracy: 0.8589 - loss: 0.3359 - val_accuracy: 0.7980 - val_loss: 0.3770
Epoch 20/20
8/8 - 0s - 17ms/step - accuracy: 0.8455 - loss: 0.3113 - val_accuracy: 0.8044 - val_loss: 0.3684

<keras.src.callbacks.history.History at 0x7f139bb4ed10>
```

We quickly get to 80% validation accuracy.
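
If you want to recompute that number outside of `fit()` (this step is not in the original example), you can evaluate the training model on the preprocessed validation set; the exact figures will vary slightly from run to run.

```python
# Re-evaluate on the preprocessed validation set (illustrative check;
# numbers will differ slightly between runs).
val_loss, val_acc = training_model.evaluate(preprocessed_val_ds, verbose=0)
print(f"Validation accuracy: {val_acc:.3f}")
```
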

---

## Inference on new data with the end-to-end model

Now, we can use our inference model (which includes the `FeatureSpace`) to make predictions based on dicts of raw feature values, as follows:

```python
sample = {
    "age": 60,
    "sex": 1,
    "cp": 1,
    "trestbps": 145,
    "chol": 233,
    "fbs": 1,
    "restecg": 2,
    "thalach": 150,
    "exang": 0,
    "oldpeak": 2.3,
    "slope": 3,
    "ca": 0,
    "thal": "fixed",
}

input_dict = {name: tf.convert_to_tensor([value]) for name, value in sample.items()}
predictions = inference_model.predict(input_dict)

print(
    f"This particular patient had a {100 * predictions[0][0]:.2f}% probability "
    "of having a heart disease, as evaluated by our model."
)
```

```
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 273ms/step
This particular patient had a 43.13% probability of having a heart disease, as evaluated by our model.
```
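
As a possible follow-up (not covered in this example), the end-to-end inference model can be saved with the standard Keras saving API so you can reload it later and feed it raw feature dicts directly. Treat this as a sketch: the file name is arbitrary and serialization details may depend on your Keras version.

```python
# Sketch: persist the end-to-end model (raw features in, probability out),
# then reload it and run the same prediction.
inference_model.save("heart_disease_inference.keras")
reloaded_model = keras.saving.load_model("heart_disease_inference.keras")
print(reloaded_model.predict(input_dict))
```
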