<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1"> <meta name="description" content="Keras documentation"> <meta name="author" content="Keras Team"> <link rel="shortcut icon" href="https://keras.io/img/favicon.ico"> <link rel="canonical" href="https://keras.io/examples/nlp/active_learning_review_classification/" /> <!-- Social --> <meta property="og:title" content="Keras documentation: Review Classification using Active Learning"> <meta property="og:image" content="https://keras.io/img/logo-k-keras-wb.png"> <meta name="twitter:title" content="Keras documentation: Review Classification using Active Learning"> <meta name="twitter:image" content="https://keras.io/img/k-keras-social.png"> <meta name="twitter:card" content="summary"> <title>Review Classification using Active Learning</title> <!-- Bootstrap core CSS --> <link href="/css/bootstrap.min.css" rel="stylesheet"> <!-- Custom fonts for this template --> <link href="https://fonts.googleapis.com/css2?family=Open+Sans:wght@400;600;700;800&display=swap" rel="stylesheet"> <!-- Custom styles for this template --> <link href="/css/docs.css" rel="stylesheet"> <link href="/css/monokai.css" rel="stylesheet"> <!-- Google Tag Manager --> <script>(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start': new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0], j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src= 'https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f); })(window,document,'script','dataLayer','GTM-5DNGF4N'); </script> <script> (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) })(window,document,'script','https://www.google-analytics.com/analytics.js','ga'); ga('create', 'UA-175165319-128', 
'auto'); ga('send', 'pageview'); </script> <!-- End Google Tag Manager --> <script async defer src="https://buttons.github.io/buttons.js"></script> </head> <body> <!-- Google Tag Manager (noscript) --> <noscript><iframe src="https://www.googletagmanager.com/ns.html?id=GTM-5DNGF4N" height="0" width="0" style="display:none;visibility:hidden"></iframe></noscript> <!-- End Google Tag Manager (noscript) --> <div class='k-page'> <div class="k-nav" id="nav-menu"> <a href='/'><img src='/img/logo-small.png' class='logo-small' /></a> <div class="nav flex-column nav-pills" role="tablist" aria-orientation="vertical"> <a class="nav-link" href="/about/" role="tab" aria-selected="">About Keras</a> <a class="nav-link" href="/getting_started/" role="tab" aria-selected="">Getting started</a> <a class="nav-link" href="/guides/" role="tab" aria-selected="">Developer guides</a> <a class="nav-link" href="/api/" role="tab" aria-selected="">Keras 3 API documentation</a> <a class="nav-link" href="/2.18/api/" role="tab" aria-selected="">Keras 2 API documentation</a> <a class="nav-link active" href="/examples/" role="tab" aria-selected="">Code examples</a> <a class="nav-sublink" href="/examples/vision/">Computer Vision</a> <a class="nav-sublink active" href="/examples/nlp/">Natural Language Processing</a> <a class="nav-sublink2" href="/examples/nlp/text_classification_from_scratch/">Text classification from scratch</a> <a class="nav-sublink2 active" href="/examples/nlp/active_learning_review_classification/">Review Classification using Active Learning</a> <a class="nav-sublink2" href="/examples/nlp/fnet_classification_with_keras_hub/">Text Classification using FNet</a> <a class="nav-sublink2" href="/examples/nlp/multi_label_classification/">Large-scale multi-label text classification</a> <a class="nav-sublink2" href="/examples/nlp/text_classification_with_transformer/">Text classification with Transformer</a> <a class="nav-sublink2" 
href="/examples/nlp/text_classification_with_switch_transformer/">Text classification with Switch Transformer</a> <a class="nav-sublink2" href="/examples/nlp/tweet-classification-using-tfdf/">Text classification using Decision Forests and pretrained embeddings</a> <a class="nav-sublink2" href="/examples/nlp/pretrained_word_embeddings/">Using pre-trained word embeddings</a> <a class="nav-sublink2" href="/examples/nlp/bidirectional_lstm_imdb/">Bidirectional LSTM on IMDB</a> <a class="nav-sublink2" href="/examples/nlp/data_parallel_training_with_keras_hub/">Data Parallel Training with KerasHub and tf.distribute</a> <a class="nav-sublink2" href="/examples/nlp/neural_machine_translation_with_keras_hub/">English-to-Spanish translation with KerasHub</a> <a class="nav-sublink2" href="/examples/nlp/neural_machine_translation_with_transformer/">English-to-Spanish translation with a sequence-to-sequence Transformer</a> <a class="nav-sublink2" href="/examples/nlp/lstm_seq2seq/">Character-level recurrent sequence-to-sequence model</a> <a class="nav-sublink2" href="/examples/nlp/multimodal_entailment/">Multimodal entailment</a> <a class="nav-sublink2" href="/examples/nlp/ner_transformers/">Named Entity Recognition using Transformers</a> <a class="nav-sublink2" href="/examples/nlp/text_extraction_with_bert/">Text Extraction with BERT</a> <a class="nav-sublink2" href="/examples/nlp/addition_rnn/">Sequence to sequence learning for performing number addition</a> <a class="nav-sublink2" href="/examples/nlp/semantic_similarity_with_keras_hub/">Semantic Similarity with KerasHub</a> <a class="nav-sublink2" href="/examples/nlp/semantic_similarity_with_bert/">Semantic Similarity with BERT</a> <a class="nav-sublink2" href="/examples/nlp/sentence_embeddings_with_sbert/">Sentence embeddings using Siamese RoBERTa-networks</a> <a class="nav-sublink2" href="/examples/nlp/masked_language_modeling/">End-to-end Masked Language Modeling with BERT</a> <a class="nav-sublink2" 
href="/examples/nlp/abstractive_summarization_with_bart/">Abstractive Text Summarization with BART</a> <a class="nav-sublink2" href="/examples/nlp/pretraining_BERT/">Pretraining BERT with Hugging Face Transformers</a> <a class="nav-sublink2" href="/examples/nlp/parameter_efficient_finetuning_of_gpt2_with_lora/">Parameter-efficient fine-tuning of GPT-2 with LoRA</a> <a class="nav-sublink2" href="/examples/nlp/mlm_training_tpus/">Training a language model from scratch with 🤗 Transformers and TPUs</a> <a class="nav-sublink2" href="/examples/nlp/multiple_choice_task_with_transfer_learning/">MultipleChoice Task with Transfer Learning</a> <a class="nav-sublink2" href="/examples/nlp/question_answering/">Question Answering with Hugging Face Transformers</a> <a class="nav-sublink2" href="/examples/nlp/t5_hf_summarization/">Abstractive Summarization with Hugging Face Transformers</a> <a class="nav-sublink" href="/examples/structured_data/">Structured Data</a> <a class="nav-sublink" href="/examples/timeseries/">Timeseries</a> <a class="nav-sublink" href="/examples/generative/">Generative Deep Learning</a> <a class="nav-sublink" href="/examples/audio/">Audio Data</a> <a class="nav-sublink" href="/examples/rl/">Reinforcement Learning</a> <a class="nav-sublink" href="/examples/graph/">Graph Data</a> <a class="nav-sublink" href="/examples/keras_recipes/">Quick Keras Recipes</a> <a class="nav-link" href="/keras_tuner/" role="tab" aria-selected="">KerasTuner: Hyperparameter Tuning</a> <a class="nav-link" href="/keras_hub/" role="tab" aria-selected="">KerasHub: Pretrained Models</a> <a class="nav-link" href="/keras_cv/" role="tab" aria-selected="">KerasCV: Computer Vision Workflows</a> <a class="nav-link" href="/keras_nlp/" role="tab" aria-selected="">KerasNLP: Natural Language Workflows</a> </div> </div> <div class='k-main'> <div class='k-main-top'> <script> function displayDropdownMenu() { e = document.getElementById("nav-menu"); if (e.style.display == "block") { e.style.display = 
"none"; } else { e.style.display = "block"; document.getElementById("dropdown-nav").style.display = "block"; } } function resetMobileUI() { if (window.innerWidth <= 840) { document.getElementById("nav-menu").style.display = "none"; document.getElementById("dropdown-nav").style.display = "block"; } else { document.getElementById("nav-menu").style.display = "block"; document.getElementById("dropdown-nav").style.display = "none"; } var navmenu = document.getElementById("nav-menu"); var menuheight = navmenu.clientHeight; var kmain = document.getElementById("k-main-id"); kmain.style.minHeight = (menuheight + 100) + 'px'; } window.onresize = resetMobileUI; window.addEventListener("load", (event) => { resetMobileUI() }); </script> <div id='dropdown-nav' onclick="displayDropdownMenu();"> <svg viewBox="-20 -20 120 120" width="60" height="60"> <rect width="100" height="20"></rect> <rect y="30" width="100" height="20"></rect> <rect y="60" width="100" height="20"></rect> </svg> </div> <form class="bd-search d-flex align-items-center k-search-form" id="search-form"> <input type="search" class="k-search-input" id="search-input" placeholder="Search Keras documentation..." aria-label="Search Keras documentation..." 
autocomplete="off"> <button class="k-search-btn"> <svg width="13" height="13" viewBox="0 0 13 13"><title>search</title><path d="m4.8495 7.8226c0.82666 0 1.5262-0.29146 2.0985-0.87438 0.57232-0.58292 0.86378-1.2877 0.87438-2.1144 0.010599-0.82666-0.28086-1.5262-0.87438-2.0985-0.59352-0.57232-1.293-0.86378-2.0985-0.87438-0.8055-0.010599-1.5103 0.28086-2.1144 0.87438-0.60414 0.59352-0.8956 1.293-0.87438 2.0985 0.021197 0.8055 0.31266 1.5103 0.87438 2.1144 0.56172 0.60414 1.2665 0.8956 2.1144 0.87438zm4.4695 0.2115 3.681 3.6819-1.259 1.284-3.6817-3.7 0.0019784-0.69479-0.090043-0.098846c-0.87973 0.76087-1.92 1.1413-3.1207 1.1413-1.3553 0-2.5025-0.46363-3.4417-1.3909s-1.4088-2.0686-1.4088-3.4239c0-1.3553 0.4696-2.4966 1.4088-3.4239 0.9392-0.92727 2.0864-1.3969 3.4417-1.4088 1.3553-0.011889 2.4906 0.45771 3.406 1.4088 0.9154 0.95107 1.379 2.0924 1.3909 3.4239 0 1.2126-0.38043 2.2588-1.1413 3.1385l0.098834 0.090049z"></path></svg> </button> </form> <script> var form = document.getElementById('search-form'); form.onsubmit = function(e) { e.preventDefault(); var query = document.getElementById('search-input').value; window.location.href = '/search.html?query=' + query; return false; } </script> </div> <div class='k-main-inner' id='k-main-id'> <div class='k-location-slug'> <span class="k-location-slug-pointer">►</span> <a href='/examples/'>Code examples</a> / <a href='/examples/nlp/'>Natural Language Processing</a> / Review Classification using Active Learning </div> <div class='k-content'> <h1 id="review-classification-using-active-learning">Review Classification using Active Learning</h1> <p><strong>Author:</strong> <a href="https://twitter.com/getdarshan">Darshan Deshpande</a><br> <strong>Date created:</strong> 2021/10/29<br> <strong>Last modified:</strong> 2024/05/08<br> <strong>Description:</strong> Demonstrating the advantages of active learning through review classification.</p> <div class='example_version_banner keras_3'>ⓘ This example uses Keras 3</div> <p><img 
class="k-inline-icon" src="https://colab.research.google.com/img/colab_favicon.ico"/> <a href="https://colab.research.google.com/github/keras-team/keras-io/blob/master/examples/nlp/ipynb/active_learning_review_classification.ipynb"><strong>View in Colab</strong></a> <span class="k-dot">•</span><img class="k-inline-icon" src="https://github.com/favicon.ico"/> <a href="https://github.com/keras-team/keras-io/blob/master/examples/nlp/active_learning_review_classification.py"><strong>GitHub source</strong></a></p> <hr /> <h2 id="introduction">Introduction</h2> <p>With the growth of data-centric Machine Learning, Active Learning has gained popularity amongst businesses and researchers. Active Learning seeks to progressively train ML models so that the resulting model requires less training data to achieve competitive scores.</p> <p>The structure of an Active Learning pipeline involves a classifier and an oracle. The oracle is an annotator that cleans, selects, and labels the data, and feeds it to the model when required. The oracle is a trained individual, or a group of individuals, who ensure consistency in the labeling of new data.</p> <p>The process starts with annotating a small subset of the full dataset and training an initial model. The best model checkpoint is saved and then tested on a balanced test set. The test set must be carefully sampled because the full training process will depend on it. Once we have the initial evaluation scores, the oracle is tasked with labeling more samples; the number of data points to sample is usually determined by the business requirements. After that, the newly sampled data is added to the training set, and the training procedure repeats. 
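</p> <p>The loop just described can be sketched in a few lines of Python. This is a toy, self-contained illustration; <code>fit</code> and <code>least_confidence_sample</code> are hypothetical stand-ins invented for the sketch, not the model or the sampling strategy used later in this example:</p>

```python
# Toy active-learning loop with least-confidence sampling.
# `fit` and `least_confidence_sample` are hypothetical stand-ins,
# not this example's actual model or sampling strategy.


def fit(labeled):
    """Toy 'model': predicts P(positive) = fraction of positive labels seen."""
    p = sum(y for _, y in labeled) / len(labeled)
    return lambda x: p


def least_confidence_sample(model, pool, k):
    """Pick the k pool items whose predicted probability is closest to 0.5."""
    return sorted(pool, key=lambda item: abs(model(item[0]) - 0.5))[:k]


def active_learning_loop(labeled, pool, rounds, k):
    model = fit(labeled)  # initial model trained on the small seed set
    for _ in range(rounds):
        batch = least_confidence_sample(model, pool, k)
        for item in batch:  # in practice, the oracle supplies these labels
            pool.remove(item)
            labeled.append(item)
        model = fit(labeled)  # retrain on the grown training set
    return model, labeled


seed = [("good movie", 1), ("bad movie", 0)]
pool = [("great film", 1), ("awful film", 0), ("fine film", 1)]
model, labeled = active_learning_loop(seed, pool, rounds=1, k=2)
print(len(labeled), len(pool))  # 4 1
```

<p>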
This cycle continues until either an acceptable score is reached or some other business metric is met.</p> <p>This tutorial provides a basic demonstration of how Active Learning works by implementing a ratio-based (least confidence) sampling strategy that results in lower overall false positive and false negative rates when compared to a model trained on the entire dataset. This sampling falls under the domain of <em>uncertainty sampling</em>, in which new samples are selected based on how uncertain the model's output is for the corresponding label. In our example, we compare our model's false positive and false negative rates and annotate the new data based on their ratio.</p> <p>Some other sampling techniques include:</p> <ol> <li><a href="https://www.researchgate.net/publication/51909346_Committee-Based_Sample_Selection_for_Probabilistic_Classifiers">Committee sampling</a>: Using multiple models to vote for the best data points to be sampled.</li> <li><a href="https://www.researchgate.net/publication/51909346_Committee-Based_Sample_Selection_for_Probabilistic_Classifiers">Entropy reduction</a>: Sampling according to an entropy threshold, selecting more of the samples that produce the highest entropy score.</li> <li><a href="https://arxiv.org/abs/1906.00025v1">Minimum margin based sampling</a>: Selecting the data points closest to the decision boundary.</li> </ol> <hr /> <h2 id="importing-required-libraries">Importing required libraries</h2> <div class="codehilite"><pre><span></span><code><span class="kn">import</span> <span class="nn">os</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="p">[</span><span class="s2">"KERAS_BACKEND"</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"tensorflow"</span> <span class="c1"># @param ["tensorflow", "jax", "torch"]</span> <span class="kn">import</span> <span class="nn">keras</span> <span class="kn">from</span> <span class="nn">keras</span> <span 
class="kn">import</span> <span class="n">ops</span> <span class="kn">from</span> <span class="nn">keras</span> <span class="kn">import</span> <span class="n">layers</span> <span class="kn">import</span> <span class="nn">tensorflow_datasets</span> <span class="k">as</span> <span class="nn">tfds</span> <span class="kn">import</span> <span class="nn">tensorflow</span> <span class="k">as</span> <span class="nn">tf</span> <span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="nn">plt</span> <span class="kn">import</span> <span class="nn">re</span> <span class="kn">import</span> <span class="nn">string</span> <span class="n">tfds</span><span class="o">.</span><span class="n">disable_progress_bar</span><span class="p">()</span> </code></pre></div> <hr /> <h2 id="loading-and-preprocessing-the-data">Loading and preprocessing the data</h2> <p>We will be using the IMDB reviews dataset for our experiments. This dataset has 50,000 reviews in total, including training and testing splits. 
We will merge these splits and sample our own, balanced training, validation and testing sets.</p> <div class="codehilite"><pre><span></span><code><span class="n">dataset</span> <span class="o">=</span> <span class="n">tfds</span><span class="o">.</span><span class="n">load</span><span class="p">(</span> <span class="s2">"imdb_reviews"</span><span class="p">,</span> <span class="n">split</span><span class="o">=</span><span class="s2">"train + test"</span><span class="p">,</span> <span class="n">as_supervised</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">batch_size</span><span class="o">=-</span><span class="mi">1</span><span class="p">,</span> <span class="n">shuffle_files</span><span class="o">=</span><span class="kc">False</span><span class="p">,</span> <span class="p">)</span> <span class="n">reviews</span><span class="p">,</span> <span class="n">labels</span> <span class="o">=</span> <span class="n">tfds</span><span class="o">.</span><span class="n">as_numpy</span><span class="p">(</span><span class="n">dataset</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="s2">"Total examples:"</span><span class="p">,</span> <span class="n">reviews</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span> </code></pre></div> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Total examples: 50000 </code></pre></div> </div> <p>Active learning starts with labeling a subset of data. 
For the ratio sampling technique that we will be using, we will need well-balanced training, validation and testing splits.</p> <div class="codehilite"><pre><span></span><code><span class="n">val_split</span> <span class="o">=</span> <span class="mi">2500</span> <span class="n">test_split</span> <span class="o">=</span> <span class="mi">2500</span> <span class="n">train_split</span> <span class="o">=</span> <span class="mi">7500</span> <span class="c1"># Separating the negative and positive samples for manual stratification</span> <span class="n">x_positives</span><span class="p">,</span> <span class="n">y_positives</span> <span class="o">=</span> <span class="n">reviews</span><span class="p">[</span><span class="n">labels</span> <span class="o">==</span> <span class="mi">1</span><span class="p">],</span> <span class="n">labels</span><span class="p">[</span><span class="n">labels</span> <span class="o">==</span> <span class="mi">1</span><span class="p">]</span> <span class="n">x_negatives</span><span class="p">,</span> <span class="n">y_negatives</span> <span class="o">=</span> <span class="n">reviews</span><span class="p">[</span><span class="n">labels</span> <span class="o">==</span> <span class="mi">0</span><span class="p">],</span> <span class="n">labels</span><span class="p">[</span><span class="n">labels</span> <span class="o">==</span> <span class="mi">0</span><span class="p">]</span> <span class="c1"># Creating training, validation and testing splits</span> <span class="n">x_val</span><span class="p">,</span> <span class="n">y_val</span> <span class="o">=</span> <span class="p">(</span> <span class="n">tf</span><span class="o">.</span><span class="n">concat</span><span class="p">((</span><span class="n">x_positives</span><span class="p">[:</span><span class="n">val_split</span><span class="p">],</span> <span class="n">x_negatives</span><span class="p">[:</span><span class="n">val_split</span><span class="p">]),</span> <span class="mi">0</span><span 
class="p">),</span> <span class="n">tf</span><span class="o">.</span><span class="n">concat</span><span class="p">((</span><span class="n">y_positives</span><span class="p">[:</span><span class="n">val_split</span><span class="p">],</span> <span class="n">y_negatives</span><span class="p">[:</span><span class="n">val_split</span><span class="p">]),</span> <span class="mi">0</span><span class="p">),</span> <span class="p">)</span> <span class="n">x_test</span><span class="p">,</span> <span class="n">y_test</span> <span class="o">=</span> <span class="p">(</span> <span class="n">tf</span><span class="o">.</span><span class="n">concat</span><span class="p">(</span> <span class="p">(</span> <span class="n">x_positives</span><span class="p">[</span><span class="n">val_split</span> <span class="p">:</span> <span class="n">val_split</span> <span class="o">+</span> <span class="n">test_split</span><span class="p">],</span> <span class="n">x_negatives</span><span class="p">[</span><span class="n">val_split</span> <span class="p">:</span> <span class="n">val_split</span> <span class="o">+</span> <span class="n">test_split</span><span class="p">],</span> <span class="p">),</span> <span class="mi">0</span><span class="p">,</span> <span class="p">),</span> <span class="n">tf</span><span class="o">.</span><span class="n">concat</span><span class="p">(</span> <span class="p">(</span> <span class="n">y_positives</span><span class="p">[</span><span class="n">val_split</span> <span class="p">:</span> <span class="n">val_split</span> <span class="o">+</span> <span class="n">test_split</span><span class="p">],</span> <span class="n">y_negatives</span><span class="p">[</span><span class="n">val_split</span> <span class="p">:</span> <span class="n">val_split</span> <span class="o">+</span> <span class="n">test_split</span><span class="p">],</span> <span class="p">),</span> <span class="mi">0</span><span class="p">,</span> <span class="p">),</span> <span class="p">)</span> <span 
class="n">x_train</span><span class="p">,</span> <span class="n">y_train</span> <span class="o">=</span> <span class="p">(</span> <span class="n">tf</span><span class="o">.</span><span class="n">concat</span><span class="p">(</span> <span class="p">(</span> <span class="n">x_positives</span><span class="p">[</span><span class="n">val_split</span> <span class="o">+</span> <span class="n">test_split</span> <span class="p">:</span> <span class="n">val_split</span> <span class="o">+</span> <span class="n">test_split</span> <span class="o">+</span> <span class="n">train_split</span><span class="p">],</span> <span class="n">x_negatives</span><span class="p">[</span><span class="n">val_split</span> <span class="o">+</span> <span class="n">test_split</span> <span class="p">:</span> <span class="n">val_split</span> <span class="o">+</span> <span class="n">test_split</span> <span class="o">+</span> <span class="n">train_split</span><span class="p">],</span> <span class="p">),</span> <span class="mi">0</span><span class="p">,</span> <span class="p">),</span> <span class="n">tf</span><span class="o">.</span><span class="n">concat</span><span class="p">(</span> <span class="p">(</span> <span class="n">y_positives</span><span class="p">[</span><span class="n">val_split</span> <span class="o">+</span> <span class="n">test_split</span> <span class="p">:</span> <span class="n">val_split</span> <span class="o">+</span> <span class="n">test_split</span> <span class="o">+</span> <span class="n">train_split</span><span class="p">],</span> <span class="n">y_negatives</span><span class="p">[</span><span class="n">val_split</span> <span class="o">+</span> <span class="n">test_split</span> <span class="p">:</span> <span class="n">val_split</span> <span class="o">+</span> <span class="n">test_split</span> <span class="o">+</span> <span class="n">train_split</span><span class="p">],</span> <span class="p">),</span> <span class="mi">0</span><span class="p">,</span> <span class="p">),</span> 
<span class="p">)</span> <span class="c1"># Remaining pool of samples are stored separately. These are only labeled as and when required</span> <span class="n">x_pool_positives</span><span class="p">,</span> <span class="n">y_pool_positives</span> <span class="o">=</span> <span class="p">(</span> <span class="n">x_positives</span><span class="p">[</span><span class="n">val_split</span> <span class="o">+</span> <span class="n">test_split</span> <span class="o">+</span> <span class="n">train_split</span> <span class="p">:],</span> <span class="n">y_positives</span><span class="p">[</span><span class="n">val_split</span> <span class="o">+</span> <span class="n">test_split</span> <span class="o">+</span> <span class="n">train_split</span> <span class="p">:],</span> <span class="p">)</span> <span class="n">x_pool_negatives</span><span class="p">,</span> <span class="n">y_pool_negatives</span> <span class="o">=</span> <span class="p">(</span> <span class="n">x_negatives</span><span class="p">[</span><span class="n">val_split</span> <span class="o">+</span> <span class="n">test_split</span> <span class="o">+</span> <span class="n">train_split</span> <span class="p">:],</span> <span class="n">y_negatives</span><span class="p">[</span><span class="n">val_split</span> <span class="o">+</span> <span class="n">test_split</span> <span class="o">+</span> <span class="n">train_split</span> <span class="p">:],</span> <span class="p">)</span> <span class="c1"># Creating TF Datasets for faster prefetching and parallelization</span> <span class="n">train_dataset</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">Dataset</span><span class="o">.</span><span class="n">from_tensor_slices</span><span class="p">((</span><span class="n">x_train</span><span class="p">,</span> <span class="n">y_train</span><span class="p">))</span> <span class="n">val_dataset</span> <span class="o">=</span> <span 
class="n">tf</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">Dataset</span><span class="o">.</span><span class="n">from_tensor_slices</span><span class="p">((</span><span class="n">x_val</span><span class="p">,</span> <span class="n">y_val</span><span class="p">))</span> <span class="n">test_dataset</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">Dataset</span><span class="o">.</span><span class="n">from_tensor_slices</span><span class="p">((</span><span class="n">x_test</span><span class="p">,</span> <span class="n">y_test</span><span class="p">))</span> <span class="n">pool_negatives</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">Dataset</span><span class="o">.</span><span class="n">from_tensor_slices</span><span class="p">(</span> <span class="p">(</span><span class="n">x_pool_negatives</span><span class="p">,</span> <span class="n">y_pool_negatives</span><span class="p">)</span> <span class="p">)</span> <span class="n">pool_positives</span> <span class="o">=</span> <span class="n">tf</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">Dataset</span><span class="o">.</span><span class="n">from_tensor_slices</span><span class="p">(</span> <span class="p">(</span><span class="n">x_pool_positives</span><span class="p">,</span> <span class="n">y_pool_positives</span><span class="p">)</span> <span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Initial training set size: </span><span class="si">{</span><span class="nb">len</span><span class="p">(</span><span class="n">train_dataset</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> <span 
class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Validation set size: </span><span class="si">{</span><span class="nb">len</span><span class="p">(</span><span class="n">val_dataset</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Testing set size: </span><span class="si">{</span><span class="nb">len</span><span class="p">(</span><span class="n">test_dataset</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Unlabeled negative pool: </span><span class="si">{</span><span class="nb">len</span><span class="p">(</span><span class="n">pool_negatives</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Unlabeled positive pool: </span><span class="si">{</span><span class="nb">len</span><span class="p">(</span><span class="n">pool_positives</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> </code></pre></div> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Initial training set size: 15000 Validation set size: 5000 Testing set size: 5000 Unlabeled negative pool: 12500 Unlabeled positive pool: 12500 </code></pre></div> </div> <h3 id="fitting-the-textvectorization-layer">Fitting the <code>TextVectorization</code> layer</h3> <p>Since we are working with text data, we will need to encode the text strings as vectors which would then be passed through an <code>Embedding</code> layer. 
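</p> <p>As a small standalone illustration (separate from the <code>vectorizer</code> this example builds), an adapted <code>TextVectorization</code> layer maps raw strings to fixed-length sequences of integer token ids:</p>

```python
import os

os.environ["KERAS_BACKEND"] = "tensorflow"

from keras import layers

# Standalone TextVectorization illustration: after `adapt()` builds the
# vocabulary, the layer turns strings into integer token ids, padded or
# truncated to `output_sequence_length`.
demo = layers.TextVectorization(max_tokens=10, output_sequence_length=5)
demo.adapt(["the movie was great", "the movie was terrible"])
tokens = demo(["the movie was great"])
print(tokens.shape)  # (1, 5)
```

<p>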
To make this tokenization process faster, we use the <code>map()</code> function with its parallelization functionality.</p> <div class="codehilite"><pre><span></span><code><span class="n">vectorizer</span> <span class="o">=</span> <span class="n">layers</span><span class="o">.</span><span class="n">TextVectorization</span><span class="p">(</span> <span class="mi">3000</span><span class="p">,</span> <span class="n">standardize</span><span class="o">=</span><span class="s2">"lower_and_strip_punctuation"</span><span class="p">,</span> <span class="n">output_sequence_length</span><span class="o">=</span><span class="mi">150</span> <span class="p">)</span> <span class="c1"># Adapting the dataset</span> <span class="n">vectorizer</span><span class="o">.</span><span class="n">adapt</span><span class="p">(</span> <span class="n">train_dataset</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">:</span> <span class="n">x</span><span class="p">,</span> <span class="n">num_parallel_calls</span><span class="o">=</span><span class="n">tf</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">AUTOTUNE</span><span class="p">)</span><span class="o">.</span><span class="n">batch</span><span class="p">(</span><span class="mi">256</span><span class="p">)</span> <span class="p">)</span> <span class="k">def</span> <span class="nf">vectorize_text</span><span class="p">(</span><span class="n">text</span><span class="p">,</span> <span class="n">label</span><span class="p">):</span> <span class="n">text</span> <span class="o">=</span> <span class="n">vectorizer</span><span class="p">(</span><span class="n">text</span><span class="p">)</span> <span class="k">return</span> <span class="n">text</span><span class="p">,</span> <span class="n">label</span> <span class="n">train_dataset</span> <span 
class="o">=</span> <span class="n">train_dataset</span><span class="o">.</span><span class="n">map</span><span class="p">(</span> <span class="n">vectorize_text</span><span class="p">,</span> <span class="n">num_parallel_calls</span><span class="o">=</span><span class="n">tf</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">AUTOTUNE</span> <span class="p">)</span><span class="o">.</span><span class="n">prefetch</span><span class="p">(</span><span class="n">tf</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">AUTOTUNE</span><span class="p">)</span> <span class="n">pool_negatives</span> <span class="o">=</span> <span class="n">pool_negatives</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">vectorize_text</span><span class="p">,</span> <span class="n">num_parallel_calls</span><span class="o">=</span><span class="n">tf</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">AUTOTUNE</span><span class="p">)</span> <span class="n">pool_positives</span> <span class="o">=</span> <span class="n">pool_positives</span><span class="o">.</span><span class="n">map</span><span class="p">(</span><span class="n">vectorize_text</span><span class="p">,</span> <span class="n">num_parallel_calls</span><span class="o">=</span><span class="n">tf</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">AUTOTUNE</span><span class="p">)</span> <span class="n">val_dataset</span> <span class="o">=</span> <span class="n">val_dataset</span><span class="o">.</span><span class="n">batch</span><span class="p">(</span><span class="mi">256</span><span class="p">)</span><span class="o">.</span><span class="n">map</span><span class="p">(</span> <span class="n">vectorize_text</span><span class="p">,</span> <span class="n">num_parallel_calls</span><span class="o">=</span><span 
class="n">tf</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">AUTOTUNE</span> <span class="p">)</span> <span class="n">test_dataset</span> <span class="o">=</span> <span class="n">test_dataset</span><span class="o">.</span><span class="n">batch</span><span class="p">(</span><span class="mi">256</span><span class="p">)</span><span class="o">.</span><span class="n">map</span><span class="p">(</span> <span class="n">vectorize_text</span><span class="p">,</span> <span class="n">num_parallel_calls</span><span class="o">=</span><span class="n">tf</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">AUTOTUNE</span> <span class="p">)</span> </code></pre></div> <hr /> <h2 id="creating-helper-functions">Creating Helper Functions</h2> <div class="codehilite"><pre><span></span><code><span class="c1"># Helper function for merging new history objects with older ones</span> <span class="k">def</span> <span class="nf">append_history</span><span class="p">(</span><span class="n">losses</span><span class="p">,</span> <span class="n">val_losses</span><span class="p">,</span> <span class="n">accuracy</span><span class="p">,</span> <span class="n">val_accuracy</span><span class="p">,</span> <span class="n">history</span><span class="p">):</span> <span class="n">losses</span> <span class="o">=</span> <span class="n">losses</span> <span class="o">+</span> <span class="n">history</span><span class="o">.</span><span class="n">history</span><span class="p">[</span><span class="s2">"loss"</span><span class="p">]</span> <span class="n">val_losses</span> <span class="o">=</span> <span class="n">val_losses</span> <span class="o">+</span> <span class="n">history</span><span class="o">.</span><span class="n">history</span><span class="p">[</span><span class="s2">"val_loss"</span><span class="p">]</span> <span class="n">accuracy</span> <span class="o">=</span> <span class="n">accuracy</span> <span 
class="o">+</span> <span class="n">history</span><span class="o">.</span><span class="n">history</span><span class="p">[</span><span class="s2">"binary_accuracy"</span><span class="p">]</span> <span class="n">val_accuracy</span> <span class="o">=</span> <span class="n">val_accuracy</span> <span class="o">+</span> <span class="n">history</span><span class="o">.</span><span class="n">history</span><span class="p">[</span><span class="s2">"val_binary_accuracy"</span><span class="p">]</span> <span class="k">return</span> <span class="n">losses</span><span class="p">,</span> <span class="n">val_losses</span><span class="p">,</span> <span class="n">accuracy</span><span class="p">,</span> <span class="n">val_accuracy</span> <span class="c1"># Plotter function</span> <span class="k">def</span> <span class="nf">plot_history</span><span class="p">(</span><span class="n">losses</span><span class="p">,</span> <span class="n">val_losses</span><span class="p">,</span> <span class="n">accuracies</span><span class="p">,</span> <span class="n">val_accuracies</span><span class="p">):</span> <span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">losses</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">val_losses</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">([</span><span class="s2">"train_loss"</span><span class="p">,</span> <span class="s2">"val_loss"</span><span class="p">])</span> <span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s2">"Epochs"</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s2">"Loss"</span><span class="p">)</span> <span class="n">plt</span><span 
class="o">.</span><span class="n">show</span><span class="p">()</span> <span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">accuracies</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">val_accuracies</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">legend</span><span class="p">([</span><span class="s2">"train_accuracy"</span><span class="p">,</span> <span class="s2">"val_accuracy"</span><span class="p">])</span> <span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span><span class="p">(</span><span class="s2">"Epochs"</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span><span class="p">(</span><span class="s2">"Accuracy"</span><span class="p">)</span> <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> </code></pre></div> <hr /> <h2 id="creating-the-model">Creating the Model</h2> <p>We create a small bidirectional LSTM model. When using Active Learning, you should make sure that the model architecture is capable of overfitting to the initial data. 
Overfitting gives a strong hint that the model will have enough capacity for future, unseen data.</p> <div class="codehilite"><pre><span></span><code><span class="k">def</span> <span class="nf">create_model</span><span class="p">():</span> <span class="n">model</span> <span class="o">=</span> <span class="n">keras</span><span class="o">.</span><span class="n">models</span><span class="o">.</span><span class="n">Sequential</span><span class="p">(</span> <span class="p">[</span> <span class="n">layers</span><span class="o">.</span><span class="n">Input</span><span class="p">(</span><span class="n">shape</span><span class="o">=</span><span class="p">(</span><span class="mi">150</span><span class="p">,)),</span> <span class="n">layers</span><span class="o">.</span><span class="n">Embedding</span><span class="p">(</span><span class="n">input_dim</span><span class="o">=</span><span class="mi">3000</span><span class="p">,</span> <span class="n">output_dim</span><span class="o">=</span><span class="mi">128</span><span class="p">),</span> <span class="n">layers</span><span class="o">.</span><span class="n">Bidirectional</span><span class="p">(</span><span class="n">layers</span><span class="o">.</span><span class="n">LSTM</span><span class="p">(</span><span class="mi">32</span><span class="p">,</span> <span class="n">return_sequences</span><span class="o">=</span><span class="kc">True</span><span class="p">)),</span> <span class="n">layers</span><span class="o">.</span><span class="n">GlobalMaxPool1D</span><span class="p">(),</span> <span class="n">layers</span><span class="o">.</span><span class="n">Dense</span><span class="p">(</span><span class="mi">20</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s2">"relu"</span><span class="p">),</span> <span class="n">layers</span><span class="o">.</span><span class="n">Dropout</span><span class="p">(</span><span class="mf">0.5</span><span class="p">),</span> <span 
class="n">layers</span><span class="o">.</span><span class="n">Dense</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">activation</span><span class="o">=</span><span class="s2">"sigmoid"</span><span class="p">),</span> <span class="p">]</span> <span class="p">)</span> <span class="n">model</span><span class="o">.</span><span class="n">summary</span><span class="p">()</span> <span class="k">return</span> <span class="n">model</span> </code></pre></div> <hr /> <h2 id="training-on-the-entire-dataset">Training on the entire dataset</h2> <p>To show the effectiveness of Active Learning, we will first train the model on the entire dataset containing 40,000 labeled samples. This model will be used for comparison later.</p> <div class="codehilite"><pre><span></span><code><span class="k">def</span> <span class="nf">train_full_model</span><span class="p">(</span><span class="n">full_train_dataset</span><span class="p">,</span> <span class="n">val_dataset</span><span class="p">,</span> <span class="n">test_dataset</span><span class="p">):</span> <span class="n">model</span> <span class="o">=</span> <span class="n">create_model</span><span class="p">()</span> <span class="n">model</span><span class="o">.</span><span class="n">compile</span><span class="p">(</span> <span class="n">loss</span><span class="o">=</span><span class="s2">"binary_crossentropy"</span><span class="p">,</span> <span class="n">optimizer</span><span class="o">=</span><span class="s2">"rmsprop"</span><span class="p">,</span> <span class="n">metrics</span><span class="o">=</span><span class="p">[</span> <span class="n">keras</span><span class="o">.</span><span class="n">metrics</span><span class="o">.</span><span class="n">BinaryAccuracy</span><span class="p">(),</span> <span class="n">keras</span><span class="o">.</span><span class="n">metrics</span><span class="o">.</span><span class="n">FalseNegatives</span><span class="p">(),</span> <span 
class="n">keras</span><span class="o">.</span><span class="n">metrics</span><span class="o">.</span><span class="n">FalsePositives</span><span class="p">(),</span> <span class="p">],</span> <span class="p">)</span> <span class="c1"># We will save the best model at every epoch and load the best one for evaluation on the test set</span> <span class="n">history</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span> <span class="n">full_train_dataset</span><span class="o">.</span><span class="n">batch</span><span class="p">(</span><span class="mi">256</span><span class="p">),</span> <span class="n">epochs</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span> <span class="n">validation_data</span><span class="o">=</span><span class="n">val_dataset</span><span class="p">,</span> <span class="n">callbacks</span><span class="o">=</span><span class="p">[</span> <span class="n">keras</span><span class="o">.</span><span class="n">callbacks</span><span class="o">.</span><span class="n">EarlyStopping</span><span class="p">(</span><span class="n">patience</span><span class="o">=</span><span class="mi">4</span><span class="p">,</span> <span class="n">verbose</span><span class="o">=</span><span class="mi">1</span><span class="p">),</span> <span class="n">keras</span><span class="o">.</span><span class="n">callbacks</span><span class="o">.</span><span class="n">ModelCheckpoint</span><span class="p">(</span> <span class="s2">"FullModelCheckpoint.keras"</span><span class="p">,</span> <span class="n">verbose</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">save_best_only</span><span class="o">=</span><span class="kc">True</span> <span class="p">),</span> <span class="p">],</span> <span class="p">)</span> <span class="c1"># Plot history</span> <span class="n">plot_history</span><span class="p">(</span> <span 
class="n">history</span><span class="o">.</span><span class="n">history</span><span class="p">[</span><span class="s2">"loss"</span><span class="p">],</span> <span class="n">history</span><span class="o">.</span><span class="n">history</span><span class="p">[</span><span class="s2">"val_loss"</span><span class="p">],</span> <span class="n">history</span><span class="o">.</span><span class="n">history</span><span class="p">[</span><span class="s2">"binary_accuracy"</span><span class="p">],</span> <span class="n">history</span><span class="o">.</span><span class="n">history</span><span class="p">[</span><span class="s2">"val_binary_accuracy"</span><span class="p">],</span> <span class="p">)</span> <span class="c1"># Loading the best checkpoint</span> <span class="n">model</span> <span class="o">=</span> <span class="n">keras</span><span class="o">.</span><span class="n">models</span><span class="o">.</span><span class="n">load_model</span><span class="p">(</span><span class="s2">"FullModelCheckpoint.keras"</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="s2">"-"</span> <span class="o">*</span> <span class="mi">100</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span> <span class="s2">"Test set evaluation: "</span><span class="p">,</span> <span class="n">model</span><span class="o">.</span><span class="n">evaluate</span><span class="p">(</span><span class="n">test_dataset</span><span class="p">,</span> <span class="n">verbose</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">return_dict</span><span class="o">=</span><span class="kc">True</span><span class="p">),</span> <span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="s2">"-"</span> <span class="o">*</span> <span class="mi">100</span><span class="p">)</span> <span class="k">return</span> <span class="n">model</span> <span class="c1"># Sampling the full 
train dataset to train on</span> <span class="n">full_train_dataset</span> <span class="o">=</span> <span class="p">(</span> <span class="n">train_dataset</span><span class="o">.</span><span class="n">concatenate</span><span class="p">(</span><span class="n">pool_positives</span><span class="p">)</span> <span class="o">.</span><span class="n">concatenate</span><span class="p">(</span><span class="n">pool_negatives</span><span class="p">)</span> <span class="o">.</span><span class="n">cache</span><span class="p">()</span> <span class="o">.</span><span class="n">shuffle</span><span class="p">(</span><span class="mi">20000</span><span class="p">)</span> <span class="p">)</span> <span class="c1"># Training the full model</span> <span class="n">full_dataset_model</span> <span class="o">=</span> <span class="n">train_full_model</span><span class="p">(</span><span class="n">full_train_dataset</span><span class="p">,</span> <span class="n">val_dataset</span><span class="p">,</span> <span class="n">test_dataset</span><span class="p">)</span> </code></pre></div> <pre style="white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace"><span style="font-weight: bold">Model: "sequential"</span> </pre> <pre style="white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace">┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃<span style="font-weight: bold"> Layer (type) </span>┃<span style="font-weight: bold"> Output Shape </span>┃<span style="font-weight: bold"> Param # </span>┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ embedding (<span style="color: #0087ff; text-decoration-color: #0087ff">Embedding</span>) │ (<span style="color: #00d7ff; text-decoration-color: #00d7ff">None</span>, <span style="color: #00af00; text-decoration-color: #00af00">150</span>, <span style="color: #00af00; 
text-decoration-color: #00af00">128</span>) │ <span style="color: #00af00; text-decoration-color: #00af00">384,000</span> │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ bidirectional (<span style="color: #0087ff; text-decoration-color: #0087ff">Bidirectional</span>) │ (<span style="color: #00d7ff; text-decoration-color: #00d7ff">None</span>, <span style="color: #00af00; text-decoration-color: #00af00">150</span>, <span style="color: #00af00; text-decoration-color: #00af00">64</span>) │ <span style="color: #00af00; text-decoration-color: #00af00">41,216</span> │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ global_max_pooling1d │ (<span style="color: #00d7ff; text-decoration-color: #00d7ff">None</span>, <span style="color: #00af00; text-decoration-color: #00af00">64</span>) │ <span style="color: #00af00; text-decoration-color: #00af00">0</span> │ │ (<span style="color: #0087ff; text-decoration-color: #0087ff">GlobalMaxPooling1D</span>) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense (<span style="color: #0087ff; text-decoration-color: #0087ff">Dense</span>) │ (<span style="color: #00d7ff; text-decoration-color: #00d7ff">None</span>, <span style="color: #00af00; text-decoration-color: #00af00">20</span>) │ <span style="color: #00af00; text-decoration-color: #00af00">1,300</span> │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout (<span style="color: #0087ff; text-decoration-color: #0087ff">Dropout</span>) │ (<span style="color: #00d7ff; text-decoration-color: #00d7ff">None</span>, <span style="color: #00af00; text-decoration-color: #00af00">20</span>) │ <span style="color: #00af00; text-decoration-color: #00af00">0</span> │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_1 (<span style="color: #0087ff; text-decoration-color: #0087ff">Dense</span>) │ (<span style="color: #00d7ff; 
text-decoration-color: #00d7ff">None</span>, <span style="color: #00af00; text-decoration-color: #00af00">1</span>) │ <span style="color: #00af00; text-decoration-color: #00af00">21</span> │ └─────────────────────────────────┴────────────────────────┴───────────────┘ </pre> <pre style="white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace"><span style="font-weight: bold"> Total params: </span><span style="color: #00af00; text-decoration-color: #00af00">426,537</span> (1.63 MB) </pre> <pre style="white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace"><span style="font-weight: bold"> Trainable params: </span><span style="color: #00af00; text-decoration-color: #00af00">426,537</span> (1.63 MB) </pre> <pre style="white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace"><span style="font-weight: bold"> Non-trainable params: </span><span style="color: #00af00; text-decoration-color: #00af00">0</span> (0.00 B) </pre> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 1/20 </code></pre></div> </div> <p>156/157 ━━━━━━━━━━━━━━━━━━━[37m━ 0s 73ms/step - binary_accuracy: 0.6412 - false_negatives: 2084.3333 - false_positives: 5252.1924 - loss: 0.6507</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 1: val_loss improved from inf to 0.57198, saving model to FullModelCheckpoint.keras </code></pre></div> </div> <p>157/157 ━━━━━━━━━━━━━━━━━━━━ 15s 79ms/step - binary_accuracy: 0.6411 - false_negatives: 2135.1772 - false_positives: 5292.4053 - loss: 0.6506 - val_binary_accuracy: 0.7356 - val_false_negatives: 898.0000 - val_false_positives: 424.0000 - val_loss: 0.5720</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 2/20 </code></pre></div> </div> <p>156/157 
━━━━━━━━━━━━━━━━━━━[37m━ 0s 72ms/step - binary_accuracy: 0.7448 - false_negatives: 1756.2756 - false_positives: 3249.1411 - loss: 0.5416</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 2: val_loss improved from 0.57198 to 0.41756, saving model to FullModelCheckpoint.keras </code></pre></div> </div> <p>157/157 ━━━━━━━━━━━━━━━━━━━━ 12s 76ms/step - binary_accuracy: 0.7450 - false_negatives: 1783.8925 - false_positives: 3279.8101 - loss: 0.5412 - val_binary_accuracy: 0.8156 - val_false_negatives: 531.0000 - val_false_positives: 391.0000 - val_loss: 0.4176</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 3/20 </code></pre></div> </div> <p>156/157 ━━━━━━━━━━━━━━━━━━━[37m━ 0s 72ms/step - binary_accuracy: 0.8162 - false_negatives: 1539.7693 - false_positives: 2197.1475 - loss: 0.4254</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 3: val_loss improved from 0.41756 to 0.38233, saving model to FullModelCheckpoint.keras </code></pre></div> </div> <p>157/157 ━━━━━━━━━━━━━━━━━━━━ 12s 76ms/step - binary_accuracy: 0.8161 - false_negatives: 1562.6962 - false_positives: 2221.5886 - loss: 0.4254 - val_binary_accuracy: 0.8340 - val_false_negatives: 496.0000 - val_false_positives: 334.0000 - val_loss: 0.3823</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 4/20 </code></pre></div> </div> <p>156/157 ━━━━━━━━━━━━━━━━━━━[37m━ 0s 72ms/step - binary_accuracy: 0.8413 - false_negatives: 1400.6538 - false_positives: 1818.7372 - loss: 0.3837</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 4: val_loss improved from 0.38233 to 0.36235, saving model to FullModelCheckpoint.keras </code></pre></div> </div> <p>157/157 ━━━━━━━━━━━━━━━━━━━━ 12s 76ms/step - binary_accuracy: 0.8412 - false_negatives: 1421.5063 - false_positives: 1839.3102 - loss: 0.3838 - val_binary_accuracy: 0.8396 - 
val_false_negatives: 548.0000 - val_false_positives: 254.0000 - val_loss: 0.3623</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 5/20 </code></pre></div> </div> <p>156/157 ━━━━━━━━━━━━━━━━━━━[37m━ 0s 72ms/step - binary_accuracy: 0.8611 - false_negatives: 1264.5256 - false_positives: 1573.5962 - loss: 0.3468</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 5: val_loss did not improve from 0.36235 </code></pre></div> </div> <p>157/157 ━━━━━━━━━━━━━━━━━━━━ 12s 75ms/step - binary_accuracy: 0.8611 - false_negatives: 1283.0632 - false_positives: 1592.3228 - loss: 0.3468 - val_binary_accuracy: 0.8222 - val_false_negatives: 734.0000 - val_false_positives: 155.0000 - val_loss: 0.4081</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 6/20 </code></pre></div> </div> <p>156/157 ━━━━━━━━━━━━━━━━━━━[37m━ 0s 72ms/step - binary_accuracy: 0.8706 - false_negatives: 1186.9166 - false_positives: 1427.9487 - loss: 0.3301</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 6: val_loss improved from 0.36235 to 0.35041, saving model to FullModelCheckpoint.keras </code></pre></div> </div> <p>157/157 ━━━━━━━━━━━━━━━━━━━━ 12s 76ms/step - binary_accuracy: 0.8705 - false_negatives: 1204.8038 - false_positives: 1444.9368 - loss: 0.3302 - val_binary_accuracy: 0.8412 - val_false_negatives: 569.0000 - val_false_positives: 225.0000 - val_loss: 0.3504</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 7/20 </code></pre></div> </div> <p>156/157 ━━━━━━━━━━━━━━━━━━━[37m━ 0s 72ms/step - binary_accuracy: 0.8768 - false_negatives: 1162.4423 - false_positives: 1342.4807 - loss: 0.3084</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 7: val_loss improved from 0.35041 to 0.32680, saving model to FullModelCheckpoint.keras </code></pre></div> </div> <p>157/157 
━━━━━━━━━━━━━━━━━━━━ 12s 76ms/step - binary_accuracy: 0.8768 - false_negatives: 1179.5253 - false_positives: 1358.4114 - loss: 0.3085 - val_binary_accuracy: 0.8590 - val_false_negatives: 364.0000 - val_false_positives: 341.0000 - val_loss: 0.3268</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 8/20 </code></pre></div> </div> <p>156/157 ━━━━━━━━━━━━━━━━━━━[37m━ 0s 73ms/step - binary_accuracy: 0.8865 - false_negatives: 1079.3206 - false_positives: 1250.2693 - loss: 0.2924</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 8: val_loss did not improve from 0.32680 </code></pre></div> </div> <p>157/157 ━━━━━━━━━━━━━━━━━━━━ 12s 76ms/step - binary_accuracy: 0.8864 - false_negatives: 1094.9873 - false_positives: 1265.0632 - loss: 0.2926 - val_binary_accuracy: 0.8460 - val_false_negatives: 548.0000 - val_false_positives: 222.0000 - val_loss: 0.3432</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 9/20 </code></pre></div> </div> <p>156/157 ━━━━━━━━━━━━━━━━━━━[37m━ 0s 73ms/step - binary_accuracy: 0.8912 - false_negatives: 1019.1987 - false_positives: 1189.4551 - loss: 0.2807</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 9: val_loss did not improve from 0.32680 </code></pre></div> </div> <p>157/157 ━━━━━━━━━━━━━━━━━━━━ 12s 77ms/step - binary_accuracy: 0.8912 - false_negatives: 1033.9684 - false_positives: 1203.5632 - loss: 0.2808 - val_binary_accuracy: 0.8588 - val_false_negatives: 330.0000 - val_false_positives: 376.0000 - val_loss: 0.3302</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 10/20 </code></pre></div> </div> <p>156/157 ━━━━━━━━━━━━━━━━━━━[37m━ 0s 72ms/step - binary_accuracy: 0.8997 - false_negatives: 968.6346 - false_positives: 1109.9103 - loss: 0.2669</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 10: 
val_loss did not improve from 0.32680 </code></pre></div> </div> <p>157/157 ━━━━━━━━━━━━━━━━━━━━ 12s 76ms/step - binary_accuracy: 0.8996 - false_negatives: 983.1202 - false_positives: 1123.3418 - loss: 0.2671 - val_binary_accuracy: 0.8558 - val_false_negatives: 445.0000 - val_false_positives: 276.0000 - val_loss: 0.3413</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 11/20 </code></pre></div> </div> <p>156/157 ━━━━━━━━━━━━━━━━━━━[37m━ 0s 72ms/step - binary_accuracy: 0.9055 - false_negatives: 937.0320 - false_positives: 1000.8589 - loss: 0.2520</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 11: val_loss did not improve from 0.32680 </code></pre></div> </div> <p>157/157 ━━━━━━━━━━━━━━━━━━━━ 12s 76ms/step - binary_accuracy: 0.9055 - false_negatives: 950.3608 - false_positives: 1013.6456 - loss: 0.2521 - val_binary_accuracy: 0.8602 - val_false_negatives: 402.0000 - val_false_positives: 297.0000 - val_loss: 0.3281</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 11: early stopping </code></pre></div> </div> <p><img alt="png" src="/img/examples/nlp/active_learning_review_classification/active_learning_review_classification_15_1755.png" /></p> <p><img alt="png" src="/img/examples/nlp/active_learning_review_classification/active_learning_review_classification_15_1756.png" /></p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>---------------------------------------------------------------------------------------------------- Test set evaluation: {'binary_accuracy': 0.8507999777793884, 'false_negatives': 397.0, 'false_positives': 349.0, 'loss': 0.3372706174850464} ---------------------------------------------------------------------------------------------------- </code></pre></div> </div> <hr /> <h2 id="training-via-active-learning">Training via Active Learning</h2> <p>The general process we follow when 
performing Active Learning is demonstrated below:</p> <p><img alt="Active Learning" src="https://i.imgur.com/dmNKusp.png" /></p> <p>The pipeline can be summarized in five parts:</p> <ol> <li>Sample and annotate a small, balanced training dataset</li> <li>Train the model on this small subset</li> <li>Evaluate the model on a balanced testing set</li> <li>If the model satisfies the business criteria, deploy it in a real-time setting</li> <li>If it doesn't pass the criteria, annotate a few more samples according to the ratio of false positives and negatives, add them to the training set and repeat from step 2 until the model passes the tests or until all available data is exhausted.</li> </ol> <p>For the code below, we will perform sampling using the following formula:<br/></p> <p><img alt="Ratio Sampling" src="https://i.imgur.com/LyZEiZL.png" /></p> <p>Active Learning techniques use callbacks extensively for progress tracking. We will be using model checkpointing and early stopping for this example. The <code>patience</code> parameter for Early Stopping can help minimize overfitting and the time required. We have set <code>patience=4</code> for now, but since the model is robust, we can increase the patience level if desired.</p> <p>Note: We are not loading the checkpoint after the first training iteration. In my experience working on Active Learning techniques, this helps the model probe the newly formed loss landscape. Even if the model fails to improve in the second iteration, we will still gain insight about the possible future false positive and negative rates.
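</p> <p>The ratio-based sampling in step 5 can be sketched in plain Python. This is a minimal sketch rather than the exact formula pictured above: it assumes that a high false-negative count means the model is missing positives, so proportionally more positive samples are drawn (and vice versa), and <code>ratio_sample_counts</code> is a hypothetical helper name, not part of the example code.</p>

```python
# Hypothetical helper: split a fixed sampling budget between the two label
# pools in proportion to the model's error counts. Assumed direction: many
# false negatives -> the model misses positives -> draw more positive samples.
def ratio_sample_counts(false_negatives, false_positives, sampling_size):
    total = false_negatives + false_positives
    n_positives = int(sampling_size * false_negatives / total)
    n_negatives = sampling_size - n_positives
    return n_negatives, n_positives

# Example: the model produced 300 false negatives and 200 false positives,
# and the budget allows annotating 5000 more reviews.
n_neg, n_pos = ratio_sample_counts(300, 200, 5000)
print(n_neg, n_pos)  # 2000 3000
```

<p>The two counts could then be drawn from <code>pool_negatives</code> and <code>pool_positives</code> with <code>Dataset.take()</code> and concatenated onto the training set.</p> <p>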
This will help us sample a better set in the next iteration where the model will have a greater chance to improve.</p> <div class="codehilite"><pre><span></span><code><span class="k">def</span> <span class="nf">train_active_learning_models</span><span class="p">(</span> <span class="n">train_dataset</span><span class="p">,</span> <span class="n">pool_negatives</span><span class="p">,</span> <span class="n">pool_positives</span><span class="p">,</span> <span class="n">val_dataset</span><span class="p">,</span> <span class="n">test_dataset</span><span class="p">,</span> <span class="n">num_iterations</span><span class="o">=</span><span class="mi">3</span><span class="p">,</span> <span class="n">sampling_size</span><span class="o">=</span><span class="mi">5000</span><span class="p">,</span> <span class="p">):</span> <span class="c1"># Creating lists for storing metrics</span> <span class="n">losses</span><span class="p">,</span> <span class="n">val_losses</span><span class="p">,</span> <span class="n">accuracies</span><span class="p">,</span> <span class="n">val_accuracies</span> <span class="o">=</span> <span class="p">[],</span> <span class="p">[],</span> <span class="p">[],</span> <span class="p">[]</span> <span class="n">model</span> <span class="o">=</span> <span class="n">create_model</span><span class="p">()</span> <span class="c1"># We will monitor the false positives and false negatives predicted by our model</span> <span class="c1"># These will decide the subsequent sampling ratio for every Active Learning loop</span> <span class="n">model</span><span class="o">.</span><span class="n">compile</span><span class="p">(</span> <span class="n">loss</span><span class="o">=</span><span class="s2">"binary_crossentropy"</span><span class="p">,</span> <span class="n">optimizer</span><span class="o">=</span><span class="s2">"rmsprop"</span><span class="p">,</span> <span class="n">metrics</span><span class="o">=</span><span class="p">[</span> <span 
class="n">keras</span><span class="o">.</span><span class="n">metrics</span><span class="o">.</span><span class="n">BinaryAccuracy</span><span class="p">(),</span> <span class="n">keras</span><span class="o">.</span><span class="n">metrics</span><span class="o">.</span><span class="n">FalseNegatives</span><span class="p">(),</span> <span class="n">keras</span><span class="o">.</span><span class="n">metrics</span><span class="o">.</span><span class="n">FalsePositives</span><span class="p">(),</span> <span class="p">],</span> <span class="p">)</span> <span class="c1"># Defining checkpoints.</span> <span class="c1"># The checkpoint callback is reused throughout the training since it only saves the best overall model.</span> <span class="n">checkpoint</span> <span class="o">=</span> <span class="n">keras</span><span class="o">.</span><span class="n">callbacks</span><span class="o">.</span><span class="n">ModelCheckpoint</span><span class="p">(</span> <span class="s2">"AL_Model.keras"</span><span class="p">,</span> <span class="n">save_best_only</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">verbose</span><span class="o">=</span><span class="mi">1</span> <span class="p">)</span> <span class="c1"># Here, patience is set to 4. 
This can be set higher if desired.</span> <span class="n">early_stopping</span> <span class="o">=</span> <span class="n">keras</span><span class="o">.</span><span class="n">callbacks</span><span class="o">.</span><span class="n">EarlyStopping</span><span class="p">(</span><span class="n">patience</span><span class="o">=</span><span class="mi">4</span><span class="p">,</span> <span class="n">verbose</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Starting to train with </span><span class="si">{</span><span class="nb">len</span><span class="p">(</span><span class="n">train_dataset</span><span class="p">)</span><span class="si">}</span><span class="s2"> samples"</span><span class="p">)</span> <span class="c1"># Initial fit with a small subset of the training set</span> <span class="n">history</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span> <span class="n">train_dataset</span><span class="o">.</span><span class="n">cache</span><span class="p">()</span><span class="o">.</span><span class="n">shuffle</span><span class="p">(</span><span class="mi">20000</span><span class="p">)</span><span class="o">.</span><span class="n">batch</span><span class="p">(</span><span class="mi">256</span><span class="p">),</span> <span class="n">epochs</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span> <span class="n">validation_data</span><span class="o">=</span><span class="n">val_dataset</span><span class="p">,</span> <span class="n">callbacks</span><span class="o">=</span><span class="p">[</span><span class="n">checkpoint</span><span class="p">,</span> <span class="n">early_stopping</span><span class="p">],</span> <span class="p">)</span> <span class="c1"># Appending history</span> <span class="n">losses</span><span class="p">,</span> <span 
class="n">val_losses</span><span class="p">,</span> <span class="n">accuracies</span><span class="p">,</span> <span class="n">val_accuracies</span> <span class="o">=</span> <span class="n">append_history</span><span class="p">(</span> <span class="n">losses</span><span class="p">,</span> <span class="n">val_losses</span><span class="p">,</span> <span class="n">accuracies</span><span class="p">,</span> <span class="n">val_accuracies</span><span class="p">,</span> <span class="n">history</span> <span class="p">)</span> <span class="k">for</span> <span class="n">iteration</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">num_iterations</span><span class="p">):</span> <span class="c1"># Getting predictions from the previously trained model</span> <span class="n">predictions</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">test_dataset</span><span class="p">)</span> <span class="c1"># Generating labels from the output probabilities</span> <span class="n">rounded</span> <span class="o">=</span> <span class="n">ops</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="n">ops</span><span class="o">.</span><span class="n">greater</span><span class="p">(</span><span class="n">predictions</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">),</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="c1"># Evaluating the number of zeros and ones incorrectly classified</span> <span class="n">_</span><span class="p">,</span> <span class="n">_</span><span class="p">,</span> <span class="n">false_negatives</span><span class="p">,</span> <span class="n">false_positives</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">evaluate</span><span
class="p">(</span><span class="n">test_dataset</span><span class="p">,</span> <span class="n">verbose</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="s2">"-"</span> <span class="o">*</span> <span class="mi">100</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span> <span class="sa">f</span><span class="s2">"Number of zeros incorrectly classified: </span><span class="si">{</span><span class="n">false_negatives</span><span class="si">}</span><span class="s2">, Number of ones incorrectly classified: </span><span class="si">{</span><span class="n">false_positives</span><span class="si">}</span><span class="s2">"</span> <span class="p">)</span> <span class="c1"># This technique of Active Learning demonstrates ratio-based sampling, where</span> <span class="c1"># Number of ones/zeros to sample = Number of ones/zeros incorrectly classified / Total incorrectly classified</span> <span class="k">if</span> <span class="n">false_negatives</span> <span class="o">!=</span> <span class="mi">0</span> <span class="ow">and</span> <span class="n">false_positives</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">:</span> <span class="n">total</span> <span class="o">=</span> <span class="n">false_negatives</span> <span class="o">+</span> <span class="n">false_positives</span> <span class="n">sample_ratio_ones</span><span class="p">,</span> <span class="n">sample_ratio_zeros</span> <span class="o">=</span> <span class="p">(</span> <span class="n">false_positives</span> <span class="o">/</span> <span class="n">total</span><span class="p">,</span> <span class="n">false_negatives</span> <span class="o">/</span> <span class="n">total</span><span class="p">,</span> <span class="p">)</span> <span class="c1"># In the case where all samples are correctly predicted, we can sample both classes equally</span> <span class="k">else</span><span
class="p">:</span> <span class="n">sample_ratio_ones</span><span class="p">,</span> <span class="n">sample_ratio_zeros</span> <span class="o">=</span> <span class="mf">0.5</span><span class="p">,</span> <span class="mf">0.5</span> <span class="nb">print</span><span class="p">(</span> <span class="sa">f</span><span class="s2">"Sample ratio for positives: </span><span class="si">{</span><span class="n">sample_ratio_ones</span><span class="si">}</span><span class="s2">, Sample ratio for negatives: </span><span class="si">{</span><span class="n">sample_ratio_zeros</span><span class="si">}</span><span class="s2">"</span> <span class="p">)</span> <span class="c1"># Sample the required number of ones and zeros</span> <span class="n">sampled_dataset</span> <span class="o">=</span> <span class="n">pool_negatives</span><span class="o">.</span><span class="n">take</span><span class="p">(</span> <span class="nb">int</span><span class="p">(</span><span class="n">sample_ratio_zeros</span> <span class="o">*</span> <span class="n">sampling_size</span><span class="p">)</span> <span class="p">)</span><span class="o">.</span><span class="n">concatenate</span><span class="p">(</span><span class="n">pool_positives</span><span class="o">.</span><span class="n">take</span><span class="p">(</span><span class="nb">int</span><span class="p">(</span><span class="n">sample_ratio_ones</span> <span class="o">*</span> <span class="n">sampling_size</span><span class="p">)))</span> <span class="c1"># Skip the sampled data points to avoid sampling them again</span> <span class="n">pool_negatives</span> <span class="o">=</span> <span class="n">pool_negatives</span><span class="o">.</span><span class="n">skip</span><span class="p">(</span><span class="nb">int</span><span class="p">(</span><span class="n">sample_ratio_zeros</span> <span class="o">*</span> <span class="n">sampling_size</span><span class="p">))</span> <span class="n">pool_positives</span> <span class="o">=</span> <span
class="n">pool_positives</span><span class="o">.</span><span class="n">skip</span><span class="p">(</span><span class="nb">int</span><span class="p">(</span><span class="n">sample_ratio_ones</span> <span class="o">*</span> <span class="n">sampling_size</span><span class="p">))</span> <span class="c1"># Concatenating the train_dataset with the sampled_dataset</span> <span class="n">train_dataset</span> <span class="o">=</span> <span class="n">train_dataset</span><span class="o">.</span><span class="n">concatenate</span><span class="p">(</span><span class="n">sampled_dataset</span><span class="p">)</span><span class="o">.</span><span class="n">prefetch</span><span class="p">(</span> <span class="n">tf</span><span class="o">.</span><span class="n">data</span><span class="o">.</span><span class="n">AUTOTUNE</span> <span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Starting training with </span><span class="si">{</span><span class="nb">len</span><span class="p">(</span><span class="n">train_dataset</span><span class="p">)</span><span class="si">}</span><span class="s2"> samples"</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="s2">"-"</span> <span class="o">*</span> <span class="mi">100</span><span class="p">)</span> <span class="c1"># We recompile the model to reset the optimizer states and retrain the model</span> <span class="n">model</span><span class="o">.</span><span class="n">compile</span><span class="p">(</span> <span class="n">loss</span><span class="o">=</span><span class="s2">"binary_crossentropy"</span><span class="p">,</span> <span class="n">optimizer</span><span class="o">=</span><span class="s2">"rmsprop"</span><span class="p">,</span> <span class="n">metrics</span><span class="o">=</span><span class="p">[</span> <span class="n">keras</span><span class="o">.</span><span class="n">metrics</span><span class="o">.</span><span 
class="n">BinaryAccuracy</span><span class="p">(),</span> <span class="n">keras</span><span class="o">.</span><span class="n">metrics</span><span class="o">.</span><span class="n">FalseNegatives</span><span class="p">(),</span> <span class="n">keras</span><span class="o">.</span><span class="n">metrics</span><span class="o">.</span><span class="n">FalsePositives</span><span class="p">(),</span> <span class="p">],</span> <span class="p">)</span> <span class="n">history</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span> <span class="n">train_dataset</span><span class="o">.</span><span class="n">cache</span><span class="p">()</span><span class="o">.</span><span class="n">shuffle</span><span class="p">(</span><span class="mi">20000</span><span class="p">)</span><span class="o">.</span><span class="n">batch</span><span class="p">(</span><span class="mi">256</span><span class="p">),</span> <span class="n">validation_data</span><span class="o">=</span><span class="n">val_dataset</span><span class="p">,</span> <span class="n">epochs</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span> <span class="n">callbacks</span><span class="o">=</span><span class="p">[</span> <span class="n">checkpoint</span><span class="p">,</span> <span class="n">keras</span><span class="o">.</span><span class="n">callbacks</span><span class="o">.</span><span class="n">EarlyStopping</span><span class="p">(</span><span class="n">patience</span><span class="o">=</span><span class="mi">4</span><span class="p">,</span> <span class="n">verbose</span><span class="o">=</span><span class="mi">1</span><span class="p">),</span> <span class="p">],</span> <span class="p">)</span> <span class="c1"># Appending the history</span> <span class="n">losses</span><span class="p">,</span> <span class="n">val_losses</span><span class="p">,</span> <span class="n">accuracies</span><span class="p">,</span> <span 
class="n">val_accuracies</span> <span class="o">=</span> <span class="n">append_history</span><span class="p">(</span> <span class="n">losses</span><span class="p">,</span> <span class="n">val_losses</span><span class="p">,</span> <span class="n">accuracies</span><span class="p">,</span> <span class="n">val_accuracies</span><span class="p">,</span> <span class="n">history</span> <span class="p">)</span> <span class="c1"># Loading the best model from this training loop</span> <span class="n">model</span> <span class="o">=</span> <span class="n">keras</span><span class="o">.</span><span class="n">models</span><span class="o">.</span><span class="n">load_model</span><span class="p">(</span><span class="s2">"AL_Model.keras"</span><span class="p">)</span> <span class="c1"># Plotting the overall history and evaluating the final model</span> <span class="n">plot_history</span><span class="p">(</span><span class="n">losses</span><span class="p">,</span> <span class="n">val_losses</span><span class="p">,</span> <span class="n">accuracies</span><span class="p">,</span> <span class="n">val_accuracies</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="s2">"-"</span> <span class="o">*</span> <span class="mi">100</span><span class="p">)</span> <span class="nb">print</span><span class="p">(</span> <span class="s2">"Test set evaluation: "</span><span class="p">,</span> <span class="n">model</span><span class="o">.</span><span class="n">evaluate</span><span class="p">(</span><span class="n">test_dataset</span><span class="p">,</span> <span class="n">verbose</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">return_dict</span><span class="o">=</span><span class="kc">True</span><span class="p">),</span> <span class="p">)</span> <span class="nb">print</span><span class="p">(</span><span class="s2">"-"</span> <span class="o">*</span> <span class="mi">100</span><span class="p">)</span> <span 
class="k">return</span> <span class="n">model</span> <span class="n">active_learning_model</span> <span class="o">=</span> <span class="n">train_active_learning_models</span><span class="p">(</span> <span class="n">train_dataset</span><span class="p">,</span> <span class="n">pool_negatives</span><span class="p">,</span> <span class="n">pool_positives</span><span class="p">,</span> <span class="n">val_dataset</span><span class="p">,</span> <span class="n">test_dataset</span> <span class="p">)</span> </code></pre></div> <pre style="white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace"><span style="font-weight: bold">Model: "sequential_1"</span> </pre> <pre style="white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace">┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ ┃<span style="font-weight: bold"> Layer (type) </span>┃<span style="font-weight: bold"> Output Shape </span>┃<span style="font-weight: bold"> Param # </span>┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │ embedding_1 (<span style="color: #0087ff; text-decoration-color: #0087ff">Embedding</span>) │ (<span style="color: #00d7ff; text-decoration-color: #00d7ff">None</span>, <span style="color: #00af00; text-decoration-color: #00af00">150</span>, <span style="color: #00af00; text-decoration-color: #00af00">128</span>) │ <span style="color: #00af00; text-decoration-color: #00af00">384,000</span> │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ bidirectional_1 (<span style="color: #0087ff; text-decoration-color: #0087ff">Bidirectional</span>) │ (<span style="color: #00d7ff; text-decoration-color: #00d7ff">None</span>, <span style="color: #00af00; text-decoration-color: #00af00">150</span>, <span style="color: #00af00; text-decoration-color: #00af00">64</span>) │ <span style="color: 
#00af00; text-decoration-color: #00af00">41,216</span> │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ global_max_pooling1d_1 │ (<span style="color: #00d7ff; text-decoration-color: #00d7ff">None</span>, <span style="color: #00af00; text-decoration-color: #00af00">64</span>) │ <span style="color: #00af00; text-decoration-color: #00af00">0</span> │ │ (<span style="color: #0087ff; text-decoration-color: #0087ff">GlobalMaxPooling1D</span>) │ │ │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_2 (<span style="color: #0087ff; text-decoration-color: #0087ff">Dense</span>) │ (<span style="color: #00d7ff; text-decoration-color: #00d7ff">None</span>, <span style="color: #00af00; text-decoration-color: #00af00">20</span>) │ <span style="color: #00af00; text-decoration-color: #00af00">1,300</span> │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dropout_1 (<span style="color: #0087ff; text-decoration-color: #0087ff">Dropout</span>) │ (<span style="color: #00d7ff; text-decoration-color: #00d7ff">None</span>, <span style="color: #00af00; text-decoration-color: #00af00">20</span>) │ <span style="color: #00af00; text-decoration-color: #00af00">0</span> │ ├─────────────────────────────────┼────────────────────────┼───────────────┤ │ dense_3 (<span style="color: #0087ff; text-decoration-color: #0087ff">Dense</span>) │ (<span style="color: #00d7ff; text-decoration-color: #00d7ff">None</span>, <span style="color: #00af00; text-decoration-color: #00af00">1</span>) │ <span style="color: #00af00; text-decoration-color: #00af00">21</span> │ └─────────────────────────────────┴────────────────────────┴───────────────┘ </pre> <pre style="white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace"><span style="font-weight: bold"> Total params: </span><span style="color: #00af00; text-decoration-color: #00af00">426,537</span> (1.63 
MB) </pre> <pre style="white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace"><span style="font-weight: bold"> Trainable params: </span><span style="color: #00af00; text-decoration-color: #00af00">426,537</span> (1.63 MB) </pre> <pre style="white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace"><span style="font-weight: bold"> Non-trainable params: </span><span style="color: #00af00; text-decoration-color: #00af00">0</span> (0.00 B) </pre> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Starting to train with 15000 samples Epoch 1/20 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.5197 - false_negatives_1: 1686.7457 - false_positives_1: 1938.3051 - loss: 0.6918</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 1: val_loss improved from inf to 0.67428, saving model to AL_Model.keras </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 8s 89ms/step - binary_accuracy: 0.5202 - false_negatives_1: 1716.9833 - false_positives_1: 1961.4667 - loss: 0.6917 - val_binary_accuracy: 0.6464 - val_false_negatives_1: 279.0000 - val_false_positives_1: 1489.0000 - val_loss: 0.6743</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 2/20 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.6505 - false_negatives_1: 1216.0170 - false_positives_1: 1434.2373 - loss: 0.6561</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 2: val_loss improved from 0.67428 to 0.59133, saving model to AL_Model.keras </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 5s 83ms/step - binary_accuracy: 0.6507 - false_negatives_1: 1234.9833 - false_positives_1: 1455.7667 - loss: 0.6558 - val_binary_accuracy: 0.7032 - 
val_false_negatives_1: 235.0000 - val_false_positives_1: 1249.0000 - val_loss: 0.5913</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 3/20 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.7103 - false_negatives_1: 939.5255 - false_positives_1: 1235.8983 - loss: 0.5829</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 3: val_loss improved from 0.59133 to 0.51602, saving model to AL_Model.keras </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 5s 83ms/step - binary_accuracy: 0.7106 - false_negatives_1: 953.0500 - false_positives_1: 1255.3167 - loss: 0.5827 - val_binary_accuracy: 0.7686 - val_false_negatives_1: 812.0000 - val_false_positives_1: 345.0000 - val_loss: 0.5160</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 4/20 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.7545 - false_negatives_1: 787.4237 - false_positives_1: 1070.0339 - loss: 0.5214</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 4: val_loss improved from 0.51602 to 0.43948, saving model to AL_Model.keras </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 5s 83ms/step - binary_accuracy: 0.7547 - false_negatives_1: 799.2667 - false_positives_1: 1085.8833 - loss: 0.5212 - val_binary_accuracy: 0.8028 - val_false_negatives_1: 342.0000 - val_false_positives_1: 644.0000 - val_loss: 0.4395</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 5/20 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.7919 - false_negatives_1: 676.7458 - false_positives_1: 907.4915 - loss: 0.4657</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 5: val_loss improved from 0.43948 to 0.41679, saving model to AL_Model.keras </code></pre></div> 
</div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 5s 83ms/step - binary_accuracy: 0.7920 - false_negatives_1: 687.3834 - false_positives_1: 921.1667 - loss: 0.4655 - val_binary_accuracy: 0.8158 - val_false_negatives_1: 598.0000 - val_false_positives_1: 323.0000 - val_loss: 0.4168</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 6/20 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.7994 - false_negatives_1: 661.3560 - false_positives_1: 828.0847 - loss: 0.4498</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 6: val_loss improved from 0.41679 to 0.39680, saving model to AL_Model.keras </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 5s 83ms/step - binary_accuracy: 0.7997 - false_negatives_1: 671.3666 - false_positives_1: 840.2500 - loss: 0.4495 - val_binary_accuracy: 0.8260 - val_false_negatives_1: 382.0000 - val_false_positives_1: 488.0000 - val_loss: 0.3968</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 7/20 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.8311 - false_negatives_1: 589.1187 - false_positives_1: 707.0170 - loss: 0.4017</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 7: val_loss did not improve from 0.39680 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 5s 83ms/step - binary_accuracy: 0.8312 - false_negatives_1: 598.3500 - false_positives_1: 717.8167 - loss: 0.4016 - val_binary_accuracy: 0.7706 - val_false_negatives_1: 1004.0000 - val_false_positives_1: 143.0000 - val_loss: 0.4884</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 8/20 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.8365 - false_negatives_1: 566.7288 - false_positives_1: 649.9322 - loss: 0.3896</p> <div class="k-default-codeblock"> <div 
class="codehilite"><pre><span></span><code>Epoch 8: val_loss did not improve from 0.39680 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 5s 83ms/step - binary_accuracy: 0.8366 - false_negatives_1: 575.2833 - false_positives_1: 660.2167 - loss: 0.3895 - val_binary_accuracy: 0.8216 - val_false_negatives_1: 623.0000 - val_false_positives_1: 269.0000 - val_loss: 0.4043</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 9/20 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.8531 - false_negatives_1: 519.0170 - false_positives_1: 591.6440 - loss: 0.3631</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 9: val_loss improved from 0.39680 to 0.37727, saving model to AL_Model.keras </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 5s 83ms/step - binary_accuracy: 0.8531 - false_negatives_1: 527.2667 - false_positives_1: 601.2500 - loss: 0.3631 - val_binary_accuracy: 0.8348 - val_false_negatives_1: 296.0000 - val_false_positives_1: 530.0000 - val_loss: 0.3773</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 10/20 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.8686 - false_negatives_1: 475.7966 - false_positives_1: 569.0508 - loss: 0.3387</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 10: val_loss improved from 0.37727 to 0.37354, saving model to AL_Model.keras </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 5s 83ms/step - binary_accuracy: 0.8685 - false_negatives_1: 483.5000 - false_positives_1: 577.9667 - loss: 0.3387 - val_binary_accuracy: 0.8400 - val_false_negatives_1: 327.0000 - val_false_positives_1: 473.0000 - val_loss: 0.3735</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 11/20 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - 
binary_accuracy: 0.8716 - false_negatives_1: 452.1356 - false_positives_1: 522.1187 - loss: 0.3303</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 11: val_loss improved from 0.37354 to 0.37074, saving model to AL_Model.keras </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 5s 83ms/step - binary_accuracy: 0.8716 - false_negatives_1: 459.3833 - false_positives_1: 530.6667 - loss: 0.3303 - val_binary_accuracy: 0.8390 - val_false_negatives_1: 362.0000 - val_false_positives_1: 443.0000 - val_loss: 0.3707</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 12/20 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.8833 - false_negatives_1: 433.0678 - false_positives_1: 481.1864 - loss: 0.3065</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 12: val_loss did not improve from 0.37074 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 5s 83ms/step - binary_accuracy: 0.8833 - false_negatives_1: 439.8333 - false_positives_1: 488.9667 - loss: 0.3066 - val_binary_accuracy: 0.8236 - val_false_negatives_1: 208.0000 - val_false_positives_1: 674.0000 - val_loss: 0.4046</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 13/20 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.8876 - false_negatives_1: 384.8305 - false_positives_1: 476.5254 - loss: 0.2978</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 13: val_loss did not improve from 0.37074 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 5s 82ms/step - binary_accuracy: 0.8876 - false_negatives_1: 391.2667 - false_positives_1: 484.2500 - loss: 0.2978 - val_binary_accuracy: 0.8380 - val_false_negatives_1: 364.0000 - val_false_positives_1: 446.0000 - val_loss: 0.3783</p> <div class="k-default-codeblock"> <div 
class="codehilite"><pre><span></span><code>Epoch 14/20 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.8976 - false_negatives_1: 378.0169 - false_positives_1: 433.9831 - loss: 0.2754</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 14: val_loss did not improve from 0.37074 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 5s 83ms/step - binary_accuracy: 0.8975 - false_negatives_1: 384.2333 - false_positives_1: 441.3833 - loss: 0.2757 - val_binary_accuracy: 0.8310 - val_false_negatives_1: 525.0000 - val_false_positives_1: 320.0000 - val_loss: 0.3957</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 15/20 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.9013 - false_negatives_1: 354.9322 - false_positives_1: 403.1695 - loss: 0.2709</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 15: val_loss did not improve from 0.37074 </code></pre></div> </div> <p>59/59 ━━━━━━━━━━━━━━━━━━━━ 5s 83ms/step - binary_accuracy: 0.9013 - false_negatives_1: 360.4000 - false_positives_1: 409.5833 - loss: 0.2709 - val_binary_accuracy: 0.8298 - val_false_negatives_1: 302.0000 - val_false_positives_1: 549.0000 - val_loss: 0.4015</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 15: early stopping </code></pre></div> </div> <p>20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 39ms/step</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>---------------------------------------------------------------------------------------------------- Number of zeros incorrectly classified: 290.0, Number of ones incorrectly classified: 538.0 Sample ratio for positives: 0.6497584541062802, Sample ratio for negatives:0.3502415458937198 Starting training with 19999 samples 
---------------------------------------------------------------------------------------------------- Epoch 1/20 </code></pre></div> </div> <p>78/79 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.8735 - false_negatives_2: 547.2436 - false_positives_2: 650.2436 - loss: 0.3527</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 1: val_loss did not improve from 0.37074 </code></pre></div> </div> <p>79/79 ━━━━━━━━━━━━━━━━━━━━ 9s 84ms/step - binary_accuracy: 0.8738 - false_negatives_2: 559.2125 - false_positives_2: 665.3375 - loss: 0.3518 - val_binary_accuracy: 0.7932 - val_false_negatives_2: 119.0000 - val_false_positives_2: 915.0000 - val_loss: 0.4949</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 2/20 </code></pre></div> </div> <p>78/79 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.8961 - false_negatives_2: 470.2436 - false_positives_2: 576.1539 - loss: 0.2824</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 2: val_loss did not improve from 0.37074 </code></pre></div> </div> <p>79/79 ━━━━━━━━━━━━━━━━━━━━ 6s 80ms/step - binary_accuracy: 0.8962 - false_negatives_2: 481.4125 - false_positives_2: 589.6750 - loss: 0.2823 - val_binary_accuracy: 0.8014 - val_false_negatives_2: 809.0000 - val_false_positives_2: 184.0000 - val_loss: 0.4580</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 3/20 </code></pre></div> </div> <p>78/79 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.9059 - false_negatives_2: 442.2051 - false_positives_2: 500.5385 - loss: 0.2628</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 3: val_loss did not improve from 0.37074 </code></pre></div> </div> <p>79/79 ━━━━━━━━━━━━━━━━━━━━ 6s 80ms/step - binary_accuracy: 0.9059 - false_negatives_2: 452.6750 - false_positives_2: 513.5250 - loss: 0.2629 - val_binary_accuracy: 
0.8294 - val_false_negatives_2: 302.0000 - val_false_positives_2: 551.0000 - val_loss: 0.3868</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 4/20 </code></pre></div> </div> <p>78/79 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.9188 - false_negatives_2: 394.5513 - false_positives_2: 462.4359 - loss: 0.2391</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 4: val_loss did not improve from 0.37074 </code></pre></div> </div> <p>79/79 ━━━━━━━━━━━━━━━━━━━━ 6s 80ms/step - binary_accuracy: 0.9187 - false_negatives_2: 405.0625 - false_positives_2: 474.1250 - loss: 0.2393 - val_binary_accuracy: 0.8268 - val_false_negatives_2: 225.0000 - val_false_positives_2: 641.0000 - val_loss: 0.4197</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 5/20 </code></pre></div> </div> <p>78/79 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.9255 - false_negatives_2: 349.8718 - false_positives_2: 413.0898 - loss: 0.2270</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 5: val_loss did not improve from 0.37074 </code></pre></div> </div> <p>79/79 ━━━━━━━━━━━━━━━━━━━━ 6s 79ms/step - binary_accuracy: 0.9254 - false_negatives_2: 358.6500 - false_positives_2: 423.5625 - loss: 0.2270 - val_binary_accuracy: 0.8228 - val_false_negatives_2: 611.0000 - val_false_positives_2: 275.0000 - val_loss: 0.4233</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 6/20 </code></pre></div> </div> <p>78/79 ━━━━━━━━━━━━━━━━━━━━ 0s 73ms/step - binary_accuracy: 0.9265 - false_negatives_2: 349.8590 - false_positives_2: 389.9359 - loss: 0.2147</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 6: val_loss did not improve from 0.37074 </code></pre></div> </div> <p>79/79 ━━━━━━━━━━━━━━━━━━━━ 6s 80ms/step - binary_accuracy: 0.9265 - 
false_negatives_2: 358.8375 - false_positives_2: 399.9875 - loss: 0.2148 - val_binary_accuracy: 0.8272 - val_false_negatives_2: 581.0000 - val_false_positives_2: 283.0000 - val_loss: 0.4415</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 7/20 </code></pre></div> </div> <p>78/79 ━━━━━━━━━━━━━━━━━━━[37m━ 0s 72ms/step - binary_accuracy: 0.9409 - false_negatives_2: 286.7820 - false_positives_2: 322.7949 - loss: 0.1877</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 7: val_loss did not improve from 0.37074 </code></pre></div> </div> <p>79/79 ━━━━━━━━━━━━━━━━━━━━ 6s 79ms/step - binary_accuracy: 0.9408 - false_negatives_2: 294.4375 - false_positives_2: 331.4000 - loss: 0.1880 - val_binary_accuracy: 0.8266 - val_false_negatives_2: 528.0000 - val_false_positives_2: 339.0000 - val_loss: 0.4419</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 7: early stopping </code></pre></div> </div> <p>20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 39ms/step</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>---------------------------------------------------------------------------------------------------- Number of zeros incorrectly classified: 376.0, Number of ones incorrectly classified: 442.0 Sample ratio for positives: 0.5403422982885085, Sample ratio for negatives:0.45965770171149145 Starting training with 24998 samples ---------------------------------------------------------------------------------------------------- Epoch 1/20 </code></pre></div> </div> <p>98/98 ━━━━━━━━━━━━━━━━━━━━ 0s 73ms/step - binary_accuracy: 0.8509 - false_negatives_3: 809.9184 - false_positives_3: 1018.9286 - loss: 0.3732</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 1: val_loss improved from 0.37074 to 0.36196, saving model to AL_Model.keras </code></pre></div> </div> <p>98/98 ━━━━━━━━━━━━━━━━━━━━ 11s 83ms/step - 
binary_accuracy: 0.8509 - false_negatives_3: 817.5757 - false_positives_3: 1028.7980 - loss: 0.3731 - val_binary_accuracy: 0.8424 - val_false_negatives_3: 368.0000 - val_false_positives_3: 420.0000 - val_loss: 0.3620</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 2/20 </code></pre></div> </div> <p>98/98 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.8744 - false_negatives_3: 734.7449 - false_positives_3: 884.7755 - loss: 0.3185</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 2: val_loss did not improve from 0.36196 </code></pre></div> </div> <p>98/98 ━━━━━━━━━━━━━━━━━━━━ 8s 79ms/step - binary_accuracy: 0.8744 - false_negatives_3: 741.9697 - false_positives_3: 893.7172 - loss: 0.3186 - val_binary_accuracy: 0.8316 - val_false_negatives_3: 202.0000 - val_false_positives_3: 640.0000 - val_loss: 0.3792</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 3/20 </code></pre></div> </div> <p>98/98 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.8830 - false_negatives_3: 684.1326 - false_positives_3: 807.8878 - loss: 0.3090</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 3: val_loss did not improve from 0.36196 </code></pre></div> </div> <p>98/98 ━━━━━━━━━━━━━━━━━━━━ 8s 79ms/step - binary_accuracy: 0.8830 - false_negatives_3: 691.0707 - false_positives_3: 816.2222 - loss: 0.3090 - val_binary_accuracy: 0.8118 - val_false_negatives_3: 738.0000 - val_false_positives_3: 203.0000 - val_loss: 0.4112</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 4/20 </code></pre></div> </div> <p>98/98 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.8892 - false_negatives_3: 651.9898 - false_positives_3: 776.4388 - loss: 0.2928</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 4: val_loss did not improve from 0.36196 
</code></pre></div> </div> <p>98/98 ━━━━━━━━━━━━━━━━━━━━ 8s 79ms/step - binary_accuracy: 0.8892 - false_negatives_3: 658.4041 - false_positives_3: 784.3839 - loss: 0.2928 - val_binary_accuracy: 0.8344 - val_false_negatives_3: 557.0000 - val_false_positives_3: 271.0000 - val_loss: 0.3734</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 5/20 </code></pre></div> </div> <p>98/98 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - binary_accuracy: 0.8975 - false_negatives_3: 612.0714 - false_positives_3: 688.9184 - loss: 0.2806</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 5: val_loss did not improve from 0.36196 </code></pre></div> </div> <p>98/98 ━━━━━━━━━━━━━━━━━━━━ 8s 79ms/step - binary_accuracy: 0.8974 - false_negatives_3: 618.4343 - false_positives_3: 696.1313 - loss: 0.2807 - val_binary_accuracy: 0.8456 - val_false_negatives_3: 446.0000 - val_false_positives_3: 326.0000 - val_loss: 0.3658</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 5: early stopping </code></pre></div> </div> <p>20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 40ms/step</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>---------------------------------------------------------------------------------------------------- Number of zeros incorrectly classified: 407.0, Number of ones incorrectly classified: 410.0 Sample ratio for positives: 0.5018359853121175, Sample ratio for negatives:0.4981640146878825 Starting training with 29997 samples ---------------------------------------------------------------------------------------------------- Epoch 1/20 </code></pre></div> </div> <p>117/118 ━━━━━━━━━━━━━━━━━━━[37m━ 0s 76ms/step - binary_accuracy: 0.8621 - false_negatives_4: 916.2393 - false_positives_4: 1130.9744 - loss: 0.3527</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 1: val_loss did not improve from 0.36196 
</code></pre></div> </div> <p>118/118 ━━━━━━━━━━━━━━━━━━━━ 13s 85ms/step - binary_accuracy: 0.8621 - false_negatives_4: 931.0924 - false_positives_4: 1149.7479 - loss: 0.3525 - val_binary_accuracy: 0.8266 - val_false_negatives_4: 627.0000 - val_false_positives_4: 240.0000 - val_loss: 0.3802</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 2/20 </code></pre></div> </div> <p>117/118 ━━━━━━━━━━━━━━━━━━━[37m━ 0s 76ms/step - binary_accuracy: 0.8761 - false_negatives_4: 876.4872 - false_positives_4: 1005.5726 - loss: 0.3195</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 2: val_loss improved from 0.36196 to 0.35707, saving model to AL_Model.keras </code></pre></div> </div> <p>118/118 ━━━━━━━━━━━━━━━━━━━━ 10s 82ms/step - binary_accuracy: 0.8760 - false_negatives_4: 891.0504 - false_positives_4: 1022.9412 - loss: 0.3196 - val_binary_accuracy: 0.8404 - val_false_negatives_4: 479.0000 - val_false_positives_4: 319.0000 - val_loss: 0.3571</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 3/20 </code></pre></div> </div> <p>117/118 ━━━━━━━━━━━━━━━━━━━[37m━ 0s 74ms/step - binary_accuracy: 0.8874 - false_negatives_4: 801.1710 - false_positives_4: 941.4786 - loss: 0.2965</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 3: val_loss did not improve from 0.35707 </code></pre></div> </div> <p>118/118 ━━━━━━━━━━━━━━━━━━━━ 9s 79ms/step - binary_accuracy: 0.8873 - false_negatives_4: 814.8319 - false_positives_4: 957.8571 - loss: 0.2966 - val_binary_accuracy: 0.8226 - val_false_negatives_4: 677.0000 - val_false_positives_4: 210.0000 - val_loss: 0.3948</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 4/20 </code></pre></div> </div> <p>117/118 ━━━━━━━━━━━━━━━━━━━[37m━ 0s 76ms/step - binary_accuracy: 0.8977 - false_negatives_4: 740.5385 - false_positives_4: 837.1710 - loss: 
0.2768</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 4: val_loss did not improve from 0.35707 </code></pre></div> </div> <p>118/118 ━━━━━━━━━━━━━━━━━━━━ 10s 81ms/step - binary_accuracy: 0.8976 - false_negatives_4: 753.5378 - false_positives_4: 852.2437 - loss: 0.2770 - val_binary_accuracy: 0.8406 - val_false_negatives_4: 530.0000 - val_false_positives_4: 267.0000 - val_loss: 0.3630</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 5/20 </code></pre></div> </div> <p>117/118 ━━━━━━━━━━━━━━━━━━━[37m━ 0s 76ms/step - binary_accuracy: 0.9020 - false_negatives_4: 722.5214 - false_positives_4: 808.2308 - loss: 0.2674</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 5: val_loss did not improve from 0.35707 </code></pre></div> </div> <p>118/118 ━━━━━━━━━━━━━━━━━━━━ 10s 82ms/step - binary_accuracy: 0.9019 - false_negatives_4: 734.8655 - false_positives_4: 822.4117 - loss: 0.2676 - val_binary_accuracy: 0.8330 - val_false_negatives_4: 592.0000 - val_false_positives_4: 243.0000 - val_loss: 0.3805</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 6/20 </code></pre></div> </div> <p>117/118 ━━━━━━━━━━━━━━━━━━━[37m━ 0s 76ms/step - binary_accuracy: 0.9059 - false_negatives_4: 682.1453 - false_positives_4: 737.0513 - loss: 0.2525</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 6: val_loss did not improve from 0.35707 </code></pre></div> </div> <p>118/118 ━━━━━━━━━━━━━━━━━━━━ 10s 82ms/step - binary_accuracy: 0.9059 - false_negatives_4: 693.6387 - false_positives_4: 749.9412 - loss: 0.2526 - val_binary_accuracy: 0.8454 - val_false_negatives_4: 391.0000 - val_false_positives_4: 382.0000 - val_loss: 0.3620</p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>Epoch 6: early stopping </code></pre></div> </div> <p><img alt="png" 
src="/img/examples/nlp/active_learning_review_classification/active_learning_review_classification_17_2767.png" /></p> <p><img alt="png" src="/img/examples/nlp/active_learning_review_classification/active_learning_review_classification_17_2768.png" /></p> <div class="k-default-codeblock"> <div class="codehilite"><pre><span></span><code>---------------------------------------------------------------------------------------------------- Test set evaluation: {'binary_accuracy': 0.8424000144004822, 'false_negatives_4': 491.0, 'false_positives_4': 297.0, 'loss': 0.3661557137966156} ---------------------------------------------------------------------------------------------------- </code></pre></div> </div> <hr /> <h2 id="conclusion">Conclusion</h2> <p>Active Learning is a growing area of research. This example demonstrates the cost-efficiency benefits of Active Learning: by annotating only the most informative samples, we greatly reduce the amount of data that needs to be labeled, saving resources.</p> <p>The following are some noteworthy observations from this example:</p> <ol> <li>We require only 30,000 samples to reach the same (if not better) scores as the model trained on the full dataset. In a real-life setting, this saves the effort required to annotate 10,000 additional text samples!</li> <li>The numbers of false negatives and false positives are well balanced at the end of training, in contrast to the skewed ratio obtained from training on the full dataset. 
This makes the model slightly more useful in real-life scenarios where both labels are equally important.</li> </ol> <p>For further reading on sampling strategies, training techniques, or available open-source libraries/implementations, refer to the resources below:</p> <ol> <li><a href="http://burrsettles.com/pub/settles.activelearning.pdf">Active Learning Literature Survey</a> (Burr Settles, 2010).</li> <li><a href="https://github.com/modAL-python/modAL">modAL</a>: A Modular Active Learning framework.</li> <li>Google's unofficial <a href="https://github.com/google/active-learning">Active Learning playground</a>.</li> </ol> </div> <div class='k-outline'> <div class='k-outline-depth-1'> <a href='#review-classification-using-active-learning'>Review Classification using Active Learning</a> </div> <div class='k-outline-depth-2'> ◆ <a href='#introduction'>Introduction</a> </div> <div class='k-outline-depth-2'> ◆ <a href='#importing-required-libraries'>Importing required libraries</a> </div> <div class='k-outline-depth-2'> ◆ <a href='#loading-and-preprocessing-the-data'>Loading and preprocessing the data</a> </div> <div class='k-outline-depth-3'> <a href='#fitting-the-textvectorization-layer'>Fitting the <code>TextVectorization</code> layer</a> </div> <div class='k-outline-depth-2'> ◆ <a href='#creating-helper-functions'>Creating Helper Functions</a> </div> <div class='k-outline-depth-2'> ◆ <a href='#creating-the-model'>Creating the Model</a> </div> <div class='k-outline-depth-2'> ◆ <a href='#training-on-the-entire-dataset'>Training on the entire dataset</a> </div> <div class='k-outline-depth-2'> ◆ <a href='#training-via-active-learning'>Training via Active Learning</a> </div> <div class='k-outline-depth-2'> ◆ <a href='#conclusion'>Conclusion</a> </div> </div> </div> </div> </div> </body> <footer style="float: left; width: 100%; padding: 1em; border-top: solid 1px #bbb;"> <a href="https://policies.google.com/terms">Terms</a> | <a 
href="https://policies.google.com/privacy">Privacy</a> </footer> </html>