<!DOCTYPE html> <html lang="en"> <head> <meta content="text/html; charset=utf-8" http-equiv="content-type"/> <title>A Two-Phase Recall-and-Select Framework for Fast Model Selection</title> <!--Generated on Thu May 2 20:15:09 2024 by LaTeXML (version 0.8.8) http://dlmf.nist.gov/LaTeXML/.--> <meta content="width=device-width, initial-scale=1, shrink-to-fit=no" name="viewport"/> <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/ar5iv.0.7.9.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/ar5iv-fonts.0.7.9.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/latexml_styles.css" rel="stylesheet" type="text/css"/> <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/html2canvas/1.3.3/html2canvas.min.js"></script> <script src="/static/browse/0.3.4/js/addons_new.js"></script> <script src="/static/browse/0.3.4/js/feedbackOverlay.js"></script> <meta content=" model selection, model clustering " lang="en" name="keywords"/> <base href="/html/2404.00069v1/"/></head> <body> <nav class="ltx_page_navbar"> <nav class="ltx_TOC"> <ol class="ltx_toclist"> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S1" title="In A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">I </span><span class="ltx_text ltx_font_smallcaps">Introduction</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S2" title="In A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">II </span><span class="ltx_text ltx_font_smallcaps">The 
Framework</span></span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S2.SS1" title="In II The Framework ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">II-A</span> </span><span class="ltx_text ltx_font_italic">Preliminaries</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S2.SS2" title="In II The Framework ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">II-B</span> </span><span class="ltx_text ltx_font_italic">Framework Overview</span></span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S3" title="In A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">III </span><span class="ltx_text ltx_font_smallcaps">Coarse Recall</span></span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S3.SS1" title="In III Coarse Recall ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">III-A</span> </span><span class="ltx_text ltx_font_italic">Model Clustering</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S3.SS2" title="In III Coarse Recall ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span 
class="ltx_text">III-B</span> </span><span class="ltx_text ltx_font_italic">Model Recall</span></span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S4" title="In A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">IV </span><span class="ltx_text ltx_font_smallcaps">Fine-Selection</span></span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S4.SS1" title="In IV Fine-Selection ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">IV-A</span> </span><span class="ltx_text ltx_font_italic">Early Stopping</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S4.SS2" title="In IV Fine-Selection ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">IV-B</span> </span><span class="ltx_text ltx_font_italic">Successive Halving</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S4.SS3" title="In IV Fine-Selection ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">IV-C</span> </span><span class="ltx_text ltx_font_italic">Fine-Selection Algorithm</span></span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5" title="In A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag 
ltx_tag_ref">V </span><span class="ltx_text ltx_font_smallcaps">Experiments</span></span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.SS1" title="In V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">V-A</span> </span><span class="ltx_text ltx_font_italic">Experiment Setup</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.SS2" title="In V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">V-B</span> </span><span class="ltx_text ltx_font_italic">Experiment for Coarse-Recall Phase</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"> <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.SS3" title="In V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">V-C</span> </span><span class="ltx_text ltx_font_italic">Experiments for Fine-Selection Phase</span></span></a> <ol class="ltx_toclist ltx_toclist_subsection"> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.SS3.SSS1" title="In V-C Experiments for Fine-Selection Phase ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">V-C</span>1 </span>Performance</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.SS3.SSS2" title="In V-C Experiments for Fine-Selection 
Phase ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">V-C</span>2 </span>Time</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.SS3.SSS3" title="In V-C Experiments for Fine-Selection Phase ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">V-C</span>3 </span>Scaling to more models</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.SS4" title="In V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">V-D</span> </span><span class="ltx_text ltx_font_italic">Overall Performance</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.SS5" title="In V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">V-E</span> </span><span class="ltx_text ltx_font_italic">Generalization Study</span></span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S6" title="In A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">VI </span><span class="ltx_text ltx_font_smallcaps">Related Work</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S7" title="In A Two-Phase Recall-and-Select Framework for Fast Model 
Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">VII </span><span class="ltx_text ltx_font_smallcaps">Future Work</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S8" title="In A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">VIII </span><span class="ltx_text ltx_font_smallcaps">Conclusion</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S9" title="In A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">IX </span><span class="ltx_text ltx_font_smallcaps">Acknowledgment</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#A0.SS1" title="In A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">-A</span> </span><span class="ltx_text ltx_font_italic">Mnli Results</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#A0.SS2" title="In A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">-B</span> </span><span class="ltx_text ltx_font_italic">Model Details</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#A0.SS3" title="In A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">-C</span> </span><span class="ltx_text ltx_font_italic">Dataset 
Details</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#A0.SS4" title="In A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">-D</span> </span><span class="ltx_text ltx_font_italic">Experiment on the Number of Dimensions for Max Average Error</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#A0.SS5" title="In A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">-E</span> </span><span class="ltx_text ltx_font_italic">Model cards</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#A0.SS6" title="In A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">-F</span> </span><span class="ltx_text ltx_font_italic">K-means Clustering Results</span></span></a></li> </ol></nav> </nav> <div class="ltx_page_main"> <div class="ltx_page_content"> <article class="ltx_document ltx_authors_1line"> <h1 class="ltx_title ltx_title_document">A Two-Phase Recall-and-Select Framework for Fast Model Selection </h1> <div class="ltx_authors"> <span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Jianwei Cui, Wenhang Shi, Honglin Tao, Wei Lu<sup class="ltx_sup" id="id3.3.id1"><span class="ltx_text ltx_font_italic" id="id3.3.id1.1">∗</span></sup>, Xiaoyong Du<sup class="ltx_sup" id="id4.4.id2"><span class="ltx_text ltx_font_italic" id="id4.4.id2.1">∗</span></sup> </span><span class="ltx_author_notes">*Corresponding author <span class="ltx_contact ltx_role_affiliation"><span class="ltx_text ltx_font_italic" 
id="id5.5.id1">Renmin University of China, Beijing, China</span> <br class="ltx_break"/>{cuijianwei, wenhangshi, honglintao, lu-wei, duyong}@ruc.edu.cn </span></span></span> </div> <div class="ltx_abstract"> <h6 class="ltx_title ltx_title_abstract">Abstract</h6> <p class="ltx_p" id="id6.id1">As deep learning has become ubiquitous across machine learning applications, a growing number of neural network models have been trained and shared on public model repositories. For a given target task, starting from a suitable source model typically outperforms training from scratch, particularly when training data is limited. Although prior work has developed numerous model selection strategies, the process remains time-consuming, especially given the ever-increasing scale of model repositories. In this paper, we propose a two-phase (coarse-recall and fine-selection) model selection framework that improves the efficiency of selecting a robust model by leveraging the models’ training performances on benchmark datasets. Specifically, the coarse-recall phase clusters, offline, models that show similar training performances on benchmark datasets. A lightweight proxy score is then computed between each model cluster and the target dataset, which quickly recalls a significantly smaller subset of candidate models. In the subsequent fine-selection phase, the final model is chosen by fine-tuning the recalled models on the target dataset with successive halving. To accelerate the process, the final fine-tuning performance of each candidate model is predicted by mining the model’s convergence trend on the benchmark datasets, which helps filter out lower-performing models earlier during fine-tuning. 
Extensive experiments on natural language processing and computer vision tasks show that the proposed framework selects a high-performing model about 3x faster than conventional baseline methods. Our code is available at https://github.com/plasware/two-phase-selection.</p> </div> <div class="ltx_keywords"> <h6 class="ltx_title ltx_title_keywords">Index Terms: </h6> model selection, model clustering </div> <section class="ltx_section" id="S1"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">I </span><span class="ltx_text ltx_font_smallcaps" id="S1.1.1">Introduction</span> </h2> <div class="ltx_para" id="S1.p1"> <p class="ltx_p" id="S1.p1.1">Many neural networks trained in diverse fields such as natural language processing and computer vision are now readily available. These models are commonly hosted on public repositories or model hubs <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib1" title="">1</a>]</cite><cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib2" title="">2</a>]</cite>. Given the wide range of these models’ training data, it is plausible that, for any specific downstream task, there exists a trained model whose training data distribution transfers well to the target task. Using such a pre-trained model for parameter initialization and then fine-tuning on the target dataset often improves performance, because the knowledge captured by the original model is transferred and adapted to the target task. 
Selecting the best pre-trained model from a vast collection is therefore crucial for achieving strong results on a new task, especially when the training data is limited <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib3" title="">3</a>]</cite><cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib4" title="">4</a>]</cite>.</p> </div> <figure class="ltx_figure" id="S1.F1"> <p class="ltx_p ltx_align_center" id="S1.F1.1"><span class="ltx_text" id="S1.F1.1.1"><img alt="Refer to caption" class="ltx_graphics ltx_img_landscape" height="720" id="S1.F1.1.1.g1" src="extracted/2404.00069v1/model-effect-desc_2d.png" width="1080"/></span></p> <br class="ltx_break ltx_break"/> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_figure">Figure 1: </span>Fine-tuning performance of 44 and 25 pre-trained models on the NLP task MNLI <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib5" title="">5</a>]</cite> and the CV task CC6204-Hackaton-Cu <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib6" title="">6</a>]</cite>, respectively. The x-axis and y-axis show the pre-trained models’ IDs and their performance on the dataset, respectively. For each dataset, the IDs are sorted by model accuracy in descending order.</figcaption> </figure> <div class="ltx_para" id="S1.p2"> <p class="ltx_p" id="S1.p2.1">The principal objective of model selection is to efficiently identify a well-suited pre-trained model from a repository for a new machine learning task. However, although a growing repository offers more potential for a good initialization, it also makes it harder to pinpoint the optimal model. Fig. 
<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S1.F1" title="Figure 1 ‣ I Introduction ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">1</span></a> illustrates the fine-tuning results of different models on two distinct machine learning tasks. Although the model pool contains a few models that perform well on the target task, they are markedly outnumbered by models that perform poorly. This imbalance makes model selection harder as the number of available models grows.</p> </div> <div class="ltx_para" id="S1.p3"> <p class="ltx_p" id="S1.p3.1">Existing research on this challenge falls into two main categories. The first develops lightweight proxy tasks to predict the post-fine-tuning performance of pre-trained models. While these methods are computationally efficient because they avoid direct fine-tuning, they are more prone to selecting sub-optimal models <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib7" title="">7</a>]</cite>. The second category performs model selection during fine-tuning on the target dataset, using successive halving to retain only high-performing models at each training iteration <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib3" title="">3</a>]</cite><cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib4" title="">4</a>]</cite>. However, as illustrated in Fig. 
<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S1.F1" title="Figure 1 ‣ I Introduction ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">1</span></a>, only a small fraction of models in the repository are appropriate for a specific downstream task. Therefore, even with the successive halving strategy, it is computationally inefficient to fine-tune all models in the repository, given that each model needs to be loaded and trained for at least one iteration. Additionally, both the efficacy and the efficiency of these existing methods decline as the number of pre-trained models grows.</p> </div> <div class="ltx_para" id="S1.p4"> <p class="ltx_p" id="S1.p4.1">Hence, we propose a two-phase model selection framework, a hybrid approach that combines the advantages of both categories and enables efficient selection of a suitable pre-trained model for a new task. We split model selection into two phases. The first phase, referred to as the coarse-recall phase, identifies a handful of promising model candidates based on lightweight proxy tasks. The second phase, the fine-selection phase, then fine-tunes only the models recalled in the first phase to identify the best one. This method significantly improves the efficiency of selecting a suitable model from a large repository, as fine-tuning is carried out only on a substantially reduced subset of models.</p> </div> <div class="ltx_para" id="S1.p5"> <p class="ltx_p" id="S1.p5.1">The coarse-recall phase computes a lightweight proxy score for each model on the target dataset and keeps only the high-scoring models, which allows fast filtering. Although proxy tasks avoid fine-tuning, computing a score for each model still requires loading the model and running inference, which becomes inefficient as the number of models increases. 
To speed this up, we propose clustering similar models based on their performances on a set of benchmark datasets. This is inspired by the observation that the training data of public models overlap, so models that perform similarly on the benchmark datasets are likely to perform similarly on a new dataset. Specifically, we construct a performance matrix by training each model offline on all benchmark datasets and saving the corresponding performances. We then cluster the models based on their performance vectors, and each time a new task arrives, we compute scores only for each cluster’s representative model. By mining the similarity among models’ training performances, we avoid repeated online computation of proxy scores for similar models and make model selection more efficient.</p> </div> <div class="ltx_para" id="S1.p6"> <p class="ltx_p" id="S1.p6.1">As fine-tuning is more time-consuming, the second fine-selection phase needs a more efficient filtering strategy that removes low-performing models at early training steps rather than wasting time training them. This is motivated by the consistency between a model’s performance at the beginning and at the end of training <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib8" title="">8</a>]</cite>. In this paper, we also apply the successive halving algorithm to filter out at least half of the models, those with lower performance, at each training iteration. Meanwhile, as illustrated in Fig. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S1.F2" title="Figure 2 ‣ I Introduction ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">2</span></a>(b), it is possible to filter out more than half of the models if we can predict the final training performance. To do so, we again mine the convergence processes of a pre-trained model on the benchmark datasets, as illustrated in Fig. 
<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S1.F2" title="Figure 2 ‣ I Introduction ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">2</span></a>(b). Specifically, for every recalled model, we collect its training processes on the benchmark datasets and cluster those with similar validation accuracy to form convergence trends, each of which predicts a range for the final test performance at each iteration step. Then, after the model is fine-tuned on the target dataset for a few steps, we assign it a convergence trend if its current training performance is close to that of the trend at the current step. In this way, the final training performance of a pre-trained model on the target dataset can be predicted more accurately, which helps filter out more models at early steps.</p> </div> <figure class="ltx_figure" id="S1.F2"> <p class="ltx_p ltx_align_center" id="S1.F2.1"><span class="ltx_text" id="S1.F2.1.1"><img alt="Refer to caption" class="ltx_graphics ltx_img_landscape" height="476" id="S1.F2.1.1.g1" src="extracted/2404.00069v1/framework-update.png" width="1045"/></span></p> <br class="ltx_break ltx_break"/> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_figure">Figure 2: </span>The framework of two-phase model selection: (a) the performance matrix; (b) model clustering based on the performance matrix, and convergence trend mining by clustering the convergence processes of a pre-trained model on benchmark datasets; (c) the coarse-recall phase, which runs a recall strategy based on the proxy score computed between a model cluster and the target dataset; and (d) the fine-selection phase, which fine-tunes the recalled models and filters out poorly performing models according to their convergence trends. 
Both (a) and (b) are maintained offline and can be reused for any new task.</figcaption> </figure> <div class="ltx_para" id="S1.p7"> <p class="ltx_p" id="S1.p7.1">In summary, the contributions of this paper are as follows:</p> </div> <div class="ltx_para" id="S1.p8"> <ul class="ltx_itemize" id="S1.I1"> <li class="ltx_item" id="S1.I1.i1" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S1.I1.i1.p1"> <p class="ltx_p" id="S1.I1.i1.p1.1">We propose a two-phase model selection framework. The first coarse-recall phase employs a lightweight proxy score to identify a considerably smaller set of promising model candidates. The second fine-selection phase then fine-tunes and filters only the models in this reduced set.</p> </div> </li> <li class="ltx_item" id="S1.I1.i2" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S1.I1.i2.p1"> <p class="ltx_p" id="S1.I1.i2.p1.1">To further accelerate the process, we propose mining the training performances of pre-trained models on a collection of benchmark datasets and then clustering similar models. Through clustering, we avoid computing proxy scores online for every model in the first phase and improve the accuracy of performance predictions for the remaining models in the second phase, thereby achieving more efficient and precise selection.</p> </div> </li> <li class="ltx_item" id="S1.I1.i3" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S1.I1.i3.p1"> <p class="ltx_p" id="S1.I1.i3.p1.1">We conduct extensive experiments on a wide variety of pre-trained models, covering both the natural language processing and computer vision domains. 
The results demonstrate that our proposed framework identifies high-performing models more efficiently, about 3x faster than successive halving and 5x faster than brute-force methods.</p> </div> </li> </ul> </div> </section> <section class="ltx_section" id="S2"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">II </span><span class="ltx_text ltx_font_smallcaps" id="S2.1.1">The Framework</span> </h2> <section class="ltx_subsection" id="S2.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S2.SS1.4.1.1">II-A</span> </span><span class="ltx_text ltx_font_italic" id="S2.SS1.5.2">Preliminaries</span> </h3> <div class="ltx_para" id="S2.SS1.p1"> <p class="ltx_p" id="S2.SS1.p1.2"><span class="ltx_text ltx_font_italic" id="S2.SS1.p1.2.1">Model Repository</span>. The model repository is a set of pre-trained models, denoted as <math alttext="M=\{m_{1},m_{2},...,m_{n}\}" class="ltx_Math" display="inline" id="S2.SS1.p1.1.m1.4"><semantics id="S2.SS1.p1.1.m1.4a"><mrow id="S2.SS1.p1.1.m1.4.4" xref="S2.SS1.p1.1.m1.4.4.cmml"><mi id="S2.SS1.p1.1.m1.4.4.5" xref="S2.SS1.p1.1.m1.4.4.5.cmml">M</mi><mo id="S2.SS1.p1.1.m1.4.4.4" xref="S2.SS1.p1.1.m1.4.4.4.cmml">=</mo><mrow id="S2.SS1.p1.1.m1.4.4.3.3" xref="S2.SS1.p1.1.m1.4.4.3.4.cmml"><mo id="S2.SS1.p1.1.m1.4.4.3.3.4" stretchy="false" xref="S2.SS1.p1.1.m1.4.4.3.4.cmml">{</mo><msub id="S2.SS1.p1.1.m1.2.2.1.1.1" xref="S2.SS1.p1.1.m1.2.2.1.1.1.cmml"><mi id="S2.SS1.p1.1.m1.2.2.1.1.1.2" xref="S2.SS1.p1.1.m1.2.2.1.1.1.2.cmml">m</mi><mn id="S2.SS1.p1.1.m1.2.2.1.1.1.3" xref="S2.SS1.p1.1.m1.2.2.1.1.1.3.cmml">1</mn></msub><mo id="S2.SS1.p1.1.m1.4.4.3.3.5" xref="S2.SS1.p1.1.m1.4.4.3.4.cmml">,</mo><msub id="S2.SS1.p1.1.m1.3.3.2.2.2" xref="S2.SS1.p1.1.m1.3.3.2.2.2.cmml"><mi id="S2.SS1.p1.1.m1.3.3.2.2.2.2" xref="S2.SS1.p1.1.m1.3.3.2.2.2.2.cmml">m</mi><mn id="S2.SS1.p1.1.m1.3.3.2.2.2.3" 
xref="S2.SS1.p1.1.m1.3.3.2.2.2.3.cmml">2</mn></msub><mo id="S2.SS1.p1.1.m1.4.4.3.3.6" xref="S2.SS1.p1.1.m1.4.4.3.4.cmml">,</mo><mi id="S2.SS1.p1.1.m1.1.1" mathvariant="normal" xref="S2.SS1.p1.1.m1.1.1.cmml">…</mi><mo id="S2.SS1.p1.1.m1.4.4.3.3.7" xref="S2.SS1.p1.1.m1.4.4.3.4.cmml">,</mo><msub id="S2.SS1.p1.1.m1.4.4.3.3.3" xref="S2.SS1.p1.1.m1.4.4.3.3.3.cmml"><mi id="S2.SS1.p1.1.m1.4.4.3.3.3.2" xref="S2.SS1.p1.1.m1.4.4.3.3.3.2.cmml">m</mi><mi id="S2.SS1.p1.1.m1.4.4.3.3.3.3" xref="S2.SS1.p1.1.m1.4.4.3.3.3.3.cmml">n</mi></msub><mo id="S2.SS1.p1.1.m1.4.4.3.3.8" stretchy="false" xref="S2.SS1.p1.1.m1.4.4.3.4.cmml">}</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.1.m1.4b"><apply id="S2.SS1.p1.1.m1.4.4.cmml" xref="S2.SS1.p1.1.m1.4.4"><eq id="S2.SS1.p1.1.m1.4.4.4.cmml" xref="S2.SS1.p1.1.m1.4.4.4"></eq><ci id="S2.SS1.p1.1.m1.4.4.5.cmml" xref="S2.SS1.p1.1.m1.4.4.5">𝑀</ci><set id="S2.SS1.p1.1.m1.4.4.3.4.cmml" xref="S2.SS1.p1.1.m1.4.4.3.3"><apply id="S2.SS1.p1.1.m1.2.2.1.1.1.cmml" xref="S2.SS1.p1.1.m1.2.2.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.p1.1.m1.2.2.1.1.1.1.cmml" xref="S2.SS1.p1.1.m1.2.2.1.1.1">subscript</csymbol><ci id="S2.SS1.p1.1.m1.2.2.1.1.1.2.cmml" xref="S2.SS1.p1.1.m1.2.2.1.1.1.2">𝑚</ci><cn id="S2.SS1.p1.1.m1.2.2.1.1.1.3.cmml" type="integer" xref="S2.SS1.p1.1.m1.2.2.1.1.1.3">1</cn></apply><apply id="S2.SS1.p1.1.m1.3.3.2.2.2.cmml" xref="S2.SS1.p1.1.m1.3.3.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.p1.1.m1.3.3.2.2.2.1.cmml" xref="S2.SS1.p1.1.m1.3.3.2.2.2">subscript</csymbol><ci id="S2.SS1.p1.1.m1.3.3.2.2.2.2.cmml" xref="S2.SS1.p1.1.m1.3.3.2.2.2.2">𝑚</ci><cn id="S2.SS1.p1.1.m1.3.3.2.2.2.3.cmml" type="integer" xref="S2.SS1.p1.1.m1.3.3.2.2.2.3">2</cn></apply><ci id="S2.SS1.p1.1.m1.1.1.cmml" xref="S2.SS1.p1.1.m1.1.1">…</ci><apply id="S2.SS1.p1.1.m1.4.4.3.3.3.cmml" xref="S2.SS1.p1.1.m1.4.4.3.3.3"><csymbol cd="ambiguous" id="S2.SS1.p1.1.m1.4.4.3.3.3.1.cmml" xref="S2.SS1.p1.1.m1.4.4.3.3.3">subscript</csymbol><ci 
id="S2.SS1.p1.1.m1.4.4.3.3.3.2.cmml" xref="S2.SS1.p1.1.m1.4.4.3.3.3.2">𝑚</ci><ci id="S2.SS1.p1.1.m1.4.4.3.3.3.3.cmml" xref="S2.SS1.p1.1.m1.4.4.3.3.3.3">𝑛</ci></apply></set></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.1.m1.4c">M=\{m_{1},m_{2},...,m_{n}\}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.1.m1.4d">italic_M = { italic_m start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_m start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , … , italic_m start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT }</annotation></semantics></math>. Here, a pre-trained model <math alttext="m_{j}" class="ltx_Math" display="inline" id="S2.SS1.p1.2.m2.1"><semantics id="S2.SS1.p1.2.m2.1a"><msub id="S2.SS1.p1.2.m2.1.1" xref="S2.SS1.p1.2.m2.1.1.cmml"><mi id="S2.SS1.p1.2.m2.1.1.2" xref="S2.SS1.p1.2.m2.1.1.2.cmml">m</mi><mi id="S2.SS1.p1.2.m2.1.1.3" xref="S2.SS1.p1.2.m2.1.1.3.cmml">j</mi></msub><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.2.m2.1b"><apply id="S2.SS1.p1.2.m2.1.1.cmml" xref="S2.SS1.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S2.SS1.p1.2.m2.1.1.1.cmml" xref="S2.SS1.p1.2.m2.1.1">subscript</csymbol><ci id="S2.SS1.p1.2.m2.1.1.2.cmml" xref="S2.SS1.p1.2.m2.1.1.2">𝑚</ci><ci id="S2.SS1.p1.2.m2.1.1.3.cmml" xref="S2.SS1.p1.2.m2.1.1.3">𝑗</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.2.m2.1c">m_{j}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.2.m2.1d">italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT</annotation></semantics></math> is a neural network model already trained on an upstream dataset using one of various learning methods, such as masked language modeling in natural language processing <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib9" title="">9</a>]</cite> or image classification in computer vision.</p> </div> <div class="ltx_para" id="S2.SS1.p2"> <p class="ltx_p" id="S2.SS1.p2.1"><span class="ltx_text 
ltx_font_italic" id="S2.SS1.p2.1.1">Benchmark Datasets</span>. The benchmark datasets comprise representative datasets from the respective domain, such as GLUE <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib5" title="">5</a>]</cite> for natural language processing and various subsets of ImageNet <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib10" title="">10</a>]</cite> for computer vision. We use <math alttext="D=\{d_{1},d_{2},...,d_{m}\}" class="ltx_Math" display="inline" id="S2.SS1.p2.1.m1.4"><semantics id="S2.SS1.p2.1.m1.4a"><mrow id="S2.SS1.p2.1.m1.4.4" xref="S2.SS1.p2.1.m1.4.4.cmml"><mi id="S2.SS1.p2.1.m1.4.4.5" xref="S2.SS1.p2.1.m1.4.4.5.cmml">D</mi><mo id="S2.SS1.p2.1.m1.4.4.4" xref="S2.SS1.p2.1.m1.4.4.4.cmml">=</mo><mrow id="S2.SS1.p2.1.m1.4.4.3.3" xref="S2.SS1.p2.1.m1.4.4.3.4.cmml"><mo id="S2.SS1.p2.1.m1.4.4.3.3.4" stretchy="false" xref="S2.SS1.p2.1.m1.4.4.3.4.cmml">{</mo><msub id="S2.SS1.p2.1.m1.2.2.1.1.1" xref="S2.SS1.p2.1.m1.2.2.1.1.1.cmml"><mi id="S2.SS1.p2.1.m1.2.2.1.1.1.2" xref="S2.SS1.p2.1.m1.2.2.1.1.1.2.cmml">d</mi><mn id="S2.SS1.p2.1.m1.2.2.1.1.1.3" xref="S2.SS1.p2.1.m1.2.2.1.1.1.3.cmml">1</mn></msub><mo id="S2.SS1.p2.1.m1.4.4.3.3.5" xref="S2.SS1.p2.1.m1.4.4.3.4.cmml">,</mo><msub id="S2.SS1.p2.1.m1.3.3.2.2.2" xref="S2.SS1.p2.1.m1.3.3.2.2.2.cmml"><mi id="S2.SS1.p2.1.m1.3.3.2.2.2.2" xref="S2.SS1.p2.1.m1.3.3.2.2.2.2.cmml">d</mi><mn id="S2.SS1.p2.1.m1.3.3.2.2.2.3" xref="S2.SS1.p2.1.m1.3.3.2.2.2.3.cmml">2</mn></msub><mo id="S2.SS1.p2.1.m1.4.4.3.3.6" xref="S2.SS1.p2.1.m1.4.4.3.4.cmml">,</mo><mi id="S2.SS1.p2.1.m1.1.1" mathvariant="normal" xref="S2.SS1.p2.1.m1.1.1.cmml">…</mi><mo id="S2.SS1.p2.1.m1.4.4.3.3.7" xref="S2.SS1.p2.1.m1.4.4.3.4.cmml">,</mo><msub id="S2.SS1.p2.1.m1.4.4.3.3.3" xref="S2.SS1.p2.1.m1.4.4.3.3.3.cmml"><mi id="S2.SS1.p2.1.m1.4.4.3.3.3.2" xref="S2.SS1.p2.1.m1.4.4.3.3.3.2.cmml">d</mi><mi 
id="S2.SS1.p2.1.m1.4.4.3.3.3.3" xref="S2.SS1.p2.1.m1.4.4.3.3.3.3.cmml">m</mi></msub><mo id="S2.SS1.p2.1.m1.4.4.3.3.8" stretchy="false" xref="S2.SS1.p2.1.m1.4.4.3.4.cmml">}</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p2.1.m1.4b"><apply id="S2.SS1.p2.1.m1.4.4.cmml" xref="S2.SS1.p2.1.m1.4.4"><eq id="S2.SS1.p2.1.m1.4.4.4.cmml" xref="S2.SS1.p2.1.m1.4.4.4"></eq><ci id="S2.SS1.p2.1.m1.4.4.5.cmml" xref="S2.SS1.p2.1.m1.4.4.5">𝐷</ci><set id="S2.SS1.p2.1.m1.4.4.3.4.cmml" xref="S2.SS1.p2.1.m1.4.4.3.3"><apply id="S2.SS1.p2.1.m1.2.2.1.1.1.cmml" xref="S2.SS1.p2.1.m1.2.2.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.p2.1.m1.2.2.1.1.1.1.cmml" xref="S2.SS1.p2.1.m1.2.2.1.1.1">subscript</csymbol><ci id="S2.SS1.p2.1.m1.2.2.1.1.1.2.cmml" xref="S2.SS1.p2.1.m1.2.2.1.1.1.2">𝑑</ci><cn id="S2.SS1.p2.1.m1.2.2.1.1.1.3.cmml" type="integer" xref="S2.SS1.p2.1.m1.2.2.1.1.1.3">1</cn></apply><apply id="S2.SS1.p2.1.m1.3.3.2.2.2.cmml" xref="S2.SS1.p2.1.m1.3.3.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.p2.1.m1.3.3.2.2.2.1.cmml" xref="S2.SS1.p2.1.m1.3.3.2.2.2">subscript</csymbol><ci id="S2.SS1.p2.1.m1.3.3.2.2.2.2.cmml" xref="S2.SS1.p2.1.m1.3.3.2.2.2.2">𝑑</ci><cn id="S2.SS1.p2.1.m1.3.3.2.2.2.3.cmml" type="integer" xref="S2.SS1.p2.1.m1.3.3.2.2.2.3">2</cn></apply><ci id="S2.SS1.p2.1.m1.1.1.cmml" xref="S2.SS1.p2.1.m1.1.1">…</ci><apply id="S2.SS1.p2.1.m1.4.4.3.3.3.cmml" xref="S2.SS1.p2.1.m1.4.4.3.3.3"><csymbol cd="ambiguous" id="S2.SS1.p2.1.m1.4.4.3.3.3.1.cmml" xref="S2.SS1.p2.1.m1.4.4.3.3.3">subscript</csymbol><ci id="S2.SS1.p2.1.m1.4.4.3.3.3.2.cmml" xref="S2.SS1.p2.1.m1.4.4.3.3.3.2">𝑑</ci><ci id="S2.SS1.p2.1.m1.4.4.3.3.3.3.cmml" xref="S2.SS1.p2.1.m1.4.4.3.3.3.3">𝑚</ci></apply></set></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.1.m1.4c">D=\{d_{1},d_{2},...,d_{m}\}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.1.m1.4d">italic_D = { italic_d start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_d start_POSTSUBSCRIPT 2 
end_POSTSUBSCRIPT , … , italic_d start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT }</annotation></semantics></math> to denote the benchmark datasets.</p> </div> <div class="ltx_para" id="S2.SS1.p3"> <p class="ltx_p" id="S2.SS1.p3.5"><span class="ltx_text ltx_font_italic" id="S2.SS1.p3.5.1">Performance Matrix</span>. The performance matrix records the test results of pre-trained models fine-tuned on benchmark datasets, denoted as <math alttext="Matrix(D,M)" class="ltx_Math" display="inline" id="S2.SS1.p3.1.m1.2"><semantics id="S2.SS1.p3.1.m1.2a"><mrow id="S2.SS1.p3.1.m1.2.3" xref="S2.SS1.p3.1.m1.2.3.cmml"><mi id="S2.SS1.p3.1.m1.2.3.2" xref="S2.SS1.p3.1.m1.2.3.2.cmml">M</mi><mo id="S2.SS1.p3.1.m1.2.3.1" xref="S2.SS1.p3.1.m1.2.3.1.cmml">⁢</mo><mi id="S2.SS1.p3.1.m1.2.3.3" xref="S2.SS1.p3.1.m1.2.3.3.cmml">a</mi><mo id="S2.SS1.p3.1.m1.2.3.1a" xref="S2.SS1.p3.1.m1.2.3.1.cmml">⁢</mo><mi id="S2.SS1.p3.1.m1.2.3.4" xref="S2.SS1.p3.1.m1.2.3.4.cmml">t</mi><mo id="S2.SS1.p3.1.m1.2.3.1b" xref="S2.SS1.p3.1.m1.2.3.1.cmml">⁢</mo><mi id="S2.SS1.p3.1.m1.2.3.5" xref="S2.SS1.p3.1.m1.2.3.5.cmml">r</mi><mo id="S2.SS1.p3.1.m1.2.3.1c" xref="S2.SS1.p3.1.m1.2.3.1.cmml">⁢</mo><mi id="S2.SS1.p3.1.m1.2.3.6" xref="S2.SS1.p3.1.m1.2.3.6.cmml">i</mi><mo id="S2.SS1.p3.1.m1.2.3.1d" xref="S2.SS1.p3.1.m1.2.3.1.cmml">⁢</mo><mi id="S2.SS1.p3.1.m1.2.3.7" xref="S2.SS1.p3.1.m1.2.3.7.cmml">x</mi><mo id="S2.SS1.p3.1.m1.2.3.1e" xref="S2.SS1.p3.1.m1.2.3.1.cmml">⁢</mo><mrow id="S2.SS1.p3.1.m1.2.3.8.2" xref="S2.SS1.p3.1.m1.2.3.8.1.cmml"><mo id="S2.SS1.p3.1.m1.2.3.8.2.1" stretchy="false" xref="S2.SS1.p3.1.m1.2.3.8.1.cmml">(</mo><mi id="S2.SS1.p3.1.m1.1.1" xref="S2.SS1.p3.1.m1.1.1.cmml">D</mi><mo id="S2.SS1.p3.1.m1.2.3.8.2.2" xref="S2.SS1.p3.1.m1.2.3.8.1.cmml">,</mo><mi id="S2.SS1.p3.1.m1.2.2" xref="S2.SS1.p3.1.m1.2.2.cmml">M</mi><mo id="S2.SS1.p3.1.m1.2.3.8.2.3" stretchy="false" xref="S2.SS1.p3.1.m1.2.3.8.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p3.1.m1.2b"><apply 
id="S2.SS1.p3.1.m1.2.3.cmml" xref="S2.SS1.p3.1.m1.2.3"><times id="S2.SS1.p3.1.m1.2.3.1.cmml" xref="S2.SS1.p3.1.m1.2.3.1"></times><ci id="S2.SS1.p3.1.m1.2.3.2.cmml" xref="S2.SS1.p3.1.m1.2.3.2">𝑀</ci><ci id="S2.SS1.p3.1.m1.2.3.3.cmml" xref="S2.SS1.p3.1.m1.2.3.3">𝑎</ci><ci id="S2.SS1.p3.1.m1.2.3.4.cmml" xref="S2.SS1.p3.1.m1.2.3.4">𝑡</ci><ci id="S2.SS1.p3.1.m1.2.3.5.cmml" xref="S2.SS1.p3.1.m1.2.3.5">𝑟</ci><ci id="S2.SS1.p3.1.m1.2.3.6.cmml" xref="S2.SS1.p3.1.m1.2.3.6">𝑖</ci><ci id="S2.SS1.p3.1.m1.2.3.7.cmml" xref="S2.SS1.p3.1.m1.2.3.7">𝑥</ci><interval closure="open" id="S2.SS1.p3.1.m1.2.3.8.1.cmml" xref="S2.SS1.p3.1.m1.2.3.8.2"><ci id="S2.SS1.p3.1.m1.1.1.cmml" xref="S2.SS1.p3.1.m1.1.1">𝐷</ci><ci id="S2.SS1.p3.1.m1.2.2.cmml" xref="S2.SS1.p3.1.m1.2.2">𝑀</ci></interval></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p3.1.m1.2c">Matrix(D,M)</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p3.1.m1.2d">italic_M italic_a italic_t italic_r italic_i italic_x ( italic_D , italic_M )</annotation></semantics></math> with the value <math alttext="Matrix(D,M)[i][j]" class="ltx_Math" display="inline" id="S2.SS1.p3.2.m2.4"><semantics id="S2.SS1.p3.2.m2.4a"><mrow id="S2.SS1.p3.2.m2.4.5" xref="S2.SS1.p3.2.m2.4.5.cmml"><mi id="S2.SS1.p3.2.m2.4.5.2" xref="S2.SS1.p3.2.m2.4.5.2.cmml">M</mi><mo id="S2.SS1.p3.2.m2.4.5.1" xref="S2.SS1.p3.2.m2.4.5.1.cmml">⁢</mo><mi id="S2.SS1.p3.2.m2.4.5.3" xref="S2.SS1.p3.2.m2.4.5.3.cmml">a</mi><mo id="S2.SS1.p3.2.m2.4.5.1a" xref="S2.SS1.p3.2.m2.4.5.1.cmml">⁢</mo><mi id="S2.SS1.p3.2.m2.4.5.4" xref="S2.SS1.p3.2.m2.4.5.4.cmml">t</mi><mo id="S2.SS1.p3.2.m2.4.5.1b" xref="S2.SS1.p3.2.m2.4.5.1.cmml">⁢</mo><mi id="S2.SS1.p3.2.m2.4.5.5" xref="S2.SS1.p3.2.m2.4.5.5.cmml">r</mi><mo id="S2.SS1.p3.2.m2.4.5.1c" xref="S2.SS1.p3.2.m2.4.5.1.cmml">⁢</mo><mi id="S2.SS1.p3.2.m2.4.5.6" xref="S2.SS1.p3.2.m2.4.5.6.cmml">i</mi><mo id="S2.SS1.p3.2.m2.4.5.1d" xref="S2.SS1.p3.2.m2.4.5.1.cmml">⁢</mo><mi id="S2.SS1.p3.2.m2.4.5.7" 
xref="S2.SS1.p3.2.m2.4.5.7.cmml">x</mi><mo id="S2.SS1.p3.2.m2.4.5.1e" xref="S2.SS1.p3.2.m2.4.5.1.cmml">⁢</mo><mrow id="S2.SS1.p3.2.m2.4.5.8.2" xref="S2.SS1.p3.2.m2.4.5.8.1.cmml"><mo id="S2.SS1.p3.2.m2.4.5.8.2.1" stretchy="false" xref="S2.SS1.p3.2.m2.4.5.8.1.cmml">(</mo><mi id="S2.SS1.p3.2.m2.1.1" xref="S2.SS1.p3.2.m2.1.1.cmml">D</mi><mo id="S2.SS1.p3.2.m2.4.5.8.2.2" xref="S2.SS1.p3.2.m2.4.5.8.1.cmml">,</mo><mi id="S2.SS1.p3.2.m2.2.2" xref="S2.SS1.p3.2.m2.2.2.cmml">M</mi><mo id="S2.SS1.p3.2.m2.4.5.8.2.3" stretchy="false" xref="S2.SS1.p3.2.m2.4.5.8.1.cmml">)</mo></mrow><mo id="S2.SS1.p3.2.m2.4.5.1f" xref="S2.SS1.p3.2.m2.4.5.1.cmml">⁢</mo><mrow id="S2.SS1.p3.2.m2.4.5.9.2" xref="S2.SS1.p3.2.m2.4.5.9.1.cmml"><mo id="S2.SS1.p3.2.m2.4.5.9.2.1" stretchy="false" xref="S2.SS1.p3.2.m2.4.5.9.1.1.cmml">[</mo><mi id="S2.SS1.p3.2.m2.3.3" xref="S2.SS1.p3.2.m2.3.3.cmml">i</mi><mo id="S2.SS1.p3.2.m2.4.5.9.2.2" stretchy="false" xref="S2.SS1.p3.2.m2.4.5.9.1.1.cmml">]</mo></mrow><mo id="S2.SS1.p3.2.m2.4.5.1g" xref="S2.SS1.p3.2.m2.4.5.1.cmml">⁢</mo><mrow id="S2.SS1.p3.2.m2.4.5.10.2" xref="S2.SS1.p3.2.m2.4.5.10.1.cmml"><mo id="S2.SS1.p3.2.m2.4.5.10.2.1" stretchy="false" xref="S2.SS1.p3.2.m2.4.5.10.1.1.cmml">[</mo><mi id="S2.SS1.p3.2.m2.4.4" xref="S2.SS1.p3.2.m2.4.4.cmml">j</mi><mo id="S2.SS1.p3.2.m2.4.5.10.2.2" stretchy="false" xref="S2.SS1.p3.2.m2.4.5.10.1.1.cmml">]</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p3.2.m2.4b"><apply id="S2.SS1.p3.2.m2.4.5.cmml" xref="S2.SS1.p3.2.m2.4.5"><times id="S2.SS1.p3.2.m2.4.5.1.cmml" xref="S2.SS1.p3.2.m2.4.5.1"></times><ci id="S2.SS1.p3.2.m2.4.5.2.cmml" xref="S2.SS1.p3.2.m2.4.5.2">𝑀</ci><ci id="S2.SS1.p3.2.m2.4.5.3.cmml" xref="S2.SS1.p3.2.m2.4.5.3">𝑎</ci><ci id="S2.SS1.p3.2.m2.4.5.4.cmml" xref="S2.SS1.p3.2.m2.4.5.4">𝑡</ci><ci id="S2.SS1.p3.2.m2.4.5.5.cmml" xref="S2.SS1.p3.2.m2.4.5.5">𝑟</ci><ci id="S2.SS1.p3.2.m2.4.5.6.cmml" xref="S2.SS1.p3.2.m2.4.5.6">𝑖</ci><ci id="S2.SS1.p3.2.m2.4.5.7.cmml" 
xref="S2.SS1.p3.2.m2.4.5.7">𝑥</ci><interval closure="open" id="S2.SS1.p3.2.m2.4.5.8.1.cmml" xref="S2.SS1.p3.2.m2.4.5.8.2"><ci id="S2.SS1.p3.2.m2.1.1.cmml" xref="S2.SS1.p3.2.m2.1.1">𝐷</ci><ci id="S2.SS1.p3.2.m2.2.2.cmml" xref="S2.SS1.p3.2.m2.2.2">𝑀</ci></interval><apply id="S2.SS1.p3.2.m2.4.5.9.1.cmml" xref="S2.SS1.p3.2.m2.4.5.9.2"><csymbol cd="latexml" id="S2.SS1.p3.2.m2.4.5.9.1.1.cmml" xref="S2.SS1.p3.2.m2.4.5.9.2.1">delimited-[]</csymbol><ci id="S2.SS1.p3.2.m2.3.3.cmml" xref="S2.SS1.p3.2.m2.3.3">𝑖</ci></apply><apply id="S2.SS1.p3.2.m2.4.5.10.1.cmml" xref="S2.SS1.p3.2.m2.4.5.10.2"><csymbol cd="latexml" id="S2.SS1.p3.2.m2.4.5.10.1.1.cmml" xref="S2.SS1.p3.2.m2.4.5.10.2.1">delimited-[]</csymbol><ci id="S2.SS1.p3.2.m2.4.4.cmml" xref="S2.SS1.p3.2.m2.4.4">𝑗</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p3.2.m2.4c">Matrix(D,M)[i][j]</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p3.2.m2.4d">italic_M italic_a italic_t italic_r italic_i italic_x ( italic_D , italic_M ) [ italic_i ] [ italic_j ]</annotation></semantics></math> is the training performance of the pre-trained model <math alttext="m_{j}" class="ltx_Math" display="inline" id="S2.SS1.p3.3.m3.1"><semantics id="S2.SS1.p3.3.m3.1a"><msub id="S2.SS1.p3.3.m3.1.1" xref="S2.SS1.p3.3.m3.1.1.cmml"><mi id="S2.SS1.p3.3.m3.1.1.2" xref="S2.SS1.p3.3.m3.1.1.2.cmml">m</mi><mi id="S2.SS1.p3.3.m3.1.1.3" xref="S2.SS1.p3.3.m3.1.1.3.cmml">j</mi></msub><annotation-xml encoding="MathML-Content" id="S2.SS1.p3.3.m3.1b"><apply id="S2.SS1.p3.3.m3.1.1.cmml" xref="S2.SS1.p3.3.m3.1.1"><csymbol cd="ambiguous" id="S2.SS1.p3.3.m3.1.1.1.cmml" xref="S2.SS1.p3.3.m3.1.1">subscript</csymbol><ci id="S2.SS1.p3.3.m3.1.1.2.cmml" xref="S2.SS1.p3.3.m3.1.1.2">𝑚</ci><ci id="S2.SS1.p3.3.m3.1.1.3.cmml" xref="S2.SS1.p3.3.m3.1.1.3">𝑗</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p3.3.m3.1c">m_{j}</annotation><annotation encoding="application/x-llamapun" 
id="S2.SS1.p3.3.m3.1d">italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT</annotation></semantics></math> on the benchmark dataset <math alttext="d_{i}" class="ltx_Math" display="inline" id="S2.SS1.p3.4.m4.1"><semantics id="S2.SS1.p3.4.m4.1a"><msub id="S2.SS1.p3.4.m4.1.1" xref="S2.SS1.p3.4.m4.1.1.cmml"><mi id="S2.SS1.p3.4.m4.1.1.2" xref="S2.SS1.p3.4.m4.1.1.2.cmml">d</mi><mi id="S2.SS1.p3.4.m4.1.1.3" xref="S2.SS1.p3.4.m4.1.1.3.cmml">i</mi></msub><annotation-xml encoding="MathML-Content" id="S2.SS1.p3.4.m4.1b"><apply id="S2.SS1.p3.4.m4.1.1.cmml" xref="S2.SS1.p3.4.m4.1.1"><csymbol cd="ambiguous" id="S2.SS1.p3.4.m4.1.1.1.cmml" xref="S2.SS1.p3.4.m4.1.1">subscript</csymbol><ci id="S2.SS1.p3.4.m4.1.1.2.cmml" xref="S2.SS1.p3.4.m4.1.1.2">𝑑</ci><ci id="S2.SS1.p3.4.m4.1.1.3.cmml" xref="S2.SS1.p3.4.m4.1.1.3">𝑖</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p3.4.m4.1c">d_{i}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p3.4.m4.1d">italic_d start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT</annotation></semantics></math>, also denoted as <math alttext="p(d_{i}|m_{j})" class="ltx_Math" display="inline" id="S2.SS1.p3.5.m5.1"><semantics id="S2.SS1.p3.5.m5.1a"><mrow id="S2.SS1.p3.5.m5.1.1" xref="S2.SS1.p3.5.m5.1.1.cmml"><mi id="S2.SS1.p3.5.m5.1.1.3" xref="S2.SS1.p3.5.m5.1.1.3.cmml">p</mi><mo id="S2.SS1.p3.5.m5.1.1.2" xref="S2.SS1.p3.5.m5.1.1.2.cmml">⁢</mo><mrow id="S2.SS1.p3.5.m5.1.1.1.1" xref="S2.SS1.p3.5.m5.1.1.1.1.1.cmml"><mo id="S2.SS1.p3.5.m5.1.1.1.1.2" stretchy="false" xref="S2.SS1.p3.5.m5.1.1.1.1.1.cmml">(</mo><mrow id="S2.SS1.p3.5.m5.1.1.1.1.1" xref="S2.SS1.p3.5.m5.1.1.1.1.1.cmml"><msub id="S2.SS1.p3.5.m5.1.1.1.1.1.2" xref="S2.SS1.p3.5.m5.1.1.1.1.1.2.cmml"><mi id="S2.SS1.p3.5.m5.1.1.1.1.1.2.2" xref="S2.SS1.p3.5.m5.1.1.1.1.1.2.2.cmml">d</mi><mi id="S2.SS1.p3.5.m5.1.1.1.1.1.2.3" xref="S2.SS1.p3.5.m5.1.1.1.1.1.2.3.cmml">i</mi></msub><mo fence="false" id="S2.SS1.p3.5.m5.1.1.1.1.1.1" 
xref="S2.SS1.p3.5.m5.1.1.1.1.1.1.cmml">|</mo><msub id="S2.SS1.p3.5.m5.1.1.1.1.1.3" xref="S2.SS1.p3.5.m5.1.1.1.1.1.3.cmml"><mi id="S2.SS1.p3.5.m5.1.1.1.1.1.3.2" xref="S2.SS1.p3.5.m5.1.1.1.1.1.3.2.cmml">m</mi><mi id="S2.SS1.p3.5.m5.1.1.1.1.1.3.3" xref="S2.SS1.p3.5.m5.1.1.1.1.1.3.3.cmml">j</mi></msub></mrow><mo id="S2.SS1.p3.5.m5.1.1.1.1.3" stretchy="false" xref="S2.SS1.p3.5.m5.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p3.5.m5.1b"><apply id="S2.SS1.p3.5.m5.1.1.cmml" xref="S2.SS1.p3.5.m5.1.1"><times id="S2.SS1.p3.5.m5.1.1.2.cmml" xref="S2.SS1.p3.5.m5.1.1.2"></times><ci id="S2.SS1.p3.5.m5.1.1.3.cmml" xref="S2.SS1.p3.5.m5.1.1.3">𝑝</ci><apply id="S2.SS1.p3.5.m5.1.1.1.1.1.cmml" xref="S2.SS1.p3.5.m5.1.1.1.1"><csymbol cd="latexml" id="S2.SS1.p3.5.m5.1.1.1.1.1.1.cmml" xref="S2.SS1.p3.5.m5.1.1.1.1.1.1">conditional</csymbol><apply id="S2.SS1.p3.5.m5.1.1.1.1.1.2.cmml" xref="S2.SS1.p3.5.m5.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.p3.5.m5.1.1.1.1.1.2.1.cmml" xref="S2.SS1.p3.5.m5.1.1.1.1.1.2">subscript</csymbol><ci id="S2.SS1.p3.5.m5.1.1.1.1.1.2.2.cmml" xref="S2.SS1.p3.5.m5.1.1.1.1.1.2.2">𝑑</ci><ci id="S2.SS1.p3.5.m5.1.1.1.1.1.2.3.cmml" xref="S2.SS1.p3.5.m5.1.1.1.1.1.2.3">𝑖</ci></apply><apply id="S2.SS1.p3.5.m5.1.1.1.1.1.3.cmml" xref="S2.SS1.p3.5.m5.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S2.SS1.p3.5.m5.1.1.1.1.1.3.1.cmml" xref="S2.SS1.p3.5.m5.1.1.1.1.1.3">subscript</csymbol><ci id="S2.SS1.p3.5.m5.1.1.1.1.1.3.2.cmml" xref="S2.SS1.p3.5.m5.1.1.1.1.1.3.2">𝑚</ci><ci id="S2.SS1.p3.5.m5.1.1.1.1.1.3.3.cmml" xref="S2.SS1.p3.5.m5.1.1.1.1.1.3.3">𝑗</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p3.5.m5.1c">p(d_{i}|m_{j})</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p3.5.m5.1d">italic_p ( italic_d start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT )</annotation></semantics></math>. 
The training performance can be measured with a task-appropriate metric, such as accuracy for classification tasks.</p> </div> <div class="ltx_para" id="S2.SS1.p4"> <p class="ltx_p" id="S2.SS1.p4.4"><span class="ltx_text ltx_font_italic" id="S2.SS1.p4.4.1">Model Cluster</span>. A model cluster is a group of models with similar training performance on the benchmark datasets. Fig. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S1.F2" title="Figure 2 ‣ I Introduction ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">2</span></a>(b) illustrates two model clusters, where <math alttext="C_{1}" class="ltx_Math" display="inline" id="S2.SS1.p4.1.m1.1"><semantics id="S2.SS1.p4.1.m1.1a"><msub id="S2.SS1.p4.1.m1.1.1" xref="S2.SS1.p4.1.m1.1.1.cmml"><mi id="S2.SS1.p4.1.m1.1.1.2" xref="S2.SS1.p4.1.m1.1.1.2.cmml">C</mi><mn id="S2.SS1.p4.1.m1.1.1.3" xref="S2.SS1.p4.1.m1.1.1.3.cmml">1</mn></msub><annotation-xml encoding="MathML-Content" id="S2.SS1.p4.1.m1.1b"><apply id="S2.SS1.p4.1.m1.1.1.cmml" xref="S2.SS1.p4.1.m1.1.1"><csymbol cd="ambiguous" id="S2.SS1.p4.1.m1.1.1.1.cmml" xref="S2.SS1.p4.1.m1.1.1">subscript</csymbol><ci id="S2.SS1.p4.1.m1.1.1.2.cmml" xref="S2.SS1.p4.1.m1.1.1.2">𝐶</ci><cn id="S2.SS1.p4.1.m1.1.1.3.cmml" type="integer" xref="S2.SS1.p4.1.m1.1.1.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p4.1.m1.1c">C_{1}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p4.1.m1.1d">italic_C start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT</annotation></semantics></math> contains three models <math alttext="{m_{i},m_{j},m_{k}}" class="ltx_Math" display="inline" id="S2.SS1.p4.2.m2.3"><semantics id="S2.SS1.p4.2.m2.3a"><mrow id="S2.SS1.p4.2.m2.3.3.3" xref="S2.SS1.p4.2.m2.3.3.4.cmml"><msub id="S2.SS1.p4.2.m2.1.1.1.1" xref="S2.SS1.p4.2.m2.1.1.1.1.cmml"><mi id="S2.SS1.p4.2.m2.1.1.1.1.2" xref="S2.SS1.p4.2.m2.1.1.1.1.2.cmml">m</mi><mi 
id="S2.SS1.p4.2.m2.1.1.1.1.3" xref="S2.SS1.p4.2.m2.1.1.1.1.3.cmml">i</mi></msub><mo id="S2.SS1.p4.2.m2.3.3.3.4" xref="S2.SS1.p4.2.m2.3.3.4.cmml">,</mo><msub id="S2.SS1.p4.2.m2.2.2.2.2" xref="S2.SS1.p4.2.m2.2.2.2.2.cmml"><mi id="S2.SS1.p4.2.m2.2.2.2.2.2" xref="S2.SS1.p4.2.m2.2.2.2.2.2.cmml">m</mi><mi id="S2.SS1.p4.2.m2.2.2.2.2.3" xref="S2.SS1.p4.2.m2.2.2.2.2.3.cmml">j</mi></msub><mo id="S2.SS1.p4.2.m2.3.3.3.5" xref="S2.SS1.p4.2.m2.3.3.4.cmml">,</mo><msub id="S2.SS1.p4.2.m2.3.3.3.3" xref="S2.SS1.p4.2.m2.3.3.3.3.cmml"><mi id="S2.SS1.p4.2.m2.3.3.3.3.2" xref="S2.SS1.p4.2.m2.3.3.3.3.2.cmml">m</mi><mi id="S2.SS1.p4.2.m2.3.3.3.3.3" xref="S2.SS1.p4.2.m2.3.3.3.3.3.cmml">k</mi></msub></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p4.2.m2.3b"><list id="S2.SS1.p4.2.m2.3.3.4.cmml" xref="S2.SS1.p4.2.m2.3.3.3"><apply id="S2.SS1.p4.2.m2.1.1.1.1.cmml" xref="S2.SS1.p4.2.m2.1.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.p4.2.m2.1.1.1.1.1.cmml" xref="S2.SS1.p4.2.m2.1.1.1.1">subscript</csymbol><ci id="S2.SS1.p4.2.m2.1.1.1.1.2.cmml" xref="S2.SS1.p4.2.m2.1.1.1.1.2">𝑚</ci><ci id="S2.SS1.p4.2.m2.1.1.1.1.3.cmml" xref="S2.SS1.p4.2.m2.1.1.1.1.3">𝑖</ci></apply><apply id="S2.SS1.p4.2.m2.2.2.2.2.cmml" xref="S2.SS1.p4.2.m2.2.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.p4.2.m2.2.2.2.2.1.cmml" xref="S2.SS1.p4.2.m2.2.2.2.2">subscript</csymbol><ci id="S2.SS1.p4.2.m2.2.2.2.2.2.cmml" xref="S2.SS1.p4.2.m2.2.2.2.2.2">𝑚</ci><ci id="S2.SS1.p4.2.m2.2.2.2.2.3.cmml" xref="S2.SS1.p4.2.m2.2.2.2.2.3">𝑗</ci></apply><apply id="S2.SS1.p4.2.m2.3.3.3.3.cmml" xref="S2.SS1.p4.2.m2.3.3.3.3"><csymbol cd="ambiguous" id="S2.SS1.p4.2.m2.3.3.3.3.1.cmml" xref="S2.SS1.p4.2.m2.3.3.3.3">subscript</csymbol><ci id="S2.SS1.p4.2.m2.3.3.3.3.2.cmml" xref="S2.SS1.p4.2.m2.3.3.3.3.2">𝑚</ci><ci id="S2.SS1.p4.2.m2.3.3.3.3.3.cmml" xref="S2.SS1.p4.2.m2.3.3.3.3.3">𝑘</ci></apply></list></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p4.2.m2.3c">{m_{i},m_{j},m_{k}}</annotation><annotation 
encoding="application/x-llamapun" id="S2.SS1.p4.2.m2.3d">italic_m start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , italic_m start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT</annotation></semantics></math> and <math alttext="C_{2}" class="ltx_Math" display="inline" id="S2.SS1.p4.3.m3.1"><semantics id="S2.SS1.p4.3.m3.1a"><msub id="S2.SS1.p4.3.m3.1.1" xref="S2.SS1.p4.3.m3.1.1.cmml"><mi id="S2.SS1.p4.3.m3.1.1.2" xref="S2.SS1.p4.3.m3.1.1.2.cmml">C</mi><mn id="S2.SS1.p4.3.m3.1.1.3" xref="S2.SS1.p4.3.m3.1.1.3.cmml">2</mn></msub><annotation-xml encoding="MathML-Content" id="S2.SS1.p4.3.m3.1b"><apply id="S2.SS1.p4.3.m3.1.1.cmml" xref="S2.SS1.p4.3.m3.1.1"><csymbol cd="ambiguous" id="S2.SS1.p4.3.m3.1.1.1.cmml" xref="S2.SS1.p4.3.m3.1.1">subscript</csymbol><ci id="S2.SS1.p4.3.m3.1.1.2.cmml" xref="S2.SS1.p4.3.m3.1.1.2">𝐶</ci><cn id="S2.SS1.p4.3.m3.1.1.3.cmml" type="integer" xref="S2.SS1.p4.3.m3.1.1.3">2</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p4.3.m3.1c">C_{2}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p4.3.m3.1d">italic_C start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT</annotation></semantics></math> contains two models <math alttext="{m_{x},m_{y}}" class="ltx_Math" display="inline" id="S2.SS1.p4.4.m4.2"><semantics id="S2.SS1.p4.4.m4.2a"><mrow id="S2.SS1.p4.4.m4.2.2.2" xref="S2.SS1.p4.4.m4.2.2.3.cmml"><msub id="S2.SS1.p4.4.m4.1.1.1.1" xref="S2.SS1.p4.4.m4.1.1.1.1.cmml"><mi id="S2.SS1.p4.4.m4.1.1.1.1.2" xref="S2.SS1.p4.4.m4.1.1.1.1.2.cmml">m</mi><mi id="S2.SS1.p4.4.m4.1.1.1.1.3" xref="S2.SS1.p4.4.m4.1.1.1.1.3.cmml">x</mi></msub><mo id="S2.SS1.p4.4.m4.2.2.2.3" xref="S2.SS1.p4.4.m4.2.2.3.cmml">,</mo><msub id="S2.SS1.p4.4.m4.2.2.2.2" xref="S2.SS1.p4.4.m4.2.2.2.2.cmml"><mi id="S2.SS1.p4.4.m4.2.2.2.2.2" xref="S2.SS1.p4.4.m4.2.2.2.2.2.cmml">m</mi><mi id="S2.SS1.p4.4.m4.2.2.2.2.3" xref="S2.SS1.p4.4.m4.2.2.2.2.3.cmml">y</mi></msub></mrow><annotation-xml 
encoding="MathML-Content" id="S2.SS1.p4.4.m4.2b"><list id="S2.SS1.p4.4.m4.2.2.3.cmml" xref="S2.SS1.p4.4.m4.2.2.2"><apply id="S2.SS1.p4.4.m4.1.1.1.1.cmml" xref="S2.SS1.p4.4.m4.1.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.p4.4.m4.1.1.1.1.1.cmml" xref="S2.SS1.p4.4.m4.1.1.1.1">subscript</csymbol><ci id="S2.SS1.p4.4.m4.1.1.1.1.2.cmml" xref="S2.SS1.p4.4.m4.1.1.1.1.2">𝑚</ci><ci id="S2.SS1.p4.4.m4.1.1.1.1.3.cmml" xref="S2.SS1.p4.4.m4.1.1.1.1.3">𝑥</ci></apply><apply id="S2.SS1.p4.4.m4.2.2.2.2.cmml" xref="S2.SS1.p4.4.m4.2.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.p4.4.m4.2.2.2.2.1.cmml" xref="S2.SS1.p4.4.m4.2.2.2.2">subscript</csymbol><ci id="S2.SS1.p4.4.m4.2.2.2.2.2.cmml" xref="S2.SS1.p4.4.m4.2.2.2.2.2">𝑚</ci><ci id="S2.SS1.p4.4.m4.2.2.2.2.3.cmml" xref="S2.SS1.p4.4.m4.2.2.2.2.3">𝑦</ci></apply></list></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p4.4.m4.2c">{m_{x},m_{y}}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p4.4.m4.2d">italic_m start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT , italic_m start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT</annotation></semantics></math> respectively.</p> </div> <div class="ltx_para" id="S2.SS1.p5"> <p class="ltx_p" id="S2.SS1.p5.6"><span class="ltx_text ltx_font_italic" id="S2.SS1.p5.6.1">Convergence Trend</span>. The convergence trend clusters datasets into different classes on which the model has a similar training process. 
We use <math alttext="CT(m_{j})_{t}" class="ltx_Math" display="inline" id="S2.SS1.p5.1.m1.1"><semantics id="S2.SS1.p5.1.m1.1a"><mrow id="S2.SS1.p5.1.m1.1.1" xref="S2.SS1.p5.1.m1.1.1.cmml"><mi id="S2.SS1.p5.1.m1.1.1.3" xref="S2.SS1.p5.1.m1.1.1.3.cmml">C</mi><mo id="S2.SS1.p5.1.m1.1.1.2" xref="S2.SS1.p5.1.m1.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p5.1.m1.1.1.4" xref="S2.SS1.p5.1.m1.1.1.4.cmml">T</mi><mo id="S2.SS1.p5.1.m1.1.1.2a" xref="S2.SS1.p5.1.m1.1.1.2.cmml">⁢</mo><msub id="S2.SS1.p5.1.m1.1.1.1" xref="S2.SS1.p5.1.m1.1.1.1.cmml"><mrow id="S2.SS1.p5.1.m1.1.1.1.1.1" xref="S2.SS1.p5.1.m1.1.1.1.1.1.1.cmml"><mo id="S2.SS1.p5.1.m1.1.1.1.1.1.2" stretchy="false" xref="S2.SS1.p5.1.m1.1.1.1.1.1.1.cmml">(</mo><msub id="S2.SS1.p5.1.m1.1.1.1.1.1.1" xref="S2.SS1.p5.1.m1.1.1.1.1.1.1.cmml"><mi id="S2.SS1.p5.1.m1.1.1.1.1.1.1.2" xref="S2.SS1.p5.1.m1.1.1.1.1.1.1.2.cmml">m</mi><mi id="S2.SS1.p5.1.m1.1.1.1.1.1.1.3" xref="S2.SS1.p5.1.m1.1.1.1.1.1.1.3.cmml">j</mi></msub><mo id="S2.SS1.p5.1.m1.1.1.1.1.1.3" stretchy="false" xref="S2.SS1.p5.1.m1.1.1.1.1.1.1.cmml">)</mo></mrow><mi id="S2.SS1.p5.1.m1.1.1.1.3" xref="S2.SS1.p5.1.m1.1.1.1.3.cmml">t</mi></msub></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p5.1.m1.1b"><apply id="S2.SS1.p5.1.m1.1.1.cmml" xref="S2.SS1.p5.1.m1.1.1"><times id="S2.SS1.p5.1.m1.1.1.2.cmml" xref="S2.SS1.p5.1.m1.1.1.2"></times><ci id="S2.SS1.p5.1.m1.1.1.3.cmml" xref="S2.SS1.p5.1.m1.1.1.3">𝐶</ci><ci id="S2.SS1.p5.1.m1.1.1.4.cmml" xref="S2.SS1.p5.1.m1.1.1.4">𝑇</ci><apply id="S2.SS1.p5.1.m1.1.1.1.cmml" xref="S2.SS1.p5.1.m1.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.p5.1.m1.1.1.1.2.cmml" xref="S2.SS1.p5.1.m1.1.1.1">subscript</csymbol><apply id="S2.SS1.p5.1.m1.1.1.1.1.1.1.cmml" xref="S2.SS1.p5.1.m1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.p5.1.m1.1.1.1.1.1.1.1.cmml" xref="S2.SS1.p5.1.m1.1.1.1.1.1">subscript</csymbol><ci id="S2.SS1.p5.1.m1.1.1.1.1.1.1.2.cmml" xref="S2.SS1.p5.1.m1.1.1.1.1.1.1.2">𝑚</ci><ci id="S2.SS1.p5.1.m1.1.1.1.1.1.1.3.cmml" 
xref="S2.SS1.p5.1.m1.1.1.1.1.1.1.3">𝑗</ci></apply><ci id="S2.SS1.p5.1.m1.1.1.1.3.cmml" xref="S2.SS1.p5.1.m1.1.1.1.3">𝑡</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p5.1.m1.1c">CT(m_{j})_{t}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p5.1.m1.1d">italic_C italic_T ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math> to denote the convergence-trend class to which a dataset belongs at the <math alttext="t" class="ltx_Math" display="inline" id="S2.SS1.p5.2.m2.1"><semantics id="S2.SS1.p5.2.m2.1a"><mi id="S2.SS1.p5.2.m2.1.1" xref="S2.SS1.p5.2.m2.1.1.cmml">t</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p5.2.m2.1b"><ci id="S2.SS1.p5.2.m2.1.1.cmml" xref="S2.SS1.p5.2.m2.1.1">𝑡</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p5.2.m2.1c">t</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p5.2.m2.1d">italic_t</annotation></semantics></math>-th validation for model <math alttext="m_{j}" class="ltx_Math" display="inline" id="S2.SS1.p5.3.m3.1"><semantics id="S2.SS1.p5.3.m3.1a"><msub id="S2.SS1.p5.3.m3.1.1" xref="S2.SS1.p5.3.m3.1.1.cmml"><mi id="S2.SS1.p5.3.m3.1.1.2" xref="S2.SS1.p5.3.m3.1.1.2.cmml">m</mi><mi id="S2.SS1.p5.3.m3.1.1.3" xref="S2.SS1.p5.3.m3.1.1.3.cmml">j</mi></msub><annotation-xml encoding="MathML-Content" id="S2.SS1.p5.3.m3.1b"><apply id="S2.SS1.p5.3.m3.1.1.cmml" xref="S2.SS1.p5.3.m3.1.1"><csymbol cd="ambiguous" id="S2.SS1.p5.3.m3.1.1.1.cmml" xref="S2.SS1.p5.3.m3.1.1">subscript</csymbol><ci id="S2.SS1.p5.3.m3.1.1.2.cmml" xref="S2.SS1.p5.3.m3.1.1.2">𝑚</ci><ci id="S2.SS1.p5.3.m3.1.1.3.cmml" xref="S2.SS1.p5.3.m3.1.1.3">𝑗</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p5.3.m3.1c">m_{j}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p5.3.m3.1d">italic_m start_POSTSUBSCRIPT italic_j 
end_POSTSUBSCRIPT</annotation></semantics></math>. Fig. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S1.F2" title="Figure 2 ‣ I Introduction ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">2</span></a>(b) illustrates two convergence trends for <math alttext="m_{j}" class="ltx_Math" display="inline" id="S2.SS1.p5.4.m4.1"><semantics id="S2.SS1.p5.4.m4.1a"><msub id="S2.SS1.p5.4.m4.1.1" xref="S2.SS1.p5.4.m4.1.1.cmml"><mi id="S2.SS1.p5.4.m4.1.1.2" xref="S2.SS1.p5.4.m4.1.1.2.cmml">m</mi><mi id="S2.SS1.p5.4.m4.1.1.3" xref="S2.SS1.p5.4.m4.1.1.3.cmml">j</mi></msub><annotation-xml encoding="MathML-Content" id="S2.SS1.p5.4.m4.1b"><apply id="S2.SS1.p5.4.m4.1.1.cmml" xref="S2.SS1.p5.4.m4.1.1"><csymbol cd="ambiguous" id="S2.SS1.p5.4.m4.1.1.1.cmml" xref="S2.SS1.p5.4.m4.1.1">subscript</csymbol><ci id="S2.SS1.p5.4.m4.1.1.2.cmml" xref="S2.SS1.p5.4.m4.1.1.2">𝑚</ci><ci id="S2.SS1.p5.4.m4.1.1.3.cmml" xref="S2.SS1.p5.4.m4.1.1.3">𝑗</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p5.4.m4.1c">m_{j}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p5.4.m4.1d">italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT</annotation></semantics></math>. 
The first convergence trend <math alttext="CT(m_{j})_{t}[0]" class="ltx_Math" display="inline" id="S2.SS1.p5.5.m5.2"><semantics id="S2.SS1.p5.5.m5.2a"><mrow id="S2.SS1.p5.5.m5.2.2" xref="S2.SS1.p5.5.m5.2.2.cmml"><mi id="S2.SS1.p5.5.m5.2.2.3" xref="S2.SS1.p5.5.m5.2.2.3.cmml">C</mi><mo id="S2.SS1.p5.5.m5.2.2.2" xref="S2.SS1.p5.5.m5.2.2.2.cmml">⁢</mo><mi id="S2.SS1.p5.5.m5.2.2.4" xref="S2.SS1.p5.5.m5.2.2.4.cmml">T</mi><mo id="S2.SS1.p5.5.m5.2.2.2a" xref="S2.SS1.p5.5.m5.2.2.2.cmml">⁢</mo><msub id="S2.SS1.p5.5.m5.2.2.1" xref="S2.SS1.p5.5.m5.2.2.1.cmml"><mrow id="S2.SS1.p5.5.m5.2.2.1.1.1" xref="S2.SS1.p5.5.m5.2.2.1.1.1.1.cmml"><mo id="S2.SS1.p5.5.m5.2.2.1.1.1.2" stretchy="false" xref="S2.SS1.p5.5.m5.2.2.1.1.1.1.cmml">(</mo><msub id="S2.SS1.p5.5.m5.2.2.1.1.1.1" xref="S2.SS1.p5.5.m5.2.2.1.1.1.1.cmml"><mi id="S2.SS1.p5.5.m5.2.2.1.1.1.1.2" xref="S2.SS1.p5.5.m5.2.2.1.1.1.1.2.cmml">m</mi><mi id="S2.SS1.p5.5.m5.2.2.1.1.1.1.3" xref="S2.SS1.p5.5.m5.2.2.1.1.1.1.3.cmml">j</mi></msub><mo id="S2.SS1.p5.5.m5.2.2.1.1.1.3" stretchy="false" xref="S2.SS1.p5.5.m5.2.2.1.1.1.1.cmml">)</mo></mrow><mi id="S2.SS1.p5.5.m5.2.2.1.3" xref="S2.SS1.p5.5.m5.2.2.1.3.cmml">t</mi></msub><mo id="S2.SS1.p5.5.m5.2.2.2b" xref="S2.SS1.p5.5.m5.2.2.2.cmml">⁢</mo><mrow id="S2.SS1.p5.5.m5.2.2.5.2" xref="S2.SS1.p5.5.m5.2.2.5.1.cmml"><mo id="S2.SS1.p5.5.m5.2.2.5.2.1" stretchy="false" xref="S2.SS1.p5.5.m5.2.2.5.1.1.cmml">[</mo><mn id="S2.SS1.p5.5.m5.1.1" xref="S2.SS1.p5.5.m5.1.1.cmml">0</mn><mo id="S2.SS1.p5.5.m5.2.2.5.2.2" stretchy="false" xref="S2.SS1.p5.5.m5.2.2.5.1.1.cmml">]</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p5.5.m5.2b"><apply id="S2.SS1.p5.5.m5.2.2.cmml" xref="S2.SS1.p5.5.m5.2.2"><times id="S2.SS1.p5.5.m5.2.2.2.cmml" xref="S2.SS1.p5.5.m5.2.2.2"></times><ci id="S2.SS1.p5.5.m5.2.2.3.cmml" xref="S2.SS1.p5.5.m5.2.2.3">𝐶</ci><ci id="S2.SS1.p5.5.m5.2.2.4.cmml" xref="S2.SS1.p5.5.m5.2.2.4">𝑇</ci><apply id="S2.SS1.p5.5.m5.2.2.1.cmml" xref="S2.SS1.p5.5.m5.2.2.1"><csymbol 
cd="ambiguous" id="S2.SS1.p5.5.m5.2.2.1.2.cmml" xref="S2.SS1.p5.5.m5.2.2.1">subscript</csymbol><apply id="S2.SS1.p5.5.m5.2.2.1.1.1.1.cmml" xref="S2.SS1.p5.5.m5.2.2.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.p5.5.m5.2.2.1.1.1.1.1.cmml" xref="S2.SS1.p5.5.m5.2.2.1.1.1">subscript</csymbol><ci id="S2.SS1.p5.5.m5.2.2.1.1.1.1.2.cmml" xref="S2.SS1.p5.5.m5.2.2.1.1.1.1.2">𝑚</ci><ci id="S2.SS1.p5.5.m5.2.2.1.1.1.1.3.cmml" xref="S2.SS1.p5.5.m5.2.2.1.1.1.1.3">𝑗</ci></apply><ci id="S2.SS1.p5.5.m5.2.2.1.3.cmml" xref="S2.SS1.p5.5.m5.2.2.1.3">𝑡</ci></apply><apply id="S2.SS1.p5.5.m5.2.2.5.1.cmml" xref="S2.SS1.p5.5.m5.2.2.5.2"><csymbol cd="latexml" id="S2.SS1.p5.5.m5.2.2.5.1.1.cmml" xref="S2.SS1.p5.5.m5.2.2.5.2.1">delimited-[]</csymbol><cn id="S2.SS1.p5.5.m5.1.1.cmml" type="integer" xref="S2.SS1.p5.5.m5.1.1">0</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p5.5.m5.2c">CT(m_{j})_{t}[0]</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p5.5.m5.2d">italic_C italic_T ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT [ 0 ]</annotation></semantics></math> represents a convergence process that reaches a relatively higher final training performance, and the second convergence trend <math alttext="CT(m_{j})_{t}[1]" class="ltx_Math" display="inline" id="S2.SS1.p5.6.m6.2"><semantics id="S2.SS1.p5.6.m6.2a"><mrow id="S2.SS1.p5.6.m6.2.2" xref="S2.SS1.p5.6.m6.2.2.cmml"><mi id="S2.SS1.p5.6.m6.2.2.3" xref="S2.SS1.p5.6.m6.2.2.3.cmml">C</mi><mo id="S2.SS1.p5.6.m6.2.2.2" xref="S2.SS1.p5.6.m6.2.2.2.cmml">⁢</mo><mi id="S2.SS1.p5.6.m6.2.2.4" xref="S2.SS1.p5.6.m6.2.2.4.cmml">T</mi><mo id="S2.SS1.p5.6.m6.2.2.2a" xref="S2.SS1.p5.6.m6.2.2.2.cmml">⁢</mo><msub id="S2.SS1.p5.6.m6.2.2.1" xref="S2.SS1.p5.6.m6.2.2.1.cmml"><mrow id="S2.SS1.p5.6.m6.2.2.1.1.1" xref="S2.SS1.p5.6.m6.2.2.1.1.1.1.cmml"><mo id="S2.SS1.p5.6.m6.2.2.1.1.1.2" stretchy="false" xref="S2.SS1.p5.6.m6.2.2.1.1.1.1.cmml">(</mo><msub 
id="S2.SS1.p5.6.m6.2.2.1.1.1.1" xref="S2.SS1.p5.6.m6.2.2.1.1.1.1.cmml"><mi id="S2.SS1.p5.6.m6.2.2.1.1.1.1.2" xref="S2.SS1.p5.6.m6.2.2.1.1.1.1.2.cmml">m</mi><mi id="S2.SS1.p5.6.m6.2.2.1.1.1.1.3" xref="S2.SS1.p5.6.m6.2.2.1.1.1.1.3.cmml">j</mi></msub><mo id="S2.SS1.p5.6.m6.2.2.1.1.1.3" stretchy="false" xref="S2.SS1.p5.6.m6.2.2.1.1.1.1.cmml">)</mo></mrow><mi id="S2.SS1.p5.6.m6.2.2.1.3" xref="S2.SS1.p5.6.m6.2.2.1.3.cmml">t</mi></msub><mo id="S2.SS1.p5.6.m6.2.2.2b" xref="S2.SS1.p5.6.m6.2.2.2.cmml">⁢</mo><mrow id="S2.SS1.p5.6.m6.2.2.5.2" xref="S2.SS1.p5.6.m6.2.2.5.1.cmml"><mo id="S2.SS1.p5.6.m6.2.2.5.2.1" stretchy="false" xref="S2.SS1.p5.6.m6.2.2.5.1.1.cmml">[</mo><mn id="S2.SS1.p5.6.m6.1.1" xref="S2.SS1.p5.6.m6.1.1.cmml">1</mn><mo id="S2.SS1.p5.6.m6.2.2.5.2.2" stretchy="false" xref="S2.SS1.p5.6.m6.2.2.5.1.1.cmml">]</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p5.6.m6.2b"><apply id="S2.SS1.p5.6.m6.2.2.cmml" xref="S2.SS1.p5.6.m6.2.2"><times id="S2.SS1.p5.6.m6.2.2.2.cmml" xref="S2.SS1.p5.6.m6.2.2.2"></times><ci id="S2.SS1.p5.6.m6.2.2.3.cmml" xref="S2.SS1.p5.6.m6.2.2.3">𝐶</ci><ci id="S2.SS1.p5.6.m6.2.2.4.cmml" xref="S2.SS1.p5.6.m6.2.2.4">𝑇</ci><apply id="S2.SS1.p5.6.m6.2.2.1.cmml" xref="S2.SS1.p5.6.m6.2.2.1"><csymbol cd="ambiguous" id="S2.SS1.p5.6.m6.2.2.1.2.cmml" xref="S2.SS1.p5.6.m6.2.2.1">subscript</csymbol><apply id="S2.SS1.p5.6.m6.2.2.1.1.1.1.cmml" xref="S2.SS1.p5.6.m6.2.2.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.p5.6.m6.2.2.1.1.1.1.1.cmml" xref="S2.SS1.p5.6.m6.2.2.1.1.1">subscript</csymbol><ci id="S2.SS1.p5.6.m6.2.2.1.1.1.1.2.cmml" xref="S2.SS1.p5.6.m6.2.2.1.1.1.1.2">𝑚</ci><ci id="S2.SS1.p5.6.m6.2.2.1.1.1.1.3.cmml" xref="S2.SS1.p5.6.m6.2.2.1.1.1.1.3">𝑗</ci></apply><ci id="S2.SS1.p5.6.m6.2.2.1.3.cmml" xref="S2.SS1.p5.6.m6.2.2.1.3">𝑡</ci></apply><apply id="S2.SS1.p5.6.m6.2.2.5.1.cmml" xref="S2.SS1.p5.6.m6.2.2.5.2"><csymbol cd="latexml" id="S2.SS1.p5.6.m6.2.2.5.1.1.cmml" xref="S2.SS1.p5.6.m6.2.2.5.2.1">delimited-[]</csymbol><cn 
id="S2.SS1.p5.6.m6.1.1.cmml" type="integer" xref="S2.SS1.p5.6.m6.1.1">1</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p5.6.m6.2c">CT(m_{j})_{t}[1]</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p5.6.m6.2d">italic_C italic_T ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT [ 1 ]</annotation></semantics></math> achieves lower final training performance. Based on the mined convergence trends, the final performance after training can be predicted more accurately from the early training steps.</p> </div> <div class="ltx_para" id="S2.SS1.p6"> <p class="ltx_p" id="S2.SS1.p6.6"><span class="ltx_text ltx_font_italic" id="S2.SS1.p6.6.1">Proxy Score</span>. The proxy score is computed from a proxy task that predicts the training performance <math alttext="p(d_{i}|m_{j})" class="ltx_Math" display="inline" id="S2.SS1.p6.1.m1.1"><semantics id="S2.SS1.p6.1.m1.1a"><mrow id="S2.SS1.p6.1.m1.1.1" xref="S2.SS1.p6.1.m1.1.1.cmml"><mi id="S2.SS1.p6.1.m1.1.1.3" xref="S2.SS1.p6.1.m1.1.1.3.cmml">p</mi><mo id="S2.SS1.p6.1.m1.1.1.2" xref="S2.SS1.p6.1.m1.1.1.2.cmml">⁢</mo><mrow id="S2.SS1.p6.1.m1.1.1.1.1" xref="S2.SS1.p6.1.m1.1.1.1.1.1.cmml"><mo id="S2.SS1.p6.1.m1.1.1.1.1.2" stretchy="false" xref="S2.SS1.p6.1.m1.1.1.1.1.1.cmml">(</mo><mrow id="S2.SS1.p6.1.m1.1.1.1.1.1" xref="S2.SS1.p6.1.m1.1.1.1.1.1.cmml"><msub id="S2.SS1.p6.1.m1.1.1.1.1.1.2" xref="S2.SS1.p6.1.m1.1.1.1.1.1.2.cmml"><mi id="S2.SS1.p6.1.m1.1.1.1.1.1.2.2" xref="S2.SS1.p6.1.m1.1.1.1.1.1.2.2.cmml">d</mi><mi id="S2.SS1.p6.1.m1.1.1.1.1.1.2.3" xref="S2.SS1.p6.1.m1.1.1.1.1.1.2.3.cmml">i</mi></msub><mo fence="false" id="S2.SS1.p6.1.m1.1.1.1.1.1.1" xref="S2.SS1.p6.1.m1.1.1.1.1.1.1.cmml">|</mo><msub id="S2.SS1.p6.1.m1.1.1.1.1.1.3" xref="S2.SS1.p6.1.m1.1.1.1.1.1.3.cmml"><mi id="S2.SS1.p6.1.m1.1.1.1.1.1.3.2" xref="S2.SS1.p6.1.m1.1.1.1.1.1.3.2.cmml">m</mi><mi id="S2.SS1.p6.1.m1.1.1.1.1.1.3.3" 
xref="S2.SS1.p6.1.m1.1.1.1.1.1.3.3.cmml">j</mi></msub></mrow><mo id="S2.SS1.p6.1.m1.1.1.1.1.3" stretchy="false" xref="S2.SS1.p6.1.m1.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p6.1.m1.1b"><apply id="S2.SS1.p6.1.m1.1.1.cmml" xref="S2.SS1.p6.1.m1.1.1"><times id="S2.SS1.p6.1.m1.1.1.2.cmml" xref="S2.SS1.p6.1.m1.1.1.2"></times><ci id="S2.SS1.p6.1.m1.1.1.3.cmml" xref="S2.SS1.p6.1.m1.1.1.3">𝑝</ci><apply id="S2.SS1.p6.1.m1.1.1.1.1.1.cmml" xref="S2.SS1.p6.1.m1.1.1.1.1"><csymbol cd="latexml" id="S2.SS1.p6.1.m1.1.1.1.1.1.1.cmml" xref="S2.SS1.p6.1.m1.1.1.1.1.1.1">conditional</csymbol><apply id="S2.SS1.p6.1.m1.1.1.1.1.1.2.cmml" xref="S2.SS1.p6.1.m1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.p6.1.m1.1.1.1.1.1.2.1.cmml" xref="S2.SS1.p6.1.m1.1.1.1.1.1.2">subscript</csymbol><ci id="S2.SS1.p6.1.m1.1.1.1.1.1.2.2.cmml" xref="S2.SS1.p6.1.m1.1.1.1.1.1.2.2">𝑑</ci><ci id="S2.SS1.p6.1.m1.1.1.1.1.1.2.3.cmml" xref="S2.SS1.p6.1.m1.1.1.1.1.1.2.3">𝑖</ci></apply><apply id="S2.SS1.p6.1.m1.1.1.1.1.1.3.cmml" xref="S2.SS1.p6.1.m1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S2.SS1.p6.1.m1.1.1.1.1.1.3.1.cmml" xref="S2.SS1.p6.1.m1.1.1.1.1.1.3">subscript</csymbol><ci id="S2.SS1.p6.1.m1.1.1.1.1.1.3.2.cmml" xref="S2.SS1.p6.1.m1.1.1.1.1.1.3.2">𝑚</ci><ci id="S2.SS1.p6.1.m1.1.1.1.1.1.3.3.cmml" xref="S2.SS1.p6.1.m1.1.1.1.1.1.3.3">𝑗</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p6.1.m1.1c">p(d_{i}|m_{j})</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p6.1.m1.1d">italic_p ( italic_d start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT )</annotation></semantics></math> without actually fine-tuning <math alttext="m_{j}" class="ltx_Math" display="inline" id="S2.SS1.p6.2.m2.1"><semantics id="S2.SS1.p6.2.m2.1a"><msub id="S2.SS1.p6.2.m2.1.1" xref="S2.SS1.p6.2.m2.1.1.cmml"><mi id="S2.SS1.p6.2.m2.1.1.2" xref="S2.SS1.p6.2.m2.1.1.2.cmml">m</mi><mi 
id="S2.SS1.p6.2.m2.1.1.3" xref="S2.SS1.p6.2.m2.1.1.3.cmml">j</mi></msub><annotation-xml encoding="MathML-Content" id="S2.SS1.p6.2.m2.1b"><apply id="S2.SS1.p6.2.m2.1.1.cmml" xref="S2.SS1.p6.2.m2.1.1"><csymbol cd="ambiguous" id="S2.SS1.p6.2.m2.1.1.1.cmml" xref="S2.SS1.p6.2.m2.1.1">subscript</csymbol><ci id="S2.SS1.p6.2.m2.1.1.2.cmml" xref="S2.SS1.p6.2.m2.1.1.2">𝑚</ci><ci id="S2.SS1.p6.2.m2.1.1.3.cmml" xref="S2.SS1.p6.2.m2.1.1.3">𝑗</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p6.2.m2.1c">m_{j}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p6.2.m2.1d">italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT</annotation></semantics></math> on <math alttext="d_{i}" class="ltx_Math" display="inline" id="S2.SS1.p6.3.m3.1"><semantics id="S2.SS1.p6.3.m3.1a"><msub id="S2.SS1.p6.3.m3.1.1" xref="S2.SS1.p6.3.m3.1.1.cmml"><mi id="S2.SS1.p6.3.m3.1.1.2" xref="S2.SS1.p6.3.m3.1.1.2.cmml">d</mi><mi id="S2.SS1.p6.3.m3.1.1.3" xref="S2.SS1.p6.3.m3.1.1.3.cmml">i</mi></msub><annotation-xml encoding="MathML-Content" id="S2.SS1.p6.3.m3.1b"><apply id="S2.SS1.p6.3.m3.1.1.cmml" xref="S2.SS1.p6.3.m3.1.1"><csymbol cd="ambiguous" id="S2.SS1.p6.3.m3.1.1.1.cmml" xref="S2.SS1.p6.3.m3.1.1">subscript</csymbol><ci id="S2.SS1.p6.3.m3.1.1.2.cmml" xref="S2.SS1.p6.3.m3.1.1.2">𝑑</ci><ci id="S2.SS1.p6.3.m3.1.1.3.cmml" xref="S2.SS1.p6.3.m3.1.1.3">𝑖</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p6.3.m3.1c">d_{i}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p6.3.m3.1d">italic_d start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT</annotation></semantics></math>, denoted as <math alttext="proxy\_score(d_{i}|m_{j})" class="ltx_Math" display="inline" id="S2.SS1.p6.4.m4.1"><semantics id="S2.SS1.p6.4.m4.1a"><mrow id="S2.SS1.p6.4.m4.1.1" xref="S2.SS1.p6.4.m4.1.1.cmml"><mi id="S2.SS1.p6.4.m4.1.1.3" xref="S2.SS1.p6.4.m4.1.1.3.cmml">p</mi><mo id="S2.SS1.p6.4.m4.1.1.2" xref="S2.SS1.p6.4.m4.1.1.2.cmml">⁢</mo><mi 
id="S2.SS1.p6.4.m4.1.1.4" xref="S2.SS1.p6.4.m4.1.1.4.cmml">r</mi><mo id="S2.SS1.p6.4.m4.1.1.2a" xref="S2.SS1.p6.4.m4.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p6.4.m4.1.1.5" xref="S2.SS1.p6.4.m4.1.1.5.cmml">o</mi><mo id="S2.SS1.p6.4.m4.1.1.2b" xref="S2.SS1.p6.4.m4.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p6.4.m4.1.1.6" xref="S2.SS1.p6.4.m4.1.1.6.cmml">x</mi><mo id="S2.SS1.p6.4.m4.1.1.2c" xref="S2.SS1.p6.4.m4.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p6.4.m4.1.1.7" xref="S2.SS1.p6.4.m4.1.1.7.cmml">y</mi><mo id="S2.SS1.p6.4.m4.1.1.2d" xref="S2.SS1.p6.4.m4.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p6.4.m4.1.1.8" mathvariant="normal" xref="S2.SS1.p6.4.m4.1.1.8.cmml">_</mi><mo id="S2.SS1.p6.4.m4.1.1.2e" xref="S2.SS1.p6.4.m4.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p6.4.m4.1.1.9" xref="S2.SS1.p6.4.m4.1.1.9.cmml">s</mi><mo id="S2.SS1.p6.4.m4.1.1.2f" xref="S2.SS1.p6.4.m4.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p6.4.m4.1.1.10" xref="S2.SS1.p6.4.m4.1.1.10.cmml">c</mi><mo id="S2.SS1.p6.4.m4.1.1.2g" xref="S2.SS1.p6.4.m4.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p6.4.m4.1.1.11" xref="S2.SS1.p6.4.m4.1.1.11.cmml">o</mi><mo id="S2.SS1.p6.4.m4.1.1.2h" xref="S2.SS1.p6.4.m4.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p6.4.m4.1.1.12" xref="S2.SS1.p6.4.m4.1.1.12.cmml">r</mi><mo id="S2.SS1.p6.4.m4.1.1.2i" xref="S2.SS1.p6.4.m4.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p6.4.m4.1.1.13" xref="S2.SS1.p6.4.m4.1.1.13.cmml">e</mi><mo id="S2.SS1.p6.4.m4.1.1.2j" xref="S2.SS1.p6.4.m4.1.1.2.cmml">⁢</mo><mrow id="S2.SS1.p6.4.m4.1.1.1.1" xref="S2.SS1.p6.4.m4.1.1.1.1.1.cmml"><mo id="S2.SS1.p6.4.m4.1.1.1.1.2" stretchy="false" xref="S2.SS1.p6.4.m4.1.1.1.1.1.cmml">(</mo><mrow id="S2.SS1.p6.4.m4.1.1.1.1.1" xref="S2.SS1.p6.4.m4.1.1.1.1.1.cmml"><msub id="S2.SS1.p6.4.m4.1.1.1.1.1.2" xref="S2.SS1.p6.4.m4.1.1.1.1.1.2.cmml"><mi id="S2.SS1.p6.4.m4.1.1.1.1.1.2.2" xref="S2.SS1.p6.4.m4.1.1.1.1.1.2.2.cmml">d</mi><mi id="S2.SS1.p6.4.m4.1.1.1.1.1.2.3" xref="S2.SS1.p6.4.m4.1.1.1.1.1.2.3.cmml">i</mi></msub><mo fence="false" id="S2.SS1.p6.4.m4.1.1.1.1.1.1" 
xref="S2.SS1.p6.4.m4.1.1.1.1.1.1.cmml">|</mo><msub id="S2.SS1.p6.4.m4.1.1.1.1.1.3" xref="S2.SS1.p6.4.m4.1.1.1.1.1.3.cmml"><mi id="S2.SS1.p6.4.m4.1.1.1.1.1.3.2" xref="S2.SS1.p6.4.m4.1.1.1.1.1.3.2.cmml">m</mi><mi id="S2.SS1.p6.4.m4.1.1.1.1.1.3.3" xref="S2.SS1.p6.4.m4.1.1.1.1.1.3.3.cmml">j</mi></msub></mrow><mo id="S2.SS1.p6.4.m4.1.1.1.1.3" stretchy="false" xref="S2.SS1.p6.4.m4.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p6.4.m4.1b"><apply id="S2.SS1.p6.4.m4.1.1.cmml" xref="S2.SS1.p6.4.m4.1.1"><times id="S2.SS1.p6.4.m4.1.1.2.cmml" xref="S2.SS1.p6.4.m4.1.1.2"></times><ci id="S2.SS1.p6.4.m4.1.1.3.cmml" xref="S2.SS1.p6.4.m4.1.1.3">𝑝</ci><ci id="S2.SS1.p6.4.m4.1.1.4.cmml" xref="S2.SS1.p6.4.m4.1.1.4">𝑟</ci><ci id="S2.SS1.p6.4.m4.1.1.5.cmml" xref="S2.SS1.p6.4.m4.1.1.5">𝑜</ci><ci id="S2.SS1.p6.4.m4.1.1.6.cmml" xref="S2.SS1.p6.4.m4.1.1.6">𝑥</ci><ci id="S2.SS1.p6.4.m4.1.1.7.cmml" xref="S2.SS1.p6.4.m4.1.1.7">𝑦</ci><ci id="S2.SS1.p6.4.m4.1.1.8.cmml" xref="S2.SS1.p6.4.m4.1.1.8">_</ci><ci id="S2.SS1.p6.4.m4.1.1.9.cmml" xref="S2.SS1.p6.4.m4.1.1.9">𝑠</ci><ci id="S2.SS1.p6.4.m4.1.1.10.cmml" xref="S2.SS1.p6.4.m4.1.1.10">𝑐</ci><ci id="S2.SS1.p6.4.m4.1.1.11.cmml" xref="S2.SS1.p6.4.m4.1.1.11">𝑜</ci><ci id="S2.SS1.p6.4.m4.1.1.12.cmml" xref="S2.SS1.p6.4.m4.1.1.12">𝑟</ci><ci id="S2.SS1.p6.4.m4.1.1.13.cmml" xref="S2.SS1.p6.4.m4.1.1.13">𝑒</ci><apply id="S2.SS1.p6.4.m4.1.1.1.1.1.cmml" xref="S2.SS1.p6.4.m4.1.1.1.1"><csymbol cd="latexml" id="S2.SS1.p6.4.m4.1.1.1.1.1.1.cmml" xref="S2.SS1.p6.4.m4.1.1.1.1.1.1">conditional</csymbol><apply id="S2.SS1.p6.4.m4.1.1.1.1.1.2.cmml" xref="S2.SS1.p6.4.m4.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.p6.4.m4.1.1.1.1.1.2.1.cmml" xref="S2.SS1.p6.4.m4.1.1.1.1.1.2">subscript</csymbol><ci id="S2.SS1.p6.4.m4.1.1.1.1.1.2.2.cmml" xref="S2.SS1.p6.4.m4.1.1.1.1.1.2.2">𝑑</ci><ci id="S2.SS1.p6.4.m4.1.1.1.1.1.2.3.cmml" xref="S2.SS1.p6.4.m4.1.1.1.1.1.2.3">𝑖</ci></apply><apply id="S2.SS1.p6.4.m4.1.1.1.1.1.3.cmml" 
xref="S2.SS1.p6.4.m4.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S2.SS1.p6.4.m4.1.1.1.1.1.3.1.cmml" xref="S2.SS1.p6.4.m4.1.1.1.1.1.3">subscript</csymbol><ci id="S2.SS1.p6.4.m4.1.1.1.1.1.3.2.cmml" xref="S2.SS1.p6.4.m4.1.1.1.1.1.3.2">𝑚</ci><ci id="S2.SS1.p6.4.m4.1.1.1.1.1.3.3.cmml" xref="S2.SS1.p6.4.m4.1.1.1.1.1.3.3">𝑗</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p6.4.m4.1c">proxy\_score(d_{i}|m_{j})</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p6.4.m4.1d">italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e ( italic_d start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT )</annotation></semantics></math>. Several proxy tasks have been developed to predict <math alttext="p(d_{i}|m_{j})" class="ltx_Math" display="inline" id="S2.SS1.p6.5.m5.1"><semantics id="S2.SS1.p6.5.m5.1a"><mrow id="S2.SS1.p6.5.m5.1.1" xref="S2.SS1.p6.5.m5.1.1.cmml"><mi id="S2.SS1.p6.5.m5.1.1.3" xref="S2.SS1.p6.5.m5.1.1.3.cmml">p</mi><mo id="S2.SS1.p6.5.m5.1.1.2" xref="S2.SS1.p6.5.m5.1.1.2.cmml">⁢</mo><mrow id="S2.SS1.p6.5.m5.1.1.1.1" xref="S2.SS1.p6.5.m5.1.1.1.1.1.cmml"><mo id="S2.SS1.p6.5.m5.1.1.1.1.2" stretchy="false" xref="S2.SS1.p6.5.m5.1.1.1.1.1.cmml">(</mo><mrow id="S2.SS1.p6.5.m5.1.1.1.1.1" xref="S2.SS1.p6.5.m5.1.1.1.1.1.cmml"><msub id="S2.SS1.p6.5.m5.1.1.1.1.1.2" xref="S2.SS1.p6.5.m5.1.1.1.1.1.2.cmml"><mi id="S2.SS1.p6.5.m5.1.1.1.1.1.2.2" xref="S2.SS1.p6.5.m5.1.1.1.1.1.2.2.cmml">d</mi><mi id="S2.SS1.p6.5.m5.1.1.1.1.1.2.3" xref="S2.SS1.p6.5.m5.1.1.1.1.1.2.3.cmml">i</mi></msub><mo fence="false" id="S2.SS1.p6.5.m5.1.1.1.1.1.1" xref="S2.SS1.p6.5.m5.1.1.1.1.1.1.cmml">|</mo><msub id="S2.SS1.p6.5.m5.1.1.1.1.1.3" xref="S2.SS1.p6.5.m5.1.1.1.1.1.3.cmml"><mi id="S2.SS1.p6.5.m5.1.1.1.1.1.3.2" xref="S2.SS1.p6.5.m5.1.1.1.1.1.3.2.cmml">m</mi><mi id="S2.SS1.p6.5.m5.1.1.1.1.1.3.3" xref="S2.SS1.p6.5.m5.1.1.1.1.1.3.3.cmml">j</mi></msub></mrow><mo 
id="S2.SS1.p6.5.m5.1.1.1.1.3" stretchy="false" xref="S2.SS1.p6.5.m5.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p6.5.m5.1b"><apply id="S2.SS1.p6.5.m5.1.1.cmml" xref="S2.SS1.p6.5.m5.1.1"><times id="S2.SS1.p6.5.m5.1.1.2.cmml" xref="S2.SS1.p6.5.m5.1.1.2"></times><ci id="S2.SS1.p6.5.m5.1.1.3.cmml" xref="S2.SS1.p6.5.m5.1.1.3">𝑝</ci><apply id="S2.SS1.p6.5.m5.1.1.1.1.1.cmml" xref="S2.SS1.p6.5.m5.1.1.1.1"><csymbol cd="latexml" id="S2.SS1.p6.5.m5.1.1.1.1.1.1.cmml" xref="S2.SS1.p6.5.m5.1.1.1.1.1.1">conditional</csymbol><apply id="S2.SS1.p6.5.m5.1.1.1.1.1.2.cmml" xref="S2.SS1.p6.5.m5.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.p6.5.m5.1.1.1.1.1.2.1.cmml" xref="S2.SS1.p6.5.m5.1.1.1.1.1.2">subscript</csymbol><ci id="S2.SS1.p6.5.m5.1.1.1.1.1.2.2.cmml" xref="S2.SS1.p6.5.m5.1.1.1.1.1.2.2">𝑑</ci><ci id="S2.SS1.p6.5.m5.1.1.1.1.1.2.3.cmml" xref="S2.SS1.p6.5.m5.1.1.1.1.1.2.3">𝑖</ci></apply><apply id="S2.SS1.p6.5.m5.1.1.1.1.1.3.cmml" xref="S2.SS1.p6.5.m5.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S2.SS1.p6.5.m5.1.1.1.1.1.3.1.cmml" xref="S2.SS1.p6.5.m5.1.1.1.1.1.3">subscript</csymbol><ci id="S2.SS1.p6.5.m5.1.1.1.1.1.3.2.cmml" xref="S2.SS1.p6.5.m5.1.1.1.1.1.3.2">𝑚</ci><ci id="S2.SS1.p6.5.m5.1.1.1.1.1.3.3.cmml" xref="S2.SS1.p6.5.m5.1.1.1.1.1.3.3">𝑗</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p6.5.m5.1c">p(d_{i}|m_{j})</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p6.5.m5.1d">italic_p ( italic_d start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT )</annotation></semantics></math>, such as LEEP <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib11" title="">11</a>]</cite> and KNN classification <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib12" title="">12</a>]</cite>. 
In this paper, we apply the LEEP score as the <math alttext="proxy\_score" class="ltx_Math" display="inline" id="S2.SS1.p6.6.m6.1"><semantics id="S2.SS1.p6.6.m6.1a"><mrow id="S2.SS1.p6.6.m6.1.1" xref="S2.SS1.p6.6.m6.1.1.cmml"><mi id="S2.SS1.p6.6.m6.1.1.2" xref="S2.SS1.p6.6.m6.1.1.2.cmml">p</mi><mo id="S2.SS1.p6.6.m6.1.1.1" xref="S2.SS1.p6.6.m6.1.1.1.cmml">⁢</mo><mi id="S2.SS1.p6.6.m6.1.1.3" xref="S2.SS1.p6.6.m6.1.1.3.cmml">r</mi><mo id="S2.SS1.p6.6.m6.1.1.1a" xref="S2.SS1.p6.6.m6.1.1.1.cmml">⁢</mo><mi id="S2.SS1.p6.6.m6.1.1.4" xref="S2.SS1.p6.6.m6.1.1.4.cmml">o</mi><mo id="S2.SS1.p6.6.m6.1.1.1b" xref="S2.SS1.p6.6.m6.1.1.1.cmml">⁢</mo><mi id="S2.SS1.p6.6.m6.1.1.5" xref="S2.SS1.p6.6.m6.1.1.5.cmml">x</mi><mo id="S2.SS1.p6.6.m6.1.1.1c" xref="S2.SS1.p6.6.m6.1.1.1.cmml">⁢</mo><mi id="S2.SS1.p6.6.m6.1.1.6" xref="S2.SS1.p6.6.m6.1.1.6.cmml">y</mi><mo id="S2.SS1.p6.6.m6.1.1.1d" xref="S2.SS1.p6.6.m6.1.1.1.cmml">⁢</mo><mi id="S2.SS1.p6.6.m6.1.1.7" mathvariant="normal" xref="S2.SS1.p6.6.m6.1.1.7.cmml">_</mi><mo id="S2.SS1.p6.6.m6.1.1.1e" xref="S2.SS1.p6.6.m6.1.1.1.cmml">⁢</mo><mi id="S2.SS1.p6.6.m6.1.1.8" xref="S2.SS1.p6.6.m6.1.1.8.cmml">s</mi><mo id="S2.SS1.p6.6.m6.1.1.1f" xref="S2.SS1.p6.6.m6.1.1.1.cmml">⁢</mo><mi id="S2.SS1.p6.6.m6.1.1.9" xref="S2.SS1.p6.6.m6.1.1.9.cmml">c</mi><mo id="S2.SS1.p6.6.m6.1.1.1g" xref="S2.SS1.p6.6.m6.1.1.1.cmml">⁢</mo><mi id="S2.SS1.p6.6.m6.1.1.10" xref="S2.SS1.p6.6.m6.1.1.10.cmml">o</mi><mo id="S2.SS1.p6.6.m6.1.1.1h" xref="S2.SS1.p6.6.m6.1.1.1.cmml">⁢</mo><mi id="S2.SS1.p6.6.m6.1.1.11" xref="S2.SS1.p6.6.m6.1.1.11.cmml">r</mi><mo id="S2.SS1.p6.6.m6.1.1.1i" xref="S2.SS1.p6.6.m6.1.1.1.cmml">⁢</mo><mi id="S2.SS1.p6.6.m6.1.1.12" xref="S2.SS1.p6.6.m6.1.1.12.cmml">e</mi></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p6.6.m6.1b"><apply id="S2.SS1.p6.6.m6.1.1.cmml" xref="S2.SS1.p6.6.m6.1.1"><times id="S2.SS1.p6.6.m6.1.1.1.cmml" xref="S2.SS1.p6.6.m6.1.1.1"></times><ci id="S2.SS1.p6.6.m6.1.1.2.cmml" xref="S2.SS1.p6.6.m6.1.1.2">𝑝</ci><ci 
id="S2.SS1.p6.6.m6.1.1.3.cmml" xref="S2.SS1.p6.6.m6.1.1.3">𝑟</ci><ci id="S2.SS1.p6.6.m6.1.1.4.cmml" xref="S2.SS1.p6.6.m6.1.1.4">𝑜</ci><ci id="S2.SS1.p6.6.m6.1.1.5.cmml" xref="S2.SS1.p6.6.m6.1.1.5">𝑥</ci><ci id="S2.SS1.p6.6.m6.1.1.6.cmml" xref="S2.SS1.p6.6.m6.1.1.6">𝑦</ci><ci id="S2.SS1.p6.6.m6.1.1.7.cmml" xref="S2.SS1.p6.6.m6.1.1.7">_</ci><ci id="S2.SS1.p6.6.m6.1.1.8.cmml" xref="S2.SS1.p6.6.m6.1.1.8">𝑠</ci><ci id="S2.SS1.p6.6.m6.1.1.9.cmml" xref="S2.SS1.p6.6.m6.1.1.9">𝑐</ci><ci id="S2.SS1.p6.6.m6.1.1.10.cmml" xref="S2.SS1.p6.6.m6.1.1.10">𝑜</ci><ci id="S2.SS1.p6.6.m6.1.1.11.cmml" xref="S2.SS1.p6.6.m6.1.1.11">𝑟</ci><ci id="S2.SS1.p6.6.m6.1.1.12.cmml" xref="S2.SS1.p6.6.m6.1.1.12">𝑒</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p6.6.m6.1c">proxy\_score</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p6.6.m6.1d">italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e</annotation></semantics></math>, which is the average log-likelihood of the expected empirical predictor derived from a source model on the target dataset. The LEEP score has two advantages. First, LEEP can be applied to heterogeneous target tasks whose label spaces differ from those of the pre-trained models. Second, computing the LEEP score requires no extra training, which makes it more efficient than the KNN method.</p> </div> <div class="ltx_para" id="S2.SS1.p7"> <p class="ltx_p" id="S2.SS1.p7.6"><span class="ltx_text ltx_font_italic" id="S2.SS1.p7.6.1">Target Task</span>. 
We use <math alttext="T" class="ltx_Math" display="inline" id="S2.SS1.p7.1.m1.1"><semantics id="S2.SS1.p7.1.m1.1a"><mi id="S2.SS1.p7.1.m1.1.1" xref="S2.SS1.p7.1.m1.1.1.cmml">T</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p7.1.m1.1b"><ci id="S2.SS1.p7.1.m1.1.1.cmml" xref="S2.SS1.p7.1.m1.1.1">𝑇</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p7.1.m1.1c">T</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p7.1.m1.1d">italic_T</annotation></semantics></math> to denote the target task and <math alttext="d(T)" class="ltx_Math" display="inline" id="S2.SS1.p7.2.m2.1"><semantics id="S2.SS1.p7.2.m2.1a"><mrow id="S2.SS1.p7.2.m2.1.2" xref="S2.SS1.p7.2.m2.1.2.cmml"><mi id="S2.SS1.p7.2.m2.1.2.2" xref="S2.SS1.p7.2.m2.1.2.2.cmml">d</mi><mo id="S2.SS1.p7.2.m2.1.2.1" xref="S2.SS1.p7.2.m2.1.2.1.cmml">⁢</mo><mrow id="S2.SS1.p7.2.m2.1.2.3.2" xref="S2.SS1.p7.2.m2.1.2.cmml"><mo id="S2.SS1.p7.2.m2.1.2.3.2.1" stretchy="false" xref="S2.SS1.p7.2.m2.1.2.cmml">(</mo><mi id="S2.SS1.p7.2.m2.1.1" xref="S2.SS1.p7.2.m2.1.1.cmml">T</mi><mo id="S2.SS1.p7.2.m2.1.2.3.2.2" stretchy="false" xref="S2.SS1.p7.2.m2.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p7.2.m2.1b"><apply id="S2.SS1.p7.2.m2.1.2.cmml" xref="S2.SS1.p7.2.m2.1.2"><times id="S2.SS1.p7.2.m2.1.2.1.cmml" xref="S2.SS1.p7.2.m2.1.2.1"></times><ci id="S2.SS1.p7.2.m2.1.2.2.cmml" xref="S2.SS1.p7.2.m2.1.2.2">𝑑</ci><ci id="S2.SS1.p7.2.m2.1.1.cmml" xref="S2.SS1.p7.2.m2.1.1">𝑇</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p7.2.m2.1c">d(T)</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p7.2.m2.1d">italic_d ( italic_T )</annotation></semantics></math> to denote the training dataset of <math alttext="T" class="ltx_Math" display="inline" id="S2.SS1.p7.3.m3.1"><semantics id="S2.SS1.p7.3.m3.1a"><mi id="S2.SS1.p7.3.m3.1.1" xref="S2.SS1.p7.3.m3.1.1.cmml">T</mi><annotation-xml encoding="MathML-Content" 
id="S2.SS1.p7.3.m3.1b"><ci id="S2.SS1.p7.3.m3.1.1.cmml" xref="S2.SS1.p7.3.m3.1.1">𝑇</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p7.3.m3.1c">T</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p7.3.m3.1d">italic_T</annotation></semantics></math>. The proxy score for model <math alttext="m_{j}" class="ltx_Math" display="inline" id="S2.SS1.p7.4.m4.1"><semantics id="S2.SS1.p7.4.m4.1a"><msub id="S2.SS1.p7.4.m4.1.1" xref="S2.SS1.p7.4.m4.1.1.cmml"><mi id="S2.SS1.p7.4.m4.1.1.2" xref="S2.SS1.p7.4.m4.1.1.2.cmml">m</mi><mi id="S2.SS1.p7.4.m4.1.1.3" xref="S2.SS1.p7.4.m4.1.1.3.cmml">j</mi></msub><annotation-xml encoding="MathML-Content" id="S2.SS1.p7.4.m4.1b"><apply id="S2.SS1.p7.4.m4.1.1.cmml" xref="S2.SS1.p7.4.m4.1.1"><csymbol cd="ambiguous" id="S2.SS1.p7.4.m4.1.1.1.cmml" xref="S2.SS1.p7.4.m4.1.1">subscript</csymbol><ci id="S2.SS1.p7.4.m4.1.1.2.cmml" xref="S2.SS1.p7.4.m4.1.1.2">𝑚</ci><ci id="S2.SS1.p7.4.m4.1.1.3.cmml" xref="S2.SS1.p7.4.m4.1.1.3">𝑗</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p7.4.m4.1c">m_{j}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p7.4.m4.1d">italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT</annotation></semantics></math> on the target dataset is denoted as <math alttext="proxy\_score(d(T)|m_{j})" class="ltx_Math" display="inline" id="S2.SS1.p7.5.m5.2"><semantics id="S2.SS1.p7.5.m5.2a"><mrow id="S2.SS1.p7.5.m5.2.2" xref="S2.SS1.p7.5.m5.2.2.cmml"><mi id="S2.SS1.p7.5.m5.2.2.3" xref="S2.SS1.p7.5.m5.2.2.3.cmml">p</mi><mo id="S2.SS1.p7.5.m5.2.2.2" xref="S2.SS1.p7.5.m5.2.2.2.cmml">⁢</mo><mi id="S2.SS1.p7.5.m5.2.2.4" xref="S2.SS1.p7.5.m5.2.2.4.cmml">r</mi><mo id="S2.SS1.p7.5.m5.2.2.2a" xref="S2.SS1.p7.5.m5.2.2.2.cmml">⁢</mo><mi id="S2.SS1.p7.5.m5.2.2.5" xref="S2.SS1.p7.5.m5.2.2.5.cmml">o</mi><mo id="S2.SS1.p7.5.m5.2.2.2b" xref="S2.SS1.p7.5.m5.2.2.2.cmml">⁢</mo><mi id="S2.SS1.p7.5.m5.2.2.6" xref="S2.SS1.p7.5.m5.2.2.6.cmml">x</mi><mo 
id="S2.SS1.p7.5.m5.2.2.2c" xref="S2.SS1.p7.5.m5.2.2.2.cmml">⁢</mo><mi id="S2.SS1.p7.5.m5.2.2.7" xref="S2.SS1.p7.5.m5.2.2.7.cmml">y</mi><mo id="S2.SS1.p7.5.m5.2.2.2d" xref="S2.SS1.p7.5.m5.2.2.2.cmml">⁢</mo><mi id="S2.SS1.p7.5.m5.2.2.8" mathvariant="normal" xref="S2.SS1.p7.5.m5.2.2.8.cmml">_</mi><mo id="S2.SS1.p7.5.m5.2.2.2e" xref="S2.SS1.p7.5.m5.2.2.2.cmml">⁢</mo><mi id="S2.SS1.p7.5.m5.2.2.9" xref="S2.SS1.p7.5.m5.2.2.9.cmml">s</mi><mo id="S2.SS1.p7.5.m5.2.2.2f" xref="S2.SS1.p7.5.m5.2.2.2.cmml">⁢</mo><mi id="S2.SS1.p7.5.m5.2.2.10" xref="S2.SS1.p7.5.m5.2.2.10.cmml">c</mi><mo id="S2.SS1.p7.5.m5.2.2.2g" xref="S2.SS1.p7.5.m5.2.2.2.cmml">⁢</mo><mi id="S2.SS1.p7.5.m5.2.2.11" xref="S2.SS1.p7.5.m5.2.2.11.cmml">o</mi><mo id="S2.SS1.p7.5.m5.2.2.2h" xref="S2.SS1.p7.5.m5.2.2.2.cmml">⁢</mo><mi id="S2.SS1.p7.5.m5.2.2.12" xref="S2.SS1.p7.5.m5.2.2.12.cmml">r</mi><mo id="S2.SS1.p7.5.m5.2.2.2i" xref="S2.SS1.p7.5.m5.2.2.2.cmml">⁢</mo><mi id="S2.SS1.p7.5.m5.2.2.13" xref="S2.SS1.p7.5.m5.2.2.13.cmml">e</mi><mo id="S2.SS1.p7.5.m5.2.2.2j" xref="S2.SS1.p7.5.m5.2.2.2.cmml">⁢</mo><mrow id="S2.SS1.p7.5.m5.2.2.1.1" xref="S2.SS1.p7.5.m5.2.2.1.1.1.cmml"><mo id="S2.SS1.p7.5.m5.2.2.1.1.2" stretchy="false" xref="S2.SS1.p7.5.m5.2.2.1.1.1.cmml">(</mo><mrow id="S2.SS1.p7.5.m5.2.2.1.1.1" xref="S2.SS1.p7.5.m5.2.2.1.1.1.cmml"><mrow id="S2.SS1.p7.5.m5.2.2.1.1.1.2" xref="S2.SS1.p7.5.m5.2.2.1.1.1.2.cmml"><mi id="S2.SS1.p7.5.m5.2.2.1.1.1.2.2" xref="S2.SS1.p7.5.m5.2.2.1.1.1.2.2.cmml">d</mi><mo id="S2.SS1.p7.5.m5.2.2.1.1.1.2.1" xref="S2.SS1.p7.5.m5.2.2.1.1.1.2.1.cmml">⁢</mo><mrow id="S2.SS1.p7.5.m5.2.2.1.1.1.2.3.2" xref="S2.SS1.p7.5.m5.2.2.1.1.1.2.cmml"><mo id="S2.SS1.p7.5.m5.2.2.1.1.1.2.3.2.1" stretchy="false" xref="S2.SS1.p7.5.m5.2.2.1.1.1.2.cmml">(</mo><mi id="S2.SS1.p7.5.m5.1.1" xref="S2.SS1.p7.5.m5.1.1.cmml">T</mi><mo id="S2.SS1.p7.5.m5.2.2.1.1.1.2.3.2.2" stretchy="false" xref="S2.SS1.p7.5.m5.2.2.1.1.1.2.cmml">)</mo></mrow></mrow><mo fence="false" id="S2.SS1.p7.5.m5.2.2.1.1.1.1" 
xref="S2.SS1.p7.5.m5.2.2.1.1.1.1.cmml">|</mo><msub id="S2.SS1.p7.5.m5.2.2.1.1.1.3" xref="S2.SS1.p7.5.m5.2.2.1.1.1.3.cmml"><mi id="S2.SS1.p7.5.m5.2.2.1.1.1.3.2" xref="S2.SS1.p7.5.m5.2.2.1.1.1.3.2.cmml">m</mi><mi id="S2.SS1.p7.5.m5.2.2.1.1.1.3.3" xref="S2.SS1.p7.5.m5.2.2.1.1.1.3.3.cmml">j</mi></msub></mrow><mo id="S2.SS1.p7.5.m5.2.2.1.1.3" stretchy="false" xref="S2.SS1.p7.5.m5.2.2.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p7.5.m5.2b"><apply id="S2.SS1.p7.5.m5.2.2.cmml" xref="S2.SS1.p7.5.m5.2.2"><times id="S2.SS1.p7.5.m5.2.2.2.cmml" xref="S2.SS1.p7.5.m5.2.2.2"></times><ci id="S2.SS1.p7.5.m5.2.2.3.cmml" xref="S2.SS1.p7.5.m5.2.2.3">𝑝</ci><ci id="S2.SS1.p7.5.m5.2.2.4.cmml" xref="S2.SS1.p7.5.m5.2.2.4">𝑟</ci><ci id="S2.SS1.p7.5.m5.2.2.5.cmml" xref="S2.SS1.p7.5.m5.2.2.5">𝑜</ci><ci id="S2.SS1.p7.5.m5.2.2.6.cmml" xref="S2.SS1.p7.5.m5.2.2.6">𝑥</ci><ci id="S2.SS1.p7.5.m5.2.2.7.cmml" xref="S2.SS1.p7.5.m5.2.2.7">𝑦</ci><ci id="S2.SS1.p7.5.m5.2.2.8.cmml" xref="S2.SS1.p7.5.m5.2.2.8">_</ci><ci id="S2.SS1.p7.5.m5.2.2.9.cmml" xref="S2.SS1.p7.5.m5.2.2.9">𝑠</ci><ci id="S2.SS1.p7.5.m5.2.2.10.cmml" xref="S2.SS1.p7.5.m5.2.2.10">𝑐</ci><ci id="S2.SS1.p7.5.m5.2.2.11.cmml" xref="S2.SS1.p7.5.m5.2.2.11">𝑜</ci><ci id="S2.SS1.p7.5.m5.2.2.12.cmml" xref="S2.SS1.p7.5.m5.2.2.12">𝑟</ci><ci id="S2.SS1.p7.5.m5.2.2.13.cmml" xref="S2.SS1.p7.5.m5.2.2.13">𝑒</ci><apply id="S2.SS1.p7.5.m5.2.2.1.1.1.cmml" xref="S2.SS1.p7.5.m5.2.2.1.1"><csymbol cd="latexml" id="S2.SS1.p7.5.m5.2.2.1.1.1.1.cmml" xref="S2.SS1.p7.5.m5.2.2.1.1.1.1">conditional</csymbol><apply id="S2.SS1.p7.5.m5.2.2.1.1.1.2.cmml" xref="S2.SS1.p7.5.m5.2.2.1.1.1.2"><times id="S2.SS1.p7.5.m5.2.2.1.1.1.2.1.cmml" xref="S2.SS1.p7.5.m5.2.2.1.1.1.2.1"></times><ci id="S2.SS1.p7.5.m5.2.2.1.1.1.2.2.cmml" xref="S2.SS1.p7.5.m5.2.2.1.1.1.2.2">𝑑</ci><ci id="S2.SS1.p7.5.m5.1.1.cmml" xref="S2.SS1.p7.5.m5.1.1">𝑇</ci></apply><apply id="S2.SS1.p7.5.m5.2.2.1.1.1.3.cmml" xref="S2.SS1.p7.5.m5.2.2.1.1.1.3"><csymbol cd="ambiguous" 
id="S2.SS1.p7.5.m5.2.2.1.1.1.3.1.cmml" xref="S2.SS1.p7.5.m5.2.2.1.1.1.3">subscript</csymbol><ci id="S2.SS1.p7.5.m5.2.2.1.1.1.3.2.cmml" xref="S2.SS1.p7.5.m5.2.2.1.1.1.3.2">𝑚</ci><ci id="S2.SS1.p7.5.m5.2.2.1.1.1.3.3.cmml" xref="S2.SS1.p7.5.m5.2.2.1.1.1.3.3">𝑗</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p7.5.m5.2c">proxy\_score(d(T)|m_{j})</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p7.5.m5.2d">italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e ( italic_d ( italic_T ) | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT )</annotation></semantics></math>, or <math alttext="proxy\_score(T|m_{j})" class="ltx_Math" display="inline" id="S2.SS1.p7.6.m6.1"><semantics id="S2.SS1.p7.6.m6.1a"><mrow id="S2.SS1.p7.6.m6.1.1" xref="S2.SS1.p7.6.m6.1.1.cmml"><mi id="S2.SS1.p7.6.m6.1.1.3" xref="S2.SS1.p7.6.m6.1.1.3.cmml">p</mi><mo id="S2.SS1.p7.6.m6.1.1.2" xref="S2.SS1.p7.6.m6.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p7.6.m6.1.1.4" xref="S2.SS1.p7.6.m6.1.1.4.cmml">r</mi><mo id="S2.SS1.p7.6.m6.1.1.2a" xref="S2.SS1.p7.6.m6.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p7.6.m6.1.1.5" xref="S2.SS1.p7.6.m6.1.1.5.cmml">o</mi><mo id="S2.SS1.p7.6.m6.1.1.2b" xref="S2.SS1.p7.6.m6.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p7.6.m6.1.1.6" xref="S2.SS1.p7.6.m6.1.1.6.cmml">x</mi><mo id="S2.SS1.p7.6.m6.1.1.2c" xref="S2.SS1.p7.6.m6.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p7.6.m6.1.1.7" xref="S2.SS1.p7.6.m6.1.1.7.cmml">y</mi><mo id="S2.SS1.p7.6.m6.1.1.2d" xref="S2.SS1.p7.6.m6.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p7.6.m6.1.1.8" mathvariant="normal" xref="S2.SS1.p7.6.m6.1.1.8.cmml">_</mi><mo id="S2.SS1.p7.6.m6.1.1.2e" xref="S2.SS1.p7.6.m6.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p7.6.m6.1.1.9" xref="S2.SS1.p7.6.m6.1.1.9.cmml">s</mi><mo id="S2.SS1.p7.6.m6.1.1.2f" xref="S2.SS1.p7.6.m6.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p7.6.m6.1.1.10" xref="S2.SS1.p7.6.m6.1.1.10.cmml">c</mi><mo id="S2.SS1.p7.6.m6.1.1.2g" 
xref="S2.SS1.p7.6.m6.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p7.6.m6.1.1.11" xref="S2.SS1.p7.6.m6.1.1.11.cmml">o</mi><mo id="S2.SS1.p7.6.m6.1.1.2h" xref="S2.SS1.p7.6.m6.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p7.6.m6.1.1.12" xref="S2.SS1.p7.6.m6.1.1.12.cmml">r</mi><mo id="S2.SS1.p7.6.m6.1.1.2i" xref="S2.SS1.p7.6.m6.1.1.2.cmml">⁢</mo><mi id="S2.SS1.p7.6.m6.1.1.13" xref="S2.SS1.p7.6.m6.1.1.13.cmml">e</mi><mo id="S2.SS1.p7.6.m6.1.1.2j" xref="S2.SS1.p7.6.m6.1.1.2.cmml">⁢</mo><mrow id="S2.SS1.p7.6.m6.1.1.1.1" xref="S2.SS1.p7.6.m6.1.1.1.1.1.cmml"><mo id="S2.SS1.p7.6.m6.1.1.1.1.2" stretchy="false" xref="S2.SS1.p7.6.m6.1.1.1.1.1.cmml">(</mo><mrow id="S2.SS1.p7.6.m6.1.1.1.1.1" xref="S2.SS1.p7.6.m6.1.1.1.1.1.cmml"><mi id="S2.SS1.p7.6.m6.1.1.1.1.1.2" xref="S2.SS1.p7.6.m6.1.1.1.1.1.2.cmml">T</mi><mo fence="false" id="S2.SS1.p7.6.m6.1.1.1.1.1.1" xref="S2.SS1.p7.6.m6.1.1.1.1.1.1.cmml">|</mo><msub id="S2.SS1.p7.6.m6.1.1.1.1.1.3" xref="S2.SS1.p7.6.m6.1.1.1.1.1.3.cmml"><mi id="S2.SS1.p7.6.m6.1.1.1.1.1.3.2" xref="S2.SS1.p7.6.m6.1.1.1.1.1.3.2.cmml">m</mi><mi id="S2.SS1.p7.6.m6.1.1.1.1.1.3.3" xref="S2.SS1.p7.6.m6.1.1.1.1.1.3.3.cmml">j</mi></msub></mrow><mo id="S2.SS1.p7.6.m6.1.1.1.1.3" stretchy="false" xref="S2.SS1.p7.6.m6.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p7.6.m6.1b"><apply id="S2.SS1.p7.6.m6.1.1.cmml" xref="S2.SS1.p7.6.m6.1.1"><times id="S2.SS1.p7.6.m6.1.1.2.cmml" xref="S2.SS1.p7.6.m6.1.1.2"></times><ci id="S2.SS1.p7.6.m6.1.1.3.cmml" xref="S2.SS1.p7.6.m6.1.1.3">𝑝</ci><ci id="S2.SS1.p7.6.m6.1.1.4.cmml" xref="S2.SS1.p7.6.m6.1.1.4">𝑟</ci><ci id="S2.SS1.p7.6.m6.1.1.5.cmml" xref="S2.SS1.p7.6.m6.1.1.5">𝑜</ci><ci id="S2.SS1.p7.6.m6.1.1.6.cmml" xref="S2.SS1.p7.6.m6.1.1.6">𝑥</ci><ci id="S2.SS1.p7.6.m6.1.1.7.cmml" xref="S2.SS1.p7.6.m6.1.1.7">𝑦</ci><ci id="S2.SS1.p7.6.m6.1.1.8.cmml" xref="S2.SS1.p7.6.m6.1.1.8">_</ci><ci id="S2.SS1.p7.6.m6.1.1.9.cmml" xref="S2.SS1.p7.6.m6.1.1.9">𝑠</ci><ci id="S2.SS1.p7.6.m6.1.1.10.cmml" 
xref="S2.SS1.p7.6.m6.1.1.10">𝑐</ci><ci id="S2.SS1.p7.6.m6.1.1.11.cmml" xref="S2.SS1.p7.6.m6.1.1.11">𝑜</ci><ci id="S2.SS1.p7.6.m6.1.1.12.cmml" xref="S2.SS1.p7.6.m6.1.1.12">𝑟</ci><ci id="S2.SS1.p7.6.m6.1.1.13.cmml" xref="S2.SS1.p7.6.m6.1.1.13">𝑒</ci><apply id="S2.SS1.p7.6.m6.1.1.1.1.1.cmml" xref="S2.SS1.p7.6.m6.1.1.1.1"><csymbol cd="latexml" id="S2.SS1.p7.6.m6.1.1.1.1.1.1.cmml" xref="S2.SS1.p7.6.m6.1.1.1.1.1.1">conditional</csymbol><ci id="S2.SS1.p7.6.m6.1.1.1.1.1.2.cmml" xref="S2.SS1.p7.6.m6.1.1.1.1.1.2">𝑇</ci><apply id="S2.SS1.p7.6.m6.1.1.1.1.1.3.cmml" xref="S2.SS1.p7.6.m6.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S2.SS1.p7.6.m6.1.1.1.1.1.3.1.cmml" xref="S2.SS1.p7.6.m6.1.1.1.1.1.3">subscript</csymbol><ci id="S2.SS1.p7.6.m6.1.1.1.1.1.3.2.cmml" xref="S2.SS1.p7.6.m6.1.1.1.1.1.3.2">𝑚</ci><ci id="S2.SS1.p7.6.m6.1.1.1.1.1.3.3.cmml" xref="S2.SS1.p7.6.m6.1.1.1.1.1.3.3">𝑗</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p7.6.m6.1c">proxy\_score(T|m_{j})</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p7.6.m6.1d">italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e ( italic_T | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT )</annotation></semantics></math>.</p> </div> </section> <section class="ltx_subsection" id="S2.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S2.SS2.4.1.1">II-B</span> </span><span class="ltx_text ltx_font_italic" id="S2.SS2.5.2">Framework Overview</span> </h3> <div class="ltx_para" id="S2.SS2.p1"> <p class="ltx_p" id="S2.SS2.p1.1">Fig. 
<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S1.F2" title="Figure 2 ‣ I Introduction ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">2</span></a> illustrates the overall process of the two-phase framework, which consists of two computation parts:</p> </div> <div class="ltx_para" id="S2.SS2.p2"> <p class="ltx_p" id="S2.SS2.p2.2"><span class="ltx_text ltx_font_italic" id="S2.SS2.p2.2.1">Offline</span>. Constructing the performance matrix is the core of the offline process: we fine-tune each model in <math alttext="M" class="ltx_Math" display="inline" id="S2.SS2.p2.1.m1.1"><semantics id="S2.SS2.p2.1.m1.1a"><mi id="S2.SS2.p2.1.m1.1.1" xref="S2.SS2.p2.1.m1.1.1.cmml">M</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p2.1.m1.1b"><ci id="S2.SS2.p2.1.m1.1.1.cmml" xref="S2.SS2.p2.1.m1.1.1">𝑀</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p2.1.m1.1c">M</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p2.1.m1.1d">italic_M</annotation></semantics></math> on the benchmark datasets and record the validation and test results throughout training. 
Subsequently, we execute model clustering based on this matrix, resulting in <math alttext="MC=\{C_{1},C_{2},...,C_{p}\}" class="ltx_Math" display="inline" id="S2.SS2.p2.2.m2.4"><semantics id="S2.SS2.p2.2.m2.4a"><mrow id="S2.SS2.p2.2.m2.4.4" xref="S2.SS2.p2.2.m2.4.4.cmml"><mrow id="S2.SS2.p2.2.m2.4.4.5" xref="S2.SS2.p2.2.m2.4.4.5.cmml"><mi id="S2.SS2.p2.2.m2.4.4.5.2" xref="S2.SS2.p2.2.m2.4.4.5.2.cmml">M</mi><mo id="S2.SS2.p2.2.m2.4.4.5.1" xref="S2.SS2.p2.2.m2.4.4.5.1.cmml">⁢</mo><mi id="S2.SS2.p2.2.m2.4.4.5.3" xref="S2.SS2.p2.2.m2.4.4.5.3.cmml">C</mi></mrow><mo id="S2.SS2.p2.2.m2.4.4.4" xref="S2.SS2.p2.2.m2.4.4.4.cmml">=</mo><mrow id="S2.SS2.p2.2.m2.4.4.3.3" xref="S2.SS2.p2.2.m2.4.4.3.4.cmml"><mo id="S2.SS2.p2.2.m2.4.4.3.3.4" stretchy="false" xref="S2.SS2.p2.2.m2.4.4.3.4.cmml">{</mo><msub id="S2.SS2.p2.2.m2.2.2.1.1.1" xref="S2.SS2.p2.2.m2.2.2.1.1.1.cmml"><mi id="S2.SS2.p2.2.m2.2.2.1.1.1.2" xref="S2.SS2.p2.2.m2.2.2.1.1.1.2.cmml">C</mi><mn id="S2.SS2.p2.2.m2.2.2.1.1.1.3" xref="S2.SS2.p2.2.m2.2.2.1.1.1.3.cmml">1</mn></msub><mo id="S2.SS2.p2.2.m2.4.4.3.3.5" xref="S2.SS2.p2.2.m2.4.4.3.4.cmml">,</mo><msub id="S2.SS2.p2.2.m2.3.3.2.2.2" xref="S2.SS2.p2.2.m2.3.3.2.2.2.cmml"><mi id="S2.SS2.p2.2.m2.3.3.2.2.2.2" xref="S2.SS2.p2.2.m2.3.3.2.2.2.2.cmml">C</mi><mn id="S2.SS2.p2.2.m2.3.3.2.2.2.3" xref="S2.SS2.p2.2.m2.3.3.2.2.2.3.cmml">2</mn></msub><mo id="S2.SS2.p2.2.m2.4.4.3.3.6" xref="S2.SS2.p2.2.m2.4.4.3.4.cmml">,</mo><mi id="S2.SS2.p2.2.m2.1.1" mathvariant="normal" xref="S2.SS2.p2.2.m2.1.1.cmml">…</mi><mo id="S2.SS2.p2.2.m2.4.4.3.3.7" xref="S2.SS2.p2.2.m2.4.4.3.4.cmml">,</mo><msub id="S2.SS2.p2.2.m2.4.4.3.3.3" xref="S2.SS2.p2.2.m2.4.4.3.3.3.cmml"><mi id="S2.SS2.p2.2.m2.4.4.3.3.3.2" xref="S2.SS2.p2.2.m2.4.4.3.3.3.2.cmml">C</mi><mi id="S2.SS2.p2.2.m2.4.4.3.3.3.3" xref="S2.SS2.p2.2.m2.4.4.3.3.3.3.cmml">p</mi></msub><mo id="S2.SS2.p2.2.m2.4.4.3.3.8" stretchy="false" xref="S2.SS2.p2.2.m2.4.4.3.4.cmml">}</mo></mrow></mrow><annotation-xml encoding="MathML-Content" 
id="S2.SS2.p2.2.m2.4b"><apply id="S2.SS2.p2.2.m2.4.4.cmml" xref="S2.SS2.p2.2.m2.4.4"><eq id="S2.SS2.p2.2.m2.4.4.4.cmml" xref="S2.SS2.p2.2.m2.4.4.4"></eq><apply id="S2.SS2.p2.2.m2.4.4.5.cmml" xref="S2.SS2.p2.2.m2.4.4.5"><times id="S2.SS2.p2.2.m2.4.4.5.1.cmml" xref="S2.SS2.p2.2.m2.4.4.5.1"></times><ci id="S2.SS2.p2.2.m2.4.4.5.2.cmml" xref="S2.SS2.p2.2.m2.4.4.5.2">𝑀</ci><ci id="S2.SS2.p2.2.m2.4.4.5.3.cmml" xref="S2.SS2.p2.2.m2.4.4.5.3">𝐶</ci></apply><set id="S2.SS2.p2.2.m2.4.4.3.4.cmml" xref="S2.SS2.p2.2.m2.4.4.3.3"><apply id="S2.SS2.p2.2.m2.2.2.1.1.1.cmml" xref="S2.SS2.p2.2.m2.2.2.1.1.1"><csymbol cd="ambiguous" id="S2.SS2.p2.2.m2.2.2.1.1.1.1.cmml" xref="S2.SS2.p2.2.m2.2.2.1.1.1">subscript</csymbol><ci id="S2.SS2.p2.2.m2.2.2.1.1.1.2.cmml" xref="S2.SS2.p2.2.m2.2.2.1.1.1.2">𝐶</ci><cn id="S2.SS2.p2.2.m2.2.2.1.1.1.3.cmml" type="integer" xref="S2.SS2.p2.2.m2.2.2.1.1.1.3">1</cn></apply><apply id="S2.SS2.p2.2.m2.3.3.2.2.2.cmml" xref="S2.SS2.p2.2.m2.3.3.2.2.2"><csymbol cd="ambiguous" id="S2.SS2.p2.2.m2.3.3.2.2.2.1.cmml" xref="S2.SS2.p2.2.m2.3.3.2.2.2">subscript</csymbol><ci id="S2.SS2.p2.2.m2.3.3.2.2.2.2.cmml" xref="S2.SS2.p2.2.m2.3.3.2.2.2.2">𝐶</ci><cn id="S2.SS2.p2.2.m2.3.3.2.2.2.3.cmml" type="integer" xref="S2.SS2.p2.2.m2.3.3.2.2.2.3">2</cn></apply><ci id="S2.SS2.p2.2.m2.1.1.cmml" xref="S2.SS2.p2.2.m2.1.1">…</ci><apply id="S2.SS2.p2.2.m2.4.4.3.3.3.cmml" xref="S2.SS2.p2.2.m2.4.4.3.3.3"><csymbol cd="ambiguous" id="S2.SS2.p2.2.m2.4.4.3.3.3.1.cmml" xref="S2.SS2.p2.2.m2.4.4.3.3.3">subscript</csymbol><ci id="S2.SS2.p2.2.m2.4.4.3.3.3.2.cmml" xref="S2.SS2.p2.2.m2.4.4.3.3.3.2">𝐶</ci><ci id="S2.SS2.p2.2.m2.4.4.3.3.3.3.cmml" xref="S2.SS2.p2.2.m2.4.4.3.3.3.3">𝑝</ci></apply></set></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p2.2.m2.4c">MC=\{C_{1},C_{2},...,C_{p}\}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p2.2.m2.4d">italic_M italic_C = { italic_C start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_C start_POSTSUBSCRIPT 2 
end_POSTSUBSCRIPT , … , italic_C start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT }</annotation></semantics></math>. Although this offline computation is time-consuming, it needs to be performed only once; the generated model clusters and intermediate results can then be reused directly in the online computation.</p> </div> <div class="ltx_para" id="S2.SS2.p3"> <p class="ltx_p" id="S2.SS2.p3.10"><span class="ltx_text ltx_font_italic" id="S2.SS2.p3.10.1">Online</span>. This part performs the two-phase model selection for a new task <math alttext="T" class="ltx_Math" display="inline" id="S2.SS2.p3.1.m1.1"><semantics id="S2.SS2.p3.1.m1.1a"><mi id="S2.SS2.p3.1.m1.1.1" xref="S2.SS2.p3.1.m1.1.1.cmml">T</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.1.m1.1b"><ci id="S2.SS2.p3.1.m1.1.1.cmml" xref="S2.SS2.p3.1.m1.1.1">𝑇</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.1.m1.1c">T</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.1.m1.1d">italic_T</annotation></semantics></math>. 
The coarse-recall phase, denoted as <math alttext="CR=coarse\_recall(M|MC,d(T))" class="ltx_Math" display="inline" id="S2.SS2.p3.2.m2.2"><semantics id="S2.SS2.p3.2.m2.2a"><mrow id="S2.SS2.p3.2.m2.2.2" xref="S2.SS2.p3.2.m2.2.2.cmml"><mrow id="S2.SS2.p3.2.m2.2.2.3" xref="S2.SS2.p3.2.m2.2.2.3.cmml"><mi id="S2.SS2.p3.2.m2.2.2.3.2" xref="S2.SS2.p3.2.m2.2.2.3.2.cmml">C</mi><mo id="S2.SS2.p3.2.m2.2.2.3.1" xref="S2.SS2.p3.2.m2.2.2.3.1.cmml">⁢</mo><mi id="S2.SS2.p3.2.m2.2.2.3.3" xref="S2.SS2.p3.2.m2.2.2.3.3.cmml">R</mi></mrow><mo id="S2.SS2.p3.2.m2.2.2.2" xref="S2.SS2.p3.2.m2.2.2.2.cmml">=</mo><mrow id="S2.SS2.p3.2.m2.2.2.1" xref="S2.SS2.p3.2.m2.2.2.1.cmml"><mi id="S2.SS2.p3.2.m2.2.2.1.3" xref="S2.SS2.p3.2.m2.2.2.1.3.cmml">c</mi><mo id="S2.SS2.p3.2.m2.2.2.1.2" xref="S2.SS2.p3.2.m2.2.2.1.2.cmml">⁢</mo><mi id="S2.SS2.p3.2.m2.2.2.1.4" xref="S2.SS2.p3.2.m2.2.2.1.4.cmml">o</mi><mo id="S2.SS2.p3.2.m2.2.2.1.2a" xref="S2.SS2.p3.2.m2.2.2.1.2.cmml">⁢</mo><mi id="S2.SS2.p3.2.m2.2.2.1.5" xref="S2.SS2.p3.2.m2.2.2.1.5.cmml">a</mi><mo id="S2.SS2.p3.2.m2.2.2.1.2b" xref="S2.SS2.p3.2.m2.2.2.1.2.cmml">⁢</mo><mi id="S2.SS2.p3.2.m2.2.2.1.6" xref="S2.SS2.p3.2.m2.2.2.1.6.cmml">r</mi><mo id="S2.SS2.p3.2.m2.2.2.1.2c" xref="S2.SS2.p3.2.m2.2.2.1.2.cmml">⁢</mo><mi id="S2.SS2.p3.2.m2.2.2.1.7" xref="S2.SS2.p3.2.m2.2.2.1.7.cmml">s</mi><mo id="S2.SS2.p3.2.m2.2.2.1.2d" xref="S2.SS2.p3.2.m2.2.2.1.2.cmml">⁢</mo><mi id="S2.SS2.p3.2.m2.2.2.1.8" xref="S2.SS2.p3.2.m2.2.2.1.8.cmml">e</mi><mo id="S2.SS2.p3.2.m2.2.2.1.2e" xref="S2.SS2.p3.2.m2.2.2.1.2.cmml">⁢</mo><mi id="S2.SS2.p3.2.m2.2.2.1.9" mathvariant="normal" xref="S2.SS2.p3.2.m2.2.2.1.9.cmml">_</mi><mo id="S2.SS2.p3.2.m2.2.2.1.2f" xref="S2.SS2.p3.2.m2.2.2.1.2.cmml">⁢</mo><mi id="S2.SS2.p3.2.m2.2.2.1.10" xref="S2.SS2.p3.2.m2.2.2.1.10.cmml">r</mi><mo id="S2.SS2.p3.2.m2.2.2.1.2g" xref="S2.SS2.p3.2.m2.2.2.1.2.cmml">⁢</mo><mi id="S2.SS2.p3.2.m2.2.2.1.11" xref="S2.SS2.p3.2.m2.2.2.1.11.cmml">e</mi><mo id="S2.SS2.p3.2.m2.2.2.1.2h" 
xref="S2.SS2.p3.2.m2.2.2.1.2.cmml">⁢</mo><mi id="S2.SS2.p3.2.m2.2.2.1.12" xref="S2.SS2.p3.2.m2.2.2.1.12.cmml">c</mi><mo id="S2.SS2.p3.2.m2.2.2.1.2i" xref="S2.SS2.p3.2.m2.2.2.1.2.cmml">⁢</mo><mi id="S2.SS2.p3.2.m2.2.2.1.13" xref="S2.SS2.p3.2.m2.2.2.1.13.cmml">a</mi><mo id="S2.SS2.p3.2.m2.2.2.1.2j" xref="S2.SS2.p3.2.m2.2.2.1.2.cmml">⁢</mo><mi id="S2.SS2.p3.2.m2.2.2.1.14" xref="S2.SS2.p3.2.m2.2.2.1.14.cmml">l</mi><mo id="S2.SS2.p3.2.m2.2.2.1.2k" xref="S2.SS2.p3.2.m2.2.2.1.2.cmml">⁢</mo><mi id="S2.SS2.p3.2.m2.2.2.1.15" xref="S2.SS2.p3.2.m2.2.2.1.15.cmml">l</mi><mo id="S2.SS2.p3.2.m2.2.2.1.2l" xref="S2.SS2.p3.2.m2.2.2.1.2.cmml">⁢</mo><mrow id="S2.SS2.p3.2.m2.2.2.1.1.1" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.cmml"><mo id="S2.SS2.p3.2.m2.2.2.1.1.1.2" stretchy="false" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.cmml">(</mo><mrow id="S2.SS2.p3.2.m2.2.2.1.1.1.1" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.cmml"><mi id="S2.SS2.p3.2.m2.2.2.1.1.1.1.4" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.4.cmml">M</mi><mo fence="false" id="S2.SS2.p3.2.m2.2.2.1.1.1.1.3" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.3.cmml">|</mo><mrow id="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.3.cmml"><mrow id="S2.SS2.p3.2.m2.2.2.1.1.1.1.1.1.1" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.1.1.1.cmml"><mi id="S2.SS2.p3.2.m2.2.2.1.1.1.1.1.1.1.2" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.1.1.1.2.cmml">M</mi><mo id="S2.SS2.p3.2.m2.2.2.1.1.1.1.1.1.1.1" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.1.1.1.1.cmml">⁢</mo><mi id="S2.SS2.p3.2.m2.2.2.1.1.1.1.1.1.1.3" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.1.1.1.3.cmml">C</mi></mrow><mo id="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.3" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.3.cmml">,</mo><mrow id="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.2" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.2.cmml"><mi id="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.2.2" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.2.2.cmml">d</mi><mo id="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.2.1" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.2.1.cmml">⁢</mo><mrow id="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.2.3.2" 
xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.2.cmml"><mo id="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.2.3.2.1" stretchy="false" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.2.cmml">(</mo><mi id="S2.SS2.p3.2.m2.1.1" xref="S2.SS2.p3.2.m2.1.1.cmml">T</mi><mo id="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.2.3.2.2" stretchy="false" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.2.cmml">)</mo></mrow></mrow></mrow></mrow><mo id="S2.SS2.p3.2.m2.2.2.1.1.1.3" stretchy="false" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.cmml">)</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.2.m2.2b"><apply id="S2.SS2.p3.2.m2.2.2.cmml" xref="S2.SS2.p3.2.m2.2.2"><eq id="S2.SS2.p3.2.m2.2.2.2.cmml" xref="S2.SS2.p3.2.m2.2.2.2"></eq><apply id="S2.SS2.p3.2.m2.2.2.3.cmml" xref="S2.SS2.p3.2.m2.2.2.3"><times id="S2.SS2.p3.2.m2.2.2.3.1.cmml" xref="S2.SS2.p3.2.m2.2.2.3.1"></times><ci id="S2.SS2.p3.2.m2.2.2.3.2.cmml" xref="S2.SS2.p3.2.m2.2.2.3.2">𝐶</ci><ci id="S2.SS2.p3.2.m2.2.2.3.3.cmml" xref="S2.SS2.p3.2.m2.2.2.3.3">𝑅</ci></apply><apply id="S2.SS2.p3.2.m2.2.2.1.cmml" xref="S2.SS2.p3.2.m2.2.2.1"><times id="S2.SS2.p3.2.m2.2.2.1.2.cmml" xref="S2.SS2.p3.2.m2.2.2.1.2"></times><ci id="S2.SS2.p3.2.m2.2.2.1.3.cmml" xref="S2.SS2.p3.2.m2.2.2.1.3">𝑐</ci><ci id="S2.SS2.p3.2.m2.2.2.1.4.cmml" xref="S2.SS2.p3.2.m2.2.2.1.4">𝑜</ci><ci id="S2.SS2.p3.2.m2.2.2.1.5.cmml" xref="S2.SS2.p3.2.m2.2.2.1.5">𝑎</ci><ci id="S2.SS2.p3.2.m2.2.2.1.6.cmml" xref="S2.SS2.p3.2.m2.2.2.1.6">𝑟</ci><ci id="S2.SS2.p3.2.m2.2.2.1.7.cmml" xref="S2.SS2.p3.2.m2.2.2.1.7">𝑠</ci><ci id="S2.SS2.p3.2.m2.2.2.1.8.cmml" xref="S2.SS2.p3.2.m2.2.2.1.8">𝑒</ci><ci id="S2.SS2.p3.2.m2.2.2.1.9.cmml" xref="S2.SS2.p3.2.m2.2.2.1.9">_</ci><ci id="S2.SS2.p3.2.m2.2.2.1.10.cmml" xref="S2.SS2.p3.2.m2.2.2.1.10">𝑟</ci><ci id="S2.SS2.p3.2.m2.2.2.1.11.cmml" xref="S2.SS2.p3.2.m2.2.2.1.11">𝑒</ci><ci id="S2.SS2.p3.2.m2.2.2.1.12.cmml" xref="S2.SS2.p3.2.m2.2.2.1.12">𝑐</ci><ci id="S2.SS2.p3.2.m2.2.2.1.13.cmml" xref="S2.SS2.p3.2.m2.2.2.1.13">𝑎</ci><ci id="S2.SS2.p3.2.m2.2.2.1.14.cmml" 
xref="S2.SS2.p3.2.m2.2.2.1.14">𝑙</ci><ci id="S2.SS2.p3.2.m2.2.2.1.15.cmml" xref="S2.SS2.p3.2.m2.2.2.1.15">𝑙</ci><apply id="S2.SS2.p3.2.m2.2.2.1.1.1.1.cmml" xref="S2.SS2.p3.2.m2.2.2.1.1.1"><csymbol cd="latexml" id="S2.SS2.p3.2.m2.2.2.1.1.1.1.3.cmml" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.3">conditional</csymbol><ci id="S2.SS2.p3.2.m2.2.2.1.1.1.1.4.cmml" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.4">𝑀</ci><list id="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.3.cmml" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2"><apply id="S2.SS2.p3.2.m2.2.2.1.1.1.1.1.1.1.cmml" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.1.1.1"><times id="S2.SS2.p3.2.m2.2.2.1.1.1.1.1.1.1.1.cmml" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.1.1.1.1"></times><ci id="S2.SS2.p3.2.m2.2.2.1.1.1.1.1.1.1.2.cmml" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.1.1.1.2">𝑀</ci><ci id="S2.SS2.p3.2.m2.2.2.1.1.1.1.1.1.1.3.cmml" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.1.1.1.3">𝐶</ci></apply><apply id="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.2.cmml" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.2"><times id="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.2.1.cmml" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.2.1"></times><ci id="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.2.2.cmml" xref="S2.SS2.p3.2.m2.2.2.1.1.1.1.2.2.2.2">𝑑</ci><ci id="S2.SS2.p3.2.m2.1.1.cmml" xref="S2.SS2.p3.2.m2.1.1">𝑇</ci></apply></list></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.2.m2.2c">CR=coarse\_recall(M|MC,d(T))</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.2.m2.2d">italic_C italic_R = italic_c italic_o italic_a italic_r italic_s italic_e _ italic_r italic_e italic_c italic_a italic_l italic_l ( italic_M | italic_M italic_C , italic_d ( italic_T ) )</annotation></semantics></math>, returns <math alttext="K" class="ltx_Math" display="inline" id="S2.SS2.p3.3.m3.1"><semantics id="S2.SS2.p3.3.m3.1a"><mi id="S2.SS2.p3.3.m3.1.1" xref="S2.SS2.p3.3.m3.1.1.cmml">K</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.3.m3.1b"><ci id="S2.SS2.p3.3.m3.1.1.cmml" 
xref="S2.SS2.p3.3.m3.1.1">𝐾</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.3.m3.1c">K</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.3.m3.1d">italic_K</annotation></semantics></math> candidate models from the model set <math alttext="M" class="ltx_Math" display="inline" id="S2.SS2.p3.4.m4.1"><semantics id="S2.SS2.p3.4.m4.1a"><mi id="S2.SS2.p3.4.m4.1.1" xref="S2.SS2.p3.4.m4.1.1.cmml">M</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.4.m4.1b"><ci id="S2.SS2.p3.4.m4.1.1.cmml" xref="S2.SS2.p3.4.m4.1.1">𝑀</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.4.m4.1c">M</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.4.m4.1d">italic_M</annotation></semantics></math> that are likely to achieve high training performance on <math alttext="d(T)" class="ltx_Math" display="inline" id="S2.SS2.p3.5.m5.1"><semantics id="S2.SS2.p3.5.m5.1a"><mrow id="S2.SS2.p3.5.m5.1.2" xref="S2.SS2.p3.5.m5.1.2.cmml"><mi id="S2.SS2.p3.5.m5.1.2.2" xref="S2.SS2.p3.5.m5.1.2.2.cmml">d</mi><mo id="S2.SS2.p3.5.m5.1.2.1" xref="S2.SS2.p3.5.m5.1.2.1.cmml">⁢</mo><mrow id="S2.SS2.p3.5.m5.1.2.3.2" xref="S2.SS2.p3.5.m5.1.2.cmml"><mo id="S2.SS2.p3.5.m5.1.2.3.2.1" stretchy="false" xref="S2.SS2.p3.5.m5.1.2.cmml">(</mo><mi id="S2.SS2.p3.5.m5.1.1" xref="S2.SS2.p3.5.m5.1.1.cmml">T</mi><mo id="S2.SS2.p3.5.m5.1.2.3.2.2" stretchy="false" xref="S2.SS2.p3.5.m5.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.5.m5.1b"><apply id="S2.SS2.p3.5.m5.1.2.cmml" xref="S2.SS2.p3.5.m5.1.2"><times id="S2.SS2.p3.5.m5.1.2.1.cmml" xref="S2.SS2.p3.5.m5.1.2.1"></times><ci id="S2.SS2.p3.5.m5.1.2.2.cmml" xref="S2.SS2.p3.5.m5.1.2.2">𝑑</ci><ci id="S2.SS2.p3.5.m5.1.1.cmml" xref="S2.SS2.p3.5.m5.1.1">𝑇</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.5.m5.1c">d(T)</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.5.m5.1d">italic_d ( italic_T 
)</annotation></semantics></math> by computing the proxy score for model clusters <math alttext="MC" class="ltx_Math" display="inline" id="S2.SS2.p3.6.m6.1"><semantics id="S2.SS2.p3.6.m6.1a"><mrow id="S2.SS2.p3.6.m6.1.1" xref="S2.SS2.p3.6.m6.1.1.cmml"><mi id="S2.SS2.p3.6.m6.1.1.2" xref="S2.SS2.p3.6.m6.1.1.2.cmml">M</mi><mo id="S2.SS2.p3.6.m6.1.1.1" xref="S2.SS2.p3.6.m6.1.1.1.cmml">⁢</mo><mi id="S2.SS2.p3.6.m6.1.1.3" xref="S2.SS2.p3.6.m6.1.1.3.cmml">C</mi></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.6.m6.1b"><apply id="S2.SS2.p3.6.m6.1.1.cmml" xref="S2.SS2.p3.6.m6.1.1"><times id="S2.SS2.p3.6.m6.1.1.1.cmml" xref="S2.SS2.p3.6.m6.1.1.1"></times><ci id="S2.SS2.p3.6.m6.1.1.2.cmml" xref="S2.SS2.p3.6.m6.1.1.2">𝑀</ci><ci id="S2.SS2.p3.6.m6.1.1.3.cmml" xref="S2.SS2.p3.6.m6.1.1.3">𝐶</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.6.m6.1c">MC</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.6.m6.1d">italic_M italic_C</annotation></semantics></math> on <math alttext="d(T)" class="ltx_Math" display="inline" id="S2.SS2.p3.7.m7.1"><semantics id="S2.SS2.p3.7.m7.1a"><mrow id="S2.SS2.p3.7.m7.1.2" xref="S2.SS2.p3.7.m7.1.2.cmml"><mi id="S2.SS2.p3.7.m7.1.2.2" xref="S2.SS2.p3.7.m7.1.2.2.cmml">d</mi><mo id="S2.SS2.p3.7.m7.1.2.1" xref="S2.SS2.p3.7.m7.1.2.1.cmml">⁢</mo><mrow id="S2.SS2.p3.7.m7.1.2.3.2" xref="S2.SS2.p3.7.m7.1.2.cmml"><mo id="S2.SS2.p3.7.m7.1.2.3.2.1" stretchy="false" xref="S2.SS2.p3.7.m7.1.2.cmml">(</mo><mi id="S2.SS2.p3.7.m7.1.1" xref="S2.SS2.p3.7.m7.1.1.cmml">T</mi><mo id="S2.SS2.p3.7.m7.1.2.3.2.2" stretchy="false" xref="S2.SS2.p3.7.m7.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.7.m7.1b"><apply id="S2.SS2.p3.7.m7.1.2.cmml" xref="S2.SS2.p3.7.m7.1.2"><times id="S2.SS2.p3.7.m7.1.2.1.cmml" xref="S2.SS2.p3.7.m7.1.2.1"></times><ci id="S2.SS2.p3.7.m7.1.2.2.cmml" xref="S2.SS2.p3.7.m7.1.2.2">𝑑</ci><ci id="S2.SS2.p3.7.m7.1.1.cmml" 
xref="S2.SS2.p3.7.m7.1.1">𝑇</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.7.m7.1c">d(T)</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.7.m7.1d">italic_d ( italic_T )</annotation></semantics></math>. Then, the fine-selection phase, denoted as <math alttext="fine\_selection(CR|CT,d(T))" class="ltx_Math" display="inline" id="S2.SS2.p3.8.m8.2"><semantics id="S2.SS2.p3.8.m8.2a"><mrow id="S2.SS2.p3.8.m8.2.2" xref="S2.SS2.p3.8.m8.2.2.cmml"><mi id="S2.SS2.p3.8.m8.2.2.3" xref="S2.SS2.p3.8.m8.2.2.3.cmml">f</mi><mo id="S2.SS2.p3.8.m8.2.2.2" xref="S2.SS2.p3.8.m8.2.2.2.cmml">⁢</mo><mi id="S2.SS2.p3.8.m8.2.2.4" xref="S2.SS2.p3.8.m8.2.2.4.cmml">i</mi><mo id="S2.SS2.p3.8.m8.2.2.2a" xref="S2.SS2.p3.8.m8.2.2.2.cmml">⁢</mo><mi id="S2.SS2.p3.8.m8.2.2.5" xref="S2.SS2.p3.8.m8.2.2.5.cmml">n</mi><mo id="S2.SS2.p3.8.m8.2.2.2b" xref="S2.SS2.p3.8.m8.2.2.2.cmml">⁢</mo><mi id="S2.SS2.p3.8.m8.2.2.6" xref="S2.SS2.p3.8.m8.2.2.6.cmml">e</mi><mo id="S2.SS2.p3.8.m8.2.2.2c" xref="S2.SS2.p3.8.m8.2.2.2.cmml">⁢</mo><mi id="S2.SS2.p3.8.m8.2.2.7" mathvariant="normal" xref="S2.SS2.p3.8.m8.2.2.7.cmml">_</mi><mo id="S2.SS2.p3.8.m8.2.2.2d" xref="S2.SS2.p3.8.m8.2.2.2.cmml">⁢</mo><mi id="S2.SS2.p3.8.m8.2.2.8" xref="S2.SS2.p3.8.m8.2.2.8.cmml">s</mi><mo id="S2.SS2.p3.8.m8.2.2.2e" xref="S2.SS2.p3.8.m8.2.2.2.cmml">⁢</mo><mi id="S2.SS2.p3.8.m8.2.2.9" xref="S2.SS2.p3.8.m8.2.2.9.cmml">e</mi><mo id="S2.SS2.p3.8.m8.2.2.2f" xref="S2.SS2.p3.8.m8.2.2.2.cmml">⁢</mo><mi id="S2.SS2.p3.8.m8.2.2.10" xref="S2.SS2.p3.8.m8.2.2.10.cmml">l</mi><mo id="S2.SS2.p3.8.m8.2.2.2g" xref="S2.SS2.p3.8.m8.2.2.2.cmml">⁢</mo><mi id="S2.SS2.p3.8.m8.2.2.11" xref="S2.SS2.p3.8.m8.2.2.11.cmml">e</mi><mo id="S2.SS2.p3.8.m8.2.2.2h" xref="S2.SS2.p3.8.m8.2.2.2.cmml">⁢</mo><mi id="S2.SS2.p3.8.m8.2.2.12" xref="S2.SS2.p3.8.m8.2.2.12.cmml">c</mi><mo id="S2.SS2.p3.8.m8.2.2.2i" xref="S2.SS2.p3.8.m8.2.2.2.cmml">⁢</mo><mi id="S2.SS2.p3.8.m8.2.2.13" xref="S2.SS2.p3.8.m8.2.2.13.cmml">t</mi><mo 
id="S2.SS2.p3.8.m8.2.2.2j" xref="S2.SS2.p3.8.m8.2.2.2.cmml">⁢</mo><mi id="S2.SS2.p3.8.m8.2.2.14" xref="S2.SS2.p3.8.m8.2.2.14.cmml">i</mi><mo id="S2.SS2.p3.8.m8.2.2.2k" xref="S2.SS2.p3.8.m8.2.2.2.cmml">⁢</mo><mi id="S2.SS2.p3.8.m8.2.2.15" xref="S2.SS2.p3.8.m8.2.2.15.cmml">o</mi><mo id="S2.SS2.p3.8.m8.2.2.2l" xref="S2.SS2.p3.8.m8.2.2.2.cmml">⁢</mo><mi id="S2.SS2.p3.8.m8.2.2.16" xref="S2.SS2.p3.8.m8.2.2.16.cmml">n</mi><mo id="S2.SS2.p3.8.m8.2.2.2m" xref="S2.SS2.p3.8.m8.2.2.2.cmml">⁢</mo><mrow id="S2.SS2.p3.8.m8.2.2.1.1" xref="S2.SS2.p3.8.m8.2.2.1.1.1.cmml"><mo id="S2.SS2.p3.8.m8.2.2.1.1.2" stretchy="false" xref="S2.SS2.p3.8.m8.2.2.1.1.1.cmml">(</mo><mrow id="S2.SS2.p3.8.m8.2.2.1.1.1" xref="S2.SS2.p3.8.m8.2.2.1.1.1.cmml"><mrow id="S2.SS2.p3.8.m8.2.2.1.1.1.4" xref="S2.SS2.p3.8.m8.2.2.1.1.1.4.cmml"><mi id="S2.SS2.p3.8.m8.2.2.1.1.1.4.2" xref="S2.SS2.p3.8.m8.2.2.1.1.1.4.2.cmml">C</mi><mo id="S2.SS2.p3.8.m8.2.2.1.1.1.4.1" xref="S2.SS2.p3.8.m8.2.2.1.1.1.4.1.cmml">⁢</mo><mi id="S2.SS2.p3.8.m8.2.2.1.1.1.4.3" xref="S2.SS2.p3.8.m8.2.2.1.1.1.4.3.cmml">R</mi></mrow><mo fence="false" id="S2.SS2.p3.8.m8.2.2.1.1.1.3" xref="S2.SS2.p3.8.m8.2.2.1.1.1.3.cmml">|</mo><mrow id="S2.SS2.p3.8.m8.2.2.1.1.1.2.2" xref="S2.SS2.p3.8.m8.2.2.1.1.1.2.3.cmml"><mrow id="S2.SS2.p3.8.m8.2.2.1.1.1.1.1.1" xref="S2.SS2.p3.8.m8.2.2.1.1.1.1.1.1.cmml"><mi id="S2.SS2.p3.8.m8.2.2.1.1.1.1.1.1.2" xref="S2.SS2.p3.8.m8.2.2.1.1.1.1.1.1.2.cmml">C</mi><mo id="S2.SS2.p3.8.m8.2.2.1.1.1.1.1.1.1" xref="S2.SS2.p3.8.m8.2.2.1.1.1.1.1.1.1.cmml">⁢</mo><mi id="S2.SS2.p3.8.m8.2.2.1.1.1.1.1.1.3" xref="S2.SS2.p3.8.m8.2.2.1.1.1.1.1.1.3.cmml">T</mi></mrow><mo id="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.3" xref="S2.SS2.p3.8.m8.2.2.1.1.1.2.3.cmml">,</mo><mrow id="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.2" xref="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.2.cmml"><mi id="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.2.2" xref="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.2.2.cmml">d</mi><mo id="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.2.1" xref="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.2.1.cmml">⁢</mo><mrow 
id="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.2.3.2" xref="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.2.cmml"><mo id="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.2.3.2.1" stretchy="false" xref="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.2.cmml">(</mo><mi id="S2.SS2.p3.8.m8.1.1" xref="S2.SS2.p3.8.m8.1.1.cmml">T</mi><mo id="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.2.3.2.2" stretchy="false" xref="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.2.cmml">)</mo></mrow></mrow></mrow></mrow><mo id="S2.SS2.p3.8.m8.2.2.1.1.3" stretchy="false" xref="S2.SS2.p3.8.m8.2.2.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.8.m8.2b"><apply id="S2.SS2.p3.8.m8.2.2.cmml" xref="S2.SS2.p3.8.m8.2.2"><times id="S2.SS2.p3.8.m8.2.2.2.cmml" xref="S2.SS2.p3.8.m8.2.2.2"></times><ci id="S2.SS2.p3.8.m8.2.2.3.cmml" xref="S2.SS2.p3.8.m8.2.2.3">𝑓</ci><ci id="S2.SS2.p3.8.m8.2.2.4.cmml" xref="S2.SS2.p3.8.m8.2.2.4">𝑖</ci><ci id="S2.SS2.p3.8.m8.2.2.5.cmml" xref="S2.SS2.p3.8.m8.2.2.5">𝑛</ci><ci id="S2.SS2.p3.8.m8.2.2.6.cmml" xref="S2.SS2.p3.8.m8.2.2.6">𝑒</ci><ci id="S2.SS2.p3.8.m8.2.2.7.cmml" xref="S2.SS2.p3.8.m8.2.2.7">_</ci><ci id="S2.SS2.p3.8.m8.2.2.8.cmml" xref="S2.SS2.p3.8.m8.2.2.8">𝑠</ci><ci id="S2.SS2.p3.8.m8.2.2.9.cmml" xref="S2.SS2.p3.8.m8.2.2.9">𝑒</ci><ci id="S2.SS2.p3.8.m8.2.2.10.cmml" xref="S2.SS2.p3.8.m8.2.2.10">𝑙</ci><ci id="S2.SS2.p3.8.m8.2.2.11.cmml" xref="S2.SS2.p3.8.m8.2.2.11">𝑒</ci><ci id="S2.SS2.p3.8.m8.2.2.12.cmml" xref="S2.SS2.p3.8.m8.2.2.12">𝑐</ci><ci id="S2.SS2.p3.8.m8.2.2.13.cmml" xref="S2.SS2.p3.8.m8.2.2.13">𝑡</ci><ci id="S2.SS2.p3.8.m8.2.2.14.cmml" xref="S2.SS2.p3.8.m8.2.2.14">𝑖</ci><ci id="S2.SS2.p3.8.m8.2.2.15.cmml" xref="S2.SS2.p3.8.m8.2.2.15">𝑜</ci><ci id="S2.SS2.p3.8.m8.2.2.16.cmml" xref="S2.SS2.p3.8.m8.2.2.16">𝑛</ci><apply id="S2.SS2.p3.8.m8.2.2.1.1.1.cmml" xref="S2.SS2.p3.8.m8.2.2.1.1"><csymbol cd="latexml" id="S2.SS2.p3.8.m8.2.2.1.1.1.3.cmml" xref="S2.SS2.p3.8.m8.2.2.1.1.1.3">conditional</csymbol><apply id="S2.SS2.p3.8.m8.2.2.1.1.1.4.cmml" xref="S2.SS2.p3.8.m8.2.2.1.1.1.4"><times id="S2.SS2.p3.8.m8.2.2.1.1.1.4.1.cmml" 
xref="S2.SS2.p3.8.m8.2.2.1.1.1.4.1"></times><ci id="S2.SS2.p3.8.m8.2.2.1.1.1.4.2.cmml" xref="S2.SS2.p3.8.m8.2.2.1.1.1.4.2">𝐶</ci><ci id="S2.SS2.p3.8.m8.2.2.1.1.1.4.3.cmml" xref="S2.SS2.p3.8.m8.2.2.1.1.1.4.3">𝑅</ci></apply><list id="S2.SS2.p3.8.m8.2.2.1.1.1.2.3.cmml" xref="S2.SS2.p3.8.m8.2.2.1.1.1.2.2"><apply id="S2.SS2.p3.8.m8.2.2.1.1.1.1.1.1.cmml" xref="S2.SS2.p3.8.m8.2.2.1.1.1.1.1.1"><times id="S2.SS2.p3.8.m8.2.2.1.1.1.1.1.1.1.cmml" xref="S2.SS2.p3.8.m8.2.2.1.1.1.1.1.1.1"></times><ci id="S2.SS2.p3.8.m8.2.2.1.1.1.1.1.1.2.cmml" xref="S2.SS2.p3.8.m8.2.2.1.1.1.1.1.1.2">𝐶</ci><ci id="S2.SS2.p3.8.m8.2.2.1.1.1.1.1.1.3.cmml" xref="S2.SS2.p3.8.m8.2.2.1.1.1.1.1.1.3">𝑇</ci></apply><apply id="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.2.cmml" xref="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.2"><times id="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.2.1.cmml" xref="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.2.1"></times><ci id="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.2.2.cmml" xref="S2.SS2.p3.8.m8.2.2.1.1.1.2.2.2.2">𝑑</ci><ci id="S2.SS2.p3.8.m8.1.1.cmml" xref="S2.SS2.p3.8.m8.1.1">𝑇</ci></apply></list></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.8.m8.2c">fine\_selection(CR|CT,d(T))</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.8.m8.2d">italic_f italic_i italic_n italic_e _ italic_s italic_e italic_l italic_e italic_c italic_t italic_i italic_o italic_n ( italic_C italic_R | italic_C italic_T , italic_d ( italic_T ) )</annotation></semantics></math>, returns the final selected model from the recalled models <math alttext="CR" class="ltx_Math" display="inline" id="S2.SS2.p3.9.m9.1"><semantics id="S2.SS2.p3.9.m9.1a"><mrow id="S2.SS2.p3.9.m9.1.1" xref="S2.SS2.p3.9.m9.1.1.cmml"><mi id="S2.SS2.p3.9.m9.1.1.2" xref="S2.SS2.p3.9.m9.1.1.2.cmml">C</mi><mo id="S2.SS2.p3.9.m9.1.1.1" xref="S2.SS2.p3.9.m9.1.1.1.cmml">⁢</mo><mi id="S2.SS2.p3.9.m9.1.1.3" xref="S2.SS2.p3.9.m9.1.1.3.cmml">R</mi></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.9.m9.1b"><apply 
id="S2.SS2.p3.9.m9.1.1.cmml" xref="S2.SS2.p3.9.m9.1.1"><times id="S2.SS2.p3.9.m9.1.1.1.cmml" xref="S2.SS2.p3.9.m9.1.1.1"></times><ci id="S2.SS2.p3.9.m9.1.1.2.cmml" xref="S2.SS2.p3.9.m9.1.1.2">𝐶</ci><ci id="S2.SS2.p3.9.m9.1.1.3.cmml" xref="S2.SS2.p3.9.m9.1.1.3">𝑅</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.9.m9.1c">CR</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.9.m9.1d">italic_C italic_R</annotation></semantics></math> with convergence trend <math alttext="CT" class="ltx_Math" display="inline" id="S2.SS2.p3.10.m10.1"><semantics id="S2.SS2.p3.10.m10.1a"><mrow id="S2.SS2.p3.10.m10.1.1" xref="S2.SS2.p3.10.m10.1.1.cmml"><mi id="S2.SS2.p3.10.m10.1.1.2" xref="S2.SS2.p3.10.m10.1.1.2.cmml">C</mi><mo id="S2.SS2.p3.10.m10.1.1.1" xref="S2.SS2.p3.10.m10.1.1.1.cmml">⁢</mo><mi id="S2.SS2.p3.10.m10.1.1.3" xref="S2.SS2.p3.10.m10.1.1.3.cmml">T</mi></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.10.m10.1b"><apply id="S2.SS2.p3.10.m10.1.1.cmml" xref="S2.SS2.p3.10.m10.1.1"><times id="S2.SS2.p3.10.m10.1.1.1.cmml" xref="S2.SS2.p3.10.m10.1.1.1"></times><ci id="S2.SS2.p3.10.m10.1.1.2.cmml" xref="S2.SS2.p3.10.m10.1.1.2">𝐶</ci><ci id="S2.SS2.p3.10.m10.1.1.3.cmml" xref="S2.SS2.p3.10.m10.1.1.3">𝑇</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.10.m10.1c">CT</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.10.m10.1d">italic_C italic_T</annotation></semantics></math>.</p> </div> </section> </section> <section class="ltx_section" id="S3"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">III </span><span class="ltx_text ltx_font_smallcaps" id="S3.1.1">Coarse Recall</span> </h2> <div class="ltx_para" id="S3.p1"> <p class="ltx_p" id="S3.p1.1">The coarse-recall phase aims to efficiently identify a much smaller set of candidate models that tend to achieve good results on the target task. Fig. 
<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S1.F2" title="Figure 2 ‣ I Introduction ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">2</span></a>(c) illustrates the overall steps of the coarse-recall phase. We first present the model clustering process and then introduce the proxy score computation for model clusters on the target dataset, which is used to return the recalled models.</p> </div> <section class="ltx_subsection" id="S3.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S3.SS1.4.1.1">III-A</span> </span><span class="ltx_text ltx_font_italic" id="S3.SS1.5.2">Model Clustering</span> </h3> <div class="ltx_para" id="S3.SS1.p1"> <p class="ltx_p" id="S3.SS1.p1.3">The computation of <math alttext="proxy\_score(T|m_{j})" class="ltx_Math" display="inline" id="S3.SS1.p1.1.m1.1"><semantics id="S3.SS1.p1.1.m1.1a"><mrow id="S3.SS1.p1.1.m1.1.1" xref="S3.SS1.p1.1.m1.1.1.cmml"><mi id="S3.SS1.p1.1.m1.1.1.3" xref="S3.SS1.p1.1.m1.1.1.3.cmml">p</mi><mo id="S3.SS1.p1.1.m1.1.1.2" xref="S3.SS1.p1.1.m1.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p1.1.m1.1.1.4" xref="S3.SS1.p1.1.m1.1.1.4.cmml">r</mi><mo id="S3.SS1.p1.1.m1.1.1.2a" xref="S3.SS1.p1.1.m1.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p1.1.m1.1.1.5" xref="S3.SS1.p1.1.m1.1.1.5.cmml">o</mi><mo id="S3.SS1.p1.1.m1.1.1.2b" xref="S3.SS1.p1.1.m1.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p1.1.m1.1.1.6" xref="S3.SS1.p1.1.m1.1.1.6.cmml">x</mi><mo id="S3.SS1.p1.1.m1.1.1.2c" xref="S3.SS1.p1.1.m1.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p1.1.m1.1.1.7" xref="S3.SS1.p1.1.m1.1.1.7.cmml">y</mi><mo id="S3.SS1.p1.1.m1.1.1.2d" xref="S3.SS1.p1.1.m1.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p1.1.m1.1.1.8" mathvariant="normal" xref="S3.SS1.p1.1.m1.1.1.8.cmml">_</mi><mo id="S3.SS1.p1.1.m1.1.1.2e" xref="S3.SS1.p1.1.m1.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p1.1.m1.1.1.9" xref="S3.SS1.p1.1.m1.1.1.9.cmml">s</mi><mo id="S3.SS1.p1.1.m1.1.1.2f" 
xref="S3.SS1.p1.1.m1.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p1.1.m1.1.1.10" xref="S3.SS1.p1.1.m1.1.1.10.cmml">c</mi><mo id="S3.SS1.p1.1.m1.1.1.2g" xref="S3.SS1.p1.1.m1.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p1.1.m1.1.1.11" xref="S3.SS1.p1.1.m1.1.1.11.cmml">o</mi><mo id="S3.SS1.p1.1.m1.1.1.2h" xref="S3.SS1.p1.1.m1.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p1.1.m1.1.1.12" xref="S3.SS1.p1.1.m1.1.1.12.cmml">r</mi><mo id="S3.SS1.p1.1.m1.1.1.2i" xref="S3.SS1.p1.1.m1.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p1.1.m1.1.1.13" xref="S3.SS1.p1.1.m1.1.1.13.cmml">e</mi><mo id="S3.SS1.p1.1.m1.1.1.2j" xref="S3.SS1.p1.1.m1.1.1.2.cmml">⁢</mo><mrow id="S3.SS1.p1.1.m1.1.1.1.1" xref="S3.SS1.p1.1.m1.1.1.1.1.1.cmml"><mo id="S3.SS1.p1.1.m1.1.1.1.1.2" stretchy="false" xref="S3.SS1.p1.1.m1.1.1.1.1.1.cmml">(</mo><mrow id="S3.SS1.p1.1.m1.1.1.1.1.1" xref="S3.SS1.p1.1.m1.1.1.1.1.1.cmml"><mi id="S3.SS1.p1.1.m1.1.1.1.1.1.2" xref="S3.SS1.p1.1.m1.1.1.1.1.1.2.cmml">T</mi><mo fence="false" id="S3.SS1.p1.1.m1.1.1.1.1.1.1" xref="S3.SS1.p1.1.m1.1.1.1.1.1.1.cmml">|</mo><msub id="S3.SS1.p1.1.m1.1.1.1.1.1.3" xref="S3.SS1.p1.1.m1.1.1.1.1.1.3.cmml"><mi id="S3.SS1.p1.1.m1.1.1.1.1.1.3.2" xref="S3.SS1.p1.1.m1.1.1.1.1.1.3.2.cmml">m</mi><mi id="S3.SS1.p1.1.m1.1.1.1.1.1.3.3" xref="S3.SS1.p1.1.m1.1.1.1.1.1.3.3.cmml">j</mi></msub></mrow><mo id="S3.SS1.p1.1.m1.1.1.1.1.3" stretchy="false" xref="S3.SS1.p1.1.m1.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.1.m1.1b"><apply id="S3.SS1.p1.1.m1.1.1.cmml" xref="S3.SS1.p1.1.m1.1.1"><times id="S3.SS1.p1.1.m1.1.1.2.cmml" xref="S3.SS1.p1.1.m1.1.1.2"></times><ci id="S3.SS1.p1.1.m1.1.1.3.cmml" xref="S3.SS1.p1.1.m1.1.1.3">𝑝</ci><ci id="S3.SS1.p1.1.m1.1.1.4.cmml" xref="S3.SS1.p1.1.m1.1.1.4">𝑟</ci><ci id="S3.SS1.p1.1.m1.1.1.5.cmml" xref="S3.SS1.p1.1.m1.1.1.5">𝑜</ci><ci id="S3.SS1.p1.1.m1.1.1.6.cmml" xref="S3.SS1.p1.1.m1.1.1.6">𝑥</ci><ci id="S3.SS1.p1.1.m1.1.1.7.cmml" xref="S3.SS1.p1.1.m1.1.1.7">𝑦</ci><ci id="S3.SS1.p1.1.m1.1.1.8.cmml" 
xref="S3.SS1.p1.1.m1.1.1.8">_</ci><ci id="S3.SS1.p1.1.m1.1.1.9.cmml" xref="S3.SS1.p1.1.m1.1.1.9">𝑠</ci><ci id="S3.SS1.p1.1.m1.1.1.10.cmml" xref="S3.SS1.p1.1.m1.1.1.10">𝑐</ci><ci id="S3.SS1.p1.1.m1.1.1.11.cmml" xref="S3.SS1.p1.1.m1.1.1.11">𝑜</ci><ci id="S3.SS1.p1.1.m1.1.1.12.cmml" xref="S3.SS1.p1.1.m1.1.1.12">𝑟</ci><ci id="S3.SS1.p1.1.m1.1.1.13.cmml" xref="S3.SS1.p1.1.m1.1.1.13">𝑒</ci><apply id="S3.SS1.p1.1.m1.1.1.1.1.1.cmml" xref="S3.SS1.p1.1.m1.1.1.1.1"><csymbol cd="latexml" id="S3.SS1.p1.1.m1.1.1.1.1.1.1.cmml" xref="S3.SS1.p1.1.m1.1.1.1.1.1.1">conditional</csymbol><ci id="S3.SS1.p1.1.m1.1.1.1.1.1.2.cmml" xref="S3.SS1.p1.1.m1.1.1.1.1.1.2">𝑇</ci><apply id="S3.SS1.p1.1.m1.1.1.1.1.1.3.cmml" xref="S3.SS1.p1.1.m1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S3.SS1.p1.1.m1.1.1.1.1.1.3.1.cmml" xref="S3.SS1.p1.1.m1.1.1.1.1.1.3">subscript</csymbol><ci id="S3.SS1.p1.1.m1.1.1.1.1.1.3.2.cmml" xref="S3.SS1.p1.1.m1.1.1.1.1.1.3.2">𝑚</ci><ci id="S3.SS1.p1.1.m1.1.1.1.1.1.3.3.cmml" xref="S3.SS1.p1.1.m1.1.1.1.1.1.3.3">𝑗</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.1.m1.1c">proxy\_score(T|m_{j})</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.1.m1.1d">italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e ( italic_T | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT )</annotation></semantics></math> requires loading the model into memory and running inference on the target dataset. This load-and-inference step may take dozens of seconds for a pre-trained model with millions of parameters and a target dataset with hundreds of data items, and is therefore still time-consuming. To accelerate the coarse-recall phase, a natural approach is to group pre-trained models into clusters according to their similarity, so that the proxy score only needs to be computed for the representative model of each cluster. 
This reduces the time complexity of the coarse-recall phase from <math alttext="O(|M|)" class="ltx_Math" display="inline" id="S3.SS1.p1.2.m2.2"><semantics id="S3.SS1.p1.2.m2.2a"><mrow id="S3.SS1.p1.2.m2.2.2" xref="S3.SS1.p1.2.m2.2.2.cmml"><mi id="S3.SS1.p1.2.m2.2.2.3" xref="S3.SS1.p1.2.m2.2.2.3.cmml">O</mi><mo id="S3.SS1.p1.2.m2.2.2.2" xref="S3.SS1.p1.2.m2.2.2.2.cmml">⁢</mo><mrow id="S3.SS1.p1.2.m2.2.2.1.1" xref="S3.SS1.p1.2.m2.2.2.cmml"><mo id="S3.SS1.p1.2.m2.2.2.1.1.2" stretchy="false" xref="S3.SS1.p1.2.m2.2.2.cmml">(</mo><mrow id="S3.SS1.p1.2.m2.2.2.1.1.1.2" xref="S3.SS1.p1.2.m2.2.2.1.1.1.1.cmml"><mo id="S3.SS1.p1.2.m2.2.2.1.1.1.2.1" stretchy="false" xref="S3.SS1.p1.2.m2.2.2.1.1.1.1.1.cmml">|</mo><mi id="S3.SS1.p1.2.m2.1.1" xref="S3.SS1.p1.2.m2.1.1.cmml">M</mi><mo id="S3.SS1.p1.2.m2.2.2.1.1.1.2.2" stretchy="false" xref="S3.SS1.p1.2.m2.2.2.1.1.1.1.1.cmml">|</mo></mrow><mo id="S3.SS1.p1.2.m2.2.2.1.1.3" stretchy="false" xref="S3.SS1.p1.2.m2.2.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.2.m2.2b"><apply id="S3.SS1.p1.2.m2.2.2.cmml" xref="S3.SS1.p1.2.m2.2.2"><times id="S3.SS1.p1.2.m2.2.2.2.cmml" xref="S3.SS1.p1.2.m2.2.2.2"></times><ci id="S3.SS1.p1.2.m2.2.2.3.cmml" xref="S3.SS1.p1.2.m2.2.2.3">𝑂</ci><apply id="S3.SS1.p1.2.m2.2.2.1.1.1.1.cmml" xref="S3.SS1.p1.2.m2.2.2.1.1.1.2"><abs id="S3.SS1.p1.2.m2.2.2.1.1.1.1.1.cmml" xref="S3.SS1.p1.2.m2.2.2.1.1.1.2.1"></abs><ci id="S3.SS1.p1.2.m2.1.1.cmml" xref="S3.SS1.p1.2.m2.1.1">𝑀</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.2.m2.2c">O(|M|)</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.2.m2.2d">italic_O ( | italic_M | )</annotation></semantics></math> to <math alttext="O(|MC|)" class="ltx_Math" display="inline" id="S3.SS1.p1.3.m3.1"><semantics id="S3.SS1.p1.3.m3.1a"><mrow id="S3.SS1.p1.3.m3.1.1" xref="S3.SS1.p1.3.m3.1.1.cmml"><mi id="S3.SS1.p1.3.m3.1.1.3" xref="S3.SS1.p1.3.m3.1.1.3.cmml">O</mi><mo id="S3.SS1.p1.3.m3.1.1.2" 
xref="S3.SS1.p1.3.m3.1.1.2.cmml">⁢</mo><mrow id="S3.SS1.p1.3.m3.1.1.1.1" xref="S3.SS1.p1.3.m3.1.1.cmml"><mo id="S3.SS1.p1.3.m3.1.1.1.1.2" stretchy="false" xref="S3.SS1.p1.3.m3.1.1.cmml">(</mo><mrow id="S3.SS1.p1.3.m3.1.1.1.1.1.1" xref="S3.SS1.p1.3.m3.1.1.1.1.1.2.cmml"><mo id="S3.SS1.p1.3.m3.1.1.1.1.1.1.2" stretchy="false" xref="S3.SS1.p1.3.m3.1.1.1.1.1.2.1.cmml">|</mo><mrow id="S3.SS1.p1.3.m3.1.1.1.1.1.1.1" xref="S3.SS1.p1.3.m3.1.1.1.1.1.1.1.cmml"><mi id="S3.SS1.p1.3.m3.1.1.1.1.1.1.1.2" xref="S3.SS1.p1.3.m3.1.1.1.1.1.1.1.2.cmml">M</mi><mo id="S3.SS1.p1.3.m3.1.1.1.1.1.1.1.1" xref="S3.SS1.p1.3.m3.1.1.1.1.1.1.1.1.cmml">⁢</mo><mi id="S3.SS1.p1.3.m3.1.1.1.1.1.1.1.3" xref="S3.SS1.p1.3.m3.1.1.1.1.1.1.1.3.cmml">C</mi></mrow><mo id="S3.SS1.p1.3.m3.1.1.1.1.1.1.3" stretchy="false" xref="S3.SS1.p1.3.m3.1.1.1.1.1.2.1.cmml">|</mo></mrow><mo id="S3.SS1.p1.3.m3.1.1.1.1.3" stretchy="false" xref="S3.SS1.p1.3.m3.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.3.m3.1b"><apply id="S3.SS1.p1.3.m3.1.1.cmml" xref="S3.SS1.p1.3.m3.1.1"><times id="S3.SS1.p1.3.m3.1.1.2.cmml" xref="S3.SS1.p1.3.m3.1.1.2"></times><ci id="S3.SS1.p1.3.m3.1.1.3.cmml" xref="S3.SS1.p1.3.m3.1.1.3">𝑂</ci><apply id="S3.SS1.p1.3.m3.1.1.1.1.1.2.cmml" xref="S3.SS1.p1.3.m3.1.1.1.1.1.1"><abs id="S3.SS1.p1.3.m3.1.1.1.1.1.2.1.cmml" xref="S3.SS1.p1.3.m3.1.1.1.1.1.1.2"></abs><apply id="S3.SS1.p1.3.m3.1.1.1.1.1.1.1.cmml" xref="S3.SS1.p1.3.m3.1.1.1.1.1.1.1"><times id="S3.SS1.p1.3.m3.1.1.1.1.1.1.1.1.cmml" xref="S3.SS1.p1.3.m3.1.1.1.1.1.1.1.1"></times><ci id="S3.SS1.p1.3.m3.1.1.1.1.1.1.1.2.cmml" xref="S3.SS1.p1.3.m3.1.1.1.1.1.1.1.2">𝑀</ci><ci id="S3.SS1.p1.3.m3.1.1.1.1.1.1.1.3.cmml" xref="S3.SS1.p1.3.m3.1.1.1.1.1.1.1.3">𝐶</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.3.m3.1c">O(|MC|)</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.3.m3.1d">italic_O ( | italic_M italic_C | )</annotation></semantics></math>.</p> </div> 
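<div class="ltx_para"> <p class="ltx_p">The saving described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function <span class="ltx_text ltx_font_typewriter">score_clusters</span>, the dictionary layout of <span class="ltx_text ltx_font_typewriter">clusters</span>, the model names, and the toy scoring function are all hypothetical, and the toy function merely stands in for the expensive load-and-inference step. The sketch only shows that one proxy evaluation is issued per cluster; how recalled models are derived from the cluster scores is not covered here.</p> </div>

```python
from typing import Callable, Dict, List


def score_clusters(
    clusters: Dict[str, List[str]],
    proxy_score: Callable[[str], float],
) -> Dict[str, float]:
    """Return one proxy score per cluster, computed on its representative only.

    `clusters` maps a representative model name to the names of all models
    in that cluster (hypothetical layout, chosen for illustration).
    """
    return {rep: proxy_score(rep) for rep in clusters}


# Toy stand-in for the expensive load-and-inference step; it records each call.
calls: List[str] = []


def toy_proxy_score(model_name: str) -> float:
    calls.append(model_name)
    return float(len(model_name))  # placeholder value, not a real proxy score


# Hypothetical clustering: two clusters covering five models in total.
clusters = {
    "bert-a": ["bert-a", "bert-b", "bert-c"],
    "vit-a": ["vit-a", "vit-b"],
}

cluster_scores = score_clusters(clusters, toy_proxy_score)
num_models = sum(len(members) for members in clusters.values())
print(f"{len(calls)} proxy evaluations for {num_models} models")
```

<div class="ltx_para"> <p class="ltx_p">With two clusters covering five models, the sketch issues two proxy evaluations instead of five, which is the reduction in the number of evaluations from the number of models to the number of clusters.</p> </div>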
<div class="ltx_para" id="S3.SS1.p2"> <p class="ltx_p" id="S3.SS1.p2.1">Model similarity measures the extent to which two pre-trained models tend to have similar training performance on a target dataset. Training performance can depend on various factors, such as training data domain and quality, model architecture, and parameter size. As these factors are heterogeneous and may not always be available, it may be infeasible to combine them explicitly to compute model similarity. In our work, we propose to measure the similarity between models in a data-driven way, motivated by the phenomenon that models with similar training performance on benchmark datasets also tend to have similar training performance on a new task.</p> </div> <div class="ltx_para" id="S3.SS1.p3"> <p class="ltx_p" id="S3.SS1.p3.12">Fig. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S1.F2" title="Figure 2 ‣ I Introduction ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">2</span></a> (a) and (b) illustrate the model clustering process. 
Based on the performance matrix <math alttext="Matrix(D,M)" class="ltx_Math" display="inline" id="S3.SS1.p3.1.m1.2"><semantics id="S3.SS1.p3.1.m1.2a"><mrow id="S3.SS1.p3.1.m1.2.3" xref="S3.SS1.p3.1.m1.2.3.cmml"><mi id="S3.SS1.p3.1.m1.2.3.2" xref="S3.SS1.p3.1.m1.2.3.2.cmml">M</mi><mo id="S3.SS1.p3.1.m1.2.3.1" xref="S3.SS1.p3.1.m1.2.3.1.cmml">⁢</mo><mi id="S3.SS1.p3.1.m1.2.3.3" xref="S3.SS1.p3.1.m1.2.3.3.cmml">a</mi><mo id="S3.SS1.p3.1.m1.2.3.1a" xref="S3.SS1.p3.1.m1.2.3.1.cmml">⁢</mo><mi id="S3.SS1.p3.1.m1.2.3.4" xref="S3.SS1.p3.1.m1.2.3.4.cmml">t</mi><mo id="S3.SS1.p3.1.m1.2.3.1b" xref="S3.SS1.p3.1.m1.2.3.1.cmml">⁢</mo><mi id="S3.SS1.p3.1.m1.2.3.5" xref="S3.SS1.p3.1.m1.2.3.5.cmml">r</mi><mo id="S3.SS1.p3.1.m1.2.3.1c" xref="S3.SS1.p3.1.m1.2.3.1.cmml">⁢</mo><mi id="S3.SS1.p3.1.m1.2.3.6" xref="S3.SS1.p3.1.m1.2.3.6.cmml">i</mi><mo id="S3.SS1.p3.1.m1.2.3.1d" xref="S3.SS1.p3.1.m1.2.3.1.cmml">⁢</mo><mi id="S3.SS1.p3.1.m1.2.3.7" xref="S3.SS1.p3.1.m1.2.3.7.cmml">x</mi><mo id="S3.SS1.p3.1.m1.2.3.1e" xref="S3.SS1.p3.1.m1.2.3.1.cmml">⁢</mo><mrow id="S3.SS1.p3.1.m1.2.3.8.2" xref="S3.SS1.p3.1.m1.2.3.8.1.cmml"><mo id="S3.SS1.p3.1.m1.2.3.8.2.1" stretchy="false" xref="S3.SS1.p3.1.m1.2.3.8.1.cmml">(</mo><mi id="S3.SS1.p3.1.m1.1.1" xref="S3.SS1.p3.1.m1.1.1.cmml">D</mi><mo id="S3.SS1.p3.1.m1.2.3.8.2.2" xref="S3.SS1.p3.1.m1.2.3.8.1.cmml">,</mo><mi id="S3.SS1.p3.1.m1.2.2" xref="S3.SS1.p3.1.m1.2.2.cmml">M</mi><mo id="S3.SS1.p3.1.m1.2.3.8.2.3" stretchy="false" xref="S3.SS1.p3.1.m1.2.3.8.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p3.1.m1.2b"><apply id="S3.SS1.p3.1.m1.2.3.cmml" xref="S3.SS1.p3.1.m1.2.3"><times id="S3.SS1.p3.1.m1.2.3.1.cmml" xref="S3.SS1.p3.1.m1.2.3.1"></times><ci id="S3.SS1.p3.1.m1.2.3.2.cmml" xref="S3.SS1.p3.1.m1.2.3.2">𝑀</ci><ci id="S3.SS1.p3.1.m1.2.3.3.cmml" xref="S3.SS1.p3.1.m1.2.3.3">𝑎</ci><ci id="S3.SS1.p3.1.m1.2.3.4.cmml" xref="S3.SS1.p3.1.m1.2.3.4">𝑡</ci><ci id="S3.SS1.p3.1.m1.2.3.5.cmml" xref="S3.SS1.p3.1.m1.2.3.5">𝑟</ci><ci 
id="S3.SS1.p3.1.m1.2.3.6.cmml" xref="S3.SS1.p3.1.m1.2.3.6">𝑖</ci><ci id="S3.SS1.p3.1.m1.2.3.7.cmml" xref="S3.SS1.p3.1.m1.2.3.7">𝑥</ci><interval closure="open" id="S3.SS1.p3.1.m1.2.3.8.1.cmml" xref="S3.SS1.p3.1.m1.2.3.8.2"><ci id="S3.SS1.p3.1.m1.1.1.cmml" xref="S3.SS1.p3.1.m1.1.1">𝐷</ci><ci id="S3.SS1.p3.1.m1.2.2.cmml" xref="S3.SS1.p3.1.m1.2.2">𝑀</ci></interval></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p3.1.m1.2c">Matrix(D,M)</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p3.1.m1.2d">italic_M italic_a italic_t italic_r italic_i italic_x ( italic_D , italic_M )</annotation></semantics></math>, each model <math alttext="m_{j}" class="ltx_Math" display="inline" id="S3.SS1.p3.2.m2.1"><semantics id="S3.SS1.p3.2.m2.1a"><msub id="S3.SS1.p3.2.m2.1.1" xref="S3.SS1.p3.2.m2.1.1.cmml"><mi id="S3.SS1.p3.2.m2.1.1.2" xref="S3.SS1.p3.2.m2.1.1.2.cmml">m</mi><mi id="S3.SS1.p3.2.m2.1.1.3" xref="S3.SS1.p3.2.m2.1.1.3.cmml">j</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS1.p3.2.m2.1b"><apply id="S3.SS1.p3.2.m2.1.1.cmml" xref="S3.SS1.p3.2.m2.1.1"><csymbol cd="ambiguous" id="S3.SS1.p3.2.m2.1.1.1.cmml" xref="S3.SS1.p3.2.m2.1.1">subscript</csymbol><ci id="S3.SS1.p3.2.m2.1.1.2.cmml" xref="S3.SS1.p3.2.m2.1.1.2">𝑚</ci><ci id="S3.SS1.p3.2.m2.1.1.3.cmml" xref="S3.SS1.p3.2.m2.1.1.3">𝑗</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p3.2.m2.1c">m_{j}</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p3.2.m2.1d">italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT</annotation></semantics></math> could be represented as a <math alttext="\left|D\right|" class="ltx_Math" display="inline" id="S3.SS1.p3.3.m3.1"><semantics id="S3.SS1.p3.3.m3.1a"><mrow id="S3.SS1.p3.3.m3.1.2.2" xref="S3.SS1.p3.3.m3.1.2.1.cmml"><mo id="S3.SS1.p3.3.m3.1.2.2.1" xref="S3.SS1.p3.3.m3.1.2.1.1.cmml">|</mo><mi id="S3.SS1.p3.3.m3.1.1" xref="S3.SS1.p3.3.m3.1.1.cmml">D</mi><mo id="S3.SS1.p3.3.m3.1.2.2.2" 
xref="S3.SS1.p3.3.m3.1.2.1.1.cmml">|</mo></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p3.3.m3.1b"><apply id="S3.SS1.p3.3.m3.1.2.1.cmml" xref="S3.SS1.p3.3.m3.1.2.2"><abs id="S3.SS1.p3.3.m3.1.2.1.1.cmml" xref="S3.SS1.p3.3.m3.1.2.2.1"></abs><ci id="S3.SS1.p3.3.m3.1.1.cmml" xref="S3.SS1.p3.3.m3.1.1">𝐷</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p3.3.m3.1c">\left|D\right|</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p3.3.m3.1d">| italic_D |</annotation></semantics></math>-dimensional vector <math alttext="vec(m_{j})=(p(d_{1}|m_{j}),p(d_{2}|m_{j}),...,p(d_{m}|m_{j}))" class="ltx_Math" display="inline" id="S3.SS1.p3.4.m4.5"><semantics id="S3.SS1.p3.4.m4.5a"><mrow id="S3.SS1.p3.4.m4.5.5" xref="S3.SS1.p3.4.m4.5.5.cmml"><mrow id="S3.SS1.p3.4.m4.2.2.1" xref="S3.SS1.p3.4.m4.2.2.1.cmml"><mi id="S3.SS1.p3.4.m4.2.2.1.3" xref="S3.SS1.p3.4.m4.2.2.1.3.cmml">v</mi><mo id="S3.SS1.p3.4.m4.2.2.1.2" xref="S3.SS1.p3.4.m4.2.2.1.2.cmml">⁢</mo><mi id="S3.SS1.p3.4.m4.2.2.1.4" xref="S3.SS1.p3.4.m4.2.2.1.4.cmml">e</mi><mo id="S3.SS1.p3.4.m4.2.2.1.2a" xref="S3.SS1.p3.4.m4.2.2.1.2.cmml">⁢</mo><mi id="S3.SS1.p3.4.m4.2.2.1.5" xref="S3.SS1.p3.4.m4.2.2.1.5.cmml">c</mi><mo id="S3.SS1.p3.4.m4.2.2.1.2b" xref="S3.SS1.p3.4.m4.2.2.1.2.cmml">⁢</mo><mrow id="S3.SS1.p3.4.m4.2.2.1.1.1" xref="S3.SS1.p3.4.m4.2.2.1.1.1.1.cmml"><mo id="S3.SS1.p3.4.m4.2.2.1.1.1.2" stretchy="false" xref="S3.SS1.p3.4.m4.2.2.1.1.1.1.cmml">(</mo><msub id="S3.SS1.p3.4.m4.2.2.1.1.1.1" xref="S3.SS1.p3.4.m4.2.2.1.1.1.1.cmml"><mi id="S3.SS1.p3.4.m4.2.2.1.1.1.1.2" xref="S3.SS1.p3.4.m4.2.2.1.1.1.1.2.cmml">m</mi><mi id="S3.SS1.p3.4.m4.2.2.1.1.1.1.3" xref="S3.SS1.p3.4.m4.2.2.1.1.1.1.3.cmml">j</mi></msub><mo id="S3.SS1.p3.4.m4.2.2.1.1.1.3" stretchy="false" xref="S3.SS1.p3.4.m4.2.2.1.1.1.1.cmml">)</mo></mrow></mrow><mo id="S3.SS1.p3.4.m4.5.5.5" xref="S3.SS1.p3.4.m4.5.5.5.cmml">=</mo><mrow id="S3.SS1.p3.4.m4.5.5.4.3" xref="S3.SS1.p3.4.m4.5.5.4.4.cmml"><mo 
id="S3.SS1.p3.4.m4.5.5.4.3.4" stretchy="false" xref="S3.SS1.p3.4.m4.5.5.4.4.cmml">(</mo><mrow id="S3.SS1.p3.4.m4.3.3.2.1.1" xref="S3.SS1.p3.4.m4.3.3.2.1.1.cmml"><mi id="S3.SS1.p3.4.m4.3.3.2.1.1.3" xref="S3.SS1.p3.4.m4.3.3.2.1.1.3.cmml">p</mi><mo id="S3.SS1.p3.4.m4.3.3.2.1.1.2" xref="S3.SS1.p3.4.m4.3.3.2.1.1.2.cmml">⁢</mo><mrow id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.cmml"><mo id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.2" stretchy="false" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.cmml">(</mo><mrow id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.cmml"><msub id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.2" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.2.cmml"><mi id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.2.2" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.2.2.cmml">d</mi><mn id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.2.3" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.2.3.cmml">1</mn></msub><mo fence="false" id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.1" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.1.cmml">|</mo><msub id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.3" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.3.cmml"><mi id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.3.2" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.3.2.cmml">m</mi><mi id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.3.3" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.3.3.cmml">j</mi></msub></mrow><mo id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.3" stretchy="false" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.cmml">)</mo></mrow></mrow><mo id="S3.SS1.p3.4.m4.5.5.4.3.5" xref="S3.SS1.p3.4.m4.5.5.4.4.cmml">,</mo><mrow id="S3.SS1.p3.4.m4.4.4.3.2.2" xref="S3.SS1.p3.4.m4.4.4.3.2.2.cmml"><mi id="S3.SS1.p3.4.m4.4.4.3.2.2.3" xref="S3.SS1.p3.4.m4.4.4.3.2.2.3.cmml">p</mi><mo id="S3.SS1.p3.4.m4.4.4.3.2.2.2" xref="S3.SS1.p3.4.m4.4.4.3.2.2.2.cmml">⁢</mo><mrow id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.cmml"><mo id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.2" stretchy="false" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.cmml">(</mo><mrow id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1" 
xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.cmml"><msub id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.2" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.2.cmml"><mi id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.2.2" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.2.2.cmml">d</mi><mn id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.2.3" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.2.3.cmml">2</mn></msub><mo fence="false" id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.1" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.1.cmml">|</mo><msub id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.3" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.3.cmml"><mi id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.3.2" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.3.2.cmml">m</mi><mi id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.3.3" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.3.3.cmml">j</mi></msub></mrow><mo id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.3" stretchy="false" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.cmml">)</mo></mrow></mrow><mo id="S3.SS1.p3.4.m4.5.5.4.3.6" xref="S3.SS1.p3.4.m4.5.5.4.4.cmml">,</mo><mi id="S3.SS1.p3.4.m4.1.1" mathvariant="normal" xref="S3.SS1.p3.4.m4.1.1.cmml">…</mi><mo id="S3.SS1.p3.4.m4.5.5.4.3.7" xref="S3.SS1.p3.4.m4.5.5.4.4.cmml">,</mo><mrow id="S3.SS1.p3.4.m4.5.5.4.3.3" xref="S3.SS1.p3.4.m4.5.5.4.3.3.cmml"><mi id="S3.SS1.p3.4.m4.5.5.4.3.3.3" xref="S3.SS1.p3.4.m4.5.5.4.3.3.3.cmml">p</mi><mo id="S3.SS1.p3.4.m4.5.5.4.3.3.2" xref="S3.SS1.p3.4.m4.5.5.4.3.3.2.cmml">⁢</mo><mrow id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.cmml"><mo id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.2" stretchy="false" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.cmml">(</mo><mrow id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.cmml"><msub id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.2" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.2.cmml"><mi id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.2.2" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.2.2.cmml">d</mi><mi id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.2.3" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.2.3.cmml">m</mi></msub><mo fence="false" id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.1" 
xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.1.cmml">|</mo><msub id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.3" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.3.cmml"><mi id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.3.2" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.3.2.cmml">m</mi><mi id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.3.3" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.3.3.cmml">j</mi></msub></mrow><mo id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.3" stretchy="false" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.cmml">)</mo></mrow></mrow><mo id="S3.SS1.p3.4.m4.5.5.4.3.8" stretchy="false" xref="S3.SS1.p3.4.m4.5.5.4.4.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p3.4.m4.5b"><apply id="S3.SS1.p3.4.m4.5.5.cmml" xref="S3.SS1.p3.4.m4.5.5"><eq id="S3.SS1.p3.4.m4.5.5.5.cmml" xref="S3.SS1.p3.4.m4.5.5.5"></eq><apply id="S3.SS1.p3.4.m4.2.2.1.cmml" xref="S3.SS1.p3.4.m4.2.2.1"><times id="S3.SS1.p3.4.m4.2.2.1.2.cmml" xref="S3.SS1.p3.4.m4.2.2.1.2"></times><ci id="S3.SS1.p3.4.m4.2.2.1.3.cmml" xref="S3.SS1.p3.4.m4.2.2.1.3">𝑣</ci><ci id="S3.SS1.p3.4.m4.2.2.1.4.cmml" xref="S3.SS1.p3.4.m4.2.2.1.4">𝑒</ci><ci id="S3.SS1.p3.4.m4.2.2.1.5.cmml" xref="S3.SS1.p3.4.m4.2.2.1.5">𝑐</ci><apply id="S3.SS1.p3.4.m4.2.2.1.1.1.1.cmml" xref="S3.SS1.p3.4.m4.2.2.1.1.1"><csymbol cd="ambiguous" id="S3.SS1.p3.4.m4.2.2.1.1.1.1.1.cmml" xref="S3.SS1.p3.4.m4.2.2.1.1.1">subscript</csymbol><ci id="S3.SS1.p3.4.m4.2.2.1.1.1.1.2.cmml" xref="S3.SS1.p3.4.m4.2.2.1.1.1.1.2">𝑚</ci><ci id="S3.SS1.p3.4.m4.2.2.1.1.1.1.3.cmml" xref="S3.SS1.p3.4.m4.2.2.1.1.1.1.3">𝑗</ci></apply></apply><vector id="S3.SS1.p3.4.m4.5.5.4.4.cmml" xref="S3.SS1.p3.4.m4.5.5.4.3"><apply id="S3.SS1.p3.4.m4.3.3.2.1.1.cmml" xref="S3.SS1.p3.4.m4.3.3.2.1.1"><times id="S3.SS1.p3.4.m4.3.3.2.1.1.2.cmml" xref="S3.SS1.p3.4.m4.3.3.2.1.1.2"></times><ci id="S3.SS1.p3.4.m4.3.3.2.1.1.3.cmml" xref="S3.SS1.p3.4.m4.3.3.2.1.1.3">𝑝</ci><apply id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.cmml" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1"><csymbol cd="latexml" id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.1.cmml" 
xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.1">conditional</csymbol><apply id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.2.cmml" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.2.1.cmml" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.2">subscript</csymbol><ci id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.2.2.cmml" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.2.2">𝑑</ci><cn id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.2.3.cmml" type="integer" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.2.3">1</cn></apply><apply id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.3.cmml" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.3.1.cmml" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.3">subscript</csymbol><ci id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.3.2.cmml" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.3.2">𝑚</ci><ci id="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.3.3.cmml" xref="S3.SS1.p3.4.m4.3.3.2.1.1.1.1.1.3.3">𝑗</ci></apply></apply></apply><apply id="S3.SS1.p3.4.m4.4.4.3.2.2.cmml" xref="S3.SS1.p3.4.m4.4.4.3.2.2"><times id="S3.SS1.p3.4.m4.4.4.3.2.2.2.cmml" xref="S3.SS1.p3.4.m4.4.4.3.2.2.2"></times><ci id="S3.SS1.p3.4.m4.4.4.3.2.2.3.cmml" xref="S3.SS1.p3.4.m4.4.4.3.2.2.3">𝑝</ci><apply id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.cmml" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1"><csymbol cd="latexml" id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.1.cmml" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.1">conditional</csymbol><apply id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.2.cmml" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.2"><csymbol cd="ambiguous" id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.2.1.cmml" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.2">subscript</csymbol><ci id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.2.2.cmml" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.2.2">𝑑</ci><cn id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.2.3.cmml" type="integer" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.2.3">2</cn></apply><apply id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.3.cmml" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.3"><csymbol cd="ambiguous" id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.3.1.cmml" 
xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.3">subscript</csymbol><ci id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.3.2.cmml" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.3.2">𝑚</ci><ci id="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.3.3.cmml" xref="S3.SS1.p3.4.m4.4.4.3.2.2.1.1.1.3.3">𝑗</ci></apply></apply></apply><ci id="S3.SS1.p3.4.m4.1.1.cmml" xref="S3.SS1.p3.4.m4.1.1">…</ci><apply id="S3.SS1.p3.4.m4.5.5.4.3.3.cmml" xref="S3.SS1.p3.4.m4.5.5.4.3.3"><times id="S3.SS1.p3.4.m4.5.5.4.3.3.2.cmml" xref="S3.SS1.p3.4.m4.5.5.4.3.3.2"></times><ci id="S3.SS1.p3.4.m4.5.5.4.3.3.3.cmml" xref="S3.SS1.p3.4.m4.5.5.4.3.3.3">𝑝</ci><apply id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.cmml" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1"><csymbol cd="latexml" id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.1.cmml" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.1">conditional</csymbol><apply id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.2.cmml" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.2"><csymbol cd="ambiguous" id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.2.1.cmml" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.2">subscript</csymbol><ci id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.2.2.cmml" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.2.2">𝑑</ci><ci id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.2.3.cmml" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.2.3">𝑚</ci></apply><apply id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.3.cmml" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.3"><csymbol cd="ambiguous" id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.3.1.cmml" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.3">subscript</csymbol><ci id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.3.2.cmml" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.3.2">𝑚</ci><ci id="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.3.3.cmml" xref="S3.SS1.p3.4.m4.5.5.4.3.3.1.1.1.3.3">𝑗</ci></apply></apply></apply></vector></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p3.4.m4.5c">vec(m_{j})=(p(d_{1}|m_{j}),p(d_{2}|m_{j}),...,p(d_{m}|m_{j}))</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p3.4.m4.5d">italic_v italic_e italic_c ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) = ( italic_p ( 
italic_d start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) , italic_p ( italic_d start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) , … , italic_p ( italic_d start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) )</annotation></semantics></math>. Model clustering can then be conducted by any clustering algorithm based on the model distance <math alttext="dis(m_{j1},m_{j2})" class="ltx_Math" display="inline" id="S3.SS1.p3.5.m5.2"><semantics id="S3.SS1.p3.5.m5.2a"><mrow id="S3.SS1.p3.5.m5.2.2" xref="S3.SS1.p3.5.m5.2.2.cmml"><mi id="S3.SS1.p3.5.m5.2.2.4" xref="S3.SS1.p3.5.m5.2.2.4.cmml">d</mi><mo id="S3.SS1.p3.5.m5.2.2.3" xref="S3.SS1.p3.5.m5.2.2.3.cmml">⁢</mo><mi id="S3.SS1.p3.5.m5.2.2.5" xref="S3.SS1.p3.5.m5.2.2.5.cmml">i</mi><mo id="S3.SS1.p3.5.m5.2.2.3a" xref="S3.SS1.p3.5.m5.2.2.3.cmml">⁢</mo><mi id="S3.SS1.p3.5.m5.2.2.6" xref="S3.SS1.p3.5.m5.2.2.6.cmml">s</mi><mo id="S3.SS1.p3.5.m5.2.2.3b" xref="S3.SS1.p3.5.m5.2.2.3.cmml">⁢</mo><mrow id="S3.SS1.p3.5.m5.2.2.2.2" xref="S3.SS1.p3.5.m5.2.2.2.3.cmml"><mo id="S3.SS1.p3.5.m5.2.2.2.2.3" stretchy="false" xref="S3.SS1.p3.5.m5.2.2.2.3.cmml">(</mo><msub id="S3.SS1.p3.5.m5.1.1.1.1.1" xref="S3.SS1.p3.5.m5.1.1.1.1.1.cmml"><mi id="S3.SS1.p3.5.m5.1.1.1.1.1.2" xref="S3.SS1.p3.5.m5.1.1.1.1.1.2.cmml">m</mi><mrow id="S3.SS1.p3.5.m5.1.1.1.1.1.3" xref="S3.SS1.p3.5.m5.1.1.1.1.1.3.cmml"><mi id="S3.SS1.p3.5.m5.1.1.1.1.1.3.2" xref="S3.SS1.p3.5.m5.1.1.1.1.1.3.2.cmml">j</mi><mo id="S3.SS1.p3.5.m5.1.1.1.1.1.3.1" xref="S3.SS1.p3.5.m5.1.1.1.1.1.3.1.cmml">⁢</mo><mn id="S3.SS1.p3.5.m5.1.1.1.1.1.3.3" xref="S3.SS1.p3.5.m5.1.1.1.1.1.3.3.cmml">1</mn></mrow></msub><mo id="S3.SS1.p3.5.m5.2.2.2.2.4" xref="S3.SS1.p3.5.m5.2.2.2.3.cmml">,</mo><msub id="S3.SS1.p3.5.m5.2.2.2.2.2" xref="S3.SS1.p3.5.m5.2.2.2.2.2.cmml"><mi id="S3.SS1.p3.5.m5.2.2.2.2.2.2" xref="S3.SS1.p3.5.m5.2.2.2.2.2.2.cmml">m</mi><mrow 
id="S3.SS1.p3.5.m5.2.2.2.2.2.3" xref="S3.SS1.p3.5.m5.2.2.2.2.2.3.cmml"><mi id="S3.SS1.p3.5.m5.2.2.2.2.2.3.2" xref="S3.SS1.p3.5.m5.2.2.2.2.2.3.2.cmml">j</mi><mo id="S3.SS1.p3.5.m5.2.2.2.2.2.3.1" xref="S3.SS1.p3.5.m5.2.2.2.2.2.3.1.cmml">⁢</mo><mn id="S3.SS1.p3.5.m5.2.2.2.2.2.3.3" xref="S3.SS1.p3.5.m5.2.2.2.2.2.3.3.cmml">2</mn></mrow></msub><mo id="S3.SS1.p3.5.m5.2.2.2.2.5" stretchy="false" xref="S3.SS1.p3.5.m5.2.2.2.3.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p3.5.m5.2b"><apply id="S3.SS1.p3.5.m5.2.2.cmml" xref="S3.SS1.p3.5.m5.2.2"><times id="S3.SS1.p3.5.m5.2.2.3.cmml" xref="S3.SS1.p3.5.m5.2.2.3"></times><ci id="S3.SS1.p3.5.m5.2.2.4.cmml" xref="S3.SS1.p3.5.m5.2.2.4">𝑑</ci><ci id="S3.SS1.p3.5.m5.2.2.5.cmml" xref="S3.SS1.p3.5.m5.2.2.5">𝑖</ci><ci id="S3.SS1.p3.5.m5.2.2.6.cmml" xref="S3.SS1.p3.5.m5.2.2.6">𝑠</ci><interval closure="open" id="S3.SS1.p3.5.m5.2.2.2.3.cmml" xref="S3.SS1.p3.5.m5.2.2.2.2"><apply id="S3.SS1.p3.5.m5.1.1.1.1.1.cmml" xref="S3.SS1.p3.5.m5.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS1.p3.5.m5.1.1.1.1.1.1.cmml" xref="S3.SS1.p3.5.m5.1.1.1.1.1">subscript</csymbol><ci id="S3.SS1.p3.5.m5.1.1.1.1.1.2.cmml" xref="S3.SS1.p3.5.m5.1.1.1.1.1.2">𝑚</ci><apply id="S3.SS1.p3.5.m5.1.1.1.1.1.3.cmml" xref="S3.SS1.p3.5.m5.1.1.1.1.1.3"><times id="S3.SS1.p3.5.m5.1.1.1.1.1.3.1.cmml" xref="S3.SS1.p3.5.m5.1.1.1.1.1.3.1"></times><ci id="S3.SS1.p3.5.m5.1.1.1.1.1.3.2.cmml" xref="S3.SS1.p3.5.m5.1.1.1.1.1.3.2">𝑗</ci><cn id="S3.SS1.p3.5.m5.1.1.1.1.1.3.3.cmml" type="integer" xref="S3.SS1.p3.5.m5.1.1.1.1.1.3.3">1</cn></apply></apply><apply id="S3.SS1.p3.5.m5.2.2.2.2.2.cmml" xref="S3.SS1.p3.5.m5.2.2.2.2.2"><csymbol cd="ambiguous" id="S3.SS1.p3.5.m5.2.2.2.2.2.1.cmml" xref="S3.SS1.p3.5.m5.2.2.2.2.2">subscript</csymbol><ci id="S3.SS1.p3.5.m5.2.2.2.2.2.2.cmml" xref="S3.SS1.p3.5.m5.2.2.2.2.2.2">𝑚</ci><apply id="S3.SS1.p3.5.m5.2.2.2.2.2.3.cmml" xref="S3.SS1.p3.5.m5.2.2.2.2.2.3"><times id="S3.SS1.p3.5.m5.2.2.2.2.2.3.1.cmml" 
xref="S3.SS1.p3.5.m5.2.2.2.2.2.3.1"></times><ci id="S3.SS1.p3.5.m5.2.2.2.2.2.3.2.cmml" xref="S3.SS1.p3.5.m5.2.2.2.2.2.3.2">𝑗</ci><cn id="S3.SS1.p3.5.m5.2.2.2.2.2.3.3.cmml" type="integer" xref="S3.SS1.p3.5.m5.2.2.2.2.2.3.3">2</cn></apply></apply></interval></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p3.5.m5.2c">dis(m_{j1},m_{j2})</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p3.5.m5.2d">italic_d italic_i italic_s ( italic_m start_POSTSUBSCRIPT italic_j 1 end_POSTSUBSCRIPT , italic_m start_POSTSUBSCRIPT italic_j 2 end_POSTSUBSCRIPT )</annotation></semantics></math> measured based on <math alttext="vec(m_{j1})" class="ltx_Math" display="inline" id="S3.SS1.p3.6.m6.1"><semantics id="S3.SS1.p3.6.m6.1a"><mrow id="S3.SS1.p3.6.m6.1.1" xref="S3.SS1.p3.6.m6.1.1.cmml"><mi id="S3.SS1.p3.6.m6.1.1.3" xref="S3.SS1.p3.6.m6.1.1.3.cmml">v</mi><mo id="S3.SS1.p3.6.m6.1.1.2" xref="S3.SS1.p3.6.m6.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p3.6.m6.1.1.4" xref="S3.SS1.p3.6.m6.1.1.4.cmml">e</mi><mo id="S3.SS1.p3.6.m6.1.1.2a" xref="S3.SS1.p3.6.m6.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p3.6.m6.1.1.5" xref="S3.SS1.p3.6.m6.1.1.5.cmml">c</mi><mo id="S3.SS1.p3.6.m6.1.1.2b" xref="S3.SS1.p3.6.m6.1.1.2.cmml">⁢</mo><mrow id="S3.SS1.p3.6.m6.1.1.1.1" xref="S3.SS1.p3.6.m6.1.1.1.1.1.cmml"><mo id="S3.SS1.p3.6.m6.1.1.1.1.2" stretchy="false" xref="S3.SS1.p3.6.m6.1.1.1.1.1.cmml">(</mo><msub id="S3.SS1.p3.6.m6.1.1.1.1.1" xref="S3.SS1.p3.6.m6.1.1.1.1.1.cmml"><mi id="S3.SS1.p3.6.m6.1.1.1.1.1.2" xref="S3.SS1.p3.6.m6.1.1.1.1.1.2.cmml">m</mi><mrow id="S3.SS1.p3.6.m6.1.1.1.1.1.3" xref="S3.SS1.p3.6.m6.1.1.1.1.1.3.cmml"><mi id="S3.SS1.p3.6.m6.1.1.1.1.1.3.2" xref="S3.SS1.p3.6.m6.1.1.1.1.1.3.2.cmml">j</mi><mo id="S3.SS1.p3.6.m6.1.1.1.1.1.3.1" xref="S3.SS1.p3.6.m6.1.1.1.1.1.3.1.cmml">⁢</mo><mn id="S3.SS1.p3.6.m6.1.1.1.1.1.3.3" xref="S3.SS1.p3.6.m6.1.1.1.1.1.3.3.cmml">1</mn></mrow></msub><mo id="S3.SS1.p3.6.m6.1.1.1.1.3" stretchy="false" 
xref="S3.SS1.p3.6.m6.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p3.6.m6.1b"><apply id="S3.SS1.p3.6.m6.1.1.cmml" xref="S3.SS1.p3.6.m6.1.1"><times id="S3.SS1.p3.6.m6.1.1.2.cmml" xref="S3.SS1.p3.6.m6.1.1.2"></times><ci id="S3.SS1.p3.6.m6.1.1.3.cmml" xref="S3.SS1.p3.6.m6.1.1.3">𝑣</ci><ci id="S3.SS1.p3.6.m6.1.1.4.cmml" xref="S3.SS1.p3.6.m6.1.1.4">𝑒</ci><ci id="S3.SS1.p3.6.m6.1.1.5.cmml" xref="S3.SS1.p3.6.m6.1.1.5">𝑐</ci><apply id="S3.SS1.p3.6.m6.1.1.1.1.1.cmml" xref="S3.SS1.p3.6.m6.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS1.p3.6.m6.1.1.1.1.1.1.cmml" xref="S3.SS1.p3.6.m6.1.1.1.1">subscript</csymbol><ci id="S3.SS1.p3.6.m6.1.1.1.1.1.2.cmml" xref="S3.SS1.p3.6.m6.1.1.1.1.1.2">𝑚</ci><apply id="S3.SS1.p3.6.m6.1.1.1.1.1.3.cmml" xref="S3.SS1.p3.6.m6.1.1.1.1.1.3"><times id="S3.SS1.p3.6.m6.1.1.1.1.1.3.1.cmml" xref="S3.SS1.p3.6.m6.1.1.1.1.1.3.1"></times><ci id="S3.SS1.p3.6.m6.1.1.1.1.1.3.2.cmml" xref="S3.SS1.p3.6.m6.1.1.1.1.1.3.2">𝑗</ci><cn id="S3.SS1.p3.6.m6.1.1.1.1.1.3.3.cmml" type="integer" xref="S3.SS1.p3.6.m6.1.1.1.1.1.3.3">1</cn></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p3.6.m6.1c">vec(m_{j1})</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p3.6.m6.1d">italic_v italic_e italic_c ( italic_m start_POSTSUBSCRIPT italic_j 1 end_POSTSUBSCRIPT )</annotation></semantics></math> and <math alttext="vec(m_{j2})" class="ltx_Math" display="inline" id="S3.SS1.p3.7.m7.1"><semantics id="S3.SS1.p3.7.m7.1a"><mrow id="S3.SS1.p3.7.m7.1.1" xref="S3.SS1.p3.7.m7.1.1.cmml"><mi id="S3.SS1.p3.7.m7.1.1.3" xref="S3.SS1.p3.7.m7.1.1.3.cmml">v</mi><mo id="S3.SS1.p3.7.m7.1.1.2" xref="S3.SS1.p3.7.m7.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p3.7.m7.1.1.4" xref="S3.SS1.p3.7.m7.1.1.4.cmml">e</mi><mo id="S3.SS1.p3.7.m7.1.1.2a" xref="S3.SS1.p3.7.m7.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p3.7.m7.1.1.5" xref="S3.SS1.p3.7.m7.1.1.5.cmml">c</mi><mo id="S3.SS1.p3.7.m7.1.1.2b" 
xref="S3.SS1.p3.7.m7.1.1.2.cmml">⁢</mo><mrow id="S3.SS1.p3.7.m7.1.1.1.1" xref="S3.SS1.p3.7.m7.1.1.1.1.1.cmml"><mo id="S3.SS1.p3.7.m7.1.1.1.1.2" stretchy="false" xref="S3.SS1.p3.7.m7.1.1.1.1.1.cmml">(</mo><msub id="S3.SS1.p3.7.m7.1.1.1.1.1" xref="S3.SS1.p3.7.m7.1.1.1.1.1.cmml"><mi id="S3.SS1.p3.7.m7.1.1.1.1.1.2" xref="S3.SS1.p3.7.m7.1.1.1.1.1.2.cmml">m</mi><mrow id="S3.SS1.p3.7.m7.1.1.1.1.1.3" xref="S3.SS1.p3.7.m7.1.1.1.1.1.3.cmml"><mi id="S3.SS1.p3.7.m7.1.1.1.1.1.3.2" xref="S3.SS1.p3.7.m7.1.1.1.1.1.3.2.cmml">j</mi><mo id="S3.SS1.p3.7.m7.1.1.1.1.1.3.1" xref="S3.SS1.p3.7.m7.1.1.1.1.1.3.1.cmml">⁢</mo><mn id="S3.SS1.p3.7.m7.1.1.1.1.1.3.3" xref="S3.SS1.p3.7.m7.1.1.1.1.1.3.3.cmml">2</mn></mrow></msub><mo id="S3.SS1.p3.7.m7.1.1.1.1.3" stretchy="false" xref="S3.SS1.p3.7.m7.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p3.7.m7.1b"><apply id="S3.SS1.p3.7.m7.1.1.cmml" xref="S3.SS1.p3.7.m7.1.1"><times id="S3.SS1.p3.7.m7.1.1.2.cmml" xref="S3.SS1.p3.7.m7.1.1.2"></times><ci id="S3.SS1.p3.7.m7.1.1.3.cmml" xref="S3.SS1.p3.7.m7.1.1.3">𝑣</ci><ci id="S3.SS1.p3.7.m7.1.1.4.cmml" xref="S3.SS1.p3.7.m7.1.1.4">𝑒</ci><ci id="S3.SS1.p3.7.m7.1.1.5.cmml" xref="S3.SS1.p3.7.m7.1.1.5">𝑐</ci><apply id="S3.SS1.p3.7.m7.1.1.1.1.1.cmml" xref="S3.SS1.p3.7.m7.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS1.p3.7.m7.1.1.1.1.1.1.cmml" xref="S3.SS1.p3.7.m7.1.1.1.1">subscript</csymbol><ci id="S3.SS1.p3.7.m7.1.1.1.1.1.2.cmml" xref="S3.SS1.p3.7.m7.1.1.1.1.1.2">𝑚</ci><apply id="S3.SS1.p3.7.m7.1.1.1.1.1.3.cmml" xref="S3.SS1.p3.7.m7.1.1.1.1.1.3"><times id="S3.SS1.p3.7.m7.1.1.1.1.1.3.1.cmml" xref="S3.SS1.p3.7.m7.1.1.1.1.1.3.1"></times><ci id="S3.SS1.p3.7.m7.1.1.1.1.1.3.2.cmml" xref="S3.SS1.p3.7.m7.1.1.1.1.1.3.2">𝑗</ci><cn id="S3.SS1.p3.7.m7.1.1.1.1.1.3.3.cmml" type="integer" xref="S3.SS1.p3.7.m7.1.1.1.1.1.3.3">2</cn></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p3.7.m7.1c">vec(m_{j2})</annotation><annotation 
encoding="application/x-llamapun" id="S3.SS1.p3.7.m7.1d">italic_v italic_e italic_c ( italic_m start_POSTSUBSCRIPT italic_j 2 end_POSTSUBSCRIPT )</annotation></semantics></math>. As the benchmark datasets cover a group of representative tasks for a machine learning application, the training performance on these datasets can measure both the feature extraction capability and the domain characteristics of a pre-trained model. Therefore, for a target task <math alttext="T" class="ltx_Math" display="inline" id="S3.SS1.p3.8.m8.1"><semantics id="S3.SS1.p3.8.m8.1a"><mi id="S3.SS1.p3.8.m8.1.1" xref="S3.SS1.p3.8.m8.1.1.cmml">T</mi><annotation-xml encoding="MathML-Content" id="S3.SS1.p3.8.m8.1b"><ci id="S3.SS1.p3.8.m8.1.1.cmml" xref="S3.SS1.p3.8.m8.1.1">𝑇</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p3.8.m8.1c">T</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p3.8.m8.1d">italic_T</annotation></semantics></math>, there may be benchmark tasks that share similar feature extraction requirements or a similar training data domain. 
Consequently, models that are similar, as measured by <math alttext="vec(m_{j1})" class="ltx_Math" display="inline" id="S3.SS1.p3.9.m9.1"><semantics id="S3.SS1.p3.9.m9.1a"><mrow id="S3.SS1.p3.9.m9.1.1" xref="S3.SS1.p3.9.m9.1.1.cmml"><mi id="S3.SS1.p3.9.m9.1.1.3" xref="S3.SS1.p3.9.m9.1.1.3.cmml">v</mi><mo id="S3.SS1.p3.9.m9.1.1.2" xref="S3.SS1.p3.9.m9.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p3.9.m9.1.1.4" xref="S3.SS1.p3.9.m9.1.1.4.cmml">e</mi><mo id="S3.SS1.p3.9.m9.1.1.2a" xref="S3.SS1.p3.9.m9.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p3.9.m9.1.1.5" xref="S3.SS1.p3.9.m9.1.1.5.cmml">c</mi><mo id="S3.SS1.p3.9.m9.1.1.2b" xref="S3.SS1.p3.9.m9.1.1.2.cmml">⁢</mo><mrow id="S3.SS1.p3.9.m9.1.1.1.1" xref="S3.SS1.p3.9.m9.1.1.1.1.1.cmml"><mo id="S3.SS1.p3.9.m9.1.1.1.1.2" stretchy="false" xref="S3.SS1.p3.9.m9.1.1.1.1.1.cmml">(</mo><msub id="S3.SS1.p3.9.m9.1.1.1.1.1" xref="S3.SS1.p3.9.m9.1.1.1.1.1.cmml"><mi id="S3.SS1.p3.9.m9.1.1.1.1.1.2" xref="S3.SS1.p3.9.m9.1.1.1.1.1.2.cmml">m</mi><mrow id="S3.SS1.p3.9.m9.1.1.1.1.1.3" xref="S3.SS1.p3.9.m9.1.1.1.1.1.3.cmml"><mi id="S3.SS1.p3.9.m9.1.1.1.1.1.3.2" xref="S3.SS1.p3.9.m9.1.1.1.1.1.3.2.cmml">j</mi><mo id="S3.SS1.p3.9.m9.1.1.1.1.1.3.1" xref="S3.SS1.p3.9.m9.1.1.1.1.1.3.1.cmml">⁢</mo><mn id="S3.SS1.p3.9.m9.1.1.1.1.1.3.3" xref="S3.SS1.p3.9.m9.1.1.1.1.1.3.3.cmml">1</mn></mrow></msub><mo id="S3.SS1.p3.9.m9.1.1.1.1.3" stretchy="false" xref="S3.SS1.p3.9.m9.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p3.9.m9.1b"><apply id="S3.SS1.p3.9.m9.1.1.cmml" xref="S3.SS1.p3.9.m9.1.1"><times id="S3.SS1.p3.9.m9.1.1.2.cmml" xref="S3.SS1.p3.9.m9.1.1.2"></times><ci id="S3.SS1.p3.9.m9.1.1.3.cmml" xref="S3.SS1.p3.9.m9.1.1.3">𝑣</ci><ci id="S3.SS1.p3.9.m9.1.1.4.cmml" xref="S3.SS1.p3.9.m9.1.1.4">𝑒</ci><ci id="S3.SS1.p3.9.m9.1.1.5.cmml" xref="S3.SS1.p3.9.m9.1.1.5">𝑐</ci><apply id="S3.SS1.p3.9.m9.1.1.1.1.1.cmml" xref="S3.SS1.p3.9.m9.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS1.p3.9.m9.1.1.1.1.1.1.cmml" 
xref="S3.SS1.p3.9.m9.1.1.1.1">subscript</csymbol><ci id="S3.SS1.p3.9.m9.1.1.1.1.1.2.cmml" xref="S3.SS1.p3.9.m9.1.1.1.1.1.2">𝑚</ci><apply id="S3.SS1.p3.9.m9.1.1.1.1.1.3.cmml" xref="S3.SS1.p3.9.m9.1.1.1.1.1.3"><times id="S3.SS1.p3.9.m9.1.1.1.1.1.3.1.cmml" xref="S3.SS1.p3.9.m9.1.1.1.1.1.3.1"></times><ci id="S3.SS1.p3.9.m9.1.1.1.1.1.3.2.cmml" xref="S3.SS1.p3.9.m9.1.1.1.1.1.3.2">𝑗</ci><cn id="S3.SS1.p3.9.m9.1.1.1.1.1.3.3.cmml" type="integer" xref="S3.SS1.p3.9.m9.1.1.1.1.1.3.3">1</cn></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p3.9.m9.1c">vec(m_{j1})</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p3.9.m9.1d">italic_v italic_e italic_c ( italic_m start_POSTSUBSCRIPT italic_j 1 end_POSTSUBSCRIPT )</annotation></semantics></math> and <math alttext="vec(m_{j2})" class="ltx_Math" display="inline" id="S3.SS1.p3.10.m10.1"><semantics id="S3.SS1.p3.10.m10.1a"><mrow id="S3.SS1.p3.10.m10.1.1" xref="S3.SS1.p3.10.m10.1.1.cmml"><mi id="S3.SS1.p3.10.m10.1.1.3" xref="S3.SS1.p3.10.m10.1.1.3.cmml">v</mi><mo id="S3.SS1.p3.10.m10.1.1.2" xref="S3.SS1.p3.10.m10.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p3.10.m10.1.1.4" xref="S3.SS1.p3.10.m10.1.1.4.cmml">e</mi><mo id="S3.SS1.p3.10.m10.1.1.2a" xref="S3.SS1.p3.10.m10.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p3.10.m10.1.1.5" xref="S3.SS1.p3.10.m10.1.1.5.cmml">c</mi><mo id="S3.SS1.p3.10.m10.1.1.2b" xref="S3.SS1.p3.10.m10.1.1.2.cmml">⁢</mo><mrow id="S3.SS1.p3.10.m10.1.1.1.1" xref="S3.SS1.p3.10.m10.1.1.1.1.1.cmml"><mo id="S3.SS1.p3.10.m10.1.1.1.1.2" stretchy="false" xref="S3.SS1.p3.10.m10.1.1.1.1.1.cmml">(</mo><msub id="S3.SS1.p3.10.m10.1.1.1.1.1" xref="S3.SS1.p3.10.m10.1.1.1.1.1.cmml"><mi id="S3.SS1.p3.10.m10.1.1.1.1.1.2" xref="S3.SS1.p3.10.m10.1.1.1.1.1.2.cmml">m</mi><mrow id="S3.SS1.p3.10.m10.1.1.1.1.1.3" xref="S3.SS1.p3.10.m10.1.1.1.1.1.3.cmml"><mi id="S3.SS1.p3.10.m10.1.1.1.1.1.3.2" xref="S3.SS1.p3.10.m10.1.1.1.1.1.3.2.cmml">j</mi><mo id="S3.SS1.p3.10.m10.1.1.1.1.1.3.1" 
xref="S3.SS1.p3.10.m10.1.1.1.1.1.3.1.cmml">⁢</mo><mn id="S3.SS1.p3.10.m10.1.1.1.1.1.3.3" xref="S3.SS1.p3.10.m10.1.1.1.1.1.3.3.cmml">2</mn></mrow></msub><mo id="S3.SS1.p3.10.m10.1.1.1.1.3" stretchy="false" xref="S3.SS1.p3.10.m10.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p3.10.m10.1b"><apply id="S3.SS1.p3.10.m10.1.1.cmml" xref="S3.SS1.p3.10.m10.1.1"><times id="S3.SS1.p3.10.m10.1.1.2.cmml" xref="S3.SS1.p3.10.m10.1.1.2"></times><ci id="S3.SS1.p3.10.m10.1.1.3.cmml" xref="S3.SS1.p3.10.m10.1.1.3">𝑣</ci><ci id="S3.SS1.p3.10.m10.1.1.4.cmml" xref="S3.SS1.p3.10.m10.1.1.4">𝑒</ci><ci id="S3.SS1.p3.10.m10.1.1.5.cmml" xref="S3.SS1.p3.10.m10.1.1.5">𝑐</ci><apply id="S3.SS1.p3.10.m10.1.1.1.1.1.cmml" xref="S3.SS1.p3.10.m10.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS1.p3.10.m10.1.1.1.1.1.1.cmml" xref="S3.SS1.p3.10.m10.1.1.1.1">subscript</csymbol><ci id="S3.SS1.p3.10.m10.1.1.1.1.1.2.cmml" xref="S3.SS1.p3.10.m10.1.1.1.1.1.2">𝑚</ci><apply id="S3.SS1.p3.10.m10.1.1.1.1.1.3.cmml" xref="S3.SS1.p3.10.m10.1.1.1.1.1.3"><times id="S3.SS1.p3.10.m10.1.1.1.1.1.3.1.cmml" xref="S3.SS1.p3.10.m10.1.1.1.1.1.3.1"></times><ci id="S3.SS1.p3.10.m10.1.1.1.1.1.3.2.cmml" xref="S3.SS1.p3.10.m10.1.1.1.1.1.3.2">𝑗</ci><cn id="S3.SS1.p3.10.m10.1.1.1.1.1.3.3.cmml" type="integer" xref="S3.SS1.p3.10.m10.1.1.1.1.1.3.3">2</cn></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p3.10.m10.1c">vec(m_{j2})</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p3.10.m10.1d">italic_v italic_e italic_c ( italic_m start_POSTSUBSCRIPT italic_j 2 end_POSTSUBSCRIPT )</annotation></semantics></math> on benchmark datasets, also tend to have similar performance on the new task <math alttext="T" class="ltx_Math" display="inline" id="S3.SS1.p3.11.m11.1"><semantics id="S3.SS1.p3.11.m11.1a"><mi id="S3.SS1.p3.11.m11.1.1" xref="S3.SS1.p3.11.m11.1.1.cmml">T</mi><annotation-xml encoding="MathML-Content" id="S3.SS1.p3.11.m11.1b"><ci 
id="S3.SS1.p3.11.m11.1.1.cmml" xref="S3.SS1.p3.11.m11.1.1">𝑇</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p3.11.m11.1c">T</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p3.11.m11.1d">italic_T</annotation></semantics></math>. Here, we measure model similarity through the average of the accuracy differences on the <math alttext="k" class="ltx_Math" display="inline" id="S3.SS1.p3.12.m12.1"><semantics id="S3.SS1.p3.12.m12.1a"><mi id="S3.SS1.p3.12.m12.1.1" xref="S3.SS1.p3.12.m12.1.1.cmml">k</mi><annotation-xml encoding="MathML-Content" id="S3.SS1.p3.12.m12.1b"><ci id="S3.SS1.p3.12.m12.1.1.cmml" xref="S3.SS1.p3.12.m12.1.1">𝑘</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p3.12.m12.1c">k</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p3.12.m12.1d">italic_k</annotation></semantics></math> benchmark datasets on which the two models differ most in accuracy:</p> <table class="ltx_equation ltx_eqn_table" id="S3.E1"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="sim(m_{j1},m_{j2})=1-avg(top_{k}{|vec[m_{j1}]-vec[m_{j2}]|})" class="ltx_Math" display="block" id="S3.E1.m1.3"><semantics id="S3.E1.m1.3a"><mrow id="S3.E1.m1.3.3" xref="S3.E1.m1.3.3.cmml"><mrow id="S3.E1.m1.2.2.2" xref="S3.E1.m1.2.2.2.cmml"><mi id="S3.E1.m1.2.2.2.4" xref="S3.E1.m1.2.2.2.4.cmml">s</mi><mo id="S3.E1.m1.2.2.2.3" xref="S3.E1.m1.2.2.2.3.cmml">⁢</mo><mi id="S3.E1.m1.2.2.2.5" xref="S3.E1.m1.2.2.2.5.cmml">i</mi><mo id="S3.E1.m1.2.2.2.3a" xref="S3.E1.m1.2.2.2.3.cmml">⁢</mo><mi id="S3.E1.m1.2.2.2.6" xref="S3.E1.m1.2.2.2.6.cmml">m</mi><mo id="S3.E1.m1.2.2.2.3b" xref="S3.E1.m1.2.2.2.3.cmml">⁢</mo><mrow id="S3.E1.m1.2.2.2.2.2" xref="S3.E1.m1.2.2.2.2.3.cmml"><mo id="S3.E1.m1.2.2.2.2.2.3" stretchy="false" xref="S3.E1.m1.2.2.2.2.3.cmml">(</mo><msub id="S3.E1.m1.1.1.1.1.1.1" 
xref="S3.E1.m1.1.1.1.1.1.1.cmml"><mi id="S3.E1.m1.1.1.1.1.1.1.2" xref="S3.E1.m1.1.1.1.1.1.1.2.cmml">m</mi><mrow id="S3.E1.m1.1.1.1.1.1.1.3" xref="S3.E1.m1.1.1.1.1.1.1.3.cmml"><mi id="S3.E1.m1.1.1.1.1.1.1.3.2" xref="S3.E1.m1.1.1.1.1.1.1.3.2.cmml">j</mi><mo id="S3.E1.m1.1.1.1.1.1.1.3.1" xref="S3.E1.m1.1.1.1.1.1.1.3.1.cmml">⁢</mo><mn id="S3.E1.m1.1.1.1.1.1.1.3.3" xref="S3.E1.m1.1.1.1.1.1.1.3.3.cmml">1</mn></mrow></msub><mo id="S3.E1.m1.2.2.2.2.2.4" xref="S3.E1.m1.2.2.2.2.3.cmml">,</mo><msub id="S3.E1.m1.2.2.2.2.2.2" xref="S3.E1.m1.2.2.2.2.2.2.cmml"><mi id="S3.E1.m1.2.2.2.2.2.2.2" xref="S3.E1.m1.2.2.2.2.2.2.2.cmml">m</mi><mrow id="S3.E1.m1.2.2.2.2.2.2.3" xref="S3.E1.m1.2.2.2.2.2.2.3.cmml"><mi id="S3.E1.m1.2.2.2.2.2.2.3.2" xref="S3.E1.m1.2.2.2.2.2.2.3.2.cmml">j</mi><mo id="S3.E1.m1.2.2.2.2.2.2.3.1" xref="S3.E1.m1.2.2.2.2.2.2.3.1.cmml">⁢</mo><mn id="S3.E1.m1.2.2.2.2.2.2.3.3" xref="S3.E1.m1.2.2.2.2.2.2.3.3.cmml">2</mn></mrow></msub><mo id="S3.E1.m1.2.2.2.2.2.5" stretchy="false" xref="S3.E1.m1.2.2.2.2.3.cmml">)</mo></mrow></mrow><mo id="S3.E1.m1.3.3.4" xref="S3.E1.m1.3.3.4.cmml">=</mo><mrow id="S3.E1.m1.3.3.3" xref="S3.E1.m1.3.3.3.cmml"><mn id="S3.E1.m1.3.3.3.3" xref="S3.E1.m1.3.3.3.3.cmml">1</mn><mo id="S3.E1.m1.3.3.3.2" xref="S3.E1.m1.3.3.3.2.cmml">−</mo><mrow id="S3.E1.m1.3.3.3.1" xref="S3.E1.m1.3.3.3.1.cmml"><mi id="S3.E1.m1.3.3.3.1.3" xref="S3.E1.m1.3.3.3.1.3.cmml">a</mi><mo id="S3.E1.m1.3.3.3.1.2" xref="S3.E1.m1.3.3.3.1.2.cmml">⁢</mo><mi id="S3.E1.m1.3.3.3.1.4" xref="S3.E1.m1.3.3.3.1.4.cmml">v</mi><mo id="S3.E1.m1.3.3.3.1.2a" xref="S3.E1.m1.3.3.3.1.2.cmml">⁢</mo><mi id="S3.E1.m1.3.3.3.1.5" xref="S3.E1.m1.3.3.3.1.5.cmml">g</mi><mo id="S3.E1.m1.3.3.3.1.2b" xref="S3.E1.m1.3.3.3.1.2.cmml">⁢</mo><mrow id="S3.E1.m1.3.3.3.1.1.1" xref="S3.E1.m1.3.3.3.1.1.1.1.cmml"><mo id="S3.E1.m1.3.3.3.1.1.1.2" stretchy="false" xref="S3.E1.m1.3.3.3.1.1.1.1.cmml">(</mo><mrow id="S3.E1.m1.3.3.3.1.1.1.1" xref="S3.E1.m1.3.3.3.1.1.1.1.cmml"><mi id="S3.E1.m1.3.3.3.1.1.1.1.3" 
xref="S3.E1.m1.3.3.3.1.1.1.1.3.cmml">t</mi><mo id="S3.E1.m1.3.3.3.1.1.1.1.2" xref="S3.E1.m1.3.3.3.1.1.1.1.2.cmml">⁢</mo><mi id="S3.E1.m1.3.3.3.1.1.1.1.4" xref="S3.E1.m1.3.3.3.1.1.1.1.4.cmml">o</mi><mo id="S3.E1.m1.3.3.3.1.1.1.1.2a" xref="S3.E1.m1.3.3.3.1.1.1.1.2.cmml">⁢</mo><msub id="S3.E1.m1.3.3.3.1.1.1.1.5" xref="S3.E1.m1.3.3.3.1.1.1.1.5.cmml"><mi id="S3.E1.m1.3.3.3.1.1.1.1.5.2" xref="S3.E1.m1.3.3.3.1.1.1.1.5.2.cmml">p</mi><mi id="S3.E1.m1.3.3.3.1.1.1.1.5.3" xref="S3.E1.m1.3.3.3.1.1.1.1.5.3.cmml">k</mi></msub><mo id="S3.E1.m1.3.3.3.1.1.1.1.2b" xref="S3.E1.m1.3.3.3.1.1.1.1.2.cmml">⁢</mo><mrow id="S3.E1.m1.3.3.3.1.1.1.1.1.1" xref="S3.E1.m1.3.3.3.1.1.1.1.1.2.cmml"><mo id="S3.E1.m1.3.3.3.1.1.1.1.1.1.2" stretchy="false" xref="S3.E1.m1.3.3.3.1.1.1.1.1.2.1.cmml">|</mo><mrow id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.cmml"><mrow id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.cmml"><mi id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.3" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.3.cmml">v</mi><mo id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.2" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.2.cmml">⁢</mo><mi id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.4" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.4.cmml">e</mi><mo id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.2a" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.2.cmml">⁢</mo><mi id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.5" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.5.cmml">c</mi><mo id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.2b" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.2.cmml">⁢</mo><mrow id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.2.cmml"><mo id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.2" stretchy="false" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.2.1.cmml">[</mo><msub id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.cmml"><mi id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.2" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.2.cmml">m</mi><mrow id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.3" 
xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.3.cmml"><mi id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.3.2" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml">j</mi><mo id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.3.1" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml">⁢</mo><mn id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.3.3" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml">1</mn></mrow></msub><mo id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.3" stretchy="false" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.2.1.cmml">]</mo></mrow></mrow><mo id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.3" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.3.cmml">−</mo><mrow id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.cmml"><mi id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.3" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.3.cmml">v</mi><mo id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.2" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.2.cmml">⁢</mo><mi id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.4" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.4.cmml">e</mi><mo id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.2a" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.2.cmml">⁢</mo><mi id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.5" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.5.cmml">c</mi><mo id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.2b" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.2.cmml">⁢</mo><mrow id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.2.cmml"><mo id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.2" stretchy="false" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.2.1.cmml">[</mo><msub id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.cmml"><mi id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.2" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.2.cmml">m</mi><mrow id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.3" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.3.cmml"><mi id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.3.2" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.3.2.cmml">j</mi><mo id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.3.1" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.3.1.cmml">⁢</mo><mn 
id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.3.3" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.3.3.cmml">2</mn></mrow></msub><mo id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.3" stretchy="false" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.2.1.cmml">]</mo></mrow></mrow></mrow><mo id="S3.E1.m1.3.3.3.1.1.1.1.1.1.3" stretchy="false" xref="S3.E1.m1.3.3.3.1.1.1.1.1.2.1.cmml">|</mo></mrow></mrow><mo id="S3.E1.m1.3.3.3.1.1.1.3" stretchy="false" xref="S3.E1.m1.3.3.3.1.1.1.1.cmml">)</mo></mrow></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.E1.m1.3b"><apply id="S3.E1.m1.3.3.cmml" xref="S3.E1.m1.3.3"><eq id="S3.E1.m1.3.3.4.cmml" xref="S3.E1.m1.3.3.4"></eq><apply id="S3.E1.m1.2.2.2.cmml" xref="S3.E1.m1.2.2.2"><times id="S3.E1.m1.2.2.2.3.cmml" xref="S3.E1.m1.2.2.2.3"></times><ci id="S3.E1.m1.2.2.2.4.cmml" xref="S3.E1.m1.2.2.2.4">𝑠</ci><ci id="S3.E1.m1.2.2.2.5.cmml" xref="S3.E1.m1.2.2.2.5">𝑖</ci><ci id="S3.E1.m1.2.2.2.6.cmml" xref="S3.E1.m1.2.2.2.6">𝑚</ci><interval closure="open" id="S3.E1.m1.2.2.2.2.3.cmml" xref="S3.E1.m1.2.2.2.2.2"><apply id="S3.E1.m1.1.1.1.1.1.1.cmml" xref="S3.E1.m1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.E1.m1.1.1.1.1.1.1.1.cmml" xref="S3.E1.m1.1.1.1.1.1.1">subscript</csymbol><ci id="S3.E1.m1.1.1.1.1.1.1.2.cmml" xref="S3.E1.m1.1.1.1.1.1.1.2">𝑚</ci><apply id="S3.E1.m1.1.1.1.1.1.1.3.cmml" xref="S3.E1.m1.1.1.1.1.1.1.3"><times id="S3.E1.m1.1.1.1.1.1.1.3.1.cmml" xref="S3.E1.m1.1.1.1.1.1.1.3.1"></times><ci id="S3.E1.m1.1.1.1.1.1.1.3.2.cmml" xref="S3.E1.m1.1.1.1.1.1.1.3.2">𝑗</ci><cn id="S3.E1.m1.1.1.1.1.1.1.3.3.cmml" type="integer" xref="S3.E1.m1.1.1.1.1.1.1.3.3">1</cn></apply></apply><apply id="S3.E1.m1.2.2.2.2.2.2.cmml" xref="S3.E1.m1.2.2.2.2.2.2"><csymbol cd="ambiguous" id="S3.E1.m1.2.2.2.2.2.2.1.cmml" xref="S3.E1.m1.2.2.2.2.2.2">subscript</csymbol><ci id="S3.E1.m1.2.2.2.2.2.2.2.cmml" xref="S3.E1.m1.2.2.2.2.2.2.2">𝑚</ci><apply id="S3.E1.m1.2.2.2.2.2.2.3.cmml" xref="S3.E1.m1.2.2.2.2.2.2.3"><times id="S3.E1.m1.2.2.2.2.2.2.3.1.cmml" 
xref="S3.E1.m1.2.2.2.2.2.2.3.1"></times><ci id="S3.E1.m1.2.2.2.2.2.2.3.2.cmml" xref="S3.E1.m1.2.2.2.2.2.2.3.2">𝑗</ci><cn id="S3.E1.m1.2.2.2.2.2.2.3.3.cmml" type="integer" xref="S3.E1.m1.2.2.2.2.2.2.3.3">2</cn></apply></apply></interval></apply><apply id="S3.E1.m1.3.3.3.cmml" xref="S3.E1.m1.3.3.3"><minus id="S3.E1.m1.3.3.3.2.cmml" xref="S3.E1.m1.3.3.3.2"></minus><cn id="S3.E1.m1.3.3.3.3.cmml" type="integer" xref="S3.E1.m1.3.3.3.3">1</cn><apply id="S3.E1.m1.3.3.3.1.cmml" xref="S3.E1.m1.3.3.3.1"><times id="S3.E1.m1.3.3.3.1.2.cmml" xref="S3.E1.m1.3.3.3.1.2"></times><ci id="S3.E1.m1.3.3.3.1.3.cmml" xref="S3.E1.m1.3.3.3.1.3">𝑎</ci><ci id="S3.E1.m1.3.3.3.1.4.cmml" xref="S3.E1.m1.3.3.3.1.4">𝑣</ci><ci id="S3.E1.m1.3.3.3.1.5.cmml" xref="S3.E1.m1.3.3.3.1.5">𝑔</ci><apply id="S3.E1.m1.3.3.3.1.1.1.1.cmml" xref="S3.E1.m1.3.3.3.1.1.1"><times id="S3.E1.m1.3.3.3.1.1.1.1.2.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.2"></times><ci id="S3.E1.m1.3.3.3.1.1.1.1.3.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.3">𝑡</ci><ci id="S3.E1.m1.3.3.3.1.1.1.1.4.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.4">𝑜</ci><apply id="S3.E1.m1.3.3.3.1.1.1.1.5.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.5"><csymbol cd="ambiguous" id="S3.E1.m1.3.3.3.1.1.1.1.5.1.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.5">subscript</csymbol><ci id="S3.E1.m1.3.3.3.1.1.1.1.5.2.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.5.2">𝑝</ci><ci id="S3.E1.m1.3.3.3.1.1.1.1.5.3.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.5.3">𝑘</ci></apply><apply id="S3.E1.m1.3.3.3.1.1.1.1.1.2.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1"><abs id="S3.E1.m1.3.3.3.1.1.1.1.1.2.1.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.2"></abs><apply id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1"><minus id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.3.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.3"></minus><apply id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1"><times id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.2.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.2"></times><ci id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.3.cmml" 
xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.3">𝑣</ci><ci id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.4.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.4">𝑒</ci><ci id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.5.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.5">𝑐</ci><apply id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.2.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1"><csymbol cd="latexml" id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.2">delimited-[]</csymbol><apply id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1">subscript</csymbol><ci id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.2">𝑚</ci><apply id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.3"><times id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.3.1"></times><ci id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.3.2">𝑗</ci><cn id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml" type="integer" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.1.1.1.1.3.3">1</cn></apply></apply></apply></apply><apply id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2"><times id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.2.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.2"></times><ci id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.3.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.3">𝑣</ci><ci id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.4.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.4">𝑒</ci><ci id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.5.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.5">𝑐</ci><apply id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.2.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1"><csymbol cd="latexml" id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.2.1.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.2">delimited-[]</csymbol><apply 
id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1"><csymbol cd="ambiguous" id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.1.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1">subscript</csymbol><ci id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.2.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.2">𝑚</ci><apply id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.3.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.3"><times id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.3.1.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.3.1"></times><ci id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.3.2.cmml" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.3.2">𝑗</ci><cn id="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.3.3.cmml" type="integer" xref="S3.E1.m1.3.3.3.1.1.1.1.1.1.1.2.1.1.1.3.3">2</cn></apply></apply></apply></apply></apply></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E1.m1.3c">sim(m_{j1},m_{j2})=1-avg(top_{k}{|vec[m_{j1}]-vec[m_{j2}]|})</annotation><annotation encoding="application/x-llamapun" id="S3.E1.m1.3d">italic_s italic_i italic_m ( italic_m start_POSTSUBSCRIPT italic_j 1 end_POSTSUBSCRIPT , italic_m start_POSTSUBSCRIPT italic_j 2 end_POSTSUBSCRIPT ) = 1 - italic_a italic_v italic_g ( italic_t italic_o italic_p start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT | italic_v italic_e italic_c [ italic_m start_POSTSUBSCRIPT italic_j 1 end_POSTSUBSCRIPT ] - italic_v italic_e italic_c [ italic_m start_POSTSUBSCRIPT italic_j 2 end_POSTSUBSCRIPT ] | )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(1)</span></td> </tr></tbody> </table> </div> <div class="ltx_para" id="S3.SS1.p4"> <p class="ltx_p" id="S3.SS1.p4.3">For the performance matrix <math alttext="Matrix(D,M)" class="ltx_Math" display="inline" id="S3.SS1.p4.1.m1.2"><semantics 
id="S3.SS1.p4.1.m1.2a"><mrow id="S3.SS1.p4.1.m1.2.3" xref="S3.SS1.p4.1.m1.2.3.cmml"><mi id="S3.SS1.p4.1.m1.2.3.2" xref="S3.SS1.p4.1.m1.2.3.2.cmml">M</mi><mo id="S3.SS1.p4.1.m1.2.3.1" xref="S3.SS1.p4.1.m1.2.3.1.cmml">⁢</mo><mi id="S3.SS1.p4.1.m1.2.3.3" xref="S3.SS1.p4.1.m1.2.3.3.cmml">a</mi><mo id="S3.SS1.p4.1.m1.2.3.1a" xref="S3.SS1.p4.1.m1.2.3.1.cmml">⁢</mo><mi id="S3.SS1.p4.1.m1.2.3.4" xref="S3.SS1.p4.1.m1.2.3.4.cmml">t</mi><mo id="S3.SS1.p4.1.m1.2.3.1b" xref="S3.SS1.p4.1.m1.2.3.1.cmml">⁢</mo><mi id="S3.SS1.p4.1.m1.2.3.5" xref="S3.SS1.p4.1.m1.2.3.5.cmml">r</mi><mo id="S3.SS1.p4.1.m1.2.3.1c" xref="S3.SS1.p4.1.m1.2.3.1.cmml">⁢</mo><mi id="S3.SS1.p4.1.m1.2.3.6" xref="S3.SS1.p4.1.m1.2.3.6.cmml">i</mi><mo id="S3.SS1.p4.1.m1.2.3.1d" xref="S3.SS1.p4.1.m1.2.3.1.cmml">⁢</mo><mi id="S3.SS1.p4.1.m1.2.3.7" xref="S3.SS1.p4.1.m1.2.3.7.cmml">x</mi><mo id="S3.SS1.p4.1.m1.2.3.1e" xref="S3.SS1.p4.1.m1.2.3.1.cmml">⁢</mo><mrow id="S3.SS1.p4.1.m1.2.3.8.2" xref="S3.SS1.p4.1.m1.2.3.8.1.cmml"><mo id="S3.SS1.p4.1.m1.2.3.8.2.1" stretchy="false" xref="S3.SS1.p4.1.m1.2.3.8.1.cmml">(</mo><mi id="S3.SS1.p4.1.m1.1.1" xref="S3.SS1.p4.1.m1.1.1.cmml">D</mi><mo id="S3.SS1.p4.1.m1.2.3.8.2.2" xref="S3.SS1.p4.1.m1.2.3.8.1.cmml">,</mo><mi id="S3.SS1.p4.1.m1.2.2" xref="S3.SS1.p4.1.m1.2.2.cmml">M</mi><mo id="S3.SS1.p4.1.m1.2.3.8.2.3" stretchy="false" xref="S3.SS1.p4.1.m1.2.3.8.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p4.1.m1.2b"><apply id="S3.SS1.p4.1.m1.2.3.cmml" xref="S3.SS1.p4.1.m1.2.3"><times id="S3.SS1.p4.1.m1.2.3.1.cmml" xref="S3.SS1.p4.1.m1.2.3.1"></times><ci id="S3.SS1.p4.1.m1.2.3.2.cmml" xref="S3.SS1.p4.1.m1.2.3.2">𝑀</ci><ci id="S3.SS1.p4.1.m1.2.3.3.cmml" xref="S3.SS1.p4.1.m1.2.3.3">𝑎</ci><ci id="S3.SS1.p4.1.m1.2.3.4.cmml" xref="S3.SS1.p4.1.m1.2.3.4">𝑡</ci><ci id="S3.SS1.p4.1.m1.2.3.5.cmml" xref="S3.SS1.p4.1.m1.2.3.5">𝑟</ci><ci id="S3.SS1.p4.1.m1.2.3.6.cmml" xref="S3.SS1.p4.1.m1.2.3.6">𝑖</ci><ci id="S3.SS1.p4.1.m1.2.3.7.cmml" 
xref="S3.SS1.p4.1.m1.2.3.7">𝑥</ci><interval closure="open" id="S3.SS1.p4.1.m1.2.3.8.1.cmml" xref="S3.SS1.p4.1.m1.2.3.8.2"><ci id="S3.SS1.p4.1.m1.1.1.cmml" xref="S3.SS1.p4.1.m1.1.1">𝐷</ci><ci id="S3.SS1.p4.1.m1.2.2.cmml" xref="S3.SS1.p4.1.m1.2.2">𝑀</ci></interval></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p4.1.m1.2c">Matrix(D,M)</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p4.1.m1.2d">italic_M italic_a italic_t italic_r italic_i italic_x ( italic_D , italic_M )</annotation></semantics></math>, although there are <math alttext="m\cdot n" class="ltx_Math" display="inline" id="S3.SS1.p4.2.m2.1"><semantics id="S3.SS1.p4.2.m2.1a"><mrow id="S3.SS1.p4.2.m2.1.1" xref="S3.SS1.p4.2.m2.1.1.cmml"><mi id="S3.SS1.p4.2.m2.1.1.2" xref="S3.SS1.p4.2.m2.1.1.2.cmml">m</mi><mo id="S3.SS1.p4.2.m2.1.1.1" lspace="0.222em" rspace="0.222em" xref="S3.SS1.p4.2.m2.1.1.1.cmml">⋅</mo><mi id="S3.SS1.p4.2.m2.1.1.3" xref="S3.SS1.p4.2.m2.1.1.3.cmml">n</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p4.2.m2.1b"><apply id="S3.SS1.p4.2.m2.1.1.cmml" xref="S3.SS1.p4.2.m2.1.1"><ci id="S3.SS1.p4.2.m2.1.1.1.cmml" xref="S3.SS1.p4.2.m2.1.1.1">⋅</ci><ci id="S3.SS1.p4.2.m2.1.1.2.cmml" xref="S3.SS1.p4.2.m2.1.1.2">𝑚</ci><ci id="S3.SS1.p4.2.m2.1.1.3.cmml" xref="S3.SS1.p4.2.m2.1.1.3">𝑛</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p4.2.m2.1c">m\cdot n</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p4.2.m2.1d">italic_m ⋅ italic_n</annotation></semantics></math> elements which need <math alttext="m\cdot n" class="ltx_Math" display="inline" id="S3.SS1.p4.3.m3.1"><semantics id="S3.SS1.p4.3.m3.1a"><mrow id="S3.SS1.p4.3.m3.1.1" xref="S3.SS1.p4.3.m3.1.1.cmml"><mi id="S3.SS1.p4.3.m3.1.1.2" xref="S3.SS1.p4.3.m3.1.1.2.cmml">m</mi><mo id="S3.SS1.p4.3.m3.1.1.1" lspace="0.222em" rspace="0.222em" xref="S3.SS1.p4.3.m3.1.1.1.cmml">⋅</mo><mi id="S3.SS1.p4.3.m3.1.1.3" 
xref="S3.SS1.p4.3.m3.1.1.3.cmml">n</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p4.3.m3.1b"><apply id="S3.SS1.p4.3.m3.1.1.cmml" xref="S3.SS1.p4.3.m3.1.1"><ci id="S3.SS1.p4.3.m3.1.1.1.cmml" xref="S3.SS1.p4.3.m3.1.1.1">⋅</ci><ci id="S3.SS1.p4.3.m3.1.1.2.cmml" xref="S3.SS1.p4.3.m3.1.1.2">𝑚</ci><ci id="S3.SS1.p4.3.m3.1.1.3.cmml" xref="S3.SS1.p4.3.m3.1.1.3">𝑛</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p4.3.m3.1c">m\cdot n</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p4.3.m3.1d">italic_m ⋅ italic_n</annotation></semantics></math> training runs, it is not necessary to train a pre-trained model on the whole benchmark dataset, since only the top accuracy differences are used to measure model similarity. In practice, the training performance on a relatively small subset of the training data could be enough.</p> </div> <div class="ltx_para" id="S3.SS1.p5"> <p class="ltx_p" id="S3.SS1.p5.7">Based on the above model similarity measurement, we can adopt state-of-the-art clustering algorithms to group the models in the model repository, such as K-means <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib13" title="">13</a>]</cite>, hierarchical clustering <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib14" title="">14</a>]</cite>, etc. 
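</p> <p class="ltx_p">As a minimal sketch of how Eq. (1) and the clustering step could fit together, the following Python code computes the top-k similarity between two accuracy vectors and then groups models with a simple average-linkage agglomerative procedure. This procedure is only a stand-in for hierarchical clustering and is not necessarily the variant used in this work. The names sim, cluster_models, perf and n_clusters, as well as the model names and accuracy values, are assumptions made for illustration.</p>
<pre class="ltx_verbatim ltx_font_typewriter">
```python
from itertools import combinations


def sim(vec_a, vec_b, k):
    """Eq. (1): 1 - avg(top_k(|vec_a - vec_b|)), accuracies assumed in [0, 1]."""
    diffs = sorted((abs(a - b) for a, b in zip(vec_a, vec_b)), reverse=True)
    top_k = diffs[:k]  # the k datasets where the two models differ most
    return 1.0 - sum(top_k) / len(top_k)


def cluster_models(perf, k, n_clusters):
    """Merge the two most similar clusters until n_clusters remain.

    perf maps a model name to its accuracy vector over the benchmark datasets.
    Similarity between clusters is the average pairwise sim() (average linkage).
    """
    clusters = [[name] for name in perf]
    n_clusters = max(1, min(n_clusters, len(clusters)))

    def linkage(pair):
        left, right = pair
        sims = [sim(perf[a], perf[b], k) for a in left for b in right]
        return sum(sims) / len(sims)

    while len(clusters) != n_clusters:
        left, right = max(combinations(clusters, 2), key=linkage)
        clusters = [c for c in clusters if c is not left and c is not right]
        clusters.append(left + right)
    return clusters


# Hypothetical accuracies of four models on four benchmark datasets.
perf = {
    "model_a": [0.90, 0.80, 0.70, 0.60],
    "model_b": [0.88, 0.82, 0.69, 0.61],
    "model_c": [0.50, 0.95, 0.40, 0.90],
    "model_d": [0.52, 0.93, 0.45, 0.88],
}
clusters = cluster_models(perf, k=2, n_clusters=2)
print(clusters)
```
</pre>
<p class="ltx_p">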
After model clustering, for each cluster <math alttext="C_{i}" class="ltx_Math" display="inline" id="S3.SS1.p5.1.m1.1"><semantics id="S3.SS1.p5.1.m1.1a"><msub id="S3.SS1.p5.1.m1.1.1" xref="S3.SS1.p5.1.m1.1.1.cmml"><mi id="S3.SS1.p5.1.m1.1.1.2" xref="S3.SS1.p5.1.m1.1.1.2.cmml">C</mi><mi id="S3.SS1.p5.1.m1.1.1.3" xref="S3.SS1.p5.1.m1.1.1.3.cmml">i</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS1.p5.1.m1.1b"><apply id="S3.SS1.p5.1.m1.1.1.cmml" xref="S3.SS1.p5.1.m1.1.1"><csymbol cd="ambiguous" id="S3.SS1.p5.1.m1.1.1.1.cmml" xref="S3.SS1.p5.1.m1.1.1">subscript</csymbol><ci id="S3.SS1.p5.1.m1.1.1.2.cmml" xref="S3.SS1.p5.1.m1.1.1.2">𝐶</ci><ci id="S3.SS1.p5.1.m1.1.1.3.cmml" xref="S3.SS1.p5.1.m1.1.1.3">𝑖</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p5.1.m1.1c">C_{i}</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p5.1.m1.1d">italic_C start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT</annotation></semantics></math>, the model that belongs to <math alttext="C_{i}" class="ltx_Math" display="inline" id="S3.SS1.p5.2.m2.1"><semantics id="S3.SS1.p5.2.m2.1a"><msub id="S3.SS1.p5.2.m2.1.1" xref="S3.SS1.p5.2.m2.1.1.cmml"><mi id="S3.SS1.p5.2.m2.1.1.2" xref="S3.SS1.p5.2.m2.1.1.2.cmml">C</mi><mi id="S3.SS1.p5.2.m2.1.1.3" xref="S3.SS1.p5.2.m2.1.1.3.cmml">i</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS1.p5.2.m2.1b"><apply id="S3.SS1.p5.2.m2.1.1.cmml" xref="S3.SS1.p5.2.m2.1.1"><csymbol cd="ambiguous" id="S3.SS1.p5.2.m2.1.1.1.cmml" xref="S3.SS1.p5.2.m2.1.1">subscript</csymbol><ci id="S3.SS1.p5.2.m2.1.1.2.cmml" xref="S3.SS1.p5.2.m2.1.1.2">𝐶</ci><ci id="S3.SS1.p5.2.m2.1.1.3.cmml" xref="S3.SS1.p5.2.m2.1.1.3">𝑖</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p5.2.m2.1c">C_{i}</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p5.2.m2.1d">italic_C start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT</annotation></semantics></math> and has the maximum average training 
performance on benchmark datasets is selected as the representative model, denoted as <math alttext="m(C_{i})" class="ltx_Math" display="inline" id="S3.SS1.p5.3.m3.1"><semantics id="S3.SS1.p5.3.m3.1a"><mrow id="S3.SS1.p5.3.m3.1.1" xref="S3.SS1.p5.3.m3.1.1.cmml"><mi id="S3.SS1.p5.3.m3.1.1.3" xref="S3.SS1.p5.3.m3.1.1.3.cmml">m</mi><mo id="S3.SS1.p5.3.m3.1.1.2" xref="S3.SS1.p5.3.m3.1.1.2.cmml">⁢</mo><mrow id="S3.SS1.p5.3.m3.1.1.1.1" xref="S3.SS1.p5.3.m3.1.1.1.1.1.cmml"><mo id="S3.SS1.p5.3.m3.1.1.1.1.2" stretchy="false" xref="S3.SS1.p5.3.m3.1.1.1.1.1.cmml">(</mo><msub id="S3.SS1.p5.3.m3.1.1.1.1.1" xref="S3.SS1.p5.3.m3.1.1.1.1.1.cmml"><mi id="S3.SS1.p5.3.m3.1.1.1.1.1.2" xref="S3.SS1.p5.3.m3.1.1.1.1.1.2.cmml">C</mi><mi id="S3.SS1.p5.3.m3.1.1.1.1.1.3" xref="S3.SS1.p5.3.m3.1.1.1.1.1.3.cmml">i</mi></msub><mo id="S3.SS1.p5.3.m3.1.1.1.1.3" stretchy="false" xref="S3.SS1.p5.3.m3.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p5.3.m3.1b"><apply id="S3.SS1.p5.3.m3.1.1.cmml" xref="S3.SS1.p5.3.m3.1.1"><times id="S3.SS1.p5.3.m3.1.1.2.cmml" xref="S3.SS1.p5.3.m3.1.1.2"></times><ci id="S3.SS1.p5.3.m3.1.1.3.cmml" xref="S3.SS1.p5.3.m3.1.1.3">𝑚</ci><apply id="S3.SS1.p5.3.m3.1.1.1.1.1.cmml" xref="S3.SS1.p5.3.m3.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS1.p5.3.m3.1.1.1.1.1.1.cmml" xref="S3.SS1.p5.3.m3.1.1.1.1">subscript</csymbol><ci id="S3.SS1.p5.3.m3.1.1.1.1.1.2.cmml" xref="S3.SS1.p5.3.m3.1.1.1.1.1.2">𝐶</ci><ci id="S3.SS1.p5.3.m3.1.1.1.1.1.3.cmml" xref="S3.SS1.p5.3.m3.1.1.1.1.1.3">𝑖</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p5.3.m3.1c">m(C_{i})</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p5.3.m3.1d">italic_m ( italic_C start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT )</annotation></semantics></math>. 
Then, the proxy score between the target dataset <math alttext="d(T)" class="ltx_Math" display="inline" id="S3.SS1.p5.4.m4.1"><semantics id="S3.SS1.p5.4.m4.1a"><mrow id="S3.SS1.p5.4.m4.1.2" xref="S3.SS1.p5.4.m4.1.2.cmml"><mi id="S3.SS1.p5.4.m4.1.2.2" xref="S3.SS1.p5.4.m4.1.2.2.cmml">d</mi><mo id="S3.SS1.p5.4.m4.1.2.1" xref="S3.SS1.p5.4.m4.1.2.1.cmml">⁢</mo><mrow id="S3.SS1.p5.4.m4.1.2.3.2" xref="S3.SS1.p5.4.m4.1.2.cmml"><mo id="S3.SS1.p5.4.m4.1.2.3.2.1" stretchy="false" xref="S3.SS1.p5.4.m4.1.2.cmml">(</mo><mi id="S3.SS1.p5.4.m4.1.1" xref="S3.SS1.p5.4.m4.1.1.cmml">T</mi><mo id="S3.SS1.p5.4.m4.1.2.3.2.2" stretchy="false" xref="S3.SS1.p5.4.m4.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p5.4.m4.1b"><apply id="S3.SS1.p5.4.m4.1.2.cmml" xref="S3.SS1.p5.4.m4.1.2"><times id="S3.SS1.p5.4.m4.1.2.1.cmml" xref="S3.SS1.p5.4.m4.1.2.1"></times><ci id="S3.SS1.p5.4.m4.1.2.2.cmml" xref="S3.SS1.p5.4.m4.1.2.2">𝑑</ci><ci id="S3.SS1.p5.4.m4.1.1.cmml" xref="S3.SS1.p5.4.m4.1.1">𝑇</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p5.4.m4.1c">d(T)</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p5.4.m4.1d">italic_d ( italic_T )</annotation></semantics></math> and model cluster <math alttext="C_{i}" class="ltx_Math" display="inline" id="S3.SS1.p5.5.m5.1"><semantics id="S3.SS1.p5.5.m5.1a"><msub id="S3.SS1.p5.5.m5.1.1" xref="S3.SS1.p5.5.m5.1.1.cmml"><mi id="S3.SS1.p5.5.m5.1.1.2" xref="S3.SS1.p5.5.m5.1.1.2.cmml">C</mi><mi id="S3.SS1.p5.5.m5.1.1.3" xref="S3.SS1.p5.5.m5.1.1.3.cmml">i</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS1.p5.5.m5.1b"><apply id="S3.SS1.p5.5.m5.1.1.cmml" xref="S3.SS1.p5.5.m5.1.1"><csymbol cd="ambiguous" id="S3.SS1.p5.5.m5.1.1.1.cmml" xref="S3.SS1.p5.5.m5.1.1">subscript</csymbol><ci id="S3.SS1.p5.5.m5.1.1.2.cmml" xref="S3.SS1.p5.5.m5.1.1.2">𝐶</ci><ci id="S3.SS1.p5.5.m5.1.1.3.cmml" xref="S3.SS1.p5.5.m5.1.1.3">𝑖</ci></apply></annotation-xml><annotation 
encoding="application/x-tex" id="S3.SS1.p5.5.m5.1c">C_{i}</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p5.5.m5.1d">italic_C start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT</annotation></semantics></math> can be calculated as <math alttext="proxy\_score(T|m(C_{i}))" class="ltx_Math" display="inline" id="S3.SS1.p5.6.m6.1"><semantics id="S3.SS1.p5.6.m6.1a"><mrow id="S3.SS1.p5.6.m6.1.1" xref="S3.SS1.p5.6.m6.1.1.cmml"><mi id="S3.SS1.p5.6.m6.1.1.3" xref="S3.SS1.p5.6.m6.1.1.3.cmml">p</mi><mo id="S3.SS1.p5.6.m6.1.1.2" xref="S3.SS1.p5.6.m6.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p5.6.m6.1.1.4" xref="S3.SS1.p5.6.m6.1.1.4.cmml">r</mi><mo id="S3.SS1.p5.6.m6.1.1.2a" xref="S3.SS1.p5.6.m6.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p5.6.m6.1.1.5" xref="S3.SS1.p5.6.m6.1.1.5.cmml">o</mi><mo id="S3.SS1.p5.6.m6.1.1.2b" xref="S3.SS1.p5.6.m6.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p5.6.m6.1.1.6" xref="S3.SS1.p5.6.m6.1.1.6.cmml">x</mi><mo id="S3.SS1.p5.6.m6.1.1.2c" xref="S3.SS1.p5.6.m6.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p5.6.m6.1.1.7" xref="S3.SS1.p5.6.m6.1.1.7.cmml">y</mi><mo id="S3.SS1.p5.6.m6.1.1.2d" xref="S3.SS1.p5.6.m6.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p5.6.m6.1.1.8" mathvariant="normal" xref="S3.SS1.p5.6.m6.1.1.8.cmml">_</mi><mo id="S3.SS1.p5.6.m6.1.1.2e" xref="S3.SS1.p5.6.m6.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p5.6.m6.1.1.9" xref="S3.SS1.p5.6.m6.1.1.9.cmml">s</mi><mo id="S3.SS1.p5.6.m6.1.1.2f" xref="S3.SS1.p5.6.m6.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p5.6.m6.1.1.10" xref="S3.SS1.p5.6.m6.1.1.10.cmml">c</mi><mo id="S3.SS1.p5.6.m6.1.1.2g" xref="S3.SS1.p5.6.m6.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p5.6.m6.1.1.11" xref="S3.SS1.p5.6.m6.1.1.11.cmml">o</mi><mo id="S3.SS1.p5.6.m6.1.1.2h" xref="S3.SS1.p5.6.m6.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p5.6.m6.1.1.12" xref="S3.SS1.p5.6.m6.1.1.12.cmml">r</mi><mo id="S3.SS1.p5.6.m6.1.1.2i" xref="S3.SS1.p5.6.m6.1.1.2.cmml">⁢</mo><mi id="S3.SS1.p5.6.m6.1.1.13" xref="S3.SS1.p5.6.m6.1.1.13.cmml">e</mi><mo id="S3.SS1.p5.6.m6.1.1.2j" 
xref="S3.SS1.p5.6.m6.1.1.2.cmml">⁢</mo><mrow id="S3.SS1.p5.6.m6.1.1.1.1" xref="S3.SS1.p5.6.m6.1.1.1.1.1.cmml"><mo id="S3.SS1.p5.6.m6.1.1.1.1.2" stretchy="false" xref="S3.SS1.p5.6.m6.1.1.1.1.1.cmml">(</mo><mrow id="S3.SS1.p5.6.m6.1.1.1.1.1" xref="S3.SS1.p5.6.m6.1.1.1.1.1.cmml"><mi id="S3.SS1.p5.6.m6.1.1.1.1.1.3" xref="S3.SS1.p5.6.m6.1.1.1.1.1.3.cmml">T</mi><mo fence="false" id="S3.SS1.p5.6.m6.1.1.1.1.1.2" xref="S3.SS1.p5.6.m6.1.1.1.1.1.2.cmml">|</mo><mrow id="S3.SS1.p5.6.m6.1.1.1.1.1.1" xref="S3.SS1.p5.6.m6.1.1.1.1.1.1.cmml"><mi id="S3.SS1.p5.6.m6.1.1.1.1.1.1.3" xref="S3.SS1.p5.6.m6.1.1.1.1.1.1.3.cmml">m</mi><mo id="S3.SS1.p5.6.m6.1.1.1.1.1.1.2" xref="S3.SS1.p5.6.m6.1.1.1.1.1.1.2.cmml">⁢</mo><mrow id="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1" xref="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1.1.cmml"><mo id="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1.2" stretchy="false" xref="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1.1.cmml">(</mo><msub id="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1.1" xref="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1.1.cmml"><mi id="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1.1.2" xref="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1.1.2.cmml">C</mi><mi id="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1.1.3" xref="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1.1.3.cmml">i</mi></msub><mo id="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1.3" stretchy="false" xref="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow></mrow><mo id="S3.SS1.p5.6.m6.1.1.1.1.3" stretchy="false" xref="S3.SS1.p5.6.m6.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p5.6.m6.1b"><apply id="S3.SS1.p5.6.m6.1.1.cmml" xref="S3.SS1.p5.6.m6.1.1"><times id="S3.SS1.p5.6.m6.1.1.2.cmml" xref="S3.SS1.p5.6.m6.1.1.2"></times><ci id="S3.SS1.p5.6.m6.1.1.3.cmml" xref="S3.SS1.p5.6.m6.1.1.3">𝑝</ci><ci id="S3.SS1.p5.6.m6.1.1.4.cmml" xref="S3.SS1.p5.6.m6.1.1.4">𝑟</ci><ci id="S3.SS1.p5.6.m6.1.1.5.cmml" xref="S3.SS1.p5.6.m6.1.1.5">𝑜</ci><ci id="S3.SS1.p5.6.m6.1.1.6.cmml" xref="S3.SS1.p5.6.m6.1.1.6">𝑥</ci><ci id="S3.SS1.p5.6.m6.1.1.7.cmml" xref="S3.SS1.p5.6.m6.1.1.7">𝑦</ci><ci 
id="S3.SS1.p5.6.m6.1.1.8.cmml" xref="S3.SS1.p5.6.m6.1.1.8">_</ci><ci id="S3.SS1.p5.6.m6.1.1.9.cmml" xref="S3.SS1.p5.6.m6.1.1.9">𝑠</ci><ci id="S3.SS1.p5.6.m6.1.1.10.cmml" xref="S3.SS1.p5.6.m6.1.1.10">𝑐</ci><ci id="S3.SS1.p5.6.m6.1.1.11.cmml" xref="S3.SS1.p5.6.m6.1.1.11">𝑜</ci><ci id="S3.SS1.p5.6.m6.1.1.12.cmml" xref="S3.SS1.p5.6.m6.1.1.12">𝑟</ci><ci id="S3.SS1.p5.6.m6.1.1.13.cmml" xref="S3.SS1.p5.6.m6.1.1.13">𝑒</ci><apply id="S3.SS1.p5.6.m6.1.1.1.1.1.cmml" xref="S3.SS1.p5.6.m6.1.1.1.1"><csymbol cd="latexml" id="S3.SS1.p5.6.m6.1.1.1.1.1.2.cmml" xref="S3.SS1.p5.6.m6.1.1.1.1.1.2">conditional</csymbol><ci id="S3.SS1.p5.6.m6.1.1.1.1.1.3.cmml" xref="S3.SS1.p5.6.m6.1.1.1.1.1.3">𝑇</ci><apply id="S3.SS1.p5.6.m6.1.1.1.1.1.1.cmml" xref="S3.SS1.p5.6.m6.1.1.1.1.1.1"><times id="S3.SS1.p5.6.m6.1.1.1.1.1.1.2.cmml" xref="S3.SS1.p5.6.m6.1.1.1.1.1.1.2"></times><ci id="S3.SS1.p5.6.m6.1.1.1.1.1.1.3.cmml" xref="S3.SS1.p5.6.m6.1.1.1.1.1.1.3">𝑚</ci><apply id="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1.1.cmml" xref="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1.1.1.cmml" xref="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1">subscript</csymbol><ci id="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1.1.2.cmml" xref="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1.1.2">𝐶</ci><ci id="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1.1.3.cmml" xref="S3.SS1.p5.6.m6.1.1.1.1.1.1.1.1.1.3">𝑖</ci></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p5.6.m6.1c">proxy\_score(T|m(C_{i}))</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p5.6.m6.1d">italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e ( italic_T | italic_m ( italic_C start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) )</annotation></semantics></math>, which avoids computing the proxy score online for all the models on <math alttext="d(T)" class="ltx_Math" display="inline" id="S3.SS1.p5.7.m7.1"><semantics id="S3.SS1.p5.7.m7.1a"><mrow id="S3.SS1.p5.7.m7.1.2" 
xref="S3.SS1.p5.7.m7.1.2.cmml"><mi id="S3.SS1.p5.7.m7.1.2.2" xref="S3.SS1.p5.7.m7.1.2.2.cmml">d</mi><mo id="S3.SS1.p5.7.m7.1.2.1" xref="S3.SS1.p5.7.m7.1.2.1.cmml">⁢</mo><mrow id="S3.SS1.p5.7.m7.1.2.3.2" xref="S3.SS1.p5.7.m7.1.2.cmml"><mo id="S3.SS1.p5.7.m7.1.2.3.2.1" stretchy="false" xref="S3.SS1.p5.7.m7.1.2.cmml">(</mo><mi id="S3.SS1.p5.7.m7.1.1" xref="S3.SS1.p5.7.m7.1.1.cmml">T</mi><mo id="S3.SS1.p5.7.m7.1.2.3.2.2" stretchy="false" xref="S3.SS1.p5.7.m7.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p5.7.m7.1b"><apply id="S3.SS1.p5.7.m7.1.2.cmml" xref="S3.SS1.p5.7.m7.1.2"><times id="S3.SS1.p5.7.m7.1.2.1.cmml" xref="S3.SS1.p5.7.m7.1.2.1"></times><ci id="S3.SS1.p5.7.m7.1.2.2.cmml" xref="S3.SS1.p5.7.m7.1.2.2">𝑑</ci><ci id="S3.SS1.p5.7.m7.1.1.cmml" xref="S3.SS1.p5.7.m7.1.1">𝑇</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p5.7.m7.1c">d(T)</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p5.7.m7.1d">italic_d ( italic_T )</annotation></semantics></math> in the coarse-recall phase.</p> </div> </section> <section class="ltx_subsection" id="S3.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S3.SS2.4.1.1">III-B</span> </span><span class="ltx_text ltx_font_italic" id="S3.SS2.5.2">Model Recall</span> </h3> <div class="ltx_para" id="S3.SS2.p1"> <p class="ltx_p" id="S3.SS2.p1.12">Based on the training performance matrix and model clustering result, a <math alttext="recall\_score" class="ltx_Math" display="inline" id="S3.SS2.p1.1.m1.1"><semantics id="S3.SS2.p1.1.m1.1a"><mrow id="S3.SS2.p1.1.m1.1.1" xref="S3.SS2.p1.1.m1.1.1.cmml"><mi id="S3.SS2.p1.1.m1.1.1.2" xref="S3.SS2.p1.1.m1.1.1.2.cmml">r</mi><mo id="S3.SS2.p1.1.m1.1.1.1" xref="S3.SS2.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.1.m1.1.1.3" xref="S3.SS2.p1.1.m1.1.1.3.cmml">e</mi><mo id="S3.SS2.p1.1.m1.1.1.1a" xref="S3.SS2.p1.1.m1.1.1.1.cmml">⁢</mo><mi 
id="S3.SS2.p1.1.m1.1.1.4" xref="S3.SS2.p1.1.m1.1.1.4.cmml">c</mi><mo id="S3.SS2.p1.1.m1.1.1.1b" xref="S3.SS2.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.1.m1.1.1.5" xref="S3.SS2.p1.1.m1.1.1.5.cmml">a</mi><mo id="S3.SS2.p1.1.m1.1.1.1c" xref="S3.SS2.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.1.m1.1.1.6" xref="S3.SS2.p1.1.m1.1.1.6.cmml">l</mi><mo id="S3.SS2.p1.1.m1.1.1.1d" xref="S3.SS2.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.1.m1.1.1.7" xref="S3.SS2.p1.1.m1.1.1.7.cmml">l</mi><mo id="S3.SS2.p1.1.m1.1.1.1e" xref="S3.SS2.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.1.m1.1.1.8" mathvariant="normal" xref="S3.SS2.p1.1.m1.1.1.8.cmml">_</mi><mo id="S3.SS2.p1.1.m1.1.1.1f" xref="S3.SS2.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.1.m1.1.1.9" xref="S3.SS2.p1.1.m1.1.1.9.cmml">s</mi><mo id="S3.SS2.p1.1.m1.1.1.1g" xref="S3.SS2.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.1.m1.1.1.10" xref="S3.SS2.p1.1.m1.1.1.10.cmml">c</mi><mo id="S3.SS2.p1.1.m1.1.1.1h" xref="S3.SS2.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.1.m1.1.1.11" xref="S3.SS2.p1.1.m1.1.1.11.cmml">o</mi><mo id="S3.SS2.p1.1.m1.1.1.1i" xref="S3.SS2.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.1.m1.1.1.12" xref="S3.SS2.p1.1.m1.1.1.12.cmml">r</mi><mo id="S3.SS2.p1.1.m1.1.1.1j" xref="S3.SS2.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.1.m1.1.1.13" xref="S3.SS2.p1.1.m1.1.1.13.cmml">e</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.1.m1.1b"><apply id="S3.SS2.p1.1.m1.1.1.cmml" xref="S3.SS2.p1.1.m1.1.1"><times id="S3.SS2.p1.1.m1.1.1.1.cmml" xref="S3.SS2.p1.1.m1.1.1.1"></times><ci id="S3.SS2.p1.1.m1.1.1.2.cmml" xref="S3.SS2.p1.1.m1.1.1.2">𝑟</ci><ci id="S3.SS2.p1.1.m1.1.1.3.cmml" xref="S3.SS2.p1.1.m1.1.1.3">𝑒</ci><ci id="S3.SS2.p1.1.m1.1.1.4.cmml" xref="S3.SS2.p1.1.m1.1.1.4">𝑐</ci><ci id="S3.SS2.p1.1.m1.1.1.5.cmml" xref="S3.SS2.p1.1.m1.1.1.5">𝑎</ci><ci id="S3.SS2.p1.1.m1.1.1.6.cmml" xref="S3.SS2.p1.1.m1.1.1.6">𝑙</ci><ci id="S3.SS2.p1.1.m1.1.1.7.cmml" xref="S3.SS2.p1.1.m1.1.1.7">𝑙</ci><ci 
id="S3.SS2.p1.1.m1.1.1.8.cmml" xref="S3.SS2.p1.1.m1.1.1.8">_</ci><ci id="S3.SS2.p1.1.m1.1.1.9.cmml" xref="S3.SS2.p1.1.m1.1.1.9">𝑠</ci><ci id="S3.SS2.p1.1.m1.1.1.10.cmml" xref="S3.SS2.p1.1.m1.1.1.10">𝑐</ci><ci id="S3.SS2.p1.1.m1.1.1.11.cmml" xref="S3.SS2.p1.1.m1.1.1.11">𝑜</ci><ci id="S3.SS2.p1.1.m1.1.1.12.cmml" xref="S3.SS2.p1.1.m1.1.1.12">𝑟</ci><ci id="S3.SS2.p1.1.m1.1.1.13.cmml" xref="S3.SS2.p1.1.m1.1.1.13">𝑒</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.1.m1.1c">recall\_score</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.1.m1.1d">italic_r italic_e italic_c italic_a italic_l italic_l _ italic_s italic_c italic_o italic_r italic_e</annotation></semantics></math> is computed as Eq. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S3.E2" title="In III-B Model Recall ‣ III Coarse Recall ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">2</span></a>, where <math alttext="acc(m_{j})" class="ltx_Math" display="inline" id="S3.SS2.p1.2.m2.1"><semantics id="S3.SS2.p1.2.m2.1a"><mrow id="S3.SS2.p1.2.m2.1.1" xref="S3.SS2.p1.2.m2.1.1.cmml"><mi id="S3.SS2.p1.2.m2.1.1.3" xref="S3.SS2.p1.2.m2.1.1.3.cmml">a</mi><mo id="S3.SS2.p1.2.m2.1.1.2" xref="S3.SS2.p1.2.m2.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.2.m2.1.1.4" xref="S3.SS2.p1.2.m2.1.1.4.cmml">c</mi><mo id="S3.SS2.p1.2.m2.1.1.2a" xref="S3.SS2.p1.2.m2.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.2.m2.1.1.5" xref="S3.SS2.p1.2.m2.1.1.5.cmml">c</mi><mo id="S3.SS2.p1.2.m2.1.1.2b" xref="S3.SS2.p1.2.m2.1.1.2.cmml">⁢</mo><mrow id="S3.SS2.p1.2.m2.1.1.1.1" xref="S3.SS2.p1.2.m2.1.1.1.1.1.cmml"><mo id="S3.SS2.p1.2.m2.1.1.1.1.2" stretchy="false" xref="S3.SS2.p1.2.m2.1.1.1.1.1.cmml">(</mo><msub id="S3.SS2.p1.2.m2.1.1.1.1.1" xref="S3.SS2.p1.2.m2.1.1.1.1.1.cmml"><mi id="S3.SS2.p1.2.m2.1.1.1.1.1.2" xref="S3.SS2.p1.2.m2.1.1.1.1.1.2.cmml">m</mi><mi id="S3.SS2.p1.2.m2.1.1.1.1.1.3" xref="S3.SS2.p1.2.m2.1.1.1.1.1.3.cmml">j</mi></msub><mo 
id="S3.SS2.p1.2.m2.1.1.1.1.3" stretchy="false" xref="S3.SS2.p1.2.m2.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.2.m2.1b"><apply id="S3.SS2.p1.2.m2.1.1.cmml" xref="S3.SS2.p1.2.m2.1.1"><times id="S3.SS2.p1.2.m2.1.1.2.cmml" xref="S3.SS2.p1.2.m2.1.1.2"></times><ci id="S3.SS2.p1.2.m2.1.1.3.cmml" xref="S3.SS2.p1.2.m2.1.1.3">𝑎</ci><ci id="S3.SS2.p1.2.m2.1.1.4.cmml" xref="S3.SS2.p1.2.m2.1.1.4">𝑐</ci><ci id="S3.SS2.p1.2.m2.1.1.5.cmml" xref="S3.SS2.p1.2.m2.1.1.5">𝑐</ci><apply id="S3.SS2.p1.2.m2.1.1.1.1.1.cmml" xref="S3.SS2.p1.2.m2.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS2.p1.2.m2.1.1.1.1.1.1.cmml" xref="S3.SS2.p1.2.m2.1.1.1.1">subscript</csymbol><ci id="S3.SS2.p1.2.m2.1.1.1.1.1.2.cmml" xref="S3.SS2.p1.2.m2.1.1.1.1.1.2">𝑚</ci><ci id="S3.SS2.p1.2.m2.1.1.1.1.1.3.cmml" xref="S3.SS2.p1.2.m2.1.1.1.1.1.3">𝑗</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.2.m2.1c">acc(m_{j})</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.2.m2.1d">italic_a italic_c italic_c ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT )</annotation></semantics></math> denotes the average accuracy of <math alttext="m_{j}" class="ltx_Math" display="inline" id="S3.SS2.p1.3.m3.1"><semantics id="S3.SS2.p1.3.m3.1a"><msub id="S3.SS2.p1.3.m3.1.1" xref="S3.SS2.p1.3.m3.1.1.cmml"><mi id="S3.SS2.p1.3.m3.1.1.2" xref="S3.SS2.p1.3.m3.1.1.2.cmml">m</mi><mi id="S3.SS2.p1.3.m3.1.1.3" xref="S3.SS2.p1.3.m3.1.1.3.cmml">j</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.3.m3.1b"><apply id="S3.SS2.p1.3.m3.1.1.cmml" xref="S3.SS2.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S3.SS2.p1.3.m3.1.1.1.cmml" xref="S3.SS2.p1.3.m3.1.1">subscript</csymbol><ci id="S3.SS2.p1.3.m3.1.1.2.cmml" xref="S3.SS2.p1.3.m3.1.1.2">𝑚</ci><ci id="S3.SS2.p1.3.m3.1.1.3.cmml" xref="S3.SS2.p1.3.m3.1.1.3">𝑗</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.3.m3.1c">m_{j}</annotation><annotation 
encoding="application/x-llamapun" id="S3.SS2.p1.3.m3.1d">italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT</annotation></semantics></math> on benchmark datasets <math alttext="D" class="ltx_Math" display="inline" id="S3.SS2.p1.4.m4.1"><semantics id="S3.SS2.p1.4.m4.1a"><mi id="S3.SS2.p1.4.m4.1.1" xref="S3.SS2.p1.4.m4.1.1.cmml">D</mi><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.4.m4.1b"><ci id="S3.SS2.p1.4.m4.1.1.cmml" xref="S3.SS2.p1.4.m4.1.1">𝐷</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.4.m4.1c">D</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.4.m4.1d">italic_D</annotation></semantics></math>. We can see the <math alttext="recall\_score(T|m_{j})" class="ltx_Math" display="inline" id="S3.SS2.p1.5.m5.1"><semantics id="S3.SS2.p1.5.m5.1a"><mrow id="S3.SS2.p1.5.m5.1.1" xref="S3.SS2.p1.5.m5.1.1.cmml"><mi id="S3.SS2.p1.5.m5.1.1.3" xref="S3.SS2.p1.5.m5.1.1.3.cmml">r</mi><mo id="S3.SS2.p1.5.m5.1.1.2" xref="S3.SS2.p1.5.m5.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.5.m5.1.1.4" xref="S3.SS2.p1.5.m5.1.1.4.cmml">e</mi><mo id="S3.SS2.p1.5.m5.1.1.2a" xref="S3.SS2.p1.5.m5.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.5.m5.1.1.5" xref="S3.SS2.p1.5.m5.1.1.5.cmml">c</mi><mo id="S3.SS2.p1.5.m5.1.1.2b" xref="S3.SS2.p1.5.m5.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.5.m5.1.1.6" xref="S3.SS2.p1.5.m5.1.1.6.cmml">a</mi><mo id="S3.SS2.p1.5.m5.1.1.2c" xref="S3.SS2.p1.5.m5.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.5.m5.1.1.7" xref="S3.SS2.p1.5.m5.1.1.7.cmml">l</mi><mo id="S3.SS2.p1.5.m5.1.1.2d" xref="S3.SS2.p1.5.m5.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.5.m5.1.1.8" xref="S3.SS2.p1.5.m5.1.1.8.cmml">l</mi><mo id="S3.SS2.p1.5.m5.1.1.2e" xref="S3.SS2.p1.5.m5.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.5.m5.1.1.9" mathvariant="normal" xref="S3.SS2.p1.5.m5.1.1.9.cmml">_</mi><mo id="S3.SS2.p1.5.m5.1.1.2f" xref="S3.SS2.p1.5.m5.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.5.m5.1.1.10" xref="S3.SS2.p1.5.m5.1.1.10.cmml">s</mi><mo id="S3.SS2.p1.5.m5.1.1.2g" 
xref="S3.SS2.p1.5.m5.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.5.m5.1.1.11" xref="S3.SS2.p1.5.m5.1.1.11.cmml">c</mi><mo id="S3.SS2.p1.5.m5.1.1.2h" xref="S3.SS2.p1.5.m5.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.5.m5.1.1.12" xref="S3.SS2.p1.5.m5.1.1.12.cmml">o</mi><mo id="S3.SS2.p1.5.m5.1.1.2i" xref="S3.SS2.p1.5.m5.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.5.m5.1.1.13" xref="S3.SS2.p1.5.m5.1.1.13.cmml">r</mi><mo id="S3.SS2.p1.5.m5.1.1.2j" xref="S3.SS2.p1.5.m5.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.5.m5.1.1.14" xref="S3.SS2.p1.5.m5.1.1.14.cmml">e</mi><mo id="S3.SS2.p1.5.m5.1.1.2k" xref="S3.SS2.p1.5.m5.1.1.2.cmml">⁢</mo><mrow id="S3.SS2.p1.5.m5.1.1.1.1" xref="S3.SS2.p1.5.m5.1.1.1.1.1.cmml"><mo id="S3.SS2.p1.5.m5.1.1.1.1.2" stretchy="false" xref="S3.SS2.p1.5.m5.1.1.1.1.1.cmml">(</mo><mrow id="S3.SS2.p1.5.m5.1.1.1.1.1" xref="S3.SS2.p1.5.m5.1.1.1.1.1.cmml"><mi id="S3.SS2.p1.5.m5.1.1.1.1.1.2" xref="S3.SS2.p1.5.m5.1.1.1.1.1.2.cmml">T</mi><mo fence="false" id="S3.SS2.p1.5.m5.1.1.1.1.1.1" xref="S3.SS2.p1.5.m5.1.1.1.1.1.1.cmml">|</mo><msub id="S3.SS2.p1.5.m5.1.1.1.1.1.3" xref="S3.SS2.p1.5.m5.1.1.1.1.1.3.cmml"><mi id="S3.SS2.p1.5.m5.1.1.1.1.1.3.2" xref="S3.SS2.p1.5.m5.1.1.1.1.1.3.2.cmml">m</mi><mi id="S3.SS2.p1.5.m5.1.1.1.1.1.3.3" xref="S3.SS2.p1.5.m5.1.1.1.1.1.3.3.cmml">j</mi></msub></mrow><mo id="S3.SS2.p1.5.m5.1.1.1.1.3" stretchy="false" xref="S3.SS2.p1.5.m5.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.5.m5.1b"><apply id="S3.SS2.p1.5.m5.1.1.cmml" xref="S3.SS2.p1.5.m5.1.1"><times id="S3.SS2.p1.5.m5.1.1.2.cmml" xref="S3.SS2.p1.5.m5.1.1.2"></times><ci id="S3.SS2.p1.5.m5.1.1.3.cmml" xref="S3.SS2.p1.5.m5.1.1.3">𝑟</ci><ci id="S3.SS2.p1.5.m5.1.1.4.cmml" xref="S3.SS2.p1.5.m5.1.1.4">𝑒</ci><ci id="S3.SS2.p1.5.m5.1.1.5.cmml" xref="S3.SS2.p1.5.m5.1.1.5">𝑐</ci><ci id="S3.SS2.p1.5.m5.1.1.6.cmml" xref="S3.SS2.p1.5.m5.1.1.6">𝑎</ci><ci id="S3.SS2.p1.5.m5.1.1.7.cmml" xref="S3.SS2.p1.5.m5.1.1.7">𝑙</ci><ci id="S3.SS2.p1.5.m5.1.1.8.cmml" 
xref="S3.SS2.p1.5.m5.1.1.8">𝑙</ci><ci id="S3.SS2.p1.5.m5.1.1.9.cmml" xref="S3.SS2.p1.5.m5.1.1.9">_</ci><ci id="S3.SS2.p1.5.m5.1.1.10.cmml" xref="S3.SS2.p1.5.m5.1.1.10">𝑠</ci><ci id="S3.SS2.p1.5.m5.1.1.11.cmml" xref="S3.SS2.p1.5.m5.1.1.11">𝑐</ci><ci id="S3.SS2.p1.5.m5.1.1.12.cmml" xref="S3.SS2.p1.5.m5.1.1.12">𝑜</ci><ci id="S3.SS2.p1.5.m5.1.1.13.cmml" xref="S3.SS2.p1.5.m5.1.1.13">𝑟</ci><ci id="S3.SS2.p1.5.m5.1.1.14.cmml" xref="S3.SS2.p1.5.m5.1.1.14">𝑒</ci><apply id="S3.SS2.p1.5.m5.1.1.1.1.1.cmml" xref="S3.SS2.p1.5.m5.1.1.1.1"><csymbol cd="latexml" id="S3.SS2.p1.5.m5.1.1.1.1.1.1.cmml" xref="S3.SS2.p1.5.m5.1.1.1.1.1.1">conditional</csymbol><ci id="S3.SS2.p1.5.m5.1.1.1.1.1.2.cmml" xref="S3.SS2.p1.5.m5.1.1.1.1.1.2">𝑇</ci><apply id="S3.SS2.p1.5.m5.1.1.1.1.1.3.cmml" xref="S3.SS2.p1.5.m5.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S3.SS2.p1.5.m5.1.1.1.1.1.3.1.cmml" xref="S3.SS2.p1.5.m5.1.1.1.1.1.3">subscript</csymbol><ci id="S3.SS2.p1.5.m5.1.1.1.1.1.3.2.cmml" xref="S3.SS2.p1.5.m5.1.1.1.1.1.3.2">𝑚</ci><ci id="S3.SS2.p1.5.m5.1.1.1.1.1.3.3.cmml" xref="S3.SS2.p1.5.m5.1.1.1.1.1.3.3">𝑗</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.5.m5.1c">recall\_score(T|m_{j})</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.5.m5.1d">italic_r italic_e italic_c italic_a italic_l italic_l _ italic_s italic_c italic_o italic_r italic_e ( italic_T | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT )</annotation></semantics></math> contains two parts. 
The <math alttext="acc(m_{j})" class="ltx_Math" display="inline" id="S3.SS2.p1.6.m6.1"><semantics id="S3.SS2.p1.6.m6.1a"><mrow id="S3.SS2.p1.6.m6.1.1" xref="S3.SS2.p1.6.m6.1.1.cmml"><mi id="S3.SS2.p1.6.m6.1.1.3" xref="S3.SS2.p1.6.m6.1.1.3.cmml">a</mi><mo id="S3.SS2.p1.6.m6.1.1.2" xref="S3.SS2.p1.6.m6.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.6.m6.1.1.4" xref="S3.SS2.p1.6.m6.1.1.4.cmml">c</mi><mo id="S3.SS2.p1.6.m6.1.1.2a" xref="S3.SS2.p1.6.m6.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.6.m6.1.1.5" xref="S3.SS2.p1.6.m6.1.1.5.cmml">c</mi><mo id="S3.SS2.p1.6.m6.1.1.2b" xref="S3.SS2.p1.6.m6.1.1.2.cmml">⁢</mo><mrow id="S3.SS2.p1.6.m6.1.1.1.1" xref="S3.SS2.p1.6.m6.1.1.1.1.1.cmml"><mo id="S3.SS2.p1.6.m6.1.1.1.1.2" stretchy="false" xref="S3.SS2.p1.6.m6.1.1.1.1.1.cmml">(</mo><msub id="S3.SS2.p1.6.m6.1.1.1.1.1" xref="S3.SS2.p1.6.m6.1.1.1.1.1.cmml"><mi id="S3.SS2.p1.6.m6.1.1.1.1.1.2" xref="S3.SS2.p1.6.m6.1.1.1.1.1.2.cmml">m</mi><mi id="S3.SS2.p1.6.m6.1.1.1.1.1.3" xref="S3.SS2.p1.6.m6.1.1.1.1.1.3.cmml">j</mi></msub><mo id="S3.SS2.p1.6.m6.1.1.1.1.3" stretchy="false" xref="S3.SS2.p1.6.m6.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.6.m6.1b"><apply id="S3.SS2.p1.6.m6.1.1.cmml" xref="S3.SS2.p1.6.m6.1.1"><times id="S3.SS2.p1.6.m6.1.1.2.cmml" xref="S3.SS2.p1.6.m6.1.1.2"></times><ci id="S3.SS2.p1.6.m6.1.1.3.cmml" xref="S3.SS2.p1.6.m6.1.1.3">𝑎</ci><ci id="S3.SS2.p1.6.m6.1.1.4.cmml" xref="S3.SS2.p1.6.m6.1.1.4">𝑐</ci><ci id="S3.SS2.p1.6.m6.1.1.5.cmml" xref="S3.SS2.p1.6.m6.1.1.5">𝑐</ci><apply id="S3.SS2.p1.6.m6.1.1.1.1.1.cmml" xref="S3.SS2.p1.6.m6.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS2.p1.6.m6.1.1.1.1.1.1.cmml" xref="S3.SS2.p1.6.m6.1.1.1.1">subscript</csymbol><ci id="S3.SS2.p1.6.m6.1.1.1.1.1.2.cmml" xref="S3.SS2.p1.6.m6.1.1.1.1.1.2">𝑚</ci><ci id="S3.SS2.p1.6.m6.1.1.1.1.1.3.cmml" xref="S3.SS2.p1.6.m6.1.1.1.1.1.3">𝑗</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.6.m6.1c">acc(m_{j})</annotation><annotation 
encoding="application/x-llamapun" id="S3.SS2.p1.6.m6.1d">italic_a italic_c italic_c ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT )</annotation></semantics></math> measures the prior capability of a model <math alttext="m_{j}" class="ltx_Math" display="inline" id="S3.SS2.p1.7.m7.1"><semantics id="S3.SS2.p1.7.m7.1a"><msub id="S3.SS2.p1.7.m7.1.1" xref="S3.SS2.p1.7.m7.1.1.cmml"><mi id="S3.SS2.p1.7.m7.1.1.2" xref="S3.SS2.p1.7.m7.1.1.2.cmml">m</mi><mi id="S3.SS2.p1.7.m7.1.1.3" xref="S3.SS2.p1.7.m7.1.1.3.cmml">j</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.7.m7.1b"><apply id="S3.SS2.p1.7.m7.1.1.cmml" xref="S3.SS2.p1.7.m7.1.1"><csymbol cd="ambiguous" id="S3.SS2.p1.7.m7.1.1.1.cmml" xref="S3.SS2.p1.7.m7.1.1">subscript</csymbol><ci id="S3.SS2.p1.7.m7.1.1.2.cmml" xref="S3.SS2.p1.7.m7.1.1.2">𝑚</ci><ci id="S3.SS2.p1.7.m7.1.1.3.cmml" xref="S3.SS2.p1.7.m7.1.1.3">𝑗</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.7.m7.1c">m_{j}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.7.m7.1d">italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT</annotation></semantics></math> on any new task, and the <math alttext="proxy\_score(T|m_{j})" class="ltx_Math" display="inline" id="S3.SS2.p1.8.m8.1"><semantics id="S3.SS2.p1.8.m8.1a"><mrow id="S3.SS2.p1.8.m8.1.1" xref="S3.SS2.p1.8.m8.1.1.cmml"><mi id="S3.SS2.p1.8.m8.1.1.3" xref="S3.SS2.p1.8.m8.1.1.3.cmml">p</mi><mo id="S3.SS2.p1.8.m8.1.1.2" xref="S3.SS2.p1.8.m8.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.8.m8.1.1.4" xref="S3.SS2.p1.8.m8.1.1.4.cmml">r</mi><mo id="S3.SS2.p1.8.m8.1.1.2a" xref="S3.SS2.p1.8.m8.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.8.m8.1.1.5" xref="S3.SS2.p1.8.m8.1.1.5.cmml">o</mi><mo id="S3.SS2.p1.8.m8.1.1.2b" xref="S3.SS2.p1.8.m8.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.8.m8.1.1.6" xref="S3.SS2.p1.8.m8.1.1.6.cmml">x</mi><mo id="S3.SS2.p1.8.m8.1.1.2c" xref="S3.SS2.p1.8.m8.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.8.m8.1.1.7" 
xref="S3.SS2.p1.8.m8.1.1.7.cmml">y</mi><mo id="S3.SS2.p1.8.m8.1.1.2d" xref="S3.SS2.p1.8.m8.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.8.m8.1.1.8" mathvariant="normal" xref="S3.SS2.p1.8.m8.1.1.8.cmml">_</mi><mo id="S3.SS2.p1.8.m8.1.1.2e" xref="S3.SS2.p1.8.m8.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.8.m8.1.1.9" xref="S3.SS2.p1.8.m8.1.1.9.cmml">s</mi><mo id="S3.SS2.p1.8.m8.1.1.2f" xref="S3.SS2.p1.8.m8.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.8.m8.1.1.10" xref="S3.SS2.p1.8.m8.1.1.10.cmml">c</mi><mo id="S3.SS2.p1.8.m8.1.1.2g" xref="S3.SS2.p1.8.m8.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.8.m8.1.1.11" xref="S3.SS2.p1.8.m8.1.1.11.cmml">o</mi><mo id="S3.SS2.p1.8.m8.1.1.2h" xref="S3.SS2.p1.8.m8.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.8.m8.1.1.12" xref="S3.SS2.p1.8.m8.1.1.12.cmml">r</mi><mo id="S3.SS2.p1.8.m8.1.1.2i" xref="S3.SS2.p1.8.m8.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p1.8.m8.1.1.13" xref="S3.SS2.p1.8.m8.1.1.13.cmml">e</mi><mo id="S3.SS2.p1.8.m8.1.1.2j" xref="S3.SS2.p1.8.m8.1.1.2.cmml">⁢</mo><mrow id="S3.SS2.p1.8.m8.1.1.1.1" xref="S3.SS2.p1.8.m8.1.1.1.1.1.cmml"><mo id="S3.SS2.p1.8.m8.1.1.1.1.2" stretchy="false" xref="S3.SS2.p1.8.m8.1.1.1.1.1.cmml">(</mo><mrow id="S3.SS2.p1.8.m8.1.1.1.1.1" xref="S3.SS2.p1.8.m8.1.1.1.1.1.cmml"><mi id="S3.SS2.p1.8.m8.1.1.1.1.1.2" xref="S3.SS2.p1.8.m8.1.1.1.1.1.2.cmml">T</mi><mo fence="false" id="S3.SS2.p1.8.m8.1.1.1.1.1.1" xref="S3.SS2.p1.8.m8.1.1.1.1.1.1.cmml">|</mo><msub id="S3.SS2.p1.8.m8.1.1.1.1.1.3" xref="S3.SS2.p1.8.m8.1.1.1.1.1.3.cmml"><mi id="S3.SS2.p1.8.m8.1.1.1.1.1.3.2" xref="S3.SS2.p1.8.m8.1.1.1.1.1.3.2.cmml">m</mi><mi id="S3.SS2.p1.8.m8.1.1.1.1.1.3.3" xref="S3.SS2.p1.8.m8.1.1.1.1.1.3.3.cmml">j</mi></msub></mrow><mo id="S3.SS2.p1.8.m8.1.1.1.1.3" stretchy="false" xref="S3.SS2.p1.8.m8.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.8.m8.1b"><apply id="S3.SS2.p1.8.m8.1.1.cmml" xref="S3.SS2.p1.8.m8.1.1"><times id="S3.SS2.p1.8.m8.1.1.2.cmml" xref="S3.SS2.p1.8.m8.1.1.2"></times><ci id="S3.SS2.p1.8.m8.1.1.3.cmml" 
xref="S3.SS2.p1.8.m8.1.1.3">𝑝</ci><ci id="S3.SS2.p1.8.m8.1.1.4.cmml" xref="S3.SS2.p1.8.m8.1.1.4">𝑟</ci><ci id="S3.SS2.p1.8.m8.1.1.5.cmml" xref="S3.SS2.p1.8.m8.1.1.5">𝑜</ci><ci id="S3.SS2.p1.8.m8.1.1.6.cmml" xref="S3.SS2.p1.8.m8.1.1.6">𝑥</ci><ci id="S3.SS2.p1.8.m8.1.1.7.cmml" xref="S3.SS2.p1.8.m8.1.1.7">𝑦</ci><ci id="S3.SS2.p1.8.m8.1.1.8.cmml" xref="S3.SS2.p1.8.m8.1.1.8">_</ci><ci id="S3.SS2.p1.8.m8.1.1.9.cmml" xref="S3.SS2.p1.8.m8.1.1.9">𝑠</ci><ci id="S3.SS2.p1.8.m8.1.1.10.cmml" xref="S3.SS2.p1.8.m8.1.1.10">𝑐</ci><ci id="S3.SS2.p1.8.m8.1.1.11.cmml" xref="S3.SS2.p1.8.m8.1.1.11">𝑜</ci><ci id="S3.SS2.p1.8.m8.1.1.12.cmml" xref="S3.SS2.p1.8.m8.1.1.12">𝑟</ci><ci id="S3.SS2.p1.8.m8.1.1.13.cmml" xref="S3.SS2.p1.8.m8.1.1.13">𝑒</ci><apply id="S3.SS2.p1.8.m8.1.1.1.1.1.cmml" xref="S3.SS2.p1.8.m8.1.1.1.1"><csymbol cd="latexml" id="S3.SS2.p1.8.m8.1.1.1.1.1.1.cmml" xref="S3.SS2.p1.8.m8.1.1.1.1.1.1">conditional</csymbol><ci id="S3.SS2.p1.8.m8.1.1.1.1.1.2.cmml" xref="S3.SS2.p1.8.m8.1.1.1.1.1.2">𝑇</ci><apply id="S3.SS2.p1.8.m8.1.1.1.1.1.3.cmml" xref="S3.SS2.p1.8.m8.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S3.SS2.p1.8.m8.1.1.1.1.1.3.1.cmml" xref="S3.SS2.p1.8.m8.1.1.1.1.1.3">subscript</csymbol><ci id="S3.SS2.p1.8.m8.1.1.1.1.1.3.2.cmml" xref="S3.SS2.p1.8.m8.1.1.1.1.1.3.2">𝑚</ci><ci id="S3.SS2.p1.8.m8.1.1.1.1.1.3.3.cmml" xref="S3.SS2.p1.8.m8.1.1.1.1.1.3.3">𝑗</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.8.m8.1c">proxy\_score(T|m_{j})</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.8.m8.1d">italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e ( italic_T | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT )</annotation></semantics></math> represents how well the model <math alttext="m_{j}" class="ltx_Math" display="inline" id="S3.SS2.p1.9.m9.1"><semantics id="S3.SS2.p1.9.m9.1a"><msub id="S3.SS2.p1.9.m9.1.1" xref="S3.SS2.p1.9.m9.1.1.cmml"><mi id="S3.SS2.p1.9.m9.1.1.2" 
xref="S3.SS2.p1.9.m9.1.1.2.cmml">m</mi><mi id="S3.SS2.p1.9.m9.1.1.3" xref="S3.SS2.p1.9.m9.1.1.3.cmml">j</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.9.m9.1b"><apply id="S3.SS2.p1.9.m9.1.1.cmml" xref="S3.SS2.p1.9.m9.1.1"><csymbol cd="ambiguous" id="S3.SS2.p1.9.m9.1.1.1.cmml" xref="S3.SS2.p1.9.m9.1.1">subscript</csymbol><ci id="S3.SS2.p1.9.m9.1.1.2.cmml" xref="S3.SS2.p1.9.m9.1.1.2">𝑚</ci><ci id="S3.SS2.p1.9.m9.1.1.3.cmml" xref="S3.SS2.p1.9.m9.1.1.3">𝑗</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.9.m9.1c">m_{j}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.9.m9.1d">italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT</annotation></semantics></math> matches the specific task <math alttext="T" class="ltx_Math" display="inline" id="S3.SS2.p1.10.m10.1"><semantics id="S3.SS2.p1.10.m10.1a"><mi id="S3.SS2.p1.10.m10.1.1" xref="S3.SS2.p1.10.m10.1.1.cmml">T</mi><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.10.m10.1b"><ci id="S3.SS2.p1.10.m10.1.1.cmml" xref="S3.SS2.p1.10.m10.1.1">𝑇</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.10.m10.1c">T</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.10.m10.1d">italic_T</annotation></semantics></math>. 
In this paper, we adopt LEEP <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib11" title="">11</a>]</cite> to compute the <math alttext="proxy\_score" class="ltx_Math" display="inline" id="S3.SS2.p1.11.m11.1"><semantics id="S3.SS2.p1.11.m11.1a"><mrow id="S3.SS2.p1.11.m11.1.1" xref="S3.SS2.p1.11.m11.1.1.cmml"><mi id="S3.SS2.p1.11.m11.1.1.2" xref="S3.SS2.p1.11.m11.1.1.2.cmml">p</mi><mo id="S3.SS2.p1.11.m11.1.1.1" xref="S3.SS2.p1.11.m11.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.11.m11.1.1.3" xref="S3.SS2.p1.11.m11.1.1.3.cmml">r</mi><mo id="S3.SS2.p1.11.m11.1.1.1a" xref="S3.SS2.p1.11.m11.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.11.m11.1.1.4" xref="S3.SS2.p1.11.m11.1.1.4.cmml">o</mi><mo id="S3.SS2.p1.11.m11.1.1.1b" xref="S3.SS2.p1.11.m11.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.11.m11.1.1.5" xref="S3.SS2.p1.11.m11.1.1.5.cmml">x</mi><mo id="S3.SS2.p1.11.m11.1.1.1c" xref="S3.SS2.p1.11.m11.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.11.m11.1.1.6" xref="S3.SS2.p1.11.m11.1.1.6.cmml">y</mi><mo id="S3.SS2.p1.11.m11.1.1.1d" xref="S3.SS2.p1.11.m11.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.11.m11.1.1.7" mathvariant="normal" xref="S3.SS2.p1.11.m11.1.1.7.cmml">_</mi><mo id="S3.SS2.p1.11.m11.1.1.1e" xref="S3.SS2.p1.11.m11.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.11.m11.1.1.8" xref="S3.SS2.p1.11.m11.1.1.8.cmml">s</mi><mo id="S3.SS2.p1.11.m11.1.1.1f" xref="S3.SS2.p1.11.m11.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.11.m11.1.1.9" xref="S3.SS2.p1.11.m11.1.1.9.cmml">c</mi><mo id="S3.SS2.p1.11.m11.1.1.1g" xref="S3.SS2.p1.11.m11.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.11.m11.1.1.10" xref="S3.SS2.p1.11.m11.1.1.10.cmml">o</mi><mo id="S3.SS2.p1.11.m11.1.1.1h" xref="S3.SS2.p1.11.m11.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.11.m11.1.1.11" xref="S3.SS2.p1.11.m11.1.1.11.cmml">r</mi><mo id="S3.SS2.p1.11.m11.1.1.1i" xref="S3.SS2.p1.11.m11.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p1.11.m11.1.1.12" xref="S3.SS2.p1.11.m11.1.1.12.cmml">e</mi></mrow><annotation-xml encoding="MathML-Content" 
id="S3.SS2.p1.11.m11.1b"><apply id="S3.SS2.p1.11.m11.1.1.cmml" xref="S3.SS2.p1.11.m11.1.1"><times id="S3.SS2.p1.11.m11.1.1.1.cmml" xref="S3.SS2.p1.11.m11.1.1.1"></times><ci id="S3.SS2.p1.11.m11.1.1.2.cmml" xref="S3.SS2.p1.11.m11.1.1.2">𝑝</ci><ci id="S3.SS2.p1.11.m11.1.1.3.cmml" xref="S3.SS2.p1.11.m11.1.1.3">𝑟</ci><ci id="S3.SS2.p1.11.m11.1.1.4.cmml" xref="S3.SS2.p1.11.m11.1.1.4">𝑜</ci><ci id="S3.SS2.p1.11.m11.1.1.5.cmml" xref="S3.SS2.p1.11.m11.1.1.5">𝑥</ci><ci id="S3.SS2.p1.11.m11.1.1.6.cmml" xref="S3.SS2.p1.11.m11.1.1.6">𝑦</ci><ci id="S3.SS2.p1.11.m11.1.1.7.cmml" xref="S3.SS2.p1.11.m11.1.1.7">_</ci><ci id="S3.SS2.p1.11.m11.1.1.8.cmml" xref="S3.SS2.p1.11.m11.1.1.8">𝑠</ci><ci id="S3.SS2.p1.11.m11.1.1.9.cmml" xref="S3.SS2.p1.11.m11.1.1.9">𝑐</ci><ci id="S3.SS2.p1.11.m11.1.1.10.cmml" xref="S3.SS2.p1.11.m11.1.1.10">𝑜</ci><ci id="S3.SS2.p1.11.m11.1.1.11.cmml" xref="S3.SS2.p1.11.m11.1.1.11">𝑟</ci><ci id="S3.SS2.p1.11.m11.1.1.12.cmml" xref="S3.SS2.p1.11.m11.1.1.12">𝑒</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.11.m11.1c">proxy\_score</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.11.m11.1d">italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e</annotation></semantics></math> and normalize the score to the range [0, 1]. 
Combining these two scores, we sort the models in descending order and retain the top <math alttext="K" class="ltx_Math" display="inline" id="S3.SS2.p1.12.m12.1"><semantics id="S3.SS2.p1.12.m12.1a"><mi id="S3.SS2.p1.12.m12.1.1" xref="S3.SS2.p1.12.m12.1.1.cmml">K</mi><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.12.m12.1b"><ci id="S3.SS2.p1.12.m12.1.1.cmml" xref="S3.SS2.p1.12.m12.1.1">𝐾</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.12.m12.1c">K</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.12.m12.1d">italic_K</annotation></semantics></math> models as the result of the coarse-recall phase.</p> <table class="ltx_equation ltx_eqn_table" id="S3.E2"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="recall\_score(T|m_{j})=acc(m_{j})\cdot proxy\_score(T|m_{j})" class="ltx_Math" display="block" id="S3.E2.m1.3"><semantics id="S3.E2.m1.3a"><mrow id="S3.E2.m1.3.3" xref="S3.E2.m1.3.3.cmml"><mrow id="S3.E2.m1.1.1.1" xref="S3.E2.m1.1.1.1.cmml"><mi id="S3.E2.m1.1.1.1.3" xref="S3.E2.m1.1.1.1.3.cmml">r</mi><mo id="S3.E2.m1.1.1.1.2" xref="S3.E2.m1.1.1.1.2.cmml">⁢</mo><mi id="S3.E2.m1.1.1.1.4" xref="S3.E2.m1.1.1.1.4.cmml">e</mi><mo id="S3.E2.m1.1.1.1.2a" xref="S3.E2.m1.1.1.1.2.cmml">⁢</mo><mi id="S3.E2.m1.1.1.1.5" xref="S3.E2.m1.1.1.1.5.cmml">c</mi><mo id="S3.E2.m1.1.1.1.2b" xref="S3.E2.m1.1.1.1.2.cmml">⁢</mo><mi id="S3.E2.m1.1.1.1.6" xref="S3.E2.m1.1.1.1.6.cmml">a</mi><mo id="S3.E2.m1.1.1.1.2c" xref="S3.E2.m1.1.1.1.2.cmml">⁢</mo><mi id="S3.E2.m1.1.1.1.7" xref="S3.E2.m1.1.1.1.7.cmml">l</mi><mo id="S3.E2.m1.1.1.1.2d" xref="S3.E2.m1.1.1.1.2.cmml">⁢</mo><mi id="S3.E2.m1.1.1.1.8" xref="S3.E2.m1.1.1.1.8.cmml">l</mi><mo id="S3.E2.m1.1.1.1.2e" xref="S3.E2.m1.1.1.1.2.cmml">⁢</mo><mi id="S3.E2.m1.1.1.1.9" mathvariant="normal" xref="S3.E2.m1.1.1.1.9.cmml">_</mi><mo id="S3.E2.m1.1.1.1.2f" 
xref="S3.E2.m1.1.1.1.2.cmml">⁢</mo><mi id="S3.E2.m1.1.1.1.10" xref="S3.E2.m1.1.1.1.10.cmml">s</mi><mo id="S3.E2.m1.1.1.1.2g" xref="S3.E2.m1.1.1.1.2.cmml">⁢</mo><mi id="S3.E2.m1.1.1.1.11" xref="S3.E2.m1.1.1.1.11.cmml">c</mi><mo id="S3.E2.m1.1.1.1.2h" xref="S3.E2.m1.1.1.1.2.cmml">⁢</mo><mi id="S3.E2.m1.1.1.1.12" xref="S3.E2.m1.1.1.1.12.cmml">o</mi><mo id="S3.E2.m1.1.1.1.2i" xref="S3.E2.m1.1.1.1.2.cmml">⁢</mo><mi id="S3.E2.m1.1.1.1.13" xref="S3.E2.m1.1.1.1.13.cmml">r</mi><mo id="S3.E2.m1.1.1.1.2j" xref="S3.E2.m1.1.1.1.2.cmml">⁢</mo><mi id="S3.E2.m1.1.1.1.14" xref="S3.E2.m1.1.1.1.14.cmml">e</mi><mo id="S3.E2.m1.1.1.1.2k" xref="S3.E2.m1.1.1.1.2.cmml">⁢</mo><mrow id="S3.E2.m1.1.1.1.1.1" xref="S3.E2.m1.1.1.1.1.1.1.cmml"><mo id="S3.E2.m1.1.1.1.1.1.2" stretchy="false" xref="S3.E2.m1.1.1.1.1.1.1.cmml">(</mo><mrow id="S3.E2.m1.1.1.1.1.1.1" xref="S3.E2.m1.1.1.1.1.1.1.cmml"><mi id="S3.E2.m1.1.1.1.1.1.1.2" xref="S3.E2.m1.1.1.1.1.1.1.2.cmml">T</mi><mo fence="false" id="S3.E2.m1.1.1.1.1.1.1.1" xref="S3.E2.m1.1.1.1.1.1.1.1.cmml">|</mo><msub id="S3.E2.m1.1.1.1.1.1.1.3" xref="S3.E2.m1.1.1.1.1.1.1.3.cmml"><mi id="S3.E2.m1.1.1.1.1.1.1.3.2" xref="S3.E2.m1.1.1.1.1.1.1.3.2.cmml">m</mi><mi id="S3.E2.m1.1.1.1.1.1.1.3.3" xref="S3.E2.m1.1.1.1.1.1.1.3.3.cmml">j</mi></msub></mrow><mo id="S3.E2.m1.1.1.1.1.1.3" stretchy="false" xref="S3.E2.m1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow><mo id="S3.E2.m1.3.3.4" xref="S3.E2.m1.3.3.4.cmml">=</mo><mrow id="S3.E2.m1.3.3.3" xref="S3.E2.m1.3.3.3.cmml"><mrow id="S3.E2.m1.2.2.2.1" xref="S3.E2.m1.2.2.2.1.cmml"><mrow id="S3.E2.m1.2.2.2.1.1" xref="S3.E2.m1.2.2.2.1.1.cmml"><mi id="S3.E2.m1.2.2.2.1.1.3" xref="S3.E2.m1.2.2.2.1.1.3.cmml">a</mi><mo id="S3.E2.m1.2.2.2.1.1.2" xref="S3.E2.m1.2.2.2.1.1.2.cmml">⁢</mo><mi id="S3.E2.m1.2.2.2.1.1.4" xref="S3.E2.m1.2.2.2.1.1.4.cmml">c</mi><mo id="S3.E2.m1.2.2.2.1.1.2a" xref="S3.E2.m1.2.2.2.1.1.2.cmml">⁢</mo><mi id="S3.E2.m1.2.2.2.1.1.5" xref="S3.E2.m1.2.2.2.1.1.5.cmml">c</mi><mo id="S3.E2.m1.2.2.2.1.1.2b" 
xref="S3.E2.m1.2.2.2.1.1.2.cmml">⁢</mo><mrow id="S3.E2.m1.2.2.2.1.1.1.1" xref="S3.E2.m1.2.2.2.1.1.1.1.1.cmml"><mo id="S3.E2.m1.2.2.2.1.1.1.1.2" stretchy="false" xref="S3.E2.m1.2.2.2.1.1.1.1.1.cmml">(</mo><msub id="S3.E2.m1.2.2.2.1.1.1.1.1" xref="S3.E2.m1.2.2.2.1.1.1.1.1.cmml"><mi id="S3.E2.m1.2.2.2.1.1.1.1.1.2" xref="S3.E2.m1.2.2.2.1.1.1.1.1.2.cmml">m</mi><mi id="S3.E2.m1.2.2.2.1.1.1.1.1.3" xref="S3.E2.m1.2.2.2.1.1.1.1.1.3.cmml">j</mi></msub><mo id="S3.E2.m1.2.2.2.1.1.1.1.3" rspace="0.055em" stretchy="false" xref="S3.E2.m1.2.2.2.1.1.1.1.1.cmml">)</mo></mrow></mrow><mo id="S3.E2.m1.2.2.2.1.2" rspace="0.222em" xref="S3.E2.m1.2.2.2.1.2.cmml">⋅</mo><mi id="S3.E2.m1.2.2.2.1.3" xref="S3.E2.m1.2.2.2.1.3.cmml">p</mi></mrow><mo id="S3.E2.m1.3.3.3.3" xref="S3.E2.m1.3.3.3.3.cmml">⁢</mo><mi id="S3.E2.m1.3.3.3.4" xref="S3.E2.m1.3.3.3.4.cmml">r</mi><mo id="S3.E2.m1.3.3.3.3a" xref="S3.E2.m1.3.3.3.3.cmml">⁢</mo><mi id="S3.E2.m1.3.3.3.5" xref="S3.E2.m1.3.3.3.5.cmml">o</mi><mo id="S3.E2.m1.3.3.3.3b" xref="S3.E2.m1.3.3.3.3.cmml">⁢</mo><mi id="S3.E2.m1.3.3.3.6" xref="S3.E2.m1.3.3.3.6.cmml">x</mi><mo id="S3.E2.m1.3.3.3.3c" xref="S3.E2.m1.3.3.3.3.cmml">⁢</mo><mi id="S3.E2.m1.3.3.3.7" xref="S3.E2.m1.3.3.3.7.cmml">y</mi><mo id="S3.E2.m1.3.3.3.3d" xref="S3.E2.m1.3.3.3.3.cmml">⁢</mo><mi id="S3.E2.m1.3.3.3.8" mathvariant="normal" xref="S3.E2.m1.3.3.3.8.cmml">_</mi><mo id="S3.E2.m1.3.3.3.3e" xref="S3.E2.m1.3.3.3.3.cmml">⁢</mo><mi id="S3.E2.m1.3.3.3.9" xref="S3.E2.m1.3.3.3.9.cmml">s</mi><mo id="S3.E2.m1.3.3.3.3f" xref="S3.E2.m1.3.3.3.3.cmml">⁢</mo><mi id="S3.E2.m1.3.3.3.10" xref="S3.E2.m1.3.3.3.10.cmml">c</mi><mo id="S3.E2.m1.3.3.3.3g" xref="S3.E2.m1.3.3.3.3.cmml">⁢</mo><mi id="S3.E2.m1.3.3.3.11" xref="S3.E2.m1.3.3.3.11.cmml">o</mi><mo id="S3.E2.m1.3.3.3.3h" xref="S3.E2.m1.3.3.3.3.cmml">⁢</mo><mi id="S3.E2.m1.3.3.3.12" xref="S3.E2.m1.3.3.3.12.cmml">r</mi><mo id="S3.E2.m1.3.3.3.3i" xref="S3.E2.m1.3.3.3.3.cmml">⁢</mo><mi id="S3.E2.m1.3.3.3.13" xref="S3.E2.m1.3.3.3.13.cmml">e</mi><mo 
id="S3.E2.m1.3.3.3.3j" xref="S3.E2.m1.3.3.3.3.cmml">⁢</mo><mrow id="S3.E2.m1.3.3.3.2.1" xref="S3.E2.m1.3.3.3.2.1.1.cmml"><mo id="S3.E2.m1.3.3.3.2.1.2" stretchy="false" xref="S3.E2.m1.3.3.3.2.1.1.cmml">(</mo><mrow id="S3.E2.m1.3.3.3.2.1.1" xref="S3.E2.m1.3.3.3.2.1.1.cmml"><mi id="S3.E2.m1.3.3.3.2.1.1.2" xref="S3.E2.m1.3.3.3.2.1.1.2.cmml">T</mi><mo fence="false" id="S3.E2.m1.3.3.3.2.1.1.1" xref="S3.E2.m1.3.3.3.2.1.1.1.cmml">|</mo><msub id="S3.E2.m1.3.3.3.2.1.1.3" xref="S3.E2.m1.3.3.3.2.1.1.3.cmml"><mi id="S3.E2.m1.3.3.3.2.1.1.3.2" xref="S3.E2.m1.3.3.3.2.1.1.3.2.cmml">m</mi><mi id="S3.E2.m1.3.3.3.2.1.1.3.3" xref="S3.E2.m1.3.3.3.2.1.1.3.3.cmml">j</mi></msub></mrow><mo id="S3.E2.m1.3.3.3.2.1.3" stretchy="false" xref="S3.E2.m1.3.3.3.2.1.1.cmml">)</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.E2.m1.3b"><apply id="S3.E2.m1.3.3.cmml" xref="S3.E2.m1.3.3"><eq id="S3.E2.m1.3.3.4.cmml" xref="S3.E2.m1.3.3.4"></eq><apply id="S3.E2.m1.1.1.1.cmml" xref="S3.E2.m1.1.1.1"><times id="S3.E2.m1.1.1.1.2.cmml" xref="S3.E2.m1.1.1.1.2"></times><ci id="S3.E2.m1.1.1.1.3.cmml" xref="S3.E2.m1.1.1.1.3">𝑟</ci><ci id="S3.E2.m1.1.1.1.4.cmml" xref="S3.E2.m1.1.1.1.4">𝑒</ci><ci id="S3.E2.m1.1.1.1.5.cmml" xref="S3.E2.m1.1.1.1.5">𝑐</ci><ci id="S3.E2.m1.1.1.1.6.cmml" xref="S3.E2.m1.1.1.1.6">𝑎</ci><ci id="S3.E2.m1.1.1.1.7.cmml" xref="S3.E2.m1.1.1.1.7">𝑙</ci><ci id="S3.E2.m1.1.1.1.8.cmml" xref="S3.E2.m1.1.1.1.8">𝑙</ci><ci id="S3.E2.m1.1.1.1.9.cmml" xref="S3.E2.m1.1.1.1.9">_</ci><ci id="S3.E2.m1.1.1.1.10.cmml" xref="S3.E2.m1.1.1.1.10">𝑠</ci><ci id="S3.E2.m1.1.1.1.11.cmml" xref="S3.E2.m1.1.1.1.11">𝑐</ci><ci id="S3.E2.m1.1.1.1.12.cmml" xref="S3.E2.m1.1.1.1.12">𝑜</ci><ci id="S3.E2.m1.1.1.1.13.cmml" xref="S3.E2.m1.1.1.1.13">𝑟</ci><ci id="S3.E2.m1.1.1.1.14.cmml" xref="S3.E2.m1.1.1.1.14">𝑒</ci><apply id="S3.E2.m1.1.1.1.1.1.1.cmml" xref="S3.E2.m1.1.1.1.1.1"><csymbol cd="latexml" id="S3.E2.m1.1.1.1.1.1.1.1.cmml" xref="S3.E2.m1.1.1.1.1.1.1.1">conditional</csymbol><ci 
id="S3.E2.m1.1.1.1.1.1.1.2.cmml" xref="S3.E2.m1.1.1.1.1.1.1.2">𝑇</ci><apply id="S3.E2.m1.1.1.1.1.1.1.3.cmml" xref="S3.E2.m1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S3.E2.m1.1.1.1.1.1.1.3.1.cmml" xref="S3.E2.m1.1.1.1.1.1.1.3">subscript</csymbol><ci id="S3.E2.m1.1.1.1.1.1.1.3.2.cmml" xref="S3.E2.m1.1.1.1.1.1.1.3.2">𝑚</ci><ci id="S3.E2.m1.1.1.1.1.1.1.3.3.cmml" xref="S3.E2.m1.1.1.1.1.1.1.3.3">𝑗</ci></apply></apply></apply><apply id="S3.E2.m1.3.3.3.cmml" xref="S3.E2.m1.3.3.3"><times id="S3.E2.m1.3.3.3.3.cmml" xref="S3.E2.m1.3.3.3.3"></times><apply id="S3.E2.m1.2.2.2.1.cmml" xref="S3.E2.m1.2.2.2.1"><ci id="S3.E2.m1.2.2.2.1.2.cmml" xref="S3.E2.m1.2.2.2.1.2">⋅</ci><apply id="S3.E2.m1.2.2.2.1.1.cmml" xref="S3.E2.m1.2.2.2.1.1"><times id="S3.E2.m1.2.2.2.1.1.2.cmml" xref="S3.E2.m1.2.2.2.1.1.2"></times><ci id="S3.E2.m1.2.2.2.1.1.3.cmml" xref="S3.E2.m1.2.2.2.1.1.3">𝑎</ci><ci id="S3.E2.m1.2.2.2.1.1.4.cmml" xref="S3.E2.m1.2.2.2.1.1.4">𝑐</ci><ci id="S3.E2.m1.2.2.2.1.1.5.cmml" xref="S3.E2.m1.2.2.2.1.1.5">𝑐</ci><apply id="S3.E2.m1.2.2.2.1.1.1.1.1.cmml" xref="S3.E2.m1.2.2.2.1.1.1.1"><csymbol cd="ambiguous" id="S3.E2.m1.2.2.2.1.1.1.1.1.1.cmml" xref="S3.E2.m1.2.2.2.1.1.1.1">subscript</csymbol><ci id="S3.E2.m1.2.2.2.1.1.1.1.1.2.cmml" xref="S3.E2.m1.2.2.2.1.1.1.1.1.2">𝑚</ci><ci id="S3.E2.m1.2.2.2.1.1.1.1.1.3.cmml" xref="S3.E2.m1.2.2.2.1.1.1.1.1.3">𝑗</ci></apply></apply><ci id="S3.E2.m1.2.2.2.1.3.cmml" xref="S3.E2.m1.2.2.2.1.3">𝑝</ci></apply><ci id="S3.E2.m1.3.3.3.4.cmml" xref="S3.E2.m1.3.3.3.4">𝑟</ci><ci id="S3.E2.m1.3.3.3.5.cmml" xref="S3.E2.m1.3.3.3.5">𝑜</ci><ci id="S3.E2.m1.3.3.3.6.cmml" xref="S3.E2.m1.3.3.3.6">𝑥</ci><ci id="S3.E2.m1.3.3.3.7.cmml" xref="S3.E2.m1.3.3.3.7">𝑦</ci><ci id="S3.E2.m1.3.3.3.8.cmml" xref="S3.E2.m1.3.3.3.8">_</ci><ci id="S3.E2.m1.3.3.3.9.cmml" xref="S3.E2.m1.3.3.3.9">𝑠</ci><ci id="S3.E2.m1.3.3.3.10.cmml" xref="S3.E2.m1.3.3.3.10">𝑐</ci><ci id="S3.E2.m1.3.3.3.11.cmml" xref="S3.E2.m1.3.3.3.11">𝑜</ci><ci id="S3.E2.m1.3.3.3.12.cmml" 
xref="S3.E2.m1.3.3.3.12">𝑟</ci><ci id="S3.E2.m1.3.3.3.13.cmml" xref="S3.E2.m1.3.3.3.13">𝑒</ci><apply id="S3.E2.m1.3.3.3.2.1.1.cmml" xref="S3.E2.m1.3.3.3.2.1"><csymbol cd="latexml" id="S3.E2.m1.3.3.3.2.1.1.1.cmml" xref="S3.E2.m1.3.3.3.2.1.1.1">conditional</csymbol><ci id="S3.E2.m1.3.3.3.2.1.1.2.cmml" xref="S3.E2.m1.3.3.3.2.1.1.2">𝑇</ci><apply id="S3.E2.m1.3.3.3.2.1.1.3.cmml" xref="S3.E2.m1.3.3.3.2.1.1.3"><csymbol cd="ambiguous" id="S3.E2.m1.3.3.3.2.1.1.3.1.cmml" xref="S3.E2.m1.3.3.3.2.1.1.3">subscript</csymbol><ci id="S3.E2.m1.3.3.3.2.1.1.3.2.cmml" xref="S3.E2.m1.3.3.3.2.1.1.3.2">𝑚</ci><ci id="S3.E2.m1.3.3.3.2.1.1.3.3.cmml" xref="S3.E2.m1.3.3.3.2.1.1.3.3">𝑗</ci></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E2.m1.3c">recall\_score(T|m_{j})=acc(m_{j})\cdot proxy\_score(T|m_{j})</annotation><annotation encoding="application/x-llamapun" id="S3.E2.m1.3d">italic_r italic_e italic_c italic_a italic_l italic_l _ italic_s italic_c italic_o italic_r italic_e ( italic_T | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) = italic_a italic_c italic_c ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) ⋅ italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e ( italic_T | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(2)</span></td> </tr></tbody> </table> </div> <div class="ltx_para" id="S3.SS2.p2"> <p class="ltx_p" id="S3.SS2.p2.9">As introduced in previous section, to speed up the computation of coarse-recall phase, we only compute the <math alttext="proxy\_score" class="ltx_Math" display="inline" id="S3.SS2.p2.1.m1.1"><semantics id="S3.SS2.p2.1.m1.1a"><mrow id="S3.SS2.p2.1.m1.1.1" xref="S3.SS2.p2.1.m1.1.1.cmml"><mi 
id="S3.SS2.p2.1.m1.1.1.2" xref="S3.SS2.p2.1.m1.1.1.2.cmml">p</mi><mo id="S3.SS2.p2.1.m1.1.1.1" xref="S3.SS2.p2.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.1.m1.1.1.3" xref="S3.SS2.p2.1.m1.1.1.3.cmml">r</mi><mo id="S3.SS2.p2.1.m1.1.1.1a" xref="S3.SS2.p2.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.1.m1.1.1.4" xref="S3.SS2.p2.1.m1.1.1.4.cmml">o</mi><mo id="S3.SS2.p2.1.m1.1.1.1b" xref="S3.SS2.p2.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.1.m1.1.1.5" xref="S3.SS2.p2.1.m1.1.1.5.cmml">x</mi><mo id="S3.SS2.p2.1.m1.1.1.1c" xref="S3.SS2.p2.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.1.m1.1.1.6" xref="S3.SS2.p2.1.m1.1.1.6.cmml">y</mi><mo id="S3.SS2.p2.1.m1.1.1.1d" xref="S3.SS2.p2.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.1.m1.1.1.7" mathvariant="normal" xref="S3.SS2.p2.1.m1.1.1.7.cmml">_</mi><mo id="S3.SS2.p2.1.m1.1.1.1e" xref="S3.SS2.p2.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.1.m1.1.1.8" xref="S3.SS2.p2.1.m1.1.1.8.cmml">s</mi><mo id="S3.SS2.p2.1.m1.1.1.1f" xref="S3.SS2.p2.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.1.m1.1.1.9" xref="S3.SS2.p2.1.m1.1.1.9.cmml">c</mi><mo id="S3.SS2.p2.1.m1.1.1.1g" xref="S3.SS2.p2.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.1.m1.1.1.10" xref="S3.SS2.p2.1.m1.1.1.10.cmml">o</mi><mo id="S3.SS2.p2.1.m1.1.1.1h" xref="S3.SS2.p2.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.1.m1.1.1.11" xref="S3.SS2.p2.1.m1.1.1.11.cmml">r</mi><mo id="S3.SS2.p2.1.m1.1.1.1i" xref="S3.SS2.p2.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.1.m1.1.1.12" xref="S3.SS2.p2.1.m1.1.1.12.cmml">e</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p2.1.m1.1b"><apply id="S3.SS2.p2.1.m1.1.1.cmml" xref="S3.SS2.p2.1.m1.1.1"><times id="S3.SS2.p2.1.m1.1.1.1.cmml" xref="S3.SS2.p2.1.m1.1.1.1"></times><ci id="S3.SS2.p2.1.m1.1.1.2.cmml" xref="S3.SS2.p2.1.m1.1.1.2">𝑝</ci><ci id="S3.SS2.p2.1.m1.1.1.3.cmml" xref="S3.SS2.p2.1.m1.1.1.3">𝑟</ci><ci id="S3.SS2.p2.1.m1.1.1.4.cmml" xref="S3.SS2.p2.1.m1.1.1.4">𝑜</ci><ci id="S3.SS2.p2.1.m1.1.1.5.cmml" xref="S3.SS2.p2.1.m1.1.1.5">𝑥</ci><ci id="S3.SS2.p2.1.m1.1.1.6.cmml" 
xref="S3.SS2.p2.1.m1.1.1.6">𝑦</ci><ci id="S3.SS2.p2.1.m1.1.1.7.cmml" xref="S3.SS2.p2.1.m1.1.1.7">_</ci><ci id="S3.SS2.p2.1.m1.1.1.8.cmml" xref="S3.SS2.p2.1.m1.1.1.8">𝑠</ci><ci id="S3.SS2.p2.1.m1.1.1.9.cmml" xref="S3.SS2.p2.1.m1.1.1.9">𝑐</ci><ci id="S3.SS2.p2.1.m1.1.1.10.cmml" xref="S3.SS2.p2.1.m1.1.1.10">𝑜</ci><ci id="S3.SS2.p2.1.m1.1.1.11.cmml" xref="S3.SS2.p2.1.m1.1.1.11">𝑟</ci><ci id="S3.SS2.p2.1.m1.1.1.12.cmml" xref="S3.SS2.p2.1.m1.1.1.12">𝑒</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p2.1.m1.1c">proxy\_score</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p2.1.m1.1d">italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e</annotation></semantics></math> between the target dataset and the representative model of a cluster, therefore, <math alttext="proxy\_score(T|m_{i})" class="ltx_Math" display="inline" id="S3.SS2.p2.2.m2.1"><semantics id="S3.SS2.p2.2.m2.1a"><mrow id="S3.SS2.p2.2.m2.1.1" xref="S3.SS2.p2.2.m2.1.1.cmml"><mi id="S3.SS2.p2.2.m2.1.1.3" xref="S3.SS2.p2.2.m2.1.1.3.cmml">p</mi><mo id="S3.SS2.p2.2.m2.1.1.2" xref="S3.SS2.p2.2.m2.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p2.2.m2.1.1.4" xref="S3.SS2.p2.2.m2.1.1.4.cmml">r</mi><mo id="S3.SS2.p2.2.m2.1.1.2a" xref="S3.SS2.p2.2.m2.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p2.2.m2.1.1.5" xref="S3.SS2.p2.2.m2.1.1.5.cmml">o</mi><mo id="S3.SS2.p2.2.m2.1.1.2b" xref="S3.SS2.p2.2.m2.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p2.2.m2.1.1.6" xref="S3.SS2.p2.2.m2.1.1.6.cmml">x</mi><mo id="S3.SS2.p2.2.m2.1.1.2c" xref="S3.SS2.p2.2.m2.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p2.2.m2.1.1.7" xref="S3.SS2.p2.2.m2.1.1.7.cmml">y</mi><mo id="S3.SS2.p2.2.m2.1.1.2d" xref="S3.SS2.p2.2.m2.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p2.2.m2.1.1.8" mathvariant="normal" xref="S3.SS2.p2.2.m2.1.1.8.cmml">_</mi><mo id="S3.SS2.p2.2.m2.1.1.2e" xref="S3.SS2.p2.2.m2.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p2.2.m2.1.1.9" xref="S3.SS2.p2.2.m2.1.1.9.cmml">s</mi><mo id="S3.SS2.p2.2.m2.1.1.2f" 
xref="S3.SS2.p2.2.m2.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p2.2.m2.1.1.10" xref="S3.SS2.p2.2.m2.1.1.10.cmml">c</mi><mo id="S3.SS2.p2.2.m2.1.1.2g" xref="S3.SS2.p2.2.m2.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p2.2.m2.1.1.11" xref="S3.SS2.p2.2.m2.1.1.11.cmml">o</mi><mo id="S3.SS2.p2.2.m2.1.1.2h" xref="S3.SS2.p2.2.m2.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p2.2.m2.1.1.12" xref="S3.SS2.p2.2.m2.1.1.12.cmml">r</mi><mo id="S3.SS2.p2.2.m2.1.1.2i" xref="S3.SS2.p2.2.m2.1.1.2.cmml">⁢</mo><mi id="S3.SS2.p2.2.m2.1.1.13" xref="S3.SS2.p2.2.m2.1.1.13.cmml">e</mi><mo id="S3.SS2.p2.2.m2.1.1.2j" xref="S3.SS2.p2.2.m2.1.1.2.cmml">⁢</mo><mrow id="S3.SS2.p2.2.m2.1.1.1.1" xref="S3.SS2.p2.2.m2.1.1.1.1.1.cmml"><mo id="S3.SS2.p2.2.m2.1.1.1.1.2" stretchy="false" xref="S3.SS2.p2.2.m2.1.1.1.1.1.cmml">(</mo><mrow id="S3.SS2.p2.2.m2.1.1.1.1.1" xref="S3.SS2.p2.2.m2.1.1.1.1.1.cmml"><mi id="S3.SS2.p2.2.m2.1.1.1.1.1.2" xref="S3.SS2.p2.2.m2.1.1.1.1.1.2.cmml">T</mi><mo fence="false" id="S3.SS2.p2.2.m2.1.1.1.1.1.1" xref="S3.SS2.p2.2.m2.1.1.1.1.1.1.cmml">|</mo><msub id="S3.SS2.p2.2.m2.1.1.1.1.1.3" xref="S3.SS2.p2.2.m2.1.1.1.1.1.3.cmml"><mi id="S3.SS2.p2.2.m2.1.1.1.1.1.3.2" xref="S3.SS2.p2.2.m2.1.1.1.1.1.3.2.cmml">m</mi><mi id="S3.SS2.p2.2.m2.1.1.1.1.1.3.3" xref="S3.SS2.p2.2.m2.1.1.1.1.1.3.3.cmml">i</mi></msub></mrow><mo id="S3.SS2.p2.2.m2.1.1.1.1.3" stretchy="false" xref="S3.SS2.p2.2.m2.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p2.2.m2.1b"><apply id="S3.SS2.p2.2.m2.1.1.cmml" xref="S3.SS2.p2.2.m2.1.1"><times id="S3.SS2.p2.2.m2.1.1.2.cmml" xref="S3.SS2.p2.2.m2.1.1.2"></times><ci id="S3.SS2.p2.2.m2.1.1.3.cmml" xref="S3.SS2.p2.2.m2.1.1.3">𝑝</ci><ci id="S3.SS2.p2.2.m2.1.1.4.cmml" xref="S3.SS2.p2.2.m2.1.1.4">𝑟</ci><ci id="S3.SS2.p2.2.m2.1.1.5.cmml" xref="S3.SS2.p2.2.m2.1.1.5">𝑜</ci><ci id="S3.SS2.p2.2.m2.1.1.6.cmml" xref="S3.SS2.p2.2.m2.1.1.6">𝑥</ci><ci id="S3.SS2.p2.2.m2.1.1.7.cmml" xref="S3.SS2.p2.2.m2.1.1.7">𝑦</ci><ci id="S3.SS2.p2.2.m2.1.1.8.cmml" 
xref="S3.SS2.p2.2.m2.1.1.8">_</ci><ci id="S3.SS2.p2.2.m2.1.1.9.cmml" xref="S3.SS2.p2.2.m2.1.1.9">𝑠</ci><ci id="S3.SS2.p2.2.m2.1.1.10.cmml" xref="S3.SS2.p2.2.m2.1.1.10">𝑐</ci><ci id="S3.SS2.p2.2.m2.1.1.11.cmml" xref="S3.SS2.p2.2.m2.1.1.11">𝑜</ci><ci id="S3.SS2.p2.2.m2.1.1.12.cmml" xref="S3.SS2.p2.2.m2.1.1.12">𝑟</ci><ci id="S3.SS2.p2.2.m2.1.1.13.cmml" xref="S3.SS2.p2.2.m2.1.1.13">𝑒</ci><apply id="S3.SS2.p2.2.m2.1.1.1.1.1.cmml" xref="S3.SS2.p2.2.m2.1.1.1.1"><csymbol cd="latexml" id="S3.SS2.p2.2.m2.1.1.1.1.1.1.cmml" xref="S3.SS2.p2.2.m2.1.1.1.1.1.1">conditional</csymbol><ci id="S3.SS2.p2.2.m2.1.1.1.1.1.2.cmml" xref="S3.SS2.p2.2.m2.1.1.1.1.1.2">𝑇</ci><apply id="S3.SS2.p2.2.m2.1.1.1.1.1.3.cmml" xref="S3.SS2.p2.2.m2.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S3.SS2.p2.2.m2.1.1.1.1.1.3.1.cmml" xref="S3.SS2.p2.2.m2.1.1.1.1.1.3">subscript</csymbol><ci id="S3.SS2.p2.2.m2.1.1.1.1.1.3.2.cmml" xref="S3.SS2.p2.2.m2.1.1.1.1.1.3.2">𝑚</ci><ci id="S3.SS2.p2.2.m2.1.1.1.1.1.3.3.cmml" xref="S3.SS2.p2.2.m2.1.1.1.1.1.3.3">𝑖</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p2.2.m2.1c">proxy\_score(T|m_{i})</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p2.2.m2.1d">italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e ( italic_T | italic_m start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT )</annotation></semantics></math> could be rewritten as <math alttext="proxy\_score(T|m(c(m_{j})))" class="ltx_math_unparsed" display="inline" id="S3.SS2.p2.3.m3.1"><semantics id="S3.SS2.p2.3.m3.1a"><mrow id="S3.SS2.p2.3.m3.1b"><mi id="S3.SS2.p2.3.m3.1.1">p</mi><mi id="S3.SS2.p2.3.m3.1.2">r</mi><mi id="S3.SS2.p2.3.m3.1.3">o</mi><mi id="S3.SS2.p2.3.m3.1.4">x</mi><mi id="S3.SS2.p2.3.m3.1.5">y</mi><mi id="S3.SS2.p2.3.m3.1.6" mathvariant="normal">_</mi><mi id="S3.SS2.p2.3.m3.1.7">s</mi><mi id="S3.SS2.p2.3.m3.1.8">c</mi><mi id="S3.SS2.p2.3.m3.1.9">o</mi><mi id="S3.SS2.p2.3.m3.1.10">r</mi><mi 
id="S3.SS2.p2.3.m3.1.11">e</mi><mrow id="S3.SS2.p2.3.m3.1.12"><mo id="S3.SS2.p2.3.m3.1.12.1" stretchy="false">(</mo><mi id="S3.SS2.p2.3.m3.1.12.2">T</mi><mo fence="false" id="S3.SS2.p2.3.m3.1.12.3" rspace="0.167em" stretchy="false">|</mo><mi id="S3.SS2.p2.3.m3.1.12.4">m</mi><mrow id="S3.SS2.p2.3.m3.1.12.5"><mo id="S3.SS2.p2.3.m3.1.12.5.1" stretchy="false">(</mo><mi id="S3.SS2.p2.3.m3.1.12.5.2">c</mi><mrow id="S3.SS2.p2.3.m3.1.12.5.3"><mo id="S3.SS2.p2.3.m3.1.12.5.3.1" stretchy="false">(</mo><msub id="S3.SS2.p2.3.m3.1.12.5.3.2"><mi id="S3.SS2.p2.3.m3.1.12.5.3.2.2">m</mi><mi id="S3.SS2.p2.3.m3.1.12.5.3.2.3">j</mi></msub><mo id="S3.SS2.p2.3.m3.1.12.5.3.3" stretchy="false">)</mo></mrow><mo id="S3.SS2.p2.3.m3.1.12.5.4" stretchy="false">)</mo></mrow><mo id="S3.SS2.p2.3.m3.1.12.6" stretchy="false">)</mo></mrow></mrow><annotation encoding="application/x-tex" id="S3.SS2.p2.3.m3.1c">proxy\_score(T|m(c(m_{j})))</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p2.3.m3.1d">italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e ( italic_T | italic_m ( italic_c ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) ) )</annotation></semantics></math> where <math alttext="c(m_{j})" class="ltx_Math" display="inline" id="S3.SS2.p2.4.m4.1"><semantics id="S3.SS2.p2.4.m4.1a"><mrow id="S3.SS2.p2.4.m4.1.1" xref="S3.SS2.p2.4.m4.1.1.cmml"><mi id="S3.SS2.p2.4.m4.1.1.3" xref="S3.SS2.p2.4.m4.1.1.3.cmml">c</mi><mo id="S3.SS2.p2.4.m4.1.1.2" xref="S3.SS2.p2.4.m4.1.1.2.cmml">⁢</mo><mrow id="S3.SS2.p2.4.m4.1.1.1.1" xref="S3.SS2.p2.4.m4.1.1.1.1.1.cmml"><mo id="S3.SS2.p2.4.m4.1.1.1.1.2" stretchy="false" xref="S3.SS2.p2.4.m4.1.1.1.1.1.cmml">(</mo><msub id="S3.SS2.p2.4.m4.1.1.1.1.1" xref="S3.SS2.p2.4.m4.1.1.1.1.1.cmml"><mi id="S3.SS2.p2.4.m4.1.1.1.1.1.2" xref="S3.SS2.p2.4.m4.1.1.1.1.1.2.cmml">m</mi><mi id="S3.SS2.p2.4.m4.1.1.1.1.1.3" xref="S3.SS2.p2.4.m4.1.1.1.1.1.3.cmml">j</mi></msub><mo id="S3.SS2.p2.4.m4.1.1.1.1.3" stretchy="false" 
xref="S3.SS2.p2.4.m4.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p2.4.m4.1b"><apply id="S3.SS2.p2.4.m4.1.1.cmml" xref="S3.SS2.p2.4.m4.1.1"><times id="S3.SS2.p2.4.m4.1.1.2.cmml" xref="S3.SS2.p2.4.m4.1.1.2"></times><ci id="S3.SS2.p2.4.m4.1.1.3.cmml" xref="S3.SS2.p2.4.m4.1.1.3">𝑐</ci><apply id="S3.SS2.p2.4.m4.1.1.1.1.1.cmml" xref="S3.SS2.p2.4.m4.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS2.p2.4.m4.1.1.1.1.1.1.cmml" xref="S3.SS2.p2.4.m4.1.1.1.1">subscript</csymbol><ci id="S3.SS2.p2.4.m4.1.1.1.1.1.2.cmml" xref="S3.SS2.p2.4.m4.1.1.1.1.1.2">𝑚</ci><ci id="S3.SS2.p2.4.m4.1.1.1.1.1.3.cmml" xref="S3.SS2.p2.4.m4.1.1.1.1.1.3">𝑗</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p2.4.m4.1c">c(m_{j})</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p2.4.m4.1d">italic_c ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT )</annotation></semantics></math> denotes the cluster to which model <math alttext="m_{j}" class="ltx_Math" display="inline" id="S3.SS2.p2.5.m5.1"><semantics id="S3.SS2.p2.5.m5.1a"><msub id="S3.SS2.p2.5.m5.1.1" xref="S3.SS2.p2.5.m5.1.1.cmml"><mi id="S3.SS2.p2.5.m5.1.1.2" xref="S3.SS2.p2.5.m5.1.1.2.cmml">m</mi><mi id="S3.SS2.p2.5.m5.1.1.3" xref="S3.SS2.p2.5.m5.1.1.3.cmml">j</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS2.p2.5.m5.1b"><apply id="S3.SS2.p2.5.m5.1.1.cmml" xref="S3.SS2.p2.5.m5.1.1"><csymbol cd="ambiguous" id="S3.SS2.p2.5.m5.1.1.1.cmml" xref="S3.SS2.p2.5.m5.1.1">subscript</csymbol><ci id="S3.SS2.p2.5.m5.1.1.2.cmml" xref="S3.SS2.p2.5.m5.1.1.2">𝑚</ci><ci id="S3.SS2.p2.5.m5.1.1.3.cmml" xref="S3.SS2.p2.5.m5.1.1.3">𝑗</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p2.5.m5.1c">m_{j}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p2.5.m5.1d">italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT</annotation></semantics></math> belongs. 
Meanwhile, as there may be a number of singleton model clusters (<math alttext="|Ci|=1" class="ltx_Math" display="inline" id="S3.SS2.p2.6.m6.1"><semantics id="S3.SS2.p2.6.m6.1a"><mrow id="S3.SS2.p2.6.m6.1.1" xref="S3.SS2.p2.6.m6.1.1.cmml"><mrow id="S3.SS2.p2.6.m6.1.1.1.1" xref="S3.SS2.p2.6.m6.1.1.1.2.cmml"><mo id="S3.SS2.p2.6.m6.1.1.1.1.2" stretchy="false" xref="S3.SS2.p2.6.m6.1.1.1.2.1.cmml">|</mo><mrow id="S3.SS2.p2.6.m6.1.1.1.1.1" xref="S3.SS2.p2.6.m6.1.1.1.1.1.cmml"><mi id="S3.SS2.p2.6.m6.1.1.1.1.1.2" xref="S3.SS2.p2.6.m6.1.1.1.1.1.2.cmml">C</mi><mo id="S3.SS2.p2.6.m6.1.1.1.1.1.1" xref="S3.SS2.p2.6.m6.1.1.1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.6.m6.1.1.1.1.1.3" xref="S3.SS2.p2.6.m6.1.1.1.1.1.3.cmml">i</mi></mrow><mo id="S3.SS2.p2.6.m6.1.1.1.1.3" stretchy="false" xref="S3.SS2.p2.6.m6.1.1.1.2.1.cmml">|</mo></mrow><mo id="S3.SS2.p2.6.m6.1.1.2" xref="S3.SS2.p2.6.m6.1.1.2.cmml">=</mo><mn id="S3.SS2.p2.6.m6.1.1.3" xref="S3.SS2.p2.6.m6.1.1.3.cmml">1</mn></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p2.6.m6.1b"><apply id="S3.SS2.p2.6.m6.1.1.cmml" xref="S3.SS2.p2.6.m6.1.1"><eq id="S3.SS2.p2.6.m6.1.1.2.cmml" xref="S3.SS2.p2.6.m6.1.1.2"></eq><apply id="S3.SS2.p2.6.m6.1.1.1.2.cmml" xref="S3.SS2.p2.6.m6.1.1.1.1"><abs id="S3.SS2.p2.6.m6.1.1.1.2.1.cmml" xref="S3.SS2.p2.6.m6.1.1.1.1.2"></abs><apply id="S3.SS2.p2.6.m6.1.1.1.1.1.cmml" xref="S3.SS2.p2.6.m6.1.1.1.1.1"><times id="S3.SS2.p2.6.m6.1.1.1.1.1.1.cmml" xref="S3.SS2.p2.6.m6.1.1.1.1.1.1"></times><ci id="S3.SS2.p2.6.m6.1.1.1.1.1.2.cmml" xref="S3.SS2.p2.6.m6.1.1.1.1.1.2">𝐶</ci><ci id="S3.SS2.p2.6.m6.1.1.1.1.1.3.cmml" xref="S3.SS2.p2.6.m6.1.1.1.1.1.3">𝑖</ci></apply></apply><cn id="S3.SS2.p2.6.m6.1.1.3.cmml" type="integer" xref="S3.SS2.p2.6.m6.1.1.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p2.6.m6.1c">|Ci|=1</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p2.6.m6.1d">| italic_C italic_i | = 1</annotation></semantics></math>) after model clustering, the 
<math alttext="proxy\_score" class="ltx_Math" display="inline" id="S3.SS2.p2.7.m7.1"><semantics id="S3.SS2.p2.7.m7.1a"><mrow id="S3.SS2.p2.7.m7.1.1" xref="S3.SS2.p2.7.m7.1.1.cmml"><mi id="S3.SS2.p2.7.m7.1.1.2" xref="S3.SS2.p2.7.m7.1.1.2.cmml">p</mi><mo id="S3.SS2.p2.7.m7.1.1.1" xref="S3.SS2.p2.7.m7.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.7.m7.1.1.3" xref="S3.SS2.p2.7.m7.1.1.3.cmml">r</mi><mo id="S3.SS2.p2.7.m7.1.1.1a" xref="S3.SS2.p2.7.m7.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.7.m7.1.1.4" xref="S3.SS2.p2.7.m7.1.1.4.cmml">o</mi><mo id="S3.SS2.p2.7.m7.1.1.1b" xref="S3.SS2.p2.7.m7.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.7.m7.1.1.5" xref="S3.SS2.p2.7.m7.1.1.5.cmml">x</mi><mo id="S3.SS2.p2.7.m7.1.1.1c" xref="S3.SS2.p2.7.m7.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.7.m7.1.1.6" xref="S3.SS2.p2.7.m7.1.1.6.cmml">y</mi><mo id="S3.SS2.p2.7.m7.1.1.1d" xref="S3.SS2.p2.7.m7.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.7.m7.1.1.7" mathvariant="normal" xref="S3.SS2.p2.7.m7.1.1.7.cmml">_</mi><mo id="S3.SS2.p2.7.m7.1.1.1e" xref="S3.SS2.p2.7.m7.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.7.m7.1.1.8" xref="S3.SS2.p2.7.m7.1.1.8.cmml">s</mi><mo id="S3.SS2.p2.7.m7.1.1.1f" xref="S3.SS2.p2.7.m7.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.7.m7.1.1.9" xref="S3.SS2.p2.7.m7.1.1.9.cmml">c</mi><mo id="S3.SS2.p2.7.m7.1.1.1g" xref="S3.SS2.p2.7.m7.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.7.m7.1.1.10" xref="S3.SS2.p2.7.m7.1.1.10.cmml">o</mi><mo id="S3.SS2.p2.7.m7.1.1.1h" xref="S3.SS2.p2.7.m7.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.7.m7.1.1.11" xref="S3.SS2.p2.7.m7.1.1.11.cmml">r</mi><mo id="S3.SS2.p2.7.m7.1.1.1i" xref="S3.SS2.p2.7.m7.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.7.m7.1.1.12" xref="S3.SS2.p2.7.m7.1.1.12.cmml">e</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p2.7.m7.1b"><apply id="S3.SS2.p2.7.m7.1.1.cmml" xref="S3.SS2.p2.7.m7.1.1"><times id="S3.SS2.p2.7.m7.1.1.1.cmml" xref="S3.SS2.p2.7.m7.1.1.1"></times><ci id="S3.SS2.p2.7.m7.1.1.2.cmml" xref="S3.SS2.p2.7.m7.1.1.2">𝑝</ci><ci id="S3.SS2.p2.7.m7.1.1.3.cmml" 
xref="S3.SS2.p2.7.m7.1.1.3">𝑟</ci><ci id="S3.SS2.p2.7.m7.1.1.4.cmml" xref="S3.SS2.p2.7.m7.1.1.4">𝑜</ci><ci id="S3.SS2.p2.7.m7.1.1.5.cmml" xref="S3.SS2.p2.7.m7.1.1.5">𝑥</ci><ci id="S3.SS2.p2.7.m7.1.1.6.cmml" xref="S3.SS2.p2.7.m7.1.1.6">𝑦</ci><ci id="S3.SS2.p2.7.m7.1.1.7.cmml" xref="S3.SS2.p2.7.m7.1.1.7">_</ci><ci id="S3.SS2.p2.7.m7.1.1.8.cmml" xref="S3.SS2.p2.7.m7.1.1.8">𝑠</ci><ci id="S3.SS2.p2.7.m7.1.1.9.cmml" xref="S3.SS2.p2.7.m7.1.1.9">𝑐</ci><ci id="S3.SS2.p2.7.m7.1.1.10.cmml" xref="S3.SS2.p2.7.m7.1.1.10">𝑜</ci><ci id="S3.SS2.p2.7.m7.1.1.11.cmml" xref="S3.SS2.p2.7.m7.1.1.11">𝑟</ci><ci id="S3.SS2.p2.7.m7.1.1.12.cmml" xref="S3.SS2.p2.7.m7.1.1.12">𝑒</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p2.7.m7.1c">proxy\_score</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p2.7.m7.1d">italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e</annotation></semantics></math> is computed only between the target task <math alttext="T" class="ltx_Math" display="inline" id="S3.SS2.p2.8.m8.1"><semantics id="S3.SS2.p2.8.m8.1a"><mi id="S3.SS2.p2.8.m8.1.1" xref="S3.SS2.p2.8.m8.1.1.cmml">T</mi><annotation-xml encoding="MathML-Content" id="S3.SS2.p2.8.m8.1b"><ci id="S3.SS2.p2.8.m8.1.1.cmml" xref="S3.SS2.p2.8.m8.1.1">𝑇</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p2.8.m8.1c">T</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p2.8.m8.1d">italic_T</annotation></semantics></math> and the non-singleton clusters, for efficiency. 
Therefore, for models in non-singleton clusters, the <math alttext="recall\_score" class="ltx_Math" display="inline" id="S3.SS2.p2.9.m9.1"><semantics id="S3.SS2.p2.9.m9.1a"><mrow id="S3.SS2.p2.9.m9.1.1" xref="S3.SS2.p2.9.m9.1.1.cmml"><mi id="S3.SS2.p2.9.m9.1.1.2" xref="S3.SS2.p2.9.m9.1.1.2.cmml">r</mi><mo id="S3.SS2.p2.9.m9.1.1.1" xref="S3.SS2.p2.9.m9.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.9.m9.1.1.3" xref="S3.SS2.p2.9.m9.1.1.3.cmml">e</mi><mo id="S3.SS2.p2.9.m9.1.1.1a" xref="S3.SS2.p2.9.m9.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.9.m9.1.1.4" xref="S3.SS2.p2.9.m9.1.1.4.cmml">c</mi><mo id="S3.SS2.p2.9.m9.1.1.1b" xref="S3.SS2.p2.9.m9.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.9.m9.1.1.5" xref="S3.SS2.p2.9.m9.1.1.5.cmml">a</mi><mo id="S3.SS2.p2.9.m9.1.1.1c" xref="S3.SS2.p2.9.m9.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.9.m9.1.1.6" xref="S3.SS2.p2.9.m9.1.1.6.cmml">l</mi><mo id="S3.SS2.p2.9.m9.1.1.1d" xref="S3.SS2.p2.9.m9.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.9.m9.1.1.7" xref="S3.SS2.p2.9.m9.1.1.7.cmml">l</mi><mo id="S3.SS2.p2.9.m9.1.1.1e" xref="S3.SS2.p2.9.m9.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.9.m9.1.1.8" mathvariant="normal" xref="S3.SS2.p2.9.m9.1.1.8.cmml">_</mi><mo id="S3.SS2.p2.9.m9.1.1.1f" xref="S3.SS2.p2.9.m9.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.9.m9.1.1.9" xref="S3.SS2.p2.9.m9.1.1.9.cmml">s</mi><mo id="S3.SS2.p2.9.m9.1.1.1g" xref="S3.SS2.p2.9.m9.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.9.m9.1.1.10" xref="S3.SS2.p2.9.m9.1.1.10.cmml">c</mi><mo id="S3.SS2.p2.9.m9.1.1.1h" xref="S3.SS2.p2.9.m9.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.9.m9.1.1.11" xref="S3.SS2.p2.9.m9.1.1.11.cmml">o</mi><mo id="S3.SS2.p2.9.m9.1.1.1i" xref="S3.SS2.p2.9.m9.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.9.m9.1.1.12" xref="S3.SS2.p2.9.m9.1.1.12.cmml">r</mi><mo id="S3.SS2.p2.9.m9.1.1.1j" xref="S3.SS2.p2.9.m9.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p2.9.m9.1.1.13" xref="S3.SS2.p2.9.m9.1.1.13.cmml">e</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p2.9.m9.1b"><apply id="S3.SS2.p2.9.m9.1.1.cmml" 
xref="S3.SS2.p2.9.m9.1.1"><times id="S3.SS2.p2.9.m9.1.1.1.cmml" xref="S3.SS2.p2.9.m9.1.1.1"></times><ci id="S3.SS2.p2.9.m9.1.1.2.cmml" xref="S3.SS2.p2.9.m9.1.1.2">𝑟</ci><ci id="S3.SS2.p2.9.m9.1.1.3.cmml" xref="S3.SS2.p2.9.m9.1.1.3">𝑒</ci><ci id="S3.SS2.p2.9.m9.1.1.4.cmml" xref="S3.SS2.p2.9.m9.1.1.4">𝑐</ci><ci id="S3.SS2.p2.9.m9.1.1.5.cmml" xref="S3.SS2.p2.9.m9.1.1.5">𝑎</ci><ci id="S3.SS2.p2.9.m9.1.1.6.cmml" xref="S3.SS2.p2.9.m9.1.1.6">𝑙</ci><ci id="S3.SS2.p2.9.m9.1.1.7.cmml" xref="S3.SS2.p2.9.m9.1.1.7">𝑙</ci><ci id="S3.SS2.p2.9.m9.1.1.8.cmml" xref="S3.SS2.p2.9.m9.1.1.8">_</ci><ci id="S3.SS2.p2.9.m9.1.1.9.cmml" xref="S3.SS2.p2.9.m9.1.1.9">𝑠</ci><ci id="S3.SS2.p2.9.m9.1.1.10.cmml" xref="S3.SS2.p2.9.m9.1.1.10">𝑐</ci><ci id="S3.SS2.p2.9.m9.1.1.11.cmml" xref="S3.SS2.p2.9.m9.1.1.11">𝑜</ci><ci id="S3.SS2.p2.9.m9.1.1.12.cmml" xref="S3.SS2.p2.9.m9.1.1.12">𝑟</ci><ci id="S3.SS2.p2.9.m9.1.1.13.cmml" xref="S3.SS2.p2.9.m9.1.1.13">𝑒</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p2.9.m9.1c">recall\_score</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p2.9.m9.1d">italic_r italic_e italic_c italic_a italic_l italic_l _ italic_s italic_c italic_o italic_r italic_e</annotation></semantics></math> is computed as:</p> <table class="ltx_equation ltx_eqn_table" id="S3.E3"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="\begin{split}\begin{aligned} recall&amp;\_score(T|m_{j})=acc(m_{j})\cdot\\ &amp;proxy\_score(T|m(c(m_{j})))\ for\ |c(m_{j})|&gt;1\end{aligned}\end{split}" class="ltx_math_unparsed" display="block" id="S3.E3.m1.1"><semantics id="S3.E3.m1.1a"><mtable displaystyle="true" id="S3.E3.m1.1.1"><mtr id="S3.E3.m1.1.1a"><mtd class="ltx_align_right" columnalign="right" id="S3.E3.m1.1.1b"><mtable columnspacing="0pt" displaystyle="true" id="S3.E3.m1.1.1.1.1.1.1" rowspacing="0pt"><mtr 
id="S3.E3.m1.1.1.1.1.1.1a"><mtd class="ltx_align_right" columnalign="right" id="S3.E3.m1.1.1.1.1.1.1b"><mrow id="S3.E3.m1.1.1.1.1.1.1.2.1.1"><mi id="S3.E3.m1.1.1.1.1.1.1.2.1.1.2">r</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.1.1.1">⁢</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.1.1.3">e</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.1.1.1a">⁢</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.1.1.4">c</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.1.1.1b">⁢</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.1.1.5">a</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.1.1.1c">⁢</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.1.1.6">l</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.1.1.1d">⁢</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.1.1.7">l</mi></mrow></mtd><mtd class="ltx_align_left" columnalign="left" id="S3.E3.m1.1.1.1.1.1.1c"><mrow id="S3.E3.m1.1.1.1.1.1.1.2.2.1"><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.1.1" mathvariant="normal">_</mi><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.1.2">s</mi><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.1.3">c</mi><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.1.4">o</mi><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.1.5">r</mi><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.1.6">e</mi><mrow id="S3.E3.m1.1.1.1.1.1.1.2.2.1.7"><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.1.7.1" stretchy="false">(</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.1.7.2">T</mi><mo fence="false" id="S3.E3.m1.1.1.1.1.1.1.2.2.1.7.3" rspace="0.167em" stretchy="false">|</mo><msub id="S3.E3.m1.1.1.1.1.1.1.2.2.1.7.4"><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.1.7.4.2">m</mi><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.1.7.4.3">j</mi></msub><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.1.7.5" stretchy="false">)</mo></mrow><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.1.8">=</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.1.9">a</mi><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.1.10">c</mi><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.1.11">c</mi><mrow id="S3.E3.m1.1.1.1.1.1.1.2.2.1.12"><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.1.12.1" stretchy="false">(</mo><msub id="S3.E3.m1.1.1.1.1.1.1.2.2.1.12.2"><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.1.12.2.2">m</mi><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.1.12.2.3">j</mi></msub><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.1.12.3" rspace="0.055em" 
stretchy="false">)</mo></mrow><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.1.13">⋅</mo></mrow></mtd></mtr><mtr id="S3.E3.m1.1.1.1.1.1.1d"><mtd id="S3.E3.m1.1.1.1.1.1.1e"></mtd><mtd class="ltx_align_left" columnalign="left" id="S3.E3.m1.1.1.1.1.1.1f"><mrow id="S3.E3.m1.1.1.1.1.1.1.2.2.2"><mrow id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2"><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.4">p</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.3">⁢</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.5">r</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.3a">⁢</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.6">o</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.3b">⁢</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.7">x</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.3c">⁢</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.8">y</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.3d">⁢</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.9" mathvariant="normal">_</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.3e">⁢</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.10">s</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.3f">⁢</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.11">c</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.3g">⁢</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.12">o</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.3h">⁢</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.13">r</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.3i">⁢</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.14">e</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.3j">⁢</mo><mrow id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1"><mo id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.2" stretchy="false">(</mo><mrow id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1"><mi id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.3">T</mi><mo fence="false" id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.2">|</mo><mrow id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1"><mi id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3">m</mi><mo id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2">⁢</mo><mrow id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1"><mo id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2" stretchy="false">(</mo><mrow id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1"><mi 
id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3">c</mi><mo id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2">⁢</mo><mrow id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1"><mo id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2" stretchy="false">(</mo><msub id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1"><mi id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2">m</mi><mi id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3">j</mi></msub><mo id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3" stretchy="false">)</mo></mrow></mrow><mo id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3" stretchy="false">)</mo></mrow></mrow></mrow><mo id="S3.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.3" stretchy="false">)</mo></mrow><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.3k" lspace="0.500em">⁢</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.15">f</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.3l">⁢</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.16">o</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.3m">⁢</mo><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.17">r</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.3n" lspace="0.500em">⁢</mo><mrow id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.2.1"><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.2.1.2" stretchy="false">|</mo><mrow id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.2.1.1"><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.2.1.1.3">c</mi><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.2.1.1.2">⁢</mo><mrow id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.2.1.1.1.1"><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.2.1.1.1.1.2" stretchy="false">(</mo><msub id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.2.1.1.1.1.1"><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.2.1.1.1.1.1.2">m</mi><mi id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.2.1.1.1.1.1.3">j</mi></msub><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.2.1.1.1.1.3" stretchy="false">)</mo></mrow></mrow><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.2.2.1.3" stretchy="false">|</mo></mrow></mrow><mo id="S3.E3.m1.1.1.1.1.1.1.2.2.2.3">&gt;</mo><mn id="S3.E3.m1.1.1.1.1.1.1.2.2.2.4">1</mn></mrow></mtd></mtr></mtable></mtd></mtr></mtable><annotation encoding="application/x-tex" 
id="S3.E3.m1.1b">\begin{split}\begin{aligned} recall&amp;\_score(T|m_{j})=acc(m_{j})\cdot\\ &amp;proxy\_score(T|m(c(m_{j})))\ for\ |c(m_{j})|&gt;1\end{aligned}\end{split}</annotation><annotation encoding="application/x-llamapun" id="S3.E3.m1.1c">start_ROW start_CELL start_ROW start_CELL italic_r italic_e italic_c italic_a italic_l italic_l end_CELL start_CELL _ italic_s italic_c italic_o italic_r italic_e ( italic_T | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) = italic_a italic_c italic_c ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) ⋅ end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e ( italic_T | italic_m ( italic_c ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) ) ) italic_f italic_o italic_r | italic_c ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) | &gt; 1 end_CELL end_ROW end_CELL end_ROW</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(3)</span></td> </tr></tbody> </table> </div> <div class="ltx_para" id="S3.SS2.p3"> <p class="ltx_p" id="S3.SS2.p3.5">As we do not compute the <math alttext="proxy\_score" class="ltx_Math" display="inline" id="S3.SS2.p3.1.m1.1"><semantics id="S3.SS2.p3.1.m1.1a"><mrow id="S3.SS2.p3.1.m1.1.1" xref="S3.SS2.p3.1.m1.1.1.cmml"><mi id="S3.SS2.p3.1.m1.1.1.2" xref="S3.SS2.p3.1.m1.1.1.2.cmml">p</mi><mo id="S3.SS2.p3.1.m1.1.1.1" xref="S3.SS2.p3.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.1.m1.1.1.3" xref="S3.SS2.p3.1.m1.1.1.3.cmml">r</mi><mo id="S3.SS2.p3.1.m1.1.1.1a" xref="S3.SS2.p3.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.1.m1.1.1.4" xref="S3.SS2.p3.1.m1.1.1.4.cmml">o</mi><mo id="S3.SS2.p3.1.m1.1.1.1b" xref="S3.SS2.p3.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.1.m1.1.1.5" 
xref="S3.SS2.p3.1.m1.1.1.5.cmml">x</mi><mo id="S3.SS2.p3.1.m1.1.1.1c" xref="S3.SS2.p3.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.1.m1.1.1.6" xref="S3.SS2.p3.1.m1.1.1.6.cmml">y</mi><mo id="S3.SS2.p3.1.m1.1.1.1d" xref="S3.SS2.p3.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.1.m1.1.1.7" mathvariant="normal" xref="S3.SS2.p3.1.m1.1.1.7.cmml">_</mi><mo id="S3.SS2.p3.1.m1.1.1.1e" xref="S3.SS2.p3.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.1.m1.1.1.8" xref="S3.SS2.p3.1.m1.1.1.8.cmml">s</mi><mo id="S3.SS2.p3.1.m1.1.1.1f" xref="S3.SS2.p3.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.1.m1.1.1.9" xref="S3.SS2.p3.1.m1.1.1.9.cmml">c</mi><mo id="S3.SS2.p3.1.m1.1.1.1g" xref="S3.SS2.p3.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.1.m1.1.1.10" xref="S3.SS2.p3.1.m1.1.1.10.cmml">o</mi><mo id="S3.SS2.p3.1.m1.1.1.1h" xref="S3.SS2.p3.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.1.m1.1.1.11" xref="S3.SS2.p3.1.m1.1.1.11.cmml">r</mi><mo id="S3.SS2.p3.1.m1.1.1.1i" xref="S3.SS2.p3.1.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.1.m1.1.1.12" xref="S3.SS2.p3.1.m1.1.1.12.cmml">e</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p3.1.m1.1b"><apply id="S3.SS2.p3.1.m1.1.1.cmml" xref="S3.SS2.p3.1.m1.1.1"><times id="S3.SS2.p3.1.m1.1.1.1.cmml" xref="S3.SS2.p3.1.m1.1.1.1"></times><ci id="S3.SS2.p3.1.m1.1.1.2.cmml" xref="S3.SS2.p3.1.m1.1.1.2">𝑝</ci><ci id="S3.SS2.p3.1.m1.1.1.3.cmml" xref="S3.SS2.p3.1.m1.1.1.3">𝑟</ci><ci id="S3.SS2.p3.1.m1.1.1.4.cmml" xref="S3.SS2.p3.1.m1.1.1.4">𝑜</ci><ci id="S3.SS2.p3.1.m1.1.1.5.cmml" xref="S3.SS2.p3.1.m1.1.1.5">𝑥</ci><ci id="S3.SS2.p3.1.m1.1.1.6.cmml" xref="S3.SS2.p3.1.m1.1.1.6">𝑦</ci><ci id="S3.SS2.p3.1.m1.1.1.7.cmml" xref="S3.SS2.p3.1.m1.1.1.7">_</ci><ci id="S3.SS2.p3.1.m1.1.1.8.cmml" xref="S3.SS2.p3.1.m1.1.1.8">𝑠</ci><ci id="S3.SS2.p3.1.m1.1.1.9.cmml" xref="S3.SS2.p3.1.m1.1.1.9">𝑐</ci><ci id="S3.SS2.p3.1.m1.1.1.10.cmml" xref="S3.SS2.p3.1.m1.1.1.10">𝑜</ci><ci id="S3.SS2.p3.1.m1.1.1.11.cmml" xref="S3.SS2.p3.1.m1.1.1.11">𝑟</ci><ci id="S3.SS2.p3.1.m1.1.1.12.cmml" 
xref="S3.SS2.p3.1.m1.1.1.12">𝑒</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p3.1.m1.1c">proxy\_score</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p3.1.m1.1d">italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e</annotation></semantics></math> directly for singleton model clusters, the <math alttext="recall\_score" class="ltx_Math" display="inline" id="S3.SS2.p3.2.m2.1"><semantics id="S3.SS2.p3.2.m2.1a"><mrow id="S3.SS2.p3.2.m2.1.1" xref="S3.SS2.p3.2.m2.1.1.cmml"><mi id="S3.SS2.p3.2.m2.1.1.2" xref="S3.SS2.p3.2.m2.1.1.2.cmml">r</mi><mo id="S3.SS2.p3.2.m2.1.1.1" xref="S3.SS2.p3.2.m2.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.2.m2.1.1.3" xref="S3.SS2.p3.2.m2.1.1.3.cmml">e</mi><mo id="S3.SS2.p3.2.m2.1.1.1a" xref="S3.SS2.p3.2.m2.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.2.m2.1.1.4" xref="S3.SS2.p3.2.m2.1.1.4.cmml">c</mi><mo id="S3.SS2.p3.2.m2.1.1.1b" xref="S3.SS2.p3.2.m2.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.2.m2.1.1.5" xref="S3.SS2.p3.2.m2.1.1.5.cmml">a</mi><mo id="S3.SS2.p3.2.m2.1.1.1c" xref="S3.SS2.p3.2.m2.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.2.m2.1.1.6" xref="S3.SS2.p3.2.m2.1.1.6.cmml">l</mi><mo id="S3.SS2.p3.2.m2.1.1.1d" xref="S3.SS2.p3.2.m2.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.2.m2.1.1.7" xref="S3.SS2.p3.2.m2.1.1.7.cmml">l</mi><mo id="S3.SS2.p3.2.m2.1.1.1e" xref="S3.SS2.p3.2.m2.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.2.m2.1.1.8" mathvariant="normal" xref="S3.SS2.p3.2.m2.1.1.8.cmml">_</mi><mo id="S3.SS2.p3.2.m2.1.1.1f" xref="S3.SS2.p3.2.m2.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.2.m2.1.1.9" xref="S3.SS2.p3.2.m2.1.1.9.cmml">s</mi><mo id="S3.SS2.p3.2.m2.1.1.1g" xref="S3.SS2.p3.2.m2.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.2.m2.1.1.10" xref="S3.SS2.p3.2.m2.1.1.10.cmml">c</mi><mo id="S3.SS2.p3.2.m2.1.1.1h" xref="S3.SS2.p3.2.m2.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.2.m2.1.1.11" xref="S3.SS2.p3.2.m2.1.1.11.cmml">o</mi><mo id="S3.SS2.p3.2.m2.1.1.1i" xref="S3.SS2.p3.2.m2.1.1.1.cmml">⁢</mo><mi 
id="S3.SS2.p3.2.m2.1.1.12" xref="S3.SS2.p3.2.m2.1.1.12.cmml">r</mi><mo id="S3.SS2.p3.2.m2.1.1.1j" xref="S3.SS2.p3.2.m2.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.2.m2.1.1.13" xref="S3.SS2.p3.2.m2.1.1.13.cmml">e</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p3.2.m2.1b"><apply id="S3.SS2.p3.2.m2.1.1.cmml" xref="S3.SS2.p3.2.m2.1.1"><times id="S3.SS2.p3.2.m2.1.1.1.cmml" xref="S3.SS2.p3.2.m2.1.1.1"></times><ci id="S3.SS2.p3.2.m2.1.1.2.cmml" xref="S3.SS2.p3.2.m2.1.1.2">𝑟</ci><ci id="S3.SS2.p3.2.m2.1.1.3.cmml" xref="S3.SS2.p3.2.m2.1.1.3">𝑒</ci><ci id="S3.SS2.p3.2.m2.1.1.4.cmml" xref="S3.SS2.p3.2.m2.1.1.4">𝑐</ci><ci id="S3.SS2.p3.2.m2.1.1.5.cmml" xref="S3.SS2.p3.2.m2.1.1.5">𝑎</ci><ci id="S3.SS2.p3.2.m2.1.1.6.cmml" xref="S3.SS2.p3.2.m2.1.1.6">𝑙</ci><ci id="S3.SS2.p3.2.m2.1.1.7.cmml" xref="S3.SS2.p3.2.m2.1.1.7">𝑙</ci><ci id="S3.SS2.p3.2.m2.1.1.8.cmml" xref="S3.SS2.p3.2.m2.1.1.8">_</ci><ci id="S3.SS2.p3.2.m2.1.1.9.cmml" xref="S3.SS2.p3.2.m2.1.1.9">𝑠</ci><ci id="S3.SS2.p3.2.m2.1.1.10.cmml" xref="S3.SS2.p3.2.m2.1.1.10">𝑐</ci><ci id="S3.SS2.p3.2.m2.1.1.11.cmml" xref="S3.SS2.p3.2.m2.1.1.11">𝑜</ci><ci id="S3.SS2.p3.2.m2.1.1.12.cmml" xref="S3.SS2.p3.2.m2.1.1.12">𝑟</ci><ci id="S3.SS2.p3.2.m2.1.1.13.cmml" xref="S3.SS2.p3.2.m2.1.1.13">𝑒</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p3.2.m2.1c">recall\_score</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p3.2.m2.1d">italic_r italic_e italic_c italic_a italic_l italic_l _ italic_s italic_c italic_o italic_r italic_e</annotation></semantics></math> for models in singleton clusters is computed as in Eq. 
<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S3.E4" title="In III-B Model Recall ‣ III Coarse Recall ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">4</span></a> where the <math alttext="proxy\_score" class="ltx_Math" display="inline" id="S3.SS2.p3.3.m3.1"><semantics id="S3.SS2.p3.3.m3.1a"><mrow id="S3.SS2.p3.3.m3.1.1" xref="S3.SS2.p3.3.m3.1.1.cmml"><mi id="S3.SS2.p3.3.m3.1.1.2" xref="S3.SS2.p3.3.m3.1.1.2.cmml">p</mi><mo id="S3.SS2.p3.3.m3.1.1.1" xref="S3.SS2.p3.3.m3.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.3.m3.1.1.3" xref="S3.SS2.p3.3.m3.1.1.3.cmml">r</mi><mo id="S3.SS2.p3.3.m3.1.1.1a" xref="S3.SS2.p3.3.m3.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.3.m3.1.1.4" xref="S3.SS2.p3.3.m3.1.1.4.cmml">o</mi><mo id="S3.SS2.p3.3.m3.1.1.1b" xref="S3.SS2.p3.3.m3.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.3.m3.1.1.5" xref="S3.SS2.p3.3.m3.1.1.5.cmml">x</mi><mo id="S3.SS2.p3.3.m3.1.1.1c" xref="S3.SS2.p3.3.m3.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.3.m3.1.1.6" xref="S3.SS2.p3.3.m3.1.1.6.cmml">y</mi><mo id="S3.SS2.p3.3.m3.1.1.1d" xref="S3.SS2.p3.3.m3.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.3.m3.1.1.7" mathvariant="normal" xref="S3.SS2.p3.3.m3.1.1.7.cmml">_</mi><mo id="S3.SS2.p3.3.m3.1.1.1e" xref="S3.SS2.p3.3.m3.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.3.m3.1.1.8" xref="S3.SS2.p3.3.m3.1.1.8.cmml">s</mi><mo id="S3.SS2.p3.3.m3.1.1.1f" xref="S3.SS2.p3.3.m3.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.3.m3.1.1.9" xref="S3.SS2.p3.3.m3.1.1.9.cmml">c</mi><mo id="S3.SS2.p3.3.m3.1.1.1g" xref="S3.SS2.p3.3.m3.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.3.m3.1.1.10" xref="S3.SS2.p3.3.m3.1.1.10.cmml">o</mi><mo id="S3.SS2.p3.3.m3.1.1.1h" xref="S3.SS2.p3.3.m3.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.3.m3.1.1.11" xref="S3.SS2.p3.3.m3.1.1.11.cmml">r</mi><mo id="S3.SS2.p3.3.m3.1.1.1i" xref="S3.SS2.p3.3.m3.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.3.m3.1.1.12" xref="S3.SS2.p3.3.m3.1.1.12.cmml">e</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p3.3.m3.1b"><apply 
id="S3.SS2.p3.3.m3.1.1.cmml" xref="S3.SS2.p3.3.m3.1.1"><times id="S3.SS2.p3.3.m3.1.1.1.cmml" xref="S3.SS2.p3.3.m3.1.1.1"></times><ci id="S3.SS2.p3.3.m3.1.1.2.cmml" xref="S3.SS2.p3.3.m3.1.1.2">𝑝</ci><ci id="S3.SS2.p3.3.m3.1.1.3.cmml" xref="S3.SS2.p3.3.m3.1.1.3">𝑟</ci><ci id="S3.SS2.p3.3.m3.1.1.4.cmml" xref="S3.SS2.p3.3.m3.1.1.4">𝑜</ci><ci id="S3.SS2.p3.3.m3.1.1.5.cmml" xref="S3.SS2.p3.3.m3.1.1.5">𝑥</ci><ci id="S3.SS2.p3.3.m3.1.1.6.cmml" xref="S3.SS2.p3.3.m3.1.1.6">𝑦</ci><ci id="S3.SS2.p3.3.m3.1.1.7.cmml" xref="S3.SS2.p3.3.m3.1.1.7">_</ci><ci id="S3.SS2.p3.3.m3.1.1.8.cmml" xref="S3.SS2.p3.3.m3.1.1.8">𝑠</ci><ci id="S3.SS2.p3.3.m3.1.1.9.cmml" xref="S3.SS2.p3.3.m3.1.1.9">𝑐</ci><ci id="S3.SS2.p3.3.m3.1.1.10.cmml" xref="S3.SS2.p3.3.m3.1.1.10">𝑜</ci><ci id="S3.SS2.p3.3.m3.1.1.11.cmml" xref="S3.SS2.p3.3.m3.1.1.11">𝑟</ci><ci id="S3.SS2.p3.3.m3.1.1.12.cmml" xref="S3.SS2.p3.3.m3.1.1.12">𝑒</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p3.3.m3.1c">proxy\_score</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p3.3.m3.1d">italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e</annotation></semantics></math> is propagated from the representative models of non-singleton clusters (denoted as <math alttext="C_{non}" class="ltx_Math" display="inline" id="S3.SS2.p3.4.m4.1"><semantics id="S3.SS2.p3.4.m4.1a"><msub id="S3.SS2.p3.4.m4.1.1" xref="S3.SS2.p3.4.m4.1.1.cmml"><mi id="S3.SS2.p3.4.m4.1.1.2" xref="S3.SS2.p3.4.m4.1.1.2.cmml">C</mi><mrow id="S3.SS2.p3.4.m4.1.1.3" xref="S3.SS2.p3.4.m4.1.1.3.cmml"><mi id="S3.SS2.p3.4.m4.1.1.3.2" xref="S3.SS2.p3.4.m4.1.1.3.2.cmml">n</mi><mo id="S3.SS2.p3.4.m4.1.1.3.1" xref="S3.SS2.p3.4.m4.1.1.3.1.cmml">⁢</mo><mi id="S3.SS2.p3.4.m4.1.1.3.3" xref="S3.SS2.p3.4.m4.1.1.3.3.cmml">o</mi><mo id="S3.SS2.p3.4.m4.1.1.3.1a" xref="S3.SS2.p3.4.m4.1.1.3.1.cmml">⁢</mo><mi id="S3.SS2.p3.4.m4.1.1.3.4" xref="S3.SS2.p3.4.m4.1.1.3.4.cmml">n</mi></mrow></msub><annotation-xml 
encoding="MathML-Content" id="S3.SS2.p3.4.m4.1b"><apply id="S3.SS2.p3.4.m4.1.1.cmml" xref="S3.SS2.p3.4.m4.1.1"><csymbol cd="ambiguous" id="S3.SS2.p3.4.m4.1.1.1.cmml" xref="S3.SS2.p3.4.m4.1.1">subscript</csymbol><ci id="S3.SS2.p3.4.m4.1.1.2.cmml" xref="S3.SS2.p3.4.m4.1.1.2">𝐶</ci><apply id="S3.SS2.p3.4.m4.1.1.3.cmml" xref="S3.SS2.p3.4.m4.1.1.3"><times id="S3.SS2.p3.4.m4.1.1.3.1.cmml" xref="S3.SS2.p3.4.m4.1.1.3.1"></times><ci id="S3.SS2.p3.4.m4.1.1.3.2.cmml" xref="S3.SS2.p3.4.m4.1.1.3.2">𝑛</ci><ci id="S3.SS2.p3.4.m4.1.1.3.3.cmml" xref="S3.SS2.p3.4.m4.1.1.3.3">𝑜</ci><ci id="S3.SS2.p3.4.m4.1.1.3.4.cmml" xref="S3.SS2.p3.4.m4.1.1.3.4">𝑛</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p3.4.m4.1c">C_{non}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p3.4.m4.1d">italic_C start_POSTSUBSCRIPT italic_n italic_o italic_n end_POSTSUBSCRIPT</annotation></semantics></math>) and decayed by the model similarity <math alttext="sim(m_{j},m(C_{k}))" class="ltx_Math" display="inline" id="S3.SS2.p3.5.m5.2"><semantics id="S3.SS2.p3.5.m5.2a"><mrow id="S3.SS2.p3.5.m5.2.2" xref="S3.SS2.p3.5.m5.2.2.cmml"><mi id="S3.SS2.p3.5.m5.2.2.4" xref="S3.SS2.p3.5.m5.2.2.4.cmml">s</mi><mo id="S3.SS2.p3.5.m5.2.2.3" xref="S3.SS2.p3.5.m5.2.2.3.cmml">⁢</mo><mi id="S3.SS2.p3.5.m5.2.2.5" xref="S3.SS2.p3.5.m5.2.2.5.cmml">i</mi><mo id="S3.SS2.p3.5.m5.2.2.3a" xref="S3.SS2.p3.5.m5.2.2.3.cmml">⁢</mo><mi id="S3.SS2.p3.5.m5.2.2.6" xref="S3.SS2.p3.5.m5.2.2.6.cmml">m</mi><mo id="S3.SS2.p3.5.m5.2.2.3b" xref="S3.SS2.p3.5.m5.2.2.3.cmml">⁢</mo><mrow id="S3.SS2.p3.5.m5.2.2.2.2" xref="S3.SS2.p3.5.m5.2.2.2.3.cmml"><mo id="S3.SS2.p3.5.m5.2.2.2.2.3" stretchy="false" xref="S3.SS2.p3.5.m5.2.2.2.3.cmml">(</mo><msub id="S3.SS2.p3.5.m5.1.1.1.1.1" xref="S3.SS2.p3.5.m5.1.1.1.1.1.cmml"><mi id="S3.SS2.p3.5.m5.1.1.1.1.1.2" xref="S3.SS2.p3.5.m5.1.1.1.1.1.2.cmml">m</mi><mi id="S3.SS2.p3.5.m5.1.1.1.1.1.3" xref="S3.SS2.p3.5.m5.1.1.1.1.1.3.cmml">j</mi></msub><mo 
id="S3.SS2.p3.5.m5.2.2.2.2.4" xref="S3.SS2.p3.5.m5.2.2.2.3.cmml">,</mo><mrow id="S3.SS2.p3.5.m5.2.2.2.2.2" xref="S3.SS2.p3.5.m5.2.2.2.2.2.cmml"><mi id="S3.SS2.p3.5.m5.2.2.2.2.2.3" xref="S3.SS2.p3.5.m5.2.2.2.2.2.3.cmml">m</mi><mo id="S3.SS2.p3.5.m5.2.2.2.2.2.2" xref="S3.SS2.p3.5.m5.2.2.2.2.2.2.cmml">⁢</mo><mrow id="S3.SS2.p3.5.m5.2.2.2.2.2.1.1" xref="S3.SS2.p3.5.m5.2.2.2.2.2.1.1.1.cmml"><mo id="S3.SS2.p3.5.m5.2.2.2.2.2.1.1.2" stretchy="false" xref="S3.SS2.p3.5.m5.2.2.2.2.2.1.1.1.cmml">(</mo><msub id="S3.SS2.p3.5.m5.2.2.2.2.2.1.1.1" xref="S3.SS2.p3.5.m5.2.2.2.2.2.1.1.1.cmml"><mi id="S3.SS2.p3.5.m5.2.2.2.2.2.1.1.1.2" xref="S3.SS2.p3.5.m5.2.2.2.2.2.1.1.1.2.cmml">C</mi><mi id="S3.SS2.p3.5.m5.2.2.2.2.2.1.1.1.3" xref="S3.SS2.p3.5.m5.2.2.2.2.2.1.1.1.3.cmml">k</mi></msub><mo id="S3.SS2.p3.5.m5.2.2.2.2.2.1.1.3" stretchy="false" xref="S3.SS2.p3.5.m5.2.2.2.2.2.1.1.1.cmml">)</mo></mrow></mrow><mo id="S3.SS2.p3.5.m5.2.2.2.2.5" stretchy="false" xref="S3.SS2.p3.5.m5.2.2.2.3.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p3.5.m5.2b"><apply id="S3.SS2.p3.5.m5.2.2.cmml" xref="S3.SS2.p3.5.m5.2.2"><times id="S3.SS2.p3.5.m5.2.2.3.cmml" xref="S3.SS2.p3.5.m5.2.2.3"></times><ci id="S3.SS2.p3.5.m5.2.2.4.cmml" xref="S3.SS2.p3.5.m5.2.2.4">𝑠</ci><ci id="S3.SS2.p3.5.m5.2.2.5.cmml" xref="S3.SS2.p3.5.m5.2.2.5">𝑖</ci><ci id="S3.SS2.p3.5.m5.2.2.6.cmml" xref="S3.SS2.p3.5.m5.2.2.6">𝑚</ci><interval closure="open" id="S3.SS2.p3.5.m5.2.2.2.3.cmml" xref="S3.SS2.p3.5.m5.2.2.2.2"><apply id="S3.SS2.p3.5.m5.1.1.1.1.1.cmml" xref="S3.SS2.p3.5.m5.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS2.p3.5.m5.1.1.1.1.1.1.cmml" xref="S3.SS2.p3.5.m5.1.1.1.1.1">subscript</csymbol><ci id="S3.SS2.p3.5.m5.1.1.1.1.1.2.cmml" xref="S3.SS2.p3.5.m5.1.1.1.1.1.2">𝑚</ci><ci id="S3.SS2.p3.5.m5.1.1.1.1.1.3.cmml" xref="S3.SS2.p3.5.m5.1.1.1.1.1.3">𝑗</ci></apply><apply id="S3.SS2.p3.5.m5.2.2.2.2.2.cmml" xref="S3.SS2.p3.5.m5.2.2.2.2.2"><times id="S3.SS2.p3.5.m5.2.2.2.2.2.2.cmml" 
xref="S3.SS2.p3.5.m5.2.2.2.2.2.2"></times><ci id="S3.SS2.p3.5.m5.2.2.2.2.2.3.cmml" xref="S3.SS2.p3.5.m5.2.2.2.2.2.3">𝑚</ci><apply id="S3.SS2.p3.5.m5.2.2.2.2.2.1.1.1.cmml" xref="S3.SS2.p3.5.m5.2.2.2.2.2.1.1"><csymbol cd="ambiguous" id="S3.SS2.p3.5.m5.2.2.2.2.2.1.1.1.1.cmml" xref="S3.SS2.p3.5.m5.2.2.2.2.2.1.1">subscript</csymbol><ci id="S3.SS2.p3.5.m5.2.2.2.2.2.1.1.1.2.cmml" xref="S3.SS2.p3.5.m5.2.2.2.2.2.1.1.1.2">𝐶</ci><ci id="S3.SS2.p3.5.m5.2.2.2.2.2.1.1.1.3.cmml" xref="S3.SS2.p3.5.m5.2.2.2.2.2.1.1.1.3">𝑘</ci></apply></apply></interval></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p3.5.m5.2c">sim(m_{j},m(C_{k}))</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p3.5.m5.2d">italic_s italic_i italic_m ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , italic_m ( italic_C start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) )</annotation></semantics></math>:</p> <table class="ltx_equation ltx_eqn_table" id="S3.E4"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="\begin{split}\begin{aligned} &amp;recall\_score(T|m_{j})=acc(m_{j})\cdot\frac{1}{|% C_{non}|}\cdot\\ &amp;\sum_{k=1}^{|C_{non}|}(sim(m_{j},m(C_{k}))\cdot proxy\_score(T|m(C_{k})))\end% {aligned}\end{split}" class="ltx_math_unparsed" display="block" id="S3.E4.m1.1"><semantics id="S3.E4.m1.1a"><mtable displaystyle="true" id="S3.E4.m1.1.1"><mtr id="S3.E4.m1.1.1a"><mtd class="ltx_align_right" columnalign="right" id="S3.E4.m1.1.1b"><mtable columnspacing="0pt" displaystyle="true" id="S3.E4.m1.1.1.1.1.1.1" rowspacing="0pt"><mtr id="S3.E4.m1.1.1.1.1.1.1a"><mtd id="S3.E4.m1.1.1.1.1.1.1b"></mtd><mtd class="ltx_align_left" columnalign="left" id="S3.E4.m1.1.1.1.1.1.1c"><mrow id="S3.E4.m1.1.1.1.1.1.1.1.1.1"><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.2">r</mi><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.3">e</mi><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.4">c</mi><mi 
id="S3.E4.m1.1.1.1.1.1.1.1.1.1.5">a</mi><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.6">l</mi><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.7">l</mi><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.8" mathvariant="normal">_</mi><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.9">s</mi><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.10">c</mi><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.11">o</mi><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.12">r</mi><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.13">e</mi><mrow id="S3.E4.m1.1.1.1.1.1.1.1.1.1.14"><mo id="S3.E4.m1.1.1.1.1.1.1.1.1.1.14.1" stretchy="false">(</mo><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.14.2">T</mi><mo fence="false" id="S3.E4.m1.1.1.1.1.1.1.1.1.1.14.3" rspace="0.167em" stretchy="false">|</mo><msub id="S3.E4.m1.1.1.1.1.1.1.1.1.1.14.4"><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.14.4.2">m</mi><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.14.4.3">j</mi></msub><mo id="S3.E4.m1.1.1.1.1.1.1.1.1.1.14.5" stretchy="false">)</mo></mrow><mo id="S3.E4.m1.1.1.1.1.1.1.1.1.1.15">=</mo><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.16">a</mi><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.17">c</mi><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.18">c</mi><mrow id="S3.E4.m1.1.1.1.1.1.1.1.1.1.19"><mo id="S3.E4.m1.1.1.1.1.1.1.1.1.1.19.1" stretchy="false">(</mo><msub id="S3.E4.m1.1.1.1.1.1.1.1.1.1.19.2"><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.19.2.2">m</mi><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.19.2.3">j</mi></msub><mo id="S3.E4.m1.1.1.1.1.1.1.1.1.1.19.3" rspace="0.055em" stretchy="false">)</mo></mrow><mo id="S3.E4.m1.1.1.1.1.1.1.1.1.1.20" rspace="0.222em">⋅</mo><mfrac id="S3.E4.m1.1.1.1.1.1.1.1.1.1.1"><mn id="S3.E4.m1.1.1.1.1.1.1.1.1.1.1.3">1</mn><mrow id="S3.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1"><mo id="S3.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.2" stretchy="false">|</mo><msub id="S3.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1"><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.2">C</mi><mrow id="S3.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.3"><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.2">n</mi><mo id="S3.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.1">⁢</mo><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.3">o</mi><mo 
id="S3.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.1a">⁢</mo><mi id="S3.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.4">n</mi></mrow></msub><mo id="S3.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.3" stretchy="false">|</mo></mrow></mfrac><mo id="S3.E4.m1.1.1.1.1.1.1.1.1.1.21" lspace="0.222em">⋅</mo></mrow></mtd></mtr><mtr id="S3.E4.m1.1.1.1.1.1.1d"><mtd id="S3.E4.m1.1.1.1.1.1.1e"></mtd><mtd class="ltx_align_left" columnalign="left" id="S3.E4.m1.1.1.1.1.1.1f"><mrow id="S3.E4.m1.1.1.1.1.1.1.3.2.2"><munderover id="S3.E4.m1.1.1.1.1.1.1.3.2.2.3"><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.3.2.2" movablelimits="false">∑</mo><mrow id="S3.E4.m1.1.1.1.1.1.1.3.2.2.3.2.3"><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.3.2.3.2">k</mi><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.3.2.3.1">=</mo><mn id="S3.E4.m1.1.1.1.1.1.1.3.2.2.3.2.3.3">1</mn></mrow><mrow id="S3.E4.m1.1.1.1.1.1.1.2.1.1.1.1.1"><mo id="S3.E4.m1.1.1.1.1.1.1.2.1.1.1.1.1.2" stretchy="false">|</mo><msub id="S3.E4.m1.1.1.1.1.1.1.2.1.1.1.1.1.1"><mi id="S3.E4.m1.1.1.1.1.1.1.2.1.1.1.1.1.1.2">C</mi><mrow id="S3.E4.m1.1.1.1.1.1.1.2.1.1.1.1.1.1.3"><mi id="S3.E4.m1.1.1.1.1.1.1.2.1.1.1.1.1.1.3.2">n</mi><mo id="S3.E4.m1.1.1.1.1.1.1.2.1.1.1.1.1.1.3.1">⁢</mo><mi id="S3.E4.m1.1.1.1.1.1.1.2.1.1.1.1.1.1.3.3">o</mi><mo id="S3.E4.m1.1.1.1.1.1.1.2.1.1.1.1.1.1.3.1a">⁢</mo><mi id="S3.E4.m1.1.1.1.1.1.1.2.1.1.1.1.1.1.3.4">n</mi></mrow></msub><mo id="S3.E4.m1.1.1.1.1.1.1.2.1.1.1.1.1.3" stretchy="false">|</mo></mrow></munderover><mrow id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1"><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.2" lspace="0em" stretchy="false">(</mo><mrow id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1"><mrow id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2"><mrow id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2"><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.4">s</mi><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.3">⁢</mo><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.5">i</mi><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.3a">⁢</mo><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.6">m</mi><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.3b">⁢</mo><mrow 
id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.2.2"><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.2.2.3" stretchy="false">(</mo><msub id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.1.1.1.1.1"><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.1.1.1.1.1.2">m</mi><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.1.1.1.1.1.3">j</mi></msub><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.2.2.4">,</mo><mrow id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.2.2.2"><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.2.2.2.3">m</mi><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.2.2.2.2">⁢</mo><mrow id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.2.2.2.1.1"><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.2.2.2.1.1.2" stretchy="false">(</mo><msub id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.2.2.2.1.1.1"><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.2.2.2.1.1.1.2">C</mi><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.2.2.2.1.1.1.3">k</mi></msub><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.2.2.2.1.1.3" stretchy="false">)</mo></mrow></mrow><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.2.2.2.5" rspace="0.055em" stretchy="false">)</mo></mrow></mrow><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.3" rspace="0.222em">⋅</mo><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.2.4">p</mi></mrow><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.4">⁢</mo><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.5">r</mi><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.4a">⁢</mo><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.6">o</mi><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.4b">⁢</mo><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.7">x</mi><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.4c">⁢</mo><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.8">y</mi><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.4d">⁢</mo><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.9" mathvariant="normal">_</mi><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.4e">⁢</mo><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.10">s</mi><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.4f">⁢</mo><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.11">c</mi><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.4g">⁢</mo><mi 
id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.12">o</mi><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.4h">⁢</mo><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.13">r</mi><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.4i">⁢</mo><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.14">e</mi><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.4j">⁢</mo><mrow id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.3.1"><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.3.1.2" stretchy="false">(</mo><mrow id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.3.1.1"><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.3.1.1.3">T</mi><mo fence="false" id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.3.1.1.2">|</mo><mrow id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.3.1.1.1"><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.3.1.1.1.3">m</mi><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.3.1.1.1.2">⁢</mo><mrow id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.3.1.1.1.1.1"><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.3.1.1.1.1.1.2" stretchy="false">(</mo><msub id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.3.1.1.1.1.1.1"><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.3.1.1.1.1.1.1.2">C</mi><mi id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.3.1.1.1.1.1.1.3">k</mi></msub><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.3.1.1.1.1.1.3" stretchy="false">)</mo></mrow></mrow></mrow><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.1.3.1.3" stretchy="false">)</mo></mrow></mrow><mo id="S3.E4.m1.1.1.1.1.1.1.3.2.2.2.1.3" stretchy="false">)</mo></mrow></mrow></mtd></mtr></mtable></mtd></mtr></mtable><annotation encoding="application/x-tex" id="S3.E4.m1.1b">\begin{split}\begin{aligned} &amp;recall\_score(T|m_{j})=acc(m_{j})\cdot\frac{1}{|% C_{non}|}\cdot\\ &amp;\sum_{k=1}^{|C_{non}|}(sim(m_{j},m(C_{k}))\cdot proxy\_score(T|m(C_{k})))\end% {aligned}\end{split}</annotation><annotation encoding="application/x-llamapun" id="S3.E4.m1.1c">start_ROW start_CELL start_ROW start_CELL end_CELL start_CELL italic_r italic_e italic_c italic_a italic_l italic_l _ italic_s italic_c italic_o italic_r italic_e ( italic_T | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) = italic_a italic_c 
italic_c ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) ⋅ divide start_ARG 1 end_ARG start_ARG | italic_C start_POSTSUBSCRIPT italic_n italic_o italic_n end_POSTSUBSCRIPT | end_ARG ⋅ end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT | italic_C start_POSTSUBSCRIPT italic_n italic_o italic_n end_POSTSUBSCRIPT | end_POSTSUPERSCRIPT ( italic_s italic_i italic_m ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , italic_m ( italic_C start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) ) ⋅ italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e ( italic_T | italic_m ( italic_C start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) ) ) end_CELL end_ROW end_CELL end_ROW</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(4)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S3.SS2.p3.8">Combining Eq. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S3.E3" title="In III-B Model Recall ‣ III Coarse Recall ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">3</span></a> and Eq. 
<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S3.E4" title="In III-B Model Recall ‣ III Coarse Recall ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">4</span></a>, we can compute the <math alttext="recall\_score" class="ltx_Math" display="inline" id="S3.SS2.p3.6.m1.1"><semantics id="S3.SS2.p3.6.m1.1a"><mrow id="S3.SS2.p3.6.m1.1.1" xref="S3.SS2.p3.6.m1.1.1.cmml"><mi id="S3.SS2.p3.6.m1.1.1.2" xref="S3.SS2.p3.6.m1.1.1.2.cmml">r</mi><mo id="S3.SS2.p3.6.m1.1.1.1" xref="S3.SS2.p3.6.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.6.m1.1.1.3" xref="S3.SS2.p3.6.m1.1.1.3.cmml">e</mi><mo id="S3.SS2.p3.6.m1.1.1.1a" xref="S3.SS2.p3.6.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.6.m1.1.1.4" xref="S3.SS2.p3.6.m1.1.1.4.cmml">c</mi><mo id="S3.SS2.p3.6.m1.1.1.1b" xref="S3.SS2.p3.6.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.6.m1.1.1.5" xref="S3.SS2.p3.6.m1.1.1.5.cmml">a</mi><mo id="S3.SS2.p3.6.m1.1.1.1c" xref="S3.SS2.p3.6.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.6.m1.1.1.6" xref="S3.SS2.p3.6.m1.1.1.6.cmml">l</mi><mo id="S3.SS2.p3.6.m1.1.1.1d" xref="S3.SS2.p3.6.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.6.m1.1.1.7" xref="S3.SS2.p3.6.m1.1.1.7.cmml">l</mi><mo id="S3.SS2.p3.6.m1.1.1.1e" xref="S3.SS2.p3.6.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.6.m1.1.1.8" mathvariant="normal" xref="S3.SS2.p3.6.m1.1.1.8.cmml">_</mi><mo id="S3.SS2.p3.6.m1.1.1.1f" xref="S3.SS2.p3.6.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.6.m1.1.1.9" xref="S3.SS2.p3.6.m1.1.1.9.cmml">s</mi><mo id="S3.SS2.p3.6.m1.1.1.1g" xref="S3.SS2.p3.6.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.6.m1.1.1.10" xref="S3.SS2.p3.6.m1.1.1.10.cmml">c</mi><mo id="S3.SS2.p3.6.m1.1.1.1h" xref="S3.SS2.p3.6.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.6.m1.1.1.11" xref="S3.SS2.p3.6.m1.1.1.11.cmml">o</mi><mo id="S3.SS2.p3.6.m1.1.1.1i" xref="S3.SS2.p3.6.m1.1.1.1.cmml">⁢</mo><mi id="S3.SS2.p3.6.m1.1.1.12" xref="S3.SS2.p3.6.m1.1.1.12.cmml">r</mi><mo id="S3.SS2.p3.6.m1.1.1.1j" xref="S3.SS2.p3.6.m1.1.1.1.cmml">⁢</mo><mi 
id="S3.SS2.p3.6.m1.1.1.13" xref="S3.SS2.p3.6.m1.1.1.13.cmml">e</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p3.6.m1.1b"><apply id="S3.SS2.p3.6.m1.1.1.cmml" xref="S3.SS2.p3.6.m1.1.1"><times id="S3.SS2.p3.6.m1.1.1.1.cmml" xref="S3.SS2.p3.6.m1.1.1.1"></times><ci id="S3.SS2.p3.6.m1.1.1.2.cmml" xref="S3.SS2.p3.6.m1.1.1.2">𝑟</ci><ci id="S3.SS2.p3.6.m1.1.1.3.cmml" xref="S3.SS2.p3.6.m1.1.1.3">𝑒</ci><ci id="S3.SS2.p3.6.m1.1.1.4.cmml" xref="S3.SS2.p3.6.m1.1.1.4">𝑐</ci><ci id="S3.SS2.p3.6.m1.1.1.5.cmml" xref="S3.SS2.p3.6.m1.1.1.5">𝑎</ci><ci id="S3.SS2.p3.6.m1.1.1.6.cmml" xref="S3.SS2.p3.6.m1.1.1.6">𝑙</ci><ci id="S3.SS2.p3.6.m1.1.1.7.cmml" xref="S3.SS2.p3.6.m1.1.1.7">𝑙</ci><ci id="S3.SS2.p3.6.m1.1.1.8.cmml" xref="S3.SS2.p3.6.m1.1.1.8">_</ci><ci id="S3.SS2.p3.6.m1.1.1.9.cmml" xref="S3.SS2.p3.6.m1.1.1.9">𝑠</ci><ci id="S3.SS2.p3.6.m1.1.1.10.cmml" xref="S3.SS2.p3.6.m1.1.1.10">𝑐</ci><ci id="S3.SS2.p3.6.m1.1.1.11.cmml" xref="S3.SS2.p3.6.m1.1.1.11">𝑜</ci><ci id="S3.SS2.p3.6.m1.1.1.12.cmml" xref="S3.SS2.p3.6.m1.1.1.12">𝑟</ci><ci id="S3.SS2.p3.6.m1.1.1.13.cmml" xref="S3.SS2.p3.6.m1.1.1.13">𝑒</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p3.6.m1.1c">recall\_score</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p3.6.m1.1d">italic_r italic_e italic_c italic_a italic_l italic_l _ italic_s italic_c italic_o italic_r italic_e</annotation></semantics></math> for all the models in <math alttext="M" class="ltx_Math" display="inline" id="S3.SS2.p3.7.m2.1"><semantics id="S3.SS2.p3.7.m2.1a"><mi id="S3.SS2.p3.7.m2.1.1" xref="S3.SS2.p3.7.m2.1.1.cmml">M</mi><annotation-xml encoding="MathML-Content" id="S3.SS2.p3.7.m2.1b"><ci id="S3.SS2.p3.7.m2.1.1.cmml" xref="S3.SS2.p3.7.m2.1.1">𝑀</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p3.7.m2.1c">M</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p3.7.m2.1d">italic_M</annotation></semantics></math> and return the top <math alttext="K" 
class="ltx_Math" display="inline" id="S3.SS2.p3.8.m3.1"><semantics id="S3.SS2.p3.8.m3.1a"><mi id="S3.SS2.p3.8.m3.1.1" xref="S3.SS2.p3.8.m3.1.1.cmml">K</mi><annotation-xml encoding="MathML-Content" id="S3.SS2.p3.8.m3.1b"><ci id="S3.SS2.p3.8.m3.1.1.cmml" xref="S3.SS2.p3.8.m3.1.1">𝐾</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p3.8.m3.1c">K</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p3.8.m3.1d">italic_K</annotation></semantics></math> models to the fine-selection phase.</p> </div> </section> </section> <section class="ltx_section" id="S4"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">IV </span><span class="ltx_text ltx_font_smallcaps" id="S4.1.1">Fine-Selection</span> </h2> <div class="ltx_para" id="S4.p1"> <p class="ltx_p" id="S4.p1.2">After the initial coarse recall phase, we reduced the number of candidate pre-trained models from <math alttext="O|M|" class="ltx_Math" display="inline" id="S4.p1.1.m1.1"><semantics id="S4.p1.1.m1.1a"><mrow id="S4.p1.1.m1.1.2" xref="S4.p1.1.m1.1.2.cmml"><mi id="S4.p1.1.m1.1.2.2" xref="S4.p1.1.m1.1.2.2.cmml">O</mi><mo id="S4.p1.1.m1.1.2.1" xref="S4.p1.1.m1.1.2.1.cmml">⁢</mo><mrow id="S4.p1.1.m1.1.2.3.2" xref="S4.p1.1.m1.1.2.3.1.cmml"><mo id="S4.p1.1.m1.1.2.3.2.1" stretchy="false" xref="S4.p1.1.m1.1.2.3.1.1.cmml">|</mo><mi id="S4.p1.1.m1.1.1" xref="S4.p1.1.m1.1.1.cmml">M</mi><mo id="S4.p1.1.m1.1.2.3.2.2" stretchy="false" xref="S4.p1.1.m1.1.2.3.1.1.cmml">|</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S4.p1.1.m1.1b"><apply id="S4.p1.1.m1.1.2.cmml" xref="S4.p1.1.m1.1.2"><times id="S4.p1.1.m1.1.2.1.cmml" xref="S4.p1.1.m1.1.2.1"></times><ci id="S4.p1.1.m1.1.2.2.cmml" xref="S4.p1.1.m1.1.2.2">𝑂</ci><apply id="S4.p1.1.m1.1.2.3.1.cmml" xref="S4.p1.1.m1.1.2.3.2"><abs id="S4.p1.1.m1.1.2.3.1.1.cmml" xref="S4.p1.1.m1.1.2.3.2.1"></abs><ci id="S4.p1.1.m1.1.1.cmml" xref="S4.p1.1.m1.1.1">𝑀</ci></apply></apply></annotation-xml><annotation 
encoding="application/x-tex" id="S4.p1.1.m1.1c">O|M|</annotation><annotation encoding="application/x-llamapun" id="S4.p1.1.m1.1d">italic_O | italic_M |</annotation></semantics></math> to <math alttext="O|MC|" class="ltx_Math" display="inline" id="S4.p1.2.m2.1"><semantics id="S4.p1.2.m2.1a"><mrow id="S4.p1.2.m2.1.1" xref="S4.p1.2.m2.1.1.cmml"><mi id="S4.p1.2.m2.1.1.3" xref="S4.p1.2.m2.1.1.3.cmml">O</mi><mo id="S4.p1.2.m2.1.1.2" xref="S4.p1.2.m2.1.1.2.cmml">⁢</mo><mrow id="S4.p1.2.m2.1.1.1.1" xref="S4.p1.2.m2.1.1.1.2.cmml"><mo id="S4.p1.2.m2.1.1.1.1.2" stretchy="false" xref="S4.p1.2.m2.1.1.1.2.1.cmml">|</mo><mrow id="S4.p1.2.m2.1.1.1.1.1" xref="S4.p1.2.m2.1.1.1.1.1.cmml"><mi id="S4.p1.2.m2.1.1.1.1.1.2" xref="S4.p1.2.m2.1.1.1.1.1.2.cmml">M</mi><mo id="S4.p1.2.m2.1.1.1.1.1.1" xref="S4.p1.2.m2.1.1.1.1.1.1.cmml">⁢</mo><mi id="S4.p1.2.m2.1.1.1.1.1.3" xref="S4.p1.2.m2.1.1.1.1.1.3.cmml">C</mi></mrow><mo id="S4.p1.2.m2.1.1.1.1.3" stretchy="false" xref="S4.p1.2.m2.1.1.1.2.1.cmml">|</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S4.p1.2.m2.1b"><apply id="S4.p1.2.m2.1.1.cmml" xref="S4.p1.2.m2.1.1"><times id="S4.p1.2.m2.1.1.2.cmml" xref="S4.p1.2.m2.1.1.2"></times><ci id="S4.p1.2.m2.1.1.3.cmml" xref="S4.p1.2.m2.1.1.3">𝑂</ci><apply id="S4.p1.2.m2.1.1.1.2.cmml" xref="S4.p1.2.m2.1.1.1.1"><abs id="S4.p1.2.m2.1.1.1.2.1.cmml" xref="S4.p1.2.m2.1.1.1.1.2"></abs><apply id="S4.p1.2.m2.1.1.1.1.1.cmml" xref="S4.p1.2.m2.1.1.1.1.1"><times id="S4.p1.2.m2.1.1.1.1.1.1.cmml" xref="S4.p1.2.m2.1.1.1.1.1.1"></times><ci id="S4.p1.2.m2.1.1.1.1.1.2.cmml" xref="S4.p1.2.m2.1.1.1.1.1.2">𝑀</ci><ci id="S4.p1.2.m2.1.1.1.1.1.3.cmml" xref="S4.p1.2.m2.1.1.1.1.1.3">𝐶</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.p1.2.m2.1c">O|MC|</annotation><annotation encoding="application/x-llamapun" id="S4.p1.2.m2.1d">italic_O | italic_M italic_C |</annotation></semantics></math>. 
Next, building on successive halving, we use the convergence information obtained from fine-tuning the models on benchmark datasets to select a good model more quickly and accurately.</p> </div> <section class="ltx_subsection" id="S4.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S4.SS1.4.1.1">IV-A</span> </span><span class="ltx_text ltx_font_italic" id="S4.SS1.5.2">Early Stopping</span> </h3> <div class="ltx_para" id="S4.SS1.p1"> <p class="ltx_p" id="S4.SS1.p1.1">Fine-tuning models is a time-consuming process. The key to improving efficiency is the ability to filter out poorly performing models at an early training step. Therefore, we are interested in whether there is a strong and prevalent correlation between the early validation performance and the final test performance when fine-tuning pre-trained models. Such a correlation would allow us to filter out more models at an earlier stage of training. For each target dataset, we plot how the validation performance changes during fine-tuning for the models that pass the initial screening. Fig. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S4.F3" title="Figure 3 ‣ IV-A Early Stopping ‣ IV Fine-Selection ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">3</span></a> illustrates the performance changes of the 10 models screened for the MNLI dataset. The models that perform well on the test set also exhibit better validation performance in the early stages of training, and only two of the ten models appear to achieve much higher performance at the first training epoch. Therefore, we do not need to fine-tune all models to convergence. Instead, we can filter out under-performing models at an earlier stage, leading to a more efficient model selection process. Fig. 
<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#A0.F8" title="Figure 8 ‣ -A Mnli Results ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">8</span></a> in Appendix A <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib15" title="">15</a>]</cite> shows the models’ performance under another set of hyperparameters, illustrating the sensitivity of the training process to hyperparameters and the robustness of our method.</p> </div> <figure class="ltx_figure" id="S4.F3"> <p class="ltx_p ltx_align_center" id="S4.F3.1"><span class="ltx_text" id="S4.F3.1.1"><img alt="Refer to caption" class="ltx_graphics ltx_img_landscape" height="936" id="S4.F3.1.1.g1" src="extracted/2404.00069v1/mnli.png" width="1404"/></span></p> <br class="ltx_break ltx_break"/> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_figure">Figure 3: </span>Validation and test results of the top-10 models on the MNLI dataset. Model names omit the repository they belong to; the full names can be found in Table <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#A0.T8" title="TABLE VIII ‣ -B Model Details ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">VIII</span></a> in Appendix 
B <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib15" title="">15</a>]</cite>.</figcaption> </figure> </section> <section class="ltx_subsection" id="S4.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S4.SS2.4.1.1">IV-B</span> </span><span class="ltx_text ltx_font_italic" id="S4.SS2.5.2">Successive Halving</span> </h3> <div class="ltx_para" id="S4.SS2.p1"> <p class="ltx_p" id="S4.SS2.p1.3">Based on the observation that models which perform well in early training iterations are likely to maintain superior performance when trained to full convergence, successive halving is the state-of-the-art method applied to speed up the model selection process <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib3" title="">3</a>]</cite><cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib4" title="">4</a>]</cite>. The successive halving algorithm operates iteratively in stages, and halves the pool of considered models at each stage. Specifically, each model is trained for a fixed number of iterations at each stage and then its performance is validated. Models with poor performance are discarded, while the better-performing ones proceed to the next round of more intensive training. This process continues until the top models are identified. 
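</p> <p class="ltx_p">The staged procedure described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation used in this paper: the function name <code>successive_halving</code>, the callback <code>train_and_validate</code>, and the parameter <code>steps_per_stage</code> are hypothetical, and the sketch assumes the same fixed number of training steps per model at every stage and that a higher validation score is better.</p>

```python
import math
from typing import Callable, Dict, List


def successive_halving(
    models: List[str],
    train_and_validate: Callable[[str, int], float],
    steps_per_stage: int,
) -> str:
    """Return the model that survives repeated halving of the candidate pool.

    `train_and_validate(model, steps)` is a hypothetical callback: it trains
    `model` for `steps` additional steps and returns its validation score.
    """
    if not models:
        raise ValueError("at least one candidate model is required")
    survivors = list(models)
    while len(survivors) > 1:
        # Train every surviving model for a fixed number of steps and score it.
        scores: Dict[str, float] = {
            m: train_and_validate(m, steps_per_stage) for m in survivors
        }
        # Keep the better-scoring half, rounding up so the pool never empties.
        keep = math.ceil(len(survivors) / 2)
        survivors = sorted(survivors, key=lambda m: scores[m], reverse=True)[:keep]
    return survivors[0]
```

<p class="ltx_p">In practice, the callback would wrap an actual fine-tuning loop that resumes from the model's latest checkpoint. Models discarded in an early stage receive no further training, which is where the savings come from.</p> <p class="ltx_p">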
Assuming that the initial model number is <math alttext="|M|" class="ltx_Math" display="inline" id="S4.SS2.p1.1.m1.1"><semantics id="S4.SS2.p1.1.m1.1a"><mrow id="S4.SS2.p1.1.m1.1.2.2" xref="S4.SS2.p1.1.m1.1.2.1.cmml"><mo id="S4.SS2.p1.1.m1.1.2.2.1" stretchy="false" xref="S4.SS2.p1.1.m1.1.2.1.1.cmml">|</mo><mi id="S4.SS2.p1.1.m1.1.1" xref="S4.SS2.p1.1.m1.1.1.cmml">M</mi><mo id="S4.SS2.p1.1.m1.1.2.2.2" stretchy="false" xref="S4.SS2.p1.1.m1.1.2.1.1.cmml">|</mo></mrow><annotation-xml encoding="MathML-Content" id="S4.SS2.p1.1.m1.1b"><apply id="S4.SS2.p1.1.m1.1.2.1.cmml" xref="S4.SS2.p1.1.m1.1.2.2"><abs id="S4.SS2.p1.1.m1.1.2.1.1.cmml" xref="S4.SS2.p1.1.m1.1.2.2.1"></abs><ci id="S4.SS2.p1.1.m1.1.1.cmml" xref="S4.SS2.p1.1.m1.1.1">𝑀</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.p1.1.m1.1c">|M|</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.p1.1.m1.1d">| italic_M |</annotation></semantics></math> and the training step during each halving is <math alttext="s" class="ltx_Math" display="inline" id="S4.SS2.p1.2.m2.1"><semantics id="S4.SS2.p1.2.m2.1a"><mi id="S4.SS2.p1.2.m2.1.1" xref="S4.SS2.p1.2.m2.1.1.cmml">s</mi><annotation-xml encoding="MathML-Content" id="S4.SS2.p1.2.m2.1b"><ci id="S4.SS2.p1.2.m2.1.1.cmml" xref="S4.SS2.p1.2.m2.1.1">𝑠</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.p1.2.m2.1c">s</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.p1.2.m2.1d">italic_s</annotation></semantics></math> batches, the total budget for identifying the highest performance model would typically be approximately <math alttext="|M|\cdot s\cdot log_{2}(|M|)" class="ltx_Math" display="inline" id="S4.SS2.p1.3.m3.3"><semantics id="S4.SS2.p1.3.m3.3a"><mrow id="S4.SS2.p1.3.m3.3.3" xref="S4.SS2.p1.3.m3.3.3.cmml"><mrow id="S4.SS2.p1.3.m3.3.3.3" xref="S4.SS2.p1.3.m3.3.3.3.cmml"><mrow id="S4.SS2.p1.3.m3.3.3.3.2.2" xref="S4.SS2.p1.3.m3.3.3.3.2.1.cmml"><mo id="S4.SS2.p1.3.m3.3.3.3.2.2.1" 
stretchy="false" xref="S4.SS2.p1.3.m3.3.3.3.2.1.1.cmml">|</mo><mi id="S4.SS2.p1.3.m3.1.1" xref="S4.SS2.p1.3.m3.1.1.cmml">M</mi><mo id="S4.SS2.p1.3.m3.3.3.3.2.2.2" rspace="0.055em" stretchy="false" xref="S4.SS2.p1.3.m3.3.3.3.2.1.1.cmml">|</mo></mrow><mo id="S4.SS2.p1.3.m3.3.3.3.1" rspace="0.222em" xref="S4.SS2.p1.3.m3.3.3.3.1.cmml">⋅</mo><mi id="S4.SS2.p1.3.m3.3.3.3.3" xref="S4.SS2.p1.3.m3.3.3.3.3.cmml">s</mi><mo id="S4.SS2.p1.3.m3.3.3.3.1a" lspace="0.222em" rspace="0.222em" xref="S4.SS2.p1.3.m3.3.3.3.1.cmml">⋅</mo><mi id="S4.SS2.p1.3.m3.3.3.3.4" xref="S4.SS2.p1.3.m3.3.3.3.4.cmml">l</mi></mrow><mo id="S4.SS2.p1.3.m3.3.3.2" xref="S4.SS2.p1.3.m3.3.3.2.cmml">⁢</mo><mi id="S4.SS2.p1.3.m3.3.3.4" xref="S4.SS2.p1.3.m3.3.3.4.cmml">o</mi><mo id="S4.SS2.p1.3.m3.3.3.2a" xref="S4.SS2.p1.3.m3.3.3.2.cmml">⁢</mo><msub id="S4.SS2.p1.3.m3.3.3.5" xref="S4.SS2.p1.3.m3.3.3.5.cmml"><mi id="S4.SS2.p1.3.m3.3.3.5.2" xref="S4.SS2.p1.3.m3.3.3.5.2.cmml">g</mi><mn id="S4.SS2.p1.3.m3.3.3.5.3" xref="S4.SS2.p1.3.m3.3.3.5.3.cmml">2</mn></msub><mo id="S4.SS2.p1.3.m3.3.3.2b" xref="S4.SS2.p1.3.m3.3.3.2.cmml">⁢</mo><mrow id="S4.SS2.p1.3.m3.3.3.1.1" xref="S4.SS2.p1.3.m3.3.3.cmml"><mo id="S4.SS2.p1.3.m3.3.3.1.1.2" stretchy="false" xref="S4.SS2.p1.3.m3.3.3.cmml">(</mo><mrow id="S4.SS2.p1.3.m3.3.3.1.1.1.2" xref="S4.SS2.p1.3.m3.3.3.1.1.1.1.cmml"><mo id="S4.SS2.p1.3.m3.3.3.1.1.1.2.1" stretchy="false" xref="S4.SS2.p1.3.m3.3.3.1.1.1.1.1.cmml">|</mo><mi id="S4.SS2.p1.3.m3.2.2" xref="S4.SS2.p1.3.m3.2.2.cmml">M</mi><mo id="S4.SS2.p1.3.m3.3.3.1.1.1.2.2" stretchy="false" xref="S4.SS2.p1.3.m3.3.3.1.1.1.1.1.cmml">|</mo></mrow><mo id="S4.SS2.p1.3.m3.3.3.1.1.3" stretchy="false" xref="S4.SS2.p1.3.m3.3.3.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S4.SS2.p1.3.m3.3b"><apply id="S4.SS2.p1.3.m3.3.3.cmml" xref="S4.SS2.p1.3.m3.3.3"><times id="S4.SS2.p1.3.m3.3.3.2.cmml" xref="S4.SS2.p1.3.m3.3.3.2"></times><apply id="S4.SS2.p1.3.m3.3.3.3.cmml" xref="S4.SS2.p1.3.m3.3.3.3"><ci 
id="S4.SS2.p1.3.m3.3.3.3.1.cmml" xref="S4.SS2.p1.3.m3.3.3.3.1">⋅</ci><apply id="S4.SS2.p1.3.m3.3.3.3.2.1.cmml" xref="S4.SS2.p1.3.m3.3.3.3.2.2"><abs id="S4.SS2.p1.3.m3.3.3.3.2.1.1.cmml" xref="S4.SS2.p1.3.m3.3.3.3.2.2.1"></abs><ci id="S4.SS2.p1.3.m3.1.1.cmml" xref="S4.SS2.p1.3.m3.1.1">𝑀</ci></apply><ci id="S4.SS2.p1.3.m3.3.3.3.3.cmml" xref="S4.SS2.p1.3.m3.3.3.3.3">𝑠</ci><ci id="S4.SS2.p1.3.m3.3.3.3.4.cmml" xref="S4.SS2.p1.3.m3.3.3.3.4">𝑙</ci></apply><ci id="S4.SS2.p1.3.m3.3.3.4.cmml" xref="S4.SS2.p1.3.m3.3.3.4">𝑜</ci><apply id="S4.SS2.p1.3.m3.3.3.5.cmml" xref="S4.SS2.p1.3.m3.3.3.5"><csymbol cd="ambiguous" id="S4.SS2.p1.3.m3.3.3.5.1.cmml" xref="S4.SS2.p1.3.m3.3.3.5">subscript</csymbol><ci id="S4.SS2.p1.3.m3.3.3.5.2.cmml" xref="S4.SS2.p1.3.m3.3.3.5.2">𝑔</ci><cn id="S4.SS2.p1.3.m3.3.3.5.3.cmml" type="integer" xref="S4.SS2.p1.3.m3.3.3.5.3">2</cn></apply><apply id="S4.SS2.p1.3.m3.3.3.1.1.1.1.cmml" xref="S4.SS2.p1.3.m3.3.3.1.1.1.2"><abs id="S4.SS2.p1.3.m3.3.3.1.1.1.1.1.cmml" xref="S4.SS2.p1.3.m3.3.3.1.1.1.2.1"></abs><ci id="S4.SS2.p1.3.m3.2.2.cmml" xref="S4.SS2.p1.3.m3.2.2">𝑀</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.p1.3.m3.3c">|M|\cdot s\cdot log_{2}(|M|)</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.p1.3.m3.3d">| italic_M | ⋅ italic_s ⋅ italic_l italic_o italic_g start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( | italic_M | )</annotation></semantics></math> steps.</p> </div> </section> <section class="ltx_subsection" id="S4.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S4.SS3.4.1.1">IV-C</span> </span><span class="ltx_text ltx_font_italic" id="S4.SS3.5.2">Fine-Selection Algorithm</span> </h3> <div class="ltx_para" id="S4.SS3.p1"> <p class="ltx_p" id="S4.SS3.p1.1">While successive halving ensures that computational resources are largely devoted to the most promising models, the practice of only filtering out half of the models in each 
round limits the further improvement of the selection efficiency. Therefore, we propose a refinement method called ‘fine-selection’, which builds on successive halving and further leverages the fine-tuning information of each model on benchmark datasets. After observing the fine-tuning performance of models on various datasets, we found that the performance changes of the same model across different datasets can be categorized into distinct clusters. As illustrated in Fig. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S4.F4" title="Figure 4 ‣ IV-C Fine-Selection Algorithm ‣ IV Fine-Selection ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">4</span></a>, the fine-tuning performance of the BERT_base model on some benchmark datasets can be divided into four groups. Similar phenomena were observed across different models. Therefore, we propose mining the convergence trends of models from their fine-tuning performance on benchmark datasets to predict each model’s final performance on the target dataset.</p> </div> <div class="ltx_para" id="S4.SS3.p2"> <p class="ltx_p" id="S4.SS3.p2.20">For a target dataset <math alttext="d(T)" class="ltx_Math" display="inline" id="S4.SS3.p2.1.m1.1"><semantics id="S4.SS3.p2.1.m1.1a"><mrow id="S4.SS3.p2.1.m1.1.2" xref="S4.SS3.p2.1.m1.1.2.cmml"><mi id="S4.SS3.p2.1.m1.1.2.2" xref="S4.SS3.p2.1.m1.1.2.2.cmml">d</mi><mo id="S4.SS3.p2.1.m1.1.2.1" xref="S4.SS3.p2.1.m1.1.2.1.cmml">⁢</mo><mrow id="S4.SS3.p2.1.m1.1.2.3.2" xref="S4.SS3.p2.1.m1.1.2.cmml"><mo id="S4.SS3.p2.1.m1.1.2.3.2.1" stretchy="false" xref="S4.SS3.p2.1.m1.1.2.cmml">(</mo><mi id="S4.SS3.p2.1.m1.1.1" xref="S4.SS3.p2.1.m1.1.1.cmml">T</mi><mo id="S4.SS3.p2.1.m1.1.2.3.2.2" stretchy="false" xref="S4.SS3.p2.1.m1.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.1.m1.1b"><apply id="S4.SS3.p2.1.m1.1.2.cmml" xref="S4.SS3.p2.1.m1.1.2"><times id="S4.SS3.p2.1.m1.1.2.1.cmml" 
xref="S4.SS3.p2.1.m1.1.2.1"></times><ci id="S4.SS3.p2.1.m1.1.2.2.cmml" xref="S4.SS3.p2.1.m1.1.2.2">𝑑</ci><ci id="S4.SS3.p2.1.m1.1.1.cmml" xref="S4.SS3.p2.1.m1.1.1">𝑇</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.1.m1.1c">d(T)</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.1.m1.1d">italic_d ( italic_T )</annotation></semantics></math> and a given pre-trained model <math alttext="m_{j}" class="ltx_Math" display="inline" id="S4.SS3.p2.2.m2.1"><semantics id="S4.SS3.p2.2.m2.1a"><msub id="S4.SS3.p2.2.m2.1.1" xref="S4.SS3.p2.2.m2.1.1.cmml"><mi id="S4.SS3.p2.2.m2.1.1.2" xref="S4.SS3.p2.2.m2.1.1.2.cmml">m</mi><mi id="S4.SS3.p2.2.m2.1.1.3" xref="S4.SS3.p2.2.m2.1.1.3.cmml">j</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.2.m2.1b"><apply id="S4.SS3.p2.2.m2.1.1.cmml" xref="S4.SS3.p2.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS3.p2.2.m2.1.1.1.cmml" xref="S4.SS3.p2.2.m2.1.1">subscript</csymbol><ci id="S4.SS3.p2.2.m2.1.1.2.cmml" xref="S4.SS3.p2.2.m2.1.1.2">𝑚</ci><ci id="S4.SS3.p2.2.m2.1.1.3.cmml" xref="S4.SS3.p2.2.m2.1.1.3">𝑗</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.2.m2.1c">m_{j}</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.2.m2.1d">italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT</annotation></semantics></math>, after training <math alttext="m_{j}" class="ltx_Math" display="inline" id="S4.SS3.p2.3.m3.1"><semantics id="S4.SS3.p2.3.m3.1a"><msub id="S4.SS3.p2.3.m3.1.1" xref="S4.SS3.p2.3.m3.1.1.cmml"><mi id="S4.SS3.p2.3.m3.1.1.2" xref="S4.SS3.p2.3.m3.1.1.2.cmml">m</mi><mi id="S4.SS3.p2.3.m3.1.1.3" xref="S4.SS3.p2.3.m3.1.1.3.cmml">j</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.3.m3.1b"><apply id="S4.SS3.p2.3.m3.1.1.cmml" xref="S4.SS3.p2.3.m3.1.1"><csymbol cd="ambiguous" id="S4.SS3.p2.3.m3.1.1.1.cmml" xref="S4.SS3.p2.3.m3.1.1">subscript</csymbol><ci id="S4.SS3.p2.3.m3.1.1.2.cmml" 
xref="S4.SS3.p2.3.m3.1.1.2">𝑚</ci><ci id="S4.SS3.p2.3.m3.1.1.3.cmml" xref="S4.SS3.p2.3.m3.1.1.3">𝑗</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.3.m3.1c">m_{j}</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.3.m3.1d">italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT</annotation></semantics></math> on <math alttext="d(T)" class="ltx_Math" display="inline" id="S4.SS3.p2.4.m4.1"><semantics id="S4.SS3.p2.4.m4.1a"><mrow id="S4.SS3.p2.4.m4.1.2" xref="S4.SS3.p2.4.m4.1.2.cmml"><mi id="S4.SS3.p2.4.m4.1.2.2" xref="S4.SS3.p2.4.m4.1.2.2.cmml">d</mi><mo id="S4.SS3.p2.4.m4.1.2.1" xref="S4.SS3.p2.4.m4.1.2.1.cmml">⁢</mo><mrow id="S4.SS3.p2.4.m4.1.2.3.2" xref="S4.SS3.p2.4.m4.1.2.cmml"><mo id="S4.SS3.p2.4.m4.1.2.3.2.1" stretchy="false" xref="S4.SS3.p2.4.m4.1.2.cmml">(</mo><mi id="S4.SS3.p2.4.m4.1.1" xref="S4.SS3.p2.4.m4.1.1.cmml">T</mi><mo id="S4.SS3.p2.4.m4.1.2.3.2.2" stretchy="false" xref="S4.SS3.p2.4.m4.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.4.m4.1b"><apply id="S4.SS3.p2.4.m4.1.2.cmml" xref="S4.SS3.p2.4.m4.1.2"><times id="S4.SS3.p2.4.m4.1.2.1.cmml" xref="S4.SS3.p2.4.m4.1.2.1"></times><ci id="S4.SS3.p2.4.m4.1.2.2.cmml" xref="S4.SS3.p2.4.m4.1.2.2">𝑑</ci><ci id="S4.SS3.p2.4.m4.1.1.cmml" xref="S4.SS3.p2.4.m4.1.1">𝑇</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.4.m4.1c">d(T)</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.4.m4.1d">italic_d ( italic_T )</annotation></semantics></math> for every <math alttext="s" class="ltx_Math" display="inline" id="S4.SS3.p2.5.m5.1"><semantics id="S4.SS3.p2.5.m5.1a"><mi id="S4.SS3.p2.5.m5.1.1" xref="S4.SS3.p2.5.m5.1.1.cmml">s</mi><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.5.m5.1b"><ci id="S4.SS3.p2.5.m5.1.1.cmml" xref="S4.SS3.p2.5.m5.1.1">𝑠</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.5.m5.1c">s</annotation><annotation 
encoding="application/x-llamapun" id="S4.SS3.p2.5.m5.1d">italic_s</annotation></semantics></math> steps, we can compute the validation accuracy <math alttext="val(T|m_{j})_{t}" class="ltx_Math" display="inline" id="S4.SS3.p2.6.m6.1"><semantics id="S4.SS3.p2.6.m6.1a"><mrow id="S4.SS3.p2.6.m6.1.1" xref="S4.SS3.p2.6.m6.1.1.cmml"><mi id="S4.SS3.p2.6.m6.1.1.3" xref="S4.SS3.p2.6.m6.1.1.3.cmml">v</mi><mo id="S4.SS3.p2.6.m6.1.1.2" xref="S4.SS3.p2.6.m6.1.1.2.cmml">⁢</mo><mi id="S4.SS3.p2.6.m6.1.1.4" xref="S4.SS3.p2.6.m6.1.1.4.cmml">a</mi><mo id="S4.SS3.p2.6.m6.1.1.2a" xref="S4.SS3.p2.6.m6.1.1.2.cmml">⁢</mo><mi id="S4.SS3.p2.6.m6.1.1.5" xref="S4.SS3.p2.6.m6.1.1.5.cmml">l</mi><mo id="S4.SS3.p2.6.m6.1.1.2b" xref="S4.SS3.p2.6.m6.1.1.2.cmml">⁢</mo><msub id="S4.SS3.p2.6.m6.1.1.1" xref="S4.SS3.p2.6.m6.1.1.1.cmml"><mrow id="S4.SS3.p2.6.m6.1.1.1.1.1" xref="S4.SS3.p2.6.m6.1.1.1.1.1.1.cmml"><mo id="S4.SS3.p2.6.m6.1.1.1.1.1.2" stretchy="false" xref="S4.SS3.p2.6.m6.1.1.1.1.1.1.cmml">(</mo><mrow id="S4.SS3.p2.6.m6.1.1.1.1.1.1" xref="S4.SS3.p2.6.m6.1.1.1.1.1.1.cmml"><mi id="S4.SS3.p2.6.m6.1.1.1.1.1.1.2" xref="S4.SS3.p2.6.m6.1.1.1.1.1.1.2.cmml">T</mi><mo fence="false" id="S4.SS3.p2.6.m6.1.1.1.1.1.1.1" xref="S4.SS3.p2.6.m6.1.1.1.1.1.1.1.cmml">|</mo><msub id="S4.SS3.p2.6.m6.1.1.1.1.1.1.3" xref="S4.SS3.p2.6.m6.1.1.1.1.1.1.3.cmml"><mi id="S4.SS3.p2.6.m6.1.1.1.1.1.1.3.2" xref="S4.SS3.p2.6.m6.1.1.1.1.1.1.3.2.cmml">m</mi><mi id="S4.SS3.p2.6.m6.1.1.1.1.1.1.3.3" xref="S4.SS3.p2.6.m6.1.1.1.1.1.1.3.3.cmml">j</mi></msub></mrow><mo id="S4.SS3.p2.6.m6.1.1.1.1.1.3" stretchy="false" xref="S4.SS3.p2.6.m6.1.1.1.1.1.1.cmml">)</mo></mrow><mi id="S4.SS3.p2.6.m6.1.1.1.3" xref="S4.SS3.p2.6.m6.1.1.1.3.cmml">t</mi></msub></mrow><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.6.m6.1b"><apply id="S4.SS3.p2.6.m6.1.1.cmml" xref="S4.SS3.p2.6.m6.1.1"><times id="S4.SS3.p2.6.m6.1.1.2.cmml" xref="S4.SS3.p2.6.m6.1.1.2"></times><ci id="S4.SS3.p2.6.m6.1.1.3.cmml" xref="S4.SS3.p2.6.m6.1.1.3">𝑣</ci><ci 
id="S4.SS3.p2.6.m6.1.1.4.cmml" xref="S4.SS3.p2.6.m6.1.1.4">𝑎</ci><ci id="S4.SS3.p2.6.m6.1.1.5.cmml" xref="S4.SS3.p2.6.m6.1.1.5">𝑙</ci><apply id="S4.SS3.p2.6.m6.1.1.1.cmml" xref="S4.SS3.p2.6.m6.1.1.1"><csymbol cd="ambiguous" id="S4.SS3.p2.6.m6.1.1.1.2.cmml" xref="S4.SS3.p2.6.m6.1.1.1">subscript</csymbol><apply id="S4.SS3.p2.6.m6.1.1.1.1.1.1.cmml" xref="S4.SS3.p2.6.m6.1.1.1.1.1"><csymbol cd="latexml" id="S4.SS3.p2.6.m6.1.1.1.1.1.1.1.cmml" xref="S4.SS3.p2.6.m6.1.1.1.1.1.1.1">conditional</csymbol><ci id="S4.SS3.p2.6.m6.1.1.1.1.1.1.2.cmml" xref="S4.SS3.p2.6.m6.1.1.1.1.1.1.2">𝑇</ci><apply id="S4.SS3.p2.6.m6.1.1.1.1.1.1.3.cmml" xref="S4.SS3.p2.6.m6.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S4.SS3.p2.6.m6.1.1.1.1.1.1.3.1.cmml" xref="S4.SS3.p2.6.m6.1.1.1.1.1.1.3">subscript</csymbol><ci id="S4.SS3.p2.6.m6.1.1.1.1.1.1.3.2.cmml" xref="S4.SS3.p2.6.m6.1.1.1.1.1.1.3.2">𝑚</ci><ci id="S4.SS3.p2.6.m6.1.1.1.1.1.1.3.3.cmml" xref="S4.SS3.p2.6.m6.1.1.1.1.1.1.3.3">𝑗</ci></apply></apply><ci id="S4.SS3.p2.6.m6.1.1.1.3.cmml" xref="S4.SS3.p2.6.m6.1.1.1.3">𝑡</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.6.m6.1c">val(T|m_{j})_{t}</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.6.m6.1d">italic_v italic_a italic_l ( italic_T | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math> at stage <math alttext="t" class="ltx_Math" display="inline" id="S4.SS3.p2.7.m7.1"><semantics id="S4.SS3.p2.7.m7.1a"><mi id="S4.SS3.p2.7.m7.1.1" xref="S4.SS3.p2.7.m7.1.1.cmml">t</mi><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.7.m7.1b"><ci id="S4.SS3.p2.7.m7.1.1.cmml" xref="S4.SS3.p2.7.m7.1.1">𝑡</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.7.m7.1c">t</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.7.m7.1d">italic_t</annotation></semantics></math> and predict the final training performance as follows. 
Firstly, we generate a group of convergence trends for <math alttext="m_{j}" class="ltx_Math" display="inline" id="S4.SS3.p2.8.m8.1"><semantics id="S4.SS3.p2.8.m8.1a"><msub id="S4.SS3.p2.8.m8.1.1" xref="S4.SS3.p2.8.m8.1.1.cmml"><mi id="S4.SS3.p2.8.m8.1.1.2" xref="S4.SS3.p2.8.m8.1.1.2.cmml">m</mi><mi id="S4.SS3.p2.8.m8.1.1.3" xref="S4.SS3.p2.8.m8.1.1.3.cmml">j</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.8.m8.1b"><apply id="S4.SS3.p2.8.m8.1.1.cmml" xref="S4.SS3.p2.8.m8.1.1"><csymbol cd="ambiguous" id="S4.SS3.p2.8.m8.1.1.1.cmml" xref="S4.SS3.p2.8.m8.1.1">subscript</csymbol><ci id="S4.SS3.p2.8.m8.1.1.2.cmml" xref="S4.SS3.p2.8.m8.1.1.2">𝑚</ci><ci id="S4.SS3.p2.8.m8.1.1.3.cmml" xref="S4.SS3.p2.8.m8.1.1.3">𝑗</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.8.m8.1c">m_{j}</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.8.m8.1d">italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT</annotation></semantics></math>, denoted as <math alttext="\{CT(m_{j})_{t}\}" class="ltx_Math" display="inline" id="S4.SS3.p2.9.m9.1"><semantics id="S4.SS3.p2.9.m9.1a"><mrow id="S4.SS3.p2.9.m9.1.1.1" xref="S4.SS3.p2.9.m9.1.1.2.cmml"><mo id="S4.SS3.p2.9.m9.1.1.1.2" stretchy="false" xref="S4.SS3.p2.9.m9.1.1.2.cmml">{</mo><mrow id="S4.SS3.p2.9.m9.1.1.1.1" xref="S4.SS3.p2.9.m9.1.1.1.1.cmml"><mi id="S4.SS3.p2.9.m9.1.1.1.1.3" xref="S4.SS3.p2.9.m9.1.1.1.1.3.cmml">C</mi><mo id="S4.SS3.p2.9.m9.1.1.1.1.2" xref="S4.SS3.p2.9.m9.1.1.1.1.2.cmml">⁢</mo><mi id="S4.SS3.p2.9.m9.1.1.1.1.4" xref="S4.SS3.p2.9.m9.1.1.1.1.4.cmml">T</mi><mo id="S4.SS3.p2.9.m9.1.1.1.1.2a" xref="S4.SS3.p2.9.m9.1.1.1.1.2.cmml">⁢</mo><msub id="S4.SS3.p2.9.m9.1.1.1.1.1" xref="S4.SS3.p2.9.m9.1.1.1.1.1.cmml"><mrow id="S4.SS3.p2.9.m9.1.1.1.1.1.1.1" xref="S4.SS3.p2.9.m9.1.1.1.1.1.1.1.1.cmml"><mo id="S4.SS3.p2.9.m9.1.1.1.1.1.1.1.2" stretchy="false" xref="S4.SS3.p2.9.m9.1.1.1.1.1.1.1.1.cmml">(</mo><msub id="S4.SS3.p2.9.m9.1.1.1.1.1.1.1.1" 
xref="S4.SS3.p2.9.m9.1.1.1.1.1.1.1.1.cmml"><mi id="S4.SS3.p2.9.m9.1.1.1.1.1.1.1.1.2" xref="S4.SS3.p2.9.m9.1.1.1.1.1.1.1.1.2.cmml">m</mi><mi id="S4.SS3.p2.9.m9.1.1.1.1.1.1.1.1.3" xref="S4.SS3.p2.9.m9.1.1.1.1.1.1.1.1.3.cmml">j</mi></msub><mo id="S4.SS3.p2.9.m9.1.1.1.1.1.1.1.3" stretchy="false" xref="S4.SS3.p2.9.m9.1.1.1.1.1.1.1.1.cmml">)</mo></mrow><mi id="S4.SS3.p2.9.m9.1.1.1.1.1.3" xref="S4.SS3.p2.9.m9.1.1.1.1.1.3.cmml">t</mi></msub></mrow><mo id="S4.SS3.p2.9.m9.1.1.1.3" stretchy="false" xref="S4.SS3.p2.9.m9.1.1.2.cmml">}</mo></mrow><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.9.m9.1b"><set id="S4.SS3.p2.9.m9.1.1.2.cmml" xref="S4.SS3.p2.9.m9.1.1.1"><apply id="S4.SS3.p2.9.m9.1.1.1.1.cmml" xref="S4.SS3.p2.9.m9.1.1.1.1"><times id="S4.SS3.p2.9.m9.1.1.1.1.2.cmml" xref="S4.SS3.p2.9.m9.1.1.1.1.2"></times><ci id="S4.SS3.p2.9.m9.1.1.1.1.3.cmml" xref="S4.SS3.p2.9.m9.1.1.1.1.3">𝐶</ci><ci id="S4.SS3.p2.9.m9.1.1.1.1.4.cmml" xref="S4.SS3.p2.9.m9.1.1.1.1.4">𝑇</ci><apply id="S4.SS3.p2.9.m9.1.1.1.1.1.cmml" xref="S4.SS3.p2.9.m9.1.1.1.1.1"><csymbol cd="ambiguous" id="S4.SS3.p2.9.m9.1.1.1.1.1.2.cmml" xref="S4.SS3.p2.9.m9.1.1.1.1.1">subscript</csymbol><apply id="S4.SS3.p2.9.m9.1.1.1.1.1.1.1.1.cmml" xref="S4.SS3.p2.9.m9.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S4.SS3.p2.9.m9.1.1.1.1.1.1.1.1.1.cmml" xref="S4.SS3.p2.9.m9.1.1.1.1.1.1.1">subscript</csymbol><ci id="S4.SS3.p2.9.m9.1.1.1.1.1.1.1.1.2.cmml" xref="S4.SS3.p2.9.m9.1.1.1.1.1.1.1.1.2">𝑚</ci><ci id="S4.SS3.p2.9.m9.1.1.1.1.1.1.1.1.3.cmml" xref="S4.SS3.p2.9.m9.1.1.1.1.1.1.1.1.3">𝑗</ci></apply><ci id="S4.SS3.p2.9.m9.1.1.1.1.1.3.cmml" xref="S4.SS3.p2.9.m9.1.1.1.1.1.3">𝑡</ci></apply></apply></set></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.9.m9.1c">\{CT(m_{j})_{t}\}</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.9.m9.1d">{ italic_C italic_T ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT 
}</annotation></semantics></math>. Specifically, we cluster the benchmark datasets into <math alttext="c" class="ltx_Math" display="inline" id="S4.SS3.p2.10.m10.1"><semantics id="S4.SS3.p2.10.m10.1a"><mi id="S4.SS3.p2.10.m10.1.1" xref="S4.SS3.p2.10.m10.1.1.cmml">c</mi><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.10.m10.1b"><ci id="S4.SS3.p2.10.m10.1.1.cmml" xref="S4.SS3.p2.10.m10.1.1">𝑐</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.10.m10.1c">c</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.10.m10.1d">italic_c</annotation></semantics></math> clusters based on the validation accuracy of <math alttext="m" class="ltx_Math" display="inline" id="S4.SS3.p2.11.m11.1"><semantics id="S4.SS3.p2.11.m11.1a"><mi id="S4.SS3.p2.11.m11.1.1" xref="S4.SS3.p2.11.m11.1.1.cmml">m</mi><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.11.m11.1b"><ci id="S4.SS3.p2.11.m11.1.1.cmml" xref="S4.SS3.p2.11.m11.1.1">𝑚</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.11.m11.1c">m</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.11.m11.1d">italic_m</annotation></semantics></math> on these datasets. 
Then, a convergence trend <math alttext="CT(m_{j})_{t}[x]=(\overline{val_{x}},\overline{test_{x}})" class="ltx_Math" display="inline" id="S4.SS3.p2.12.m12.4"><semantics id="S4.SS3.p2.12.m12.4a"><mrow id="S4.SS3.p2.12.m12.4.4" xref="S4.SS3.p2.12.m12.4.4.cmml"><mrow id="S4.SS3.p2.12.m12.4.4.1" xref="S4.SS3.p2.12.m12.4.4.1.cmml"><mi id="S4.SS3.p2.12.m12.4.4.1.3" xref="S4.SS3.p2.12.m12.4.4.1.3.cmml">C</mi><mo id="S4.SS3.p2.12.m12.4.4.1.2" xref="S4.SS3.p2.12.m12.4.4.1.2.cmml">⁢</mo><mi id="S4.SS3.p2.12.m12.4.4.1.4" xref="S4.SS3.p2.12.m12.4.4.1.4.cmml">T</mi><mo id="S4.SS3.p2.12.m12.4.4.1.2a" xref="S4.SS3.p2.12.m12.4.4.1.2.cmml">⁢</mo><msub id="S4.SS3.p2.12.m12.4.4.1.1" xref="S4.SS3.p2.12.m12.4.4.1.1.cmml"><mrow id="S4.SS3.p2.12.m12.4.4.1.1.1.1" xref="S4.SS3.p2.12.m12.4.4.1.1.1.1.1.cmml"><mo id="S4.SS3.p2.12.m12.4.4.1.1.1.1.2" stretchy="false" xref="S4.SS3.p2.12.m12.4.4.1.1.1.1.1.cmml">(</mo><msub id="S4.SS3.p2.12.m12.4.4.1.1.1.1.1" xref="S4.SS3.p2.12.m12.4.4.1.1.1.1.1.cmml"><mi id="S4.SS3.p2.12.m12.4.4.1.1.1.1.1.2" xref="S4.SS3.p2.12.m12.4.4.1.1.1.1.1.2.cmml">m</mi><mi id="S4.SS3.p2.12.m12.4.4.1.1.1.1.1.3" xref="S4.SS3.p2.12.m12.4.4.1.1.1.1.1.3.cmml">j</mi></msub><mo id="S4.SS3.p2.12.m12.4.4.1.1.1.1.3" stretchy="false" xref="S4.SS3.p2.12.m12.4.4.1.1.1.1.1.cmml">)</mo></mrow><mi id="S4.SS3.p2.12.m12.4.4.1.1.3" xref="S4.SS3.p2.12.m12.4.4.1.1.3.cmml">t</mi></msub><mo id="S4.SS3.p2.12.m12.4.4.1.2b" xref="S4.SS3.p2.12.m12.4.4.1.2.cmml">⁢</mo><mrow id="S4.SS3.p2.12.m12.4.4.1.5.2" xref="S4.SS3.p2.12.m12.4.4.1.5.1.cmml"><mo id="S4.SS3.p2.12.m12.4.4.1.5.2.1" stretchy="false" xref="S4.SS3.p2.12.m12.4.4.1.5.1.1.cmml">[</mo><mi id="S4.SS3.p2.12.m12.1.1" xref="S4.SS3.p2.12.m12.1.1.cmml">x</mi><mo id="S4.SS3.p2.12.m12.4.4.1.5.2.2" stretchy="false" xref="S4.SS3.p2.12.m12.4.4.1.5.1.1.cmml">]</mo></mrow></mrow><mo id="S4.SS3.p2.12.m12.4.4.2" xref="S4.SS3.p2.12.m12.4.4.2.cmml">=</mo><mrow id="S4.SS3.p2.12.m12.4.4.3.2" xref="S4.SS3.p2.12.m12.4.4.3.1.cmml"><mo 
id="S4.SS3.p2.12.m12.4.4.3.2.1" stretchy="false" xref="S4.SS3.p2.12.m12.4.4.3.1.cmml">(</mo><mover accent="true" id="S4.SS3.p2.12.m12.2.2" xref="S4.SS3.p2.12.m12.2.2.cmml"><mrow id="S4.SS3.p2.12.m12.2.2.2" xref="S4.SS3.p2.12.m12.2.2.2.cmml"><mi id="S4.SS3.p2.12.m12.2.2.2.2" xref="S4.SS3.p2.12.m12.2.2.2.2.cmml">v</mi><mo id="S4.SS3.p2.12.m12.2.2.2.1" xref="S4.SS3.p2.12.m12.2.2.2.1.cmml">⁢</mo><mi id="S4.SS3.p2.12.m12.2.2.2.3" xref="S4.SS3.p2.12.m12.2.2.2.3.cmml">a</mi><mo id="S4.SS3.p2.12.m12.2.2.2.1a" xref="S4.SS3.p2.12.m12.2.2.2.1.cmml">⁢</mo><msub id="S4.SS3.p2.12.m12.2.2.2.4" xref="S4.SS3.p2.12.m12.2.2.2.4.cmml"><mi id="S4.SS3.p2.12.m12.2.2.2.4.2" xref="S4.SS3.p2.12.m12.2.2.2.4.2.cmml">l</mi><mi id="S4.SS3.p2.12.m12.2.2.2.4.3" xref="S4.SS3.p2.12.m12.2.2.2.4.3.cmml">x</mi></msub></mrow><mo id="S4.SS3.p2.12.m12.2.2.1" xref="S4.SS3.p2.12.m12.2.2.1.cmml">¯</mo></mover><mo id="S4.SS3.p2.12.m12.4.4.3.2.2" xref="S4.SS3.p2.12.m12.4.4.3.1.cmml">,</mo><mover accent="true" id="S4.SS3.p2.12.m12.3.3" xref="S4.SS3.p2.12.m12.3.3.cmml"><mrow id="S4.SS3.p2.12.m12.3.3.2" xref="S4.SS3.p2.12.m12.3.3.2.cmml"><mi id="S4.SS3.p2.12.m12.3.3.2.2" xref="S4.SS3.p2.12.m12.3.3.2.2.cmml">t</mi><mo id="S4.SS3.p2.12.m12.3.3.2.1" xref="S4.SS3.p2.12.m12.3.3.2.1.cmml">⁢</mo><mi id="S4.SS3.p2.12.m12.3.3.2.3" xref="S4.SS3.p2.12.m12.3.3.2.3.cmml">e</mi><mo id="S4.SS3.p2.12.m12.3.3.2.1a" xref="S4.SS3.p2.12.m12.3.3.2.1.cmml">⁢</mo><mi id="S4.SS3.p2.12.m12.3.3.2.4" xref="S4.SS3.p2.12.m12.3.3.2.4.cmml">s</mi><mo id="S4.SS3.p2.12.m12.3.3.2.1b" xref="S4.SS3.p2.12.m12.3.3.2.1.cmml">⁢</mo><msub id="S4.SS3.p2.12.m12.3.3.2.5" xref="S4.SS3.p2.12.m12.3.3.2.5.cmml"><mi id="S4.SS3.p2.12.m12.3.3.2.5.2" xref="S4.SS3.p2.12.m12.3.3.2.5.2.cmml">t</mi><mi id="S4.SS3.p2.12.m12.3.3.2.5.3" xref="S4.SS3.p2.12.m12.3.3.2.5.3.cmml">x</mi></msub></mrow><mo id="S4.SS3.p2.12.m12.3.3.1" xref="S4.SS3.p2.12.m12.3.3.1.cmml">¯</mo></mover><mo id="S4.SS3.p2.12.m12.4.4.3.2.3" stretchy="false" 
xref="S4.SS3.p2.12.m12.4.4.3.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.12.m12.4b"><apply id="S4.SS3.p2.12.m12.4.4.cmml" xref="S4.SS3.p2.12.m12.4.4"><eq id="S4.SS3.p2.12.m12.4.4.2.cmml" xref="S4.SS3.p2.12.m12.4.4.2"></eq><apply id="S4.SS3.p2.12.m12.4.4.1.cmml" xref="S4.SS3.p2.12.m12.4.4.1"><times id="S4.SS3.p2.12.m12.4.4.1.2.cmml" xref="S4.SS3.p2.12.m12.4.4.1.2"></times><ci id="S4.SS3.p2.12.m12.4.4.1.3.cmml" xref="S4.SS3.p2.12.m12.4.4.1.3">𝐶</ci><ci id="S4.SS3.p2.12.m12.4.4.1.4.cmml" xref="S4.SS3.p2.12.m12.4.4.1.4">𝑇</ci><apply id="S4.SS3.p2.12.m12.4.4.1.1.cmml" xref="S4.SS3.p2.12.m12.4.4.1.1"><csymbol cd="ambiguous" id="S4.SS3.p2.12.m12.4.4.1.1.2.cmml" xref="S4.SS3.p2.12.m12.4.4.1.1">subscript</csymbol><apply id="S4.SS3.p2.12.m12.4.4.1.1.1.1.1.cmml" xref="S4.SS3.p2.12.m12.4.4.1.1.1.1"><csymbol cd="ambiguous" id="S4.SS3.p2.12.m12.4.4.1.1.1.1.1.1.cmml" xref="S4.SS3.p2.12.m12.4.4.1.1.1.1">subscript</csymbol><ci id="S4.SS3.p2.12.m12.4.4.1.1.1.1.1.2.cmml" xref="S4.SS3.p2.12.m12.4.4.1.1.1.1.1.2">𝑚</ci><ci id="S4.SS3.p2.12.m12.4.4.1.1.1.1.1.3.cmml" xref="S4.SS3.p2.12.m12.4.4.1.1.1.1.1.3">𝑗</ci></apply><ci id="S4.SS3.p2.12.m12.4.4.1.1.3.cmml" xref="S4.SS3.p2.12.m12.4.4.1.1.3">𝑡</ci></apply><apply id="S4.SS3.p2.12.m12.4.4.1.5.1.cmml" xref="S4.SS3.p2.12.m12.4.4.1.5.2"><csymbol cd="latexml" id="S4.SS3.p2.12.m12.4.4.1.5.1.1.cmml" xref="S4.SS3.p2.12.m12.4.4.1.5.2.1">delimited-[]</csymbol><ci id="S4.SS3.p2.12.m12.1.1.cmml" xref="S4.SS3.p2.12.m12.1.1">𝑥</ci></apply></apply><interval closure="open" id="S4.SS3.p2.12.m12.4.4.3.1.cmml" xref="S4.SS3.p2.12.m12.4.4.3.2"><apply id="S4.SS3.p2.12.m12.2.2.cmml" xref="S4.SS3.p2.12.m12.2.2"><ci id="S4.SS3.p2.12.m12.2.2.1.cmml" xref="S4.SS3.p2.12.m12.2.2.1">¯</ci><apply id="S4.SS3.p2.12.m12.2.2.2.cmml" xref="S4.SS3.p2.12.m12.2.2.2"><times id="S4.SS3.p2.12.m12.2.2.2.1.cmml" xref="S4.SS3.p2.12.m12.2.2.2.1"></times><ci id="S4.SS3.p2.12.m12.2.2.2.2.cmml" xref="S4.SS3.p2.12.m12.2.2.2.2">𝑣</ci><ci 
id="S4.SS3.p2.12.m12.2.2.2.3.cmml" xref="S4.SS3.p2.12.m12.2.2.2.3">𝑎</ci><apply id="S4.SS3.p2.12.m12.2.2.2.4.cmml" xref="S4.SS3.p2.12.m12.2.2.2.4"><csymbol cd="ambiguous" id="S4.SS3.p2.12.m12.2.2.2.4.1.cmml" xref="S4.SS3.p2.12.m12.2.2.2.4">subscript</csymbol><ci id="S4.SS3.p2.12.m12.2.2.2.4.2.cmml" xref="S4.SS3.p2.12.m12.2.2.2.4.2">𝑙</ci><ci id="S4.SS3.p2.12.m12.2.2.2.4.3.cmml" xref="S4.SS3.p2.12.m12.2.2.2.4.3">𝑥</ci></apply></apply></apply><apply id="S4.SS3.p2.12.m12.3.3.cmml" xref="S4.SS3.p2.12.m12.3.3"><ci id="S4.SS3.p2.12.m12.3.3.1.cmml" xref="S4.SS3.p2.12.m12.3.3.1">¯</ci><apply id="S4.SS3.p2.12.m12.3.3.2.cmml" xref="S4.SS3.p2.12.m12.3.3.2"><times id="S4.SS3.p2.12.m12.3.3.2.1.cmml" xref="S4.SS3.p2.12.m12.3.3.2.1"></times><ci id="S4.SS3.p2.12.m12.3.3.2.2.cmml" xref="S4.SS3.p2.12.m12.3.3.2.2">𝑡</ci><ci id="S4.SS3.p2.12.m12.3.3.2.3.cmml" xref="S4.SS3.p2.12.m12.3.3.2.3">𝑒</ci><ci id="S4.SS3.p2.12.m12.3.3.2.4.cmml" xref="S4.SS3.p2.12.m12.3.3.2.4">𝑠</ci><apply id="S4.SS3.p2.12.m12.3.3.2.5.cmml" xref="S4.SS3.p2.12.m12.3.3.2.5"><csymbol cd="ambiguous" id="S4.SS3.p2.12.m12.3.3.2.5.1.cmml" xref="S4.SS3.p2.12.m12.3.3.2.5">subscript</csymbol><ci id="S4.SS3.p2.12.m12.3.3.2.5.2.cmml" xref="S4.SS3.p2.12.m12.3.3.2.5.2">𝑡</ci><ci id="S4.SS3.p2.12.m12.3.3.2.5.3.cmml" xref="S4.SS3.p2.12.m12.3.3.2.5.3">𝑥</ci></apply></apply></apply></interval></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.12.m12.4c">CT(m_{j})_{t}[x]=(\overline{val_{x}},\overline{test_{x}})</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.12.m12.4d">italic_C italic_T ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT [ italic_x ] = ( over¯ start_ARG italic_v italic_a italic_l start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT end_ARG , over¯ start_ARG italic_t italic_e italic_s italic_t start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT end_ARG )</annotation></semantics></math> is computed, where <math 
alttext="\overline{val_{x}}" class="ltx_Math" display="inline" id="S4.SS3.p2.13.m13.1"><semantics id="S4.SS3.p2.13.m13.1a"><mover accent="true" id="S4.SS3.p2.13.m13.1.1" xref="S4.SS3.p2.13.m13.1.1.cmml"><mrow id="S4.SS3.p2.13.m13.1.1.2" xref="S4.SS3.p2.13.m13.1.1.2.cmml"><mi id="S4.SS3.p2.13.m13.1.1.2.2" xref="S4.SS3.p2.13.m13.1.1.2.2.cmml">v</mi><mo id="S4.SS3.p2.13.m13.1.1.2.1" xref="S4.SS3.p2.13.m13.1.1.2.1.cmml">⁢</mo><mi id="S4.SS3.p2.13.m13.1.1.2.3" xref="S4.SS3.p2.13.m13.1.1.2.3.cmml">a</mi><mo id="S4.SS3.p2.13.m13.1.1.2.1a" xref="S4.SS3.p2.13.m13.1.1.2.1.cmml">⁢</mo><msub id="S4.SS3.p2.13.m13.1.1.2.4" xref="S4.SS3.p2.13.m13.1.1.2.4.cmml"><mi id="S4.SS3.p2.13.m13.1.1.2.4.2" xref="S4.SS3.p2.13.m13.1.1.2.4.2.cmml">l</mi><mi id="S4.SS3.p2.13.m13.1.1.2.4.3" xref="S4.SS3.p2.13.m13.1.1.2.4.3.cmml">x</mi></msub></mrow><mo id="S4.SS3.p2.13.m13.1.1.1" xref="S4.SS3.p2.13.m13.1.1.1.cmml">¯</mo></mover><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.13.m13.1b"><apply id="S4.SS3.p2.13.m13.1.1.cmml" xref="S4.SS3.p2.13.m13.1.1"><ci id="S4.SS3.p2.13.m13.1.1.1.cmml" xref="S4.SS3.p2.13.m13.1.1.1">¯</ci><apply id="S4.SS3.p2.13.m13.1.1.2.cmml" xref="S4.SS3.p2.13.m13.1.1.2"><times id="S4.SS3.p2.13.m13.1.1.2.1.cmml" xref="S4.SS3.p2.13.m13.1.1.2.1"></times><ci id="S4.SS3.p2.13.m13.1.1.2.2.cmml" xref="S4.SS3.p2.13.m13.1.1.2.2">𝑣</ci><ci id="S4.SS3.p2.13.m13.1.1.2.3.cmml" xref="S4.SS3.p2.13.m13.1.1.2.3">𝑎</ci><apply id="S4.SS3.p2.13.m13.1.1.2.4.cmml" xref="S4.SS3.p2.13.m13.1.1.2.4"><csymbol cd="ambiguous" id="S4.SS3.p2.13.m13.1.1.2.4.1.cmml" xref="S4.SS3.p2.13.m13.1.1.2.4">subscript</csymbol><ci id="S4.SS3.p2.13.m13.1.1.2.4.2.cmml" xref="S4.SS3.p2.13.m13.1.1.2.4.2">𝑙</ci><ci id="S4.SS3.p2.13.m13.1.1.2.4.3.cmml" xref="S4.SS3.p2.13.m13.1.1.2.4.3">𝑥</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.13.m13.1c">\overline{val_{x}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.13.m13.1d">over¯ start_ARG 
italic_v italic_a italic_l start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT end_ARG</annotation></semantics></math> and <math alttext="\overline{test_{x}}" class="ltx_Math" display="inline" id="S4.SS3.p2.14.m14.1"><semantics id="S4.SS3.p2.14.m14.1a"><mover accent="true" id="S4.SS3.p2.14.m14.1.1" xref="S4.SS3.p2.14.m14.1.1.cmml"><mrow id="S4.SS3.p2.14.m14.1.1.2" xref="S4.SS3.p2.14.m14.1.1.2.cmml"><mi id="S4.SS3.p2.14.m14.1.1.2.2" xref="S4.SS3.p2.14.m14.1.1.2.2.cmml">t</mi><mo id="S4.SS3.p2.14.m14.1.1.2.1" xref="S4.SS3.p2.14.m14.1.1.2.1.cmml">⁢</mo><mi id="S4.SS3.p2.14.m14.1.1.2.3" xref="S4.SS3.p2.14.m14.1.1.2.3.cmml">e</mi><mo id="S4.SS3.p2.14.m14.1.1.2.1a" xref="S4.SS3.p2.14.m14.1.1.2.1.cmml">⁢</mo><mi id="S4.SS3.p2.14.m14.1.1.2.4" xref="S4.SS3.p2.14.m14.1.1.2.4.cmml">s</mi><mo id="S4.SS3.p2.14.m14.1.1.2.1b" xref="S4.SS3.p2.14.m14.1.1.2.1.cmml">⁢</mo><msub id="S4.SS3.p2.14.m14.1.1.2.5" xref="S4.SS3.p2.14.m14.1.1.2.5.cmml"><mi id="S4.SS3.p2.14.m14.1.1.2.5.2" xref="S4.SS3.p2.14.m14.1.1.2.5.2.cmml">t</mi><mi id="S4.SS3.p2.14.m14.1.1.2.5.3" xref="S4.SS3.p2.14.m14.1.1.2.5.3.cmml">x</mi></msub></mrow><mo id="S4.SS3.p2.14.m14.1.1.1" xref="S4.SS3.p2.14.m14.1.1.1.cmml">¯</mo></mover><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.14.m14.1b"><apply id="S4.SS3.p2.14.m14.1.1.cmml" xref="S4.SS3.p2.14.m14.1.1"><ci id="S4.SS3.p2.14.m14.1.1.1.cmml" xref="S4.SS3.p2.14.m14.1.1.1">¯</ci><apply id="S4.SS3.p2.14.m14.1.1.2.cmml" xref="S4.SS3.p2.14.m14.1.1.2"><times id="S4.SS3.p2.14.m14.1.1.2.1.cmml" xref="S4.SS3.p2.14.m14.1.1.2.1"></times><ci id="S4.SS3.p2.14.m14.1.1.2.2.cmml" xref="S4.SS3.p2.14.m14.1.1.2.2">𝑡</ci><ci id="S4.SS3.p2.14.m14.1.1.2.3.cmml" xref="S4.SS3.p2.14.m14.1.1.2.3">𝑒</ci><ci id="S4.SS3.p2.14.m14.1.1.2.4.cmml" xref="S4.SS3.p2.14.m14.1.1.2.4">𝑠</ci><apply id="S4.SS3.p2.14.m14.1.1.2.5.cmml" xref="S4.SS3.p2.14.m14.1.1.2.5"><csymbol cd="ambiguous" id="S4.SS3.p2.14.m14.1.1.2.5.1.cmml" xref="S4.SS3.p2.14.m14.1.1.2.5">subscript</csymbol><ci 
id="S4.SS3.p2.14.m14.1.1.2.5.2.cmml" xref="S4.SS3.p2.14.m14.1.1.2.5.2">𝑡</ci><ci id="S4.SS3.p2.14.m14.1.1.2.5.3.cmml" xref="S4.SS3.p2.14.m14.1.1.2.5.3">𝑥</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.14.m14.1c">\overline{test_{x}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.14.m14.1d">over¯ start_ARG italic_t italic_e italic_s italic_t start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT end_ARG</annotation></semantics></math> are the average validation and test accuracies of <math alttext="m_{j}" class="ltx_Math" display="inline" id="S4.SS3.p2.15.m15.1"><semantics id="S4.SS3.p2.15.m15.1a"><msub id="S4.SS3.p2.15.m15.1.1" xref="S4.SS3.p2.15.m15.1.1.cmml"><mi id="S4.SS3.p2.15.m15.1.1.2" xref="S4.SS3.p2.15.m15.1.1.2.cmml">m</mi><mi id="S4.SS3.p2.15.m15.1.1.3" xref="S4.SS3.p2.15.m15.1.1.3.cmml">j</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.15.m15.1b"><apply id="S4.SS3.p2.15.m15.1.1.cmml" xref="S4.SS3.p2.15.m15.1.1"><csymbol cd="ambiguous" id="S4.SS3.p2.15.m15.1.1.1.cmml" xref="S4.SS3.p2.15.m15.1.1">subscript</csymbol><ci id="S4.SS3.p2.15.m15.1.1.2.cmml" xref="S4.SS3.p2.15.m15.1.1.2">𝑚</ci><ci id="S4.SS3.p2.15.m15.1.1.3.cmml" xref="S4.SS3.p2.15.m15.1.1.3">𝑗</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.15.m15.1c">m_{j}</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.15.m15.1d">italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT</annotation></semantics></math> on the datasets in cluster <math alttext="x" class="ltx_Math" display="inline" id="S4.SS3.p2.16.m16.1"><semantics id="S4.SS3.p2.16.m16.1a"><mi id="S4.SS3.p2.16.m16.1.1" xref="S4.SS3.p2.16.m16.1.1.cmml">x</mi><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.16.m16.1b"><ci id="S4.SS3.p2.16.m16.1.1.cmml" xref="S4.SS3.p2.16.m16.1.1">𝑥</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.16.m16.1c">x</annotation><annotation 
encoding="application/x-llamapun" id="S4.SS3.p2.16.m16.1d">italic_x</annotation></semantics></math> respectively. Then, based on the generated convergence trends <math alttext="CT(m_{j})_{t}" class="ltx_Math" display="inline" id="S4.SS3.p2.17.m17.1"><semantics id="S4.SS3.p2.17.m17.1a"><mrow id="S4.SS3.p2.17.m17.1.1" xref="S4.SS3.p2.17.m17.1.1.cmml"><mi id="S4.SS3.p2.17.m17.1.1.3" xref="S4.SS3.p2.17.m17.1.1.3.cmml">C</mi><mo id="S4.SS3.p2.17.m17.1.1.2" xref="S4.SS3.p2.17.m17.1.1.2.cmml">⁢</mo><mi id="S4.SS3.p2.17.m17.1.1.4" xref="S4.SS3.p2.17.m17.1.1.4.cmml">T</mi><mo id="S4.SS3.p2.17.m17.1.1.2a" xref="S4.SS3.p2.17.m17.1.1.2.cmml">⁢</mo><msub id="S4.SS3.p2.17.m17.1.1.1" xref="S4.SS3.p2.17.m17.1.1.1.cmml"><mrow id="S4.SS3.p2.17.m17.1.1.1.1.1" xref="S4.SS3.p2.17.m17.1.1.1.1.1.1.cmml"><mo id="S4.SS3.p2.17.m17.1.1.1.1.1.2" stretchy="false" xref="S4.SS3.p2.17.m17.1.1.1.1.1.1.cmml">(</mo><msub id="S4.SS3.p2.17.m17.1.1.1.1.1.1" xref="S4.SS3.p2.17.m17.1.1.1.1.1.1.cmml"><mi id="S4.SS3.p2.17.m17.1.1.1.1.1.1.2" xref="S4.SS3.p2.17.m17.1.1.1.1.1.1.2.cmml">m</mi><mi id="S4.SS3.p2.17.m17.1.1.1.1.1.1.3" xref="S4.SS3.p2.17.m17.1.1.1.1.1.1.3.cmml">j</mi></msub><mo id="S4.SS3.p2.17.m17.1.1.1.1.1.3" stretchy="false" xref="S4.SS3.p2.17.m17.1.1.1.1.1.1.cmml">)</mo></mrow><mi id="S4.SS3.p2.17.m17.1.1.1.3" xref="S4.SS3.p2.17.m17.1.1.1.3.cmml">t</mi></msub></mrow><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.17.m17.1b"><apply id="S4.SS3.p2.17.m17.1.1.cmml" xref="S4.SS3.p2.17.m17.1.1"><times id="S4.SS3.p2.17.m17.1.1.2.cmml" xref="S4.SS3.p2.17.m17.1.1.2"></times><ci id="S4.SS3.p2.17.m17.1.1.3.cmml" xref="S4.SS3.p2.17.m17.1.1.3">𝐶</ci><ci id="S4.SS3.p2.17.m17.1.1.4.cmml" xref="S4.SS3.p2.17.m17.1.1.4">𝑇</ci><apply id="S4.SS3.p2.17.m17.1.1.1.cmml" xref="S4.SS3.p2.17.m17.1.1.1"><csymbol cd="ambiguous" id="S4.SS3.p2.17.m17.1.1.1.2.cmml" xref="S4.SS3.p2.17.m17.1.1.1">subscript</csymbol><apply id="S4.SS3.p2.17.m17.1.1.1.1.1.1.cmml" xref="S4.SS3.p2.17.m17.1.1.1.1.1"><csymbol cd="ambiguous" 
id="S4.SS3.p2.17.m17.1.1.1.1.1.1.1.cmml" xref="S4.SS3.p2.17.m17.1.1.1.1.1">subscript</csymbol><ci id="S4.SS3.p2.17.m17.1.1.1.1.1.1.2.cmml" xref="S4.SS3.p2.17.m17.1.1.1.1.1.1.2">𝑚</ci><ci id="S4.SS3.p2.17.m17.1.1.1.1.1.1.3.cmml" xref="S4.SS3.p2.17.m17.1.1.1.1.1.1.3">𝑗</ci></apply><ci id="S4.SS3.p2.17.m17.1.1.1.3.cmml" xref="S4.SS3.p2.17.m17.1.1.1.3">𝑡</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.17.m17.1c">CT(m_{j})_{t}</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.17.m17.1d">italic_C italic_T ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math>, we can assign the best matched convergence trend for <math alttext="m_{j}" class="ltx_Math" display="inline" id="S4.SS3.p2.18.m18.1"><semantics id="S4.SS3.p2.18.m18.1a"><msub id="S4.SS3.p2.18.m18.1.1" xref="S4.SS3.p2.18.m18.1.1.cmml"><mi id="S4.SS3.p2.18.m18.1.1.2" xref="S4.SS3.p2.18.m18.1.1.2.cmml">m</mi><mi id="S4.SS3.p2.18.m18.1.1.3" xref="S4.SS3.p2.18.m18.1.1.3.cmml">j</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.18.m18.1b"><apply id="S4.SS3.p2.18.m18.1.1.cmml" xref="S4.SS3.p2.18.m18.1.1"><csymbol cd="ambiguous" id="S4.SS3.p2.18.m18.1.1.1.cmml" xref="S4.SS3.p2.18.m18.1.1">subscript</csymbol><ci id="S4.SS3.p2.18.m18.1.1.2.cmml" xref="S4.SS3.p2.18.m18.1.1.2">𝑚</ci><ci id="S4.SS3.p2.18.m18.1.1.3.cmml" xref="S4.SS3.p2.18.m18.1.1.3">𝑗</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.18.m18.1c">m_{j}</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.18.m18.1d">italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT</annotation></semantics></math> trained on <math alttext="d(T)" class="ltx_Math" display="inline" id="S4.SS3.p2.19.m19.1"><semantics id="S4.SS3.p2.19.m19.1a"><mrow id="S4.SS3.p2.19.m19.1.2" xref="S4.SS3.p2.19.m19.1.2.cmml"><mi id="S4.SS3.p2.19.m19.1.2.2" 
xref="S4.SS3.p2.19.m19.1.2.2.cmml">d</mi><mo id="S4.SS3.p2.19.m19.1.2.1" xref="S4.SS3.p2.19.m19.1.2.1.cmml">⁢</mo><mrow id="S4.SS3.p2.19.m19.1.2.3.2" xref="S4.SS3.p2.19.m19.1.2.cmml"><mo id="S4.SS3.p2.19.m19.1.2.3.2.1" stretchy="false" xref="S4.SS3.p2.19.m19.1.2.cmml">(</mo><mi id="S4.SS3.p2.19.m19.1.1" xref="S4.SS3.p2.19.m19.1.1.cmml">T</mi><mo id="S4.SS3.p2.19.m19.1.2.3.2.2" stretchy="false" xref="S4.SS3.p2.19.m19.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.19.m19.1b"><apply id="S4.SS3.p2.19.m19.1.2.cmml" xref="S4.SS3.p2.19.m19.1.2"><times id="S4.SS3.p2.19.m19.1.2.1.cmml" xref="S4.SS3.p2.19.m19.1.2.1"></times><ci id="S4.SS3.p2.19.m19.1.2.2.cmml" xref="S4.SS3.p2.19.m19.1.2.2">𝑑</ci><ci id="S4.SS3.p2.19.m19.1.1.cmml" xref="S4.SS3.p2.19.m19.1.1">𝑇</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.19.m19.1c">d(T)</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.19.m19.1d">italic_d ( italic_T )</annotation></semantics></math> after <math alttext="s" class="ltx_Math" display="inline" id="S4.SS3.p2.20.m20.1"><semantics id="S4.SS3.p2.20.m20.1a"><mi id="S4.SS3.p2.20.m20.1.1" xref="S4.SS3.p2.20.m20.1.1.cmml">s</mi><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.20.m20.1b"><ci id="S4.SS3.p2.20.m20.1.1.cmml" xref="S4.SS3.p2.20.m20.1.1">𝑠</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.20.m20.1c">s</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.20.m20.1d">italic_s</annotation></semantics></math> steps, as in Eq. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S4.E5" title="In IV-C Fine-Selection Algorithm ‣ IV Fine-Selection ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">5</span></a>. The final training performance can then be predicted as the test accuracy of the matched convergence trend, as in Eq. 
<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S4.E6" title="In IV-C Fine-Selection Algorithm ‣ IV Fine-Selection ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">6</span></a>.</p> </div> <div class="ltx_para" id="S4.SS3.p3"> <table class="ltx_equation ltx_eqn_table" id="S4.E5"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="\begin{split}\begin{aligned} matched(val(T|m_{j})_{t})=\underset{x}{\arg\min}(% {|\overline{val_{x}}-val(T|m_{j})_{t}|})\end{aligned}\end{split}" class="ltx_math_unparsed" display="block" id="S4.E5.m1.1"><semantics id="S4.E5.m1.1a"><mtable displaystyle="true" id="S4.E5.m1.1.1"><mtr id="S4.E5.m1.1.1a"><mtd class="ltx_align_right" columnalign="right" id="S4.E5.m1.1.1b"><mtable displaystyle="true" id="S4.E5.m1.1.1.1.1.1.1"><mtr id="S4.E5.m1.1.1.1.1.1.1a"><mtd class="ltx_align_right" columnalign="right" id="S4.E5.m1.1.1.1.1.1.1b"><mrow id="S4.E5.m1.1.1.1.1.1.1.2.1.1"><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.1">m</mi><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.2">a</mi><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.3">t</mi><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.4">c</mi><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.5">h</mi><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.6">e</mi><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.7">d</mi><mrow id="S4.E5.m1.1.1.1.1.1.1.2.1.1.8"><mo id="S4.E5.m1.1.1.1.1.1.1.2.1.1.8.1" stretchy="false">(</mo><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.8.2">v</mi><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.8.3">a</mi><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.8.4">l</mi><msub id="S4.E5.m1.1.1.1.1.1.1.2.1.1.8.5"><mrow id="S4.E5.m1.1.1.1.1.1.1.2.1.1.8.5.2"><mo id="S4.E5.m1.1.1.1.1.1.1.2.1.1.8.5.2.1" stretchy="false">(</mo><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.8.5.2.2">T</mi><mo fence="false" id="S4.E5.m1.1.1.1.1.1.1.2.1.1.8.5.2.3" rspace="0.167em" stretchy="false">|</mo><msub id="S4.E5.m1.1.1.1.1.1.1.2.1.1.8.5.2.4"><mi 
id="S4.E5.m1.1.1.1.1.1.1.2.1.1.8.5.2.4.2">m</mi><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.8.5.2.4.3">j</mi></msub><mo id="S4.E5.m1.1.1.1.1.1.1.2.1.1.8.5.2.5" stretchy="false">)</mo></mrow><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.8.5.3">t</mi></msub><mo id="S4.E5.m1.1.1.1.1.1.1.2.1.1.8.6" stretchy="false">)</mo></mrow><mo id="S4.E5.m1.1.1.1.1.1.1.2.1.1.9">=</mo><munder accentunder="true" id="S4.E5.m1.1.1.1.1.1.1.2.1.1.10"><mrow id="S4.E5.m1.1.1.1.1.1.1.2.1.1.10.2"><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.10.2.1">arg</mi><mo id="S4.E5.m1.1.1.1.1.1.1.2.1.1.10.2a" lspace="0.167em">⁡</mo><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.10.2.2">min</mi></mrow><mo id="S4.E5.m1.1.1.1.1.1.1.2.1.1.10.1">𝑥</mo></munder><mrow id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11"><mo id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.1" stretchy="false">(</mo><mo fence="false" id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.2" rspace="0.167em" stretchy="false">|</mo><mover accent="true" id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.3"><mrow id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.3.2"><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.3.2.2">v</mi><mo id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.3.2.1">⁢</mo><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.3.2.3">a</mi><mo id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.3.2.1a">⁢</mo><msub id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.3.2.4"><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.3.2.4.2">l</mi><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.3.2.4.3">x</mi></msub></mrow><mo id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.3.1">¯</mo></mover><mo id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.4">−</mo><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.5">v</mi><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.6">a</mi><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.7">l</mi><msub id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.8"><mrow id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.8.2"><mo id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.8.2.1" stretchy="false">(</mo><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.8.2.2">T</mi><mo fence="false" id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.8.2.3" rspace="0.167em" stretchy="false">|</mo><msub id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.8.2.4"><mi 
id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.8.2.4.2">m</mi><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.8.2.4.3">j</mi></msub><mo id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.8.2.5" stretchy="false">)</mo></mrow><mi id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.8.3">t</mi></msub><mo fence="false" id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.9" rspace="0.167em" stretchy="false">|</mo><mo id="S4.E5.m1.1.1.1.1.1.1.2.1.1.11.10" stretchy="false">)</mo></mrow></mrow></mtd></mtr></mtable></mtd></mtr></mtable><annotation encoding="application/x-tex" id="S4.E5.m1.1b">\begin{split}\begin{aligned} matched(val(T|m_{j})_{t})=\underset{x}{\arg\min}(% {|\overline{val_{x}}-val(T|m_{j})_{t}|})\end{aligned}\end{split}</annotation><annotation encoding="application/x-llamapun" id="S4.E5.m1.1c">start_ROW start_CELL start_ROW start_CELL italic_m italic_a italic_t italic_c italic_h italic_e italic_d ( italic_v italic_a italic_l ( italic_T | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) = underitalic_x start_ARG roman_arg roman_min end_ARG ( | over¯ start_ARG italic_v italic_a italic_l start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT end_ARG - italic_v italic_a italic_l ( italic_T | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT | ) end_CELL end_ROW end_CELL end_ROW</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(5)</span></td> </tr></tbody> </table> </div> <div class="ltx_para" id="S4.SS3.p4"> <table class="ltx_equation ltx_eqn_table" id="S4.E6"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="\begin{split}\begin{aligned} pred(T|m_{j})_{t}=CT(m_{j})_{t}[matched(val(T|m_{% j})_{t})]\end{aligned}\end{split}" 
class="ltx_Math" display="block" id="S4.E6.m1.1"><semantics id="S4.E6.m1.1a"><mtable displaystyle="true" id="S4.E6.m1.1.1" xref="S4.E6.m1.1.1.1.1.1.1.cmml"><mtr id="S4.E6.m1.1.1a" xref="S4.E6.m1.1.1.1.1.1.1.cmml"><mtd class="ltx_align_right" columnalign="right" id="S4.E6.m1.1.1b" xref="S4.E6.m1.1.1.1.1.1.1.cmml"><mtable displaystyle="true" id="S4.E6.m1.1.1.1.1.1.1" xref="S4.E6.m1.1.1.1.1.1.1.cmml"><mtr id="S4.E6.m1.1.1.1.1.1.1a" xref="S4.E6.m1.1.1.1.1.1.1.cmml"><mtd class="ltx_align_right" columnalign="right" id="S4.E6.m1.1.1.1.1.1.1b" xref="S4.E6.m1.1.1.1.1.1.1.cmml"><mrow id="S4.E6.m1.1.1.1.1.1.1.3.3.3" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.cmml"><mrow id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.cmml"><mi id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.3" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.3.cmml">p</mi><mo id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.2" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.2.cmml">⁢</mo><mi id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.4" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.4.cmml">r</mi><mo id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.2a" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.2.cmml">⁢</mo><mi id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.5" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.5.cmml">e</mi><mo id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.2b" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.2.cmml">⁢</mo><mi id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.6" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.6.cmml">d</mi><mo id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.2c" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.2.cmml">⁢</mo><msub id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.cmml"><mrow id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><mo id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.2" stretchy="false" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">(</mo><mrow id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><mi id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml">T</mi><mo fence="false" 
id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">|</mo><msub id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml"><mi id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.2" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml">m</mi><mi id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.3" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml">j</mi></msub></mrow><mo id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.3" stretchy="false" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">)</mo></mrow><mi id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.3" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.3.cmml">t</mi></msub></mrow><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.4" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.4.cmml">=</mo><mrow id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.cmml"><mi id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.4" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.4.cmml">C</mi><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.3" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.3.cmml">⁢</mo><mi id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.5" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.5.cmml">T</mi><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.3a" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.3.cmml">⁢</mo><msub id="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1" xref="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.cmml"><mrow id="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1" xref="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1.1.cmml"><mo id="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1.2" stretchy="false" xref="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1.1.cmml">(</mo><msub id="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1.1" xref="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1.1.cmml"><mi id="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1.1.2" xref="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1.1.2.cmml">m</mi><mi id="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1.1.3" xref="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1.1.3.cmml">j</mi></msub><mo id="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1.3" stretchy="false" xref="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1.1.cmml">)</mo></mrow><mi id="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.3" 
xref="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.3.cmml">t</mi></msub><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.3b" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.3.cmml">⁢</mo><mrow id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.2.cmml"><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.2" stretchy="false" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.2.1.cmml">[</mo><mrow id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.cmml"><mi id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.3" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.3.cmml">m</mi><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.2" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.2.cmml">⁢</mo><mi id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.4" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.4.cmml">a</mi><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.2a" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.2.cmml">⁢</mo><mi id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.5" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.5.cmml">t</mi><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.2b" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.2.cmml">⁢</mo><mi id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.6" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.6.cmml">c</mi><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.2c" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.2.cmml">⁢</mo><mi id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.7" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.7.cmml">h</mi><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.2d" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.2.cmml">⁢</mo><mi id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.8" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.8.cmml">e</mi><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.2e" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.2.cmml">⁢</mo><mi id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.9" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.9.cmml">d</mi><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.2f" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.2.cmml">⁢</mo><mrow id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.cmml"><mo 
id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.2" stretchy="false" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.cmml">(</mo><mrow id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.cmml"><mi id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.3" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.3.cmml">v</mi><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.2" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.2.cmml">⁢</mo><mi id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.4" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.4.cmml">a</mi><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.2a" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.2.cmml">⁢</mo><mi id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.5" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.5.cmml">l</mi><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.2b" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.2.cmml">⁢</mo><msub id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.cmml"><mrow id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.cmml"><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.2" stretchy="false" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.cmml">(</mo><mrow id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.cmml"><mi id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.2" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.2.cmml">T</mi><mo fence="false" id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.1" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.1.cmml">|</mo><msub id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.3" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.3.cmml"><mi id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.3.2" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.3.2.cmml">m</mi><mi id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.3.3" 
xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.3.3.cmml">j</mi></msub></mrow><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.3" stretchy="false" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.cmml">)</mo></mrow><mi id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.3" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.3.cmml">t</mi></msub></mrow><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.3" stretchy="false" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.cmml">)</mo></mrow></mrow><mo id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.3" stretchy="false" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.2.1.cmml">]</mo></mrow></mrow></mrow></mtd></mtr></mtable></mtd></mtr></mtable><annotation-xml encoding="MathML-Content" id="S4.E6.m1.1b"><matrix id="S4.E6.m1.1.1.1.1.1.1.cmml" xref="S4.E6.m1.1.1"><matrixrow id="S4.E6.m1.1.1.1.1.1.1a.cmml" xref="S4.E6.m1.1.1"><apply id="S4.E6.m1.1.1.1.1.1.1.3.3.3.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3"><eq id="S4.E6.m1.1.1.1.1.1.1.3.3.3.4.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.4"></eq><apply id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1"><times id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.2"></times><ci id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.3">𝑝</ci><ci id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.4.cmml" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.4">𝑟</ci><ci id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.5.cmml" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.5">𝑒</ci><ci id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.6.cmml" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.6">𝑑</ci><apply id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1">subscript</csymbol><apply id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1"><csymbol cd="latexml" id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" 
xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1">conditional</csymbol><ci id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2">𝑇</ci><apply id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3">subscript</csymbol><ci id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.2">𝑚</ci><ci id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.3">𝑗</ci></apply></apply><ci id="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S4.E6.m1.1.1.1.1.1.1.1.1.1.1.1.3">𝑡</ci></apply></apply><apply id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3"><times id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.3.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.3"></times><ci id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.4.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.4">𝐶</ci><ci id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.5.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.5">𝑇</ci><apply id="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.cmml" xref="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1"><csymbol cd="ambiguous" id="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.2.cmml" xref="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1">subscript</csymbol><apply id="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1.1.cmml" xref="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1"><csymbol cd="ambiguous" id="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1.1.1.cmml" xref="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1">subscript</csymbol><ci id="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1.1.2.cmml" xref="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1.1.2">𝑚</ci><ci id="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1.1.3.cmml" xref="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.1.1.1.3">𝑗</ci></apply><ci id="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.3.cmml" xref="S4.E6.m1.1.1.1.1.1.1.2.2.2.2.1.3">𝑡</ci></apply><apply id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.2.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1"><csymbol cd="latexml" 
id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.2.1.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.2">delimited-[]</csymbol><apply id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1"><times id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.2.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.2"></times><ci id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.3.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.3">𝑚</ci><ci id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.4.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.4">𝑎</ci><ci id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.5.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.5">𝑡</ci><ci id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.6.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.6">𝑐</ci><ci id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.7.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.7">ℎ</ci><ci id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.8.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.8">𝑒</ci><ci id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.9.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.9">𝑑</ci><apply id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1"><times id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.2.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.2"></times><ci id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.3.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.3">𝑣</ci><ci id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.4.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.4">𝑎</ci><ci id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.5.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.5">𝑙</ci><apply id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.2.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1">subscript</csymbol><apply id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1"><csymbol cd="latexml" 
id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.1.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.1">conditional</csymbol><ci id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.2.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.2">𝑇</ci><apply id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.3.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.3.1.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.3">subscript</csymbol><ci id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.3.2.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.3.2">𝑚</ci><ci id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.3.3.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.1.1.1.3.3">𝑗</ci></apply></apply><ci id="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.3.cmml" xref="S4.E6.m1.1.1.1.1.1.1.3.3.3.3.2.1.1.1.1.1.1.3">𝑡</ci></apply></apply></apply></apply></apply></apply></matrixrow></matrix></annotation-xml><annotation encoding="application/x-tex" id="S4.E6.m1.1c">\begin{split}\begin{aligned} pred(T|m_{j})_{t}=CT(m_{j})_{t}[matched(val(T|m_{% j})_{t})]\end{aligned}\end{split}</annotation><annotation encoding="application/x-llamapun" id="S4.E6.m1.1d">start_ROW start_CELL start_ROW start_CELL italic_p italic_r italic_e italic_d ( italic_T | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = italic_C italic_T ( italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT [ italic_m italic_a italic_t italic_c italic_h italic_e italic_d ( italic_v italic_a italic_l ( italic_T | italic_m start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) ] end_CELL end_ROW end_CELL end_ROW</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno 
ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(6)</span></td> </tr></tbody> </table> </div> <figure class="ltx_figure" id="S4.F4"> <p class="ltx_p ltx_align_center" id="S4.F4.1"><span class="ltx_text" id="S4.F4.1.1"><img alt="Refer to caption" class="ltx_graphics ltx_img_landscape" height="960" id="S4.F4.1.1.g1" src="extracted/2404.00069v1/bert-asian-hate-tweets-asian-unclean-freeze-4.png" width="1600"/></span></p> <br class="ltx_break ltx_break"/> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_figure">Figure 4: </span>Validation and test performance of the DoyyingFace/bert-asian-hate-tweets-asian-unclean-freeze-4 model on 30 datasets.</figcaption> </figure> <div class="ltx_para" id="S4.SS3.p5"> <p class="ltx_p" id="S4.SS3.p5.1">Algorithm 1 describes the proposed fine-selection algorithm, which builds on convergence trend mining. As in successive halving, fine-selection is performed each time the remaining models have been fine-tuned for a further <math alttext="s" class="ltx_Math" display="inline" id="S4.SS3.p5.1.m1.1"><semantics id="S4.SS3.p5.1.m1.1a"><mi id="S4.SS3.p5.1.m1.1.1" xref="S4.SS3.p5.1.m1.1.1.cmml">s</mi><annotation-xml encoding="MathML-Content" id="S4.SS3.p5.1.m1.1b"><ci id="S4.SS3.p5.1.m1.1.1.cmml" xref="S4.SS3.p5.1.m1.1.1">𝑠</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p5.1.m1.1c">s</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p5.1.m1.1d">italic_s</annotation></semantics></math> steps, provided that more than one model remains. 
The specific process is as follows:</p> <ul class="ltx_itemize" id="S4.I1"> <li class="ltx_item" id="S4.I1.i1" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S4.I1.i1.p1"> <p class="ltx_p" id="S4.I1.i1.p1.1">Obtaining: Obtain each model’s validation results at every stage and its final test results on all benchmark datasets.</p> </div> </li> <li class="ltx_item" id="S4.I1.i2" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S4.I1.i2.p1"> <p class="ltx_p" id="S4.I1.i2.p1.1">Convergence Trend Mining and Match: Cluster the validation results of the current stage and obtain the matched convergence trend by Eq. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S4.E5" title="In IV-C Fine-Selection Algorithm ‣ IV Fine-Selection ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">5</span></a>.</p> </div> </li> <li class="ltx_item" id="S4.I1.i3" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S4.I1.i3.p1"> <p class="ltx_p" id="S4.I1.i3.p1.1">Predict: Take the mean final test performance of the matched convergence trend as the predicted result by Eq. 
<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S4.E6" title="In IV-C Fine-Selection Algorithm ‣ IV Fine-Selection ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">6</span></a>.</p> </div> </li> <li class="ltx_item" id="S4.I1.i4" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S4.I1.i4.p1"> <p class="ltx_p" id="S4.I1.i4.p1.1">Fine-Filter: Examine the remaining models in order, starting from the one with the worst validation performance. A model is removed if some other remaining model has better validation performance and a predicted performance that is also better by a certain threshold.</p> </div> </li> <li class="ltx_item" id="S4.I1.i5" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S4.I1.i5.p1"> <p class="ltx_p" id="S4.I1.i5.p1.1">Halving: If more than half of the models present at the beginning of this selection round still remain, we repeatedly eliminate the model with the worst validation result until the number has been reduced by half.</p> </div> </li> </ul> </div> <div class="ltx_para" id="S4.SS3.p6"> <p class="ltx_p" id="S4.SS3.p6.1">In summary, our fine-selection method ultimately yields a single model fully trained on the target dataset. 
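</p> </div> <div class="ltx_para"> <p class="ltx_p">The Fine-Filter and Halving steps listed above can be sketched for a single selection round as follows. This is a minimal illustration, not the authors' implementation: the function name <span class="ltx_text ltx_font_typewriter">fine_select_round</span>, the dictionary inputs, the reading of "better by a certain threshold" as a difference of at least <span class="ltx_text ltx_font_typewriter">threshold</span>, and the use of floor division for "half" are all assumptions made for the example. The predicted scores are taken as given (they would come from Eq. 6).</p> </div>
<pre class="ltx_verbatim ltx_font_typewriter">
```python
from typing import Dict, List


def fine_select_round(
    val: Dict[str, float], pred: Dict[str, float], threshold: float
) -> List[str]:
    """Return the models kept after one fine-filter plus halving round.

    val  maps a (hypothetical) model name to its current validation score.
    pred maps the same names to the predicted final score.
    """
    n_start = len(val)
    # Order the models worst validation score first.
    order = sorted(val, key=lambda m: val[m])
    kept = list(order)

    # Fine-filter: drop a model when some remaining model beats it on
    # validation and is predicted to finish ahead by at least `threshold`.
    for m in order:
        dominated = any(
            val[o] > val[m] and pred[o] - pred[m] >= threshold
            for o in kept
            if o != m
        )
        if dominated:
            kept.remove(m)

    # Halving: keep at most half of the starting models (assumed: floor,
    # and never fewer than one). `kept` is still ordered worst first.
    target = max(1, n_start // 2)
    while len(kept) > target:
        kept.pop(0)
    return kept


if __name__ == "__main__":
    # Made-up scores for illustration only.
    val = {"a": 0.60, "b": 0.70, "c": 0.80, "d": 0.75}
    pred = {"a": 0.90, "b": 0.65, "c": 0.85, "d": 0.70}
    print(fine_select_round(val, pred, threshold=0.05))
```
</pre>
<div class="ltx_para"> <p class="ltx_p">In this sketch, a model with a weak current validation score survives the round when no better-validated model is also predicted to finish ahead of it by the threshold; the halving step then acts only as a fallback when the filter has removed too few models.</p> </div> <div class="ltx_para"> <p class="ltx_p">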
It ensures that at least half of the models are filtered out at each step, thereby resulting in a selection efficiency significantly higher than that of successive halving.</p> </div> <figure class="ltx_float ltx_algorithm" id="alg1"> <div class="ltx_listing ltx_lst_numbers_left ltx_listing" id="alg1.7"> <div class="ltx_listingline" id="alg1.7.7"> <div class="ltx_listing ltx_listing" id="alg1.7.7.7"> <div class="ltx_listingline" id="alg1.l0"> <span class="ltx_tag ltx_tag_listingline"><span class="ltx_text" id="alg1.l0.1.1.1" style="font-size:80%;">0:</span></span>  Recalled models <math alttext="M_{0}" class="ltx_Math" display="inline" id="alg1.l0.m1.1"><semantics id="alg1.l0.m1.1a"><msub id="alg1.l0.m1.1.1" xref="alg1.l0.m1.1.1.cmml"><mi id="alg1.l0.m1.1.1.2" xref="alg1.l0.m1.1.1.2.cmml">M</mi><mn id="alg1.l0.m1.1.1.3" xref="alg1.l0.m1.1.1.3.cmml">0</mn></msub><annotation-xml encoding="MathML-Content" id="alg1.l0.m1.1b"><apply id="alg1.l0.m1.1.1.cmml" xref="alg1.l0.m1.1.1"><csymbol cd="ambiguous" id="alg1.l0.m1.1.1.1.cmml" xref="alg1.l0.m1.1.1">subscript</csymbol><ci id="alg1.l0.m1.1.1.2.cmml" xref="alg1.l0.m1.1.1.2">𝑀</ci><cn id="alg1.l0.m1.1.1.3.cmml" type="integer" xref="alg1.l0.m1.1.1.3">0</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="alg1.l0.m1.1c">M_{0}</annotation><annotation encoding="application/x-llamapun" id="alg1.l0.m1.1d">italic_M start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT</annotation></semantics></math>; </div> <div class="ltx_listingline" id="alg1.4.4.4.4">Validation and Test result of <math alttext="M" class="ltx_Math" display="inline" id="alg1.1.1.1.1.m1.1"><semantics id="alg1.1.1.1.1.m1.1a"><mi id="alg1.1.1.1.1.m1.1.1" xref="alg1.1.1.1.1.m1.1.1.cmml">M</mi><annotation-xml encoding="MathML-Content" id="alg1.1.1.1.1.m1.1b"><ci id="alg1.1.1.1.1.m1.1.1.cmml" xref="alg1.1.1.1.1.m1.1.1">𝑀</ci></annotation-xml><annotation encoding="application/x-tex" id="alg1.1.1.1.1.m1.1c">M</annotation><annotation 
encoding="application/x-llamapun" id="alg1.1.1.1.1.m1.1d">italic_M</annotation></semantics></math> on <math alttext="D" class="ltx_Math" display="inline" id="alg1.2.2.2.2.m2.1"><semantics id="alg1.2.2.2.2.m2.1a"><mi id="alg1.2.2.2.2.m2.1.1" xref="alg1.2.2.2.2.m2.1.1.cmml">D</mi><annotation-xml encoding="MathML-Content" id="alg1.2.2.2.2.m2.1b"><ci id="alg1.2.2.2.2.m2.1.1.cmml" xref="alg1.2.2.2.2.m2.1.1">𝐷</ci></annotation-xml><annotation encoding="application/x-tex" id="alg1.2.2.2.2.m2.1c">D</annotation><annotation encoding="application/x-llamapun" id="alg1.2.2.2.2.m2.1d">italic_D</annotation></semantics></math>, <math alttext="val" class="ltx_Math" display="inline" id="alg1.3.3.3.3.m3.1"><semantics id="alg1.3.3.3.3.m3.1a"><mrow id="alg1.3.3.3.3.m3.1.1" xref="alg1.3.3.3.3.m3.1.1.cmml"><mi id="alg1.3.3.3.3.m3.1.1.2" xref="alg1.3.3.3.3.m3.1.1.2.cmml">v</mi><mo id="alg1.3.3.3.3.m3.1.1.1" xref="alg1.3.3.3.3.m3.1.1.1.cmml">⁢</mo><mi id="alg1.3.3.3.3.m3.1.1.3" xref="alg1.3.3.3.3.m3.1.1.3.cmml">a</mi><mo id="alg1.3.3.3.3.m3.1.1.1a" xref="alg1.3.3.3.3.m3.1.1.1.cmml">⁢</mo><mi id="alg1.3.3.3.3.m3.1.1.4" xref="alg1.3.3.3.3.m3.1.1.4.cmml">l</mi></mrow><annotation-xml encoding="MathML-Content" id="alg1.3.3.3.3.m3.1b"><apply id="alg1.3.3.3.3.m3.1.1.cmml" xref="alg1.3.3.3.3.m3.1.1"><times id="alg1.3.3.3.3.m3.1.1.1.cmml" xref="alg1.3.3.3.3.m3.1.1.1"></times><ci id="alg1.3.3.3.3.m3.1.1.2.cmml" xref="alg1.3.3.3.3.m3.1.1.2">𝑣</ci><ci id="alg1.3.3.3.3.m3.1.1.3.cmml" xref="alg1.3.3.3.3.m3.1.1.3">𝑎</ci><ci id="alg1.3.3.3.3.m3.1.1.4.cmml" xref="alg1.3.3.3.3.m3.1.1.4">𝑙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="alg1.3.3.3.3.m3.1c">val</annotation><annotation encoding="application/x-llamapun" id="alg1.3.3.3.3.m3.1d">italic_v italic_a italic_l</annotation></semantics></math> and <math alttext="test" class="ltx_Math" display="inline" id="alg1.4.4.4.4.m4.1"><semantics id="alg1.4.4.4.4.m4.1a"><mrow id="alg1.4.4.4.4.m4.1.1" xref="alg1.4.4.4.4.m4.1.1.cmml"><mi 
id="alg1.4.4.4.4.m4.1.1.2" xref="alg1.4.4.4.4.m4.1.1.2.cmml">t</mi><mo id="alg1.4.4.4.4.m4.1.1.1" xref="alg1.4.4.4.4.m4.1.1.1.cmml">⁢</mo><mi id="alg1.4.4.4.4.m4.1.1.3" xref="alg1.4.4.4.4.m4.1.1.3.cmml">e</mi><mo id="alg1.4.4.4.4.m4.1.1.1a" xref="alg1.4.4.4.4.m4.1.1.1.cmml">⁢</mo><mi id="alg1.4.4.4.4.m4.1.1.4" xref="alg1.4.4.4.4.m4.1.1.4.cmml">s</mi><mo id="alg1.4.4.4.4.m4.1.1.1b" xref="alg1.4.4.4.4.m4.1.1.1.cmml">⁢</mo><mi id="alg1.4.4.4.4.m4.1.1.5" xref="alg1.4.4.4.4.m4.1.1.5.cmml">t</mi></mrow><annotation-xml encoding="MathML-Content" id="alg1.4.4.4.4.m4.1b"><apply id="alg1.4.4.4.4.m4.1.1.cmml" xref="alg1.4.4.4.4.m4.1.1"><times id="alg1.4.4.4.4.m4.1.1.1.cmml" xref="alg1.4.4.4.4.m4.1.1.1"></times><ci id="alg1.4.4.4.4.m4.1.1.2.cmml" xref="alg1.4.4.4.4.m4.1.1.2">𝑡</ci><ci id="alg1.4.4.4.4.m4.1.1.3.cmml" xref="alg1.4.4.4.4.m4.1.1.3">𝑒</ci><ci id="alg1.4.4.4.4.m4.1.1.4.cmml" xref="alg1.4.4.4.4.m4.1.1.4">𝑠</ci><ci id="alg1.4.4.4.4.m4.1.1.5.cmml" xref="alg1.4.4.4.4.m4.1.1.5">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="alg1.4.4.4.4.m4.1c">test</annotation><annotation encoding="application/x-llamapun" id="alg1.4.4.4.4.m4.1d">italic_t italic_e italic_s italic_t</annotation></semantics></math>; </div> <div class="ltx_listingline" id="alg1.6.6.6.6">Total training steps <math alttext="T" class="ltx_Math" display="inline" id="alg1.5.5.5.5.m1.1"><semantics id="alg1.5.5.5.5.m1.1a"><mi id="alg1.5.5.5.5.m1.1.1" xref="alg1.5.5.5.5.m1.1.1.cmml">T</mi><annotation-xml encoding="MathML-Content" id="alg1.5.5.5.5.m1.1b"><ci id="alg1.5.5.5.5.m1.1.1.cmml" xref="alg1.5.5.5.5.m1.1.1">𝑇</ci></annotation-xml><annotation encoding="application/x-tex" id="alg1.5.5.5.5.m1.1c">T</annotation><annotation encoding="application/x-llamapun" id="alg1.5.5.5.5.m1.1d">italic_T</annotation></semantics></math>, validation interval <math alttext="s" class="ltx_Math" display="inline" id="alg1.6.6.6.6.m2.1"><semantics id="alg1.6.6.6.6.m2.1a"><mi id="alg1.6.6.6.6.m2.1.1" 
xref="alg1.6.6.6.6.m2.1.1.cmml">s</mi><annotation-xml encoding="MathML-Content" id="alg1.6.6.6.6.m2.1b"><ci id="alg1.6.6.6.6.m2.1.1.cmml" xref="alg1.6.6.6.6.m2.1.1">𝑠</ci></annotation-xml><annotation encoding="application/x-tex" id="alg1.6.6.6.6.m2.1c">s</annotation><annotation encoding="application/x-llamapun" id="alg1.6.6.6.6.m2.1d">italic_s</annotation></semantics></math> </div> <div class="ltx_listingline" id="alg1.l0a"> <span class="ltx_tag ltx_tag_listingline"><span class="ltx_text" id="alg1.l0a.1.1.1" style="font-size:80%;">0:</span></span>  Final Trained Model </div> <div class="ltx_listingline" id="alg1.l1"> <span class="ltx_tag ltx_tag_listingline"><span class="ltx_text" id="alg1.l1.1.1.1" style="font-size:80%;">1:</span></span>  <span class="ltx_text ltx_font_bold" id="alg1.l1.2">for</span> <math alttext="t=0" class="ltx_Math" display="inline" id="alg1.l1.m1.1"><semantics id="alg1.l1.m1.1a"><mrow id="alg1.l1.m1.1.1" xref="alg1.l1.m1.1.1.cmml"><mi id="alg1.l1.m1.1.1.2" xref="alg1.l1.m1.1.1.2.cmml">t</mi><mo id="alg1.l1.m1.1.1.1" xref="alg1.l1.m1.1.1.1.cmml">=</mo><mn id="alg1.l1.m1.1.1.3" xref="alg1.l1.m1.1.1.3.cmml">0</mn></mrow><annotation-xml encoding="MathML-Content" id="alg1.l1.m1.1b"><apply id="alg1.l1.m1.1.1.cmml" xref="alg1.l1.m1.1.1"><eq id="alg1.l1.m1.1.1.1.cmml" xref="alg1.l1.m1.1.1.1"></eq><ci id="alg1.l1.m1.1.1.2.cmml" xref="alg1.l1.m1.1.1.2">𝑡</ci><cn id="alg1.l1.m1.1.1.3.cmml" type="integer" xref="alg1.l1.m1.1.1.3">0</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="alg1.l1.m1.1c">t=0</annotation><annotation encoding="application/x-llamapun" id="alg1.l1.m1.1d">italic_t = 0</annotation></semantics></math> to <math alttext="\lfloor T/s\rfloor-1" class="ltx_Math" display="inline" id="alg1.l1.m2.1"><semantics id="alg1.l1.m2.1a"><mrow id="alg1.l1.m2.1.1" xref="alg1.l1.m2.1.1.cmml"><mrow id="alg1.l1.m2.1.1.1.1" xref="alg1.l1.m2.1.1.1.2.cmml"><mo id="alg1.l1.m2.1.1.1.1.2" stretchy="false" 
xref="alg1.l1.m2.1.1.1.2.1.cmml">⌊</mo><mrow id="alg1.l1.m2.1.1.1.1.1" xref="alg1.l1.m2.1.1.1.1.1.cmml"><mi id="alg1.l1.m2.1.1.1.1.1.2" xref="alg1.l1.m2.1.1.1.1.1.2.cmml">T</mi><mo id="alg1.l1.m2.1.1.1.1.1.1" xref="alg1.l1.m2.1.1.1.1.1.1.cmml">/</mo><mi id="alg1.l1.m2.1.1.1.1.1.3" xref="alg1.l1.m2.1.1.1.1.1.3.cmml">s</mi></mrow><mo id="alg1.l1.m2.1.1.1.1.3" stretchy="false" xref="alg1.l1.m2.1.1.1.2.1.cmml">⌋</mo></mrow><mo id="alg1.l1.m2.1.1.2" xref="alg1.l1.m2.1.1.2.cmml">−</mo><mn id="alg1.l1.m2.1.1.3" xref="alg1.l1.m2.1.1.3.cmml">1</mn></mrow><annotation-xml encoding="MathML-Content" id="alg1.l1.m2.1b"><apply id="alg1.l1.m2.1.1.cmml" xref="alg1.l1.m2.1.1"><minus id="alg1.l1.m2.1.1.2.cmml" xref="alg1.l1.m2.1.1.2"></minus><apply id="alg1.l1.m2.1.1.1.2.cmml" xref="alg1.l1.m2.1.1.1.1"><floor id="alg1.l1.m2.1.1.1.2.1.cmml" xref="alg1.l1.m2.1.1.1.1.2"></floor><apply id="alg1.l1.m2.1.1.1.1.1.cmml" xref="alg1.l1.m2.1.1.1.1.1"><divide id="alg1.l1.m2.1.1.1.1.1.1.cmml" xref="alg1.l1.m2.1.1.1.1.1.1"></divide><ci id="alg1.l1.m2.1.1.1.1.1.2.cmml" xref="alg1.l1.m2.1.1.1.1.1.2">𝑇</ci><ci id="alg1.l1.m2.1.1.1.1.1.3.cmml" xref="alg1.l1.m2.1.1.1.1.1.3">𝑠</ci></apply></apply><cn id="alg1.l1.m2.1.1.3.cmml" type="integer" xref="alg1.l1.m2.1.1.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="alg1.l1.m2.1c">\lfloor T/s\rfloor-1</annotation><annotation encoding="application/x-llamapun" id="alg1.l1.m2.1d">⌊ italic_T / italic_s ⌋ - 1</annotation></semantics></math> <span class="ltx_text ltx_font_bold" id="alg1.l1.3">do</span> </div> <div class="ltx_listingline" id="alg1.l2"> <span class="ltx_tag ltx_tag_listingline"><span class="ltx_text" id="alg1.l2.1.1.1" style="font-size:80%;">2:</span></span>     <math alttext="M^{\prime}\leftarrow" class="ltx_Math" display="inline" id="alg1.l2.m1.1"><semantics id="alg1.l2.m1.1a"><mrow id="alg1.l2.m1.1.1" xref="alg1.l2.m1.1.1.cmml"><msup id="alg1.l2.m1.1.1.2" xref="alg1.l2.m1.1.1.2.cmml"><mi id="alg1.l2.m1.1.1.2.2" 
xref="alg1.l2.m1.1.1.2.2.cmml">M</mi><mo id="alg1.l2.m1.1.1.2.3" xref="alg1.l2.m1.1.1.2.3.cmml">′</mo></msup><mo id="alg1.l2.m1.1.1.1" stretchy="false" xref="alg1.l2.m1.1.1.1.cmml">←</mo><mi id="alg1.l2.m1.1.1.3" xref="alg1.l2.m1.1.1.3.cmml"></mi></mrow><annotation-xml encoding="MathML-Content" id="alg1.l2.m1.1b"><apply id="alg1.l2.m1.1.1.cmml" xref="alg1.l2.m1.1.1"><ci id="alg1.l2.m1.1.1.1.cmml" xref="alg1.l2.m1.1.1.1">←</ci><apply id="alg1.l2.m1.1.1.2.cmml" xref="alg1.l2.m1.1.1.2"><csymbol cd="ambiguous" id="alg1.l2.m1.1.1.2.1.cmml" xref="alg1.l2.m1.1.1.2">superscript</csymbol><ci id="alg1.l2.m1.1.1.2.2.cmml" xref="alg1.l2.m1.1.1.2.2">𝑀</ci><ci id="alg1.l2.m1.1.1.2.3.cmml" xref="alg1.l2.m1.1.1.2.3">′</ci></apply><csymbol cd="latexml" id="alg1.l2.m1.1.1.3.cmml" xref="alg1.l2.m1.1.1.3">absent</csymbol></apply></annotation-xml><annotation encoding="application/x-tex" id="alg1.l2.m1.1c">M^{\prime}\leftarrow</annotation><annotation encoding="application/x-llamapun" id="alg1.l2.m1.1d">italic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ←</annotation></semantics></math> Train each model in <math alttext="M_{t}" class="ltx_Math" display="inline" id="alg1.l2.m2.1"><semantics id="alg1.l2.m2.1a"><msub id="alg1.l2.m2.1.1" xref="alg1.l2.m2.1.1.cmml"><mi id="alg1.l2.m2.1.1.2" xref="alg1.l2.m2.1.1.2.cmml">M</mi><mi id="alg1.l2.m2.1.1.3" xref="alg1.l2.m2.1.1.3.cmml">t</mi></msub><annotation-xml encoding="MathML-Content" id="alg1.l2.m2.1b"><apply id="alg1.l2.m2.1.1.cmml" xref="alg1.l2.m2.1.1"><csymbol cd="ambiguous" id="alg1.l2.m2.1.1.1.cmml" xref="alg1.l2.m2.1.1">subscript</csymbol><ci id="alg1.l2.m2.1.1.2.cmml" xref="alg1.l2.m2.1.1.2">𝑀</ci><ci id="alg1.l2.m2.1.1.3.cmml" xref="alg1.l2.m2.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="alg1.l2.m2.1c">M_{t}</annotation><annotation encoding="application/x-llamapun" id="alg1.l2.m2.1d">italic_M start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math> </div> <div 
class="ltx_listingline" id="alg1.l3"> <span class="ltx_tag ltx_tag_listingline"><span class="ltx_text" id="alg1.l3.1.1.1" style="font-size:80%;">3:</span></span>     <span class="ltx_text ltx_font_bold" id="alg1.l3.2">if</span> <math alttext="|M^{\prime}|\neq 1" class="ltx_Math" display="inline" id="alg1.l3.m1.1"><semantics id="alg1.l3.m1.1a"><mrow id="alg1.l3.m1.1.1" xref="alg1.l3.m1.1.1.cmml"><mrow id="alg1.l3.m1.1.1.1.1" xref="alg1.l3.m1.1.1.1.2.cmml"><mo id="alg1.l3.m1.1.1.1.1.2" stretchy="false" xref="alg1.l3.m1.1.1.1.2.1.cmml">|</mo><msup id="alg1.l3.m1.1.1.1.1.1" xref="alg1.l3.m1.1.1.1.1.1.cmml"><mi id="alg1.l3.m1.1.1.1.1.1.2" xref="alg1.l3.m1.1.1.1.1.1.2.cmml">M</mi><mo id="alg1.l3.m1.1.1.1.1.1.3" xref="alg1.l3.m1.1.1.1.1.1.3.cmml">′</mo></msup><mo id="alg1.l3.m1.1.1.1.1.3" stretchy="false" xref="alg1.l3.m1.1.1.1.2.1.cmml">|</mo></mrow><mo id="alg1.l3.m1.1.1.2" xref="alg1.l3.m1.1.1.2.cmml">≠</mo><mn id="alg1.l3.m1.1.1.3" xref="alg1.l3.m1.1.1.3.cmml">1</mn></mrow><annotation-xml encoding="MathML-Content" id="alg1.l3.m1.1b"><apply id="alg1.l3.m1.1.1.cmml" xref="alg1.l3.m1.1.1"><neq id="alg1.l3.m1.1.1.2.cmml" xref="alg1.l3.m1.1.1.2"></neq><apply id="alg1.l3.m1.1.1.1.2.cmml" xref="alg1.l3.m1.1.1.1.1"><abs id="alg1.l3.m1.1.1.1.2.1.cmml" xref="alg1.l3.m1.1.1.1.1.2"></abs><apply id="alg1.l3.m1.1.1.1.1.1.cmml" xref="alg1.l3.m1.1.1.1.1.1"><csymbol cd="ambiguous" id="alg1.l3.m1.1.1.1.1.1.1.cmml" xref="alg1.l3.m1.1.1.1.1.1">superscript</csymbol><ci id="alg1.l3.m1.1.1.1.1.1.2.cmml" xref="alg1.l3.m1.1.1.1.1.1.2">𝑀</ci><ci id="alg1.l3.m1.1.1.1.1.1.3.cmml" xref="alg1.l3.m1.1.1.1.1.1.3">′</ci></apply></apply><cn id="alg1.l3.m1.1.1.3.cmml" type="integer" xref="alg1.l3.m1.1.1.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="alg1.l3.m1.1c">|M^{\prime}|\neq 1</annotation><annotation encoding="application/x-llamapun" id="alg1.l3.m1.1d">| italic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | ≠ 1</annotation></semantics></math> <span class="ltx_text 
ltx_font_bold" id="alg1.l3.3">then</span> </div> <div class="ltx_listingline" id="alg1.l4"> <span class="ltx_tag ltx_tag_listingline"><span class="ltx_text" id="alg1.l4.1.1.1" style="font-size:80%;">4:</span></span>        <math alttext="v_{j}\leftarrow" class="ltx_Math" display="inline" id="alg1.l4.m1.1"><semantics id="alg1.l4.m1.1a"><mrow id="alg1.l4.m1.1.1" xref="alg1.l4.m1.1.1.cmml"><msub id="alg1.l4.m1.1.1.2" xref="alg1.l4.m1.1.1.2.cmml"><mi id="alg1.l4.m1.1.1.2.2" xref="alg1.l4.m1.1.1.2.2.cmml">v</mi><mi id="alg1.l4.m1.1.1.2.3" xref="alg1.l4.m1.1.1.2.3.cmml">j</mi></msub><mo id="alg1.l4.m1.1.1.1" stretchy="false" xref="alg1.l4.m1.1.1.1.cmml">←</mo><mi id="alg1.l4.m1.1.1.3" xref="alg1.l4.m1.1.1.3.cmml"></mi></mrow><annotation-xml encoding="MathML-Content" id="alg1.l4.m1.1b"><apply id="alg1.l4.m1.1.1.cmml" xref="alg1.l4.m1.1.1"><ci id="alg1.l4.m1.1.1.1.cmml" xref="alg1.l4.m1.1.1.1">←</ci><apply id="alg1.l4.m1.1.1.2.cmml" xref="alg1.l4.m1.1.1.2"><csymbol cd="ambiguous" id="alg1.l4.m1.1.1.2.1.cmml" xref="alg1.l4.m1.1.1.2">subscript</csymbol><ci id="alg1.l4.m1.1.1.2.2.cmml" xref="alg1.l4.m1.1.1.2.2">𝑣</ci><ci id="alg1.l4.m1.1.1.2.3.cmml" xref="alg1.l4.m1.1.1.2.3">𝑗</ci></apply><csymbol cd="latexml" id="alg1.l4.m1.1.1.3.cmml" xref="alg1.l4.m1.1.1.3">absent</csymbol></apply></annotation-xml><annotation encoding="application/x-tex" id="alg1.l4.m1.1c">v_{j}\leftarrow</annotation><annotation encoding="application/x-llamapun" id="alg1.l4.m1.1d">italic_v start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ←</annotation></semantics></math> Validate each model </div> <div class="ltx_listingline" id="alg1.l5"> <span class="ltx_tag ltx_tag_listingline"><span class="ltx_text" id="alg1.l5.1.1.1" style="font-size:80%;">5:</span></span>        <math alttext="x_{j}\leftarrow" class="ltx_Math" display="inline" id="alg1.l5.m1.1"><semantics id="alg1.l5.m1.1a"><mrow id="alg1.l5.m1.1.1" xref="alg1.l5.m1.1.1.cmml"><msub id="alg1.l5.m1.1.1.2" xref="alg1.l5.m1.1.1.2.cmml"><mi 
id="alg1.l5.m1.1.1.2.2" xref="alg1.l5.m1.1.1.2.2.cmml">x</mi><mi id="alg1.l5.m1.1.1.2.3" xref="alg1.l5.m1.1.1.2.3.cmml">j</mi></msub><mo id="alg1.l5.m1.1.1.1" stretchy="false" xref="alg1.l5.m1.1.1.1.cmml">←</mo><mi id="alg1.l5.m1.1.1.3" xref="alg1.l5.m1.1.1.3.cmml"></mi></mrow><annotation-xml encoding="MathML-Content" id="alg1.l5.m1.1b"><apply id="alg1.l5.m1.1.1.cmml" xref="alg1.l5.m1.1.1"><ci id="alg1.l5.m1.1.1.1.cmml" xref="alg1.l5.m1.1.1.1">←</ci><apply id="alg1.l5.m1.1.1.2.cmml" xref="alg1.l5.m1.1.1.2"><csymbol cd="ambiguous" id="alg1.l5.m1.1.1.2.1.cmml" xref="alg1.l5.m1.1.1.2">subscript</csymbol><ci id="alg1.l5.m1.1.1.2.2.cmml" xref="alg1.l5.m1.1.1.2.2">𝑥</ci><ci id="alg1.l5.m1.1.1.2.3.cmml" xref="alg1.l5.m1.1.1.2.3">𝑗</ci></apply><csymbol cd="latexml" id="alg1.l5.m1.1.1.3.cmml" xref="alg1.l5.m1.1.1.3">absent</csymbol></apply></annotation-xml><annotation encoding="application/x-tex" id="alg1.l5.m1.1c">x_{j}\leftarrow</annotation><annotation encoding="application/x-llamapun" id="alg1.l5.m1.1d">italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ←</annotation></semantics></math> Match convergence trends by Eq. 
<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S4.E5" title="In IV-C Fine-Selection Algorithm ‣ IV Fine-Selection ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">5</span></a> </div> <div class="ltx_listingline" id="alg1.l6"> <span class="ltx_tag ltx_tag_listingline"><span class="ltx_text" id="alg1.l6.1.1.1" style="font-size:80%;">6:</span></span>        <math alttext="pred_{j}\leftarrow" class="ltx_Math" display="inline" id="alg1.l6.m1.1"><semantics id="alg1.l6.m1.1a"><mrow id="alg1.l6.m1.1.1" xref="alg1.l6.m1.1.1.cmml"><mrow id="alg1.l6.m1.1.1.2" xref="alg1.l6.m1.1.1.2.cmml"><mi id="alg1.l6.m1.1.1.2.2" xref="alg1.l6.m1.1.1.2.2.cmml">p</mi><mo id="alg1.l6.m1.1.1.2.1" xref="alg1.l6.m1.1.1.2.1.cmml">⁢</mo><mi id="alg1.l6.m1.1.1.2.3" xref="alg1.l6.m1.1.1.2.3.cmml">r</mi><mo id="alg1.l6.m1.1.1.2.1a" xref="alg1.l6.m1.1.1.2.1.cmml">⁢</mo><mi id="alg1.l6.m1.1.1.2.4" xref="alg1.l6.m1.1.1.2.4.cmml">e</mi><mo id="alg1.l6.m1.1.1.2.1b" xref="alg1.l6.m1.1.1.2.1.cmml">⁢</mo><msub id="alg1.l6.m1.1.1.2.5" xref="alg1.l6.m1.1.1.2.5.cmml"><mi id="alg1.l6.m1.1.1.2.5.2" xref="alg1.l6.m1.1.1.2.5.2.cmml">d</mi><mi id="alg1.l6.m1.1.1.2.5.3" xref="alg1.l6.m1.1.1.2.5.3.cmml">j</mi></msub></mrow><mo id="alg1.l6.m1.1.1.1" stretchy="false" xref="alg1.l6.m1.1.1.1.cmml">←</mo><mi id="alg1.l6.m1.1.1.3" xref="alg1.l6.m1.1.1.3.cmml"></mi></mrow><annotation-xml encoding="MathML-Content" id="alg1.l6.m1.1b"><apply id="alg1.l6.m1.1.1.cmml" xref="alg1.l6.m1.1.1"><ci id="alg1.l6.m1.1.1.1.cmml" xref="alg1.l6.m1.1.1.1">←</ci><apply id="alg1.l6.m1.1.1.2.cmml" xref="alg1.l6.m1.1.1.2"><times id="alg1.l6.m1.1.1.2.1.cmml" xref="alg1.l6.m1.1.1.2.1"></times><ci id="alg1.l6.m1.1.1.2.2.cmml" xref="alg1.l6.m1.1.1.2.2">𝑝</ci><ci id="alg1.l6.m1.1.1.2.3.cmml" xref="alg1.l6.m1.1.1.2.3">𝑟</ci><ci id="alg1.l6.m1.1.1.2.4.cmml" xref="alg1.l6.m1.1.1.2.4">𝑒</ci><apply id="alg1.l6.m1.1.1.2.5.cmml" xref="alg1.l6.m1.1.1.2.5"><csymbol cd="ambiguous" 
id="alg1.l6.m1.1.1.2.5.1.cmml" xref="alg1.l6.m1.1.1.2.5">subscript</csymbol><ci id="alg1.l6.m1.1.1.2.5.2.cmml" xref="alg1.l6.m1.1.1.2.5.2">𝑑</ci><ci id="alg1.l6.m1.1.1.2.5.3.cmml" xref="alg1.l6.m1.1.1.2.5.3">𝑗</ci></apply></apply><csymbol cd="latexml" id="alg1.l6.m1.1.1.3.cmml" xref="alg1.l6.m1.1.1.3">absent</csymbol></apply></annotation-xml><annotation encoding="application/x-tex" id="alg1.l6.m1.1c">pred_{j}\leftarrow</annotation><annotation encoding="application/x-llamapun" id="alg1.l6.m1.1d">italic_p italic_r italic_e italic_d start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ←</annotation></semantics></math> Predict final performance by Eq. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S4.E6" title="In IV-C Fine-Selection Algorithm ‣ IV Fine-Selection ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">6</span></a> </div> <div class="ltx_listingline" id="alg1.l7"> <span class="ltx_tag ltx_tag_listingline"><span class="ltx_text" id="alg1.l7.1.1.1" style="font-size:80%;">7:</span></span>        <math alttext="M^{\prime}\leftarrow" class="ltx_Math" display="inline" id="alg1.l7.m1.1"><semantics id="alg1.l7.m1.1a"><mrow id="alg1.l7.m1.1.1" xref="alg1.l7.m1.1.1.cmml"><msup id="alg1.l7.m1.1.1.2" xref="alg1.l7.m1.1.1.2.cmml"><mi id="alg1.l7.m1.1.1.2.2" xref="alg1.l7.m1.1.1.2.2.cmml">M</mi><mo id="alg1.l7.m1.1.1.2.3" xref="alg1.l7.m1.1.1.2.3.cmml">′</mo></msup><mo id="alg1.l7.m1.1.1.1" stretchy="false" xref="alg1.l7.m1.1.1.1.cmml">←</mo><mi id="alg1.l7.m1.1.1.3" xref="alg1.l7.m1.1.1.3.cmml"></mi></mrow><annotation-xml encoding="MathML-Content" id="alg1.l7.m1.1b"><apply id="alg1.l7.m1.1.1.cmml" xref="alg1.l7.m1.1.1"><ci id="alg1.l7.m1.1.1.1.cmml" xref="alg1.l7.m1.1.1.1">←</ci><apply id="alg1.l7.m1.1.1.2.cmml" xref="alg1.l7.m1.1.1.2"><csymbol cd="ambiguous" id="alg1.l7.m1.1.1.2.1.cmml" xref="alg1.l7.m1.1.1.2">superscript</csymbol><ci id="alg1.l7.m1.1.1.2.2.cmml" xref="alg1.l7.m1.1.1.2.2">𝑀</ci><ci 
id="alg1.l7.m1.1.1.2.3.cmml" xref="alg1.l7.m1.1.1.2.3">′</ci></apply><csymbol cd="latexml" id="alg1.l7.m1.1.1.3.cmml" xref="alg1.l7.m1.1.1.3">absent</csymbol></apply></annotation-xml><annotation encoding="application/x-tex" id="alg1.l7.m1.1c">M^{\prime}\leftarrow</annotation><annotation encoding="application/x-llamapun" id="alg1.l7.m1.1d">italic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ←</annotation></semantics></math> Remove models with lower <math alttext="v" class="ltx_Math" display="inline" id="alg1.l7.m2.1"><semantics id="alg1.l7.m2.1a"><mi id="alg1.l7.m2.1.1" xref="alg1.l7.m2.1.1.cmml">v</mi><annotation-xml encoding="MathML-Content" id="alg1.l7.m2.1b"><ci id="alg1.l7.m2.1.1.cmml" xref="alg1.l7.m2.1.1">𝑣</ci></annotation-xml><annotation encoding="application/x-tex" id="alg1.l7.m2.1c">v</annotation><annotation encoding="application/x-llamapun" id="alg1.l7.m2.1d">italic_v</annotation></semantics></math> and <math alttext="pred" class="ltx_Math" display="inline" id="alg1.l7.m3.1"><semantics id="alg1.l7.m3.1a"><mrow id="alg1.l7.m3.1.1" xref="alg1.l7.m3.1.1.cmml"><mi id="alg1.l7.m3.1.1.2" xref="alg1.l7.m3.1.1.2.cmml">p</mi><mo id="alg1.l7.m3.1.1.1" xref="alg1.l7.m3.1.1.1.cmml">⁢</mo><mi id="alg1.l7.m3.1.1.3" xref="alg1.l7.m3.1.1.3.cmml">r</mi><mo id="alg1.l7.m3.1.1.1a" xref="alg1.l7.m3.1.1.1.cmml">⁢</mo><mi id="alg1.l7.m3.1.1.4" xref="alg1.l7.m3.1.1.4.cmml">e</mi><mo id="alg1.l7.m3.1.1.1b" xref="alg1.l7.m3.1.1.1.cmml">⁢</mo><mi id="alg1.l7.m3.1.1.5" xref="alg1.l7.m3.1.1.5.cmml">d</mi></mrow><annotation-xml encoding="MathML-Content" id="alg1.l7.m3.1b"><apply id="alg1.l7.m3.1.1.cmml" xref="alg1.l7.m3.1.1"><times id="alg1.l7.m3.1.1.1.cmml" xref="alg1.l7.m3.1.1.1"></times><ci id="alg1.l7.m3.1.1.2.cmml" xref="alg1.l7.m3.1.1.2">𝑝</ci><ci id="alg1.l7.m3.1.1.3.cmml" xref="alg1.l7.m3.1.1.3">𝑟</ci><ci id="alg1.l7.m3.1.1.4.cmml" xref="alg1.l7.m3.1.1.4">𝑒</ci><ci id="alg1.l7.m3.1.1.5.cmml" xref="alg1.l7.m3.1.1.5">𝑑</ci></apply></annotation-xml><annotation 
encoding="application/x-tex" id="alg1.l7.m3.1c">pred</annotation><annotation encoding="application/x-llamapun" id="alg1.l7.m3.1d">italic_p italic_r italic_e italic_d</annotation></semantics></math>, </div> <div class="ltx_listingline" id="alg1.7.7.7.7"> <math alttext="pred" class="ltx_Math" display="inline" id="alg1.7.7.7.7.m1.1"><semantics id="alg1.7.7.7.7.m1.1a"><mrow id="alg1.7.7.7.7.m1.1.1" xref="alg1.7.7.7.7.m1.1.1.cmml"><mi id="alg1.7.7.7.7.m1.1.1.2" xref="alg1.7.7.7.7.m1.1.1.2.cmml">p</mi><mo id="alg1.7.7.7.7.m1.1.1.1" xref="alg1.7.7.7.7.m1.1.1.1.cmml">⁢</mo><mi id="alg1.7.7.7.7.m1.1.1.3" xref="alg1.7.7.7.7.m1.1.1.3.cmml">r</mi><mo id="alg1.7.7.7.7.m1.1.1.1a" xref="alg1.7.7.7.7.m1.1.1.1.cmml">⁢</mo><mi id="alg1.7.7.7.7.m1.1.1.4" xref="alg1.7.7.7.7.m1.1.1.4.cmml">e</mi><mo id="alg1.7.7.7.7.m1.1.1.1b" xref="alg1.7.7.7.7.m1.1.1.1.cmml">⁢</mo><mi id="alg1.7.7.7.7.m1.1.1.5" xref="alg1.7.7.7.7.m1.1.1.5.cmml">d</mi></mrow><annotation-xml encoding="MathML-Content" id="alg1.7.7.7.7.m1.1b"><apply id="alg1.7.7.7.7.m1.1.1.cmml" xref="alg1.7.7.7.7.m1.1.1"><times id="alg1.7.7.7.7.m1.1.1.1.cmml" xref="alg1.7.7.7.7.m1.1.1.1"></times><ci id="alg1.7.7.7.7.m1.1.1.2.cmml" xref="alg1.7.7.7.7.m1.1.1.2">𝑝</ci><ci id="alg1.7.7.7.7.m1.1.1.3.cmml" xref="alg1.7.7.7.7.m1.1.1.3">𝑟</ci><ci id="alg1.7.7.7.7.m1.1.1.4.cmml" xref="alg1.7.7.7.7.m1.1.1.4">𝑒</ci><ci id="alg1.7.7.7.7.m1.1.1.5.cmml" xref="alg1.7.7.7.7.m1.1.1.5">𝑑</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="alg1.7.7.7.7.m1.1c">pred</annotation><annotation encoding="application/x-llamapun" id="alg1.7.7.7.7.m1.1d">italic_p italic_r italic_e italic_d</annotation></semantics></math>’s difference should be larger than the threshold </div> <div class="ltx_listingline" id="alg1.l8"> <span class="ltx_tag ltx_tag_listingline"><span class="ltx_text" id="alg1.l8.1.1.1" style="font-size:80%;">8:</span></span>        <span class="ltx_text ltx_font_bold" id="alg1.l8.2">while</span> <math 
alttext="|M^{\prime}|&gt;\lfloor|M_{t}|/2\rfloor" class="ltx_Math" display="inline" id="alg1.l8.m1.2"><semantics id="alg1.l8.m1.2a"><mrow id="alg1.l8.m1.2.2" xref="alg1.l8.m1.2.2.cmml"><mrow id="alg1.l8.m1.1.1.1.1" xref="alg1.l8.m1.1.1.1.2.cmml"><mo id="alg1.l8.m1.1.1.1.1.2" stretchy="false" xref="alg1.l8.m1.1.1.1.2.1.cmml">|</mo><msup id="alg1.l8.m1.1.1.1.1.1" xref="alg1.l8.m1.1.1.1.1.1.cmml"><mi id="alg1.l8.m1.1.1.1.1.1.2" xref="alg1.l8.m1.1.1.1.1.1.2.cmml">M</mi><mo id="alg1.l8.m1.1.1.1.1.1.3" xref="alg1.l8.m1.1.1.1.1.1.3.cmml">′</mo></msup><mo id="alg1.l8.m1.1.1.1.1.3" stretchy="false" xref="alg1.l8.m1.1.1.1.2.1.cmml">|</mo></mrow><mo id="alg1.l8.m1.2.2.3" xref="alg1.l8.m1.2.2.3.cmml">&gt;</mo><mrow id="alg1.l8.m1.2.2.2.1" xref="alg1.l8.m1.2.2.2.2.cmml"><mo id="alg1.l8.m1.2.2.2.1.2" stretchy="false" xref="alg1.l8.m1.2.2.2.2.1.cmml">⌊</mo><mrow id="alg1.l8.m1.2.2.2.1.1" xref="alg1.l8.m1.2.2.2.1.1.cmml"><mrow id="alg1.l8.m1.2.2.2.1.1.1.1" xref="alg1.l8.m1.2.2.2.1.1.1.2.cmml"><mo id="alg1.l8.m1.2.2.2.1.1.1.1.2" stretchy="false" xref="alg1.l8.m1.2.2.2.1.1.1.2.1.cmml">|</mo><msub id="alg1.l8.m1.2.2.2.1.1.1.1.1" xref="alg1.l8.m1.2.2.2.1.1.1.1.1.cmml"><mi id="alg1.l8.m1.2.2.2.1.1.1.1.1.2" xref="alg1.l8.m1.2.2.2.1.1.1.1.1.2.cmml">M</mi><mi id="alg1.l8.m1.2.2.2.1.1.1.1.1.3" xref="alg1.l8.m1.2.2.2.1.1.1.1.1.3.cmml">t</mi></msub><mo id="alg1.l8.m1.2.2.2.1.1.1.1.3" stretchy="false" xref="alg1.l8.m1.2.2.2.1.1.1.2.1.cmml">|</mo></mrow><mo id="alg1.l8.m1.2.2.2.1.1.2" xref="alg1.l8.m1.2.2.2.1.1.2.cmml">/</mo><mn id="alg1.l8.m1.2.2.2.1.1.3" xref="alg1.l8.m1.2.2.2.1.1.3.cmml">2</mn></mrow><mo id="alg1.l8.m1.2.2.2.1.3" stretchy="false" xref="alg1.l8.m1.2.2.2.2.1.cmml">⌋</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="alg1.l8.m1.2b"><apply id="alg1.l8.m1.2.2.cmml" xref="alg1.l8.m1.2.2"><gt id="alg1.l8.m1.2.2.3.cmml" xref="alg1.l8.m1.2.2.3"></gt><apply id="alg1.l8.m1.1.1.1.2.cmml" xref="alg1.l8.m1.1.1.1.1"><abs id="alg1.l8.m1.1.1.1.2.1.cmml" 
xref="alg1.l8.m1.1.1.1.1.2"></abs><apply id="alg1.l8.m1.1.1.1.1.1.cmml" xref="alg1.l8.m1.1.1.1.1.1"><csymbol cd="ambiguous" id="alg1.l8.m1.1.1.1.1.1.1.cmml" xref="alg1.l8.m1.1.1.1.1.1">superscript</csymbol><ci id="alg1.l8.m1.1.1.1.1.1.2.cmml" xref="alg1.l8.m1.1.1.1.1.1.2">𝑀</ci><ci id="alg1.l8.m1.1.1.1.1.1.3.cmml" xref="alg1.l8.m1.1.1.1.1.1.3">′</ci></apply></apply><apply id="alg1.l8.m1.2.2.2.2.cmml" xref="alg1.l8.m1.2.2.2.1"><floor id="alg1.l8.m1.2.2.2.2.1.cmml" xref="alg1.l8.m1.2.2.2.1.2"></floor><apply id="alg1.l8.m1.2.2.2.1.1.cmml" xref="alg1.l8.m1.2.2.2.1.1"><divide id="alg1.l8.m1.2.2.2.1.1.2.cmml" xref="alg1.l8.m1.2.2.2.1.1.2"></divide><apply id="alg1.l8.m1.2.2.2.1.1.1.2.cmml" xref="alg1.l8.m1.2.2.2.1.1.1.1"><abs id="alg1.l8.m1.2.2.2.1.1.1.2.1.cmml" xref="alg1.l8.m1.2.2.2.1.1.1.1.2"></abs><apply id="alg1.l8.m1.2.2.2.1.1.1.1.1.cmml" xref="alg1.l8.m1.2.2.2.1.1.1.1.1"><csymbol cd="ambiguous" id="alg1.l8.m1.2.2.2.1.1.1.1.1.1.cmml" xref="alg1.l8.m1.2.2.2.1.1.1.1.1">subscript</csymbol><ci id="alg1.l8.m1.2.2.2.1.1.1.1.1.2.cmml" xref="alg1.l8.m1.2.2.2.1.1.1.1.1.2">𝑀</ci><ci id="alg1.l8.m1.2.2.2.1.1.1.1.1.3.cmml" xref="alg1.l8.m1.2.2.2.1.1.1.1.1.3">𝑡</ci></apply></apply><cn id="alg1.l8.m1.2.2.2.1.1.3.cmml" type="integer" xref="alg1.l8.m1.2.2.2.1.1.3">2</cn></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="alg1.l8.m1.2c">|M^{\prime}|&gt;\lfloor|M_{t}|/2\rfloor</annotation><annotation encoding="application/x-llamapun" id="alg1.l8.m1.2d">| italic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | &gt; ⌊ | italic_M start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT | / 2 ⌋</annotation></semantics></math> <span class="ltx_text ltx_font_bold" id="alg1.l8.3">do</span> </div> <div class="ltx_listingline" id="alg1.l9"> <span class="ltx_tag ltx_tag_listingline"><span class="ltx_text" id="alg1.l9.1.1.1" style="font-size:80%;">9:</span></span>           <math alttext="M^{\prime}\leftarrow" class="ltx_Math" display="inline" id="alg1.l9.m1.1"><semantics 
id="alg1.l9.m1.1a"><mrow id="alg1.l9.m1.1.1" xref="alg1.l9.m1.1.1.cmml"><msup id="alg1.l9.m1.1.1.2" xref="alg1.l9.m1.1.1.2.cmml"><mi id="alg1.l9.m1.1.1.2.2" xref="alg1.l9.m1.1.1.2.2.cmml">M</mi><mo id="alg1.l9.m1.1.1.2.3" xref="alg1.l9.m1.1.1.2.3.cmml">′</mo></msup><mo id="alg1.l9.m1.1.1.1" stretchy="false" xref="alg1.l9.m1.1.1.1.cmml">←</mo><mi id="alg1.l9.m1.1.1.3" xref="alg1.l9.m1.1.1.3.cmml"></mi></mrow><annotation-xml encoding="MathML-Content" id="alg1.l9.m1.1b"><apply id="alg1.l9.m1.1.1.cmml" xref="alg1.l9.m1.1.1"><ci id="alg1.l9.m1.1.1.1.cmml" xref="alg1.l9.m1.1.1.1">←</ci><apply id="alg1.l9.m1.1.1.2.cmml" xref="alg1.l9.m1.1.1.2"><csymbol cd="ambiguous" id="alg1.l9.m1.1.1.2.1.cmml" xref="alg1.l9.m1.1.1.2">superscript</csymbol><ci id="alg1.l9.m1.1.1.2.2.cmml" xref="alg1.l9.m1.1.1.2.2">𝑀</ci><ci id="alg1.l9.m1.1.1.2.3.cmml" xref="alg1.l9.m1.1.1.2.3">′</ci></apply><csymbol cd="latexml" id="alg1.l9.m1.1.1.3.cmml" xref="alg1.l9.m1.1.1.3">absent</csymbol></apply></annotation-xml><annotation encoding="application/x-tex" id="alg1.l9.m1.1c">M^{\prime}\leftarrow</annotation><annotation encoding="application/x-llamapun" id="alg1.l9.m1.1d">italic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ←</annotation></semantics></math> Remove models in <math alttext="M^{\prime}" class="ltx_Math" display="inline" id="alg1.l9.m2.1"><semantics id="alg1.l9.m2.1a"><msup id="alg1.l9.m2.1.1" xref="alg1.l9.m2.1.1.cmml"><mi id="alg1.l9.m2.1.1.2" xref="alg1.l9.m2.1.1.2.cmml">M</mi><mo id="alg1.l9.m2.1.1.3" xref="alg1.l9.m2.1.1.3.cmml">′</mo></msup><annotation-xml encoding="MathML-Content" id="alg1.l9.m2.1b"><apply id="alg1.l9.m2.1.1.cmml" xref="alg1.l9.m2.1.1"><csymbol cd="ambiguous" id="alg1.l9.m2.1.1.1.cmml" xref="alg1.l9.m2.1.1">superscript</csymbol><ci id="alg1.l9.m2.1.1.2.cmml" xref="alg1.l9.m2.1.1.2">𝑀</ci><ci id="alg1.l9.m2.1.1.3.cmml" xref="alg1.l9.m2.1.1.3">′</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="alg1.l9.m2.1c">M^{\prime}</annotation><annotation 
encoding="application/x-llamapun" id="alg1.l9.m2.1d">italic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT</annotation></semantics></math> with lowest <math alttext="v" class="ltx_Math" display="inline" id="alg1.l9.m3.1"><semantics id="alg1.l9.m3.1a"><mi id="alg1.l9.m3.1.1" xref="alg1.l9.m3.1.1.cmml">v</mi><annotation-xml encoding="MathML-Content" id="alg1.l9.m3.1b"><ci id="alg1.l9.m3.1.1.cmml" xref="alg1.l9.m3.1.1">𝑣</ci></annotation-xml><annotation encoding="application/x-tex" id="alg1.l9.m3.1c">v</annotation><annotation encoding="application/x-llamapun" id="alg1.l9.m3.1d">italic_v</annotation></semantics></math> </div> <div class="ltx_listingline" id="alg1.l10"> <span class="ltx_tag ltx_tag_listingline"><span class="ltx_text" id="alg1.l10.1.1.1" style="font-size:80%;">10:</span></span>        <span class="ltx_text ltx_font_bold" id="alg1.l10.2">end</span> <span class="ltx_text ltx_font_bold" id="alg1.l10.3">while</span> </div> <div class="ltx_listingline" id="alg1.l11"> <span class="ltx_tag ltx_tag_listingline"><span class="ltx_text" id="alg1.l11.1.1.1" style="font-size:80%;">11:</span></span>     <span class="ltx_text ltx_font_bold" id="alg1.l11.2">end</span> <span class="ltx_text ltx_font_bold" id="alg1.l11.3">if</span> </div> <div class="ltx_listingline" id="alg1.l12"> <span class="ltx_tag ltx_tag_listingline"><span class="ltx_text" id="alg1.l12.1.1.1" style="font-size:80%;">12:</span></span>     <math alttext="M_{t+1}\leftarrow M^{\prime}" class="ltx_Math" display="inline" id="alg1.l12.m1.1"><semantics id="alg1.l12.m1.1a"><mrow id="alg1.l12.m1.1.1" xref="alg1.l12.m1.1.1.cmml"><msub id="alg1.l12.m1.1.1.2" xref="alg1.l12.m1.1.1.2.cmml"><mi id="alg1.l12.m1.1.1.2.2" xref="alg1.l12.m1.1.1.2.2.cmml">M</mi><mrow id="alg1.l12.m1.1.1.2.3" xref="alg1.l12.m1.1.1.2.3.cmml"><mi id="alg1.l12.m1.1.1.2.3.2" xref="alg1.l12.m1.1.1.2.3.2.cmml">t</mi><mo id="alg1.l12.m1.1.1.2.3.1" xref="alg1.l12.m1.1.1.2.3.1.cmml">+</mo><mn id="alg1.l12.m1.1.1.2.3.3" 
xref="alg1.l12.m1.1.1.2.3.3.cmml">1</mn></mrow></msub><mo id="alg1.l12.m1.1.1.1" stretchy="false" xref="alg1.l12.m1.1.1.1.cmml">←</mo><msup id="alg1.l12.m1.1.1.3" xref="alg1.l12.m1.1.1.3.cmml"><mi id="alg1.l12.m1.1.1.3.2" xref="alg1.l12.m1.1.1.3.2.cmml">M</mi><mo id="alg1.l12.m1.1.1.3.3" xref="alg1.l12.m1.1.1.3.3.cmml">′</mo></msup></mrow><annotation-xml encoding="MathML-Content" id="alg1.l12.m1.1b"><apply id="alg1.l12.m1.1.1.cmml" xref="alg1.l12.m1.1.1"><ci id="alg1.l12.m1.1.1.1.cmml" xref="alg1.l12.m1.1.1.1">←</ci><apply id="alg1.l12.m1.1.1.2.cmml" xref="alg1.l12.m1.1.1.2"><csymbol cd="ambiguous" id="alg1.l12.m1.1.1.2.1.cmml" xref="alg1.l12.m1.1.1.2">subscript</csymbol><ci id="alg1.l12.m1.1.1.2.2.cmml" xref="alg1.l12.m1.1.1.2.2">𝑀</ci><apply id="alg1.l12.m1.1.1.2.3.cmml" xref="alg1.l12.m1.1.1.2.3"><plus id="alg1.l12.m1.1.1.2.3.1.cmml" xref="alg1.l12.m1.1.1.2.3.1"></plus><ci id="alg1.l12.m1.1.1.2.3.2.cmml" xref="alg1.l12.m1.1.1.2.3.2">𝑡</ci><cn id="alg1.l12.m1.1.1.2.3.3.cmml" type="integer" xref="alg1.l12.m1.1.1.2.3.3">1</cn></apply></apply><apply id="alg1.l12.m1.1.1.3.cmml" xref="alg1.l12.m1.1.1.3"><csymbol cd="ambiguous" id="alg1.l12.m1.1.1.3.1.cmml" xref="alg1.l12.m1.1.1.3">superscript</csymbol><ci id="alg1.l12.m1.1.1.3.2.cmml" xref="alg1.l12.m1.1.1.3.2">𝑀</ci><ci id="alg1.l12.m1.1.1.3.3.cmml" xref="alg1.l12.m1.1.1.3.3">′</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="alg1.l12.m1.1c">M_{t+1}\leftarrow M^{\prime}</annotation><annotation encoding="application/x-llamapun" id="alg1.l12.m1.1d">italic_M start_POSTSUBSCRIPT italic_t + 1 end_POSTSUBSCRIPT ← italic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT</annotation></semantics></math> </div> <div class="ltx_listingline" id="alg1.7.7.7.8"> </div> <div class="ltx_listingline" id="alg1.l13"> <span class="ltx_tag ltx_tag_listingline"><span class="ltx_text" id="alg1.l13.1.1.1" style="font-size:80%;">13:</span></span>  <span class="ltx_text ltx_font_bold" id="alg1.l13.2">end</span> 
<span class="ltx_text ltx_font_bold" id="alg1.l13.3">for</span> </div> <div class="ltx_listingline" id="alg1.l14"> <span class="ltx_tag ltx_tag_listingline"><span class="ltx_text" id="alg1.l14.1.1.1" style="font-size:80%;">14:</span></span>  <span class="ltx_text ltx_font_bold" id="alg1.l14.2">return</span>  <math alttext="M_{T/s}[0]" class="ltx_Math" display="inline" id="alg1.l14.m1.1"><semantics id="alg1.l14.m1.1a"><mrow id="alg1.l14.m1.1.2" xref="alg1.l14.m1.1.2.cmml"><msub id="alg1.l14.m1.1.2.2" xref="alg1.l14.m1.1.2.2.cmml"><mi id="alg1.l14.m1.1.2.2.2" xref="alg1.l14.m1.1.2.2.2.cmml">M</mi><mrow id="alg1.l14.m1.1.2.2.3" xref="alg1.l14.m1.1.2.2.3.cmml"><mi id="alg1.l14.m1.1.2.2.3.2" xref="alg1.l14.m1.1.2.2.3.2.cmml">T</mi><mo id="alg1.l14.m1.1.2.2.3.1" xref="alg1.l14.m1.1.2.2.3.1.cmml">/</mo><mi id="alg1.l14.m1.1.2.2.3.3" xref="alg1.l14.m1.1.2.2.3.3.cmml">s</mi></mrow></msub><mo id="alg1.l14.m1.1.2.1" xref="alg1.l14.m1.1.2.1.cmml">⁢</mo><mrow id="alg1.l14.m1.1.2.3.2" xref="alg1.l14.m1.1.2.3.1.cmml"><mo id="alg1.l14.m1.1.2.3.2.1" stretchy="false" xref="alg1.l14.m1.1.2.3.1.1.cmml">[</mo><mn id="alg1.l14.m1.1.1" xref="alg1.l14.m1.1.1.cmml">0</mn><mo id="alg1.l14.m1.1.2.3.2.2" stretchy="false" xref="alg1.l14.m1.1.2.3.1.1.cmml">]</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="alg1.l14.m1.1b"><apply id="alg1.l14.m1.1.2.cmml" xref="alg1.l14.m1.1.2"><times id="alg1.l14.m1.1.2.1.cmml" xref="alg1.l14.m1.1.2.1"></times><apply id="alg1.l14.m1.1.2.2.cmml" xref="alg1.l14.m1.1.2.2"><csymbol cd="ambiguous" id="alg1.l14.m1.1.2.2.1.cmml" xref="alg1.l14.m1.1.2.2">subscript</csymbol><ci id="alg1.l14.m1.1.2.2.2.cmml" xref="alg1.l14.m1.1.2.2.2">𝑀</ci><apply id="alg1.l14.m1.1.2.2.3.cmml" xref="alg1.l14.m1.1.2.2.3"><divide id="alg1.l14.m1.1.2.2.3.1.cmml" xref="alg1.l14.m1.1.2.2.3.1"></divide><ci id="alg1.l14.m1.1.2.2.3.2.cmml" xref="alg1.l14.m1.1.2.2.3.2">𝑇</ci><ci id="alg1.l14.m1.1.2.2.3.3.cmml" xref="alg1.l14.m1.1.2.2.3.3">𝑠</ci></apply></apply><apply 
id="alg1.l14.m1.1.2.3.1.cmml" xref="alg1.l14.m1.1.2.3.2"><csymbol cd="latexml" id="alg1.l14.m1.1.2.3.1.1.cmml" xref="alg1.l14.m1.1.2.3.2.1">delimited-[]</csymbol><cn id="alg1.l14.m1.1.1.cmml" type="integer" xref="alg1.l14.m1.1.1">0</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="alg1.l14.m1.1c">M_{T/s}[0]</annotation><annotation encoding="application/x-llamapun" id="alg1.l14.m1.1d">italic_M start_POSTSUBSCRIPT italic_T / italic_s end_POSTSUBSCRIPT [ 0 ]</annotation></semantics></math> </div> </div> </div> </div> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_float"><span class="ltx_text ltx_font_bold" id="alg1.9.1.1">Algorithm 1</span> </span>Fine-Selection</figcaption> </figure> </section> </section> <section class="ltx_section" id="S5"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">V </span><span class="ltx_text ltx_font_smallcaps" id="S5.1.1">Experiments</span> </h2> <div class="ltx_para" id="S5.p1"> <p class="ltx_p" id="S5.p1.1">In this section, we report model selection results on natural language processing and computer vision tasks, and demonstrate the effectiveness and efficiency of the proposed methods.</p> </div> <section class="ltx_subsection" id="S5.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S5.SS1.4.1.1">V-A</span> </span><span class="ltx_text ltx_font_italic" id="S5.SS1.5.2">Experiment Setup</span> </h3> <div class="ltx_para" id="S5.SS1.p1"> <p class="ltx_p" id="S5.SS1.p1.1"><span class="ltx_text ltx_font_bold" id="S5.SS1.p1.1.1">Pre-trained Models.</span> The models are collected from HuggingFace <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib1" title="">1</a>]</cite>. 
For natural language tasks, we select 40 models as the model repository, which contains both state-of-the-art pre-trained models (such as BERT<cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib9" title="">9</a>]</cite> and RoBERTa <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib16" title="">16</a>]</cite>) and models fine-tuned on downstream tasks. For computer vision tasks, we select 30 models as the repository. The model architectures include ViT <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib17" title="">17</a>]</cite>, BEiT<cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib18" title="">18</a>]</cite>, DeiT <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib19" title="">19</a>]</cite>, PoolFormer <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib20" title="">20</a>]</cite>, DiNAT <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib21" title="">21</a>]</cite>, and Visual Attention Network <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib22" title="">22</a>]</cite>. As with the NLP models, these comprise state-of-the-art models and their fine-tuned versions. A complete list of models is given in Appendix 
B <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib15" title="">15</a>]</cite>.</p> </div> <div class="ltx_para" id="S5.SS1.p2"> <p class="ltx_p" id="S5.SS1.p2.1"><span class="ltx_text ltx_font_bold" id="S5.SS1.p2.1.1">Datasets.</span> The experiments are conducted on NLP and CV datasets, which are divided into benchmark datasets and target datasets. The benchmark datasets are used to construct the performance matrix for model clustering. The target datasets are used to evaluate the proposed two-phase model selection method. All datasets are classification tasks, and the benchmark and target sets are disjoint: they differ in sub-tasks, number of labels, and label distributions. It is worth noting that our datasets include both common datasets such as GLUE<cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib5" title="">5</a>]</cite> and cifar10 <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib23" title="">23</a>]</cite> and domain-specific datasets from areas such as finance and medical science. These datasets are available on HuggingFace and have been split into training and testing sets by their contributors. 
The datasets used for performance matrix construction and method evaluation are given below:</p> </div> <div class="ltx_para" id="S5.SS1.p3"> <ul class="ltx_itemize" id="S5.I1"> <li class="ltx_item" id="S5.I1.i1" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S5.I1.i1.p1"> <p class="ltx_p" id="S5.I1.i1.p1.1">Benchmark Datasets: The benchmark datasets for NLP are mainly from GLUE<cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib5" title="">5</a>]</cite> and SuperGLUE <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib24" title="">24</a>]</cite>. From GLUE we use COLA, MRPC, QNLI, QQP, RTE, SST2, STSB and WNLI; from SuperGLUE we use CB, COPA and WIC. We also use several domain-specific tasks popular on HuggingFace: imdb <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib25" title="">25</a>]</cite>, yelp_review_full <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib26" title="">26</a>]</cite>, yahoo_answer_topic, dbpedia_14 <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib26" title="">26</a>]</cite>, xnli <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib27" title="">27</a>]</cite>, anli <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib28" title="">28</a>]</cite>, app_reviews <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib29" title="">29</a>]</cite>, trec <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib30" title="">30</a>]</cite> <cite class="ltx_cite 
ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib31" title="">31</a>]</cite>, sick <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib32" title="">32</a>]</cite> and financial_phrasebank <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib33" title="">33</a>]</cite>. For the CV benchmark, we use image classification datasets from different domains: food101 <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib34" title="">34</a>]</cite>, CC6204-Hackaton-Cub-Dataset <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib6" title="">6</a>]</cite>, cats_vs_dogs <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib35" title="">35</a>]</cite>, cifar10 <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib23" title="">23</a>]</cite> and MNIST <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib36" title="">36</a>]</cite>.</p> </div> </li> <li class="ltx_item" id="S5.I1.i2" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S5.I1.i2.p1"> <p class="ltx_p" id="S5.I1.i2.p1.1">Target Datasets: For NLP evaluation, four tasks are selected: tweet_eval <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib37" title="">37</a>]</cite> collected from Twitter, MNLI <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib38" title="">38</a>]</cite> from the GLUE benchmark, MultiRC<cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" 
href="https://arxiv.org/html/2404.00069v1#bib.bib39" title="">39</a>]</cite>, and Boolq <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib40" title="">40</a>]</cite> from the SuperGLUE benchmark. For CV evaluation, four image classification datasets from different domains are selected: chest-xray-classification <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib41" title="">41</a>]</cite>, MedMNIST <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib42" title="">42</a>]</cite>, oxford-flowers <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib43" title="">43</a>]</cite>, and beans <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib44" title="">44</a>]</cite>.</p> </div> </li> </ul> </div> <div class="ltx_para" id="S5.SS1.p4"> <p class="ltx_p" id="S5.SS1.p4.1">Further descriptions of the datasets are given in Appendix 
C <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib15" title="">15</a>]</cite>.</p> </div> <div class="ltx_para" id="S5.SS1.p5"> <p class="ltx_p" id="S5.SS1.p5.2"><span class="ltx_text ltx_font_bold" id="S5.SS1.p5.2.1">Performance Matrix.</span> We build a performance matrix by fine-tuning all the pre-trained models on the benchmark datasets, which requires <math alttext="40\times 24" class="ltx_Math" display="inline" id="S5.SS1.p5.1.m1.1"><semantics id="S5.SS1.p5.1.m1.1a"><mrow id="S5.SS1.p5.1.m1.1.1" xref="S5.SS1.p5.1.m1.1.1.cmml"><mn id="S5.SS1.p5.1.m1.1.1.2" xref="S5.SS1.p5.1.m1.1.1.2.cmml">40</mn><mo id="S5.SS1.p5.1.m1.1.1.1" lspace="0.222em" rspace="0.222em" xref="S5.SS1.p5.1.m1.1.1.1.cmml">×</mo><mn id="S5.SS1.p5.1.m1.1.1.3" xref="S5.SS1.p5.1.m1.1.1.3.cmml">24</mn></mrow><annotation-xml encoding="MathML-Content" id="S5.SS1.p5.1.m1.1b"><apply id="S5.SS1.p5.1.m1.1.1.cmml" xref="S5.SS1.p5.1.m1.1.1"><times id="S5.SS1.p5.1.m1.1.1.1.cmml" xref="S5.SS1.p5.1.m1.1.1.1"></times><cn id="S5.SS1.p5.1.m1.1.1.2.cmml" type="integer" xref="S5.SS1.p5.1.m1.1.1.2">40</cn><cn id="S5.SS1.p5.1.m1.1.1.3.cmml" type="integer" xref="S5.SS1.p5.1.m1.1.1.3">24</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS1.p5.1.m1.1c">40\times 24</annotation><annotation encoding="application/x-llamapun" id="S5.SS1.p5.1.m1.1d">40 × 24</annotation></semantics></math> training runs for natural language processing and <math alttext="30\times 10" class="ltx_Math" display="inline" id="S5.SS1.p5.2.m2.1"><semantics id="S5.SS1.p5.2.m2.1a"><mrow id="S5.SS1.p5.2.m2.1.1" xref="S5.SS1.p5.2.m2.1.1.cmml"><mn id="S5.SS1.p5.2.m2.1.1.2" xref="S5.SS1.p5.2.m2.1.1.2.cmml">30</mn><mo id="S5.SS1.p5.2.m2.1.1.1" lspace="0.222em" rspace="0.222em" xref="S5.SS1.p5.2.m2.1.1.1.cmml">×</mo><mn id="S5.SS1.p5.2.m2.1.1.3" xref="S5.SS1.p5.2.m2.1.1.3.cmml">10</mn></mrow><annotation-xml encoding="MathML-Content" id="S5.SS1.p5.2.m2.1b"><apply 
id="S5.SS1.p5.2.m2.1.1.cmml" xref="S5.SS1.p5.2.m2.1.1"><times id="S5.SS1.p5.2.m2.1.1.1.cmml" xref="S5.SS1.p5.2.m2.1.1.1"></times><cn id="S5.SS1.p5.2.m2.1.1.2.cmml" type="integer" xref="S5.SS1.p5.2.m2.1.1.2">30</cn><cn id="S5.SS1.p5.2.m2.1.1.3.cmml" type="integer" xref="S5.SS1.p5.2.m2.1.1.3">10</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS1.p5.2.m2.1c">30\times 10</annotation><annotation encoding="application/x-llamapun" id="S5.SS1.p5.2.m2.1d">30 × 10</annotation></semantics></math> training runs for computer vision. Each run lasts 5 epochs for natural language processing and 4 epochs for computer vision, which is enough to compare the relative accuracy of different runs and to support convergence trend mining for the fine-selection phase.</p> </div> <figure class="ltx_table" id="S5.T1"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">TABLE I: </span>Clustering Methods Comparison</figcaption> <table class="ltx_tabular ltx_centering ltx_align_middle" id="S5.T1.1"> <tr class="ltx_tr" id="S5.T1.1.1"> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T1.1.1.1" rowspan="2"><span class="ltx_text" id="S5.T1.1.1.1.1">Model Similarity</span></td> <td class="ltx_td ltx_align_center ltx_border_t" colspan="2" id="S5.T1.1.1.2">Hierarchical clustering</td> <td class="ltx_td ltx_align_center ltx_border_t" colspan="2" id="S5.T1.1.1.3">K-means</td> </tr> <tr class="ltx_tr" id="S5.T1.1.2"> <td class="ltx_td ltx_align_center" id="S5.T1.1.2.1">NLP</td> <td class="ltx_td ltx_align_center" id="S5.T1.1.2.2">CV</td> <td class="ltx_td ltx_align_center" id="S5.T1.1.2.3">NLP</td> <td class="ltx_td ltx_align_center" id="S5.T1.1.2.4">CV</td> </tr> <tr class="ltx_tr" id="S5.T1.1.3"> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T1.1.3.1">performance-based</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T1.1.3.2"><span class="ltx_text ltx_font_bold" 
id="S5.T1.1.3.2.1">0.505</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T1.1.3.3"><span class="ltx_text ltx_font_bold" id="S5.T1.1.3.3.1">0.806</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T1.1.3.4">0.466</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T1.1.3.5">0.702</td> </tr> <tr class="ltx_tr" id="S5.T1.1.4"> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T1.1.4.1">text-based</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T1.1.4.2">0.476</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T1.1.4.3">0.696</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T1.1.4.4">0.453</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T1.1.4.5">0.732</td> </tr> </table> </figure> </section> <section class="ltx_subsection" id="S5.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S5.SS2.4.1.1">V-B</span> </span><span class="ltx_text ltx_font_italic" id="S5.SS2.5.2">Experiment for Coarse-Recall Phase</span> </h3> <div class="ltx_para" id="S5.SS2.p1"> <p class="ltx_p" id="S5.SS2.p1.1">For the coarse-recall phase, we first study the results of model clustering and then the results of model recall.</p> </div> <div class="ltx_para" id="S5.SS2.p2"> <p class="ltx_p" id="S5.SS2.p2.1"><span class="ltx_text ltx_font_bold" id="S5.SS2.p2.1.1">Model Clustering.</span> We evaluate model clustering by comparing different model similarity measures and clustering algorithms, using the silhouette coefficient <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib45" title="">45</a>]</cite> as the quality metric. For the comparison of similarity measures, the performance-based similarity is calculated by Eq. 
<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S3.E1" title="In III-A Model Clustering ‣ III Coarse Recall ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">1</span></a> and we conduct an experiment in Appendix D <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib15" title="">15</a>]</cite> to determine the parameter <math alttext="k" class="ltx_Math" display="inline" id="S5.SS2.p2.1.m1.1"><semantics id="S5.SS2.p2.1.m1.1a"><mi id="S5.SS2.p2.1.m1.1.1" xref="S5.SS2.p2.1.m1.1.1.cmml">k</mi><annotation-xml encoding="MathML-Content" id="S5.SS2.p2.1.m1.1b"><ci id="S5.SS2.p2.1.m1.1.1.cmml" xref="S5.SS2.p2.1.m1.1.1">𝑘</ci></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p2.1.m1.1c">k</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p2.1.m1.1d">italic_k</annotation></semantics></math>. The text-based similarity is calculated from the text of the corresponding model card (Appendix E <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib15" title="">15</a>]</cite> shows an example of a model card). We adopt SBERT <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib46" title="">46</a>]</cite> to encode the text into a vector so that cosine similarity can be computed. For the clustering algorithms, we compare two widely used algorithms, K-means and hierarchical clustering.</p> </div> <div class="ltx_para" id="S5.SS2.p3"> <p class="ltx_p" id="S5.SS2.p3.1">Table <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.T1" title="TABLE I ‣ V-A Experiment Setup ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">I</span></a> shows the comparison results. 
Performance-based similarity achieves a higher silhouette coefficient than text-based similarity in three of the four settings (the exception is K-means on computer vision models, 0.702 versus 0.732), suggesting that it generally yields a better clustering structure. With performance-based similarity, hierarchical clustering outperforms K-means by a clear margin (0.505 versus 0.466 on natural language processing models and 0.806 versus 0.702 on computer vision models), showing that its clusters are better separated from one another and more cohesive internally. Moreover, the clusters generated by hierarchical clustering are more reasonable than those generated by K-means, as discussed below. We therefore base the following experiments on the hierarchical clustering results.</p> </div> <figure class="ltx_table" id="S5.T2"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">TABLE II: </span>Model Clustering Results</figcaption> <table class="ltx_tabular ltx_centering ltx_align_middle" id="S5.T2.14.14"> <tr class="ltx_tr" id="S5.T2.14.14.15"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" colspan="3" id="S5.T2.14.14.15.1"><span class="ltx_text ltx_font_bold" id="S5.T2.14.14.15.1.1">Model Clusters of Natural Language Processing</span></td> </tr> <tr class="ltx_tr" id="S5.T2.14.14.16"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S5.T2.14.14.16.1"><span class="ltx_text ltx_font_bold" id="S5.T2.14.14.16.1.1">Cluster</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.14.14.16.2"><span class="ltx_text ltx_font_bold" id="S5.T2.14.14.16.2.1">Size</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.14.14.16.3"><span class="ltx_text ltx_font_bold" id="S5.T2.14.14.16.3.1">Pre-trained Models</span></td> </tr> <tr class="ltx_tr" id="S5.T2.1.1.1"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S5.T2.1.1.1.1"><math alttext="C_{1}" 
class="ltx_Math" display="inline" id="S5.T2.1.1.1.1.m1.1"><semantics id="S5.T2.1.1.1.1.m1.1a"><msub id="S5.T2.1.1.1.1.m1.1.1" xref="S5.T2.1.1.1.1.m1.1.1.cmml"><mi id="S5.T2.1.1.1.1.m1.1.1.2" xref="S5.T2.1.1.1.1.m1.1.1.2.cmml">C</mi><mn id="S5.T2.1.1.1.1.m1.1.1.3" xref="S5.T2.1.1.1.1.m1.1.1.3.cmml">1</mn></msub><annotation-xml encoding="MathML-Content" id="S5.T2.1.1.1.1.m1.1b"><apply id="S5.T2.1.1.1.1.m1.1.1.cmml" xref="S5.T2.1.1.1.1.m1.1.1"><csymbol cd="ambiguous" id="S5.T2.1.1.1.1.m1.1.1.1.cmml" xref="S5.T2.1.1.1.1.m1.1.1">subscript</csymbol><ci id="S5.T2.1.1.1.1.m1.1.1.2.cmml" xref="S5.T2.1.1.1.1.m1.1.1.2">𝐶</ci><cn id="S5.T2.1.1.1.1.m1.1.1.3.cmml" type="integer" xref="S5.T2.1.1.1.1.m1.1.1.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.T2.1.1.1.1.m1.1c">C_{1}</annotation><annotation encoding="application/x-llamapun" id="S5.T2.1.1.1.1.m1.1d">italic_C start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.1.1.1.2">5</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.1.1.1.3"> <span class="ltx_text" id="S5.T2.1.1.1.3.1"></span> <span class="ltx_text" id="S5.T2.1.1.1.3.2"> <span class="ltx_tabular ltx_align_middle" id="S5.T2.1.1.1.3.2.1"> <span class="ltx_tr" id="S5.T2.1.1.1.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.1.1.1.3.2.1.1.1">Jeevesh8/bert_ft_qqp-68, Jeevesh8/bert_ft_qqp-9, Jeevesh8/bert_ft_qqp-40,</span></span> <span class="ltx_tr" id="S5.T2.1.1.1.3.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.1.1.1.3.2.1.2.1">connectivity/bert_ft_qqp-1, connectivity/bert_ft_qqp-7</span></span> </span></span><span class="ltx_text" id="S5.T2.1.1.1.3.3"></span></td> </tr> <tr class="ltx_tr" id="S5.T2.2.2.2"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S5.T2.2.2.2.1"><math alttext="C_{2}" class="ltx_Math" display="inline" 
id="S5.T2.2.2.2.1.m1.1"><semantics id="S5.T2.2.2.2.1.m1.1a"><msub id="S5.T2.2.2.2.1.m1.1.1" xref="S5.T2.2.2.2.1.m1.1.1.cmml"><mi id="S5.T2.2.2.2.1.m1.1.1.2" xref="S5.T2.2.2.2.1.m1.1.1.2.cmml">C</mi><mn id="S5.T2.2.2.2.1.m1.1.1.3" xref="S5.T2.2.2.2.1.m1.1.1.3.cmml">2</mn></msub><annotation-xml encoding="MathML-Content" id="S5.T2.2.2.2.1.m1.1b"><apply id="S5.T2.2.2.2.1.m1.1.1.cmml" xref="S5.T2.2.2.2.1.m1.1.1"><csymbol cd="ambiguous" id="S5.T2.2.2.2.1.m1.1.1.1.cmml" xref="S5.T2.2.2.2.1.m1.1.1">subscript</csymbol><ci id="S5.T2.2.2.2.1.m1.1.1.2.cmml" xref="S5.T2.2.2.2.1.m1.1.1.2">𝐶</ci><cn id="S5.T2.2.2.2.1.m1.1.1.3.cmml" type="integer" xref="S5.T2.2.2.2.1.m1.1.1.3">2</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.T2.2.2.2.1.m1.1c">C_{2}</annotation><annotation encoding="application/x-llamapun" id="S5.T2.2.2.2.1.m1.1d">italic_C start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.2.2.2.2">7</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.2.2.2.3"> <span class="ltx_text" id="S5.T2.2.2.2.3.1"></span> <span class="ltx_text" id="S5.T2.2.2.2.3.2"> <span class="ltx_tabular ltx_align_middle" id="S5.T2.2.2.2.3.2.1"> <span class="ltx_tr" id="S5.T2.2.2.2.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.2.2.2.3.2.1.1.1">Jeevesh8/512seq_len_6ep_bert_ft_cola-91, anirudh21/bert-base-uncased-finetuned-qnli, Jeevesh8/bert_ft_cola-88,</span></span> <span class="ltx_tr" id="S5.T2.2.2.2.3.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.2.2.2.3.2.1.2.1">manueltonneau/bert-twitter-en-is-hired, bert-base-uncased,</span></span> <span class="ltx_tr" id="S5.T2.2.2.2.3.2.1.3"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.2.2.2.3.2.1.3.1">aditeyabaral/finetuned-sail2017-xlm-roberta-base, DoyyingFace/bert-asian-hate-tweets-asian-unclean-freeze-4</span></span> </span></span><span 
class="ltx_text" id="S5.T2.2.2.2.3.3"></span></td> </tr> <tr class="ltx_tr" id="S5.T2.3.3.3"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S5.T2.3.3.3.1"><math alttext="C_{3}" class="ltx_Math" display="inline" id="S5.T2.3.3.3.1.m1.1"><semantics id="S5.T2.3.3.3.1.m1.1a"><msub id="S5.T2.3.3.3.1.m1.1.1" xref="S5.T2.3.3.3.1.m1.1.1.cmml"><mi id="S5.T2.3.3.3.1.m1.1.1.2" xref="S5.T2.3.3.3.1.m1.1.1.2.cmml">C</mi><mn id="S5.T2.3.3.3.1.m1.1.1.3" xref="S5.T2.3.3.3.1.m1.1.1.3.cmml">3</mn></msub><annotation-xml encoding="MathML-Content" id="S5.T2.3.3.3.1.m1.1b"><apply id="S5.T2.3.3.3.1.m1.1.1.cmml" xref="S5.T2.3.3.3.1.m1.1.1"><csymbol cd="ambiguous" id="S5.T2.3.3.3.1.m1.1.1.1.cmml" xref="S5.T2.3.3.3.1.m1.1.1">subscript</csymbol><ci id="S5.T2.3.3.3.1.m1.1.1.2.cmml" xref="S5.T2.3.3.3.1.m1.1.1.2">𝐶</ci><cn id="S5.T2.3.3.3.1.m1.1.1.3.cmml" type="integer" xref="S5.T2.3.3.3.1.m1.1.1.3">3</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.T2.3.3.3.1.m1.1c">C_{3}</annotation><annotation encoding="application/x-llamapun" id="S5.T2.3.3.3.1.m1.1d">italic_C start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.3.3.3.2">5</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.3.3.3.3"> <span class="ltx_text" id="S5.T2.3.3.3.3.1"></span> <span class="ltx_text" id="S5.T2.3.3.3.3.2"> <span class="ltx_tabular ltx_align_middle" id="S5.T2.3.3.3.3.2.1"> <span class="ltx_tr" id="S5.T2.3.3.3.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.3.3.3.3.2.1.1.1">Jeevesh8/feather_berts_46, ishan/bert-base-uncased-mnli</span></span> <span class="ltx_tr" id="S5.T2.3.3.3.3.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.3.3.3.3.2.1.2.1">roberta-base, Alireza1044/albert-base-v2-qnli, albert-base-v2</span></span> </span></span><span class="ltx_text" id="S5.T2.3.3.3.3.3"></span></td> </tr> <tr 
class="ltx_tr" id="S5.T2.4.4.4"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S5.T2.4.4.4.1"><math alttext="C_{4}" class="ltx_Math" display="inline" id="S5.T2.4.4.4.1.m1.1"><semantics id="S5.T2.4.4.4.1.m1.1a"><msub id="S5.T2.4.4.4.1.m1.1.1" xref="S5.T2.4.4.4.1.m1.1.1.cmml"><mi id="S5.T2.4.4.4.1.m1.1.1.2" xref="S5.T2.4.4.4.1.m1.1.1.2.cmml">C</mi><mn id="S5.T2.4.4.4.1.m1.1.1.3" xref="S5.T2.4.4.4.1.m1.1.1.3.cmml">4</mn></msub><annotation-xml encoding="MathML-Content" id="S5.T2.4.4.4.1.m1.1b"><apply id="S5.T2.4.4.4.1.m1.1.1.cmml" xref="S5.T2.4.4.4.1.m1.1.1"><csymbol cd="ambiguous" id="S5.T2.4.4.4.1.m1.1.1.1.cmml" xref="S5.T2.4.4.4.1.m1.1.1">subscript</csymbol><ci id="S5.T2.4.4.4.1.m1.1.1.2.cmml" xref="S5.T2.4.4.4.1.m1.1.1.2">𝐶</ci><cn id="S5.T2.4.4.4.1.m1.1.1.3.cmml" type="integer" xref="S5.T2.4.4.4.1.m1.1.1.3">4</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.T2.4.4.4.1.m1.1c">C_{4}</annotation><annotation encoding="application/x-llamapun" id="S5.T2.4.4.4.1.m1.1d">italic_C start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.4.4.4.2">2</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.4.4.4.3"> <span class="ltx_text" id="S5.T2.4.4.4.3.1"></span> <span class="ltx_text" id="S5.T2.4.4.4.3.2"> <span class="ltx_tabular ltx_align_middle" id="S5.T2.4.4.4.3.2.1"> <span class="ltx_tr" id="S5.T2.4.4.4.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.4.4.4.3.2.1.1.1">CAMeL-Lab/bert-base-arabic-camelbert-mix-did-nadi, aliosm/sha3bor-metre-detector-arabertv2-base</span></span> </span></span><span class="ltx_text" id="S5.T2.4.4.4.3.3"></span></td> </tr> <tr class="ltx_tr" id="S5.T2.5.5.5"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S5.T2.5.5.5.1"><math alttext="C_{5}" class="ltx_Math" display="inline" id="S5.T2.5.5.5.1.m1.1"><semantics 
id="S5.T2.5.5.5.1.m1.1a"><msub id="S5.T2.5.5.5.1.m1.1.1" xref="S5.T2.5.5.5.1.m1.1.1.cmml"><mi id="S5.T2.5.5.5.1.m1.1.1.2" xref="S5.T2.5.5.5.1.m1.1.1.2.cmml">C</mi><mn id="S5.T2.5.5.5.1.m1.1.1.3" xref="S5.T2.5.5.5.1.m1.1.1.3.cmml">5</mn></msub><annotation-xml encoding="MathML-Content" id="S5.T2.5.5.5.1.m1.1b"><apply id="S5.T2.5.5.5.1.m1.1.1.cmml" xref="S5.T2.5.5.5.1.m1.1.1"><csymbol cd="ambiguous" id="S5.T2.5.5.5.1.m1.1.1.1.cmml" xref="S5.T2.5.5.5.1.m1.1.1">subscript</csymbol><ci id="S5.T2.5.5.5.1.m1.1.1.2.cmml" xref="S5.T2.5.5.5.1.m1.1.1.2">𝐶</ci><cn id="S5.T2.5.5.5.1.m1.1.1.3.cmml" type="integer" xref="S5.T2.5.5.5.1.m1.1.1.3">5</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.T2.5.5.5.1.m1.1c">C_{5}</annotation><annotation encoding="application/x-llamapun" id="S5.T2.5.5.5.1.m1.1d">italic_C start_POSTSUBSCRIPT 5 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.5.5.5.2">2</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.5.5.5.3"> <span class="ltx_text" id="S5.T2.5.5.5.3.1"></span> <span class="ltx_text" id="S5.T2.5.5.5.3.2"> <span class="ltx_tabular ltx_align_middle" id="S5.T2.5.5.5.3.2.1"> <span class="ltx_tr" id="S5.T2.5.5.5.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.5.5.5.3.2.1.1.1">Splend1dchan/bert-base-uncased-slue-goldtrascription-e3-lr1e-4, aychang/bert-base-cased-trec-coarse</span></span> </span></span><span class="ltx_text" id="S5.T2.5.5.5.3.3"></span></td> </tr> <tr class="ltx_tr" id="S5.T2.6.6.6"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S5.T2.6.6.6.1"><math alttext="C_{6}" class="ltx_Math" display="inline" id="S5.T2.6.6.6.1.m1.1"><semantics id="S5.T2.6.6.6.1.m1.1a"><msub id="S5.T2.6.6.6.1.m1.1.1" xref="S5.T2.6.6.6.1.m1.1.1.cmml"><mi id="S5.T2.6.6.6.1.m1.1.1.2" xref="S5.T2.6.6.6.1.m1.1.1.2.cmml">C</mi><mn id="S5.T2.6.6.6.1.m1.1.1.3" 
xref="S5.T2.6.6.6.1.m1.1.1.3.cmml">6</mn></msub><annotation-xml encoding="MathML-Content" id="S5.T2.6.6.6.1.m1.1b"><apply id="S5.T2.6.6.6.1.m1.1.1.cmml" xref="S5.T2.6.6.6.1.m1.1.1"><csymbol cd="ambiguous" id="S5.T2.6.6.6.1.m1.1.1.1.cmml" xref="S5.T2.6.6.6.1.m1.1.1">subscript</csymbol><ci id="S5.T2.6.6.6.1.m1.1.1.2.cmml" xref="S5.T2.6.6.6.1.m1.1.1.2">𝐶</ci><cn id="S5.T2.6.6.6.1.m1.1.1.3.cmml" type="integer" xref="S5.T2.6.6.6.1.m1.1.1.3">6</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.T2.6.6.6.1.m1.1c">C_{6}</annotation><annotation encoding="application/x-llamapun" id="S5.T2.6.6.6.1.m1.1d">italic_C start_POSTSUBSCRIPT 6 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.6.6.6.2">3</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.6.6.6.3"> <span class="ltx_text" id="S5.T2.6.6.6.3.1"></span> <span class="ltx_text" id="S5.T2.6.6.6.3.2"> <span class="ltx_tabular ltx_align_middle" id="S5.T2.6.6.6.3.2.1"> <span class="ltx_tr" id="S5.T2.6.6.6.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.6.6.6.3.2.1.1.1">aviator-neural/bert-base-uncased-sst2, distilbert-base-uncased,</span></span> <span class="ltx_tr" id="S5.T2.6.6.6.3.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.6.6.6.3.2.1.2.1">18811449050/bert_finetuning_test</span></span> </span></span><span class="ltx_text" id="S5.T2.6.6.6.3.3"></span></td> </tr> <tr class="ltx_tr" id="S5.T2.7.7.7"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S5.T2.7.7.7.1"><math alttext="C_{7}" class="ltx_Math" display="inline" id="S5.T2.7.7.7.1.m1.1"><semantics id="S5.T2.7.7.7.1.m1.1a"><msub id="S5.T2.7.7.7.1.m1.1.1" xref="S5.T2.7.7.7.1.m1.1.1.cmml"><mi id="S5.T2.7.7.7.1.m1.1.1.2" xref="S5.T2.7.7.7.1.m1.1.1.2.cmml">C</mi><mn id="S5.T2.7.7.7.1.m1.1.1.3" xref="S5.T2.7.7.7.1.m1.1.1.3.cmml">7</mn></msub><annotation-xml 
encoding="MathML-Content" id="S5.T2.7.7.7.1.m1.1b"><apply id="S5.T2.7.7.7.1.m1.1.1.cmml" xref="S5.T2.7.7.7.1.m1.1.1"><csymbol cd="ambiguous" id="S5.T2.7.7.7.1.m1.1.1.1.cmml" xref="S5.T2.7.7.7.1.m1.1.1">subscript</csymbol><ci id="S5.T2.7.7.7.1.m1.1.1.2.cmml" xref="S5.T2.7.7.7.1.m1.1.1.2">𝐶</ci><cn id="S5.T2.7.7.7.1.m1.1.1.3.cmml" type="integer" xref="S5.T2.7.7.7.1.m1.1.1.3">7</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.T2.7.7.7.1.m1.1c">C_{7}</annotation><annotation encoding="application/x-llamapun" id="S5.T2.7.7.7.1.m1.1d">italic_C start_POSTSUBSCRIPT 7 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.7.7.7.2">4</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.7.7.7.3"> <span class="ltx_text" id="S5.T2.7.7.7.3.1"></span> <span class="ltx_text" id="S5.T2.7.7.7.3.2"> <span class="ltx_tabular ltx_align_middle" id="S5.T2.7.7.7.3.2.1"> <span class="ltx_tr" id="S5.T2.7.7.7.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.7.7.7.3.2.1.1.1">Jeevesh8/init_bert_ft_qqp-33, Jeevesh8/init_bert_ft_qqp-24,</span></span> <span class="ltx_tr" id="S5.T2.7.7.7.3.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.7.7.7.3.2.1.2.1">connectivity/bert_ft_qqp-17, connectivity/bert_ft_qqp-96</span></span> </span></span><span class="ltx_text" id="S5.T2.7.7.7.3.3"></span></td> </tr> <tr class="ltx_tr" id="S5.T2.8.8.8"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S5.T2.8.8.8.1"><math alttext="C_{8}" class="ltx_Math" display="inline" id="S5.T2.8.8.8.1.m1.1"><semantics id="S5.T2.8.8.8.1.m1.1a"><msub id="S5.T2.8.8.8.1.m1.1.1" xref="S5.T2.8.8.8.1.m1.1.1.cmml"><mi id="S5.T2.8.8.8.1.m1.1.1.2" xref="S5.T2.8.8.8.1.m1.1.1.2.cmml">C</mi><mn id="S5.T2.8.8.8.1.m1.1.1.3" xref="S5.T2.8.8.8.1.m1.1.1.3.cmml">8</mn></msub><annotation-xml encoding="MathML-Content" id="S5.T2.8.8.8.1.m1.1b"><apply 
id="S5.T2.8.8.8.1.m1.1.1.cmml" xref="S5.T2.8.8.8.1.m1.1.1"><csymbol cd="ambiguous" id="S5.T2.8.8.8.1.m1.1.1.1.cmml" xref="S5.T2.8.8.8.1.m1.1.1">subscript</csymbol><ci id="S5.T2.8.8.8.1.m1.1.1.2.cmml" xref="S5.T2.8.8.8.1.m1.1.1.2">𝐶</ci><cn id="S5.T2.8.8.8.1.m1.1.1.3.cmml" type="integer" xref="S5.T2.8.8.8.1.m1.1.1.3">8</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.T2.8.8.8.1.m1.1c">C_{8}</annotation><annotation encoding="application/x-llamapun" id="S5.T2.8.8.8.1.m1.1d">italic_C start_POSTSUBSCRIPT 8 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.8.8.8.2">2</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.8.8.8.3"> <span class="ltx_text" id="S5.T2.8.8.8.3.1"></span> <span class="ltx_text" id="S5.T2.8.8.8.3.2"> <span class="ltx_tabular ltx_align_middle" id="S5.T2.8.8.8.3.2.1"> <span class="ltx_tr" id="S5.T2.8.8.8.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.8.8.8.3.2.1.1.1">XSY/albert-base-v2-imdb-calssification, emrecan/bert-base-multilingual-cased-snli_tr</span></span> </span></span><span class="ltx_text" id="S5.T2.8.8.8.3.3"></span></td> </tr> <tr class="ltx_tr" id="S5.T2.14.14.17"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" colspan="3" id="S5.T2.14.14.17.1"><span class="ltx_text ltx_font_bold" id="S5.T2.14.14.17.1.1">Model Clusters of Computer Vision</span></td> </tr> <tr class="ltx_tr" id="S5.T2.14.14.18"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S5.T2.14.14.18.1"><span class="ltx_text ltx_font_bold" id="S5.T2.14.14.18.1.1">Cluster</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.14.14.18.2"><span class="ltx_text ltx_font_bold" id="S5.T2.14.14.18.2.1">Size</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.14.14.18.3"><span class="ltx_text ltx_font_bold" 
id="S5.T2.14.14.18.3.1">Pre-trained Models</span></td> </tr> <tr class="ltx_tr" id="S5.T2.9.9.9"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S5.T2.9.9.9.1"><math alttext="C_{1}" class="ltx_Math" display="inline" id="S5.T2.9.9.9.1.m1.1"><semantics id="S5.T2.9.9.9.1.m1.1a"><msub id="S5.T2.9.9.9.1.m1.1.1" xref="S5.T2.9.9.9.1.m1.1.1.cmml"><mi id="S5.T2.9.9.9.1.m1.1.1.2" xref="S5.T2.9.9.9.1.m1.1.1.2.cmml">C</mi><mn id="S5.T2.9.9.9.1.m1.1.1.3" xref="S5.T2.9.9.9.1.m1.1.1.3.cmml">1</mn></msub><annotation-xml encoding="MathML-Content" id="S5.T2.9.9.9.1.m1.1b"><apply id="S5.T2.9.9.9.1.m1.1.1.cmml" xref="S5.T2.9.9.9.1.m1.1.1"><csymbol cd="ambiguous" id="S5.T2.9.9.9.1.m1.1.1.1.cmml" xref="S5.T2.9.9.9.1.m1.1.1">subscript</csymbol><ci id="S5.T2.9.9.9.1.m1.1.1.2.cmml" xref="S5.T2.9.9.9.1.m1.1.1.2">𝐶</ci><cn id="S5.T2.9.9.9.1.m1.1.1.3.cmml" type="integer" xref="S5.T2.9.9.9.1.m1.1.1.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.T2.9.9.9.1.m1.1c">C_{1}</annotation><annotation encoding="application/x-llamapun" id="S5.T2.9.9.9.1.m1.1d">italic_C start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.9.9.9.2">6</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.9.9.9.3"> <span class="ltx_text" id="S5.T2.9.9.9.3.1"></span> <span class="ltx_text" id="S5.T2.9.9.9.3.2"> <span class="ltx_tabular ltx_align_middle" id="S5.T2.9.9.9.3.2.1"> <span class="ltx_tr" id="S5.T2.9.9.9.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.9.9.9.3.2.1.1.1">facebook/deit-base-patch16-224, facebook/deit-base-patch16-384, facebook/dino-vits16,</span></span> <span class="ltx_tr" id="S5.T2.9.9.9.3.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.9.9.9.3.2.1.2.1">facebook/vit-msn-base, facebook/vit-msn-small, Visual-Attention-Network/van-large</span></span> </span></span><span 
class="ltx_text" id="S5.T2.9.9.9.3.3"></span></td> </tr> <tr class="ltx_tr" id="S5.T2.10.10.10"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S5.T2.10.10.10.1"><math alttext="C_{2}" class="ltx_Math" display="inline" id="S5.T2.10.10.10.1.m1.1"><semantics id="S5.T2.10.10.10.1.m1.1a"><msub id="S5.T2.10.10.10.1.m1.1.1" xref="S5.T2.10.10.10.1.m1.1.1.cmml"><mi id="S5.T2.10.10.10.1.m1.1.1.2" xref="S5.T2.10.10.10.1.m1.1.1.2.cmml">C</mi><mn id="S5.T2.10.10.10.1.m1.1.1.3" xref="S5.T2.10.10.10.1.m1.1.1.3.cmml">2</mn></msub><annotation-xml encoding="MathML-Content" id="S5.T2.10.10.10.1.m1.1b"><apply id="S5.T2.10.10.10.1.m1.1.1.cmml" xref="S5.T2.10.10.10.1.m1.1.1"><csymbol cd="ambiguous" id="S5.T2.10.10.10.1.m1.1.1.1.cmml" xref="S5.T2.10.10.10.1.m1.1.1">subscript</csymbol><ci id="S5.T2.10.10.10.1.m1.1.1.2.cmml" xref="S5.T2.10.10.10.1.m1.1.1.2">𝐶</ci><cn id="S5.T2.10.10.10.1.m1.1.1.3.cmml" type="integer" xref="S5.T2.10.10.10.1.m1.1.1.3">2</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.T2.10.10.10.1.m1.1c">C_{2}</annotation><annotation encoding="application/x-llamapun" id="S5.T2.10.10.10.1.m1.1d">italic_C start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.10.10.10.2">2</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.10.10.10.3"> <span class="ltx_text" id="S5.T2.10.10.10.3.1"></span> <span class="ltx_text" id="S5.T2.10.10.10.3.2"> <span class="ltx_tabular ltx_align_middle" id="S5.T2.10.10.10.3.2.1"> <span class="ltx_tr" id="S5.T2.10.10.10.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.10.10.10.3.2.1.1.1">facebook/deit-small-patch16-224, Visual-Attention-Network/van-base</span></span> </span></span><span class="ltx_text" id="S5.T2.10.10.10.3.3"></span></td> </tr> <tr class="ltx_tr" id="S5.T2.11.11.11"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r 
ltx_border_t" id="S5.T2.11.11.11.1"><math alttext="C_{3}" class="ltx_Math" display="inline" id="S5.T2.11.11.11.1.m1.1"><semantics id="S5.T2.11.11.11.1.m1.1a"><msub id="S5.T2.11.11.11.1.m1.1.1" xref="S5.T2.11.11.11.1.m1.1.1.cmml"><mi id="S5.T2.11.11.11.1.m1.1.1.2" xref="S5.T2.11.11.11.1.m1.1.1.2.cmml">C</mi><mn id="S5.T2.11.11.11.1.m1.1.1.3" xref="S5.T2.11.11.11.1.m1.1.1.3.cmml">3</mn></msub><annotation-xml encoding="MathML-Content" id="S5.T2.11.11.11.1.m1.1b"><apply id="S5.T2.11.11.11.1.m1.1.1.cmml" xref="S5.T2.11.11.11.1.m1.1.1"><csymbol cd="ambiguous" id="S5.T2.11.11.11.1.m1.1.1.1.cmml" xref="S5.T2.11.11.11.1.m1.1.1">subscript</csymbol><ci id="S5.T2.11.11.11.1.m1.1.1.2.cmml" xref="S5.T2.11.11.11.1.m1.1.1.2">𝐶</ci><cn id="S5.T2.11.11.11.1.m1.1.1.3.cmml" type="integer" xref="S5.T2.11.11.11.1.m1.1.1.3">3</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.T2.11.11.11.1.m1.1c">C_{3}</annotation><annotation encoding="application/x-llamapun" id="S5.T2.11.11.11.1.m1.1d">italic_C start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.11.11.11.2">11</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.11.11.11.3"> <span class="ltx_text" id="S5.T2.11.11.11.3.1"></span> <span class="ltx_text" id="S5.T2.11.11.11.3.2"> <span class="ltx_tabular ltx_align_middle" id="S5.T2.11.11.11.3.2.1"> <span class="ltx_tr" id="S5.T2.11.11.11.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.11.11.11.3.2.1.1.1">facebook/dino-vitb16, facebook/dino-vitb8, google/vit-base-patch16-224,</span></span> <span class="ltx_tr" id="S5.T2.11.11.11.3.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.11.11.11.3.2.1.2.1">google/vit-base-patch16-384, lixiqi/beit-base-patch16-224-pt22k-ft22k-finetuned-FER2013-6e-05,</span></span> <span class="ltx_tr" id="S5.T2.11.11.11.3.2.1.3"> <span class="ltx_td ltx_nopad_r ltx_align_center" 
id="S5.T2.11.11.11.3.2.1.3.1">lixiqi/beit-base-patch16-224-pt22k-ft22k-finetuned-FER2013-7e-05,</span></span> <span class="ltx_tr" id="S5.T2.11.11.11.3.2.1.4"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.11.11.11.3.2.1.4.1">lixiqi/beit-base-patch16-224-pt22k-ft22k-finetuned-FER-5e-05-3,</span></span> <span class="ltx_tr" id="S5.T2.11.11.11.3.2.1.5"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.11.11.11.3.2.1.5.1">microsoft/beit-base-patch16-224, microsoft/beit-base-patch16-224-pt22k-ft22k, microsoft/beit-base-patch16-384,</span></span> <span class="ltx_tr" id="S5.T2.11.11.11.3.2.1.6"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.11.11.11.3.2.1.6.1">nateraw/vit-age-classifier</span></span> </span></span><span class="ltx_text" id="S5.T2.11.11.11.3.3"></span></td> </tr> <tr class="ltx_tr" id="S5.T2.12.12.12"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S5.T2.12.12.12.1"><math alttext="C_{4}" class="ltx_Math" display="inline" id="S5.T2.12.12.12.1.m1.1"><semantics id="S5.T2.12.12.12.1.m1.1a"><msub id="S5.T2.12.12.12.1.m1.1.1" xref="S5.T2.12.12.12.1.m1.1.1.cmml"><mi id="S5.T2.12.12.12.1.m1.1.1.2" xref="S5.T2.12.12.12.1.m1.1.1.2.cmml">C</mi><mn id="S5.T2.12.12.12.1.m1.1.1.3" xref="S5.T2.12.12.12.1.m1.1.1.3.cmml">4</mn></msub><annotation-xml encoding="MathML-Content" id="S5.T2.12.12.12.1.m1.1b"><apply id="S5.T2.12.12.12.1.m1.1.1.cmml" xref="S5.T2.12.12.12.1.m1.1.1"><csymbol cd="ambiguous" id="S5.T2.12.12.12.1.m1.1.1.1.cmml" xref="S5.T2.12.12.12.1.m1.1.1">subscript</csymbol><ci id="S5.T2.12.12.12.1.m1.1.1.2.cmml" xref="S5.T2.12.12.12.1.m1.1.1.2">𝐶</ci><cn id="S5.T2.12.12.12.1.m1.1.1.3.cmml" type="integer" xref="S5.T2.12.12.12.1.m1.1.1.3">4</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.T2.12.12.12.1.m1.1c">C_{4}</annotation><annotation encoding="application/x-llamapun" id="S5.T2.12.12.12.1.m1.1d">italic_C start_POSTSUBSCRIPT 4 
end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.12.12.12.2">2</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.12.12.12.3"> <span class="ltx_text" id="S5.T2.12.12.12.3.1"></span> <span class="ltx_text" id="S5.T2.12.12.12.3.2"> <span class="ltx_tabular ltx_align_middle" id="S5.T2.12.12.12.3.2.1"> <span class="ltx_tr" id="S5.T2.12.12.12.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.12.12.12.3.2.1.1.1">shi-labs/dinat-large-in22k-in1k-224, shi-labs/dinat-large-in22k-in1k-384</span></span> </span></span><span class="ltx_text" id="S5.T2.12.12.12.3.3"></span></td> </tr> <tr class="ltx_tr" id="S5.T2.13.13.13"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S5.T2.13.13.13.1"><math alttext="C_{5}" class="ltx_Math" display="inline" id="S5.T2.13.13.13.1.m1.1"><semantics id="S5.T2.13.13.13.1.m1.1a"><msub id="S5.T2.13.13.13.1.m1.1.1" xref="S5.T2.13.13.13.1.m1.1.1.cmml"><mi id="S5.T2.13.13.13.1.m1.1.1.2" xref="S5.T2.13.13.13.1.m1.1.1.2.cmml">C</mi><mn id="S5.T2.13.13.13.1.m1.1.1.3" xref="S5.T2.13.13.13.1.m1.1.1.3.cmml">5</mn></msub><annotation-xml encoding="MathML-Content" id="S5.T2.13.13.13.1.m1.1b"><apply id="S5.T2.13.13.13.1.m1.1.1.cmml" xref="S5.T2.13.13.13.1.m1.1.1"><csymbol cd="ambiguous" id="S5.T2.13.13.13.1.m1.1.1.1.cmml" xref="S5.T2.13.13.13.1.m1.1.1">subscript</csymbol><ci id="S5.T2.13.13.13.1.m1.1.1.2.cmml" xref="S5.T2.13.13.13.1.m1.1.1.2">𝐶</ci><cn id="S5.T2.13.13.13.1.m1.1.1.3.cmml" type="integer" xref="S5.T2.13.13.13.1.m1.1.1.3">5</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.T2.13.13.13.1.m1.1c">C_{5}</annotation><annotation encoding="application/x-llamapun" id="S5.T2.13.13.13.1.m1.1d">italic_C start_POSTSUBSCRIPT 5 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.13.13.13.2">2</td> <td 
class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T2.13.13.13.3"> <span class="ltx_text" id="S5.T2.13.13.13.3.1"></span> <span class="ltx_text" id="S5.T2.13.13.13.3.2"> <span class="ltx_tabular ltx_align_middle" id="S5.T2.13.13.13.3.2.1"> <span class="ltx_tr" id="S5.T2.13.13.13.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.13.13.13.3.2.1.1.1">sail/poolformer-m36, sail/poolformer-m48</span></span> </span></span><span class="ltx_text" id="S5.T2.13.13.13.3.3"></span></td> </tr> <tr class="ltx_tr" id="S5.T2.14.14.14"> <td class="ltx_td ltx_align_center ltx_border_b ltx_border_l ltx_border_r ltx_border_t" id="S5.T2.14.14.14.1"><math alttext="C_{6}" class="ltx_Math" display="inline" id="S5.T2.14.14.14.1.m1.1"><semantics id="S5.T2.14.14.14.1.m1.1a"><msub id="S5.T2.14.14.14.1.m1.1.1" xref="S5.T2.14.14.14.1.m1.1.1.cmml"><mi id="S5.T2.14.14.14.1.m1.1.1.2" xref="S5.T2.14.14.14.1.m1.1.1.2.cmml">C</mi><mn id="S5.T2.14.14.14.1.m1.1.1.3" xref="S5.T2.14.14.14.1.m1.1.1.3.cmml">6</mn></msub><annotation-xml encoding="MathML-Content" id="S5.T2.14.14.14.1.m1.1b"><apply id="S5.T2.14.14.14.1.m1.1.1.cmml" xref="S5.T2.14.14.14.1.m1.1.1"><csymbol cd="ambiguous" id="S5.T2.14.14.14.1.m1.1.1.1.cmml" xref="S5.T2.14.14.14.1.m1.1.1">subscript</csymbol><ci id="S5.T2.14.14.14.1.m1.1.1.2.cmml" xref="S5.T2.14.14.14.1.m1.1.1.2">𝐶</ci><cn id="S5.T2.14.14.14.1.m1.1.1.3.cmml" type="integer" xref="S5.T2.14.14.14.1.m1.1.1.3">6</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.T2.14.14.14.1.m1.1c">C_{6}</annotation><annotation encoding="application/x-llamapun" id="S5.T2.14.14.14.1.m1.1d">italic_C start_POSTSUBSCRIPT 6 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="S5.T2.14.14.14.2">2</td> <td class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="S5.T2.14.14.14.3"> <span class="ltx_text" id="S5.T2.14.14.14.3.1"></span> <span 
class="ltx_text" id="S5.T2.14.14.14.3.2"> <span class="ltx_tabular ltx_align_middle" id="S5.T2.14.14.14.3.2.1"> <span class="ltx_tr" id="S5.T2.14.14.14.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T2.14.14.14.3.2.1.1.1">shi-labs/dinat-base-in1k-224, microsoft/beit-large-patch16-224-pt22k</span></span> </span></span><span class="ltx_text" id="S5.T2.14.14.14.3.3"></span></td> </tr> </table> </figure> <div class="ltx_para" id="S5.SS2.p4"> <p class="ltx_p" id="S5.SS2.p4.10">Table <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.T2" title="TABLE II ‣ V-B Experiment for Coarse-Recall Phase ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">II</span></a> lists the non-singleton clusters for natural language processing and computer vision models. For natural language processing models, there are 8 non-singleton clusters, which contain 30 of the 40 models. For computer vision models, there are 6 non-singleton clusters, which together contain 25 models and cover almost all of the candidates. We examine some clusters in detail. For natural language processing models, because descriptive information is unavailable for many models, we infer the training process from the model name. 
For <math alttext="C_{1}" class="ltx_Math" display="inline" id="S5.SS2.p4.1.m1.1"><semantics id="S5.SS2.p4.1.m1.1a"><msub id="S5.SS2.p4.1.m1.1.1" xref="S5.SS2.p4.1.m1.1.1.cmml"><mi id="S5.SS2.p4.1.m1.1.1.2" xref="S5.SS2.p4.1.m1.1.1.2.cmml">C</mi><mn id="S5.SS2.p4.1.m1.1.1.3" xref="S5.SS2.p4.1.m1.1.1.3.cmml">1</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS2.p4.1.m1.1b"><apply id="S5.SS2.p4.1.m1.1.1.cmml" xref="S5.SS2.p4.1.m1.1.1"><csymbol cd="ambiguous" id="S5.SS2.p4.1.m1.1.1.1.cmml" xref="S5.SS2.p4.1.m1.1.1">subscript</csymbol><ci id="S5.SS2.p4.1.m1.1.1.2.cmml" xref="S5.SS2.p4.1.m1.1.1.2">𝐶</ci><cn id="S5.SS2.p4.1.m1.1.1.3.cmml" type="integer" xref="S5.SS2.p4.1.m1.1.1.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p4.1.m1.1c">C_{1}</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p4.1.m1.1d">italic_C start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT</annotation></semantics></math> and <math alttext="C_{2}" class="ltx_Math" display="inline" id="S5.SS2.p4.2.m2.1"><semantics id="S5.SS2.p4.2.m2.1a"><msub id="S5.SS2.p4.2.m2.1.1" xref="S5.SS2.p4.2.m2.1.1.cmml"><mi id="S5.SS2.p4.2.m2.1.1.2" xref="S5.SS2.p4.2.m2.1.1.2.cmml">C</mi><mn id="S5.SS2.p4.2.m2.1.1.3" xref="S5.SS2.p4.2.m2.1.1.3.cmml">2</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS2.p4.2.m2.1b"><apply id="S5.SS2.p4.2.m2.1.1.cmml" xref="S5.SS2.p4.2.m2.1.1"><csymbol cd="ambiguous" id="S5.SS2.p4.2.m2.1.1.1.cmml" xref="S5.SS2.p4.2.m2.1.1">subscript</csymbol><ci id="S5.SS2.p4.2.m2.1.1.2.cmml" xref="S5.SS2.p4.2.m2.1.1.2">𝐶</ci><cn id="S5.SS2.p4.2.m2.1.1.3.cmml" type="integer" xref="S5.SS2.p4.2.m2.1.1.3">2</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p4.2.m2.1c">C_{2}</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p4.2.m2.1d">italic_C start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT</annotation></semantics></math> clusters, we can see they mainly contain pre-trained models fine-tuned on qqp 
<cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib47" title="">47</a>]</cite> and cola <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib48" title="">48</a>]</cite> datasets respectively, demonstrating the models fine-tuned on the same downstream tasks could be grouped together by model clustering. On the other hand, we can see there are also models with names containing qqp that are not clustered into <math alttext="C_{1}" class="ltx_Math" display="inline" id="S5.SS2.p4.3.m3.1"><semantics id="S5.SS2.p4.3.m3.1a"><msub id="S5.SS2.p4.3.m3.1.1" xref="S5.SS2.p4.3.m3.1.1.cmml"><mi id="S5.SS2.p4.3.m3.1.1.2" xref="S5.SS2.p4.3.m3.1.1.2.cmml">C</mi><mn id="S5.SS2.p4.3.m3.1.1.3" xref="S5.SS2.p4.3.m3.1.1.3.cmml">1</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS2.p4.3.m3.1b"><apply id="S5.SS2.p4.3.m3.1.1.cmml" xref="S5.SS2.p4.3.m3.1.1"><csymbol cd="ambiguous" id="S5.SS2.p4.3.m3.1.1.1.cmml" xref="S5.SS2.p4.3.m3.1.1">subscript</csymbol><ci id="S5.SS2.p4.3.m3.1.1.2.cmml" xref="S5.SS2.p4.3.m3.1.1.2">𝐶</ci><cn id="S5.SS2.p4.3.m3.1.1.3.cmml" type="integer" xref="S5.SS2.p4.3.m3.1.1.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p4.3.m3.1c">C_{1}</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p4.3.m3.1d">italic_C start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT</annotation></semantics></math>, such as <math alttext="C_{7}" class="ltx_Math" display="inline" id="S5.SS2.p4.4.m4.1"><semantics id="S5.SS2.p4.4.m4.1a"><msub id="S5.SS2.p4.4.m4.1.1" xref="S5.SS2.p4.4.m4.1.1.cmml"><mi id="S5.SS2.p4.4.m4.1.1.2" xref="S5.SS2.p4.4.m4.1.1.2.cmml">C</mi><mn id="S5.SS2.p4.4.m4.1.1.3" xref="S5.SS2.p4.4.m4.1.1.3.cmml">7</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS2.p4.4.m4.1b"><apply id="S5.SS2.p4.4.m4.1.1.cmml" xref="S5.SS2.p4.4.m4.1.1"><csymbol cd="ambiguous" id="S5.SS2.p4.4.m4.1.1.1.cmml" 
xref="S5.SS2.p4.4.m4.1.1">subscript</csymbol><ci id="S5.SS2.p4.4.m4.1.1.2.cmml" xref="S5.SS2.p4.4.m4.1.1.2">𝐶</ci><cn id="S5.SS2.p4.4.m4.1.1.3.cmml" type="integer" xref="S5.SS2.p4.4.m4.1.1.3">7</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p4.4.m4.1c">C_{7}</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p4.4.m4.1d">italic_C start_POSTSUBSCRIPT 7 end_POSTSUBSCRIPT</annotation></semantics></math>, demonstrating that the performance of models with similar names may also vary, possibly because of different training setups. The <math alttext="C_{3}" class="ltx_Math" display="inline" id="S5.SS2.p4.5.m5.1"><semantics id="S5.SS2.p4.5.m5.1a"><msub id="S5.SS2.p4.5.m5.1.1" xref="S5.SS2.p4.5.m5.1.1.cmml"><mi id="S5.SS2.p4.5.m5.1.1.2" xref="S5.SS2.p4.5.m5.1.1.2.cmml">C</mi><mn id="S5.SS2.p4.5.m5.1.1.3" xref="S5.SS2.p4.5.m5.1.1.3.cmml">3</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS2.p4.5.m5.1b"><apply id="S5.SS2.p4.5.m5.1.1.cmml" xref="S5.SS2.p4.5.m5.1.1"><csymbol cd="ambiguous" id="S5.SS2.p4.5.m5.1.1.1.cmml" xref="S5.SS2.p4.5.m5.1.1">subscript</csymbol><ci id="S5.SS2.p4.5.m5.1.1.2.cmml" xref="S5.SS2.p4.5.m5.1.1.2">𝐶</ci><cn id="S5.SS2.p4.5.m5.1.1.3.cmml" type="integer" xref="S5.SS2.p4.5.m5.1.1.3">3</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p4.5.m5.1c">C_{3}</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p4.5.m5.1d">italic_C start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT</annotation></semantics></math> cluster groups models whose names contain mnli and feather_berts, which is reasonable because <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib49" title="">49</a>]</cite> indicates that the feather_berts models are also BERT models fine-tuned on the MNLI dataset. This further supports the effectiveness of the model clustering. 
The computer vision clusters exhibit a similar pattern. The <math alttext="C_{1}" class="ltx_Math" display="inline" id="S5.SS2.p4.6.m6.1"><semantics id="S5.SS2.p4.6.m6.1a"><msub id="S5.SS2.p4.6.m6.1.1" xref="S5.SS2.p4.6.m6.1.1.cmml"><mi id="S5.SS2.p4.6.m6.1.1.2" xref="S5.SS2.p4.6.m6.1.1.2.cmml">C</mi><mn id="S5.SS2.p4.6.m6.1.1.3" xref="S5.SS2.p4.6.m6.1.1.3.cmml">1</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS2.p4.6.m6.1b"><apply id="S5.SS2.p4.6.m6.1.1.cmml" xref="S5.SS2.p4.6.m6.1.1"><csymbol cd="ambiguous" id="S5.SS2.p4.6.m6.1.1.1.cmml" xref="S5.SS2.p4.6.m6.1.1">subscript</csymbol><ci id="S5.SS2.p4.6.m6.1.1.2.cmml" xref="S5.SS2.p4.6.m6.1.1.2">𝐶</ci><cn id="S5.SS2.p4.6.m6.1.1.3.cmml" type="integer" xref="S5.SS2.p4.6.m6.1.1.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p4.6.m6.1c">C_{1}</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p4.6.m6.1d">italic_C start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT</annotation></semantics></math> cluster mainly contains base-size deit models <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib19" title="">19</a>]</cite>, small-size vit models using dino <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib50" title="">50</a>]</cite>, and vit models using msn <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib51" title="">51</a>]</cite>. Looking further, we find that these three kinds of models are all pre-trained or fine-tuned on the ImageNet-1k dataset <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib52" title="">52</a>]</cite>. 
The <math alttext="C_{3}" class="ltx_Math" display="inline" id="S5.SS2.p4.7.m7.1"><semantics id="S5.SS2.p4.7.m7.1a"><msub id="S5.SS2.p4.7.m7.1.1" xref="S5.SS2.p4.7.m7.1.1.cmml"><mi id="S5.SS2.p4.7.m7.1.1.2" xref="S5.SS2.p4.7.m7.1.1.2.cmml">C</mi><mn id="S5.SS2.p4.7.m7.1.1.3" xref="S5.SS2.p4.7.m7.1.1.3.cmml">3</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS2.p4.7.m7.1b"><apply id="S5.SS2.p4.7.m7.1.1.cmml" xref="S5.SS2.p4.7.m7.1.1"><csymbol cd="ambiguous" id="S5.SS2.p4.7.m7.1.1.1.cmml" xref="S5.SS2.p4.7.m7.1.1">subscript</csymbol><ci id="S5.SS2.p4.7.m7.1.1.2.cmml" xref="S5.SS2.p4.7.m7.1.1.2">𝐶</ci><cn id="S5.SS2.p4.7.m7.1.1.3.cmml" type="integer" xref="S5.SS2.p4.7.m7.1.1.3">3</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p4.7.m7.1c">C_{3}</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p4.7.m7.1d">italic_C start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT</annotation></semantics></math> cluster mainly contains base-size vit models using dino, vit base models, and beit base models. 
As with the <math alttext="C_{1}" class="ltx_Math" display="inline" id="S5.SS2.p4.8.m8.1"><semantics id="S5.SS2.p4.8.m8.1a"><msub id="S5.SS2.p4.8.m8.1.1" xref="S5.SS2.p4.8.m8.1.1.cmml"><mi id="S5.SS2.p4.8.m8.1.1.2" xref="S5.SS2.p4.8.m8.1.1.2.cmml">C</mi><mn id="S5.SS2.p4.8.m8.1.1.3" xref="S5.SS2.p4.8.m8.1.1.3.cmml">1</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS2.p4.8.m8.1b"><apply id="S5.SS2.p4.8.m8.1.1.cmml" xref="S5.SS2.p4.8.m8.1.1"><csymbol cd="ambiguous" id="S5.SS2.p4.8.m8.1.1.1.cmml" xref="S5.SS2.p4.8.m8.1.1">subscript</csymbol><ci id="S5.SS2.p4.8.m8.1.1.2.cmml" xref="S5.SS2.p4.8.m8.1.1.2">𝐶</ci><cn id="S5.SS2.p4.8.m8.1.1.3.cmml" type="integer" xref="S5.SS2.p4.8.m8.1.1.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p4.8.m8.1c">C_{1}</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p4.8.m8.1d">italic_C start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT</annotation></semantics></math> cluster, these models, except dino-vit, are all pre-trained or fine-tuned on the ImageNet-21k dataset <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib53" title="">53</a>]</cite>. 
On the other hand, we can also see models that are not grouped in one cluster even though they share similar names or training datasets, like dino vit models in <math alttext="C_{1}" class="ltx_Math" display="inline" id="S5.SS2.p4.9.m9.1"><semantics id="S5.SS2.p4.9.m9.1a"><msub id="S5.SS2.p4.9.m9.1.1" xref="S5.SS2.p4.9.m9.1.1.cmml"><mi id="S5.SS2.p4.9.m9.1.1.2" xref="S5.SS2.p4.9.m9.1.1.2.cmml">C</mi><mn id="S5.SS2.p4.9.m9.1.1.3" xref="S5.SS2.p4.9.m9.1.1.3.cmml">1</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS2.p4.9.m9.1b"><apply id="S5.SS2.p4.9.m9.1.1.cmml" xref="S5.SS2.p4.9.m9.1.1"><csymbol cd="ambiguous" id="S5.SS2.p4.9.m9.1.1.1.cmml" xref="S5.SS2.p4.9.m9.1.1">subscript</csymbol><ci id="S5.SS2.p4.9.m9.1.1.2.cmml" xref="S5.SS2.p4.9.m9.1.1.2">𝐶</ci><cn id="S5.SS2.p4.9.m9.1.1.3.cmml" type="integer" xref="S5.SS2.p4.9.m9.1.1.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p4.9.m9.1c">C_{1}</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p4.9.m9.1d">italic_C start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT</annotation></semantics></math> and <math alttext="C_{3}" class="ltx_Math" display="inline" id="S5.SS2.p4.10.m10.1"><semantics id="S5.SS2.p4.10.m10.1a"><msub id="S5.SS2.p4.10.m10.1.1" xref="S5.SS2.p4.10.m10.1.1.cmml"><mi id="S5.SS2.p4.10.m10.1.1.2" xref="S5.SS2.p4.10.m10.1.1.2.cmml">C</mi><mn id="S5.SS2.p4.10.m10.1.1.3" xref="S5.SS2.p4.10.m10.1.1.3.cmml">3</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS2.p4.10.m10.1b"><apply id="S5.SS2.p4.10.m10.1.1.cmml" xref="S5.SS2.p4.10.m10.1.1"><csymbol cd="ambiguous" id="S5.SS2.p4.10.m10.1.1.1.cmml" xref="S5.SS2.p4.10.m10.1.1">subscript</csymbol><ci id="S5.SS2.p4.10.m10.1.1.2.cmml" xref="S5.SS2.p4.10.m10.1.1.2">𝐶</ci><cn id="S5.SS2.p4.10.m10.1.1.3.cmml" type="integer" xref="S5.SS2.p4.10.m10.1.1.3">3</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p4.10.m10.1c">C_{3}</annotation><annotation 
encoding="application/x-llamapun" id="S5.SS2.p4.10.m10.1d">italic_C start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT</annotation></semantics></math>, which also shows that models with similar names may perform differently. The results of the K-means clustering method are given in Appendix F <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib15" title="">15</a>]</cite>. Generally speaking, some clusters contain models with different architectures and/or training datasets.</p> </div> <div class="ltx_para" id="S5.SS2.p5"> <p class="ltx_p" id="S5.SS2.p5.2">We also compare the performance of models in non-singleton and singleton clusters in Table <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.T3" title="TABLE III ‣ V-B Experiment for Coarse-Recall Phase ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">III</span></a>. The average accuracy of models in non-singleton clusters is higher than that of models in singleton clusters for both natural language processing and computer vision tasks (0.67 vs. 0.61 and 0.84 vs. 0.73, respectively). We also count the models that achieve the maximum accuracy on a benchmark dataset: models in non-singleton clusters account for almost all of the best models (22 of 24 for NLP and 10 of 10 for CV). The results in Table <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.T3" title="TABLE III ‣ V-B Experiment for Coarse-Recall Phase ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">III</span></a> indicate that models in the non-singleton clusters tend to achieve higher training performance. 
This could be explained by the fact that the high-quality model may achieve similar performance bounded by the state-of-the-art model, while on the other hand, the performance of poorly-performed models may differ a lot on different datasets. This phenomenon supports the <math alttext="proxy\_score" class="ltx_Math" display="inline" id="S5.SS2.p5.1.m1.1"><semantics id="S5.SS2.p5.1.m1.1a"><mrow id="S5.SS2.p5.1.m1.1.1" xref="S5.SS2.p5.1.m1.1.1.cmml"><mi id="S5.SS2.p5.1.m1.1.1.2" xref="S5.SS2.p5.1.m1.1.1.2.cmml">p</mi><mo id="S5.SS2.p5.1.m1.1.1.1" xref="S5.SS2.p5.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.1.m1.1.1.3" xref="S5.SS2.p5.1.m1.1.1.3.cmml">r</mi><mo id="S5.SS2.p5.1.m1.1.1.1a" xref="S5.SS2.p5.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.1.m1.1.1.4" xref="S5.SS2.p5.1.m1.1.1.4.cmml">o</mi><mo id="S5.SS2.p5.1.m1.1.1.1b" xref="S5.SS2.p5.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.1.m1.1.1.5" xref="S5.SS2.p5.1.m1.1.1.5.cmml">x</mi><mo id="S5.SS2.p5.1.m1.1.1.1c" xref="S5.SS2.p5.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.1.m1.1.1.6" xref="S5.SS2.p5.1.m1.1.1.6.cmml">y</mi><mo id="S5.SS2.p5.1.m1.1.1.1d" xref="S5.SS2.p5.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.1.m1.1.1.7" mathvariant="normal" xref="S5.SS2.p5.1.m1.1.1.7.cmml">_</mi><mo id="S5.SS2.p5.1.m1.1.1.1e" xref="S5.SS2.p5.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.1.m1.1.1.8" xref="S5.SS2.p5.1.m1.1.1.8.cmml">s</mi><mo id="S5.SS2.p5.1.m1.1.1.1f" xref="S5.SS2.p5.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.1.m1.1.1.9" xref="S5.SS2.p5.1.m1.1.1.9.cmml">c</mi><mo id="S5.SS2.p5.1.m1.1.1.1g" xref="S5.SS2.p5.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.1.m1.1.1.10" xref="S5.SS2.p5.1.m1.1.1.10.cmml">o</mi><mo id="S5.SS2.p5.1.m1.1.1.1h" xref="S5.SS2.p5.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.1.m1.1.1.11" xref="S5.SS2.p5.1.m1.1.1.11.cmml">r</mi><mo id="S5.SS2.p5.1.m1.1.1.1i" xref="S5.SS2.p5.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.1.m1.1.1.12" xref="S5.SS2.p5.1.m1.1.1.12.cmml">e</mi></mrow><annotation-xml encoding="MathML-Content" 
id="S5.SS2.p5.1.m1.1b"><apply id="S5.SS2.p5.1.m1.1.1.cmml" xref="S5.SS2.p5.1.m1.1.1"><times id="S5.SS2.p5.1.m1.1.1.1.cmml" xref="S5.SS2.p5.1.m1.1.1.1"></times><ci id="S5.SS2.p5.1.m1.1.1.2.cmml" xref="S5.SS2.p5.1.m1.1.1.2">𝑝</ci><ci id="S5.SS2.p5.1.m1.1.1.3.cmml" xref="S5.SS2.p5.1.m1.1.1.3">𝑟</ci><ci id="S5.SS2.p5.1.m1.1.1.4.cmml" xref="S5.SS2.p5.1.m1.1.1.4">𝑜</ci><ci id="S5.SS2.p5.1.m1.1.1.5.cmml" xref="S5.SS2.p5.1.m1.1.1.5">𝑥</ci><ci id="S5.SS2.p5.1.m1.1.1.6.cmml" xref="S5.SS2.p5.1.m1.1.1.6">𝑦</ci><ci id="S5.SS2.p5.1.m1.1.1.7.cmml" xref="S5.SS2.p5.1.m1.1.1.7">_</ci><ci id="S5.SS2.p5.1.m1.1.1.8.cmml" xref="S5.SS2.p5.1.m1.1.1.8">𝑠</ci><ci id="S5.SS2.p5.1.m1.1.1.9.cmml" xref="S5.SS2.p5.1.m1.1.1.9">𝑐</ci><ci id="S5.SS2.p5.1.m1.1.1.10.cmml" xref="S5.SS2.p5.1.m1.1.1.10">𝑜</ci><ci id="S5.SS2.p5.1.m1.1.1.11.cmml" xref="S5.SS2.p5.1.m1.1.1.11">𝑟</ci><ci id="S5.SS2.p5.1.m1.1.1.12.cmml" xref="S5.SS2.p5.1.m1.1.1.12">𝑒</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p5.1.m1.1c">proxy\_score</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p5.1.m1.1d">italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e</annotation></semantics></math> computation strategy in the coarse-recall phase that we only compute the <math alttext="proxy\_score" class="ltx_Math" display="inline" id="S5.SS2.p5.2.m2.1"><semantics id="S5.SS2.p5.2.m2.1a"><mrow id="S5.SS2.p5.2.m2.1.1" xref="S5.SS2.p5.2.m2.1.1.cmml"><mi id="S5.SS2.p5.2.m2.1.1.2" xref="S5.SS2.p5.2.m2.1.1.2.cmml">p</mi><mo id="S5.SS2.p5.2.m2.1.1.1" xref="S5.SS2.p5.2.m2.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.2.m2.1.1.3" xref="S5.SS2.p5.2.m2.1.1.3.cmml">r</mi><mo id="S5.SS2.p5.2.m2.1.1.1a" xref="S5.SS2.p5.2.m2.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.2.m2.1.1.4" xref="S5.SS2.p5.2.m2.1.1.4.cmml">o</mi><mo id="S5.SS2.p5.2.m2.1.1.1b" xref="S5.SS2.p5.2.m2.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.2.m2.1.1.5" xref="S5.SS2.p5.2.m2.1.1.5.cmml">x</mi><mo id="S5.SS2.p5.2.m2.1.1.1c" 
xref="S5.SS2.p5.2.m2.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.2.m2.1.1.6" xref="S5.SS2.p5.2.m2.1.1.6.cmml">y</mi><mo id="S5.SS2.p5.2.m2.1.1.1d" xref="S5.SS2.p5.2.m2.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.2.m2.1.1.7" mathvariant="normal" xref="S5.SS2.p5.2.m2.1.1.7.cmml">_</mi><mo id="S5.SS2.p5.2.m2.1.1.1e" xref="S5.SS2.p5.2.m2.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.2.m2.1.1.8" xref="S5.SS2.p5.2.m2.1.1.8.cmml">s</mi><mo id="S5.SS2.p5.2.m2.1.1.1f" xref="S5.SS2.p5.2.m2.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.2.m2.1.1.9" xref="S5.SS2.p5.2.m2.1.1.9.cmml">c</mi><mo id="S5.SS2.p5.2.m2.1.1.1g" xref="S5.SS2.p5.2.m2.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.2.m2.1.1.10" xref="S5.SS2.p5.2.m2.1.1.10.cmml">o</mi><mo id="S5.SS2.p5.2.m2.1.1.1h" xref="S5.SS2.p5.2.m2.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.2.m2.1.1.11" xref="S5.SS2.p5.2.m2.1.1.11.cmml">r</mi><mo id="S5.SS2.p5.2.m2.1.1.1i" xref="S5.SS2.p5.2.m2.1.1.1.cmml">⁢</mo><mi id="S5.SS2.p5.2.m2.1.1.12" xref="S5.SS2.p5.2.m2.1.1.12.cmml">e</mi></mrow><annotation-xml encoding="MathML-Content" id="S5.SS2.p5.2.m2.1b"><apply id="S5.SS2.p5.2.m2.1.1.cmml" xref="S5.SS2.p5.2.m2.1.1"><times id="S5.SS2.p5.2.m2.1.1.1.cmml" xref="S5.SS2.p5.2.m2.1.1.1"></times><ci id="S5.SS2.p5.2.m2.1.1.2.cmml" xref="S5.SS2.p5.2.m2.1.1.2">𝑝</ci><ci id="S5.SS2.p5.2.m2.1.1.3.cmml" xref="S5.SS2.p5.2.m2.1.1.3">𝑟</ci><ci id="S5.SS2.p5.2.m2.1.1.4.cmml" xref="S5.SS2.p5.2.m2.1.1.4">𝑜</ci><ci id="S5.SS2.p5.2.m2.1.1.5.cmml" xref="S5.SS2.p5.2.m2.1.1.5">𝑥</ci><ci id="S5.SS2.p5.2.m2.1.1.6.cmml" xref="S5.SS2.p5.2.m2.1.1.6">𝑦</ci><ci id="S5.SS2.p5.2.m2.1.1.7.cmml" xref="S5.SS2.p5.2.m2.1.1.7">_</ci><ci id="S5.SS2.p5.2.m2.1.1.8.cmml" xref="S5.SS2.p5.2.m2.1.1.8">𝑠</ci><ci id="S5.SS2.p5.2.m2.1.1.9.cmml" xref="S5.SS2.p5.2.m2.1.1.9">𝑐</ci><ci id="S5.SS2.p5.2.m2.1.1.10.cmml" xref="S5.SS2.p5.2.m2.1.1.10">𝑜</ci><ci id="S5.SS2.p5.2.m2.1.1.11.cmml" xref="S5.SS2.p5.2.m2.1.1.11">𝑟</ci><ci id="S5.SS2.p5.2.m2.1.1.12.cmml" xref="S5.SS2.p5.2.m2.1.1.12">𝑒</ci></apply></annotation-xml><annotation 
encoding="application/x-tex" id="S5.SS2.p5.2.m2.1c">proxy\_score</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p5.2.m2.1d">italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e</annotation></semantics></math> between the target dataset and the representative models of non-singleton clusters, and propagate the score to models in singleton clusters.</p> </div> <figure class="ltx_table" id="S5.T3"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">TABLE III: </span>Performance Comparison of models in Singleton and Non-Singleton Clusters</figcaption> <table class="ltx_tabular ltx_centering ltx_align_middle" id="S5.T3.1"> <tr class="ltx_tr" id="S5.T3.1.1"> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T3.1.1.1"><span class="ltx_text ltx_font_bold" id="S5.T3.1.1.1.1">Task Type</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T3.1.1.2"><span class="ltx_text ltx_font_bold" id="S5.T3.1.1.2.1">Cluster Type</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T3.1.1.3"><span class="ltx_text ltx_font_bold" id="S5.T3.1.1.3.1">Avg(Acc)</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T3.1.1.4"><span class="ltx_text ltx_font_bold" id="S5.T3.1.1.4.1">No. 
Maximum(Acc)</span></td> </tr> <tr class="ltx_tr" id="S5.T3.1.2"> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T3.1.2.1" rowspan="2"><span class="ltx_text" id="S5.T3.1.2.1.1">NLP</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T3.1.2.2">Non-Singleton</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T3.1.2.3">0.67</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T3.1.2.4">22</td> </tr> <tr class="ltx_tr" id="S5.T3.1.3"> <td class="ltx_td ltx_align_center" id="S5.T3.1.3.1">Singleton</td> <td class="ltx_td ltx_align_center" id="S5.T3.1.3.2">0.61</td> <td class="ltx_td ltx_align_center" id="S5.T3.1.3.3">2</td> </tr> <tr class="ltx_tr" id="S5.T3.1.4"> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T3.1.4.1" rowspan="2"><span class="ltx_text" id="S5.T3.1.4.1.1">CV</span></td> <td class="ltx_td ltx_align_center" id="S5.T3.1.4.2">Non-Singleton</td> <td class="ltx_td ltx_align_center" id="S5.T3.1.4.3">0.84</td> <td class="ltx_td ltx_align_center" id="S5.T3.1.4.4">10</td> </tr> <tr class="ltx_tr" id="S5.T3.1.5"> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T3.1.5.1">Singleton</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T3.1.5.2">0.73</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T3.1.5.3">0</td> </tr> </table> </figure> <figure class="ltx_figure" id="S5.F5"> <p class="ltx_p ltx_align_center" id="S5.F5.1"><span class="ltx_text" id="S5.F5.1.1"><img alt="Refer to caption" class="ltx_graphics ltx_img_landscape" height="1440" id="S5.F5.1.1.g1" src="extracted/2404.00069v1/coarse-recall-vs-random_8.png" width="2880"/></span></p> <br class="ltx_break ltx_break"/> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_figure">Figure 5: </span>The average accuracy comparison of recalled models</figcaption> </figure> <figure class="ltx_figure" id="S5.F6"> <p class="ltx_p ltx_align_center" id="S5.F6.1"><span class="ltx_text" id="S5.F6.1.1"><img alt="Refer to caption" 
class="ltx_graphics ltx_img_landscape" height="900" id="S5.F6.1.1.g1" src="extracted/2404.00069v1/cluster_per.png" width="1688"/></span></p> <br class="ltx_break ltx_break"/> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_figure">Figure 6: </span>Clustering performance based on the first validation results. Blue: comparison between random clustering and clustering based on validation performance, measured by the silhouette score (higher is better). Red: comparison between predicting test performance with the clustering method and predicting it with the mean of all historical test performances. The metric is the absolute difference between the predicted and actual test performance divided by the actual test performance, averaged with each dataset taken in turn as the target dataset (smaller is more accurate). Full model names can be found in Table <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#A0.T8" title="TABLE VIII ‣ -B Model Details ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">VIII</span></a> in Appendix B <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib15" title="">15</a>]</cite>.</figcaption> </figure> <div class="ltx_para" id="S5.SS2.p6"> <p class="ltx_p" id="S5.SS2.p6.4"><span class="ltx_text ltx_font_bold" id="S5.SS2.p6.4.1">Model Recall.</span> To evaluate the effectiveness of the coarse-recall phase, we fine-tune all the models on the corresponding target datasets to obtain the actual training performance. 
Then, we compare the average training accuracy of the top <math alttext="K" class="ltx_Math" display="inline" id="S5.SS2.p6.1.m1.1"><semantics id="S5.SS2.p6.1.m1.1a"><mi id="S5.SS2.p6.1.m1.1.1" xref="S5.SS2.p6.1.m1.1.1.cmml">K</mi><annotation-xml encoding="MathML-Content" id="S5.SS2.p6.1.m1.1b"><ci id="S5.SS2.p6.1.m1.1.1.cmml" xref="S5.SS2.p6.1.m1.1.1">𝐾</ci></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p6.1.m1.1c">K</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p6.1.m1.1d">italic_K</annotation></semantics></math> recalled models on the target datasets. Fig. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.F5" title="Figure 5 ‣ V-B Experiment for Coarse-Recall Phase ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">5</span></a> compares the average accuracy of coarse-recall and random-recall. Here, random-recall returns <math alttext="K" class="ltx_Math" display="inline" id="S5.SS2.p6.2.m2.1"><semantics id="S5.SS2.p6.2.m2.1a"><mi id="S5.SS2.p6.2.m2.1.1" xref="S5.SS2.p6.2.m2.1.1.cmml">K</mi><annotation-xml encoding="MathML-Content" id="S5.SS2.p6.2.m2.1b"><ci id="S5.SS2.p6.2.m2.1.1.cmml" xref="S5.SS2.p6.2.m2.1.1">𝐾</ci></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p6.2.m2.1c">K</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p6.2.m2.1d">italic_K</annotation></semantics></math> models chosen at random from the model repository. Coarse-recall achieves higher accuracy than random-recall on all eight target datasets. 
Moreover, for coarse-recall, the average accuracy at smaller <math alttext="K" class="ltx_Math" display="inline" id="S5.SS2.p6.3.m3.1"><semantics id="S5.SS2.p6.3.m3.1a"><mi id="S5.SS2.p6.3.m3.1.1" xref="S5.SS2.p6.3.m3.1.1.cmml">K</mi><annotation-xml encoding="MathML-Content" id="S5.SS2.p6.3.m3.1b"><ci id="S5.SS2.p6.3.m3.1.1.cmml" xref="S5.SS2.p6.3.m3.1.1">𝐾</ci></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p6.3.m3.1c">K</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p6.3.m3.1d">italic_K</annotation></semantics></math> values is higher than that at larger <math alttext="K" class="ltx_Math" display="inline" id="S5.SS2.p6.4.m4.1"><semantics id="S5.SS2.p6.4.m4.1a"><mi id="S5.SS2.p6.4.m4.1.1" xref="S5.SS2.p6.4.m4.1.1.cmml">K</mi><annotation-xml encoding="MathML-Content" id="S5.SS2.p6.4.m4.1b"><ci id="S5.SS2.p6.4.m4.1.1.cmml" xref="S5.SS2.p6.4.m4.1.1">𝐾</ci></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p6.4.m4.1c">K</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p6.4.m4.1d">italic_K</annotation></semantics></math> values, demonstrating that the top-ranked models recalled by coarse-recall tend to achieve higher training performance than models with lower recall scores. We also find that, for the other datasets, the top 5 models recalled by coarse-recall already contain the model achieving the maximum training performance, whereas for the tweet, beans, and MedMNIST datasets, the number of recalled models needed to contain the best model is 10, 10, and 15, respectively. 
In the following experiments, we empirically set the number of recalled models to 10, which accounts for about 25% and 30% of all models for natural language processing and computer vision tasks, respectively.</p> </div> </section> <section class="ltx_subsection" id="S5.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S5.SS3.4.1.1">V-C</span> </span><span class="ltx_text ltx_font_italic" id="S5.SS3.5.2">Experiments for Fine-Selection Phase</span> </h3> <div class="ltx_para" id="S5.SS3.p1"> <p class="ltx_p" id="S5.SS3.p1.1"><span class="ltx_text ltx_font_bold" id="S5.SS3.p1.1.1">Convergence Trend.</span> In this section, we demonstrate the effectiveness of the convergence trends mined from models’ validation performance on benchmark datasets. We apply clustering during the fine-selection phase, and a single round of filtering is usually sufficient to select the final model. Therefore, we only consider the performance of clustering based on the first validation results. First, we directly measure clustering quality with the silhouette coefficient <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib45" title="">45</a>]</cite>, where a higher value indicates a better clustering outcome. As shown in the blue part of Fig. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.F6" title="Figure 6 ‣ V-B Experiment for Coarse-Recall Phase ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">6</span></a>, across all models, clustering based on validation results clearly outperforms random clustering, demonstrating the effectiveness of validation results for clustering. 
In addition, treating each benchmark dataset in turn as the target dataset, we assess the feasibility of predicting a model’s final test performance from the mean test performance of the models in its cluster. We compare this with predicting the test performance from the mean of all benchmark dataset test performances. The final metric is the absolute difference between the predicted and actual test performance divided by the actual test performance, averaged across all datasets. The red part of Fig. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.F6" title="Figure 6 ‣ V-B Experiment for Coarse-Recall Phase ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">6</span></a> demonstrates that our method predicts final test performance more accurately. This not only validates the feasibility of using the mean test performance within the cluster as a prediction but also demonstrates the effectiveness of clustering. In summary, it is feasible to select models by mining convergence trends from validation performance and then predicting each model’s final test performance.</p> </div> <div class="ltx_para" id="S5.SS3.p2"> <p class="ltx_p" id="S5.SS3.p2.1"><span class="ltx_text ltx_font_bold" id="S5.SS3.p2.1.1">Filtering Threshold.</span> Given the validation fluctuations during model training and the potentially significant discrepancies between benchmark and target datasets, using convergence trends to filter models might eliminate models with good performance. Therefore, we introduce a threshold: a model is filtered out only when another model has better validation performance and a predicted performance improvement exceeding this threshold. The threshold is a proportion of the difference between the predicted performances. 
Table <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.T4" title="TABLE IV ‣ V-C Experiments for Fine-Selection Phase ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">IV</span></a> reports the performance of fine-selection under various threshold settings. The threshold delays the filtering of well-performing models, which can slightly improve the accuracy of the selected model at the expense of efficiency: on X-Ray, for example, accuracy rises from 0.966 to 0.969 while runtime increases from 14 to 18 as the threshold goes from 0% to 10%. To present the efficiency of our method and the performance of the selected models more directly, we use a 0% threshold in all subsequent experiments.</p> </div> <figure class="ltx_table" id="S5.T4"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">TABLE IV: </span>Accuracy and runtime comparison across filtering threshold settings in fine-selection. 0% is the original setting.</figcaption> <table class="ltx_tabular ltx_centering ltx_align_middle" id="S5.T4.1"> <tr class="ltx_tr" id="S5.T4.1.1"> <td class="ltx_td ltx_align_left ltx_border_t" id="S5.T4.1.1.1">Dataset</td> <td class="ltx_td ltx_border_t" id="S5.T4.1.1.2"></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T4.1.1.3">0%</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T4.1.1.4">1%</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T4.1.1.5">5%</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T4.1.1.6">10%</td> </tr> <tr class="ltx_tr" id="S5.T4.1.2"> <td class="ltx_td ltx_align_left ltx_border_t" id="S5.T4.1.2.1" rowspan="2"><span class="ltx_text" id="S5.T4.1.2.1.1">MNLI</span></td> <td class="ltx_td ltx_align_left ltx_border_t" id="S5.T4.1.2.2">Accuracy</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T4.1.2.3">0.85</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T4.1.2.4">0.85</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T4.1.2.5">0.85</td> <td class="ltx_td ltx_align_center 
ltx_border_t" id="S5.T4.1.2.6">0.85</td> </tr> <tr class="ltx_tr" id="S5.T4.1.3"> <td class="ltx_td ltx_align_left" id="S5.T4.1.3.1">RunTime</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.3.2">14</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.3.3">14</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.3.4">15</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.3.5">16</td> </tr> <tr class="ltx_tr" id="S5.T4.1.4"> <td class="ltx_td ltx_align_left" id="S5.T4.1.4.1" rowspan="2"> <span class="ltx_text" id="S5.T4.1.4.1.2">MultiRC</span> </td> <td class="ltx_td ltx_align_left" id="S5.T4.1.4.2">Accuracy</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.4.3">0.63</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.4.4">0.63</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.4.5">0.63</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.4.6">0.63</td> </tr> <tr class="ltx_tr" id="S5.T4.1.5"> <td class="ltx_td ltx_align_left" id="S5.T4.1.5.1">RunTime</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.5.2">16</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.5.3">16</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.5.4">19</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.5.5">19</td> </tr> <tr class="ltx_tr" id="S5.T4.1.6"> <td class="ltx_td ltx_align_left" id="S5.T4.1.6.1" rowspan="2"> <span class="ltx_text" id="S5.T4.1.6.1.2">Flowers</span> </td> <td class="ltx_td ltx_align_left" id="S5.T4.1.6.2">Accuracy</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.6.3">0.985</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.6.4">0.985</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.6.5">0.986</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.6.6">0.986</td> </tr> <tr class="ltx_tr" id="S5.T4.1.7"> <td class="ltx_td ltx_align_left" id="S5.T4.1.7.1">RunTime</td> <td class="ltx_td ltx_align_center" 
id="S5.T4.1.7.2">15</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.7.3">15</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.7.4">18</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.7.5">18</td> </tr> <tr class="ltx_tr" id="S5.T4.1.8"> <td class="ltx_td ltx_align_left ltx_border_b" id="S5.T4.1.8.1" rowspan="2"> <span class="ltx_text" id="S5.T4.1.8.1.2">X-Ray</span> </td> <td class="ltx_td ltx_align_left" id="S5.T4.1.8.2">Accuracy</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.8.3">0.966</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.8.4">0.966</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.8.5">0.969</td> <td class="ltx_td ltx_align_center" id="S5.T4.1.8.6">0.969</td> </tr> <tr class="ltx_tr" id="S5.T4.1.9"> <td class="ltx_td ltx_align_left ltx_border_b" id="S5.T4.1.9.1">RunTime</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T4.1.9.2">14</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T4.1.9.3">15</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T4.1.9.4">18</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T4.1.9.5">18</td> </tr> </table> </figure> <div class="ltx_para" id="S5.SS3.p3"> <p class="ltx_p" id="S5.SS3.p3.1"><span class="ltx_text ltx_font_bold" id="S5.SS3.p3.1.1">Method Comparison.</span> After validating the effectiveness of mining convergence trends from model training performance on benchmark datasets, we further verify its value for model selection.</p> </div> <section class="ltx_subsubsection" id="S5.SS3.SSS1"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection"><span class="ltx_text" id="S5.SS3.SSS1.4.1.1">V-C</span>1 </span>Performance</h4> <div class="ltx_para" id="S5.SS3.SSS1.p1"> <p class="ltx_p" id="S5.SS3.SSS1.p1.1">Fig. 
<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.F7" title="Figure 7 ‣ V-C1 Performance ‣ V-C Experiments for Fine-Selection Phase ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">7</span></a> compares the final performance of the model selected by successive halving with that of the model selected by our proposed fine-selection method, both when starting from the top 10 models and when starting from all of the models. We also provide the best and worst performances among the top 10 models for the NLP and CV tasks. The fine-selection (FS) method is always able to pick out the optimal or a near-optimal model, whereas the traditional successive halving method does not necessarily select the best model.</p> </div> <figure class="ltx_figure" id="S5.F7"> <p class="ltx_p ltx_align_center" id="S5.F7.1"><span class="ltx_text" id="S5.F7.1.1"><img alt="Refer to caption" class="ltx_graphics ltx_img_landscape" height="1424" id="S5.F7.1.1.g1" src="extracted/2404.00069v1/performance_8.png" width="2848"/></span></p> <br class="ltx_break ltx_break"/> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_figure">Figure 7: </span>Performance of the selected model under successive halving (SH) and our fine-selection (FS) method on the NLP datasets (MNLI, Tweet, Boolq, MultiRC) and the CV datasets (Beans, MedMNIST, Flowers, X-Ray). 10, 30, and 40 denote the initial number of models before filtering. The best and worst model performances among the top 10 models for each dataset are also provided.</figcaption> </figure> </section> <section class="ltx_subsubsection" id="S5.SS3.SSS2"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection"><span class="ltx_text" id="S5.SS3.SSS2.4.1.1">V-C</span>2 </span>Time</h4> <div class="ltx_para" id="S5.SS3.SSS2.p1"> <p class="ltx_p" id="S5.SS3.SSS2.p1.1">In addition to the performance of the selected model, the speed of model selection is equally important. 
In Table <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.T5" title="TABLE V ‣ V-C2 Time ‣ V-C Experiments for Fine-Selection Phase ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">V</span></a>, we compare the speedup of the two methods relative to brute-force search (BF), that is, fine-tuning all models for a fixed number of epochs. Given the consistent training settings and hardware environment, we use the total number of fine-tuning epochs across all models to represent the time for model selection. Our method is noticeably more efficient than successive halving on all datasets and for both model counts; for the 40 NLP models, for example, it achieves a 4.17x to 4.55x speedup over BF compared with 2.60x for successive halving. Furthermore, because the time taken per epoch is considerable, the time saved in model selection by our method is significant.</p> </div> <figure class="ltx_table" id="S5.T5"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">TABLE V: </span>Time comparisons among different model selection methods. NLP tasks are trained for 5 epochs and CV tasks are trained for 4 epochs. Runtime is the total number of training epochs. Results are provided both for the top 10 models and for all of the models (40 for NLP and 30 for CV). 
</figcaption> <table class="ltx_tabular ltx_centering ltx_align_middle" id="S5.T5.1"> <tr class="ltx_tr" id="S5.T5.1.1"> <td class="ltx_td ltx_align_left ltx_border_t" id="S5.T5.1.1.1">Models</td> <td class="ltx_td ltx_border_t" id="S5.T5.1.1.2"></td> <td class="ltx_td ltx_align_center ltx_border_t" colspan="2" id="S5.T5.1.1.3">10</td> <td class="ltx_td ltx_align_left ltx_border_t" colspan="2" id="S5.T5.1.1.4">40(NLP)/30(CV)</td> </tr> <tr class="ltx_tr" id="S5.T5.1.2"> <td class="ltx_td" id="S5.T5.1.2.1"></td> <td class="ltx_td" id="S5.T5.1.2.2"></td> <td class="ltx_td ltx_align_center" id="S5.T5.1.2.3"> <table class="ltx_tabular ltx_align_middle" id="S5.T5.1.2.3.1"> <tr class="ltx_tr" id="S5.T5.1.2.3.1.1"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T5.1.2.3.1.1.1">Runtime</td> </tr> <tr class="ltx_tr" id="S5.T5.1.2.3.1.2"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T5.1.2.3.1.2.1">(epoch)</td> </tr> </table> </td> <td class="ltx_td ltx_align_center" id="S5.T5.1.2.4"> <table class="ltx_tabular ltx_align_middle" id="S5.T5.1.2.4.1"> <tr class="ltx_tr" id="S5.T5.1.2.4.1.1"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T5.1.2.4.1.1.1">Speedup</td> </tr> <tr class="ltx_tr" id="S5.T5.1.2.4.1.2"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T5.1.2.4.1.2.1">(vs. 
BF)</td> </tr> </table> </td> <td class="ltx_td ltx_align_center" id="S5.T5.1.2.5"> <table class="ltx_tabular ltx_align_middle" id="S5.T5.1.2.5.1"> <tr class="ltx_tr" id="S5.T5.1.2.5.1.1"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T5.1.2.5.1.1.1">Runtime</td> </tr> <tr class="ltx_tr" id="S5.T5.1.2.5.1.2"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T5.1.2.5.1.2.1">(epoch)</td> </tr> </table> </td> <td class="ltx_td ltx_align_center" id="S5.T5.1.2.6"> <table class="ltx_tabular ltx_align_middle" id="S5.T5.1.2.6.1"> <tr class="ltx_tr" id="S5.T5.1.2.6.1.1"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T5.1.2.6.1.1.1">Speedup</td> </tr> <tr class="ltx_tr" id="S5.T5.1.2.6.1.2"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S5.T5.1.2.6.1.2.1">(vs. BF)</td> </tr> </table> </td> </tr> <tr class="ltx_tr" id="S5.T5.1.3"> <td class="ltx_td ltx_align_left ltx_border_t" id="S5.T5.1.3.1"><span class="ltx_text ltx_font_bold" id="S5.T5.1.3.1.1">NLP</span></td> <td class="ltx_td ltx_align_left ltx_border_t" id="S5.T5.1.3.2">BF</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T5.1.3.3">50</td> <td class="ltx_td ltx_border_t" id="S5.T5.1.3.4"></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T5.1.3.5">200</td> <td class="ltx_td ltx_border_t" id="S5.T5.1.3.6"></td> </tr> <tr class="ltx_tr" id="S5.T5.1.4"> <td class="ltx_td" id="S5.T5.1.4.1"></td> <td class="ltx_td ltx_align_left" id="S5.T5.1.4.2">SH</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.4.3">19</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.4.4">2.63x</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.4.5">77</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.4.6">2.60x</td> </tr> <tr class="ltx_tr" id="S5.T5.1.5"> <td class="ltx_td ltx_align_left" id="S5.T5.1.5.1"> Tweet</td> <td class="ltx_td ltx_align_left" id="S5.T5.1.5.2">FS</td> <td class="ltx_td ltx_align_center" 
id="S5.T5.1.5.3">14</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.5.4">3.57x</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.5.5">44</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.5.6">4.55x</td> </tr> <tr class="ltx_tr" id="S5.T5.1.6"> <td class="ltx_td ltx_align_left" id="S5.T5.1.6.1">MNLI</td> <td class="ltx_td ltx_align_left" id="S5.T5.1.6.2">FS</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.6.3">14</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.6.4">3.57x</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.6.5">44</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.6.6">4.55x</td> </tr> <tr class="ltx_tr" id="S5.T5.1.7"> <td class="ltx_td ltx_align_left" id="S5.T5.1.7.1">MultiRC</td> <td class="ltx_td ltx_align_left" id="S5.T5.1.7.2">FS</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.7.3">15</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.7.4">3.33x</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.7.5">46</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.7.6">4.35x</td> </tr> <tr class="ltx_tr" id="S5.T5.1.8"> <td class="ltx_td ltx_align_left" id="S5.T5.1.8.1">Boolq</td> <td class="ltx_td ltx_align_left" id="S5.T5.1.8.2">FS</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.8.3">16</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.8.4">3.13x</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.8.5">48</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.8.6">4.17x</td> </tr> <tr class="ltx_tr" id="S5.T5.1.9"> <td class="ltx_td ltx_align_left ltx_border_t" id="S5.T5.1.9.1"><span class="ltx_text ltx_font_bold" id="S5.T5.1.9.1.1">CV</span></td> <td class="ltx_td ltx_align_left ltx_border_t" id="S5.T5.1.9.2">BF</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T5.1.9.3">40</td> <td class="ltx_td ltx_border_t" id="S5.T5.1.9.4"></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T5.1.9.5">120</td> <td class="ltx_td ltx_border_t" id="S5.T5.1.9.6"></td> </tr> <tr class="ltx_tr" id="S5.T5.1.10"> 
<td class="ltx_td" id="S5.T5.1.10.1"></td> <td class="ltx_td ltx_align_left" id="S5.T5.1.10.2">SH</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.10.3">18</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.10.4">2.22x</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.10.5">55</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.10.6">2.18x</td> </tr> <tr class="ltx_tr" id="S5.T5.1.11"> <td class="ltx_td ltx_align_left" id="S5.T5.1.11.1"> X-Ray</td> <td class="ltx_td ltx_align_left" id="S5.T5.1.11.2">FS</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.11.3">13</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.11.4">3.08x</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.11.5">38</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.11.6">3.16x</td> </tr> <tr class="ltx_tr" id="S5.T5.1.12"> <td class="ltx_td ltx_align_left" id="S5.T5.1.12.1">MedMNIST</td> <td class="ltx_td ltx_align_left" id="S5.T5.1.12.2">FS</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.12.3">15</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.12.4">2.67x</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.12.5">37</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.12.6">3.24x</td> </tr> <tr class="ltx_tr" id="S5.T5.1.13"> <td class="ltx_td ltx_align_left" id="S5.T5.1.13.1">Flowers</td> <td class="ltx_td ltx_align_left" id="S5.T5.1.13.2">FS</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.13.3">15</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.13.4">2.67x</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.13.5">36</td> <td class="ltx_td ltx_align_center" id="S5.T5.1.13.6">3.33x</td> </tr> <tr class="ltx_tr" id="S5.T5.1.14"> <td class="ltx_td ltx_align_left ltx_border_b" id="S5.T5.1.14.1">Beans</td> <td class="ltx_td ltx_align_left ltx_border_b" id="S5.T5.1.14.2">FS</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T5.1.14.3">17</td> <td class="ltx_td ltx_align_center 
ltx_border_b" id="S5.T5.1.14.4">2.35x</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T5.1.14.5">41</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T5.1.14.6">2.93x</td> </tr> </table> </figure> </section> <section class="ltx_subsubsection" id="S5.SS3.SSS3"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection"><span class="ltx_text" id="S5.SS3.SSS3.4.1.1">V-C</span>3 </span>Scaling to more models</h4> <div class="ltx_para" id="S5.SS3.SSS3.p1"> <p class="ltx_p" id="S5.SS3.SSS3.p1.1">Beyond the 10 models obtained from coarse-recall, we further explore how our method performs as the number of models increases. Fig. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.F7" title="Figure 7 ‣ V-C1 Performance ‣ V-C Experiments for Fine-Selection Phase ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">7</span></a> and Table <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.T5" title="TABLE V ‣ V-C2 Time ‣ V-C Experiments for Fine-Selection Phase ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">V</span></a> provide performance and time comparisons for 40 natural language processing models and 30 computer vision models. As the number of models increases, our method not only maintains the quality of model selection but also saves significant time. This indicates that the fine-selection method is suitable not only for fine-grained selection after coarse-grained model recall but also for a larger number of models. 
Therefore, our method can scale to more models, which allows it to cope with the growing number of pre-trained models.</p> </div> </section> </section> <section class="ltx_subsection" id="S5.SS4"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S5.SS4.4.1.1">V-D</span> </span><span class="ltx_text ltx_font_italic" id="S5.SS4.5.2">Overall Performance</span> </h3> <div class="ltx_para" id="S5.SS4.p1"> <p class="ltx_p" id="S5.SS4.p1.2">In this section, we study the end-to-end effectiveness and efficiency of model selection. The comparison methods are again brute-force search (BF) and successive halving (SH), as in the previous section. The total number of models is 40 for NLP and 30 for CV, and the number of recalled models is 10. Table <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.T6" title="TABLE VI ‣ V-D Overall Performance ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">VI</span></a> compares the accuracy and efficiency of the different methods, where CR+FS (denoted 2PH in the table) stands for the two-phase framework (coarse-recall and fine-selection) in this paper. Both SH and the two-phase model selection method proposed in this paper achieve accuracy close to that of BF. Table <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.T6" title="TABLE VI ‣ V-D Overall Performance ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">VI</span></a> also reports the time consumed by the three methods in terms of training epochs. 
For CR+FS, as <math alttext="proxy\_score" class="ltx_Math" display="inline" id="S5.SS4.p1.1.m1.1"><semantics id="S5.SS4.p1.1.m1.1a"><mrow id="S5.SS4.p1.1.m1.1.1" xref="S5.SS4.p1.1.m1.1.1.cmml"><mi id="S5.SS4.p1.1.m1.1.1.2" xref="S5.SS4.p1.1.m1.1.1.2.cmml">p</mi><mo id="S5.SS4.p1.1.m1.1.1.1" xref="S5.SS4.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS4.p1.1.m1.1.1.3" xref="S5.SS4.p1.1.m1.1.1.3.cmml">r</mi><mo id="S5.SS4.p1.1.m1.1.1.1a" xref="S5.SS4.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS4.p1.1.m1.1.1.4" xref="S5.SS4.p1.1.m1.1.1.4.cmml">o</mi><mo id="S5.SS4.p1.1.m1.1.1.1b" xref="S5.SS4.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS4.p1.1.m1.1.1.5" xref="S5.SS4.p1.1.m1.1.1.5.cmml">x</mi><mo id="S5.SS4.p1.1.m1.1.1.1c" xref="S5.SS4.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS4.p1.1.m1.1.1.6" xref="S5.SS4.p1.1.m1.1.1.6.cmml">y</mi><mo id="S5.SS4.p1.1.m1.1.1.1d" xref="S5.SS4.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS4.p1.1.m1.1.1.7" mathvariant="normal" xref="S5.SS4.p1.1.m1.1.1.7.cmml">_</mi><mo id="S5.SS4.p1.1.m1.1.1.1e" xref="S5.SS4.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS4.p1.1.m1.1.1.8" xref="S5.SS4.p1.1.m1.1.1.8.cmml">s</mi><mo id="S5.SS4.p1.1.m1.1.1.1f" xref="S5.SS4.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS4.p1.1.m1.1.1.9" xref="S5.SS4.p1.1.m1.1.1.9.cmml">c</mi><mo id="S5.SS4.p1.1.m1.1.1.1g" xref="S5.SS4.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS4.p1.1.m1.1.1.10" xref="S5.SS4.p1.1.m1.1.1.10.cmml">o</mi><mo id="S5.SS4.p1.1.m1.1.1.1h" xref="S5.SS4.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS4.p1.1.m1.1.1.11" xref="S5.SS4.p1.1.m1.1.1.11.cmml">r</mi><mo id="S5.SS4.p1.1.m1.1.1.1i" xref="S5.SS4.p1.1.m1.1.1.1.cmml">⁢</mo><mi id="S5.SS4.p1.1.m1.1.1.12" xref="S5.SS4.p1.1.m1.1.1.12.cmml">e</mi></mrow><annotation-xml encoding="MathML-Content" id="S5.SS4.p1.1.m1.1b"><apply id="S5.SS4.p1.1.m1.1.1.cmml" xref="S5.SS4.p1.1.m1.1.1"><times id="S5.SS4.p1.1.m1.1.1.1.cmml" xref="S5.SS4.p1.1.m1.1.1.1"></times><ci id="S5.SS4.p1.1.m1.1.1.2.cmml" xref="S5.SS4.p1.1.m1.1.1.2">𝑝</ci><ci id="S5.SS4.p1.1.m1.1.1.3.cmml" 
xref="S5.SS4.p1.1.m1.1.1.3">𝑟</ci><ci id="S5.SS4.p1.1.m1.1.1.4.cmml" xref="S5.SS4.p1.1.m1.1.1.4">𝑜</ci><ci id="S5.SS4.p1.1.m1.1.1.5.cmml" xref="S5.SS4.p1.1.m1.1.1.5">𝑥</ci><ci id="S5.SS4.p1.1.m1.1.1.6.cmml" xref="S5.SS4.p1.1.m1.1.1.6">𝑦</ci><ci id="S5.SS4.p1.1.m1.1.1.7.cmml" xref="S5.SS4.p1.1.m1.1.1.7">_</ci><ci id="S5.SS4.p1.1.m1.1.1.8.cmml" xref="S5.SS4.p1.1.m1.1.1.8">𝑠</ci><ci id="S5.SS4.p1.1.m1.1.1.9.cmml" xref="S5.SS4.p1.1.m1.1.1.9">𝑐</ci><ci id="S5.SS4.p1.1.m1.1.1.10.cmml" xref="S5.SS4.p1.1.m1.1.1.10">𝑜</ci><ci id="S5.SS4.p1.1.m1.1.1.11.cmml" xref="S5.SS4.p1.1.m1.1.1.11">𝑟</ci><ci id="S5.SS4.p1.1.m1.1.1.12.cmml" xref="S5.SS4.p1.1.m1.1.1.12">𝑒</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS4.p1.1.m1.1c">proxy\_score</annotation><annotation encoding="application/x-llamapun" id="S5.SS4.p1.1.m1.1d">italic_p italic_r italic_o italic_x italic_y _ italic_s italic_c italic_o italic_r italic_e</annotation></semantics></math> must be computed in the coarse-recall phase, which requires inference on the target task, we count its computation time as <math alttext="0.5\cdot|MC|" class="ltx_Math" display="inline" id="S5.SS4.p1.2.m2.1"><semantics id="S5.SS4.p1.2.m2.1a"><mrow id="S5.SS4.p1.2.m2.1.1" xref="S5.SS4.p1.2.m2.1.1.cmml"><mn id="S5.SS4.p1.2.m2.1.1.3" xref="S5.SS4.p1.2.m2.1.1.3.cmml">0.5</mn><mo id="S5.SS4.p1.2.m2.1.1.2" lspace="0.222em" rspace="0.222em" xref="S5.SS4.p1.2.m2.1.1.2.cmml">⋅</mo><mrow id="S5.SS4.p1.2.m2.1.1.1.1" xref="S5.SS4.p1.2.m2.1.1.1.2.cmml"><mo id="S5.SS4.p1.2.m2.1.1.1.1.2" stretchy="false" xref="S5.SS4.p1.2.m2.1.1.1.2.1.cmml">|</mo><mrow id="S5.SS4.p1.2.m2.1.1.1.1.1" xref="S5.SS4.p1.2.m2.1.1.1.1.1.cmml"><mi id="S5.SS4.p1.2.m2.1.1.1.1.1.2" xref="S5.SS4.p1.2.m2.1.1.1.1.1.2.cmml">M</mi><mo id="S5.SS4.p1.2.m2.1.1.1.1.1.1" xref="S5.SS4.p1.2.m2.1.1.1.1.1.1.cmml">⁢</mo><mi id="S5.SS4.p1.2.m2.1.1.1.1.1.3" xref="S5.SS4.p1.2.m2.1.1.1.1.1.3.cmml">C</mi></mrow><mo id="S5.SS4.p1.2.m2.1.1.1.1.3" stretchy="false" 
xref="S5.SS4.p1.2.m2.1.1.1.2.1.cmml">|</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S5.SS4.p1.2.m2.1b"><apply id="S5.SS4.p1.2.m2.1.1.cmml" xref="S5.SS4.p1.2.m2.1.1"><ci id="S5.SS4.p1.2.m2.1.1.2.cmml" xref="S5.SS4.p1.2.m2.1.1.2">⋅</ci><cn id="S5.SS4.p1.2.m2.1.1.3.cmml" type="float" xref="S5.SS4.p1.2.m2.1.1.3">0.5</cn><apply id="S5.SS4.p1.2.m2.1.1.1.2.cmml" xref="S5.SS4.p1.2.m2.1.1.1.1"><abs id="S5.SS4.p1.2.m2.1.1.1.2.1.cmml" xref="S5.SS4.p1.2.m2.1.1.1.1.2"></abs><apply id="S5.SS4.p1.2.m2.1.1.1.1.1.cmml" xref="S5.SS4.p1.2.m2.1.1.1.1.1"><times id="S5.SS4.p1.2.m2.1.1.1.1.1.1.cmml" xref="S5.SS4.p1.2.m2.1.1.1.1.1.1"></times><ci id="S5.SS4.p1.2.m2.1.1.1.1.1.2.cmml" xref="S5.SS4.p1.2.m2.1.1.1.1.1.2">𝑀</ci><ci id="S5.SS4.p1.2.m2.1.1.1.1.1.3.cmml" xref="S5.SS4.p1.2.m2.1.1.1.1.1.3">𝐶</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS4.p1.2.m2.1c">0.5\cdot|MC|</annotation><annotation encoding="application/x-llamapun" id="S5.SS4.p1.2.m2.1d">0.5 ⋅ | italic_M italic_C |</annotation></semantics></math> epochs, because inference does not need to compute gradients for back-propagation. Table <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.T6" title="TABLE VI ‣ V-D Overall Performance ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">VI</span></a> shows that the two-phase model selection method is about 2.5x to 4x faster than SH and about 5.5x to 10.5x faster than BF. The speedup comes from both phases: the coarse-recall phase greatly reduces the number of models that need to be fine-tuned, and the fine-selection phase further reduces the fine-tuning time by exploiting the convergence trend. 
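</p> </div> <div class="ltx_para"> <p class="ltx_p">The epoch accounting behind these speedups can be reproduced with a short Python sketch. It is an illustration under assumptions rather than the authors’ code: the function names are hypothetical, the floor-halving schedule is an assumed form of successive halving (it happens to match the SH runtimes in Table V), and the number of clusters |MC| = 10 is inferred from the 5-epoch gap between the FS runtimes in Table V and the 2PH runtimes in Table VI.</p>

```python
def brute_force_epochs(n_models, n_epochs):
    """Fine-tune every model for the full number of epochs."""
    return n_models * n_epochs


def successive_halving_epochs(n_models, n_epochs):
    """Assumed schedule: train survivors one epoch, then keep floor(half)."""
    total, alive = 0, n_models
    for _ in range(n_epochs):
        total += alive
        alive = max(1, alive // 2)
    return total


def two_phase_epochs(fs_epochs, n_clusters):
    """Fine-selection epochs plus 0.5 epoch of inference per cluster."""
    return fs_epochs + 0.5 * n_clusters


# NLP setting: 40 models, 5 epochs; Tweet needs 14 FS epochs on 10 recalled models.
bf = brute_force_epochs(40, 5)             # 200
sh = successive_halving_epochs(40, 5)      # 77
two_phase = two_phase_epochs(14, 10)       # 19.0 (|MC| = 10 is an assumption)

print(round(bf / two_phase, 2))  # 10.53
print(round(sh / two_phase, 2))  # 4.05
```

<p class="ltx_p">Under these assumptions the sketch recovers the Tweet row of Table VI (runtime 19, 10.53x over BF, 4.05x over SH), which makes explicit that the coarse-recall cost is small compared with the fine-tuning epochs it avoids.</p> </div> <div class="ltx_para"> <p class="ltx_p">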
The results demonstrate that the model selection method proposed in this paper achieves high accuracy with significantly lower compute time and is therefore better suited to large-scale model repositories.</p> </div> <figure class="ltx_table" id="S5.T6"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">TABLE VI: </span>End-to-end comparison of time and performance among different model selection methods. 2PH, BF, and SH are short for two-phase (coarse-recall + fine-selection), brute-force search, and successive halving, respectively. Acc is the accuracy metric.</figcaption> <table class="ltx_tabular ltx_centering ltx_align_middle" id="S5.T6.1"> <tr class="ltx_tr" id="S5.T6.1.1"> <td class="ltx_td ltx_border_t" id="S5.T6.1.1.1"></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T6.1.1.2">Runtime</td> <td class="ltx_td ltx_align_center ltx_border_t" colspan="2" id="S5.T6.1.1.3">Speedup</td> <td class="ltx_td ltx_align_center ltx_border_t" colspan="3" id="S5.T6.1.1.4">Acc</td> </tr> <tr class="ltx_tr" id="S5.T6.1.2"> <td class="ltx_td" id="S5.T6.1.2.1"></td> <td class="ltx_td ltx_align_center" id="S5.T6.1.2.2">2PH</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.2.3">(vs.BF)</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.2.4">(vs.SH)</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.2.5">BF</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.2.6">SH</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.2.7">2PH</td> </tr> <tr class="ltx_tr" id="S5.T6.1.3"> <td class="ltx_td ltx_align_left ltx_border_t" id="S5.T6.1.3.1">Tweet</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T6.1.3.2">19</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T6.1.3.3">10.53x</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T6.1.3.4">4.05x</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T6.1.3.5">0.650</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T6.1.3.6">0.60</td> <td 
class="ltx_td ltx_align_center ltx_border_t" id="S5.T6.1.3.7">0.650</td> </tr> <tr class="ltx_tr" id="S5.T6.1.4"> <td class="ltx_td ltx_align_left" id="S5.T6.1.4.1">MNLI</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.4.2">19</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.4.3">10.53x</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.4.4">4.05x</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.4.5">0.850</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.4.6">0.850</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.4.7">0.850</td> </tr> <tr class="ltx_tr" id="S5.T6.1.5"> <td class="ltx_td ltx_align_left" id="S5.T6.1.5.1">MultiRC</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.5.2">20</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.5.3">10.00x</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.5.4">3.85x</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.5.5">0.640</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.5.6">0.630</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.5.7">0.630</td> </tr> <tr class="ltx_tr" id="S5.T6.1.6"> <td class="ltx_td ltx_align_left" id="S5.T6.1.6.1">Boolq</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.6.2">21</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.6.3">9.52x</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.6.4">3.67x</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.6.5">0.720</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.6.6">0.720</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.6.7">0.720</td> </tr> <tr class="ltx_tr" id="S5.T6.1.7"> <td class="ltx_td ltx_align_left" id="S5.T6.1.7.1"> X-Ray</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.7.2">18</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.7.3">6.67x</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.7.4">3.06x</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.7.5">0.969</td> <td class="ltx_td ltx_align_center" 
id="S5.T6.1.7.6">0.962</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.7.7">0.962</td> </tr> <tr class="ltx_tr" id="S5.T6.1.8"> <td class="ltx_td ltx_align_left" id="S5.T6.1.8.1">MedMNIST</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.8.2">20</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.8.3">6.00x</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.8.4">2.75x</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.8.5">0.779</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.8.6">0.773</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.8.7">0.773</td> </tr> <tr class="ltx_tr" id="S5.T6.1.9"> <td class="ltx_td ltx_align_left" id="S5.T6.1.9.1">Flowers</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.9.2">20</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.9.3">6.00x</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.9.4">2.75x</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.9.5">0.986</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.9.6">0.986</td> <td class="ltx_td ltx_align_center" id="S5.T6.1.9.7">0.985</td> </tr> <tr class="ltx_tr" id="S5.T6.1.10"> <td class="ltx_td ltx_align_left ltx_border_b" id="S5.T6.1.10.1">Beans</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T6.1.10.2">22</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T6.1.10.3">5.45x</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T6.1.10.4">2.50x</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T6.1.10.5">0.968</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T6.1.10.6">0.961</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T6.1.10.7">0.961</td> </tr> </table> </figure> <figure class="ltx_table" id="S5.T7"> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_table">TABLE VII: </span>Case study of the final selected model after coarse-recall and fine-selection on four datasets. Acc and R are short for accuracy and rank. 
The rank for CR is obtained by sorting the recalled models by proxy score. Avg_Acc is the average accuracy of the recalled models.</figcaption> <table class="ltx_tabular ltx_centering ltx_align_middle" id="S5.T7.1"> <tr class="ltx_tr" id="S5.T7.1.1"> <td class="ltx_td ltx_align_left ltx_border_t" id="S5.T7.1.1.1">Dataset</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T7.1.1.2">Best_model</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T7.1.1.3">Acc</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T7.1.1.4">R@CR</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T7.1.1.5">Avg_Acc</td> </tr> <tr class="ltx_tr" id="S5.T7.1.2"> <td class="ltx_td ltx_align_left ltx_border_t" id="S5.T7.1.2.1">MultiRC</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T7.1.2.2">albert-base-v2</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T7.1.2.3"><span class="ltx_text ltx_font_bold" id="S5.T7.1.2.3.1">0.630</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T7.1.2.4">5</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T7.1.2.5">0.574</td> </tr> <tr class="ltx_tr" id="S5.T7.1.3"> <td class="ltx_td ltx_align_left" id="S5.T7.1.3.1">Boolq</td> <td class="ltx_td ltx_align_center" id="S5.T7.1.3.2">bert-base-uncased-mnli</td> <td class="ltx_td ltx_align_center" id="S5.T7.1.3.3"><span class="ltx_text ltx_font_bold" id="S5.T7.1.3.3.1">0.720</span></td> <td class="ltx_td ltx_align_center" id="S5.T7.1.3.4">0</td> <td class="ltx_td ltx_align_center" id="S5.T7.1.3.5">0.635</td> </tr> <tr class="ltx_tr" id="S5.T7.1.4"> <td class="ltx_td ltx_align_left" id="S5.T7.1.4.1">MedMNIST</td> <td class="ltx_td ltx_align_center" id="S5.T7.1.4.2">vit-base-patch16-384</td> <td class="ltx_td ltx_align_center" id="S5.T7.1.4.3"><span class="ltx_text ltx_font_bold" id="S5.T7.1.4.3.1">0.773</span></td> <td class="ltx_td ltx_align_center" id="S5.T7.1.4.4">1</td> <td class="ltx_td 
ltx_align_center" id="S5.T7.1.4.5">0.768</td> </tr> <tr class="ltx_tr" id="S5.T7.1.5"> <td class="ltx_td ltx_align_left ltx_border_b" id="S5.T7.1.5.1">Flowers</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T7.1.5.2">vit-base-patch16-224</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T7.1.5.3"><span class="ltx_text ltx_font_bold" id="S5.T7.1.5.3.1">0.985</span></td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T7.1.5.4">9</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S5.T7.1.5.5">0.891</td> </tr> </table> </figure> </section> <section class="ltx_subsection" id="S5.SS5"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S5.SS5.4.1.1">V-E</span> </span><span class="ltx_text ltx_font_italic" id="S5.SS5.5.2">Generalization Study</span> </h3> <div class="ltx_para" id="S5.SS5.p1"> <p class="ltx_p" id="S5.SS5.p1.1">The generalization capability of the proposed method to new target tasks rests on three aspects. First, it can address new tasks whose domain distributions and task types differ from those of the benchmark datasets, because the benchmark datasets are used only to measure model similarity offline and are used in neither the coarse-recall nor the fine-selection phase. Second, the LEEP score used in the coarse-recall phase has been shown to measure the transferability between heterogeneous tasks. 
Thirdly, the fine-selection phase selects models directly from their fine-tuning results at different validation intervals, which gives a more accurate estimate of the final transferability of a source model.</p> </div> <div class="ltx_para" id="S5.SS5.p2"> <p class="ltx_p" id="S5.SS5.p2.1">Table <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.T7" title="TABLE VII ‣ V-D Overall Performance ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">VII</span></a> shows the best models selected for four target tasks whose domain distribution and task type differ from those of the benchmark datasets. The best models are ranked high in the coarse-recall phase, and their accuracies are all higher than the average accuracy of all models. Boolq is a yes/no question answering task over given passages, and the best model selected for it is bert-base-uncased-mnli, a BERT model fine-tuned on the MNLI dataset. Since both the input format and the output label space differ between Boolq and MNLI, this result demonstrates that the proposed method can capture latent transferability between heterogeneous tasks. On the other hand, the MultiRC dataset selects albert-base-v2 as the best model, which is a pre-trained model not fine-tuned on any downstream dataset. 
For computer vision tasks, both the Flowers and MedMNIST datasets select vit-base-patch16 models, which were trained on data whose domain distribution differs from that of the corresponding tasks, demonstrating the out-of-domain capability of the proposed method on CV tasks.</p> </div> </section> </section> <section class="ltx_section" id="S6"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">VI </span><span class="ltx_text ltx_font_smallcaps" id="S6.1.1">Related Work</span> </h2> <div class="ltx_para" id="S6.p1"> <p class="ltx_p" id="S6.p1.1">Pre-training can be viewed as an application of deep transfer learning <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib54" title="">54</a>]</cite>, where a model is pre-trained on an upstream dataset and the pre-trained model is then used as parameter initialization and trained further on the datasets of various downstream tasks. The upstream task can be unsupervised, such as masked language modeling <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib9" title="">9</a>]</cite><cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib16" title="">16</a>]</cite> for natural language processing, or supervised, such as image classification for computer vision <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib55" title="">55</a>]</cite>. Pre-trained models help downstream tasks train better, especially when the training data of the downstream task is limited. Meanwhile, pre-trained models are usually safer to publish than training datasets. 
Therefore, thousands of pre-trained models have been published on model hub websites such as HuggingFace <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib1" title="">1</a>]</cite>, PyTorch-Hub <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib56" title="">56</a>]</cite>, etc. On the other hand, previous work has demonstrated that, for the same downstream task, training performance may vary considerably depending on which pre-trained model is fine-tuned <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib3" title="">3</a>]</cite>. Hence, selecting a good pre-trained model from the model repository is an important step toward high training performance on the target task.</p> </div> <div class="ltx_para" id="S6.p2"> <p class="ltx_p" id="S6.p2.1">As the number of models in the model repository grows, it becomes computationally infeasible to select a good model by fine-tuning all models on the target task, and a few methods have been proposed to speed up the model selection process. We divide existing model selection methods into two categories: light-weight proxy score computation and model selection during fine-tuning. 
Specifically, among the methods based on light-weight proxy score computation, Task2Vec <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib57" title="">57</a>]</cite> embeds the upstream and downstream tasks into the same vector space, so that models trained on upstream tasks close to the downstream task can be selected directly; LEEP <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib11" title="">11</a>]</cite> measures the transferability of a pre-trained model to the target classification task; and in <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib12" title="">12</a>]</cite>, a KNN classifier is built on the hidden-layer outputs of pre-trained models to approximate the training effect after fine-tuning. Among the methods that select models during fine-tuning, Palette <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib3" title="">3</a>]</cite> adopts successive halving <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib56" title="">56</a>]</cite> to filter out poorly performing models at early training steps, and Shift <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib4" title="">4</a>]</cite> builds a cost model to predict the training cost of successive halving and fine-tuning directly. The two-phase model selection framework can be viewed as combining the advantages of the two categories. 
As the two-phase model selection framework is flexible, other model selection methods can be plugged into the corresponding phase. For example, model performance on benchmark datasets, as in <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib58" title="">58</a>]</cite>, can be combined with LEEP <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib11" title="">11</a>]</cite> in the coarse-recall phase, and multi-model selection methods <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib3" title="">3</a>]</cite><cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib59" title="">59</a>]</cite><cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib60" title="">60</a>]</cite><cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib61" title="">61</a>]</cite> can be applied in the fine-selection phase to achieve high ensemble performance.</p> </div> <div class="ltx_para" id="S6.p3"> <p class="ltx_p" id="S6.p3.1">In this paper, we also propose to cluster models and mine convergence trends to speed up the coarse-recall and fine-selection phases. Both the clustering and the mining process depend on the training performance of pre-trained models on benchmark datasets, such as GLUE <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib5" title="">5</a>]</cite> for natural language processing. 
As thousands of datasets have also been published online, for example on HuggingFace datasets <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib1" title="">1</a>]</cite>, methods such as <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib57" title="">57</a>]</cite><cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#bib.bib62" title="">62</a>]</cite> could also be applied to measure the similarity between task datasets and help build more effective benchmark datasets that cover a wider range of tasks for model selection.</p> </div> </section> <section class="ltx_section" id="S7"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">VII </span><span class="ltx_text ltx_font_smallcaps" id="S7.1.1">Future Work</span> </h2> <div class="ltx_para" id="S7.p1"> <p class="ltx_p" id="S7.p1.1">Future work can be summarized in three aspects. Firstly, the coarse-recall phase aims to retain the high-performance pre-trained models among the top-ranked models. A single proxy-score measurement may not be enough to return high-performance models for different machine learning tasks. We plan to combine different light-weight proxy tasks to return a high-quality subset of models more robustly. Secondly, in this paper, we build the benchmark datasets empirically. As a large number of datasets have been published, we will study data-driven methods to build benchmark datasets that cover more types of machine learning tasks while being more compact, so that the performance matrix is cheaper to maintain. Thirdly, as model selection is an important step in the machine learning pipeline, the capability to automatically select high-performance models enables us to build the whole machine learning pipeline for a new task. 
We plan to build a data management system that stores and maintains pre-trained models and datasets and supports efficient automatic model selection, helping users complete model training for new tasks.</p> </div> </section> <section class="ltx_section" id="S8"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">VIII </span><span class="ltx_text ltx_font_smallcaps" id="S8.1.1">Conclusion</span> </h2> <div class="ltx_para" id="S8.p1"> <p class="ltx_p" id="S8.p1.1">In this paper, we propose a two-phase framework for fast model selection, where the coarse-recall phase uses light-weight proxy tasks to recall a much smaller set of candidate models and the fine-selection phase fine-tunes only the models recalled in the first phase to select the best model. To speed up these two phases, we build a performance matrix by fine-tuning pre-trained models on benchmark datasets, cluster models based on the performance matrix to avoid duplicated proxy-score computation in the coarse-recall phase, and mine convergence trends to filter out poorly performing models earlier in the fine-selection phase. Experimental results on natural language processing and computer vision tasks demonstrate that the proposed methods can select a good model for a new task much faster than baseline methods.</p> </div> </section> <section class="ltx_section" id="S9"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">IX </span><span class="ltx_text ltx_font_smallcaps" id="S9.1.1">Acknowledgment</span> </h2> <div class="ltx_para" id="S9.p1"> <p class="ltx_p" id="S9.p1.1">This work was supported by the National Natural Science Foundation of China under Grant No. 62072458 and No. 
62062058.</p> </div> </section> <section class="ltx_bibliography" id="bib"> <h2 class="ltx_title ltx_title_bibliography">References</h2> <ul class="ltx_biblist"> <li class="ltx_bibitem" id="bib.bib1"> <span class="ltx_tag ltx_tag_bibitem">[1]</span> <span class="ltx_bibblock"> T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz <em class="ltx_emph ltx_font_italic" id="bib.bib1.1.1">et al.</em>, “Huggingface’s transformers: State-of-the-art natural language processing,” <em class="ltx_emph ltx_font_italic" id="bib.bib1.2.2">arXiv preprint arXiv:1910.03771</em>, 2019. </span> </li> <li class="ltx_bibitem" id="bib.bib2"> <span class="ltx_tag ltx_tag_bibitem">[2]</span> <span class="ltx_bibblock"> Z. Zhao, Y. Li, C. Hou, J. Zhao, R. Tian, W. Liu, Y. Chen, N. Sun, H. Liu, W. Mao <em class="ltx_emph ltx_font_italic" id="bib.bib2.1.1">et al.</em>, “Tencentpretrain: A scalable and flexible toolkit for pre-training models of different modalities,” <em class="ltx_emph ltx_font_italic" id="bib.bib2.2.2">arXiv preprint arXiv:2212.06385</em>, 2022. </span> </li> <li class="ltx_bibitem" id="bib.bib3"> <span class="ltx_tag ltx_tag_bibitem">[3]</span> <span class="ltx_bibblock"> Y. Li, Y. Shen, and L. Chen, “Palette: towards multi-source model selection and ensemble for reuse,” in <em class="ltx_emph ltx_font_italic" id="bib.bib3.1.1">2021 IEEE 37th International Conference on Data Engineering (ICDE)</em>.   IEEE, 2021, pp. 2147–2152. </span> </li> <li class="ltx_bibitem" id="bib.bib4"> <span class="ltx_tag ltx_tag_bibitem">[4]</span> <span class="ltx_bibblock"> C. Renggli, X. Yao, L. Kolar, L. Rimanic, A. Klimovic, and C. Zhang, “Shift: an efficient, flexible search engine for transfer learning,” <em class="ltx_emph ltx_font_italic" id="bib.bib4.1.1">arXiv preprint arXiv:2204.01457</em>, 2022. </span> </li> <li class="ltx_bibitem" id="bib.bib5"> <span class="ltx_tag ltx_tag_bibitem">[5]</span> <span class="ltx_bibblock"> A. 
Wang, A. Singh, J. Michael, F. Hill, O. Levy, and S. R. Bowman, “Glue: A multi-task benchmark and analysis platform for natural language understanding,” <em class="ltx_emph ltx_font_italic" id="bib.bib5.1.1">arXiv preprint arXiv:1804.07461</em>, 2018. </span> </li> <li class="ltx_bibitem" id="bib.bib6"> <span class="ltx_tag ltx_tag_bibitem">[6]</span> <span class="ltx_bibblock"> C. Wah, S. Branson, P. Welinder, P. Perona, and S. Belongie, “The caltech-ucsd birds-200-2011 dataset,” California Institute of Technology, Tech. Rep. CNS-TR-2011-001, 2011. </span> </li> <li class="ltx_bibitem" id="bib.bib7"> <span class="ltx_tag ltx_tag_bibitem">[7]</span> <span class="ltx_bibblock"> S. Kornblith, J. Shlens, and Q. V. Le, “Do better imagenet models transfer better?” in <em class="ltx_emph ltx_font_italic" id="bib.bib7.1.1">Proceedings of the IEEE/CVF conference on computer vision and pattern recognition</em>, 2019, pp. 2661–2671. </span> </li> <li class="ltx_bibitem" id="bib.bib8"> <span class="ltx_tag ltx_tag_bibitem">[8]</span> <span class="ltx_bibblock"> J. Dodge, G. Ilharco, R. Schwartz, A. Farhadi, H. Hajishirzi, and N. Smith, “Fine-tuning pretrained language models: Weight initializations, data orders, and early stopping,” <em class="ltx_emph ltx_font_italic" id="bib.bib8.1.1">arXiv preprint arXiv:2002.06305</em>, 2020. </span> </li> <li class="ltx_bibitem" id="bib.bib9"> <span class="ltx_tag ltx_tag_bibitem">[9]</span> <span class="ltx_bibblock"> J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “Bert: Pre-training of deep bidirectional transformers for language understanding,” <em class="ltx_emph ltx_font_italic" id="bib.bib9.1.1">arXiv preprint arXiv:1810.04805</em>, 2018. </span> </li> <li class="ltx_bibitem" id="bib.bib10"> <span class="ltx_tag ltx_tag_bibitem">[10]</span> <span class="ltx_bibblock"> J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. 
Fei-Fei, “Imagenet: A large-scale hierarchical image database,” in <em class="ltx_emph ltx_font_italic" id="bib.bib10.1.1">2009 IEEE conference on computer vision and pattern recognition</em>.   Ieee, 2009, pp. 248–255. </span> </li> <li class="ltx_bibitem" id="bib.bib11"> <span class="ltx_tag ltx_tag_bibitem">[11]</span> <span class="ltx_bibblock"> C. Nguyen, T. Hassner, M. Seeger, and C. Archambeau, “Leep: A new measure to evaluate transferability of learned representations,” in <em class="ltx_emph ltx_font_italic" id="bib.bib11.1.1">International Conference on Machine Learning</em>.   PMLR, 2020, pp. 7294–7305. </span> </li> <li class="ltx_bibitem" id="bib.bib12"> <span class="ltx_tag ltx_tag_bibitem">[12]</span> <span class="ltx_bibblock"> C. Renggli, A. S. Pinto, L. Rimanic, J. Puigcerver, C. Riquelme, C. Zhang, and M. Lučić, “Which model to transfer? finding the needle in the growing haystack,” in <em class="ltx_emph ltx_font_italic" id="bib.bib12.1.1">Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</em>, 2022, pp. 9205–9214. </span> </li> <li class="ltx_bibitem" id="bib.bib13"> <span class="ltx_tag ltx_tag_bibitem">[13]</span> <span class="ltx_bibblock"> J. A. Hartigan and M. A. Wong, “Algorithm as 136: A k-means clustering algorithm,” <em class="ltx_emph ltx_font_italic" id="bib.bib13.1.1">Journal of the royal statistical society. series c (applied statistics)</em>, vol. 28, no. 1, pp. 100–108, 1979. </span> </li> <li class="ltx_bibitem" id="bib.bib14"> <span class="ltx_tag ltx_tag_bibitem">[14]</span> <span class="ltx_bibblock"> F. Murtagh and P. Contreras, “Algorithms for hierarchical clustering: an overview,” <em class="ltx_emph ltx_font_italic" id="bib.bib14.1.1">Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery</em>, vol. 2, no. 1, pp. 86–97, 2012. </span> </li> <li class="ltx_bibitem" id="bib.bib15"> <span class="ltx_tag ltx_tag_bibitem">[15]</span> <span class="ltx_bibblock"> J.-W. Cui, W.-H. 
Shi, H.-L. Tao, W. Lu, and X.-Y. Du, “technical report: https://github.com/plasware/two-phase-selection/blob/main/tech-report-137.pdf,” <em class="ltx_emph ltx_font_italic" id="bib.bib15.1.1">unpublished</em>. </span> </li> <li class="ltx_bibitem" id="bib.bib16"> <span class="ltx_tag ltx_tag_bibitem">[16]</span> <span class="ltx_bibblock"> Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, “Roberta: A robustly optimized bert pretraining approach,” <em class="ltx_emph ltx_font_italic" id="bib.bib16.1.1">arXiv preprint arXiv:1907.11692</em>, 2019. </span> </li> <li class="ltx_bibitem" id="bib.bib17"> <span class="ltx_tag ltx_tag_bibitem">[17]</span> <span class="ltx_bibblock"> B. Wu, C. Xu, X. Dai, A. Wan, P. Zhang, Z. Yan, M. Tomizuka, J. Gonzalez, K. Keutzer, and P. Vajda, “Visual transformers: Token-based image representation and processing for computer vision,” <em class="ltx_emph ltx_font_italic" id="bib.bib17.1.1">arXiv preprint arXiv:2006.03677</em>, 2020. </span> </li> <li class="ltx_bibitem" id="bib.bib18"> <span class="ltx_tag ltx_tag_bibitem">[18]</span> <span class="ltx_bibblock"> H. Bao, L. Dong, S. Piao, and F. Wei, “Beit: Bert pre-training of image transformers,” <em class="ltx_emph ltx_font_italic" id="bib.bib18.1.1">arXiv preprint arXiv:2106.08254</em>, 2021. </span> </li> <li class="ltx_bibitem" id="bib.bib19"> <span class="ltx_tag ltx_tag_bibitem">[19]</span> <span class="ltx_bibblock"> H. Touvron, M. Cord, M. Douze, F. Massa, A. Sablayrolles, and H. Jégou, “Training data-efficient image transformers &amp; distillation through attention,” in <em class="ltx_emph ltx_font_italic" id="bib.bib19.1.1">International conference on machine learning</em>.   PMLR, 2021, pp. 10 347–10 357. </span> </li> <li class="ltx_bibitem" id="bib.bib20"> <span class="ltx_tag ltx_tag_bibitem">[20]</span> <span class="ltx_bibblock"> W. Yu, M. Luo, P. Zhou, C. Si, Y. Zhou, X. Wang, J. Feng, and S. 
Yan, “Metaformer is actually what you need for vision,” in <em class="ltx_emph ltx_font_italic" id="bib.bib20.1.1">Proceedings of the IEEE/CVF conference on computer vision and pattern recognition</em>, 2022, pp. 10 819–10 829. </span> </li> <li class="ltx_bibitem" id="bib.bib21"> <span class="ltx_tag ltx_tag_bibitem">[21]</span> <span class="ltx_bibblock"> A. Hassani and H. Shi, “Dilated neighborhood attention transformer,” <em class="ltx_emph ltx_font_italic" id="bib.bib21.1.1">arXiv preprint arXiv:2209.15001</em>, 2022. </span> </li> <li class="ltx_bibitem" id="bib.bib22"> <span class="ltx_tag ltx_tag_bibitem">[22]</span> <span class="ltx_bibblock"> M.-H. Guo, C.-Z. Lu, Z.-N. Liu, M.-M. Cheng, and S.-M. Hu, “Visual attention network,” <em class="ltx_emph ltx_font_italic" id="bib.bib22.1.1">arXiv preprint arXiv:2202.09741</em>, 2022. </span> </li> <li class="ltx_bibitem" id="bib.bib23"> <span class="ltx_tag ltx_tag_bibitem">[23]</span> <span class="ltx_bibblock"> A. Krizhevsky, G. Hinton <em class="ltx_emph ltx_font_italic" id="bib.bib23.1.1">et al.</em>, “Learning multiple layers of features from tiny images,” 2009. </span> </li> <li class="ltx_bibitem" id="bib.bib24"> <span class="ltx_tag ltx_tag_bibitem">[24]</span> <span class="ltx_bibblock"> A. Wang, Y. Pruksachatkun, N. Nangia, A. Singh, J. Michael, F. Hill, O. Levy, and S. R. Bowman, “Superglue: A stickier benchmark for general-purpose language understanding systems,” <em class="ltx_emph ltx_font_italic" id="bib.bib24.1.1">arXiv preprint arXiv:1905.00537</em>, 2019. </span> </li> <li class="ltx_bibitem" id="bib.bib25"> <span class="ltx_tag ltx_tag_bibitem">[25]</span> <span class="ltx_bibblock"> A. L. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y. Ng, and C. Potts, “Learning word vectors for sentiment analysis,” in <em class="ltx_emph ltx_font_italic" id="bib.bib25.1.1">Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies</em>.   
Portland, Oregon, USA: Association for Computational Linguistics, June 2011, pp. 142–150. [Online]. Available: http://www.aclweb.org/anthology/P11-1015 </span> </li> <li class="ltx_bibitem" id="bib.bib26"> <span class="ltx_tag ltx_tag_bibitem">[26]</span> <span class="ltx_bibblock"> Y. Kim, “Convolutional neural networks for sentence classification,” in <em class="ltx_emph ltx_font_italic" id="bib.bib26.1.1">Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</em>, A. Moschitti, B. Pang, and W. Daelemans, Eds.   Doha, Qatar: Association for Computational Linguistics, Oct. 2014, pp. 1746–1751. [Online]. Available: https://aclanthology.org/D14-1181 </span> </li> <li class="ltx_bibitem" id="bib.bib27"> <span class="ltx_tag ltx_tag_bibitem">[27]</span> <span class="ltx_bibblock"> A. Conneau, R. Rinott, G. Lample, A. Williams, S. R. Bowman, H. Schwenk, and V. Stoyanov, “Xnli: Evaluating cross-lingual sentence representations,” in <em class="ltx_emph ltx_font_italic" id="bib.bib27.1.1">Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing</em>.   Association for Computational Linguistics, 2018. </span> </li> <li class="ltx_bibitem" id="bib.bib28"> <span class="ltx_tag ltx_tag_bibitem">[28]</span> <span class="ltx_bibblock"> Y. Nie, A. Williams, E. Dinan, M. Bansal, J. Weston, and D. Kiela, “Adversarial nli: A new benchmark for natural language understanding,” in <em class="ltx_emph ltx_font_italic" id="bib.bib28.1.1">Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</em>.   Association for Computational Linguistics, 2020. </span> </li> <li class="ltx_bibitem" id="bib.bib29"> <span class="ltx_tag ltx_tag_bibitem">[29]</span> <span class="ltx_bibblock"> “Software applications user reviews,” 2017. </span> </li> <li class="ltx_bibitem" id="bib.bib30"> <span class="ltx_tag ltx_tag_bibitem">[30]</span> <span class="ltx_bibblock"> X. Li and D. 
Roth, “Learning question classifiers,” in <em class="ltx_emph ltx_font_italic" id="bib.bib30.1.1">COLING 2002: The 19th International Conference on Computational Linguistics</em>, 2002. [Online]. Available: https://www.aclweb.org/anthology/C02-1150 </span> </li> <li class="ltx_bibitem" id="bib.bib31"> <span class="ltx_tag ltx_tag_bibitem">[31]</span> <span class="ltx_bibblock"> E. Hovy, L. Gerber, U. Hermjakob, C.-Y. Lin, and D. Ravichandran, “Toward semantics-based answer pinpointing,” in <em class="ltx_emph ltx_font_italic" id="bib.bib31.1.1">Proceedings of the First International Conference on Human Language Technology Research</em>, 2001. [Online]. Available: https://www.aclweb.org/anthology/H01-1069 </span> </li> <li class="ltx_bibitem" id="bib.bib32"> <span class="ltx_tag ltx_tag_bibitem">[32]</span> <span class="ltx_bibblock"> M. Marelli, S. Menini, M. Baroni, L. Bentivogli, R. Bernardi, and R. Zamparelli, “A SICK cure for the evaluation of compositional distributional semantic models,” in <em class="ltx_emph ltx_font_italic" id="bib.bib32.1.1">Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14)</em>.   Reykjavik, Iceland: European Language Resources Association (ELRA), May 2014, pp. 216–223. [Online]. Available: http://www.lrec-conf.org/proceedings/lrec2014/pdf/363_Paper.pdf </span> </li> <li class="ltx_bibitem" id="bib.bib33"> <span class="ltx_tag ltx_tag_bibitem">[33]</span> <span class="ltx_bibblock"> P. Malo, A. Sinha, P. Korhonen, J. Wallenius, and P. Takala, “Good debt or bad debt: Detecting semantic orientations in economic texts,” <em class="ltx_emph ltx_font_italic" id="bib.bib33.1.1">Journal of the Association for Information Science and Technology</em>, vol. 65, 2014. </span> </li> <li class="ltx_bibitem" id="bib.bib34"> <span class="ltx_tag ltx_tag_bibitem">[34]</span> <span class="ltx_bibblock"> L. Bossard, M. Guillaumin, and L. 
Van Gool, “Food-101 – mining discriminative components with random forests,” in <em class="ltx_emph ltx_font_italic" id="bib.bib34.1.1">European Conference on Computer Vision</em>, 2014. </span> </li> <li class="ltx_bibitem" id="bib.bib35"> <span class="ltx_tag ltx_tag_bibitem">[35]</span> <span class="ltx_bibblock"> J. Elson, J. J. Douceur, J. Howell, and J. Saul, “Asirra: A captcha that exploits interest-aligned manual image categorization,” in <em class="ltx_emph ltx_font_italic" id="bib.bib35.1.1">Proceedings of 14th ACM Conference on Computer and Communications Security (CCS)</em>.   Association for Computing Machinery, Inc., October 2007. [Online]. Available: https://www.microsoft.com/en-us/research/publication/asirra-a-captcha-that-exploits-interest-aligned-manual-image-categorization/ </span> </li> <li class="ltx_bibitem" id="bib.bib36"> <span class="ltx_tag ltx_tag_bibitem">[36]</span> <span class="ltx_bibblock"> Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” <em class="ltx_emph ltx_font_italic" id="bib.bib36.1.1">Proceedings of the IEEE</em>, vol. 86, no. 11, pp. 2278–2324, 1998. </span> </li> <li class="ltx_bibitem" id="bib.bib37"> <span class="ltx_tag ltx_tag_bibitem">[37]</span> <span class="ltx_bibblock"> F. Barbieri, J. Camacho-Collados, L. Neves, and L. Espinosa-Anke, “Tweeteval: Unified benchmark and comparative evaluation for tweet classification,” <em class="ltx_emph ltx_font_italic" id="bib.bib37.1.1">arXiv preprint arXiv:2010.12421</em>, 2020. </span> </li> <li class="ltx_bibitem" id="bib.bib38"> <span class="ltx_tag ltx_tag_bibitem">[38]</span> <span class="ltx_bibblock"> A. Williams, N. Nangia, and S. R. Bowman, “A broad-coverage challenge corpus for sentence understanding through inference,” <em class="ltx_emph ltx_font_italic" id="bib.bib38.1.1">arXiv preprint arXiv:1704.05426</em>, 2017. 
</span> </li> <li class="ltx_bibitem" id="bib.bib39"> <span class="ltx_tag ltx_tag_bibitem">[39]</span> <span class="ltx_bibblock"> D. Khashabi, S. Chaturvedi, M. Roth, S. Upadhyay, and D. Roth, “Looking beyond the surface: A challenge set for reading comprehension over multiple sentences,” in <em class="ltx_emph ltx_font_italic" id="bib.bib39.1.1">Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)</em>, 2018, pp. 252–262. </span> </li> <li class="ltx_bibitem" id="bib.bib40"> <span class="ltx_tag ltx_tag_bibitem">[40]</span> <span class="ltx_bibblock"> C. Clark, K. Lee, M.-W. Chang, T. Kwiatkowski, M. Collins, and K. Toutanova, “Boolq: Exploring the surprising difficulty of natural yes/no questions,” in <em class="ltx_emph ltx_font_italic" id="bib.bib40.1.1">NAACL</em>, 2019. </span> </li> <li class="ltx_bibitem" id="bib.bib41"> <span class="ltx_tag ltx_tag_bibitem">[41]</span> <span class="ltx_bibblock"> D. S. Kermany, M. Goldbaum, W. Cai, C. C. Valentim, H. Liang, S. L. Baxter, A. McKeown, G. Yang, X. Wu, F. Yan <em class="ltx_emph ltx_font_italic" id="bib.bib41.1.1">et al.</em>, “Identifying medical diagnoses and treatable diseases by image-based deep learning,” <em class="ltx_emph ltx_font_italic" id="bib.bib41.2.2">cell</em>, vol. 172, no. 5, pp. 1122–1131, 2018. </span> </li> <li class="ltx_bibitem" id="bib.bib42"> <span class="ltx_tag ltx_tag_bibitem">[42]</span> <span class="ltx_bibblock"> J. Yang, R. Shi, D. Wei, Z. Liu, L. Zhao, B. Ke, H. Pfister, and B. Ni, “Medmnist v2-a large-scale lightweight benchmark for 2d and 3d biomedical image classification,” <em class="ltx_emph ltx_font_italic" id="bib.bib42.1.1">Scientific Data</em>, vol. 10, no. 1, p. 41, 2023. </span> </li> <li class="ltx_bibitem" id="bib.bib43"> <span class="ltx_tag ltx_tag_bibitem">[43]</span> <span class="ltx_bibblock"> M.-E. Nilsback and A. 
Zisserman, “Automated flower classification over a large number of classes,” in <em class="ltx_emph ltx_font_italic" id="bib.bib43.1.1">2008 Sixth Indian conference on computer vision, graphics &amp; image processing</em>.   IEEE, 2008, pp. 722–729. </span> </li> <li class="ltx_bibitem" id="bib.bib44"> <span class="ltx_tag ltx_tag_bibitem">[44]</span> <span class="ltx_bibblock"> M. A. Lab. (2020, January) Bean disease dataset. [Online]. Available: https://github.com/AI-Lab-Makerere/ibean/ </span> </li> <li class="ltx_bibitem" id="bib.bib45"> <span class="ltx_tag ltx_tag_bibitem">[45]</span> <span class="ltx_bibblock"> P. J. Rousseeuw, “Silhouettes: a graphical aid to the interpretation and validation of cluster analysis,” <em class="ltx_emph ltx_font_italic" id="bib.bib45.1.1">Journal of computational and applied mathematics</em>, vol. 20, pp. 53–65, 1987. </span> </li> <li class="ltx_bibitem" id="bib.bib46"> <span class="ltx_tag ltx_tag_bibitem">[46]</span> <span class="ltx_bibblock"> N. Reimers and I. Gurevych, “Sentence-bert: Sentence embeddings using siamese bert-networks,” in <em class="ltx_emph ltx_font_italic" id="bib.bib46.1.1">Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing</em>.   Association for Computational Linguistics, 11 2019. [Online]. Available: https://arxiv.org/abs/1908.10084 </span> </li> <li class="ltx_bibitem" id="bib.bib47"> <span class="ltx_tag ltx_tag_bibitem">[47]</span> <span class="ltx_bibblock"> L. Sharma, L. Graesser, N. Nangia, and U. Evci, “Natural language understanding with the quora question pairs dataset,” <em class="ltx_emph ltx_font_italic" id="bib.bib47.1.1">arXiv preprint arXiv:1907.01041</em>, 2019. </span> </li> <li class="ltx_bibitem" id="bib.bib48"> <span class="ltx_tag ltx_tag_bibitem">[48]</span> <span class="ltx_bibblock"> A. Warstadt, A. Singh, and S. R. 
Bowman, “Neural network acceptability judgments,” <em class="ltx_emph ltx_font_italic" id="bib.bib48.1.1">Transactions of the Association for Computational Linguistics</em>, vol. 7, pp. 625–641, 2019. </span> </li> <li class="ltx_bibitem" id="bib.bib49"> <span class="ltx_tag ltx_tag_bibitem">[49]</span> <span class="ltx_bibblock"> R. T. McCoy, J. Min, and T. Linzen, “Berts of a feather do not generalize together: Large variability in generalization across models with similar test set performance,” <em class="ltx_emph ltx_font_italic" id="bib.bib49.1.1">arXiv preprint arXiv:1911.02969</em>, 2019. </span> </li> <li class="ltx_bibitem" id="bib.bib50"> <span class="ltx_tag ltx_tag_bibitem">[50]</span> <span class="ltx_bibblock"> M. Caron, H. Touvron, I. Misra, H. Jégou, J. Mairal, P. Bojanowski, and A. Joulin, “Emerging properties in self-supervised vision transformers,” in <em class="ltx_emph ltx_font_italic" id="bib.bib50.1.1">Proceedings of the IEEE/CVF international conference on computer vision</em>, 2021, pp. 9650–9660. </span> </li> <li class="ltx_bibitem" id="bib.bib51"> <span class="ltx_tag ltx_tag_bibitem">[51]</span> <span class="ltx_bibblock"> M. Assran, M. Caron, I. Misra, P. Bojanowski, F. Bordes, P. Vincent, A. Joulin, M. Rabbat, and N. Ballas, “Masked siamese networks for label-efficient learning,” in <em class="ltx_emph ltx_font_italic" id="bib.bib51.1.1">European Conference on Computer Vision</em>.   Springer, 2022, pp. 456–473. </span> </li> <li class="ltx_bibitem" id="bib.bib52"> <span class="ltx_tag ltx_tag_bibitem">[52]</span> <span class="ltx_bibblock"> O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein <em class="ltx_emph ltx_font_italic" id="bib.bib52.1.1">et al.</em>, “Imagenet large scale visual recognition challenge,” <em class="ltx_emph ltx_font_italic" id="bib.bib52.2.2">International journal of computer vision</em>, vol. 115, pp. 211–252, 2015. 
</span> </li> <li class="ltx_bibitem" id="bib.bib53"> <span class="ltx_tag ltx_tag_bibitem">[53]</span> <span class="ltx_bibblock"> T. Ridnik, E. Ben-Baruch, A. Noy, and L. Zelnik-Manor, “Imagenet-21k pretraining for the masses,” <em class="ltx_emph ltx_font_italic" id="bib.bib53.1.1">arXiv preprint arXiv:2104.10972</em>, 2021. </span> </li> <li class="ltx_bibitem" id="bib.bib54"> <span class="ltx_tag ltx_tag_bibitem">[54]</span> <span class="ltx_bibblock"> C. Tan, F. Sun, T. Kong, W. Zhang, C. Yang, and C. Liu, “A survey on deep transfer learning,” in <em class="ltx_emph ltx_font_italic" id="bib.bib54.1.1">Artificial Neural Networks and Machine Learning–ICANN 2018: 27th International Conference on Artificial Neural Networks, Rhodes, Greece, October 4-7, 2018, Proceedings, Part III 27</em>.   Springer, 2018, pp. 270–279. </span> </li> <li class="ltx_bibitem" id="bib.bib55"> <span class="ltx_tag ltx_tag_bibitem">[55]</span> <span class="ltx_bibblock"> A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” <em class="ltx_emph ltx_font_italic" id="bib.bib55.1.1">Advances in neural information processing systems</em>, vol. 25, 2012. </span> </li> <li class="ltx_bibitem" id="bib.bib56"> <span class="ltx_tag ltx_tag_bibitem">[56]</span> <span class="ltx_bibblock"> K. Jamieson and A. Talwalkar, “Non-stochastic best arm identification and hyperparameter optimization,” in <em class="ltx_emph ltx_font_italic" id="bib.bib56.1.1">Artificial intelligence and statistics</em>.   PMLR, 2016, pp. 240–248. </span> </li> <li class="ltx_bibitem" id="bib.bib57"> <span class="ltx_tag ltx_tag_bibitem">[57]</span> <span class="ltx_bibblock"> A. Achille, M. Lam, R. Tewari, A. Ravichandran, S. Maji, C. C. Fowlkes, S. Soatto, and P. Perona, “Task2vec: Task embedding for meta-learning,” in <em class="ltx_emph ltx_font_italic" id="bib.bib57.1.1">Proceedings of the IEEE/CVF international conference on computer vision</em>, 2019, pp. 
6430–6439. </span> </li> <li class="ltx_bibitem" id="bib.bib58"> <span class="ltx_tag ltx_tag_bibitem">[58]</span> <span class="ltx_bibblock"> X. Zhai, J. Puigcerver, A. Kolesnikov, P. Ruyssen, C. Riquelme, M. Lucic, J. Djolonga, A. S. Pinto, M. Neumann, A. Dosovitskiy <em class="ltx_emph ltx_font_italic" id="bib.bib58.1.1">et al.</em>, “The visual task adaptation benchmark,” 2019. </span> </li> <li class="ltx_bibitem" id="bib.bib59"> <span class="ltx_tag ltx_tag_bibitem">[59]</span> <span class="ltx_bibblock"> J.-W. Cui, W. Lu, X. Zhao, and X.-Y. Du, “Efficient model store and reuse in an olml database system,” <em class="ltx_emph ltx_font_italic" id="bib.bib59.1.1">Journal of Computer Science and Technology</em>, vol. 36, pp. 792–805, 2021. </span> </li> <li class="ltx_bibitem" id="bib.bib60"> <span class="ltx_tag ltx_tag_bibitem">[60]</span> <span class="ltx_bibblock"> W. Zhang, J. Jiang, Y. Shao, and B. Cui, “Efficient diversity-driven ensemble for deep neural networks,” in <em class="ltx_emph ltx_font_italic" id="bib.bib60.1.1">2020 IEEE 36th International Conference on Data Engineering (ICDE)</em>.   IEEE, 2020, pp. 73–84. </span> </li> <li class="ltx_bibitem" id="bib.bib61"> <span class="ltx_tag ltx_tag_bibitem">[61]</span> <span class="ltx_bibblock"> Y.-X. Ding and Z.-H. Zhou, “Boosting-based reliable model reuse,” in <em class="ltx_emph ltx_font_italic" id="bib.bib61.1.1">Asian Conference on Machine Learning</em>.   PMLR, 2020, pp. 145–160. </span> </li> <li class="ltx_bibitem" id="bib.bib62"> <span class="ltx_tag ltx_tag_bibitem">[62]</span> <span class="ltx_bibblock"> F. Feng, Y. Yang, D. Cer, N. Arivazhagan, and W. Wang, “Language-agnostic bert sentence embedding,” <em class="ltx_emph ltx_font_italic" id="bib.bib62.1.1">arXiv preprint arXiv:2007.01852</em>, 2020. 
</span> </li> </ul> </section> <div class="ltx_pagination ltx_role_newpage"></div> <section class="ltx_subsection" id="A0.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="A0.SS1.4.1.1">-A</span> </span><span class="ltx_text ltx_font_italic" id="A0.SS1.5.2">MNLI Results</span> </h3> <div class="ltx_para" id="A0.SS1.p1"> <p class="ltx_p" id="A0.SS1.p1.1">Fig. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#A0.F8" title="Figure 8 ‣ -A MNLI Results ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">8</span></a> reports the validation results of different models on the MNLI dataset with a learning rate of 1e-5, in contrast to the 3e-5 used in Fig. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S4.F3" title="Figure 3 ‣ IV-A Early Stopping ‣ IV Fine-Selection ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">3</span></a>. Under the new hyperparameter setting, the training dynamics of the models change: the performance of the top two models no longer declines continuously with further training, suggesting less severe overfitting. This indicates that the training process is highly sensitive to hyperparameter settings. In addition, we apply our two-phase model selection method to the training process under the new hyperparameters, and its performance and efficiency remain consistent. Despite the changes in the training process, the variation in model performance is not large enough to affect the effectiveness of our method. 
These results suggest that our approach is robust to different hyperparameter settings in model training and applicable across various training scenarios.</p> </div> <figure class="ltx_figure" id="A0.F8"> <p class="ltx_p ltx_align_center" id="A0.F8.1"><span class="ltx_text" id="A0.F8.1.1"><img alt="Refer to caption" class="ltx_graphics ltx_img_landscape" height="960" id="A0.F8.1.1.g1" src="extracted/2404.00069v1/mnli_lr_1e-05.png" width="1488"/></span></p> <br class="ltx_break ltx_break"/> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_figure">Figure 8: </span>Validation and test results of the top-10 models on the MNLI dataset. The learning rate is 1e-5, in contrast to the 3e-5 used in Fig. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S4.F3" title="Figure 3 ‣ IV-A Early Stopping ‣ IV Fine-Selection ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">3</span></a>.</figcaption> </figure> </section> <section class="ltx_subsection" id="A0.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="A0.SS2.4.1.1">-B</span> </span><span class="ltx_text ltx_font_italic" id="A0.SS2.5.2">Model Details</span> </h3> <div class="ltx_para" id="A0.SS2.p1"> <p class="ltx_p" id="A0.SS2.p1.1">All pre-trained models we use are from the Hugging Face model hub <span class="ltx_note ltx_role_footnote" id="footnote1"><sup class="ltx_note_mark">1</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">1</sup><span class="ltx_tag ltx_tag_note">1</span>https://huggingface.co/models</span></span></span>. We list the full names of all the NLP and CV models used in our work in Table <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#A0.T8" title="TABLE VIII ‣ -B Model Details ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">VIII</span></a>. 
Note that, to save space, we sometimes abbreviate model names in the main text by dropping the name of the repository to which the model belongs. Even without the repository prefix, each model name remains unique within the list of models we use, so the abbreviated name still identifies the corresponding model.</p> </div> <div class="ltx_para" id="A0.SS2.p2"> <p class="ltx_p" id="A0.SS2.p2.1">All NLP and CV models are listed in Table <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#A0.T8" title="TABLE VIII ‣ -B Model Details ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">VIII</span></a>. All models are available under the URL prefix “https://huggingface.co/”.</p> </div> <figure class="ltx_table" id="A0.T8"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">TABLE VIII: </span>NLP and CV Models</figcaption> <table class="ltx_tabular ltx_centering ltx_align_middle" id="A0.T8.1"> <tr class="ltx_tr" id="A0.T8.1.1"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.1.1"><span class="ltx_text ltx_font_bold" id="A0.T8.1.1.1.1">NLP model name</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.1.2"><span class="ltx_text ltx_font_bold" id="A0.T8.1.1.2.1">CV model name</span></td> </tr> <tr class="ltx_tr" id="A0.T8.1.2"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.2.1">18811449050/bert_finetuning_test</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.2.2">facebook/deit-base-patch16-224</td> </tr> <tr class="ltx_tr" id="A0.T8.1.3"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.3.1">aditeyabaral/finetuned-sail2017-xlm-roberta-base</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.3.2">facebook/deit-base-patch16-384</td> </tr> <tr 
class="ltx_tr" id="A0.T8.1.4"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.4.1">albert-base-v2</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.4.2">facebook/deit-small-patch16-224</td> </tr> <tr class="ltx_tr" id="A0.T8.1.5"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.5.1">aliosm/sha3bor-metre-detector-arabertv2-base</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.5.2">facebook/dino-vitb16</td> </tr> <tr class="ltx_tr" id="A0.T8.1.6"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.6.1">Alireza1044/albert-base-v2-qnli</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.6.2">facebook/dino-vitb8</td> </tr> <tr class="ltx_tr" id="A0.T8.1.7"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.7.1">anirudh21/bert-base-uncased-finetuned-qnli</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.7.2">facebook/dino-vits16</td> </tr> <tr class="ltx_tr" id="A0.T8.1.8"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.8.1">aviator-neural/bert-base-uncased-sst2</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.8.2">facebook/vit-msn-base</td> </tr> <tr class="ltx_tr" id="A0.T8.1.9"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.9.1">aychang/bert-base-cased-trec-coarse</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.9.2">facebook/vit-msn-small</td> </tr> <tr class="ltx_tr" id="A0.T8.1.10"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.10.1">bert-base-uncased</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.10.2">google/vit-base-patch16-224</td> </tr> <tr class="ltx_tr" id="A0.T8.1.11"> <td 
class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.11.1">bondi/bert-semaphore-prediction-w4</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.11.2">google/vit-base-patch16-384</td> </tr> <tr class="ltx_tr" id="A0.T8.1.12"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.12.1">CAMeL-Lab/bert-base-arabic-camelbert-da-sentiment</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.12.2">google/vit-base-patch32-224-in21k</td> </tr> <tr class="ltx_tr" id="A0.T8.1.13"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.13.1">CAMeL-Lab/bert-base-arabic-camelbert-mix-did-nadi</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.13.2">lixiqi/beit-base-patch16-224-pt22k-ft22k-finetuned-FER2013-6e-05</td> </tr> <tr class="ltx_tr" id="A0.T8.1.14"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.14.1">classla/bcms-bertic-parlasent-bcs-ter</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.14.2">lixiqi/beit-base-patch16-224-pt22k-ft22k-finetuned-FER2013-7e-05</td> </tr> <tr class="ltx_tr" id="A0.T8.1.15"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.15.1">connectivity/bert_ft_qqp-1</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.15.2">lixiqi/beit-base-patch16-224-pt22k-ft22k-finetuned-FER-5e-05-3</td> </tr> <tr class="ltx_tr" id="A0.T8.1.16"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.16.1">connectivity/bert_ft_qqp-17</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.16.2">microsoft/beit-base-patch16-224</td> </tr> <tr class="ltx_tr" id="A0.T8.1.17"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.17.1">connectivity/bert_ft_qqp-7</td> <td 
class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.17.2">microsoft/beit-base-patch16-224-pt22k</td> </tr> <tr class="ltx_tr" id="A0.T8.1.18"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.18.1">connectivity/bert_ft_qqp-96</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.18.2">microsoft/beit-base-patch16-224-pt22k-ft22k</td> </tr> <tr class="ltx_tr" id="A0.T8.1.19"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.19.1">dhimskyy/wiki-bert</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.19.2">microsoft/beit-base-patch16-384</td> </tr> <tr class="ltx_tr" id="A0.T8.1.20"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.20.1">distilbert-base-uncased</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.20.2">microsoft/beit-large-patch16-224-pt22k</td> </tr> <tr class="ltx_tr" id="A0.T8.1.21"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.21.1">DoyyingFace/bert-asian-hate-tweets-asian-unclean-freeze-4</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.21.2">mrgiraffe/vit-large-dataset-model-v3</td> </tr> <tr class="ltx_tr" id="A0.T8.1.22"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.22.1">emrecan/bert-base-multilingual-cased-snli_tr</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.22.2">sail/poolformer_m36</td> </tr> <tr class="ltx_tr" id="A0.T8.1.23"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.23.1">gchhablani/bert-base-cased-finetuned-rte</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.23.2">sail/poolformer_m48</td> </tr> <tr class="ltx_tr" id="A0.T8.1.24"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" 
id="A0.T8.1.24.1">gchhablani/bert-base-cased-finetuned-wnli</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.24.2">sail/poolformer_s36</td> </tr> <tr class="ltx_tr" id="A0.T8.1.25"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.25.1">ishan/bert-base-uncased-mnli</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.25.2">shi-labs/dinat-base-in1k-224</td> </tr> <tr class="ltx_tr" id="A0.T8.1.26"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.26.1">jb2k/bert-base-multilingual-cased-language-detection</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.26.2">shi-labs/dinat-large-in22k-in1k-224</td> </tr> <tr class="ltx_tr" id="A0.T8.1.27"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.27.1">Jeevesh8/512seq_len_6ep_bert_ft_cola-91</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.27.2">shi-labs/dinat-large-in22k-in1k-384</td> </tr> <tr class="ltx_tr" id="A0.T8.1.28"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.28.1">Jeevesh8/6ep_bert_ft_cola-47</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.28.2">Visual-Attention-Network/van-base</td> </tr> <tr class="ltx_tr" id="A0.T8.1.29"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.29.1">Jeevesh8/bert_ft_cola-88</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.29.2">Visual-Attention-Network/van-large</td> </tr> <tr class="ltx_tr" id="A0.T8.1.30"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.30.1">Jeevesh8/bert_ft_qqp-40</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.30.2">oschamp/vit-artworkclassifier</td> </tr> <tr class="ltx_tr" id="A0.T8.1.31"> <td class="ltx_td ltx_align_center 
ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.31.1">Jeevesh8/bert_ft_qqp-68</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.31.2">nateraw/vit-age-classifier</td> </tr> <tr class="ltx_tr" id="A0.T8.1.32"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.32.1">Jeevesh8/bert_ft_qqp-9</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.32.2">-</td> </tr> <tr class="ltx_tr" id="A0.T8.1.33"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.33.1">Jeevesh8/feather_berts_46</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.33.2">-</td> </tr> <tr class="ltx_tr" id="A0.T8.1.34"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.34.1">Jeevesh8/init_bert_ft_qqp-24</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.34.2">-</td> </tr> <tr class="ltx_tr" id="A0.T8.1.35"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.35.1">Jeevesh8/init_bert_ft_qqp-33</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.35.2">-</td> </tr> <tr class="ltx_tr" id="A0.T8.1.36"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.36.1">manueltonneau/bert-twitter-en-is-hired</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.36.2">-</td> </tr> <tr class="ltx_tr" id="A0.T8.1.37"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.37.1">roberta-base</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.37.2">-</td> </tr> <tr class="ltx_tr" id="A0.T8.1.38"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.38.1">socialmediaie/TRAC2020_IBEN_B_bert-base-multilingual-uncased</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" 
id="A0.T8.1.38.2">-</td> </tr> <tr class="ltx_tr" id="A0.T8.1.39"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.39.1">Splend1dchan/bert-base-uncased-slue-goldtrascription-e3-lr1e-4</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.39.2">-</td> </tr> <tr class="ltx_tr" id="A0.T8.1.40"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.40.1">XSY/albert-base-v2-imdb-calssification</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T8.1.40.2">-</td> </tr> <tr class="ltx_tr" id="A0.T8.1.41"> <td class="ltx_td ltx_align_center ltx_border_b ltx_border_l ltx_border_r ltx_border_t" id="A0.T8.1.41.1">Guscode/DKbert-hatespeech-detection</td> <td class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="A0.T8.1.41.2">-</td> </tr> </table> </figure> </section> <section class="ltx_subsection" id="A0.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="A0.SS3.4.1.1">-C</span> </span><span class="ltx_text ltx_font_italic" id="A0.SS3.5.2">Dataset Details</span> </h3> <div class="ltx_para" id="A0.SS3.p1"> <p class="ltx_p" id="A0.SS3.p1.1">The NLP and CV datasets are listed in Table IX. Some datasets contain multiple subsets. All datasets are available under the URL prefix “https://huggingface.co/”. GLUE and SuperGLUE are among the most common benchmark datasets in NLP, and CIFAR-10 and MNIST are among the most common benchmark datasets in CV. The other NLP datasets are described below:</p> </div> <div class="ltx_para" id="A0.SS3.p2"> <ul class="ltx_itemize" id="A0.I1"> <li class="ltx_item" id="A0.I1.i1" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="A0.I1.i1.p1"> <p class="ltx_p" id="A0.I1.i1.p1.1"><span class="ltx_text ltx_font_bold" id="A0.I1.i1.p1.1.1">LysandreJik/glue-mnli-train</span> This dataset contains the labelled MNLI dataset. 
The original MNLI dataset in GLUE does not provide labels, which are necessary for our experiments. The task is to predict the relation between a premise and a hypothesis: entailment, contradiction, or neutral. The labels of this dataset are balanced.</p> </div> </li> <li class="ltx_item" id="A0.I1.i2" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="A0.I1.i2.p1"> <p class="ltx_p" id="A0.I1.i2.p1.1"><span class="ltx_text ltx_font_bold" id="A0.I1.i2.p1.1.1">SetFit/qnli</span> This dataset contains the labelled QNLI dataset. The original QNLI dataset in GLUE does not provide labels, which are necessary for our experiments. The task is to predict whether the paragraph contains the answer to the question. The labels of this dataset are balanced.</p> </div> </li> <li class="ltx_item" id="A0.I1.i3" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="A0.I1.i3.p1"> <p class="ltx_p" id="A0.I1.i3.p1.1"><span class="ltx_text ltx_font_bold" id="A0.I1.i3.p1.1.1">xnli</span> This dataset contains a part of the MNLI dataset translated into different languages. The labels of this dataset are balanced.</p> </div> </li> <li class="ltx_item" id="A0.I1.i4" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="A0.I1.i4.p1"> <p class="ltx_p" id="A0.I1.i4.p1.1"><span class="ltx_text ltx_font_bold" id="A0.I1.i4.p1.1.1">stsb_multi_mt</span> The task is to score the similarity between two sentences on a scale of 0 to 5. The labels of this dataset are not balanced.</p> </div> </li> <li class="ltx_item" id="A0.I1.i5" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="A0.I1.i5.p1"> <p class="ltx_p" id="A0.I1.i5.p1.1"><span class="ltx_text ltx_font_bold" id="A0.I1.i5.p1.1.1">anli</span> The task is the same as that of the MNLI dataset. 
However, the dataset is collected through an adversarial procedure. The labels of this dataset are not balanced.</p> </div> </li> <li class="ltx_item" id="A0.I1.i6" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="A0.I1.i6.p1"> <p class="ltx_p" id="A0.I1.i6.p1.1"><span class="ltx_text ltx_font_bold" id="A0.I1.i6.p1.1.1">tweet_eval</span> This is a sentiment analysis task. The dataset is collected from Twitter. The labels of this dataset are not balanced.</p> </div> </li> <li class="ltx_item" id="A0.I1.i7" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="A0.I1.i7.p1"> <p class="ltx_p" id="A0.I1.i7.p1.1"><span class="ltx_text ltx_font_bold" id="A0.I1.i7.p1.1.1">paws</span> This is a paraphrase identification task. The labels of this dataset are not balanced.</p> </div> </li> <li class="ltx_item" id="A0.I1.i8" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="A0.I1.i8.p1"> <p class="ltx_p" id="A0.I1.i8.p1.1"><span class="ltx_text ltx_font_bold" id="A0.I1.i8.p1.1.1">financial_phrasebank</span> This is a sentiment analysis task in the finance domain. The dataset is collected from financial news. The labels of this dataset are not balanced.</p> </div> </li> <li class="ltx_item" id="A0.I1.i9" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="A0.I1.i9.p1"> <p class="ltx_p" id="A0.I1.i9.p1.1"><span class="ltx_text ltx_font_bold" id="A0.I1.i9.p1.1.1">yahoo_answers_topics</span> This is a topic classification task. The dataset is collected from Yahoo Answers. 
The labels of this dataset are balanced.</p> </div> </li> </ul> </div> <div class="ltx_para" id="A0.SS3.p3"> <p class="ltx_p" id="A0.SS3.p3.1">The other CV datasets are described below:</p> </div> <div class="ltx_para" id="A0.SS3.p4"> <ul class="ltx_itemize" id="A0.I2"> <li class="ltx_item" id="A0.I2.i1" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="A0.I2.i1.p1"> <p class="ltx_p" id="A0.I2.i1.p1.1"><span class="ltx_text ltx_font_bold" id="A0.I2.i1.p1.1.1">food101</span> This dataset contains images of 101 kinds of food to be classified. The image sizes vary. The labels of this dataset are balanced.</p> </div> </li> <li class="ltx_item" id="A0.I2.i2" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="A0.I2.i2.p1"> <p class="ltx_p" id="A0.I2.i2.p1.1"><span class="ltx_text ltx_font_bold" id="A0.I2.i2.p1.1.1">nelorth/oxford-flowers</span> This dataset contains images of 102 kinds of flowers to be classified. The image sizes vary. The labels of this dataset are not balanced.</p> </div> </li> <li class="ltx_item" id="A0.I2.i3" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="A0.I2.i3.p1"> <p class="ltx_p" id="A0.I2.i3.p1.1"><span class="ltx_text ltx_font_bold" id="A0.I2.i3.p1.1.1">Matthijs/snacks</span> This dataset contains images of 20 kinds of snacks to be classified. The image sizes vary. The labels of this dataset are slightly unbalanced.</p> </div> </li> <li class="ltx_item" id="A0.I2.i4" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="A0.I2.i4.p1"> <p class="ltx_p" id="A0.I2.i4.p1.1"><span class="ltx_text ltx_font_bold" id="A0.I2.i4.p1.1.1">beans</span> This dataset contains images of 3 kinds of leaves to be classified. All images have the same size. 
The labels of this dataset are balanced.</p> </div> </li> <li class="ltx_item" id="A0.I2.i5" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="A0.I2.i5.p1"> <p class="ltx_p" id="A0.I2.i5.p1.1"><span class="ltx_text ltx_font_bold" id="A0.I2.i5.p1.1.1">cats_vs_dogs</span> This dataset contains images of cats and dogs and is a subset of the Asirra dataset. The image sizes vary. The labels of this dataset are balanced.</p> </div> </li> <li class="ltx_item" id="A0.I2.i6" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="A0.I2.i6.p1"> <p class="ltx_p" id="A0.I2.i6.p1.1"><span class="ltx_text ltx_font_bold" id="A0.I2.i6.p1.1.1">trpakov/chest-xray-classification</span> This dataset contains chest X-ray images. All images have the same size. The labels of this dataset are not balanced.</p> </div> </li> <li class="ltx_item" id="A0.I2.i7" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="A0.I2.i7.p1"> <p class="ltx_p" id="A0.I2.i7.p1.1"><span class="ltx_text ltx_font_bold" id="A0.I2.i7.p1.1.1">alkzar90/CC6204-Hackaton-Cub-Dataset</span> This dataset contains images of birds. The image sizes vary. The labels of this dataset are not balanced.</p> </div> </li> <li class="ltx_item" id="A0.I2.i8" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="A0.I2.i8.p1"> <p class="ltx_p" id="A0.I2.i8.p1.1"><span class="ltx_text ltx_font_bold" id="A0.I2.i8.p1.1.1">albertvillanova/medmnist-v2</span> This dataset contains biomedical images. All images have the same size. 
The labels of this dataset are not balanced.</p> </div> </li> </ul> </div> <figure class="ltx_table" id="A0.T9"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">TABLE IX: </span>NLP and CV Datasets</figcaption> <table class="ltx_tabular ltx_centering ltx_align_middle" id="A0.T9.1"> <tr class="ltx_tr" id="A0.T9.1.1"> <td class="ltx_td ltx_align_center ltx_border_t" id="A0.T9.1.1.1"><span class="ltx_text ltx_font_bold" id="A0.T9.1.1.1.1">NLP dataset name</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="A0.T9.1.1.2"><span class="ltx_text ltx_font_bold" id="A0.T9.1.1.2.1">CV dataset name</span></td> </tr> <tr class="ltx_tr" id="A0.T9.1.2"> <td class="ltx_td ltx_align_center ltx_border_t" id="A0.T9.1.2.1">glue</td> <td class="ltx_td ltx_align_center ltx_border_t" id="A0.T9.1.2.2">food101</td> </tr> <tr class="ltx_tr" id="A0.T9.1.3"> <td class="ltx_td ltx_align_center" id="A0.T9.1.3.1">super_glue</td> <td class="ltx_td ltx_align_center" id="A0.T9.1.3.2">nelorth/oxford-flowers</td> </tr> <tr class="ltx_tr" id="A0.T9.1.4"> <td class="ltx_td ltx_align_center" id="A0.T9.1.4.1">LysandreJik/glue-mnli-train</td> <td class="ltx_td ltx_align_center" id="A0.T9.1.4.2">Matthijs/snacks</td> </tr> <tr class="ltx_tr" id="A0.T9.1.5"> <td class="ltx_td ltx_align_center" id="A0.T9.1.5.1">SetFit/qnli</td> <td class="ltx_td ltx_align_center" id="A0.T9.1.5.2">beans</td> </tr> <tr class="ltx_tr" id="A0.T9.1.6"> <td class="ltx_td ltx_align_center" id="A0.T9.1.6.1">xnli</td> <td class="ltx_td ltx_align_center" id="A0.T9.1.6.2">cats_vs_dogs</td> </tr> <tr class="ltx_tr" id="A0.T9.1.7"> <td class="ltx_td ltx_align_center" id="A0.T9.1.7.1">stsb_multi_mt</td> <td class="ltx_td ltx_align_center" id="A0.T9.1.7.2">trpakov/chest-xray-classification</td> </tr> <tr class="ltx_tr" id="A0.T9.1.8"> <td class="ltx_td ltx_align_center" id="A0.T9.1.8.1">anli</td> <td class="ltx_td ltx_align_center" id="A0.T9.1.8.2">cifar10</td> </tr> <tr class="ltx_tr" 
id="A0.T9.1.9"> <td class="ltx_td ltx_align_center" id="A0.T9.1.9.1">tweet_eval</td> <td class="ltx_td ltx_align_center" id="A0.T9.1.9.2">MNIST</td> </tr> <tr class="ltx_tr" id="A0.T9.1.10"> <td class="ltx_td ltx_align_center" id="A0.T9.1.10.1">paws</td> <td class="ltx_td ltx_align_center" id="A0.T9.1.10.2">alkzar90/CC6204-Hackaton-Cub-Dataset</td> </tr> <tr class="ltx_tr" id="A0.T9.1.11"> <td class="ltx_td ltx_align_center" id="A0.T9.1.11.1">financial_phrasebank</td> <td class="ltx_td ltx_align_center" id="A0.T9.1.11.2">albertvillanova/medmnist-v2</td> </tr> <tr class="ltx_tr" id="A0.T9.1.12"> <td class="ltx_td ltx_align_center ltx_border_b" id="A0.T9.1.12.1">yahoo_answers_topics</td> <td class="ltx_td ltx_align_center ltx_border_b" id="A0.T9.1.12.2">-</td> </tr> </table> </figure> </section> <section class="ltx_subsection" id="A0.SS4"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="A0.SS4.4.1.1">-D</span> </span><span class="ltx_text ltx_font_italic" id="A0.SS4.5.2">Experiment on the Number of Dimensions for Max Average Error</span> </h3> <div class="ltx_para" id="A0.SS4.p1"> <p class="ltx_p" id="A0.SS4.p1.7">As discussed in Eq. 
<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S3.E1" title="In III-A Model Clustering ‣ III Coarse Recall ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">1</span></a> and Section V-B, we use the top-k maximum average error to measure model similarity, and the parameter <math alttext="k" class="ltx_Math" display="inline" id="A0.SS4.p1.1.m1.1"><semantics id="A0.SS4.p1.1.m1.1a"><mi id="A0.SS4.p1.1.m1.1.1" xref="A0.SS4.p1.1.m1.1.1.cmml">k</mi><annotation-xml encoding="MathML-Content" id="A0.SS4.p1.1.m1.1b"><ci id="A0.SS4.p1.1.m1.1.1.cmml" xref="A0.SS4.p1.1.m1.1.1">𝑘</ci></annotation-xml><annotation encoding="application/x-tex" id="A0.SS4.p1.1.m1.1c">k</annotation><annotation encoding="application/x-llamapun" id="A0.SS4.p1.1.m1.1d">italic_k</annotation></semantics></math> may influence the performance of the model selection algorithm. Thus, we test different values of <math alttext="k" class="ltx_Math" display="inline" id="A0.SS4.p1.2.m2.1"><semantics id="A0.SS4.p1.2.m2.1a"><mi id="A0.SS4.p1.2.m2.1.1" xref="A0.SS4.p1.2.m2.1.1.cmml">k</mi><annotation-xml encoding="MathML-Content" id="A0.SS4.p1.2.m2.1b"><ci id="A0.SS4.p1.2.m2.1.1.cmml" xref="A0.SS4.p1.2.m2.1.1">𝑘</ci></annotation-xml><annotation encoding="application/x-tex" id="A0.SS4.p1.2.m2.1c">k</annotation><annotation encoding="application/x-llamapun" id="A0.SS4.p1.2.m2.1d">italic_k</annotation></semantics></math> while keeping all other settings fixed. 
Due to the number of datasets, we choose <math alttext="k=5,10,15" class="ltx_Math" display="inline" id="A0.SS4.p1.3.m3.3"><semantics id="A0.SS4.p1.3.m3.3a"><mrow id="A0.SS4.p1.3.m3.3.4" xref="A0.SS4.p1.3.m3.3.4.cmml"><mi id="A0.SS4.p1.3.m3.3.4.2" xref="A0.SS4.p1.3.m3.3.4.2.cmml">k</mi><mo id="A0.SS4.p1.3.m3.3.4.1" xref="A0.SS4.p1.3.m3.3.4.1.cmml">=</mo><mrow id="A0.SS4.p1.3.m3.3.4.3.2" xref="A0.SS4.p1.3.m3.3.4.3.1.cmml"><mn id="A0.SS4.p1.3.m3.1.1" xref="A0.SS4.p1.3.m3.1.1.cmml">5</mn><mo id="A0.SS4.p1.3.m3.3.4.3.2.1" xref="A0.SS4.p1.3.m3.3.4.3.1.cmml">,</mo><mn id="A0.SS4.p1.3.m3.2.2" xref="A0.SS4.p1.3.m3.2.2.cmml">10</mn><mo id="A0.SS4.p1.3.m3.3.4.3.2.2" xref="A0.SS4.p1.3.m3.3.4.3.1.cmml">,</mo><mn id="A0.SS4.p1.3.m3.3.3" xref="A0.SS4.p1.3.m3.3.3.cmml">15</mn></mrow></mrow><annotation-xml encoding="MathML-Content" id="A0.SS4.p1.3.m3.3b"><apply id="A0.SS4.p1.3.m3.3.4.cmml" xref="A0.SS4.p1.3.m3.3.4"><eq id="A0.SS4.p1.3.m3.3.4.1.cmml" xref="A0.SS4.p1.3.m3.3.4.1"></eq><ci id="A0.SS4.p1.3.m3.3.4.2.cmml" xref="A0.SS4.p1.3.m3.3.4.2">𝑘</ci><list id="A0.SS4.p1.3.m3.3.4.3.1.cmml" xref="A0.SS4.p1.3.m3.3.4.3.2"><cn id="A0.SS4.p1.3.m3.1.1.cmml" type="integer" xref="A0.SS4.p1.3.m3.1.1">5</cn><cn id="A0.SS4.p1.3.m3.2.2.cmml" type="integer" xref="A0.SS4.p1.3.m3.2.2">10</cn><cn id="A0.SS4.p1.3.m3.3.3.cmml" type="integer" xref="A0.SS4.p1.3.m3.3.3">15</cn></list></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.SS4.p1.3.m3.3c">k=5,10,15</annotation><annotation encoding="application/x-llamapun" id="A0.SS4.p1.3.m3.3d">italic_k = 5 , 10 , 15</annotation></semantics></math> for NLP clustering evaluation and <math alttext="k=3,4,5" class="ltx_Math" display="inline" id="A0.SS4.p1.4.m4.3"><semantics id="A0.SS4.p1.4.m4.3a"><mrow id="A0.SS4.p1.4.m4.3.4" xref="A0.SS4.p1.4.m4.3.4.cmml"><mi id="A0.SS4.p1.4.m4.3.4.2" xref="A0.SS4.p1.4.m4.3.4.2.cmml">k</mi><mo id="A0.SS4.p1.4.m4.3.4.1" xref="A0.SS4.p1.4.m4.3.4.1.cmml">=</mo><mrow id="A0.SS4.p1.4.m4.3.4.3.2" 
xref="A0.SS4.p1.4.m4.3.4.3.1.cmml"><mn id="A0.SS4.p1.4.m4.1.1" xref="A0.SS4.p1.4.m4.1.1.cmml">3</mn><mo id="A0.SS4.p1.4.m4.3.4.3.2.1" xref="A0.SS4.p1.4.m4.3.4.3.1.cmml">,</mo><mn id="A0.SS4.p1.4.m4.2.2" xref="A0.SS4.p1.4.m4.2.2.cmml">4</mn><mo id="A0.SS4.p1.4.m4.3.4.3.2.2" xref="A0.SS4.p1.4.m4.3.4.3.1.cmml">,</mo><mn id="A0.SS4.p1.4.m4.3.3" xref="A0.SS4.p1.4.m4.3.3.cmml">5</mn></mrow></mrow><annotation-xml encoding="MathML-Content" id="A0.SS4.p1.4.m4.3b"><apply id="A0.SS4.p1.4.m4.3.4.cmml" xref="A0.SS4.p1.4.m4.3.4"><eq id="A0.SS4.p1.4.m4.3.4.1.cmml" xref="A0.SS4.p1.4.m4.3.4.1"></eq><ci id="A0.SS4.p1.4.m4.3.4.2.cmml" xref="A0.SS4.p1.4.m4.3.4.2">𝑘</ci><list id="A0.SS4.p1.4.m4.3.4.3.1.cmml" xref="A0.SS4.p1.4.m4.3.4.3.2"><cn id="A0.SS4.p1.4.m4.1.1.cmml" type="integer" xref="A0.SS4.p1.4.m4.1.1">3</cn><cn id="A0.SS4.p1.4.m4.2.2.cmml" type="integer" xref="A0.SS4.p1.4.m4.2.2">4</cn><cn id="A0.SS4.p1.4.m4.3.3.cmml" type="integer" xref="A0.SS4.p1.4.m4.3.3">5</cn></list></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.SS4.p1.4.m4.3c">k=3,4,5</annotation><annotation encoding="application/x-llamapun" id="A0.SS4.p1.4.m4.3d">italic_k = 3 , 4 , 5</annotation></semantics></math> for CV clustering evaluation. The result is shown in Table <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#A0.T10" title="TABLE X ‣ -D Experiment on the Number of Dimensions for Max Average Error ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">X</span></a>. 
The influence of the parameter <math alttext="k" class="ltx_Math" display="inline" id="A0.SS4.p1.5.m5.1"><semantics id="A0.SS4.p1.5.m5.1a"><mi id="A0.SS4.p1.5.m5.1.1" xref="A0.SS4.p1.5.m5.1.1.cmml">k</mi><annotation-xml encoding="MathML-Content" id="A0.SS4.p1.5.m5.1b"><ci id="A0.SS4.p1.5.m5.1.1.cmml" xref="A0.SS4.p1.5.m5.1.1">𝑘</ci></annotation-xml><annotation encoding="application/x-tex" id="A0.SS4.p1.5.m5.1c">k</annotation><annotation encoding="application/x-llamapun" id="A0.SS4.p1.5.m5.1d">italic_k</annotation></semantics></math> is limited: the silhouette coefficient varies only between 0.503 and 0.543 for NLP and between 0.821 and 0.850 for CV. Considering that the parameter <math alttext="k" class="ltx_Math" display="inline" id="A0.SS4.p1.6.m6.1"><semantics id="A0.SS4.p1.6.m6.1a"><mi id="A0.SS4.p1.6.m6.1.1" xref="A0.SS4.p1.6.m6.1.1.cmml">k</mi><annotation-xml encoding="MathML-Content" id="A0.SS4.p1.6.m6.1b"><ci id="A0.SS4.p1.6.m6.1.1.cmml" xref="A0.SS4.p1.6.m6.1.1">𝑘</ci></annotation-xml><annotation encoding="application/x-tex" id="A0.SS4.p1.6.m6.1c">k</annotation><annotation encoding="application/x-llamapun" id="A0.SS4.p1.6.m6.1d">italic_k</annotation></semantics></math> in Eq. 
<a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S3.E1" title="In III-A Model Clustering ‣ III Coarse Recall ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">1</span></a> should be able to filter noise and retain valid information, we choose <math alttext="k=5" class="ltx_Math" display="inline" id="A0.SS4.p1.7.m7.1"><semantics id="A0.SS4.p1.7.m7.1a"><mrow id="A0.SS4.p1.7.m7.1.1" xref="A0.SS4.p1.7.m7.1.1.cmml"><mi id="A0.SS4.p1.7.m7.1.1.2" xref="A0.SS4.p1.7.m7.1.1.2.cmml">k</mi><mo id="A0.SS4.p1.7.m7.1.1.1" xref="A0.SS4.p1.7.m7.1.1.1.cmml">=</mo><mn id="A0.SS4.p1.7.m7.1.1.3" xref="A0.SS4.p1.7.m7.1.1.3.cmml">5</mn></mrow><annotation-xml encoding="MathML-Content" id="A0.SS4.p1.7.m7.1b"><apply id="A0.SS4.p1.7.m7.1.1.cmml" xref="A0.SS4.p1.7.m7.1.1"><eq id="A0.SS4.p1.7.m7.1.1.1.cmml" xref="A0.SS4.p1.7.m7.1.1.1"></eq><ci id="A0.SS4.p1.7.m7.1.1.2.cmml" xref="A0.SS4.p1.7.m7.1.1.2">𝑘</ci><cn id="A0.SS4.p1.7.m7.1.1.3.cmml" type="integer" xref="A0.SS4.p1.7.m7.1.1.3">5</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.SS4.p1.7.m7.1c">k=5</annotation><annotation encoding="application/x-llamapun" id="A0.SS4.p1.7.m7.1d">italic_k = 5</annotation></semantics></math> in both tasks.</p> </div> <figure class="ltx_table" id="A0.T10"> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_table">TABLE X: </span>Parameter K Selection</figcaption> <table class="ltx_tabular ltx_align_middle" id="A0.T10.1"> <tr class="ltx_tr" id="A0.T10.1.1"> <td class="ltx_td ltx_border_t" id="A0.T10.1.1.1"></td> <td class="ltx_td ltx_align_center ltx_border_t" colspan="3" id="A0.T10.1.1.2">NLP</td> <td class="ltx_td ltx_align_center ltx_border_t" colspan="3" id="A0.T10.1.1.3">CV</td> </tr> <tr class="ltx_tr" id="A0.T10.1.2"> <td class="ltx_td ltx_align_center ltx_border_t" id="A0.T10.1.2.1">K Value</td> <td class="ltx_td ltx_align_center ltx_border_t" id="A0.T10.1.2.2">5</td> <td class="ltx_td 
ltx_align_center ltx_border_t" id="A0.T10.1.2.3">10</td> <td class="ltx_td ltx_align_center ltx_border_t" id="A0.T10.1.2.4">15</td> <td class="ltx_td ltx_align_center ltx_border_t" id="A0.T10.1.2.5">3</td> <td class="ltx_td ltx_align_center ltx_border_t" id="A0.T10.1.2.6">4</td> <td class="ltx_td ltx_align_center ltx_border_t" id="A0.T10.1.2.7">5</td> </tr> <tr class="ltx_tr" id="A0.T10.1.3"> <td class="ltx_td ltx_align_center ltx_border_b" id="A0.T10.1.3.1">Silhouette Coefficient</td> <td class="ltx_td ltx_align_center ltx_border_b" id="A0.T10.1.3.2">0.543</td> <td class="ltx_td ltx_align_center ltx_border_b" id="A0.T10.1.3.3">0.503</td> <td class="ltx_td ltx_align_center ltx_border_b" id="A0.T10.1.3.4">0.535</td> <td class="ltx_td ltx_align_center ltx_border_b" id="A0.T10.1.3.5">0.850</td> <td class="ltx_td ltx_align_center ltx_border_b" id="A0.T10.1.3.6">0.828</td> <td class="ltx_td ltx_align_center ltx_border_b" id="A0.T10.1.3.7">0.821</td> </tr> </table> </figure> </section> <section class="ltx_subsection" id="A0.SS5"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="A0.SS5.4.1.1">-E</span> </span><span class="ltx_text ltx_font_italic" id="A0.SS5.5.2">Model cards</span> </h3> <div class="ltx_para" id="A0.SS5.p1"> <p class="ltx_p" id="A0.SS5.p1.1">A model card is given in Fig. <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#A0.F9" title="Figure 9 ‣ -E Model cards ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">9</span></a>. 
A model card contains a general description of the model, such as its structure and training information.</p> </div> <figure class="ltx_figure" id="A0.F9"> <p class="ltx_p ltx_align_center" id="A0.F9.1"><span class="ltx_text" id="A0.F9.1.1"><img alt="Refer to caption" class="ltx_graphics ltx_img_landscape" height="297" id="A0.F9.1.1.g1" src="extracted/2404.00069v1/model_card.png" width="553"/></span></p> <br class="ltx_break ltx_break"/> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_figure">Figure 9: </span>Model card of bert-base-uncased-mnli. Each model on HuggingFace has a model card that describes it.</figcaption> </figure> </section> <section class="ltx_subsection" id="A0.SS6"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="A0.SS6.4.1.1">-F</span> </span><span class="ltx_text ltx_font_italic" id="A0.SS6.5.2">K-means Clustering Results</span> </h3> <div class="ltx_para" id="A0.SS6.p1"> <p class="ltx_p" id="A0.SS6.p1.5">The K-means clustering results are shown in Table <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#A0.T11" title="TABLE XI ‣ -F K-means Clustering Results ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">XI</span></a>. This table should be read alongside Table <a class="ltx_ref" href="https://arxiv.org/html/2404.00069v1#S5.T2" title="TABLE II ‣ V-B Experiment for Coarse-Recall Phase ‣ V Experiments ‣ A Two-Phase Recall-and-Select Framework for Fast Model Selection"><span class="ltx_text ltx_ref_tag">II</span></a> in Section V-B (Model Clustering). In that section, we explain the hierarchical clustering results in detail and conclude that hierarchical clustering is effective: the in-cluster models share the same model structure or training dataset, and the silhouette coefficient is high. Here we give the K-means clustering results to further support that conclusion. 
Both the NLP clustering result and CV clustering result of the K-means clustering algorithm show less connection between in-cluster models. In the NLP part, the 2 biggest clusters, <math alttext="C_{2}" class="ltx_Math" display="inline" id="A0.SS6.p1.1.m1.1"><semantics id="A0.SS6.p1.1.m1.1a"><msub id="A0.SS6.p1.1.m1.1.1" xref="A0.SS6.p1.1.m1.1.1.cmml"><mi id="A0.SS6.p1.1.m1.1.1.2" xref="A0.SS6.p1.1.m1.1.1.2.cmml">C</mi><mn id="A0.SS6.p1.1.m1.1.1.3" xref="A0.SS6.p1.1.m1.1.1.3.cmml">2</mn></msub><annotation-xml encoding="MathML-Content" id="A0.SS6.p1.1.m1.1b"><apply id="A0.SS6.p1.1.m1.1.1.cmml" xref="A0.SS6.p1.1.m1.1.1"><csymbol cd="ambiguous" id="A0.SS6.p1.1.m1.1.1.1.cmml" xref="A0.SS6.p1.1.m1.1.1">subscript</csymbol><ci id="A0.SS6.p1.1.m1.1.1.2.cmml" xref="A0.SS6.p1.1.m1.1.1.2">𝐶</ci><cn id="A0.SS6.p1.1.m1.1.1.3.cmml" type="integer" xref="A0.SS6.p1.1.m1.1.1.3">2</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.SS6.p1.1.m1.1c">C_{2}</annotation><annotation encoding="application/x-llamapun" id="A0.SS6.p1.1.m1.1d">italic_C start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT</annotation></semantics></math> and <math alttext="C_{8}" class="ltx_Math" display="inline" id="A0.SS6.p1.2.m2.1"><semantics id="A0.SS6.p1.2.m2.1a"><msub id="A0.SS6.p1.2.m2.1.1" xref="A0.SS6.p1.2.m2.1.1.cmml"><mi id="A0.SS6.p1.2.m2.1.1.2" xref="A0.SS6.p1.2.m2.1.1.2.cmml">C</mi><mn id="A0.SS6.p1.2.m2.1.1.3" xref="A0.SS6.p1.2.m2.1.1.3.cmml">8</mn></msub><annotation-xml encoding="MathML-Content" id="A0.SS6.p1.2.m2.1b"><apply id="A0.SS6.p1.2.m2.1.1.cmml" xref="A0.SS6.p1.2.m2.1.1"><csymbol cd="ambiguous" id="A0.SS6.p1.2.m2.1.1.1.cmml" xref="A0.SS6.p1.2.m2.1.1">subscript</csymbol><ci id="A0.SS6.p1.2.m2.1.1.2.cmml" xref="A0.SS6.p1.2.m2.1.1.2">𝐶</ci><cn id="A0.SS6.p1.2.m2.1.1.3.cmml" type="integer" xref="A0.SS6.p1.2.m2.1.1.3">8</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.SS6.p1.2.m2.1c">C_{8}</annotation><annotation encoding="application/x-llamapun" 
id="A0.SS6.p1.2.m2.1d">italic_C start_POSTSUBSCRIPT 8 end_POSTSUBSCRIPT</annotation></semantics></math>, consist of a mix of models that have different structures and training datasets. In the CV part, there is a cross mixing in <math alttext="C_{6}" class="ltx_Math" display="inline" id="A0.SS6.p1.3.m3.1"><semantics id="A0.SS6.p1.3.m3.1a"><msub id="A0.SS6.p1.3.m3.1.1" xref="A0.SS6.p1.3.m3.1.1.cmml"><mi id="A0.SS6.p1.3.m3.1.1.2" xref="A0.SS6.p1.3.m3.1.1.2.cmml">C</mi><mn id="A0.SS6.p1.3.m3.1.1.3" xref="A0.SS6.p1.3.m3.1.1.3.cmml">6</mn></msub><annotation-xml encoding="MathML-Content" id="A0.SS6.p1.3.m3.1b"><apply id="A0.SS6.p1.3.m3.1.1.cmml" xref="A0.SS6.p1.3.m3.1.1"><csymbol cd="ambiguous" id="A0.SS6.p1.3.m3.1.1.1.cmml" xref="A0.SS6.p1.3.m3.1.1">subscript</csymbol><ci id="A0.SS6.p1.3.m3.1.1.2.cmml" xref="A0.SS6.p1.3.m3.1.1.2">𝐶</ci><cn id="A0.SS6.p1.3.m3.1.1.3.cmml" type="integer" xref="A0.SS6.p1.3.m3.1.1.3">6</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.SS6.p1.3.m3.1c">C_{6}</annotation><annotation encoding="application/x-llamapun" id="A0.SS6.p1.3.m3.1d">italic_C start_POSTSUBSCRIPT 6 end_POSTSUBSCRIPT</annotation></semantics></math> and <math alttext="C_{7}" class="ltx_Math" display="inline" id="A0.SS6.p1.4.m4.1"><semantics id="A0.SS6.p1.4.m4.1a"><msub id="A0.SS6.p1.4.m4.1.1" xref="A0.SS6.p1.4.m4.1.1.cmml"><mi id="A0.SS6.p1.4.m4.1.1.2" xref="A0.SS6.p1.4.m4.1.1.2.cmml">C</mi><mn id="A0.SS6.p1.4.m4.1.1.3" xref="A0.SS6.p1.4.m4.1.1.3.cmml">7</mn></msub><annotation-xml encoding="MathML-Content" id="A0.SS6.p1.4.m4.1b"><apply id="A0.SS6.p1.4.m4.1.1.cmml" xref="A0.SS6.p1.4.m4.1.1"><csymbol cd="ambiguous" id="A0.SS6.p1.4.m4.1.1.1.cmml" xref="A0.SS6.p1.4.m4.1.1">subscript</csymbol><ci id="A0.SS6.p1.4.m4.1.1.2.cmml" xref="A0.SS6.p1.4.m4.1.1.2">𝐶</ci><cn id="A0.SS6.p1.4.m4.1.1.3.cmml" type="integer" xref="A0.SS6.p1.4.m4.1.1.3">7</cn></apply></annotation-xml><annotation encoding="application/x-tex" 
id="A0.SS6.p1.4.m4.1c">C_{7}</annotation><annotation encoding="application/x-llamapun" id="A0.SS6.p1.4.m4.1d">italic_C start_POSTSUBSCRIPT 7 end_POSTSUBSCRIPT</annotation></semantics></math>, and the biggest cluster, <math alttext="C_{4}" class="ltx_Math" display="inline" id="A0.SS6.p1.5.m5.1"><semantics id="A0.SS6.p1.5.m5.1a"><msub id="A0.SS6.p1.5.m5.1.1" xref="A0.SS6.p1.5.m5.1.1.cmml"><mi id="A0.SS6.p1.5.m5.1.1.2" xref="A0.SS6.p1.5.m5.1.1.2.cmml">C</mi><mn id="A0.SS6.p1.5.m5.1.1.3" xref="A0.SS6.p1.5.m5.1.1.3.cmml">4</mn></msub><annotation-xml encoding="MathML-Content" id="A0.SS6.p1.5.m5.1b"><apply id="A0.SS6.p1.5.m5.1.1.cmml" xref="A0.SS6.p1.5.m5.1.1"><csymbol cd="ambiguous" id="A0.SS6.p1.5.m5.1.1.1.cmml" xref="A0.SS6.p1.5.m5.1.1">subscript</csymbol><ci id="A0.SS6.p1.5.m5.1.1.2.cmml" xref="A0.SS6.p1.5.m5.1.1.2">𝐶</ci><cn id="A0.SS6.p1.5.m5.1.1.3.cmml" type="integer" xref="A0.SS6.p1.5.m5.1.1.3">4</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.SS6.p1.5.m5.1c">C_{4}</annotation><annotation encoding="application/x-llamapun" id="A0.SS6.p1.5.m5.1d">italic_C start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT</annotation></semantics></math>, does not show consistency in either model structure or training dataset. 
Thus, we adopt hierarchical clustering as the main clustering method in this paper.</p> </div> <figure class="ltx_table" id="A0.T11"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">TABLE XI: </span>Model Clustering Results Using K-means</figcaption> <table class="ltx_tabular ltx_centering ltx_align_middle" id="A0.T11.18.18"> <tr class="ltx_tr" id="A0.T11.18.18.19"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" colspan="3" id="A0.T11.18.18.19.1"><span class="ltx_text ltx_font_bold" id="A0.T11.18.18.19.1.1">Model Clusters of Natural Language Processing</span></td> </tr> <tr class="ltx_tr" id="A0.T11.18.18.20"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.18.18.20.1"><span class="ltx_text ltx_font_bold" id="A0.T11.18.18.20.1.1">Cluster</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.18.18.20.2"><span class="ltx_text ltx_font_bold" id="A0.T11.18.18.20.2.1">Size</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.18.18.20.3"><span class="ltx_text ltx_font_bold" id="A0.T11.18.18.20.3.1">Pre-trained Models</span></td> </tr> <tr class="ltx_tr" id="A0.T11.1.1.1"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.1.1.1.1"><math alttext="C_{1}" class="ltx_Math" display="inline" id="A0.T11.1.1.1.1.m1.1"><semantics id="A0.T11.1.1.1.1.m1.1a"><msub id="A0.T11.1.1.1.1.m1.1.1" xref="A0.T11.1.1.1.1.m1.1.1.cmml"><mi id="A0.T11.1.1.1.1.m1.1.1.2" xref="A0.T11.1.1.1.1.m1.1.1.2.cmml">C</mi><mn id="A0.T11.1.1.1.1.m1.1.1.3" xref="A0.T11.1.1.1.1.m1.1.1.3.cmml">1</mn></msub><annotation-xml encoding="MathML-Content" id="A0.T11.1.1.1.1.m1.1b"><apply id="A0.T11.1.1.1.1.m1.1.1.cmml" xref="A0.T11.1.1.1.1.m1.1.1"><csymbol cd="ambiguous" id="A0.T11.1.1.1.1.m1.1.1.1.cmml" xref="A0.T11.1.1.1.1.m1.1.1">subscript</csymbol><ci id="A0.T11.1.1.1.1.m1.1.1.2.cmml" 
xref="A0.T11.1.1.1.1.m1.1.1.2">𝐶</ci><cn id="A0.T11.1.1.1.1.m1.1.1.3.cmml" type="integer" xref="A0.T11.1.1.1.1.m1.1.1.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.T11.1.1.1.1.m1.1c">C_{1}</annotation><annotation encoding="application/x-llamapun" id="A0.T11.1.1.1.1.m1.1d">italic_C start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.1.1.1.2">2</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.1.1.1.3"> <span class="ltx_text" id="A0.T11.1.1.1.3.1"></span> <span class="ltx_text" id="A0.T11.1.1.1.3.2"> <span class="ltx_tabular ltx_align_middle" id="A0.T11.1.1.1.3.2.1"> <span class="ltx_tr" id="A0.T11.1.1.1.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.1.1.1.3.2.1.1.1">gchhablani–bert-base-cased-finetuned-rte, anirudh21–bert-base-uncased-finetuned-qnli</span></span> </span></span><span class="ltx_text" id="A0.T11.1.1.1.3.3"></span></td> </tr> <tr class="ltx_tr" id="A0.T11.2.2.2"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.2.2.2.1"><math alttext="C_{2}" class="ltx_Math" display="inline" id="A0.T11.2.2.2.1.m1.1"><semantics id="A0.T11.2.2.2.1.m1.1a"><msub id="A0.T11.2.2.2.1.m1.1.1" xref="A0.T11.2.2.2.1.m1.1.1.cmml"><mi id="A0.T11.2.2.2.1.m1.1.1.2" xref="A0.T11.2.2.2.1.m1.1.1.2.cmml">C</mi><mn id="A0.T11.2.2.2.1.m1.1.1.3" xref="A0.T11.2.2.2.1.m1.1.1.3.cmml">2</mn></msub><annotation-xml encoding="MathML-Content" id="A0.T11.2.2.2.1.m1.1b"><apply id="A0.T11.2.2.2.1.m1.1.1.cmml" xref="A0.T11.2.2.2.1.m1.1.1"><csymbol cd="ambiguous" id="A0.T11.2.2.2.1.m1.1.1.1.cmml" xref="A0.T11.2.2.2.1.m1.1.1">subscript</csymbol><ci id="A0.T11.2.2.2.1.m1.1.1.2.cmml" xref="A0.T11.2.2.2.1.m1.1.1.2">𝐶</ci><cn id="A0.T11.2.2.2.1.m1.1.1.3.cmml" type="integer" xref="A0.T11.2.2.2.1.m1.1.1.3">2</cn></apply></annotation-xml><annotation encoding="application/x-tex" 
id="A0.T11.2.2.2.1.m1.1c">C_{2}</annotation><annotation encoding="application/x-llamapun" id="A0.T11.2.2.2.1.m1.1d">italic_C start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.2.2.2.2">5</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.2.2.2.3"> <span class="ltx_text" id="A0.T11.2.2.2.3.1"></span> <span class="ltx_text" id="A0.T11.2.2.2.3.2"> <span class="ltx_tabular ltx_align_middle" id="A0.T11.2.2.2.3.2.1"> <span class="ltx_tr" id="A0.T11.2.2.2.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.2.2.2.3.2.1.1.1">Jeevesh8–bert_ft_cola-88, DoyyingFace–bert-asian-hate-tweets-asian-unclean-freeze-4,</span></span> <span class="ltx_tr" id="A0.T11.2.2.2.3.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.2.2.2.3.2.1.2.1">bert-base-uncased, aditeyabaral–finetuned-sail2017-xlm-roberta-base, Jeevesh8–512seq_len_6ep_bert_ft_cola-91</span></span> </span></span><span class="ltx_text" id="A0.T11.2.2.2.3.3"></span></td> </tr> <tr class="ltx_tr" id="A0.T11.3.3.3"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.3.3.3.1"><math alttext="C_{3}" class="ltx_Math" display="inline" id="A0.T11.3.3.3.1.m1.1"><semantics id="A0.T11.3.3.3.1.m1.1a"><msub id="A0.T11.3.3.3.1.m1.1.1" xref="A0.T11.3.3.3.1.m1.1.1.cmml"><mi id="A0.T11.3.3.3.1.m1.1.1.2" xref="A0.T11.3.3.3.1.m1.1.1.2.cmml">C</mi><mn id="A0.T11.3.3.3.1.m1.1.1.3" xref="A0.T11.3.3.3.1.m1.1.1.3.cmml">3</mn></msub><annotation-xml encoding="MathML-Content" id="A0.T11.3.3.3.1.m1.1b"><apply id="A0.T11.3.3.3.1.m1.1.1.cmml" xref="A0.T11.3.3.3.1.m1.1.1"><csymbol cd="ambiguous" id="A0.T11.3.3.3.1.m1.1.1.1.cmml" xref="A0.T11.3.3.3.1.m1.1.1">subscript</csymbol><ci id="A0.T11.3.3.3.1.m1.1.1.2.cmml" xref="A0.T11.3.3.3.1.m1.1.1.2">𝐶</ci><cn id="A0.T11.3.3.3.1.m1.1.1.3.cmml" type="integer" 
xref="A0.T11.3.3.3.1.m1.1.1.3">3</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.T11.3.3.3.1.m1.1c">C_{3}</annotation><annotation encoding="application/x-llamapun" id="A0.T11.3.3.3.1.m1.1d">italic_C start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.3.3.3.2">2</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.3.3.3.3"> <span class="ltx_text" id="A0.T11.3.3.3.3.1"></span> <span class="ltx_text" id="A0.T11.3.3.3.3.2"> <span class="ltx_tabular ltx_align_middle" id="A0.T11.3.3.3.3.2.1"> <span class="ltx_tr" id="A0.T11.3.3.3.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.3.3.3.3.2.1.1.1">manueltonneau–bert-twitter-en-is-hired, aychang–bert-base-cased-trec-coarse</span></span> </span></span><span class="ltx_text" id="A0.T11.3.3.3.3.3"></span></td> </tr> <tr class="ltx_tr" id="A0.T11.4.4.4"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.4.4.4.1"><math alttext="C_{4}" class="ltx_Math" display="inline" id="A0.T11.4.4.4.1.m1.1"><semantics id="A0.T11.4.4.4.1.m1.1a"><msub id="A0.T11.4.4.4.1.m1.1.1" xref="A0.T11.4.4.4.1.m1.1.1.cmml"><mi id="A0.T11.4.4.4.1.m1.1.1.2" xref="A0.T11.4.4.4.1.m1.1.1.2.cmml">C</mi><mn id="A0.T11.4.4.4.1.m1.1.1.3" xref="A0.T11.4.4.4.1.m1.1.1.3.cmml">4</mn></msub><annotation-xml encoding="MathML-Content" id="A0.T11.4.4.4.1.m1.1b"><apply id="A0.T11.4.4.4.1.m1.1.1.cmml" xref="A0.T11.4.4.4.1.m1.1.1"><csymbol cd="ambiguous" id="A0.T11.4.4.4.1.m1.1.1.1.cmml" xref="A0.T11.4.4.4.1.m1.1.1">subscript</csymbol><ci id="A0.T11.4.4.4.1.m1.1.1.2.cmml" xref="A0.T11.4.4.4.1.m1.1.1.2">𝐶</ci><cn id="A0.T11.4.4.4.1.m1.1.1.3.cmml" type="integer" xref="A0.T11.4.4.4.1.m1.1.1.3">4</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.T11.4.4.4.1.m1.1c">C_{4}</annotation><annotation encoding="application/x-llamapun" 
id="A0.T11.4.4.4.1.m1.1d">italic_C start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.4.4.4.2">2</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.4.4.4.3"> <span class="ltx_text" id="A0.T11.4.4.4.3.1"></span> <span class="ltx_text" id="A0.T11.4.4.4.3.2"> <span class="ltx_tabular ltx_align_middle" id="A0.T11.4.4.4.3.2.1"> <span class="ltx_tr" id="A0.T11.4.4.4.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.4.4.4.3.2.1.1.1">XSY–albert-base-v2-imdb-calssification, distilbert-base-uncased</span></span> </span></span><span class="ltx_text" id="A0.T11.4.4.4.3.3"></span></td> </tr> <tr class="ltx_tr" id="A0.T11.5.5.5"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.5.5.5.1"><math alttext="C_{4}" class="ltx_Math" display="inline" id="A0.T11.5.5.5.1.m1.1"><semantics id="A0.T11.5.5.5.1.m1.1a"><msub id="A0.T11.5.5.5.1.m1.1.1" xref="A0.T11.5.5.5.1.m1.1.1.cmml"><mi id="A0.T11.5.5.5.1.m1.1.1.2" xref="A0.T11.5.5.5.1.m1.1.1.2.cmml">C</mi><mn id="A0.T11.5.5.5.1.m1.1.1.3" xref="A0.T11.5.5.5.1.m1.1.1.3.cmml">4</mn></msub><annotation-xml encoding="MathML-Content" id="A0.T11.5.5.5.1.m1.1b"><apply id="A0.T11.5.5.5.1.m1.1.1.cmml" xref="A0.T11.5.5.5.1.m1.1.1"><csymbol cd="ambiguous" id="A0.T11.5.5.5.1.m1.1.1.1.cmml" xref="A0.T11.5.5.5.1.m1.1.1">subscript</csymbol><ci id="A0.T11.5.5.5.1.m1.1.1.2.cmml" xref="A0.T11.5.5.5.1.m1.1.1.2">𝐶</ci><cn id="A0.T11.5.5.5.1.m1.1.1.3.cmml" type="integer" xref="A0.T11.5.5.5.1.m1.1.1.3">4</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.T11.5.5.5.1.m1.1c">C_{4}</annotation><annotation encoding="application/x-llamapun" id="A0.T11.5.5.5.1.m1.1d">italic_C start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.5.5.5.2">4</td> <td class="ltx_td 
ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.5.5.5.3"> <span class="ltx_text" id="A0.T11.5.5.5.3.1"></span> <span class="ltx_text" id="A0.T11.5.5.5.3.2"> <span class="ltx_tabular ltx_align_middle" id="A0.T11.5.5.5.3.2.1"> <span class="ltx_tr" id="A0.T11.5.5.5.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.5.5.5.3.2.1.1.1">ishan–bert-base-uncased-mnli, Alireza1044–albert-base-v2-qnli,</span></span> <span class="ltx_tr" id="A0.T11.5.5.5.3.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.5.5.5.3.2.1.2.1">albert-base-v2, Jeevesh8–feather_berts_46</span></span> </span></span><span class="ltx_text" id="A0.T11.5.5.5.3.3"></span></td> </tr> <tr class="ltx_tr" id="A0.T11.6.6.6"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.6.6.6.1"><math alttext="C_{5}" class="ltx_Math" display="inline" id="A0.T11.6.6.6.1.m1.1"><semantics id="A0.T11.6.6.6.1.m1.1a"><msub id="A0.T11.6.6.6.1.m1.1.1" xref="A0.T11.6.6.6.1.m1.1.1.cmml"><mi id="A0.T11.6.6.6.1.m1.1.1.2" xref="A0.T11.6.6.6.1.m1.1.1.2.cmml">C</mi><mn id="A0.T11.6.6.6.1.m1.1.1.3" xref="A0.T11.6.6.6.1.m1.1.1.3.cmml">5</mn></msub><annotation-xml encoding="MathML-Content" id="A0.T11.6.6.6.1.m1.1b"><apply id="A0.T11.6.6.6.1.m1.1.1.cmml" xref="A0.T11.6.6.6.1.m1.1.1"><csymbol cd="ambiguous" id="A0.T11.6.6.6.1.m1.1.1.1.cmml" xref="A0.T11.6.6.6.1.m1.1.1">subscript</csymbol><ci id="A0.T11.6.6.6.1.m1.1.1.2.cmml" xref="A0.T11.6.6.6.1.m1.1.1.2">𝐶</ci><cn id="A0.T11.6.6.6.1.m1.1.1.3.cmml" type="integer" xref="A0.T11.6.6.6.1.m1.1.1.3">5</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.T11.6.6.6.1.m1.1c">C_{5}</annotation><annotation encoding="application/x-llamapun" id="A0.T11.6.6.6.1.m1.1d">italic_C start_POSTSUBSCRIPT 5 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.6.6.6.2">2</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" 
id="A0.T11.6.6.6.3"> <span class="ltx_text" id="A0.T11.6.6.6.3.1"></span> <span class="ltx_text" id="A0.T11.6.6.6.3.2"> <span class="ltx_tabular ltx_align_middle" id="A0.T11.6.6.6.3.2.1"> <span class="ltx_tr" id="A0.T11.6.6.6.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.6.6.6.3.2.1.1.1">CAMeL-Lab–bert-base-arabic-camelbert-mix-did-nadi, aliosm–sha3bor-metre-detector-arabertv2-base</span></span> </span></span><span class="ltx_text" id="A0.T11.6.6.6.3.3"></span></td> </tr> <tr class="ltx_tr" id="A0.T11.7.7.7"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.7.7.7.1"><math alttext="C_{6}" class="ltx_Math" display="inline" id="A0.T11.7.7.7.1.m1.1"><semantics id="A0.T11.7.7.7.1.m1.1a"><msub id="A0.T11.7.7.7.1.m1.1.1" xref="A0.T11.7.7.7.1.m1.1.1.cmml"><mi id="A0.T11.7.7.7.1.m1.1.1.2" xref="A0.T11.7.7.7.1.m1.1.1.2.cmml">C</mi><mn id="A0.T11.7.7.7.1.m1.1.1.3" xref="A0.T11.7.7.7.1.m1.1.1.3.cmml">6</mn></msub><annotation-xml encoding="MathML-Content" id="A0.T11.7.7.7.1.m1.1b"><apply id="A0.T11.7.7.7.1.m1.1.1.cmml" xref="A0.T11.7.7.7.1.m1.1.1"><csymbol cd="ambiguous" id="A0.T11.7.7.7.1.m1.1.1.1.cmml" xref="A0.T11.7.7.7.1.m1.1.1">subscript</csymbol><ci id="A0.T11.7.7.7.1.m1.1.1.2.cmml" xref="A0.T11.7.7.7.1.m1.1.1.2">𝐶</ci><cn id="A0.T11.7.7.7.1.m1.1.1.3.cmml" type="integer" xref="A0.T11.7.7.7.1.m1.1.1.3">6</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.T11.7.7.7.1.m1.1c">C_{6}</annotation><annotation encoding="application/x-llamapun" id="A0.T11.7.7.7.1.m1.1d">italic_C start_POSTSUBSCRIPT 6 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.7.7.7.2">3</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.7.7.7.3"> <span class="ltx_text" id="A0.T11.7.7.7.3.1"></span> <span class="ltx_text" id="A0.T11.7.7.7.3.2"> <span class="ltx_tabular ltx_align_middle" id="A0.T11.7.7.7.3.2.1"> <span 
class="ltx_tr" id="A0.T11.7.7.7.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.7.7.7.3.2.1.1.1">socialmediaie–TRAC2020_IBEN_B_bert-base-multilingual-uncased, jb2k–bert-base-multilingual-cased-language-detection,</span></span> <span class="ltx_tr" id="A0.T11.7.7.7.3.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.7.7.7.3.2.1.2.1">emrecan–bert-base-multilingual-cased-snli_tr</span></span> </span></span><span class="ltx_text" id="A0.T11.7.7.7.3.3"></span></td> </tr> <tr class="ltx_tr" id="A0.T11.8.8.8"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.8.8.8.1"><math alttext="C_{7}" class="ltx_Math" display="inline" id="A0.T11.8.8.8.1.m1.1"><semantics id="A0.T11.8.8.8.1.m1.1a"><msub id="A0.T11.8.8.8.1.m1.1.1" xref="A0.T11.8.8.8.1.m1.1.1.cmml"><mi id="A0.T11.8.8.8.1.m1.1.1.2" xref="A0.T11.8.8.8.1.m1.1.1.2.cmml">C</mi><mn id="A0.T11.8.8.8.1.m1.1.1.3" xref="A0.T11.8.8.8.1.m1.1.1.3.cmml">7</mn></msub><annotation-xml encoding="MathML-Content" id="A0.T11.8.8.8.1.m1.1b"><apply id="A0.T11.8.8.8.1.m1.1.1.cmml" xref="A0.T11.8.8.8.1.m1.1.1"><csymbol cd="ambiguous" id="A0.T11.8.8.8.1.m1.1.1.1.cmml" xref="A0.T11.8.8.8.1.m1.1.1">subscript</csymbol><ci id="A0.T11.8.8.8.1.m1.1.1.2.cmml" xref="A0.T11.8.8.8.1.m1.1.1.2">𝐶</ci><cn id="A0.T11.8.8.8.1.m1.1.1.3.cmml" type="integer" xref="A0.T11.8.8.8.1.m1.1.1.3">7</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.T11.8.8.8.1.m1.1c">C_{7}</annotation><annotation encoding="application/x-llamapun" id="A0.T11.8.8.8.1.m1.1d">italic_C start_POSTSUBSCRIPT 7 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.8.8.8.2">2</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.8.8.8.3"> <span class="ltx_text" id="A0.T11.8.8.8.3.1"></span> <span class="ltx_text" id="A0.T11.8.8.8.3.2"> <span class="ltx_tabular ltx_align_middle" 
id="A0.T11.8.8.8.3.2.1"> <span class="ltx_tr" id="A0.T11.8.8.8.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.8.8.8.3.2.1.1.1">dhimskyy–wiki-bert, bondi–bert-semaphore-prediction-w4</span></span> </span></span><span class="ltx_text" id="A0.T11.8.8.8.3.3"></span></td> </tr> <tr class="ltx_tr" id="A0.T11.9.9.9"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.9.9.9.1"><math alttext="C_{8}" class="ltx_Math" display="inline" id="A0.T11.9.9.9.1.m1.1"><semantics id="A0.T11.9.9.9.1.m1.1a"><msub id="A0.T11.9.9.9.1.m1.1.1" xref="A0.T11.9.9.9.1.m1.1.1.cmml"><mi id="A0.T11.9.9.9.1.m1.1.1.2" xref="A0.T11.9.9.9.1.m1.1.1.2.cmml">C</mi><mn id="A0.T11.9.9.9.1.m1.1.1.3" xref="A0.T11.9.9.9.1.m1.1.1.3.cmml">8</mn></msub><annotation-xml encoding="MathML-Content" id="A0.T11.9.9.9.1.m1.1b"><apply id="A0.T11.9.9.9.1.m1.1.1.cmml" xref="A0.T11.9.9.9.1.m1.1.1"><csymbol cd="ambiguous" id="A0.T11.9.9.9.1.m1.1.1.1.cmml" xref="A0.T11.9.9.9.1.m1.1.1">subscript</csymbol><ci id="A0.T11.9.9.9.1.m1.1.1.2.cmml" xref="A0.T11.9.9.9.1.m1.1.1.2">𝐶</ci><cn id="A0.T11.9.9.9.1.m1.1.1.3.cmml" type="integer" xref="A0.T11.9.9.9.1.m1.1.1.3">8</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.T11.9.9.9.1.m1.1c">C_{8}</annotation><annotation encoding="application/x-llamapun" id="A0.T11.9.9.9.1.m1.1d">italic_C start_POSTSUBSCRIPT 8 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.9.9.9.2">5</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.9.9.9.3"> <span class="ltx_text" id="A0.T11.9.9.9.3.1"></span> <span class="ltx_text" id="A0.T11.9.9.9.3.2"> <span class="ltx_tabular ltx_align_middle" id="A0.T11.9.9.9.3.2.1"> <span class="ltx_tr" id="A0.T11.9.9.9.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.9.9.9.3.2.1.1.1">Jeevesh8–init_bert_ft_qqp-33, Jeevesh8–bert_ft_qqp-68, 
Jeevesh8–bert_ft_qqp-40,</span></span> <span class="ltx_tr" id="A0.T11.9.9.9.3.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.9.9.9.3.2.1.2.1">connectivity–bert_ft_qqp-1, Jeevesh8–bert_ft_qqp-9</span></span> </span></span><span class="ltx_text" id="A0.T11.9.9.9.3.3"></span></td> </tr> <tr class="ltx_tr" id="A0.T11.10.10.10"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.10.10.10.1"><math alttext="C_{9}" class="ltx_Math" display="inline" id="A0.T11.10.10.10.1.m1.1"><semantics id="A0.T11.10.10.10.1.m1.1a"><msub id="A0.T11.10.10.10.1.m1.1.1" xref="A0.T11.10.10.10.1.m1.1.1.cmml"><mi id="A0.T11.10.10.10.1.m1.1.1.2" xref="A0.T11.10.10.10.1.m1.1.1.2.cmml">C</mi><mn id="A0.T11.10.10.10.1.m1.1.1.3" xref="A0.T11.10.10.10.1.m1.1.1.3.cmml">9</mn></msub><annotation-xml encoding="MathML-Content" id="A0.T11.10.10.10.1.m1.1b"><apply id="A0.T11.10.10.10.1.m1.1.1.cmml" xref="A0.T11.10.10.10.1.m1.1.1"><csymbol cd="ambiguous" id="A0.T11.10.10.10.1.m1.1.1.1.cmml" xref="A0.T11.10.10.10.1.m1.1.1">subscript</csymbol><ci id="A0.T11.10.10.10.1.m1.1.1.2.cmml" xref="A0.T11.10.10.10.1.m1.1.1.2">𝐶</ci><cn id="A0.T11.10.10.10.1.m1.1.1.3.cmml" type="integer" xref="A0.T11.10.10.10.1.m1.1.1.3">9</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.T11.10.10.10.1.m1.1c">C_{9}</annotation><annotation encoding="application/x-llamapun" id="A0.T11.10.10.10.1.m1.1d">italic_C start_POSTSUBSCRIPT 9 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.10.10.10.2">4</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.10.10.10.3"> <span class="ltx_text" id="A0.T11.10.10.10.3.1"></span> <span class="ltx_text" id="A0.T11.10.10.10.3.2"> <span class="ltx_tabular ltx_align_middle" id="A0.T11.10.10.10.3.2.1"> <span class="ltx_tr" id="A0.T11.10.10.10.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" 
id="A0.T11.10.10.10.3.2.1.1.1">connectivity–bert_ft_qqp-96, connectivity–bert_ft_qqp-7,</span></span> <span class="ltx_tr" id="A0.T11.10.10.10.3.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.10.10.10.3.2.1.2.1">connectivity–bert_ft_qqp-17, Jeevesh8–init_bert_ft_qqp-24</span></span> </span></span><span class="ltx_text" id="A0.T11.10.10.10.3.3"></span></td> </tr> <tr class="ltx_tr" id="A0.T11.11.11.11"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.11.11.11.1"><math alttext="C_{10}" class="ltx_Math" display="inline" id="A0.T11.11.11.11.1.m1.1"><semantics id="A0.T11.11.11.11.1.m1.1a"><msub id="A0.T11.11.11.11.1.m1.1.1" xref="A0.T11.11.11.11.1.m1.1.1.cmml"><mi id="A0.T11.11.11.11.1.m1.1.1.2" xref="A0.T11.11.11.11.1.m1.1.1.2.cmml">C</mi><mn id="A0.T11.11.11.11.1.m1.1.1.3" xref="A0.T11.11.11.11.1.m1.1.1.3.cmml">10</mn></msub><annotation-xml encoding="MathML-Content" id="A0.T11.11.11.11.1.m1.1b"><apply id="A0.T11.11.11.11.1.m1.1.1.cmml" xref="A0.T11.11.11.11.1.m1.1.1"><csymbol cd="ambiguous" id="A0.T11.11.11.11.1.m1.1.1.1.cmml" xref="A0.T11.11.11.11.1.m1.1.1">subscript</csymbol><ci id="A0.T11.11.11.11.1.m1.1.1.2.cmml" xref="A0.T11.11.11.11.1.m1.1.1.2">𝐶</ci><cn id="A0.T11.11.11.11.1.m1.1.1.3.cmml" type="integer" xref="A0.T11.11.11.11.1.m1.1.1.3">10</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.T11.11.11.11.1.m1.1c">C_{10}</annotation><annotation encoding="application/x-llamapun" id="A0.T11.11.11.11.1.m1.1d">italic_C start_POSTSUBSCRIPT 10 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.11.11.11.2">2</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.11.11.11.3"> <span class="ltx_text" id="A0.T11.11.11.11.3.1"></span> <span class="ltx_text" id="A0.T11.11.11.11.3.2"> <span class="ltx_tabular ltx_align_middle" id="A0.T11.11.11.11.3.2.1"> <span class="ltx_tr" 
id="A0.T11.11.11.11.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.11.11.11.3.2.1.1.1">Splend1dchan–bert-base-uncased-slue-goldtrascription-e3-lr1e-4, Jeevesh8–6ep_bert_ft_cola-47</span></span> </span></span><span class="ltx_text" id="A0.T11.11.11.11.3.3"></span></td> </tr> <tr class="ltx_tr" id="A0.T11.18.18.21"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" colspan="3" id="A0.T11.18.18.21.1"><span class="ltx_text ltx_font_bold" id="A0.T11.18.18.21.1.1">Model Clusters of Computer Vision</span></td> </tr> <tr class="ltx_tr" id="A0.T11.18.18.22"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.18.18.22.1"><span class="ltx_text ltx_font_bold" id="A0.T11.18.18.22.1.1">Cluster</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.18.18.22.2"><span class="ltx_text ltx_font_bold" id="A0.T11.18.18.22.2.1">Size</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.18.18.22.3"><span class="ltx_text ltx_font_bold" id="A0.T11.18.18.22.3.1">Pre-trained Models</span></td> </tr> <tr class="ltx_tr" id="A0.T11.12.12.12"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.12.12.12.1"><math alttext="C_{1}" class="ltx_Math" display="inline" id="A0.T11.12.12.12.1.m1.1"><semantics id="A0.T11.12.12.12.1.m1.1a"><msub id="A0.T11.12.12.12.1.m1.1.1" xref="A0.T11.12.12.12.1.m1.1.1.cmml"><mi id="A0.T11.12.12.12.1.m1.1.1.2" xref="A0.T11.12.12.12.1.m1.1.1.2.cmml">C</mi><mn id="A0.T11.12.12.12.1.m1.1.1.3" xref="A0.T11.12.12.12.1.m1.1.1.3.cmml">1</mn></msub><annotation-xml encoding="MathML-Content" id="A0.T11.12.12.12.1.m1.1b"><apply id="A0.T11.12.12.12.1.m1.1.1.cmml" xref="A0.T11.12.12.12.1.m1.1.1"><csymbol cd="ambiguous" id="A0.T11.12.12.12.1.m1.1.1.1.cmml" xref="A0.T11.12.12.12.1.m1.1.1">subscript</csymbol><ci id="A0.T11.12.12.12.1.m1.1.1.2.cmml" xref="A0.T11.12.12.12.1.m1.1.1.2">𝐶</ci><cn 
id="A0.T11.12.12.12.1.m1.1.1.3.cmml" type="integer" xref="A0.T11.12.12.12.1.m1.1.1.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.T11.12.12.12.1.m1.1c">C_{1}</annotation><annotation encoding="application/x-llamapun" id="A0.T11.12.12.12.1.m1.1d">italic_C start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.12.12.12.2">6</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.12.12.12.3"> <span class="ltx_text" id="A0.T11.12.12.12.3.1"></span> <span class="ltx_text" id="A0.T11.12.12.12.3.2"> <span class="ltx_tabular ltx_align_middle" id="A0.T11.12.12.12.3.2.1"> <span class="ltx_tr" id="A0.T11.12.12.12.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.12.12.12.3.2.1.1.1">shi-labs/dinat-large-in22k-in1k-224, shi-labs/dinat-large-in22k-in1k-384, microsoft/beit-base-patch16-224-pt22k-ft22k,</span></span> <span class="ltx_tr" id="A0.T11.12.12.12.3.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.12.12.12.3.2.1.2.1">lixiqi/beit-base-patch16-224-pt22k-ft22k-finetuned-FER2013-7e-05,</span></span> <span class="ltx_tr" id="A0.T11.12.12.12.3.2.1.3"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.12.12.12.3.2.1.3.1">lixiqi/beit-base-patch16-224-pt22k-ft22k-finetuned-FER2013-6e-05,</span></span> <span class="ltx_tr" id="A0.T11.12.12.12.3.2.1.4"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.12.12.12.3.2.1.4.1">lixiqi/beit-base-patch16-224-pt22k-ft22k-finetuned-FER-5e-05-3</span></span> </span></span><span class="ltx_text" id="A0.T11.12.12.12.3.3"></span></td> </tr> <tr class="ltx_tr" id="A0.T11.13.13.13"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.13.13.13.1"><math alttext="C_{2}" class="ltx_Math" display="inline" id="A0.T11.13.13.13.1.m1.1"><semantics id="A0.T11.13.13.13.1.m1.1a"><msub id="A0.T11.13.13.13.1.m1.1.1" 
xref="A0.T11.13.13.13.1.m1.1.1.cmml"><mi id="A0.T11.13.13.13.1.m1.1.1.2" xref="A0.T11.13.13.13.1.m1.1.1.2.cmml">C</mi><mn id="A0.T11.13.13.13.1.m1.1.1.3" xref="A0.T11.13.13.13.1.m1.1.1.3.cmml">2</mn></msub><annotation-xml encoding="MathML-Content" id="A0.T11.13.13.13.1.m1.1b"><apply id="A0.T11.13.13.13.1.m1.1.1.cmml" xref="A0.T11.13.13.13.1.m1.1.1"><csymbol cd="ambiguous" id="A0.T11.13.13.13.1.m1.1.1.1.cmml" xref="A0.T11.13.13.13.1.m1.1.1">subscript</csymbol><ci id="A0.T11.13.13.13.1.m1.1.1.2.cmml" xref="A0.T11.13.13.13.1.m1.1.1.2">𝐶</ci><cn id="A0.T11.13.13.13.1.m1.1.1.3.cmml" type="integer" xref="A0.T11.13.13.13.1.m1.1.1.3">2</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.T11.13.13.13.1.m1.1c">C_{2}</annotation><annotation encoding="application/x-llamapun" id="A0.T11.13.13.13.1.m1.1d">italic_C start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.13.13.13.2">2</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.13.13.13.3"> <span class="ltx_text" id="A0.T11.13.13.13.3.1"></span> <span class="ltx_text" id="A0.T11.13.13.13.3.2"> <span class="ltx_tabular ltx_align_middle" id="A0.T11.13.13.13.3.2.1"> <span class="ltx_tr" id="A0.T11.13.13.13.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.13.13.13.3.2.1.1.1">nateraw/vit-age-classifier, facebook/dino-vitb16</span></span> </span></span><span class="ltx_text" id="A0.T11.13.13.13.3.3"></span></td> </tr> <tr class="ltx_tr" id="A0.T11.14.14.14"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.14.14.14.1"><math alttext="C_{3}" class="ltx_Math" display="inline" id="A0.T11.14.14.14.1.m1.1"><semantics id="A0.T11.14.14.14.1.m1.1a"><msub id="A0.T11.14.14.14.1.m1.1.1" xref="A0.T11.14.14.14.1.m1.1.1.cmml"><mi id="A0.T11.14.14.14.1.m1.1.1.2" xref="A0.T11.14.14.14.1.m1.1.1.2.cmml">C</mi><mn id="A0.T11.14.14.14.1.m1.1.1.3" 
xref="A0.T11.14.14.14.1.m1.1.1.3.cmml">3</mn></msub><annotation-xml encoding="MathML-Content" id="A0.T11.14.14.14.1.m1.1b"><apply id="A0.T11.14.14.14.1.m1.1.1.cmml" xref="A0.T11.14.14.14.1.m1.1.1"><csymbol cd="ambiguous" id="A0.T11.14.14.14.1.m1.1.1.1.cmml" xref="A0.T11.14.14.14.1.m1.1.1">subscript</csymbol><ci id="A0.T11.14.14.14.1.m1.1.1.2.cmml" xref="A0.T11.14.14.14.1.m1.1.1.2">𝐶</ci><cn id="A0.T11.14.14.14.1.m1.1.1.3.cmml" type="integer" xref="A0.T11.14.14.14.1.m1.1.1.3">3</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.T11.14.14.14.1.m1.1c">C_{3}</annotation><annotation encoding="application/x-llamapun" id="A0.T11.14.14.14.1.m1.1d">italic_C start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.14.14.14.2">3</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.14.14.14.3"> <span class="ltx_text" id="A0.T11.14.14.14.3.1"></span> <span class="ltx_text" id="A0.T11.14.14.14.3.2"> <span class="ltx_tabular ltx_align_middle" id="A0.T11.14.14.14.3.2.1"> <span class="ltx_tr" id="A0.T11.14.14.14.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.14.14.14.3.2.1.1.1">sail/poolformer_m48, sail/poolformer_m36, sail/poolformer_s36</span></span> </span></span><span class="ltx_text" id="A0.T11.14.14.14.3.3"></span></td> </tr> <tr class="ltx_tr" id="A0.T11.15.15.15"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.15.15.15.1"><math alttext="C_{4}" class="ltx_Math" display="inline" id="A0.T11.15.15.15.1.m1.1"><semantics id="A0.T11.15.15.15.1.m1.1a"><msub id="A0.T11.15.15.15.1.m1.1.1" xref="A0.T11.15.15.15.1.m1.1.1.cmml"><mi id="A0.T11.15.15.15.1.m1.1.1.2" xref="A0.T11.15.15.15.1.m1.1.1.2.cmml">C</mi><mn id="A0.T11.15.15.15.1.m1.1.1.3" xref="A0.T11.15.15.15.1.m1.1.1.3.cmml">4</mn></msub><annotation-xml encoding="MathML-Content" id="A0.T11.15.15.15.1.m1.1b"><apply 
id="A0.T11.15.15.15.1.m1.1.1.cmml" xref="A0.T11.15.15.15.1.m1.1.1"><csymbol cd="ambiguous" id="A0.T11.15.15.15.1.m1.1.1.1.cmml" xref="A0.T11.15.15.15.1.m1.1.1">subscript</csymbol><ci id="A0.T11.15.15.15.1.m1.1.1.2.cmml" xref="A0.T11.15.15.15.1.m1.1.1.2">𝐶</ci><cn id="A0.T11.15.15.15.1.m1.1.1.3.cmml" type="integer" xref="A0.T11.15.15.15.1.m1.1.1.3">4</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.T11.15.15.15.1.m1.1c">C_{4}</annotation><annotation encoding="application/x-llamapun" id="A0.T11.15.15.15.1.m1.1d">italic_C start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.15.15.15.2">7</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.15.15.15.3"> <span class="ltx_text" id="A0.T11.15.15.15.3.1"></span> <span class="ltx_text" id="A0.T11.15.15.15.3.2"> <span class="ltx_tabular ltx_align_middle" id="A0.T11.15.15.15.3.2.1"> <span class="ltx_tr" id="A0.T11.15.15.15.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.15.15.15.3.2.1.1.1">facebook/vit-msn-small, facebook/vit-msn-base, facebook/deit-base-patch16-384,</span></span> <span class="ltx_tr" id="A0.T11.15.15.15.3.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.15.15.15.3.2.1.2.1">google/vit-base-patch32-224-in21k, Visual-Attention-Network/van-large, facebook/deit-base-patch16-224, facebook/dino-vits16</span></span> </span></span><span class="ltx_text" id="A0.T11.15.15.15.3.3"></span></td> </tr> <tr class="ltx_tr" id="A0.T11.16.16.16"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.16.16.16.1"><math alttext="C_{5}" class="ltx_Math" display="inline" id="A0.T11.16.16.16.1.m1.1"><semantics id="A0.T11.16.16.16.1.m1.1a"><msub id="A0.T11.16.16.16.1.m1.1.1" xref="A0.T11.16.16.16.1.m1.1.1.cmml"><mi id="A0.T11.16.16.16.1.m1.1.1.2" xref="A0.T11.16.16.16.1.m1.1.1.2.cmml">C</mi><mn 
id="A0.T11.16.16.16.1.m1.1.1.3" xref="A0.T11.16.16.16.1.m1.1.1.3.cmml">5</mn></msub><annotation-xml encoding="MathML-Content" id="A0.T11.16.16.16.1.m1.1b"><apply id="A0.T11.16.16.16.1.m1.1.1.cmml" xref="A0.T11.16.16.16.1.m1.1.1"><csymbol cd="ambiguous" id="A0.T11.16.16.16.1.m1.1.1.1.cmml" xref="A0.T11.16.16.16.1.m1.1.1">subscript</csymbol><ci id="A0.T11.16.16.16.1.m1.1.1.2.cmml" xref="A0.T11.16.16.16.1.m1.1.1.2">𝐶</ci><cn id="A0.T11.16.16.16.1.m1.1.1.3.cmml" type="integer" xref="A0.T11.16.16.16.1.m1.1.1.3">5</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.T11.16.16.16.1.m1.1c">C_{5}</annotation><annotation encoding="application/x-llamapun" id="A0.T11.16.16.16.1.m1.1d">italic_C start_POSTSUBSCRIPT 5 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.16.16.16.2">4</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.16.16.16.3"> <span class="ltx_text" id="A0.T11.16.16.16.3.1"></span> <span class="ltx_text" id="A0.T11.16.16.16.3.2"> <span class="ltx_tabular ltx_align_middle" id="A0.T11.16.16.16.3.2.1"> <span class="ltx_tr" id="A0.T11.16.16.16.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.16.16.16.3.2.1.1.1">Visual-Attention-Network/van-base, microsoft/beit-large-patch16-224-pt22k,</span></span> <span class="ltx_tr" id="A0.T11.16.16.16.3.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.16.16.16.3.2.1.2.1">facebook/deit-small-patch16-224, shi-labs/dinat-base-in1k-224</span></span> </span></span><span class="ltx_text" id="A0.T11.16.16.16.3.3"></span></td> </tr> <tr class="ltx_tr" id="A0.T11.17.17.17"> <td class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.17.17.17.1"><math alttext="C_{6}" class="ltx_Math" display="inline" id="A0.T11.17.17.17.1.m1.1"><semantics id="A0.T11.17.17.17.1.m1.1a"><msub id="A0.T11.17.17.17.1.m1.1.1" xref="A0.T11.17.17.17.1.m1.1.1.cmml"><mi 
id="A0.T11.17.17.17.1.m1.1.1.2" xref="A0.T11.17.17.17.1.m1.1.1.2.cmml">C</mi><mn id="A0.T11.17.17.17.1.m1.1.1.3" xref="A0.T11.17.17.17.1.m1.1.1.3.cmml">6</mn></msub><annotation-xml encoding="MathML-Content" id="A0.T11.17.17.17.1.m1.1b"><apply id="A0.T11.17.17.17.1.m1.1.1.cmml" xref="A0.T11.17.17.17.1.m1.1.1"><csymbol cd="ambiguous" id="A0.T11.17.17.17.1.m1.1.1.1.cmml" xref="A0.T11.17.17.17.1.m1.1.1">subscript</csymbol><ci id="A0.T11.17.17.17.1.m1.1.1.2.cmml" xref="A0.T11.17.17.17.1.m1.1.1.2">𝐶</ci><cn id="A0.T11.17.17.17.1.m1.1.1.3.cmml" type="integer" xref="A0.T11.17.17.17.1.m1.1.1.3">6</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.T11.17.17.17.1.m1.1c">C_{6}</annotation><annotation encoding="application/x-llamapun" id="A0.T11.17.17.17.1.m1.1d">italic_C start_POSTSUBSCRIPT 6 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.17.17.17.2">2</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="A0.T11.17.17.17.3"> <span class="ltx_text" id="A0.T11.17.17.17.3.1"></span> <span class="ltx_text" id="A0.T11.17.17.17.3.2"> <span class="ltx_tabular ltx_align_middle" id="A0.T11.17.17.17.3.2.1"> <span class="ltx_tr" id="A0.T11.17.17.17.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.17.17.17.3.2.1.1.1">microsoft/beit-base-patch16-384, google/vit-base-patch16-384</span></span> </span></span><span class="ltx_text" id="A0.T11.17.17.17.3.3"></span></td> </tr> <tr class="ltx_tr" id="A0.T11.18.18.18"> <td class="ltx_td ltx_align_center ltx_border_b ltx_border_l ltx_border_r ltx_border_t" id="A0.T11.18.18.18.1"><math alttext="C_{7}" class="ltx_Math" display="inline" id="A0.T11.18.18.18.1.m1.1"><semantics id="A0.T11.18.18.18.1.m1.1a"><msub id="A0.T11.18.18.18.1.m1.1.1" xref="A0.T11.18.18.18.1.m1.1.1.cmml"><mi id="A0.T11.18.18.18.1.m1.1.1.2" xref="A0.T11.18.18.18.1.m1.1.1.2.cmml">C</mi><mn id="A0.T11.18.18.18.1.m1.1.1.3" 
xref="A0.T11.18.18.18.1.m1.1.1.3.cmml">7</mn></msub><annotation-xml encoding="MathML-Content" id="A0.T11.18.18.18.1.m1.1b"><apply id="A0.T11.18.18.18.1.m1.1.1.cmml" xref="A0.T11.18.18.18.1.m1.1.1"><csymbol cd="ambiguous" id="A0.T11.18.18.18.1.m1.1.1.1.cmml" xref="A0.T11.18.18.18.1.m1.1.1">subscript</csymbol><ci id="A0.T11.18.18.18.1.m1.1.1.2.cmml" xref="A0.T11.18.18.18.1.m1.1.1.2">𝐶</ci><cn id="A0.T11.18.18.18.1.m1.1.1.3.cmml" type="integer" xref="A0.T11.18.18.18.1.m1.1.1.3">7</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="A0.T11.18.18.18.1.m1.1c">C_{7}</annotation><annotation encoding="application/x-llamapun" id="A0.T11.18.18.18.1.m1.1d">italic_C start_POSTSUBSCRIPT 7 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="A0.T11.18.18.18.2">2</td> <td class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="A0.T11.18.18.18.3"> <span class="ltx_text" id="A0.T11.18.18.18.3.1"></span> <span class="ltx_text" id="A0.T11.18.18.18.3.2"> <span class="ltx_tabular ltx_align_middle" id="A0.T11.18.18.18.3.2.1"> <span class="ltx_tr" id="A0.T11.18.18.18.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="A0.T11.18.18.18.3.2.1.1.1">microsoft/beit-base-patch16-224, google/vit-base-patch16-224</span></span> </span></span><span class="ltx_text" id="A0.T11.18.18.18.3.3"></span></td> </tr> </table> </figure> </section> </article> </div> <footer class="ltx_page_footer"> <div class="ltx_page_logo">Generated on Thu May 2 20:15:09 2024 by <a class="ltx_LaTeXML_logo" href="http://dlmf.nist.gov/LaTeXML/"><span style="letter-spacing:-0.2em; margin-right:0.1em;">L<span class="ltx_font_smallcaps" style="position:relative; bottom:2.2pt;">a</span>T<span class="ltx_font_smallcaps" style="font-size:120%;position:relative; bottom:-0.2ex;">e</span></span><span style="font-size:90%; position:relative; bottom:-0.2ex;">XML</span><img alt="Mascot Sammy" 
src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAsAAAAOCAYAAAD5YeaVAAAAAXNSR0IArs4c6QAAAAZiS0dEAP8A/wD/oL2nkwAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB9wKExQZLWTEaOUAAAAddEVYdENvbW1lbnQAQ3JlYXRlZCB3aXRoIFRoZSBHSU1Q72QlbgAAAdpJREFUKM9tkL+L2nAARz9fPZNCKFapUn8kyI0e4iRHSR1Kb8ng0lJw6FYHFwv2LwhOpcWxTjeUunYqOmqd6hEoRDhtDWdA8ApRYsSUCDHNt5ul13vz4w0vWCgUnnEc975arX6ORqN3VqtVZbfbTQC4uEHANM3jSqXymFI6yWazP2KxWAXAL9zCUa1Wy2tXVxheKA9YNoR8Pt+aTqe4FVVVvz05O6MBhqUIBGk8Hn8HAOVy+T+XLJfLS4ZhTiRJgqIoVBRFIoric47jPnmeB1mW/9rr9ZpSSn3Lsmir1fJZlqWlUonKsvwWwD8ymc/nXwVBeLjf7xEKhdBut9Hr9WgmkyGEkJwsy5eHG5vN5g0AKIoCAEgkEkin0wQAfN9/cXPdheu6P33fBwB4ngcAcByHJpPJl+fn54mD3Gg0NrquXxeLRQAAwzAYj8cwTZPwPH9/sVg8PXweDAauqqr2cDjEer1GJBLBZDJBs9mE4zjwfZ85lAGg2+06hmGgXq+j3+/DsixYlgVN03a9Xu8jgCNCyIegIAgx13Vfd7vdu+FweG8YRkjXdWy329+dTgeSJD3ieZ7RNO0VAXAPwDEAO5VKndi2fWrb9jWl9Esul6PZbDY9Go1OZ7PZ9z/lyuD3OozU2wAAAABJRU5ErkJggg=="/></a> </div></footer> </div> </body> </html>
