<!DOCTYPE html> <html lang="en"> <head> <meta content="text/html; charset=utf-8" http-equiv="content-type"/> <title>An investigation into the causes of race bias in AI-based cine CMR segmentation</title> <!--Generated on Mon Aug 5 13:30:25 2024 by LaTeXML (version 0.8.8) http://dlmf.nist.gov/LaTeXML/.--> <meta content="width=device-width, initial-scale=1, shrink-to-fit=no" name="viewport"/> <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/ar5iv.0.7.9.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/ar5iv-fonts.0.7.9.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/latexml_styles.css" rel="stylesheet" type="text/css"/> <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/html2canvas/1.3.3/html2canvas.min.js"></script> <script src="/static/browse/0.3.4/js/addons_new.js"></script> <script src="/static/browse/0.3.4/js/feedbackOverlay.js"></script> <meta content=" Cardiac magnetic resonance artificial intelligence cardiac segmentation cardiac classification, bias" lang="en" name="keywords"/> <base href="/html/2408.02462v1/"/></head> <body> <nav class="ltx_page_navbar"> <nav class="ltx_TOC"> <ol class="ltx_toclist"> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S1" title="In An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">1 </span>Introduction</span></a></li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S2" title="In An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">2 
</span>Contributions</span></a></li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S3" title="In An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3 </span>Methods</span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S3.SS1" title="In 3 Methods ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3.1 </span>Dataset</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S3.SS2" title="In 3 Methods ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3.2 </span>Models used</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S3.SS3" title="In 3 Methods ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3.3 </span>Statistical evaluation</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S4" title="In An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4 </span>Results</span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S4.SS1" title="In 4 Results ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text 
ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.1 </span>Experiment 1: Source of bias</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S4.SS2" title="In 4 Results ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.2 </span>Experiment 2: Localisation of source of bias</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S4.SS3" title="In 4 Results ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.3 </span>Experiment 3: Are the biases observed in AI CMR segmentation due to confounders?</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S5" title="In An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">5 </span>Discussion</span></a></li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S6" title="In An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">6 </span>Conclusion</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"> <a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#Sx2.SS1" title="In Supplementary Material ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">6.1 </span>Experiment 1: source of the bias</span></a> <ol class="ltx_toclist ltx_toclist_subsection"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" 
href="https://arxiv.org/html/2408.02462v1#Sx2.SS2" title="In Item 2 ‣ 6.1 Experiment 1: source of the bias ‣ Supplementary Material ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">6.2 </span>Experiment 2: localisation of the source of the bias</span></a></li> </ol> </li> </ol></nav> </nav> <div class="ltx_page_main"> <div class="ltx_page_content"> <article class="ltx_document ltx_authors_1line"><span class="ltx_note ltx_role_institutetext" id="id1"><sup class="ltx_note_mark">1</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">1</sup><span class="ltx_note_type">institutetext: </span>School of Biomedical Engineering & Imaging Sciences, King's College London, UK. </span></span></span><span class="ltx_note ltx_role_institutetext" id="id2"><sup class="ltx_note_mark">2</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">2</sup><span class="ltx_note_type">institutetext: </span>Guy’s and St Thomas’ Hospital, London, UK. </span></span></span><span class="ltx_note ltx_role_institutetext" id="id3"><sup class="ltx_note_mark">3</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">3</sup><span class="ltx_note_type">institutetext: </span>College of Electronic and Information Engineering, Tongji University, China. 
</span></span></span><span class="ltx_note ltx_role_institutetext" id="id4"><sup class="ltx_note_mark">4</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">4</sup><span class="ltx_note_type">institutetext: </span>HeartFlow Inc, London, UK.</span></span></span> <h1 class="ltx_title ltx_title_document">An investigation into the causes of race bias in AI-based cine CMR segmentation</h1> <div class="ltx_authors"> <span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Tiarna Lee </span><span class="ltx_author_notes">1</span></span> <span class="ltx_author_before"> </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Esther Puyol-Antón </span><span class="ltx_author_notes">1,4</span></span> <span class="ltx_author_before"> </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Bram Ruijsink </span><span class="ltx_author_notes">1,2</span></span> <span class="ltx_author_before"> </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Sebastien Roujol </span><span class="ltx_author_notes">1</span></span> <span class="ltx_author_before"> </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Theodore Barfoot </span><span class="ltx_author_notes">1</span></span> <span class="ltx_author_before"> </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Shaheim Ogbomo-Harmitt </span><span class="ltx_author_notes">1</span></span> <span class="ltx_author_before"> </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Miaojing Shi </span><span class="ltx_author_notes">3</span></span> <span class="ltx_author_before"> </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Andrew P. 
King </span><span class="ltx_author_notes">1</span></span> </div> <div class="ltx_abstract"> <h6 class="ltx_title ltx_title_abstract">Abstract</h6> <p class="ltx_p" id="id1.id1">Artificial intelligence (AI) methods are being used increasingly for the automated segmentation of cine cardiac magnetic resonance (CMR) imaging. However, these methods have been shown to be subject to race bias, i.e. they exhibit different levels of performance for different races depending on the (im)balance of the data used to train the AI model. In this paper we investigate the source of this bias, seeking to understand its root cause(s) so that it can be effectively mitigated. We perform a series of classification and segmentation experiments on short-axis cine CMR images acquired from Black and White subjects from the UK Biobank and apply AI interpretability methods to understand the results. In the classification experiments, we found that race can be predicted with high accuracy from the images alone, but less accurately from ground truth segmentations, suggesting that the distributional shift between races, which is often the cause of AI bias, is mostly image-based rather than segmentation-based. The interpretability methods showed that most attention in the classification models was focused on non-heart regions, such as subcutaneous fat. Cropping the images tightly around the heart reduced classification accuracy to around chance level. Similarly, race can be predicted from the latent representations of a biased segmentation model, suggesting that race information is encoded in the model. Cropping images tightly around the heart reduced but did not eliminate segmentation bias. 
We also investigate the influence of possible confounders on the bias observed.</p> </div> <div class="ltx_keywords"> <h6 class="ltx_title ltx_title_keywords">Keywords: </h6> Cardiac magnetic resonance, artificial intelligence, cardiac segmentation, cardiac classification, bias </div> <section class="ltx_section" id="S1"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">1 </span>Introduction</h2> <div class="ltx_para" id="S1.p1"> <p class="ltx_p" id="S1.p1.1">Cardiac Magnetic Resonance (CMR) imaging is widely used to acquire images for diagnosis and prognosis of cardiovascular conditions. Artificial intelligence (AI) methods are increasingly being used to automate the estimation of functional biomarkers from cine CMR by automatic delineation (segmentation) of cardiac structures <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib1" title="">1</a>]</cite>, <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib2" title="">2</a>]</cite>, <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib3" title="">3</a>]</cite>. However, recent work has shown that AI CMR segmentation models can exhibit different levels of performance for different protected groups, such as those based on race <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib4" title="">4</a>]</cite>, <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib5" title="">5</a>]</cite>, <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib6" title="">6</a>]</cite> or sex <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib7" title="">7</a>]</cite> (i.e. they can be biased). 
In order to properly address this bias, it is important to understand its causes, but these are not yet well understood. This paper presents an investigation into the causes of race bias in AI-based CMR segmentation.</p> </div> </section> <section class="ltx_section" id="S2"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">2 </span>Contributions</h2> <div class="ltx_para" id="S2.p1"> <p class="ltx_p" id="S2.p1.1">The contribution of this work is to investigate the cause of bias in AI-based CMR segmentation models. We show that the main source of bias is in the image content outside of the heart region and that bias can be reduced by cropping the images before training the AI models.</p> </div> </section> <section class="ltx_section" id="S3"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">3 </span>Methods</h2> <section class="ltx_subsection" id="S3.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">3.1 </span>Dataset</h3> <div class="ltx_para" id="S3.SS1.p1"> <p class="ltx_p" id="S3.SS1.p1.1">The dataset used in the experiments described in this paper comprised cine short axis (SAX) CMR images from 436 subjects from the UK Biobank <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib8" title="">8</a>]</cite>. For each subject, typically 7 – 13 SAX slices were available at 50 time frames covering the cardiac cycle. 
The demographic information of the subjects can be found in <a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S3.T1" title="In 3.1 Dataset ‣ 3 Methods ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_tag">Table</span> <span class="ltx_text ltx_ref_tag">1</span></a>.</p> </div> <figure class="ltx_table" id="S3.T1"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table"><span class="ltx_text" id="S3.T1.4.2.1" style="font-size:90%;">Table 1</span>: </span><span class="ltx_text" id="S3.T1.2.1" style="font-size:90%;">Clinical characteristics of subjects used in the experiments. Mean values are presented for each characteristic with standard deviations given in brackets. Statistically significant differences between subject groups and the overall average are indicated with an asterisk * (p <math alttext="<" class="ltx_Math" display="inline" id="S3.T1.2.1.m1.1"><semantics id="S3.T1.2.1.m1.1b"><mo id="S3.T1.2.1.m1.1.1" xref="S3.T1.2.1.m1.1.1.cmml"><</mo><annotation-xml encoding="MathML-Content" id="S3.T1.2.1.m1.1c"><lt id="S3.T1.2.1.m1.1.1.cmml" xref="S3.T1.2.1.m1.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="S3.T1.2.1.m1.1d"><</annotation><annotation encoding="application/x-llamapun" id="S3.T1.2.1.m1.1e"><</annotation></semantics></math> 0.05) and were determined using a two-tailed Student’s t-test.</span></figcaption> <p class="ltx_p ltx_align_center" id="S3.T1.5"><span class="ltx_text ltx_inline-block" id="S3.T1.5.1" style="width:433.6pt;"> <span class="ltx_inline-block ltx_transformed_outer" id="S3.T1.5.1.1" style="width:439.8pt;height:92.9pt;vertical-align:-2.9pt;"><span class="ltx_transformed_inner" style="transform:translate(0.0pt,0.0pt) scale(1,1) ;"> <span class="ltx_p" id="S3.T1.5.1.1.1"><span class="ltx_text" id="S3.T1.5.1.1.1.1"> <span class="ltx_tabular ltx_guessed_headers ltx_align_middle" id="S3.T1.5.1.1.1.1.1"> <span 
class="ltx_thead"> <span class="ltx_tr" id="S3.T1.5.1.1.1.1.1.1.1"> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_l ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.1.1.1"><span class="ltx_text ltx_font_bold" id="S3.T1.5.1.1.1.1.1.1.1.1.1">Health measure</span></span> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.1.1.2"><span class="ltx_text ltx_font_bold" id="S3.T1.5.1.1.1.1.1.1.1.2.1">Overall</span></span> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.1.1.3"><span class="ltx_text ltx_font_bold" id="S3.T1.5.1.1.1.1.1.1.1.3.1">White</span></span> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.1.1.4"><span class="ltx_text ltx_font_bold" id="S3.T1.5.1.1.1.1.1.1.1.4.1">Black</span></span></span> <span class="ltx_tr" id="S3.T1.5.1.1.1.1.1.2.2"> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_l ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.2.2.1"># subjects</span> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.2.2.2">436</span> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.2.2.3">218</span> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.2.2.4">218</span></span> </span> <span class="ltx_tbody"> <span class="ltx_tr" id="S3.T1.5.1.1.1.1.1.3.1"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.3.1.1">Age (years)</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.3.1.2">58.9 (7.0)</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.3.1.3">58.9 (7.0)</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" 
id="S3.T1.5.1.1.1.1.1.3.1.4">58.8 (6.9)</span></span> <span class="ltx_tr" id="S3.T1.5.1.1.1.1.1.4.2"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.4.2.1">Standing height (cm)</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.4.2.2">80.6 (16.6)</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.4.2.3">79.3 (17.0)</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.4.2.4">82.0 (16.1)</span></span> <span class="ltx_tr" id="S3.T1.5.1.1.1.1.1.5.3"> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_l ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.5.3.1">Body Mass Index</span> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.5.3.2">27.7 (4.9)</span> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.5.3.3">26.9 (4.6)*</span> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="S3.T1.5.1.1.1.1.1.5.3.4">28.6 (5.1)*</span></span> </span> </span></span></span> </span></span></span></p> </figure> <div class="ltx_para" id="S3.SS1.p2"> <p class="ltx_p" id="S3.SS1.p2.1">The subjects selected for potential inclusion from the full UK Biobank CMR cohort were those with available manual ground truth segmentations (4928 out of 78,166 subjects) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib9" title="">9</a>]</cite>. The manual segmentations were of the left ventricular blood pool (LVBP), left ventricular myocardium (LVM), and right ventricular blood pool (RVBP) and were performed for the end diastole (ED) and end systole (ES) images. Therefore, only the ED and ES frames were used in our experiments. 
Manual segmentation was performed by outlining the LV endocardial and epicardial borders and the RV endocardial border using cvi42 (version 5.1.1, Circle Cardiovascular Imaging Inc., Calgary, Alberta, Canada). A panel of ten experts was provided with the same guidelines and one expert annotated each image. The selection of images for annotation included subjects with different sexes and races and was randomised. The experts were not provided with demographic information about the subjects.</p> </div> <div class="ltx_para" id="S3.SS1.p3"> <p class="ltx_p" id="S3.SS1.p3.1">From the available data, a cohort of 218 Black subjects was selected for use in all experiments. This cohort was chosen to have 109 males (all available Black males) and 109 females to minimise the impact of possible sex bias. To select the White subjects a matched pairs design was used, in which White subjects with matching age (± 1 year) and sex to each Black subject were chosen at random from the available pool of 4690 White subjects with ground truth segmentations. For each subject, the ED and ES frames of all SAX CMR slices and their corresponding ground truth segmentations were utilised in the experiments. 
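</p> <p class="ltx_p">The matched pairs selection described above can be illustrated with a minimal Python sketch. The function name, the record fields and the toy subjects are hypothetical, and drawing matches greedily without replacement is an assumption, since the sampling details are not specified here.</p>
<pre class="ltx_verbatim">
```python
import random


def select_matched_controls(cases, pool, age_tolerance=1, seed=0):
    """For each case, draw one unused pool subject of the same sex whose age is within the tolerance."""
    rng = random.Random(seed)
    available = list(pool)
    pairs = []
    for case in cases:
        candidates = [
            s for s in available
            if s["sex"] == case["sex"] and age_tolerance >= abs(s["age"] - case["age"])
        ]
        if not candidates:
            raise ValueError(f"no match available for subject {case['id']}")
        match = rng.choice(candidates)  # random choice among eligible subjects
        available.remove(match)         # assumption: sampling without replacement
        pairs.append((case, match))
    return pairs


# Toy data (hypothetical): two subjects to match and a pool of four candidates.
cases = [
    {"id": "B1", "sex": "F", "age": 58},
    {"id": "B2", "sex": "M", "age": 63},
]
pool = [
    {"id": "W1", "sex": "F", "age": 59},
    {"id": "W2", "sex": "F", "age": 70},
    {"id": "W3", "sex": "M", "age": 62},
    {"id": "W4", "sex": "M", "age": 40},
]
pairs = select_matched_controls(cases, pool)
matched_ids = [(case["id"], match["id"]) for case, match in pairs]
```
</pre>
<p class="ltx_p">A greedy draw of this kind can run out of candidates for later subjects even when a complete matching exists, which is unlikely to matter when the pool is much larger than the cohort being matched.</p> <p class="ltx_p">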
Demographic and health data for the subjects were acquired from the UK Biobank database, including the subjects’ age, standing height, weight, body mass index (BMI), resting heart rate, systolic and diastolic blood pressure, left ventricular stroke volume (LVSV), left ventricular ejection fraction (LVEF), left ventricular end diastolic mass (LVEDM), HDL cholesterol, cholesterol, diabetes status, hypertension status, hypercholesterolemia status, smoking status and date of the MRI scan.</p> </div> </section> <section class="ltx_subsection" id="S3.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">3.2 </span>Models used</h3> <div class="ltx_para" id="S3.SS2.p1"> <p class="ltx_p" id="S3.SS2.p1.1">To investigate the source of bias, we employed two types of AI model: a classification model and a segmentation model.</p> </div> <div class="ltx_para" id="S3.SS2.p2"> <p class="ltx_p" id="S3.SS2.p2.1">ResNet-18 is a deep convolutional neural network (CNN) for classification consisting of 18 layers <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib10" title="">10</a>]</cite>. The network uses residual blocks with skip connections, which make deep networks easier to train. For the classification experiments (Experiments 1 and 2 in the Results), the model was trained for 100 epochs with an initial learning rate of 0.001, which was decreased by a factor of 10 every 50 epochs. The loss function used was binary cross entropy and the model was optimised using stochastic gradient descent. The batch size was 16. The images were augmented using random mirroring, rotation, scaling and translation. As the images are greyscale, no colour intensity transformations were used. 
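</p> <p class="ltx_p">The step learning rate schedule stated above for the classifier (0.001 initially, divided by 10 every 50 epochs, for 100 epochs) can be written out as a short Python sketch. The function name is hypothetical, and counting epochs from zero so that the drop takes effect at epoch 50 is an assumption.</p>
<pre class="ltx_verbatim">
```python
def step_learning_rate(epoch, initial_lr=0.001, drop_factor=10.0, epochs_per_drop=50):
    """Learning rate for a zero-indexed epoch under step decay."""
    # Integer division counts how many drops have happened so far.
    return initial_lr / (drop_factor ** (epoch // epochs_per_drop))


# One value per training epoch for the 100-epoch classification runs.
schedule = [step_learning_rate(epoch) for epoch in range(100)]
```
</pre>
<p class="ltx_p">Under these assumptions the first 50 epochs use the initial rate and the remaining 50 use one tenth of it.</p> <p class="ltx_p">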
Each model was trained 10 times with different random seeds and train/validation splits, and the mean and standard deviation over these 10 runs are reported.</p> </div> <div class="ltx_para" id="S3.SS2.p3"> <p class="ltx_p" id="S3.SS2.p3.1">For the classification network, we also employed the gradient-weighted class activation mapping method, or GradCAM, which is a visualisation and interpretability method <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib11" title="">11</a>]</cite>. The gradients of the target class score (in our case, race) with respect to the feature maps of the last convolutional layer of the classification network were used to weight those feature maps, producing a heatmap which shows the areas of an image that were most important for the classification decision.</p> </div> <div class="ltx_para" id="S3.SS2.p4"> <p class="ltx_p" id="S3.SS2.p4.1">For the segmentation experiments, we used nnU-Net, a self-adapting framework for segmentation of biomedical images <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib12" title="">12</a>]</cite>. The network automatically adapts to the imaging modality and changes training parameters such as the patch size, batch size and image resampling. The nnU-Net v1 model consists of an encoder and a decoder which together form a “U” shape, allowing the network to learn a more abstract representation of the images. For the segmentation experiments (Experiments 1 and 2 in the Results), the model was trained for 500 epochs. The loss function used was a combined Dice and cross entropy loss. The model was optimised using stochastic gradient descent with a Nesterov momentum of 0.99 and a ‘poly’ learning rate schedule with an initial learning rate of 0.01. A batch size of 16 was used. During training, data augmentation was applied to the images including mirroring, rotation and scaling. 
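</p> <p class="ltx_p">As an illustration of the ‘poly’ learning rate schedule mentioned above, the following Python sketch computes the rate per epoch from the stated initial value of 0.01 over 500 epochs. The function name is hypothetical and the exponent of 0.9 is an assumption based on the usual nnU-Net default, as the exponent is not stated here.</p>
<pre class="ltx_verbatim">
```python
def poly_learning_rate(epoch, max_epochs=500, initial_lr=0.01, exponent=0.9):
    """Polynomial decay of the learning rate for a zero-indexed epoch."""
    # The rate starts at initial_lr and decays smoothly towards zero at max_epochs.
    return initial_lr * (1 - epoch / max_epochs) ** exponent


# One value per training epoch for the 500-epoch segmentation runs.
schedule = [poly_learning_rate(epoch) for epoch in range(500)]
```
</pre>
<p class="ltx_p">Unlike a step schedule, this rate decreases at every epoch, so there is no single epoch at which training behaviour changes abruptly.</p> <p class="ltx_p">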
Cross-validation was performed on the training set, resulting in five models, which were used as an ensemble for inference on the test set.</p> </div> </section> <section class="ltx_subsection" id="S3.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">3.3 </span>Statistical evaluation</h3> <div class="ltx_para" id="S3.SS3.p1"> <p class="ltx_p" id="S3.SS3.p1.1">Classification performance was evaluated using overall accuracy, sensitivity and specificity. Differences in performance were evaluated using a two-tailed Student’s t-test of the accuracies of the 10 runs. Segmentation performance was evaluated using the Dice Similarity Coefficient (DSC), which measures the overlap between ground truth and predicted segmentations, where 1 is a perfect overlap and 0 is no overlap. Confounder analysis was performed using linear regression models in SPSS Statistics (IBM Corp. Released 2023. IBM SPSS Statistics for Macintosh, Version 29.0.2.0 Armonk, NY: IBM Corp).</p> </div> </section> </section> <section class="ltx_section" id="S4"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">4 </span>Results</h2> <div class="ltx_para" id="S4.p1"> <p class="ltx_p" id="S4.p1.1">The experiments performed using the data and models described above aimed to investigate three aspects of the bias in AI CMR segmentation performance as detailed below.</p> </div> <section class="ltx_subsection" id="S4.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.1 </span>Experiment 1: Source of bias</h3> <div class="ltx_para" id="S4.SS1.p1"> <p class="ltx_p" id="S4.SS1.p1.1">Bias in AI models is often the result of a distributional shift between the data of subjects in different protected groups. 
Combined with imbalance in the training data, these distributional shifts can lead to bias in performance of AI models <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib6" title="">6</a>, <a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib13" title="">13</a>]</cite>. However, the distributional shift can be in the images, the ground truth segmentations or a combination of both. Understanding the origin of the bias in trained segmentation models is important when deciding on strategies to address it. Therefore, the first experiment aimed to assess the extent of the distributional shift between the CMR images and/or the ground truth segmentations.</p> </div> <div class="ltx_para" id="S4.SS1.p2"> <p class="ltx_p" id="S4.SS1.p2.1">To quantify the extent of the distributional shifts, we trained ResNet-18 models to classify the race of the subject (White vs Black) from a single SAX CMR image and/or segmentation. The SAX CMR images and ground truth segmentations of the 218 Black and 218 White subjects were randomly split at the subject level into training and test datasets with 176 and 84 subjects respectively (ensuring that both images from each matched pair were in the same split).</p> </div> <div class="ltx_para" id="S4.SS1.p3"> <p class="ltx_p" id="S4.SS1.p3.1">The classifier was trained with three channels as input. 
To assess the relative distributional shifts between images and ground truth segmentations we used four different combinations of images (Im) and segmentations (Seg): Im-Im-Im, Im-Im-Seg, Im-Seg-Seg, and Seg-Seg-Seg, as illustrated in <a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S4.F1" title="In 4.1 Experiment 1: Source of bias ‣ 4 Results ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_tag">Fig.</span> <span class="ltx_text ltx_ref_tag">1</span></a>.</p> </div> <figure class="ltx_figure" id="S4.F1"><img alt="Refer to caption" class="ltx_graphics ltx_img_landscape" height="97" id="S4.F1.g1" src="extracted/5774675/Images/Ims_diagram.png" width="471"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F1.2.1.1" style="font-size:90%;">Figure 1</span>: </span><span class="ltx_text" id="S4.F1.3.2" style="font-size:90%;">An illustration of the combination of images and segmentations used as input to the protected attribute classifiers</span></figcaption> </figure> <div class="ltx_para" id="S4.SS1.p4"> <p class="ltx_p" id="S4.SS1.p4.1"><a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S4.T2" title="In 4.1 Experiment 1: Source of bias ‣ 4 Results ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_tag">Table</span> <span class="ltx_text ltx_ref_tag">2</span></a> shows the results for classifying Black and White race subjects. The highest accuracies were achieved when images were used, either on their own or in combination with segmentations. The accuracy of the Seg-Seg-Seg dataset was the lowest but still higher than random chance. 
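</p> <p class="ltx_p">The four three-channel input combinations described above can be sketched in a few lines of Python. The helper name and the toy arrays are hypothetical, and the assumption is that each combination name lists the channel contents in order.</p>
<pre class="ltx_verbatim">
```python
def build_input(image, segmentation, combination):
    """Stack three channels in the order named by a combination such as 'Im-Im-Seg'."""
    sources = {"Im": image, "Seg": segmentation}
    parts = combination.split("-")
    if len(parts) != 3 or any(part not in sources for part in parts):
        raise ValueError(f"unknown combination: {combination}")
    return [sources[part] for part in parts]


# Toy 2x2 greyscale slice and label map (0 = background); values are made up.
image = [[0.2, 0.8], [0.5, 0.1]]
segmentation = [[0, 1], [2, 0]]

combinations = ["Im-Im-Im", "Im-Im-Seg", "Im-Seg-Seg", "Seg-Seg-Seg"]
inputs = {name: build_input(image, segmentation, name) for name in combinations}
```
</pre>
<p class="ltx_p">Keeping the channel count fixed at three means the same classifier architecture can be reused unchanged across all four input types, so differences in accuracy reflect the input content rather than the network.</p> <p class="ltx_p">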
Using a two-tailed Student’s t-test between the accuracies of the 10 runs for each of the datasets, the only significant differences were between the Seg-Seg-Seg dataset and the other three datasets which contained images (p <math alttext="<" class="ltx_Math" display="inline" id="S4.SS1.p4.1.m1.1"><semantics id="S4.SS1.p4.1.m1.1a"><mo id="S4.SS1.p4.1.m1.1.1" xref="S4.SS1.p4.1.m1.1.1.cmml"><</mo><annotation-xml encoding="MathML-Content" id="S4.SS1.p4.1.m1.1b"><lt id="S4.SS1.p4.1.m1.1.1.cmml" xref="S4.SS1.p4.1.m1.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.p4.1.m1.1c"><</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.p4.1.m1.1d"><</annotation></semantics></math> 0.0001 for all).</p> </div> <figure class="ltx_table" id="S4.T2"> <p class="ltx_p ltx_align_center" id="S4.T2.2"><span class="ltx_text ltx_inline-block" id="S4.T2.2.1" style="width:433.6pt;"> <span class="ltx_inline-block ltx_transformed_outer" id="S4.T2.2.1.1" style="width:369.5pt;height:90pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(0.0pt,0.0pt) scale(1,1) ;"> <span class="ltx_p" id="S4.T2.2.1.1.1"><span class="ltx_text" id="S4.T2.2.1.1.1.1"> <span class="ltx_tabular ltx_guessed_headers ltx_align_middle" id="S4.T2.2.1.1.1.1.1"> <span class="ltx_thead"> <span class="ltx_tr" id="S4.T2.2.1.1.1.1.1.1.1"> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_l ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.1.1.1"><span class="ltx_text ltx_font_bold" id="S4.T2.2.1.1.1.1.1.1.1.1.1">Image type</span></span> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.1.1.2"><span class="ltx_text ltx_font_bold" id="S4.T2.2.1.1.1.1.1.1.1.2.1">Accuracy</span></span> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.1.1.3"><span class="ltx_text ltx_font_bold" 
id="S4.T2.2.1.1.1.1.1.1.1.3.1">Sensitivity</span></span> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.1.1.4"><span class="ltx_text ltx_font_bold" id="S4.T2.2.1.1.1.1.1.1.1.4.1">Specificity</span></span></span> </span> <span class="ltx_tbody"> <span class="ltx_tr" id="S4.T2.2.1.1.1.1.1.2.1"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.2.1.1">Im-Im-Im</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.2.1.2"><span class="ltx_text ltx_font_bold" id="S4.T2.2.1.1.1.1.1.2.1.2.1">0.959 (0.004)</span></span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.2.1.3"><span class="ltx_text ltx_font_bold" id="S4.T2.2.1.1.1.1.1.2.1.3.1">0.966 (0.013)</span></span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.2.1.4">0.951 (0.014)</span></span> <span class="ltx_tr" id="S4.T2.2.1.1.1.1.1.3.2"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.3.2.1">Im-Im-Seg</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.3.2.2">0.957 (0.010)</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.3.2.3">0.959 (0.011)</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.3.2.4"><span class="ltx_text ltx_font_bold" id="S4.T2.2.1.1.1.1.1.3.2.4.1">0.956 (0.018)</span></span></span> <span class="ltx_tr" id="S4.T2.2.1.1.1.1.1.4.3"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.4.3.1">Im-Seg-Seg</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.4.3.2">0.955 (0.007)</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.4.3.3">0.961 (0.013)</span> <span class="ltx_td 
ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.4.3.4">0.948 (0.010)</span></span> <span class="ltx_tr" id="S4.T2.2.1.1.1.1.1.5.4"> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_l ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.5.4.1">Seg-Seg-Seg</span> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.5.4.2">0.742 (0.005)</span> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.5.4.3">0.727 (0.011)</span> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="S4.T2.2.1.1.1.1.1.5.4.4">0.765 (0.020)</span></span> </span> </span></span></span> </span></span></span></p> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table"><span class="ltx_text" id="S4.T2.3.1.1" style="font-size:90%;">Table 2</span>: </span><span class="ltx_text" id="S4.T2.4.2" style="font-size:90%;">Accuracy for experiment on classifying the subjects by race (Black and White). The results show the mean (standard deviation) over 10 repeat runs. The highest result for each measure is shown in bold.</span></figcaption> </figure> <div class="ltx_para" id="S4.SS1.p5"> <p class="ltx_p" id="S4.SS1.p5.1">The conclusion of this first experiment is that the majority of the distributional shift between races lies in the images rather than the ground truth segmentations.</p> </div> <div class="ltx_para" id="S4.SS1.p6"> <p class="ltx_p" id="S4.SS1.p6.1">The next experiment aimed to assess whether, and the degree to which, this distributional shift is also encoded in trained segmentation models. To answer these questions, we used an approach similar to that described in <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib14" title="">14</a>]</cite>. First, nnU-Net models were trained to segment CMR images using training data with varying levels of race imbalance. 
Next, the decoder part of each trained network was removed. The test set was then fed through the encoder part of the network to produce a latent vector for each test subject. We then used principal component analysis (PCA) to reduce the dimensionality of these latent representations and visualised the results. Furthermore, we investigated whether the races could be separated in the reduced-dimensional space using logistic regression classification.</p> </div> <div class="ltx_para" id="S4.SS1.p7"> <p class="ltx_p" id="S4.SS1.p7.1">The quantitative results for this experiment are shown in <a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S4.T3" title="In 4.1 Experiment 1: Source of bias ‣ 4 Results ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_tag">Table</span> <span class="ltx_text ltx_ref_tag">3</span></a>. The PCA-reduced representations of the images could be classified by race with a high accuracy of approximately 95% for all models, irrespective of the race balance of the training set. 
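</p> </div> <div class="ltx_para"> <p class="ltx_p">The latent-space analysis described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this work: it assumes that the encoder outputs have already been flattened into one vector per subject, it uses scikit-learn, and the function name <span class="ltx_text ltx_font_typewriter">latent_race_separability</span>, the number of principal components and the cross-validation scheme are all choices made for the example.</p>
<pre class="ltx_verbatim">
```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline


def latent_race_separability(latents, labels, n_components=2, folds=5):
    """Mean cross-validated accuracy of logistic regression on PCA-reduced latents.

    latents: array of shape (n_subjects, n_features), one encoder vector per subject.
    labels:  array of shape (n_subjects,), e.g. 0 = Black, 1 = White.
    n_components and folds are illustrative defaults, not values from the paper.
    """
    model = make_pipeline(
        PCA(n_components=n_components),     # reduce the dimensionality of the latents
        LogisticRegression(max_iter=1000),  # linear classifier in the reduced space
    )
    scores = cross_val_score(model, latents, labels, cv=folds, scoring="accuracy")
    return float(scores.mean())


# Synthetic stand-in for encoder outputs: two groups whose means differ.
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 1.0, size=(100, 64))
group_b = rng.normal(1.0, 1.0, size=(100, 64))
latents = np.vstack([group_a, group_b])
labels = np.array([0] * 100 + [1] * 100)

accuracy = latent_race_separability(latents, labels)
print(f"cross-validated accuracy: {accuracy:.3f}")
```
</pre>
<p class="ltx_p">Fitting the PCA inside the pipeline means that it is re-estimated on each training fold, so the held-out subjects do not influence the projection. With balanced groups, an accuracy well above 0.5 indicates that the protected attribute is linearly recoverable from the reduced latent space.</p> </div> <div class="ltx_para"> <p class="ltx_p">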
Visual illustrations of the classification of protected attributes can be found in <a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#Sx2.F1" title="In 6.1 Experiment 1: source of the bias ‣ Supplementary Material ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_tag">Fig.</span> <span class="ltx_text ltx_ref_tag">S1</span></a>.</p> </div> <figure class="ltx_table" id="S4.T3"> <p class="ltx_p ltx_align_center" id="S4.T3.2"><span class="ltx_text ltx_inline-block" id="S4.T3.2.1" style="width:433.6pt;"> <span class="ltx_inline-block ltx_transformed_outer" id="S4.T3.2.1.1" style="width:306.8pt;height:109.9pt;vertical-align:-1.9pt;"><span class="ltx_transformed_inner" style="transform:translate(0.0pt,0.0pt) scale(1,1) ;"> <span class="ltx_p" id="S4.T3.2.1.1.1"><span class="ltx_text" id="S4.T3.2.1.1.1.1"> <span class="ltx_tabular ltx_guessed_headers ltx_align_middle" id="S4.T3.2.1.1.1.1.1"> <span class="ltx_thead"> <span class="ltx_tr" id="S4.T3.2.1.1.1.1.1.1.1"> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_l ltx_border_r ltx_border_t" id="S4.T3.2.1.1.1.1.1.1.1.1"><span class="ltx_text ltx_font_bold" id="S4.T3.2.1.1.1.1.1.1.1.1.1">Training dataset split</span></span> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_r ltx_border_t" id="S4.T3.2.1.1.1.1.1.1.1.2"><span class="ltx_text ltx_font_bold" id="S4.T3.2.1.1.1.1.1.1.1.2.1">Black vs White</span></span></span> </span> <span class="ltx_tbody"> <span class="ltx_tr" id="S4.T3.2.1.1.1.1.1.2.1"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T3.2.1.1.1.1.1.2.1.1">100%/0%</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T3.2.1.1.1.1.1.2.1.2">0.950</span></span> <span class="ltx_tr" id="S4.T3.2.1.1.1.1.1.3.2"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T3.2.1.1.1.1.1.3.2.1">75%/25%</span> <span 
class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T3.2.1.1.1.1.1.3.2.2">0.950</span></span> <span class="ltx_tr" id="S4.T3.2.1.1.1.1.1.4.3"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T3.2.1.1.1.1.1.4.3.1">50%/50%</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T3.2.1.1.1.1.1.4.3.2">0.949</span></span> <span class="ltx_tr" id="S4.T3.2.1.1.1.1.1.5.4"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T3.2.1.1.1.1.1.5.4.1">25%/75%</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T3.2.1.1.1.1.1.5.4.2">0.951</span></span> <span class="ltx_tr" id="S4.T3.2.1.1.1.1.1.6.5"> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_l ltx_border_r ltx_border_t" id="S4.T3.2.1.1.1.1.1.6.5.1">0%/100%</span> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="S4.T3.2.1.1.1.1.1.6.5.2">0.949</span></span> </span> </span></span></span> </span></span></span></p> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table"><span class="ltx_text" id="S4.T3.3.1.1" style="font-size:90%;">Table 3</span>: </span><span class="ltx_text" id="S4.T3.4.2" style="font-size:90%;">Accuracy of a logistic regression model classifying the PCA representations of the test CMR images fed through the segmentation model encoder.</span></figcaption> </figure> <div class="ltx_para" id="S4.SS1.p8"> <p class="ltx_p" id="S4.SS1.p8.1">The conclusion from this experiment is that race appears to be encoded in the latent representations of the trained segmentation models.</p> </div> </section> <section class="ltx_subsection" id="S4.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.2 </span>Experiment 2: Localisation of source of bias</h3> <div class="ltx_para" id="S4.SS2.p1"> <p class="ltx_p" id="S4.SS2.p1.1">The first set of experiments resulted in high accuracy for race 
classification, suggesting a strong distributional shift. They also suggested that the source of the bias was mainly in the images and that it was being encoded into the segmentation model. Therefore, we next sought to understand which parts of the images were leading to the distributional shift and hence the bias. To visualise the relative importance of the different regions of the image, we used GradCAM <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib11" title="">11</a>]</cite> applied to the race classification models.</p> </div> <div class="ltx_para" id="S4.SS2.p2"> <p class="ltx_p" id="S4.SS2.p2.1">The results are shown in <a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S4.F2" title="In 4.2 Experiment 2: Localisation of source of bias ‣ 4 Results ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_tag">Fig.</span> <span class="ltx_text ltx_ref_tag">2</span></a> with normalised CMR images and GradCAM images. These representative examples show that for both the Black and White subjects, the most attention is being given to non-heart regions such as subcutaneous fat. 
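</p> </div> <div class="ltx_para"> <p class="ltx_p">The core GradCAM computation behind such heatmaps can be sketched as follows. The sketch assumes that the feature maps of a chosen convolutional layer, and the gradients of the class score with respect to them, have already been extracted from the classifier; the function name <span class="ltx_text ltx_font_typewriter">grad_cam</span> and the array layout are assumptions for the example, and the layer used in this work is not stated here.</p>
<pre class="ltx_verbatim">
```python
import numpy as np


def grad_cam(activations, gradients):
    """GradCAM heatmap from one layer's feature maps and their gradients.

    activations: array (K, H, W), feature maps of the chosen layer.
    gradients:   array (K, H, W), d(class score) / d(activations).
    Returns an (H, W) map scaled to [0, 1].
    """
    # Channel weights: global average of the gradients over the spatial axes.
    weights = gradients.mean(axis=(1, 2))             # shape (K,)
    # Weighted sum of the feature maps, followed by a ReLU.
    cam = np.tensordot(weights, activations, axes=1)  # shape (H, W)
    cam = np.maximum(cam, 0.0)
    # Normalise for display; an all-zero map is returned unchanged.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy example: channel 0 supports the class, channel 1 counts against it.
activations = np.zeros((2, 5, 5))
activations[0, 1, 1] = 3.0
activations[1, 3, 3] = 3.0
gradients = np.stack([np.full((5, 5), 0.5), np.full((5, 5), -0.5)])

heatmap = grad_cam(activations, gradients)
```
</pre>
<p class="ltx_p">In practice the low-resolution map is upsampled to the image size and overlaid on the normalised image, so that high values mark the regions that contributed most to the predicted class.</p> </div> <div class="ltx_para"> <p class="ltx_p">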
Further examples can be seen in <a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#Sx2.F2" title="In 6.1 Experiment 1: source of the bias ‣ Supplementary Material ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_tag">Fig.</span> <span class="ltx_text ltx_ref_tag">S2</span></a> and <a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#Sx2.F3" title="In 6.1 Experiment 1: source of the bias ‣ Supplementary Material ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_tag">Fig.</span> <span class="ltx_text ltx_ref_tag">S3</span></a>.</p> </div> <figure class="ltx_figure" id="S4.F2"><img alt="Refer to caption" class="ltx_graphics ltx_img_square" height="468" id="S4.F2.g1" src="extracted/5774675/Images/Gradcam_images.png" width="471"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F2.2.1.1" style="font-size:90%;">Figure 2</span>: </span><span class="ltx_text" id="S4.F2.3.2" style="font-size:90%;">Examples of the normalised CMR images and GradCAM images for the classification model trained on the Im-Im-Im dataset for Black vs White subjects. Higher values (red) correspond to important areas used for race classification; lower values (blue) correspond to less important areas. The top image displays a heatmap where the non-heart regions have higher activations, the bottom image shows a heatmap where artefacts have higher activations.</span></figcaption> </figure> <div class="ltx_para" id="S4.SS2.p3"> <p class="ltx_p" id="S4.SS2.p3.1">By visual inspection of all test images, we found that 42% of the images had the highest activations in non-heart anatomical regions of the body whereas only 6% had the highest activation in heart regions. 
The remaining 52% were classified as either ‘activations due to image artefacts’ (50%) or ‘other’, where there were no clear activations in any particular area (2%). These image artefacts become visible after the images are normalised, which occurs before model training. The artefacts can be caused by interactions between the magnetic field and body tissues during MR image acquisition. For example, ‘ghosting artefacts’ can cause the skin and fat layers to appear as echoes at regular intervals in an image <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib15" title="">15</a>]</cite>.</p> </div> <div class="ltx_para" id="S4.SS2.p4"> <p class="ltx_p" id="S4.SS2.p4.1">Based on these results, we next investigated how race classification performance changed when different areas of the images were used as input. We created two further datasets of images: a dataset including only the heart and a dataset excluding the heart. For the first dataset, we cropped the images around the region of the heart using a bounding box based on the ground truth segmentations. All images for a given experiment were cropped to the same size, i.e. the size of the largest heart in the dataset. For the second dataset, the heart was blurred out using a Gaussian filter. Examples of the images created in this way are shown in <a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S4.F3" title="In 4.2 Experiment 2: Localisation of source of bias ‣ 4 Results ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_tag">Fig.</span> <span class="ltx_text ltx_ref_tag">3</span></a>. 
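</p> </div> <div class="ltx_para"> <p class="ltx_p">The two image manipulations described above can be sketched as follows. This is a minimal 2D illustration using NumPy and SciPy, not the code used in this work: the function names, the zero-padding at image borders, the blur strength <span class="ltx_text ltx_font_typewriter">sigma</span> and the choice to blur the bounding box of the heart (rather than, say, the segmentation mask itself) are all assumptions.</p>
<pre class="ltx_verbatim">
```python
import numpy as np
from scipy.ndimage import gaussian_filter


def heart_bounding_box(seg):
    """Inclusive (row_min, row_max, col_min, col_max) of the non-zero labels."""
    rows, cols = np.nonzero(seg)
    return rows.min(), rows.max(), cols.min(), cols.max()


def crop_around_heart(image, seg, size):
    """Crop a fixed-size window centred on the bounding box of the heart."""
    r0, r1, c0, c1 = heart_bounding_box(seg)
    centre_r, centre_c = (r0 + r1) // 2, (c0 + c1) // 2
    half_h, half_w = size[0] // 2, size[1] // 2
    # Zero-pad so that windows near the border keep the requested size.
    padded = np.pad(image, ((half_h, half_h), (half_w, half_w)), mode="constant")
    # In padded coordinates the centre moves by (half_h, half_w), so the
    # window that is centred on the heart starts at (centre_r, centre_c).
    return padded[centre_r:centre_r + size[0], centre_c:centre_c + size[1]]


def blur_heart(image, seg, sigma=8.0, margin=2):
    """Replace the bounding box of the heart with a Gaussian-blurred copy."""
    r0, r1, c0, c1 = heart_bounding_box(seg)
    r0, c0 = max(r0 - margin, 0), max(c0 - margin, 0)
    r1 = min(r1 + margin, image.shape[0] - 1)
    c1 = min(c1 + margin, image.shape[1] - 1)
    blurred = gaussian_filter(image.astype(float), sigma=sigma)
    out = image.astype(float).copy()
    out[r0:r1 + 1, c0:c1 + 1] = blurred[r0:r1 + 1, c0:c1 + 1]
    return out


# Synthetic image and segmentation standing in for one CMR slice.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
seg = np.zeros((64, 64), dtype=int)
seg[20:31, 25:41] = 1

heart_only = crop_around_heart(image, seg, size=(32, 32))
heart_removed = blur_heart(image, seg)
```
</pre>
<p class="ltx_p">Using one window size for every subject, as in the text, keeps the classifier input shape constant; in this sketch the size is passed in explicitly rather than derived from the largest heart in a dataset.</p> </div> <div class="ltx_para"> <p class="ltx_p">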
We then repeated the race classification experiments using these new images and compared the performance with that obtained on the original images.</p> </div> <figure class="ltx_figure" id="S4.F3"><img alt="Refer to caption" class="ltx_graphics ltx_img_landscape" height="195" id="S4.F3.g1" src="extracted/5774675/Images/Blurred_cropped_ims_example.png" width="393"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F3.2.1.1" style="font-size:90%;">Figure 3</span>: </span><span class="ltx_text" id="S4.F3.3.2" style="font-size:90%;">Examples of images including and excluding the heart: (a) image cropped around the heart; (b) image with the heart blurred.</span></figcaption> </figure> <div class="ltx_para" id="S4.SS2.p5"> <p class="ltx_p" id="S4.SS2.p5.1">As before, the results were averaged over 10 repeat runs with different random training and validation sets and random seeds for training. The results can be seen in <a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S4.T4" title="In 4.2 Experiment 2: Localisation of source of bias ‣ 4 Results ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_tag">Table</span> <span class="ltx_text ltx_ref_tag">4</span></a>. Cropping the images around the heart caused the accuracy to decrease by 0.405, from 0.959 to 0.554. 
Blurring the heart caused the accuracy to decrease by only 0.046, from 0.959 to 0.913.</p> </div> <figure class="ltx_table" id="S4.T4"> <p class="ltx_p ltx_align_center" id="S4.T4.2"><span class="ltx_text ltx_inline-block" id="S4.T4.2.1" style="width:433.6pt;"> <span class="ltx_inline-block ltx_transformed_outer" id="S4.T4.2.1.1" style="width:498.3pt;height:74.9pt;vertical-align:-2.9pt;"><span class="ltx_transformed_inner" style="transform:translate(0.0pt,0.0pt) scale(1,1) ;"> <span class="ltx_p" id="S4.T4.2.1.1.1"><span class="ltx_text" id="S4.T4.2.1.1.1.1"> <span class="ltx_tabular ltx_guessed_headers ltx_align_middle" id="S4.T4.2.1.1.1.1.1"> <span class="ltx_thead"> <span class="ltx_tr" id="S4.T4.2.1.1.1.1.1.1.1"> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_l ltx_border_r ltx_border_t" id="S4.T4.2.1.1.1.1.1.1.1.1"><span class="ltx_text ltx_font_bold" id="S4.T4.2.1.1.1.1.1.1.1.1.1">Image type</span></span> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_r ltx_border_t" id="S4.T4.2.1.1.1.1.1.1.1.2"><span class="ltx_text ltx_font_bold" id="S4.T4.2.1.1.1.1.1.1.1.2.1">Accuracy</span></span> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_r ltx_border_t" id="S4.T4.2.1.1.1.1.1.1.1.3"><span class="ltx_text ltx_font_bold" id="S4.T4.2.1.1.1.1.1.1.1.3.1">Sensitivity</span></span> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_r ltx_border_t" id="S4.T4.2.1.1.1.1.1.1.1.4"><span class="ltx_text ltx_font_bold" id="S4.T4.2.1.1.1.1.1.1.1.4.1">Specificity</span></span></span> </span> <span class="ltx_tbody"> <span class="ltx_tr" id="S4.T4.2.1.1.1.1.1.2.1"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T4.2.1.1.1.1.1.2.1.1">Im-Im-Im</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T4.2.1.1.1.1.1.2.1.2"><span class="ltx_text ltx_font_bold" id="S4.T4.2.1.1.1.1.1.2.1.2.1">0.959 (0.004)</span></span> <span class="ltx_td ltx_align_center 
ltx_border_r ltx_border_t" id="S4.T4.2.1.1.1.1.1.2.1.3"><span class="ltx_text ltx_font_bold" id="S4.T4.2.1.1.1.1.1.2.1.3.1">0.966 (0.013)</span></span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T4.2.1.1.1.1.1.2.1.4">0.951 (0.014)</span></span> <span class="ltx_tr" id="S4.T4.2.1.1.1.1.1.3.2"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T4.2.1.1.1.1.1.3.2.1">Images cropped around the heart</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T4.2.1.1.1.1.1.3.2.2">0.554 (0.028)</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T4.2.1.1.1.1.1.3.2.3">0.618 (0.061)</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T4.2.1.1.1.1.1.3.2.4">0.537 (0.029)</span></span> <span class="ltx_tr" id="S4.T4.2.1.1.1.1.1.4.3"> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_l ltx_border_r ltx_border_t" id="S4.T4.2.1.1.1.1.1.4.3.1">Images with heart blurred</span> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="S4.T4.2.1.1.1.1.1.4.3.2">0.913 (0.007)</span> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="S4.T4.2.1.1.1.1.1.4.3.3">0.884 (0.010)</span> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="S4.T4.2.1.1.1.1.1.4.3.4"><span class="ltx_text ltx_font_bold" id="S4.T4.2.1.1.1.1.1.4.3.4.1">0.952 (0.014)</span></span></span> </span> </span></span></span> </span></span></span></p> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table"><span class="ltx_text" id="S4.T4.3.1.1" style="font-size:90%;">Table 4</span>: </span><span class="ltx_text" id="S4.T4.4.2" style="font-size:90%;">Classification accuracy for original images, images cropped around the heart and images with the heart blurred. The results show the mean (standard deviation) over 10 repeat runs. 
The highest result for each measure is shown in bold.</span></figcaption> </figure> <div class="ltx_para" id="S4.SS2.p6"> <p class="ltx_p" id="S4.SS2.p6.1">The conclusion of these experiments is that the main source of the distributional shift between races lies outside the heart area. This agrees with the GradCAM experiments, which showed higher activations in non-heart regions, such as subcutaneous fat, and in image artefacts.</p> </div> <div class="ltx_para" id="S4.SS2.p7"> <p class="ltx_p" id="S4.SS2.p7.1">Based on this conclusion, we next investigated the impact of training segmentation models after cropping out the regions of the images which seemed to be leading to the distributional shift between races, i.e. the areas outside the heart. Specifically, the segmentation experiments performed in <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib7" title="">7</a>]</cite> (using the full CMR images) were repeated using the cropped images. These experiments trained multiple nnU-Net segmentation models using different levels of race imbalance in the training set and evaluated their performance separately for White and Black subjects. The images here were cropped to the same size as in the previous classification experiment. 
The models were trained using the same training parameters as in <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib7" title="">7</a>]</cite>.</p> </div> <figure class="ltx_figure" id="S4.F4"><img alt="Refer to caption" class="ltx_graphics ltx_img_landscape" height="560" id="S4.F4.g1" src="extracted/5774675/Images/Cropped_DSC.png" width="1108"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F4.18.9.1" style="font-size:90%;">Figure 4</span>: </span><span class="ltx_text" id="S4.F4.16.8" style="font-size:90%;"> Overall Dice similarity coefficient (DSC) for segmentation experiments using original (a) and cropped (b) CMR images. Statistical significance was tested using a Mann-Whitney U test and is denoted by **** (p <math alttext="\leq" class="ltx_Math" display="inline" id="S4.F4.9.1.m1.1"><semantics id="S4.F4.9.1.m1.1b"><mo id="S4.F4.9.1.m1.1.1" xref="S4.F4.9.1.m1.1.1.cmml">≤</mo><annotation-xml encoding="MathML-Content" id="S4.F4.9.1.m1.1c"><leq id="S4.F4.9.1.m1.1.1.cmml" xref="S4.F4.9.1.m1.1.1"></leq></annotation-xml><annotation encoding="application/x-tex" id="S4.F4.9.1.m1.1d">\leq</annotation><annotation encoding="application/x-llamapun" id="S4.F4.9.1.m1.1e">≤</annotation></semantics></math> 0.0001), *** (0.0001 <math alttext="<" class="ltx_Math" display="inline" id="S4.F4.10.2.m2.1"><semantics id="S4.F4.10.2.m2.1b"><mo id="S4.F4.10.2.m2.1.1" xref="S4.F4.10.2.m2.1.1.cmml"><</mo><annotation-xml encoding="MathML-Content" id="S4.F4.10.2.m2.1c"><lt id="S4.F4.10.2.m2.1.1.cmml" xref="S4.F4.10.2.m2.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="S4.F4.10.2.m2.1d"><</annotation><annotation encoding="application/x-llamapun" id="S4.F4.10.2.m2.1e"><</annotation></semantics></math> p <math alttext="\leq" class="ltx_Math" display="inline" id="S4.F4.11.3.m3.1"><semantics id="S4.F4.11.3.m3.1b"><mo id="S4.F4.11.3.m3.1.1" 
xref="S4.F4.11.3.m3.1.1.cmml">≤</mo><annotation-xml encoding="MathML-Content" id="S4.F4.11.3.m3.1c"><leq id="S4.F4.11.3.m3.1.1.cmml" xref="S4.F4.11.3.m3.1.1"></leq></annotation-xml><annotation encoding="application/x-tex" id="S4.F4.11.3.m3.1d">\leq</annotation><annotation encoding="application/x-llamapun" id="S4.F4.11.3.m3.1e">≤</annotation></semantics></math> 0.001), ** (0.001 <math alttext="<" class="ltx_Math" display="inline" id="S4.F4.12.4.m4.1"><semantics id="S4.F4.12.4.m4.1b"><mo id="S4.F4.12.4.m4.1.1" xref="S4.F4.12.4.m4.1.1.cmml"><</mo><annotation-xml encoding="MathML-Content" id="S4.F4.12.4.m4.1c"><lt id="S4.F4.12.4.m4.1.1.cmml" xref="S4.F4.12.4.m4.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="S4.F4.12.4.m4.1d"><</annotation><annotation encoding="application/x-llamapun" id="S4.F4.12.4.m4.1e"><</annotation></semantics></math> p <math alttext="\leq" class="ltx_Math" display="inline" id="S4.F4.13.5.m5.1"><semantics id="S4.F4.13.5.m5.1b"><mo id="S4.F4.13.5.m5.1.1" xref="S4.F4.13.5.m5.1.1.cmml">≤</mo><annotation-xml encoding="MathML-Content" id="S4.F4.13.5.m5.1c"><leq id="S4.F4.13.5.m5.1.1.cmml" xref="S4.F4.13.5.m5.1.1"></leq></annotation-xml><annotation encoding="application/x-tex" id="S4.F4.13.5.m5.1d">\leq</annotation><annotation encoding="application/x-llamapun" id="S4.F4.13.5.m5.1e">≤</annotation></semantics></math> 0.01), * (0.01 <math alttext="<" class="ltx_Math" display="inline" id="S4.F4.14.6.m6.1"><semantics id="S4.F4.14.6.m6.1b"><mo id="S4.F4.14.6.m6.1.1" xref="S4.F4.14.6.m6.1.1.cmml"><</mo><annotation-xml encoding="MathML-Content" id="S4.F4.14.6.m6.1c"><lt id="S4.F4.14.6.m6.1.1.cmml" xref="S4.F4.14.6.m6.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="S4.F4.14.6.m6.1d"><</annotation><annotation encoding="application/x-llamapun" id="S4.F4.14.6.m6.1e"><</annotation></semantics></math> p <math alttext="\leq" class="ltx_Math" display="inline" id="S4.F4.15.7.m7.1"><semantics id="S4.F4.15.7.m7.1b"><mo 
id="S4.F4.15.7.m7.1.1" xref="S4.F4.15.7.m7.1.1.cmml">≤</mo><annotation-xml encoding="MathML-Content" id="S4.F4.15.7.m7.1c"><leq id="S4.F4.15.7.m7.1.1.cmml" xref="S4.F4.15.7.m7.1.1"></leq></annotation-xml><annotation encoding="application/x-tex" id="S4.F4.15.7.m7.1d">\leq</annotation><annotation encoding="application/x-llamapun" id="S4.F4.15.7.m7.1e">≤</annotation></semantics></math> 0.05), ns (0.05 <math alttext="<" class="ltx_Math" display="inline" id="S4.F4.16.8.m8.1"><semantics id="S4.F4.16.8.m8.1b"><mo id="S4.F4.16.8.m8.1.1" xref="S4.F4.16.8.m8.1.1.cmml"><</mo><annotation-xml encoding="MathML-Content" id="S4.F4.16.8.m8.1c"><lt id="S4.F4.16.8.m8.1.1.cmml" xref="S4.F4.16.8.m8.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="S4.F4.16.8.m8.1d"><</annotation><annotation encoding="application/x-llamapun" id="S4.F4.16.8.m8.1e"><</annotation></semantics></math> p).</span></figcaption> </figure> <div class="ltx_para" id="S4.SS2.p8"> <p class="ltx_p" id="S4.SS2.p8.1"><a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S4.F4" title="In 4.2 Experiment 2: Localisation of source of bias ‣ 4 Results ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_tag">Fig.</span> <span class="ltx_text ltx_ref_tag">4</span></a> shows the results of the segmentation experiments. Comparisons of the predicted end-systolic volumes and ejection fractions for the original and cropped images can be seen in <a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#Sx2.F4" title="In 6.1 Experiment 1: source of the bias ‣ Supplementary Material ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_tag">Fig.</span> <span class="ltx_text ltx_ref_tag">S4</span></a>. Compared to using the original images, using the cropped images reduced the range of DSCs for both protected groups at each level of training set imbalance. 
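</p> </div> <div class="ltx_para"> <p class="ltx_p">The group comparison underlying such a figure can be sketched as follows. This is an illustration under stated assumptions rather than the evaluation code used in this work: it computes a binary DSC (the overall DSC reported here may aggregate several cardiac structures), compares synthetic per-subject DSC values with SciPy's two-sided Mann-Whitney U test, and maps the p-value onto the usual star thresholds (0.0001, 0.001, 0.01 and 0.05). All function names are hypothetical.</p>
<pre class="ltx_verbatim">
```python
import numpy as np
from scipy.stats import mannwhitneyu


def dice(pred, truth):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return 2.0 * np.logical_and(pred, truth).sum() / total


def significance_label(p):
    """Map a p-value onto the star notation commonly used in box plots."""
    if p <= 1e-4:
        return "****"
    if p <= 1e-3:
        return "***"
    if p <= 1e-2:
        return "**"
    if p <= 0.05:
        return "*"
    return "ns"


def compare_groups(dsc_group_a, dsc_group_b):
    """Two-sided Mann-Whitney U test between per-subject DSCs of two groups."""
    _, p = mannwhitneyu(dsc_group_a, dsc_group_b, alternative="two-sided")
    return p, significance_label(p)


# Synthetic per-subject DSC values for two groups (not data from the paper).
rng = np.random.default_rng(0)
dsc_a = np.clip(rng.normal(0.93, 0.02, size=50), 0, 1)
dsc_b = np.clip(rng.normal(0.88, 0.02, size=50), 0, 1)
p_value, label = compare_groups(dsc_a, dsc_b)
print(f"p = {p_value:.2e} ({label})")
```
</pre>
<p class="ltx_p">A rank-based test is a natural fit here because DSC values are bounded above by 1 and are often skewed, so normality cannot be assumed.</p> </div> <div class="ltx_para"> <p class="ltx_p">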
The differences between the DSCs of the two protected groups were also reduced, although the results remained significantly different. Therefore, cropping the images reduced but did not remove the difference in performance between the Black and White subjects. This is consistent with the classification experiments (<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S4.T2" title="In 4.1 Experiment 1: Source of bias ‣ 4 Results ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_tag">Table</span> <span class="ltx_text ltx_ref_tag">2</span></a>), which showed that some distributional shift was present in the heart region.</p> </div> </section> <section class="ltx_subsection" id="S4.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.3 </span>Experiment 3: Are the biases observed in AI CMR segmentation due to confounders?</h3> <div class="ltx_para" id="S4.SS3.p1"> <p class="ltx_p" id="S4.SS3.p1.1">Differences in covariates between protected groups may lead to distributional shifts and consequent bias in the AI CMR segmentation models. We investigated whether this is the case by relating the DSC to covariates of the subjects, such as their weight, height and heart rate. The data were analysed by fitting a linear regression model between the covariates and the DSC scores.</p> </div> <div class="ltx_para" id="S4.SS3.p2"> <p class="ltx_p" id="S4.SS3.p2.2"><a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#S4.T5" title="In 4.3 Experiment 3: Are the biases observed in AI CMR segmentation due to confounders? ‣ 4 Results ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_tag">Table</span> <span class="ltx_text ltx_ref_tag">5</span></a> shows the results of this analysis. 
The table shows the standardised <math alttext="\beta" class="ltx_Math" display="inline" id="S4.SS3.p2.1.m1.1"><semantics id="S4.SS3.p2.1.m1.1a"><mi id="S4.SS3.p2.1.m1.1.1" xref="S4.SS3.p2.1.m1.1.1.cmml">β</mi><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.1.m1.1b"><ci id="S4.SS3.p2.1.m1.1.1.cmml" xref="S4.SS3.p2.1.m1.1.1">𝛽</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.1.m1.1c">\beta</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.1.m1.1d">italic_β</annotation></semantics></math> coefficients and p-values for each covariate. The standardised <math alttext="\beta" class="ltx_Math" display="inline" id="S4.SS3.p2.2.m2.1"><semantics id="S4.SS3.p2.2.m2.1a"><mi id="S4.SS3.p2.2.m2.1.1" xref="S4.SS3.p2.2.m2.1.1.cmml">β</mi><annotation-xml encoding="MathML-Content" id="S4.SS3.p2.2.m2.1b"><ci id="S4.SS3.p2.2.m2.1.1.cmml" xref="S4.SS3.p2.2.m2.1.1">𝛽</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p2.2.m2.1c">\beta</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p2.2.m2.1d">italic_β</annotation></semantics></math> coefficient shows the relative effect of each covariate on the DSC score, with positive coefficients indicating a positive correlation and negative coefficients indicating a negative correlation. The year of the MRI scan was the most predictive of DSC score in the Black subjects. 
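</p> </div> <div class="ltx_para"> <p class="ltx_p">One common way to obtain standardised coefficients of this kind is to z-score the covariates and the outcome before an ordinary least-squares fit, as sketched below. The sketch assumes a single multivariable model per group with all covariates entered together, which the text does not specify; the function name <span class="ltx_text ltx_font_typewriter">standardised_regression</span> and the synthetic data are hypothetical.</p>
<pre class="ltx_verbatim">
```python
import numpy as np
from scipy import stats


def standardised_regression(X, y):
    """Standardised beta coefficients and two-sided p-values from an OLS fit.

    X: array (n_subjects, n_covariates); y: array (n_subjects,), e.g. DSC.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    # z-score covariates and outcome so coefficients are in standard-deviation units.
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    # Centred data need no intercept column.
    beta, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    # t-test of each coefficient against zero; one degree of freedom is
    # used by the (implicit) intercept removed through centring.
    dof = n - k - 1
    residuals = yz - Xz @ beta
    sigma2 = residuals @ residuals / dof
    std_err = np.sqrt(sigma2 * np.diag(np.linalg.inv(Xz.T @ Xz)))
    p_values = 2.0 * stats.t.sf(np.abs(beta / std_err), dof)
    return beta, p_values


# Synthetic example with three hypothetical covariates.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 2] + rng.normal(scale=0.5, size=200)
beta, p_values = standardised_regression(X, y)
print(np.round(beta, 3), p_values)
```
</pre>
<p class="ltx_p">With a single covariate the standardised coefficient equals the Pearson correlation and lies in [-1, 1]; when several strongly correlated covariates (for example weight, height and BMI) are entered together, individual coefficients can exceed 1 in magnitude.</p> </div> <div class="ltx_para"> <p class="ltx_p">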
For White subjects, no covariate had a p-value below 0.05.</p> </div> <figure class="ltx_table" id="S4.T5"> <div class="ltx_block ltx_pruned_first" id="S4.T5.2"> <div class="ltx_para ltx_noindent ltx_align_center" id="S4.T5.2.p2"> <p class="ltx_p" id="S4.T5.2.p2.2"><span class="ltx_text ltx_inline-block" id="S4.T5.2.p2.2.2" style="width:433.6pt;"> <span class="ltx_inline-block ltx_transformed_outer" id="S4.T5.2.p2.2.2.2.2" style="width:438.3pt;height:361pt;vertical-align:-1.0pt;"><span class="ltx_transformed_inner" style="transform:translate(0.0pt,0.0pt) scale(1,1) ;"> <span class="ltx_p" id="S4.T5.2.p2.2.2.2.2.2"><span class="ltx_text" id="S4.T5.2.p2.2.2.2.2.2.2"> <span class="ltx_tabular ltx_guessed_headers ltx_align_middle" id="S4.T5.2.p2.2.2.2.2.2.2.2"> <span class="ltx_tbody"> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.3.1"> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_l ltx_border_r ltx_border_t ltx_rowspan ltx_rowspan_3" id="S4.T5.2.p2.2.2.2.2.2.2.2.3.1.1"><span class="ltx_text" id="S4.T5.2.p2.2.2.2.2.2.2.2.3.1.1.1">Covariate</span></span> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_r ltx_border_t ltx_colspan ltx_colspan_2" id="S4.T5.2.p2.2.2.2.2.2.2.2.3.1.2">Black subjects</span> <span class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_r ltx_border_t ltx_colspan ltx_colspan_2" id="S4.T5.2.p2.2.2.2.2.2.2.2.3.1.3">White subjects</span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.4.2"> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.4.2.1">Standardised</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t ltx_rowspan ltx_rowspan_2" id="S4.T5.2.p2.2.2.2.2.2.2.2.4.2.2"><span class="ltx_text" id="S4.T5.2.p2.2.2.2.2.2.2.2.4.2.2.1">p-value</span></span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.4.2.3">Standardised</span> <span class="ltx_td ltx_align_center 
ltx_border_r ltx_border_t ltx_rowspan ltx_rowspan_2" id="S4.T5.2.p2.2.2.2.2.2.2.2.4.2.4"><span class="ltx_text" id="S4.T5.2.p2.2.2.2.2.2.2.2.4.2.4.1">p-value</span></span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.2"> <span class="ltx_td ltx_align_center ltx_border_r" id="S4.T5.2.p2.1.1.1.1.1.1.1.1.1"><math alttext="\beta" class="ltx_Math" display="inline" id="S4.T5.2.p2.1.1.1.1.1.1.1.1.1.m1.1"><semantics id="S4.T5.2.p2.1.1.1.1.1.1.1.1.1.m1.1a"><mi id="S4.T5.2.p2.1.1.1.1.1.1.1.1.1.m1.1.1" xref="S4.T5.2.p2.1.1.1.1.1.1.1.1.1.m1.1.1.cmml">β</mi><annotation-xml encoding="MathML-Content" id="S4.T5.2.p2.1.1.1.1.1.1.1.1.1.m1.1b"><ci id="S4.T5.2.p2.1.1.1.1.1.1.1.1.1.m1.1.1.cmml" xref="S4.T5.2.p2.1.1.1.1.1.1.1.1.1.m1.1.1">𝛽</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T5.2.p2.1.1.1.1.1.1.1.1.1.m1.1c">\beta</annotation><annotation encoding="application/x-llamapun" id="S4.T5.2.p2.1.1.1.1.1.1.1.1.1.m1.1d">italic_β</annotation></semantics></math> coefficient</span> <span class="ltx_td ltx_align_center ltx_border_r" id="S4.T5.2.p2.2.2.2.2.2.2.2.2.2"><math alttext="\beta" class="ltx_Math" display="inline" id="S4.T5.2.p2.2.2.2.2.2.2.2.2.2.m1.1"><semantics id="S4.T5.2.p2.2.2.2.2.2.2.2.2.2.m1.1a"><mi id="S4.T5.2.p2.2.2.2.2.2.2.2.2.2.m1.1.1" xref="S4.T5.2.p2.2.2.2.2.2.2.2.2.2.m1.1.1.cmml">β</mi><annotation-xml encoding="MathML-Content" id="S4.T5.2.p2.2.2.2.2.2.2.2.2.2.m1.1b"><ci id="S4.T5.2.p2.2.2.2.2.2.2.2.2.2.m1.1.1.cmml" xref="S4.T5.2.p2.2.2.2.2.2.2.2.2.2.m1.1.1">𝛽</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T5.2.p2.2.2.2.2.2.2.2.2.2.m1.1c">\beta</annotation><annotation encoding="application/x-llamapun" id="S4.T5.2.p2.2.2.2.2.2.2.2.2.2.m1.1d">italic_β</annotation></semantics></math> coefficient</span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.5.3"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.5.3.1">Age</span> <span class="ltx_td 
ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.5.3.2">0.060</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.5.3.3">0.718</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.5.3.4">0.056</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.5.3.5">0.754</span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.6.4"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.6.4.1">Height</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.6.4.2">0.126</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.6.4.3">0.881</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.6.4.4">0.935</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.6.4.5">0.227</span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.7.5"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.7.5.1">Weight</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.7.5.2">0.240</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.7.5.3">0.865</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.7.5.4">2.017</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.7.5.5">0.206</span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.8.6"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.8.6.1">BMI</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" 
id="S4.T5.2.p2.2.2.2.2.2.2.2.8.6.2">0.098</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.8.6.3">0.940</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.8.6.4">1.832</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.8.6.5">0.172</span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.9.7"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.9.7.1">Heart rate</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.9.7.2">0.241</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.9.7.3">0.063</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.9.7.4">0.284</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.9.7.5">0.125</span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.10.8"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.10.8.1">Systolic blood pressure</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.10.8.2">0.253</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.10.8.3">0.333</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.10.8.4">0.213</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.10.8.5">0.357</span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.11.9"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.11.9.1">Diastolic blood pressure</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" 
id="S4.T5.2.p2.2.2.2.2.2.2.2.11.9.2">0.125</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.11.9.3">0.604</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.11.9.4">0.424</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.11.9.5">0.138</span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.12.10"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.12.10.1">LVSV</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.12.10.2">0.581</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.12.10.3">0.009</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.12.10.4">0.121</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.12.10.5">0.713</span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.13.11"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.13.11.1">LVEF</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.13.11.2">0.410</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.13.11.3">0.012</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.13.11.4">0.079</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.13.11.5">0.734</span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.14.12"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.14.12.1">LVEDM</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" 
id="S4.T5.2.p2.2.2.2.2.2.2.2.14.12.2">0.453</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.14.12.3">0.057</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.14.12.4">0.141</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.14.12.5">0.688</span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.15.13"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.15.13.1">HDL cholesterol</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.15.13.2">0.321</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.15.13.3">0.015</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.15.13.4">0.002</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.15.13.5">0.990</span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.16.14"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.16.14.1">Cholesterol</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.16.14.2">0.015</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.16.14.3">0.929</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.16.14.4">0.071</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.16.14.5">0.657</span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.17.15"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.17.15.1">Diabetes</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" 
id="S4.T5.2.p2.2.2.2.2.2.2.2.17.15.2">0.236</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.17.15.3">0.155</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t ltx_colspan ltx_colspan_2" id="S4.T5.2.p2.2.2.2.2.2.2.2.17.15.4">*</span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.18.16"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.18.16.1">Hypertension</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.18.16.2">0.032</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.18.16.3">0.838</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.18.16.4">0.169</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.18.16.5">0.280</span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.19.17"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.19.17.1">Hypercholesterolemia</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.19.17.2">0.175</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.19.17.3">0.393</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.19.17.4">0.767</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.19.17.5">0.446</span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.20.18"> <span class="ltx_td ltx_align_center ltx_border_l ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.20.18.1">Smoking</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.20.18.2">0.1744</span> <span class="ltx_td ltx_align_center ltx_border_r 
ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.20.18.3">0.155</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.20.18.4">0.041</span> <span class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.20.18.5">0.776</span></span> <span class="ltx_tr" id="S4.T5.2.p2.2.2.2.2.2.2.2.21.19"> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_l ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.21.19.1">MRI year</span> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.21.19.2">0.493</span> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.21.19.3">0.001</span> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.21.19.4">0.038</span> <span class="ltx_td ltx_align_center ltx_border_b ltx_border_r ltx_border_t" id="S4.T5.2.p2.2.2.2.2.2.2.2.21.19.5">0.808</span></span> </span> </span></span></span> </span></span></span></p> </div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_block"><span class="ltx_text" id="S4.T5.2.1.1.1" style="font-size:90%;">Table 5</span>: </span><span class="ltx_text" id="S4.T5.2.2.2" style="font-size:90%;">Parameters from a linear regression model fitting DSC scores from a segmentation model trained on original, uncropped CMR images to covariates for Black and White test subjects. The DSC scores are from an evenly balanced training dataset. 
* None of the White subjects had diabetes.</span></figcaption> </div> </figure> <div class="ltx_pagination ltx_role_newpage"></div> </section> </section> <section class="ltx_section" id="S5"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">5 </span>Discussion</h2> <div class="ltx_para" id="S5.p1"> <p class="ltx_p" id="S5.p1.1">In this paper, we have shown that race can be predicted from single SAX CMR images with very high accuracy. However, the accuracy of predicting race from CMR segmentations was noticeably lower, indicating that the distributional shift between White and Black protected groups is mostly in the CMR images as opposed to the manual segmentations.</p> </div> <div class="ltx_para" id="S5.p2"> <p class="ltx_p" id="S5.p2.1">The GradCAM images showed that the classification networks had the highest activations in non-heart regions such as subcutaneous fat and image artefacts, a result that was further demonstrated by the classification experiments using a dataset with the heart “removed” from the images. The accuracy here remained high whereas we found low classification accuracy using images cropped tightly around the heart. This suggests that there are fewer race-specific features in the images of the hearts of White and Black subjects and that the distributional shift is mostly in non-heart regions. 
A similar result was found in <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib16" title="">16</a>]</cite> where occluding regions identified by saliency maps as important for race classification from chest X-ray images caused the accuracy to decrease.</p> </div> <div class="ltx_para" id="S5.p3"> <p class="ltx_p" id="S5.p3.1">When looking at segmentation tasks, the high classification accuracy of the logistic regression model showed that subjects’ races were encoded in the latent representations of the CMR images, which makes this encoding a likely cause of the bias in segmentation performance. Cropping the images in a similar fashion to the classification experiments reduced, but did not eliminate, the bias found in the segmentation experiments. We speculate that the remaining bias is due to some anatomical differences in the heart region and the fact that it was not possible to completely crop out non-heart regions in all images because of the variability in heart size and the need to maintain a constant image size for AI model training.</p> </div> <div class="ltx_para" id="S5.p4"> <p class="ltx_p" id="S5.p4.1">The covariate analysis indicated that some variables seem to be acting as confounders. LVEDM, LVSV and LVEF were correlated with DSC score for Black subjects but not for White subjects. Black and White subjects are known to have differences in body composition such as fat distribution and bone density <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib17" title="">17</a>]</cite> as well as differences in cardiac anatomy such as Black subjects having higher left ventricular mass <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib18" title="">18</a>]</cite>. 
These distributional shifts may be recognisable to a model and may be used for classification tasks and lead to bias in segmentation tasks. MRI year was also a confounder for Black subjects. We further investigated this and found that, by chance, the White subjects selected in our dataset were on average scanned in earlier years than the Black subjects. It is possible that there were differences in image artefacts over time due to small changes in the acquisition protocol, which would be consistent with the GradCAM activations focusing on artefact areas 50% of the time. Therefore, we reran the segmentation experiments (using the different levels of race imbalance as in Experiment 2) using data which were also matched by MRI year but found no difference in bias characteristics or performance (see <a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#Sx2.F5" title="In 6.1 Experiment 1: source of the bias ‣ Supplementary Material ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_tag">Fig.</span> <span class="ltx_text ltx_ref_tag">S5</span></a>). Therefore, we conclude that this was a spurious confounding effect caused by random selection of White subjects who were scanned earlier.</p> </div> <div class="ltx_para" id="S5.p5"> <p class="ltx_p" id="S5.p5.1">As a recommendation for future development of AI CMR segmentation tools, we suggest that training models using images cropped around the heart may be beneficial. However, this does raise the question of how best to crop images in this way at inference time, when ground truth segmentations are obviously not available. Region-of-interest detection methods such as Mask R-CNN <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib19" title="">19</a>]</cite> may be useful for this purpose. 
We also emphasise that such an approach should not be seen as a substitute for more equal representation in CMR datasets. Our experiments have focused on Black and White subjects but previous work <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib6" title="">6</a>]</cite> has shown that similar bias effects exist for Asian subjects and by sex <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#bib.bib7" title="">7</a>]</cite>. Therefore, we argue for greater representation of all protected groups in CMR datasets.</p> </div> </section> <section class="ltx_section" id="S6"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">6 </span>Conclusion</h2> <div class="ltx_para" id="S6.p1"> <p class="ltx_p" id="S6.p1.1">We have performed a series of experiments to investigate the cause of AI CMR segmentation bias. Our conclusions are: (i) the distributional shift between White and Black subjects is mostly, but not entirely, in the images rather than the segmentations, (ii) differences in body fat composition outside of the heart are a likely cause of the distributional shift and hence the bias, and (iii) cropping the images around the heart reduces but does not eliminate the bias. Our results will likely be valuable to researchers aiming to train fair AI CMR segmentation models in the future.</p> </div> </section> <section class="ltx_section" id="Sx1"> <h2 class="ltx_title ltx_title_section">Acknowledgements</h2> <div class="ltx_para" id="Sx1.p1"> <p class="ltx_p" id="Sx1.p1.1">This work was supported by the Engineering & Physical Sciences Research Council Doctoral Training Partnership (EPSRC DTP) grant EP/T517963/1. 
This research has been conducted using the UK Biobank Resource under Application Number 17806.</p> </div> </section> <section class="ltx_bibliography" id="bib"> <h2 class="ltx_title ltx_title_bibliography">References</h2> <ul class="ltx_biblist"> <li class="ltx_bibitem" id="bib.bib1"> <span class="ltx_tag ltx_tag_bibitem">[1]</span> <span class="ltx_bibblock"> J. Mariscal-Harana, C. Asher, V. Vergani, M. Rizvi, L. Keehn, R. J. Kim, R. M. Judd, S. E. Petersen, R. Razavi, A. P. King, B. Ruijsink, and E. Puyol-Antón, “An artificial intelligence tool for automated analysis of large-scale unstructured clinical cine cardiac magnetic resonance databases,” <span class="ltx_text ltx_font_italic" id="bib.bib1.1.1">European Heart Journal - Digital Health</span>, vol. 4, pp. 370–383, 10 2023. </span> </li> <li class="ltx_bibitem" id="bib.bib2"> <span class="ltx_tag ltx_tag_bibitem">[2]</span> <span class="ltx_bibblock"> R. H. Davies, J. B. Augusto, A. Bhuva, H. Xue, T. A. Treibel, Y. Ye, R. K. Hughes, W. Bai, C. Lau, H. Shiwani, M. Fontana, R. Kozor, A. Herrey, L. R. Lopes, V. Maestrini, S. Rosmini, S. E. Petersen, P. Kellman, D. Rueckert, J. P. Greenwood, G. Captur, C. Manisty, E. Schelbert, and J. C. Moon, “Precision measurement of cardiac structure and function in cardiovascular magnetic resonance using machine learning,” <span class="ltx_text ltx_font_italic" id="bib.bib2.1.1">Journal of Cardiovascular Magnetic Resonance</span>, vol. 24, p. 16, 1 2022. </span> </li> <li class="ltx_bibitem" id="bib.bib3"> <span class="ltx_tag ltx_tag_bibitem">[3]</span> <span class="ltx_bibblock"> B. Ruijsink, E. Puyol-Antón, I. Oksuz, M. Sinclair, W. Bai, J. A. Schnabel, R. Razavi, and A. P. King, “Fully Automated, Quality-Controlled Cardiac Analysis From CMR: Validation and Large-Scale Application to Characterize Cardiac Function,” <span class="ltx_text ltx_font_italic" id="bib.bib3.1.1">JACC: Cardiovascular Imaging</span>, vol. 13, pp. 684–695, 3 2020. 
</span> </li> <li class="ltx_bibitem" id="bib.bib4"> <span class="ltx_tag ltx_tag_bibitem">[4]</span> <span class="ltx_bibblock"> E. Puyol-Antón, B. Ruijsink, S. K. Piechnik, S. Neubauer, S. E. Petersen, R. Razavi, and A. P. King, “Fairness in Cardiac MR Image Analysis: An Investigation of Bias Due to Data Imbalance in Deep Learning Based Segmentation,” in <span class="ltx_text ltx_font_italic" id="bib.bib4.1.1">Medical Image Computing and Computer Assisted Intervention – MICCAI 2021</span>, vol. 12903 LNCS, pp. 413–423, Springer International Publishing, 2021. </span> </li> <li class="ltx_bibitem" id="bib.bib5"> <span class="ltx_tag ltx_tag_bibitem">[5]</span> <span class="ltx_bibblock"> E. Puyol-Antón, B. Ruijsink, J. Mariscal Harana, S. K. Piechnik, S. Neubauer, S. E. Petersen, R. Razavi, P. Chowienczyk, and A. P. King, “Fairness in Cardiac Magnetic Resonance Imaging: Assessing Sex and Racial Bias in Deep Learning-Based Segmentation,” <span class="ltx_text ltx_font_italic" id="bib.bib5.1.1">Frontiers in Cardiovascular Medicine</span>, vol. 0, p. 664, 4 2022. </span> </li> <li class="ltx_bibitem" id="bib.bib6"> <span class="ltx_tag ltx_tag_bibitem">[6]</span> <span class="ltx_bibblock"> T. Lee, E. Puyol-Antón, B. Ruijsink, M. Shi, and A. P. King, “A Systematic Study of Race and Sex Bias in CNN-Based Cardiac MR Segmentation,” in <span class="ltx_text ltx_font_italic" id="bib.bib6.1.1">Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)</span>, vol. 13593 LNCS, pp. 233–244, Springer Science and Business Media Deutschland GmbH, 2022. </span> </li> <li class="ltx_bibitem" id="bib.bib7"> <span class="ltx_tag ltx_tag_bibitem">[7]</span> <span class="ltx_bibblock"> T. Lee, E. Puyol-Antón, B. Ruijsink, K. Aitcheson, M. Shi, and A. P. 
King, “An Investigation into the Impact of Deep Learning Model Choice on Sex and Race Bias in Cardiac MR Segmentation,” <span class="ltx_text ltx_font_italic" id="bib.bib7.1.1">Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)</span>, vol. 14242 LNCS, pp. 215–224, 2023. </span> </li> <li class="ltx_bibitem" id="bib.bib8"> <span class="ltx_tag ltx_tag_bibitem">[8]</span> <span class="ltx_bibblock"> S. E. Petersen, P. M. Matthews, J. M. Francis, M. D. Robson, F. Zemrak, R. Boubertakh, A. A. Young, S. Hudson, P. Weale, S. Garratt, R. Collins, S. Piechnik, and S. Neubauer, “UK Biobank’s cardiovascular magnetic resonance protocol,” <span class="ltx_text ltx_font_italic" id="bib.bib8.1.1">Journal of Cardiovascular Magnetic Resonance</span>, vol. 18, pp. 1–7, 2 2016. </span> </li> <li class="ltx_bibitem" id="bib.bib9"> <span class="ltx_tag ltx_tag_bibitem">[9]</span> <span class="ltx_bibblock"> S. E. Petersen, N. Aung, M. M. Sanghvi, F. Zemrak, K. Fung, J. M. Paiva, J. M. Francis, M. Y. Khanji, E. Lukaschuk, A. M. Lee, V. Carapella, Y. J. Kim, P. Leeson, S. K. Piechnik, and S. Neubauer, “Reference ranges for cardiac structure and function using cardiovascular magnetic resonance (CMR) in Caucasians from the UK Biobank population cohort,” <span class="ltx_text ltx_font_italic" id="bib.bib9.1.1">Journal of Cardiovascular Magnetic Resonance</span>, vol. 19, p. 18, 12 2016. </span> </li> <li class="ltx_bibitem" id="bib.bib10"> <span class="ltx_tag ltx_tag_bibitem">[10]</span> <span class="ltx_bibblock"> K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in <span class="ltx_text ltx_font_italic" id="bib.bib10.1.1">Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition</span>, vol. 2016-December, 2016. 
</span> </li> <li class="ltx_bibitem" id="bib.bib11"> <span class="ltx_tag ltx_tag_bibitem">[11]</span> <span class="ltx_bibblock"> R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, “Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization.” </span> </li> <li class="ltx_bibitem" id="bib.bib12"> <span class="ltx_tag ltx_tag_bibitem">[12]</span> <span class="ltx_bibblock"> F. Isensee, P. F. Jaeger, S. A. Kohl, J. Petersen, and K. H. Maier-Hein, “nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation,” <span class="ltx_text ltx_font_italic" id="bib.bib12.1.1">Nature Methods</span>, vol. 18, pp. 203–211, 12 2020. </span> </li> <li class="ltx_bibitem" id="bib.bib13"> <span class="ltx_tag ltx_tag_bibitem">[13]</span> <span class="ltx_bibblock"> A. J. Larrazabal, N. Nieto, V. Peterson, D. H. Milone, and E. Ferrante, “Gender imbalance in medical imaging datasets produces biased classifiers for computer-aided diagnosis,” <span class="ltx_text ltx_font_italic" id="bib.bib13.1.1">Proceedings of the National Academy of Sciences of the United States of America</span>, vol. 117, no. 23, pp. 12592–12594, 2020. </span> </li> <li class="ltx_bibitem" id="bib.bib14"> <span class="ltx_tag ltx_tag_bibitem">[14]</span> <span class="ltx_bibblock"> B. Glocker, C. Jones, M. Bernhardt, and S. Winzeck, “Algorithmic encoding of protected characteristics in chest X-ray disease detection models,” <span class="ltx_text ltx_font_italic" id="bib.bib14.1.1">EBioMedicine</span>, vol. 89, 3 2023. </span> </li> <li class="ltx_bibitem" id="bib.bib15"> <span class="ltx_tag ltx_tag_bibitem">[15]</span> <span class="ltx_bibblock"> K. Alfudhili, P. G. Masci, J. Delacoste, J. B. Ledoux, G. Berchier, V. Dunet, S. D. Qanadli, J. Schwitter, and C. Beigelman-Aubry, “Current artefacts in cardiac and chest magnetic resonance imaging: Tips and tricks,” 2016. 
</span> </li> <li class="ltx_bibitem" id="bib.bib16"> <span class="ltx_tag ltx_tag_bibitem">[16]</span> <span class="ltx_bibblock"> J. W. Gichoya, I. Banerjee, A. R. Bhimireddy, J. L. Burns, L. A. Celi, L.-C. Chen, R. Correa, N. Dullerud, M. Ghassemi, S.-C. Huang, P.-C. Kuo, M. P. Lungren, L. J. Palmer, B. J. Price, S. Purkayastha, A. T. Pyrros, L. Oakden-Rayner, C. Okechukwu, L. Seyyed-Kalantari, H. Trivedi, R. Wang, Z. Zaiman, and H. Zhang, “AI recognition of patient race in medical imaging: a modelling study,” <span class="ltx_text ltx_font_italic" id="bib.bib16.1.1">The Lancet Digital Health</span>, vol. 7500, no. 22, 2022. </span> </li> <li class="ltx_bibitem" id="bib.bib17"> <span class="ltx_tag ltx_tag_bibitem">[17]</span> <span class="ltx_bibblock"> D. R. Wagner and V. H. Heyward, “Measures of body composition in blacks and whites: a comparative review,” <span class="ltx_text ltx_font_italic" id="bib.bib17.1.1">The American Journal of Clinical Nutrition</span>, vol. 71, pp. 1392–1402, 6 2000. </span> </li> <li class="ltx_bibitem" id="bib.bib18"> <span class="ltx_tag ltx_tag_bibitem">[18]</span> <span class="ltx_bibblock"> E. Nardi, G. Mulè, C. Nardi, and M. Averna, “Differences in Cardiac Structure and Function Between Black and White Patients: Another Step in the Evaluation of Cardiovascular Risk in Chronic Kidney Disease,” <span class="ltx_text ltx_font_italic" id="bib.bib18.1.1">American Journal of Hypertension</span>, vol. 30, no. 8, p. 770, 2017. </span> </li> <li class="ltx_bibitem" id="bib.bib19"> <span class="ltx_tag ltx_tag_bibitem">[19]</span> <span class="ltx_bibblock"> K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” <span class="ltx_text ltx_font_italic" id="bib.bib19.1.1">IEEE Transactions on Pattern Analysis and Machine Intelligence</span>, vol. 42, pp. 386–397, 3 2017. 
</span> </li> </ul> </section> <div class="ltx_pagination ltx_role_newpage"></div> <div class="ltx_pagination ltx_role_newpage"></div> <section class="ltx_section" id="Sx2"> <h2 class="ltx_title ltx_title_section">Supplementary Material</h2> <section class="ltx_subsection" id="Sx2.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">6.1 </span>Experiment 1: source of the bias</h3> <div class="ltx_para" id="Sx2.SS1.p1"> <ol class="ltx_enumerate" id="Sx2.I1"> <li class="ltx_item" id="Sx2.I1.i1" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">1.</span> <div class="ltx_para" id="Sx2.I1.i1.p1"> <p class="ltx_p" id="Sx2.I1.i1.p1.1">For protected attribute classification, all datasets were trained using a model which was pre-trained on images from the ImageNet1k dataset, apart from the Seg-Seg-Seg dataset which was trained from scratch using randomised weights.</p> </div> </li> <li class="ltx_item" id="Sx2.I1.i2" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">2.</span> <div class="ltx_para" id="Sx2.I1.i2.p1"> <p class="ltx_p" id="Sx2.I1.i2.p1.1">A visual representation of the decision boundary used for the classification of latent space representations of images can be seen in <a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#Sx2.F1" title="In 6.1 Experiment 1: source of the bias ‣ Supplementary Material ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_tag">Fig.</span> <span class="ltx_text ltx_ref_tag">S1</span></a>. 
</p> </div> <section class="ltx_subsection" id="Sx2.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">6.2 </span>Experiment 2: localisation of the source of the bias</h3> </section> </li> <li class="ltx_item" id="Sx2.I1.i3" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">3.</span> <div class="ltx_para" id="Sx2.I1.i3.p1"> <p class="ltx_p" id="Sx2.I1.i3.p1.1">Before plotting, all GradCAM heatmaps were smoothed using a Gaussian blur with kernel size (3,3) and a standard deviation sampled from a uniform distribution between 1 and 2, with these settings chosen by visual inspection. Further examples of GradCAM images for the Im-Im-Im dataset can be seen in <a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#Sx2.F2" title="In 6.1 Experiment 1: source of the bias ‣ Supplementary Material ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_tag">Fig.</span> <span class="ltx_text ltx_ref_tag">S2</span></a> and <a class="ltx_ref" href="https://arxiv.org/html/2408.02462v1#Sx2.F3" title="In 6.1 Experiment 1: source of the bias ‣ Supplementary Material ‣ An investigation into the causes of race bias in AI-based cine CMR segmentation"><span class="ltx_text ltx_ref_tag">Fig.</span> <span class="ltx_text ltx_ref_tag">S3</span></a>.</p> </div> </li> </ol> </div> <figure class="ltx_figure" id="Sx2.F1"><img alt="Refer to caption" class="ltx_graphics ltx_img_square" height="538" id="Sx2.F1.g1" src="extracted/5774675/Images/latent_space_classification.png" width="510"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="Sx2.F1.2.1.1" style="font-size:90%;">Figure S1</span>: </span><span 
class="ltx_text" id="Sx2.F1.3.2" style="font-size:90%;">Component 1 and 2 of PCA on latent space representations of CMR images from nnU-Net.</span></figcaption> </figure> <figure class="ltx_figure" id="Sx2.F2"><img alt="Refer to caption" class="ltx_graphics ltx_img_portrait" height="810" id="Sx2.F2.g1" src="extracted/5774675/Images/gradcam_white_ex.png" width="432"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="Sx2.F2.2.1.1" style="font-size:90%;">Figure S2</span>: </span><span class="ltx_text" id="Sx2.F2.3.2" style="font-size:90%;">Examples of normalised CMR images and GradCAM heatmaps for Im-Im-Im dataset for the White subjects in race classification experiments</span></figcaption> </figure> <figure class="ltx_figure" id="Sx2.F3"><img alt="Refer to caption" class="ltx_graphics ltx_img_portrait" height="831" id="Sx2.F3.g1" src="extracted/5774675/Images/gradcam_black_ex.png" width="432"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="Sx2.F3.2.1.1" style="font-size:90%;">Figure S3</span>: </span><span class="ltx_text" id="Sx2.F3.3.2" style="font-size:90%;">Examples of normalised CMR images and GradCAM heatmaps for Im-Im-Im dataset for the Black subjects in race classification experiments</span></figcaption> </figure> <figure class="ltx_figure" id="Sx2.F4"><img alt="Refer to caption" class="ltx_graphics ltx_img_portrait" height="707" id="Sx2.F4.g1" src="extracted/5774675/Images/cardiac_function_measures.png" width="471"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="Sx2.F4.18.9.1" style="font-size:90%;">Figure S4</span>: </span><span class="ltx_text" id="Sx2.F4.16.8" style="font-size:90%;">Comparison of the prediction of right ventricular ejection fraction (RVEF), right ventricular end systolic volume (RVESV), left ventricular ejection fraction (LVEF) and left ventricular end 
systolic volume (LVESV) for the nnU-Net model using original images (left column) and cropped images (right column). Statistical significance was tested using a Mann-Whitney U test and is denoted by **** (p <math alttext="\leq" class="ltx_Math" display="inline" id="Sx2.F4.9.1.m1.1"><semantics id="Sx2.F4.9.1.m1.1b"><mo id="Sx2.F4.9.1.m1.1.1" xref="Sx2.F4.9.1.m1.1.1.cmml">≤</mo><annotation-xml encoding="MathML-Content" id="Sx2.F4.9.1.m1.1c"><leq id="Sx2.F4.9.1.m1.1.1.cmml" xref="Sx2.F4.9.1.m1.1.1"></leq></annotation-xml><annotation encoding="application/x-tex" id="Sx2.F4.9.1.m1.1d">\leq</annotation><annotation encoding="application/x-llamapun" id="Sx2.F4.9.1.m1.1e">≤</annotation></semantics></math> 0.0001), *** (0.0001 <math alttext="<" class="ltx_Math" display="inline" id="Sx2.F4.10.2.m2.1"><semantics id="Sx2.F4.10.2.m2.1b"><mo id="Sx2.F4.10.2.m2.1.1" xref="Sx2.F4.10.2.m2.1.1.cmml"><</mo><annotation-xml encoding="MathML-Content" id="Sx2.F4.10.2.m2.1c"><lt id="Sx2.F4.10.2.m2.1.1.cmml" xref="Sx2.F4.10.2.m2.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="Sx2.F4.10.2.m2.1d"><</annotation><annotation encoding="application/x-llamapun" id="Sx2.F4.10.2.m2.1e"><</annotation></semantics></math> p <math alttext="\leq" class="ltx_Math" display="inline" id="Sx2.F4.11.3.m3.1"><semantics id="Sx2.F4.11.3.m3.1b"><mo id="Sx2.F4.11.3.m3.1.1" xref="Sx2.F4.11.3.m3.1.1.cmml">≤</mo><annotation-xml encoding="MathML-Content" id="Sx2.F4.11.3.m3.1c"><leq id="Sx2.F4.11.3.m3.1.1.cmml" xref="Sx2.F4.11.3.m3.1.1"></leq></annotation-xml><annotation encoding="application/x-tex" id="Sx2.F4.11.3.m3.1d">\leq</annotation><annotation encoding="application/x-llamapun" id="Sx2.F4.11.3.m3.1e">≤</annotation></semantics></math> 0.001), ** (0.001 <math alttext="<" class="ltx_Math" display="inline" id="Sx2.F4.12.4.m4.1"><semantics id="Sx2.F4.12.4.m4.1b"><mo id="Sx2.F4.12.4.m4.1.1" xref="Sx2.F4.12.4.m4.1.1.cmml"><</mo><annotation-xml encoding="MathML-Content" id="Sx2.F4.12.4.m4.1c"><lt 
id="Sx2.F4.12.4.m4.1.1.cmml" xref="Sx2.F4.12.4.m4.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="Sx2.F4.12.4.m4.1d"><</annotation><annotation encoding="application/x-llamapun" id="Sx2.F4.12.4.m4.1e"><</annotation></semantics></math> p <math alttext="\leq" class="ltx_Math" display="inline" id="Sx2.F4.13.5.m5.1"><semantics id="Sx2.F4.13.5.m5.1b"><mo id="Sx2.F4.13.5.m5.1.1" xref="Sx2.F4.13.5.m5.1.1.cmml">≤</mo><annotation-xml encoding="MathML-Content" id="Sx2.F4.13.5.m5.1c"><leq id="Sx2.F4.13.5.m5.1.1.cmml" xref="Sx2.F4.13.5.m5.1.1"></leq></annotation-xml><annotation encoding="application/x-tex" id="Sx2.F4.13.5.m5.1d">\leq</annotation><annotation encoding="application/x-llamapun" id="Sx2.F4.13.5.m5.1e">≤</annotation></semantics></math> 0.01), * (0.01 <math alttext="<" class="ltx_Math" display="inline" id="Sx2.F4.14.6.m6.1"><semantics id="Sx2.F4.14.6.m6.1b"><mo id="Sx2.F4.14.6.m6.1.1" xref="Sx2.F4.14.6.m6.1.1.cmml"><</mo><annotation-xml encoding="MathML-Content" id="Sx2.F4.14.6.m6.1c"><lt id="Sx2.F4.14.6.m6.1.1.cmml" xref="Sx2.F4.14.6.m6.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="Sx2.F4.14.6.m6.1d"><</annotation><annotation encoding="application/x-llamapun" id="Sx2.F4.14.6.m6.1e"><</annotation></semantics></math> p <math alttext="\leq" class="ltx_Math" display="inline" id="Sx2.F4.15.7.m7.1"><semantics id="Sx2.F4.15.7.m7.1b"><mo id="Sx2.F4.15.7.m7.1.1" xref="Sx2.F4.15.7.m7.1.1.cmml">≤</mo><annotation-xml encoding="MathML-Content" id="Sx2.F4.15.7.m7.1c"><leq id="Sx2.F4.15.7.m7.1.1.cmml" xref="Sx2.F4.15.7.m7.1.1"></leq></annotation-xml><annotation encoding="application/x-tex" id="Sx2.F4.15.7.m7.1d">\leq</annotation><annotation encoding="application/x-llamapun" id="Sx2.F4.15.7.m7.1e">≤</annotation></semantics></math> 0.05), ns (0.05 <math alttext="\leq" class="ltx_Math" display="inline" id="Sx2.F4.16.8.m8.1"><semantics id="Sx2.F4.16.8.m8.1b"><mo id="Sx2.F4.16.8.m8.1.1" 
xref="Sx2.F4.16.8.m8.1.1.cmml">≤</mo><annotation-xml encoding="MathML-Content" id="Sx2.F4.16.8.m8.1c"><leq id="Sx2.F4.16.8.m8.1.1.cmml" xref="Sx2.F4.16.8.m8.1.1"></leq></annotation-xml><annotation encoding="application/x-tex" id="Sx2.F4.16.8.m8.1d">\leq</annotation><annotation encoding="application/x-llamapun" id="Sx2.F4.16.8.m8.1e">≤</annotation></semantics></math> p). </span></figcaption> </figure> <figure class="ltx_figure" id="Sx2.F5"><img alt="Refer to caption" class="ltx_graphics ltx_img_square" height="350" id="Sx2.F5.g1" src="extracted/5774675/Images/mri_dsc.png" width="432"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="Sx2.F5.18.9.1" style="font-size:90%;">Figure S5</span>: </span><span class="ltx_text" id="Sx2.F5.16.8" style="font-size:90%;">Overall Dice similarity coefficient (DSC) for the segmentation experiment using the original CMR images. Subjects are controlled for age and MRI year. Statistical significance was tested using a Mann-Whitney U test and is denoted by **** (p <math alttext="\leq" class="ltx_Math" display="inline" id="Sx2.F5.9.1.m1.1"><semantics id="Sx2.F5.9.1.m1.1b"><mo id="Sx2.F5.9.1.m1.1.1" xref="Sx2.F5.9.1.m1.1.1.cmml">≤</mo><annotation-xml encoding="MathML-Content" id="Sx2.F5.9.1.m1.1c"><leq id="Sx2.F5.9.1.m1.1.1.cmml" xref="Sx2.F5.9.1.m1.1.1"></leq></annotation-xml><annotation encoding="application/x-tex" id="Sx2.F5.9.1.m1.1d">\leq</annotation><annotation encoding="application/x-llamapun" id="Sx2.F5.9.1.m1.1e">≤</annotation></semantics></math> 0.0001), *** (0.0001 <math alttext="<" class="ltx_Math" display="inline" id="Sx2.F5.10.2.m2.1"><semantics id="Sx2.F5.10.2.m2.1b"><mo id="Sx2.F5.10.2.m2.1.1" xref="Sx2.F5.10.2.m2.1.1.cmml"><</mo><annotation-xml encoding="MathML-Content" id="Sx2.F5.10.2.m2.1c"><lt id="Sx2.F5.10.2.m2.1.1.cmml" xref="Sx2.F5.10.2.m2.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="Sx2.F5.10.2.m2.1d"><</annotation><annotation 
encoding="application/x-llamapun" id="Sx2.F5.10.2.m2.1e"><</annotation></semantics></math> p <math alttext="\leq" class="ltx_Math" display="inline" id="Sx2.F5.11.3.m3.1"><semantics id="Sx2.F5.11.3.m3.1b"><mo id="Sx2.F5.11.3.m3.1.1" xref="Sx2.F5.11.3.m3.1.1.cmml">≤</mo><annotation-xml encoding="MathML-Content" id="Sx2.F5.11.3.m3.1c"><leq id="Sx2.F5.11.3.m3.1.1.cmml" xref="Sx2.F5.11.3.m3.1.1"></leq></annotation-xml><annotation encoding="application/x-tex" id="Sx2.F5.11.3.m3.1d">\leq</annotation><annotation encoding="application/x-llamapun" id="Sx2.F5.11.3.m3.1e">≤</annotation></semantics></math> 0.001), ** (0.001 <math alttext="<" class="ltx_Math" display="inline" id="Sx2.F5.12.4.m4.1"><semantics id="Sx2.F5.12.4.m4.1b"><mo id="Sx2.F5.12.4.m4.1.1" xref="Sx2.F5.12.4.m4.1.1.cmml"><</mo><annotation-xml encoding="MathML-Content" id="Sx2.F5.12.4.m4.1c"><lt id="Sx2.F5.12.4.m4.1.1.cmml" xref="Sx2.F5.12.4.m4.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="Sx2.F5.12.4.m4.1d"><</annotation><annotation encoding="application/x-llamapun" id="Sx2.F5.12.4.m4.1e"><</annotation></semantics></math> p <math alttext="\leq" class="ltx_Math" display="inline" id="Sx2.F5.13.5.m5.1"><semantics id="Sx2.F5.13.5.m5.1b"><mo id="Sx2.F5.13.5.m5.1.1" xref="Sx2.F5.13.5.m5.1.1.cmml">≤</mo><annotation-xml encoding="MathML-Content" id="Sx2.F5.13.5.m5.1c"><leq id="Sx2.F5.13.5.m5.1.1.cmml" xref="Sx2.F5.13.5.m5.1.1"></leq></annotation-xml><annotation encoding="application/x-tex" id="Sx2.F5.13.5.m5.1d">\leq</annotation><annotation encoding="application/x-llamapun" id="Sx2.F5.13.5.m5.1e">≤</annotation></semantics></math> 0.01), * (0.01 <math alttext="<" class="ltx_Math" display="inline" id="Sx2.F5.14.6.m6.1"><semantics id="Sx2.F5.14.6.m6.1b"><mo id="Sx2.F5.14.6.m6.1.1" xref="Sx2.F5.14.6.m6.1.1.cmml"><</mo><annotation-xml encoding="MathML-Content" id="Sx2.F5.14.6.m6.1c"><lt id="Sx2.F5.14.6.m6.1.1.cmml" xref="Sx2.F5.14.6.m6.1.1"></lt></annotation-xml><annotation 
encoding="application/x-tex" id="Sx2.F5.14.6.m6.1d"><</annotation><annotation encoding="application/x-llamapun" id="Sx2.F5.14.6.m6.1e"><</annotation></semantics></math> p <math alttext="\leq" class="ltx_Math" display="inline" id="Sx2.F5.15.7.m7.1"><semantics id="Sx2.F5.15.7.m7.1b"><mo id="Sx2.F5.15.7.m7.1.1" xref="Sx2.F5.15.7.m7.1.1.cmml">≤</mo><annotation-xml encoding="MathML-Content" id="Sx2.F5.15.7.m7.1c"><leq id="Sx2.F5.15.7.m7.1.1.cmml" xref="Sx2.F5.15.7.m7.1.1"></leq></annotation-xml><annotation encoding="application/x-tex" id="Sx2.F5.15.7.m7.1d">\leq</annotation><annotation encoding="application/x-llamapun" id="Sx2.F5.15.7.m7.1e">≤</annotation></semantics></math> 0.05), ns (0.05 <math alttext="\leq" class="ltx_Math" display="inline" id="Sx2.F5.16.8.m8.1"><semantics id="Sx2.F5.16.8.m8.1b"><mo id="Sx2.F5.16.8.m8.1.1" xref="Sx2.F5.16.8.m8.1.1.cmml">≤</mo><annotation-xml encoding="MathML-Content" id="Sx2.F5.16.8.m8.1c"><leq id="Sx2.F5.16.8.m8.1.1.cmml" xref="Sx2.F5.16.8.m8.1.1"></leq></annotation-xml><annotation encoding="application/x-tex" id="Sx2.F5.16.8.m8.1d">\leq</annotation><annotation encoding="application/x-llamapun" id="Sx2.F5.16.8.m8.1e">≤</annotation></semantics></math> p).</span></figcaption> </figure> <div class="ltx_pagination ltx_role_newpage"></div> </section> </section> </article> </div> <footer class="ltx_page_footer"> <div class="ltx_page_logo">Generated on Mon Aug 5 13:30:25 2024 by <a class="ltx_LaTeXML_logo" href="http://dlmf.nist.gov/LaTeXML/"><span style="letter-spacing:-0.2em; margin-right:0.1em;">L<span class="ltx_font_smallcaps" style="position:relative; bottom:2.2pt;">a</span>T<span class="ltx_font_smallcaps" style="font-size:120%;position:relative; bottom:-0.2ex;">e</span></span><span style="font-size:90%; position:relative; bottom:-0.2ex;">XML</span></a> </div></footer> </div> </body> </html>