<!DOCTYPE html> <html lang="en-us"> <head> <meta charset="UTF-8"> <meta name="viewport" content="width=device-width, initial-scale=1"> <meta http-equiv="X-UA-Compatible" content="IE=edge"> <meta name="generator" content="Hugo 0.81.0 with theme Tranquilpeak 0.4.3-SNAPSHOT"> <meta name="author" content="Citizen Statistician"> <meta name="keywords" content=""> <meta name="description" content="For the last year, I’ve watched just about every COVID-19 briefing by the Scottish Government, most of which are delivered by First Minister Nicola Sturgeon. Earlier on in the pandemic these were daily updates, lately it seems like once a week. The more often they happen, the worse you know things are going… If I’ve chatted with you about COVID, you have probably heard me say that I am very impressed by the way she delivers these updates."> <meta property="og:description" content="For the last year, I’ve watched just about every COVID-19 briefing by the Scottish Government, most of which are delivered by First Minister Nicola Sturgeon. Earlier on in the pandemic these were daily updates, lately it seems like once a week. The more often they happen, the worse you know things are going… If I’ve chatted with you about COVID, you have probably heard me say that I am very impressed by the way she delivers these updates."> <meta property="og:type" content="article"> <meta property="og:title" content="That's what Nicola said"> <meta name="twitter:title" content="That's what Nicola said"> <meta property="og:url" content="http://www.citizen-statistician.org/2021/04/that-s-what-nicola-said/"> <meta property="twitter:url" content="http://www.citizen-statistician.org/2021/04/that-s-what-nicola-said/"> <meta property="og:site_name" content="Citizen Statistician"> <meta property="og:description" content="For the last year, I’ve watched just about every COVID-19 briefing by the Scottish Government, most of which are delivered by First Minister Nicola Sturgeon. 
Earlier on in the pandemic these were daily updates, lately it seems like once a week. The more often they happen, the worse you know things are going… If I’ve chatted with you about COVID, you have probably heard me say that I am very impressed by the way she delivers these updates."> <meta name="twitter:description" content="For the last year, I’ve watched just about every COVID-19 briefing by the Scottish Government, most of which are delivered by First Minister Nicola Sturgeon. Earlier on in the pandemic these were daily updates, lately it seems like once a week. The more often they happen, the worse you know things are going… If I’ve chatted with you about COVID, you have probably heard me say that I am very impressed by the way she delivers these updates."> <meta property="og:locale" content="en-us"> <meta property="article:published_time" content="2021-04-21T00:00:00"> <meta property="article:modified_time" content="2021-04-21T23:35:56"> <meta property="article:section" content="musings"> <meta property="article:tag" content="text analysis"> <meta property="article:tag" content="data science"> <meta name="twitter:card" content="summary"> <meta name="twitter:site" content="@citizenstat"> <meta name="twitter:creator" content="@citizenstat"> <meta property="og:image" content="http://www.citizen-statistician.org/img/logo.png"> <meta property="twitter:image" content="http://www.citizen-statistician.org/img/logo.png"> <title>That's what Nicola said</title> <link rel="icon" href="img/logo.png"> <link rel="canonical" href="http://www.citizen-statistician.org/2021/04/that-s-what-nicola-said/"> <link rel="stylesheet" href="https://use.fontawesome.com/releases/v5.5.0/css/all.css" integrity="sha384-B4dIYHKNBt8Bc12p+WXckhzcICo0wtJAoU8YZTY5qE0Id1GSseTk6S+L3BlXeVIU" crossorigin="anonymous"> <link rel="stylesheet" href="https://cdn.rawgit.com/jpswalsh/academicons/master/css/academicons.min.css"> <link rel="stylesheet" 
href="https://cdnjs.cloudflare.com/ajax/libs/fancybox/2.1.4/jquery.fancybox.min.css" integrity="sha256-vuXZ9LGmmwtjqFX1F+EKin1ThZMub58gKULUyf0qECk=" crossorigin="anonymous" /> <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/fancybox/2.1.4/helpers/jquery.fancybox-thumbs.min.css" integrity="sha256-SEa4XYAHihTcEP1f5gARTB2K26Uk8PsndQYHQC1f4jU=" crossorigin="anonymous" /> <link rel="stylesheet" href="http://www.citizen-statistician.org/css/style-nnm2spxvve8onlujjlegkkytaehyadd4ksxc1hyzzq9a2wvtrgbljqyulomn.min.css" /> </head> <body> <div id="blog"> <header id="header" data-behavior="5"> <i id="btn-open-sidebar" class="fas fa-lg fa-bars"></i> <div class="header-title"> <a class="header-title-link" href="http://www.citizen-statistician.org/">Citizen Statistician</a> </div> <a class="header-right-picture " href="http://www.citizen-statistician.org/#about"> <img class="header-picture" src="http://www.citizen-statistician.org/img/logo.png" alt="Author's picture" /> </a> </header> <nav id="sidebar" data-behavior="5"> <div class="sidebar-container"> <div class="sidebar-profile"> <a href="http://www.citizen-statistician.org/#about"> <img class="sidebar-profile-picture" src="http://www.citizen-statistician.org/img/logo.png" alt="Author's picture" /> </a> <h4 class="sidebar-profile-name">Citizen Statistician</h4> <h5 class="sidebar-profile-bio">Learning to swim in the data deluge</h5> </div> <ul class="sidebar-buttons"> <li class="sidebar-button"> <a class="sidebar-button-link " href="http://www.citizen-statistician.org/"> <i class="sidebar-button-icon fas fa-lg fa-home"></i> <span class="sidebar-button-desc">Home</span> </a> </li> <li class="sidebar-button"> <a class="sidebar-button-link " href="http://www.citizen-statistician.org/categories"> <i class="sidebar-button-icon fas fa-lg fa-bookmark"></i> <span class="sidebar-button-desc">Categories</span> </a> </li> <li class="sidebar-button"> <a class="sidebar-button-link " 
href="http://www.citizen-statistician.org/tags"> <i class="sidebar-button-icon fas fa-lg fa-tags"></i> <span class="sidebar-button-desc">Tags</span> </a> </li> <li class="sidebar-button"> <a class="sidebar-button-link " href="http://www.citizen-statistician.org/archives"> <i class="sidebar-button-icon fas fa-lg fa-archive"></i> <span class="sidebar-button-desc">Archives</span> </a> </li> <li class="sidebar-button"> <a class="sidebar-button-link " href="http://www.citizen-statistician.org/about"> <i class="sidebar-button-icon fas fa-lg fa-users"></i> <span class="sidebar-button-desc">About</span> </a> </li> </ul> <ul class="sidebar-buttons"> <li class="sidebar-button"> <a class="sidebar-button-link " href="https://github.com/mine-cetinkaya-rundel/citizenstatistician" target="_blank" rel="noopener"> <i class="sidebar-button-icon fab fa-lg fa-github"></i> <span class="sidebar-button-desc">GitHub</span> </a> </li> </ul> <ul class="sidebar-buttons"> <li class="sidebar-button"> <a class="sidebar-button-link " href="http://www.citizen-statistician.org/post/index.xml"> <i class="sidebar-button-icon fa fa-lg fa-rss"></i> <span class="sidebar-button-desc">RSS</span> </a> </li> </ul> </div> </nav> <div id="main" data-behavior="5" class=" hasCoverMetaIn "> <article class="post" itemscope itemType="http://schema.org/BlogPosting"> <div class="post-header main-content-wrap text-left"> <h1 class="post-title" itemprop="headline"> That's what Nicola said </h1> <div class="postShorten-meta post-meta"> <p itemprop="author"> by mine </p> <time itemprop="datePublished" datetime="2021-04-21T00:00:00Z"> April 21, 2021 </time> <span>in</span> <a class="category-link" href="http://www.citizen-statistician.org/categories/musings">musings</a> </div> </div> <div class="post-content markdown" itemprop="articleBody"> <div class="main-content-wrap"> <script src="http://www.citizen-statistician.org/2021/04/that-s-what-nicola-said/ index_files/header-attrs/header-attrs.js"></script> <p>For the last 
year, I’ve watched just about every COVID-19 briefing by the Scottish Government, most of which are delivered by First Minister Nicola Sturgeon. Earlier on in the pandemic these were daily updates, lately it seems like once a week. The more often they happen, the worse you know things are going… If I’ve chatted with you about COVID, you have probably heard me say that I am <em>very</em> impressed by the way she delivers these updates. I’ll be honest, they are almost boring, but in the best possible way. The last thing I want from a leader at this time is surprises, showmanship, or claims with no scientific basis.</p> <p>A few weeks into the daily updates, I realized that the text of these speeches is published <a href="https://www.gov.scot/collections/first-ministers-speeches/">on the Scottish Government website</a>. So, naturally, I scraped the data and started analyzing it. You can find the full analysis at <a href="https://github.com/mine-cetinkaya-rundel/fm-speeches-covid19">github.com/mine-cetinkaya-rundel/fm-speeches-covid19</a>.</p> <p>It’s been fun to keep coming back to this project, scraping a bit more data and looking to see how/if trends are changing. Below are a few figures from the analysis that will give you a glimpse of what it’s all about.</p> <div id="highlights" class="section level2"> <h2>Highlights</h2> <div id="sentiment-over-time" class="section level3"> <h3>Sentiment over time</h3> <p>For this plot I calculated the sentiment score of each briefing as the number of words associated with a positive sentiment minus the number of words associated with a negative sentiment. Plotting sentiment scores over time shows a steady trend in the negative earlier on in the pandemic, with an upward trend towards positive since March 2021.</p> <p><img src="sentiment-over-time.png" title="Sentiment score (calculated as number of positive minus negative words) over time. 
Most briefings have a negative sentiment score, with positive scores increasing in the recent past, with an upward trend since March." alt="Sentiment score (calculated as number of positive minus negative words) over time. Most briefings have a negative sentiment score, with positive scores increasing in the recent past, with an upward trend since March." width="3150" /></p> <p>Code for creating this figure can be found <a href="https://github.com/mine-cetinkaya-rundel/fm-speeches-covid19/blob/c18c78ee987ac4e560e7134af1d4596d151b4df1/analysis/03-visualise-scot.Rmd#L174-L189">here</a>. Some notes on the code:</p> <ul> <li>I like using <code>shape</code> to distinguish groups along with <code>colour</code> by mapping the same variable to both aesthetics.</li> <li>Using custom colours is a low effort/high return way of making your plots unique, for just two colours <code>scale_colour_manual()</code> is the simplest way to achieve this.</li> </ul> <p>Early on in the sentiment analysis it was clear that the sentiment assignments didn’t work perfectly in this setting. For example, the word “positive” actually carries a negative sentiment in the COVID context, but it’s assigned a positive sentiment in the Bing lexicon. I dropped that word from the sentiment analysis, though I realize there are a few others like it that I didn’t catch/correct for.</p> </div> <div id="social-vs.-physical-distancing" class="section level3"> <h3>Social vs. physical distancing</h3> <p>Earlier in the pandemic Nicola Sturgeon used the phrase “social distancing”, then it looks like she experimented with “physical distancing” at the beginning of the summer, and has been using that phrase exclusively ever since.</p> <p><img src="social-physical-scot.png" title="Number of times the phrase social distancing or physical distancing appeared in the briefings over time in Scotland briefings. 
Early on social distancing is used, then for a while they're both used, and then for the majority of the pandemic up to today, physical distancing is used." alt="Number of times the phrase social distancing or physical distancing appeared in the briefings over time in Scotland briefings. Early on social distancing is used, then for a while they're both used, and then for the majority of the pandemic up to today, physical distancing is used." width="3150" /></p> <p>Code for creating this figure can be found <a href="https://github.com/mine-cetinkaya-rundel/fm-speeches-covid19/blob/c18c78ee987ac4e560e7134af1d4596d151b4df1/analysis/03-visualise-scot.Rmd#L273-L287">here</a>.</p> <p>Meanwhile, down in 10 Downing Street, it’s still “social distancing”… These data come from the <a href="https://www.gov.uk/search/all?content_purpose_supergroup%5B%5D=news_and_communications&level_one_taxon=5b7b9532-a775-4bd2-a3aa-6ce380184b6c&order=updated-newest&organisations%5B%5D=prime-ministers-office-10-downing-street&page=1&parent=prime-ministers-office-10-downing-street">UK government website</a>.</p> <p><img src="social-physical-uk.png" title="Number of times the phrase social distancing or physical distancing appeared in the briefings over time in UK briefings. The phrase physical distancing is never used." alt="Number of times the phrase social distancing or physical distancing appeared in the briefings over time in UK briefings. The phrase physical distancing is never used." 
width="3150" /></p> <p>Code for creating this figure can be found <a href="https://github.com/mine-cetinkaya-rundel/fm-speeches-covid19/blob/36dafd85ce8e005c81008452b57f334c8d9da617/analysis/06-visualise-uk.Rmd#L247-L261">here</a>.</p> <p>One note on the code: In order to capture “social distance” and “social distancing” (and their “physical” variants), I used <code>social dist|physical dist</code> as the regular expression to match on.<a href="#fn1" class="footnote-ref" id="fnref1"><sup>1</sup></a></p> </div> <div id="vaccines-ftw" class="section level3"> <h3>Vaccines FTW!</h3> <p>This figure shows the number of times anything related to vaccinations shows up in the Scotland briefings. It occurred to me today that it’s a “jab” not a “vaccine” here so I used <code>[Vv]accin|\\b[Jj]abs?\\b</code> as the regular expression string to try to catch all relevant mentions.</p> <p><img src="vaccines.png" title="The number of times vaccinations or anything related to them has been mentioned has increased drastically since January. In some briefings in February and March vaccinations were mentioned over 25 times in a given briefing." alt="The number of times vaccinations or anything related to them has been mentioned has increased drastically since January. In some briefings in February and March vaccinations were mentioned over 25 times in a given briefing." width="3150" /></p> <p>Code for creating this figure can be found <a href="https://github.com/mine-cetinkaya-rundel/fm-speeches-covid19/blob/c18c78ee987ac4e560e7134af1d4596d151b4df1/analysis/03-visualise-scot.Rmd#L293-L305">here</a>. Some notes on the code:</p> <ul> <li>With <a href="https://www.tidyverse.org/blog/2021/02/modern-text-features/">new advancements in graphics</a> it’s pretty straightforward to create plots that use emojis to represent data with just ggplot2. 
Whether this is a great use of emojis, I’m not so sure…</li> <li>For this figure to show up in R Markdown, I used <code>dev = "ragg_png"</code> as a code chunk option.</li> </ul> </div> <div id="still-waiting-for-pubs" class="section level3"> <h3>Still waiting for pubs…</h3> <p>The other thing everyone is talking about nowadays is pubs. Unfortunately, so far, mentions of pubs have been more frequently related to outbreaks than happy news.</p> <p><img src="pubs.png" title="Pubs were mentioned most often between August and October and most of those mentions were related to outbreaks. On August 14, 2020, pubs were mentioned 13 times, and this date corresponds to the outbreak in Aberdeen that was linked to transmission in pubs. Since October there weren't many mentions of pubs, until in the most recent briefing in April they were mentioned twice." alt="Pubs were mentioned most often between August and October and most of those mentions were related to outbreaks. On August 14, 2020, pubs were mentioned 13 times, and this date corresponds to the outbreak in Aberdeen that was linked to transmission in pubs. Since October there weren't many mentions of pubs, until in the most recent briefing in April they were mentioned twice." width="3150" /></p> <p>Code for creating this figure can be found <a href="https://github.com/mine-cetinkaya-rundel/fm-speeches-covid19/blob/c18c78ee987ac4e560e7134af1d4596d151b4df1/analysis/03-visualise-scot.Rmd#L311-L328">here</a>. Some notes on the code:</p> <ul> <li>Originally my gray lines covered part of the beer mug. My first instinct was to shift the beer mugs (that is, <code>geom_text()</code>) up in the y direction. But then I realized I can have the mugs cover the lines simply by switching their order since ggplot2 plots the layers in order.</li> <li><code>expand_limits()</code> might be my new favourite function. 
Well, it’s not new, but new to me, somehow… It allows you to ensure a single value is included in the limits without having to worry about whether it would be included or not automatically. If included, ggplot2 doesn’t change anything. If not, it extends the limit to that value.</li> </ul> </div> <div id="scotland-vs.-uk-speeches" class="section level3"> <h3>Scotland vs. UK speeches</h3> <p>The following figure shows the tf-idf (term frequency - inverse document frequency) scores of words in Scotland and UK speeches. Some of the words that show up more in Scotland briefings and much fewer (or no) times in UK briefings are geographic areas in Scotland, which makes sense. Nicola Sturgeon also regularly qualifies statistics she reports (e.g. positive cases, deaths, etc.) with the phrase “by that measurement”, and that’s probably why the word “measurement” shows up on the Scotland side. Some differences are harder to explain. Except “slides”, they say “next slide please” a lot in the UK briefings! Nicola Sturgeon doesn’t use slides, so none of that up here.</p> <p><img src="tf-idf.png" title="Words with high tf-idf for Scotland speeches are (in decreasing order) registered, Glasgow, suspected, Aberdeen, usual, measurement, issues, Lanarkshire, Clyde, either, Lothian, presiding, decrease, relation, reminder. Words with high tf-idf for UK speeches are (in decreasing order) slide, speaker, adjusting, fatalities, doctors, kingdom, alas, roadmap, mechanical, gov.uk, Merseyside, Manchester's, mayor, amazing, department " alt="Words with high tf-idf for Scotland speeches are (in decreasing order) registered, Glasgow, suspected, Aberdeen, usual, measurement, issues, Lanarkshire, Clyde, either, Lothian, presiding, decrease, relation, reminder. 
Words with high tf-idf for UK speeches are (in decreasing order) slide, speaker, adjusting, fatalities, doctors, kingdom, alas, roadmap, mechanical, gov.uk, Merseyside, Manchester's, mayor, amazing, department " width="3150" /></p> <p>Code for creating this figure can be found <a href="https://github.com/mine-cetinkaya-rundel/fm-speeches-covid19/blob/36dafd85ce8e005c81008452b57f334c8d9da617/analysis/08-visualise-compare.Rmd#L91-L102">here</a>. One note on the code: The blue used in the Scotland plot is the blue from the Scottish flag and the red in the UK plot is the red from the UK flag. Again, this is a point about using custom colours, and in this case, contextual colours.</p> </div> <div id="text-classification-models" class="section level3"> <h3>Text classification models</h3> <p>This case study was a great motivator for me to learn more about tidymodels and also text modeling. I am hugely grateful for the fantastic <a href="https://www.tidymodels.org/learn/">tidymodels learning resources</a> as well as this <a href="https://emilhvitfeldt.github.io/useR2020-text-modeling-tutorial/#1">tidy text modeling tutorial</a> by Julia Silge and Emil Hvitfeldt. The details of the model building can be found <a href="https://github.com/mine-cetinkaya-rundel/fm-speeches-covid19/blob/master/analysis/11-predict-downsample.R">here</a> but I’ll show one figure summarizing the results in this post – the variable importance figure.</p> <p>These are text features that have high importance in the classification model. Some of them match findings from the tf-idf plot above, but in the model I also used bigrams and trigrams, so you see some phrases as well as single words. “Physical distancing” shows up on the Scotland side, which makes sense based on the social vs. physical distancing finding from earlier. I can almost hear in my head Nicola Sturgeon talking about losing a “loved” one, I think she uses this phrasing in every briefing, so that word also makes sense to me. 
On the UK side, we have “slide” again, which makes me think I should have removed all the “next slide please” sentences. There is also “livelihoods” and upon closer inspection of the texts I think this is due to phrases like <em>“we are engaged in a constant struggle to protect lives and livelihoods”</em> being more common in the UK briefings than in the Scotland briefings. If what you know about Scotland is limited to <a href="https://www.youtube.com/watch?v=k7rPOaoPL4I&ab_channel=BestMoviesByFarr">Braveheart</a>, the word “freedom” showing up on the UK side might surprise you. Both countries’ briefings mention “freedom”, but it shows up a lot more often in the UK briefings, within sentences like <em>“and i must stress that it is only because of months of sacrifice and effort that we can take this small step to freedom today.”</em>.</p> <p><img src="vip.png" title="Variable importance for text features from the model. 40 features per country of origin are presented. The interesting features are discussed in the text above." alt="Variable importance for text features from the model. 40 features per country of origin are presented. The interesting features are discussed in the text above." width="1500" /></p> <p>Code for creating this figure can be found <a href="https://github.com/mine-cetinkaya-rundel/fm-speeches-covid19/blob/36dafd85ce8e005c81008452b57f334c8d9da617/analysis/11-predict-downsample.R#L214-L234">here</a>.</p> </div> </div> <div id="venues" class="section level2"> <h2>Venues</h2> <p>What started as a side project ended up being a useful resource for a few more things. 
I’ve used this case study in a few other venues:</p> <ul> <li>For teaching (all materials on Data Science in a Box, videos included): <ul> <li>Writing a function to scrape a page: <a href="https://datasciencebox.org/exploring-data.html">Unit 2 - Deck 21</a></li> <li>Iterating over many pages: <a href="https://datasciencebox.org/exploring-data.html">Unit 2 - Deck 22</a></li> <li>Text analysis: <a href="https://datasciencebox.org/looking-forward.html">Unit 5 - Decks 1 and 2</a></li> <li>Shiny: <a href="https://datasciencebox.org/looking-forward.html">Unit 5 - Deck 3</a></li> <li>Machine learning / text classification models: <a href="https://datasciencebox.org/looking-forward.html">Unit 5 - Deck 4</a></li> </ul></li> <li>In talks: <ul> <li>TwinCities WIMLDS - Mar 2021 <a href="https://mine-cetinkaya-rundel.github.io/fm-speeches-covid19/venues/twincities-wimlds/tidyverse-tidymodels.html#1">[slides]</a></li> <li>R-Ladies Edinburgh + edinbR - April 2021 <a href="https://mine-cetinkaya-rundel.github.io/fm-speeches-covid19/venues/edi-rladies-edinbr/thats-what-nicola-said.html#1">[slides]</a></li> </ul></li> </ul> </div> <div id="find-out-more" class="section level2"> <h2>Find out more</h2> <p>If you’d like to dig into the code yourself, you can find it all in <a href="https://github.com/mine-cetinkaya-rundel/fm-speeches-covid19">this GitHub repo</a>.</p> <p>I might keep scraping the data for a bit longer and update the analysis, but I sure am looking forward to the day when there are no further COVID briefings!</p> </div> <div class="footnotes"> <hr /> <ol> <li id="fn1"><p>I am now wondering if “socially distanced” or “physically distanced” might also show up. Probably should have checked for that!<a href="#fnref1" class="footnote-back">↩︎</a></p></li> </ol> </div> </div> </div> <div id="post-footer" class="post-footer main-content-wrap"> <div class="post-footer-tags"> <span class="text-color-light text-small">TAGGED IN</span><br/> <a class="tag tag--primary tag--small" 
href="http://www.citizen-statistician.org/tags/text-analysis/">text analysis</a> <a class="tag tag--primary tag--small" href="http://www.citizen-statistician.org/tags/data-science/">data science</a> </div> <div class="post-actions-wrap"> <nav > <ul class="post-actions post-action-nav"> <li class="post-action"> <a class="post-action-btn btn btn--disabled"> <i class="fa fa-angle-left"></i> <span class="hide-xs hide-sm text-small icon-ml">NEXT</span> </a> </li> <li class="post-action"> <a class="post-action-btn btn btn--default tooltip--top" href="http://www.citizen-statistician.org/2021/03/open-source-contribution-as-a-student-project/" data-tooltip="Open-source contribution as a student project"> <span class="hide-xs hide-sm text-small icon-mr">PREVIOUS</span> <i class="fa fa-angle-right"></i> </a> </li> </ul> </nav> <ul class="post-actions post-action-share" > <li class="post-action hide-lg hide-md hide-sm"> <a class="post-action-btn btn btn--default btn-open-shareoptions" href="#btn-open-shareoptions"> <i class="fas fa-share-alt"></i> </a> </li> <li class="post-action hide-xs"> <a class="post-action-btn btn btn--default" target="new" href="https://www.facebook.com/sharer/sharer.php?u=http://www.citizen-statistician.org/2021/04/that-s-what-nicola-said/"> <i class="fab fa-facebook-f"></i> </a> </li> <li class="post-action hide-xs"> <a class="post-action-btn btn btn--default" target="new" href="https://twitter.com/intent/tweet?text=http://www.citizen-statistician.org/2021/04/that-s-what-nicola-said/"> <i class="fab fa-twitter"></i> </a> </li> <li class="post-action"> <a class="post-action-btn btn btn--default" href="#disqus_thread"> <i class="far fa-comment"></i> </a> </li> <li class="post-action"> <a class="post-action-btn btn btn--default" href="#"> <i class="fa fa-list"></i> </a> </li> </ul> </div> <div id="disqus_thread"></div> <script> (function() { var d = document, s = d.createElement('script'); s.src = 'https://citizen-statistician.disqus.com/embed.js'; 
s.setAttribute('data-timestamp', +new Date()); (d.head || d.body).appendChild(s); })(); </script> <noscript>Please enable JavaScript to view the <a href="https://disqus.com/?ref_noscript">comments powered by Disqus.</a></noscript> </div> </article> <footer id="footer" class="main-content-wrap"> <span class="copyrights"> © 2021 Citizen Statistician. All Rights Reserved </span> </footer> </div> <div id="bottom-bar" class="post-bottom-bar" data-behavior="5"> <div class="post-actions-wrap"> <nav > <ul class="post-actions post-action-nav"> <li class="post-action"> <a class="post-action-btn btn btn--disabled"> <i class="fa fa-angle-left"></i> <span class="hide-xs hide-sm text-small icon-ml">NEXT</span> </a> </li> <li class="post-action"> <a class="post-action-btn btn btn--default tooltip--top" href="http://www.citizen-statistician.org/2021/03/open-source-contribution-as-a-student-project/" data-tooltip="Open-source contribution as a student project"> <span class="hide-xs hide-sm text-small icon-mr">PREVIOUS</span> <i class="fa fa-angle-right"></i> </a> </li> </ul> </nav> <ul class="post-actions post-action-share" > <li class="post-action hide-lg hide-md hide-sm"> <a class="post-action-btn btn btn--default btn-open-shareoptions" href="#btn-open-shareoptions"> <i class="fas fa-share-alt"></i> </a> </li> <li class="post-action hide-xs"> <a class="post-action-btn btn btn--default" target="new" href="https://www.facebook.com/sharer/sharer.php?u=http://www.citizen-statistician.org/2021/04/that-s-what-nicola-said/"> <i class="fab fa-facebook-f"></i> </a> </li> <li class="post-action hide-xs"> <a class="post-action-btn btn btn--default" target="new" href="https://twitter.com/intent/tweet?text=http://www.citizen-statistician.org/2021/04/that-s-what-nicola-said/"> <i class="fab fa-twitter"></i> </a> </li> <li class="post-action"> <a class="post-action-btn btn btn--default" href="#disqus_thread"> <i class="far fa-comment"></i> </a> </li> <li class="post-action"> <a 
class="post-action-btn btn btn--default" href="#"> <i class="fa fa-list"></i> </a> </li> </ul> </div> </div> <div id="share-options-bar" class="share-options-bar" data-behavior="5"> <i id="btn-close-shareoptions" class="fas fa-times"></i> <ul class="share-options"> <li class="share-option"> <a class="share-option-btn" target="new" href="https://www.facebook.com/sharer/sharer.php?u=http%3A%2F%2Fwww.citizen-statistician.org%2F2021%2F04%2Fthat-s-what-nicola-said%2F"> <i class="fab fa-facebook-f"></i><span>Share on Facebook</span> </a> </li> <li class="share-option"> <a class="share-option-btn" target="new" href="https://twitter.com/intent/tweet?text=http%3A%2F%2Fwww.citizen-statistician.org%2F2021%2F04%2Fthat-s-what-nicola-said%2F"> <i class="fab fa-twitter"></i><span>Share on Twitter</span> </a> </li> </ul> </div> <div id="share-options-mask" class="share-options-mask"></div> </div> <div id="about"> <div id="about-card"> <div id="about-btn-close"> <i class="fas fa-times"></i> </div> <img id="about-card-picture" src="http://www.citizen-statistician.org/img/logo.png" alt="Author's picture" /> <h4 id="about-card-name">Citizen Statistician</h4> <div id="about-card-bio">Learning to swim in the data deluge</div> </div> </div> <div id="algolia-search-modal" class="modal-container"> <div class="modal"> <div class="modal-header"> <span class="close-button"><i class="fas fa-times"></i></span> <a href="https://algolia.com" target="_blank" rel="noopener" class="searchby-algolia text-color-light link-unstyled"> <span class="searchby-algolia-text text-color-light text-small">by</span> <img class="searchby-algolia-logo" src="https://www.algolia.com/static_assets/images/press/downloads/algolia-light.svg"> </a> <i class="search-icon fas fa-search"></i> <form id="algolia-search-form"> <input type="text" id="algolia-search-input" name="search" class="form-control input--large search-input" placeholder="Search" /> </form> </div> <div class="modal-body"> <div class="no-result 
text-color-light text-center">no post found</div> <div class="results"> <div class="media"> <div class="media-body"> <a class="link-unstyled" href="http://www.citizen-statistician.org/2021/04/that-s-what-nicola-said/"> <h3 class="media-heading">That's what Nicola said</h3> </a> <span class="media-meta"> <span class="media-date text-small"> Apr 4, 2021 </span> </span> <div class="media-content hide-xs font-merryweather">For the last year, I’ve watched just about every COVID-19 briefing by the Scottish Government, most of which are delivered by First Minister Nicola Sturgeon. Earlier on in the pandemic these were daily updates, lately it seems like once a week. The more often they happen, the worse you know things are going… If I’ve chatted with you about COVID, you have probably heard me say that I am very impressed by the way she delivers these updates.</div> </div> <div style="clear:both;"></div> <hr> </div> <div class="media"> <div class="media-body"> <a class="link-unstyled" href="http://www.citizen-statistician.org/2021/03/open-source-contribution-as-a-student-project/"> <h3 class="media-heading">Open-source contribution as a student project</h3> </a> <span class="media-meta"> <span class="media-date text-small"> Mar 3, 2021 </span> </span> <div class="media-content hide-xs font-merryweather">An opportunity to teach, an opportunity to give back… If you’ve seen one of my data science education talks or attended one of my workshops in the last few years, you’ve probably heard me talk about the unvotes package in R. This package provides the voting history of countries in the United Nations General Assembly, along with information such as date, description, and topics for each vote. 
I love using data from this package in my teaching, especially on day one of class, because the data are rich while being accessible.</div> </div> <div style="clear:both;"></div> <hr> </div> <div class="media"> <div class="media-body"> <a class="link-unstyled" href="http://www.citizen-statistician.org/2021/03/tiktok-lockdown-and-introduction-to-r/"> <h3 class="media-heading">TikTok, lockdown, and introduction to R</h3> </a> <span class="media-meta"> <span class="media-date text-small"> Mar 3, 2021 </span> </span> <div class="media-content hide-xs font-merryweather">Last weekend Maria Tackett and I gave an introduction to R workshop as part of the 2021 ENAR Fostering Diversity in Biostatistics Workshop for high school and undergraduate students. Our goal was to give them a taster for exploring and visualizing data with R and, hopefully, leave them wanting to learn more. We only had 75 minutes for the workshop and a totally beginner crowd. We knew that they would be a mix of undergraduate and high school students, but didn’t know much else about them as we prepared for the workshop.</div> </div> <div style="clear:both;"></div> <hr> </div> <div class="media"> <div class="media-body"> <a class="link-unstyled" href="http://www.citizen-statistician.org/2021/03/in-the-beginning-was-r-markdown/"> <h3 class="media-heading">In the beginning was R Markdown</h3> </a> <span class="media-meta"> <span class="media-date text-small"> Mar 3, 2021 </span> </span> <div class="media-content hide-xs font-merryweather">Last week I attended the Toronto Workshop on Reproducibility where I had the pleasure of giving one of the keynotes. When I was asked to give a keynote for this event on teaching, I had the idea of reflecting on almost 9 years of teaching with introductory statistics and data science through the lens of reproducibility. 
I would have said “teaching with R Markdown”, but looking back through my notes, this wasn’t true as the rmarkdown package has not been around for that long – turns out I started teaching with it when it was just knitr.</div> </div> <div style="clear:both;"></div> <hr> </div> <div class="media"> <div class="media-body"> <a class="link-unstyled" href="http://www.citizen-statistician.org/2021/03/themoment-tweets/"> <h3 class="media-heading">#TheMoment tweets</h3> </a> <span class="media-meta"> <span class="media-date text-small"> Mar 3, 2021 </span> </span> <div class="media-content hide-xs font-merryweather"> <script src="http://www.citizen-statistician.org/rmarkdown-libs/header-attrs/header-attrs.js"></script> <script src="http://www.citizen-statistician.org/rmarkdown-libs/twitter-widget/widgets.js"></script> <p>On Sunday morning I came across a tweet by NPR’s <a href="https://www.npr.org/people/4462099/lourdes-garcia-navarro?t=1614556725862">Lulu Garcia-Navarro</a> asking people when they knew things were going to be different due to COVID. Whenever I read replies to a tweet like this I’m always tempted to scrape all the replies and take a look at the data to see if anything interesting emerges.</p> </div> </div> <div style="clear:both;"></div> <hr> </div> <div class="media"> <div class="media-body"> <a class="link-unstyled" href="http://www.citizen-statistician.org/2020/11/github-workflow-for-data-science-project-proposals/"> <h3 class="media-heading">GitHub workflow for data science project proposals</h3> </a> <span class="media-meta"> <span class="media-date text-small"> Nov 11, 2020 </span> </span> <div class="media-content hide-xs font-merryweather"> <script src="{{< blogdown/postref >}}index_files/header-attrs/header-attrs.js"></script> <p>Over the past few years I’ve been working on moving from a mindset of end-of-semester project to semester-long project. 
Inevitably students end up doing lots of work as the deadline approaches at the end of the semester (and I can’t blame them, that’s how I work around deadlines too, and how just about anyone I know works), but creating opportunities for them to get started on their projects earlier in the semester is very important.</p> </div> </div> <div style="clear:both;"></div> <hr> </div> <div class="media"> <div class="media-body"> <a class="link-unstyled" href="http://www.citizen-statistician.org/2020/08/data-science-tutorials-with-learnr-and-gradethis/"> <h3 class="media-heading">Data science tutorials with learnr and gradethis</h3> </a> <span class="media-meta"> <span class="media-date text-small"> Aug 8, 2020 </span> </span> <div class="media-content hide-xs font-merryweather"><p><em>This post was contributed by <a href="https://github.com/lee-suddaby">Lee Suddaby</a> and <a href="https://github.com/ZenoMK">Zeno Kujawa</a>, second year students at the University of Edinburgh majoring in Mathematics and Data Science, respectively.</em></p> <p>Over the university summer break, we (Zeno and Lee) were busy making preparations for moving more of our <a href="https://introds.org/">Introduction to Data Science</a> course from being human-graded to computer-graded. 
We both took this course in the Fall of 2019, as part of our first-year studies at the University of Edinburgh, and this is where we first learned R.</p></div> </div> <div style="clear:both;"></div> <hr> </div> <div class="media"> <div class="media-body"> <a class="link-unstyled" href="http://www.citizen-statistician.org/2020/06/teaching-statistics-and-data-science-online-workshops/"> <h3 class="media-heading">Teaching statistics and data science online workshops</h3> </a> <span class="media-meta"> <span class="media-date text-small"> Jun 6, 2020 </span> </span> <div class="media-content hide-xs font-merryweather"><p>Colin Rundel and I will be teaching a series of three virtual workshops in July 2020 on teaching statistics and data science online.</p></div> </div> <div style="clear:both;"></div> <hr> </div> <div class="media"> <div class="media-body"> <a class="link-unstyled" href="http://www.citizen-statistician.org/2020/06/preparing-to-teach-2020-what-did-we-learn/"> <h3 class="media-heading">Preparing to Teach 2020: What did we learn?</h3> </a> <span class="media-meta"> <span class="media-date text-small"> Jun 6, 2020 </span> </span> <div class="media-content hide-xs font-merryweather"><p><em>This post was contributed by <a href="https://sastoudt.github.io/">Sara Stoudt</a> (<a href="https://twitter.com/sastoudt">@sastoudt</a>). Thank you Sara!</em></p> <p>On May 15th and 20th the third <a href="https://preparingtoteach.org/">Preparing for Careers in Teaching Statistics and Data Science Workshop</a> was held. 
37 graduate students and recent PhDs gathered (remotely of course) to learn from Allan Rossman (Cal Poly), Mine Çetinkaya-Rundel (University of Edinburgh, Duke, RStudio), Jo Hardin (Pomona), Beth Chance (Cal Poly), Lucy D’Agostino McGowan (Wake Forest), and Ulrike Genschel (Iowa State).</p></div> </div> <div style="clear:both;"></div> <hr> </div> <div class="media"> <div class="media-body"> <a class="link-unstyled" href="http://www.citizen-statistician.org/2020/06/letter-to-copss-executive-committee/"> <h3 class="media-heading">Letter to the COPSS Executive Committee</h3> </a> <span class="media-meta"> <span class="media-date text-small"> Jun 6, 2020 </span> </span> <div class="media-content hide-xs font-merryweather"><p>As recent, current, and future chairs of the American Statistical Association (ASA) Section on Statistics and Data Science Education, we have sent the following letter to Ron Wasserstein (Executive Director of ASA) and Bhramar Mukherjee (COPSS Chair) and requested that they share it with the COPSS Executive Committee.</p></div> </div> <div style="clear:both;"></div> <hr> </div> </div> </div> <div class="modal-footer"> <p class="results-count text-medium" data-message-zero="no post found" data-message-one="1 post found" data-message-other="{n} posts found"> 180 posts found </p> </div> </div> </div> <div id="cover" style="background-image:url('http://www.citizen-statistician.org/images/cover.jpg');"></div> <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/2.2.4/jquery.min.js" integrity="sha256-BbhdlvQf/xTY9gja0Dq3HiwQF8LaCRTXxZKRutelT44=" crossorigin="anonymous"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.12.0/highlight.min.js" integrity="sha256-/BfiIkHlHoVihZdc6TFuj7MmJ0TWcWsMXkeDFwhi0zw=" crossorigin="anonymous"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/fancybox/2.1.7/js/jquery.fancybox.min.js" integrity="sha256-GEAnjcTqVP+vBp3SSc8bEDQqvWAZMiHyUSIorrWwH50=" 
crossorigin="anonymous"></script> <script src="http://www.citizen-statistician.org/js/script-qi9wbxp2ya2j6p7wx1i6tgavftewndznf4v0hy2gvivk1rxgc3lm7njqb6bz.min.js"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.12.0/languages/r.min.js"></script> <script lang="javascript"> window.onload = updateMinWidth; window.onresize = updateMinWidth; document.getElementById("sidebar").addEventListener("transitionend", updateMinWidth); function updateMinWidth() { var sidebar = document.getElementById("sidebar"); var main = document.getElementById("main"); main.style.minWidth = ""; var w1 = getComputedStyle(main).getPropertyValue("min-width"); var w2 = getComputedStyle(sidebar).getPropertyValue("width"); var w3 = getComputedStyle(sidebar).getPropertyValue("left"); main.style.minWidth = `calc(${w1} - ${w2} - ${w3})`; } </script> <script> $(document).ready(function() { hljs.configure({ classPrefix: '', useBR: false }); $('pre.code-highlight > code, pre > code').each(function(i, block) { if (!$(this).hasClass('codeblock')) { $(this).addClass('codeblock'); } hljs.highlightBlock(block); }); }); </script> <script> var disqus_config = function () { this.page.url = 'http:\/\/www.citizen-statistician.org\/2021\/04\/that-s-what-nicola-said\/'; this.page.identifier = '\/2021\/04\/that-s-what-nicola-said\/' }; (function() { if (window.location.hostname == "localhost") { return; } var d = document, s = d.createElement('script'); var disqus_shortname = 'citizen-statistician'; s.src = '//' + disqus_shortname + '.disqus.com/embed.js'; s.setAttribute('data-timestamp', +new Date()); (d.head || d.body).appendChild(s); })(); </script> <script id="dsq-count-scr" src="//citizen-statistician.disqus.com/count.js" async></script> </body> </html>