<?xml version="1.0" encoding="UTF-8"?> <feed xml:lang="en-US" xmlns="http://www.w3.org/2005/Atom"> <id>tag:www.githubstatus.com,2005:/history</id> <link rel="alternate" type="text/html" href="https://www.githubstatus.com"/> <link rel="self" type="application/atom+xml" href="https://www.githubstatus.com/history.atom"/> <title>GitHub Status - Incident History</title> <updated>2025-02-16T12:44:00Z</updated> <author> <name>GitHub</name> </author> <entry> <id>tag:www.githubstatus.com,2005:Incident/24025277</id> <published>2025-02-16T12:44:00Z</published> <updated>2025-02-16T12:44:00Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/nhzd9qzv27l8"/> <title>Disruption with some GitHub services</title> <content type="html"><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>12:44</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>12:43</var> UTC</small><br><strong>Update</strong> - Pull Requests is operating normally.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>12:43</var> UTC</small><br><strong>Update</strong> - Webhooks is operating normally.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>12:43</var> UTC</small><br><strong>Update</strong> - API Requests is operating normally.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>12:42</var> UTC</small><br><strong>Update</strong> - Issues is operating normally.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>12:42</var> UTC</small><br><strong>Update</strong> - Codespaces is operating normally.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>12:42</var> UTC</small><br><strong>Update</strong> - Git Operations is operating normally.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>12:42</var> UTC</small><br><strong>Update</strong> - Actions is operating normally.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>12:24</var> UTC</small><br><strong>Update</strong> - Pull Requests is experiencing degraded performance. We are continuing to investigate.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>12:10</var> UTC</small><br><strong>Update</strong> - API Requests is experiencing degraded performance. We are continuing to investigate.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>12:08</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded performance for Actions, Codespaces, Git Operations, Issues and Webhooks</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23991123</id> <published>2025-02-15T04:15:43Z</published> <updated>2025-02-15T04:15:43Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/zxtwqgc613rl"/> <title>Disruption with some GitHub services</title> <content type="html"><p><small>Feb <var data-var='date'>15</var>, <var data-var='time'>04:15</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Feb <var data-var='date'>15</var>, <var data-var='time'>04:15</var> UTC</small><br><strong>Update</strong> - We completed the rollout. 
GitHub Codespaces are healthy.</p><p><small>Feb <var data-var='date'>15</var>, <var data-var='time'>03:21</var> UTC</small><br><strong>Update</strong> - We continue the rollout in Central India, SE Asia, and Australia Codespaces regions. We are seeing a minimal number of connection failures across all regions at the moment.</p><p><small>Feb <var data-var='date'>15</var>, <var data-var='time'>01:47</var> UTC</small><br><strong>Update</strong> - We rolled out a fix to most of our Codespaces regions. Central India, SE Asia, and Australia are the remaining regions to be fixed. Customers in these remaining regions may be experiencing issues with Codespaces connectivity.</p><p><small>Feb <var data-var='date'>14</var>, <var data-var='time'>20:53</var> UTC</small><br><strong>Update</strong> - Some customers are continuing to see intermittent connection failures to their codespaces. We are monitoring closely to build a better idea of when impact should be mitigated. At this time, we expect the number of impacted users to remain low, and will update again when there is a development in our repair efforts.</p><p><small>Feb <var data-var='date'>14</var>, <var data-var='time'>20:22</var> UTC</small><br><strong>Update</strong> - Codespaces is experiencing degraded performance. We are continuing to investigate.</p><p><small>Feb <var data-var='date'>14</var>, <var data-var='time'>20:12</var> UTC</small><br><strong>Update</strong> - Some GitHub codespace users are experiencing intermittent connection failures. A deployment is underway to mitigate the problem, and US-based customers should see recovery soon. Full recovery is expected to take several hours. In the meantime, we advise customers experiencing issues to retry their connection attempts.</p><p><small>Feb <var data-var='date'>14</var>, <var data-var='time'>20:06</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23939893</id> <published>2025-02-12T23:10:48Z</published> <updated>2025-02-12T23:10:48Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/6g38m7fyc324"/> <title>Claude Sonnet unavailable in GitHub Copilot</title> <content type="html"><p><small>Feb <var data-var='date'>12</var>, <var data-var='time'>23:10</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Feb <var data-var='date'>12</var>, <var data-var='time'>23:10</var> UTC</small><br><strong>Update</strong> - Claude Sonnet is fully available in GitHub Copilot again. If you used an alternate model during the outage, you can switch back to Claude Sonnet.</p><p><small>Feb <var data-var='date'>12</var>, <var data-var='time'>23:04</var> UTC</small><br><strong>Update</strong> - We are seeing a recovery with our Claude Sonnet model provider. We'll confirm once the problem is fully resolved.</p><p><small>Feb <var data-var='date'>12</var>, <var data-var='time'>22:54</var> UTC</small><br><strong>Update</strong> - Our Claude Sonnet provider acknowledged the issue. They will provide us with the next update by 11:30 AM UTC / 3:30 PM PT. Claude Sonnet remains unavailable in GitHub Copilot, please use an alternate model.</p><p><small>Feb <var data-var='date'>12</var>, <var data-var='time'>22:41</var> UTC</small><br><strong>Update</strong> - We escalated the issue to our Claude Sonnet model provider. 
Claude Sonnet remains unavailable in GitHub Copilot, please use an alternate model.</p><p><small>Feb <var data-var='date'>12</var>, <var data-var='time'>21:59</var> UTC</small><br><strong>Update</strong> - Claude Sonnet is currently not working in GitHub Copilot. Please switch to an alternate model while we're working on resolving the issue.</p><p><small>Feb <var data-var='date'>12</var>, <var data-var='time'>21:52</var> UTC</small><br><strong>Update</strong> - Copilot is experiencing degraded performance. We are continuing to investigate.</p><p><small>Feb <var data-var='date'>12</var>, <var data-var='time'>21:51</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23866630</id> <published>2025-02-06T11:13:43Z</published> <updated>2025-02-13T15:05:54Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/yhn3m0yqdxmc"/> <title>Incident with GIT LFS and Other Requests</title> <content type="html"><p><small>Feb <var data-var='date'> 6</var>, <var data-var='time'>11:13</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Feb <var data-var='date'> 6</var>, <var data-var='time'>11:13</var> UTC</small><br><strong>Update</strong> - This issue has been mitigated. We will continue to investigate root causes to ensure this does not reoccur.</p><p><small>Feb <var data-var='date'> 6</var>, <var data-var='time'>11:05</var> UTC</small><br><strong>Update</strong> - We have scaled out database resources and rolled back recent changes and are seeing signs of mitigation, but are monitoring to ensure complete recovery.</p><p><small>Feb <var data-var='date'> 6</var>, <var data-var='time'>10:29</var> UTC</small><br><strong>Update</strong> - We are attempting to scale databases to handle observed load spikes, as well as investigating other mitigation approaches.<br /><br />Customers may intermittently experience failures to fetch repositories with LFS, as well as increased latency and errors across the API.</p><p><small>Feb <var data-var='date'> 6</var>, <var data-var='time'>09:52</var> UTC</small><br><strong>Update</strong> - We are investigating failed Git LFS requests and potentially slow API requests.<br /><br />Customers may experience failures to fetch repositories with LFS.</p><p><small>Feb <var data-var='date'> 6</var>, <var data-var='time'>09:42</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded performance for API Requests</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23854368</id> <published>2025-02-05T11:44:32Z</published> <updated>2025-02-07T23:07:33Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/4s0n40wj3l02"/> <title>Actions Larger Runners Provisioning Delays</title> <content type="html"><p><small>Feb <var data-var='date'> 5</var>, <var data-var='time'>11:44</var> UTC</small><br><strong>Resolved</strong> - Between Feb 5, 2025 00:34 UTC and 11:16 UTC, up to 7% of organizations using GitHub-hosted larger runners with public IP addresses had those jobs fail to start during the impact window. 
The issue was caused by a backend migration in the public IP management system, which caused certain public IP address runners to be placed in a non-functioning state.<br /><br />We have improved the rollback steps for this migration to reduce the time to mitigate any future recurrences, are working to improve automated detection of this error state, and are improving the resiliency of runners to handle this error state without customer impact.</p><p><small>Feb <var data-var='date'> 5</var>, <var data-var='time'>11:17</var> UTC</small><br><strong>Update</strong> - We have identified a configuration change that we believe may be related. We are working to mitigate.</p><p><small>Feb <var data-var='date'> 5</var>, <var data-var='time'>10:33</var> UTC</small><br><strong>Update</strong> - We are continuing investigation</p><p><small>Feb <var data-var='date'> 5</var>, <var data-var='time'>09:56</var> UTC</small><br><strong>Update</strong> - We continue to investigate and have determined this is limited to a subset of larger runner pools.</p><p><small>Feb <var data-var='date'> 5</var>, <var data-var='time'>09:21</var> UTC</small><br><strong>Update</strong> - We are investigating an incident where Actions larger runners are stuck in provisioning for some customers</p><p><small>Feb <var data-var='date'> 5</var>, <var data-var='time'>08:58</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23824745</id> <published>2025-02-03T19:37:49Z</published> <updated>2025-02-03T19:38:09Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/m8ntcjv1w4pr"/> <title>[Retroactive] Incident with some GitHub services</title> <content type="html"><p><small>Feb <var data-var='date'> 3</var>, <var data-var='time'>19:37</var> UTC</small><br><strong>Resolved</strong> - A component that imports external git repositories into GitHub had an incident that was caused by the improper internal configuration of a gem. We have since rolled back to a stable version, and all migrations are able to resume.</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23746865</id> <published>2025-01-30T15:39:53Z</published> <updated>2025-01-30T23:44:25Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/nm83zrdky73y"/> <title>Incident with Pull Requests and Issues</title> <content type="html"><p><small>Jan <var data-var='date'>30</var>, <var data-var='time'>15:39</var> UTC</small><br><strong>Resolved</strong> - On January 30th, 2025 from 14:22 UTC to 14:48 UTC, web requests to GitHub.com experienced failures (at peak the error rate was 44%), with the average successful request taking over 3 seconds to complete.<br /><br />This outage was caused by a hardware failure in the caching layer that supports rate limiting. In addition, the impact was prolonged due to a lack of automated failover for the caching layer. 
A manual failover of the primary to trusted hardware was performed following recovery to ensure that the issue would not reoccur under similar circumstances.<br /><br />As a result of this incident, we will be moving to a high availability cache configuration and adding resilience to cache failures at this layer to ensure requests are able to be handled should similar circumstances happen in the future.</p><p><small>Jan <var data-var='date'>30</var>, <var data-var='time'>15:39</var> UTC</small><br><strong>Update</strong> - We have completed the fail over. Services are operating as normal.</p><p><small>Jan <var data-var='date'>30</var>, <var data-var='time'>15:29</var> UTC</small><br><strong>Update</strong> - We will be failing over one of our primary caching hosts to complete our mitigation of the problem. Users will experience some temporary service disruptions until that event is complete.</p><p><small>Jan <var data-var='date'>30</var>, <var data-var='time'>14:58</var> UTC</small><br><strong>Update</strong> - We are seeing recovery in our caching infrastructure. We are continuing to monitor</p><p><small>Jan <var data-var='date'>30</var>, <var data-var='time'>14:46</var> UTC</small><br><strong>Update</strong> - Users may experience timeouts in various GitHub services. We have identified an issue with our caching infrastructure and are working to mitigate the issue</p><p><small>Jan <var data-var='date'>30</var>, <var data-var='time'>14:29</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded availability for Issues and Pull Requests</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23735682</id> <published>2025-01-29T16:30:58Z</published> <updated>2025-01-29T16:30:58Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/wg7n9ns64dsd"/> <title>Disruption with some GitHub services</title> <content type="html"><p><small>Jan <var data-var='date'>29</var>, <var data-var='time'>16:30</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Jan <var data-var='date'>29</var>, <var data-var='time'>16:29</var> UTC</small><br><strong>Update</strong> - We have pushed a fix and are seeing general recovery.</p><p><small>Jan <var data-var='date'>29</var>, <var data-var='time'>16:09</var> UTC</small><br><strong>Update</strong> - We're continuing to investigate an issue related to Copilot Chat on GitHub.com</p><p><small>Jan <var data-var='date'>29</var>, <var data-var='time'>15:37</var> UTC</small><br><strong>Update</strong> - We're continuing to investigate an issue related to Copilot Chat on GitHub.com</p><p><small>Jan <var data-var='date'>29</var>, <var data-var='time'>15:04</var> UTC</small><br><strong>Update</strong> - We're seeing issues related to Copilot chat on GitHub.com</p><p><small>Jan <var data-var='date'>29</var>, <var data-var='time'>14:52</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23716236</id> <published>2025-01-27T23:41:13Z</published> <updated>2025-01-27T23:41:13Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/r6j3fnl9j58q"/> <title>Disruption with some GitHub services</title> <content type="html"><p><small>Jan <var data-var='date'>27</var>, <var data-var='time'>23:41</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Jan <var 
data-var='date'>27</var>, <var data-var='time'>23:32</var> UTC</small><br><strong>Update</strong> - Our Audit Log Streaming service is experiencing degradation but no data outage.</p><p><small>Jan <var data-var='date'>27</var>, <var data-var='time'>23:32</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23723119</id> <published>2025-01-26T21:00:00Z</published> <updated>2025-01-28T14:43:18Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/7p05r5610bqd"/> <title>Incident With Migration Service</title> <content type="html"><p><small>Jan <var data-var='date'>26</var>, <var data-var='time'>21:00</var> UTC</small><br><strong>Resolved</strong> - Between Sunday 20:50 UTC and Monday 15:20 UTC the Migrations service was unable to process migrations. This was due to an invalid infrastructure credential. <br /><br />We mitigated the issue by updating the credential internally.<br /><br />Mechanisms and automation will be implemented to detect and prevent this issue in the future.</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23665952</id> <published>2025-01-23T17:27:02Z</published> <updated>2025-01-24T20:09:09Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/gszmt35n7k20"/> <title>Incident with Actions</title> <content type="html"><p><small>Jan <var data-var='date'>23</var>, <var data-var='time'>17:27</var> UTC</small><br><strong>Resolved</strong> - On January 23, 2025, between 9:49 and 17:00 UTC, the available capacity of large hosted runners was degraded. On average, 26% of jobs requiring large runners had a >5min delay getting a runner assigned. This was caused by the rollback of a configuration change and a latent bug in event processing, which was triggered by the mixed data shape that resulted from the rollback. The processing would reprocess the same events unnecessarily and cause the background job that manages large runner creation and deletion to run out of resources. It would automatically restart and continue processing, but the jobs were not able to keep up with production traffic. We mitigated the impact by using a feature flag to bypass the problematic event processing logic. While these changes had been rolling out in stages over the last few months and had been safely rolled back previously, an unrelated change prevented rollback from causing this problem in earlier stages.<br /><br />We are reviewing and updating the feature flags in this event processing workflow to ensure that we have high confidence in rollback in all rollout stages. We are also improving observability of the event processing to reduce the time to diagnose and mitigate similar issues going forward.</p><p><small>Jan <var data-var='date'>23</var>, <var data-var='time'>17:03</var> UTC</small><br><strong>Update</strong> - We are seeing recovery with the latest mitigation. Queue time for a very small percentage of larger runner jobs is still longer than expected so we are monitoring those for full recovery before going green.</p><p><small>Jan <var data-var='date'>23</var>, <var data-var='time'>16:25</var> UTC</small><br><strong>Update</strong> - We are actively applying mitigations to help improve larger runner start times. 
We are currently seeing delays starting about 25% of larger runner jobs.</p><p><small>Jan <var data-var='date'>23</var>, <var data-var='time'>15:33</var> UTC</small><br><strong>Update</strong> - We are still actively investigating a slowdown in larger runner assignment and are working to apply additional mitigations.</p><p><small>Jan <var data-var='date'>23</var>, <var data-var='time'>14:53</var> UTC</small><br><strong>Update</strong> - We're still applying mitigations to unblock queueing Actions in large runners. We are monitoring for full recovery.</p><p><small>Jan <var data-var='date'>23</var>, <var data-var='time'>14:17</var> UTC</small><br><strong>Update</strong> - We are applying further mitigations to fix the issues with delayed queuing for Actions jobs in large runners. We continue to monitor for full recovery.</p><p><small>Jan <var data-var='date'>23</var>, <var data-var='time'>13:42</var> UTC</small><br><strong>Update</strong> - We are investigating further mitigations for queueing Actions jobs in large runners. We continue to watch telemetry and are monitoring for full recovery.</p><p><small>Jan <var data-var='date'>23</var>, <var data-var='time'>13:09</var> UTC</small><br><strong>Update</strong> - We've applied a mitigation to fix the issues with queuing and running Actions jobs. We are seeing improvements in telemetry and are monitoring for full recovery.</p><p><small>Jan <var data-var='date'>23</var>, <var data-var='time'>12:36</var> UTC</small><br><strong>Update</strong> - The team continues to apply mitigations for issues with some Actions jobs delayed being enqueued for larger runners seen by a small number of customers. We will continue providing updates on the progress towards full mitigation.</p><p><small>Jan <var data-var='date'>23</var>, <var data-var='time'>12:03</var> UTC</small><br><strong>Update</strong> - The team continues to apply mitigations for issues with some Actions jobs delayed being enqueued for larger runners. We will continue providing updates on the progress towards full mitigation.</p><p><small>Jan <var data-var='date'>23</var>, <var data-var='time'>11:31</var> UTC</small><br><strong>Update</strong> - The team continues to investigate issues with some Actions jobs delayed being enqueued for larger runners. We will continue providing updates on the progress towards mitigation.</p><p><small>Jan <var data-var='date'>23</var>, <var data-var='time'>10:58</var> UTC</small><br><strong>Update</strong> - The team continues to investigate issues with some Actions jobs having delays in being queued for larger runners. We will continue providing updates on the progress towards mitigation.</p><p><small>Jan <var data-var='date'>23</var>, <var data-var='time'>10:25</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded performance for Actions</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23575724</id> <published>2025-01-16T09:40:06Z</published> <updated>2025-02-12T18:04:27Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/78wybrzyv0wf"/> <title>Incident with Pull Request Rebase Merges</title> <content type="html"><p><small>Jan <var data-var='date'>16</var>, <var data-var='time'>09:40</var> UTC</small><br><strong>Resolved</strong> - On January 16, 2025, between 00:45 UTC and 09:40 UTC the Pull Requests service was degraded and failed to generate rebase merge commits. This was due to a configuration change that introduced disagreements between replicas. 
These disagreements caused a secondary job to run, triggering timeouts while computing rebase merge commits. <br /><br />We mitigated the incident by rolling back the configuration change.<br /><br />We are working on improving our monitoring and deployment practices to reduce our time to detection and mitigation of issues like this one in the future.</p><p><small>Jan <var data-var='date'>16</var>, <var data-var='time'>09:39</var> UTC</small><br><strong>Update</strong> - The incident has been resolved, but please note affected pull requests will self-repair when any commits are pushed to the pull requests' base branch or head branch. If you encounter problems with a rebase and merge, either click the "update branch" button or push a commit to the PR's branch.</p><p><small>Jan <var data-var='date'>16</var>, <var data-var='time'>09:18</var> UTC</small><br><strong>Update</strong> - We have mitigated the incident, and any new pull request rebase merges should be recovered. We are working on recovery steps for any pull requests that attempted to merge during this incident.</p><p><small>Jan <var data-var='date'>16</var>, <var data-var='time'>08:37</var> UTC</small><br><strong>Update</strong> - We believe we have found a root cause and are in the process of verifying the mitigation.</p><p><small>Jan <var data-var='date'>16</var>, <var data-var='time'>07:38</var> UTC</small><br><strong>Update</strong> - We are still continuing to investigate.</p><p><small>Jan <var data-var='date'>16</var>, <var data-var='time'>07:05</var> UTC</small><br><strong>Update</strong> - We are still experiencing failures for rebase merges in pull requests, we are continuing to investigate.</p><p><small>Jan <var data-var='date'>16</var>, <var data-var='time'>06:22</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded performance for Pull Requests</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23556124</id> <published>2025-01-14T21:20:19Z</published> <updated>2025-01-27T17:05:22Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/bll83wkgdd2m"/> <title>Disruption connecting to Codespaces</title> <content type="html"><p><small>Jan <var data-var='date'>14</var>, <var data-var='time'>21:20</var> UTC</small><br><strong>Resolved</strong> - On January 14, 2025, between 19:13 UTC and 21:20 UTC the Codespaces service was degraded and led to connection failures with running codespaces, with a 7.6% failure rate for connections during the degradation. Users with bad connections could not use impacted codespaces until they were stopped and restarted.<br /><br />This was caused by bad connections left behind after a deployment in an upstream dependency that the Codespaces service still provided to clients. The incident self-mitigated as new connections replaced stale ones. We are coordinating to ensure connection stability with future deployments of this nature.</p><p><small>Jan <var data-var='date'>14</var>, <var data-var='time'>21:19</var> UTC</small><br><strong>Update</strong> - We are beginning to see recovery for users connecting to Codespaces. Any users continuing to see impact should attempt a restart.</p><p><small>Jan <var data-var='date'>14</var>, <var data-var='time'>20:55</var> UTC</small><br><strong>Update</strong> - We are investigating reports of timeouts for Codespaces users creating new or connecting to existing Codespaces. 
We will continue to keep users updated on progress towards mitigation.</p><p><small>Jan <var data-var='date'>14</var>, <var data-var='time'>20:55</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded performance for Codespaces</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23543623</id> <published>2025-01-14T00:28:58Z</published> <updated>2025-01-14T00:50:13Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/qd96yfgvmcf9"/> <title>Incident with Git Operations</title> <content type="html"><p><small>Jan <var data-var='date'>14</var>, <var data-var='time'>00:28</var> UTC</small><br><strong>Resolved</strong> - On January 13, 2025, between 23:35 UTC and 00:24 UTC all Git operations were unavailable due to a configuration change causing our internal load balancer to drop requests between services that Git relies upon.<br /><br />We mitigated the incident by rolling back the configuration change.<br /><br />We are improving our monitoring and deployment practices to reduce our time to detection and automated mitigation for issues like this in the future.</p><p><small>Jan <var data-var='date'>14</var>, <var data-var='time'>00:15</var> UTC</small><br><strong>Update</strong> - We've identified a cause of degraded git operations, which may affect other GitHub services that rely upon git. We're working to remediate.</p><p><small>Jan <var data-var='date'>13</var>, <var data-var='time'>23:57</var> UTC</small><br><strong>Update</strong> - Actions is experiencing degraded performance. We are continuing to investigate.</p><p><small>Jan <var data-var='date'>13</var>, <var data-var='time'>23:46</var> UTC</small><br><strong>Update</strong> - Pages is experiencing degraded performance. We are continuing to investigate.</p><p><small>Jan <var data-var='date'>13</var>, <var data-var='time'>23:44</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded availability for Git Operations</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23496521</id> <published>2025-01-09T20:00:26Z</published> <updated>2025-01-14T01:31:48Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/jbhcrj5xtggx"/> <title>Issues with VNet Injected Larger Hosted Runners in East US 2</title> <content type="html"><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>20:00</var> UTC</small><br><strong>Resolved</strong> - On January 9, 2025, larger hosted runners configured with Azure private networking in East US 2 were degraded, causing delayed job starts for ~2,300 jobs between 16:00 and 20:00 UTC. There was also an earlier period of impact from 2025-01-08 22:00 UTC to 2025-01-09 4:10 UTC with 488 jobs impacted. The cause of both these delays was an incident in East US 2 impacting provisioning and network connectivity of Azure resources. More details on that incident are visible at https://azure.status.microsoft/en-us/status/history (Tracking ID: PLP3-1W8). Because these runners are reliant on private networking with networks in the East US 2 region, there were no immediate mitigations available other than restoring network connectivity. 
Going forward, we will continue evaluating options to provide better resilience to 3rd party regional outages that affect private networking customers.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>20:00</var> UTC</small><br><strong>Update</strong> - The impact to Large Runners has been mitigated. The third party incident has not been fully mitigated but is being actively monitored at https://azure.status.microsoft/en-us/status in case of reoccurrence.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>19:27</var> UTC</small><br><strong>Update</strong> - We are continuing to see improvements while still monitoring updates from the third party at https://azure.status.microsoft/en-us/status</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>18:53</var> UTC</small><br><strong>Update</strong> - We are still monitoring the third party networking updates via https://azure.status.microsoft/en-us/status. Multiple workstreams are in progress by the third party to mitigate the impact.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>18:18</var> UTC</small><br><strong>Update</strong> - We are still monitoring the third party networking updates via https://azure.status.microsoft/en-us/status. Multiple workstreams are in progress by the third party to mitigate the impact.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>17:43</var> UTC</small><br><strong>Update</strong> - The underlying third party networking issues have been identified and are being worked on. Ongoing updates can be found at https://azure.status.microsoft/en-us/status</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>17:12</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23490363</id> <published>2025-01-09T08:30:41Z</published> <updated>2025-01-13T20:21:38Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/l76qh74ryp16"/> <title>Some GitHub Actions may not run</title> <content type="html"><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>08:30</var> UTC</small><br><strong>Resolved</strong> - On January 9, 2025, between 06:26 and 07:49 UTC, Actions experienced degraded performance, leading to failures in about 1% of workflow runs across ~10k repositories. The failures occurred due to an outage in a dependent service, which disrupted Redis connectivity in the East US 2 region. We mitigated the incident by re-routing Redis traffic out of that region at 07:49 UTC. We continued to monitor service recovery before resolving the incident at 08:30 UTC. We are working to improve our monitoring to reduce our time to detection and mitigation of issues like this one in the future.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>08:30</var> UTC</small><br><strong>Update</strong> - Actions is operating normally.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>08:17</var> UTC</small><br><strong>Update</strong> - We have seen recovery of Actions runs for affected repositories. We are verifying all remediations before resolving this incident.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>07:47</var> UTC</small><br><strong>Update</strong> - We have identified the problem and are proceeding with a fail-over remediation. 
We anticipate this will allow Actions Runs to proceed for affected repositories.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>07:17</var> UTC</small><br><strong>Update</strong> - 1-2% of repositories may have Actions jobs that are blocked and are not running or will be delayed. We have identified a potential cause. We are confirming and will be working on remediation.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>07:15</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded performance for Actions</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23487850</id> <published>2025-01-09T02:27:04Z</published> <updated>2025-01-11T00:45:23Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/plgzz71xn6zq"/> <title>Incident with Webhooks</title> <content type="html"><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>02:27</var> UTC</small><br><strong>Resolved</strong> - On January 9, 2025, between 01:26 UTC and 01:56 UTC GitHub experienced widespread disruption to many services, with users receiving 500 responses when trying to access various functionality. This was due to a deployment which introduced a query that saturated a primary database server. On average, the error rate was 6% and peaked at 6.85% of update requests.<br /><br />We mitigated the incident by identifying the source of the problematic query and rolling back the deployment.<br /><br />We are investigating methods to detect problematic queries prior to deployment to prevent, and to reduce our time to detection and mitigation of, issues like this one in the future.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>02:19</var> UTC</small><br><strong>Update</strong> - We have identified the root cause and have deployed a fix. The majority of the services have recovered. Actions service is in the process of being recovered.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>02:14</var> UTC</small><br><strong>Update</strong> - Copilot is operating normally.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>02:13</var> UTC</small><br><strong>Update</strong> - Issues is operating normally.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>02:13</var> UTC</small><br><strong>Update</strong> - Pages is operating normally.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>02:12</var> UTC</small><br><strong>Update</strong> - Git Operations is operating normally.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>02:12</var> UTC</small><br><strong>Update</strong> - Webhooks is operating normally.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>02:12</var> UTC</small><br><strong>Update</strong> - Pull Requests is operating normally.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>02:11</var> UTC</small><br><strong>Update</strong> - Codespaces is operating normally.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>02:09</var> UTC</small><br><strong>Update</strong> - We have identified the root cause and have deployed a fix. Services are recovering.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>02:01</var> UTC</small><br><strong>Update</strong> - API Requests is experiencing degraded performance. 
We are continuing to investigate.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>01:59</var> UTC</small><br><strong>Update</strong> - We are continuing the investigation of multiple service issues. We will continue to keep users updated on progress towards mitigation.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>01:53</var> UTC</small><br><strong>Update</strong> - Copilot is experiencing degraded performance. We are continuing to investigate.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>01:51</var> UTC</small><br><strong>Update</strong> - Codespaces is experiencing degraded performance. We are continuing to investigate.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>01:51</var> UTC</small><br><strong>Update</strong> - Codespaces is experiencing degraded availability. We are continuing to investigate.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>01:49</var> UTC</small><br><strong>Update</strong> - Git Operations is experiencing degraded availability. We are continuing to investigate.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>01:46</var> UTC</small><br><strong>Update</strong> - We are investigating reports of issues with multiple services including authentication, PRs, codespaces, pages, git operation, and apis. We will continue to keep users updated on progress towards mitigation.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>01:44</var> UTC</small><br><strong>Update</strong> - Pages is experiencing degraded performance. We are continuing to investigate.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>01:43</var> UTC</small><br><strong>Update</strong> - Git Operations is experiencing degraded performance. We are continuing to investigate.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>01:42</var> UTC</small><br><strong>Update</strong> - Pull Requests is experiencing degraded performance. We are continuing to investigate.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>01:41</var> UTC</small><br><strong>Update</strong> - Issues is experiencing degraded performance. We are continuing to investigate.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>01:37</var> UTC</small><br><strong>Update</strong> - Actions is experiencing degraded performance. We are continuing to investigate.</p><p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>01:36</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded availability for Webhooks</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23468172</id> <published>2025-01-07T16:39:23Z</published> <updated>2025-01-10T01:11:29Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/dk61qxd21mtl"/> <title>Incident with Actions resulting in degraded performance</title> <content type="html"><p><small>Jan <var data-var='date'> 7</var>, <var data-var='time'>16:39</var> UTC</small><br><strong>Resolved</strong> - On January 7th, 2025 between 11:54:00 and 16:39 UTC, degraded performance was observed in Actions, Webhooks, and Issues, caused by an internal Certificate Authority configuration change that disrupted our event infrastructure. 
The configuration issue was promptly identified and resolved by rolling the change back on impacted hosts and re-issuing certificates.<br /><br />We have identified what services need updates to support the current PKI architecture and are working on implementing those changes to prevent a future recurrence.</p><p><small>Jan <var data-var='date'> 7</var>, <var data-var='time'>16:39</var> UTC</small><br><strong>Update</strong> - Issues is operating normally.</p><p><small>Jan <var data-var='date'> 7</var>, <var data-var='time'>16:38</var> UTC</small><br><strong>Update</strong> - Actions is operating normally.</p><p><small>Jan <var data-var='date'> 7</var>, <var data-var='time'>16:38</var> UTC</small><br><strong>Update</strong> - Webhooks is operating normally.</p><p><small>Jan <var data-var='date'> 7</var>, <var data-var='time'>16:09</var> UTC</small><br><strong>Update</strong> - Webhooks is experiencing degraded performance. We are continuing to investigate.</p><p><small>Jan <var data-var='date'> 7</var>, <var data-var='time'>15:59</var> UTC</small><br><strong>Update</strong> - We have identified a configuration issue that we believe is the source of the Action workflow job delays and page latency increases. We are continuing to investigate and mitigate the issue.</p><p><small>Jan <var data-var='date'> 7</var>, <var data-var='time'>15:51</var> UTC</small><br><strong>Update</strong> - Issues is experiencing degraded performance. We are continuing to investigate.</p><p><small>Jan <var data-var='date'> 7</var>, <var data-var='time'>15:17</var> UTC</small><br><strong>Update</strong> - Users may see delays with Action workflow jobs in the UI and API responses. Additionally, several endpoints, including some Pull Request pages are experiencing increased latency. We are continuing to investigate the issue.</p><p><small>Jan <var data-var='date'> 7</var>, <var data-var='time'>14:49</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded performance for Actions</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23417382</id> <published>2025-01-03T00:19:10Z</published> <updated>2025-01-06T23:30:24Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/6ddp6v8g5t72"/> <title>Incident with Actions</title> <content type="html"><p><small>Jan <var data-var='date'> 3</var>, <var data-var='time'>00:19</var> UTC</small><br><strong>Resolved</strong> - On January 2, 2025 between 16:00:00 and 22:27:30 UTC, a bug in feature-flagged code that cleans up Pull Requests after they are closed or merged incorrectly cleared the merge commit SHA for ~139,000 pull requests. During the incident, Actions workflows triggered by the <i>on: pull_request</i> trigger for the <i>closed</i> type were not queued successfully because of these missing merge commit SHAs. Approximately 45,000 repositories experienced these missing workflow triggers in either of two possible scenarios: pull requests which were closed, but not merged; and pull requests which were merged. Impact was mitigated after rolling back the aforementioned feature flag. <br /><br />Merged pull requests that were affected have had their merge commit SHAs restored. Closed pull requests have not had their merge commit SHA restored; however, customers can re-open and close them again to recalculate this SHA. 
We are investigating methods to improve detection of these kinds of errors in the future.</p><p><small>Jan <var data-var='date'> 3</var>, <var data-var='time'>00:19</var> UTC</small><br><strong>Update</strong> - All systems are operational, and we have a plan to backfill the missing metadata. In total, 139,000 PRs were impacted across 45,000 repositories. The backfilled metadata will be available in a few days.<br /><br />Until the backfill is complete, there are several actions you can take to ensure an Action runs:<br />- Any Actions that should have run on closed but not merged PRs can be triggered by re-opening and re-closing the PR.<br />- Actions that should have run on PR merge can be re-run from the main branch of your repository.<br /><br />The only Actions that cannot be re-run at this time are ones that specifically use the merge commit.<br /><br />Additionally, the `merge_commit_sha` field on an impacted Pull Request will be `null` when queried via our API until the backfill completes.<br /><br />We appreciate the error reports we received, and thank you for your patience. We mitigated the initial impact quickly by rolling back a feature flag. We will be improving the monitoring of our feature flag rollouts in the future to better catch these scenarios.</p><p><small>Jan <var data-var='date'> 2</var>, <var data-var='time'>23:11</var> UTC</small><br><strong>Update</strong> - We have remediated the issue impacting Actions workflows. During investigation and remediation, we realized there were also issues with recording metadata around merge commits. No git data or code has been lost. PRs merged today between 20:06 UTC and 22:15 UTC are impacted. We are working on a plan to regenerate the missing metadata and will provide an update once we have one in place.</p><p><small>Jan <var data-var='date'> 2</var>, <var data-var='time'>23:05</var> UTC</small><br><strong>Update</strong> - Pull Requests is experiencing degraded performance. We are continuing to investigate.</p><p><small>Jan <var data-var='date'> 2</var>, <var data-var='time'>22:30</var> UTC</small><br><strong>Update</strong> - We have identified and begun to remediate the issue preventing Actions from triggering on closed pull requests. We are beginning to see recovery.</p><p><small>Jan <var data-var='date'> 2</var>, <var data-var='time'>22:09</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded performance for Actions</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23274163</id> <published>2024-12-20T16:44:36Z</published> <updated>2024-12-24T21:56:41Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/5hjghvwvqztc"/> <title>Disruption with some GitHub services</title> <content type="html"><p><small>Dec <var data-var='date'>20</var>, <var data-var='time'>16:44</var> UTC</small><br><strong>Resolved</strong> - On December 20th, 2024, between 15:57 UTC and 16:39 UTC some of our marketing pages became inaccessible and users attempting to access the pages would have received 500 errors. There was no impact to any operational product or service area. This issue was due to a partial outage with one of our service providers. At 16:39 UTC the service provider resolved the outage, restoring access to the affected pages. 
We are investigating methods to improve error handling and gracefully degrade these pages in case of future outages.</p><p><small>Dec <var data-var='date'>20</var>, <var data-var='time'>16:42</var> UTC</small><br><strong>Update</strong> - This issue is related to a partner who is working on the problem; they are in partial recovery.</p><p><small>Dec <var data-var='date'>20</var>, <var data-var='time'>16:20</var> UTC</small><br><strong>Update</strong> - We're seeing issues related to some of our marketing pages. We are investigating.</p><p><small>Dec <var data-var='date'>20</var>, <var data-var='time'>16:18</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23233596</id> <published>2024-12-17T16:00:10Z</published> <updated>2024-12-18T21:44:07Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/fnq063tqh7cc"/> <title>Live updates on pages not loading reliably</title> <content type="html"><p><small>Dec <var data-var='date'>17</var>, <var data-var='time'>16:00</var> UTC</small><br><strong>Resolved</strong> - On December 17th, 2024, between 14:33 UTC and 14:50 UTC, many users experienced intermittent errors and timeouts when accessing github.com. The error rate was 8.5% on average and peaked at 44.3% of requests. The increased error rate caused a broad impact across our services, such as the inability to log in, view a repository, open a pull request, and comment on issues. The errors were caused by our web servers being overloaded as a result of planned maintenance that unintentionally caused our live updates service to fail to start. As a result of the live updates service being down, clients reconnected aggressively and overloaded our servers.<br /><br />We only marked Issues as affected during this incident despite the broad impact. This oversight was due to a gap in our alerting while our web servers were overloaded. The engineering team's focus on restoring functionality led us to not identify the broad scope of the impact to customers until the incident had already been mitigated.<br /><br />We mitigated the incident by rolling back the changes from the planned maintenance to the live updates service and scaling up the service to handle the influx of traffic from WebSocket clients.<br /><br />We are working to reduce the impact of the live updates service's availability on github.com to prevent issues like this one in the future. We are also working to improve our alerting to better detect the scope of impact from incidents like this.</p><p><small>Dec <var data-var='date'>17</var>, <var data-var='time'>15:32</var> UTC</small><br><strong>Update</strong> - Issues is operating normally.</p><p><small>Dec <var data-var='date'>17</var>, <var data-var='time'>15:29</var> UTC</small><br><strong>Update</strong> - We have taken some mitigation steps and are continuing to investigate the issue. There was a period of wider impact on many GitHub services such as user logins and page loads which should now be mitigated.</p><p><small>Dec <var data-var='date'>17</var>, <var data-var='time'>15:05</var> UTC</small><br><strong>Update</strong> - Issues is experiencing degraded availability. We are continuing to investigate.</p><p><small>Dec <var data-var='date'>17</var>, <var data-var='time'>14:53</var> UTC</small><br><strong>Update</strong> - We are currently seeing live updates on some pages not working. 
This can impact features such as status checks and the merge button for PRs.<br /><br />Current mitigation is to refresh pages manually to see latest details.<br /><br />We are working to mitigate this and will continue to provide updates as the team makes progress.</p><p><small>Dec <var data-var='date'>17</var>, <var data-var='time'>14:51</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded performance for Issues</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23095063</id> <published>2024-12-06T17:17:36Z</published> <updated>2025-01-15T14:48:40Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/d33mtmnttgsh"/> <title>Disruption with some GitHub services</title> <content type="html"><p><small>Dec <var data-var='date'> 6</var>, <var data-var='time'>17:17</var> UTC</small><br><strong>Resolved</strong> - Upon further investigation, the degradation in migrations in the EU was caused by an internal configuration issue, which was promptly identified and resolved. No customer migrations were impacted during this time and the issue only affected GitHub Enterprise Cloud - EU and had no impact on GitHub.com. The service is now fully operational. We are following up by improving our processes for these internal configuration changes to prevent a recurrence, and to have incidents that affect GitHub Enterprise Cloud - EU be reported on https://eu.githubstatus.com/.</p><p><small>Dec <var data-var='date'> 6</var>, <var data-var='time'>17:17</var> UTC</small><br><strong>Update</strong> - Migrations are failing for a subset of users in the EU region with data residency. We believe we have resolved the issue and are monitoring for resolution.</p><p><small>Dec <var data-var='date'> 6</var>, <var data-var='time'>16:58</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23070020</id> <published>2024-12-04T19:27:34Z</published> <updated>2024-12-19T19:34:01Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/4349zxvb8stp"/> <title>Disruption with some GitHub services</title> <content type="html"><p><small>Dec <var data-var='date'> 4</var>, <var data-var='time'>19:27</var> UTC</small><br><strong>Resolved</strong> - On December 4th, 2024 between 18:52 UTC and 19:11 UTC, several GitHub services were degraded with an average error rate of 8%.<br /><br />The incident was caused by a change to a centralized authorization service that contained an unoptimized database query. This led to an increase in overall load on a shared database cluster, resulting in a cascading effect on multiple services and specifically affecting repository access authorization checks. We mitigated the incident after rolling back the change at 19:07 UTC, fully recovering within 4 minutes. <br /><br />While this incident was caught and remedied quickly, we are implementing process improvements around recognizing and reducing risk of changes involving high volume authorization checks. 
We are investing in broad improvements to our safe rollout process, such as improving early detection mechanisms.</p><p><small>Dec <var data-var='date'> 4</var>, <var data-var='time'>19:26</var> UTC</small><br><strong>Update</strong> - Pull Requests is operating normally.</p><p><small>Dec <var data-var='date'> 4</var>, <var data-var='time'>19:21</var> UTC</small><br><strong>Update</strong> - Pull Requests is experiencing degraded performance. We are continuing to investigate.</p><p><small>Dec <var data-var='date'> 4</var>, <var data-var='time'>19:20</var> UTC</small><br><strong>Update</strong> - Issues is operating normally.</p><p><small>Dec <var data-var='date'> 4</var>, <var data-var='time'>19:18</var> UTC</small><br><strong>Update</strong> - API Requests is operating normally.</p><p><small>Dec <var data-var='date'> 4</var>, <var data-var='time'>19:17</var> UTC</small><br><strong>Update</strong> - Webhooks is operating normally.</p><p><small>Dec <var data-var='date'> 4</var>, <var data-var='time'>19:11</var> UTC</small><br><strong>Update</strong> - We have identified the cause of timeouts impacting users across multiple services. This change was rolled back and we are seeing recovery. We will continue to monitor for complete recovery.</p><p><small>Dec <var data-var='date'> 4</var>, <var data-var='time'>19:07</var> UTC</small><br><strong>Update</strong> - Issues is experiencing degraded performance. We are continuing to investigate.</p><p><small>Dec <var data-var='date'> 4</var>, <var data-var='time'>19:05</var> UTC</small><br><strong>Update</strong> - API Requests is experiencing degraded performance. We are continuing to investigate.</p><p><small>Dec <var data-var='date'> 4</var>, <var data-var='time'>19:05</var> UTC</small><br><strong>Update</strong> - Webhooks is experiencing degraded performance. We are continuing to investigate.</p><p><small>Dec <var data-var='date'> 4</var>, <var data-var='time'>18:58</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23058715</id> <published>2024-12-03T23:30:00Z</published> <updated>2024-12-04T00:35:50Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/lbdsk3990lz5"/> <title>[Retroactive] Incident with Pull Requests</title> <content type="html"><p><small>Dec <var data-var='date'> 3</var>, <var data-var='time'>23:30</var> UTC</small><br><strong>Resolved</strong> - On December 3rd, between 23:29 and 23:43 UTC, Pull Requests experienced a brief outage and teams have confirmed the issue to be resolved. Due to the brevity of the incident, it was not publicly statused at the time; however, an RCA will be conducted and shared in due course.</p></content> </entry> <entry> <id>tag:www.githubstatus.com,2005:Incident/23055857</id> <published>2024-12-03T20:05:05Z</published> <updated>2024-12-04T23:36:39Z</updated> <link rel="alternate" type="text/html" href="https://www.githubstatus.com/incidents/w6g0cmvyx3vm"/> <title>Incident with Pull Requests and API Requests</title> <content type="html"><p><small>Dec <var data-var='date'> 3</var>, <var data-var='time'>20:05</var> UTC</small><br><strong>Resolved</strong> - On December 3, 2024, between 19:35 UTC and 20:05 UTC API requests, Actions, Pull Requests and Issues were degraded. Web and API requests for Pull Requests experienced a 3.5% error rate and Issues had a 1.2% error rate. 
The highest impact was for users who experienced errors while creating and commenting on Pull Requests and Issues. Actions had a 3.3% error rate in jobs and delays on some updates during this time.<br /><br />This was due to an erroneous database credential change impacting write access to Issues and Pull Requests data. We mitigated the incident by reverting the credential change at 19:52 UTC. We continued to monitor service recovery before resolving the incident at 20:05 UTC. <br /><br />There are a few improvements we are making in response to this. We are investing in safeguards to the change management process in order to prevent erroneous database credential changes. Additionally, the initial rollback attempt was unsuccessful, which led to a longer time to mitigate. We were able to revert through an alternative method and are updating our playbooks to document this mitigation strategy.</p><p><small>Dec <var data-var='date'> 3</var>, <var data-var='time'>20:05</var> UTC</small><br><strong>Update</strong> - Pull Requests is operating normally.</p><p><small>Dec <var data-var='date'> 3</var>, <var data-var='time'>20:04</var> UTC</small><br><strong>Update</strong> - Actions is operating normally.</p><p><small>Dec <var data-var='date'> 3</var>, <var data-var='time'>20:02</var> UTC</small><br><strong>Update</strong> - API Requests is operating normally.</p><p><small>Dec <var data-var='date'> 3</var>, <var data-var='time'>19:59</var> UTC</small><br><strong>Update</strong> - We have taken mitigating actions and are starting to see recovery but are continuing to monitor and ensure full recovery. Some users may still see errors.</p><p><small>Dec <var data-var='date'> 3</var>, <var data-var='time'>19:54</var> UTC</small><br><strong>Update</strong> - Some users will experience problems with certain features of pull requests, actions, issues and other areas. We are aware of the issue, know the cause, and are working on a mitigation.</p><p><small>Dec <var data-var='date'> 3</var>, <var data-var='time'>19:48</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of degraded performance for API Requests, Actions and Pull Requests</p></content> </entry> </feed>