<!DOCTYPE html> <html lang="en-GB"> <head> <meta charset="utf-8"> <title>Massive drop in results from works endpoint? - Metadata Retrieval - Crossref community forum</title> <meta name="description" content="Hello CrossRef community, In August I set up a weekly job to search for new research on poverty in developing countries. I used quite a long search string and some basic filters, as follows: query: &quot;global poverty &hellip;"> <meta name="generator" content="Discourse 3.5.0.beta1-dev - https://github.com/discourse/discourse version 402ec6bf5c857ddc07be9cb9673734cc7152b7be"> <link rel="icon" type="image/png" href="https://us1.discourse-cdn.com/flex020/uploads/crossref/optimized/1X/e2c7a53e5928f61f78db67dea91eba635f04bffa_2_32x32.png"> <link rel="apple-touch-icon" type="image/png" href="https://us1.discourse-cdn.com/flex020/uploads/crossref/optimized/1X/6f5d00eeaf1982f0b932ea1d0e51946356bad5f1_2_180x180.png"> <meta name="theme-color" media="all" content="#ffffff"> <meta name="color-scheme" content="light"> <meta name="viewport" content="width=device-width, initial-scale=1.0, minimum-scale=1.0, viewport-fit=cover"> <link rel="canonical" href="https://community.crossref.org/t/massive-drop-in-results-from-works-endpoint/12599" /> <link rel="search" type="application/opensearchdescription+xml" href="https://community.crossref.org/opensearch.xml" title="Crossref community forum Search"> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/color_definitions_crossref_1_1_9744899a6c7b19d904fa8827761824d97b0814f5.css?__ws=community.crossref.org" media="all" rel="stylesheet" class="light-scheme"/> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/desktop_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="desktop" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/checklist_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" 
rel="stylesheet" data-target="checklist" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/discourse-adplugin_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="discourse-adplugin" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/discourse-ai_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="discourse-ai" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/discourse-akismet_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="discourse-akismet" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/discourse-assign_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="discourse-assign" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/discourse-chat-integration_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="discourse-chat-integration" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/discourse-data-explorer_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="discourse-data-explorer" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/discourse-details_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="discourse-details" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/discourse-lazy-videos_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="discourse-lazy-videos" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/discourse-local-dates_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" 
rel="stylesheet" data-target="discourse-local-dates" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/discourse-narrative-bot_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="discourse-narrative-bot" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/discourse-policy_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="discourse-policy" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/discourse-presence_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="discourse-presence" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/discourse-templates_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="discourse-templates" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/discourse-topic-voting_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="discourse-topic-voting" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/discourse-user-notes_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="discourse-user-notes" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/footnote_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="footnote" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/hosted-site_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="hosted-site" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/poll_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="poll" 
/> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/discourse-ai_desktop_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="discourse-ai_desktop" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/discourse-topic-voting_desktop_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="discourse-topic-voting_desktop" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/poll_desktop_2a83027730540c4820d6b2212c76b5bbab4a46c4.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="poll_desktop" /> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/desktop_theme_6_2162c2e3563d66bb118e15a274d13bb073919563.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="desktop_theme" data-theme-id="6" data-theme-name="easy responsive footer"/> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/desktop_theme_5_08bd0147fb80c44b011695a28f7b57abc8cbbac6.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="desktop_theme" data-theme-id="5" data-theme-name="make the like icon visible with the voting plugin"/> <link href="https://sea2.discourse-cdn.com/flex020/stylesheets/desktop_theme_1_758f57ce93bd6c49f3f6b12ced389f548b1bff25.css?__ws=community.crossref.org" media="all" rel="stylesheet" data-target="desktop_theme" data-theme-id="1" data-theme-name="crossref theme (header & footer tracking code)"/> <link rel="alternate nofollow" type="application/rss+xml" title="RSS feed of 'Massive drop in results from works endpoint?'" href="https://community.crossref.org/t/massive-drop-in-results-from-works-endpoint/12599.rss" /> <meta property="og:site_name" content="Crossref community forum" /> <meta property="og:type" content="website" /> <meta name="twitter:card" content="summary_large_image" /> <meta name="twitter:image" 
content="https://us1.discourse-cdn.com/flex020/uploads/crossref/original/1X/89725076d6cfb1b23a275c52a9930ab3526559d2.png" /> <meta property="og:image" content="https://us1.discourse-cdn.com/flex020/uploads/crossref/original/1X/b3840c8b20237bfc6f9b977ba808cfdd912e10f9.png" /> <meta property="og:url" content="https://community.crossref.org/t/massive-drop-in-results-from-works-endpoint/12599" /> <meta name="twitter:url" content="https://community.crossref.org/t/massive-drop-in-results-from-works-endpoint/12599" /> <meta property="og:title" content="Massive drop in results from works endpoint?" /> <meta name="twitter:title" content="Massive drop in results from works endpoint?" /> <meta property="og:description" content="Hello CrossRef community, In August I set up a weekly job to search for new research on poverty in developing countries. I used quite a long search string and some basic filters, as follows: query: "global poverty reduction evidence climate urban migration gender developing low middle income country LIC MIC", filter: "from-pub-date:[7 days ago],has-abstract:1,type:journal-article,type:report", rows: 1000 When I ran this search initially it returned around 800 results for a single ..." /> <meta name="twitter:description" content="Hello CrossRef community, In August I set up a weekly job to search for new research on poverty in developing countries. I used quite a long search string and some basic filters, as follows: query: "global poverty reduction evidence climate urban migration gender developing low middle income country LIC MIC", filter: "from-pub-date:[7 days ago],has-abstract:1,type:journal-article,type:report", rows: 1000 When I ran this search initially it returned around 800 results for a single ..." 
/> <meta property="og:article:section" content="Metadata Retrieval" /> <meta property="og:article:section:color" content="3EB1C8" /> <meta property="og:article:tag" content="openalex" /> <meta property="og:article:tag" content="rest-api" /> <meta property="og:article:tag" content="metadata-retrieval" /> <meta name="twitter:label1" value="Reading time" /> <meta name="twitter:data1" value="2 mins 🕑" /> <meta name="twitter:label2" value="Likes" /> <meta name="twitter:data2" value="6 ❤" /> <meta property="article:published_time" content="2024-11-15T09:56:20+00:00" /> <meta property="og:ignore_canonical" content="true" /> </head> <body class="crawler browser-update"> <script src="https://status.crossref.org/embed/script.js" nonce="GgBsJYyj70ulA32RbT6gqUgjK"></script> <div id="back-to-site"> ← <a href="https://www.crossref.org/">Visit the main Crossref website</a> </div> <header> <a href="/"> Crossref community forum </a> </header> <div id="main-outlet" class="wrap" role="main"> <div id="topic-title"> <h1> <a href="/t/massive-drop-in-results-from-works-endpoint/12599">Massive drop in results from works endpoint?</a> </h1> <div class="topic-category" itemscope itemtype="http://schema.org/BreadcrumbList"> <span itemprop="itemListElement" itemscope itemtype="http://schema.org/ListItem"> <a href="/c/metadata-retrieval/27" class="badge-wrapper bullet" itemprop="item"> <span class='badge-category-bg' style='background-color: #3EB1C8'></span> <span class='badge-category clear-badge'> <span class='category-name' itemprop='name'>Metadata Retrieval</span> </span> </a> <meta itemprop="position" content="1" /> </span> </div> <div class="topic-category"> <div class='discourse-tags list-tags'> <a href='https://community.crossref.org/tag/openalex' class='discourse-tag' rel="tag">openalex</a>, <a href='https://community.crossref.org/tag/rest-api' class='discourse-tag' rel="tag">rest-api</a>, <a href='https://community.crossref.org/tag/metadata-retrieval' class='discourse-tag' 
rel="tag">metadata-retrieval</a> </div> </div> </div> <div itemscope itemtype='http://schema.org/DiscussionForumPosting'> <meta itemprop='headline' content='Massive drop in results from works endpoint?'> <link itemprop='url' href='https://community.crossref.org/t/massive-drop-in-results-from-works-endpoint/12599'> <meta itemprop='datePublished' content='2024-11-15T09:56:19Z'> <meta itemprop='articleSection' content='Metadata Retrieval'> <meta itemprop='keywords' content='openalex, rest-api, metadata-retrieval'> <div itemprop='publisher' itemscope itemtype="http://schema.org/Organization"> <meta itemprop='name' content='Crossref'> <div itemprop='logo' itemscope itemtype="http://schema.org/ImageObject"> <meta itemprop='url' content='https://us1.discourse-cdn.com/flex020/uploads/crossref/original/1X/79c9275b59867363a78fe3e42460a6e21f851d66.svg'> </div> </div> <div id='post_1' class='topic-body crawler-post'> <div class='crawler-post-meta'> <span class="creator" itemprop="author" itemscope itemtype="http://schema.org/Person"> <a itemprop="url" rel='nofollow' href='https://community.crossref.org/u/tomwagstaff-opml'><span itemprop='name'>tomwagstaff-opml</span></a> </span> <link itemprop="mainEntityOfPage" href="https://community.crossref.org/t/massive-drop-in-results-from-works-endpoint/12599"> <span class="crawler-post-infos"> <time datetime='2024-11-15T09:56:20Z' class='post-time'> 15 November 2024 09:56 </time> <meta itemprop='dateModified' content='2024-11-18T19:57:25Z'> <span itemprop='position'>1</span> </span> </div> <div class='post' itemprop='text'> <p>Hello CrossRef community,</p> <p>In August I set up a weekly job to search for new research on poverty in developing countries. 
I used quite a long search string and some basic filters, as follows:</p> <pre><code> query: "global poverty reduction evidence climate urban migration gender developing low middle income country LIC MIC", filter: "from-pub-date:[7 days ago],has-abstract:1,type:journal-article,type:report", rows: 1000 </code></pre> <p>When I ran this search initially it returned around 800 results for a single week. During September, there seems to have been a crash in volumes, and for the last few weeks, the same query has been returning only a handful of results - about 20 per week.</p> <p>This week, I repeated the search for the initial time window in August, and also retrieved only around 20 records this time.</p> <p><strong>Has the behaviour of this endpoint changed recently? Is there any way I can get back to the previous volumes of results?</strong></p> <p>Apologies for the vague issue - any help would be greatly appreciated.</p> </div> <div itemprop="interactionStatistic" itemscope itemtype="http://schema.org/InteractionCounter"> <meta itemprop="interactionType" content="http://schema.org/LikeAction"/> <meta itemprop="userInteractionCount" content="0" /> <span class='post-likes'></span> </div> </div> <div id='post_2' itemprop='comment' itemscope itemtype='http://schema.org/Comment' class='topic-body crawler-post'> <div class='crawler-post-meta'> <span class="creator" itemprop="author" itemscope itemtype="http://schema.org/Person"> <a itemprop="url" rel='nofollow' href='https://community.crossref.org/u/ifarley'><span itemprop='name'>ifarley</span></a> </span> <span class="crawler-post-infos"> <time itemprop='datePublished' datetime='2024-11-15T20:01:11Z' class='post-time'> 15 November 2024 20:01 </time> <meta itemprop='dateModified' content='2024-11-15T20:01:11Z'> <span itemprop='position'>2</span> </span> </div> <div class='post' itemprop='text'> <aside class="quote no-group" data-username="tomwagstaff-opml" data-post="1" data-topic="12599"> <div class="title"> <div 
class="quote-controls"></div> <img loading="lazy" alt="" width="24" height="24" src="https://avatars.discourse-cdn.com/v4/letter/t/35a633/48.png" class="avatar"> tomwagstaff-opml:</div> <blockquote> <p><code>from-pub-date:[7 days ago],has-abstract:1,type:journal-article,type:report</code></p> </blockquote> </aside> <p>Hi <a class="mention" href="/u/tomwagstaff-opml">@tomwagstaff-opml</a> ,</p> <p>I assume your API query looks like this, right?</p> <p><a href="https://api.crossref.org/works?query.bibliographic=global+poverty+reduction+evidence+climate+urban+migration+gender+developing+low+middle+income+country+LIC+MIC&filter=from-pub-date:2024-11-08,has-abstract:1,type:journal-article,type:report">https://api.crossref.org/works?query.bibliographic=global+poverty+reduction+evidence+climate+urban+migration+gender+developing+low+middle+income+country+LIC+MIC&filter=from-pub-date:2024-11-08,has-abstract:1,type:journal-article,type:report</a> - 29 results</p> <p>Compared to a query like this: <a href="https://api.crossref.org/works?query.bibliographic=global+poverty+reduction+evidence+climate+urban+migration+gender+developing+low+middle+income+country+LIC+MIC&filter=from-pub-date:2024-01-01,has-abstract:1,type:journal-article,type:report">https://api.crossref.org/works?query.bibliographic=global+poverty+reduction+evidence+climate+urban+migration+gender+developing+low+middle+income+country+LIC+MIC&filter=from-pub-date:2024-01-01,has-abstract:1,type:journal-article,type:report</a> - 1343 results</p> <p>I would expect to see more results for both of these queries, so I am checking with a couple of my colleagues about what I am seeing here.</p> <p>More as I have it,<br> Isaac</p> </div> <div itemprop="interactionStatistic" itemscope itemtype="http://schema.org/InteractionCounter"> <meta itemprop="interactionType" content="http://schema.org/LikeAction"/> <meta itemprop="userInteractionCount" content="0" /> <span class='post-likes'></span> </div> </div> <div id='post_3' 
itemprop='comment' itemscope itemtype='http://schema.org/Comment' class='topic-body crawler-post'> <div class='crawler-post-meta'> <span class="creator" itemprop="author" itemscope itemtype="http://schema.org/Person"> <a itemprop="url" rel='nofollow' href='https://community.crossref.org/u/tomwagstaff-opml'><span itemprop='name'>tomwagstaff-opml</span></a> </span> <span class="crawler-post-infos"> <time itemprop='datePublished' datetime='2024-11-18T09:39:17Z' class='post-time'> 18 November 2024 09:39 </time> <meta itemprop='dateModified' content='2024-11-18T09:39:17Z'> <span itemprop='position'>3</span> </span> </div> <div class='post' itemprop='text'> <p>Hi <a class="mention" href="/u/ifarley">@ifarley</a>,</p> <p>Thanks very much for picking this up. Glad it’s not just me who thought this was peculiar behaviour <img src="https://emoji.discourse-cdn.com/twitter/sweat_smile.png?v=12" title=":sweat_smile:" class="emoji" alt=":sweat_smile:" loading="lazy" width="20" height="20"></p> <p>You’re correct on my query URL, except I’m using the general <strong>query</strong> rather than <strong>query.bibliographic</strong>. As a sidenote - do you know what the difference is? 
I know bibliographic searches title, author and some publication details - is an unqualified query looking at <em>all</em> possible fields?</p> <p>Here is my query URL (with apologies for the percent encoding):<br> <a href="https://api.crossref.org/works?query=global+poverty+reduction+evidence+climate+urban+migration+gender+developing+low+middle+income+country+LIC+MIC&filter=from-pub-date%3A2024-11-08%2Chas-abstract%3A1%2Ctype%3Ajournal-article%2Ctype%3Abook-chapter%2Ctype%3Areport" class="onebox" target="_blank" rel="noopener nofollow ugc">https://api.crossref.org/works?query=global+poverty+reduction+evidence+climate+urban+migration+gender+developing+low+middle+income+country+LIC+MIC&filter=from-pub-date%3A2024-11-08%2Chas-abstract%3A1%2Ctype%3Ajournal-article%2Ctype%3Abook-chapter%2Ctype%3Areport</a></p> </div> <div itemprop="interactionStatistic" itemscope itemtype="http://schema.org/InteractionCounter"> <meta itemprop="interactionType" content="http://schema.org/LikeAction"/> <meta itemprop="userInteractionCount" content="1" /> <span class='post-likes'>1 Like</span> </div> </div> <div id='post_4' itemprop='comment' itemscope itemtype='http://schema.org/Comment' class='topic-body crawler-post'> <div class='crawler-post-meta'> <span class="creator" itemprop="author" itemscope itemtype="http://schema.org/Person"> <a itemprop="url" rel='nofollow' href='https://community.crossref.org/u/tomwagstaff-opml'><span itemprop='name'>tomwagstaff-opml</span></a> </span> <span class="crawler-post-infos"> <time itemprop='datePublished' datetime='2024-11-18T09:41:19Z' class='post-time'> 18 November 2024 09:41 </time> <meta itemprop='dateModified' content='2024-11-18T09:41:19Z'> <span itemprop='position'>4</span> </span> </div> <div class='post' itemprop='text'> <p>And in case it helps - I’ve been doing a bit more investigating. 
These are the numbers of results I got on Friday, using another 7-day query:</p> <div class="md-table"> <table> <thead> <tr> <th><strong>Query</strong></th> <th><strong>Results (for 3/11 - 10/11, searched now)</strong></th> </tr> </thead> <tbody> <tr> <td>global poverty reduction evidence climate urban migration gender developing low middle income country LIC MIC</td> <td>25</td> </tr> <tr> <td>poverty map</td> <td>56</td> </tr> <tr> <td>poverty</td> <td>26</td> </tr> <tr> <td>poverty low middle income country LIC MIC</td> <td>629</td> </tr> <tr> <td>poverty low middle income country LIC MIC climate urban migration gender</td> <td>71</td> </tr> </tbody> </table> </div><p>I’m surprised at how these results swing around. I thought the query terms were basically ORed together, so more terms should always bring back more results, but it doesn’t seem to be the case here…</p> </div> <div itemprop="interactionStatistic" itemscope itemtype="http://schema.org/InteractionCounter"> <meta itemprop="interactionType" content="http://schema.org/LikeAction"/> <meta itemprop="userInteractionCount" content="1" /> <span class='post-likes'>1 Like</span> </div> <div class='crawler-linkback-list' itemscope itemtype='http://schema.org/ItemList'> <div itemprop='itemListElement' itemscope itemtype='http://schema.org/ListItem'> <a itemprop='url' href="https://community.crossref.org/t/advanced-set-boolean-operators/4596/3">Advanced set/boolean operators</a> <meta itemprop='position' content='1'> </div> <div itemprop='itemListElement' itemscope itemtype='http://schema.org/ListItem'> <a itemprop='url' href="https://community.crossref.org/t/how-to-query-works-that-match-a-given-and-family-name/12355/2">How to query works that match a given AND family name?</a> <meta itemprop='position' content='2'> </div> <div itemprop='itemListElement' itemscope itemtype='http://schema.org/ListItem'> <a itemprop='url' 
href="https://community.crossref.org/t/how-to-query-works-that-match-a-given-and-family-name/12355/5">How to query works that match a given AND family name?</a> <meta itemprop='position' content='3'> </div> </div> </div> <div id='post_5' itemprop='comment' itemscope itemtype='http://schema.org/Comment' class='topic-body crawler-post'> <div class='crawler-post-meta'> <span class="creator" itemprop="author" itemscope itemtype="http://schema.org/Person"> <a itemprop="url" rel='nofollow' href='https://community.crossref.org/u/ifarley'><span itemprop='name'>ifarley</span></a> </span> <span class="crawler-post-infos"> <time itemprop='datePublished' datetime='2024-11-18T19:55:43Z' class='post-time'> 18 November 2024 19:55 </time> <meta itemprop='dateModified' content='2024-11-18T19:55:43Z'> <span itemprop='position'>5</span> </span> </div> <div class='post' itemprop='text'> <p>Hello <a class="mention" href="/u/tomwagstaff-opml">@tomwagstaff-opml</a> ,</p> <p>Thanks for posting your query and some example results.</p> <p><code>query.bibliographic</code> is specifically for querying bibliographic information, which is useful for citation look up. <code>query.bibliographic</code> includes matches from titles, authors, ISSNs, and publication years.</p> <p>I think I was also under the impression that we performed something very similar to an OR query when a query included a list of multiple terms. I likely have referenced it that way in other posts within this community forum (I’m going to search and see if I can clarify in other similar community forum posts). And, I believe I came to that assumption because I was reviewing queries with fewer search terms. After discussing your example with Martyn on our program team and Dominika on our technical team, I learned that my understanding was incorrect and not quite nuanced enough.</p> <p>So, we don’t quite have an OR query at play here. 
Instead, <code>query</code> and <code>query.bibliographic</code> require that at least 20% of query words match. We added this requirement a few years ago for performance reasons. So, with 10 input query words, 20% becomes 2 words, and the query results start to drop works that match only one word at that threshold. This is why a query containing <code>poverty+low+middle+income+country+LIC+MIC</code> has more results than a query containing <code>poverty+low+middle+income+country+LIC+MIC+climate+urban+migration+gender</code>. We’ve simply dropped results from the latter of the two queries because of that 20% matching rule that we implemented.</p> <p>Based on your use case, you might be better served by something like <strong><a href="https://openalex.org/">OpenAlex</a></strong>, which is designed more with search in mind.</p> <p>I hope this is helpful!</p> <p>-Isaac</p> </div> <div itemprop="interactionStatistic" itemscope itemtype="http://schema.org/InteractionCounter"> <meta itemprop="interactionType" content="http://schema.org/LikeAction"/> <meta itemprop="userInteractionCount" content="1" /> <span class='post-likes'>1 Like</span> </div> </div> <div id='post_6' itemprop='comment' itemscope itemtype='http://schema.org/Comment' class='topic-body crawler-post'> <div class='crawler-post-meta'> <span class="creator" itemprop="author" itemscope itemtype="http://schema.org/Person"> <a itemprop="url" rel='nofollow' href='https://community.crossref.org/u/ifarley'><span itemprop='name'>ifarley</span></a> </span> <span class="crawler-post-infos"> <time itemprop='datePublished' datetime='2024-11-19T14:52:00Z' class='post-time'> 19 November 2024 14:52 </time> <meta itemprop='dateModified' content='2024-11-19T14:52:00Z'> <span itemprop='position'>6</span> </span> </div> <div class='post' itemprop='text'> <p>A couple of additional things I failed to mention in yesterday’s post:</p> <ol> <li>We’ll update our Swagger documentation to capture this behavior</li> <li>As you reported <a 
class="mention" href="/u/tomwagstaff-opml">@tomwagstaff-opml</a> , there do appear to be some discrepancies with the recent totals you’re reporting; my colleague <a class="mention" href="/u/mrittman">@mrittman</a> is investigating that</li> </ol> <p>Thanks again for raising this!</p> <p>More as we have it,<br> Isaac</p> </div> <div itemprop="interactionStatistic" itemscope itemtype="http://schema.org/InteractionCounter"> <meta itemprop="interactionType" content="http://schema.org/LikeAction"/> <meta itemprop="userInteractionCount" content="0" /> <span class='post-likes'></span> </div> </div> <div id='post_7' itemprop='comment' itemscope itemtype='http://schema.org/Comment' class='topic-body crawler-post'> <div class='crawler-post-meta'> <span class="creator" itemprop="author" itemscope itemtype="http://schema.org/Person"> <a itemprop="url" rel='nofollow' href='https://community.crossref.org/u/tomwagstaff-opml'><span itemprop='name'>tomwagstaff-opml</span></a> </span> <span class="crawler-post-infos"> <time itemprop='datePublished' datetime='2024-11-22T16:52:45Z' class='post-time'> 22 November 2024 16:52 </time> <meta itemprop='dateModified' content='2024-11-22T16:52:45Z'> <span itemprop='position'>7</span> </span> </div> <div class='post' itemprop='text'> <p>Hi <a class="mention" href="/u/ifarley">@ifarley</a>,</p> <p>Thanks very much for all this feedback. It’s good to understand there is this extra constraint at play of at least 20% matching - this explains the different performance of the various queries.</p> <p>Thanks also for the suggestion of OpenAlex - it might as you say be better suited to our use case, although I would regret moving over from CrossRef, which has been such a good source overall.</p> <p>Final question - and I know it’s vague and fuzzy and probably impossible to answer - but do you have any idea why we saw these major drops in volumes since early September, even with the same query? 
(Apologies if that’s what <a class="mention" href="/u/mrittman">@mrittman</a> is already looking into!)</p> </div> <div itemprop="interactionStatistic" itemscope itemtype="http://schema.org/InteractionCounter"> <meta itemprop="interactionType" content="http://schema.org/LikeAction"/> <meta itemprop="userInteractionCount" content="0" /> <span class='post-likes'></span> </div> </div> <div id='post_8' itemprop='comment' itemscope itemtype='http://schema.org/Comment' class='topic-body crawler-post'> <div class='crawler-post-meta'> <span class="creator" itemprop="author" itemscope itemtype="http://schema.org/Person"> <a itemprop="url" rel='nofollow' href='https://community.crossref.org/u/ifarley'><span itemprop='name'>ifarley</span></a> </span> <span class="crawler-post-infos"> <time itemprop='datePublished' datetime='2024-11-22T18:22:11Z' class='post-time'> 22 November 2024 18:22 </time> <meta itemprop='dateModified' content='2024-11-22T18:22:11Z'> <span itemprop='position'>8</span> </span> </div> <div class='post' itemprop='text'> <p>Thanks for following up <a class="mention" href="/u/tomwagstaff-opml">@tomwagstaff-opml</a> . Yes, that’s what <a class="mention" href="/u/mrittman">@mrittman</a> was going to investigate. 
I’ll follow up with him next week to check on his progress.</p> <p>Have a lovely weekend,<br> Isaac</p> </div> <div itemprop="interactionStatistic" itemscope itemtype="http://schema.org/InteractionCounter"> <meta itemprop="interactionType" content="http://schema.org/LikeAction"/> <meta itemprop="userInteractionCount" content="0" /> <span class='post-likes'></span> </div> </div> <div id='post_10' itemprop='comment' itemscope itemtype='http://schema.org/Comment' class='topic-body crawler-post'> <div class='crawler-post-meta'> <span class="creator" itemprop="author" itemscope itemtype="http://schema.org/Person"> <a itemprop="url" rel='nofollow' href='https://community.crossref.org/u/tomwagstaff-opml'><span itemprop='name'>tomwagstaff-opml</span></a> </span> <span class="crawler-post-infos"> <time itemprop='datePublished' datetime='2024-11-29T13:41:47Z' class='post-time'> 29 November 2024 13:41 </time> <meta itemprop='dateModified' content='2024-11-29T13:41:47Z'> <span itemprop='position'>10</span> </span> </div> <div class='post' itemprop='text'> <p>Thanks <a class="mention" href="/u/ifarley">@ifarley</a> and <a class="mention" href="/u/mrittman">@mrittman</a>,</p> <p>Did you uncover anything? I’m guessing this will probably remain a mystery…</p> <p>More broadly, I wonder if you have any tips for getting reproducible results from CrossRef? 
The issue for us really is that we scoped what we were likely to get, planned on that basis, and are now getting back very different results (in volume, but the contents also seem to be qualitatively different).</p> <p>And I can’t now reproduce my searches from August, so I’m guessing something has changed either with the content that sits behind the API or the way the API itself operates.</p> </div> <div itemprop="interactionStatistic" itemscope itemtype="http://schema.org/InteractionCounter"> <meta itemprop="interactionType" content="http://schema.org/LikeAction"/> <meta itemprop="userInteractionCount" content="0" /> <span class='post-likes'></span> </div> </div> <div id='post_11' itemprop='comment' itemscope itemtype='http://schema.org/Comment' class='topic-body crawler-post'> <div class='crawler-post-meta'> <span class="creator" itemprop="author" itemscope itemtype="http://schema.org/Person"> <a itemprop="url" rel='nofollow' href='https://community.crossref.org/u/mrittman'><span itemprop='name'>mrittman</span></a> </span> <span class="crawler-post-infos"> <time itemprop='datePublished' datetime='2024-12-02T10:37:58Z' class='post-time'> 2 December 2024 10:37 </time> <meta itemprop='dateModified' content='2024-12-02T10:37:58Z'> <span itemprop='position'>11</span> </span> </div> <div class='post' itemprop='text'> <p>We’ve had an initial look and haven’t found anything yet - nothing obvious changed on our side that would have caused a change in the number of results. It’s possible that some of the metadata was modified; that’s the most likely cause for the changes you’re seeing in August. It’s difficult to track back to earlier in the year to see what might have changed.</p> <p>To get more consistent data you might want to use a different date: publication date can be quite different from the date that a record is registered with us. You should get consistent results using the created date. You could also try to split your search into separate or smaller queries. 
Querying with many terms is resource-intensive and more error-prone, and not the kind of query our APIs are optimised for.</p> <p>Another option might be to use a data dump: we have a free public data file that we release once per year: <a href="https://www.crossref.org/blog/2024-public-data-file-now-available-featuring-new-experimental-formats/" class="inline-onebox">2024 public data file now available, featuring new experimental formats - Crossref</a>. Alternatively, we have a monthly snapshot available to <a href="https://www.crossref.org/documentation/metadata-plus/">subscribers of our Plus service</a> - that might not be a feasible option for you, though, unless your institution has access.</p> </div> <div itemprop="interactionStatistic" itemscope itemtype="http://schema.org/InteractionCounter"> <meta itemprop="interactionType" content="http://schema.org/LikeAction"/> <meta itemprop="userInteractionCount" content="2" /> <span class='post-likes'>2 Likes</span> </div> </div> <div id='post_12' itemprop='comment' itemscope itemtype='http://schema.org/Comment' class='topic-body crawler-post'> <div class='crawler-post-meta'> <span class="creator" itemprop="author" itemscope itemtype="http://schema.org/Person"> <a itemprop="url" rel='nofollow' href='https://community.crossref.org/u/tomwagstaff-opml'><span itemprop='name'>tomwagstaff-opml</span></a> </span> <span class="crawler-post-infos"> <time itemprop='datePublished' datetime='2024-12-02T17:04:55Z' class='post-time'> 2 December 2024 17:04 </time> <meta itemprop='dateModified' content='2024-12-02T17:04:55Z'> <span itemprop='position'>12</span> </span> </div> <div class='post' itemprop='text'> <p>Thanks <a class="mention" href="/u/mrittman">@mrittman</a>,</p> <p>Not surprised you haven’t found a clear cause - we probably just got lucky during our test run and first few weeks of activity…</p> <p>Thanks for the tip - yes I’ll switch to created date and trim the query.</p> <p>We’re really looking for latest releases 
rather than through historical data dumps. We are not a subscribing institution (yet!) but I’ll note the option of getting monthly snapshots - this might be worth it if we have enough different applications.</p> <p>Thank you all for your help - I think this is as far as we can get with the investigation. Thanks for all the ideas for next steps and alternatives - I’m going to try and stick with the CrossRef API in the first instance <img src="https://emoji.discourse-cdn.com/twitter/heart.png?v=12" title=":heart:" class="emoji" alt=":heart:" loading="lazy" width="20" height="20"></p> </div> <div itemprop="interactionStatistic" itemscope itemtype="http://schema.org/InteractionCounter"> <meta itemprop="interactionType" content="http://schema.org/LikeAction"/> <meta itemprop="userInteractionCount" content="1" /> <span class='post-likes'>1 Like</span> </div> </div> </div> </div> </body> </html>