<!DOCTYPE html> <html class='v2' dir='ltr' lang='en'> <head> <link href='https://www.blogger.com/static/v1/widgets/3566091532-css_bundle_v2.css' rel='stylesheet' type='text/css'/> <meta content='width=1100' name='viewport'/> <meta content='text/html; charset=UTF-8' http-equiv='Content-Type'/> <meta content='blogger' name='generator'/> <link href='http://baskauf.blogspot.com/favicon.ico' rel='icon' type='image/x-icon'/> <link href='http://baskauf.blogspot.com/2019/04/understanding-tdwg-standards_24.html' rel='canonical'/> <link rel="alternate" type="application/atom+xml" title="Steve Baskauf's blog - Atom" href="http://baskauf.blogspot.com/feeds/posts/default" /> <link rel="alternate" type="application/rss+xml" title="Steve Baskauf's blog - RSS" href="http://baskauf.blogspot.com/feeds/posts/default?alt=rss" /> <link rel="service.post" type="application/atom+xml" title="Steve Baskauf's blog - Atom" href="https://www.blogger.com/feeds/5299754536670281996/posts/default" /> <link rel="alternate" type="application/atom+xml" title="Steve Baskauf's blog - Atom" href="http://baskauf.blogspot.com/feeds/9189703936140939208/comments/default" /> <!--Can't find substitution for tag [blog.ieCssRetrofitLinks]--> <link href='https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiz3b18P5IKqa1T8c8ey2n9E1gDlKFZrSPqWzhx8tGQs-WMtuGAyh8WGrYcLJyoeBx9qXOcod8GwgQ0HYCPQU_-5Oc-qGijdRtRnCkIf8ahh9o2h9AU_TA96F_NzFQsalzl3VU4gWlEPEY/s640/dcat.png' rel='image_src'/> <meta content='http://baskauf.blogspot.com/2019/04/understanding-tdwg-standards_24.html' property='og:url'/> <meta content='Understanding the TDWG Standards Documentation Specification, Part 5: Acquiring Machine-readable using DCAT' property='og:title'/> <meta content='This is the fifth in a series of posts about the TDWG Standards Documentation Specification (SDS). For background on the SDS, see the first...' 
property='og:description'/> <meta content='https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiz3b18P5IKqa1T8c8ey2n9E1gDlKFZrSPqWzhx8tGQs-WMtuGAyh8WGrYcLJyoeBx9qXOcod8GwgQ0HYCPQU_-5Oc-qGijdRtRnCkIf8ahh9o2h9AU_TA96F_NzFQsalzl3VU4gWlEPEY/w1200-h630-p-k-no-nu/dcat.png' property='og:image'/> <title>Steve Baskauf's blog: Understanding the TDWG Standards Documentation Specification, Part 5: Acquiring Machine-readable using DCAT</title> <style id='page-skin-1' type='text/css'><!-- /* ----------------------------------------------- Blogger Template Style Name: Simple Designer: Blogger URL: www.blogger.com ----------------------------------------------- */ /* Content ----------------------------------------------- */ body { font: normal normal 12px 'Trebuchet MS', Trebuchet, Verdana, sans-serif; color: #666666; background: #ffffff none repeat scroll top left; padding: 0 0 0 0; } html body .region-inner { min-width: 0; max-width: 100%; width: auto; } h2 { font-size: 22px; } a:link { text-decoration:none; color: #8832ff; } a:visited { text-decoration:none; color: #bb2188; } a:hover { text-decoration:underline; color: #33aaff; } .body-fauxcolumn-outer .fauxcolumn-inner { background: transparent none repeat scroll top left; _background-image: none; } .body-fauxcolumn-outer .cap-top { position: absolute; z-index: 1; height: 400px; width: 100%; } .body-fauxcolumn-outer .cap-top .cap-left { width: 100%; background: transparent none repeat-x scroll top left; _background-image: none; } .content-outer { -moz-box-shadow: 0 0 0 rgba(0, 0, 0, .15); -webkit-box-shadow: 0 0 0 rgba(0, 0, 0, .15); -goog-ms-box-shadow: 0 0 0 #333333; box-shadow: 0 0 0 rgba(0, 0, 0, .15); margin-bottom: 1px; } .content-inner { padding: 10px 40px; } .content-inner { background-color: #ffffff; } /* Header ----------------------------------------------- */ .header-outer { background: transparent none repeat-x scroll 0 -400px; _background-image: none; } .Header h1 { font: normal normal 40px 'Trebuchet 
MS',Trebuchet,Verdana,sans-serif; color: #000000; text-shadow: 0 0 0 rgba(0, 0, 0, .2); } .Header h1 a { color: #000000; } .Header .description { font-size: 18px; color: #000000; } .header-inner .Header .titlewrapper { padding: 22px 0; } .header-inner .Header .descriptionwrapper { padding: 0 0; } /* Tabs ----------------------------------------------- */ .tabs-inner .section:first-child { border-top: 0 solid #dddddd; } .tabs-inner .section:first-child ul { margin-top: -1px; border-top: 1px solid #dddddd; border-left: 1px solid #dddddd; border-right: 1px solid #dddddd; } .tabs-inner .widget ul { background: transparent none repeat-x scroll 0 -800px; _background-image: none; border-bottom: 1px solid #dddddd; margin-top: 0; margin-left: -30px; margin-right: -30px; } .tabs-inner .widget li a { display: inline-block; padding: .6em 1em; font: normal normal 12px 'Trebuchet MS', Trebuchet, Verdana, sans-serif; color: #000000; border-left: 1px solid #ffffff; border-right: 1px solid #dddddd; } .tabs-inner .widget li:first-child a { border-left: none; } .tabs-inner .widget li.selected a, .tabs-inner .widget li a:hover { color: #000000; background-color: #eeeeee; text-decoration: none; } /* Columns ----------------------------------------------- */ .main-outer { border-top: 0 solid transparent; } .fauxcolumn-left-outer .fauxcolumn-inner { border-right: 1px solid transparent; } .fauxcolumn-right-outer .fauxcolumn-inner { border-left: 1px solid transparent; } /* Headings ----------------------------------------------- */ div.widget > h2, div.widget h2.title { margin: 0 0 1em 0; font: normal bold 11px 'Trebuchet MS',Trebuchet,Verdana,sans-serif; color: #000000; } /* Widgets ----------------------------------------------- */ .widget .zippy { color: #999999; text-shadow: 2px 2px 1px rgba(0, 0, 0, .1); } .widget .popular-posts ul { list-style: none; } /* Posts ----------------------------------------------- */ h2.date-header { font: normal bold 11px Arial, Tahoma, Helvetica, 
FreeSans, sans-serif; } .date-header span { background-color: #bbbbbb; color: #ffffff; padding: 0.4em; letter-spacing: 3px; margin: inherit; } .main-inner { padding-top: 35px; padding-bottom: 65px; } .main-inner .column-center-inner { padding: 0 0; } .main-inner .column-center-inner .section { margin: 0 1em; } .post { margin: 0 0 45px 0; } h3.post-title, .comments h4 { font: normal normal 22px 'Trebuchet MS',Trebuchet,Verdana,sans-serif; margin: .75em 0 0; } .post-body { font-size: 110%; line-height: 1.4; position: relative; } .post-body img, .post-body .tr-caption-container, .Profile img, .Image img, .BlogList .item-thumbnail img { padding: 2px; background: #ffffff; border: 1px solid #eeeeee; -moz-box-shadow: 1px 1px 5px rgba(0, 0, 0, .1); -webkit-box-shadow: 1px 1px 5px rgba(0, 0, 0, .1); box-shadow: 1px 1px 5px rgba(0, 0, 0, .1); } .post-body img, .post-body .tr-caption-container { padding: 5px; } .post-body .tr-caption-container { color: #666666; } .post-body .tr-caption-container img { padding: 0; background: transparent; border: none; -moz-box-shadow: 0 0 0 rgba(0, 0, 0, .1); -webkit-box-shadow: 0 0 0 rgba(0, 0, 0, .1); box-shadow: 0 0 0 rgba(0, 0, 0, .1); } .post-header { margin: 0 0 1.5em; line-height: 1.6; font-size: 90%; } .post-footer { margin: 20px -2px 0; padding: 5px 10px; color: #666666; background-color: #eeeeee; border-bottom: 1px solid #eeeeee; line-height: 1.6; font-size: 90%; } #comments .comment-author { padding-top: 1.5em; border-top: 1px solid transparent; background-position: 0 1.5em; } #comments .comment-author:first-child { padding-top: 0; border-top: none; } .avatar-image-container { margin: .2em 0 0; } #comments .avatar-image-container img { border: 1px solid #eeeeee; } /* Comments ----------------------------------------------- */ .comments .comments-content .icon.blog-author { background-repeat: no-repeat; background-image: url(); } .comments .comments-content .loadmore a { border-top: 1px solid #999999; border-bottom: 1px solid 
#999999; } .comments .comment-thread.inline-thread { background-color: #eeeeee; } .comments .continue { border-top: 2px solid #999999; } /* Accents ---------------------------------------------- */ .section-columns td.columns-cell { border-left: 1px solid transparent; } .blog-pager { background: transparent url(//www.blogblog.com/1kt/simple/paging_dot.png) repeat-x scroll top center; } .blog-pager-older-link, .home-link, .blog-pager-newer-link { background-color: #ffffff; padding: 5px; } .footer-outer { border-top: 1px dashed #bbbbbb; } /* Mobile ----------------------------------------------- */ body.mobile { background-size: auto; } .mobile .body-fauxcolumn-outer { background: transparent none repeat scroll top left; } .mobile .body-fauxcolumn-outer .cap-top { background-size: 100% auto; } .mobile .content-outer { -webkit-box-shadow: 0 0 3px rgba(0, 0, 0, .15); box-shadow: 0 0 3px rgba(0, 0, 0, .15); } .mobile .tabs-inner .widget ul { margin-left: 0; margin-right: 0; } .mobile .post { margin: 0; } .mobile .main-inner .column-center-inner .section { margin: 0; } .mobile .date-header span { padding: 0.1em 10px; margin: 0 -10px; } .mobile h3.post-title { margin: 0; } .mobile .blog-pager { background: transparent none no-repeat scroll top center; } .mobile .footer-outer { border-top: none; } .mobile .main-inner, .mobile .footer-inner { background-color: #ffffff; } .mobile-index-contents { color: #666666; } .mobile-link-button { background-color: #8832ff; } .mobile-link-button a:link, .mobile-link-button a:visited { color: #ffffff; } .mobile .tabs-inner .section:first-child { border-top: none; } .mobile .tabs-inner .PageList .widget-content { background-color: #eeeeee; color: #000000; border-top: 1px solid #dddddd; border-bottom: 1px solid #dddddd; } .mobile .tabs-inner .PageList .widget-content .pagelist-arrow { border-left: 1px solid #dddddd; } --></style> <style id='template-skin-1' type='text/css'><!-- body { min-width: 960px; } .content-outer, 
.content-fauxcolumn-outer, .region-inner { min-width: 960px; max-width: 960px; _width: 960px; } .main-inner .columns { padding-left: 0px; padding-right: 190px; } .main-inner .fauxcolumn-center-outer { left: 0px; right: 190px; /* IE6 does not respect left and right together */ _width: expression(this.parentNode.offsetWidth - parseInt("0px") - parseInt("190px") + 'px'); } .main-inner .fauxcolumn-left-outer { width: 0px; } .main-inner .fauxcolumn-right-outer { width: 190px; } .main-inner .column-left-outer { width: 0px; right: 100%; margin-left: -0px; } .main-inner .column-right-outer { width: 190px; margin-right: -190px; } #layout { min-width: 0; } #layout .content-outer { min-width: 0; width: 800px; } #layout .region-inner { min-width: 0; width: auto; } body#layout div.add_widget { padding: 8px; } body#layout div.add_widget a { margin-left: 32px; } --></style> <link href='https://www.blogger.com/dyn-css/authorization.css?targetBlogID=5299754536670281996&zx=d15c2b04-e5e6-4a3c-87b6-c11d9c7cdc1f' media='none' onload='if(media!='all')media='all'' rel='stylesheet'/><noscript><link href='https://www.blogger.com/dyn-css/authorization.css?targetBlogID=5299754536670281996&zx=d15c2b04-e5e6-4a3c-87b6-c11d9c7cdc1f' rel='stylesheet'/></noscript> <meta name='google-adsense-platform-account' content='ca-host-pub-1556223355139109'/> <meta name='google-adsense-platform-domain' content='blogspot.com'/> </head> <body class='loading variant-simplysimple'> <div class='navbar section' id='navbar' name='Navbar'><div class='widget Navbar' data-version='1' id='Navbar1'><script type="text/javascript"> function setAttributeOnload(object, attribute, val) { if(window.addEventListener) { window.addEventListener('load', function(){ object[attribute] = val; }, false); } else { window.attachEvent('onload', function(){ object[attribute] = val; }); } } </script> <div id="navbar-iframe-container"></div> <script type="text/javascript" src="https://apis.google.com/js/platform.js"></script> <script 
type="text/javascript"> gapi.load("gapi.iframes:gapi.iframes.style.bubble", function() { if (gapi.iframes && gapi.iframes.getContext) { gapi.iframes.getContext().openChild({ url: 'https://www.blogger.com/navbar.g?targetBlogID\x3d5299754536670281996\x26blogName\x3dSteve+Baskauf\x27s+blog\x26publishMode\x3dPUBLISH_MODE_BLOGSPOT\x26navbarType\x3dLIGHT\x26layoutType\x3dLAYOUTS\x26searchRoot\x3dhttps://baskauf.blogspot.com/search\x26blogLocale\x3den\x26v\x3d2\x26homepageUrl\x3dhttp://baskauf.blogspot.com/\x26targetPostID\x3d9189703936140939208\x26blogPostOrPageUrl\x3dhttp://baskauf.blogspot.com/2019/04/understanding-tdwg-standards_24.html\x26vt\x3d6214689078914686339', where: document.getElementById("navbar-iframe-container"), id: "navbar-iframe", messageHandlersFilter: gapi.iframes.CROSS_ORIGIN_IFRAMES_FILTER, messageHandlers: { 'blogger-ping': function() {} } }); } }); </script><script type="text/javascript"> (function() { var script = document.createElement('script'); script.type = 'text/javascript'; script.src = '//pagead2.googlesyndication.com/pagead/js/google_top_exp.js'; var head = document.getElementsByTagName('head')[0]; if (head) { head.appendChild(script); }})(); </script> </div></div> <div class='body-fauxcolumns'> <div class='fauxcolumn-outer body-fauxcolumn-outer'> <div class='cap-top'> <div class='cap-left'></div> <div class='cap-right'></div> </div> <div class='fauxborder-left'> <div class='fauxborder-right'></div> <div class='fauxcolumn-inner'> </div> </div> <div class='cap-bottom'> <div class='cap-left'></div> <div class='cap-right'></div> </div> </div> </div> <div class='content'> <div class='content-fauxcolumns'> <div class='fauxcolumn-outer content-fauxcolumn-outer'> <div class='cap-top'> <div class='cap-left'></div> <div class='cap-right'></div> </div> <div class='fauxborder-left'> <div class='fauxborder-right'></div> <div class='fauxcolumn-inner'> </div> </div> <div class='cap-bottom'> <div class='cap-left'></div> <div class='cap-right'></div> 
</div> </div> </div> <div class='content-outer'> <div class='content-cap-top cap-top'> <div class='cap-left'></div> <div class='cap-right'></div> </div> <div class='fauxborder-left content-fauxborder-left'> <div class='fauxborder-right content-fauxborder-right'></div> <div class='content-inner'> <header> <div class='header-outer'> <div class='header-cap-top cap-top'> <div class='cap-left'></div> <div class='cap-right'></div> </div> <div class='fauxborder-left header-fauxborder-left'> <div class='fauxborder-right header-fauxborder-right'></div> <div class='region-inner header-inner'> <div class='header section' id='header' name='Header'><div class='widget Header' data-version='1' id='Header1'> <div id='header-inner'> <div class='titlewrapper'> <h1 class='title'> <a href='http://baskauf.blogspot.com/'> Steve Baskauf's blog </a> </h1> </div> <div class='descriptionwrapper'> <p class='description'><span> </span></p> </div> </div> </div></div> </div> </div> <div class='header-cap-bottom cap-bottom'> <div class='cap-left'></div> <div class='cap-right'></div> </div> </div> </header> <div class='tabs-outer'> <div class='tabs-cap-top cap-top'> <div class='cap-left'></div> <div class='cap-right'></div> </div> <div class='fauxborder-left tabs-fauxborder-left'> <div class='fauxborder-right tabs-fauxborder-right'></div> <div class='region-inner tabs-inner'> <div class='tabs no-items section' id='crosscol' name='Cross-Column'></div> <div class='tabs no-items section' id='crosscol-overflow' name='Cross-Column 2'></div> </div> </div> <div class='tabs-cap-bottom cap-bottom'> <div class='cap-left'></div> <div class='cap-right'></div> </div> </div> <div class='main-outer'> <div class='main-cap-top cap-top'> <div class='cap-left'></div> <div class='cap-right'></div> </div> <div class='fauxborder-left main-fauxborder-left'> <div class='fauxborder-right main-fauxborder-right'></div> <div class='region-inner main-inner'> <div class='columns fauxcolumns'> <div class='fauxcolumn-outer 
fauxcolumn-center-outer'> <div class='cap-top'> <div class='cap-left'></div> <div class='cap-right'></div> </div> <div class='fauxborder-left'> <div class='fauxborder-right'></div> <div class='fauxcolumn-inner'> </div> </div> <div class='cap-bottom'> <div class='cap-left'></div> <div class='cap-right'></div> </div> </div> <div class='fauxcolumn-outer fauxcolumn-left-outer'> <div class='cap-top'> <div class='cap-left'></div> <div class='cap-right'></div> </div> <div class='fauxborder-left'> <div class='fauxborder-right'></div> <div class='fauxcolumn-inner'> </div> </div> <div class='cap-bottom'> <div class='cap-left'></div> <div class='cap-right'></div> </div> </div> <div class='fauxcolumn-outer fauxcolumn-right-outer'> <div class='cap-top'> <div class='cap-left'></div> <div class='cap-right'></div> </div> <div class='fauxborder-left'> <div class='fauxborder-right'></div> <div class='fauxcolumn-inner'> </div> </div> <div class='cap-bottom'> <div class='cap-left'></div> <div class='cap-right'></div> </div> </div> <!-- corrects IE6 width calculation --> <div class='columns-inner'> <div class='column-center-outer'> <div class='column-center-inner'> <div class='main section' id='main' name='Main'><div class='widget Blog' data-version='1' id='Blog1'> <div class='blog-posts hfeed'> <div class="date-outer"> <h2 class='date-header'><span>Wednesday, April 24, 2019</span></h2> <div class="date-posts"> <div class='post-outer'> <div class='post hentry uncustomized-post-template' itemprop='blogPost' itemscope='itemscope' itemtype='http://schema.org/BlogPosting'> <meta content='https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiz3b18P5IKqa1T8c8ey2n9E1gDlKFZrSPqWzhx8tGQs-WMtuGAyh8WGrYcLJyoeBx9qXOcod8GwgQ0HYCPQU_-5Oc-qGijdRtRnCkIf8ahh9o2h9AU_TA96F_NzFQsalzl3VU4gWlEPEY/s640/dcat.png' itemprop='image_url'/> <meta content='5299754536670281996' itemprop='blogId'/> <meta content='9189703936140939208' itemprop='postId'/> <a name='9189703936140939208'></a> <h3 class='post-title 
entry-title' itemprop='name'> Understanding the TDWG Standards Documentation Specification, Part 5: Acquiring Machine-readable using DCAT </h3> <div class='post-header'> <div class='post-header-line-1'></div> </div> <div class='post-body entry-content' id='post-body-9189703936140939208' itemprop='description articleBody'> This is the fifth in a series of posts about the TDWG Standards Documentation Specification (SDS). For background on the SDS, see the <a href="http://baskauf.blogspot.com/2019/03/understanding-tdwg-standards.html" target="_blank">first post</a>. For information on the SDS hierarchical model and how it relates to IRI design, see the <a href="http://baskauf.blogspot.com/2019/03/understanding-tdwg-standards_10.html" target="_blank">second post</a>. For information about how TDWG standards metadata can be retrieved via IRI dereferencing, see the <a href="http://baskauf.blogspot.com/2019/04/understanding-tdwg-standards.html" target="_blank">third post</a>. For information about accessing TDWG standards metadata via a SPARQL API, see the <a href="http://baskauf.blogspot.com/2019/04/understanding-tdwg-standards_7.html" target="_blank">fourth post</a>.<br /> <br /> Note: this post was revised on 2020-03-04 when IRI dereferencing of the http://rs.tdwg.org/ subdomain went from testing into production.<br /> <h2> Acquiring the machine-readable TDWG standards metadata based on the W3C Data Catalog (DCAT) Vocabulary Recommendation.</h2> <div> <br /></div> <h3> Not-so-great methods of getting a dump of all of the machine-readable metadata</h3> In the last two posts of this series, I showed two different ways that you could acquire machine-readable metadata about TDWG Standards and their components.<br /> <br /> In the <a href="http://baskauf.blogspot.com/2019/04/understanding-tdwg-standards.html" target="_blank">third post</a>, I explained how the implementation of the Standards Documentation Specification (SDS) could allow a machine (i.e. 
computer software) to use the classic Linked Open Data (LOD) method of "following its nose" and essentially scraping the standards metadata by discovering linked IRIs, then following those links to retrieve metadata about the linked components. There are two problems with this approach. One is that it's very inefficient. Multiple HTTP calls are required to acquire the metadata about a single resource and there are thousands of resources that would need to be scraped. A more serious problem is that some current or past terms of Darwin Core and Audubon Core are not dereferenceable. For example, the International Press Telecommunications Council (IPTC) terms that are borrowed by Audubon Core are defined in a PDF document and don't dereference. There are many ancient Darwin Core terms in namespaces other than the <span style="font-family: "courier new" , "courier" , monospace;">rs.tdwg.org</span> subdomain that don't even bring up a web page, let alone machine-readable metadata. And the "permanent URLs" of the standards themselves (e.g. <span style="font-family: "courier new" , "courier" , monospace;">http://www.tdwg.org/standards/116</span>) do not use content negotiation to return machine-readable metadata (although they might at some future point). So there are many items of interest whose machine-readable metadata simply cannot be discovered by this means, since linked IRIs can't be dereferenced with a request for machine-readable metadata.<br /> <br /> In the <a href="http://baskauf.blogspot.com/2019/04/understanding-tdwg-standards_7.html" target="_blank">fourth post</a>, I described how the SPARQL query language could be used to get all of the triples in the TDWG Standards dataset. 
The query to do so was really simple:<br /> <br /> <span style="font-family: "courier new" , "courier" , monospace;">CONSTRUCT {?s ?p ?o}</span><br /> <span style="font-family: "courier new" , "courier" , monospace;">FROM &lt;http://rs.tdwg.org/&gt;</span><br /> <span style="font-family: "courier new" , "courier" , monospace;">WHERE {?s ?p ?o}</span><br /> <div> <br /></div> <div> and by requesting the appropriate content type (XML, Turtle, or JSON-LD) via an Accept header, a single HTTP call would retrieve all of the metadata at once. If all goes well, this is a simple and effective method. However, this method depends critically on two things: there has to be a SPARQL endpoint that is functioning and publicly accessible, and the metadata in the triplestore of the underlying graph database must be up-to-date with the most recent data. At the moment, both of those things are true about the <a href="https://sparql.vanderbilt.edu/" target="_blank">Vanderbilt Library SPARQL endpoint</a> (<span style="font-family: "courier new" , "courier" , monospace;">https://sparql.vanderbilt.edu/sparql</span>), but there is no guarantee that it will continue to be true indefinitely. There is no reason why there cannot be multiple SPARQL endpoints where the data are available, and TDWG itself could run its own, but currently there are no plans for that to happen and so we are stuck with depending on the Vanderbilt endpoint.</div> <div> <br /></div> <h2> Getting a machine-readable data dump from TDWG itself</h2> <div> <br /></div> <div> I'm now going to tell you about the best way to acquire authoritative machine-readable metadata from the <span style="font-family: "courier new" , "courier" , monospace;">rs.tdwg.org</span> implementation itself. But first we need to talk about the W3C Data Catalog (DCAT) recommendation, which is used to organize the data dump. 
The SDS does not mention the DCAT recommendation, but since DCAT is an international standard, it is the logical choice to be used for describing the TDWG standards datasets.<br /> <br /></div> <div> <br /></div> <h3> Data Catalog Vocabulary (DCAT)</h3> <div> In 2014, the W3C ratified the <a href="https://www.w3.org/TR/vocab-dcat/" target="_blank">DCAT vocabulary</a> as a Recommendation (the W3C term for its ratified standards). DCAT is a vocabulary for describing datasets of any form. The described datasets can be machine-readable, but do not have to be, and could include non-machine-readable forms like spreadsheets. The description of the datasets is in RDF, although the Recommendation is agnostic about the serialization. </div> <div> <br /></div> <div> There are three classes of resources that are described by the DCAT vocabulary. A <b><i>data catalog</i></b> is the resource that describes datasets. Its type is <span style="font-family: "courier new" , "courier" , monospace;">dcat:Catalog</span> (<span style="font-family: "courier new" , "courier" , monospace;">http://www.w3.org/ns/dcat#Catalog</span>). The <i><b>datasets</b></i> described in the catalog are assigned the type <span style="font-family: "courier new" , "courier" , monospace;">dcat:Dataset</span>, which is a subclass of <span style="font-family: "courier new" , "courier" , monospace;">dctype:Dataset</span> (<span style="font-family: "courier new" , "courier" , monospace;">http://purl.org/dc/dcmitype/Dataset</span>). The third class of resources, <b><i>distributions</i></b>, are described as "an accessible form of a dataset" and can include downloadable files or web services. Distributions are assigned the type <span style="font-family: "courier new" , "courier" , monospace;">dcat:Distribution</span> (<span style="font-family: "courier new" , "courier" , monospace;">http://www.w3.org/ns/dcat#Distribution</span>). 
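<br /> <br /> As a rough sketch of how these three classes fit together, the SPARQL query below lists every dataset in a catalog along with its distributions. It assumes that a DCAT catalog description has been loaded into some triplestore that you can query; that setup is hypothetical and only for illustration. The linking properties <span style="font-family: 'courier new', courier, monospace;">dcat:dataset</span> and <span style="font-family: 'courier new', courier, monospace;">dcat:distribution</span> are terms from the DCAT vocabulary.<br /> <br />
<span style="font-family: 'courier new', courier, monospace;">PREFIX dcat: &lt;http://www.w3.org/ns/dcat#&gt;</span><br />
<span style="font-family: 'courier new', courier, monospace;">SELECT ?dataset ?distribution</span><br />
<span style="font-family: 'courier new', courier, monospace;">WHERE {</span><br />
<span style="font-family: 'courier new', courier, monospace;">&nbsp;&nbsp;?catalog a dcat:Catalog ;</span><br />
<span style="font-family: 'courier new', courier, monospace;">&nbsp;&nbsp;&nbsp;&nbsp;dcat:dataset ?dataset .</span><br />
<span style="font-family: 'courier new', courier, monospace;">&nbsp;&nbsp;?dataset dcat:distribution ?distribution .</span><br />
<span style="font-family: 'courier new', courier, monospace;">&nbsp;&nbsp;?distribution a dcat:Distribution .</span><br />
<span style="font-family: 'courier new', courier, monospace;">}</span><br /> <br />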
The hierarchical relationship among these classes of resources is shown in the following diagram.<br /> <br /> <div class="separator" style="clear: both; text-align: center;"> <a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiz3b18P5IKqa1T8c8ey2n9E1gDlKFZrSPqWzhx8tGQs-WMtuGAyh8WGrYcLJyoeBx9qXOcod8GwgQ0HYCPQU_-5Oc-qGijdRtRnCkIf8ahh9o2h9AU_TA96F_NzFQsalzl3VU4gWlEPEY/s1600/dcat.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="720" data-original-width="1002" height="458" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiz3b18P5IKqa1T8c8ey2n9E1gDlKFZrSPqWzhx8tGQs-WMtuGAyh8WGrYcLJyoeBx9qXOcod8GwgQ0HYCPQU_-5Oc-qGijdRtRnCkIf8ahh9o2h9AU_TA96F_NzFQsalzl3VU4gWlEPEY/s640/dcat.png" width="640" /></a></div> <div class="separator" style="clear: both; text-align: center;"> </div> <br /> An important thing to notice is that the DCAT vocabulary defines several terms whose IRIs are very similar: <span style="font-family: "courier new" , "courier" , monospace;">dcat:dataset</span> and <span style="font-family: "courier new" , "courier" , monospace;">dcat:Dataset</span>, and <span style="font-family: "courier new" , "courier" , monospace;">dcat:distribution</span> and <span style="font-family: "courier new" , "courier" , monospace;">dcat:Distribution</span>. The only thing that differs between the pairs of terms is whether the local name is capitalized or not. Those with capitalized local names denote <b><i>classes</i></b> and those that begin with lower case denote object <b><i>properties</i></b>.<br /> <br /></div> <div> <h3> Organization of TDWG data according to the DCAT data model</h3> </div> <div> I assigned the IRI <span style="font-family: "courier new" , "courier" , monospace;">http://rs.tdwg.org/index</span> to denote the TDWG standards metadata catalog. 
The local name "index" is descriptive of a catalog, and the IRI has the added benefit of supporting a typical web behavior: if a base subdomain like <span style="font-family: "courier new" , "courier" , monospace;">http://rs.tdwg.org/</span> is dereferenced, it is typical for that form of IRI to dereference to a "homepage" having the IRI <span style="font-family: "courier new" , "courier" , monospace;">http://rs.tdwg.org/index.htm</span>, and that IRI does indeed redirect to a "homepage" of sorts: the README.md page for the rs.tdwg.org GitHub repo where the authoritative metadata tables live. You can try this yourself by putting either <span style="font-family: "courier new" , "courier" , monospace;"><a href="http://rs.tdwg.org/" target="_blank">http://rs.tdwg.org/</a></span> or <span style="font-family: "courier new" , "courier" , monospace;"><a href="http://rs.tdwg.org/index.htm" target="_blank">http://rs.tdwg.org/index.htm</a></span> into a browser URL bar and seeing what happens. 
However, making an HTTP call to either of these IRIs with an <span style="font-family: "courier new" , "courier" , monospace;">Accept </span>header for machine-readable RDF (<span style="font-family: "courier new" , "courier" , monospace;">text/turtle</span> or <span style="font-family: "courier new" , "courier" , monospace;">application/rdf+xml</span>) will redirect to a representation-specific IRI like <a href="http://rs.tdwg.org/index.ttl" style="font-family: "Courier New", Courier, monospace;" target="_blank">http://rs.tdwg.org/index.ttl</a> or <a href="http://rs.tdwg.org/index.rdf" style="font-family: "Courier New", Courier, monospace;" target="_blank">http://rs.tdwg.org/index.rdf</a> as you'd expect in the Linked Data world.<br /> <br /> The data catalog denoted by <span style="font-family: "courier new" , "courier" , monospace;">http://rs.tdwg.org/index</span> describes the data located in the GitHub repository <a href="https://github.com/tdwg/rs.tdwg.org" target="_blank">https://github.com/tdwg/rs.tdwg.org</a>. Those data are organized into a number of directories, with each directory containing all of the information required to map metadata-containing CSV files to machine-readable RDF. From the standpoint of DCAT, we can consider the information in each directory as a dataset. There is no philosophical reason why we should organize the datasets that way. Rather, it is based on practicality, since the server that dereferences TDWG IRIs can generate a data dump for each directory via a dump URL. See <a href="https://github.com/tdwg/rs.tdwg.org/blob/master/index/index-datasets.csv" target="_blank">this file</a> for a complete list of the datasets.<br /> <br /> Each of the abstract datasets can be accessed through one of several distributions. 
Currently, the RDF metadata about the TDWG data says that there are three distributions for each of the datasets: one in RDF/XML, one in RDF/Turtle, and one in JSON-LD (with the JSON-LD having a problem I mentioned in the <a href="http://baskauf.blogspot.com/2019/04/understanding-tdwg-standards.html" target="_blank">third post</a>). The IANA media type for each distribution is given as the value of a <span style="font-family: "courier new" , "courier" , monospace;">dcat:mediaType</span> property (see the diagram above for an example).<br /> <br /> One thing that is a bit different from what one might consider the traditional Linked Data approach is that the distributions are not really considered representations of the datasets. That is, under the DCAT model, one does not necessarily expect to be redirected to the distribution IRI from dereferencing of the dataset IRI through content negotiation. That's because content negotiation generally results in direct retrieval of some human- or machine-readable serialization, but in the DCAT model, the distribution itself is a separate, abstract entity apart from the serialization. The serialization itself is connected via a <span style="font-family: "courier new" , "courier" , monospace;">dcat:downloadURL</span> property of the distribution (see the diagram above). I'm not sure why the DCAT model adds this extra layer, but I think it is probably so that a permanent IRI can be assigned to the distribution, while the download URL can be a mutable thing that can change over time, yet still be discovered through its link to the distribution.<br /> <br /> At the moment, the dataset IRIs don't dereference, although that could be changed in the future if need be. 
Despite that, their metadata are exposed when the data catalog IRI itself is dereferenced, so a machine could learn all it needed to know about them with a single HTTP call to the catalog IRI.<br /> <br /> In the case of the TDWG data, I didn't actually mint IRIs for the distributions, since it's not that likely that anyone would ever need to address them directly and I wasn't interested in maintaining another set of identifiers. So they are represented by blank (anonymous) nodes in the dataset. The download URLs can be determined from the dataset URI by rules, so there's no need to maintain a record of them, either.<br /> <br /> Here is an abbreviated bit of the Turtle that you get if you dereference the catalog IRI <span style="font-family: "courier new" , "courier" , monospace;">http://rs.tdwg.org/index</span> and request <span style="font-family: "courier new" , "courier" , monospace;">text/turtle</span> (or just retrieve <a href="http://rs.tdwg.org/index.ttl" target="_blank">http://rs.tdwg.org/index.ttl</a>):<br /> <br />
<span style="font-family: "courier new" , "courier" , monospace;">@prefix rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt;.</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">@prefix xsd: &lt;http://www.w3.org/2001/XMLSchema#&gt;.</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">@prefix dc: &lt;http://purl.org/dc/elements/1.1/&gt;.</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">@prefix dcterms: &lt;http://purl.org/dc/terms/&gt;.</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">@prefix dcat: &lt;http://www.w3.org/ns/dcat#&gt;.</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">@prefix dcmitype: &lt;http://purl.org/dc/dcmitype/&gt;.</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">&lt;http://rs.tdwg.org/index&gt;</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> dc:publisher "Biodiversity Information Standards (TDWG)"@en;</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> dcterms:publisher &lt;https://www.grid.ac/institutes/grid.480498.9&gt;;</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> dcterms:license &lt;http://creativecommons.org/licenses/by/4.0/&gt;;</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> dcterms:modified "2018-10-09"^^xsd:date;</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> rdfs:label "TDWG dataset catalog"@en;</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> rdfs:comment "This dataset contains the data that underlies TDWG standards and standards documents"@en;</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> dcat:dataset &lt;http://rs.tdwg.org/index/audubon&gt;;</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> a dcat:Catalog.</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">&lt;http://rs.tdwg.org/index/audubon&gt;</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> dcterms:modified "2018-10-09"^^xsd:date;</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> rdfs:label "Audubon Core-defined terms"@en;</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> dcat:distribution _:53c07f45-4561-448b-9bb9-396e47d3ad1d;</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> a dcmitype:Dataset.</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">_:53c07f45-4561-448b-9bb9-396e47d3ad1d</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> dcat:mediaType &lt;https://www.iana.org/assignments/media-types/application/rdf+xml&gt;;</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> dcterms:license &lt;https://creativecommons.org/publicdomain/zero/1.0/&gt;;</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> dcat:downloadURL &lt;http://rs.tdwg.org/dump/audubon.rdf&gt;;</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> a dcat:Distribution.</span><br /> <div> <br /></div> In this Turtle, you can see the DCAT-based structure as described above.<br /> <br /> Returning to a comment that I made earlier, DCAT can describe data in any form and it's not restricted to RDF. So in theory, one could consider each dataset to have a distribution that is in CSV format, and use the GitHub raw URL for the CSV file as the download URL of that distribution. I haven't done that because complete information about the dataset requires the combination of the raw CSV file with a property mapping table and I don't know how to represent that complexity in DCAT. But at least in theory it could be done. One can also indicate that a distribution of the dataset is available from an API such as a SPARQL endpoint, which I also have not done because the datasets aren't compartmentalized into named graphs and therefore can't really be distinguished from each other. But again, in theory it could be done.</div> <div> <br /> <h3> Getting a dump of all of the data</h3> At the start of this post, I complained that there were potential issues with the first two methods that I described for retrieving all of the TDWG standards metadata. 
I promised a better way, so here it is!<br /> <br /> In theory, a client could start with the catalog IRI (<span style="font-family: "courier new" , "courier" , monospace;">http://rs.tdwg.org/index</span>), dereference it while requesting its preferred machine-readable serialization, and follow the links to the download URLs of all 50 of the datasets currently in the catalog. That would be in the LOD style and would require far fewer HTTP calls than the thousands that would be required to scrape all of the machine-readable data one standards-related resource at a time.<br /> <br /> However, here is a quick and dirty way that doesn't require using any Linked Data technology:<br /> <ul> <li>use a script in your favorite programming language to load the <a href="https://raw.githubusercontent.com/tdwg/rs.tdwg.org/master/index/index-datasets.csv" target="_blank">raw file for the datasets CSV table on GitHub</a></li> <li>get the dataset name from the second ("term_localName") column (e.g. <span style="font-family: "courier new" , "courier" , monospace;">audubon</span>)</li> <li>prepend <span style="font-family: "courier new" , "courier" , monospace;">http://rs.tdwg.org/dump/</span> to the name (e.g. <span style="font-family: "courier new" , "courier" , monospace;">http://rs.tdwg.org/dump/audubon</span>)</li> <li>append the appropriate file extension for the serialization you want (<span style="font-family: "courier new" , "courier" , monospace;">.ttl</span> for Turtle, <span style="font-family: "courier new" , "courier" , monospace;">.rdf</span> for XML) to the URL from the previous step (e.g. <span style="font-family: "courier new" , "courier" , monospace;">http://rs.tdwg.org/dump/audubon.ttl</span>)</li> <li>make an HTTP GET call to that URL to acquire the machine-readable serialization for that dataset 
</li> <li>repeat for the other 49 data rows in the table</li> </ul> <br /> I've done something like this in lines 55 to 63 of <a href="https://github.com/tdwg/rs.tdwg.org/blob/master/index/database-triple-loader.py" target="_blank">a Python script</a> on GitHub. Rather than making a GET request, the script uses the constructed URL to create a <a href="https://www.w3.org/TR/sparql11-update/" target="_blank">SPARQL Update</a> command that loads the data directly from the TDWG server into a graph database triplestore (lines 127 and 133) via an HTTP POST request. But you could use GET to load the data directly into your own software using a library like Python's <a href="https://github.com/RDFLib/rdflib" target="_blank">RDFLib</a> if you prefer to work with it directly rather than through a SPARQL endpoint.<br /> <br /> The advantage of getting the dump this way is that it comes directly from the authoritative TDWG server (which gets its data from the CSVs in the rs.tdwg.org repo of the TDWG GitHub site). You would then be guaranteed to have the most up-to-date version of the data, something that would not necessarily happen if you got the data from somebody else's SPARQL endpoint.<br /> <br /> In the future, this method will be important because it will be the best way to build reliable applications that make use of standards metadata. For many standards and the "regular" TDWG vocabularies that conform to the SDS (Darwin and Audubon Cores), retrieving up-to-date metadata probably isn't that critical because those standards don't change very quickly. 
However, in the case of controlled vocabularies, access to up-to-date data may be more important.<br /> <br /></div> <div style='clear: both;'></div> </div> <div class='post-footer'> <div class='post-footer-line post-footer-line-1'> <span class='post-author vcard'> Posted by <span class='fn' itemprop='author' itemscope='itemscope' itemtype='http://schema.org/Person'> <meta content='https://www.blogger.com/profile/01896499749604153763' itemprop='url'/> <a class='g-profile' href='https://www.blogger.com/profile/01896499749604153763' rel='author' title='author profile'> <span itemprop='name'>Steve Baskauf</span> </a> </span> </span> <span class='post-timestamp'> at <meta content='http://baskauf.blogspot.com/2019/04/understanding-tdwg-standards_24.html' itemprop='url'/> <a class='timestamp-link' href='http://baskauf.blogspot.com/2019/04/understanding-tdwg-standards_24.html' rel='bookmark' title='permanent link'><abbr class='published' itemprop='datePublished' title='2019-04-24T10:06:00-07:00'>10:06 AM</abbr></a> </span> <span class='post-comment-link'> </span> <span class='post-icons'> </span>
 </div> <div class='post-footer-line post-footer-line-2'> <span class='post-labels'> </span> </div> <div class='post-footer-line post-footer-line-3'> <span class='post-location'> </span> </div> </div> </div> <div class='comments' id='comments'> <a name='comments'></a> <h4>No comments:</h4> <div id='Blog1_comments-block-wrapper'> <dl class='avatar-comment-indent' id='comments-block'> </dl> </div> <p class='comment-footer'> <div class='comment-form'> <a name='comment-form'></a> <h4 id='comment-post-message'>Post a Comment</h4> <p> </p> <a href='https://www.blogger.com/comment/frame/5299754536670281996?po=9189703936140939208&hl=en' id='comment-editor-src'></a> <iframe allowtransparency='true' class='blogger-iframe-colorize blogger-comment-from-post' frameborder='0' height='410px' id='comment-editor' name='comment-editor' src='' width='100%'></iframe> <script src='https://www.blogger.com/static/v1/jsbin/2315299244-comment_from_post_iframe.js' type='text/javascript'></script> <script type='text/javascript'> 
BLOG_CMT_createIframe('https://www.blogger.com/rpc_relay.html'); </script> </div> </p> </div> </div> </div></div> </div> <div class='blog-pager' id='blog-pager'> <span id='blog-pager-newer-link'> <a class='blog-pager-newer-link' href='http://baskauf.blogspot.com/2019/05/getting-data-out-of-wikidata-using.html' id='Blog1_blog-pager-newer-link' title='Newer Post'>Newer Post</a> </span> <span id='blog-pager-older-link'> <a class='blog-pager-older-link' href='http://baskauf.blogspot.com/2019/04/understanding-tdwg-standards_7.html' id='Blog1_blog-pager-older-link' title='Older Post'>Older Post</a> </span> <a class='home-link' href='http://baskauf.blogspot.com/'>Home</a> </div> <div class='clear'></div> <div class='post-feeds'> <div class='feed-links'> Subscribe to: <a class='feed-link' href='http://baskauf.blogspot.com/feeds/9189703936140939208/comments/default' target='_blank' type='application/atom+xml'>Post Comments (Atom)</a> </div> </div> </div></div> </div> </div> <div class='column-left-outer'> <div class='column-left-inner'> <aside> </aside> </div> </div> <div class='column-right-outer'> <div class='column-right-inner'> <aside> <div class='sidebar section' id='sidebar-right-1'><div class='widget BlogArchive' data-version='1' id='BlogArchive1'> <h2>Blog Archive</h2> <div class='widget-content'> <div id='ArchiveList'> <div id='BlogArchive1_ArchiveList'> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2023/'> 2023 </a> <span class='post-count' dir='ltr'>(2)</span> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2023/08/'> August </a> <span class='post-count' dir='ltr'>(1)</span> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' 
href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2023/04/'> April </a> <span class='post-count' dir='ltr'>(1)</span> </li> </ul> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2022/'> 2022 </a> <span class='post-count' dir='ltr'>(4)</span> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2022/09/'> September </a> <span class='post-count' dir='ltr'>(1)</span> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2022/06/'> June </a> <span class='post-count' dir='ltr'>(1)</span> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2022/03/'> March </a> <span class='post-count' dir='ltr'>(1)</span> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2022/01/'> January </a> <span class='post-count' dir='ltr'>(1)</span> </li> </ul> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2021/'> 2021 </a> <span class='post-count' dir='ltr'>(4)</span> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span 
class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2021/03/'> March </a> <span class='post-count' dir='ltr'>(4)</span> </li> </ul> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2020/'> 2020 </a> <span class='post-count' dir='ltr'>(5)</span> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2020/03/'> March </a> <span class='post-count' dir='ltr'>(1)</span> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2020/02/'> February </a> <span class='post-count' dir='ltr'>(4)</span> </li> </ul> </li> </ul> <ul class='hierarchy'> <li class='archivedate expanded'> <a class='toggle' href='javascript:void(0)'> <span class='zippy toggle-open'> ▼  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2019/'> 2019 </a> <span class='post-count' dir='ltr'>(9)</span> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2019/10/'> October </a> <span class='post-count' dir='ltr'>(1)</span> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2019/06/'> June </a> <span class='post-count' dir='ltr'>(2)</span> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> 
</a> <a class='post-count-link' href='http://baskauf.blogspot.com/2019/05/'> May </a> <span class='post-count' dir='ltr'>(1)</span> </li> </ul> <ul class='hierarchy'> <li class='archivedate expanded'> <a class='toggle' href='javascript:void(0)'> <span class='zippy toggle-open'> ▼  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2019/04/'> April </a> <span class='post-count' dir='ltr'>(3)</span> <ul class='posts'> <li><a href='http://baskauf.blogspot.com/2019/04/understanding-tdwg-standards_24.html'>Understanding the TDWG Standards Documentation Spe...</a></li> <li><a href='http://baskauf.blogspot.com/2019/04/understanding-tdwg-standards_7.html'>Understanding the TDWG Standards Documentation Spe...</a></li> <li><a href='http://baskauf.blogspot.com/2019/04/understanding-tdwg-standards.html'>Understanding the TDWG Standards Documentation Spe...</a></li> </ul> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2019/03/'> March </a> <span class='post-count' dir='ltr'>(2)</span> </li> </ul> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2018/'> 2018 </a> <span class='post-count' dir='ltr'>(1)</span> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2018/02/'> February </a> <span class='post-count' dir='ltr'>(1)</span> </li> </ul> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2017/'> 2017 </a> <span 
class='post-count' dir='ltr'>(6)</span> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2017/07/'> July </a> <span class='post-count' dir='ltr'>(1)</span> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2017/05/'> May </a> <span class='post-count' dir='ltr'>(1)</span> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2017/03/'> March </a> <span class='post-count' dir='ltr'>(3)</span> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2017/02/'> February </a> <span class='post-count' dir='ltr'>(1)</span> </li> </ul> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2016/'> 2016 </a> <span class='post-count' dir='ltr'>(15)</span> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2016/11/'> November </a> <span class='post-count' dir='ltr'>(3)</span> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2016/10/'> October </a> <span class='post-count' dir='ltr'>(3)</span> 
</li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2016/04/'> April </a> <span class='post-count' dir='ltr'>(2)</span> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2016/03/'> March </a> <span class='post-count' dir='ltr'>(3)</span> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2016/02/'> February </a> <span class='post-count' dir='ltr'>(4)</span> </li> </ul> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2015/'> 2015 </a> <span class='post-count' dir='ltr'>(6)</span> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2015/09/'> September </a> <span class='post-count' dir='ltr'>(2)</span> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2015/07/'> July </a> <span class='post-count' dir='ltr'>(4)</span> </li> </ul> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2014/'> 2014 </a> <span class='post-count' dir='ltr'>(7)</span> <ul class='hierarchy'> 
<li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2014/05/'> May </a> <span class='post-count' dir='ltr'>(3)</span> </li> </ul> <ul class='hierarchy'> <li class='archivedate collapsed'> <a class='toggle' href='javascript:void(0)'> <span class='zippy'> ►  </span> </a> <a class='post-count-link' href='http://baskauf.blogspot.com/2014/04/'> April </a> <span class='post-count' dir='ltr'>(4)</span> </li> </ul> </li> </ul> </div> </div> <div class='clear'></div> </div> </div><div class='widget Profile' data-version='1' id='Profile1'> <h2>About Me</h2> <div class='widget-content'> <a href='https://www.blogger.com/profile/01896499749604153763'><img alt='My photo' class='profile-img' height='80' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhVm8jF34Q3Q0IttRnG66Z3cjplIPwTfYeVwoznPhxbTRnbtkuN5ekrcV1MNxedjGsgnImAIr_OAT_USbVVyX6pK_xy2GWTQDofBY9K7fiMj9DBOK4_dORjPC0UTaHIvP0/s220/profile-pic-carmen-small.jpg' width='80'/></a> <dl class='profile-datablock'> <dt class='profile-data'> <a class='profile-name-link g-profile' href='https://www.blogger.com/profile/01896499749604153763' rel='author' style='background-image: url(//www.blogger.com/img/logo-16.png);'> Steve Baskauf </a> </dt> </dl> <a class='profile-link' href='https://www.blogger.com/profile/01896499749604153763' rel='author'>View my complete profile</a> <div class='clear'></div> </div> </div></div> </aside> </div> </div> </div> <div style='clear: both'></div> <!-- columns --> </div> <!-- main --> </div> </div> <div class='main-cap-bottom cap-bottom'> <div class='cap-left'></div> <div class='cap-right'></div> </div> </div> <footer> <div class='footer-outer'> <div class='footer-cap-top cap-top'> <div class='cap-left'></div> <div class='cap-right'></div> </div> <div class='fauxborder-left footer-fauxborder-left'> <div class='fauxborder-right footer-fauxborder-right'></div> <div 
class='region-inner footer-inner'> <div class='foot no-items section' id='footer-1'></div> <table border='0' cellpadding='0' cellspacing='0' class='section-columns columns-2'> <tbody> <tr> <td class='first columns-cell'> <div class='foot no-items section' id='footer-2-1'></div> </td> <td class='columns-cell'> <div class='foot no-items section' id='footer-2-2'></div> </td> </tr> </tbody> </table> <!-- outside of the include in order to lock Attribution widget --> <div class='foot section' id='footer-3' name='Footer'><div class='widget Attribution' data-version='1' id='Attribution1'> <div class='widget-content' style='text-align: center;'> Simple theme. Powered by <a href='https://www.blogger.com' target='_blank'>Blogger</a>. </div> <div class='clear'></div> </div></div> </div> </div> <div class='footer-cap-bottom cap-bottom'> <div class='cap-left'></div> <div class='cap-right'></div> </div> </div> </footer> <!-- content --> </div> </div> <div class='content-cap-bottom cap-bottom'> <div class='cap-left'></div> <div class='cap-right'></div> </div> </div> </div> <script type='text/javascript'> window.setTimeout(function() { document.body.className = document.body.className.replace('loading', ''); }, 10); </script> <script type="text/javascript" src="https://www.blogger.com/static/v1/widgets/984859869-widgets.js"></script> <script type='text/javascript'> window['__wavt'] = 'AOuZoY5zn2XNEjiwBhB3q0XjE9x_A3OlMw:1732490057987';_WidgetManager._Init('//www.blogger.com/rearrange?blogID\x3d5299754536670281996','//baskauf.blogspot.com/2019/04/understanding-tdwg-standards_24.html','5299754536670281996'); _WidgetManager._SetDataContext([{'name': 'blog', 'data': {'blogId': '5299754536670281996', 'title': 'Steve Baskauf\x27s blog', 'url': 'http://baskauf.blogspot.com/2019/04/understanding-tdwg-standards_24.html', 'canonicalUrl': 'http://baskauf.blogspot.com/2019/04/understanding-tdwg-standards_24.html', 'homepageUrl': 'http://baskauf.blogspot.com/', 'searchUrl': 
'http://baskauf.blogspot.com/search', 'canonicalHomepageUrl': 'http://baskauf.blogspot.com/', 'blogspotFaviconUrl': 'http://baskauf.blogspot.com/favicon.ico', 'bloggerUrl': 'https://www.blogger.com', 'hasCustomDomain': false, 'httpsEnabled': true, 'enabledCommentProfileImages': true, 'gPlusViewType': 'FILTERED_POSTMOD', 'adultContent': false, 'analyticsAccountNumber': '', 'encoding': 'UTF-8', 'locale': 'en', 'localeUnderscoreDelimited': 'en', 'languageDirection': 'ltr', 'isPrivate': false, 'isMobile': false, 'isMobileRequest': false, 'mobileClass': '', 'isPrivateBlog': false, 'isDynamicViewsAvailable': true, 'feedLinks': '\x3clink rel\x3d\x22alternate\x22 type\x3d\x22application/atom+xml\x22 title\x3d\x22Steve Baskauf\x26#39;s blog - Atom\x22 href\x3d\x22http://baskauf.blogspot.com/feeds/posts/default\x22 /\x3e\n\x3clink rel\x3d\x22alternate\x22 type\x3d\x22application/rss+xml\x22 title\x3d\x22Steve Baskauf\x26#39;s blog - RSS\x22 href\x3d\x22http://baskauf.blogspot.com/feeds/posts/default?alt\x3drss\x22 /\x3e\n\x3clink rel\x3d\x22service.post\x22 type\x3d\x22application/atom+xml\x22 title\x3d\x22Steve Baskauf\x26#39;s blog - Atom\x22 href\x3d\x22https://www.blogger.com/feeds/5299754536670281996/posts/default\x22 /\x3e\n\n\x3clink rel\x3d\x22alternate\x22 type\x3d\x22application/atom+xml\x22 title\x3d\x22Steve Baskauf\x26#39;s blog - Atom\x22 href\x3d\x22http://baskauf.blogspot.com/feeds/9189703936140939208/comments/default\x22 /\x3e\n', 'meTag': '', 'adsenseHostId': 'ca-host-pub-1556223355139109', 'adsenseHasAds': false, 'adsenseAutoAds': false, 'boqCommentIframeForm': true, 'loginRedirectParam': '', 'view': '', 'dynamicViewsCommentsSrc': '//www.blogblog.com/dynamicviews/4224c15c4e7c9321/js/comments.js', 'dynamicViewsScriptSrc': '//www.blogblog.com/dynamicviews/d78375fb222d99b3', 'plusOneApiSrc': 'https://apis.google.com/js/platform.js', 'disableGComments': true, 'interstitialAccepted': false, 'sharing': {'platforms': [{'name': 'Get link', 'key': 'link', 
'shareMessage': 'Get link', 'target': ''}, {'name': 'Facebook', 'key': 'facebook', 'shareMessage': 'Share to Facebook', 'target': 'facebook'}, {'name': 'BlogThis!', 'key': 'blogThis', 'shareMessage': 'BlogThis!', 'target': 'blog'}, {'name': 'X', 'key': 'twitter', 'shareMessage': 'Share to X', 'target': 'twitter'}, {'name': 'Pinterest', 'key': 'pinterest', 'shareMessage': 'Share to Pinterest', 'target': 'pinterest'}, {'name': 'Email', 'key': 'email', 'shareMessage': 'Email', 'target': 'email'}], 'disableGooglePlus': true, 'googlePlusShareButtonWidth': 0, 'googlePlusBootstrap': '\x3cscript type\x3d\x22text/javascript\x22\x3ewindow.___gcfg \x3d {\x27lang\x27: \x27en\x27};\x3c/script\x3e'}, 'hasCustomJumpLinkMessage': false, 'jumpLinkMessage': 'Read more', 'pageType': 'item', 'postId': '9189703936140939208', 'postImageThumbnailUrl': 'https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiz3b18P5IKqa1T8c8ey2n9E1gDlKFZrSPqWzhx8tGQs-WMtuGAyh8WGrYcLJyoeBx9qXOcod8GwgQ0HYCPQU_-5Oc-qGijdRtRnCkIf8ahh9o2h9AU_TA96F_NzFQsalzl3VU4gWlEPEY/s72-c/dcat.png', 'postImageUrl': 'https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiz3b18P5IKqa1T8c8ey2n9E1gDlKFZrSPqWzhx8tGQs-WMtuGAyh8WGrYcLJyoeBx9qXOcod8GwgQ0HYCPQU_-5Oc-qGijdRtRnCkIf8ahh9o2h9AU_TA96F_NzFQsalzl3VU4gWlEPEY/s640/dcat.png', 'pageName': 'Understanding the TDWG Standards Documentation Specification, Part 5: Acquiring Machine-readable using DCAT', 'pageTitle': 'Steve Baskauf\x27s blog: Understanding the TDWG Standards Documentation Specification, Part 5: Acquiring Machine-readable using DCAT'}}, {'name': 'features', 'data': {}}, {'name': 'messages', 'data': {'edit': 'Edit', 'linkCopiedToClipboard': 'Link copied to clipboard!', 'ok': 'Ok', 'postLink': 'Post Link'}}, {'name': 'template', 'data': {'name': 'Simple', 'localizedName': 'Simple', 'isResponsive': false, 'isAlternateRendering': false, 'isCustom': false, 'variant': 'simplysimple', 'variantId': 'simplysimple'}}, {'name': 'view', 'data': {'classic': {'name': 
'classic', 'url': '?view\x3dclassic'}, 'flipcard': {'name': 'flipcard', 'url': '?view\x3dflipcard'}, 'magazine': {'name': 'magazine', 'url': '?view\x3dmagazine'}, 'mosaic': {'name': 'mosaic', 'url': '?view\x3dmosaic'}, 'sidebar': {'name': 'sidebar', 'url': '?view\x3dsidebar'}, 'snapshot': {'name': 'snapshot', 'url': '?view\x3dsnapshot'}, 'timeslide': {'name': 'timeslide', 'url': '?view\x3dtimeslide'}, 'isMobile': false, 'title': 'Understanding the TDWG Standards Documentation Specification, Part 5: Acquiring Machine-readable using DCAT', 'description': 'This is the fifth in a series of posts about the TDWG Standards Documentation Specification (SDS).\xa0 For background on the SDS, see the first...', 'featuredImage': 'https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiz3b18P5IKqa1T8c8ey2n9E1gDlKFZrSPqWzhx8tGQs-WMtuGAyh8WGrYcLJyoeBx9qXOcod8GwgQ0HYCPQU_-5Oc-qGijdRtRnCkIf8ahh9o2h9AU_TA96F_NzFQsalzl3VU4gWlEPEY/s640/dcat.png', 'url': 'http://baskauf.blogspot.com/2019/04/understanding-tdwg-standards_24.html', 'type': 'item', 'isSingleItem': true, 'isMultipleItems': false, 'isError': false, 'isPage': false, 'isPost': true, 'isHomepage': false, 'isArchive': false, 'isLabelSearch': false, 'postId': 9189703936140939208}}]); _WidgetManager._RegisterWidget('_NavbarView', new _WidgetInfo('Navbar1', 'navbar', document.getElementById('Navbar1'), {}, 'displayModeFull')); _WidgetManager._RegisterWidget('_HeaderView', new _WidgetInfo('Header1', 'header', document.getElementById('Header1'), {}, 'displayModeFull')); _WidgetManager._RegisterWidget('_BlogView', new _WidgetInfo('Blog1', 'main', document.getElementById('Blog1'), {'cmtInteractionsEnabled': false, 'lightboxEnabled': true, 'lightboxModuleUrl': 'https://www.blogger.com/static/v1/jsbin/2646514562-lbx.js', 'lightboxCssUrl': 'https://www.blogger.com/static/v1/v-css/1964470060-lightbox_bundle.css'}, 'displayModeFull')); _WidgetManager._RegisterWidget('_BlogArchiveView', new _WidgetInfo('BlogArchive1', 'sidebar-right-1', 
document.getElementById('BlogArchive1'), {'languageDirection': 'ltr', 'loadingMessage': 'Loading\x26hellip;'}, 'displayModeFull')); _WidgetManager._RegisterWidget('_ProfileView', new _WidgetInfo('Profile1', 'sidebar-right-1', document.getElementById('Profile1'), {}, 'displayModeFull')); _WidgetManager._RegisterWidget('_AttributionView', new _WidgetInfo('Attribution1', 'footer-3', document.getElementById('Attribution1'), {}, 'displayModeFull')); </script> </body> </html>