CINXE.COM
Using Drill Metastore - Apache Drill
<!DOCTYPE html> <html> <head> <meta charset="UTF-8"> <meta name=viewport content="width=device-width, initial-scale=1"> <title>Using Drill Metastore - Apache Drill</title> <link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.3.0/css/font-awesome.min.css" rel="stylesheet" type="text/css"/> <link href="/css/site.css" rel="stylesheet" type="text/css"/> <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon"/> <link rel="icon" href="/favicon.ico" type="image/x-icon"/> <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/1.11.1/jquery.min.js" language="javascript" type="text/javascript"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery-easing/1.3/jquery.easing.min.js" language="javascript" type="text/javascript"></script> <script language="javascript" type="text/javascript" src="/js/modernizr.custom.js"></script> <script language="javascript" type="text/javascript" src="/js/script.js"></script> <script language="javascript" type="text/javascript" src="/js/drill.js"></script> </head> <body onResize="resized();"> <div class="page-wrap"> <div class="bui"></div> <div id="menu" class="mw"> <ul> <li class='toc-categories'> <a class="expand-toc-icon" href="javascript:void(0);"><i class="fa fa-bars"></i></a> </li> <li class="logo"><a href="/"></a></li> <li class='expand-menu'> <a href="javascript:void(0);"><span class='menu-text'>Menu</span><span class='expand-icon'><i class="fa fa-bars"></i></span></a> </li> <li class="clear-float"></li> <li class="nav"> <a>Language</a> <ul> <li> <a style="font-weight: bold;" href="/docs/using-drill-metastore/" >en</a> </li> <li> <a href="/zh/docs/using-drill-metastore/" >zh</a> </li> </ul> </li> <li class="apache-link"> <a href="/apacheASF/">Apache</a> </li> <li class="poweredby"> <a href="/poweredBy">Powered By</a> </li> <li class="documentation-menu"> <a href="/docs/">Documentation</a> <ul> <li><a href="/docs/getting-started/">Getting Started</a></li> <li><a href="/docs/architecture/">Architecture</a></li> <li><a href="/docs/tutorials/">Tutorials</a></li> <li><a href="/docs/drill-on-yarn/">Drill-on-YARN</a></li> <li><a href="/docs/install-drill/">Install Drill</a></li> <li><a href="/docs/configure-drill/">Configure Drill</a></li> <li><a href="/docs/connect-a-data-source/">Connect a Data Source</a></li> <li><a href="/docs/odbc-jdbc-interfaces/">ODBC/JDBC Interfaces</a></li> <li><a href="/docs/query-data/">Query Data</a></li> <li><a href="/docs/performance-tuning/">Performance Tuning</a></li> <li><a href="/docs/log-and-debug/">Log and Debug</a></li> <li><a href="/docs/sql-reference/">SQL Reference</a></li> <li><a href="/docs/data-sources-and-file-formats/">Data Sources and File Formats</a></li> <li><a href="/docs/develop-custom-functions/">Develop Custom Functions</a></li> <li><a href="/docs/troubleshooting/">Troubleshooting</a></li> <li><a href="/docs/developer-information/">Developer Information</a></li> <li><a href="/docs/release-notes/">Release Notes</a></li> <li><a href="/docs/sample-datasets/">Sample Datasets</a></li> <li><a href="/docs/project-bylaws/">Project Bylaws</a></li> <li><a href="/docs/ecosystem/">Ecosystem</a></li> </ul> </li> <li class='nav'> <a href="/community-resources/">Community</a> <ul> <li><a href="/team/">Team</a></li> <li><a href="/mailinglists/">Mailing Lists</a></li> <li><a href="/community-resources/">Community Resources</a></li> </ul> </li> <li class='nav'><a href="/faq/">FAQ</a></li> <li class='nav'><a href="/blog/">Blog</a></li> <li class="social-menu-item"><a href="https://twitter.com/apachedrill" title="apachedrill on twitter" target="_blank"><img src="/images/twitter_32_26_white.png" alt="twitter logo" align="center"></a> </li> <li class="social-menu-item"><a href="https://join.slack.com/t/apache-drill/shared_invite/enQtNTQ4MjM1MDA3MzQ2LTJlYmUxMTRkMmUwYmQ2NTllYmFmMjU4MDk0NjYwZjBmYjg0MDZmOTE2ZDg0ZjBlYmI3Yjc4Y2I2NTQyNGVlZTc" title="Apache Drill Slack channels" target="_blank"><img src="/images/slack-logo.svg" alt="Slack logo" align="center"></a> </li> <li class='search-bar'> <form id="drill-search-form"> <input type="text" placeholder="Search Apache Drill" id="drill-search-term" /> <button type="submit"> <i class="fa fa-search"></i> </button> </form> </li> <li class="d"> <a href="/download/"> <i class="fa fa-cloud-download"></i> Download </a> </li> </ul> </div> <link href="/css/content.css" rel="stylesheet" type="text/css"> <aside class="sidebar"> <div class="docsidebar"> <div class="docsidebarwrapper"> <ul style="display: block;"> <li class="toctree-l1"><a href="javascript: void(0);">Getting Started</a></li> <ul style="display: none"> <li class="toctree-l2"><a class="reference internal" href="/docs/drill-introduction/">Drill Introduction</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/why-drill/">Why Drill</a></li> </ul> <li class="toctree-l1"><a href="javascript: void(0);">Architecture</a></li> <ul style="display: none"> <li class="toctree-l2"><a class="reference internal" href="/docs/architecture-introduction/">Architecture Introduction</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/drill-query-execution/">Drill Query Execution</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/core-modules/">Core Modules</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/performance/">Performance</a></li> </ul> <li class="toctree-l1"><a href="javascript: void(0);">Tutorials</a></li> <ul style="display: none"> <li class="toctree-l2"><a class="reference internal" href="/docs/tutorials-introduction/">Tutorials Introduction</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/drill-in-10-minutes/">Drill in 10 Minutes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/analyzing-the-yelp-academic-dataset/">Analyzing the Yelp Academic Dataset</a></li> <li class="toctree-l2"><a href="javascript: void(0);">Learn Drill with the MapR Sandbox</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/about-the-mapr-sandbox/">About the MapR Sandbox</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/installing-the-apache-drill-sandbox/">Installing the Apache Drill Sandbox</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/getting-to-know-the-drill-sandbox/">Getting to Know the Drill Sandbox</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/lesson-1-learn-about-the-data-set/">Lesson 1: Learn about the Data Set</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/lesson-2-run-queries-with-ansi-sql/">Lesson 2: Run Queries with ANSI SQL</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/lesson-3-run-queries-on-complex-data-types/">Lesson 3: Run Queries on Complex Data Types</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/summary/">Summary</a></li> </ul> <li class="toctree-l2"><a class="reference internal" href="/docs/analyzing-highly-dynamic-datasets/">Analyzing Highly Dynamic Datasets</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/analyzing-social-media/">Analyzing Social Media</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/analyzing-data-using-window-functions/">Analyzing Data Using Window Functions</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/orchestrating-queries-with-airflow/">Orchestrating queries with Airflow</a></li> </ul> <li class="toctree-l1"><a href="javascript: void(0);">Drill-on-YARN</a></li> <ul style="display: none"> <li class="toctree-l2"><a class="reference internal" href="/docs/drill-on-yarn-introduction/">Drill-on-YARN Introduction</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/creating-a-basic-drill-cluster/">Creating a Basic Drill Cluster</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/launch-drill-under-yarn/">Launch Drill Under YARN</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/configuration-reference/">Configuration Reference</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/drill-on-yarn-command-line-tool/">Drill-on-YARN Command-Line Tool</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/using-the-drill-on-yarn-web-ui/">Using the Drill-on-YARN Web UI</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/multiple-drill-clusters/">Multiple Drill Clusters</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/enabling-web-ui-security/">Enabling Web UI Security</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/appendix-a-release-note-issues/">Appendix A: Release Note Issues</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/appendix-b-drill-env-sh-settings/">Appendix B: drill-env.sh Settings</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/appendix-c-troubleshooting/">Appendix C: Troubleshooting</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/appendix-d-recreate-the-drill-archive/">Appendix D: Recreate the Drill Archive</a></li> </ul> <li class="toctree-l1"><a href="javascript: void(0);">Install Drill</a></li> <ul style="display: none"> <li class="toctree-l2"><a class="reference internal" href="/docs/install-drill-introduction/">Install Drill Introduction</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/migrating-parquet-data/">Migrating Parquet Data</a></li> <li class="toctree-l2"><a href="javascript: void(0);">Installing Drill in Embedded Mode</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/embedded-mode-prerequisites/">Embedded Mode Prerequisites</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/running-drill-on-docker/">Running Drill on Docker</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/installing-drill-on-linux-and-mac-os-x/">Installing Drill on Linux and Mac OS X</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/starting-drill-on-linux-and-mac-os-x/">Starting Drill on Linux and Mac OS X</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/installing-drill-on-windows/">Installing Drill on Windows</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/starting-drill-on-windows/">Starting Drill on Windows</a></li> </ul> <li class="toctree-l2"><a href="javascript: void(0);">Installing Drill in Distributed Mode</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/distributed-mode-prerequisites/">Distributed Mode Prerequisites</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/installing-drill-on-the-cluster/">Installing Drill on the Cluster</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/starting-drill-in-distributed-mode/">Starting Drill in Distributed Mode</a></li> </ul> <li class="toctree-l2"><a class="reference internal" href="/docs/starting-the-web-ui/">Starting the Web UI</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/stopping-drill/">Stopping Drill</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/identifying-multiple-drill-versions-in-a-cluster/">Identifying Multiple Drill Versions in a Cluster</a></li> <li class="toctree-l2"><a href="javascript: void(0);">Extended</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/installing-drill-in-distributed-mode-with-gcp-dataproc/">Installing Drill in Distributed Mode with GCP Dataproc</a></li> </ul> </ul> <li class="toctree-l1"><a href="javascript: void(0);">Configure Drill</a></li> <ul style="display: none"> <li class="toctree-l2"><a class="reference internal" href="/docs/configure-drill-introduction/">Configure Drill Introduction</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/configuring-drill-memory/">Configuring Drill Memory</a></li> <li class="toctree-l2"><a href="javascript: void(0);">Securing Drill</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/securing-drill-introduction/">Securing Drill Introduction</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/secure-communication-paths/">Secure Communication Paths</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/roles-and-privileges/">Roles and Privileges</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-user-impersonation/">Configuring User Impersonation</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-inbound-impersonation/">Configuring Inbound Impersonation</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-user-impersonation-with-hive-authorization/">Configuring User Impersonation with Hive Authorization</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-user-security/">Configuring User Security</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-plain-security/">Configuring Plain Security</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-ssl-tls-for-encryption/">Configuring SSL/TLS for Encryption</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/using-libpam4j-as-the-pam-authenticator/">Using libpam4j as the PAM Authenticator</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/using-jpam-as-the-pam-authenticator/">Using jpam as the PAM Authenticator</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-htpasswd-file-authentication/">Configuring htpasswd file authentication</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-hashicorp-vault-authentication/">Configuring HashiCorp Vault authentication</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-kerberos-security/">Configuring Kerberos Security</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-drill-to-use-spnego-for-http-authentication/">Configuring Drill to use SPNEGO for HTTP Authentication</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-web-ui-and-rest-api-security/">Configuring Web UI and REST API Security</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-custom-acls-to-secure-znodes/">Configuring Custom ACLs to Secure znodes</a></li> </ul> <li class="toctree-l2"><a href="javascript: void(0);">Configuring a Multitenant Cluster</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-a-multitenant-cluster-introduction/">Configuring a Multitenant Cluster Introduction</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-multitenant-resources/">Configuring Multitenant Resources</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-resources-for-a-shared-drillbit/">Configuring Resources for a Shared Drillbit</a></li> </ul> <li class="toctree-l2"><a href="javascript: void(0);">Configuration Options</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/configuration-options-introduction/">Configuration Options Introduction</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/start-up-options/">Start-Up Options</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/planning-and-execution-options/">Planning and Execution Options</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/persistent-configuration-storage/">Persistent Configuration Storage</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/global-query-list/">Global Query List</a></li> </ul> <li class="toctree-l2"><a class="reference internal" href="/docs/ports-and-bind-addresses-used-by-drill/">Ports and Bind Addresses Used by Drill</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/configuring-the-drill-shell/">Configuring the Drill Shell</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/configuring-cgroups-to-control-cpu-usage/">Configuring cgroups to Control CPU Usage</a></li> </ul> <li class="toctree-l1"><a href="javascript: void(0);">Connect a Data Source</a></li> <ul style="display: none"> <li class="toctree-l2"><a class="reference internal" href="/docs/connect-a-data-source-introduction/">Connect a Data Source Introduction</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/storage-plugin-registration/">Storage Plugin Registration</a></li> <li class="toctree-l2"><a href="javascript: void(0);">Storage Plugin Configuration</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/plugin-configuration-basics/">Plugin Configuration Basics</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-storage-plugins/">Configuring Storage Plugins</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/storage-plugin-authentication-modes/">Storage plugin authentication modes</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/workspaces/">Workspaces</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/drill-default-input-format/">Drill Default Input Format</a></li> </ul> <li class="toctree-l2"><a class="reference internal" href="/docs/file-system-storage-plugin/">File System Storage Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/hbase-storage-plugin/">HBase Storage Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/hive-storage-plugin/">Hive Storage Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/rdbms-storage-plugin/">RDBMS Storage Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/mongodb-storage-plugin/">MongoDB Storage Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/mapr-db-format/">MapR-DB Format</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/s3-storage-plugin/">S3 Storage Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/oci-os-storage-plugin/">OCI OS Storage Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/opentsdb-storage-plugin/">OpenTSDB Storage Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/kafka-storage-plugin/">Kafka Storage Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/azure-blob-storage-plugin/">Azure Blob Storage Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/druid-storage-plugin/">Druid Storage Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/http-storage-plugin/">HTTP Storage Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/elasticsearch-storage-plugin/">ElasticSearch Storage Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/splunk-storage-plugin/">Splunk Storage Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/cassandra-storage-plugin/">Cassandra Storage Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/dropbox-storage-plugin/">Dropbox Storage Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/google-sheets-storage-plugin/">Google Sheets Storage Plugin</a></li> </ul> <li class="toctree-l1"><a href="javascript: void(0);">ODBC/JDBC Interfaces</a></li> <ul style="display: none"> <li class="toctree-l2"><a class="reference internal" href="/docs/interfaces-introduction/">Interfaces Introduction</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/using-the-jdbc-driver/">Using the JDBC Driver</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/using-jdbc-with-squirrel-on-windows/">Using JDBC with SQuirreL on Windows</a></li> <li class="toctree-l2"><a href="javascript: void(0);">Installing the ODBC Driver</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/installing-the-driver-on-linux/">Installing the Driver on Linux</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/installing-the-driver-on-mac-os-x/">Installing the Driver on Mac OS X</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/installing-the-driver-on-windows/">Installing the Driver on Windows</a></li> </ul> <li class="toctree-l2"><a href="javascript: void(0);">Configuring ODBC</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/odbc-configuration-reference/">ODBC Configuration Reference</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/logging-and-tracing/">Logging and Tracing</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-odbc-on-linux/">Configuring ODBC on Linux</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-odbc-on-mac-os-x/">Configuring ODBC on Mac OS X</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-odbc-on-windows/">Configuring ODBC on Windows</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/testing-the-odbc-connection/">Testing the ODBC Connection</a></li> </ul> <li class="toctree-l2"><a href="javascript: void(0);">Using Drill Explorer</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/drill-explorer-introduction/">Drill Explorer Introduction</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/connecting-drill-explorer-to-data/">Connecting Drill Explorer to Data</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/browsing-data-and-defining-views/">Browsing Data and Defining Views</a></li> </ul> <li class="toctree-l2"><a href="javascript: void(0);">Using Drill with BI Tools</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/using-drill-with-bi-tools-introduction/">Using Drill with BI Tools Introduction</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/tableau-examples/">Tableau Examples</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/using-microstrategy-analytics-with-apache-drill/">Using MicroStrategy Analytics with Apache Drill</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/using-tibco-spotfire-desktop-with-drill/">Using Tibco Spotfire Desktop with Drill</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-tibco-spotfire-server-with-drill/">Configuring Tibco Spotfire Server with Drill</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/using-qlik-sense-with-drill/">Using Qlik Sense with Drill</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/using-apache-drill-with-tableau-10-2/">Using Apache Drill with Tableau 10.2</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/using-apache-drill-with-tableau-9-desktop/">Using Apache Drill with Tableau 9 Desktop</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/using-apache-drill-with-tableau-9-server/">Using Apache Drill with Tableau 9 Server</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/using-information-builders-webfocus-with-apache-drill/">Using Information Builders’ WebFOCUS with Apache Drill</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-jreport-with-drill/">Configuring JReport with Drill</a></li> </ul> </ul> <li class="toctree-l1"><a href="javascript: void(0);">Query Data</a></li> <ul style="display: none"> <li class="toctree-l2"><a class="reference internal" href="/docs/query-data-introduction/">Query Data Introduction</a></li> <li class="toctree-l2"><a href="javascript: void(0);">Querying a File System</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/querying-a-file-system-introduction/">Querying a File System Introduction</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/querying-avro-files/">Querying Avro Files</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/querying-json-files/">Querying JSON Files</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/querying-parquet-files/">Querying Parquet Files</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/querying-plain-text-files/">Querying Plain Text Files</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/querying-directories/">Querying Directories</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/querying-sequence-files/">Querying Sequence Files</a></li> </ul> <li class="toctree-l2"><a class="reference internal" href="/docs/querying-hbase/">Querying HBase</a></li> <li class="toctree-l2"><a href="javascript: void(0);">Querying Complex Data</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/querying-complex-data-introduction/">Querying Complex Data Introduction</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/sample-data-donuts/">Sample Data: Donuts</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/selecting-flat-data/">Selecting Flat Data</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/using-sql-functions-clauses-and-joins/">Using SQL Functions, Clauses, and Joins</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/selecting-nested-data-for-a-column/">Selecting Nested Data for a Column</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/selecting-multiple-columns-within-nested-data/">Selecting Multiple Columns Within Nested Data</a></li> </ul> <li class="toctree-l2"><a class="reference internal" href="/docs/querying-hive/">Querying Hive</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/querying-the-information-schema/">Querying the INFORMATION SCHEMA</a></li> <li class="toctree-l2"><a href="javascript: void(0);">Querying Indexes</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/querying-indexes-introduction/">Querying Indexes Introduction</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/queries-that-qualify-for-index-based-query-plans/">Queries that Qualify for Index-Based Query Plans</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/types-of-indexes/">Types of Indexes</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/index-selection/">Index Selection</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/designing-indexes-for-your-queries/">Designing Indexes for Your Queries</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/configuring-index-planning/">Configuring Index Planning</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/verifying-index-use/">Verifying Index Use</a></li> </ul> <li class="toctree-l2"><a class="reference internal" href="/docs/querying-system-tables/">Querying System Tables</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/monitoring-and-canceling-queries-in-the-drill-web-ui/">Monitoring and Canceling Queries in the Drill Web UI</a></li> </ul> <li class="toctree-l1 current_section "><a href="javascript: void(0);">Performance Tuning</a></li> <ul class="current_section"> <li class="toctree-l2"><a href="javascript: void(0);">Drill Metastore</a></li> <ul style=""> <li class="toctree-l3 current"><a class="reference internal" href="/docs/using-drill-metastore/">Using Drill Metastore</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/drill-iceberg-metastore/">Drill Iceberg Metastore</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/rdbms-metastore/">RDBMS Metastore</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/mongo-metastore/">Mongo Metastore</a></li> </ul> <li class="toctree-l2"><a class="reference internal" href="/docs/performance-tuning-introduction/">Performance Tuning Introduction</a></li> <li class="toctree-l2"><a href="javascript: void(0);">Partition Pruning</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/partition-pruning-introduction/">Partition Pruning Introduction</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/how-to-partition-data/">How to Partition Data</a></li> </ul> <li class="toctree-l2"><a class="reference internal" href="/docs/asynchronous-parquet-reader/">Asynchronous Parquet Reader</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/optimizing-parquet-metadata-reading/">Optimizing Parquet Metadata Reading</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/parquet-filter-pushdown/">Parquet Filter Pushdown</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/hive-metadata-caching/">Hive Metadata Caching</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/choosing-a-storage-format/">Choosing a Storage Format</a></li> <li class="toctree-l2"><a href="javascript: void(0);">Query Plans and Tuning</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/query-plans-and-tuning-introduction/">Query Plans and Tuning Introduction</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/join-planning-guidelines/">Join Planning Guidelines</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/guidelines-for-optimizing-aggregation/">Guidelines for Optimizing Aggregation</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/modifying-query-planning-options/">Modifying Query Planning Options</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/sort-based-and-hash-based-memory-constrained-operators/">Sort-Based and Hash-Based Memory-Constrained Operators</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/enabling-query-queuing/">Enabling Query Queuing</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/throttling/">Throttling</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/controlling-parallelization-to-balance-performance-with-multi-tenancy/">Controlling Parallelization to Balance Performance with Multi-Tenancy</a></li> </ul> <li class="toctree-l2"><a href="javascript: void(0);">Identifying Performance Issues</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/query-plans/">Query Plans</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/query-profiles/">Query Profiles</a></li> </ul> <li class="toctree-l2"><a href="javascript: void(0);">Performance Tuning Reference</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/query-profile-column-descriptions/">Query Profile Column Descriptions</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/physical-operators/">Physical Operators</a></li> </ul> <li class="toctree-l2"><a class="reference internal" href="/docs/monitoring-metrics/">Monitoring Metrics</a></li> </ul> <li class="toctree-l1"><a href="javascript: void(0);">Log and Debug</a></li> <ul style="display: none"> <li class="toctree-l2"><a class="reference internal" href="/docs/log-and-debug-introduction/">Log and Debug Introduction</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/error-messages/">Error Messages</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/modify-logback-xml/">Modify logback.xml</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/review-the-java-stack-trace/">Review the Java Stack Trace</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/query-audit-logging/">Query Audit Logging</a></li> </ul> <li class="toctree-l1"><a href="javascript: void(0);">SQL Reference</a></li> <ul style="display: none"> <li class="toctree-l2"><a class="reference internal" href="/docs/sql-reference-introduction/">SQL Reference Introduction</a></li> <li class="toctree-l2"><a href="javascript: void(0);">Data Types</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/supported-data-types/">Supported Data Types</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/date-time-and-timestamp/">Date, Time, and Timestamp</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/handling-different-data-types/">Handling Different Data Types</a></li> </ul> <li class="toctree-l2"><a class="reference internal" href="/docs/lexical-structure/">Lexical Structure</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/operators/">Operators</a></li> <li class="toctree-l2"><a href="javascript: void(0);">SQL Functions</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/about-sql-function-examples/">About SQL Function Examples</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/math-and-trig/">Math and Trig</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/statistical/">Statistical</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/data-type-conversion/">Data Type Conversion</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/data-type-functions/">Data Type Functions</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/date-time-functions-and-arithmetic/">Date/Time Functions and Arithmetic</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/string-manipulation/">String Manipulation</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/aggregate-and-aggregate-statistical/">Aggregate and Aggregate Statistical</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/functions-for-handling-nulls/">Functions for Handling Nulls</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/phonetic-functions/">Phonetic Functions</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/string-distance-functions/">String Distance Functions</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/cryptography-functions/">Cryptography Functions</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/sql-dialect-compatibility-functions/">SQL dialect compatibility functions</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/gis-functions/">GIS functions</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/time-series-analysis-functions/">Time Series Analysis Functions</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/ip-networking-functions/">IP Networking functions</a></li> </ul> <li class="toctree-l2"><a href="javascript: void(0);">SQL Window Functions</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/sql-window-functions-introduction/">SQL Window Functions Introduction</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/aggregate-window-functions/">Aggregate Window Functions</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/ranking-window-functions/">Ranking Window Functions</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/value-window-functions/">Value Window Functions</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/sql-window-functions-examples/">SQL Window Functions Examples</a></li> </ul> <li class="toctree-l2"><a href="javascript: void(0);">Nested Data Functions</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/nested-data-limitations/">Nested Data Limitations</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/flatten/">FLATTEN</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/kvgen/">KVGEN</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/repeated-count/">REPEATED_COUNT</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/repeated-contains/">REPEATED_CONTAINS</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/list-creation-functions/">COLLECT_LIST</a></li> </ul> <li class="toctree-l2"><a class="reference internal" href="/docs/query-directory-functions/">Query Directory Functions</a></li> <li class="toctree-l2"><a href="javascript: void(0);">SQL Commands</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/supported-sql-commands/">Supported SQL Commands</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/analyze-table-refresh-metadata/">ANALYZE TABLE REFRESH METADATA</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/analyze-table-compute-statistics/">ANALYZE TABLE COMPUTE STATISTICS</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/set/">SET</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/refresh-table-metadata/">REFRESH TABLE METADATA</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/reset/">RESET</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/alter-system/">ALTER SYSTEM</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/create-or-replace-schema/">CREATE OR REPLACE SCHEMA</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/create-table-as-ctas/">CREATE TABLE AS (CTAS)</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/create-temporary-table-as-cttas/">CREATE TEMPORARY TABLE AS (CTTAS)</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/create-function-using-jar/">CREATE FUNCTION USING JAR</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/partition-by-clause/">PARTITION BY Clause</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/create-view/">CREATE VIEW</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/describe/">DESCRIBE</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/drop-function-using-jar/">DROP FUNCTION USING JAR</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/drop-table/">DROP TABLE</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/drop-view/">DROP VIEW</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/explain/">EXPLAIN</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/lateral-join/">LATERAL Join</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/select/">SELECT</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/select-list/">SELECT List</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/from-clause/">FROM Clause</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/group-by-clause/">GROUP BY Clause</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/having-clause/">HAVING Clause</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/limit-clause/">LIMIT Clause</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/offset-clause/">OFFSET Clause</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/order-by-clause/">ORDER BY Clause</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/set-operators/">Set Operators</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/where-clause/">WHERE Clause</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/with-clause/">WITH Clause</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/show-databases-and-show-schemas/">SHOW DATABASES and SHOW SCHEMAS</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/show-files/">SHOW FILES</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/show-tables/">SHOW TABLES</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/use/">USE</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/pivot-operators/">Pivot Operators</a></li> </ul> <li class="toctree-l2"><a href="javascript: void(0);">SQL Conditional Expressions</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/case/">CASE</a></li> </ul> <li class="toctree-l2"><a class="reference internal" href="/docs/reserved-keywords/">Reserved Keywords</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/sql-extensions/">SQL Extensions</a></li> </ul> <li class="toctree-l1"><a href="javascript: void(0);">Data Sources and File Formats</a></li> <ul style="display: none"> <li class="toctree-l2"><a class="reference internal" href="/docs/data-sources-and-file-formats-introduction/">Data Sources and File Formats Introduction</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/hive-to-drill-data-type-mapping/">Hive-to-Drill Data Type Mapping</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/deploying-and-using-a-hive-udf/">Deploying and Using a Hive UDF</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/parquet-format/">Parquet Format</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/logfile-plugin/">Logfile Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/json-data-model/">JSON Data Model</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/text-files-csv-tsv-psv/">Text Files: CSV, TSV, PSV</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/sequence-files/">Sequence Files</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/httpd-format-plugin/">HTTPD Format Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/image-metadata-format-plugin/">Image Metadata Format Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/syslog-format-plugin/">Syslog Format Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/ltsv-format-plugin/">LTSV Format Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/spss-format-plugin/">SPSS Format Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/esri-shapefile-format-plugin/">ESRI Shapefile Format Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/excel-format-plugin/">Excel Format Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/hdf5-format-plugin/">HDF5 Format Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/ms-access-format-plugin/">Microsoft Access Format Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/xml-format-plugin/">XML Format Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/iceberg-format-plugin/">Iceberg Format Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/delta-lake-format-plugin/">Delta Lake Format Plugin</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/sas-format-plugin/">SAS Format Plugin</a></li> </ul> <li class="toctree-l1"><a href="javascript: void(0);">Develop Custom Functions</a></li> <ul style="display: none"> <li class="toctree-l2"><a class="reference internal" href="/docs/develop-custom-functions-introduction/">Develop Custom Functions Introduction</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/developing-a-simple-function/">Developing a Simple Function</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/tutorial-develop-a-simple-function/">Tutorial: Develop a Simple Function</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/developing-an-aggregate-function/">Developing an Aggregate Function</a></li> <li class="toctree-l2"><a href="javascript: void(0);">Adding Custom Functions to Drill</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/adding-custom-functions-to-drill-introduction/">Adding Custom Functions to Drill Introduction</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/manually-adding-custom-functions-to-drill/">Manually Adding Custom Functions to Drill</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/creating-custom-authenticators/">Creating Custom Authenticators</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/dynamic-udfs/">Dynamic UDFs</a></li> </ul> <li class="toctree-l2"><a class="reference internal" href="/docs/using-custom-functions-in-queries/">Using Custom Functions in Queries</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/custom-function-interfaces/">Custom Function Interfaces</a></li> </ul> <li class="toctree-l1"><a class="reference internal" href="/docs/troubleshooting/">Troubleshooting</a></li> <li class="toctree-l1"><a href="javascript: void(0);">Developer Information</a></li> <ul style="display: none"> <li class="toctree-l2"><a href="javascript: void(0);">REST API</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/rest-api-introduction/">REST API Introduction</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/submitting-queries-from-the-rest-api-when-impersonation-is-enabled-and-authentication-is-disabled/">Submitting Queries from the REST API when Impersonation is Enabled and Authentication is Disabled</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/use-postman-to-run-sql-queries-on-drill-data-sources/">Use Postman to Run SQL Queries on Drill Data Sources</a></li> </ul> <li class="toctree-l2"><a href="javascript: void(0);">Develop Drill</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/compiling-drill-from-source/">Compiling Drill from Source</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/useful-information-for-drill-developers/">Useful Information for Drill Developers</a></li> </ul> <li class="toctree-l2"><a href="javascript: void(0);">Contribute to Drill</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/apache-drill-contribution-guidelines/">Apache Drill Contribution Guidelines</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/apache-drill-contribution-ideas/">Apache Drill Contribution Ideas</a></li> </ul> <li class="toctree-l2"><a href="javascript: void(0);">Design Docs</a></li> <ul style="display: none"> <li class="toctree-l3"><a class="reference internal" href="/docs/drill-plan-syntax/">Drill Plan Syntax</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/rpc-overview/">RPC Overview</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/query-stages/">Query Stages</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/useful-research/">Useful Research</a></li> <li class="toctree-l3"><a class="reference internal" href="/docs/value-vectors/">Value Vectors</a></li> </ul> </ul> <li class="toctree-l1"><a href="javascript: void(0);">Release Notes</a></li> <ul style="display: none"> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-21-2-release-notes/">Apache Drill 1.21.2 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-21-1-release-notes/">Apache Drill 1.21.1 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-21-0-release-notes/">Apache Drill 1.21.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-20-3-release-notes/">Apache Drill 1.20.3 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-20-2-release-notes/">Apache Drill 1.20.2 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-20-1-release-notes/">Apache Drill 1.20.1 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-20-0-release-notes/">Apache Drill 1.20.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-19-0-release-notes/">Apache Drill 1.19.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-18-0-release-notes/">Apache Drill 1.18.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-17-0-release-notes/">Apache Drill 1.17.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-16-0-release-notes/">Apache Drill 1.16.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-15-0-release-notes/">Apache Drill 1.15.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-14-0-release-notes/">Apache Drill 1.14.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-13-0-release-notes/">Apache Drill 1.13.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-12-0-release-notes/">Apache Drill 1.12.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-11-0-release-notes/">Apache Drill 1.11.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-10-0-release-notes/">Apache Drill 1.10.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-9-0-release-notes/">Apache Drill 1.9.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-8-0-release-notes/">Apache Drill 1.8.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-7-0-release-notes/">Apache Drill 1.7.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-6-0-release-notes/">Apache Drill 1.6.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-5-0-release-notes/">Apache Drill 1.5.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-4-0-release-notes/">Apache Drill 1.4.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-3-0-release-notes/">Apache Drill 1.3.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-2-0-release-notes/">Apache Drill 1.2.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-1-0-release-notes/">Apache Drill 1.1.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-1-0-0-release-notes/">Apache Drill 1.0.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-0-9-0-release-notes/">Apache Drill 0.9.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-0-8-0-release-notes/">Apache Drill 0.8.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-0-7-0-release-notes/">Apache Drill 0.7.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-0-6-0-release-notes/">Apache Drill 0.6.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-0-5-0-release-notes/">Apache Drill 0.5.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-0-4-0-release-notes/">Apache Drill 0.4.0 Release Notes</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/apache-drill-m1-release-notes-apache-drill-alpha/">Apache Drill M1 Release Notes (Apache Drill Alpha)</a></li> </ul> <li class="toctree-l1"><a href="javascript: void(0);">Sample Datasets</a></li> <ul style="display: none"> <li class="toctree-l2"><a class="reference internal" href="/docs/enron-emails/">Enron Emails</a></li> <li class="toctree-l2"><a class="reference internal" href="/docs/wikipedia-edit-history/">Wikipedia Edit History</a></li> </ul> <li class="toctree-l1"><a class="reference internal" href="/docs/project-bylaws/">Project Bylaws</a></li> <li class="toctree-l1"><a href="javascript: void(0);">Ecosystem</a></li> <ul style="display: none"> <li class="toctree-l2"><a class="reference internal" href="/docs/using-saiku-analytics-with-apache-drill/">Using Saiku Analytics with Apache Drill</a></li> </ul> </ul> </div> </div> </aside> <nav class="breadcrumbs"> <li><a href="/docs/">Docs</a></li> <li><a href="/docs/performance-tuning/">Performance Tuning</a></li> <li><a href="/docs/drill-metastore/">Drill Metastore</a></li> <li>Using Drill Metastore</li> </nav> <div class="main-content-wrapper"> <div class="main-content"> <a class="edit-link" href="https://github.com/apache/drill-site/blob/master/_docs/en/performance-tuning/drill-metastore/010-using-drill-metastore.md" target="_blank"><i class="fa fa-pencil-square-o"></i></a> <div class="int_title left"> <h1>Using Drill Metastore</h1> </div> <!-- jt: we don't need to display a last-modified date on each page to users --> <div class="int_text" align="left"> <p>Drill 1.17 introduces the Drill Metastore which stores the table schema and table statistics. Statistics allow Drill to better create optimal query plans.</p> <p>The Metastore is a beta feature and is subject to change. In particular, the SQL commands and Metastore format may change based on your experience and feedback.</p> <div class="admonition note"> <p class="first admonition-title">Note</p> <p class="last"> In Drill 1.17, Metastore supports only tables in Parquet format. The feature is disabled by default. In Drill 1.18, Metastore supports all format plugins (except MaprDB) for the file system plugin. The feature is still disabled by default. </p> </div> <h2 id="drill-metastore-introduction">Drill Metastore introduction</h2> <p>One of the main advantages of Drill is schema-on-read. But Drill can’t handle some cases with this approach, there are the issues related to schema evolution or ambiguous schema.</p> <p>Significant benefits of schema-aware execution:</p> <ul> <li>At Planning time: <ul> <li>Better scope for planning optimizations.</li> <li>Proper estimation of column widths since types are known, hence more accurate costing.</li> <li>Graceful early exit if certain data type validations fail.</li> </ul> </li> <li>At Runtime: <ul> <li><code class="language-plaintext highlighter-rouge">SchemaChange</code> exceptions avoidance. All minor fragments will have a common understanding of the schema.</li> </ul> </li> </ul> <p>Reading the data along with its statistics metadata helps to build more efficient plans and optimize query execution:</p> <ul> <li>Crucial for optimal join planning, 2-phase aggregation vs 1-phase aggregation planning, selectivity estimation of filter conditions, parallelization decisions.</li> </ul> <p>Taking into account the above points, existing query processing can be improved by:</p> <ul> <li>storing table schema and reusing it;</li> <li>collecting, storing and reusing table statistics to improve query planning.</li> </ul> <p>One of the main steps to resolve all these goals is providing the framework for metadata management named hereafter as Drill Metastore.</p> <h2 id="enabling-drill-metastore">Enabling Drill Metastore</h2> <p>To use the Drill Metastore, you must enable it at the session or system level with one of the following commands:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SET `metastore.enabled` = true; ALTER SYSTEM SET `metastore.enabled` = true; </code></pre></div></div> <p>Alternatively, you can enable the option in the Drill Web UI at <code class="language-plaintext highlighter-rouge">http://<drill-hostname-or-ip-address>:8047/options</code>.</p> <h2 id="computing-and-storing-table-metadata-to-drill-metastore">Computing and storing table metadata to Drill Metastore</h2> <p>Once you enable the Metastore, the next step is to populate it with data. Metastore entries are optional. If you query a table without a Metastore entry, Drill works with that table just as if the Metastore was disabled. In Drill, only add data to the Metastore when doing so improves query performance. In general, large tables benefit from statistics more than small tables do.</p> <p>Unlike Hive, Drill does not require you to declare a schema. Instead, Drill infers the schema by scanning your table in the same way as it is done during regular select and computes some metadata like <code class="language-plaintext highlighter-rouge">MIN</code> / <code class="language-plaintext highlighter-rouge">MAX</code> column values and <code class="language-plaintext highlighter-rouge">NULLS_COUNT</code> designated as “metadata” to be able to produce more optimizations like filter push-down, etc. If <code class="language-plaintext highlighter-rouge">planner.statistics.use</code> option is enabled, this command will also calculate and store table statistics into Drill Metastore.</p> <h2 id="configuration">Configuration</h2> <p>To configure the Metastore, create <code class="language-plaintext highlighter-rouge">$DRILL_HOME/conf/drill-metastore-override.conf</code> file. This is a HOCON-format file, just like <code class="language-plaintext highlighter-rouge">drill-override.conf</code>. All Metastore configuration properties should reside in <code class="language-plaintext highlighter-rouge">drill.metastore</code> namespace.</p> <h3 id="metastore-implementations">Metastore Implementations</h3> <p>Drill Metastore offers an API that allows for any number of implementations. See <a href="https://github.com/apache/drill/blob/master/metastore/metastore-api/README.md">metastore-api module docs</a> for a description of the API.</p> <p>The default implementation is the <a href="/docs/drill-iceberg-metastore">Iceberg Metastore</a> based on <a href="http://iceberg.incubator.apache.org">Iceberg tables</a> that provides support of transactions and concurrent writes. It resides on the file system specified in Metastore configuration.</p> <p>To specify custom Metastore implementation, place the JAR which has the implementation of <code class="language-plaintext highlighter-rouge">org.apache.drill.metastore.Metastore</code> interface into classpath and indicate custom class in the <code class="language-plaintext highlighter-rouge">drill.metastore.implementation.class</code> config property. The default value is the following:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>drill.metastore: { implementation.class: "org.apache.drill.metastore.iceberg.IcebergMetastore" } </code></pre></div></div> <h3 id="metastore-components">Metastore Components</h3> <p>The Drill 1.17 version of the Metastore stores metadata about tables: the table schema and table statistics. The Metastore is an active subproject of Drill, See <a href="https://issues.apache.org/jira/browse/DRILL-6552">DRILL-6552</a> for more information.</p> <h3 id="table-metadata">Table Metadata</h3> <p>Table Metadata includes the following info:</p> <ul> <li>Table schema, column name, type, nullability, scale and precision if available, and other info. For details please refer to <a href="/docs/create-or-replace-schema/#usage-notes">Schema provisioning</a>.</li> <li>Table statistics are of two kinds: <ul> <li>Summary statistics: <code class="language-plaintext highlighter-rouge">MIN</code>, <code class="language-plaintext highlighter-rouge">MAX</code>, <code class="language-plaintext highlighter-rouge">NULL count</code>, etc.</li> <li>Detail statistics: histograms, <code class="language-plaintext highlighter-rouge">NDV</code>, etc.</li> </ul> </li> </ul> <p>Schema information and summary statistics also computed and stored for table segments, files, row groups and partitions.</p> <p>The detailed metadata schema is described <a href="https://github.com/apache/drill/tree/master/metastore/metastore-api#metastore-tables">here</a>. You can try out the metadata to get a sense of what is available, by using the <a href="#inspect-the-metastore-using-information_schema-tables">Inspect the Metastore using <code class="language-plaintext highlighter-rouge">INFORMATION_SCHEMA</code> tables</a> tutorial.</p> <p>Every table described by the Metastore may be a bare file or one or more files that reside in one or more directories.</p> <p>If a table consists of a single directory or file, then it is non-partitioned. The single directory can contain any number of files. Larger tables tend to have subdirectories. Each subdirectory is a partition and such a table are called “partitioned”. Please refer to <a href="#exposing-drill-metastore-metadata-through-information_schema-tables">Exposing Drill Metastore metadata through <code class="language-plaintext highlighter-rouge">INFORMATION_SCHEMA</code> tables</a> for information, how to query partitions and segments metadata.</p> <p>A traditional database divides tables into schemas and tables. Drill can connect to any number of data sources, each of which may have its own schema. As a result, the Metastore labels tables with a combination of (plugin configuration name, workspace name, table name). Note that if before renaming any of these items, you must delete table’s Metadata entry and recreate it after renaming.</p> <h3 id="using-schema-provisioning-feature-with-drill-metastore">Using schema provisioning feature with Drill Metastore</h3> <p>The Drill Metastore holds both schema and statistics information for a table. The <code class="language-plaintext highlighter-rouge">ANALYZE</code> command can infer the table schema for well-defined tables (such as many Parquet tables). Some tables are too complex or variable for Drill’s schema inference to work well. For example, JSON tables often omit fields or have long runs of nulls so that Drill cannot determine column types. In these cases, you can specify the correct schema based on your knowledge of the table’s structure. You specify a schema in the <code class="language-plaintext highlighter-rouge">ANALYZE</code> command using the <a href="/docs/plugin-configuration-basics/#specifying-the-schema-as-table-function-parameter">Schema provisioning</a> syntax.</p> <p>Please refer to <a href="#provisioning-schema-for-drill-metastore">Provisioning schema for Drill Metastore</a> for examples of usage.</p> <h3 id="schema-priority">Schema priority</h3> <p>Drill uses metadata during both query planning and execution. Drill gives you multiple ways to provide a schema.</p> <p>When you run the <code class="language-plaintext highlighter-rouge">ANALYZE TABLE</code> command, Drill will use the following rules for the table schema to be stored in the Metastore. In priority order:</p> <ul> <li>A schema provided in the table function.</li> <li>A schema file, created with <code class="language-plaintext highlighter-rouge">CREATE OR REPLACE SCHEMA</code>, in the table root directory.</li> <li>Schema inferred from file data.</li> </ul> <p>To plan a query, Drill requires information about your file partitions (if any) and about row and column cardinality. Drill does not use the provided schema for planning as it does not provide this metadata. Instead, at plan time Drill obtains metadata from one of the following, again in priority order:</p> <ul> <li>The Drill Metastore, if available.</li> <li>Inferred from file data. Drill scans the table’s directory structure to identify partitions. Drill estimates row counts based on the file size. Drill uses default estimates for column cardinality.</li> </ul> <p>At query execution time, a schema tells Drill the shape of your data and how that data should be converted to Drill’s SQL types. Your choices for execution-time schema, in priority order, are:</p> <ul> <li>With a table function: <ul> <li>specify an inline schema</li> <li>specify the path to the schema file.</li> </ul> </li> <li>With a schema file, created with <code class="language-plaintext highlighter-rouge">CREATE OR REPLACE SCHEMA</code>, in the table root directory.</li> <li>Using the schema from the Drill Metastore, if available.</li> <li>Infer the schema directly from file data.</li> </ul> <h3 id="related-sessionsystem-options">Related Session/System Options</h3> <p>The Metastore provides a number of options to fit your environment. The default options are fine in most cases. The options are set via <code class="language-plaintext highlighter-rouge">ALTER SYSTEM SET</code>, <code class="language-plaintext highlighter-rouge">SET</code> (it is an alias for <code class="language-plaintext highlighter-rouge">ALTER SESSION SET</code>) or the Drill Web console.</p> <p>In general, admin should set the options via <code class="language-plaintext highlighter-rouge">ALTER SYSTEM</code> so that they take effect for all users. Setting options at the session level is an advanced topic.</p> <ul> <li><strong>metastore.enabled</strong> Enables Drill Metastore usage to be able to store table metadata during ANALYZE TABLE commands execution and to be able to read table metadata during regular queries execution or when querying some INFORMATION_SCHEMA tables. Default is <code class="language-plaintext highlighter-rouge">false</code>.</li> <li><strong>metastore.metadata.store.depth_level</strong> Specifies the most-specific metadata kind to be collected with more general metadata kinds. Same options as the <em>level</em> option above. Default is <code class="language-plaintext highlighter-rouge">'ALL'</code>.</li> <li><strong>metastore.retrieval.retry_attempts</strong> If you run the <code class="language-plaintext highlighter-rouge">ANALYZE TABLE</code> command at the same time as queries run, then the query can read incorrect or corrupt statistics. Drill will reload statistics and replan the query. This option specifies the maximum number of retry attempts. Default is <code class="language-plaintext highlighter-rouge">5</code>.</li> <li><strong>metastore.metadata.fallback_to_file_metadata</strong> Allows using <a href="/docs/refresh-table-metadata">file metadata cache</a> for the case when required metadata is absent in the Metastore. Default is <code class="language-plaintext highlighter-rouge">true</code>.</li> <li><strong>metastore.metadata.use_schema</strong> The <code class="language-plaintext highlighter-rouge">ANALYZE TABLE</code> command infers table schema as it gathers statistics. This option tells Drill to use that schema information while planning the query. Disable this option if Drill has inferred the schema incorrectly, or schema will be provided separately (see <a href="/docs/create-or-replace-schema">CREATE OR REPLACE SCHEMA</a>). Default is <code class="language-plaintext highlighter-rouge">true</code>.</li> <li><strong>metastore.metadata.use_statistics</strong> Enables the Drill query planner to use table and column statistics stored in the Metastore. Default is <code class="language-plaintext highlighter-rouge">true</code>. Enable <code class="language-plaintext highlighter-rouge">planner.statistics.use</code> to be able to use statistics during query planning.</li> <li><strong>drill.exec.storage.implicit.last_modified_time.column.label</strong> Sets the implicit column name for the last modified time (<code class="language-plaintext highlighter-rouge">lmt</code>) column. Used when producing Metastore analyze. You can set the last modified time column name to custom name when current column name clashes which column name present in the table. If your table contains a column name with the same name as an implicit column, the implicit column takes priority and shadows column from the table. Default is <code class="language-plaintext highlighter-rouge">lmt</code>.</li> <li><strong>drill.exec.storage.implicit.row_group_index.column.label</strong> Sets the implicit column name for the row group index (<code class="language-plaintext highlighter-rouge">rgi</code>) column. Used when producing Metastore analyze. You can set row group index column name to custom name when current column name clashes which column name present in the table. If your table contains a column name with the same name as an implicit column, the implicit column takes priority and shadows column from the table. Default is <code class="language-plaintext highlighter-rouge">rgi</code>.</li> <li><strong>drill.exec.storage.implicit.row_group_length.column.label</strong> Sets the implicit column name for the row group length (<code class="language-plaintext highlighter-rouge">rgl</code>) column. Used when producing Metastore analyze. You can set row group length column name to custom name when current column name clashes which column name present in the table. If your table contains a column name with the same name as an implicit column, the implicit column takes priority and shadows column from the table. Default is <code class="language-plaintext highlighter-rouge">rgl</code>.</li> <li><strong>drill.exec.storage.implicit.row_group_start.column.label</strong> Sets the implicit column name for the row group start (<code class="language-plaintext highlighter-rouge">rgs</code>) column. Used when producing Metastore analyze. You can set row group start column name to custom name when current column name clashes which column name present in the table. If your table contains a column name with the same name as an implicit column, the implicit column takes priority and shadows column from the table. Default is <code class="language-plaintext highlighter-rouge">rgs</code>.</li> </ul> <h2 id="analyzing-a-table">Analyzing a table</h2> <p>You create Metastore metadata by running the <a href="/docs/analyze-table-refresh-metadata"><code class="language-plaintext highlighter-rouge">ANALYZE TABLE</code></a> command. The first time you run it, the Metastore will infer the schema and (depending on which options you have selected), populate statistics.</p> <p>Tables change over time. To keep the Metastore metadata up-to-date, you must periodically run <code class="language-plaintext highlighter-rouge">ANALYZE TABLE</code> again on each changed table. When you do <code class="language-plaintext highlighter-rouge">ANALYZE TABLE</code> a second time, Drill will attempt to update statistics, called “incremental analysis”.</p> <p>Incremental analysis will compute metadata only for files and partitions changed since the last analysis and reuse actual metadata from the Metastore where possible.</p> <p>The command will return the following message if table statistics are up-to-date:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ANALYZE TABLE `lineitem` REFRESH METADATA; |-------|---------------------------------------------------------| | ok | summary | |-------|---------------------------------------------------------| | false | Table metadata is up to date, analyze wasn't performed. | |-------|---------------------------------------------------------| </code></pre></div></div> <p>Table schemas evolve over time. If your table adds (or removes) columns, run <code class="language-plaintext highlighter-rouge">ANALYZE TABLE</code> with the new set of columns. Drill will perform a full table analysis.</p> <h2 id="general-information">General Information</h2> <h3 id="metadata-usage">Metadata usage</h3> <p>Drill uses the Metastore in several places. When you run a query with multiple directories, files or Parquet row groups, Drill will use statistics to “prune” the scan. That is, to identify those directories, files or row groups that do not contain data that your query needs. If you add new files or directories and do not rerun <code class="language-plaintext highlighter-rouge">ANALYZE TABLE</code>, then Drill will assume that existing metadata is invalid and wouldn’t use it. Periodically rerun <code class="language-plaintext highlighter-rouge">ANALYZE TABLE</code> so that Drill can use table metadata when possible.</p> <h3 id="exposing-drill-metastore-metadata-through-information_schema-tables">Exposing Drill Metastore metadata through <code class="language-plaintext highlighter-rouge">INFORMATION_SCHEMA</code> tables</h3> <p>Drill exposes some Metastore tables metadata through <code class="language-plaintext highlighter-rouge">INFORMATION_SCHEMA</code> tables. Note, that Metastore metadata will be exposed to the <code class="language-plaintext highlighter-rouge">INFORMATION_SCHEMA</code> only if Metastore is enabled. If it is disabled, info tables won’t contain Metastore metadata.</p> <p><code class="language-plaintext highlighter-rouge">TABLES</code> table includes the set of tables on which you have run <code class="language-plaintext highlighter-rouge">ANALYZE TABLE</code>. Description of Metastore-specific columns:</p> <table> <thead> <tr> <th>Column name</th> <th>Type</th> <th>Nullable</th> <th>Description</th> </tr> </thead> <tbody> <tr> <td><code class="language-plaintext highlighter-rouge">TABLE_SOURCE</code></td> <td>VARCHAR</td> <td>YES</td> <td>Table data type: <code class="language-plaintext highlighter-rouge">PARQUET</code>, <code class="language-plaintext highlighter-rouge">CSV</code>, <code class="language-plaintext highlighter-rouge">JSON</code></td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">LOCATION</code></td> <td>VARCHAR</td> <td>YES</td> <td>Table location: <code class="language-plaintext highlighter-rouge">/tmp/nation</code></td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">NUM_ROWS</code></td> <td>BIGINT</td> <td>YES</td> <td>Total number of rows in all files of the table. Null if not known</td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">LAST_MODIFIED_TIME</code></td> <td>TIMESTAMP</td> <td>YES</td> <td>Timestamp of the most-recently modified file within the table. Updated on each <code class="language-plaintext highlighter-rouge">ANALYZE TABLE</code> run.</td> </tr> </tbody> </table> <p>The <code class="language-plaintext highlighter-rouge">COLUMNS</code> table describes the columns within each table. Only those columns listed in the <code class="language-plaintext highlighter-rouge">COLUMNS</code> clause of the <code class="language-plaintext highlighter-rouge">ANALYZE TABLE</code> statement appear in this table.</p> <table> <thead> <tr> <th>Column name</th> <th>Type</th> <th>Nullable</th> <th>Description</th> </tr> </thead> <tbody> <tr> <td><code class="language-plaintext highlighter-rouge">COLUMN_DEFAULT</code></td> <td>VARCHAR</td> <td>YES</td> <td>Column default value.</td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">COLUMN_FORMAT</code></td> <td>VARCHAR</td> <td>YES</td> <td>Usually applicable for date time columns: <code class="language-plaintext highlighter-rouge">yyyy-MM-dd</code>. See <a href="/docs/create-or-replace-schema/#format-for-date-time-conversion">Format for Date, Time Conversion</a>.</td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">NUM_NULLS</code></td> <td>BIGINT</td> <td>YES</td> <td>Number of rows which contain nulls for this column.</td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">MIN_VAL</code></td> <td>VARCHAR</td> <td>YES</td> <td>Minimum value of the column. For example: <code class="language-plaintext highlighter-rouge">'-273'</code>.</td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">MAX_VAL</code></td> <td>VARCHAR</td> <td>YES</td> <td>Maximum value of the column. For example: <code class="language-plaintext highlighter-rouge">'100500'</code>.</td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">NDV</code></td> <td>FLOAT8</td> <td>YES</td> <td>Number of distinct values in column.</td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">EST_NUM_NON_NULLS</code></td> <td>FLOAT8</td> <td>YES</td> <td>Estimated number of non null values.</td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">IS_NESTED</code></td> <td>BIT</td> <td>NO</td> <td>If column is nested. Nested columns are extracted from columns with struct type.</td> </tr> </tbody> </table> <p>A table can be divided into directories, called “partitions”. The <code class="language-plaintext highlighter-rouge">PARTITIONS</code> table contains an entry for each directory.</p> <table> <thead> <tr> <th>Column name</th> <th>Type</th> <th>Nullable</th> <th>Description</th> </tr> </thead> <tbody> <tr> <td><code class="language-plaintext highlighter-rouge">TABLE_CATALOG</code></td> <td>VARCHAR</td> <td>YES</td> <td>Table catalog (currently we have only one catalog): <code class="language-plaintext highlighter-rouge">DRILL</code>.</td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">TABLE_SCHEMA</code></td> <td>VARCHAR</td> <td>YES</td> <td>Table schema: <code class="language-plaintext highlighter-rouge">dfs.tmp</code>.</td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">TABLE_NAME</code></td> <td>VARCHAR</td> <td>YES</td> <td>Table name: <code class="language-plaintext highlighter-rouge">nation</code>.</td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">METADATA_KEY</code></td> <td>VARCHAR</td> <td>YES</td> <td>Top level segment key, the same for all nested segments and partitions: <code class="language-plaintext highlighter-rouge">part_int=3</code>.</td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">METADATA_TYPE</code></td> <td>VARCHAR</td> <td>YES</td> <td><code class="language-plaintext highlighter-rouge">SEGMENT</code> or <code class="language-plaintext highlighter-rouge">PARTITION</code>. Partition here corresponds to “Drill partition”, though segment corresponds to data parts like partitions in general case, for example, Hive partition.</td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">METADATA_IDENTIFIER</code></td> <td>VARCHAR</td> <td>YES</td> <td>Current metadata identifier: <code class="language-plaintext highlighter-rouge">part_int=3/part_varchar=g</code>. It is unique value for segment or partition within the table.</td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">PARTITION_COLUMN</code></td> <td>VARCHAR</td> <td>YES</td> <td>Partition column name: <code class="language-plaintext highlighter-rouge">part_varchar</code>.</td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">PARTITION_VALUE</code></td> <td>VARCHAR</td> <td>YES</td> <td>Partition column value: <code class="language-plaintext highlighter-rouge">g</code>.</td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">LOCATION</code></td> <td>VARCHAR</td> <td>YES</td> <td>Segment location, <code class="language-plaintext highlighter-rouge">null</code> for partitions: <code class="language-plaintext highlighter-rouge">/tmp/nation/part_int=3</code>.</td> </tr> <tr> <td><code class="language-plaintext highlighter-rouge">LAST_MODIFIED_TIME</code></td> <td>TIMESTAMP</td> <td>YES</td> <td>Last modification time.</td> </tr> </tbody> </table> <h3 id="limitations-of-the-117-release">Limitations of the 1.17 release</h3> <ul> <li>Applies to tables stored as Parquet files and only when stored in the <code class="language-plaintext highlighter-rouge">DFS</code> storage plugin.</li> <li>Disabled by default. You must enable this feature through the <code class="language-plaintext highlighter-rouge">metastore.enabled</code> system/session option.</li> </ul> <h3 id="limitations-of-the-118-release">Limitations of the 1.18 release</h3> <ul> <li>Applies to all file system storage plugin formats except for MaprDB.</li> </ul> <h3 id="cheat-sheet-of-analyze-table-commands">Cheat sheet of <code class="language-plaintext highlighter-rouge">ANALYZE TABLE</code> commands</h3> <ul> <li>Add a new table with <code class="language-plaintext highlighter-rouge">ANALYZE TABLE dfs.tmp.lineitem REFRESH METADATA</code> command.</li> <li>When table data (but not schema) changes, run <code class="language-plaintext highlighter-rouge">ANALYZE TABLE dfs.tmp.lineitem REFRESH METADATA</code> command.</li> <li>When the table schema changes, run <code class="language-plaintext highlighter-rouge">ANALYZE TABLE dfs.tmp.lineitem COLUMNS (col1, col2, ...) REFRESH METADATA</code> command.</li> <li>If partitions are added or removed, run <code class="language-plaintext highlighter-rouge">ANALYZE TABLE dfs.tmp.lineitem REFRESH METADATA</code> command.</li> <li>Remove table metadata by submitting <code class="language-plaintext highlighter-rouge">ANALYZE TABLE dfs.tmp.lineitem DROP METADATA</code> command.</li> </ul> <h2 id="tutorial">Tutorial</h2> <p>Examples throughout this topic use the files and directories described in the following section <code class="language-plaintext highlighter-rouge">Directory and File Setup</code>.</p> <h3 id="directory-and-file-setup">Directory and File Setup</h3> <p>The following examples are written for local file system, but Drill Metastore supports collecting metadata for tables placed in any any of Drill’s supported file systems. The examples work for both embedded and distributed Drill modes.</p> <p>Obtain an SF=1 TPC-H dataset, either by running a TPC-H data generator locally or by downloading generated TPC-H data from some trusted source and place the dataset in a filesystem visible to Drill. Set up storage plugin for desired file system, as described here: <a href="/docs/file-system-storage-plugin/#connecting-drill-to-a-file-system">Connecting Drill to a File System</a>.</p> <p>Create lineitem directory in <code class="language-plaintext highlighter-rouge">/tmp/</code> and two subdirectories under <code class="language-plaintext highlighter-rouge">/tmp/lineitem</code> named <code class="language-plaintext highlighter-rouge">s1</code> and <code class="language-plaintext highlighter-rouge">s2</code> and copy there table data:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir /tmp/lineitem mkdir /tmp/lineitem/s1 mkdir /tmp/lineitem/s2 cp TPCH/lineitem /tmp/lineitem/s1 cp TPCH/lineitem /tmp/lineitem/s2 </code></pre></div></div> <p>Query the directory <code class="language-plaintext highlighter-rouge">/tmp/lineitem</code>:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT count(*) FROM dfs.tmp.lineitem; |----------| | EXPR$0 | |----------| | 12002430 | |----------| 1 row selected (0.291 seconds) </code></pre></div></div> <p>Notice that the query plan contains a group scan with <code class="language-plaintext highlighter-rouge">usedMetastore = false</code>:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>00-00 Screen : rowType = RecordType(BIGINT EXPR$0): rowcount = 1.0, cumulative cost = {2.1 rows, 2.1 cpu, 1.0 io, 0.0 network, 0.0 memory}, id = 8410 00-01 Project(EXPR$0=[$0]) : rowType = RecordType(BIGINT EXPR$0): rowcount = 1.0, cumulative cost = {2.0 rows, 2.0 cpu, 1.0 io, 0.0 network, 0.0 memory}, id = 8409 00-02 DirectScan(groupscan=[selectionRoot = file:/tmp/lineitem, numFiles = 12, usedMetadataSummaryFile = false, usedMetastore = false, ... </code></pre></div></div> <h3 id="compute-table-metadata-and-store-in-the-drill-metastore">Compute table metadata and store in the Drill Metastore</h3> <p>Enable Drill Metastore:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SET `metastore.enabled` = true; </code></pre></div></div> <p>The above command enables the Metastore for just this one session.</p> <p>Run the <a href="/docs/analyze-table-refresh-metadata">ANALYZE TABLE</a> command on the table, whose metadata should be computed and stored into the Drill Metastore:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ANALYZE TABLE dfs.tmp.lineitem REFRESH METADATA; |------|-------------------------------------------------------------| | ok | summary | |------|-------------------------------------------------------------| | true | Collected / refreshed metadata for table [dfs.tmp.lineitem] | |------|-------------------------------------------------------------| 1 row selected (32.257 seconds) </code></pre></div></div> <p>The output of this command provides the status of the command execution and its summary.</p> <p>Now that we’ve collected table metadata, we can use it when we query the table, by checking the <code class="language-plaintext highlighter-rouge">usedMetastore=true</code> entry in <code class="language-plaintext highlighter-rouge">ParquetGroupScan</code>:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>00-00 Screen : rowType = RecordType(BIGINT EXPR$0): rowcount = 1.0, cumulative cost = {2.1 rows, 2.1 cpu, 1.0 io, 0.0 network, 0.0 memory}, id = 8560 00-01 Project(EXPR$0=[$0]) : rowType = RecordType(BIGINT EXPR$0): rowcount = 1.0, cumulative cost = {2.0 rows, 2.0 cpu, 1.0 io, 0.0 network, 0.0 memory}, id = 8559 00-02 DirectScan(groupscan=[selectionRoot = /tmp/lineitem, numFiles = 12, usedMetadataSummaryFile = false, usedMetastore = true, ... </code></pre></div></div> <h3 id="perform-incremental-analysis">Perform incremental analysis</h3> <p>Rerun <a href="/docs/analyze-table-refresh-metadata">ANALYZE TABLE</a> command on the <code class="language-plaintext highlighter-rouge">lineitem</code> table:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ANALYZE TABLE dfs.tmp.lineitem REFRESH METADATA; |-------|---------------------------------------------------------| | ok | summary | |-------|---------------------------------------------------------| | false | Table metadata is up to date, analyze wasn't performed. | |-------|---------------------------------------------------------| 1 row selected (0.249 seconds) </code></pre></div></div> <h3 id="inspect-the-metastore-using-information_schema-tables">Inspect the Metastore using INFORMATION_SCHEMA tables</h3> <p>Run the following query to inspect <code class="language-plaintext highlighter-rouge">lineitem</code> table metadata from <code class="language-plaintext highlighter-rouge">TABLES</code> table stored in the Metastore:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT * FROM INFORMATION_SCHEMA.`TABLES` WHERE TABLE_NAME='lineitem'; |---------------|--------------|------------|------------|--------------|---------------|----------|-----------------------| | TABLE_CATALOG | TABLE_SCHEMA | TABLE_NAME | TABLE_TYPE | TABLE_SOURCE | LOCATION | NUM_ROWS | LAST_MODIFIED_TIME | |---------------|--------------|------------|------------|--------------|---------------|----------|-----------------------| | DRILL | dfs.tmp | lineitem | TABLE | PARQUET | /tmp/lineitem | 12002430 | 2016-09-28 03:22:58.0 | |---------------|--------------|------------|------------|--------------|---------------|----------|-----------------------| 1 row selected (0.157 seconds) </code></pre></div></div> <p>To obtain columns with their types and descriptions within the <code class="language-plaintext highlighter-rouge">lineitem</code> table, run the following query:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT * FROM INFORMATION_SCHEMA.`COLUMNS` WHERE TABLE_NAME='lineitem'; |---------------|--------------|------------|-----------------|------------------|----------------|-------------|-------------------|--------------------------|------------------------|-------------------|-------------------------|---------------|--------------------|---------------|--------------------|-------------|---------------|-----------|--------------|---------------------------------------------|-----------|-------------------|-----------| | TABLE_CATALOG | TABLE_SCHEMA | TABLE_NAME | COLUMN_NAME | ORDINAL_POSITION | COLUMN_DEFAULT | IS_NULLABLE | DATA_TYPE | CHARACTER_MAXIMUM_LENGTH | CHARACTER_OCTET_LENGTH | NUMERIC_PRECISION | NUMERIC_PRECISION_RADIX | NUMERIC_SCALE | DATETIME_PRECISION | INTERVAL_TYPE | INTERVAL_PRECISION | COLUMN_SIZE | COLUMN_FORMAT | NUM_NULLS | MIN_VAL | MAX_VAL | NDV | EST_NUM_NON_NULLS | IS_NESTED | |---------------|--------------|------------|-----------------|------------------|----------------|-------------|-------------------|--------------------------|------------------------|-------------------|-------------------------|---------------|--------------------|---------------|--------------------|-------------|---------------|-----------|--------------|---------------------------------------------|-----------|-------------------|-----------| | DRILL | dfs.tmp | lineitem | dir0 | 1 | null | YES | CHARACTER VARYING | 65535 | 65535 | null | null | null | null | null | null | 65535 | null | 0 | s1 | s2 | null | null | false | | DRILL | dfs.tmp | lineitem | l_orderkey | 2 | null | YES | INTEGER | null | null | 0 | 2 | 0 | null | null | null | 11 | null | 0 | 1 | 6000000 | 1499876.0 | 1.200243E7 | false | | DRILL | dfs.tmp | lineitem | l_partkey | 3 | null | YES | INTEGER | null | null | 0 | 2 | 0 | null | null | null | 11 | null | 0 | 1 | 200000 | 199857.0 | 1.200243E7 | false | ... | DRILL | dfs.tmp | lineitem | l_comment | 17 | null | YES | CHARACTER VARYING | 65535 | 65535 | null | null | null | null | null | null | 65535 | null | 0 | Tiresias | zzle? slyly final platelets sleep quickly. | 4586320.0 | 1.200243E7 | false | |---------------|--------------|------------|-----------------|------------------|----------------|-------------|-------------------|--------------------------|------------------------|-------------------|-------------------------|---------------|--------------------|---------------|--------------------|-------------|---------------|-----------|--------------|---------------------------------------------|-----------|-------------------|-----------| 17 rows selected (0.187 seconds) </code></pre></div></div> <p>The sample <code class="language-plaintext highlighter-rouge">lineitem</code> table has two partitions. The <code class="language-plaintext highlighter-rouge">PARTITIONS</code> table contains an entry for each directory:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT * FROM INFORMATION_SCHEMA.`PARTITIONS` WHERE TABLE_NAME='lineitem'; |---------------|--------------|------------|--------------|---------------|---------------------|------------------|-----------------|------------------|-----------------------| | TABLE_CATALOG | TABLE_SCHEMA | TABLE_NAME | METADATA_KEY | METADATA_TYPE | METADATA_IDENTIFIER | PARTITION_COLUMN | PARTITION_VALUE | LOCATION | LAST_MODIFIED_TIME | |---------------|--------------|------------|--------------|---------------|---------------------|------------------|-----------------|------------------|-----------------------| | DRILL | dfs.tmp | lineitem | s2 | SEGMENT | s2 | `dir0` | s2 | /tmp/lineitem/s2 | 2016-09-28 03:22:58.0 | | DRILL | dfs.tmp | lineitem | s1 | SEGMENT | s1 | `dir0` | s1 | /tmp/lineitem/s1 | 2016-09-28 03:22:58.0 | |---------------|--------------|------------|--------------|---------------|---------------------|------------------|-----------------|------------------|-----------------------| 2 rows selected (0.149 seconds) </code></pre></div></div> <h3 id="drop-table-metadata">Drop table metadata</h3> <p>Once we are done exploring metadata we can drop the metadata for the <code class="language-plaintext highlighter-rouge">lineitem</code> table.</p> <p>Table metadata may be dropped using <code class="language-plaintext highlighter-rouge">ANALYZE TABLE DROP METADATA</code> command:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ANALYZE TABLE dfs.tmp.lineitem DROP METADATA; |------|----------------------------------------| | ok | summary | |------|----------------------------------------| | true | Metadata for table [lineitem] dropped. | |------|----------------------------------------| 1 row selected (0.291 seconds) </code></pre></div></div> <h3 id="collect-metadata-for-specific-set-of-columns">Collect metadata for specific set of columns</h3> <p>Next let’s gather metadata for a subset of the columns in the <code class="language-plaintext highlighter-rouge">lineitem</code> table. You would do this to allow Drill to optimize <code class="language-plaintext highlighter-rouge">WHERE</code> conditions on certain columns. Also, if file size or the number of columns grows large, it can take too long to gather all statistics. Instead you can speed up analysis by gathering statistics only for selected columns: those actually used in the <code class="language-plaintext highlighter-rouge">WHERE</code> clause.</p> <p>For the case when metadata for several columns should be computed and stored into the Metastore, the following command may be used:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ANALYZE TABLE dfs.tmp.lineitem COLUMNS(l_orderkey, l_partkey) REFRESH METADATA; |------|-------------------------------------------------------------| | ok | summary | |------|-------------------------------------------------------------| | true | Collected / refreshed metadata for table [dfs.tmp.lineitem] | |------|-------------------------------------------------------------| 1 row selected (94.1 seconds) </code></pre></div></div> <p>Now, check, that metadata is collected only for specified columns (<code class="language-plaintext highlighter-rouge">MIN_VAL</code>, <code class="language-plaintext highlighter-rouge">MAX_VAL</code>, <code class="language-plaintext highlighter-rouge">NDV</code>, etc.), but all columns are present:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT * FROM INFORMATION_SCHEMA.`COLUMNS` WHERE TABLE_NAME='lineitem'; |---------------|--------------|------------|-----------------|------------------|----------------|-------------|-------------------|--------------------------|------------------------|-------------------|-------------------------|---------------|--------------------|---------------|--------------------|-------------|---------------|-----------|---------|---------|-----------|-------------------|-----------| | TABLE_CATALOG | TABLE_SCHEMA | TABLE_NAME | COLUMN_NAME | ORDINAL_POSITION | COLUMN_DEFAULT | IS_NULLABLE | DATA_TYPE | CHARACTER_MAXIMUM_LENGTH | CHARACTER_OCTET_LENGTH | NUMERIC_PRECISION | NUMERIC_PRECISION_RADIX | NUMERIC_SCALE | DATETIME_PRECISION | INTERVAL_TYPE | INTERVAL_PRECISION | COLUMN_SIZE | COLUMN_FORMAT | NUM_NULLS | MIN_VAL | MAX_VAL | NDV | EST_NUM_NON_NULLS | IS_NESTED | |---------------|--------------|------------|-----------------|------------------|----------------|-------------|-------------------|--------------------------|------------------------|-------------------|-------------------------|---------------|--------------------|---------------|--------------------|-------------|---------------|-----------|---------|---------|-----------|-------------------|-----------| | DRILL | dfs.tmp | lineitem | dir0 | 1 | null | YES | CHARACTER VARYING | 65535 | 65535 | null | null | null | null | null | null | 65535 | null | 0 | s1 | s2 | null | null | false | | DRILL | dfs.tmp | lineitem | l_orderkey | 2 | null | YES | INTEGER | null | null | 0 | 2 | 0 | null | null | null | 11 | null | 0 | 1 | 6000000 | 1499876.0 | 1.200243E7 | false | | DRILL | dfs.tmp | lineitem | l_partkey | 3 | null | YES | INTEGER | null | null | 0 | 2 | 0 | null | null | null | 11 | null | 0 | 1 | 200000 | 199857.0 | 1.200243E7 | false | ... | DRILL | dfs.tmp | lineitem | l_comment | 17 | null | YES | CHARACTER VARYING | 65535 | 65535 | null | null | null | null | null | null | 65535 | null | null | null | null | null | null | false | |---------------|--------------|------------|-----------------|------------------|----------------|-------------|-------------------|--------------------------|------------------------|-------------------|-------------------------|---------------|--------------------|---------------|--------------------|-------------|---------------|-----------|---------|---------|-----------|-------------------|-----------| 17 rows selected (0.183 seconds) </code></pre></div></div> <h3 id="provisioning-schema-for-drill-metastore">Provisioning schema for Drill Metastore</h3> <h4 id="directory-and-file-setup-1">Directory and File Setup</h4> <p>Ensure you have configured the file system storage plugin as described here: <a href="/docs/file-system-storage-plugin/#connecting-drill-to-a-file-system">Connecting Drill to a File System</a>.</p> <p>Set <code class="language-plaintext highlighter-rouge">store.format</code> to <code class="language-plaintext highlighter-rouge">csvh</code>:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SET `store.format`='csvh'; |------|-----------------------| | ok | summary | |------|-----------------------| | true | store.format updated. | |------|-----------------------| </code></pre></div></div> <p>Create a text table based on the sample <code class="language-plaintext highlighter-rouge">/tpch/nation.parquet</code> table from <code class="language-plaintext highlighter-rouge">cp</code> plugin:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CREATE TABLE dfs.tmp.text_nation AS (SELECT * FROM cp.`/tpch/nation.parquet`); |----------|---------------------------| | Fragment | Number of records written | |----------|---------------------------| | 0_0 | 25 | |----------|---------------------------| </code></pre></div></div> <p>Query the table <code class="language-plaintext highlighter-rouge">text_nation</code>:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT typeof(n_nationkey), typeof(n_name), typeof(n_regionkey), typeof(n_comment) FROM dfs.tmp.text_nation LIMIT 1; |---------|---------|---------|---------| | EXPR$0 | EXPR$1 | EXPR$2 | EXPR$3 | |---------|---------|---------|---------| | VARCHAR | VARCHAR | VARCHAR | VARCHAR | |---------|---------|---------|---------| </code></pre></div></div> <p>Notice that the query plan contains a group scan with <code class="language-plaintext highlighter-rouge">usedMetastore = false</code>:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>00-00 Screen : rowType = RecordType(ANY EXPR$0, ANY EXPR$1, ANY EXPR$2, ANY EXPR$3): rowcount = 1.0, cumulative cost = {25.1 rows, 109.1 cpu, 2247.0 io, 0.0 network, 0.0 memory}, id = 160 00-01 Project(EXPR$0=[TYPEOF($0)], EXPR$1=[TYPEOF($1)], EXPR$2=[TYPEOF($2)], EXPR$3=[TYPEOF($3)]) : rowType = RecordType(ANY EXPR$0, ANY EXPR$1, ANY EXPR$2, ANY EXPR$3): rowcount = 1.0, cumulative cost = {25.0 rows, 109.0 cpu, 2247.0 io, 0.0 network, 0.0 memory}, id = 159 00-02 SelectionVectorRemover : rowType = RecordType(ANY n_nationkey, ANY n_name, ANY n_regionkey, ANY n_comment): rowcount = 1.0, cumulative cost = {24.0 rows, 93.0 cpu, 2247.0 io, 0.0 network, 0.0 memory}, id = 158 00-03 Limit(fetch=[1]) : rowType = RecordType(ANY n_nationkey, ANY n_name, ANY n_regionkey, ANY n_comment): rowcount = 1.0, cumulative cost = {23.0 rows, 92.0 cpu, 2247.0 io, 0.0 network, 0.0 memory}, id = 157 00-04 Scan(table=[[dfs, tmp, text_nation]], groupscan=[EasyGroupScan [... schema=null, usedMetastore=false... </code></pre></div></div> <h4 id="compute-table-metadata-and-store-in-the-drill-metastore-1">Compute table metadata and store in the Drill Metastore</h4> <p>Enable Drill Metastore:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SET `metastore.enabled` = true; </code></pre></div></div> <p>Specify table schema when running <code class="language-plaintext highlighter-rouge">ANALYZE</code> query:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ANALYZE TABLE table(dfs.tmp.`text_nation` (type=>'text', fieldDelimiter=>',', extractHeader=>true, schema=>'inline=( `n_nationkey` INT not null, `n_name` VARCHAR not null, `n_regionkey` INT not null, `n_comment` VARCHAR not null)' )) REFRESH METADATA; |------|----------------------------------------------------------------| | ok | summary | |------|----------------------------------------------------------------| | true | Collected / refreshed metadata for table [dfs.tmp.text_nation] | |------|----------------------------------------------------------------| </code></pre></div></div> <h4 id="inspect-the-metastore-using-information_schema-tables-1">Inspect the Metastore using INFORMATION_SCHEMA tables</h4> <p>Run the following query to inspect <code class="language-plaintext highlighter-rouge">text_nation</code> table schema stored in the Metastore:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT COLUMN_NAME, DATA_TYPE FROM INFORMATION_SCHEMA.`COLUMNS` WHERE TABLE_NAME='text_nation'; |-------------|-------------------| | COLUMN_NAME | DATA_TYPE | |-------------|-------------------| | n_nationkey | INTEGER | | n_name | CHARACTER VARYING | | n_regionkey | INTEGER | | n_comment | CHARACTER VARYING | |-------------|-------------------| </code></pre></div></div> <p>Ensure that this schema is applied to the table:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SELECT typeof(n_nationkey), typeof(n_name), typeof(n_regionkey), typeof(n_comment) FROM dfs.tmp.text_nation LIMIT 1; |--------|---------|--------|---------| | EXPR$0 | EXPR$1 | EXPR$2 | EXPR$3 | |--------|---------|--------|---------| | INT | VARCHAR | INT | VARCHAR | |--------|---------|--------|---------| </code></pre></div></div> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>select sum(n_nationkey) from dfs.tmp.`text_nation`; |--------| | EXPR$0 | |--------| | 300 | |--------| </code></pre></div></div> <p>Query plan contains schema from the Metastore and group scan with <code class="language-plaintext highlighter-rouge">usedMetastore = true</code>:</p> <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>00-00 Screen : rowType = RecordType(ANY EXPR$0): rowcount = 1.0, cumulative cost = {45.1 rows, 287.1 cpu, 2247.0 io, 0.0 network, 0.0 memory}, id = 3129 00-01 Project(EXPR$0=[$0]) : rowType = RecordType(ANY EXPR$0): rowcount = 1.0, cumulative cost = {45.0 rows, 287.0 cpu, 2247.0 io, 0.0 network, 0.0 memory}, id = 3128 00-02 StreamAgg(group=[{}], EXPR$0=[SUM($0)]) : rowType = RecordType(ANY EXPR$0): rowcount = 1.0, cumulative cost = {44.0 rows, 286.0 cpu, 2247.0 io, 0.0 network, 0.0 memory}, id = 3127 00-03 Scan(table=[[dfs, tmp, text_nation]], groupscan=[EasyGroupScan ... schema=..., usedMetastore=true]]) ... </code></pre></div></div> <div class="doc-nav"> <span class="previous-toc"><a href="/docs/monitoring-metrics/">← Monitoring Metrics</a></span><span class="next-toc"><a href="/docs/drill-iceberg-metastore/">Drill Iceberg Metastore →</a></span> </div> </div> </div> </div> </div> <p class="push"></p> <div id="footer" class="mw"> <div class="wrapper"> Copyright © 2012-2025 The Apache Software Foundation, licensed under the Apache License, Version 2.0.<br> Apache and the Apache feather logo are trademarks of The Apache Software Foundation. Other names appearing on the site may be trademarks of their respective owners.<br/><br/> </div> </div> <script type="text/javascript" src="https://s7.addthis.com/js/300/addthis_widget.js#pubid=ra-548b2caa33765e8d" async="async"></script> </body> </html>