We’re excited to announce the Public Preview of Hive Metastore (HMS) and AWS Glue Federation in Unity Catalog! This new capability allows Unity Catalog to seamlessly access and govern tables stored in Hive Metastores (whether internal to Databricks or external) as well as in AWS Glue. It represents a key milestone in our Lakehouse Federation vision, which brings external data sources, including databases, data warehouses and catalogs, together under a unified governance framework with Unity Catalog. You can discover, govern and query all your data from a single, centralized platform, regardless of format and location. This not only fosters open access and collaboration across your organization but also extends data intelligence to every data source.
In this blog, we’ll explore the benefits of HMS and AWS Glue Federation, explain how it works, and provide guidance on getting started.
Why Hive Metastore and AWS Glue Federation?
HMS was an early standard for cataloging data for use in big data systems. While it provides foundational functionality, it is not well suited to modern data and AI workloads, which demand comprehensive governance: fine-grained access controls on rows and columns, lineage, monitoring and auditing across all data and AI assets in one place.
Unity Catalog addresses these shortcomings by providing the industry’s only unified, open governance solution for managing all data and AI assets. It enables organizations to create an enterprise catalog that curates files, tables, ML models, AI tools, notebooks, and metrics, all governed with fine-grained access controls, lineage, monitoring, auditing and cross-platform sharing in a single solution. More than 10,000 enterprises are now using Unity Catalog to govern their data estate.
HMS and AWS Glue Federation provide significant benefits for organizations with HMS deeply embedded in their data architecture. For those with long-standing HMS or AWS Glue deployments, this capability offers a seamless path to use Unity Catalog’s advanced features over data stored in the HMS or Glue metastore. It ensures operational continuity by letting organizations maintain legacy workflows while gradually upgrading existing data and workspaces to Unity Catalog.
Key benefits include:
- Seamless integration: Connect your existing HMS and AWS Glue catalogs directly to Unity Catalog without requiring manual metadata migration.
- Simplified data discovery: Access and explore metadata from HMS and AWS Glue through a unified interface, alongside other data and AI assets in Unity Catalog.
- Comprehensive governance: Apply Unity Catalog’s fine-grained access controls, tagging, classification, lineage, and audit capabilities on top of the data stored in HMS and AWS Glue.
“We have years’ worth of datasets that are cataloged in an external Hive Metastore. HMS Federation allows us to immediately benefit from Unity Catalog-only features like robust access control and self-serve AI tooling via Genie Spaces, without the overhead of migrating all of these tables into Unity Catalog”
— James Davidheiser, Technical Lead, Information Infrastructure, Asana
How it works
Unity Catalog now includes federation connectors for Hive Metastore (HMS) and AWS Glue, which serve as a translation layer between Unity Catalog and your external metastores. These connectors let you mount entire HMS catalogs (both internal and external) or AWS Glue as foreign catalogs within Unity Catalog, making them appear as native objects. You can define fine-grained access controls, view lineage, perform audits, and query HMS or AWS Glue managed tables using the Databricks engine. Federation supports both reading and writing to tables in the internal HMS within Databricks workspaces, while offering read-only access to tables in external HMS and AWS Glue.
With this capability, you can read all tables in HMS and AWS Glue, including Parquet, Delta, and Iceberg (coming soon in Public Preview), so you can access and govern all your tables seamlessly.
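To make the access rules above concrete, here is a minimal Python sketch. It is an illustrative model only, not a Databricks API: the `Source`, `ForeignCatalog`, `can_write` and `table_ref` names, and the `legacy_hms` catalog, are hypothetical. It assumes only what this post states (read/write for internal HMS, read-only for external HMS and AWS Glue) plus Unity Catalog's `catalog.schema.table` naming.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """Kinds of metastore that can be mounted as a foreign catalog."""
    INTERNAL_HMS = "internal_hms"  # HMS inside a Databricks workspace
    EXTERNAL_HMS = "external_hms"  # HMS running outside Databricks
    AWS_GLUE = "aws_glue"


# Access rules as described in this post: only the internal HMS is
# writable; external HMS and AWS Glue are read-only.
WRITABLE_SOURCES = {Source.INTERNAL_HMS}


@dataclass(frozen=True)
class ForeignCatalog:
    name: str       # hypothetical catalog name chosen when mounting
    source: Source

    def can_write(self) -> bool:
        """Return True if federation allows writes to this catalog."""
        return self.source in WRITABLE_SOURCES

    def table_ref(self, schema: str, table: str) -> str:
        """Build the three-level name used to address a table."""
        return f"{self.name}.{schema}.{table}"


# Hypothetical example: an external HMS mounted as "legacy_hms".
legacy = ForeignCatalog("legacy_hms", Source.EXTERNAL_HMS)
print(legacy.table_ref("sales", "orders"), legacy.can_write())
```

In this model, a table in the mounted catalog is addressed like any other Unity Catalog table, and the write check depends only on which kind of metastore backs the catalog.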
Check out the video tutorial below to explore AWS Glue and HMS Federation in action.
Get started
By embracing Unity Catalog as the cornerstone of your Lakehouse architecture, you can unlock the power of a unified and open governance implementation that spans your entire data and AI estate.