Thursday, March 6, 2025

Cross-account data collaboration with Amazon DataZone and AWS analytical tools


Data sharing has become a crucial aspect of driving innovation, contributing to growth, and fostering collaboration across industries. According to this Gartner study, organizations promoting data sharing outperform their peers on most business value metrics. A straightforward data access and sharing mechanism is key to enabling effective data sharing across an organization. Organizations that try to share data products across AWS accounts face challenges such as the complexity of managing cross-account permissions and the difficulty of discovering the right data across accounts. Amazon DataZone is a fully managed data management service that customers can use to catalog, discover, share, and govern data stored across Amazon Web Services (AWS).

In this post, we'll cover how you can use Amazon DataZone to facilitate data collaboration between AWS accounts.

Solution overview

This solution provides a streamlined way to enable cross-account data collaboration using Amazon DataZone domain association while maintaining security and governance. This post describes the process of using the business data catalog resource of Amazon DataZone to publish data assets so that they're discoverable by other accounts. After they've been published, you can query the published assets from another AWS account using analytical tools such as Amazon Athena and the Amazon Redshift query editor, as shown in the following figure.

In this solution (as shown in the preceding figure), the AWS account that contains the data assets is referred to as the producer account. The AWS account that needs to access or use the data from the producer account is referred to as the consumer account. The Amazon DataZone domain is created and managed within the producer account, and then the consumer account is associated with that domain.

As part of Amazon DataZone domain association, Amazon DataZone uses AWS Resource Access Manager (AWS RAM) to share the resource. When the producer and consumer AWS accounts are in the same organization within AWS Organizations, the domain association happens automatically. If the producer and consumer AWS accounts are in different organizations, AWS RAM sends an invitation to the consumer AWS account to accept or reject the resource grant.
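The association rule described above can be summarized in a few lines of Python. This is a minimal sketch rather than an AWS API call: the function name `association_flow` and the organization IDs are hypothetical, and the code encodes only the rule stated in this post.

```python
from typing import Optional


def association_flow(producer_org_id: Optional[str], consumer_org_id: Optional[str]) -> str:
    """Return how the domain association is expected to complete.

    Hypothetical helper: it encodes only the rule stated in this post.
    Accounts in the same organization are associated automatically;
    otherwise AWS RAM sends an invitation to the consumer account.
    Pass None for an account that does not belong to an organization.
    """
    same_org = producer_org_id is not None and producer_org_id == consumer_org_id
    return "automatic" if same_org else "invitation"


# Illustrative organization IDs, not real ones.
print(association_flow("o-exampleorg1", "o-exampleorg1"))
print(association_flow("o-exampleorg1", "o-exampleorg2"))
```

A check like this can help a data administrator decide up front whether someone in the consumer account needs to watch for a pending invitation.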

This solution involves three Amazon DataZone user personas:

  • Data administrators: Account owners in both producer and consumer AWS accounts. The data administrators are responsible for creating Amazon DataZone domains, configuring domain associations, and accepting domain associations within the Amazon DataZone domain.
  • Data publishers: Users in producer AWS accounts. The data publishers are responsible for creating Amazon DataZone publish projects and environments, producing and publishing data assets, and accepting subscription requests.
  • Data subscribers: Users in consumer AWS accounts. The data subscribers are responsible for creating Amazon DataZone subscribe projects and environments, searching for and subscribing to data assets, and querying the data and deriving insights.

Prerequisites

To follow along with the instructions, you will need:

  • Two AWS accounts, one serving as the producer and the other as the consumer. Create new AWS accounts if necessary.
  • An Amazon Redshift provisioned cluster or Amazon Redshift Serverless workgroup in the producer and consumer AWS accounts, provisioned by a data administrator.
  • A secret in AWS Secrets Manager storing the master user credentials for the Amazon Redshift cluster or workgroup in the producer and consumer AWS accounts.
    • The data administrators are responsible for creating secrets.
    • The data producers and consumers can obtain the Amazon Resource Name (ARN) of the secrets from the data administrators during the environment or environment profile creation steps.

Amazon DataZone uses Amazon Redshift datashares to share data across clusters and accounts. There are specific requirements and limitations for using Amazon Redshift datashares.

  • For cross-account data sharing, both the producer and consumer clusters must be encrypted. See the Cluster encryption section of datashare-considerations for more information about the encryption process.
  • Data sharing is supported only for provisioned ra3 cluster types (ra3.16xlarge, ra3.4xlarge, and ra3.xlplus) and Amazon Redshift Serverless.
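The two datashare requirements above are easy to check before starting the walkthrough. The following is a minimal sketch, assuming you already know each cluster's node type and encryption status. The helper `datashare_issues` and its parameters are hypothetical, and the supported node types are copied from the list in this post.

```python
# Node types listed in this post as supporting data sharing.
SUPPORTED_RA3_NODE_TYPES = {"ra3.16xlarge", "ra3.4xlarge", "ra3.xlplus"}


def datashare_issues(name, node_type, encrypted, serverless=False):
    """Return a list of problems that would block cross-account data sharing.

    Hypothetical pre-flight check: the caller supplies the values, for
    example after looking them up in the Amazon Redshift console.
    """
    issues = []
    # Serverless workgroups have no node type, so only provisioned clusters are checked.
    if not serverless and node_type not in SUPPORTED_RA3_NODE_TYPES:
        issues.append(f"{name}: node type {node_type} is not one of the ra3 types listed")
    if not encrypted:
        issues.append(f"{name}: must be encrypted for cross-account data sharing")
    return issues


for problem in datashare_issues("consumer-cluster", "dc2.large", encrypted=False):
    print(problem)
```

An empty list means that neither of the two requirements above is violated. It does not cover the other datashare limitations mentioned in the Amazon Redshift documentation.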

Walkthrough

The following are the high-level steps to configure cross-account access. We've provided step-by-step instructions in the following sections.

  1. Create an Amazon DataZone domain in the producer account. The data administrator creates an Amazon DataZone domain.
  2. Request Amazon DataZone domain association from the producer account to the consumer account.
  3. Accept the domain association request in the consumer account. The data administrator accepts the domain association.
  4. Add data users to the Amazon DataZone domain.
  5. Create the necessary publish projects for AWS Glue and Amazon Redshift in the producer account.
  6. Create AWS Glue and Amazon Redshift environments to publish the data assets in the producer account.
  7. Create and run a data source for AWS Glue and Amazon Redshift to publish assets into the business catalog.
  8. Create subscribe projects for AWS Glue and Amazon Redshift.
  9. Create AWS Glue and Amazon Redshift environment profiles and environments in the subscribe project.
  10. Subscribe to AWS Glue and Amazon Redshift tables. Consume the data using the Athena and Amazon Redshift query editors. This step is performed by the data subscriber.

Create the Amazon DataZone domain in the producer account

Amazon DataZone domains serve as top-level organizational units for assets, users, and projects, facilitating cross-team and cross-account collaboration. This step focuses on creating the Amazon DataZone domain in the producer account.

  1. Sign in to the producer account AWS Management Console for Amazon DataZone using the data administrator credentials.
  2. Create an Amazon DataZone domain named Demo_cross_account_domain using the instructions at create domains.
  3. On the Create domain screen, select the Quick setup checkbox to automate several configuration steps, saving time and reducing the potential for setup errors. Quick setup enables two default blueprints and creates the default environment profiles for the data lake and data warehouse default blueprints.


Request Amazon DataZone domain association from the producer account to the consumer account

To associate the Amazon DataZone domain with the consumer account, the producer account requests a domain association. This involves providing necessary information about the consumer account and granting appropriate permissions for data access and management.

  1. Sign in to the Amazon DataZone console of the producer account using the data administrator credentials.
  2. Navigate to the domain detail page, and then scroll down and choose the Associated Accounts tab.
  3. Enter the consumer account IDs that you want to request association with. Choose Add another account if you want to add more than one account. When you're satisfied with the list of account IDs, choose Request association.
    • Use the latest AWS RAM DataZonePortalReadWrite policy when requesting the account association. This policy allows users in the consumer account to run Amazon DataZone APIs and to use the data portal interface.

Accept an account association request from an Amazon DataZone domain

This step focuses on accepting the account association request from the Amazon DataZone domain in the consumer account. This allows the consumer account to be linked with the Amazon DataZone domain to enable data sharing and collaboration between the producer and consumer accounts.

  1. Sign in to the consumer account and go to the Amazon DataZone console in the same AWS Region as the domain. On the Amazon DataZone home page, choose View requests.
  2. Select the name of the inviting Amazon DataZone domain and choose Review request.
  3. Choose Accept association. You should see the Demo_cross_account_domain state as Associated on the Associated domains screen.
  4. Choose the domain for which you want to enable an environment blueprint.
  5. From the Blueprints list, choose the DefaultDataLake blueprint.
  6. On the Permissions and resources page, to enable the DefaultDataLake blueprint, for Glue Manage Access role, specify a new role that grants Amazon DataZone authorization to ingest and manage access to tables in AWS Glue and AWS Lake Formation.
  7. Repeat steps 4 to 6 to enable the DefaultDataWarehouse blueprint by choosing DefaultDataWarehouse instead of DefaultDataLake.

Add data users to the Amazon DataZone domain

To grant access to the Amazon DataZone data portal from the console for the data publisher and data subscriber IAM users, use the following steps to add them in the User management section of the Amazon DataZone domain. See Manage users in the Amazon DataZone console for more details.

  1. Sign in to the Amazon DataZone console as a data administrator using the producer account.
  2. Select the Amazon DataZone domain and, in the User management section, choose Add and select Add IAM users.
  3. On the Add users page, select Current account, add the user ARN of the data producer, and choose Add users.
  4. Next, select Associated account, enter the data subscriber user's ARN, and add the user by choosing Add users.
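Steps 3 and 4 depend on which account each IAM user belongs to, and the user ARN already contains that account ID. The following is a minimal sketch that reads it. The helper name `account_tab_for_user` and the sample account IDs are hypothetical, and the regular expression assumes the standard IAM user ARN layout.

```python
import re

# IAM user ARN layout: arn:<partition>:iam::<12-digit account ID>:user/<optional path/><name>
_IAM_USER_ARN = re.compile(r"^arn:(aws|aws-cn|aws-us-gov):iam::(\d{12}):user/(.+)$")


def account_tab_for_user(user_arn: str, domain_account_id: str) -> str:
    """Suggest which option on the Add users page applies to this IAM user.

    Hypothetical helper: it compares the account ID embedded in the ARN
    with the account that owns the Amazon DataZone domain.
    """
    match = _IAM_USER_ARN.match(user_arn)
    if match is None:
        raise ValueError(f"not an IAM user ARN: {user_arn}")
    account_id = match.group(2)
    return "Current account" if account_id == domain_account_id else "Associated account"


# Placeholder account IDs; the producer account owns the domain in this post.
producer_account = "111122223333"
print(account_tab_for_user("arn:aws:iam::111122223333:user/data-publisher", producer_account))
print(account_tab_for_user("arn:aws:iam::444455556666:user/data-subscriber", producer_account))
```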

Create the publish projects for AWS Glue and Amazon Redshift

This step focuses on creating the publish projects for AWS Glue and Amazon Redshift in the producer account. The projects will be used to publish data from your data sources to the appropriate AWS services.

  1. Using the producer account, sign in to the Amazon DataZone console as a data publisher.
  2. Choose View domains and select demo_cross_account_domain.
  3. Choose the Open data portal link and sign in to the data portal.
  4. Choose Create New Project and create a project named Glue_Publish_Project under demo_cross_account_domain for publishing AWS Glue data assets.
  5. Create another project named Redshift_Publish_Project for publishing Amazon Redshift data assets, also under demo_cross_account_domain.

Create AWS Glue and Amazon Redshift environments to publish the data assets

In this step, you set up AWS Glue and Amazon Redshift environments in the producer account to share data assets. The required infrastructure, such as the AWS Glue Data Catalog and Redshift cluster for storing data, should already be in place. After setup, this will allow the consumer account to access and use the shared data assets. See Create a new environment for detailed instructions on creating a new environment.

Create the AWS Glue environment and a new AWS Glue table

  1. In the same Amazon DataZone domain, demo_cross_account_domain, choose Browse Project, select Glue_Publish_Project, and create Glue_Publish_Environment using the default DataLakeProfile.
  2. Leave producer_glue_db_name, consumer_glue_db_name, and Workgroup_name blank.
  3. Choose Create Environment and wait for the process to complete.
  4. After the environment is created, browse the list of available projects and choose Glue_publish_project.
  5. Next, navigate to Glue_Publish_Environment and, under Analytics tools, choose Amazon Athena to open the Athena query editor.
  6. Choose Open Athena and make sure that Glue_Publish_Environment is selected in the Amazon DataZone environment dropdown on the upper right and that, under Data on the left, glue_publish_environment_pub_db is selected as the Database.
  7. Create a new AWS Glue table for publishing to Amazon DataZone. Paste the following create table as select (CTAS) query script in the Query window and run it to create a new table named mkt_sls_table. The script creates a table with sample marketing and sales data.
    CREATE TABLE mkt_sls_table AS
    SELECT 146776932 AS ord_num, 23 AS sales_qty_sld, 23.4 AS wholesale_cost, 45.0 as lst_pr, 43.0 as sell_pr, 2.0 as disnt, 12 as ship_mode,13 as warehouse_id, 23 as item_id, 34 as ctlg_page, 232 as ship_cust_id, 4556 as bill_cust_id
    UNION ALL SELECT 46776931, 24, 24.4, 46, 44, 1, 14, 15, 24, 35, 222, 4551
    UNION ALL SELECT 46777394, 42, 43.4, 60, 50, 10, 30, 20, 27, 43, 241, 4565
    UNION ALL SELECT 46777831, 33, 40.4, 51, 46, 15, 16, 26, 33, 40, 234, 4563
    UNION ALL SELECT 46779160, 29, 26.4, 50, 61, 8, 31, 15, 36, 40, 242, 4562
    UNION ALL SELECT 46778595, 43, 28.4, 49, 47, 7, 28, 22, 27, 43, 224, 4555
    UNION ALL SELECT 46779482, 34, 33.4, 64, 44, 10, 17, 27, 43, 52, 222, 4556
    UNION ALL SELECT 46779650, 39, 37.4, 51, 62, 13, 31, 25, 31, 52, 224, 4551
    UNION ALL SELECT 46780524, 33, 40.4, 60, 53, 18, 32, 31, 31, 39, 232, 4563
    UNION ALL SELECT 46780634, 39, 35.4, 46, 44, 16, 33, 19, 31, 52, 242, 4557
    UNION ALL SELECT 46781887, 24, 30.4, 54, 62, 13, 18, 29, 24, 52, 223, 4561

  8. Go to the Tables and Views section and verify that the mkt_sls_table table was successfully created.
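If you want to sanity-check the shape of the CTAS script before running it in Athena, one option is to run a shortened copy locally. The following is a minimal sketch that uses Python's built-in SQLite module as a stand-in. SQLite is not Athena, so type handling may differ. Only the first three rows of the script are included to keep the example short.

```python
import sqlite3

# Shortened copy of the CTAS script from this post (first three rows only).
CTAS = """
CREATE TABLE mkt_sls_table AS
SELECT 146776932 AS ord_num, 23 AS sales_qty_sld, 23.4 AS wholesale_cost, 45.0 AS lst_pr,
       43.0 AS sell_pr, 2.0 AS disnt, 12 AS ship_mode, 13 AS warehouse_id, 23 AS item_id,
       34 AS ctlg_page, 232 AS ship_cust_id, 4556 AS bill_cust_id
UNION ALL SELECT 46776931, 24, 24.4, 46, 44, 1, 14, 15, 24, 35, 222, 4551
UNION ALL SELECT 46777394, 42, 43.4, 60, 50, 10, 30, 20, 27, 43, 241, 4565
"""

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(CTAS)

# Count the rows and list the column names taken from the first SELECT's aliases.
row_count = conn.execute("SELECT COUNT(*) FROM mkt_sls_table").fetchone()[0]
columns = [row[1] for row in conn.execute("PRAGMA table_info(mkt_sls_table)")]
print(row_count, columns)
```

The column names come from the aliases in the first SELECT, which is also how the Athena table gets its column names.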

Create the Amazon Redshift publish environment and a new Redshift table

  1. Staying in the same Amazon DataZone domain, demo_cross_account_domain, choose Browse Project. To create an Amazon Redshift publish environment, select Redshift_Publish_Project and create Redshift_Publish_Environment using the default data warehouse profile.
  2. To configure the environment parameters, enter the name of your Amazon Redshift cluster or workgroup, specify the database name, and enter the AWS Secrets Manager secret ARN for the Redshift cluster or workgroup. Make sure that the secret in Secrets Manager includes the following tags. These tags help Amazon DataZone enforce proper access control so that only authorized users within the correct Amazon DataZone project and domain can access the Amazon Redshift resource:
    1. For Amazon Redshift cluster: DataZone.rs.cluster: <cluster_name:database name>
    2. For Amazon Redshift Serverless workgroup: DataZone.rs.workgroup: <workgroup_name:database_name>
    3. AmazonDataZoneProject: <projectID>
    4. AmazonDataZoneDomain: <domainID>

For more information about creating a Redshift database user secret in Secrets Manager, see Storing database credentials in AWS Secrets Manager.
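The tag set described above can be assembled programmatically before a data administrator attaches it to the secret. The following is a minimal sketch: the helper `redshift_secret_tags` and the sample IDs are hypothetical, and the tag keys are copied as written in this post. Tag keys are case-sensitive, so confirm the exact spelling against the Amazon DataZone documentation before using them.

```python
def redshift_secret_tags(domain_id, project_id, database_name,
                         cluster_name=None, workgroup_name=None):
    """Build the tag list described in this post for a Redshift secret.

    Hypothetical helper. Tag keys are copied from the post as written;
    verify their exact casing against the Amazon DataZone documentation.
    The result uses Key/Value pairs, the shape Secrets Manager tags take.
    """
    # Exactly one of cluster_name / workgroup_name must be given.
    if (cluster_name is None) == (workgroup_name is None):
        raise ValueError("provide exactly one of cluster_name or workgroup_name")
    if cluster_name is not None:
        resource_tag = {"Key": "DataZone.rs.cluster",
                        "Value": f"{cluster_name}:{database_name}"}
    else:
        resource_tag = {"Key": "DataZone.rs.workgroup",
                        "Value": f"{workgroup_name}:{database_name}"}
    return [
        resource_tag,
        {"Key": "AmazonDataZoneProject", "Value": project_id},
        {"Key": "AmazonDataZoneDomain", "Value": domain_id},
    ]


# Placeholder IDs and names for illustration only.
for tag in redshift_secret_tags("example-domain-id", "example-project-id", "dev",
                                cluster_name="example-cluster"):
    print(tag)
```

Building the tags in one place makes it harder to mistype the `<name>:<database>` value, which is the part most likely to differ between the producer and consumer secrets.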

  1. Note that the database user you provide in Secrets Manager must have superuser permissions. Data publishers should work with the data administrator to get the details of the Redshift cluster or workgroup, database name, and secret ARN.
  2. The schema is optional.
  3. Choose Create Environment and wait for the process to complete.
  4. Verify that the environment is created successfully without errors.
  5. Browse the list of available projects and select Redshift_publish_project. Navigate to Redshift_publish_environment.
  6. Under Analytics tools, choose Amazon Redshift to open the Amazon Redshift query editor.
  7. Select the Redshift cluster that you want to connect to, choose Save, and then choose Create Connection using temporary credentials with your IAM identity.
  8. Create a new Redshift table named rs_sls_tbl. Use the following CTAS script, which creates a table with sample sales data in the datazone_env_redshift_publish_environment schema.
    CREATE TABLE "datazone_env_redshift_publish_environment"."rs_sls_tbl" AS
    SELECT 146776932 AS ord_num, 23 AS sales_qty_sld, 23.4 AS wholesale_cost, 45.0 as lst_pr, 43.0 as sell_pr, 2.0 as disnt, 12 as ship_mode,13 as warehouse_id, 23 as item_id, 34 as ctlg_page, 232 as ship_cust_id, 4556 as bill_cust_id
    UNION ALL SELECT 46776931, 24, 24.4, 46, 44, 1, 14, 15, 24, 35, 222, 4551
    UNION ALL SELECT 46777394, 42, 43.4, 60, 50, 10, 30, 20, 27, 43, 241, 4565
    UNION ALL SELECT 46777831, 33, 40.4, 51, 46, 15, 16, 26, 33, 40, 234, 4563
    UNION ALL SELECT 46779160, 29, 26.4, 50, 61, 8, 31, 15, 36, 40, 242, 4562
    UNION ALL SELECT 46778595, 43, 28.4, 49, 47, 7, 28, 22, 27, 43, 224, 4555
    UNION ALL SELECT 46779482, 34, 33.4, 64, 44, 10, 17, 27, 43, 52, 222, 4556
    UNION ALL SELECT 46779650, 39, 37.4, 51, 62, 13, 31, 25, 31, 52, 224, 4551
    UNION ALL SELECT 46780524, 33, 40.4, 60, 53, 18, 32, 31, 31, 39, 232, 4563
    UNION ALL SELECT 46780634, 39, 35.4, 46, 44, 16, 33, 19, 31, 52, 242, 4557
    UNION ALL SELECT 46781887, 24, 30.4, 54, 62, 13, 18, 29, 24, 52, 223, 4561

  9. Make sure that the rs_sls_tbl table is successfully created.

Publish assets into the common business catalog

In this step, you create and run the Amazon DataZone data sources for AWS Glue and Amazon Redshift. You will then publish the data assets from these data sources.

The Amazon DataZone data sources let you connect to various data sources, including databases, data warehouses, and data lakes, and ingest metadata into Amazon DataZone. By creating and running these data sources, you can make your data available for analysis, transformation, and sharing within your organization.

After the data sources are set up, you can publish the data assets from these sources to make them available to other users and applications. This process involves mapping the data assets to the appropriate business terms and metadata, making sure that the data is properly described and categorized.

Add an AWS Glue data source to publish the new AWS Glue table

  1. Stay signed in to the producer account and Amazon DataZone console as a data publisher.
  2. Choose Select project from the top navigation pane and select the Glue_Publish_Project that you want to add the data source to.
  3. Select Glue_Publish_Environment.
  4. Choose Create data source. Enter glue-publish-datasource as the name.
  5. Under Data source type, select AWS Glue.
  6. Under Select an environment, select Glue_Publish_Environment.
  7. Under Data selection, select the AWS Glue database glue_publish_environment_pub_db, enter "*" as your table selection criteria, and then choose Next.
  8. Leave all other settings as default and choose Next.
  9. For Run preference, select Run on demand to ingest metadata from the specified AWS Glue tables into Amazon DataZone.
  10. Review and choose Create.
  11. After the data source has been created, choose Run. The mkt_sls_table will be listed in the inventory and available to publish.
  12. Select the mkt_sls_table table and review the metadata that was generated. Choose Accept All when you're satisfied with the metadata.
  13. Choose Publish Asset, and the mkt_sls_table table will be published to the business data catalog, making it discoverable and understandable across your organization.

Add an Amazon Redshift data source to publish the new Amazon Redshift table

  1. Stay signed in to the producer account and Amazon DataZone console as a data publisher.
  2. Choose Select project from the top navigation pane and select the Redshift_Publish_Project that you want to add the data source to.
  3. Select Redshift_Publish_Environment.
  4. Choose Create data source. Enter rs-publish-datasource as the name.
  5. Under Data source type, select Amazon Redshift.
  6. Under Select an environment, select Redshift_Publish_Environment.
  7. Under Redshift Credentials, enter the Redshift cluster and secret details provided by the data administrator.
  8. Under Data selection, select the database dev and the schema datazone_env_redshift_publish_environment.
  9. Keep the other settings as default and choose Next.
  10. For Run preference, select Run on demand.
  11. Choose Save. After the data source is created, choose Run. The data source runs, and rs_sls_tbl will be listed in the inventory and available to publish.
  12. Select the rs_sls_tbl table and review the metadata that was generated. Choose Accept All when you're satisfied with the metadata.
  13. Choose Publish Asset, and the rs_sls_tbl table will be published to the business data catalog.

Create subscribe projects for AWS Glue and Amazon Redshift

In this step, you create the projects for subscribing to AWS Glue and Amazon Redshift data assets within your Amazon DataZone domain.

  1. Sign in to the Amazon DataZone console as a data subscriber IAM user using the consumer account.
  2. Choose Associated domains and select demo_cross_account_domain.
  3. Choose the Open data portal link and sign in to the data portal.
  4. Choose Create New Project and create a project named Glue_Subscribe_Project for subscribing to the AWS Glue data assets.
  5. Create another project named Redshift_Subscribe_Project for subscribing to the Redshift data assets.

Create AWS Glue and Amazon Redshift environment profiles

In this step, you will set up the environment profiles and environments for AWS Glue and Amazon Redshift in your Amazon DataZone projects. This will let you connect to and interact with resources across AWS accounts.

The purpose of environment profiles in Amazon DataZone is to streamline the process of environment creation. By using environment profiles, you can preconfigure essential placement information such as AWS account and AWS Region. In this solution, you will configure environment profiles with placement information pointing to your consumer account.

You will also create an Amazon DataZone environment from the profiles you are about to create. This will provision the necessary resources in the consumer account and establish the connections between the Amazon DataZone domain and the consumer account. After the environments are created, you can work with AWS Glue and Amazon Redshift assets seamlessly across different AWS accounts within your Amazon DataZone ecosystem.

Create an AWS Glue profile and environment

  1. Stay signed in to the consumer account's Amazon DataZone console as a data subscriber IAM user, select the Environments tab, and then choose Create environment profile.
  2. Configure the fields as follows:
    1. Name: Enter glue_subscribe-env-profile.
    2. Owner: The project where the profile is being created is selected by default in this field. Verify that it's Glue_Subscribe_Project.
    3. Blueprint: Select Default Data Lake.
    4. AWS account parameters: Enter the consumer AWS account number and select the Region.
    5. Authorized projects: Select All projects.
    6. Publishing: Select Publish from any database.
    7. Choose Create Environment Profile.
  3. On the Create environment page, enter the following:
    1. Name: Enter glue_subscribe_environment.
    2. Verify that the Environment profile is set to glue_subscribe-env-profile.
  4. (Optional) Parameters: Enter the Producer glue db name, Consumer glue db name, and Workgroup name.
  5. Choose Create environment.
  6. It takes a few minutes for the environment to be created. Verify that the environment creation is successful without any errors.

Create a Redshift environment profile and environment

  1. Staying in the consumer account's Amazon DataZone console as a data subscriber IAM user, navigate to the Redshift_Subscribe_Project you created previously.
  2. Select the Environments tab and then choose Create environment profile.
  3. Configure the fields as follows:
    1. Name: Enter redshift_subscribe-env-profile.
    2. Owner: Verify that Project is set to Redshift_Subscribe_Project.
    3. Blueprint: Select Default Data Warehouse.
    4. Parameter set: Select Enter my own.
    5. AWS account parameters: Enter the consumer AWS account number and select the Region.
    6. Parameters: Select either Amazon Redshift Cluster or Amazon Redshift Serverless in the consumer account.
      • AWS Secret ARN: Enter the AWS Secrets Manager secret ARN for the Redshift cluster or workgroup. Make sure that the secret in Secrets Manager includes the following tags. These tags help Amazon DataZone enforce proper access control so that only authorized users within the correct Amazon DataZone project and domain can access the Amazon Redshift resource.
        1. AmazonDataZoneDomain: [Domain_ID]
        2. AmazonDataZoneProject: [Project_ID]

      For more information about creating a Redshift database user secret in Secrets Manager, see Storing database credentials in AWS Secrets Manager.

      Note that the database user you provide in AWS Secrets Manager must have superuser permissions. Data publishers should work with the data administrator to get the details of the Redshift cluster or workgroup, database name, and secret ARN.

      • Redshift cluster name: Enter the name of the Amazon Redshift cluster or Amazon Redshift Serverless workgroup.
      • Database name: Enter the name of the database within the selected Amazon Redshift cluster or Amazon Redshift Serverless workgroup.
    7. Authorized projects: Select All projects.
    8. Publishing: Select Publish any schema.
  4. Choose Create environment profile.
  5. Create an environment from this profile:
    1. Name: Enter redshift_subscribe_environment.
    2. Verify that the Environment profile is set to redshift_subscribe-env-profile.
  6. Choose Create Environment.

It takes a few minutes for the environment to be created. Verify that the environment creation is successful without any errors.

Subscribe to the AWS Glue and Redshift tables

In this step, you will subscribe to the AWS Glue and Amazon Redshift tables published by the data producer.

Subscribe to the AWS Glue table

  1. Sign in to the Amazon DataZone console of the consumer account using the data subscriber credentials and navigate to the Glue_Subscribe_project you created previously.
  2. Search for the Market Sales Table in the Search bar.
  3. Select the Market Sales Table and choose Subscribe.
  4. In the Subscribe pop-up window, provide the following information:
    • Project: Enter the name of the project that you want to subscribe to the asset. By default this will be Glue_Subscribe_Project.
    • Enter a justification for your subscription request.
  5. Choose Subscribe.
  6. Switch to the data publisher role and choose Approve to approve the subscription request, then switch back to the data subscriber.
  7. Select the Glue_subscribe_project and choose Subscribed Assets. Verify that the Market Sales Table is added to your environment.
  8. Navigate to the Amazon Athena query editor using the link on the project's home page.
  9. Choose OPEN AMAZON ATHENA.
  10. You will be automatically routed to the Athena console. Make sure that the Amazon DataZone Environment is set to glue_subscribe_environment.
  11. For Database, select glue_subscribe_environment_sub_db.
  12. You should see mkt_sls_table in the Tables list. Preview the table by choosing the three-dot menu next to the table name and selecting Preview Table.
  13. Review the table preview results. You will be able to see all the sales-related data from mkt_sls_table.

Subscribe to the Redshift table

  1. Stay signed in to the Amazon DataZone console as the data subscriber. Choose Select project from the top navigation pane and select Redshift_Subscribe_project.
  2. Search for Sales Table in the search bar, and select the Sales Table.
  3. In the Subscribe pop-up window, provide the following information:
    • Project: Enter the name of the project that you want to subscribe to the asset. By default this will be Redshift_Subscribe_Project.
    • Enter a justification for your subscription request.
  4. Choose Subscribe.
  5. Switch to the data publisher, who is the producer of the table, and choose Approve.
  6. After the subscription request is approved, switch back to the data subscriber.
  7. Select the Redshift_subscribe_project and choose Subscribed Assets. After the Sales Table is added to your environment, you can query the data in the table.
  8. Select the Amazon Redshift link in the right side panel of the project home page and navigate to the Amazon Redshift query editor.
  9. Choose Open Amazon Redshift, and the Redshift query editor v2 will open in a new tab.
  10. In the query editor, right-click your Amazon DataZone environment's Amazon Redshift cluster and select Create a connection.
  11. Select Temporary credentials using your IAM identity for authentication.
    • If that authentication method isn't available, open Account settings by choosing the gear icon in the bottom left corner, select Authenticate with IAM credentials, and choose Save.
  12. Enter the name of the Amazon DataZone environment's database to create the connection.
  13. Choose Create connection.
  14. You can now view the Redshift table rs_sls_tbl in the datazone_env_redshift_subscribe_environment schema.
  15. Run the following query to verify that the data is accessible:
SELECT * FROM "dev"."datazone_env_redshift_subscribe_environment"."rs_sls_tbl";

You will be able to preview rs_sls_tbl, which shows the sales data from the table.

Clean up

To avoid unnecessary future charges, follow these steps:

Summary

Organizations often face significant challenges when trying to share data products across multiple AWS accounts. These challenges stem from the complexity of configuring proper cross-account access permissions and roles while maintaining robust data governance and security controls.

You can use the solution described in this post to publish and consume data across AWS accounts and make sure that reliable access and consistent data governance are in place. By combining the power of AWS Glue and Amazon Redshift, you can unlock valuable insights and accelerate your data-driven decision-making processes.

In this post, you followed a step-by-step guide to set up cross-account data sharing using Amazon DataZone domain association. You learned how to publish data assets from a producer account. You also learned how to subscribe to and query the published assets from a consumer account. You can optionally use AWS Lake Formation access monitoring to view permissions and data access activities. AWS Lake Formation uses AWS CloudTrail for historical analysis, and CloudTrail retains logs for 90 days by default.

Now that you're familiar with the elements involved in cross-account data sharing using Amazon DataZone and your choice of analytical tool, you're ready to try it with multiple accounts.


About the Authors

Arun Pradeep Selvaraj is a Senior Solutions Architect at AWS. Arun is passionate about working with his customers and stakeholders on digital transformations and innovation in the cloud while continuing to learn, build, and reinvent. He is creative, fast-paced, deeply customer-obsessed, and uses the working backwards process to build modern architectures to help customers solve their unique challenges. Connect with him on LinkedIn.

Piyush Mattoo is a Senior Solution Architect for the Financial Services Data Provider segment at Amazon Web Services. He's a software technology leader with over a decade of experience building scalable and distributed software systems to enable business value through the use of technology. He has an educational background in Computer Science with a master's degree in computer and information science from University of Massachusetts. He is based out of Southern California, and his current interests include camping and nature walks.

Mani Yamaraja is a Senior Customer Solutions Manager for the Financial Services Data Provider segment at Amazon Web Services. He has over a decade of experience working with financial services customers, enabling their digital transformation journey. Mani adopts a customer-centric approach and provides technology solutions working backwards from the customer's business goals. He is passionate about the financial services industry and helps customers accelerate their cloud-based transformation using the proven mechanisms of AWS.
