14.8 C
United States of America
Thursday, November 14, 2024

Speed up SQL code migration from Google BigQuery to Amazon Redshift utilizing BladeBridge


Accelerating SQL code migration from Google BigQuery to Amazon Redshift could be a complicated and time-consuming job. Companies typically battle to effectively translate their current BigQuery code to Amazon Redshift, which may delay important knowledge modernization initiatives. Nevertheless, with the suitable instruments and method, this migration course of might be considerably streamlined.

This publish explores how you need to use BladeBridge, a number one knowledge atmosphere modernization answer, to simplify and speed up the migration of SQL code from BigQuery to Amazon Redshift. BladeBridge gives a complete suite of instruments that automate a lot of the complicated conversion work, permitting organizations to rapidly and reliably transition their knowledge analytics capabilities to the scalable Amazon Redshift knowledge warehouse. BladeBridge offers a configurable framework to seamlessly convert legacy metadata and code into extra fashionable providers similar to Amazon Redshift.

Amazon Redshift is a completely managed knowledge warehouse service provided by Amazon Net Companies (AWS). Tens of hundreds of consumers use Amazon Redshift day-after-day to run analytics, processing exabytes of knowledge for enterprise insights. Whether or not your rising knowledge is saved in operational knowledge shops, knowledge lakes, streaming knowledge providers, or third-party datasets, Amazon Redshift helps you securely entry, mix, and share knowledge with minimal motion or copying. Amazon Redshift is constructed for scale and delivers as much as 7.9 occasions higher worth efficiency than different cloud knowledge warehouses.

By utilizing the BladeBridge Analyzer and BladeBridge Converter instruments, organizations can considerably scale back the effort and time required emigrate BigQuery code to Amazon Redshift. The Analyzer offers detailed assessments of the complexity and necessities for the migration, and the Converter automates the precise code conversion course of, utilizing pattern-based customizable guidelines to streamline the transition.

On this publish, we stroll by means of the step-by-step technique of utilizing BladeBridge to speed up the migration of BigQuery SQL code to Amazon Redshift.

Answer overview

The BladeBridge answer consists of two key elements: the BladeBridge Analyzer and the BladeBridge Converter.

BladeBridge Analyzer

The Analyzer is designed to completely assess the complexities of the present knowledge atmosphere, on this case, Google BigQuery. After evaluation of the supply SQL recordsdata, it generates a complete report that gives beneficial insights into the migration effort. The Analyzer report contains the next:

  • Abstract of the entire variety of SQL scripts, file scripts, knowledge definition language (DDL) statements, and different key metrics
  • Categorization of the SQL code complexity into ranges similar to low, medium, complicated, and really complicated
  • Insights that assist each the organizations and programs integrators put together extra correct challenge estimates and migration plans

BladeBridge Converter

The Converter is a pattern-based automation device that streamlines the precise code conversion course of from BigQuery to Amazon Redshift. The Converter makes use of a set of predefined conversion guidelines and patterns to robotically translate 70–95% of the legacy SQL code. This considerably reduces the guide effort required by builders. The Converter works by doing the next:

  • Parsing the supply SQL recordsdata and analyzing the code semantically
  • Making use of the suitable translation guidelines and patterns to transform supply database code to the goal, on this case, Google BigQuery to Amazon Redshift

The out-of-the-box code handles most conversions. The Converter permits builders to customise the conversion patterns for extra complicated transformations.

The next is the migration process:

  1. Put together SQL recordsdata
  2. Utilizing BladeBridge Analyzer, create an analyzer report
  3. Buy license keys for converter
  4. Utilizing BladeBridge Converter, convert SQL recordsdata

The next diagram illustrates these steps.

Stipulations

You want the next stipulations to implement the answer:

Answer walkthrough

Observe these answer steps:

Put together SQL recordsdata

For SQL knowledge warehouses similar to BigQuery, code preparation begins by exporting the SQL recordsdata out of the information warehouse answer. In case your BigQuery SQL code is saved in a single file containing a number of database objects, you might want to cut up them into particular person recordsdata earlier than utilizing the BladeBridge instruments to transform the code to Amazon Redshift. To separate into a number of recordsdata, you need to use the BladeBridge SQL File Splitter utility. The BladeBridge conversion course of is optimized to work with every database object (for instance, tables, views, and materialized views) and code object (for instance, saved procedures and capabilities) saved in its personal separate SQL file. This permits the BladeBridge Analyzer to scan every file individually, gaining a complete understanding of the code patterns, complexity, and construction. To make use of BladeBridge SQL File Splitter utility, observe these steps:

  1. Log in to BladeBridge portal and obtain the SQL file splitter utility on your working system.
  2. Create an enter file listing and place your BigQuery SQL code recordsdata within the listing.
  3. Create an empty output file listing. The recordsdata generated by the splitter shall be saved right here.
  4. Navigate to the listing the place you downloaded the bbsqlsplit executable file and run the next command in your terminal (Mac or Linux) or command immediate (Home windows), changing the enter and output file listing paths:

Syntax

bbsqlsplit

######## OPTIONS ########

-d <<enter file listing path>>

-o <<output file listing path>>

-E sql

Instance 

bbsqlsplit

-d C:UsersXXXXXDesktopBladeBridgeSplitFilesUtilitysource

-o C:UsersXXXXXDesktopBladeBridgeSplitFilesUtilitysplitFiles

-E sql

For extra choices of the bbsqlsplit command, consult with the SQL file cut up documentation within the BladeBridge group portal.

Utilizing BladeBridge Analyzer, create an analyzer report

The Analyzer offers an in depth evaluation of the present BigQuery code, producing a complete report that outlines the complexity and necessities for the migration to Amazon Redshift.

To run the BladeBridge Analyzer, observe these steps:

  1. Log in to the BladeBridge portal and navigate to the Analyzer Obtain
  2. Obtain the Analyzer executable file on your working system (for instance, bbanalyzer.exe for Home windows, bbanalyzer.gz for Linux and macOS). For macOS and Linux customers, you might want to deflate the downloaded gzip file.
  3. Obtain the configuration file (general_sql_specs.json) from the BladeBridge group portal, as proven within the following screenshot.

  1. On the BladeBridge group portal, select Belongings. This web page ought to show the Analyzer key on your group.

  1. From the belongings web page, obtain the Analyzer key as proven within the following screenshot.

Within the listing the place you downloaded the bbanalyzer executable file, run the next command in your terminal (Mac or Linux) or command immediate (Home windows), changing the mandatory paths.

Syntax

bbanalyzer

######## OPTIONS ########

-c <<path to your analyzer key>>

-t SQL

-d <<path to your supply code listing>>

-r <<identify for the output report>>.xlsx

Instance

bbanalyzer

-c C:UsersXXXXXDesktopBladeBridgeanalyzer_key.txt

-t SQL

-d C:UsersXXXXXDesktopBladeBridgeSplitFilesUtilitysplitFiles

-r analyzer_report.xlsx

After operating the command efficiently, the Analyzer generates a report. Overview the report completely, as a result of it offers a abstract and in-depth explanations of the SQL evaluation. The abstract sheet, proven within the following picture, offers an outline of the migration, together with the variety of complete SQL scripts, file scripts, and DDLs. Every SQL script is categorized into LOW, MEDIUM, COMPLEX, or VERY_COMPLEX complexities, that are decided by the Analyzer Complexity Willpower Algorithm. The abstract may also assist with understanding the general complexity and migration effort earlier than performing the precise conversion.

In case you observe an error when operating BladeBridge Analyzer, evaluate following troubleshooting ideas:

  • Configure the write permission – It’s possible you’ll want so as to add crucial permission to the analyzer executable file. For Mac and Linux customers, run chmod 755 ./bbanalyzer to switch the permission.
  • Permit operating third celebration software program – As a result of BladeBridge Analyzer is a third-party software program, MacOS might increase a warning or an error when operating Analyzer. In case you’re utilizing Mac, observe the directions in Open a Mac app from an unidentified developer.
  • Use native drive – In some instances, you would possibly encounter an error if the executable is situated in a community drive. We advocate that you just run the executable on the native drive.
  • Don’t embrace whitespace within the path – Be certain that the trail to the executable doesn’t include a listing with areas within the listing identify.

For extra particulars, consult with the BladeBridge Analyzer Demo.

Buy license keys for convertor

To make use of the BladeBridge Converter and automate the code translation course of, you might want to buy the mandatory license keys. These license keys are tied to the particular SQL recordsdata you’re changing, ensuring that updates to the supply code require the suitable license.

To acquire the license keys, observe these steps:

  1. Share the output of the BladeBridge Analyzer report and the offered pricing calculator Excel sheet with BladeBridge.
  2. The BladeBridge staff will evaluate the data and offer you the required license keys to run the Converter.

The license secret is tied to the file hash of the SQL recordsdata you’re changing. In case you make updates to the supply SQL recordsdata, you might want to buy new license keys to transform the modified code. Due to this fact, be sure to buy the mandatory license keys and handle your recordsdata with a model management system to have easy transitions when changing your BigQuery SQL code to Amazon Redshift.

Utilizing BladeBridge Converter, convert SQL recordsdata

The Converter makes use of the predefined conversion guidelines which can be out there within the out-of-the-box configuration recordsdata to robotically translate 70–95% of the legacy code, considerably decreasing guide effort on your improvement staff. The out-of-the-box configuration file handles conversion for frequent code patterns from Google BigQuery to Amazon Redshift. For these customized patterns that aren’t coated by an out-of-the-box configuration file, you may create customized conversion guidelines by creating extra configuration recordsdata.

Observe these steps to run the BladeBridge Converter:

  1. Log in to the BladeBridge portal and on the Convertor downloads web page, obtain the Convertor executable file on your working system (sqlconv.exe for Home windows or sqlconv.gz for Mac or Linux)

  1. From the identical web page, obtain the configuration file (general_sql_specs.json)
  2. Create an output listing the place the transformed recordsdata shall be saved
  3. Within the folder the place you downloaded the Convertor executable sqlconv, run the next command

Syntax

sqlconv

-c <<converter license file identify obtained from BladeBridge>>

-d <<enter folder for SQL recordsdata>>

-n <<output folder for transformed recordsdata>>

-u <<path for the config recordsdata/s present at the least one file>>

Instance

sqlconv

-c converter_license.txt

-d C:UsersXXXXXDesktopBladeBridgeSplitFilesUtilitysplitFiles

-n C:UsersXXXXXDesktopBladeBridgeSplitFilesUtilitycnvrtdFiles

-u bq2redshift.json

  1. Run the generated SQL recordsdata in your Amazon Redshift knowledge warehouse. In case you encounter errors, analyze them and decide if customized conversion guidelines, not already coated within the out-of-the-box configuration recordsdata, should be utilized.
  2. If customized conversion guidelines are wanted, create a brand new configuration file following the rules within the Customise Configuration File part. Present the brand new config file identify within the -u choice and rerun the Converter.
  3. Repeat these steps till all recordsdata are transformed efficiently or manually modified.

Customise configuration file

Customizing a configuration file is an iterative course of that may assist automate the conversion for occurrences in your codebase. Nevertheless, guide conversion could also be required if the conversion is required for only some recordsdata and some occurrences.

The configuration is outlined in a JSON file. There’s a normal configuration file with frequent guidelines and customized configuration recordsdata for every shopper with client-specific guidelines. Guidelines might be added to the final configuration file if they’re relevant for all shoppers. For client-specific guidelines, a separate JSON file ought to be created and referenced. This retains the final guidelines clear and arranged.

The conversion guidelines in BladeBridge’s configuration file fall into certainly one of three classes:

  1. Line substitution
  2. Block substitution
  3. Operate substitution

Each line ending with a ; is an announcement. This line ending additionally might be changed with different breakers. Discuss with this BladeBridge documentation to get extra particulars on SQL and expression conversion.

The next are issues whereas utilizing the personalized configuration:

  • Nested capabilities in BigQuery enable for complicated operations inside a single SQL assertion, which can should be damaged down into a number of steps in Amazon Redshift
  • Array capabilities in BigQuery present capabilities for manipulating and reworking array knowledge sorts, which can require different approaches in Amazon Redshift
  • You must rigorously analyze the necessities and implement workarounds or different options when migrating from BigQuery to Amazon Redshift, particularly for superior performance in a roundabout way supported in Amazon Redshift

Line substitution

Line substitution applies common expressions to every line of code. This has the from clause, which has the expression to be transformed. The to part has the goal mapping for which it’ll be transformed. Statement_categories restrict the applying of line substitution to particular statements similar to DDL or process. For instance:

  • The primary expression within the following code instance replaces the common expression sample ROWNUM with the SQL expression row_number() over (order by 1)
  • The second expression within the following code instance replaces the common expression sample SYSDATE with the SQL expression CURRENT_TIMESTAMP.
line_subst” : [
{“from” : “bROWNUMb”, “to” : “row_number() over (order by 1)”},
{“from” : “SYSDATE”, “to” : “CURRENT_TIMESTAMP”}
]

Block substitution

Block substitution applies common expressions throughout a number of traces. This is applicable to statements that stretch over a number of traces, that are usually extra complicated than the road substitutions. The next expression within the instance replaces the block. On this instance, the process is created within the goal database.

BEGIN

EXECUTE IMMEDIATE(‘SQL Assertion’);

EXCEPTION WHEN OTHERS

THEN

NULL

END;

To

CALL SP_DYN_SQL(‘parameters’);

“block_subst” : [
{“from”: “BEGIN(.*?)execute immediate(.*);.*exceptions*whens*otherss*thens*null(.*?)end;(.*)”, “to”: “CALL sp_dyn_sql($2);”}
]

Operate substitution

Operate substitution permits changing one perform with an equal perform within the goal knowledge warehouse. The configuration additionally permits for specifying customized capabilities.

Operate substitution factors to an array of directions accountable for altering perform calls. This part is used when perform translations are required or perform arguments (perform signature) must be altered. The next expression converts the NVL2 perform to CASE perform on Amazon Redshift.

“function_subst” : [
{“from”: “NVL2”, “output_template” : “CASE WHEN $1 IS NOT NULL THEN $2 ELSE $3 END”}
]

Conclusion

On this publish, we demonstrated the way to use the BladeBridge Analyzer and BladeBridge Converter to streamline the migration of SQL code from Google BigQuery to Amazon Redshift. By utilizing BladeBridge, organizations can considerably scale back the effort and time required to translate their current BigQuery code for migration to the Amazon Redshift knowledge warehouse. The Analyzer offers an in depth evaluation of the supply SQL code, and the Converter automates the precise conversion course of utilizing a set of predefined, customizable guidelines and patterns.

We additionally coated the customization capabilities of the BladeBridge answer, showcasing how one can tailor the conversion guidelines to deal with extra complicated transformations. By utilizing the road substitution, block substitution, and performance substitution options, you may have a seamless migration that addresses the distinctive necessities of your knowledge analytics infrastructure.

We encourage you to check out BladeBridge’s GCP BigQuery to Amazon Redshift answer and discover the varied configuration choices. In case you encounter any challenges or have extra necessities, consult with the BladeBridge group help portal or attain out to the BladeBridge staff for additional help.


In regards to the authors

Ritesh Kumar Sinha is an Analytics Specialist Options Architect based mostly out of San Francisco. He has helped prospects construct scalable knowledge warehousing and large knowledge options for over 16 years. He likes to design and construct environment friendly end-to-end options on AWS. In his spare time, he loves studying, strolling, and doing yoga.

Anusha Challa is a Senior Analytics Specialist Options Architect targeted on Amazon Redshift. She has helped many shoppers construct large-scale knowledge warehouse options within the cloud and on premises. She is captivated with knowledge analytics and knowledge science.

Yota Hamaoka is an Analytics Answer Architect at Amazon Net Companies. He’s targeted on driving prospects to speed up their analytics journey with Amazon Redshift.

Milind Oke is a Knowledge Warehouse Specialist Options Architect based mostly out of New York. He has been constructing knowledge warehouse options for over 15 years and focuses on Amazon Redshift.

Raza Hafeez is a Senior Product Supervisor at Amazon Redshift. He has over 13 years {of professional} expertise constructing and optimizing enterprise knowledge warehouses and is captivated with enabling prospects to appreciate the ability of their knowledge. He focuses on migrating enterprise knowledge warehouses to AWS Fashionable Knowledge Structure.

Related Articles

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Latest Articles