How Reverse ETL Can Lighten Your Data Load
App builders, data engineers, and IT staff face recurring problems with data warehouses and with moving data between apps. We all know that our businesses can reap significant benefits if we are intelligent with our data. Reverse ETL tools can help.
There are many options available for moving data. Some have been around for years and have evolved over time, like ETL (extract, transform, load) and custom-built connections.
Others were born out of necessity, like ELT (extract, load, transform) and event streaming. The requirements and use cases for these data pipeline tools are becoming more complex and demanding. A novel but sensible use case has emerged in recent years from these ever-increasing requirements: moving data from your data warehouse into the cloud applications you use. To meet this need, a new kind of data pipeline was created: reverse ETL.
What is reverse ETL?
Reverse ETL works by moving data from your data warehouse into your cloud applications. Reverse ETL tools sync data on a regular schedule. A sync can also be triggered by calling an API (application programming interface) endpoint or by integrating with tools such as Airflow or dbt.
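To make the idea concrete, here is a minimal sketch of a single sync run in Python. It is not how any particular vendor implements reverse ETL. An in-memory SQLite database stands in for the data warehouse, a plain list stands in for the cloud application's API, and the names `fetch_rows`, `sync`, and the `customers` table are assumptions made up for illustration.

```python
import sqlite3


def fetch_rows(conn, query):
    """Run the query against the warehouse and return each row as a dict."""
    conn.row_factory = sqlite3.Row
    return [dict(row) for row in conn.execute(query)]


def sync(conn, query, send):
    """Read rows from the warehouse and hand each one to the destination."""
    rows = fetch_rows(conn, query)
    for row in rows:
        send(row)
    return len(rows)


# In-memory SQLite stands in for the data warehouse in this sketch.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE customers (email TEXT, lifetime_value REAL)")
warehouse.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("ada@example.com", 1200.0), ("bob@example.com", 80.0)],
)

# A list stands in for the cloud application's API (hypothetical destination).
sent = []
count = sync(
    warehouse,
    "SELECT email, lifetime_value FROM customers ORDER BY email",
    sent.append,
)
print(count, sent)
```

In a real deployment, a scheduler or an orchestration tool would call something like `sync` on each run, and `send` would make an authenticated request to the destination application.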
What can I do with reverse ETL?
You can realize much of the promise of data science with reverse ETL tools. The complex and valuable analysis and modeling your data teams perform can live in your data warehouse. Your data scientists' work becomes more valuable because the enriched, post-analysis data can be used to automatically keep your business applications current. This allows data scientists to deliver value in real time, as opposed to through the manual processes used in many businesses today.
Reverse ETL tools are focused on customer data. They can be used to solve problems that involve combining data from your website, digital products, or any other cloud application. These are the most common uses of reverse ETL tools:
- Building more complete customer profiles (sometimes called “customer 360”)
- Creating more targeted, granular audiences
- Scoring leads based on your business-specific criteria
- Identifying "at-risk" customers, or the customers most likely to churn
- Delivering data to cloud applications for better reporting
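As an illustration of one of these use cases, the sketch below flags at-risk customers with a simple rule. The `Customer` fields and the thresholds are hypothetical. In practice the criteria are specific to your business and would usually be computed in the warehouse before being synced to an application.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    email: str
    days_since_last_login: int
    open_support_tickets: int


def is_at_risk(customer, max_idle_days=30, max_tickets=3):
    """Flag a customer who has gone quiet or has many open tickets.

    The thresholds are hypothetical; real criteria are business-specific.
    """
    return (
        customer.days_since_last_login > max_idle_days
        or customer.open_support_tickets > max_tickets
    )


customers = [
    Customer("ada@example.com", 2, 0),
    Customer("bob@example.com", 45, 1),
]

# Collect the emails of customers that match the at-risk rule.
at_risk_emails = [c.email for c in customers if is_at_risk(c)]
print(at_risk_emails)
```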
Who are the leading reverse ETL vendors?
There are many reverse ETL tools to choose from, and they all work in much the same way. First, you create a source connection to your data warehouse. Next, you configure a destination connection for a cloud application. Finally, you write an SQL statement (or choose a table) to select the data that needs to be synchronized, define your mappings, and establish a sync schedule. Despite the similar functionality of reverse ETL tools, three vendors stand out:
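The setup steps above can be sketched as a small configuration object. This is an illustration only and not any vendor's actual configuration format. `SyncConfig`, `apply_mappings`, and every name and value in the example (including the destination field names) are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class SyncConfig:
    source: str        # name of the warehouse connection
    destination: str   # name of the cloud application connection
    model_sql: str     # query that selects the rows to sync
    mappings: dict = field(default_factory=dict)  # warehouse column -> destination field
    schedule_minutes: int = 60  # how often the sync runs


def apply_mappings(row, mappings):
    """Rename warehouse columns to destination fields, dropping unmapped columns."""
    return {dest: row[src] for src, dest in mappings.items() if src in row}


# Hypothetical configuration: all names below are made up for this sketch.
config = SyncConfig(
    source="analytics_warehouse",
    destination="crm",
    model_sql="SELECT email, lead_score FROM scored_leads",
    mappings={"email": "Email", "lead_score": "Lead_Score"},
    schedule_minutes=30,
)

record = apply_mappings(
    {"email": "ada@example.com", "lead_score": 87, "internal_id": 5},
    config.mappings,
)
print(record)
```

Keeping the mappings as data rather than code is one reason these tools can be configured without writing a custom connection for each application.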
Hightouch believes that your data warehouse should be your source of truth for customer data, and it makes it simple to sync this data to any cloud-based tools your business uses. Hightouch stands out because it is a mature tool that offers more source and destination integrations than any other pure-play reverse ETL tool. In the past six to twelve months, Hightouch has grown its integration library more quickly than Census (see below). This is crucial because integrations determine how much flexibility your company has in tool selection. The more integrations a reverse ETL tool offers, the more useful it is.
Grafana, Plaid, Zeplin, and Mattermost are all Hightouch customers.
Census could be considered the industry standard for reverse ETL. Although Census isn't as popular as Hightouch, it gained traction first and has a large customer base. It is a mature tool with many integrations, though fewer than Hightouch.
Fivetran and Netlify are Census customers.
These two are the most popular reverse ETL tools, so you're most likely to choose between Hightouch and Census. Because Hightouch and Census use different pricing models, your decision will come down to available integrations and pricing. Hightouch pricing is based on how many records you sync each month, while Census pricing is based on how many data synchronization workflows your company uses.
RudderStack doesn't just do reverse ETL; it's also an event streaming platform. RudderStack started as an open-source alternative to Segment, which helped it gain a reputation and grow its customer base. Its ETL and reverse ETL capabilities made it a contender in the reverse ETL market earlier this year.
This combination of features is logical because reverse ETL relies on event streaming or event gathering tools (often Segment, Snowplow, or RudderStack) and on ETL tools to bring data into the warehouse. RudderStack is the only reverse ETL tool that can also bring customer data into your warehouse. The company also offers more destination integrations than Hightouch and Census, because as an event streaming tool it needs an extensive integration library to be competitive.
RudderStack customers include Crate & Barrel, Priceline, Acorns, and Hinge.
Segment has reverse ETL functionality too, but the company doesn't market itself as a reverse ETL vendor. Its Personas SQL Traits feature lets you sync data from your warehouse to your cloud applications, but the data has to go through Segment's Personas audience builder.
Segment launched Segment Data Lakes late last year, which creates a customer data lake for you. Reverse ETL is a less prominent part of Segment's offering.
Alternatives to reverse ETL
Reverse ETL can be used to create customer profiles, segment audiences, or support other customer-centric processes. These processes do not have a strict real-time requirement, which matters because loading and analyzing data in your warehouse in real time is not a common architectural pattern. Although data warehouses and online analytical processing (OLAP) databases can run complex queries and models quickly, they are not designed to return results fast enough for immediate application responses.
These real-time needs can be met by tools such as Rockset that provide real-time analytics for your applications. Rockset functions similarly to Elasticsearch. However, Rockset was built cloud-native and emphasizes SQL compatibility. This allows you to scale beyond Elasticsearch's capabilities and perform core SQL functions, such as joins, that Elasticsearch does not support.
For example, Rockset can be used to feed data to a constantly updated online leaderboard for a large multiplayer game. It is extremely difficult to process millions of players simultaneously, calculate millions of scores, and then sort the list in real time. However, this is an easy use case for Rockset.
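To show what the leaderboard computation involves, here is a minimal single-process sketch in Python. It says nothing about how Rockset works internally. The function name `top_players` and the sample scores are made up for illustration. The hard part in production is doing this continuously while millions of scores change, which this sketch does not attempt.

```python
import heapq


def top_players(scores, n):
    """Return the n highest (player, score) pairs, best first."""
    return heapq.nlargest(n, scores.items(), key=lambda item: item[1])


# Hypothetical scores keyed by player name.
scores = {"ada": 9100, "bob": 7200, "cy": 8800, "dee": 4300}
leaders = top_players(scores, 2)
print(leaders)
```

Selecting only the top few entries avoids sorting the whole list, but every score update still has to be ingested and queried quickly, which is where a real-time analytics database comes in.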