Alteryx Project ETL Image

We recently had the opportunity to learn and use Alteryx to develop an ETL data workflow in support of a data analysis project. We were naturally a little skeptical of new ETL tools, but Alteryx won us over.

Note: Alteryx also provides machine learning capabilities not evaluated during this Alteryx project.

Alteryx Project Background: Becoming the tool of choice

This ETL project required retrieving call center fact and dimensional data from a remote server, cleaning and loading the data into HP Vertica tables and staging the data for a Tableau dashboard to be used for analysis by the TurboTax Customer Care team.

At the onset, the project team had key performance indicators, but knowing these would evolve with each iteration of the project, the data modeling and review process needed to iterate quickly. Furthermore, there wasn’t a lot of time to learn a new tool before tax season, so the tool needed to be intuitive, easy to learn, and in the end, deliver enterprise reliability and performance.

Alteryx fits these project requirements nicely.

An overview of Alteryx for ETL

Alteryx Workflow

Alteryx works on the concept of designing data streams by dragging, dropping and configuring predefined tools onto a workflow canvas. The general idea is to build a data stream from input to output, applying  necessary transforms and custom logic in between.

During this Alteryx Project, we learned they support a plethora of data connectors and select API’s out of the box, but it also provides a rich set of tools for defining and incorporating workflow parameters that can implement complex business logic. It is also possible to develop your own custom tools if needed.

The Alteryx UI is fleet of foot and allows the user to experiment without jeopardizing productivity. The balance Alteryx achieves between the abstraction of the technical, which facilitates productivity for the analyst, and access to the inner workings of the tool to extend functionality or troubleshoot problems by an engineer is just right.

The Alteryx Community

When we encountered an insurmountable issue or question, the Alteryx community was able to help in most all cases.

Alteryx has a growing and vibrant online community where users have access to information about the product and how-to’s. Users can also post questions to the community about problems they are having or seek peer review of development approaches.

You will also find more advanced users and Alteryx engineers sharing custom tools they have developed which can be really helpful. This community is open and ready to share, and official Alteryx support is responsive and helpful.

Seasoned SQL Developers may become frustrated

There are times when running a bit of SQL is the most efficient and elegant method for a given transformation task, but this proved more challenging than we anticipated, given our experience with other ETL tools.

It is not as simple as dragging an ‘Execute SQL’ tool to the canvas as no tool currently exists.  Alteryx prefers the use of a batch macro, but for a seasoned SQL Developer, these might prove to be a little tedious in comparison. For the less technical, a batch macro might be just the ticket.

In this case, we were able to run some SQL as part of an output process to achieve the objective. Hopefully, we’ll get the ‘Execute SQL’ tool in a future release!

Final thoughts on Piloting an Alteryx Project

Finally, we found Alteryx to perform quite well.

The key to designing efficient workflows is to follow the Alteryx Best Practices Guide published in the online community.

It is particularly important to size input fields appropriately as using the default field length can consume server memory and slow down data stream throughput. Fortunately, there is a tool for that. Checkout the ‘Auto Field’ tool that sizes the working fields in a data stream based on the actual data it encounters.

Overall, we think Alteryx has in many ways re-invented the ETL tool experience and set the bar for others in this competitive application space. Add to rich ETL capabilities the advanced analytics feature, that we have yet to explore, and the outlook gets even better.

Written By


Evolytics

This post is curated content from the Evolytics staff, bringing you the most interesting news in data and analysis from around the web. The Evolytics staff has proven experience and expertise in analytics strategy, tagging implementation, data engineering, and data visualization.