
Lightning-fast Data Transfer Between Tellius and Snowflake, Powered by Apache Arrow



Many of our largest customers run Tellius’s AI-powered insights, natural language search, and data prep on top of Snowflake’s cloud data warehouse every day. Among our 30+ out-of-the-box data connectors, which include other cloud data warehouses such as Redshift and BigQuery, Snowflake is extremely popular.

Why? In a previous post we highlighted four key reasons Tellius excels on Snowflake, such as the ability to push Tellius search and visualization queries down to the underlying Snowflake database without moving or copying data. Today we’d like to highlight a fifth reason to love Tellius on Snowflake: lightning-fast data transfer via Apache Arrow, a rising open-source technology that is downloaded more than 20 million times per month.

In this post, we’ll discuss our integration with Apache Arrow and how it makes loading data from Snowflake into Tellius up to 3x faster than in earlier versions. At the end, we hope you’ll take Tellius for a spin yourself to see how quickly you can start adding AI-powered analytics value to your business. Ready to get going?


[Figure: 5 reasons to love Tellius. TL;DR: Tellius and Snowflake pair nicely.]

Tellius Snowflake Connector

In our most recent release, we made major improvements to how we bring data from Snowflake into Tellius, improving both the connectivity and the scalability of the connector.

The Tellius Snowflake Connector uses the Spark Snowflake Connector to load data from Snowflake into the Tellius In-Memory Compute Engine (ICE). ICE allows users to automate complex analytical processes and find the key drivers behind business data at scale. Building on the Spark Snowflake Connector lets Tellius load data into ICE in a scalable manner, so customers can bring terabytes of data from Snowflake into Tellius.
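For readers unfamiliar with the Spark Snowflake Connector, here is a minimal sketch of what a read through it generally looks like from Python. It is not Tellius’s internal code. The helper names `snowflake_options` and `load_table` are hypothetical, every connection value is a placeholder, and the sketch assumes an existing SparkSession with the connector on its classpath.

```python
from typing import Dict

# Fully qualified data source name of the Spark Snowflake connector.
SNOWFLAKE_SOURCE = "net.snowflake.spark.snowflake"


def snowflake_options(account_url: str, user: str, password: str,
                      database: str, schema: str, warehouse: str) -> Dict[str, str]:
    """Build the connection options the connector reads (hypothetical helper)."""
    return {
        "sfURL": account_url,
        "sfUser": user,
        "sfPassword": password,
        "sfDatabase": database,
        "sfSchema": schema,
        "sfWarehouse": warehouse,
    }


def load_table(spark, options: Dict[str, str], table: str):
    """Read one Snowflake table into a Spark DataFrame.

    `spark` is assumed to be an existing SparkSession that already has the
    connector and the Snowflake JDBC driver available.
    """
    return (
        spark.read.format(SNOWFLAKE_SOURCE)
        .options(**options)
        .option("dbtable", table)
        .load()
    )


# Placeholder values only; replace with real account details before use.
example_options = snowflake_options(
    "myaccount.snowflakecomputing.com", "ANALYST", "secret",
    "SALES_DB", "PUBLIC", "COMPUTE_WH",
)
```

With a live session, `load_table(spark, example_options, "ORDERS")` would return a DataFrame whose partitions are fetched in parallel by Spark executors, which is where the scalability comes from.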

Previous Limitations of Snowflake Spark Connector

Earlier versions of Tellius used version 2.5.x of the Snowflake Spark connector, which had the following limitations:

  • For large datasets, the connector first unloaded the data to a Snowflake staging area in CSV or JSON format before loading it into Tellius. Writing to the staging area required write access to the underlying database, which customers were not comfortable granting. Even though Tellius does not write to the source system by default, we needed write access because of this connector behavior.
  • CSV and JSON are inefficient formats for bulk loading because they inflate the data well beyond the size of the original dataset, which increased load times in Tellius.
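To make the size point concrete, the sketch below serializes the same rows as CSV, as JSON, and as packed binary columns, then compares byte counts. The sample data is made up (wide integer ids and full-precision floats), and the binary layout uses Python’s standard `array` module as a stand-in for a columnar binary format; it is not Arrow itself. The actual ratio depends heavily on the data.

```python
import csv
import io
import json
from array import array

# Hypothetical sample rows: a wide integer id and a full-precision float.
rows = [(1_000_000_000 + i, i / 7) for i in range(10_000)]


def csv_size(data) -> int:
    """Bytes needed to store the rows as CSV text."""
    buf = io.StringIO()
    csv.writer(buf).writerows(data)
    return len(buf.getvalue().encode("utf-8"))


def json_size(data) -> int:
    """Bytes needed to store the rows as a JSON array of objects."""
    records = [{"id": i, "value": v} for i, v in data]
    return len(json.dumps(records).encode("utf-8"))


def binary_columnar_size(data) -> int:
    """Bytes needed when each column is one packed fixed-width buffer."""
    ids = array("q", (r[0] for r in data))     # signed 64-bit integers
    values = array("d", (r[1] for r in data))  # 64-bit floats
    return len(ids.tobytes()) + len(values.tobytes())


sizes = {
    "csv": csv_size(rows),
    "json": json_size(rows),
    "binary_columnar": binary_columnar_size(rows),
}
print(sizes)
```

Text formats spend a byte per digit plus delimiters, and JSON also repeats every field name in every record, while the packed columns spend a fixed width per value.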

Apache Arrow

Apache Arrow is a cross-language, cross-platform, columnar in-memory data format that allows zero-copy reads for lightning-fast data access without serialization overhead. The format enables efficient sharing of data between large-scale systems and is organized for efficient analytic operations on modern hardware such as CPUs and GPUs. More information is available on the Apache Arrow project website.
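The two ideas in that definition, columnar layout and zero-copy reads, can be illustrated with nothing but the Python standard library. The sketch below is an analogy, not Arrow: each column is one contiguous `array` buffer, and `memoryview` hands out slices of it without copying. The table contents and the `column_slice` helper are hypothetical.

```python
from array import array

# A tiny "table" stored column by column: each column is one contiguous buffer.
table = {
    "order_id": array("q", range(1, 1001)),
    "amount": array("d", (float(i % 100) for i in range(1000))),
}


def column_slice(tbl, name: str, start: int, stop: int) -> memoryview:
    """Return a view over rows [start, stop) of one column without copying."""
    return memoryview(tbl[name])[start:stop]


view = column_slice(table, "amount", 10, 20)

# Analytic operations can run directly on the shared buffer.
total = sum(view)

# The view aliases the column's memory: a write to the column is visible
# through the view, which a copy would not show.
table["amount"][10] = 999.0
shares_memory = view[0] == 999.0
```

Because a consumer only needs the address and length of each column buffer, two systems that agree on the layout can exchange data without converting it to and from a text format.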

Adoption of Apache Arrow in Snowflake

Starting with version 2.6.0 of the Spark connector, Snowflake adopted Apache Arrow as its standard format for transferring query results. This change makes it possible to read data from Snowflake without a staging area, and far more efficiently than with CSV or JSON.
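As we understand the connector documentation, the transfer path is controlled by a connector option named `use_copy_unload`: when it is false (the default in Arrow-capable versions), results are fetched in Arrow format, and when it is true the connector falls back to the older stage-and-CSV path. The sketch below only shows how such an option could be set on an options dictionary. The helper `with_transfer_mode` is hypothetical, and the option’s behavior should be verified against the documentation for your connector version.

```python
from typing import Dict


def with_transfer_mode(options: Dict[str, str], use_arrow: bool = True) -> Dict[str, str]:
    """Return a copy of connector options with the transfer path set explicitly.

    Assumption: `use_copy_unload` = "false" selects the Arrow result format,
    and "true" selects the older COPY UNLOAD path through a staging area.
    """
    updated = dict(options)  # leave the caller's dictionary untouched
    updated["use_copy_unload"] = "false" if use_arrow else "true"
    return updated


# Placeholder connection options; only the added key matters here.
base = {"sfURL": "myaccount.snowflakecomputing.com", "sfDatabase": "SALES_DB"}
arrow_options = with_transfer_mode(base)
legacy_options = with_transfer_mode(base, use_arrow=False)
```

Setting the option explicitly is rarely necessary, but it can be useful when comparing load times between the two paths.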

Bringing Apache Arrow Improvements to Tellius

Starting with Tellius 3.0, Tellius uses the Arrow format as the standard for caching in ICE. In addition, the underlying Snowflake connector has been upgraded to the latest version to take advantage of these improvements.

This upgrade improves the Tellius user experience, because customers no longer need to grant write access to their underlying system. And with Apache Arrow support in both Snowflake and the Tellius ICE engine, loads are up to 3x faster than in earlier versions.

Ready to take Tellius for a spin yourself? Try a 14-day free trial to experience the power of AI-powered analytics in your business.

