
Lightning-fast Data Transfer Between Tellius and Snowflake, Powered by Apache Arrow


Introduction

Many of our largest customers run Tellius’s AI-powered insights, natural language searches, and data prep on top of Snowflake’s cloud data warehouse every day. Tellius ships with 30+ out-of-the-box data connectors, including connectors to other cloud data warehouses such as Redshift and BigQuery, and Snowflake is among the most popular. Why? In a previous post we highlighted four key reasons Tellius excels on Snowflake, such as the ability to push down Tellius search and visualization queries directly to the underlying Snowflake database without data movement or copy. Today, we’d like to highlight a fifth reason to love Tellius on Snowflake: lightning-fast data transfer via Apache Arrow, a rising open-source technology that is downloaded over 20 million times per month.

In this post, we’ll discuss our integration with Apache Arrow and how it makes loading data from Snowflake into Tellius up to 3x faster than in earlier versions. At the end, we hope you’ll take Tellius for a spin yourself to see how quickly you can start adding AI-powered analytics value to your business. Ready to get going?

 


TLDR: Tellius and Snowflake pair nicely 

Tellius Snowflake Connector

In our most recent release, we made major improvements to how data is brought from Snowflake into Tellius, improving both the connectivity and the scalability of the connector.

The Tellius Snowflake Connector uses the Spark Snowflake Connector to load data from Snowflake into the Tellius In-Memory Compute Engine (ICE). ICE allows users to automate complex analytical processes and find the key drivers behind business data at scale. Using the Spark Snowflake Connector lets Tellius load data into ICE in a scalable manner, so customers can bring terabytes of data from Snowflake into Tellius.
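As a rough illustration of what a load through the Spark Snowflake Connector can look like, here is a minimal PySpark-style sketch. The source name `net.snowflake.spark.snowflake` and the `sf*` and `dbtable` options come from the connector's documentation. The helper functions, account URL, credentials, and table name are hypothetical placeholders, and this is not Tellius's actual implementation.

```python
# Sketch only: helper names and all connection values are placeholders.
SNOWFLAKE_SOURCE = "net.snowflake.spark.snowflake"


def snowflake_options(url, user, password, database, schema, warehouse):
    """Build the option map the Spark Snowflake Connector expects."""
    return {
        "sfURL": url,
        "sfUser": user,
        "sfPassword": password,
        "sfDatabase": database,
        "sfSchema": schema,
        "sfWarehouse": warehouse,
    }


def load_table(spark, options, table):
    """Read one Snowflake table into a distributed Spark DataFrame.

    `spark` is an existing SparkSession with the connector on its classpath.
    """
    return (
        spark.read.format(SNOWFLAKE_SOURCE)
        .options(**options)
        .option("dbtable", table)
        .load()
    )


opts = snowflake_options(
    url="myaccount.snowflakecomputing.com",  # placeholder account URL
    user="ANALYST",                          # placeholder user
    password="change-me",                    # placeholder secret
    database="SALES_DB",
    schema="PUBLIC",
    warehouse="COMPUTE_WH",
)
# With a live session: df = load_table(spark, opts, "ORDERS")
```

Because Spark splits the read across executors, the same call pattern applies whether the table holds megabytes or terabytes.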

Previous Limitations of the Spark Snowflake Connector

Earlier versions of Tellius used Snowflake connector version 2.5.x, which had the following limitations:

  • Before loading large datasets into Tellius, Snowflake first dumped the data to a staging area in CSV or JSON format. Writing to the staging area required write access to the underlying database, which customers were not comfortable granting. Even though Tellius does not write to the source system by default, we needed write access because of this connector behavior.
  • CSV and JSON are not ideal formats for loading data because they inflate it to a size much larger than the original dataset. This increased load times in Tellius.
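To make the size inflation of text formats concrete, the following sketch compares one synthetic numeric column encoded as CSV text against the same values packed as fixed-width binary doubles. The data is made up for illustration and the ratio is not a measurement of Snowflake or Tellius; it only shows why text encodings tend to be larger.

```python
import array
import csv
import io

# 10,000 synthetic double-precision values standing in for one numeric column.
values = [i / 7 for i in range(1, 10001)]

# Text encoding: every value is written out as decimal digits, one per row.
buffer = io.StringIO()
writer = csv.writer(buffer)
for v in values:
    writer.writerow([v])
csv_bytes = len(buffer.getvalue().encode("utf-8"))

# Binary columnar encoding: a fixed 8 bytes per double, no delimiters.
binary_bytes = len(array.array("d", values).tobytes())

print(f"CSV: {csv_bytes} bytes, binary: {binary_bytes} bytes")
print(f"CSV is {csv_bytes / binary_bytes:.1f}x larger")
```

The text form also has to be parsed back into numbers on the receiving side, which adds CPU cost on top of the extra bytes.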

Apache Arrow

Apache Arrow is a cross-language, cross-platform, columnar in-memory data format that allows zero-copy reads for lightning-fast data access without serialization overhead. The format allows efficient sharing of data between two large-scale systems and is organized for efficient analytic operations on modern hardware such as CPUs and GPUs. More information about Apache Arrow is available at:

https://arrow.apache.org/
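The two ideas behind Arrow, columnar layout and zero-copy reads, can be sketched with nothing but the Python standard library. This is not the Arrow API; `array` and `memoryview` stand in for Arrow's typed buffers, and the column names are invented for the example.

```python
import array

# Column-oriented layout: each column is one contiguous typed buffer.
prices = array.array("d", [10.5, 20.0, 30.25, 40.0, 50.75])
quantities = array.array("q", [1, 2, 3, 4, 5])

# A memoryview exposes the same bytes without copying them.
price_view = memoryview(prices)
window = price_view[1:4]  # still no copy, just an offset and a length

# Writing through the original buffer is visible through the view,
# which shows that both names refer to the same memory.
prices[2] = 99.0
shared = window[1] == 99.0

# An analytic operation scans one column without touching the others.
total = sum(window)
print(shared, total)
```

When two systems agree on the buffer layout in this way, one can hand data to the other without converting it to an intermediate format first.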

Adoption of Apache Arrow in Snowflake

Starting with Spark Connector version 2.6.0, Snowflake adopted Apache Arrow as its standard data exchange format. This change makes it possible to read data from Snowflake without a staging area, and far more efficiently than with CSV or JSON.

https://www.snowflake.com/blog/snowflake-connector-for-spark-version-2-6-turbocharges-reads-with-apache-arrow/
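Since Arrow-based reads depend on the connector version, a deployment script might want to check the version before relying on them. The helper below is a hypothetical sketch: the 2.6.0 threshold comes from this post, while the function names and the handling of suffixed version strings such as `2.6.0-spark_2.4` are assumptions.

```python
# First connector release with Arrow-based reads, per this post.
ARROW_MIN_VERSION = (2, 6, 0)


def parse_version(version):
    """Turn '2.6.0' or '2.6.0-spark_2.4' into a comparable tuple of ints."""
    core = version.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def supports_arrow(version):
    """Return True if this connector version can read via Arrow."""
    return parse_version(version) >= ARROW_MIN_VERSION


for candidate in ["2.5.9", "2.6.0", "2.10.1"]:
    print(candidate, supports_arrow(candidate))
```

Comparing tuples of integers rather than raw strings avoids the classic mistake of ranking "2.10.1" below "2.6.0".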

Bringing Apache Arrow Improvements to Tellius

Starting with Tellius 3.0, Tellius uses the Arrow format as the standard way of caching data in ICE. In addition, the underlying Snowflake connector has been upgraded to the latest version to take advantage of all the new improvements.

This upgrade has improved the Tellius user experience, as customers no longer need to grant write access to their underlying system. And with Apache Arrow support in both Snowflake and the Tellius ICE engine, loads are up to 3x faster than in earlier versions.

Ready to take Tellius for a spin yourself? Try a 14-day free trial to experience the power of AI-powered analytics in your business.
