Lightning-fast Data Transfer Between Tellius and Snowflake, Powered by Apache Arrow

Introduction

Many of our largest customers run Tellius’s AI-powered insights, natural language search, and data prep on top of Snowflake’s cloud data warehouse every day. Among our 30+ out-of-the-box data connectors, which include other cloud data warehouses such as Redshift and BigQuery, Snowflake is extremely popular. Why? In a previous post we highlighted four key reasons Tellius excels on Snowflake, such as the ability to push down Tellius search and visualization queries directly to the underlying Snowflake database without data movement or copy. Today, we’d like to highlight a fifth reason to love Tellius on Snowflake: lightning-fast data transfer via Apache Arrow, a rising open-source technology that is downloaded over 20 million times per month. In this post, we’ll discuss our integration with Apache Arrow and how it makes data loads from Snowflake into Tellius up to 3x faster than in earlier versions. At the end, we hope you’ll take Tellius for a spin yourself to see how quickly you can start adding AI-powered analytics value to your business. Ready to get going?

 

5 Reasons to Love Tellius

TLDR: Tellius and Snowflake pair nicely 

Tellius Snowflake Connector

In our most recent release, we made major improvements to how data is brought from Snowflake into Tellius, improving both the connectivity and the scalability of the connector.

The Tellius Snowflake Connector uses the Spark Snowflake Connector to load data from Snowflake into the Tellius In-Memory Compute Engine (ICE). ICE allows users to automate complex analytical processes and find key drivers behind business data at scale. Building on the Spark Snowflake Connector lets Tellius load data into ICE in a scalable manner, so customers can bring terabytes of data from Snowflake into Tellius.

Previous Limitations of Snowflake Spark Connector

Earlier versions of Tellius used version 2.5.x of the Snowflake connector, which had the following limitations:

  • Before loading large datasets into Tellius, Snowflake first dumped the data to a staging area in CSV or JSON format. Writing to the staging area required write access to the underlying database, which customers were not comfortable granting. Even though Tellius does not write to the source system by default, we needed write access because of this connector behavior.
  • CSV and JSON are not the best formats for loading data, as they inflate the data to a much larger size than the original dataset. This increased load times in Tellius.
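
To make the second limitation concrete, here is a small standard-library sketch that encodes the same column of numbers three ways and compares the byte counts. The column name `amount`, the helper `encoded_sizes`, and the sample values are made up for illustration. The fixed-width binary buffer stands in for a columnar binary format in general and is not Arrow itself.

```python
import array
import csv
import io
import json
import random


def encoded_sizes(values):
    """Return the encoded size in bytes of one float column in three formats."""
    # CSV: a header row plus one text-encoded number per line.
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["amount"])
    writer.writerows([v] for v in values)
    csv_size = len(buf.getvalue().encode("utf-8"))

    # JSON: the column name is repeated in every record.
    json_size = len(json.dumps([{"amount": v} for v in values]).encode("utf-8"))

    # Fixed-width binary: 8 bytes per float64, with no delimiters or key names.
    binary_size = len(array.array("d", values).tobytes())

    return {"csv": csv_size, "json": json_size, "binary": binary_size}


random.seed(0)  # fixed seed so repeated runs compare the same data
sample = [random.random() * 1000 for _ in range(10_000)]
sizes = encoded_sizes(sample)

for name, size in sizes.items():
    print(f"{name:>6}: {size:>8} bytes")
```

For floating-point data like this, the text formats come out noticeably larger than the binary buffer, and every value also has to be parsed back from text on the receiving side.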

Apache Arrow

Apache Arrow is a cross-language, cross-platform, columnar in-memory data format that allows zero-copy reads for lightning-fast data access without serialization overhead. The format enables efficient data sharing between large-scale systems and is organized for efficient analytic operations on modern hardware such as CPUs and GPUs. More information about Apache Arrow is available at:

https://arrow.apache.org/
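
The two ideas in that description, a columnar layout and zero-copy reads, can be illustrated with nothing but the Python standard library. The sketch below does not use Arrow's API. It only shows the underlying concept, and the sample rows and variable names are invented for the example.

```python
from array import array

# Row-oriented data: each record keeps its fields together.
rows = [(1, 10.5), (2, 20.0), (3, 7.25), (4, 3.0)]

# Column-oriented data: one contiguous, typed buffer per column.
ids = array("q", (row[0] for row in rows))      # 64-bit integers
amounts = array("d", (row[1] for row in rows))  # 64-bit floats

# A memoryview exposes the column's buffer without copying it.
amounts_view = memoryview(amounts)

# Slicing a memoryview creates another view over the same memory.
last_two = amounts_view[2:]

# Writing through the original column is visible through the view,
# which shows that no copy was made.
amounts[3] = 99.0
shares_memory = last_two[1] == 99.0

# An aggregate over one column scans a single contiguous buffer.
total = sum(amounts_view)

print(shares_memory, total)
```

Arrow applies the same principle across processes and languages: because producer and consumer agree on the in-memory layout, the consumer can read a column's buffer directly instead of deserializing it.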

Adoption of Apache Arrow in Snowflake

Starting with version 2.6.0 of the Spark connector, Snowflake adopted Apache Arrow as its standard data transfer format. This change makes it possible to read data from Snowflake without a staging area, and far more efficiently than with CSV or JSON.

https://www.snowflake.com/blog/snowflake-connector-for-spark-version-2-6-turbocharges-reads-with-apache-arrow/
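
As a rough sketch of what such a read can look like from PySpark (not Tellius's actual implementation), the helpers below build the connector options and issue the read. The helper names `snowflake_options` and `load_table` and all account values are hypothetical placeholders. The `sf*` option names and `dbtable` are the connector's standard options. `use_copy_unload` is, to our understanding of the connector documentation, the switch between the Arrow path and the older staged COPY UNLOAD path, and it is set explicitly here only to make the intent visible.

```python
SNOWFLAKE_SOURCE = "net.snowflake.spark.snowflake"


def snowflake_options(url, user, password, database, schema, warehouse):
    """Build the option map for the Spark Snowflake connector."""
    return {
        "sfURL": url,
        "sfUser": user,
        "sfPassword": password,
        "sfDatabase": database,
        "sfSchema": schema,
        "sfWarehouse": warehouse,
        # Assumption: "false" keeps the connector on the Arrow result path
        # rather than unloading CSV/JSON files to a stage first.
        "use_copy_unload": "false",
    }


def load_table(spark, options, table):
    """Read one Snowflake table into a Spark DataFrame.

    `spark` is an existing SparkSession with the connector on its classpath.
    """
    return (
        spark.read.format(SNOWFLAKE_SOURCE)
        .options(**options)
        .option("dbtable", table)
        .load()
    )


# Placeholder values; in practice these would come from a secrets store.
options = snowflake_options(
    url="myaccount.snowflakecomputing.com",
    user="ANALYTICS_USER",
    password="********",
    database="SALES",
    schema="PUBLIC",
    warehouse="COMPUTE_WH",
)
```

With a live session, `load_table(spark, options, "ORDERS")` would return a DataFrame for a hypothetical `ORDERS` table. Because this path does not write to a stage, a read-only Snowflake role should be sufficient for it.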

Bringing Apache Arrow Improvements to Tellius

Starting with Tellius 3.0, Tellius uses the Arrow format as the standard for caching in the ICE engine. In addition, the underlying Snowflake connector has been upgraded to the latest version to take advantage of these improvements.

This upgrade improves the Tellius user experience, as customers no longer need to grant write access to their underlying system. Finally, with Apache Arrow support in both Snowflake and the Tellius ICE engine, loads are up to 3x faster than in earlier versions.

Ready to take Tellius for a spin yourself? Try a 14-day free trial to experience the power of AI-powered analytics in your business.
