Much has been written about the Modern Data Stack recently. Oftentimes, the focus is centered on the innovations around the movement, transformation, and governance of data. But when it comes to the analytics layer, articles end at the traditional endpoints where data is consumed, whether it be through dashboards, SQL query, or the building of data science models. The truth is that data-driven organizations should not be content with the same old approaches to data consumption. After all, tectonic shifts in technology usually revolutionize the experience for end-users in profound ways.
For example, technological shifts such as the Internet didn’t just improve the quality of phone calls and improve the resolution of your TV picture, but enabled complete shifts in digital social interactions and put every song and movie in everyone’s hands. The same re-imagination of user experiences should come for consumers of data and analytics, and we’ll break down how we get there.
What is the Modern Data Stack?
A Modern Data Stack is a suite of tools used for gathering, storing, transforming, and analyzing data. Each of these layers plays a key role in your organization’s goals to get better insights from vast amounts of data and to proactively uncover new opportunities for growth. Unlike legacy technologies, you can usually get started very quickly, enjoy a pay-as-you-go pricing model, and avoid being locked in with a single vendor for the entire stack, so mixing and matching best-of-breed tools for your Modern Data Stack is a core tenet.
Think of the MDS as a well-layered trifle dessert for data:
- The bottom layer is your data sources, such as applications like Salesforce and Google Analytics, databases such as Oracle and SQL Server, and files such as spreadsheets or XML.
- The ingestion layer extracts data from the various data sources with automated pipelines, using tools such as Fivetran, Stitch, or Segment, allowing your team to work with the freshest data possible.
- The storage layer includes cloud data warehouses such as Snowflake and Amazon Redshift, and data lakes such as Databricks and Amazon S3.
- The transformation layer cleans up raw data to facilitate subsequent analysis. Example tools include dbt (data build tool), a SQL-based command-line tool that allows data analysts and engineers to transform data, and Matillion, a data integration and transformation solution purpose-built for the cloud and cloud data warehouses.
- The operations layer includes tools such as Apache Airflow, an open-source workflow management platform for data engineering pipelines, as well as Atlan, which connects to your storage layer to help your data teams democratize both internal and external data while automating repetitive tasks.
- The icing on top is the analytics layer. This layer includes dashboard tools such as Looker, as well as SQL query, machine learning modeling tools such as Dataiku, plus a new form of augmented analytics that we call decision intelligence.
When it comes to the analytics layer, the typical tools people think of are dashboards for business users monitoring KPIs, SQL query for analysts to dig deeper, and ML modeling for expert data scientists. These techniques have been with us for decades and reinforce the traditional analytics process, where businesses wait on data teams to work through their backlog in order to answer important, and oftentimes new, business questions. If organizations are going to take a fresh, modern approach to their data stack, they should also update the analytics experience for their users. At Tellius, we call this new approach Decision Intelligence, or Augmented Analytics 3.0, which combines business intelligence with AI and machine learning to help organizations get faster insights from their modern data stack. Let’s dive deeper into the four essential pieces for modernizing the analytics layer of the Modern Data Stack.
Automated Generation of Insights
With so much data and compute available in the cloud, it should go without saying that all that power should be utilized by automation to simplify and speed analysis and make it easier for technical and non-technical people to get answers from all the data. While manual querying of data will always be an essential tool for analyst teams, automated generation of insights empowers more people with an easier way to obtain important findings and in a much faster way.
Automation solves a problem that many organizations have, even those who do not think they have “big data.” Consider a dataset of just twenty columns or variables. To analyze up to four variables at a time and find the combinations that correlate with a target metric, there are more than 6,000 combinations you would have to visualize or evaluate. A specific example would be an ecommerce brand discovering that sneaker sales spiked for a specific brand, in a group of zip codes, in a given color, and for a specific customer age group. This type of insight may lead to new targeted campaigns or follow-up actions that would never be possible without understanding deep, granular patterns and relationships in the data.
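The arithmetic behind that “more than 6,000” figure is simple to verify: count every way to choose one, two, three, or four of the twenty columns.

```python
from math import comb

# Number of ways to choose up to 4 of 20 columns to test against a target metric
columns = 20
total = sum(comb(columns, k) for k in range(1, 5))
print(total)  # 6195 combinations to evaluate by hand
```

At 21 or 30 columns the count jumps to roughly 7,500 and 31,900 respectively, which is why exhaustive manual analysis stops scaling almost immediately.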
With an automated process, it becomes easier to analyze all possible combinations of data instead of forming individual hypotheses and testing them by creating SQL query after SQL query to look across the data. In addition, you would be able to discover unknown “unknowns” you may not have thought of otherwise. Then, the system would be able to proactively push insights to you that you’re most interested in because it learns what data and metrics are important to you and your business. This augmentation represents the future of how analysts will get answers easier, iterate on insights discovery much faster, and even involve business users in the process and take organizations beyond dashboards.
Natural Language Search
I know how to code, but I don’t consider myself a programmer, and I know there is a time and place for applying code to data analysis. Still, I much prefer the modern experience of a search interface where I can ask questions to get the information I need, instead of writing queries in SQL or Python. That’s where natural language plays a part in the modern analytics layer. With a search interface that supports natural language, users can ask questions, much as they would when speaking to a smart home device, to get answers and visualizations that not only help them understand what is happening in the business, but also identify why metrics change and how to improve business performance through granular recommendations found in the data.
Towards the goal of making data accessible to more people, natural language also plays a part in the narratives and data stories presented alongside the visualizations of an automatically generated insight. These narratives describe the specific findings of interest, helping the user quickly understand the insight without having to interpret the visualization alone.
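As a toy illustration (not how Tellius or any particular product implements it), the core idea of natural language search is to map a question onto a structured query. A minimal pattern-based sketch, with a hypothetical `sales` table:

```python
import re

def question_to_sql(question: str) -> str:
    """Translate a question like 'total <metric> by <dimension>'
    into an aggregate SQL query over an assumed 'sales' table."""
    match = re.match(r"total (\w+) by (\w+)", question.strip().lower())
    if not match:
        raise ValueError("question pattern not recognized")
    metric, dimension = match.groups()
    return f"SELECT {dimension}, SUM({metric}) FROM sales GROUP BY {dimension}"

print(question_to_sql("Total revenue by region"))
# SELECT region, SUM(revenue) FROM sales GROUP BY region
```

Real natural language interfaces go far beyond one regex, handling synonyms, business glossaries, and follow-up questions, but the translation from question to query is the essential step.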
Machine Learning for Non-Data Scientists
Applying machine learning to business data has become a popular path to advanced and predictive analytics for many organizations in recent years. But these capabilities should not just be the playground of PhDs and technical experts. Modern analytics is about upskilling more people in the data ecosystem, empowering a new generation of business analysts and data analysts with the superpowers of data science. This is not only about making ML model building easier with point-and-click automated machine learning, but also about augmenting the process in a few key ways: making model outputs more explainable and transparent through a more visual and exploratory process; automating feature engineering to simplify data prep (often cited by data scientists as the most time-consuming part of working with business data); and integrating models with an easy way for business users to visualize and interact with insights and predictions, strengthening the collaborative and iterative advanced analytics process.
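To make the idea concrete, here is a rough sketch of what point-and-click automated ML tools do behind the scenes: automated preprocessing (imputation, scaling, encoding) plus hyperparameter search, shown here with scikit-learn. All column names and data are illustrative, not taken from any real product or dataset.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical business dataset with a missing value and mixed column types
X = pd.DataFrame({
    "order_value": [10.0, 20.0, None, 40.0, 15.0, 25.0, 30.0, 5.0, 12.0, 22.0, 33.0, 8.0],
    "visits":      [1, 2, 3, 4, 1, 2, 3, 4, 2, 3, 1, 4],
    "region":      ["east", "west"] * 6,
    "device":      ["mobile", "desktop"] * 6,
})
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

# Automated feature engineering: impute and scale numeric columns,
# one-hot encode categorical columns
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer()), ("scale", StandardScaler())]),
     ["order_value", "visits"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["region", "device"]),
])

model = Pipeline([("prep", preprocess), ("clf", LogisticRegression(max_iter=1000))])

# Automated tuning: search several hyperparameters instead of hand-picking one
search = GridSearchCV(model, {"clf__C": [0.1, 1.0, 10.0]}, cv=3).fit(X, y)
print(search.best_params_)
```

An augmented analytics platform wraps these same mechanics in a visual interface, then layers on the explainability and business-user collaboration described above.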
Leverage Data Warehouse / Data Lake for Analytics
The last piece of the puzzle is for the analytics layer that sits on top of the modern data stack to leverage the compute power of the storage layer. Proponents of the modern data stack usually prefer that data is not moved to the analytics layer and that all processing takes place in the data warehouse or the data lake (or lakehouse). In this model, the analytics layer acts as the interface for analytics users (whether they are business users, analysts, or data scientists), and any queries or machine learning jobs are pushed down to the storage layer for compute. The response is returned to the analytics layer for users to consume, interpret, and act on. A modern analytics layer such as Tellius gives organizations the flexibility to live query a data warehouse, perform live advanced analytics and machine learning with a data lake, and ingest data for internal processing when the situation calls for it.
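A minimal sketch of the pushdown pattern: the analytics layer ships SQL to the storage engine and receives only the small aggregated result, rather than pulling raw rows out for local processing. Here Python’s built-in sqlite3 stands in for a cloud warehouse, and the table and rows are illustrative.

```python
import sqlite3

# sqlite3 stands in for a cloud data warehouse in this sketch
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (region TEXT, amount REAL)")
warehouse.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 150.0), ("west", 200.0)],
)

def run_pushdown_query(conn, sql):
    """Analytics layer: delegate compute to the storage engine,
    return only the summarized result."""
    return conn.execute(sql).fetchall()

# The GROUP BY runs inside the warehouse; only 2 summary rows cross the wire
result = run_pushdown_query(
    warehouse,
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region",
)
print(result)  # [('east', 250.0), ('west', 200.0)]
```

With millions of raw rows, the difference between transferring two summary rows and transferring the whole table is exactly the efficiency argument for keeping compute in the storage layer.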
In closing, the Modern Data Stack is a great innovation with a bright future, but the benefits shouldn’t just be about the underlying data plumbing. Organizations should also examine how the modern data stack can offer a new experience that takes insight-driven organizations to the next level, making it easier to make critical decisions faster and with more confidence than ever before.
To learn more about the Modern Data Stack, check out our recent webinars “3 Must-Haves of a Modern Data Stack” and “Accelerating Insights & Analysis from a Modern Data Stack.”