24th August, 2019
  • Category: D365 Customer Engagement

Processing billions of records using Azure Data Factory, Data Lake and Databricks

Introduction:

In this blog, we will learn how to process billions of records using Azure Data Factory, Azure Data Lake and Azure Databricks.

Each step mentioned here will be explained in detail in subsequent articles.

Problem Statement:

We have billions of audit records in Azure Table Storage and want to generate a high-level report so that the management team can make informed decisions.

We tried different approaches to solve the problem.

Attempt with Approach 1:

Export the Azure Table Storage data to a CSV file and build the report in Excel.
Limitation:
An Excel worksheet is capped at 1,048,576 rows, far too few for billions of records.

Attempt with Approach 2:

Migrate the data to SQL Server and process the records there.
Limitation:
SQL Server could not handle processing at this scale; even a simple query took 2.5 to 3 hours to execute.

Attempt with Approach 3:

Connect Power BI directly to Azure Table Storage.
Limitation:
Power BI does not support DirectQuery on Azure Table Storage, so it tried to load the complete dataset onto the local machine, which is not feasible in practice.

Finally, we found a solution: process the billions of records using Azure Data Lake, Azure Data Factory and Azure Databricks.

Below is a high-level diagram depicting the approach to meeting this critical business requirement.

Let’s walk through the step-by-step process of configuring the various services and facilitating communication between the different resources to achieve the desired business solution.

Steps:

1. Create a resource group.
2. Create an Azure Data Lake account.
3. Create an Azure Data Factory.
4. Transfer the data from Table Storage to Azure Data Lake using Azure Data Factory.
5. Create an Azure AD application for Azure Databricks.
6. Assign the Contributor and Storage Blob Data Contributor roles to the registered Azure AD application at the subscription level.
7. Create an Azure Databricks service.
8. Connect Azure Data Lake to Azure Databricks using a notebook (a mount sketch follows this list).
9. Connect Power BI to Azure Databricks for better visualization.
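
As a preview of step 8, here is a minimal notebook sketch of the mount. The storage account, container, tenant ID and the secret scope name (audit-scope) are placeholder assumptions, not values from our actual setup; the pattern itself is the standard Databricks OAuth mount for Data Lake Storage Gen2, using the Azure AD application created in step 5.

    # Databricks notebook cell: mount the Data Lake file system into DBFS.
    # All names below are placeholders -- substitute your own storage account,
    # container, tenant ID and the app registration from step 5.
    configs = {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        # Application (client) ID of the Azure AD app registered in step 5
        "fs.azure.account.oauth2.client.id": "<application-id>",
        # Client secret, kept in a Databricks secret scope rather than in the notebook
        "fs.azure.account.oauth2.client.secret":
            dbutils.secrets.get(scope="audit-scope", key="client-secret"),
        "fs.azure.account.oauth2.client.endpoint":
            "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
    }

    # Mount the container that Azure Data Factory copied the audit data into (step 4)
    dbutils.fs.mount(
        source="abfss://auditdata@<storage-account>.dfs.core.windows.net/",
        mount_point="/mnt/auditdata",
        extra_configs=configs,
    )

Keeping the client secret in a secret scope, rather than pasting it into the notebook, matters once the workspace is shared with the wider team.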

This approach works for other sources as well, for example SQL Database, Cosmos DB and CSV files; once the data lands in the lake, the same Databricks processing applies, as sketched below.
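
For step 9's input, here is a minimal PySpark sketch of the kind of aggregation that reduces billions of audit rows to a small summary table Power BI can query. The mount path, the column names (EntityName, Operation) and the output table name are illustrative assumptions, not our actual audit schema.

    # Databricks notebook cell: aggregate the audit rows into a high-level report.
    # Column names are illustrative -- adjust them to your own audit schema.
    from pyspark.sql import functions as F

    # Read the files Data Factory copied into the lake (CSV here; Parquet works
    # the same way via spark.read.parquet)
    audit = (spark.read
             .option("header", "true")
             .csv("/mnt/auditdata/"))

    # High-level summary: operation counts per entity
    summary = (audit
               .groupBy("EntityName", "Operation")
               .agg(F.count("*").alias("RecordCount")))

    # Persist as a table so Power BI can query it through the Databricks connector
    summary.write.mode("overwrite").saveAsTable("audit_summary")

Because Spark distributes both the read and the groupBy across the cluster, the same job scales whether the files originally came from Table Storage, SQL Database or plain CSV exports.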

In a series of blogs, we will see how each of the above steps can be configured. The next blog will start by explaining how a resource group can be created.
