Posts

How to Set Up an SFTP Server and Seamlessly Connect It to Azure Data Factory

Introduction: This blog walks through setting up an SFTP server and then connecting to it through a linked service in Azure Data Factory. Prerequisites: 1. You have already created a Linux machine from the Azure Portal. 2. You have already created a data factory. Important note: Throughout the demonstration of setting up the SFTP server and connecting it to ADF, port 22 must be allowed as an inbound rule in the NSG configured for the Linux VM. The flow of the blog is as follows: we will first set up a new user named 'sftpuser' and restrict its access permissions so that it is the only user that can authenticate to the SFTP server. Then we will connect to the SFTP server via an ADF linked service. Unless explicitly stated otherwise in a step, all commands are to be executed by the user account created during the Linux machine deployment. Step-by-step process: 1. Log in to your virtual machine. I am using Git Bash; you can use any external tool as well, such as PuTTY. The below command...

How to Connect a Data Exfiltration Protection (DEP)-Enabled Synapse Workspace to Azure Cosmos DB (for NoSQL)

This blog is going to be an interesting one: we’re diving into how to connect to a NoSQL resource from Azure Synapse pipelines using a Managed Virtual Network (VNet), and how to securely access Azure Cosmos DB for NoSQL via a private endpoint. What makes this post exciting is that we’re not just talking about assigning the Synapse workspace’s managed identity to Cosmos DB with Contributor or Reader access and calling it a day. Nope, we’re going deeper! We’ll explore: What is Data Exfiltration Protection (DEP)? What’s really happening behind the scenes? The difference between the control plane and data plane. And how all of this works in Azure Data Factory (ADF). Alright, let’s get started! Here’s a simple diagram showing the overall architecture of a DEP-enabled Synapse workspace. In this setup, pipelines are created within Synapse, and the linked service connects to Azure Cosmos DB for NoSQL using a Managed VNet Integration Runtime via a priv...

How to Configure Email Notifications For ADF Pipeline Runs Using Logic Apps

Introduction: This blog post demonstrates how you can configure an alert system for your ADF pipelines without relying on the built-in alert rule system provided by Azure Data Factory. But you may wonder: why do I need to do that when I already have a built-in alert rule system managed by Azure Monitor? The answer is simple. Manually creating an alert system within the pipeline can help in the following situations: 1. Let's say you are part of an email distribution group that gets an alert when a pipeline fails. If, due to unusual circumstances, a pipeline failure at some point doesn't trigger an alert email, it becomes a critical issue, as some pipelines may belong to the production environment and a missed failure could impact the business. By sending custom alert emails from the ADF pipeline via a Logic App, you can isolate whether the problem is that the distribution group is not receiving mail from the SMTP server or that the alert rule in ADF is at fault. 2. To send an aler...

The Journey of a Web Request: Unraveling Multi-Tier Architecture and Cloud Security

Introduction: Imagine you have a company with a website, say XYZ.com, where you sell grocery items online. When a client (let’s call them Client A) tries to access your website using a web browser, they type the website address (URL). The Domain Name System (DNS) resolves this hostname (like www.xyz.com) into an IP address. Once the IP address is obtained, the browser sends an HTTP request to the server. Now, what exactly is a server? Every webpage you see is essentially a static file provided by a server to your web browser. A server is nothing but a computer that is ready to serve you the files or anything else it is configured to serve. For example, even something as simple as searching for www.xyz.com triggers an HTTP request to the server. The server then responds by sending back the requested webpage, typically the homepage, to the browser. While this process may seem straightforward, a lot of complex operations happen behind the scenes every time you see such simpl...

Unlocking Change Data Capture in ADF: New Feature Explained

Change Data Capture (CDC) in Azure Data Factory. Introduction: This blog post introduces an exciting new feature in Azure Data Factory (ADF), Change Data Capture (CDC), a long-awaited capability that significantly enhances real-time data integration scenarios. While ADF has previously supported CDC through Data Flows, this newly introduced UI-based CDC feature brings a more streamlined and user-friendly experience. However, it's important to note that there are key differences between the two implementations: CDC in Data Flows follows a pattern of performing a full initial load, followed by incremental updates, allowing users to establish a historical baseline before applying changes. In contrast, the CDC feature in the ADF UI is designed specifically for incremental loads only, and it offers near real-time synchronization by pushing changes directly to the sink as they occur. Additionally, with Data Flows, changes are applied based on checkpoint-based pipeline ...