Recently I had a requirement to load streaming JSON data to provide a data feed for near real-time reporting. The solution streamed data into an “Ingress” Azure Event Hub, shred the JSON via Azure Stream Analytics and then push subsections of data as micro-batches (1 sec) into a “Egress” Azure Event Hub (for loading into a stage table in Azure SQL DW).
In Event Hubs and Stream Analytics there are only a few performance levers to help tune a solution like this, or said another way, doing nothing with these levers can affect your ongoing performance!
So this blog is to show the performance differences when using different Azure Event Hub partition configurations and the Azure Stream Analytics PARTITION BY clause.
Extending on my previous post about redirecting SQL CDC changes to Azure Event Hub, I have had a few people ask for details/options to stream SQL data into the Power BI API.
Specifically – they were looking for an easy method to leverage the ADD ROWS functionality of the Power BI API so they could push real-time data into a Power BI service dataset.
This method provides the ability to update the Power BI Dataset with new rows every few seconds, instead of a Power BI report having to either use Direct Connect or Scheduled data refresh capability which can be very limiting.
If interested in how the SQL CDC and Event Hubs work together, then read here from my previous post – https://mrfoxsql.wordpress.com/2017/07/12/streaming-etl-send-sql-change-data-capture-cdc-to-azure-event-hub/
The purpose of this post is to quickly show how to extend and explore pushing new SQL data rows via Azure Stream Analytics into Power BI.
And so, lets get into some CDC to Power BI streaming action!
I had a recent requirement to capture and stream real-time data changes on several SQL database tables from an on-prem SQL Server to Azure for downstream processing.
Specifically we needed to create a streaming ETL solution that …
- Captured intermediate DML operations on tables in an on-prem SQL database
- Transmit data securely and real-time into Azure
- Store the delta changes as TXT files in Azure Data Lake Store (ADLS)
- Visualise the real-time change telemetry on a Power BI dashboard (specifically the number of Inserts, Updates, Deletes over time).
The first part was easy; SQL has a feature called Change Data Capture (CDC) which does an amazing job of tracking DML changes to seperate system tables. If you dont know about CDC then see here – https://docs.microsoft.com/en-us/sql/relational-databases/track-changes/about-change-data-capture-sql-server
The second part wasn’t easy, and after some searching I came across this blog post by Spyros Sakellariadis which gave me inspiration and starter code for my streaming ETL solution. Excellent post. See here – https://azure.microsoft.com/en-us/resources/samples/event-hubs-dotnet-import-from-sql/
And so, the final architecture looks something like this…
The solution picks up the SQL data changes from the CDC Change Tracking system tables, creates JSON messages from the change rows, and then posts the message to an Azure Event Hub. Once landed in the Event Hub an Azure Stream Analytics (ASA) Job distributes the changes into the multiple outputs.
What I found pretty cool was that I could transmit SQL delta changes from source to target in as little as 5 seconds end to end!
And so, lets get into some CDC to Event Hub data streaming action!
Recently I did a presentation at our local SQL Server User Group (SSUG) on Managing Streaming Data Pipelines Using Azure Data Services and as such wanted to build a compelling Azure demo that worked with simple streaming data which under certain event conditions would trigger an outbound phone call.
If interested the presentation deck is here – SSUG Melbourne – Building Streaming Data Pipelines Using Azure Cloud Services
The solution had several key components and stages outlined in the architecture below.
- A mobile phone app which generates JSON events with the X, Y, Z location of the device and G (g-force) detected in the device during movement.
- An Azure IoT Hub (AIH) which accepts the JSON events posted from the mobile device
- An Azure Stream Analytics (ASA) job that queries the Event Hub and routes the event data to several outputs, including…
- Azure Blob Storage (for archive of all event data)
- Power BI (for a live dashboard of all event data)
- Azure SQL Database (ASDB) (for tabular storage of all event data)
- Azure Event Hub + Azure Function (AF) (for queuing events which have a G Force reading greater than 3 and then triggering a phone call back to the original device from which the event originated)
The entire demo solution is actually really interesting (tsk, of course!) – and I will blog about other separate parts of this presentation at some point later. However the part of the demo that received the most interest was the external phone call integration with Azure Functions.
To be clear up front – Azure itself does not have native phone capability – so to make outbound phone calls I leverage an external “Twilio” API from within an Azure Function, and “Twilio” connects the outbound call.
And so, lets see the Twilio phone setup and c# Function code in action!