Blog: Azure Cognitive Services APIs with SQL Server Integration Services (SSIS) Packages
I had a recent requirement to integrate multi-language support into a SQL DW via a SQL SSIS ETL solution. Specifically, the SQL DW platform supported only English translation data for all Dimension tables, but the business was expanding internationally, so there was a need to include other language translations of the Dimensional attributes.
We wanted to do this without having to manually translate English text attributes that already exist, or new ones that are added or modified over time. We wanted an automated method that simply “worked”.
Enter Azure Cognitive Services Translator Text API service!
So the purpose of this blog is to outline the code/pattern we used to integrate the Azure Cognitive Services API into SQL SSIS ETL packages.
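At the heart of the pattern is a single REST call to the Translator Text API v3. The SSIS package wraps that call in its own component, but the request shape is easiest to see in a minimal Python sketch; the endpoint and header names follow the published v3 API, while the subscription key is a placeholder you would source from your own Cognitive Services resource.

```python
import json
import urllib.request
import uuid

# Placeholders -- substitute values from your own Cognitive Services resource.
SUBSCRIPTION_KEY = "<your-translator-key>"
ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"

def build_translate_request(texts, to_langs, from_lang="en"):
    """Build the URL, headers and JSON body for a Translator Text API v3 call."""
    query = "api-version=3.0&from={}&{}".format(
        from_lang, "&".join("to=" + lang for lang in to_langs))
    url = "{}?{}".format(ENDPOINT, query)
    headers = {
        "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
        "Content-Type": "application/json",
        "X-ClientTraceId": str(uuid.uuid4()),
    }
    # The v3 API expects an array of {"Text": ...} objects in the body.
    body = json.dumps([{"Text": t} for t in texts]).encode("utf-8")
    return url, headers, body

def translate(texts, to_langs):
    """POST the request and return the parsed translation response."""
    url, headers, body = build_translate_request(texts, to_langs)
    req = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

One call can fan a single English attribute out to many target languages at once (each `to=` parameter adds a translation to the response), which suits the Dimension-attribute scenario nicely.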
Extending my previous post about redirecting SQL CDC changes to Azure Event Hub, I have had a few people ask for details/options to stream SQL data into the Power BI API.
Specifically – they were looking for an easy method to leverage the ADD ROWS functionality of the Power BI API so they could push real-time data into a Power BI service dataset.
This method can update the Power BI dataset with new rows every few seconds, instead of a Power BI report having to rely on either Direct Connect or scheduled data refresh capability, which can be very limiting.
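The ADD ROWS push described above boils down to a single REST POST against the Power BI API. A minimal Python sketch of that request follows; the dataset GUID, table name and Azure AD bearer token are placeholders, and in the solution covered here Azure Stream Analytics drives this same endpoint for you rather than hand-written code.

```python
import json
import urllib.request

# Placeholders -- substitute real values from your Power BI tenant.
DATASET_ID = "<dataset-guid>"
TABLE_NAME = "RealTimeData"
ACCESS_TOKEN = "<azure-ad-bearer-token>"

def build_add_rows_request(rows, dataset_id=DATASET_ID, table_name=TABLE_NAME):
    """Build URL, headers and body for the Power BI REST API 'Add Rows' call."""
    url = ("https://api.powerbi.com/v1.0/myorg/datasets/{}/tables/{}/rows"
           .format(dataset_id, table_name))
    headers = {
        "Authorization": "Bearer " + ACCESS_TOKEN,
        "Content-Type": "application/json",
    }
    body = json.dumps({"rows": rows}).encode("utf-8")
    return url, headers, body

def add_rows(rows):
    """POST a batch of rows into a push dataset; HTTP 200 indicates success."""
    url, headers, body = build_add_rows_request(rows)
    req = urllib.request.Request(url, data=body, headers=headers, method="POST")
    urllib.request.urlopen(req)
```

Each call appends the batch to the target table of a push dataset, so a dashboard tile pinned to that dataset updates within seconds of the POST.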
If interested in how the SQL CDC and Event Hubs work together, then read here from my previous post – https://mrfoxsql.wordpress.com/2017/07/12/streaming-etl-send-sql-change-data-capture-cdc-to-azure-event-hub/
The purpose of this post is to quickly show how to extend and explore pushing new SQL data rows via Azure Stream Analytics into Power BI.
And so, let’s get into some CDC to Power BI streaming action!
I had a recent requirement to capture and stream real-time data changes on several SQL database tables from an on-prem SQL Server to Azure for downstream processing.
Specifically we needed to create a streaming ETL solution that …
- Captured intermediate DML operations on tables in an on-prem SQL database
- Transmitted the data securely, in real time, into Azure
- Stored the delta changes as TXT files in Azure Data Lake Store (ADLS)
- Visualised the real-time change telemetry on a Power BI dashboard (specifically the number of Inserts, Updates and Deletes over time).
The first part was easy; SQL Server has a feature called Change Data Capture (CDC) which does an amazing job of tracking DML changes to separate system tables. If you don’t know about CDC then see here – https://docs.microsoft.com/en-us/sql/relational-databases/track-changes/about-change-data-capture-sql-server
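For context, once CDC is enabled it generates a table-valued function per capture instance that returns all pending changes between two LSNs, with an `__$operation` code identifying the DML type. A small Python helper sketching the polling query is below; the capture instance name is whatever you passed to `sys.sp_cdc_enable_table`, and `dbo_Customer` in the usage is purely illustrative.

```python
# __$operation codes that CDC writes into its change tables.
CDC_OPERATIONS = {1: "DELETE", 2: "INSERT", 3: "UPDATE_BEFORE", 4: "UPDATE_AFTER"}

def cdc_poll_query(capture_instance):
    """Return the T-SQL that reads all pending changes for one capture instance."""
    return (
        "DECLARE @from_lsn BINARY(10) = sys.fn_cdc_get_min_lsn('{0}');\n"
        "DECLARE @to_lsn   BINARY(10) = sys.fn_cdc_get_max_lsn();\n"
        "SELECT * FROM cdc.fn_cdc_get_all_changes_{0}(@from_lsn, @to_lsn, N'all');"
    ).format(capture_instance)
```

A polling loop would remember the last LSN it processed and substitute it for `sys.fn_cdc_get_min_lsn` on subsequent passes, so each pass only picks up new changes.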
The second part wasn’t easy, and after some searching I came across this blog post by Spyros Sakellariadis which gave me inspiration and starter code for my streaming ETL solution. Excellent post. See here – https://azure.microsoft.com/en-us/resources/samples/event-hubs-dotnet-import-from-sql/
And so, the final architecture looks something like this…
The solution picks up the SQL data changes from the CDC change tracking system tables, creates JSON messages from the change rows, and then posts the messages to an Azure Event Hub. Once landed in the Event Hub, an Azure Stream Analytics (ASA) job distributes the changes into the multiple outputs.
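The change-row-to-JSON-to-Event-Hub step above can be sketched as follows. The message shape here is my own illustration rather than the exact schema from the sample code, and the send function assumes the azure-eventhub Python SDK (the original solution is .NET):

```python
import json
from datetime import datetime, timezone

def cdc_row_to_message(table, change_row):
    """Serialise one CDC change row into a JSON message for Event Hub."""
    ops = {1: "DELETE", 2: "INSERT", 3: "UPDATE_BEFORE", 4: "UPDATE_AFTER"}
    # Strip the CDC metadata columns (__$operation, __$seqval, ...) and keep
    # only the actual table columns as the payload.
    payload = {k: v for k, v in change_row.items() if not k.startswith("__$")}
    return json.dumps({
        "table": table,
        "operation": ops[change_row["__$operation"]],
        "capturedAtUtc": datetime.now(timezone.utc).isoformat(),
        "data": payload,
    })

def send_to_event_hub(messages, conn_str, hub_name):
    """Post a batch of JSON messages (requires: pip install azure-eventhub)."""
    from azure.eventhub import EventData, EventHubProducerClient
    producer = EventHubProducerClient.from_connection_string(
        conn_str, eventhub_name=hub_name)
    batch = producer.create_batch()
    for m in messages:
        batch.add(EventData(m))
    producer.send_batch(batch)
    producer.close()
```

Carrying the operation code through to the message is what later lets the Power BI dashboard count Inserts, Updates and Deletes over time.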
What I found pretty cool was that I could transmit SQL delta changes from source to target in as little as 5 seconds end to end!
And so, let’s get into some CDC to Event Hub data streaming action!
Over the years I have presented many times to various clients describing Business Intelligence (BI) solutions built on the Microsoft BI stack.
In all of my sessions, regardless of the specific focus, I always start with the same two content slides.
- The first is always the positioning graphic from the Gartner Magic Quadrant for Business Intelligence & Analytics. This shows all top BI vendors and how they are positioned against each other, and is republished yearly (typically in Feb). It is often a great introduction as to why Microsoft BI is such a great play. If interested, an introduction to the paper is here, but you can also find the full report on various web sites (https://www.gartner.com/doc/2989518/magic-quadrant-business-intelligence-analytics)
- The second is always a definition of exactly what Business Intelligence (BI) actually is, and this is the purpose of this short blog.
And so, let’s dig into a business definition for Business Intelligence!
So PASS officially kicked off this morning, leading into the next 3 days of back-to-back sessions.
You could certainly tell that the keynote was on… I mean the dining room was pumping…!
Oh that’s right, everyone is at the keynote!
So the keynote session was hosted by Joseph Sirosh, Group Vice President, Data Group.
The big tell for the keynote was undoubtedly SQL Server 2016 CTP3 and just what’s packed to the rafters within the software. If you want to learn more about that then I recommend you step across to this link here http://blogs.technet.com/b/dataplatforminsider/archive/2015/10/28/sql-server-2016-everything-built-in.aspx
Key Takeaways from the Keynote;
- SQL 2016 is a major release that solidifies the Microsoft view of a solid footing in both the On Prem and In Cloud data platform camps.
- “The future is both earth and sky!”
- The release offers much On Prem capability, like Polybase (to APS), R integration (advanced analytics), Always Encrypted, and SSAS/SSRS improvements
- The release also provides the ability to seamlessly integrate from On Prem to Azure Cloud and back, like Polybase (to HDInsight) and Stretch Database – and SQL already has the capability to use Azure VMs for SQL AAG solutions and Azure backups.
- An interesting takeaway – the human genome is approximately 1.5 Gigabytes in size, or about 2 CDs worth of storage space. How small do you feel now?
I then attended 4 sessions, but today there is really only time to blog about this one; for me it was the most impressive in regard to capability and just how far it’s come!
The session was SQL Server in Azure Virtual Machines – Features and Best Practices, presented by Luis Vargas, a Senior Program Manager Lead in the SQL Server team.
It’s time to take a well-deserved half-time break in my 8-part post series on SQL Partitioning, and so I have decided to take a slight “light-hearted” tangent and talk about visualisations, or more specifically Pie Chart visualisations.
In all seriousness, this actually came up as I overheard a conversation at a client site debating the usage of this very visualisation.
Now – If you believe everything you read on the Internet about Pie Charts you may begin to think they are the proverbial trouble-maker of the BI World, but I believe that they deserve a chance to prove themselves!
So to prop up my rickety case, this post will explore Pie Chart visualisation best practices and then compare the default Pie Chart visualisation from 10 industry-leading BI/Reporting tools to see how they stack up against this best practice list.
And so, let’s get into the nitty gritty of creating and eating pie charts!