Looping through data using PySpark notebook in Fabric
Continuing my blog series on what I’m learning with notebooks and PySpark, today I’m going to explain how I found a way to loop through data in a notebook. In this example, I’m going to show you how I loop through a range of dates, which can then…
Using Sempy to Authenticate to Fabric/Power BI APIs using Service Principal and Azure Key Vault
I have been doing a fair amount of work lately with Fabric Notebooks. When I authenticate using a Service Principal, I always want to make sure it is as secure as possible. To do this, I have found that I can use Azure Key Vault and Azure Identity to authenticate successfully. By using…
How to add current DateTime to existing PySpark data frame in a Fabric Notebook
In the blog post below, I am going to describe how to add the current date and time to your existing Spark data frame. This is really useful when I am inserting data into a Fabric Lakehouse table and I want to know when the data got…
How to get Sempy (Semantic-link) to run when being triggered from a data pipeline which runs a Notebook in Fabric
I got an error when a notebook I was running via a data pipeline failed. Below are the steps to get this working. This was the error message I got: Notebook execution failed at Notebook service with http status code – ‘200’, please check the Run logs on Notebook, additional details –…
Renaming multiple Column Names in a single step using a PySpark Notebook
Following on from my previous blog post, in this blog post I’m going to demonstrate how to bulk rename column names in a single step instead of having to rename them individually. This came about because I had a set of data where the column names contained square brackets, which I wanted to remove. As shown below…
Microsoft Fabric – Comparing Dataflow Gen2 vs Notebook on Costs and usability
In this blog post I am going to compare Dataflow Gen2 vs Notebook in terms of how much the workload costs. I will also compare usability, as Dataflow Gen2 currently has a lot of built-in features which make it easier to use. The goal of this blog post is to understand which in my opinion…
Microsoft Fabric – Notebook session usage explained (And how to save CU’s or billed time)
I was working on a blog post to determine which consumed fewer Fabric Capacity Units (CU’s), and when I was initially testing I was getting some unexpected results. In a future blog post I will compare a Dataflow Gen2 and a Notebook to see which one consumes fewer CU’s. In this blog post I’m going to explain the. Lessons are learned when…
An easy way to transform/clean your data using a Notebook in Microsoft Fabric
In this blog post I am going to show you an easy way to clean your data (which often means fixing data issues or misspelt data) using the new feature Launch Data Wrangler using DataFrames. I had previously blogged about using Pandas data frames, but this required extra steps and details. If you are interested in that blog post you…