0

I have simulated devices which is sending messages to IoT Hub blob storage and from there I am copying data(encoded in JSON format) to Azure Data Lake Gen2 by creating a pipeline using Azure Data Factory.

How to convert these json output file to CSV file using Azure function and store it again on Azure Data Lake Store.


@Adam,

Thank You so much for your all the answer and I implemented those successfully in my azure account. But, this does not actually given me the desired output which I was looking for.

Hope this make things clear and my requirement.

My Input file which sent to IOT Hub is:-

enter image description here

Below is the sample records of the data which is stored in IOT Hub endpoints(Blob Storage):- (Json - Set of Objects):-

{"EnqueuedTimeUtc":"2019-08-06T10:46:39.4390000Z","Properties":{"$.cdid":"Simulated-File"},"SystemProperties":{"messageId":"d48413d2-d4d7-41bb-9470-dc0483466253","correlationId":"a3062fcb-5513-4c09-882e-8e642f8fe38e","connectionDeviceId":"Simulated-File","connectionAuthMethod":"{\"scope\":\"device\",\"type\":\"sas\",\"issuer\":\"iothub\",\"acceptingIpFilterRule\":null}","connectionDeviceGenerationId":"637001643970703748","contentType":"UTF-8","enqueuedTime":"2019-08-06T10:46:39.4390000Z"},"Body":"eyIiOiI1OCIsInJvdGF0ZSI6IjQ2Mi4wMjQxODE3IiwiZGF0ZXRpbWUiOiIxLzMvMjAxNSAxNjowMCIsIm1hY2hpbmVJRCI6IjEiLCJ2b2x0IjoiMTU2Ljk1MzI0NTkiLCJwcmVzc3VyZSI6IjEwNi4zNDY3MTc5IiwidmlicmF0aW9uIjoiNDguODIwMzAwODYifQ=="}

{"EnqueuedTimeUtc":"2019-08-06T10:46:40.5040000Z","Properties":{"$.cdid":"Simulated-File"},"SystemProperties":{"messageId":"9da638d9-fdba-41d3-86df-3ea6cedc44e7","correlationId":"aeb20305-6fee-4a59-9053-5fa1d0c780a9","connectionDeviceId":"Simulated-File","connectionAuthMethod":"{\"scope\":\"device\",\"type\":\"sas\",\"issuer\":\"iothub\",\"acceptingIpFilterRule\":null}","connectionDeviceGenerationId":"637001643970703748","contentType":"UTF-8","enqueuedTime":"2019-08-06T10:46:40.5040000Z"},"Body":"eyIiOiI1OSIsInJvdGF0ZSI6IjQyOS44MjIxNDM1IiwiZGF0ZXRpbWUiOiIxLzMvMjAxNSAxNzowMCIsIm1hY2hpbmVJRCI6IjEiLCJ2b2x0IjoiMTY0LjE0ODA5NDYiLCJwcmVzc3VyZSI6IjEwNC41MzIxMjM2IiwidmlicmF0aW9uIjoiNDMuNzg4NjgxNTUifQ=="}

**The Json "Body" field contains the actual IOT device data, which is encoded in JSON format with some system and message properties.

**By creating pipeline to JSON to CSV, does not extract the actual data to Data Lake Store.

Output CSV is exact copy of the JSON file(real data is not extracted)after running the ADF pipeline.

[![enter image description here][2]][2]

1: https://i.stack.imgur.com/iCBvi.png

[2]: https://i.stack.imgur.com/HaBMn.png

I have tried with below C# code and successfully parse the JSON, but cannot read the data from Json "body"field. Assist me on this.

#r "Microsoft.WindowsAzure.Storage"
#r "System.Linq"
#r "Newtonsoft.Json"
#r "Microsoft.Azure.WebJobs.Extensions.Storage"
#r "Microsoft.Azure.WebJobs"
#r "Microsoft.Azure.WebJobs.Extensions"
using Newtonsoft.Json;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Host;
using Microsoft.Azure.WebJobs.Extensions;
using Microsoft.WindowsAzure.Storage.Blob;
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Security.Authentication;
using System.Text;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
public class Telemetry
{
    public string num { get; set; }
    public string datetime { get; set; }
    public string machineID { get; set; }
    public string volt { get; set; }
    public string rotate { get; set; }
    public string pressure { get; set; }
    public string vibration { get; set; }
}
public static void Run(
    [BlobTrigger("blobcontainer/{name}")] String myBlob, 
    //[Blob("bloboutput/{name}"), FileAccess.Write] Stream outputBlob,
    string name, ILogger log)
{
    log.LogInformation($"C# Blob trigger function Processed blob\n Name:{name} \n Size: {myBlob.Length} Bytes");
var serializer = new JsonSerializer();
using (var sr = new StreamReader(myBlob))
    using (var jsonTextReader = new JsonTextReader(sr))
    {
        var Results = (Telemetry)serializer.Deserialize(jsonTextReader, typeof(Telemetry));
        // Do something with Results.
    }
}
  • you need to use an integration account, put a schema and a map in it for your json to csv transform, and then use a logic app to actually execute that transformation – Alex Gordon Aug 13 '19 at 19:17

1 Answers1

1

I would really not save them into blob but just leave them in IoT hub and make functions trigger of that. But if you really it need from blob then.

Make function trigger like this

[FunctionName("BlobTriggerCSharp")]        
public static void Run(
    [BlobTrigger("input/{name}")] Stream myBlob, 
    [Blob("output/{name}"), FileAccess.Write] Stream outputBlob
    string name, ILogger log)
{
    // parse JSON with JsonConvert 
    // parse using base64 decode from SystemProperties.contentType field
    // write parsed output to outputBlob stream
}

Then in ADF

Placed this in demo container as file called demo.json on blob storage.

I created source dataset of type JSON in ADF

enter image description here

which looks like this

enter image description here

You can test if it works by pressing preview data.

enter image description here

And a sink dataset as follows

enter image description here

And then I simply made copy activity.

enter image description here

enter image description here

enter image description here

And after running it I got JSON transformed to CSV

You can see it in blob

enter image description here

enter image description here

If you want to learn more then check my video for

  1. Data Factory introduction https://youtu.be/EpDkxTHAhOs and,
  2. Function App introduction https://youtu.be/Vxf-rOEO1q4
Adam Marczak
  • 2,257
  • 9
  • 20
  • Could you please elaborate your solution. – Lalatendu Mohanty Aug 14 '19 at 07:00
  • If you have any web links available to be followed to execute the process will be more appreciated. The exact sensor data is getting copied to data Lake store, but I want to extract the data without using any JSON extractor. – Lalatendu Mohanty Aug 14 '19 at 07:25
  • Below is sample record(sensor data) which is present in Data Lake Blob Storage {"EnqueuedTimeUtc":"2019-08-06T10:46:39.4390000Z","Properties":{"$.cdid":"Simulated-File"},"SystemProperties":{"messageId":"d48413d2-d4d7-41bb-9470-dc0483466253","correlationId":"a3062fcb-5513-4c09-882e-8e642f8fe38e","connectionDeviceId":"Simulated-File","connectionAuthMethod":"{\"scope\":\"device\",\"type\":\"sas\",\"issuer\":\"iothub\",\"acceptingIpFilterRule\":null}","connectionDeviceGenerationId":"637001643970703748","contentType":"UTFiOiIxLzMvMjAxNSmVJRCI6IjEiLCJ2b2x0IjoiMTU2Ljk1MzI0NTkiLCJwcmVzcAwODYifQ=="} – Lalatendu Mohanty Aug 14 '19 at 07:25
  • You had missing bracket in your JSON, I fixed it and updated my answer. – Adam Marczak Aug 14 '19 at 09:18
  • I Have updated my requirement in the post, Please reconsider it and provide your valuable comments. Let me know if still things are not clear from my end. – Lalatendu Mohanty Aug 14 '19 at 10:47
  • Please consider making this as an answer if this helped you. – Adam Marczak Aug 14 '19 at 15:09
  • Hi Adam, Thank you so much for your profound answer. I am new to the azure cloud and am struggling to create the azure function with many error to implement the above. Could you please help me with your source code for the trigger. ~~My input & output container:- blobcontainer~~ My sensor data getting stored in below specified path on hourly basis,from where the function need to read the json. ~~Location: blobcontainer / coeanalyticsiothub / 01 / 2019 / 08 / 06 / 12 – Lalatendu Mohanty Aug 19 '19 at 10:51
  • Please post your code and errors that you are getting. – Adam Marczak Aug 19 '19 at 18:11
  • Hi Adam, Please see the code update above including errors. I have been struggling with the same to write the code in C# as I am new to this language and cloud too. It would be helpful, if you can share your answer that can lead to my improvement and getting insight on this field. – Lalatendu Mohanty Aug 20 '19 at 13:38
  • It looks like you are not writing functions properly, there are even syntax errors. Please check any guide on how to write Azure functions first. You can use my video guide too. You are doing very basic things wrong. You just need to learn functions first. 1. you are trying to use namespaces in the portal using CSX files, you have typos in the code ",". Everything is written in the error log, considering my many issues there are best case scenario is learning Azure functions basics first. At this point you would simply need me to write the function for you. – Adam Marczak Aug 20 '19 at 16:17
  • Hi Adam, Hope you are doing well. I have tried to do write the Csharp code to parse the json, but stuck at decoding the Json "body" filed which contain the raw data. Kindly assist me to make it succeeded. – Lalatendu Mohanty Sep 12 '19 at 12:32
  • It should be something along the lines of `var result = JsonConvert.DeserializeObject>(jsonTextReader);` but I think you should post separate questions per your issue not bundle them up in single thread. – Adam Marczak Sep 12 '19 at 14:50
  • Hi Adam, I have started a new thread, Require you assistance. Please refer this link. https://stackoverflow.com/questions/57919242/azure-function-to-decode-the-iot-hub-message-stored-in-azure-blob-storage – Lalatendu Mohanty Sep 13 '19 at 07:41
  • Hi Adam, I am getting below error while trying to store the json data into outputblob container. I am not able to find any solution in Google. line: [Blob("blobcontainer/{name}"), FileAccess.ReadWrite] Stream outputBlob, 2019-09-16T12:35:25.291 [Error] run.csx(50,47): error CS0426: The type name 'ReadWriteAttribute' does not exist in the type 'FileAccess' 2019-09-16T12:35:25.338 [Error] run.csx(50,47): error CS0426: The type name 'ReadWrite' does not exist in the type 'FileAccess' – Lalatendu Mohanty Sep 16 '19 at 12:40
  • Blob parenthesis is closed in the wrong place. Please remember that stackoverflow works best if you post new question whenever you encounter diferent issues so more pople can help and more people can get answers if they have similar issues. – Adam Marczak Sep 16 '19 at 12:46
  • Thanks Adam. Sure, from now on I will start new thread while encountering further issues. But, I would like to post my full fledged code here for you to review and understand your code for code improvisation. – Lalatendu Mohanty Sep 16 '19 at 12:52