I am trying to do a join in stream analytics with a reference data.
Below is the input from stream data.
[{
"id":"111111101",
"basetime":0,
"xyz":
[
{
"xxx":1,
"yyy":2631,
"aaa":"470A01",
"id":1
},
{
"xxx":0,
"yyy":0,
"aaa":"000000",
"id":61
}
]
},
{
"id":"111111102",
"basetime":0,
"xyz":
[
{
"xxx":1,
"yyy":2631,
"aaa":"03F4EB",
"id":1
}
]
},
{
"id":"111111103",
"basetime":0,
"xyz":
[
{
"xxx":1,
"yyy":2631,
"aaa":"6706",
"id":1
}
]
}
]
Below is the reference Master data.
[
{
"aaa": "470A01"
},
{
"aaa": "03F4EB"
},
{
"aaa": "710211"
}
]
The SAQL written is as shown below.
WITH INPUT1 AS (
SELECT
input.id.dateTime AS ID,
flatArrayElement as ABC,
FROM [signals2] as input
CROSS APPLY GetArrayElements(input.xyz) AS flatArrayElement
)
I have used CROSS APPLY to get each element in xyz as different rows.
The output of INPUT1 is as shown below.
+----------+------------------------------------------------------------------------+
| ID | ABC |
+----------+------------------------------------------------------------------------+
| 111111101| {"ArrayValue":{"xxx":1,"yyy":2631,"aaa":470A01,"id":1},"ArrayIndex":0} |
| 111111101| {"ArrayValue":{"xxx":0,"yyy":0,"aaa":000000,"id":61},"ArrayIndex":1} |
| 111111102| {"ArrayValue":{"xxx":1,"yyy":2631,"aaa":03F4EB,"id":1},"ArrayIndex":0} |
| 111111103| {"ArrayValue":{"xxx":1,"yyy":2631,"aaa":6706,"id":1},"ArrayIndex":0} |
+-------------------+---------------------------------------------------------------+
Now I am trying to join the data xyz.aaa with a reference data as below, where master is the reference data.
SIGNALS AS (
SELECT * FROM INPUT1 I JOIN master M ON I.ABC.ArrayValue.aaa = M.aaa
I am getting below output, but the problem is xyz with more than one element is duplicating in the output.
+-------------------------------+------------------------------------------------------------------------------------------+--------+
| i___timestamp | i | m |
+-------------------------------+------------------------------------------------------------------------------------------+--------+
| "2019-11-13T03:36:22.4636494Z"| "id": "111111101",{"ArrayValue":{"xxx":1,"yyy":2631,"aaa":470A01,"id":1},"ArrayIndex":0} | 470A01 |
| "2019-11-13T03:36:22.4636494Z"| "id": "111111101",{"ArrayValue":{"xxx":1,"yyy":2631,"aaa":470A01,"id":1},"ArrayIndex":0} | 470A01 |
| "2019-11-13T03:36:22.4636494Z"| "id": "111111102",{"ArrayValue":{"xxx":1,"yyy":2631,"aaa":03F4EB,"id":1},"ArrayIndex":0} | 03F4EB |
+-------------------------------+------------------------------------------------------------------------------------------+--------+
I am confused why the first two rows are duplicating,it should be only one entry for that. Out of two elements in xyz one is valid and one is invalid. But here the valid element is repeating two times. What might be the reason? How to resolve this ?