We have a Microsoft HDInsight cluster. I would like a PowerShell one-liner that counts the lines in a file stored in Azure Blob storage. I'd appreciate your help.
-
What have you tried? Can you run PowerShell in HDInsight? I bet I can google a PS script within five minutes. Let's race – Nick.Mc Jan 07 '20 at 13:34
-
Here's a start. https://stackoverflow.com/questions/12084642/powershell-get-number-of-lines-of-big-large-file – Nick.Mc Jan 07 '20 at 13:35
-
Oh, except the blob makes it a little trickier. I don't know if there is a blob API that lets you do this. – Nick.Mc Jan 07 '20 at 13:36
-
Here are some Hadoop-native ways to do it: https://stackoverflow.com/questions/32612867/how-to-count-lines-in-a-file-on-hdfs-command – Nick.Mc Jan 07 '20 at 13:38
-
I searched Google and didn't find a one-liner. One possible approach is to download each file to the local system and count the lines there, but I expect that to take some time, as I have around 30 files of 1 GB each – TomG Jan 07 '20 at 13:49
1 Answer
The following should give what you want:
Get-AzureStorageBlobContent -Container "ContainerName" -Blob "Blob" | Measure-Object -Line
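
One caveat: as far as I know, Get-AzureStorageBlobContent saves the blob to a local file and emits a blob object rather than the file's text, so the pipeline above may not count the actual lines. With around 30 files of 1 GB each, it may also be preferable not to keep local copies at all. Below is a minimal sketch of a streaming alternative. The helper name Get-StreamLineCount is made up for illustration, and the commented usage assumes the classic Azure.Storage module, where the object returned by Get-AzureStorageBlob exposes ICloudBlob.OpenRead(); the account name and key are placeholders. The data still travels over the network, it just is not written to disk.

```powershell
# Hypothetical helper (not part of any Azure module): counts the lines
# readable from any .NET stream without loading the content into memory.
function Get-StreamLineCount {
    param(
        [Parameter(Mandatory = $true)]
        [System.IO.Stream]$Stream
    )
    $reader = New-Object -TypeName System.IO.StreamReader -ArgumentList $Stream
    try {
        [long]$count = 0
        # ReadLine() returns $null only at end of stream; empty lines return "".
        while ($null -ne $reader.ReadLine()) { $count++ }
        return $count
    }
    finally {
        # Disposing the reader also closes the underlying stream.
        $reader.Dispose()
    }
}

# Assumed usage against a blob (placeholders, needs the Azure.Storage module):
# $ctx  = New-AzureStorageContext -StorageAccountName "<account>" -StorageAccountKey "<key>"
# $blob = Get-AzureStorageBlob -Container "ContainerName" -Blob "Blob" -Context $ctx
# Get-StreamLineCount -Stream $blob.ICloudBlob.OpenRead()
```

If a temporary local copy is acceptable, the same helper also works on a downloaded file, for example by passing it the stream from [System.IO.File]::OpenRead with the file's path.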

Thiago Custodio