0

We have a Microsoft HDInsight cluster. I would like a PowerShell one-liner that returns the number of lines in a file stored in Azure Blob storage. I'd appreciate your help.

TomG
  • What have you tried? Can you run PowerShell in HDInsight? I bet I can google a PS script within five minutes. Let's race – Nick.Mc Jan 07 '20 at 13:34
  • Here's a start: https://stackoverflow.com/questions/12084642/powershell-get-number-of-lines-of-big-large-file – Nick.Mc Jan 07 '20 at 13:35
  • Oh, except the blob makes it a little trickier. I don't know if there is a blob API that lets you do this. – Nick.Mc Jan 07 '20 at 13:36
  • Here are some Hadoop-native ways to do it: https://stackoverflow.com/questions/32612867/how-to-count-lines-in-a-file-on-hdfs-command – Nick.Mc Jan 07 '20 at 13:38
  • I searched Google and didn't find a one-liner. One possible approach is to download each file to the local system and count the lines there, but I expect that to take some time, as I have around 30 files of 1 GB each. – TomG Jan 07 '20 at 13:49

1 Answer

1

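The following should give you what you want. Note that Get-AzureStorageBlobContent downloads the blob to a local file and returns a blob object rather than the file's text, so the lines are counted from the local copy. This assumes a default storage context is already set; otherwise pass -Context as well:

Get-AzureStorageBlobContent -Container "ContainerName" -Blob "Blob" -Destination "blob.txt" | Out-Null; Get-Content "blob.txt" | Measure-Object -Line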
Thiago Custodio
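
For files around 1 GB, as mentioned in the comments, counting lines with a streaming reader after the download may be faster than piping through the pipeline. The sketch below is only an illustration, not part of the original answer: Get-LineCount is a hypothetical helper name, and the commented usage assumes the legacy Azure storage cmdlets, a default storage context, and flat blob names without "/" in them.

```powershell
# Hypothetical helper: counts lines by streaming the file, so a large file
# is read line by line instead of being loaded as a whole.
function Get-LineCount {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Path
    )

    # .NET file APIs need an absolute path, so resolve it first.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $reader = [System.IO.File]::OpenText($fullPath)
    try {
        $count = 0
        # ReadLine returns $null at end of file.
        while ($null -ne $reader.ReadLine()) { $count++ }
        return $count
    }
    finally {
        $reader.Dispose()
    }
}

# Possible usage for every blob in a container (assumptions: "ContainerName"
# is a placeholder, a default storage context is set, blob names are flat):
#
# Get-AzureStorageBlob -Container "ContainerName" | ForEach-Object {
#     $local = Join-Path $env:TEMP $_.Name
#     Get-AzureStorageBlobContent -Container "ContainerName" -Blob $_.Name -Destination $local -Force | Out-Null
#     "{0}: {1}" -f $_.Name, (Get-LineCount -Path $local)
#     Remove-Item $local
# }
```

This still downloads each blob once; it only avoids the overhead of passing every line through the pipeline. If the cluster itself can run the count, the Hadoop-native approach linked in the comments avoids the download entirely.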