I'm trying to read a CSV file that contains comments and empty lines, and I need to fetch only the lines that are neither empty nor commented out.

File looks like this:
Test File for dry run:

#This is a comment
# This is a comment with, comma

# This,is,a,comment with exact number of commas as valid lines

h1,h2,h3,h4
a,b,c,d

e,f,g,h

i,j,k,l
m,n,o,p

Expected Output:

h1 h2 h3 h4
-----------
a  b  c  d 
e  f  g  h 
i  j  k  l 
m  n  o  p

Unsuccessful attempt:

q)("SSSS";enlist ",")0: ssr[;;]each read0 `:test.csv // tried various ssr options, but the '*' wildcard gives an error with ssr, so I'm not sure how to use a regex here
Utsav
  • What system are you running on? If you are on a Unix-based system you should have access to the `sed` utility, which could be used to preprocess the file: `sed -i -e '/^$/d' -e '/^#/d' test.csv`. This removes empty lines first, then comment lines. The file could then be read using ```("SSSS";enlist ",")0:`:test.csv``` from a q process. – SeanHehir Apr 16 '21 at 15:59
  • I need to read it directly in q/kdb as a part of a q script. – Utsav Apr 16 '21 at 23:16

1 Answer

This provides the required result by dropping the first five lines of the file (`5_`) and then removing the remaining empty lines:

q)("SSSS";enlist",")0:t where not""~/:t:5_read0`:test.csv
h1 h2 h3 h4
-----------
a  b  c  d
e  f  g  h
i  j  k  l
m  n  o  p

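To ignore any number of comment lines, wherever they appear, you could filter on the first character of each line instead (`first` of an empty string is `" "`, so this drops empty lines as well):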

q)("SSSS";enlist",")0:t where not any each(" ";"#")~\:/:first each t:read0`:test.csv
h1 h2 h3 h4
-----------
a  b  c  d
e  f  g  h
i  j  k  l
m  n  o  p
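
For use inside a q script, the same filter can be split into named steps. This is only a sketch of the one-liner above, not a different method: the variable names `lines`, `c` and `keep` are illustrative, and it assumes every line to skip is either empty or starts with `#` (or a space). Any other leading line, such as a title line, would be kept and would need to be dropped separately.

```q
/ sketch: the first-character filter, one step per line (names are illustrative)
lines:read0`:test.csv                 / list of strings, one per line of the file
c:first each lines                    / first char of each line; " " for an empty line
keep:not c in " #"                    / 1b where the line is not empty, space-led or a comment
("SSSS";enlist",")0:lines where keep  / parse what is left; the first kept line is the header
```

Here `c in " #"` is a shorter way of writing the `any each(" ";"#")~\:/:` test. Under the assumptions above, the first kept line is `h1,h2,h3,h4`, which `0:` takes as the header.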
Cathal O'Neill