CSV is a tricky format, and it's generally inadvisable to read CSV data with an incomplete parser. Of course, this has never stopped anyone from doing just that.
Should one want to do it by the book, however, the incantation goes like this.
package require csv
package require struct::matrix
Create a matrix data structure which will hold the data from the CSV file and enable us to work with it:
::struct::matrix m
m is now a command in the current namespace (you can add namespaces to the name to create it in another namespace). Once you're done with the matrix, you should call m destroy.
You can also let the module name your matrix command and use it through a variable:
set m [::struct::matrix]
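As a minimal sketch of the variable-based style, the snippet below creates an auto-named matrix, fills it by hand, and destroys it. The variable name mx and the sample row are illustrative assumptions, not part of the original example:

```tcl
package require struct::matrix

# Let the module choose the command name; mx (a hypothetical name) holds it.
set mx [::struct::matrix]

# A new matrix is empty, so give it three columns before adding a row.
$mx add columns 3
$mx add row {{Dec 25} Christmas {US Holiday}}

# Every method call goes through the variable instead of a fixed name.
set first [$mx get cell 0 0]
set count [$mx rows]
puts "$count row(s), first cell: $first"

# Release the matrix once it is no longer needed.
$mx destroy
```

The rest of this answer keeps using the fixed command name m, so with this style each m call would become $mx instead.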
Now that you have a matrix, you can load the contents of the CSV file into it:
set ch [open holiday.csv]
::csv::read2matrix $ch m , auto
chan close $ch
You can inspect it with m serialize (I've added some line breaks for readability):
3 3 {
{{Dec 25} Christmas {US Holiday }}
{{Jan 1} {New Year} {US Holiday }}
{{Jan 19} {Martin Luther King} {US Holiday}}
}
To search for a given date:
proc findDate date {
    m search column 0 $date
}
To search for a given string in the third column:
proc findStr str {
    m search -glob column 2 $str*
}
(Since some of the values in the column have junk trailing whitespace, we need to search by string match rules (-glob) instead of the default exact match.)
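A different approach, not the one used in this answer, is to trim every cell once after loading so that the default exact search works. The sketch below assumes that leading and trailing whitespace is never significant in the data; the matrix name demo and its rows are stand-ins for the loaded file:

```tcl
package require struct::matrix

# Hypothetical stand-in for the matrix loaded from holiday.csv.
::struct::matrix demo
demo add columns 3
demo add row {{Dec 25} Christmas {US Holiday }}
demo add row {{Jan 19} {Martin Luther King} {US Holiday}}

# Trim every cell in place (assumes surrounding whitespace is junk).
for {set row 0} {$row < [demo rows]} {incr row} {
    for {set col 0} {$col < [demo columns]} {incr col} {
        demo set cell $col $row [string trim [demo get cell $col $row]]
    }
}

# The default exact search now finds both rows in the third column.
set hits [demo search column 2 {US Holiday}]
puts $hits

demo destroy
```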
Both of these commands return a list of the cells that the search turned up. Each cell is designated by a column/row pair of values, e.g. {0 2} for a match in the first column, third row.
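To make the shape of the result concrete, here is a small sketch that unpacks each hit and fetches the matching row. The matrix name demo is a hypothetical stand-in that is filled by hand rather than read from the file:

```tcl
package require struct::matrix

# Hypothetical stand-in for the matrix loaded from holiday.csv.
::struct::matrix demo
demo add columns 3
demo add row {{Dec 25} Christmas {US Holiday }}
demo add row {{Jan 1} {New Year} {US Holiday }}
demo add row {{Jan 19} {Martin Luther King} {US Holiday}}

# Each search hit is a column/row pair; unpack it and fetch the whole row.
set hits [demo search column 0 {Jan 19}]
foreach hit $hits {
    lassign $hit col row
    puts "column $col, row $row: [demo get row $row]"
}

demo destroy
```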
If we just want to find out whether a given date occurs in the file, this predicate will do:
proc hasDate date {
    expr {[llength [findDate $date]] > 0}
}
But if we want to be sure that the row the date was on really contains a US holiday, we need to check the third column as well. There are many ways to do this. For one of them, I first need a helper function to transform a list of cell descriptors to a list of row numbers:
proc getRowNums cells {
    lmap cell $cells {lindex $cell 1}
}
Now I can check for date and string like this:
proc hasDateAndString {date str} {
    set r1 [getRowNums [findDate $date]]
    set r2 [getRowNums [findStr $str]]
    # do any rows overlap?
    foreach r $r1 {
        if {$r in $r2} {
            return true
        }
    }
    return false
}
This works by checking if the two lists of rows share any values. If they don't, the date does not designate a US holiday.
Another way is to traverse the matrix by rows and check the relevant items on each row:
proc hasDateAndString {date str} {
    for {set row 0} {$row < [m rows]} {incr row} {
        lassign [m get row $row] dateVal - strVal
        if {$date eq $dateVal && [string match $str* $strVal]} {
            return true
        }
    }
    return false
}
For every row I look at, I extract a list of values using m get row $row and lassign those values into variables that I can check against.
Note: struct::matrix isn't very pleasant to work with. People say it's slow and, what's worse, it isn't really very good at hiding the low-level details. In some cases it's less work to read a CSV file using ordinary Tcl I/O, use ::csv::split to get the fields from each row, and write them back after using ::csv::join to convert them to CSV strings again.
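A rough sketch of that matrix-free approach follows. The helper names readCsv and writeCsv are made up for illustration, and reading line by line assumes that no quoted field contains an embedded newline:

```tcl
package require csv

# Hypothetical helper: read a CSV file into a list of field lists,
# using plain channel I/O and ::csv::split on each line.
proc readCsv {path} {
    set ch [open $path]
    set rows [list]
    while {[gets $ch line] >= 0} {
        lappend rows [::csv::split $line]
    }
    chan close $ch
    return $rows
}

# Hypothetical helper: write the rows back, one CSV line per row,
# letting ::csv::join take care of quoting.
proc writeCsv {path rows} {
    set ch [open $path w]
    foreach row $rows {
        puts $ch [::csv::join $row]
    }
    chan close $ch
}
```

With these in place, readCsv holiday.csv would return an ordinary list of rows that can be searched with foreach, lsearch and friends, and writeCsv would store the (possibly modified) rows again.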
Documentation: chan, csv, expr, for, foreach, if, lassign, llength, lmap, open, package, proc, return, set, string, struct::matrix
lmap is built in from Tcl 8.6 onwards; Tcl 8.4 and 8.5 need a replacement for it.