I have a comma-separated text file I am reading in and parsing using textscan. Two of the fields are the date and time of day. I am able to convert both fields to fractional days using datenum, with the intention to sum the two resulting vectors.

My problem is that every so often one of the data messages includes the TIME field but not the DATE field. This is read in by textscan as an empty string. I have found that when datenum encounters the empty string, it returns an empty matrix rather than a NaN value or other filler value. This results in having vectors for TIME and DATE that are not the same length, and no obvious indicator of how to align the data.

How can I handle these empty strings in such a way that preserves the order of the data? Is there a way to get datenum to output a null value rather than simply ignoring the field? I would be fine with having a NaN or 0 or similar value to indicate the empty string. I would prefer to keep this vectorized if possible, but I understand a for loop may be necessary.

David K

1 Answer

One easy way would be to use logical indexing to process only your valid dates, and initialize the empty ones to 0 in the output. For example, if you have your dates in a cell array C, you could use cellfun and isempty to get the index like so:

index = cellfun(@isempty, C);                % True where the date string is empty
out = zeros(size(C));                        % Preallocate; empty dates stay 0 in output
out(~index) = datenum(C(~index), 'ddmmyy');  % Convert only the non-empty dates

Alternatively, you could first replace your empty strings with '0/0/0', which will be converted to a 0 by datenum. For example:

C(cellfun(@isempty, C)) = {'0/0/0'};

However, this conversion doesn't work with your specific 'ddmmyy' format (i.e. datenum('000000', 'ddmmyy') doesn't ever return 0, even when specifying the PivotYear argument). The first option may be your best bet.
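As a rough sketch of how the first option could fit the original goal of summing the date and time vectors, the snippet below uses NaN rather than 0 as the filler, so a missing date propagates into the combined timestamp. The variable names (`dateStr`, `timeStr`), the sample values, and the 'HH:MM:SS' time format are assumptions for illustration; they are not taken from the question's actual file.

```matlab
% Hypothetical sample data standing in for the two textscan output columns
dateStr = {'030118'; ''; '040118'};               % 'ddmmyy', second entry missing
timeStr = {'12:00:00'; '12:00:01'; '06:30:00'};   % assumed 'HH:MM:SS' format

% Convert only the non-empty dates; missing ones stay NaN
isMissing = cellfun(@isempty, dateStr);
dateNum = nan(size(dateStr));
dateNum(~isMissing) = datenum(dateStr(~isMissing), 'ddmmyy');

% A time-only string still gets a whole-day part from datenum,
% so keep just the fractional day
timeNum  = datenum(timeStr, 'HH:MM:SS');
timeFrac = timeNum - floor(timeNum);

% Same length as the input; NaN wherever the date was missing
stamp = dateNum + timeFrac;
```

Because `dateNum` is preallocated to the size of the input, the date and time vectors stay aligned row by row, and `isnan(stamp)` marks the messages that arrived without a DATE field.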

gnovice
  • This could definitely work! My data fields are in the format `ddmmyy`, but that shouldn't be an issue. Unfortunately since it uses `yy`, it can't distinguish between centuries, but that should be easy enough to work around. – David K Jan 03 '18 at 15:20
  • As it turns out, my main problem was actually my (unnecessary) use of `cell2mat` to convert my cell before applying `datenum`. I was able to apply `datenum` to the cell array directly, and empty strings in a cell are interpreted as Jan-1-2018. I've marked your answer as correct, since it addresses the problem I presented and the same logic can still be used to substitute for the empty strings if I desire. – David K Jan 03 '18 at 19:20