4

I am posting this optimistically after searching for an answer here on SO, and even when SO tells me my question might be closed, as I think in this case it's a valid question.

Consider a CSV file with a column containing string representing either dates, or times, or both. I want to find out after reviewing the column, just that - exactly what type of column is it, not just that it's a valid "date"?

PHP function strtotime does an amazing job of returning a unix timestamp for pretty much any date-time-ish string. But (today when I'm posting this on 10/8/2018), 3:45PM and 15:45:00 and 10/8/2018 3:45PM would all return the same unix time, though obviously the first two are times.

What is a method to determine if a string is strictly a date component, a time component, or both?

P.S. If I have to write a function myself, so far the best lead would be to look for a : in the string, which would mean there's a time component (meaning either time, or datetime). If it parses as a datetime, but with no : present, then we could assume it's a date only. But again, I am wondering if PHP has a more elegant way. Here is a "pretty good" solution:

P.P.S this function is actually a "very good" solution now thanks to @KarstenKoop's clever suggestion in comments about the 2nd parameter for strtotime:

function date_time_component($date){
    if(strtotime($date) === false) return false;
    if(strtotime($date, 86400) !== strtotime($date, 86400 * 2)) return 'time';
    if(strstr($date, ':')) return 'datetime';
    return 'date';
}
Oliver Williams
  • 5,966
  • 7
  • 36
  • 78
  • What is the representation format in the CSV? Do times and dates obey certain rules that you could rely upon? – SirDarius Oct 08 '18 at 11:08
  • Let's say no. The whole purpose of this is to identify the intent of the column with wide latitude as to format - so for example 3:15, or 3:15PM, or even "noon" - but know that the column is _only_ a time or _only_ a date.. – Oliver Williams Oct 08 '18 at 11:11
  • Let me see if I understand you right; you want to validate your date/timestamp to check if it has a valid format? – Sanguinary Oct 08 '18 at 11:16
  • No. Assuming it _is_ valid, I want to determine if it is time-only, date-only, or datetime (both). See the function example I added after "P.S." in my question. – Oliver Williams Oct 08 '18 at 11:24
  • 1
    You can pass a timestamp as a second optional parameter to `strtotime()` for relative calculation. So if your input only contains a time, the date would be taken from the passed value, and you could try with two different ones and compare the results. If both are the same, date was taken from the input, if they are different, there was no date (or a relative date, like "last thursday") in the input. – Karsten Koop Oct 08 '18 at 11:39
  • Does https://3v4l.org/1PWlp help you? – Gabriel Heming Oct 08 '18 at 11:53
  • @KarstenKoop - your answer was the best response, see my edited function above which does what I want. Thanks! – Oliver Williams Oct 08 '18 at 11:58

1 Answers1

0

You should start by writing a set of test cases - which you could have included in your question. There are lots of different ways to write dates and a few ways to write times.

Here's some to get you started:

dates

 2018-10-08
 08/10/2018
 10/08/2018
 08-10-2018
 10-08-2018
 10-Oct-2018
 20181008
 10 Oct 2018
 10th October 2018
 October, 10th 2018

times

 1121
 11:21
 11:21 AM
 11:21
 11:21:46
 11:21:46 AM

Timezones

 +00:00
 +0
 +0000
 Europe/London

Then you have all the possible combinations of the 3 components:

date
date time
date time timezone
time
time timezone
time date
time timezone date

That gives 400 different formats (not all of which are uniquely resolvable).

(using strtotime to do the heavy lifting starts making a lot of sense). But rather than trying to parse the data, looking for specific patterns might be a better approach: are there strings of letters? a + or -? How many digits? How many consecutive digits?

You should still be starting with a list of test cases.

symcbean
  • 47,736
  • 6
  • 59
  • 94