
A file handle in my syntax references a folder whose name includes a version number in YYYYMMDD format. For example, the "v20170215" referenced below:

file handle WORKING/name='ROOT\Uploads\20141001_20150930 v20170215'.

The version part of the file handle is routinely updated based on new data that needs to be processed. The file handle always ends with a "v" followed by a YYYYMMDD date.

How can I automatically extract the last "YYYYMMDD" string from the file handle (e.g., "20170215") and create a date variable out of it?

If the date were a string variable in the data, I could use something like below:

* Extract day, month, and year.
compute day = number(char.substr(...),F2.0).
compute month = number(char.substr(...),F2.0).
compute year = number(char.substr(...),F4.0).

* Compute date variable.
compute Version = date.mdy(month,day,year).
formats Version (adate10).
execute.

Since what I need to parse is a line of syntax rather than data, I suspect I should look to Python, but I'm stumped on how to tackle this.

eli-k
Larry
  • Where is this "syntax"? Is it contained in a file that you can read with a Python script? – mhawke Oct 01 '17 at 03:38
  • It's contained in the same SPSS syntax file where I need to create the date variable (and then run lots of additional code). – Larry Oct 01 '17 at 15:57

2 Answers


I'll assume you can't get the updated reference as data from the same source that creates the updated syntax (which might have been an easier solution).
Once the handle is defined, you can extract that definition into data this way:

dataset declare  myhandle.
oms/select tables/if commands=['Show'] subtypes=['File Handles']/destination format=SAV outfile='myhandle'.
show handles.
omsend.
dataset activate myhandle.

This opens a dataset called myhandle in which the variable Directory contains the full path defined in the handle. From that, extract only the string you need. See if this works for you:

compute Directory=char.substr(Directory,char.index(Directory," v")+2,10).

Now that you have the string you need, you can turn it into a date and match it into your data.
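
Since the question mentions Python, here is a minimal sketch of the same substring logic on a plain Python string, in case it helps to see the steps spelled out. The function name `version_from_directory` is hypothetical, and the sketch assumes the path always ends with " v" followed by eight digits. It uses the last " v" in the path rather than the first, which is a slightly stricter choice than char.index.

```python
from datetime import date


def version_from_directory(directory: str) -> date:
    """Mirror the char.index / char.substr steps on a plain string."""
    # Locate the last " v" marker; rfind returns -1 when it is absent.
    marker = directory.rfind(" v")
    if marker == -1:
        raise ValueError("no ' v' marker found in path")

    # Skip the two marker characters and keep exactly eight characters.
    stamp = directory[marker + 2 : marker + 10]
    if len(stamp) != 8 or not stamp.isdigit():
        raise ValueError(f"expected YYYYMMDD after ' v', got {stamp!r}")

    # Split YYYYMMDD into its parts, as the compute statements would.
    year, month, day = int(stamp[:4]), int(stamp[4:6]), int(stamp[6:8])
    return date(year, month, day)


# Example path taken from the question (assumed to be the Directory value).
print(version_from_directory(r"ROOT\Uploads\20141001_20150930 v20170215"))
```

With the path from the question, this prints 2017-02-15.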

eli-k
  • Works great! After I extracted it as a string, I broke it up into month,day,year parts: `compute month = number(char.substr(Directory,5),F2.0). compute day = number(char.substr(Directory,7),F2.0). compute year = number(char.substr(Directory,1),F4.0).` I then converted these to a date: `compute MyDate = date.mdy(month,day,year)` – Larry Oct 13 '17 at 20:34

Assuming that the syntax comes from a file that you can open and process with Python, you can split the line on whitespace, grab the date part of the last field using slicing, then feed that into datetime.strptime() to parse the string into a datetime.date object.

>>> from datetime import datetime    
>>> s = r"file handle WORKING/name='ROOT\Uploads\20141001_20150930 v20170215'."
>>> date_string = s.split()[-1][1:-2]
>>> datetime.strptime(date_string, '%Y%m%d').date()
datetime.date(2017, 2, 15)
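
If the syntax file holds many lines, a slightly more defensive variant is to scan for the file handle line and match the version with a regular expression instead of fixed slice offsets. This is only a sketch: the function name `version_date` and the pattern are assumptions, and it presumes the handle definition sits on a single line ending in "v" plus eight digits, a closing quote, and an optional period.

```python
import re
from datetime import date, datetime

# Assumed pattern: "v" + eight digits immediately before the closing quote,
# optionally followed by the command-terminating period.
VERSION_RE = re.compile(r"v(\d{8})['\"]\s*\.?\s*$")


def version_date(syntax_text: str) -> date:
    """Return the version date from the first matching file handle line."""
    for line in syntax_text.splitlines():
        # Only consider lines that start a FILE HANDLE command.
        if line.strip().lower().startswith("file handle"):
            match = VERSION_RE.search(line)
            if match:
                # strptime raises ValueError for impossible dates.
                return datetime.strptime(match.group(1), "%Y%m%d").date()
    raise ValueError("no versioned file handle found")


sample = "\n".join([
    "* Some other syntax.",
    r"file handle WORKING/name='ROOT\Uploads\20141001_20150930 v20170215'.",
    "execute.",
])
print(version_date(sample))
```

For the sample above, this prints 2017-02-15. Reading the real file would just mean passing its contents to `version_date`.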
mhawke