I'm trying to work with a data set that has no header and uses :: as the field delimiter:
! wget --quiet http://files.grouplens.org/datasets/movielens/ml-1m.zip
! unzip ml-1m.zip
! mv ml-1m/ratings.dat .
! head ratings.dat
The output:
1::1193::5::978300760
1::661::3::978302109
1::914::3::978301968
I have loaded the file into my dsx pipeline, but I am unclear how to get dsx to split this file on the :: delimiters.
How do I do this?
If it is not possible to get dsx to reshape this file using the dsx ml pipeline functionality, does dsx have any prerequisites in terms of input file format?
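If dsx cannot split on :: directly, one possible fallback would be to convert the file in a notebook cell before adding it to the pipeline. The following is only a minimal sketch of that idea, using plain Python rather than any dsx API. It assumes that a comma-separated file with a header row is an acceptable input format (not confirmed), and the column names are taken from the MovieLens 1M README since the file itself has no header. The function name convert_ratings is hypothetical.

```python
import csv
import io

# Column names follow the MovieLens 1M README (an assumption here;
# ratings.dat itself has no header row).
COLUMNS = ["user_id", "movie_id", "rating", "timestamp"]


def convert_ratings(src, dst, delimiter="::"):
    """Copy '::'-delimited lines from src to dst as CSV with a header row.

    Returns the number of data rows written.
    """
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow(COLUMNS)
    count = 0
    for line in src:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        writer.writerow(line.split(delimiter))
        count += 1
    return count


# Demo on the three sample lines shown in the question.
sample = io.StringIO(
    "1::1193::5::978300760\n"
    "1::661::3::978302109\n"
    "1::914::3::978301968\n"
)
out = io.StringIO()
n_rows = convert_ratings(sample, out)
print(out.getvalue())
```

For the real file, the same function could be called with open("ratings.dat") as src and open("ratings.csv", "w", newline="") as dst, and the resulting ratings.csv added as the data set instead.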
Update:
The ml pipeline functionality I'm trying to use can be seen from the screenshot below:
I have added a data set, but can't figure out how to get dsx to recognise the field delimiters: