I have a pipe-delimited file with varying numbers of columns, like this:
id|name|attribute|extraattribute
1|alvin|cool|funny
2|bob|tall
3|cindy|smart|funny
I'm trying to find an elegant way to import this into a DataFrame using PySpark. I could fix the files by adding a trailing | whenever the last column is missing (only the last column can ever be missing), but I would love a solution that doesn't involve changing the input files.
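For reference, here is a rough sketch of the kind of thing I'm after, not a tested solution. As far as I can tell from the docs, Spark's CSV reader in its default PERMISSIVE mode sets missing trailing fields to null, so spark.read.csv(path, sep="|", header=True) may already be enough. If it isn't, a manual fallback would be to read the file as text and pad short rows myself. The names pad_row and load_ragged are made up for illustration, and the sketch assumes an existing SparkSession, a header on the first line, and that every column can be read as a string.

```python
def pad_row(line, width, sep="|"):
    """Split one line on `sep` and right-pad with None up to `width` fields."""
    fields = line.split(sep)
    # A negative count yields an empty list, so full-width rows are unchanged.
    return fields + [None] * (width - len(fields))


def load_ragged(spark, path, sep="|"):
    """Hypothetical helper: `spark` is assumed to be an existing SparkSession."""
    lines = spark.sparkContext.textFile(path)
    header = lines.first()            # assumes the first line is the header
    columns = header.split(sep)
    width = len(columns)
    rows = (
        lines.filter(lambda l: l != header)   # drop the header line
             .map(lambda l: pad_row(l, width, sep))
    )
    # Every column is read as a string here; cast afterwards if needed.
    schema = ", ".join(f"`{c}` string" for c in columns)
    return spark.createDataFrame(rows, schema)
```

With the sample above, pad_row("2|bob|tall", 4) gives ["2", "bob", "tall", None], so bob's extraattribute would come through as null.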