I have the following CSV file with column headings on line 1:
Test.csv
--------
Prj , Cap
A , 1
A , 2
H , 4
H , 5
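For anyone who wants to reproduce this, the file can be recreated with a short sketch like the one below. The variable names fileLines and fid are only illustrative; the content written is exactly the five lines listed above, including the spaces around the commas.

```matlab
% Recreate Test.csv exactly as listed above (spaces around the commas included).
% fileLines and fid are illustrative names, not part of the original session.
fileLines = {'Prj , Cap', 'A , 1', 'A , 2', 'H , 4', 'H , 5'};

fid = fopen('Test.csv', 'w');          % open for writing, discarding any old content
if fid == -1
    error('Could not open Test.csv for writing.');
end
fprintf(fid, '%s\n', fileLines{:});    % one line per cell, Unix line endings
fclose(fid);
```

This writes Unix line endings; substituting '%s\r\n' in the fprintf call should give the DOS variant.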
I tried to read this into a table, but I'm having trouble making readtable
recognize the column headings on line 1:
readtable( 'Test.csv' , ...
delimitedTextImportOptions( 'VariableNamesLine' , 1 ) )
Var1 ExtraVar1
_____ _________
'Prj' ' Cap'
'A' ' 1'
'A' ' 2'
'H' ' 4'
'H' ' 5'
What am I misunderstanding about the VariableNamesLine parameter?
I am using Matlab 2019a. doc delimitedTextImportOptions shows it as being introduced in Matlab 2016b, so it is available in my version.
Troubleshooting steps
Here is the delimitedTextImportOptions object:
dtio = delimitedTextImportOptions( 'VariableNamesLine' , 1)
DelimitedTextImportOptions with properties:
Format Properties:
Delimiter: {','}
Whitespace: '\b\t '
LineEnding: {'\n' '\r' '\r\n'}
CommentStyle: {}
ConsecutiveDelimitersRule: 'split'
LeadingDelimitersRule: 'keep'
EmptyLineRule: 'skip'
Encoding: 'system'
Replacement Properties:
MissingRule: 'fill'
ImportErrorRule: 'fill'
ExtraColumnsRule: 'addvars'
Variable Import Properties: Set types by name using setvartype
VariableNames: {'Var1'}
VariableTypes: {'char'}
SelectedVariableNames: {'Var1'}
VariableOptions: Show all 1 VariableOptions
Access VariableOptions sub-properties using setvaropts/getvaropts
Location Properties:
DataLines: [1 Inf]
VariableNamesLine: 1
RowNamesColumn: 0
VariableUnitsLine: 0
VariableDescriptionsLine: 0
If I specify ReadVariableNames as true, only the first column heading is recognized, and it is still repeated in the data.
readtable( 'Test.csv' , dtio , 'ReadVariableNames',true )
Prj ExtraVar1
_____ _________
'Prj' ' Cap'
'A' ' 1'
'A' ' 2'
'H' ' 4'
'H' ' 5'
I can avoid having the headings read as data by explicitly specifying DataLines, but the second column heading is still not read.
dtio = delimitedTextImportOptions( ...
'VariableNamesLine' , 1 , ...
'DataLines' , [2 Inf] );
readtable( 'Test.csv' , dtio , 'ReadVariableNames',true )
Prj ExtraVar1
___ _________
'A' ' 1'
'A' ' 2'
'H' ' 4'
'H' ' 5'
Oddly, the DataLines specification is ignored if I additionally clear any preset VariableNames:
dtio = delimitedTextImportOptions( ...
'VariableNamesLine' , 1 , ...
'DataLines' , [2 Inf] , ...
'VariableNames' , {} );
readtable( 'Test.csv' , dtio , 'ReadVariableNames',true )
ExtraVar1 ExtraVar2
_________ _________
'Prj ' ' Cap'
'A ' ' 1'
'A ' ' 2'
'H ' ' 4'
'H ' ' 5'
Following suggestions in the responses, I tried the default readtable options. Unfortunately, this did not recognize the comma as a delimiter:
readtable('Test.csv')
Warning: Table variable names were modified to make them valid MATLAB identifiers. The original names are saved in the VariableDescriptions property.
Prj x_ Cap
___ ___ ___
'A' ',' 1
'A' ',' 2
'H' ',' 4
'H' ',' 5
Using a format string gets the column heading line recognized, but the white space around the delimiters is kept in the string column:
readtable('Test.csv', 'Format', '%s%u')
Prj Cap
_______ ___
'A ' 1
'A ' 2
'H ' 4
'H ' 5
I get the same results regardless of whether Test.csv has Unix or DOS line endings.
I will continue to investigate, read, and experiment.
P.S. Oddly, the Matlab Answers forum at Matlab Central would not let me post this question, which I tried before coming here. I can enter text for the subject heading, but no insertion point appears in the message body no matter how much I click. It happens in both Firefox and Edge.