I am working on a school project for data mining, where we were given CSV data from kaggle (this is how the data looks (2 lines out of 6970)):
4,1970,Female,150,DomesticPartnersKids,Bachelor's Degree,Democrat,,Yes,No,No,No,Yes,Public,No,Yes,No,Yes,No,No,Yes,Science,Study first,Yes,Yes,No,No,Receiving,No,No,Pragmatist,No,No,Cool headed,Standard hours,No,Happy,Yes,Yes,Yes,No,A.M.,No,End,Yes,No,Me,Yes,Yes,No,Yes,No,Mysterious,No,No,,,,,,,,,,Mac,Yes,Cautious,No,Umm...,No,Space,Yes,In-person,No,Yes,Yes,No,Yay people!,Yes,Yes,Yes,Yes,Yes,No,Yes,,,,,,,,,,,,,,,,,No,No,No,Only-child,Yes,No,No
5,1997,Male,75,Single,High School Diploma,Republican,,Yes,Yes,No,,Yes,Private,No,No,No,Yes,No,No,Yes,Science,Study first,,Yes,No,Yes,Receiving,No,Yes,Pragmatist,No,Yes,Cool headed,Odd hours,No,Right,Yes,No,No,Yes,A.M.,Yes,Start,Yes,Yes,Circumstances,No,Yes,No,Yes,Yes,Mysterious,No,No,Tunes,Technology,Yes,Yes,Yes,Yes,No,Supportive,No,PC,No,Cautious,No,Umm...,No,Space,No,In-person,No,No,Yes,Yes,Grrr people,Yes,No,No,No,No,No,No,Yes,No,No,Yes,No,Own,Pessimist,Mom,No,No,No,No,Nope,Yes,No,No,No,Yes,No,Yes,No,Yes,No
and we have to get this to an .arff format for use in weka. I manualy typed the header(107 attributes)
@ATTRIBUTE user_id NUMERIC
@ATTRIBUTE yob NUMERIC
@ATTRIBUTE gender {Male,Female}
@ATTRIBUTE income {150,100,75,50,25,10}
@ATTRIBUTE householdstatus {MarriedKids,Married,DomesticPartnersKids,DomesticPartners,Single,SingleKids}
@ATTRIBUTE educationlevel {Bachelor's Degree,High School Diploma,Current K-12,Current Undergraduate,Master's Degree,Associate's Degree,Doctoral Degree}
@ATTRIBUTE party {Democrat,Republican}
@ATTRIBUTE Q124742 {Yes,No}
@ATTRIBUTE Q124122 {Yes,No}
and I get this error :
} expected at end of enumeration read token eol
Then I tried to use the weka converter but it gave me an error
Wrong number of values.Read 2,expected 1,read Token[EOL],line 4 Problem encountered at line:3