My data set is in Korean rather than English, and it has more than 3,000 observations.
The data set is named demo, and each row holds information about one person.
str(demo)
$ 거주지역: Factor w/ 900 levels "","강원 강릉시 포남1동",..: 595 235 595 832 12 126 600 321 600 589 ...
Above is the structure of the 4th column of the data set (거주지역, i.e. area of residence).
I want to group people by the 4th column, which holds their addresses. The problem is that the factor has 900 levels, because each address is written out in full.
I want to assign each person to a province, so R needs to read the factor levels and pick out the part of the text that identifies the province.
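To make the goal concrete, here is a minimal sketch of the result I am after, run on toy data rather than on demo itself. It assumes the province is always the first space-separated word of the address, which I have not verified for all 900 levels. The data frame toy, its column names, and every address except the first are made up for illustration.

```r
# Toy stand-in for the 4th column of demo; only the first address comes
# from the real data, the others are invented examples.
toy <- data.frame(
  address = factor(c("강원 강릉시 포남1동", "", "서울 강남구 역삼동", "강원 춘천시 효자동"))
)

# Work on the text, not on the integer codes behind the factor.
addr <- as.character(toy$address)

# Treat empty addresses as missing.
addr[addr == ""] <- NA

# Assumption: the province is everything before the first space,
# so drop the first space and all text after it.
toy$province <- factor(sub(" .*$", "", addr))

# Count people per province.
print(table(toy$province))
```

If that assumption held for the real data, the same idea would presumably be demo$province <- factor(sub(" .*$", "", as.character(demo[[4]]))), but I do not know whether that is the right or the idiomatic way.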
How can I do this? Any help would be appreciated. I have searched for a long time but could not find a solution.