I have a column with the following data:
Size: 100x7

val =

USA
USA
France
USA
France

I want to show the data on a pie chart. To do this, I need to know how many times USA occurs in this column, and likewise for every other country.
I read about the functions unique and accumarray, but I have not managed to get them to work.
I would appreciate some suggestions on how to do that.
Thanks.

Ofir Attia

4 Answers

Use the third output of unique, and make sure that the input strings are in a cell array. The third output assigns an integer ID to every element of the input, with equal elements sharing the same ID. For example, if the input were a sequence of the characters a to e, each distinct character would get its own ID between 1 and 5. The first output of unique is an array that contains only the unique values seen in the input.

You can then use accumarray on this third output to count how many times you see a particular country over all countries listed.

val = {'USA'; 'USA'; 'France'; 'USA'; 'France'};
[countries,~,id] = unique(val);
counts = accumarray(id, 1);

I get:

counts = 

2 
3

Also for countries:

countries = 

    'France'
    'USA'

Notice that each element of counts corresponds to how many times you see that particular country in the same position as the country in countries, so France is seen 2 times, and USA 3 times.
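To make the role of the third output more concrete, here is a minimal sketch that prints the ID vector for the same sample data and shows that indexing countries with it rebuilds the original column. The variable name rebuilt is only for illustration, and id(:) is used purely as a precaution so that accumarray always receives a column vector.

```matlab
% Same sample data as in the answer above
val = {'USA'; 'USA'; 'France'; 'USA'; 'France'};

% countries: sorted unique strings
% id: for each row of val, the position of that string in countries
[countries, ~, id] = unique(val);

disp(id(:).');                 % 2 2 1 2 1  (France -> 1, USA -> 2)

% Indexing countries with id rebuilds the original column
rebuilt = countries(id(:));
disp(isequal(rebuilt, val));   % 1

% accumarray adds a 1 into bin id(k) for every row k,
% which is the same as counting the rows that belong to each country
counts = accumarray(id(:), 1);

% Print each label next to its count
for k = 1:numel(countries)
    fprintf('%s: %d\n', countries{k}, counts(k));
end
```

Because counts and countries share the same ordering, they can be passed together to a plotting function such as pie.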

rayryeng

You can use unique with histc -

% Get countries and their occurrences
[countries,~,id] = unique(cellstr(val),'stable')
occurrences = histc(id,1:max(id))

You can then display the number of occurrences against the country names as a table -

>> table(countries,occurrences)
ans = 
    countries    occurrences
    _________    ___________
    'USA'        3          
    'France'     2       

Display output as a pie chart -

>> pie(occurrences,countries)
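The MathWorks documentation marks histc as not recommended in newer MATLAB releases and points to histcounts instead. As a hedged alternative to the histc call above (not part of the original answer), the same counts can be sketched with histcounts. The variable edges is introduced here for illustration. One extra edge is needed because histcounts treats the edges as bin boundaries, and its last bin includes the right edge.

```matlab
% Same sample data as in the question, stored as a cell array
val = {'USA'; 'USA'; 'France'; 'USA'; 'France'};

% 'stable' keeps the countries in order of first appearance
[countries, ~, id] = unique(cellstr(val), 'stable');

% Bin edges 1, 2, ..., K+1 give one bin per ID:
% [1,2), [2,3), ..., [K,K+1]
edges = 1:(max(id) + 1);

% histcounts returns a row vector, so transpose it to match countries
occurrences = histcounts(id, edges).';

% occurrences is [3; 2], in the same order as countries (USA, France)
disp(occurrences);
```

This assumes a MATLAB release that ships histcounts. On older releases the histc version above remains the option to use.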


Divakar

This will give you the number of occurrences by using regexp:

unique_countries = unique(regexp(val,'^.*$','lineanchors','match','dotexceptnewline'));

count_unique_countries = zeros(size(unique_countries));
for ii = 1:numel(unique_countries)
    count_unique_countries(ii) = numel(regexp(val,['^' unique_countries{ii} '$'],'lineanchors'));
end

The two output variables are now

unique_countries = 
'France'    'USA'
count_unique_countries =
 1     2
ivan
  • This only finds one country. The question is asking to find all countries. – rayryeng Dec 03 '14 at 21:09
  • To make this work for all of the countries, perhaps find all of the unique countries in the list, then use `regexp` and loop over each unique country, but the efficiency of that is questionable compared to `unique` with `hist/accumarray`. – rayryeng Dec 03 '14 at 21:39
  • Thanks for the comment. I edited the solution to find the number of occurrences for all countries. As you mention, it first finds the unique countries with a combination of unique and regexp. – ivan Dec 03 '14 at 21:44

If you have the Statistics Toolbox, you can also do the following:

valnom = nominal(val);
countries = getlabels(valnom);
occurrences = levelcounts(valnom);
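In more recent MATLAB releases the documentation recommends categorical arrays over nominal arrays, and categorical is part of base MATLAB rather than a toolbox. The following is a sketch of the same idea using categorical, categories and countcats. It is an alternative to the nominal code above, not the original answer's method, and the variable name valcat is introduced here for illustration.

```matlab
% Same sample data as in the question, stored as a cell array
val = {'USA'; 'USA'; 'France'; 'USA'; 'France'};

% Convert the strings to a categorical array
valcat = categorical(val);

% Category names, sorted alphabetically by default
countries = categories(valcat);

% Number of elements in each category, in the same order as countries
occurrences = countcats(valcat);

% Print each country next to its count
for k = 1:numel(countries)
    fprintf('%s: %d\n', countries{k}, occurrences(k));
end
```

As in the other answers, pie(occurrences, countries) can then be used to draw the chart.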
Jim Quirk