Counting occurrences of a string in an array (javascript)

Question

I have some code...

var userArray=userIn.match(/(?:[A-Z][a-z]*|\d+|[()])/g);

...that separates the user input of a chemical formula into its components.

For example, entering Cu(NO3)2N3 will yield

Cu , ( , N , O , 3 , ) , 2 , N , 3.
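For reference, the tokenizing step can be checked on its own. This is a minimal sketch: the `tokenize` wrapper name is an assumption for illustration, while the regex is the one from the question.

```javascript
// Hypothetical wrapper around the regex from the question.
function tokenize(userIn) {
  // Matches an element symbol (capital letter plus optional lowercase letters),
  // a run of digits, or a single parenthesis.
  // match() returns null when nothing matches, so fall back to an empty array.
  return userIn.match(/(?:[A-Z][a-z]*|\d+|[()])/g) || [];
}

console.log(tokenize("Cu(NO3)2N3").join(" , "));
// Cu , ( , N , O , 3 , ) , 2 , N , 3
```

Because the digit pattern is `\d+`, a multi-digit quantifier such as the 12 in H12 comes out as a single token.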

In finding the percentage of each element in the entire weight, I need to count how many times each element is entered.

So in the example above,

Cu : 1 , 
N  : 5 , 
O : 6

Any suggestions of how I should go about doing this?

Does the quantifier _always_ come right after the element? Also, is nesting allowed? Are two digit numbers allowed? — Benjamin Gruenbaum, Jun 28 '13 at 22:50
This is much more than just counting occurrences. This is parsing and multiplying. — Barmar, Jun 28 '13 at 22:51
@Barmar Yes, this requires an actual parser - not a particularly hard one though. Tokens are letters, numbers (quantifiers) and brackets. I don't mind giving the OP a good answer on _how_ to implement it but it's not very clear yet. — Benjamin Gruenbaum, Jun 28 '13 at 22:53
Yes, the quantifier will be right after the element, and two digits numbers ARE allowed. So entering H12, will be H, 12 . The only exception would be with parenthesis, where the following number would have to multiply by everything inside the parenthesis. — Rygh2014, Jun 28 '13 at 22:57
@TGH The `g` modifier makes it return all occurrences in an array. — Barmar, Jun 28 '13 at 22:59

Benjamin Gruenbaum · Accepted Answer · 2013-06-28T23:10:23.813

You need to build a parser

There is no simple way around that. You need nesting and memory, which a regular expression can't handle very well (a true regular expression in the CS sense can't handle it at all).

First, you take the result of the regexp you already have. This step is called tokenization.

Now, you have to actually parse that.

I suggest the following approach. I will give you pseudocode because I think it will be better didactically. If you have any questions about it, let me know:

method chemistryExpression(tokens): # tokens is the result of your regex

1. Create an empty map called map

2. While the next token is a letter, consume it (remove it from the tokens)

2.1 Add the letter to the map with occurrence 1, or increment it by one if it's already inside the map

3. If the next token is (, consume it: # Deal with nesting

3.1 Add the occurrences from chemistryExpression(tokens) to the map (note, tokens changed)

3.2 Remove the extra ) you've just encountered

4. num = consume tokens while the next token is a number and convert to int

5. Multiply the occurrences of all tokens in the map by num

6. Return the map
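One way the recursive approach could look in JavaScript is sketched below. It is not a line-by-line translation of the pseudocode: following the asker's comment, a number is applied only to the element or parenthesized group directly before it, not to everything collected so far. The `tokenize` helper name is an assumption, and the sketch assumes well-formed input (balanced parentheses, no leading numbers).

```javascript
// Tokenizer using the regex from the question (wrapper name is hypothetical).
function tokenize(formula) {
  return formula.match(/(?:[A-Z][a-z]*|\d+|[()])/g) || [];
}

// Parses tokens up to a closing ")" or the end of input and returns an
// object mapping element symbol -> count. Consumes the tokens array in place.
function chemistryExpression(tokens) {
  const map = {};
  while (tokens.length > 0 && tokens[0] !== ")") {
    const token = tokens.shift();
    let part;
    if (token === "(") {
      part = chemistryExpression(tokens); // recurse into the group
      tokens.shift();                     // drop the matching ")"
    } else {
      part = { [token]: 1 };              // a single element symbol
    }
    // An optional number applies only to the element or group just read.
    let num = 1;
    if (tokens.length > 0 && /^\d+$/.test(tokens[0])) {
      num = parseInt(tokens.shift(), 10);
    }
    // Merge the scaled counts into this level's map.
    for (const element in part) {
      map[element] = (map[element] || 0) + part[element] * num;
    }
  }
  return map;
}

const counts = chemistryExpression(tokenize("Cu(NO3)2N3"));
console.log(counts.Cu, counts.N, counts.O); // 1 5 6
```

The result matches the counts given in the question (Cu: 1, N: 5, O: 6), and nested groups are handled by the same recursive call.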

Implementation suggestion

The map can just be an object.
- Adding to the map means checking whether the key is there: if it is not, set it to 1; if it is, increment its value by one.
- Multiplying can be done using a for... in loop.
This solution is recursive, which means you're using a function that calls itself (chemistryExpression in this case). This parser is a very basic example of a recursive descent parser and handles nesting well.
Common sense and good practice call for two helper methods:
- peek - what is the next token in the tokens, this is tokens[0]
- next - grab the next token from tokens, this is tokens.shift() (not unshift(), which adds to the front of the array)
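The helpers described above could be sketched as follows. The function names `addToMap` and `multiplyMap` are assumptions made for illustration; `peek` and `next` follow the wording of the answer.

```javascript
// Look at the next token without consuming it.
function peek(tokens) {
  return tokens[0];
}

// Remove and return the next token (shift takes from the front).
function next(tokens) {
  return tokens.shift();
}

// Set the key to 1 if it is new, otherwise increment it by one.
function addToMap(map, key) {
  if (Object.prototype.hasOwnProperty.call(map, key)) {
    map[key] += 1;
  } else {
    map[key] = 1;
  }
}

// Multiply every count in the map by num using a for...in loop.
function multiplyMap(map, num) {
  for (const key in map) {
    map[key] *= num;
  }
}

// Usage: the inside of the group (OH)2, followed by its quantifier.
const demoTokens = ["O", "H", "2"];
const demoMap = {};
addToMap(demoMap, next(demoTokens)); // demoMap is { O: 1 }
addToMap(demoMap, next(demoTokens)); // demoMap is { O: 1, H: 1 }
multiplyMap(demoMap, parseInt(next(demoTokens), 10));
console.log(demoMap.O, demoMap.H); // 2 2
```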

Thanks! I think I understand for the most part, so I'll get to work! — Rygh2014, Jun 28 '13 at 23:15

score 0 · Answer 2 · answered Jun 28 '13 at 23:01

For each value in userArray, check whether there is a next element and whether that next element is a number. If so, add this number to the count of the current element type; otherwise add 1. You can use an object as a map to store a count for each distinct element type:

var map = { }
map[userArray[/*an element*/]] = ...

EDIT: if numbers can be longer than one digit, loop while the next token is a number, concatenate the digits into a string, and call parseInt() on it.
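A rough sketch of this flat counting approach is shown below. The function name `countFlat` is an assumption, and the sketch deliberately ignores parentheses, so it only gives correct totals for formulas without groups. With the regex from the question, a multi-digit number already arrives as one token, so no concatenation loop is needed here.

```javascript
// Counts elements in a flat token list; parentheses are skipped, not expanded.
function countFlat(userArray) {
  const map = {};
  for (let i = 0; i < userArray.length; i++) {
    const token = userArray[i];
    // Only element symbols start a count; skip numbers and parentheses.
    if (!/^[A-Z][a-z]*$/.test(token)) continue;
    const following = userArray[i + 1];
    // Use the following number if there is one, otherwise count 1.
    const isNumber = following !== undefined && /^\d+$/.test(following);
    const count = isNumber ? parseInt(following, 10) : 1;
    map[token] = (map[token] || 0) + count;
  }
  return map;
}

const flat = countFlat(["C", "6", "H", "12", "O", "6"]);
console.log(flat.C, flat.H, flat.O); // 6 12 6
```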
