Obviously you'll have to break that chunk of text up into words.
Then you'll need to count the occurrences of each (unique) word.
What is a "word"? Well, most straightforwardly, it's the characters between spaces.
You mention that you want to ignore punctuation.
Also, you probably want to ignore letter case: "Hello" is the same word as "hello".
Step by step:
- Convert the entire string to lowercase
let lowerText = sampleText.toLowerCase()
- Remove punctuation from the string
This is easiest to do with a regular expression. This one replaces every character that's not an ASCII letter, digit, or dash with a space (that includes newlines and tabs). The string is already lowercase at this point, so matching a-z is enough.
let stringWithoutPunct = lowerText.replace(/[^a-z0-9-]/g, ' ')
- Split that chunk of text into separate words
let rawWords = stringWithoutPunct.split(' ')
Note that this will produce some "words" that are the empty string wherever the string has two consecutive spaces, or a space at the very start or end. We'll make sure to ignore those items in subsequent steps.
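To see where those empty strings come from, here is a small sketch of the first three steps run on a made-up sentence. The names demoText, demoWords, and demoNonEmpty are only for illustration and aren't part of your code.

```typescript
// Hypothetical sample input, chosen so that punctuation sits next to spaces.
const demoText = "Hello, world! Hello again."

// Lowercase, replace punctuation with spaces, then split on single spaces.
const demoWords = demoText
  .toLowerCase()
  .replace(/[^a-z0-9-]/g, ' ')
  .split(' ')

console.log(demoWords)
// [ 'hello', '', 'world', '', 'hello', 'again', '' ]

// Dropping the empty strings leaves only the real words.
const demoNonEmpty = demoWords.filter((w) => w !== '')
console.log(demoNonEmpty)
// [ 'hello', 'world', 'hello', 'again' ]
```

Each ", " and "! " turns into two spaces, and the final "." turns into a trailing space, which is why three empty strings show up.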
- Produce a list of unique words
let uniqueWords: Array<string> = []
for(let word of rawWords) {
  // if this word is the empty string, ignore it
  if(word === '') continue
  // if this word is already on the list, ignore it
  if(uniqueWords.includes(word)) continue
  // otherwise, add this word to the list
  uniqueWords.push(word)
}
- Count the occurrences of each word
We'll convert the list of unique words into a dictionary/hash whose keys are the words and whose values are the counts.
let countedWords: Record<string, number> = {}
for(let word of uniqueWords) {
  let count = 0
  // loop through the list of raw words, counting occurrences of this word
  for(let rawWord of rawWords) {
    if(rawWord === word) count += 1
  }
  // now store this word+count pair in the dictionary
  countedWords[word] = count
}
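The steps above scan the whole word list once per unique word, which is fine for a paragraph but gets slow on long texts. As an alternative (not the approach walked through above), here is a minimal sketch that does the same cleanup and then counts in a single pass using a Map. The function name countWords and the variable sampleCounts are hypothetical, picked just for this example.

```typescript
// countWords is a hypothetical helper name, not something from your code.
function countWords(text: string): Map<string, number> {
  const counts = new Map<string, number>()

  // Same cleanup as the steps above: lowercase, strip punctuation, split.
  const words = text
    .toLowerCase()
    .replace(/[^a-z0-9-]/g, ' ')
    .split(' ')

  for (const word of words) {
    // skip the empty strings produced by consecutive spaces
    if (word === '') continue
    // add 1 to the running count, starting from 0 for a new word
    counts.set(word, (counts.get(word) ?? 0) + 1)
  }
  return counts
}

const sampleCounts = countWords("Hello, world! Hello again.")
console.log(sampleCounts.get("hello")) // 2
console.log(sampleCounts.get("world")) // 1
```

If you'd rather end up with a plain object like countedWords, Object.fromEntries(sampleCounts) should convert the Map for you.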