Find how many times each number occurs in list

Question

If we had a list A holding (1 2 1 1 2 3 3 4 4 4), how could we get a new list B with ((1 . 30) (2 . 20) (3 . 20) (4 . 30)) in it, such that the number_after_dot is the percentage of the number_before_dot in the list A.

For example, 1 is 30% of list A, 2 is 20% of list A, etc.

(1 . 30) is a pair, which could be made by (cons 1 30)

I don't understand your specification of what the code is supposed to do. How do you compute the numbers in the output list? Your functions appear to do what they're supposed to. — Jeremiah Willcock, Jan 25 '11 at 04:21
I'm not sure that I understand you, but the aim of the task is to get all unique numbers in list A with their frequency in this list. I could write the function '(member x l)' but I don't know how to get the unique numbers and their count in list A — bpavlov, Jan 25 '11 at 04:27
Then do not write it as `(1.30)` as it is clearly something else. — leppie, Jan 25 '11 at 04:59

okonomichiyaki · Accepted Answer · 2011-01-26T15:11:54.727

I think what you want to do is calculate the percentage of the list that is equal to each element. You used the word "unique", but that is a bit confusing since your list has no unique elements. This is based on your sample input and output, where the list (1 2 1 1 2 3 3 4 4 4) is composed of "30% ones".

You can break this down roughly into a recursive algorithm consisting of these steps:

If the input list is empty, return the empty list.
Otherwise, get the first element. Calculate how many times it occurs in the list.
Calculate the percentage, and cons the element with this percentage.
Remove all the occurrences of the first item from the cdr of the list.
Recurse on this new list, and cons up a list of (element . percentage) pairs.

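To count how many times the first element occurs, let's use filter. (eq? works for small integers in most implementations, but eqv? or = is the portable choice for comparing numbers.)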

> (filter (lambda (x) (eq? (car A) x)) A)
(1 1 1)

With your list A, this will return the list (1 1 1). We can then use length to get the number of times it occurs:

> (length (filter (lambda (x) (eq? (car A) x)) A))
3

To calculate the percentage, divide by the number of elements in the whole list, (length A), and multiply by 100:

> (* 100 (/ (length (filter (lambda (x) (eq? (car A) x)) A)) (length A)))
30

It's easy to cons this with the element (car A) to get the pair for the final list.
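As a small sketch of that step, written for Racket: the helper name pair-for-first is made up for illustration and is not part of the answer, and = is used instead of eq? on the assumption that the elements are numbers.

```racket
#lang racket

(define A '(1 2 1 1 2 3 3 4 4 4))

;; pair-for-first is a hypothetical helper: it builds the
;; (element . percentage) pair for the first element of a non-empty list.
(define (pair-for-first lst)
  (let* ((x (car lst))
         ;; n is how many elements of lst are numerically equal to x
         (n (length (filter (lambda (y) (= x y)) lst))))
    (cons x (* 100 (/ n (length lst))))))

(pair-for-first A) ; => '(1 . 30)
```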

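To remove the occurrences of the first element, we can use remove, which is the inverse of filter: it returns a list of all elements of the original list which do not satisfy the predicate function. (This is the SRFI 1 style remove, which takes a predicate. Racket's built-in remove takes a value instead and removes only its first occurrence; the predicate-based equivalent there is filter-not.)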

> (remove (lambda (x) (eq? (car A) x)) A)
(2 2 3 3 4 4 4)

This is the list we want to recurse on. Note that at each step, you need to have the original list (or the length of the original list) and this new list. So you would need to somehow make this available to the recursive procedure, either by having an extra argument, or defining an internal definition.
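In Racket specifically, the matching and non-matching elements can also be obtained in a single pass with partition from racket/list. This is only a sketch of an alternative, not what the answer uses; the names same and different are chosen for illustration.

```racket
#lang racket

(define A '(1 2 1 1 2 3 3 4 4 4))

;; partition returns two values: the elements that satisfy the
;; predicate and the elements that do not, each in original order.
(define-values (same different)
  (partition (lambda (x) (= (car A) x)) A))

same      ; => '(1 1 1)
different ; => '(2 2 3 3 4 4 4)
```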

There are probably more efficient approaches, or simply different ones, but this is the solution I came up with when I read the question. Hope it helps! Here is the full procedure:

(define (percentages all)
  (let ((len (length all))) ; pre-calculate the length
    ;; this is an internal definition which is called at ***
    (define (p rest)
      (if (null? rest)
          rest
          ;; equal-to is a list of all the elements equal to the first
          ;; ie something like (1 1 1)
          (let ((equal-to (filter (lambda (x) (eq? (car rest) x))
                                  rest))
                ;; not-equal-to is the rest of the list
                ;; ie something like (2 2 3 3 4 4 4)
                (not-equal-to (remove (lambda (x) (eq? (car rest) x))
                                      rest)))
            (cons (cons (car rest) (* 100 (/ (length equal-to) len)))
                  ;; recurse on the rest of the list
                  (p not-equal-to)))))
    (p all))) ; ***
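The other option mentioned, passing the original length as an extra argument, might look like the sketch below. It assumes Racket, so it uses filter-not in place of the SRFI 1 style remove and eqv? in place of eq?; the name percentages/arg is hypothetical.

```racket
#lang racket

;; percentages/arg is a hypothetical name, not from the answer.
;; total is the length of the original list, threaded through each call.
(define (percentages/arg rest total)
  (if (null? rest)
      '()
      (let* ((x (car rest))
             (same? (lambda (y) (eqv? x y)))
             (equal-to (filter same? rest))          ; e.g. (1 1 1)
             (not-equal-to (filter-not same? rest))) ; e.g. (2 2 3 3 4 4 4)
        (cons (cons x (* 100 (/ (length equal-to) total)))
              (percentages/arg not-equal-to total)))))

(define A '(1 2 1 1 2 3 3 4 4 4))
(percentages/arg A (length A))
;; => '((1 . 30) (2 . 20) (3 . 20) (4 . 30))
```

The caller has to supply (length A) on the first call, which is the price of not using an internal definition.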

Maybe you're right! The word unique is not very suitable for this. By unique elements in list A, I mean the elements without their repeats. Yesterday I learned that we could use a function remove-repeats. That function takes the list '(1 1 2 3 3 3 4 4 4 4) and outputs '(1 2 3 4). This makes the problem a little easier, but I still don't know how to do this. I'm going to read your full post later today. Thank you! P.S. This is not homework. This is a problem that could be on my exam. — bpavlov, Jan 26 '11 at 13:56
Ok, sorry for assuming. I've edited my answer to include the solution I hacked together, I hope it's clear! — okonomichiyaki, Jan 26 '11 at 15:12

score 2 · Answer 2 · answered Mar 26 '11 at 14:15

The question formulation is very close to the idea of run-length encoding. In terms of run-length encoding, you can use a simple strategy:

Sort.
Run-length encode.
Scale the run lengths to get percentages.

You can implement run-length encoding like this:

(define (run-length-encode lst)
  (define (rle val-lst cur-val cur-cnt acc)
    (if (pair? val-lst)
        (let ((new-val (car val-lst)))
          (if (eq? new-val cur-val)
              (rle (cdr val-lst) cur-val (+ cur-cnt 1) acc)
              (rle (cdr val-lst) new-val 1 (cons (cons cur-val cur-cnt) acc))))
        (cons (cons cur-val cur-cnt) acc)))
  (if (pair? lst)
      (reverse (rle (cdr lst) (car lst) 1 '()))
      '()))

and scaling looks like:

(define (scale-cdr count-list total-count)
  (define (normalize pr)
    (cons (car pr) (/ (* 100 (cdr pr)) total-count)))
  (map normalize count-list))

Now we need something to sort a list. I'll just use the sort function in Racket (adapt as needed). The function to calculate the percentages for each number in the list is then:

(define (elem-percent lst)
  (scale-cdr (run-length-encode (sort lst <)) (length lst)))

Some examples of use:

> (elem-percent '())
'()
> (elem-percent (list 1 2 3 4 5))
'((1 . 20) (2 . 20) (3 . 20) (4 . 20) (5 . 20))
> (elem-percent (list 1 2 1 1))
'((1 . 75) (2 . 25))
> (elem-percent (list 1 2 1 1 2 3 3 4 4 4))
'((1 . 30) (2 . 20) (3 . 20) (4 . 30))
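The examples above all divide evenly. Because the scaling uses exact division, counts that do not divide evenly give exact rationals in Racket rather than decimals. The sketch below shows this with a hypothetical helper percent-of that mirrors the arithmetic in normalize, along with two ways one might convert the result.

```racket
#lang racket

;; percent-of is a hypothetical helper mirroring (/ (* 100 count) total).
(define (percent-of count total)
  (/ (* 100 count) total))

(percent-of 1 3)                  ; => 100/3 (an exact rational)
(round (percent-of 1 3))          ; => 33
(round (percent-of 2 3))          ; => 67
(exact->inexact (percent-of 1 4)) ; => 25.0
```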
