0

So I have written a function to sort a list of Unordered pairs by their 'value' (the second part of the pair). Here is my attempt at a recursive function (I know it has a rudimentary design):

*Edit function design: the function works by:

  1. Taking a list of unordered pairs, a list of sorted frequencies, and an optional startList for recursive calls. It first sets listRet equal to the startList.

  2. If the unordered list has only one element, that element is pushed to listRet.

  3. If the list is larger than 1 each pair of the unordred list is looped through and checked if it equals the first element of the ordered frequency list.

  4. If it is not the last element, it is pushed to the listRet.

  5. The loop then continues until it hits the last element and then the function is called recursively with the pushed pair removed from the unordered list since it has been properly placed into the listRet and the top most frequency. The listRet is put in place as the optional startList parameter.

  6. Now in the case when the unordered list has more than one element and the last element is the correct frequency to sort, I opt to move that element to the front of the list and make a recursive call.

  7. The first element of the unordered list can now be pulled off and then I abuse the do loop to reach the end of the list and again make a recursive function call as in step 5.

  8. First If statement exits after unordered list is length one (as in step 2) and the listRet should be returned.

Function:

(defun sort-by-freq (listUnord listOrdFreqs &optional startList)
  (print startList)
  (let ((listRet startList)) ;; step 1
     (if (equal (length listUnord) 1) ;;step 2
         ;;(print listRet)
         (push (car listUnord) listRet)
         (loop for pair in listUnord ;;step 3
           do (if (and (equal (cdr pair) (car listOrdFreqs)) 
                       (not (equal (last-element-by-pair listUnord)(car pair))))
                  (push pair listRet) ;;step 4
                  (if (and (equal (cdr pair) (car listOrdFreqs))
                           (equal (last-element-by-pair listUnord)(car pair))) 
                      (sort-by-freq (append (list (cons (car pair) (cdr pair)))
                                            (remove-element listUnord pair)) 
                                    listOrdFreqs listRet) ;;step 6
                     (if (equal (last-element-by-pair listUnord)(car pair))
                         (sort-by-freq (remove-element listUnord (car listRet))
                                       (cdr listOrdFreqs)
                                       listRet)))))) ;; step 5
  listRet)) ;;step 8

So if I call:

(sort-by-freq (list(cons 'c 2)(cons 'b 3)(cons 'a 1)) '(3 2 1))

I expect the result:

((A . 1) (C . 2) (B . 3)) 

But for some reason, in the return I am getting only:

((B . 3)) 

Using the (print startList) statement I can confirm that the startList is building as I hoped. It outputs:

NIL 
((B . 3)) 
((B . 3)) 
((C . 2) (B . 3)) 

And I have confirmed via the commented out ;;(print retList) that the exit condition is being reached after ((C . 2) (B . 3)) is output. (push (car listUnord) listRet) should be pushing the third element (A . 1) to the front of the list and returning listRet. This is consistent with how I have designed other functions with outputs and it has worked.

What am I missing?

*Edit here are the two helper functions I use:

(defun remove-element (origPairList origPair)
  (let ((retlist (list)))
    (loop for pair in origPairList
      do (if (not(equal(car origPair)(car pair)))
           (push pair retlist)))
  retlist))

(defun last-element-by-pair (pairList)
  (let ((lastEl (car (car pairList))))
    (if (not (null (cdr pairList)))
      (loop for el in pairList
         do (setf lastEl (car el))))
  lastEl))
EliSquared
  • 1,409
  • 5
  • 20
  • 44
  • *compilation unit finished. Undefined functions:* `LAST-ELEMENT-BY-PAIR` `REMOVE-ELEMENT`. Can you please edit your question and add missing functions? And if you can also reformat your code to something not so wide - it would greatly improve readability - that would be great :) – rsm Aug 15 '18 at 02:39
  • @rsm I have formatted the function and I hope it is more readable. I have added in the helper methods as well, but I don't think that it makes much sense that is the problem since I know the function is hitting its exit condition. – EliSquared Aug 15 '18 at 03:48
  • 2
    I am trying to split the function you gave into smaller chunks but some parts have their parentheses off: you are calling cons with 3 arguments, etc. to the point that I am not even sure if some expressions belong where they are supposed to (in particular, "if" expressions). Could you check how balanced the parentheses are and maybe annotate with comments that explain the "why" of each part? Thanks. – coredump Aug 15 '18 at 08:04
  • The recursive calls aren't in a tail position and you're not doing anything with their return value. – jkiiski Aug 15 '18 at 08:07
  • 1
    @coredump I have formatted my code and added in the steps. – EliSquared Aug 16 '18 at 04:27
  • 1
    this is not how you attack writing a function, any function. don't think what to do; start by explaining -- to yourself -- what you have, and what you want to have as a result of the function having done its work. I don't see this in your question. Why do you even need the second argument, not to mention the third? – Will Ness Aug 16 '18 at 13:13

3 Answers3

2

Some hints...

Simplify things:

(defun remove-element (origPairList origPair)
  (let ((retlist (list)))
    (loop for pair in origPairList
      do (if (not(equal(car origPair)(car pair)))
           (push pair retlist)))
  retlist))

to

(defun remove-element (list pair)
  (remove (car pair) list :key #'car))

and

(defun last-element-by-pair (pairList)
  (let ((lastEl (car (car pairList))))
    (if (not (null (cdr pairList)))
      (loop for el in pairList
         do (setf lastEl (car el))))
  lastEl))

to

(defun last-element-by-pair (pair-list)
  (caar (last pair-list)))

Awful car/cdr mess. Use destructuring in LOOP.

(loop for pair in foo ... (car pair) ... (cdr pair) ... (car pair) ...)

is

(loop for (head . tail) in foo ... head ... tail ... head ...)

Don't walk lists all the time

Then

(if (equal (length foo) 1) ...)

is

(if (and (consp foo) (rest foo)) ...)  ; no need to traverse the list

Further Problems:

You also need to make sure that your code is correctly indented. Typically a keystroke in an editor to do that. Also your code was missing closing parentheses. The code is thus syntactically not correct.

(defun sort-by-freq (listUnord listOrdFreqs &optional startList)
  (print startList)
  (let ((listRet startList))
    (if (equal (length listUnord) 1)
        (push (car listUnord) listRet) ;; <- this makes no sense.
                                       ;; since you quit the function
                                       ;; there is no useful effect
      (loop for pair in listUnord
            do (if (and (equal (cdr pair) (car listOrdFreqs)) 
                        (not (equal (last-element-by-pair listUnord)(car pair) ))) 
                   (push pair listRet)
                 (if (and (equal (cdr pair) (car listOrdFreqs)) 
                          (equal (last-element-by-pair listUnord)(car pair)))
                     ;; this call to sort-by-freq makes no sense.
                     ;; you are not using the return value 
                     (sort-by-freq ;; what are you appending here, only one list?
                                   (append (list (cons (car pair) (cdr pair))
                                                 (remove-element listUnord pair)))
                                   listOrdFreqs
                                   listRet)
                   (if (equal (last-element-by-pair listUnord)(car pair))
                       ;; this call to sort-by-freq makes no sense.
                       ;; you are not using the return value 
                       (sort-by-freq (remove-element listUnord (car listRet))
                                     (cdr listOrdFreqs)
                                     listRet))))))
    listRet))

Basically in

(loop for e in list
      do (compute-some-thing-and-return-it e))

it makes no sense to call the function, since the return value is not used. The only reason to call the function would be if it had side-effects.

Example:

CL-USER 310 > (loop for e in '(1 2 3 4)
                    do (if (evenp e)
                           (* e 10)
                         (* e 100)))
NIL

As you see, it returns NIL. Probably not what you want.

Rainer Joswig
  • 136,269
  • 10
  • 221
  • 346
  • But if I do `(print (push (car listUnord) listRet))` at the part of my exit condition it executes and prints `( (A . 1) (C . 2) (B . 3) )`. I don't get why listRet changes to `(B . 3)`. I also don't understand why you are saying my recursive calls make no sense when my '(print startList)' shows that the function is working as intended and the list is being augmented as expected. It is poorly designed, but I am intentionally only using the most basic functions since that is what I have learned. In my helper functions I use similar syntax structure and have no problem with the output. – EliSquared Aug 16 '18 at 03:54
  • @EliSquared - see your question: the call is a problem, because you don't use the return value. If you don't use the returned value of a function, it is like you never called them. You can delete the call and the effect is the same. – Rainer Joswig Aug 16 '18 at 06:35
  • @EliSquared : `(push (car listUnord) listRet))` changes `listRet`. But you later never use that variable. So, why change it? See CONS vs PUSH. – Rainer Joswig Aug 16 '18 at 06:36
  • @EliSquared : I would propose to practice recursive functions with a few simpler examples, before you try this one. – Rainer Joswig Aug 16 '18 at 06:37
2

Main problem

You are evaluating expressions inside a loop, in a do clause. The returned value is never going to be used anywhere.

The function's return value is given by the local variable listRet, which is set at the beginning of the function and affected at exactly 2 places, the two calls to push. The first one happens only for an input list of size 1. The second push happen only at step 4.

We can easily see that all other operations have no effect whatsoever on the local listRet variable. Also, because sort-by-freq as well as your auxiliary functions are pure (they never destroy parts of the list structure pointed by listRet), you know also that the list is not going to be modified by having its cons cells linked differently over time.

Let's confirm this by tracing your code with your example:

(trace sort-by-freq last-element-by-pair remove-element)

When you evaluate your test, the following trace is emitted (output varies among implementations, here it is using SBCL):

0: (SORT-BY-FREQ ((C . 2) (B . 3) (A . 1)) (3 2 1))
  1: (LAST-ELEMENT-BY-PAIR ((C . 2) (B . 3) (A . 1)))
  1: LAST-ELEMENT-BY-PAIR returned A
  1: (LAST-ELEMENT-BY-PAIR ((C . 2) (B . 3) (A . 1)))
  1: LAST-ELEMENT-BY-PAIR returned A
  1: (LAST-ELEMENT-BY-PAIR ((C . 2) (B . 3) (A . 1)))
  1: LAST-ELEMENT-BY-PAIR returned A
  1: (REMOVE-ELEMENT ((C . 2) (B . 3) (A . 1)) (B . 3))
  1: REMOVE-ELEMENT returned ((A . 1) (C . 2))
  1: (SORT-BY-FREQ ((A . 1) (C . 2)) (2 1) ((B . 3)))

    2: (LAST-ELEMENT-BY-PAIR ((A . 1) (C . 2)))
    2: LAST-ELEMENT-BY-PAIR returned C
    2: (LAST-ELEMENT-BY-PAIR ((A . 1) (C . 2)))
    2: LAST-ELEMENT-BY-PAIR returned C
    2: (LAST-ELEMENT-BY-PAIR ((A . 1) (C . 2)))
    2: LAST-ELEMENT-BY-PAIR returned C
    2: (REMOVE-ELEMENT ((A . 1) (C . 2)) (C . 2))
    2: REMOVE-ELEMENT returned ((A . 1))
    2: (SORT-BY-FREQ ((C . 2) (A . 1)) (2 1) ((B . 3)))

      3: (LAST-ELEMENT-BY-PAIR ((C . 2) (A . 1)))
      3: LAST-ELEMENT-BY-PAIR returned A
      3: (LAST-ELEMENT-BY-PAIR ((C . 2) (A . 1)))
      3: LAST-ELEMENT-BY-PAIR returned A
      3: (REMOVE-ELEMENT ((C . 2) (A . 1)) (C . 2))
      3: REMOVE-ELEMENT returned ((A . 1))
      3: (SORT-BY-FREQ ((A . 1)) (1) ((C . 2) (B . 3)))

      3: SORT-BY-FREQ returned ((A . 1) (C . 2) (B . 3))
    2: SORT-BY-FREQ returned ((C . 2) (B . 3))
  1: SORT-BY-FREQ returned ((B . 3))
0: SORT-BY-FREQ returned ((B . 3))

A minor point that can be seen right there is that last-element-by-pair is called many times with the same input, which is wasteful. It could be called once before looping.

At level 0, the function iterates over pairs until it finds (B . 3), whose frequency equals the first frequency in the second list. This is step 4, and the pair is pushed in front of the list designated by the local variable listRet in that invocation of sort-by-freq.

When the loop reaches the last element of the unordered list, the functions is called recursively with (i) a new list of unordered pairs, computed using remove-element, (ii) one less frequency, and (iii) the current list bound to listRet.

But whatever happens during the recursive steps, and in particular no matter what result the recursive calls yield, the binding currently in scope for listRet is not going to be modified anymore. Also, the structure of the list currently pointed by listRet is not modified (e.g. using nconc or rplacd).

All the work that is done at level 2 and below is about pushing values in front of temporary variables locally named listRet, and then discarding them.

What am I missing?

The data flow from recursive invocations of sort-by-freq to the current invocation of that function is broken, you have to express the current result in terms of recursive results, OR you need to mutate things (but this is discouraged).

Naming and user interfaces

Taking a list of unordered pairs, a list of sorted frequencies, and an optional startList for recursive calls. It first sets listRet equal to the startList.

Based on your specification, I would define the function as follows:

(defun sort-pairs-by-frequencies (pairs frequencies)
  ...)

Whether or not there need to be additional parameters for recursive invocations is not a concern for the user of the function. That kind of details should be left hidden, except if you really want your user to be able to pass a list as a third argument.

Complex corner cases

If the unordered list has only one element, that element is pushed to listRet. [...] If it is not the last element, it is pushed to the listRet.

Functions that recurses over lists typically need only consider two cases: empty and non-empty lists. The fact that you consider more corner cases, like a list of one element or like when you check if an element is the last one, is a huge red flag. There are problems that needs to have complex base conditions, but this one can be done in a simpler way.

Redundant tests

In addition to the many calls to last-element-by-pair, notice also that you repeat tests at different points of your functions, even when they are necessarily true in a given context.

 (if (and (equal (cdr pair) (car listOrdFreqs))
          (not (equal (last-element-by-pair listUnord)(car pair))))
     (push pair listRet) ;;step 4
     (if (and (equal (cdr pair) (car listOrdFreqs))
              (equal (last-element-by-pair listUnord)(car pair)))
         (sort-by-freq (append (list (cons (car pair) (cdr pair)))
                               (remove-element listUnord pair))
                       listOrdFreqs listRet) ;;step 6
         (if (equal (last-element-by-pair listUnord)(car pair))
             (sort-by-freq (remove-element listUnord (car listRet))
                           (cdr listOrdFreqs)
                           listRet))))

Given appropriate definitions, the above could be written:

(if (and A (not B))
   (step-4)
   (if (and A B)
        (step-6)
        (if B (step-5))))

Let's write what conditions are true for each step, based on which branch they belong in the trees of if expressions:

  • For step 4: (and a (not b)) necessarily holds since it is in the "then" branch of the if.

  • For step 5: (and b (not a)), based on the following simplifications over the path predicate:

    (and b
         (not (and a b))
         (not (and a (not b))))
    
    => (and b
            (or (not a) (not b))
            (or (not a) b))
    
    => (and b (or (not a) nil) t)
    => (and b (not a))
    
  • For step 6: (and a b)

Thus, you could write the test as:

(cond
  (a (if b (step-6) (step-4)))
  (b (step-5)))

Simpler version

If the list is larger than 1 each pair of the unordred list is looped through and checked if it equals the first element of the ordered frequency list.

This above is the core of the algorithm. For each frequency F, you partition the unordered elements into two lists: the one where the frequency equals to F, and the other ones.

Let's define just this: how to remove or keep an item based on whether their frequency matches a given frequency. The following is not necessarily efficient, but it works and is simple. Also, it errs on the side of caution by using immutable operations only:

(defun remove-frequency (pairs frequency)
  (remove frequency pairs :test #'= :key #'cdr))

(defun keep-frequency (pairs frequency)
  (remove frequency pairs :test-not #'= :key #'cdr))

(remove-frequency '((a . 20) (b . 10) (c . 5) (d . 20)) 20)
=> ((B . 10) (C . 5))

(keep-frequency '((a . 20) (b . 10) (c . 5) (d . 20)) 20)
=> ((A . 20) (D . 20))

Then, your main function recurses over frequencies:

(defun sort-pairs-by-frequencies (pairs frequencies)
  (if (null frequencies)
      pairs
      (destructuring-bind (frequency . frequencies) frequencies
        (append (keep-frequency pairs frequency)
                (sort-pairs-by-frequencies (remove-frequency pairs frequency)
                                           frequencies)))))

When given no frequencies to sort by, the unsorted list of pairs is returned. Otherwise, frequencies can be destructured as a cons cell, where the first item is a frequency, and the remaining items are frequencies. Note that the previous binding of frequencies is shadowed, because from this point on we do not need to refer to the whole list of frequencies anymore.

Then, the function's return value in the general case is computed by appending pairs that have the given frequency and the list of pairs that did not match frequency, recursively sorted according to the remaining frequencies.

(sort-pairs-by-frequencies '((a . 20) (b . 10) (c . 5) (d . 20))
                           '(5 10 20))
=> ((C . 5) (B . 10) (A . 20) (D . 20))

By evaluating first (trace sort-pairs-by-frequencies), you also obtain the following trace:

 0: (SORT-PAIRS-BY-FREQUENCIES ((A . 20) (B . 10) (C . 5) (D . 20)) (5 10 20))
   1: (SORT-PAIRS-BY-FREQUENCIES ((A . 20) (B . 10) (D . 20)) (10 20))
     2: (SORT-PAIRS-BY-FREQUENCIES ((A . 20) (D . 20)) (20))
       3: (SORT-PAIRS-BY-FREQUENCIES NIL NIL)
       3: SORT-PAIRS-BY-FREQUENCIES returned NIL
     2: SORT-PAIRS-BY-FREQUENCIES returned ((A . 20) (D . 20))
   1: SORT-PAIRS-BY-FREQUENCIES returned ((B . 10) (A . 20) (D . 20))
 0: SORT-PAIRS-BY-FREQUENCIES returned ((C . 5) (B . 10) (A . 20) (D . 20))

The above functions were written so that they could be easily readable and "trivially" correct, but they are still wasting memory and stack. Other approaches are possible to sort elements in place (no memory allocation) using a loop (constant stack usage).

coredump
  • 37,664
  • 5
  • 43
  • 77
0

I figured it out using a less complex solution. While I acknowledge that the other posters here have written more elegant solutions, they are very unrelated to my attempt. I wanted to take on the problem using basic syntax that I understand at this point.

Here is what my new sort-by-frequency function looks like:

(defun sort-by-frequency (pairs frequencies)
  (let ((retList (list)))
        (loop for freq in frequencies
           do (push (extract-pair-match pairs freq (extract-keys retList)) retList))
    retList))

I now use a simple loop to iterate through the list of frequencies and then find a match based on the frequency using the function extract-pair-match, which also takes the keys (extract-keys) from the variable retList so that it can search through and make sure that the same key doesn't appear twice.

Here is the function:

(defun extract-pair-match(pairs match keys)
  (if (and (equal (cdr(car pairs)) match) (not (search-keys keys (car(car pairs)))))
        (car pairs)
        (extract-pair-match (cdr pairs) match keys)))

First it checks the first term of the list pairs to see if its key matches match, it then uses the function search-keys on the keys list (which is passed at the retList in sort-by-frequency to make sure that the key is not already part of the list. If it is it continues to the next term. The assumption made is that each frequency will have one match.

Here is the search-keys function:

(defun search-keys (keys match)
  (if (null keys)
      nil
      (if (equal (car keys) match)
      (car keys)
      (search-keys (cdr keys) match))))

Here is the extract-keys function:

(defun extract-keys (pairList)
  (let ((retlist (list (car (car pairList)))))
       (loop for pair in (cdr pairList)
          do (push (car pair) retlist))
    retlist))

So now if I do:

(sort-by-frequency (list(cons 'c 2)(cons 'b 3)(cons 'a 1)) '(1 2 3))

I get:

((A . 1) (C . 2) (B . 3))
EliSquared
  • 1,409
  • 5
  • 20
  • 44