4

I have a perfect balanced binary tree:

          1
        /   \
       2     5
      / \   / \
     3   4 6   7

A simple node structure with DF traversal:

function TreeNode(val){
  this.val = val
  this.left = this.right = null
  this.addLeft = function(node){
    this.left = node
  }
  this.addRight = function(node){
    this.right = node
  }
}

function visit(node){
  console.log(node.val)
}

function traverse(node){
  if (node == null) return

  visit(node)

  if(node.left){
    traverse(node.left)
  }
  if(node.right){
    traverse(node.right)
  }
}

I create nodes to represent a tree structure:

let rootNode = new TreeNode(1)
let node_A = new TreeNode(2)
let node_B = new TreeNode(5)
let node_C = new TreeNode(3)
let node_D = new TreeNode(4)
let node_E = new TreeNode(6)
let node_F = new TreeNode(7)
rootNode.addLeft(node_A)
rootNode.addRight(node_B)
node_A.addLeft(node_C)
node_A.addRight(node_D)
node_B.addLeft(node_E)
node_B.addRight(node_F)

Calling traverse(rootNode) prints out correctly:

1
2
3
4
5
6
7

I understand how recursion works, but am still a little bit confused how JavaScript handles it in the call stack. traverse(rootNode) is put in the call stack first, then it reaches the if condition, inside here it checks that rootNode has a node left, so it continues down the path until it reaches the final node, which is TreeNode(3). The call stack is like this:

|                             |
|                             |
| traverse(TreeNode(3))       |
| traverse(TreeNode(2))       |
| traverse(rootNode)          | 
|_____________________________|        |

TreeNode(3) does not have any node.left or node.right so it returns to the if condition, and goes down to check node.right, the second condition. Then it sees indeed TreeNode(2) has a node.right and traverse to TreeNode(4) and push it to the stack.

Now this part is confusing me. How does JavaScript keep track of calling rootNode.right when traverse(TreeNode(4) is completed? In other words, how does it know to switch to the right branch at the end of left branch? Because the output prints 1 to 7, so the call stack must be:

|                             |
| traverse(TreeNode(7))       |
| traverse(TreeNode(6))       |
| traverse(TreeNode(5))       |
| traverse(TreeNode(4))       |
| traverse(TreeNode(3))       |
| traverse(TreeNode(2))       |
| traverse(rootNode)          | 
|_____________________________|

But I believe the top of the stack will be the first to pop out and return, so the output should be starting from 7, not 1. So my second question is why the console log correctly prints out the result from 1 to 7.

Daniel
  • 437
  • 1
  • 5
  • 14
  • It's all relative to the node being passed into the method at that point in time. The method does not know any other state. It knows that input, and the method it came from. When node 4 finishes, it exits back to node 2, because that's who called it, which then exists back to node 1, because that's who called it. It then starts checking the right node of node 1. – Taplar Aug 03 '18 at 18:42
  • You could see this visually if you added `console.log('I have returned to node '+ node.val);` immediately after each of your traversal method calls. http://jsfiddle.net/0zy3fq92/1/ – Taplar Aug 03 '18 at 18:48
  • Javascript cues up all function calls first in the order it reaches them and then evaluates them in backwards order, kind of like digging a hole and then filling it back in. So looking at your diagram of the tree, it makes sense that it would go from 1, to 2, then 3 and four, THEN the original function would shift over to 5, and so on. – Devin Fields Aug 03 '18 at 18:48
  • The console.log happens before that of a child node, however the recursive function calls on child nodes are completed in reverse order. If you move the console.log to the end of the function you would get your reversed sequence. – James Aug 03 '18 at 18:48
  • @Taplar I added that line inside the `traverse` function even before the base return condition, and it still prints out correctly from `1` to `7`. – Daniel Aug 03 '18 at 18:52
  • Right, that doesn't change the logic. It's just creating a visual of which instance of the method call it is returning to, by showing you the value – Taplar Aug 03 '18 at 18:53
  • @James when putting the `console.log` to the end of the `traverse` function, I get `3 4 2 6 7 5 1` – Daniel Aug 03 '18 at 18:54
  • While reading on this topic, I understand that JS call stack remembers where the recursive functions is called, and after it returns, it will go back to that environment. It kind of explains how the call stack correctly switches to the right branch when it finishes traversing the left branch. – Daniel Aug 03 '18 at 18:59
  • But it's still confusing me how it gets the order correct. – Daniel Aug 03 '18 at 19:07

1 Answers1

3

Well, your explanation leaves some important steps out, and I think that's allowing you to reach some invalid conclusions. In particular, the stack never looks like

traverse(TreeNode(7))       
traverse(TreeNode(6))       
traverse(TreeNode(5))       
traverse(TreeNode(4))       
traverse(TreeNode(3))       
traverse(TreeNode(2))       
traverse(rootNode)          

For the record, none of this is javascript-specific; so really I think this is about improving how well you understand recursion.

Let's step through your traverse(rootNode) call in more detail. So of course the call stack starts out like

0: traverse ( node = rootNode ) {
|  =>  if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|      }
|      if(node.right){
|          traverse(node.right)
|_     }

Here I'm using notation that makes a few things more explicit:

  • I put a number before the call; this way, for example, I could refer to the above call as "call 0 on the stack"
  • I show the parameter assignments of each call
  • I show the code of the function behind each call, with an arrow => indicating current progress through the code for this particular call. Here we're about to execute the first instruction.

Following the convention you appear to be using, the most recently-pushed (next-to-pop) item will be shown at the top.

So now since node is rootNode and that's not null, it doesn't just immediately return. Before any recursion starts, it calls visit() on rootNode, which goes ahead and prints 1. This is one of the things that seems missing from your explanation - nodes are visited (and their values printed) before any recursion, so they're being printed as you traverse, as you first reach each node.

Then it checks the left and finds a truthy value (in this case, another node object), so it calls traverse(node.left) and we end up with a stack like

1: traverse ( node = node_A )
|  =>  if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|      }
|      if(node.right){
|          traverse(node.right)
|_     }

0: traverse ( node = rootNode ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|  =>  }
|      if(node.right){
|          traverse(node.right)
|_     }

So again we start executing traverse() from the beginning with our new node value, which is not null so we go ahead and visit(node) - which, since node is node_A, prints 2. Then it checks left, which is truthy (node_C) and we get

2: traverse ( node = node_C )
|  =>  if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|      }
|      if(node.right){
|          traverse(node.right)
|_     }

1: traverse ( node = node_A ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|  =>  }
|      if(node.right){
|          traverse(node.right)
|_     }

0: traverse ( node = rootNode ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|  =>  }
|      if(node.right){
|          traverse(node.right)
|_     }

node_C isn't null, so it gets visit()ed and this prints 3. Now we check left, but it's falsey (undefined). So we check right and it, too, is falsey (undefined). So we return and now the stack says

1: traverse ( node = node_A ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|  =>  }
|      if(node.right){
|          traverse(node.right)
|_     }

0: traverse ( node = rootNode ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|  =>  }
|      if(node.right){
|          traverse(node.right)
|_     }

Now we're already midway through the traverse() code for call 1 on the stack, so we pick up where we left off (at the =>). So now we check right, which is truthy and we get

2: traverse ( node = node_D )
|  =>  if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|      }
|      if(node.right){
|          traverse(node.right)
|_     }

1: traverse ( node = node_A ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|      }
|      if(node.right){
|          traverse(node.right)
|_ =>  }

0: traverse ( node = rootNode ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|  =>  }
|      if(node.right){
|          traverse(node.right)
|_     }

Now visit will print 4 (because node is node_D, and both left and right are falsey, so we return to

1: traverse ( node = node_A ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|      }
|      if(node.right){
|          traverse(node.right)
|_ =>  }

0: traverse ( node = rootNode ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|  =>  }
|      if(node.right){
|          traverse(node.right)
|_     }

But call 1 on the stack has reached the end of it's code, so it returns and we go to

0: traverse ( node = rootNode ) {
|      if (node == null) return
|      visit(node)
|
|      if(node.left){
|          traverse(node.left)
|  =>  }
|      if(node.right){
|          traverse(node.right)
|_     }

Call 0 picks up where it left off, which means it's about to check right (which, I think, answers your original question). Execution proceeds down the right branch in exactly the same way it proceeded down the left, with the stack at various times containing calls for rootNode, node_B, then rootNode, node_B, node_E, then rootNode, node_B again (but with the code partially executed), then rootNode, node_B, node_F, then one last time rootNode, node_B (with the code about to finish), then back to rootNode, and then the initial call to traverse finally returns.

Mark Adelsberger
  • 42,148
  • 4
  • 35
  • 52
  • Thank you for your detailed explanation. So each `traverse()` function calls `visit(node)`, which returns the node value out to the console. Since `visit()` completes and returns a value, it will pop out of the stack and the next `traverse()` function will be put in, based on the condition. `traverse(3)` pops out, back to `traverse(2)`, which will call `traverse(4)`. Then `traverse(4)` pops out, back to `traverse(2)`. This function now completes and returns, so back to `traverse(rootNode)`. Then this function will start with the `if(node.right)` line and repeat with the right branch. – Daniel Aug 03 '18 at 21:29