Spilling a symbol doesn't improve colorability

Question

Say I have this intermediate representation of some code:

t1 = 1
t2 = 2
t3 = 3

t4 = t1 + t2
t5 = t3 + t4

use t5

The ultimate goal is to do register assignment using only two ARM registers, r0 and r1, spilling some symbols if necessary.

The first step is to compute live ranges, i.e. the symbols that are live on entry to each instruction:

t1 = 1       |
t2 = 2       | t1
t3 = 3       | t1, t2

t4 = t1 + t2 | t1, t2, t3 (this is going to become a problem)
t5 = t3 + t4 | t3, t4

use t5       | t5
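
For straight-line code like this, the table above can be reproduced with a single backward pass. Here is a minimal sketch, assuming each instruction is encoded as a (defined symbol or None, list of read symbols) pair; that encoding and the name `live_in_sets` are made up for illustration and are not taken from any real allocator.

```python
# Each instruction is (defined symbol or None, symbols it reads).
# This tuple encoding is an assumption made for the sketch.
PROGRAM = [
    ("t1", []),            # t1 = 1
    ("t2", []),            # t2 = 2
    ("t3", []),            # t3 = 3
    ("t4", ["t1", "t2"]),  # t4 = t1 + t2
    ("t5", ["t3", "t4"]),  # t5 = t3 + t4
    (None, ["t5"]),        # use t5
]


def live_in_sets(program):
    """Return, for each instruction, the symbols live on entry to it.

    Only valid for straight-line code (no branches): one backward pass.
    """
    live = set()           # nothing is live after the last instruction
    result = []
    for dest, uses in reversed(program):
        # live_in = (live_out - defined symbol) + used symbols
        live = (live - {dest}) | set(uses)
        result.append(live)
    result.reverse()
    return result


for (dest, uses), live in zip(PROGRAM, live_in_sets(PROGRAM)):
    print(dest or "use", sorted(live))
```

Code with branches would need an iterative dataflow analysis over the control-flow graph instead of this single pass.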

Register interference graph

              t2
            /    \     
           /      \    
          /        \   
        t1----------t3
                    | 
                    | 
        t5          t4
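
The graph above can be derived mechanically from the live sets: two symbols interfere when they appear together in some live set. A small sketch, with the live sets copied by hand from the table and a hypothetical helper `build_rig` that returns an adjacency map:

```python
from itertools import combinations

# Live sets copied from the table above, one per instruction.
LIVE_SETS = [
    set(),
    {"t1"},
    {"t1", "t2"},
    {"t1", "t2", "t3"},
    {"t3", "t4"},
    {"t5"},
]


def build_rig(live_sets):
    """Adjacency map: two symbols interfere if some live set holds both."""
    graph = {}
    for live in live_sets:
        for sym in live:
            graph.setdefault(sym, set())      # keep isolated nodes such as t5
        for a, b in combinations(sorted(live), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


rig = build_rig(LIVE_SETS)
for sym in sorted(rig):
    print(sym, "--", sorted(rig[sym]))
```

The triangle t1, t2, t3 comes from the single live set of size three; t5 never shares a live set with anything, so it stays isolated.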

Register assignment via graph coloring

Now run Chaitin's algorithm on the graph to color it with the two registers:

Find a node with fewer than 2 edges
   1. Found => t5
   2. Remove it from the graph
   3. Push it onto the stack: [t5]
Find a node with fewer than 2 edges
   1. Found => t4
   2. Remove it from the graph, push: [t5, t4]
No nodes with fewer than 2 edges! (we're looking at the t1 - t2 - t3 triangle)
   1. Pick one at random, or the one with the highest degree (2 for all remaining nodes) => t3 (this might be a problem)
   2. Remove it from the graph, push: [t5, t4, t3]
Find a node with fewer than 2 edges
   1. Found => t2
   2. Remove it from the graph, push: [t5, t4, t3, t2]
Push the last node: [t5, t4, t3, t2, t1]
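
The removal steps above can be sketched as a loop. This is only an illustration of the walk-through, not a full Chaitin implementation: the function name `simplify` is hypothetical, and ties are broken by taking the largest symbol name, an arbitrary rule chosen only because it reproduces the order shown above.

```python
K = 2  # number of available registers (r0, r1)

# Interference graph from the diagram, as an adjacency map.
RIG = {
    "t1": {"t2", "t3"},
    "t2": {"t1", "t3"},
    "t3": {"t1", "t2", "t4"},
    "t4": {"t3"},
    "t5": set(),
}


def simplify(rig, k):
    """Remove nodes one at a time and return them in push order."""
    graph = {n: set(adj) for n, adj in rig.items()}   # work on a copy
    stack = []
    while graph:
        low = [n for n in graph if len(graph[n]) < k]
        if low:
            # Trivially colorable node; tie-break by name (assumption).
            node = max(low)
        else:
            # No node with degree < k: pick a potential spill candidate
            # by highest degree, again breaking ties by name (assumption).
            node = max(graph, key=lambda n: (len(graph[n]), n))
        for neighbor in graph.pop(node):
            graph[neighbor].discard(node)
        stack.append(node)
    return stack


print(simplify(RIG, K))
```

With a different tie-break the triangle step could just as well pick t1 or t2, which is exactly the choice the question is about.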

Now assign registers in reverse order, popping symbols off the stack:

Pop symbol => t1
   1. Conflicting nodes: none
   2. Assign: t1 => r0
   3. Put t1 back in the graph
Pop symbol => t2
   1. Conflicting nodes: {t1: r0}
   2. We have one more free register
   3. Assign: t2 => r1
   4. Put t2 back in the graph
Pop symbol => t3
   1. Conflicting nodes: {t1: r0, t2: r1}
   2. OOPS! No more free registers!
   3. Spill it: t3 => m0
   4. Put t3 back in the graph
Pop t4, then t5: each still has a free register (t4's only neighbor, t3, is in memory, and t5 has no neighbors)
No more spills required!

Final assignment:

t1 -> r0
t2 -> r1
t3 -> m0 (spilled)
t4 -> r0 (or r1)
t5 -> r0 (or r1)
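
The assignment phase can be sketched the same way. The sketch assumes the push order from the walk-through and always takes the first free register, so wherever the list above says "r0 (or r1)" it picks r0; `select` is a hypothetical name.

```python
REGISTERS = ["r0", "r1"]

# Interference graph from the diagram, as an adjacency map.
RIG = {
    "t1": {"t2", "t3"},
    "t2": {"t1", "t3"},
    "t3": {"t1", "t2", "t4"},
    "t4": {"t3"},
    "t5": set(),
}

STACK = ["t5", "t4", "t3", "t2", "t1"]   # push order from the walk-through


def select(rig, stack, registers):
    """Pop nodes and give each the first register unused by its neighbors."""
    assignment = {}
    spilled = []
    for node in reversed(stack):              # pop from the top of the stack
        taken = {assignment[n] for n in rig[node] if n in assignment}
        free = [r for r in registers if r not in taken]
        if free:
            assignment[node] = free[0]
        else:
            spilled.append(node)              # no register left: spill it
    return assignment, spilled


assignment, spilled = select(RIG, STACK, REGISTERS)
print(assignment, spilled)
```

A spilled node holds no register, so it does not constrain neighbors that are colored after it; that is why t4 is free to take r0.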

Inserting spills

Now, as per this (slide 29) and this (slide 4):

Before each operation that uses [the spilled symbol t3], insert t3 := load m0

After each operation that defines [the spilled symbol t3], insert store t3, m0

Let's do this and recompute the live ranges:

t1 = 1       |
t2 = 2       | t1

t3 = 3       | t1, t2
// STORE
store t3, m0 | t1, t2, t3 (STILL 3 SYMBOLS!)

t4 = t1 + t2 | t1, t2

// LOAD
t3 = load m0 | t4
t5 = t3 + t4 | t3, t4

use t5       | t5
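
The rewrite and the recomputed live sets above can be checked with a short script. It is a sketch under assumptions: instructions are (defined symbol, read symbols, text) triples, the memory slot m0 is not tracked as a symbol, and the helper names are invented for the example.

```python
# (defined symbol or None, symbols read, printable text); the triple
# encoding is an assumption made for this sketch.
PROGRAM = [
    ("t1", [], "t1 = 1"),
    ("t2", [], "t2 = 2"),
    ("t3", [], "t3 = 3"),
    ("t4", ["t1", "t2"], "t4 = t1 + t2"),
    ("t5", ["t3", "t4"], "t5 = t3 + t4"),
    (None, ["t5"], "use t5"),
]


def insert_spill_code(program, sym, slot):
    """Load `sym` before each use and store it after each definition."""
    out = []
    for dest, uses, text in program:
        if sym in uses:
            out.append((sym, [], f"{sym} = load {slot}"))
        out.append((dest, uses, text))
        if dest == sym:
            out.append((None, [sym], f"store {sym}, {slot}"))
    return out


def live_in_sets(program):
    """Symbols live on entry to each instruction (straight-line code only)."""
    live, result = set(), []
    for dest, uses, _ in reversed(program):
        live = (live - {dest}) | set(uses)
        result.append(live)
    return result[::-1]


rewritten = insert_spill_code(PROGRAM, "t3", "m0")
live_sets = live_in_sets(rewritten)
for (_, _, text), live in zip(rewritten, live_sets):
    print(f"{text:<13}| {', '.join(sorted(live))}")
print("largest live set:", max(len(s) for s in live_sets))
```

The largest live set still has three members, at the store, because t3 has to sit in a register between its definition and the store while t1 and t2 are both still live.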

Problem

As you can see, we still have an instruction where three symbols interfere, and we'll also get the same RIG as before.

So the spill didn't work!

Now, I brute-forced the possible spill candidates by hand and found that the RIG is still not 2-colorable if we spill any single symbol, but it becomes 2-colorable if we spill either of these pairs of symbols:

t1 and t3
t2 and t3

...and also t1, t2 and t3, but spilling only two symbols looks simpler.
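
The hand search can be reproduced with a short script. It is a sketch that assumes the same spill rule as above (load before each use, store after each definition, the spilled symbol keeps its name) and tries every 2-coloring exhaustively, which is only practical for a graph this small. All helper names are hypothetical.

```python
from itertools import combinations, product

# (defined symbol or None, symbols read); encoding assumed for the sketch.
PROGRAM = [
    ("t1", []), ("t2", []), ("t3", []),
    ("t4", ["t1", "t2"]), ("t5", ["t3", "t4"]), (None, ["t5"]),
]
SYMBOLS = ["t1", "t2", "t3", "t4", "t5"]


def spill(program, spilled):
    """Reload each spilled symbol before a use, store it after a definition."""
    out = []
    for dest, uses in program:
        out += [(s, []) for s in uses if s in spilled]      # loads
        out.append((dest, uses))
        if dest in spilled:
            out.append((None, [dest]))                      # store
    return out


def interference_edges(program):
    """Pairs of symbols that share a live-in set (straight-line code only)."""
    live, edges = set(), set()
    for dest, uses in reversed(program):
        live = (live - {dest}) | set(uses)
        edges |= set(combinations(sorted(live), 2))
    return edges


def colorable(nodes, edges, k):
    """Exhaustively try every assignment of k colors to the nodes."""
    for colors in product(range(k), repeat=len(nodes)):
        color = dict(zip(nodes, colors))
        if all(color[a] != color[b] for a, b in edges):
            return True
    return False


found = [
    subset
    for size in (1, 2)
    for subset in combinations(SYMBOLS, size)
    if colorable(SYMBOLS, interference_edges(spill(PROGRAM, set(subset))), 2)
]
print(found)  # [('t1', 't3'), ('t2', 't3')]
```

No single symbol appears in the result, and the only pairs are the two listed above.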

Actual question

How do I decide what to spill? What heuristic do I use? I'd also like the algorithm to do global register assignment, so plain live ranges and linear-scan assignment (as opposed to building the RIG) seem like a pretty cumbersome approach.

Also, if I re-run the same algorithm on the code with spills, the result will be "spill t3" again, even though that has already been done, so the allocator will loop forever. Worse, if I keep spilling t3 and inserting load/store code, I'll get more and more instructions where t1, t2 and t3 interfere, so the situation will only get worse.

And that's assuming I can simply write t3 = load m0, whereas on ARM (which I'm targeting) the address of m0 would first have to be held in a register of its own.

Maybe the last paragraph of my answer [here](https://stackoverflow.com/questions/23446878/how-to-deal-with-multiple-spilled-values-as-operands-for-a-single-instruction/56971134#56971134) will help you, i.e. use a spill heuristic that spills longer ranges, so that the new temporaries generated by spilling won't be spilled again. — antoyo, Mar 04 '20 at 13:59
@antoyo, this seems to work after running the linear-scan algorithm on the last piece of code (with `t3` spilled) and choosing to spill `t1` as the symbol with the longest live range so far (at instruction `store t3, m0`). But the RIG approach loses all information about a symbol's live range (right?), so looks like I can't use this here. I could as well switch to linear-scan, but I already have a working graph-coloring allocator (that can't spill), and I'm not terribly excited about writing a new allocator — ForceBru, Mar 04 '20 at 14:37
So, what I used in [my own graph-coloring register allocator](https://github.com/antoyo/tiger-rs/blob/20d4f384a1bdd788d710cdb38ede48e18ae5963b/tiger/src/color.rs#L384) for the spill cost was the degree of the node (i.e. the number of neighbors) which is kind of correlated to the live range. You could try that. — antoyo, Mar 04 '20 at 22:59
@antoyo, I'm kinda using this implicitly already. See step 2.1 in the first list: "Pick one at random (or with the highest degree) => `t3`". So I end up with that `t1 - t2 - t3` triangle after popping off `t4` and `t5`, and there all three nodes have degree 2, so again I don't know which one to choose — ForceBru, Mar 05 '20 at 05:19