Playing perfect Tetris: how to align and scale two curves using scaling and translation?

Question

Given parameters of scaling on the y axis (s) and translation on the x axis (t), how to scale and align two curves that do not coincide when the purpose is to maximize curve superposition (not minimize distance)?

As indicated by @DWin, this could be rebranded "How to play Tetris perfectly with R", though it does have applications far beyond winning a Tetris game.

A variation of this question could involve any number of rigid body transformations (rotation, translation and scaling).

Given curve 1

curve1<-data.frame(x=c(1,1,2,2,3),
                   y=c(9,6,6,3,3))

with(curve1, plot(x=x, y=y, type="l", xlim=c(0,10), ylim=c(0,10)))

enter image description here

and curve 2

curve2<-data.frame(x=c(4,5,5,6,6,7),
                   y=c(2,2,1,1,2,3))

with(curve2, plot(x=x, y=y, type="l", xlim=c(0,10), ylim=c(0,10)))

curve 2

I wish to find s and t that maximize superposition between the two curves.

Ideally the method would be in R using optim.

In this made up example, t=3 and s=1/3, so that

t=3
s=1/3

with(curve2, plot(x=x, y=y, type="l", xlim=c(0,10), ylim=c(0,10)))

with(curve1, lines(x=x+t, y=y*s, col="red"))

enter image description here

Note that to obtain such a fitting, regions that can have a consensus must have a higher weight on parametrization than regions that can not be superimposed and that the larger the consensus region the higher the weight.

Trails I have been exploring:

minimize area between curves

algorithm for superposing two curves

shape analysis using the shapes package

Iterative closest point (ICP) if someone has implemented it in R or can port one of the C implementations

Bonus points for a method using maximum likelihood (assuming normal distribution of error).

Hmmm. An interesting problem. One thing that needs to be considered is that it seems likely something needs to be done to avoid pathological results. Specifically, a local optimum would likely be the case where you scale up one shape massively, and so fit everything onto a part of a straight line.... — Fhnuzoag, Jun 07 '12 at 14:28
Have ideas on how to do this but had a quick question. Suppose the two shapes are a calculator-style "5" (i.e. five equal horizontal/vertical line segments joined together to form a "5" shape - sure you understand what I mean), and a vertical line segment (aka a calculator-style "1"). Here, I want to check that you regard the best solution to be one with 100% overlap (i.e. the second shape overlaps exactly with one of the vertical segments of the first shape). In other words, the fact that most of the first shape is uncovered, and fairly distant from the second shape, doesn't matter, right? — Tim P, Jun 09 '12 at 12:04
So I've managed to come up with an interesting method that detects exact superposition - but the way the problem is specified I think you're inevitably going to hit issues with local minima that you don't want to move away from. Can I just ask whether you think it's likely (based on whatever problem you're solving) that the entirety of shape 1 can be superimposed onto shape 2? — Tim P, Jun 09 '12 at 16:10
@TimP, The amount of overlap between shape 1 and 2 is arbitrary. Sometimes shape one will completely fit within shape two as you indicated. — Etienne Low-Décarie, Jun 11 '12 at 01:40

score 5 · Answer 1 · edited May 31 '12 at 18:02

This will return the distances between the points when curve1 is scaled on the y-axis by a factor "tfac" and moved on the x-axis by an amount "s":

 as.matrix( dist( rbind(as.matrix(curve2), 
                 ( matrix(c(rep(s, 5), rep(1,5)), ncol=2) +   # x-shift matrix
                                       as.matrix(curve1) ) %*% 
                   matrix(c( 1, 0, 0, tfac),ncol=2) ) ) # the y-scaling matrix
                )[ 
                           # better not to use 't' as a variable name
      -(1:5), -(6:11)]  # easier to return the relevant distances when in matrix

It should be a simple matter to put that in a function to be minimized:

 dfunc <- function(C1, C2, s, tfac) { sum( .... ) }

I'm not absolutely sure this will return the result you are expecting since the objective function you are implying may not be the sum of distances. You may need to turn to integer programming methods. The Optimization CRAN Task View would be a good place to look for those methods in R. I suppose an alternative if this problem arises might be to round the "s" value and scale only to the nearest power of 2.

dfunc <- function(param, D1=curve1, D2=curve2) { 
           sum( as.matrix(dist( 
                   rbind(as.matrix(D2), 
                         ( matrix(c(rep(param[1], 5), rep(1,5)), ncol=2) + 
                           as.matrix(D1) ) %*% 
                   matrix(c(1,0,0,param[2]),ncol=2) ) ) )[-(1:5), -(6:11)])}
optim(c(1,1), dfunc)
#$par
$[1] 3.3733977 0.2243866
# trimmed output

Using these values one obtains the following superposition: Superposition using above values

I might, therefore, try s=3, tfac=0.25. (I see looking back that I switched the roles of t and s from what you asked for. Sorry.)

Interesting effort, thanks! It appears however that this is a more complex question than minimizing the distance, because regions that can have consensus must have a higher weight than regions that can not be superimposed. — Etienne Low-Décarie, May 31 '12 at 11:43
My suggestion to that effect was made even before I obtained that numeric result. Efforts to minimize based on area between the curves might also be tried but I can see ambiguities in that problem specification as well. I do not know of a Tetris-playing package for R. — IRTFM, May 31 '12 at 18:04

Fhnuzoag · Answer 2 · 2012-06-11T15:06:34.437

Okay, here's an attempt at a solution.

The basic trick is this: we rasterise the two curves, and then we can do the curve comparison on a tile by tile basis. This seems like a fairly reasonable way of comparing the curve superposition, at least. To encourage the optimiser to approach the curve, we also introduce a loss that penalises curves too far away from the other.

No guarantees this works for more complicated curves and transformations, but it's an idea, at least.

 curve2<-data.frame(x=c(4,5,5,6,6,7),
                    y=c(2,2,1,1,2,3))
 fillin <- function(ax, ay, bx, by, scaling= 10, steps= 100) floor(cbind(seq(from = ax, to = bx, len = steps), seq(from = ay, to = by, len = steps)) * scaling)
 Bmat <- matrix(0, 100, 100)
 for (i in 2:nrow(curve2)){
 Bmat[fillin (curve2[i-1,1], curve2[i-1,2], curve2[i,1], curve2[i,2])] =1
 }
 Bmat.orig = Bmat

 Bmat = Bmat.orig
 #construct utility function based on 
 #manhattan distances to closest point?
 shift = function(mat, offset){
 mat0 = array(0, dim = dim(mat)+2)
 mat0[1:nrow(mat) +1+ offset[1] , 1:ncol(mat) + 1+offset[2]] = mat
 return(mat0[1:nrow(mat) + 1, 1:ncol(mat) + 1])
 }

 for (i in 1:100){
 Bm = (Bmat != 0)
 Btmp1 = shift(Bm, c(1,0))
 Btmp2 = shift(Bm, c(-1,0))
 Btmp3 = shift(Bm, c(0,1))
 Btmp4 = shift(Bm, c(0,-1))

 Bmat = Bmat + pmax(Bm ,Btmp1, Btmp2, Btmp3, Btmp4)/i
 }

 Bmat2 = replace(Bmat, Bmat == max(Bmat), max(Bmat) + 10)

 #construct and compare rasterised versions
 getcurve = function(trans = c(0,1),  curve=data.frame(x=c(1,1,2,2,3) ,
                    y=c(9,6,6,3,3) ), Bmat = Bmat2){
 Amat = array(0, dim = dim(Bmat))
 curve[,1] = curve[,1] + trans[1]
 curve[,2] = curve[,2] * trans[2]
 for (i in 2:nrow(curve)){
 fillin (curve[i-1,1], curve[i-1,2], curve[i,1], curve[i,2]) -> ind
 if (min(ind) < 1 || max(ind) > nrow(Bmat)) return( array(-1, dim= dim(Bmat)))
 Amat[ind] =1
 }
 Amat = (Amat - mean(Amat))/sd(as.vector(Amat))
 Amat
 }
 compcurve = function(trans = c(0,1), curve=data.frame(x=c(1,1,2,2,3) ,
                    y=c(9,6,6,3,3) ) , Bmat = Bmat2){
 Amat = getcurve(trans, curve, Bmat)
 -sum(Amat * Bmat)
 }
 #SANN seems to work for this, but is slow. Beware of finite differencing
 # - criterion is non-smooth! 
 optim(c(0,1), compcurve, method = "SANN", Bmat = Bmat2) -> output
 image(Bmat)
 contour(getcurve(output$par), add = T)

Result:

Fit in black, verses target criterion

Not too shabby, maybe?

You will possibly have to fudge the specifics of the rasterisation to work for other problems. And you might wanna tune how the optim is done.

A 'smarter' alternative is to note that for an optimal solution, it's probably the case that at least one pair of vertices will conincide. This can give you a better search strategy. The advantage of the rasterisation scheme in comparison to area-between curves is that it's possibly more flexible wrt to different transformations and non-graphs (in particular, the vertical line in your first curve is a problem.) You can potentially avoid rasterisation by an appropriate computation, but it gives me a headache just to think about it.

Since we are maximising a criterion, then this is a maximum likelihood method. Curiously, I think it is actually impossible to phrase this question as a maximum likelihood problem using normal errors, because normal errors imply L2 based criterions, which will famously not give you exact superpositions.

Playing perfect Tetris: how to align and scale two curves using scaling and translation?

2 Answers2

Linked