14

I am designing a layout engine for ZenUML. One requirement (after simplification) is this:

  1. There a line segment;
  2. There are n (n < 100, it can be reduced to n < 30 if that makes difference in performance) points on this line segment, the order is fixed; (e.g. P1 ~ Pn)
  3. There are known minimum distances between some points; (e.g. m_d(P2,P4)=500)
  4. The length of the line segment should be as small as possible;
  5. (nice to have) The gaps between neighbour points should be as even as possible (measured by standard deviation^, and must not violate 1~4).
  6. (newly added) it must give result in no longer than 1s for 30 points in worst case (more or less constraints).

^ standard deviation is used as a best alternative that I can precisely describe with my limited math knowledge. However as Roman and David pointed out, there is more appealing result in some cases that does not satisfy lowest standard deviation. See this comment

A sister questions is asked at https://math.stackexchange.com/q/4377433. It is the same problem described with math language (matrix).

For example, for a line segment that has 4 points on it (P1 ~ P4).

m_d(P1, P3) = 200, m_d(P1, P4) = 900.

  • a. One solution would be P1 = 0, P2 = 0, P3 = 200, P4 = 900.
  • b. A better solution would be P1 = 0, P2 = 100, P3 = 200, P4 = 900. (put P3 at the middle)
  • c. A even better solution would be P1 = 0, P2 = 300, P3 = 600, P4 = 900. (evenly distribute P2 and P3).

Some background of this problem. Have a look at the following diagram.

  • We need to create a layout for all the lifelines (vertical lines). #1, #2
  • The gap between those line are decided by the length of messages. #3
  • We need to make sure the total width of the diagram is small (to save space when there are many participants). #4
  • In this diagram, B should be moved to closer to C. #5 enter image description here

I have implemented the following which satisfies constraints #1~#4. Can some please help with constraint #5? It can be partially met (like b in the example) or fully met (like c in the example). We can use standard deviation to measure the quality of the solution. Updated: thanks to Roman Svistunov, the base implementation to get non-optimal position is dramatically simplified.

// Known minimum distance is stored in a matrix
// e.g.
//    m = [
//      [0, 100, 0, 900],
//      [0, 0,   0, 0],
//      [0, 0,   0, 0],
//      [0, 0,   0, 0],
//    ]

let f = function (n: number, m: Array<Array<number>>): number {
  let array = Array(n).fill(0);
  for (let i = 0; i < n; i++) {
    array[i] = f(i, m) + m[i][n];
  }
  return Math.max(0, ...array);
}
Devs love ZenUML
  • 11,344
  • 8
  • 53
  • 67
  • @maraca, you are right. I have just fixed it in the question. Thanks a lot. – Devs love ZenUML Feb 09 '22 at 02:04
  • Can you specify the metric on gaps that you're trying to minimize, or, in other words, clarify what `(nice to have) The gaps between neighbour points.` in 5 means? Do you want to minimize the maximum gap size, the sum of squared gap sizes, or some other function? – kcsquared Feb 09 '22 at 04:43
  • @kcsquared, thanks. Just added clarification. Basically, if possible, evenly distribute the points on the line segment. – Devs love ZenUML Feb 09 '22 at 04:49
  • Could you specify the formula for measuring 'even distribution', e.g. standard deviation of gap sizes? Specifically, if `g_1, g_2, ... g_n` are one possible set of consecutive differences and `h_1, h_2, ... h_n` are another (both satisfying conditions 1-4), how do you tell which is better? – kcsquared Feb 09 '22 at 04:58
  • 1
    `standard deviation` is a good candidate. To provide some context, this is to create visually comfortable effect. Let's say evenly distributed points look better than crowed at one end and sparse at the other end for example. I would like to point of if there are n points, there would be (n-1) gaps. – Devs love ZenUML Feb 09 '22 at 06:10
  • There will not be, let's say, more than 30 points. The order of the points are pre-defined. – Devs love ZenUML Feb 10 '22 at 06:37
  • The extent between the first and the last is not pre-defined. For example, the given information for P1~P4 could be `m_d(P1, P2)` and `m_d(P3, P4)`. In this case, d(P1, P4) would be `m_d(P1, P2) + m_d(P3, P4)`. The `m_d(P1, P4)` is not given here. If it is given, it is NOT necessarily the final result, because other gaps may push P4 to right further. The algorithm I put in the question can produce a correct result for the distance between P1 and P4. It does not satisfy #5. – Devs love ZenUML Feb 10 '22 at 06:46
  • The extent between the first and last point is not fixed? Or always only a minimum given? 4) says it should be as small as possible. Should the standard deviation then be divided by the extent for criterion 5)? [Extent / (number of points - 1) would be optimum gap size without restraints] – Sebastian Feb 10 '22 at 06:47
  • 1
    So with 4) being more important than 5) and the algorithm, the extent is fixed. For each point a minimum and maximum position can be calculated according to this overall extent. – Sebastian Feb 10 '22 at 06:51
  • 4) is definitely more important. The background of this questions is to layout sequence diagram participants (vertical lines). 4) means we must keep the total width of the diagram to the minimum, especially when there are many vertical lines. – Devs love ZenUML Feb 10 '22 at 06:53
  • We have an interval for each point and an optimal position without constraints (only the overall extent) for each point. Could there be conflicts with the constraints, if each point is individually (!) moved as close as possible to the optimum position, but keeping it in the interval boundaries? – Sebastian Feb 10 '22 at 06:59
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/241890/discussion-between-devs-love-zenuml-and-sebastian). – Devs love ZenUML Feb 10 '22 at 07:04
  • 1
    Instead of optimizing the standard deviation of the gaps (global criterion for all points with more weight to the outliers) or the sum of squared differences to the optimal position (global criterion for all points with more weight to the outliers) or the minimum length of a gap (global criterion for the most extreme point only), you could also demand that points are in the middle of the two neighboring points by minimizing the sum of the squared differences from this position (local criterion for all points with more weight to the local outliers). I really think it visually makes a difference – Sebastian Feb 12 '22 at 05:01

3 Answers3

8

I’ll work through some theory and then give some JavaScript code that meets 1–4 and 6, a modified version of 5, and additionally

  1. is robust to numerical issues
  2. has no external dependencies.

We can express 1–4 as a linear program. Your sample instance, for example, would translate as

minimize x4 - x1
subject to

(known minimum distances)
a: x3 - x1 >= 200
b: x4 - x1 >= 900

(fixed order)
c: x2 - x1 >= 0
d: x3 - x2 >= 0
e: x4 - x3 >= 0

x1, x2, x3, x4 unbounded

Every linear program has a dual program with the same optimum objective value (under conditions that are met here). The dual of the program above is below:

maximize 200 a + 900 b
subject to

x1: -a - b - c = -1
x2: c - d = 0
x3: a + d - e = 0
x4: b + e = 1

a, b, c, d, e >= 0

This is a max-cost flow problem with no capacity constraints. In consequence, some path is an optimal flow, so we can understand criteria 1–4 as a longest path problem.

Once we have the optimal objective, 900 here, we can shift criterion 4 to a constraint

x4 - x1 <= 900

and think about how to make the most pleasing layout. As written, we can achieve criterion 5 by solving a quadratic program:

minimize (x2 - x1)**2 + (x3 - x2)**2 + (x4 - x3)**2
subject to

(most efficient layout)
x4 - x1 <= 900

(known minimum distances)
x3 - x1 >= 200
x4 - x1 >= 900

(fixed order)
x2 - x1 >= 0
x3 - x2 >= 0
x4 - x3 >= 0

x1, x2, x3, x4 unbounded

The objective that I’ve written is not the standard deviation, but it doesn’t matter. Minimizing the standard deviation is equivalent to minimizing the variance. In turn, the variance of a random variable X is E[X**2] - E[X]**2, and E[X]**2 is a constant here (E[X] = 900/4 = 150 in the example). Finally, maximizing E[X**2] is equivalent to maximizing the sum of squares of the equally likely outcomes.

I would propose a different criterion however.

5′. Maximize (in lexicographic order) the list of gap lengths sorted small to large.

What this means is that we try as hard as possible to avoid small gaps. In an example from Roman Svistunov, suppose that we have constraints

x3 - x1 >= 2
x4 - x2 >= 5.5
x5 - x3 >= 8

The solution with minimum standard deviation sets

x1 = 0
x2 = 0.75
x3 = 2
x4 = 6.25
x5 = 10

whereas the solution that optimizes the replacement objective sets

x1 = 0
x2 = 1
x3 = 2
x4 = 6.5
x5 = 10

We’ve made x2 more centered at the cost of making x4 less centered. Since x2 is closer to its neighbors than x4, this seems like a reasonable trade to me, but you obviously have the final say.

To optimize 5′, we proceed as follows. Start by formulating a linear program like

maximize z
subject to

(most efficient layout)
x5 - x1 <= 10

(known minimum distances)
x3 - x1 >= 2
x4 - x2 >= 5.5
x5 - x3 >= 8

(fixed order)
x2 - x1 >= z
x3 - x2 >= z
x4 - x3 >= z
x5 - x4 >= z

x1, x2, x3, x4, x5 unbounded

We solve for the optimum, z = 1 here, and also record which gaps prevented it from being larger. We replace those z variables and solve again, repeating until there is no occurrence of z.

maximize z
subject to

(most efficient layout)
x5 - x1 <= 10

(known minimum distances)
x3 - x1 >= 2
x4 - x2 >= 5.5
x5 - x3 >= 8

(fixed order)
x2 - x1 = 1
x3 - x2 = 1
x4 - x3 >= z
x5 - x4 >= z

x1, x2, x3, x4, x5 unbounded

The optimum is z = 3.5.

maximize z
subject to

(most efficient layout)
x5 - x1 <= 10

(known minimum distances)
x3 - x1 >= 2
x4 - x2 >= 5.5
x5 - x3 >= 8

(fixed order)
x2 - x1 = 1
x3 - x2 = 1
x4 - x3 >= z
x5 - x4 = 3.5

x1, x2, x3, x4, x5 unbounded

The optimum is z = 4.5, and that’s the last iteration.

As observed earlier, we can solve these linear programs using an extended longest path algorithm. In JavaScript:

// Dual numbers represent linear functions of time.
function Dual(p, v) {
  return { position: p, velocity: v };
}

// Adds dual numbers x and y.
function dualPlus(x, y) {
  return Dual(x.position + y.position, x.velocity + y.velocity);
}

const epsilon = Math.sqrt(Number.EPSILON);

// Compares dual numbers x and y by their position at a time infinitesimally
// after now.
function dualLessThan(x, y) {
  let d = x.position - y.position;
  return d < -epsilon || (Math.abs(d) <= epsilon && x.velocity < y.velocity);
}

// Tracks delta, the length of time for which every pair of arguments to
// .dualLessThan() maintains the present relation.
function DeltaTracker() {
  return {
    delta: Infinity,
    dualLessThan: function (x, y) {
      let lessThan = dualLessThan(x, y);
      if (lessThan) {
        [x, y] = [y, x];
      }
      if (x.velocity < y.velocity) {
        this.delta = Math.min(
          this.delta,
          (x.position - y.position) / (y.velocity - x.velocity)
        );
      }
      return lessThan;
    },
  };
}

// Converts the adjacency matrix to an adjacency list representation.
function graphFromMatrix(n, matrix) {
  let graph = [];
  for (let j = 0; j < n; j++) {
    graph.push([]);
    for (let i = 0; i < j; i++) {
      if (matrix[i][j] > 0) {
        graph[j].push({ i: i, length: Dual(matrix[i][j], 0) });
      }
    }
  }
  return graph;
}

// Essentially the usual algorithm for longest path, but tracks delta.
function longestPathTable(graph, gaps) {
  let tracker = DeltaTracker();
  let maximum = Dual(0, 0);
  let table = [];
  for (let j = 0; j < graph.length; j++) {
    let argument = null;
    if (j > 0) {
      maximum = dualPlus(maximum, gaps[j - 1]);
    }
    for (let edge of graph[j]) {
      let x = dualPlus(table[edge.i].maximum, edge.length);
      if (tracker.dualLessThan(maximum, x)) {
        argument = edge.i;
        maximum = x;
      }
    }
    table.push({ argument: argument, maximum: maximum });
  }
  return [tracker.delta, table];
}

// Essentially the usual algorithm for decoding the longest path from the
// dynamic program table.
function freezeCriticalGaps(table, graph, gaps) {
  let j = table.length - 1;
  while (j > 0) {
    let critical = true;
    let argument = table[j].argument;
    if (argument !== null) {
      j = argument;
    } else {
      j--;
      gaps[j].velocity = 0;
    }
  }
}

// Changes the time from now to now + delta.
function advanceGaps(gaps, delta) {
  for (let i = 0; i < gaps.length; i++) {
    gaps[i].position += gaps[i].velocity * delta;
  }
}

// Extracts the final result from the dynamic program table.
function positionsFromTable(table) {
  let positions = [];
  for (let entry of table) {
    positions.push(entry.maximum.position);
  }
  return positions;
}

// Entry point for the layout algorithm.
function layOut(n, matrix) {
  let graph = graphFromMatrix(n, matrix);
  let gaps = [];
  for (let j = 1; j < n; j++) {
    gaps.push(Dual(0, 1));
  }
  while (true) {
    let [delta, table] = longestPathTable(graph, gaps);
    if (delta == Infinity) {
      return positionsFromTable(table);
    }
    if (table[n - 1].maximum.velocity > 0) {
      freezeCriticalGaps(table, graph, gaps);
    } else {
      advanceGaps(gaps, delta);
    }
  }
}

// Various test cases.
console.log(
  layOut(4, [
    [0, 200, 0, 900],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
  ])
);

console.log(
  layOut(5, [
    [0, 0, 2, 0, 0],
    [0, 0, 0, 5.5, 0],
    [0, 0, 0, 0, 8],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
  ])
);

console.log(
  layOut(4, [
    [0, 0, 200, 0],
    [0, 0, 0, 150],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
  ])
);

console.log(
  layOut(4, [
    [0, 0, 200, 0],
    [0, 0, 0, 200],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
  ])
);

console.log(
  layOut(4, [
    [0, 0, 200, 0],
    [0, 0, 0, 300],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
  ])
);
David Eisenstat
  • 64,237
  • 7
  • 60
  • 120
  • Hi David, Could you please elaborate on "a different interpretation of criterion 5: that we maximize the minimum gap (breaking ties by the second minimum, breaking ties by the third, etc.;" a little bit more? I did not get its full meaning. – Devs love ZenUML Feb 12 '22 at 07:29
  • @david-eisenstat it is not, if one of the gaps is forced to be quite short and others are flexible there are multiple solutions that are equally good with maximized minimum and only one of them is good for standard deviation. E.g. `m_d(0,1) = 10, m_d(1,2)=10, m_d(1,3)=50`, with `[0,10,20,60]` is as good as `[0,10,35,60]` but the second is the necessary solution and the first is not. – Roman Svistunov Feb 12 '22 at 14:41
  • 2
    @david-eisenstat sorry, i have not seen after edit 1. Anyway, it is still not the same metric. Assume we have 5 points, and `m_d(0,2)=2, m_d(1,3)=5.5, m_d(2,4)=8`. Now the best solution (for standard deviation) is `[0, 0.75, 2, 6.25, 10]` and your solution would produce `[0,1,2,6.5,10]`. But i would argue your metric would produce more appealing result for me, but that is subjective and we should get @devs-love-zenuml to resolve this. – Roman Svistunov Feb 12 '22 at 15:40
  • 1
    @RomanSvistunov I agree that `[0,1,2,6.5,10]` is more appealing than `[0,0.75,2,6.25,10]`. I don't know what is the math term to describe or quantify this. – Devs love ZenUML Feb 12 '22 at 21:10
  • @DevsloveZenUML I have deleted my answer about the physical model with springs and rods (BTW I don't think I'll have the time to overhaul it). FYI I calculated by hand what that physical model would yield. The result is `[0,0.75,2,6.25,10]`. – Walter Tross Feb 13 '22 at 16:17
  • @WalterTross, thanks. I will very interested to see it if one day you find time to implement it. It is an interesting model. I have discussed this model with my colleagues (not on SO). We were not to build it by ourselves. Thanks again and thanks to the SO community. – Devs love ZenUML Feb 13 '22 at 22:26
5

First, we should find the smallest possible length of the segment. This can be done with the greedy algorithm: place the first point at 0 (without loss of generality) and add the following points one by one with a coordinate as small as possible without violating distances to all previous points. We would fix the last point at its position and would not move it any further. I would suppose that correctness is obvious.

Now the tricky part with optimizing the standard deviation of neighboring distances. We would introduce n-1 variables - the distances, I would name them Di. Trying to minimize their standard deviation is the same as minimizing variance.

And because E(D) is constant as we fixed the first and last points, minimizing standard deviation is the same as minimizing the sum of differences.

Constraints on distances are all linear as whenever you have m_d(Pi,Pj) you would add constraint (sum Dk for k=i to j) >= m_d(Pi,Pj). To guarantee that sum of distances is what we need it to be we also would add contracts sum Dk is both less or equal and greater or equal to the minimum distance calculated in step one. Therefore this problem is a quadratic optimization problem, that can be solved with libraries.

Note that the coordinates for the points then would be easily restored from distances as prefix sums.

Even though for 30 points probably any algorithm of quadratic programming would be fast enough if you want to invest time into further optimization if you intend to increase the number of points, as this problem can actually be solved exactly in polynomial time.

The first and most important optimization. Because the target function is the sum of squares, the matrix of target quadratic form is positive definite (as it is the identity matrix), and therefore there exists a polynomial algorithm, as have been shown in this paper.

The last optimization I would like to mention might not be relevant, but it is very good on most of the possible m_d matrices. To reduce the quadratic programming problem size, we can find optimal positions for some points before quadratic programming.

First, we would make reductions in matrix to ensure that for any i<j<k it is true that m_d(Pi,Pj)+m_d(Pj,Pk)>=m_d(Pi,Pk) with updating the right side. Note that with this computation m_d(P1,Pn) is also the minimum possible length of the segment. After this, we can say that for any i, if m_d(P1,Pi)+m_d(Pi,Pn)=m_d(P1,Pn) is true there is only one possible position for the point Pi, and, this allows us to remove one of the variables as now distance from P(i-1) to Pi and from Pi to P(i+1) can be replaced with one variable - distance from P(i-1) to P(i+1): the first distance is just Pi position - P(i-1) position with can be derived by induction, and the P(i+1) position is P(i-1) position + this new distance.

Edit 2

There are two versions of this pseudocode, both assume you use some kind of quadratic programming library, first one is the intuitive one (and used in a few libraries), the second one is for libraries that use matrices as their input (but less intuitive):

function findOptimal(Matrix md, int n) {
  int minLength = findMinLength(md, n);
  Constraints constraints; //from library
  for (int i = 0; i < n; ++i) {
    for (int j = 0; j < m; ++j) {
      if (j < i)
        //Here sum of variables is not their current sum
        //but some object for library
        constraints.AddConstranit(sum of variables i to j >= md[j][i])
      else
        constraints.AddContraint(sum of variables j to i >= md[j][i])
    }
  }
  constraints.AddConstraint(sum of variables 0 to n-1 <= minLength)
  variables = QuadraticOptimizationSolve(sum of squares of variables, contracts)
  positions = Array<number>(n);
  positions[0] = 0;
  for (int i = 0; i < n; ++i) {
    positions[i + 1] = positions[i] + variables[i];
  }
  return positions;
}

As matrices are more common as inputs in libraries, here is the pseudocode to work with them:

function findOptimal(Matrix md, int n) {
  int minLength = findMinLength(md, n);
  Matrix optimization = Matrix(n - 1, n - 1);
  for (int i = 0; i < n - 1; ++i) {
    for (int j = 0; j < n - 1; ++j) {
      if (i == j)
        optimization[i][j] = 1
      else
        optimization[i][j] = 0
    }
  }
  Matrix constraints = Matrix(n * (n - 1) / 2 + 1, n)
  for (int i = 1; i < n; ++i) {
    for (int j = 0; j < i; ++j) {
      constraintRow = i * (i - 1) / 2 + j;
      //constriant is sum of distances from j to i is 
      //at least md[j][i]
      for (int k = j; k < i; ++k) {
        //these are coefficients in linear combination
        //negative because we need >= while optimizers usually use <= contraints
        if (k >= k and k < i)
          constraints[constraintRow][k] = -1
        else
          constraints[constraintRow][k] = 0
      }
      constraints[constraintsRow][n - 1] = -md[i][j];
    }
  }
  variables = QuadraticOptimizationSolve(optimization, constraints);
  positions = Array<number>(n);
  positions[0] = 0;
  for (int i = 0; i < n; ++i) {
    positions[i + 1] = positions[i] + variables[i];
  }
  return positions;
}
Also this solution assumes in your library the constants vector in constraints is the last column, but sometimes it might be separate vector

Edit 3

This is the javascript solution for your problem. Easily handles 30 points in a second (actually on my setup it can process up to 100 points in a second). It uses a different implementation of findMinLength function than yours (yours does not save the result of recursive calls and is exponential because of that). Also, I was unable to find a good library for quadratic programming, so I ended up using this solution (was edited in edit 4). This is for node js, but combining all the files (and removing requires and exports, for vsmall you need to define vsmall as epsilon in their file), and this library seems to work. It appears that the previous library was just a browser adaptation of this one, and this one had an issue that was fixed years after the adaptation was made, but it is quite easy to do it manually (as it has no dependencies).

let findMinLength = function (n, m) {
  let answer = [];
  answer[0] = 0;
  for (let i = 1; i < n; i++) {
    answer[i] = 0;
    for (let j = 0; j < i; j++) {
      answer[i] = Math.max(answer[i], answer[j] + m[i][j], answer[j] + m[j][i]);
    }
  }
  return answer[n - 1];
}
let findOptimal = function(n, m) {
  let minLength = findMinLength(n, m);
  let dmat = [];
  let dvec = [];
  for (let i = 0; i < n - 1; i++) {
    dmat[i + 1] = []
    dvec[i + 1] = 0
    for (let j = 0; j < n - 1; j++) {
      if (i == j)
        dmat[i + 1][j + 1] = 1;
      else
        dmat[i + 1][j + 1] = 0;
    }
  }
  let amat = [];
  for (let i = 0; i < n - 1; i++) {
    amat[i + 1] = [];
    amat[i + 1][1] = -1;
  }
  let bvec = [];
  bvec[1] = -1;
  for (let i = 1; i < n; i++) {
    for (let j = 0; j < i; j++) {
      let constraintRow = i * (i - 1) / 2 + j;
      for (let k = 0; k < n - 1; k++) {
        amat[k + 1][constraintRow + 2] = (k >= j && k < i) ? 1 : 0;
      }
      bvec[constraintRow + 2]=Math.max(m[i][j], m[j][i]) / minLength;
    }
  }
  let variables = solveQP(dmat,dvec,amat,bvec).solution;
  let positions = [];
  positions[0] = 0;
  for (let i = 0; i < n - 1; i++) {
    positions[i + 1] = positions[i] + variables[i + 1] * minLength;
  }
  return positions;
}

For example, calling findOptimal(4, m) where m is as in your example (with 4 points), the function returns [0,300,600,900]

Edit

As noted in @david-eisenstat there exists an interpretation of the 5th point as maximizing minimum gap, but there is a simpler and asymptotically faster solution, namely, answer binary search.

We know that the minimum gap is at least 0 and at most the segment length / n. We can maximize it with binary search, with a checker ensuring that every gap is at least our value. If findMinLength is a function that solves 1-4, the following is the solution that optimizes this metric (pseudocode):

function findOptimal(Matrix md, int n) {
  minLength = findMinLength(md, n) //The answer to minimize segment length
  minGapL = 0 //Lower bound in maximized minimum gap length
  minGapR = ceiling(minLength / n) //Upper bound in maximized minimum gap
  while minGapL < minGapR {
    minGapTest = (minGapL + minGapR) / 2
    mdCopy = md.copy()
    //This loop ensures that every gap is at least minimum gap tested
    for (int i = 1; i < n; ++i) {
      mdCopy[i - 1][i] = max(mdCopy[i - 1][i], miGapTest)
    }
    int testLength = findMinLength(mdCopy, n)
    if (testLength == minLength) {
      //if it is possible to get tested minimum gap update lower bound
      minGapL = minGapTest
    } else {
      //otherwise update lower bound
      minGapR = minGapTest - 1;
    }
  }
  return (minLength, minGapL);
}

With this function returning both length and maximized minGap one can reconstruct the answer with the matrix modification as in the loop and then solve just problem for 1-4. This solution is polynomial (the 1-4 problem can be solved in linear time as shown above, and binary search is logarithmic of the answer, and with the answer being exponential of the input size it is also linear of the input size).

Roman Svistunov
  • 1,069
  • 6
  • 11
  • 1
    Thanks @roman-svistunov. To make sure that I get it correctly, in line #2, parameter `n` is not required, right? line #7, it should be `md`. – Devs love ZenUML Feb 12 '22 at 03:06
  • I have updated my base (non-optimal) implementation based on your advice. It is now much more simpler. Thanks! – Devs love ZenUML Feb 12 '22 at 05:20
  • Yes, neither function needs parameter n, it is just the size of `md`. And in line #7 it should be `md`, also edited it into the answer. – Roman Svistunov Feb 12 '22 at 06:52
  • Also, the solution for minimizing the maximum for sure would be fast enough for 30 points (it should easily handle ~10^5 points in a second), and for the quadratic optimization, it all depends on the library you would use. I would try to add some pseudocode for this solution too as it might look confusing at the moment. – Roman Svistunov Feb 12 '22 at 07:03
  • 2
    The "interpretation of the 5th point as maximizing minimum gap" isn't necessarily true. For example, `m_d(0,1) = 10, m_d(1,2)=10, m_d(1,3)=50`. The distance between P0 and P1 will be fixed as 10. But P2 can still be moved to the middle of P1 and P3. – Devs love ZenUML Feb 12 '22 at 07:21
  • It is true, but the code would be much nicer and often it would work. The standard deviation optimization here seems to be as hard as any quadratic optimization with positive definite matrix (though never proved it, only subjective meaning), and it would be significantly slower (still enough to work with 30 points, but i would not be enthusiastic for 100 or more per second). Anyway I am finishing my edit with the pseudocode for the standard deviation solution. – Roman Svistunov Feb 12 '22 at 07:25
  • @devs-love-zenuml Also at this point, it would be nice to know what are languages you are restricted to when implementing this. I guess javascript, but that's only a guess. Knowing the environment would help find suitable optimization libraries and write code that would prepare inputs specifically for them, or, if libraries are not acceptable, we would at least suggest somewhere to read about the algorithms used for quadratic optimizers. Consider adding this information. – Roman Svistunov Feb 12 '22 at 07:50
  • Yes. I am restricted to JavaScript. This code is running in browser. Thanks for asking. – Devs love ZenUML Feb 12 '22 at 07:53
  • @devs-love-zenuml implemented the solution in javascript, which fits all 6 of your requirements. See edit 3. – Roman Svistunov Feb 12 '22 at 09:34
  • Thanks @roman-svistunov. The JavaScript solution seems not working for some cases. For example, `m_d(0,1)=60, m_d(1,2)=120, m_d(1,3)=180, m_d(2,3)=120`, the expected result is `[0, 60, 180, 300]`. The result produced from the solution is `[0, 60, 180, 240]`. – Devs love ZenUML Feb 12 '22 at 10:58
  • @devs-love-zenuml it appears that when the numbers are big the library I use does not converge. I edited the code adding matrix normalization, and now it should be always correct. I manually tested it on more inputs now. – Roman Svistunov Feb 12 '22 at 11:39
  • er, never mind, somehow i broke it – Roman Svistunov Feb 12 '22 at 11:41
  • 1
    @devs-love-zenuml now the problem is fixed, it was in the library, now library setup is slightly more difficult, but at least it should work, see post for details. – Roman Svistunov Feb 12 '22 at 12:09
  • I really appreciate your help and your solutions all work fine. I end up using David's solution is ZenUML product because it has less dependencies. I cannot split the bounty. Thanks a lot. – Devs love ZenUML Feb 18 '22 at 10:18
2

Since you indicated that an approximation would be OK, here’s a simple code that computes approximately the same result as my other answer. The adjacency list version is longer but likely significantly faster on sparse constraint matrices.

// Increase for faster but less accurate results.
const PIXEL = 1;

// Returns a layout where each gap is within PIXEL of optimal.
function layOut(n, m) {
  let positions = Array(n).fill(0);
  let gaps = Array(n).fill(0);
  while (true) {
    // Compute the leftmost layout.
    for (let j = 1; j < n; j++) {
      positions[j] = positions[j - 1] + gaps[j - 1];
      for (let i = 0; i < j; i++) {
        positions[j] = Math.max(positions[j], positions[i] + m[i][j]);
      }
    }
    // Compute the rightmost layout. As we do so, each gap that can grow, grows.
    let stillGrowing = false;
    for (let i = n - 2; i > -1; i--) {
      let proposedGap = gaps[i] + PIXEL;
      // i + 1 is rightmost; i is leftmost.
      let maxGap = positions[i + 1] - positions[i];
      if (proposedGap < maxGap) {
        gaps[i] = proposedGap;
        stillGrowing = true;
      } else {
        gaps[i] = maxGap;
      }
      positions[i] = positions[i + 1] - gaps[i];
      for (let j = n - 1; j > i; j--) {
        positions[i] = Math.min(positions[i], positions[j] - m[i][j]);
      }
    }
    if (!stillGrowing) {
      return positions;
    }
  }
}

// Adjacency list version.
function layOut(n, m) {
  let leftArcs = [];
  let rightArcs = [];
  for (let i = 0; i < n; i++) leftArcs.push([]);
  for (let j = 0; j < n; j++) rightArcs.push([]);
  for (let j = 1; j < n; j++) {
    for (let i = 0; i < j; i++) {
      let x = m[i][j];
      if (x > 0) {
        leftArcs[j].push([i, x]);
        rightArcs[i].push([j, x]);
      }
    }
  }
  let positions = Array(n).fill(0);
  let gaps = Array(n).fill(0);
  while (true) {
    // Compute the leftmost layout.
    for (let j = 1; j < n; j++) {
      positions[j] = positions[j - 1] + gaps[j - 1];
      for (let [i, x] of leftArcs[j]) {
        positions[j] = Math.max(positions[j], positions[i] + x);
      }
    }
    // Compute the rightmost layout. As we do so, each gap that can grow, grows.
    let stillGrowing = false;
    for (let i = n - 2; i > -1; i--) {
      let proposedGap = gaps[i] + PIXEL;
      // i + 1 is rightmost; i is leftmost.
      let maxGap = positions[i + 1] - positions[i];
      if (proposedGap < maxGap) {
        gaps[i] = proposedGap;
        stillGrowing = true;
      } else {
        gaps[i] = maxGap;
      }
      positions[i] = positions[i + 1] - gaps[i];
      for (let [j, x] of rightArcs[i]) {
        positions[i] = Math.min(positions[i], positions[j] - x);
      }
    }
    if (!stillGrowing) {
      return positions;
    }
  }
}

// Various test cases.
console.log(
  layOut(4, [
    [0, 200, 0, 900],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
  ])
);

console.log(
  layOut(5, [
    [0, 0, 200, 0, 0],
    [0, 0, 0, 550, 0],
    [0, 0, 0, 0, 800],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
  ])
);

console.log(
  layOut(4, [
    [0, 0, 200, 0],
    [0, 0, 0, 150],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
  ])
);

console.log(
  layOut(4, [
    [0, 0, 200, 0],
    [0, 0, 0, 200],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
  ])
);

console.log(
  layOut(4, [
    [0, 0, 200, 0],
    [0, 0, 0, 300],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
  ])
);
David Eisenstat
  • 64,237
  • 7
  • 60
  • 120
  • Thanks David. I have just tested this solution. As you said, all tests (with positions bigger than 100) pass. Averagely, without further optimisation it takes 10x longer than your original solution. I have not tested with "adjacency list representation". – Devs love ZenUML Feb 13 '22 at 22:21
  • @DevsloveZenUML depending on how sparse your constraint matrix is, I could easily see this solution making up the difference. – David Eisenstat Feb 15 '22 at 04:20
  • I don't have data on how sparse the matrix is. Imagine a typical sequence diagram. I need to create this matrix to hold the width of `messages` between `participants` AND the width of the labels of the `participants` (in cases, where participant name is long). – Devs love ZenUML Feb 15 '22 at 04:24
  • Also we will not have more than a couple of participants in a sequence diagram. – Devs love ZenUML Feb 15 '22 at 04:39
  • @DevsloveZenUML posted the adjacency list version for comparison. – David Eisenstat Feb 15 '22 at 04:59