How to sort different related units?

Question

I've been given a task in which the user enters some unit relations and we have to sort the units from largest to smallest. What is the best algorithm to do that?

Here are some input/output pairs to clarify the problem:

Input:

km = 1000 m
m = 100 cm
cm = 10 mm

Output:

1km = 1000m = 100000cm = 1000000mm

Input:

km = 100000 cm
km = 1000000 mm
m = 1000 mm

Output:

1km = 1000m = 100000cm = 1000000mm

Input:

B = 8 b
MiB = 1024 KiB
KiB = 1024 B
Mib = 1048576 b
Mib = 1024 Kib

Output:

1MiB = 8Mib = 1024KiB = 8192Kib = 1048576B = 8388608b

Input:

B = 8 b
MiB = 1048576 B
MiB = 1024 KiB
MiB = 8192 Kib
MiB = 8 Mib

Output:

1MiB = 8Mib = 1024KiB = 8192Kib = 1048576B = 8388608b

How can I generate the output from the given input?

This can be seen as a problem in linear algebra. Perhaps a matrix elimination method? — Neil, May 24 '21 at 18:02
Are you sure? Could you clarify? I tried matrix elimination but it wasn't successful; maybe I'm doing it wrong. — Mostafa Solati, May 24 '21 at 18:54
Math.SE would probably know the exact terminology, but it's conservative, and one over the amount goes in the opposite diagonal entry. As long as it has `n-1` off-diagonal terms in each column and has no inconsistent entries, it should be satisfiable. _Eg_, `KiB/B = KiB/MiB * MiB/B = (MiB/KiB)^-1 * MiB/B`. — Neil, May 24 '21 at 20:15
Convert all the units to a single common unit, sort, and then convert back. — duffymo, May 25 '21 at 00:31
Use a proper units library like [Boost.Units](https://www.boost.org/doc/libs/1_76_0/doc/html/boost_units.html) or https://github.com/bernedom/SI, https://github.com/nholthaus/units, https://www.reddit.com/r/cpp_questions/comments/4r6vin/c_library_for_unit_conversion/. They also convert all units to a common unit for internal storage. There is already some unit conversion in [std::chrono::duration](https://en.cppreference.com/w/cpp/chrono/duration). See also [Units of measurement in C++](https://stackoverflow.com/q/21868368/995714) — phuclv, May 25 '21 at 03:01

Neil · Accepted Answer · 2021-05-25T17:31:23.547

Here is my attempt at a graph-based solution. Example 3 is the most interesting (multiple steps and multiple sinks), so I'll use that one.

1. Transform B = n A to edge A -> B and label it n, n > 1. If the result is not a connected DAG, the input is inconsistent.

2. Reduce to a bipartite graph by making multiple connections I -> J -> K skip to I -> K, multiplying the n of I -> J by the n of J -> K. Any conflicting labels are a sign that the problem is inconsistent.

3. The idea of this step is to end up with a single greatest value. Take a vertex P on the left with degree greater than 1 and vertices { Q, R } in the right set, where P -> Q is labelled n1 and P -> R is labelled n2, with 1 < n1 < n2 (WLOG). These can be transformed into P -> R (unchanged) and Q -> R with label n2 / n1, which brings Q (in this case Mib) from right to left.

4. Is the graph bipartite with a single right node? If not, go to 2.

5. Sort the edges.
   X -> Z with n1 ... Y -> Z with n2 becomes 1 Z = n1 X = ... = n2 Y.
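
A simpler variant of the graph idea can be sketched in Python. Note that this is a plain breadth-first traversal over a graph with edges in both directions, not the bipartite reduction described above. The function name `order_units` and the `(big, factor, small)` tuple format (meaning `1 big = factor small`) are assumptions made for illustration; parsing the input text is left out.

```python
from collections import defaultdict, deque
from fractions import Fraction


def order_units(relations):
    """relations: list of (big, factor, small), meaning 1 big = factor small."""
    graph = defaultdict(list)
    for big, factor, small in relations:
        graph[big].append((small, Fraction(factor)))     # 1 big = factor small
        graph[small].append((big, Fraction(1, factor)))  # 1 small = 1/factor big

    # amount[u] = how many u make up one `start` unit (start is arbitrary).
    start = relations[0][0]
    amount = {start: Fraction(1)}
    queue = deque([start])
    while queue:
        unit = queue.popleft()
        for other, factor in graph[unit]:
            value = amount[unit] * factor
            if other not in amount:
                amount[other] = value
                queue.append(other)
            elif amount[other] != value:
                raise ValueError(f"inconsistent relation involving {other}")
    if len(amount) != len(graph):
        raise ValueError("units are not all connected")

    # The biggest unit is the one with the smallest count; rescale so it is 1.
    smallest = min(amount.values())
    scaled = {u: a / smallest for u, a in amount.items()}
    ordered = sorted(scaled.items(), key=lambda item: item[1])
    return " = ".join(f"{n}{u}" for u, n in ordered)


example3 = [("B", 8, "b"), ("MiB", 1024, "KiB"), ("KiB", 1024, "B"),
            ("Mib", 1048576, "b"), ("Mib", 1024, "Kib")]
print(order_units(example3))
```

Exact `Fraction` arithmetic is used so that the consistency check compares ratios without floating-point error.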

OmG · Answer 2 · 2021-05-25T22:11:46.877

You can use the following algorithm:

 1. detect all existing units: `n` units
 2. create an `n x n` matrix `M` whose rows and columns are labelled with 
    the units in the same order. set all elements of the main diagonal of 
    the matrix to `1`.
 3. for each input relation, put the given value into the corresponding row and column. 
 4. put zero in the transposed position of each entry set in step 3.
 5. put `-1` for all other elements
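
Steps 1 to 5 can be sketched in Python as follows. The function name `build_matrix` and the `(big, factor, small)` tuple format (meaning `1 big = factor small`) are assumptions for illustration, and units are indexed in order of first appearance.

```python
def build_matrix(relations):
    """Build the matrix M of steps 1-5 from (big, factor, small) relations."""
    # Step 1: collect the units in order of first appearance.
    units = []
    for big, _, small in relations:
        for unit in (big, small):
            if unit not in units:
                units.append(unit)
    index = {unit: i for i, unit in enumerate(units)}

    # Step 5 (done first as the default): -1 marks an unknown entry.
    n = len(units)
    M = [[-1] * n for _ in range(n)]

    # Step 2: identity on the main diagonal.
    for i in range(n):
        M[i][i] = 1

    # Steps 3 and 4: the given value, and 0 at the transposed position.
    for big, factor, small in relations:
        row, col = index[big], index[small]
        M[row][col] = factor
        M[col][row] = 0
    return units, M


units, M = build_matrix([("km", 1000, "m"), ("m", 100, "cm"), ("cm", 10, "mm")])
for unit, row in zip(units, M):
    print(unit, row)
```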

Now, based on `M` you can easily find the biggest unit:

 5.1 candidate_max <-- columns with only one non-zero positive element 

 not_max <-- []
  
 6. while len(candidate_max) > 1:

    a. take a pair <i, l> from candidate_max and find a column h such 
       that both (i, h) and (l, h) are known, i.e., they are positive. 
       If M[i, h] > M[l, h]:
            remove_item <-- l
        Else:
            remove_item <-- i
        candidate_max.remove(remove_item)
        not_max.append(remove_item)
     b. if no such pair exists, find a pair <i, l>, with i from 
        candidate_max and l from not_max, and a column h with the same property.
        If M[i, h] < M[l, h]:
            candidate_max.remove(i)
            not_max.append(i)
 biggest_unit <-- the only element of candidate_max

Once the biggest unit is found, you can order the others by their values in the `biggest_unit` row.

 7. while there is a `-1` value in row `biggest_unit` at some column `j`, 
    i.e., at `(biggest_unit, j)`:

    a. find a non-identity, strictly positive element at `(k, j)` or 
       `(j, k)` such that `(biggest_unit, k)` is also strictly positive 
       and non-identity. Then calculate the missing value from the two 
       known equivalences: multiply by `(k, j)` or divide by `(j, k)`.

     b. if there is no such `k`, continue the loop with another `-1` 
        element.

 8. sort units based on their column value in `biggest_unit` row in 
    ascending order.

The time complexity of the algorithm is Theta(n^2), where n is the number of units (if you implement the loop in step 6 wisely!).
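
A minimal sketch of steps 7 and 8 in Python, assuming the matrix `M` and the index of the biggest unit are already known. The function name `order_by_biggest` is hypothetical. This naive version rescans the row until nothing changes, so it is not tuned for the Theta(n^2) bound, and it uses 0-based indices rather than the 1-based positions used in the examples.

```python
from fractions import Fraction


def order_by_biggest(units, M, big):
    """Fill row `big` of M (step 7), then sort units by that row (step 8)."""
    n = len(M)
    # Known entries are strictly positive; 0 and -1 are treated as unknown.
    row = [Fraction(v) if v > 0 else None for v in M[big]]

    progress = True
    while progress and None in row:
        progress = False
        for j in range(n):
            if row[j] is not None:
                continue
            for k in range(n):
                # k must be a non-identity column with a known (big, k) value.
                if k in (big, j) or row[k] is None:
                    continue
                if M[k][j] > 0:        # 1 k = M[k][j] j, so multiply
                    row[j] = row[k] * M[k][j]
                elif M[j][k] > 0:      # 1 j = M[j][k] k, so divide
                    row[j] = row[k] / M[j][k]
                else:
                    continue
                progress = True
                break

    if None in row:
        raise ValueError("not enough relations to fill the row")
    # Step 8: ascending order of the values in the biggest unit's row.
    return sorted(zip(units, row), key=lambda pair: pair[1])


# Matrix of Input 2 from this answer; km is at index 0.
units2 = ["km", "m", "cm", "mm"]
M2 = [[1, -1, 100000, 1000000],
      [-1, 1, -1, 1000],
      [0, -1, 1, -1],
      [0, 0, -1, 1]]
ordered2 = order_by_biggest(units2, M2, 0)
print(" = ".join(f"{amount}{unit}" for unit, amount in ordered2))
```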

Example

Input 1

km = 1000 m
m = 100 cm
cm = 10 mm

Solution:

      km   m    cm   mm
    km 1  1000  -1   -1
     m 0    1   100  -1
    cm -1   0    1   10
    mm -1  -1    0    1

M = [1  1000  -1  -1
     0    1   100 -1
    -1    0    1  10
    -1   -1    0   1]

===> 6. `biggest_unit` <--- km (column 1)

7.1 Find the first `-1` in the first row, at column 3: (1,3).
    Find a strictly positive value in row 2, here (2,3) = 100, such that 
    (1,2) is strictly positive and non-identity. So the missing value of 
    `(1,3)` must be `1000 * 100 = 100000`. 

7.2 Find the second `-1` in the first row, at column 4: (1,4).
    Find a strictly positive value in row 3, here (3,4) = 10, such that 
    (1,3) is strictly positive and non-identity. So the missing value of 
    `(1,4)` must be `100000 * 10 = 1000000`.

The loop is finished here and we have:

M = [1  1000  100000  1000000
     0    1     100      -1
    -1    0      1       10
    -1   -1      0        1]

Now you can sort the elements of the first row in ascending order.

Input 2

km = 100000 cm
km = 1000000 mm
m = 1000 mm

Solution:

       km    m    cm     mm
    km  1   -1  100000 1000000
     m -1    1   -1     1000
    cm  0   -1    1      -1
    mm  0    0   -1       1

M = [1   -1  100000 1000000
    -1    1   -1     1000
     0   -1    1      -1
     0    0   -1       1]

===> 

6.1 candidate_max = [1, 2]

6.2 Compare them on column 4 and remove 2

biggest_unit <-- column 1

Going forward to step 7: 
Find the first `-1` in the first row, at column 2: (1,2).
Find a strictly positive, non-identity value in row 2, here (2,4) = 1000, for which (1,4) = 1000000 is known.
So the missing value of `(1,2)` must be `1000000 / 1000 = 1000`.
In sum, we have: 

M = [1  1000  100000 1000000
    -1    1   -1     1000
     0   -1    1      -1
     0    0   -1       1]

Now you can sort the elements of the first row in ascending order (step 8).

@MostafaSolati thanks. Sorry, I don't know any specific name for that. But, you can just call it "OmG's algorithm" ;) — OmG, May 25 '21 at 13:14
I mean where did you get that? :D did you invent it yourself or is it a common practice for solving these kinda issues? — Mostafa Solati, May 25 '21 at 13:35
@MostafaSolati Of course. It's mine : ) unless I should have cited the origin! — OmG, May 25 '21 at 13:56
Unfortunately, your solution doesn't work in all cases, for example, it can't find the biggest unit in example 2. Does it have any workaround? — Mostafa Solati, May 25 '21 at 14:44
Are you sure step 7 is right? How did you calculate 1000 in example 2? Because I can't reproduce it — Mostafa Solati, May 25 '21 at 19:33
@MostafaSolati yes. I've tried to elaborate a little bit more. Please see the update. — OmG, May 25 '21 at 19:46
I think there is a problem: I can't apply the description of step 7 to example 2. You mentioned finding non-identity and non-zero positive elements in column j such that (biggest_unit, k) > 0, but there is no such thing. Could you please recheck your description with example 2? By the way, this algorithm should work in any case. I see you have sorted the units in the matrix in ascending order. Let's try, for example, mm km cm m and see if it works. — Mostafa Solati, May 25 '21 at 21:56
@MostafaSolati Ah! I see. Thanks. You're right. I should have written (j, k) or (k,j). It's updated. — OmG, May 25 '21 at 22:12
