
Edit1: Changed the question to fit the guidelines. I already asked this question, but it was taken down for being too vague, so I rewrote it to make it clearer.

My goal is to compare around 6000 numbers with each other and find the matching/closest pairs. No number is exactly equal to another; each one has an error/tolerance.

My starting condition is two data blocks saved in two arrays. Data block [A] holds theoretically calculated numbers. Data block [B] holds real measured numbers. [B] contains the real measurements of an object and should be as close as possible to [A], apart from measurement error. For each B_n, I have to find the matching A_m.

Every number in [A] is higher than the previous one by a value of X or Y. Both X and Y are possible, and therefore a “tree”-like structure is displayed afterwards (not important for the comparison process). Every number in [B] has a measurement error, and every “addition” of X and Y has a 1.1% chance of being 1% off. This means that in higher ranges the offset from the theoretical numbers in [A] gets larger. The closest fit of B_n to a number in [A] (A_m) should be linked, highlighted, or otherwise remembered.

My idea is to calculate the difference/offset between one number of [B] (B_n) and every number of [A]. The pair with the lowest offset should be remembered, and the offset should be saved as well.
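A minimal sketch of that idea in C, comparing every B_n against every A_m. The function name `match_brute_force`, the helper `abs_diff`, and the output arrays `match` and `offset` are hypothetical names chosen for illustration. It assumes "lowest offset" means the smallest absolute difference, and it stores the signed offset B_n - A_m for each pair.

```c
#include <assert.h>
#include <stddef.h>

/* Absolute difference, written out so the sketch does not depend on libm. */
static double abs_diff(double x, double y)
{
    return x > y ? x - y : y - x;
}

/* Hypothetical helper: for each b[n], store in match[n] the index m of the
   closest a[m], and in offset[n] the signed offset b[n] - a[m]. */
void match_brute_force(const double *a, size_t a_count,
                       const double *b, size_t b_count,
                       size_t *match, double *offset)
{
    assert(a_count > 0);
    for (size_t n = 0; n < b_count; n++) {
        size_t best = 0;                  /* index of closest a[] so far */
        for (size_t m = 1; m < a_count; m++) {
            if (abs_diff(b[n], a[m]) < abs_diff(b[n], a[best]))
                best = m;
        }
        match[n] = best;
        offset[n] = b[n] - a[best];
    }
}
```

With two arrays of about 3000 entries each, this is roughly 9 million comparisons, which should be fast enough on most machines. Note that nothing here prevents several B_n from being matched to the same A_m; the sketch does not try to resolve that.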

This would look like this (still not code, just a concept):

z=1
loop {
    offset1 = B_n - A_m
    offset2 = B_n - A_(m+z)
    is offset1 lower than offset2?
       Yes: z=z+1;
       No:  offset1 = offset2;
            memorize m
}

Is there a function that does this on its own, does something similar, or does it differently with the same result, i.e. compares numbers and returns the closest pair? Is this method valid to use?
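For context, a sketch of one possible alternative to scanning all of [A]: a binary search for the closest value. As far as I know, the C standard library has no ready-made closest-value function; `bsearch` only reports exact matches. The sketch below assumes the [A] values have been sorted in ascending order (for example with `qsort`), which is an assumption and not something stated above. The name `closest_sorted` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: index of the element of sorted[] (ascending order)
   that is closest to value. On a tie, the lower index is returned. */
size_t closest_sorted(const double *sorted, size_t count, double value)
{
    assert(count > 0);
    size_t lo = 0, hi = count;            /* search window [lo, hi) */
    while (lo < hi) {                     /* find first element >= value */
        size_t mid = lo + (hi - lo) / 2;
        if (sorted[mid] < value)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == 0)
        return 0;                         /* value is below every element */
    if (lo == count)
        return count - 1;                 /* value is above every element */
    /* The closest element is one of the two neighbours of the insertion point. */
    return (value - sorted[lo - 1] <= sorted[lo] - value) ? lo - 1 : lo;
}
```

Calling this once per B_n takes about log2(3000), roughly 12, comparisons per lookup instead of 3000. Like a full scan, it can still match several B_n to the same A_m.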

Side information: programming language: C; data blocks: two arrays of length 3000 or more; number range: ~100.000001 up to ~100,001.000001.

Schausi
  • I'm pretty sure that your question is considered too broad for SO. However, if I understand your problem correctly, the [Hungarian method](https://en.wikipedia.org/wiki/Hungarian_algorithm) might be what you're looking for. You feed it a cost matrix, and it will spit out the minimal-cost solution in polynomial time (i.e. it will find the best possible match for your two datasets) – Felix G Aug 05 '20 at 10:35
  • Your question starts off interestingly, but then qualifies itself for [«needs more focus», possibly also «opinion-based»](https://stackoverflow.com/help/dont-ask). The last two paragraphs especially contribute to that. You should rewrite those to better fit the format of Stack Overflow. – Andreas is moving to Codidact Aug 05 '20 at 10:45
  • Note to readers: OP previously asked [this](https://stackoverflow.com/q/63252539/298225). – Eric Postpischil Aug 05 '20 at 10:50
  • `My starting condition is two data blocks saved in two arrays are` Please show some code. Is it `int array1[??]; int array2[??]`? `This would look like this:` Could you please rewrite it as real code? Please show placeholder code that you have tried; it will show what you attempted and how it failed, and make others' work easier. Are you really looking for [R^2](https://en.wikipedia.org/wiki/Coefficient_of_determination)? `find the matching/closest pair` That should be easy: just calculate the differences and find the smallest. Did you succeed in doing that? – KamilCuk Aug 05 '20 at 11:38
  • `My goal is to compare around 6000 numbers with each other and find the matching/closest pair. But no number is equal to the other and has an error/tolerance` _You_ are presenting the requirements for _your_ code here, yet I do not think you have a clear vision of what you want. You seem to be searching for some statistical measurement. Do you want to compare numbers, or find the closest pair? How do you calculate the distance between pairs? How is the pseudocode you presented related to the problem? What is "memorize"? What is `z=1`? What does `offset` represent? – KamilCuk Aug 05 '20 at 11:41
  • @KamilCuk: Code is not required and not very useful for this question. The fundamental question here is one of algorithm and not of specific data manipulations. Code can help describe and discuss algorithms, but good pseudocode would suffice; we do not need to see “place holder code” or “what you have tried.” – Eric Postpischil Aug 05 '20 at 11:42
  • Tell us why sorting A and sorting B and then matching A[i] to B[i] would not be a solution. – Eric Postpischil Aug 05 '20 at 15:35
  • The pseudo-code in the question is not comprehensible. `n` and `m` are never changed in it. `memorize m` does not mean much; we would end up with a list of memorized numbers with no apparent meaning. Is it intended to memorize pairings of `n` with `m`? The comparison of offsets uses no absolute value, so, for any `B_n`, it would always select the greatest `A_i`, not the closest. If it is attempting to find, for each `B_n`, the closest `A_m`, that will, in general, result in various many-to-1 matches where several `B_n` are matched to one `A_m`, and the pseudo-code shows no resolution for that. – Eric Postpischil Aug 05 '20 at 15:38

0 Answers