I have two dictionaries:

#1

{
  "reg2/image2:0.1": {
    "binaries": {
      "/bin/xyz": {
        "components": [
          "d",
          "aa",
          "new",
          "git.example.com/wayform-chassis/xyzserver-v0.25.3"
        ],
        "md5": "8bf54c95f9"
      }
    },
    "loose_packages": []
  }
}

#2

{
  "reg2/image2:0.2": {
    "binaries": {
      "/bin/xyz": {
        "components": [
          "c",
          "git.example.com/wayform-chassis/xyzserver-v0.25.2",
          "aa",
          "asdhkjahsd"
        ],
        "md5": "f78f65f31"
      }
    },
    "loose_packages": ["test package"]
  }
}

With the following command: ddiff = deepdiff.DeepDiff(data_a, data_b, ignore_order=True, get_deep_distance=True)

My result is:

    reg2/image2: {
  "values_changed": {
    "root['binaries']['/bin/xyz']['md5']": {
      "new_value": "8bf54c95f9",
      "old_value": "f78f65f31"
    },
    "root['binaries']['/bin/xyz']['components'][3]": {
      "new_value": "git.example.com/wayform-chassis/xyzserver-v0.25.3",
      "old_value": "asdhkjahsd"
    },
    "root['binaries']['/bin/xyz']['components'][0]": {
      "new_value": "d",
      "old_value": "c"
    }
  },
  "iterable_item_added": {
    "root['binaries']['/bin/xyz']['components'][2]": "new"
  },
  "iterable_item_removed": {
    "root['binaries']['/bin/xyz']['components'][1]": "git.example.com/wayform-chassis/xyzserver-v0.25.2",
    "root['loose_packages'][0]": "test package"
  },
  "deep_distance": 0.25806451612903225
}

This is incorrect, or at least not how I want to present the result. It should not report that a value changed based on its position in the list. See how it says:

"root['binaries']['/bin/xyz']['components'][3]": {
  "new_value": "git.example.com/wayform-chassis/xyzserver-v0.25.3",
  "old_value": "asdhkjahsd"

I should instead see:

"root['binaries']['/bin/xyz']['components'][3]": {
  "new_value": "git.example.com/wayform-chassis/xyzserver-v0.25.3",
  "old_value": "git.example.com/wayform-chassis/xyzserver-v0.25.2"

Or at least:

"iterable_item_added": {
    "root['binaries']['/bin/xyz']['components'][3]": "git.example.com/wayform-chassis/xyzserver-v0.25.3"

"iterable_item_removed": {
    "root['binaries']['/bin/xyz']['components'][1]": "git.example.com/wayform-chassis/xyzserver-v0.25.2"

As you can see, it's saying that the value at that position changed. I thought the ignore_order argument was supposed to do just that: ignore the order and report only the actual differences?

If that is not achievable with deepdiff, what is a better way to get the desired result?
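One possible approach that avoids index-based paths altogether is to compare the `components` lists as sets, one binary at a time. The following is only a minimal sketch, not part of deepdiff: the function name `diff_components` and the variable names `before` and `after` are hypothetical, and it assumes that components are plain strings and that duplicates within a list do not matter. Following the output above, the 0.2 structure is treated as the old side.

```python
def diff_components(before, after):
    """Report components added to or removed from each binary, ignoring order.

    `before` and `after` are the inner image dicts (the ones that hold the
    "binaries" key). Assumes components are hashable (plain strings here).
    """
    report = {}
    for path in sorted(set(before["binaries"]) | set(after["binaries"])):
        old = set(before["binaries"].get(path, {}).get("components", []))
        new = set(after["binaries"].get(path, {}).get("components", []))
        if old != new:
            report[path] = {
                "added": sorted(new - old),
                "removed": sorted(old - new),
            }
    return report


# Trimmed copies of the two structures from the question.
before = {"binaries": {"/bin/xyz": {"components": [
    "c", "git.example.com/wayform-chassis/xyzserver-v0.25.2", "aa", "asdhkjahsd"]}}}
after = {"binaries": {"/bin/xyz": {"components": [
    "d", "aa", "new", "git.example.com/wayform-chassis/xyzserver-v0.25.3"]}}}

report = diff_components(before, after)
print(report)
```

Here `"aa"` is present on both sides, so it is not reported at all; everything else shows up as either added or removed. Set operations are roughly linear in the number of components, so this should stay cheap even for thousands of binaries.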

DarkSand
  • Those two strings are different. Even if it's only the last character, they're not the same string, so the ignore order algorithm won't recognize that it got both moved and modified, it just sees different strings. You would need to add fuzzy matching to get the output you're expecting, and that's a whole different story. – joanis Jun 28 '23 at 00:13
  • @joanis I know they are different strings, but that's what deepdiff does: it tells me what changed between two dicts. So you're saying that if an item is in a different place in the second dict, it will tell me what's in its place rather than say iterable removed, iterable added? – DarkSand Jun 28 '23 at 02:49
  • In your example, there's only `"aa"` in common inside `components` between your two structures, and the algorithm correctly recognized that it's still there even though it moved. I don't understand why you think the output is wrong. The output you shared just makes perfect sense to me. Where `aa` was, something else was added, where `aa` now is, something else was removed, but where a string was removed and a different string was added, that's just one operation changing oldvalue to newvalue. – joanis Jun 28 '23 at 03:15
  • The way I see it, `ignore_order` just makes the algorithm consider that when the string is still there but somewhere else, that's not a change. – joanis Jun 28 '23 at 03:18
  • However, I do understand that your use case is different, at least in how you want to present the information. But you could go through your `ddiff` result and replace each element of `values_changed` to a pair of elements, one to append to `added` and one to `removed`, since you want the same information, just presented differently, I think. – joanis Jun 28 '23 at 03:22
  • I see what you are saying, and yes, my use case is: given two different dicts, compare them and tell me what changed, ignoring the placement of objects inside. The examples given are just test examples; in reality it's 10000-line dicts with thousands of binaries and components, so I'm trying to figure out a way to present the result accurately by saying what was removed and what was added. You present a good idea with doing more work on the result; I hadn't thought of that. Funnily enough, if the order stays the same, ddiff gives me the answer I'm looking for, but with two dicts in a different order it doesn't. – DarkSand Jun 28 '23 at 04:09
  • You mention replacing each element of values_changed, but for an example with 10000 lines that might be time-consuming, and I'm relatively new to programming. How would I implement what you are saying? If that's too vague I understand, and thank you for your guidance. – DarkSand Jun 28 '23 at 04:12
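The post-processing suggested in the comments (turning each `values_changed` entry into one removal plus one addition) could be sketched roughly as follows. This is an illustration under assumptions, not deepdiff functionality: the helper name `to_added_removed` is hypothetical, and it assumes the result can be read like a plain dict with the keys shown in the question. It returns lists of `(path, value)` pairs instead of dicts, so that a path appearing in both `values_changed` and `iterable_item_added` cannot overwrite another entry.

```python
def to_added_removed(diff):
    """Flatten a DeepDiff-style result into (added, removed) lists.

    Each 'values_changed' entry contributes its old value to `removed`
    and its new value to `added`. Other keys (e.g. 'deep_distance') are ignored.
    """
    added, removed = [], []
    for path, change in diff.get("values_changed", {}).items():
        removed.append((path, change["old_value"]))
        added.append((path, change["new_value"]))
    for path, value in diff.get("iterable_item_added", {}).items():
        added.append((path, value))
    for path, value in diff.get("iterable_item_removed", {}).items():
        removed.append((path, value))
    return added, removed


# The result from the question, copied by hand as a plain dict.
sample_diff = {
    "values_changed": {
        "root['binaries']['/bin/xyz']['md5']": {
            "new_value": "8bf54c95f9", "old_value": "f78f65f31"},
        "root['binaries']['/bin/xyz']['components'][3]": {
            "new_value": "git.example.com/wayform-chassis/xyzserver-v0.25.3",
            "old_value": "asdhkjahsd"},
        "root['binaries']['/bin/xyz']['components'][0]": {
            "new_value": "d", "old_value": "c"},
    },
    "iterable_item_added": {
        "root['binaries']['/bin/xyz']['components'][2]": "new",
    },
    "iterable_item_removed": {
        "root['binaries']['/bin/xyz']['components'][1]":
            "git.example.com/wayform-chassis/xyzserver-v0.25.2",
        "root['loose_packages'][0]": "test package",
    },
    "deep_distance": 0.25806451612903225,
}

added, removed = to_added_removed(sample_diff)
for path, value in added:
    print("added  ", path, value)
for path, value in removed:
    print("removed", path, value)
```

The loop visits each entry of the diff once, so the cost grows with the number of reported differences rather than with the size of the original dicts. Note that the paths attached to former `values_changed` entries are only positions where deepdiff paired two values; with `ignore_order=True` they are probably best treated as hints about where to look, not as exact locations.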
