I want to test Shodan data. The data includes fields such as the timestamp, crawler ID, and server OS, which change with every request. How should I test them?

Shodan JSON data:

{
    "city": "Mountain View",
    "region_code": "CA",
    "os": null,
    "tags": [],
    "ip": 134744072,
    "isp": "Google",
    "area_code": 650,
    "dma_code": 807,
    "last_update": "2017-03-04T13:54:57.176297",
    "country_code3": "USA",
    "country_name": "United States",
    "hostnames": [
        "google-public-dns-a.google.com"
    ],
    "postal_code": "94035",
    "longitude": -122.0838,
    "country_code": "US",
    "ip_str": "8.8.8.8",
    "latitude": 37.385999999999996,
    "org": "Google",
    "data": [
        {
            "_shodan": {
                "options": {},
                "id": null,
                "module": "dns-udp",
                "crawler": "122dd688b363c3b45b0e7582622da1e725444808"
            },
            "hash": -553166942,
            "os": null,
            "opts": {},
            "ip": 134744072,
            "isp": "Google",
            "port": 53,
            "hostnames": [
                "google-public-dns-a.google.com"
            ],
            "location": {
                "city": "Mountain View",
                "region_code": "CA",
                "area_code": 650,
                "longitude": -122.0838,
                "country_code3": "USA",
                "country_name": "United States",
                "postal_code": "94035",
                "dma_code": 807,
                "country_code": "US",
                "latitude": 37.385999999999996
            },
            "timestamp": "2017-03-04T13:54:57.176297",
            "domains": [
                "google.com"
            ],
            "org": "Google",
            "data": "\nRecursion: enabled",
            "asn": "AS15169",
            "transport": "udp",
            "ip_str": "8.8.8.8"
        }
    ],
    "asn": "AS15169",
    "ports": [
        53
    ]
}

My test file:

def test_shodan_api():
    assert shodan_data == ???
AlG
nigella

1 Answer

I assume you want to compare the data you actually receive with canned data, and that the comparison fails because some parts (the timestamps) differ on every call, so the complete data never exactly matches the canned data.

I suggest removing the timestamps from both the canned data and the received data and comparing the rest:

del received_data['last_update']
del canned_data['last_update']  # you probably want to do this prior to canning the data ;-)

assert_equal(received_data, canned_data)
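
Note that the sample in the question also carries changing values deeper in the structure (for example `data[0]["timestamp"]` and `data[0]["_shodan"]["crawler"]`), which a top-level `del` does not reach. A minimal sketch of the same blacklisting idea applied at every nesting level follows. The helper name `strip_volatile` and the contents of `VOLATILE_KEYS` are assumptions for illustration, based on the fields the question describes as changing; adjust the set to whatever actually varies in your data.

```python
# Assumed set of keys that change between requests; not an official Shodan list.
VOLATILE_KEYS = {"last_update", "timestamp", "crawler", "os"}


def strip_volatile(value, volatile=VOLATILE_KEYS):
    """Return a copy of value with volatile keys removed at every nesting level."""
    if isinstance(value, dict):
        return {
            key: strip_volatile(item, volatile)
            for key, item in value.items()
            if key not in volatile
        }
    if isinstance(value, list):
        return [strip_volatile(item, volatile) for item in value]
    # Scalars (str, int, float, None, ...) are returned unchanged.
    return value


def assert_same_stable_data(received_data, canned_data):
    """Compare two responses while ignoring the volatile keys."""
    assert strip_volatile(received_data) == strip_volatile(canned_data)
```

Because the helper builds new dictionaries and lists instead of deleting in place, the original `received_data` and `canned_data` stay intact for other assertions.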
Alfe
  • Well, thanks for your time! I had thought of that; I was wondering whether there was a different solution. – nigella Mar 06 '17 at 13:06
  • Of course, lots of options available. While this resembles blacklisting (you remove parts you do not want) you could also use whitelisting: Compare only some specific values: `for key in [ '...', '...', ...]: assert_equal(received_data[key], canned_data[key])` – Alfe Mar 06 '17 at 13:08
  • Btw, since you seem to be new here, welcome to StackOverflow! If you find an answer valuable, you are free to upvote it (the upper triangle left of the answer). If you think an answer solved your issue, feel free to accept it (the grey checkmark left of the answer). If an answer is not helpful, feel free to downvote it (the lower triangle). Specific "thank you" comments are normally not necessary and are generally discouraged, to keep the posts small. – Alfe Mar 06 '17 at 13:10
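
The whitelisting variant mentioned in the comments can be sketched as follows: pick only the fields expected to stay stable and compare those. The helper name `pick` and the choice of `STABLE_KEYS` are assumptions for illustration; the keys themselves are taken from the sample response in the question, and the two dictionaries are trimmed-down stand-ins for a real response and a canned one.

```python
# Assumed list of top-level fields expected to stay the same between requests.
STABLE_KEYS = ["ip_str", "asn", "org", "ports", "country_code"]


def pick(data, keys):
    """Return a new dict holding only the listed keys (KeyError if one is missing)."""
    return {key: data[key] for key in keys}


# Trimmed-down stand-ins for a live response and the canned expectation.
received_data = {
    "ip_str": "8.8.8.8",
    "asn": "AS15169",
    "org": "Google",
    "ports": [53],
    "country_code": "US",
    "last_update": "2017-03-05T09:12:41.000000",
}
canned_data = {
    "ip_str": "8.8.8.8",
    "asn": "AS15169",
    "org": "Google",
    "ports": [53],
    "country_code": "US",
    "last_update": "2017-03-04T13:54:57.176297",
}


def test_shodan_stable_fields():
    # Only the whitelisted fields are compared; last_update is ignored.
    assert pick(received_data, STABLE_KEYS) == pick(canned_data, STABLE_KEYS)
```

Comparing two small dictionaries in one assertion, rather than looping over keys, lets a test runner such as pytest show every differing field at once.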