
In my application I check user-entered URLs for malware by sending them to Google.

To test the "malware found" response, I used the URL http://malware.testing.google.test/testing/malware

To my surprise, this URL was not marked as malware.

While experimenting, I found that when I add a trailing slash, it does get picked up as malware.

The documentation says that URLs need to be canonicalized.

Does anyone know of an implementation of this requirement (preferably in C#)?

Sjors Miltenburg
  • Just want to know if you found a C# implementation? The Java implementation in my answer needed a little work, but it now passes the Google test suite. – ForguesR Dec 22 '14 at 16:26
  • I have switched projects, so my quest to fix this issue has been put on hold for now. I (or my colleague) will return to this problem eventually, so if you have a working C# solution I am interested! Thanks for your answer; once I have time to look into this I will mark it as the answer. – Sjors Miltenburg Dec 22 '14 at 20:44

2 Answers


I am working on the same problem right now, and the only thing I have found is a Java implementation in the jGoogleSafeBrowsing library. Unfortunately, it is stuck on v2 of the API.

Anyhow, you can have a look at the canonicalization code here. Be aware that:

ForguesR
  • Update: there is the [Gsb4j](https://github.com/bazi/gsb4j) library, which is a Java client implementation of Google Safe Browsing **API v4**. It has canonicalization code with a [test class](https://github.com/bazi/gsb4j/blob/gsb4j-1.0.3/core/src/test/java/kg/net/bazi/gsb4j/url/CanonicalizationTest.java) that passes all test cases provided in the docs. Disclosure: I am the author of Gsb4j, open-sourced in 2018 (that's why this is a very late comment). – Bazi Jan 10 '19 at 22:59

Using the link ForguesR provided I have created this C# implementation.

It passes 26 of the 33 tests from the Google test suite found at: https://developers.google.com/safe-browsing/developers_guide_v3#Canonicalization

It has been deemed good enough for production, since it only misses the more obscure URLs.

Code: https://dotnetfiddle.net/xO9sWl
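For readers who only want the general shape of the algorithm, here is a rough Python sketch of the main canonicalization steps as I understand them from the Safe Browsing documentation. It is not the C# code linked above, and the names `canonicalize`, `_unescape_fully` and `_escape` are made up for this illustration. It assumes an http-style URL and deliberately skips the IP-address normalization, whitespace trimming and raw non-UTF-8 byte handling that the full test suite covers, so treat it as a partial sketch rather than a complete implementation.

```python
import re

_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*://")
_PERCENT = re.compile(r"%([0-9A-Fa-f]{2})")


def _unescape_fully(s):
    """Percent-decode repeatedly until the string stops changing."""
    while True:
        decoded = _PERCENT.sub(lambda m: chr(int(m.group(1), 16)), s)
        if decoded == s:
            return s
        s = decoded


def _escape(s):
    """Escape bytes <= 32, >= 127, '#' and '%' using uppercase hex."""
    return "".join(
        "%{:02X}".format(ord(c)) if ord(c) <= 32 or ord(c) >= 127 or c in "#%" else c
        for c in s
    )


def canonicalize(url):
    # Work byte-wise: each character of `s` stands for one UTF-8 byte.
    s = url.encode("utf-8").decode("latin-1")
    s = re.sub(r"[\t\r\n]", "", s)      # drop tab, CR and LF characters
    s = s.split("#", 1)[0]              # drop the fragment
    if not _SCHEME.match(s):            # assumption: default to http
        s = "http://" + s
    s = _unescape_fully(s)

    scheme, rest = s.split("://", 1)
    before_query, qmark, query = rest.partition("?")
    host, _, path = before_query.partition("/")
    path = "/" + path

    # Host: strip leading/trailing dots, collapse dot runs, lowercase.
    # (IP-address normalization is intentionally not handled here.)
    host = re.sub(r"\.{2,}", ".", host.strip(".")).lower()

    # Path: resolve "." and "..", collapse runs of slashes.
    segments = []
    for seg in path.split("/"):
        if seg == "..":
            if segments:
                segments.pop()
        elif seg not in ("", "."):
            segments.append(seg)
    new_path = "/" + "/".join(segments)
    if segments and path.endswith(("/", "/.", "/..")):
        new_path += "/"

    # The query string is kept as-is apart from the final escaping.
    return _escape(scheme + "://" + host + new_path + qmark + query)


print(canonicalize("http://www.GOOgle.com.../blah/.."))
print(canonicalize("http://malware.testing.google.test/testing/malware"))
```

With this sketch, the test URL from the question comes out unchanged, so at least under these assumptions canonicalization by itself does not add the trailing slash.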

Antoon Meijer