
I am scraping websites for information, which involves extracting the SHA-1 info hashes from magnet links.

I get all the magnet links with a simple preg_match_all, but some of the results are odd. I understand that a magnet hash in hexadecimal form is 40 characters long, but I am also getting strings that are 32 characters long and contain non-hexadecimal characters.

Here are two examples from my results. First, a normal 40-character hexadecimal hash from a magnet link:

array
    0 => string 'F5AD2D170C033736FD987106F04C3ABD6DF41D14' (length=40)

And second, one of the odd results that I do not understand, where the hash is a 32-character non-hexadecimal value:

array
    0 => string 'VPR33QQM3L6BFU5FGOZXMBNORAFFSZWW' (length=32)

Has the hash been packed in some way? I know it was not produced with pack('H*', $hash), as that returns the raw binary of the hash. The magnet links do work; I have tested them.

You can also see these hashes in use on this website:

http://eztv.it

Hover over the magnet links and look at the magnet hash.

Thanks

Griff
  • That's other magnet information, your regex (which is a terrible way to parse an html page) must also be grabbing tracker information. – John V. Jun 09 '12 at 14:35
  • @AlexLunix I am not asking about the other information. I am asking about the hash in the second example, which has a length of 32. – Griff Jun 09 '12 at 14:38

2 Answers


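Hashes in magnet links can be encoded using Base32 instead of hexadecimal. A SHA-1 digest is 160 bits, which is 40 hexadecimal characters or 32 Base32 characters. In your example,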

VPR33QQM3L6BFU5FGOZXMBNORAFFSZWW

turns into

ABE3BDC20CDAFC12D3A533B37605AE880A5966D6

which is a valid SHA-1 hash.
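The conversion above can be reproduced with a short script. What follows is a minimal sketch in Python rather than PHP, chosen because Python's standard library includes a Base32 decoder. The names `BTIH_RE` and `normalize_info_hash` are made up for illustration, and the regular expression assumes the hash follows `xt=urn:btih:` directly.

```python
import base64
import re

# Hypothetical pattern: the info hash after "xt=urn:btih:" is either
# 40 hexadecimal characters or 32 Base32 characters (A-Z and 2-7).
BTIH_RE = re.compile(
    r"xt=urn:btih:([0-9A-Fa-f]{40}|[A-Za-z2-7]{32})(?![0-9A-Za-z])"
)


def normalize_info_hash(value: str) -> str:
    """Return an info hash as 40 uppercase hexadecimal characters."""
    if len(value) == 40:
        bytes.fromhex(value)  # raises ValueError if the string is not hex
        return value.upper()
    if len(value) == 32:
        # 32 Base32 characters carry 160 bits, i.e. the 20 raw SHA-1 bytes.
        raw = base64.b32decode(value.upper())
        return raw.hex().upper()
    raise ValueError("expected 40 hex or 32 Base32 characters")


magnet = "magnet:?xt=urn:btih:VPR33QQM3L6BFU5FGOZXMBNORAFFSZWW"
match = BTIH_RE.search(magnet)
if match:
    print(normalize_info_hash(match.group(1)))
    # prints ABE3BDC20CDAFC12D3A533B37605AE880A5966D6
```

Normalizing every scraped hash to one form (hex here) makes it easier to compare or deduplicate results that arrive in either encoding.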

Chris

Basically, it's not a valid torrent info hash. Only SHA-1 hashes (40 hexadecimal characters) are valid, and a torrent client or bencode script would fail if you passed that string as a torrent hash.

It seems it's related to:

http://eztv.it/magnet:?xt=urn:btih:VPR33QQM3L6BFU5FGOZXMBNORAFFSZWW which is nothing.

Lawrence Cherone
  • You should also add `zoink.it` to your cache search when doing hash to torrent file :) I like your script :)! When I use this link in uTorrent, though, it does start downloading. Why is that? – Griff Jun 09 '12 at 14:46