I'm trying to connect to Grooveshark, and Python is my language of choice, but I have hit a brick wall. It seems that Grooveshark recently changed part of their protocol, or I might have hit a limitation of Python.
I am working "together" with JackTheRipper51 from GitHub, who made this library for Grooveshark: https://github.com/jacktheripper51/groove-dl. It's not actually a library, but I quickly recoded it to be one.
Earlier this week it worked fine, and I was able to use it for my project. But 2 days ago it started failing in the getToken function: httplib started raising httplib.BadStatusLine: '', which from my research means that the server closed the connection early, before sending any response.
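To illustrate what that error means, here is a minimal local sketch (not Grooveshark's behaviour, just a stand-in): a throwaway server on 127.0.0.1 reads the request and hangs up without answering, and httplib reports that as BadStatusLine. The helper names `serve_once_and_hang_up` and `provoke_bad_status_line` are made up for this example; on Python 3 the module is called `http.client` and the exception raised is a subclass of BadStatusLine.

```python
import socket
import threading

try:
    import httplib                    # Python 2, as used in this post
except ImportError:
    import http.client as httplib     # Python 3 name for the same module


def serve_once_and_hang_up(listener):
    """Accept one connection, read the request headers, then close silently."""
    conn, _ = listener.accept()
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(4096)
        if not chunk:
            break
        data += chunk
    conn.close()                      # no status line is ever sent


def provoke_bad_status_line():
    """Return the exception httplib raises when the server just hangs up."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    worker = threading.Thread(target=serve_once_and_hang_up, args=(listener,))
    worker.start()

    client = httplib.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        client.request("GET", "/")
        client.getresponse()          # reads an empty status line and raises
    except httplib.BadStatusLine as exc:
        return exc
    finally:
        client.close()
        worker.join()
        listener.close()
    return None


if __name__ == "__main__":
    print(repr(provoke_bad_status_line()))
```

So the exception by itself only says that the connection was closed before a response arrived; it does not say why the server (or something in between) closed it.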
From this research I started looking at the JavaScript and Flash source of Grooveshark, but that didn't return anything of value. So I did what any sane person who had spent 5 hours looking at decompiled ActionScript without ever having coded a line in the stuff before would do, and blamed it on Grooveshark's servers.
Specifically, I figured that Grooveshark might deny connections that feature the Connection: close header. I therefore decided to test it in the REST Console extension for Chrome.
I made the Python script dump the JSON it was encoding, pasted that into REST Console, hit POST, and it returned fine, with the expected data. I was now certain that it wasn't impossible that I was right.
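For anyone who wants to repeat that step, this is roughly how the body can be dumped for pasting into a REST client. It is a reduced sketch: `build_token_request` is a hypothetical helper, the field names mirror the getToken() code in this post, and the real staticHeader carries more fields (country, uuid and so on) than shown here.

```python
import hashlib
import json


def build_token_request(session_id):
    """Build a getCommunicationToken body as a JSON string (hypothetical helper)."""
    # secretKey is the MD5 hex digest of the session id, as in getToken()
    secret_key = hashlib.md5(session_id.encode("utf-8")).hexdigest()
    post = {
        "header": {
            "client": "htmlshark",
            "clientRevision": "20120312",
            "session": session_id,
        },
        "method": "getCommunicationToken",
        "parameters": {"secretKey": secret_key},
    }
    # sort_keys and indent only make the dump easier to read and compare
    return json.dumps(post, sort_keys=True, indent=2)


if __name__ == "__main__":
    # placeholder session id, not a real one
    print(build_token_request("0123456789abcdef0123456789abcdef"))
```

The printed text can be pasted as the raw POST body with Content-Type set to application/json.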
My next step was to recode it with httplib2 (as that supports Connection: keep-alive), which I have done, but the problem persists.
I have tested in Wireshark (after removing the SSL by switching https to http): it does send Connection: keep-alive, and this causes Grooveshark to respond, but with "https required".
I have only modified small parts of the code.
I completely changed getToken():
def getToken():
    global staticHeader, _token
    post = {}
    post["parameters"] = {}
    post["parameters"]["secretKey"] = hashlib.md5(staticHeader["session"]).hexdigest()
    post["method"] = "getCommunicationToken"
    post["header"] = staticHeader
    post["header"]["client"] = "htmlshark"
    post["header"]["clientRevision"] = "20120312"
    header = {"User-Agent": _useragent, "Referer": _referer, "Content-Type":"application/json", "Cookie":"PHPSESSID=" + staticHeader["session"], "Connection":"keep-alive"}
    response, content = http.request("https://grooveshark.com/more.php?getCommunicationToken", "POST" ,body = json.JSONEncoder().encode(post), headers = header)
    print response
    #_token = json.JSONDecoder().decode(gzip.GzipFile(fileobj=(StringIO.StringIO(conn.getresponse().read()))).read())["result"]
    #print _token
I added the httplib2 initialization:
http = httplib2.Http()
I imported httplib2:
import httplib, httplib2
I also renamed the JSON constructors, simply because I wanted them to be more descriptive.
The full traceback is:
Traceback (most recent call last):
  File "C:\Users\Delusional Logic\Documents\GitHub\groove-dl\python\groove.py", line 141, in <module>
    getToken()
  File "C:\Users\Delusional Logic\Documents\GitHub\groove-dl\python\groove.py", line 51, in getToken
    response, content = http.request("https://grooveshark.com/more.php?getCommunicationToken", "POST" ,body = json.JSONEncoder().encode(post), headers = header)
  File "C:\Python27\lib\site-packages\httplib2-0.7.4-py2.7.egg\httplib2\__init__.py", line 1544, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
  File "C:\Python27\lib\site-packages\httplib2-0.7.4-py2.7.egg\httplib2\__init__.py", line 1294, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "C:\Python27\lib\site-packages\httplib2-0.7.4-py2.7.egg\httplib2\__init__.py", line 1264, in _conn_request
    response = conn.getresponse()
  File "C:\Python27\lib\httplib.py", line 1027, in getresponse
    response.begin()
  File "C:\Python27\lib\httplib.py", line 407, in begin
    version, status, reason = self._read_status()
  File "C:\Python27\lib\httplib.py", line 371, in _read_status
    raise BadStatusLine(line)
httplib.BadStatusLine: ''
What is causing the BadStatusLine, and how can I fix it?
PS: I know for a fact they had an 8 hour meeting the day before this broke. I bet you this was on the agenda.
UPDATE: JackTheRipper51 has informed me that this happens with all SSL requests to grooveshark.com/more.php, no matter what you send. This makes me believe that it's Python playing tricks on us.
UPDATE 2:
JackTheRipper51 just informed me that it is indeed Python. Here's his post:
I didn't need C at all. Prepare to be outraged. A simple
curl -H "Content-Type: text/plain" -d "@jsontest" "https://grooveshark.com/more.php?getCommunicationToken" -v
on a linux box got me a token... jsontest here being
{"header":{"client":"mobileshark","clientRevision":"20120227","privacy":0,"country":{"ID":63,"CC1":4611686018427388000,"CC2":0,"CC3":0,"CC4":0,"DMA":0,"IPR":0},"uuid":"BF5D03EE-91BB-40C9-BE7B-11FD43CAF0F0","session":"1d9989644c5eba85958d675b421fb0ac"},"method":"getCommunicationToken","parameters":{"secretKey":"230147db390cf31fc3b8008e85f8a7f1"}}
Even when the json is not syntactically correct, it always returns at least some headers! It's been Python all along...
The only question remaining is: why is Python doing this?