Guys, I'm scraping the URL 'http://goldfilmesonline.com/jack-reacher-sem-retorno-legendado-online/' to try to get one link, but there are two that work for me. I think the site redirects when you play the video. What I do know is that it sends two media links when I play the movie, and those are the links I want to get, because they work in VLC or Kodi. My first piece of code gets the embedded URL, and it works fine:
import requests
from bs4 import BeautifulSoup

a = requests.get('http://goldfilmesonline.com/jack-reacher-sem-retorno-legendado-online/')
soup = BeautifulSoup(a.content, 'html')
links = soup.find_all('iframe')
for i in links:
    x = (i['src'])
    if 'openload' in x:
        print x
This is the result:
https://openload.co/embed/BQgJDIUtZ_w/
Here is where I don't know what to do. I used Fiddler and captured the request and response headers, but I don't know what to parse or which params to send. So I took x (the embed URL) and tried to request it to get the links I want, but it didn't work. Here is what I tried:
import requests
url = 'https://openload.co/embed/BQgJDIUtZ_w/'
data = {'mime':'true'}
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36'}
#response = requests.post(url,params=data,headers=headers)
#b = response.status_code
#x = response.text
#c = requests.get('https://openload.co/embed/BQgJDIUtZ_w',params=data,headers=headers)
#d = response.headers['Location']
#print d
headers2= {'Transfer-Encoding': 'chunked', 'Set-Cookie': '__cfduid=d11e26a118392f7f08c5df1e88e15b3f71479421090; expires=Fri, 17-Nov-17 22:18:10 GMT; path=/; domain=.openload.co; HttpOnly, _csrf=bf05bbd254877ef6c354a8fb3d4001938ce56f8141ab4704460eb96f946f790ca%3A2%3A%7Bi%3A0%3Bs%3A5%3A%22_csrf%22%3Bi%3A1%3Bs%3A32%3A%22QWePPbK2IdPoeI3nfzzW0cf1Xlu2xgVv%22%3B%7D; path=/; HttpOnly, _olbknd=w6; path=/', 'Server': 'cloudflare-nginx', 'Connection': 'keep-alive',
'Cache-Control': 'private', 'Date': 'Thu, 17 Nov 2016 22:18:10 GMT', 'CF-RAY': '30368e96ed2a5a56-BOS', 'Content-Type': 'text/html; charset=UTF-8'}
f = requests.post(url,params=data,headers=headers)
print f
With this last code I get a 400 response, but it should be 302 Found.
Here are the Fiddler results. These are the request headers:
GET /stream/BQgJDIUtZ_w~1479434968~73.248.0.0~BduDRWUg?mime=true HTTP/1.1
Host: openload.co
Connection: keep-alive
Accept-Encoding: identity;q=1, *;q=0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36(KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36
Accept: */*
Referer: https://openload.co/embed/BQgJDIUtZ_w/
Accept-Language: en-US,en;q=0.8,pt-BR;q=0.6,pt;q=0.4
Range: bytes=0-
These are the response headers:
HTTP/1.1 302 Found
Date: Thu, 17 Nov 2016 02:09:42 GMT
Content-Type: video/mp4
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: __cfduid=d9ff8e31a73be207f66c4717480c6afe41479348582;expires=Fri, 17-Nov-17 02:09:42 GMT; path=/; domain=.openload.co; HttpOnly
Cache-Control: private
Access-Control-Allow-Origin: *
Location: https://1j8b54.oloadcdn.net/dl/l/zHA2Jp0IKTcUySV8/BQgJDIUtZ_w/Jack.Reacher.Never.Go.Back.HDTS.XviD-TOM.avi.mp4?mime=true
Set-Cookie: _olbknd=w5; path=/
Server: cloudflare-nginx
CF-RAY: 302fa46268585a68-BOS
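Going by the capture above, the 302 answers a GET on the /stream/ URL with a Referer and a Range header, not a POST to the /embed/ page. The following is only a minimal sketch of how requests could show that redirect instead of following it, assuming a valid stream URL is already known. The part after the video id appears to contain a timestamp and an IP-like field, so the captured one has probably expired, and the sketch does not show how to obtain a fresh one. The name `redirect_target` and the `http_get` parameter are hypothetical, added so the logic can be tried with a stub instead of a real network call.

```python
USER_AGENT = ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36')


def redirect_target(stream_url, referer, http_get=None):
    """Return (status_code, Location header or None) without following the redirect.

    `http_get` is a hypothetical hook: any callable with the same keyword
    arguments as requests.get. It defaults to requests.get.
    """
    if http_get is None:
        import requests  # imported lazily so a stub can be passed in instead
        http_get = requests.get

    # Same headers the browser sent in the Fiddler capture.
    headers = {
        'User-Agent': USER_AGENT,
        'Referer': referer,
        'Range': 'bytes=0-',
    }
    # allow_redirects=False keeps the 302 response itself, so its
    # Location header can be read instead of being followed.
    response = http_get(stream_url, headers=headers, allow_redirects=False)
    return response.status_code, response.headers.get('Location')
```

Called with a fresh stream URL and 'https://openload.co/embed/BQgJDIUtZ_w/' as the referer, this should return the status code and, if the server does redirect, the media URL from the Location header. With an expired or invalid stream URL the server may answer with something other than 302, in which case the second value is None.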
I'm using Python 2.7 with requests and BeautifulSoup, and I want Python to print one or both of these two links. The first is the one in the request headers:
GET /stream/BQgJDIUtZ_w~1479434968~73.248.0.0~BduDRWUg?mime=true
The second is the one in the response headers:
Location: https://1j8b54.oloadcdn.net/dl/l/zHA2Jp0IKTcUySV8/BQgJDIUtZ_w/Jack.Reacher.Never.Go.Back.HDTS.XviD-TOM.avi.mp4?mime=true
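To make the two targets concrete, here is a small sketch that rebuilds both links from raw header text like the Fiddler capture, using only the standard library. The helper names `header_value` and `links_from_capture` are hypothetical, the sample text is shortened from the capture, and the `https` scheme is an assumption based on the Referer header, since the request line itself only carries the path.

```python
def header_value(lines, wanted):
    """Return the value of the first header named `wanted` (case-insensitive), or None."""
    for line in lines:
        name, sep, value = line.partition(':')
        if sep and name.strip().lower() == wanted.lower():
            return value.strip()
    return None


def links_from_capture(request_text, response_text, scheme='https'):
    """Return (requested URL, redirect URL) from raw request and response headers."""
    request_lines = request_text.strip().splitlines()
    response_lines = response_text.strip().splitlines()
    # The request line has the form: METHOD path HTTP-version
    _method, path, _version = request_lines[0].split()
    host = header_value(request_lines[1:], 'Host')
    requested_url = '{0}://{1}{2}'.format(scheme, host, path) if host else None
    redirect_url = header_value(response_lines[1:], 'Location')
    return requested_url, redirect_url


# Shortened copies of the captured headers.
REQUEST = """GET /stream/BQgJDIUtZ_w~1479434968~73.248.0.0~BduDRWUg?mime=true HTTP/1.1
Host: openload.co
Referer: https://openload.co/embed/BQgJDIUtZ_w/
"""

RESPONSE = """HTTP/1.1 302 Found
Content-Type: video/mp4
Location: https://1j8b54.oloadcdn.net/dl/l/zHA2Jp0IKTcUySV8/BQgJDIUtZ_w/Jack.Reacher.Never.Go.Back.HDTS.XviD-TOM.avi.mp4?mime=true
"""

requested_url, redirect_url = links_from_capture(REQUEST, RESPONSE)
print(requested_url)
print(redirect_url)
```

This only reads text that was already captured, so it shows what the two links look like rather than how to fetch them live.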
Can someone help me?