Getting a blank screen when running a Python program

Please help. This may be a duplicate question, but I am an Android developer and don't know Python well.

Here is my code:

import sys
import requests
from bs4 import BeautifulSoup, SoupStrainer

home_url = 'https://parivahan.gov.in/rcdlstatus/'
post_url = 'https://parivahan.gov.in/rcdlstatus/vahan/rcDlHome.xhtml'
# Everything before the last four digits: GJ03KA
first = sys.argv[1]
# The last four digits: 0803
second = sys.argv[2]

r = requests.get(url=home_url)
cookies = r.cookies
soup = BeautifulSoup(r.text, 'html.parser')
viewstate = soup.select('input[name="javax.faces.ViewState"]')[0]['value']

data = {
    'javax.faces.partial.ajax':'true',
    'javax.faces.source': 'form_rcdl:j_idt32',
    'javax.faces.partial.execute':'@all',
    'javax.faces.partial.render': 'form_rcdl:pnl_show form_rcdl:pg_show form_rcdl:rcdl_pnl',
    'form_rcdl:j_idt32':'form_rcdl:j_idt32',
    'form_rcdl':'form_rcdl',
    'form_rcdl:tf_reg_no1': first,
    'form_rcdl:tf_reg_no2': second,
    'javax.faces.ViewState': viewstate,
}

r = requests.post(url=post_url, data=data, cookies=cookies)
soup = BeautifulSoup(r.text, 'html.parser')
table = SoupStrainer('tr')
soup = BeautifulSoup(soup.get_text(), 'html.parser', parse_only=table)
print(soup.get_text())
Shubham Sejpal
  • r returns response 500, i.e. Internal Server Error. Visiting the URL https://parivahan.gov.in/rcdlstatus/vahan/rcstatus.xhtml in a browser also returns error code 500, with a "Bad request" message. Are you sure it is the right address? – Claire May 21 '18 at 16:15
  • Always check those error codes! And note that in HTTP, all the codes in the 200s are successes. – tdelaney May 21 '18 at 16:17
  • @Claire I edited the code. Please help me: I have tried many variations but have not had any success, and I don't know much about Python. I really did touch it today for the first time in my life. – Shubham Sejpal May 21 '18 at 16:58
  • Try this, better, free and, legal way https://shrouded-falls-48764.herokuapp.com/ https://www.youtube.com/watch?v=bMj-1BGbxfc – Trishant Pahwa Sep 20 '20 at 04:16

4 Answers

2

If you print the result of the requests POST (r), you will see a 500 status, which is the generic HTTP response for a server error. My guess is that the URL is wrong or the data being posted to it isn't formatted correctly.
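
As a rough illustration of checking the status before parsing anything, here is a minimal sketch that classifies a status code using only the standard library. The helper name explain_status is made up for this example; with requests, the code to pass in would be r.status_code.

```python
from http import HTTPStatus


def explain_status(code):
    """Return a short description of a standard HTTP status code.

    Note: HTTPStatus(code) raises ValueError for codes it does not know.
    """
    phrase = HTTPStatus(code).phrase
    if 200 <= code < 300:
        kind = "success"
    elif 400 <= code < 500:
        kind = "client error"
    elif 500 <= code < 600:
        kind = "server error"
    else:
        kind = "other"
    return f"{code} {phrase} ({kind})"


# In the question's script this would be: print(explain_status(r.status_code))
print(explain_status(500))
```

If the result is not in the 200s, the response body is an error page, so there is no table to parse and the script prints nothing useful.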

steveholt
1

Let me open a new answer in response to the renewed question.

After trying a few approaches with just requests and urllib, I think it is better to drive a real browser with Selenium WebDriver.

The following code will grab the table rows as you want.

from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup

url = 'https://parivahan.gov.in/rcdlstatus/'

# Optional: run a "headless" browser, i.e. suppress the browser window
chrome_options = Options()
chrome_options.add_argument("--headless")

# Let the driver open, fill and submit the form
# (newer Selenium releases expect options=chrome_options here instead)
driver = webdriver.Chrome(chrome_options=chrome_options)
driver.get(url)
driver.delete_all_cookies()
wait = WebDriverWait(driver, 10)
wait.until(EC.element_to_be_clickable((By.NAME, 'form_rcdl:j_idt34')))
# find_element(By.NAME, ...) works in old and new Selenium releases;
# the find_element_by_name shortcut was removed in Selenium 4.3
input1 = driver.find_element(By.NAME, 'form_rcdl:tf_reg_no1')
input1.send_keys('GJ03KA')
input2 = driver.find_element(By.NAME, 'form_rcdl:tf_reg_no2')
input2.send_keys('0803')
driver.find_element(By.NAME, 'form_rcdl:j_idt34').click()

# Get the result table
try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "form_rcdl:j_idt63"))
    )
    result_html = driver.page_source
    #print(result_html)
    soup = BeautifulSoup(result_html, 'lxml')
    print(soup.findAll('tr'))
except TimeoutException:
    driver.quit()
    print('Time out.')

The screenshot below shows the result of printing the table's HTML tags in soup.

[Image: screenshot of the printed table rows]

I hope the government does not find out and block this approach before you get to try it, lol.

Hope this helps!

Claire
  • Hi @Claire, this opens the page in a browser, but I need the result as an API response or as the page's HTML source in the soup, because only then will I be able to build an API on top of it. – Shubham Sejpal May 22 '18 at 04:39
  • I do get the HTML source in result_html when I print it using the commented-out line. But when your code tries to print soup.findAll('tr'), it throws a TimeoutException. – Shubham Sejpal May 22 '18 at 04:55
  • Hi @ShubhamSejpal With the `"headless"` argument as specified in answer, it will not open a browser window. – Claire May 22 '18 at 06:11
  • @ShubhamSejpal I'm not sure. The code prints the tags in soup alright here, with or without the commented line. Here's a demo for printing out the Web Page HTML source `result_html`: https://imgur.com/a/5Jaf8Yh You can manipulate the soup (soup form of `result_html`), to extract information and further handle it as you wish. – Claire May 22 '18 at 06:15
  • @ShubhamSejpal Maybe could you provide the Traceback to have a look? – Claire May 22 '18 at 06:40
  • Changing the line driver = webdriver.Chrome(chrome_options=chrome_options) to include the chromedriver path opens the browser, but when I run your code I get the following error: http://prntscr.com/jkzn1m . I am trying to solve that error; I changed the code to use the path as suggested in one of the solutions. – Shubham Sejpal May 22 '18 at 06:53
  • Changing the line will open the browser, because the `"headless"` option is stored in the `chrome_options` variable passed in that line. The solution assumes you have chromedriver in your path, to avoid hard coding and let you run the script directly without editing. To manually specify the chromedriver path, you can change the line to `driver = webdriver.Chrome(executable_path=path_to_chromedriver, chrome_options=chrome_options)`, where `path_to_chromedriver` is a string holding the absolute path to your chromedriver. Feel free to continue the discussion in chat as well. – Claire May 22 '18 at 07:13
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/171516/discussion-between-claire-and-shubham-sejpal). – Claire May 22 '18 at 07:13
0

The URL that actually returns a valid form webpage is 'https://parivahan.gov.in/rcdlstatus/'.

Entering the example ID (Reg No.) in the browser pops up the error message "Registration No. does not exist!!! Please check the number." (That makes total sense; I do hope you didn't post a real ID in public, lol.)

Since I don't have a valid ID to test with, please check whether this solves your problem.

Another thing I noticed is that the fields for entering the registration number should be "form_rcdl:tf_reg_no1" and "form_rcdl:tf_reg_no2". You can view the HTML source of the webpage (e.g. Ctrl+U in Chrome) to verify.
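
The same check can also be scripted. Below is a minimal sketch that lists the name of every input field in a piece of HTML using the standard library's html.parser instead of BeautifulSoup. The sample_html string is a trimmed-down, hypothetical stand-in for the real form markup, and InputNameCollector is a name made up for this example; in practice you would feed it the text returned by the GET request.

```python
from html.parser import HTMLParser


class InputNameCollector(HTMLParser):
    """Collect the name attribute of every <input> tag."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            name = dict(attrs).get("name")
            if name:
                self.names.append(name)


# Hypothetical sample markup, not copied from the real page
sample_html = """
<form id="form_rcdl">
  <input type="text" name="form_rcdl:tf_reg_no1">
  <input type="text" name="form_rcdl:tf_reg_no2">
  <input type="hidden" name="javax.faces.ViewState" value="placeholder">
</form>
"""

collector = InputNameCollector()
collector.feed(sample_html)
print(collector.names)
```

Any key in the POST data that does not appear in this list is worth double-checking against the page source.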

Claire
  • Now i changed the registered number also with my original bike number in the comment. – Shubham Sejpal May 21 '18 at 17:00
  • @ShubhamSejpal Let me have a look. Nice that you are now dynamically retrieving the viewstate already. It's a long way you've come so far for the first day! :D – Claire May 21 '18 at 17:23
  • @ShubhamSejpal Done :) Just posted a new answer to keep comments relevant, so new comers will know what's going on – Claire May 21 '18 at 18:23
0

You have hardcoded j_idt32 as the button id. Please note that the button id on this website is dynamic, so your program should pick up the right button id at run time. Here is the solution:

import sys
import re
import requests
from bs4 import BeautifulSoup, SoupStrainer

home_url = 'https://parivahan.gov.in/rcdlstatus/?pur_cd=102'
post_url = 'https://parivahan.gov.in/rcdlstatus/vahan/rcDlHome.xhtml'
# Everything before the last four digits: MH02CL
first = sys.argv[1]
# The last four digits: 0555
second = sys.argv[2]

r = requests.get(url=home_url)
cookies = r.cookies
soup = BeautifulSoup(r.text, 'html.parser')
viewstate = soup.select('input[name="javax.faces.ViewState"]')[0]['value']
# Pick the first button whose id contains "form_rcdl"; its suffix (e.g. j_idt32) changes over time
button = soup.find('button', id=re.compile('form_rcdl'))
if button is None:
    sys.exit('Could not find the submit button on the form page.')
button_id = button.get('id')

data = {
    'javax.faces.partial.ajax':'true',
    'javax.faces.source':button_id,
    'javax.faces.partial.execute':'@all',
    'javax.faces.partial.render': 'form_rcdl:pnl_show form_rcdl:pg_show form_rcdl:rcdl_pnl',
    button_id:button_id,
    'form_rcdl':'form_rcdl',
    'form_rcdl:tf_reg_no1': first,
    'form_rcdl:tf_reg_no2': second,
    'javax.faces.ViewState': viewstate,
}

r = requests.post(url=post_url, data=data, cookies=cookies)
#print (r.text)
soup = BeautifulSoup(r.text, 'html.parser')
table = SoupStrainer('tr')
soup = BeautifulSoup(soup.get_text(), 'html.parser', parse_only=table)
print(soup.get_text())
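
To make the role of the dynamic id explicit, the payload construction can be pulled into a small function. This is only a sketch: build_payload is a hypothetical helper name, and the id 'form_rcdl:j_idt99' and the viewstate string are placeholders, not values taken from the site.

```python
def build_payload(button_id, first, second, viewstate):
    """Build the form data for the JSF partial (AJAX) submit.

    The button id is sent twice: as the value of javax.faces.source
    and as its own key/value pair, mirroring the script above.
    """
    return {
        'javax.faces.partial.ajax': 'true',
        'javax.faces.source': button_id,
        'javax.faces.partial.execute': '@all',
        'javax.faces.partial.render': 'form_rcdl:pnl_show form_rcdl:pg_show form_rcdl:rcdl_pnl',
        button_id: button_id,
        'form_rcdl': 'form_rcdl',
        'form_rcdl:tf_reg_no1': first,
        'form_rcdl:tf_reg_no2': second,
        'javax.faces.ViewState': viewstate,
    }


# Placeholder values for illustration only
payload = build_payload('form_rcdl:j_idt99', 'MH02CL', '0555', 'placeholder-viewstate')
print(payload['javax.faces.source'])
```

Because the id is a parameter, the rest of the script does not need to change when the site regenerates its button ids.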
Android Rao