
I am trying to automate data uploads to a private website using python mechanize.

I log in successfully and navigate to the upload page, which offers three ways of providing data: a database connection (source-sql), a file upload (source-file), or a remotely hosted CSV (source-url, which is the one I need).

With python-mechanize I reach that page and fill in the form controls needed for my upload (sourceType, sourceName, url and add).
In Chrome, I submit the data by clicking the add button (value "Add"), and the page navigates to the target script ('addsource.do').
I have also tried the same step in Chrome with JavaScript disabled, and it works: the browser arrives at the target script and shows the submitted data, so JavaScript does not seem to be needed for the form submission step.

So I think my situation is similar to this example from examples/forms/example.py in the python-mechanize GitHub repository:

request2 = f.click()  # mechanize.Request object
try:
    response2 = mechanize.urlopen(request2)
except mechanize.HTTPError as response2:
    pass

Those lines are very similar to the final part of my code, which raises an error:

br = mechanize.Browser()

# ... many lines of code (producing and filling in form contents) ... 

print("before submit: ", br.geturl())

# OUTPUT CURRENT SELECTED FORM CONTROLS AND VALUES:

f = br.form
print(f)
form_info = (
    " -- Form name:  {}\n"
    " -- Form action: {}\n"
    " -- Form attrs:  {}"
).format(f.name, f.action, f.attrs)
print(form_info)

# QUESTION 1: How should I now submit this form?
# br.submit()
#  ... or ...
# f.click(name="add", type="submit")

# I tried the 2nd option, and then the example above: 

myrequest = f.click(name="add", type="submit")

# QUESTION 2: how can I print the 'action' (target URL) that myrequest will be sent to?

try:
    response = mechanize.urlopen(myrequest)
except mechanize.HTTPError as response:
    print("EXCEPTION:", response)

This is the output of my code (server name changed to example.com):

before submit:  https://example.com/manage/resource.do?r=test-occ

 <post https://example.com/manage/addsource.do multipart/form-data
  <HiddenControl(r=test-occ)>
  <HiddenControl(validate=false)>
  <SelectControl(sourceType=[(), source-sql, source-file, *source-url])>
  <FileControl(file=<No files added>)>
  <TextControl(sourceName=AUTO_test-occ_occ_url)>
  <TextControl(url=https://another.example.com/datasets/datasource.csv)>
  <SubmitControl(add=Add)>
  <SubmitControl(clear=Clear)>>

 -- Form name:  None
 -- Form action: https://example.com/manage/addsource.do
 -- Form attrs:  {'action': 'addsource.do', 'method': 'post', 'enctype': 'multipart/form-data'}

EXCEPTION: HTTP Error 404: Not Found

Although the form.attrs['action'] value 'addsource.do' looks correct to me, 'HTTP Error 404: Not Found' suggests that the request was sent to the wrong URL. Or perhaps I am submitting the form the wrong way.
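To double-check the reasoning about the action URL, here is a minimal sketch of how a relative action is normally resolved against the page URL, using only the standard library. It assumes the page has no `<base href>` element, which would change the result; the two URLs are the ones from my output above.

```python
from urllib.parse import urljoin

# Page URL printed by br.geturl() just before submitting.
page_url = "https://example.com/manage/resource.do?r=test-occ"

# Relative action taken from the form's attrs.
relative_action = "addsource.do"

# A relative action replaces the last path segment of the page URL
# and drops the page's query string.
resolved_action = urljoin(page_url, relative_action)
print(resolved_action)  # https://example.com/manage/addsource.do
```

The result matches the "Form action" line that mechanize prints, so the URL resolution itself does not appear to be what goes wrong.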

So my questions:

  1. What is the proper way to submit this form? I am a bit confused between these two options:
    br.submit()
    or
    br.form.click(name="add",type="submit")

  2. If I choose the second option, is there any way to check the 'action' that myrequest will actually be sent to, before entering the try/except block?
    As in the aforementioned example, myrequest is a mechanize.Request object.
    I see methods such as get_data(), get_header() and get_method() on the request, but nothing like get_action(). Is there a way to do that?
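For comparison, this is how the target URL can be read from a standard-library request object before it is sent. It is only a sketch with a hand-built `urllib.request.Request` standing in for myrequest; the body and boundary values are hypothetical placeholders. mechanize.Request appears to be modeled on the standard library class, so `get_full_url()` may work on myrequest as well, but I have not confirmed that.

```python
from urllib.request import Request

# Hypothetical stand-in for the object returned by f.click(...);
# the body and boundary below are placeholders, not real form data.
request = Request(
    "https://example.com/manage/addsource.do",
    data=b"placeholder-body",
    headers={"Content-Type": "multipart/form-data; boundary=hypothetical"},
)

# The URL the request will be sent to (the resolved form action).
target_url = request.get_full_url()

# The HTTP method; urllib infers POST when a body is attached.
http_method = request.get_method()

# urllib stores header names capitalized as "Content-type".
content_type = request.get_header("Content-type")

print(http_method, target_url)
print(content_type)
```

Printing these values before the try/except block would show whether the request really targets addsource.do with a POST.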

Thanks

EDIT: this is the HTML of my upload form:

<form action="addsource.do" method="post" enctype="multipart/form-data">
  <input name="r" type="hidden" value="test-occ">
  <input name="validate" type="hidden" value="false">

  <select id="sourceType" name="sourceType" class="form-select form-select-sm my-1">
    <option value="" disabled="" selected="">Select source type</option>
    <option value="source-sql">Database</option>
    <option value="source-file">File</option>
    <option value="source-url">URL</option>
  </select>

  <div class="row">
    <div class="col-12">
      <input type="file" name="file" id="file" class="form-control form-control-sm my-1" style="display: none;">
      <input type="text" id="sourceName" name="sourceName" class="form-control form-control-sm my-1" placeholder="Source Name" style="">
      <input type="url" id="url" name="url" class="form-control form-control-sm my-1" placeholder="URL" style="">
    </div>
    <div class="col-12">
      <input type="submit" value="Add" id="add" name="add" class="btn btn-sm btn-outline-info-primary my-1" style="">
      <input type="submit" value="Clear" id="clear" name="clear" class="btn btn-sm btn-outline-secondary my-1" style="">
    </div>
  </div>
</form>
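As a cross-check against what mechanize reports, the form can also be summarized with the standard library's HTML parser. This is a rough sketch: `FORM_HTML` is a trimmed copy of the form above (class and style attributes dropped, options reduced to the one I need), and `FormSummary` is a hypothetical helper name, not part of mechanize.

```python
from html.parser import HTMLParser

# Trimmed copy of the upload form shown above.
FORM_HTML = """
<form action="addsource.do" method="post" enctype="multipart/form-data">
  <input name="r" type="hidden" value="test-occ">
  <input name="validate" type="hidden" value="false">
  <select id="sourceType" name="sourceType">
    <option value="source-url">URL</option>
  </select>
  <input type="file" name="file" id="file">
  <input type="text" id="sourceName" name="sourceName">
  <input type="url" id="url" name="url">
  <input type="submit" value="Add" id="add" name="add">
  <input type="submit" value="Clear" id="clear" name="clear">
</form>
"""


class FormSummary(HTMLParser):
    """Collects the form attributes and a (kind, name) pair per control."""

    def __init__(self):
        super().__init__()
        self.form_attrs = {}
        self.controls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.form_attrs = attrs
        elif tag == "input":
            # An input without a type attribute defaults to "text".
            self.controls.append((attrs.get("type", "text"), attrs.get("name")))
        elif tag == "select":
            self.controls.append(("select", attrs.get("name")))


summary = FormSummary()
summary.feed(FORM_HTML)

# Names of the submit buttons; only the clicked one should be sent.
submit_names = [name for kind, name in summary.controls if kind == "submit"]

print(summary.form_attrs)
print(summary.controls)
print(submit_names)
```

The form has two submit controls (add and clear), which is why I pass name="add" when clicking.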
abu