
I spent two weeks researching this before posting the question here. I have access to a site, but it requires a login before its content can be viewed.

How can I log in with wget and then fetch the content with wget? It is a basic HTML form login.

Here is the HTML code for the site's login form:

<div id="loginh">
                    <div id="form">
                    <form name="frmLogin" action="/en/login.shtml" method="post">
                        <input type="hidden" name="login_attempt" value="yes">
                        <input type="hidden" name="redirect" value="/en/index.shtml">
                        <input type="text" name="login_username" class="txtBox1" title="Enter your user name" id="username" value="User name" onfocus="clickLoginField(this);" onkeypress="javascript:if ((event.which &amp;&amp; event.which == 13)||(event.keyCode &amp;&amp; event.keyCode == 13)) {document.frmLogin.login_password.focus(); return false;}">
                        <input type="password" name="login_password" class="txtBox2" title="Enter your password" id="pass" value="" onkeypress="javascript:if ((event.which &amp;&amp; event.which == 13)||(event.keyCode &amp;&amp; event.keyCode == 13)) { document.frmLogin.submit(); return false;}">
                        <input type="text" id="login_password_" name="login_password_" class="loginattemptstyle" value="">
                        <a class="login" href="javascript:void(0);" onclick="document.frmLogin.submit();">Login</a><noscript>&lt;input type="submit" value="Login"/&gt;</noscript>
                        <div class="clear"></div>
                        <table width="100%" cellpadding="5">
                            <tbody><tr>
                                <td><a href="/en/forgottenpassword.shtml" title="Have you forgotten your password?">Forgotten password?</a></td>
                                <td><table width="100%">
                                    <tbody><tr>
                                        <td><input type="checkbox" name="login_remember" style="margin-left: 0; margin-right: 5px;"></td>
                                        <td>Stay signed in</td>
                                    </tr>
                                </tbody></table></td>
                            </tr>
                        </tbody></table>
                    </form>
                    </div>
                    <script language="JavaScript" type="text/JavaScript">
                    <!--
                    $("#login_password_").val('1tCRztiXpM5jpmefqdWYn4O/ipyn5KWUneZoag==');
                    //-->
                    </script>
                </div>

What I have tried:

wget -q -O-  save.txt --load-cookies cookies.txt http://mysite.com/en/article1.shtml | findstr /i "'streamer'" > save3.txt
Mowgli
  • that all depends on what the server's doing. does it set a cookie? does it switch over to http basic auth? did you capture ALL of the cookies the site set in your browser, blah blah blah. – Marc B Apr 02 '13 at 14:28
  • if I am not wrong, it only sets 1 cookie file for login. – Mowgli Apr 02 '13 at 14:32

2 Answers


You will need to send a POST request instead of a GET request.

Check this out:

Variables in wget post data
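As a rough sketch of what that POST involves, here is how it might look with Python's standard library (Python is a substitution for illustration, not something the question uses; the same fields would go into wget's `--post-data`). The field names and paths come from the form in the question, and the host `mysite.com` from the asker's command. The helper names `extract_token`, `build_login_body` and `login` are made up. Note that the form has a `login_password_` field that the page fills in with JavaScript. A client that does not run scripts would presumably have to read that value out of the page itself; whether the server actually checks it is an assumption.

```python
import re
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

BASE_URL = "http://mysite.com"  # host taken from the question's wget command

# Matches the script line: $("#login_password_").val('...');
TOKEN_RE = re.compile(r"""\$\("#login_password_"\)\.val\('([^']*)'\)""")


def extract_token(login_page_html):
    """Return the value the page's script would put into login_password_, or ''."""
    match = TOKEN_RE.search(login_page_html)
    return match.group(1) if match else ""


def build_login_body(username, password, token=""):
    """URL-encode the form fields exactly as they are named in the question's HTML."""
    fields = {
        "login_attempt": "yes",
        "redirect": "/en/index.shtml",
        "login_username": username,
        "login_password": password,
        "login_password_": token,  # assumption: the server may expect this value
    }
    return urllib.parse.urlencode(fields).encode("ascii")


def login(username, password):
    """Hypothetical helper: log in and return an opener carrying the session cookies."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    login_url = BASE_URL + "/en/login.shtml"
    # GET the login page first so any pre-login cookie and the token are picked up.
    page = opener.open(login_url).read().decode("utf-8", errors="replace")
    body = build_login_body(username, password, extract_token(page))
    # Passing data= makes urllib send a POST instead of a GET.
    opener.open(login_url, data=body).read()
    return opener
```

After `opener = login("user", "secret")`, calling `opener.open(BASE_URL + "/en/article1.shtml")` should return the protected page, provided the site tracks the session with cookies.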

EDIT:

If you can use PHP, I would recommend Snoopy: http://snoopy.sourceforge.net

It simulates a web browser, which allows further automated navigation and HTML retrieval. It also emulates cookies.

EDIT2:

If you do not intend to use PHP, you will need cURL (download link below). With it you will be able to retrieve the HTML of the page returned after posting the login form.

http://curl.haxx.se/download.html

Basic usage: http://curl.haxx.se/docs/httpscripting.html

A Stack Overflow post using cURL and POST: login POST form with cURL

Examples with code: http://www.yilmazhuseyin.com/blog/dev/curl-tutorial-examples-usage/

Pedro Matos
  • Thanks, I have tried the link you posted, but it didn't solve my problem. I am running these tests on my personal PC. I don't think PHP would help with the automation I am trying to achieve. – Mowgli Apr 02 '13 at 15:41
  • Considering your comment, what you need then is cURL. http://curl.haxx.se/download.html I will edit my answer again and include cURL. – Pedro Matos Apr 03 '13 at 14:45
  • Thanks a lot, I will give cURL a chance. – Mowgli Apr 03 '13 at 15:00
  • not working for me. the first command creates an empty cookies.txt file and returns an 'authentication failed' error. :| – ScienceFriction Sep 03 '13 at 16:12

I used lynx and wget to solve this problem. Please read the last answer in this post.

How to get past the login page with Wget?
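The linked answer is not reproduced here, but the general idea of logging in with one tool and reusing its saved cookies in another (as wget's `--load-cookies cookies.txt` does in the question) can be sketched in Python. This is a minimal illustration under assumptions: the cookie file is in the Netscape `cookies.txt` format, and `opener_from_cookie_file` is a made-up helper name.

```python
import urllib.request
from http.cookiejar import MozillaCookieJar


def opener_from_cookie_file(path):
    """Build an opener that sends the cookies stored in a Netscape-format file."""
    jar = MozillaCookieJar()
    # Session cookies are often saved with an empty or zero expiry; keep them anyway.
    jar.load(path, ignore_discard=True, ignore_expires=True)
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar
```

With `opener, jar = opener_from_cookie_file("cookies.txt")`, a call such as `opener.open("http://mysite.com/en/article1.shtml")` would send the saved cookies along with the request. If the jar comes back empty, the login step never stored a session cookie, which would match the empty `cookies.txt` reported in the comments above.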

PokerFace