I have written a program to download text files from the National Stock Exchange. This is the download function:

downloadNseBhav: func [Date] [
    ; format NSE Bhavcopy url
    NSE_Url: to-url rejoin [
        "http://nseindia.com/content/historical/EQUITIES/"
        (uppercase form-date Date "%Y/%b/") "cm" Sd "bhav.csv.zip"
    ]

    ; download bhavcopy zip file to disk in ./NSE folder
    either error? try [
        write/binary to-file rejoin ["./NSE/" Sd "bhav.csv.zip"]
        read/binary NSE_Url
    ][
        append Log/text "Server made a boo boo ......NSE Bhavcopy not found^/^/"
        scroll-text Log
        exit
    ][
        append Log/text "Downloaded NSE Bhavcopy Zip^/"
        scroll-text Log
    ]
]

I get a file-not-found message many times even though the required file is there. This is irritating when multiple files are requested and some of them are not downloaded. I receive the file if I try again.

I read about the wait command in the Rebol 2 documentation and found that waiting is the default when opening a port. What am I doing wrong? Is there a way to make Rebol wait a couple of seconds for a response from the server?

Edit - There is a file for each day's activity. Say I am downloading for 10 days (1st Jan to 10th Jan). I get files for some days and errors for others. If I download again immediately for the same dates, I get some of the missing files. A third and fourth try will get all the remaining files. However, the file-not-found errors fall on random dates each time.

Well,

  1. I increased the timeout to 10 seconds, as tomc said.
  2. I also collected a list of the downloads that failed, as suggested by Graham Chiu.
  3. I could not use Wireshark, as suggested by HostileFork, but I could trap the error with a slight change in the code, as below.

        either error? err: try [BC: read/binary NSE_Url] [  ; read the bhavcopy zip into BC
            err: disarm err
            probe err
            write/append %log.txt rejoin ["NSE EQUITIES/bhavcopy not found " DateYmd "^/"]
            exit
        ] []
    ]
    

Thereafter, I downloaded from 1 DEC 2015 to 15 DEC 2015 twice.

Failed downloads on the first attempt:

NSE EQUITIES/bhavcopy not found 2015-12-08
NSE EQUITIES/bhavcopy not found 2015-12-09

Failed downloads on the second attempt:

NSE EQUITIES/bhavcopy not found 2015-12-01
NSE EQUITIES/bhavcopy not found 2015-12-02

The error message was the same in all cases:

emake object! [
code: 507
type: 'access
id: 'no-connect
arg1: "nseindia.com"
arg2: none
arg3: none
near: [BC: read/binary NSE_Url]
where: 'open-proto

]

Please forgive me for not trapping the error correctly earlier. I am new to Rebol.

I do not know what the solution for this is. I have an 8 Mbps net connection and it is working perfectly.

Out of curiosity, I opened the Rebol console and connected to google.co.in. This was the result of two back-to-back attempts:

test: read http://google.co.in
Access Error: Cannot connect to google.co.in
Where: open-proto
Near: test: read http://google.co.in
test: read http://google.co.in
== {<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en-IN"><head><meta content="text/html; charset=UTF...

Therefore, as of now, I am following both suggestions.

One more thing I learned: trapping the error does not work like this:

either error? err: try [
        write/binary to-file rejoin ["./NSE/" Sd "bhav.csv.zip"]
        read/binary NSE_Url]

The file has to be read into a variable first. Otherwise, if the file is actually received, Rebol 2 stops with the error "err needs a value", presumably because `write` returns nothing for `err:` to hold.
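For reference, a minimal sketch of one way to keep the write and the read in a single expression in Rebol 2: `set/any` accepts a missing (unset) result, so the assignment no longer fails when the download succeeds. `Sd` and `NSE_Url` are the variables from the question. The `print` messages are placeholders and not part of the original program.

```rebol
; Sketch only: set/any lets err stay unset when the try block
; yields no value (the success case).
either error? set/any 'err try [
    write/binary to-file rejoin ["./NSE/" Sd "bhav.csv.zip"]
        read/binary NSE_Url
][
    err: disarm err          ; turn the error into an inspectable object
    print ["Download failed:" err/id]
][
    print "Downloaded NSE Bhavcopy Zip"
]
```

Reading into a variable first, as in the snippet further up, is just as valid and arguably easier to follow.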

Satish
  • When you say you are getting a file not found error, do you mean you're getting a response back from the webserver and it says it's not there... when sometimes it gives you an answer? If that is the case, then it is the web server's decision. Look using something like [WireShark](https://en.wikipedia.org/wiki/Wireshark) at the traffic, and if the request is correct and the server isn't always giving it to you that's not really Rebol's problem (outside of identifying itself as not-a-web-browser, which might change the behavior of nseindia's response...) – HostileFork says dont trust SE Jan 15 '16 at 07:54
  • Let me make it clearer. There is a file for each day's activity. Say, I am downloading for 10 days (1st Jan to 10th Jan). Then, I get files for some days and error for some days. If I download again immediately for same dates, I get some of the missing files. A third and fourth try will get all remaining files. However, file not found error will be random each time for any of the dates. Therefore, I felt that Rebol is not waiting for sufficient time to receive the files. – Satish Jan 15 '16 at 08:27
  • Please see ["Should questions include “tags” in their titles?"](http://meta.stackexchange.com/questions/19190/should-questions-include-tags-in-their-titles), where the consensus is "no, they should not"! –  Jan 15 '16 at 08:28
  • Sorry for including the tag in the title. I shall take care in future. – Satish Jan 15 '16 at 08:32
  • FYI - **wait** can also take an `integer!` or `time!` argument, e.g. `wait 3` pauses for 3 seconds, `wait 0:2` for 2 minutes and `wait 1:2:3` for 1 hour, 2 minutes and 3 seconds. – draegtun Jan 18 '16 at 11:50
  • Alternatively you can try `call` to `curl`, just be sure that your curl installation is 32bit. – endo64 Oct 05 '18 at 13:12

2 Answers

In R2 you can set the HTTP timeout with:

system/schemes/http/timeout: whatever
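
As a concrete sketch, the 10 seconds mentioned in the question's update could be written as a `time!` value. The value is only an illustration, and it needs to be set before the `read` calls it is meant to affect.

```rebol
; Assumption: 10 seconds, as in the question's update; pick what suits the server.
system/schemes/http/timeout: 0:00:10
```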
tomc

As tomc says, you can raise the timeout. A web server that is slow to respond would explain both the failures and why they are intermittent. The second time around, the result is probably cached and ready for collection.

You could also collect a list of those that fail, and try those again on a second pass.
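
A rough sketch of that idea, combined with the `wait` pause mentioned in the comments. `read-with-retry` and `failed` are hypothetical names introduced here for illustration. `NSE_Url` and `Sd` are the variables from the question, and the choice of 3 tries with a 2 second pause is arbitrary.

```rebol
; Hypothetical helper: read a URL up to `tries` times, pausing `delay`
; between attempts. Returns the binary data, or none if every try failed.
read-with-retry: func [
    url [url!] tries [integer!] delay [integer! time!]
    /local data
][
    repeat n tries [
        if not error? try [data: read/binary url] [return data]
        if n < tries [wait delay]      ; no pause after the last attempt
    ]
    none
]

failed: copy []                        ; URLs to retry on a second pass

either data: read-with-retry NSE_Url 3 2 [
    write/binary to-file rejoin ["./NSE/" Sd "bhav.csv.zip"] data
][
    append failed NSE_Url
]
```

After the main loop over dates, anything left in `failed` can be fed through the same helper again.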

BTW

downloadNseBhav: func [Date] [
NSE_Url: to-url rejoin [
    "http://nseindia.com/content/historical/EQUITIES/" 
    (uppercase form-date Date "%Y/%b/") "cm" Sd "bhav.csv.zip"

    ; format NSE Bhavcopy url
]

can be written as

downloadNseBhav: func [Date] [
NSE_Url: rejoin [
    http://nseindia.com/content/historical/EQUITIES/
    uppercase form-date Date "%Y/%b/" "cm" Sd "bhav.csv" %.zip

    ; format NSE Bhavcopy url
]

since rejoin makes its result the same datatype as the first value in the block, and the parens are unnecessary.
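
A small sketch of that behaviour in Rebol 2. The path pieces are made-up stand-ins for the computed date parts, not real NSE file names.

```rebol
; Illustrative values only: "2016/JAN/" and "cm15JAN2016" stand in
; for the parts the real function computes.
a: rejoin ["http://nseindia.com/" "2016/JAN/" "cm15JAN2016" %.zip]
b: rejoin [http://nseindia.com/ "2016/JAN/" "cm15JAN2016" %.zip]

probe type? a    ; first value was a string!, so the result is a string!
probe type? b    ; first value was a url!, so the result is a url!
```

With a `url!` as the first value, the separate `to-url` conversion is no longer needed.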

Graham Chiu