My client wants to extract the HTML content of a live webpage and store a copy of the page on his server. I am thinking of using cURL. Are there any performance issues with cURL? Will it use a lot of server memory?

If PHP curl has a performance issue, what is the best alternative to do this?

I intend to download the page when a user submits its URL in a form. The site is hosted on a Linux server.

Thank you in advance!

madi
  • When do you download the page? Periodically? On every page load? – Jason McCreary Jul 26 '13 at 02:56
  • When a web url is submitted to a form – madi Jul 26 '13 at 02:57
  • Performance is really going to depend on your network connection to the other website. I use cURL to load thousands of URLs as a PHP spider to index my websites for search. If the site is not on the host server it takes a bit longer; on the server it's indexing, it's FAST. It's all relative. Try a few options and benchmark them for your project. – cmorrissey Jul 26 '13 at 03:00
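
One way to act on the benchmarking advice in the comment above is to time a single fetch from the command line before wiring anything into PHP. This is only a sketch: it uses the curl command-line tool rather than PHP's cURL extension, and `time_fetch` and the 15-second limit are hypothetical choices that do not come from the thread. It discards the page body and prints the total transfer time in seconds followed by the number of bytes downloaded.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): time one download of the URL in $1.
# Prints "<seconds> <bytes>" and throws the body away.
time_fetch() {
    curl --silent --show-error --location --max-time 15 \
         --output /dev/null \
         --write-out '%{time_total} %{size_download}\n' \
         --url "$1"
}
```

Running `time_fetch` a few times against a representative page gives a rough idea of how much of the cost is network latency rather than anything on the server itself.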

1 Answer

I'd use wget instead for a quick and dirty solution (on Linux):

wget -r 

Please don't mention performance when you're using PHP. If you want to start asking those questions, perhaps you should be looking into network programming. As someone who likes playing with network programming, I should warn you that it's not a trivial topic.

Homer6
  • You're advocating a non-cross-platform tool over PHP's cURL wrapper? – alex Jul 26 '13 at 02:57
  • Cross-platform is overrated (and not trivial either). Is he using Windows? What about BeOS? Chances are, he's already using Linux. And if not, he asked for an alternative to get a quick solution. Do you have a wget alternative for Windows? Would you suggest that he write a full-blown web crawler just so he can run it on Windows too? – Homer6 Jul 26 '13 at 03:00
  • OK, I am using a Linux server. `wget -r` can be run from PHP as a shell command, right? – madi Jul 26 '13 at 03:03
  • That is correct: http://php.net/manual/en/function.system.php. If you're doing this, I'd be very careful to place heavy validation on this input. User-driven input running on the shell should be used with caution. – Homer6 Jul 26 '13 at 03:03
  • @Homer6 Not saying the OP shouldn't use it. Just thought the answer should include the caveat that it probably won't be portable to Windows. For filtering any user input, there is [`escapeshellarg()`](http://php.net/manual/en/function.escapeshellarg.php). – alex Jul 26 '13 at 03:05
  • This won't run on ReactOS either. Oh noes! – 0x6A75616E Jul 26 '13 at 18:03
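
The caution about user input reaching the shell can be sketched on the shell side as well. The following is a minimal illustration, not a complete defence: `is_http_url` and `fetch_page` are hypothetical helper names, and the timeout and retry values are assumptions. It rejects anything that is not a plain `http://` or `https://` URL without whitespace, then passes the URL to wget as a single quoted argument after `--`, so a value starting with `-` cannot be read as an option.

```shell
#!/bin/sh
# Hypothetical helper: succeed only for http(s) URLs that contain no whitespace.
is_http_url() {
    case "$1" in
        *[[:space:]]*) return 1 ;;              # reject embedded spaces, tabs, newlines
        http://?*|https://?*) return 0 ;;       # require an http(s) scheme plus something after it
        *) return 1 ;;                          # everything else, including "-options" and file://
    esac
}

# Hypothetical helper: download the URL in $1 into the directory in $2.
fetch_page() {
    url=$1
    dest=$2
    if ! is_http_url "$url"; then
        echo "rejected: not a plain http(s) URL" >&2
        return 1
    fi
    # Quoting "$url" keeps it a single argument; "--" ends option parsing.
    wget --quiet --timeout=10 --tries=2 --directory-prefix="$dest" -- "$url"
}
```

When the command is launched from PHP instead, the URL would still need to go through `escapeshellarg()` as mentioned in the comments, since PHP builds the command line as a single string.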