How to echo a website page in php that has js file in it?

Question

There is a site that I want to scrape: https://tse.ir/MarketWatch.html

I know that I have to use:

file_get_contents("https://examplesite.html")

to get the HTML of the site, but how can I find a specific part of it? For example, I am looking for an element like this in the saved text file:

<td title="something" class="someclass">Name</td>

When I open the text file, I never see this part, and I think that is because the website uses a JavaScript file. How can I get all of the website's content, including every part I want?

score 2 · Accepted Answer · edited May 06 '20 at 21:20

The content is loaded by an AJAX request made from JavaScript. This means you can't get this data by simply grabbing the page contents.

There are two ways of collecting the data you need:

Use a solution based on Selenium WebDriver to load the page in a real browser (which will execute the JS), and collect the data from the rendered DOM.
Research what requests the website sends to get this data. You can use the network activity tab in the browser dev tools. Here is an example for Chrome; other browsers are the same or similar. Then send the same request yourself and parse the response according to your needs.

In your specific case, you could probably use this URL: https://tseest.ir/json/MarketWatch/data_211111.json to access the JSON object with the data you need.
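
The second approach can be sketched in PHP roughly as follows. This is only a minimal sketch: the function names fetchBody and decodeJson are made up for illustration, the timeout and User-Agent values are arbitrary, and the structure of the JSON the site returns is not known here, so inspect the decoded array before relying on any particular key. JSON_THROW_ON_ERROR needs PHP 7.3 or newer.

```php
<?php
// Sketch only: fetch a JSON endpoint and decode it into a PHP array.
// fetchBody() and decodeJson() are hypothetical helper names.

/**
 * Download the raw response body from $url.
 * Throws RuntimeException if the request fails.
 */
function fetchBody(string $url): string
{
    // Timeout and User-Agent are arbitrary choices, not requirements of the site.
    $context = stream_context_create([
        'http' => [
            'timeout' => 10,
            'header'  => "User-Agent: Mozilla/5.0\r\n",
        ],
    ]);
    $body = @file_get_contents($url, false, $context);
    if ($body === false) {
        throw new RuntimeException("Request failed: $url");
    }
    return $body;
}

/**
 * Decode a JSON string into an associative array.
 * Throws JsonException on malformed input.
 */
function decodeJson(string $json): array
{
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    if (!is_array($data)) {
        throw new UnexpectedValueException('Expected a JSON object or array');
    }
    return $data;
}

// Usage (commented out so the file runs without network access);
// $url is the JSON URL you found in the network tab:
// $data = decodeJson(fetchBody($url));
// print_r(array_keys($data));
```

Printing the top-level keys first is a cheap way to learn the layout of the response before writing any code that depends on it.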

edited May 06 '20 at 21:20

TheFaultInOurStars

answered Feb 22 '20 at 15:46

nklen

Thank you very much! Does this URL give me fresh data, or do I have to change something to get new data? – TheFaultInOurStars Feb 22 '20 at 18:17
I guess it should be fresh data, but you need to double-check it. There is also another URL you could check: https://tse.ir/json/MarketWatch/update_1.json – nklen Feb 22 '20 at 22:17
Can you tell me exactly where you found this URL? :)) I really need to know the source of these URLs ;) Thank you very much, it helped me a lot – TheFaultInOurStars Feb 22 '20 at 23:36
I read your answer again and now I understand where you found it. Thank you so much – TheFaultInOurStars Feb 22 '20 at 23:47

1000Gbps · Answer 2 · 2020-04-20T15:59:35.843

1

You have three options for scraping the data:

There's an export to an Excel file: https://tse.ir/json/MarketWatch/MarketWatch_1.xls?1582392259131. Parse it, and just remember that the number after the ? is a Unix timestamp in milliseconds, where the first 10 digits are the seconds since the epoch, i.e. the month/day/year/hours/minutes.
There are also probably one or more refresh functions for the market data somewhere in the .js files loaded by the page. Find them and see if you can connect directly to the source (usually a .json file).
Download the page at your chosen interval and scrape each table row using PHP's DOMXPath::query.
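
The third option might look something like the sketch below. The helper name extractTableRows and the XPath expressions are assumptions made for illustration, since the real table markup of the page is not shown here. Keep in mind that it can only find rows that are present in the downloaded HTML; if the rows are injected by JavaScript, as the accepted answer describes, the result will be empty.

```php
<?php
// Sketch only: extract the cell texts of every table row from an HTML string.
// extractTableRows() is a hypothetical helper name.

/**
 * Return one array of trimmed cell texts per <tr> that contains <td> cells.
 * Rows that only have <th> cells (headers) are skipped.
 */
function extractTableRows(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // The XML prolog is a common trick to make loadHTML() treat the input as UTF-8,
    // so that non-ASCII text (e.g. Persian symbols) is not mangled.
    $doc->loadHTML('<?xml encoding="utf-8" ?>' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('//tr[td]') as $tr) {
        $cells = [];
        // The second argument makes the query relative to the current row.
        foreach ($xpath->query('./td', $tr) as $td) {
            $cells[] = trim($td->textContent);
        }
        $rows[] = $cells;
    }
    return $rows;
}

// Usage (commented out so the file runs without network access):
// $rows = extractTableRows(file_get_contents('https://tse.ir/MarketWatch.html'));
// print_r($rows);
```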

edited Apr 20 '20 at 15:59

answered Feb 22 '20 at 17:39

1000Gbps

Did you mean that I should use the date as a Unix timestamp to get fresh data, or something else? – TheFaultInOurStars Feb 22 '20 at 18:16
No, I was just letting you know to use those dates when you fill a database with the market data – 1000Gbps Feb 22 '20 at 23:24
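
To make the timestamp remark concrete, here is a small sketch of converting the number from the export URL into a readable date, for example before storing it in a database. The function names millisToUtc and currentMillis are made up for illustration, and the code assumes a 64-bit PHP build, because millisecond timestamps overflow a 32-bit integer.

```php
<?php
// Sketch only: convert between Unix timestamps in milliseconds and UTC date strings.
// millisToUtc() and currentMillis() are hypothetical helper names.

/**
 * Convert a millisecond timestamp (e.g. the number after "?" in the export URL)
 * to a UTC date string.
 */
function millisToUtc(int $millis): string
{
    // Integer division drops the last three digits (the milliseconds),
    // leaving the 10-digit seconds value.
    $seconds = intdiv($millis, 1000);
    return gmdate('Y-m-d H:i:s', $seconds);
}

/**
 * Current time as a millisecond timestamp, in the same format as the URL suffix.
 */
function currentMillis(): int
{
    return (int) floor(microtime(true) * 1000);
}

echo millisToUtc(1582392259131), PHP_EOL; // 2020-02-22 17:24:19
```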
