I am using cheerio/jQuery (not sure which one is causing the issue) to search the DOM of a webpage for a large unordered list of items. If the list contains fewer than 100 elements, I can do whatever I want with it. However, I have one list with 250+ items in it, and I cannot load anything from it. I cannot even use :nth-child(125) to select a specific element. Has anyone else run into this? I have searched all over the web for about 15 minutes and have not found anything remotely close. Thanks in advance!
-
No way to answer without more understanding about how that page actually works. Could be paginated ... infinite scroll loading etc. Is it your page? – charlietfl Jan 22 '16 at 17:48
-
No, it isn't my page. The page isn't paginated, and if I run $('elem ul li a').length in the console window it comes back with 250; however, if I run that same code from cheerio it says 0. I'll also add that I can select h1's, h2's, etc. from the page. For some reason it is just these huge lists that will not select. – Brian B Jan 22 '16 at 17:57
-
Is that list loaded asynchronously? If it is, running code in the console can give different results than running it on page load. I doubt the problem is what you think (too many elements); 250 isn't a lot. – charlietfl Jan 22 '16 at 18:00
-
I bet it is async because it's data from their database. Hmm... I am new to the whole scraping thing, so do you know if there is a way to tell cheerio to wait until the page is done loading? – Brian B Jan 22 '16 at 18:06
-
Put a long delay in and then count elements. There is no magic way to tell that all the content has loaded without analyzing the code in that page, and even then there may not be a simple trigger. – charlietfl Jan 22 '16 at 18:07
-
Ahh, I feel stupid. I've been playing with my selector, and I think the website is adding a class dynamically or something, because I removed the one class (it was unnecessary, but Chrome inspector gave it to me so I kept it). Now it's working! Thanks for putting the fire under my a**, Charlie!! – Brian B Jan 22 '16 at 18:31
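One rough way to catch this kind of problem early is to check whether a class copied from the browser inspector actually appears in the HTML the scraper received. The sketch below assumes the raw response body is already available as a string; `classInRawHtml`, the sample markup, and the class names `item-list` and `is-ready` are all made up for illustration and do not come from the page in question.

```javascript
// Hypothetical helper: returns true if className appears as a token in any
// class attribute of the raw HTML string. This is a rough regex check meant
// for debugging only, not a substitute for a real HTML parser.
function classInRawHtml(html, className) {
  const classAttr = /\bclass\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  for (const match of html.matchAll(classAttr)) {
    // match[1] holds a double-quoted value, match[2] a single-quoted one.
    const tokens = (match[1] || match[2] || '').split(/\s+/);
    if (tokens.includes(className)) return true;
  }
  return false;
}

// Made-up markup: the server sends "item-list"; in this scenario a script
// running in the browser would add "is-ready" later.
const served = '<ul class="item-list"><li><a href="/a">A</a></li></ul>';
const inServed = classInRawHtml(served, 'item-list');
const addedLater = classInRawHtml(served, 'is-ready');
console.log(inServed, addedLater);
```

If a class from the inspector is missing from the served markup, dropping it from the selector (as was done here) is usually the simplest fix.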
-
No problem ... good learning experience that what appears in the console may not be what is there when the page actually loads. – charlietfl Jan 22 '16 at 18:33