Questions tagged [domcrawler]

DomCrawler is a Symfony component for PHP that eases DOM navigation for HTML and XML documents.

The filter() method accepts CSS selectors, in a syntax similar to jQuery's, and eases the selection of HTML tags and attributes.

179 questions
0
votes
2 answers

Get first level dom elements by Symfony Crawler

I am using the Symfony Crawler component to parse HTML like this:
//first level div
1
//sub div
2
// more levels and empty divs possible
Tesmen
  • 559
  • 1
  • 6
  • 21
0
votes
1 answer

get n list items using domcrawler

Is it possible to get only n items using DomCrawler? I have `$items = $website->filter('ul.listnews li'); $items->each(function($node,$con){ }` but I want to get only the first 5 items from the list. I tried running a for…
Bazinga777
  • 5,140
  • 13
  • 53
  • 92
0
votes
1 answer

Attempted to call an undefined method named "filter" of class "DOMElement"

$goutte = new GoutteClient(); $crawler = $goutte->request('GET', 'https://www.website.com'); $reviewContent = $crawler->filter('.review-content'); $rows = $reviewContent->filter('.row'); foreach ($rows as $row) { $col1 =…
Jonathan
  • 3,016
  • 9
  • 43
  • 74
0
votes
0 answers

Get parent of child element with xpath in Symfony2 Crawler

I can access all links via an XPath query: $result = $crawler->filterXPath('//ul[@id="menu"]/li/a'); but I wonder if it is possible to access parent…
cyb0k
  • 2,478
  • 23
  • 19
0
votes
0 answers

Change File size limit for Symfony2 DomCrawler or Goutte

I am using Goutte v2.0.4, which is a wrapper for Symfony2 DomCrawler. I have the HTML files stored locally. Some of them are below 10MB; I have crawled those files successfully. Other files are above 30MB. These are not getting crawled. This may be…
Tejas
  • 2,215
  • 2
  • 18
  • 27
0
votes
3 answers

Web Scrape Symfony2 - Impossible Challenge - Crawler Parsing

(Edit: I've still found no way of solving this problem. The $crawler object seems ridiculous to work with, I just want to parse it for a specific text, how hard is that? I cannot serialize() the entire crawler object either and make the entire…
Kenny
  • 2,124
  • 3
  • 33
  • 63
0
votes
1 answer

DOMCrawler find Tag with Inner HTML text

I'm trying to use Goutte to scrape a web page and I can't find a DOMCrawler method to search for actual text. Let's say there's a td, but it has no class or ID. So I need to search for, say, "Title" and then get that td's next sibling.
Kenny
  • 2,124
  • 3
  • 33
  • 63
-1
votes
1 answer

Is there any method in PHP DomCrawler to click on an element of a web page?

I'm trying to crawl the data for all products on a web page, but I am stuck at the "view-more" div. I need the crawler to click on it to show all products. I tried clicking it and using the full product URL to crawl, but it still not…
duclong
  • 11
  • 2
-1
votes
1 answer

How to crawl links that use the same CSS

I use this code to crawl the website, but I want the link as a separate result. I want the tag result separate from Artists to put them inside variables.
-1
votes
1 answer

Proper XPath Syntax

I'm trying to access an attribute of a previous sibling, but it's proving difficult. So basically the web page I'm trying to scrape is TERRIBLE and the anchor tags use crappy onclick instead of href. Stupid, I know. I'm trying to first find the…
Kenny
  • 2,124
  • 3
  • 33
  • 63
-2
votes
1 answer

Guzzle Symfony scrape iframes inside multiple Servers

I am building a scraper to scrape content using Guzzle and the Symfony DomCrawler, but I ran into an issue. The page I am scraping has multiple iframe servers. The default iframe is shown when the scraper loads the page, but in order to get the other…
-3
votes
1 answer

How to download image blocked by cors

When I scrape data which includes images from other websites, I encounter the following error: $.get('https://truyenvua.com/128/1081/1.jpg?gt=hdfgdfg', function (data) { console.log(data) }); An error occurred as follows. Please help me with the…
HungTV
  • 1
-3
votes
2 answers

Unset null values in array of DOM elements

I am iterating through an array in PHP and I get a result like the one below. It's an array of DOM elements from a DOM crawler library. { "data": [ { "content": null, "property": null }, { "content": "Build your communication…
-3
votes
1 answer

Curl is returning a string

I'm using curl to get my values from a site named PKNiC. My code is: function _isCurl() { return function_exists('curl_version'); } if (_iscurl()) { //curl is enabled $url =…
usman
  • 15
  • 4