Questions tagged [html-treebuilder]

Parser that builds an HTML syntax tree.

HTML::TreeBuilder is a Perl parser that builds an HTML syntax tree from data or a string.
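
As a minimal sketch of what "building a tree from a string" looks like in practice, the snippet below parses a small, made-up HTML string and walks the matching elements. The sample markup and variable names are illustrative assumptions; `new_from_content`, `look_down`, `as_text`, and `delete` are methods provided by HTML::TreeBuilder and HTML::Element.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample input, used only for illustration.
my $html = '<html><body><p class="note">Hello</p><p>World</p></body></html>';

# Parse the string and build the syntax tree in one step.
my $tree = HTML::TreeBuilder->new_from_content($html);

# In list context, look_down returns every element matching the criteria.
for my $p ($tree->look_down(_tag => 'p')) {
    print $p->as_text, "\n";
}

# Release the tree explicitly once it is no longer needed.
$tree->delete;
```

Calling `look_down` in scalar context returns only the first match (or undef when nothing matches), so it is worth checking the result before calling further methods on it.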

33 questions
1
vote
2 answers

Need suggestion in printing the matched result one by one by using HTML-TreeBuilder-XPath findnodes() method

I am parsing HTML content using HTML-TreeBuilder-XPath in Perl. I have the XPath location of the data I need. The issue I am facing is that there are several matches for the XPath, which $html->findnodes() returns as a single result, but I…
Balakumar
  • 650
  • 1
  • 12
  • 29
1
vote
2 answers

Perl: Find varying element id using HTML::TreeBuilder

I am trying to use a website's built-in search function to collect data from it, but I can't work out how to press the 'search' button, as it has some JavaScript wrapped around it and the id changes with each new iteration of the page. Data for the…
MicrobicTiger
  • 577
  • 2
  • 5
  • 21
0
votes
3 answers

bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: html-parser. Do you need to install a parser library?

I was trying to do web scraping with the following code: from bs4 import BeautifulSoup import requests import pandas as pd page = requests.get('https://www.google.com/search?q=phagwara+weather') soup = BeautifulSoup(page.content, 'html-parser') day =…
0
votes
1 answer

Perl HTML::Element how to look_down to extract next tag after a matching tag

I am using HTML::TreeBuilder to process HTML files. In those files I can have definition lists where there is term "Database" with definition "Database Name". Simulated html looks like this: #!/usr/bin/perl -w use strict; use warnings; use…
r0berts
  • 842
  • 1
  • 13
  • 27
0
votes
2 answers

Why does look_down method in HTML::Element fail to find <section> elements?

The code below shows that TreeBuilder method look_down cannot find the "section" element. Why? use strict; use warnings; use HTML::TreeBuilder; my $html =<<'END_HTML';
Shang Zhang
  • 269
  • 1
  • 8
0
votes
1 answer

TreeBuilder Get embedded nodes

Basically, I need to get the names and emails from all of these people in the HTML code. Name
0
votes
2 answers

HTML::TreeBuilder::XPath findvalue returns concatenation of values

The findvalue function in HTML::TreeBuilder::XPath returns a concatenation of any values found by the xpath query. Why does it do this, and how could a concatenation of the values be useful to anyone?
CJ7
  • 22,579
  • 65
  • 193
  • 321
0
votes
2 answers

HTML::TreeBuilder::XPath missing last tag in result

use WWW::Mechanize; use HTML::TreeBuilder::XPath; my $mech = new WWW::Mechanize; my $tree = new HTML::TreeBuilder::XPath; my $url = "http://www.elaws.gov.bw/wondersbtree.php"; $mech->get($url); $tree->parse($mech->content()); @nodes =…
CJ7
  • 22,579
  • 65
  • 193
  • 321
0
votes
1 answer

HTML::TreeBuilder inside a loop

I'm trying to delete all table elements from several HTML files. The following code runs perfectly on a single file, but when I try to automate the process it returns the error: Can't call method "look_down" on an undefined value. Do you have any…
Tommy
  • 1
0
votes
1 answer

Perl split string at character entity reference &nbsp;

Quick Perl question with hopefully a simple answer. I'm trying to perform a split on a string containing non-breaking spaces (&nbsp;). This is after reading in an HTML page using HTML::TreeBuilder::XPath and retrieving the string needed by…
dan j
  • 157
  • 1
  • 11
0
votes
1 answer

Tree Builder issue with unicode text

I am using HTML::TreeBuilder to extract the contents of a URL by calling $tree->look_down and then extracting the text part from the string that look_down returns. My problem is that when I read that text and write it to a file, it shows up as junk. I am…
Nagaraju
  • 1,853
  • 2
  • 27
  • 46
0
votes
2 answers

How to parse html with HTML::TreeBuilder?

This is the code I'd like to parse [...]

Acid Splash

Caster Level(s):…

Daniele
  • 310
  • 3
  • 15
0
votes
1 answer

Perl HTML::TreeBuilder tag not equal to

I am using HTML::TreeBuilder to extract data from an HTML file. What I need to do is: $div->look_down(_tag => 'a', 'href' !=> 'index.html') So I am searching for an href that is not equal to 'index.html' and one other tag, but obviously !=>…
Lenny
  • 887
  • 1
  • 11
  • 32
0
votes
1 answer

Converting HTML table to text using perl

I have HTML table content that I am trying to convert into text with the same structure, using HTML::TreeBuilder and HTML::FormatText in Perl. I have tried this code: use strict; use warnings; use HTML::TreeBuilder; use…
Balakumar
  • 650
  • 1
  • 12
  • 29
0
votes
1 answer

How to fetch the value of a HTML tag using HTML::Tree?

Let's say I have an array which holds the contents of the body tag, as shown below: print Dumper(\@array); $VAR1 = [