Questions tagged [tidy]

Tidy is a C library for cleaning up "bad" HTML. Don't use this tag for questions about keeping your code tidy.

Tidy is a library written in C for converting syntactically incorrect HTML into correct HTML or XHTML. It is especially useful when scraping web pages with curl and XML parsing functions, because XML parsers don't accept bad HTML. Extensions for Tidy are available in PHP and Perl. The Tidy extension in PHP provides functions to convert bad HTML to XHTML, with options such as dropping deprecated tags (for example the font tag), hiding comments, dropping proprietary tags, dropping empty paragraphs, and many more.

571 questions
0
votes
1 answer

JTidy and boolean attributes

There is a radio button like the following: No. After Tidy's parsing I have a node with just 3 attributes, and that is the problem. How do I configure Tidy to parse boolean attributes? Thanks. P.S. My Tidy…
Sergii Zagriichuk
  • 5,389
  • 5
  • 28
  • 45
0
votes
1 answer

Mac OSX .zshrc $PATH variable ignoring local command

I'm on MacOSX Mountain Lion 10.8.2, using OhMyZsh and for some odd reason, I am not able to run a command which should trigger a script in my local directory. If I navigate via terminal to the folder containing "tidy"…
netpoetica
  • 3,375
  • 4
  • 27
  • 37
0
votes
1 answer

Tidy to xhtml with parameters such as allowFullScreen

I'm parsing simple oembed URLs from YouTube and converting them to XHTML; however, some HTML gets tidied (I believe) incorrectly. Shouldn't the valid XHTML be allowFullScreen="true"? If this is correct, is there some tidy module that…
ansiart
  • 2,563
  • 2
  • 23
  • 41
0
votes
2 answers

scrapi doesn't see tidy libraries

I have a simple Ruby file that scrapes a price off of Walmart's site. I did a gem install scrapi and a gem install tidy. When I run my code on my Windows 7 box I get the following…
rahrahruby
  • 673
  • 4
  • 11
  • 28
0
votes
2 answers

HTML tag replacement regex not quite working correctly

This is a follow-up to another question of mine. The solution I found worked great for every one of the test cases I threw at it, until a case showed up that eluded me the first time around. My goal is to reformat improperly formatted tag…
Cᴏʀʏ
  • 105,112
  • 20
  • 162
  • 194
0
votes
1 answer

PHP tidy turn off code improvement

I want to use PHP's tidy just for making the HTML output more readable, but not for any 'optimizations' of my code. In my opinion I am responsible for providing well-formed HTML, not readable HTML. So how can I turn this behavior off?
Zulakis
  • 7,859
  • 10
  • 42
  • 67
0
votes
1 answer

Smarty output filter

Smarty templating for PHP lets you write output filters that are called every time fetch() or display() is called. Smarty also uses output buffers and you can't create your own (you can't have an output buffer while the other one is still…
user1521604
0
votes
1 answer

PHP tidy disable repair feature, just indent

My code looks like that $markup = ob_get_clean(); // Specify configuration $config = array( 'indent' => true, 'output-xhtml' => true, 'wrap' => 200); // Tidy $tidy = new tidy; $tidy->parseString($markup, $config,…
heron
  • 3,611
  • 25
  • 80
  • 148
0
votes
2 answers

clean up html code

Hi, I am looking for a script to simply line up my HTML code (opening and closing tags line up, etc.). I have tried tidyHTML and a few web-based solutions, but they all stuff up my grid system and PHP includes by removing or changing elements. I just want…
Mitchell Bray
  • 558
  • 2
  • 14
  • 29
0
votes
2 answers

How can I tidy up some XML while keeping the newlines?

I'm working with some XML files as part of a team. Since some people have different indentation settings, the formatting sometimes gets screwed up, and it's convenient to have an automated tool re-pretty-print the file. Is there a way to…
PotatoEngineer
  • 1,572
  • 3
  • 20
  • 26
0
votes
1 answer

Unable to retrieve web data in java using tidy and Xpath

What I am trying to do is scrape a simple inner HTML from an XHTML file. I have narrowed down my search to the element node, but I fail to retrieve the information. Please note: the element node has no child node. I get a null pointer exception for…
user1511443
  • 175
  • 1
  • 11
0
votes
2 answers

Tidy causing bad spacing issues (JTidy)

We are using JTidy to clean up some HTML for SAX processing. We've had a lot of trouble with spacing issues, as shown in this example: Html stackoverflow which outputs "stackoverflow" But... Post…
jfeust
  • 845
  • 1
  • 9
  • 19
0
votes
1 answer

Tidy gives non-standard HTML

I use Tidy to clean HTML files and make them compliant with HTML/XHTML. However, the output contains non-standard attribute values like: ... or (look at the single quotes). How can I configure Tidy to give strict…
Viet
  • 17,944
  • 33
  • 103
  • 135
0
votes
2 answers

Fixing malformed html that html tidy doesn't fix

Okay, so I've been using HTML Tidy to convert regular HTML webpages into XHTML suitable for parsing. The problem is that the test page I saved in Firefox apparently had its HTML somewhat precleaned by Firefox during saving; call this File F. Html…
Peter Smith
  • 849
  • 2
  • 11
  • 28
-1
votes
1 answer

Dataframe with all information in one row. How can I split it?

I have a dataframe where all the information is in one row; see the picture below (untidy data). I need to change it to something like this (tidy data), so the first value (suffix_name) in the row should be changed to a variable and the second value (none) should…
shabuya
  • 11
  • 1