2

I have data stored in structured XML that I want to make more readable using XSLT (or an alternative). The target document should have many instances of text aligned to both the left and the right on the same line, and I need behaviour like floated divs:

<div style="float: left;">
  <p align="left">
    Left text. Left text. Left text. Left text. Left text. Left text.
  </p>
</div>

<div style="float: right;">
  <p align="right">
    Right text. Right text. Right text. Right text. Right text. Right text.
  </p>
</div>
<div style="clear: both;"></div>

This way, when the combined text length of both is greater than the container width, the "Left text" DIV is written first, then the "Right text" DIV is written BELOW it.

I can't use XSLT to make an HTML file, because the output will be used in print, and I also need a language that has some "keep together" feature at page breaks (if a page break would occur in the middle of an element, it should break before the element). Having tables that support auto-sizing of their columns (like the HTML table) would be a huge plus, but is not required.

I was studying XSL-FO, but I couldn't find a free renderer that supports those features. I thought about using XSL to make a WordML file, but I haven't found any tutorials on it. Having a Word (or OpenOffice) document would be great, because I could make minor adjustments. I'm also considering LaTeX.

What would you suggest?

EDIT: I was checking some CSS features that I didn't know about (I haven't worked with it in years), and it does have some print-related features (page-break-inside: avoid, @page, etc.) besides having excellent support for floats and automatic table layout. Even though the print features are not widely supported, Opera and IE do support them, and I ran some tests in IE9 and it rendered very well. So I will try XSLT with HTML/CSS, since it has everything I need and will have a smoother learning curve (I already know some CSS and have used HTML for years).
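
A minimal sketch of the kind of HTML/CSS output this approach could target, assuming the XSLT emits one wrapper per left/right pair. The class names (`row`, `left`, `right`, `clear`) and the page size and margin values are placeholders of mine, not something from an existing stylesheet, and how well `page-break-inside` is honoured depends on the browser or renderer used for printing.

```html
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Print layout sketch</title>
<style>
  /* Page box used when printing; size and margin are placeholder values. */
  @page { size: A4; margin: 2cm; }

  /* Ask the renderer not to split a left/right pair across pages. */
  .row { page-break-inside: avoid; }

  /* Same float behaviour as the inline styles in the question. */
  .left  { float: left;  text-align: left; }
  .right { float: right; text-align: right; }

  /* Ends the floated pair so the next row starts below both blocks. */
  .clear { clear: both; }
</style>
</head>
<body>
  <div class="row">
    <div class="left"><p>Left text. Left text. Left text.</p></div>
    <div class="right"><p>Right text. Right text. Right text.</p></div>
    <div class="clear"></div>
  </div>
</body>
</html>
```

When both blocks fit within the container width they share a line; when they do not, the right block drops below the left one, which is the behaviour described above.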

Luiz Borges
  • What suggestions do you need really? Couldn't collect your thoughts properly. Explain in short. – Sujit Agarwal Jun 01 '11 at 01:04
  • Since I'm a beginner at XML transformation, I need a suggestion for a (free) language that allows the 2 features I require (floats and page breaks). XSL-FO seems to fall short (at least with FOP and XEP Free). I don't know if LaTeX allows me to do that. I know that I can do that in Word documents (and probably in OO Writer as well), so XML to Word (or Writer) is a choice, but I can't find any good references or tutorials on that. Basically I need help picking a language to study. – Luiz Borges Jun 01 '11 at 01:15
  • Do you mean that you want to render an XML file's content into a table? – Sujit Agarwal Jun 01 '11 at 01:17
  • I want to render XML in a presentation format, like HTML, XSL-FO, WordML, etc. – Luiz Borges Jun 01 '11 at 01:33
  • I know that! What I want to know is whether, when you render the XML, it would be as an `HTML table`. By the way, do you know PHP? – Sujit Agarwal Jun 01 '11 at 01:34
  • The document would have tables, but it would be mostly text (with double-aligned headers like I explained above). I know a little PHP, but I don't see how that is relevant, since I need a printable document (with page breaks and so on). – Luiz Borges Jun 01 '11 at 01:55
  • So, why have you tagged this question "xslt"? I don't see anything XSLT in this question -- please re-tag. – Dimitre Novatchev Jun 01 '11 at 03:07

2 Answers

2

I think you should take a more general approach to your problem by going with a well-known standard XML schema like DITA or DocBook.

These schemas have the advantage of letting you write your own XML and render it as you need, in any of the output formats they support. Moreover, they are free, and you can easily obtain PDF, RTF, WebHelp, HTML and so on once you have the correct source documents.

So, you need to:

  • make your decision: DITA or DocBook?
  • write a transform that takes your XML and converts it to the standard you chose (DITA or DocBook)
  • decide which tool to use for processing that standard, test all the outputs you can obtain from it, and see which one fits best.

If you decide for DITA, you have two free choices there:

  • DITA-OT
  • DITAC

Personally, I would go with DITA and DITA-OT because it has the flexibility of plugins based on XSLT and custom builds based on Ant. But both have their merits. I started with DITA-OT and ended up using both DITA-OT and DITAC.

I did not put any reference here, because you can easily find what you need using Google.


ABOUT FLOAT

DITA-OT provides a specific XSL-FO transtype called PDF2, whose XSL-FO processor is RenderX. RenderX currently supports fo:float, so I imagine you will be able to find your way. Maybe you will need to override some PDF2 templates, which is definitely simpler than implementing your own PDF transform.

Note that RenderX is feasible as long as a small footer watermark on each page is acceptable to you. Otherwise you will have to spend some money.

Emiliano Poggi
  • My first thought (even before I started studying XSL) was DocBook or DITA, but I found them overwhelming, particularly DocBook. And since I had those design constraints with floats, I thought it wasn't going to work. Do you know how well DITA supports that? – Luiz Borges Jun 01 '11 at 11:03
  • DITA does not support "float". It's the output format you choose to generate from the DITA source that should support it. If your final choice is to go through XSL-FO to get PDF, do not write XSL-FO from scratch; use DITA-OT and it will do the work for you. – Emiliano Poggi Jun 01 '11 at 12:22
  • @Luiz-Borges: DITA-OT provides a specific XSL-FO transtype called PDF2, whose XSL-FO processor is RenderX. RenderX currently supports `fo:float`, so I imagine you will be able to find your way. _Maybe_ you will need to override some PDF2 templates, which is definitely simpler than implementing your own PDF transform. – Emiliano Poggi Jun 01 '11 at 18:49
  • @Luiz-Borges: among other things, please try to get more involved in how the SO community works. You might want to read about [upvoting](http://meta.stackexchange.com/questions/59828/how-to-upvote-responsibly) and [accept-rate/answers](http://meta.stackexchange.com/questions/27457/accept-rate-why-accept-an-answer-that-isnt-an-answer). – Emiliano Poggi Jun 01 '11 at 18:54
  • @empo, I just discovered Prince and it's fantastic. I was planning on using Opera with some PDF printer, but I can definitely live with the watermark in the corner of the free version of Prince; it's very stylish :) – Luiz Borges Jun 01 '11 at 21:52
0

I would definitely stick with XSL-FO and Apache FOP. I've been very happy with the results I've been able to get with it, and the only times I've seen it struggle are when someone thinks it's a good idea to try to get it to produce Word documents. I've never seen this produce good results. In fact, I've only ever been happy with the output it produces in PDF or PS format.

Please elaborate on how FOP was falling short regarding floats and page breaks. It's hard to know whether the problem is specific to those features, or specific to those features with a Word/RTF/HTML output format.

Finally, I would avoid the "tweak it in Word/OpenOffice" mentality. If you are going to the effort of automating the document generation, spend the time on getting it right, so there are no manual steps.

Tom Howard
  • FOP doesn't support fo:float, and this is a must, since our current documentation uses this style a lot (it's HTML/CSS based) and people have grown used to it. Besides, the alternatives are really bad; take your pick: half the space with tables, or overlapping with list-item hacks. – Luiz Borges Jun 01 '11 at 11:01
  • Wow, it's been so long since I used FOP that I just assumed floats were supported at some level. My bad. Sorry. – Tom Howard Jun 01 '11 at 12:16