40

How can I create PDF files from HTML input in Go? If this is not possible yet, are there any initiatives that aim to solve this problem?

I'm looking for a solution like TCPDF in PHP.

mimrock
  • 1
    https://github.com/SebastiaanKlippert/go-wkhtmltopdf see this link – muthukumar selvaraj Feb 01 '18 at 17:35
  • With PDF generators in most languages you get only basic HTML support, without proper fonts or recent CSS. The best solution I have found so far, after trying many programming languages, is to run Chrome via Puppeteer as a microservice for generating PDFs. It works perfectly. Otherwise you have to use paid libraries. – Krishnadas PC Jul 09 '23 at 17:18

8 Answers

17

What about gopdf (https://github.com/signintech/gopdf)?

It seems to be what you are looking for.

Trần Hữu Hiền
Juan de Parras
  • 29
    Neither of these libraries address the question. The user is looking for HTML to PDF. These are just PDF generators. While they may be good ones, neither converts an HTML document into a PDF. I need to ask this same question in a separate thread. – taystack Oct 02 '17 at 17:23
  • Neither of the above dependencies works for converting HTML to `.pdf`. – muthukumar selvaraj Feb 01 '18 at 17:38
  • This answer seems to be correct : https://stackoverflow.com/a/48568415/8730051 – Hamed Lohi May 18 '22 at 15:04
17

Installation

go get -u github.com/SebastiaanKlippert/go-wkhtmltopdf

go version go1.9.2 linux/amd64

code

package main

import (
    "fmt"
    "log"
    "strings"

    wkhtml "github.com/SebastiaanKlippert/go-wkhtmltopdf"
)

func main() {
    // NewPDFGenerator returns an error if the wkhtmltopdf binary cannot be found.
    pdfg, err := wkhtml.NewPDFGenerator()
    if err != nil {
        log.Fatal(err)
    }

    // The HTML is kept on one line so the image URL is not broken by a newline.
    htmlStr := `<html><body><h1 style="color:red;">This is an html from pdf to test color</h1><img src="http://api.qrserver.com/v1/create-qr-code/?data=HelloWorld" alt="img" height="42" width="42"></body></html>`

    // Add the HTML string as a page, read from an io.Reader.
    pdfg.AddPage(wkhtml.NewPageReader(strings.NewReader(htmlStr)))

    // Create PDF document in internal buffer
    err = pdfg.Create()
    if err != nil {
        log.Fatal(err)
    }

    // Your PDF name
    err = pdfg.WriteFile("./Your_pdfname.pdf")
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println("Done")
}

The above code converts HTML to PDF in Go, including background images and embedded CSS style tags.

Check repo

See Pull request Documentation Improved

Recommendations (from https://wkhtmltopdf.org/status.html) :

Do not use wkhtmltopdf with any untrusted HTML – be sure to sanitize any user-supplied HTML/JS, otherwise it can lead to complete takeover of the server it is running on! Please consider using a Mandatory Access Control system like AppArmor or SELinux, see recommended AppArmor policy.

If you’re using it for report generation (i.e. with HTML you control), also consider using WeasyPrint or the commercial tool Prince – note that I’m not affiliated with either project, and do your diligence.

If you’re using it to convert a site which uses dynamic JS, consider using puppeteer or one of the many wrappers it has.
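
The first recommendation is easy to overlook when the HTML is assembled from user input. Here is a minimal sketch, assuming the user-supplied values are plain text (a title, a name) placed into a template you control: Go's standard `html/template` package escapes such values before the HTML ever reaches the converter. The names `reportTmpl` and `renderReport` are hypothetical, and this does not make arbitrary user-supplied HTML or JS safe; it only covers text fields.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
)

// reportTmpl is a hypothetical template that you control; only the
// Title and Author values come from the user.
var reportTmpl = template.Must(template.New("report").Parse(
	`<html><body><h1>{{.Title}}</h1><p>By {{.Author}}</p></body></html>`))

// renderReport returns the HTML with the user-supplied text escaped
// by html/template.
func renderReport(title, author string) (string, error) {
	var buf bytes.Buffer
	data := struct{ Title, Author string }{title, author}
	if err := reportTmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := renderReport("<script>alert(1)</script>", "Jane")
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	// The script tag is emitted as inert text, not as markup.
	fmt.Println(out)
}
```

The resulting string could then be passed to `strings.NewReader` in the code above in place of the hard-coded `htmlStr`.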

Magellan
  • 4
    `go-wkhtmltopdf` depends on the `wkhtmltopdf` binary, which has to be installed on the system beforehand. The `wkhtmltopdf` binary in turn depends on about 50 or 60 packages from xserver. It is not suitable as a backend solution at all. – Igor May 19 '22 at 16:28
  • It should be noted that wkhtmltopdf has been archived, and is no longer actively maintained. – David May 03 '23 at 21:29
7

There is also this package wkhtmltopdf-go, which uses the libwkhtmltox library. I am not sure how stable it is though.

damonkelley
7

The function page.PrintToPDF() works great.

Here is an example using it with chromedp (go get -u github.com/chromedp/chromedp):

package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "time"

    "github.com/chromedp/cdproto/emulation"
    "github.com/chromedp/cdproto/page"
    "github.com/chromedp/chromedp"
)

func main() {
    taskCtx, cancel := chromedp.NewContext(
        context.Background(),
        chromedp.WithLogf(log.Printf),
    )
    defer cancel()

    var pdfBuffer []byte
    if err := chromedp.Run(taskCtx, pdfGrabber("https://www.wikipedia.org", "body", &pdfBuffer)); err != nil {
        log.Fatal(err)
    }
    if err := os.WriteFile("coolsite.pdf", pdfBuffer, 0644); err != nil {
        log.Fatal(err)
    }
}

// pdfGrabber navigates to url, waits until the element matching sel is
// visible, and stores the printed PDF in res.
func pdfGrabber(url string, sel string, res *[]byte) chromedp.Tasks {
    start := time.Now()
    return chromedp.Tasks{
        emulation.SetUserAgentOverride("WebScraper 1.0"),
        chromedp.Navigate(url),
        // wait until the selected element is visible (i.e., the page is loaded)
        chromedp.WaitVisible(sel, chromedp.ByQuery),
        chromedp.ActionFunc(func(ctx context.Context) error {
            // print the page, including background graphics
            buf, _, err := page.PrintToPDF().WithPrintBackground(true).Do(ctx)
            if err != nil {
                return err
            }
            *res = buf
            fmt.Printf("\nTook: %f secs\n", time.Since(start).Seconds())
            return nil
        }),
    }
}

The above loads wikipedia.org in headless Chrome, waits for the body element to become visible, and then saves the page as a PDF.

results in terminal:

$ go run main.go
https://www.wikipedia.org
Scraping url now...

Took: 2.772797 secs
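
The example prints a live URL, while the question asks about HTML input. One possible bridge, sketched under the assumption that headless Chrome accepts a `data:` URL as a navigation target, is to encode the HTML string and pass the result to `pdfGrabber` in place of the Wikipedia address. `htmlDataURL` is a hypothetical helper, not part of chromedp, and uses only the standard library.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// htmlDataURL (hypothetical helper) wraps an HTML string in a base64
// data URL so it can be handed to anything that expects a URL.
func htmlDataURL(html string) string {
	return "data:text/html;charset=utf-8;base64," +
		base64.StdEncoding.EncodeToString([]byte(html))
}

func main() {
	u := htmlDataURL(`<html><body><h1 style="color:red">Hello</h1></body></html>`)
	// Assumption: this value would replace "https://www.wikipedia.org"
	// in the chromedp.Run call of the example above.
	fmt.Println(u)
}
```

Relative links to images or stylesheets have no base address in a data URL, so this probably suits self-contained HTML best. For larger documents, serving the HTML from a local HTTP server may be the more robust choice.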
james-see
4

I don't think I understand your requirements. Since HTML is a markup language, it needs context to render (CSS and a screen size). Existing implementations I've seen generally open the page in a headless browser and create a PDF that way.

Personally, I would just use an existing package and shell out from Go. This one looks good; it's even recommended in this answer.
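
Shelling out could look roughly like the sketch below. It assumes the external tool is the `wkhtmltopdf` command-line program mentioned elsewhere in this thread, that it is installed and on the PATH, and that it is invoked as `wkhtmltopdf --quiet <input> <output>`. The names `convertArgs` and `htmlFileToPDF` and the file names are hypothetical.

```go
package main

import (
	"fmt"
	"os/exec"
)

// converter is the external binary to run. Assumption: wkhtmltopdf,
// which takes the input document followed by the output file.
const converter = "wkhtmltopdf"

// convertArgs builds the argument list for one conversion.
func convertArgs(input, output string) []string {
	return []string{"--quiet", input, output}
}

// htmlFileToPDF runs the converter and includes its output in the
// returned error if the conversion fails.
func htmlFileToPDF(input, output string) error {
	path, err := exec.LookPath(converter)
	if err != nil {
		return fmt.Errorf("%s not found on PATH: %w", converter, err)
	}
	out, err := exec.Command(path, convertArgs(input, output)...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("%s failed: %v: %s", converter, err, out)
	}
	return nil
}

func main() {
	// "report.html" and "report.pdf" are placeholder file names.
	if err := htmlFileToPDF("report.html", "report.pdf"); err != nil {
		fmt.Println("conversion failed:", err)
		return
	}
	fmt.Println("wrote report.pdf")
}
```

Checking for the binary with `exec.LookPath` first gives a clearer error when the tool is missing than a failed `exec.Command` would.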

If you're really determined to implement it all in Go, check out this WebKit wrapper. I'm not sure what you'd use for generating PDFs, but at least it's a start.

Community
beatgammit
  • I do not have too many special requirements. I need to create pdf files, but preferably not from go code, but from a source that is a good compromise between flexibility and easy learning. In php, there are multiple libraries for converting HTML documents to pdf, because HTML is easy to learn, and pretty flexible. I was curious if someone has already written a library like that. Thank you for your answer. – mimrock Feb 21 '13 at 09:09
3

I'm creating an alternative library for creating PDFs in a simpler way (https://github.com/johnfercher/maroto). It uses gofpdf and has a grid system and some components, like Bootstrap.

1

Another option is Athena. It has a microservice written in Go or it can be used as a CLI.

Pier
-4

Another option is UniHTML (container-based, with an API), which interoperates with UniPDF and is useful for creating PDF reports and the like from HTML templates.

It uses a headless Chrome engine in a container, so the rendering matches Chrome and supports all HTML features. The combination with UniPDF gives additional advantages, such as automatic table of contents generation and outlines, as well as the ability to add password protection, PDF forms, and digital signatures.

A PDF can be created from an HTML template on disk as follows:

package main

import (
    "fmt"
    "os"

    "github.com/unidoc/unihtml"
    "github.com/unidoc/unipdf/v3/common/license"
    "github.com/unidoc/unipdf/v3/creator"
)

func main() {
    // Set the UniDoc license.
    if err := license.SetMeteredKey("my api key goes here"); err != nil {
        fmt.Printf("Err: setting metered key failed: %v\n", err)
        os.Exit(1)
    }

    // Establish connection with the UniHTML Server.
    if err := unihtml.Connect(":8080"); err != nil {
        fmt.Printf("Err:  Connect failed: %v\n", err)
        os.Exit(1)
    }

    // Get new PDF Creator.
    c := creator.New()

    // AddTOC enables Table of Contents generation.
    c.AddTOC = true

    chapter := c.NewChapter("Points")

    // Read the content of the sample.html file and load it to the conversion.
    htmlDocument, err := unihtml.NewDocument("sample.html")
    if err != nil {
        fmt.Printf("Err: NewDocument failed: %v\n", err)
        os.Exit(1)
    }

    // Add the HTML document to the chapter.
    if err = chapter.Add(htmlDocument); err != nil {
        fmt.Printf("Err: Add failed: %v\n", err)
        os.Exit(1)
    }

    if err = c.Draw(chapter); err != nil {
        fmt.Printf("Err: Draw failed: %v\n", err)
        os.Exit(1)
    }


    // Write the result file to PDF.
    if err = c.WriteToFile("sample.pdf"); err != nil {
        fmt.Printf("Err: %v\n", err)
        os.Exit(1)
    }
}

I have written an introductory article on UniHTML, which might be useful if more information is needed: https://www.unidoc.io/post/html-for-pdf-reports-in-go

Disclosure: I am the original developer of UniPDF.

  • 6
    You might want to warn people that your solution requires paying _at least_ $1500. Source: https://www.unidoc.io/pricing – ewen-lbh May 30 '21 at 20:30