Comprehensive Guide to PDF Generation using pdfkit in Python

Introduction to PDFKit

PDFKit (installed as the pdfkit package) is a Python wrapper around the wkhtmltopdf command-line tool for generating PDFs from HTML. It renders HTML and CSS into PDF documents, which makes it a convenient choice for web developers and software engineers who already produce HTML.

Installation

To get started, install pdfkit using pip:

  pip install pdfkit

PDFKit does not render PDFs itself: it calls the wkhtmltopdf binary, which must be installed separately by following the instructions on the official wkhtmltopdf site.

Basic Usage

Here’s a simple example of creating a PDF from an HTML string:

  import pdfkit

  pdfkit.from_string('<h1>Hello World</h1>', 'output.pdf')
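
In practice the HTML string is often assembled from data rather than typed by hand. The following is a minimal sketch, assuming a hypothetical build_html helper (not part of PDFKit), that escapes text with the standard library's html.escape before the result is handed to from_string:

```python
from html import escape


def build_html(title, paragraphs):
    """Return a small HTML document (hypothetical helper, not a pdfkit API)."""
    # Escape every piece of text so characters like & and < stay literal.
    body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        f"<title>{escape(title)}</title></head>\n"
        f"<body><h1>{escape(title)}</h1>\n{body}\n</body></html>"
    )


html_doc = build_html("Hello World", ["Tom & Jerry", "1 < 2"])
# The string can then be passed on, for example:
#   pdfkit.from_string(html_doc, 'output.pdf')
```

Declaring the charset in the document itself helps non-ASCII text survive the conversion.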

Generating PDFs from File

You can also generate a PDF from an HTML file:

  pdfkit.from_file('input.html', 'output.pdf')

Generating PDFs from URL

PDFKit allows you to create PDFs directly from a URL:

  pdfkit.from_url('https://www.example.com', 'output.pdf')
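
Since wkhtmltopdf will try to fetch whatever address it is given, it can be worth checking the URL's shape first. This is a small sketch using only the standard library; is_http_url is a hypothetical helper name, and the rule it applies (http or https with a host) is an assumption about what a caller would want to allow:

```python
from urllib.parse import urlparse


def is_http_url(url):
    """Return True for absolute http(s) URLs that include a host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


url = "https://www.example.com"
if is_http_url(url):
    # Safe to hand over, for example:
    #   pdfkit.from_url(url, 'output.pdf')
    pass
```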

Setting Options

PDFKit supports various options to customize the PDF generation process. You can pass options as a dictionary:

  options = {
    'page-size': 'A4',
    'margin-top': '0.75in',
    'margin-right': '0.75in',
    'margin-bottom': '0.75in',
    'margin-left': '0.75in',
    'encoding': "UTF-8",
    'custom-header': [
      ('Accept-Encoding', 'gzip')
    ]
  }
  pdfkit.from_url('https://www.example.com', 'output.pdf', options=options)
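
Each key in the dictionary corresponds to a wkhtmltopdf command-line flag. The sketch below approximates that mapping to show what the dictionary stands for; options_to_args is a hypothetical function written for illustration and is not PDFKit's own code:

```python
def options_to_args(options):
    """Approximate the command-line arguments an options dict stands for."""
    args = []
    for key, value in options.items():
        flag = "--" + key
        if value is None:
            # None marks a switch that takes no argument.
            args.append(flag)
        elif isinstance(value, (list, tuple)):
            # Repeatable options: emit the flag once per entry.
            for entry in value:
                args.append(flag)
                if isinstance(entry, (list, tuple)):
                    args.extend(str(part) for part in entry)
                else:
                    args.append(str(entry))
        else:
            args.extend([flag, str(value)])
    return args


print(options_to_args({
    "page-size": "A4",
    "no-outline": None,
    "custom-header": [("Accept-Encoding", "gzip")],
}))
```

Reading the dictionary this way makes it easier to look up any option in the wkhtmltopdf documentation: drop the leading dashes from the flag name and use it as the key.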

Adding Headers and Footers

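You can add headers and footers to the PDFs. wkhtmltopdf replaces placeholders such as [page] and [topage] with the current page number and the number of the last page:

  options = {
    'header-left': 'Page [page] of [topage]',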
    'footer-center': 'Generated by PDFKit'
  }
  pdfkit.from_file('input.html', 'output.pdf', options=options)

Using Configuration

If wkhtmltopdf is not in your PATH, you can specify its location:

  config = pdfkit.configuration(wkhtmltopdf='/path/to/wkhtmltopdf')
  pdfkit.from_url('https://www.example.com', 'output.pdf', configuration=config)
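
Rather than hard-coding the location, a script can look for the binary itself. This is a minimal sketch using the standard library; find_wkhtmltopdf and the candidate paths are hypothetical and would need adjusting for a real system:

```python
import os
import shutil


def find_wkhtmltopdf(candidates=()):
    """Return a path to wkhtmltopdf, or None if it cannot be found."""
    # Prefer whatever is already on PATH.
    found = shutil.which("wkhtmltopdf")
    if found:
        return found
    # Otherwise fall back to the first candidate that is an executable file.
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


path = find_wkhtmltopdf(["/usr/local/bin/wkhtmltopdf", "/opt/bin/wkhtmltopdf"])
# If a path was found, it can be passed on, for example:
#   config = pdfkit.configuration(wkhtmltopdf=path)
```

Checking for None before building the configuration lets the application report a clear "wkhtmltopdf is not installed" message instead of failing later during conversion.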

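PDF Encryption and Security

wkhtmltopdf has no option for encrypting its output, so PDFKit cannot password-protect a PDF on its own. Its username and password options are HTTP authentication credentials used when fetching a page, and there is no no-print option. To protect a file, generate it first and encrypt it afterwards with a separate PDF library. The example below uses pypdf, which is not part of PDFKit and must be installed separately:

  from pypdf import PdfReader, PdfWriter

  pdfkit.from_file('input.html', 'output.pdf')
  writer = PdfWriter()
  for page in PdfReader('output.pdf').pages:
      writer.add_page(page)
  writer.encrypt('your_password')  # password required to open the file
  with open('encrypted_output.pdf', 'wb') as f:
      writer.write(f)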

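Combining Multiple HTML Files

wkhtmltopdf reads HTML, not PDF, so from_file cannot merge existing PDFs. It does accept a list of HTML files and renders them, in order, into a single PDF:

  input_files = ['file1.html', 'file2.html']
  pdfkit.from_file(input_files, 'combined_output.pdf')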

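Embedding Images

Images are embedded through ordinary <img> tags. Relative paths may not resolve as expected when the HTML comes from a string, so use an absolute path or a URL. wkhtmltopdf 0.12.6 and later also block local file access by default, so pass the enable-local-file-access option when the image is on disk:

  html_content = '<img src="/absolute/path/to/image.jpg" alt="Sample Image">'
  options = {'enable-local-file-access': None}
  pdfkit.from_string(html_content, 'image_output.pdf', options=options)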
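
One way to sidestep file-path resolution altogether is to inline the image as a base64 data URI, which browsers and WebKit-based renderers generally accept. The sketch below uses only the standard library; image_to_data_uri and img_tag are hypothetical helpers, not PDFKit functions:

```python
import base64
import mimetypes
from html import escape


def image_to_data_uri(path):
    """Read an image file and return it as a base64 data URI."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None:
        # Unknown extension: fall back to a generic binary type.
        mime = "application/octet-stream"
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def img_tag(path, alt=""):
    """Build an <img> tag whose source is the inlined image data."""
    return f'<img src="{image_to_data_uri(path)}" alt="{escape(alt)}">'


# Example use, assuming the image file exists:
#   html_content = img_tag('/absolute/path/to/image.jpg', 'Sample Image')
#   pdfkit.from_string(html_content, 'image_output.pdf')
```

The trade-off is size: base64 inflates the data by roughly a third, so this suits logos and small figures better than large photographs.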

App Integration Example

Let’s create a simple Flask application using PDFKit:

  from flask import Flask, make_response, request
  import pdfkit

  app = Flask(__name__)

  @app.route('/create_pdf', methods=['GET', 'POST'])
  def create_pdf():
      if request.method == 'POST':
          html_content = request.form.get('html_content')
          pdf = pdfkit.from_string(html_content, False)
          response = make_response(pdf)
          response.headers['Content-Type'] = 'application/pdf'
          response.headers['Content-Disposition'] = 'attachment; filename=output.pdf'
          return response
      return '''
          <form method="post">
            <textarea name="html_content"></textarea>
            <input type="submit">
          </form>
      '''

  if __name__ == '__main__':
      app.run(debug=True)

This example demonstrates how to take HTML input from the user and generate a downloadable PDF file.
