    AccessPDF A PDF Forum for Users and Programmers    
 Welcome to AccessPDF
 Wednesday, November 19 2008 @ 12:52 PM PST
Pdftk - the PDF Toolkit

If PDF is electronic paper, then pdftk is an electronic staple-remover, hole-punch, binder, secret-decoder-ring, and X-Ray-glasses. Pdftk is a command-line tool for doing everyday things with PDF documents. Keep one in the top drawer of your desktop and use it to:

  • Merge PDF Documents
  • Split PDF Pages into a New Document
  • Decrypt Input as Necessary (Password Required)
  • Encrypt Output as Desired
  • Fill PDF Forms with FDF Data and/or Flatten Forms
  • Apply a Background Watermark
  • Report on PDF Metrics such as Metadata, Bookmarks, and Page Labels
  • Update PDF Metadata
  • Attach Files to PDF Pages or the PDF Document
  • Unpack PDF Attachments
  • Burst a PDF Document into Single Pages
  • Uncompress and Re-Compress Page Streams
  • Repair Corrupted PDF (Where Possible)

Pdftk allows you to manipulate PDF easily and freely. It does not require Acrobat, and it runs on Windows, Linux, Mac OS X, FreeBSD and Solaris.
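As a rough illustration of a few of the operations listed above, here is a minimal Python sketch that only builds pdftk command lines. The helper names (`merge_cmd`, `burst_cmd`, `fill_form_cmd`, `run`) are made up for this example; the pdftk keywords used (`cat`, `output`, `burst`, `fill_form`, `flatten`) are the tool's own.

```python
import subprocess


def merge_cmd(inputs, output):
    """Build: pdftk in1.pdf in2.pdf ... cat output out.pdf"""
    return ["pdftk", *inputs, "cat", "output", output]


def burst_cmd(input_pdf):
    """Build: pdftk in.pdf burst  (one output file per page)."""
    return ["pdftk", input_pdf, "burst"]


def fill_form_cmd(form_pdf, fdf_file, output, flatten=False):
    """Build: pdftk form.pdf fill_form data.fdf output out.pdf [flatten]"""
    cmd = ["pdftk", form_pdf, "fill_form", fdf_file, "output", output]
    if flatten:
        cmd.append("flatten")
    return cmd


def run(cmd):
    """Run a command list; assumes pdftk is installed and on the PATH."""
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list, rather than one long string, avoids shell quoting problems with file names that contain spaces.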

Pdftk is free software (GPL).



Latest Version: 1.12, Released: November 9, 2004


PDFTK doesn't work with PDF forms of version 1.6

It turns out that pdftk doesn't work with PDF forms of version 1.6. The forms are filled without an error, but the fields appear empty in Acrobat 7.0. Older Acrobat versions complain about the newer PDF version, but they do display the contents of the fields.
What could the problem be?
 

Splitting and combining


I wanted to print the front pages ONLY of patents downloaded as PDFs. I couldn't work out how to do this with PDFTK in one pass, so I did as follows. This runs in a COMMAND.COM DOS box (not CMD.EXE) under Windows 2000 and assumes a directory c:\zdir that is clean apart from the particular PDFs under consideration. It uses Horst Schaeffer's wonderful LMOD (List MODifier) to make the batch files. The idea is to use PDFTK to split the first page only from each PDF, using LMOD to give the new files unique serial numbers, then make a second pass with PDFTK to reassemble the new files into a single one, followed by a call to Acrobat Reader's command-line function for printing (you still have to press the OK button manually). Those "skilled in the art" will understand the rest; hope line wrapping and your system's odd habit of dropping backslashes don't screw things up too much.

PRPATPDF.BAT
------------
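@echo off
REM Remember where we started, then work in c:\zdir
c:\dosutils\pushd.exe
c:
cd c:\zdir
REM Remove leftovers from any previous run
for %%a in (do_it.bat pdf.dir front_pages.pdf $*.pdf) do if exist %%a del %%a > nul
REM List the PDFs by name, one per line
dir *.pdf /O:N /N /b > pdf.dir
REM Use LMOD to turn each listed name into a pdftk command that extracts page 1
lmod /L* pdftk [] cat 1-1 output $[#].pdf <pdf.dir >do_it.bat
call do_it.bat
REM Combine the single-page files, then print the result
pdftk $*.pdf cat output front_pages.pdf
"C:\Program Files\Adobe\Acrobat 6.0\Reader\AcroRd32.exe" /p "front_pages.pdf"
for %%a in (do_it.bat pdf.dir front_pages.pdf $*.pdf) do if exist %%a del %%a > nul
c:\dosutils\popd.exe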

Horst Schaeffer's Web site is http://home.mnet-online.de/horst.muc/

-- Robert Bull
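For readers without LMOD, the same two-pass idea can be sketched in Python. This is not Robert's method, only a hedged equivalent: `front_page_commands` is a hypothetical helper that builds the pdftk command lines, reusing the batch file's `$1.pdf`, `$2.pdf`, ... temporary names and its `front_pages.pdf` output.

```python
def front_page_commands(pdf_names, combined="front_pages.pdf"):
    """Return pdftk command lists: one per input to extract page 1,
    then a final command that joins the extracted pages."""
    # Sort by name, ignoring case, roughly like `dir /O:N`.
    names = sorted(pdf_names, key=str.lower)
    temp_names = [f"${i}.pdf" for i in range(1, len(names) + 1)]

    commands = [
        ["pdftk", source, "cat", "1", "output", temp]
        for source, temp in zip(names, temp_names)
    ]
    # Name the temporary files explicitly instead of using a wildcard,
    # so they are joined in numeric order.
    commands.append(["pdftk", *temp_names, "cat", "output", combined])
    return commands
```

Each returned list can be handed to `subprocess.run(cmd, check=True)`. Listing the temporary files explicitly also sidesteps the question of what order a `$*.pdf` wildcard expands in once there are ten or more files.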

 

Windows Wildcards Can Yield Extra Pages


Some pdftk users process hundreds of files. Doing this work on a Windows machine can yield unexpected results. The cause is the Windows command-prompt shell, not pdftk: for every long filename, Windows also creates a short, DOS-compatible (8.3) filename. This short filename might match a wildcard expression even when the long filename does not, so pdftk ends up with more input files than you wanted.

This article offers a couple of workarounds and then describes the case where this problem arose.
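One possible workaround, which is not necessarily one of those in the full article, is to expand the wildcard yourself against the long filenames only and pass pdftk an explicit file list. A minimal Python sketch, where `match_long_names` is a hypothetical helper and the sample names are invented:

```python
import fnmatch


def match_long_names(names, pattern):
    """Match a wildcard against the given (long) names only.

    Matching is case-insensitive, as on Windows. The 8.3 short names
    are never consulted, so a file such as "report.pdf_old" cannot
    sneak in through a short name with a truncated "PDF" extension.
    """
    wanted = pattern.lower()
    return sorted(n for n in names if fnmatch.fnmatchcase(n.lower(), wanted))


if __name__ == "__main__":
    listing = ["report.pdf", "report.pdf_old", "Notes.PDF"]
    inputs = match_long_names(listing, "*.pdf")
    print(["pdftk", *inputs, "cat", "output", "combined.pdf"])
```

In practice the `listing` would come from `os.listdir()`, which returns long names.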

 

Fill PDF Forms Using an HTML Front-End


I wrote an article for MacTech Magazine (Nov. 2004 ed.) about collecting form data using HTML forms and then packing this data into a PDF form for download. I posted the code online, along with a working example.

While merging data into a PDF form is old news to pdftk users, this example has an interesting twist. It uses pdftk's dump_data_fields operation to discover exactly what the PDF form wants, then creates a dynamic HTML form using this information. I.e., it automatically creates an HTML form to match your PDF form.

This HTML form is bare-bones, but it makes a good foundation for your web interface. It helps if you fill in the PDF fields' "Short Description," available via Acrobat's field properties dialog.
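The posted example is PHP; purely as an illustration of the idea, here is a hedged Python sketch. It assumes `dump_data_fields` output in which records are separated by `---` lines and carry `FieldName:` and `FieldNameAlt:` entries (the latter taken here to hold the "Short Description"). The function names and the `fill.php` form target are invented for this example.

```python
import html


def parse_data_fields(report):
    """Split dump_data_fields text into one dict per form field."""
    fields, current = [], {}
    for line in report.splitlines():
        line = line.strip()
        if line == "---":              # record separator
            if current:
                fields.append(current)
            current = {}
        elif ": " in line:
            key, value = line.split(": ", 1)
            current.setdefault(key, value)  # keep the first value per key
    if current:
        fields.append(current)
    return fields


def html_form(fields, action="fill.php"):
    """Emit a bare-bones HTML form with one text input per field."""
    rows = []
    for field in fields:
        name = field.get("FieldName", "")
        label = field.get("FieldNameAlt") or name
        rows.append(
            f'<label>{html.escape(label)} '
            f'<input type="text" name="{html.escape(name, quote=True)}"></label><br>'
        )
    return (
        f'<form method="post" action="{html.escape(action, quote=True)}">\n'
        + "\n".join(rows)
        + '\n<input type="submit">\n</form>'
    )
```

The report text itself would come from running `pdftk form.pdf dump_data_fields` and capturing its output.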

My MacTech article described the process in detail and introduced the reader to related topics, such as PDF forms, pdftk, and the FDF format. This article offers some tips on getting started and shows how to discover form field data using pdftk. Download the example PHP code here.

I haven't tested this on Adobe's new-fangled Acrobat 7 forms.

 

forge_fdf in Python


Received today from Timothy Stebbing:

gday,

I ported forge_fdf to python for work, I thought it might be useful to you to use/post along with forge_fdf.php on the website for others. I've attached it, it's a direct block-for-block port, it could probably be optimised a fair bit but it was the least amount of effort to port it directly :) It makes an interesting comparison between the languages, just look at the line count ;)

-tjs

This code was later updated by Thomas Heetderks at Frontline Processing. See the full article for code. Thanks Timothy and Thomas!

Forge_fdf is a little PHP script I created for casting form data into the FDF syntax. This is handy for filling online PDF forms automatically, or for filling forms using pdftk.
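To give a feel for what "casting form data into the FDF syntax" means, here is a much-reduced Python sketch. It is not forge_fdf or its Python port: `minimal_fdf` and `fdf_string` are names invented here, only plain text fields are handled, and values are assumed to be ASCII without line breaks.

```python
def fdf_string(text):
    """Wrap text as a PDF literal string, escaping \\, ( and )."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return "(" + escaped + ")"


def minimal_fdf(values):
    """Return FDF text that sets each field name in `values` to its value."""
    fields = "\n".join(
        f"<< /T {fdf_string(name)} /V {fdf_string(value)} >>"
        for name, value in values.items()
    )
    return (
        "%FDF-1.2\n"
        "1 0 obj\n"
        "<< /FDF << /Fields [\n"
        + fields +
        "\n] >> >>\n"
        "endobj\n"
        "trailer\n"
        "<< /Root 1 0 R >>\n"
        "%%EOF\n"
    )
```

The result can be written to a file such as `data.fdf` and merged with `pdftk form.pdf fill_form data.fdf output filled.pdf`.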

The full article shows some examples of forge_fdf in action.

 

Splitting PDF on Page Text


I would like to split a large PDF file into several small files based on the account number on each page, and I would like to give a custom file name to each PDF. I would like to do this in a batch job on Solaris. How can I do it?

Any help is greatly appreciated.

I suggest using pdftotext to extract your PDF's text. Run pdftotext --help to see its options. Its output uses the formfeed (0x0C) character to mark page breaks. Scan the output from pdftotext for the account number, or possibly some other distinctive feature, to find where to split the PDF. Count pages as you go along by counting formfeeds. Then create a pdftk command line to perform the split and output the new file to your custom file name. Script it using bash, if you like.
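The steps above can be sketched in Python as well as bash. Everything specific here is an assumption: the `Account No: 12345` wording behind `ACCOUNT_RE`, the `account_<number>.pdf` naming, and the rule that a page with no account number continues the previous account.

```python
import re

# Hypothetical pattern; adjust it to how the number appears on your pages.
ACCOUNT_RE = re.compile(r"Account No[.:]?\s*(\d+)")


def pages_of(text):
    """Split pdftotext output into pages on the formfeed character."""
    pages = text.split("\f")
    if text.endswith("\f"):
        pages.pop()  # drop the empty piece after the final formfeed
    return pages


def split_commands(text, source_pdf):
    """Build one pdftk command per run of pages sharing an account number."""
    groups = []  # each entry: [account, first_page, last_page]
    for number, page in enumerate(pages_of(text), start=1):
        match = ACCOUNT_RE.search(page)
        account = match.group(1) if match else None
        if groups and account in (None, groups[-1][0]):
            groups[-1][2] = number  # same account, or a continuation page
        else:
            groups.append([account or "unknown", number, number])
    # Note: an account that reappears later would reuse the same file name.
    return [
        ["pdftk", source_pdf, "cat", f"{first}-{last}",
         "output", f"account_{account}.pdf"]
        for account, first, last in groups
    ]
```

The `text` argument is whatever pdftotext wrote for the whole document; each returned list is one pdftk invocation.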

 

Modify PDF PageLayout and PageMode using Pdftk and Sed


PDF has features for controlling how a document first appears in the viewer. These include page layout settings: Single Page, Continuous, and Continuous Facing Pages. These also include page mode settings: Show Bookmarks, Show Thumbnails, and Full Screen.

Pdftk does not currently have built-in features for setting these options, but you will find they are easy to set using a sed script. My article about editing PDF using sed will give you the background on this technique. Here I will describe how it applies to page layout and page mode settings.
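The sed script itself is in the full article. As a stand-in, here is a hedged Python sketch of the same kind of edit. It assumes the PDF was first written by pdftk with the `uncompress` option, so that the document catalog is readable text containing `/Type /Catalog`. The helper name `set_catalog_name` is invented; the `/PageLayout` and `/PageMode` values are the names defined by the PDF format.

```python
import re

PAGE_LAYOUTS = {"SinglePage", "OneColumn", "TwoColumnLeft", "TwoColumnRight"}
PAGE_MODES = {"UseNone", "UseOutlines", "UseThumbs", "FullScreen"}


def set_catalog_name(pdf_bytes, key, value):
    """Set /PageLayout or /PageMode in the catalog of an uncompressed PDF."""
    allowed = {"PageLayout": PAGE_LAYOUTS, "PageMode": PAGE_MODES}[key]
    if value not in allowed:
        raise ValueError(f"{value!r} is not a valid /{key} value")

    key_b = key.encode("ascii")
    entry = b"/" + key_b + b" /" + value.encode("ascii")

    # Drop any existing setting so the key appears only once.
    without_old = re.sub(rb"/" + key_b + rb"\s*/\w+\s*", b"", pdf_bytes)

    # Add the new entry right after the catalog's /Type /Catalog.
    updated, count = re.subn(
        rb"/Type\s*/Catalog",
        lambda m: m.group(0) + b"\n" + entry,
        without_old,
        count=1,
    )
    if count == 0:
        raise ValueError("no /Type /Catalog found; is the PDF uncompressed?")
    return updated
```

For example, `set_catalog_name(data, "PageMode", "UseOutlines")` asks the viewer to show bookmarks. The edit shifts byte offsets in the file, so the result should be passed through pdftk once more (for instance with `compress`) so it can rewrite the file cleanly.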

 

Assemble PDF Pages after Double-Sided Scanning


Here is an email I received today that describes a common PDF problem. I sketched out a solution, and some kind folks have offered scripts. Feel free to contribute.

The old HP 6350 scanner (with sheet feeder) included the HP Precision scan program, which allows the user to scan two-sided paper in two passes.

Specifically, after scanning all pages on one side, you will be asked to turn over the whole pile and scan the back side, i.e. first pass pages 1 3 5 7, second pass pages 8 6 4 2.

Combined output pages 1 2 3 4 5 6 7 8

I wonder if your masterpiece "pdftk" can be used to serve this purpose, i.e.

Taking the first page from PDF A, then the last page from PDF B; then the 2nd page from PDF A, and then the (n-1) page from PDF B...
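The interleaving described in the email can be written down as a pdftk page list using input handles (`A=...`, `B=...`). A minimal Python sketch, with `interleave_pages` and `collate_command` as invented helper names, and assuming both scans contain the same number of pages:

```python
def interleave_pages(sheet_count):
    """Page order: A1, B(n), A2, B(n-1), ... for n sheets.

    A holds the front sides in order; B holds the back sides in
    reverse order, because the pile was turned over before scanning.
    """
    order = []
    for i in range(1, sheet_count + 1):
        order.append(f"A{i}")
        order.append(f"B{sheet_count - i + 1}")
    return order


def collate_command(fronts_pdf, backs_pdf, sheet_count, output):
    """Build the pdftk command that merges the two scans."""
    return [
        "pdftk", f"A={fronts_pdf}", f"B={backs_pdf}",
        "cat", *interleave_pages(sheet_count),
        "output", output,
    ]
```

For the eight-page example in the email, `sheet_count` is 4, and the page list starts A1 B4 A2 B3.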

 

Removing Blank Pages from a PDF


Here is an idea for how to remove blank pages from a PDF using pdftotext and pdftk. It is based on a recent posting to comp.text.pdf.
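One way this could look, sketched in Python under a strong assumption: a page counts as blank when pdftotext extracts no text from it. That rule would also discard image-only pages, so it may not match the approach in the full article. The helper names are invented.

```python
def nonblank_ranges(text):
    """Return pdftk page ranges ("first-last") covering pages with text.

    `text` is pdftotext output, where a formfeed ends each page.
    """
    pages = text.split("\f")
    if text.endswith("\f"):
        pages.pop()  # drop the empty piece after the final formfeed

    ranges, start = [], None
    for number, page in enumerate(pages, start=1):
        if page.strip():
            if start is None:
                start = number          # a run of non-blank pages begins
        elif start is not None:
            ranges.append(f"{start}-{number - 1}")
            start = None                # the run ended at the previous page
    if start is not None:
        ranges.append(f"{start}-{len(pages)}")
    return ranges


def remove_blanks_command(source_pdf, text, output):
    """Build the pdftk command that keeps only the non-blank pages."""
    return ["pdftk", source_pdf, "cat", *nonblank_ranges(text), "output", output]
```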

 

Merging Multiple PDFs by Date on Windows


Pdftk is a command-line program, so it helps to know command-line (a/k/a "shell") programming, especially when you want to do something that pdftk doesn't know how to do by itself.

A good example of this is using pdftk to combine PDFs in a special order. This article describes how to use the Windows Command Prompt batch language to combine PDFs by file creation date. I also show how to do this using the bash shell which comes with MSYS.
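The batch and bash versions are in the full article. For comparison only, the same ordering can be sketched in Python; `merge_by_date_cmd` is an invented name. Note that `os.path.getctime` reports the creation time on Windows but the last metadata change on Unix-like systems, so the `timestamp` argument is left replaceable.

```python
import os


def merge_by_date_cmd(paths, output, timestamp=os.path.getctime):
    """Build a pdftk command that joins `paths`, oldest file first.

    `timestamp` maps a path to a sortable time; swap in
    os.path.getmtime to order by modification date instead.
    """
    ordered = sorted(paths, key=timestamp)
    return ["pdftk", *ordered, "cat", "output", output]
```

Run the returned list with `subprocess.run(cmd, check=True)` once the order looks right.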

 
 Copyright © 2008 AccessPDF
 All trademarks and copyrights on this page are owned by their respective owners.