Tech Tips

How to extract images and text from a PDF

1. Install poppler-utils

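2. To extract the original embedded images:

$ pdfimages -j <file.pdf> <image_root>

(Note: <image_root> is a file-name prefix, not a directory. Images are written as <image_root>-000.jpg, <image_root>-001.jpg and so on. With -j, JPEG-encoded images are saved as .jpg files; other images are written in PPM or PBM format.)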

3. To extract text:

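$ pdftotext <file.pdf>

(Note: If no output file is given, the text is written to a .txt file with the same base name as the PDF.)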
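The two extraction steps can be wrapped in one small shell helper. This is only a sketch: the function names pdf_stem and extract_pdf are made up for illustration, and it assumes pdfimages and pdftotext from poppler-utils are on the PATH.

```shell
# Sketch only: pdf_stem and extract_pdf are hypothetical helper names.

# Print the file name of a PDF without its directory or .pdf suffix.
pdf_stem() {
    basename "$1" .pdf
}

# Extract embedded images and text from one PDF into an output directory.
# Usage: extract_pdf <file.pdf> <out_dir>
extract_pdf() (
    [ "$#" -eq 2 ] || { echo "usage: extract_pdf <file.pdf> <out_dir>" >&2; return 2; }
    pdf=$1
    out_dir=$2
    stem=$(pdf_stem "$pdf")
    mkdir -p "$out_dir" || return 1
    # pdfimages treats its second argument as a file-name prefix.
    pdfimages -j "$pdf" "$out_dir/$stem" || return 1
    # -layout asks pdftotext to keep the physical page layout.
    pdftotext -layout "$pdf" "$out_dir/$stem.txt"
)
```

For example, extract_pdf report.pdf out should leave the images as out/report-000.jpg, out/report-001.jpg and so on, with the text in out/report.txt.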

How to combine 2 single-page PDFs into 1 page

(1) Join the 2 separate PDFs into 1 file (e.g. joined.pdf) using tools such as pdfunite, pdfsam or pdfshuffler.

(2) Install pdfjam.
(Note: If you’re using Arch Linux, pdfjam is included in the texlive-bin package)

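(3) Use the pdfnup tool from the pdfjam package to merge page 1 and page 2 of joined.pdf into a new PDF file consisting of a single page, with page 1 stacked on top of page 2. The result is written to joined-nup.pdf in the current directory:

$ pdfnup joined.pdf --nup 1x2

(Note: Recent pdfjam releases no longer ship the pdfnup wrapper. If pdfnup is missing, call pdfjam directly with the same --nup 1x2 option.)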
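Steps (1) and (3) can be chained in a single helper. The sketch below is one possible way to do it, not the only one: stack_pdfs is a made-up name, pdfunite is used for the join, and pdfjam is called directly instead of the pdfnup wrapper so that the output file can be named with --outfile.

```shell
# Sketch only: stack_pdfs is a hypothetical helper name.
# Join two single-page PDFs, then place both pages on one sheet (1 column, 2 rows).
# Usage: stack_pdfs <top.pdf> <bottom.pdf> <out.pdf>
stack_pdfs() (
    [ "$#" -eq 3 ] || { echo "usage: stack_pdfs <top.pdf> <bottom.pdf> <out.pdf>" >&2; return 2; }
    tmp_dir=$(mktemp -d) || return 1
    status=0
    # Step (1): join the two files; step (3): put page 1 above page 2.
    pdfunite "$1" "$2" "$tmp_dir/joined.pdf" &&
        pdfjam "$tmp_dir/joined.pdf" --nup 1x2 --outfile "$3" || status=$?
    # Remove the intermediate joined.pdf whether or not the commands succeeded.
    rm -rf "$tmp_dir"
    return "$status"
)
```

For example, stack_pdfs front.pdf back.pdf card.pdf would write the stacked page to card.pdf and leave no joined.pdf behind.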

How to remove a known password from a protected PDF

Install QPDF, then:

$ qpdf --password=<password> --decrypt in.pdf out.pdf

This doesn’t crack or guess the password for you; you must already know it. It only creates a copy of the PDF without password protection.
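If the password contains spaces or shell special characters, quoting matters. The sketch below passes it through an environment variable so it reaches qpdf unchanged. Both unlock_pdf and PDF_PASSWORD are made-up names for illustration.

```shell
# Sketch only: unlock_pdf and PDF_PASSWORD are hypothetical names.
# Usage: PDF_PASSWORD='my secret' unlock_pdf <in.pdf> <out.pdf>
unlock_pdf() (
    [ "$#" -eq 2 ] || { echo "usage: unlock_pdf <in.pdf> <out.pdf>" >&2; return 2; }
    # Refuse to run without a password; qpdf cannot guess it.
    [ -n "${PDF_PASSWORD:-}" ] || { echo "PDF_PASSWORD is not set" >&2; return 2; }
    # Double quotes keep spaces and special characters in the password intact.
    qpdf --password="$PDF_PASSWORD" --decrypt "$1" "$2"
)
```

For example, PDF_PASSWORD='my secret' unlock_pdf in.pdf out.pdf runs the same qpdf command as above with the password quoted safely.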