Tuesday, January 14, 2020

Pdf Hacks



How to remove watermark from pdf using pdftk?

I first had to uncompress the PDF document to be able to find the watermark and replace it with sed. Uncompress the document using pdftk:

pdftk original.pdf output uncompressed.pdf uncompress 

Now the uncompressed.pdf can be edited as in Dingo's answer:

sed -e "s/watermarktextstring/ /" uncompressed.pdf > unwatermarked.pdf

I then repaired and recompressed the document:

pdftk unwatermarked.pdf output fixed.pdf compress
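
The three steps above can be chained in a small shell function. This is only a sketch: remove_watermark and the PDFTK variable are hypothetical names, and it assumes the watermark appears as a literal string in the uncompressed file and contains no characters that are special to sed (such as / or &).

```shell
#!/bin/sh
# Sketch: uncompress, strip a literal watermark string, recompress.
# PDFTK is a hypothetical override hook; it defaults to the real pdftk.
PDFTK="${PDFTK:-pdftk}"

# Usage: remove_watermark INPUT.pdf "watermark text" OUTPUT.pdf
remove_watermark() {
    src=$1
    text=$2
    dst=$3
    tmp=$(mktemp -d) || return 1
    # LC_ALL=C makes sed treat the file as raw bytes rather than text.
    "$PDFTK" "$src" output "$tmp/uncompressed.pdf" uncompress &&
    LC_ALL=C sed -e "s/$text/ /" "$tmp/uncompressed.pdf" > "$tmp/unwatermarked.pdf" &&
    "$PDFTK" "$tmp/unwatermarked.pdf" output "$dst" compress
    status=$?
    rm -rf "$tmp"
    return "$status"
}
```

For example: remove_watermark original.pdf "watermarktextstring" fixed.pdf. As in the sed command above, only the first match on each line is replaced.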

How to convert a PDF to EPUB
Calibre (Windows, MacOS, Linux)



https://community.coherentpdf.com/
https://linuxcommando.blogspot.com/2013/02/splitting-up-is-easy-for-pdf-file.html


https://www.ostechnix.com/how-to-merge-pdf-files-in-command-line-on-linux/
sudo apt-get install pdftk

$ pdftk file1.pdf file2.pdf file3.pdf cat output outputfile.pdf
Alternatively, you can run:

$ pdftk *.pdf cat output outputfile.pdf
This command will merge all pdf files in the current directory into a single file.
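
One thing to watch with the wildcard form: if outputfile.pdf already exists in the directory from an earlier run, *.pdf matches it too. A minimal sketch of a wrapper that skips the output file follows; merge_all_pdfs and the PDFTK variable are hypothetical names, not part of pdftk.

```shell
#!/bin/sh
# PDFTK is a hypothetical override hook; it defaults to the real pdftk.
PDFTK="${PDFTK:-pdftk}"

# Usage: merge_all_pdfs OUTPUT.pdf
# Merges every *.pdf in the current directory except OUTPUT.pdf itself.
merge_all_pdfs() {
    out=$1
    set --                              # reuse "$@" as the list of inputs
    for f in *.pdf; do
        [ -e "$f" ] || continue         # the glob matched nothing
        [ "$f" = "$out" ] && continue   # never feed the output back in
        set -- "$@" "$f"
    done
    if [ "$#" -eq 0 ]; then
        echo "no PDF files found" >&2
        return 1
    fi
    "$PDFTK" "$@" cat output "$out"
}
```

The files are merged in the order the shell expands the glob, which is alphabetical, so file10.pdf sorts before file2.pdf.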

https://superuser.com/questions/565028/splitting-a-pdf-into-pdfs-of-various-sizes
pdftk Input.pdf cat 1-3 output Pages_1-3.pdf
will extract pages 1-3 and save them as Pages_1-3.pdf
If you have a set of page ranges and desired output file names, you can save them to a text file as follows:
1-3 Pages_1-3.pdf
4-11 Pages_4-11.pdf
12 Page_12.pdf
13-end Pages_13-end.pdf
Now use the following command at the Windows command prompt (with the list saved as Input.txt):
for /f "tokens=1*" %a in (Input.txt) do pdftk Input.pdf cat %a output %b
to split up the PDF into separate files.
If you simply want to save each page as a separate file, you can use the burst operation instead. See the man page or type pdftk --help for details.
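
The for /f loop above is specific to the Windows command prompt. In a POSIX shell the same idea could be sketched as follows; split_by_ranges and the PDFTK variable are hypothetical names, and the list file is assumed to hold one "range filename" pair per line with no spaces in the file names.

```shell
#!/bin/sh
# PDFTK is a hypothetical override hook; it defaults to the real pdftk.
PDFTK="${PDFTK:-pdftk}"

# Usage: split_by_ranges LIST.txt INPUT.pdf
# Each line of LIST.txt is "<page-range> <output-file>".
split_by_ranges() {
    list=$1
    input=$2
    # "|| [ -n "$range" ]" also handles a last line without a trailing newline.
    while read -r range out || [ -n "$range" ]; do
        [ -n "$range" ] || continue     # skip blank lines
        "$PDFTK" "$input" cat "$range" output "$out" < /dev/null
    done < "$list"
}
```

For example: split_by_ranges Input.txt Input.pdf.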


https://codeyarns.com/2010/02/08/how-to-split-pdf-using-pdftk/
To split a PDF file into multiple PDF files, one per page of the original PDF file, invoke:
$ pdftk foobar.pdf burst output foobar-%d.pdf
If foobar.pdf is a 2 page PDF, this splits it into foobar-1.pdf and foobar-2.pdf.
You may use pdftk and try
pdftk source.pdf cat 1-100 output try1.pdf
pdftk source.pdf cat 101-end output try2.pdf
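
The split point of 100 above is hard-coded. A sketch that reads the page count first and splits at the midpoint might look like this; page_count, split_in_half, and the PDFTK variable are hypothetical names, and the sketch relies on the NumberOfPages line that pdftk's dump_data operation prints.

```shell
#!/bin/sh
# PDFTK is a hypothetical override hook; it defaults to the real pdftk.
PDFTK="${PDFTK:-pdftk}"

# Print the number of pages reported by "pdftk FILE dump_data".
page_count() {
    "$PDFTK" "$1" dump_data | awk '/^NumberOfPages:/ { print $2 }'
}

# Usage: split_in_half SOURCE.pdf FIRST.pdf SECOND.pdf
split_in_half() {
    n=$(page_count "$1") || return 1
    if ! [ "$n" -ge 2 ] 2>/dev/null; then
        echo "need a PDF with at least 2 pages" >&2
        return 1
    fi
    mid=$((n / 2))
    "$PDFTK" "$1" cat "1-$mid" output "$2" &&
    "$PDFTK" "$1" cat "$((mid + 1))-end" output "$3"
}
```

For a 200-page source.pdf, split_in_half source.pdf try1.pdf try2.pdf runs the same two commands as above.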
https://net2.com/how-to-install-and-use-pdftk-on-linux-to-merge-or-split-pdf-files/
In some circumstances you may wish to merge many PDF files into a single document. PDFTK does this with the command below:
pdftk file1.pdf file2.pdf file3.pdf cat output single_document.pdf
If the current directory holds many files that you want to merge into a single one, use a wildcard:
pdftk *.pdf cat output single_document.pdf
To refer to input files by handle (for instance, to apply an operation to only some of them), proceed as shown below:
pdftk A=file1.pdf B=file2.pdf cat A B output output_file.pdf

4 – Splitting a pdf document

The PDFTK utility can also split a file into multiple single-page files, so that each resulting file contains exactly one page of the original document. This is done with the burst operation as shown below:
pdftk bigfile.pdf burst
PDFTK will then place the single-page files in the current directory, here the same directory as bigfile.pdf.
If you would like to have the single-page files encrypted, use the command below:
pdftk bigfile.pdf burst owner_pw [ownerpassword] allow DegradedPrinting
Replace the placeholder [ownerpassword] with your password. The allow DegradedPrinting option permits low-quality printing.

5 – Excluding pages from a PDF file

PDFTK can exclude one or more pages from a PDF file. For instance, to remove page 12 from input_file.pdf and create output_file.pdf:
pdftk input_file.pdf cat 1-11 13-end output output_file.pdf
or:
pdftk A=input_file.pdf cat A1-11 A13-end output output_file.pdf
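
The page ranges on either side of the excluded page follow a simple pattern, which can be generated rather than typed. A minimal sketch, where exclude_page_ranges is a hypothetical helper name and the page to drop is assumed not to be the last page of the document:

```shell
#!/bin/sh
# Usage: exclude_page_ranges N
# Prints the pdftk page ranges that keep everything except page N.
# Assumption: N is not the last page (the "N+1-end" range must exist).
exclude_page_ranges() {
    n=$1
    if [ "$n" -le 1 ]; then
        echo "2-end"
    else
        echo "1-$((n - 1)) $((n + 1))-end"
    fi
}

exclude_page_ranges 12   # prints: 1-11 13-end
```

It can be used as: pdftk input_file.pdf cat $(exclude_page_ranges 12) output output_file.pdf. The command substitution is left unquoted on purpose so that the two ranges become separate arguments.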

10 – Attaching files

PDFTK can attach binary as well as text files to a PDF document. You can also specify the page on which the attachment should be visible. For example:
pdftk File1.pdf attach_files file_to_attach.html to_page 20 output OutputFile.pdf



https://jon.dehdari.org/tutorials/pdf_tricks.html

How do I join/merge PDFs together?
pdftk input1.pdf input2.pdf input3.pdf  cat output  output.pdf

How do I split all the pages of a PDF?
pdftk input.pdf burst

The output by default is pg_0001.pdf, pg_0002.pdf, etc.
How do I remove or extract certain pages from a PDF?
To remove page 7:
pdftk input.pdf  cat '~7'  output  output.pdf

To remove pages 7, 9, and 14:
pdftk input.pdf  cat '~7~9~14'  output output.pdf

To include only pages 3, 8-11, and 15:
pdftk input.pdf  cat 3 8-11 15  output output.pdf

How do I reverse the pages of a PDF?
pdftk input.pdf  cat end-1  output output.pdf

How do I password-protect and encrypt a PDF?
pdftk input.pdf  output output.pdf  user_pw PROMPT

https://github.com/DavidFirth/pdfjam
An alternative set of PDF manipulation tools, which are Java-based, is provided by the Multivalent project. Yet another alternative set of tools is PDFsam. Those alternatives do much the same things as pdfjam, and maybe quite a bit more too.

https://stackoverflow.com/questions/20531079/adding-an-image-to-a-pdf-with-pdftk
First convert the image to PDF
convert image.png image.pdf
Then scale up and offset the image using pdfjam (another free tool)
pdfjam --paper 'a4paper' --scale 0.3 --offset '7cm -12cm' image.pdf
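Then combine both PDFs using pdftk (by default pdfjam writes its result to image-pdfjam.pdf in the current directory, so that is the file to stamp)
pdftk text.pdf stamp image-pdfjam.pdf output combined.pdf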
If you need to position the image and add it to only one page of the PDF, you may need STAMPtk, which is a paid tool.
PDFjam: apt-get install texlive-extra-utils, PDFtk: apt-get install pdftk

And installation notes for Mac: PDFjam: brew install homebrew/tex/pdfjam
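
The convert, pdfjam, and pdftk steps above can be wrapped in one function. This is a sketch under assumptions: stamp_image, run, and DRY_RUN are hypothetical names, the scale and offset are simply the values from the answer above, and pdfjam's --outfile option is used to name the scaled file explicitly instead of relying on its default output name.

```shell
#!/bin/sh
# Hypothetical dry-run hook: set DRY_RUN=1 to print commands instead of running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Usage: stamp_image IMAGE.png TEXT.pdf COMBINED.pdf
stamp_image() {
    img=$1
    text=$2
    out=$3
    base=${img%.*}      # image.png -> image
    run convert "$img" "$base.pdf" &&
    run pdfjam --paper a4paper --scale 0.3 --offset '7cm -12cm' \
        --outfile "$base-scaled.pdf" "$base.pdf" &&
    run pdftk "$text" stamp "$base-scaled.pdf" output "$out"
}
```

Running DRY_RUN=1 stamp_image image.png text.pdf combined.pdf first shows the three commands without touching any files.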

https://stackoverflow.com/questions/11693137/how-do-i-control-pdf-paper-size-with-imagemagick
From experimenting with the settings, I found that the PDF page size can be controlled by combining -page, -density, and -units. The documentation for -page shows that letter is the same as entering 612x792. Combining -density 72 with -units pixelsperinch will give you (612px / 72px) * 1in = 8.5in.
convert *.jpg -units pixelsperinch -density 72 -page letter foo.pdf should do what the original poster wanted.
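
The arithmetic above (pixels divided by density gives inches) can be checked for any page size with a short helper. page_inches is a hypothetical name used only for this sketch.

```shell
#!/bin/sh
# Usage: page_inches WIDTH_PX HEIGHT_PX DENSITY
# Prints the physical page size in inches for a given pixel size and
# density in pixels per inch.
page_inches() {
    LC_ALL=C awk -v w="$1" -v h="$2" -v d="$3" \
        'BEGIN { printf "%.2f x %.2f in\n", w / d, h / d }'
}

page_inches 612 792 72   # prints: 8.50 x 11.00 in
```

At a density of 72, the 612x792 letter page comes out as 8.5 by 11 inches, matching the calculation in the answer.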


For people who just want a PDF file made of their images, where the PDF page matches the image size and shape: convert -page 1678x1048 slide*.png presentation.pdf where 1678x1048 is the WIDTHxHEIGHT of your images in pixels

https://stackoverflow.com/questions/23214617/imagemagick-convert-image-to-pdf-with-a4-page-size-and-image-fit-to-page

convert --version
adding -quality 100 removes some of the noticeable noise
https://www.howtoinstall.co/en/ubuntu/xenial/img2pdf

sudo apt-get install img2pdf



https://opensource.com/article/19/2/manipulating-pdfs-linux

https://tecadmin.net/install-imagemagick-on-linux/
sudo apt install php php-common gcc

Step 2 – Install ImageMagick
After installing the required packages, install ImageMagick using the following command. The ImageMagick package is available in the default apt repositories.

sudo apt install imagemagick

https://superuser.com/questions/1255867/how-to-convert-png-to-pdf
You are almost certainly seeing references to ImageMagick, which has a "convert" utility that potentially allows .png to .pdf conversion e.g.
convert image1.png image2.png image3.png output.pdf


http://www.imagemagick.org/discourse-server/viewtopic.php?t=32470
You need an output image and an input image and the extra page image. So if you have 1.pdf and want to add 1.jpg, you would do

convert 1.pdf 1.jpg 1.pdf
But ImageMagick will rasterize your PDF, so any vector data will not be preserved. Also, you may need to specify the density for decoding your input PDF.

convert -density XXX 1.pdf 1.jpg 1.pdf


http://www.imagemagick.org/discourse-server/viewtopic.php?t=31615
A PDF with a single page, with a single image that is a combination of all the input images? Then:
magick montage *mhk* -mode concatenate -tile 1x out.pdf
Or do you want a PDF with as many pages as there are input images, one image per page? Then:
magick *mhk* out.pdf
https://imagemagick.org/script/convert.php
magick convert rose.jpg rose.png
Next, we reduce the image size before it is written to the PNG format:
magick convert rose.jpg -resize 50% rose.png
