Skip to content

A tool to extract pages from a PDF that add a substantial amount of new content.

License

Notifications You must be signed in to change notification settings

st-tran/pdf-condenser

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 

Repository files navigation

PDF Condenser

A tool to extract pages from a PDF that add a substantial amount of new content.

Requirements

  • Python 3.4+ and modules defined in requirements.txt; can be installed with pip -r install
  • Libraries for ImageMagick
  • Linux (for now)

Usage

  • Clone this repository
  • Run pdf-condenser.py with an optional parameter of the PDF to be used (if one is not supplied, the program requests one from stdin)
  • Condensed PDF is saved to the same directory as the original file with -condensed appended to filename
  • Tips:
    • For PDFs with many pages, you may need to decrease RESOLUTION
    • Depending on the general content of the PDF (e.g. changing background), you may need to increase/decrease SIM_THRESHOLD
    • You can add this to your PATH variable to run it from wherever

Notes

  • You may need to change the ImageMagick policies as defined in /etc/ImageMagick-*/policy.xml (e.g. comment out <policy domain="coder" rights="none" pattern="PDF" />)

About

A tool to extract pages from a PDF that add a substantial amount of new content.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages