PDF manipulations

2015-05-11: Warning. The script below cannot be used unchanged with Python 3. It is recommended to install pdf, argparse and pyPdf with pip install (on Windows, add both the Python27 folder and the Python/Scripts folder to the path).

As a teacher in Advanced Engineering Mathematics I grade a lot of homework. The students upload Maple output PDF files to the campus website, and I access the files in a class/student_id folder structure obtained from a zip file. I move the PDF files to the top-level folder and merge them using the shell and Python scripts below. Once that is done, I can easily print the merged file with four pages on each sheet, which reduces paper waste and streamlines commenting and grading.
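The "move the files to the top level" step can also be done in Python instead of the shell. The following is a minimal sketch, not the script used in this post: the function name `flatten_pdfs` and the `_1`, `_2` suffix scheme for name clashes are assumptions made for illustration, and it uses only the standard library (Python 3).

```python
import shutil
from pathlib import Path


def flatten_pdfs(root):
    """Move every PDF found below `root` into `root` itself.

    Name clashes get a numeric suffix (hypothetical scheme), so two students
    who both uploaded "homework.pdf" do not overwrite each other.
    Returns the list of destination paths.
    """
    root = Path(root)
    moved = []
    # sorted() materializes the file list before anything is moved.
    for source in sorted(root.rglob("*.pdf")):
        if source.parent == root:
            continue  # already at the top level
        target = root / source.name
        counter = 1
        while target.exists():
            target = root / f"{source.stem}_{counter}{source.suffix}"
            counter += 1
        shutil.move(str(source), str(target))
        moved.append(target)
    return moved
```

Calling `flatten_pdfs(".")` from the unzipped folder would leave all PDF files side by side, ready to be merged. Unlike a numbered shell backup, renamed files keep their `.pdf` extension here, so they are still picked up by a `*.pdf` pattern.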

Utilities: MinGW, pdfmerge.py (a modification of a script found on the internet) and pyPdf:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# In the pyPdf folder (find the package on the internet), run:
#   python setup.py install
# Example: the first line moves the PDF files to the current folder, the
# second appends them to one file and adds blank pages as necessary:
#   mv --backup=numbered **/*.pdf .
#   python ../Scripts/pdfmerge.py -p=. -o=output.pdf -b=../Scripts/blank-page.pdf
import os
from argparse import ArgumentParser
from glob import glob

from pyPdf import PdfFileReader, PdfFileWriter


def merge(path, blank_filename, output_filename):
    blank = PdfFileReader(open(blank_filename, "rb"))
    output = PdfFileWriter()

    # Look for the source files in `path`; sorting gives a stable order.
    for pdffile in sorted(glob(os.path.join(path, "*.pdf"))):
        # Never merge a previous output file into itself.
        if os.path.abspath(pdffile) == os.path.abspath(output_filename):
            continue
        print("Parse '%s'" % pdffile)
        document = PdfFileReader(open(pdffile, "rb"))
        for i in range(document.getNumPages()):
            output.addPage(document.getPage(i))
        # Pad each document to a multiple of four pages so that it starts
        # on a new sheet when printing four pages per sheet.
        i = (4 - document.getNumPages()) % 4
        while i > 0:
            output.addPage(blank.getPage(0))
            print("Add blank page to '%s' (had %i pages)" % (pdffile, document.getNumPages()))
            i = i - 1

    print("Start writing '%s'" % output_filename)
    output_stream = open(output_filename, "wb")
    output.write(output_stream)
    output_stream.close()


if __name__ == "__main__":
    parser = ArgumentParser()

    # Add more options if you like
    parser.add_argument("-o", "--output", dest="output_filename", default="merged.pdf",
                        help="write merged PDF to FILE", metavar="FILE")
    parser.add_argument("-b", "--blank", dest="blank_filename", default="blank.pdf",
                        help="path to blank PDF file", metavar="FILE")
    parser.add_argument("-p", "--path", dest="path", default=".",
                        help="path of source PDF files")

    args = parser.parse_args()
    merge(args.path, args.blank_filename, args.output_filename)
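The padding step in `merge` relies on the expression `(4 - pages) % 4`. As a small worked example, the sketch below isolates that arithmetic; the helper name `blanks_needed` and its `pages_per_sheet` parameter are introduced here for illustration and are not part of the script.

```python
def blanks_needed(page_count, pages_per_sheet=4):
    """Number of blank pages that round page_count up to a full sheet."""
    # Python's % always returns a non-negative result for a positive
    # divisor, so this also works when page_count exceeds pages_per_sheet.
    return (pages_per_sheet - page_count) % pages_per_sheet


for pages in (1, 3, 4, 5, 8):
    print(pages, "pages ->", blanks_needed(pages), "blank(s)")
```

A five-page hand-in gets three blank pages (eight pages, two full sheets), while a four-page or eight-page hand-in gets none, so the next document always begins on a fresh sheet.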


©2020 by Danish Institute for Data Science.
