Command-Line Arguments for Reading an Image and a PDF: Argparse with Python
Reading an image and a PDF from the command line
What are command-line arguments in Python 3?
Command-line arguments are input parameters passed to a script when it is executed.
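As a minimal sketch of the idea, the snippet below parses an explicit list of arguments instead of the real command line, so it can be run anywhere. The option names mirror the script later in this post; the sample file names are only placeholders and the files do not need to exist.

```python
import argparse

# Build a parser with the same two options used in this post.
parser = argparse.ArgumentParser()
parser.add_argument("-i", "--imgpath", help="image path")
parser.add_argument("-p", "--pdfpath", help="pdf path")

# Passing a list to parse_args() stands in for what the shell would supply;
# when called with no argument, argparse reads sys.argv[1:] instead.
args = vars(parser.parse_args(["-i", "sample1.jpg", "-p", "pdf/sample1.pdf"]))

print(args["imgpath"])  # sample1.jpg
print(args["pdfpath"])  # pdf/sample1.pdf
```

Each option ends up as a key in the args dictionary, named after its long form without the leading dashes.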
Use this line in the command prompt to get the output:
usage: python practice1.py -i sample1.jpg -p pdf/sample1.pdf
Code:
import numpy as np
import re
import pytesseract
import os
from PIL import Image
import cv2
from pdf2image import convert_from_path
import pandas as pd
import tabula
from tabula import read_pdf
import argparse
import sys

# point pytesseract at the Tesseract executable (Windows install path)
pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--imgpath", help="image path")
ap.add_argument("-p", "--pdfpath", help="pdf path")
args = vars(ap.parse_args())
def func_name(imagepath, pdfpath):
    # read the image in grayscale (flag 0)
    path = imagepath
    image = cv2.imread(path, 0)
    pdf_path = pdfpath
    list1 = []
    # read the tables from every page of the PDF
    df1 = tabula.read_pdf(pdf_path, pages='all')
    for item in df1:
        for info in item.values:
            # collect the rows of all tables into one list
            list1.append(info)
    df1 = pd.DataFrame(list1)
    # hand the results back so the caller can use them
    return image, df1

func_name(args['imgpath'], args['pdfpath'])
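The script above accepts any string for both options, so a mistyped path only fails later inside cv2 or tabula. One possible refinement, sketched here as an assumption rather than as part of the original script, is to let argparse check the paths itself. The helper names existing_file and build_parser are hypothetical.

```python
import argparse
import os


def existing_file(value):
    """Hypothetical helper: accept the value only if it names an existing file."""
    if not os.path.isfile(value):
        raise argparse.ArgumentTypeError(f"file not found: {value}")
    return value


def build_parser():
    """Hypothetical helper: same options as above, but required and validated."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-i", "--imgpath", required=True,
                        type=existing_file, help="image path")
    parser.add_argument("-p", "--pdfpath", required=True,
                        type=existing_file, help="pdf path")
    return parser
```

In the script this would be used as args = vars(build_parser().parse_args()). When a file is missing, argparse prints the error message together with the usage line and exits with status 2 before any image or PDF processing starts.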
The help text is displayed when we pass -h (or --help) on the command line.
C:\Users\Madan \argparse>python practice1.py -h
usage: practice1.py [-h] [-i IMGPATH] [-p PDFPATH]

optional arguments:
  -h, --help            show this help message and exit
  -i IMGPATH, --imgpath IMGPATH
                        image path
  -p PDFPATH, --pdfpath PDFPATH
                        pdf path
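The same help text can also be produced from inside Python with format_help(), which is handy for checking it without opening a terminal. This is a small sketch that rebuilds the parser from this post; prog is set explicitly (an assumption made for the example) so the usage line shows practice1.py no matter how the snippet is run. The exact layout varies slightly between Python versions: from Python 3.10 onward, for instance, the section is titled "options:" instead of "optional arguments:".

```python
import argparse

# Same options as the script; prog fixes the name shown in the usage line.
parser = argparse.ArgumentParser(prog="practice1.py")
parser.add_argument("-i", "--imgpath", help="image path")
parser.add_argument("-p", "--pdfpath", help="pdf path")

# format_help() returns the text that -h would print, without exiting.
help_text = parser.format_help()
print(help_text)
```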
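Read files from a folder:
usage: python argparse.py path pdf1
Note: a script saved as argparse.py shadows the standard-library argparse module (import argparse would load the script itself), so give the file a different name when you try this.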
import argparse
import glob
import os

parser = argparse.ArgumentParser(description='Read in a file or set of files, and return the result.', formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument('path', nargs='+', help='Path of a file or a folder of files.')
parser.add_argument('-e', '--extension', default='', help='File extension to filter by.')
args = parser.parse_args()

# Parse paths
full_paths = [os.path.join(os.getcwd(), path) for path in args.path]
files = set()
for path in full_paths:
    if os.path.isfile(path):
        files.add(path)
    else:
        # treat the path as a folder and collect the files inside it
        files |= set(glob.glob(path + '/*' + args.extension))
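To try the path handling without real files, the loop above can be wrapped in a function and run against a temporary folder. This is only a sketch: the function name collect_files and the sample file names are made up for the demonstration, and the logic is the same one-level folder expansion as in the snippet above.

```python
import glob
import os
import tempfile


def collect_files(paths, extension=""):
    """Hypothetical wrapper: return the files named by paths, expanding folders one level."""
    files = set()
    for path in paths:
        # Relative paths are resolved against the current working directory.
        full_path = os.path.join(os.getcwd(), path)
        if os.path.isfile(full_path):
            files.add(full_path)
        else:
            files |= set(glob.glob(os.path.join(full_path, "*" + extension)))
    return files


# Demo on a throwaway folder so the sketch does not depend on real files.
with tempfile.TemporaryDirectory() as folder:
    for name in ("a.pdf", "b.pdf", "notes.txt"):
        open(os.path.join(folder, name), "w").close()
    found = collect_files([folder], extension=".pdf")
    names = sorted(os.path.basename(f) for f in found)
    print(names)  # ['a.pdf', 'b.pdf']
```

Passing extension=".pdf" corresponds to running the script with -e .pdf, so only the two PDF files are picked up and notes.txt is skipped.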
I don't like padding a post with filler, so I went straight to the point.