I got error message while running this following code to read the pdf file and extract text from it.
CODE:
import PyPDF2
pdfFileObject = open('pythonhelp.pdf', 'rb')
pdfReader = PyPDF2.PdfFileReader(pdfFileObject)
page=pdfReader.getPage(0)
print(page.extract_text())
ERROR MESSAGE:
Superfluous whitespace found in object header b'1' b'0'
Superfluous whitespace found in object header b'2' b'0'
Superfluous whitespace found in object header b'3' b'0'
Superfluous whitespace found in object header b'54' b'0'
Superfluous whitespace found in object header b'65' b'0'
Superfluous whitespace found in object header b'68' b'0'
Superfluous whitespace found in object header b'53' b'0'
Superfluous whitespace found in object header b'15' b'0'
Superfluous whitespace found in object header b'14' b'0'
Superfluous whitespace found in object header b'13' b'0'
Superfluous whitespace found in object header b'23' b'0'
Superfluous whitespace found in object header b'22' b'0'
Superfluous whitespace found in object header b'21' b'0'
Superfluous whitespace found in object header b'31' b'0'
Superfluous whitespace found in object header b'30' b'0'
Superfluous whitespace found in object header b'29' b'0'
Superfluous whitespace found in object header b'39' b'0'
Superfluous whitespace found in object header b'38' b'0'
Superfluous whitespace found in object header b'37' b'0'
Superfluous whitespace found in object header b'51' b'0'
Superfluous whitespace found in object header b'50' b'0'
Superfluous whitespace found in object header b'49' b'0'
Superfluous whitespace found in object header b'52' b'0'
SOLUTION:
Add parameter "strict=false". this resolves my problem.
pdfReader = PyPDF2.PdfFileReader(pdfFileObject, strict=False)
VIDEO GUIDE:
Post your comments / questions
Recent Article
- ModuleNotFounEerror:No module named celery in Django Project
- How to get domain name information from a Domain using Python
- ModulenotFoundError: no module named 'debug_toolbar' -SOLUTION
- How to create superuser in django project hosted in cPanel without terminal
- CSS & images not loading in django admin | cpanel without terminal
- Could not build wheels for mysqlclient, which is required to install pyproject.toml-based projects
- How to sell domain name on Godaddy (2023)
- TemplateSyntaxError at / Could not parse the remainder: ' + 1' from 'forloop.counter0 + 1'
Related Article