I got an error message while running the following code to read a PDF file and extract text from it.
CODE:
import PyPDF2
pdfFileObject = open('pythonhelp.pdf', 'rb')
pdfReader = PyPDF2.PdfFileReader(pdfFileObject)
page = pdfReader.getPage(0)
print(page.extract_text())
ERROR MESSAGE:
Superfluous whitespace found in object header b'1' b'0'
Superfluous whitespace found in object header b'2' b'0'
Superfluous whitespace found in object header b'3' b'0'
Superfluous whitespace found in object header b'54' b'0'
Superfluous whitespace found in object header b'65' b'0'
Superfluous whitespace found in object header b'68' b'0'
Superfluous whitespace found in object header b'53' b'0'
Superfluous whitespace found in object header b'15' b'0'
Superfluous whitespace found in object header b'14' b'0'
Superfluous whitespace found in object header b'13' b'0'
Superfluous whitespace found in object header b'23' b'0'
Superfluous whitespace found in object header b'22' b'0'
Superfluous whitespace found in object header b'21' b'0'
Superfluous whitespace found in object header b'31' b'0'
Superfluous whitespace found in object header b'30' b'0'
Superfluous whitespace found in object header b'29' b'0'
Superfluous whitespace found in object header b'39' b'0'
Superfluous whitespace found in object header b'38' b'0'
Superfluous whitespace found in object header b'37' b'0'
Superfluous whitespace found in object header b'51' b'0'
Superfluous whitespace found in object header b'50' b'0'
Superfluous whitespace found in object header b'49' b'0'
Superfluous whitespace found in object header b'52' b'0'
SOLUTION:
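Add the parameter strict=False (with a capital F, since it is a Python boolean) when creating the reader. This resolved my problem. The messages appear to be warnings that PyPDF2 emits in strict mode when a PDF's object headers contain extra whitespace, so turning strict mode off lets the reader accept the file quietly.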
pdfReader = PyPDF2.PdfFileReader(pdfFileObject, strict=False)
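For reference, the fix can be folded into a small helper. This is only a sketch: it assumes a PyPDF2 release that still provides PdfFileReader and getPage (they were removed in PyPDF2 3.0, where PdfReader and reader.pages are used instead), and the function name extract_first_page_text is made up for illustration.

```python
def extract_first_page_text(pdf_path):
    """Return the text of the first page of the PDF at pdf_path.

    Hypothetical helper; assumes a PyPDF2 version that still ships
    PdfFileReader/getPage (i.e. older than 3.0).
    """
    # Imported inside the function so that defining the helper does not
    # require PyPDF2 to be installed; only calling it does.
    import PyPDF2

    # The with-block closes the file even if parsing fails.
    with open(pdf_path, "rb") as pdf_file:
        # strict=False tells the reader to tolerate minor format problems,
        # such as extra whitespace in object headers, without warning.
        reader = PyPDF2.PdfFileReader(pdf_file, strict=False)
        page = reader.getPage(0)
        # Extract the text while the file is still open.
        return page.extract_text()
```

Calling extract_first_page_text('pythonhelp.pdf') should then return the first page's text without the whitespace messages, provided the file is otherwise readable.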