I got an error message while running the following code to read a PDF file and extract text from it.
CODE:
import PyPDF2

# Open the PDF in binary mode and read the text of its first page
pdfFileObject = open('pythonhelp.pdf', 'rb')
pdfReader = PyPDF2.PdfFileReader(pdfFileObject)
page = pdfReader.getPage(0)
print(page.extract_text())
ERROR MESSAGE:
Superfluous whitespace found in object header b'1' b'0'
Superfluous whitespace found in object header b'2' b'0'
Superfluous whitespace found in object header b'3' b'0'
Superfluous whitespace found in object header b'54' b'0'
Superfluous whitespace found in object header b'65' b'0'
Superfluous whitespace found in object header b'68' b'0'
Superfluous whitespace found in object header b'53' b'0'
Superfluous whitespace found in object header b'15' b'0'
Superfluous whitespace found in object header b'14' b'0'
Superfluous whitespace found in object header b'13' b'0'
Superfluous whitespace found in object header b'23' b'0'
Superfluous whitespace found in object header b'22' b'0'
Superfluous whitespace found in object header b'21' b'0'
Superfluous whitespace found in object header b'31' b'0'
Superfluous whitespace found in object header b'30' b'0'
Superfluous whitespace found in object header b'29' b'0'
Superfluous whitespace found in object header b'39' b'0'
Superfluous whitespace found in object header b'38' b'0'
Superfluous whitespace found in object header b'37' b'0'
Superfluous whitespace found in object header b'51' b'0'
Superfluous whitespace found in object header b'50' b'0'
Superfluous whitespace found in object header b'49' b'0'
Superfluous whitespace found in object header b'52' b'0'
SOLUTION:
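Pass the parameter strict=False (with a capital F, since it is a Python boolean) to PdfFileReader. In non-strict mode the reader tolerates these minor formatting problems in the PDF instead of reporting them. This resolved my problem.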
pdfReader = PyPDF2.PdfFileReader(pdfFileObject, strict=False)
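As a rough sketch, the fix can be wrapped in a small helper that also closes the file when it is done. The name first_page_text is my own (hypothetical), and I am assuming a PyPDF2 release where PdfFileReader, getPage and extract_text are all available, as in the code above.

```python
def first_page_text(path, strict=False):
    """Return the text of the first page of the PDF at `path`.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Imported inside the function so that defining the helper
    # does not itself require PyPDF2 to be loaded.
    import PyPDF2

    # The with-block closes the file even if parsing fails.
    with open(path, "rb") as pdf_file:
        # strict=False asks the reader to tolerate minor format problems.
        reader = PyPDF2.PdfFileReader(pdf_file, strict=strict)
        # Extract the text while the file is still open.
        return reader.getPage(0).extract_text()
```

Calling print(first_page_text('pythonhelp.pdf')) should then behave like the corrected code above, with the file handle closed afterwards.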