Commit

Handle reports that cannot be opened
hancush committed Nov 5, 2024
1 parent bc82cbe commit f746622
Showing 1 changed file with 9 additions and 2 deletions.
11 changes: 9 additions & 2 deletions scrapers/office/scrape_search.py
@@ -116,12 +116,19 @@ def _parse_filing_pdf(self, version):
         return {}
 
     else:
-        version_pdf = pdfplumber.open(io.BytesIO(pdf.content))
+        # Skip reports that can't be opened
+        try:
+            version_pdf = pdfplumber.open(io.BytesIO(pdf.content))
+        except Exception as e:
+            logger.error(
+                f"Could not open report document at {report_url} due to the following exception:\n{str(e)}"
+            )
+            return {}
 
+        # Skip reports that can't be parsed
         try:
             version_content = parse_pdf(version_pdf)
         except Exception as e:
-            # Skip reports that can't be parsed
             logger.error(
                 f"Could not parse report document at {report_url} due to the following exception:\n{str(e)}"
             )
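
The change above guards two separate failure points: opening the document and parsing it. A minimal standalone sketch of that pattern follows. The function name `open_and_parse` and its `opener` and `parser` parameters are hypothetical, introduced here only so the example runs without pdfplumber. Returning `{}` after a parse failure is also an assumption, since the diff is truncated at that point.

```python
import io
import logging

logger = logging.getLogger(__name__)


def open_and_parse(content, report_url, opener, parser):
    """Return parsed content, or {} if the document can't be opened or parsed.

    `opener` and `parser` are hypothetical stand-ins for the calls used in
    the commit (pdfplumber.open and parse_pdf).
    """
    # Skip reports that can't be opened (e.g. corrupt or non-PDF bytes).
    try:
        document = opener(io.BytesIO(content))
    except Exception as e:
        logger.error(
            f"Could not open report document at {report_url} "
            f"due to the following exception:\n{e}"
        )
        return {}

    # Skip reports that open but can't be parsed.
    # Assumption: the caller also wants {} here; the diff does not show it.
    try:
        return parser(document)
    except Exception as e:
        logger.error(
            f"Could not parse report document at {report_url} "
            f"due to the following exception:\n{e}"
        )
        return {}
```

In the scraper itself, the equivalent call would presumably pass `pdfplumber.open` as the opener and `parse_pdf` as the parser. Keeping the two `try` blocks separate lets the log message say which stage failed.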