This repository has been archived by the owner on Jun 15, 2023. It is now read-only.

timeout option #43

Open
DanielRuf opened this issue Aug 19, 2020 · 2 comments

Comments

@DanielRuf

DanielRuf commented Aug 19, 2020

Hi,

pdfx is very helpful for us to analyze a few things. Thanks for creating pdfx.

But we have a small problem. When a PDF file contains a lot of text, pdfx / Python only fails once the "too many recursions" error is thrown.

It would be helpful to have a max-timeout option to stop pdfx from trying to parse a file for 45 minutes or more (as happens in our case).
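Until such an option exists, one possible workaround is to enforce the limit from outside the tool. The following is a minimal sketch, not part of pdfx: it assumes the `pdfx` command is on PATH and uses Python's standard `subprocess.run` with its `timeout` argument. The helper name `run_with_timeout` and the 300-second limit are made up for illustration.

```python
import subprocess


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (status, output).

    status is 'ok' (exit code 0), 'failed' (non-zero exit code),
    or 'timeout' (the time limit was hit and the child was killed).
    """
    try:
        proc = subprocess.run(
            cmd,
            capture_output=True,  # collect stdout and stderr
            text=True,            # decode output to str
            timeout=timeout_s,    # subprocess.run kills the child when this expires
        )
    except subprocess.TimeoutExpired:
        return "timeout", ""
    status = "ok" if proc.returncode == 0 else "failed"
    return status, proc.stdout + proc.stderr


# Hypothetical usage; assumes the `pdfx` command is installed and on PATH:
# status, output = run_with_timeout(["pdfx", "some-large-file.pdf"], timeout_s=300)
```

Note that only the direct child process is killed on timeout, which should be enough here if pdfx does its parsing in a single process.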

And another small question: what is the best way to scan / check many files at once? So far we run individual pdfx commands from a bash script and wait for each command to finish before starting the next. Backgrounding the commands with the & trick causes issues with the OS job scheduler, to the point that the whole OS freezes.
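One way to process many files without starting them all at once (as backgrounding with & does) is a worker pool with a fixed size. The sketch below is an assumption-laden illustration rather than a pdfx feature: it assumes `pdfx` is on PATH, and the names `run_one` and `run_batch`, the pool size of 4, and the 300-second limit are all hypothetical. It uses only the standard library (`subprocess` and `concurrent.futures`).

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_one(cmd, timeout_s):
    """Run one command; return 'ok', 'failed', or 'timeout'."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if proc.returncode == 0 else "failed"


def run_batch(cmds, max_workers=4, timeout_s=300):
    """Run commands with at most max_workers running at the same time.

    Results are returned in the same order as cmds.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda c: run_one(c, timeout_s), cmds))


# Hypothetical usage; assumes `pdfx` is on PATH and the PDFs live in ./pdfs:
# from pathlib import Path
# files = sorted(Path("pdfs").glob("*.pdf"))
# results = run_batch([["pdfx", str(f)] for f in files], max_workers=4)
# for f, status in zip(files, results):
#     print(f, status)
```

Threads are sufficient here because each worker only waits on an external process; the pool size caps how many pdfx processes exist at once, so the machine is not flooded with jobs.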

@metachris
Owner

Could you post the full stack trace, and perhaps an example PDF? Please reopen the issue with those, thanks 🙏

@DanielRuf
Author

DanielRuf commented Apr 12, 2021

> Please reopen the issue with those, thanks

Only you can reopen the issue ;-)

Here is an example file:

54013162437.pdf

This stacktrace is produced:

pdfx.log
