Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crawler:buildQueue fails with Error at offset 0 of 92 bytes #1087

Open
hacksch opened this issue Jul 18, 2024 · 5 comments
Open

crawler:buildQueue fails with Error at offset 0 of 92 bytes #1087

hacksch opened this issue Jul 18, 2024 · 5 comments

Comments

@hacksch
Copy link
Contributor

hacksch commented Jul 18, 2024

Bug Report

Current Behavior

i execute the following command to build a queue and crawl the found pages.
"typo3cms crawler:buildQueue --depth 3 --mode exec 1 default"
After list of the found urls i get

Processing

0/299 [>---------------------------] 0%
Error at offset 0 of 92 bytes

Expected behavior/output
Execution should not fail

Steps to reproduce

Prepare a configuration
execute "typo3cms crawler:buildQueue --depth 3 --mode exec 1 default"

Environment

  • Crawler version(s): 11.0.10
  • TYPO3 version(s): 11.5.38
  • PHP version(s): 8.3
  • Is your TYPO3 installation set up with Composer (Composer Mode): yes

Possible Solution
The error appiers in JsonCompatibilityConverter line 39.

The dataString for unserialize in my behavior is
{"url":"https://www.domain.de/en/home.html","procInstructions":[""],"procInstrParams":[]}

This is not valid for unserialize. When i comment the lines 39-49 the execution works.

@ulrichmathes
Copy link

Your dataString looks totally fine, can be json_decoded and is an array afterwards which will be returned before unserialize take place.

I managed to see a convert call with an empty $dataString. This is not an array and throws the warning in unserialize.

I'm not familiar with this whole process but maybe we should:

  1. fail fast when $dataString is empty
  2. do not use try catch on unserialize as it does not throw anything but emitted a warning (since PHP 8.3)

Steps to reproduce: Crawler log -> Log -> Reload List

Should I prepare one or two pull requests?

@tomasnorre
Copy link
Owner

@ulrichmathes If you have a suggested fix in mind, I would be happy to review a PR.

ulrichmathes added a commit to ulrichmathes/crawler that referenced this issue Aug 7, 2024
Fixes tomasnorre#1087

Return early on empty $dataString as this can't be
an array in any case.
ulrichmathes added a commit to ulrichmathes/crawler that referenced this issue Aug 7, 2024
Fixes tomasnorre#1087

The intention of the try-catch block was to
catch Throwables during the unserialization of $dataString.

However, unserialize only throws exceptions when the
initialization of objects fails. With 'allowed_classes' => false,
no objects will be initialized.

Since PHP 8.3, an E_WARNING is issued if the string
is not unserializable. As we are not certain
if the string contains serialized data, we only want
to attempt to unserialize it.

To prevent E_WARNINGS, we need to suppress errors
instead of using try-catch.
ulrichmathes added a commit to ulrichmathes/crawler that referenced this issue Aug 7, 2024
@ulrichmathes ulrichmathes mentioned this issue Aug 7, 2024
4 tasks
@hacksch
Copy link
Contributor Author

hacksch commented Aug 7, 2024

Hello, i will check your fix on monday. I'm not sure if this will help in my case i described and had debugged.
In my case the string was not empty and i was a error which stopped the process an was not a warning

@hacksch
Copy link
Contributor Author

hacksch commented Aug 8, 2024

And what i remember now is that i tried to unserialize the string directly via unserialize() and could reproduce the error. Without any further code.
Maybe a check if the string contains serialized data could be a solution for the described problem, else the lines 39-49 will skipped.
What do you think?

@tomasnorre
Copy link
Owner

Hi @hacksch,

Sorry for not getting back to you. What do you think of the approach in PR #1091?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants