Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resuming after stop() is broken #91

Open
klkvsk opened this issue Nov 12, 2020 · 4 comments
Open

resuming after stop() is broken #91

klkvsk opened this issue Nov 12, 2020 · 4 comments

Comments

@klkvsk
Copy link

klkvsk commented Nov 12, 2020

So I'm trying to write a wrapper over this lib that would be used as a generator. Basically I want to yield all json-objects I need from a stream.

The lib does not use yields or generators, so this is what I came up with:

First, a simple listener for this example: lets collect all values from "foo" fields given they all are scalar

class FooJsonListener extends \JsonStreamingParser\Listener\IdleListener {
    protected $isReadingFoo = false;
    protected $onFooCallback;
    
    public function onFoo(callable $onFooCallback)
    {
        $this->onFooCallback = $onFooCallback;
    }

    public function key(string $key): void
    {
        $this->isReadingFoo = ($key === 'foo');
    }

    public function value($value): void
    {
        if ($this->isReadingFoo) {
            call_user_func($this->onFooCallback, $value);
            $this->isReadingFoo = false;
        }
    }
}

Now this function starts parser, then stops every time we collected a foo-value, yields it, then resumes parsing again:

function fooIterator($stream) {
    $jsonListener = new FooJsonListener();
    $jsonParser = new JsonStreamingParser\Parser($stream, $jsonListener);

    $lastFoo = null;
    $shouldYieldFoo = false;
    $jsonListener->onFoo(function ($foo) use (&$lastFoo, &$shouldYieldFoo, $jsonParser) {
        $lastFoo = $foo;
        $shouldYieldFoo = true;
        $jsonParser->stop();
    });

    while (true) {
        $jsonParser->parse();
        if ($shouldYieldFoo) {
            yield $lastFoo;
            $shouldYieldFoo = false;
        } else {
            break;
        }
    }
}

Looks awful, but should work. But it does not. After the first yield, next parse() will throw:

JsonStreamingParser\Exception\ParsingException: "Parsing error in [1:1]. Expected ',' or ']' while parsing array. Got: {"

It works if we don't interrupt it with "stop()" thought.

The problem:
parse() reads stream by chunks, but parses char-by-char. On any char there might be a call to listener to register parsed token. After every char it checks stopParsing flag, raised by stop().
So most of the cases, when you call stop() from listener you break chunk parsing in the middle, leftover of the chunk is discarded, and calling parse() then will proceed from the next chunk, not where it stopped.
And also stopParsing is never reset to false.

Possible fixes:
a. Store the actual count of bytes read on stop(), and do fseek to this offset on next parse() (would not work with non-seekable streams thought)
b. Better: store the leftover of chunk on stop() and prepend it to data read on next parse()

Sadly, neither of this is possible to implement by extending classes due to the privates.

@gonzofy
Copy link
Contributor

gonzofy commented Nov 12, 2020

Feel free to submit a PR!

@klkvsk
Copy link
Author

klkvsk commented Dec 7, 2020

This is kind of off-topic, but instead of changing your architecture I decided to write my own lib based on this idea of using Generators from the ground up. Check out klkvsk/json-decode-stream if you like.

@gonzofy
Copy link
Contributor

gonzofy commented Dec 7, 2020

@klkvsk nice!

@kirkbushell
Copy link

Instead of suggesting the use of stop, I would highly recommend instead removing the :void returns on all the methods. It doesn't achieve anything and prevents yield use-cases :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants