optimize store module keys and values #542

Open
abhimanyusinghgaur opened this issue Sep 26, 2024 · 0 comments

Comments

@abhimanyusinghgaur
Contributor

As per the doc here:
[Screenshot of the referenced doc, 2024-09-26]

  1. All numeric data types are currently serialized to strings before being stored, which wastes storage space. They could instead be serialized to bytes (little endian or big endian, either works), which would be more efficient in both storage and compute. If debugging is needed, those bytes can always be converted to strings for readability. Performance-wise, byte serialization would be the better choice.
  2. The same applies to keys. Keys are also string serialized, when they could simply be raw bytes. Right now, using an Ethereum address as a store key requires converting it to hex, which takes 40 bytes instead of the 20 bytes of the raw form (2x the cost). Base64 would reduce this, but it still takes 28 bytes. So although 20 bytes would suffice, that is not allowed at the moment.
    When I tried using the raw address bytes as keys via an unsafe string conversion, my module started behaving strangely. It kept reading more and more data: the substreams GUI showed around 1.2GB read for a range of 100 blocks, and I had to kill it because it wasn't producing any output. For the same block range with hex-encoded address keys, it finished after reading only ~60MB. So there appears to be some internal limitation around raw bytes at the moment, which ideally shouldn't exist.
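
To make the size arithmetic in both points concrete, here is a minimal sketch in plain Python. It does not use any Substreams API; the helper names (`numeric_sizes`, `key_sizes`, `is_valid_utf8`) and the sample address are hypothetical and exist only for illustration. The UTF-8 check shows one property of raw address bytes that could matter when they are forced into a string key. It is an assumption, not a confirmed cause of the behavior described above.

```python
import base64
import struct


def numeric_sizes(value: int) -> dict:
    """Byte lengths of a uint64 stored as decimal text vs. fixed-width bytes."""
    as_text = str(value).encode("ascii")   # what string serialization stores
    as_bytes = struct.pack(">Q", value)    # 8 bytes, big endian
    # Round trip: the fixed-width form decodes back to the same number.
    assert struct.unpack(">Q", as_bytes)[0] == value
    return {"text": len(as_text), "binary": len(as_bytes)}


def key_sizes(address: bytes) -> dict:
    """Byte lengths of an address key in raw, hex, and base64 form."""
    return {
        "raw": len(address),
        "hex": len(address.hex()),
        "base64": len(base64.b64encode(address)),
    }


def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes can be decoded as a UTF-8 string."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Arbitrary 20-byte sample value, used only for illustration.
ADDRESS = bytes.fromhex("de0b295669a9fd93d5f28d9ec85e40f4cb697bae")

if __name__ == "__main__":
    print(key_sizes(ADDRESS))
    print(numeric_sizes(2**64 - 1))
    print(is_valid_utf8(ADDRESS))
```

Two caveats on the sketch: a fixed-width 8-byte encoding is smaller than decimal text only for values with more than 8 digits, so the saving for numeric values depends on the data; and the sample address is not valid UTF-8, which is typical for arbitrary bytes and is why a string-typed key cannot safely carry them without an encoding step such as hex or base64.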

This is related to: https://discord.com/channels/666749063386890256/982135810742697984/1271115693093556295
