Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow zstd to be used instead of deflate/flate #18

Open
Jacalz opened this issue Mar 24, 2022 · 7 comments
Open

Allow zstd to be used instead of deflate/flate #18

Jacalz opened this issue Mar 24, 2022 · 7 comments

Comments

@Jacalz
Copy link

Jacalz commented Mar 24, 2022

I would suggest that the protocol specification in some way could allow directory sends to use zstd compression instead of deflate. The speed and compression ratios are very impressive (https://facebook.github.io/zstd/) and seems to be both faster and better than deflate in almost every way.

The only mention of compression formats that I can find seems to be in https://github.com/magic-wormhole/magic-wormhole-protocols/blob/main/file-transfer-protocol.md. It seems to kind of indicate that it would be possible but I’m not sure.

I am hoping that getting this as part of the protocol specifications would allow using this in most wormhole implementations out there.

@piegamesde
Copy link
Member

piegamesde commented Mar 24, 2022

I am in favor of changing the compression layer, and I am in favor of moving to zstd too. We'd have a look at how different implementations currently handle the mode field. I'm pessimistic that this can be changed without breaking existing clients.

Therefore, this will very likely require a v2 transfer protocol. I'm all in favor for that, because the zipsize field is forcing us to build temporary zip files, which I'd absolutely like to avoid.

I am prototyping stuff in the transfer-v2 branch. It has been inactive and I have a lot of unpublished changes locally, but it hopefully provides a rough idea where things are headed.

@Jacalz
Copy link
Author

Jacalz commented Mar 24, 2022

Cool. That sounds like a good plan.

I'm all in favor for that, because the zipsize field is forcing us to build temporary zip files, which I'd absolutely like to avoid.

I definitely agree with you there. I’ve been annoyed with that need as well. It makes it a lot harder to get directory sends working on mobile when sending files that aren’t directly on the device file system.

It would also be cool to use tar instead of zip but I don’t know how much sense outside of my use case. At least when using the Go packages, tar files can be in unarchived directly from a stream but zip files need to be saved to disk first.

@piegamesde
Copy link
Member

It would also be cool to use tar instead of zip

Actually, I already tried that, and for exactly the same reasons as you mentioned. I came to the (weak) conclusion that using tar is generally unwieldy and that we would be better off sending folders as multiple files the way Croc does it. More specifically, I had hoped to offload the file system specific stuff to the tar library so I wouldn't have to deal with it as much. This turned out to be not true, and I instead ended up wrestling with tar internals in addition to the file system handling.

@Jacalz
Copy link
Author

Jacalz commented Mar 24, 2022

That sounds like it might be a better plan. It should probably solve the issues that come with using zip files.

Would it make sense to then apply streaming compression on regular files as well? Am I right in thinking that only directories are compressed at the moment?

@meejah
Copy link
Member

meejah commented Apr 1, 2022

Further twist: this could depend on the transport, too.

If you're using WebSocket via the relay and both sides support compression, that channel may already be compressed (so in that case, doing it twice wouldn't make sense obviously).

I think using Dilation is the best way to support multiple files. Given that, those Dilation-negotiated channels could include "use compression" as a future enhancement (Dilation currently does not specify anything about compression).

@piegamesde
Copy link
Member

Relay servers should not compress anything, since they are handling encrypted data anyways. So I don't think the double compression issue really applies here (although I also don't think double compression is necessarily a big problem).

@piegamesde
Copy link
Member

I'm honestly confused by the mode field in the directory offer, it does not make any sense. Because in zip files the compression is per file, and not an external attribute. The Python client just hard-codes it to "zipfile/deflated" but does not use it anywhere.

Therefore, it is safe to just use whatever compression method you want for building the zip file, and it will work fine because that part is handled by the zip library anyways. So feel free to use zstd compression.


A slightly hacky but fully backwards compatible solution would be to send the folder as tar file, and then add a "extract": true field to the file offer. Clients that support this would then proceed to extract the contents like for a directory offer. Clients that do not support this would just ignore the flag and receive a regular file for the user to extract.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants