Problematic file names #263

Open
danielmlow opened this issue Oct 4, 2021 · 8 comments

Comments

@danielmlow

@sanuann, the files that are submitted at the end of a protocol are currently saved like this (bottom files in white):

image

having characters after .zip can cause problems for some people when trying to unzip. could we change it to

test_debbie_01-001.zip
test_debbie_01-002.zip
test_debbie_01-003.zip
test_debbie_01-004.zip
etc?

So that they end in .zip and are easy to unzip?

@sanuann
Collaborator

sanuann commented Oct 4, 2021

i don't remember why exactly it was done that way. @satra is it okay to change it to the format daniel mentioned above?

@satra
Contributor

satra commented Oct 4, 2021

isn't that the multi part upload? in which case the zip file should be constructed by concatenating the parts.

@satra
Contributor

satra commented Oct 4, 2021

one should not be trying to unzip those files, just concatenating them. and the multipart upload should potentially also apply to activity uploads, no?

@sanuann
Collaborator

sanuann commented Oct 4, 2021

yes, they are the multipart uploads. for the activity uploads, I chose not to use multipart upload. do we need it by default for any uploads?

@satra
Contributor

satra commented Oct 4, 2021

> do we need it by default for any uploads?

yes, since any activity could generate a lot of data, especially if those are voice recordings. it should only be a single upload if the size is smaller than the multipart upload threshold.
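A minimal sketch of that rule, not the project's actual upload code: a payload smaller than the threshold goes up as a single piece, anything larger is cut into threshold-sized chunks. The function name `plan_upload` and the 5 MiB default are assumptions for illustration; the thread does not state the real threshold.

```python
import math

# Hypothetical threshold; the real value used by the uploader is not given in this thread.
MULTIPART_THRESHOLD = 5 * 1024 * 1024


def plan_upload(size_bytes, threshold=MULTIPART_THRESHOLD):
    """Return a list of (offset, length) byte ranges to upload.

    One range means a single upload; several ranges mean a multipart upload.
    """
    if size_bytes < 0 or threshold <= 0:
        raise ValueError("size must be >= 0 and threshold must be > 0")
    if size_bytes < threshold:
        # Small payload: upload it in one piece.
        return [(0, size_bytes)]
    n_chunks = math.ceil(size_bytes / threshold)
    return [
        (i * threshold, min(threshold, size_bytes - i * threshold))
        for i in range(n_chunks)
    ]


# plan_upload(10, threshold=100)   -> [(0, 10)]
# plan_upload(250, threshold=100)  -> [(0, 100), (100, 100), (200, 50)]
```

Applying the same function to activity uploads and end-of-protocol uploads would keep the behaviour consistent: the caller only checks whether it got one range or several.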

@danielmlow
Author

danielmlow commented Oct 4, 2021

other people using reproschema in the future may not be as tech savvy, so that’s why i think it’s worth changing it so that it is easy to unzip or combine, but fine as is if we provide users with instructions on how to combine

@satra
Contributor

satra commented Oct 4, 2021

@danielmlow - they are chunked uploads, they cannot be unzipped on their own. they need to be combined to create a zip file. this has nothing to do with tech savviness just that information is being uploaded in parts.

@sanuann - if you know the number of chunks in advance, it may be useful to add that to the suffix, for example .001-010 for chunk 1 of 10 chunks.
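One way such a suffix could be generated is sketched below. The helper `chunk_name` and the exact `.NNN-MMM` layout are assumptions based on the example above, not the naming the uploader currently uses.

```python
def chunk_name(base, index, total, width=3):
    """Build a chunk file name such as 'test_debbie_01.zip.001-010'.

    index is 1-based; both numbers are zero-padded so that sorting the
    names lexicographically gives the same order as sorting numerically.
    """
    if not 1 <= index <= total:
        raise ValueError("index must be between 1 and total")
    # Widen the padding if the total has more digits than the default width.
    width = max(width, len(str(total)))
    return f"{base}.{index:0{width}d}-{total:0{width}d}"


# chunk_name("test_debbie_01.zip", 1, 10)  -> "test_debbie_01.zip.001-010"
# chunk_name("test_debbie_01.zip", 10, 10) -> "test_debbie_01.zip.010-010"
```

Including the total in every name also lets someone downloading the files see at a glance whether a chunk is missing.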

@danielmlow
Author

What's the correct way of merging?

from https://unix.stackexchange.com/questions/40480/how-to-unzip-a-multipart-spanned-zip-on-linux :

"you need to first concatenate the pieces, then repair the result. cat test.zip.* concatenates all the files called test.zip.* where the wildcard * stands for any sequence of characters; the files are enumerated in lexicographic order, which is the same as numerical order thanks to the leading zeroes. >test.zip directs the output into the file test.zip."

cat test.zip.* >test.zip
zip -FF test.zip --out test-full.zip
unzip test-full.zip

"If you created the pieces by directly splitting the zip file, as opposed to creating a multi-part zip with the official Pkzip utility, all you need to do is join the parts."

cat test.zip.* >test.zip
unzip test.zip
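For people who do not have `cat` available (for example on Windows), the second case can be sketched in Python. This assumes the parts are plain byte-wise chunks of one zip file, as described earlier in this thread, and that their names sort in the right order; `merge_parts` is a hypothetical helper, not part of reproschema.

```python
import glob
import os
import shutil
import zipfile


def merge_parts(prefix, out_path):
    """Concatenate files named '<prefix>.*' in sorted order into out_path.

    Equivalent in spirit to `cat test.zip.* > merged.zip`. Returns the list
    of part paths that were joined.
    """
    parts = sorted(glob.glob(glob.escape(prefix) + ".*"))
    # Never read the output file as if it were one of the parts.
    parts = [p for p in parts if os.path.abspath(p) != os.path.abspath(out_path)]
    if not parts:
        raise FileNotFoundError(f"no parts found matching {prefix}.*")
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    # Sanity check: the joined bytes should now form a readable zip archive.
    if not zipfile.is_zipfile(out_path):
        raise ValueError(f"{out_path} is not a valid zip file after merging")
    return parts
```

A call such as `merge_parts("test.zip", "merged.zip")` would join `test.zip.001`, `test.zip.002`, and so on, after which `merged.zip` can be opened with any unzip tool.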
