[Feature request] Add support for S3, in particular MinIO storage #2028

Open
mat21mf opened this issue Aug 5, 2024 · 3 comments
Assignees
Labels
enhancement (New feature or request), qsv pro (requires backend/cloud services), WIP (work in progress)

Comments

@mat21mf

mat21mf commented Aug 5, 2024

Is your feature request related to a problem? Please describe.

We are loading multiple data sources into a self-hosted MinIO instance. Most of them come in CSV or similar formats and are large. Currently I use qsv to preprocess them and convert them to Parquet once they meet a minimum level of consistency. To my knowledge, qsv doesn't support the S3 protocol, so it can't connect to MinIO storage using credentials. As a result, I preprocess the data outside the data lake, just before loading it into MinIO.

Describe the solution you'd like

Is there any chance of adding support for the S3 protocol, in particular for MinIO, in the future?

Give thanks please

I really thank you guys for this tool, it's amazing.

jqnatividad added the enhancement (New feature or request) label Aug 6, 2024
@jqnatividad
Owner

Hi @mat21mf ,

I've been thinking about this for a while (see #654 ), but never got around to actually doing it as I "scratch itches" based on our other projects.

I started working on it but didn't finish:

qsv/src/config.rs

Lines 108 to 124 in 2e3f4e5

```rust
            None => (None, default_delim, false),
            // WIP: support remote files; currently only http(s) is supported
            // Some(ref s) if s.starts_with("http") && Url::parse(s).is_ok() => {
            //     let mut snappy = false;
            //     let delim = if s.ends_with(".csv.sz") {
            //         snappy = true;
            //         b','
            //     } else if s.ends_with(".tsv.sz") || s.ends_with(".tab.sz") {
            //         snappy = true;
            //         b'\t'
            //     } else {
            //         default_delim
            //     };
            //     // download the file to a temporary location
            //     util::download_file()
            //     (Some(PathBuf::from(s)), delim, snappy)
            // },
```
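
For anyone skimming the commented-out block above, its extension-based delimiter logic can be sketched in isolation. This is a minimal standalone sketch, not qsv's actual code: the function name `infer_delimiter` and its tuple return type are assumptions made for illustration. Only the extension rules (`.csv.sz`, `.tsv.sz`, `.tab.sz`) are taken from the snippet.

```rust
/// Hypothetical helper (NOT part of qsv): mirrors the extension checks in the
/// commented-out WIP block. Returns `(delimiter, is_snappy_compressed)`.
fn infer_delimiter(path: &str, default_delim: u8) -> (u8, bool) {
    if path.ends_with(".csv.sz") {
        // Snappy-compressed CSV: comma-delimited
        (b',', true)
    } else if path.ends_with(".tsv.sz") || path.ends_with(".tab.sz") {
        // Snappy-compressed TSV: tab-delimited
        (b'\t', true)
    } else {
        // Anything else: keep the caller's default, assume no snappy compression
        (default_delim, false)
    }
}
```

Because the check only looks at the suffix of the string, the same rule would presumably apply to a local path, an http(s) URL, or an object-store key, which is why it can be kept separate from whatever code actually fetches the file.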

Anyway, it's back on the enhancement backlog.

jqnatividad added a commit that referenced this issue Aug 19, 2024
- to get latest fixes/features and to lay the groundwork for supporting cloud storage sources/targets through polars.

see #2028
@jqnatividad
Owner

With the polars cloud and aws features enabled, this is queued for the next release...

jqnatividad added the WIP (work in progress) label Sep 6, 2024
jqnatividad self-assigned this Sep 6, 2024
jqnatividad added the qsv pro (requires backend/cloud services) label Sep 13, 2024
@jqnatividad
Owner

This will be implemented in qsv pro, not qsv.

cc @rzmk
