Software version information missing from benchmarks documentation #289

Open
jgrg opened this issue Jul 19, 2024 · 1 comment

Comments

jgrg commented Jul 19, 2024

Your documentation page "DataFrames at Scale Comparison: TPC-H" has some good information on how you set up the benchmarks, but it does not mention the versions of any of the software. This is important if I want to know if the results are still current.

I also don't understand what the More than SQL column in the feature comparison table means. DuckDB has a cross in this row, but it has ways of using it without touching the SQL layer such as relational on Pandas or Ibis.

@scharlottej13
Contributor

Hi @jgrg! Thanks for opening up this issue.

> it does not mention the versions of any of the software. This is important if I want to know if the results are still current.

We used the following package versions:

pyspark[sql]==3.4.1 
polars==0.20.16
duckdb==0.10.1
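
To check whether a local environment matches these pins, something like the following could work. It is only a sketch using the standard library's `importlib.metadata`; the names `BENCHMARK_PINS` and `compare_versions` are made up for illustration and are not part of the benchmark code.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins quoted in this thread (distribution name -> version used for the benchmarks).
# BENCHMARK_PINS is a hypothetical name, not something from the benchmark repository.
BENCHMARK_PINS = {
    "pyspark": "3.4.1",
    "polars": "0.20.16",
    "duckdb": "0.10.1",
}


def compare_versions(pins, lookup=version):
    """Return {package: (pinned, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


if __name__ == "__main__":
    for name, (pinned, installed) in compare_versions(BENCHMARK_PINS).items():
        shown = installed or "not installed"
        print(f"{name}: benchmarks used {pinned}, local environment has {shown}")
```

An empty result means the local environment matches the versions listed above. Any entry that is reported points to a package whose version differs from the one used for the published results.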

> I also don't understand what the More than SQL column in the feature comparison table means. DuckDB has a cross in this row, but it has ways of using it without touching the SQL layer such as relational on Pandas or Ibis.

Thanks for pointing this out! This table is certainly a more subjective summary (especially as compared to the benchmark results). It seems like this could be a good opportunity for us to try out Ibis or DuckDB's relational pandas API and consider making some updates.
