Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Introduce basic file scan planning. #129

Merged
merged 9 commits into from
Jan 9, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -59,3 +59,4 @@ typed-builder = "^0.18"
url = "2"
urlencoding = "2"
uuid = "1.6.1"
tera = "1"
1 change: 1 addition & 0 deletions crates/iceberg/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -62,4 +62,5 @@ uuid = { workspace = true }
[dev-dependencies]
pretty_assertions = { workspace = true }
tempfile = { workspace = true }
tera = { workspace = true }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this used for?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tera is a template engine in rust. We need to mock up some test data to do unittest. For example, in crates/iceberg/testdata/example_table_metadata_v2.json, we need to setup the real table location, manifest file path, etc using temporary dir, which is only available in runtime.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or by just integrating with the existing REST catalog, that will take care of all the paths and such :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we will eventually add these into integration tests, but I also love small unit tests: they are fast to run, easy to debug, easy to understand.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With PyIceberg it will take a while for all the containers to boot on the first run, but after that the test are more or less instant :)

Copy link
Collaborator Author

@liurenjie1024 liurenjie1024 Jan 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add integration tests later to verify it. But I think unit test is still quite valuable for ensuring quality. No worry, the dev-dependency is sth similar to test scope java's build tools like maven or gradle, e.g. it will only be used in tests, and will be excluded when linked to production code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the context. What I would like about the integration tests, is that is also very easy for me to make test-cases :)

tokio = { workspace = true }
2 changes: 2 additions & 0 deletions crates/iceberg/src/error.rs
Original file line number Diff line number Diff line change
Expand Up @@ -325,6 +325,8 @@ define_from_err!(
"Failed to convert decimal literal to rust decimal"
);

define_from_err!(std::io::Error, ErrorKind::Unexpected, "IO Operation failed");

/// Helper macro to check arguments.
///
///
Expand Down
2 changes: 2 additions & 0 deletions crates/iceberg/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,8 @@ mod avro;
pub mod io;
pub mod spec;

mod scan;

#[allow(dead_code)]
pub mod expr;
pub mod transaction;
Expand Down
Loading
Loading