Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

changes from manual testing #81

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

changes from manual testing #81

wants to merge 5 commits into from

Conversation

nitrosx
Copy link
Contributor

@nitrosx nitrosx commented Oct 21, 2024

This PR includes all the changes that resulted from manual testing the offline ingestor.

Comment on lines +252 to +257
options.urls = {
"datasets" : urljoin(options.host, "datasets"),
"proposals" : urljoin(options.host, "proposals"),
"origdatablocks" : urljoin(options.host, "origdatablocks"),
"instruments": urljoin(options.host, "instruments"),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 4 endpoints don't have to be configurable from the config file, no...?
Can I turn them to be a property instead?

Comment on lines +168 to +169
if __name__ == "__main__":
main()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need it cause we expose main function as scicat_ingestor command.
You can just type scicat_ingestor to start the online ingestor.
(it doesn't hurt to have it though)

Copy link
Contributor

@YooSunYoung YooSunYoung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good overall!
Only a few comments on minor implementation details.

Comment on lines +138 to +139
if __name__ == "__main__":
main()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also these. You can just type background_ingestor instead of using the file itself as a script.

Comment on lines +277 to +297
# if file_path.exists() and compute_file_stats:
# return DataFileListItem(
# path=file_path.absolute().as_posix(),
# size=(file_stats := file_path.stat()).st_size,
# time=datetime.datetime.fromtimestamp(
# file_stats.st_ctime, tz=datetime.UTC
# ).strftime("%Y-%m-%dT%H:%M:%S.000Z"),
# chk=_calculate_checksum(file_path, file_hash_algorithm)
# if compute_file_hash
# else None,
# uid=str(file_stats.st_uid),
# gid=str(file_stats.st_gid),
# perm=oct(file_stats.st_mode),
# )
# else:
# return DataFileListItem(
# path=file_path.absolute().as_posix(),
# time=datetime.datetime.now(tz=datetime.UTC).strftime(
# "%Y-%m-%dT%H:%M:%S.000Z"
# ),
# )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the re-written part is fine. We can remove the commented out part...!

@@ -38,7 +38,10 @@ def build_offline_config() -> OfflineIngestorConfig:
# with ``OnlineIngestorConfig``.
del merged_configuration["kafka"]

return build_dataclass(OfflineIngestorConfig, merged_configuration)
config = build_dataclass(OfflineIngestorConfig, merged_configuration)
config.scicat = SciCatOptions.from_configurations(merged_configuration["scicat"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

build_dataclass automatically builds the sub-dataclasses as well.
So it should be not needed.
If it wasn't it's a bug so I need to fix it...

Comment on lines +109 to +110
provided_path = path.split("/")[1:]
provided_path[0] = "/" + provided_path[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this part should be handled in the extract_paths_from_h5_file.

@YooSunYoung
Copy link
Contributor

YooSunYoung commented Oct 21, 2024

I'll just clone it and make the CI pass : D
(After lunch)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants