-
Notifications
You must be signed in to change notification settings - Fork 119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use real filesystem in ops.testing backend #960
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your continued effort here. This is really good work. I've left a bunch of comments. Mostly minor, but one discussion point about whether we should do the user/group stuff at all. I'll also try to schedule a meeting tomorrow to finish discussion on this.
|
||
""" | ||
if isinstance(container, str): | ||
container_name = container |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was quite surprising to me that the str
case worked without accessing the container first; I initially thought that backend._pebble_clients
wouldn't yet be populated. But it does work, because instantiating the Model
calls get_unit
which calls Unit
which calls ContainerMapping
which calls Container
which calls get_pebble
, which finally populates _pebble_clients
... phew!
I'm not sure you're doing anything wrong, it was just strange to me that this worked! :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment here as suggested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't something that I think we have a contract for it always working, but it does fit specifically in how Harness is meant to work. (It is ok for it to rely on details of the implementation, because it is shipped with those details and can be updated if those details change. It would not be good for arbitrary charms to depend on those details directly)
Just make sure we have a good test case for it, so that if we do change that behavior elsewhere, we know that we need to fix this code path for our users.
self.assertTrue(foo_root.is_dir()) | ||
harness.begin() | ||
container = harness.charm.unit.get_container("foo") | ||
self.assertEqual(foo_root, harness.get_filesystem_root(container)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good test as far as it goes, but I think we need a few more test cases to test that 1) files that the test writes via get_filesystem_root
can then be pulled and listed, and 2) files pushed and make_dir'd can then be accessed via get_filesystem_root
. Or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we already have a suite of tests for the filesystem, using the Pebble APIs (blackbox tests). I think I will add some more tests related to the get_filesystem_root
method, since that's an exposed API now. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added more unit tests for the get_filesystem_root
method, thanks!
@@ -38,6 +38,7 @@ jobs: | |||
matrix: | |||
os: [ubuntu-latest, macos-latest] | |||
python-version: ["3.8", "3.10"] | |||
root: [true, false] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love how this shows up in the test summary in GitHub, e.g., "ubuntu-latest, 3.10, false" -- it's unclear what "false" means. Also, I'd rather not run all the tests twice, and I don't think we need to or should be running all the tests as root anyway.
Can we use a suitable test name prefix, and then do something like we do with the RealPebble tests below? Perhaps a TestFilesystemAsRoot class and then use -k FilesystemAsRoot
filter. I think having a separate job is cleaner (even though it's a bit more duplication).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the tests depend on the outcome of the discussion, I will defer the changes for now.
Co-authored-by: Ben Hoyt <[email protected]>
Co-authored-by: Ben Hoyt <[email protected]>
Co-authored-by: Ben Hoyt <[email protected]>
Co-authored-by: Ben Hoyt <[email protected]>
Co-authored-by: Ben Hoyt <[email protected]>
Co-authored-by: Ben Hoyt <[email protected]>
Thanks for the fantastic review! Yes, let's talk about the permission issue. Another topic I want to discuss is about simulating storage mounts. Currently, we are using symbolic links to simulate storage mounts. However, actual storage mounts have some special properties that symbolic links can't simulate. For example, when a filesystem is attached to a directory, the original content of that directory becomes "shadowed". Additionally, it's an error to delete the mounting directory. Also, Of course, we can simulate these behaviors in the harness, but I feel like this introduces dishonesty into the design. I would prefer to let the user know that we are using symbolic links to simulate storage mounts and it's the users' responsibility to avoid tests involves unique properties of mounting directories. |
Per yesterday's discussion, @weiiwang01's going to simplify by removing the user/group handling, and (for now at least) document where the test harness isn't a perfect match. As a consequence, we can always remove the tests running as root (we don't want to run tests as root or encourage charmers to do that). He's also going to remove the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is great. Just a couple of nit comments. And also get a second review (probably John's as we've discussed this with him).
ops/testing.py
Outdated
rel_path = os.path.relpath(file_info.path, start=self._root) | ||
rel_path = '/' if rel_path == '.' else '/' + rel_path | ||
file_info.path = rel_path | ||
file_info.name = "/" if rel_path == "/" else os.path.basename(rel_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn't file_info.name
correct already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for the root directory, generated filename is the name of the temporary directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it is only for the root, why set it always, vs just:
if rel_path == "/":
// override the root name, since it should be considered the local root, without any filename
file_info.name = "/"
It seems clearer than always setting it to the value it should already have. (and avoids computing basename() for every file in every directory if you are iterating over them)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
John's suggestion looks clearer to me too.
test/test_testing.py
Outdated
def test_pull_path(self): | ||
(self.root / "foo").mkdir() | ||
(self.root / "foo/bar").write_text("bar") | ||
# TODO: pull_path doesn't pull empty directories, intended? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, if that's the case, that seems like a bug. Can you please open a separate canonical/operator issue for that?
I've eliminated the ownership handling and restructured the code according to the recommendation. However, while writing the tests, I came across an problem: the |
Hi @weiiwang01 Thanks -- yep, I replied to your comments above. I've opened an issue on the |
Co-authored-by: Ben Hoyt <[email protected]>
|
||
""" | ||
if isinstance(container, str): | ||
container_name = container |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't something that I think we have a contract for it always working, but it does fit specifically in how Harness is meant to work. (It is ok for it to rely on details of the implementation, because it is shipped with those details and can be updated if those details change. It would not be good for arbitrary charms to depend on those details directly)
Just make sure we have a good test case for it, so that if we do change that behavior elsewhere, we know that we need to fix this code path for our users.
self._harness_storage_path = pathlib.Path(self._harness_tmp_dir.name) / "storages" | ||
self._harness_container_path = pathlib.Path(self._harness_tmp_dir.name) / "containers" | ||
self._harness_storage_path.mkdir() | ||
self._harness_container_path.mkdir() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its a bit of a shame to me that we have a bunch of mkdir() and ultimately rmdir() calls for every unit test that uses a Harness, even if those tests never want to access the filesystem. I suppose we could rework that in the future, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I thought the same too. But then I did a timing test, and on my laptop I can os.mkdir()
10,000 directories in 0.1s and 100,000 in 1.6s (and a bit less than that for os.rmdir()
). So 3 dir creates + deletes adds about 50 microseconds to each test. Linux FS operations are quick!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can prefer /dev/shm
as the tempdir
to expedite the filesystem calls. But as the benchmark shows, it doesn't really have a significant influence on the overall performances.
ops/testing.py
Outdated
rel_path = os.path.relpath(file_info.path, start=self._root) | ||
rel_path = '/' if rel_path == '.' else '/' + rel_path | ||
file_info.path = rel_path | ||
file_info.name = "/" if rel_path == "/" else os.path.basename(rel_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it is only for the root, why set it always, vs just:
if rel_path == "/":
// override the root name, since it should be considered the local root, without any filename
file_info.name = "/"
It seems clearer than always setting it to the value it should already have. (and avoids computing basename() for every file in every directory if you are iterating over them)
Thanks. Let's just make that |
Yeeha! Thanks for this @weiiwang01. This will be included with the next version of ops, 2.5.0, planned for release late July / early August. |
This pull request refactors the testing module of the Ops library, switching its backend from an in-memory virtual filesystem to a real filesystem for simulating filesystem operations within containers. Furthermore, it exposes a new API,
Harness.get_filesystem_root
, enabling developers to interact with the simulated filesystem during test cases for both populating and inspecting the testing container filesystem.Supersede #933
Changelog
ops.testing.Harness.get_filesystem_root
, for accessing and manipulating the simulated filesystem during testing.