Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use real filesystem in ops.testing backend #960

Merged
merged 21 commits into from
Jul 12, 2023
Merged

Conversation

weiiwang01
Copy link
Contributor

This pull request refactors the testing module of the Ops library, switching its backend from an in-memory virtual filesystem to a real filesystem for simulating filesystem operations within containers. Furthermore, it exposes a new API, Harness.get_filesystem_root, enabling developers to interact with the simulated filesystem during test cases for both populating and inspecting the testing container filesystem.

Supersede #933

Changelog

  • Refactored the testing module to use a real filesystem for simulating container filesystem operations.
  • Introduced a new API, ops.testing.Harness.get_filesystem_root, for accessing and manipulating the simulated filesystem during testing.

Copy link
Collaborator

@benhoyt benhoyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your continued effort here. This is really good work. I've left a bunch of comments. Mostly minor, but one discussion point about whether we should do the user/group stuff at all. I'll also try to schedule a meeting tomorrow to finish discussion on this.


"""
if isinstance(container, str):
container_name = container
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was quite surprising to me that the str case worked without accessing the container first; I initially thought that backend._pebble_clients wouldn't yet be populated. But it does work, because instantiating the Model calls get_unit which calls Unit which calls ContainerMapping which calls Container which calls get_pebble, which finally populates _pebble_clients... phew!

I'm not sure you're doing anything wrong, it was just strange to me that this worked! :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment here as suggested.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't something that I think we have a contract for it always working, but it does fit specifically in how Harness is meant to work. (It is ok for it to rely on details of the implementation, because it is shipped with those details and can be updated if those details change. It would not be good for arbitrary charms to depend on those details directly)
Just make sure we have a good test case for it, so that if we do change that behavior elsewhere, we know that we need to fix this code path for our users.

ops/testing.py Outdated Show resolved Hide resolved
ops/testing.py Outdated Show resolved Hide resolved
ops/testing.py Outdated Show resolved Hide resolved
ops/testing.py Outdated Show resolved Hide resolved
ops/testing.py Outdated Show resolved Hide resolved
ops/testing.py Show resolved Hide resolved
ops/testing.py Outdated Show resolved Hide resolved
self.assertTrue(foo_root.is_dir())
harness.begin()
container = harness.charm.unit.get_container("foo")
self.assertEqual(foo_root, harness.get_filesystem_root(container))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good test as far as it goes, but I think we need a few more test cases to test that 1) files that the test writes via get_filesystem_root can then be pulled and listed, and 2) files pushed and make_dir'd can then be accessed via get_filesystem_root. Or am I missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we already have a suite of tests for the filesystem, using the Pebble APIs (blackbox tests). I think I will add some more tests related to the get_filesystem_root method, since that's an exposed API now. Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added more unit tests for the get_filesystem_root method, thanks!

@@ -38,6 +38,7 @@ jobs:
matrix:
os: [ubuntu-latest, macos-latest]
python-version: ["3.8", "3.10"]
root: [true, false]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love how this shows up in the test summary in GitHub, e.g., "ubuntu-latest, 3.10, false" -- it's unclear what "false" means. Also, I'd rather not run all the tests twice, and I don't think we need to or should be running all the tests as root anyway.

Can we use a suitable test name prefix, and then do something like we do with the RealPebble tests below? Perhaps a TestFilesystemAsRoot class and then use -k FilesystemAsRoot filter. I think having a separate job is cleaner (even though it's a bit more duplication).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the tests depend on the outcome of the discussion, I will defer the changes for now.

@weiiwang01
Copy link
Contributor Author

Thanks for your continued effort here. This is really good work. I've left a bunch of comments. Mostly minor, but one discussion point about whether we should do the user/group stuff at all. I'll also try to schedule a meeting tomorrow to finish discussion on this.

Thanks for the fantastic review! Yes, let's talk about the permission issue.

Another topic I want to discuss is about simulating storage mounts. Currently, we are using symbolic links to simulate storage mounts. However, actual storage mounts have some special properties that symbolic links can't simulate. For example, when a filesystem is attached to a directory, the original content of that directory becomes "shadowed". Additionally, it's an error to delete the mounting directory. Also, list_files should display the mounting directory as a directory, but right now, it shows as a symbolic link.

Of course, we can simulate these behaviors in the harness, but I feel like this introduces dishonesty into the design. I would prefer to let the user know that we are using symbolic links to simulate storage mounts and it's the users' responsibility to avoid tests involves unique properties of mounting directories.

@benhoyt
Copy link
Collaborator

benhoyt commented Jul 7, 2023

Per yesterday's discussion, @weiiwang01's going to simplify by removing the user/group handling, and (for now at least) document where the test harness isn't a perfect match. As a consequence, we can always remove the tests running as root (we don't want to run tests as root or encourage charmers to do that). He's also going to remove the secrets.token_urlsafe call and just leave that out -- people shouldn't be testing exact contents of error messages anyway.

Copy link
Collaborator

@benhoyt benhoyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is great. Just a couple of nit comments. And also get a second review (probably John's as we've discussed this with him).

ops/testing.py Outdated Show resolved Hide resolved
ops/testing.py Outdated Show resolved Hide resolved
ops/testing.py Outdated
rel_path = os.path.relpath(file_info.path, start=self._root)
rel_path = '/' if rel_path == '.' else '/' + rel_path
file_info.path = rel_path
file_info.name = "/" if rel_path == "/" else os.path.basename(rel_path)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't file_info.name correct already?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for the root directory, generated filename is the name of the temporary directory.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it is only for the root, why set it always, vs just:

if rel_path == "/":
  // override the root name, since it should be considered the local root, without any filename
  file_info.name = "/"

It seems clearer than always setting it to the value it should already have. (and avoids computing basename() for every file in every directory if you are iterating over them)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

John's suggestion looks clearer to me too.

ops/testing.py Outdated Show resolved Hide resolved
ops/testing.py Outdated Show resolved Hide resolved
def test_pull_path(self):
(self.root / "foo").mkdir()
(self.root / "foo/bar").write_text("bar")
# TODO: pull_path doesn't pull empty directories, intended?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, if that's the case, that seems like a bug. Can you please open a separate canonical/operator issue for that?

@benhoyt benhoyt requested a review from jameinel July 10, 2023 00:50
@weiiwang01
Copy link
Contributor Author

Per yesterday's discussion, @weiiwang01's going to simplify by removing the user/group handling, and (for now at least) document where the test harness isn't a perfect match. As a consequence, we can always remove the tests running as root (we don't want to run tests as root or encourage charmers to do that). He's also going to remove the secrets.token_urlsafe call and just leave that out -- people shouldn't be testing exact contents of error messages anyway.

I've eliminated the ownership handling and restructured the code according to the recommendation. However, while writing the tests, I came across an problem: the pull_path method within ops.model.Container doesn't retrieve empty directories. I'm curious to know if this is by design, similar to how git operates.

@benhoyt
Copy link
Collaborator

benhoyt commented Jul 10, 2023

Hi @weiiwang01 Thanks -- yep, I replied to your comments above. I've opened an issue on the pull_path empty directory issue: #968 -- also see the other comments above.


"""
if isinstance(container, str):
container_name = container
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't something that I think we have a contract for it always working, but it does fit specifically in how Harness is meant to work. (It is ok for it to rely on details of the implementation, because it is shipped with those details and can be updated if those details change. It would not be good for arbitrary charms to depend on those details directly)
Just make sure we have a good test case for it, so that if we do change that behavior elsewhere, we know that we need to fix this code path for our users.

self._harness_storage_path = pathlib.Path(self._harness_tmp_dir.name) / "storages"
self._harness_container_path = pathlib.Path(self._harness_tmp_dir.name) / "containers"
self._harness_storage_path.mkdir()
self._harness_container_path.mkdir()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its a bit of a shame to me that we have a bunch of mkdir() and ultimately rmdir() calls for every unit test that uses a Harness, even if those tests never want to access the filesystem. I suppose we could rework that in the future, though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I thought the same too. But then I did a timing test, and on my laptop I can os.mkdir() 10,000 directories in 0.1s and 100,000 in 1.6s (and a bit less than that for os.rmdir()). So 3 dir creates + deletes adds about 50 microseconds to each test. Linux FS operations are quick!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can prefer /dev/shm as the tempdir to expedite the filesystem calls. But as the benchmark shows, it doesn't really have a significant influence on the overall performances.

ops/testing.py Outdated
rel_path = os.path.relpath(file_info.path, start=self._root)
rel_path = '/' if rel_path == '.' else '/' + rel_path
file_info.path = rel_path
file_info.name = "/" if rel_path == "/" else os.path.basename(rel_path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it is only for the root, why set it always, vs just:

if rel_path == "/":
  // override the root name, since it should be considered the local root, without any filename
  file_info.name = "/"

It seems clearer than always setting it to the value it should already have. (and avoids computing basename() for every file in every directory if you are iterating over them)

@benhoyt
Copy link
Collaborator

benhoyt commented Jul 10, 2023

Thanks. Let's just make that file_info.name = "/" tweak that John mentioned, and then I'll see about merging this!

@benhoyt benhoyt merged commit 770282b into canonical:main Jul 12, 2023
@benhoyt
Copy link
Collaborator

benhoyt commented Jul 12, 2023

Yeeha! Thanks for this @weiiwang01. This will be included with the next version of ops, 2.5.0, planned for release late July / early August.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants