Make `run-task` and `docker-image` hashes optional in cache names #505

ahal · 2024-05-13T13:50:08Z

Currently Taskgraph adds both the hash of run-task and the docker-image tasks to cache names (if those things are being used):

taskgraph/src/taskgraph/transforms/task.py, line 519 in e556578:

suffix = f"{cache_version}-{_run_task_suffix()}"

This favors correctness: it almost guarantees that we won't get errors caused by different versions of tools being used across the same set of files. However, it comes at the cost of more cache misses.
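To make the trade-off concrete, here is a minimal sketch of how a hash-based suffix changes the cache name. It is not Taskgraph's actual implementation: `short_hash`, `cache_name`, the 20-character truncation, and the `base-suffix` layout are all assumptions for illustration. Only the `f"{cache_version}-{...}"` suffix shape comes from the line quoted above.

```python
import hashlib


def short_hash(content: bytes) -> str:
    # Hypothetical helper: a truncated SHA-256 digest of some file contents.
    return hashlib.sha256(content).hexdigest()[:20]


def cache_name(base: str, cache_version: str, run_task_source: bytes) -> str:
    # Mirrors the shape of the quoted line: the suffix combines the cache
    # version with a hash derived from run-task. The final layout of the
    # name (base first, suffix last) is an assumption.
    suffix = f"{cache_version}-{short_hash(run_task_source)}"
    return f"{base}-{suffix}"


# Any change to run-task produces a different name, so a worker that still
# holds the old cache sees a miss even if the cached files would have worked.
before = cache_name("checkouts", "v3", b"run-task, revision A")
after = cache_name("checkouts", "v3", b"run-task, revision B")
print(before != after)
```

The same reasoning applies to the docker-image hash: every distinct image yields a distinct cache name.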

For example, in Gecko we typically have a ton of tasks coming in for any given docker-image. Furthermore, pools tend to only run tasks with certain images, so workers still see plenty of cache hits and this feature makes a lot of sense.

On the other hand, mozilla-vpn-client has only a single pool that runs a wide array of tasks with different docker-images. Further, pushes come in infrequently, so workers aren't very long lived. This means we almost never have cache hits.

Another point is the type of cache. Checkout caches tend to be more susceptible to version mismatches (especially with Mercurial), but something like a dotfile cache might not be (maybe?). The point is that different kinds of caches carry different levels of risk here.

I propose that instead of automatically adding the run-task and docker-image hashes to all cache names, we expose them as values that can be interpolated into the cache name. I.e., a cache name could be `checkouts-{run_task}-{docker_image}` and these values would be included in the cache name. Or it could just be `checkouts` and then they wouldn't. This allows individual projects, and even individual caches within a project, to set up cache names however is best for that context.
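A minimal sketch of what that interpolation could look like, using plain `str.format`. The function name `resolve_cache_name` and the example hash values are hypothetical; only the `{run_task}` and `{docker_image}` placeholder names come from the proposal above.

```python
def resolve_cache_name(template: str, run_task: str, docker_image: str) -> str:
    # str.format ignores keyword arguments the template does not reference,
    # so a template without placeholders is returned unchanged.
    return template.format(run_task=run_task, docker_image=docker_image)


# Hypothetical hash values, for illustration only.
run_task_hash = "abc123"
docker_image_hash = "def456"

# Opting in to both hashes.
strict = resolve_cache_name(
    "checkouts-{run_task}-{docker_image}", run_task_hash, docker_image_hash
)

# Opting out: the name stays stable across run-task and image changes.
shared = resolve_cache_name("checkouts", run_task_hash, docker_image_hash)

print(strict)
print(shared)
```

With this shape, a project like mozilla-vpn-client could pick the plain name for low-risk caches, while Gecko could keep both hashes on its checkout caches.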

There's definitely an open question around whether one or both of these hashes should be included by default, and also around how hard we should try to preserve backwards compatibility.

ahal changed the title ~~Support customization of cache names~~ Make run-task and docker-image hashes optional in cache names May 13, 2024
