Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shell Function: run_dir #1646

Open
RINO-GAELICO opened this issue Aug 26, 2024 · 4 comments
Open

Shell Function: run_dir #1646

RINO-GAELICO opened this issue Aug 26, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@RINO-GAELICO
Copy link

RINO-GAELICO commented Aug 26, 2024

In relation to the new feature of Shell Function , it might be useful to add an additional argument for the user to specify the run_dir. Right now it uses a sandbox or the working task dir.

Describe the solution you'd like
A possible solution could be to pass the run_dir as one the optional arguments in the Shell constructor. Then checking if the run_dir is not None and let the user choice take priority over sandboxing.

run_dir = self.run_dir

@yadudoc
Copy link
Collaborator

yadudoc commented Aug 27, 2024

Hi @RINO-GAELICO, thanks for taking this time to write up this issue. This was one of the options in consideration while ShellFunctions were in development and was ultimately removed. Here is the rationale:

  1. Ideally we want one component in control of setting the task's run dir. Since the endpoint already manages this based on user-supplied configs this would be the ideal place.
  2. If the user wants to set a task dir on a per-task basis, it's still quite easy to do without support in the ShellFunction:
ShellFunction("mkdir -p {rundir}; cd {rundir}; ...", rundir=/path/to/run/dir)

Let me know what you think. If we get more folks asking for similar functionality, we are definitely open to supporting it.

@RINO-GAELICO
Copy link
Author

Hi @yadudoc thanks for your reply.

Regarding your points:

  1. I understand the logic, but from another perspective, it would become extremely tedious to change the endpoint configuration every time you want to modify the run_dir. For example, consider a piece of code that needs to run multiple times (for example in a loop), with only the run_dir changing. In this particular case, it wouldn't be ideal to manually update the configuration each time.

  2. From my understanding, this approach wouldn't work without modifying the ShellFunction constructor to accept additional **kwargs and using cmd.format(**kwargs). Am I wrong?

@yadudoc
Copy link
Collaborator

yadudoc commented Aug 29, 2024

Hi @RINO-GAELICO,

  1. If the user wants to set working_dir on a per-task basis, I completely agree that going via the configuration route is going to be tedious and very inefficient. We were working with the assumption that most users probably would want to set up the task sandbox directory, which would automatically isolate tasks. It sounds like you want more control than that.
  2. My example was buggy, here's a more complete example, that I've tested out. You do not pass the kwargs to ShellFunction at init but rather to executor.submit and the kwargs are used to format the cmd string :
>>> from globus_compute_sdk import Executor, ShellFunction
>>> ex = Executor(endpoint_id="a1e677f9-c08a-4b70-8b70-4a51b13128cd")  # This is my local EP, please update before testing
>>> future = ex.submit(ShellFunction("pwd; cd {run_dir}; pwd"), run_dir="/tmp")
>>> future.result().stdout
'/Users/yadu/.globus_compute/prod/tasks_working_dir\n/tmp\n'
>>> print(future.result().stdout)
/Users/yadu/.globus_compute/prod/tasks_working_dir
/tmp

Thanks again for your care in evaluating these options.

@rjmello
Copy link
Member

rjmello commented Aug 29, 2024

Also, in a multi-user endpoint context, working_dir can be set as a user configuration template variable, which the user can modify on every submit call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants
@yadudoc @rjmello @RINO-GAELICO and others