Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MISC] Subcommands alternative syntax #234

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

eseiler
Copy link
Member

@eseiler eseiler commented Feb 9, 2024

Same as #233 but with alternative syntax (see snippet).

int main(int argc, char ** argv)
{
    //                                                                           -------- Optional --------
    sharg::parser git_parser{"git", argc, argv, sharg::update_notifications::on, {"pull", "push", "remote"}};

    if (auto pull_parser = git_parser.add_subcommand("pull"))
    {
        std::string repository{};
        pull_parser->add_positional_option(repository, sharg::config{.description = "pull"});
        pull_parser->parse();
    }
    else if (auto push_parser = git_parser.add_subcommand("push"))
    {
        std::string repository{};
        push_parser->add_positional_option(repository, sharg::config{.description = "push"});
        push_parser->parse();
    }
    else if (auto remote_parser = git_parser.add_subcommand("remote"))
    {
        remote_parser->add_section("Remote options");
        // remote_parser->parse(); // Caveat
        if (auto recursive_sub_parser = remote_parser->add_subcommand("set-url"))
        {
            std::string repository{};
            recursive_sub_parser->add_positional_option(repository, sharg::config{});
            recursive_sub_parser->parse();
        }
        else if (auto recursive_sub_parser = remote_parser->add_subcommand("show"))
        {
            recursive_sub_parser->parse();
        }
        else if (auto recursive_sub_parser = remote_parser->add_subcommand("remote"))
        {
            recursive_sub_parser->parse();
        }
        else
        {
            remote_parser->parse();
        }
    }
    else
    {
        git_parser.parse();
    }

    return 0;
}

Copy link

vercel bot commented Feb 9, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated (UTC)
sharg-parser ✅ Ready (Inspect) Visit Preview Feb 9, 2024 1:17pm

@seqan-actions seqan-actions added lint [INTERNAL] signals that clang-format ran and removed lint [INTERNAL] signals that clang-format ran labels Feb 9, 2024
@seqan-actions seqan-actions added lint [INTERNAL] signals that clang-format ran and removed lint [INTERNAL] signals that clang-format ran labels Feb 9, 2024
@eseiler eseiler force-pushed the misc/alternative_subcommands branch from 17e7727 to 33f3f2c Compare February 9, 2024 13:16
@seqan-actions seqan-actions added lint [INTERNAL] signals that clang-format ran and removed lint [INTERNAL] signals that clang-format ran labels Feb 9, 2024
Copy link

codecov bot commented Feb 9, 2024

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (d48c51f) 91.90% compared to head (33f3f2c) 91.93%.

Files Patch % Lines
include/sharg/parser.hpp 95.89% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #234      +/-   ##
==========================================
+ Coverage   91.90%   91.93%   +0.03%     
==========================================
  Files          17       17              
  Lines        1618     1662      +44     
==========================================
+ Hits         1487     1528      +41     
- Misses        131      134       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@h-2
Copy link
Member

h-2 commented Feb 9, 2024

Thanks for picking this up so quickly! A few points:

  • I like the if checks a lot! This feels like a much better syntax for checking the subcommand in general! I never like looking at the info-string.
  • Unfortunately std::optional<T&> is not yet available, I would prefer that to a pointer. But under the current circumstances, the pointer is probably the best design.

I don't completely understand the code, yet. Does a successful .add_subcommand() automatically call .parse() on the parent parser? If not, why not? After all, this parent is can no longer receive changes afterwards, correct? i.e. add_subcommand() have to specified after all add_option(), right? I think this would need to be enforced to prevent misuse.
Or does .parse() on the subparser also trigger .parse() in the parent? Either way would be possible,¹ but ultimately everything needs to be parsed, and as soon as one parse() happens, no more changes are allowed.
I also think one of these cases must be true, since you don't currently parse() on the parent in your example if a subcommand is selected.

If successful add_subcommand() or sub-parse() does invoke .parse() on the parent, I would still allow invoking .parse() a second time (as NOP) on the parent. I think it's good style to always have it there, and it would be strange, if it can be omitted sometimes.

Your example would then look like this:

    sharg::parser git_parser{"git", argc, argv, sharg::update_notifications::on};

    if (auto pull_parser = git_parser.add_subcommand("pull"))
    {
        std::string repository{};
        pull_parser->add_positional_option(repository, sharg::config{.description = "pull"});
        pull_parser->parse();
    }
    else if (auto push_parser = git_parser.add_subcommand("push"))
    {
        std::string repository{};
        push_parser->add_positional_option(repository, sharg::config{.description = "push"});
        push_parser->parse();
    }
    else if (auto remote_parser = git_parser.add_subcommand("remote"))
    {
        remote_parser->add_section("Remote options");
        // remote_parser->parse(); // Caveat
        if (auto recursive_sub_parser = remote_parser->add_subcommand("set-url"))
        {
            std::string repository{};
            recursive_sub_parser->add_positional_option(repository, sharg::config{});
            recursive_sub_parser->parse();
        }
        else if (auto recursive_sub_parser = remote_parser->add_subcommand("show"))
        {
            recursive_sub_parser->parse();
        }
        else if (auto recursive_sub_parser = remote_parser->add_subcommand("remote"))
        {
            recursive_sub_parser->parse();
        }

        remote_parser->parse();
    }

    git_parser.parse();

    return 0;

¹ I actually think that there is merit to having parent-parse happen on successful add_subcommand, because that way top-level options are already evaluated and can be used to influence setup of the sub-command.

P.S.: I strongly suggest not adding the old subcommands-as-constructor-args syntax to the new constructor-config, as it is confusing to have two mechanisms that are so different.

@h-2
Copy link
Member

h-2 commented Feb 9, 2024

One more thing: if subcommands get .add_subcommand("name"), it might make sense to be able to also provide .add_subcommand(sharg::config{.long_id = "pull", .description = "Fetch from and integrate with another repository or a local branch"}). I always wanted a short description on the primary help page, and other properties like hidden and advanced would applicable, as well.

Maybe this is too much, though 🤔

@h-2
Copy link
Member

h-2 commented Feb 9, 2024

I am just brainstorming here, but I think subcommands are more similar to positional options — or at least most people perceive them as such. If you want that reflected in the design, you could also do something like this:

sharg::parser git_parser{"git", argc, argv, sharg::update_notifications::on};

std::string dir;
std::string subcommand;

git_parser.add_option(dir, sharg::config{.short_id = 'C'});

// add_subcommand always comes last and always invokes parse() on the current parser - independent of return value
auto subparser = git_parser.add_subcommand(subcommand, 
                                           sharg::config{.required = true, // this is easy to do now!
                                                         .validator = sharg::value_list_validator{"push", "pull", "remote"}});
if (!subparser) // this won't be reached, because set to required; but otherwise it would be OK to return, because parse() happend already
    return;

if (subcommand == "pull")
{
    std::string repository{};
    subparser->add_positional_option(repository, sharg::config{.description = "pull"});

}
else if (subcommand == "push")
{
    std::string repository{};
    subparser->add_positional_option(repository, sharg::config{.description = "push"});
}
else if (subcommand == "remote"))
{
    remote_parser->add_section("Remote options");

    std::string subsubcommand;
    auto subsubparser = subparser->add_subcommand(subsubcommand);
    
    if (subsubparser)
    {
        if (subsubcommand == "set-url")
        {
            std::string repository{};
            recursive_sub_parser->add_positional_option(repository, sharg::config{});
            recursive_sub_parser->parse();
        }
        else if (subsubcommand == "show")
        {
            //...
        }
        else if (subsubcommand == "remote")
        {
            //...
        }
    }
    // there is no else-case even if the subcommand is not set to required
    // because `add_subcommand()` already invoked parse() and we are done
}

Advantages of this design IMHO:

  • add_subcommand(variable, sharg::config{}) is super similar to add_positional_option(variable, sharg::config{})
  • config allows things otherwise not possible
  • It is easy to document: there is at most one invocation of add_subcommand() and it needs to come last and it implies parse().
  • different way of thinking: one subcommand that can have multiple values; not multiple subcommands
  • this is more inline with the actual implementation, because in reality, there is exactly one subparser (not multiple different ones)

Disadvantages of this design:

  • a bit more verbose?

@eseiler
Copy link
Member Author

eseiler commented Feb 9, 2024

#234 (comment)

Well, it's currently a bit convoluted.

I think, the init() function does too much.
It handles special formats (--help, etc.), the arguments (i.e. what are the arguments that need to be handled by this parser when calling parse()) and subparsers.

init():

  • Sets sub_parser if a subcommand was encountered in the arguments
  • Fills cmd_arguments with all arguments that the parser needs to handle when calling parse() (empty if sub_parser found).
  • Sets the format (help, short_help, "actually parse cmd_arguments", ...)

In the example:

  • ./app --help prints git help page, with all subcommands.
  • ./app --help pull prints short help of git-pull.
  • ./app --help pull arg prints git help page, but pull + push are missing from subcommands.

(Calling parse() on the parent should solve that, see below.)

I don't completely understand the code, yet. Does a successful .add_subcommand() automatically call .parse() on the parent parser?

It doesn't, but is does call init(), which is almost the same.

If not, why not? After all, this parent is can no longer receive changes afterwards, correct?

If we were to add multiple subcommand subsequently, calling parse() would throw if we use some subcommand we have not yet added.

parser.add_subcommand("pull"); // parser.parse() would throw for `./app push`
parser.add_subcommand("push");

i.e. add_subcommand() have to specified after all add_option(), right?

No, we just have to do it before calling parse(). Everything needs to be done before calling parse().

is_option_set and get_sub_parser() must be done after calling parse().

as soon as one parse() happens, no more changes are allowed.

Yes, any changes after parse() do not take effect.

I also think one of these cases must be true, since you don't currently parse() on the parent in your example if a subcommand is selected.

Yeah, with always calling git_parser.parse().
./app pull args has no output. I don't know yet why :)
The git_parser should do a short_help, I think.

I would still allow invoking .parse() a second time (as NOP) on the parent. I think it's good style to always have it there, and it would be strange, if it can be omitted sometimes.

It shouldn't be a problem to allow parse() on the parent. I wouldn't call it twice, but the git_parse.parse() at the very end should be fine.

Actually, calling it twice shouldn't be a problem if we just make the second call a NOP. Depends on the final design whether we need it.

Maybe it's even required and solves some weird behaviour I saw with --help and subcommands. Since I never call git_parser.parse(), the --help is parsed and format_help is stored for git_parser, but it never takes effect, because we don't call parse.

¹ I actually think that there is merit to having parent-parse happen on successful add_subcommand, because that way top-level options are already evaluated and can be used to influence setup of the sub-command.

From the documentation, only flags are allowed for the top-level, not sure if that is what actually happens.

But in case we need to access top-level-options in the subparser, we would need to call parse() on the parent. And nothing should throw because we found a valid sub_parser.

P.S.: I strongly suggest not adding the old subcommands-as-constructor-args syntax to the new constructor-config, as it is confusing to have two mechanisms that are so different.

My plan was to mark it as deprecated and remove it after the next release.

Edit
It randomly appeared to me that it may very well be that init() handles so much, because we have to. In general, we do not own the arguments (argc, argv), and hence we need to process them while they are in scope, i. e. the constructor. After the constructor, we cannot be certain that the argv pointer is valid (in almost all cases it will be, because argv usually comes from main).

So, currently we process the arguments and store/copy what we may need when parse is called.

If we instead copy the arguments, like this PR does, we could do all the evaluation when parse is called.
This would also mean that the parser constructor does not throw (but it is not noexcept, because we do allocation). And I figure the implementation would be cleaner/clearer, because we would not do part of the parsing in init and some other part in parse.

@eseiler
Copy link
Member Author

eseiler commented Feb 9, 2024

I am just brainstorming here, but I think subcommands are more similar to positional options — or at least most people perceive them as such. If you want that reflected in the design, you could also do something like this:

This would also work.

// add_subcommand always comes last and always invokes parse() on the current parser - independent of return value
auto subparser = git_parser.add_subcommand(subcommand, 
                                           sharg::config{.required = true, // this is easy to do now!
                                                         .validator = sharg::value_list_validator{"push", "pull", "remote"}});
if (!subparser) // this won't be reached, because set to required; but otherwise it would be OK to return, because parse() happend already
    return;

I don't think add_subcommand needs to call parse, though it could. Then it would be quite similar to the current way to do it.

Whether I do

parser.add_option(...);
parser.add_subcommand(...);

parser.parse();

or

parser.add_subcommand(...);
parser.add_option(...);

parser.parse();

shouldn't matter.

What would required = true mean for a subcommand, currently, we would just print the short help?
I guess we could throw if not set, but would it be of much use?

Unless we handle it like positional arguments, which are always required. But his also depends on whether we call parse on the parent, etc.

@h-2
Copy link
Member

h-2 commented Feb 12, 2024

If we instead copy the arguments, like this PR does, we could do all the evaluation when parse is called.
This would also mean that the parser constructor does not throw (but it is not noexcept, because we do allocation). And I figure the implementation would be cleaner/clearer, because we would not do part of the parsing in init and some other part in parse.

Yes, IMHO it would be a lot better to have all processing happen on .parse() and just copy the arguments initially.
However, I thought that the "format" needs to be known when things like .add_option() is called, because it has different effects based on the format? And the way to know the format is to do some of the parsing on construction?
One way around this would be to push the visitors onto a queue where they are stored until parse is called and the format is determined... or do you have other suggestions?

I think this is the most important design decision and will influence how other things can be solved.

@eseiler
Copy link
Member Author

eseiler commented Feb 12, 2024

Yes, IMHO it would be a lot better to have all processing happen on .parse() and just copy the arguments initially.
However, I thought that the "format" needs to be known when things like .add_option() is called, because it has different effects based on the format? And the way to know the format is to do some of the parsing on construction?
One way around this would be to push the visitors onto a queue where they are stored until parse is called and the format is determined... or do you have other suggestions?

I think this is the most important design decision and will influence how other things can be solved.

We didn't have time for a In-depth meeting today. But we think that doing everything in parse would be nice.

So I'll try that (as separate PR) first. The functions that delegate to the format will be trouble :D

@h-2
Copy link
Member

h-2 commented Feb 12, 2024

So I'll try that (as separate PR) first. The functions that delegate to the format will be trouble :D

Yeah, some fun involved, because you cannot keep the other parameters generic. I would start with something like this:

    template <typename option_type, typename validator_type>
    void add_option(option_type & value, config<validator_type> const & config)
    {
        verify_option_config(config);

        auto fn = [&format, &value, &config]()
        {
            auto inner_visitor =  [&value, &config](auto & f)
            {
                f.add_option(value, config);
            };
            std::visit(inner_visitor, format);
        };

        operations.push_back(fn);
    }

and add std::vector<std::function<void()>> operations; as a member and the iterate over those on parse().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants