Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FR: GitHub integration #4555

Open
dbarnett opened this issue Sep 30, 2024 · 26 comments
Open

FR: GitHub integration #4555

dbarnett opened this issue Sep 30, 2024 · 26 comments
Labels
enhancement New feature or request

Comments

@dbarnett
Copy link
Collaborator

Should have some forge integrations for GitHub remotes (avoiding the need to rely on https://cli.github.com/ and colocated git repos).

Some possible use cases could be:

  1. creating/managing PRs
  2. interacting with issues or code (browse) related to the current repo and its state
  3. checking status on remote action workflows
@dbarnett dbarnett added the enhancement New feature or request label Sep 30, 2024
@PhilipMetzger
Copy link
Collaborator

Some possible use cases could be:

  1. creating/managing PRs
  2. interacting with issues or code (browse) related to the current repo and its state
  3. checking status on remote action workflows

That sounds almost like the functionality which Phabricator's Arcanist (arc) had, although it never was completed. If we want Forge integrations as a first class thing, then we'd probably need to cover these.

@dbarnett
Copy link
Collaborator Author

dbarnett commented Oct 1, 2024

Yeah, tbh all the integrations I can think of for GitHub would have direct analogs for other forges, and I'm questioning whether it needs anything like an explicit jj github command at all. Besides some slight terminology differences, if you can do it for GH you should probably be able to do it for any other supported remote.

The gotchas making it all generic would be:

  1. mechanism to track which forge(s) you have associated with which git remotes and topics might be cumbersome
  2. discoverability and troubleshooting might be slightly harder due to terminology differences

BTW I saw relevant comments in another issue about whether to support multiple remotes in the same push command… this would have some of the same challenges at another level, if your repo has remotes on multiple forges.

@PhilipMetzger
Copy link
Collaborator

Yeah, tbh all the integrations I can think of for GitHub would have direct analogs for other forges, and I'm questioning whether it needs anything like an explicit jj github command at all.

Tbh, 1.5 years ago, I'd agreed with you, but I had this discussion a few times in the Discord but Martin had a good point here, which changed my opinion.

The gotchas making it all generic would be:

  1. mechanism to track which forge(s) you have associated with which git remotes and topics might be cumbersome
  2. discoverability and troubleshooting might be slightly harder due to terminology differences

Imo, it is worthwhile to have the common generic implementation in the lib, while each Forge is responsible for its own idiosyncrasies in their trait implementation.

BTW I saw relevant comments in another issue about whether to support multiple remotes in the same push command… this would have some of the same challenges at another level, if your repo has remotes on multiple forges.

Agreed, as there is no canonical main repository without user intervention.

@steveklabnik
Copy link
Collaborator

I think there's two concerns here: sharing code, and the UX. If the features are similar, there's no reason they couldn't share code. But I think there are UX benefits to forge specific names: it's a very clear indicator of support of your particular forge, it's easy to find, and there will always be some features that are specific to some forges that could get awkward otherwise.

Then again, I'm also speaking of "forges" as maybe a bit broader of a topic than they are. Is Gerrit a forge? It's really just a code review tool. What about sourcehut? It supports many features that github/gitlab/bitbucket do, but also has things they don't, like real time chat, and uses mailing lists instead of PRs/changesets.

@arxanas
Copy link
Collaborator

arxanas commented Oct 1, 2024

The discussion in this thread regards forges in general, so I continued the discussion here: #485 (comment)

@dbarnett
Copy link
Collaborator Author

dbarnett commented Oct 1, 2024

Excellent, thanks for pointing me toward the context!

So IIUC there's no concern if I send a PR to implement e.g. a github|gh pr create command?

@PhilipMetzger
Copy link
Collaborator

So IIUC there's no concern if I send a PR to implement e.g. a github|gh pr create command?

There never was, although I'd prefer to keep it under github as gh conflicts with their own CLI. Maybe take a look at the gerrit send PR (#2845) for some inspiration and things to share.

@dbarnett
Copy link
Collaborator Author

dbarnett commented Oct 1, 2024

Thanks!

as gh conflicts with their own CLI.

What would that affect? You're saying jj gh pr create and commands like that could be somehow ambiguous because of the similarity to their CLI name?

@PhilipMetzger
Copy link
Collaborator

PhilipMetzger commented Oct 1, 2024

What would that affect? You're saying jj gh pr create and commands like that could be somehow ambiguous because of the similarity to their CLI name?

Names bring user expectations which could imply some false dependency on the gh tool itself. The github namespace also makes sense to contrast gerrit, which lands with gerrit send, but this is again just my opinion.

@martinvonz
Copy link
Owner

FWIW, we already have jj git and a bunch of subcommands that conflict with with git and its subcommands. I'm not arguing for jj gh (I think I also prefer jj github, at least the canonical form).

@mati865
Copy link
Collaborator

mati865 commented Oct 1, 2024

Using a full service name is the safest option, if somebody wants a shorter name, they should create an alias.

@dbarnett
Copy link
Collaborator Author

dbarnett commented Oct 6, 2024

AFAICT the logistics for this would be something like...

  1. Add a dependency on https://crates.io/crates/octorust
  2. Create a GitHub App to use for auth flow from a command like jj github auth login
  3. Somehow integrate the client secret securely into the jj app

For 2, octorust has instructions for authenticating and I created https://github.com/apps/jujutsu-vcs but I don't really know what I'm doing on permissions and key security. If anyone wants me to transfer that (or delete it so they can recreate fresh), LMK.

Alternatively it could just shell out to a separately-installed gh cli, which would be less "self-contained" but I believe would avoid some of the security complexities and allow seamlessly reusing existing tokens a user may have already set up with gh cli.

@arxanas
Copy link
Collaborator

arxanas commented Oct 7, 2024

FWIW I shelled out to gh in git-branchless's GitHub integration specifically for those reasons. Authentication is a lot of incidental complexity whose effort could be better-spent elsewhere. I believe you can query the API directly via gh as well, in case you wanted to issue some esoteric queries.

Whatever solution should probably also handle Enterprise GitHub + single-sign-on as well (which I assume gh does).

@dbarnett
Copy link
Collaborator Author

dbarnett commented Oct 7, 2024

Whatever solution should probably also handle Enterprise GitHub + single-sign-on as well (which I assume gh does).

Yep, looks like $GH_HOST and $GH_ENTERPRISE_TOKEN are the env variables for controlling that, although I might need help from someone who regularly uses Enterprise to verify that's all fully working as intended.

From what I can tell, all the gh commands do have mechanisms to hard-code the repo/branch/etc to target, although they seem to vary a bit per command and might be a little brittle:

jj cmd gh cmd
jj github pr view GH_REPO=someorg/somerepo gh pr view BRANCHNAME
jj github browse some/path GH_REPO=someorg/somerepo gh browse -b BRANCHNAME some/path

where I believe GH_REPO can be inferred from git remote URLs (and probably same for GH_HOST?).

Those pr and browse ones jump out at me as some good initial ones to support, plus probably some auth commands so usage isn't too clunky if you're not freshly authenticated. The ones that have no relationship to "current repo" (like gist commands or repo list) I don't see any value in supporting as jj commands. Others like run/secret/workflow I could imagine adding support for as we see demand.

Maybe someday particular extensions like gh act too, but that doesn't seem to respect GH_REPO and I'm not sure there'd be a non-brittle way to generally support extensions like that if they remain needing special-case overrides.

@martinvonz
Copy link
Owner

The commands I would find most useful would be for:

  • Create/update stack of PRs, setting the base branch between them
  • Fetch the commit from a PR without having to configure a remote

@dbarnett
Copy link
Collaborator Author

dbarnett commented Oct 8, 2024

True. I'd assumed stacks and sync would be managed via separate forge-independent commands, but I guess anything touching PRs you'd imagine under jj github?

@PhilipMetzger
Copy link
Collaborator

While I would've preferred to build our own thing, @arxanas point still stands that we'll never get to the point of being a first-class user to GitHub, which gh is. So going with gh should be fine, even if it feels a bit icky to me.

For other providers (#485), we hopefully should be able to have a first-class experience built-in.

I think the option to use a stacked workflow or the normal GitHub branch-like thing would also be good to have in the initial version.

@martinvonz
Copy link
Owner

For creating/updating a stack of PRs, what you can't do in a forge-agnostic way is to set the base branch, so we'd need that to be under jj github.

For fetching, the only thing I was thinking of that might help if it had GithHub-awareness is so you can specify a PR number instead of a ref.

@emilazy
Copy link
Collaborator

emilazy commented Oct 8, 2024

Note that there’s two options for fetching GitHub PRs, the HEAD commit (pulls/…/head) or the synthesized merge commit (pulls/…/merge).

I would love a stacked PR solution that lets you choose which commits go into which PRs rather than making every PR a single commit, but there might not be many people who want that.

@dbarnett
Copy link
Collaborator Author

dbarnett commented Oct 8, 2024

I would love a stacked PR solution that lets you choose which commits go into which PRs rather than making every PR a single commit

Yep, that's exactly what "topics" will accomplish, isn't it? Each topic would generally map 1:1 with a PR?

@emilazy
Copy link
Collaborator

emilazy commented Oct 8, 2024

Probably! I don’t really fully understand topics except that they will solve everything and make life perfect forever, so that sounds plausible.

@PhilipMetzger
Copy link
Collaborator

Yep, that's exactly what "topics" will accomplish, isn't it? Each topic would generally map 1:1 with a PR?

Yes, if the topic strictly maps to a topological branch (which then translates into a Git branch). Although having the option to not strictly enforce that they must not overlap would still be nice, since it allows a multi-head approach to sending a PR.

@arxanas
Copy link
Collaborator

arxanas commented Oct 9, 2024

Yes, if the topic strictly maps to a topological branch (which then translates into a Git branch).

[topic: topics] To expand on this (seems relevant to GitHub interop), a "topic" for our purposes is some subset of commits. A branch, and therefore pull request, can only be inferred if that set has a single head. (You can make the conditions more restrictive as desired.)

[topic: commit message tagging] I've seen many people establish topics/branches by convention in the commit message specifically for tooling purposes.

  • Example: git-branchstack is used for many-to-many topic management.
  • It may work for many people in practice to use the commit message to establish the associated logical topic.
    • For example, an organization may have a policy to put a link to a bug tracker in the commit message, which could be used to determine the associated topic as well.
    • If implementing this, it could use a user-defined regex to determine the topic. (I don't think it's worth doing in a first pass.)

[topic: git-branchless design] git-branchless implicitly addresses this in its GitHub forge at present by only creating branches for the commits in the provided revset.

  • Example invocation: git submit --forge github --create 'stack() & message(JIRA-123)'.
  • Note: in git-branchless, creating PRs is an explicit step. When invoking without --create, only commits in the provided revset which already have PRs are updated. jj could make a different design decision here.
  • Ancestor commits not part of the revset implicitly go into the head commit PR.
    • There is no explicit implementation — that's just how it works if you create and push a branch and create a PR.
    • But that is the deliberate design for git-branchless to handle multi-commit PRs.

@arxanas
Copy link
Collaborator

arxanas commented Oct 9, 2024

If anyone starts to implement stacked PRs for the GitHub forge, here's my report on relevant issues to consider: arxanas/git-branchless#1259

@dbarnett
Copy link
Collaborator Author

dbarnett commented Oct 9, 2024

And still seems like topics will take some time yet to implement, so initial GH integration might not include any of that in its scope. I expected for now to create a PR from a bookmark name and then include all commits from base to that bookmark in the PR, same as the gh tool does.

@martinvonz
Copy link
Owner

That sounds good to me, fwiw. (I can't speak for others, of course.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

7 participants