Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor git source to allow ScanOptions and use source in engine #1518

Merged
merged 2 commits into from
Aug 1, 2023

Conversation

mcastorina
Copy link
Collaborator

Refactor the Chunks method of the git Source to call out to two helper methods: scanRepos and scanDirs which scans s.conn.Repositories and s.conn.Directories respectively. The only notable change in behavior is that a credential is no longer necessary if there are no s.conn.Repositories to scan.

@mcastorina mcastorina requested review from a team as code owners July 21, 2023 02:58
@mcastorina mcastorina force-pushed the refactor-git-engine-source branch 2 times, most recently from 6b6b06e to 2140f23 Compare July 21, 2023 03:10
Comment on lines -25 to -30
options := &gogit.PlainOpenOptions{
DetectDotGit: true,
EnableDotGitCommonDir: true,
}

repo, err := gogit.PlainOpenWithOptions(c.RepoPath, options)
Copy link
Collaborator Author

@mcastorina mcastorina Jul 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the same options that the git source uses to open a git directory:

func RepoFromPath(path string) (*git.Repository, error) {
options := &git.PlainOpenOptions{
DetectDotGit: true,
EnableDotGitCommonDir: true,
}
return git.PlainOpenWithOptions(path, options)
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the source will now ignore any error returned from PlainOpenWithOptions though, but we are preparing the repo already in main which should catch any issues:

trufflehog/main.go

Lines 338 to 341 in 8ec5e49

repoPath, remote, err = git.PrepareRepoSinceCommit(ctx, *gitScanURI, *gitScanSinceCommit)
if err != nil || repoPath == "" {
logFatal(err, "error preparing git repo for scanning")
}

Comment on lines -150 to +194
return s.git.ScanRepo(ctx, repo, path, NewScanOptions(), chunksChan)
return s.git.ScanRepo(ctx, repo, path, s.scanOptions, chunksChan)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not initializing s.scanOptions, so the default value of nil will be used if it's not set. This is okay because the ScanRepo function already catches this:

func (s *Git) ScanRepo(ctx context.Context, repo *git.Repository, repoPath string, scanOptions *ScanOptions, chunksChan chan *sources.Chunk) error {
if scanOptions == nil {
scanOptions = NewScanOptions()
}

@mcastorina mcastorina marked this pull request as draft July 21, 2023 03:38
@mcastorina mcastorina marked this pull request as ready for review July 21, 2023 04:10
Refactor the Chunks method of the git Source to call out to two helper
methods: scanRepos and scanDirs which scans s.conn.Repositories and
s.conn.Directories respectively. The only notable change in behavior is
that a credential is no longer necessary if there are no
s.conn.Repositories to scan.
@rosecodym
Copy link
Collaborator

If I understand this correctly, you refactored the Git source to allow it to be used in a new way. What are the old ways that it's (presumably) still being used?

@mcastorina
Copy link
Collaborator Author

mcastorina commented Aug 1, 2023

If I understand this correctly, you refactored the Git source to allow it to be used in a new way. What are the old ways that it's (presumably) still being used?

The main difference is that we weren't doing a git scan using a git.Source and this PR changes that. Instead we were preparing the repo manually and using the dependencies of git.Source to do the scanning.

The problem I encountered when changing it was that the git.Source cleans up temporary directories, while that was previously the responsibility of the code that prepared the repository. I added a kludge in to keep that behavior, though maybe a better solution would be for the git.Source to track which directories were prepared by the Source so it can cleanup without having to rely on naming conventions (though I still want to keep the naming conventions).

@mcastorina mcastorina merged commit 69021f5 into main Aug 1, 2023
9 checks passed
@mcastorina mcastorina deleted the refactor-git-engine-source branch August 1, 2023 14:52
savely-krasovsky added a commit to savely-krasovsky/trufflehog that referenced this pull request Aug 3, 2023
mcastorina pushed a commit that referenced this pull request Aug 3, 2023
* feat: initial support for bare repositories

* feat: use concatenation instead of formatting and os.Getenv instead of os.Environ

Signed-off-by: Savely Krasovsky <[email protected]>

* fix: go-git update with pre-receive hooks fix

Signed-off-by: Savely Krasovsky <[email protected]>

* fix: remove info about pre-receive hook from README.md for now

Signed-off-by: Savely Krasovsky <[email protected]>

* fix: don't scan staged while using --bare option, fixes to make it work with the latest master

Signed-off-by: Savely Krasovsky <[email protected]>

* fix: small refactor according to #1518

Signed-off-by: Savely Krasovsky <[email protected]>

---------

Signed-off-by: Savely Krasovsky <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants