Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support indeterminacy in alchemy and update detector docs #1510

Merged
merged 18 commits into from
Jul 21, 2023
Merged
22 changes: 19 additions & 3 deletions hack/docs/Adding_Detectors_Internal.md
Original file line number Diff line number Diff line change
Expand Up @@ -62,15 +62,31 @@ If you think that something should be included outside of these guidelines, plea
1. Add the test secret to GCP Secrets. See [managing test secrets](#managing-test-secrets)
2. Update the pattern regex and keywords. Try iterating with [regex101.com](http://regex101.com/).
3. Update the verifier code to use a non-destructive API call that can determine whether the secret is valid or not.
* Make sure you understand [verification indeterminacy](#verification-indeterminacy).
4. Update the tests with these test cases at minimum:
1. Found and verified (using a credential loaded from GCP Secrets)
2. Found and unverified
3. Not found
4. Any false positive cases that you come across
2. Found and unverified (determinately, i.e. the secret is invalid)
3. Found and unverified (indeterminately due to timeout)
4. Found and unverified (indeterminately due to an unexpected API response)
5. Not found
6. Any false positive cases that you come across
5. Create a merge request for review. CI tests must be passing.

## Addendum

### Verification indeterminacy

There are two types of reasons that secret verification can fail:
* The candidate secret is not actually a valid secret.
* Something went wrong in the process unrelated to the candidate secret, such as a transient network error or an unexpected API response.

In Trufflehog parlance, the first type of verification response is called _determinate_ and the second type is called _indeterminate_. Verification code should distinguish between the two by returning an error object in the result struct **only** for indeterminate failures. In general, a verifier should return an error (indicating an indeterminate failure) in all cases that haven't been explicitly identified as determinate failure states.

For example, consider a hypothetical authentication endpoint that returns `200 OK` for valid credentials and `403 Forbidden` for invalid credentials. The verifier for this endpoint could make an HTTP request and use the response status code to decide what to return:
* A `200` response would indicate that verification succeeded. (Or maybe any `2xx` response.)
* A `403` response would indicate that verification failed **determinately** and no error object should be returned.
* Any other response would indicate that verification failed **indeterminately** and an error object should be returned.

### Managing Test Secrets

Do not embed test credentials in the test code. Instead, use GCP Secrets Manager.
Expand Down
22 changes: 19 additions & 3 deletions hack/docs/Adding_Detectors_external.md
Original file line number Diff line number Diff line change
Expand Up @@ -62,15 +62,31 @@ If you think that something should be included outside of these guidelines, plea
1. Create a [test secrets file, and export the variable](#using-a-test-secret-file)
2. Update the pattern regex and keywords. Try iterating with [regex101.com](http://regex101.com/).
3. Update the verifier code to use a non-destructive API call that can determine whether the secret is valid or not.
* Make sure you understand [verification indeterminacy](#verification-indeterminacy).
4. Update the tests with these test cases at minimum:
1. Found and verified (using a credential loaded from GCP Secrets)
2. Found and unverified
3. Not found
4. Any false positive cases that you come across
2. Found and unverified (determinately, i.e. the secret is invalid)
3. Found and unverified (indeterminately due to timeout)
4. Found and unverified (indeterminately due to an unexpected API response)
5. Not found
6. Any false positive cases that you come across
5. Create a pull request for review.

## Addendum

### Verification indeterminacy

There are two types of reasons that secret verification can fail:
* The candidate secret is not actually a valid secret.
* Something went wrong in the process unrelated to the candidate secret, such as a transient network error or an unexpected API response.

In Trufflehog parlance, the first type of verification response is called _determinate_ and the second type is called _indeterminate_. Verification code should distinguish between the two by returning an error object in the result struct **only** for indeterminate failures. In general, a verifier should return an error (indicating an indeterminate failure) in all cases that haven't been explicitly identified as determinate failure states.

For example, consider a hypothetical authentication endpoint that returns `200 OK` for valid credentials and `403 Forbidden` for invalid credentials. The verifier for this endpoint could make an HTTP request and use the response status code to decide what to return:
* A `200` response would indicate that verification succeeded. (Or maybe any `2xx` response.)
* A `403` response would indicate that verification failed **determinately** and no error object should be returned.
* Any other response would indicate that verification failed **indeterminately** and an error object should be returned.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we can say this is true for all apis we test against. e.g. Docker can still be a valid credential and return a 401 -

// Valid credentials can still return a 401 status code if 2FA is enabled
if (res.StatusCode >= 200 && res.StatusCode < 300) || (res.StatusCode == 401 && strings.Contains(string(body), "login_2fa_token")) {
s1.Verified = true
}

So maybe adjust the language to be less certain about the indeterminate verification failure in the case of non 2XX or 403 responses.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to pick language that made it clear that was a hypothetical case, but I guess I didn't do a good job :(

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a separate paragraph for the start of the hypothetical case? And potentially changing the would to would likely?


### Using a test secret file

1. Create a file called `.env` with this env file format:
Expand Down
28 changes: 26 additions & 2 deletions pkg/common/http.go
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ package common
import (
"crypto/tls"
"crypto/x509"
"io"
"net"
"net/http"
"strings"
Expand Down Expand Up @@ -75,6 +76,14 @@ func PinnedCertPool() *x509.CertPool {
return trustedCerts
}

type FakeTransport struct {
CreateResponse func(req *http.Request) (*http.Response, error)
}

func (t FakeTransport) RoundTrip(req *http.Request) (*http.Response, error) {
return t.CreateResponse(req)
}

type CustomTransport struct {
T http.RoundTripper
}
Expand All @@ -91,6 +100,21 @@ func NewCustomTransport(T http.RoundTripper) *CustomTransport {
return &CustomTransport{T}
}

func ConstantStatusHttpClient(statusCode int) *http.Client {
return &http.Client{
Timeout: DefaultResponseTimeout,
Transport: FakeTransport{
CreateResponse: func(req *http.Request) (*http.Response, error) {
return &http.Response{
Request: req,
Body: io.NopCloser(strings.NewReader("")),
StatusCode: statusCode,
}, nil
},
},
}
}

func PinnedRetryableHttpClient() *http.Client {
httpClient := retryablehttp.NewClient()
httpClient.Logger = nil
Expand Down Expand Up @@ -152,9 +176,9 @@ func SaneHttpClient() *http.Client {
}

// SaneHttpClientTimeOut adds a custom timeout for some scanners
func SaneHttpClientTimeOut(timeOutSeconds int64) *http.Client {
func SaneHttpClientTimeOut(timeout time.Duration) *http.Client {
httpClient := &http.Client{}
httpClient.Timeout = time.Second * time.Duration(timeOutSeconds)
httpClient.Timeout = timeout
httpClient.Transport = NewCustomTransport(nil)
return httpClient
}
18 changes: 15 additions & 3 deletions pkg/detectors/alchemy/alchemy.go
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@ package alchemy

import (
"context"
"fmt"
"net/http"
"regexp"
"strings"
Expand All @@ -11,14 +12,15 @@ import (
"github.com/trufflesecurity/trufflehog/v3/pkg/pb/detectorspb"
)

type Scanner struct{}
type Scanner struct {
client *http.Client
}

// Ensure the Scanner satisfies the interface at compile time.
var _ detectors.Detector = (*Scanner)(nil)

var (
client = common.SaneHttpClient()

defaultClient = common.SaneHttpClient()
// Make sure that your group is surrounded in boundary characters such as below to reduce false positives.
keyPat = regexp.MustCompile(detectors.PrefixRegex([]string{"alchemy"}) + `\b([0-9a-zA-Z]{23}_[0-9a-zA-Z]{8})\b`)
)
Expand Down Expand Up @@ -47,6 +49,10 @@ func (s Scanner) FromData(ctx context.Context, verify bool, data []byte) (result
}

if verify {
client := s.client
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional: You could maybe just check s.client == nil and use it on line 59 below, rather than creating a client every time.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean mutating the scanner struct?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. Or set it as a default somewhere? As it is, every time this attempts to verify it will create a new client. Which is fine, but probably not necessary.

if client == nil {
client = defaultClient
}
req, err := http.NewRequestWithContext(ctx, "GET", "https://eth-mainnet.g.alchemy.com/v2/"+resMatch+"/getNFTs/?owner=vitalik.eth", nil)
if err != nil {
continue
Expand All @@ -61,7 +67,13 @@ func (s Scanner) FromData(ctx context.Context, verify bool, data []byte) (result
if detectors.IsKnownFalsePositive(resMatch, detectors.DefaultFalsePositives, true) {
continue
}

if res.StatusCode != 401 {
s1.VerificationError = fmt.Errorf("request to %v returned unexpected status %d", res.Request.URL, res.StatusCode)
}
}
} else {
s1.VerificationError = err
}
}

Expand Down
63 changes: 52 additions & 11 deletions pkg/detectors/alchemy/alchemy_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -32,11 +32,12 @@ func TestAlchemy_FromChunk(t *testing.T) {
verify bool
}
tests := []struct {
name string
s Scanner
args args
want []detectors.Result
wantErr bool
name string
s Scanner
args args
want []detectors.Result
wantErr bool
wantVerificationErr bool
}{
{
name: "found, verified",
Expand All @@ -52,7 +53,8 @@ func TestAlchemy_FromChunk(t *testing.T) {
Verified: true,
},
},
wantErr: false,
wantErr: false,
wantVerificationErr: false,
},
{
name: "found, unverified",
Expand All @@ -68,7 +70,8 @@ func TestAlchemy_FromChunk(t *testing.T) {
Verified: false,
},
},
wantErr: false,
wantErr: false,
wantVerificationErr: false,
},
{
name: "not found",
Expand All @@ -78,14 +81,48 @@ func TestAlchemy_FromChunk(t *testing.T) {
data: []byte("You cannot find the secret within"),
verify: true,
},
want: nil,
wantErr: false,
want: nil,
wantErr: false,
wantVerificationErr: false,
},
{
name: "found, would be verified if not for timeout",
s: Scanner{client: common.SaneHttpClientTimeOut(1 * time.Microsecond)},
args: args{
ctx: context.Background(),
data: []byte(fmt.Sprintf("You can find a alchemy secret %s within", secret)),
verify: true,
},
want: []detectors.Result{
{
DetectorType: detectorspb.DetectorType_Alchemy,
Verified: false,
},
},
wantErr: false,
wantVerificationErr: true,
},
{
name: "found, verified but unexpected api surface",
s: Scanner{client: common.ConstantStatusHttpClient(404)},
args: args{
ctx: context.Background(),
data: []byte(fmt.Sprintf("You can find a alchemy secret %s within", secret)),
verify: true,
},
want: []detectors.Result{
{
DetectorType: detectorspb.DetectorType_Alchemy,
Verified: false,
},
},
wantErr: false,
wantVerificationErr: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
s := Scanner{}
got, err := s.FromData(tt.args.ctx, tt.args.verify, tt.args.data)
got, err := tt.s.FromData(tt.args.ctx, tt.args.verify, tt.args.data)
if (err != nil) != tt.wantErr {
t.Errorf("Alchemy.FromData() error = %v, wantErr %v", err, tt.wantErr)
return
Expand All @@ -95,6 +132,10 @@ func TestAlchemy_FromChunk(t *testing.T) {
t.Fatalf("no raw secret present: \n %+v", got[i])
}
got[i].Raw = nil
if (got[i].VerificationError != nil) != tt.wantVerificationErr {
t.Fatalf("verification error = %v, wantVerificationError %v", got[i].VerificationError, tt.wantVerificationErr)
}
got[i].VerificationError = nil
}
if diff := pretty.Compare(got, tt.want); diff != "" {
t.Errorf("Alchemy.FromData() %s diff: (-got +want)\n%s", tt.name, diff)
Expand Down
3 changes: 2 additions & 1 deletion pkg/detectors/getemail/getemail.go
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ import (
"net/http"
"regexp"
"strings"
"time"

"github.com/trufflesecurity/trufflehog/v3/pkg/common"
"github.com/trufflesecurity/trufflehog/v3/pkg/detectors"
Expand All @@ -19,7 +20,7 @@ type Scanner struct{}
var _ detectors.Detector = (*Scanner)(nil)

var (
client = common.SaneHttpClientTimeOut(5)
client = common.SaneHttpClientTimeOut(5 * time.Second)

// Make sure that your group is surrounded in boundary characters such as below to reduce false positives.
keyPat = regexp.MustCompile(detectors.PrefixRegex([]string{"getemail"}) + `\b([a-zA-Z0-9-]{20})\b`)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ import (
"net/http"
"regexp"
"strings"
"time"

"github.com/trufflesecurity/trufflehog/v3/pkg/common"
"github.com/trufflesecurity/trufflehog/v3/pkg/detectors"
Expand All @@ -18,7 +19,7 @@ type Scanner struct{}
var _ detectors.Detector = (*Scanner)(nil)

var (
client = common.SaneHttpClientTimeOut(5)
client = common.SaneHttpClientTimeOut(5 * time.Second)

// Make sure that your group is surrounded in boundary characters such as below to reduce false positives.
keyPat = regexp.MustCompile(`(https:\/\/[a-zA-Z-0-9]+\.webhook\.office\.com\/webhookb2\/[a-zA-Z-0-9]{8}-[a-zA-Z-0-9]{4}-[a-zA-Z-0-9]{4}-[a-zA-Z-0-9]{4}-[a-zA-Z-0-9]{12}\@[a-zA-Z-0-9]{8}-[a-zA-Z-0-9]{4}-[a-zA-Z-0-9]{4}-[a-zA-Z-0-9]{4}-[a-zA-Z-0-9]{12}\/IncomingWebhook\/[a-zA-Z-0-9]{32}\/[a-zA-Z-0-9]{8}-[a-zA-Z-0-9]{4}-[a-zA-Z-0-9]{4}-[a-zA-Z-0-9]{4}-[a-zA-Z-0-9]{12})`)
Expand Down
3 changes: 2 additions & 1 deletion pkg/detectors/scrapersite/scrapersite.go
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ import (
"net/http"
"regexp"
"strings"
"time"

"github.com/trufflesecurity/trufflehog/v3/pkg/common"
"github.com/trufflesecurity/trufflehog/v3/pkg/detectors"
Expand All @@ -19,7 +20,7 @@ type Scanner struct{}
var _ detectors.Detector = (*Scanner)(nil)

var (
client = common.SaneHttpClientTimeOut(10)
client = common.SaneHttpClientTimeOut(10 * time.Second)

// Make sure that your group is surrounded in boundary characters such as below to reduce false positives.
keyPat = regexp.MustCompile(detectors.PrefixRegex([]string{"scrapersite"}) + `\b([a-zA-Z0-9]{45})\b`)
Expand Down
3 changes: 2 additions & 1 deletion pkg/detectors/screenshotlayer/screenshotlayer.go
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ import (
"net/http"
"regexp"
"strings"
"time"

"github.com/trufflesecurity/trufflehog/v3/pkg/common"
"github.com/trufflesecurity/trufflehog/v3/pkg/detectors"
Expand All @@ -19,7 +20,7 @@ type Scanner struct{}
var _ detectors.Detector = (*Scanner)(nil)

var (
client = common.SaneHttpClientTimeOut(10)
client = common.SaneHttpClientTimeOut(10 * time.Second)

// Make sure that your group is surrounded in boundary characters such as below to reduce false positives.
keyPat = regexp.MustCompile(detectors.PrefixRegex([]string{"screenshotlayer"}) + `\b([a-zA-Z0-9_]{32})\b`)
Expand Down
3 changes: 2 additions & 1 deletion pkg/detectors/zenserp/zenserp.go
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ import (
"net/http"
"regexp"
"strings"
"time"

"github.com/trufflesecurity/trufflehog/v3/pkg/common"
"github.com/trufflesecurity/trufflehog/v3/pkg/detectors"
Expand All @@ -18,7 +19,7 @@ type Scanner struct{}
var _ detectors.Detector = (*Scanner)(nil)

var (
client = common.SaneHttpClientTimeOut(5)
client = common.SaneHttpClientTimeOut(5 * time.Second)

// Make sure that your group is surrounded in boundary characters such as below to reduce false positives.
keyPat = regexp.MustCompile(detectors.PrefixRegex([]string{"zenserp"}) + `\b([0-9a-z-]{36})\b`)
Expand Down
Loading