Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some inconsistencies between CKV_IGNORED_DIRECTORIES and --skip-path #6737

Open
tpvasconcelos opened this issue Sep 28, 2024 · 1 comment
Open

Comments

@tpvasconcelos
Copy link

tpvasconcelos commented Sep 28, 2024

Hey! New checkov user here 👋

How I noticed the problem

While trying to run checkov's test suit on my local machine I noticed that some files were being (unexpectedly) ignored. After a little digging, I realised that this is because I cloned the repo under a path that looks something like: ~/my_git_forks/checkov.

The reason this is a problem is because the following patterns are treated as regular expressions by checkov:

EXCLUDED_PATHS = [*ignored_directories, DEFAULT_EXTERNAL_MODULES_DIR, ".idea", ".git", "venv"]

and the .git pattern obviously excludes any paths that include the word 'git' (incl. at least one preceding character)

The more fundamental issues

I think the more fundamental problem is that checkov documents --skip-path as including regexes, documents CKV_IGNORED_DIRECTORIES to include non-regex patterns (default values arenode_modules,.terraform,.serverless instead of node_modules,\\.terraform,\\.serverless), but then treats them equally in the code by merging them in some places, and explicitly treats them as both regexes and non-regex patterns in the same line of code.

I think someone has noticed something similar in the past because I see a few p.replace(".terraform", r"\.terraform") sprinkled around the code base.

However this just seems like another inconsistency because if users set --skip-path="^.*\.terraform.*$", it will get converted internally to ^.*\\.terraform.*$, which would not ignore paths like path/to/.terraform.

How to fix it?

Since checkov allows users to pass regexes to --skip-path and regards CKV_IGNORED_DIRECTORIES as legacy, it makes sense to consistently treat all patterns as regular expressions. However, since I just started looking at this project a couple of days ago, I have no idea what the consequences of doing this are...

Some (naive?) suggestions

  1. Escape the default values for CKV_IGNORED_DIRECTORIES:

IGNORED_DIRECTORIES_ENV = os.getenv("CKV_IGNORED_DIRECTORIES", "node_modules,.terraform,.serverless")

i.e.,:

-IGNORED_DIRECTORIES_ENV = os.getenv("CKV_IGNORED_DIRECTORIES", "node_modules,.terraform,.serverless")
+IGNORED_DIRECTORIES_ENV = os.getenv("CKV_IGNORED_DIRECTORIES", "node_modules,\.terraform,\.serverless")

Same needs to be done here:

self.IGNORED_DIRECTORIES = os.getenv("CKV_IGNORED_DIRECTORIES", "node_modules,.terraform,.serverless")

  1. Same here:

EXCLUDED_PATHS = [*ignored_directories, DEFAULT_EXTERNAL_MODULES_DIR, ".idea", ".git", "venv"]

-EXCLUDED_PATHS = [*ignored_directories, DEFAULT_EXTERNAL_MODULES_DIR, ".idea", ".git", "venv"]
+EXCLUDED_PATHS = [*ignored_directories, re.escape(DEFAULT_EXTERNAL_MODULES_DIR), "\.idea", "\.git", "venv"]
  1. Remove any hard-coded references to p.replace(".terraform", r"\.terraform")
  2. Consistently threat all references to ignored_directories, excluded_paths, and skip_path as regex patterns. This means removing all ~path in skip_path invocations since this is not compatible with regex (e.g. "\.hidden" not in ".hidden/secret.txt")
    • As a safeguard and to avoid ~path in skip_path code being committed in the first place, the excluded_paths variable should probably be a set of re.patterns instead of strings. e.g.:
      excluded_paths = {re.compile(p) for p in self.config.skip_path}`

Other things to consider

Currently, patterns set in the form of .hidden will always exclude directories of the form .hidden-not. However, this is a problem both in the non-regex ~path in skip_path way of doing things and naive regex definitions. However, at least with regex the user can be explicit and say something like (^|.*\/)\.hidden\/?

Screenshot 2024-09-28 at 03 19 08 Screenshot 2024-09-28 at 03 20 02 Screenshot 2024-09-28 at 03 20 57

However, this is not very user friendly and maybe there should be a way to allow users to define excluded directories in a .hidden,also_hidden flavour rather than the verbose (and non-Windown-compatible) --skip-path='(^|.*\/)\.hidden\/?' --skip-path='(^|.*\/)also_hidden\/?'

@tpvasconcelos
Copy link
Author

tpvasconcelos commented Sep 28, 2024

I opened a PR for reference (#6738) that fixes the issue I had running the tests locally.

I doesn't address all the issues discussed above but I wanted to get started with a possible incremental solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant