Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: Add static code analysis using CodeQL. #477

Merged
merged 1 commit into from
Nov 1, 2023

Conversation

mmuetzel
Copy link
Contributor

This adds rules for the CI to run CodeQL (a static analyzer tool by GitHub).
More information about it can be found here:
https://docs.github.com/en/code-security/code-scanning

Unfortunately, this doesn't seem to mix well with ccache (or I gave up on it after it looked suspicious). And building with code analysis seems to slow down the build process even more.
Effectively, this means that this workflow is too expensive to run on each single PR or push event. Instead, I opted for adding a schedule (and a manual dispatch).
GitHub recommends to schedule at "unusual" times (and avoid, e.g., full hours). So, I set the schedule to run once a week on Saturday at 10:20 UTC.
(I temporarily added push and pull_request triggers to have it run at least once for this PR. But they should probably be removed before this is merged.)

On a test on my fork of this repository, the runners run out of memory when trying to analyze all projects in one go. To try and avoid that, I split the projects into two groups: One group for GraphBLAS and LAGraph, and another group for all other projects. The workflows run separately for these two groups. (I haven't tested yet if this is enough to have it succeed.)

I tried to use Alpine x86 as one of the runners (to have a 64-bit and a 32-bit platform). But the CodeQL actions don't seem to be compatible to the chroot that is used on Alpine. So, I added a MinGW 32-bit (WoW64) runner instead.

Results of the analysis should show up in the "Security" tab once the CI has completed to run this workflow. I don't know who will be able to see the results. It might be only maintainers of the repository or it might be everyone.

@github-advanced-security
Copy link

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

@DrTimothyAldenDavis
Copy link
Owner

DrTimothyAldenDavis commented Oct 29, 2023

Thanks -- this will be useful.

At least some of the results of CodeQL are posted (or maybe it's finished already). It gave warnings for AMD, CXSparse, and CHOLMOD about integer or double overflow or signed integer multiplication. All those warnings are about safe code, but I've revised them to do the casting first so the warnings will go away.

It also warned about printf formatting in ParU, which I've just now fixed.

This is a useful tool but so far it hasn't found a real error. LAGraph has a mix of code quality in the LAGraph/experimental folder so I wouldn't be surprised to see a few warnings there. That code is by definition a draft set of methods, and not ready for production use, but we make them available anyway because the code is developed by so many people.

I might try adding an unsafe memcpy in LG_CC_FastSV6 to see if gets caught, just out of curiosity.

See these PRs:
f4121a8
1af04dd
1f17b45
e935e13

@mmuetzel
Copy link
Contributor Author

It looks like the runner for the "graph" group still runs out of memory during the analysis. I'll try to take a look later. Switching to draft in the meantime.

I'm happy the results for the other group was already helpful for you.
It looks like these results aren't visible to anyone (at least I can't see them). Maybe, there is some configuration setting that could change that? Or leave it as is if you prefer it that way.

@DrTimothyAldenDavis
Copy link
Owner

DrTimothyAldenDavis commented Oct 30, 2023

Thanks for the update. I'm not sure how to make them visible. The warnings it reported are all fairly straight forward, and are in the screenshots below. Each has a link to more details:

image

...

image

image

For example, for AMD/Source/amd_order.c, link 155:

image

That line gives a warning because slen is size_t and n is either int32_t or int64_t. However, I know at this point that n >= 0 holds since I check that condition and return an error if n < 0, on line 52. So I just fixed it in the latest PR to add a new variable:

size_t nn = (size_t) n ;

and then I use nn in the test instead of n. That seems to silence the warning.

In this case, as in most of the others, the warnings are overly pessimistic. Here, I already know that n >= 0 so this is safe in the original code. However, it's a good idea to avoid any possibility of undefined behavior, since (as stated in the explanation) an aggressive optimizing compiler could delete this test entirely.

@mmuetzel
Copy link
Contributor Author

Thanks for giving a glimpse at this.
Found this in the documentation:
https://docs.github.com/en/code-security/code-scanning/managing-your-code-scanning-configuration/viewing-code-scanning-logs

Who can use this feature
If you have write permissions to a repository, you can view the code scanning logs for that repository.

So, only users with write permissions can see the scan results (which makes sense).

@mmuetzel
Copy link
Contributor Author

It looks like the GraphBLAS codebase is just to large for an analysis on the hosted runners. It failed with the following on the mingw32 runner now:

  Oops! A fatal internal error occurred. Details:
  com.semmle.util.exception.CatastrophicError: An error occurred while evaluating TranslatedElement#ea057665::TranslatedElement::getChild#1#dispred#fff_10#join_rhs/2@a63704uj
  Severe disk cache trouble (corruption or out of space) at D:\a\_temp\codeql_databases\cpp\db-cpp\default\cache\pages\e0\84.pack: Failed to write item to disk

I'll try to disable the factory kernels on that runner. If that works, at least the remainder of the project might still be able to be scanned.
I'll still wait to see if the Ubuntu runner succeeds before doing that on either both runners or just the mingw32 one.

@mmuetzel
Copy link
Contributor Author

The ubuntu runner timed out after 6 hours. So, I disabled the factory kernels for both configurations. Let's see what that will do...

@mmuetzel
Copy link
Contributor Author

I added the libraries that are only built with CUDA to the analysis. That revealed a bug in the build system (if NSTATIC=ON). See #480.

@mmuetzel
Copy link
Contributor Author

The new rules seem to be working now. So, I squashed everything into one commit and removed the push and pull_request triggers.

Marking as ready for review.

@mmuetzel mmuetzel marked this pull request as ready for review October 31, 2023 17:38
@DrTimothyAldenDavis
Copy link
Owner

I haven't had a chance to review it yet but I'll go ahead and merge it in.

@DrTimothyAldenDavis DrTimothyAldenDavis merged commit 3dc1a2d into DrTimothyAldenDavis:dev2 Nov 1, 2023
17 checks passed
@DrTimothyAldenDavis
Copy link
Owner

The new CodeQL analysis of GraphBLAS is working. It reports only 6 warnings in GB_jitifyer.c about potential security issues of passing a user-defined string to the system() call. That's intentional, since I need to have a way for the end user to tell me which compiler to use to compile JIT kernels. The name (or full path) to the compiler is something that is entirely under control of the user application, and I can't disable that. That command could be anything of course, so the warning is valid, but it's not something I can change. The end user needs the flexibility to tell me to use any compiler.

So I will need to mark those warnings as "do not warn; intentional" or something. Yes, it's a potential security issue but if someone needs a hardened GraphBLAS, then they can disable the JIT at compile time and these issues would go away.
"Won't fix" is too strong of a term since it's not broken.

That's all that CodeQL reports.

@DrTimothyAldenDavis
Copy link
Owner

See the comments I added to the code here: 63648d6

@mmuetzel
Copy link
Contributor Author

mmuetzel commented Nov 1, 2023

I realized that the analysis also shows for me when I ran it in my fork. (Albeit rather hidden. I had to find where to change the branch for which the analysis is displayed.)

I believe I see the same results there. Some (all?) of them could probably be sanitized. E.g., the values of the environment variables HOME, LOCALAPPDATA, and GRAPHBLAS_CACHE_PATH could probably be checked to be an existing directory.

I don't see the path (for the compiler name) that you are describing in the analysis in my fork...

@DrTimothyAldenDavis
Copy link
Owner

DrTimothyAldenDavis commented Nov 1, 2023

I suppose some things could be sanitized and checked, but fundamentally there are many things that cannot be checked. In particular, the GrB_*Op_new methods create a new user-defined function that is given to GraphBLAS to call. With the JIT, this function is described as an arbitrary string. For example:

void addgauss (gauss *z, const gauss *x, const gauss *y)
{
z->real = x->real + y->real ;
z->imag = x->imag + y->imag ;
}
#define ADDGAUSS_DEFN \
"void addgauss (gauss *z, const gauss *x, const gauss *y) \n" \
"{ \n" \
" z->real = x->real + y->real ; \n" \
" z->imag = x->imag + y->imag ; \n" \
"}"

The string defined there is injected into a kernel that I compile at run-time. A user-defined function can include anything at all, like its own call to system ("rm -rf /") or a more subtle injection. There's no way I could sanitize an arbitrary user-defined function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants