Commit

Importing v3 (#482)
* logging improvement

* add design doc

* spreadsheet parser refactored

* parallelize main

* parallelise importing of standards from a spreadsheet and generation of related data

* add progress bars

* return list of imported resources

* initial base parser implementation

* move parsers to own dir and migrate cwe, dsomm and iso27k to the new parser interface

* migrate parsers

* make main call parser

* add embeddings in the document object, allow parsers to generate and fill embeddings of each document, decouple parsers from main methods to avoid circular imports, move embeddings database storage to register_node

* add default value for embeddings object in dataclass

* sort out imports

* cwe parser test

* dsomm parser tests

* secure headers + tests

* cloud native security controls parser +tests

* ccmv4

* rename method

* juiceshop + tests

* mega lint

* rm commented code

* fix docs equality test

* change external project parsers to return dict of 'resourcename':<resource entries>

* fix web main test related to CSV returning inconsistency

* fix nit on gap analysis enqueue job

* operational changes to make mass importing easier

* fix spreadsheet importing bugs, add validation to dataclasses

* makefile improvements

* add validation and fix tests

* drop support for OSIB

* fix more tests

* pin black to same version as superlinter and lint everything

* change array hash to array key so that it's legible, introduce ids for nodes which are a combination of their values, partially fix gap analysis

* add tests for bug where standards would only link to one cre

* change neo4j standards for regular postgres standards

* adjust main to not require redis when getting standards

* cache key to str

* add ability to import only the projects

* delete all traces of node and gap analysis of node, used when reimporting standards after either structural or informational changes

* nit: rearrange argument handling on main

* fix previously introduced cre hierarchy bug

* move commands for regenerating DB to a new 'import-all' script

* fix embedding gen

* disable iso, set port to 5001

* add 'automatically linked to' linktype and use it for low confidence mappings

* fix scripts, make import-projects use scripts

* rm cres, too large to keep around

* add migration

* since we removed cres dir, also remove export functionality

* fix importing script

* add new link status to db

* add import-only for external parsers and remove export and review functionality

* fix cwe typo

* generate embeddings for guaranteed non-none name

* add message on waiting jobs

* improve gap analysis logging

* add ability to skip reimporting if something already exists

* logging nit

* make the base parser not load the in-memory graph by default and fix the linking of DSOMM link type to 'AutomaticallyLinkedTo'

* make loading the graph in memory optional

make cre importing ONLY have an in memory graph to find cycles

* improve logging for gap analysis jobs

* fix endless loop when importing and gap analysis exists

* add gap analysis relationship 'automatically linked to' and throw exception when cycle detection gets called with no in memory graph

* paginate graph retrieval init (#491)

* paginate graph retrieval init

* progress

* paginate explorer success

* backend tests

* Macos support (#496)

* fixes to make install run on macs

* activate venv and run on the same line

* update test workflow

* backend runs on 5001 (#495)

* backend runs on 5002

* switch port to 5k2 as most docker registries and apple airplay run on 5k

---------

Signed-off-by: Spyros <[email protected]>

* Revert "backend runs on 5001 (#495)" (#498)

This reverts commit e5929f5.

* rm version from cwe and make ccm and iso disabled parsers commented out so it does not affect coverage

* add dev environment variables that do only graph importing to be used for debugging

* import external projects individually

* print less when calculating ga

* [ticket-508] Ensure autolinks appear on the CRE page (#509)

ensure autolinks show on the CRE page

* fix broken rebase

* add ability to run cre as a container and sync local cre with upstream

* cleanup unused spreadsheet parser methods

* fix e2e tests

* pin black to same version as superlinter

* pin node version in github e2e

* upgrade actions node

* ga query test

* add ability to external project parsers to skip gap analysis and embedding calculation

* nit: logging

* fix gap analysis bug where we wouldn't remove calculated ga from waiting list

* move ga preloading into script

* fix error where standards would get preloaded twice

* add explorer to header in staging

* in pyyaml, try to fix incompatibility with cython

* init fix e2e tests

---------

Signed-off-by: Spyros <[email protected]>
Co-authored-by: Diana <[email protected]>
northdpole and dlicheva authored Jul 6, 2024
1 parent 6e47a17 commit a56acf1
Showing 524 changed files with 7,168 additions and 25,512 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/deploy-staging.yml
@@ -15,7 +15,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: checkout
uses: actions/checkout@v2
uses: actions/checkout@v4
with:
fetch-depth: 0
ref: staging
2 changes: 1 addition & 1 deletion .github/workflows/deploy.yml
@@ -15,7 +15,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: checkout
uses: actions/checkout@v2
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Deploy backend to heroku
16 changes: 5 additions & 11 deletions .github/workflows/e2e.yml
@@ -7,30 +7,24 @@ jobs:
timeout-minutes: 10
steps:
- name: Check out code
uses: actions/checkout@v2
uses: actions/checkout@v4
- uses: actions/setup-python@v4
with:
python-version: '3.11.4'
cache: 'pip'
- uses: actions/setup-node@v3
with:
cache: 'yarn'
- name: Install python dependencies
node-version: 'v20.12.1'
- name: Install dependencies
run: |
sudo apt-get update
sudo apt-get install -y python3-setuptools python3-pip python3-virtualenv chromium-browser libgbm1
make install
- name: DB setup
run: |
cp cres/db.sqlite standards_cache.sqlite
make migrate-upgrade
python cre.py --upstream_sync
- name: Run app and e2e tests
run: |
yarn build
[ -d "./venv" ] && . ./venv/bin/activate
export FLASK_APP=./cre.py
export FLASK_CONFIG=development
export INSECURE_REQUESTS=1
FLASK_CONFIG=development flask run &
sleep 20s
yarn test:e2e
make e2e
4 changes: 1 addition & 3 deletions .github/workflows/linter.yml
@@ -8,14 +8,12 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout Code
uses: actions/checkout@v2
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Lint Code Base
uses: github/super-linter@v5
env:

VALIDATE_PYTHON_BLACK: true
VALIDATE_ALL_CODEBASE: false
DEFAULT_BRANCH: main
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
35 changes: 35 additions & 0 deletions .github/workflows/publish.yml
@@ -0,0 +1,35 @@
name: Publish
on:
push:
tags:
- "v*.*.*"

permissions:
# Grant the ability to checkout the repository
contents: write
# Grant the ability to push packages
packages: write

jobs:
publish-docker-images:
name: Push Docker images
runs-on: ubuntu-latest
steps:
- name: Check out code
uses: actions/checkout@v4

- name: Login to GitHub Container Registry
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}

- name: Publish Docker images
run: |
CRE_VERSION_SEMVER=$(sed 's/v//' <<< ${{ github.ref_name }});
make docker-prod
docker tag opencre:$(git rev-parse HEAD) ghcr.io/owasp/OpenCRE/opencre:${CRE_VERSION_SEMVER}
docker tag opencre:$(git rev-parse HEAD) ghcr.io/owasp/OpenCRE/opencre:latest
docker push ghcr.io/owasp/OpenCRE/opencre:${CRE_VERSION_SEMVER}
docker push ghcr.io/owasp/OpenCRE/opencre:latest
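
The publish step above derives the image version by dropping the `v` from the pushed tag and then tags the image twice. A minimal sketch of that derivation follows, assuming plain POSIX-style shell functions; `strip_v_prefix` and `image_refs` are hypothetical names and are not part of the repository. It uses `${tag#v}`, which strips only a leading `v`, whereas `sed 's/v//'` removes the first `v` wherever it occurs (equivalent for tags matching `v*.*.*`). The sketch also lowercases the repository path, on the assumption that Docker rejects image references containing uppercase characters.

```shell
# Sketch only: these helper names are hypothetical, not part of the repository.

# Strip a single leading "v" from a tag such as "v1.2.3".
strip_v_prefix() {
  local tag="$1"
  printf '%s\n' "${tag#v}"
}

# Print the two image references a publish step would push for a tag:
# one versioned reference and one "latest" reference.
# The repository path is lowercased before use (assumption: Docker
# requires lowercase repository names).
image_refs() {
  local repo version
  repo="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  version="$(strip_v_prefix "$2")"
  printf '%s:%s\n%s:latest\n' "$repo" "$version" "$repo"
}
```

For example, `image_refs ghcr.io/owasp/OpenCRE/opencre v1.2.3` prints one versioned reference and one `latest` reference, each of which could then be passed to `docker tag` and `docker push`.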
26 changes: 15 additions & 11 deletions .github/workflows/test.yml
@@ -6,19 +6,23 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Check out code
uses: actions/checkout@v2
- uses: actions/setup-python@v4
uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.11.4'
cache: 'pip'
- uses: actions/setup-node@v3
with:
cache: 'yarn'
python-version: '3.12.3'
- name: Install dependencies
run: |
sudo apt-get update
sudo apt-get install -y \
python3-setuptools \
python3-virtualenv \
python3-pip \
libxml2-dev \
libxslt-dev
- name: Install python dependencies
run: |
sudo apt-get update
sudo apt-get install -y python3-setuptools python3-virtualenv python3-pip
pip install --upgrade pip
make install-python
pip install --upgrade pip
pip install --upgrade setuptools
make install-python
- name: Test
run: make test
5 changes: 4 additions & 1 deletion .gitignore
@@ -65,4 +65,7 @@ neo4j/
.neo4j/

.mypy_cache
tmp/
tmp/

### CREs dir
cres/*
9 changes: 6 additions & 3 deletions Dockerfile
@@ -4,11 +4,14 @@ WORKDIR /code
COPY . /code
RUN yarn install && yarn build

FROM python:3.11.0 as run
FROM python:3.11 as run

COPY --from=build /code /code
WORKDIR /code
COPY ./scripts/prod-docker-entrypoint.sh /code
RUN pip install -r requirements.txt gunicorn

ENTRYPOINT gunicorn
CMD ["--timeout","800","--workers","8","cre:app"]
ENV INSECURE_REQUESTS=1
ENV FLASK_CONFIG="production"
RUN chmod +x /code/prod-docker-entrypoint.sh
ENTRYPOINT ["/code/prod-docker-entrypoint.sh"]
52 changes: 24 additions & 28 deletions Makefile
@@ -1,13 +1,27 @@

.ONESHELL:

.PHONY: run test covers install-deps dev docker lint frontend clean all

prod-run:
cp cres/db.sqlite standards_cache.sqlite; gunicorn cre:app --log-file=-

docker-neo4j-rm:
docker stop cre-neo4j
docker rm -f cre-neo4j
docker volume rm cre_neo4j_data
docker volume rm cre_neo4j_logs
# rm -rf .neo4j

docker-neo4j:
docker start cre-neo4j 2>/dev/null || docker run -d --name cre-neo4j --env NEO4J_PLUGINS='["apoc"]' --env NEO4J_AUTH=neo4j/password --volume=`pwd`/.neo4j/data:/data --volume=`pwd`/.neo4j/logs:/logs --workdir=/var/lib/neo4j -p 7474:7474 -p 7687:7687 neo4j

docker-redis-rm:
docker stop cre-redis-stack
docker rm -f cre-redis-stack

docker-redis:
docker start redis-stack 2>/dev/null || docker run -d --name redis-stack -p 6379:6379 -p 8001:8001 redis/redis-stack:latest
docker start cre-redis-stack 2>/dev/null ||\
docker run -d --name cre-redis-stack -p 6379:6379 -p 8001:8001 redis/redis-stack:latest

start-containers: docker-neo4j docker-redis
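
The `docker-neo4j` and `docker-redis` targets above share one idiom: try to start an existing container by name and, only if that fails, create it with `docker run`. A minimal sketch of that idiom as a reusable shell function is shown here; `ensure_container` and the `DOCKER` override variable are hypothetical and exist only for illustration (the override makes a dry run with a stub possible).

```shell
# Sketch only: ensure_container and DOCKER are hypothetical names.
# DOCKER can be pointed at a stub command for dry runs.
DOCKER="${DOCKER:-docker}"

# Start the named container if it already exists; otherwise create it.
# Usage: ensure_container NAME [docker run options...] IMAGE
ensure_container() {
  local name="$1"
  shift
  # "docker start" fails when no container with that name exists,
  # in which case we fall through to "docker run".
  "$DOCKER" start "$name" 2>/dev/null ||
    "$DOCKER" run -d --name "$name" "$@"
}
```

With this helper, the Redis target's behaviour would correspond to `ensure_container cre-redis-stack -p 6379:6379 -p 8001:8001 redis/redis-stack:latest`. Using the same name for both the `start` and the `run` branch matters: if the two names differ, `start` never finds the container that `run` created.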

@@ -49,7 +63,7 @@ install-deps-typescript:
install-deps: install-deps-python install-deps-typescript

install-python:
virtualenv -p python3.11 venv
virtualenv -p python3 --system-site-packages venv
. ./venv/bin/activate &&\
make install-deps-python &&\
playwright install
@@ -101,35 +115,17 @@ migrate-downgrade:
export FLASK_APP=$(CURDIR)/cre.py
flask db downgrade

import-projects:
$(shell CRE_SKIP_IMPORT_CORE=1 bash ./scripts/import-all.sh)

import-all:
[ -d "./venv" ] && . ./venv/bin/activate &&\
rm -rf standards_cache.sqlite &&\
make migrate-upgrade && export FLASK_APP=$(CURDIR)/cre.py &&\
python cre.py --add --from_spreadsheet https://docs.google.com/spreadsheets/d/1eZOEYgts7d_-Dr-1oAbogPfzBLh6511b58pX3b59kvg &&\
python cre.py --generate_embeddings && \
python cre.py --zap_in --cheatsheets_in --github_tools_in --capec_in --owasp_secure_headers_in --pci_dss_4_in --juiceshop_in --dsomm_in --dsomm_in --cloud_native_security_controls_in &&\
python cre.py --generate_embeddings
$(shell bash ./scripts/import-all.sh)

import-neo4j:
[ -d "./venv" ] && . ./venv/bin/activate &&\
export FLASK_APP=$(CURDIR)/cre.py && python cre.py --populate_neo4j_db

preload-map-analysis:
make docker-redis&\
make start-worker&\
make start-worker&\
make start-worker&\
make start-worker&\
make start-worker&\
make start-worker&\
make start-worker&\
make start-worker&\
make start-worker&\
make start-worker&\
make dev-flask&
sleep 5
[ -d "./venv" ] && . ./venv/bin/activate &&\
export FLASK_APP=$(CURDIR)/cre.py
python cre.py --preload_map_analysis_target_url 'http://127.0.0.1:5000'
killall python flask
preload-map-analysis:
$(shell RUN_COUNT=5 bash ./scripts/preload_gap_analysis.sh)

all: clean lint test dev dev-run
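
The removed `preload-map-analysis` recipe repeated `make start-worker&` ten times, while the new one passes `RUN_COUNT=5` to a script. The contents of `scripts/preload_gap_analysis.sh` are not part of this diff, so the following is only a hypothetical sketch of how a worker count could drive a loop in shell; `start_workers` is an invented name.

```shell
# Hypothetical sketch; the real scripts/preload_gap_analysis.sh is not shown here.

# Launch COUNT background copies of the given command.
# Usage: start_workers COUNT COMMAND [ARGS...]
start_workers() {
  local count="$1" i=0
  shift
  while [ "$i" -lt "$count" ]; do
    "$@" &            # run each worker in the background
    i=$((i + 1))
  done
}
```

A caller could run `start_workers "${RUN_COUNT:-5}" make start-worker`, perform the preload, and then `wait` for (or terminate) the background jobs.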
4 changes: 2 additions & 2 deletions Procfile
@@ -1,2 +1,2 @@
web: gunicorn cre:app --log-file=-g
worker: FLASK_APP=`pwd`/cre.py python cre.py --start_worker
web: gunicorn cre:app
worker: FLASK_APP=`pwd`/cre.py python cre.py --start_worker