Lots of work

exercism · Aug 6, 2024 · dd658c3
1 parent 936b93c
commit dd658c3

Showing 6 changed files with 232 additions and 58 deletions.
diff --git a/building/tooling/analyzers/creating-from-scratch.md b/building/tooling/analyzers/creating-from-scratch.md
@@ -4,9 +4,9 @@ Firstly, thank you for your interest in creating an Analyzer!
 
 These are the steps to get going:
 
-1. Check [our repository list for an existing `...-analyzer`](https://github.com/exercism?q=-analyzer) to ensure that one doesn't already exist.
+1. Check [our repository list for an existing `...-analyzer`](https://github.com/search?q=org%3Aexercism+analyzer&type=repositories) to ensure that one doesn't already exist.
 2. Scan the [contents of this directory](/docs/building/tooling/analyzers) to ensure you are comfortable with the idea of creating an Analyzer.
-3. Open an issue at [exercism/exercism][exercism-repo] introducing yourself and telling us which language you'd like to create a Analyzer for.
+3. Start a new topic on [the Exercism forum][building-exercism] telling us which language you'd like to create an Analyzer for.
 4. Once an Analyzer repo has been created, use [the Analyzer interface document](/docs/building/tooling/analyzers/interface) to help guide your implementation.
 
 We have an incredibly friendly and supportive community who will be happy to help you as you work through this! If you get stuck, please start a new topic on [the Exercism forum][building-exercism] or create new issues at [exercism/exercism][exercism-repo] as needed 🙂

diff --git a/building/tooling/best-practices.md b/building/tooling/best-practices.md
@@ -1,39 +1,18 @@
 # Best Practices
 
-## Follow best practices
+## Follow official best practices
 
 The official [Dockerfile best practices](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/) have lots of great content on how to improve your Dockerfiles.
 
-## Prefer official images
-
-There are many Docker images on [Docker Hub](https://hub.docker.com/), but try to use [official ones](https://hub.docker.com/search?q=&image_filter=official).
-
-## Pin versions
-
-To ensure that builds are stable (i.e. they don't suddenly break), you should always pin your base images to specific tags.
-That means instead of:
-
-```dockerfile
-FROM alpine:latest
-```
-
-you should use:
-
-```dockerfile
-FROM alpine:3.20.2
-```
-
-With the latter, builds will always use the same version.
-
 ## Performance
 
 You should primarily optimize for performance (especially for test runners).
 This will ensure your tooling runs as fast as possible and does not time-out.
 
 ### Experiment with different Base images
 
-Try experimenting with different base images (e.g. Ubuntu instead of Alpine), to see if one (significantly) outperforms the other.
-If performance is relatively equal,
+Try experimenting with different base images (e.g. Alpine instead of Ubuntu), to see if one (significantly) outperforms the other.
+If performance is relatively equal, go for the image that is smallest.
 
 ### Try Internal Network
 
@@ -49,12 +28,29 @@ Tooling runs as one-off, short-lived Docker container:
 3. The Docker container is destroyed
 
 Therefore, code that runs in step 2 runs for _every single tooling run_.
-For this reason, reducing the amount of code to run in step 2 is a great way to improve performance
+For this reason, reducing the amount of code that runs in step 2 is a great way to improve performance.
 One way of doing this is to move code from _run-time_ to _build-time_.
-Whilst run-time code runs every single tooling run, build_time code only runs once (when the Docker image is built).
+Whilst run-time code runs on every single tooling run, build-time code only runs once (when the Docker image is built).
 
-As build-time code runs as part of a GitHub Actions workflow, the student will never notice it.
-This also means that the code at build-time could be relatively slow, it's only running once after all!
+Build-time code runs once as part of a GitHub Actions workflow.
+Therefore, it's fine if the code that runs at build-time is (relatively) slow.
+
+#### Example: pre-compilation
+
+The Haskell test runner requires some base libraries to be compiled before it can run tests.
+As each test run happens in a fresh container, this compilation used to happen _in every single test run_!
+To circumvent this, the [Haskell test runner's Dockerfile](https://github.com/exercism/haskell-test-runner/blob/5264c460054649fc672c3d5932c2f3cb082e2405/Dockerfile) has the following two commands:
+
+```dockerfile
+COPY pre-compiled/ .
+RUN stack build --resolver lts-20.18 --no-terminal --test --no-run-tests
+```
+
+First, the `pre-compiled` directory is copied into the image.
+This directory is set up as a sort of fake exercise and depends on the same base libraries that the actual exercises depend on.
+Then we build the tests in that directory (the `--no-run-tests` flag skips actually running them), which is similar to how tests are built for an actual exercise.
+Building the tests results in the base libraries being compiled, but the difference is that this happens at _build time_.
+The resulting Docker image will thus have its base libraries already compiled, which means that compilation no longer has to happen at _run time_, resulting in (much) faster execution times.
 
 ## Size
 
@@ -64,9 +60,32 @@ You should try to reduce the image's size, which means that it'll be:
 - Reduces costs for us
 - Marginally improves startup time of each container
 
-### Try different Base images
+### Try different distributions
+
+Different distribution images will have different sizes.
+For example, the `alpine:3.20.2` image is **ten times** smaller than the `ubuntu:24.10` image:
+
+```
+REPOSITORY   TAG       SIZE
+alpine       3.20.2    8.83MB
+ubuntu       24.10     101MB
+```
+
+In general, Alpine-based images are amongst the smallest images, so many tooling images are based on Alpine.
 
-Some base images are
+### Try slimmed-down images
+
+Some images have special "slim" variants, in which some features will have been removed resulting in smaller image sizes.
+For example, the `node:20.16.0-slim` image is **five times** smaller than the `node:20.16.0` image:
+
+```
+REPOSITORY   TAG            SIZE
+node         20.16.0        1.09GB
+node         20.16.0-slim   219MB
+```
+
+The reason "slim" variants are smaller is that they have fewer features.
+If your image doesn't need the additional features, consider using the "slim" variant.
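+
+A minimal sketch of switching to a "slim" variant and adding back a single missing tool.
+That `git` is needed is an assumption made purely for illustration.
+
+```dockerfile
+FROM node:20.16.0-slim
+
+# The slim variant omits many tools; install only the ones you actually need
+RUN apt-get update && \
+    apt-get install git -y && \
+    rm -rf /var/lib/apt/lists/*
+```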
 
 ### Removing unneeded bits
 
@@ -77,18 +96,188 @@ These can include things like:
 - Files targeting different architectures from the Docker image
 - Documentation
 
-### Cleanup package manager
+#### Remove package manager files
+
+Most Docker images need to install additional packages, which is usually done via a package manager.
+These packages must be installed at _build time_ (as no internet connection is available at _run time_).
+Therefore, any package manager caching/bookkeeping files should be removed after installing the additional packages.
+
+##### apk
+
+Distributions that use the `apk` package manager (such as Alpine) should use the `--no-cache` flag when using `apk add` to install packages:
+
+```dockerfile
+RUN apk add --no-cache curl
+```
+
+##### apt-get/apt
+
+Distributions that use the `apt-get`/`apt` package manager (such as Ubuntu) should run the `apt-get autoremove -y` and `rm -rf /var/lib/apt/lists/*` commands _after_ installing the packages:
+
+```dockerfile
+RUN apt-get update && \
+    apt-get install curl -y && \
+    apt-get autoremove -y && \
+    rm -rf /var/lib/apt/lists/*
+```
 
 ### Use multi-stage builds
 
-https://docs.docker.com/build/building/multi-stage/
+Docker has a feature called [multi-stage builds](https://docs.docker.com/build/building/multi-stage/).
+These allow you to partition your Dockerfile into separate _stages_, with only the last stage ending up in the produced Docker image (the rest is only there to support building the last stage).
+You can think of each stage as its own mini Dockerfile; stages can use different base images.
+
+Multi-stage builds are particularly useful when your Dockerfile requires packages to be installed that are _only_ needed at build time.
+In this situation, the general structure of your Dockerfile looks like this:
+
+1. Define a new stage (we'll call this the "build" stage).
+   This stage will _only_ be used at build time.
+2. Install the required additional packages (into the "build" stage).
+3. Run the commands that require the additional packages (within the "build" stage).
+4. Define a new stage (we'll call this the "runtime" stage).
+   This stage will make up the resulting Docker image and is executed at run time.
+5. Copy the result(s) from the commands run in step 3 (in the "build" stage) into this stage (the "runtime" stage).
+
+With this setup, the additional packages are _only_ installed in the "build" stage and _not_ in the "runtime" stage, which means that they won't end up in the Docker image that is produced.
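+
+The steps above can be sketched as follows.
+This is a hypothetical layout, not taken from an actual track: the `hello.c` source file and the paths are assumptions made for illustration.
+
+```dockerfile
+# Step 1: define the "build" stage, which is only used at build time
+FROM alpine:3.20.2 AS build
+
+# Step 2: install packages that are only needed for building
+RUN apk add --no-cache build-base
+
+# Step 3: run the commands that need those packages
+WORKDIR /src
+COPY hello.c .
+RUN gcc -o hello hello.c
+
+# Step 4: define the "runtime" stage, which becomes the final image
+FROM alpine:3.20.2
+
+# Step 5: copy only the build result from the "build" stage
+COPY --from=build /src/hello /usr/local/bin/hello
+
+ENTRYPOINT ["/usr/local/bin/hello"]
+```
+
+The compiler toolchain from `build-base` never reaches the final image; only the compiled binary does.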
 
-TODO
+#### Example: downloading files
+
+The Fortran test runner requires `curl` to download some files.
+However, its run time image does _not_ need `curl`, which makes this a perfect use case for a multi-stage build.
+
+First, its [Dockerfile](https://github.com/exercism/fortran-test-runner/blob/783e228d8449143d2040e68b95128bb791833a27/Dockerfile) defines a stage (named "build") in which the `curl` package is installed.
+It then uses curl to download files into that stage.
+
+```dockerfile
+FROM alpine:3.15 AS build
+
+RUN apk add --no-cache curl
+
+WORKDIR /opt/test-runner
+COPY bust_cache .
+
+WORKDIR /opt/test-runner/testlib
+RUN curl -R -O https://raw.githubusercontent.com/exercism/fortran/main/testlib/CMakeLists.txt
+RUN curl -R -O https://raw.githubusercontent.com/exercism/fortran/main/testlib/TesterMain.f90
+
+WORKDIR /opt/test-runner
+RUN curl -R -O https://raw.githubusercontent.com/exercism/fortran/main/config/CMakeLists.txt
+```
+
+The second part of the Dockerfile defines a new stage and copies the downloaded files from the "build" stage into its own stage using the `COPY` command:
+
+```dockerfile
+FROM alpine:3.15
+
+RUN apk add --no-cache coreutils jq gfortran libc-dev cmake make
+
+WORKDIR /opt/test-runner
+COPY --from=build /opt/test-runner/ .
+
+COPY . .
+ENTRYPOINT ["/opt/test-runner/bin/run.sh"]
+```
+
+#### Example: installing libraries
+
+The Ruby test runner needs the `git`, `openssh`, `build-base`, `gcc` and `wget` packages to be installed before its required libraries (gems) can be installed.
+Its [Dockerfile](https://github.com/exercism/ruby-test-runner/blob/e57ed45b553d6c6411faeea55efa3a4754d1cdbf/Dockerfile) starts with a stage (given the name `build`) that installs those packages (via `apk add`) and then installs the libraries (via `bundle install`):
+
+```dockerfile
+FROM ruby:3.2.2-alpine3.18 AS build
+
+RUN apk update && apk upgrade && \
+    apk add --no-cache git openssh build-base gcc wget git
+
+COPY Gemfile Gemfile.lock .
+
+RUN gem install bundler:2.4.18 && \
+    bundle config set without 'development test' && \
+    bundle install
+```
+
+It then defines the stage that will form the resulting Docker image.
+This stage does _not_ install the dependencies the previous stage installed; instead, it uses the `COPY` command to copy the installed libraries from the build stage into its own stage:
+
+```dockerfile
+FROM ruby:3.2.2-alpine3.18
+
+RUN apk add --no-cache bash
+
+WORKDIR /opt/test-runner
+
+COPY --from=build /usr/local/bundle /usr/local/bundle
+
+COPY . .
+
+ENTRYPOINT [ "sh", "/opt/test-runner/bin/run.sh" ]
+```
+
+```exercism/note
+The [C# test runner's Dockerfile](https://github.com/exercism/csharp-test-runner/blob/b54122ef76cbf86eff0691daa33c8e50bc83979f/Dockerfile) does something similar, only in this case the build stage can use an existing Docker image that has pre-installed the additional packages required to install libraries.
+```
 
 ## Safety
 
-TODO
+Safety is one of the main reasons we use Docker containers to run our tooling.
+
+### Prefer official images
+
+There are many Docker images on [Docker Hub](https://hub.docker.com/), but try to use [official ones](https://hub.docker.com/search?q=&image_filter=official).
+These images are curated and have (far) less chance of being unsafe.
 
-## Support read-only filesystem
+### Pin versions
+
+To ensure that builds are stable (i.e. they don't suddenly break), you should always pin your base images to specific tags.
+That means instead of:
 
-TODO
+```dockerfile
+FROM alpine:latest
+```
+
+you should use:
+
+```dockerfile
+FROM alpine:3.20.2
+```
+
+With the latter, builds will always use the same version.
+
+### Run as a non-privileged user
+
+By default, many images will run with a user that has root privileges.
+You should consider running as a non-privileged user.
+
+```dockerfile
+FROM alpine:3.20.2
+
+# Alpine ships BusyBox's addgroup/adduser rather than groupadd/useradd
+RUN addgroup -S myuser && adduser -S -G myuser myuser
+
+# <RUN COMMANDS THAT REQUIRE THE ROOT USER, LIKE INSTALLING PACKAGES ETC.>
+
+USER myuser
+```
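+
+On Debian- or Ubuntu-based images, the `groupadd` and `useradd` commands are available out of the box.
+A minimal sketch, assuming a hypothetical user named `myuser` and the `ubuntu:24.10` base image:
+
+```dockerfile
+FROM ubuntu:24.10
+
+# Create a system group and a system user that belongs to it
+RUN groupadd -r myuser && useradd -r -g myuser myuser
+
+# Commands that need root (e.g. installing packages) go here, before USER
+
+# All subsequent instructions, and the container itself, run as myuser
+USER myuser
+```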
+
+### Update package repositories to latest version
+
+It is (almost) always a good idea to install the latest versions of packages, so update the package index before installing them:
+
+```dockerfile
+RUN apt-get update && \
+    apt-get install curl -y
+```
+
+### Support read-only filesystem
+
+We encourage Docker files to be written using a read-only filesystem.
+The only directories you should assume to be writeable are:
+
+- The solution dir (passed in as the second argument)
+- The output dir (passed in as the third argument)
+- The `/tmp` dir
+
+```exercism/caution
+Our production environment currently does _not_ enforce a read-only filesystem, but we might in the future.
+For this reason, the base template for a new test runner/analyzer/representer starts out with a read-only filesystem.
+If you can't get things working on a read-only filesystem, feel free to (for now) assume a writeable filesystem.
+```
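+
+One common way to cope with a read-only filesystem is to point tools that write caches or configuration files at the `/tmp` directory.
+The sketch below assumes your tooling honors the `HOME` and `XDG_CACHE_HOME` environment variables; whether it does depends on the language's toolchain.
+
+```dockerfile
+# Redirect writes that would otherwise go to the (read-only) home directory
+ENV HOME=/tmp
+ENV XDG_CACHE_HOME=/tmp/.cache
+```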
diff --git a/building/tooling/docker.md b/building/tooling/docker.md
@@ -50,21 +50,6 @@ Different languages perform better/worse with different configurations (e.g. Rub
 You can experiment locally by using the `--network` flag when running your docker. `--network none` is supported by default.
 To use the internal network, first run `docker network create --internal internal` to create the network, then use `--network internal` when running the container.
 
-### Read-only filesystem
-
-We encourage Docker files to be written using a read-only filesystem.
-The only directories you should assume to be writeable are:
-
-- The solution dir (passed in as the second argument)
-- The output dir (passed in as the third argument)
-- The `/tmp` dir
-
-```exercism/caution
-Our production environment currently does _not_ enforce a read-only filesystem, but we might in the future.
-For this reason, the base template for a new test runner/analyzer/representer starts out with a read-only filesystem.
-If you can't get things working on a read-only file, feel free to (for now) assume a writeable file system.
-```
-
 ### Memory
 
 Languages can set the maximum memory they need to use to run their jobs. Setting this to be as low as possible means that we can run more jobs more quickly in parallel. It also means that people who try and abuse memory will not be able to succeed. Different languages need wildly different maximum memory usage. Benchmarking the execution of a docker run to establish the maximum memory it uses is advised and appreciated.

diff --git a/building/tooling/representers/creating-from-scratch.md b/building/tooling/representers/creating-from-scratch.md
@@ -4,9 +4,9 @@ Firstly, thank you for your interest in creating a Representer!
 
 These are the steps to get going:
 
-1. Check [our repository list for an existing `...-representer`](https://github.com/exercism?q=-representer) to ensure that one doesn't already exist.
+1. Check [our repository list for an existing `...-representer`](https://github.com/search?q=org%3Aexercism+representer&type=repositories) to ensure that one doesn't already exist.
 2. Scan the [contents of this directory](/docs/building/tooling/representers) to ensure you are comfortable with the idea of creating an Representer.
-3. Open an issue at [exercism/exercism][exercism-repo] introducing yourself and telling us which language you'd like to create a Representer for.
+3. Start a new topic on [the Exercism forum][building-exercism] telling us which language you'd like to create a Representer for.
 4. Once a Representer repo has been created, use [the Representer interface document](/docs/building/tooling/representers/interface) to help guide your implementation.
 
 We have an incredibly friendly and supportive community who will be happy to help you as you work through this! If you get stuck, please start a new topic on [the Exercism forum][building-exercism] or create new issues at [exercism/exercism][exercism-repo] as needed 🙂

diff --git a/building/tooling/test-runners/README.md b/building/tooling/test-runners/README.md
@@ -12,7 +12,7 @@ Test Runners give us two advantages:
 Each language has its own Test Runner, written in that language.
 The website acts as the orchestrator between the Test Runners and students' submissions.
 
-Each Test Runner lives in the Exercism GitHub organization in a repository named `$LANG-test-runner` (e.g. `ruby-test-runner`).
+Each Test Runner lives in the Exercism GitHub organization in a repository named `$LANG-test-runner` (e.g. [`exercism/ruby-test-runner`](https://github.com/exercism/ruby-test-runner)).
 You can explore the different Test Runners [here](https://github.com/exercism?q=-test-runner).
 
 If you would like to get involved in helping with an existing Test Runner, please open an issue in its repository asking if there is somewhere you can help.

diff --git a/building/tooling/test-runners/creating-from-scratch.md b/building/tooling/test-runners/creating-from-scratch.md
@@ -4,9 +4,9 @@ Firstly, thank you for your interest in creating a Test Runner!
 
 These are the steps to get going:
 
-1. Check [our repository list for an existing `...-test-runner`](https://github.com/exercism?q=-test-runner) to ensure that one doesn't already exist.
+1. Check [our repository list for an existing `...-test-runner`](https://github.com/search?q=org%3Aexercism+test-runner&type=repositories) to ensure that one doesn't already exist.
 2. Scan the [contents of this directory](/docs/building/tooling/test-runners) to ensure you are comfortable with the idea of creating an Test Runner.
-3. Open an issue at [exercism/exercism][exercism-repo] introducing yourself and telling us which language you'd like to create a Test Runner for.
+3. Start a new topic on [the Exercism forum][building-exercism] telling us which language you'd like to create a Test Runner for.
 4. Once a Test Runner repo has been created, use [the Test Runner interface document](/docs/building/tooling/test-runners/interface) to help guide your implementation. There is a [generic test runner repository template](https://github.com/exercism/generic-test-runner/) that you can use to kick-start development.
 
 We have an incredibly friendly and supportive community who will be happy to help you as you work through this! If you get stuck, please start a new topic on [the Exercism forum][building-exercism] or create new issues at [exercism/exercism][exercism-repo] as needed 🙂