Jeopardy CTFs are made of challenges, A/D CTFs of services. A service is a pseudo-realistic application affected by security vulnerabilities that can be exploited to extract some secrets, i.e., the flags, which serve as proof of successful hacking attempts. Services could consist of multiple components, each storing a different set of flags: we call such components flag stores. Flags can be bound to a specific flag ID that simplifies the identification of the target to exploit.
For example, consider a service representing a Web forum where users can mark their threads as private. Users can also provide personal information in their profile. This information is only used for analytics and is not publicly shown to other users on the platform. Assume that flags are stored in 2 specific locations, representing the 2 flag stores of the service:
- as part of a private thread,
- as a field of their profile data.
The flag ID for the 1st flag store could be the identifier of the private thread, while the user identifier could be used as a flag ID for the 2nd flag store.
Services can contain multiple flag stores, typically between 1 and 3. CTF organizers can decide to impose a fixed number of flag stores for all services, or let each service have a different number of flag stores. Flag stores can be affected by multiple vulnerabilities. Here are some rules for the implementation of your service:
- a vulnerability typically allows for the extraction of flags from a single flag store; exceptions to this rule are allowed only if the exploitation of the vulnerability is very complicated or if it must be chained with other vulnerabilities
- flag stores should be as independent as possible. For instance, one flag store shouldn't become accessible only after exploiting another one
- avoid vulnerabilities that may enable destructive attacks, like erasing all the flags contained in a DB, taking 100% of CPU time, or filling up all the disk quota. Arbitrary remote code execution should also be avoided in general, unless the exploitation of the vulnerability is very hard or proper confinement measures are in place to prevent DoS after exploitation
- the vulnerabilities must not be too similar: for instance, having two UNION-based SQL injections, one without any filtering and one with some keyword filtering, is not a great idea. Even better, the second vulnerability should not be a SQL injection at all
- it's important to design vulnerabilities such that they cover different difficulty levels: intuitively, a good CTF design enables all teams to exploit the simplest vulnerability of each service during the competition. tl;dr: put at least 1 trivial vulnerability in your service.
The gameserver uses bots to interact with your service. Although you are required to extend a single Python class for each flag store (see the Writing a Checker section for details), bots typically perform 2 main actions: dispatching new flags and testing the intended service functionalities. Dispatchers are functions that use the legitimate functionalities of a service to insert new flags into a flag store. For every flag store in your service, you are required to implement an appropriate dispatcher.
Upon successful dispatch, the checker may return a flag ID that will be used:
- by the service checker to verify the service functionality
- by exploits to identify the target of an attack (flag IDs will be made available to teams via an API provided by our CTF infrastructure)
Service checkers are functions that verify the correct functioning of a service, e.g., whether the service is up and running, whether all functionalities are still in place, and whether previously inserted flags are still accessible via the intended functionalities of the service. As for flag dispatchers, you are required to implement a checker for every flag store in your service.
It is extremely important that bots cannot be easily fingerprinted, e.g., because they always perform the same operations in the same order. Otherwise, teams could simply filter out incoming attacks by blocking all requests that do not follow the same pattern used by the bot. For this reason, bots should randomly change their behavior across executions by switching the order of some operations/requests and/or using different functionalities in different executions. Furthermore, bots performing HTTP connections shouldn't be easily identifiable through a specific and constant user-agent: we recommend sticking to Python requests, randomly picking the User-Agent of one of its last N versions to simulate connections generated by other teams.
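A minimal sketch of such user-agent rotation (the version list below is an assumption; update it to the most recent releases of requests at the time of the competition):

```python
import random

# Plausible recent python-requests releases (an assumption -- keep this
# list updated to the latest N versions during the CTF).
RECENT_REQUESTS_VERSIONS = ["2.28.2", "2.29.0", "2.30.0", "2.31.0"]

def random_user_agent() -> str:
    """Return a User-Agent mimicking a random recent requests release."""
    return f"python-requests/{random.choice(RECENT_REQUESTS_VERSIONS)}"

# Usage (requires the requests package):
# import requests
# requests.get(url, headers={"User-Agent": random_user_agent()})
```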
Flag dispatchers and service checkers are allowed to share some state across different executions. This is done by accessing a key-value storage for a given team-service pair. Avoid using this storage if possible anyway: one good way to share state between the dispatcher and the checker is to derive some information from the flag. For example, to compute a username from a flag, you can use something like:
>>> import hashlib
>>> SECRET = b'omg_so_secret'
>>> flag = b'CTF{foobarbaz}'
>>> username = hashlib.sha256(flag + SECRET).hexdigest()[:20]
>>> username
'aacb36a0f0bd0207844a'
Please read the Writing a Checker section carefully to understand how to write bots. As complementary material, please refer to the FAUST CTF documentation and the Checker Script Python Library.
You are also required to provide an exploit for every vulnerability of your service.
Exploits should typically take the following parameters as command-line arguments:
- the IP of the target team to attack
- the flag ID (if used by your service/flag store)
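A minimal sketch of such a command-line interface using `argparse` (the argument names and the `exploit` body are hypothetical and must be adapted to your service):

```python
import argparse
from typing import Optional

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical CLI: adapt the flag-ID handling to your flag store.
    parser = argparse.ArgumentParser(description="Exploit for <service>")
    parser.add_argument("ip", help="IP address of the target team")
    parser.add_argument("flag_id", nargs="?", default=None,
                        help="flag ID of the target (omit if unused)")
    return parser

def exploit(ip: str, flag_id: Optional[str]) -> None:
    # ... interact with the target service and print the captured flags ...
    print(f"attacking {ip} (flag ID: {flag_id})")

if __name__ == "__main__":
    args = build_parser().parse_args()
    exploit(args.ip, args.flag_id)
```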
You should also provide appropriate patches that fix the intended vulnerabilities affecting the flag stores of your service. Patches should be easy to apply and test. Check the GitLab documentation on how to generate a patch.
Security topics, programming language(s), and vulnerabilities are up to the developers of a service. It is recommended to combine multiple topics in the same service, e.g., a binary service with a vulnerability that needs some crypto knowledge to be exploited. Remember that the service should be challenging enough to be interesting for the participants, but not so hard that a single team cannot solve it. Keep in mind your target audience during the development!
This is a partial list of additional recommendations that you should keep in mind while working on your service.
- Go through all possible functionalities provided by the service. Can they be (ab)used to DoS the service? Consider disk usage (I/O and storage), CPU, RAM, etc. If this is the case, are there mitigations in place? Something like a reasonable proof-of-work or a captcha can be used to slow down participants, but it needs to be carefully designed to work together with the gameserver bots.
- Similar to the point above... is it possible to forge a small request that gets amplified by the service? Stuff like zip bombs must be addressed proactively during the service design phase.
- For each flag store, ensure that deploying a trivial patch (that doesn't even fix the root cause of the vulnerability) prevents any further exploitation of the flag store. There must be enough vulnerabilities to keep on hacking a flag store even if teams come up with lame patches. For instance, avoid situations where the juicy part of a service comes after a vulnerability that can be easily patched.
- Be aware of lateral movements, i.e., using a bug to capture flags from multiple flag stores. This should only be possible if the vulnerability is very difficult or if the exploitation requires chaining multiple vulnerabilities.
- Vulnerabilities that are very difficult to exploit but trivial to patch are bad. This kind of asymmetry removes any incentive to develop exploits. The opposite situation (a vulnerability that is trivial to exploit but difficult to patch) is allowed but should be carefully discussed.
- Avoid vulnerabilities that can be used for DoS. Again, DoSsing services is not necessarily forbidden, especially if the DoS is logical and doesn't affect the infrastructure, but service developers must anticipate possible abuses.
- Teams could figure out ways to DoS the infrastructure by causing the gameserver bots to hang, break, get blocked, and so on. Take some time to think about how to minimize this risk.
- Are all service functionalities being checked by the checkers? Are vulnerable functions being tested by the checker? Everything that is not checked by the checker is a backdoor, and we want to minimize the presence of backdoors in services. Backdoors are a bit lame.
- Is it easy to identify the checker's pattern? Randomize the order in which functionalities are checked, skip some checks from time to time, randomize the length and the look of user-provided data (e.g., usernames), randomize the user-agent (see above), etc.
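As a sketch of the proof-of-work mitigation mentioned above (a toy hashcash-style scheme, not a vetted design; tune `DIFFICULTY` so that honest clients, and in particular your own bots, solve it quickly):

```python
import hashlib
import os

DIFFICULTY = 4  # leading zero hex digits required (~16**4 hashes on average)

def make_challenge() -> bytes:
    """Server side: generate a fresh random challenge."""
    return os.urandom(16)

def verify(challenge: bytes, nonce: int) -> bool:
    """Server side: cheap check of a submitted nonce."""
    digest = hashlib.sha256(challenge + str(nonce).encode()).hexdigest()
    return digest.startswith("0" * DIFFICULTY)

def solve(challenge: bytes) -> int:
    """Client side: brute-force a nonce satisfying the difficulty."""
    nonce = 0
    while not verify(challenge, nonce):
        nonce += 1
    return nonce
```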
You can initialize your project by cloning the provided service template.
The template contains the basic directory structure that your repository should adhere to:
- (root of your repository)
  - `checkers`: The checkers that will place and retrieve flags from your service. Each flag store requires one checker.
    - `checker1`
      - `checker.py`
      - `docker-compose.yaml`
      - `Dockerfile`
      - `README.md`
      - `requirements.txt`
    - `checker2`
    - ...
  - `dist`: Service files. This is the root of the repository that will be distributed to each participating team and can be used to patch the service.
    - `docker-compose.yaml` (please, do not use local paths in the `volumes` directive: use named volumes)
    - `Dockerfile`
    - ...
  - `exploits`: You should place production-ready exploits for your service here.
    - `exploit1.py`
    - `README.md`
  - `src` (optional): In case your service will be distributed in compiled form, or there are other pre-processing steps that should not occur inside the team repos, place everything needed to prepare your service here.
    - `build.sh`: Script to start your pre-build process.
    - `Dockerfile`: To make your pre-build reproducible, please provide a docker container.
  - `README.md`: Describe your service and its vulnerabilities here, following the same structure adopted in the original project proposal.
It is up to the organizers to decide whether the final deployment of your service will directly use the docker-compose files in the `/dist` directory or not. It is a good rule to assume that `Dockerfile`(s) and `docker-compose.yaml` files are available to the participants for local testing, but they might not be editable by participants during the competition. This means that `Dockerfile` and `docker-compose.yaml` should not contain vulnerabilities, as teams may not be able to apply patches to those files. To avoid conflicts, the exposed port numbers of the services will follow the naming scheme `100<id>{0..9}`, where `<id>` is the service id assigned by the organizers. As an example, if service 4 needs to expose two ports, those ports can be `10041` and `10042` (or any other in the range `10040-10049`).
In the FAUST framework, the component that places flags in and retrieves them from each team's service is called a checker. You will have to implement one checker for each flag store in your service. Each of them will work independently from the others.
To implement a flag store, you can use the checkerlib, which is part of the FAUST framework. You need to implement a subclass of `checkerlib.BaseChecker` and override three methods:
- `place_flag(self, tick: int) -> Tuple[CheckResult, str]`: called once per tick to place one flag
- `check_service(self) -> Tuple[CheckResult, str]`: called once per tick to check the general availability of your service. All heavy checks should take place here. This should typically return either `OK`, `DOWN`, or `FAULTY`
- `check_flag(self, tick: int) -> Tuple[CheckResult, str]`: called for the current and multiple previous ticks to check if all the flags that did not expire are still retrievable. This should typically return either `OK` or `FLAG_NOT_FOUND`.
All three of these methods return one of the service status codes and a message that will be shown on the scoreboard so that teams can more easily identify issues with their patches. The available status codes are the following:
- `CheckResult.OK`: everything is fine
- `CheckResult.DOWN`: the service is not reachable. You should only return this if you run into a timeout
- `CheckResult.FAULTY`: the service is responding but behaving incorrectly. Teams will not receive any SLA points for this round
- `CheckResult.FLAG_NOT_FOUND`: the service is responding, but the flag is not there (either because it was not placed correctly or it has been deleted). If this status is returned in `check_flag` for a flag from a previous tick, teams get a reduced amount of points for this tick and the service is shown as `RECOVERING`. To give teams the chance to get the `RECOVERING` points, you should use this return code in `check_flag`.
The checkerlib provides you with essential functions:
- `get_flag(tick: int) -> str`: returns the flag for the given tick.
- `set_flagid(data: str) -> None`: stores the flag ID for the current tick.
- `get_flagid(tick: int) -> str`: retrieves the flag ID for the given tick.
- `store_state(key: str, data: Any) -> None`: stores a Python object persistently.
- `load_state(key: str) -> Any`: retrieves persistent data previously stored with `store_state()`.
- `run_check(checker_cls: Type[BaseChecker]) -> None`: call this function at the end of your script to invoke the checkerlib, which will call `place_flag`, `check_service`, and `check_flag` of your checker.
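Putting the pieces together, here is a minimal skeleton of a checker for the forum example described earlier. This is only a sketch: the actual service interaction is left as `...` comments, the import path is an assumption, and the stub in the `except` branch is not part of the real API (it only exists so the sketch runs standalone).

```python
from typing import Tuple

try:
    import checkerlib
    from checkerlib import BaseChecker, CheckResult
except ImportError:
    # Standalone fallback stub -- NOT part of the real checkerlib API.
    import enum
    import types

    class CheckResult(enum.Enum):
        OK = 0
        DOWN = 1
        FAULTY = 2
        FLAG_NOT_FOUND = 3

    class BaseChecker:
        def __init__(self, ip: str = "127.0.0.1") -> None:
            self.ip = ip

    checkerlib = types.SimpleNamespace(
        get_flag=lambda tick: f"FLAG{{tick{tick}}}",
        set_flagid=lambda data: None,
        get_flagid=lambda tick: f"thread-{tick}",
        run_check=lambda cls: None,
    )


class ForumChecker(BaseChecker):
    """Checker for the hypothetical 'private thread' flag store."""

    def place_flag(self, tick: int) -> Tuple[CheckResult, str]:
        flag = checkerlib.get_flag(tick)
        # ... register/log in and post `flag` inside a private thread ...
        thread_id = f"thread-{tick}"  # in reality, returned by the service
        checkerlib.set_flagid(thread_id)
        return CheckResult.OK, ""

    def check_service(self) -> Tuple[CheckResult, str]:
        # ... exercise the service functionalities, in random order ...
        return CheckResult.OK, ""

    def check_flag(self, tick: int) -> Tuple[CheckResult, str]:
        flag = checkerlib.get_flag(tick)
        thread_id = checkerlib.get_flagid(tick)
        # ... fetch thread `thread_id` and look for `flag` ...
        found = True  # placeholder for the real lookup
        if not found:
            return CheckResult.FLAG_NOT_FOUND, "flag missing from thread"
        return CheckResult.OK, ""


if __name__ == "__main__":
    checkerlib.run_check(ForumChecker)
```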
`load_state` and `store_state` allow for persistently storing and retrieving data across ticks. Assume that the container running your script could be recreated for every invocation, hence the only way to persistently store state (including things such as usernames and passwords) is through these functions. You should prefer deterministically generated credentials based on the hash of the flag (see the above section "Checkers"). Only use `load_state` and `store_state` if you have to, such as when your service generates credentials for you by handing out access tokens or certificates.