Provides the following:
- Create an image with pre-installed Scylla
- Allow configuring the database when an instance is launched for the first time
- Easy cluster creation
An RPM/DEB package that is pre-installed in the image. It is responsible for configuring Scylla during the first boot of the instance.
aws/ami/build_ami.sh
Scylla AMI user-data should be passed as a JSON object, as described below.
See the AWS docs for how to pass user-data into EC2 instances: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-add-user-data.html
User data that can be passed when creating EC2 instances:
- Object Properties
  - scylla_yaml (Scylla YAML) – Mapping of all fields that are passed down to the scylla.yaml configuration file
  - scylla_startup_args (list) – embedded information about the user that created the issue (NOT YET IMPLEMENTED) (default='[]')
  - developer_mode (boolean) – Enables developer mode (default='false')
  - post_configuration_script (string) – A script to run once the AMI first-boot configuration is finished; can be a base64-encoded string (default='')
  - post_configuration_script_timeout (int) – Time limit in seconds for the post_configuration_script (default='600')
  - start_scylla_on_first_boot (boolean) – If true, scylla-server starts on the first boot of the instance (default='true')
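The properties above can be assembled programmatically. The following is a minimal sketch, not part of this repository: `build_user_data` and its `encode_script` flag are hypothetical names, and it assumes the image accepts a plain base64 encoding of the script text.

```python
import base64
import json


def build_user_data(scylla_yaml=None, post_configuration_script="",
                    encode_script=False, **options):
    """Return a JSON user-data string using the documented property names.

    Hypothetical helper for illustration only.
    """
    user_data = {}
    if scylla_yaml:
        user_data["scylla_yaml"] = scylla_yaml
    if post_configuration_script:
        if encode_script:
            # Base64 keeps newlines and quotes intact inside the JSON string.
            # Assumption: the image decodes plain base64 of the script text.
            post_configuration_script = base64.b64encode(
                post_configuration_script.encode("utf-8")).decode("ascii")
        user_data["post_configuration_script"] = post_configuration_script
    # Other documented options (developer_mode, start_scylla_on_first_boot,
    # post_configuration_script_timeout) are passed through unchanged.
    user_data.update(options)
    return json.dumps(user_data, indent=4, sort_keys=True)


user_data = build_user_data(
    scylla_yaml={"cluster_name": "test-cluster"},
    post_configuration_script="#! /bin/bash\nyum install cloud-init-cfn",
    encode_script=True,
    developer_mode=False,
    post_configuration_script_timeout=600,
)
print(user_data)
```

The resulting string can be supplied as the instance user-data when launching from the AMI.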
All fields that are passed down to the scylla.yaml configuration file.
See https://docs.scylladb.com/operating-scylla/scylla-yaml/ for all available configuration options; only the ones that get Scylla AMI-specific defaults are listed here.
- Object Properties
  - cluster_name (string) – Name of the cluster (default: a generated name that works only for a single-node cluster)
  - auto_bootstrap (boolean) – Enable auto bootstrap (default='true')
  - listen_address (string) – Defaults to the EC2 instance private IP
  - broadcast_rpc_address (string) – Defaults to the EC2 instance private IP
  - endpoint_snitch (string) – Defaults to 'org.apache.cassandra.locator.Ec2Snitch'
  - rpc_address (string) – Defaults to '0.0.0.0'
  - seed_provider (mapping) – Defaults to the EC2 instance private IP
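Since the default cluster_name and seed_provider only suit a single node, a multi-node cluster needs both set explicitly and identically on every node. A rough sketch of building that section follows; `make_seed_provider` and `make_scylla_yaml` are hypothetical helpers, and joining several seeds into one comma-separated string is assumed from the usual scylla.yaml convention.

```python
def make_seed_provider(seed_ips):
    """Build a seed_provider value for one or more seed addresses."""
    return [{
        "class_name": "org.apache.cassandra.locator.SimpleSeedProvider",
        # Assumption: multiple seeds go in one comma-separated string.
        "parameters": [{"seeds": ",".join(seed_ips)}],
    }]


def make_scylla_yaml(cluster_name, seed_ips):
    """Build the scylla_yaml mapping shared by every node of a cluster."""
    return {
        "cluster_name": cluster_name,
        "seed_provider": make_seed_provider(seed_ips),
    }


scylla_yaml = make_scylla_yaml("test-cluster", ["10.0.219.209", "10.0.219.210"])
print(scylla_yaml)
```

Fields left out, such as listen_address, keep the AMI defaults listed above.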
Spin up a new node that connects to "10.0.219.209" as a seed and installs the cloud-init-cfn package at first boot.
{
    "scylla_yaml": {
        "cluster_name": "test-cluster",
        "experimental": true,
        "seed_provider": [{"class_name": "org.apache.cassandra.locator.SimpleSeedProvider",
                           "parameters": [{"seeds": "10.0.219.209"}]}]
    },
    "post_configuration_script": "#! /bin/bash\nyum install cloud-init-cfn",
    "start_scylla_on_first_boot": true
}
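JSON user-data is easy to get subtly wrong, for example with a trailing comma or a misspelled property name, so it can help to check it locally before launching an instance. This is only a sketch: `check_user_data` is a hypothetical helper, and the set of known names is copied from the property list in this document, not from the image's own code.

```python
import json

# Top-level property names taken from the list in this document.
KNOWN_PROPERTIES = {
    "scylla_yaml",
    "scylla_startup_args",
    "developer_mode",
    "post_configuration_script",
    "post_configuration_script_timeout",
    "start_scylla_on_first_boot",
}


def check_user_data(text):
    """Parse JSON user-data and return a list of problems (empty if none)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        return ["invalid JSON: {}".format(err)]
    if not isinstance(data, dict):
        return ["user-data must be a JSON object"]
    unknown = sorted(set(data) - KNOWN_PROPERTIES)
    return ["unknown property: {}".format(name) for name in unknown]


good = '{"scylla_yaml": {"cluster_name": "test-cluster"}, "start_scylla_on_first_boot": true}'
bad = '{"scylla_yaml": {"cluster_name": "test-cluster",}}'
print(check_user_data(good))  # []
print(check_user_data(bad))   # one "invalid JSON: ..." entry
```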
scylla_yaml:
  cluster_name: test-cluster
  experimental: true
  seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
        - seeds: 10.0.219.209
post_configuration_script: "#! /bin/bash\nyum install cloud-init-cfn"
start_scylla_on_first_boot: true
If other features of cloud-init are needed, one can use a MIME multipart message and pass the JSON/YAML as a part with content type x-scylla/yaml or x-scylla/json.
For more information on cloud-init multipart user-data, see:
https://cloudinit.readthedocs.io/en/latest/topics/format.html#mime-multi-part-archive
Content-Type: multipart/mixed; boundary="===============5438789820677534874=="
MIME-Version: 1.0

--===============5438789820677534874==
Content-Type: x-scylla/yaml
MIME-Version: 1.0
Content-Disposition: attachment; filename="scylla_machine_image.yaml"

scylla_yaml:
  cluster_name: test-cluster
  experimental: true
  seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
        - seeds: 10.0.219.209
post_configuration_script: "#! /bin/bash\nyum install cloud-init-cfn"
start_scylla_on_first_boot: true

--===============5438789820677534874==
Content-Type: text/cloud-config; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="cloud-config.txt"

#cloud-config
cloud_final_modules:
- [scripts-user, always]

--===============5438789820677534874==--
Example of creating the multipart message with Python:
import json
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart

msg = MIMEMultipart()

# Scylla image configuration, attached as an x-scylla/json part
scylla_image_configuration = dict(
    scylla_yaml=dict(
        cluster_name="test_cluster",
        listen_address="10.23.20.1",
        broadcast_rpc_address="10.23.20.1",
        seed_provider=[{
            "class_name": "org.apache.cassandra.locator.SimpleSeedProvider",
            "parameters": [{"seeds": "10.23.20.1"}]}],
    )
)
part = MIMEBase('x-scylla', 'json')
part.set_payload(json.dumps(scylla_image_configuration, indent=4, sort_keys=True))
part.add_header('Content-Disposition', 'attachment; filename="scylla_machine_image.json"')
msg.attach(part)

# Regular cloud-init configuration, attached as a text/cloud-config part
cloud_config = """
#cloud-config
cloud_final_modules:
- [scripts-user, always]
"""
part = MIMEBase('text', 'cloud-config')
part.set_payload(cloud_config)
part.add_header('Content-Disposition', 'attachment; filename="cloud-config.txt"')
msg.attach(part)

# Print the complete multipart message, ready to be used as user-data
print(msg)
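To sanity-check a generated message, it can be parsed back with the standard library and the Scylla part extracted. This is a sketch only: `find_scylla_part` is a hypothetical helper and says nothing about how the image itself reads the message.

```python
import email
import json
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart


def find_scylla_part(raw_message):
    """Return (subtype, payload text) of the first x-scylla/* part, or None."""
    msg = email.message_from_string(raw_message)
    for part in msg.walk():
        if part.get_content_maintype() == "x-scylla":
            return part.get_content_subtype(), part.get_payload()
    return None


# Build a small message the same way as the example above, then read it back.
msg = MIMEMultipart()
part = MIMEBase("x-scylla", "json")
part.set_payload(json.dumps({"scylla_yaml": {"cluster_name": "test_cluster"}}))
part.add_header("Content-Disposition",
                'attachment; filename="scylla_machine_image.json"')
msg.attach(part)

subtype, payload = find_scylla_part(msg.as_string())
# Only the JSON form is decoded here; a YAML part would need a YAML parser.
config = json.loads(payload) if subtype == "json" else payload
print(subtype, config)
```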
Use the template aws/cloudformation/scylla.yaml.
Currently, clusters of up to 10 nodes are supported.
Currently the only supported mode is:
dist/redhat/build_rpm.sh --target centos7 --cloud-provider aws
Build using Docker
docker run -it -v $PWD:/scylla-machine-image -w /scylla-machine-image --rm centos:7.2.1511 bash -c './dist/redhat/build_rpm.sh -t centos7 -c aws'
dist/debian/build_deb.sh
Build using Docker
docker run -it -v $PWD:/scylla-machine-image -w /scylla-machine-image --rm ubuntu:20.04 bash -c './dist/debian/build_deb.sh'
python3 -m venv .venv
source .venv/bin/activate
pip install sphinx sphinx-jsondomain sphinx-markdown-builder
make html
make markdown