Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bib: add experimental cross arch image building #139

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .github/workflows/tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -72,7 +72,7 @@ jobs:
- name: Install test dependencies
run: |
sudo apt update
sudo apt install -y podman python3-pytest python3-paramiko python3-boto3 flake8 qemu-system-x86
sudo apt install -y podman python3-pytest python3-paramiko python3-boto3 flake8 qemu-system-x86 qemu-efi-aarch64 qemu-system-arm qemu-user-static
- name: Diskspace (before)
run: |
df -h
Expand Down
33 changes: 29 additions & 4 deletions bib/cmd/bootc-image-builder/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -115,9 +115,19 @@ func makeManifest(c *ManifestConfig, cacheRoot string) (manifest.OSBuildManifest
}

// resolve container
resolver := container.NewResolver(c.Architecture.String())
hostArch := arch.Current()
resolverNative := container.NewResolver(hostArch.String())
resolverTarget := container.NewResolver(c.Architecture.String())
Comment on lines +118 to +120
Copy link
Member

@achilleas-k achilleas-k Feb 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm reading this right, in the standard case, we'll get two identical resolvers and each one will get the same container it will have to resolve. It's not terrible but a bit wasteful. Maybe we could instead:

resolvers[hostArch.String()] = container.NewResolver(hostArch.String())
if hostArch.String() != c.Architecture.String() {
    resolvers[c.Architecture.String()] = container.NewResolver(c.Architecture.String()) 
}

...

if plName == "build" {
    rarch = hostArch.String()
} else {
    rarch = c.Architecture.String()
}
for _, c := range sourceSpecs {
    rarch.Add(c)
}

I now also realise that even in this case, we're resolving the same container twice (and have been since we started using the container as a build root).
Again, not terrible, but maybe we can be smarter about it :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I will look at this, maybe in a followup as it's not that pressing but I agree that it's a bit wasteful

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again, I was thinking it could be something like:

index 5e9ed2d..9bb6a5b 100644
--- a/bib/cmd/bootc-image-builder/main.go
+++ b/bib/cmd/bootc-image-builder/main.go
@@ -115,13 +115,18 @@ func makeManifest(c *ManifestConfig, cacheRoot string) (manifest.OSBuildManifest
        }
 
        // resolve container
-       hostArch := arch.Current()
-       resolverNative := container.NewResolver(hostArch.String())
-       resolverTarget := container.NewResolver(c.Architecture.String())
+       hostArch := arch.Current().String()
+       targetArch := c.Architecture.String()
+
+       resolverNative := container.NewResolver(hostArch)
+       resolverTarget := resolverNative
+       if hostArch != targetArch {
+               resolverTarget = container.NewResolver(targetArch)
+       }
 
        containerSpecs := make(map[string][]container.Spec)
        for plName, sourceSpecs := range manifest.GetContainerSourceSpecs() {
-               var resolver container.Resolver
+               var resolver *container.Resolver
                if plName == "build" {
                        resolver = resolverNative
                } else {

which requires osbuild/images#433 (so that we get pointers to the resolver first) - I have no super strong opinion here though, happy to discuss more if you think this is less obvious or too verbose :)


containerSpecs := make(map[string][]container.Spec)
for plName, sourceSpecs := range manifest.GetContainerSourceSpecs() {
var resolver container.Resolver
if plName == "build" {
resolver = resolverNative
} else {
resolver = resolverTarget
}

for _, c := range sourceSpecs {
resolver.Add(c)
}
Expand Down Expand Up @@ -152,8 +162,8 @@ func saveManifest(ms manifest.OSBuildManifest, fpath string) error {
}

func manifestFromCobra(cmd *cobra.Command, args []string) ([]byte, error) {
hostArch := arch.Current()
repos, err := loadRepos(hostArch.String())
buildArch := arch.Current()
repos, err := loadRepos(buildArch.String())
if err != nil {
return nil, err
}
Expand All @@ -163,6 +173,20 @@ func manifestFromCobra(cmd *cobra.Command, args []string) ([]byte, error) {
configFile, _ := cmd.Flags().GetString("config")
tlsVerify, _ := cmd.Flags().GetBool("tls-verify")
imgType, _ := cmd.Flags().GetString("type")
targetArch, _ := cmd.Flags().GetString("target-arch")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be handling "target-variant" as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to add more here - but pardon my ignorance, what does target-variant mean in this context?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some Arm platforms have multiple variants. So if BIB is calling skopeo or podman then it is needs to pass the variant as well as the Arch if the user specified a variant.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a TODO here for now, if you could share some container or any other link where I can inspect this a bit more that would be super helpful. I will then add/fix this in a followup.

if targetArch != "" {
// TODO: detect if binfmt_misc for target arch is
// available, e.g. by mounting the binfmt_misc fs into
// the container and inspects the files or by
// including tiny statically linked target-arch
// binaries inside our bib container
fmt.Fprintf(os.Stderr, "WARNING: target-arch is experimental and needs an installed 'qemu-user' package\n")
if imgType == "iso" {
return nil, fmt.Errorf("cannot build iso for different target arches yet")
}
buildArch = arch.FromString(targetArch)
}
// TODO: add "target-variant", see https://github.com/osbuild/bootc-image-builder/pull/139/files#r1467591868

var config *BuildConfig
if configFile != "" {
Expand All @@ -179,7 +203,7 @@ func manifestFromCobra(cmd *cobra.Command, args []string) ([]byte, error) {
ImgType: imgType,
Config: config,
Repos: repos,
Architecture: hostArch,
Architecture: buildArch,
TLSVerify: tlsVerify,
}
return makeManifest(manifestConfig, rpmCacheRoot)
Expand Down Expand Up @@ -315,6 +339,7 @@ func run() error {
manifestCmd.Flags().String("config", "", "build config file")
manifestCmd.Flags().String("type", "qcow2", "image type to build [qcow2, ami]")
manifestCmd.Flags().Bool("tls-verify", true, "require HTTPS and verify certificates when contacting registries")
manifestCmd.Flags().String("target-arch", "", "build for the given target architecture (experimental)")

logrus.SetLevel(logrus.ErrorLevel)
buildCmd.Flags().AddFlagSet(manifestCmd.Flags())
Expand Down
3 changes: 3 additions & 0 deletions plans/all.fmf
Original file line number Diff line number Diff line change
Expand Up @@ -12,13 +12,16 @@ provision:
prepare:
how: install
package:
- edk2-aarch64
- podman
- pytest
- python3-boto3
- python3-flake8
- python3-paramiko
- python3-pip
- qemu-kvm
- qemu-system-aarch64
- qemu-user-static
execute:
how: tmt
script: |
Expand Down
19 changes: 15 additions & 4 deletions test/test_build.py
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@
class ImageBuildResult(NamedTuple):
img_type: str
img_path: str
img_arch: str
username: str
password: str
journal_output: str
Expand All @@ -51,7 +52,13 @@ def image_type_fixture(shared_tmpdir, build_container, request, force_aws_upload
ImageBuildResult with the resulting image path and user/password
"""
# image_type is passed via special pytest parameter fixture
container_ref, image_type = request.param.split(",")
if request.param.count(",") == 2:
container_ref, image_type, target_arch = request.param.split(",")
elif request.param.count(",") == 1:
container_ref, image_type = request.param.split(",")
target_arch = None
else:
raise ValueError(f"cannot parse {request.param.count}")

username = "test"
password = "password"
Expand All @@ -74,7 +81,7 @@ def image_type_fixture(shared_tmpdir, build_container, request, force_aws_upload
if generated_img.exists():
print(f"NOTE: reusing cached image {generated_img}")
journal_output = journal_log_path.read_text(encoding="utf8")
yield ImageBuildResult(image_type, generated_img, username, password, journal_output)
yield ImageBuildResult(image_type, generated_img, target_arch, username, password, journal_output)
return

# no image yet, build it
Expand All @@ -99,6 +106,9 @@ def image_type_fixture(shared_tmpdir, build_container, request, force_aws_upload

upload_args = []
creds_args = []
target_arch_args = []
if target_arch:
target_arch_args = ["--target-arch", target_arch]

with tempfile.TemporaryDirectory() as tempdir:
if image_type == "ami":
Expand Down Expand Up @@ -129,6 +139,7 @@ def image_type_fixture(shared_tmpdir, build_container, request, force_aws_upload
"--config", "/output/config.json",
"--type", image_type,
*upload_args,
*target_arch_args,
])
journal_output = testutil.journal_after_cursor(cursor)
metadata = {}
Expand All @@ -141,7 +152,7 @@ def del_ami():

journal_log_path.write_text(journal_output, encoding="utf8")

yield ImageBuildResult(image_type, generated_img, username, password, journal_output, metadata)
yield ImageBuildResult(image_type, generated_img, target_arch, username, password, journal_output, metadata)
# Try to cache as much as possible
disk_usage = shutil.disk_usage(generated_img)
print(f"NOTE: disk usage after {generated_img}: {disk_usage.free / 1_000_000} / {disk_usage.total / 1_000_000}")
Expand All @@ -167,7 +178,7 @@ def test_image_is_generated(image_type):
@pytest.mark.skipif(platform.system() != "Linux", reason="boot test only runs on linux right now")
@pytest.mark.parametrize("image_type", gen_testcases("direct-boot"), indirect=["image_type"])
def test_image_boots(image_type):
with QEMU(image_type.img_path) as test_vm:
with QEMU(image_type.img_path, arch=image_type.img_arch) as test_vm:
exit_status, _ = test_vm.run("true", user=image_type.username, password=image_type.password)
assert exit_status == 0
exit_status, output = test_vm.run("echo hello", user=image_type.username, password=image_type.password)
Expand Down
9 changes: 9 additions & 0 deletions test/testcases.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
import platform
import os


Expand Down Expand Up @@ -43,6 +44,14 @@ def gen_testcases(what):
CONTAINERS_TO_TEST["centos"] + "," + DIRECT_BOOT_IMAGE_TYPES[2],
CONTAINERS_TO_TEST["fedora"] + "," + DIRECT_BOOT_IMAGE_TYPES[0],
]
# do a cross arch test too
if platform.machine() == "x86_64":
# todo: add fedora:eln
test_cases.append(
f'{CONTAINERS_TO_TEST["centos"]},raw,arm64')
elif platform.machine() == "arm64":
# TODO: add arm64->x86_64 cross build test too
pass
return test_cases
elif what == "all":
test_cases = []
Expand Down
55 changes: 37 additions & 18 deletions test/vm.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
import abc
import os
import pathlib
import platform
import subprocess
import sys
import time
Expand Down Expand Up @@ -93,33 +94,46 @@ def find_ovmf():

class QEMU(VM):
MEM = "2000"
# TODO: support qemu-system-aarch64 too :)
QEMU = "qemu-system-x86_64"

def __init__(self, img, snapshot=True, cdrom=None):
def __init__(self, img, arch="", snapshot=True, cdrom=None):
super().__init__()
self._img = pathlib.Path(img)
self._qmp_socket = self._img.with_suffix(".qemp-socket")
self._qemu_p = None
self._snapshot = snapshot
self._cdrom = cdrom
self._ssh_port = None
if not arch:
arch = platform.machine()
self._arch = arch

def __del__(self):
self.force_stop()

# XXX: move args to init() so that __enter__ can use them?
def start(self, wait_event="ssh", snapshot=True, use_ovmf=False):
if self.running():
return
log_path = self._img.with_suffix(".serial-log")
self._ssh_port = get_free_port()
self._address = "localhost"
qemu_cmdline = [
self.QEMU, "-enable-kvm",
def _gen_qemu_cmdline(self, snapshot, use_ovmf):
if self._arch in ("arm64", "aarch64"):
qemu_cmdline = [
"qemu-system-aarch64",
"-machine", "virt",
"-cpu", "cortex-a57",
"-smp", "2",
"-bios", "/usr/share/AAVMF/AAVMF_CODE.fd",
]
elif self._arch in ("amd64", "x86_64"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i also see the options here, so I'm assuming command line it should only accept amd64 or arm64?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should eventually support all of the platforms that RHEL supports, otherwise our parent company will not be happy.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely need to test more scenarios here but if the container is available for both host and target arch and qemu-user is available for the target arch the existing code should DTRT even for e.g. ppc64le.

qemu_cmdline = [
"qemu-system-x86_64",
"-M", "accel=kvm",
# get "illegal instruction" inside the VM otherwise
"-cpu", "host",
]
if use_ovmf:
qemu_cmdline.extend(["-bios", find_ovmf()])
else:
raise ValueError(f"unsupported architecture {self._arch}")

# common part
qemu_cmdline += [
"-m", self.MEM,
# get "illegal instruction" inside the VM otherwise
"-cpu", "host",
"-serial", "stdio",
"-monitor", "none",
"-netdev", f"user,id=net.0,hostfwd=tcp::{self._ssh_port}-:22",
Expand All @@ -128,18 +142,23 @@ def start(self, wait_event="ssh", snapshot=True, use_ovmf=False):
]
if not os.environ.get("OSBUILD_TEST_QEMU_GUI"):
qemu_cmdline.append("-nographic")
if use_ovmf:
qemu_cmdline.extend(["-bios", find_ovmf()])
if self._cdrom:
qemu_cmdline.extend(["-cdrom", self._cdrom])
if snapshot:
qemu_cmdline.append("-snapshot")
qemu_cmdline.append(self._img)
self._log(f"vm starting, log available at {log_path}")
return qemu_cmdline

# XXX: move args to init() so that __enter__ can use them?
def start(self, wait_event="ssh", snapshot=True, use_ovmf=False):
if self.running():
return
self._ssh_port = get_free_port()
self._address = "localhost"

# XXX: use systemd-run to ensure cleanup?
self._qemu_p = subprocess.Popen(
qemu_cmdline,
self._gen_qemu_cmdline(snapshot, use_ovmf),
stdout=sys.stdout,
stderr=sys.stderr,
)
Expand Down
Loading