Image refresh for arch (for fsinfo) #5861

Merged
merged 3 commits into from
Feb 6, 2024

cockpituous
Contributor

@cockpituous cockpituous commented Feb 2, 2024

Let's get a working image with fsinfo inside of it.

  • image-refresh arch

@cockpituous cockpituous added the bot label Feb 2, 2024
@cockpituous cockpituous changed the title Image refresh for arch (for fsinfo) WIP: rhos-01-9: [no-test] Image refresh for arch (for fsinfo) Feb 2, 2024
@cockpituous cockpituous force-pushed the image-refresh-arch-20240202-202818 branch from 6c588a3 to bb54e61 Compare February 2, 2024 20:28
cockpituous pushed a commit that referenced this pull request Feb 2, 2024
@cockpituous cockpituous changed the title WIP: rhos-01-9: [no-test] Image refresh for arch (for fsinfo) Image refresh for arch (for fsinfo) Feb 2, 2024
@allisonkarlitskaya
Member

  cockpit (308-1 -> 310.1-1)

How very convenient. :)

@allisonkarlitskaya
Member

Great. More curl upgrade fallout:

[root@archlinux ~]# curl --head http://172.27.0.15:9090
HTTP/1.1 301 Moved Permanently
Content-Type: text/html
Location: https://172.27.0.15:9090/

curl: (8) Weird server reply

and indeed:

sendto(5, "HEAD / HTTP/1.1\r\nHost: 172.27.0.15:9090\r\nUser-Agent: curl/8.6.0\r\nAccept: */*\r\n\r\n", 80, MSG_NOSIGNAL, NULL, 0) = 80
recvfrom(5, "HTTP/1.1 301 Moved Permanently\r\nContent-Type: text/html\r\nLocation: https://172.27.0.15:9090/\r\n\r\n<html><head><title>Moved</title></head><body>Please use TLS</body></html>\r\n", 102400, 0, NULL, NULL) = 171

That looks distinctly like a body sent in reply to a HEAD request. This one would have been coming from cockpit-tls. That's going to be more "interesting" to fix: cockpit-tls doesn't read past the first byte on the incoming message, and it certainly doesn't go about trying to parse the HTTP request. I think our best bet here would probably be to never send a body.
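The byte counts in the strace output above can be checked with a short sketch. It is only an illustration: the helper names `split_reply` and `head_reply_is_valid` are made up here and are not part of cockpit-tls or curl. It splits the captured reply at the end of the headers and shows that 75 of the 171 bytes are body, which a reply to HEAD must not carry.

```python
# Raw reply copied from the recvfrom() line in the strace output.
RAW_REPLY = (
    b"HTTP/1.1 301 Moved Permanently\r\n"
    b"Content-Type: text/html\r\n"
    b"Location: https://172.27.0.15:9090/\r\n"
    b"\r\n"
    b"<html><head><title>Moved</title></head><body>Please use TLS</body></html>\r\n"
)


def split_reply(raw: bytes) -> tuple[bytes, bytes]:
    """Split a raw HTTP/1.1 reply into (header section, body).

    Hypothetical helper: the header section includes the blank line
    that terminates it.
    """
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no end-of-headers marker found")
    return head + sep, body


def head_reply_is_valid(raw: bytes) -> bool:
    """A reply to a HEAD request must end right after the headers."""
    _, body = split_reply(raw)
    return len(body) == 0


headers, body = split_reply(RAW_REPLY)
print(len(RAW_REPLY), len(headers), len(body))  # 171 96 75
print(head_reply_is_valid(RAW_REPLY))           # False: body after HEAD
print(head_reply_is_valid(headers))             # True: headers only
```

Sending only the header section, as in the last line, is the "never send a body" option: it is valid for HEAD, and a redirect carries everything it needs in the `Location` header.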

@martinpitt
Member

The two cockpit PRs landed, retrying. I didn't yet look into the two machines failures, but we've accrued quite a number of flakes there, so retrying for comparison as well.

@martinpitt martinpitt mentioned this pull request Feb 3, 2024
1 task
@martinpitt
Member

Ah, there is still the testRaidRepair crash, which is already fixed in the other pending refresh in PR #5804. I suppose this is for @mvollmer.

@martinpitt
Member

The same regression now landed in Fedora updates-testing, see cockpit-project/cockpit#19937 . Smells like a kernel regression?

@allisonkarlitskaya
Member

I wouldn't be surprised if it's related to the one in #5793...

@jelly
Member

jelly commented Feb 4, 2024

> I wouldn't be surprised if it's related to the one in #5793...

If you have a link to the kernel patch which is supposed to resolve it, I can easily build a kernel / verify

@martinpitt
Member

The ubuntu one was a rather different area though (partition vs. RAID repair)

@mvollmer
Member

mvollmer commented Feb 5, 2024

I'll have a look as well.

The first failure is interesting; the other two, with a busy /dev/loop10, are just a failed cleanup.

testRaidRepair does indeed look like https://bugzilla.redhat.com/show_bug.cgi?id=2256432, but MaxLayoutSizes must be something else...

@mvollmer
Member

mvollmer commented Feb 5, 2024

> testRaidRepair does indeed look like https://bugzilla.redhat.com/show_bug.cgi?id=2256432,

Yes, lvconvert hangs with the exact same kernel stack trace.

@mvollmer
Member

mvollmer commented Feb 5, 2024

> testRaidRepair does indeed look like https://bugzilla.redhat.com/show_bug.cgi?id=2256432,
>
> Yes, lvconvert hangs with the exact same kernel stack trace.

This seems to happen pretty easily, also with biggish disks. We might want to raise the alarm on https://bugzilla.redhat.com/show_bug.cgi?id=2256432, which has not gotten any reaction so far.

I have found no way to continue when this happens. The lvconvert process cannot be killed by any means known to mankind when it is in this state. I think we need to make this a destructive test with a naughty.

@mvollmer
Member

mvollmer commented Feb 5, 2024

Blocked on cockpit-project/cockpit#19940

allisonkarlitskaya pushed a commit that referenced this pull request Feb 5, 2024
@allisonkarlitskaya
Member

allisonkarlitskaya commented Feb 6, 2024

Looking into the machines failure. All of the noise about

> error: Scrollbar test exception: TypeError: Cannot read properties of null (reading 'appendChild')
> log: osinfo-detect command failed:  (process:1624): GLib-GIO-WARNING **: 06:32:25.473: Can't find module 'gvfs' specified in GIO_USE_VFS
Traceback (most recent call last):
  File "<string>", line 28, in <module>
gi.repository.GLib.GError: osinfo-tree-error: URL protocol is not supported (1)
> log: osinfo-detect command failed:  (process:1769): GLib-GIO-WARNING **: 06:32:35.476: Can't find module 'gvfs' specified in GIO_USE_VFS
Traceback (most recent call last):
  File "<string>", line 22, in <module>
gi.repository.GLib.GError: osinfo-media-error: No volume descriptors (0)
> error: Failed when connecting: Connection closed (code: 1000)
> info: Connection lost:  {"isTrusted":"false","detail":"Object","type":"disconnect","target":"null","currentTarget":"null"}
> error: Tried changing state of a disconnected RFB object
> error: spawn 'vm creation' returned error: "{"problem":null,"exit_status":1,"exit_signal":null,"message":"ERROR    Domain not found: no domain with matching uuid 'e8d716ac-835b-4a5c-b0ba-8f720bab6b01' (subVmTestCreate8)\nDomain installation does not appear to have been successful.\nIf it was, you can restart your domain by running:\n  virsh --connect qemu:///system start subVmTestCreate8\notherwise, please restart your installation.\nTraceback (most recent call last):\n  File \"<string>\", line 361, in <module>\n  File \"<string>\", line 263, in create_vm\n  File \"/usr/lib/python3.11/subprocess.py\", line 466, in check_output\n    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/usr/lib/python3.11/subprocess.py\", line 571, in run\n    raise CalledProcessError(retcode, process.args,\nsubprocess.CalledProcessError: Command '['virt-install', '--connect', 'qemu:///system', '--quiet', '--os-variant', 'fedora28', '--memory', '128', '--name', 'subVmTestCreate8', '--check', 'path_in_use=off', '--wait', '-1', '--noautoconsole', '--disk', 'none', '--graphics', 'vnc,listen=127.0.0.1', '--graphics', 'spice,listen=::1', '--cdrom', 'https://archive.fedoraproject.org/pub/archive/fedora/linux/releases/28/Server/x86_64/os/images/boot.iso']' returned non-zero exit status 1."}"

is just noise. I see the same thing in a local run on main with the old image (which ends successfully).

Checking with the new image (and expanding the error message):
[screenshot: expanded error message]

That sounds like more fallout from the new curl version: curl/curl#12844

@martinpitt
Member

c-machines itself, in particular its machine_install.py, doesn't call curl. But of course that could be caused by qemu/block-curl.so calling it the wrong way. The test fakes fedoraproject.org with test/files/mock-range-server.py , but that doesn't use curl at all (it's the server side).
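For context on what a "range server" has to do: a client that reads a remote image piecewise, as qemu's curl block driver presumably does, sends `Range: bytes=...` headers and expects only that slice back. The sketch below shows the header parsing. It is a generic illustration and is not taken from test/files/mock-range-server.py; `parse_range` is a hypothetical helper that handles only the single-range forms.

```python
import re
from typing import Optional, Tuple

# Matches a single-range header value: bytes=START-END, bytes=START-, bytes=-N
_RANGE_RE = re.compile(r"bytes=(\d*)-(\d*)$")


def parse_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Return the inclusive (start, end) byte offsets for a resource of
    `size` bytes, or None if the header is malformed or unsatisfiable."""
    match = _RANGE_RE.match(header.strip())
    if match is None or size <= 0:
        return None
    first, last = match.groups()
    if not first and not last:
        return None
    if not first:
        # Suffix form "bytes=-N": the last N bytes of the resource.
        count = int(last)
        if count == 0:
            return None
        return max(size - count, 0), size - 1
    start = int(first)
    end = int(last) if last else size - 1
    if start >= size or start > end:
        return None
    # Clamp an over-long range to the end of the resource.
    return start, min(end, size - 1)


print(parse_range("bytes=0-99", 1000))
print(parse_range("bytes=900-", 1000))
print(parse_range("bytes=-100", 1000))
print(parse_range("bytes=1000-", 1000))
```

A server would answer a satisfiable range with `206 Partial Content` and a matching `Content-Range` header, and an unsatisfiable one with `416`.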

@allisonkarlitskaya
Member

Downgrading curl (using 8.5.0 from https://archive.archlinux.org/packages/c/curl/) fixes the issue.

@allisonkarlitskaya
Member

> Blocked on cockpit-project/cockpit#19940

Landed.

@mvollmer mvollmer removed the blocked label Feb 6, 2024
@@ -0,0 +1,3 @@
Traceback (most recent call last):
File "*/check-machines-create", line *, in testCreateUrlSource
runner.createTest(TestMachinesCreate.VmDialog(self, sourceType='url',
Member

Can we make that more specific somehow? Otherwise this would match whenever anything is broken with createTest, not just when it is curl's fault.

Member

E.g., there is also "CURL: Error opening file: OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 0" in the browser log, a few lines above.

Member

OK.
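
The concern in this review thread can be illustrated with a small sketch. It assumes the naughty pattern is matched against the test log as an fnmatch-style glob, which is what the `*` wildcards in the diff suggest but is not confirmed here. Both sample logs, including the path and line number, are made up for the example, and `matches` is a hypothetical helper.

```python
import fnmatch

# Broad pattern: the traceback lines only, as in the diff under review.
BROAD = """\
Traceback (most recent call last):
  File "*/check-machines-create", line *, in testCreateUrlSource
    runner.createTest(TestMachinesCreate.VmDialog(self, sourceType='url',
"""

# More specific pattern: additionally requires the curl error from the
# browser log to appear before the traceback.
SPECIFIC = """\
CURL: Error opening file: OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 0
*
Traceback (most recent call last):
  File "*/check-machines-create", line *, in testCreateUrlSource
"""

# Made-up log of a failure caused by the curl regression.
CURL_LOG = """\
> warning: CURL: Error opening file: OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 0
> error: spawn 'vm creation' returned error
Traceback (most recent call last):
  File "/work/test/check-machines-create", line 123, in testCreateUrlSource
    runner.createTest(TestMachinesCreate.VmDialog(self, sourceType='url',
AssertionError
"""

# Made-up log of an unrelated createTest failure in the same test.
OTHER_LOG = """\
> error: some unrelated failure
Traceback (most recent call last):
  File "/work/test/check-machines-create", line 123, in testCreateUrlSource
    runner.createTest(TestMachinesCreate.VmDialog(self, sourceType='url',
AssertionError
"""


def matches(pattern: str, log: str) -> bool:
    """Glob-match the pattern anywhere inside the log (hypothetical helper)."""
    return fnmatch.fnmatchcase(log, "*" + pattern.strip() + "*")


for name, pattern in (("broad", BROAD), ("specific", SPECIFIC)):
    print(name, matches(pattern, CURL_LOG), matches(pattern, OTHER_LOG))
```

Under these assumptions the broad pattern matches both logs, while the specific one matches only the log that contains the curl error, so an unrelated createTest regression would still be reported.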

@mvollmer
Member

mvollmer commented Feb 6, 2024

arch/other probably just needs a rebase to go green.

@allisonkarlitskaya allisonkarlitskaya merged commit 97aaf55 into main Feb 6, 2024
12 checks passed
@allisonkarlitskaya allisonkarlitskaya deleted the image-refresh-arch-20240202-202818 branch February 6, 2024 14:03
allisonkarlitskaya pushed a commit that referenced this pull request Feb 6, 2024