Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error when using UDP socket (libffi7+ specific?) #265

Open
v1993 opened this issue Apr 9, 2021 · 8 comments
Open

Error when using UDP socket (libffi7+ specific?) #265

v1993 opened this issue Apr 9, 2021 · 8 comments

Comments

@v1993
Copy link
Contributor

v1993 commented Apr 9, 2021

This is a weird error that seems to occur only with libffi version 7/8, but not version 6 (or could it be another library?). Here's an example that can reproduce it:

local lgi = require 'lgi'
local	GLib,		Gdk,		GObject,		Gio =
		lgi.GLib,	lgi.Gdk,	lgi.GObject,	lgi.Gio

local app = Gio.Application { application_id = 'org.v1993.udpcrash', flags = 'NON_UNIQUE' }

local socket = lgi.Gio.Socket.new('IPV4', 'DATAGRAM', 'UDP')
socket.blocking = false
local sa = lgi.Gio.InetSocketAddress.new(Gio.InetAddress.new_loopback('IPV4'), 3030)
assert(socket:bind(sa, true))

do
	-- To avoid extra allocations
	-- I assume this is long enough
	local buf = require("lgi.core").bytes.new(4096)
	local source = socket:create_source('IN')
	source:set_callback(function()
		print('This will print')
		local len, src = socket:receive_from(buf) -- CRASH!
		print('And this will not')
		if len > 0 then
			print(tostring(buf):sub(1, len))
		end
		return true
	end)

	source:attach(lgi.GLib.MainContext.default())
end

local function exitNormal()
	print('Exiting')
	app:release()
end

function app:on_activate()
	app:hold()
end

GLib.unix_signal_add(GLib.PRIORITY_HIGH, 1, exitNormal)
GLib.unix_signal_add(GLib.PRIORITY_HIGH, 2, exitNormal)
GLib.unix_signal_add(GLib.PRIORITY_HIGH, 15, exitNormal)

app:run({arg[0], ...})

Now, start this simple script and run in another window:

echo 'Hi' > /dev/udp/localhost/3030

This will cause script to crash with following output:

This will print
**
Lgi:ERROR:marshal.c:1197:lgi_marshal_2c_caller_alloc: assertion failed: (size > 0)
Bail out! Lgi:ERROR:marshal.c:1197:lgi_marshal_2c_caller_alloc: assertion failed: (size > 0)
Aborted (core dumped)

This actually causes a crash in my application: v1993/linuxmotehook#5 (thread contains some limited digging into issue), which evidently worked with older versions of stuff. Any clue what's causing it?

@psychon
Copy link
Collaborator

psychon commented Apr 9, 2021

Quick first reaction: I need bash -c 'echo 'Hi' > /dev/udp/127.0.0.1/3030' to reproduce this (zsh does not implement /dev/udp and for some reason Lua listens on IPv4, but bash sends using IPv6). With this, I can reproduce the problem.

Now it's time for dinner, sorry.

@psychon
Copy link
Collaborator

psychon commented Apr 9, 2021

Oh and: Thanks a lot for this great bug report!

Edit:
Shorter reproducer:

local lgi = require 'lgi'
local Gio = lgi.Gio

local socket = lgi.Gio.Socket.new('IPV4', 'DATAGRAM', 'UDP')
--socket.blocking = false
local sa = lgi.Gio.InetSocketAddress.new(Gio.InetAddress.new_loopback('IPV4'), 3030)
assert(socket:bind(sa, true))

print('This will print')
local len, src = socket:receive_from(buf) -- CRASH!
print('And this will not')

@v1993
Copy link
Contributor Author

v1993 commented Apr 9, 2021

Note: your reduced example seems to never define buf, which (I think) shouldn't be the case. As for "blocking", it does not actually matter for UDP IIRC.

@psychon
Copy link
Collaborator

psychon commented Apr 9, 2021

Note: your reduced example seems to never define buf,

And yet it still hits the same assert. Even calling socket:receive_from() does. This made me notice: https://github.com/pavouk/lgi/blob/a3d46f4f3cb1a3c19c61f46b856ee6683a2d57db/lgi/marshal.c#L1193-L1197
Could it be that this only can deal with arguments of type e.g. int[4096], i.e. a buffer of a fixed size? Could this be what the comment refers to?

Actually...The code with the failing assert is trying to allocate a GArray, not a C array. I guess it tries to completely eliminate the [out]-array from the function arguments and turn it into "just a return value".

If I add if(1) return FALSE; to the beginning of lgi_marshal_2c_caller_alloc(), the failed assertion goes away. I didn't yet try if the function actually works correctly / as expected with this change.

Adding an assert(0); to the code in question does not cause a test suite failure, so I am not quite sure why this is here. But it seems weird/fishy and seems to have an incorrect type check. I'll try to look more into it tomorrow.

@psychon
Copy link
Collaborator

psychon commented Apr 10, 2021

Random idea was to just skip things here:

diff --git a/lgi/marshal.c b/lgi/marshal.c
index f56b821..fde21a5 100644
--- a/lgi/marshal.c
+++ b/lgi/marshal.c
@@ -1194,7 +1194,18 @@ lgi_marshal_2c_caller_alloc (lua_State *L, GITypeInfo *ti, GIArgument *val,
 		elt_size =
 		  array_get_elt_size (g_type_info_get_param_type (ti, 0), FALSE);
 		size = g_type_info_get_array_fixed_size (ti);
-		g_assert (size > 0);
+		// FIXME: This used to be g_assert (size > 0);, but it is
+		// possible that this assert fails, see test
+		// gio.socket_receive_from.
+		// This whole code seems fishy: socket_receive_from() reaches
+		// here with a C array, but this code allocates a GArray.
+		// Most likely a stricter check on the type is required here.
+		// However, I could not find any valid code reaching here and
+		// thus tried to only make "minimal damage".
+		if (size == 0)
+		    {
+			break;
+		    }
 
 		/* Allocate underlying array.  It is temporary,
 		   existing only for the duration of the call. */

However, this does not work... Since the buffer argument is [out caller-allocates], the code supposedly tried to completely remove this argument from the argument list and instead allocate its own array.

Calling print(socket:receive_from()) results in the following call according to gdb (with the above patch):

g_socket_receive_from (socket=0x5555557b3280 [GSocket], address=0x7fffffffde18, buffer=0x55555558f2a8 "", size=140737488346664, cancellable=0x0, error=0x7fffffffde98)

So far: No idea where that size comes from, but it seems to be a valid pointer. In hex, it is 0x7fffffffde28, so "almost next to" address.

The print from the above call produces: userdata: 0x556c19392858 Error receiving message: Bad address

So... at least the GError works, but apparently the "normal C return value" was completely lost and instead we get... something.

Sigh. Dunno.

@gnulabis
Copy link

And yet it still hits the same assert. Even calling socket:receive_from() does. This made me notice:

https://github.com/pavouk/lgi/blob/a3d46f4f3cb1a3c19c61f46b856ee6683a2d57db/lgi/marshal.c#L1193-L1197

Disclaimer: I know almost nothing about Lua, and certainly nothing about gobject-introspection

But...

I was looking at v1993/linuxmotehook#5 to figure out why that one was not working for me and I came across the same assertion in marshal.c and the comment about only supporting fixed-sized arrays.

The thing is, in other places in the same file, there seems to be some logic to handle the case where g_type_info_get_array_fixed_size() returns a negative value, like this one:

https://github.com/pavouk/lgi/blob/a3d46f4f3cb1a3c19c61f46b856ee6683a2d57db/lgi/marshal.c#L494-L503

So I did a few checks and I saw that the assert is failing because the GITypeInfo *ti that we pass to g_type_info_get_array_fixed_size (ti) is an array indeed, but not of fixed size, so the function returns -1. On the other hand, I tried calling g_type_info_get_array_length(ti) and that one returns 2 (this is with the code from the linuxmotehook project, not your reduced example).

Unfortunately my understanding of the situation stops here. But it seems to me that if we were able to properly allocate based on the information returned by g_type_info_get_array_length(ti), it should work.

@v1993
Copy link
Contributor Author

v1993 commented Apr 25, 2021

The thing is, it should not be allocating array here at all, it is an output argument. Intended usage is that you pass your own array in and function fills it (returning some other info too).

@cipitaua
Copy link

hello, is there ANY workaround? this bug makes linuxmotehook totally unusable!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants