Exceptions when a TCPSocket connection timed out in and out of the Reactor using Kernel.timeout #97
Comments
What do you mean by "I cannot rescue the third one"? You only named two exceptions, and really the only one you need worry about is Celluloid::Task::TimeoutError. If we can figure out a solution to #56 (which is actually needed for the With Celluloid::FSM (which is, unfortunately, poorly documented) you can specify a timeout for a given state after which the FSM automatically transitions to another state, which can automatically run some code after the state transition. This makes it easy to build workflows for attempting something which will either induce a state transition when it completes, or setting a timeout which transitions to a different state which can handle the timeout error. No exception juggling required, just pure FSMs ;) |
The "third one" was the third example I described, that is, the exception caused by Kernel.timeout inside the actor when it actually times out (this stops the reactor and kills the actor). I know you are probably against using Kernel.timeout inside the actor, but in such cases I sometimes don't have a choice; the timeout call is already there :) I will check out the FSM solution and get back to you with more info about the experience. But I still think this is an important use case, concerning TCPSocket injection into an already existing library. Imagine a library like Net::SSH or Net::Telnet that has some logic inside using Kernel.timeout. It might be using Celluloid::IO::TCPSocket, but apart from that, it is using regular sleep and timeout there, right? |
Okay, here's a vaporware, strawman solution ;) |
@TiagoCardoso1983 I'm not sure it makes sense for Celluloid to try to handle That's probably not what you want to hear, but that's my idea for the vaporware |
Yup, I am of the same opinion as you. From what I understood, the Timeout::timeout call runs the block in a separate thread, and that is not very compatible with the reactor/mailbox/scratchoneofthem. Timeout::timeout receives as its second parameter the exception to raise. My first short-term proposal was to raise Celluloid::TimeoutError instead of TimeoutError. Execution stops at the libev level :S : ruby: ../libev/ev.c:3274: ev_run: Assertion `("libev: ev_loop recursion during release detected", ((loop)->loop_done) != 0x80)' failed.
So yes, that will also not help... I guess the best approach is, like you said, to "overwrite" the timeout primitive, that is, port the Celluloid::Actor#timeout code to the separate timeout gem and the kernel. Basically it has to check whether it is within the actor or not, and then fire the appropriate timeout, the only difference being that, instead of calling Kernel.timeout, it calls Timeout.timeout. I'll try that one, but that'd be my proposal for the vaporware. |
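For readers unfamiliar with the second parameter mentioned above, here is a minimal sketch using only the standard library, outside of any actor or reactor. `SketchTimeoutError` and `slow_call_with_custom_error` are hypothetical names invented for illustration; they stand in for something like a library-specific timeout error and are not part of Celluloid.

```ruby
require 'timeout'

# Hypothetical stand-in for a library-specific timeout error class.
class SketchTimeoutError < StandardError; end

# Runs a deliberately slow block under a short timeout and reports
# which path was taken. The second argument to Timeout.timeout is the
# exception class to raise when the time limit is exceeded.
def slow_call_with_custom_error
  Timeout.timeout(0.05, SketchTimeoutError) { sleep 1 }
  :finished
rescue SketchTimeoutError
  :timed_out
end
```

This only shows how the exception class is swapped. As the comment above reports, doing the same inside the reactor still failed at the libev level, so it is not a fix for the problem discussed here.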
That's actually a decent enough workaround, and the same approach Celluloid uses elsewhere (e.g. Celluloid::IO implicitly hijacks TCPSocket) |
Oh wait, you're redefining This is why I was proposing the timeout gem, although I was hoping to avoid core extensions, and see if we can get other gems to adopt a different API that supports pluggable timeout backends. |
Eheh, it seems redefining Kernel.timeout doesn't work as expected: it has already been included in the global namespace, so the redefinition doesn't take effect. Hence I'm redefining the global namespace timeout, which is used everywhere. Yes, I share your opinion, and an awful lot of gems already use Timeout::timeout. But Net::Telnet doesn't, and it is core Ruby... The concept I'm trying to work on is: on load, define my new timeout primitive, hijack the global namespace timeout, insert it in my primitive's namespace and make it work as a fallback. After that, redefine the global namespace timeout as my new timeout primitive. I haven't been quite successful with that yet. It seems that, after unbinding a method and binding it somewhere else, it still suffers from changes you make to the original object. Is there such a thing as method duplicates? |
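On the "method duplicates" question, one option that may be worth trying is to capture a Method object before redefining. The sketch below uses a made-up `Greeter` module and the constant `ORIGINAL_GREET` (both hypothetical, not taken from the thread) to show that a captured Method object keeps calling the body it was created from, even after the method is redefined.

```ruby
# Hypothetical module used only to illustrate the capture-then-redefine idea.
module Greeter
  def self.greet
    'original'
  end
end

# Capture the current implementation as a Method object.
# It keeps referring to the method body that existed at this point.
ORIGINAL_GREET = Greeter.method(:greet)

# Redefine the method afterwards.
module Greeter
  def self.greet
    'redefined'
  end
end

# Greeter.greet now runs the new body, while ORIGINAL_GREET.call
# still runs the old one and can serve as a fallback.
```

Whether this resolves the unbind/rebind behaviour described above depends on how the original was captured, so treat it as a starting point rather than a confirmed answer.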
Here's my second approach:

require 'timeout'
old_timeout = method(:timeout).unbind

# first step: define new implementation of timeout
module Celluloid
  module Timeout
    @old_timeout = method(:timeout).to_proc

    # celluloid-aware timeout implementation, falls back to the latest implementation of it
    def self.celluloid_timeout(duration, klass=nil, &blk)
      Celluloid::actor? ?
        Thread.current[:celluloid_actor].timeout(duration, &blk) :
        @old_timeout.call(duration, klass, &blk)
    end
  end
end

# second step: redefine Kernel.timeout
private
def timeout(*args, &blk)
  Celluloid::Timeout.celluloid_timeout(*args, &blk)
end

So, I take the global namespace timeout on load and use it as the fallback when not inside a Celluloid actor. By making it a proc, I ensure that later redefinitions will not affect this instance. And then I (gulp...) redefine the global namespace timeout. What do you think? I know, I know, redefining timeout sucks, but how can you get all gem APIs to abide by a new namespace on short notice? Celluloid would need this solution now, right? |
I don't want Celluloid itself to define |
Exactly, my implementation is a possible implementation of the vaporware "timeout" gem. What would a pluggable timeout API look like? |
I'd probably do something like this:

class Thread
  attr_accessor :timeout_handler
end

def timeout(time, &block)
  if timeout_handler = Thread.current.timeout_handler
    # pass the requested duration on to the pluggable handler
    timeout_handler.call(time, &block)
  else
    Timeout.timeout(time, &block)
  end
end |
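The proposal above can be sketched as a self-contained, runnable example. The module name `PluggableTimeout` is hypothetical (the gem discussed in this thread is vaporware), and the handler is assumed to be any callable that accepts the duration plus a block. To avoid touching the global `timeout`, the sketch exposes the entry point as a module method instead.

```ruby
require 'timeout'

# Per-thread hook, as proposed in the thread. Note that reopening
# Thread is a core extension.
class Thread
  attr_accessor :timeout_handler
end

# Hypothetical namespace for the pluggable entry point.
module PluggableTimeout
  # Delegates to the current thread's handler when one is installed,
  # otherwise falls back to the standard library's Timeout.timeout.
  def self.timeout(time, &block)
    handler = Thread.current.timeout_handler
    if handler
      handler.call(time, &block)
    else
      ::Timeout.timeout(time, &block)
    end
  end
end
```

A backend such as Celluloid could then install something like `Thread.current.timeout_handler = lambda { |time, &blk| ... }` on its actor threads, while threads without a handler keep the standard behaviour.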
Does indeed look interesting. I assume you would pass the routine:

Celluloid::actor? ?
  Thread.current[:celluloid_actor].timeout(duration, &blk) :
  @old_timeout.call(duration, klass, &blk)

or something similar to the timeout_handler then. When would that make sense? here? |
Celluloid can integrate with the (optional) gem directly, and initialize a custom timeout handler if the gem is loaded. I don't want to make it an explicit dependency though... more something someone like you who knows what they're doing can optionally include to solve this class of problems. |
I like the sound of it. The gem should definitely be an optional dependency. The gem should therefore "inject" the integration (in the snippet I mentioned, or maybe somewhere else) in Celluloid itself. |
It could work that way if Celluloid had a hook for this stuff. Right now it doesn't |
Somewhat related: #101. I guess that, even when using Celluloid::Actor#timeout, the descriptors are not being removed from the reactor selector. |
Is this still a problem? |
yup, at least until the timeout extensions are integrated in celluloid. |
I'm struggling with connections that fail to be established when using Celluloid::IO::TCPSockets. Currently my issue is with connections timing out.
Let's say I have a "timeoutable" hostname.
Which is consistent with what happens when using a regular TCPSocket.
Now, when I do that inside an actor...
It raises a Celluloid::Task::TimeoutError, which makes sense (but it kills the actor).
Now, other case:
This one raises an "execution expired" exception coming from the reactor. And kills the actor.
So, there are two issues for me. Let's say I have an actor whose job is to perform a task on x remote devices. Given that for some of them I cannot connect in time, how can I rescue the connection timeout without killing the actor in the process, and simply handle the logic, mark the remote location as unreachable and move on with my actor life? From what I tested, I can rescue Errno::ETIMEDOUT and Celluloid::Task::TimeoutError in the method and keep the actor alive. But I cannot rescue the third one, because that exception is thrown after the actor has been killed, so there is no way to keep it alive.
The second is regarding that last actor. As you have seen, I have used Kernel.timeout inside the reactor loop, which clearly doesn't play along very well. I know your first question is "uh, why did you use Kernel.timeout instead of Actor#timeout as in the second example?", but I'm using a wrapper to which I pass a proxy. This wrapper doesn't know anything about Celluloid, hence it calls the normal timeout. I don't have any influence over that. As far as I've seen, I can only use Actor#timeout inside the actor (it is an instance method, after all). How could I access the current actor and call the #timeout method in such a case?
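The "rescue, mark as unreachable, move on" pattern described above can be sketched in plain Ruby, outside of Celluloid. The helper name `probe_hosts` and its `connect:` parameter are hypothetical, and the default connector assumes the standard library's `Socket.tcp` with its `connect_timeout:` option. Inside an actor, the post suggests Celluloid::Task::TimeoutError would also need to be in the rescue list.

```ruby
require 'socket'

# Hypothetical helper: tries each host in turn and collects the ones
# that could not be reached, instead of letting the first failure
# abort the whole run.
def probe_hosts(hosts, port, connect: nil)
  # Default connector uses the stdlib; a stub can be injected for testing.
  connect ||= ->(host, p) { Socket.tcp(host, p, connect_timeout: 2) }
  unreachable = []
  hosts.each do |host|
    begin
      socket = connect.call(host, port)
      socket.close if socket.respond_to?(:close)
    rescue Errno::ETIMEDOUT, Errno::ECONNREFUSED, Errno::EHOSTUNREACH, SocketError => e
      # Record the failure and carry on with the next host.
      unreachable << [host, e.class]
    end
  end
  unreachable
end
```

This does not address the third case from the post, where the exception is raised after the actor has already been killed. It only illustrates the two cases that were reported as rescuable.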
I hope the second question can lead to an answer of the first one :)