Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

reload: fix to handle removed callables + cleanly remove intervals + missing priority error #1399

Conversation

HumorBaby
Copy link
Contributor

@HumorBaby HumorBaby commented Oct 27, 2018

EDIT: rebased; some comments no longer apply.

Attempt #2 (see #1398)

This commit attempts to address #1056 . @dgw did most of the work here. , as I am working off his fork. @dgw has two branches for #1056 (one for reloading with recursion, and one for preventing duplicates). Since they both address reloading issues, I combined the changes from both branches into one. On top of that, I don't think the fix submitted in #1314 works completely for all cases. For example, when a module is defined with callables from a submodule imported into the init, additional issues arise.

For example,
my_sopel_module/some_submodule.py

...
__all__ = ['callable1', 'callable2']
...

my_sopel_module/__init__.py

...
from .some_submodule import *
...

leads to issues in the recursion, where the same module is found as my_sopel_module.some_submodule, and some_submodule. I added a condition to check for . in the module name, and ignore if that is the case. This should still work under all previously passing tests.

@HumorBaby
Copy link
Contributor Author

Well darn. Can't figure out checkstyle errors for failed travis-ci

@dgw dgw added the Bug Things to squish; generally used for issues label Oct 27, 2018
@dgw
Copy link
Member

dgw commented Oct 27, 2018

@HumorBaby I figured out what's going on. pycodestyle and pyflakes added new checks, and the CI build doesn't depend on any particular version of flake8 so it just uses the latest. That's why the CI builds include errors for files you didn't even touch. (That was my biggest clue that it wasn't your fault.)

My local versions are up-to-date now, too. I'll review the changelogs for flake8 and its dependencies in more detail, and open a PR of my own that gets Sopel back to a clean checkstyle.sh run, and then you can work off of that branch. Likely within the next 24 hours or so. Sound good?

@HumorBaby
Copy link
Contributor Author

HumorBaby commented Oct 27, 2018

@dgw sounds good to me. No rush though. Thanks for looking into that.
How should I then proceed with the PR? Continue committing onto this branch or close this/open new PR? I don't want to make a mess.

My gut says new PR since I'm building off a new branch of yours. But I could see how its possible to bring those changes from your branch into this history...

@dgw
Copy link
Member

dgw commented Oct 27, 2018

@HumorBaby You can continue committing on this branch; don't worry about the history for now. I'll walk you through the rebase process to clean it later, if/when the PR becomes ready to merge. It's much easier to have the evolution of a change all in one place, than to chase it between several closed PRs. 😸

For now, just keep doing your thing and ignore the CI errors from style checks. Sometimes the search module tests throw random errors, too, and you can also ignore those if they happen. The only errors you should worry about on a PR, in most cases, are errors in files you touched. Most other errors can be safely ignored, except in specific circumstances.

@HumorBaby
Copy link
Contributor Author

@dgw will do. Thanks again

@HumorBaby
Copy link
Contributor Author

Note for myself: under b2d805d (reload fix with recursion + no dupes), reloading a module that failed to load during startup will throw a KeyError because my_module is not in sys.modules.

bot_1  | Traceback (most recent call last):
bot_1  |   File "/home/bot-dev/pkgs/sopel/bot.py", line 478, in call
bot_1  |     exit_code = func(sopel, trigger)
bot_1  |   File "/home/bot-dev/pkgs/sopel/modules/reload.py", line 53, in f_reload
bot_1  |     reload_module_tree(bot, name)
bot_1  |   File "/home/bot-dev/pkgs/sopel/modules/reload.py", line 59, in reload_module_tree
bot_1  |     old_module = sys.modules[name]
bot_1  | KeyError: u'tldx'

@dgw
Copy link
Member

dgw commented Oct 28, 2018

If a module failed to load, it probably shouldn't be reloadable. Reloading should just fail in that case with a "module not loaded, try the load command" error.

HumorBaby added a commit to HumorBaby/sopel that referenced this pull request Nov 5, 2018
@dgw dgw added this to the 7.0.0 milestone Dec 26, 2018
@dgw dgw added Bugfix Generally, PRs that reference (and fix) one or more issue(s) and removed Bug Things to squish; generally used for issues labels Jan 3, 2019
@dgw
Copy link
Member

dgw commented Jan 17, 2019

@HumorBaby With #1314 merged, this will need a rebase. The history is messy enough that it was going to need one anyway… 😁 I'm sure you were already planning to!

But I found another bit/piece in commit 03215f6 that you might or might not find useful as you continue working on this. It's been a while since I touched it and I don't remember what the test results were like. It probably didn't work very well and/or broke something, since I never put it in #1314, but it's still an idea…

@HumorBaby
Copy link
Contributor Author

@dgw sounds good. I will get back in touch with this issue ASAP. I'm curious to see how #1314 affects this now. Maybe its not even an issue any more!

@dgw
Copy link
Member

dgw commented Jan 18, 2019

@HumorBaby I'm hoping that the last remaining edge cases can be fixed in Sopel 7. For example, #1314 does not (IIRC) handle removed methods well, only added and updated ones. :)

@HumorBaby
Copy link
Contributor Author

@dgw I didn't get around to testing removed methods either. I can take a look when I'm stepping through the reload process. Reptarsaurus in #sopel mentioned that @interval'd methods were still duplicated. I imagine this is related to method removal also and will add these to test cases for this fix. Let me know if you think otherwise

@dgw
Copy link
Member

dgw commented Jan 18, 2019

@HumorBaby I'm really not sure what's up with @interval methods. It seems like jobs are added when a callable is registered:

sopel/sopel/bot.py

Lines 245 to 248 in 73fd522

for func in jobs:
for interval in func.interval:
job = sopel.tools.jobs.Job(interval, func)
self.scheduler.add_job(job)

…and cleared when any callable (#831) is unregistered:

sopel/sopel/bot.py

Lines 219 to 222 in 73fd522

if hasattr(obj, 'interval'):
# TODO this should somehow find the right job to remove, rather than
# clearing the entire queue. Issue #831
self.scheduler.clear_jobs()

…but clearly (ha!) something is wrong with this logic—beyond the problem of deleting all jobs from the scheduler when we only want to remove a specific one. I don't think that a hypothetical @interval method would still be duplicated if it was removed before reloading its module, but haven't tested it yet.

@HumorBaby HumorBaby force-pushed the 1056-dgw-fix-with-recursion-no-dupes branch from 4e77958 to a3e143c Compare January 24, 2019 13:19
Copy link
Contributor Author

@HumorBaby HumorBaby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments are for me to remember why I had made certain changes before the rebase. Also, I can design my tests around these changes I thought were important at the time of beginning this fix.

sopel/loader.py Show resolved Hide resolved
sopel/modules/reload.py Outdated Show resolved Hide resolved
sopel/modules/reload.py Show resolved Hide resolved
@HumorBaby HumorBaby force-pushed the 1056-dgw-fix-with-recursion-no-dupes branch from a3e143c to f1956f8 Compare February 1, 2019 20:27
@HumorBaby
Copy link
Contributor Author

HumorBaby commented Feb 1, 2019

Force push was to get the latest tests as I start re-start working on this. Since this PR touches sopel/loader.py, I figured I would bring in test_loader.py from #1424.

@HumorBaby
Copy link
Contributor Author

HumorBaby commented Feb 8, 2019

Oh dear... turns out removed functions from a module are not only not-removed, but also get duplicated repsonses 😢 just fyi... looking into this first as a means to solve the actual problem at hand: remove removed callables

@HumorBaby HumorBaby closed this Feb 15, 2019
@HumorBaby HumorBaby reopened this Feb 15, 2019
@HumorBaby
Copy link
Contributor Author

Oops!

@dgw
Copy link
Member

dgw commented Feb 28, 2019

@HumorBaby I remember you were playing around with this a week or two ago. Did that go anywhere?

@HumorBaby
Copy link
Contributor Author

@dgw, I haven't looked at it too much since then because tracing through .reload was frying my brain. Most recently, I ended up with a conclusion similar to @Exirel

Overall, "reload" in Python is very hard, and I'm not sure we can really fix it. There are too many use cases and corner cases where we don't have the right tools for that.

However, I don't think that it is impossible. It will just require a little bit of 🔨 to make it fit. My goal is still to minimize changes to the current implementation to avoid introducing new issues before the whole process is replaced by something likes #1490/Py3. It feels within reach

In the mean time, the following changes were made (or will be made immediately following this):

  1. Change title; the original title regarding the loading of modules in subpackages is no longer relevant as it was resolved by the reload: Recurse into packaged modules #1314 merge
  2. Push a commit addressing the issue of all @interval'd jobs being cleared.

@HumorBaby HumorBaby changed the title reload: fix issue with reloading modules within subpackage reload: fix to handle removed callables + cleanly remove intervals Feb 28, 2019
Previously, all callables using `@interval` were removed when a single
module was reloaded. Now, the `JobScheduler` can remove a single
callable from the jobs queue, and `bot.unregister` has been updated to
use `JobScheduler.remove_callable_job` instead of
`JobScheduler.clear_jobs`.
@HumorBaby HumorBaby force-pushed the 1056-dgw-fix-with-recursion-no-dupes branch from f1956f8 to 6d3cabb Compare February 28, 2019 14:20
@HumorBaby
Copy link
Contributor Author

RE: removing @interval'd jobs

While #1053 already proposed a solution to this issue:

def del_job(self, del_job):
"""Iterates through the job queue and re-adds all but the given job"""
while self._jobs.qsize():
job = self._jobs.get()
if job is not del_job:
self._jobs.put(job)

That approach of iterating the job queue didn't sit quite right with me. Given the nature of Queues in python (and the time-based access of the queue itself by the scheduler thread), I wonder if such a loop would ever even terminate (or ever get caught in the middle by the job-caller since no lock is used). Never got around to testing it myself, nor do I know whether it was actually tested in #1053. So, I just committed what I had come up with before I noticed that PR:

sopel/sopel/tools/jobs.py

Lines 93 to 104 in 6d3cabb

def remove_callable_job(self, callable):
"""Removes specific callable from job queue"""
with self._mutex:
new_queue = PriorityQueue()
while True:
try:
job = self._jobs.get(False)
if job.func != callable:
new_queue.put(job)
except Queue.Empty:
break
self._jobs = new_queue

I will gladly pick this out of #1053 (since that work was done first) if anyone believes it to be a safe solution to this issue 😄

Note: while this solves the problem of removing only the reloaded @interval'd jobs, it does not deal with the issue of removed @interval'd jobs. Interestingly, however, unlike normal callables, the @interval'd jobs are do not produce an echo after remove+reload (see #1399 (comment))

@HumorBaby
Copy link
Contributor Author

@dgw, how do you feel about changing this PR to deal solely with @interval'd issues? It has already changed from its original purpose once. The updated PR would resolve the following issues:

  • don't clear the entire job queue to remove a single job
  • don't execute @interval'd callables if the bot is not connected

This would let the issue of .reload not clearing non-@interval'd methods causing echoes (which I think is the last .reload bug) get its own PR if/when ready.

@dgw
Copy link
Member

dgw commented Mar 5, 2019

@HumorBaby It's kinda whatever. Fix what you can, when you can, you know? 😁 And retitle it to match when you update, of course.

Is there a currently open issue that would be good to use for tracking all the reload bugs/quirks, or would you rather just open a fresh one and populate it with a checklist of the current state? Maybe with references to the other issues/PRs open that relate to it? I'm kind of swamped both IRL and on GitHub at present, but it would be amazing if you have time to put together something like that!

@HumorBaby
Copy link
Contributor Author

HumorBaby commented Mar 12, 2019

@dgw desribed an issue where calling .reload * leads to AttributeError: 'function' object has no attribute 'priority' (file "/home/dgw/github/sopel/sopel/bot.py", line 216, in unregister). Specifically, this was due to the sopel-github module being enabled (error disappeared when module was excluded). The issue was seen using the 6.6.x branch.

Preliminary debug-prints indicated that callables did in fact "have priority attributes…" and "they get the default 'medium' assigned just like they should".

@HumorBaby suggested that non-callables (not called by the bot) were triggering this issue, but alas there was an "AttributeError in bot.unregister() on object issue_info".

The best current hypothesis is: "so it's still possibly a weird thing with callables from not-__init__.py but a really obscure one", potentially stemming from the previous recursion fix for .reload

@HumorBaby HumorBaby changed the title reload: fix to handle removed callables + cleanly remove intervals reload: fix to handle removed callables + cleanly remove intervals + missing priority error Mar 12, 2019
If a callable was imported like from .module import *, the recursion
used in `reload.py` would catch the callable twice. The first time
around, when the object name has the format `module.submodule...`, the
callable obj has the rules as uncompiled regexp strings and no
associated priority, and so causes issues in the `bot.unregister`
method. The second time, the object name is just the object name and now
rules have been compiled (with associated priorities). This is **only**
observed with the `@rule`-decorated callables, and not with
`@command`-decorated ones.

While the object that with the `module.submodule...` in the name does
actually exist, it is not used anywhere and just hangs around. It's like
a human wisdom tooth... Since it has no effect on bot function, as it is
never registered and the bot is unaware of its existence, we can just
ignore it by checking if the callable object name contains a `.` Upon
finding a `.`, we just skip `unregister`'ing it.

Also, while dealing with this, I added a check to ensure that compiled
rules are added to a callable's rules list only once. While not directly
contributing to the issue, it was tangentially related, and worth
refactoring.
@HumorBaby
Copy link
Contributor Author

re: the .reload github issue leading to AttributeError: ... 'priority' ... descirbed by @dgw

It turns out that it was directly related to the original issue for which I opened the PR (which has now become this box of smashed crabs). @sopel.module.rule'd objects have a ghost object that does not have compiled rules and no associated priorities. These spectral objects only become an issue when a module has a from xyz import * which leads .reload to find the callables multiple times as it recurses through the module files.

One quick and dirty fix is the just ignore objects with . in their name (see commit message). It may be worth investigating the order that callables with rules are decorated and regexp compiled.... or just wait until @Exirel's new interface is merged 😄

@dgw
Copy link
Member

dgw commented Mar 13, 2019

It may be worth investigating the order that callables with rules are decorated and regexp compiled.... or just wait until @Exirel's new interface is merged 😄

I can almost guarantee that the new interface will get merged before this (or anything else related to the issues with .reload), so we can have the maximum development time with the new plugin interface and shake out as many bugs as possible before 7.x releases.

Hopefully that won't be too much of an issue for you, what with having to take each of your fixes and figure out if it's still needed under the new system, then port the patch if so. 😁

This week is a busy one for me IRL (tech week for an operetta) but I still hope to find the time for reviewing #1479 and #1490 by the end of March. I hope they can be merged without too much fuss, so we can move from the overall system to the smaller bugs like what this PR wants to fix.

@HumorBaby
Copy link
Contributor Author

Sounds good to me 😄 I'll just drop keep dropping commits in here to keep track of things then

@dgw
Copy link
Member

dgw commented Apr 4, 2019

Is this still needed after #1479, @HumorBaby? I'm not sure if you and @Exirel have combined efforts.

@HumorBaby
Copy link
Contributor Author

HumorBaby commented Apr 4, 2019

This PR can be closed. I imagine @Exirel has these issues covered 😃. I think anything that arises after the new plugin interface should get its own issue/PR .

Edit: from what I can see, the drivers of this PR are no longer an issue in #1479

@dgw
Copy link
Member

dgw commented Apr 4, 2019

🎉

@dgw dgw closed this Apr 4, 2019
@dgw dgw added the Replaced Superseded by a newer PR label Apr 4, 2019
@Exirel
Copy link
Contributor

Exirel commented Apr 4, 2019

I can confirm this PR has been more or less "merged" into #1479 in a co-authored commit: ab8f174

@HumorBaby HumorBaby deleted the 1056-dgw-fix-with-recursion-no-dupes branch August 28, 2019 14:13
@dgw dgw removed this from the 7.0.0 milestone Dec 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bugfix Generally, PRs that reference (and fix) one or more issue(s) Replaced Superseded by a newer PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants