-
Notifications
You must be signed in to change notification settings - Fork 138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add debug logging to odo. #502
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for posting this PR!
I think this could help with debugging in some cases but I am worried about the overhead when idle. Can you post some profiling data to see how much time is spent in dropped log lines?
from contextlib import contextmanager | ||
from warnings import warn | ||
|
||
import networkx as nx | ||
from datashape import discover | ||
from .utils import expand_tuples, ignoring | ||
|
||
log = logging.getLogger(__name__) | ||
log.addHandler(logging.NullHandler()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you explicitly logging registering a handler for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I misunderstood initially what NullHandler was for. It's not needed here and I will remove it.
x = f(x, excluded_edges=excluded_edges, **kwargs) | ||
return x | ||
except NotImplementedError as e: | ||
if kwargs.get('raise_on_errors'): | ||
raise | ||
warn(FailedConversionWarning(A, B, e)) | ||
log.warning('Failed Conversion {} -> {} ({})'.format(A, B, str(e))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we really need a log warning and a warning.warn?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if both are desired. The difference between warning.warn and log.warning is explained on this page https://docs.python.org/2/howto/logging.html. In short, warning.warn is intended for libraries where application code should be modified to accomodate a change in the library.
@@ -57,12 +60,15 @@ def _transform(graph, target, source, excluded_edges=None, ooc_types=ooc_types, | |||
ooc_types=ooc_types) | |||
try: | |||
for (A, B, f) in pth: | |||
log.debug('[{src} -> {dest}] {f} ({kwargs})'.format(f=f.__name__, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in general you should let logging libraries handle string formatting, this ensuers that no formatting happens unless the log message is being handled. logging
uses %
formatting so you can pass the values to format in as positionals after the message format.
Since the path is known before the loop, an alternative approach would be to log the path before traversing it. Though as far as performance goes, my initial impression is the calls to various conversion functions would far outweigh the overhead from calls to the log. |
I've done some simple (very simple) benchmarking/profiling. Yes, logging does add some overhead. If we are logging below the log level, each log call results in three function calls. getEffectiveLevel is probably the most expensive call (has a while loop that traverses the current logger and all its parents), the other two seem to be relatively cheap. I don't think any of odo's paths even approach 100 hops in length. I think my alternative solution will sidestep much of this overhead. Logging should be fine if we avoid calling it in the loop. Thoughts? import logging
log = logging.getLogger(__name__)
loops = 100
def withlogging():
y = 0
for x in range(loops):
log.debug("This is a debug message %s", x)
y += 1
def outlogging():
y = 0
log.debug("This is a debug message %s", loops)
for x in range(loops):
y += 1
def withoutlogging():
y = 0
for x in range(loops):
y+= 1
|
I left the log.debug call in the for loop for now. With logging, but no configuration (best of 7 runs) list -> DataFrame with 1 element. I tried to use a path that I was easy to benchmark and reasonable fast (not spend a whole lot of time with data structures. The path was 4 hops. EDIT: Updated timings |
These changes look good, could you rebase? After the ci runs on the rebase I will merge. |
I'll rebase hopefully this weekend. |
bb39be8
to
279b889
Compare
This was something that I added to an old fork I had of odo. I found it useful for understanding the route of transformations in odo and it also helped me understand an issue that I was having at the time.
Is logging like this something that is desirable for odo? The log will record the route that odo took transforming from source to destination.