
[WIP] Add ability to configure logging for debugging #360

Open
wants to merge 9 commits into base: master

Conversation

josh146
Member

@josh146 josh146 commented Apr 12, 2020

Context:

When debugging configuration and API access, it's useful to be able to output debugging logging information to a file. This can then be shared when requesting help.

Currently, Strawberry Fields logs internal state and warnings, but requires the user to have knowledge of the Python logging module in order to modify the logging level or output the log information to a file.

Description of the Change:

  • Adds a new section to the configuration pertaining to logging:

    [logging]
    # default logger output on the terminal
    level = "info"
    # optional logfile output
    logfile = "sf.log"

    By changing the level option, the user can specify the types of log messages that are displayed in the terminal when using the default SF logger. For instance, by setting level = "debug", more detailed information is output during execution.

    By setting the logfile option, the user can specify a logfile that is appended to during SF execution. Currently, the logfile contains all logged messages, of level DEBUG and up.

    Example logfile created when submitting a remote program:
    2020-04-12 21:12:26,640 - DEBUG - Configuration file /home/josh/.config/strawberryfields/config.toml loaded
    2020-04-12 21:12:26,641 - DEBUG - Loaded configuration: {'api': {'authentication_token': 'm8Tox***********************************', 'hostname': 'platform.strawberryfields.ai', 'use_ssl': True, 'port': 443}, 'logging': {'level': 'debug'}}
    2020-04-12 21:12:26,641 - DEBUG - Compiling program for target X8_01 with compile options {}
    2020-04-12 21:12:26,964 - DEBUG - Submitting job
    name test
    version 1.0
    target X8_01 (shots=10)
    
    S2gate(0, 0) | [2, 6]
    S2gate(1, 0.0) | [1, 5]
    S2gate(0.0, 0.0) | [3, 7]
    S2gate(1.0, 0.0) | [0, 4]
    MZgate(5.3074357926500895, 1.5739876247864792) | [0, 1]
    MZgate(4.279290161703878, 4.202979995136468) | [2, 3]
    MZgate(5.824121814032434, 5.033017799971674) | [1, 2]
    MZgate(5.198882953351497, 3.715902833866422) | [0, 1]
    MZgate(4.712954779003035, 3.107023958221405) | [2, 3]
    MZgate(4.83853796968657, 5.988756943006127) | [1, 2]
    Rgate(0.6651640322699326) | 0
    Rgate(0.06106261020227405) | 1
    Rgate(2.94378117796741) | 2
    Rgate(1.820738947340938) | 3
    MZgate(5.3074357926500895, 1.5739876247864792) | [4, 5]
    MZgate(4.279290161703878, 4.202979995136468) | [6, 7]
    MZgate(5.824121814032434, 5.033017799971674) | [5, 6]
    MZgate(5.198882953351497, 3.715902833866422) | [4, 5]
    MZgate(4.712954779003035, 3.107023958221405) | [6, 7]
    MZgate(4.83853796968657, 5.988756943006127) | [5, 6]
    Rgate(0.6651640322699326) | 4
    Rgate(0.06106261020227405) | 5
    Rgate(2.94378117796741) | 6
    Rgate(1.820738947340938) | 7
    MeasureFock() | [0, 1, 2, 3, 4, 5, 6, 7]
    
    2020-04-12 21:12:27,608 - INFO - Job d101fc18-289f-442b-a899-3d5a6d40fffc was successfully submitted.

    Note: these options only affect the built-in SF logger handlers. If the user creates their own custom logging handler, this overrides the SF built-in handlers.

  • Adds a DEBUG log message on job submission, logging the submitted blackbird script.

  • Adds a DEBUG log message on RemoteEngine.run_async(), providing details on program compilation if it occurs.

  • Adds an INFO log message if the job is complete after a refresh

  • Adds a sf.configuration.SESSION_CONFIG global variable that stores the configuration of Strawberry Fields at first import. This is used to configure the logger; the configuration of the logger should not change after this.

  • load_config() now logs debugging information, including the loaded configuration. Note that loaded API tokens are automatically masked.

  • Minor refactor of the sf.configuration module:

    • The logging section has been added to DEFAULT_CONFIG_SPEC.

    • Some existing functions had hardcoded dependence on the "api" section of the configuration file. This has been removed; now no functions, except for store_account(), depend on the structure of the "api" section. As a result, create_config has been replaced with _generate_config, which automatically generates a default configuration object given a configuration spec.

    • find_config_file seemed to duplicate the logic in get_available_config_paths, so it now depends on the latter.

    • get_api_section was used only in load_config. Since load_config is now agnostic of the configuration structure, this function is redundant and has been removed.

    • keep_valid_options was removed; in testing, filtering out unknown options led to a bad user experience. Typos (e.g., prot instead of port) would be silently pruned from the loaded configuration, making debugging difficult.

  • Since the configuration now contains more than just an api section, store_account loads the existing configuration, modifies it as requested, and then re-writes it to disk. This prevents non-API configuration from being overwritten.
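To make the behaviour described above concrete, here is a minimal sketch of a logger built from the [logging] section. The helper name and formatting details are illustrative, not the PR's exact code:

```python
import logging


def create_logger(config):
    """Configure the SF logger from a loaded configuration dict.

    Sketch only: the real helper in this PR may differ in details.
    """
    log_config = config.get("logging", {})
    level = getattr(logging, log_config.get("level", "info").upper(), logging.INFO)

    logger = logging.getLogger("strawberryfields")
    logger.setLevel(logging.DEBUG)  # handlers filter below this

    # Terminal handler honours the configured level
    stream = logging.StreamHandler()
    stream.setLevel(level)
    logger.addHandler(stream)

    # Optional logfile captures everything at DEBUG and up
    logfile = log_config.get("logfile")
    if logfile is not None:
        fh = logging.FileHandler(logfile)
        fh.setLevel(logging.DEBUG)
        fh.setFormatter(logging.Formatter("%(asctime)s - %(levelname)s - %(message)s"))
        logger.addHandler(fh)

    return logger
```

With level = "debug" in the configuration, the terminal handler passes DEBUG records through; with the default "info", only INFO and above reach the terminal, while a configured logfile always records DEBUG and up.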

Benefits:

  • Provides an interface for easy logging configuration for the user.

Possible Drawbacks:

  • Currently, create_logger requires a loaded configuration to configure the logger. However, load_config creates a logger in order to log details about finding/loading the configuration! This leads to a circular dependency. At the moment, this is avoided by passing logging=False to load_config on first load, turning off logging (since the logger has not been created yet). However, this is ugly and bad form: if something goes wrong with the user's logging configuration, it won't be logged!

  • We could consider having a function sf.configure_logger() in order to configure the logging section of the config file within SF, similar to how sf.store_account() configures the API section. E.g.:

    >>> sf.configure_logger(level="debug", logfile="sf.log")

    However, since the logger is configured on import, these settings would not take effect until the next SF import.
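The bootstrap workaround described in the first drawback could look roughly like this (all names here are illustrative stand-ins, not the PR's actual code):

```python
import logging

_logger = None  # created lazily, after the first config load


def load_config(logging_enabled=True):
    # Stand-in for the real TOML parsing in sf.configuration
    config = {"logging": {"level": "info"}}
    if logging_enabled and _logger is not None:
        _logger.debug("Loaded configuration: %s", config)
    return config


def bootstrap():
    """The first load must pass logging_enabled=False: the logger
    cannot exist before the configuration that defines it does."""
    global _logger
    session_config = load_config(logging_enabled=False)
    level = session_config["logging"]["level"].upper()
    _logger = logging.getLogger("strawberryfields")
    _logger.setLevel(getattr(logging, level, logging.INFO))
    return session_config
```

The ugliness is visible in the sketch: any problem during that very first load_config call happens before a logger exists, so it cannot be logged.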

Related GitHub Issues: n/a



class ConfigurationError(Exception):
"""Exception used for configuration errors"""


def load_config(filename="config.toml", **kwargs):
def _deep_update(source, overrides):
Contributor

Or something comparable, so that it's more apparent within the function when read

Suggested change
def _deep_update(source, overrides):
def _deep_update(old_dict, new_dict):

def load_config(filename="config.toml", **kwargs):
def _deep_update(source, overrides):
"""Recursively update a nested dictionary.

Contributor

Here the Args part seems to be missing

if isinstance(value, collections.Mapping) and value:
# Override value is a non-empty dictionary.
# Update the source key with the override dictionary.
returned = _deep_update(source.get(key, {}), value)
Contributor

Splitting this up might make it easier to grasp the small steps here

Suggested change
returned = _deep_update(source.get(key, {}), value)
source_key = source.get(key, {})
returned = _deep_update(source_key, value)

Comment on lines +34 to +51
"""dict: Nested dictionary representing the allowed configuration
sections, options, default values, and allowed types for Strawberry
Fields configurations. For each configuration option key, the
corresponding value is a length-2 tuple, containing:

* A type or tuple of types, representing the allowed type
for that configuration option.

* The default value for that configuration option.

.. note::

By TOML convention, keys with a default value of ``None``
will **not** be present in the generated/loaded configuration
file. This is because TOML has no concept of ``NoneType`` or ``Null``,
instead, the non-presence of a key indicates that the configuration
value is not set.
"""
Contributor

This is great! 🎊

@@ -154,6 +154,8 @@ def create_job(self, target: str, program: Program, run_options: dict = None) ->

circuit = bb.serialize()

self.log.debug("Submitting job\n%s", circuit)
Contributor

Would be good to keep the level consistent here and on line 164: self.log.info("Job %s was successfully submitted.", job_id). Either both could be DEBUG or both could be INFO.

One thing I was wondering about: for errors, how should we handle logging? Would it make sense to have the same message when instantiating the error and also log that at the same time?

Member Author

Good question! Errors should definitely be logged, the generated log files should have a running commentary of what happens during program execution.

One way of doing this could be using logger.exception.

This article seems to provide a nice overview of potential logging/exception patterns: https://www.loggly.com/blog/exceptional-logging-of-exceptions-in-python/
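As a sketch of the logger.exception pattern mentioned above (which logs at ERROR level and automatically appends the current traceback), with a placeholder failure standing in for a real API error:

```python
import logging

logger = logging.getLogger("strawberryfields")


def submit(job_payload):
    """Illustrative only: the failure below is a placeholder."""
    try:
        raise ConnectionError("platform unreachable")  # placeholder failure
    except ConnectionError:
        # logger.exception logs at ERROR level and attaches the current
        # traceback, so the logfile keeps a full record of what went wrong.
        logger.exception("Job submission failed")
        raise
```

This keeps the running commentary in the logfile intact while still propagating the error to the caller.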

Member Author

I originally had this as info, but I'm not so sure they both belong at the same level. From the Python docs:

DEBUG: Designates fine-grained informational events that are most useful to debug an application.
INFO: Designates informational messages that highlight the progress of the application at a coarse-grained level.
  • Job submission provides information about the progress of the job submission at a coarse-grained level. Seems particularly well suited, given that info is the default level, and will be displayed on the output by default.

  • Logging the submitted circuit is instead more fine-grained debugging data. In most cases, users won't want this displayed as 'info', since the submitted circuit should be evident from their Python code (or determined via prog.compile()). However, this will be useful information when debugging.

Contributor

Sounds good! Happy if these are kept as they are right now :)

match the expected type defined in the configuration spec.
"""
res = {}
for k, v in config_spec.items():
Contributor

Is there any use case for being able to pass config_spec to this function? Most of the times one would be using DEFAULT_CONFIG_SPEC, but now the function allows any nested dictionary structure. Not sure if the complexity arising from this is worth having.

To me, it seems that the block within the isinstance(v, tuple) case could be renamed and organized into a new function. Since we know that the maximum depth of nesting is 2, using recursion here seems like a general solution to a specific problem; avoiding it could yield a simpler line of logic.

Contributor

I probably agree with @antalszava that recursion probably should be avoided if not completely necessary (like in this case since only a very shallow, specific depth of 2 is used), in favour of a more explicit solution.

Member Author

@josh146 josh146 Apr 14, 2020

Ah, good point 🙂 I tried to keep this as general as possible, because TOML allows for nested sections. For example, in the future, we might have something like this:

[api]
hostname = "platform.strawberryfields.ai"

  [default]
  token = "personal_token"

  [org1]
  token = "organization_token"

When loaded via toml.load, it would look like this:

{
  "api": {
    "hostname": "platform.strawberryfields.ai",
    "default": {"token": "personal_token"},
    "org1": {"token": "organization_token"}
  }
}

I suppose it's a question of do we want to disallow nested TOML config files (in which case it makes sense to restrict to simply using DEFAULT_CONFIG_SPEC here), or if we want to keep this function general.

Contributor

That's a good point, yes :) I'd be leaning towards keeping it simple for now and updating it whenever a specific nested section is introduced for the configuration (but this is a subjective point of view). Overall, this would come with the

  • Advantage: the configuration related code is tailored onto the current needs, is slightly simpler and easier to be picked up
  • Disadvantage: once a nested section is being added, the logic here will have to change to this more general solution

**Example**

>>> _generate_config(DEFAULT_CONFIG_SPEC, api={"port": 54})
{
Contributor

@antalszava antalszava Apr 12, 2020

How come that the logging section is not included when using DEFAULT_CONFIG_SPEC? To me, it is something unexpected

Edit: added a suggestion, though this might once again change if the suggestion of having "" as default filename is accepted.

Member Author

Oh, I think I just forgot to update the docstring!

Contributor

Oh right that makes sense :)


DEFAULT_CONFIG_SPEC = {
"api": {
"authentication_token": (str, ""),
"hostname": (str, "platform.strawberryfields.ai"),
"use_ssl": (bool, True),
"port": (int, 443),
},
"logging": {"level": (str, "info"), "logfile": ((str, type(None)), None)},
Contributor

@antalszava antalszava Apr 12, 2020

How about this one to go with the same as for authentication_token?

(Edit: although, it also came to my mind that the advantage of leaving it as is would keep the distinction between required and not required options)

Suggested change
"logging": {"level": (str, "info"), "logfile": ((str, type(None)), None)},
"logging": {"level": (str, "info"), "logfile": (str, "")},

Contributor

Just for readability, I think it would be nicer to always write JSON-style nested dictionaries on several lines. 🙂

Suggested change
"logging": {"level": (str, "info"), "logfile": ((str, type(None)), None)},
"logging": {
"level": (str, "info"),
"logfile": ((str, type(None)), None)
},

@antalszava
Contributor

antalszava commented Apr 12, 2020

Seems like the circular dependency could be resolved by having module imports: https://stackoverflow.com/a/22210807/12899253

Edit: locally I also had to separate the logging-options parsing into a load_logging_config() function, which is then used in create_logger.

Overall the overhead of pulling the logging options each time create_logger runs does not seem to be something too significant.

This is also the reason why in #359 such a solution was chosen for Connection.__init__ and the previous module attribute was removed. This would also help in having a configure_logger: since the logging details are pulled dynamically, after a call to configure_logger, create_logger will indeed use the new options.
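The dynamic-pull approach suggested here might be sketched as follows (all names are illustrative; load_logging_config stands in for re-reading the real configuration file on each call):

```python
import logging


def load_logging_config():
    """Pull just the logging options, re-reading configuration each call.

    Stand-in: a real implementation would parse the config file here,
    so changes written by a hypothetical configure_logger() are picked
    up on the next call rather than only at import time.
    """
    return {"level": "info", "logfile": None}


def create_logger(name="strawberryfields"):
    # Options are fetched at call time, so no module-level config
    # attribute is needed at import, avoiding the circular dependency.
    opts = load_logging_config()
    logger = logging.getLogger(name)
    logger.setLevel(getattr(logging, opts["level"].upper(), logging.INFO))
    return logger
```

The per-call re-read is the "overhead" referred to above; since logger creation is rare, it is likely negligible in practice.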

Contributor

@thisac thisac left a comment

I think this looks great! 💯 Only a few comments, but nothing major.



Comment on lines 25 to +31
DEFAULT_CONFIG_SPEC = {
"api": {
"authentication_token": (str, ""),
"hostname": (str, "platform.strawberryfields.ai"),
"use_ssl": (bool, True),
"port": (int, 443),
},
Contributor

Just curious; is the type defined for each value something pythonic or is it only part of this code (it looks like the type validation is done in _generate_config below)? Nice either way. 👍

Member Author

Nothing Pythonic, just a way of having a single 'source of truth' for the allowed configuration at the top of the file 🙂 This means that changes to the configuration/default values/allowed types require simply modifying this variable; all subsequent functions are independent of the allowed keys/options/types.
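To illustrate the 'source of truth' idea, a small sketch of the kind of validation such a spec enables (check_option is a hypothetical helper, not the PR's exact code):

```python
# Spec format from the PR: each leaf is (allowed_type(s), default).
DEFAULT_CONFIG_SPEC = {
    "api": {
        "authentication_token": (str, ""),
        "hostname": (str, "platform.strawberryfields.ai"),
        "use_ssl": (bool, True),
        "port": (int, 443),
    },
    "logging": {"level": (str, "info"), "logfile": ((str, type(None)), None)},
}


def check_option(section, option, value, spec=DEFAULT_CONFIG_SPEC):
    """Validate a single loaded option against the spec's allowed types.

    isinstance accepts either a single type or a tuple of types, so the
    (str, type(None)) leaf for logfile works unchanged.
    """
    allowed_types, _default = spec[section][option]
    return isinstance(value, allowed_types)
```

Any later change to the allowed configuration is then a one-line edit to DEFAULT_CONFIG_SPEC.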


Comment on lines +143 to +144
res[k] = _generate_config(v, **kwargs.get(k, {}))
return res
Contributor

(apart from maybe not needing to use recursion) really nice! 🎉

Comment on lines +214 to +218
if logging:
config_details = "Loaded configuration: {}".format(config)
auth_token = config.get("api", {}).get("authentication_token", "")
config_details = config_details.replace(auth_token[5:], "*" * len(auth_token[5:]))
log.debug(config_details)
Contributor

Nice! 👍

Member Author

A bit hacky, though 😆

@@ -46,7 +47,8 @@
"hostname": "platform.strawberryfields.ai",
"use_ssl": True,
"port": 443,
},
'logging': {'level': 'info'}
Contributor

Suggested change
'logging': {'level': 'info'}
"logging": {"level": "info"}

tests/frontend/test_configuration.py (outdated conversation, resolved)
@thisac
Contributor

thisac commented Apr 13, 2020

@josh146 There seems to be an issue with updating from environment variables. test_environment_variables_take_precedence_over_conf_file is failing, and from a quick look it seems like the environment variables never overwrite the config (I haven't run it yet though, only briefly looked at the code). 🐛
