python3 transition
Some of the automatic python2 to python3 translation can be performed with the standard 2to3 tool available with any python3 installation.
It can be done with a command like:
2to3 bin/ -w -n
which would update all the files under the `bin/` directory from python2 to python3. Some wrong and/or sub-optimal updates to note are:
- it fails to deal with `meta_classes`. The way forward here would be to manually update the class to `MyClass(parent, metaclass=MetaClass)`
- conflicts with `past.builtins`. E.g., it changes `from past.builtins import basestring` to `from past.builtins import str`. The way forward with that is either to delete that line completely, or to change it to `from builtins import str`.
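As a minimal sketch of the two manual fixes above, the snippet below shows the Python 3 metaclass syntax and the built-in `str` used without any `past.builtins` import. The names `RegistryMeta`, `Base`, `Plugin`, and `is_text` are hypothetical and only serve the illustration.

```python
class RegistryMeta(type):
    """Hypothetical metaclass that records the name of every class it creates."""
    registry = []

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        RegistryMeta.registry.append(name)


class Base:
    """Hypothetical parent class."""


# Python 3 syntax: the metaclass is passed as a keyword argument,
# replacing the Python 2 `__metaclass__ = RegistryMeta` class attribute.
class Plugin(Base, metaclass=RegistryMeta):
    pass


def is_text(value):
    # In Python 3 the built-in str needs no import at all, so the
    # `from past.builtins import ...` line can simply be deleted.
    return isinstance(value, str)
```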
Some useful grep commands to spot what needs to be updated are listed below:
egrep -rI 'PY2|PY3|python' * | grep -v 'env python'
grep -I -r 'from future' *
grep -I -r 'from __future' *
grep -I -r 'from past' bin/*
Note that some scripts don't necessarily have the `.py` extension, so searching only for python files is an incomplete solution.
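One possible way to also catch scripts without the `.py` extension is to look at the shebang line. The sketch below is an assumption about how this could be done, not an existing WMCore tool; `find_python_scripts` and `SHEBANG_RE` are hypothetical names.

```python
import os
import re

# Matches shebang lines such as "#!/usr/bin/env python" or "#!/usr/bin/python3".
SHEBANG_RE = re.compile(rb"#!.*\bpython")


def find_python_scripts(top):
    """Yield paths under `top` that look like Python, by extension or by shebang."""
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if name.endswith(".py"):
                yield path
                continue
            try:
                with open(path, "rb") as handle:
                    first_line = handle.readline(200)
            except OSError:
                continue  # unreadable file, skip it
            if SHEBANG_RE.match(first_line):
                yield path
```

For example, `sorted(find_python_scripts("bin"))` would list the candidates under `bin/`, which can then be fed to the grep commands above.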
This is not urgent and can be divided into two main phases:
- Stop using the py2 runtime and stop providing bugfixes for errors occurring in py2 only, but maintain potential py2 compatibility for WMCore clients
- Remove the compatibility layer for both py2 and py3, so that dmwm/WMCore works with py3 only
The first phase will not require any change in our code; it is merely a change in how we address bugs and communicate with other teams.
The second phase will require some development, and the main steps will be:
- if using python 3.8.2, move to using pickle protocol 5 everywhere
  - I would suggest using this variable everywhere and fixing it to a specific number, so that we do not encounter bad surprises when comp changes the runtime version.
- remove `from __future__ import` statements
- remove `from past import` statements
- remove `from builtins import` statements
- this also means that we should stop using `basestring`. It should be enough to:
  - remove the `from past.builtins import basestring` statements
  - `isinstance(_, basestring)` -> `isinstance(_, (str, bytes))`
- fix how dictionaries are iterated over:
  - `for k in viewkeys(mydict):` -> `for k in mydict:`
  - `for v in viewvalues(mydict):` -> `for v in mydict.values():`
  - `for v in listvalues(mydict):` -> `for v in list(mydict.values()):`
  - `for k, v in viewitems(mydict):` -> `for k, v in mydict.items():`
  - `for k, v in listitems(mydict):` -> `for k, v in list(mydict.items()):`
  - these are only examples. Such changes are also needed when a dictionary is iterated over outside of a loop, for instance when creating a set. Every `view*()` and `list*()` helper provided by python-future's `future.utils` needs to be replaced by the appropriate statement.
- remove `standard_library.install_aliases()`, which was mainly used to access the backported version of py3 urllib and httplib in py2
- make `if PY3` the default, remove all the code in `if PY2`
  - if this is used in `decodeBytesToUnicodeConditional` or `encodeUnicodeToBytesConditional`, then simply use `decodeBytesConditional` and `encodeUnicodeToBytes`
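The `basestring` and dictionary-iteration steps above can be sketched as follows. This is an illustration only: `is_string_like` and `mydict` are hypothetical names, and the comments show the python-future form that each line would replace.

```python
def is_string_like(value):
    # Was: isinstance(value, basestring), with `from past.builtins import basestring`
    return isinstance(value, (str, bytes))


mydict = {"a": 1, "b": 2}

# Was: for k in viewkeys(mydict)
keys = [k for k in mydict]

# Was: for v in viewvalues(mydict)
values = [v for v in mydict.values()]

# Was: for k, v in listitems(mydict). The list() copy is what allows
# mutating the dictionary while looping over the snapshot.
for k, v in list(mydict.items()):
    if v > 1:
        del mydict[k]

# Was: set(viewkeys(mydict)). The same rule applies outside of loops.
remaining = set(mydict)
```

Keeping `list(...)` only where the dictionary is modified during iteration, or where a real list is required, avoids needless copies elsewhere.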
Then, we can start using all the shiny new features that py3 provides!
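For the pickle step listed above, one way to pin the protocol in a single place is sketched below. `PICKLE_PROTOCOL`, `dump_payload`, and `load_payload` are hypothetical names, not existing WMCore identifiers; protocol 5 requires Python 3.8 or later.

```python
import pickle

# Hypothetical module-level constant: pinned to a fixed number rather than
# pickle.HIGHEST_PROTOCOL, which can change with the interpreter version.
PICKLE_PROTOCOL = 5


def dump_payload(obj):
    """Serialize `obj` with the pinned protocol."""
    return pickle.dumps(obj, protocol=PICKLE_PROTOCOL)


def load_payload(data):
    """Deserialize `data`; the protocol is detected automatically on load."""
    return pickle.loads(data)
```

Because loading detects the protocol on its own, only the writing side needs to reference the constant.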
The python3 transition within WMCore isn't really a migration to python3, but it's meant to be a modernization of our code such that it's compatible with python 2.7 and python 3 (latest stable release being 3.8.x at the moment). It's unclear whether python 2.6 would have to be supported as well (especially for the WMRuntime package).
Many of the CMS Computing services are maintained and built by our own group, as well as many of their dependencies. The CMS Computing software stack is currently maintained in this repository/branch: https://github.com/cms-sw/cmsdist/tree/comp_gcc630 where we also build many of the python libraries (either for python2 or python3). Thus, during this python migration, there will be the need to also build new (python) spec files for the required dependencies; and/or to update those that are out-dated or inconsistent between py2 and py3. We use this model such that we have full control of all the dependencies shipped with our CMS software, including any possible patches needed.
This link https://docs.python.org/3/howto/pyporting.html has a lot of good stuff on all the differences between python 2 and 3. The way we have planned this migration considers passing our code through python-futurize http://python-future.org/automatic_conversion.html . Developers should consider doing this now when changing existing code and validating unit tests.
The work of the summer student on the py2 to py3 transition is summarized in:
- wiki page describing all performed steps (likely deprecated)
- twiki page summarizing futurize steps for different use-cases (either deprecated or to be reviewed)
items(), keys(), and values()
In Python2, these methods create lists of the dictionary's items, keys, or values. This can take a lot of memory. Python3 has the same syntax, but the methods now return lightweight views instead of lists. In Python2, iteritems(), iterkeys(), and itervalues() behave much like the Python3 versions. There is a "problem" with the futurize fixer for these issues in that it
- Converts python2 uses of items() to list(items()) - not a problem, this is just explicit
- Converts iteritems() to items()
  - This is OK for python3, but on python2 possibly alters the performance of the code
  - And if you convert this again you now end up with list(items()) in your python3 code altering the performance under Python3 too
Eric's proposal is to use the futurize fixer, but discard all changes that change iteritems() etc to the python 3 versions and use that as the python2 code. We would run it a second time to create a dedicated python3 version. We could also take this opportunity to review our uses of items(), etc in python2 since in most cases we could be using the iterator versions. The most common case where we can't do this is in doing something like len(items()).
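To make the difference concrete, here is a small Python 3 sketch (variable names are illustrative only) contrasting the view returned by items() with the copy made by list(items()). Note that in Python 3, len() works directly on the view, so no list() is needed for that case.

```python
mydict = {"a": 1, "b": 2}

# items() returns a view: no copy is made, and len() works on it directly.
view = mydict.items()
count = len(view)

# list(items()) makes a snapshot copy, as Python 2's items() did.
snapshot = list(mydict.items())

mydict["c"] = 3

# The view reflects the new entry, the snapshot does not.
view_len_after = len(view)
snapshot_len_after = len(snapshot)
```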