-
Notifications
You must be signed in to change notification settings - Fork 184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add FitFileEncoder for writing FIT files #58
base: master
Are you sure you want to change the base?
Conversation
1073502
to
0cfa1be
Compare
a66d1e6
to
f93d871
Compare
4a9b763
to
314569c
Compare
fitparse/records.py
Outdated
0x07: BaseType(name='string', identifier=0x07, fmt='s', parse=parse_string, unparse=unparse_string, in_range=lambda x: x), | ||
0x88: BaseType(name='float32', identifier=0x88, fmt='f', invalid_value=_FLOAT32_INVALID_VALUE, | ||
parse=lambda x: None if math.isnan(x) else x, | ||
in_range=lambda x: x if -3.4028235e+38 < x < 3.4028235e+38 else _FLOAT32_INVALID_VALUE), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would prefer if this constant was named for readability along with _FLOAT32_INVALID_VALUE
:
_FLOAT32_RANGE = (-3.4028235e+38, 3.4028235e+38)
in_range=lambda x: x if _FLOAT32_RANGE[0] < x < _FLOAT32_RANGE[1] else _FLOAT32_INVALID_VALUE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe should polish this part, maybe subclass BaseType and use merge in_range into unparse.
fitparse/utils.py
Outdated
:rtype bool""" | ||
if isinstance(obj, (str, bytes)): | ||
return False | ||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can simplify by using the Iterable
type from the collections
module:
from collections import Iterable
def is_iterable(obj):
return not isinstance(obj, (str, bytes)) and isinstance(obj, Iterable)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See https://stackoverflow.com/questions/1952464
isinstance(obj, Iterable)
does not cover all use cases. However, for most of situations, isinstance is OK. Maybe should do it to speed up the code a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, I actually had no idea that defining __getitem__
on a class would cause it to become iterable. Seems like a legacy feature that was kept in for backwards compatibility after PEP234 was implemented. I did some tests in 2.7/3.6 and it seems that the basic types all implement __iter__
now so as long as custom classes do the same it should be fine.
scripts/fitdump
Outdated
@@ -98,7 +98,7 @@ def main(args=None): | |||
|
|||
fitfile = fitparse.FitFile( | |||
options.infile, | |||
data_processor=fitparse.StandardUnitsDataProcessor(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fitdump
script is user-facing and should therefore convert units to SI by default. Getting the raw values from the file could be useful in some cases, but should be put behind a flag (--raw-values
?) if it's going to be added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is a mistake, thanks.
# Python 2 compat | ||
try: | ||
num_types = (int, float, long) | ||
str = basestring |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
str
is still used as a typecheck in get_messages
in the following code:
names = set([
int(n) if (isinstance(n, str) and n.isdigit()) else n
for n in names
])
However, looking at it now, this code can probably be refactored to:
def try_int(obj):
try:
return int(obj)
except ValueError:
return obj
names = set(try_int(n) for n in names)
This would remove the typecheck so the shim wouldn't be needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO this code should be removed - the caller should make sure a message number is int, not str if wants to use the number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pR0Ps Should I remove this from the FitFile
?
fitparse/encoder.py
Outdated
:param profile_version: profile version. | ||
:param data_processor: custom data processor. | ||
""" | ||
self.protocol_version = 1.0 if protocol_version is None else float(protocol_version) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason to not put these defaults in the function definition? Would be more self-documenting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not any reason, just a mistake, thanks.
fitparse/encoder.py
Outdated
self._crc.update(data) | ||
|
||
def _write_struct(self, data, fmt, endian='<'): | ||
if fmt.startswith('<') or fmt.startswith('>'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, allowing multiple ways to specify a parameter can get confusing (how do they interact?). In this case I would either raise an error if the endianness is specified in both parameters or just not allow it in the fmt
at all since (as far as I can tell) it's only used for the CRC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it's a design flaw in the fitparse base, already. E.g. see test.py
func generate_fitfile()
- there is copy&paste of the struct.pack
spec for the FIT header and Crc. I've tried to fix that for Crc at least. But I can copy&paste the endianness for the Crc, too.
from .utils import fileish_open, FitParseError | ||
|
||
|
||
class FitFileEncoder(object): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like this class allows for creating chained FIT files (basically multiple FIT files concatenated together, see the commit that added support for parsing them). Not saying it has to, just letting you know that it's a possibility since it's a bit of an obscure part of the spec.
In fact, if you want to support chained FIT files, it should probably be taken care of by higher-level code that basically just does an itertools.chain(*fileishs)
for the fileishs written by this class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly, the code itself allows to append to a file, too. Just the file has to be opened by the user code. So, chaining is the task for the user code. Should not burden the core code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get that it shouldn't be done as part of the normal encoding since it's at a higher level than messing around with the bytes of the file, but that doesn't mean it can't be included in the library.
If it's a common operation, pushing it into user code just means that everyone is going to have to write their own implementation. A better way would be to provide all the functionality the user would need, but let them decide when to use it. For example, one way would be to have a small method chain
on the class that does something like (untested):
def chain(self, *fits):
if not self.completed:
raise Exception("Can't chain more files onto an incomplete FIT file")
for fit in fits:
if isinstance(fit, FitFile):
data = fit._file
else:
data = fileish_open(fit, 'rb')
while True:
block = data.read(BLOCK_SIZE)
if not block:
break
self._write(block)
This way the library can check for issues (like appending data to an incomplete fit datastream) as well as handle compatibility with library objects (the FitFile
shim), but it doesn't complicate the rest of the encoding process in any way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can already chain with this code:
with FitFileEncoder(open(fitname, 'ab') as fit:
...
with FitFileEncoder(open(fitname, 'ab') as fit:
...
I have not seen any particular support for chaining in the FIT SDK, I think they support it the same way.
+1 writing fit files will be useful ! |
+1 would love to see this merged as well! |
+1, would love to have this merged in master. Will have to clone this manually to get access to this new feature for now ! Thanks for this !! |
I am sorry but I do not plan to maintain this PR and resolve conflicts due to the development in the master. Personally, I have decided do abandon FIT writing (and parsing, too) in pure Python since the FIT format is loosely specified and complex. I recommend to make some FIT<-->JSON tool from the original FIT SDK (e.g. C#, Java, C++) and just call this tool from Python. |
Fix #8
I have started with
FitFileEncoder
, see thetest_encoder.py
. I will rebase this PR while fixing theFitFileEncoder
and edit this initial message.The messages are created by
DataMessageCreator
just to keep quite a lot of code outside of the coreDataMessage
.Cannot write:
But these fields are not necessary for an application generated FIT files. The compound and accumulated fields are designed to save a few bits of the resulting FIT file, which is not necessary for a server generated files and, furthermore, I need to use simple FIT features only to make some weak FIT parsers also working.
@dtcooper @pR0Ps Please review the code.
I would appreciate to move some code changes out of this PR and merge it to master to make this PR smaller and simpler to review:
fileish_open
, see RFCT utils.fileish_open with a write possibility #54FitFile
when CRC is ignored.apply_scale_offset
to therecords.py
, what about to merge it to thefield.render
code?FitFileDataProcessor.process_xxx
methods toparse_xxx
BaseTyp.invalid_value
, so as theparse
does not need to be defined in most of times.Thank you