Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

quipGapXYZ: add unit convert and tag synonym matching #678

Open
wants to merge 7 commits into
base: devel
Choose a base branch
from

Conversation

SchrodingersCattt
Copy link

@SchrodingersCattt SchrodingersCattt commented Jul 3, 2024

Many datasets using the exyz format have different formats for units and tags. This PR aims to address the situation with multiple possible datasets.

Summary by CodeRabbit

  • New Features
    • Introduced energy and force unit conversions for more accurate calculations.
    • Enhanced error handling and unit conversions for fields like species, positions, atomic numbers, forces, and tags within processing functions.

Copy link

coderabbitai bot commented Jul 3, 2024

Walkthrough

Walkthrough

The changes in quip_gap_xyz.py entail the addition of energy and force unit conversions, adjustments in handling various fields within the handle_single_xyz_frame function, and parsing of energy and force units for accurate calculations. Key entities and error handling improvements have also been included to enhance the robustness and functionality of the data processing.

Changes

File Change Summary
dpdata/xyz/quip_gap_xyz.py Added energy and force unit conversions, adjustments in handling fields (species, pos, Z, force, tags) within handle_single_xyz_frame function, and modifications in parsing energy and force units for calculations. Introduced variables e_conv_kcalpermol2eV, e_conv_au2eV, f_conv_kcalpermolperang2eVperang, f_conv_auperang2eVperang, f_conv_kcalpermolperbohr2eVperang, f_conv_au2eVperang.

Tip

AI model upgrade

gpt-4o model for reviews and chat is now live

OpenAI claims that this model is better at understanding and generating code than the previous models. Please join our Discord Community to provide any feedback or to report any issues.


Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 37b96b8 and 81a27d8.

Files selected for processing (1)
  • dpdata/xyz/quip_gap_xyz.py (3 hunks)
Additional context used
Ruff
dpdata/xyz/quip_gap_xyz.py

133-133: Local variable Z_array is assigned to but never used

Remove assignment to unused variable Z_array

(F841)


252-252: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

Additional comments not posted (5)
dpdata/xyz/quip_gap_xyz.py (5)

10-11: Imports and constants for unit conversions.

The import statement and constants for unit conversions are correctly added. Ensure that EnergyConversion and ForceConversion are defined in unit module.


12-21: Unit conversion constants are correctly defined.

The constants for energy and force unit conversions are correctly defined using EnergyConversion and ForceConversion.


216-234: Proper handling of energy units.

The changes correctly handle various energy units and populate info_dict["energies"]. Ensure that all possible energy units are covered.


285-285: Handling of virials is correct.

The handling of virials is correctly implemented.


289-303: Handling of Lattice and nopbc fields is correct.

The handling of Lattice and nopbc fields is correctly implemented.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

codspeed-hq bot commented Jul 3, 2024

CodSpeed Performance Report

Merging #678 will not alter performance

Comparing SchrodingersCattt:devel (81a27d8) with devel (e429d9a)

Summary

✅ 2 untouched benchmarks

Copy link

codecov bot commented Jul 3, 2024

Codecov Report

Attention: Patch coverage is 61.79775% with 34 lines in your changes missing coverage. Please review.

Project coverage is 84.12%. Comparing base (e429d9a) to head (81a27d8).

Files Patch % Lines
dpdata/xyz/quip_gap_xyz.py 61.79% 34 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##            devel     #678      +/-   ##
==========================================
- Coverage   84.42%   84.12%   -0.30%     
==========================================
  Files          81       81              
  Lines        7144     7202      +58     
==========================================
+ Hits         6031     6059      +28     
- Misses       1113     1143      +30     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

try:
e_units = np.array([field_dict["energy-unit"].lower()])
f_units = np.array([field_dict["force-unit"].lower()])
except:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid using bare except statements.

Using bare except statements can catch unexpected errors and make debugging difficult. Specify the exception type or use except Exception as e to handle specific exceptions.

-  except:
+  except KeyError:

Also applies to: 283-283

Tools
Ruff

200-200: Do not use bare except

(E722)

Comment on lines +273 to +275
elif f_units == "kcal/mol/bohr":
info_dict["forces"] = (
np.array([force_array]).astype("float32") * f_conv_au2eVperang
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove redundant conditional block for kcal/mol/bohr.

The conditional block for kcal/mol/bohr is redundant and should be removed.

-  elif f_units == "kcal/mol/bohr":
-    info_dict["forces"] = (
-      np.array([force_array]).astype("float32") * f_conv_au2eVperang
-    )

Committable suggestion was skipped due to low confidence.

else:
raise ValueError("No valid energy field found in field_dict.")
except KeyError:
raise ValueError("Error while accessing energy fields in field_dict.")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raise exceptions with raise ... from None.

Within an except clause, raise exceptions with raise ... from None to distinguish them from errors in exception handling.

-  raise ValueError("Error while accessing energy fields in field_dict.")
+  raise ValueError("Error while accessing energy fields in field_dict.") from None
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
raise ValueError("Error while accessing energy fields in field_dict.")
raise ValueError("Error while accessing energy fields in field_dict.") from None
Tools
Ruff

252-252: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

info_dict["forces"] = np.array([force_array]).astype("float32")
else:
info_dict["forces"] = np.array([force_array]).astype("float32")
except:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid using bare except statements.

Using bare except statements can catch unexpected errors and make debugging difficult. Specify the exception type or use except Exception as e to handle specific exceptions.

-  except:
+  except KeyError:

Committable suggestion was skipped due to low confidence.

Tools
Ruff

283-283: Do not use bare except

(E722)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Comment on lines 254 to 299
try:
if f_units == "kcal/mol/angstrom":
<<<<<<< HEAD
info_dict["forces"] = np.array([force_array]).astype("float32") * f_conv_kcalpermolperang2eVperang
elif f_units == "hartree/angstrom" or f_units == "hartree/ang" or f_units == "hartree/ang.":
info_dict["forces"] = np.array([force_array]).astype("float32") * f_conv_auperang2eVperang
elif f_units == "kcal/mol/bohr":
info_dict["forces"] = np.array([force_array]).astype("float32") * f_conv_kcalpermolperbohr2eVperang
elif f_units == "kcal/mol/bohr":
info_dict["forces"] = np.array([force_array]).astype("float32") * f_conv_au2eVperang
elif f_units == "ev/angstrom" or f_units == "ev/ang" or f_units == "ev/ang.":
info_dict["forces"] = np.array([force_array]).astype("float32")
else: info_dict["forces"] = np.array([force_array]).astype("float32")
except Exception:
info_dict["forces"] = np.array([force_array]).astype("float32")
=======
info_dict["forces"] = (
np.array([force_array]).astype("float32")
* f_conv_kcalpermolperang2eVperang
)
elif (
f_units == "hartree/angstrom"
or f_units == "hartree/ang"
or f_units == "hartree/ang."
):
info_dict["forces"] = (
np.array([force_array]).astype("float32") * f_conv_auperang2eVperang
)
elif f_units == "kcal/mol/bohr":
info_dict["forces"] = (
np.array([force_array]).astype("float32")
* f_conv_kcalpermolperbohr2eVperang
)
elif f_units == "kcal/mol/bohr":
info_dict["forces"] = (
np.array([force_array]).astype("float32") * f_conv_au2eVperang
)
elif (
f_units == "ev/angstrom" or f_units == "ev/ang" or f_units == "ev/ang."
):
info_dict["forces"] = np.array([force_array]).astype("float32")
else:
info_dict["forces"] = np.array([force_array]).astype("float32")
except:
info_dict["forces"] = np.array([force_array]).astype("float32")
>>>>>>> origin/devel
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unit conversion for forces and redundant conditional block.

The new logic for force unit conversion is correctly implemented but contains a redundant conditional block for kcal/mol/bohr and uses a bare except statement.

-  elif f_units == "kcal/mol/bohr":
-    info_dict["forces"] = (
-      np.array([force_array]).astype("float32") * f_conv_au2eVperang
-    )
-  except:
-    info_dict["forces"] = np.array([force_array]).astype("float32")
+  except KeyError as e:
+    raise ValueError("Error while accessing force fields in field_dict.") from e

Committable suggestion was skipped due to low confidence.

Tools
Ruff

255-256: SyntaxError: Expected an indented block after if statement


256-256: SyntaxError: Expected except or finally after try block


256-256: SyntaxError: Expected a statement


256-256: SyntaxError: Expected a statement


256-256: SyntaxError: Expected a statement


257-257: SyntaxError: Unexpected indentation


258-258: SyntaxError: unindent does not match any outer indentation level


258-258: SyntaxError: Expected a statement


258-258: SyntaxError: Invalid annotated assignment target


258-259: SyntaxError: Expected an expression


259-259: SyntaxError: Unexpected indentation


260-260: SyntaxError: unindent does not match any outer indentation level


260-260: SyntaxError: Expected a statement


260-260: SyntaxError: Invalid annotated assignment target


260-261: SyntaxError: Expected an expression


261-261: SyntaxError: Unexpected indentation


262-262: SyntaxError: unindent does not match any outer indentation level


262-262: SyntaxError: Expected a statement


262-262: SyntaxError: Invalid annotated assignment target


262-263: SyntaxError: Expected an expression


263-263: SyntaxError: Unexpected indentation


264-264: SyntaxError: unindent does not match any outer indentation level


264-264: SyntaxError: Expected a statement


264-264: SyntaxError: Invalid annotated assignment target


264-265: SyntaxError: Expected an expression


265-265: SyntaxError: Unexpected indentation


266-266: SyntaxError: unindent does not match any outer indentation level


266-266: SyntaxError: Expected a statement


266-266: SyntaxError: Expected a statement


267-267: SyntaxError: Unexpected indentation


267-267: SyntaxError: Expected a statement


267-268: SyntaxError: Expected an expression


268-268: SyntaxError: Unexpected indentation


269-269: SyntaxError: Expected a statement


269-269: SyntaxError: Expected a statement


269-269: SyntaxError: Expected a statement


269-269: SyntaxError: Expected a statement


269-270: SyntaxError: Expected a statement


270-270: SyntaxError: Unexpected indentation


274-274: SyntaxError: unindent does not match any outer indentation level


274-274: SyntaxError: Expected a statement


275-275: SyntaxError: Invalid annotated assignment target


278-279: SyntaxError: Expected an expression


279-279: SyntaxError: Unexpected indentation


282-282: SyntaxError: unindent does not match any outer indentation level


282-282: SyntaxError: Expected a statement


282-282: SyntaxError: Invalid annotated assignment target


282-283: SyntaxError: Expected an expression


283-283: SyntaxError: Unexpected indentation


287-287: SyntaxError: unindent does not match any outer indentation level


287-287: SyntaxError: Expected a statement


287-287: SyntaxError: Invalid annotated assignment target


287-288: SyntaxError: Expected an expression


288-288: SyntaxError: Unexpected indentation


291-291: SyntaxError: unindent does not match any outer indentation level


291-291: SyntaxError: Expected a statement


292-292: SyntaxError: Invalid annotated assignment target


293-294: SyntaxError: Expected an expression


294-294: SyntaxError: Unexpected indentation


295-295: SyntaxError: unindent does not match any outer indentation level


295-295: SyntaxError: Expected a statement


295-295: SyntaxError: Expected a statement


295-296: SyntaxError: Expected a statement


296-296: SyntaxError: Unexpected indentation


297-297: SyntaxError: unindent does not match any outer indentation level


297-297: SyntaxError: Expected a statement


297-297: SyntaxError: Expected a statement


297-298: SyntaxError: Expected a statement


298-298: SyntaxError: Unexpected indentation


299-299: SyntaxError: Expected a statement


299-299: SyntaxError: Expected a statement


299-299: SyntaxError: Expected a statement


299-299: SyntaxError: Expected a statement

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Comment on lines +197 to +203
try:
e_units = np.array([field_dict["energy-unit"].lower()])
f_units = np.array([field_dict["force-unit"].lower()])
except Exception:
pass
# print('No units information contained.')

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proper handling of missing unit information.

The try-except block for handling missing unit information should raise an exception or log a warning if units are not found, rather than passing silently.

-  except Exception:
-    pass
-    # print('No units information contained.')
+  except KeyError as e:
+    raise ValueError("No units information contained.") from e
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
try:
e_units = np.array([field_dict["energy-unit"].lower()])
f_units = np.array([field_dict["force-unit"].lower()])
except Exception:
pass
# print('No units information contained.')
try:
e_units = np.array([field_dict["energy-unit"].lower()])
f_units = np.array([field_dict["force-unit"].lower()])
except KeyError as e:
raise ValueError("No units information contained.") from e

Comment on lines +235 to +253
try:
possible_fields = [
"energy",
"energies",
"Energies",
"potential-energy.energy",
"Energy",
]
for key in possible_fields:
if key in field_dict:
info_dict["energies"] = np.array(
[field_dict[key]], dtype="float32"
)
break
else:
raise ValueError("No valid energy field found in field_dict.")
except KeyError:
raise ValueError("Error while accessing energy fields in field_dict.")

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improved handling of possible energy fields and raising exceptions.

The changes improve handling of possible energy fields and raising exceptions. Use raise ... from None to distinguish exceptions from errors in exception handling.

-  raise ValueError("Error while accessing energy fields in field_dict.")
+  raise ValueError("Error while accessing energy fields in field_dict.") from None
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
try:
possible_fields = [
"energy",
"energies",
"Energies",
"potential-energy.energy",
"Energy",
]
for key in possible_fields:
if key in field_dict:
info_dict["energies"] = np.array(
[field_dict[key]], dtype="float32"
)
break
else:
raise ValueError("No valid energy field found in field_dict.")
except KeyError:
raise ValueError("Error while accessing energy fields in field_dict.")
try:
possible_fields = [
"energy",
"energies",
"Energies",
"potential-energy.energy",
"Energy",
]
for key in possible_fields:
if key in field_dict:
info_dict["energies"] = np.array(
[field_dict[key]], dtype="float32"
)
break
else:
raise ValueError("No valid energy field found in field_dict.")
except KeyError:
raise ValueError("Error while accessing energy fields in field_dict.") from None
Tools
Ruff

252-252: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

Comment on lines +254 to +283
try:
if f_units == "kcal/mol/angstrom":
info_dict["forces"] = (
np.array([force_array]).astype("float32")
* f_conv_kcalpermolperang2eVperang
)
elif (
f_units == "hartree/angstrom"
or f_units == "hartree/ang"
or f_units == "hartree/ang."
):
info_dict["forces"] = (
np.array([force_array]).astype("float32") * f_conv_auperang2eVperang
)
elif f_units == "kcal/mol/bohr":
info_dict["forces"] = (
np.array([force_array]).astype("float32")
* f_conv_kcalpermolperbohr2eVperang
)
elif f_units == "kcal/mol/bohr":
info_dict["forces"] = (
np.array([force_array]).astype("float32") * f_conv_au2eVperang
)
elif (
f_units == "ev/angstrom" or f_units == "ev/ang" or f_units == "ev/ang."
):
info_dict["forces"] = np.array([force_array]).astype("float32")
else:
info_dict["forces"] = np.array([force_array]).astype("float32")
except Exception:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unit conversion for forces and redundant conditional block.

The new logic for force unit conversion is correctly implemented but contains a redundant conditional block for kcal/mol/bohr and uses a bare except statement.

-  elif f_units == "kcal/mol/bohr":
-    info_dict["forces"] = (
-      np.array([force_array]).astype("float32") * f_conv_au2eVperang
-    )
-  except:
-    info_dict["forces"] = np.array([force_array]).astype("float32")
+  except KeyError as e:
+    raise ValueError("Error while accessing force fields in field_dict.") from e
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
try:
if f_units == "kcal/mol/angstrom":
info_dict["forces"] = (
np.array([force_array]).astype("float32")
* f_conv_kcalpermolperang2eVperang
)
elif (
f_units == "hartree/angstrom"
or f_units == "hartree/ang"
or f_units == "hartree/ang."
):
info_dict["forces"] = (
np.array([force_array]).astype("float32") * f_conv_auperang2eVperang
)
elif f_units == "kcal/mol/bohr":
info_dict["forces"] = (
np.array([force_array]).astype("float32")
* f_conv_kcalpermolperbohr2eVperang
)
elif f_units == "kcal/mol/bohr":
info_dict["forces"] = (
np.array([force_array]).astype("float32") * f_conv_au2eVperang
)
elif (
f_units == "ev/angstrom" or f_units == "ev/ang" or f_units == "ev/ang."
):
info_dict["forces"] = np.array([force_array]).astype("float32")
else:
info_dict["forces"] = np.array([force_array]).astype("float32")
except Exception:
try:
if f_units == "kcal/mol/angstrom":
info_dict["forces"] = (
np.array([force_array]).astype("float32")
* f_conv_kcalpermolperang2eVperang
)
elif (
f_units == "hartree/angstrom"
or f_units == "hartree/ang"
or f_units == "hartree/ang."
):
info_dict["forces"] = (
np.array([force_array]).astype("float32") * f_conv_auperang2eVperang
)
elif f_units == "kcal/mol/bohr":
info_dict["forces"] = (
np.array([force_array]).astype("float32")
* f_conv_kcalpermolperbohr2eVperang
)
elif (
f_units == "ev/angstrom" or f_units == "ev/ang" or f_units == "ev/ang."
):
info_dict["forces"] = np.array([force_array]).astype("float32")
else:
info_dict["forces"] = np.array([force_array]).astype("float32")
except KeyError as e:
raise ValueError("Error while accessing force fields in field_dict.") from e

Comment on lines +98 to +161
if kv_dict["datatype"] != "S":
raise RuntimeError(
"datatype for species must be 'S' instead of {}".format(
kv_dict["datatype"]
)
)
)
field_length = int(kv_dict["value"])
type_array = data_array[
:, used_colomn : used_colomn + field_length
].flatten()
used_colomn += field_length
continue
elif kv_dict["key"] == "pos":
if kv_dict["datatype"] != "R":
raise RuntimeError(
"datatype for pos must be 'R' instead of {}".format(
kv_dict["datatype"]
field_length = int(kv_dict["value"])
type_array = data_array[
:, used_colomn : used_colomn + field_length
].flatten()
used_colomn += field_length
continue
elif kv_dict["key"] == "pos":
if kv_dict["datatype"] != "R":
raise RuntimeError(
"datatype for pos must be 'R' instead of {}".format(
kv_dict["datatype"]
)
)
)
field_length = int(kv_dict["value"])
coords_array = data_array[:, used_colomn : used_colomn + field_length]
used_colomn += field_length
continue
elif kv_dict["key"] == "Z":
if kv_dict["datatype"] != "I":
raise RuntimeError(
"datatype for pos must be 'R' instead of {}".format(
kv_dict["datatype"]
field_length = int(kv_dict["value"])
coords_array = data_array[
:, used_colomn : used_colomn + field_length
]
used_colomn += field_length
continue
elif kv_dict["key"] == "Z":
if kv_dict["datatype"] != "I":
raise RuntimeError(
"datatype for pos must be 'R' instead of {}".format(
kv_dict["datatype"]
)
)
)
field_length = int(kv_dict["value"])
Z_array = data_array[
:, used_colomn : used_colomn + field_length
].flatten()
used_colomn += field_length
continue
elif kv_dict["key"] == "force":
if kv_dict["datatype"] != "R":
raise RuntimeError(
"datatype for pos must be 'R' instead of {}".format(
kv_dict["datatype"]
field_length = int(kv_dict["value"])
Z_array = data_array[
:, used_colomn : used_colomn + field_length
].flatten()
used_colomn += field_length
continue
elif kv_dict["key"] == "tags":
if kv_dict["datatype"] != "I":
raise RuntimeError(
"datatype for tags must be 'I' instead of {}".format(
kv_dict["datatype"]
)
)
)
field_length = int(kv_dict["value"])
force_array = data_array[:, used_colomn : used_colomn + field_length]
used_colomn += field_length
field_length = int(kv_dict["value"])
used_colomn += field_length
continue
elif kv_dict["key"] == "force" or kv_dict["key"] == "forces":
if kv_dict["datatype"] != "R":
raise RuntimeError(
"datatype for pos must be 'R' instead of {}".format(
kv_dict["datatype"]
)
)
field_length = int(kv_dict["value"])
force_array = data_array[
:, used_colomn : used_colomn + field_length
]
used_colomn += field_length
continue
except Exception as e:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improved field handling and error management.

The changes improve handling of fields like species, pos, Z, tags, and forces. The logic is sound, but avoid using bare except statements for better debugging.

-  except Exception as e:
+  except KeyError as e:
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
try:
if kv_dict["key"] == "species":
if kv_dict["datatype"] != "S":
raise RuntimeError(
"datatype for species must be 'S' instead of {}".format(
kv_dict["datatype"]
)
)
)
field_length = int(kv_dict["value"])
type_array = data_array[
:, used_colomn : used_colomn + field_length
].flatten()
used_colomn += field_length
continue
elif kv_dict["key"] == "pos":
if kv_dict["datatype"] != "R":
raise RuntimeError(
"datatype for pos must be 'R' instead of {}".format(
kv_dict["datatype"]
field_length = int(kv_dict["value"])
type_array = data_array[
:, used_colomn : used_colomn + field_length
].flatten()
used_colomn += field_length
continue
elif kv_dict["key"] == "pos":
if kv_dict["datatype"] != "R":
raise RuntimeError(
"datatype for pos must be 'R' instead of {}".format(
kv_dict["datatype"]
)
)
)
field_length = int(kv_dict["value"])
coords_array = data_array[:, used_colomn : used_colomn + field_length]
used_colomn += field_length
continue
elif kv_dict["key"] == "Z":
if kv_dict["datatype"] != "I":
raise RuntimeError(
"datatype for pos must be 'R' instead of {}".format(
kv_dict["datatype"]
field_length = int(kv_dict["value"])
coords_array = data_array[
:, used_colomn : used_colomn + field_length
]
used_colomn += field_length
continue
elif kv_dict["key"] == "Z":
if kv_dict["datatype"] != "I":
raise RuntimeError(
"datatype for pos must be 'R' instead of {}".format(
kv_dict["datatype"]
)
)
)
field_length = int(kv_dict["value"])
Z_array = data_array[
:, used_colomn : used_colomn + field_length
].flatten()
used_colomn += field_length
continue
elif kv_dict["key"] == "force":
if kv_dict["datatype"] != "R":
raise RuntimeError(
"datatype for pos must be 'R' instead of {}".format(
kv_dict["datatype"]
field_length = int(kv_dict["value"])
Z_array = data_array[
:, used_colomn : used_colomn + field_length
].flatten()
used_colomn += field_length
continue
elif kv_dict["key"] == "tags":
if kv_dict["datatype"] != "I":
raise RuntimeError(
"datatype for tags must be 'I' instead of {}".format(
kv_dict["datatype"]
)
)
)
field_length = int(kv_dict["value"])
force_array = data_array[:, used_colomn : used_colomn + field_length]
used_colomn += field_length
field_length = int(kv_dict["value"])
used_colomn += field_length
continue
elif kv_dict["key"] == "force" or kv_dict["key"] == "forces":
if kv_dict["datatype"] != "R":
raise RuntimeError(
"datatype for pos must be 'R' instead of {}".format(
kv_dict["datatype"]
)
)
field_length = int(kv_dict["value"])
force_array = data_array[
:, used_colomn : used_colomn + field_length
]
used_colomn += field_length
continue
except Exception as e:
try:
if kv_dict["key"] == "species":
if kv_dict["datatype"] != "S":
raise RuntimeError(
"datatype for species must be 'S' instead of {}".format(
kv_dict["datatype"]
)
)
field_length = int(kv_dict["value"])
type_array = data_array[
:, used_colomn : used_colomn + field_length
].flatten()
used_colomn += field_length
continue
elif kv_dict["key"] == "pos":
if kv_dict["datatype"] != "R":
raise RuntimeError(
"datatype for pos must be 'R' instead of {}".format(
kv_dict["datatype"]
)
)
field_length = int(kv_dict["value"])
coords_array = data_array[
:, used_colomn : used_colomn + field_length
]
used_colomn += field_length
continue
elif kv_dict["key"] == "Z":
if kv_dict["datatype"] != "I":
raise RuntimeError(
"datatype for pos must be 'R' instead of {}".format(
kv_dict["datatype"]
)
)
field_length = int(kv_dict["value"])
Z_array = data_array[
:, used_colomn : used_colomn + field_length
].flatten()
used_colomn += field_length
continue
elif kv_dict["key"] == "tags":
if kv_dict["datatype"] != "I":
raise RuntimeError(
"datatype for tags must be 'I' instead of {}".format(
kv_dict["datatype"]
)
)
field_length = int(kv_dict["value"])
used_colomn += field_length
continue
elif kv_dict["key"] == "force" or kv_dict["key"] == "forces":
if kv_dict["datatype"] != "R":
raise RuntimeError(
"datatype for pos must be 'R' instead of {}".format(
kv_dict["datatype"]
)
)
field_length = int(kv_dict["value"])
force_array = data_array[
:, used_colomn : used_colomn + field_length
]
used_colomn += field_length
continue
except KeyError as e:
Tools
Ruff

133-133: Local variable Z_array is assigned to but never used

Remove assignment to unused variable Z_array

(F841)

used_colomn += field_length
continue
except Exception as e:
print("unknown field {}".format(kv_dict["key"]), e)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason not to raise an error?

]
used_colomn += field_length
continue
except Exception as e:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the error it is expected to catch?

Comment on lines +211 to +214
"""
info_dict["energies"] = np.array([field_dict["energy"]]).astype("float32")
info_dict["forces"] = np.array([force_array]).astype("float32")
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is not used anymore, it should be removed.

info_dict["energies"] = np.array([field_dict["energy"]]).astype(
"float32"
)
except Exception:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the expected error to catch here? It's hard to read.

)
elif e_units == "ev":
info_dict["energies"] = np.array([field_dict["energy"]]).astype(
"float32"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it converted to float32?

Copy link
Member

@njzjz njzjz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unit test should be added to cover the cases in this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants