A more efficient way of reading MD trajectory #367

Open
njzjz opened this issue Mar 7, 2022 · 3 comments
Assignees
Labels
enhancement New feature or request lammps

Comments

@njzjz
Member

njzjz commented Mar 7, 2022

In the workflow, we do not need to read every frame of a trajectory, only the frames we actually want. So we should first build a dict that maps each trajectory to the indices of the frames needed from it:

frames_dict = {
  Trajectory0: [23, 56, 78],
  Trajectory1: [22],
  ...
}
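A minimal sketch of how such a dict could be built, assuming the selection step produces (trajectory, frame index) pairs. The function name group_frames and the use of file paths as trajectory keys are hypothetical, chosen only for illustration:

```python
from collections import defaultdict


def group_frames(selected):
    """Group (trajectory, frame index) pairs into {trajectory: sorted unique indices}."""
    frames_dict = defaultdict(list)
    for traj, idx in selected:
        frames_dict[traj].append(idx)
    # Sort and de-duplicate so each trajectory can be read in a single forward pass.
    return {traj: sorted(set(idxs)) for traj, idxs in frames_dict.items()}
```

Sorting the indices is a design choice: it lets the reader stop as soon as it has passed the largest wanted frame.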

Then, reading each trajectory:

for traj, f_idx in frames_dict.items():
    traj.read(f_idx)

For a LAMMPS trajectory or another raw text format, read could look like this:

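import itertools

def read(self, f_idx: list[int]):
    wanted = set(f_idx)  # set membership is O(1); a list lookup is O(len(f_idx))
    with open(self.fname) as f:
        # Repeating the same file iterator nlines times makes zip_longest
        # yield consecutive blocks of nlines lines, one block per frame.
        for ii, lines in enumerate(itertools.zip_longest(*[f] * self.nlines)):
            if ii not in wanted:
                continue
            self.process_block(lines)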

Here nlines is the number of lines in each block (one block per frame), which should be determined once at the very beginning. This relies on every frame having the same number of lines, which is usually the case.

The process_block method should convert a LAMMPS frame into dpdata's format.
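A self-contained sketch of the block-based idea, under stated assumptions: the file is a LAMMPS text dump in which each frame starts with an "ITEM: TIMESTEP" line, and every frame has the same number of lines. The names count_block_lines and read_frames are hypothetical, not existing dpdata functions, and the conversion to dpdata is left out:

```python
import itertools
from typing import Iterable, Iterator


def count_block_lines(fname: str) -> int:
    """Count the lines of the first frame, i.e. up to the second 'ITEM: TIMESTEP'."""
    nlines = 0
    with open(fname) as f:
        for line in f:
            if line.startswith("ITEM: TIMESTEP") and nlines > 0:
                break
            nlines += 1
    return nlines


def read_frames(fname: str, f_idx: Iterable[int]) -> Iterator[tuple[int, list[str]]]:
    """Yield (frame index, lines) for the wanted frames only, in file order."""
    wanted = set(f_idx)
    nlines = count_block_lines(fname)
    if not wanted or nlines == 0:
        return
    last = max(wanted)
    with open(fname) as f:
        for ii in range(last + 1):  # stop after the last wanted frame
            block = list(itertools.islice(f, nlines))
            if len(block) < nlines:  # end of file or truncated final frame
                break
            if ii in wanted:
                yield ii, [line.rstrip("\n") for line in block]
```

Unlike the zip_longest version, this stops reading once the largest requested index has been passed, and unwanted blocks are skipped without any per-line parsing.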

@amcadmus amcadmus added the enhancement New feature or request label Mar 7, 2022
@amcadmus
Member

amcadmus commented Mar 7, 2022

This way of loading trajectories should replace the implementation in https://github.com/deepmodeling/dpgen2/blob/2011090d12ba26a1eb3849634883f9ae0b62cc9d/dpgen2/exploration/selector/conf_selector_frame.py#L132-L138

The question is how. Should we add this more efficient way of reading frames from trajectories to dpdata, or implement it directly in dpgen2?

@amcadmus amcadmus changed the title Reading MD trajectory A more efficient way of reading MD trajectory Mar 7, 2022
@njzjz
Member Author

njzjz commented Oct 20, 2022

We should add it to dpdata, so that other packages that use dpdata also benefit.

@njzjz
Member Author

njzjz commented Oct 20, 2022

Enhance this method:

def load_file(fname, begin=0, step=1):
    lines = []
    buff = []
    cc = -1
    with open(fname) as fp:
        while True:
            line = fp.readline().rstrip('\n')
            if not line:
                if cc >= begin and (cc - begin) % step == 0:
                    lines += buff
                    buff = []
                cc += 1
                return lines
            if 'ITEM: TIMESTEP' in line:
                if cc >= begin and (cc - begin) % step == 0:
                    lines += buff
                    buff = []
                cc += 1
            if cc >= begin and (cc - begin) % step == 0:
                buff.append(line)
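One possible direction for the enhancement, as a sketch rather than a final implementation: accept an explicit collection of frame indices instead of begin and step, and stop reading once the last wanted frame has been passed. The name load_frames and the f_idx parameter are hypothetical and not part of dpdata. Frames are delimited by "ITEM: TIMESTEP" lines, so this variant does not require every frame to have the same number of lines:

```python
from typing import Iterable


def load_frames(fname: str, f_idx: Iterable[int]) -> list[str]:
    """Return the lines of the wanted frames only, as one flat list like load_file."""
    wanted = set(f_idx)
    if not wanted:
        return []
    last = max(wanted)
    lines = []
    cc = -1  # index of the frame currently being read; -1 before the first frame
    with open(fname) as fp:
        for line in fp:
            if line.startswith("ITEM: TIMESTEP"):
                cc += 1
                if cc > last:
                    break  # nothing wanted beyond this point
            if cc in wanted:
                lines.append(line.rstrip("\n"))
    return lines
```

Note one behavioural difference from load_file above: this sketch reads until end of file (or the last wanted frame), whereas load_file returns at the first empty line.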

@njzjz njzjz transferred this issue from deepmodeling/dpgen2 Oct 21, 2022
@njzjz njzjz added the lammps label Nov 5, 2023
@njzjz njzjz self-assigned this Nov 5, 2023
2 participants