dxtbx.serialize.filename.resolve_path fails on Windows for data collected on a POSIX platform #613

Open
dagewa opened this issue Mar 3, 2023 · 4 comments

Comments

@dagewa
Member

dagewa commented Mar 3, 2023

While investigating the DIALS test failure tests/util/test_image_grouping.py::test_real_h5_example on Windows, I found that ImageSet.get_path gives a strange result on that platform. This can be demonstrated using the dtpb_serial_processed dataset on dials-data, like this:

Linux

>>> from dxtbx.serialize import load
>>> el=load.experiment_list("well42_batch6_integrated.expt", check_format=False)
>>> el[0].imageset.get_path(0)
'/dls/mx/data/nt30330/nt30330-15/VMXi-AB1698/well_42/images/image_58766.nxs'

Windows

>>> from dxtbx.serialize import load
>>> el=load.experiment_list("well42_batch6_integrated.expt", check_format=False)
>>> el[0].imageset.get_path(0)
'c:\\dls\\mx\\data\\nt30330\\nt30330-15\\VMXi-AB1698\\well_42\\images\\image_58766.nxs'

This causes the test to fail here:

self = <dials.util.image_grouping.GroupingImageFiles object at 0x000001DD825207F0>
data_file_pairs = [FilePair(expt=WindowsPath('c:/Users/fcx32934/sw/cctbx/build/dials_data/dtpb_serial_processed/well42_batch6_integrated..., refl=WindowsPath('c:/Users/fcx32934/sw/cctbx/build/dials_data/dtpb_serial_processed/well42_batch6_integrated.refl'))]

    def _get_expt_file_to_groupsdata(self, data_file_pairs: List[FilePair]):
        expt_file_to_groupsdata: Dict[Path, GroupsForExpt] = {}

        for fp in data_file_pairs:
            expts = load.experiment_list(fp.expt, check_format=False)
            # need to match the images to the imagesets.
            images = set()
            image_to_group_info = {}
            for iset in expts.imagesets():
                images.update(iset.paths())
            for iset in images:
                image = None
                for ifile in self._files_to_groups_dict.keys():
                    if iset == ifile.name:
                        image = ifile
                        break
                if image is None:
>                   raise ValueError(f"Imageset {iset} not found in metadata")
E                   ValueError: Imageset c:\dls\mx\data\nt30330\nt30330-15\VMXi-AB1698\well_42\images\image_58766.nxs not found in metadata

src\dials\util\image_grouping.py:996: ValueError

@dagewa
Member Author

dagewa commented Mar 3, 2023

Actually, it seems this is delegated back to a Reader object:

>>> iset.reader()
<dxtbx.format.FormatMultiImage.Reader object at 0x0000025952A310F0>
>>> iset.reader()._filename
'c:\\dls\\mx\\data\\nt30330\\nt30330-15\\VMXi-AB1698\\well_42\\images\\image_58766.nxs'

@dagewa
Member Author

dagewa commented Mar 3, 2023

Ok, where it goes wrong is here:

def resolve_path(path, directory=None):
    """Resolve a file path.

    First expand any environment and user variables. Then create the absolute
    path by applying the relative path to the provided directory, if necessary.

    Args:
        path (str): The path to resolve
        directory (Optional[str]): The local path to resolve relative links

    Returns:
        str: The absolute path to the file to read
    """
    if not path:
        return ""
    path = os.path.expanduser(os.path.expandvars(path))
    if directory and not os.path.isabs(path):
        path = os.path.join(directory, path)
    return os.path.abspath(path)

On entry:

  • path is '/dls/mx/data/nt30330/nt30330-15/VMXi-AB1698/well_42/images/image_58766.nxs'
  • directory is 'c:\\Users\\fcx32934\\sw\\cctbx\\build\\dials_data\\dtpb_serial_processed'

Then os.path.join(directory, path) produces

'c:/dls/mx/data/nt30330/nt30330-15/VMXi-AB1698/well_42/images/image_58766.nxs'

Finally, os.path.abspath(path) converts the path separators to backslashes (shown escaped in the repr), resulting in

'c:\\dls\\mx\\data\\nt30330\\nt30330-15\\VMXi-AB1698\\well_42\\images\\image_58766.nxs'

@dagewa
Member Author

dagewa commented Mar 3, 2023

Sorry, not quite right (I hate debugging on Windows).

It's just the action of os.path.abspath(path). The directory is irrelevant because the condition on the if statement is False, so os.path.join is never called.

In [5]: os.path.abspath(path)
Out[5]: 'c:\\dls\\mx\\data\\nt30330\\nt30330-15\\VMXi-AB1698\\well_42\\images\\image_58766.nxs'
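The separator rewriting can be reproduced on any platform with the standard library's ntpath module, which implements the Windows flavour of os.path. This is only a minimal sketch: the path below is a shortened, made-up stand-in for the real one, and the directory name is hypothetical. On a real Windows machine os.path.abspath additionally prepends the current drive, which is where the leading c: comes from.

```python
import ntpath
import posixpath

# Shortened, illustrative stand-in for the path stored in the .expt file.
posix_path = "/dls/mx/data/image_58766.nxs"

# Under Windows rules, forward slashes are rewritten as backslashes.
win_normalized = ntpath.normpath(posix_path)
print(repr(win_normalized))

# Under POSIX rules, the same path is left unchanged.
posix_normalized = posixpath.normpath(posix_path)
print(repr(posix_normalized))

# Joining onto a Windows directory keeps only that directory's drive letter,
# because the second argument is rooted (it starts with a slash).
joined = ntpath.join("c:\\Users\\someone\\data", posix_path)
print(repr(joined))
```

So even if the join branch were taken, everything in the directory other than its drive would be discarded.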

@dagewa changed the title from "ImageSet.get_path broken on Windows" to "dxtbx.serialize.filename.resolve_path fails on Windows for data collected on a POSIX platform" on Mar 3, 2023
@dagewa
Member Author

dagewa commented Mar 3, 2023

What is the intended behaviour of resolve_path when it is passed a POSIX path on a Windows platform?
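To make the question concrete, here is one possible behaviour, sketched under assumptions: a path that is absolute under POSIX rules is returned untouched on every platform. The function name resolve_path_keep_posix is hypothetical; this is not the dxtbx implementation and not necessarily the right answer.

```python
import os
from pathlib import PurePosixPath


def resolve_path_keep_posix(path, directory=None):
    """Hypothetical variant of resolve_path (an assumption, not dxtbx code).

    A path that is absolute under POSIX rules (it starts with "/") is
    returned unchanged, so Windows does not rewrite it with a drive letter.
    """
    if not path:
        return ""
    path = os.path.expanduser(os.path.expandvars(path))
    # Assumption: a leading "/" means the path was recorded on a POSIX
    # platform and should be preserved verbatim.
    if PurePosixPath(path).is_absolute():
        return path
    if directory and not os.path.isabs(path):
        path = os.path.join(directory, path)
    return os.path.abspath(path)


print(resolve_path_keep_posix("/dls/mx/data/image_58766.nxs", "c:\\some\\dir"))
```

The trade-offs are that POSIX-absolute paths are no longer normalized (for example, ".." components are kept), and that on Windows a rooted path such as "/data" meant relative to the current drive would also be left alone.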

ndevenish added a commit to dials/dials that referenced this issue Mar 7, 2023
- Using prebuild CCTBX and the CMake builds of dxtbx and dials
- Relax python-nunit constraint, which now causes errors fixed in the
  now-latest version of the package
- XFail two tests that are failing on windows because of
  cctbx/dxtbx/issues/613