
Remove MaterializationClient factory #229

Merged: 6 commits merged into unfactory from mat-unfactory-fix on Sep 17, 2024
Conversation

bdpedigo
Collaborator

No description provided.

@bdpedigo bdpedigo changed the title remove factory Remove MaterializationClient factory Sep 11, 2024
@fcollman
Collaborator

Looks like tests are failing: the server version endpoint needs to be mocked. We probably need to have both versions.

@bdpedigo
Collaborator Author

For sure, but I also wondered what we want to do about cases where the materialization server is not yet advertising its version.

Currently, a None version causes the check to fail, as you see in this test. That made sense when I first wrote the decorator, because any chunkedgraph server not advertising a version had to be <2.15 (or whatever the version was when I introduced the version API). In this more general case, though, I'm not sure what is safe to assume for backwards compatibility.
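For context, a minimal sketch of the kind of version-gating decorator being discussed, which fails closed when the server advertises no version. The names (`requires_server_version`, `server_version`) are assumptions for illustration, not CAVEclient's actual API:

```python
# Hypothetical sketch: gate a client method behind a minimum server version.
# A server that advertises no version (None) fails the check, which is the
# behavior being questioned above.
from functools import wraps
from packaging.version import Version


def requires_server_version(minimum: str):
    def decorator(func):
        @wraps(func)
        def wrapper(self, *args, **kwargs):
            v = self.server_version
            if v is None or v < Version(minimum):
                raise NotImplementedError(
                    f"{func.__name__} requires server >= {minimum}, got {v}"
                )
            return func(self, *args, **kwargs)

        return wrapper

    return decorator
```

With this shape, any client whose `server_version` is None raises on every gated call, regardless of what the server actually supports.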

```diff
@@ -251,7 +251,10 @@ def server_version(self) -> Optional[Version]:
     """The version of the service running on the remote server. Note that this
     refers to the software running on the server and has nothing to do with the
     version of the datastack itself."""
-    return self._server_version
+    if self._server_version is None and self._api_version is not None:
```
@bdpedigo
Collaborator Author
@fcollman as an example, this logic would set some defaults for the versioning logic/decorator, using the current _api_version to tell whether a server is at least 2.x.y, 3.x.y, etc.

I think this would make things more backwards compatible generally, but I wonder if you see any holes in this logic.

Version("3") essentially resolves to 3.0.0, so any checks like >=3.1.2 would still fail (I just checked this).
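A sketch of the fallback described above, assuming `packaging.version.Version` for comparisons (which is what the `Version("3")` behavior mentioned implies). The helper name and signature are hypothetical; only the `_server_version`/`_api_version` attribute names come from the diff:

```python
# Hedged sketch: if the server advertises no semantic version, fall back to
# a Version built from the integer API version, e.g. api v3 -> Version("3").
from typing import Optional
from packaging.version import Version


def resolve_server_version(
    server_version: Optional[Version], api_version: Optional[int]
) -> Optional[Version]:
    if server_version is None and api_version is not None:
        # Version("3") compares equal to 3.0.0, so fine-grained checks
        # like >= 3.1.2 still fail for a server that only reports api v3.
        return Version(str(api_version))
    return server_version
```

This preserves the conservative behavior the comment describes: a bare API version satisfies `>= 3` checks but not `>= 3.1.2` checks.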

@bdpedigo
Collaborator Author

@ceesem it looked like the only methods that had separate implementations for v2 and v3 were live_live_query, tables, and views. For the latter two, I'm not sure the differences are meaningful, though.

Here I took the most conservative approach of just wrapping the entire v2/v3 logic for each of those in a big if/else statement; I just wanted to make sure I didn't mess anything up there. I also protected a couple of additional kwargs to live_live_query using the decorator.

However, there is obviously a lot of redundant code. Would you like me to take a stab at merging them, would you like to handle it, or should we leave it as is for now?

@bdpedigo
Collaborator Author

> @ceesem it looked like the only methods that had separate implementations for v2 and v3 were live_live_query, tables, and views.

For views and tables it looks easy: just only look for metadata for the Table/ViewManager if v3.
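The tables/views merge suggested above could look roughly like the following. The function and argument names are assumptions for illustration; the only grounded detail is the rule that manager metadata is fetched only on API v3 servers:

```python
# Hypothetical sketch of the version-conditional manager setup: only v3
# servers expose the metadata that the table/view managers are built from,
# so for v2 servers the managers are simply left unset (None).
def make_managers(api_version: int, metadata: dict):
    tables = views = None
    if api_version >= 3:
        # v3-only: build manager inputs from the server-provided metadata
        tables = dict(metadata.get("tables", {}))
        views = dict(metadata.get("views", {}))
    return tables, views
```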
@bdpedigo
Collaborator Author

Otherwise, this is the git diff between live_live_query v2 -> v3:

```diff
diff --git a/v2.py b/v3.py
index 206cde3..ff0396e 100644
--- a/v2.py
+++ b/v3.py
@@ -6,8 +6,6 @@ timestamp = convert_timestamp(timestamp)
 return_df = True
 if datastack_name is None:
     datastack_name = self.datastack_name
-if desired_resolution is None:
-    desired_resolution = self.desired_resolution
 endpoint_mapping = self.default_url_mapping
 endpoint_mapping["datastack_name"] = datastack_name
 data = {}
@@ -16,6 +14,7 @@ query_args["return_pyarrow"] = True
 query_args["arrow_format"] = True
 query_args["merge_reference"] = False
 query_args["allow_missing_lookups"] = allow_missing_lookups
+query_args["allow_invalid_root_ids"] = allow_invalid_root_ids
 if random_sample:
     query_args["random_sample"] = random_sample
 data["table"] = table
@@ -31,6 +30,8 @@ if filter_equal_dict is not None:
     data["filter_equal_dict"] = filter_equal_dict
 if filter_spatial_dict is not None:
     data["filter_spatial_dict"] = filter_spatial_dict
+if filter_regex_dict is not None:
+    data["filter_regex_dict"] = filter_regex_dict
 if select_columns is not None:
     data["select_columns"] = select_columns
 if offset is not None:
@@ -40,6 +41,10 @@ if limit is not None:
     data["limit"] = limit
 if suffixes is not None:
     data["suffixes"] = suffixes
+if desired_resolution is None:
+    desired_resolution = self.desired_resolution
+if desired_resolution is not None:
+    data["desired_resolution"] = desired_resolution
 encoding = DEFAULT_COMPRESSION
 response = self.session.post(
     url,
@@ -53,22 +58,24 @@ response = self.session.post(
     verify=self.verify,
 )
 self.raise_for_status(response, log_warning=log_warning)
-if desired_resolution is None:
-    desired_resolution = self.desired_resolution
 with MyTimeIt("deserialize"):
     with warnings.catch_warnings():
         warnings.simplefilter(action="ignore", category=FutureWarning)
         warnings.simplefilter(action="ignore", category=DeprecationWarning)
         df = deserialize_query_response(response)
         if desired_resolution is not None:
-            if len(desired_resolution) != 3:
-                raise ValueError("desired resolution needs to be of length 3, for xyz")
-            vox_res = self.get_table_metadata(
-                table_name=table,
-                datastack_name=datastack_name,
-                log_warning=False,
-            )["voxel_resolution"]
-            df = convert_position_columns(df, vox_res, desired_resolution)
+            if not response.headers.get("dataframe_resolution", None):
+                if len(desired_resolution) != 3:
+                    raise ValueError(
+                        "desired resolution needs to be of length 3, for xyz"
+                    )
+                vox_res = self.get_table_metadata(
+                    table,
+                    datastack_name,
+                    log_warning=False,
+                )["voxel_resolution"]
+                df = convert_position_columns(df, vox_res, desired_resolution)
+
     if not split_positions:
         concatenate_position_columns(df, inplace=True)
 if metadata:
@@ -81,6 +88,7 @@ if metadata:
                 "exclusive": filter_out_dict,
                 "equal": filter_equal_dict,
                 "spatial": filter_spatial_dict,
+                "regex": filter_regex_dict,
             },
             select_columns=select_columns,
             offset=offset,
@@ -88,7 +96,9 @@ if metadata:
             live_query=timestamp is not None,
             timestamp=string_format_timestamp(timestamp),
             materialization_version=None,
-            desired_resolution=desired_resolution,
+            desired_resolution=response.headers.get(
+                "dataframe_resolution", desired_resolution
+            ),
         )
         df.attrs.update(attrs)
     except HTTPError as e:
```

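Reading the diff above, the v2/v3 bodies differ mainly in a few v3-only request fields (`allow_invalid_root_ids`, `filter_regex_dict`) plus server-side resolution handling. A hedged sketch of how the merged, version-conditional payload construction could look; everything except the field names taken from the diff is an assumption:

```python
# Hypothetical sketch of merging the v2/v3 request construction into one
# function: shared fields are always set, and the v3-only fields are added
# behind an api_version check, mirroring the diff above.
def build_query_payload(
    api_version,
    *,
    allow_missing_lookups,
    allow_invalid_root_ids=None,
    filter_regex_dict=None,
):
    query_args = {
        "return_pyarrow": True,
        "arrow_format": True,
        "merge_reference": False,
        "allow_missing_lookups": allow_missing_lookups,
    }
    data = {}
    if api_version >= 3:
        # v3-only kwarg and v3-only filter type
        query_args["allow_invalid_root_ids"] = allow_invalid_root_ids
        if filter_regex_dict is not None:
            data["filter_regex_dict"] = filter_regex_dict
    return query_args, data
```

The same pattern would extend to the response side (preferring the server's `dataframe_resolution` header on v3), keeping one code path with small conditional branches instead of two full implementations.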
@bdpedigo
Collaborator Author

Open to feedback, but this is good to go on my end @fcollman @ceesem. I did end up refactoring the overlapping methods into one with version-conditional logic, but there isn't a ton of it.

@bdpedigo bdpedigo merged commit 522caf3 into unfactory Sep 17, 2024
15 checks passed
@bdpedigo bdpedigo deleted the mat-unfactory-fix branch September 17, 2024 22:18
bdpedigo added a commit that referenced this pull request Oct 14, 2024
* refactor

* remove pcg chunkedgraph (#224)

* remove factory (#226)

* remove factory (#225)

* Remove JSONService and L2CacheClient factories (#227)

* remove factory

* remove factory

* remove client mappings

* remove more

* Remove MaterializationClient factory (#229)

* remove factory

* possible smarter versioning

* attempt tables merge

* attempt views merge

* refactor live_live_query
ceesem pushed a commit that referenced this pull request Oct 15, 2024