Modify depth dimensions to match our input #2
base: main
Conversation
Mind to explain the flattening and tiling that is going on?
opencv-python
@Santoi a missing dependency? If opencv is not used for visualization, consider pulling `opencv-python-headless` instead. `opencv-python` and `matplotlib` don't play ball in certain cases (due to PyQt compatibility issues).
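In dependency terms, the swap suggested above would look like this (a sketch of a requirements file, not the repo's actual one):

```
# requirements.txt -- headless OpenCV build, avoids the PyQt clash with matplotlib
opencv-python-headless
```

Both packages provide the `cv2` module, so no import changes are needed; only one of the two should be installed at a time.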
Got it, thanks!
scripts/splatam.py (outdated)
```diff
@@ -107,9 +107,9 @@ def get_pointcloud(color, depth, intrinsics, w2c, transform_pts=True,

     # Select points based on mask
     if mask is not None:
-        point_cld = point_cld[mask]
+        #point_cld = point_cld[mask]
```
@Santoi why comment this out?
2 reasons, not necessarily valid:
- It was causing a tensor dimension disparity I hadn't been able to solve. This is now addressed in a new commit by generating the mask from a single channel instead of all of them.
- It didn't seem to make sense, since the mask is built from valid depth values, and every depth value we had was valid (greater than 0). Still, in case invalid depths ever show up, let's keep applying it.
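The single-channel mask fix described above can be sketched like this (a minimal NumPy illustration, not SplaTAM's actual code; variable names are hypothetical):

```python
import numpy as np

# Toy depth map with 3 identical channels, as produced by the capture app.
depth = np.zeros((4, 4, 3), dtype=np.float32)
depth[1:3, 1:3, :] = 2.5  # a 2x2 block of "valid" (> 0) readings

# Old approach: masking over all channels counts every pixel 3 times,
# which no longer lines up with the per-point tensors downstream.
mask_all = depth > 0              # shape (4, 4, 3)

# New approach: the channels are identical, so one channel is enough.
mask_single = depth[:, :, 0] > 0  # shape (4, 4)

print(mask_all.sum())     # 12 -> 4 valid pixels, each counted 3 times
print(mask_single.sum())  # 4  -> one entry per pixel
```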
Still don't understand what's going on here. How is it that depth maps have 3 channels?
The capturing app saves the depth as grayscale PNG images, so each pixel is represented with red, green, and blue channels that all hold the same value.
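The situation described above can be sketched as follows (an illustrative NumPy stand-in for the decoded PNG, not the app's actual code):

```python
import numpy as np

# A grayscale image saved as RGB decodes to H x W x 3 with equal channels.
rgb_depth = np.repeat(np.arange(6, dtype=np.uint8).reshape(2, 3, 1), 3, axis=2)

# All three channels carry the same value...
assert (rgb_depth[:, :, 0] == rgb_depth[:, :, 1]).all()
assert (rgb_depth[:, :, 0] == rgb_depth[:, :, 2]).all()

# ...so any single channel recovers the full single-channel depth map.
depth = rgb_depth[:, :, 0]
print(depth.shape)  # (2, 3)
```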
Hmm, that is unusual. Depth maps usually use single-channel, unsigned-integer pixels. How is the app doing that grayscale conversion? Now I wonder if we are losing information in the process.
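The concern raised above can be illustrated with a toy example (the conversion shown is hypothetical, not necessarily what the app does): depth is commonly stored as single-channel 16-bit values (e.g. millimetres in `uint16`), and squeezing that into an 8-bit grayscale image keeps only 256 distinct levels.

```python
import numpy as np

# Four hypothetical depth readings in millimetres, stored as uint16.
depth_mm = np.array([500, 501, 502, 5000], dtype=np.uint16)

# Hypothetical 8-bit conversion: scale the full range into 0..255.
scaled = (depth_mm.astype(np.float64) / depth_mm.max() * 255).astype(np.uint8)

print(scaled)  # [ 25  25  25 255] -> 500, 501 and 502 mm collapse together
```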
The multi-channel depth map is only produced by NeRF Capture's "offline mode", which currently has some issues:
- It hasn't been tested by the SplaTAM authors, so it probably shouldn't be expected to work out of the box.
- It also seems to be simply broken, see comment.

In the meantime, I will try this issue's suggestions and see how it goes: spla-tam#59 (comment)
Force-pushed from 54b34d0 to 8a041d9
Hi @hidmic! Thanks for the review! The PR description has been updated to better explain the changes. PTAL.
The original application is intended to be executed while capturing the NeRF input at the same time, or while transmitting the input with DSS. This PR performs the tweaks required to use previously recorded datasets in the following format:
Changes:
- The `depth` vector has its channels separated into 3 vectors, but the algorithm expects them all in 1 vector. Flattening the last dimension solves this by joining those 3 vectors.
- When using the `depth` mask for the `color_mask`, the mask is expected to have only 1 channel instead of 3, so the tiling is adapted for 3 channels. This is because the depth image is black and white and could have been represented with 1 channel.
- Add the possibility of generating a point-cloud output with the visualization script.

WARNING: The visualization script freezes when run without a GUI.
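The flattening and tiling described in the changes above can be sketched like this (a minimal NumPy illustration under the stated assumptions; the variable names are illustrative, not SplaTAM's actual ones):

```python
import numpy as np

h, w = 2, 3

# Depth arrives with 3 identical channels; the algorithm expects one flat
# vector with a single value per pixel, so one channel is flattened.
depth = np.ones((h, w, 3), dtype=np.float32)
depth_flat = depth[:, :, 0].reshape(-1)
print(depth_flat.shape)  # (6,)

# The single-channel validity mask is tiled to 3 channels so it can also
# be applied to the 3-channel RGB color image.
mask = depth[:, :, 0] > 0                        # (h, w)
color_mask = np.tile(mask[:, :, None], (1, 1, 3))  # (h, w, 3)
print(color_mask.shape)  # (2, 3, 3)
```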