Segformer Semantic Segmentation #131

Merged: 26 commits, Jul 30, 2024

Conversation

lucasreljic
Contributor

@lucasreljic lucasreljic commented Jun 14, 2024

Node running:
The model is: "segformer_mit-b2_8xb1-160k_cityscapes-1024x1024"

Segformer_node.mov

@lucasreljic lucasreljic linked an issue Jun 14, 2024 that may be closed by this pull request
Contributor

@danielrhuynh danielrhuynh left a comment


Good stuff Lucas! Just a few catches from an overview before it gets to Eddy

Contributor

@danielrhuynh danielrhuynh left a comment


Nice work!

@danielrhuynh
Contributor

Wait for Eddie before merging tho

@Edwardius
Collaborator

Since this is a transformer, how much VRAM does it take up?

Collaborator

@Edwardius Edwardius left a comment


Great work! Most of my concerns have to do with the size of the Docker image and with the ROS messages being used.

To reduce the size of the docker image, there are two avenues:

  1. wrap your head around Docker caching to make the MMLab stuff work
  2. use a simpler repo that has an implementation of SegFormer, e.g. https://github.com/lucidrains/segformer-pytorch

As for ros messages:

  • I would store the masks of each class for downstream processing instead of an RGB image. This should be more efficient: a boolean mask needs at most one byte per pixel (one bit if packed), whereas an RGB image carries 24 bits per pixel (3 channels × 8 bits), multiplied by the image size.
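
To put rough numbers on the message-size question, here is a back-of-the-envelope sketch. It assumes a 1024x1024 output (taken from the model name, not measured on the node) and the 19 Cityscapes evaluation classes. The function and key names are made up for illustration.

```python
def payload_bytes(height, width, num_classes):
    """Raw payload size in bytes for a few ways of publishing a segmentation result."""
    pixels = height * width
    return {
        # Colourised visualisation: 3 channels x 8 bits per pixel.
        "rgb8_image": pixels * 3,
        # One uint8 class index per pixel (a single mask for all classes).
        "class_index_mask_uint8": pixels,
        # One boolean mask per class, each boolean stored in a full byte.
        "per_class_bool_bytes": pixels * num_classes,
        # One boolean mask per class, packed 8 pixels to a byte.
        "per_class_bool_bitpacked": (pixels * num_classes + 7) // 8,
    }


# Assumed dimensions and class count, see the note above.
sizes = payload_bytes(1024, 1024, 19)
for name, size in sizes.items():
    print(f"{name}: {size / 1e6:.2f} MB")
```

Under these assumptions a single class-index mask is a third of the RGB payload, while separate per-class boolean masks only beat RGB when they are bit-packed.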

@lucasreljic
Contributor Author

I moved all the config into the YAML file, so the Dockerfile should be smaller now. The publisher message is still a single Image, but it now carries only the mask. I can change it to a boolean array (such as ByteMultiArray, I think) or to the new vision message you mentioned. Let me know which one I should implement.
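
For reference, a minimal sketch of how a class-index mask could be laid out in a single-channel image message. `MonoImage` is a hypothetical stand-in that mirrors a few fields of `sensor_msgs/Image` so the snippet runs without ROS. It is not the code in this PR, and `mask_to_mono8` is an illustrative name.

```python
from dataclasses import dataclass


@dataclass
class MonoImage:
    """Hypothetical stand-in mirroring core fields of sensor_msgs/Image."""
    height: int
    width: int
    encoding: str
    step: int    # bytes per row
    data: bytes  # row-major pixel buffer


def mask_to_mono8(mask_rows):
    """Pack a 2D list of class indices (0-255) into a mono8-style buffer."""
    height = len(mask_rows)
    width = len(mask_rows[0]) if height else 0
    if any(len(row) != width for row in mask_rows):
        raise ValueError("all mask rows must have the same width")
    # bytes() rejects values outside 0-255, so bad class ids fail loudly.
    data = bytes(value for row in mask_rows for value in row)
    return MonoImage(height, width, "mono8", width, data)


msg = mask_to_mono8([[0, 0, 13], [11, 1, 1]])
print(msg.height, msg.width, msg.step, list(msg.data))
```

With one byte per pixel, `step` equals `width`, and a subscriber can recover the class at `(row, col)` as `data[row * step + col]`.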

@danielrhuynh danielrhuynh merged commit 83aa29b into main Jul 30, 2024
25 checks passed
@lucasreljic
Contributor Author

Should a LabelInfo topic still be created for semantics, to map the class name to the int value in the mask?
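
As an illustration of what such a mapping would carry, here is a small sketch of a class-id to class-name table and how a downstream node might use it to decode a mask. It assumes the mask values are the 19 Cityscapes train IDs in their usual order. The helper name `classes_present` is hypothetical, and no ROS message type is used here.

```python
# Assumed ordering: the standard 19 Cityscapes train IDs (0-18).
CITYSCAPES_CLASSES = [
    "road", "sidewalk", "building", "wall", "fence", "pole",
    "traffic light", "traffic sign", "vegetation", "terrain", "sky",
    "person", "rider", "car", "truck", "bus", "train",
    "motorcycle", "bicycle",
]

# The int -> name map a label-info topic would need to publish.
ID_TO_NAME = dict(enumerate(CITYSCAPES_CLASSES))


def classes_present(mask_rows):
    """Return the names of all classes that appear in a 2D class-index mask."""
    ids = {class_id for row in mask_rows for class_id in row}
    return [ID_TO_NAME.get(i, f"unknown({i})") for i in sorted(ids)]


mask = [[0, 0, 13], [0, 1, 13], [11, 1, 1]]
print(classes_present(mask))  # ['road', 'sidewalk', 'person', 'car']
```

Publishing the table once, separately from the per-frame mask, keeps the mask message small while still letting subscribers interpret the integer values.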

Development

Successfully merging this pull request may close these issues.

Semantic Segmentation
4 participants