Add ML tool for datasets generation #149

Open · AmrElsersy opened this issue Aug 6, 2021 · 12 comments
Labels: enhancement (New feature or request)

Comments
@AmrElsersy
Contributor

related to #135 & #134

Desired behavior

Add an ML tool to generate datasets for training deep learning models for computer vision applications.
This will be done using the segmentation & bounding box cameras; the goal is to provide an easy way to collect sample data from these sensors to build the dataset.

Alternatives considered

N/A (this is a new feature).

Implementation suggestion

The first suggestion is to provide a GUI system in ign-gazebo that users can use to collect the dataset samples.
The GUI will have:

  • An image display widget to show the user the output of the sensor (the segmentation map in the case of the segmentation camera, or an image with boxes drawn on it in the case of the bounding box camera)
  • A way for the user to select the required sensor and its configuration (for example: segmentation type or bounding box type)
  • The position of the sensors may be the same as the position of the main camera (of the Scene3D plugin) to facilitate changing the position of the sensor (I don't know yet how to get the pose of the main camera of the Scene3D)
  • The GUI system will use the rendering APIs (the segmentation camera & bounding box camera directly, not the sensors) ... but I'm also not sure how to get the scene from the ign-gazebo GUI system, or how to render the scene manually in the plugin

...................

The second suggestion is to provide an API that generates sample data from the scene without user control, by putting the required sensor in many random positions and taking screenshot samples.

This randomness could be controlled by the user, for example by specifying a center and a radius; the tool would place the sensors at positions inside the circle defined by that center and radius (I don't have many ideas about how to control that randomness).

I think that this approach may produce bad sample data, because it will produce useless samples in some cases and may collect samples from a wrong view.

What is your opinion about these suggestions?

Additional context

@AmrElsersy added the enhancement label Aug 6, 2021
@adlarkin
Contributor

adlarkin commented Aug 6, 2021

I think that the first idea (GUI system) is fine. When you say "The GUI system will use the rendering APIs", does that include doing things like changing the position of the objects in the scene? Changing an entity's position can actually be done in ign-gazebo with the EntityComponentManager (for the example of changing an entity's position, you'd need to modify the entity's pose component). So, can you provide a specific list of ideas that you have for "The GUI system will use the rendering APIs"? If any of the tasks you have in mind can be done through ign-gazebo instead of ign-rendering, then it would probably be better to use the ign-gazebo APIs.

For the second idea (auto-generated datasets that use randomness and/or require little user control), I have a few comments:

This randomness could be controlled by the user, for example by specifying a center and a radius; the tool would place the sensors at positions inside the circle defined by that center and radius (I don't have many ideas about how to control that randomness)

Creating different camera positions/angles is one option. Here are a few other options:

  1. Add/remove other objects in the scene. This would help reduce bias in the datasets, and discourage false positives from models that are trained on this data. For objects that may be worth adding/removing in scenes used for generating datasets, I'd recommend looking at the Google Research fuel models. There are a lot of photo-realistic models to choose from here.
  2. Modify lighting and/or textures. This would allow for greater diversity in datasets without having to modify camera and/or object positions. You may want to look at the light spawning gazebo plugin, which was added in Edifice and is also in Fortress (this light spawning plugin could also be used for the GUI system suggestion, along with adding GUI options for adding/modifying textures).

I think that this approach may produce bad sample data, because it will produce useless samples in some cases and may collect samples from a wrong view

I would actually argue that this approach of "randomness" could produce better datasets than human-generated datasets. When it comes to generating ML datasets, we want to remove as much bias from the data as possible. By viewing objects from various angles and having different objects in various images (based on what is visible and what is not), we are generating data that is more generalizable than data that only views objects under a pre-defined set of conditions. Another valuable aspect of the "randomness" approach is that a lot of data is generated quickly, compared to manual dataset generation, which usually takes a while.

Yes, I do agree with you that some samples in the generated data may be useless, and some human "data cleaning" may be needed once datasets are generated, but data cleaning is a normal part of the ML dataset collection process. In order to minimize the chance of generating useless data, we should implement some base-case checking on the backend for the auto-generation dataset tools that we write (for example, if we want to add functionality that randomly moves a camera in some range around a point, we'd want to ignore locations that are below the ground plane since collecting data through a surface is unrealistic). Another thing we can do to minimize useless data generation is provide users with parameters they can set to control how much variety/randomness they want in the data that's generated (for example, maybe users want to specify the range of angles/distances the camera should cover, or maybe they want to set a flag that skips scenes that have no labelled objects in them).
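
To make that last point a bit more concrete, here is a rough sketch of the kind of pose sampling + base-case rejection I have in mind. None of this is an existing ign-gazebo API; it just uses ign-math for illustration, and the user-facing parameters (center, radius, pitch range, etc.) would be whatever we decide to expose:

#include <cmath>
#include <random>
#include <ignition/math/Helpers.hh>
#include <ignition/math/Pose3.hh>
#include <ignition/math/Quaternion.hh>
#include <ignition/math/Vector3.hh>

// Sample a camera pose on a sphere of radius _radius around _center,
// rejecting poses below the ground plane and pointing the camera back
// at the center. (Not uniform over the sphere, but fine as a sketch.)
ignition::math::Pose3d RandomCameraPose(
    const ignition::math::Vector3d &_center, double _radius)
{
  static std::mt19937 gen{std::random_device{}()};
  std::uniform_real_distribution<double> yawDist(0.0, 2.0 * IGN_PI);
  std::uniform_real_distribution<double> pitchDist(-IGN_PI / 2.0, IGN_PI / 2.0);

  ignition::math::Vector3d pos;
  do
  {
    const double yaw = yawDist(gen);
    const double pitch = pitchDist(gen);
    pos = _center + ignition::math::Vector3d(
        _radius * std::cos(pitch) * std::cos(yaw),
        _radius * std::cos(pitch) * std::sin(yaw),
        _radius * std::sin(pitch));
  } while (pos.Z() < 0.0);  // base-case check: skip poses below the ground plane

  // Orient the camera (which looks along +X) towards the center point
  const ignition::math::Vector3d dir = _center - pos;
  const double lookYaw = std::atan2(dir.Y(), dir.X());
  const double lookPitch = -std::atan2(
      dir.Z(), std::sqrt(dir.X() * dir.X() + dir.Y() * dir.Y()));

  return ignition::math::Pose3d(
      pos, ignition::math::Quaterniond(0.0, lookPitch, lookYaw));
}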

@adlarkin
Contributor

adlarkin commented Aug 6, 2021

One other thing to keep in mind is that feature freeze for the Fortress release is about 1 month away (2021/09/07). So, if any of the dataset generation tools break API/ABI, we should probably work on those first so that the tools can be released with the accompanying sensors in Fortress.

@AmrElsersy
Contributor Author

AmrElsersy commented Aug 6, 2021

for the example of changing an entity's position, you'd need to modify the entity's pose component

I mean changing the camera sensor position, not the objects' positions. I don't think that we need to change the position of the objects; there are no rules for changing them that guarantee we will produce a realistic scene ... this idea is valid when we have simple, unrelated objects in the scene, but if I have a street, cars and buildings and I change the positions of the objects randomly, that will produce very unrealistic scenes.

@AmrElsersy
Contributor Author

If any of the tasks you have in mind can be done through ign-gazebo instead of ign-rendering, then it would probably be better to use the ign-gazebo APIs.

Ok, I will use the ign-gazebo APIs, but what I mean is that I need to use the ign-rendering APIs to create the camera and add it to the scene, render the scene after putting the camera in the required position, and take its output image to save it.

So I don't know how to do that from the ign-gazebo GUI system.

@AmrElsersy
Contributor Author

Creating different camera positions/angles is one option. Here are a few other options:
Add/remove other objects
....
Modify lighting and/or textures

I took a look at the Google Research fuel models and the light spawning plugin. It seems that this is a data augmentation approach, and I still don't know how to do it, for example how to change the texture and guarantee realistic data.

And I believe there is no need to change anything in the scene; users do data augmentation through image processing, and they can do these things with image processing and computer vision frameworks if they want. Data augmentation may not be good in all cases.

some human "data cleaning" may be needed once datasets are generated, but data cleaning is a normal part of the ML dataset collection process.

Users would still need to perform data cleaning manually in this case, by deleting the bad data individually ... if we produce a regular image with the correct format/size but with bad content (an unrealistic scene due to weird positions or textures), people cannot write code to remove it automatically.

@adlarkin
Contributor

adlarkin commented Aug 7, 2021

I need to use the ign-rendering APIs to create the camera and add it to the scene, render the scene after putting the camera in the required position, and take its output image to save it

So I don't know how to do that from the ign-gazebo GUI system

I'm actually fairly confident that you will not need to use ign-rendering APIs to create the camera. This can also be done in the ign-gazebo gui system plugin. What I'd try to do (at least as a start) is use a structure similar to the light spawning gazebo plugin: once a user has picked the sensor(s) they want to use (including sensor parameters) from a GUI menu, you'll want to build an SDF string and then call the SpawnFromDescription event with the string that you built. The SpawnFromDescription event will handle creating the camera for you if the string is built correctly, which should also automatically update the scene that is rendered.

For example, looking at the light spawning plugin, if a user selected a spot light from the GUI menu, a string is created that mimics a spot light in SDF: https://github.com/ignitionrobotics/ign-gazebo/blob/9fd8618436e49691fd0cbdaf5bf054ee9384b857/src/gui/plugins/lights/Lights.cc#L109-L132. Then, the SpawnFromDescription event is called with this string to actually create the light: https://github.com/ignitionrobotics/ign-gazebo/blob/9fd8618436e49691fd0cbdaf5bf054ee9384b857/src/gui/plugins/lights/Lights.cc#L143-L146. You may need to add a few other things to ign-gazebo to get the sensor to work properly once you call the GUI event, so take a look at the light spawning PR that I linked as needed.
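
To make the pattern a bit more concrete, the core of it would look roughly like this. This is only a sketch: the <sensor> contents for the new cameras are placeholders, and the exact header/namespace for the event may differ a bit between Edifice and Fortress, so double check against the Lights plugin:

#include <string>
#include <ignition/gui/Application.hh>
#include <ignition/gui/GuiEvents.hh>
#include <ignition/gui/MainWindow.hh>

// Build an SDF string from the GUI selections and ask for it to be spawned.
// The <sensor> body below is just a placeholder for whatever parameters the
// segmentation/bounding box sensors end up needing.
void SpawnSensorFromMenu(const std::string &_sensorType)
{
  const std::string sdf = std::string(
      "<?xml version=\"1.0\" ?>"
      "<sdf version='1.8'>"
        "<model name='dataset_camera'>"
          "<link name='link'>"
            "<sensor name='camera' type='") + _sensorType + std::string(
            "'>"
              "<!-- sensor-specific parameters go here -->"
            "</sensor>"
          "</link>"
        "</model>"
      "</sdf>");

  // Same event the Lights plugin emits to create an entity from a string
  ignition::gui::events::SpawnFromDescription event(sdf);
  ignition::gui::App()->sendEvent(
      ignition::gui::App()->findChild<ignition::gui::MainWindow *>(),
      &event);
}

A function like that could then be hooked up to the GUI menu, e.g. called once the user confirms their sensor selection.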


In response/follow-up to your other comments/questions: how about we just focus on providing users with a way to manually create/spawn bounding box/segmentation sensors for now? This would be a part of the first suggestion you proposed. I'd imagine that you can create a GUI menu that has options to select either "bounding box" or "segmentation" sensor, and then something like a drop-down menu with parameters for the sensor(s) chosen (bounding box type, segmentation type, whether to save images or not, location to save data, etc...).

@AmrElsersy
Contributor Author

Ok, so the approach I will take is to make a GUI system that
creates the segmentation (or bounding box) sensor by building the SDF sensor tag string and then emitting a GUI event: ignition::gui::events::SpawnFromDescription event(sdfSensorString);

just like the Lights plugin https://github.com/ignitionrobotics/ign-gazebo/blob/9fd8618436e49691fd0cbdaf5bf054ee9384b857/src/gui/plugins/lights/Lights.cc#L109-L132

Then, in the Update method of the GUI system, I will access this sensor via the SegmentationComponent or BoundingBoxComponent, so that I can get the sensor's entity and update its position.
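
Roughly something like this in the Update callback. This is just a sketch: the BoundingBoxCamera component and the function name are placeholders, since I'm not sure yet what the real component will be called:

#include <ignition/gazebo/EntityComponentManager.hh>
#include <ignition/gazebo/components/Pose.hh>
#include <ignition/math/Pose3.hh>
// placeholder: header for the (not yet existing) sensor component
// #include <ignition/gazebo/components/BoundingBoxCamera.hh>

// Called from the GUI system's Update(_info, _ecm) override.
// Moves every bounding box camera to the pose we want to capture from
// (e.g. the pose of the Scene3D main camera).
void MoveDatasetSensors(ignition::gazebo::EntityComponentManager &_ecm,
    const ignition::math::Pose3d &_targetPose)
{
  _ecm.Each<ignition::gazebo::components::BoundingBoxCamera,
            ignition::gazebo::components::Pose>(
      [&](const ignition::gazebo::Entity &_entity,
          const ignition::gazebo::components::BoundingBoxCamera *,
          const ignition::gazebo::components::Pose *) -> bool
      {
        // Overwrite the sensor's pose component with the target pose
        _ecm.SetComponentData<ignition::gazebo::components::Pose>(
            _entity, _targetPose);
        return true;
      });
}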

Is there any problem with that approach? Or does anyone have another idea?

@AmrElsersy
Contributor Author

AmrElsersy commented Aug 10, 2021

Also, I have a question ... I want to get the pose of the main camera of the Scene3D plugin, i.e. the camera that is controlled by the mouse. (I want the segmentation/bounding box cameras to have the same pose as the main camera, so that samples can be collected easily by controlling the camera with the mouse: moving the scene by dragging and zooming with the mouse wheel. That gives the user an easy way to change the position of the cameras.)

How do I get it? Is it published on a certain topic or through a certain event?

@chapulina
Contributor

I haven't followed the entire conversation, but I'll leave some pointers here that will hopefully help:

GUI plugin

Before writing any code for the GUI plugin, I'd recommend creating a few sketches showing how the user is expected to interact with it. What buttons does it have? What goes where? What sequence of clicks is necessary to create a dataset?

generates samples data from the scene without the control of the user

Depending on the datasets being created, this can be more convenient to use than a GUI plugin. For example, a user can choose a set of 1000s of objects that they want to generate data for, run that headless on the cloud, and upload the resulting data to some cloud storage. This would make use of a server-side scene, rather than the GUI scene. (The distinction between server-side and client-side rendering plugins is described in the Rendering plugins tutorial.)

That's why I think it's important to come up with a user story before deciding what approach to take. Is it better for the user to manually click through a GUI plugin to set up a scene, or is it better to run a script that runs everything automatically with just some simple inputs, like the model URLs?

ign-rendering vs ign-gazebo APIs

I'm assuming that for dataset generation, physics doesn't need to be running, right? If it needs to be running, I think using ign-gazebo APIs may make more sense. Otherwise, I don't think it matters. The difference would be whether we're creating entities that are part of the ECM and can be seen by other systems, or if we're creating rendering nodes that exist only in the 3D scene and are used only for the purposes of data generation. I can see both working, it just depends on the requirements for the tool.


I think it could be helpful for you peeps to come up with a list of requirements and use cases for the tool. You could start by:

  • listing all the inputs for the user, and all the outputs they get
  • writing at least one user story (@AmrElsersy may remember the issue for the plotting tool that had some user stories Plotting tool gz-gui#66)

@AmrElsersy
Contributor Author

@chapulina
Ok, will write the user story

Is it better for the user to manually click through a GUI plugin to set up a scene, or is it better to run a script that runs everything automatically with just some simple inputs, like the model URLs?

I think the user should set up all the scenes they want to collect the samples from, not just provide the model URLs. For object detection or segmentation, we need to collect samples that contain many objects in the same image, so that the deep learning model can learn the spatial information ... so we cannot collect the samples via only the model URLs, which would make us produce samples with only one object in the image (if that is what you mean).

if we're creating rendering nodes that exist only in the 3D scene and are used only for the purposes of data generation

Hmm, if that is the case, how can I build the APIs without making an ign-gazebo system?

@adlarkin
Contributor

we cannot collect the samples via only the model URLs

I think that what @chapulina meant is that you could specify all of the models you want to use for a scene in a script via their fuel URL. Then, when the script is executed, it creates a scene by fetching/downloading the resources provided in the URLs and then placing these resources in the scene as specified in the script. This would be quicker than manually placing objects and/or moving them.

@chapulina
Contributor

This would be quicker than manually placing objects and/or moving them.

Yup, here's an example script (pseudo-code):

for i in $(seq 1 1000); do
  # Generate a new random world (generate_random_world is a placeholder)
  random_world=$(generate_random_world)

  # Run the simulation headless for enough iterations to get a given number of images
  ign gazebo -r -s --iterations 20 "$random_world"
done

The random worlds could be generated using ERB templates for example. You can customize the templates as needed, with model URLs, random poses, etc.

I imagine the worlds would come with the sensors already in them, and they can be moved by some server plugin every few iterations as needed.

I'd expect the sensor images to be automatically saved to a set location on disk, doing something like this ign-sensors example.
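
For the saving part, even a tiny standalone subscriber would do. A sketch (the /camera topic name and the RGB pixel format are just placeholders for whatever the sensor actually publishes):

#include <functional>
#include <string>
#include <ignition/common/Image.hh>
#include <ignition/msgs/image.pb.h>
#include <ignition/transport/Node.hh>

// Subscribe to a camera topic and dump every frame to disk as a PNG.
int main()
{
  ignition::transport::Node node;

  std::function<void(const ignition::msgs::Image &)> cb =
      [](const ignition::msgs::Image &_msg)
      {
        static int counter = 0;
        ignition::common::Image image;
        image.SetFromData(
            reinterpret_cast<const unsigned char *>(_msg.data().c_str()),
            _msg.width(), _msg.height(),
            ignition::common::Image::RGB_INT8);
        image.SavePNG("sample_" + std::to_string(counter++) + ".png");
      };

  node.Subscribe("/camera", cb);
  ignition::transport::waitForShutdown();
  return 0;
}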


☝️ That's just one idea

@adlarkin removed their assignment Aug 24, 2021