ValueError when loading COCO dataset with multiple segmentation masks for one class #1209
Comments
Hi @DancinParrot 👋🏻 Sorry for the late response, but I traveled a lot at the end of last week and my access to GitHub was limited. You're correct, we currently do not support loading multi-segment masks. I assume the change you want to make would be in supervision's COCO loading code.
Hi @SkalskiP! Thanks for your response, I wasn't expecting a response this quick actually, so no worries! I see, that would explain the error. However, the error was actually raised from this line, which is within the function I ended up modifying. I modified `coco_annotations_to_detections` as follows:

```python
def coco_annotations_to_detections(
    image_annotations: List[dict], resolution_wh: Tuple[int, int], with_masks: bool
) -> Detections:
    # ...
    if with_masks:
        polygons = []
        for image_annotation in image_annotations:
            segmentations = image_annotation["segmentation"]
            if len(segmentations) > 1:
                # merge_multi_segment is adapted from Ultralytics' JSON2YOLO
                s = merge_multi_segment(segmentations)
                # note: dividing by resolution_wh normalizes the coordinates
                # before the int cast below
                s = (
                    (np.concatenate(s, axis=0) / np.array(resolution_wh))
                    .reshape(-1)
                    .tolist()
                )
                reshaped = np.reshape(np.asarray(s, dtype=np.int32), (-1, 2))
            else:
                reshaped = np.reshape(
                    np.asarray(segmentations, dtype=np.int32), (-1, 2)
                )
            polygons.append(reshaped)
    # ...
    return Detections(xyxy=xyxy, class_id=np.asarray(class_ids, dtype=int))
```

The aforementioned modification allows the dataset to be loaded, though I'm not sure the outcome really fits my use case.
I'm very sorry. You're right, of course; I pointed you at the wrong place earlier. As for the multi-segment annotations: if you want to load this as two separate masks, your COCO JSON is incorrectly constructed. You should not have multiple lists under the `segmentation` key.
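For reference, the structural difference looks roughly like this (illustrative values only, not taken from the actual dataset):

```python
# Illustrative COCO annotations; coordinate values are made up.

# Expected case: one object, one polygon -- exactly one list under "segmentation".
single_part = {
    "id": 1,
    "category_id": 3,
    "bbox": [10.0, 10.0, 40.0, 30.0],
    "segmentation": [[10.0, 10.0, 50.0, 10.0, 50.0, 40.0, 10.0, 40.0]],
    "iscrowd": 0,
}

# Problematic case: the same object split into several polygons, i.e. multiple
# lists under "segmentation" for a single annotation.
multi_part = {
    "id": 2,
    "category_id": 3,
    "bbox": [10.0, 10.0, 40.0, 30.0],
    "segmentation": [
        [10.0, 10.0, 30.0, 10.0, 30.0, 40.0, 10.0, 40.0],
        [30.0, 10.0, 50.0, 10.0, 50.0, 40.0, 30.0, 40.0],
    ],
    "iscrowd": 0,
}
```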
I see, no worries!
Understood, I'm trying out different implementations currently to fix the issue.
My apologies for the confusion, I might have misunderstood my dataset. In my current workflow, I export the annotated data from Label Studio as a COCO dataset. Next, I import the dataset into Fiftyone for augmentation with Albumentations, and then export it as a COCODetectionDataset to preserve the segmentation masks along with the bboxes (see the sketch below). It seems that during this step, Fiftyone's native COCODetectionDataset exporter might have modified the structure of my dataset so that a mask is split into multiple parts (perhaps due to overlapping masks), resulting in supervision's inability to parse the dataset. However, from my observation the dataset is still structured properly, since I was able to import it into Fiftyone and Label Studio again and the annotations remained unchanged. Thus, I doubt it's an issue with Fiftyone's exporter, but rather supervision's inability to merge the list of segmentation masks. Lastly, I also tested out JSON2YOLO's `merge_multi_segment()`.
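For context, the export step in question is roughly the following; this is only a sketch of a typical FiftyOne COCO export, and the dataset name and label field are placeholders rather than the exact values used:

```python
import fiftyone as fo

# Hypothetical names: "augmented-dataset" and "segmentations" stand in for the
# actual dataset and label field used in the workflow described above.
dataset = fo.load_dataset("augmented-dataset")
dataset.export(
    export_dir="./coco_export",
    dataset_type=fo.types.COCODetectionDataset,
    label_field="segmentations",
)
```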
Just an update: after messing around with a few implementations, I finally came up with functional code by combining pieces from FiftyOne's COCO utilities and Ultralytics' JSON2YOLO (sources are linked in the snippet below).
Although the resulting product is not perfect, it works well enough (so far) for my use case. I'm most definitely open to feedback and suggestions on ways to improve, as well as possible alternatives to this approach. Should I open a PR for this? @SkalskiP The code is roughly as follows:

```python
# Imports assumed by this snippet; the two coco_* functions are intended to
# live in supervision's dataset/formats/coco.py, which already provides
# Detections and _polygons_to_masks.
from typing import List, Tuple

import cv2
import numpy as np
from pycocotools import mask as mask_utils


# From https://github.com/voxel51/fiftyone/blob/8205caf7646e5e7cb38041a94efb97f6524c1db6/fiftyone/utils/coco.py
def normalize_coco_segmentation(segmentation):
    # Filter out empty segmentations
    # For polygons of 4 points (1 pixel), duplicate to convert to valid polygon
    _segmentation = []
    for seg in segmentation:
        if len(seg) == 0:
            continue
        if len(seg) == 4:
            seg *= 4
        _segmentation.append(seg)
    return _segmentation


# From https://github.com/ultralytics/JSON2YOLO/issues/38
def is_clockwise(contour):
    value = 0
    num = len(contour)
    for i, point in enumerate(contour):
        p1 = contour[i]
        if i < num - 1:
            p2 = contour[i + 1]
        else:
            p2 = contour[0]
        value += (p2[0][0] - p1[0][0]) * (p2[0][1] + p1[0][1])
    return value < 0


def get_merge_point_idx(contour1, contour2):
    # Find the pair of points (one from each contour) with the smallest distance.
    idx1 = 0
    idx2 = 0
    distance_min = -1
    for i, p1 in enumerate(contour1):
        for j, p2 in enumerate(contour2):
            distance = pow(p2[0][0] - p1[0][0], 2) + pow(p2[0][1] - p1[0][1], 2)
            if distance_min < 0:
                distance_min = distance
                idx1 = i
                idx2 = j
            elif distance < distance_min:
                distance_min = distance
                idx1 = i
                idx2 = j
    return idx1, idx2


def merge_contours(contour1, contour2, idx1, idx2):
    # Stitch the two contours together at the closest point pair.
    contour = []
    for i in list(range(0, idx1 + 1)):
        contour.append(contour1[i])
    for i in list(range(idx2, len(contour2))):
        contour.append(contour2[i])
    for i in list(range(0, idx2 + 1)):
        contour.append(contour2[i])
    for i in list(range(idx1, len(contour1))):
        contour.append(contour1[i])
    contour = np.array(contour)
    return contour


def merge_with_parent(contour_parent, contour):
    if not is_clockwise(contour_parent):
        contour_parent = contour_parent[::-1]
    if is_clockwise(contour):
        contour = contour[::-1]
    idx1, idx2 = get_merge_point_idx(contour_parent, contour)
    return merge_contours(contour_parent, contour, idx1, idx2)


def mask2polygon(image):
    contours, hierarchies = cv2.findContours(
        image, cv2.RETR_TREE, cv2.CHAIN_APPROX_TC89_KCOS
    )
    contours_approx = []
    polygons = []
    for contour in contours:
        epsilon = 0.001 * cv2.arcLength(contour, True)
        contour_approx = cv2.approxPolyDP(contour, epsilon, True)
        contours_approx.append(contour_approx)

    contours_parent = []
    for i, contour in enumerate(contours_approx):
        parent_idx = hierarchies[0][i][3]
        if parent_idx < 0 and len(contour) >= 3:
            contours_parent.append(contour)
        else:
            contours_parent.append([])

    for i, contour in enumerate(contours_approx):
        parent_idx = hierarchies[0][i][3]
        if parent_idx >= 0 and len(contour) >= 3:
            contour_parent = contours_parent[parent_idx]
            if len(contour_parent) == 0:
                continue
            contours_parent[parent_idx] = merge_with_parent(contour_parent, contour)

    contours_parent_tmp = []
    for contour in contours_parent:
        if len(contour) == 0:
            continue
        contours_parent_tmp.append(contour)

    polygons = []
    max_area = 0
    max_contour = None
    # Get the largest contour based on area
    for contour in contours_parent_tmp:
        area = cv2.contourArea(contour)
        if area > max_area:
            max_area = area
            max_contour = contour
    if max_contour is not None:
        polygon = max_contour.flatten().tolist()
        return polygon


def coco_segmentation_to_mask(segmentation, bbox, frame_size):
    x, y, w, h = bbox
    width, height = frame_size
    if isinstance(segmentation, list):
        # Polygon -- a single object might consist of multiple parts, so merge
        # all parts into one mask RLE code
        segmentation = normalize_coco_segmentation(segmentation)
        if len(segmentation) == 0:
            return None
        rle = mask_utils.merge(mask_utils.frPyObjects(segmentation, height, width))
    elif isinstance(segmentation["counts"], list):
        # Uncompressed RLE
        rle = mask_utils.frPyObjects(segmentation, height, width)
    else:
        # RLE
        rle = segmentation
    mask = mask_utils.decode(rle)
    polygon = mask2polygon(mask)
    return polygon


def coco_annotations_to_detections(
    image_annotations: List[dict], resolution_wh: Tuple[int, int], with_masks: bool
) -> Detections:
    if not image_annotations:
        return Detections.empty()

    class_ids = [
        image_annotation["category_id"] for image_annotation in image_annotations
    ]
    xyxy = [image_annotation["bbox"] for image_annotation in image_annotations]
    xyxy = np.asarray(xyxy)
    xyxy[:, 2:4] += xyxy[:, 0:2]

    if with_masks:
        polygons = []
        for image_annotation in image_annotations:
            segmentation = image_annotation["segmentation"]
            print("Segmentation: ", segmentation)  # debug output
            if len(segmentation) > 1:
                s = coco_segmentation_to_mask(
                    segmentation, image_annotation["bbox"], resolution_wh
                )
                reshaped = np.reshape(np.asarray(s, dtype=np.int32), (-1, 2))
            else:
                reshaped = np.reshape(
                    np.asarray(image_annotation["segmentation"], dtype=np.int32),
                    (-1, 2),
                )
            polygons.append(reshaped)
        mask = _polygons_to_masks(polygons=polygons, resolution_wh=resolution_wh)
        return Detections(
            class_id=np.asarray(class_ids, dtype=int), xyxy=xyxy, mask=mask
        )

    return Detections(xyxy=xyxy, class_id=np.asarray(class_ids, dtype=int))
```
Hi @DancinParrot 👋🏻 I must admit, I am very confused. Initially, I thought it was only about loading COCO annotations consisting of multiple segments. Is that still the case? I don't quite understand why we need all these extra steps like RLE conversion and polygon conversion.
Yup, still that. It was my mistake; I misunderstood the issue. It turns out Fiftyone split the mask for one annotation into multiple parts during export. The dataset is still structured properly, though, as it can be read by Fiftyone and Label Studio; it's only supervision that is unable to load it.
The conversion to RLE, I suppose, merges all masks within an annotation into one array, which is later decoded and used as input for the `mask2polygon()` function. EDIT: I have updated the original issue to include my understanding of the issue.
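A minimal sketch of that RLE merging step in isolation, using pycocotools directly (polygon values are made up for illustration):

```python
from pycocotools import mask as mask_utils

# Two separate polygons belonging to the same annotation (illustrative values).
segmentation = [
    [10, 10, 40, 10, 10, 40],
    [60, 60, 90, 60, 60, 90],
]

height, width = 100, 100
rles = mask_utils.frPyObjects(segmentation, height, width)  # one RLE per polygon
merged_rle = mask_utils.merge(rles)                         # union of all parts
mask = mask_utils.decode(merged_rle)                        # (height, width) uint8 array
print(mask.shape, mask.sum() > 0)
```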
I think we should just be able to update this section of the code:

```python
def coco_annotations_to_masks(
    image_annotations: List[dict], resolution_wh: Tuple[int, int]
) -> npt.NDArray[np.bool_]:
    return np.array(
        [
            rle_to_mask(
                rle=np.array(image_annotation["segmentation"]["counts"]),
                resolution_wh=resolution_wh,
            )
            if image_annotation["iscrowd"]
            else polygon_to_mask(
                polygon=np.reshape(
                    np.asarray(image_annotation["segmentation"], dtype=np.int32),
                    (-1, 2),
                ),
                resolution_wh=resolution_wh,
            )
            for image_annotation in image_annotations
        ],
        dtype=bool,
    )
```

It is unhappy because it does not expect multiple lists under `segmentation`. The easiest way (not the most efficient, but still a lot more efficient than converting through all of the representations above) is to loop through the lists in `segmentation`, convert each one into a mask with `polygon_to_mask`, and combine the resulting masks.
This seems a lot more efficient. Thanks! I'll try it out tomorrow when I get access to my work laptop and update this thread on the results.
@DancinParrot Sure! Let me know how it goes.
Hi @SkalskiP! Thank you so much for your help! Here's the code that I've implemented based on your recommendation, and it seems to merge all the polygons very well:

```python
def merge_masks(segmentations, resolution_wh):
    # Convert each polygon to a mask and combine them with a logical OR.
    parent = None
    for s in segmentations:
        if parent is None:
            parent = polygon_to_mask(
                polygon=np.reshape(
                    np.asarray(s, dtype=np.int32),
                    (-1, 2),
                ),
                resolution_wh=resolution_wh,
            )
        else:
            mask = polygon_to_mask(
                polygon=np.reshape(
                    np.asarray(s, dtype=np.int32),
                    (-1, 2),
                ),
                resolution_wh=resolution_wh,
            )
            parent = np.logical_or(parent, mask)
    return parent


def coco_annotations_to_masks(
    image_annotations: List[dict], resolution_wh: Tuple[int, int]
) -> npt.NDArray[np.bool_]:
    return np.array(
        [
            (
                rle_to_mask(
                    rle=np.array(image_annotation["segmentation"]["counts"]),
                    resolution_wh=resolution_wh,
                )
                if image_annotation["iscrowd"]
                else (
                    merge_masks(image_annotation["segmentation"], resolution_wh)
                    if len(image_annotation["segmentation"]) > 1
                    else polygon_to_mask(
                        polygon=np.reshape(
                            np.asarray(
                                image_annotation["segmentation"], dtype=np.int32
                            ),
                            (-1, 2),
                        ),
                        resolution_wh=resolution_wh,
                    )
                )
            )
            for image_annotation in image_annotations
        ],
        dtype=bool,
    )
```

Any feedback for further improvement is much appreciated. Also, should I create a PR for this?
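As a quick sanity check of `merge_masks` (not part of the proposed change; it assumes `polygon_to_mask` is importable from the top-level supervision package):

```python
from supervision import polygon_to_mask  # import path assumed

# Two separate triangles in flattened COCO polygon format [x1, y1, x2, y2, ...].
segmentations = [
    [10, 10, 40, 10, 10, 40],
    [60, 60, 90, 60, 60, 90],
]

merged = merge_masks(segmentations, resolution_wh=(100, 100))
assert merged.shape == (100, 100)
assert merged.any()  # both parts end up in a single boolean mask
```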
Hi @DancinParrot 👋🏻, that seems like a good starting point. Please open a PR proposing the change. 🙏🏻
Hi @SkalskiP! Will do, thanks!
Search before asking
Bug
My current COCO dataset includes annotations with more than one segmentation mask for the same class. As a rough analogy: one eye of a cat is segmented as a whole, but when the dataset is exported from Fiftyone, two polygons (segmentation masks) are produced for that one eye.
As a result, when the COCO dataset is loaded into my program using supervision, the program crashes with the following error:
After some research, I discovered Ultralytics' JSON2YOLO repository on GitHub and adapted the library's merge_multi_segment() function (seen here) in supervision's coco.py file, which then allows the COCO dataset to be loaded.
Environment
Minimal Reproducible Example
The following code is used to load the COCO dataset with the annotations_path being the path to a .json file containing the paths and annotations for all images in the dataset:
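A minimal sketch of such a loading call, assuming supervision's `DetectionDataset.from_coco` API with the `force_masks` flag (paths are placeholders):

```python
import supervision as sv

# Placeholder paths; force_masks is assumed to be the flag that makes the loader
# parse segmentation masks in addition to bounding boxes.
dataset = sv.DetectionDataset.from_coco(
    images_directory_path="dataset/images",
    annotations_path="dataset/annotations.json",
    force_masks=True,
)
```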
The following is an example of a class/category containing multiple segmentation masks:
Additional
No response
Are you willing to submit a PR?