
fix bug of evaluation only on frames with detections #36

Merged (1 commit, Dec 19, 2023)

Conversation

@liqq010 (Contributor) commented on Jun 29, 2022

Fixes the evaluation bug described in Issue #34. In the previous evaluation, when calculating mAP, only frames with detection results were considered: frames without detection results were skipped, and frames with zero ground-truth labels were also skipped, which is incorrect.

Changes:
calc_mAP.py

  • Frames with zero ground-truth boxes are now included; only frames that are not annotated (annotated = 0) are ignored.
  • Frames without detection results are no longer ignored; each is given an empty box array, an empty label array, and an empty score array (a sketch follows this list).
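A minimal sketch of that padding idea (toy data; gt_boxes and its values are hypothetical, while the pred_* names and the empty-array shapes follow the diff quoted below):

import numpy as np

# Toy stand-ins for the parsed ground truth and detections; frame_1 has
# ground truth but no detections.
gt_boxes = {'frame_0': np.array([[0.1, 0.1, 0.5, 0.5]]),
            'frame_1': np.array([[0.2, 0.2, 0.6, 0.6]])}
pred_boxes = {'frame_0': np.array([[0.1, 0.1, 0.5, 0.5]])}
pred_labels = {'frame_0': np.array([3], dtype=int)}
pred_scores = {'frame_0': np.array([0.9], dtype=float)}

# Pad: every annotated ground-truth frame gets an entry in the prediction
# dicts, so frames the detector produced nothing for still count against recall.
for image_key in gt_boxes:
    if image_key not in pred_boxes:
        pred_boxes[image_key] = np.empty(shape=[0, 4], dtype=float)
        pred_labels[image_key] = np.array([], dtype=int)
        pred_scores[image_key] = np.array([], dtype=float)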

Previous evaluation results:
{'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Amber': 0.4818979219978077,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Brake': 0.2737587107923605,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Green': 0.5513865051639313,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/HazLit': 0.11620857348102875,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/IncatLft': 0.034942395130747914,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/IncatRht': 0.12010168705057221,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Mov': 0.2954432712747147,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/MovAway': 0.39639561101448373,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/MovLft': 0.001175556462010909,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/MovRht': 0.0012504468524578121,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/MovTow': 0.4835664230674499,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Ovtak': 0.008611427224683544,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/PushObj': 0.0,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Red': 0.6935386888511119,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Rev': 0.007226113801476214,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Stop': 0.4943986054465475,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/TurLft': 0.08228591878765251,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/TurRht': 0.11892425574961471,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Wait2X': 0.2885705863438699,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Xing': 0.27006308968504367,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/XingFmLft': 0.22231675867047157,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/XingFmRht': 0.23739371151153493,
 'PascalBoxes_Precision/mAP@0.5IOU': 0.2354298299254351}

New evaluation results:
{'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Amber': 0.48424418194749086,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Brake': 0.25302643469768726,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Green': 0.5483037838953202,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/HazLit': 0.1065736326357181,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/IncatLft': 0.035225949468750725,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/IncatRht': 0.11757236548808014,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Mov': 0.283627116532719,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/MovAway': 0.3769292320941177,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/MovLft': 0.000808028653421845,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/MovRht': 0.0009041913921086089,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/MovTow': 0.4698070987135969,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Ovtak': 0.0077842530030811895,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/PushObj': 0.0,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Red': 0.6879472285182406,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Rev': 0.004109427276317987,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Stop': 0.46862518085443705,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/TurLft': 0.08007121980246595,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/TurRht': 0.10975420613050413,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Wait2X': 0.30674470212547544,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/Xing': 0.2672282257701669,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/XingFmLft': 0.20849638315574076,
 'PascalBoxes_PerformanceByCategory/AP@0.5IOU/XingFmRht': 0.2154771025323883,
 'PascalBoxes_Precision/mAP@0.5IOU': 0.2287845429403559}

@Edwardius (Contributor) left a comment:

LGTM

@@ -199,6 +208,7 @@ def run_evaluation(labelmap, groundtruth, detections, exclusions, logger):
    start = time.time()
    num_pred_ignored = 0
    for image_key in pred_boxes:
        # ignore frames without ground-truth annotations
Contributor:

What is this comment referring to?

Comment on lines +193 to +195
pred_boxes[image_key] = np.empty(shape=[0, 4], dtype=float)
pred_labels[image_key] = np.array([], dtype=int)
pred_scores[image_key] = np.array([], dtype=float)
Contributor:

Is this what the pascal_evaluator expects in the case of no detections? Or where did you get this idea?

@liqq010 (Contributor, Author) replied on Jul 7, 2022:
I read the evaluation code in detail. Before the tp/fp calculation, invalid detection boxes are removed, so the empty detection boxes we provide are filtered out here:

detected_boxes, detected_scores, detected_class_labels, detected_masks = (
    self._remove_invalid_boxes(detected_boxes, detected_scores,
                               detected_class_labels, detected_masks))

Then the N detection boxes and M ground-truth boxes are used to calculate tp/fp. I think everything after this step is correct, and it follows the pascal_evaluator's evaluation process.
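As a quick sanity check (a toy snippet, not the evaluator's exact filtering code), an empty (0, 4) detection array passes through a validity mask unchanged, so the padded frames cause no errors:

import numpy as np

# A validity mask in the spirit of _remove_invalid_boxes: keep boxes with
# positive height and width. On an empty array it simply returns empty.
detected_boxes = np.empty(shape=[0, 4], dtype=float)
valid = ((detected_boxes[:, 2] > detected_boxes[:, 0]) &
         (detected_boxes[:, 3] > detected_boxes[:, 1]))
print(detected_boxes[valid].shape)  # (0, 4), no error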

If we don't provide the empty detection boxes, then before the step described above, fewer images are evaluated, according to this part of the code:

if image_key in self.groundtruth_boxes:
    groundtruth_boxes = self.groundtruth_boxes[image_key]
    groundtruth_class_labels = self.groundtruth_class_labels[image_key]
    # Masks are popped instead of look up. The reason is that we do not want
    # to keep all masks in memory which can cause memory overflow.
    groundtruth_masks = self.groundtruth_masks.pop(image_key)
    groundtruth_is_difficult_list = self.groundtruth_is_difficult_list[image_key]
    groundtruth_is_group_of_list = self.groundtruth_is_group_of_list[image_key]
else:
    groundtruth_boxes = np.empty(shape=[0, 4], dtype=float)
    groundtruth_class_labels = np.array([], dtype=int)
    if detected_masks is None:
        groundtruth_masks = None
    else:
        groundtruth_masks = np.empty(shape=[0, 1, 1], dtype=float)
    groundtruth_is_difficult_list = np.array([], dtype=bool)
    groundtruth_is_group_of_list = np.array([], dtype=bool)
scores, tp_fp_labels = (
    self.per_image_eval.compute_object_detection_metrics(
        detected_boxes=detected_boxes,
        detected_scores=detected_scores,
        detected_class_labels=detected_class_labels,
        groundtruth_boxes=groundtruth_boxes,
        groundtruth_class_labels=groundtruth_class_labels,
        groundtruth_is_difficult_list=groundtruth_is_difficult_list,
        groundtruth_is_group_of_list=groundtruth_is_group_of_list,
        detected_masks=detected_masks,
        groundtruth_masks=groundtruth_masks))
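This also explains why the corrected numbers come out slightly lower: the previously skipped frames now add their ground-truth boxes to the recall denominator. A toy illustration with hypothetical counts:

# Hypothetical counts, not taken from the dataset.
true_positives = 40
gt_on_evaluated_frames = 80   # ground-truth boxes on frames with detections
gt_on_skipped_frames = 20     # ground-truth boxes on frames with none

recall_old = true_positives / gt_on_evaluated_frames                           # 0.5, inflated
recall_new = true_positives / (gt_on_evaluated_frames + gt_on_skipped_frames)  # 0.4, correct
print(recall_old, recall_new)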

@@ -111,8 +111,12 @@ def read_json(json_file, class_whitelist=None, load_score=False):
    scores = defaultdict(list)
    ann_dict = json.load(json_file)
    for video in ann_dict['db'].keys():
        # filter ground-truth of validation set only
        if 'val_1' not in ann_dict['db'][video]['split_ids']:
Contributor:

Only 'val_1' is used here; this should be a changeable param.
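A hypothetical sketch of that suggestion (the function name iter_split_videos and its default are illustrative, not part of the PR):

import json

# Expose the split name as a parameter instead of hard-coding 'val_1';
# mirrors the filtering in read_json above.
def iter_split_videos(json_file, split='val_1'):
    ann_dict = json.load(json_file)
    for video, entry in ann_dict['db'].items():
        # keep only videos belonging to the requested split
        if split in entry['split_ids']:
            yield video, entry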

@Edwardius merged commit 41d2652 into master on Dec 19, 2023