Results are not reproducible for me
Hi Team,
I have downloaded the weights and dataset, and I am trying to reproduce the metrics for the object detection task on 500 images. However, I am getting very low performance metrics. Using the COCO metric calculation, the average is less than 0.5.
Hi ojasvii,
It seems to me that COCO uses the XYWH format for coordinates, while the `post_process_object_detection` method returns bounding boxes in the XYXY format.
Could that be the source of your issue?
That makes sense, as the XY coordinates are shared between both formats, meaning the bounding box starts at the correct point. However, since the width and height are interpreted differently, it causes discrepancies in the bounding box dimensions and positioning. This would result in non-zero but lower performance metrics, which explains why you’re getting an average score of around 0.5.
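To make the formats concrete, here is a minimal sketch of the conversion: COCO expects `[x_min, y_min, width, height]`, while XYXY boxes are `[x_min, y_min, x_max, y_max]`, so the top-left corner is identical and only the last two values need converting. (The helper name is just illustrative, not part of any library.)

```python
def xyxy_to_xywh(box):
    """Convert an [x_min, y_min, x_max, y_max] box to COCO's [x, y, width, height]."""
    x_min, y_min, x_max, y_max = box
    return [x_min, y_min, x_max - x_min, y_max - y_min]

# A 30x40 box anchored at (10, 20): the corner is unchanged,
# but the last two values become width/height instead of x_max/y_max.
print(xyxy_to_xywh([10, 20, 40, 60]))  # [10, 20, 30, 40]
```

If XYXY boxes are fed to the COCO evaluator unconverted, the "width" and "height" it sees are actually the bottom-right corner coordinates, so the boxes still start in the right place but are far too large, which matches the partial-but-low scores described above.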
We converted the output to XYWH format for the COCO metrics. We kept the threshold at 0.4, as mentioned in the code snapshot.
OK, if you are using the mAP metric, then a score of 0.5 might be consistent with the evaluation, as it separately measures GIoU and bounding box class accuracy.
Maybe testing with a higher threshold could increase the mAP by reducing the number of false positives.
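A quick sketch of what raising the threshold looks like in practice, assuming the post-processing returns parallel lists of boxes, scores, and labels (the function name and signature here are hypothetical, not from the repo):

```python
def filter_by_score(boxes, scores, labels, threshold=0.5):
    """Drop detections whose confidence falls below the threshold."""
    kept = [(b, s, l) for b, s, l in zip(boxes, scores, labels) if s >= threshold]
    if not kept:
        return [], [], []
    kept_boxes, kept_scores, kept_labels = zip(*kept)
    return list(kept_boxes), list(kept_scores), list(kept_labels)

boxes = [[0, 0, 10, 10], [5, 5, 20, 20], [30, 30, 40, 40]]
scores = [0.9, 0.3, 0.6]
labels = [1, 1, 2]
# Only the 0.9 and 0.6 detections survive a 0.5 threshold.
print(filter_by_score(boxes, scores, labels, threshold=0.5))
```

Note the trade-off: a higher threshold removes false positives (helping precision) but can also drop true detections (hurting recall), so mAP may not improve monotonically.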
Sorry, I won’t be able to help further.
I have increased it to 0.5, but it didn't help. Do you remember applying any post- or pre-processing?
No, I don’t apply any additional pre-processing.
Do the visual results seem to match the mAP score you are measuring, or do the results appear better than what the mAP indicates?
This is how it looks. I think there are many predictions per image, and the metrics are very low partly because of that. That's why I asked whether any post-processing is applied.
Oh, I see, that makes more sense now. As we can see, the prediction itself is quite accurate, but there are too many overlapping bounding boxes nested within each other. My performance measurement for bounding boxes doesn’t account for this, as I start from the ground truth bounding box and reference the closest predicted one (in terms of GIoU). To reduce this issue, applying post-processing to remove bounding boxes nested inside others should help a lot.
Increasing the detection threshold or filtering by score could also help reduce this phenomenon by limiting low-confidence predictions.
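One standard way to remove those nested, overlapping predictions is greedy non-maximum suppression: keep the highest-scoring box, discard any remaining box that overlaps it beyond an IoU threshold, and repeat. A minimal sketch (this is generic NMS, not code from the repo; `torchvision.ops.nms` does the same thing if you are already in PyTorch):

```python
def iou(a, b):
    """Intersection-over-union of two [x_min, y_min, x_max, y_max] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the boxes to keep,
    highest-scoring first, suppressing heavy overlaps."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep

boxes = [[0, 0, 10, 10], [1, 1, 9, 9], [50, 50, 60, 60]]
scores = [0.9, 0.8, 0.7]
# The nested second box overlaps the first heavily and is suppressed.
print(nms(boxes, scores))  # [0, 2]
```

Combined with the score filtering above, this should remove most of the duplicate boxes visible in the screenshot before the mAP is computed.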