How do I get the coordinates of a generated bounding box using an inference script from Google’s Object Detection API? I know that printbox[0][i] returns the prediction for the ith detection in the image but what exactly do these returned numbers mean? Is there a way to get xmin,ymin,xmax,ymax? Thanks in advance.
1> Gal_M..:
The Google Object Detection API returns bounding boxes in [ymin, xmin, ymax, xmax] format, with each value normalized relative to the image dimensions (full description here). To get the (x, y) pixel coordinates, multiply the x values by the image width and the y values by the image height. First get the width and height of the image:
width, height = image.size
Then extract ymin, xmin, ymax, xmax from the boxes object and multiply to get the (x, y) coordinates:
ymin = boxes[0][i][0] * height
xmin = boxes[0][i][1] * width
ymax = boxes[0][i][2] * height
xmax = boxes[0][i][3] * width
Finally print the coordinates of the corners of the box:
print('Top left')
print((xmin, ymin))
print('Bottom right')
print((xmax, ymax))
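Putting the steps together, here is a minimal sketch that converts every detection in one image to pixel coordinates. The function names `box_to_pixels` and `boxes_to_pixels` are hypothetical helpers, not part of the API, and the sample values are made up for illustration. It assumes boxes has the same layout as above (boxes[0][i] is [ymin, xmin, ymax, xmax], normalized) and rounds to the nearest pixel, which you may or may not want:

```python
def box_to_pixels(box, width, height):
    """Convert one normalized [ymin, xmin, ymax, xmax] box
    to pixel coordinates in (xmin, ymin, xmax, ymax) order."""
    ymin, xmin, ymax, xmax = box
    # x values scale with the image width, y values with the height
    return (round(xmin * width), round(ymin * height),
            round(xmax * width), round(ymax * height))


def boxes_to_pixels(boxes, width, height):
    """Convert all detections for the first image in the batch (boxes[0])."""
    return [box_to_pixels(box, width, height) for box in boxes[0]]


# Made-up sample data: a batch of one image with two detections
sample_boxes = [[[0.1, 0.2, 0.5, 0.6],
                 [0.0, 0.0, 1.0, 1.0]]]
pixel_boxes = boxes_to_pixels(sample_boxes, width=640, height=480)

for xmin, ymin, xmax, ymax in pixel_boxes:
    print('Top left', (xmin, ymin), 'Bottom right', (xmax, ymax))
```

Since image.size gives (width, height) for a PIL image, you would pass those two values in place of the hard-coded 640 and 480.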