Robot Ping Pong: Triangulation the unglamorous way

After struggling for weeks to get OpenCV to perform the triangulation for me, I've weakened my usually-high academic integrity and have done something gritty and practical.

Images

Well, let me back up. First I took some new images of the ping pong table. These images are a higher resolution of 640x480, in case the imprecision of pixel coordinates was part of my problem. They also include 6 new real-world points to build the correspondence from. They also use (approximately) parallel gaze directions for the two cameras, and keep the two cameras close together, meaning that the left and right images are fairly similar to each other. Here are the images I'm using now.

You can see I've added the ping pong net to the half-table, put some cans with orange tops on the table surface, and marked out the spots on the floor below the front two corners of the table. Those are my six new points. This was motivated by a fear that my previous eight points included two that were collinear. Based on my (slow and painful) reading of the text books I bought, I got the impression that collinear points don't add to accuracy. And six points is insufficient for some algorithms to solve for everything.

I also improved the accuracy of my manual pixel-marking tool. It is still not able to provide sub-pixel accuracy, but it does really choose the right pixel. Before I was just taking my first mouse click as close enough, but the cursor was often a pixel or two off from where I had intended to select. Now I can follow up the initial click with single-pixel movements using the arrow keys until I have the right pixel marked. Sub-pixel accuracy is theoretically possible by finding the intersection of lines, like the table edges, but I haven't gone that far yet.

Results

How does it work when run through the same program? About as well as the old images. Here are the rectified images with points.

But does the triangulation work? Nope. I still get something that doesn't match the real-world coordinates of my table.

However, there is something new. Remember how I said in past blog entries that the point cloud from the triangulation wasn't even in the shape of a table? Now it is. It's the correct shape, and apparently the correct scale, but it is rotated and translated from the real-world coordinates. Here is a 3D scatter plot from Matlab of the triangulation output.

Hard to make any sense of it, right? What about this one?

I hope that is easier to see the shape of the table. All I did was rotate the view in the Matlab plot. While I did this rotation manually the first time, I found a way to solve for the best rotation and translation to bring the points into the correct orientation. I found the algorithm (and even some Matlab code!) from this guy. And you know what? It works! I can get a rotation matrix, a translation vector, and applying them to the triangulated points, I get something very close to the true real-world 3D coordinates of the table.

Back to OpenCV

I searched high and low to find that rotation matrix in the many outputs from OpenCV. No luck. I figured it might be that OpenCV's triangulation gives me answers with the camera at the origin (instead of my real-world origin at the near-left corner of the table). But the rotations from solvePnP don't seem to work. I experimented with handedness and the choice of axes. That didn't seem to work. Basically nothing works. I would be grateful if anyone reading this could leave a comment telling me where I can get the correct rotation to apply! Or, for that matter, why it needs a rotation in the first place!

After many days of frustration, this morning I gave up. You know what? If I can solve for the correct rotation/translation in Matlab, why can't I do it in C++? So that's what I did. I implemented the same algorithm in C++, so that I can apply it directly to OpenCV's output from triangulation. And it works too. It's unglamorous, having to solve to find it, when it should be readily available, but it gets the job done.

Now that I have good triangulated points, I can see how accurate the method is. I calculated the root-mean-squared distance between the true point (as measured from the scene and table dimensions) and the triangulated point. I get something around 12mm. So in this setup, I would expect to be able to turn accurate ball-centers in each image into a 3D ball location to within 12mm. That sounds pretty good to me.

What's Next?

I feel a great sense of relief that I can triangulate the table, because I've been stuck on this for so long. I can't say that I'm delighted with how I did it, but at least I can move on, and maybe come back to solve this problem the right way another time.

Next, I need to return to video, from this detour into still images. I need to drop back to 320x240 images, and get a ping-pong ball bouncing. But I'm going to keep the new correspondence points (the net, the corners on the floor, and even the cans). I will experiment with having the cameras further apart and not having parallel gaze. Mr. W insists that this will result in better triangulation. I get his point -- it's a crappy triangle if two corners are in the same place -- but I need to make sure that all the OpenCV manipulation works just as well.

2 comments:

MirjamNovember 14, 2013 at 9:54 AM
Hi JB,

not sure whether you're still working on your project, but maybe I can help you a bit. As you supposed, the world coordinate system's origin in OpenCV is attached to the left camera. So you have to interpret all 3-D coordinates with respect to the orientation of the left camera. I quote from the very helpful OpenCV book "Learning OpenCV" by Bradsky and Kaehler: "Note that it is a right-handed coordinate system: if you point your right index finger in the direction of X and bend your right middle finger in the direction of Y, then your thumb will point in the direction of the principal ray." i.e. the x-axis is pointing to the right, the y-axis is pointing downwards, and the z-axis points in viewing direction of the camera.
Not sure whether this was of any help, but I'd like to know whether you are still working on your project.

Cheers,
Mirjam

Be nice, remember I'm an amateur, but by all means please give me feedback!

Robot Ping Pong

Sunday, March 24, 2013

Triangulation the unglamorous way

2 comments:

About Me