2

I am profoundly disturbed by a request to develop an iPad app that measures the actual size of an object using the camera.

It is simply not practical.

Translating a 2D image into 3D is never easy. I either need extra hardware or have to make a lot of assumptions (which will almost never hold true). Regardless, I will need time.

However, the person who asked me to develop the app is adamant about his vision. He tries to show that he is right by listing a number of existing apps that do similar jobs. He does not realize that all the apps he refers to have ratings lower than three stars, which suggests they probably do not do their job.

How can I convince this person that such an app simply cannot be built in a practical way by a single developer? Is there a PhD thesis I can use as my defense?

YoYoMyo
  • 12
    Why do you have to convince them so much? Is it not possible to simply say "No, I'm not going to take this project, good luck with it" ? – FrustratedWithFormsDesigner Jan 30 '12 at 19:12
  • 2
    Why not hold an iPad 3 feet away from an object, ask how big it is, then walk 2 feet closer and ask again? I don't see how this could be successful without a meaningful, accurately known distance from the object, and there isn't any built-in hardware that I can see providing that... – Rig Jan 30 '12 at 19:12
  • @Rig Technically you might be able to determine this if the camera is sensitive enough to measure the time from a blink of a flash to the reception of the reflected light but if not you'll have to start with the camera and the flash. – Karlson Jan 30 '12 at 19:18
  • A good solution to this would probably not involve converting a 2D image to 3D, but getting multiple images from known locations and then using the differences between the images to calculate the size of an object. – FrustratedWithFormsDesigner Jan 30 '12 at 19:19
  • Are you looking for feedback on the idea, e.g whether or not it can/cannot be done & done well, or how to discuss these situations with bosses/clients/customers in general now and into the future? – jcmeloni Jan 30 '12 at 19:25
  • 4
    Did you actually download the apps in question and take a look for yourself? Did you see how accurate they are, or are you just looking at a star rating based on opinions of others? Remember, most people don't leave reviews unless they are unhappy with a product. – Tyanna Jan 30 '12 at 19:31
  • ...and you probably don't need to try and find a PhD to prove your point that it's impossible, as I did a project in undergrad that involved using a camera to find the distance to objects that moved around on a table. We didn't calculate dimensions of the objects, but we could have. And I'm pretty sure the iPad probably has as much processing power as the workstation we used back then (and possibly a better camera, too). ;) – FrustratedWithFormsDesigner Jan 30 '12 at 19:32
  • 9
    I have discovered a truly remarkable method to do this which this comment is too small to contain. – Dave Nay Jan 30 '12 at 19:32
  • See if you can get them to watch the UFO series episode "Close Up". It deals with the importance of context and metadata in interpreting photos. It may be doable depending on the level of control you have over these aspects. – jfrankcarr Jan 30 '12 at 19:36
  • 1
    I saw an app to do this for real-estate listings, but it involved sticking a square of a known size to the front of the building. – Paul Tomblin Jan 30 '12 at 19:36
  • 2
    Put it on rentacoder. I bet the guy who offered to solve `P = NP` for $500 will take it. – Paul Tomblin Jan 30 '12 at 19:48
  • @PaulTomblin That's what I mean by extra hardware. But even so, the result would be dependent on how the user uses the reference. Moreover, the things the app tries to measure include those long incandescent light bulbs. The user cannot simply stick a reference on the ceiling. – YoYoMyo Jan 30 '12 at 19:50
  • 1
    "The user cannot simply stick a reference on the ceiling". False. They can simply stick a reference on the ceiling. That's what a long stick and a post-it note are for. – S.Lott Jan 30 '12 at 19:54
  • Do you have access to the focus point of the camera? If you know how far away the camera is focused, couldn't you use that to estimate how big the object is (by comparing object size in-screen to total screen size)? This of course assumes you're looking at the object roughly perpendicularly. You'd also need a way to pick out the object itself, or just be able to place a bounding box. – Clockwork-Muse Jan 30 '12 at 20:04
  • I don't see how this can be "not constructive" and "this question will likely solicit opinion". – ysdx Jan 30 '12 at 21:33
  • 1
    Have a look at basic Computer Vision courses. It works the same as human vision: you cannot find the size (or equivalently the depth) of a given object unless you add a second camera (and have calibration for the pair of cameras). If you do not have this, you can only rebuild the world up to a scale factor: you may be able to figure out the relative sizes of objects, but to get absolute sizes, you need to know some reference size from the scene. – ysdx Jan 30 '12 at 21:35
  • It's a tough problem, but I think it's theoretically possible. As ysdx says, you can wave the camera around, and use computer vision techniques to get a 3D model. Without additional information, you don't know the absolute size of the model -- but the iPad has an accelerometer, so you may be able to correlate camera motion with the accelerometer data to estimate an absolute size. I'm not sure how rough that estimate would be, though... – comingstorm Jan 30 '12 at 23:42
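
To make the two-camera point from the comments concrete, here is a minimal sketch in Python of the textbook stereo relation, depth = focal length × baseline / disparity. It assumes two parallel, calibrated cameras; the function names and all the numbers are made up for illustration, and nothing here is taken from an iPad API.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Distance to a point seen by two parallel, calibrated cameras."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def real_size_m(size_px: float, depth_m: float, focal_px: float) -> float:
    """Real-world size of something spanning size_px pixels at a known depth."""
    return size_px * depth_m / focal_px


# Hypothetical values: 1000 px focal length, cameras 10 cm apart,
# and the object shifts 50 px between the two images.
depth_m = depth_from_disparity(focal_px=1000.0, baseline_m=0.10, disparity_px=50.0)
height_m = real_size_m(size_px=300.0, depth_m=depth_m, focal_px=1000.0)
print(depth_m, height_m)
```

With a single camera there is no baseline, so the depth (and with it the absolute size) stays unknown unless something else in the scene supplies the scale.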

5 Answers

15

It's called "Forced Perspective"

Step 1. Go outdoors with a friend. Stand in front of a building.

Step 2. Take a picture so that the friend is really close to you and the building is really far away. Make sure that the top of the friend's head lines up with the top of the building.

Step 3. Ask your customer how tall it is, given just the photograph.

Do not specify what "it" is that the customer must define the height of. Let them assume -- or guess -- what part of the picture is relevant.
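
For anyone who prefers numbers to photographs, the same ambiguity can be sketched in a few lines of Python. This assumes an ideal pinhole camera; the focal length, heights, and distances are invented for illustration.

```python
import math


def projected_height_px(real_height_m: float, distance_m: float,
                        focal_px: float = 1000.0) -> float:
    """Image height in pixels under an ideal pinhole camera model."""
    return focal_px * real_height_m / distance_m


# Hypothetical scene: a 1.8 m friend at 3 m and a 60 m building at 100 m.
friend_px = projected_height_px(1.8, 3.0)
building_px = projected_height_px(60.0, 100.0)

# True when both subjects span the same number of pixels in the photo.
print(math.isclose(friend_px, building_px))
```

Only the ratio of height to distance reaches the image, so any pair with the same ratio produces the same picture.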

S.Lott
  • 1
    http://offroadinghome.blogspot.com/2011/04/geo-tography-forced-perspective.html has numerous examples. – S.Lott Jan 30 '12 at 20:43
  • +1 @S.Lott: Great answer, though I have a few questions: Why would you not need to know the distance between the two objects being forced into alignment in order to estimate the height of the unmeasured object? Also, would not both the top and the bottom of both objects need to be aligned, not just the top? Lastly, would it be correct to say that 'it' is always the closer of the two objects? Thanks! – blunders Jan 31 '12 at 00:32
  • 1
    @blunders: The point is that **none** of that is known. The bottoms are **never** aligned when taking forced perspective pictures. Therefore, the picture is utterly useless. The estimation of size from an image is impossible. All **forced perspective** pictures are ample proof that an image -- by itself -- is utterly unusable. Please look at some forced perspective pictures. Do your own Google search. – S.Lott Jan 31 '12 at 10:40
  • Yes, I did my own Google search, thanks... :-) ...where exactly does it say in the question that nothing is knowable other than the image, or are you saying that's your assumption? The only requirement I see is "measure the actual size of an object using camera", which in itself does not state that additional information is not knowable. – blunders Jan 31 '12 at 11:50
  • Also, 'all forced perspective' is not proof it's not possible in my opinion, or that the top/bottom may never align; eg knowing the height of a penny & quarter, I would be able to figure out roughly how far apart these [two coins are from each other](http://simplykuni.files.wordpress.com/2010/07/img_0179_edited-1.jpg?w=650). – blunders Jan 31 '12 at 11:51
  • 1
    @blunders: But you don't know the height of the person or the building. Therefore, you have zero basis for guessing the height of either. – S.Lott Jan 31 '12 at 12:40
  • 1
    @blunders: Also. If it's possible, please provide a photo of a random object and judge the size from the content of photo **alone**. No additional facts (i.e., diameter of a quarter) can be imposed. Please post the solution, rather than claim that it may be possible. That way I can revise my answer based on your results. – S.Lott Jan 31 '12 at 12:42
  • Again, where exactly does it say in the question that nothing is knowable other than the image, or are you saying that's your assumption? Appears so, but you're not directly stating it, and while the OP might later state that's the case, currently my understanding is the only requirement is "measure the actual size of an object using camera", which does not preclude awareness of additional information in my opinion. I personally would hope my friends know how tall they are... :-) ...but my point is you're saying it's not possible, when in fact it is possible based on my understanding. – blunders Jan 31 '12 at 12:47
  • 1
    @blunders: I'm glad you have such a good insight into the question. It does not say any additional information will be input. I'm sure you think it's important to add requirements like that, but I failed to see them. Since you know so much more about the question -- as asked -- I'm sure you will have a much better answer. That's the point. You could (if this wasn't closed) provide your own answer based on whatever other information you feel should have been part of the question. – S.Lott Jan 31 '12 at 13:06
  • Again... **Yes, or no**, does the question say that nothing is knowable other than the image? If no, I'd be happy to provide an explanation of how it's possible to with my existing answer, otherwise there's no point, since it's irrelevant. – blunders Jan 31 '12 at 13:37
  • 1
    @blunders: "It does not say any additional information will be input". Seems clear to me. Apparently, it does not seem clear to you. I'm so sorry that this is so confusing. I absolutely cannot see how additional input is required. Apparently, you can. I don't see what words provide that hint to you. – S.Lott Jan 31 '12 at 13:47
  • "It does not say any additional information will be input" -- states that the question does not disclose if additional information will be input, which is not the subject of my question. My question is "Yes, or no, does the question say that nothing is knowable other than the image?" Meaning the question either does not state additional information is forbidden, or does state additional information is forbidden. Your statement and my question have different meanings. Does that help clear up my question? If so, please answer yes, or no. – blunders Jan 31 '12 at 13:59
  • 1
    @blunders: "It does not say any additional information will be input". How can that be misinterpreted? I have no clue. The question clearly does not require additional input. The question does not list any additional input. No additional input is required by the question. The question does not list, mention, reference or even suggest additional input is needed. No additional input is mentioned in the question. There is no additional input described in the question. There are no requirements for additional input in the question. I do not know what hair you are splitting. – S.Lott Jan 31 '12 at 14:59
  • Hmm. It's a simple yes or no question, **does the question say that nothing is knowable other than the image?** If so, where? As for why I'm splitting hairs, in the example you gave, given the top/bottom of the person/building are aligned, and the user supplies the estimated distance to the person, building, and person's height - I believe it would be possible to calculate the height of the building; which you appear to be saying is impossible to do using aligned forced perspective, and I'm saying it's not. Also, I believe that would meet the requirement stated, and would be a simple app to do. – blunders Jan 31 '12 at 15:50
  • To be honest, I thought that's what you were originally saying, only to realize it appeared you were attempting to provide a proof for why the request was impossible. – blunders Jan 31 '12 at 15:54
  • 1
    "the user supplies the estimated distance to the person, building, and person's height". That's just triangulation. Well known. Clearly workable. I don't see how the user supplying all that extra information is in the sketchy requirements in the question. As I've said before, you clearly are comfortable with those additional pieces of data. I don't see where that's allowed. You do. – S.Lott Jan 31 '12 at 16:19
  • +8 @S.Lott: To assume something is not allowed unless it's clearly stated as being allowed is a very limiting view of the world in my opinion, further unless someone says something is not allowed, to me it's fair game. That said, thank you for your attempt to understand what I was saying, and as a result, I've upvoted all of your comments, cheers! – blunders Jan 31 '12 at 18:08
  • @blunders: The question is not "how can I?" In which case, adding information would be relevant, useful and helpful. The question is "how do I show that it's impossible?" I'm simply showing that -- absent additional requirements -- it's impossible. You are free to manufacture a different question if you think that's helpful. I'm unable to guess. And didn't feel the need to ask. – S.Lott Jan 31 '12 at 18:33
6

Instead of trying to convince someone that something cannot be done, try to analyze and determine what it would take to get it done.

When you calculate that it would take you alone 10 years and $1 billion the solution of what to do with the project will become obvious.

Karlson
6

It's possible - plus, you could always do the processing remotely, only using the phone to collect data and display results. Beyond that, there's also nothing that says you're not able to mount a device to the phone's camera to split and offset 2D input, and then convert those two inputs into a 3D input.

As for app ratings, that's not really a solid basis for understanding the feasibility of a concept, or its complexity.

Just have fun with the challenge, it's not the end of the world.

blunders
4

Why not force the user to take the picture with a clearly displayed item of standard size next to the item being measured, a penny for example?

It's probably not as good a solution as your employer wants, but explaining a solution like this would at least show you can make something happen.

As you start listing the limitations and compromises needed, he will surely back away from a project like this (unless you have a very good and sizable team behind you).
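
The reference-object idea boils down to one proportion. Here is a rough sketch in Python, assuming the penny and the object lie in the same plane, that plane is parallel to the camera sensor, and the user (or some detection step not shown here) has already measured both in pixels. The function name and the pixel values are hypothetical.

```python
PENNY_DIAMETER_MM = 19.05  # diameter of a US one-cent coin


def size_from_reference(object_px: float, reference_px: float,
                        reference_mm: float = PENNY_DIAMETER_MM) -> float:
    """Scale a pixel measurement by a reference of known size in the same plane."""
    if reference_px <= 0:
        raise ValueError("reference must span at least one pixel")
    return object_px * reference_mm / reference_px


# Hypothetical measurements: the penny spans 100 px, the object spans 400 px.
width_mm = size_from_reference(object_px=400.0, reference_px=100.0)
print(round(width_mm, 2))
```

Every one of those assumptions is a limitation worth listing: a tilted camera, or a reference sitting closer to the lens than the object, skews the result.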

brian
  • That's what I mean by extra hardware. But even so, the result would be dependent on how the user uses the reference. Moreover, the things the app tries to measure include those long incandescent light bulbs. The user cannot simply stick a reference on the ceiling. – YoYoMyo Jan 30 '12 at 19:52
3

Actually I'm not sure you are correct.
Provided the user assists the app, it should be possible.

Take a 1D measurement as an example. Stand 1 m away from an object 1 dm high and enter this as user input to your app. Then stand 1 km from a mountain and enter that distance as user input. Keep the same camera angle for both the 1 dm object and the mountain. The app should then calculate the height of the mountain easily.
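
Worked through with similar triangles, the example above looks like this in Python. It is only a sketch: it assumes the reference object and the mountain subtend exactly the same angle in the camera, and the function name is made up.

```python
def height_by_similar_triangles(ref_height_m: float, ref_distance_m: float,
                                target_distance_m: float) -> float:
    """Height of a target that subtends the same angle as a known reference."""
    if ref_distance_m <= 0:
        raise ValueError("reference distance must be positive")
    return ref_height_m * target_distance_m / ref_distance_m


# 1 dm object at 1 m, mountain at 1 km (the distances entered by the user).
mountain_m = height_by_similar_triangles(0.1, 1.0, 1000.0)

# Same reference, but the user misjudges the distance to the mountain by 10%.
misjudged_m = height_by_similar_triangles(0.1, 1.0, 1100.0)
print(round(mountain_m, 1), round(misjudged_m, 1))
```

The result scales linearly with the distance the user types in, so any error in that estimate carries straight through to the computed height.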

I would suggest studying triangulation further for implementation details.

You can take this further by placing your phone on a tripod and keeping the object to be measured always at a known distance. This way everything can be precalibrated, and your app could tell you the 2D measurements of the object under inspection just by counting pixels in height and width.

  • I agree, I can think of several different solutions. – zzzzzzzzzzzzzzzzzzzzzzzzzzzzzz Jan 30 '12 at 19:15
  • 3
    this is an extremely fragile solution – Ryathal Jan 30 '12 at 19:36
  • 1
    In the case of measuring the mountain, would the user have to stand exactly 1km away from the edge of the mountain? Or the exact center of the mountain? I really don't think this would work at all. It's easy to stand exactly 1m away from something that is rather small, but scaling it up to a mountain does not seem feasible. – stuartmclark Jan 31 '12 at 09:41
  • @stuartmclark The user should stand 1km away, cathetus not hypotenuse, from the mountain top as that is what he is measuring. –  Jan 31 '12 at 10:41