| A conversation I had with my nephew over the Christmas 2006 holidays seems to have revealed a simple insight regarding the hope that computers will one day recognize and interpret image files on the World Wide Web. But first, a little history. You may have noticed that some online transactions require the user to enter a code (embedded in an image file) by hand before the transaction can proceed. This is to confirm that a human being is requesting the transaction, not an automated program. Computers cannot understand images. Google and other search engines are able to search for images on the web (fairly effectively) only when the images are associated with tags containing ordinary text, which a computer can understand. Now, there are two kinds of organizations which would benefit greatly from a computer's ability to interpret image files directly, independent of any associated tags: search engines and security agencies such as the US Department of Homeland Security, not to mention the police departments of every city on the planet. Cameras on street corners, in shopping malls, and at airport terminals would be able to scan for the electronic signatures of the faces of known fugitives and terrorists. Now here's the insight that arose during the conversation mentioned at the beginning of this article: with so many image formats used on the web and in all the digital cameras and cell phones, there are too many possible electronic signatures for each of the elemental patterns and shapes which must be detected in an image before image recognition software can evaluate any sense of its possible meaning. So, the obvious solution is for image recognition software to analyze images only after importing them into a single image format which is native to the software and designed with the utmost simplicity, with no fancy algorithms. 
Essentially, image recognition software would take a screenshot of an image, thus ignoring the binary contents of the original file, then look for the electronic signatures of elemental patterns and shapes in the screenshot. Each elemental pattern or shape would have only one possible signature, because only one image format is ever used. An image could be searched for very loose matches to some signatures and very tight matches to others. Or the software could adjust the match tolerance setting automatically, depending on whether any matches to the signature database are being found: it could first search an image for tight matches, then loosen the match tolerance until it begins to get matches. For this software to function, it would require a database of the signatures of all the shapes we wish to detect. For example, at Symbols.Net, I may wish to have a search engine capable of searching for triangles superimposed on circles or squares in a certain way. The software would search for the signatures of circles, triangles and squares, then return the results where these signatures occur near each other in an image. |
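
To make the idea a little more concrete, here is a minimal sketch in Python of the steps described above. Everything in it is an assumption made for illustration, not a finished design: the "native format" is taken to be nothing more than a grid of 0s and 1s, a "signature" is a small grid of the same kind, and the names `to_native`, `search_loosening` and `co_located`, along with the threshold, floor and step values, are hypothetical. The input is assumed to be already decoded into rows of grayscale values (0 to 255), which stands in for the screenshot.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]  # assumed "native format": rows of 0/1 pixels


def to_native(gray: List[List[int]], threshold: int = 128) -> Grid:
    """Reduce a decoded grayscale raster (0-255) to the plain 0/1 grid."""
    return [[1 if px >= threshold else 0 for px in row] for row in gray]


def match_score(image: Grid, sig: Grid, top: int, left: int) -> float:
    """Fraction of signature pixels that agree with the image at this offset."""
    h, w = len(sig), len(sig[0])
    agree = sum(
        image[top + r][left + c] == sig[r][c]
        for r in range(h)
        for c in range(w)
    )
    return agree / (h * w)


def find_matches(image: Grid, sig: Grid, min_score: float) -> List[Tuple[int, int]]:
    """Slide the signature over the image; keep offsets scoring >= min_score."""
    h, w = len(sig), len(sig[0])
    hits = []
    for top in range(len(image) - h + 1):
        for left in range(len(image[0]) - w + 1):
            if match_score(image, sig, top, left) >= min_score:
                hits.append((top, left))
    return hits


def search_loosening(
    image: Grid, sig: Grid, start: float = 1.0, floor: float = 0.7, step: float = 0.05
) -> Tuple[Optional[float], List[Tuple[int, int]]]:
    """Start with tight matching and loosen until something matches."""
    min_score = start
    while min_score >= floor - 1e-9:
        hits = find_matches(image, sig, min_score)
        if hits:
            return min_score, hits
        # Round to avoid floating-point drift across repeated subtraction.
        min_score = round(min_score - step, 10)
    return None, []


def co_located(
    hits_a: List[Tuple[int, int]], hits_b: List[Tuple[int, int]], max_dist: int
) -> List[Tuple[Tuple[int, int], Tuple[int, int]]]:
    """Pairs of hits from two signatures lying within max_dist rows and columns."""
    return [
        (a, b)
        for a in hits_a
        for b in hits_b
        if max(abs(a[0] - b[0]), abs(a[1] - b[1])) <= max_dist
    ]
```

A search for, say, a triangle superimposed on a circle would run `search_loosening` once per signature on the same normalized grid and then pass the two hit lists to `co_located`. Comparing pixels one by one is only the simplest possible notion of a signature; it does not handle shapes that are scaled or rotated, which a real signature database would have to address.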