It is common to find, after significant effort, that a problem is difficult to solve completely satisfactorily, and to conclude that the current approaches may not be the right ones for it. We then start exploring an entirely different, sometimes opposite, approach. This new approach may give us some early success and appear promising. That early success produces a euphoria in which we declare that the earlier approaches were really duds and that the new approach is what will solve the problem to our complete satisfaction. In this phase we abandon the old approaches and ignore the successes they produced. Moreover, we assume that the new approach solves our problem completely, even though it has not yet come close to what the old approaches achieved. This is the phase of confusion. I think that in several areas of search, particularly multimedia search (images, video, audio, ...), we are going through this phase.
It is clear that automatically extracting content from images or any other media is a difficult problem. Even for text we have not been able to do it. As a result, our search engines either use simple keyword-based approaches or rely on approaches that have a significant manual component and address only specific areas. Another interesting finding was that for a large and amorphous collection of information, a taxonomy-based approach was too rigid for navigation. Since it turned out to be relatively easy to build inverted file structures to search for keywords in large collections, people found the idea of tags attractive. By somehow assigning tags, we could organize and search relatively unstructured files. At about the same time, the idea of the wisdom of crowds became popular. So it is easy to argue that tags could be assigned by people, that the result would be 'wise tags' (because they are assigned by the crowd), and that this would be a much better approach than the dictatorial taxonomy.
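To make the inverted-file idea concrete, here is a minimal sketch of a tag index in Python. The photo ids, the tags, and the function names `build_tag_index` and `search` are all made up for illustration; this is not how any particular site implements it.

```python
from collections import defaultdict


def build_tag_index(photo_tags):
    """Build an inverted index: each tag maps to the set of photo ids carrying it."""
    index = defaultdict(set)
    for photo_id, tags in photo_tags.items():
        for tag in tags:
            # Lowercase so that "Marathon" and "marathon" share one entry.
            index[tag.lower()].add(photo_id)
    return index


def search(index, *tags):
    """Return the ids of photos that carry every one of the given tags."""
    if not tags:
        return set()
    postings = [index.get(tag.lower(), set()) for tag in tags]
    return set.intersection(*postings)


# Hypothetical collection, for illustration only.
photo_tags = {
    "photo1": ["marathon", "newyork"],
    "photo2": ["marathon", "boston"],
    "photo3": ["sunset"],
}
index = build_tag_index(photo_tags)
print(search(index, "marathon"))
print(search(index, "marathon", "newyork"))
```

The lookup is cheap because each tag points directly at its photos, but note that it matches tags only as exact strings: a photo is found only if somebody assigned precisely the tag being searched for.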
The idea is appealing and has made Del.icio.us and flickr.com the darlings of many people. These sites are cited as examples of the huge success of folksonomy. I am excited about this approach, but I cannot help asking myself, and you, whether it is really working, and if not, whether it can be made to work.
If everybody assigned several appropriate tags to a photo that she uploaded, and the crowd seeing that photo also assigned appropriate tags, then the wisdom of crowds might come into action. But if the uploader rarely assigns tags and viewers, if any, assign tags even more rarely, then there is no crowd and there is no wisdom. Interesting game-like approaches (see WWW.ESPGAME.ORG) are being developed to get tags assigned to images.
How successful is this approach really? Based on my unscientific and ad hoc analysis, it seems that very few tags are being assigned to photos on flickr by the people who upload them, and even fewer by viewers. Also, just today I saw that the tags assigned to what I believe is the same event could be nycmarathon, nymarathon, newyorkcitymarathon, and similar combinations. It appears that without any guidance people really get confused about how to assign tags.
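One simple form of guidance would be to suggest an existing tag when a newly typed tag looks like a near-duplicate. The sketch below uses plain string similarity from Python's standard `difflib` module; the function name `suggest_existing_tag`, the list of known tags, and the 0.8 cutoff are assumptions chosen for illustration, not a description of what any tagging site does.

```python
import difflib


def suggest_existing_tag(new_tag, known_tags, cutoff=0.8):
    """Return the most similar already-used tag, or None if nothing is close enough.

    Similarity is purely character-based (difflib's ratio), so this only
    catches spelling-level variants, not abbreviations or synonyms.
    """
    matches = difflib.get_close_matches(
        new_tag.lower(), known_tags, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


# Hypothetical tags already in use, for illustration only.
known_tags = ["nycmarathon", "sunset", "brooklyn"]

print(suggest_existing_tag("nymarathon", known_tags))
print(suggest_existing_tag("beach", known_tags))
```

Character similarity handles a variant such as nymarathon, but a spelled-out form such as newyorkcitymarathon differs too much from nycmarathon to pass a strict cutoff. Merging those would need a curated list of equivalent names, which is a small piece of taxonomy laid on top of the free-form tags.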
Remember that at one time, for information retrieval purposes, many journals started asking authors to select index words for their articles. It is clear that automatically extracted keywords made search engines successful where those index words failed.
Based on what I have seen so far, it appears that success may come from some interesting combination of taxonomy and folksonomy. It would be great to hear about your experience in this area.