I find it interesting (and funny) to observe that the market for unstructured information management is evolving so fast that we don’t even share a name or expression to define it.
If we consider structured information, we all agree that we’re talking about databases, data warehouses, data mining and, more recently, business intelligence; but we don’t have anything similar for unstructured data. In fact, depending on circumstances, applications and points of view, all of the following names can be, and actually are, used:
· search engine
· information retrieval
· information extraction
· clustering
· text mining
· ETL
· content management
· enterprise search technology
· content access tools
· semantic intelligence (we ourselves invented this one)
· information access technology
· categorization
· text analytics
It’s also interesting to note that even IDC and Gartner don’t agree with each other on the name of the field: Gartner refers to “information access technology”, while for IDC the right term is “content access tools”. Luckily, they have “access” in common.
Of course, I realize there are bigger problems in the world, but I would like to see a clearer and more widely shared approach to names and expressions. Considering how important the label “business intelligence” was in establishing a group of technologies and solutions that were already available on the market, I think it is crucial to converge on a common terminology as soon as possible.