Searching Smarter, Not Harder

When it comes to searching, more is not always better. Data experts are honing topic maps to better classify the information that's out there. By John Gartner.

Databases and search engines provide instantaneous access to endless information about anyone or anything, but the search results often include as many misses as hits. To generate more-relevant answers, organizations including the federal government are using topic maps to index their data.

Topic maps are smart indices that improve search capabilities by categorizing terms based on their relationships with other things. For example, William Shakespeare is a topic that would be mapped to essays about him, his plays and his famous quotes.
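
As a rough illustration of the idea, a topic can be modeled as a name plus links to resources and to other topics. The sketch below is a minimal one, not the format defined by the topic-maps standard; the `Topic` class, the `related` helper, the relationship labels and the essay file name are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A subject, the resources about it, and its links to other subjects."""
    name: str
    # Resources that discuss the topic (hypothetical placeholder entries).
    occurrences: list = field(default_factory=list)
    # Pairs of (relationship label, name of the related topic).
    associations: list = field(default_factory=list)


shakespeare = Topic("William Shakespeare")
shakespeare.occurrences.append("essays/shakespeare-biography.txt")  # hypothetical file
shakespeare.associations.append(("author-of", "Hamlet"))
shakespeare.associations.append(("source-of-quote", "To be, or not to be"))


def related(topic, relationship):
    """Return the names of topics linked to `topic` by `relationship`."""
    return [other for rel, other in topic.associations if rel == relationship]


print(related(shakespeare, "author-of"))
```

Because each link carries a label, a search can follow only the kind of relationship the user cares about, such as plays rather than quotes.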

Organizing content with topic maps provides context for words that can have multiple meanings, according to Patrick Durusau, chairman of a topic maps technical committee at OASIS, the Organization for the Advancement of Structured Information Standards.

For example, searching Google for "Franz Ferdinand" mixes results for the alternative rock group and the doomed Austrian archduke for whom the group is named. If topic maps were used to organize the data, the musical and historical links would be separated, Durusau said. "The payoff (of topic maps) from the user standpoint is that you are no longer confronted with everything in the world that is known about the subject," Durusau said.

Durusau said the Internal Revenue Service began developing topic maps to organize its tax forms about three years ago. Topic maps are used to help IRS representatives answer phone calls more efficiently, as well as to create the small-business CD that the agency sends to taxpayers. The IRS also uses topic maps to compare its data with that of the Social Security Administration, which "is structured completely different," he said.

Computer automation and human intervention are used in building topic maps, according to Michel Biezunski, president of InfoLoom and a consultant on the IRS project. He said an artificial-intelligence application groups the data into a preliminary map that is then refined by people. "You need experts to build the relationships" between terms, according to Biezunski.

Biezunski, who helped write the topic-maps specification that was passed by the International Organization for Standardization, said several U.S. Department of Defense agencies are building topic maps, and that the legal and pharmaceutical industries are the next ones likely to index their data. "We are only at the beginning" of adoption, he said.

George Kondrach, president of software company Innodata Isogen, has been consulting with several U.S. intelligence agencies on how to use topic maps to overcome regional variations in spelling. Kondrach said agencies are working to define suspected terrorists as topics so that differences in agency spelling, such as "Osama" versus "Usama," would no longer prevent linking to vital information.
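
The spelling problem Kondrach describes can be sketched by treating every variant of a name as a pointer to one shared topic. This is an assumption about how such a system might work, not a description of any agency's software; the `NameIndex` class, the topic identifier and the record strings are hypothetical.

```python
class NameIndex:
    """Maps variant spellings of a name to one topic and its records."""

    def __init__(self):
        self._topic_for_name = {}  # lowercased spelling -> topic id
        self._records = {}         # topic id -> list of records

    def add_topic(self, topic_id, names):
        """Register a topic and every spelling that should resolve to it."""
        self._records.setdefault(topic_id, [])
        for name in names:
            self._topic_for_name[name.lower()] = topic_id

    def attach(self, name, record):
        """File a record under whichever topic the spelling resolves to."""
        topic_id = self._topic_for_name[name.lower()]
        self._records[topic_id].append(record)

    def lookup(self, name):
        """Return all records for the topic, whatever spelling was used."""
        topic_id = self._topic_for_name.get(name.lower())
        return list(self._records.get(topic_id, []))


index = NameIndex()
index.add_topic("person-001", ["Osama", "Usama"])  # hypothetical topic id
index.attach("Osama", "record from agency A")      # placeholder records
index.attach("Usama", "record from agency B")

print(index.lookup("Usama"))
```

Either spelling returns both records, because the records are filed under the topic rather than under the text of the name.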

"The same problem exists in tracing genealogy," where last-name changes are common, Kondrach said. Topic maps can accelerate building family trees, as family members would be defined by all of their relationships, simplifying the process of tracking previous generations and extended families.

Eric Freese, a software engineer for research company LexisNexis, helped to create the specification for the XML representation of topic maps that is now part of the ISO standard.

Outside the U.S. government, Freese said the most interest in topic maps is coming from Europe, where companies such as Ontopia, Mondeca and Empolis are developing commercial applications. "The fact that it is gaining ground in Europe has me optimistic that we'll figure it out here," Freese said.

The sagging U.S. economy has slowed the adoption of topic maps in the private sector, according to Freese. "In 2002 (when the XML standard was finalized), nobody was spending on new technology except the government," Freese said. LexisNexis has a few prototype applications using topic maps but has not yet updated its commercial databases.

Freese said topic maps would allow a LexisNexis query of the word "Iowa" to differentiate between the university, the state and the jurisdiction. "It makes sense to present the multiple choices (of context) before returning all of the results," he said.
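
The behavior Freese describes, offering a choice of contexts before returning results, might look something like the sketch below. It is not LexisNexis code; the `TOPICS` table, the `search` function and the document identifiers are hypothetical stand-ins.

```python
# Hypothetical index: search term -> context -> matching document ids.
TOPICS = {
    "Iowa": {
        "university": ["doc-u1"],
        "state": ["doc-s1", "doc-s2"],
        "jurisdiction": ["doc-j1"],
    }
}


def search(term, context=None):
    """Ask the user to pick a context when a term has more than one."""
    by_context = TOPICS.get(term, {})
    if context is None and len(by_context) > 1:
        # Ambiguous term: return the choices instead of every result.
        return {"choose_context": sorted(by_context)}
    if context is None:
        # Zero or one context: nothing to disambiguate.
        return {"results": [d for docs in by_context.values() for d in docs]}
    return {"results": list(by_context.get(context, []))}


print(search("Iowa"))
print(search("Iowa", context="state"))
```

The first call returns only the list of contexts, and the second returns the documents filed under the state once the user has chosen.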

Freese said search engines such as Google could take advantage of topic maps to increase the accuracy of web search without any changes to the web pages they are indexing. He said that several people have successfully converted directories, such as the Open Directory Project, into topic maps.