AN AMBITIOUS attempt to create a "knowledge engine" will go live next week. Called Wolfram Alpha, it is designed to understand search requests made in everyday language and work out the answer to factual questions on almost any aspect of human knowledge.
A preview of Alpha carried out by New Scientist has revealed some of the new technology's abilities but also exposed some shortcomings. Meanwhile, in an apparent attempt to steal Alpha's thunder, Google has released a data visualisation tool that may provide stiff competition when fully developed.
Alpha was created by Stephen Wolfram, famous for the software package Mathematica. He employed more than 150 people to collect information on all the major branches of science, from the properties of the elements and the location of planets to the relationships between species and the sequence of the human genome. Economic measures, such as inflation histories for specific countries, are included, as are geographic, cultural and many other data sets.
Alpha's potential stems from the fact that distinct data sets are assembled in the same place, and in a form that can be manipulated by Mathematica, which includes a huge range of tools for analysing and displaying data. That will mean users can combine previously disparate information on, say, economic performance and sports results, or trade patterns and population changes. "Our goal is to provide expert-level knowledge to everyone on the planet," says Russell Foltz-Smith, part of the Alpha team at Wolfram Research in Los Angeles.
So have they succeeded? I was given a chance to test the site prior to launch. Simple questions, such as the population of Sweden (9 million) or the boiling point of carbon dioxide (-78 °C), are answered with ease. While Wikipedia can answer both questions just as quickly, Alpha has an advantage over its crowd-sourced rival: all of its content has been checked by a team of experts. Wolfram's site is aiming to be as trustworthy as gold-standard sources, such as NASA's climate data sets. "If it's not 100 per cent verifiable and accurate to the highest standards then it will not show up in our system," says Foltz-Smith.
To test the system's power further, I asked for a plot of the ratio of the population of China to Japan, found the date and time of the next solar eclipse that will occur in Timbuktu and checked my health status based on my cholesterol level and age. Right now, Google can't provide direct answers to such questions unless someone has performed these calculations and placed the results on appropriately labelled web pages - but that may be about to change (see "Google to give Alpha a run for its money").
Wolfram's team developed software that distinguishes between types of question, such as those beginning "what" and "who", and turns them into queries that the database can process. In theory, Alpha requests can be made using everyday language, but the level of natural-language processing is very limited, so results are hit-and-miss.
I asked "$25 million 1945 dollars in 2008" and Alpha successfully worked out that I wanted to know the modern equivalent of a sum of money in the past. Sure enough, out came the answer: $300 million. But it got confused when I entered the request in another form - "$25 million 1945 in 2008". The phrase "1945 in" was interpreted as 1945 inches, which Alpha multiplied by $25 million and then 2008 to give 98 trillion, measured in the bizarre unit of "inch US dollars".
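The arithmetic behind that misreading can be checked in a few lines. The sketch below only reproduces the multiplication described above, using the three figures from the request; it says nothing about how Alpha itself parses queries, and the variable names are illustrative rather than anything taken from Wolfram's software.

```python
# Figures come from the request "$25 million 1945 in 2008".
# Variable names are hypothetical, chosen for illustration only.
amount_usd = 25_000_000   # "$25 million"
length_inches = 1945      # "1945 in", misread as a length in inches
bare_number = 2008        # "2008", treated as a plain multiplier

# Multiplying a dollar amount by a length yields the odd compound unit.
product = amount_usd * length_inches * bare_number

print(f"{product:,} inch US dollars")              # exact product
print(f"about {round(product / 1e12)} trillion")   # rounded, as reported
```

The exact product is 97,639,000,000,000, which rounds to the 98 trillion figure quoted.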
That was not the only oddity. In a request to plot some data over the "period 1990-2000", for instance, Alpha interpreted "period" as referring to one of the three 20-minute sections of an ice hockey game.
Nicholas Carr, a technology writer and editorial board member at Encyclopaedia Britannica, warns that internet users have little tolerance for sites that do not work as expected: "Any level of frustration sends people away."
Foltz-Smith says that some of these problems will be ironed out before the 11 May launch and that others will be corrected by watching how people use the site. Users will also be able to pay for premium access to Alpha, allowing them to write code to extract and combine data and, if they wish, integrate it with their own data sets.