If you pay attention to technology, or even read the weekly tech section in your local newspaper, you have heard of Wolfram|Alpha from the creator of Mathematica. Many people have written and talked about it since its release in mid-May, but I had not spent any time looking at it until I heard an interview on TWiT with Stephen Wolfram.
As I listened to the interview, I got the sense that Wolfram|Alpha could be a new take on business intelligence - at least my picture of what BI would be if it were using publicly available data. Based on what I've read and what I heard in the interview, Wolfram|Alpha has loads of curated, organized data plus calculations and other operators that can be run against that data. Plus the stories suggested that you could ask it questions in something approaching natural language, such as "what is the distribution of pet dogs in the USA." I had this picture that you could ask intelligible, quantitative questions and that it would provide a report or analysis (not just a number) in return.
I don't think it is quite ready for prime time yet - something that Wolfram|Alpha acknowledges to some extent in its about pages. After all, it "only" has 10 trillion pieces of data and 50,000 algorithms & models to crunch across that data. Asking questions that feel like natural language didn't work for me. Once you start getting the hang of the input language (and it does feel like an input language), it does provide some interesting information. And the results are presented in an attractive form that could be used within a larger report.
There are a bunch of handy examples to show you what it can do, and get you started down rat holes of doing calculations and discovering more information.
Here are a few interesting tidbits and many more "I wish it coulds":
- The classic ego query does some fun things. Enter a (recognizable) first name, and it tells you the prevalence of that name on US birth certificates and combines that with US mortality data and predicts the number of people alive today with that name and an age distribution.
- On the downside, it doesn't handle name variants or less common names (like my wife's) very well. Nor does it seem to handle the reverse question, like "most popular male given name," which uses language generated by the report for given names. (I discovered that James is currently 15th, and is expected to be the 2nd most prevalent name among all living US men.)
- In cases where there is a lot of underlying data, W|A often gives the option to see more detail, either behind the calculation or more members of a long list.
- If there are multiple interpretations of an input known to Wolfram|Alpha, it will pick the most likely and tell you so, for example 'Assuming "walnut" is a species specification,' with the other interpretations offered as links to click. (I like the data that comes back for walnut as a food.)
- It provides definitions of words, either directly or as alternate interpretations. Along with the basic definition, it also provides frequency of use (in each part of speech); thesaurus entries for narrower and broader terms and synonyms; even a synonym network. (Sadly, I cannot click the synonym network to explore it, as we can with the Visual Thesaurus.)
- You can get a collation of data about your home town, but it doesn't seem to be much different from what's on Wikipedia, other than maybe the current weather.
- About 0.95% of the world's population speaks French.
- Once you find a useful query, Wolfram|Alpha makes it easy to do comparisons, usually with a comma separator, such as a comparison of US usage of the given male names Michael, James, John. (Sometimes W|A will interpret a space as a list separator, depending on the terms in your query.)
- And if you can get a number from a query, you can do computations with it, such as the weight ratio of black and brown bears (even accounting for ranges).
- If you can find a data set that has multiple dimensions, then you can start doing fun things, like getting GNP / population and comparing for various countries: US vs Tuvalu (countries with the highest and lowest GNP).
- As I said earlier, it's frustrating that I cannot reverse the queries. Many results provide highest/lowest rankings of numeric information. Rather than just list GNP, I'd like to ask for "country with highest GNP" or "countries with GNP > $1 trillion" but I can't quite seem to reverse the calculations or get the input language down. Something like "top ten GNP" could be useful. Ah, it all depends on the dataset: "highest population" gives you the "highest countries by population," but it can't tell me "highest countries by GNP."
- On Friday, I will be 4960π.
- I'd love to be able to specify a list and an operation and have W|A run the operation on every member of the list. "mortality" gives you raw numbers and country ranking. But what if you want deaths per population and a ranking of that? "mortality / population" gives you the global view with no country ranking. You can get the ratio one country at a time, such as for the USA or Switzerland. I was looking for this calculation on all countries.
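
To make those last few wishes concrete, here is a minimal Python sketch of the list operations I was hoping to run: a ratio computed for every member of a list and then ranked, a "greater than" filter, and a "top N" query. This is not how Wolfram|Alpha works and it does not call Wolfram|Alpha at all. The country names, the figures, and the helper names (`rank_by_ratio`, `filter_by`, `top_n`) are all made up for illustration.

```python
# Hypothetical placeholder figures: NOT real statistics and NOT Wolfram|Alpha output.
countries = {
    "Country A": {"deaths": 2_400, "population": 300_000, "gnp": 900},
    "Country B": {"deaths": 70, "population": 7_000, "gnp": 40},
    "Country C": {"deaths": 500, "population": 100_000, "gnp": 1_500},
}


def rank_by_ratio(data, numerator, denominator):
    """Compute numerator / denominator for every entry, highest ratio first."""
    ratios = {
        name: row[numerator] / row[denominator] for name, row in data.items()
    }
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


def filter_by(data, field, threshold):
    """Names of the entries whose field exceeds the threshold, alphabetically."""
    return sorted(name for name, row in data.items() if row[field] > threshold)


def top_n(data, field, n):
    """Names of the n entries with the largest value of field."""
    return sorted(data, key=lambda name: data[name][field], reverse=True)[:n]


# "mortality / population" for every country, with a ranking.
for name, ratio in rank_by_ratio(countries, "deaths", "population"):
    print(f"{name}: {ratio:.4f} deaths per person")

# The "reverse" queries: a threshold filter and a top-N list.
print(filter_by(countries, "gnp", 1_000))
print(top_n(countries, "gnp", 2))
```

The point is only that each of these is a one-line operation once the data sits in a list, which is why it is frustrating not to find a way to phrase them in the input language.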