In 1968 Arthur C. Clarke introduced us to his computer named HAL. He had us believing all we needed to do was talk to HAL. HAL would listen, understand and do what we wanted. Until, that is, HAL developed an evil soul and did nasty things to humans. The evil soul is pure fiction but HAL is not.
2010 is the year we get to meet the real HAL. He may still be a child but he is growing up fast thanks to four trends in computing that have coalesced and are now ready to explode. These trends are The Cloud, The Pipe, The UI and The API. A depiction is below.
The Cloud is elastic computing power. It is more than renting a server from a service provider. It means automatic, on-demand scalability onto as many servers as are needed to accomplish a task or take care of a sudden flood of customer needs. The Cloud gives any size organization the appearance and performance of Google-sized computing.
The Pipe is high-speed internet connectivity, everywhere and all the time. Typical wired internet speeds today are over 6 Mbps and wireless connections are quickly catching up; 3G deployments are common and 4G is coming soon. The biggest trend in mobile devices is smartphones. These devices do more than route phone calls: they also manage email, calendars, music, applications and the entire internet. But of course the processing power to do these things is not all on the device. Instead it’s up in the cloud.
The UI or User Interface is smart. Speech to text and semantic technologies combine to allow for the appearance of intelligence. Computers or mobile phones spoken to in natural language understand and then locate, calculate, connect, tally, and display the answer to queries rather than simply list resources for you. Try Nuance, Vlingo or Google Mobile for speech to text accuracy. Try us at Expert System for semantic processing accuracy.
The API or Application Programming Interface means really useful applications. APIs package the first three trends so that creative types can quickly make applications for specific tasks, domains or verticals, and make lots of them. Look how many iPhone / iPod touch applications have been built in the last 2 years alone. Many have been built by individuals and not large corporations.
These four trends create a virtuous cycle. They combine to bring a suddenly higher platform of computing, one that engages the imagination, has enormous productivity, improves processes and creates new value out of existing information resources.
No, you can’t really see or touch HAL. But be assured he is there, working in the background, growing, learning and getting smarter every day. He is ready to serve you. Just ask.
A few days ago, Microsoft announced its intent to abandon the development of a Linux/Unix version of FAST, the corporate search engine it purchased a couple of years ago. The decision didn’t really take anyone by surprise, given that Linux is Windows’ only real rival in the world of servers. So, obviously, Microsoft would have no interest in developing solutions for the competition’s operating system; not to mention that FAST is increasingly integrated with SharePoint, which just further goes to prove my point.
From a strategic point of view, the choice is quite understandable. But from a sales standpoint, it seems to be an enormous sacrifice and a huge opportunity for the competition. From what I have read, at least half of FAST users have Linux/Unix (some actually say it’s close to 80%). This means that these users will have to use another company’s search engine should they decide to change theirs. With this aspect in mind, I think Microsoft would have been better off if they had continued development on systems other than Windows. However, if I think about the fact that our search engine is compatible with Linux, thus giving us more sales opportunities, then I think they made the right decision.
I recently read quite a few interesting articles about Twitter. The most intriguing (and exciting) was about the first tweet from outer space. At the moment, the concept of an intergalactic World Wide Web resides in the minds of few earthlings ;-), but there are, however, already hypothetical plans for web servers to be hosted on Mars and on the Moon! Last week, Twitter’s effect on crowdsourcing was addressed by Alec Ross and Jared Cohen in a chat moderated by Google’s Eric Schmidt, where social networks in general were discussed. But what really caught my eye was an article reporting the live coverage of an in-flight incident that was averted. Apparently, a man attempted to open the plane’s exit door, but was promptly stopped by other passengers. Among those on board was the General Services Administration’s CIO, who sent out three tweets as the action took place, and in less than 300 characters, created a sensational news story.
E-mails, text messages, and social networks are some of the most innovative communication instruments today. The advantage of the text message is that it is simple and accessible to everyone (in fact, tens of millions of messages are sent every day). These messages could certainly become functional and immediate channels for public involvement in safety issues. Citizens could use these systems, on a 24-hour basis, to give notice about events and situations as they happen, so that the public could be better served and numerous criminal acts could possibly be avoided.
The potential risk, however, is that these messages will go unacknowledged, or even worse, that they will be taken into consideration when it’s too late. For this reason, once citizens are offered the opportunity to participate directly, it is essential that law enforcement be ready and prepared to listen to them. The complication is that enormous quantities of information need to be managed efficiently. Semantic technology can be used to resolve this problem; it is able to support the activities of data collection and analysis and can quickly sort through messages, thanks to its ability to “understand” text. In this situation, it could easily be applied to a system which allows citizens to use social networks, e-mails or mobile services to report crimes or alert officials to neighborhood situations such as broken streetlights, potholes, vandalism, etc.
Instant blogging has forever changed the life of new generations, but can it also revolutionize public safety? I believe it will, and I believe that the real enabler behind this revolution will be semantic technology.
In science we have tackled great problems. It was only a few years ago that we mapped the human genome. Imagine unlocking the code of what makes us human. More recently, scientists have been studying how proteins operate or, more precisely, how they fold. It is in the folding that we learn what a protein is intended for and what job it is supposed to do. Once we unlock this we will know how diseases form, replicate and, most importantly, how to beat them… all of them.
So what does the information science of semantics have to do with proteins? Semantics fold too. That’s what.
Scientists studying proteins that fold are discovering their most important and elemental attributes.
The same is true with semantics. Boil a sentence down to its most elemental parts and you get what is called a triple, that is, a subject, a predicate and an object. So consider the sentence below:
“John works in the White House”.
Subject: Who or what does the sentence describe? Obviously, that would be “John”.
Predicate: What is the property that describes or connects the subject to the rest of the sentence? That would be the verb “works”.
Object: What is the value of the property? That would be “White House”.
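The breakdown above can be written down as a small data structure. What follows is only a minimal sketch in Python: the `Triple` type and its field names are assumptions made for illustration and are not taken from any particular semantic product, and the values are filled in by hand rather than extracted automatically.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """Hypothetical container for one subject-predicate-object statement."""
    subject: str
    predicate: str
    obj: str  # called obj to avoid shadowing Python's built-in `object`


# The triple for "John works in the White House", assigned by hand.
simple = Triple(subject="John", predicate="works", obj="White House")

print(simple.subject, "|", simple.predicate, "|", simple.obj)
```

A named tuple is used here simply because a triple is a fixed, ordered record of three values; any equivalent record type would do.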
So that example is pretty easy. What about a longer sentence? Something like this:
“John, a favorite of President Obama from his days in Chicago, now works as public liaison in the White House”.
Now the job is tougher. It is clear John is still the subject of the sentence. It might be tempting to assign “favorite” as the predicate since it connects John to President Obama. But the commas indicate to us that this is really a clausal description of John and not the central action of the sentence. So we are left with “works” as the predicate. But what does “works” connect to? Is it “public liaison” or “White House”? The stronger connection is “public liaison” since this describes the kind of work John does. The White House is just the location of that work so it is nothing more than a qualifier.
When we learned to read as children we were taught to reason through these example sentences pretty much as I just described. Of course you don’t think about it very deeply. The understanding of the sentence, the essence of it, comes naturally: John – works – public liaison. The rest just colors these most important facts.
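One way to picture this reading of the longer sentence is a core triple with the remaining details attached as qualifiers. The sketch below is hand-built and hypothetical: the `QualifiedTriple` class and the qualifier keys `location` and `description` are names invented for this illustration, and nothing here performs the actual language analysis.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class QualifiedTriple:
    """Hypothetical triple that keeps secondary details apart from the core."""
    subject: str
    predicate: str
    obj: str
    qualifiers: Dict[str, str] = field(default_factory=dict)

    def core(self) -> Tuple[str, str, str]:
        # The essential fact, with everything that merely "colors" it left out.
        return (self.subject, self.predicate, self.obj)


# Hand-assigned reading of the longer sentence about John.
longer = QualifiedTriple(
    subject="John",
    predicate="works",
    obj="public liaison",
    qualifiers={
        "location": "White House",
        "description": "a favorite of President Obama from his days in Chicago",
    },
)

print(longer.core())
print(longer.qualifiers["location"])
```

Keeping the qualifiers in a separate mapping means the core fact can be compared or queried without the extra detail getting in the way.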
Semantics is the information science of establishing meaning over text without human intervention, and this includes establishing the triple of any sentence. This is also what is called the Semantic Web or Web 3.0. From a diagram perspective this basic notion is sometimes represented like this:
You will note this diagram looks much like cells or proteins linked together. There is a reason for that. Just as proteins fold and match up along their common edges in order to do their work, so do semantic triples. Switching to a protein example now, let’s consider these two sentences:
1. Protein X adds two molecules of zinc to the cell for each molecule of oxygen.
2. Protein Y adds one molecule of copper to the cell for each molecule of iron.
Our diagram now looks like the following:
So what happened? Each sentence has its own triple. But they have a common predicate of “adds”. So we can diagram two subjects and two objects but with a common predicate.
Just like proteins that fold and combine to make something new, we have done the same here in the science of semantics. Because we boiled the sentences down to triples and stored them in a place that can be queried, we can ask for all predicates that match “add(s)”.
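To make the idea of a queryable store concrete, here is a minimal sketch that uses a plain Python list as the "place that can be queried". The `Triple` type, the `by_predicate` helper and the way each object is worded are assumptions for illustration; the triples are entered by hand from sentences (1) and (2).

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """Hypothetical subject-predicate-object record."""
    subject: str
    predicate: str
    obj: str


# Hand-built triples for the two protein sentences.
store: List[Triple] = [
    Triple("Protein X", "adds", "two molecules of zinc"),
    Triple("Protein Y", "adds", "one molecule of copper"),
]


def by_predicate(triples: List[Triple], predicate: str) -> List[Triple]:
    """Return every triple whose predicate is exactly the given word."""
    return [t for t in triples if t.predicate == predicate]


matches = by_predicate(store, "adds")
for t in matches:
    print(t.subject, "->", t.obj)
```

Both subjects come back because the match is on the role the word plays in the sentence (the predicate slot), not on the word appearing somewhere in the text.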
Why is this important? It gives scientists, researchers, business professionals and citizens a chance to tap into and glean true meaning from their documents, email or the web. This is far different from a Google-like keyword match. The word “add(s)” certainly matched, but it was the word’s role that also matched.
But what if the author of sentence (1) did not use the word “adds” but instead used the word “increased”? A keyword match would fail here. But semantics can also understand that “add” and “increase” are related, and so the query would result in the same scientific discovery of proteins that add/increase molecules.
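The difference between a keyword match and a match on related meanings can be sketched as follows. This is a deliberately simplified stand-in: the hand-written `CANONICAL` table, the `normalize` and `by_concept` helpers and the `Triple` type are all assumptions for illustration, whereas a real semantic engine would rely on far richer linguistic knowledge than a small lookup table.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    """Hypothetical subject-predicate-object record."""
    subject: str
    predicate: str
    obj: str


# Hand-written table mapping related verb forms to one shared concept.
CANONICAL: Dict[str, str] = {
    "add": "add", "adds": "add", "added": "add",
    "increase": "add", "increases": "add", "increased": "add",
}


def normalize(predicate: str) -> str:
    """Map a predicate to its concept; unknown words stand for themselves."""
    word = predicate.lower()
    return CANONICAL.get(word, word)


def by_concept(triples: List[Triple], predicate: str) -> List[Triple]:
    """Return triples whose predicate shares a concept with the query word."""
    wanted = normalize(predicate)
    return [t for t in triples if normalize(t.predicate) == wanted]


# Sentence (1) now uses "increased"; sentence (2) still uses "adds".
store: List[Triple] = [
    Triple("Protein X", "increased", "two molecules of zinc"),
    Triple("Protein Y", "adds", "one molecule of copper"),
]

keyword_hits = [t for t in store if t.predicate == "adds"]
concept_hits = by_concept(store, "adds")

print(len(keyword_hits), "keyword match;", len(concept_hits), "concept matches")
```

The literal comparison finds only Protein Y, while the comparison on the shared concept finds both proteins.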
Now let’s change sentence (2) from Protein Y to Protein X. A more restrictive query on a store of triples, where you ask for both subject and predicate matches, would result in a diagram like the one below.
Again, why is this important? Because now a scientist can rely on the smarts built into such a search index to deliver all the Protein X’s that add/increase [some kind of] molecule to a cell. The interesting thing for the scientist will be to group and sort the kinds of molecules that will be added to the cell.
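The more restrictive query, followed by grouping the molecules, might look like the sketch below. Everything in it is illustrative: the `Triple` type and `query` helper are hypothetical, the objects are shortened to the molecule name, and the "Protein Z" entry is a made-up triple added only to show that non-matching entries are filtered out.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """Hypothetical subject-predicate-object record."""
    subject: str
    predicate: str
    obj: str


store: List[Triple] = [
    Triple("Protein X", "adds", "zinc"),
    Triple("Protein X", "adds", "copper"),   # sentence (2), now about Protein X
    Triple("Protein Z", "removes", "zinc"),  # invented entry, for contrast only
]


def query(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return triples matching every field that is given; None matches anything."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


hits = query(store, subject="Protein X", predicate="adds")

# Group and sort the kinds of molecule that Protein X adds.
molecules = sorted({t.obj for t in hits})
print(molecules)
```

Leaving a field as `None` turns it into a wildcard, so the same helper serves both the broad predicate-only query and the narrower subject-plus-predicate query.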
This is real discovery in science. It is semantics that get language out of the way. It is semantics that build in smarts to a system so the scientist can find, analyze and create new cures for diseases that have yet to be worked on effectively. So… semantics and folding proteins do have a lot in common – more than you thought.
As I have written many times before, semantic technology is unique in that it is able to go beyond the limits of other types of technology and approach the automatic understanding of a text. It is not perfect, however, and it certainly has yet to reach its maximum potential.
I realize that it’s not that easy for those who don’t work in the sector to understand (especially because there are so many false promises out there, which tend to create unreasonable expectations, muddled ideas and market chaos). Therefore, it might be useful to use a common experience as an example, such as our learning process.
Let’s start from the beginning: from the moment we (human beings) begin to talk, understand, learn, go to school, etc… We require at least 12-15 years to be able to read a newspaper and understand the most general articles and this is thanks to the experience we developed while learning the meanings of words and experimenting with a great deal of different phrase constructions. Consequently, the learning process is lengthier when we decide to tackle more technical terms or specific topics.
Learning takes time, and the same goes for a computer. It’s true that a computer can process in nanoseconds while we think in milliseconds, but it is also true that our method of learning uses a device (the brain) that no one has been able to fully understand and that is able to do things that not even the most powerful computer can imitate.
In summary, it doesn’t make sense to expect that a computer be able to perfectly analyze and understand a biology text, for example, without first having learned all it can about that subject. There are no shortcuts or magic formulas: learning a language is difficult and even automatic processes require time and labor.
This week we announced the appointment of Julie Hartigan, Ph.D. as CTO of Federal Programs, and Rita Joseph as Vice President of Federal Programs. The expansion of our executive team here in North America is directly in line with our overall goals and vision for growth in the U.S.
Julie and Rita have the extensive experience to help us drive our federal program initiatives. And we’re all satisfied that in an era where government seeks to “connect the dots,” both of these seasoned veterans will bring expertise, guidance and our advanced, high speed, multilingual semantic processing to federal government agencies.