The publishing industry is experiencing a period of historic change. The advent of the internet took many of the main publishing companies by surprise, and in the rush to develop an online strategy, they spent too much time considering how to charge for content while new media companies offering free content were springing up in the marketplace. And so began the erosion of the traditional press market, where the rules of the game started to change, and the profits from online business passed from the hands of the big publishers to someone else.
This is in addition to changes in the advertising sector: recently the amount of investment in online advertising surpassed that of print advertising in many countries (the USA and the UK first among them). Many people think that one of the main titles will decide to stop printing daily editions, starting a chain reaction that, in the near future, will result in only a few traditional titles still being printed, most likely dedicated to niche markets and not the mainstream news.
There are many worries about the survival of the publishing industry and a lot of debates, the most recent occurring in the heart of Silicon Valley during the week of the most important conference for semantic technology: the Semantic Technology Conference (SemTech). Here, publishing has been one of the main topics at an event involving entrepreneurs, analysts, researchers and investors.
The semantic environment is undoubtedly fascinating, especially the semantic web, which many consider the turning point for a new era of potential on the internet: an era in which all the available information will be more intelligent, more interconnected and automatically accessible by all, users and consumers alike. It will be a new era, in which new ways to access web sources will allow us to improve not only our online experience, but also our everyday lives. The revolution and the development made possible by semantic technology are the common denominators in the debate about the survival of publishing.
Some propose to go back in time and sell content to readers. Others (often with great results) are releasing new online titles with a greater focus on the reader. Still others hope for new laws and legislation to curb the power of some big companies (Google, for example) so that they can share in some of the business opportunities left in the market. There is also the exploitation of new access devices, like the iPad, which make it possible to sell content that, until now, was free.
However, in this scenario, technology offers a great opportunity for everyone. During SemTech, these opportunities have been one of the main topics of discussion: an entire section of the conference has been dedicated to the discussion of instances where semantic software applications are used and how these allow content providers to make a greater profit using the same quantity of produced content.
There has been talk about software that automatically proposes similar content to viewers, pulled from the site’s archive or from other online sources. The purpose is to simplify the search for other potentially interesting content, increase the time spent on the site, increase loyalty and, consequently, increase advertising profits due to more time spent online by readers. There has been a really interesting discussion about the creation of intelligent procedures to improve the user’s experience, allowing better browsing and a higher quality of search results. These mechanisms can automatically split results into detailed groups (for instance, a reader searching for articles about Japanese cooking could see the results divided into restaurant guides, recipes, etc., making future searches easier).
Another topic has been the development, for publishers and brands, of more profitable forms of online advertising that are less invasive for readers. Thanks to semantic technology, it is possible to create advertising campaigns that link automatically to other content, either to an article or to feedback submitted by users. This increases the possibility of creating unique messages that can improve the value for investors without compromising the experience for the reader. Another interesting opportunity offered by semantic technology is the ability for content creators to make their content automatically available in the standard formats required by the semantic web. This makes it possible to immediately activate all the described features and to reduce costs so that journalists and bloggers can spend more time on other, more important activities.
In summary, I’m much more convinced that this revolution will bring about historical change, and that conditions exist that will make it possible for big publishers and small innovators to live and compete together in a transparent, fully functioning market. To make this possible, I think big publishers will need to focus on more than big legal battles against existing content sources, like free information sites. It is necessary, and cheaper, for them to understand and take advantage of the opportunities provided by the new technologies (hardware and software). The profits that could be made from applications already existing in the market, or just in the minds of some innovators, are potentially much larger than we could imagine.
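The "similar content" idea described above can be sketched in a few lines. This is only a minimal illustration, assuming each article already carries a set of topic tags (which a semantic engine would normally produce); the function names `jaccard` and `related_articles` and the sample archive are hypothetical, and the Jaccard overlap used here is a deliberately simple stand-in for whatever similarity measure a real product applies.

```python
def jaccard(a, b):
    """Overlap between two tag sets: size of intersection / size of union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_articles(current_tags, archive, top_n=3):
    """Return up to top_n archive titles that share tags with the current article.

    archive maps a title to its set of tags (hypothetical data layout).
    Titles with no shared tags are dropped; ties are broken alphabetically.
    """
    scored = [(jaccard(current_tags, tags), title) for title, tags in archive.items()]
    scored = [(score, title) for score, title in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_n]]


# Invented sample archive, for illustration only.
ARCHIVE = {
    "Sushi at home": {"japan", "cooking", "recipe"},
    "Tokyo restaurant guide": {"japan", "restaurant"},
    "Election results": {"politics"},
}

if __name__ == "__main__":
    print(related_articles({"japan", "cooking"}, ARCHIVE))
```

Because the ranking depends only on tags, the same function could draw candidates from the site's own archive or from other sources, as long as they are tagged consistently.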
December 23, 2020: in spite of the catastrophic forecasts of some years ago, the temperature of the earth has only increased half a degree, but in the city, the youngest children have never seen snow. The Smith family is awaiting the delivery of their children’s gifts, ordered as always on Amazon.
In some hours, the family must leave to spend Christmas with the grandparents, but the gifts have still not arrived. Only ten years ago this would have been a complicated situation—made worse by the risk of having to explain the missed arrival of Santa Claus. Today, such a problem is easily resolved thanks to the revolutionary mass adoption of the Semantic Web, which happened some time ago.
Instead of waiting at home for the packages to arrive, Mrs. Smith can modify her current online profile in her personal data locker, a complete archive that guards all personal data, managed directly by the individual. By fostering a collaborative environment, the Semantic Web has dismantled the countless customer archives and systems for the management of shipments, unique for each courier: in 2020, new shipment information catches up with the couriers in real time, and the gifts could instead be delivered to the grandparents’ address. The change of shipping address doesn’t require a telephone call to the customer, nor further modification of shipping procedures.
This small example is just one of the revolutionary scenarios used by this year’s keynote speaker at SemTech, David Siegel, author of “Pull: The Power of the Semantic Web to Transform Your Business,” to explain in practical terms the scale of the changes we can expect from the Semantic Web. According to Siegel’s brave vision, in the new world of the Semantic Web, it will be individuals and consumers who will become the primary point of reference for the market, and this will substantially change every related aspect of how information is handled.
It will not be necessary to duplicate the data in every database of the organizations and companies in every sector in which we are operating or doing business (for example, think about the personal data necessary for login forms), solely to satisfy the needs of an antiquated, disconnected system. According to Siegel, in the era of the Semantic Web, it will fall to individuals to define the exact rules of the game, creating a mechanism, a “pull”, so that suppliers of products and services will have the responsibility to retrieve information about their customers (instead of vice versa). It is a mechanism that, according to some estimates, will help save trillions of dollars every year.
In the debate following his presentation, David Siegel was not without critics. There are those who accused him of underestimating the technical aspects of such a model (for example, where will this mysterious personal data locker reside, and how will we ensure it is securely managed?).
Personally, I found Siegel’s book and presentation very interesting. If we really want the new era of the Web to arise, it is necessary first to explain the concrete advantages for consumers. Surely not everything will develop as Siegel predicts, but his theory has merit, because it makes the motives for and uses of the Semantic Web clearer.
In the second keynote of the day, David Recordon from Facebook introduced the “Open Graph protocol” initiative. Released by the company some weeks ago, this initiative has opened an important view of what our online future could be. Through this new protocol, created by Facebook but available to all of its users, every web page will have the possibility to integrate itself in the social graph of Facebook through a simple copy and paste of code. This opens the door, for example, to the development of applications that automatically take into account each user’s social network.
For example, think about the personalization of search rankings for a classic site dedicated to restaurants or movies, according to the preferences of our friends who have already eaten in the restaurant or who have already seen the movie. Although the first applications of this protocol already exist, the real revolution will happen in the coming years, and we can bet that new applications will know how to fully take advantage of this innovation.
Now that SemTech has drawn to a close, to give a combined vision of the current state of the industry, as well as other perspectives on the sector, I want to briefly point to the forecasts and data that emerged from the panel dedicated to investments and acquisitions in the field. In the last 12 months, we have seen a significant acceleration of activity. We have seen acquisitions of semantic technology companies by larger organizations or by new companies, plus a significant and increasing number of investments in startups, especially in the areas of sentiment analysis, semantic publishing and semantic advertising. Therefore, it is clear that the field enjoys optimal health and we will continue to have interesting developments in the near future (stay tuned).
All in all, it has been another positive event. A larger participation of big companies like Procter & Gamble, Chevron, Lockheed Martin, CNN, etc., has given greater validity to the conference, confirming the business value of semantic technology. The attention of some big names, from Facebook to Google, Apple to The New York Times, strengthens the vision of a Web and its consumer applications that will be diverse and innovative, even if not completely defined. If the recent increase in media interest is any indication (demonstrated by a significant increase in the number of articles dedicated to semantic technology in recent months), we can believe that the best is yet to come, and I really hope that I will be talking about it in the 2011 edition.
The idol (and surprise) of the first day of the Semantic Technology Conference in San Francisco has a very different profile than what you may expect, at least at an event that brings together some of the most brilliant minds from one of the more innovative sectors in the world of software. His name is Landon Donovan, a professional soccer player, and he was the only one able to unite all of the participants, after just 90 minutes, by scoring the goal that qualified the United States for the next stage of the World Cup. This unforeseen fortune transformed the hundreds of geeks who filled the hall of the Hilton before the start of the conference into a crowd of excited fans in a stadium.
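To make the "copy and paste of code" concrete: the Open Graph protocol asks a page to describe itself with a handful of `<meta>` tags in its HTML head, the basic ones being `og:title`, `og:type`, `og:url` and `og:image`. The sketch below simply renders those four tags; the helper name `open_graph_tags` and the sample restaurant page on example.org are invented for illustration and are not part of the protocol or of Facebook's tooling.

```python
from html import escape


def open_graph_tags(title, og_type, url, image):
    """Render the four basic Open Graph properties as HTML <meta> tags."""
    properties = {
        "og:title": title,
        "og:type": og_type,
        "og:url": url,
        "og:image": image,
    }
    lines = [
        # escape() protects attribute values that contain &, <, > or quotes.
        f'<meta property="{name}" content="{escape(value, quote=True)}" />'
        for name, value in properties.items()
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical page: every value here is made up.
    print(open_graph_tags(
        title="Trattoria Example",
        og_type="website",
        url="http://example.org/trattoria",
        image="http://example.org/trattoria.jpg",
    ))
```

The publisher pastes the resulting tags into the page's `<head>`; applications that read the social graph can then treat the page as a described object rather than as anonymous HTML.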
Apart from Donovan, the first day of SemTech was surely in line with expectations. As the organizers promised, the presentations have been concentrated on the business rather than the purely technological aspects. Compared to last year’s conference, the topics have been more about case studies, ROI and the real needs of customers.
Among the presentations of this full program, I found of particular interest the series dedicated to the application of semantic technology to marketing. In a market of consumers who are increasingly influenced by opinions expressed online (websites, blogs or forums), by now it has become strategic for all businesses to institute mechanisms for real-time monitoring. By analyzing these sources, it is possible to precisely identify opinions, sentiment and new trends that can shift the competitive landscape.
Considering the immense quantities of available data and information to analyze daily, an effective solution requires a system that can process content automatically and reliably. Semantic technology can help, thanks to its capacity to understand the significance of words and their relation to expressed concepts.
The panel, including some of the main vendors in this very crowded sector, rightly discussed the complexity of fully realizing a semantic system and the often exaggerated expectations of companies new to the sector. The discussion of the need to integrate raw data (for example, automatically extracted measures of sentiment regarding product features) with analysis made by specialized consultants was particularly enlightening. Right now, the match is tied on this particular topic and we will probably still see some adjustments before these listening systems are fully implemented in the marketplace.
A second very interesting presentation was that of SalesForce (www.salesforce.com). The company presented a semantic system that, through analysis of content added by employees, can create maps that summarize in real time those employees’ abilities and interests. This knowledge map is used to allow everyone who is looking for a particular set of skills to be able to query the system and easily retrieve the profiles of employees best suited to manage a particular data problem. This type of instrument can be very useful for companies with employees in diverse geographical locations, but also in particular situations that require fast access to unique or less common skills, as well as other diverse skills and interests that are not directly related to an employee’s current responsibilities.
The final note for the day is dedicated to an interesting online application (www.tripit.com) that provides a very useful tool for travellers: one that allows you to easily create a travel itinerary with all the useful information you need, as well as an electronic copy for your trip. For a simpler, more transparent user experience, Tripit uses semantic technology to identify the relevant information inside a travel reservation confirmation email (for example, dates, location, name of hotel, etc.). The details automatically load into the itinerary, along with logistical information like the weather forecast, maps of the area, etc. And finally, the itineraries are available via the web or mobile (which is especially useful when travelling), allowing the traveller to avoid having to keep track of paper copies of reservations or confirmation numbers.
This year’s edition of the Semantic Technology conference has the clear objective of proving, to a still partially skeptical market, that semantic technologies and the broader semantic web are for real, and that they can significantly contribute to creating business value for organizations in many different industries. Just by looking at the comprehensive agenda, organized very smartly in tracks covering specific topics and areas of application, I was pleased to realize that speeches and panels, instead of focusing on standards and technical aspects like in past years, will cover concrete business applications that are easy to understand, even for people who have never written a line of code.
Considering these points, and last but not least, the fact that the event is in San Francisco (apologies to San Jose…), I am really looking forward to the start of the conference, also because the event represents a unique opportunity to meet entrepreneurs, analysts and investors and to understand whether their mood and vision on the future of the sector have changed after the difficult recession of the last 18 months. Among the long list of presentations, I am particularly interested in the following:
Personally, I will give a talk on Wednesday at 10.15 as part of a panel dedicated to the application of semantic technologies to publishing. I find the debate in this sector fascinating and, at the same time, particularly confusing. On one side, you have the traditional players that are under a great deal of pressure to try to retake a leadership position, and financial profitability, after having allowed Google to almost destroy their business model and significantly weaken their competitive position. On the other side, you have the new online-only players, like the Huffington Post, that are forced by their somewhat unexpected success to continue to innovate to ensure they can continue to offer a unique and different experience to their readers. Both sides see semantic technologies as strategic because these technologies, and the applications deriving from them, can help them to increase revenue by improving the user experience and providing a more effective way to serve advertising, and to reduce costs by automating the work of content creators in order to let them focus on the most valuable part of their job (creating content) instead of wasting time in low value activities like manually tagging the content to make it easier to search and access by the user.
If you are in San Francisco this week, don’t forget to visit Expert System’s booth (#207) or to follow us on this blog or on twitter @scagliarini and @brookeaker.
Last week I organized “Semantic Web Makes for Business Intelligence 2.0”, a meetup event for the CT Semantic Web Meetup Group (Ah yes, I founded the Connecticut Semantic Web Group last April. The reason was simple: we had the NYC and Boston versions of these groups but didn’t have one in CT. So how about having some meetups right here in CT to learn, trade stories and promote the Semantic Web / Web 3.0?). And this week I will take part in another Semantic Web event which seems very interesting and was scheduled as part of Internet Week NY 2010: “Building NLP Semantic Web Applications for Financial Services”.
You’ll find all the details here. But what’s the point? This is sea-change technology and early adopters are in corporate communities. The technology is here now and coming on fast. Come to these meetups so you and your company are out in front and not left behind.
In the film “Dead Poets Society”, Professor John Keating asks his students to rip out the introduction to their poetry book which describes how to rate the quality of poetry using a Cartesian system of coordinates – the system is a kind of “mathematical” evaluation based on “measurable” criteria. While we can agree that a “mechanical” approach to poetry evaluation is most likely nonsensical, we asked ourselves if this kind of approach (along with the use of an automatic system) could actually be used to examine text which has nothing to do with poetry, such as a political speech.
The political debates currently being held in the UK are the perfect occasion to satisfy our curiosity. The question is, can this situation be considered sufficiently “scientific”, in that the three candidates all have to respond to the same questions during a live conference with a time limit for each response?
My colleague Marco Giorgini and I have spent some time creating a small program which uses our semantic software COGITO to analyze the debates and extract the elements which appear to be significant. We had some fun as we examined our results and found out what we could understand about a political campaign if we didn’t bother to look at each candidate, their facial expressions, their tone of voice, or the cadence of the discussions. We just limited ourselves to “chopping up” what was said and evaluating the most abstract essence of the concepts discussed, the use of the lexicon and the grammatical structures.
Obviously, we just played around with the language, but we were also able to make some interesting, if not curious, observations. This report provides the details of what we found.
A special thank-you to Marco Giorgini who worked directly on the development of the report as well as the layout of this post.
Italy seems to be a popular subject in the daily buzz published in the press (online and offline), in blogs and in other forms of social networking. The interest in this small country is certainly greater than its political, geographic or strategic role.
There are countless declarations of love regarding food, landscapes, history, “dolce vita”, Ferrari and beauty (especially in regard to the players on the national soccer team). These are counterbalanced by strong negative opinions concerning disorganization, corruption, politics and the Italian stereotype of mammismo (aka ‘momism’). It seems like the world tends to be polarized in judging Italy, using subjective emotional criteria instead of the same objective and modern criteria which are usually reserved for the leading nations of the 21st century.
The “Italy of Innovators” project (P.S. Expert System is one of the selected companies) is actually an excellent example of how Italy is also a modern country which pays attention to technology and is full of innovative companies that struggle to put their ideas on the market despite financial, academic and political systems which are a far cry from Silicon Valley.
So here’s my suggestion to all of the international investors, entrepreneurs and university professors who follow my blog (OK, perhaps I’ve exaggerated a bit): set down your glass of Chianti, stop listening to Turandot, start reading the list of these companies and keep these innovators in mind during your next vacation to the boot-shaped country. Take a chance to set up a meeting with these entrepreneurs: it might just lead to new investment opportunities!
Note: for those who would like to read more about Italy, I suggest books by Beppe Severgnini, who has also inspired some of the opinions I expressed in this post.
Wet morning in Santa Clara. People seem to be looking at the sky as if it was falling. We are not used to so much rain here.
There are not many people at the conference. The audience is an interesting mix of semantic geeks, marketing and product managers, business people. Definitely a very heterogeneous crowd.
The most interesting presentations are by Scott Prevost of Microsoft Bing and Mark Greaves from Vulcan.
Scott Prevost comes from the Powerset acquisition by Microsoft and is now part of the Bing project.
“The Semantic Web? It is already here,” he says. What he really means is that the Bing project makes extensive use of semantic technologies like the ones we offer at Expert System. His opinion is that semantics, which is already applied under the hood in all major search engines, is here to stay and will gradually evolve and make the user search experience better – most of the time without the end user even realizing that he is using semantic technologies!
Bing applies semantics in a lot of different ways:
They semantically interpret the requests of the user. Example: “who mocked Sarah Palin” returns not only results with “Sarah Palin” and “mocked”, but also “parodied”, “impersonated”, etc. We at Expert System provide similar functionality for the Enterprise market with Cogito Answers.
They classify the search results so that they can be filtered and navigated in a better way by the user – similar to what we can do with the Cogito Categorizer.
They try to leverage RDF information added by publishers to their pages – similar to the rich snippets in Google. This information can be added to a search result to make it more interesting to the user and improve his search experience. A classic example is the search results for a restaurant returning the Yelp web page with the average score and the number of reviews. We can help publishers to automatically produce these snippets using our Cogito Discover technology.
They apply semantics to their advertising platform so that the advertisement campaigns can be based on concepts instead of keywords as they are today. We offer a similar solution with our Cogito Advertiser product.
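The first item in the list, semantic interpretation of the query, can be approximated with a toy query-expansion step. The sketch below assumes a tiny hand-written table of related verbs; `SYNONYMS` and `expand_query` are hypothetical names, not part of Bing or Cogito, and a real engine would draw on a full semantic network and word-sense disambiguation rather than a lookup table.

```python
# Hypothetical, hand-written table of related terms (illustration only).
SYNONYMS = {
    "mocked": ["parodied", "impersonated"],
}


def expand_query(query):
    """Return the query's terms plus any related terms, in order, without duplicates."""
    expanded = []
    for term in query.lower().split():
        # Keep the original term first, then add its related terms, if any.
        for candidate in [term, *SYNONYMS.get(term, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


if __name__ == "__main__":
    print(expand_query("who mocked Sarah Palin"))
```

A search backend could then match documents containing any of the expanded terms, which is how a page that says "parodied" can be returned for a query that says "mocked".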
Another interesting speaker is Mark Greaves from Vulcan Technologies. One of the most interesting points that he talks about is the fact that a lot of data that used to live in databases around the world is now moving into the “Semantic Web”. The advantages are huge:
Linking the data: Think about relational databases and how you can link one piece of data from one database to another one (maybe belonging to a different organization). It may not be impossible, but it is at least very difficult. One basic advantage of the Semantic Web is that data can be linked in all sorts of ways. The OWL standard in particular provides the means to connect data in different “clouds” very easily.
“Organic growth” of the data: The Semantic Web also allows for “organic growth” of data. As opposed to relational databases, where you need to define a schema before you even start entering any data, the Semantic Web is designed to provide the flexibility to add and modify data in different formats at different points on the web. With open data there is usually also a community that maintains it and makes sure it is accurate.
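Both advantages can be illustrated with the data model behind the Semantic Web, in which every fact is a (subject, predicate, object) triple. The sketch below is a simplification that uses plain Python tuples rather than a real RDF library; all identifiers under example.org, the predicate names and the helper functions `objects` and `city_names_of` are invented for illustration.

```python
TRATTORIA = "http://example.org/restaurant/trattoria"
SPRINGFIELD = "http://example.org/city/springfield"

# Two datasets published independently, possibly by different organizations.
restaurants = {
    (TRATTORIA, "name", "Trattoria Example"),
    (TRATTORIA, "locatedIn", SPRINGFIELD),
}
cities = {
    (SPRINGFIELD, "name", "Springfield"),
}

# Linking: because both sides use the same identifier for the city,
# merging the datasets is just a set union.
graph = restaurants | cities

# Organic growth: a new kind of fact is added without changing any schema.
graph.add((TRATTORIA, "cuisine", "Italian"))


def objects(graph, subject, predicate):
    """All objects of triples that match the given subject and predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def city_names_of(graph, restaurant):
    """Follow locatedIn links across the merged data to reach city names."""
    names = set()
    for city in objects(graph, restaurant, "locatedIn"):
        names |= objects(graph, city, "name")
    return names


if __name__ == "__main__":
    print(city_names_of(graph, TRATTORIA))
```

In a relational setting the same join would require both organizations to agree on table layouts in advance; here the only agreement needed is the shared identifier.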
There are also some recurring themes at the conference that seem to be common in many of the talks:
- Mobile Internet: Internet on mobile devices presents some specific challenges. The environment is different (e.g. no big keyboard or mouse and a much smaller browser). The market is huge, and so are the opportunities. Search engines, social networks and content providers discuss how to use semantics to develop this new space.
- “Internet of Data”: the huge amount of Linked Open Data that is available for free today on the Internet represents a new and ever growing opportunity that can be leveraged by computer programs to help us humans in our daily tasks.
- Social Networking Interaction: this is a concept that seems to mean different things to different people. Some people talk about how social networks can be represented in a “semantic” way with RDF so that it can be used by semantic web applications. Other people talk about the way people in social networks contribute in publishing and maintaining data in the Linked Open Data Cloud in a similar way that the Wikipedia community has developed the huge Wikipedia knowledge base in the last few years.
Bottom line is that the Semantic Web is already here and the ideas discussed at Web 3.0 are mostly about how to leverage it in order to make our lives better…
by Walter Pezzini, VP of Pre-Sales and Professional Services at Expert System
In a recent web seminar organized by Project 10X, in which we participated, some 260 registered attendees submitted questions prior to the event. I semantically processed these questions (sometimes called “eating your own dog food” – imagine that!) looking for common themes and concerns.
In reviewing the outcome, here is what I found:
1. Case Studies and ROI. People learn best with storytelling and proof points embodied by Return on Investment. So it should be no surprise that this tops the list of questions and concerns. These stories help convince funders, provide guidance for technical planning, and show feasibility. Yet this also shows a level of understanding of the technology by the participants. In other words they are convinced of the basic value parameters of semantic technologies and have come to believe they can be deployed with good outcomes within their organizations but need help to find the right place to start, the expected timelines, and how to sell the capabilities and outcomes to upper management. At Expert System we have over 100 implementations in the last 3 years alone and can confirm this concern meets with our experience.
2. Technical Integration Points. Here attendees’ concerns are about how to make semantics live with or interact with existing applications, data sets, and search products. I sense the need to make existing products pay a bit longer for their sunk cost and not to tear things out wholesale and start over. The good news is that semantic technology is intended to play this exact role by providing new insight into information wherever it currently lives. 9 out of 10 customers ask us for a SaaS implementation with a front end user interface that already exists.
3. Semantic Networks. This is a real surprise to us but pleasantly so. While our technology relies heavily on a semantic network, sometimes called an ontology, it is not always the case that other providers use this method to unlock the meaning of text. Some use statistical approaches, others heuristics and still others something called latent semantic processing. These other approaches tend to sound quite scientific but in reality are shortcuts that prove to be less than sufficient for industrial-strength precision and recall. Semantic Networks are hard to produce and they take time. But the investment pays off. They become a knowledge representation of a domain. When done thoroughly and properly, they can greatly increase the precision and recall of the processing. Many networks are specific to a branch of science or hold deep technical knowledge representations. Our semantic network, on the other hand, is of the common language, covering all topics, all words, all concepts and the connections between them. This means it can be applied to any domain.
4. W3C standards are confusing. When we read the comments it’s clear there are too many acronyms and too many standards. More concerning, the standards themselves seem to be seen as the solution to semantics. It is as if many think the standards provide the inference, the storage, the modeling, the interpretation and more that are core to semantics. The reality is that standards are only a proposed common language for describing and exchanging the outcomes of semantic processing.
To sum up – the semantic web has come a long way in terms of showing value and laying down a base of understanding. But as with any new technology, there is more to do. All of us need to do better in terms of explaining, simplifying and educating up and down the organizational decision chain. Only when that is done will we be able to say “it’s baked”.
Where the categories mean the following:
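Point 3 above leans on precision and recall, so it may help to spell out the standard definitions: precision is the share of returned results that are relevant, and recall is the share of relevant items that were returned. The sketch below is a generic illustration with invented document IDs; the function name `precision_recall` is hypothetical and the numbers say nothing about any particular product.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for a set of retrieved items.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    Either value is reported as 0.0 when its denominator is empty.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Invented example: 4 documents returned, 3 documents actually relevant.
    p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"])
    print(f"precision={p:.2f} recall={r:.2f}")
```

The two measures pull in opposite directions: returning more documents tends to raise recall and lower precision, which is why the text treats them as a pair.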
Integration: How to embed or use semantics behind the scenes of existing applications.
Mobility: Get semantics to support mobile workers.
ROI Case Studies: Examples of successful, killer applications and their payback.
Semantic Nets: Semantic networks or ontologies, what they are, when to use them, how to maintain them.
Standards: W3C’s soup of acronyms and what they mean.
Timing: How fast will the technology and/or market progress?
Performance: Can semantics run with everything else and keep up?
Databases: How and when to use databases with semantics.
Automatic: Do semantic systems or tools learn on their own? What about maintenance and support?
Selling: How to make the case for funding to upper management.
NLP: How does semantics support natural language processing or computing?