Welcome to the second edition of The Hermes Series, a non-regular and often random collection of notes & thoughts about knowledge representation and reasoning, graphs, technology, and all things from around the Web.
In this second blog post in the series …
Tim Berners-Lee talked about the Giant Global Graph (GGG). He used the term to describe a digital world of interconnected, machine-processable data to complement the World-Wide Web (WWW), the human-oriented world of interconnected documents. GGG and WWW… get it? 🙂 The GGG was all about RDF, OWL and other Semantic Web technologies. I am a huge believer in the vision. I don’t particularly care about the specific technologies even though I do have my biases towards representation models that don’t require me to write many triples 🙂 Facebook’s OpenGraph showed us how a great user experience provides the incentive for information publishers to incorporate structured data into their pages.
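To make the "many triples" gripe concrete, here is a purely illustrative sketch (hypothetical identifiers, not a real vocabulary) of the same facts written as RDF-style subject–predicate–object triples and as a single nested record:

```python
# The same facts about one entity, first as RDF-style triples:
# one statement per property.
triples = [
    ("alice", "type", "Person"),
    ("alice", "name", "Alice"),
    ("alice", "knows", "bob"),
    ("alice", "worksFor", "acme"),
]

# A property-graph / document-style record packs the same
# information into one object per entity.
record = {
    "id": "alice",
    "type": "Person",
    "name": "Alice",
    "knows": ["bob"],
    "worksFor": "acme",
}

# Both encodings carry identical facts; the triple form just
# spreads them across one statement per property.
assert {p for (_, p, _) in triples} == {"type", "name", "knows", "worksFor"}
```

Same information either way; the triple encoding simply trades compactness for uniformity.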
And of course… we have to include devices such as the Nest, Nike Fuelband, Fitbit, and Aria (which my partner ordered and now tells me that I have to use 🙁) in the category of devices connected to the Internet of Things. Everyone seems to be building similar functionality, which leads me to believe that a platform is necessary. I think the people behind the company Presence are on the right track… a platform to connect everything on the Internet. Facebook is successful in connecting people. However, we need a platform that connects everything… people, physical objects, data. I really liked this quote from the founder:
“This is about making your interactions with spaces and objects more similar to your interaction with people and friends” (source: TechCrunch interview)
But is the connection of devices to the Internet enough? Companies such as Fitbit and Nike create isolated islands of information; they lock their users’ data behind their respective walls. Shouldn’t all data and all devices be interconnected in a Giant Global Graph? Who is going to enable that capability? The value to users comes from a world full of bridges between islands.
As I wrote above, we need help with processing all the information on the Web. We need information processing agents that operate in a manner tailored to our needs and interests. My observation is that everything that the big companies do these days focuses on learning as much as possible about their users so that they can offer highly-personalized services to them. Google hasn’t hidden the fact that their recent privacy policy changes are aimed towards that goal (note the “tailored for you” part). Facebook is, of course, already mining their users’ data.
It goes without saying that companies are focusing on how to make a profit. As they get to know more about their users, they can sell more targeted ads and offer services that are more relevant to the user. Here are examples of news in this space that I noticed over the last couple of weeks:
I would categorize all of the above in the “digital assistants” space, as I also discussed in the previous section. They are offered to us users as little helpers that process information on our behalf and notify us about stuff that is of interest to us. Whether they can be considered “intelligent” is a different topic 🙂 I personally avoid the use of the term because it has so many connotations. I have used the following spectrum in a number of presentations, inside and outside Microsoft…
I used to have “intelligence” instead of “understanding” but since the former term is so misunderstood and overloaded, I stopped using it. But I digress.
As before, here we have another case of information islands. My (inferred) interests, consumer habits, activity timelines are isolated from service to service. There is no interconnection. Most importantly, it’s mostly the companies that benefit from that data. Yes, I do consume a personalized experience or receive relevant offers but, at the end of the day, it’s other companies that party on my data. Perhaps a new economic model is necessary.
Scott Merrill reviews “The Intention Economy” by Doc Searls, who argues that we should really change the above game. Rather than allowing companies to find things out about us, we should really express what we want to do. We should entrust our data to a “fourth party” and allow companies to come to us based on what we want to accomplish, buy, or consume. I sympathize with Searls’s premise. Searls writes: “We need ways of gathering, organizing, and controlling the data that we generate and that others suck in from our digital crumb trails. We also need new understandings about how personal data might be used.” (source: TechCrunch)
Whether an intention-based economy with fourth parties acting as gatekeepers of personal data is going to be possible, I don’t know. I think that the work being done around digital assistants by so many companies will surface this issue big time.
And since the discussion was about digital assistants, let’s take a trip back to 1979. As always, Xerox PARC had the vision.
As I was writing earlier about my attempt to avoid the use of the term “intelligence”, I was reminded of Stephen Wolfram‘s latest post. Regular readers and those in my organization will know that WolframAlpha is one of my favorite services out there. I really admire the work that Stephen and his team are doing.
Stephen talks about how they are “Overcoming Artificial Stupidity” 🙂 Effectively he explains how they have been improving the natural language understanding capability of WolframAlpha over the years. And since I’ve been hanging out a lot with language understanding folks lately, I get how usage data improves the accuracy of a language understanding system.
In my presentations and discussions around knowledge I always reference WolframAlpha as an example of a knowledge system that can do really great things but fails at some simple ones. I have been asking the following question as an example… “Who are the members of Coldplay?“. The answer I used to get was the definition of “member” from the dictionary. WolframAlpha didn’t know about the music domain so it tried its best to give me something else. Well, WolframAlpha now gives the correct answer. I wonder whether they found my query in their logs from the many times I used it 🙂 Just joking. They just hadn’t ingested the data.
WolframAlpha can now even answer questions such as “When were Radiohead formed?“. However, it can’t yet answer my next set of test questions: “How many members were there in Deep Purple?” (it knows the members and the years but it doesn’t count them) or “When did Berlin become the capital of Germany?” (it understands Berlin as the capital but it doesn’t answer the specific question).
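The gap here is between lookup and aggregation: the system stores the base facts but doesn’t apply an operation on top of them. A minimal toy sketch of that distinction (my own illustration, nothing to do with how WolframAlpha is actually implemented):

```python
# A toy knowledge base mapping bands to their members.
band_members = {
    "Coldplay": ["Chris Martin", "Jonny Buckland",
                 "Guy Berryman", "Will Champion"],
    "Radiohead": ["Thom Yorke", "Jonny Greenwood", "Colin Greenwood",
                  "Ed O'Brien", "Philip Selway"],
}

def members_of(band):
    # Base lookup: "Who are the members of X?" — the part the
    # system already answers once the data is ingested.
    return band_members.get(band, [])

def member_count(band):
    # Aggregation layered on the lookup: "How many members?" —
    # the missing step is just counting the retrieved facts.
    return len(members_of(band))
```

Answering “how many” then falls out of composing the two: retrieve, then count.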
WolframAlpha is evolving at a very fast pace, it’s improving, and its knowledge base is expanding with more and more domains. As Stephen says, he aspires to make computers do more than humans. I truly wish them the best. Their work is truly inspiring.
Related to language understanding… “Iris” is Readability‘s new content normalization service. As per the blog post announcing the feature, Iris will attempt to draw meaning from the Web and it’s inspired by IBM’s Watson. In my mind, Iris falls under the category of “content understanding”. It’s absolutely the future… trying to understand documents, language, gestures and connect them with structured data.
Well, talking about structured data, I think that the move by Flickr to push structured data to Pinterest is very interesting. I find it a great example of the proliferation of structured data on the Web. Of course Flickr wants its photographs to be attributed, but no matter what the reason, it’s definitely the right strategy.
When we talk about structured data, we cannot ignore Facebook. Every week they seem to announce a new feature around OpenGraph. They execute really fast in this space and they should be congratulated:
Just to further emphasize the last point… Traffic to Pinterest increased by 60% when they integrated with the Open Graph.
Even though the following articles talk about startups, I believe that the advice they give is equally applicable to teams within large companies. I think any leader, any project, would benefit.
And talking about the billion dollar mind trick, here’s how “easy” it was to scale to 30M users and ultimately to a $1B acquisition… “Scaling Instagram“. They started with 2 engineers, and by the time they had scaled to millions of users they still had only 5 engineers. Very impressive!
It is a tech-oriented talk that every service engineer should read!!! 🙂 My key takeaways:
It took them 238 takes to get it right. Very impressive and very cool concept. Check out their “behind the scenes” video to see how they did it. (via Janet’s Facebook wall 🙂)
If you made it all the way down here, thank you!!! 🙂