Details
Kingsley Idehen
Lexington, United States
Connecting Freebase, Wikipedia, DBpedia, and other Linked Data Spaces (Update 1)
Here are some demonstrations of (X)HTML based representations of resource descriptions from Freebase, DBpedia, BBC Music Beta, CrunchBase, OpenCyc, and UMBEL etc. What is really being demonstrated here is the use of Proxy / Wrapper URIs to expose powerful links across entities distilled from their container documents (or information resources). Of course, you see exactly the same technique in action whenever you visit DBpedia pages. Again, we are moving the concept of Linking from the document to document level, down to the document-entity to document-entity level. The evolution of network link focal points is illustrated in slides 15 to 22 of my Linked Data Planet presentation remix.
Live Examples
- Abraham Lincoln - Freebase (note: link from Freebase to DBpedia via Wikipedia)
- Amazon - CrunchBase (note: links from CrunchBase to DBpedia)
- Coldplay - BBC Music Beta (note: links to Musicbrainz)
- Linked Data Planet Presentation - Also a Slidy, Bibo Ontology, and RDFa usage example
- Music - OpenCyc Concept which exposes a Hyperdata link to its equivalent UMBEL Subject Concept and back
Virtuoso's RDFization Middleware & Linked Data Deployment Architecture Diagram
Note: You can substitute any Web resource URL for my examples. The underlying RDFization and Linked Data deployment functionality of the Virtuoso demo instance takes care of everything else. Also note that the HTML based resource description page capability is now deployed as part of the Virtuoso Sponger component of every Virtuoso installation, starting from version 5.0.8.
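To make the Proxy / Wrapper URI idea concrete, here is a minimal sketch of how such a URI can be composed from an ordinary Web resource URL. The host name and the `/about/html/` path prefix are assumptions made for illustration; check your own Virtuoso instance for the actual description-page prefix it exposes.

```python
from urllib.parse import urlsplit

# Hypothetical host and path prefix (assumption); replace with your own instance's values.
SPONGER_BASE = "http://demo.example.org/about/html/"


def wrapper_uri(resource_url: str, base: str = SPONGER_BASE) -> str:
    """Return a proxy / wrapper URI that asks a description service to describe resource_url."""
    parts = urlsplit(resource_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute HTTP(S) URL: {resource_url!r}")
    # The original URL is appended verbatim so the wrapper URI stays human-readable.
    return base + resource_url


print(wrapper_uri("http://www.crunchbase.com/company/amazon"))
```

The point of the sketch is only that the original URL survives intact inside the wrapper, so any page on the Web can be given an entity-level description without changes at its origin.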
|
08/29/2008 17:53 GMT
|
Modified:
08/29/2008 14:57 GMT
|
Is the Semantic Web necessary (and feasible)?
Here is another "Linked Discourse" effort via a blog post that attempts to add perspective to a developing Web based conversation. In this case, the conversation originates from Juan Sequeda's recent interview with Jana Thompson titled: Is the Semantic Web necessary (and feasible)?
Jana: What are the benefits you see to the business community in adopting semantic technology?
Me: Exposure and exploitation of an untapped treasure trove of interlinked data, information, and knowledge across disparate IT infrastructure via conceptual entry points (Entity IDs / URIs / Data Source Names) that I refer to as "Context Lenses".
Jana: Do you think these benefits are great enough for businesses to adopt the changes?
Me: Yes, infrastructural heterogeneity is a fact of corporate life (growth, mergers, acquisitions etc). Any technology that addresses these challenges is extremely important and valuable. Put differently, the opportunity costs associated with IT infrastructural heterogeneity remain high!
Jana: How large do you think this impact will actually be?
Me: Huge. Enterprises have been aware of their data, information, and knowledge treasure troves for eons. Tapping into these via a materialization of the "information at your fingertips" vision, without any platform lock-in, is something they've been waiting to pursue for as long as I've been in this industry.
Jana: I’ve heard, from contacts in the Bay Area, that they are skeptical of how large this impact of semantic technology will actually be on the web itself, but that the best uses of the technology are for fields such as medical information, or as you mentioned, geo-spatial data.
Me: Unfortunately, those people aren't connecting the Semantic Web with open access to heterogeneous data sources, or with the intrinsic value of holistic exploration of entity based data networks (aka Linked Data).
Jana: Are semantic technologies going to be part of the web because of people championing the cause or because it is actually a necessary step?
Me: Linked Data technology on the Web is a vital extension of the current Web. Semantic Technology without the "Web" component, or what I refer to as "Semantics Inside only" solutions, simply offers little or no value as a Web enhancement because it is incongruent with the essence of the Web i.e., "Open Linkage" and no Silos! A nice looking Silo is still a Silo.
Jana: In the early days of the web, there was an explosion of new websites, due to the ease of learning HTML, from a business to a person to some crackpot talking about aliens. Even today, CSS and XHTML are not so difficult to learn that a determined person can’t learn them from W3C or other tutorials easily. If OWL becomes the norm for websites, what do you think the effects will be on the web? Do you think it is easy enough to learn that it will be readily adopted as part of the standard toolkit for web developers for businesses?
Me: Correction, learning HTML had nothing to do with the Web's success. The value proposition of the Web simply reached critical mass and you simply couldn't afford to not be part of it. The easiest route to joining the Web juggernaut was a Web Page hosted on a Web Site. The question right now is: what's the equivalent driver for the Linked Data Web, bearing in mind the initial Web bootstrap? My answer is simply this: Open Data Access i.e., getting beyond the data silos that have inadvertently emerged from Web 2.0.
Jana: Following the same theme, do you think this will lead to an internet full of corporate-controlled websites, with sites only written by developers rather than individuals?
Me: Not at all, we will have an Internet owned by its participants i.e., You and the agents that work on your behalf.
Jana: So, you are imagining technologies such as Drupal or Wordpress, that allow users to manage sites without a great deal of knowledge of the nuts and bolts of current web technologies?
Me: Not at all! I envisage simple forms that provide conduits to powerful meshes of interlinked data spaces associated with Web users.
Jana: Given all of the buzz, and my own familiarity with ontology, I am just very curious if the semantic web is truly necessary?
Me: This question is no different than saying: I hear the Web is becoming a Database, and I wonder if a Data Dictionary is necessary, or even if access to structured data is necessary. It's also akin to saying: I accept "Search" as my only mechanism for Web interaction even though in reality, I really want to be able to "Find" and "Process" relevant things at a quicker rate than I do today, relative to the amount of information, and information processing time, at my disposal.
Jana: Will it be worth it to most people to go away from the web in its current form, with keyword searches on sites like Google, to a richer and more interconnected internet with potentially better search technology?
Me: As stated above, we need to add "Find" to the portfolio of functions we seek to perform against the Web. "Finding" and "Searching" are mutually inclusive pursuits at different ends of an activity spectrum.
Jana: For our more technical readers, I have a few additional questions: If no standardization comes about for mapping relational databases to domain ontologies, how do you see that as influencing the decisions about adoption of semantic technology by businesses? After all, the success of technology often lives or dies on its ease of adoption.
Me: Standardization of RDBMS to RDF Mapping is not the critical success factor here (of course it would be nice). As stated earlier, the issue of data integration that arises from IT infrastructural heterogeneity has been with decision makers in the enterprise forever. The problem is now seeping into the broader consumer realm via Web ubiquity. The mistakes made in the enterprise realm are now playing out in the consumer Web realm. In both realms the critical success factors are:
- Scalable productivity relative to exponential growth of data generated across Intranets, Extranets, and the Internet
- Concept based Context Lenses that transcend logical and physical data heterogeneity by putting dereferenceable URIs in front of Line of Business Application Data and/or Web Data Spaces (such as Blogs, Wikis, Discussion Forums, etc.).
|
08/29/2008 15:00 GMT
|
Modified:
08/29/2008 11:28 GMT
|
The Essence of the Matter re. Information Overload
The title of this post is an expression of my gut reaction to the quotes below, which originate from Leo Sauermann's post about the Nepomuk Semantic Desktop for KDE:
Ansgar Bernardi, deputy head of the Knowledge Management Department at Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI, or the German Research Center for Artificial Intelligence) and Nepomuk's coordinator, explains, "The basic problem that we all face nowadays is how to handle vast amounts of information at a sensible rate." According to Bernardi, Nepomuk takes a traditional approach by creating a meta-data layer with well-defined elements that services can be built upon to create and manipulate the information.
The comment above echoes my sentiments about the imminence of "information overload" due to the vast amounts of user generated content on the Internet as a whole. We are going to need to process more and more data within a fixed 24 hour timeframe, while attempting to balance our professional and personal lives. Rest assured, this is a very serious issue, and you cannot even begin to address it without a Web of Linked Data.
"The first idea of building the semantic desktop arose from the fact that one of our colleagues could not remember the girlfriends of his friends," Bernardi says, more than half-seriously. "Because they kept changing -- you know how it is. The point is, you have a vast amount of information on your desktop, hidden in files, hidden in emails, hidden in the names and structures of your folders. Nepomuk gives a standard way to handle such information."
If you get a personal URI for Entity "You", via a Linked Data aware platform (e.g. OpenLink Data Spaces) that virtualizes data across your existing Web data spaces (blogs, feed subscriptions, wikis, shared bookmarks, photo galleries, calendars, etc.), you then only have to remember your URI whenever you need to "Find" something, imagine that!
To conclude, "information overload" is the imminent challenge of our time, and the keys to alleviating it lie in our ability to construct and maintain (via solutions) a few context lenses (URIs) that provide coherent conduits into the dense mesh of structured Linked Data on the Web.
|
08/28/2008 12:17 GMT
|
Modified:
08/28/2008 16:16 GMT
|
Crunchbase & Semantic Web Interview (Remix - Update 1)
After reading Bengee's interview with CrunchBase, I decided to knock up a quick interview remix as part of my usual attempt to add to the developing discourse.
CrunchBase: When we released the CrunchBase API, you were one of the first developers to step up and quickly released a CrunchBase Sponger Cartridge. Can you explain what a CrunchBase Sponger Cartridge is?
Me: A Sponger Cartridge is a data access driver for Web Resources that plugs into our Virtuoso Universal Server (DBMS and Linked Data Web Server combo amongst other things). It uses the internal structure of a resource and/or a web service associated with a resource, to materialize an RDF based Linked Data graph that essentially describes the resource via its properties (Attributes & Relationships).
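As a rough illustration of what "materializing a graph from the structure of a resource" can mean, here is a small sketch. It is not the Sponger Cartridge API; the `rdfize` function, the vocabulary namespace, and the sample record are all hypothetical and exist only to show the shape of the transformation from a flat record (such as one returned by a Web service) to property-based statements about an entity.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical vocabulary namespace, used only for illustration.
VOCAB = "http://example.org/schema#"


def rdfize(resource_url: str, record: Dict[str, object]) -> List[Triple]:
    """Turn a flat record describing resource_url into (subject, predicate, object) triples."""
    # The entity gets its own URI, distinct from the URL of the document that describes it.
    subject = resource_url + "#this"
    triples: List[Triple] = []
    for key, value in record.items():
        # Multi-valued fields become one triple per value.
        values = value if isinstance(value, list) else [value]
        for v in values:
            triples.append((subject, VOCAB + key, str(v)))
    return triples


# Made-up sample record standing in for a Web service response.
record = {"name": "Acme", "category": "ecommerce", "tags": ["books", "cloud"]}
graph = rdfize("http://example.org/company/acme", record)
for triple in graph:
    print(triple)
```

A real cartridge would also map fields onto shared vocabularies and link the entity to equivalents in other data spaces; the sketch stops at the attribute level.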
CrunchBase: And what inspired you to create it?
Me: Bengee built a new space with your data, and we've built a space on the fly from your data which still resides in your domain. Either solution extols the virtues of Linked Data i.e. the ability to explore relationships across data items with high degrees of serendipity (also colloquially known as: following-your-nose pattern in Semantic Web circles).
Bengee posted a notice to the Linking Open Data Community's public mailing list announcing his effort. Bearing in mind the fact that we've been using middleware to mesh the realms of Web 2.0 and the Linked Data Web for a while, it was a no-brainer to knock something up based on the conceptual similarities between Wikicompany and CrunchBase. In a sense, a quadrant of orthogonality is what immediately came to mind re. Wikicompany, CrunchBase, Bengee's RDFization efforts, and ours.
Bengee created an RDF based Linked Data warehouse based on the data exposed by your API, which is exposed via the Semantic CrunchBase data space. In our case we've taken the "RDFization on the fly" approach which produces a transient Linked Data View of the CrunchBase data exposed by your APIs. Our approach is in line with our world view: all resources on the Web are data sources, and the Linked Data Web is about incorporating HTTP into the naming scheme of these data sources so that the conventional URL based hyperlinking mechanism can be used to access a structured description of a resource, which is then transmitted using a range of negotiable representation formats. In addition, based on the fact that we house and publish a lot of Linked Data on the Web (e.g. DBpedia, PingTheSemanticWeb, and others), we've also automatically meshed CrunchBase data with related data in DBpedia and Wikicompany.
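The "negotiable representation formats" part can be sketched with nothing more than an HTTP `Accept` header. The helper below is a hypothetical name of my choosing; it only builds the request and does not send it, so it makes no claim about what a given server will return.

```python
from urllib.request import Request


def description_request(entity_uri: str, media_type: str = "application/rdf+xml") -> Request:
    """Build (but do not send) an HTTP GET that asks for one representation of an entity's description."""
    return Request(entity_uri, headers={"Accept": media_type}, method="GET")


# Same entity URI, two requested representations.
as_rdf = description_request("http://dbpedia.org/resource/Abraham_Lincoln")
as_html = description_request("http://dbpedia.org/resource/Abraham_Lincoln", "text/html")
print(as_rdf.get_method(), as_rdf.full_url, as_rdf.get_header("Accept"))
```

Passing either request to `urllib.request.urlopen` would perform the actual dereference; whether the response is RDF, HTML, or a redirect to a description document is up to the server.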
CrunchBase: Do you know of any apps that are using CrunchBase Cartridge to enhance their functionality?
Me: Yes, the OpenLink Data Explorer which provides CrunchBase site visitors with the option to explore the Linked Data in the CrunchBase data space. It also allows them to "Mesh" (rather than "Mash") CrunchBase data with other Linked Data sources on the Web without writing a single line of code.
CrunchBase: You have been immersed in the Semantic Web movement for a while now. How did you first get interested in the Semantic Web?
Me: We saw the Semantic Web as a vehicle for standardizing conceptual views of heterogeneous data sources via context lenses (URIs). In 1998, as part of our strategy to expand our business beyond the development and deployment of ODBC, JDBC, and OLE-DB data providers, we decided to build a Virtual Database Engine (see: Virtuoso History), and in doing so we sought a standards based mechanism for the conceptual output of the data virtualization effort. As of the time of the seminal unveiling of the Semantic Web in 1998 we were clear about two things, in relation to the effects of the Web and Internet data management infrastructure inflections: 1) Existing DBMS technology had reached its limits; 2) Web Servers would ultimately hit their functional limits. These fundamental realities compelled us to develop Virtuoso with an eye to leveraging the Semantic Web as a vehicle for completing its technical roadmap.
CrunchBase: Can you put into layman’s terms exactly what RDF and SPARQL are and why they are important? Do they only matter for developers or will they extend past developers at some point and be used by website visitors as well?
Me: RDF (Resource Description Framework) is a Graph based Data Model that facilitates resource description using the Subject, Predicate, and Object principle. Associated with the core data model, as part of the overall framework, are a number of markup languages for expressing your descriptions (just as you express presentation markup semantics in HTML or document structure semantics in XML) that include: RDFa (simple extension of HTML markup for embedding descriptions of things in a page), N3 (a human friendly markup for describing resources), RDF/XML (a machine friendly markup for describing resources).
SPARQL is the query language associated with the RDF Data Model, just as SQL is a query language associated with the Relational Database Model. Thus, when you have RDF based structured and linked data on the Web, you can query against the Web using SPARQL just as you would against an Oracle/SQL Server/DB2/Informix/Ingres/MySQL/etc. DBMS using SQL. That's it in a nutshell.
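For readers who prefer to see it, here is a toy sketch of the Subject, Predicate, Object principle and of what a SPARQL graph pattern does. It is not a SPARQL engine: the `match` helper is hypothetical, the `ex:` names are made up, and a real query would be sent to a SPARQL endpoint rather than evaluated over a Python list.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Illustrative statements; every ex: name is invented for this sketch.
TRIPLES: List[Triple] = [
    ("ex:Acme", "ex:type", "ex:Company"),
    ("ex:Acme", "ex:foundedBy", "ex:Alice"),
    ("ex:Globex", "ex:type", "ex:Company"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return the triples that fit a pattern; None plays the role of a SPARQL variable."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


# Rough SPARQL equivalent: SELECT ?company WHERE { ?company ex:type ex:Company }
companies = [subject for subject, _, _ in match(TRIPLES, p="ex:type", o="ex:Company")]
print(companies)
```

The SQL analogy holds in the sense that the pattern is declarative: you state the shape of the statements you want and the engine finds the bindings.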
CrunchBase: On your website you wrote that “RDF and SPARQL as productivity boosters in everyday web development”. Can you elaborate on why you believe that to be true?
Me: I think the ability to discern a formal description of anything via its discrete properties is of immense value re. productivity, especially when the capability in question results in a graph of Linked Data that isn't confined to a specific host operating system, database engine, application or service, programming language, or development framework. RDF Linked Data is about infrastructure for the true materialization of the "Information at Your Fingertips" vision of yore. Even though it's taken the emergence of RDF Linked Data to make the aforementioned vision tractable, the vision's intrinsic value has been clear for a very long time. Most organizations and/or individuals are quite familiar with the adage: Knowledge is Power. Well, there isn't any knowledge without accessible Information, and there isn't any accessible Information without accessible Data. The Web has always been grounded in accessibility to data (albeit via compound container documents called Web Pages). Bottom line, RDF based Linked Data is about Open Data access by reference using URIs (HTTP based Entity IDs / Data Object IDs / Data Source Names), and as I said earlier, the intrinsic value is pretty obvious bearing in mind the costs associated with integrating disparate and heterogeneous data sources -- across intranets, extranets, and the Internet.
CrunchBase: In his definition of Web 3.0, Nova Spivack proposes that the Semantic Web, or Semantic Web technologies, will be the force behind much of the innovation that will occur during Web 3.0. Do you agree with Nova Spivack? What role, if any, do you feel the Semantic Web will play in Web 3.0?
Me: I agree with Nova. But I see Web 3.0 as a phase within the Semantic Web innovation continuum. Web 3.0 exists because Web 2.0 exists. Both of these Web versions express usage and technology focus patterns. Web 2.0 is about the use of Open Source technologies to fashion Web Services that are ultimately used to drive proprietary Software as Service (SaaS) style solutions. Web 3.0 is about the use of "Smart Data Access" to fashion a new generation of Linked Data aware Web Services and solutions that exploit the federated nature of the Web to maximum effect; proprietary branding will simply be conveyed via quality of data (cleanliness, context fidelity, and comprehension of privacy) exposed by URIs.
Here are some examples of the CrunchBase Linked Data Space, as projected via our CrunchBase Sponger Cartridge:
- Amazon.com
- Microsoft
- Google
- Apple
|
08/27/2008 18:16 GMT
|
Modified:
08/27/2008 20:35 GMT
|
Nice Quote about Information Architecture & World Wide Web
Even with the marginal degrees of serendipitous discovery that the current document oriented Web offers, it's still possible to stumble across poignant gems such as this statement from InspireUX:
The statement above resonates with a lot of my fundamental views about the essence of Web. It also drives right at the core of what we are trying to address with the OpenLink Data Explorer (ODE) which isn't simply about visualization of Linked Data, but about the combination of visualization, user interaction, and unobtrusive exposure and exploitation of Linked Data. Through the use of extensible RDFizers, ODE can bring this powerful combination to bear on Linked Data Entities culled from across the existing Web of Linked Documents.
Do remember, "mission-critical" is no longer a corporate / enterprise theme. The lines of demarcation between the individual and enterprise are blurring at warp speed.
|
08/27/2008 14:47 GMT
|
Modified:
08/27/2008 15:31 GMT
|
Virtuoso, Linked Data, and Linq2Rdf (Update 1)
There are many challenges that have dogged attempts to mesh the DBMS & Object Technology realms for years. Critical issues include:
- Data access & manipulation impedance arising from Model mismatches between Relational Databases and Object Oriented & Object based Languages
- Record / Data Object Referencing by ID.
The big deal about LINQ has been the singular focus on addressing the first of these points, in particular.
I've already written about the Linq2Rdf effort that meshes the best of .NET with the virtues of the "Linked Data Web".
Here is an architecture diagram that seeks to illustrate the powerful data access and manipulation options that the combination of Linq2RDF and Linked Data deliver:
What may not have been obvious to most in the past, is the fact that Mapping from Object Models to Relational Models wasn't really the solution to the problem at hand. Instead, the mapping should have been the other way around i.e., Relational to Object Model mapping. The emergence of RDF and RDBMS to RDF mapping technology is what makes this age-old headache addressable in very novel ways.
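To show the direction of the mapping (Relational to a graph of objects, rather than the other way around), here is a deliberately naive sketch. It is not Virtuoso's RDF Views mechanism; the `rows_to_triples` function, the `http://example.org/` prefix, and the sample table are assumptions chosen for illustration. The convention used is: each row becomes an entity URI built from its key, each column becomes a predicate, and each cell becomes an object value.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical URI prefix (assumption)


def rows_to_triples(conn, table, key):
    """Map every row of a table to triples: row -> subject URI, column -> predicate, cell -> object."""
    # The table name is interpolated directly, so it must come from trusted code, not user input.
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    triples = []
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key]}#this"
        for col, val in record.items():
            # The key is already encoded in the subject URI; NULLs produce no statement.
            if col != key and val is not None:
                triples.append((subject, f"{BASE}schema/{table}#{col}", val))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Alice', 'Lexington')")
triples = rows_to_triples(conn, "customer", "id")
for t in triples:
    print(t)
```

Once rows are addressable by URI, the "Record / Data Object Referencing by ID" problem stops being engine-specific: the ID is a Web-scale reference rather than a key local to one database.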
Related
- RDBMS to RDF Mapping - W3C Workshop Presentation
- Virtuoso RDBMS to RDF Mapping - W3C Rdb2Rdf Incubator Group Presentation
- Creating RDF Views over SQL Data Sources - Technology Tutorial
|
08/26/2008 12:36 GMT
|
Modified:
08/27/2008 08:10 GMT
|
DBpedia Architecture
Here is a pictorial of DBpedia's Linked Data Deployment & Data Management architecture:
Key points:
SPASQL (SPARQL extension for SQL) enables the intelligent resource representation request handling and URI dereferencing that underlie "Linked Data" (i.e., Hyperdata Linking) to occur in-process.
|
08/22/2008 02:50 GMT
|
Modified:
08/21/2008 23:10 GMT
|
The Future of the Desktop
Jason Kolb (who initially nudged me to chime in), and then ReadWriteWeb, and of course Nova's Twine about the topic, have collectively started an interesting discussion about Web.vNext (3.0 and beyond) under the heading: The Future of the Desktop.
My contribution to the developing discourse takes the form of a Q&A session. I've taken the questions posed and provided answers that express my particular points of view:
Q: Is the desktop of the future going to just be a web-hosted version of the same old-fashioned desktop metaphors we have today?
A: No, it's going to be a more Web Architecture aware and compliant variant exposed by appropriate metaphors.
Q: The desktop of the future is going to be a hosted web service
A: A vessel for exploiting the virtues of the Linked Data Web.
Q: The Browser is Going to Swallow Up the Desktop
A: Literally, of course not! Metaphorically, of course! And then the Browser metaphor will decompose into function specific bits of Web interaction amenable to orchestration by its users.
Q: The focus of the desktop will shift from information to attention
A: No! Knowledge, Information, and Data sharing courtesy of Hyperdata & Hypertext Linking.
Q: Users are going to shift from acting as librarians to acting as daytraders
A: They were Librarians in Web 1.0, Journalists in Web 2.0, Analysts in Web 3.0 (i.e., they analyze structured and interlinked data), and CEOs in Web 4.0 (i.e., they get Agents to do stuff intelligently en route to making decisions).
Q: The Webtop will be more social and will leverage and integrate collective intelligence
A: The Linked Data Web vessel will only require you to fill in your profile (once) and then serendipitous discovery and meshing of relevant data will simply happen (the serendipity quotient will grow in line with Linked Data Web density).
Q: The desktop of the future is going to have powerful semantic search and social search capabilities built-in
A: It is going to be able to "Find" rather than "Search" for stuff courtesy of the Linked Data Web.
Q: Interactive shared spaces will replace folders
A: Data Spaces and their URIs (Data Source Names) replace everything. You simply choose the exploration metaphor that best suits your space interaction needs.
Q: The Portable Desktop
A: Ubiquitous Desktop i.e. do the same thing (all answers above) on any device connected to the Web.
Q: The Smart Desktop
A: Vessels with access to Smart Data (Linked Data + Action driven Context sprinklings).
Q: Federated, open policies and permissions
A: More federation for sure, XMPP will become a lot more important, and OAuth will enable resurgence of the federated aspects of the Web and Internet.
Q: The personal cloud
A: Personal Data Spaces plugged into Clouds (Intranet, Extranet, Internet).
Q: The WebOS
A: An operating system endowed with traditional Database and Host Operating system functionality such as: RDF Data Model, SPARQL Query Language, URI based Pointer mechanism, and HTTP based message Bus.
Q: Who is most likely to own the future desktop?
A: You! And all you need is a URI (an ID or Data Source Name for "Entity You") and a Profile Page (a place where "Entity You" is Described by You).
One Last Thing
You can get a feel for the future desktop by downloading and then installing the OpenLink Data Explorer plugin for Firefox, which allows you to switch viewing modes between Web Page and Linked Data behind the page. :-)
|
08/21/2008 15:26 GMT
|
Modified:
08/21/2008 16:17 GMT
|
Yahoo! and the Linked Data Web in a Nutshell (Updated)
This automated mail from Yahoo! speaks for itself re. Linked Data Web incomprehension!
Greetings!
This is an automated email from Yahoo! Application Gallery. Please do not reply to this email message. We regret to inform you that your application 'Blog Data Space' has been rejected. You can view all your applications here.
Moderator Comments: insufficient info
Regards,
The Yahoo! Application Gallery Team
Your use of Yahoo! Application Gallery is subject to http://docs.yahoo.com/info/terms/
Message to Yahoo!:
Why bother? You clearly see the Web in a totally different light to the rest of us.
If you want to private label the Web, then fine, just don't park your vehicle in the "Linked Data" or "Semantic Web" spots.
The Web doesn't need any subjectivity bootstraps or booster-shots, it just needs open access to Structured and Linked Data via URIs.
|
08/19/2008 19:43 GMT
|
Modified:
08/19/2008 18:14 GMT
|
Response to: Whole Data Post (Update 3)
This post is in response to Glenn McDonald's post titled: Whole Data, where he highlights a number of issues relating to "Semantic Web" marketing communications and overall messaging, from his perspective.
By coincidence, Glenn and I presented at this month's Cambridge Semantic Web Gathering.
I've provided a dump of Glenn's issues and my responses below:
Issue - RDF
- Ingenious data decomposition idea, but:
- too low-level; the assembly language of data, where we need Java or Ruby
- "resource" is not the issue; there's no such thing as "metadata", it's all data; "meta" is a perspective
- lists need to be effortless, not painful and obscure
- nodes need to be represented, not just implied; they need types and literals in a more pervasive, integrated way.
Response:
RDF stands for Resource Description Framework; it is a Graph based Data Model. The Metadata angle comes from its Meta Content Framework (MCF) origins. You can express and serialize data based on the RDF Data Model using: Turtle, N3, TriX, N-Triples, and RDF/XML.
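The split between data model and serialization can be seen in a few lines. The sketch below writes one statement in N-Triples, the simplest of the formats listed; `ntriples_line` is a hypothetical helper, the example URIs are made up, and it covers only URI and plain-literal objects (no language tags, datatypes, or blank nodes).

```python
def ntriples_line(s: str, p: str, o: str, literal: bool = False) -> str:
    """Serialize one (subject, predicate, object) statement as a single N-Triples line."""
    if literal:
        # Escape the characters that cannot appear raw inside a quoted literal.
        escaped = (o.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n")
                    .replace("\r", "\\r"))
        obj = f'"{escaped}"'
    else:
        obj = f"<{o}>"
    return f"<{s}> <{p}> {obj} ."


print(ntriples_line("http://example.org/acme#this",
                    "http://example.org/schema#name",
                    "Acme", literal=True))
print(ntriples_line("http://example.org/acme#this",
                    "http://example.org/schema#locatedIn",
                    "http://example.org/lexington#this"))
```

The same two statements could be written in Turtle or RDF/XML without changing the graph they describe, which is the sense in which the model is not "low-level" by itself; the perceived level depends on the notation and tooling layered over it.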
Issue - SPARQL (and Freebase's MQL)
These are just appeasement:
- old query paradigm: fishing in dark water with superstitiously tied lures; only works well in carefully stocked lakes
- we don't ask questions by defining answer shapes and then hoping they're dredged up whole.
Response:
SPARQL, MQL, and Entity-SQL are Graph Model oriented Query Languages. Query Languages always accompany Database Engines. SQL is the Relational Model equivalent.
Noble attempt to ground the abstr |