Kingsley Uyi Idehen
Lexington, United States

Re-introducing the Virtuoso Virtual Database Engine

In recent times a lot of the commentary and focus re. Virtuoso has centered on the RDF Quad Store and Linked Data. What sometimes gets overlooked is the sophisticated Virtual Database Engine that provides the foundation for all of Virtuoso's data integration capabilities.

In this post I provide a brief re-introduction to this essential aspect of Virtuoso.

What is it?

This component of Virtuoso is known as the Virtual Database Engine (VDBMS). It provides transparent high-performance and secure access to disparate data sources that are external to Virtuoso. It enables federated access and integration of data hosted by any ODBC- or JDBC-accessible RDBMS, RDF Store, XML database, or Document (Free Text)-oriented Content Management System. In addition, it facilitates integration with Web Services (SOAP-based SOA RPCs or REST-fully accessible Web Resources).

Why is it important?

In the most basic sense, you shouldn't need to upgrade your existing database engine version simply because your current DBMS and Data Access Driver combo isn't compatible with ODBC-compliant desktop tools such as Microsoft Access, Crystal Reports, BusinessObjects, Impromptu, or other ODBC-, JDBC-, ADO.NET-, or OLE DB-compliant applications. Simply place Virtuoso in front of your so-called "legacy database," and let it deliver the compliance levels sought by these tools.

In addition, it's important to note that today's enterprise, through application evolution, company mergers, or acquisitions, is often faced with disparately-structured data residing in any number of line-of-business-oriented data silos. Compounding the problem is the exponential growth of user-generated data via new social media-oriented collaboration tools and platforms. For companies to cost-effectively harness the opportunities accorded by the increasing intersection between line-of-business applications and social media, virtualization of data silos must be achieved, and this virtualization must be delivered in a manner that doesn't prohibitively compromise performance or completely undermine security at either the enterprise or personal level. Again, this is what you get by simply installing Virtuoso.

How do I use it?

The VDBMS may be used in a variety of ways, depending on the data access and integration task at hand. Examples include:

Relational Database Federation

You can make a single ODBC, JDBC, ADO.NET, OLE DB, or XMLA connection to multiple ODBC- or JDBC-accessible RDBMS data sources, concurrently, with the ability to perform intelligent distributed joins against externally-hosted database tables. For instance, you can join internal human resources data against internal sales and external stock market data, even when the HR team uses Oracle, the Sales team uses Informix, and the Stock Market figures come from Ingres!
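To make the "one connection, many databases" idea concrete, here is a minimal sketch that uses SQLite's ATTACH feature as a stand-in for Virtuoso's linked remote tables. It is not Virtuoso's own mechanism; the table names, column names, and sample rows are all invented for illustration.

```python
import sqlite3


def build_federation() -> sqlite3.Connection:
    """Open ONE connection, then attach three separate databases to it.

    Each attached in-memory database stands in for a different back end
    (HR, Sales, Stock Market). All names and rows are hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    for alias in ("hr", "sales", "market"):
        conn.execute(f"ATTACH DATABASE ':memory:' AS {alias}")

    conn.execute("CREATE TABLE hr.employee (id INTEGER, name TEXT)")
    conn.execute(
        "CREATE TABLE sales.deal (employee_id INTEGER, ticker TEXT, amount REAL)"
    )
    conn.execute("CREATE TABLE market.quote (ticker TEXT, price REAL)")

    conn.executemany("INSERT INTO hr.employee VALUES (?, ?)",
                     [(1, "Ada"), (2, "Bob")])
    conn.executemany("INSERT INTO sales.deal VALUES (?, ?, ?)",
                     [(1, "ACME", 100.0), (2, "INIT", 250.0)])
    conn.executemany("INSERT INTO market.quote VALUES (?, ?)",
                     [("ACME", 12.5), ("INIT", 8.0)])
    return conn


def deals_with_quotes(conn: sqlite3.Connection) -> list:
    """Join tables living in three different databases with a single query."""
    sql = """
        SELECT e.name, d.ticker, d.amount, q.price
        FROM hr.employee AS e
        JOIN sales.deal AS d ON d.employee_id = e.id
        JOIN market.quote AS q ON q.ticker = d.ticker
        ORDER BY e.name
    """
    return conn.execute(sql).fetchall()
```

The client only ever sees one connection and one SQL dialect; where each table physically lives is a detail of the engine behind that connection.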

Conceptual Level Data Access using the RDF Model

You can construct RDF Model-based Conceptual Views atop Relational Data Sources. This is about generating HTTP-based Entity-Attribute-Value (E-A-V) graphs using data culled "on the fly" from native or external data sources (Relational Tables/Views, XML-based Web Services, or User Defined Types).
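As a rough illustration of what generating an E-A-V graph from a relational row involves, the sketch below turns one row into (entity, attribute, value) triples whose entity and attribute names are HTTP identifiers. The base URL, the naming pattern, and the function name are my own assumptions, not Virtuoso's actual mapping rules.

```python
from typing import Any, Iterator, Tuple

# Hypothetical base URL under which entities and attributes are named.
BASE = "http://example.com/data"


def row_to_triples(table: str, key: str, row: dict) -> Iterator[Tuple[str, str, Any]]:
    """Yield one (entity, attribute, value) triple per non-key, non-NULL column."""
    entity = f"{BASE}/{table}/{row[key]}"
    for column, value in row.items():
        if column == key or value is None:
            continue  # the key is folded into the identifier; NULLs yield nothing
        yield (entity, f"{BASE}/schema/{table}#{column}", value)
```

For example, `list(row_to_triples("employee", "id", {"id": 7, "name": "Ada"}))` gives a single triple that names the employee by URL and attaches the `name` attribute to it.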

You can also derive RDF Model-based Conceptual Views from Web Resource transformations "on the fly" -- the Virtuoso Sponger (RDFizing middleware component) enables you to generate RDF Model Linked Data via a RESTful Web Service or within the process pipeline of the SPARQL query engine (i.e., you simply use the URL of a Web Resource in the FROM clause of a SPARQL query).
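Here is a small sketch of what "a Web Resource URL in the FROM clause" looks like when sent over the standard SPARQL protocol, where the query text travels in the `query` parameter. The endpoint address and the helper name are placeholders, and whether a given endpoint fetches and transforms the resource on the fly depends on how that server is configured.

```python
from urllib.parse import urlencode


def sponge_query_url(endpoint: str, resource_url: str, limit: int = 10) -> str:
    """Build a SPARQL protocol GET URL whose FROM clause names a Web Resource."""
    query = f"SELECT * FROM <{resource_url}> WHERE {{ ?s ?p ?o }} LIMIT {limit}"
    # urlencode percent-encodes the query text for use in a URL.
    return f"{endpoint}?{urlencode({'query': query})}"
```

Calling `sponge_query_url("http://example.com/sparql", "http://example.org/page")` returns a URL you could paste into a browser or hand to any HTTP client.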

It's important to note that Views take the form of HTTP links that serve as both Data Source Names and Data Source Addresses. This enables you to query and explore relationships across entities (i.e., People, Places, and other Real World Things) via HTTP clients (e.g., Web Browsers) or directly via SPARQL Query Language constructs transmitted over HTTP.
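Because the identifier is also the address, looking up an entity is just an HTTP GET, with the Accept header stating which representation the client wants. The sketch below only builds the request and sends nothing; the entity URL in the usage note and the default media type are illustrative assumptions.

```python
from urllib.request import Request


def describe_request(entity_url: str, media_type: str = "text/turtle") -> Request:
    """Prepare (but do not send) a GET request for an entity description."""
    return Request(entity_url, headers={"Accept": media_type}, method="GET")
```

Passing the result of `describe_request("http://example.com/data/employee/7")` to `urllib.request.urlopen` would perform the actual lookup against a live server.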

Conceptual Level Data Access using ADO.NET Entity Frameworks

As an alternative to RDF, Virtuoso can expose ADO.NET Entity Frameworks-based Conceptual Views over Relational Data Sources. It achieves this by generating Entity Relationship graphs via its native ADO.NET Provider, exposing all externally attached ODBC- and JDBC-accessible data sources. In addition, the ADO.NET Provider supports direct access to Virtuoso's native RDF database engine, eliminating the need for resource intensive Entity Frameworks model transformations.

02/17/2010 16:38 GMT Modified: 02/17/2010 16:54 GMT
Time for RDBMS Primacy Downgrade is Nigh! (No Embedded Images Edition)

As the world works its way through a "once in a generation" economic crisis, the long overdue downgrade of the RDBMS from its pivotal position at the apex of the data access and data management pyramid is nigh.

What is the Data Access and Data Management Value Pyramid?

It is a top-down view of the data access and data management value chain, as depicted in the diagram referenced below. The term "apex" simply indicates value primacy, which takes the form of a data access API based entry point into a DBMS realm -- aligned to an underlying data model. Examples of data access APIs include: Native Call Level Interfaces (CLIs), ODBC, JDBC, ADO.NET, OLE-DB, XMLA, and Web Services.

See: AVF Pyramid Diagram.

The degree to which ad-hoc views of data managed by a DBMS can be produced and dispatched to relevant data consumers (e.g. people), without compromising concurrency, data durability, and security, determines the "Agility Value Factor" (AVF) of a given DBMS. Remember, agility as the cornerstone of environmental adaptation is as old as the concept of evolution, and intrinsic to all pursuits of primacy.

In simpler business oriented terms, look at AVF as the degree to which DBMS technology affects the ability to effectively implement "Market Leadership Discipline" along the following pathways: innovation, operational excellence, or customer intimacy.

Why has RDBMS Primacy Endured?

Historically, at least since the late '80s, the RDBMS genre of DBMS has consistently offered the highest AVF relative to other DBMS genres en route to primacy within the value pyramid. The desire to improve on paper reports and spreadsheets is basically what DBMS technology has fundamentally addressed to date, even though conceptual level interaction with data has never been its forte.

See: RDBMS Primacy Diagram.

For more than 10 years -- at the very least -- the limitations of the traditional RDBMS in the realm of conceptual level interaction with data across diverse data sources and schemas (enterprise, Web, and Internet) have been crystal clear to many RDBMS technology practitioners, as indicated by some of the quotes excerpted below:

"Future of Database Research is excellent, but what is the future of data?"

"..it is hard for me to disagree with the conclusions in this report. It captures exactly the right thoughts, and should be a must read for everyone involved in the area of databases and database research in particular."

-- Dr. Anant Jhingran, CTO, IBM Information Management Systems, commenting on the 2007 RDBMS technology retreat attended by a number of key DBMS technology pioneers and researchers.

"One size fits all": A concept whose time has come and gone

  1. They are direct descendants of System R and Ingres and were architected more than 25 years ago
  2. They are advocating "one size fits all"; i.e. a single engine that solves all DBMS needs.

-- Prof. Michael Stonebraker, one of the founding fathers of the RDBMS industry.

Until this point in time, the requisite confluence of "circumstantial pain" and "open standards" based technology required to enable an objective "compare and contrast" of RDBMS engine virtues and viable alternatives hasn't occurred. Thus, the RDBMS has retained its position of primacy, albeit on a "one size fits all" basis.

Circumstantial Pain

As mentioned earlier, we are in the midst of an economic crisis that is ultimately about a consistent inability to connect dots across a substrate of interlinked data sources that transcend traditional data access boundaries with high doses of schematic heterogeneity. Ironically, in the era of the dot-com, we haven't been able to make meaningful connections between relevant "real-world things" that extend beyond primitive data hosted in database tables and content management style document containers. We've struggled to achieve this in the most basic sense, let alone evolve our ability to connect in line with the exponential rate at which the Internet & Web are spawning "universes of discourse" (data spaces) that emanate from user activity (within the enterprise and across the Internet & Web). In a nutshell, we haven't been able to upgrade our interaction with data such that "conceptual models" and resulting "context lenses" (or facets) become concrete. By this I mean: real-world entity interaction making its way into the computer realm, as opposed to the impedance we all suffer today when we transition from conceptual model interaction (real-world) to logical model interaction (when dealing with RDBMS based data access and data management).

Here are some simple examples of what I can best describe as "critical dots unconnected", resulting from an inability to interact with data conceptually:

Government (Globally) -

Financial regulatory bodies couldn't effectively discern that a Credit Default Swap is an Insurance policy in all but literal name. As a result, the cost of an unregulated insurance policy laid the foundation for exacerbating the toxicity of fatally flawed mortgage backed securities. Put simply: a flawed insurance policy was the fallback on a toxic security that financiers found exotic based on superficial packaging.

Enterprises -

Banks still don't understand that capital really does exist in tangible and intangible forms, with the intangible being the variant that is inherently dynamic. For example, a tech company's intellectual capital far exceeds the value of fixtures, fittings, and buildings, but you would be amazed to find that in most cases this vital asset has no significant value when banks get down to the nitty-gritty of debt collateral; instead, a buffer of flawed securitization has occurred atop a borderline static asset class covering the aforementioned buildings, fixtures, and fittings.

In the general enterprise arena, IT executives continued to "rip and replace" existing technology without ever effectively addressing the timeless inability to connect data across disparate data silos generated by internal enterprise applications, let alone the broader need to mesh data from the inside with external data sources. No correlations were made between the growth of buzzwords and the compounding nature of data integration challenges. It's 2009 and only a minuscule number of executives dare fantasize about being anywhere within distance of the "relevant information at your fingertips" vision.

Looking more holistically at data interaction in general, whether you interact with data in the enterprise space (i.e., at work) or on the Internet or Web, you ultimately are delving into a mishmash of disparate computer systems, applications, services (Web or SOA), and databases (of the RDBMS variety in a majority of cases) associated with a plethora of disparate schemas. Yet even today "rip and replace" is still the norm pushed by most vendors, pitting one monoculture against another as exemplified by irrelevances such as FOSS/LAMP vs. Commercial or Web vs. Enterprise, when none of this matters if the data access and integration issues aren't recognized, let alone addressed (see: Applications are Like Fish and Data Like Wine).

Like the current credit-crunch, exponential growth of data originating from disparate application databases and associated schemas, within shrinking processing time frames, has triggered a rethinking of what defines data access and data management value today en route to an inevitable RDBMS downgrade within the value pyramid.

Technology

There have been many attempts to address real-world modeling requirements across the broader DBMS community from Object Databases to Object-Relational Databases, and more recently the emergence of simple Entity-Attribute-Value model DBMS engines. In all cases failure has come down to the existence of one or more of the following deficiencies, across each potential alternative:

  1. Query language standardization - nothing close to SQL standardization
  2. Data Access API standardization - nothing close to ODBC, JDBC, OLE-DB, or ADO.NET
  3. Wire protocol standardization - nothing close to HTTP
  4. Distributed Identity infrastructure - nothing close to the non-repudiable digital Identity that foaf+ssl accords
  5. Use of Identifiers as network based pointers to data sources - nothing close to RDF based Linked Data
  6. Negotiable data representation - nothing close to MIME and HTTP based Content Negotiation
  7. Scalability especially in the era of Internet & Web scale.

Entity-Attribute-Value with Classes & Relationships (EAV/CR) data models

A common characteristic shared by all post-relational database management systems (from Object Relational to pure Object) is an orientation towards variations of EAV/CR based data models. Unfortunately, all efforts in the EAV/CR realm have typically suffered from at least one of the deficiencies listed above. In addition, the same "one DBMS model fits all" approach that lies at the heart of the RDBMS downgrade also exists in the EAV/CR realm.

What Comes Next?

The RDBMS is not going away (ever), but its era of primacy -- by virtue of its placement at the apex of the data access and data management value pyramid -- is over! I make this bold claim for the following reasons:

  1. The Internet aided "Global Village" has brought "Open World" vs "Closed World" assumption issues to the fore; e.g., the current global economic crisis remains centered on the inability to connect dots across "Open World" and "Closed World" data frontiers.
  2. Entity-Attribute-Value with Classes & Relationships (EAV/CR) based DBMS models are more effective when dealing with disparate data associated with disparate schemas, across disparate DBMS engines, host operating systems, and networks.

Based on the above, it is crystal clear that a different kind of DBMS -- one with higher AVF relative to the RDBMS -- needs to sit atop today's data access and data management value pyramid. The characteristics of this DBMS must include the following:

  1. Every item of data (Datum/Entity/Object/Resource) has Identity
  2. Identity is achieved via Identifiers that aren't locked at the DBMS, OS, Network, or Application levels
  3. Object Identifiers and Object values are independent (extricably linked by association)
  4. Object values should be de-referenceable via Object Identifier
  5. Representation of de-referenced value graph (entity, attributes, and values mesh) must be negotiable (i.e. content negotiation)
  6. Structured query language must provide mechanism for Creation, Deletion, Updates, and Querying of data objects
  7. Performance & Scalability across "Closed World" (enterprise) and "Open World" (Internet & Web) realms.
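To show how the first six characteristics fit together, here is a deliberately tiny in-memory sketch. It is a toy, not a description of any real product: the class name, method names, and example identifiers are invented, and it makes no attempt at the seventh item (performance and scalability).

```python
import json
from collections import defaultdict


class EntityStore:
    """Toy entity store: identifier -> {attribute: value}. Illustrative only."""

    def __init__(self):
        self._graph = defaultdict(dict)

    def create(self, identifier, **attributes):
        # Items 1-3: every entity has an identifier held apart from its values.
        self._graph[identifier].update(attributes)

    def update(self, identifier, attribute, value):
        self._graph[identifier][attribute] = value

    def delete(self, identifier):
        self._graph.pop(identifier, None)

    def dereference(self, identifier):
        # Item 4: look up an entity's values by its identifier.
        return dict(self._graph.get(identifier, {}))

    def query(self, attribute, value):
        # Item 6: a (very) minimal query facility.
        return sorted(i for i, attrs in self._graph.items()
                      if attrs.get(attribute) == value)

    def represent(self, identifier, media_type="application/json"):
        # Item 5: the same description in a negotiable representation.
        description = self.dereference(identifier)
        if media_type == "application/json":
            return json.dumps(description, sort_keys=True)
        if media_type == "text/plain":
            return "\n".join(f"{identifier} {k} {v}"
                             for k, v in sorted(description.items()))
        raise ValueError(f"unsupported media type: {media_type}")
```

A relationship is simply an attribute whose value is another identifier, e.g. `store.create("http://example.com/id/ada", worksFor="http://example.com/id/acme")`, which is what lets a client move from one entity to the next.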

Quick recap: I am not saying that RDBMS engine technology is dead or obsolete. I am simply stating that the era of RDBMS primacy within the data access and data management value pyramid is over.

The problem domain (conceptual model views over heterogeneous data sources) at the apex of the aforementioned pyramid has simply evolved beyond the natural capabilities of the RDBMS, which is rooted in "Closed World" assumptions regarding data definition, access, and management. The need to maintain domain based conceptual interaction with data is now palpable at every echelon within our "Global Village" - Internet, Web, Enterprise, Government etc.

It is my personal view that an EAV/CR model based DBMS, with support for the seven items enumerated above, can trigger the long anticipated RDBMS downgrade. Such a DBMS would be inherently multi-model because you would need the best of RDBMS and EAV/CR model engines in a single product, with in-built support for HTTP and other Internet protocols in order to effectively address data representation and serialization issues.

EAV/CR Oriented Data Access & Management Technology

Examples of contemporary EAV/CR frameworks that provide concrete conceptual layers for data access and data management currently include:

The frameworks above provide the basis for a revised AVF pyramid, as depicted below, that reflects today's data access and management realities i.e., an Internet & Web driven global village comprised of interlinked distributed data objects, compatible with "Open World" assumptions.

See: New EAV/CR Primacy Diagram.

01/27/2009 19:19 GMT Modified: 05/29/2009 10:39 GMT
The Time for RDBMS Primacy Downgrade is Nigh!

01/24/2009 20:04 GMT Modified: 07/26/2009 10:20 GMT
New ADO.NET 3.x Provider for Virtuoso Released (Update 2)

I am pleased to announce the immediate availability of the Virtuoso ADO.NET 3.5 data provider for Microsoft's .NET platform.

What is it?

A data access driver/provider that provides conceptual entity oriented access to RDBMS data managed by Virtuoso. Naturally, it also uses Virtuoso's in-built virtual / federated database layer to provide access to ODBC and JDBC accessible RDBMS engines such as: Oracle (7.x to latest), SQL Server (4.2 to latest), Sybase, IBM Informix (5.x to latest), IBM DB2, Ingres (6.x to latest), Progress (7.x to OpenEdge), MySQL, PostgreSQL, Firebird, and others using our ODBC or JDBC bridge drivers.

Benefits?

Technical:

It delivers an Entity-Attribute-Value + Classes & Relationships model over disparate data sources that are materialized as .NET Entity Framework Objects, which are then consumable via ADO.NET Data Object Services, LINQ for Entities, and other ADO.NET data consumers.

The provider is fully integrated into Visual Studio 2008 and delivers the same "ease of use" offered by Microsoft's own SQL Server provider, but across Virtuoso, Oracle, Sybase, DB2, Informix, Ingres, Progress (OpenEdge), MySQL, PostgreSQL, Firebird, and others. The same benefits also apply uniformly to Entity Frameworks compatibility.

Bearing in mind that Virtuoso is a multi-model (hybrid) data manager, this also implies that you can use .NET Entity Frameworks against all data managed by Virtuoso. Remember, Virtuoso's SQL channel is a conduit to Virtuoso's core; thus, RDF (courtesy of SPASQL as already implemented re. Jena/Sesame/Redland providers), XML, and other data forms stored in Virtuoso also become accessible via .NET's Entity Frameworks.


Strategic:

You can choose which entity oriented data access model works best for you: RDF Linked Data & SPARQL or .NET Entity Frameworks & Entity SQL. Either way, Virtuoso delivers a commercial grade, high-performance, secure, and scalable solution.


How do I use it?

Simply follow one of the guides below:

Note: When working with external or 3rd party databases, simply use the Virtuoso Conductor to link the external data source into Virtuoso. Once linked, the remote tables will simply be treated as though they are native Virtuoso tables leaving the virtual database engine to handle the rest. This is similar to the role the Microsoft JET engine played in the early days of ODBC, so if you've ever linked an ODBC data source into Microsoft Access, you are ready to do the same using Virtuoso.

01/08/2009 04:36 GMT Modified: 01/08/2009 09:05 GMT
Crunchbase & Semantic Web Interview (Remix - Update 1)

After reading Bengee's interview with CrunchBase, I decided to knock up a quick interview remix as part of my usual attempt to add to the developing discourse.

CrunchBase: When we released the CrunchBase API, you were one of the first developers to step up and quickly released a CrunchBase Sponger Cartridge. Can you explain what a CrunchBase Sponger Cartridge is?
Me: A Sponger Cartridge is a data access driver for Web Resources that plugs into our Virtuoso Universal Server (DBMS and Linked Data Web Server combo amongst other things). It uses the internal structure of a resource and/or a web service associated with a resource, to materialize an RDF based Linked Data graph that essentially describes the resource via its properties (Attributes & Relationships).

CrunchBase: And what inspired you to create it?
Me: Bengee built a new space with your data, and we've built a space on the fly from your data which still resides in your domain. Either solution extols the virtues of Linked Data i.e. the ability to explore relationships across data items with high degrees of serendipity (also colloquially known as: following-your-nose pattern in Semantic Web circles).
Bengee posted a notice to the Linking Open Data Community's public mailing list announcing his effort. Bearing in mind the fact that we've been using middleware to mesh the realms of Web 2.0 and the Linked Data Web for a while, it was a no-brainer to knock something up based on the conceptual similarities between Wikicompany and CrunchBase. In a sense, a quadrant of orthogonality is what immediately came to mind re. Wikicompany, CrunchBase, Bengee's RDFization efforts, and ours.
Bengee created an RDF based Linked Data warehouse based on the data exposed by your API, which is exposed via the Semantic CrunchBase data space. In our case we've taken the "RDFization on the fly" approach which produces a transient Linked Data View of the CrunchBase data exposed by your APIs. Our approach is in line with our world view: all resources on the Web are data sources, and the Linked Data Web is about incorporating HTTP into the naming scheme of these data sources so that the conventional URL based hyperlinking mechanism can be used to access a structured description of a resource, which is then transmitted using a range of negotiable representation formats. In addition, based on the fact that we house and publish a lot of Linked Data on the Web (e.g. DBpedia, PingTheSemanticWeb, and others), we've also automatically meshed CrunchBase data with related data in DBpedia and Wikicompany data.
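
To make "RDFization on the fly" a little more concrete, here is a minimal Python sketch of the general idea: take a structured record returned by a Web API and project it as subject/predicate/object statements named with HTTP URIs. This is not the Sponger Cartridge code, and the JSON field names, base URI, and vocabulary URI are all assumptions invented for the example rather than the real CrunchBase API.

```python
import json

# Hypothetical JSON, loosely shaped like a company record from a Web API.
# Field names are illustrative only.
record_json = """
{
  "permalink": "example-co",
  "name": "Example Co",
  "homepage_url": "http://example.com/",
  "competitors": ["other-co"]
}
"""

BASE = "http://example.org/company/"   # assumed naming scheme for entities
VOCAB = "http://example.org/vocab#"    # assumed vocabulary for properties


def rdfize(record):
    """Turn one record into a list of (subject, predicate, object) triples."""
    subject = BASE + record["permalink"]
    triples = [
        (subject, VOCAB + "name", record["name"]),
        (subject, VOCAB + "homepage", record["homepage_url"]),
    ]
    # Relationships become links to other HTTP-named resources,
    # which is what makes the result "linked" data.
    for other in record.get("competitors", []):
        triples.append((subject, VOCAB + "competitor", BASE + other))
    return triples


triples = rdfize(json.loads(record_json))
for triple in triples:
    print(triple)
```

Nothing is stored in this sketch; the triples are computed from the source record each time, which is the "transient view" half of the comparison with a warehouse.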

CrunchBase: Do you know of any apps that are using CrunchBase Cartridge to enhance their functionality?
Me: Yes: the OpenLink Data Explorer, which provides CrunchBase site visitors with the option to explore the Linked Data in the CrunchBase data space. It also allows them to "Mesh" (rather than "Mash") CrunchBase data with other Linked Data sources on the Web without writing a single line of code.

CrunchBase: You have been immersed in the Semantic Web movement for a while now. How did you first get interested in the Semantic Web?
Me: We saw the Semantic Web as a vehicle for standardizing conceptual views of heterogeneous data sources via context lenses (URIs). In 1998, as part of our strategy to expand our business beyond the development and deployment of ODBC, JDBC, and OLE-DB data providers, we decided to build a Virtual Database Engine (see: Virtuoso History), and in doing so we sought a standards based mechanism for the conceptual output of the data virtualization effort. As of the time of the seminal unveiling of the Semantic Web in 1998 we were clear about two things, in relation to the effects of the Web and Internet data management infrastructure inflections: 1) Existing DBMS technology had reached its limits; 2) Web Servers would ultimately hit their functional limits. These fundamental realities compelled us to develop Virtuoso with an eye to leveraging the Semantic Web as a vehicle for completing its technical roadmap.

CrunchBase: Can you put into layman’s terms exactly what RDF and SPARQL are and why they are important? Do they only matter for developers or will they extend past developers at some point and be used by website visitors as well?
Me: RDF (Resource Description Framework) is a Graph based Data Model that facilitates resource description using the Subject, Predicate, and Object principle. Associated with the core data model, as part of the overall framework, are a number of markup languages for expressing your descriptions (just as you express presentation markup semantics in HTML or document structure semantics in XML). These include RDFa (a simple extension of HTML markup for embedding descriptions of things in a page), N3 (a human friendly markup for describing resources), and RDF/XML (a machine friendly markup for describing resources).
SPARQL is the query language associated with the RDF Data Model, just as SQL is a query language associated with the Relational Database Model. Thus, when you have RDF based structured and linked data on the Web, you can query against the Web using SPARQL just as you would against an Oracle/SQL Server/DB2/Informix/Ingres/MySQL/etc. DBMS using SQL. That's it in a nutshell.
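
For readers who prefer to see the idea in code, here is a minimal Python sketch of the Subject, Predicate, Object model and of what a SPARQL-style query does with it. It is a toy matcher, not a SPARQL engine, and every identifier in the graph is made up for illustration.

```python
# A tiny in-memory graph of (subject, predicate, object) triples.
graph = [
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob", "ex:worksFor", "ex:acme"),
    ("ex:acme", "ex:locatedIn", "ex:boston"),
    ("ex:carol", "ex:worksFor", "ex:globex"),
]


def match(pattern, triple, bindings):
    """Extend bindings so that pattern matches triple; return None on failure."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable: bind it, or check it against an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            # A constant must match exactly.
            return None
    return result


def query(patterns, triples):
    """Join a list of triple patterns, roughly like a basic graph pattern."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Roughly equivalent in spirit to:
#   SELECT ?person WHERE { ?person ex:worksFor ?org . ?org ex:locatedIn ex:boston }
results = query(
    [("?person", "ex:worksFor", "?org"), ("?org", "ex:locatedIn", "ex:boston")],
    graph,
)
people = sorted(solution["?person"] for solution in results)
print(people)
```

The shared variable `?org` does the work a join condition does in SQL; the difference is that the "tables" are just statements about things named by identifiers.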

CrunchBase: On your website you wrote that “RDF and SPARQL as productivity boosters in everyday web development”. Can you elaborate on why you believe that to be true?
Me: I think the ability to discern a formal description of anything via its discrete properties is of immense value re. productivity, especially when the capability in question results in a graph of Linked Data that isn't confined to a specific host operating system, database engine, application or service, programming language, or development framework. RDF Linked Data is about infrastructure for the true materialization of the "Information at Your Fingertips" vision of yore. Even though it's taken the emergence of RDF Linked Data to make the aforementioned vision tractable, the comprehension of the vision's intrinsic value has been clear for a very long time. Most organizations and/or individuals are quite familiar with the adage "Knowledge is Power." Well, there isn't any knowledge without accessible Information, and there isn't any accessible Information without accessible Data. The Web has always been grounded in accessibility to data (albeit via compound container documents called Web Pages).
Bottom line, RDF based Linked Data is about Open Data access by reference using URIs (HTTP based Entity IDs / Data Object IDs / Data Source Names), and as I said earlier, the intrinsic value is pretty obvious bearing in mind the costs associated with integrating disparate and heterogeneous data sources -- across intranets, extranets, and the Internet.

CrunchBase: In his definition of Web 3.0, Nova Spivack proposes that the Semantic Web, or Semantic Web technologies, will be the force behind much of the innovation that will occur during Web 3.0. Do you agree with Nova Spivack? What role, if any, do you feel the Semantic Web will play in Web 3.0?
Me: I agree with Nova. But I see Web 3.0 as a phase within the Semantic Web innovation continuum. Web 3.0 exists because Web 2.0 exists. Both of these Web versions express usage and technology focus patterns. Web 2.0 is about the use of Open Source technologies to fashion Web Services that are ultimately used to drive proprietary Software as a Service (SaaS) style solutions. Web 3.0 is about the use of "Smart Data Access" to fashion a new generation of Linked Data aware Web Services and solutions that exploit the federated nature of the Web to maximum effect; proprietary branding will simply be conveyed via quality of data (cleanliness, context fidelity, and comprehension of privacy) exposed by URIs.

Here are some examples of the CrunchBase Linked Data Space, as projected via our CrunchBase Sponger Cartridge:

  1. Amazon.com
  2. Microsoft
  3. Google
  4. Apple
# PermaLink Comments [0]
08/27/2008 18:16 GMT Modified: 08/27/2008 20:35 GMT
Birds of a Feather Flock Together - Mac OS X & Rails

A very cool video promo for Ruby on Rails and Mac OS X, or should I say: 37 Signals & Apple :-) Either way, very cool!

BTW - We have just released a collection of High-Performance Data Providers for ActiveRecord. Our providers deliver consistent functionality to RoR developers across Virtuoso, Oracle, SQL Server, Sybase, DB2, Ingres, Informix, and others without compromising performance or cross platform portability.
# PermaLink Comments [0] TrackBack [3390]
10/21/2006 00:55 GMT Modified: 05/28/2007 16:19 GMT
Prerelational DBMS vendors — a quick overview

Prerelational DBMS vendors — a quick overview: "

IBM. With BOMP and D-BOMP, IBM was probably the first company to commercialize precursors to DBMS. (BOMP stood for Bill Of Materials Planning, foreshadowing the hierarchical architecture of IMS.) Out of those grew DL/1 and IMS, IBM’s flagship hierarchical DBMS, and the world’s first dominant DBMS product(s). Of course, IBM also innovated relational DBMS, via the research of E. F. ‘Ted’ Codd, then some prototype products, and eventually the mainframe version of DB2. To this day DB2 on the mainframe remains one of the world’s major DBMS, as does the separate but related product of DB2 for ‘open systems.’

Cincom. In the 1970s, Cincom was probably the most successful independent software product company. Its flagship product was Total, a shallow-network DBMS that was a little more general than the strictly hierarchical IMS. What’s more, Total ran on almost any brand of computer hardware. Cincom remains independent and privately held to this day.

Cullinane/Cullinet. Charlie Bachman innovated a true network DBMS at Honeywell, but it didn’t turn into a serious product at that time. B. F. Goodrich, however, ran a version. This is what John Cullinane’s company bought and turned into IDMS, which at least on the mainframe supplanted Total as the technical, mind share, and probably revenue market leader. Cullinet (as it was then called) ran into technical difficulties, however, losing ground to the more flexible index-based DBMS. It was eventually sold to Computer Associates.

A lot of software industry leaders cut their teeth at Cullinet, notably Andrew ‘Flip’ Filipowski, later the colorful founder of Platinum. Other alumni include Renato ‘Ron’ Zambonini, Dave Litwack, Dave Ireland, and the original PowerBuilder development team. John Landry and Bob Weiler ran the firm for a while toward the end, but they don’t really count; rather, they’re the most prominent alumni of applications pioneer McCormack & Dodge.

Note: Index-based is a term I used in and probably coined for my first report in 1982, comprising both inverted-list and relational RDBMS, as opposed to the link(ed)-list hierarchical and network products such as IMS, Total, and IDMS. The companies that beat Cullinet were long-time rival Software AG, and then especially Applied Data Research; then all three of those independents were blown out by IBM’s DB2. And then the whole mainframe DBMS business was in turn obsoleted by the rise of UNIX … but I’m getting ahead of my story.

Software AG. Like Cincom, Germany-based Software AG is a 1970s DBMS pioneer that has always remained independent and privately held. Sort of. Twice, Software AG of North America was spun off as a separate, eventually public company. Software AG’s flagship DBMS was the inverted list product ADABAS. SAP’s MaxDB was also owned by Software AG for a while (and seemingly by every other significant German computer company as well – or more precisely, by Nixdorf where it was developed, and by Siemens after it bought Nixdorf).

I actually visited Software AG in Darmstadt once. Founder Peter Schnell and key techie Peter Page were both gracious hosts. Schnell was proud of their new building, and especially of the hexagon-based wooden dual desks he’d personally designed. General analytic rule – when the CEO is focused on the décor, this is not a good sign for the company’s near-term prospects. (I call this having an ‘edifice complex.’)

Applied Data Research (ADR). ADR is often credited as being the first independent software company, having introduced products in the late 1960s and prevailed in antitrust struggles against IBM to allow the business to survive. Basically, it sold programmer productivity tools. This led it to acquire Datacom/DB, an inverted-list DBMS developed in the Dallas area. In the early 1980s, Datacom/DB began to boom, and was on a track to surpass both IDMS and ADABAS in market share until DB2 showed up and blew them all away. ADR was particularly aided by its fourth-generation language (4GL) IDEAL, which was an excellent product notwithstanding the famous State of New Jersey fiasco. (As John Landry said to me about that one, ‘4GLs are powerful tools. In particular, they allow you to write bad programs really quickly.’)

ADR was an underappreciated powerhouse, boasting all of the Fortune 100 as customers way back in the early 1980s (yes, even archrival IBM). When the DBMS business stalled, however, ADR was quickly sold — first to Ameritech (the Illinois-based Baby Bell company), and soon thereafter to Computer Associates.

Computer Corporation of America (CCA). CCA’s DBMS Model 204 may have been the best of the prerelational products, boasting an inverted-list architecture akin to that of ADABAS and Datacom/DB. The company was also interesting in that it was first and foremost a government contract research shop, and hence did all sorts of interesting prototype work that sadly never got commercialized. In about 1983 it became clear that the company wasn’t going anywhere, and it put itself up for sale.

I was personally instrumental in that decision. Our investment banker pretended he was considering taking CCA public. CCA President Jim Rothnie showed us revenue projections. I asked how he had gotten them. He replied that he had taken the market size projection 5 years out, assumed 10%, and drawn a ‘plausible curve.’ However, I quickly got Socratic with him. ‘How many salesmen do you have?’ ‘How much revenue does the average experienced salesman produce?’ ‘How many experienced salesmen do you expect to have next year?’ ‘How high do you think their average productivity can grow?’ ‘Let us multiply.’ (Yes, I really said that. I can be a jerk. And anyway Jim was the sort of analytic guy one can say that to without giving serious offense.)

CCA was sold to a Canadian insurance company whose name I’ve now forgotten. Eventually, it was spun back out (perhaps after some intermediate changes of ownership), and resurfaced as primarily a data integration company, called Praxis.

In the real old days (mid 1970s, perhaps), Model 204 was resold by Informatics (later Informatics General, later the hostile takeover that became the guts of Sterling Software, which like so many other companies was eventually absorbed into Computer Associates). I know this because Richard Currier used to sell the product when he worked at Informatics. That probably makes Richard and me about the only two people who still remember the fact.

Hmm. I forgot to mention Intel’s System 2000. Well, truth be told it was a dying product even back when I first became an analyst in 1981, and I recall nothing about it, except Gene Lowenthal’s observation that Intel had had trouble selling chips and DBMS through the same salesforce. I think Al Sisto, who I probably met when he was head of sales at RTI (Relational Technology, Inc. — later called Ingres), came out of that business, but I’m not 100% sure. I remember Pete Tierney from that RTI management team more clearly anyway, although that’s mainly because we stayed in touch at subsequent companies over the years.

"

(Via Software Memories.)

# PermaLink Comments [0] TrackBack [3]
04/13/2006 20:04 GMT Modified: 02/13/2007 10:40 GMT
Ingres: Can You Ever Go Back?

A nice piece of DBMS history. I certainly believe that DBMS market history is getting more relevant by the second :-) Enjoy the post!

(From Dave Kellogg, Mark Logic's CEO)

Ingres: Can You Ever Go Back?: "In an eerie turn of events, my ex-, ex-, ex-employer Ingres Corporation has been resurrected by, of all people, Terry Garnett, one of Oracle's early marketing vice presidents, now of Garnett & Helfrich Capital. They have bought Ingres back from Computer Associates, open sourced it, and are building a RedHat-like, open-source business around it. To boot, they have built quite an executive team, including the recent poaching of well-respected vice president Bill Maimone from Oracle.

The whole episode reminds me of my favorite bad sci-fi movie, Escape from New York, where lead character Snake Plissken is consistently greeted with: 'Snake Plissken? I thought you were dead.' The same could be said of Ingres.

For those not familiar with RDBMS history, Ingres was one of the first relational database management systems (RDBMSs) and was created at UC Berkeley. I worked with Ingres at the Center for Computational Seismology at Lawrence Berkeley Lab while I was in school. (We let them use the tape drive on our VAX 11/780 and were given a free license in return.)

After graduating, I went to work for the vendor, Relational Technology, Inc., then run by CEO Gary Morgenthaler with brilliant visionary Michael Stonebraker acting as de facto CTO. When I joined Ingres in 1985, it was one of the 'big three' relational vendors.

  • Relational Software, Inc., makers of Oracle, founded by Larry Ellison and Bob Miner
  • Relational Technology, Inc., makers of Ingres, founded by Michael Stonebraker, Eugene Wong, Larry Rowe, and (I think) Jon Nackerud and Gary Morgenthaler.
  • Relational Database Systems, Inc., makers of Informix, founded by Roger Sippl.
Both Ingres and Oracle were approximately $30M in size in 1985. Informix was a bit smaller. (For those wondering 'where's Sybase?' they entered the market approximately 5 years after the big three.)

Ingres placed me in the bizarre situation of experiencing great success and great failure, simultaneously. On one hand, during my 7 years there we went from being a $30M company to a $250M division of a $400M company, and I went from first-line technical support rep to director of product marketing.

On the other hand, in the same timeframe, Oracle went from $30M to $1B, won the second largest opportunity of the 20th century (the first was PC operating systems), and left the broken 'People's Republic of Ingres' in its dust.

Others have written the Ingres epitaph. Here is my version. Ingres, in my estimation, failed for the following reasons.

At a product level:
  • The wrong query language. Ingres bet on Quel. Oracle implemented SQL. While many (including the notable Chris Date) felt that Quel was 'better,' it didn't matter. IBM had stated its intention to implement SQL, making SQL a de facto standard. This was a huge difference and it's often forgotten. As late as 1990, Ingres was still selling a native Quel engine that preprocessed SQL to Quel on the front-end. Differing semantics between the languages and the echo-back of Quel from the server when SQL was sent to it, all sent smart customers running in the other direction.
  • Page-level locking. Oracle had row-level locking. Ingres had page-level locking. Oracle effectively rammed this difference down Ingres's throat in virtually every sales situation. Later, Sybase would suffer a similar fate, particularly with applications vendors like SAP who refused to implement on Sybase until it had row-level locks.
  • Lack of read consistency. The only way for readers to not block writers in Ingres was to set 'READLOCK = NOLOCK.' (This was about as poorly chosen a piece of syntax for marketing purposes as SET SERIALIZABLE = FALSE in early Oracle versions.) Oracle offered read-consistent snapshots, leveraging timestamps and the logging system's before image file, that enabled a consistent view without blocking updates.
  • Lack of connect by. Oracle added connect-by to SQL enabling the transitive closure of a table, most commonly needed in bills-of-materials and other 'parts explosion' type queries.
  • Portability strategy. Oracle did a much better job of porting not only to more platforms, but keeping the product the same across them. Ingres attempted to optimize more for each platform (e.g., squeezing the product into 640K on the PC by dropping functionality) which, while perhaps counter-intuitive, was a mistake.
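
For anyone who hasn't met a "parts explosion" query, here is a minimal Python sketch of the transitive closure that the connect-by item above refers to. It is an analogy rather than Oracle syntax: it uses SQLite's WITH RECURSIVE through Python's sqlite3 module, and the table and part names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical bill-of-materials table: each row says "parent contains child".
conn.execute("CREATE TABLE bom (parent TEXT, child TEXT)")
conn.executemany(
    "INSERT INTO bom VALUES (?, ?)",
    [
        ("bicycle", "wheel"),
        ("bicycle", "frame"),
        ("wheel", "rim"),
        ("wheel", "spoke"),
        ("car", "engine"),
    ],
)

# Walk down from 'bicycle', collecting every direct and indirect component.
parts = [
    row[0]
    for row in conn.execute(
        """
        WITH RECURSIVE explosion(part) AS (
            SELECT child FROM bom WHERE parent = 'bicycle'
            UNION
            SELECT bom.child
            FROM bom JOIN explosion ON bom.parent = explosion.part
        )
        SELECT part FROM explosion ORDER BY part
        """
    )
]
print(parts)
conn.close()
```

Without some recursive construct of this kind, the application has to issue one query per level of the hierarchy, which is why the feature mattered for bills-of-materials work.
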
At a business level:
  • Failure to understand the tornado. The tornado refers to Geoffrey Moore's metaphor for the hypergrowth phase of a high-tech, infrastructure market. During that phase, Moore argues that vendors should 'just ship' in an attempt to gain as much market share as possible so as to pop out of the tornado as the clear market leader. During the tornado, increasing returns happen -- the more clear your leadership, the more customers want to buy from you. The 3-5 year tornado determines who will lead the market for the next decade. Ingres missed this, was timid when it needed to be aggressive, and lost.
  • Failure to understand the 'best product' doesn't necessarily win and that best product is defined in the mind of the customer. I joked that my job in 1989 was to explain why you didn't need row-level locking when you had a 2K page size, but that didn't matter. Customers wanted SQL, if inferior to Quel. Customers wanted row-locks, if somewhat unnecessary given a small page size. Customers wanted read consistency, which was indeed absolutely necessary. Product marketing literally begged R&D and the company for these features, but they were (1) deep architectural limitations and thus 'hard' to fix, (2) deemed somewhat unnecessary at a technical level by engineering, and (3) generally viewed as sales and marketing problems that should be sold-around.
  • Failure to understand sales and marketing. The company generally didn't 'get' either sales or marketing, underinvested in both, and in marketing's case had a revolving door of executives of all ilks (e.g., engineers, alliances people, consumer marketers) except those who understood the products and the customers. I had something like 10 bosses in marketing in 4 or 5 years.
So every year, Oracle planned to double and Ingres planned to grow 50% or so. Every year the execs told us this was the year that Oracle would get its comeuppance. Every year, Oracle doubled, or more than doubled. Every year, we found it harder and harder to make the 50% growth target.

The laws of compounding took effect and across some 7 years Ingres went from being 100% of Oracle's size to 25%. Oracle, indeed, hit the wall around 1990, but it was too late. Ingres had lost. So many people were so invested in Oracle that it literally couldn't fail. Larry Ellison got about $100M from the Japanese (NTT), restated revenues for all the bricks that had been shipped over the years (as I recall removing an entire Sybase from its books at the time), and turned the company around.
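
The compounding arithmetic is easy to check. The short Python sketch below uses only the round figures quoted in the post (both companies at about $30M in 1985, Ingres at $250M and Oracle at $1B some 7 years later); treating those as exact endpoints over exactly 7 years is an assumption made for the sake of the calculation.

```python
import math

YEARS = 7            # "some 7 years", per the post
START = 30.0         # both companies at roughly $30M in 1985
INGRES_END = 250.0   # Ingres as a $250M division
ORACLE_END = 1000.0  # Oracle at $1B


def cagr(start, end, years):
    """Compound annual growth rate implied by two endpoints."""
    return (end / start) ** (1.0 / years) - 1.0


ingres_rate = cagr(START, INGRES_END, YEARS)
oracle_rate = cagr(START, ORACLE_END, YEARS)
final_ratio = INGRES_END / ORACLE_END

# If Oracle literally doubled each year while Ingres grew exactly 50%,
# the size ratio would shrink by a factor of 1.5 / 2.0 = 0.75 per year.
years_to_quarter = math.log(0.25) / math.log(0.75)

print(f"Ingres implied growth: {ingres_rate:.1%} per year")
print(f"Oracle implied growth: {oracle_rate:.1%} per year")
print(f"Ingres / Oracle at the end: {final_ratio:.0%}")
print(f"Years to reach 25% at 50% vs 100% growth: {years_to_quarter:.1f}")
```

On these numbers the implied rates come out at roughly 35% and 65% a year, and a strict 50%-versus-doubling gap would cut the ratio to a quarter in a little under five years, so the "100% to 25%" claim is, if anything, conservative.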

Ingres was bought by ASK in 1990 and sold to Computer Associates around 1992. I thought it would rest in peace in the CA cemetery for eternity, until I learned of its recent spin-out.

To say that Ingres had a strong corporate culture is an understatement. In fact, it lives on today at the Ex-Ingres website, complete with one of my favorite slogans: 'Ingres corporate culture without the corporation.'

Many successful companies sprang from Ingres. Documentum and Forte are two of the bigger successes. The Forte crowd lives on today at AmberPoint. John Newton, one of Documentum's two founders, is trying his luck at open source with Alfresco. Lesser known but quite successful, Perforce was founded by lab-coat-wearing Ingres engineer Christopher Seiwald who, after reading Positioning, learned that a company should try to own one word in the mind of a customer, decided to build the fast configuration management tool, and quite successfully did just that.

Will the new Ingres be successful? They have quite a team, but one never knows. The whole open source vs. subscription/ASP vs. traditional enterprise software licensing battle is in reality just beginning. How I think it all ends will be the subject of another post."
# PermaLink Comments [0] TrackBack [1]
04/13/2006 18:55 GMT Modified: 02/21/2007 09:45 GMT