Data As A Service - A Practical Viewpoint
Sunday, July 18, 2010 at 9:14AM

In a recent discussion the topic of clouds arose. Many in the industry are still coming to grips with the fundamentals of this concept. In this particular conversation we worked through public vs. private, externally housed vs. internally housed, cloud-bursting, and so on. Eventually we landed on the topic of services such as Infrastructure-as-a-Service, Software-as-a-Service, Platform-as-a-Service and Data-as-a-Service.
We focused on Data-as-a-Service mostly because the professionals I was speaking with had issues around data and were curious how Data-as-a-Service, or DaaS, could help address the concerns many traditional companies now face.
If one looks at the internal arrangement of any company that has been around for a while, its data ecology is a mix of islands within the corporate environment: RDBMSs such as Oracle and SQL Server of various kinds and in various numbers, file-based data such as Excel spreadsheets, electronic communication such as email, and so on. On these islands lie valuable pieces of data: keys, details about accounts, customers, strategic initiatives, etc. Historically, obtaining valuable information from these islands spread throughout the ecology has been tremendously painful and labor-intensive, requiring an organization to invest significantly in things such as data warehousing. While data warehousing is effective on a number of levels, in today's age of lightning-fast change, being able to get to valuable information not only within the data ecology but also about it is absolutely essential to survival.
This is where the concept of services, particularly DaaS, is very powerful. At a high level, DaaS not only gives an organization access to the information it requires, but also places powerful discovery and self-evolving mechanisms that did not previously exist into the organization's hands. There are a few key concepts that help make DaaS work:
- Storage: This can be reasonably small to very large. In truth, DaaS need not focus on this, since storage is usually addressed as Infrastructure-as-a-Service, or IaaS. Practically speaking, it is just a matter of understanding whether the goal of the DaaS is to hook into an existing IaaS or to act as a mesh over existing islands.
- Meta-Data Dictionary: Everyone in IT and development knows what a data dictionary is. Equally, many business people could not care less. The idea here is to evolve from isolated, or even enterprise-wide, data dictionaries into more of a Meta-Data Dictionary. The reason? Data dictionaries are what I view as instance-specific ways of making a definition, placing things into that definition, and interacting with that definition. Practically speaking, a data dictionary is how one gets at data within, say, an Oracle database or an Excel spreadsheet. When traversing the larger ecology, data dictionaries are tremendously ineffective, unable to handle rapidly changing contexts across extended domains. Meta-Data Dictionaries fill this gap. They tap into the local dictionaries and extend them to include a much larger array of context across the organization, providing answers more rapidly while saving time and effort.
- Meaning: The next major component of DaaS is meaning, which is not as commonly spoken about. Meaning in the context of DaaS is the ability to consistently present information not only about what resides in the ecology but also about the ecology itself, with regard to the domain of the organization. In a traditional relational database, for example, issuing a query does not take meaning into account. The results of a single query do not mean anything by themselves to the organization until they are placed within a broader context that spans not only the targeted island but all the desired islands within the ecology. As the context grows, so does the ever-changing complexity of establishing meaning. Working in tandem with a Meta-Data Dictionary, as opposed to individual data dictionaries, meaning can be quickly determined as a question is posed throughout the data ecology.
- Discovery: This takes the paradigm of search and applies it throughout the DaaS ecology. Whenever events within the DaaS affect the meaning as interpreted via the Meta-Data Dictionary, discovery adjusts by making not only older patterns available but newer ones as well. In this manner an organization is capable of discovering evolving patterns within its ecology as they relate to its business.
- Living Data: This is old hat to many in the internet crowd but fairly new to organizations, especially since it arises with the mention of DaaS. The concept means that all data elements, as they change with respect to the data ecology, are available immediately upon a user request. Practically speaking, it appears to a consumer that the data changes its behavior as they interact with it. Examples include Twitter updates, Google Finance chart navigation or online banking activity from any bank. These systems not only process events, but as the events affect the ecology, the effects are reflected back to the requester within seconds.
- Services: A fundamental aspect of any "as-a-Service" model is the concept of services. The type, manner, number, and management of services in a DaaS should not be underestimated. Services need to be simple, powerful and flexible to meet the needs of the organization. A typical DaaS, because of its Meta-Data Dictionary, also has services related to Meaning and Discovery that provide far more value than traditional access services such as query.
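To make the Meta-Data Dictionary, Meaning and Discovery ideas a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration only: the class names, the island names (`oracle_crm`, `excel_sales`) and the field names are hypothetical, and a real DaaS would be far richer. It simply shows one shared concept being resolved across several islands, and an island being asked which concepts it touches.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LocalField:
    """One entry from an island's own data dictionary (hypothetical)."""
    island: str  # e.g. an Oracle schema or an Excel workbook
    name: str    # the column or header name as that island knows it


@dataclass
class MetaDataDictionary:
    """Maps organization-wide concepts to the local fields that carry them."""
    concepts: dict[str, set[LocalField]] = field(default_factory=dict)

    def register(self, concept: str, local: LocalField) -> None:
        # Extend a local dictionary entry with organization-wide context.
        self.concepts.setdefault(concept, set()).add(local)

    def resolve(self, concept: str) -> list[LocalField]:
        # "Meaning": where does this concept live across the ecology?
        found = self.concepts.get(concept, set())
        return sorted(found, key=lambda f: (f.island, f.name))

    def discover(self, island: str) -> list[str]:
        # "Discovery": which concepts does a given island touch?
        return sorted(
            concept
            for concept, fields in self.concepts.items()
            if any(f.island == island for f in fields)
        )


# Illustrative data only; none of these names come from a real system.
mdd = MetaDataDictionary()
mdd.register("customer_id", LocalField("oracle_crm", "CUST_ID"))
mdd.register("customer_id", LocalField("excel_sales", "Customer #"))
mdd.register("campaign_code", LocalField("oracle_crm", "CAMP_CD"))

for local in mdd.resolve("customer_id"):
    print(local.island, local.name)
print(mdd.discover("oracle_crm"))
```

The design choice worth noting is that the islands keep their own dictionaries; the sketch only layers a mapping on top of them, which is the "mesh over existing islands" option mentioned under Storage.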
I have had the opportunity and privilege to work on such a platform in my career. Having a reasonably strong background in databases, I can say the transition was not an easy one. All the newer dimensions require significantly more consideration than a simple database perspective does. For example, with a DaaS one can see changing patterns of behavior and gain insights into complex questions that were not really addressable before. In one of my previous experiences, working at a large telecom, the question arose of how long it would take for the data elements of a particular marketing campaign to reach all the necessary parts of the organization. As in most organizations, the answer was not really precise, since it came from asking each division and then aggregating the responses. In many cases the divisions were not 100% certain of the timing themselves. With my primitive platform in place, we were able to look up the information in a few minutes and give the organization a more comfortable, provable answer. The cost savings alone were well worth it: 2 minutes of a single individual's time vs. 30 minutes for 1,200 people of varying levels. Other questions, such as how the ecology handles volumes, what volumes mean in relation to business operations, the number and volume of inconsistencies in meaning, and what savings could be achieved, are just some of the more typical operational questions. With a DaaS in place, however, higher-value insights can be gained, such as missed opportunities for new products/services based on customer activity, competitive standing based on social responses and replies regarding existing products/services, capacity planning for bursting or planned progressions, and many others.
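The time savings in the telecom anecdote above can be worked through quickly. The sketch below assumes the 30 minutes applies to each of the 1,200 people rather than to the group as a whole; that reading is my assumption, and the variable names are purely illustrative.

```python
# Figures taken from the anecdote; the per-person reading is an assumption.
people_asked = 1200
minutes_per_person = 30
minutes_with_platform = 2

manual_minutes = people_asked * minutes_per_person       # total effort without the platform
manual_hours = manual_minutes / 60                       # same effort in person-hours
saved_minutes = manual_minutes - minutes_with_platform   # effort avoided per question

print(f"Manual effort: {manual_minutes} person-minutes ({manual_hours:.0f} person-hours)")
print(f"Saved per question: {saved_minutes} person-minutes")
```

Under that assumption a single question costs roughly 600 person-hours when answered by polling the divisions, against 2 minutes with the platform.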
It was at this point that my colleagues were thinking it would take them years to build out a DaaS. I responded that a DaaS does take effort, but not necessarily time. It is an equal mix of deep technology, which would be a blend of building it and using vendor tools, and the expertise and knowledge of technical and business staff. From a tools perspective, solutions from vendors such as QuePlix for data virtualization and Kapow for integration, which leverage existing domains, can get an organization with significant existing assets to DaaS basics very rapidly. The core requirement is commitment from the organization. An undertaking such as DaaS is something fundamental to the culture, not just a dalliance.
Then I pointed out the shifting landscape of competitive pressure due to the economic crisis. Those with stronger, more valuable, more flexible and more timely interactions with their data ecologies are the ones that typically engage their customers more meaningfully, whereas those with fewer capabilities quickly find themselves losing opportunities to competitors. From a career standpoint, many of the new technologies related to DaaS, such as cloud concepts, big data, distributed data, and the like, are among the most in-demand skills, not just hands-on but also in management, deployment, architecture, etc. As more and more companies realize the value of DaaS along with other strategic approaches, they are moving to embrace them in order to stay competitive and survive.

