Wednesday, December 3, 2014

The Power and Potential of Data Virtualization



Let’s be blunt. If you are not seriously planning to launch Data Virtualization initiatives in the coming months, you will likely fall behind in supporting your company’s ability to remain competitive. You simply cannot sustain the historic infrastructure costs of all that ballast, along with the time it takes to implement and maintain integration middleware the way it has been done for years (decades, really). Data Virtualization is clearly a game changer.
Data Virtualization (DV) covers a whole spectrum. Once you get past the basic concept, that data from multiple sources can be combined and made available as if it were a single source, I see a significant amount of confusion about what DV really is. Most people take an extremely narrow view of its scope, value, and uses.
  • DV can be about creating a virtual enterprise data model for querying
  • DV can be about creating virtual MDM definitions
  • DV can be about simplifying the layers of services for any application
  • DV can be about eliminating the majority of staging databases in your organization
  • DV can be about defining exactly the domain required by an application or portal
  • DV can be about significantly improving your time to value for any integration you need
  • DV can be about simplifying most any IT project
  • DV can be about Business Analytics and Business Intelligence
  • DV can be about Big Data
Data Virtualization is poised to drive a fundamental shift in the way IT departments and solution providers address the classic messy challenges of data integration and data availability. Think in terms of shedding the layers and layers of legacy accommodation that have been necessary simply because it has been “impossible” to align disparate data.
A word of caution: if your Data Virtualization platform requires your data to be in a specific format (e.g., relational or XML) in order to include it in the data virtualization, then you are defying the concept altogether. You end up moving operational data from instruments into relational databases and putting your SAP data into some form the DV platform can deal with. That’s a far cry from accessing live data!
Enterprise Enabler® works with all types of sources live, directly at the source. The concepts of DV are also applied to other patterns, for example, eliminating staging for ETL.

Friday, August 22, 2014

Cache as Cache Can

Caching is one of those afterthoughts: you know you have a great solution, but you start wondering about performance. Since caching is about moving data at varying speeds, it is (or should be) an inherent feature and responsibility of any integration solution. You will find that truly agile integration software, such as Enterprise Enabler, makes it easy to configure a wide range of caching models and to adjust them as your requirements change.

Agile Integration covers everything from ETL through near-real-time, bi-directional Data Virtualization (DV), all with federation at the core, so caching can be implemented anywhere, end to end, in the data flow cycle.

The Continuum of Caching
According to Wikipedia, a cache is a “component that transparently stores data so that future requests for that data can be served faster.” I think of it as any data store, however static or ephemeral, however big or small. The cached data may be exactly in the source form, perhaps to be federated on the way out; it may already be federated into the form the endpoint needs, or into the Master Data form, ready to go on to its destination; or it may sit somewhere else in the flow of the data. The specific subset of data to be cached should be chosen for the greatest efficiency, minimal size, and highest reusability. The transparency comes in because, in the big scheme of things, the destination, the consumer, and the workflow steps never need to know that the data is not all coming live from the original sources.

This is where data federation and Data Virtualization add to the flexibility of caching. Agile Data Virtualization supports a cache as one of its sources: DV may be involved in creating the cache, whether in memory, on disk, or in a database, and that cache can then be used as one source in a federation that is delivered either on demand or event-triggered.
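As a rough illustration of that pattern (not Enterprise Enabler itself), here is a sketch in which an in-memory SQLite table stands in for a cache of slowly changing reference data, a plain function stands in for a live source, and the federated view is resolved on demand. All names and data are invented.

```python
import sqlite3

# Hypothetical illustration: slowly changing reference data is cached in an
# in-memory SQLite database, while "live" order data is fetched at request
# time; the two are federated only when a consumer asks for the view.

# --- build the cache once (in memory here; it could as easily be on disk) ---
cache = sqlite3.connect(":memory:")
cache.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT)")
cache.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                  [(1, "Acme", "US"), (2, "Globex", "EU")])

def fetch_live_orders():
    # Stand-in for a live source (an ERP, a web service, an instrument feed, ...)
    return [{"order_id": 100, "customer_id": 1, "amount": 250.0},
            {"order_id": 101, "customer_id": 2, "amount": 75.5}]

def federated_view():
    """Resolve the virtual view on demand: live orders joined to cached customers."""
    customers = {row[0]: {"name": row[1], "region": row[2]}
                 for row in cache.execute("SELECT id, name, region FROM customers")}
    return [{**order, **customers[order["customer_id"]]}
            for order in fetch_live_orders()]

print(federated_view())
```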

Today, most people talk about a cache as being refreshed rather than accumulating a history; with all the options that can be configured, though, accumulating history is a realistic and sometimes useful consideration. You can see that the possible combinations are many, clearly enough that one must be careful not to get tangled up, and not to lose sight of the original objectives of caching!

One could easily argue that caching is more like ETL than like Data Virtualization. However, DV often requires caching more than other integration patterns do, since its users generally expect rapid, “live” data without latency. When the rubber meets the road, in many situations caching is the only way to ensure that a DV solution with many users does not bring the source applications “to their knees.” This is why Agile Integration Software, which combines all the integration patterns, solves Data Virtualization problems better than pure DV platforms.

What do you need to determine before you configure caching? (A small code sketch of these choices follows the list.)
  • Which data to cache
  • Why you chose to cache this particular data
  • Where to cache – memory, disk, database, etc.
  • How often to refresh – on a schedule, on an event, or as soon as data is available
  • Where in its path to cache – directly from the source, partially processed, before or after federation, endpoint-ready, or as part of a Master Data definition
  • When to release from the cache – as soon as it is read, or once a particular set of consumers has read it
  • Whether the cache is subject to bi-directional data flow
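To make the checklist concrete, here is a minimal, hypothetical sketch of a few of these choices: which data to cache (whatever a loader callable returns), where to cache it (in memory, optionally persisted to disk), and how often to refresh (a time-based policy, with refresh() also callable from an event). It is an illustration, not Enterprise Enabler’s mechanism.

```python
import json
import time
from pathlib import Path

class SimpleCache:
    def __init__(self, loader, ttl_seconds=3600, disk_path=None):
        self.loader = loader          # which data to cache: whatever this callable returns
        self.ttl = ttl_seconds        # how often to refresh (schedule-style policy)
        self.disk_path = Path(disk_path) if disk_path else None  # where to cache
        self._data, self._loaded_at = None, 0.0

    def refresh(self):
        # Call this directly for event-triggered refreshes.
        self._data = self.loader()
        self._loaded_at = time.time()
        if self.disk_path:            # optionally persist alongside the in-memory copy
            self.disk_path.write_text(json.dumps(self._data))

    def get(self):
        if self._data is None or time.time() - self._loaded_at > self.ttl:
            self.refresh()
        return self._data

# Usage: cache a slow reference lookup for ten minutes.
rates = SimpleCache(lambda: {"USD": 1.0, "EUR": 0.92}, ttl_seconds=600)
print(rates.get())
```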

When should you plan to Cache?
First of all, keep in mind that with Agile Integration Software, even if you don’t identify your caching needs up front, you can easily add caching later as your traffic grows and the parameters reach the point where it’s needed. Particularly when you are using Data Virtualization and hitting backend source systems live on each request, you should take a close look at the need for caching and the best approaches to it. You should consider caching in situations where:
  • You are concerned that too much traffic hitting mission-critical (or any) sources could adversely impact the performance of those systems.
  • You are concerned about response times for end users.
  • You need the same value throughout a process that may access it multiple times.

What to Cache?
  • Data that doesn’t need to be real-time
  • Data for which you want to ensure the same snapshot is used in different places
  • Data that changes so slowly that having it real-time doesn’t matter. You could refresh the cache once an hour, a day, or even a month.

Agile Caching
Agile Integration Software offers a wide range of caching options, with even complex caching patterns easy to configure without custom programming. With the ability to cache full data sets or specific fields, to mix in-memory and on-disk caching, and to use all combinations of these, including conditional, fully workflow-driven caches, great architecture doesn’t have to be constrained by what is practical to implement.

Thursday, August 7, 2014

I Hate Data

I Hate Data. Loathing is the Mother of Invention.
I hate data. I’ve always hated data. As a programmer right out of school, I worked in “technical programming” as opposed to “business programming.” Business was mostly about accounting, and technical about, well, technical stuff. One would think that technical would involve data and numbers while business would be about less precise things. Nevertheless, the fact was that it was the business side that worried about data and about uploading huge amounts for backup every night. Therein, of course, lay my initial dilemma. It happened that I was fortunate enough to focus on my passion, which was computer graphics. This was back in the days when we were figuring out how to make circles look round and to get rid of the “jaggies.” At the time, I had little respect, if not outright disdain, for business, but then, I was much younger and living in my 3D graphics world. We on the technical side focused on what you could program computers to do. I wrote code that made every graphics device I could find sing and dance, and the input devices like mice and early tablets and joysticks, too. Imagine the thrill of making one of the first joysticks move cursors and navigate 3D spaces rendered on 2D screens! Compare that to the dubious activity of staring at tons of numbers on countless reports. Those guys debugged things by studying numerical data, while I had the joy of detecting bugs with the screen looking like a war zone of odd shapes, colors, and flashes, or simply by crashing the computer altogether. Who wouldn’t choose the latter?
Sometime later, after my phases of programming robots and working with pattern recognition algorithms, I began working with refinery modeling and programs to perform mathematical optimization (LPs). These programs combined data from the refinery operations, from laboratories, inventories, planning systems and such, along with current economic information.  Lots of data was involved, but the huge impediment was getting the data from ten different sources in a manner that aligned all of it meaningfully. Often the end users would manually enter some of the data and they would pretend that running a very precise optimizer would give just as good results with some of last month’s data. Wrong! Why couldn’t the data just work smoothly in the background?
I hate data. It’s hard to deal with. Too many problems. That’s not what I want to focus on – there are much more interesting things to think about. That’s why Enterprise Enabler® just had to come along – so that I wouldn’t have to deal with all the idiosyncrasies of disparate data. It was very selfish of me. Let a computer handle all that craziness. Hide everything behind the scenes and automate everything that has to be done more than a couple of times. But then Enterprise Enabler unexpectedly swept me into “business” and all kinds of things I never imagined. I have to be careful: now that the headaches of data are managed, I might start liking it. I can’t admit it, but I’m starting to think data may be what it’s all about. Big Data, little data, virtual, bi-virtual, octy-virtual, and numberical, too.

Monday, July 7, 2014

Do you have the guts to be a hero? Take the Agile Integration Plunge.

Come on now. In this day and age, you business leaders are still beholden to your IT organization. You are the cleverest business person you know. You have successfully negotiated the biggest acquisition in the history of your company. Besides that, you are on the leading edge of all the latest innovations in video, phone, and personal computer and tablet technology. You even rigged up a sensor to notify you when the bird feeder in your yard is empty. How can you not be frustrated that you just can’t seem to feel as confident about your IT infrastructure?

There are a number of reasons why some corporate IT tools and infrastructure have lagged generations behind the advances of consumer technologies, but that’s for another blog. The important message here is that finally, the next generation integration platforms have matured and are ready to turn the ship. Change has been incredibly difficult, and large companies, in particular, have been unable to respond quickly to opportunities because IT could not keep up.  We have reached a point where those that adopt agile integration software will have a clear competitive advantage. We are seeing that transformation take place, and not at the speed of classic IT. At the speed of change.


At a recent Gartner conference, I heard the keynote speaker announce that, with the new imperatives of agility and of supporting the “Nexus” of data integration demands, the Big Players will be DOA (“dead on arrival”). You must take charge and get on board with the next generation, even though you are faced with resistance from people whose IT knowledge may intimidate you.

How can you tell if it’s time to take up agility?
1. Not everyone is working off of the same numbers
2. Long wait times to get access to new data that you need
3. Manpower costs for building integration are exorbitant
4. More than 40% of the cost of new projects goes to integration and data access
5. The data you get is not up-to-the-minute
6. You have business processes that are highly manpower-intensive
7. Your partners and customers are not getting information as fast, or in the forms, they would like
8. You may be moving forward with Cloud, Big Data, and other initiatives, but the rest of the IT team can’t keep up with the demands of everyday project work

One tell-tale sign is that the data warehouse is the center of the universe, yet it is less than agile. A data warehouse is good for historic data, but not for real-time or close-to-live data, because it adds an extra step in which the data must be staged before it can be used where it’s needed.

How can you get the ship turning? Institute just a couple of new guidelines:
“All new integration must:
1. Leverage Agile Integration Software
2. Use Virtual Data Federation wherever data needs to be combined across more than one data source
3. Use Data Virtualization for all “on-demand” integration (proactively asking for data when you need it, so you get live data from the sources as opposed to stale data).” The sketch below illustrates the live-versus-stale difference.
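Here is a minimal, hypothetical sketch of that distinction: a staged copy extracted last night keeps yesterday’s number, while an on-demand view resolved against the live source returns what is true right now. All names and values are invented.

```python
import datetime

source = {"inventory_on_hand": 120}          # stand-in for a live system of record

staged_copy = dict(source)                   # extracted last night into a staging table
source["inventory_on_hand"] = 95             # the business keeps moving after the extract

def on_demand_view():
    # Resolved against the live source at request time, so it is never stale.
    return {"as_of": datetime.datetime.now().isoformat(),
            "inventory_on_hand": source["inventory_on_hand"]}

print(staged_copy["inventory_on_hand"])        # 120 -- yesterday's number
print(on_demand_view()["inventory_on_hand"])   # 95  -- what is actually true right now
```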

You will be met with resistance from all sides, but you will gain strong supporters quickly after the first jack-rabbit projects come in ahead of schedule and under budget. There are plenty of causes of resistance:

  • The Big Vendors are always a safe choice
  • Fear of change; stubbornness; laziness about learning new things
  • There is a relationship between how long it takes consultants to do something and how much money they earn
  • Anything new gets more scrutiny than going with the tried and slow
And you can certainly augment the list in the context of your business.
The promises of Agile Integration Software like Enterprise Enabler are real and are being realized in many companies. The technology is definitely Enterprise-Ready, and not to be relegated to small projects with small potential benefits.

Do yourself and your company a favor. Take the plunge. Stare down fear with the guts that got you here in the first place.

Friday, April 11, 2014

Agile Big Data is Coming

Agility with Big Data?
Shiny but not Agile, perhaps destined to never be Agile.
Just look at ETL. Never, never agile.
So Clunky. After all these years, why is it so Clunky? So Clunky it’s fragile.

Big Data follows in the footsteps.
Big feet. Big hype, Big opportunity
for discoveries otherwise unfathomed.

Great programmers performing great feats,
Coding, coding new horizons… Open Source assist.
No hope for Agility.
Where are the trailblazers of Agility?

Ah, yes!
They’ve conquered ETL, EAI.
They’ve conquered Data Virtualization.
Agile Big Data is coming.

Are great programmers actually an impediment to Agility in complex software problems? I’m not talking about Agile development, but rather Agile solutions. Great programmers like to be on the leading edge of the latest shiny new trend. I remember the rush of programmers to work on the Y2K-driven Enterprise Application Integration and ERP rage. Now we’re all over Big Data.

It takes time for the market, business, and technologists to evolve their thinking about what the Shiny thing really is for, what it means, and what the technology requirements are. So, who jumps in first? The really good programmers who want to blaze the trails and solve the problems first-hand. Unfortunately, the early players can’t benefit from the maturing of the Shiny and the clearer picture of what the requirements will really be. By the time it’s clear what it takes to genericize the problem, the Great programmers have moved on to the next trend, and they are certainly not going to step back and solve the problem again in a tool or platform that hides all the repetitive work behind the scenes.

Product companies then step in to harvest the rest of the hype curve. Leveraging their Big Name, they make plenty of revenues on hard-coded, specialized solutions where custom coding is accepted as the only way to solve the problem. In my view, a timeless benchmark for Agility is the minimization of custom code. The more coding involved in generating a solution, the farther away that solution is from Agile.

The growing love affair with Open Source code is a huge step backward for the cause of Agility. Programmers love coding. I, for one, love coding, too. But I HATE having to code essentially the same thing over and over, with just a couple of tweaks difference each time. This is exactly what integration is all about: lots of small (very important) tweaks to the same code over and over.

That is why Enterprise Enabler exists. That is why we hide all the technical details behind the scenes of our agile integration software. Any time a programmer has to do essentially the same thing more than a handful of times, it is automated, with the tweaks configurable in a UI. Programmers should be programming exciting, innovative new things, not laboring over repetitive, boring scripts, code changes, and maintenance.
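As an illustration of that philosophy (configuration instead of repeated code), here is a small, hypothetical sketch: the repetitive mapping logic is written once, and each new integration supplies only a declarative description of its tweaks, the kind of thing a UI would capture as metadata. The field names are invented.

```python
def apply_mapping(records, mapping):
    """Rename and transform fields according to a declarative mapping."""
    out = []
    for rec in records:
        row = {}
        for target, spec in mapping.items():
            value = rec.get(spec["source"])
            if "transform" in spec:
                value = spec["transform"](value)
            row[target] = value
        out.append(row)
    return out

# Two different "integrations" differ only in configuration, not in code.
crm_mapping = {"customer": {"source": "AccountName"},
               "revenue":  {"source": "AnnualRev", "transform": float}}
erp_mapping = {"customer": {"source": "CustNo"},
               "revenue":  {"source": "NetValueCents", "transform": lambda v: float(v) / 100}}

print(apply_mapping([{"AccountName": "Acme", "AnnualRev": "1200"}], crm_mapping))
print(apply_mapping([{"CustNo": "Globex", "NetValueCents": "250000"}], erp_mapping))
```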

We’re applying the same philosophy to automate Big Data integration and analysis.

Sunday, February 9, 2014

Does Data Virtualization Foreshadow the End of Data Warehouses as We Know them Today?

How many warehouses do you think have been eliminated (or never built) because of Amazon.com? I have no idea, but I bet it’s a big number. Maybe the warehouses they do need are smaller, too. This is worth reflecting on, even though it’s pretty obvious.  Why were they able to skip the warehouse in the distribution/delivery process? They figured out that it is much more efficient to deliver the goods directly from the source. They needed agility, and they made it happen.

It seems to me that the time has come for IT departments to start thinking the same way about Data Warehouses (DW). It ought to be easier to deal with electronic data than with physical objects, shouldn’t it? So, what’s the problem with this picture? Why not go straight to the source for data when it’s needed and deliver the freshest data where it’s needed? Now that Data Virtualization (DV) has become mature, increasingly forward-looking companies are heading in that direction.

Your data warehouse diehards will tell you something similar to what Vincent Rainardi says in his blog (http://dwbi1.wordpress.com/2012/12/03/why-do-we-need-a-data-warehouse/): that a data warehouse is worth it because it is:

a) Integrated
b) Consistent
c) Contains historical data
d) Tested and verified
e) Performant

He goes on to say that the reason the DW meets these characteristics is that so much time has been invested by business analysts, data architects, ETL Architects, ETL Developers, and testers. (Is this good?)

I believe that Data Virtualization can bring all of these characteristics to the table, with the arguable exception of historic data. But notice that the first and foremost reason above for a data warehouse is that it provides integrated data. Perhaps, going forward, Data Warehouses should be designed primarily to maintain historical data that is not being captured or maintained anywhere else. Let’s say we reduce the Data Warehouse’s role to maintaining historic data, with all other data access and movement accomplished by Data Virtualization. That thought raises lots of flags, doesn’t it? Security; validation; moving data physically when needed; writing back when the data is federated; performance; etc. Actually, by combining DV with other patterns, companies are addressing these requirements now.

Data Warehouses and Data Virtualization are inextricably tied together, with clearly overlapping objectives. Now that Data Federation and Data Virtualization are coming of age, we need to begin thinking more in terms of the best use for each, so that we can leverage Data Virtualization wherever it makes sense. DV adds dramatically to the agility of a company’s infrastructure and to its capacity for informed, rapid decisions. Data Federation, which is at the heart of DV as we commonly speak of it, can also be applied to ETL-type data movement, eliminating the staging. So, Data Federation can be the best way to populate today’s and tomorrow’s DW.
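Here is a minimal, hypothetical sketch of that idea: two sources are federated in memory on the way into a warehouse history table, with no intermediate staging database. The sources, table, and names are all invented for illustration.

```python
import sqlite3
import datetime

def read_meter_feed():          # stand-in for a live operational source
    return [{"meter_id": "M-1", "kwh": 4.2}, {"meter_id": "M-2", "kwh": 3.7}]

def read_asset_registry():      # stand-in for a second source (asset master data)
    return {"M-1": "Plant A", "M-2": "Plant B"}

warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE readings_history (meter_id TEXT, site TEXT, kwh REAL, loaded_at TEXT)")

def load_snapshot():
    sites = read_asset_registry()
    now = datetime.datetime.now().isoformat()
    rows = [(r["meter_id"], sites[r["meter_id"]], r["kwh"], now)
            for r in read_meter_feed()]             # federated in memory, on the way in
    warehouse.executemany("INSERT INTO readings_history VALUES (?, ?, ?, ?)", rows)

load_snapshot()
print(list(warehouse.execute("SELECT * FROM readings_history")))
```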

Within five years, the most competitive companies will be using predominantly agile integration for BI, BA, and transactions, with data warehouses focused primarily on preserving historic data in an accessible form. Realistically speaking, though, there will still be many companies relying on their workhorse Data Warehouse, and they will still have trouble calling themselves “agile.”

Let me know what you think.

Wednesday, January 8, 2014

Agile Integration for the Nexus

Last month I attended the Gartner Application Architecture, Development & Integration Summit. The keynote and theme throughout the conference was the Nexus of Forces: Social, Cloud, Mobile, and Information (Big Data). A crucial supporting theme was the necessity for agile integration technologies. The big integration players, they say, will be “DOA” (dead on arrival) as the importance of agility spills over from business to technology in a bigger way than ever.

Agile integration is all about communicating across all kinds of endpoints, in all kinds of formats, with all kinds of integration patterns, all adjusted to make sense together. It is about getting integration in place quickly, and it is about integration that adapts to changing environments, devices, and fast-paced business change. The archaic picture of integration, simply making data move from one place to another, won’t do the trick.

Most integration products focus only on Social Media OR on-premise OR Cloud OR mobile OR Big Data. They tend to be based on a model of SOA OR ETL OR Data Virtualization OR a messaging bus. This means that you need multiple integration products, each handling its own specialty. Even the big integration players, who would blatantly use AND instead of OR, have you jumping from one environment to another, since their claim to breadth comes largely from acquisition. Their “AND” consists of the camouflaged duct-tape-and-baling-wire utilities they provide. Suppose you need an integration pattern that combines ETL, Cloud, Mobile, and Data Virtualization? It’s really unfathomable that you could tie integration from multiple platforms together in a way that anyone could call “Agile.”

What does agile integration really mean to your business and your tech team? It means being able to support agile decisions and an agile enterprise, with significantly fewer resources and at a lower cost.

Here are some requirements to look for when assessing the agility of an integration product, with examples:

1) Agility requires: Reusability
   Reusable connections, business logic, schemas… most everything.

2) Agility requires: Rapid time-to-value
   Generate integration definitions, test them, and deploy quickly to begin getting immediate business benefit.

3) Agility requires: Instantly adapting to changing requirements
   Adding a new source, changing the business rules, moving some data to cache, or changing anything else in the configuration must be doable in minutes. Deploy to the cloud when needed, or as mobile-accessible integration.

4) Agility requires: Scaling up or down quickly and easily
   Fast-growing businesses need to scale up for growth or acquisition, and to scale down as needed for distributed use.

5) Agility requires: Connectivity to Everything
   A single platform connects to, and reuses integration definitions for, electronic instruments, enterprise ERPs, cloud apps, spreadsheets, relational databases, B2B standards, etc.

6) Agility requires: Simple construction of Complex Integration Patterns
   Add caching, data synchronization, or ETL to an on-demand data virtualization with write-back.

7) Agility requires: Deploying integration logic multiple ways
   One-button deployment and hosting of Data Virtualization as a web service, REST, ODBC, JDBC, OData, ADO.NET connection string, SharePoint BDC/BCS, etc. (see the sketch after this list).

8) Agility requires: Single product; Single platform
   No jumping in and out of multiple environments and trying to make them work together.

9) Agility requires: Completely metadata driven; Single Metadata Stack
   External programming is reduced to nothing; heavy reusability; visibility into the state of the whole integration at any time.

10) Agility requires: Configured, not coded.
    Enough said.
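To illustrate item 7, here is a small, hypothetical sketch showing the same integration logic consumed two ways: as an in-process call and hosted as a simple REST-style endpoint. This is hand-written Python for illustration only; the whole point of one-button deployment is that the platform generates this kind of hosting for you. The names and the port are invented.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def product_view():
    # Stand-in for a federated, virtualized result set.
    return [{"sku": "A-100", "on_hand": 42}, {"sku": "B-200", "on_hand": 7}]

class ViewHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve the same view as JSON over HTTP.
        body = json.dumps(product_view()).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    print(product_view())                                          # consumed in-process
    HTTPServer(("localhost", 8080), ViewHandler).serve_forever()   # or over REST
```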

Agility means Minimal Tech Debt. All of these points greatly reduce tech debt by virtue of agility. It just doesn’t get any better than that.


Stone Bond’s Enterprise Enabler® is THE Agile Integration Software.