Skating to where the puck is

Being an SAP Mentor is an interesting situation. The relationship with SAP usually feels amazingly collaborative, but occasionally uncomfortably adversarial. This is probably a healthy place for a relationship with a massive organization to be. Every organization includes many great people, and interactions with those people are always delightful. Every organization is also, as an organization, predisposed to get the answers to certain questions consistently right and the answers to other sorts of questions consistently wrong.

One such question that SAP has consistently gotten wrong in the past is how to push technology adoption among a massive install base. Historically, it would appear that SAP has tended to try to monetize upgrades to underlying technology in order to recoup development costs or to profit from its technology offerings. NetWeaver, BW, Visual Composer, add-ons like SSO and Gateway, Personas, Business Client, and HANA* are all technologies in which SAP has either underinvested due to lack of direct revenue potential (BW especially) or has tried to monetize. The result is that widely deployed "free" technology from SAP often stagnates, while advanced technology carrying an additional cost often sees limited adoption in SAP's install base.

This outcome is terrible for SAP's business. It makes it very difficult for SAP to keep its application offerings competitive in the marketplace, because the application functionality needed to stay competitive often depends on improvements in the underlying technology. But if the technology is either widely deployed and under-featured or decently featured but not widely deployed, then applications must target only the under-featured technology stack in order to have a large enough potential install base to justify development costs. In other words, SAP often forces itself to build applications on sub-par technology.

We see this dynamic constantly with SAP. One example is UI technology. Clearly, a long-standing weakness of SAP's applications has been user-interface and user-experience. The SAP GUI is outdated, and SAP's attempts to improve on it, like the Business Client, have, in my opinion, been torpedoed by SAP's monetization reflex. They haven't initially been enough of an improvement to see widespread adoption under a monetization scheme, and with the lack of adoption, investment has faded. A similar dynamic played out with SAP's support for Flash and Silverlight as UI technologies, which continued for years after it had become clear they were destined for the trash-heap of web-UI delivery technologies.

And yet, over the last 4 years, SAP has been able to overcome this tendency in one area that may be incredibly important to its long-term business prospects around the newly announced S/4HANA. In the case of a key succession of technologies, SAP has been able to do something different, with impressive results. Those technologies: Gateway, SAP UI5, and Fiori**.

Initially, all seemed to be going well. SAP developed Gateway in part due to prodding from influencers like SAP Mentors around the need for a modern, RESTful API layer in SAP applications, one that would allow the development of custom user interfaces and add-ons in a sustainable manner. DJ Adams and others showed the way with projects like the Alternative Dispatch Layer (in 2009), which made development of these APIs easier. Uwe Fetzer taught the ABAP stack to speak JSON. And suddenly one could fairly easily create modern-looking APIs on SAP systems. SAP folded many of those learnings into Gateway, which was a great way to push the tools for building modern APIs into SAP's applications and out to the install base.

Well, it would have been, but SAP made its usual mistake: it attempted to monetize Gateway.

The result of the monetization attempt could be predicted well enough. It would make the roll-out of Gateway to the SAP install base slow, at best. Fiori would be delayed because it would need to build out its own API layer rather than relying on Gateway technology. Applications like Simple Finance or S/4HANA that are dependent on Fiori would subsequently be delayed, if they were created at all. In that world, perhaps Fiori's roll-out is delayed by a year, Simple Finance by two years, and S/4HANA isn't announced until 2018.

But S/4HANA was announced in January 2015. So what went differently this time?

Fortunately, back in 2011, shortly after the initial Gateway monetization strategy was announced, SAP Mentors like Graham Robinson and other community members stepped in to explain why this was a mistake and push back against the strategy, both publicly and privately. While clearly not the only reason for SAP's change of heart, this feedback from the community was powerful, and ultimately SAP revised Gateway licensing in 2012 so that it made sense for SAP, its partners, and customers to build new applications using Gateway.

This revision set us on the path to relatively quick uptake of SAP UI5 (a modern browser-based front-end framework that leverages the Gateway APIs) and later of Fiori and the Fiori apps (most of which are based on UI5 and Gateway APIs). With Fiori, SAP again thought short-term and attempted to monetize it, reversing course and including Fiori with existing Business Suite licenses only after similar pressure from another group of SAP Mentors who had experienced customers' frustration with SAP's stagnant user-experience. These Mentors, community members, and analysts were able to communicate the role Fiori needed to play in guarding against SAP customers migrating to cloud vendors offering a more compelling user-experience story.

Fiori, meanwhile, is a hit and serves as the basis for Simple Finance and now S/4HANA - SAP's recently announced blockbuster refresh of the entire Business Suite offering, which SAP clearly hopes will drive its business for the next 10 years. Would that be happening now if SAP had remained on its standard course of attempting to monetize Gateway? I don't think so. The interaction with SAP on these topics often left some feathers a bit ruffled, but it sure must be nice for those like Graham and DJ to see some of the fruits of those discussions in major products like S/4HANA.


*A note on HANA: I think that HANA may be the exception in this story. Unlike most of SAP's other technology offerings, HANA is good enough to be competitive on its own, and not simply as a platform for SAP's applications. The results are not yet in on the HANA monetization strategy, but things are looking OK, if not great. Of course, Graham has something to say about that, and what he says is always worth reading.

** Fiori is actually a design language focused on user-experience, along with a line of mini-applications that implement the Fiori design language to improve SAP Business Suite user-experience for specific business processes. In the context of S/4HANA, however, we can think of Fiori as an enabling component, effectively a technology prerequisite.

SAP BI OnDemand and Hana

It's been some time now since the press releases and the SAP TechEd Bangalore keynote proclaiming that SAP's BI OnDemand product now runs on HANA as its underlying database. The announcements have gone out. The product is here. The BI OnDemand website has been updated with a shiny new "Powered by SAP Hana" logo.

There is only one problem. It seems that the BI OnDemand that most people can see is not actually powered by Hana.

I discovered this for myself when discussing the topic with Courtney Bjorlin, who was working on an article about the announcement. SAP confirms in the article that only the "Advanced Edition" of BI OnDemand is available on the HANA database. At SAP's TechEd in Madrid, I was able to ask around on the show floor and in the hallways to find out more about the situation.

How do I get BI OnDemand running on HANA?

You have to buy the "Advanced Edition" of BI OnDemand. This involves a sales process, and what you get is a hosted version of the BI OnDemand platform. It seems that it's not exactly SaaS or "OnDemand", but more on that below.

The fact that the logo at https://bi.ondemand.com says "Powered by SAP Hana" is apparently an inaccuracy. Hopefully that will be corrected soon.

What are these different "editions" of BI OnDemand?

There are three "editions" of BI OnDemand: Personal, Essential, and Advanced. Based on my discussions, it seems that the Personal and Essential editions are SaaS applications hosted by SAP, while the Advanced edition is hosted by partners. All editions seem to include the same web interface seen on bi.ondemand.com, but the Essential edition adds customization and branding options as well as more storage. The Advanced edition features even more storage and customization options, plus access to a hosted version of BusinessObjects Data Services, which can be used to manage the contents of DataSets. This Data Services integration allows for incremental updates to DataSets, a key feature that is not possible in the Personal or Essential editions.

As far as I can tell, none of this is documented anywhere on SAP's standard sites. My thanks to Richard Hirsch for finding this presentation outlining some of these points (see page 17): link to PDF (link has been removed because SAP has let the domain lapse - original URL was http://sap-partnersummit2011.com/doc/post_event/FKOM2011_BA&T%20track_Day2/BAtrack_2_BA-Solutions&Innovation.pdf).

So if I have the Advanced edition, I'm now on Hana?

No, not quite.

First of all, based on discussions at TechEd Madrid, it seems that only new customers can currently get onto the Hana-based BI OnDemand platform. Apparently there are provisions for existing customers to migrate eventually, but right now it is open only to new customers.

Further complicating the issue, it seems that not all hosting partners for the Advanced edition provide HANA as the underlying platform. I was told by SAP employees on the show floor that only one partner is currently providing BI OnDemand on HANA, and that partner operates only in North America. Other partners are providing BI OnDemand on the older Microsoft SQL Server-based platform. I have yet to confirm this; it is based on only one source, so take it with a grain of salt. But there is clearly confusion around the availability of BI OnDemand on HANA, even if you are purchasing the Advanced Edition.

If capabilities provided only by HANA are required for your implementation, be sure you are actually getting HANA when you buy the BI OnDemand Advanced Edition.

Is it Hana or HANA?

I have no idea. I did learn at TechEd that HANA (or Hana) is not an acronym, so I'm leaning towards Hana, but old habits die hard.

Ok, enough with the Q&A. What does this mean?

In my view, this means that SAP still has a lot of work to do getting its message across clearly. It is not particularly bad or good that HANA is not available for the Personal or Essential editions of BI OnDemand. These editions are limited to data set sizes that are simply too small for HANA to make much of a difference.

The greater concern here is one of communication. For any company, it is extremely important to say what you mean and mean what you say. It would have been much better if SAP had been clear about the roll-out of HANA for BI OnDemand from the beginning. As it stands now, many people will try out the Personal edition and think that they are using "Hana", but they're not.

Taking the wider view, I worry about what this partial roll-out means for SAP's BI cloud play. The BI SaaS market is still very immature, and SAP has the opportunity to play a leading role in this emerging market. However, the BI OnDemand product doesn't seem to have received the sort of development attention required for this role, and the deployment options seem to be severely lacking.

Companies and departments looking to buy powerful SaaS BI capabilities are not interested in figuring out what database the product is using and the impact this has on their reporting needs. SaaS should work as defined in SLAs, and it should keep getting faster and better in a way that is non-disruptive.

After talking with some of the BI OnDemand development team in February, I know that they have a good understanding of the BI SaaS space and have some great ideas for the BI OnDemand platform. I'd love to see SAP deliver on its potential in this area and I think they have the people and the vision to do so, but we haven't seen it in the product yet.

Hopefully SAP can get both the BI OnDemand message and the platform straightened out quickly. The BI SaaS market is still extremely young, and SAP could be leading the way.

Disclosure: SAP provided my travel and badge for the TechEd + Sapphire 2011 conference in Madrid.

SAP's HANA and "the Overall Confusion"

I threw together a very long response to a very long question on the SCN forums, regarding SAP's HANA application and its impact on business intelligence and datawarehousing activities. The original thread is here and I'm sure it will continue to grow. But since my response was pretty thorough and contains a ton of relevant links, I thought I would reformat it and post it here as well. In order to get a good overview of the HANA situation, I strongly recommend that anyone interested check out the following blogs and articles by several people, myself included:

Some of these blogs use out-of-date terminology, which is hard to avoid since SAP seems to change its product names every 6 months. But hopefully, if you read them, they will give you some insight into the overall situation unfolding around HANA. With regards to DW/BI and HANA, these blogs address many of those issues as well. Now, to try answering the questions:

1. Does SAP HANA replace BI?

It's worth noting that HANA is actually a bundle of a few technologies on a specific hardware platform. It includes ETL (Sybase Replication Server and BusinessObjects Data Services), a database and database-level modeling tools (ICE, or whatever it's called today), and reporting interfaces (SQL, MDX, and possibly bundled BusinessObjects BI reporting tools). So if your question is "does anything change as far as needing to do ETL, modeling, and reporting work to develop BI solutions?", then the answer is no. If you are asking about SAP's overall strategy regarding BW, then this is open to change and I think the blogs above will give you some answers. The short answer is that I see SAP supporting both the scenario of using BW as a DW toolkit (running on top of BWA or HANA) and the scenario of using loosely coupled tools (HANA alone, or the database of your choice with BusinessObjects tools) for the foreseeable future. At least I hope this is the case, as I think it would be a mistake to do otherwise.

2. Will SAP continue 5-10 years down the road to support "Traditional BI"?

I hope so. If you read my last blog listed above, you will see that HANA fully solves none of the traditional BI problems and addresses only a few of them. So we still need "traditional" (read: "good old hard work") approaches to address these problems.

3. What does this mean for our RDBMS, meaning Oracle?

Very interesting question. For a long time, SAP has supported products that compete with Oracle's offerings. In my view, this was to give SAP and its customers options other than the major database vendors, and to give itself an out in the event that contract negotiations with a major vendor went south. So in a sense, HANA can be seen as maintaining this alternative offering. Of course, SAP says HANA is more than that, and I think they are right. Analytic DBMSes have been relatively slow to catch on, and as SAP's business slants more and more towards BI, the fact is that the continued use of traditional RDBMSes in BI and DW contexts has done a lot of damage by making it difficult to achieve good performance. It's a lot easier to sell fast reports than slow reports :-) So that is another driver. Personally, I don't agree with SAP's rhetoric about HANA being revolutionary or changing the industry. The technologies and approaches used in ICE are not new, as far as I have seen. As far as changing the industry from a performance or TCO perspective, I'm reserving judgement until SAP releases some repeatable benchmarks against competing products. I doubt that HANA will significantly outperform competitive columnar in-memory databases like Exasol and ParAccel. If you are Oracle, you have a rejuvenated, and perhaps slightly more frightening, competitor. I don't think anyone really thought that MaxDB was a danger to Oracle, but HANA holds more potential as a competitor to Exadata. Licensing discussions could get interesting.

4. Is HANA going to be adopted and implemented more quickly on the ECC side than on the BI side?

Everything I have seen has indicated that SAP will be driving adoption in BI/Analytic scenarios first and then in the ECC/Business Suite scenario once everyone is satisfied with the stability of the solution. Keep in mind, the first version of HANA is still in ramp-up. SAP is usually very conservative in certifying databases to run Business Suite applications.

Musing about semantics in BI

Recently I've been blogging mostly about SAP's new HANA product and the general in-memory approach. My deeper professional focus is a little further from the metal, in datawarehousing, business intelligence, and planning processes and architectures. Some recent emails, tweets, and discussions have prompted me to get back to my roots ... but roots are hidden and hard to conceptualize. So I brought diagrams!

One of the hard problems in datawarehousing and business intelligence is semantics, or meaning. We need to integrate the semantics in user requirements with the semantics of the underlying systems. We need to integrate the semantics of underlying systems with each other. And we need to integrate the semantics of a system with itself!

That wasn't very clear. Here's an example: Revenue.

Simple right? Not so fast!

Our users want a revenue report. When our finance users say revenue, they might mean the price on the invoice, without any discounts. But our ERP system may display revenue as a number that includes certain types of discounts. (This is the problem of integrating users' semantics with system semantics.) And our other ERP system may include a different mix of discounts in the revenue number. (The problem of integrating the semantics of underlying systems with each other.) Meanwhile, a single SAP ERP system will record revenue from a sale in several different places: on the invoice, in the G/L, maybe in a CO-PA document. Each of these records is going to have different semantics, and it's quite possible that it will be difficult to derive the number the system displays to us from the data in the underlying tables. (The challenge of integrating the semantics of systems with themselves.)
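To make the mismatch concrete, here is a minimal SQL sketch. The table and column names (billing_item, gross_amount, volume_discount, promo_discount) are invented for illustration and are not real SAP structures; the point is simply that the same word, revenue, maps to two different calculations depending on which discounts get subtracted.

```sql
-- Hypothetical billing line-item table; names are illustrative only.
-- "Revenue" as finance defines it: the invoice price, before any discounts.
SELECT customer_id,
       SUM(gross_amount) AS revenue_finance
  FROM billing_item
 GROUP BY customer_id;

-- "Revenue" as one ERP report might show it: net of certain discounts.
-- A second ERP system might subtract a different mix of discounts again.
SELECT customer_id,
       SUM(gross_amount - volume_discount - promo_discount) AS revenue_erp
  FROM billing_item
 GROUP BY customer_id;
```

Both queries are legitimately called "revenue", and neither is wrong; reconciling them is exactly the integration work described above.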

Wow! That's just the first line of the P&L statement!

This example is a little contrived, but it's not too far from the truth. At this point, I just want to recognize that this is a tough problem and we really don't have a very good solution to it aside from the application of large amounts of effort. The interesting question to me right now is where this effort is already embedded in our systems (so we don't have to expend as much effort in our implementations) and what effect SAP's new analytics architectures might have in this area.

I promised diagrams and musing, so here we go. I want to talk a little bit about layering semantic representations on top of ERP data models, which tend to be highly optimized for performance and therefore quite semantically opaque. In order to think more clearly about the different ways of doing this and the trade-offs involved, I cooked up some pictures. We'll start simple and move on to more complex architectures.

This is a naive model of an ERP system. It's got a lot of tables: 5 (multiply by at least 1000 for a real ERP system). These tables have a lot of semantic relationships between them that the ERP system keeps track of. It knows which tables hold document headers and which tables hold the line items for those documents. It knows about all the customers and the current addresses of those customers, and it knows how to do the temporal join to figure out what the addresses of all our customers were in the middle of last year. I don't have much more to say about this. It just is how it is: complicated.

This is an ERP system that has semantic views built into it. These views turn the underlying tables into something that makes sense to us - we might call them views of business objects. Maybe the first view is all of those customers with start and end dates for each address. And the second view might be our G/L entries with line item information properly joined to document header information.
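As a rough illustration of what such semantic views might look like, here is a small SQL sketch. The tables (customer, customer_address, gl_document_header, gl_document_line) and their columns are hypothetical stand-ins, not actual ERP tables, but the pattern is the one described above: joining semantically opaque physical tables into something that reads like a business object.

```sql
-- A "business object" view of customers with the validity period of each
-- address, which is what makes the temporal join mentioned earlier possible.
CREATE VIEW v_customer_address AS
SELECT c.customer_id,
       c.name,
       a.street,
       a.city,
       a.valid_from,
       a.valid_to
  FROM customer c
  JOIN customer_address a
    ON a.customer_id = c.customer_id;

-- A "business object" view of G/L entries: line items joined to their
-- document headers so the result makes sense on its own.
CREATE VIEW v_gl_line_item AS
SELECT h.document_id,
       h.company_code,
       h.posting_date,
       l.line_number,
       l.gl_account,
       l.amount
  FROM gl_document_header h
  JOIN gl_document_line   l
    ON l.document_id = h.document_id;
```

With views like these in place, "which address was valid last June?" becomes a simple filter on v_customer_address rather than a puzzle about the underlying tables.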

Interestingly, creating semantic views like this is almost exactly what BW business content extractors do. These extractors have been built up over more than a decade of development. They were built by the application teams, so if anyone knows how the application tables are supposed to fit together, it's the people who built these extractors. There is a lot not to like about various business content extractors but we can't deny the huge amount of semantic knowledge and integration work embedded in these tools.

Other tools, like the BusinessObjects Rapidmart solutions, also know how to create semantic views of underlying ERP tables, though Rapidmarts accomplish this in a slightly different way. There is a lot of knowledge and work embedded in these solutions as well.

When we use the business content extractors with BW, we move the semantic view that the ERP system creates into a structure in the datawarehouse. As long as you use the business content extractors you don't need to worry much about the ERP data models. This diagram shows a fairly traditional datawarehousing approach. The same sort of thing happens with other solutions based on semantic representations of ERP data.

Another option is to replicate our ERP tables directly into an analytic layer. This is what happens with SAP HANA if you are using Sybase Replication Server to load data into HANA. Notice the virtual semantic views that are created in the datawarehouse system. This work must be done for most ERP data structures because, as we've already discussed, these ERP data structures don't necessarily make any sense on their own. Creating these views is one of the things that, as we have been hearing from Vitaliy Rudnytskiy, IC Studio will be used for. Ingo Hilgefort touches on some of the same points in his blog on the HANA architecture. And Brian Wood also briefly touches on his role in developing semantic views for ERP data in HANA in his TechEd 2010 presentation.

I find that there are two interesting things about this approach, and these are things to watch out for if you are implementing a system like this:

First, whereas the semantic views in the previous diagram are materialized (meaning pre-calculated), these views are not, meaning that they need to be calculated at query run-time. Even on a system as blazing fast as HANA, I can see the possibility of this turning into a problem for certain types of joins. No matter how fast you are going, some things just take time. Vitaliy, again, does a great job of discussing this in his comment on Arun's blog musing on the disruption that HANA may cause to the datawarehousing space: http://www.sdn.sap.com/irj/scn/weblogs?blog=/pub/wlg/22570.

The second musing I have is that until SAP or partners start releasing semantic integration content, each customer or systems integrator is going to need to come up with their own strategy for building these semantic views. In some cases this is trivial and it's going to be tough to get wrong, but in a lot of cases the semantics of ERP tables are extremely complex and there will be lots of mistakes made. It is going to take a while for semantic content to reach a usable level, and it will take years and years for it to reach the level of the current business content extractors. Customers who are used to using these extractors with their BW installations should take note of this additional effort.

The solution to semantic views that are too processing-intensive to run in the context of a query is to materialize the view. It is unclear to me whether you can use IC Studio to do this in HANA. At worst, you can use BusinessObjects Data Integrator to stage data into a materialized semantic view, then query against this view in HANA. Of course, now we are storing data twice in HANA, and these blades aren't exactly cheap!
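To make the trade-off concrete, here is a hedged SQL sketch of the two options. The object names are invented, and the CREATE TABLE ... AS SELECT syntax for materializing varies by database (in practice you might stage the table with Data Integrator instead), but the contrast is the point: the virtual view pays the join cost at every query, while the materialized copy pays it once per load.

```sql
-- Option 1: a virtual semantic view. The join and aggregation are computed
-- at query run-time, every time the view is queried.
CREATE VIEW v_revenue_by_customer AS
SELECT h.customer_id,
       SUM(l.amount) AS revenue
  FROM billing_header h
  JOIN billing_item   l
    ON l.document_id = h.document_id
 GROUP BY h.customer_id;

-- Option 2: a materialized copy of the same result, refreshed by a batch
-- ETL step. Queries against it are cheap, but the data is stored twice and
-- is only as fresh as the last load.
CREATE TABLE revenue_by_customer AS
SELECT h.customer_id,
       SUM(l.amount) AS revenue
  FROM billing_header h
  JOIN billing_item   l
    ON l.document_id = h.document_id
 GROUP BY h.customer_id;
```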

When we do this, using the tools currently available to us in HANA, we also lose the concept of real time. This is because our ETL process is no longer only a push process using Sybase Replication Server; now there is also a batch ETL process that populates the materialized view. We are back in the same trade-off between load-time complexity and query-time complexity that we face and struggle with in any BI system.

One possible solution to the second problem mentioned above (the difficulty of building semantic views on very complex and heterogeneous data models), is for SAP and partners to deliver semantic integration logic in a specialized semantic unification layer. We might call this layer the Semantic Layer, which Jon Reed, Vijay Vijayasankar, and Greg Myers discuss very insightfully in this podcast: http://www.jonerp.com/content/view/380/33/. I suspect that this layer will be a central piece in the strategy to address the semantic integration problem that is introduced when we bypass the business content extractors or source datawarehouse structures from non-SAP systems.

This is even possible across source systems in BusinessObjects 4.0 with the use of Universes that support multiple sources, a feature that is new to this release. It is a very powerful idea and I really look forward to seeing what SAP, customers, and partners build on this new platform.

But I'm a little worried about this approach in the context of higher-volume data, and the reason is those striped arrows crossing the gap between the datawarehouse system and the semantic layer system. If you look back at the previous diagrams, the initial semantic view is always in the same physical system as the tables that the semantic view is based on. Except in the last diagram. In this diagram, the semantic view is built on a different platform from the one the data is stored in.

What does this mean? It means that for certain types of view logic, we are going to be in one of two situations: either we will need to transfer the entire contents of all the tables that feed the view into the semantic layer, or we will need to do large numbers of round-trip queries between the semantic layer and the datawarehouse layer as the semantic layer works to incrementally build up the view requested by the query. Either of these integration patterns is very difficult to manage from a performance perspective, especially when the integration happens over a network between two separate systems.

There are ways around this, including (re)introducing the ability to easily move semantically integrated data from an ERP system into a hypothetical future HANA datawarehouse, or tight integration of the semantic layer and the datawarehouse layer that allows the logic in the semantic layer to be pushed down into the datawarehouse layer.

I wonder if we'll see one or both of these approaches soon. Or maybe something different and even better!