Palladio

I've been working on this for a while, so it's probably worth posting something about it.

Palladio is a platform for data visualization and exploration designed for use by humanities researchers. We're in early beta at the moment and will be doing a series of releases throughout 2014. You can read about it and try it out here: http://palladio.designhumanities.org/

I think the website does a pretty good job of explaining the capabilities of the platform, so I'll leave that for the moment. I encourage you to go check it out before reading on, because it will be worth understanding how the platform works to give some context to the discussion below.

The map view in the Palladio interface

So, why am I super-excited about this? Mostly because of the great team, which has the vision, technical skills, theoretical and domain knowledge, and information design chops to pull off this type of project. I consider myself lucky to be able to work with this group.

It's also great to be working on a project like this for a field that is simultaneously very strong on information theory and a bit underserved in terms of some types of tools. This is in stark contrast to my usual enterprise data management and visualization work where the theory tends to be weak but a plethora of tools exist.

In addition to trying to build a tool that incorporates important and underserved aspects of humanistic inquiry, I am excited to work with a team that buys into introducing state-of-the-art concepts around data exploration tools in general. Many of the concepts we are working to implement in Palladio are directly applicable to the types of data exploration problems we find in the enterprise and are concepts rarely expressed in existing tools. Palladio is a great example (one of many great examples) of how the process of humanistic inquiry can motivate the development of methods that are both technically and conceptually applicable in wildly different disciplines.

Interaction

The thing that initially most impresses people about Palladio is the way that filtering and movement are integral to the visualization. Specifically, the visualizations update and move in real time as you filter. This is not a new concept, but I don't think I've ever seen it fully implemented in a general-purpose tool. Getting the level of movement right is a design challenge that the team is tackling as a work in progress, but in my opinion this characteristic of real-time updates and movement is a key feature for a data exploration tool, and few if any tools implement it.

I'll try not to get too squishy here, but this behavior of the tool allows a person to interact with the data in a very direct way, giving a feel for the data that would not otherwise exist. When you can see the results of your interactions with the data in real time, it is a lot easier to conceptually link step-changes and interesting events with the actions that caused them. For example, dragging a filter along the timeline component allows you to play back history at your own speed, speeding up or slowing down as suits you. My theory-fu is weak, but when you see it, I think you'll understand. Try it out with the sample data.

Browser

Techy alert: Palladio is a purely client-side, browser-based application. The only involvement of a server is to deliver the HTML, JavaScript, and CSS files that comprise Palladio. We arrived at this design through a few iterations, but the motivation was that we wanted to be cross-platform, super-easy to install, and still support pretty big data sets and fluid interactions. Ten years ago, this would have been nearly impossible, but today we have web browsers that, amazingly, are capable of supporting these types of applications. Yes, browsers can now dynamically filter, aggregate, and display tens or even hundreds of thousands of rows of data, drawing hundreds of data points simultaneously in SVG, or thousands if you use canvas instead.
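
To give a feel for what "filter and aggregate in the browser" means in practice, here is a minimal sketch in TypeScript. To be clear, this is not Palladio's code: the Letter record, the field names, and the synthetic rows are all made up for illustration, and it uses plain arrays rather than a library like Crossfilter.

```typescript
// Hypothetical record shape; Palladio's real data model is not shown in this post.
interface Letter {
  year: number;
  city: string;
}

// Keep only rows whose year falls inside the inclusive range [from, to].
function filterByYear(rows: Letter[], from: number, to: number): Letter[] {
  return rows.filter((r) => r.year >= from && r.year <= to);
}

// Count rows per city: the kind of aggregate a map or bar chart would draw.
function countByCity(rows: Letter[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const r of rows) {
    counts.set(r.city, (counts.get(r.city) || 0) + 1);
  }
  return counts;
}

// Build 200,000 synthetic rows to stand in for a "pretty big" data set.
const cityNames = ["Paris", "London", "Geneva", "Rome"];
const letters: Letter[] = [];
for (let i = 0; i < 200000; i++) {
  letters.push({ year: 1700 + (i % 100), city: cityNames[i % cityNames.length] });
}

// This is the work to redo on every filter change, such as each drag
// event on a timeline, before redrawing the visualization.
const summary = countByCity(filterByYear(letters, 1750, 1759));
console.log(summary);
```

The point of the sketch is only that the whole filter-then-aggregate step is a couple of passes over an in-memory array, with no network request anywhere in the loop. A library built for this purpose can avoid rescanning every row on each change, which matters more as the data grows.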

The time for client-side data visualization in the browser has come and we are taking advantage of that in a big way. A great strength of browser-rendered visualizations is that they allow true interaction with the visualization. Just using SVG or Canvas as a nicer replacement for static images is fine, but it isn't fully exploiting the medium. On top of that, the type of interactivity we are providing with Palladio is technically impossible in a client-server setup. Even if the server responds to queries instantaneously, the round-trip time the client-server communication introduces means that interactions won't be as closely linked as they are in Palladio, severely degrading the quality of the interactive experience.

Admittedly, we have work to do on performance and our cross-browser support could be better. Additionally, the problem of data that simply doesn't fit in the browser's memory remains unaddressed, though we have some ideas for mitigating the problem. But I think this is an application design approach that could be exploited for the vast majority of data sets out there, either because the data itself is relatively small, or through judicious use of pre-aggregation to avoid performance and size issues.
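
As a rough illustration of the pre-aggregation idea, the sketch below collapses raw rows into one row per (year, city) pair with a count, then answers a range query from the smaller table. This is an assumption about how one might do it, not something Palladio does today; the RawRow and AggRow shapes and the sample rows are hypothetical.

```typescript
interface RawRow {
  year: number;
  city: string;
}

interface AggRow {
  year: number;
  city: string;
  count: number;
}

// Collapse raw rows into one row per (year, city) pair with a count.
function preAggregate(raw: RawRow[]): AggRow[] {
  const byKey = new Map<string, AggRow>();
  for (const r of raw) {
    const key = r.year + "|" + r.city;
    const existing = byKey.get(key);
    if (existing) {
      existing.count += 1;
    } else {
      byKey.set(key, { year: r.year, city: r.city, count: 1 });
    }
  }
  return Array.from(byKey.values());
}

// Total for an inclusive year range, computed from the aggregated table only.
function totalInRange(agg: AggRow[], from: number, to: number): number {
  let total = 0;
  for (const a of agg) {
    if (a.year >= from && a.year <= to) {
      total += a.count;
    }
  }
  return total;
}

// Made-up sample rows, purely for illustration.
const rawRows: RawRow[] = [
  { year: 1750, city: "Paris" },
  { year: 1750, city: "Paris" },
  { year: 1750, city: "London" },
  { year: 1751, city: "Paris" },
  { year: 1760, city: "Rome" },
];

const aggregated = preAggregate(rawRows);
console.log(aggregated.length, totalInRange(aggregated, 1750, 1751));
```

The trade-off is that the aggregated table fixes the granularity up front: you can still filter on year and city, but any column dropped during aggregation can no longer be used as a filter in the browser.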

Design

Lastly, user experience and information design have been integral components of this project from the start. The design has been overhauled several times along the way, and I wouldn't be at all surprised if it happened again. To be clear, I'm a complete design newb, but we have a great designer working on the team. One thing that has become clear to me through this process is that designing a general-purpose interactive visualization tool is hard. There are more corner cases than I previously imagined possible, but we are trying to get the big pieces right and I think we're on the road to success.

Obviously the organizational dynamics on a small team like ours are very different than those in a big development organization, but it seems like information design on most of the enterprise data exploration tools from larger vendors either started out problematic and stayed that way, or started out pretty well and started slipping as the tool took off. I'm not sure if there is an answer to this, but it's clear that when building a tool in this space, having at least one information designer with a strong voice on the team is indispensable.

Let me sum up

So, that's the Palladio project, as well as a few takeaways that I feel can be applied back to pretty much any data exploration project. In closing, I'll just mention that none of this would be possible without some great open source projects that we use and, most importantly, without the great team working on this as well as the feedback and patience of our dedicated beta participants. The central JavaScript libraries we used to pull this off are d3.js, Crossfilter, and Angular.js. The members of the core team are Nicole Coleman (our fearless leader), Giorgio Caviglia (design and visualization), Mark Braude (documentation, testing, working with our beta users, and project management), and myself doing technical implementation work. Dan Edelstein is the principal investigator.

It's been a great ride so far and we've got some exciting things planned for the rest of the year. This is definitely a work in progress, and feedback is very welcome. Follow the Humanities + Design lab on Twitter for updates.

Another SAP data visualization head scratcher

Maybe I need to make a series out of these.

This time via a tweet from Andrew Fox: The worst implementation of a pie chart I have ever seen, courtesy of SAP BusinessObjects Visual Intelligence.

Is there something between Jackets and Sweat-T-Shirts? And what is between Accessories and Dresses? What happened to Overcoats, City Trousers, and Outerwear? Fell through the cracks, I guess.

SAP BusinessObjects Visual Intelligence is supposed to be SAP's answer to industry-leading data visualization products like Tableau and Qlikview. Seeing this, it looks more like an advertisement for these competing (I use the term loosely) products.

The full demo video is here.

Elastic lists using Protovis

I've been seeing more and more list-based visualizations used for data selection showing up in BI software. These types of selection interfaces are especially prominent in Qlikview and SAP BusinessObjects Explorer (which you can try on the web).

Ever since seeing Moritz Stefaner's implementation of Elastic Lists, I've been a bit dissatisfied with the implementations in enterprise BI tools, including the ones listed above. "Elastic" lists leverage the list format to visualize characteristics of the data by tying the size of the bar representing a column value in the selection list to a metadata metric - in this case the probability that a given column value will occur in a dataset.
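
To make the "elastic" part concrete, here is a small TypeScript sketch of the calculation as I understand it: for each value in a list, compute its share of the currently selected rows and scale that to a bar size. This is not Moritz Stefaner's code or my Protovis experiment; the Car shape, the sample rows, and the 200-pixel maximum are assumptions made up for the example.

```typescript
interface Car {
  origin: string;
  cylinders: number;
}

// Share of each distinct value among the given values (shares sum to 1).
function valueShares(values: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const v of values) {
    counts.set(v, (counts.get(v) || 0) + 1);
  }
  const shares = new Map<string, number>();
  counts.forEach((count, value) => {
    shares.set(value, count / values.length);
  });
  return shares;
}

// Made-up rows, loosely in the spirit of a cars data set.
const cars: Car[] = [
  { origin: "USA", cylinders: 8 },
  { origin: "USA", cylinders: 8 },
  { origin: "USA", cylinders: 4 },
  { origin: "Japan", cylinders: 4 },
  { origin: "Japan", cylinders: 4 },
  { origin: "Europe", cylinders: 4 },
];

// Shares for the "origin" list with nothing selected.
const allShares = valueShares(cars.map((c) => c.origin));

// Selecting "4 cylinders" in another list changes the shares, and so the
// bar sizes, in the "origin" list. That resizing is the elasticity.
const fourCylinder = cars.filter((c) => c.cylinders === 4);
const selectedShares = valueShares(fourCylinder.map((c) => c.origin));

const maxBarPx = 200; // assumed bar size for a share of 1.0
selectedShares.forEach((share, value) => {
  console.log(value, Math.round(share * maxBarPx));
});
```

With nothing selected, USA accounts for half of the rows; once the four-cylinder filter is applied, Japan becomes the largest bar. A plain selection list would show the same three labels in both cases and hide that shift entirely.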

In order to help myself understand the strengths and weaknesses of this type of visualization more thoroughly, I started to experiment with list-based visualizations in Protovis (a JavaScript-based visualization library using SVG for rendering). Eventually, I added in elasticity and gave the list selection the power to drive a second visualization. It uses the cars dataset and visualization from the Protovis examples to demonstrate driving a second visualization with the list selection. (Note: the coordinates on the second visualization are reversed for reasons that I haven't looked into yet.)

That experiment is now working well enough that I thought I'd publish it so that others can comment, use the code (but really, it's a bit of a mess, so be wary), and experiment with the concept. If you want to add some capability, go right ahead and fork the project on GitHub.

For my part, I will likely do a more thorough analysis of list-based visualization in BI tools eventually, but for now I think I can safely say that anywhere a list appears, there is little excuse for lack of "elasticity" in the visualization.

Note: This visualization will only work in browsers that support the SVG standard. It does not work in IE6, 7, or 8. Pretty much any other browser (Firefox, Chrome, Safari, etc.) should work fine.

You can view a static image of the visualization below.