Friday, November 23, 2012

Cash to Delivery

This post looks at a specific high level process drawn from the "eating out" industry. It's origins are from that fine barbecue establishment in Dallas - the original Sonny Bryan's. The time 1985. The conversation took place in the shack that was as crowded as ever at lunch time. The method was, place order, pay for order, hang around and wait for order, pick up order, attempt to squueze oversized bottom into school chairs, devour product. While casually waiting for the order to be prepared, I idly asked my colleague, "how do they match the orders with the people?". It seemed as if the orders always came out in sequence, so, being a systems person, I got to wondering about the nature of the process.

I had clearly paid in a single "transaction". Sonny Bryan's had my money. In return I had a token (numbered receipt) that stated my order content as evidence of what I had paid for. However that transaction was not synchronous with the delivery of the food, nor was the line held up while the food was delivered. Had it been, the place would have emptied rapidly because the allotted time for lunch would have expired.

I, as the customer, think that the transaction is done when I have wiped my mouth for the last time, vacated my seat and thrown away the disposable plates, etc. But the process doesn't work like that.
There are intermediate "transactions". The I paid and got a receipt transaction (claim check, perhaps?). The I claimed my order transaction. The I hung around looking for somewhere to sit transaction. The I threw away the disposables transaction.

Each of these transactions can fail, of course. I can place my order and then discover I can't pay for it. No big deal (from a system perspective, but quite embarrasing from a personal perspective). I could be waiting for my order, and get called away, so my food is left languishing on the counter. Sonny Bryans could have made my order up incorrectly. I could pick up the wrong order. I could have picked up the order and discovered no place to sit. Finally I could look for a trash bin, and discover that there isn't one available (full or non-existent).

I defintely want to view these as related transactions, not one single overarching transaction (in the system's sense). In reality what I have is a series of largely synchronous activities, buffered by asynchronous behavior between them.

Designing complete systems with a mixture of synchronous and asynchronous activities is a very tricky business indeed. It isn't the "happy path" that is hard, it is the effect of failure at various stages in an asymchronous world that makes it so tough.

Wednesday, November 21, 2012

A data hoarder's delight

Withe big data, I sometimes feel like the character Davies in the masterful spy novel, "The Riddle of the Sands" by Erskine Childers. "We laughed uncomfortably, and Davies compassed a wonderful German phrase to the effect that 'it might come in useful'"

Some of the "big data" approaches that I have seen are a bit like that. We keep stuff because we can, and because it "might come in useful." For sure there are some very potent use cases. Forbes, in this piece describes how knowledge about the customercan drive predictive analytics. And valuable it is.

However the other compelling use-cases are a bit harder to find. We can certainly do useful analysis of log files, click through rates, etc., depending on what has been shown to a customer or "tire kicker." But beyond that the cases are harder to come by. That is to some extent why much of the focus has been on the technology and technology vendor side. 

There is a pretty significant dilemma here, though. If we wait to capture the data until we know what we need, then we will have to wait until we have sufficient data. If we capture all the data we can as it is flying through our systems and we don't yet know how we might use it, we need to make sure it is kept in some original form. Apply schema at the time of use. That makes us quake in our collective boots if we are data base designers, DBAs etc. We can't ensure the integrity. Things change. Where are the constraints?... In my house that is akin to me throwing things I don't  know what to do with (pieces of mail, camping gear, old garden tools, empty paint cans, bicycle pumps,... you get the idea). So when Madame asks for something - say a suitable weight for holding the grill cover down, I can say, "Aha, I have just the right thing. Here, uses this old, half full can of paint." A properly ordered household might have had the paint arranged in a tidy grouping. But actually that primary classification inhibits out of the box use.

Similarly with data. I wonder what the telephone country code for the UK is.. Oh yes, my sister lives there, I can look up her number and find it out. Not exactly why I threw her number onto the data pile, but handy nonetheless.

So with the driver of cheap storage, cheap processing, we can sudden;y start to manage big piles of data instead of managing things all neatly.

This thinking model started a while back with the advent of tagging models for email. Outlook vs Gmail as thinking models. If I have organized my emails in the Outlook folders way, then I have to know roughly where to look for the mail - all very well if I am accessing by some primary classification path, but not so handy when asked to provide all the documents containing a word for a legal case...It turns out - at least for me that I prefer a tag based model - a flat space like Gmail where I use search as my primary approach, as opposed to a categorization model where I go to an organized set of things.

There isn't much "big" in these data examples. It is really about the new ways we have to organize and manage the data we collect - and accepting that we can collect more of it. Possibly even collecting every state of every data object, every transaction that changed that state, etc. Oh and perhaps the update dominant model of data management that we see today will be replaced with something less destructive.