Saturday, January 1, 2011

Time Part 2 - Januray 1, 2011

This post follows on from the introduction to thinking about time posted here.

In the previous post, I introduced a few of the nuanaces of time. It isn't about the obvious things like format, timezone, precision, calendar, etc. Those are important and can be the source of errors in coding, but they aren't the worst issues by any means. Of more importance is the meaning of time in a specific context. So to start with. let's look at the kinds of time we might be concerned with. I like to think of time as either a"point in time" or duration. Then associated with times there are various "operations" that are to be applied - dealing with relative time.

So thinking about the common "kinds" of time we see, we might classify this way:
  • Points in time
    • Real time - coincident (to a fine tolerance) with something else
    • Near real time - within the same "transaction" scope as something else
    • Actual time - When something happened
    • Notification time - When an event was notified. This may or not be the time the event happened
    • Observation time - when an observer receives notification that something happened
    • Effective time - a point in time when something becomes effective
    • Expiry time - a point in time when something is no longer valid
    • Cut off time - a point in time when changes are no longer accepted (may behve like an expiry time)
    • ?
  • Durations
    • Validity period - a time duration where something is valid
    • Length of time - a time duration where something is happening
    • Relative duration - a duration that is defined without there being an actual start time. For example, a half in soccer. A type level concept and not an instance level concept.
Clearly in this intial writing there are some quite nebulous words like "something" or "transaction" or "Event". Perhaps happening would be a more useful word.

Anyhow much of the difficulty in systems comes about because of ambiguity related to time. Something may "happen" causing the place at which it happens to think that the happening has occurred, but it has yet to be observed by other entities. Until it does the other entities cannot know that the state has changed.

This is not just a systems phenomenon, but a physics phenomen too. Light moves at a finite speed, so when we see an object (eg a star) that has emitted light, we are seeing the object as it was at the time the light was emitted, and not as it is "now" - or in the frame of reference of us, the observers. We don't (and cannot) know whether the object even still exists. We make assumptions, of course, but in reality we don't know. In other words there is considerable ambiguity.

The big question then is how to handle such ambiguities/inconsistencies of existence. We could take the "GOD" view. There is only one view of the truth, all questions must be asked of that view and there shall be  no other views.  That's fine, except it does take finite time to ask questions, so even if there is a "GOD" view there is and always will be inconsistency.

We can take a view that allows each observer to have its own "correct" view. But that doens't feel any better. We can never be sure when asking that observer if the same set of data has been included as in another observer's set. So for example, when I am presenting some conclusions, it might be quite reasonable for a questioner to ask if I had included the research that was published in December 2010. The point is that unless we are pretty explicit about what is in and what isn't, it is hard to determine what something actually means.

Ultimately we cop out and attempt to have a GOD view of our own data and then allow for other pockets of data which we "update" by taking extracts from the "GOD" view. That has sort of worked - we batch things together and know that the information was correct at midnight yesterday or other known time.

All changes in systems that work in individual events, where data are not batched, where a system is as up to the moment as possible because we don't know and cannot in today's systems know' what has actually been included and what hasn't.

The taxonomy of time that I presented earlier is really only useful as an analysis set. It helps us as analysts answer questions about the data as the data are purposed and repurposed in the (extended) enterprise. It gives a vocabulary to use when asking questions around a particular happening or duration.