IMBOK - 402

Tools and techniques - The information portfolio

Just as with applications, information comes in different forms and carries different importance to the organisation. Research has revealed that the two discriminating factors that allow us to construct an alternative portfolio, for information, are:

The availability of information: Some information is wholly contained within the organisation, but increasingly as organisations share information, and as the cloud becomes more and more important as a resource, we have to recognise the information that resides outside the organisation.
The nature of the information: Some information is tightly structured, and some is not. Increasingly, there is a stream of structured data coming from remote devices and telemetry, but at the same time organisations have access to huge volumes of unstructured data, such as that found on the social web.

The information portfolio lays out these two discriminators as a 2x2 matrix, and provides examples in each quadrant:
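
As a minimal sketch of how the two discriminators define four quadrants, the Python below classifies an information source by its availability and its nature. The names `Availability`, `Nature` and `quadrant`, and the choice of sample sources, are illustrative assumptions and not part of the published model.

```python
from enum import Enum


class Availability(Enum):
    INTERNAL = "internal"   # wholly contained within the organisation
    EXTERNAL = "external"   # resides outside the organisation


class Nature(Enum):
    STRUCTURED = "structured"
    UNSTRUCTURED = "unstructured"


def quadrant(availability: Availability, nature: Nature) -> str:
    """Return a label for the 2x2 cell that an information source falls in."""
    return f"{availability.value}/{nature.value}"


# Hypothetical sources, placed using examples mentioned in the text.
sources = {
    "post codes": (Availability.EXTERNAL, Nature.STRUCTURED),
    "social web posts": (Availability.EXTERNAL, Nature.UNSTRUCTURED),
}

for name, (availability, nature) in sources.items():
    print(f"{name}: {quadrant(availability, nature)}")
```

Using two small enumerations keeps the discriminators independent of each other, so any source can be placed in exactly one of the four cells.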

The information portfolio helps us to deal with these challenges, and it organises the way that we might work around them. The notes that follow relate to the numbered elements of the model as presented in Figure 26.

Stage 1: Taking advantage of public information

The simplest first step in managing with the portfolio is to recognise and adopt well-structured external schemes of reference data, such as post codes, weather data, GPS positioning data and even travel timetables. It is now routine in the United Kingdom to undertake a web-based transaction using first your house number and then the postcode: the combination of these seven or eight digits and characters identifies your full address immediately, without any ambiguity, and the remainder of the order screen is auto-filled from that raw data. Some administrations elsewhere in the world are less advanced, and not rigorous enough with their post coding schemes for the data to be reliable. In South Africa a new road tolling scheme (at the time of writing) relies on the existing national database of registered road vehicles in order to track down drivers or vehicle owners who have used the roads and not paid. The quality of the vehicle registration data is so poor that some experts think the road tolling scheme will collapse, as revenues are falling far below what is needed to pay for the cost of building and operating the tolling system. Elsewhere, in education, in health and in general administration, there are many existing schemes that provide perfectly adequate structuring for data, and not choosing them risks serious problems of compatibility in the future; where there are duplicate or overlapping schemes then we have a problem, of course.
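
The house number and postcode lookup described above can be sketched as a simple keyed lookup against reference data. This is only an illustration: the `ADDRESSES` table and its entries are invented, the function names are hypothetical, and a real service would query a licensed national address file instead of an in-memory dictionary.

```python
from typing import Optional

# Hypothetical reference data keyed by (postcode, house number).
ADDRESSES = {
    ("AB1 2CD", "7"): "7 Example Street, Sampletown",
    ("AB1 2CD", "9"): "9 Example Street, Sampletown",
}


def normalise_postcode(raw: str) -> str:
    """Upper-case a UK-style postcode and put one space before its last 3 characters."""
    compact = "".join(raw.split()).upper()
    return f"{compact[:-3]} {compact[-3:]}"


def lookup_address(house_number: str, postcode: str) -> Optional[str]:
    """Return the full address, or None if the pair is not in the reference data."""
    key = (normalise_postcode(postcode), house_number.strip())
    return ADDRESSES.get(key)
```

Normalising the postcode before the lookup means that differences in spacing or letter case in what the customer types do not cause a valid address to be missed.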
The inefficiencies and risks of having to translate one set of external codes (for example, a supplier part number code) into an internal code (for example, an internal stock number) can undermine business performance, and lead to a constant struggle to keep things properly lined up and to sort out the avoidable problems that arise. The international efforts to organise well-defined structures and codes for illnesses, drugs, educational material, fast-moving consumer goods and even crimes (to mention just a few) are worthwhile and helpful, and should be supported and encouraged.

Stage 2: Tagging the noise on the web

Outside the boundaries of any single organisation there is a constant flow of potentially relevant information that needs to be monitored in case it contains something valuable. The problem is that there is so much of it, mostly on the social pages of the World Wide Web of course. There are different ways to harvest and organise such data, possibly by using existing schemes such as post codes and GPS data (already mentioned above) but more typically by adding “tags”, or by analysing it and fitting it to a prepared scheme of ideas that we might call an ontological model. This effort to organise the vast content of the web is happening now, for example in the project that is known as the “semantic web”; experts are meeting, papers are being written, conferences are happening (for example: SEMTECHBIZ, 2012). We are (at the time of writing) at a tipping point that will affect life for many years to come in ways that we cannot yet anticipate. Hence, there are two ways to make sense of the web. First, it is possible to construct formal ontologies, and second, it is possible to have open-ended tagging schemes; these two approaches lead to quite different results.
The ontological approach attempts to be highly structured and rigorous, whereas the tagging approach tends to be completely open: it can be manipulated, and it is potentially highly redundant. Ontologies are a formalised structuring of the “things” that comprise our “real” world. In a research project an ontology will identify the entities (and the relationships between them) that are relevant to the research and about which the project wishes to gather and analyse data (we will return to a discussion of entity modelling again, shortly). An ontological model can be developed using the same kinds of rules that are used in the development of entity-relationship models because (as your author sees them) they achieve a very similar thing.

Stage 3: Sifting and analysing

A research project can usefully develop its own ontological view of what it needs to work with, but in the wider world of business the generalised ontologies that are under development extend to hundreds of entities and hundreds of relations between them. They are quite alien to the majority of business people and web users, who (understandably) prefer to take the line of least difficulty and use the tagging option. It is possible to devise and apply tags to web content and to other stored data such as photographs and emails, within personal systems or within shared systems, and those tags can be used to quickly select the content relevant to a business issue in a web site, or a photo archive, or a discussion board with tens of thousands of contributions. Exactly how this tagging might actually be done is wide open, and some management will be helpful in avoiding complete chaos; better to have “managed chaos”, I think.
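
To illustrate how tags can be used to select relevant content quickly, here is a minimal sketch of a tag index. The item identifiers, the tags and the function names `build_index` and `select` are all hypothetical, chosen only for the example.

```python
from collections import defaultdict

# Hypothetical stored items (a photo, an email, a forum post) with free-form tags.
items = {
    "photo-001": {"launch", "london", "2012"},
    "email-417": {"launch", "supplier"},
    "post-9032": {"complaint", "supplier"},
}


def build_index(tagged_items):
    """Map each lower-cased tag to the set of item ids that carry it."""
    index = defaultdict(set)
    for item_id, tags in tagged_items.items():
        for tag in tags:
            index[tag.lower()].add(item_id)
    return index


def select(index, *tags):
    """Return the ids of items that carry every one of the given tags."""
    matches = [index.get(tag.lower(), set()) for tag in tags]
    return set.intersection(*matches) if matches else set()


index = build_index(items)
print(sorted(select(index, "launch")))
print(sorted(select(index, "launch", "supplier")))
```

Because the tags here are open ended, nothing prevents two people from tagging the same thing as “supplier” and “vendor”; agreeing a small shared vocabulary is one form the “managed chaos” might take.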
Individuals work with search engines every day, and develop their own lists of tags (or “key words”) that work well for them. Information consolidators such as the global news providers (many of which simply scour the content of the web for stories written by others) do this automatically using tags and keywords, with just some human intervention to make sure that the best stories are featured more prominently (but that will always be a human judgement, surely?). The sort of face recognition that is now happening (with image management software such as Picasa) is entirely automatic: it uses “tagging” schemes that we do not understand, that are highly complex and embedded in image processing, and that at this stage are entirely proprietary, it seems.

One might surmise that at this point in time the average business, or government department, or community, will be happy to wait and see what happens with ontologies and with the semantic web, and go no further than playing with tagging schemes. Other, more progressive businesses and organisations with a special interest in recovering information from raw data will be working closely with these ideas, which are yet to show their true potential in the world of business, in fighting crime, or in manipulating and nurturing groups of people on the social web.

Stage 4: Structuring and archiving

When it finally comes to organising and structuring information within an organisation, something comparable to the value chain would be useful. A high-level view of the organisation, on just one sheet of paper, with a generic arrangement of ideas that lets us compare our business to the way that other businesses work? Sounds good.

There is an extended discussion about the applications portfolio in the book, starting on page 136. The discussion goes on to present a generic information model for an organisation that can be used to assess the extent and detail of the information that is needed, and to identify the critical transactional information that will tell you whether you are making money or not.