This is the fourth entry about the role of natural language in the semantic web; the previous entry can be found here.
In the previous posts, we explored the idea of recruiting knowledge workers in the creation of metadata for highly specialized data. The advantages of this idea are clear; knowledge workers have studied for years to understand the subtleties of a field, and are sure to perform better than a machine at grasping why a document is about, say, one product line and not another. They are embedded in the work culture, and understand why certain documents are valuable, and to whom. Extracting that information using automated means is complex and error-prone.
But the disadvantages of this approach are also clear. Knowledge workers are in high demand, and don't have time to do such mundane work as marking up their documents. Knowledge management folks live for the day when they can get ten minutes of time from their knowledge workers; those workers' time is considered too precious to be spent on knowledge management.
But let's examine this a bit more closely; is it really true that a knowledge worker will spend a few days on a precious document, but can't be bothered to spend another ten minutes, or even a half hour, indexing it so that it can be found when needed? Why did they write the document, if it can't be used by the organization? "We aren't in the business of writing metadata," a banker once told me, "we are in the business of making deals!" What about the deal that your competitor closed over dinner last night, because he found the report that his analyst did last week on this industry, while your analyst stayed up all night re-doing the work for your breakfast meeting? That breakfast meeting that never happened, because the deal closed last night. Yes, you are in the business of creating and keeping metadata, whether you like it or not.
Another common objection is that knowledge workers aren't motivated to publish their work to other people. They want to hoard their knowledge. This may be true in some settings, but the success of things like Wikipedia suggests otherwise. Wikipedia was written by experts who put in hours of research for the sheer joy of recognition by their peers. Don't believe me? Have a look at the private blog of a regular Wikipedia contributor. That sure looks like someone who works for the sheer joy of recognition to me.
Then there's the fellow I met at the Kennedy Space Center who got so fed up with failing to distribute and find design documents that he built a whole indexing system. In FORTRAN, of course. It wasn't the best KM system you've ever seen, but it worked. It worked because it responded to the engineers' needs, and allowed them to express themselves.
These stories go on and on, of knowledge workers who are supposedly too busy to author metadata, but are not too busy to build their own infrastructure to share things.
What are the lessons to be learned from this? How can these observations help us organize knowledge management? That's the subject of the final installment.
