Creating Usage Sensitive Knowledge Structures
While the Internet has proven to be an invaluable mechanism for human-to-human communication, it does not yet effectively support interactions between machines and humans, or between machines themselves. In its present form, the Internet is a myriad of interconnected digital resources that are usually convenient for humans to read but almost impossible for machines to interpret. While users are content to view information as informal text and visual imagery, machines must look beneath this presentation for more formalized semantics and structures. Although standards such as eXtensible Markup Language (XML) provide a means to digitally express user-defined information structures, they do not impart any meaning to the data they express, and they leave the user to manually separate the significant from the irrelevant.
These shortcomings suggest the need for advancements that will render information on the Internet as meaningful to machines as it is to humans. This extension of the Internet, termed the Semantic Web, gives information well-defined meaning, allowing computers and people to work in co-operation more effectively.
It is with these challenges in mind that this project is presented.
Ontologies and similar information structures have been developed by experts in an attempt to fashion meaning out of the vast expanse of information in the digital realm. While these mechanisms of conceptualization undeniably create meaning where little existed before, they do so at costs that are proving to be too great. More recently, casual Internet users have provided a means of collaborative classification through tagging, generating flat structures known as folksonomies. While this approach does not provide a complete solution, it strengthens the connection between the Internet and the machines that it depends upon.
The primary goal of the project on which this report is based is to investigate novel mechanisms for creating semantic data that is current. While the managed hierarchical data models mentioned above serve several notable purposes, they are inherently static and limited in their ability to adapt to changing perceptions. As a result, many of the relationships they express are out of date.
To investigate techniques that can partially remedy this shortcoming, manual and automatic means of textual categorization are combined to produce a system whose output expresses a current interpretation of the relationships between the terms of a limited vocabulary. A successful outcome would be semantic data that is both current and useful to individuals and organizations that rely on data categorization and retrieval.