Natural Language Processing as a Foundation of the Semantic Web
Foundations and Trends® in
Web Science
Volume 1 Issue 3–4
DOI: 10.1561/1800000002
Yorick Wilks
University of Oxford, UK
Christopher Brewster
Aston University, UK, c.a.brewster@aston.ac.uk
SUGGESTED CITATION:
Yorick Wilks and Christopher Brewster (2009), "Natural Language Processing as a Foundation of the Semantic Web",
Foundations and Trends® in Web Science: Vol. 1: No. 3–4, pp 199–327.
http://dx.doi.org/10.1561/1800000002
Abstract
The main argument of this paper is that Natural Language Processing (NLP) does, and will continue to, underlie the Semantic
Web (SW), including its initial construction from unstructured sources like the World Wide Web (WWW), whether its advocates
realise this or not. Chiefly, we argue, such NLP activity is the only way up to a defensible notion of meaning at conceptual
levels (in the original SW diagram) based on lower-level empirical computations over usage. Our aim is definitely not to claim
logic-bad, NLP-good in any simple-minded way, but to argue that the SW will be a fascinating interaction of these two methodologies,
again like the WWW (which has been basically a field for statistical NLP research) but with deeper content. Only NLP technologies
(and chiefly information extraction) will be able to provide the requisite RDF knowledge stores for the SW from existing unstructured
text databases in the WWW, and in the vast quantities needed. There is no alternative at this point, since a wholly or mostly
hand-crafted SW is also unthinkable, as is a SW built from scratch and without reference to the WWW. We also assume that,
whatever the limitations on current SW representational power we have drawn attention to here, the SW will continue to grow
in a distributed manner so as to serve the needs of scientists, even if it is not perfect. The WWW has already shown how an
imperfect artefact can become indispensable.