I'm trying to kick off a language input system by tokenising the input.
I've found the 'tokenize' module, and the system says it is now visible (DOS shell in XP), but when I try to issue tokenize('What is the time?',K), it says 'tokenize/2 undefined predicate'. It says the same if I spell it with 's' instead of 'z', and for the 'ser'/'zer' forms.
What am I missing?
Here is a session showing how you can use the tokenizer in Ciao:
Ciao 1.13.0-8283: Tue Jun 26 10:25:23 CEST 2007
?- use_module(library(tokenize)).
Note: module tokenize already in executable, just made visible

yes
?- read_tokens(TokenList, Dictionary).
|: p(X,Y).

Dictionary = dic([88],[_B|_],_,dic([89],[_A|_],_,_)),
TokenList = [atom(p),'(',var(_B,[88]),',',var(_A,[89]),')','.'] ?
Note, however, that this is not a generic tokenizer, but rather a tokenizer for one particular (Prolog-style) syntax: read_tokens/2 reads Prolog terms from the current input, as in the session above. If you want to write a tokenizer for another language, you can look at how tokenize.pl is implemented. Another good option is to write the tokenizer with DCGs (see the 'dcg' package).
--Manuel H
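To make the DCG suggestion concrete, here is a minimal sketch of a word-and-punctuation tokenizer for input like 'What is the time?'. It is not how tokenize.pl works, just one possible starting point. The module name simple_tokens, the predicate tokens/2 and the token shapes word/1 and punct/1 are all made up for this example. It assumes the text is passed as a quoted atom and that only ASCII letters count as word characters, so digits and everything else come out as single-character punct/1 tokens.

```prolog
% simple_tokens.pl: hypothetical example module, not part of Ciao.
% The third argument of module/3 loads the 'dcg' package, which
% enables the --> grammar rule notation.
:- module(simple_tokens, [tokens/2], [dcg]).

% tokens(+Atom, -Tokens): split Atom into word(W) and punct(P) tokens.
% The grammar is called directly with its two extra list arguments
% (codes to parse, codes left over), so phrase/2 is not needed.
tokens(Text, Tokens) :-
    atom_codes(Text, Codes),
    token_list(Tokens, Codes, []).

% Skip blanks, otherwise read one token; stop at the end of the input.
token_list(Ts)     --> blank, !, token_list(Ts).
token_list([T|Ts]) --> token(T), !, token_list(Ts).
token_list([])     --> [].

% Any character code up to 32 (space, tab, newline) is a blank.
blank --> [C], { C =< 32 }.

% A word is one or more letters; anything else is a one-character token.
token(word(W))  --> letter(C), letters(Cs), { atom_codes(W, [C|Cs]) }.
token(punct(P)) --> [C], { atom_codes(P, [C]) }.

% Collect the longest possible run of letters.
letters([C|Cs]) --> letter(C), !, letters(Cs).
letters([])     --> [].

letter(C) --> [C], { is_letter(C) }.

is_letter(C) :- C >= 0'a, C =< 0'z.
is_letter(C) :- C >= 0'A, C =< 0'Z.
```

With the file saved as simple_tokens.pl in the current directory, something like the following should work from the Ciao toplevel:

```prolog
?- use_module(simple_tokens).
?- tokens('What is the time?', K).
% K = [word('What'),word(is),word(the),word(time),punct(?)]
```

The cuts make the tokenizer commit to the longest word at each position, so the query has a single answer. To handle numbers or quoted strings you would add further token//1 clauses before the punct/1 fallback.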