The first version of the McCullough Knowledge Explorer (MKE) and the MKR language was born in 1996. MKR and MKE evolved slowly as I discovered what was needed for real-world applications. By the time I introduced the current form of action/method characterization in 1999, MKE had grown to about 25,000 lines of Icon programs. After modifications in 2003 to interface with the various languages and systems of the "Semantic Web", MKE was about 50,000 lines of Unicon programs, plus 1,000 lines of Java programs, plus 4,000 lines of UNIX shell programs.
I have thought about creating a smaller base version of MKE, with dynamically loaded add-ons as needed. But the steady advances in computer technology have relieved me of the necessity of doing so.
During all this development, I have maintained the uniform structure of the MKR language.
This section covers the record structures used by MKE. Access to these structures is via a set of procedures with a "Unicon-class-like" spirit. However, they are not classes because the original MKE implementation was done in Icon, and making the change to Unicon classes never got to the top of my priority list.
token.icn:    record WORD(wtype,wvalue)
token.icn:    record TOKEN(ttype,tvalue)
symbol.icn:   record SYMBOL(stype,svalue)
phrase.icn:   record PHRASE(pvalue)
nvlist.icn:   record NVPHRASE(novtype,novlist)
nvstack.icn:  (name-space stack)
dollar.icn:   (name-value lookup)
array.icn:    record AAPHRASE(aname,aindex)
bselist.icn:  record BSE(bse_separator,bse_begin,bse_list,bse_end)
pplist.icn:   record PPOBJECT(ppat,ppout,ppof,ppwith,ppod,ppfrom,ppto)
As an example of the "Unicon-class-like" structure, consider the procedures used to access NVPHRASE. Its novlist field is [name,op,value]; its novtype field is "nv" or "nvnull" (no value).
# new_nv(novlist)
# nv_novlist(x)
# nv_name(x)
# nv_op(x)
# nv_value(x)
# nv_badtype(t,x,ierror)      # error message
# nv_unparse(x)
# nv_writes(fd,x)
# nv_tsize(tsym)
# nv_map_symbol(x,tokenlist)
# nv2nov(nvphrase)
# nov2nv(novlist)
# symbol2nv(symbol)

The purpose of most of these procedures is obvious, so I will comment on only three of them. Every "class" has an unparse(), which returns the original input string; this is a little "tricky" because blanks and list separators are discarded early in MKE processing. Every "class" has a tsize(), which returns the number of tokens in a symbol; this is used for error checking during the parsing phase, to ensure that no tokens are added or deleted. Every "class" has a map_symbol(), which maps token types back to token values after symbol parsing is complete. For a complex example requiring the reinsertion of blanks between words, see htxt_map_symbol() in KEHOME/src/html.icn.
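To show the flavor of this style, here is a small sketch of how a few of the NVPHRASE procedures could look. The record declaration matches nvlist.icn, but the procedure bodies, and the assumption that a short novlist means "nvnull", are illustrative rather than MKE's actual code.

record NVPHRASE(novtype,novlist)    # as declared in nvlist.icn; novlist is [name,op,value]

procedure new_nv(novlist)
   # illustrative assumption: fewer than three elements means no value
   if *novlist < 3 then
      return NVPHRASE("nvnull", novlist)
   return NVPHRASE("nv", novlist)
end

procedure nv_name(x)
   return x.novlist[1]
end

procedure nv_op(x)
   return x.novlist[2]
end

procedure nv_value(x)
   return x.novlist[3]
end

procedure nv_unparse(x)
   # reassemble an input string; the real nv_unparse() is trickier
   # because blanks and list separators are discarded early
   if x.novtype == "nvnull" then
      return nv_name(x)
   return nv_name(x) || nv_op(x) || nv_value(x)
end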
concept.icn:  record CONCEPT(iconcept,...)
role.icn:     record ARGDEF(argname,argtype,...)
event.icn:    record EVENT(action,subject,ppobject,space,time,view)
context.icn:  record CONTEXT(atspace,attime,atview)
knit.icn:     record CONTEXT_TABLE(ct_ALIAS,ct_NICK,ct_K,ct_C,ct_A,ct_views)
knit.icn:     record CONTEXT_DATA(cx_view,cx_krlanguage,...)
group.icn:    record ABSTRACT_GROUP(iname,...)
begin.icn:    record GROUP(gtype,gvalue)
hwalk.icn:    record HOUNIT(holist,hoend)
relation.icn: record RELUNIT(reltuple,relend)
xml.icn:      record TRIPLE(ntsubject,ntpredicate,ntobject,ntend)
xml.icn:      record MCF(mcfname,mcfvalue)
xml.icn:      record RDF(rdfsymbol)
ged.icn:      record PERSON(gedname,uniquename,...)
ged.icn:      record FAMILY(gedname,uniquename,...)
ged.icn:      record NOTE(gedname,uniquename,...)
prompt() gets the next line from the input file; get_word() gets the next word from that line; get_token() maps the word to a single-byte token type; get_symbol() parses the tokens into symbols. Since every proposition ends in a semicolon, get_symbol() is activated when a semicolon is found. With nested propositions (e.g., if-then-else-fi), processing is delayed until the NEWcomplete() procedure determines that a complete proposition has been read.
get_token(fd,ps,option)
get_word(fd,ps,option)
prompt(fd)

The interpret_line(line,dollar) procedure inserts lines into the input stream by calling parse_file(line). When prompt(line) finds that its argument is a string, it returns that string instead of returning the next line from the input file.
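The string-argument behavior of prompt() can be sketched in a few lines. This is only an illustration of the behavior just described (hence the sketch_ name), not the real procedure:

procedure sketch_prompt(fd)
   # a line pushed back into the input stream arrives here as a string;
   # return it directly instead of reading from the file
   if type(fd) == "string" then
      return fd
   # otherwise return the next line from the input file
   return read(fd)
end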
The depth of the parse trees makes it rather tedious to find the right data for interpretation. To alleviate this problem, I use find_stype(stype,symbol) to search down the parse tree and find the right data. For example, when processing a proposition, I can use
subject := list_unparse(find_stype("subject",symbol).svalue)
verb    := list_unparse(find_stype("verb",symbol).svalue)
object  := list_unparse(find_stype("object",symbol).svalue)
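A minimal sketch of such a search is shown below. It assumes that a SYMBOL's svalue field holds either a string (a leaf) or a list whose elements may themselves be SYMBOL records; the sketch_ name marks it as an illustration, since the real find_stype() may differ in detail.

record SYMBOL(stype,svalue)    # as declared in symbol.icn

procedure sketch_find_stype(stype, symbol)
   local child, result
   if symbol.stype == stype then
      return symbol                            # this node is the one we want
   if type(symbol.svalue) == "list" then
      every child := !symbol.svalue do
         if type(child) == "SYMBOL" then
            if result := sketch_find_stype(stype, child) then
               return result                   # first match, depth-first
   # fails when stype does not occur in this subtree
end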
A binary relation between two concepts can be stated directly:

John brel father = Sam;

where "rel" is a predefined MKR verb and "father" denotes a user-defined binary relation. Alternatively, you can define a new verb to express the binary relation. Here is an equivalent statement of the above example:
relFather isu relation verb with arity=2;
John relFather Sam;

User-defined verbs have a special token type "B". New verbs are simply added to the mkr_word table, which is used by get_token().
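Registering a new verb can therefore be sketched as a single table assignment. The procedure name and the table initialization below are illustrative assumptions; only the table name mkr_word and the token type "B" come from MKE itself.

global mkr_word

procedure sketch_define_verb(name)
   /mkr_word := table()       # create the word -> token-type table on first use
   # a user-defined verb such as "relFather" gets token type "B",
   # so get_token() will recognize it from then on
   mkr_word[name] := "B"
   return name
end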
This capability can even be used to make MKR look more like RDF. (I call this the RDF/OWL Mtriples language.)
subClassOf isu verb with ctype=relation;
type isu verb with ctype=relation;
Chevy subClassOf car;
Bessie type Chevy;

The MKR equivalent of these statements is
Chevy iss car;
Bessie isu Chevy;
I adapted the Merr error message system, which was designed for the Unicon compiler, to work with my custom MKR parser. Only minor changes were required to use names instead of numbers for parser states and input tokens. I added one new feature: the ability to associate a "high level" message with a parser state.
The "err.meta" file contains the token error patterns and error messages. Merr executes "ksc" to determine the parser state associated with the token error pattern, and creates the yyerror.icn file. I configured merr to use "ksc" instead of "ke" because "ksc" is about 100 times faster than "ke".
A simple action in MKR looks like this:

John do hit od the ball done;

The most general form of an action in MKR is
at space=s,time=t,view=v {
    subject do action = event
        out action products
        of action domains
        with action characteristics
        od action direct objects
        from action initial characteristics
        to action final characteristics
    done;
};

Actions can also be written as productions:
product := subject do ... done;

Methods are just a special case of actions and are described in the same way. All n-ary relations have an implicit method defined by a format and a meaning, but relations have only "direct objects".
For the more complicated case, where any of the prepositional phrases may appear in any order, the PPOBJECT record is used to pass the arguments to the method (see section 1.1).
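A sketch of how the prepositional phrases might be packaged is shown below. The record fields match pplist.icn, but the helper name and the assumption that the parsed phrases arrive in a table keyed by preposition are mine. A method implementation can then pick out just the phrases it needs (for example, ppod for the direct objects), regardless of the order in which the phrases were written.

record PPOBJECT(ppat,ppout,ppof,ppwith,ppod,ppfrom,ppto)    # as declared in pplist.icn

procedure sketch_new_ppobject(pp)
   # pp is assumed to map each preposition that was present ("at", "out",
   # "of", "with", "od", "from", "to") to the phrase that followed it;
   # a missing preposition simply leaves its field null
   return PPOBJECT(pp["at"], pp["out"], pp["of"], pp["with"],
                   pp["od"], pp["from"], pp["to"])
end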
In the simple case, the format is expressed as a list:

format = [class:1, ...]

For the more complicated case, the format is expressed as a proposition:
format = { MKR proposition }
The meaning is expressed as a proposition list:

meaning = { proposition list }

For methods implemented as a Unicon procedure, the meaning is expressed as the name of that procedure:
meaning = procedure_name
The map_word() procedure uses four tables to translate common words from DC, RDF, OWL, and MCF into the corresponding MKR words.
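A sketch of this lookup is shown below. The table names and the decision to return the word unchanged when no table knows it are illustrative assumptions; the two entries shown are just the translations used in the Mtriples example above.

global dc_word, rdf_word, owl_word, mcf_word

procedure sketch_init_word_tables()
   dc_word  := table()
   rdf_word := table()
   owl_word := table()
   mcf_word := table()
   # only two illustrative entries; the real tables are much larger
   rdf_word["type"]       := "isu"
   rdf_word["subClassOf"] := "iss"
end

procedure sketch_map_word(word)
   # return the MKR word if any of the four tables knows the source word;
   # otherwise return the word unchanged
   return \dc_word[word] | \rdf_word[word] | \owl_word[word] |
          \mcf_word[word] | word
end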
MKE includes the following commands:
do read tap from file done;
do read rdf from file done;
do read mcf from file done;