Parsing for text adventure games

I don’t know much about natural language processing, but if you’re interested in implementing your own parser you could always look up some information on parsing formal grammars. The simplest methods I’m aware of involve describing a grammar in Backus-Naur form and building parse trees from input phrases, which can be used on very limited sets of English (they are ideal for subject-verb-object forms, for example).

Basically, you run through your input phrase and classify individual tokens by lexical category, like noun and verb, then use Backus-Naur form (or some equivalent) to attempt to build a parse tree based on a system of formal rules. If you can successfully build a tree with a single nonterminal (the grammar’s start symbol) at the root and every input token as a leaf, you have a syntactically correct input and you can process it semantically to get the actual ‘meaning’ of the instruction.
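To make that a bit more concrete, here is a minimal sketch of the idea in Python. The lexicon, the category names, and the tiny two-rule grammar are all made up for illustration; a real game would need a much larger vocabulary and more productions.

```python
# Hypothetical lexicon: maps each known word to a lexical category.
LEXICON = {
    "take": "VERB", "open": "VERB",
    "the": "ARTICLE", "a": "ARTICLE",
    "lamp": "NOUN", "door": "NOUN",
}

def tokenize(phrase):
    """Split a phrase into (category, word) pairs; reject unknown words."""
    tokens = []
    for word in phrase.lower().split():
        if word not in LEXICON:
            raise ValueError(f"I don't know the word '{word}'.")
        tokens.append((LEXICON[word], word))
    return tokens

# Grammar assumed for this sketch:
#   <command>     ::= <verb> <noun-phrase>
#   <noun-phrase> ::= <article> <noun> | <noun>

def parse_noun_phrase(tokens, pos):
    """Return (subtree, next position), or raise ValueError if no noun follows."""
    children = []
    if pos < len(tokens) and tokens[pos][0] == "ARTICLE":
        children.append(tokens[pos])
        pos += 1
    if pos < len(tokens) and tokens[pos][0] == "NOUN":
        children.append(tokens[pos])
        return ("noun-phrase", children), pos + 1
    raise ValueError("Expected a noun.")

def parse_command(phrase):
    """Build a parse tree rooted at <command>, or raise ValueError."""
    tokens = tokenize(phrase)
    if not tokens or tokens[0][0] != "VERB":
        raise ValueError("Expected a verb.")
    noun_phrase, pos = parse_noun_phrase(tokens, 1)
    if pos != len(tokens):
        raise ValueError("Unexpected words after the command.")
    return ("command", [tokens[0], noun_phrase])
```

Calling parse_command("take the lamp") gives a tree rooted at the command rule, while something like "lamp take" raises an error because no rule matches it. The semantic step would then walk the tree to work out what the player actually wants to do.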

That’s just a basic rundown of a simple method which is often used to process formal languages. You should be able to find more information on, say, Wikipedia or in a language processing text.

If you’re interested in getting your hands dirty with interactive fiction without getting too much into language parsing theory, I’d really recommend checking out Inform 7 (you can probably google it for the web site) — it’s a language which compiles to Z-code and is expressed using a limited subset of English. Because it compiles to Z-code, compiled games can be played in any program which will execute Z-code, so you don’t have to worry about writing parsing code yourself.

For example, to define a room containing an angry midget, you would write something like:

The Midget Room is a room. The angry midget is a person in the Midget Room. The description of the Midget Room is “You are in a room. An angry midget is glowering at you.”

That may not be entirely correct, but that’s sort of how Inform 7 programs look. You should check it out if it sounds interesting. It’s really quite neat.

Regardless of whether you go with a full formal-grammar parser or something simpler, the important thing is to make sure that the subset of English you are parsing is sufficient for your needs.

Before you think about ignoring prepositions, consider the difference between “Give Jack the shotgun” and “Give the shotgun to Jack.” Same meaning, different noun orders. If the preposition is dropped and the latter is parsed as “Give the shotgun Jack,” which is interpreted as giving Jack to the shotgun, and the player simply receives the message, “I don’t think it would want that,” then they will be awfully confused.

You could decide that only things can be given and only people can be recipients, but then you might have trouble with an exchange like this, where the dog is the recipient in one command and the thing being given in the next:

  • Give the dog a dog toy.
  • Give the dog to the dangerously armed clown.

Whereas a BNF production like:

  • <give-command> ::= Give <recipient> <object> | Give <object> to <recipient>

Would automatically parse both forms as syntactically equivalent and allow you to attach the correct semantic information to the resulting parse tree without really putting in any additional work.
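As a rough illustration, the two “give” forms could be handled by something like the sketch below. The vocabulary, the helper names, and the dictionary it returns are my own assumptions rather than any standard approach, and it does not handle adjectives, so I have shortened the clown to just “the clown”.

```python
NOUNS = {"jack", "shotgun", "dog", "dog toy", "clown"}  # hypothetical game vocabulary
ARTICLES = {"the", "a", "an"}

def parse_noun_phrase(words, pos):
    """<noun-phrase> ::= [<article>] <noun>. Return (noun, next position) or None."""
    if pos < len(words) and words[pos] in ARTICLES:
        pos += 1
    # Try the longest candidate first so that "dog toy" wins over "dog".
    for end in range(len(words), pos, -1):
        candidate = " ".join(words[pos:end])
        if candidate in NOUNS:
            return candidate, end
    return None

def parse_give(phrase):
    """Return a dict with verb, object and recipient, or None if nothing matches."""
    words = phrase.lower().rstrip(".").split()
    if not words or words[0] != "give":
        return None
    first = parse_noun_phrase(words, 1)
    if first is None:
        return None
    first_noun, pos = first
    if pos < len(words) and words[pos] == "to":
        # Production: Give <object> to <recipient>
        second = parse_noun_phrase(words, pos + 1)
        if second and second[1] == len(words):
            return {"verb": "give", "object": first_noun, "recipient": second[0]}
        return None
    # Production: Give <recipient> <object>
    second = parse_noun_phrase(words, pos)
    if second and second[1] == len(words):
        return {"verb": "give", "object": second[0], "recipient": first_noun}
    return None
```

With this, “Give Jack the shotgun” and “Give the shotgun to Jack” both come out with the shotgun as the object and Jack as the recipient, and the dog ends up in the right role in both “Give the dog a dog toy” and “Give the dog to the clown.” The role comes from which production matched, not from guessing what kind of thing each noun is.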

You can use a simplified parser, and depending on what you’re doing, it may be fairly appropriate, but you will most likely end up with some weirdness at some point, and if you need to expand the parser at a later date — to understand the difference between “put the quarter on the vending machine” and “put the quarter in the vending machine,” for example — then you may find it exceedingly difficult to do so once a fair bit of coding has been done.
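Here is a small sketch of what keeping the preposition buys you. Everything in it (the preposition list, the function names, the response text) is invented for the example; the point is only that the parsed command carries the preposition along so the game logic can act on it.

```python
PREPOSITIONS = {"in", "on", "under"}  # hypothetical set; extend as needed
ARTICLES = {"the", "a", "an"}

def noun_text(words):
    """Drop a leading article and join the remaining words into one noun string."""
    if words and words[0] in ARTICLES:
        words = words[1:]
    return " ".join(words)

def parse_put(phrase):
    """Parse: put <object> <preposition> <target>. Return a dict or None."""
    words = phrase.lower().rstrip(".").split()
    if len(words) < 4 or words[0] != "put":
        return None
    # Look for the first preposition with at least one word on each side.
    for i in range(2, len(words) - 1):
        if words[i] in PREPOSITIONS:
            obj = noun_text(words[1:i])
            target = noun_text(words[i + 1:])
            if obj and target:
                return {"verb": "put", "object": obj,
                        "preposition": words[i], "target": target}
    return None

def respond(command):
    """Pick a response based on the preposition, which a cruder parser would discard."""
    if command["preposition"] == "in":
        return f"You insert the {command['object']} into the {command['target']}."
    if command["preposition"] == "on":
        return f"You set the {command['object']} on top of the {command['target']}."
    return f"You can't put the {command['object']} there."
```

If the parser had thrown the preposition away from the start, adding this distinction later would mean changing both the parser and everything that consumes its output.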

Ultimately, if it works well enough for you it works well enough for you, and I wouldn’t worry about it. Just be aware that more complete solutions already exist, although they can be fairly complicated.

 
