Voice of the Citizens

Scripting for Dummies (Expert)
By Carl Gustav Jung
Unlocking full potential with JSGF. This is not for the faint hearted, I warned you.
   
Introduction
The JSpeech Grammar Format (JSGF) is a platform-independent, vendor-independent textual representation of grammars for use in speech recognition. Grammars are used by speech recognizers to determine what the recognizer should listen for, and so describe the utterances a user may say. JSGF adopts the style and conventions of the Java™ Programming Language in addition to use of traditional grammar notations.

Read:
W3C Documentation[www.w3.org]
Basic Grammar from Graph
Since the people over at CMUSphinx didn't include the actual grammar used for their graph, the graph loses a lot of educational value. I'll fix that for you:


Here's the grammar:

#JSGF V1.0;
grammar GraphGrammar;

public <basicCmd> = <startPolite> <command> <endPolite>;
<command> = <action> <object>;
<action> = open | close | delete | move;
<object> = [the | a] (window | file | menu);
<startPolite> = (please | kindly | could you | oh mighty computer)*;
<endPolite> = [ please | thanks | thank you ];

As you can see, very little typing gives you a lot of possible sentences. This is especially useful if you don't want to force your users into one strict command like "please delete the file"; they can talk more naturally and say, for example:
  • "Could you please delete the file?"
  • "Could you delete the file?"
  • "Please delete the file."
  • "Delete the file."
  • ...
An actual, valid use case in gaming.
Arma3.

Have you ever tried commanding your team? It's tedious! All that scrolling and clicking... Good luck writing hundreds of commands for each possible task. Hundreds? Yes. Arma has color-coded teams, explicit unit control by number, explicit assignments, ...

Take this example:

  • "Everyone follow me"
  • "Unit one follow me"
  • "Unit two follow me"
  • "Unit three follow me"
  • ...
  • "Team blue follow me"
  • "Team red follow me"
  • ...

You catch my drift, right? Each of those would have to be implemented as an explicit command: one command per team color and action.

You'd need dozens of commands just to make a single colored team do all the actions like "follow, fall back, advance, ..."
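To put a rough number on that, here is a small C# sketch that writes out every explicit phrase for six team colors and a hand-picked list of four actions. The action list is shortened for illustration and is an assumption, not the full set Arma offers.

```csharp
using System;
using System.Collections.Generic;

// Six team colors; the action list is a shortened, illustrative assumption.
string[] teams = { "red", "blue", "green", "yellow", "white", "black" };
string[] actions = { "follow me", "fall back", "regroup", "hold fire" };

// Build one explicit command per team/action pair, the way you would
// have to without a grammar.
var explicitCommands = new List<string>();
foreach (string team in teams)
    foreach (string action in actions)
        explicitCommands.Add($"team {team} {action}");

// 6 teams x 4 actions = 24 commands, before units or "everyone" are counted.
Console.WriteLine(explicitCommands.Count);
```

Every extra action adds another six commands, and that is before numbered units enter the picture.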

Now, here's how JSGF makes that much better:

#JSGF V1.0;
grammar Arma3;

public <command5> = (everyone | ([team] <team> [and])* | ([unit] <number> [and])*) <object> [[team] <team>] {Command5};
public <command4> = <operation> [team <team>] [ unit <number>]{Command4};
public <command3> = (everyone | [team] <team> | [unit] <number>) <operation>+ {Command3};
public <command2> = <operation> {Command2};
public <command1> = [[unit] <number> [and]]+ <operation>+ {Command1};

<team> = (red | blue | green | yellow | white | black) ;
<formation> = (line | column | delta | diamond | file | vee | echelon right | echelon left | wedge | staggered column | column );
<direction> = (north | south | west | east)+;
<operation> = (select | move | with me | follow me | watch <direction> | engage | target | attack | heal | fall back| regroup | formation <formation> | <formation> formation | now | on zulu | get out | zulu go | fire at will | copy stance | stay low | no target | scan horizon | suppressive fire | disengage | engage at will | weapons free | fire at will | fire | hold fire | move to vehicle | get in vehicle);
<object> = ( is | are | there | now |get|in|that|to);
<number> =( one | two | three | four | five | six | seven | eight | nine | ten
  | eleven | twelve | thirteen | fourteen | fifteen | sixteen | seventeen | eighteen | nineteen
  | twenty | (twenty ( one | two | three | four | five | six | seven | eight | nine))
  | thirty | (thirty ( one | two | three | four | five | six | seven | eight | nine))
  | forty | (forty ( one | two | three | four | five | six | seven | eight | nine))
  | fifty | (fifty ( one | two | three | four | five | six | seven | eight | nine)) )*;

Using this grammar, you basically have all the options to assign units to teams, command units, teams or everyone, and let them do whatever you want, with just 5 defined commands.

Now you can say things like

  • "Team red is team blue" to re-assign the units of red to the blue team.
  • Combine multiple units/teams and issue them a command at the same time like "team red and blue fall back, team green advance, team yellow watch north-east."

It goes without saying that this greatly enhances your efficiency on the battlefield and saves you a lot of writing in the end.
Processing commands in your script.
Well, by now you must have asked yourself how on earth you're going to process all those possibilities in your script and execute the appropriate action. Well... brace yourself.

Easing the pain by tagging

JSGF supports tagging, and VOTC passes those tags to your script in addition to the string representation of the sentence/command/words the user spoke. Without tags, parsing every command that is valid in your JSGF file is close to impossible, or at some point far too expensive to compute. All the if/else's, ordering and sometimes even heuristics would make your script huge, slow and painful to read/write.

Tagging helps a lot. I told you to tag your grammar in the beginning.

For JSGF commands like this one

public <command2> = <operation> {Command2};

the tag "Command2" is enough. There are only a few dozen possible phrases the user could have said, so you can go straight ahead and string-match the spoken phrase against your list of "Operations" and execute the associated keystrokes.

Basic concepts

I'm assuming a Dictionary<string, Action> named OperationDictionary, where each key corresponds to one alternative in your <operation> JSGF rule and each value is a method/lambda that executes the keystrokes and gives TTS feedback.

if (tag == "Command2") OperationDictionary[sentence].Invoke();

Fair enough innit?
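Spelled out a bit more, the lookup could look like the sketch below. Press and Handle are hypothetical helpers and the key sequences are made up; none of this is VOTC API. It only shows the dictionary dispatch, using TryGetValue so that an unknown phrase doesn't throw.

```csharp
using System;
using System.Collections.Generic;

var log = new List<string>(); // records what would have been executed

// Hypothetical stand-in for "send keystrokes and give TTS feedback".
void Press(string keys) => log.Add(keys);

// Key = phrase from the <operation> rule, value = what to do.
// The key sequences are invented for this example.
var OperationDictionary = new Dictionary<string, Action>
{
    ["regroup"]   = () => Press("1 1"),
    ["fall back"] = () => Press("1 2"),
    ["hold fire"] = () => Press("3 1"),
};

// Returns true if the recognized sentence was handled.
bool Handle(string tag, string sentence)
{
    if (tag == "Command2" && OperationDictionary.TryGetValue(sentence, out var action))
    {
        action();
        return true;
    }
    return false;
}

Handle("Command2", "fall back");
```

Because the whole sentence is the key, multi-word operations like "fall back" work without any splitting.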

Now let's look at

public <command1> = ([unit] <number> [and])+ <operation> {Command1};

This one is too complex to process using only "Command1" as a tag, but we don't have any other tags assigned in the JSGF file. That's basically because when I wrote that grammar, I had almost no idea how to use JSGF and was happy it was valid JSGF. Don't. Be. Me.

Let's analyze that command so we can understand it better:

Remember, things in []'s are optional. We will discard [unit] and [and] as irrelevant and ignore them completely. That makes the parentheses around "[unit] <number> [and]" irrelevant as well.

We have one or more units according to
<number>+
and one operation according to
<operation>


Which means, we could have commands like
  • "one, two, three, advance"
  • "one advance"

Let's tag <number> and <operation>. We will use {Number} and {Operation} as tags and put them at the end of each associated rule, BUT before the semicolon (;)

<operation> = (select | move | with me | follow me | watch <direction> | engage | target | attack | heal | fall back| regroup | formation <formation> | <formation> formation | now | on zulu | get out | zulu go | fire at will | copy stance | stay low | no target | scan horizon | suppressive fire | disengage | engage at will | weapons free | fire at will | fire | hold fire | move to vehicle | get in vehicle) {Operation};

<number> =( one | two | three | four | five | six | seven | eight | nine | ten | eleven | twelve | thirteen | fourteen | fifteen | sixteen | seventeen | eighteen | nineteen | twenty | (twenty ( one | two | three | four | five | six | seven | eight | nine)) | thirty | (thirty ( one | two | three | four | five | six | seven | eight | nine)) | forty | (forty ( one | two | three | four | five | six | seven | eight | nine)) | fifty | (fifty ( one | two | three | four | five | six | seven | eight | nine)) )* {Number};

Now we'll get more tags back.

Which means commands like
  • "one, two, three, advance"
  • "one advance"

will also be fully represented by tags like
  • "Number Number Number Operation Command1"
  • "Number Operation Command1"

All you have to do now is split the sentence into numbers and an operation and execute each keystroke.

In this example, it'd be as easy as doing

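foreach (string word in sentence.Split(' '))
{
    // Look each word up in both dictionaries; words in neither (like "unit" or "and") are ignored.
    if (NumberDictionary.ContainsKey(word)) NumberDictionary[word].Invoke();
    if (OperationDictionary.ContainsKey(word)) OperationDictionary[word].Invoke();
}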

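Additionally, you could strip the words we deemed useless, like "unit" and "and", from the sentence up front. If you also keep numbers and operations in a single dictionary, you can then skip the ContainsKey check.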
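One thing to watch out for: splitting on spaces only works while every key is a single word, and the grammar contains phrases like "fall back" and "twenty one". A minimal sketch of one way around that is greedy longest-phrase matching against the phrases you know. Tokenize and the known set are hypothetical names for this example; in a real script the set would be the keys of your dictionaries.

```csharp
using System;
using System.Collections.Generic;

// Known phrases (assumption: a small sample taken from the grammar).
var known = new HashSet<string>
{
    "one", "two", "twenty", "twenty one", "fall back", "regroup", "watch north"
};

// Greedy longest match: at each position, try the longest run of words first.
List<string> Tokenize(string sentence, int maxWords = 3)
{
    string[] words = sentence.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    var phrases = new List<string>();
    int i = 0;
    while (i < words.Length)
    {
        int matched = 0;
        for (int len = Math.Min(maxWords, words.Length - i); len >= 1; len--)
        {
            string candidate = string.Join(" ", words, i, len);
            if (known.Contains(candidate))
            {
                phrases.Add(candidate);
                matched = len;
                break;
            }
        }
        // Unknown words such as "unit" or "and" are skipped one at a time.
        i += matched > 0 ? matched : 1;
    }
    return phrases;
}

// Prints: one | twenty one | fall back
Console.WriteLine(string.Join(" | ", Tokenize("unit one and twenty one fall back")));
```

The filler words fall out for free here, since anything that isn't a known phrase is simply skipped.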

Also, a Dictionary<string, Dictionary<string, Action>> might work better in complex scripts where you could then do things like

Dict[Tag][Word].Invoke()
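As a rough sketch of that idea: the outer key is the tag from the grammar and the inner key is the spoken phrase. This assumes the phrases and tags reach your script already paired up, one tag per phrase, which may not match how VOTC actually delivers them. Dispatch and the logged strings are placeholders.

```csharp
using System;
using System.Collections.Generic;

var log = new List<string>(); // stands in for keystrokes / TTS feedback

// Outer key = tag, inner key = spoken phrase. Actions are placeholders.
var Dict = new Dictionary<string, Dictionary<string, Action>>
{
    ["Number"] = new Dictionary<string, Action>
    {
        ["one"] = () => log.Add("select unit 1"),
        ["two"] = () => log.Add("select unit 2"),
    },
    ["Operation"] = new Dictionary<string, Action>
    {
        ["regroup"] = () => log.Add("order regroup"),
    },
};

// Assumption: each phrase arrives together with its tag.
void Dispatch(IEnumerable<(string, string)> pairs)
{
    foreach (var (tag, word) in pairs)
    {
        // Unknown tags or phrases are ignored instead of throwing.
        if (Dict.TryGetValue(tag, out var byWord) && byWord.TryGetValue(word, out var action))
            action();
    }
}

Dispatch(new[] { ("Number", "one"), ("Number", "two"), ("Operation", "regroup") });
```

The upside of keying on the tag first is that the same word can mean different things under different tags without any extra if/else.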