CodeWorker

From MetaSharp

Article Author(s): Alune Phaxay
All Rights Reserved.


Introduction

A Domain-Specific Language (DSL) is a small language designed to solve problems in one particular domain.

CodeWorker (CW) is a parsing tool and source code generator for generative programming. Its story started in 1996, and Cedric Lemaire is still improving it.

CW key features are:

  • it increases productivity and source code quality
  • it parses any DSL file and generates source code in any language (C#, C++, ...)
  • it is free and open source (GNU Lesser General Public License)
  • it can be used from C++, Java and .NET applications (API available)
  • syntax highlighting is available for Eclipse and jEdit

Source Code

Installation

To install CW, first download it here.

Copy "CodeWorker.exe" and "libcurl.dll" into a directory of your choice, for example "C:\Program Files\CodeWorker\".

Then add that directory to the PATH environment variable (Control Panel -> System -> Advanced -> Environment Variables -> Path).

How does CW work?

CodeWorker works with 3 kinds of scripts:

  • an extended-BNF parse script, for parsing text ("*.cwp")
  • a template-based script, for text/code generation ("*.cwt")
  • a common script for execution ("*.cws")


An EBNF-like parsing script

EBNF notation

CW is based on the EBNF notation, modified for CodeWorker.

You can use C#-style comments (// and /* something */) in your rules file.

charA ::= 'a'; // a rule matching 1 character 'a'

A rule is a simple thing. It has:

  • a name of your choice (charA here)
  • an assignment operator (::=)
  • a matching pattern of your choice ('a' here)
  • a terminator (;)

You probably guessed it: all the difficulty lies in the matching pattern. Here is what you should know about matching patterns:

  • 'a' | 'b' means character 'a' or character 'b'
  • 'a'..'z' means any lowercase alphabetic character (in other words: from a to z in the ASCII table)
  • [] defines a box containing a matching pattern

Let's see some common rules we use:

alpha     ::= 'a'..'z' | 'A'..'Z';
num       ::= '0'..'9';
alphanum  ::= alpha | num;
alphanums ::= [alphanum]+; // 1 or more consecutive alphanum

Just as we used "+" here, a box can take other repetition counts:

alphanumszero     ::= [alphanum]*; // 0 or more consecutive alphanum
alphanumornothing ::= [alphanum]?; // 0 or 1 alphanum
twoalphanums      ::= [alphanum]2; // exactly 2 consecutive alphanum
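
To make the meaning of ranges, alternatives and repetition counts concrete, here is a minimal sketch in Python (not CodeWorker code). The helper names char_range, alt and repeat are invented for this illustration; each matcher takes a text and a position and returns the position after the match, or None when it fails.

```python
# Illustration only: a tiny Python model of the EBNF patterns above.
# char_range, alt and repeat are hypothetical helpers, not CodeWorker functions.

def char_range(lo, hi):
    """Match one character between lo and hi, like 'a'..'z'."""
    def match(text, pos):
        if pos < len(text) and lo <= text[pos] <= hi:
            return pos + 1
        return None
    return match

def alt(*patterns):
    """Match the first alternative that succeeds, like p1 | p2."""
    def match(text, pos):
        for pattern in patterns:
            end = pattern(text, pos)
            if end is not None:
                return end
        return None
    return match

def repeat(pattern, minimum=0, maximum=None):
    """Match pattern from minimum to maximum times, like [p]+, [p]*, [p]? or [p]2."""
    def match(text, pos):
        count = 0
        while maximum is None or count < maximum:
            end = pattern(text, pos)
            if end is None or end == pos:  # stop on failure or on an empty match
                break
            pos = end
            count += 1
        return pos if count >= minimum else None
    return match

alpha = alt(char_range('a', 'z'), char_range('A', 'Z'))
num = char_range('0', '9')
alphanum = alt(alpha, num)
alphanums = repeat(alphanum, minimum=1)                # [alphanum]+
alphanumszero = repeat(alphanum)                       # [alphanum]*
alphanumornothing = repeat(alphanum, maximum=1)        # [alphanum]?
twoalphanums = repeat(alphanum, minimum=2, maximum=2)  # [alphanum]2

print(alphanums("abc123 rest", 0))  # prints 6, the position after "abc123"
```

Note that [alphanum]* succeeds even when nothing matches (it returns the starting position), while [alphanum]+ and [alphanum]2 fail in that case.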

Rules can also use some directives like the following:

#ignore(blanks) // ignores spaces, tabs and newlines

Let's see a more complicated rule now:

myRule ::= #ignore(blanks)
           alphanums:myVar     // : stores in myVar a matched alphanums rule
           #empty              // means the end of the file
           =>                  // "=>" introduces a scripting language zone
           {                   // "{}" defines the scripting language zone bounds (used if more than 1 line)
             traceLine(myVar); // outputs on screen the value stored in myVar
             traceLine("end");
           };

The CW script language

The CW scripting language is large and powerful; this article only covers a few basics.

Documentation

Here is a scripting language overview.

Some basic functions:

traceLine(str);         // prints a string on the screen
saveProject(outFile);   // saves the tree in XML format

Here is an overview of functions available in CW.

Basis

How to manipulate variables

The following script shows how to declare, copy, merge and reference variables:

// Simple type
local	myString = "foo";
local	myInt = 10;
local	myDouble = 10.5;

// Array
local	myTable;
insert	myTable["red"] = "red color";
insert	myTable["green"] = "green color";

// Tree representing a computer
local	myComputer;
insert	myComputer.processor = "athlon 2000+";
insert	myComputer.ram = "Samsung";
insert	myComputer.ram.capacity = "2giga";
insert	myComputer.hd = "seagate";
insert	myComputer.hd.space = "200giga";

// Imagine that I want to buy a new computer with the same configuration but a faster processor
local	myNewComputer;
setall	myNewComputer = myComputer;
// setall copies the whole tree (insert would copy only the node's value)
myNewComputer.processor = "athlon 2800+";
// Output:
// myNewComputer.processor = "athlon 2800+"
// myNewComputer.ram = "Samsung"
// myNewComputer.ram.capacity = "2giga"
// myNewComputer.hd = "seagate"
// myNewComputer.hd.space = "200giga"

// Imagine that you want to add a new graphics card to this computer
local	emptyComputer;
insert	emptyComputer.graphicCard = "the last 3D graphic card";
// merge combines the two trees: the nodes of emptyComputer are added to myNewComputer
merge	myNewComputer = emptyComputer;
// Output:
// myNewComputer.processor = "athlon 2800+"
// myNewComputer.ram = "Samsung"
// myNewComputer.ram.capacity = "2giga"
// myNewComputer.hd = "seagate"
// myNewComputer.hd.space = "200giga"
// myNewComputer.graphicCard = "the last 3D graphic card"

// here is a reference
local myReference;
ref myReference = myNewComputer;
// or shorter
localref myReference = myNewComputer;

local a = 10;
local b = 50;
local c = a + b;
local d = $a + b$;     // the expression is wrapped between two '$'
traceLine(a + b);	// prints 1050
traceLine(c);		// prints 1050
traceLine($a + b$);	// prints 60
traceLine(d);		// prints 60
// between two '$', the expression is evaluated as a number; otherwise the operands are treated as strings
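
The difference between the two forms of a + b can be sketched in Python (not CodeWorker code), assuming that values behave as strings outside the '$' markers, as the printed results above suggest. The function names concat and numeric_add are invented for this illustration.

```python
# Illustration only: CodeWorker values are modelled here as Python strings.
a = "10"
b = "50"

def concat(x, y):
    """Models a + b outside '$': the operands are joined as strings."""
    return x + y

def numeric_add(x, y):
    """Models $a + b$: the operands are read as numbers before adding."""
    return str(int(x) + int(y))

print(concat(a, b))       # prints 1050
print(numeric_add(a, b))  # prints 60
```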

A common script for execution

Here is a simple example of a .cws execution file:

// run the MyProject.cwp script on the file given as first argument and store all the data in project
parseAsBNF("./MyProject.cwp", project, _ARGS[0]);
local treeFile = getWorkingPath() + _ARGS[0] + ".ktree";
// save the AST in an XML file
saveProject(treeFile);
// print the XML file on the screen
traceLine(loadFile(treeFile));

This part also uses the CW scripting language.

A template-based script, for text/code generation

Example

We will create a simple class generator.

Create a Domain Specific Language (DSL) file

This is a simple DSL file. It describes classes with their methods and properties.

  • First we describe the main class animal with 3 methods.

[animal]
methods=[eat,drink,sleep];

  • Next we describe 2 classes derived from animal; tiger adds other methods, and both define properties.

[tiger]
methods=[scratch,howl];
properties=[name="tigrou",color="yellow"];
father=animal;
[turtle]
properties=[name="little",color="green"];
father=animal;

Parse DSL file

Now we parse the DSL file and print the values.

classList ::=
       => traceLine("* parsing DSL file");
       #ignore(blanks)
       [class]*
       #empty
       => traceLine("* parsing finished");
;
class ::=
       className
       [
               methods
       |       properties
       |       father
       ]*
       => traceLine("");
;
className ::= 
       LBRACET
       chars:value
       RBRACET
       => traceLine("Class name : " + value);
;
methods ::=
       "methods"
       EQUAL
       LBRACET
       [
               [COMMA]?
               method
       ]+
       RBRACET
       SEMICOLON
;
method ::=
       chars:value
       => traceLine("Method value: " + value);
;
properties ::=
       "properties"
       EQUAL
       [
               LBRACET
               [
                       [COMMA]?
                       property
               ]+
               RBRACET
       ]:value
       SEMICOLON
       => traceLine("Properties value: " + value);
;
property ::= 
       chars:name
       => traceLine("Property name: " + name);
       EQUAL
       string:value
       => traceLine("Property value: " + value);
;
father ::=
       "father"
       EQUAL
       chars:value
       => traceLine("Father value: " + value);
       SEMICOLON
;
char ::= 'a'..'z' | 'A'..'Z';
chars ::= [char]+;
digit ::= '0'..'9';
digits ::= [digit]+;
alphanum ::= char | digit;
alphanums ::= [alphanum]+;
string ::= '"' alphanums '"';
COMMA ::= ',';
LBRACET ::= '[';
RBRACET ::= ']';
SEMICOLON ::= ';';
EQUAL ::= '=';

(Question: we use [] to define a box containing a matching pattern in 2 particular cases. Which ones? If you find more, mail me ;) )

  • We execute the script: "codeworker -nologo -parseBNF classGenerator.cwp simpledsl.txt"

Output:

Class name : animal
Method value: eat
Method value: drink
Method value: sleep
Class name : tiger
Method value: scratch
Method value: howl
Property name: name
Property value: "tigrou"
Property name: color
Property value: "yellow"
Properties value: [name="tigrou",color="yellow"]
Father value: animal
Class name : turtle
Property name: name
Property value: "little"
Property name: color
Property value: "green"
Properties value: [name="little",color="green"]
Father value: animal
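
For readers who want to compare with a general-purpose language, here is a minimal Python sketch (not CodeWorker code) that reads the same sample DSL and prints the same kind of trace. It is a simplification: it works line by line with regular expressions, whereas the grammar above ignores blanks anywhere. The function name parse_dsl and the dictionary layout are choices made for this illustration.

```python
import re

# Illustration only: a small Python reader for the sample DSL.
SAMPLE = '''
[animal]
methods=[eat,drink,sleep];
[tiger]
methods=[scratch,howl];
properties=[name="tigrou",color="yellow"];
father=animal;
[turtle]
properties=[name="little",color="green"];
father=animal;
'''

def parse_dsl(text):
    """Return a list of classes, each a dict with name, methods, properties, father."""
    classes = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = re.fullmatch(r'\[([A-Za-z]+)\]', line)
        if header:
            # a new class starts, like the className rule
            current = {"name": header.group(1), "methods": [],
                       "properties": [], "father": ""}
            classes.append(current)
            continue
        entry = re.fullmatch(r'(methods|properties|father)=(.*);', line)
        if entry is None or current is None:
            raise ValueError("unexpected line: " + line)
        key, value = entry.groups()
        if key == "father":
            current["father"] = value
        elif key == "methods":
            current["methods"] = value.strip("[]").split(",")
        else:
            for item in value.strip("[]").split(","):
                name, prop_value = item.split("=", 1)
                # the value keeps its quotes, as in the trace above
                current["properties"].append({"name": name, "value": prop_value})
    return classes

for cls in parse_dsl(SAMPLE):
    print("Class name : " + cls["name"])
    for method in cls["methods"]:
        print("Method value: " + method)
    for prop in cls["properties"]:
        print("Property name: " + prop["name"])
        print("Property value: " + prop["value"])
    if cls["father"]:
        print("Father value: " + cls["father"])
```

The grammar-based version is shorter to extend: adding a new kind of entry means adding one rule, while this sketch needs a new branch in the parsing function.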

Creating an Abstract Syntax Tree (AST)

  • Now we create the AST:


classList ::=
       => traceLine("* parsing DSL file");
       #ignore(blanks)
       [class]*
       #empty
       =>
       {
               traceLine("* parsing finished");
               traceLine("* remove temporary variables");
               removeVariable(project.tmpClass);
               removeVariable(project.tmpMethod);
               removeVariable(project.tmpProperty);
               traceLine("* save the project");	
               // the global variable project is saved in the "tree.xml" file
               saveProject("tree.xml");
       }
;
class ::=
       // clear the temporary variable that will contain values
       => clearVariable(project.tmpClass);
       #ignore(blanks)
       className
       [
               methods
       |       properties
       |       father
       ]*
       => 
       {
               // add a new item on the array
               pushItem project.classes;
               // copy the parsed values into the last array element
               setall project.classes#back = project.tmpClass;
       }
       => traceLine("");
;
className ::=
       LBRACET
       chars:value
       =>
       {
               // insert a new value
               insert project.tmpClass.name = value;
               traceLine("Class name : " + project.tmpClass.name);
       }
       RBRACET
;
methods ::=
       "methods"
       EQUAL
       LBRACET
       [
               [COMMA]?
               method
               => 
               {
                       pushItem project.tmpClass.methods;
                       setall project.tmpClass.methods#back = project.tmpMethod;
               }
       ]+
       RBRACET
       SEMICOLON
;
method ::=
       => clearVariable(project.tmpMethod);
       chars:value
       => 
       {
               insert project.tmpMethod = value;
               traceLine("Method value: " + value);
       }
;
properties ::=
       "properties"
       EQUAL
       [
               LBRACET
               [
                       [COMMA]?
                       property
                       => 
                       {
                               pushItem project.tmpClass.properties;
                               setall project.tmpClass.properties#back = project.tmpProperty;
                       }
               ]+
               RBRACET
       ]
       SEMICOLON
;
property ::= 
       => clearVariable(project.tmpProperty);
       chars:name
       => 
       {
               insert project.tmpProperty.name = name;
               traceLine("Property name: " + name);
       }
       EQUAL
       string:value
       =>
       {
               insert project.tmpProperty.value = value;
               traceLine("Property value: " + value);
       }
;
father ::=
       "father"
       EQUAL
       chars:project.tmpClass.father
       => traceLine("Father value: " + project.tmpClass.father);
       SEMICOLON
;
char ::= 'a'..'z' | 'A'..'Z';
chars ::= [char]+;
digit ::= '0'..'9';
digits ::= [digit]+;
alphanum ::= char | digit;
alphanums ::= [alphanum]+;
string ::= '"' alphanums '"';
COMMA		::= ',';
LBRACET		::= '[';
RBRACET		::= ']';
SEMICOLON	::= ';';
EQUAL		::= '=';

  • The result in the "tree.xml" file:


<project>
       <classes>
               <__ARRAY_ENTRY __KEY="0">
                       <name __VALUE="animal" />
                       <methods>
                               <__ARRAY_ENTRY __KEY="0" __VALUE="eat" />
                               <__ARRAY_ENTRY __KEY="1" __VALUE="drink" />
                               <__ARRAY_ENTRY __KEY="2" __VALUE="sleep" />
                       </methods>
               </__ARRAY_ENTRY>
               <__ARRAY_ENTRY __KEY="1">
                       <name __VALUE="tiger" />
                       <methods>
                               <__ARRAY_ENTRY __KEY="0" __VALUE="scratch" />
                               <__ARRAY_ENTRY __KEY="1" __VALUE="howl" />
                       </methods>
                       <properties>
                               <__ARRAY_ENTRY __KEY="0">
                                       <name __VALUE="name" />
                                       <value __VALUE="&quot;tigrou&quot;" />
                               </__ARRAY_ENTRY>
                               <__ARRAY_ENTRY __KEY="1">
                                       <name __VALUE="color" />
                                       <value __VALUE="&quot;yellow&quot;" />
                               </__ARRAY_ENTRY>
                       </properties>
                       <father __VALUE="animal" />
               </__ARRAY_ENTRY>
               <__ARRAY_ENTRY __KEY="2">
                       <name __VALUE="turtle" />
                       <properties>
                               <__ARRAY_ENTRY __KEY="0">
                                       <name __VALUE="name" />
                                       <value __VALUE="&quot;little&quot;" />
                               </__ARRAY_ENTRY>
                               <__ARRAY_ENTRY __KEY="1">
                                       <name __VALUE="color" />
                                       <value __VALUE="&quot;green&quot;" />
                               </__ARRAY_ENTRY>
                       </properties>
                       <father __VALUE="animal" />
               </__ARRAY_ENTRY>
       </classes>
       <tmpClass />
</project>
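
As a rough illustration of how such a tree could be consumed outside CodeWorker, here is a Python sketch that reads a shortened copy of it with the standard xml.etree.ElementTree module. It assumes the saved file is well-formed XML, with the quotes inside values written as &quot;. The function name read_classes is invented for this example.

```python
import xml.etree.ElementTree as ET

# Illustration only: a shortened copy of the tree, with inner quotes
# written as &quot; so that the fragment is well-formed XML (an assumption).
TREE = '''<project>
  <classes>
    <__ARRAY_ENTRY __KEY="0">
      <name __VALUE="animal" />
      <methods>
        <__ARRAY_ENTRY __KEY="0" __VALUE="eat" />
        <__ARRAY_ENTRY __KEY="1" __VALUE="drink" />
      </methods>
    </__ARRAY_ENTRY>
    <__ARRAY_ENTRY __KEY="1">
      <name __VALUE="tiger" />
      <properties>
        <__ARRAY_ENTRY __KEY="0">
          <name __VALUE="name" />
          <value __VALUE="&quot;tigrou&quot;" />
        </__ARRAY_ENTRY>
      </properties>
      <father __VALUE="animal" />
    </__ARRAY_ENTRY>
  </classes>
</project>'''

def read_classes(xml_text):
    """Turn the <classes> array into a list of plain dictionaries."""
    root = ET.fromstring(xml_text)
    result = []
    for entry in root.findall("./classes/__ARRAY_ENTRY"):
        father = entry.find("father")
        result.append({
            "name": entry.find("name").get("__VALUE"),
            "methods": [m.get("__VALUE")
                        for m in entry.findall("./methods/__ARRAY_ENTRY")],
            "properties": {
                p.find("name").get("__VALUE"): p.find("value").get("__VALUE")
                for p in entry.findall("./properties/__ARRAY_ENTRY")
            },
            "father": father.get("__VALUE") if father is not None else "",
        })
    return result

for cls in read_classes(TREE):
    print(cls["name"], cls["methods"], cls["properties"], cls["father"])
```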

Generate output code

To generate C# code, we use a template: classGenerator.cwt

As in PHP, markers switch between script mode and text mode:

<% : switches to script mode

%> : switches back to text mode

@ : toggles the mode (script to text and text to script)

public class @this.name@<%
if this.father != "" {
%> : @this.father@<%
}
%>
{
<% foreach method in this.methods { %>
	public void @method@()
	{
	}<%
	%>
<% } %><% foreach property in this.properties { %>
	private string _@property.name@<%
	if property.value != "" {
	%> = @property.value@<% } %>;
	public string @property.name@
	{
		set { _@property.name@ = value; }
		get { return _@property.name@; }
	}
<% } %>
}
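
To show what the template computes, here is a minimal Python sketch (not CodeWorker code) that builds the same kind of C# text from a class description. The function name render_class and the dictionary layout are assumptions made for this illustration; the exact blank lines produced by the real template may differ.

```python
# Illustration only: a Python function that mimics what the template does.
def render_class(cls):
    """Build C# source text for one class description (a plain dict)."""
    header = "public class " + cls["name"]
    if cls.get("father"):
        # like: if this.father != "" { ... }
        header += " : " + cls["father"]
    lines = [header, "{"]
    for method in cls.get("methods", []):
        # like: foreach method in this.methods
        lines += ["\tpublic void " + method + "()", "\t{", "\t}"]
    for prop in cls.get("properties", []):
        # like: foreach property in this.properties
        field = "\tprivate string _" + prop["name"]
        if prop.get("value"):
            field += " = " + prop["value"]
        lines.append(field + ";")
        lines += [
            "\tpublic string " + prop["name"],
            "\t{",
            "\t\tset { _" + prop["name"] + " = value; }",
            "\t\tget { return _" + prop["name"] + "; }",
            "\t}",
        ]
    lines.append("}")
    return "\n".join(lines)

tiger = {
    "name": "tiger",
    "father": "animal",
    "methods": ["scratch", "howl"],
    "properties": [{"name": "name", "value": '"tigrou"'}],
}
print(render_class(tiger))
```

The template keeps the output text readable as text, with small script islands inside it, whereas this function assembles the text from string pieces. That is the main reason a template is more comfortable for larger outputs.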

To execute this template file, we need a script file: classGenerator.cws

It takes 3 arguments:

  • the DSL file to parse
  • the output directory where the .cs files will be saved
  • the debug directory where the AST will be saved

local input;
local outputDir;
local debugDir;
// default values, used when an argument is missing
if (_ARGS[0] != "") { input = _ARGS[0]; } else { input = "input"; }
if (_ARGS[1] != "") { outputDir = _ARGS[1]; } else { outputDir = "output"; }
if (_ARGS[2] != "") { debugDir = _ARGS[2]; } else { debugDir = "_debug"; }
traceLine("* Processing " + input + " file\n");
traceLine("** parsing structure description file");
parseAsBNF("./classGenerator.cwp", project, input);
local shortFileName = getShortFilename(input);
local treeFile = getWorkingPath() + debugDir + "/" + shortFileName + ".ktree";
traceLine("** save the project");		
saveProject(treeFile);
insert project.name = shortFileName;
insert project.header = shortFileName;
insert project.filenamesansext = toUpperString(rsubString(shortFileName, 4));
foreach class in project.classes {
       local outFile = outputDir + "/" + toLowerString(class.name) + ".cs";
       traceLine("** generate file: " + outFile);
       generate("./classGenerator.cwt", class, outFile);
}

We execute this file: codeworker.exe -nologo -script classGenerator.cws -args simpledsl.txt

The result is 3 files:

  • animal.cs:

public class animal
{
       public void eat()
       {
       }
       public void drink()
       {
       }
       public void sleep()
       {
       }
}

  • tiger.cs:

public class tiger : animal
{
       public void scratch()
       {
       }
       public void howl()
       {
       }
       private string _name = "tigrou";
       public string name
       {
               set { _name = value; }
               get { return _name; }
       }
       private string _color = "yellow";
       public string color
       {
               set { _color = value; }
               get { return _color; }
       }
}

  • turtle.cs:

public class turtle : animal
{
       private string _name = "little";
       public string name
       {
               set { _name = value; }
               get { return _name; }
       }
       private string _color = "green";
       public string color
       {
               set { _color = value; }
               get { return _color; }
       }
}

Conclusion

You are free to build your own domain-specific language.

Parsing a file, creating an AST and generating code is easy.

Many topics are not covered in this article, but its aim is to help you understand CW faster.

I will improve this article little by little to make it easier to understand.

Don't hesitate to send me comments!

  • Download the tutorial source files there.

Misc

CodeWorker .NET API

  • First, download it.
  • The following example shows how to use it:

using System;
using CodeWorker;
namespace CodeWorkerExample
{
   class Program
   {
       // dedicated object used only for locking
       static readonly object synchro = new object();
       static void Main(string[] args)
       {
           // initialization of CodeWorker
           CodeWorker.Main.initialize();
           CompiledBNFScript cwBnfScript = new CompiledBNFScript();
           cwBnfScript.buildFromFile("parser2.cwp");

           ParseTree context = new ParseTree();
           try
           {
               // Be careful: CodeWorker is not multi-threaded.
               // Locking on a shared object lets several threads use it, one at a time.
               lock (Program.synchro)
               {
                   // parse the file and put the result in the context variable
                   cwBnfScript.parse(context, "input.dsl");

                   // if you want to parse a string, use parseString
                   // cwBnfScript.parseString(context, txt2Parse);
               }
               // write the data stored in context to the console
               ParseTree classesNode = context.getNode("classes");
               if (classesNode != null)
               {
                   ParseTree[] classArray = classesNode.array;
                   foreach (ParseTree classNode in classArray)
                   {
                       Console.WriteLine("class: name=" + classNode.getNode("name").text);
                       ParseTree methodsNode = classNode.getNode("methods");
                       if (methodsNode != null)
                       {
                           ParseTree[] methodArray = methodsNode.array;
                           foreach (ParseTree methodNode in methodArray)
                               Console.WriteLine("method: name=" + methodNode.text);
                       }
                   }
               }
           }
           catch (Exception e)
           {
               Console.WriteLine(" [CWExample] parsing error:" + e.Message);
           }
           // termination of CodeWorker
           CodeWorker.Main.terminate();
       }
   }
}

  • A C# solution example using the CodeWorker .NET API is available here.
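
The locking trick used in the C# example is not specific to .NET. Here is a minimal Python sketch of the same idea, where a hypothetical FakeParser class stands in for a parser that must not be entered by two threads at once. Nothing in it belongs to the CodeWorker API.

```python
import threading

# Illustration only: FakeParser stands in for a parser that is not thread-safe.
class FakeParser:
    def __init__(self):
        self.busy = False
        self.parsed = []

    def parse(self, name):
        # A real non-thread-safe parser could corrupt its state here
        # if two threads entered at the same time.
        assert not self.busy, "two threads inside the parser at once"
        self.busy = True
        self.parsed.append(name)
        self.busy = False

parser = FakeParser()
parser_lock = threading.Lock()  # one shared lock, like Program.synchro

def worker(name):
    with parser_lock:  # only one thread may use the parser at a time
        parser.parse(name)

threads = [threading.Thread(target=worker, args=("input%d.dsl" % i,))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(parser.parsed))
```

The important point is that every thread uses the same lock object. A lock created inside the worker function would protect nothing.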

For more information, check this page.



Contributor(s): Audric Thevenet
