CodeWorker
From MetaSharp
Article Author(s): Alune Phaxay
All Rights Reserved.
Introduction
A Domain Specific Language (DSL) is a small language created to solve problems in one particular domain.
CodeWorker (CW) is a parsing tool and source code generator for generative programming. Its story starts in 1996, and Cedric Lemaire still improves it.
CW key features are:
- increasing the productivity and the source code quality
- parsing any DSL file and generating source code in any language (C#, C++, ...)
- it is free and open source (GNU Lesser General Public License)
- usable from C++, Java and .NET applications (API available)
- syntax highlighting with Eclipse and jEdit
Source Code
Installation
To install CW, you first need to download it here.
Copy "CodeWorker.exe" and "libcurl.dll" into a directory, for example "C:\Program Files\CodeWorker\".
Then add that directory to the PATH environment variable.
(Control Panel -> System -> Advanced -> Environment Variables -> Path)
How does CW work?
In CodeWorker, there are 3 parts:
- an extended-BNF parse script, for parsing text ("*.cwp")
- a template-based script, for text/code generation ("*.cwt")
- a common script for execution ("*.cws")
An EBNF-like parsing script
EBNF notation
CW is based on the EBNF notation, modified for CodeWorker.
You can use C#-style comments, // and /* something */, in your rules file.
charA ::= 'a'; // a rule matching 1 character 'a'
A rule is a simple thing. It has:
- a name of your choice (charA here)
- an assignment operator (::=)
- a matching pattern of your choice ('a' here)
- a terminator (;)
You probably guessed it: all the difficulty lies in the matching pattern. Let's have a look at what you should know about matching patterns:
- 'a' | 'b' means character 'a' or character 'b'
- 'a'..'z' means any lowercase alphabetic character (in other words: from a to z in the ASCII table)
- [] defines a box containing a matching pattern
Let's see some common rules we use:
alpha ::= 'a'..'z' | 'A'..'Z';
num ::= '0'..'9';
alphanum ::= alpha | num;
alphanums ::= [alphanum]+; // 1 or more consecutive alphanum
The same way we used a "+" here, we can define a box repetition count in some other ways:
alphanumszero ::= [alphanum]*; // 0 or more consecutive alphanum
alphanumornothing ::= [alphanum]?; // 0 or 1 alphanum
twoalphanums ::= [alphanum]2; // exactly 2 consecutive alphanum
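For readers who know regular expressions, the repetition counts above can be compared to regex quantifiers. The Python sketch below is only an analogy: the constant names and the matches helper are illustrative and are not part of CodeWorker, which has its own matching engine.

```python
import re

# Rough regex counterparts of the CodeWorker rules shown above.
# These names are illustrative; they do not exist in CodeWorker.
ALPHANUM = r"[a-zA-Z0-9]"                # alphanum ::= alpha | num;
ALPHANUMS = ALPHANUM + r"+"              # [alphanum]+ : 1 or more
ALPHANUMS_ZERO = ALPHANUM + r"*"         # [alphanum]* : 0 or more
ALPHANUM_OR_NOTHING = ALPHANUM + r"?"    # [alphanum]? : 0 or 1
TWO_ALPHANUMS = ALPHANUM + r"{2}"        # [alphanum]2 : exactly 2


def matches(pattern: str, text: str) -> bool:
    """Return True when the whole text is matched by the pattern."""
    return re.fullmatch(pattern, text) is not None


if __name__ == "__main__":
    print(matches(ALPHANUMS, "abc123"))    # True
    print(matches(ALPHANUMS, ""))          # False
    print(matches(ALPHANUMS_ZERO, ""))     # True
    print(matches(TWO_ALPHANUMS, "a1"))    # True
    print(matches(TWO_ALPHANUMS, "a12"))   # False
```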
Rules can also use some directives like the following:
#ignore(blanks) // ignores spaces, tabs and newlines
Let's see a more complicated rule now:
myRule ::= #ignore(blanks)
alphanums:myVar // : stores in myVar a matched alphanums rule
#empty // means the end of the file
=> // "=>" introduces a scripting language zone
{ // "{}" defines the scripting language zone bounds (used if more than 1 line)
traceLine(myVar); // outputs on screen the value stored in myVar
traceLine("end");
};
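To make the steps of myRule concrete, here is a small Python sketch of roughly the same behavior: skip blanks, read one alphanumeric token, require the end of the input, then run the two trace calls. The function name my_rule is hypothetical, and the sketch assumes blanks are skipped only around the token; CodeWorker's exact blank-skipping behavior inside the alphanums rule may differ.

```python
import re


def my_rule(text: str):
    """Mimic myRule: skip blanks, read one alphanums token, require end of input."""
    # \s* plays the role of #ignore(blanks), the group plays alphanums:myVar,
    # and fullmatch demands that nothing else follows, like #empty.
    match = re.fullmatch(r"\s*([a-zA-Z0-9]+)\s*", text)
    if match is None:
        return None          # the rule fails, so the script zone is never run
    my_var = match.group(1)
    print(my_var)            # traceLine(myVar);
    print("end")             # traceLine("end");
    return my_var


if __name__ == "__main__":
    my_rule("  hello42\n")
    my_rule("two words")     # no output: the input does not end after the first token
```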
The CW script language
The CW scripting language is very large and powerful.
Documentation
Here is a scripting language overview
Some basic functions:
traceLine(str); // print a string on the screen
saveProject(outFile); // save the tree in XML format
Here is an overview of functions available in CW.
Basis
How to manipulate variables
Here, we'll see variable manipulation
// Simple types
local myString = "foo";
local myInt = 10;
local myDouble = 10.5;
// Array
local myTable;
insert myTable["red"] = "red color";
insert myTable["green"] = "green color";
// Tree representing a computer
local myComputer;
insert myComputer.processor = "athlon 2000+";
insert myComputer.ram = "Samsung";
insert myComputer.ram.capacity = "2giga";
insert myComputer.hd = "seagate";
insert myComputer.hd.space = "200giga";
// Imagine that I want to buy a new computer with the same configuration but a faster processor
local myNewComputer;
setall myNewComputer = myComputer; // setall copies the whole tree (insert copies only the variable's value)
myNewComputer.processor = "athlon 2800+";
// Output:
// myNewComputer.processor = "athlon 2800+"
// myNewComputer.ram = "Samsung"
// myNewComputer.ram.capacity = "2giga"
// myNewComputer.hd = "seagate"
// myNewComputer.hd.space = "200giga"
// Imagine that you want to add a new graphics card to this computer
local emptyComputer;
insert emptyComputer.graphicCard = "the last 3D graphic card";
// merge fuses the two trees
merge myNewComputer = emptyComputer;
// Output:
// myNewComputer.processor = "athlon 2800+"
// myNewComputer.ram = "Samsung"
// myNewComputer.ram.capacity = "2giga"
// myNewComputer.hd = "seagate"
// myNewComputer.hd.space = "200giga"
// myNewComputer.graphicCard = "the last 3D graphic card"
// Here is a reference
local myReference;
ref myReference = myNewComputer;
// or, shorter:
localref myReference = myNewComputer;
// Strings versus numbers
local a = 10;
local b = 50;
local c = a + b;
local d = $a + b$; // one '$' before the expression, a second one after it
traceLine(a + b); // prints 1050
traceLine(c); // prints 1050
traceLine($a + b$); // prints 60
traceLine(d); // prints 60
// Between '$' signs, an expression is evaluated numerically; otherwise values are treated as strings
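The difference between copying a tree, merging two trees, taking a reference, and string versus numeric evaluation can also be pictured in Python. This is a loose analogy under stated assumptions: CodeWorker trees are modeled as nested dictionaries (a node's own value sits under a "value" key), the merge helper is hypothetical, and its choice to overwrite conflicting values is an assumption, not documented CodeWorker behavior.

```python
import copy

# Tree representing a computer, modeled as nested dictionaries.
my_computer = {
    "processor": "athlon 2000+",
    "ram": {"value": "Samsung", "capacity": "2giga"},
    "hd": {"value": "seagate", "space": "200giga"},
}

# Like setall: copy the whole tree, so later changes leave the original alone.
my_new_computer = copy.deepcopy(my_computer)
my_new_computer["processor"] = "athlon 2800+"


def merge(target: dict, source: dict) -> None:
    """Add the branches of source to target, recursing into shared sub-trees."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            merge(target[key], value)
        else:
            target[key] = copy.deepcopy(value)


# Like merge: add the graphics card branch to the new computer.
empty_computer = {"graphicCard": "the last 3D graphic card"}
merge(my_new_computer, empty_computer)

# Like ref / localref: a second name for the same tree, not a copy.
my_reference = my_new_computer

# String versus numeric evaluation of a + b.
a, b = "10", "50"
as_string = a + b              # like a + b in CodeWorker: concatenation
as_number = int(a) + int(b)    # like $a + b$: numeric addition

if __name__ == "__main__":
    print(my_computer["processor"])          # athlon 2000+
    print(my_new_computer["processor"])      # athlon 2800+
    print(my_reference is my_new_computer)   # True
    print(as_string, as_number)              # 1050 60
```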
A common script for execution
Here is a simple example of a cws execution file:
// run the MyProject.cwp script on the file given as first argument and store all parsed data in project
parseAsBNF("./MyProject.cwp", project, _ARGS[0]);
local treeFile = getWorkingPath() + _ARGS[0] + ".ktree";
// save the AST in an XML file
saveProject(treeFile);
// print the XML file on screen
traceLine(loadFile(treeFile));
The CW script language is used in this part too.
A template-based script, for text/code generation
Example
We will create a simple class generator.
Create a Domain Specific Language (DSL) file
This is a simple DSL file. It describes classes and their methods.
- First we describe the main class Animal with 3 methods.
[animal] methods=[eat,drink,sleep];
- Next we describe 2 classes derived from Animal that implement other methods.
[tiger] methods=[scratch,howl]; properties=[name="tigrou",color="yellow"]; father=animal;
[turtle] properties=[name="little",color="green"]; father=animal;
Parse DSL file
Now we parse the DSL file and print each value as it is matched.
classList ::=
=> traceLine("* parsing DSL file");
#ignore(blanks)
[class]*
#empty
=> traceLine("* parsing finished");
;
class ::=
className
[
methods
| properties
| father
]*
=> traceLine("");
;
className ::=
LBRACET
chars:value
RBRACET
=> traceLine("Class name : " + value);
;
methods ::=
"methods"
EQUAL
LBRACET
[
[COMMA]?
method
]+
RBRACET
SEMICOLON
;
method ::=
chars:value
=> traceLine("Method value: " + value);
;
properties ::=
"properties"
EQUAL
[
LBRACET
[
[COMMA]?
property
]+
RBRACET
]:value
SEMICOLON
=> traceLine("Properties value: " + value);
;
property ::=
chars:name
=> traceLine("Property name: " + name);
EQUAL
string:value
=> traceLine("Property value: " + value);
;
father ::=
"father"
EQUAL
chars:value
=> traceLine("Father value: " + value);
SEMICOLON
;
char ::= 'a'..'z' | 'A'..'Z';
chars ::= [char]+;
digit ::= '0'..'9';
digits ::= [digit]+;
alphanum ::= char | digit;
alphanums ::= [alphanum]+;
string ::= '"' alphanums '"';
COMMA ::= ',';
LBRACET ::= '[';
RBRACET ::= ']';
SEMICOLON ::= ';';
EQUAL ::= '=';
(Question: we use [] to define a box containing a matching pattern in 2 particular cases. Which ones? If you find more, mail me ;) )
- We execute the script : "codeworker -nologo -parseBNF classGenerator.cwp simpledsl.txt"
Output:
Class name : animal
Method value: eat
Method value: drink
Method value: sleep
Class name : tiger
Method value: scratch
Method value: howl
Property name: name
Property value: "tigrou"
Property name: color
Property value: "yellow"
Properties value: [name="tigrou",color="yellow"]
Father value: animal
Class name : turtle
Property name: name
Property value: "little"
Property name: color
Property value: "green"
Properties value: [name="little",color="green"]
Father value: animal
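To see what the parse script is doing without CodeWorker at hand, the same DSL can be read with a few lines of Python. This is a minimal sketch and not a translation of the .cwp script: it relies on regular expressions instead of BNF rules, and parse_dsl, CLASS_RE and FIELD_RE are names invented for the illustration.

```python
import re

DSL = '''[animal] methods=[eat,drink,sleep];
[tiger] methods=[scratch,howl]; properties=[name="tigrou",color="yellow"]; father=animal;
[turtle] properties=[name="little",color="green"]; father=animal;
'''

# One class: "[name]" followed by any number of "key=value;" fields.
CLASS_RE = re.compile(r"\[([A-Za-z]+)\]((?:\s*[a-z]+\s*=\s*[^;]*;)*)")
FIELD_RE = re.compile(r"([a-z]+)\s*=\s*([^;]*);")


def parse_dsl(text: str) -> list:
    """Return one dictionary per class found in the DSL text."""
    classes = []
    for name, body in CLASS_RE.findall(text):
        cls = {"name": name, "methods": [], "properties": [], "father": ""}
        for key, value in FIELD_RE.findall(body):
            if key == "methods":
                cls["methods"] = value.strip("[]").split(",")
            elif key == "properties":
                for prop in value.strip("[]").split(","):
                    prop_name, prop_value = prop.split("=")
                    # The quotes stay in the value, as in the trace output above.
                    cls["properties"].append({"name": prop_name, "value": prop_value})
            elif key == "father":
                cls["father"] = value
        classes.append(cls)
    return classes


if __name__ == "__main__":
    for cls in parse_dsl(DSL):
        print("Class name : " + cls["name"])
        for method in cls["methods"]:
            print("Method value: " + method)
        for prop in cls["properties"]:
            print("Property name: " + prop["name"])
            print("Property value: " + prop["value"])
        if cls["father"]:
            print("Father value: " + cls["father"])
```

Unlike the BNF script, this sketch collects everything first and prints afterwards, so it does not show the "Properties value" line that the properties rule emits.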
Creating an Abstract Syntax Tree (AST)
- Now, we create the AST :
classList ::=
=> traceLine("* parsing DSL file");
#ignore(blanks)
[class]*
#empty
=>
{
traceLine("* parsing finished");
traceLine("* remove temporary variables");
removeVariable(project.tmpClass);
removeVariable(project.tmpMethod);
removeVariable(project.tmpProperty);
traceLine("* save the project");
// the global variable project is saved in the "tree.xml" file
saveProject("tree.xml");
}
;
class ::=
// clear the temporary variable that will contain values
=> clearVariable(project.tmpClass);
#ignore(blanks)
className
[
methods
| properties
| father
]*
=>
{
// add a new item on the array
pushItem project.classes;
// copy the parsed values into the last array element
setall project.classes#back = project.tmpClass;
}
=> traceLine("");
;
className ::=
LBRACET
chars:value
=>
{
// insert a new value
insert project.tmpClass.name = value;
traceLine("Class name : " + project.tmpClass.name);
}
RBRACET
;
methods ::=
"methods"
EQUAL
LBRACET
[
[COMMA]?
method
=>
{
pushItem project.tmpClass.methods;
setall project.tmpClass.methods#back = project.tmpMethod;
}
]+
RBRACET
SEMICOLON
;
method ::=
=> clearVariable(project.tmpMethod);
chars:value
=>
{
insert project.tmpMethod = value;
traceLine("Method value: " + value);
}
;
properties ::=
"properties"
EQUAL
[
LBRACET
[
[COMMA]?
property
=>
{
pushItem project.tmpClass.properties;
setall project.tmpClass.properties#back = project.tmpProperty;
}
]+
RBRACET
]
SEMICOLON
;
property ::=
=> clearVariable(project.tmpProperty);
chars:name
=>
{
insert project.tmpProperty.name = name;
traceLine("Property name: " + name);
}
EQUAL
string:value
=>
{
insert project.tmpProperty.value = value;
traceLine("Property value: " + value);
}
;
father ::=
"father"
EQUAL
chars:project.tmpClass.father
=> traceLine("Father value: " + project.tmpClass.father);
SEMICOLON
;
char ::= 'a'..'z' | 'A'..'Z';
chars ::= [char]+;
digit ::= '0'..'9';
digits ::= [digit]+;
alphanum ::= char | digit;
alphanums ::= [alphanum]+;
string ::= '"' alphanums '"';
COMMA ::= ',';
LBRACET ::= '[';
RBRACET ::= ']';
SEMICOLON ::= ';';
EQUAL ::= '=';
- The result in the "tree.xml" file :
<project>
<classes>
<__ARRAY_ENTRY __KEY="0">
<name __VALUE="animal" />
<methods>
<__ARRAY_ENTRY __KEY="0" __VALUE="eat" />
<__ARRAY_ENTRY __KEY="1" __VALUE="drink" />
<__ARRAY_ENTRY __KEY="2" __VALUE="sleep" />
</methods>
</__ARRAY_ENTRY>
<__ARRAY_ENTRY __KEY="1">
<name __VALUE="tiger" />
<methods>
<__ARRAY_ENTRY __KEY="0" __VALUE="scratch" />
<__ARRAY_ENTRY __KEY="1" __VALUE="howl" />
</methods>
<properties>
<__ARRAY_ENTRY __KEY="0">
<name __VALUE="name" />
<value __VALUE="&quot;tigrou&quot;" />
</__ARRAY_ENTRY>
<__ARRAY_ENTRY __KEY="1">
<name __VALUE="color" />
<value __VALUE="&quot;yellow&quot;" />
</__ARRAY_ENTRY>
</properties>
<father __VALUE="animal" />
</__ARRAY_ENTRY>
<__ARRAY_ENTRY __KEY="2">
<name __VALUE="turtle" />
<properties>
<__ARRAY_ENTRY __KEY="0">
<name __VALUE="name" />
<value __VALUE="&quot;little&quot;" />
</__ARRAY_ENTRY>
<__ARRAY_ENTRY __KEY="1">
<name __VALUE="color" />
<value __VALUE="&quot;green&quot;" />
</__ARRAY_ENTRY>
</properties>
<father __VALUE="animal" />
</__ARRAY_ENTRY>
</classes>
<tmpClass />
</project>
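Because the saved project is plain XML, other tools can read it back. As a hedged illustration, the Python sketch below loads a shortened copy of the tree shown above with the standard xml.etree.ElementTree module and walks the __ARRAY_ENTRY elements. The function name read_classes is hypothetical, and the sample keeps only the first two classes, without their properties.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the tree.xml content (two classes, properties omitted).
TREE_XML = '''<project>
  <classes>
    <__ARRAY_ENTRY __KEY="0">
      <name __VALUE="animal" />
      <methods>
        <__ARRAY_ENTRY __KEY="0" __VALUE="eat" />
        <__ARRAY_ENTRY __KEY="1" __VALUE="drink" />
        <__ARRAY_ENTRY __KEY="2" __VALUE="sleep" />
      </methods>
    </__ARRAY_ENTRY>
    <__ARRAY_ENTRY __KEY="1">
      <name __VALUE="tiger" />
      <methods>
        <__ARRAY_ENTRY __KEY="0" __VALUE="scratch" />
        <__ARRAY_ENTRY __KEY="1" __VALUE="howl" />
      </methods>
      <father __VALUE="animal" />
    </__ARRAY_ENTRY>
  </classes>
</project>'''


def read_classes(xml_text: str) -> list:
    """Return the name, methods and father of each class stored in the tree."""
    root = ET.fromstring(xml_text)
    result = []
    for entry in root.findall("./classes/__ARRAY_ENTRY"):
        methods = [m.get("__VALUE") for m in entry.findall("./methods/__ARRAY_ENTRY")]
        father = entry.find("father")
        result.append({
            "name": entry.find("name").get("__VALUE"),
            "methods": methods,
            "father": father.get("__VALUE") if father is not None else "",
        })
    return result


if __name__ == "__main__":
    for cls in read_classes(TREE_XML):
        print(cls)
```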
Generate output code
To generate C# code, we use a template: classGenerator.cwt
As in PHP, we use:
<% : script mode
%> : text mode
@ : change the mode (script to text and text to script)
public class @this.name@<%
if this.father != "" {
%> : @this.father@<%
}
%>
{
<% foreach method in this.methods { %>
public void @method@()
{
}<%
%>
<% } %><% foreach property in this.properties { %>
private string _@property.name@<%
if property.value != "" {
%> = @property.value@<% } %>;
public string @property.name@
{
set { _@property.name@ = value; }
get { return _@property.name@; }
}
<% } %>
}
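The logic of this template (class header, optional base class, one empty method per entry, one field and accessor pair per property) can be restated as an ordinary function. The Python sketch below is an illustration of that logic only: render_class is a hypothetical name, the input dictionary layout is an assumption, and the indentation it produces is chosen for readability rather than copied from CodeWorker's output.

```python
def render_class(cls: dict) -> str:
    """Build C# source for one class, following the same steps as the template."""
    header = "public class " + cls["name"]
    if cls.get("father"):                       # if this.father != ""
        header += " : " + cls["father"]
    lines = [header, "{"]
    for method in cls.get("methods", []):       # foreach method in this.methods
        lines.append("    public void " + method + "()")
        lines.append("    {")
        lines.append("    }")
    for prop in cls.get("properties", []):      # foreach property in this.properties
        field = "    private string _" + prop["name"]
        if prop.get("value"):                   # if property.value != ""
            field += " = " + prop["value"]
        lines.append(field + ";")
        lines.append("    public string " + prop["name"])
        lines.append("    {")
        lines.append("        set { _" + prop["name"] + " = value; }")
        lines.append("        get { return _" + prop["name"] + "; }")
        lines.append("    }")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    tiger = {
        "name": "tiger",
        "father": "animal",
        "methods": ["scratch", "howl"],
        "properties": [{"name": "color", "value": '"yellow"'}],
    }
    print(render_class(tiger))
```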
To execute this template file, we need to create a script file: classGenerator.cws. It takes the following inputs:
- the DSL file to parse
- the output directory where the .cs files will be saved
- the debug directory where the AST will be saved
local input; local outputDir; local debugDir;
// default value
if (_ARGS[0] != "") { input = _ARGS[0]; } else { input = "input"; }
if (_ARGS[1] != "") { outputDir = _ARGS[1]; } else { outputDir = "output"; }
if (_ARGS[2] != "") { debugDir = _ARGS[2]; } else { debugDir = "_debug"; }
traceLine("* Processing " + input + " file\n");
traceLine("** parsing structure description file");
parseAsBNF("./classGenerator.cwp", project, input);
local shortFileName = getShortFilename(input);
local treeFile = getWorkingPath() + debugDir + "/" + shortFileName + ".ktree";
traceLine("** save the project");
saveProject(treeFile);
insert project.name = shortFileName;
insert project.header = shortFileName;
insert project.filenamesansext = toUpperString(rsubString(shortFileName, 4));
foreach class in project.classes {
local outFile = outputDir + "/" + toLowerString(class.name) + ".cs";
traceLine("** generate file: " + outFile);
generate("./classGenerator.cwt", class, outFile);
}
We execute this file: codeworker.exe -nologo -script classGenerator.cws -args simpledsl.txt
- The result is 3 files:
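Since the command above passes a single argument, the script falls back to its default output and debug directories. The Python sketch below restates that defaulting rule and the naming of the generated files; resolve_args and output_file are hypothetical helper names used only for this illustration.

```python
def resolve_args(args: list) -> tuple:
    """Apply the script's defaults: input file, output directory, debug directory."""
    defaults = ["input", "output", "_debug"]
    # Missing arguments behave like empty strings, as in the script's tests.
    padded = list(args) + [""] * (len(defaults) - len(args))
    return tuple(value if value != "" else default
                 for value, default in zip(padded, defaults))


def output_file(output_dir: str, class_name: str) -> str:
    """Mirror outputDir + "/" + toLowerString(class.name) + ".cs"."""
    return output_dir + "/" + class_name.lower() + ".cs"


if __name__ == "__main__":
    input_file, output_dir, debug_dir = resolve_args(["simpledsl.txt"])
    print(input_file, output_dir, debug_dir)
    for name in ["animal", "tiger", "turtle"]:
        print(output_file(output_dir, name))
```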
- animal.cs :
public class animal
{
public void eat()
{
}
public void drink()
{
}
public void sleep()
{
}
}
- tiger.cs :
public class tiger : animal
{
public void scratch()
{
}
public void howl()
{
}
private string _name = "tigrou";
public string name
{
set { _name = value; }
get { return _name; }
}
private string _color = "yellow";
public string color
{
set { _color = value; }
get { return _color; }
}
}
- turtle.cs :
public class turtle : animal
{
private string _name = "little";
public string name
{
set { _name = value; }
get { return _name; }
}
private string _color = "green";
public string color
{
set { _color = value; }
get { return _color; }
}
}
Conclusion
You are free to build your own domain specific language.
Parsing, creating an AST and generating code are easy.
Many things are not covered in this article, but its aim is to help you understand CW faster.
I will improve this article little by little to make it easier to understand.
Don't hesitate to send me comments!
- Download the tutorial source files there.
Misc
CodeWorker .NET API
- First, download it.
- Use this example to better understand how to use it:
using System;
using CodeWorker;
namespace CodeWorkerExample
{
class Program
{
// dedicated object used only for locking
static object synchro = new object();
static void Main(string[] args)
{
// initialization of CodeWorker
CodeWorker.Main.initialize();
CompiledBNFScript cwBnfScript = new CompiledBNFScript();
cwBnfScript.buildFromFile("parser2.cwp");
ParseTree context = new ParseTree();
try
{
// Be careful, CodeWorker isn't thread-safe.
// Here is a trick to use it from different threads:
lock (Program.synchro)
{
// parse the file and put the result in the context variable
cwBnfScript.parse(context, "input.dsl");
// if you want to parse a string, use parseString
// cwBnfScript.parseString(context, txt2Parse);
}
// write the data stored in context to the console
ParseTree classesNode = context.getNode("classes");
if (classesNode != null)
{
ParseTree[] classArray = classesNode.array;
foreach (ParseTree classNode in classArray)
{
Console.WriteLine("class: name=" + classNode.getNode("name").text);
ParseTree methodsNode = classNode.getNode("methods");
if (methodsNode != null)
{
ParseTree[] methodArray = methodsNode.array;
foreach (ParseTree methodNode in methodArray)
Console.WriteLine("method: name=" + methodNode.text);
}
}
}
}
catch (Exception e)
{
Console.WriteLine(" [CWExample] parsing error:" + e.Message);
}
// termination of CodeWorker
CodeWorker.Main.terminate();
}
}
}
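The locking trick in the example above is a general pattern: when a component must not be entered by two threads at once, every call goes through one shared lock. The Python sketch below shows the pattern with a stand-in class; FakeParser, parse_safely and run are hypothetical names and have nothing to do with the CodeWorker API.

```python
import threading


class FakeParser:
    """Stand-in for a parser that must not be entered by two threads at once."""

    def __init__(self):
        self.active = 0        # number of threads currently inside parse()
        self.max_active = 0    # highest value ever observed for active
        self.parsed = []

    def parse(self, text: str) -> None:
        self.active += 1
        self.max_active = max(self.max_active, self.active)
        self.parsed.append(text.upper())
        self.active -= 1


parser = FakeParser()
parser_lock = threading.Lock()   # plays the role of Program.synchro


def parse_safely(text: str) -> None:
    # Only one thread at a time may hold the lock, so calls are serialized.
    with parser_lock:
        parser.parse(text)


def run(inputs: list) -> list:
    """Parse every input on its own thread and return the sorted results."""
    threads = [threading.Thread(target=parse_safely, args=(t,)) for t in inputs]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(parser.parsed)


if __name__ == "__main__":
    print(run(["a", "b", "c"]))
    print(parser.max_active)
```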
- Get a C# solution example using the CodeWorker .NET API here.
For more information, check this page.
=== CodeWorker and .NET Api
Contributor(s): Audric Thevenet
