Creating a DSL in Python with the textX library
To describe objects and processes in terms of business logic, or to define configuration, structure and behaviour in complex systems, a popular approach is to use domain-specific languages (DSLs). They are implemented either through the syntactic features of a general-purpose programming language (for example, metaprogramming, annotations/decorators, operator overloading and infix functions, as in Kotlin DSLs), or with specialized language workbenches and parser tools (such as JetBrains MPS, or general-purpose parser generators like ANTLR or Bison). There is also a third approach, based on parsing a textual description and generating the code that implements it, and in this article we will look at some examples of using the textX library to build DSLs in Python.
textX is a language-modeling (DSL) tool for Python. It allows you to quickly define the grammar of a language and obtain a parser for it. textX is open source, integrates easily with other Python tools, and can be used in any project that needs to define and process text-based languages.
With textX, you can define different kinds of language constructs, such as keywords, identifiers, numbers and strings, and specify their properties, such as data types or value constraints. The grammar of the language is defined in a text file in a format called meta-DSL (registered as the tx language and associated with the file mask *.tx). The metamodel is used to check the syntactic correctness of DSL models and to build a tree of objects for use in a Python code generator (which can emit other Python code, source text in any other programming language, a PDF document, or even a dot file for Graphviz to visualize the metamodel).
textX allows you to create both simple and complex DSLs: for example, languages for describing configuration files, or domain-specific languages used in fields such as business rules, scientific computing and natural language processing.
To use textX, first install the necessary modules:
pip install textX click
After installation, the console utility textx becomes available; it is used to check that metamodels and DSL models conform to the grammar of the language. textX relies on setuptools to discover registered components and can be extended by adding languages (listed via textx list-languages) and plugging in generators (listed via textx list-generators). Extensions are installed as regular modules via pip install, for example:
- textx-jinja – uses the Jinja template engine to convert a DSL model into a text document (e.g. HTML)
- textx-lang-questionnaire – a DSL for defining questionnaires
- PDF-Generator-with-TextX – a PDF generator driven by a DSL description
We will build our language without any additional extensions, so that the whole process is visible. Let's start with a simple task: interpret a text file consisting of "Hello" lines and generate Python code that prints the corresponding greetings. The language description and generator will live in the hello package, and we will also define a setup.py file that configures the entry points exposing the language metamodel and the code generator.
The description (hello.tx) may look like this:
DSL:
hello*=Hello
;
Hello:
'Hello' Name
;
Name:
name=/[A-Za-z\ 0-9]+/
;
The definition uses the following conventional notations:
- hello*=Hello – zero or more Hello elements (collected into the hello list, which may be empty)
- hello+=Hello – one or more elements
- hello?=Hello – the element is optional (the attribute will be None if it is absent)
- hello=Hello – exactly one element
A definition can also contain string constants, regular expressions and groupings (a choice between alternatives is written with a vertical bar) with modifiers: +, ? and * have their usual regular-expression meaning, while # allows the grouped definitions to appear in arbitrary order.
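As a quick standalone illustration (a hypothetical sketch, separate from the hello package we are building), the effect of a repetition assignment can be checked directly from Python with metamodel_from_str:
from textx import metamodel_from_str

# hypothetical inline grammar: zero or more Item entries are collected into model.items
grammar = """
Model: items*=Item;
Item: 'item' name=ID;
"""

mm = metamodel_from_str(grammar)
model = mm.model_from_str('item first item second')
print([item.name for item in model.items])  # ['first', 'second']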
Let's create the language definition in hello/__init__.py:
import os.path
from os.path import dirname

from textx import language, metamodel_from_file


@language('hello', '*.hello')
def hello():
    """Sample hello language"""
    return metamodel_from_file(os.path.join(dirname(__file__), 'hello.tx'))
Here we define a new language with the identifier hello, which will apply to any file matching the name mask *.hello. Let's add a setup.py definition:
from setuptools import setup, find_packages

setup(
    name="hello",
    packages=find_packages(),
    package_data={"": ["*.tx"]},
    version='0.0.1',
    install_requires=["textx_ls_core"],
    description='Hello language',
    entry_points={
        'textx_languages': [
            'hello = hello:hello'
        ]
    }
)
Now let's install our module:
python setup.py install
Then let's check that the new language has been registered:
textx list-languages
textX (*.tx) textX[3.1.1] A meta-language for language definition
hello (*.hello) hello[0.0.1] Sample hello language
Now let’s create a test model based on the grammar (test.hello):
Hello World
Hello Universe
Let's check that the metamodel is correct and that our DSL model conforms to it:
textx check hello/hello.tx
hello/hello.tx: OK.
textx check test.hello
test.hello: OK.
We can also get a visual diagram of the DSL structure (output can be generated for Graphviz or PlantUML):
textx generate hello/hello.tx --target dot
dot -Tpng -O hello/hello.dot
Now we will add code generation, but before that let's parse the DSL model programmatically against the metamodel:
from textx import metamodel_from_file

metamodel = metamodel_from_file('hello/hello.tx')
dsl = metamodel.model_from_file('test.hello')

for entry in dsl.hello:
    print(entry.name)
Here we will see the list of names (World, Universe) attached to the Hello entries (the entries themselves are available through the hello list on the root object of the parsed model). Let's now add code generation; for this we add a function decorated with @generator:
from textx import generator  # add to the imports in hello/__init__.py


@generator('hello', 'python')
def python(metamodel, model, output_path, overwrite, debug, **custom):
    """Generate python code"""
    if output_path is None:
        output_path = dirname(__file__)
    with open(output_path + "/hello.py", "wt") as p:
        for entry in model.hello:
            p.write(f"print('Generated Hello {entry.name}')\n")
    print(f"Generated file: {output_path}/hello.py")
The generator is registered for the hello language and will be available as the python target. Let's add it to the entry_points registration in setup.py:
'textx_generators': [
'python = hello:python'
]
And now let’s generate the code:
textx generate test.hello --target python
The result of the generation will look like this:
print('Generated Hello World')
print('Generated Hello Universe')
Now let's complicate the task a little and implement a finite state machine. A description of a traffic light in our DSL (test.sm) might look like this:
states {
    RED,
    YELLOW,
    RED_WITH_YELLOW,
    GREEN,
    BLINKING_GREEN
}

transitions {
    RED -> RED_WITH_YELLOW (on1)
    RED_WITH_YELLOW -> GREEN (on2)
    GREEN -> BLINKING_GREEN (off1)
    BLINKING_GREEN -> YELLOW (off2)
    YELLOW -> RED (off3)
}
Here we enumerate the possible states and the transitions between them (each transition has a name). The following grammar can be used to describe this:
StateMachine:
    'states' '{'
        states+=State
    '}'
    'transitions' '{'
        transitions+=Transition
    '}'
;

State:
    id=ID (',')?
;

TransitionName:
    /[A-Za-z0-9]+/
;

Transition:
    from=ID '->' to=ID '(' name=TransitionName ')'
;
An optional trailing comma is allowed here after the state identifier. The resulting model can be used directly by applying the DSL metamodel to the traffic-light description:
from textx import metamodel_from_file

metamodel = metamodel_from_file('statemachine/statemachine.tx')
dsl = metamodel.model_from_file('test.sm')


# description of a single transition
class Transition:
    state_from: str
    state_to: str
    action: str

    def __init__(self, state_from, state_to, action):
        self.state_from = state_from
        self.state_to = state_to
        self.action = action

    def __repr__(self):
        return f"{self.state_from} -> {self.state_to} on {self.action}"


states = [state.id for state in dsl.states]
# 'from' is a Python keyword, so that attribute has to be read via getattr
transitions = [Transition(getattr(t, 'from'), t.to, t.name) for t in dsl.transitions]
# extract the action names (for the menu)
actions = [t.action for t in transitions]
And now we implement the finite state machine:
current_state = states[0]
while True:
    print(f'Current state is {current_state}')
    print('0. Exit')
    for p, a in enumerate(actions):
        print(f'{p + 1}. {a}')
    i = int(input('Select option: '))
    if i == 0:
        break
    if 0 < i <= len(actions):
        a = actions[i - 1]
        for t in transitions:
            if t.state_from == current_state and t.action == a:
                current_state = t.state_to
                break
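Running this loop against the traffic-light model starts a small interactive session; the first menu, derived from the model above, should look roughly like this:
Current state is RED
0. Exit
1. on1
2. on2
3. off1
4. off2
5. off3
Select option: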
An alternative solution could be to generate, from the DSL, a transition function that takes a value from the state enumeration and a transition (from the list of possible ones) and returns the new state, or raises an exception if such a transition is not defined.
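A minimal sketch of what such generated code might look like for the traffic-light model (the names State, InvalidTransition and transition are illustrative assumptions, not something produced by the generators above):
from enum import Enum


class State(Enum):
    RED = 'RED'
    YELLOW = 'YELLOW'
    RED_WITH_YELLOW = 'RED_WITH_YELLOW'
    GREEN = 'GREEN'
    BLINKING_GREEN = 'BLINKING_GREEN'


class InvalidTransition(Exception):
    pass


# transition table derived from the DSL model: (state, action) -> next state
_TRANSITIONS = {
    (State.RED, 'on1'): State.RED_WITH_YELLOW,
    (State.RED_WITH_YELLOW, 'on2'): State.GREEN,
    (State.GREEN, 'off1'): State.BLINKING_GREEN,
    (State.BLINKING_GREEN, 'off2'): State.YELLOW,
    (State.YELLOW, 'off3'): State.RED,
}


def transition(state: State, action: str) -> State:
    """Return the next state or raise InvalidTransition for an unknown pair."""
    try:
        return _TRANSITIONS[(state, action)]
    except KeyError:
        raise InvalidTransition(f'No transition from {state.name} on {action}')
Such a module could be emitted by a @generator function in the same way as the hello example above, e.g. transition(State.RED, 'on1') would return State.RED_WITH_YELLOW.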
Similarly, more complex grammars can be implemented both for data representation and for describing algorithmic language constructs (such as branching or loops).
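For instance, a hypothetical branching construct can be prototyped directly from Python (the grammar and input below are invented for illustration and are not part of the languages built above):
from textx import metamodel_from_str

# hypothetical grammar with a simple if/then/else construct
GRAMMAR = """
Program: statements+=If;
If: 'if' flag=ID 'then' then_action=ID 'else' else_action=ID;
"""

mm = metamodel_from_str(GRAMMAR)
program = mm.model_from_str('if dark then lights_on else lights_off')

for st in program.statements:
    print(st.flag, st.then_action, st.else_action)  # dark lights_on lights_off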