Creating a DSL Based on Python

Today we’re going to talk about how to make your developer life even more exciting and productive. Yes, it will be about how you can create your own programming language using Python.

You have probably come across situations where standard programming languages were not well suited to your particular task. This is where the concept of a Domain-Specific Language, or DSL, comes into play. This tool allows you to create a specialized language that precisely matches the needs of your domain.

Let’s be honest, sometimes general-purpose programming languages can be a bit cumbersome. By creating a DSL, you focus only on the aspects you need, avoiding unnecessary clutter. It’s like a step-by-step recipe in cooking: you don’t knead the dough if you’re making soup. This way, your code becomes cleaner, clearer and more efficient.

Choice of language and approach

Why Python as the base language

So, why Python? This programming language, which is loved by millions, was not chosen by chance. Like an artist’s palette, it allows us to create DSL masterpieces with incredible ease. Python has a clean and expressive syntax that, like an artist’s brush, allows us to create beautiful and understandable constructions of a new language.

Looking back, we can see that Python was designed from the start to be easy to read and pleasant to write. It is these features that make Python a strong candidate for creating a DSL. We can express our ideas in simple and understandable constructions.

Comparison of embedded and external DSLs

In addition to choosing Python, we need to decide whether we’re going to build an embedded (internal) DSL or an external one. After all, it’s like choosing tools for future works. The embedded DSL is like the watercolor paints applied to the canvas by the artist, becoming part of the overall picture. An external DSL is like a symphony orchestra playing our music sheet by sheet.

An embedded DSL lives inside the host language, in our case Python. It allows us to embed domain-specific constructs directly into Python code. It is like the singing of birds in the forest, which fits into the general chorus of nature. An external DSL is a separate language that we build independently of Python. It is like a solo performance that emphasizes the depth and beauty of our ideas.
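The difference between the two approaches can be sketched in a few lines. The example below is only an illustration: the Greeting class, the say command, and the run_external function are hypothetical names invented for this sketch. The embedded variant is ordinary Python with method chaining, while the external variant is plain text that we interpret ourselves.

```python
# Embedded style: the DSL is ordinary Python, here a small fluent class.
class Greeting:
    def __init__(self):
        self.parts = []

    def say(self, word):
        self.parts.append(word)
        return self  # returning self is what makes chaining possible

    def text(self):
        return " ".join(self.parts)


# External style: the DSL is plain text with its own (tiny) grammar.
def run_external(source):
    greeting = Greeting()
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        command, _, argument = line.partition(" ")
        if command != "say":
            raise ValueError(f"Unknown command: {command!r}")
        greeting.say(argument)
    return greeting.text()


embedded_result = Greeting().say("hello").say("world").text()
external_result = run_external("say hello\nsay world")
print(embedded_result, "|", external_result)
```

Both variants produce the same result; the embedded one gets Python’s tooling for free, while the external one leaves you in full control of the syntax.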

Defining the goals and requirements of the language being created

Now that we’ve chosen Python and decided what approach we’re going to take, it’s time to define the goals and requirements of the upcoming piece.

The purpose of creating a DSL can be different. We can strive to increase performance by making the code more efficient and optimized. We can focus on improving code readability and understanding so that every colleague can be inspired by our creativity.

Designing a DSL: How to Create Your World in Python Code

Definition of syntax

The syntax of our DSL is the style that will describe our ideas.

Let’s look at an example. Let’s say you’re building a DSL to manage robots. Your keywords could be move, rotate, and scan, with units such as degrees and meters. Now you can write something like:

robot.move(2.5, meters).rotate(90, degrees).scan()

It is important to choose a syntax that is intuitive to the users of your DSL.

Selection of data structures

The choice of data structures can greatly affect the usability of a DSL.

Let’s return to our example with robots. You can use classes and objects to represent robots and their actions:

class Robot:
    def __init__(self):
        self.actions = []

    def move(self, distance, unit):
        self.actions.append(f"Move {distance} {unit}")
        return self

    def rotate(self, angle, unit):
        self.actions.append(f"Rotate {angle} {unit}")
        return self

    def scan(self):
        self.actions.append("Scan")
        return self

This way you can build a sequence of actions in your DSL:

meters, degrees = "meters", "degrees"  # units as plain strings

robot = Robot()
robot.move(2.5, meters).rotate(90, degrees).scan()
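Storing actions as formatted strings is convenient for printing, but harder to process later. As a possible alternative, here is a minimal sketch that records structured actions instead. The names Unit, Action, and StructuredRobot are assumptions made for this example and are not part of the original Robot class.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Unit(Enum):
    METERS = "meters"
    DEGREES = "degrees"


@dataclass(frozen=True)
class Action:
    name: str
    amount: Optional[float] = None
    unit: Optional[Unit] = None


class StructuredRobot:
    """Variant of Robot that records Action objects instead of strings."""

    def __init__(self):
        self.actions = []

    def move(self, distance, unit):
        # Unit(...) accepts a Unit member or its string value and
        # raises ValueError for anything else, e.g. a misspelled unit.
        self.actions.append(Action("move", distance, Unit(unit)))
        return self

    def rotate(self, angle, unit):
        self.actions.append(Action("rotate", angle, Unit(unit)))
        return self

    def scan(self):
        self.actions.append(Action("scan"))
        return self


plan = StructuredRobot().move(2.5, "meters").rotate(90, Unit.DEGREES).scan()
for action in plan.actions:
    print(action)
```

With this layout an unknown unit fails at the moment the action is recorded, rather than somewhere later during execution.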

Development of semantics

Semantics are how the actions of your DSL are interpreted.

Continuing with the example, let’s add semantics to the robot’s actions:

class Robot:
    # ... (previous code)

    def execute(self):
        for action in self.actions:
            print(f"Executing: {action}")

robot = Robot()
robot.move(2.5, meters).rotate(90, degrees).scan().execute()

Interaction with existing libraries and Python code

Finally, we come to how to integrate your DSL with existing Python code and libraries. Your DSL can become an integral part of the project.

To do this, let’s consider how you can interact with the library to work with robots:

class Robot:
    # ... (previous code)

    def connect_to_robot(self, real_robot):
        self.real_robot = real_robot

    def execute(self):
        for action in self.actions:
            print(f"Executing: {action}")
            # Example of calling a method of the library
            if "Move" in action:
                self.real_robot.move()
            # ... other actions

real_robot = RealRobot()  # We assume that a class for the real robot exists
robot = Robot()
robot.connect_to_robot(real_robot)
robot.move(2.5, meters).rotate(90, degrees).scan().execute()
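Matching on substrings such as "Move" works for a demo, but it grows awkward as the number of actions increases. One possible alternative is to dispatch by method name. In the sketch below, FakeRobot is a hypothetical stand-in for a real hardware library, and actions are represented as (name, arguments...) tuples; both are assumptions made for illustration.

```python
class FakeRobot:
    """Hypothetical stand-in for a hardware library; it only logs calls."""

    def __init__(self):
        self.log = []

    def move(self, distance):
        self.log.append(("move", distance))

    def rotate(self, angle):
        self.log.append(("rotate", angle))

    def scan(self):
        self.log.append(("scan",))


def run_actions(actions, backend):
    """Dispatch (name, *args) tuples to same-named backend methods."""
    for name, *args in actions:
        method = getattr(backend, name, None)
        if method is None:
            raise ValueError(f"Backend does not support action {name!r}")
        method(*args)


backend = FakeRobot()
run_actions([("move", 2.5), ("rotate", 90), ("scan",)], backend)
print(backend.log)
```

Because the backend is passed in, the same DSL program can be run against a logging stub in tests and against the real library in production.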

Embedded language implementation

Using a Python library to create a parser

A parser is like a director who puts the actors on stage in the right order. We can use Python libraries to create a parser and turn our lines of code into a data structure.

Imagine we have a DSL for mathematical expressions:

expr = "add(5, multiply(3, 2))"

Using the pyparsing library, we can create a parser:

from pyparsing import Word, alphas, nums, Forward, Group, Suppress, ZeroOrMore

integer = Word(nums).setParseAction(lambda tokens: int(tokens[0]))
ident = Word(alphas)

# A function call is name(arg, arg, ...); each argument is an integer
# or another function call, so the rule refers to itself via Forward.
func_call = Forward()
atom = integer | func_call
args = atom + ZeroOrMore(Suppress(",") + atom)
func_call <<= Group(ident + Suppress("(") + args + Suppress(")"))

parsed_expr = func_call.parseString(expr, parseAll=True)
print(parsed_expr)  # nested groups: [['add', 5, ['multiply', 3, 2]]]

Creating an Abstract Syntax Tree (AST)

Next, we turn the parser’s output into an Abstract Syntax Tree (AST). An AST represents the code as a tree-like structure, which makes it easier to process.

Let’s continue with our example:

class Node:
    def __init__(self, value):
        self.value = value
        self.children = []

def build_ast(item):
    # Expects nested groups such as ['add', 5, ['multiply', 3, 2]]:
    # an int is a leaf, anything else is [function_name, arg1, arg2, ...].
    if isinstance(item, int):
        return Node(item)
    node = Node(item[0])
    node.children = [build_ast(arg) for arg in item[1:]]
    return node

ast = build_ast(parsed_expr[0])

The AST now contains information about the structure of our code, like actors and their lines.

We can now evaluate the tree with ordinary Python code:

def evaluate(node):
    if isinstance(node.value, int):
        return node.value
    elif node.value == "add":
        return sum(evaluate(child) for child in node.children)
    elif node.value == "multiply":
        result = 1
        for child in node.children:
            result *= evaluate(child)
        return result
    raise ValueError(f"Unknown function: {node.value}")

print(evaluate(ast))

The DSL expression is now executed by ordinary Python code.
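As the number of DSL functions grows, a chain of elif branches becomes hard to maintain. One common alternative is a lookup table. The sketch below assumes the parsed expression is available as a plain nested list; the OPERATIONS table and the evaluate_tree function are hypothetical names for this example.

```python
import operator
from functools import reduce

# Hypothetical operation table: DSL function name -> binary Python operator.
OPERATIONS = {
    "add": operator.add,
    "multiply": operator.mul,
}


def evaluate_tree(tree):
    """Evaluate a nested list such as ["add", 5, ["multiply", 3, 2]]."""
    if isinstance(tree, int):
        return tree
    name, *args = tree
    if name not in OPERATIONS:
        raise ValueError(f"Unknown function: {name!r}")
    # Fold the operator over all evaluated arguments, left to right.
    return reduce(OPERATIONS[name], (evaluate_tree(arg) for arg in args))


result = evaluate_tree(["add", 5, ["multiply", 3, 2]])
print(result)  # 11
```

Adding a new function to the language then means adding one entry to the table, without touching the evaluator itself.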

Ensuring code readability and correctness

Adding comments and documentation for the DSL

Let’s imagine that your code is a book, and the comments and documentation are like a conversation with the author. Well-formed comments make the code more understandable and accessible. They will help developers who use the DSL to quickly understand its functionality.

Continuing the story with robots:

class Robot:
    # ... (previous code)

    def move(self, distance, unit):
        """Move the robot by a specified distance."""
        # ... (function body)

    def rotate(self, angle, unit):
        """Rotate the robot by a specified angle."""
        # ... (function body)

    def scan(self):
        """Perform a scan operation."""
        # ... (function body)

Adding such documentation will make it easy for users of your DSL to understand what each feature does and how to use it properly.
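Docstrings are also available at runtime, so a command reference for the DSL can be generated from them. This is a minimal sketch using the standard inspect module; DocumentedRobot and command_reference are hypothetical names used only for this demonstration.

```python
import inspect


class DocumentedRobot:
    """Hypothetical robot DSL used only to demonstrate docstring lookup."""

    def move(self, distance, unit):
        """Move the robot by a specified distance."""
        return self

    def rotate(self, angle, unit):
        """Rotate the robot by a specified angle."""
        return self

    def scan(self):
        """Perform a scan operation."""
        return self


def command_reference(cls):
    """Map each public method name to its docstring."""
    return {
        name: inspect.getdoc(member)
        for name, member in inspect.getmembers(cls, inspect.isfunction)
        if not name.startswith("_")
    }


reference = command_reference(DocumentedRobot)
for name, doc in reference.items():
    print(f"{name}: {doc}")
```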

Developing unit tests for the language

Let’s move on to testing your DSL. Unit tests are like rehearsals before an important performance. They ensure the reliability of your code and confidence in its operation.

Example:

import unittest

class TestRobotDSL(unittest.TestCase):
    def test_move(self):
        robot = Robot()
        robot.move(2.5, "meters")
        # execute() only prints, so we check the recorded actions instead
        self.assertEqual(robot.actions, ["Move 2.5 meters"])

    def test_rotate(self):
        robot = Robot()
        robot.rotate(90, "degrees")
        self.assertEqual(robot.actions, ["Rotate 90 degrees"])

if __name__ == '__main__':
    unittest.main()

These tests will help you quickly identify errors and ensure that your DSL is working as intended.

Using linters and static analysis to ensure code quality

Linters and static analysis allow you to identify potential problems in your code and help you adhere to design standards.

We use pylint:

pip install pylint

Then we run the linter:

pylint your_dsl_module.py

The linter will issue recommendations to improve code structure, style, and identify potential bugs.

An example of creating a language for Habr

Let’s see how to create your own language for Habr.

Step 1: Defining Syntax and Key Constructions

Let’s start by defining how users will use our DSL for Habr. Let’s say we want to make it easy for users to link to articles and quote interesting passages. Let’s create the keywords link and quote:

link("URL")
quote("Quote text")

Step 2: Developing the Parser

Now let’s create a parser for the DSL. We use the pyparsing library to convert our strings to Python objects:

from pyparsing import Suppress, QuotedString

link_keyword = Suppress("link")
quote_keyword = Suppress("quote")
url = QuotedString('"')
text = QuotedString('"', multiline=True)

# Parentheses are suppressed so that only the quoted value remains in the result
link_parser = link_keyword + Suppress("(") + url + Suppress(")")
quote_parser = quote_keyword + Suppress("(") + text + Suppress(")")

Step 3: Create an Abstract Syntax Tree (AST)

Now let’s create classes to represent the objects of our DSL:

class LinkNode:
    def __init__(self, url):
        self.url = url

class QuoteNode:
    def __init__(self, text):
        self.text = text
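To show how parsed text and node classes fit together, here is a minimal sketch that turns one line of DSL text into a node. It deliberately uses the standard re module instead of pyparsing so that it runs without extra dependencies; the STATEMENT pattern, the NODE_TYPES table, and the parse_line function are assumptions made for this example.

```python
import re


class LinkNode:
    def __init__(self, url):
        self.url = url


class QuoteNode:
    def __init__(self, text):
        self.text = text


# Hypothetical line format: keyword("argument"), one statement per line.
STATEMENT = re.compile(r'^\s*(link|quote)\(\s*"([^"]*)"\s*\)\s*$')
NODE_TYPES = {"link": LinkNode, "quote": QuoteNode}


def parse_line(line):
    """Turn one DSL statement into the matching node object."""
    match = STATEMENT.match(line)
    if match is None:
        raise ValueError(f"Cannot parse DSL line: {line!r}")
    keyword, argument = match.groups()
    return NODE_TYPES[keyword](argument)


nodes = [parse_line(line) for line in ['link("https://habr.com")', 'quote("OTUS")']]
print([type(node).__name__ for node in nodes])
```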

Step 4: Convert to Python Code

Let’s add a __str__ method to each class so that a node can render itself as a line of code:

class LinkNode:
    # ... (previous code)

    def __str__(self):
        return f'link("{self.url}")'

class QuoteNode:
    # ... (previous code)

    def __str__(self):
        return f'quote("{self.text}")'

Step 5: Example of use

So, our DSL is ready! Now let’s see how it will look in action:

link, quote = LinkNode, QuoteNode  # DSL keywords are aliases for the node classes

content = [
    link("https://habr.com"),
    quote("Creating a DSL based on Python"),
    link("https://habr.com/article/12345"),
    quote("OTUS")
]

for item in content:
    print(item)

Step 6: Additional ideas

We’ve created a DSL for adding links and quotes, but why not add more functionality? For example, you can create keywords for lists, underlined text, and even images. Let’s create the keywords list, underline and image:

list("Item 1", "Item 2", "Item 3")
underline("Highlighted text")
image("Image_URL", caption="Image caption")
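The new keywords can follow the same pattern as the link and quote nodes. Below is a minimal sketch for two of them; the class names ListNode and UnderlineNode are assumptions chosen to match the naming of the earlier nodes.

```python
class ListNode:
    def __init__(self, *items):
        self.items = items

    def __str__(self):
        # Render every item in double quotes, separated by commas.
        rendered = ", ".join(f'"{item}"' for item in self.items)
        return f"list({rendered})"


class UnderlineNode:
    def __init__(self, text):
        self.text = text

    def __str__(self):
        return f'underline("{self.text}")'


items = ListNode("Item 1", "Item 2", "Item 3")
underlined = UnderlineNode("Highlighted text")
print(items)
print(underlined)
```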

Error handling and language extensions

Error and exception handling in DSL

We cannot avoid mistakes.

Let’s apply this idea to our DSL for Habr. Imagine that we have a function quote and the user accidentally forgot to close the quotation mark:

quote("Forgot to close the quote)

With a pyparsing-based parser, such input fails to parse and raises a ParseException. Inside the node itself we can add a check of our own and throw an exception with a readable message:

class QuoteNode:
    def __init__(self, text, author=None):
        # Reject missing or empty quote text early, with a clear message
        if not isinstance(text, str) or not text.strip():
            raise ValueError("Quote text must be a non-empty string")
        self.text = text
        self.author = author

Now, if the user makes a mistake, they will immediately receive a human-readable message about what went wrong.
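For text-based input, the unclosed quotation mark can be detected while parsing. Here is a minimal sketch using only the standard library; the DSLSyntaxError class, the QUOTE_STATEMENT pattern, and the parse_quote function are hypothetical names, and the single-line statement format is an assumption.

```python
import re


class DSLSyntaxError(ValueError):
    """Hypothetical error type for malformed DSL input."""


# Hypothetical statement format: quote("text") on a single line.
QUOTE_STATEMENT = re.compile(r'^quote\("(?P<text>[^"]*)"\)$')


def parse_quote(line):
    """Return the quoted text or raise DSLSyntaxError with a clear message."""
    line = line.strip()
    match = QUOTE_STATEMENT.match(line)
    if match:
        return match.group("text")
    # An odd number of quotation marks means one of them was never closed.
    if line.startswith("quote(") and line.count('"') % 2 == 1:
        raise DSLSyntaxError(f"Missing closing quotation mark in: {line}")
    raise DSLSyntaxError(f'Expected quote("text"), got: {line}')


try:
    parse_quote('quote("Forgot to close the quote)')
except DSLSyntaxError as error:
    message = str(error)
    print(message)
```

Deriving the error type from ValueError keeps it compatible with code that already catches ValueError.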

Possibilities for expanding the functionality of the language

Our DSL already lets you do some things on Habr. But let’s make it even more advanced. For example, let’s add code for inserting program code and image for images:

code("Python", "print('Hello, DSL!')")
image("https://example.com/image.jpg", caption="Beautiful image")

Adding new features allows users to create a variety of content directly from our DSL.

Support for new features without disrupting existing functionality

While we’re adding new features, it’s important not to break existing functionality. One option is inheritance from a shared base class; in the simplest case, each new construct just gets a class of its own:

class ImageNode:
    def __init__(self, url, caption=None):
        self.url = url
        self.caption = caption

class CodeNode:
    def __init__(self, language, code):
        self.language = language
        self.code = code

In this way, we add new classes for new functions without touching the work of existing ones.
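If the nodes should also share behaviour, inheritance from a common base class is one way to get it. This is a minimal sketch under stated assumptions: the base class Node, its keyword attribute, and the arguments method are hypothetical and do not appear in the classes above.

```python
class Node:
    """Hypothetical common base class for all DSL nodes."""

    keyword = "node"

    def arguments(self):
        raise NotImplementedError

    def __str__(self):
        # Shared rendering: keyword("arg1", "arg2", ...)
        rendered = ", ".join(f'"{arg}"' for arg in self.arguments())
        return f"{self.keyword}({rendered})"


class LinkNode(Node):
    keyword = "link"

    def __init__(self, url):
        self.url = url

    def arguments(self):
        return [self.url]


class CodeNode(Node):
    keyword = "code"

    def __init__(self, language, code):
        self.language = language
        self.code = code

    def arguments(self):
        return [self.language, self.code]


link_node = LinkNode("https://habr.com")
code_node = CodeNode("Python", "print('Hello, DSL!')")
print(link_node)
print(code_node)
```

A new node type only has to declare its keyword and arguments; the rendering logic in the base class stays unchanged, so existing nodes are not affected.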

Conclusion

Creating a DSL is like writing a symphony where you are the composer. You identify the notes, decide how they will sound, and create the harmony. Your DSL not only makes your code more readable and convenient, but also inspires new creative heights.

Keep exploring, improving and expanding your DSL. Add new features, experiment with new ideas and contribute to its development. Your DSL is the key to endless possibilities and creative growth.

At the open classes conducted by OTUS experts, you will be able to get even more practical analysis of various tools and learn a lot of useful information. Look at the calendar and register for events that interest you. All open classes are held free of charge.
