Management of memory and shared resources without errors

I recently skimmed the Habr article "Synchronization of operations in .NET using examples", after which I wanted to share with Habr readers some thoughts about synchronizing access to objects in different programming languages.

To be honest, most of this article sat in my drafts for a long time because I never got around to finishing it. Now there is a good reason to share my thoughts on the topic; all that remained was to add this introduction 🙂

Why are access synchronization objects needed?

Very often, data access synchronization objects are needed to implement various optimizing algorithms. For example, when several computations run in parallel on different CPU cores (or GPUs), their results must eventually be combined, and combining them requires some form of synchronization.
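As a minimal sketch of that situation (in Python, which is my choice for illustration and not something the languages discussed here prescribe), several threads compute partial sums independently and merge them into one shared total. The function name `parallel_sum` is hypothetical.

```python
import threading

def parallel_sum(chunks):
    """Sum each chunk in its own thread and merge into one shared total."""
    total = 0
    lock = threading.Lock()          # protects `total`

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)         # independent work, no lock needed
        with lock:                   # merging touches shared state
            total += partial

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total
```

Only the merge step is guarded; the computation itself runs without any locking.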

Many different synchronization objects have been invented, differing in purpose and implementation:

  • Event
  • Mutex
  • Shared mutex
  • Recursive mutex
  • Semaphore
  • Waitable timer
  • Critical section
  • Interlocked Variable Access, etc.

But these options have one thing in common: they can all be implemented using the concept of a mutex (from "mutual exclusion").

Internal and external synchronization objects

If you try to classify them, the first natural division is between objects that synchronize access within a single program and objects that synchronize at the operating system level.

The difference between them, as well as the pros and cons, are obvious. Synchronization objects within one application work much faster than their counterparts from the OS kernel, but only the latter can synchronize access between several applications.

Recursive locking

I would highlight the possibility of recursive locking (a recursive mutex) as the second important property for classifying synchronization objects. A recursive mutex is a special type of mutex that can be locked multiple times by the same process/thread without causing a deadlock, i.e. the same thread can acquire this type of mutex several times in a row.

Typically, a recursive mutex counts how many times it has been locked and requires the same number of unlock operations before other threads can lock it.
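The counting behaviour can be observed with Python's `threading.RLock`, used here purely as an illustration. The helper names `other_thread_can_lock` and `demo` are hypothetical.

```python
import threading

def other_thread_can_lock(lock):
    """Probe from a second thread whether `lock` is currently free."""
    outcome = []

    def probe():
        got = lock.acquire(blocking=False)
        if got:
            lock.release()           # release in the thread that acquired
        outcome.append(got)

    t = threading.Thread(target=probe)
    t.start()
    t.join()
    return outcome[0]

def demo():
    lock = threading.RLock()
    observed = []
    lock.acquire()
    lock.acquire()                   # same thread locks again: no deadlock
    observed.append(other_thread_can_lock(lock))   # held twice
    lock.release()
    observed.append(other_thread_can_lock(lock))   # still held once
    lock.release()
    observed.append(other_thread_can_lock(lock))   # fully released
    return observed
```

With an ordinary `threading.Lock`, the second `acquire()` in `demo` would block forever, which is exactly the mutual locking a recursive mutex avoids.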

Shared access level

As the third (but not the most important) property for classifying synchronization objects, I would single out the ability to acquire a lock on a shared mutex. Unlike normal mutexes with exclusive access, a shared mutex has two levels of access: shared, where several threads can jointly own the same mutex, and exclusive, where only one thread can own the mutex.

Shared mutexes are most often used to control access to resources that many readers may read simultaneously but that are occasionally modified. Readers need shared access, and code that modifies the object needs exclusive access.
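Python's standard library has no shared mutex, so the following is a hand-written sketch of the two access levels built on `threading.Condition`. The class `SharedMutex` and its methods `shared` and `exclusive` are hypothetical names, and the sketch makes no attempt at fairness: a steady stream of readers can starve a writer.

```python
import threading
from contextlib import contextmanager

class SharedMutex:
    """Minimal readers-writer lock: many readers or one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0            # number of threads holding shared access
        self._writer = False         # True while one thread holds exclusive access

    @contextmanager
    def shared(self):
        with self._cond:
            while self._writer:      # readers wait only for a writer
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                if self._readers == 0:
                    self._cond.notify_all()

    @contextmanager
    def exclusive(self):
        with self._cond:
            while self._writer or self._readers:   # wait for everyone
                self._cond.wait()
            self._writer = True
        try:
            yield
        finally:
            with self._cond:
                self._writer = False
                self._cond.notify_all()
```

Readers wrap their access in `with m.shared():`, while code that modifies the object uses `with m.exclusive():`.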

Where do problems with access synchronization start?

Problems with synchronizing shared access appear in a programming language only if it allows references to objects to be created (atomicity of operations and synchronization of access to operating system resources shared between different applications are not considered for now).

Moreover, by the term “reference” I do not mean a physical pointer to some address in the computer’s memory, but a single logical entity that allows you to access the same data (resource) in an arbitrary (uncontrolled) order.

After all, if the order of access to shared data is deterministic and defined in advance, access to such data does not require synchronization objects.

Likewise, if there are no references to objects at all (for example, objects are always deep-copied atomically, i.e. effectively passed by value), then the problem of synchronizing shared access to data inside the program simply does not arise.

But such assumptions and limitations greatly reduce the possibilities of implementing various algorithms, and therefore are rarely used in practice.
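The pass-by-value case can be sketched as follows (Python, for illustration only; `scale_in_threads` is a hypothetical name). Every thread receives its own deep copy, and each writes its result to a slot fixed in advance, so no lock appears anywhere.

```python
import copy
import threading

def scale_in_threads(data, factors):
    """Give every thread its own deep copy of `data`; no lock is used."""
    results = [None] * len(factors)

    def worker(slot, private, factor):
        for k in range(len(private)):
            private[k] *= factor     # mutates only this thread's copy
        results[slot] = private      # each thread writes only its own slot

    threads = [
        threading.Thread(target=worker, args=(i, copy.deepcopy(data), f))
        for i, f in enumerate(factors)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                     # results are read only after all joins
    return results
```

The price is visible in the code: the data is copied once per thread, which is the kind of limitation that makes this approach rare in practice.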

What does Argentum have to do with it?

Several articles on memory management in the Argentum programming language have stuck in my memory. I will not repeat the specifics of this language and instead give the floor to its author, kotan-11, in the publications "Object Lifetime Management: Why it is important and why it was necessary to create a new language 'Argentum'" and "Implementation of the reference model in the Argentum programming language".

And although the language itself did not excite me, the approach of handling memory management at the syntax level struck me as a very sound idea.

Why Rust?

While studying Rust, I was inspired by the idea of memory management based on ownership, where each value in memory has exactly one owner variable, and when the owner goes out of scope, the memory is freed immediately. In effect, this is reference counting carried out entirely at compile time.

But unfortunately, the Rust developers stopped halfway to full control over memory management. It seems to me that it would be more correct to implement at the language level not only control of object "ownership" but full memory management, in particular management of references to objects inside the program.

In other words, if the allowed ways of obtaining references to a variable are specified right at its definition, then the compiler can fully and automatically control resources shared within the application.

Since references to objects imply, among other things, the possibility of shared access, it makes sense to include control of access to shared resources in the memory management model (where necessary), because shared ownership of references will in any case require some kind of synchronization mechanism.

This is how the concept of managing object lifetimes together with control of shared access was born.

Terms and concepts

So, the main concepts and assumptions for the concept of object management:

  • Any object is a reference to an area of data memory.

  • References to objects can be of two types:

    • Strong/owning references (similar to shared_ptr from C++); in effect, this is a variable that stores the value of the object.
    • Weak/non-owning references (similar to weak_ptr from C++): pointers to other objects that must be captured (that is, converted to a strong reference) before use.

  • Variables that own objects (references are stored in them) can be of two types:

    • local (controlled): their lifetime is strictly limited by the rules of the language syntax and controlled by the compiler (function arguments, variables inside code blocks or functions, etc.).
    • uncontrolled: global or static variables and dynamically created objects, whose lifetime is not controlled by the compiler.

  • When a local variable goes out of scope, the reference counter is decremented, and when it reaches zero, the object’s memory is freed.

  • An object can have only one uncontrolled variable holding a strong reference, and any number of local (controlled) variables holding references of any type.

  • Only weak references may be made to uncontrolled variables, and they must be captured before use, e.g. into a local (controlled) variable.

  • Managing the lifetime of an object includes not only managing the memory of the object, but also creating an access synchronization mechanism (if necessary) that works automatically when references are captured and released.

  • When defining a variable (object), possible methods of obtaining references to this object are immediately declared, which can be:

    • no references allowed, i.e. the compiler will not allow a reference to this variable to be created, and such a variable will not be shared.
    • references only in the current thread. When generating machine code, the compiler does not need to create a synchronization object for the variable. But such a variable cannot be accessed from other threads (that is, the variable is not visible from another module, and a reference to such an object cannot be returned as the result of a function).
    • references with exclusive access. The compiler automatically creates a non-recursive mutex to synchronize access to the variable.
    • references with recursive access. The compiler automatically creates a recursive mutex (it can be acquired multiple times).

  • All types of references can be constant, i.e. read-only (and in the case of immutable objects, such a reference will not need a mutex).

  • Capturing a weak reference is done with special syntax constructs that store the result in a local (controlled) variable. Because capture exists only at the level of language syntax, the subsequent automatic release of the temporary variable is guaranteed, which is equivalent to the absence of circular references.
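The strong/weak distinction and the "capture before use" rule can be sketched with Python's `weakref` module, which plays roughly the role that weak_ptr plays in C++. This is only an analogy to the concept above, not its implementation: the names `Resource`, `read_through` and `demo` are hypothetical, and the immediate release after `del` relies on CPython's reference counting.

```python
import weakref

class Resource:
    def __init__(self, value):
        self.value = value

def read_through(weak):
    """Capture a weak reference into a local strong one before use."""
    strong = weak()                  # capture: the object, or None if it is gone
    if strong is None:
        return None                  # owner no longer exists; nothing to access
    return strong.value              # `strong` is dropped when the function returns

def demo():
    owner = Resource(123)            # strong (owning) reference
    weak = weakref.ref(owner)        # weak (non-owning) reference
    before = read_through(weak)
    del owner                        # last strong reference is released
    after = read_through(weak)
    return before, after
```

The captured strong reference lives only in a local variable, so it cannot outlive the call and cannot take part in a reference cycle.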

A hypothetical example of code in an abstract language:

Notation:

  • & - a single-threaded reference without access locking
  • && - a multithreaded reference with exclusive access locking
  • &* - a multithreaded reference with recursive access locking
  • ^ (&^, &&^ or &*^) - suffix indicating a reference to a constant object, i.e. capturing such a reference is read-only
  • * - capture a reference for exclusive access to the object
  • *^ - capture a reference for read-only access

An example of creating variables:

# An ordinary variable: owner of an object without shared access
val := 123;
# and without the possibility of creating a reference to the object, i.e.
val2 := &val; # Error !!!!

# A variable that owns an object, with the possibility of creating a reference
# for use in the current thread, i.e.
& ref := 123;

ref2 := &ref; # OK, but only within a single thread !!!!

# A variable that owns an object, with the possibility of creating a reference
# and with synchronized access from different application threads
# (with inter-thread synchronization), i.e.
&& ref_mt := 123;
ref_mt2 := &&ref_mt; # OK !!!!

# A variable that owns an object, with the possibility of creating a reference
# and with synchronized access from different application threads using a recursive lock
&* ref_mult := 123;
ref_mult2 := &* ref_mult; # OK !!!!

Or something like this:

    # An ordinary variable without shared access
    # and without the possibility of creating a reference to the object
    let val := 123;
    let val_err := &val; # Error !!!!

    # A variable that owns an object with the possibility of shared access
    # and with the possibility of creating a reference in the current thread
    let & ref := 123;
    let ref2 := &ref; # OK, but only within a single thread, without a mutex !!!!

    # A variable with the possibility of creating a reference
    # with access from different application threads and exclusive locking
    let && ref_mt := Dict(field1 = 123, field2 = 456);
    let ref_mt2 := &&ref_mt; # OK

    # Capturing a weak reference is done with a separate operator
    print(*ref_mt2.field1);
    *ref_mt2.field2 = 42;

    def func_name(val, &ref){
        # Controlled variables
        let dup := val; # Copy of a strong reference, increments the counter
        let local := *ref; # Capture of a weak reference, increments the counter
        # On exit from the function the reference counter is decremented
    }

    func_name(ref, &ref);
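To make the intended run-time behaviour of an `&&` variable more tangible, here is a rough emulation in Python. In the concept above the compiler would generate all of this; here it is written by hand. Every name (`Box`, `Guarded`, `GuardedLink`, `link`, `capture`) is hypothetical, and the sketch covers only exclusive capture, not the read-only or recursive variants.

```python
import threading
import weakref
from contextlib import contextmanager

class Box:
    """Plain holder for the payload fields (stands in for Dict above)."""
    def __init__(self, **fields):
        self.__dict__.update(fields)

class Guarded:
    """Stand-in for an `&&` variable: the owner plus its mutex."""
    def __init__(self, obj):
        self._obj = obj
        self._lock = threading.Lock()    # non-recursive, exclusive access

    def link(self):
        return GuardedLink(self)

class GuardedLink:
    """Stand-in for a weak reference created with `&&ref_mt`."""
    def __init__(self, owner):
        self._owner = weakref.ref(owner)

    @contextmanager
    def capture(self):
        owner = self._owner()            # weak -> strong for the block only
        if owner is None:
            raise ReferenceError("owner no longer exists")
        with owner._lock:                # locked on capture, unlocked on release
            yield owner._obj

def demo():
    ref_mt = Guarded(Box(field1=123, field2=456))
    ref_mt2 = ref_mt.link()
    with ref_mt2.capture() as obj:       # roughly `*ref_mt2.field2 = 42`
        obj.field2 = 42
    with ref_mt2.capture() as obj:
        return obj.field1, obj.field2
```

The `with` block plays the role of the capture operator `*`: the lock is held exactly as long as the captured reference exists.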

Why all this complexity?

This concept frees the user (programmer) from the need for manual memory management, does not require a garbage collector, and eliminates errors in both memory management and shared access to resources within the program.

And all memory control happens when the program's source code is compiled, so everything flies at run time :-)!

P.S.

Happy New Year!
