Does C# always have boxing with string concatenation and interpolation?

C# developers are very familiar with the term “boxing”. It can be obvious, or it can be hidden. For example, adding a value of a value type to a string leads to boxing. Or does it not? This is “Schrödinger’s boxing”. In this note, we will try to sort out this uncertainty.

How we ran into this

This topic did not arise by chance. I work on the C# part of the PVS-Studio analyzer. In 2023, one of its development directions was diagnostic rules focused on Unity Engine projects. In particular, we decided to implement diagnostics that point out optimization opportunities.

We started with rule V4001. It identifies code in the project that is executed relatively often and reports cases of boxing within it. Boxing is a rather expensive operation compared to ordinary passing by reference or by value, so we decided to implement detection of the places where it occurs.

One of the cases we considered was boxing when concatenating a string and a value:

string Foo(int a)
{
  return "The value is " + a;
}

At first glance, boxing will always occur here. But after digging deeper, we realized that things are not so clear-cut.

Where does boxing even come from during concatenation?

Boxing occurs when a value of a value type is converted to the type Object or to an interface type implemented by that value type. Such a conversion can be explicit or implicit. An explicit conversion is a direct cast:

var boxedInt = (object)1;

An implicit conversion occurs when a value of a value type is used where Object, or an interface implemented by that value type, is expected:

bool Foo(object obj, int number)
{
  return obj.Equals(number);
}

The Equals method expects an argument of type Object, so the value of number will be boxed when it is passed.

And what happens during concatenation? To some extent, Visual Studio can answer that:
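To make the two kinds of conversion more tangible, here is a minimal sketch (not taken from the analyzer or from any particular project) written as a top-level program, which assumes a C# 9 or later compiler. The variable names are purely illustrative. It shows that each boxing conversion produces a separate heap object that holds a copy of the value:

```csharp
using System;

int number = 1000;

// Explicit boxing: each cast allocates its own object on the heap.
object first = (object)number;
object second = (object)number;

// Implicit boxing: a value type assigned to an interface it implements.
IComparable comparable = number;

Console.WriteLine(first.Equals(second));                   // True: both boxes hold the same value
Console.WriteLine(object.ReferenceEquals(first, second));  // False: two distinct objects
Console.WriteLine(comparable.CompareTo(1000));             // 0: the boxed value equals 1000
```

The second line of output is the interesting one: the values are equal, yet the references differ, because every box is a new allocation.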

1060_NoteAboutBoxing_ua/image2.png

The operator accepts an Object as its right operand, and hence the value of a will be boxed. At least it seems so.

The truth is in IL

Of course, you can’t take IDE hints at their word in such matters. Let’s take a look at what the above code compiles into:

.method private hidebysig static void  Foo(string str,
                                           int32 a) cil managed
{
  ....
  IL_0001:  ldarg.0
  IL_0002:  ldarg.1
  IL_0003:  box      [mscorlib]System.Int32
  IL_0008:  call     string [mscorlib]System.String::Concat(object,
                                                            object)
  IL_000d:  stloc.0
  IL_000e:  ret
}

For simplicity, I have shortened the resulting IL code a bit. The main thing to see here is the box instruction. It indicates that the value of the variable a is boxed. You may also notice that the called String.Concat takes two parameters of type Object, not String and Object as one might expect. In any case, the presence of boxing is undeniable.

All of the above looks logical, and yet boxing will not always occur for such a concatenation.

But how can that be? After all, we saw the box instruction in the IL code! Isn’t that boxing? Well, let’s take another look at the compilation output:

.method private hidebysig static void  Foo(string str,
                                           int32 a) cil managed
{
  ....
  IL_0001:  ldarg.0
  IL_0002:  ldarga.s   a
  IL_0004:  call       instance string [mscorlib]System.Int32::ToString()
  IL_0009:  call       string [mscorlib]System.String::Concat(string,
                                                              string)
  IL_000e:  stloc.0
  IL_000f:  ret
}

As I said, there is no boxing here :).

Okay, attentive (and not so attentive) readers will probably have noticed that the IL code differs considerably between the two cases. The previous example did have boxing and a call to String.Concat(object, object). Here, by contrast, the ToString method is called on the numeric variable, after which the method for concatenating two strings is, quite logically, used.

However, it is important to note: the source code for both examples is the same.

What is the difference?

As you can easily guess, the difference is in how the code is built. Starting with some version, the C# compiler began to optimize such cases of concatenation automatically. I noticed pretty quickly that if the code is compiled with Visual Studio 2019 or later, there is no boxing during concatenation. Then I decided to dig a little deeper and take a quick look at the situation on the various platforms.

With projects under the .NET Framework, everything is quite simple. If MSBuild from Visual Studio 2017 or earlier is used for building, then boxing during concatenation is not optimized away. The version of the target platform does not matter (at least, selecting the latest version available at the moment did not enable any optimization).

.NET Core has had the optimization since about version 3.1. Again, I would like to note that it does not matter which TargetFramework version is set for the project itself. It all depends on the version of the SDK being used.

It should come as no surprise that this optimization is also available for .NET 5 (and later).

Runtime optimizations

Particularly inquisitive minds may assume that the JIT itself could eliminate boxing during concatenation. Indeed, such an optimization seems possible.

I tested it on a .NET Framework project. Unfortunately, I did not see any optimization: if there was boxing in the resulting IL code, it really was performed at run time (with a very noticeable difference in the number of allocations).
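If you want to repeat a similar check yourself, one possible approach is sketched below. This is not the measurement used for this article. It assumes a runtime where GC.GetAllocatedBytesForCurrentThread is available (.NET Framework 4.8 or .NET Core 3.0 and later) and a compiler that supports top-level statements. MeasureAllocatedBytes is a hypothetical helper name. The casts to object force the Concat(object, object) overload, so boxing appears in the IL regardless of the compiler version:

```csharp
using System;

// Hypothetical helper: bytes allocated on the current thread while `action` runs.
static long MeasureAllocatedBytes(Action action)
{
    long before = GC.GetAllocatedBytesForCurrentThread();
    action();
    return GC.GetAllocatedBytesForCurrentThread() - before;
}

const int Iterations = 100_000;

long boxedBytes = MeasureAllocatedBytes(() =>
{
    for (int i = 0; i < Iterations; i++)
    {
        // The casts select Concat(object, object), so the number is boxed explicitly.
        _ = string.Concat((object)"The value is ", (object)(i + 1_000_000));
    }
});

long unboxedBytes = MeasureAllocatedBytes(() =>
{
    for (int i = 0; i < Iterations; i++)
    {
        // Int32.ToString() is called first, then Concat(string, string).
        _ = string.Concat("The value is ", (i + 1_000_000).ToString());
    }
});

Console.WriteLine($"With boxing:    {boxedBytes} bytes");
Console.WriteLine($"Without boxing: {unboxedBytes} bytes");
```

The exact numbers depend on the runtime and the platform, so treat the output as a rough indication rather than a benchmark.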

If you are interested in this topic and you decide to investigate it, please write about your findings in the comments :). For now, I propose to consider one more interesting related issue.

Interpolation

We have dealt with boxing during concatenation. And what about a similar operation, interpolation? After all, it is practically the same thing: combining different pieces into one string. In fact, of course, it is not like that at all. First of all, it is worth saying that there are differences depending on the chosen target platform.

.NET Framework

Let’s look at another example:

void Foo(string str, int num)
{
  _ = $"{str} {num}";
}

No tricks this time – I’ll say straight away that I’m compiling this code from Visual Studio 2022 without doing anything unnatural :). Let’s see the result:

.method private hidebysig instance void  Foo(string str,
                                             int32 num) cil managed
{
  ....
  IL_0001:  ldstr      "{0} {1}"
  IL_0006:  ldarg.1
  IL_0007:  ldarg.2
  IL_0008:  box        [mscorlib]System.Int32
  IL_000d:  call       string [mscorlib]System.String::Format(string,
                                                              object,
                                                              object)
  IL_0012:  pop
  IL_0013:  ret
}

I would say the result is disappointing. We can see that with interpolation, boxing has not gone anywhere even with a new version of the compiler.

Let’s try calling ToString ourselves:

1060_NoteAboutBoxing_ua/image3.png

Visual Studio’s built-in rule IDE0071 suggests removing the “unnecessary” ToString call. However, the benefit of such a call is obvious from the compilation results:

.method private hidebysig instance void  Foo(string str,
                                             int32 num) cil managed
{
  ....
  IL_0001:  ldarg.1
  IL_0002:  ldstr      " "
  IL_0007:  ldarga.s   num
  IL_0009:  call       instance string [mscorlib]System.Int32::ToString()
  IL_000e:  call       string [mscorlib]System.String::Concat(string,
                                                              string,
                                                              string)
  IL_0013:  pop
  IL_0014:  ret
}

No more boxing. Moreover, there is not even a call to String.Format here: the code turned into a concatenation of three strings.
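For reference, the two variants compared above can be written side by side as follows. This is a sketch that assumes the screenshot shows a num.ToString() call inside the interpolation hole. The names WithBoxing and WithoutBoxing are illustrative only:

```csharp
using System;

static string WithBoxing(string str, int num)
{
    // Depending on the target platform, this may compile to
    // String.Format(string, object, object), which boxes num (see the first IL listing).
    return $"{str} {num}";
}

static string WithoutBoxing(string str, int num)
{
    // Every hole is already a string, so the compiler can lower this
    // to String.Concat (see the second IL listing).
    return $"{str} {num.ToString()}";
}

Console.WriteLine(WithBoxing("Value:", 42));
Console.WriteLine(WithoutBoxing("Value:", 42));
```

Both methods return the same text; the difference is only in the generated IL and the allocations behind it.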

.NET Core and .NET

Consider the behavior on these platforms using the same example:

void Foo(string str, int num)
{
  _ = $"{str} {num}";
}

Here, experiments have shown that the optimization depends solely on the target platform of the project. If the project targets .NET Core or .NET 5, then the IL for this code is generated in the same way as for the .NET Framework (that is, there is no optimization: boxing and a call to String.Format).

If the project targets .NET 6 or later, then the compilation result is very different:

.method private hidebysig instance void  Foo(string str,
                                             int32 num) cil managed
{
  ....
  .locals init (valuetype DefaultInterpolatedStringHandler V_0)
  IL_0000:  nop
  IL_0001:  ldloca.s V_0
  IL_0003:  ldc.i4.1
  IL_0004:  ldc.i4.2
  IL_0005:  .... DefaultInterpolatedStringHandler::.ctor(int32, int32)
  IL_000a:  ldloca.s   V_0
  IL_000c:  ldarg.1
  IL_000d:  .... DefaultInterpolatedStringHandler::AppendFormatted(string)
  IL_0012:  nop
  IL_0013:  ldloca.s V_0
  IL_0015:  ldstr " "
  IL_001a:  .... DefaultInterpolatedStringHandler::AppendLiteral(string)
  IL_001f:  nop
  IL_0020:  ldloca.s   V_0
  IL_0022:  ldarg.2
  IL_0023:  .... DefaultInterpolatedStringHandler::AppendFormatted<int32>(!!0)
  IL_0028:  nop
  IL_0029:  ldloca.s   V_0
  IL_002b:  .... DefaultInterpolatedStringHandler::ToStringAndClear()
  IL_0030:  pop
  IL_0031:  ret
}

The code has been greatly shortened for readability. To put it mildly, everything became a little more complicated than a simple call to String.Format :). Instead, the DefaultInterpolatedStringHandler structure is used to build the string. Investigating the efficiency of this approach is beyond the scope of this article, but something here clearly catches the eye (unless your eyes have glazed over from so much IL, of course).

Pay attention to the call DefaultInterpolatedStringHandler::AppendFormatted<int32>(!!0). I’ll be honest – I have no idea what “!!0” is, but the presence of the generic parameter hints that the number will not be boxed here.
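For what it is worth, in IL notation “!!0” refers to the first generic parameter of the method being called, which here is int32. The listing can be approximated in C# roughly as follows. This is a hand-written sketch for .NET 6 or later, not the compiler’s exact output, and BuildManually is a hypothetical name:

```csharp
using System;
using System.Runtime.CompilerServices;

static string BuildManually(string str, int num)
{
    // 1 literal character (" ") and 2 interpolation holes, as in the IL listing.
    var handler = new DefaultInterpolatedStringHandler(1, 2);
    handler.AppendFormatted(str);  // non-generic overload that takes a string
    handler.AppendLiteral(" ");
    handler.AppendFormatted(num);  // generic AppendFormatted<int>: the int is passed by value
    return handler.ToStringAndClear();
}

Console.WriteLine(BuildManually("Value:", 42));
```

Since num goes into a generic method as an int, no conversion to Object appears at the call site, which matches the absence of a box instruction in the IL.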

.NET 6 rules, in general :).

Conclusion

In general, if we use old versions of the compiler, boxing during concatenation really does occur, and therefore calling ToString makes sense. With the new versions there will be no boxing anyway (I hope no one will torment candidates with such questions during interviews).

Interpolation is protected from boxing only if the project targets .NET 6 or later. In other situations, calling ToString on interpolated values can be useful.

Thank you for your attention. Let me remind you that I participate in the development of the PVS-Studio analyzer, which allows you to search for various errors in the code. If you want to try it out, you can do so for free here. Good luck!
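For code that may be built with an older compiler, the defensive variant of the very first example is a one-line change. This is a sketch of that idea, not a required pattern:

```csharp
using System;

static string Foo(int a)
{
    // The explicit ToString() makes both operands strings, so Concat(string, string)
    // is selected and no box is needed, whatever the compiler version.
    return "The value is " + a.ToString();
}

Console.WriteLine(Foo(42));
```

With newer compilers the explicit call changes nothing in the generated IL for this case, so it mostly matters for legacy build environments.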

If you would like to share this article with an English-speaking audience, please use the translation link: Nikita Lipilin. Does C# always have boxing with string concatenation and interpolation?.
