So how many lines of C does it actually take to do a + b in Python? / Hebrew

This is a response to the article “How many lines of C does it take to do a + b in Python?”, which never actually gives the number of lines. I’m not going to count every possible Python scenario: too many options, and I’m too lazy. But here is the question: how many lines of C does Python execute to add two numbers? And does it make a difference whether the numbers are stored in variables?

Formulation of the problem

I decided to measure not in interactive mode but when reading from a file, because this avoids the extra input and output handling that interactive mode needs. Python simply executes the file. We will compare the execution of an empty file with that of a file that performs the addition. So, here is what we need to do:

  • find out how many lines of C CPython executes to run an empty file and exit without errors

  • find out how many lines of C CPython executes to run a file that adds two numbers without saving the result anywhere

  • find out how many lines of C CPython executes to run a file that adds two variables containing numbers

Contents of files for comparison.

We will compare an empty file with two cases

4+5

and

a=4
b=5
a+b

I will not display the source code of an empty file =)
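Before measuring, it can help to look at the bytecode the two test files compile to. The sketch below uses the standard `dis` module; the helper names `opnames` and `has_binary_op` are mine, not part of the original experiment. As far as I know, CPython folds `4+5` into a constant at compile time, so no addition instruction should remain, while `a+b` keeps one (named `BINARY_ADD` or `BINARY_OP` depending on the version). Treat that as an assumption to verify on your own build.

```python
import dis

LITERALS = "4+5\n"
VARIABLES = "a=4\nb=5\na+b\n"

def opnames(source):
    """Compile source as a module and return its bytecode instruction names."""
    code = compile(source, "<example>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]

def has_binary_op(source):
    """True if the compiled module still contains a BINARY_* instruction."""
    return any(name.startswith("BINARY") for name in opnames(source))

print("4+5      ->", opnames(LITERALS))
print("a+b case ->", opnames(VARIABLES))
```

If the folding assumption holds on your build, the extra lines measured for `4+5` come mostly from parsing and compiling the file rather than from an addition at run time.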

What is needed for this?

Compile a debug build of Python with code coverage enabled, then generate a report. I really wanted to do all of this on Windows, but code coverage is only available in Visual Studio Enterprise and I have the Community Edition. So it’s time to set up a Linux virtual machine (I chose Ubuntu 22.04 LTS, because I don’t like working only from the console). Below is a step-by-step guide. If you are only interested in the results, feel free to skip to the end of the article.

How to get the build.

The official page describes how to build Python. Find your operating system there and follow the steps. First, install the dependencies:

sudo vim /etc/apt/sources.list
# append this line to the end of the file
deb-src http://archive.ubuntu.com/ubuntu/ jammy main
# save and exit
sudo apt-get update
sudo apt-get build-dep python3
sudo apt-get install pkg-config
sudo apt-get install git

# this part is optional, but it removes extra build warnings
# about Python missing some libraries
sudo apt-get install build-essential gdb lcov pkg-config \
libbz2-dev libffi-dev libgdbm-dev libgdbm-compat-dev liblzma-dev \
libncurses5-dev libreadline6-dev libsqlite3-dev libssl-dev \
lzma lzma-dev tk-dev uuid-dev zlib1g-dev

OK, now we can download the Python sources.

git clone https://github.com/python/cpython
cd cpython

We need a debug build of Python, so:

./configure --with-pydebug
make -j8
# 8 is the number of parallel jobs (roughly, CPU cores) used for compilation; without -j, make runs a single job

So, the build succeeded and a magical file named python appeared. Let’s run it and check that it starts and exits without errors, and that 4+5 is still 9. Now we need a build with code coverage. Despite configuring with --with-pydebug, some optimization is still enabled (admittedly very little, but it is better without it, otherwise the report becomes even less accurate). Open configure in a text editor and find where the “-Og” flag is set. In my case, it is:

PYDEBUG_CFLAGS="-O0"
if test "x$ac_cv_cc_supports_og" = xyes
then :
  PYDEBUG_CFLAGS="-Og"
fi

Remove -Og and leave the quotation marks empty.
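If you would rather not edit configure by hand, the change can be sketched in Python. This is only an illustration under the assumption that your configure contains the exact line shown above; the function name `strip_og` is hypothetical.

```python
def strip_og(configure_text):
    """Replace PYDEBUG_CFLAGS="-Og" with an empty value; leave everything else alone."""
    return configure_text.replace('PYDEBUG_CFLAGS="-Og"', 'PYDEBUG_CFLAGS=""')

# The fragment of configure quoted in the article, used here as sample input.
SAMPLE = '''PYDEBUG_CFLAGS="-O0"
if test "x$ac_cv_cc_supports_og" = xyes
then :
  PYDEBUG_CFLAGS="-Og"
fi
'''

patched = strip_og(SAMPLE)
print(patched)

# To patch the real file from the cpython checkout (assumption: run it there):
#     from pathlib import Path
#     p = Path("configure")
#     p.write_text(strip_og(p.read_text()))
```

The replacement is deliberately narrow: it touches only the one assignment and leaves the "-O0" default in place.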

The gcc option “--coverage” enables code coverage instrumentation. For gcc to see this option during compilation, you need to set two environment variables. This is described in the help of the configure script:

./configure --help
# these lines
CFLAGS      C compiler flags
LDFLAGS     linker flags, e.g. -L<lib dir> if you have libraries in a
            nonstandard directory <lib dir>

On Linux it is:

export CFLAGS="--coverage"
export LDFLAGS="--coverage"

env
# check that they are set

Run configure and make again, as before. Start python and check that everything works and nothing crashes. If everything went correctly, the folder should now contain a bunch of files with the extensions gcno and gcda.
Let’s check:

find . | grep gcda
find . | grep gcno

The gcno files contain static information, while the gcda files record which lines were executed during a run. So before each of the three Python runs and its report, it is a good idea to remove the gcda files. This is how it is done:

find . |grep gcda |xargs rm

Now there are no gcda files and we can run our files.

Generating the coverage report

So, before each report we get rid of the gcda files. The first run is simply with an empty file:

./python empty.py

mkdir report_empty
gcovr --html-details report_empty/index.html
# generate the report

Remove the gcda files and repeat the procedure for the other two files, each with its own report folder.
[Image: a typical gcovr HTML report]
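The clean, run, report cycle is easy to get wrong by hand, so here is a minimal sketch of it in Python. The script names literals.py and variables.py and the report folder names other than report_empty are hypothetical; the `./python` and `gcovr --html-details` commands are the ones used above, and the code assumes it is run from the cpython checkout.

```python
import subprocess
from pathlib import Path

# Hypothetical names: report folders and script files for the three runs.
CASES = {
    "report_empty": "empty.py",
    "report_literals": "literals.py",
    "report_variables": "variables.py",
}

def remove_gcda(root="."):
    """Delete every .gcda file under root; return how many were removed."""
    files = list(Path(root).rglob("*.gcda"))
    for f in files:
        f.unlink()
    return len(files)

def commands_for(report_dir, script):
    """Commands for one run: execute the script, then build the HTML report."""
    return [
        ["./python", script],
        ["gcovr", "--html-details", f"{report_dir}/index.html"],
    ]

def run_case(report_dir, script):
    """Clean old counters, run the instrumented interpreter, generate the report."""
    remove_gcda(".")
    Path(report_dir).mkdir(exist_ok=True)
    for cmd in commands_for(report_dir, script):
        subprocess.run(cmd, check=True)

# Usage, from the cpython checkout:
#     for report_dir, script in CASES.items():
#         run_case(report_dir, script)
```

Removing the counters inside `run_case` means no run can accidentally inherit coverage from the previous one.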

Results

Open index.html in each report folder and see how many lines were executed in each run. My results are as follows.

Empty file:

            Exec    Total    Coverage
Lines:      33241   253928   13.1%
Branches:   15536   190265   8.2%

4+5

            Exec    Total    Coverage
Lines:      34261   253928   13.5%
Branches:   16136   190265   8.5%

a=4
b=5
a+b

            Exec    Total    Coverage
Lines:      34724   253928   13.7%
Branches:   16483   190265   8.7%

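All that remains is to subtract the empty-file baseline: 4+5 costs 34261 - 33241 = 1020 additional lines, and the version with variables costs 34724 - 33241 = 1483 additional lines.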
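The subtraction can be reproduced, branches included, with a few lines of Python. The numbers are copied from the three reports above; the names `RESULTS` and `delta` are only for illustration.

```python
# (executed lines, executed branches) as read from the three gcovr reports
RESULTS = {
    "empty": (33241, 15536),
    "4+5": (34261, 16136),
    "a=4; b=5; a+b": (34724, 16483),
}

def delta(case, baseline="empty"):
    """Extra executed lines and branches relative to the baseline run."""
    lines, branches = RESULTS[case]
    base_lines, base_branches = RESULTS[baseline]
    return lines - base_lines, branches - base_branches

for case in ("4+5", "a=4; b=5; a+b"):
    extra_lines, extra_branches = delta(case)
    print(f"{case}: +{extra_lines} lines, +{extra_branches} branches")
```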

Epilogue

What do these numbers tell us: is that a lot or a little? I would say it is normal; this is Python. If that is too expensive for you, there are plenty of number crunchers such as scipy or numpy, where an operation like this can cost one CPU cycle or less. In any case, the results of this experiment have little to do with real life, because everything was compiled without optimization. With optimization the number of source lines will of course not shrink, but the generated assembly can slim down considerably.

Moreover, I only looked at adding ordinary integers. Adding instances of classes will cost differently, and differently for each class. Each line of code also has its own cost and complexity, so it would be far more accurate to compare how many assembly instructions are required. The problem is that the result can vary greatly depending on the platform, the compiler flags, the type and size of the data involved, as well as the position of the moon and the playfulness of Mercury’s mood. Not to mention that number crunchers can, and do, use entirely different machine instructions.
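As a hedged illustration of why adding instances of a class costs differently: for a user-defined class, a + b ends up calling the class’s own `__add__`, which itself runs as Python code, so the amount of C executed depends on what that method does. The `Vector` class below is a made-up example and was not measured in this article.

```python
class Vector:
    """Hypothetical 2D vector, used only to illustrate operator dispatch."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        # Runs as Python bytecode every time `v + w` is evaluated,
        # and performs two further additions of its own.
        return Vector(self.x + other.x, self.y + other.y)

v = Vector(1, 2) + Vector(3, 4)
print(v.x, v.y)
```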