Auto-Scaling in Telegram bots

Hello, Habr!

Auto-scaling is a mechanism that lets a system adapt flexibly to changing load by automatically expanding or shrinking the resources it uses, which makes it especially relevant for services whose traffic is hard to predict.

For example, say you built an incredibly cool Telegram bot that suddenly took off. At first it handled a steady flow of requests, but as the bot's subscriber base grew and activity increased, it became clear that a scalable infrastructure was needed.

Auto-scaling not only keeps the bot running stably during bursts of activity, but also significantly reduces the cost of maintaining excess infrastructure during periods of low activity.

Auto-scaling basics

The main goal of auto-scaling is to maintain stable performance while minimizing costs.

Auto-scaling allows the system to dynamically adapt to changing workloads by automatically increasing or decreasing the number of servers or application instances. This helps avoid resource overspending during low-load periods and resource shortages during peak periods.

Auto-scaling is based on monitoring key performance indicators such as CPU, memory, and network activity, or application-specific metrics (for a Telegram bot, for example, the number of concurrent user sessions).

Threshold values are defined for these metrics; when a threshold is reached, the system automatically initiates the scaling process.

Horizontal scaling means adding or removing server or container instances.

Vertical scaling means increasing or decreasing the resources of existing instances (for example, adding RAM or CPU).

Scaling is fully automated and happens without human intervention. After scaling, the system should continue to function as before, without losing data or functionality.

Scaling is governed by defined rules or policies that determine when and how it should happen. These rules can be schedule-based (for example, different scaling parameters for peak and off-peak hours) or adaptive algorithms that respond to changes in real time, as the sketch below illustrates.
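For illustration, here is a minimal sketch of such a policy in Python; the peak hours, the per-instance capacity of 50 messages per second, and the baseline replica counts are all assumed numbers, not recommendations:

import math

def desired_replicas(current_mps: int, hour: int) -> int:
    # Schedule-based baseline: keep more instances during the assumed peak hours
    baseline = 4 if 9 <= hour < 21 else 2
    # Reactive part: assume each instance comfortably handles ~50 messages/sec
    needed = math.ceil(current_mps / 50)
    return max(baseline, needed, 1)

A real autoscaler would evaluate something like this periodically and adjust the number of running instances toward the returned value.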

When and why your bot might need auto-scaling

The load on a Telegram bot is determined by the number of requests it receives per unit of time. These requests can vary from simple commands to complex interactive dialogs.

Load types:

Peak load: temporary spikes in activity, often caused by specific events or marketing promotions.

Constant load: a regular number of requests sustained over a long period of time.

Variable load: irregular changes in activity, often related to the time of day or day of the week.

Metrics for load analysis:

Number of active users (DAU/MAU): daily and monthly active users.

Number of messages per second/minute (MPS/MPM): the average number of messages per second or minute.

Response time: the time required to respond to a user request.

Error rate: the percentage of requests that fail.

If load metrics exceed predefined thresholds (such as response time or MPS), that is a signal to scale.
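As a rough illustration, MPS can be tracked inside the bot process with a sliding window; the 60-second window below is an arbitrary choice:

import time
from collections import deque

WINDOW_SECONDS = 60
timestamps = deque()

def record_message():
    # call this from the bot's message handler
    now = time.time()
    timestamps.append(now)
    # drop events that have left the window
    while timestamps and timestamps[0] < now - WINDOW_SECONDS:
        timestamps.popleft()

def messages_per_second():
    # approximate rate over the window (undercounts until the window fills)
    return len(timestamps) / WINDOW_SECONDS

In practice you would more likely export such metrics to a monitoring system (Prometheus, CloudWatch, and the like) and let it drive the scaling decision.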

How to implement

Stateless architecture

The main feature is that each request is processed independently, without saving any user state on the server between requests. This allows any instance to handle any request.

The bot does not store user state information between successive requests. All necessary information is transmitted in each request. All user state and session data is stored in external storage, such as a database. Each request must be idempotent, that is, its repeated execution must lead to the same results.

For example, it might look like this:

from telegram.ext import Updater, CommandHandler, MessageHandler, Filters
import os
import psycopg2

# Connect to the PostgreSQL database
DATABASE_URL = os.getenv('DATABASE_URL')
conn = psycopg2.connect(DATABASE_URL, sslmode="require")

# Handler for the /start command
def start(update, context):
    user_id = update.effective_user.id
    with conn.cursor() as cur:
        cur.execute("INSERT INTO users (user_id) VALUES (%s) ON CONFLICT DO NOTHING", (user_id,))
        conn.commit()
    update.message.reply_text("Hi! I'm a stateless Telegram bot.")

# Handler for regular text messages
def handle_message(update, context):
    user_id = update.effective_user.id
    # Fetch the user's data from the database
    with conn.cursor() as cur:
        cur.execute("SELECT * FROM users WHERE user_id = %s", (user_id,))
        user_data = cur.fetchone()
        # The user's data can be processed here
    update.message.reply_text("Ваше сообщение обработано!")

# Entry point of the example
def main():
    TOKEN = os.getenv('TELEGRAM_BOT_TOKEN')
    updater = Updater(TOKEN, use_context=True)

    dp = updater.dispatcher
    dp.add_handler(CommandHandler("start", start))
    dp.add_handler(MessageHandler(Filters.text & ~Filters.command, handle_message))

    updater.start_polling()
    updater.idle()

if __name__ == '__main__':
    main()

PostgreSQL is used to store user data. The connection is made through psycopg2. start and handle_message are command and message handlers, respectively. They interact with the database to read or write data about the user.

All information about the user is stored in the database. Each request is processed independently and the state is not stored on the bot server.

Queue management systems

Queues allow programs to process tasks asynchronously: the producer sends messages to the queue, and the consumer retrieves and processes them when ready. This helps distribute the load evenly between different worker nodes or processes.

They also increase the system’s fault tolerance because messages can be stored and reprocessed in the event of errors.

Finally, queues make it easy to add extra worker nodes to process messages.

For example, let's integrate a RabbitMQ queue with the bot. In this scenario, the bot will send messages to a RabbitMQ queue, which will then be processed by separate worker processes:

Producer: the Telegram bot

import os
from telegram.ext import Updater, CommandHandler, MessageHandler, Filters
import pika

# Connect to RabbitMQ
connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="telegram_queue")

# Handler that forwards incoming messages to the queue
def handle_message(update, context):
    message = update.message.text
    channel.basic_publish(exchange="",
                          routing_key='telegram_queue',
                          body=message)
    update.message.reply_text('Message sent to the queue!')

def main():
    TOKEN = os.getenv('TELEGRAM_BOT_TOKEN')
    updater = Updater(TOKEN, use_context=True)

    dp = updater.dispatcher
    dp.add_handler(MessageHandler(Filters.text & ~Filters.command, handle_message))

    updater.start_polling()
    updater.idle()

if __name__ == '__main__':
    main()

Consumer: a message-processing worker

import pika

def callback(ch, method, properties, body):
    print(f"Received message: {body.decode()}")

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="telegram_queue")

channel.basic_consume(queue="telegram_queue", on_message_callback=callback, auto_ack=True)

print('Waiting for messages. Press CTRL+C to exit')
channel.start_consuming()

RabbitMQ is used as the queue management system. The bot forwards each message it receives from a user to the telegram_queue queue as a separate message.

A separate process (or several processes) subscribes to the queue and handles messages as they arrive.
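To scale out, you can simply start several copies of this worker. With manual acknowledgements and a prefetch limit, RabbitMQ distributes messages fairly between them; a sketch of the adjusted consumer setup (the auto_ack=True above is replaced with explicit acks):

import pika

def callback(ch, method, properties, body):
    print(f"Received message: {body.decode()}")
    # acknowledge only after successful processing, so that a crashed
    # worker does not lose the message
    ch.basic_ack(delivery_tag=method.delivery_tag)

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="telegram_queue")

# give each worker at most one unacknowledged message at a time
channel.basic_qos(prefetch_count=1)
channel.basic_consume(queue="telegram_queue", on_message_callback=callback, auto_ack=False)
channel.start_consuming()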

Microservice architecture

I think microservices need no introduction.

Let's say we have a bot that provides two main functions: processing text commands and processing images. In a microservice architecture, we divide these functions into two separate services:

  1. TextCommandService: Processes text commands from users.

  2. ImageProcessingService: Handles image processing requests.

Both services can be deployed independently and scale based on load.

Let's write some simple code in which the two microservices interact via HTTP requests.

TextCommandService

# text_command_service.py
from flask import Flask, request
app = Flask(__name__)

@app.route('/process_text', methods=['POST'])
def process_text():
    text = request.json.get('text')
    # Process the text command
    response = f"Processed text: {text}"
    return response

if __name__ == '__main__':
    app.run(port=5000)

ImageProcessingService

# image_processing_service.py
from flask import Flask, request
app = Flask(__name__)

@app.route('/process_image', methods=['POST'])
def process_image():
    image_data = request.json.get('image_data')
    # Process the image
    response = f"Processed image: {image_data[:10]}..."
    return response

if __name__ == '__main__':
    app.run(port=5001)

Telegram bot

# telegram_bot.py
import base64
import requests
from telegram.ext import Updater, MessageHandler, Filters

def handle_text(update, context):
    text = update.message.text
    response = requests.post("http://localhost:5000/process_text", json={"text": text})
    update.message.reply_text(response.text)

def handle_photo(update, context):
    photo_file = update.message.photo[-1].get_file()
    image_data = photo_file.download_as_bytearray()
    # raw bytes are not JSON-serializable, so send them as base64 text
    encoded = base64.b64encode(image_data).decode('ascii')
    response = requests.post("http://localhost:5001/process_image", json={"image_data": encoded})
    update.message.reply_text(response.text)

def main():
    TOKEN = 'YOUR_TELEGRAM_BOT_TOKEN'
    updater = Updater(TOKEN, use_context=True)

    dp = updater.dispatcher
    dp.add_handler(MessageHandler(Filters.text & ~Filters.command, handle_text))
    dp.add_handler(MessageHandler(Filters.photo, handle_photo))

    updater.start_polling()
    updater.idle()

if __name__ == '__main__':
    main()

TextCommandService and ImageProcessingService run on different ports and handle different types of requests. The bot determines the type of an incoming message (text or image) and forwards the request to the appropriate microservice.

Each microservice can be scaled independently. For example, if most requests involve image processing, ImageProcessingService can be scaled out by adding more instances.
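Even without an orchestrator, a single Flask service can be given several worker processes with gunicorn (the worker count of 4 here is an arbitrary choice):

gunicorn -w 4 -b 0.0.0.0:5001 image_processing_service:app

For scaling across machines you would put a load balancer such as nginx in front of the instances, or hand the problem to an orchestrator, which is the subject of the next section.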

Containerization and orchestration

Containerization is the process of packaging an application with all its dependencies and configurations into an isolated container.

An example Dockerfile for the Telegram bot:

FROM python:3.12

# Set the working directory inside the container
WORKDIR /bot

# Copy the project files into the container
COPY . /bot

# Install the dependencies
RUN pip install -r requirements.txt

CMD ["python", "./my_telegram_bot.py"]

We create a Docker image for the Telegram bot, starting with a basic Python image, installing the necessary dependencies and defining the command to launch the bot.
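Building and running the image locally might look like this (the bot token is assumed to be read from the TELEGRAM_BOT_TOKEN environment variable, as in the earlier examples):

docker build -t my_telegram_bot .
docker run -e TELEGRAM_BOT_TOKEN=<your_token> my_telegram_bot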

Container orchestration is the process of managing the life cycle of containers.

To deploy a Telegram bot in Kubernetes, you can create a configuration file deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: telegram-bot-deployment
spec:
  replicas: 2 # start with two instances
  selector:
    matchLabels:
      app: telegram-bot
  template:
    metadata:
      labels:
        app: telegram-bot
    spec:
      containers:
      - name: telegram-bot
        image: my_telegram_bot:latest
        ports:
        - containerPort: 80

To enable autoscaling in Kubernetes, you can use the Horizontal Pod Autoscaler (HPA), which automatically scales the number of pods in response to load:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: telegram-bot-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: telegram-bot-deployment
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80

This HPA will scale the number of pods for telegram-bot-deployment from 2 to 10 based on CPU usage.
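Note that CPU-based autoscaling only works if the cluster runs metrics-server and the pods declare CPU requests, so the container spec in the Deployment needs a resources section (the request values here are assumptions):

      containers:
      - name: telegram-bot
        image: my_telegram_bot:latest
        resources:
          requests:
            cpu: "250m"
            memory: "128Mi"

Without a CPU request, the HPA has no baseline against which to compute the utilization percentage.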

CI/CD

Continuous Integration (CI) means automatically testing the code on every commit to the version control system.

Consider an example where we use GitHub Actions to automatically test the Telegram bot on each commit.

Let’s create a file .github/workflows/python-app.yml in the repository on GitHub:

name: Python application

on:
  push:
    branches: [ master ]
  pull_request:
    branches: [ master ]

jobs:
  build:

    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v4
    - name: Set up Python 3.12
      uses: actions/setup-python@v5
      with:
        python-version: 3.12
    - name: Install dependencies
      run: |
        pip install -r requirements.txt
    - name: Run tests
      run: |
        pytest

on: determines when the workflow runs (on pushes and pull requests to the master branch).

jobs: Defines a set of tasks to be performed (here only build).

steps: a sequence of steps that checks out the code from the repository, sets up Python, installs dependencies, and runs the tests with pytest.
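For the pytest step to do anything, the repository needs at least one test. A minimal, purely hypothetical example, assuming the bot's code exposes a pure helper for parsing commands:

# test_bot.py
def parse_command(text):
    # split "/start ref123" into a command name and its arguments
    parts = text.strip().split()
    return parts[0].lstrip('/'), parts[1:]

def test_parse_command():
    command, args = parse_command("/start ref123")
    assert command == "start"
    assert args == ["ref123"]

In a real project the helper would of course live in the bot's own module and be imported into the test.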

Continuous Delivery (CD) automates application deployment to a test or production environment after the tests pass.

Let’s say we want to automatically deploy our Telegram bot to Kubernetes after successful tests. For this, we need a Docker container and Kubernetes configuration:

As an example, let's take the Dockerfile discussed above and configure GitHub Actions to build and publish the Docker image:

- name: Build and Push Docker image
  if: github.event_name == 'push'
  run: |
    echo ${{ secrets.DOCKER_PASSWORD }} | docker login -u ${{ secrets.DOCKER_USERNAME }} --password-stdin
    docker build -t my_telegram_bot .
    docker tag my_telegram_bot ${{ secrets.DOCKER_USERNAME }}/my_telegram_bot:latest
    docker push ${{ secrets.DOCKER_USERNAME }}/my_telegram_bot:latest

Automatic deployment to Kubernetes:

- name: Deploy to Kubernetes
  if: github.event_name == 'push'
  run: |
    kubectl apply -f k8s-deployment.yaml

k8s-deployment.yaml is a Kubernetes configuration file that describes how the Telegram bot should be deployed and scaled.
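Note that re-applying an unchanged manifest that references the :latest tag does not by itself restart the pods, so deployment pipelines often add an explicit restart after the apply:

kubectl rollout restart deployment/telegram-bot-deployment

Pinning images to unique tags (for example, the commit SHA) instead of :latest is the more robust approach, since kubectl apply then picks up the change on its own.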


Well-designed autoscaling lets the bot cope with changing loads and significantly increases its stability and adaptability.

All the tools for improving the stability and performance of the systems you develop can be learned under the guidance of experienced engineers in online courses at OTUS.
