Building fast gRPC services with Tonic and Rust

Hello, Habr!

Today we will look at how to build gRPC services for machine learning using the Tonic framework and the Rust language. If your project needs distributed systems that are as efficient as possible, and you value performance and asynchronous programming, then Rust together with Tonic is a great fit.

Installing Tonic

The first thing to do is create a new Rust project. To do this, open the terminal and enter:

cargo new my_grpc_service
cd my_grpc_service

Now open Cargo.toml and add Tonic along with a few other required libraries:

[dependencies]
tonic = "0.12.2"
prost = "0.11"
tokio = { version = "1", features = ["full"] }
  • Tonic – the main framework for gRPC.

  • Prost – a library for encoding and decoding Protocol Buffers.

  • Tokio – the asynchronous runtime for Rust that Tonic requires.

Now let’s enable the necessary feature flags for Tonic so that all the required functionality is available. Cargo.toml should end up looking like this:

[dependencies]
tonic = { version = "0.12.2", features = ["transport", "codegen", "tls"] }
prost = "0.11"
tokio = { version = "1", features = ["full"] }

[build-dependencies]
tonic-build = "0.12.2"
  • transport – HTTP/2 transport support.

  • codegen – code generation from .proto files.

  • tls – support for secure (TLS) connections.

Now that the dependencies are in place, let’s create the basic structure of the project.

mkdir proto

Let’s create a file hello.proto in this folder and add the following code:

syntax = "proto3";

package hello;

service Greeter {
    rpc SayHello (HelloRequest) returns (HelloResponse);
}

message HelloRequest {
    string name = 1;
}

message HelloResponse {
    string message = 1;
}

This simple protocol describes a Greeter service that accepts a HelloRequest and returns a HelloResponse.

Now we need to generate Rust code from the .proto file. To do this, create a build.rs file in the root of the project with the following content:

fn main() {
    tonic_build::compile_protos("proto/hello.proto").unwrap();
}

This makes tonic-build compile our protocol definitions whenever the project is built.
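
If you need more control over code generation (for example, generating only the server side or changing the output directory), tonic-build also exposes a builder API. A minimal sketch, assuming tonic-build 0.12, where the builder’s compile step is called compile (renamed to compile_protos in later releases):

// build.rs — alternative using the builder API (sketch for tonic-build 0.12)
fn main() -> Result<(), Box<dyn std::error::Error>> {
    tonic_build::configure()
        .build_server(true)  // generate server code
        .build_client(true)  // generate client code as well
        .compile(&["proto/hello.proto"], &["proto"])?; // proto files, include dirs
    Ok(())
}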

Now that we have everything ready, it’s time to assemble the project. Execute the command:

cargo build

If everything is done correctly, the project compiles without errors.

Creating a gRPC service

Let’s start by defining our messages and services in the .proto file. We’ll take the hello.proto we created earlier and expand it a bit. Say we want to add functionality for greeting users and for getting a list of users:

syntax = "proto3";

package hello;

service Greeter {
    rpc SayHello (HelloRequest) returns (HelloResponse);
    rpc ListUsers (ListUsersRequest) returns (ListUsersResponse);
}

message HelloRequest {
    string name = 1;
}

message HelloResponse {
    string message = 1;
}

message ListUsersRequest {}

message ListUsersResponse {
    repeated string users = 1;
}

A new ListUsers method has been added here, which returns a list of users. Now let’s generate the code.

Run the build command:

cargo build

This will generate Rust modules from the .proto file.

Now we create a file src/main.rs and add the following code:

use tonic::{transport::Server, Request, Response, Status};
use hello::greeter_server::{Greeter, GreeterServer};
use hello::{HelloRequest, HelloResponse, ListUsersRequest, ListUsersResponse};

pub mod hello {
    tonic::include_proto!("hello"); // pulls in the code generated by tonic-build
}

#[derive(Default)]
pub struct MyGreeter {}

#[tonic::async_trait]
impl Greeter for MyGreeter {
    async fn say_hello(
        &self,
        request: Request<HelloRequest>,
    ) -> Result<Response<HelloResponse>, Status> {
        let name = request.into_inner().name;
        let reply = HelloResponse {
            message: format!("Hello, {}!", name),
        };
        Ok(Response::new(reply))
    }

    async fn list_users(
        &self,
        _request: Request<ListUsersRequest>,
    ) -> Result<Response<ListUsersResponse>, Status> {
        let users = vec!["Alice".to_string(), "Bob".to_string(), "Charlie".to_string()];
        let reply = ListUsersResponse { users };
        Ok(Response::new(reply))
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let addr = "[::1]:50051".parse()?;
    let greeter = MyGreeter::default();

    Server::builder()
        .add_service(GreeterServer::new(greeter))
        .serve(addr)
        .await?;

    Ok(())
}

MyGreeter: the implementation of the Greeter service. We implement the methods defined in our .proto file.

say_hello: this method handles the greeting request. It pulls the name out of the HelloRequest, forms a response, and returns it.

list_users: this method returns a list of users. We simply create a vector of strings and wrap it in a ListUsersResponse.

In Tonic, requests and responses are handled with the Request and Response types. These types let us work with the data coming from the client and form the responses that will be sent back.

Request: this type contains the data received from the client as well as metadata. You can get the main message using the into_inner() method.

let name = request.into_inner().name;
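
You can also inspect the request metadata (gRPC headers) before consuming the message with into_inner(). A small sketch; the "user-agent" key here is just an example header:

// Read a metadata entry (gRPC header) before taking ownership of the message.
if let Some(agent) = request.metadata().get("user-agent") {
    println!("client user-agent: {:?}", agent);
}
let name = request.into_inner().name;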

Response: this type is used to form responses. We create a Response instance, passing in the data we want to return to the client.

let reply = HelloResponse {
    message: format!("Hello, {}!", name),
};
Ok(Response::new(reply))

To start the server, simply execute the command:

cargo run

If everything is successful, the server will be listening on [::1]:50051.
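
To check the service from the client side, you can use the GreeterClient that tonic-build generates alongside the server. A minimal sketch; placing it in src/bin/client.rs (an assumed path) lets you run it with cargo run --bin client while the server is up:

use hello::greeter_client::GreeterClient;
use hello::HelloRequest;

pub mod hello {
    tonic::include_proto!("hello");
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Connect to the server started above.
    let mut client = GreeterClient::connect("http://[::1]:50051").await?;

    // Call the SayHello RPC.
    let response = client
        .say_hello(tonic::Request::new(HelloRequest {
            name: "Habr".to_string(),
        }))
        .await?;

    println!("Response: {}", response.into_inner().message);
    Ok(())
}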

A little optimization

Message size control

The first step to optimization is managing message sizes.

Tonic lets you set the maximum size of incoming and outgoing messages. By default, the limit is 4 MB for incoming messages and usize::MAX for outgoing ones.

Let’s add the following configuration to the server implementation:

use tonic::{transport::Server, Request, Response, Status};
use hello::greeter_server::{Greeter, GreeterServer};
use hello::{HelloRequest, HelloResponse};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let addr = "[::1]:50051".parse()?;
    let greeter = MyGreeter::default();

    Server::builder()
        .add_service(
            GreeterServer::new(greeter)
                .max_decoding_message_size(10 * 1024 * 1024)  // 10 MB for incoming messages
                .max_encoding_message_size(10 * 1024 * 1024), // 10 MB for outgoing messages
        )
        .serve(addr)
        .await?;

    Ok(())
}

Here, 10 MB limits are set for incoming and outgoing messages; note that these limits are configured on the generated service (GreeterServer), not on the server builder itself.
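
The generated client exposes the same knobs, so the limits can be raised on the client side too. A sketch, assuming the GreeterClient from the earlier example:

let mut client = GreeterClient::connect("http://[::1]:50051")
    .await?
    .max_decoding_message_size(10 * 1024 * 1024)  // 10 MB for incoming responses
    .max_encoding_message_size(10 * 1024 * 1024); // 10 MB for outgoing requests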

Compression is another powerful optimization tool. Tonic supports message compression with the gzip and zstd algorithms.

First you will need to enable compression in the server configuration. Here’s how to do it:

use tonic::codec::CompressionEncoding;
use tonic::transport::Server;
use hello::greeter_server::{Greeter, GreeterServer};
use hello::{HelloRequest, HelloResponse};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let addr = "[::1]:50051".parse()?;
    let greeter = MyGreeter::default();

    Server::builder()
        .add_service(
            GreeterServer::new(greeter)
                .accept_compressed(CompressionEncoding::Gzip) // decompress gzip-encoded requests
                .send_compressed(CompressionEncoding::Gzip),  // compress responses with gzip
        )
        .serve(addr)
        .await?;

    Ok(())
}

The server will now use gzip compression for messages. If you need zstd, just replace CompressionEncoding::Gzip with CompressionEncoding::Zstd. Keep in mind that each encoding also needs its corresponding feature flag, as shown below.
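
Compression support in Tonic is gated behind feature flags, so (assuming tonic 0.12) the gzip and/or zstd features need to be enabled in Cargo.toml:

[dependencies]
tonic = { version = "0.12.2", features = ["transport", "codegen", "tls", "gzip", "zstd"] }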

You can enable compression on the client side:

let mut client = GreeterClient::connect("http://[::1]:50051")
    .await?
    .accept_compressed(CompressionEncoding::Gzip)
    .send_compressed(CompressionEncoding::Gzip);

Thus, when sending data, it will be automatically compressed, and when received, it will be decompressed.

Tonic is built on top of Tokio, which lets us process requests asynchronously.

Let’s say you need to handle requests with a delay, to emulate interaction with a remote database or API. You can use tokio::time::sleep to create the delay:

#[tonic::async_trait]
impl Greeter for MyGreeter {
    async fn say_hello(
        &self,
        request: Request<HelloRequest>,
    ) -> Result<Response<HelloResponse>, Status> {
        let name = request.into_inner().name;

        // Simulate a delay
        tokio::time::sleep(std::time::Duration::from_secs(2)).await;

        let reply = HelloResponse {
            message: format!("Hello, {}!", name),
        };
        Ok(Response::new(reply))
    }
}

This way, the server can remain responsive and process other requests while waiting for the long-running operation to complete.
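
When the work inside a handler is CPU-bound rather than just waiting (for example, running model inference), it is better to move it off the async worker threads. A sketch using tokio::task::spawn_blocking; run_inference here is a hypothetical placeholder for your own blocking code, not part of the service above:

use tonic::{Request, Response, Status};

// Hypothetical CPU-heavy function (e.g. model inference); not part of the article's service.
fn run_inference(name: &str) -> String {
    format!("Hello, {} (from a heavy computation)!", name)
}

#[tonic::async_trait]
impl Greeter for MyGreeter {
    async fn say_hello(
        &self,
        request: Request<HelloRequest>,
    ) -> Result<Response<HelloResponse>, Status> {
        let name = request.into_inner().name;

        // Run the blocking work on Tokio's blocking thread pool so the
        // async worker threads remain free to serve other requests.
        let message = tokio::task::spawn_blocking(move || run_inference(&name))
            .await
            .map_err(|e| Status::internal(e.to_string()))?;

        Ok(Response::new(HelloResponse { message }))
    }
}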


You can learn more about Tonic here.

Practicing Data Scientists who want to upgrade their professional level to Middle+ can learn advanced ML techniques in the online course “Machine Learning. Advanced”.
