How I created a chatbot for job search
Greetings! My name is George Kalyapin. When I started working as a developer, various small orders came my way, and later I began looking for them myself in freelancer chats. The problem was that the chats had to be monitored constantly, and many of the vacancies were irrelevant.
That's why I decided to create the RemoteHunt chatbot, an assistant for finding freelance work. It monitors thematic channels and chats 24/7, segments the vacancies into categories, and sends them to the user. Initially the bot was conceived as a pet project, but during development it grew into something more.
In this article, I will explain how the chatbot works and the difficulties I encountered. Not everything turned out perfectly the first time, so I will point out what I would correct or improve. I ran into similar tasks during the "Middle Python Developer" course at the Workshop, but I didn't want to copy ready-made solutions.
Principle of operation
Currently, the system tracks about thirty channels from which the bot selects vacancies. These channels also carry resumes, spam, and other advertising. The bot doesn't need those categories, so I use a parser for filtering.
The parser is a separate account that sits in all these channels and picks up messages. For each channel, I have defined keywords that must either be present or absent. Say I'm not interested in the "resume" hashtag: a message containing it doesn't pass the filter. A message with the "vacancy" hashtag goes further and ends up in RabbitMQ.
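A minimal sketch of what such a per-channel filter might look like. The channel name and hashtags here are illustrative examples, not my actual configuration:

```python
# Per-channel filter rules: hashtags that must be present / absent.
# The channel name and tags below are made up for illustration.
FILTER_RULES = {
    "freelance_jobs_channel": {
        "require": {"#vacancy"},
        "exclude": {"#resume", "#ad"},
    },
}

def passes_filter(channel: str, message: str) -> bool:
    """Return True if the message should be forwarded to the queue."""
    rules = FILTER_RULES.get(channel)
    if rules is None:
        return False  # unknown channel: drop the message
    # Collect all hashtags present in the message
    tags = {word for word in message.lower().split() if word.startswith("#")}
    if rules["exclude"] & tags:
        return False  # contains a forbidden tag, e.g. #resume
    return bool(rules["require"] & tags)  # must contain a required tag
```

A message tagged `#vacancy` passes and goes on to RabbitMQ; one tagged `#resume` is dropped.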
RabbitMQ is a message broker, i.e. a queue. All messages arrive there first and then go to ChatGPT, which listens to the queue, sorts the jobs into categories, and sends them back to RabbitMQ. An example prompt for GPT:
question = 'I will send a vacancy, and you will answer to which categories it belongs.\n\n' \
           'Vacancy:\n\n' \
           f'{text}.\n\n' \
           'Reply with the number and category to which the offered vacancy most likely belongs ' \
           'described above and nothing more:\n' \
           + ''.join([f'{x + 1}) {categories[x].display_name}\n' for x in range(len(categories))]) \
           + "\nIf it's not a job, say no."
At first, I planned to do the segmentation myself, save everything to the database, and then train a neural network on that data. But then I tried GPT and realized it was a good solution.
GPT has to be moved to a separate service because it can often be unresponsive: if I called the GPT API directly, there would always be a 30-60 second delay with no guarantee of a response. It may simply be overloaded and say "sorry". This is where RabbitMQ comes in: if GPT hasn't processed a message, the message doesn't go anywhere and waits for its turn.
So ChatGPT sorts the jobs into categories and sends them back to RabbitMQ. At this point, the vacancies are ready to be sent to users.
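Since the prompt asks GPT to answer with a number and a category name, the reply has to be mapped back to the category list. Here is a hedged sketch of how that could look; the function, the regex, and the "no" check are my illustration, not the bot's actual code:

```python
import re

def parse_gpt_reply(reply: str, categories: list[str]) -> list[str]:
    """Map an answer like '1) Copywriting\n2) Design' back to category names.

    Returns an empty list when GPT says the message is not a job.
    NOTE: a naive sketch -- a real parser would need stricter validation.
    """
    if reply.strip().lower().startswith("no"):
        return []  # GPT decided this message is not a vacancy
    matched = []
    for number in re.findall(r"(\d+)\)", reply):  # pick out '1)', '2)', ...
        index = int(number) - 1                   # prompt numbers start at 1
        if 0 <= index < len(categories):
            matched.append(categories[index])
    return matched
```

This also naturally supports one vacancy landing in several categories at once, since GPT may list more than one number.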
The user interacts with the main bot, which in turn is connected to 12 category bots. The main bot handles subscription, payment, and category selection; it is not involved in sending the vacancies themselves. The 12 bots are, in fact, the categories with vacancies that the user sees.
One vacancy can fall into several categories at once. For example, there is a vacancy for a copywriter whose requirements state that the posts must include illustrations. The bot reads this and decides that the job fits both the "copywriting" and "design" categories. I think that's fine.
The bot has a scheduler. Every 10 minutes, it checks subscriptions and informs users whose subscription has expired that it is time to buy a new one. It also sends the admin a daily report on new users and subscriptions. I'm currently working on having it send jobs to the general feed every day, one job per category.
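A simplified sketch of such a scheduler loop. The function names and the subscription data shape are assumptions for illustration, not the real implementation:

```python
import asyncio
from datetime import datetime, timezone

CHECK_INTERVAL = 600  # seconds: the bot checks every 10 minutes

def expired_subscriptions(subscriptions, now=None):
    """Return the user ids whose subscription end date has passed."""
    now = now or datetime.now(timezone.utc)
    return [s["user_id"] for s in subscriptions if s["expires_at"] <= now]

async def scheduler_loop(get_subscriptions, notify_user):
    """Every 10 minutes, notify users whose subscription has expired.

    `get_subscriptions` and `notify_user` are hypothetical callbacks
    standing in for the database query and the Telegram send call.
    """
    while True:
        for user_id in expired_subscriptions(get_subscriptions()):
            await notify_user(user_id, "Your subscription has expired - time to renew!")
        await asyncio.sleep(CHECK_INTERVAL)
```

The daily admin report would be a second loop of the same shape, just with a 24-hour interval.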
Postgres stores all the data: users, categories, channels, subscriptions, and payments. The Content API provides access to the database, and the admin panel is implemented with Django Admin. The Payment App Service is responsible for subscriptions, specifically for talking to the payment system. The Payment Service itself is a third-party service through which payment is made.
I decided to do everything properly and connect YooKassa. It is convenient for both users and me: users don't need to send money to someone's card and guess whether their subscription will be activated. Everything is fully automated: the chatbot itself activates the subscription upon successful payment. At the same time, all payments are made officially, without suspicious transfers.
I worked on the project for about 4 months. I wasn't actively developing the whole time: I walked around thinking, sketched ideas, and interviewed freelancers I knew. Then I took a two-week vacation and devoted myself fully to the launch: I watched the logs to see what was working and drove the first traffic, essentially running alpha tests.
What can be improved
I will rewrite the scheduler: I wrote it quickly and from scratch, and only later remembered the Dagster library, which would make everything simpler and clearer. It is a ready-made service with a visual interface and extended functionality. For example, if something goes wrong, it records the errors in a file I can review, and then I can restart everything from my phone.
I still have doubts about how I implemented the Redis caching service. The course taught that Redis should live behind the API: a user's request always goes to the API, which then either queries the Postgres database or retrieves the information from the cache, i.e. from Redis.
I implemented Redis in the bot itself instead of behind the API, which I thought was a good decision. The bot goes to Redis first and tries to find, for example, data about the user there: who they are, what subscriptions they have, and so on. Only if the bot gets no answer does it call the API.
This is done to speed up the bot's response time and reduce the load on the API. The user interacts with the bot through buttons, and everything happens quite quickly. I see no need for the bot to call the API every time: it fetches the information once, stores it in Redis, and then reads it from there.
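This cache-aside pattern can be sketched as follows. A plain dict stands in for Redis here, and `api_client.fetch_user` is a hypothetical Content API call; the real bot would use redis-py and set a key expiry (e.g. `setex`):

```python
import json

class UserCache:
    """Cache-aside lookup: try the cache first, fall back to the API.

    A plain dict plays the role of Redis in this sketch; with redis-py
    you would also give each key a TTL so stale data eventually expires.
    """

    def __init__(self, api_client):
        self.api = api_client
        self.store = {}  # key -> serialized value (Redis stores strings/bytes)

    def get_user(self, user_id: int) -> dict:
        key = f"user:{user_id}"
        cached = self.store.get(key)
        if cached is not None:
            return json.loads(cached)        # cache hit: no API call
        user = self.api.fetch_user(user_id)  # cache miss: ask the Content API
        self.store[key] = json.dumps(user)   # remember for next time
        return user
```

The first button press for a user hits the API; every subsequent one is served from the cache.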
I'd be glad to hear your thoughts in the comments: did I do the right thing, or should I have followed the "Middle Python Developer" course at the Workshop? Or maybe there's some other option I've missed.
Tips for those who want to do a similar project
If your project is in the same format as mine, you need to understand that attracting an audience is a separate job. The project can be very cool, but you will still have to drive traffic to it, and that traffic will be paid. You need to be ready for this.
I don't advise ignoring parts of the system and doing whatever is easiest when something is unclear. After all, the project is built in a relatively short time, but it will run for a long time. It's better to spend a few extra days now than to deal with problems that pop up later.
The parser could have become such a sore spot for me: at first it simply forwarded messages, mistaking them all for vacancies, without any filtering. I didn't turn a blind eye to it; I deliberately set the question aside, and when I learned about GPT, I went back and finished it. The idea came on its own and fit perfectly, like a puzzle piece: I just had to wait instead of resorting to crutches.
The design and clarity of the interface are very important. Everything should be attractive, clear, and simple for the user: not everyone is an IT expert or a programmer who can figure things out. If you're doing something tricky, keep it on the inside; everything on the outside should be user-friendly.
Plans for the future
I want to extend the bot beyond vacancies to searching for employees as well. I still don't know exactly how I will implement it. Maybe there will be the same categories, but with checkboxes to choose between jobs and resumes.
I also want users to be able to report an off-target vacancy in a category to the admin directly through the bot, rather than through support. And I want to add the ability to block vacancies from specific users.
I want the project to generate more income, so I will add the option of posting paid jobs so that employers can post directly. I will also offer a paid resume mailing and keep growing the channel to earn advertising revenue.