How to translate the game ‘The Lamplighters League’

This article walks through translating the game's .wem audio files from one language (English) into another (Russian), then repackaging them as .wem for use in the game. Python, neural networks, and the Wwise program will be used as tools. Interestingly, a network that determines the speaker's gender will also be used, so that the translation has two voices.

The workflow will be as follows:

  • extract the game's .wem files to .wav
  • convert the English .wav to English .txt
  • determine the speaker's gender from each English .wav and create a table file
  • translate the English .txt to Russian .txt
  • convert the Russian .txt to Russian .wav
  • pack the finished .wav back into .wem

* find empty files with len(text)==0 and transfer them “as is”, because these are files with screams, noises, etc.

It looks simple, let’s take a closer look.

Extract .wem game files to .wav

The vast majority of game audio files are packaged in wem format and are located in the game folder.

Therefore, the first step will be to unpack them for further processing.

Here it is worth mentioning the article from which most of this information was taken.

You need to download the Windows build of the program – vgmstream-win64.
Create two folders in it, input and output, and put all the .wem files from the game folder into the first one:

The Lamplighters League\LamplightersLeague_Data\StreamingAssets\Audio\GeneratedSoundBanks\Windows\English(US).

Next, you need to create a .bat file in the program's folder that will process all the files in the input folder, because by default the program processes only one file at a time.

Contents of the .bat file (for example):

for %%f in (input\*.wem) do "vgmstream-cli.exe" -o "output\%%~nxf.wav" "%%f"
pause

Since there are quite a lot of files (~15000), the process will take some time.

The output will be .wav files:

It should be noted right away that this same program cannot pack files back into .wem.

Let’s move on.

Convert .wav to .txt in English

In the previous article, converting English .wav to English .txt was done with vosk-model-en-us-0.22-lgraph, which handled the task well.

This time we will try another package – Whisper.
It works no worse and, subjectively, gives better quality.

Let’s pass through the code all files that were previously converted:

import os
import multiprocessing
from glob import glob

import whisper

# the "base" model is loaded at module level, so each worker
# process in the pool gets its own copy
model = whisper.load_model("base")

def job(file):
    result = model.transcribe(file, language="en", task="transcribe", fp16=True)
    # 1234.wem.wav -> 1234.txt
    name = os.path.basename(file).split(".")[0]
    with open(f"{name}.txt", "a") as f:
        f.write(result["text"])

def main():
    pool = multiprocessing.Pool(8)
    for f in glob(r".\*.wav"):
        pool.apply_async(job, [f])
    pool.close()
    pool.join()

if __name__ == "__main__":
    main()

The base model is used; it is loaded onto CUDA when the script starts. If you don't have CUDA installed, you can use the CPU instead (pass device="cpu" to load_model and set fp16=False, since half precision is not supported on the CPU).

The result of the program will be the same ~15,000 files, but now containing English text:

Keep in mind the empty files and the files with anomalies in the text. An example of the latter:


We're going to be going to be going to be going to be going to be going to be going to be going to be going to be going to be going ...

This is how the Whisper model transcribed it.


The English .wav files whose names correspond to empty .txt files must be carried over “as is”.

You can find empty files, for example, like this:

from glob import glob
for path in glob(r'./*.txt'):
    with open(path) as f:
        text = f.read()
    if len(text) == 0:
        print(path)

Determine the speaker’s gender from the English .wav and create a table file

In order to get a two-voice translation, so that male characters do not speak with female voices and vice versa, we determine the speaker's gender for each English .wav.

There are more than two characters in the game itself, but in this case it is a necessary simplification.

For gender recognition, we will use the following repository.
There will be some difficulties installing this package, so if installing from requirements.txt fails, it is recommended to look at the repository's issues to “make it work”.

Put the English .wav files into the males and females folders; the ratio does not matter, the program just needs both folders (assuming you have previously trained the GMM models as described in the repository). Also, before running GenderIdentifier.py in Voice-based-gender-recognition\Code, we will fix it a little:

GenderIdentifier.py


import os
import pickle
import warnings
import numpy as np
from FeaturesExtractor import FeaturesExtractor

warnings.filterwarnings("ignore")


class GenderIdentifier:

    def __init__(self, females_files_path, males_files_path, females_model_path, males_model_path):
        self.females_training_path = females_files_path
        self.males_training_path   = males_files_path
        self.error                 = 0
        self.total_sample          = 0
        self.features_extractor    = FeaturesExtractor()
        # load the pre-trained GMM models
        self.females_gmm = pickle.load(open(females_model_path, 'rb'))
        self.males_gmm   = pickle.load(open(males_model_path, 'rb'))

    def process(self):
        files = self.get_file_paths(self.females_training_path, self.males_training_path)
        for file in files:
            self.total_sample += 1
            vector = self.features_extractor.extract_features(file)
            winner = self.identify_gender(vector)
            # the folder name ("males"/"females") minus the trailing "s"
            expected_gender = file.split("\\")[0][:-1]
            # append "file,gender" to the table file
            with open('out.txt', 'a') as f:
                f.write(f'{os.path.basename(file)},{winner}\n')
            if winner != expected_gender:
                self.error += 1

    def get_file_paths(self, females_training_path, males_training_path):
        females = [os.path.join(females_training_path, f) for f in os.listdir(females_training_path)]
        males   = [os.path.join(males_training_path, f) for f in os.listdir(males_training_path)]
        return females + males

    def identify_gender(self, vector):
        # total log-likelihood of the sample under each GMM
        is_female_log_likelihood = np.array(self.females_gmm.score(vector)).sum()
        is_male_log_likelihood   = np.array(self.males_gmm.score(vector)).sum()
        if is_male_log_likelihood > is_female_log_likelihood:
            return "male"
        return "female"


if __name__ == "__main__":
    # note the argument order: females first, matching __init__
    gender_identifier = GenderIdentifier("females", "males", "females.gmm", "males.gmm")
    gender_identifier.process()

The output is a file that records, for each audio file, whether the speaker is a man or a woman:


What’s next?

Translate the English .txt to Russian .txt

The same approach as in the previous article will be used to translate the texts, unchanged.

The output is ~15,000 files with the .ru extension, containing the Russian text:

Convert the Russian .txt to Russian .wav

When converting from txt to wav, the silero models will be used again; working with them was described in the previous article.

But with small changes: we will take into account the gender of the speaker.

For this, the out.txt file will be used, in which a gender is “attached” to each file name. The female voice is ‘baya’, the male one is ‘eugene’.


import csv
import os
from glob import glob

import torch

torch._C._jit_set_profiling_mode(False)

language = "ru"
model_id = "v4_ru"
device = torch.device("cpu")
model, _ = torch.hub.load(repo_or_dir="snakers4/silero-models",
                          model="silero_tts",
                          language=language,
                          speaker=model_id)
model.to(device)  # gpu or cpu
sample_rate = 48000
put_accent = True
put_yo = True

# file name -> "male"/"female", as produced by GenderIdentifier
with open("out.txt") as f:
    genders = {row[0]: row[1] for row in csv.reader(f)}

def job(file):
    with open(file) as f:
        text = f.read()
    name = os.path.basename(file).split(".")[0]
    gender = genders[f"{name}.wem.wav"]
    speaker = "baya" if gender == "female" else "eugene"
    try:
        model.save_wav(text=text,
                       audio_path=f"{name}.wav",
                       sample_rate=sample_rate,
                       speaker=speaker)
    except Exception:
        # some texts fail to synthesize; skip them
        pass

for f in glob(r"./*.ru"):
    job(f)

The result is the same 15,000 files, but with Russian male and female voices.

The localization process is nearing its end – all that remains is to pack the finished files into the wem format.

But not everything is so simple here.

Wav-to-wem.

Wem format is just compressed wav, if simplified. Ready-made wems take up much less disk space and are somewhat similar to the ogg format.

On GitHub there are a couple of Python wav-to-wem converters, but with truncated functionality. The author did not manage to get them to work for this task.

Therefore, you will need the Wwise program from Audiokinetic.

Installing the program is an endurance test in itself.

First, you need to register on the website, and after that the process is closed to the Russian public: the downloader simply cannot be downloaded. A VPN? Maybe.
But it must be a system-wide VPN, not a browser-only one, since during installation the downloader will also download Wwise itself.

If the installation is successful, welcome to Wwise:

Next, there is a short wav-to-wem video clip that captivates with its simplicity.

But… if you simply put the ready-made Russian wav files into Wwise and convert them “as is”, the resulting wems will not work in the game. There will be just silence, though the game will not crash.

Therefore, you need to pay attention to how the original English wems are packaged.
If you use the vgmstream-cli console utility from the vgmstream-win64 package mentioned at the beginning of the article, you can find out that the original English wem file looks like this:

While WWise outputs the following format by default:

The difference is in the encoder and the bitrate.

How to fix it?

After importing all audio files into the project, you need to select one of them (the change is applied to all others automatically).

Select the Vorbis format, then click Edit in the Adv. tab:

Choose a bitrate close to that of the original English wem.

Next, you can convert to wem according to the standard procedure (see the YouTube video above):

After that, you need to replace the originals in the corresponding game folder with ready-made wem files.

At this point, the translation process can be considered complete.

There are also .bnk files in the game – these are apparently dialogue and cutscene files.
But there are not that many of them, and you can “put up with the English”.
