Julia in machine learning: base

Hello, Habr!

Julia was born in 2012 thanks to the efforts of four enthusiastic developers: Jeff Bezanson, Stefan Karpinski, Viral B. Shah, and Alan Edelman. They set out to create a language that would combine the ease of Python, the speed of C, the dynamism of Ruby, the linguistic purity of Lisp, and the capabilities of mathematical systems such as MATLAB. They succeeded: Julia is a fusion of simplicity and power.

Thanks to JIT compilation, Julia code can run at speeds comparable to code written in C or Fortran.

The main features of Julia

JIT compilation

JIT compilation is a process in which the source code of a program is compiled into machine code directly at runtime, rather than in advance.

When a Julia program starts, nothing has been compiled ahead of time; top-level code is executed as it is encountered.

When Julia encounters a function call, it compiles that function into machine code specialized for the given argument types and the processor architecture. This happens the first time the function is called with those types.

The compiled machine code is then cached. If the function is called again, Julia will use the already compiled code, rather than interpreting or recompiling it.

The JIT compiler also optimizes code based on its usage. If the compiler detects that a particular piece of code is frequently used, it can optimize that piece of code to improve efficiency.
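
A quick way to see this in action is to time the same call twice. A minimal sketch (exact timings will of course vary by machine):

square(x) = x * x

@time square(2)  # first call: the reported time includes JIT compilation of square for Int
@time square(3)  # second call: the cached machine code is reused, so it runs much faster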

Typing

Like many high-level languages, Julia uses dynamic typing by default: variable types are determined at runtime.

At the same time, you can explicitly annotate the types of variables and function arguments, which is not typical of purely dynamic languages.

function add(a, b)
    return a + b
end

println(add(10, 20))  # output: 30
println(add(10.5, 20.3))  # output: 30.8

The add function can accept arguments of any type that supports the addition operation.

function add(a::Int, b::Int)
    return a + b
end

println(add(10, 20))  # output: 30
println(add(10.5, 20.3))  # throws a MethodError, since the arguments are not Int

Here add is strictly limited to Int arguments.

function add(a::T, b::T) where {T <: Number}
    return a + b
end

println(add(10, 20))  # output: 30
println(add(10.5, 20.3))  # output: 30.8

This version of add takes two arguments of the same type, but that type can be any subtype of Number.
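
With only this parametric method defined, mixing argument types raises an error, since both arguments must share one concrete type T:

add(10, 20.5)  # MethodError: no method matching add(::Int64, ::Float64)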

Multithreading

Julia provides built-in support for multithreading, which allows you to distribute computational work across threads. You can control how many threads your program uses and how tasks are divided between them.
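
The number of threads is fixed when Julia starts. A minimal sketch of checking it (the launch commands in the comments assume Julia 1.5 or newer):

# Julia must be started with the desired number of threads, for example:
#   JULIA_NUM_THREADS=4 julia      (environment variable)
#   julia --threads 4              (command-line flag)
using Base.Threads

println(nthreads())  # how many threads @threads can use in this session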

Basic multithreading looks like this:

using Base.Threads

function threaded_sum(arr)
    n = nthreads()   # number of available threads
    sums = zeros(n)  # one partial sum per thread
    @threads for i in 1:n
        # distribute the loop iterations across the threads
        for j in i:n:length(arr)
            sums[i] += arr[j]
        end
    end
    return sum(sums)  # combine the per-thread results
end

arr = rand(1:100, 10^6)  # a large array of random numbers
println(threaded_sum(arr))  # sum the array using multiple threads

With atomic operations:

using Base.Threads

function threaded_increment(n)
    count = Atomic{Int64}(0)  # atomic counter
    @threads for i in 1:n
        atomic_add!(count, 1)  # thread-safe increment of the counter
    end
    return count[]
end

println(threaded_increment(10^6))  # increment the counter many times from multiple threads

Libraries for ML

Julia also has Jupyter notebook support: the IJulia package is available on GitHub.

Flux.jl

Flux.jl is the foundational machine learning library for Julia.

Flux provides many ready-made layers, which are the building blocks of neural networks.

Dense layer: the main layer for creating fully connected networks:

dense = Dense(10, 5, σ)  # 10 inputs, 5 outputs, sigmoid activation function

Convolutional layer: to create convolutional networks:

conv = Conv((3, 3), 1=>16, σ)  # 3x3 filters, 1 input channel, 16 output channels

Flux has activation functions such as σ, relu and tanh:

relu_layer = Dense(10, 5, relu)

Optimizers are used to update model weights during training. Flux supports various optimizers such as SGD, Adam, RMSprop:

An example of SGD:

opt = Descent(0.01)  # SGD with a learning rate of 0.01
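
Adam and RMSProp are constructed the same way; the names below follow the Flux version used in this article (which spells Adam as ADAM):

opt_adam = ADAM(0.001)    # Adam with a learning rate of 0.001
opt_rms = RMSProp(0.001)  # RMSProp with a learning rate of 0.001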

Loss functions such as cross-entropy or MSE are used to evaluate the performance of the model; for example, a cross-entropy loss might look like this:

loss(x, y) = Flux.Losses.crossentropy(model(x), y)
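
An MSE loss is defined analogously, assuming the same model:

loss_mse(x, y) = Flux.Losses.mse(model(x), y)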

Flux provides utilities that simplify the training process, such as train!:

data = [(x, y), ...]  # dataset
Flux.train!(loss, params(model), data, opt)

Let’s create a simple fully connected neural network for image classification from the MNIST dataset:

using Flux, Flux.Data.MNIST, Statistics
using Flux: onehotbatch, onecold, crossentropy, throttle
using Base.Iterators: repeated

# this is what loading the data looks like
images = float.(reshape(hcat(float.(MNIST.images())...), 28 * 28, :))
labels = onehotbatch(MNIST.labels(), 0:9)

# split the data into training and test sets
train_indices = 1:60000
x_train, y_train = images[:, train_indices], labels[:, train_indices]

test_indices = 60001:70000
x_test, y_test = images[:, test_indices], labels[:, test_indices]

# define the model
model = Chain(
  Dense(28*28, 64, relu),
  Dense(64, 10),
  softmax
)

# loss function and optimizer
loss(x, y) = crossentropy(model(x), y)
opt = ADAM()

# the training itself
dataset = repeated((x_train, y_train), 200)
evalcb = () -> @show(loss(x_train, y_train))
Flux.train!(loss, params(model), dataset, opt, cb = throttle(evalcb, 10))
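
The test split prepared above can then be used to check the result. A small sketch: onecold turns one-hot labels and softmax outputs back into digit classes, and mean comes from the Statistics import:

accuracy(x, y) = mean(onecold(model(x)) .== onecold(y))
println(accuracy(x_test, y_test))  # share of correctly classified test images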

MLJ

MLJ (Machine learning in Julia) provides a unified interface to access a wide range of machine learning models.

MLJ simplifies the process of loading and preprocessing data.

Loading the data:

using MLJ
X, y = @load_iris

It looks very interesting.

MLJ makes it easy to select and configure different machine learning models.

Model selection:

tree_model = @load DecisionTreeClassifier pkg=DecisionTree

Setting hyperparameters:

tree = tree_model(max_depth=5)

With MLJ, one can train a model and make predictions using simple functions:

Model training:

mach = machine(tree, X, y)
fit!(mach)

Prediction:

yhat = predict(mach, Xnew)

Model evaluation is built into the library as well; for example, a five-fold cross-validation estimate (here with cross-entropy as the measure) looks like this:

cv = evaluate!(mach, resampling=CV(nfolds=5), measure=cross_entropy)
cv_measure = cv.measurement[1]  # aggregated measure over the folds

Let’s apply MLJ to classify irises using a decision tree:

using MLJ

# load the dataset
X, y = @load_iris

# split the data into training and test sets
train, test = partition(eachindex(y), 0.7, shuffle=true)

# choose a model and set its hyperparameters
tree_model = @load DecisionTreeClassifier pkg=DecisionTree
tree = tree_model(max_depth=5)

# create and train the machine
mach = machine(tree, X, y)
fit!(mach, rows=train)

# predictions for the test set
yhat = predict(mach, rows=test)
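
To see how well the tree does, point predictions can be compared against the held-out labels. A sketch using MLJ's predict_mode operation and accuracy measure:

yhat_mode = predict_mode(mach, rows=test)  # classes instead of probabilities
println(accuracy(yhat_mode, y[test]))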

ScikitLearn.jl

For those used to Scikit-Learn in Python, ScikitLearn.jl offers a similar experience, but with Julia.

ScikitLearn.jl makes it easy to download standard datasets for testing and training models:

using ScikitLearn
@sk_import datasets: load_iris
iris = load_iris()

Data pre-processing includes functions for normalization, scaling and transformation of data:

@sk_import preprocessing: StandardScaler
scaler = StandardScaler()
X_scaled = fit_transform!(scaler, iris["data"])

A function for splitting the data into training and test sets:

@sk_import model_selection: train_test_split
X_train, X_test, y_train, y_test = train_test_split(iris["data"], iris["target"], test_size=0.4)

ScikitLearn.jl has many algorithms including linear models, decision trees, etc.

@sk_import linear_model: LogisticRegression
model = LogisticRegression()

The process of training the model on the prepared data:

fit!(model, X_train, y_train)

Using a trained model to predict results on new data:

predictions = predict(model, X_test)

Assessment of model quality using various metrics:

@sk_import metrics: accuracy_score
acc = accuracy_score(y_test, predictions)

Classification of iris species:

using ScikitLearn

@sk_import datasets: load_iris
@sk_import model_selection: train_test_split
@sk_import preprocessing: StandardScaler
@sk_import linear_model: LogisticRegression
@sk_import metrics: accuracy_score
# yes, these are the imports ;)

iris = load_iris()
X, y = iris["data"], iris["target"]

scaler = StandardScaler()
X_scaled = fit_transform!(scaler, X)

X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.3)

model = LogisticRegression()
fit!(model, X_train, y_train)

predictions = predict(model, X_test)
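
The accuracy_score imported above then closes the loop:

acc = accuracy_score(y_test, predictions)
println("accuracy: ", acc)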

Optimization and automatic differentiation

Zygote is an automatic differentiation library in Julia that makes it easy to compute gradients of functions:

using Zygote

f(x) = 3x^2 + 2x + 1
df = gradient(f, 2)  # computes the gradient of f at x = 2
println(df)  # output: (14,)

g(x, y) = sin(x) + cos(y)
dg = gradient(g, π, π)  # gradient with respect to x and y
println(dg)  # output: approximately (-1.0, 0.0)

# loss function
loss(x, y) = sum((x - y).^2)

# gradient of the loss function
x, y = [1.0, 2.0, 3.0], [3.0, 2.0, 1.0]
grad_loss = gradient(() -> loss(x, y), Params([x, y]))
println(grad_loss[x])  # gradient with respect to x: [-4.0, 0.0, 4.0]

There is also Optim, which lets you optimize functions:

using Optim

f(x) = (x[1] - 2)^2 + (x[2] - 3)^2
res = optimize(f, [0.0, 0.0])  # starting point [0.0, 0.0]
println(Optim.minimizer(res))  # output: close to [2, 3]

Optim also allows you to use the gradients provided by the Zygote library for better optimization.

using Optim, Zygote

f(x) = (x[1] - 1)^2 + (x[2] - 2)^2
∇f(x) = gradient(f, x)[1]

res = optimize(f, ∇f, [0.0, 0.0], BFGS(); inplace=false)  # BFGS method; inplace=false because ∇f returns the gradient instead of mutating a buffer
println(Optim.minimizer(res))  # close to [1, 2]

# loss function
loss(x) = (x[1] - 3)^2 + (x[2] - 4)^2
∇loss(x) = gradient(loss, x)[1]

initial_params = [0.0, 0.0]
res = optimize(loss, ∇loss, initial_params, LBFGS(); inplace=false)
println(Optim.minimizer(res))  # close to [3, 4]

Plots.jl

With Plots.jl you can visualize data and much more.

using Plots

# line plot
plot([1, 2, 3, 4, 5], [2, 4, 1, 3, 7], label="line", linewidth=2)

# histogram
histogram(randn(1000), bins=15, label="frequencies")

# scatter plot
scatter(randn(100), randn(100), color=:blue, label="points")
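
A finished plot can also be written to a file (the filename here is just an example):

savefig("scatter.png")  # save the most recent plot to disk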

Julia delivers high performance that has been demonstrated in scientific computing: the Celeste astronomical cataloging project, written in Julia, reached a peak performance of 1.54 petaflops.

Although Julia is still a young language and its community is not as large as Python's or R's, it is developing rapidly and gaining new libraries and capabilities.

OTUS experts cover more interesting topics from the world of machine learning in their online courses. In the events calendar, anyone can sign up for a free webinar that interests them.
