Applied machine learning developers have many open-source frameworks to choose from, e.g., scikit-learn, Spark MLlib, ML.NET, etc. These frameworks provide user-friendly high-level interfaces to algorithms, but their implementations resort to low-level languages and optimizations. Such low-level C/C++, C#, or Java code is far from the original mathematical notation that is preferable for machine learning algorithms research and development. The Semagle Framework makes the most of both worlds: low-level C#-like constructs for performance optimization and high-level semi-mathematical F# notation for joining the algorithm blocks. Modularizing algorithms into fine-grained blocks makes it straightforward to research and develop new implementations for the same family of problems.

The Semagle framework includes the following principal F# libraries.

## Metrics Library

The Semagle.MachineLearning.Metrics library implements performance evaluation metrics for classification, regression, and information retrieval.

Accuracy is used for two-class and one-class classification problems.

$accuracy = \frac{1}{n} \sum_{i=1}^n \mathbb{1}_{predicted_i = expected_i}$

```fsharp
let accuracy (expected : 'Y[]) (predicted : 'Y[]) =
    ...

Classification.accuracy test_y predict_y
```
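The formula maps almost one-to-one onto F#. A minimal self-contained sketch (not the library's actual implementation):

```fsharp
/// Fraction of predictions that match the expected labels
let accuracy (expected : 'Y[]) (predicted : 'Y[]) =
    if Array.length expected <> Array.length predicted then
        invalidArg "predicted" "expected and predicted must have the same length"
    // Count positions where predicted_i = expected_i, then normalize by n
    let correct = Array.fold2 (fun n e p -> if e = p then n + 1 else n) 0 expected predicted
    float correct / float (Array.length expected)

accuracy [| 1; -1; 1; -1 |] [| 1; 1; 1; -1 |] // 3 of 4 correct = 0.75
```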


Recall is used for classification and sequence labeling problems.

$recall = \frac{\sum_{i=1}^n \mathbb{1}_{predicted_i=expected_i=+1}}{\sum_{i=1}^n \mathbb{1}_{expected_i=+1}}$

```fsharp
let recall (expected : 'Y[]) (predicted : 'Y[]) =
    ...

Classification.recall test_y predict_y
```


Precision is used for classification and sequence labeling problems.

$precision = \frac{\sum_{i=1}^n \mathbb{1}_{predicted_i=expected_i=+1}}{\sum_{i=1}^n \mathbb{1}_{predicted_i=+1}}$

```fsharp
let precision (expected : 'Y[]) (predicted : 'Y[]) =
    ...

Classification.precision test_y predict_y
```


F1 is used for classification and sequence labeling problems.

$F_1 = 2 \times \frac{recall \times precision}{recall + precision}$

```fsharp
let f1 (expected : 'Y[]) (predicted : 'Y[]) =
    ...

Classification.f1 test_y predict_y
```
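The three formulas above share the true-positive count in their numerators, which makes sketches of them short. A self-contained illustration for +1/-1 labels (simplified stand-ins for the library code):

```fsharp
/// Number of positions where both expected and predicted are +1
let private truePositives (expected : float32[]) (predicted : float32[]) =
    Array.fold2 (fun n e p -> if e = 1.0f && p = 1.0f then n + 1 else n) 0 expected predicted

/// Fraction of expected positives that were predicted as positive
let recall (expected : float32[]) (predicted : float32[]) =
    let positives = expected |> Array.filter ((=) 1.0f) |> Array.length
    float (truePositives expected predicted) / float positives

/// Fraction of predicted positives that were actually positive
let precision (expected : float32[]) (predicted : float32[]) =
    let predictedPositives = predicted |> Array.filter ((=) 1.0f) |> Array.length
    float (truePositives expected predicted) / float predictedPositives

/// Harmonic mean of recall and precision
let f1 (expected : float32[]) (predicted : float32[]) =
    let r = recall expected predicted
    let p = precision expected predicted
    2.0 * r * p / (r + p)
```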


Mean squared error (MSE) is used for regression problems.

$MSE = \frac{1}{n}\sum_{i=1}^n (predicted_i - expected_i)^2$

```fsharp
let mse (expected : float32[]) (predicted : float32[]) =
    ...

Regression.mse test_y predict_y
```
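As with accuracy, the definition is a single fold over the two arrays; a minimal sketch (not the library's implementation):

```fsharp
/// Mean squared error between predicted and expected values
let mse (expected : float32[]) (predicted : float32[]) =
    Array.fold2 (fun sum e p -> sum + float (p - e) * float (p - e)) 0.0 expected predicted
    / float (Array.length expected)

mse [| 1.0f; 2.0f |] [| 1.0f; 4.0f |] // (0 + 4) / 2 = 2.0
```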


## Vectors Library

The Semagle.Numerics.Vectors library provides the DenseVector and SparseVector types. These types implement the operations required by standard SVM kernels. SparseVector is also the basic data type for the Structured SVM implementation.

DenseVector is constructed from an array of float32 values. This type of vector stores both zero and non-zero values and is well suited to feature vectors with many non-zero elements.

```fsharp
let a = DenseVector([| 1.0f; 3.0f; -3.0f; 4.0f; 8.0f |])
```


SparseVector is constructed from an array of int indices of the non-zero elements and an array of their float32 values. This type of vector is well suited to feature vectors with few non-zero elements.

```fsharp
let a = SparseVector([| 0; 1; 3; 5 |], [| 1.0f; -2.0f; 4.0f; -6.0f |])
```


DenseVector and SparseVector support the following operations:

• indexing a.[i] and slicing a.[1..3], a.[..3], a.[2..]; the SparseVector implementations are more expensive because they require a binary search to locate item indices;
• element-wise addition a + b, subtraction a - b, multiplication a * b, and division a / b;
• unary negation -a, multiplication by a scalar a * 1.5f, and division by a scalar a / 2.0f;
• dot product a .* b and Euclidean distance a ||-|| b.
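Putting these together, a short usage sketch (assuming the Semagle.Numerics.Vectors namespace; the commented results follow the usual definitions of the operations):

```fsharp
open Semagle.Numerics.Vectors

let a = DenseVector([| 1.0f; 3.0f; -3.0f; 4.0f; 8.0f |])
let b = DenseVector([| 2.0f; 1.0f; 0.0f; 1.0f; 1.0f |])

let sum  = a + b       // element-wise addition
let dot  = a .* b      // dot product: 1*2 + 3*1 + (-3)*0 + 4*1 + 8*1 = 17
let dist = a ||-|| b   // Euclidean distance between a and b
let s    = a.[1..3]    // slice containing elements 1 through 3 of a
```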

## SVM Library

The Semagle.MachineLearning.SVM library provides a high-level interface for classification and regression support vector machines and a low-level interface for sequential minimal optimization [1, 2]. All interfaces are data-type agnostic and interact with the data only through kernel functions:

```fsharp
type Kernel<'X> = 'X -> 'X -> float
```
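Any curried function of two inputs returning a float matches this type. For illustration, two classic kernels over plain float32 arrays (the library presumably ships its own kernels over its vector types; these are self-contained stand-ins):

```fsharp
/// Linear kernel: the dot product of the two inputs;
/// has type float32[] -> float32[] -> float, i.e. Kernel<float32[]>
let linear (x : float32[]) (y : float32[]) : float =
    float (Array.fold2 (fun s a b -> s + a * b) 0.0f x y)

/// Gaussian RBF kernel with width parameter gamma;
/// partial application (rbf gamma) is again a Kernel<float32[]>
let rbf (gamma : float) (x : float32[]) (y : float32[]) : float =
    exp (-gamma * Array.fold2 (fun s a b -> s + float ((a - b) * (a - b))) 0.0 x y)
```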


A high-level interface includes the following models:

• TwoClass for two-class classification;
• OneClass for two-class classification with training on positive samples only;
• MultiClass for multi-class classification;
• Regression for real-valued regression.

```fsharp
type SVM<'X,'Y> =
    /// SVM model for two class classification
    | TwoClass of Kernel<'X> * ('X[]) * (float[]) * float
    /// SVM model for one class classification
    | OneClass of Kernel<'X> * ('X[]) * (float[]) * float
    /// SVM model for regression
    | Regression of Kernel<'X> * ('X[]) * (float[]) * float
    /// SVM model for multi-class classification
    | MultiClass of Kernel<'X> * (('Y * 'Y * ('X[]) * (float[]) * float)[])
```


Prediction functions for the different problems are similar:

```fsharp
module TwoClass =
    let inline predict (model : SVM<'X,'Y>) (x : 'X) = ...

module OneClass =
    let inline predict (model : SVM<'X,'Y>) (x : 'X) = ...

module Regression =
    let inline predict (model : SVM<'X,'Y>) (x : 'X) = ...

module MultiClass =
    let inline predict (model : SVM<'X,'Y>) (x : 'X) = ...
```


They are all based on the weighted sum of kernel function values over the support vectors:

```fsharp
module SVM =
    let inline predict (x : 'X) (K : Kernel<'X>) (X : 'X[]) (A : float[]) (b : float) =
        Array.fold2 (fun sum x_i a_i -> sum + a_i * (float (K x_i x))) b X A
```
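Because the decision function depends only on the kernel, the support vectors X, the coefficients A, and the bias b, it can be exercised standalone. A self-contained copy of the fold above with a linear kernel and hand-picked toy values:

```fsharp
// Standalone copy of the decision function, for illustration
let inline predict (x : 'X) (K : 'X -> 'X -> float) (X : 'X[]) (A : float[]) (b : float) =
    Array.fold2 (fun sum x_i a_i -> sum + a_i * (K x_i x)) b X A

// Linear kernel over plain float arrays
let K (x : float[]) (y : float[]) = Array.fold2 (fun s a b -> s + a * b) 0.0 x y

// Two toy "support vectors" with coefficients +1 and -1 and bias 0.5
let y = predict [| 2.0; 3.0 |] K [| [| 1.0; 0.0 |]; [| 0.0; 1.0 |] |] [| 1.0; -1.0 |] 0.5
// y = 0.5 + 1.0 * 2.0 + (-1.0) * 3.0 = -0.5
```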


The matching problem solvers show how to apply the generic sequential minimal optimization solver to specific problems.

Two-class and multi-class problem solvers build classification models for data type 'X. Labels for two-class classifiers are +1/-1; for multi-class, they have an arbitrary data type 'Y:

```fsharp
/// Optimization parameters for C_SVC problem
type C_SVC = {
    /// The penalty for +1 class instances
    C_p : float;
    /// The penalty for -1 class instances
    C_n : float;
}

/// Two-class C Support Vector Classification (SVC) problem solver
let C_SVC (X : 'X[]) (Y : float32[]) (K : Kernel<'X>) (parameters : C_SVC) (options : OptimizationOptions) = ...

/// Multi-class C Support Vector Classification (SVC) problem solver
let C_SVC_M (X : 'X[]) (Y : 'Y[]) (K : Kernel<'X>) (parameters : C_SVC) (options : OptimizationOptions) = ...
```


The one-class problem solver builds classification models for data type 'X and +1/-1 labels from an input dataset containing positive samples only:

```fsharp
/// Optimization parameters for One-Class problem
type OneClass = {
    /// The fraction of support vectors
    nu : float;
}

/// One-Class problem solver
let OneClass (X : 'X[]) (K : Kernel<'X>) (parameters : OneClass) (options : OptimizationOptions) = ...
```


The regression problem solver builds regression models for data type 'X and float32 labels:

```fsharp
/// Optimization parameters for C_SVR problem
type C_SVR = {
    /// The boundary of the approximated function
    eta : float;
    /// The penalty
    C : float;
}

/// C Support Vector Regression (SVR) problem solver
let C_SVR (X : 'X[]) (Y : float32[]) (K : Kernel<'X>) (parameters : C_SVR) (options : OptimizationOptions) = ...
```


The sequential minimal optimization solver is a low-level building block for the problem-specific solvers above and for the Structured SVM optimization below:

```fsharp
/// General parameters for C_SMO problem
type C_SMO = {
    /// Initial feasible values of optimization variables
    A : float[];
    /// Per-sample penalties
    C : float[];
    /// The linear term of the optimized function
    p : float[];
}

/// Sequential Minimal Optimization (SMO) problem solver
let C_SMO (X : 'X[]) (Y : float32[]) (Q : Q) (parameters : C_SMO) (options : OptimizationOptions) = ...
```


## Structural SVM Library

The Semagle.MachineLearning.SSVM library provides a high-level interface for multi-class classification and sequence labeling, and a low-level cutting-plane solver for the “1-slack” structural SVM formulation [3]. All high-level interfaces require two functions:

• FeatureFunction creates a feature vector for an arbitrary data type 'X;
• LossFunction defines a loss value for the predicted y and expected y' labels of an arbitrary data type 'Y.

```fsharp
/// Simple feature function
type FeatureFunction<'X> = 'X -> Vector

/// Structured SVM loss function type
type LossFunction<'Y> = (* y *) 'Y -> (* y' *) 'Y -> float
```
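For classification tasks the canonical choice is the zero-one loss; any function of the right shape qualifies. A minimal self-contained example:

```fsharp
/// Zero-one loss: 0.0 for a correct prediction, 1.0 otherwise;
/// has type 'Y -> 'Y -> float, i.e. it matches LossFunction<'Y>
let zeroOneLoss (y : 'Y) (y' : 'Y) : float =
    if y = y' then 0.0 else 1.0
```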


The multi-class classification model builds a linear classification model in the combined feature and label space:

```fsharp
/// Structured SVM model for multi-class classification
type MultiClass<'X,'Y> = MultiClass of FeatureFunction<'X> * (* W *) float[] * 'Y[]

module MultiClass =
    /// Optimization parameters for the multi-class structured SVM
    type MultiClass<'Y> = {
        /// Penalty for slack variables
        C : float;
        /// Loss function
        loss : LossFunction<'Y>
    }

    /// Learn structured SVM model for multi-class classification
    let learn (X : 'X[]) (Y : 'Y[]) (F : FeatureFunction<'X>) (parameters : MultiClass<'Y>) (options : OneSlack.OptimizationOptions) = ...

    /// Predict class by multi-class structured model
    let predict (MultiClass(F, W, Y) : MultiClass<'X,'Y>) (x : 'X) : 'Y = ...
```


The sequence labeling model builds a linear classification model in the combined feature and state space:

```fsharp
/// Structured SVM model for sequence labeling
type Sequence<'X,'Y> = Sequence of FeatureFunction<'X> * (* W *) float[] * 'Y[]

module Sequence =
    /// Optimization parameters for the sequence structured SVM
    type Sequence<'Y> = {
        /// Penalty for slack variables
        C : float;
        /// Loss function
        loss : LossFunction<'Y>
    }

    /// Learn structured SVM model for sequence labeling
    let learn (X : 'X[][]) (Y : 'Y[][]) (F : FeatureFunction<'X>) (parameters : Sequence<'Y>) (options : OneSlack.OptimizationOptions) = ...

    /// Predict label sequence by the sequence structured SVM model
    let predict (Sequence(F, W, Y) : Sequence<'X,'Y>) (X : 'X[]) : 'Y[] = ...
```


The “1-slack” cutting-plane solver is a low-level building block for the problem-specific Structural SVM solvers above. It requires a JointFeatureFunction that maps the $\mathcal{X} \times \mathcal{Y}$ space into a sparse vector space (vectors with predominantly zero elements). As a result of the optimization, the solver returns a weight vector for the sparse vector space features:

```fsharp
/// Joint feature function type
type JointFeatureFunction<'X,'Y> = (* x *) 'X -> (* y *) 'Y -> SparseVector

/// 1-slack optimization problem solver
let oneSlack (X : 'X[]) (Y : 'Y[]) (JF : JointFeatureFunction<'X,'Y>) (parameters : OneSlack<'X,'Y>) (options : OptimizationOptions) = ...
```
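For intuition, the standard joint feature map for multi-class classification copies the input features into the block indexed by the label, leaving every other block zero. A self-contained sketch over plain arrays (the library version produces a SparseVector instead):

```fsharp
/// Joint feature map for k classes: the d features of phi(x) are copied into
/// the y-th block of a k*d-dimensional vector; all other blocks stay zero.
let jointFeatures (k : int) (phi : 'X -> float32[]) (x : 'X) (y : int) : float32[] =
    let f = phi x
    let d = Array.length f
    let joint = Array.zeroCreate (k * d)
    Array.blit f 0 joint (y * d) d
    joint

// With 3 classes, identity features, and label 1, the features land in block 1:
jointFeatures 3 id [| 1.0f; 2.0f |] 1 // [| 0.0f; 0.0f; 1.0f; 2.0f; 0.0f; 0.0f |]
```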


Check the code, documentation, samples, and API reference for more details. Happy hacking!

1. Chang, C.-C. & Lin, C.-J. LIBSVM: A Library for Support Vector Machines // ACM Trans. Intell. Syst. Technol. v.2, 2011 - p. 1–39.

2. Fan, R.-E., Chen, P.-H. & Lin, C.-J. Working Set Selection Using Second Order Information for Training Support Vector Machines. // J. Mach. Learn. Res. v.6, 2005 - p. 1889–1918.

3. Joachims, T., Finley, T. & Yu, C. J. Cutting-Plane Training of Structural SVMs // Mach. Learn. v.77, 2009 - p. 27–59.