StreamlinerCore

StreamlinerCore is a Julia library to generate, train, and evaluate models defined via configuration files.

Data interface

StreamlinerCore.stream - Function
stream(f, data::AbstractData, partition::Integer, streaming::Streaming)

Stream the given partition of data in batches of size batchsize on a given device. Return the result of applying f to the resulting batch iterator. Shuffling is optional and controlled by shuffle (a boolean) and by the random number generator rng.

The options device, batchsize, shuffle, rng are passed via the configuration struct streaming::Streaming. See also Streaming.
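As a sketch, assuming data::AbstractData and streaming::Streaming have already been constructed elsewhere, stream composes naturally with Julia's do-block syntax:

```julia
# Count the minibatches in partition 1; `data` and `streaming` are
# assumed to exist already (this example is illustrative).
n = stream(data, 1, streaming) do batches
    # `batches` iterates over minibatches on the configured device.
    count(Returns(true), batches)
end
```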

StreamlinerCore.ingest - Function
ingest(data::AbstractData{1}, eval_stream, select)

Ingest output of evaluate into a suitable database, tensor or iterator. select determines which fields of the model output to keep.

StreamlinerCore.Template - Type
Template(::Type{T}, size::NTuple{N, Int}) where {T, N}

Create an object of type Template. It represents arrays with eltype T and size size. Note that size does not include the minibatch dimension.
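For example, a template for single-channel 28×28 images with Float32 entries (the minibatch dimension is omitted, as noted above):

```julia
using StreamlinerCore

# Represents Float32 arrays of size (28, 28, 1) per sample;
# the minibatch dimension is not part of the size.
template = Template(Float32, (28, 28, 1))
```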


Parser

StreamlinerCore.Parser - Type
Parser(;
    model, layers, sigmas, aggregators, metrics, regularizations,
    optimizers, schedules, stoppers, devices
)

Collection of dictionaries to perform the necessary conversions from the user-specified configuration file or dictionary to Julia objects.

For most use cases, one should define a default parser

parser = default_parser()

and pass it to Model and Training upon construction.

A parser object is also required to use interface functions that read from the MongoDB.

See default_parser for more advanced uses.
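A minimal sketch of the typical flow (file paths are illustrative, not part of the API):

```julia
using StreamlinerCore

parser = default_parser()

# Construct configuration objects from TOML files; paths are hypothetical.
model = Model(parser, "model.toml")
training = Training(parser, "training.toml")
```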


Parsed objects

StreamlinerCore.Model - Type
Model(parser::Parser, metadata::AbstractDict)

Model(parser::Parser, path::AbstractString, [vars::AbstractDict])

Create a Model object from a configuration dictionary metadata or, alternatively, from a configuration dictionary stored at path in TOML format. The optional argument vars is a dictionary of variables that can be used to fill the template given in path.

The parser::Parser handles conversion from configuration variables to Julia objects.

Given a model::Model object, use model(data) where data::AbstractData to instantiate the corresponding neural network or machine.
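A sketch of both steps, assuming a hypothetical TOML file with a template variable named hidden_size (the path and variable name are illustrative):

```julia
using StreamlinerCore

parser = default_parser()

# `vars` fills template variables appearing in the TOML file.
vars = Dict("hidden_size" => 64)
model = Model(parser, "model.toml", vars)

# Instantiate the neural network or machine once `data::AbstractData`
# (constructed elsewhere) is available.
m = model(data)
```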

StreamlinerCore.Training - Type
Training(parser::Parser, metadata::AbstractDict)

Training(parser::Parser, path::AbstractString, [vars::AbstractDict])

Create a Training object from a configuration dictionary metadata or, alternatively, from a configuration dictionary stored at path in TOML format. The optional argument vars is a dictionary of variables that can be used to fill the template given in path.

The parser::Parser handles conversion from configuration variables to Julia objects.

StreamlinerCore.Streaming - Type
Streaming(parser::Parser, metadata::AbstractDict)

Streaming(parser::Parser, path::AbstractString, [vars::AbstractDict])

Create a Streaming object from a configuration dictionary metadata or, alternatively, from a configuration dictionary stored at path in TOML format. The optional argument vars is a dictionary of variables that can be used to fill the template given in path.

The parser::Parser handles conversion from configuration variables to Julia objects.


Training and evaluation

StreamlinerCore.Result - Type
@kwdef struct Result{P}
    iteration::Int
    iterations::Int
    stats::NTuple{P, Vector{Float64}}
    trained::Bool
    resumed::Maybe{Bool} = nothing
    successful::Maybe{Bool} = nothing
end

Structure to encode the result of train, finetune, or validate. Stores configuration of model, metrics, and information on the location of the model weights.

StreamlinerCore.train - Function
train(
    dir::AbstractString,
    model::Model, data::AbstractData{2}, training::Training;
    callback = default_callback
)

Train model using the training configuration on data. Save the resulting weights in dir.

After every epoch, callback(m, trace) is called.

The arguments of callback work as follows.

  • m is the instantiated neural network or machine,
  • trace is an object encoding additional information, i.e.,
    • stats (average of metrics computed so far),
    • metrics (functions used to compute stats), and
    • iteration.
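The default callback can be replaced with a custom function. A sketch that logs the running metrics after each epoch (property-style access to the trace fields is an assumption, and the directory name is illustrative):

```julia
function logging_callback(m, trace)
    # `trace.stats` holds the running metric averages and
    # `trace.iteration` the current iteration.
    @info "iteration $(trace.iteration)" trace.stats
end

result = train("runs/exp1", model, data, training; callback = logging_callback)
```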
StreamlinerCore.finetune - Function
finetune(
    (src, dst)::Pair,
    model::Model, data::AbstractData{2}, training::Training;
    init::Maybe{Result} = nothing, callback = default_callback
)

Load model encoded in model from src and retrain it using the training configuration on data. Save the resulting weights in dst.

Use init = result::Result to restart training where it left off. The callback keyword argument works as in train.
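For example, to retrain from weights stored in one directory and save the result in another (directory names are illustrative):

```julia
# Retrain the weights saved in "runs/v1", writing new weights to "runs/v2".
result = finetune("runs/v1" => "runs/v2", model, data, training)

# Resume where the previous run left off by passing its Result.
result2 = finetune("runs/v2" => "runs/v3", model, data, training; init = result)
```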

StreamlinerCore.loadmodel - Function
loadmodel(model::Model, data::AbstractData, device)

Load model encoded in model on the device. The object data is required as the model can only be initialized once the data dimensions are known.

loadmodel(dirname::AbstractString, model::Model, data::AbstractData, device)

Load model encoded in model, with weights stored in dirname, on the device. The object data is required as the model can only be initialized once the data dimensions are known.

StreamlinerCore.validate - Function
validate(
    dir::AbstractString,
    model::Model,
    data::AbstractData{1},
    streaming::Streaming
)

Load model with weights saved in dir and validate it on data using streaming settings streaming.

StreamlinerCore.evaluate - Function
evaluate(
    device_m, data::AbstractData{1}, streaming::Streaming,
    select::SymbolTuple = (:prediction,)
)

Evaluate model device_m on data using streaming settings streaming.

evaluate(
    dirname::AbstractString,
    model::Model, data::AbstractData{1}, streaming::Streaming,
    select::SymbolTuple = (:prediction,)
)

Load model with weights saved in dirname and evaluate it on data using streaming settings streaming.
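A sketch combining evaluate with ingest (the directory name is illustrative, and the exact form of select accepted by ingest is assumed to mirror the evaluate default):

```julia
# Evaluate from saved weights, keeping the :prediction field.
eval_stream = evaluate("runs/best", model, data, streaming, (:prediction,))

# Ingest the selected output into a suitable database, tensor, or iterator.
predictions = ingest(data, eval_stream, (:prediction,))
```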

StreamlinerCore.summarize - Function
summarize(io::IO, model::Model, data::AbstractData, training::Training)

Display summary information concerning model (structure and number of parameters) and data (number of batches and size of each batch).
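For instance, to print the summary to standard output (model, data, and training constructed as in the sections above):

```julia
# Structure and parameter counts of `model`, plus batch information for `data`.
summarize(stdout, model, data, training)
```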
