StreamlinerCore
StreamlinerCore is a Julia library to generate, train, and evaluate models defined via configuration files.
Data interface
StreamlinerCore.AbstractData — Type
AbstractData{N}
Abstract type representing streamers of N datasets. In general, StreamlinerCore will use N = 1 to validate and evaluate trained models and N = 2 to train models via a training and a validation dataset.
Subtypes of AbstractData are meant to implement the following methods:
StreamlinerCore.stream — Function
stream(f, data::AbstractData, partition::Integer, streaming::Streaming)
Stream the given partition of data in batches of size batchsize on a given device. Return the result of applying f to the resulting batch iterator. Shuffling is optional and controlled by shuffle (boolean) and by the random number generator rng.
The options device, batchsize, shuffle, rng are passed via the configuration struct streaming::Streaming. See also Streaming.
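To make the batching and shuffling semantics concrete, here is a minimal sketch in plain Julia. It does not use StreamlinerCore: stream_batches is a hypothetical stand-in that only mimics the pattern described above (build a batch iterator, optionally shuffled with rng, then return the result of applying f to it), and it streams sample indices rather than arrays on a device.

```julia
using Random

# Hypothetical stand-in, not part of StreamlinerCore: batches the indices
# 1:nsamples and returns the result of applying `f` to the batch iterator.
function stream_batches(f, nsamples::Integer;
        batchsize::Integer, shuffle::Bool = false,
        rng::AbstractRNG = Random.default_rng())
    indices = collect(1:nsamples)
    # Shuffle in place only when requested, using the supplied generator.
    shuffle && shuffle!(rng, indices)
    batches = Iterators.partition(indices, batchsize)
    return f(batches)
end

# Ten samples in batches of four: the last batch is smaller.
sizes = stream_batches(10; batchsize = 4) do batches
    [length(batch) for batch in batches]
end

# With shuffling, every sample still appears exactly once.
seen = stream_batches(10; batchsize = 3, shuffle = true, rng = MersenneTwister(1)) do batches
    sort!(collect(Iterators.flatten(batches)))
end
```

Passing f as the first argument is what allows the do-block syntax, so the batch iterator never has to outlive the call.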
StreamlinerCore.ingest — Function
ingest(data::AbstractData{1}, eval_stream, select)
Ingest the output of evaluate into a suitable database, tensor, or iterator. select determines which fields of the model output to keep.
StreamlinerCore.get_templates — Function
get_templates(data::AbstractData)
Extract templates for data. Templates encode the type and size of the arrays that data will stream. See also Template.
StreamlinerCore.get_metadata — Function
get_metadata(x)::Dict{String, Any}
Extract metadata for x. The metadata should be a dictionary of information that identifies x uniquely. get_metadata has methods for AbstractData, Model, and Training.
StreamlinerCore.get_nsamples — Function
get_nsamples(data::AbstractData{N})::NTuple{N, Int} where {N}
Return the number of samples for each of the N datasets in data.
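As an illustration of the N parameter, the following sketch mirrors the shape of the interface with stand-in names. SketchData, InMemoryPair, and nsamples are hypothetical and defined here only for the example; a real implementation would subtype AbstractData{N} and add methods to the functions listed above. Storing samples along the last array dimension is also an assumption.

```julia
# Stand-in for an abstract streamer of N datasets (hypothetical name).
abstract type SketchData{N} end

# Two datasets (training and validation), hence N = 2.
struct InMemoryPair <: SketchData{2}
    train::Matrix{Float32}
    valid::Matrix{Float32}
end

# One sample count per dataset, returned as an NTuple{2, Int}.
# Assumption: samples are stored along the last (second) dimension.
nsamples(d::InMemoryPair)::NTuple{2, Int} = (size(d.train, 2), size(d.valid, 2))

data = InMemoryPair(rand(Float32, 3, 100), rand(Float32, 3, 20))
counts = nsamples(data)
```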
StreamlinerCore.Template — Type
Template(::Type{T}, size::NTuple{N, Int}) where {T, N}
Create an object of type Template. It represents arrays with eltype T and size size. Note that size does not include the minibatch dimension.
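The remark about the minibatch dimension can be illustrated with a stand-in type. SketchTemplate and batch_array below are hypothetical and do not come from StreamlinerCore; placing the minibatch dimension last is an assumption made for the example.

```julia
# Hypothetical stand-in mirroring the documented constructor signature.
struct SketchTemplate{T, N}
    size::NTuple{N, Int}
end
SketchTemplate(::Type{T}, size::NTuple{N, Int}) where {T, N} = SketchTemplate{T, N}(size)

# Allocate a batch matching the template.
# Assumption: the minibatch dimension is appended after the template dimensions.
batch_array(t::SketchTemplate{T}, batchsize::Integer) where {T} =
    zeros(T, t.size..., batchsize)

template = SketchTemplate(Float32, (28, 28, 1))
batch = batch_array(template, 32)
```

Here the template has three dimensions while the allocated batch has four, the extra one being the number of samples in the minibatch.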
Parser
StreamlinerCore.Parser — Type
Parser(;
    model, layers, sigmas, aggregators, metrics, regularizations,
    optimizers, schedules, stoppers, devices
)
Collection of dictionaries to perform the necessary conversion from the user-specified configuration file or dictionary to Julia objects.
For most use cases, one should define a default parser
parser = default_parser()
and pass it to Model and Training upon construction.
A parser object is also required to use the interface functions that read from MongoDB.
See default_parser for more advanced uses.
StreamlinerCore.default_parser — Function
default_parser(; plugins::AbstractVector{Parser}=Parser[])
Return a parser::Parser object that includes StreamlinerCore defaults together with optional plugins.
Parsed objects
StreamlinerCore.Model — Type
Model(parser::Parser, metadata::AbstractDict)
Model(parser::Parser, path::AbstractString, [vars::AbstractDict])
Create a Model object from a configuration dictionary metadata or, alternatively, from a configuration dictionary stored at path in TOML format. The optional argument vars is a dictionary of variables that can be used to fill the template given in path.
The parser::Parser handles conversion from configuration variables to Julia objects.
Given a model::Model object, use model(data) where data::AbstractData to instantiate the corresponding neural network or machine.
StreamlinerCore.Training — Type
Training(parser::Parser, metadata::AbstractDict)
Training(parser::Parser, path::AbstractString, [vars::AbstractDict])
Create a Training object from a configuration dictionary metadata or, alternatively, from a configuration dictionary stored at path in TOML format. The optional argument vars is a dictionary of variables that can be used to fill the template given in path.
The parser::Parser handles conversion from configuration variables to Julia objects.
StreamlinerCore.Streaming — Type
Streaming(parser::Parser, metadata::AbstractDict)
Streaming(parser::Parser, path::AbstractString, [vars::AbstractDict])
Create a Streaming object from a configuration dictionary metadata or, alternatively, from a configuration dictionary stored at path in TOML format. The optional argument vars is a dictionary of variables that can be used to fill the template given in path.
The parser::Parser handles conversion from configuration variables to Julia objects.
Training and evaluation
StreamlinerCore.Result — Type
@kwdef struct Result{P}
    iteration::Int
    iterations::Int
    stats::NTuple{P, Vector{Float64}}
    trained::Bool
    resumed::Maybe{Bool} = nothing
    successful::Maybe{Bool} = nothing
end
Structure to encode the result of train, finetune, or validate. Stores configuration of model, metrics, and information on the location of the model weights.
StreamlinerCore.has_weights — Function
has_weights(result::Result)
Return true if result is a successful training result, false otherwise.
StreamlinerCore.train — Function
train(
    dir::AbstractString,
    model::Model, data::AbstractData{2}, training::Training;
    callback = default_callback
)
Train model using the training configuration on data. Save the resulting weights in dir.
After every epoch, callback(m, trace) is called.
The arguments of callback work as follows:
m is the instantiated neural network or machine,
trace is an object encoding additional information, i.e., stats (average of metrics computed so far), metrics (functions used to compute stats), and iteration.
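A custom callback can be sketched as follows. The function relies only on the two-argument form callback(m, trace) and on the trace fields named above; the exact layout of trace.stats is not specified here, so the example stores it without inspecting it. fake_trace is a hypothetical stand-in, with illustrative values, used to exercise the callback outside of training.

```julia
# Collects one entry per invocation so progress can be inspected afterwards.
history = Vector{Any}()

# Callback with the documented signature: `m` is the model, `trace` carries
# `stats`, `metrics`, and `iteration`.
function recording_callback(m, trace)
    push!(history, (iteration = trace.iteration, stats = trace.stats))
    println("iteration ", trace.iteration, ": stats = ", trace.stats)
    return nothing
end

# Hypothetical stand-in for the trace object; the values are illustrative only.
fake_trace = (stats = ([0.52], [0.61]), metrics = (), iteration = 3)
recording_callback(nothing, fake_trace)
```

Such a function would then be passed through the callback keyword, as in train(dir, model, data, training; callback = recording_callback).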
StreamlinerCore.finetune — Function
finetune(
    (src, dst)::Pair,
    model::Model, data::AbstractData{2}, training::Training;
    init::Maybe{Result} = nothing, callback = default_callback
)
Load model encoded in model from src and retrain it using the training configuration on data. Save the resulting weights in dst.
Use init = result::Result to restart training where it left off. The callback keyword argument works as in train.
StreamlinerCore.loadmodel — Function
loadmodel(model::Model, data::AbstractData, device)
Load model encoded in model on the device. The object data is required as the model can only be initialized once the data dimensions are known.
loadmodel(dirname::AbstractString, model::Model, data::AbstractData, device)
Load model encoded in model, with weights saved in dirname, on the device. The object data is required as the model can only be initialized once the data dimensions are known.
StreamlinerCore.validate — Function
validate(
    dir::AbstractString,
    model::Model,
    data::AbstractData{1},
    streaming::Streaming
)
Load model with weights saved in dir and validate it on data using streaming settings streaming.
StreamlinerCore.evaluate — Function
evaluate(
    device_m, data::AbstractData{1}, streaming::Streaming,
    select::SymbolTuple = (:prediction,)
)
Evaluate model device_m on data using streaming settings streaming.
evaluate(
    dirname::AbstractString,
    model::Model, data::AbstractData{1}, streaming::Streaming,
    select::SymbolTuple = (:prediction,)
)
Load model with weights saved in dirname and evaluate it on data using streaming settings streaming.
StreamlinerCore.summarize — Function
summarize(io::IO, model::Model, data::AbstractData, training::Training)
Display summary information concerning model (structure and number of parameters) and data (number of batches and size of each batch).