Julia Reference
DatasetManager.TrialConditions — Type
TrialConditions(conditions, labels; [required, types, defaults, subject_fmt])
Define the names of experimental conditions (aka factors) and the possible labels within each condition. Conditions are determined from the absolute path of potential sources.
subject is a reserved condition name for the unique identifier (ID) of individual subjects/participants in the dataset. If :subject is not explicitly included in conditions, it will be inserted at the beginning of conditions. The format of the subject identifier can be specified in labels or using the keyword argument subject_fmt.
Arguments
- conditions is a collection of condition names (e.g. (:medication, :dose)) in the order they must appear in the file paths of trial sources
- labels must have a key-value pair for each condition. The value(s) for each key define the acceptable labels for each condition. Levels may be defined using a:
  - String
  - Regex
  - old => transf [=> new], where old may be a Regex or one/multiple String(s), where transf may be a String, Function, or a SubstitutionString (only if old is a Regex), and where new is a Regex
  - Array of any combination of the preceding.
labels which are not included in conditions will be ignored.
Keyword arguments
- required=conditions: The conditions which every trial is required to have.
- types=Dict(conditions .=> String): The types that each condition should be parsed as.
- defaults=Dict{Symbol,Any}(): Default conditions to set when a given condition is not matched. Defaults can be given for required conditions. If a condition is not required, has no default, and is not matched, it will not be included as a condition for a source.
- subject_fmt=r"Subject (?<subject>\d+)?": The Regex pattern used to match the trial's subject ID. If :subject is present in labels, that definition will take precedence.
Examples
julia> labels = Dict(
           :subject => r"(?<=Patient )\d+",
           :group => ["Placebo" => "Control", "Group A", "Group B"],
           :posture => r"(sit|stand)"i => lowercase,
           :cue => r"cue[-_](fast|slow)" => s"\1 cue" => r"(fast|slow) cue");
julia> conds = TrialConditions((:subject,:group,:posture,:cue), labels;
           types=Dict(:subject => Int));
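The label definitions in the example above can be tried against a path using only Base Julia. This is a sketch for building intuition: the file path is hypothetical, the substitution string is written as s"\1 cue", and the way TrialConditions applies these patterns internally is an assumption, not something this page states.

```julia
# Hypothetical file path; the directory layout is an assumption for illustration.
path = "/data/Patient 07/Group A/Stand/cue-fast/trial.c3d"

# :subject => r"(?<=Patient )\d+"
# The lookbehind requires "Patient " before the digits but keeps only the digits.
subject_label = match(r"(?<=Patient )\d+", path).match
subject_id = parse(Int, subject_label)   # mirrors types=Dict(:subject => Int)

# :posture => r"(sit|stand)"i => lowercase
# A case-insensitive match followed by a Function transform.
posture = lowercase(match(r"(sit|stand)"i, path).match)

# :cue => r"cue[-_](fast|slow)" => s"\1 cue" => r"(fast|slow) cue"
# The SubstitutionString rewrites the match; the final Regex matches the rewritten text.
renamed = replace(path, r"cue[-_](fast|slow)" => s"\1 cue")
cue = match(r"(fast|slow) cue", renamed).match

# The default subject_fmt uses a named capture group called `subject`.
m = match(r"Subject (?<subject>\d+)?", "/data/Subject 12/walk.c3d")
default_id = m[:subject]
```

Testing each pattern on a few real paths this way, before building a TrialConditions, is a cheap check that the labels capture what is intended.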
DatasetManager.DataSubset — Type
DataSubset(name, source::Union{Function,<:AbstractSource}, dir, pattern; [dependent=false])
Describes a subset of source data files found within a folder dir which match pattern (using glob syntax). The name of the DataSubset will be used in findtrials as the source name in a Trial.
Some sources described by a DataSubset may not be relevant as standalone/independent Trials (e.g. maximal voluntary contraction "trials", when collecting EMG data, are typically only relevant to movement trials for that specific subject/session of a data collection, but are not useful on their own). Dependent sources (i.e. dependent=true) will not create new trials in findtrials and will only be added to pre-existing trials when the required conditions and a "condition" with the same name as the DataSubset's name exist. The matched "condition" will be used in findtrials! as the source name in corresponding Trials.
If source is a function, it must accept a file path and return a Source.
See also: Source, TrialConditions, findtrials, findtrials!
Examples
julia> DataSubset("events", Source{Events}, "/path/to/subset", "Subject [0-9]*/events/*.tsv")
DataSubset("events", Source{Events}, "/path/to/subset", "Subject [0-9]*/events/*.tsv")
DataSubsets for dependent sources
julia> labels = Dict(
           :subject => r"(?<=Patient )\d+",
           :session => r"(?<=Session )\d+",
           :mvic => r"mvic_[rl](bic|tric)", # Defines possible MVIC "trial" names
       );
julia> # Only :subject and :session are required conditions (for matching existing trials)
julia> conds = TrialConditions((:subject,:session,:mvic), labels; required=(:subject,:session,));
julia> # Note the DataSubset name matches the "condition" name in `labels`
julia> subsets = [
           DataSubset("mvic", Source{C3DFile}, c3dpath, "Subject [0-9]*/Session [0-2]/*.c3d"; dependent=true)
       ];
julia> findtrials!(trials, subsets, conds)
DatasetManager.Trial — Type
Trial{ID}(subject::ID, name::String, [conditions::Dict{Symbol}, sources::Dict{String}])
Characterizes a single instance of data collected from a specific subject. The Trial has a name, and may have one or more conditions which describe experimental conditions and/or subject-specific characteristics which are relevant to subsequent analyses. A Trial may have one or more complementary sources of data (e.g. simultaneous recordings from separate equipment stored in separate files, supplementary data for a primary data source, etc).
Examples
julia> trial1 = Trial(1, "baseline", Dict(:group => "control", :session => 2))
Trial{Int64}
Subject: 1
Name: baseline
Conditions:
:group => "control"
:session => 2
No sources
DatasetManager.subject — Function
subject(trial::Trial{ID}) -> ID
subject(seg::Union{Segment,SegmentResult}) -> ID
Get the subject identifier of a Trial, Segment, or SegmentResult.
DatasetManager.conditions — Function
conditions(trial::Trial{ID}) -> Dict{Symbol,Any}
conditions(seg::Union{Segment,SegmentResult}) -> Dict{Symbol}
Get the conditions of a Trial, Segment, or SegmentResult.
DatasetManager.sources — Function
sources(trial::Trial{ID}) -> Dict{String,AbstractSource}
Get the sources for trial.
DatasetManager.hassubject — Function
hassubject(trial, sub) -> Bool
Test if the subject ID for trial is equal to sub.
hassubject(sub) -> Bool
Create a function that tests if the subject ID of a trial is equal to sub, i.e. a function equivalent to t -> hassubject(t, sub).
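The one-argument form follows a common Julia pattern: a method that captures its argument and returns a predicate for use with filter and similar functions. The sketch below reproduces that pattern with hypothetical stand-ins (ToyTrial and toy_hassubject are not part of DatasetManager) so that it runs without the package.

```julia
# Hypothetical stand-in for a Trial; only the subject field matters here.
struct ToyTrial
    subject::Int
end

# Two-argument form: performs the test directly.
toy_hassubject(t::ToyTrial, sub) = t.subject == sub

# One-argument form: returns a closure that remembers `sub`.
toy_hassubject(sub) = t -> toy_hassubject(t, sub)

toy_trials = [ToyTrial(1), ToyTrial(2), ToyTrial(3)]

# The closure can be passed anywhere a predicate is expected.
selected = filter(toy_hassubject(2), toy_trials)
selected_subjects = [t.subject for t in selected]
```

The same shape applies to the one-argument forms of hassource and hascondition described in this reference.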
DatasetManager.hassource — Function
hassource(trial, src::String) -> Bool
hassource(trial, srctype::S) where {S<:AbstractSource} -> Bool
hassource(trial, src::Regex) -> Bool
Check if trial has a source with key or type matching src.
Examples
julia> trial1 = Trial(1, "baseline", Dict(), Dict("model" => Source{Nothing}()));
julia> hassource(trial1, "model")
true
julia> hassource(trial1, Source{Nothing})
true
julia> hassource(trial1, r"test*")
false
hassource(src) -> Bool
Create a function that tests if a trial has the source src, i.e. a function equivalent to t -> hassource(t, src).
Examples
julia> trial1 = Trial(1, "baseline", Dict(), Dict("model" => Source{Nothing}()));
julia> trial2 = Trial(2, "baseline");
julia> filter(hassource("model"), [trial1, trial2])
1-element Vector{Trial{Int64}}:
Trial(1, "baseline", 0 conditions, 1 source)
DatasetManager.hascondition — Function
hascondition(trial, (condition [=> value])...) -> Bool
Test if trial has condition, or that condition matches value. Specifying value is optional. Multiple conditions and/or condition pairs can be given which all must be true to match. value can be a single level, multiple acceptable levels, or a predicate function.
Examples
julia> trial = Trial(1, "baseline", Dict(:group => "control", :session => 2));
julia> hascondition(trial, :group)
true
julia> hascondition(trial, :group => "A")
false
julia> hascondition(trial, :group => ["control", "A"])
true
julia> hascondition(trial, :group => "A", :session => 1)
false
julia> hascondition(trial, :group => ["control", "A"], :session => >=(2))
true
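The three kinds of value (a single level, several acceptable levels, or a predicate function) can be modelled with plain dispatch. The following is a toy re-implementation over a Dict, written only to make the matching rules concrete. The names toy_value_ok, toy_match, and toy_hascondition are hypothetical and this is not the package's code.

```julia
# How a stored level is compared against the requested value.
toy_value_ok(actual, wanted) = actual == wanted                    # single level
toy_value_ok(actual, wanted::AbstractVector) = actual in wanted    # any of several levels
toy_value_ok(actual, wanted::Function) = wanted(actual)            # predicate function

# A bare Symbol only checks that the condition is present.
toy_match(conds::AbstractDict, name::Symbol) = haskey(conds, name)
# A Pair checks presence and then the value.
toy_match(conds::AbstractDict, p::Pair) =
    haskey(conds, p.first) && toy_value_ok(conds[p.first], p.second)

# All given conditions must match.
toy_hascondition(conds::AbstractDict, args...) = all(a -> toy_match(conds, a), args)

conds = Dict{Symbol,Any}(:group => "control", :session => 2)

r1 = toy_hascondition(conds, :group)
r2 = toy_hascondition(conds, :group => "A")
r3 = toy_hascondition(conds, :group => ["control", "A"])
r4 = toy_hascondition(conds, :group => "A", :session => 1)
r5 = toy_hascondition(conds, :group => ["control", "A"], :session => >=(2))
```

The five calls mirror the documented examples above and give the same true/false pattern under these assumed rules.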
hascondition((condition => value)...) -> Bool
Create a function that tests if a trial has the given condition(s)/value(s), i.e. a function equivalent to t -> hascondition(t, conditions...).
Examples
julia> trial1 = Trial(1, "baseline", Dict(:group => "control", :session => 2));
julia> trial2 = Trial(2, "baseline", Dict(:group => "A", :session => 1));
julia> filter(hascondition(:group => "A"), [trial1, trial2])
1-element Vector{Trial{Int64}}:
Trial(2, "baseline", 2 conditions, 0 sources)
DatasetManager.getsource — Function
getsource(trial, name::String) -> Source
getsource(trial, src::S) where {S<:AbstractSource} -> Source
getsource(trial, name::String => src::Type{<:AbstractSource}) -> Source
getsource(trial, pattern::Regex) -> Vector{Source}
Return a source from trial with the requested name or src. When both name and src are given as a pair, a source with name will be searched for first, and if not found, a source of type src will be searched for. When src (as an <:AbstractSource) is given, only a source of type src may be present, otherwise an error will be thrown.
If a Regex pattern is given, multiple sources may be returned.
DatasetManager.findtrials — Function
findtrials(subsets, conditions; <keyword arguments>) -> Vector{Trial}
Find all the trials matching conditions which can be found in subsets.
Keyword arguments:
- ignorefiles::Union{Nothing, Vector{String}}=nothing: A list of files, given in the form of an absolute path, that are in any of the subsets folders which are to be ignored.
- debug=false: Show files that did not match (all) the required conditions
- verbose=false: Show files that did match all required conditions when debug=true
- maxlogs=50: Maximum number of files per subset to show when debugging
See also: Trial, findtrials!, DataSubset, TrialConditions
DatasetManager.findtrials! — Function
findtrials!(trials, subsets, conditions; <keyword arguments>) -> Vector{Trial}
Find more trials and/or additional sources for existing trials.
For DataSubsets in subsets which are dependent, candidate source files must have the required conditions and have a "condition" matching the DataSubset name.
See also: findtrials, Trial, DataSubset, TrialConditions
DatasetManager.summarize — Function
summarize([io,] trials; [verbosity=5, ignoreconditions])
Summarize a vector of Trials.
Examples
julia> summarize(trials)
Subjects:
└ 10: "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"
Trials:
├ 40 trials
└ Trials per subject:
└ 4: 10 subjects (100%)
Conditions:
├ Observed levels:
│ ├ stim => ["placebo", "stim"]
│ └ session => [1, 2]
└ Unique level combinations observed: 4 (full factorial)
stim │ session │ # trials
─────────┼─────────┼──────────
placebo │ 1 │ 10
stim │ 1 │ 10
placebo │ 2 │ 10
stim │ 2 │ 10
Sources:
└ "events" => Source{GaitEvents}, 40 trials (100%)
DatasetManager.analyzedataset — Function
analyzedataset(f, trials, Type{<:AbstractSource}; [enable_progress, show_errors, threaded]) -> Vector{SegmentResult}
Map function f over all trials (multi-threaded by default) and return the resulting SegmentResults. If f errors for a given trial, the SegmentResult for that trial will be empty (no results), and the error will be rethrown along with the trial which caused the error.
Keyword arguments
- enable_progress=true: Enable the progress meter
- show_errors=true: Show trials and their errors
- threaded=true
DatasetManager.stack — Function
stack(rs::Vector{SegmentResult}, conds; [variables])
Compile the results into a stacked, long-form DataFrame.
DatasetManager.write_results — Function
write_results(filename, df, conditions; [variables, format, archive])
Write the results in df (generated by DatasetManager.stack) to file at filename. format may be :wide or :long. If archive == true and filename already exists, the existing file will be moved to "$(filename).bak" before the new results are written to filename.
conditions and variables specify which conditions or variables, respectively, should be included in the output.
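The archive behavior amounts to a move-then-write sequence. A minimal sketch with Base file functions follows. toy_write_with_archive is a hypothetical helper, and overwriting an older ".bak" file (force=true) is an assumption made here, since the description does not say what happens when a backup already exists.

```julia
# Hypothetical helper illustrating the archive step; not DatasetManager's implementation.
function toy_write_with_archive(filename::AbstractString, contents::AbstractString;
                                archive::Bool=true)
    if archive && isfile(filename)
        # Assumption: an older backup is replaced rather than kept.
        mv(filename, filename * ".bak"; force=true)
    end
    write(filename, contents)
    return filename
end

dir = mktempdir()
out = joinpath(dir, "results.csv")

toy_write_with_archive(out, "first run\n")    # nothing to archive yet
toy_write_with_archive(out, "second run\n")   # first file becomes results.csv.bak
```

After the second call, the newest results are at the original path and the previous results sit beside them with the ".bak" suffix.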
DatasetManager.export_trials — Function
export_trials([f,] trials, dir[, sources])
Export (copy) sources in trials to dir. When left unspecified, sources is set to all unique sources found in trials. A function f can optionally be given to control the names of the exported data; it must accept 2 arguments (a trial and a src which is of eltype(sources)). The default behavior exports all sources to dir with no subdirectories, using the naming schema "$trial.subject_$srcname_basename(sourcepath)" (pseudo-code).
Examples
julia> export_trials(trials, pwd()) do trial, source
           "$(subject(trial))_$(conditions(trial)[:group]).$(srcext(source))"
       end
Sources
DatasetManager.AbstractSource — Type
AbstractSource
The abstract supertype for custom sources. Implement a subtype of AbstractSource if Source{S} is not sufficient for your source requirements (e.g. your source has additional information besides the path, such as encoding parameters, compression indicator, etc, that needs to be associated with each instance).
Extended help
AbstractSource interface requirements
All subtypes of AbstractSource must:
- have a path field or extend the sourcepath function.
- have at least these two constructor methods:
  - empty constructor
  - single argument constructor accepting a string of an absolute path
AbstractSource subtypes should:
- have a readsource method
AbstractSource subtypes may implement these additional methods to improve user experience and/or enable additional functionality:
- readsegment
- generatesource (if enabling requiresource! generation)
- dependencies (if defining a generatesource method)
- srcext
- srcname_default
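The mandatory part of the interface (a path field plus the two constructors) can be sketched as below. To keep the block runnable without the package, ToyAbstractSource stands in for DatasetManager.AbstractSource. CompressedCSV and its compressed field are hypothetical, and using tempname() for the empty constructor is an assumption modelled on the documented behavior of Source{S}().

```julia
# Stand-in for DatasetManager.AbstractSource so this sketch runs on its own.
abstract type ToyAbstractSource end

# A hypothetical source that carries extra information besides the path.
struct CompressedCSV <: ToyAbstractSource
    path::String
    compressed::Bool
end

# Single-argument constructor accepting a path string.
CompressedCSV(path::AbstractString) = CompressedCSV(path, endswith(path, ".gz"))

# Empty constructor; falls back to a unique temporary path (assumption).
CompressedCSV() = CompressedCSV(tempname())

src = CompressedCSV("/data/subject1/trial.csv.gz")
empty_src = CompressedCSV()
```

In a real subtype, the struct would be declared as `<: AbstractSource` after `using DatasetManager`, and a readsource method would normally be added as well.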
DatasetManager.Source — Type
Source{S}([path])
A standard/default source, where S can be an existing type or a singleton type defined for dispatch purposes. When path is omitted, a unique temporary file path is used.
Examples
julia> Source{Missing}()
Source{Missing}("/tmp/jl_ruMeMy")
julia> struct RandomDev; end
julia> Source{RandomDev}("/dev/urandom")
Source{RandomDev}("/dev/urandom")
DatasetManager.readsource — Function
readsource(src::S; kwargs...) where {S <: AbstractSource}
Read the source data from file.
DatasetManager.sourcepath — Function
sourcepath(src) -> String
Return the absolute path to the src.
DatasetManager.requiresource! — Function
requiresource!(trial, name::String; [force, deps, kwargs...]) -> nothing
requiresource!(trial, name::Regex; <keyword arguments>) -> nothing
requiresource!(trial, src::S; <keyword arguments>) where {S<:AbstractSource} -> nothing
requiresource!(trial, ::Type{S}; <keyword arguments>) where {S<:AbstractSource} -> nothing
requiresource!(trial, name::String => src; <keyword arguments>) -> nothing
Require src to exist for trial. Generate src if it does not exist, or throw an error if src is not present and cannot be generated. When src is a Regex, deps is constrained to UnknownDeps, regardless of the keyword argument. Unused keyword arguments will be passed on to the generatesource method for src.
Keyword arguments
- force=false: Force generating src, even if it already exists
- deps=dependencies(src): deps can be manually specified as a name => src::AbstractSource pair per dependency if there exist multiple sources that would match a dependent source type.
- kwargs... are passed on to the generatesource function for src.
Examples
julia> dependencies(Source{Events})
(Source{C3DFile},)
julia> requiresource!(trial, Source{Events})
julia> requiresource!(trial, Source{Events}; force=true, deps=("mainc3d" => Source{C3DFile}))
DatasetManager.generatesource — Function
generatesource(trial, src::Union{S,Type{S}}, deps; kwargs...) where S <: AbstractSource -> newsrc::typeof(src)
Generate source src using dependent sources deps from trial. Returns a source of the same type as src, but is not required to be exactly equal to src (i.e. a different sourcepath(newsrc) is acceptable). Keyword arguments named force and deps should not be used in generatesource methods since they are used in requiresource! and will not be passed on to the generatesource method.
DatasetManager.dependencies — Function
dependencies(src::Union{S,Type{S}}) where {S<:AbstractSource} -> Tuple{<:AbstractSource}
Get the sources that src depends on to be generated by generatesource. Defaults to UnknownDeps, which prevents automatically generating src.
Extended help
In some cases, it may be preferable to leave the dependencies for a custom source undefined if e.g. the source is a container of some form that may contain data generated from, and depending on, an unknown number/type of sources.
DatasetManager.srcext — Function
srcext(src::Type{S}) where {S<:AbstractSource} -> String
srcext(src::S) where {S<:AbstractSource} -> String
Return the file extension for src. If src is an AbstractSource subtype, the default extension for that subtype will be returned; if src is an AbstractSource instance, the returned extension will be the actual extension for that src, regardless of whether it is the same as the default extension for that src type.
Extended help
When extending this function for a custom source, only define a method for the source type, e.g. srcext(::Type{MySource}). The period (.) should be included as the first character of the extension.
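The distinction between the default extension of a type and the actual extension of an instance can be sketched with Base's splitext, which also returns the extension with its leading period. ToyEvents and toy_srcext are hypothetical names used so the block runs without the package, and deriving the instance extension from the path is an assumption about how the actual extension is obtained.

```julia
# Hypothetical source type with only a path.
struct ToyEvents
    path::String
end

# Type method: the default extension, with the leading period included.
toy_srcext(::Type{ToyEvents}) = ".tsv"

# Instance method: the extension actually present on this source's path (assumption).
toy_srcext(src::ToyEvents) = last(splitext(src.path))

default_ext = toy_srcext(ToyEvents)
actual_ext = toy_srcext(ToyEvents("/path/to/events.csv"))
```

Here the default and actual extensions differ, which is the case the description above calls out.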
DatasetManager.srcname_default — Function
srcname_default(src::Union{S,Type{S}}) where {S<:AbstractSource} -> String
Get the default name for a source of type S.
Segments
DatasetManager.Segment — Type
Segment(trial, src::Union{S,Type{S},String}; [start, finish, conditions])
Describes a portion of a source in trial from time start to finish with segment specific conditions, if applicable.
If src is a String or AbstractSource, it must refer to a source that exists in trial. start and finish are used in readsegment to trim time from the beginning and/or end of the data read from src. start and finish default to the beginning and end, respectively, of the source/trial. start must be before finish, but they are otherwise only validated during readsegment.
Any conditions present in trial will be merged into the conditions for the segment. Note: if src is a <:AbstractSource instance, it will not be added to the trial's sources.
Example:
julia> t = Trial(1, "intervention", Dict(:group => "control"), Dict("events" => Source{Events}("/path/to/file")));
julia> seg = Segment(t, "events")
Segment{Source{Events},Int64}
Trial: Trial(1, "intervention", 1 conditions, 1 source)
Source: Source{Events}("/path/to/file")
Time: beginning to end
Conditions: (same as parent trial)
:group => "control"
julia> seg = Segment(t, Source{Events}("/new/events/file"); start=0.0, finish=10.0, conditions=Dict(:stimulus => "sham"))
Segment{Source{Events},Int64}
Trial: Trial(1, "intervention", 1 conditions, 1 source)
Source: Source{Events}("/new/events/file")
Time: 0.0 to 10.0
Conditions:
:stimulus => "sham"
:group => "control"
DatasetManager.readsegment — Function
readsegment(seg::Segment{S}; warn=true, kwargs...) where S <: AbstractSource
If defined for S, returns the portion of seg.source from seg.start to seg.finish. Otherwise, equivalent to readsource (i.e. no trimming of time-series occurs). Warns by default if the main method is called.
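What a source-specific readsegment method typically has to do is trim a time series to the window from start to finish. The sketch below shows that trimming on plain vectors. toy_trim is a hypothetical helper, the inclusive bounds are an assumption, and a real method would take a Segment and read the data from its source.

```julia
# Hypothetical trimming helper; `nothing` means "beginning" or "end" of the data.
function toy_trim(times::AbstractVector, values::AbstractVector;
                  start=nothing, finish=nothing)
    lo = something(start, first(times))
    hi = something(finish, last(times))
    lo < hi || throw(ArgumentError("start must be before finish"))
    # Assumption: both bounds are inclusive.
    keep = (lo .<= times) .& (times .<= hi)
    return times[keep], values[keep]
end

times = collect(0.0:0.5:3.0)   # 0.0, 0.5, ..., 3.0
values = collect(1:7)

t_trim, v_trim = toy_trim(times, values; start=1.0, finish=2.0)
t_all, v_all = toy_trim(times, values)   # defaults keep everything
```

With no start or finish given, nothing is trimmed, which matches the documented default of "beginning to end".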
DatasetManager.SegmentResult — Type
SegmentResult(segment::Segment, results::Dict{Symbol})
Contains the results of any analyses performed on the trial segment in a Dict.
Example:
segresult = SegmentResult(seg, Dict(:avg => 3.5))
DatasetManager.trial — Function
trial(seg::Union{Segment,SegmentResult}) -> Trial
Return the parent Trial for the given Segment or SegmentResult
DatasetManager.segment — Function
Get the segment of a SegmentResult.
DatasetManager.source — Function
source(seg::Union{Segment,SegmentResult}) -> AbstractSource
Return the source for the parent Trial of the given Segment or SegmentResult
DatasetManager.results — Function
Get the results of a SegmentResult.
DatasetManager.resultsvariables — Function
resultsvariables(sr::Union{SegmentResult,Vector{SegmentResult}})
Get the unique variables for SegmentResults.