Julia Reference

DatasetManager.TrialConditions (Type)
TrialConditions(conditions, labels; [required, types, defaults, subject_fmt])

Define the names of experimental conditions (aka factors) and the possible labels within each condition. Conditions are determined from the absolute path of potential sources.

:subject is a reserved condition name for the unique identifier (ID) of individual subjects/participants in the dataset. If :subject is not explicitly included in conditions, it will be inserted at the beginning of conditions. The format of the subject identifier can be specified in labels or with the keyword argument subject_fmt.

Arguments

  • conditions is a collection of condition names (e.g. (:medication, :dose)) in the order they must appear in the file paths of trial sources
  • labels must have a key-value pair for each condition. The value(s) for each key define the acceptable labels for that condition. Labels may be defined using any of the following:
    • String
    • Regex
    • old => transf [=> new], where old may be a Regex or one/multiple String(s), where transf may be a String, Function, or a SubstitutionString (only if old is a Regex), and where new is a Regex
    • Array of any combination of the preceding.
    Keys in labels which are not included in conditions will be ignored.

Keyword arguments

  • required=conditions: The conditions which every trial is required to have.
  • types=Dict(conditions .=> String): The types that each condition should be parsed as
  • defaults=Dict{Symbol,Any}(): Default conditions to set when a given condition is not matched. Defaults can be given for required conditions. If a condition is not required, has no default, and is not matched, it will not be included as a condition for a source.
  • subject_fmt=r"Subject (?<subject>\d+)?": The Regex pattern used to match the trial's subject ID. If :subject is present in labels, that definition will take precedence.

Examples

julia> labels = Dict(
    :subject => r"(?<=Patient )\d+",
    :group => ["Placebo" => "Control", "Group A", "Group B"],
    :posture => r"(sit|stand)"i => lowercase,
    :cue => r"cue[-_](fast|slow)" => s"\1 cue" => r"(fast|slow) cue");

julia> conds = TrialConditions((:subject,:group,:posture,:cue), labels;
    types=Dict(:subject => Int));
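
As a rough illustration of how the label definitions above behave, the following sketch applies the same patterns to a made-up path using only Base functions. It does not call TrialConditions; the path is hypothetical, and the way DatasetManager applies the patterns internally is an assumption here.

```julia
# Hypothetical path; not taken from a real dataset.
path = "/data/Patient 12/Group A/Stand/cue-fast/trial.c3d"

# Regex label: the matched text becomes the label, parsed here as an Int
# (mirroring `types=Dict(:subject => Int)`).
subject = parse(Int, match(r"(?<=Patient )\d+", path).match)

# `old => transf` with a Function: the match is passed through `lowercase`.
posture = lowercase(match(r"(sit|stand)"i, path).match)

# `old => transf` with a SubstitutionString: rewrite the match with `replace`.
old = r"cue[-_](fast|slow)"
cue = replace(match(old, path).match, old => s"\1 cue")

println((subject, posture, cue))
```

The rewritten cue label ("fast cue") is what the final `new` pattern, r"(fast|slow) cue", would then be expected to match.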

DatasetManager.DataSubset (Type)
DataSubset(name, source::Union{Function,<:AbstractSource}, dir, pattern; [dependent=false])

Describes a subset of source data files found within a folder dir which match pattern (using glob syntax). The name of the DataSubset will be used in findtrials as the source name in a Trial.

Some sources described by a DataSubset may not be relevant as standalone/independent Trials (e.g. maximal voluntary contraction "trials", when collecting EMG data, are typically only relevant to movement trials for that specific subject/session of a data collection, but are not useful on their own). Dependent sources (i.e. dependent=true) will not create new trials in findtrials; they will only be added to pre-existing trials, and only when the source has the required conditions and a "condition" with the same name as the DataSubset's name. The matched "condition" will be used in findtrials! as the source name in the corresponding Trials.

If source is a function, it must accept a file path and return a Source.

See also: Source, TrialConditions, findtrials, findtrials!

Examples

julia> DataSubset("events", Source{Events}, "/path/to/subset", "Subject [0-9]*/events/*.tsv")
DataSubset("events", Source{Events}, "/path/to/subset", "Subject [0-9]*/events/*.tsv")

DataSubsets for dependent sources

julia> labels = Dict(
        :subject => r"(?<=Patient )\d+",
        :session => r"(?<=Session )\d+",
        :mvic => r"mvic_[rl](bic|tric)", # Defines possible MVIC "trial" names
    );

julia> # Only :subject and :session are required conditions (for matching existing trials)
julia> conds = TrialConditions((:subject,:session,:mvic), labels; required=(:subject,:session,));

julia> # Note the DataSubset name matches the "condition" name in `labels`
julia> subsets = [
    DataSubset("mvic", Source{C3DFile}, c3dpath, "Subject [0-9]*/Session [0-2]/*.c3d"; dependent=true)
];

julia> findtrials!(trials, subsets, conds)

DatasetManager.Trial (Type)
Trial{ID}(subject::ID, name::String, [conditions::Dict{Symbol}, sources::Dict{String}])

Characterizes a single instance of data collected from a specific subject. The Trial has a name, and may have one or more conditions which describe experimental conditions and/or subject-specific characteristics which are relevant to subsequent analyses. A Trial may have one or more complementary sources of data (e.g. simultaneous recordings from separate equipment stored in separate files, supplementary data for a primary data source, etc).

Examples

julia> trial1 = Trial(1, "baseline", Dict(:group => "control", :session => 2))
Trial{Int64}
  Subject: 1
  Name: baseline
  Conditions:
    :group => "control"
    :session => 2
  No sources

DatasetManager.subject (Function)
subject(trial::Trial{ID}) -> ID
subject(seg::Union{Segment,SegmentResult}) -> ID

Get the subject identifier of a Trial, Segment, or SegmentResult.

DatasetManager.conditions (Function)
conditions(trial::Trial{ID}) -> Dict{Symbol,Any}
conditions(seg::Union{Segment,SegmentResult}) -> Dict{Symbol}

Get the conditions of a Trial, Segment, or SegmentResult.

DatasetManager.hassubject (Function)
hassubject(trial, sub) -> Bool

Test if the subject ID for trial is equal to sub.

hassubject(sub) -> Bool

Create a function that tests if the subject ID of a trial is equal to sub, i.e. a function equivalent to t -> hassubject(t, sub).
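
The single-argument form is convenient with higher-order functions such as filter. The sketch below shows the general closure pattern with a hypothetical stand-in type (DemoTrial) and a hypothetical function name (demo_hassubject); it is not DatasetManager's implementation.

```julia
# Hypothetical stand-in for a trial; only carries a subject ID.
struct DemoTrial
    subject::Int
end

# Two-argument form: a direct test.
demo_hassubject(t::DemoTrial, sub) = t.subject == sub

# One-argument form: returns a predicate equivalent to
# `t -> demo_hassubject(t, sub)`.
demo_hassubject(sub) = t -> demo_hassubject(t, sub)

trials = [DemoTrial(1), DemoTrial(2), DemoTrial(1)]
matched = filter(demo_hassubject(1), trials)
println(length(matched))
```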

DatasetManager.hassource (Function)
hassource(trial, src::String) -> Bool
hassource(trial, srctype::S) where {S<:AbstractSource} -> Bool
hassource(trial, src::Regex) -> Bool

Check if trial has a source with key or type matching src.

Examples

julia> trial1 = Trial(1, "baseline", Dict(), Dict("model" => Source{Nothing}()));

julia> hassource(trial1, "model")
true

julia> hassource(trial1, Source{Nothing})
true

julia> hassource(trial1, r"test*")
false

hassource(src) -> Bool

Create a function that tests if a trial has the source src, i.e. a function equivalent to t -> hassource(t, src).

Examples

julia> trial1 = Trial(1, "baseline", Dict(), Dict("model" => Source{Nothing}()));

julia> trial2 = Trial(2, "baseline");

julia> filter(hassource("model"), [trial1, trial2])
1-element Vector{Trial{Int64}}:
 Trial(1, "baseline", 0 conditions, 1 source)

DatasetManager.hascondition (Function)
hascondition(trial, (condition [=> value])...) -> Bool

Test if trial has condition, or that condition matches value. Specifying value is optional. Multiple conditions and/or condition pairs can be given which all must be true to match. value can be a single level, multiple acceptable levels, or a predicate function.

Examples

julia> trial = Trial(1, "baseline", Dict(:group => "control", :session => 2));

julia> hascondition(trial, :group)
true

julia> hascondition(trial, :group => "A")
false

julia> hascondition(trial, :group => ["control", "A"])
true

julia> hascondition(trial, :group => "A", :session => 1)
false

julia> hascondition(trial, :group => ["control", "A"], :session => >=(2))
true
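
The three accepted forms of value (a single level, several acceptable levels, or a predicate) map naturally onto multiple dispatch. The following is a minimal sketch of that idea operating on a plain Dict of conditions; the names matches, check, and demo_hascondition are hypothetical and this is not necessarily how DatasetManager implements hascondition.

```julia
# How a single observed level is compared against the requested value.
matches(actual, wanted) = actual == wanted                  # single level
matches(actual, wanted::AbstractVector) = actual in wanted  # several levels
matches(actual, wanted::Function) = wanted(actual)          # predicate

# A bare Symbol only tests for presence; a Pair also tests the value.
check(conds, name::Symbol) = haskey(conds, name)
check(conds, spec::Pair) =
    haskey(conds, spec.first) && matches(conds[spec.first], spec.second)

# All given conditions must hold.
demo_hascondition(conds, specs...) = all(s -> check(conds, s), specs)

conds = Dict(:group => "control", :session => 2)
println(demo_hascondition(conds, :group => ["control", "A"], :session => >=(2)))
```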

hascondition((condition => value)...) -> Bool

Create a function that tests if a trial has the given condition(s)/value(s), i.e. a function equivalent to t -> hascondition(t, conditions...).

Examples

julia> trial1 = Trial(1, "baseline", Dict(:group => "control", :session => 2));

julia> trial2 = Trial(2, "baseline", Dict(:group => "A", :session => 1));

julia> filter(hascondition(:group => "A"), [trial1, trial2])
1-element Vector{Trial{Int64}}:
 Trial(2, "baseline", 2 conditions, 0 sources)

DatasetManager.getsource (Function)
getsource(trial, name::String) -> Source
getsource(trial, src::S) where {S<:AbstractSource} -> Source
getsource(trial, name::String => src::Type{<:AbstractSource}) -> Source
getsource(trial, pattern::Regex) -> Vector{Source}

Return a source from trial with the requested name or src. When both name and src are given as a pair, a source with name will be searched for first, and if not found, a source of type src will be searched for. When src (as an <:AbstractSource) is given, only a source of type src may be present, otherwise an error will be thrown.

If a Regex pattern is given, multiple sources may be returned.
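
The name-then-type lookup order for the pair form can be sketched with a plain Dict. Here demo_getsource is a hypothetical name, ordinary Julia values stand in for sources, and treating an ambiguous type match as an error (via only) is an assumption carried over from the type-only form.

```julia
# Look up by name first; fall back to the single entry of the requested type.
function demo_getsource(sources::Dict{String}, spec::Pair{String,<:Type})
    name, T = spec
    haskey(sources, name) && return sources[name]
    candidates = [s for s in values(sources) if s isa T]
    return only(candidates)  # throws unless exactly one candidate exists
end

# Plain values used as stand-ins for sources.
sources = Dict{String,Any}("a" => 1, "b" => "text")

by_name = demo_getsource(sources, "a" => String)        # name wins
by_type = demo_getsource(sources, "missing" => String)  # falls back to type
println((by_name, by_type))
```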

DatasetManager.findtrials (Function)
findtrials(subsets, conditions; <keyword arguments>) -> Vector{Trial}

Find all the trials matching conditions which can be found in subsets.

Keyword arguments:

  • ignorefiles::Union{Nothing, Vector{String}}=nothing: A list of files, given as absolute paths, within any of the subsets' folders that should be ignored.
  • debug=false: Show files that did not match (all) the required conditions
  • verbose=false: Show files that did match all required conditions when debug=true
  • maxlogs=50: Maximum number of files per subset to show when debugging

See also: Trial, findtrials!, DataSubset, TrialConditions

DatasetManager.findtrials! (Function)
findtrials!(trials, subsets, conditions; <keyword arguments>) -> Vector{Trial}

Find more trials and/or additional sources for existing trials.

For DataSubsets in subsets which are dependent, candidate source files must have the required conditions and have a "condition" matching the DataSubset name.

See also: findtrials, Trial, DataSubset, TrialConditions

DatasetManager.summarize (Function)
summarize([io,] trials; [verbosity=5, ignoreconditions])

Summarize a vector of Trials.

Examples

julia> summarize(trials)
Subjects:
 └ 10: "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10"
Trials:
 ├ 40 trials
 └ Trials per subject:
   └ 4: 10 subjects (100%)
Conditions:
 ├ Observed levels:
 │ ├ stim => ["placebo", "stim"]
 │ └ session => [1, 2]
 └ Unique level combinations observed: 4 (full factorial)
        stim │ session │ # trials
    ─────────┼─────────┼──────────
     placebo │       1 │ 10
        stim │       1 │ 10
     placebo │       2 │ 10
        stim │       2 │ 10
Sources:
 └ "events" => Source{GaitEvents}, 40 trials (100%)

DatasetManager.analyzedataset (Function)
analyzedataset(f, trials, Type{<:AbstractSource}; [enable_progress, show_errors, threaded]) -> Vector{SegmentResult}

Map function f over all trials (multi-threaded by default) and return the resulting SegmentResults. If f errors for a given trial, the SegmentResult for that trial will be empty (no results), and the error will be rethrown along with the trial which caused it.

Keyword arguments

  • enable_progress=true: Enable the progress meter
  • show_errors=true: Show trials and their errors
  • threaded=true: Map f over trials using multiple threads

DatasetManager.stack (Function)
stack(rs::Vector{SegmentResult}, conds; [variables])

Compile the results into a stacked, long-form DataFrame.
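
To make "long form" concrete, the sketch below reshapes a few made-up results into one row per (subject, condition levels, variable, value), using only Base data structures. The NamedTuples are hypothetical stand-ins for SegmentResults, and the column names are assumptions; the actual function returns a DataFrame.

```julia
# Hypothetical stand-ins for SegmentResults.
segresults = [
    (subject = 1, conditions = Dict(:group => "control"),
     results = Dict(:avg => 3.5, :max => 7.0)),
    (subject = 2, conditions = Dict(:group => "A"),
     results = Dict(:avg => 4.0, :max => 9.0)),
]

# One row per variable per result (long form); variables sorted for a stable order.
rows = [(subject = r.subject, group = r.conditions[:group],
         variable = var, value = val)
        for r in segresults
        for (var, val) in sort(collect(r.results); by = first)]

foreach(println, rows)
```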

DatasetManager.write_results (Function)
write_results(filename, df, conditions; [variables, format, archive])

Write the results in df (generated by DatasetManager.stack) to the file at filename. format may be :wide or :long. If archive == true and filename already exists, the existing file will be moved to "$(filename).bak" before the new results are written to filename.

conditions and variables specify which conditions and variables, respectively, should be included in the output.
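
The archive behavior described above amounts to a move-then-write step. A minimal sketch using only Base file functions follows; demo_archive_write is a hypothetical helper, plain strings stand in for the results, and overwriting a pre-existing ".bak" file is an assumption.

```julia
# Move an existing file to "<filename>.bak" before writing new contents.
function demo_archive_write(filename, contents; archive = true)
    if archive && isfile(filename)
        # Assumption: an older backup, if any, is replaced.
        mv(filename, "$(filename).bak"; force = true)
    end
    write(filename, contents)
    return filename
end

f = joinpath(mktempdir(), "results.csv")
demo_archive_write(f, "old results")
demo_archive_write(f, "new results")

println(read(f, String))           # current file
println(read("$(f).bak", String))  # archived copy
```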

DatasetManager.export_trials (Function)
export_trials([f,] trials, dir[, sources])

Export (copy) sources in trials to dir. When left unspecified, sources is set to all unique sources found in trials. A function f, which must accept 2 arguments (a trial and a src of type eltype(sources)), can optionally be given to control the names of the exported data. The default behavior exports all sources to dir with no subdirectories, using the naming schema "$trial.subject_$srcname_basename(sourcepath)" (pseudo-code).

Examples

julia> export_trials(trials, pwd()) do trial, source
    "$(subject(trial))_$(conditions(trial)[:group]).$(srcext(source))"
end
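
Read literally, the default naming schema joins the subject ID, the source name, and the base name of the source path with underscores. A small sketch of that reading, with a hypothetical helper name and a made-up path:

```julia
# Hypothetical helper mirroring "$trial.subject_$srcname_basename(sourcepath)".
demo_export_name(subject, srcname, path) =
    string(subject, "_", srcname, "_", basename(path))

name = demo_export_name(3, "events", "/data/Subject 3/events/walk.tsv")
println(name)
```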

Sources

DatasetManager.AbstractSource (Type)
AbstractSource

The abstract supertype for custom sources. Implement a subtype of AbstractSource if Source{S} is not sufficient for your source requirements (e.g. your source has additional information besides the path, such as encoding parameters, compression indicator, etc, that needs to be associated with each instance).

Extended help

AbstractSource interface requirements

All subtypes of AbstractSource must:

  • have a path field or extend the sourcepath function.
  • have at least these two constructor methods:
    • empty constructor
    • single argument constructor accepting a string of an absolute path
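
A type meeting the requirements listed above might look like the following sketch. To keep it runnable without the package, DemoAbstractSource is a stand-in for AbstractSource; CompressedSource, its compressed field, and the ".gz" rule are all hypothetical.

```julia
# Stand-in for DatasetManager.AbstractSource so the sketch is self-contained.
abstract type DemoAbstractSource end

# A source that carries extra information besides its path.
struct CompressedSource <: DemoAbstractSource
    path::String        # required `path` field
    compressed::Bool    # hypothetical extra information
end

# Empty constructor: fall back to a temporary (absolute) path.
CompressedSource() = CompressedSource(tempname(), false)

# Single-argument constructor accepting an absolute path.
CompressedSource(path::AbstractString) =
    CompressedSource(path, endswith(path, ".gz"))

src = CompressedSource("/data/Subject 1/emg.tsv.gz")
println(src.compressed)
```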

AbstractSource subtypes should:

AbstractSource subtypes may implement these additional methods to improve user experience and/or enable additional functionality:

DatasetManager.Source (Type)
Source{S}([path])

A standard/default source, where S can be an existing type or a singleton type defined for dispatch purposes. When path is omitted, a unique temporary file path is used.

Examples

julia> Source{Missing}()
Source{Missing}("/tmp/jl_ruMeMy")

julia> struct RandomDev; end

julia> Source{RandomDev}("/dev/urandom")
Source{RandomDev}("/dev/urandom")

DatasetManager.requiresource! (Function)
requiresource!(trial, name::String; [force, deps, kwargs...]) -> nothing
requiresource!(trial, name::Regex; <keyword arguments>) -> nothing
requiresource!(trial, src::S; <keyword arguments>) where {S<:AbstractSource} -> nothing
requiresource!(trial, ::Type{S}; <keyword arguments>) where {S<:AbstractSource} -> nothing
requiresource!(trial, name::String => src; <keyword arguments>) -> nothing

Require src to exist for trial. Generate src if it does not exist, or throw an error if src is not present and cannot be generated. When src is a Regex, deps is constrained to UnknownDeps, regardless of the keyword argument. Unused keyword arguments will be passed on to the generatesource method for src.

Keyword arguments

  • force=false: Force generating src, even if it already exists
  • deps=dependencies(src): deps can be manually specified as a name => src::AbstractSource pair per dependency if there exist multiple sources that would match a dependent source type. kwargs... are passed on to the generatesource function for src.

Examples

julia> dependencies(Source{Events})
(Source{C3DFile},)

julia> requiresource!(trial, Source{Events})

julia> requiresource!(trial, Source{Events}; force=true, deps=("mainc3d" => Source{C3DFile}))
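
The require-or-generate logic, including the effect of force, can be sketched for a single file with Base functions only. demo_require! and the generator function are hypothetical, and checking existence with isfile is an assumption about what "exists" means for a source.

```julia
# Generate `path` with `generate` only when needed (or when forced).
function demo_require!(generate, path; force = false)
    if force || !isfile(path)
        generate(path)
        isfile(path) || error("could not generate $path")
    end
    return nothing
end

calls = Ref(0)  # counts how often the generator actually runs
path = joinpath(mktempdir(), "events.tsv")
make(p) = (calls[] += 1; write(p, "generated"))

demo_require!(make, path)                # missing: generated
demo_require!(make, path)                # present: nothing to do
demo_require!(make, path; force = true)  # present, but regenerated
println(calls[])
```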

DatasetManager.generatesource (Function)
generatesource(trial, src::Union{S,Type{S}}, deps; kwargs...) where S <: AbstractSource -> newsrc::typeof(src)

Generate source src using dependent sources deps from trial. Returns a source of the same type as src, but is not required to be exactly equal to src (i.e. a different sourcepath(newsrc) is acceptable). Keyword arguments named force and deps should not be used in generatesource methods since they are used in requiresource! and will not be passed on to the generatesource method.

DatasetManager.dependencies (Function)
dependencies(src::Union{S,Type{S}}) where {S<:AbstractSource} -> Tuple{<:AbstractSource}

Get the sources that src depends on to be generated by generatesource. Defaults to UnknownDeps, which prevents automatically generating src.

Extended help

In some cases, it may be preferable to leave the dependencies for a custom source undefined, e.g. if the source is a container of some form that may contain data generated from, and depending on, an unknown number/type of sources.

DatasetManager.srcext (Function)
srcext(src::Type{S}) where {S<:AbstractSource} -> String
srcext(src::S) where {S<:AbstractSource} -> String

Return the file extension for src. If src is an AbstractSource subtype, the default extension for that subtype will be returned; if src is an AbstractSource instance, the returned extension will be the actual extension for that src, regardless of whether it is the same as the default extension for that src type.

Extended help

When extending this function for a custom source, only define a method for the source type, e.g. srcext(::Type{MySource}). The period (.) should be included as the first character of the extension.
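
The type/instance distinction can be sketched as follows. DemoSource and demo_srcext are hypothetical names, and falling back to the type's default when the path has no extension is an assumption.

```julia
# Hypothetical source type with only a path.
struct DemoSource
    path::String
end

# Type method: the default extension (leading period included).
demo_srcext(::Type{DemoSource}) = ".tsv"

# Instance method: the actual extension of this source's path.
function demo_srcext(src::DemoSource)
    ext = splitext(src.path)[2]
    # Assumption: use the type's default if the path has no extension.
    return isempty(ext) ? demo_srcext(DemoSource) : ext
end

println(demo_srcext(DemoSource))
println(demo_srcext(DemoSource("/data/events.csv")))
```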


Segments

DatasetManager.Segment (Type)
Segment(trial, src::Union{S,Type{S},String}; [start, finish, conditions])

Describes a portion of a source in trial from time start to finish with segment specific conditions, if applicable.

If src is a String or AbstractSource, it must refer to a source that exists in trial. start and finish are used in readsegment to trim time from the beginning and/or end of the data read from src. start and finish default to the beginning and end, respectively, of the source/trial. start must be before finish, but they are otherwise only validated during readsegment.

Any conditions present in trial will be merged into the conditions for the segment. Note: if src is a <:AbstractSource instance, it will not be added to the trial's sources.

Example:

julia> t = Trial(1, "intervention", Dict(:group => "control"), Dict("events" => Source{Events}("/path/to/file")));

julia> seg = Segment(t, "events")
Segment{Source{Events},Int64}
 Trial: Trial(1, "intervention", 1 conditions, 1 source)
 Source: Source{Events}("/path/to/file")
 Time: beginning to end
 Conditions: (same as parent trial)
    :group => "control"

julia> seg = Segment(t, Source{Events}("/new/events/file"); start=0.0, finish=10.0, conditions=Dict(:stimulus => "sham"))
Segment{Source{Events},Int64}
 Trial: Trial(1, "intervention", 1 conditions, 1 source)
 Source: Source{Events}("/new/events/file")
 Time: 0.0 to 10.0
 Conditions:
    :stimulus => "sham"
    :group => "control"

DatasetManager.readsegment (Function)
readsegment(seg::Segment{S}; warn=true, kwargs...) where S <: AbstractSource

If defined for S, returns the portion of seg.source from seg.start to seg.finish. Otherwise, equivalent to readsource (i.e. no trimming of time-series occurs). Warns by default if the main method is called.
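
What a source-specific readsegment method might do with start and finish is sketched below for a simple time series. demo_trim is a hypothetical helper operating on plain vectors, and treating both bounds as inclusive is an assumption.

```julia
# Keep only the samples whose time stamps fall within [start, finish].
function demo_trim(t, y; start = nothing, finish = nothing)
    lo = something(start, first(t))   # default: beginning of the data
    hi = something(finish, last(t))   # default: end of the data
    lo < hi || throw(ArgumentError("start must be before finish"))
    keep = findall(x -> lo <= x <= hi, t)
    return t[keep], y[keep]
end

t = 0.0:0.5:3.0      # time stamps (7 samples)
y = collect(1:7)     # one value per time stamp

tt, yy = demo_trim(t, y; start = 1.0, finish = 2.0)
println((tt, yy))
```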

DatasetManager.SegmentResult (Type)
SegmentResult(segment::Segment, results::Dict{Symbol})

Contains the results of any analyses performed on the trial segment in a Dict.

Example:

segresult = SegmentResult(seg, Dict(:avg => 3.5))

DatasetManager.trial (Function)
trial(seg::Union{Segment,SegmentResult}) -> Trial

Return the parent Trial for the given Segment or SegmentResult.

DatasetManager.source (Function)
source(seg::Union{Segment,SegmentResult}) -> AbstractSource

Return the source for the parent Trial of the given Segment or SegmentResult.
