Multi-subject simulation
In this tutorial, you will learn how to simulate data for multiple subjects. In particular, you will learn how to specify fixed and random effects and what their influence on the simulated data looks like.
Setup
Click to expand
# Load required packages
using UnfoldSim
using Unfold
using CairoMakie
using UnfoldMakie
using DataFrames
Similar to the single subject case, multi-subject simulation depends on:
Design
(typically aMultiSubjectDesign
)Components
(typically aMixedModelComponent
)Onset
(any)Noise
(any)
Design
Our first design should be 20 subjects, with 4 items each. Any individual image is shown only either as large or small, thus we choose items_between
.
design = MultiSubjectDesign(
n_subjects = 20,
n_items = 4,
items_between = Dict(:condition => ["large", "small"]),
)
MultiSubjectDesign
n_subjects: Int64 20
n_items: Int64 4
subjects_between: Dict{Symbol, Vector}
items_between: Dict{Symbol, Vector}
both_within: Dict{Symbol, Vector}
event_order_function: #10 (function of type UnfoldSim.var"#10#14")
Between, within?
In the beginning, the distinction between between-items
, between-subjects
and within-subjects
, within-items
and both-between
, both-within
feels daunting.
We base our terminology on MixedModelsSim
which uses the following definitions:
subjects_between
-> effects between subjects, e.g. young vs olditems_between
-> effects between items, e.g. natural vs artificial images, (but shown to all subjects if not specified in subjects_between as well)both_within
-> effects completly crossed, e.g. word vs. scramble, where the "original" word is the item, and shown to all subjects
Components
For multi-subject, similar to the LinearModelComponent
specified before, we have to define the fixed effect β
, the model parameters that are applied to all subjects.
β = [1, 2] # 1 = intercept, 2 = difference between large and small
2-element Vector{Int64}:
1
2
In addition, we have to provide random effects σs
, which define the spread (and correlation) of the subjects around the fixed effects, foreach parameter
σs = Dict(
:subject => [0.5, 1], # we have more spread in the condition-effect
:item => [1], # the item-variability is higher than the subject-variability
)
Dict{Symbol, Vector} with 2 entries:
:item => [1]
:subject => [0.5, 1.0]
now we are ready to assemble the parts
signal = MixedModelComponent(;
basis = UnfoldSim.hanning(50),
formula = @formula(0 ~ 1 + condition + (1 + condition | subject) + (1 | item)),
β = β,
σs = σs,
contrasts = Dict(:condition => EffectsCoding()), # we highly recommend specifying your contrasts, by Default its Dummy/ReferenceCoding with alphabetically sorted levels (relying 100% on StatsModels.jl)
)
MixedModelComponent
basis: Array{Float64}((50,)) [0.0, 0.004104993088376974, 0.016352568480485274, 0.03654162132698918, 0.0643406479383053, 0.09929318906602175, 0.14082532495113625, 0.18825509907063326, 0.24080371584473748, 0.2976083284388031 … 0.2976083284388031, 0.24080371584473748, 0.18825509907063326, 0.14082532495113625, 0.09929318906602175, 0.0643406479383053, 0.03654162132698918, 0.016352568480485274, 0.004104993088376974, 0.0]
formula: StatsModels.FormulaTerm{StatsModels.ConstantTerm{Int64}, Tuple{StatsModels.ConstantTerm{Int64}, StatsModels.Term, StatsModels.FunctionTerm{typeof(|), Vector{StatsModels.AbstractTerm}}, StatsModels.FunctionTerm{typeof(|), Vector{StatsModels.AbstractTerm}}}}
β: Array{Int64}((2,)) [1, 2]
σs: Dict{Symbol, Vector}
contrasts: Dict{Symbol, EffectsCoding}
and simulate!
data, evts = simulate(design, signal, NoOnset(), NoNoise(), return_epoched = true);
┌ Warning: No random generator defined, used the default (`Random.MersenneTwister(1)`) with a fixed seed. This will always return the same results and the user is strongly encouraged to provide their own random generator!
└ @ UnfoldSim ~/work/UnfoldSim.jl/UnfoldSim.jl/src/simulation.jl:17
We get data with 50 samples (our basis
from above), with 4
items and 20 subjects. We get items and subjects separately because we chose no-overlap (via NoOnset
) and return_epoched = true
`.
size(data)
first(evts, 5)
Row | subject | item | condition |
---|---|---|---|
String | String | String | |
1 | S01 | I1 | large |
2 | S01 | I2 | small |
3 | S01 | I3 | large |
4 | S01 | I4 | small |
5 | S02 | I1 | large |
Finally, let's plot the data
f = Figure()
for k = 1:4
series(
f[1, k],
data[:, k, :]',
solid_color = :black,
axis = (; limits = ((0, 50), (-5, 6))),
)
Label(f[1, k, Top()], text = "Item:" * evts[k, :item] * ", c:" * evts[k, :condition])
end
f

Some remarks on interpreting the plot:
- The β main-effect of small (#2 and #4) vs. large (#1 and #3) is clearly visible.
- The variability between subjects, is the variability between the individual curves.
- The item effect shows up e.g. that #2 vs. #4 column show different values.
Continuous Signals / Overlap
Let's continue our tutorial and simulate overlapping signals instead.
We replace the NoOnset
with an UniformOnset
with 20 to 70 samples between subsequent events. We further remove the return_epoched
, because we want to have continuous data for now.
data, evts = simulate(design, signal, UniformOnset(offset = 20, width = 50), NoNoise());
size(data)
(275, 20)
with the first dimension being continuous data, and the latter still the subjects.
series(data', solid_color = :black)

Each line is one subject, and it looks a bit unstructured, because the event-onsets are of course random for each subject.
All subjects have the same sequence of trials, if you need to change this, specify a event_order_function
in the MultiSubjectDesign
.
Analyzing these data with Unfold.jl
We will analyze these data using the Unfold.jl
toolbox. While preliminary support for deconvolution (overlap correction) for mixed models is available, here we will not make use of it, but rather apply a MixedModel to each timepoint, following the Mass-univariate approach.
data, evts = simulate(
design,
signal,
UniformOnset(offset = 20, width = 50),
NoNoise();
return_epoched = true,
);
size(data)
(50, 4, 20)
For Unfold.jl, we have to reshape the data, so that all subjects are concatenated.
data = reshape(data, size(data, 1), :)
times = range(0, 1, length = size(data, 1))
m = fit(
UnfoldModel,
@formula(0 ~ 1 + condition + (1 | item) + (1 + condition | subject)),
evts,
data,
times,
)
plot_erp(coeftable(m), mapping = (; col = :group))

The first column shows the fixed effects, the latter the item and subject random effects as they evolve across time
This page was generated using Literate.jl.