Multi-subject simulation

In this tutorial, you will learn how to simulate data for multiple subjects. In particular, you will learn how to specify fixed and random effects and what their influence on the simulated data looks like.

Setup

Click to expand
# Load required packages
using UnfoldSim
using Unfold
using CairoMakie
using UnfoldMakie
using DataFrames

Similar to the single subject case, multi-subject simulation depends on:

  • Design (typically a MultiSubjectDesign)
  • Components (typically a MixedModelComponent)
  • Onset (any)
  • Noise (any)

Design

Our first design should be 20 subjects, with 4 items each. Any individual image is shown only either as large or small, thus we choose items_between.

design = MultiSubjectDesign(
    n_subjects = 20,
    n_items = 4,
    items_between = Dict(:condition => ["large", "small"]),
)
MultiSubjectDesign
  n_subjects: Int64 20
  n_items: Int64 4
  subjects_between: Dict{Symbol, Vector}
  items_between: Dict{Symbol, Vector}
  both_within: Dict{Symbol, Vector}
  event_order_function: #10 (function of type UnfoldSim.var"#10#14")

Between, within?

In the beginning, the distinction between between-items, between-subjects and within-subjects, within-items and both-between, both-within feels daunting.

We base our terminology on MixedModelsSim which uses the following definitions:

  • subjects_between -> effects between subjects, e.g. young vs old
  • items_between -> effects between items, e.g. natural vs artificial images, (but shown to all subjects if not specified in subjects_between as well)
  • both_within -> effects completly crossed, e.g. word vs. scramble, where the "original" word is the item, and shown to all subjects

Components

For multi-subject, similar to the LinearModelComponent specified before, we have to define the fixed effect β, the model parameters that are applied to all subjects.

β = [1, 2] # 1 = intercept, 2 = difference between large and small
2-element Vector{Int64}:
 1
 2

In addition, we have to provide random effects σs, which define the spread (and correlation) of the subjects around the fixed effects, foreach parameter

σs = Dict(
    :subject => [0.5, 1], # we have more spread in the condition-effect
    :item => [1], # the item-variability is higher than the subject-variability
)
Dict{Symbol, Vector} with 2 entries:
  :item    => [1]
  :subject => [0.5, 1.0]

now we are ready to assemble the parts

signal = MixedModelComponent(;
    basis = UnfoldSim.hanning(50),
    formula = @formula(0 ~ 1 + condition + (1 + condition | subject) + (1 | item)),
    β = β,
    σs = σs,
    contrasts = Dict(:condition => EffectsCoding()), # we highly recommend specifying your contrasts, by Default its Dummy/ReferenceCoding with alphabetically sorted levels (relying 100% on StatsModels.jl)
)
MixedModelComponent
  basis: Array{Float64}((50,)) [0.0, 0.004104993088376974, 0.016352568480485274, 0.03654162132698918, 0.0643406479383053, 0.09929318906602175, 0.14082532495113625, 0.18825509907063326, 0.24080371584473748, 0.2976083284388031  …  0.2976083284388031, 0.24080371584473748, 0.18825509907063326, 0.14082532495113625, 0.09929318906602175, 0.0643406479383053, 0.03654162132698918, 0.016352568480485274, 0.004104993088376974, 0.0]
  formula: StatsModels.FormulaTerm{StatsModels.ConstantTerm{Int64}, Tuple{StatsModels.ConstantTerm{Int64}, StatsModels.Term, StatsModels.FunctionTerm{typeof(|), Vector{StatsModels.AbstractTerm}}, StatsModels.FunctionTerm{typeof(|), Vector{StatsModels.AbstractTerm}}}}
  β: Array{Int64}((2,)) [1, 2]
  σs: Dict{Symbol, Vector}
  contrasts: Dict{Symbol, EffectsCoding}

and simulate!

data, evts = simulate(design, signal, NoOnset(), NoNoise(), return_epoched = true);
┌ Warning: No random generator defined, used the default (`Random.MersenneTwister(1)`) with a fixed seed. This will always return the same results and the user is strongly encouraged to provide their own random generator!
@ UnfoldSim ~/work/UnfoldSim.jl/UnfoldSim.jl/src/simulation.jl:17

We get data with 50 samples (our basis from above), with 4 items and 20 subjects. We get items and subjects separately because we chose no-overlap (via NoOnset) and return_epoched = true`.

size(data)

first(evts, 5)
5×3 DataFrame
Rowsubjectitemcondition
StringStringString
1S01I1large
2S01I2small
3S01I3large
4S01I4small
5S02I1large

Finally, let's plot the data

f = Figure()

for k = 1:4
    series(
        f[1, k],
        data[:, k, :]',
        solid_color = :black,
        axis = (; limits = ((0, 50), (-5, 6))),
    )
    Label(f[1, k, Top()], text = "Item:" * evts[k, :item] * ", c:" * evts[k, :condition])
end
f
Example block output

Some remarks on interpreting the plot:

  • The β main-effect of small (#2 and #4) vs. large (#1 and #3) is clearly visible.
  • The variability between subjects, is the variability between the individual curves.
  • The item effect shows up e.g. that #2 vs. #4 column show different values.

Continuous Signals / Overlap

Let's continue our tutorial and simulate overlapping signals instead.

We replace the NoOnset with an UniformOnset with 20 to 70 samples between subsequent events. We further remove the return_epoched, because we want to have continuous data for now.

data, evts = simulate(design, signal, UniformOnset(offset = 20, width = 50), NoNoise());
size(data)
(275, 20)

with the first dimension being continuous data, and the latter still the subjects.

series(data', solid_color = :black)
Example block output

Each line is one subject, and it looks a bit unstructured, because the event-onsets are of course random for each subject.

Note

All subjects have the same sequence of trials, if you need to change this, specify a event_order_function in the MultiSubjectDesign.

Analyzing these data with Unfold.jl

We will analyze these data using the Unfold.jl toolbox. While preliminary support for deconvolution (overlap correction) for mixed models is available, here we will not make use of it, but rather apply a MixedModel to each timepoint, following the Mass-univariate approach.

data, evts = simulate(
    design,
    signal,
    UniformOnset(offset = 20, width = 50),
    NoNoise();
    return_epoched = true,
);
size(data)
(50, 4, 20)

For Unfold.jl, we have to reshape the data, so that all subjects are concatenated.

data = reshape(data, size(data, 1), :)
times = range(0, 1, length = size(data, 1))
m = fit(
    UnfoldModel,
    @formula(0 ~ 1 + condition + (1 | item) + (1 + condition | subject)),
    evts,
    data,
    times,
)
plot_erp(coeftable(m), mapping = (; col = :group))
Example block output

The first column shows the fixed effects, the latter the item and subject random effects as they evolve across time


This page was generated using Literate.jl.