Variable transformations
These transformers are meant to manipulate the content of TSV files once they have been loaded as a structure with bids.util.tsvread.
They are mainly intended to implement the transformations described in BIDS stats models, but can also be used to manipulate TSV files in batches.
More information on how they function can be found in the variable-transform repository.
Their behavior and their “call” in JSON should be fairly close to those of the pybids-transformers.
Applying transformations
An “array” of transformations can be applied one after the other using bids.transformers().
- bids.transformers(varargin)
Apply transformers to a structure.
USAGE:
[new_content, json] = bids.transformers(transformers, data)
- Parameters:
transformers (structure)
data (structure)
- Returns:
- new_content:
(structure)
- json:
(structure) json equivalent of the transformers
Example
    data = bids.util.tsvread(path_to_tsv);

    % load transformation instructions from a model file
    bm = bids.Model('file', model_file);
    transformers = bm.get_transformations('Level', 'Run');

    % apply transformers
    new_content = bids.transformers(transformers.Instructions, data);

    % if all fields in the structure have the same number of rows,
    % one can create a new tsv file
    bids.util.tsvwrite(path_to_new_tsv, new_content);
See also: bids.Model
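As a minimal sketch of applying several transformations in sequence, the snippet below passes a two-element structure array to bids.transformers. It assumes that bids-matlab is on the MATLAB/Octave path and that a structure array of transformers is accepted (as one would obtain when decoding a JSON array of objects). The data and column name are made up for illustration.

```matlab
% hypothetical data, standing in for the output of bids.util.tsvread
data = struct('onset', [1; 2; 5]);

% two transformers, applied in order and in place on the "onset" column
transformers(1) = struct('Name', 'Subtract', 'Input', 'onset', 'Value', 1);
transformers(2) = struct('Name', 'Multiply', 'Input', 'onset', 'Value', 2);

new_content = bids.transformers(transformers, data);

% following the descriptions of Subtract and Multiply,
% new_content.onset should now hold (onset - 1) * 2
disp(new_content.onset);
```

Because no Output is given, each step overwrites the onset column, so the second transformer operates on the result of the first.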
Basic operations
Add
Subtract
Multiply
Divide
Power
- bids.transformers_list.Basic(transformer, data)
Performs a basic operation with a Value on the Input.
Each of these transformations takes one or more columns, and performs a mathematical operation on the input column and a provided operand. The operations are performed on each column independently.
Arguments:
- Parameters:
Name – mandatory. Any of Add, Subtract, Multiply, Divide, Power.
Input (char or array) – mandatory. An array of columns to perform the operation on.
Value (float) – mandatory. The value to perform the operation with (i.e. the operand).
Query (char) – Optional. Logical expression used to select which rows to act on.
Output (char or array) – Optional. List of column names to write out to.
By default, computation is done in-place on the input (meaning that input columns are overwritten). If provided, the number of values must exactly match the number of input values, and the order will be mapped 1-to-1.
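The sketch below illustrates the 1-to-1 mapping between Input and Output for a Basic operation. It assumes bids-matlab is on the path. The data and the output column names (onset_ms, duration_ms) are hypothetical.

```matlab
% hypothetical events data with times in seconds
data = struct('onset', [1; 2.5], 'duration', [0.5; 2]);

% double braces are needed so that struct() stores the cell array
% as a single field value instead of creating a structure array
transformer = struct('Name', 'Multiply', ...
                     'Input', {{'onset', 'duration'}}, ...
                     'Value', 1000, ...
                     'Output', {{'onset_ms', 'duration_ms'}});

data = bids.transformers(transformer, data);

% onset is mapped to onset_ms and duration to duration_ms;
% since Output is provided, the input columns should be left untouched
disp(data.onset_ms);
disp(data.duration_ms);
```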
Logical operations
And
Or
Not
- bids.transformers_list.Logical(transformer, data)
Each of these transformations:
takes 2 or more columns as input (only 1 for Not),
performs the corresponding logical operation (conjunction for And, inclusive or for Or, logical negation for Not),
returns a single column as output.
If non-logical inputs are passed, it is expected that:
all zero or nan values (for numeric data types),
“NaN” or empty values (for char)
will evaluate to false and all other values will evaluate to true.
Arguments:
- Parameters:
Name – mandatory. Any of And, Or, Not.
Input (array) – mandatory. An array of columns to perform the operation on. Only 1 for Not.
Output (char or array) – Optional. The name of the output column.
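A minimal sketch of the And transformation on two numeric columns, assuming bids-matlab is on the path. The column names are invented for the example, and numeric zeros are expected to evaluate to false as described above.

```matlab
% hypothetical indicator columns coded as 0 / 1
data = struct('is_face', [1; 0; 1], 'is_famous', [1; 1; 0]);

transformer = struct('Name', 'And', ...
                     'Input', {{'is_face', 'is_famous'}}, ...
                     'Output', 'is_famous_face');

data = bids.transformers(transformer, data);

% only the first row is non-zero in both inputs,
% so only the first row of the output should be true
disp(data.is_famous_face);
```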
Munge operations
Transformations that primarily involve manipulating/munging variables into other formats or shapes.
Assign
- bids.transformers_list.Assign(transformer, data)
The Assign transformation assigns one or more variables or columns (specified as the input) to one or more other columns (specified by target and/or output as described below).
Arguments:
- Parameters:
Input (char or array) – mandatory. The name(s) of the columns from which attribute values are to be drawn (for assignment to the attributes of other columns). Must exactly match the length of the target argument.
Target (char or array) – mandatory. The name(s) of the columns to which the attribute values taken from the input are to be assigned. Must exactly match the length of the input argument. Names are mapped 1-to-1 from input to target.
Note
If no output argument is specified, the columns named in target are modified in-place.
- Parameters:
Output (char or array) – Optional. Names of the columns to output the result of the assignment to. Must exactly match the length of the input and target arguments.
If no output array is provided, columns named in target are modified in-place.
If an output array is provided:
each column in the target array is first cloned,
then the reassignment from the input to the target is applied;
finally, the new (cloned and modified) column is written out to the column named in output.
- Parameters:
InputAttr (char or array) – Optional. Specifies which attribute of the input column to assign. Defaults to value. If an array is passed, its length must exactly match that of the input and target arrays.
TargetAttr (char or array) – Optional. Specifies which attribute of the output column to assign to. Defaults to value. If an array is passed, its length must exactly match that of the input and target arrays.
InputAttr and TargetAttr must be one of: value, onset, or duration.
Note
This transformation is non-destructive with respect to the input column(s). In cases where in-place assignment is desired (essentially, renaming a column), either use the rename transformation, or set output to the same value as the input.
For example, one can reassign the value property of a variable named response_time to the duration property of a face variable (as one might do in order to, e.g., model trial-by-trial reaction time differences for a given condition using a varying-epoch approach), and write it out as a new face_modulated_by_RT column.
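The response_time example can be sketched as the transformer structure below. This is an illustration only, assuming bids-matlab is on the path. The data values are hypothetical, and the exact layout of the resulting face_modulated_by_RT column is not described on this page, so no expected output is shown.

```matlab
% hypothetical events-like data
data = struct('onset', [2; 6], ...
              'duration', [1; 1], ...
              'face', [1; 1], ...
              'response_time', [0.8; 1.2]);

% take the "value" attribute of response_time (the default InputAttr)
% and assign it to the "duration" attribute of face,
% writing the result to a new column
transformer = struct('Name', 'Assign', ...
                     'Input', 'response_time', ...
                     'Target', 'face', ...
                     'TargetAttr', 'duration', ...
                     'Output', 'face_modulated_by_RT');

data = bids.transformers(transformer, data);
```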
Concatenate
- bids.transformers_list.Concatenate(transformer, data)
Concatenate columns together.
Arguments:
- Parameters:
Input (array) – mandatory. Column(s) to concatenate. Must all be of the same length.
Output (char) – Optional. Name of the output column.
Copy
- bids.transformers_list.Copy(transformer, data)
Clones/copies each of the input columns to a new column with identical values and a different name. Useful as a basis for subsequent transformations that need to modify their input in-place.
Arguments:
- Parameters:
Input (char or array) – mandatory. Column names to copy.
Output (char or array) – Optional. Names to copy the input columns to. Must be the same length as input, and columns are mapped one-to-one from the input array to the output array.
Delete
- bids.transformers_list.Delete(transformer, data)
Deletes column(s) from further analysis.
Arguments:
- Parameters:
Input (char or array) – mandatory. The name(s) of the column(s) to delete.
Note
The Select transformation provides the inverse function (selection of columns to keep for subsequent analysis).
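A short sketch of Delete on a hand-made structure, assuming bids-matlab is on the path. The column names are made up.

```matlab
% hypothetical participants-like data
data = struct('age', [20; 30], 'gender', {{'M'; 'F'}});

transformer = struct('Name', 'Delete', 'Input', 'age');

data = bids.transformers(transformer, data);

% the "age" column should be gone, "gender" should remain
disp(fieldnames(data));
```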
DropNA
- bids.transformers_list.Drop_na(transformer, data)
Drops all rows with “n/a”.
Arguments:
- Parameters:
Input (char or array) – mandatory. The name of the variable to operate on.
Output (char or array) – Optional. The column names to write out to. By default, computation is done in-place (meaning that input columns are overwritten).
Factor
- bids.transformers_list.Factor(transformer, data)
Converts a nominal/categorical variable with N unique levels to N indicators (i.e., dummy-coding).
Arguments:
- Parameters:
Input (char or array) – mandatory. The name(s) of the variable(s) to dummy-code.
By default, the reference level is the first factor level when sorting in alphabetical order (e.g., if a condition has levels ‘dog’, ‘apple’, and ‘helsinki’, the default reference level will be ‘apple’).
The names of the output columns for 2 input columns gender and age with 2 levels (M, F) and (20, 30) respectively will be of the shape:
gender_F_age_20
gender_F_age_20
gender_M_age_30
gender_M_age_30
Filter
- bids.transformers_list.Filter(transformer, data)
Subsets rows using a logical expression.
Arguments:
- Parameters:
Input (char or array) – mandatory. The name(s) of the variable(s) to operate on.
Query (char) – mandatory. Logical expression used to filter.
Supports:
>, <, >=, <=, ==, ~= for numeric values
==, ~= for char operations (case sensitive). Regular expressions are supported.
- Parameters:
Output (char or array) – Optional. The column names to write out to.
By default, computation is done in-place (i.e., input columns are overwritten). If provided, the number of values must exactly match the number of input values, and the order will be mapped 1-to-1.
Label identical rows
- bids.transformers_list.Label_identical_rows(transformer, data)
Creates an extra column to index consecutive identical rows in a column. The index restarts at 1 with every change of row content. This can for example be used to label consecutive events of the same trial_type in a block.
Arguments:
- Parameters:
Input (char or array) – mandatory. The name(s) of the variable(s) to operate on.
Cumulative (logical) – Optional. Defaults to False. If True, the labels are not reset when encountering new row content.
Note
By default, the labels are put in a column called Input(i)_label.
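The block labelling described above can be sketched as follows. This assumes bids-matlab is on the path and that the transformer Name matches the function name (Label_identical_rows). The trial types are invented.

```matlab
% hypothetical block design: 2 faces, 3 houses, 1 face
data = struct('trial_type', ...
              {{'face'; 'face'; 'house'; 'house'; 'house'; 'face'}});

transformer = struct('Name', 'Label_identical_rows', ...
                     'Input', 'trial_type');

data = bids.transformers(transformer, data);

% the index restarts at 1 with every change of row content,
% so trial_type_label is expected to be 1 2 1 2 3 1
disp(data.trial_type_label);
```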
Merge identical rows
- bids.transformers_list.Merge_identical_rows(transformer, data)
Merge consecutive identical rows.
Arguments:
- Parameters:
Input (char or array) – mandatory. The name(s) of the variable(s) to operate on.
Note
Only works on data coming from events.tsv files.
Content is sorted by onset time before merging.
If multiple variables are specified, they are merged in the order they are specified.
If a variable is not found, it is ignored.
If a variable is found but is empty, it is ignored.
The content of the other columns corresponds to the last row being merged: this means that, for columns other than the ones specified in Input, only the content of the last merged row is kept.
Replace
- bids.transformers_list.Replace(transformer, data)
Replaces values in one or more input columns.
Arguments:
- Parameters:
Input (char or array) – mandatory. Name(s) of column(s) to search and replace within.
Replace (array of objects) – mandatory. The mapping of old values ("key") to new values ("value"). key can be a regular expression.
Attribute (array) – Optional. The column attribute to apply the replace to.
Valid values include: "value" (the default), "duration", "onset", and "all".
In the last case, all three attributes ("value", "duration", and "onset") will be scanned.
- Parameters:
Output (char or array) – Optional. Names of columns to output. Must match the length of the input column(s) if provided, and columns will be mapped 1-to-1 in order. If no output values are provided, the replacement transformation is applied in-place to all the inputs.
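A hedged sketch of how the key/value mapping might be expressed in MATLAB, assuming bids-matlab is on the path and that the "array of objects" is represented as a structure (array) with key and value fields, as a decoded JSON array of objects would be. The column name and values are hypothetical.

```matlab
% hypothetical data
data = struct('fruits', {{'apple'; 'banana'; 'apple'}});

transformer = struct('Name', 'Replace', 'Input', 'fruits');

% one object of the mapping: old value ("key") to new value ("value");
% assigned separately so that struct() does not expand it
transformer.Replace = struct('key', 'apple', 'value', 'pear');

% no Output is given, so the replacement is applied in place
data = bids.transformers(transformer, data);

disp(data.fruits);
```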
Select
- bids.transformers_list.Select(transformer, data)
The select transformation specifies which columns to retain for subsequent analysis.
Any columns that are not specified here will be dropped.
The only exception is when dealing with data with onset and duration columns (from *_events.tsv files): in this case the onset and duration columns are also automatically selected.
Arguments:
- Parameters:
Input (char or array) – mandatory. The names of all columns to keep. Any columns not in this array will be deleted and will not be available to any subsequent transformations or downstream analyses.
Note
One can think of Select as the inverse of the Delete transformation, which removes all named columns from further analysis.
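A small sketch of Select on data without onset and duration columns, assuming bids-matlab is on the path. The column names are made up.

```matlab
% hypothetical participants-like data
data = struct('age', [20; 30], ...
              'gender', {{'M'; 'F'}}, ...
              'handedness', {{'R'; 'L'}});

transformer = struct('Name', 'Select', ...
                     'Input', {{'age', 'gender'}});

data = bids.transformers(transformer, data);

% only the selected columns should remain
disp(fieldnames(data));
```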
Split
- bids.transformers_list.Split(transformer, data)
Split a variable into N variables as defined by the levels of one or more other variables.
Arguments:
- Parameters:
Input (array) – mandatory. The name of the variable(s) to operate on.
By (array) – Optional. Name(s) of the variable(s) to split on.
For example, given a variable Condition that we wish to split on two categorical columns A and B, where a given row has values A=a and B=1, the generated name will be Condition_BY_A_a_BY_B_1.
Compute operations
Transformations that primarily involve numerical computation on variables.
Constant
- bids.transformers_list.Constant(transformer, data)
Adds a new column with a constant value (numeric or char).
Arguments:
- Parameters:
Output (char or array) – mandatory. Name of the newly generated column.
Value (float or char) – Optional. The value of the constant, defaults to 1.
Mean
- bids.transformers_list.Mean(transformer, data)
Compute mean of a column.
JSON EXAMPLE
    {
      "Name": "Mean",
      "Input": "reaction_time",
      "OmitNan": false,
      "Output": "mean_RT"
    }
Arguments:
- Parameters:
Input (char or array) – mandatory. The name of the variable to operate on.
OmitNan (logical) – Optional. If false, any column with nan values will return a nan value. If true, nan values are skipped. Defaults to false.
Output (char or array) – Optional. The column names to write out to. By default, computation is done in-place (i.e., input columns are overwritten).
CODE EXAMPLE
    transformer = struct('Name', 'Mean', ...
                         'Input', 'reaction_time', ...
                         'OmitNan', false, ...
                         'Output', 'mean_RT');

    data.reaction_time = TODO

    data = bids.transformers(transformer, data);

    data.mean_RT = TODO

    ans = TODO
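As a concrete illustration with made-up values (not taken from the original documentation), and assuming bids-matlab is on the path:

```matlab
% hypothetical reaction times
data = struct('reaction_time', [1; 2; 3]);

transformer = struct('Name', 'Mean', ...
                     'Input', 'reaction_time', ...
                     'OmitNan', false, ...
                     'Output', 'mean_RT');

data = bids.transformers(transformer, data);

% the mean of 1, 2 and 3 is 2
disp(data.mean_RT);
```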
Product
- bids.transformers_list.Product(transformer, data)
Computes the row-wise product of two or more columns.
Arguments:
- Parameters:
Input (array) – mandatory. Names of two or more columns to compute the product of.
Output (string or array) – mandatory. Name of the newly generated column.
OmitNan (logical) – Optional. If false, any column with nan values will return a nan value. If true, nan values are skipped. Defaults to false.
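A minimal sketch of the row-wise product, assuming bids-matlab is on the path. The column names and values are hypothetical.

```matlab
% hypothetical data
data = struct('duration', [1; 2; 3], 'intensity', [4; 5; 6]);

transformer = struct('Name', 'Product', ...
                     'Input', {{'duration', 'intensity'}}, ...
                     'Output', 'duration_x_intensity');

data = bids.transformers(transformer, data);

% row-wise product: 1*4, 2*5, 3*6
disp(data.duration_x_intensity);
```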
Scale
- bids.transformers_list.Scale(transformer, data)
Scales the values of one or more columns.
Semantics mimic scikit-learn, such that demeaning and rescaling are treated as independent arguments, with the default being to apply both (i.e., standardizing each value so that it has zero mean and unit SD).
Arguments:
- Parameters:
Input (char or array) – mandatory. Names of columns to standardize.
Demean (logical) – Optional. If true, subtracts the mean from each input column (i.e., applies mean-centering).
Rescale (logical) – Optional. If true, divides each column by its standard deviation.
ReplaceNa (char) – Optional. Whether/when to replace missing values with 0. If "off", no replacement is performed. If "before", missing values are replaced with 0 before scaling. If "after", missing values are replaced with 0 after scaling. Defaults to "off".
Output (char or array) – Optional. Names of columns to output. Must match the length of the input column(s) if provided, and columns will be mapped 1-to-1 in order. If no output values are provided, the scaling transformation is applied in-place to all the inputs.
Std
- bids.transformers_list.Std(transformer, data)
Compute the sample standard deviation.
Arguments:
- Parameters:
Input (char or array) – mandatory. The name of the variable to operate on.
OmitNan (logical) – Optional. If false, any column with nan values will return a nan value. If true, nan values are skipped. Defaults to false.
Output (char or array) – Optional. The column names to write out to. By default, computation is done in-place (i.e., input columns are overwritten).
Sum
- bids.transformers_list.Sum(transformer, data)
Computes the (optionally weighted) row-wise sums of two or more columns.
Arguments:
- Parameters:
Input (array) – mandatory. Names of two or more columns to sum.
Output (char or array) – mandatory. Name of the newly generated column.
OmitNan (logical) – Optional. If false, any column with nan values will return a nan value. If true, nan values are skipped. Defaults to false.
Weights (array) – Optional. Array of floats giving the weights of the columns. If provided, the length of weights must equal the number of values in input, and weights will be mapped 1-to-1 onto named columns. If no weights are provided, defaults to unit weights (i.e., simple sum).
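The weighted sum can be sketched as below, assuming bids-matlab is on the path. Column names, values and weights are invented for the example.

```matlab
% hypothetical data
data = struct('a', [1; 2], 'b', [3; 4]);

% weights are mapped 1-to-1 onto the input columns: 2 for a, 1 for b
transformer = struct('Name', 'Sum', ...
                     'Input', {{'a', 'b'}}, ...
                     'Weights', [2, 1], ...
                     'Output', 'weighted_sum');

data = bids.transformers(transformer, data);

% row-wise weighted sum, i.e. 2 * a + 1 * b
disp(data.weighted_sum);
```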
Threshold
- bids.transformers_list.Threshold(transformer, data)
Thresholds input values at a specified cut-off and optionally binarizes the result.
Arguments:
- Parameters:
Input (char or array) – mandatory. The name(s) of the column(s) to threshold/binarize.
Threshold (float) – Optional. The cut-off to use for thresholding. Defaults to 0.
Binarize (logical) – Optional. If true, thresholded values will be binarized (i.e., all non-zero values will be set to 1). Defaults to false.
Above (logical) – Optional. Specifies which values to retain with respect to the cut-off. If true, all values above the threshold will be kept; if false, all values below the threshold will be kept. Defaults to true.
Signed (logical) – Optional. Specifies whether to treat the threshold as signed (default) or unsigned.
For example, when passing above=true and threshold=3, if signed=true, all and only values above +3 would be retained. If signed=false, all values whose absolute value is > 3 would be retained (i.e., values in the range -3 < X < 3 would be set to 0).
- Parameters:
Output (char or array) – Optional. Names of columns to output. Must match the length of the input column(s) if provided, and columns will be mapped 1-to-1 in order. If no output values are provided, the threshold transformation is applied in-place to all the inputs.
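The signed, above-threshold case described above can be sketched as follows, assuming bids-matlab is on the path. The column names and values are hypothetical.

```matlab
% hypothetical data
data = struct('value', [-5; -1; 2; 4]);

% Above and Signed are left at their defaults (true, signed)
transformer = struct('Name', 'Threshold', ...
                     'Input', 'value', ...
                     'Threshold', 3, ...
                     'Output', 'value_thresholded');

data = bids.transformers(transformer, data);

% only values above +3 are retained; the others are set to 0
disp(data.value_thresholded);
```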