Diagnostic Feature Mapping¶
Diagnostic Feature Mapping (DFM) is a psychophysical technique for identifying which visual features within an image are diagnostic for perception and recognition. On each trial, the participant sees a random subset of Gabor features extracted from an image and must identify the stimulus (e.g., which face, object, or category). By correlating responses with the features presented across many trials, researchers can determine which features are critical for recognition [1, 2].
This task is based on the "Bubbles" paradigm [1], which reveals localized image regions through Gaussian apertures. The DFM implementation in Meadows uses Gabor filter representations, which decompose images into localized spatial frequency and orientation components, aligning with how early visual cortex processes visual information [2].
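For intuition, a Gabor feature is a sinusoidal grating windowed by a Gaussian envelope, localized in both space and spatial frequency. A minimal sketch in NumPy (the parameter values and function name here are illustrative, not Meadows' actual extraction settings):

```python
import numpy as np

def gabor_patch(size=64, frequency=4.0, orientation=0.0, sigma=0.15):
    """Return a size x size Gabor patch.

    frequency: cycles per image; orientation: radians;
    sigma: Gaussian envelope width as a fraction of the image.
    """
    half = size // 2
    # Coordinates in the range [-0.5, 0.5)
    y, x = np.mgrid[-half:half, -half:half] / size
    # Rotate coordinates so the carrier runs along `orientation`
    xr = x * np.cos(orientation) + y * np.sin(orientation)
    carrier = np.cos(2 * np.pi * frequency * xr)
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return carrier * envelope

patch = gabor_patch(frequency=6.0, orientation=np.pi / 4)
print(patch.shape)  # (64, 64)
```

A filter bank varies `frequency` and `orientation` over a grid of image locations; the task then renders a random subset of such features on each trial.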
Below you'll find information specific to the Diagnostic Feature Mapping task. It assumes you're familiar with how to set up an experiment and how to select stimuli for a given task.
Prerequisites¶
Gabor Feature Extraction¶
Before using this task, your images must be preprocessed to extract Gabor features. From your stimulus set page, select the Gabors preprocessing option to generate the Gabor filter bank representation for each image. This creates the feature data the task needs in order to display random subsets of features on each trial. To unlock this preprocessing feature, please contact us.
Parameters¶
Customize the task by changing these settings on the task's Parameters tab.
General Interface settings¶
Customize the instruction at the top of the page, as well as toolbar buttons. These apply to most task types on Meadows.
Instruction hint:
Text that you can display at the top of the page during the task.

Extended instruction:
A longer instruction that appears only when the participant hovers their mouse cursor over the hint.

Hint size:
Whether to display or hide the instruction, and what font size to use.

Fullscreen button:
Whether to display a button in the bottom toolbar that participants can use to switch fullscreen mode on and off.
Trial Configuration¶
Number of trials:
How many trials to present. Default: 5. Valid range: 1 to 5000.

Number of features:
How many Gabor features to combine for each trial. Default: 100. Valid range: 1 to 12000. This controls the initial difficulty level: fewer features make the task harder.

Responses:
Which keyboard keys are accepted as valid responses. Configure one key per stimulus condition. Keys should be ordered alphabetically according to their corresponding condition names (important for staircase procedures).
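One way to read the ordering rule above, sketched in Python (the condition names and keys here are hypothetical):

```python
# Hypothetical condition names from a stimulus set
conditions = ['tools', 'faces', 'houses']
# Keys as entered in the Responses parameter: ordered to match the
# alphabetically sorted condition names
keys = ['f', 'h', 't']
mapping = dict(zip(sorted(conditions), keys))
print(mapping)  # {'faces': 'f', 'houses': 'h', 'tools': 't'}
```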
Adaptive Procedure¶
Control how difficulty adjusts based on participant performance.
Adaptive procedure:
Method to adapt the number of features based on performance:
- No adaptation: the number of features stays fixed throughout
- 1-up, 1-down staircase: targets 50% accuracy
- 1-up, 2-down staircase: targets ~71% accuracy
- 1-up, 3-down staircase: targets ~79% accuracy

Step size:
When using a staircase procedure, how much to change the number of features after correct/incorrect responses. Default: 5.

Step unit:
Whether the step size is interpreted as:
- Number of features: absolute change (e.g., ±5 features)
- Percentage of max total features: relative change (e.g., ±5% of the total)
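The target accuracies follow from the staircase rules themselves: a 1-up, N-down procedure is stable at the difficulty where the probability of N consecutive correct responses (which triggers a step toward harder trials) equals 0.5, i.e. where accuracy p satisfies p**N = 0.5. A quick check:

```python
# Equilibrium accuracy of a 1-up, N-down staircase: difficulty settles
# where p**N = 0.5, so p = 0.5 ** (1 / N).
for n in (1, 2, 3):
    p = 0.5 ** (1 / n)
    print(f"1-up, {n}-down converges at {p:.1%} accuracy")
# prints 50.0%, 70.7%, 79.4%
```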
Task Flow¶
Trials between breaks:
A break is shown, and data are stored, every time the participant finishes this many trials. Default: 5000. Valid range: 1 to 5000.

Break text:
The text shown during breaks. Supports line breaks. Default: "You've just finished one block.\nTake a break, and press continue when ready."
Data¶
For general information about the various structures and file formats that you can download for your data see Downloads.
As stimulus-wise "annotations" (table rows), with columns:
- trial: numerical index of the trial
- time_trial_start: timestamp when the stimulus was displayed (seconds since 1/1/1970)
- time_trial_response: timestamp when the participant responded (seconds since 1/1/1970)
- stim1_id: Meadows internal id of the stimulus
- stim1_name: filename of the stimulus as uploaded
- label: encoded response data in the format {correct}_{nFeatures}_{seed}, where:
  - correct: 1 if the response was correct, 0 if incorrect
  - nFeatures: number of features shown on this trial
  - seed: random seed used for feature selection (for reproducibility)
In the Tree structure:
annotations: an array with a map for each trial, with keys:
- ids: array containing the stimulus ID
- start: timestamp of stimulus onset (epoch time in seconds)
- resp: timestamp of response (epoch time in seconds)
- label: encoded as {correct}_{nFeatures}_{seed}
- trial: trial index
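For instance, a single entry of the annotations array can be flattened into table form like this (the values below are made up for illustration):

```python
import pandas as pd

# One made-up entry of the `annotations` array described above
annotations = [
    {'ids': ['abc123'], 'start': 1700000000.0, 'resp': 1700000001.2,
     'label': '1_95_4021', 'trial': 0},
]

rows = []
for ann in annotations:
    correct, n_features, seed = ann['label'].split('_')
    rows.append({
        'trial': ann['trial'],
        'stim_id': ann['ids'][0],
        'rt_ms': (ann['resp'] - ann['start']) * 1000,  # reaction time
        'correct': int(correct),
        'n_features': int(n_features),
        'seed': int(seed),
    })
df = pd.DataFrame(rows)
print(df)
```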
Analysis and Visualization¶
Parse the label column¶
The label column contains encoded data that needs to be parsed for analysis:
Python:

import pandas as pd
# Load annotations data
df = pd.read_csv('Meadows_myExperiment_v1_annotations.csv')
# Parse the label column
df[['correct', 'n_features', 'seed']] = df['label'].str.split('_', expand=True)
df['correct'] = df['correct'].astype(int)
df['n_features'] = df['n_features'].astype(int)
df['seed'] = df['seed'].astype(int)
# Calculate reaction time in milliseconds
df['rt_ms'] = (df['time_trial_response'] - df['time_trial_start']) * 1000
print(df[['stim1_name', 'correct', 'n_features', 'rt_ms']].head())
R:

library(tidyverse)
# Load annotations data
df <- read_csv('Meadows_myExperiment_v1_annotations.csv')
# Parse the label column
df <- df %>%
separate(label, into = c('correct', 'n_features', 'seed'),
sep = '_', convert = TRUE)
# Calculate reaction time in milliseconds
df <- df %>%
mutate(rt_ms = (time_trial_response - time_trial_start) * 1000)
head(df %>% select(stim1_name, correct, n_features, rt_ms))
Analyze staircase performance¶
Track how the number of features changes across trials:
Python:

import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('Meadows_myExperiment_v1_annotations.csv')
df[['correct', 'n_features', 'seed']] = df['label'].str.split('_', expand=True)
df['n_features'] = df['n_features'].astype(int)
# Plot staircase trajectory
plt.figure(figsize=(10, 5))
plt.plot(df['trial'], df['n_features'], marker='o', markersize=3)
plt.xlabel('Trial')
plt.ylabel('Number of Features')
plt.title('Staircase Trajectory')
plt.tight_layout()
plt.show()
# Estimate threshold (mean of last N reversals or trials)
threshold = df['n_features'].tail(20).mean()
print(f"Estimated threshold: {threshold:.1f} features")
R:

library(tidyverse)
df <- read_csv('Meadows_myExperiment_v1_annotations.csv')
df <- df %>%
separate(label, into = c('correct', 'n_features', 'seed'),
sep = '_', convert = TRUE)
# Plot staircase trajectory
ggplot(df, aes(x = trial, y = n_features)) +
geom_line() +
geom_point(size = 1) +
labs(x = 'Trial', y = 'Number of Features',
title = 'Staircase Trajectory') +
theme_minimal()
# Estimate threshold
threshold <- mean(tail(df$n_features, 20))
cat(sprintf("Estimated threshold: %.1f features\n", threshold))
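The tail-mean above is a quick estimate; staircase thresholds are more conventionally taken as the mean feature count at the last few reversals, i.e. trials where the staircase changed direction. A sketch in Python, assuming the parsed n_features column from above:

```python
import numpy as np

def reversal_values(x):
    """Feature counts at staircase reversals (direction changes)."""
    values, last_dir = [], 0
    for prev, cur in zip(x, x[1:]):
        step = cur - prev
        if step == 0:
            continue                      # ignore flat stretches
        direction = 1 if step > 0 else -1
        if last_dir != 0 and direction != last_dir:
            values.append(prev)           # the turning point
        last_dir = direction
    return values

def reversal_threshold(x, n_reversals=6):
    """Mean of the last `n_reversals` reversal values; falls back to
    the mean of the last 20 trials if there are too few reversals."""
    x = list(x)
    revs = reversal_values(x)
    if len(revs) < n_reversals:
        return float(np.mean(x[-20:]))
    return float(np.mean(revs[-n_reversals:]))

# e.g. reversal_threshold(df['n_features'])
```

Discarding the first few reversals (as done here by keeping only the last `n_reversals`) reduces the influence of the staircase's initial descent from the easy starting level.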
Compute diagnostic features¶
To identify which feature dimensions are most diagnostic for recognition, you need to:
1. Load the Gabor feature data (the .gabors files from your stimulus set)
2. Reconstruct which features were shown on each trial using the stored seed
3. Correlate feature properties (spatial frequency, orientation, location) with accuracy
import pandas as pd
import numpy as np
import json
from scipy import stats
import matplotlib.pyplot as plt
# You'll need the RandomNumberGenerator from meadows.rng
# or implement compatible sampling - see scythe library
from meadows.rng import RandomNumberGenerator
# Load annotations
df = pd.read_csv('Meadows_myExperiment_v1_annotations.csv')
# Cast via int first: astype(bool) on the string '0' would give True
df['correct'] = df.label.str.split('_').str[0].astype(int).astype(bool)
df['nfeats'] = df.label.str.split('_').str[1].astype(int)
df['seed'] = df.label.str.split('_').str[2].astype(int)
rng = RandomNumberGenerator()
# Load Gabor features for each stimulus
# (download .gabors files from your stimulus set)
stim_features = {}
for stim_id in df.stim1_id.unique():
with open(f'{stim_id}.gabors', 'r') as f:
data = json.load(f)
features = pd.DataFrame(data['features'])
freqs = data['settings']['frequencies']
oris = data['settings']['orientations']
features['frequency'] = [freqs[f] for f in features.f]
features['orientation'] = [oris[o] for o in features.o]
stim_features[data['source']['name']] = features
# Reconstruct which features were shown on each trial
# and compute correlation with accuracy per frequency band
# (assumes all stimuli share the same frequency settings, so `freqs`
# from the last loop iteration applies throughout)
freq_accuracy = {freq: [] for freq in freqs}
for _, trial in df.iterrows():
feats = stim_features[trial.stim1_name]
shown_idx = rng.sample_indices(
pop_size=len(feats),
n=trial.nfeats,
seed=trial.seed
)
shown_feats = feats.iloc[shown_idx]
# For each frequency, record if features at that freq
# were shown and whether trial was correct
for freq in freqs:
if freq in shown_feats.frequency.values:
freq_accuracy[freq].append(trial.correct)
# Compute and plot accuracy per spatial frequency
freq_means = {f: np.mean(acc) for f, acc in freq_accuracy.items() if acc}
plt.figure(figsize=(8, 5))
plt.bar(range(len(freq_means)), list(freq_means.values()))
plt.xticks(range(len(freq_means)),
[f'{f:.1f}' for f in freq_means.keys()])
plt.xlabel('Spatial Frequency (cycles/image)')
plt.ylabel('Accuracy when features present')
plt.title('Diagnostic Value by Spatial Frequency')
plt.tight_layout()
plt.show()
For a complete analysis pipeline, see the scythe repository—a collection of analysis scripts and code samples for Meadows data—which includes the RandomNumberGenerator needed to reconstruct feature selections.
References¶
1. Gosselin, F., & Schyns, P. G. (2001). Bubbles: a technique to reveal the use of information in recognition tasks. Vision Research, 41(17), 2261-2271. doi:10.1016/S0042-6989(01)00097-9
2. Alink, A., & Charest, I. (2020). Clinically relevant autistic traits predict greater reliance on detail for image recognition. Scientific Reports, 10, 14239. doi:10.1038/s41598-020-70953-8