AME (Autonomous Multimodal Experiment) is a web-based platform for designing, deploying, and running online behavioral science experiments autonomously. It enables researchers to create experiments that include manipulated video conferences, participant matching, supplementary device integration, and diverse data collection — all orchestrated through a real-time server without the experimenter needing to be present in each session.
An Experiment is the top-level entity that defines core parameters and conditions. Each experiment contains a sequence of Progress steps that participants advance through. Every progress step maps to a Routine — a layout or functional container that holds one or more Elements, the atomic building blocks that present stimuli, capture data, or monitor participant state.
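The Experiment → Progress → Routine → Element hierarchy can be sketched as plain data. The shapes and field names below are illustrative, not AME's actual schema, and the sketch simplifies to a single shared progress sequence:

```typescript
// Illustrative shapes for the AME hierarchy (not the platform's real schema).
interface Element { type: string; [key: string]: unknown }  // atomic building block
interface Routine { type: string; elements: Element[] }     // container for one step
interface Experiment { name: string; conditions: string[]; progress: Routine[] }

const demo: Experiment = {
  name: "demo-study",
  conditions: ["control", "treatment"],
  progress: [
    { type: "routine-consent", elements: [{ type: "text", content: "I agree to participate." }] },
    { type: "routine-card",    elements: [{ type: "button", label: "Next" }] },
    { type: "routine-end",     elements: [] },
  ],
};
```

Each entry in `progress` is one step; participants walk the array from index 0 to the end.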
Progress
A progress step represents a single position in the experiment's sequence. Each progress index maps to a routine and its associated content. Participants advance through progress steps linearly, from the first to the last.
Proceed events
Participants move from one progress step to the next via proceed events. A proceed event ends the current routine, uploads any collected data to the server, and advances the participant to the next step. Proceed events can be triggered by button clicks, timer expiry, gesture detection, keyword recognition, or any other element-driven action.
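The core of a proceed event can be sketched as a small function: flush data, then advance the index. This is an illustrative reduction, not AME's actual API; `upload` stands in for the server round-trip:

```typescript
// Illustrative proceed handler: upload collected data, then advance the
// participant's progress index, clamped at the final step.
type UploadFn = (data: unknown) => void;

function proceed(current: number, total: number, data: unknown, upload: UploadFn): number {
  upload(data);                             // flush collected data to the server
  return Math.min(current + 1, total - 1);  // advance, never past the last step
}
```

Any element-driven trigger (button click, timer expiry, gesture match) would ultimately invoke a handler like this.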
Conditions
Experiments can define multiple conditions, each mapping to a different sequence of routines. For example, a study might have a "control" condition showing neutral stimuli and a "treatment" condition showing manipulated stimuli. Participants are assigned to a condition when they join, and this assignment determines which progress sequence they follow.
Randomization
When randomization is enabled, the order in which participants are assigned to conditions can be controlled using several methods: shuffle randomizes the order completely, latin-square balances condition ordering across participants, and block randomization groups assignments into balanced blocks to maintain even distribution throughout the study.
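Block randomization, for example, can be sketched as shuffling each block independently so every block contains each condition exactly once. This is a generic sketch of the technique, not AME's implementation:

```typescript
// Block randomization sketch: each block holds every condition once, shuffled
// independently, so assignments stay balanced throughout the study.
function blockRandomize(
  conditions: string[],
  nBlocks: number,
  rand: () => number = Math.random,
): string[] {
  const order: string[] = [];
  for (let b = 0; b < nBlocks; b++) {
    const block = [...conditions];
    for (let i = block.length - 1; i > 0; i--) {   // Fisher–Yates shuffle
      const j = Math.floor(rand() * (i + 1));
      [block[i], block[j]] = [block[j], block[i]];
    }
    order.push(...block);
  }
  return order;
}
```

With two conditions and five blocks this yields ten assignments, never drifting more than one block out of balance.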
Routines
Routines are layout and functional containers within a progress step. Each progress step has exactly one routine, and each routine can hold multiple elements organized into element sets. When multiple participants share a session, each sees only their assigned element set.
routine-borderless
Full-screen layout with no visual frame. Used for immersive content where the UI should disappear — the elements fill the entire viewport without any container chrome.
routine-card
Centered card container with optional next button. The default choice for structured, step-by-step content such as instructions, questionnaires, or stimulus presentation.
routine-consent
Consent form with agree/disagree buttons. Displays an information sheet and requires explicit agreement before proceeding. Can auto-withdraw the participant on disagree.
routine-end
Experiment completion screen. Displays a final message, marks the participant as completed, and optionally redirects to an external URL (e.g., a Prolific completion page).
routine-pos-check
Camera position and field-of-view validation using browser-based ML. Checks that the participant is correctly positioned in frame before proceeding, supporting face, upper-body, and full-body presets.
routine-video-chat
Split layout combining video chat streams with an element content area. Used during matched video conferences — one side shows the peer video feeds, the other shows experiment elements.
routine-survey
SurveyJS-powered survey with configurable progress bar and submit button. Supports the complex question types, branching logic, and validation provided by the SurveyJS library.
routine-matching
Waiting room displayed while participants are being matched into a session. Shows a live counter of participants in the queue and the matching status. Participants remain here until the server finds a valid match.
routine-end-matching
Ends a matching or video-chat session and disconnects peers. Cleans up WebRTC connections and removes participants from the matching queue before advancing to the next progress step.
Elements
Elements are the atomic building blocks of an experiment. Each element presents a stimulus, captures data, or monitors participant state. Elements are placed inside routines and can be configured with event triggers to drive experiment flow.
Visual
Elements that present content to the participant.
text
Renders styled text content. Supports font size, weight, color, alignment, and highlight. Used for instructions, labels, stimulus text, and any other written content.
image
Displays an image with configurable dimensions and object-fit. Used for presenting visual stimuli, diagrams, or instructional images.
video
Embeds a video player with autoplay, loop, controls, and mute options. Used for video stimuli, instructional clips, or pre-recorded demonstrations.
audio
Embeds an audio player with autoplay, loop, and controls options. Used for audio stimuli, ambient sounds, or spoken instructions.
Data Input
Elements that collect responses and data from the participant.
button
A clickable button that can trigger events. Supports hold-to-confirm with a configurable duration, delayed enable to prevent rushed responses, and limited interaction counts for single-use actions.
recorder
Records audio, video, or screen capture from the participant's device. Supports auto-start for hands-free recording, preview display, and streaming upload for continuous data transfer during long sessions.
multi-capture
Records from multiple sources simultaneously — for example, webcam, phone camera, and desktop capture all at once. Used with supplementary devices to capture multiple angles or perspectives.
input-textfield
A text input field, available as single-line or multiline, with optional validation patterns and character limits. Used for open-ended responses, participant identifiers, or any free-text data.
input-select
A dropdown or multi-select from a predefined list of options. Used for categorical choices such as demographics, condition selection, or forced-choice responses.
input-slider
A numeric slider with configurable range, step size, endpoint labels, and optional value display. Used for Likert scales, confidence ratings, or any continuous numeric input.
Monitoring
Elements that observe participant state and can trigger events automatically.
timer
A countdown timer that can trigger events on expiry. Supports visible and hidden modes, multiple display formats (mm:ss, seconds, progress bar), and a warning threshold that changes appearance as time runs low.
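A timer like this might be configured and rendered along these lines; the field names below are illustrative assumptions, not AME's documented options:

```typescript
// Hypothetical timer element config (field names are illustrative):
const timer = {
  type: "timer",
  duration: 120,        // seconds
  display: "mm:ss",
  warningAt: 10,        // change appearance with 10 s remaining
  onExpiry: "proceed",  // fire a proceed event when time runs out
};

// mm:ss rendering as such a timer might display it:
function formatMMSS(totalSeconds: number): string {
  const m = Math.floor(totalSeconds / 60);
  const s = totalSeconds % 60;
  return `${String(m).padStart(2, "0")}:${String(s).padStart(2, "0")}`;
}
```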
body-detector
Detects body gestures and poses using browser-based ML models. Recognizes actions like hands raised, thumbs up, peace sign, and fist. Can trigger proceed, data, or alert events when a gesture is held for a specified duration.
volume-check
Monitors microphone input volume against a configurable threshold. Used to verify that the participant's audio setup is working correctly before recording or voice-dependent tasks begin.
speech-recognizer
Performs real-time speech recognition with keyword detection. Listens for specific words or phrases and can trigger data capture, proceed, or alert events on keyword match. Supports continuous listening mode.
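The keyword-matching step can be sketched as a scan over the recognized transcript. This is an illustrative reduction; AME's recognizer presumably works on a live transcription stream:

```typescript
// Illustrative keyword detection over a recognized transcript: returns the
// first configured keyword found, or null if none match.
function matchKeyword(transcript: string, keywords: string[]): string | null {
  const words = transcript.toLowerCase().split(/\s+/);
  return keywords.find(k => words.includes(k.toLowerCase())) ?? null;
}
```

On a match, the element would fire its configured data, proceed, or alert event.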
Communication
Elements that enable real-time interaction between participants.
video-chat
Renders WebRTC video and audio streams for matched participants. Used inside routine-video-chat to display peer camera feeds and transmit audio. Relies on native browser WebRTC APIs with Socket.IO signaling.
Video Chat & Matching
Matching is a routine-level concept, not experiment-level. Each video-chat routine defines its own matching configuration, meaning a single experiment can include multiple video-chat sessions with different matching strategies and partner assignments.
How matching works
When a participant reaches a matching routine, they enter a Redis-backed waiting room keyed by experiment ID and routine name. The matching routine displays a waiting screen with a live counter showing how many participants are in the queue. The server continuously evaluates the queue against the matching configuration and forms groups when the requirements are met.
Matching strategies
FIFO
First-come-first-served queue matching. Participants are matched in the order they arrive. When enough participants are waiting (as defined by the session size), the server forms a group from the earliest arrivals and moves them into a video-chat session.
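The group-forming step can be sketched as a queue operation; this is an illustrative reduction of the server logic, not AME's actual code:

```typescript
// FIFO group formation sketch: once the queue reaches the session size,
// remove and return the earliest arrivals as a group.
function formGroup<T>(queue: T[], sessionSize: number): T[] | null {
  if (queue.length < sessionSize) return null;  // not enough participants yet
  return queue.splice(0, sessionSize);          // earliest arrivals, removed from queue
}
```

Remaining participants stay queued and are considered for the next group.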
Property-Based
Filters participants into named slots based on their properties — such as assigned condition, survey responses, or demographic data. Each slot defines filter criteria that a participant must match. The server only forms a group when every slot is filled by a qualifying participant.
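The slot-filling logic can be sketched as follows; the shapes are illustrative assumptions about how slots and filters might be represented:

```typescript
// Property-based matching sketch: a group forms only when every slot's
// filter is satisfied by a distinct queued participant.
type Participant = { id: string; props: Record<string, string> };
type Slot = { name: string; filter: Record<string, string> };

function fillSlots(queue: Participant[], slots: Slot[]): Participant[] | null {
  const used = new Set<string>();
  const group: Participant[] = [];
  for (const slot of slots) {
    const match = queue.find(p =>
      !used.has(p.id) &&
      Object.entries(slot.filter).every(([k, v]) => p.props[k] === v));
    if (!match) return null;  // a slot is unfilled — keep waiting
    used.add(match.id);
    group.push(match);
  }
  return group;
}
```

Pairing one "control" participant with one "treatment" participant is a typical use: two slots, each filtering on the assigned condition.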
Manual
The experimenter assigns participants to groups manually via the admin dashboard. Participants wait in the queue until the experimenter explicitly creates a match. Used when automated matching logic is insufficient or when the researcher needs direct control over pairings.
Timeouts
Each matching configuration includes a timeout duration. Once a participant has waited longer than the configured timeout, the timeout action setting determines what happens next: the system either keeps them in the queue or proceeds them to the next step without a match. The proceed option prevents participants from being stuck in a waiting room if a match never arrives.
WebRTC connections
Video chat uses native browser WebRTC APIs — no PeerJS or other third-party libraries. Socket.IO relays the signaling messages (SDP offers, SDP answers, and ICE candidates) between peers. Once signaling is complete, media streams flow directly between participant browsers in a peer-to-peer connection.
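The relay role of the signaling server can be sketched with an in-memory inbox per peer; this illustrative reduction omits Socket.IO itself so the message flow is visible:

```typescript
// Signaling relay sketch: the server only forwards SDP/ICE messages to the
// addressed peer; it never touches media. Illustrative, not AME's server code.
type SignalKind = "offer" | "answer" | "ice";
interface Signal { from: string; to: string; kind: SignalKind; payload: unknown }

class SignalRelay {
  private inbox = new Map<string, Signal[]>();  // peer id → pending messages

  join(peerId: string): void { this.inbox.set(peerId, []); }

  // Deliver a message to the target peer's inbox; false if the peer is unknown.
  relay(msg: Signal): boolean {
    const box = this.inbox.get(msg.to);
    if (!box) return false;
    box.push(msg);
    return true;
  }

  drain(peerId: string): Signal[] {
    const msgs = this.inbox.get(peerId) ?? [];
    this.inbox.set(peerId, []);
    return msgs;
  }
}
```

In the real system, `relay` corresponds to emitting over Socket.IO; once offer, answer, and ICE candidates have been exchanged, the media path bypasses the server entirely.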
Multiple Devices
AME supports supplementary devices — a phone or tablet that acts as an additional camera or input controller for the participant. This enables multi-angle video capture and remote interaction without additional hardware.
Connection flow
The participant scans a QR code displayed in the experiment interface to pair their device to the current session. The supplementary device connects via WebRTC, establishing a direct media or data channel with the main experiment browser. No app installation is required — the device opens a web page that handles the connection.
Camera
Streams video from the device's camera (front or rear facing) to be recorded alongside the main webcam. Enables multi-angle capture — for example, recording a participant's face with the webcam and their hands with a phone camera simultaneously.
Controller
Sends input events (taps, gestures) from the device to the experiment, enabling remote control interactions. The phone screen becomes a secondary input surface that participants can use during the experiment.
Multi-capture coordination
The multi-capture element orchestrates recording from all connected sources simultaneously. It manages synchronization of multiple video and audio streams — from the main webcam, supplementary device cameras, and desktop capture — uploading each stream independently with resumable chunked transfers.
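The resumable-transfer bookkeeping can be sketched as tracking acknowledged chunk offsets, so an interrupted upload resumes at the first missing chunk. This is a generic sketch of the technique, not AME's upload protocol:

```typescript
// Chunked-upload bookkeeping sketch: split a recording into fixed-size
// chunks and resume from the first unacknowledged offset.
function chunkOffsets(totalBytes: number, chunkBytes: number): number[] {
  const offsets: number[] = [];
  for (let off = 0; off < totalBytes; off += chunkBytes) offsets.push(off);
  return offsets;
}

// Returns the next byte offset to send, or null when every chunk is acked.
function nextChunk(acked: Set<number>, offsets: number[]): number | null {
  for (const off of offsets) if (!acked.has(off)) return off;  // resume point
  return null;  // upload complete
}
```

Running this bookkeeping per stream lets each source (webcam, phone camera, desktop capture) upload independently, as the section above describes.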