Synchronization Guide

How to correlate data across video, audio, eye tracking, and behavioral modules with frame-level accuracy.

The Logger records all modules with synchronized timestamps, enabling precise correlation of data across different sensors and devices. This guide explains how to use these timestamps for multi-modal data analysis.

Timestamp Types

The Logger uses three types of timestamps, each suited for different purposes:

1. Monotonic Time (Recommended for Cross-Module Sync)

A continuously increasing clock that starts at system boot. It never jumps due to NTP adjustments or timezone changes, making it ideal for precise relative timing.

  • Precision: 9 decimal places (nanosecond)
  • Format: 12345.678901234 (seconds since boot)
  • Columns: encode_time_mono, write_time_monotonic, record_time_mono

2. Unix Timestamp (Wall Clock Time)

Seconds since January 1, 1970 UTC. Useful for absolute time reference and correlating with external systems.

  • Precision: 6 decimal places (microsecond)
  • Format: 1733649120.123456
  • Columns: capture_time_unix, write_time_unix, record_time_unix, Unix time in UTC

Caution with Unix Timestamps

Unix timestamps can jump forward or backward if the system clock is adjusted (NTP sync, manual changes). For frame-accurate synchronization, prefer monotonic time.
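When a module records only one clock type, the two clocks can be bridged by estimating a fixed offset from a file that records both. A minimal sketch, using a synthetic stand-in for a camera timing CSV (real files pair encode_time_mono with capture_time_unix row by row; the two columns mark slightly different instants, so treat the mapping as approximate):

```python
import pandas as pd

# Synthetic stand-in for a camera timing CSV; real files pair these
# two columns on every row.
timing = pd.DataFrame({
    'encode_time_mono':  [100.000, 100.033, 100.066],
    'capture_time_unix': [1733649120.000, 1733649120.033, 1733649120.066],
})

# Per-row offset maps monotonic time onto wall-clock time; the median
# resists outliers from occasional NTP steps during the recording.
offset = (timing['capture_time_unix'] - timing['encode_time_mono']).median()

def mono_to_unix(t_mono: float) -> float:
    return t_mono + offset
```

The same offset can then convert any other module's monotonic timestamps to wall-clock time for the same recording session.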

3. Hardware/Device Timestamps

Some devices provide their own hardware timestamps with higher precision or different time bases:

  • CSI Cameras: sensor_timestamp_ns (nanoseconds from camera hardware)
  • Audio: adc_timestamp (from audio hardware, if available)
  • GPS: timestamp_utc (atomic clock derived, highest absolute accuracy)
  • Detection Response Task / Visual Occlusion Goggles: Milliseconds Since Record (device internal time)

Synchronization Columns by Module

  • Cameras: encode_time_mono (monotonic), capture_time_unix (Unix). Use frame_index to seek within the video.
  • Audio: write_time_monotonic (monotonic), write_time_unix (Unix). Use total_frames for sample position.
  • Eye Tracker: record_time_mono (monotonic), record_time_unix (Unix). High-frequency gaze data.
  • Detection Response Task: Unix time in UTC (Unix only). Event-based (one row per trial).
  • Visual Occlusion Goggles: Unix time in UTC (Unix only). Event-based (lens state changes).
  • GPS: record_time_mono (monotonic), timestamp_unix (Unix). Highest absolute time accuracy.
  • Notes: Timestamp (Unix wall-clock only). Manual annotations.

Common Synchronization Tasks

Find Video Frame at a Given Time

To find which video frame corresponds to a specific timestamp:

  1. Load the camera timing CSV
  2. Search for the row with capture_time_unix closest to your target time
  3. Use frame_index to seek to that frame in the video file
# Python example
import pandas as pd

timing = pd.read_csv('trial_001_usb_0_001_timing.csv')
target_time = 1733649123.456789

# Find closest frame
idx = (timing['capture_time_unix'] - target_time).abs().idxmin()
frame_number = timing.loc[idx, 'frame_index']
print(f"Frame {frame_number} at time {timing.loc[idx, 'capture_time_unix']}")

Find Audio Sample at a Given Time

To find which audio sample corresponds to a timestamp:

  1. Load the audio timing CSV
  2. Find the chunk containing your target time
  3. Calculate the sample offset within that chunk
# Python example
import pandas as pd

timing = pd.read_csv('20251208_143022_AUDIOTIMING_trial001_MIC0.csv')
target_time = 1733649123.456789
sample_rate = 48000

# Find the last chunk written at or before the target time
chunk = timing[timing['write_time_unix'] <= target_time].iloc[-1]
time_offset = target_time - chunk['write_time_unix']
sample_in_chunk = int(time_offset * sample_rate)
# total_frames counts samples written so far; subtracting this chunk's
# frames gives its starting sample, then add the offset within it
total_sample = chunk['total_frames'] - chunk['frames'] + sample_in_chunk
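Once total_sample is known, the raw samples can be pulled from the recorded WAV with Python's standard wave module. A self-contained sketch: the file here is generated inline so the example runs on its own, and the mono 16-bit 48 kHz format is an assumption -- substitute your actual recording.

```python
import struct
import wave

sample_rate = 48000

# Generate a tiny mono 16-bit WAV so the sketch is self-contained;
# in practice, open the Logger's recorded audio file instead.
with wave.open('demo.wav', 'wb') as w:
    w.setnchannels(1)
    w.setsampwidth(2)            # 16-bit PCM
    w.setframerate(sample_rate)
    w.writeframes(struct.pack('<48000h', *range(-24000, 24000)))

total_sample = 12345             # from the chunk arithmetic above
window = int(0.1 * sample_rate)  # read 100 ms starting at the target

with wave.open('demo.wav', 'rb') as w:
    w.setpos(total_sample)       # seek directly to the target sample
    raw = w.readframes(window)

samples = struct.unpack('<%dh' % window, raw)
```

For multi-channel or 24-bit recordings, a library such as soundfile is easier than unpacking frames by hand.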

Find Gaze Data at a Video Frame

To get gaze data corresponding to a specific video frame:

  1. Get the frame's encode_time_mono from camera timing CSV
  2. Find gaze samples with matching record_time_mono
# Python example
import pandas as pd

camera_timing = pd.read_csv('trial_001_usb_0_001_timing.csv')
gaze_data = pd.read_csv('20251208_GAZEDATA_trial001.csv')

frame_time = camera_timing.loc[100, 'encode_time_mono']  # Frame 100

# Find gaze samples within 33ms of frame (for 30fps video)
tolerance = 0.033
gaze_at_frame = gaze_data[
    (gaze_data['record_time_mono'] >= frame_time - tolerance) &
    (gaze_data['record_time_mono'] < frame_time + tolerance)
]

Find Detection Response Task Trial at a Video Frame

To find which Detection Response Task trial was occurring during a video frame:

# Python example
import pandas as pd

camera_timing = pd.read_csv('trial_001_usb_0_001_timing.csv')
drt_data = pd.read_csv('20251208_DRT_trial001.csv')

frame_time_unix = camera_timing.loc[100, 'capture_time_unix']

# Find Detection Response Task trial closest to frame time
drt_data['time_diff'] = abs(drt_data['Unix time in UTC'] - frame_time_unix)
closest_trial = drt_data.loc[drt_data['time_diff'].idxmin()]

Correlate Note with Video Frame

# Python example
import pandas as pd

notes = pd.read_csv('20251208_NOTES_trial001.csv')
camera_timing = pd.read_csv('trial_001_usb_0_001_timing.csv')

note_time = notes.loc[0, 'Timestamp']

# Find frame at note time
idx = (camera_timing['capture_time_unix'] - note_time).abs().idxmin()
frame_at_note = camera_timing.loc[idx, 'frame_index']
print(f"Note '{notes.loc[0, 'Content']}' at frame {frame_at_note}")

Timing Accuracy

Synchronization Accuracy by Module

  • Cameras (CSI): < 1 ms. Hardware timestamps from the sensor.
  • Cameras (USB): ~10-30 ms. Software timestamps; USB latency varies.
  • Audio: < 1 ms. Hardware ADC timestamps when available.
  • Eye Tracker: ~5-10 ms. Network latency from the Neon device.
  • Detection Response Task / Visual Occlusion Goggles: ~1-5 ms. USB serial latency plus firmware timing.
  • GPS: < 100 ns (absolute). Atomic clock derived; best for absolute time.
  • Notes: ~10-50 ms. Human input delay plus software latency.

Frame-Level Synchronization

At 30 fps video, one frame = ~33 ms. The Logger achieves frame-level synchronization across all modules, meaning:

  • Any two data points can be correlated to within one video frame
  • Typical cross-module accuracy: < 30 ms
  • Best case (CSI camera + audio with hardware timestamps): < 1 ms

Best Practices

1. Use Monotonic Time for Cross-Module Sync

Monotonic timestamps are immune to clock adjustments and provide the most reliable relative timing between modules.

2. Use Unix Time for Absolute Reference

When you need to know the actual wall-clock time of an event, or correlate with external systems, use Unix timestamps.

3. Account for Sample Rate Differences

Different modules sample at different rates:

  • Video: 30-60 Hz
  • Audio: 48,000 Hz
  • Gaze: 30-200 Hz
  • Detection Response Task: Event-based (~0.2-0.3 Hz)
  • GPS: 1-10 Hz

When correlating, interpolate or find nearest neighbor as appropriate.
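Nearest-neighbor alignment of two streams sampled at different rates can be done in one call with pandas merge_asof. A sketch using tiny inline stand-ins for the camera and gaze CSVs (column names follow the synchronization table above; the 16 ms tolerance, roughly half a 30 fps frame, is an illustrative choice):

```python
import pandas as pd

# Inline stand-ins for the camera timing and gaze CSVs.
frames = pd.DataFrame({
    'frame_index':      [0, 1, 2],
    'encode_time_mono': [10.000, 10.033, 10.066],
})
gaze = pd.DataFrame({
    'record_time_mono': [10.001, 10.010, 10.034, 10.070],
    'gaze_x':           [0.1, 0.2, 0.3, 0.4],
})

# Both frames must be sorted on the key. direction='nearest' picks the
# closest gaze sample per frame; tolerance drops matches further than
# half a frame away (they come back as NaN).
aligned = pd.merge_asof(
    frames, gaze,
    left_on='encode_time_mono', right_on='record_time_mono',
    direction='nearest', tolerance=0.016,
)
```

For smoothly varying signals (e.g. gaze position), linear interpolation between the two neighboring samples may be preferable to taking the nearest one.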

4. Verify Timing with Known Events

Use the Notes module to mark known events during recording, then verify these events appear correctly synchronized across video, audio, and other data streams.

5. Use GPS for Absolute Time Calibration

If you need highly accurate absolute time, use the GPS module's timestamp_utc as a reference. GPS time is derived from atomic clocks and accurate to within 100 nanoseconds.
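One way to apply this calibration is to estimate the system clock's error against GPS time and subtract it from other modules' Unix timestamps. A sketch with inline stand-in data; it assumes the GPS CSV records both a system-clock column (record_time_unix here) and the GPS-derived timestamp_utc as comparable Unix-epoch seconds -- check the actual column names and units in your export.

```python
import pandas as pd

# Inline stand-in for a GPS CSV; real files pair the system clock with
# the GPS-derived time on every fix.
gps = pd.DataFrame({
    'record_time_unix': [1733649120.012, 1733649121.011, 1733649122.013],
    'timestamp_utc':    [1733649120.000, 1733649121.000, 1733649122.000],
})

# Median system-clock error relative to GPS time (positive means the
# system clock runs ahead of true time).
clock_error = (gps['record_time_unix'] - gps['timestamp_utc']).median()

# Correct any other module's Unix timestamp to GPS-referenced time.
corrected = 1733649123.456789 - clock_error
```

The median keeps a single bad fix from skewing the estimate; for long recordings, the offset can also be fit per-segment in case the system clock drifts.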