Synchronization Guide
How to correlate data across video, audio, eye tracking, and behavioral modules with frame-level accuracy.
The Logger records all modules with synchronized timestamps, enabling precise correlation of data across different sensors and devices. This guide explains how to use these timestamps for multi-modal data analysis.
Timestamp Types
The Logger uses three types of timestamps, each suited for different purposes:
1. Monotonic Time (Recommended for Cross-Module Sync)
A continuously increasing clock that starts at system boot. It never jumps due to NTP adjustments or timezone changes, making it ideal for precise relative timing.
- Precision: 9 decimal places (nanoseconds)
- Format: `12345.678901234` (seconds since boot)
- Columns: `encode_time_mono`, `write_time_monotonic`, `record_time_mono`
2. Unix Timestamp (Wall Clock Time)
Seconds since January 1, 1970 UTC. Useful for absolute time reference and correlating with external systems.
- Precision: 6 decimal places (microseconds)
- Format: `1733649120.123456`
- Columns: `capture_time_unix`, `write_time_unix`, `record_time_unix`, `Unix time in UTC`
Caution with Unix Timestamps
Unix timestamps can jump forward or backward if the system clock is adjusted (NTP sync, manual changes). For frame-accurate synchronization, prefer monotonic time.
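One way to see why monotonic time is preferred: over the same recording, the unix and monotonic clocks should advance by the same amount; any difference reveals a wall-clock adjustment. A minimal sketch with synthetic data (the column names follow the tables in this guide; the values are invented and include a deliberate 1-second clock jump):

```python
import pandas as pd

# Synthetic timing data standing in for a camera timing CSV; values are
# invented and include a 1 s wall-clock jump between the last two frames
timing = pd.DataFrame({
    'capture_time_unix': [1733649120.000, 1733649120.033, 1733649121.066],
    'encode_time_mono':  [12345.000, 12345.033, 12345.066],
})

# If the wall clock was stable, the unix and monotonic spans should match closely
unix_span = timing['capture_time_unix'].iloc[-1] - timing['capture_time_unix'].iloc[0]
mono_span = timing['encode_time_mono'].iloc[-1] - timing['encode_time_mono'].iloc[0]
drift = unix_span - mono_span  # ~1.0 s here, revealing an NTP-style jump
print(f"Unix span {unix_span:.3f} s, monotonic span {mono_span:.3f} s, drift {drift:.3f} s")
```

Running this check over each module's timing CSV is a quick way to confirm whether unix timestamps can be trusted for a given recording.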
3. Hardware/Device Timestamps
Some devices provide their own hardware timestamps with higher precision or different time bases:
- CSI Cameras: `sensor_timestamp_ns` (nanoseconds from camera hardware)
- Audio: `adc_timestamp` (from audio hardware, if available)
- GPS: `timestamp_utc` (atomic clock derived, highest absolute accuracy)
- Detection Response Task / Visual Occlusion Goggles: `Milliseconds Since Record` (device internal time)
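Hardware timestamps arrive in different units and time bases, so convert them to a common unit (seconds) before comparing. A minimal sketch with invented values (the field names come from the list above; `record_start_unix` is a hypothetical anchor for the device-relative clock):

```python
# Device timestamps in their native units (values invented for illustration)
sensor_timestamp_ns = 1_733_649_120_123_456_789  # CSI camera, nanoseconds
drt_ms_since_record = 45_250                     # DRT "Milliseconds Since Record"

sensor_time_s = sensor_timestamp_ns / 1e9  # nanoseconds -> seconds
drt_offset_s = drt_ms_since_record / 1e3   # milliseconds -> seconds

# Device-relative times still need an anchor, e.g. the recording's start in unix time
record_start_unix = 1733649075.0  # hypothetical anchor value
drt_time_unix = record_start_unix + drt_offset_s
print(f"DRT event at unix time {drt_time_unix:.3f}")
```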
Synchronization Columns by Module
| Module | Monotonic Column | Unix Column | Notes |
|---|---|---|---|
| Cameras | `encode_time_mono` | `capture_time_unix` | Use `frame_index` to seek video |
| Audio | `write_time_monotonic` | `write_time_unix` | Use `total_frames` for sample position |
| Eye Tracker | `record_time_mono` | `record_time_unix` | High-frequency gaze data |
| Detection Response Task | — | `Unix time in UTC` | Event-based (one row per trial) |
| Visual Occlusion Goggles | — | `Unix time in UTC` | Event-based (lens state changes) |
| GPS | `record_time_mono` | `timestamp_unix` | Highest absolute time accuracy |
| Notes | — | `Timestamp` | Manual annotations |
Common Synchronization Tasks
Find Video Frame at a Given Time
To find which video frame corresponds to a specific timestamp:
- Load the camera timing CSV
- Search for the row with `capture_time_unix` closest to your target time
- Use `frame_index` to seek to that frame in the video file
```python
# Python example
import pandas as pd

timing = pd.read_csv('trial_001_usb_0_001_timing.csv')
target_time = 1733649123.456789

# Find the row whose capture time is closest to the target
idx = (timing['capture_time_unix'] - target_time).abs().idxmin()
frame_number = timing.loc[idx, 'frame_index']
print(f"Frame {frame_number} at time {timing.loc[idx, 'capture_time_unix']}")
```
Find Audio Sample at a Given Time
To find which audio sample corresponds to a timestamp:
- Load the audio timing CSV
- Find the chunk containing your target time
- Calculate the sample offset within that chunk
```python
# Python example
import pandas as pd

timing = pd.read_csv('20251208_143022_AUDIOTIMING_trial001_MIC0.csv')
target_time = 1733649123.456789
sample_rate = 48000

# Find the last chunk written at or before the target time
chunk = timing[timing['write_time_unix'] <= target_time].iloc[-1]
time_offset = target_time - chunk['write_time_unix']
sample_in_chunk = int(time_offset * sample_rate)

# total_frames is cumulative through this chunk; subtract the chunk's own
# frames to get its starting sample, then add the offset within the chunk
total_sample = chunk['total_frames'] - chunk['frames'] + sample_in_chunk
```
Find Gaze Data at a Video Frame
To get gaze data corresponding to a specific video frame:
- Get the frame's `encode_time_mono` from the camera timing CSV
- Find gaze samples with a matching `record_time_mono`
```python
# Python example
import pandas as pd

camera_timing = pd.read_csv('trial_001_usb_0_001_timing.csv')
gaze_data = pd.read_csv('20251208_GAZEDATA_trial001.csv')
frame_time = camera_timing.loc[100, 'encode_time_mono']  # Frame 100

# Find gaze samples within one frame interval (33 ms at 30 fps)
tolerance = 0.033
gaze_at_frame = gaze_data[
    (gaze_data['record_time_mono'] >= frame_time - tolerance) &
    (gaze_data['record_time_mono'] < frame_time + tolerance)
]
```
Find Detection Response Task Trial at a Video Frame
To find which Detection Response Task trial was occurring during a video frame:
```python
# Python example
import pandas as pd

camera_timing = pd.read_csv('trial_001_usb_0_001_timing.csv')
drt_data = pd.read_csv('20251208_DRT_trial001.csv')
frame_time_unix = camera_timing.loc[100, 'capture_time_unix']

# Find the Detection Response Task trial closest to the frame time
drt_data['time_diff'] = (drt_data['Unix time in UTC'] - frame_time_unix).abs()
closest_trial = drt_data.loc[drt_data['time_diff'].idxmin()]
```
Correlate Note with Video Frame
```python
# Python example
import pandas as pd

notes = pd.read_csv('20251208_NOTES_trial001.csv')
camera_timing = pd.read_csv('trial_001_usb_0_001_timing.csv')
note_time = notes.loc[0, 'Timestamp']

# Find the frame closest to the note's timestamp
idx = (camera_timing['capture_time_unix'] - note_time).abs().idxmin()
frame_at_note = camera_timing.loc[idx, 'frame_index']
print(f"Note '{notes.loc[0, 'Content']}' at frame {frame_at_note}")
Timing Accuracy
Synchronization Accuracy by Module
| Module | Typical Accuracy | Notes |
|---|---|---|
| Cameras (CSI) | < 1 ms | Hardware timestamps from sensor |
| Cameras (USB) | ~10-30 ms | Software timestamps, USB latency varies |
| Audio | < 1 ms | Hardware ADC timestamps when available |
| Eye Tracker | ~5-10 ms | Network latency from Neon device |
| Detection Response Task / Visual Occlusion Goggles | ~1-5 ms | USB serial latency + firmware timing |
| GPS | < 100 ns (absolute) | Atomic clock derived, best for absolute time |
| Notes | ~10-50 ms | Human input delay + software latency |
Frame-Level Synchronization
At 30 fps video, one frame = ~33 ms. The Logger achieves frame-level synchronization across all modules, meaning:
- Any two data points can be correlated to within one video frame
- Typical cross-module accuracy: < 30 ms
- Best case (CSI camera + audio with hardware timestamps): < 1 ms
Best Practices
1. Use Monotonic Time for Cross-Module Sync
Monotonic timestamps are immune to clock adjustments and provide the most reliable relative timing between modules.
2. Use Unix Time for Absolute Reference
When you need to know the actual wall-clock time of an event, or correlate with external systems, use Unix timestamps.
3. Account for Sample Rate Differences
Different modules sample at different rates:
- Video: 30-60 Hz
- Audio: 48,000 Hz
- Gaze: 30-200 Hz
- Detection Response Task: Event-based (~0.2-0.3 Hz)
- GPS: 1-10 Hz
When correlating, interpolate or find nearest neighbor as appropriate.
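For nearest-neighbor alignment across sample rates, `pandas.merge_asof` does the search efficiently. A sketch with synthetic monotonic timestamps, 30 fps frames against gaze samples (column names follow this guide; values are invented):

```python
import pandas as pd

# Synthetic monotonic timestamps: three 30 fps frames and a few gaze samples
frames = pd.DataFrame({'frame_index': [0, 1, 2],
                       'encode_time_mono': [100.000, 100.033, 100.066]})
gaze = pd.DataFrame({'record_time_mono': [99.999, 100.010, 100.031, 100.070],
                     'gaze_x': [0.50, 0.51, 0.52, 0.55]})

# Nearest-neighbor join: each frame gets the closest gaze sample in time,
# or NaN if none lies within half a frame interval (both inputs must be
# sorted on their time columns)
aligned = pd.merge_asof(frames, gaze,
                        left_on='encode_time_mono', right_on='record_time_mono',
                        direction='nearest', tolerance=0.017)
print(aligned[['frame_index', 'gaze_x']])
```

The `tolerance` argument drops matches farther than half a frame away, which is usually preferable to silently pairing a frame with a stale sample.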
4. Verify Timing with Known Events
Use the Notes module to mark known events during recording, then verify these events appear correctly synchronized across video, audio, and other data streams.
5. Use GPS for Absolute Time Calibration
If you need highly accurate absolute time, use the GPS module's timestamp_utc as a reference. GPS time is derived from atomic clocks and accurate to within 100 nanoseconds.
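As a sketch of such a calibration with synthetic values, assuming `timestamp_unix` records the system clock and `timestamp_utc` the GPS-derived time (as in the GPS columns above), estimate the system clock's offset from GPS time and apply it to other unix timestamps:

```python
import pandas as pd

# Synthetic GPS log: system unix clock next to GPS-derived timestamp_utc
gps = pd.DataFrame({
    'timestamp_unix': [1733649120.123460, 1733649121.123461],
    'timestamp_utc':  [1733649120.123400, 1733649121.123400],
})

# Median offset of the system clock relative to GPS time (robust to outliers)
offset = (gps['timestamp_unix'] - gps['timestamp_utc']).median()

# Shift another module's unix timestamp onto the GPS time base
corrected = 1733649123.456789 - offset
print(f"System clock offset vs GPS: {offset * 1e6:.1f} microseconds")
```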
