Interpersonal physiological synchrony (PS), the similarity of physiological signals between individuals over time, may be used to detect attentionally engaging moments. Here, we investigated whether PS in the electroencephalogram (EEG), electrodermal activity (EDA), heart rate, and a multimodal metric signals the occurrence of attentionally relevant events in two groups of participants. Both groups were presented with the same auditory stimulus, but were instructed to attend either to the narrative of an audiobook (audiobook-attending: AA group) or to interspersed emotional sounds and beeps (stimulus-attending: SA group). We hypothesized that emotional sounds could be detected in both groups, as they are expected to draw attention involuntarily, in a ``bottom-up'' fashion. Indeed, we found this to be the case for PS in EDA and in the multimodal metric. Beeps, which are expected to be relevant only because of specific ``top-down'' attentional instructions, could indeed be detected only using PS among SA participants, for EDA, EEG, and the multimodal metric. We further hypothesized that moments in the audiobook accompanied by high PS in EEG, EDA, heart rate, or the multimodal metric among AA participants would be rated as more engaging by an independent group of participants than moments accompanied by low PS. This hypothesis was not supported. Our results show that PS can support the detection of attentionally engaging events over time. So far, the relation between PS and engagement has been established only for well-defined, interspersed stimuli; the relation between PS and a more abstract, self-reported metric of engagement over time has not been established. As the relation between PS and engagement depends on event type and physiological measure, we suggest choosing a measure that matches the stimulus of interest. When the stimulus type is unknown, a multimodal metric is most robust.