Perception of synchrony between one’s own action (e.g. a finger tap) and its sensory feedback (e.g. a flash or click) can be shifted after exposure to an induced delay (temporal recalibration effect, TRE). It remains unclear, however, whether the same mechanism underlies motor-visual (MV) and motor-auditory (MA) TRE. We examined this by measuring crosstalk between delayed MV and MA feedback. During an exposure phase, participants pressed a mouse button at a constant pace while receiving visual or auditory feedback that was either delayed (+150 ms) or subjectively synchronous (+50 ms). During a post-test, participants then tried to tap in sync with visual or auditory pacers. TRE manifested itself as a compensatory shift in the tap–pacer asynchrony (a larger anticipation error after exposure to delayed feedback). In experiment 1, MA and MV feedback were either both synchronous (MV-sync and MA-sync) or both delayed (MV-delay and MA-delay), whereas in experiment 2, different delays were mixed across alternating trials (MV-sync and MA-delay or MV-delay and MA-sync). Exposure to consistent delays induced equally large TREs for auditory and visual pacers, with similar build-up courses. With mixed delays, however, synchronized sounds erased MV-TRE, whereas synchronized flashes did not erase MA-TRE. These results suggest that similar mechanisms underlie MA- and MV-TRE, but that auditory feedback is more potent than visual feedback in inducing a rearrangement of motor-sensory timing.
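
As an illustration of the dependent measure, the sketch below computes a tap–pacer asynchrony and the resulting TRE. It is a minimal example under stated assumptions, not the authors' analysis code: the function names `mean_asynchrony` and `tre_ms`, the sign convention (asynchrony is tap time minus pacer time, so anticipation is negative) and all numbers are hypothetical and are not data from the study.

```python
from statistics import mean


def mean_asynchrony(tap_times_ms, pacer_times_ms):
    """Mean tap-minus-pacer asynchrony in ms.

    Assumed sign convention: negative values mean the tap preceded the pacer
    (an anticipation error).
    """
    if len(tap_times_ms) != len(pacer_times_ms):
        raise ValueError("each tap must be paired with exactly one pacer onset")
    return mean(t - p for t, p in zip(tap_times_ms, pacer_times_ms))


def tre_ms(async_after_sync_ms, async_after_delay_ms):
    """TRE as the compensatory shift between exposure conditions.

    Positive when taps anticipate the pacer more after delayed-feedback
    exposure than after synchronous-feedback exposure.
    """
    return async_after_sync_ms - async_after_delay_ms


# Hypothetical post-test data (ms), invented purely for illustration.
pacers = [0, 500, 1000, 1500]
taps_after_sync = [-30, 470, 970, 1470]    # taps lead each pacer by 30 ms
taps_after_delay = [-60, 440, 940, 1440]   # taps lead each pacer by 60 ms

a_sync = mean_asynchrony(taps_after_sync, pacers)
a_delay = mean_asynchrony(taps_after_delay, pacers)
print(tre_ms(a_sync, a_delay))
```

With this convention, a larger anticipation error after delayed exposure yields a positive TRE, and a TRE near zero corresponds to the case in which recalibration was erased.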