Although infant speech perception in often studied in isolated modalities, infants' experience with speech is largely multimodal (i.e., speech sounds they hear are accompanied by articulating faces). Across two experiments, we tested infants' sensitivity to the relationship between the auditory and visual components of audiovisual speech in their native (English) and non-native (Spanish) language. In Experiment 1, infants' looking times were measured during a preferential looking task in which they saw two simultaneous visual speech streams articulating a story, one in English and the other in Spanish, while they heard either the English or the Spanish version of the story. In Experiment 2, looking times from another group of infants were measured as they watched single displays of congruent and incongruent combinations of English and Spanish audio and visual speech streams. Findings demonstrated an age-related increase in looking towards the native relative to non-native visual speech stream when accompanied by the corresponding (native) auditory speech. This increase in native language preference did not appear to be driven by a difference in preference for native vs. non-native audiovisual congruence as we observed no difference in looking times at the audiovisual streams in Experiment 2.