Xin Xie, Ph.D.

I am interested in the cognitive and neural processes underlying speech processing in the context of native and non-native communication. I combine psycholinguistic experimentation with functional neuroimaging (fMRI) and, more recently, statistical modeling to investigate how experience-dependent plasticity occurs (perceptual adaptation, long-term L2 learning). My current line of work investigates adaptation to foreign accents, including its generalization across talkers and its maintenance over time. Specifically, I hope to better understand how people learn and represent structured variability (e.g., talker information) in spoken language, and the neural mechanisms that support such learning and representation.

Speech Adaptation

How do we adapt to unfamiliar pronunciations?

Everyone speaks differently, influenced by many factors such as age, gender, social status, dialect, and accent. I am interested in how we represent and adapt to unfamiliar pronunciations in speech and, once adapted, how we generalize from prior experience to novel situations (e.g., novel words, talkers, or accents). My work investigates the mechanisms underlying rapid perceptual adaptation when the listener's and the talker's representations of the sound structure of language are misaligned, as when a native listener hears a foreign-accented talker.

Continuing work along this line includes studies that examine (1) the effects of expectations on perceptual adaptation, (2) the maintenance of adaptation over the long term, and (3) the neural systems that support adaptation to native-accented and foreign-accented speech.

Nonnative Speech Communication

How do we speak and listen in a second language?

Foreign-accented speech deviates from native norms of the target language and is generally harder to understand than native speech. However, Bent and Bradlow (2003) found that non-native speech is not always less intelligible: it can be easier to understand than native speech when the speaker and the listener share a language background, a phenomenon known as the interlanguage speech intelligibility benefit (ISIB). My work builds on this finding and investigates how both long-term (L1) and short-term (ambient language environment) language experience modulate the ISIB, by comparing L1 and L2 listeners in different language environments. The weighting and integration of various acoustic cues is central to speech perception, and language experience fundamentally shapes listeners’ cue-weighting strategies; effective use of these cues depends on the experience shared between talker and listener.

Current work along this line involves further characterization of the phonetic variation in Mandarin-accented speech, based on large samples of natural speech production. A special focus is on under-studied phonemes and production patterns in connected speech.

Talker Identification

How do we know "who" is speaking?

Talker identification can be easy: a single syllable reveals Mom's voice over the phone. It can also be hard: talkers sound similar when you don't know the language being spoken. Voice perception thus depends on language processing. Building on the established "Language Familiarity Effect" in talker identification (identifying talkers is easier in one’s native language than in unfamiliar languages; e.g., Perrachione et al., 2011), I take an individual-differences approach to investigate the interplay of talker perception and language processing. We found that both linguistic experience (speaking a tone language) and relevant non-linguistic experience (musicianship) can enhance processing of talker information by sharpening general auditory skills (pitch processing). Meanwhile, individuals’ ability to encode phonetic detail strongly affects their ability to learn talker identities. Moreover, top-down lexical information may guide listeners’ interpretation of acoustic-phonetic variation and link it to talker identity.

Continuing work along this line involves further testing of a direct link between fine-grained phonetic processing and talker identification. We aim to better understand how memory representations of speech sounds interface with talker-processing systems (potentially by making use of speech variation) and guide online speech processing as well as voice perception.