A Research Proposal:

Using Words to Examine the McGurk Effect

Annie Rose H. Nicholson

Stephen F. Austin State University


In 1976, Harry McGurk and John MacDonald published a paper called "Hearing Lips and Seeing Voices" that became a landmark study in the sensory integration field. While studying infant perception at the University of Surrey in England, McGurk and MacDonald accidentally discovered an illusion that combines auditory and visual stimuli (American Scientist, 1998). This illusion has become known as the McGurk effect. McGurk and MacDonald found that when discrepant auditory and visual information was presented to a human subject, the stimuli were combined to make a completely new response (McGurk & MacDonald, 1976). For example, when the visual stimulus /ga/ is presented with the auditory stimulus /ba/, the subject will perceive /da/. This illusion suggests that visual information is integrated into our perception of speech unconsciously and automatically (University of California at Riverside, 2001). It also suggests that neither auditory nor visual information is inherently more important than the other, but that the dominance of one over the other can be manipulated (Easton & Basala, 1982).

The limitations of the McGurk effect have been studied extensively since 1976. These studies have examined a wide range of circumstances under which the McGurk effect does and does not occur. Green and coworkers found that the McGurk effect occurred even when the auditory and visual stimuli were presented by speakers of different genders (Green, Kuhl, Meltzoff, & Stevens, 1991). A study done in Canada found that the McGurk effect was still apparent when the auditory stimuli lagged behind the visual stimuli by as much as 180 milliseconds (Munhall, Gribble, Sacco, & Ward, 1996). Any lag of more than 180 milliseconds between the auditory and visual stimuli caused a combination of the stimuli but not a completely new response. For example, the visual /ba/ and the auditory /da/ did not combine to form a new response like /ga/ but only partly combined to form /bda/ (Munhall, Gribble, Sacco, & Ward, 1996). Other studies have examined the auditory and visual stimuli themselves. Green and Gerdeman (1995) found that when the auditory and visual stimuli contained different vowels, the effect of the McGurk illusion decreased significantly. MacDonald and McGurk (1978) found that certain consonant combinations exhibited a greater effect than others. Consonants that use different formations of the mouth when spoken seem to have a greater influence on the McGurk effect than those that use the same mouth formations. Some examples of consonant combinations that elicit a strong McGurk effect are /ba/ and /ga/ or /ba/ and /da/ (University of California Perceptual Science Lab, 1998). A study done by Saldana and Rosenblum (1993) concluded that even when the facial images of the visual stimuli were blurred, the McGurk illusion was unaffected. It has even been found that prelinguistic infants exhibit the McGurk effect. Infants were habituated to an audiovisual presentation of /va/. The infants were then presented with two different dishabituation stimuli (audio /ba/ - visual /va/ and audio /da/ - visual /va/) that exhibit the McGurk effect in adults. The results suggested that the infants were drawn to the stimulus that exhibited the habituated /va/ (Rosenblum, Schmuckler, & Johnson, 1997).

The McGurk effect has been shown to occur under many different circumstances, but does it occur with words as opposed to sounds? Several studies have used words as stimuli, but they have reported conflicting results. A study done at Boston College concluded that the McGurk effect was not present when words were used (Easton & Basala, 1982). On the other hand, a study done at Dartmouth College reported that words do exhibit the McGurk effect (Dekle, Fowler, & Funnell, 1992). To follow up on the results of these previous studies, this experiment will investigate the effectiveness of words in eliciting the McGurk effect. Green and Gerdeman (1995) found that matching vowels in the auditory and visual stimuli caused a stronger McGurk effect than nonmatching vowel combinations. Based on these findings, the hypothesis is that when the correct vowel combinations are used in the auditory and visual stimuli, the McGurk effect will occur with words as well as with monosyllabic sounds. The subjects will be tested in four different treatment conditions formed by crossing stimulus type (words versus monosyllabic sounds) with vowel match (matching versus nonmatching vowels). The dependent variable will be the accuracy with which the subjects identify the auditory stimuli and will be measured using an identification test. The subjects will be asked to record what they hear, not what they see (Easton & Basala, 1982). This is an attempt to decrease the influence of the subject's ability to lipread on their reporting accuracy.



At least 60 undergraduate students from Stephen F. Austin State University will participate for course credit in a psychology course. The subjects will be required to have no speech, language, or hearing problems and to have normal or corrected-to-normal vision.


The auditory and visual stimuli will be presented using a Sony 24-inch color television and videocassette recorder combination. Headphones will be available so that the subjects can listen to the auditory stimuli without interference. No other auditory devices will be used to amplify the sound. The stimuli will be presented by a female speaker with no previous experience, who will be filmed (prior to the experiment) against a blank white background. The speaker will first be filmed reading an introductory set of instructions, which are provided so that the subjects may become comfortable and arrange themselves at their stations. The female speaker will then provide the set of stimuli for each level of the independent variable with a small pause between each one. The speaker will be filmed using a personal camcorder and tripod. The same speaker will be used to record the discrepant auditory stimuli, which will then be dubbed onto the videocassette in synchrony with the visual stimuli.


The subjects will enter the testing room and be asked to select one of the ten TV/VCR stations at which to complete the experiment. The consent form will be read aloud by the experimenter and then signed by the subjects. Copies of the consent form will be available for the subjects at the front of the room after the experiment. Subjects will then be read a set of instructions by the experimenter and asked if they have any questions before beginning. Subjects will be instructed to put on their headphones and begin the videotape by pushing play on the TV/VCR display. During the introductory instructions on the videotape, the subjects will be instructed to adjust their seats so that they are comfortable. Each subject will then complete each of the four treatment conditions using the identification test and writing utensil at his or her station. Each treatment condition will contain five auditory/visual combinations. Each combination will be repeated three times by the female speaker before moving on to the next combination. The combinations for each treatment condition are listed in Table 1. The monosyllabic sound combinations were taken from Green and Gerdeman's (1995) study on the discrepancies of vowels in audio-visual stimuli, in addition to those provided by the researcher. The word combinations were taken from Dekle, Fowler, and Funnell's (1992) study using words to examine the McGurk effect, also in addition to those provided by the researcher. The identification test will simply be a numbered sheet of paper with a space for the subjects to write what sound or word they heard for each combination. Upon completing the experiment, the participants will be read a debriefing form aloud. Copies of the debriefing form will also be available following the experiment. The test forms will be collected, the blue cards handed out, and the subjects excused.
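The presentation schedule described above (four treatment conditions, five combinations per condition, three repetitions per combination) can be tallied with a short sketch. The condition labels, the function name, and the fixed condition order are assumptions made for illustration only; the proposal does not specify an order, and the actual stimuli belong to Table 1, so items are represented here by number.

```python
from itertools import product

# Counts taken from the procedure; the labels below are illustrative only.
STIMULUS_TYPES = ("sound", "word")
VOWEL_MATCH = ("matching", "nonmatching")
COMBINATIONS_PER_CONDITION = 5
REPETITIONS = 3


def build_schedule():
    """Return one (condition, item_number, repetition) tuple per presentation."""
    schedule = []
    for stim_type, vowels in product(STIMULUS_TYPES, VOWEL_MATCH):
        condition = f"{stim_type}/{vowels}"
        for item in range(1, COMBINATIONS_PER_CONDITION + 1):
            for rep in range(1, REPETITIONS + 1):
                schedule.append((condition, item, rep))
    return schedule


schedule = build_schedule()
# One written response per combination, regardless of how often it is repeated.
response_slots = {(cond, item) for cond, item, _ in schedule}
print(len(schedule), "presentations,", len(response_slots), "response slots")
```

Under these counts each subject would see 60 presentations and fill in 20 numbered spaces on the identification test.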


The experiment will use a 2 x 2 within-subjects design. The independent variables will be whether the stimulus is a monosyllabic sound or a word, and whether the vowels in the auditory and visual stimuli are the same. The dependent variable will be the accuracy with which the subjects are able to identify the discrepant stimuli and will be measured using an identification test.
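One way the accuracy measure could be computed for a single treatment condition is sketched below. It assumes that a response counts as correct only when it matches the auditory stimulus exactly after trimming slashes, whitespace, and case; the function name and the sample responses are hypothetical and are not data from this study.

```python
def score_condition(responses, auditory_targets):
    """Return the proportion of responses that match the auditory stimulus."""
    if not auditory_targets or len(responses) != len(auditory_targets):
        raise ValueError("need one response per auditory target")

    def normalize(text):
        # Treat "/BA/", "ba" and " ba " as the same written response.
        return text.strip().strip("/").lower()

    hits = sum(
        normalize(resp) == normalize(target)
        for resp, target in zip(responses, auditory_targets)
    )
    return hits / len(auditory_targets)


# Hypothetical example: five trials that all carry an auditory /ba/.
targets = ["ba", "ba", "ba", "ba", "ba"]
written = ["ba", "da", "BA", "da", "da"]
print(score_condition(written, targets))
```

Under this scoring rule, a lower proportion in a condition would indicate a stronger visual influence on what the subject reports hearing.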

Results & Discussion

Before any conclusions can be drawn about the results of this experiment, certain analyses must be done to establish their significance. An ANOVA summary table for a completely within-subjects factorial design should be completed. Depending on the results of the ANOVA, which are predicted to be significant, further analysis of the data should be completed. Main effects should be calculated, and these are also predicted to be significant. Main comparisons, simple comparisons, or simple effects will be calculated if the ANOVA yields significant results. Comparisons should be done on several levels to analyze the effectiveness of the matching vowel combinations. The first comparison should be between the matching vowel sounds and the nonmatching vowel sounds. The same comparison should also be done between the matching vowel words and the nonmatching vowel words. Comparing these treatment conditions will allow us to come to a conclusion about the effectiveness of matching vowel combinations on the McGurk effect. If the hypothesis is correct, there should be a significant difference between the scores of the matching and nonmatching vowel combinations in both the monosyllabic sounds and the words. Several reasons could be suggested if the results of the comparison are not significant. One such reason could be that the vowels have very little to do with the McGurk effect. Another reason could be that vowels as well as consonants should be taken into account, as was suggested by MacDonald and McGurk (1978). Another comparison that should be done is on the difference in scores between the nonmatching sounds and the nonmatching words. Again, the same comparison should be done between the matching sounds and the matching words. This comparison would help us to determine how strongly the words elicited the McGurk effect compared to the monosyllabic sounds in both matching and nonmatching vowel combinations. A significant result would suggest that words are able to elicit the McGurk effect effectively as compared with the already successful monosyllabic sounds. A nonsignificant result would suggest that, compared to sounds, words have no impact on the degree to which the McGurk effect is exhibited.
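The main effects and interaction of a 2 x 2 within-subjects design can be sketched as follows. Because each effect in this design has a single degree of freedom, its F ratio equals the squared one-sample t statistic computed on a per-subject contrast score; this shortcut holds only for the 2 x 2 case and is offered as an illustration, not as the analysis procedure the study must follow. The function names and the order of scores within each tuple are assumptions.

```python
from statistics import mean, variance


def contrast_f(contrast_scores):
    """Return (F, (df_effect, df_error)) for one single-df within-subjects effect."""
    n = len(contrast_scores)
    if n < 2:
        raise ValueError("need at least two subjects")
    var = variance(contrast_scores)  # sample variance (n - 1 in the denominator)
    if var == 0:
        raise ValueError("contrast scores have zero variance; F is undefined")
    # F = t squared, with t = mean / (sd / sqrt(n))
    return n * mean(contrast_scores) ** 2 / var, (1, n - 1)


def anova_2x2_within(scores):
    """scores: one (sound_match, sound_nonmatch, word_match, word_nonmatch)
    tuple of accuracy scores per subject (assumed ordering)."""
    stimulus = [(a + b) - (c + d) for a, b, c, d in scores]
    vowel = [(a + c) - (b + d) for a, b, c, d in scores]
    interaction = [(a - b) - (c - d) for a, b, c, d in scores]
    return {
        "stimulus type": contrast_f(stimulus),
        "vowel match": contrast_f(vowel),
        "interaction": contrast_f(interaction),
    }
```

Each F would then be compared with the critical value for 1 and n - 1 degrees of freedom, and the follow-up comparisons described above (for example, matching versus nonmatching vowels within words only) can be tested in the same way on the corresponding difference scores.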

Although many areas concerning the McGurk effect have been studied, further research is needed in several areas. One of the least understood aspects of the McGurk effect is the actual process in the brain that leads to the combined perception of the auditory and visual stimuli. A clearer understanding of the sensory integration system could be obtained by investigating where and how the brain combines this information. Perhaps by using equipment such as an MRI or PET scanner, researchers may be able to localize brain activity while subjects are experiencing the McGurk effect. Using this information, we may begin to manipulate the responses. Another area that should be studied further is which consonant combinations elicit the most pronounced McGurk effect. By combining information learned from studies of vowel combinations with information from studies of consonant combinations, researchers could elicit the McGurk effect using the combinations that provide maximum results.


The McGurk effect [Electronic version] (1998). American Scientist: The Magazine of Sigma Xi. Retrieved March 2, 2002 from www.sigmaxi.org

Dekle, D.J., Fowler, C.A., & Funnell, M.G. (1992). Audiovisual integration in perception of real words. Perception & Psychophysics. Vol 51(4), 355-362.

Easton, R.D. & Basala, M. (1982). Perceptual dominance during lip-reading. Perception & Psychophysics. Vol 32(6), 562-570.

Green, K.P. & Gerdeman, A. (1995). Cross-modal discrepancies in coarticulation and the integration of speech information: The McGurk effect with mismatched vowels. Journal of Experimental Psychology: Human Perception and Performance. Vol 21(6), 1409-1426.

Green, K.P., Kuhl, P.K., Meltzoff, A.N., & Stevens, E.B. (1991). Integrating speech information across talkers, gender, and sensory modality: Female faces and male voices in the McGurk effect. Perception & Psychophysics. Vol 50(6), 524-536.

MacDonald, J. & McGurk, H. (1978). Visual influences on speech perception processes. Perception & Psychophysics. Vol 24, 253-257.

McGurk, H. & MacDonald, J. (1976). Hearing lips and seeing voices: A new illusion. Nature. Vol 264, 746-748.

Munhall, K.G., Gribble, P., Sacco, L., & Ward, M. (1996). Temporal constraints on the McGurk effect. Perception & Psychophysics. Vol 58(3), 351-362.

Rosenblum, L.D., Schmuckler, M.A., & Johnson, J.A. (1997). The McGurk effect in infants. Perception & Psychophysics. Vol 59(3), 347-357.

Saldana, H.M. & Rosenblum, L.D. (1993). Visual influences on auditory pluck and bow judgments. Perception & Psychophysics. Vol 54(3), 406-416.

University of California Perceptual Science Lab (1998). BA + GA = DA. Retrieved March 2, 2002 from http://mambo.ucsc.edu/psl

University of California at Riverside Audiovisual Speech Web-Lab (2001). The McGurk effect. Retrieved March 2, 2002 from www.psych.ucr.edu/



Table 1

Sounds/Matching Vowels    Words/Matching Vowels    Sounds/Nonmatching Vowels    Words/Nonmatching Vowels