by Sonia Schmidt

Nominated by Annie Santamaria for ANTHRBIO 368: Introduction to Primate Behavior

Instructor Introduction

In ANTHRBIO 368: Introduction to Primate Behavior, students learn how our closest living relatives behave, communicate, and navigate their social and ecological worlds. For their major assignment, they draw on that course knowledge to write a review paper on a question focused on nonhuman primate behavior—topics on our suggested list are complex, but open-ended enough that many could easily become the starting point for a dissertation. Sonia Schmidt’s “Washoe, Koko, Kanzi: Who Wins?” is an outstanding example of the kind of thoughtful synthesis of the literature this assignment can produce. With impressive clarity, Sonia uses three landmark case studies—Washoe, Koko, and Kanzi—to build a comparative argument about what counts as language and communication, how evidence should be evaluated, and where the strongest findings truly lie. The result is a well-crafted essay that tackles an age-old question: Do our closest living relatives use language to express complex ideas and emotions much like we do or are they simply aping us (pun intended)? I hope you enjoy Sonia’s piece as much as I did.

— Annie Santamaria

Washoe, Koko, Kanzi: Who Wins?

Overview

We can’t agree on a universal language among ourselves – yet we’ve spent decades expecting apes to learn ours. Across the past century, researchers have tried nearly every method to bridge that gap: some even attempted to teach apes to physically speak, while others turned to sign language, pointing systems, or symbol boards in search of something undeniably human – language. Some efforts failed entirely; others revealed moments of actual understanding that blurred the boundary between communication and language.

While communication broadly refers to the transmission of signals that changes another individual’s behavior – a process that is shared by all living organisms – human communication operates at a far greater level of complexity (Proust, 2016). Unlike non-human primate vocalizations, which are innate, involuntary, emotionally driven, and lacking in syntactic organization, human language is learned and voluntarily produced, with meaning encoded semantically and syntactically (Rendall, 2021). The central question, at this point, is whether apes' learned communicative abilities legitimately qualify as language or remain a sophisticated form of conditioned signaling. To address this, I will compare three major ape-language research projects – Washoe’s sign language learning and that of her successors (chimpanzee), Koko’s alleged signing and its controversies (gorilla), and Kanzi’s lexigram comprehension (bonobo) – along with recent field studies on chimpanzee and bonobo call combinations, to evaluate which most closely approximates human language abilities.

Background

Research on communication and language in animals often falls into two complementary frameworks. One focuses on observed behavior, such as signals shaped by stimulus-response relationships, conditioning, and reward. The other takes a more cognitive perspective, suggesting that some species can use symbols intentionally and understand meaning beyond the immediate context. Human language, however, goes a step further by relying on intentional meaning-making, involving structured patterns, shared symbols, and the ability to create an endless variety of new expressions (Rendall, 2021). Assessing these abilities in apes is complicated: their apparent “success” in laboratory or home settings may result from reinforcement or training under human influence rather than genuine linguistic comprehension, while communication in wild populations is shaped by entirely different social and ecological conditions. This developmental context is highly relevant when evaluating ape achievements: a recent comparative study shows that human infants receive vastly higher rates of direct vocal communication from caregivers than any other great ape – almost 400 times more than bonobos and 69 times more than chimpanzees – while surrounding vocal input remains comparable across species (Wegdell et al., 2025). This suggests that the strong, targeted input that supports human language learning is not shared across the primate lineage but rather evolved later as a uniquely human developmental feature.

Early ape language studies provided a solid foundation of evidence that some apes can acquire and use symbolic systems to a limited extent. Washoe, a chimpanzee, was the first to acquire over 100 American Sign Language (ASL) gestures from humans, and her success paved the way for later studies of signing chimpanzees, like Tatu and Loulis, who continued to use and expand their vocabularies in sanctuary environments (Jensvold et al., 2023). Koko, a gorilla also trained in ASL, became one of the most celebrated examples of ape-human communication through her expressed emotions and desires; however, her project sparked controversy over limited verification and claims that many signs reflected imitation or human prompting rather than genuine language use (Patterson & Gordon, 2002; Terrace et al., 1979). Kanzi, a bonobo raised among humans, learned to comprehend spoken English and communicate through visual lexigrams, demonstrating sensitivity to syntax and even the ability to process degraded and computer-generated speech (Schoenemann, 2022; Lahiff et al., 2022). While no single ape project perfectly captures the full range of human linguistic capabilities, Washoe, Koko, and Kanzi each exhibit distinct aspects of cognitive communication. Among them, Kanzi stands out for his exposure-based learning, sensitivity to syntax, and ability to process spoken language, offering the most compelling evidence of an ape approaching the structural and cognitive complexity of human language.

Washoe

Washoe’s acquisition of American Sign Language (ASL) in 1966 marked the first true demonstration of symbolic learning in a nonhuman primate, challenging long-held assumptions that apes were incapable of voluntary and intentionally meaningful communication. Beginning at approximately 10 months old, she was raised in a cross-fostering environment, immersed in daily human interaction and language-heavy settings comparable to a human child’s language exposure, and reached over a hundred signs within her first four years (Erbaba et al., 2025). Her rapid acquisition in infancy ultimately grew into a vocabulary of 245 signs, which she used to refer to people, objects, activities, and even internal states in contextually appropriate ways, demonstrating intentional and referential communication rather than conditioned responses (Jensvold et al., 2023).

This emerging intentionality marked a shift away from the traditional behaviorist views that characterized ape communication as purely reflexive and stimulus-based, suggesting instead that symbolic reference could underlie some primate signals (Rendall, 2021). However, Washoe’s utterances rarely exceeded two or three signs and did not show the hierarchical syntax or combinatorial structure characteristic of human language beyond the developmental level of a two-year-old child – a plateau consistently observed across language-trained apes (LPAs) (Erbaba et al., 2025). These limitations in Washoe and other LPAs underscore Rendall’s (2021) argument that ape communication, even when symbolically trained, remains limited to affective and self-directive functions rather than representational systems. This is the gap that Kanzi’s abilities, explored later, come closest to closing.

Washoe’s communicative legacy extended beyond her own learning when she adopted another chimpanzee, Loulis, who was studied alongside Tatu (another human-taught signing chimpanzee); both continued to use ASL signs years after the original studies ended. Jensvold et al. (2023) found that Tatu maintained a repertoire of 138 different signs between 2014 and 2021, demonstrating long-term retention decades after her initial training. Even more remarkably, Loulis eventually learned 51 signs directly from other chimpanzees during a period when humans were restricted to only seven permitted signs (WHO, WHAT, WHERE, WHICH, WANT, SIGN, and NAME), providing the clearest evidence of ape-to-ape transmission of a human language system (Erbaba et al., 2025; Jensvold et al., 2023). Unlike wild chimpanzees, who rarely receive communication intentionally directed toward them, Washoe grew up in a human-like environment rich in interactive speech and gestures – exposure that likely played a key role in her exceptional symbolic development (Wegdell et al., 2025).

Herbert Terrace, the psychologist behind the Nim Chimpsky project, offered one of the most influential critiques of ape language research, directly challenging interpretations of earlier studies such as Washoe’s. After analyzing more than 19,000 utterances by Nim, the chimpanzee at the center of the study, Terrace found that they lacked grammar, spontaneity, and genuine conversational turn-taking (Terrace et al., 1979). Most were prompted by trainers or shaped by rewards, suggesting imitation rather than true understanding. However, Jensvold et al.’s (2023) long-term study challenges this claim: even decades after formal instruction ended, Tatu and Loulis still used many of their ASL signs in their daily lives. The chimpanzees signed to each other even when no humans were present, and their continued and appropriate use of signs suggests they were communicating intentionally – not just copying behavior they were taught (Jensvold et al., 2023).

Also in contrast to Terrace’s judgement, this sustained evidence of chimpanzee-to-chimpanzee teaching and learning suggests that symbolic systems can be socially maintained and shared within chimpanzee groups. Some recent field studies further support this interpretation, showing that wild chimpanzee populations have population-specific gestural and vocal “dialects” shaped by social learning, reinforcing the bigger idea that the foundation of shared symbolic behavior is already embedded in natural ape communication (Kalan et al., 2023). Complementing these findings, wild chimpanzees have recently been shown to combine calls into two-call ‘bigrams’ that expand or shift meaning depending on the combination or order of calls, demonstrating multiple forms of combinatorial meaning-building in their natural communication system (Girard-Buttoz et al., 2025). Altogether, Washoe’s intentional use of ASL, her cultural transmission of signs to other chimpanzees, and comparable behaviors found in wild chimpanzee populations reveal that the social and cognitive foundations necessary for meaning are present in apes. Even though these apes lack the syntactic and generative capabilities of human language, the findings lay groundwork for scientists to build on in other ape-language programs.

Koko

Project Koko, initiated in 1972 by Francine Patterson, aimed to determine whether a gorilla could acquire symbolic communication comparable to that demonstrated by signing chimpanzees like Washoe. Over nearly three decades, Koko (a western lowland gorilla) was immersed daily in an environment of ASL and spoken English intended to support bilingual communicative development (Patterson & Gordon, 2002). She developed an emitted vocabulary exceeding 1,000 signs and a working vocabulary of over 500 signs, with vocabulary growth slowing from approximately 80 words per year at the start to roughly 35 words per year as her acquisition rate leveled off (Patterson & Gordon, 2002). However, these figures rely on Patterson's reports, and more conservative analyses using stricter inclusion criteria estimate her vocabulary at approximately 260 signs (Erbaba et al., 2025). Beyond sign quantity, Patterson documented qualitative behaviors: Koko creating new signs on her own, adjusting known ones to change meaning, combining signs into new phrases, using them to talk about things that weren’t present, and expressing emotion – all behaviors Patterson interprets as evidence of flexible, referential communication (Patterson & Gordon, 2002).

At the same time, Koko became the center of the broader controversy about what “counts” as language in ape projects. Terrace’s critiques – that utterances were short, prompted, and not spontaneous – were applied to Koko as well, especially given the observation that 100% of Koko’s signs had been signed by the teacher immediately beforehand (Terrace et al., 1979). Unlike later work with Kanzi, Project Koko did not incorporate Terrace’s recommended safeguards – such as blind scoring, full-corpus video analysis, or controls against trainer prompting – making it difficult to separate genuinely spontaneous sequences from those shaped by subtle cuing or selective reporting (Terrace et al., 1979). Many of Koko’s most publicized statements – particularly those describing emotions or discussing her kittens – require contextual glossing by trainers and rarely exhibit consistent hierarchical syntax or generativity (Patterson & Gordon, 2002; Rendall, 2021). As a result, it remains debatable whether her utterances reflected genuine combinatorial intent or were selectively amplified through human interpretation.

Even so, Koko’s behavior occupies a meaningful space between conditioned signaling and human-like language. Koko’s reported signing about absent objects, labeling of internal states (such as “HAPPY,” “SAD,” or frequently “LOVE”), and navigation of social interaction through language suggest at least some referential and affective use rather than mere echoing (Patterson & Gordon, 2002). In this broader comparative context, Koko followed the same general pattern seen in other LPAs: she developed symbolic signs, expressed emotion through them, and occasionally combined them, yet her utterances remained short and loosely structured, mirroring the typical plateau at a developmental stage similar to that of young children rather than expanding into recursive grammar (Erbaba et al., 2025; Rendall, 2021). Thus, Koko demonstrates that gorillas can participate in shared symbolic systems when raised in rich communicative environments, while simultaneously illustrating the methodological weaknesses that critics warn against, setting the stage for Kanzi as a stronger test case for syntax-sensitive comprehension.

Kanzi

Unlike Washoe and Koko, who required explicit human instruction, Kanzi spontaneously acquired lexigrams (abstract visual symbols) through immersion in a language-rich environment and used them to communicate his desires and observations. His case indicates that symbolic communication in apes can emerge through observational learning rather than directed training. In formal testing, Kanzi was presented with 660 novel spoken English commands – many of them unusual or semantically complex (e.g., “put the pine needles in the refrigerator”) – and his accuracy on these trials demonstrated comprehension that extended well beyond simple symbol-reward mapping (Schoenemann, 2022). His performance across hundreds of blind trials, in which the experimenter delivered commands from behind a one-way mirror, indicates that his responses were not the product of unintentional human cueing. Although Kanzi himself was not the subject of the neuroimaging analyses reviewed by Erbaba et al. (2025), they report that LPAs as a group exhibit increased connectivity between temporal and frontal brain regions homologous to human language regions (Broca’s and Wernicke's), implying that language training activates neural pathways already present in ape brains rather than creating entirely new ones.

Kanzi reliably interpreted novel spoken instructions built on reversible sentence structures – where meaning depends on word order rather than content – demonstrating sensitivity to syntactic relationships, not just vocabulary. When tested on 20 pairs of reversible English sentences, in which only word order determines meaning (e.g., “Pour the Coke in the lemonade” vs. “Pour the lemonade in the Coke”), Kanzi responded correctly on 31 of the 40 total sentences (78%), partially correctly on 7 (18%), and incorrectly on only 2 (5%), showing a non-random pattern that persisted even under stricter coding by critics (Schoenemann, 2022).
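As a rough check on the arithmetic behind these figures, the reported counts can be rerun in a few lines. The sketch below recomputes the three percentages and then asks how likely 31 or more fully correct responses would be under pure guessing. The 50% guessing rate and the choice to count partially correct responses as failures are illustrative assumptions, not figures taken from Schoenemann (2022), and the function names are hypothetical.

```python
from math import comb

# Counts reported in the text for Kanzi's 40 reversible sentences (20 pairs).
CORRECT, PARTIAL, INCORRECT = 31, 7, 2
TOTAL = CORRECT + PARTIAL + INCORRECT


def percent(count: int, total: int = TOTAL) -> float:
    """Return count as a percentage of total."""
    return 100 * count / total


def binomial_upper_tail(successes: int, trials: int, p_chance: float) -> float:
    """Exact P(X >= successes) for X ~ Binomial(trials, p_chance)."""
    return sum(
        comb(trials, k) * p_chance**k * (1 - p_chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# ASSUMPTION (not from the source): each sentence has a 50% chance of being
# acted out correctly by guessing, and partial responses count as failures.
P_CHANCE = 0.5
p_value = binomial_upper_tail(CORRECT, TOTAL, P_CHANCE)

print(f"correct:   {percent(CORRECT):.1f}%")
print(f"partial:   {percent(PARTIAL):.1f}%")
print(f"incorrect: {percent(INCORRECT):.1f}%")
print(f"P(at least {CORRECT} correct by guessing) = {p_value:.5f}")
```

The unrounded shares are 77.5%, 17.5%, and 5%, which is why the rounded figures in the text sum to 101%. Under the stated assumptions, the chance of reaching 31 correct responses by guessing falls below 0.001; a different chance baseline or scoring rule would change that number, so it should be read as an illustration rather than a reanalysis.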

Kanzi’s comprehension extended beyond natural human speech to highly degraded and even computer-generated forms, showing auditory abstraction far beyond associative learning. In testing with natural, noise-vocoded, and sinusoidal forms of speech, he correctly recognized the majority of words across all conditions, performing best with natural voices but still identifying degraded and synthetic versions of familiar words well above chance (Lahiff et al., 2022). His ability to recognize words from which phonetic cues had been stripped shows that he processed the underlying structure of speech sounds instead of relying on surface patterns, mirroring his earlier success interpreting reversible English sentences. Kanzi’s performance across both degraded auditory input and grammatical constructions therefore reveals a flexible and integrated comprehension system that can combine symbolic meaning and syntactic structure (Lahiff et al., 2022).

Recent field research strengthens this connection between Kanzi’s symbolic abilities and the natural communication of his species. Bonobo vocal sequences exhibit nontrivial compositionality, meaning that the order and combination of calls change the overall meaning in systematic, syntax-like ways (Berthet et al., 2025). This natural structuring parallels the logic underlying Kanzi’s comprehension of reversible English sentences, in which he consistently interpreted commands such as the Coke and lemonade instructions according to word order rather than associative cues (Schoenemann, 2022). Both findings appear to point to the same underlying cognitive principle: an ability to integrate discrete elements into hierarchically meaningful structures. However, while wild bonobos exercise this ability through natural vocal calls in social settings, Kanzi extends it symbolically by using visual lexigrams and spoken words to express and understand meaning.

Why Kanzi

Among all ape language projects, Kanzi provides the clearest evidence of an ape approaching the structural and cognitive complexity of human language. Unlike Washoe or Koko, whose achievements depended on explicit instruction and close human prompting, Kanzi acquired lexigrams and spoken English comprehension through immersion and observation alone – mirroring the naturalistic, exposure-based learning that characterizes early human development. Kanzi's interpretation of reversible sentences and his accurate responses to degraded and artificially generated speech both demonstrate that he processed underlying linguistic structure rather than relying on associative learning or surface cues, an ability no other ape has been shown to exhibit in experimental testing (Schoenemann, 2022; Lahiff et al., 2022). This is further supported by neural evidence of enhanced language-region connectivity in trained apes, which points to deeper linguistic processing that parallels human comprehension and reflects the compositional capacities found in wild bonobo communication (Erbaba et al., 2025; Berthet et al., 2025). In sum, while all three programs reveal meaningful human-like linguistic capacities, Kanzi most closely approximates human language – not merely in vocabulary, but in the syntax, abstraction, and flexible comprehension that the other two lacked.

Conclusion

Despite decades of ape language research, significant methodological and theoretical gaps remain. The three major projects examined in this paper – Washoe, Koko, and Kanzi – employed different training methods, testing protocols, and evaluation criteria, making direct cross-species comparisons difficult. Washoe and Koko's projects relied heavily on sign language production without rigorous blind testing, while Kanzi's research emphasized comprehension under controlled conditions.

With that in mind, future research should establish standardized protocols that include comparable metrics across species, controls for trainer cueing, and blind scoring to ensure that observed differences between species reflect actual cognitive capacities rather than inconsistent testing. Additionally, the recent discoveries of compositional structure in wild bonobo and chimpanzee vocalizations create an urgent need to bridge field observations with laboratory findings. If we understand how natural communicative capabilities relate to trained symbols, we can better determine which aspects of ape language abilities are innate and which are culturally acquired (Berthet et al., 2025; Girard-Buttoz et al., 2025).

Perhaps the biggest gap in the existing findings is that although apes can combine symbols and calls in meaningful sequences, no laboratory language-training study has demonstrated recursion or truly hierarchical embedding – the core property that allows human grammar to expand to infinite depth. Kanzi’s success with reversible sentence structures shows sensitivity to word order, but testing for genuine syntax requires demonstrations that go beyond linear two-element combinations and can reveal the kind of nested structures that humans produce. Future work should design tests suited to these questions while also confronting a harder dilemma: the cognitive and social costs of raising apes in linguistically intensive human environments.

References

Berthet, M., Surbeck, M., & Townsend, S. W. (2025). Extensive compositionality in the vocal system of bonobos. Science, 388(6742), 104–108. https://doi.org/10.1126/science.adv1170

Erbaba, B., Sinha, M., Guevara, E. E., Hecht, E. E., Hopkins, W. D., & Sherwood, C. C. (2025). Insights from language-trained apes: Brain network plasticity and communication. Evolutionary Anthropology: Issues, News, and Reviews, 34(3), e70018. https://doi.org/10.1002/evan.70018

Girard-Buttoz, C., Neumann, C., Bortolato, T., Zaccarella, E., Friederici, A. D., Wittig, R. M., & Crockford, C. (2025). Versatile use of chimpanzee call combinations promotes meaning expansion. Science Advances, 11(19), eadq2879. https://doi.org/10.1126/sciadv.adq2879

Jensvold, M. L., Dombrausky, K., & Collins, E. (2023). Sign language studies with chimpanzees in sanctuary. Animals, 13(22), 3486. https://doi.org/10.3390/ani13223486

Kalan, A. K., Nakano, R., & Warshawski, L. (2023). What we know and don't know about great ape cultural communication in the wild. American Journal of Primatology, 87(1), e23560. https://doi.org/10.1002/ajp.23560

Lahiff, N. J., Slocombe, K. E., Taglialatela, J., Dellwo, V., & Townsend, S. W. (2022). Degraded and computer-generated speech processing in a bonobo. Animal Cognition, 25(6), 1393–1398. https://doi.org/10.1007/s10071-022-01621-9

Patterson, F. G. P., & Gordon, W. (2002). Twenty-seven years of Project Koko and Michael. In B. M. F. Galdikas, N. E. Briggs, L. K. Sheeran, G. L. Shapiro, & J. Goodall (Eds.), All apes great and small (Vol. 1, pp. 165–176). Developments in Primatology: Progress and Prospects series. Springer, Boston, MA. https://doi.org/10.1007/0-306-47461-1_15

Proust, J. (2016). The evolution of primate communication and metacommunication. Mind & Language, 31(2), 177–203. https://doi.org/10.1111/mila.12100

Rendall, D. (2021). Aping language: Historical perspectives on the quest for semantics, syntax, and other rarefied properties of human language in the communication of primates and other animals. Frontiers in Psychology, 12, 675172. https://doi.org/10.3389/fpsyg.2021.675172

Schoenemann, P. T. (2022). Evidence of grammatical knowledge in apes: An analysis of Kanzi’s performance on reversible sentences. Frontiers in Psychology, 13, 885605. https://doi.org/10.3389/fpsyg.2022.885605

Terrace, H. S., Petitto, L. A., Sanders, R. J., & Bever, T. G. (1979). Can an ape create a sentence? Science, 206(4421), 891–902. https://doi.org/10.1126/science.504995

Wegdell, F., Fryns, C., Schick, J., Nellissen, L., Laporte, M., Surbeck, M., van Noordwijk, M. A., Masi, S., Hellwig, B., Willems, E. P., Zuberbühler, K., van Schaik, C. P., Stoll, S., & Townsend, S. W. (2025). The evolution of infant-directed communication: Comparing vocal input across all great apes. Science Advances, 11(26). https://doi.org/10.1126/sciadv.adt7718