Tokyo at night. Zengame/CC BY 2.0

Nearly every language and every culture has what are called “filled pauses,” a notoriously difficult-to-define concept that generally refers to sounds or words that a speaker uses when, well, not exactly speaking. In American English, the most common are “uh” and “um.”

Until about 20 years ago, few linguists paid filled pauses much attention. They were seen as not very interesting, a mere expulsion of sound to take up space while the speaker figures out what to say next. (In Russian, filled pauses are called “parasite sounds,” which is kind of rude.) But since then, interest in filled pauses has exploded. There are conferences about them. Researchers around the globe, in dozens of different languages, dedicate themselves to studying them. And yet they remain poorly understood, especially as new forms of discourse begin popping up.

When a Twitter user writes, “Is this what Trump meant by having Mexico pay for a wall? Because uh…it doesn’t work like that” followed by an emoji of a frog and then an emoji of a cup of coffee, it throws everything into doubt. Like most other things about filled pauses, the Twitter usage is simultaneously transparent and opaque: we know exactly what it means, but when asked to explain it, or analyze it? It turns out we really don’t know.

But researchers digging into the weird world of filled pauses have turned up some crazy, fascinating stuff. Some have taken sentences full of “ums” and “uhs” and edited them out to find out if people react more positively to someone who doesn’t use them. (They do.) Some are putting people in MRI machines to find out what weird neural stuff is going on when people use filled pauses. (Definitely some stuff.) And in Japan, researchers are trying to puzzle out how and why Japanese filled pauses are so unusual.

There is a wide and contentious debate about what a filled pause even is. Ralph Rose, a professor at Waseda University in Tokyo who maintains a site called the Filled Pause Research Center, says he tends to use different definitions based on whatever he’s studying at that moment. “I can’t give a definition that I would say most researchers would agree on,” he says. Generally speaking, filled pauses are filed under a broader umbrella of “hesitation markers,” which are words or sounds that indicate…well, something.

MRI machines have been used to analyze neural activity during filled pauses. Jan Ainali/CC BY 3.0

Some filled pauses, in some situations, might be used to indicate a delay. They tell the listener, hey, I’m not totally sure what’s coming next, but I’m not done speaking, so don’t interrupt me. Some filled pauses are actually words in their own right: in English, “like,” “you know,” and “so” can be used as filled pauses. Those words have meanings, but when used to fill a pause, they’re not exactly to be interpreted as having the meaning they’d normally have. When someone says, “And then we went to…you know…the grocery store,” they’re not asking you to chime in and confirm that you do in fact know that we went to the grocery store. It’s just there, taking up time. Sometimes those words aren’t even pronounced the same way; “like,” for example, is more likely to have its final “k” sound dropped when the word is used as a filled pause.

Sometimes, as in the Twitter use, they’re used to signal something to the reader (or listener, as the case may be). In that tweet above? That’s signaling that the conclusion (“it doesn’t work like that”) should be obvious. That’s a completely different use case than using it to indicate a delay.

A word such as “like,” for example, might indicate that the statement that follows it shouldn’t be taken totally seriously. “You know” could be used like the Canadian “eh,” to encourage solidarity between speaker and listener.

Though some researchers have insisted that filled pauses are individual words in their own right, with distinct meanings, many believe that there’s something more fundamental about them. With a few exceptions, filled pauses exist in every language, and are weirdly similar. In English, it’s “uh” or “um,” in Mandarin it’s “en,” in French it’s “euh,” in Hindi it’s “hoonm,” in Swedish “ohm.”

These are all very similar; essentially, they’re a central vowel which may or may not be followed by a nasal consonant. Let’s unpack that for a sec: one way vowels are described by linguists is in terms of where the tongue is in the mouth when the vowel is made. You can kind of look at the position of the tongue when making all the available vowels in a given language, and if you take, roughly, the middle one? That’s a central vowel. A nasal consonant is one that’s expressed through the nose rather than the mouth; in English, those are “m” and “n.”

There are very few elements of language that are consistent amongst English, Mandarin, French, Hindi, and Swedish. And yet this one is pretty much the same.

We don’t really know where filled pauses came from, partly because, Twitter aside for the moment, they are oral sounds, and very unlikely to be found in historical written records. (Scholars have the same problem with swear words.) “Despite the lack of records about historical filler usage, it’s probably safe to assume that fillers have always been a part of human language,” says Katharine Hilton, a linguist at Stanford University who studies (among other things) filled pauses. “The reason for this is because they’re very useful words and communicate a lot of information to the listener.” The very earliest recordings of the human voice show that Thomas Edison was an avid user of “uh” and “um.” That’s about as far back as our data goes, but it seems fair to assume they go back further than that. These non-words, these mistakes, these errors: these are basic building blocks of language.

Rose’s research, of late, focuses on second-language acquisition, especially on native Japanese speakers who are learning English. If we ignore the filled pauses that are basically repurposed words (“like,” “well,” “so”), the rest are often surprisingly similar from language to language. But Japanese is different. Studies, says Rose, indicate that filled pauses in Japanese are more common than they are in English.

The most common filled pauses in Japanese, says Rose, are “ano” and “eto,” the latter of which is sometimes used without the final syllable as just “eh.” “Ano” is a repurposed word, meaning something like “that,” as in “that book,” and tends to be used in situations that call for more politeness. So far, not too crazy.

Here’s where it gets fun. Japanese has only five vowels: ah, ee, ooh, eh, and oh. (English is a particularly murderous language in terms of the quantity of vowels.) “There are some speakers who will use any of the other vowels as filled pauses,” says Rose. “The interesting fact, for most of these speakers, is that it happens to be the last vowel that they spoke.”

The equivalent of this in English would sound insane (but also sort of musical). Take this sentence: “So then I went back…uh…to my hometown…um…to see my friends…uh…who I haven’t seen in a while.” Kind of a lot of filled pauses in that sentence, but that’s roughly how it’d look in English.

Waseda University, where Ralph Rose is a professor. Takayuki Miki/CC BY-ND 2.0

In Japanese? It would be more like: “So then I went back…ahhh…to my hometown…oww…to see my friends…ehh…who I haven’t seen in a while.” How fun is that?

The reasons why people use filled pauses are tough to figure out on a case-by-case basis; largely, they’re seen as involuntary. But there are some theories about why Japanese has such an intense setup of filled pauses. One of those is that, basically, Japanese is a hard language to speak.

This comes back to something called “long-distance dependencies.” In a given sentence, you could say it has a long-distance dependency if the first word in the sentence is directly tied to a word much later on in the sentence, even the last word. English doesn’t do this very often; our setup is usually in the order of subject, then verb, then object.

Let’s take this sentence: “John saw the man who was reading a book.”

It’s a modular sentence, easy to break down. The action (saw) immediately follows the entity doing the action (John).

In Japanese, the structure of that same sentence would be more like this: “John book reading man saw.” Look how far apart the subject and the verb are! In order to speak that sentence, you basically have to know, and keep in your mind, the entire thought. You can’t stumble along as you can in English, where each subject is tied to the action it performs. By the time you get to the end, you may have forgotten what the action was supposed to be, or by whom it was done. “In English, it’s more, just, ‘I can’t remember the next word,’ rather than ‘I can’t remember what the subject of this sentence is,’” says Rose. Japanese syntax requires you to keep a whole mess of stuff in your head for a long period of time. That can be troublesome! So maybe you need a sec to remember where you were going—hence, a higher rate of filled pauses.

Filled pause research is still a fairly new linguistic subject, and not everyone is caught up. Rose, in his work in second-language acquisition, believes that filled pauses should be a significant part of language classes. After all, these…things…are going to be some of the most common sounds a student is likely to hear. They aren’t meaningless, and they aren’t the same in every language: shouldn’t learning them be standard? “Some language programs actually actively discourage filled pauses,” he says. “The advice was, don’t use them. Because if you use them too much, you sound stupid. I was floored when I read that.” There is no evidence, anywhere, that the use of filled pauses is correlated in any way with any measure of intelligence. (Rose, in fact, describes himself as “a frequent ummer.”)

But this stuff is extremely important. What could be more jarring to, say, a native speaker of English than to hear a Japanese student who is new to English say, “Could you hand me that…eeeeh-to…book?” The same would be true of an English speaker using “uh” or “um” in his or her new Japanese. It doesn’t only draw attention to your difficulty with the language: it could even negatively impact comprehension, as whoever the student is speaking to would have no idea what the student is doing. In any case, Japan’s amazing, weird filled pauses, as well as the new-ish sarcastic Twitter use, are firm messages. “Uh” isn’t just a noise.