The common words list and the real magic of writing
A 1921 study on how writers use words is stunningly accurate even today!
Omg, y’all. Sometimes you stumble across a thing and it’s so stunning that later you have no idea how you never knew or how you never noticed.
Let me tell you the craziest story.
I’d been reading about the evolution of writing out of curiosity. Because what we call good writing has changed (a lot!) over the years. And I found some truly fascinating stuff, which I’ll share in another piece because I got sidetracked by an old research study from 1921 and omg, I have to tell you about this.
I’ve been in this rabbit hole for hours and trust me, you need to see this! :)
Back in 1921, a psychologist named E. L. Thorndike published a book called The Teacher’s Word Book. It was a list of the 10,000 most commonly used words in the English language.
Here’s what he did. He took 41 different sources of text that people read. Books, mostly, the Bible and classic literature, but also some newspapers and handwritten correspondence. Letters by writers and such. Then he listed all the words used in all those materials, and how many times each word was used across all the texts.
My god, can you even imagine? He was doing that manually. In 1921.
Talk about laborious work, wow. As he went through books, letters and newspapers, he listed every word he came across. The end result was a list of ten thousand words.
Beside each word he noted how many times it had appeared across the 41 texts. Then he divided the list in half. The 5,000 most commonly used words in the English language and the next 5,000 after those.
You know what the point was?
To help teachers know which words kids need to learn and understand first to be proficient readers. Which words to learn first, which words to learn next.
Know what it made me think of? Dr. Seuss.
Before Dr. Seuss (Theodor Seuss Geisel) started writing for kids, he was a political cartoonist and copywriter. He started writing kids’ books because he thought children should have books that are fun to read, not the boring readers they got in school.
It was Thorndike’s work that inspired him.
Based on TWB (The Teacher’s Word Book), educators compiled a list of 348 words that were acceptable for the youngest readers. But the readers for kids weren’t exactly fun to read. So Dr. Seuss took that list and wrote The Cat in the Hat and proudly told his publisher it used only 236 of the words on the list. A fun book for beginner readers.
When it became a bestseller, his publisher bet him fifty bucks he couldn’t do it again using only fifty of the easiest words. He won the bet. Green Eggs and Ham uses only fifty different words. What he showed the world is that you can use the simplest words and still make writing fun. Remember that, okay?
Back to Thorndike and his TWB.
Thorndike’s work was important because it was the first time in the United States that words were listed by frequency of use across multiple English language texts.
Researchers in the fields of reading, comprehension and language arts dug into his lists. They divided the words into groups of a thousand. The thousand most commonly used words, then the next thousand, and the next, and so on.
Thorndike’s work and his word lists laid the groundwork for determining reading comprehension levels by grade.
But along the way, researchers discovered an interesting and curious thing.
In studying Thorndike’s work, researchers discovered that the first 300 words of the list accounted for nearly 65% of all written material.
Mind. Blown.
I mean think about it. Hemingway’s work, Plath’s work, Poe, King, Maya Angelou, everything you’ve ever read, everything I’ve read, everything you’ve written, everything I’ve written — 65% of it is the same 300 words? Just. Wow.
So that was me, entirely disappeared down a rabbit hole.
Because I had to see it for myself. You’re going to be so stunned. Seriously.
The first thing I did was look for a unique word counter that lets me eliminate words, because I sure as heck wasn’t feeding my writing into AI.
I found one at: https://planetcalc.com/3205/
Then I went and found the TWB list of the 300 most commonly used words.
Test time!
First, I pasted in It’s Still A Beautiful World because y’all loved that one.
The tool told me the word count of that piece is 1080 words, but it only uses 430 unique words. Because, repetition. For example, the word “the” is used 50 times. The word “and” is used 37 times. The word “he” is used 27 times, and so on.
Which makes sense, of course. Common words.
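If you'd rather not paste your writing into a website, the same count can be roughed out in a few lines of Python. This is only a sketch. The `tokenize` and `word_stats` names are made up for illustration, the sample sentence is a stand-in, and the way it splits words (letters, with apostrophes kept inside a word) is an assumption. The online tool may split text differently, so the numbers won't necessarily match.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its words.

    A "word" here is a run of letters, optionally with straight or curly
    apostrophes inside it (so "it's" stays one word). This rule is an
    assumption, not necessarily how any particular online counter works.
    """
    return re.findall(r"[^\W\d_]+(?:['’][^\W\d_]+)*", text.lower())


def word_stats(text):
    """Return (total words, unique words, Counter of word frequencies)."""
    words = tokenize(text)
    counts = Counter(words)
    return len(words), len(counts), counts


# Stand-in text; swap in your own piece.
sample = "The cat and the hat. The cat sat, and he sat on the mat."
total, unique, counts = word_stats(sample)

print(f"{total} words, {unique} unique")
print(counts.most_common(3))  # the most repeated words and their counts
```

On a real article, `most_common` is where the little words like "the" and "and" float to the top.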
But then came the fun part. There’s a little box in the tool, with a grey background, that says words to eliminate. I pasted in the 300 most commonly used words.
It wiped out 65% of my text. Not even kidding. I was stunned. Literally stunned. Staring at the screen with wtf all over my face. Only 35% of that article is words that are not in the list of 300 most commonly used words.
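Here's a rough Python sketch of that elimination step, for anyone who wants to try it offline. Everything named in it is hypothetical: `common_word_share` is an illustrative helper, and `tiny_list` is a six-word stand-in, not the real TWB 300. It also assumes the percentage is measured over every word in the text, repeats included, which is my reading of what the tool reports.

```python
import re


def tokenize(text):
    """Lowercase the text and return runs of letters (inner apostrophes kept).

    The splitting rule is an assumption; other tools may count differently.
    """
    return re.findall(r"[^\W\d_]+(?:['’][^\W\d_]+)*", text.lower())


def common_word_share(text, common_words):
    """Percent of all words in `text` (repeats included) found in `common_words`."""
    words = tokenize(text)
    if not words:
        return 0.0  # nothing to measure in an empty text
    common = {w.lower() for w in common_words}
    hits = sum(1 for w in words if w in common)
    return 100.0 * hits / len(words)


# Stand-in list for illustration only. Paste the 300-word list here instead.
tiny_list = ["the", "and", "he", "on", "a", "of"]
sample = "The cat and the hat. The cat sat, and he sat on the mat."

share = common_word_share(sample, tiny_list)
print(f"{share:.1f}% common, {100 - share:.1f}% everything else")
```

With the full list of 300 in place of `tiny_list`, the second number is the slice of the piece that is left after the common words are wiped out.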
So I ran deeper into the rabbit hole. Because of course I would.
Next, I pasted in The Backlash Against Ocean Vuong, which is my second most popular piece on Substack. Word count, 1701 words. Then I eliminated the 300 most commonly used words. Know what the result was? 70% of my words were in the list of most commonly used words. Poof. Eliminated. 30% of my words were not in that list.
You know I had to do one more, right?
Last, I pasted in The Magic of Self-Taught Writers because that one resonated with a lot of you. Word count, 1191 words. You already know the end result, right? 65% of the text in that piece appears in the list of 300 most commonly used words in the English language. Only 35% of my words were not in the TWB 300.
You know that feeling when the merry-go-round stops but your head is still spinning?
That was me. Just stunned.
William Butler Yeats said “the world is full of magic things, patiently waiting for our senses to grow sharper” and I feel like I just stumbled across one of those magic things he was talking about.
I mean — what a thought — to think we’re all using the same words. Over and over again. We’re just arranging them differently. Taking 300 words and arranging them in different ways over and over to tell an endless number of unique stories.
And what sets my writing apart from yours or Stephen King’s or Margaret Atwood’s or Maya Angelou’s or any writer you can think of boils down to only two things.
First, the order we put them in.
Second, the few words that don’t appear in that list, which is roughly a third of our words, give or take.
Honestly, it leaves me a little in awe. Of writers. Of writing. Of the art of writing.
It makes me understand why so many people refer to writing as magic.
Stephen King said books are portable magic and Carl Sagan said a book is proof that humans are capable of working magic. Isabel Allende said writing is a kind of magic and Anaïs Nin said writing is the magic and the art of turning feelings into words. Bradbury said writing is its own magic. I could list so many more.
People observing that there’s a magic to writing even if they couldn’t explain it.
And if I’m honest, it makes me kind of sad that people use AI for writing. Because 65% of what it generates is going to be the same words we’re all using. Words from the list of 300 most commonly used words. And the remaining 35%? Not even their own. Borrowed from other writers based on probability of word pairings.
It makes me fiercely proud of the art and craft of writing. It makes me want to hone my voice. Dig deeper to find the 35% of words in any given story that make me different from every other writer on the planet who lives or once lived.
Here’s an interesting tidbit. Did you know chickens and humans share 65% of our DNA? But the 35%? Makes all the difference. I’d love to know what you think…
“Writing is finally about one thing: going into a room alone and doing it. Putting words on paper that have never been there in quite that way before.”
-William Goldman



I mean, makes sense. My grandma used to say any dessert is just eggs, white flour and sugar. It's how it's prepared and the 2 percent "something special" that makes it stand out. (No it's not love. That's WHY it's made. Totally different. 😅) But even the same words, capitalized different, or bold, or italicized, take on a different flavor. Punctuation is also massive. A comma in front of a word or behind the same word can completely change the meaning of a sentence. It only takes 0.5% of certain elements to completely change the properties of a bar of steel. We live in a pretty cool universe. 😃
I love this, and I really am excited to take a look at the links you provided! Such an interesting share with the origins of the Dr. Seuss books!
Thought: what if it's the 65% common words that allow us to easily understand each other's writing...and it's the 35% unusual words that give our writing a distinctive and unique flair that draws in the right audience?
Like, the commonality builds a bridge, and the difference makes it worth reading (or memorable) because we've contributed something new...