Algorithm fail? I use the word “Zealand” a lot

Word cloud of my Facebook stream

According to this word cloud of my Facebook timeline, I use the word “zealand” a lot.

Also, according to this word cloud of my Facebook timeline, I do not use the word “new” very often at all.

Which is weird, because I can’t think of a time I have ever written the word “zealand” when it wasn’t preceded by “new.”

In fact, I used the word combination frequently during my trip there last year, and occasionally since then. A cursory look at my timeline also shows that I occasionally use “new” on its own. So if anything, “new” should not only be on this cloud, it should be bigger than “zealand.”

In text analysis programs, it is fairly common to exclude what are known as “stop words”: words that have been found to offer little meaningful information. Exactly which words get excluded tends to vary with the context. Words like “and” are generally useless, but some other small words (like “not”) may provide useful context for the words after them.
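Here is a minimal sketch of how that kind of filtering might produce my cloud, assuming the app simply counts words and drops anything on a stop list. The stop list, the sample posts, and the function name `word_counts` are all made up for illustration; I have no idea what the app's real code looks like.

```python
import re
from collections import Counter

# Hypothetical stop list. Real tools ship their own; this one deliberately
# includes "new" to mimic what the app appears to be doing.
STOP_WORDS = frozenset({"a", "and", "i", "in", "the", "to", "new"})


def word_counts(posts, stop_words=frozenset()):
    """Count lowercased words across all posts, skipping any stop words."""
    counts = Counter()
    for post in posts:
        for word in re.findall(r"[a-z']+", post.lower()):
            if word not in stop_words:
                counts[word] += 1
    return counts


# Invented posts for illustration, not a real timeline.
posts = [
    "Landed in New Zealand",
    "New Zealand is gorgeous",
    "Trying a new camera in New Zealand",
]

unfiltered = word_counts(posts)
filtered = word_counts(posts, STOP_WORDS)

print(unfiltered.most_common(2))  # "new" outranks "zealand" here
print(filtered.most_common(2))    # "new" is gone; "zealand" leads
```

Without the stop list, “new” comes out ahead of “zealand,” which is what I would expect from my own timeline. With “new” on the list, it vanishes and “zealand” is left standing alone at the top.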

It seems odd to exclude something like “new” though, even from a stupid Facebook word cloud. That one would seem to be meaningful in many contexts. But of course the creators of this silly Facebook word cloud app aren’t interested in being accurate. I’m guessing they found that even though it was meaningful, “new” tended to overwhelm the clouds far too often, because people love to post about their new stuff, new experiences, new jobs, new whatever. So it gets filtered out to keep the rest more engaging.

I guess it worked. They got a blog post out of me.