Posted by Sam Battin, Senior Search Strategist

Mark Twain is reputed to have said, “If you don’t read the newspaper, you are uninformed; if you do read the newspaper, you are misinformed.” Some things never change! It’s often the case that one of our clients reads a particularly shocking news article related to search marketing and comes to us with questions. The article might cover a major infrastructural change in the Google index, or an announcement about the personalization of natural search results. As with all things media-related, you can’t always believe what you read, and you certainly can’t expect journalists to be experts on all of the topics they cover.

We like setting the record straight; there’s always new information about search engine optimization, and we do our best to stay on top of things so we can give our clients conclusive answers where a news article tells only half of the real story.

Case in point: WCBS TV recently posted a news story about the large number of videos on YouTube that show teenage girls brawling with each other. To some Web users, this may be a newsworthy topic. To others, this is not really news; it’s more a “life in the 21st century” kind of thing. The article headline, however, startled me:

“Wow,” I thought, “They found more than 267,000 videos. That certainly is a large number. Perhaps this article is newsworthy after all.”

Finding exact numbers through search is no picnic, so I wondered how they concluded that 267,000 videos about this very specific subject were on YouTube. I decided to read the article in hopes that they would explain some neat tool or trick I could use. It’s certainly handy to be able to tell my clients exactly how many mentions of individual subjects there are on the Web. I read the article.
About the fifth paragraph in, they explained their secret search technique:

That stunned me, and made me realize that not many people (or journalists) understand what the numbers on a search results page really indicate. In other words, 267,000 is not at all likely to be the true number. Showing how they got to this number is a great opportunity for us to give you a “behind the scenes” look at search results.

First of all, when you type “girl fight” into YouTube, you are telling YouTube to show you videos associated with a particular search string. A “search string” is simply any sequence of characters. For example, “cat” is a search string, and “casdsjfdsjh” is a search string. There are even two ways to enter a search string on YouTube:

The second way puts quotation marks around the search string. This is called “exact matching.” In the first case, you’re telling YouTube to find you videos with the words “girl” and “fight” somewhere on the page (and not necessarily next to each other). In the second case, you’re telling YouTube you want to see videos with the exact phrase “girl fight,” i.e., the words appear next to each other on the page.

Here are the results for an exact match search for “girl fight” on YouTube:

In the screenshot, you see that YouTube added quotes around my query; only 16,300 videos have the words “girl” and “fight” next to each other on the page as of Feb 3, 2010.

Here are the results for a regular query for “girl fight” on YouTube:

Wow, a much higher number; it would appear the reporters didn’t use quotes in their search. In a search without the quotation marks, the words “girl” and “fight” don’t have to appear next to each other on the video page, or even in the video title. This means that a YouTube video such as the following is a legitimate result for the search “girl fight”:

What the? This video doesn’t look like two high school age females engaging in fisticuffs!
And yet this video (found on page 13 of the YouTube search) “popped up” among the 267,000 “girl fight” videos claimed by the article. Why? Because the lyrics of the Justin Bieber song (in the “More info” section of the page) contained the words “girl” and “fight”, as shown below:

In other words, not every single one of the 267,000 videos returned by WCBS TV’s search shows a different instance of two school-age females locked in combat.

A search engine isn’t very smart. When you type “girl fight” into YouTube, the search engine only sees two strings of four and five letters separated by a space. The only pages a search engine can match your search with are pages that contain these strings. As we saw in the Justin Bieber video result, having the words “girl” and “fight” in the lyrics qualifies this video upload for the search “girl fight.”

Song lyrics aside, the number 267,000 arrived at by WCBS TV also included duplicate videos, description text that used the words “girl” and/or “fight,” possibly inbound link anchor text, and just plain old spam. Reporting the number “267,000” without explanation makes it seem like there’s some kind of huge worldwide problem with teenage schoolgirls punching each other, but the actual number of uploaded videos is unknown.

As we saw, when you ask “How many?,” a search engine will not necessarily know the answer; in this example, it couldn’t even tell the difference between a song and two girls fighting. Perhaps if the reporters had spent some time looking at the actual YouTube results, counted each individual video with young women behaving in a disorderly manner, and subtracted incorrect results (like returns for song lyrics and spam), they’d have had a more accurate number around which to write a news story.

As is the case in many other subjects, journalists write about topics they aren’t experts in and can come away with incorrect conclusions.
Rather than determining your business policy from a news article, we recommend you first talk to a search expert and get their take on what’s going on. You’ll probably learn something neat.
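
For readers who like to see the mechanics, here is a minimal sketch of the difference between a regular query and an exact match query, as described above. It is a toy illustration only, not how YouTube's search actually works: the function names (`matches_all_words`, `matches_exact_phrase`) and the three sample page texts are hypothetical, made up for this example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def matches_all_words(query, page_text):
    """Regular (unquoted) style: every query word appears somewhere on the page."""
    page_words = set(tokenize(page_text))
    return all(word in page_words for word in tokenize(query))


def matches_exact_phrase(query, page_text):
    """Exact match (quoted) style: the query words appear side by side, in order."""
    q = tokenize(query)
    p = tokenize(page_text)
    return any(p[i:i + len(q)] == q for i in range(len(p) - len(q) + 1))


# Hypothetical page texts (title plus description), invented for illustration.
pages = {
    "brawl": "Girl fight after school caught on camera",
    "song": "Pop song lyrics: my first love, girl ... I would fight for you",
    "boxing": "Title fight highlights from last night",
}

broad = [name for name, text in pages.items() if matches_all_words("girl fight", text)]
exact = [name for name, text in pages.items() if matches_exact_phrase("girl fight", text)]

print(broad)  # ['brawl', 'song']
print(exact)  # ['brawl']
```

The song page counts toward the regular query because both words appear somewhere in its text, which is the same reason a result count from an unquoted search overstates how many videos are really about the subject. Even the exact match count would still include duplicates and spam, so neither number is a true tally.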