About

I'm Mike Pope. I live in the Seattle area. I've been a technical writer and editor for over 35 years. I'm interested in software, language, music, movies, books, motorcycles, travel, and ... well, lots of stuff.

Quote

With sufficient leisure I can compose excellent impromptus.

— Jean Jacques Rousseau



Blog Statistics

Dates
First entry - 6/27/2003
Most recent entry - 9/4/2024

Totals
Posts - 2655
Comments - 2677
Hits - 2,721,600

Averages
Entries/day - 0.34
Comments/entry - 1.01
Hits/day - 346

Updated every 30 minutes. Last: 9:47 PM Pacific


  08:09 AM

It's Friday the 13th! But it's Friday! I'm conflicted.

Friend Nancy alerted me this week to an interesting term: search void, also known as data void. This describes a peculiar weakness, you might call it, of how web search results are ranked.

It might help to know that search ranking (or page rank, as Google calls it[1]) works by counting how many pages link to a specific page. The more pages that link to a specific page, and the more "authoritative" those pages are, the higher the page appears in the search results. "Authoritative" here is defined as a page that itself ranks high. If a well-known, high-traffic blogger links to one of your blog posts, your post will get a big rankings boost.[2] A similar example occurs on Twitter: if someone with tons of followers retweets one of your tweets, many people will see and possibly retweet your original.
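For the curious, here is a rough sketch in Python of the link-counting idea described above. It is a toy version of the textbook page rank calculation, not a description of how any real search engine ranks pages today. The page names and the `page_rank` function are made up for illustration, and the damping value of 0.85 is just the figure commonly quoted for the original algorithm.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Toy page rank. `links` maps each page to the pages it links to.

    All names here are hypothetical; this is a simplified sketch.
    """
    # Collect every page, including ones that are only linked to.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally ranked.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank along, split among its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical mini-web: three fans link to big_blog, nobody links to small_blog.
links = {
    "fan_1": ["big_blog"],
    "fan_2": ["big_blog"],
    "fan_3": ["big_blog"],
    "big_blog": ["your_post"],
    "small_blog": ["other_post"],
    "your_post": [],
    "other_post": [],
}

ranks = page_rank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:12s} {score:.3f}")
```

In this made-up web, `your_post` ends up ranked above `other_post` even though each has exactly one incoming link, because the page linking to `your_post` is itself heavily linked to.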

The idea is a kind of digital crowdsourcing—the internet at large decides which pages are the best, and those rise to the top of the search results. A flaw can result, however, if a lot of content is produced and cross-linked about a topic, but that information is one-sided or niche. An article in Wired that describes this uses the example of vitamin K shots for newborns. A passionate anti-shot community has produced a lot of content warning of the dangers of these shots. There is not (or was not) a corresponding community of passionate pro-shotters, so there was a period during which if you searched for info about vitamin K for newborns, there was a data void: the top-ranked search results represented a kind of skewed data sampling. This information showed up at the top of the search listings, and people presumably assumed it was the "best" information, even though it doesn't represent a majority view about the subject.

As our information sources become more siloed, we're all going to become more subject to search/data voids. I suppose the first defense is to know that there's a word for the phenomenon.

For origins, a fun one that I learned from Jonathon Owen. In English, we got the word lettuce from Old French, and there are cognates like lechuga in Spanish. (Hold that thought.) It gets more interesting when we go further back. In Latin, the name was lactuca. The lac- part means "milk", because wilder members of the lettuce family have milky juice. That lac particle is what you see in lactate and lactose, and whose relatives are caffè latte and café au lait. (In Spanish, milk is leche, which hey look, is right there in lechuga.) The lac particle also shows up in the word galaxy/galactic, which comes from a Greek word for the Milky Way. Got milk? Yes you do.

[1] Page rank is a satisfactory lexical intersection of the term web page and the name Larry Page, one of Google's founders.

[2] This statement is only mostly true.

Like this? Read all the Friday words.
