Limitations of Google for Online Research

Search engines scratch the surface of available data

The World Wide Web is used by 40 percent of the world’s population and 77 percent of the developed world for news, entertainment, communication and countless other purposes. Even though more people are using Google for searches, we are finding less of the data that’s available. The reason: conventional search engines can offer up only the tip of the iceberg of available information in response to our queries, a measly 3 percent, and it is getting worse.

Google has become the de facto research tool for many people, but it may be inadequate for serious matters.

Technology is constantly changing: “box” TVs have given way to plasma, LCD and LED. The phone has become a computer, and we have come to expect more from it. Our homes look different than they did 10 years ago, with chargers in the outlets and multiple screens. Though physical media and communication technology are sophisticated and ever improving, online data is largely unrefined – a hodgepodge of search results with often dubious relevancy and accuracy.

When you Google something, do you look at where the data is located or who owns it? Do you ever look up something one day, and then try to look it up a few days later and can’t find it? When data disappears, where does it go? Do you think about where the data is before you search?

Google results vary worldwide

If you search for something on Google from the U.K., you will likely get a completely different set of results than if you made the same query from Vancouver. The results can vary so much that the top result in the U.K. may be on page 10 when the same search is run from Vancouver.

Google is in the business of making money. Results are a blend of location, browsing history and businesses fighting for ranking.

Page 1 results are not necessarily more relevant

Some facts about ranking:

  • #1 in a Google search gets 32% of all the clicks.
  • #2 in a Google search gets 17% of all the clicks.
  • #3 in a Google search gets 11% of all the clicks.

According to Google’s Inside Search, Google’s algorithms rely on more than 200 unique signals or “clues” that make it possible to guess what you might really be looking for.

Ninety-two out of every 100 people who search on the Internet do not click past the first page of Google results. Is first-page data that relevant? No. First-page results get there through a combination of programmatically applied rules that gauge a website’s relevance (Google’s algorithm), online marketing tactics such as search engine optimization and Web marketing (often used to manipulate Google’s results in the hope of increasing ranking), and dumb luck.

A set of programmed rules that sorts data can never be as reliable at vetting information as a human can. The searcher or investigator relies on experience, carefully considered judgments, knowledge of the research subject, and context. We know exactly why we are searching. We understand our intent; Google is only guessing.

So what does all of this tell us?

Google, Bing and other popular search engines are very useful and make some data on the Internet easy to locate; however, if you are searching for something relevant and unique to a person or business, some nugget of information that relies on human judgment to discern its value, or something not widely published in typical online channels, how can you rely on results returned by a machine? You can’t.

The fact is, Google, a resource that most consider the answer to all questions, has true limitations. You can’t use a hammer to drill a hole, and you can’t look at one page or even hundreds of results returned by Google and assume you’ll find exactly what you are seeking. You’ll find answers, but will you find the right answers?

Search engines just scratch the surface

Just as only the tip of an iceberg is visible to observers, a typical Google search sees only a very small amount of the information that’s available. Mike Bergman, founder of Bright Planet, stated that “searching with Google on the Internet today is like dragging a net across the surface of the ocean: a great deal may be caught in the net, but there is a wealth of information that is deep and therefore missed.”

Dragging nets below the surface

The rest of the information is buried in what’s called the deep web. The deep web consists of data that is not indexed and therefore can’t be located by traditional search engines. It is difficult to know how big the deep web is, but it is thought to be at least hundreds or thousands of times larger than the surface web. The deep web doesn’t necessarily hide the data; conventional search engine technology simply has a hard time finding and making sense of it.

The sunken treasures of the deep web

The inability of traditional search engines to examine the deep web represents a major gap in access to the vast amount of very high-quality information that exists online. Many of these internal pages have external links and contain blogs, file directories, technical journals, professional papers, photos and untold amounts of other data that search engines cannot find.

By way of example, most newspapers have their own websites. At times, a search engine will capture a few of the articles that are popular. But for more obscure articles you will have to go directly to the newspaper’s website and search its content. Even with popular news stories, the older they become, the less likely it is they will be returned by a search query.

Don’t give up on Google, just don’t rely on it

Google is arguably the world’s largest search engine. Just understand that a Google search will return results containing only a small portion of the information available on the Internet.

If there is personal, financial or legal risk to you or your organization, consider using a professionally trained investigator who understands how to mine the web, one who conducts real online investigations, NOT cursory background checks. Our team has indexed numerous databases held within the deep web, open source, government and proprietary websites that are critical resources for our online investigations.

Coupled with our analytical/investigative methods, we can connect the dots and determine the veracity of the information for personal background checks, legal investigations, corporate due diligence and specialized research.

About the Author

Pat Fogarty is a former organized crime investigator now leading Internet research and investigations at Fathom Research Group.