Cathy Anderson

How do we harness the energy of information? Through Education!!

Google, Google, Semantic Web, and Singularity

Written By: Cathy - Dec 29, 2009

Yesterday, on a lengthy drive, I was sorting out a possible connection between Google Search, Google Books, and the Semantic Web, tied to Ray Kurzweil’s Singularity.  I think my brain grew a new wrinkle, but whether the connections I was trying to make actually make sense, or are part of some larger scheme, is probably beyond my ability to determine.  That observation aside, I set out to learn more about Google Search, Google Books, and the Semantic Web, to present that here, and finally to take another look at Kurzweil’s Singularity.

First off, what is it about Google Search that makes it unique? How does Google do it?  According to various sources, fundamental to how Google operates, and even why it does what it does, is its mission:

The company’s mission is to organize the immense amount of information available on the web and make it universally accessible and useful.

And while it is not my intent to discuss the history of Google here it should be noted that:

Their (Brin and Page) goal was to make digital libraries work, and their big idea was as follows: in a future world in which vast collections of books are digitized, people would use a “web crawler” to index the books’ content and analyze the connections between them, determining any given book’s relevance and usefulness by tracking the number and quality of citations from other books. (retrieved from http://books.google.com/googlebooks/history.html)

That mission, in my opinion, could be expanded to include all information, even that which is not on the web: digitizing it and putting it on the web.

How do they do this though?

According to the Wikipedia article on Google Search, a Google search-results page is ordered by a priority rank called “PageRank,” which is:

a link analysis algorithm, named after Larry Page,[1] used by the Google Internet search engine that assigns a numerical weighting to each element of a hyperlinked set of documents, such as the World Wide Web, with the purpose of “measuring” its relative importance within the set. The algorithm may be applied to any collection of entities with reciprocal quotations and references. The numerical weight that it assigns to any given element E is also called the PageRank of E and denoted by PR(E).  (retrieved from Wikipedia, 12/29/09)

In terms that make sense to me: Google indexes web pages based upon key terms; how popular the key terms on a website are, and how they are phrased, all have something to do with how your site and content are indexed by the search engine.
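The core of PageRank as described above can be sketched in a few lines. This is a minimal illustration, not Google’s actual implementation: the four-page web graph is made up, and the 0.85 damping factor is simply the value commonly cited for the algorithm.

```python
# Hypothetical tiny web graph: each page lists the pages it links to.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
damping = 0.85  # commonly cited damping factor
pages = list(links)
rank = {p: 1.0 / len(pages) for p in pages}  # start with equal rank

for _ in range(50):  # iterate until ranks stabilize
    new_rank = {}
    for p in pages:
        # Rank flowing into p from every page q that links to it,
        # split evenly among q's outgoing links.
        incoming = sum(rank[q] / len(links[q]) for q in pages if p in links[q])
        new_rank[p] = (1 - damping) / len(pages) + damping * incoming
    rank = new_rank

for p in sorted(rank, key=rank.get, reverse=True):
    print(p, round(rank[p], 3))
```

Page C, which collects the most inbound links, ends up with the highest rank, while D, which nothing links to, gets only the baseline share — the “number and quality of citations” idea from the quote above.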

Integrating this indexing of website text and content with the massive Google Books project seems huge. To date, various sources indicate that over 10 million books have been digitized by Google.   As Tim O’Reilly noted, the key to Google is “database management. Google isn’t just a collection of software tools, it’s a specialized database.”  Indeed, the Google Books project is just that: a vast digital warehouse of text, knowledge, and information.    Integrated with that project is:

The Google Books Library Project is an effort by Google to scan and make searchable the collections of several major research libraries.[1] It and Google’s Partner Program comprise Google Book Search. Along with bibliographic information, snippets of text from a book are often viewable. If a book is out of copyright and in the public domain, the book is fully available to read or to download.[2]

(retrieved from Wikipedia, 12/29/09)

Now, having established that Google “views” the web and its content as one large searchable database, that it has taken web-based content and made it searchable, and that it is now digitizing vast amounts of non-web-based content and further adding to the database, where does the Semantic Web come into this?

According to this website the Semantic Web is:

…the extension of the World Wide Web that enables people to share content beyond the boundaries of applications and websites. It has been described in rather different ways: as a utopic vision, as a web of data, or merely as a natural paradigm shift in our daily use of the Web.

In order to apply or understand the Semantic Web, it is necessary to understand that it requires adding semantic metadata, or data that describes data, to information resources. A further definition from the “How Stuff Works” website indicates that the Semantic Web “…proposes to help computers ‘read’ and use the Web. The big idea is pretty simple — metadata added to Web pages can make the existing World Wide Web machine readable…”
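The “data that describes data” idea can be made concrete with a toy example. The Semantic Web’s underlying RDF model stores facts as subject–predicate–object triples; the sketch below uses made-up triples and a hypothetical `ask` helper to show how such metadata becomes machine-queryable.

```python
# Hypothetical facts expressed as (subject, predicate, object) triples,
# mimicking the RDF data model that underlies the Semantic Web.
triples = [
    ("GoogleBooks", "type", "DigitalLibrary"),
    ("GoogleBooks", "createdBy", "Google"),
    ("PageRank", "namedAfter", "LarryPage"),
    ("LarryPage", "coFounded", "Google"),
]

def ask(subject=None, predicate=None, obj=None):
    """Return every triple matching the pattern (None acts as a wildcard)."""
    return [t for t in triples
            if subject in (None, t[0])
            and predicate in (None, t[1])
            and obj in (None, t[2])]

# Because the metadata is structured, a machine can "read" it and
# answer simple questions without understanding natural language:
print(ask(predicate="createdBy"))  # who created what?
print(ask(obj="Google"))           # which facts mention Google?
```

Real Semantic Web systems use standardized vocabularies and query languages (RDF and SPARQL) rather than Python lists, but the principle is the same: once meaning is encoded as explicit metadata, software can follow the connections.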

As indicated, this does not yet create an “artificial intelligence” for the WWW.  However, digging deeper into Kurzweil’s Singularity may suggest how we could harness that wealth of information to create an artificially intelligent WWW.

Wikipedia provides an overview of technological singularity as follows:

Technological singularity refers to the idea that technological progress would reach such an infinite or extremely high value at a point in the near future. This idea is inspired by the observation of accelerating change in the development of wealth, technology, and humans’ capability for information processing. Extrapolating these capabilities to the future has led a number of thinkers to envisage the short-term emergence of a self-improving artificial intelligence or superintelligence[1] that is so much beyond humans’ present capabilities that it becomes impossible to understand it with present conceptions. Thus, the technological singularity can be seen as a metasystem transition or transcendence to a wholly new regime of mind, society and technology.

I propose that we are driven, by our very need to learn and learn more, to achieve the Singularity: to harness the “energy” of the world’s knowledge. The ability to do so is at our fingertips. Brin and Page designed the mission of Google to achieve this aim; whether they did so knowingly or not is immaterial. The reality is that they are achieving it through Google Books and Google Search, and now, with the advent of the Semantic Web, which provides a means to harness this vast wealth of information, we are close to achieving superintelligence via machines.

From the companion website for the book The Singularity Is Near, Kurzweil notes:

The Singularity is an era in which our intelligence will become increasingly nonbiological and trillions of times more powerful than it is today—the dawning of a new civilization that will enable us to transcend our biological limitations and amplify our creativity.

Google harnesses the computing power of hundreds of thousands of interconnected PCs and billions of pages of data, and now that information can be pushed to us in a systematic fashion via Semantic Web processes. The potential for solving problems that we cannot currently address, due to limited knowledge or disaggregated data, is limitless.

The Internet is huge.

Microsoft’s Bing team puts the number of web pages at “over one trillion.”

And Google has already indexed more than one trillion discrete web addresses.

More information on the indexing of web pages by Google can be found on the Google Blogs,

and another source of information on the number of web pages here:  http://hubpages.com/hub/How-many-webpages-do-you-think-actually-exist-on-the-Internet.




  1. Eric says:

    Not really, but yes: we (or they) have the “makina”s.

    In fact, Google doesn’t use the Semantic Web yet, but you are right: when they do, ho ho ho, happy new year.

    Intelligence is just a feedback; consciousness is just the capacity to analyze and decide with deduction, reduction, or, far better, holism.

  2. Josh Johnson says:

    I wonder if you have run across Tononi’s Integrated Information Theory of consciousness? (http://en.wikipedia.org/wiki/Integrated_information_theory) The internet is the only non-living data structure that he specifically identifies as possibly meeting the criteria for consciousness. I think, indeed, the Semantic Web, if successfully implemented, would improve the integration of information stored there and certainly move in the direction of consciousness.

    I think the point where it will really get interesting is when automated programs begin running to reconcile discrepancies, eliminate redundancies, and begin automated fact-checking of connected data. This will amount to a self-constructing, self-improving integrated data structure with the potential for consciousness.

    The next interesting development will be automated data-gathering systems. What if programs, possibly even viral in nature, were set out to increase connectivity, gathering new data through web cams, microphones, interactions with humans, and through an increasingly vast network of the “internet of things” providing a variety of sensors around the world?

    I think the final revolution might be when the connected web begins testing theories through experimentation. If I simulate a computer malfunction in a home, will the residents spend their time trying to repair it, or spend time socializing with each other? If I increase the bank account of person X, will they report the discrepancy, or spend the money? etc. The internet could seek to increase its knowledge base in the same way that children (and scientists) increase theirs.

    Then we have a singularity.

    I hope Google has digitized some good books that offer moral concepts :)
