
Thread: How the flipping hell does Google do it?

  1. #1
    Almost in control. autopilot's Avatar
    Join Date
    Dec 2004
    Location
    Region 2
    Posts
    4,071
    Thanks
    51
    Thanked
    12 times in 11 posts

    How the flipping hell does Google do it?

    I was wondering today (boring Sunday at work) about Google searches and how they can possibly be so insanely fast. I never really thought about it that much before, but I suppose I take it for granted. I understand the basics of metadata-based searching and all that, but how does it manage to be so quick no matter what you type in, and with all the people that must be using it at the same time? I mean, sometimes the search results find a matching string way down in the body of some obscure website, maybe even hundreds of sites. Searching all those billions of pages, text and all, and returning the results in less than a second just seems nuts to me. Even if they cache all those sites, searching the cache would be a mammoth task. Truly it is magic stuff.

  2. #2
    Senior Member
    Join Date
    Apr 2005
    Location
    Bournemouth, Dorset
    Posts
    1,631
    Thanks
    13
    Thanked
    2 times in 2 posts
    IIRC they have bots that just look through pages and index them, so when you search you're basically searching Google's database for results matching your query.
    I think these bots are constantly crawling through pages and updating Google's database.
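
    Roughly speaking, that pre-built database is an inverted index: a map from each word to the pages that contain it, so answering a query is a lookup rather than a scan of the web. A toy Python sketch of the idea (nothing like Google's real code, just the concept, with made-up example pages):

        # Toy inverted index: word -> set of page URLs containing that word.
        # Concept sketch only - not how Google actually implements it.

        def build_index(pages):
            """pages: dict mapping URL -> page text, as a crawler might collect."""
            index = {}
            for url, text in pages.items():
                for word in set(text.lower().split()):
                    index.setdefault(word, set()).add(url)
            return index

        def search(index, query):
            """Return URLs containing every word of the query (an AND search)."""
            words = query.lower().split()
            hits = [index.get(w, set()) for w in words]
            return set.intersection(*hits) if hits else set()

        pages = {
            "example.com/a": "cheap graphics cards reviewed",
            "example.com/b": "graphics card overclocking guide",
            "example.com/c": "cheap memory and graphics deals",
        }
        index = build_index(pages)
        print(search(index, "cheap graphics"))  # {'example.com/a', 'example.com/c'}

    The expensive work (crawling and indexing) happens ahead of time; the query itself only touches the index, which is why results come back so fast.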

  3. #3
    Senior Member
    Join Date
    Mar 2005
    Location
    Manchester,UK
    Posts
    366
    Thanks
    0
    Thanked
    0 times in 0 posts
    The way I see Google [come to think of it]

    is as a massive database of URLs - upon a "search" (or query, as it's called in Excel) it finds the links that contain the words and finds the words that are on the URL.

    Think of it like being a CD seller - think how many CDs you have and yet how you can find the exact artist or even the exact song. It works the same way.
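
    To put that CD analogy in code terms: if the CDs are already filed by artist, finding one is a direct lookup instead of flicking through every case - same catalogue, very different amount of work per search. A quick, entirely hypothetical sketch:

        # Hypothetical CD shop: the same catalogue searched two ways.
        cds = [("Oasis", "Wonderwall"), ("Blur", "Song 2"), ("Oasis", "Live Forever")]

        # Slow way: check every CD (like reading every web page at query time).
        def scan(artist):
            return [song for a, song in cds if a == artist]

        # Fast way: file them by artist once, then look straight up (like an index).
        by_artist = {}
        for artist, song in cds:
            by_artist.setdefault(artist, []).append(song)

        print(scan("Oasis"))           # ['Wonderwall', 'Live Forever']
        print(by_artist.get("Oasis"))  # ['Wonderwall', 'Live Forever']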

  4. #4
    Senior Member
    Join Date
    Apr 2005
    Location
    Essex
    Posts
    600
    Thanks
    0
    Thanked
    1 time in 1 post
    Tim N

  5. #5
    Registered+
    Join Date
    Dec 2005
    Posts
    40
    Thanks
    0
    Thanked
    0 times in 0 posts
    More than four billion Web pages, each an average of 10KB, all indexed and cached (copied).

    Up to 2,000 computers in a cluster. With over 30 clusters

    Over 60,000 computers! Yes, 60,000 PCs.

    104 interface languages (how many can you name!).

    One petabyte of data in a cluster -- so much that hard disk error rates of one in 10 to the power of 15 begin to be a real problem if not planned for.

    Constant transfer rates of 60Gbps (gigabits per second).

    An expectation that sixty machines will fail every day.

    No complete system failure since February 2000.

    It is one of the largest computing projects on the planet, arguably employing more computers than any other single, fully managed system (we're not counting distributed computing projects here), some 200 computer science PhDs, and 600 other computer scientists.

    And it is all hidden behind a deceptively simple, white, Web page that contains a single one-line text box and a button that says Google Search.
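
    A couple of those figures make more sense with a quick back-of-envelope calculation (my own arithmetic, using only the numbers quoted above):

        # Back-of-envelope sums using the figures quoted above (rough, decimal units).
        pages = 4e9            # "more than four billion Web pages"
        avg_page_kb = 10       # "an average of 10KB" each
        machines = 60_000      # "over 60,000 computers"
        failures_per_day = 60  # "sixty machines will fail every day"

        # Raw size of the cached page copies alone (before any index data):
        raw_tb = pages * avg_page_kb / 1e9
        print(f"cached pages: ~{raw_tb:.0f} TB")  # ~40 TB

        # Expected failure chance per machine per day:
        print(f"per-machine failure chance: ~{failures_per_day / machines:.2%} per day")
        # ~0.10% per day - negligible for one box, a certainty across 60,000.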

  6. #6
    Senior Member
    Join Date
    Apr 2005
    Location
    Essex
    Posts
    600
    Thanks
    0
    Thanked
    1 time in 1 post
    That's a lot of pigeons they have there. Lots of problems with avian flu.
    Tim N

  7. #7
    Registered+
    Join Date
    Dec 2005
    Posts
    40
    Thanks
    0
    Thanked
    0 times in 0 posts
    The power of three
    There are no disk arrays within individual PCs; instead Google stores every bit of data in triplicate on three machines on three racks on three data switches to make sure there is no single point of failure between you and the data. "We use this for hundreds of terabytes of data," said Hölzle.
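
    A very rough sketch of what "three machines on three racks" means for placement - purely illustrative, with made-up machine and rack names, not GFS's actual placement logic:

        import random

        # Pick three machines that all sit on different racks, so losing any one
        # machine, rack or switch still leaves two reachable copies.
        def place_replicas(machines, copies=3):
            """machines: list of (machine_id, rack_id). Returns the chosen machines."""
            chosen, used_racks = [], set()
            for machine_id, rack_id in random.sample(machines, len(machines)):
                if rack_id not in used_racks:
                    chosen.append(machine_id)
                    used_racks.add(rack_id)
            if len(chosen) < copies:
                raise RuntimeError("not enough distinct racks for the requested copies")
            return chosen[:copies]

        cluster = [(f"pc{i}", f"rack{i % 5}") for i in range(20)]  # made-up cluster
        print(place_replicas(cluster))  # e.g. ['pc7', 'pc13', 'pc4'] - three different racks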

    Don't expect to see GFS on a desktop near you any time soon -- it is not a general-purpose file system. For instance, a GFS block size is 64MB, compared with the more usual 2KB on a desktop file system. Hölzle said Google has 30 plus clusters running GFS, some as large as 2,000 machines with petabytes of storage. These large clusters can sustain read/write speeds of 2Gbps -- a feat made possible because each PC manages 2Mbps.

    Data errors: A regular IDE hard disk will have an error rate on the order of one in 10 to the power of 15 - that is, one millionth of one billionth of the data written to it may get corrupted and the hard disk's own error checking will not pick it up. "But when you have a petabyte of data you need to start worrying about these failures," said Hölzle. "You must expect that you will have undetected bit errors on your disk several times a month, even with hardware checking built-in, so GFS does have an extra level of checksumming. Again this is something we didn't expect, but things happen."
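
    That "extra level of checksumming" boils down to: store a checksum alongside each block when it is written, and verify it on every read so silent disk corruption gets caught and the data can be re-read from another copy. A simplified Python illustration of the principle (the block size and checksum choice here are arbitrary, not GFS's real scheme):

        import zlib

        BLOCK = 64 * 1024  # arbitrary block size for this sketch

        def write_blocks(data):
            """Split data into blocks, storing a CRC32 checksum with each one."""
            return [(data[i:i + BLOCK], zlib.crc32(data[i:i + BLOCK]))
                    for i in range(0, len(data), BLOCK)]

        def read_blocks(stored):
            """Re-verify every block; a mismatch means silent corruption on disk."""
            out = bytearray()
            for block, checksum in stored:
                if zlib.crc32(block) != checksum:
                    raise IOError("checksum mismatch - read this block from another replica")
                out += block
            return bytes(out)

        payload = b"some chunk of web index data" * 10_000
        assert read_blocks(write_blocks(payload)) == payload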


    Sorry for the big post - I thought this was all rather interesting.
    Full story is here:-
    http://www.linesave.co.uk/google_search_engine.html
