Become.com's Web Crawler

Sun Developer Network is featuring a story on Become.com's Java technology Web crawler, which may be:

[T]he most sophisticated, massively scaled Java technology application in existence, obtaining information on over 3 billion web pages and writing well over 8 terabytes of data (and growing) on 30 fully distributed servers in seven days.

The company has patented its Affinity Index Ranking (AIR) algorithm, which:

[P]rovides highly targeted search results by understanding the context of pages on the web. AIR integrates advanced concepts from applied physics and engineering dynamics that Become.com will eventually make public.
Become.com's Web Crawler: A Massively Scaled Java Technology Application
