Major WWW Search Tools

The following is excerpted from:

Understanding WWW Search Tools

By Jian Liu (jiliu@indiana.edu) Reference Department, IUB Libraries

URL: http://www.indiana.edu/~librcsd/search/

First Draft: September 1995; First Update: February 1996; Second Update: September 1996


Search the Search Engines (Meta-Search Engines)

These search engines do not maintain databases of their own. Instead, they send each query to multiple search engines simultaneously.

Major considerations: the number and selection of search engines included; support for Boolean and phrase searches; time-out handling; integration of results; the number of hits from each source; restriction by location or by type, etc.
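The considerations above (parallel queries, a time-out, a cap on hits from each source, and integration of results) can be sketched in a few lines of Python. This is a minimal illustration, not how any of the listed services is implemented: the three engine functions are hypothetical stand-ins that return canned URLs without touching the network, and the merge rule (URLs reported by more sources come first) is an assumption chosen for simplicity.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

# Hypothetical stand-ins for real search engines. Each takes a query string
# and returns a ranked list of URLs; no network access is performed.
def engine_a(query):
    return ["http://example.com/a1", "http://example.com/shared"]

def engine_b(query):
    return ["http://example.com/shared", "http://example.com/b1"]

def engine_slow(query):
    time.sleep(0.5)  # simulates a source that misses the time-out
    return ["http://example.com/late"]

def meta_search(query, engines, timeout=5.0, hits_per_source=10):
    """Send one query to every engine at once and merge the answers.

    Returns a list of (url, [engine names]) pairs, with URLs reported by
    more sources listed first.
    """
    pool = ThreadPoolExecutor(max_workers=len(engines))
    futures = {pool.submit(fn, query): name for name, fn in engines.items()}
    done, _ = wait(list(futures), timeout=timeout)
    pool.shutdown(wait=False)  # do not block on sources that timed out

    merged = {}  # url -> names of the engines that returned it
    for future, name in futures.items():  # submission order, for stable output
        if future not in done or future.exception() is not None:
            continue  # skip sources that timed out or failed
        for url in future.result()[:hits_per_source]:
            if name not in merged.setdefault(url, []):
                merged[url].append(name)
    # Integrate results: URLs found by more engines come first.
    return sorted(merged.items(), key=lambda item: -len(item[1]))

if __name__ == "__main__":
    engines = {"A": engine_a, "B": engine_b, "Slow": engine_slow}
    for url, sources in meta_search("search tools", engines, timeout=0.1):
        print(url, sources)
```

With a 0.1 second time-out, the slow source is dropped and the URL returned by both remaining engines is listed first. A real meta-search tool would also have to translate the query into each engine's own syntax, which this sketch leaves out.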

  • SavvySearch
    URL: http://guaraldi.cs.colostate.edu:2000/form
    FAQ and HELP
  • The Internet Sleuth
    URL: http://www.isleuth.com/
    HELP
  • MetaCrawler
    URL: http://www.metacrawler.com/
    FAQ
  • ProFusion
    URL: http://www.designlab.ukans.edu/profusion
    HELP
  • Dogpile
    URL: http://www.dogpile.com/
    HELP
  • Inference Find
    URL: http://www.inference.com/ifind/
    HELP
  • Highway 61
    URL: http://www.highway61.com/
    How it works
  • search.onramp.net
    URL: http://search.onramp.net/
    FAQ
  • Cyber411
    URL: http://cyber411.com/
    FAQ
  • Mother Load Insane Search
    URL: http://www.cosmix.com/motherload/insane/
    FAQ


    Special Purpose Databases

    DejaNews
    for searching Usenet
    URL: http://www.dejanews.com/
    Reference.COM
    for searching Usenet, Mailing List Archive and directories
    URL: http://www.reference.com/
    MIT Usenet Address Database
    Search for e-mail addresses from messages posted to Usenet
    URL: http://usenet-addresses.mit.edu/
    U.S. GovBot Database
    More than 300,000 web pages from government sites
    URL: http://www.business.gov/Search_Online.html
    LawCrawler
    Search Legal Information: World Wide
    URL: http://www.lawcrawler.com/
    Medical World Search
    Search sites of medical fields; using a medical thesaurus
    URL: http://www.mwsearch.com/
    Argos
    Search ancient and medieval Internet sites
    URL: http://argos.evansville.edu/
    Euroferret
    European Internet sites only
    URL: http://www.muscat.co.uk/ferret/
    Web Wombat Search Engine
    Australian Internet sites only
    URL: http://www.intercom.com.au/wombat/
    Alcanseek
    specifically for Alaskan and Canadian web sites
    URL: http://www.alcanseek.com


  • Alta Vista
    URL: http://altavista.digital.com/
    Developer: Digital Equipment Corporation
    Database size: 30 million Web pages from 275,600 servers; 4 million articles from 14,000 Usenet news groups.
  • Simple and Advanced Queries
  • Phrase search (in quotes)
  • Required word/phrase (+)
  • Fielded searches (link, title, url)
  • NEAR operator (within 10 words)
  • Comments and Search Tips:
  • Excellent online help for both simple queries and advanced queries
  • Large database, and fast.
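The query features listed above can be illustrated with a small query-string builder. This is only a sketch: the helper names (quote_phrase, simple_query, near) are hypothetical, and the `field:value` form used for fielded searches is an assumption about how the link, title, and url fields are written. The quotes, the + prefix, and the NEAR operator come from the feature list.

```python
ALLOWED_FIELDS = ("link", "title", "url")  # fields named in the feature list

def quote_phrase(text):
    """Wrap multi-word text in double quotes so it is searched as a phrase."""
    return '"' + text + '"' if " " in text else text

def simple_query(required=(), optional=(), fields=None):
    """Build a simple query: +required terms, optional terms, field:value pairs."""
    parts = ["+" + quote_phrase(term) for term in required]
    parts += [quote_phrase(term) for term in optional]
    for field, value in (fields or {}).items():
        if field not in ALLOWED_FIELDS:
            raise ValueError("unsupported field: " + field)
        parts.append(field + ":" + quote_phrase(value))
    return " ".join(parts)

def near(left, right):
    """Build a proximity expression using the NEAR operator."""
    return quote_phrase(left) + " NEAR " + quote_phrase(right)

if __name__ == "__main__":
    print(simple_query(required=["search engine"], optional=["robot"],
                       fields={"title": "index"}))
    print(near("web", "search tools"))
```

Building queries through a helper like this keeps quoting consistent, which matters because an unquoted multi-word term would be treated as separate words rather than as a phrase.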

  • HotBot
    URL: http://www.hotbot.com/
    Developer: Core technology from U.C. Berkeley
    Other Info:
  • HotWired
  • Features:
  • Database size: 54 million Documents.
  • Online Help
  • Simple Search and Expert Search
  • Phrase search
  • Search by location

  • InfoSeek Ultra
    URL: http://ultra.infoseek.com/
    Developer: Infoseek Corporation
    Features:
  • Database size: more than 50 million URLs.
  • Phrase search (in quotes)
  • Case-sensitive search
  • Fielded search (link, url, title)
  • Search for images (special search)

  • Excite Search
    URL: http://www.excite.com/
    Developer: Architext Software
    Features:
  • Database size: 50 million full-text URLs.
  • Concept search and Keyword search.
  • Online Help
  • Confidence ranking and site sorting
  • Supports Boolean searches (AND, OR, NOT)
  • Allows nesting, e.g. term1 AND (term2 OR term3)
  • Similar page searching (Query by Example)
  • Comments and Search Tips:
  • Concept searching is an interesting idea, though not well defined.
  • Finding similar pages often results in dubious links.
  • Summaries do not seem to be much different from the first paragraphs.
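To make the nested Boolean form term1 AND (term2 OR term3) concrete, the sketch below evaluates such a query against the words of a single document. It says nothing about how Excite itself parses or ranks queries: the nested-tuple representation and the function name matches are assumptions made for illustration.

```python
def matches(query, words):
    """Return True if the set of document words satisfies the query.

    A query is either a single term (a string) or a tuple whose first item
    is "AND", "OR", or "NOT" and whose remaining items are sub-queries.
    """
    if isinstance(query, str):
        return query.lower() in words
    operator, *operands = query
    if operator == "AND":
        return all(matches(q, words) for q in operands)
    if operator == "OR":
        return any(matches(q, words) for q in operands)
    if operator == "NOT":
        return not matches(operands[0], words)
    raise ValueError("unknown operator: " + str(operator))

if __name__ == "__main__":
    document = set("web search engines index pages".split())
    # term1 AND (term2 OR term3)
    query = ("AND", "search", ("OR", "robot", "index"))
    print(matches(query, document))
```

The parentheses in the written query correspond to the inner tuple: the OR is evaluated first, and its result is then combined with the outer AND.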

  • WebCrawler
    URL: http://webcrawler.com/
    Developer: Brian Pinkerton, University of Washington
    Other Info:
  • Purchased by America Online
  • Features:
  • Database size
  • Indexed Document/Content Search
  • Boolean: AND, OR
  • Ranked
  • Converts regular plurals to singular
  • Technical Description and a paper
  • Comments and Search Tips:
  • Excellent for quickly finding a top-level web page.
  • Difficult to achieve precision in searching.
  • Broad coverage of servers but less in-depth information
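Converting regular plurals to singular means that a search for "engines" also matches pages indexed under "engine". WebCrawler's actual rules are not described here, so the suffix rules below are assumptions that cover only common regular English plurals; irregular forms are deliberately left alone.

```python
def singularize(word):
    """Reduce a regular English plural to its singular form (naive rules)."""
    w = word.lower()
    if len(w) > 3 and w.endswith("ies"):
        return w[:-3] + "y"            # libraries -> library
    if w.endswith(("sses", "xes", "ches", "shes")):
        return w[:-2]                  # boxes -> box, searches -> search
    if w.endswith("s") and not w.endswith(("ss", "us", "is")):
        return w[:-1]                  # engines -> engine
    return w                           # already singular, or not a regular plural

if __name__ == "__main__":
    for term in ["libraries", "boxes", "searches", "databases", "class"]:
        print(term, "->", singularize(term))
```

Applying the same function to both the indexed words and the query terms is what makes the singular and plural forms meet in the index.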

  • Lycos: The Catalog of the Internet
    URL: http://www.lycos.com/
    Developer: Dr. Mauldin, Carnegie Mellon Univ.
    Other Info:
  • Now Lycos, Inc.
  • Acquired Point Communications
  • Features:
  • Database size: 51 million URLs
  • FAQ
  • Each indexed document record includes: the title, headings and subheadings, the 100 most significant words, the first 20 lines of the document, its size in bytes, and its number of words.
  • Lycos Search Language
  • Fast and flexible
  • Simple search (by default "OR" "Loose Match" and "Standard Results")
  • Search options can be changed
  • Search results ranking takes into account adjacency and number of occurrences
  • Comments and Search Tips:
  • Adjust search options before doing a multiple term search
  • Append a period (.) to a search term to stop stemming.
  • Search results display can be very confusing.
  • The database has not been purged for a while and contains much out-of-date information.
  • Search term match options range from "loose" to "strong", but there is no indication of what those options mean.
  • Each time the homepage is accessed, the ad page changes, even if you use the back arrow/button.
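The per-document record described in the feature list (title, headings, the 100 most significant words, the first 20 lines, size in bytes, and number of words) can be sketched as follows. How Lycos decides which words are "most significant" is not stated above, so this sketch assumes plain word frequency after dropping a tiny illustrative stopword list. The function name and the dictionary keys are likewise hypothetical.

```python
import re
from collections import Counter

# A deliberately tiny stopword list, for illustration only.
STOPWORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to"}

def lycos_style_record(title, headings, body, top_words=100, first_lines=20):
    """Build an index record holding the fields named in the feature list."""
    words = re.findall(r"[a-z0-9]+", body.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return {
        "title": title,
        "headings": list(headings),
        # Assumption: "significant" is approximated by raw frequency.
        "significant_words": [w for w, _ in counts.most_common(top_words)],
        "excerpt": body.splitlines()[:first_lines],
        "size_bytes": len(body.encode("utf-8")),
        "word_count": len(words),
    }

if __name__ == "__main__":
    text = "Search tools index the web.\nSearch engines rank pages.\n"
    record = lycos_style_record("Search Tools", ["Overview"], text)
    for key, value in record.items():
        print(key, "=", value)
```

Because only these fields are stored, a word that appears deep in a long document and is not among its most frequent words would not be searchable under this scheme.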

  • Open Text Web Index
    URL: http://www.opentext.com/omw/f-omw.html
    Developer: Tim Bray for Open Text Corporation
    Other Info:
  • Search engine used by Yahoo.
  • Features:
  • Database Size: about 10 billion words.
  • Simple Search: implied boolean AND (default); phrase search
  • Power Search: Searches in Anywhere, Summary, Title, First Heading, URL. Combine searches using Boolean (AND, OR, BUT NOT) and proximity (NEAR, FOLLOWED BY) operators.
  • Search Results: Summary of the page (taken from the first 100 words). See search terms in context; Find similar pages. A score, and file size.
  • Comments and Search Tips:
  • Truncation is allowed (The symbol: *).
  • Online help and FAQ are buried.
  • The search results scores are impossible to interpret.
  • "Find similar pages" returns dubious results.
  • This search engine has gone through several major changes over the past 6 months. Many features have been removed, such as "OR" search in Simple Search, the date of the webpage in search results, and Weighted Search.
  • Best Features:
  • Phrase search; URL search; FOLLOWED BY; Fielded search (Search in specific parts of a webpage); View search results in context.
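The difference between the two proximity operators, NEAR and FOLLOWED BY, is easiest to see on word positions: NEAR accepts the two terms in either order, while FOLLOWED BY requires the first term to come before the second. The sketch below is an illustration only. The window of 5 words is an assumption, since the distance Open Text actually uses is not given above, and the function names are hypothetical.

```python
def positions(tokens, term):
    """Return every index at which term occurs in the token list."""
    return [i for i, token in enumerate(tokens) if token == term]

def near(tokens, first, second, window=5):
    """True if the two terms occur within `window` words, in either order."""
    return any(abs(i - j) <= window
               for i in positions(tokens, first)
               for j in positions(tokens, second))

def followed_by(tokens, first, second, window=5):
    """True if `second` occurs after `first`, at most `window` words later."""
    return any(0 < j - i <= window
               for i in positions(tokens, first)
               for j in positions(tokens, second))

if __name__ == "__main__":
    tokens = "open text index of the web".split()
    print(near(tokens, "index", "open"))         # either order is accepted
    print(followed_by(tokens, "open", "index"))  # order matters
    print(followed_by(tokens, "index", "open"))
```

Both checks need word positions to be stored in the index, which is one reason proximity operators are offered by full-text engines rather than by those that keep only a summary of each page.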

Appendices

    Collections of Search Engines

    Internet Sleuth
    URL: http://www.isleuth.com/
    search.com
    URL: http://www.search.com/
    CUSI - Configurable Unified Search Index
    URL: http://www.eecs.nwu.edu/susi/cusi.html and the official site
    W3 Search Engines
    URL: http://cuiwww.unige.ch/meta-index.html
    All-in-One Search
    URL: http://www.albany.net/allinone/
    SearchPlex
    URL: http://www.west.net/~jbc/tools/search.html


    Z39.50 Gateway

    Library of Congress
    URL: http://lcweb.loc.gov/z3950/
    ZWeb: Library Search Gateway
    URL: http://zweb.cl.msu.edu/
    InterCat: OCLC Internet Cataloging Project's Catalog of Internet Resources
    URL: http://www.oclc.org:6990/