Title: |
US5864863:
Method for parsing, indexing and searching world-wide-web pages

|
Country: |
US United States of America

|
Inventor: |
Burrows, Michael; Palo Alto, CA

|
Assignee: |
Digital Equipment Corporation, Maynard, MA

|
Published / Filed: |
1999-01-26 / 1996-08-09

|
Application Number: |
US1996000696406

|
IPC Code: |
Advanced:
G06F 17/30;
IPC-7:
G06F 17/30;

|
ECLA Code: |
G06F17/30T1P1; G06F17/30W1;

|
U.S. Class: |
Current:
707/103.R;
707/003;
707/010;
707/104.1;
707/E17.086;
707/E17.108;
Original:
707/103;
707/003;
707/010;
707/104;

|
Field of Search: |
395/613,610,603,615,614
707/103,3,10,104

|
Priority Number: |
1996-08-09
US1996000696406

|
Abstract: |
A system indexes Web pages of the Internet. The pages are stored in computers distributively connected to each other by a communications network. Each page has a unique URL (universal record locator). Some of the pages can include URL links to other pages. A communication interface connected to the Internet is used for fetching a batch of Web pages from the computers in accordance with the URLs and URL links. The URLs are determined by an automated Web browser connected to the communications interface. A parser sequentially partitions the batch of specified pages into indexable words where each word represents an indexable portion of information of a specific page, or the word represents an attribute of one or more portions of the specific page. The parser sequentially assigns locations to the words as they are parsed. The locations indicate the unique occurrences of the word in the Web. The output of the parser is stored in a memory as an index. The index includes one index entry for each unique word. Each index entry also includes one or more location entries indicating where the unique word occurs in the Web. A query module parses a query into terms and operators. The operators relate the terms. A search engine uses object-oriented stream readers to sequentially read locations of specified index entries; the specified index entries correspond to the terms of a query. A display module presents qualified pages located by the search engine to users of the Web.
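
The parsing and indexing steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `build_index`, the tokenizing regular expression, the `page_starts` list and the two example URLs are all assumptions made for illustration, and attribute words are not modeled. It shows only the idea of a single location counter that runs across every page in a batch, with one index entry per unique word.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Build a toy word -> locations index from (url, text) pairs.

    Hypothetical helper: one location counter is shared by all pages,
    so each location identifies one word occurrence in the whole batch.
    """
    index = defaultdict(list)  # one entry per unique word
    page_starts = []           # (first location, url) for each page
    location = 0
    for url, text in pages:
        page_starts.append((location, url))
        # Assumed tokenizer: lowercase runs of letters and digits.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].append(location)  # locations stay in ascending order
            location += 1
    return dict(index), page_starts


# Example batch (made-up pages, not data from the patent).
batch = [
    ("http://example.com/a", "Web pages link to other Web pages"),
    ("http://example.com/b", "An index maps words to locations"),
]
index, page_starts = build_index(batch)
print(index["web"])    # locations of "web" in the batch
print(page_starts)     # where each page begins in the location space
```

Because locations are assigned in parse order, every location list is already sorted, which is what allows a reader to consume it sequentially.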

|
Attorney, Agent or Firm: |
Brinkman, Dirk ;

|
Primary / Asst. Examiners: |
Amsbury, Wayne;

|
Family: |
2 known family members

|
Claim |
I claim:
1. A system for indexing Web pages of the Internet, the pages stored in computers connected to each other by a communications network, each page having a unique URL (universal record locator), some of the pages including URL links to other pages, comprising:
- a communication interface for fetching a batch of specified pages of the Web from the computers in accordance with the URLs and URL links;
- a parser sequentially partitioning the batch of specified pages into indexable words, each word representing a portion of one specified page or an attribute of one or more portions of the specified page, the parser sequentially assigning locations to the words as they are parsed;
- a memory storing index entries, each index entry including a word entry representing a unique one of the words, and one or more location entries indicating where the unique word occurs in the Web;
- a query module parsing a query into terms and operators relating the terms;
- a search engine using object-oriented stream readers to sequentially read location of specified index entries, the specified index entries corresponding to the terms of a query; and
- a display module for presenting qualified pages located by the search engine to users of the Web.
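
The query module and stream readers named in the claim can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the claimed system: the class `LocationReader`, the functions `parse_query`, `pages_for` and `search`, the infix `AND`/`OR` query syntax, and the toy index data are all hypothetical. Each reader walks one index entry's locations in order, and the search combines the pages those locations fall in.

```python
from bisect import bisect_right


class LocationReader:
    """Hypothetical stream reader: yields one index entry's locations in order."""

    def __init__(self, locations):
        self._it = iter(locations)

    def next_location(self):
        # Returns None when the stream is exhausted.
        return next(self._it, None)


def parse_query(query):
    """Split an assumed 'term OP term OP term' query into terms and operators."""
    tokens = query.split()
    terms = [t.lower() for t in tokens[0::2]]
    operators = [t.upper() for t in tokens[1::2]]
    return terms, operators


def pages_for(reader, page_starts):
    """Read a location stream sequentially and collect the page numbers it hits."""
    pages = set()
    loc = reader.next_location()
    while loc is not None:
        # page_starts holds the first location of each page, ascending.
        pages.add(bisect_right(page_starts, loc) - 1)
        loc = reader.next_location()
    return pages


def search(query, index, page_starts, urls):
    """Evaluate the query left to right and return URLs of qualifying pages."""
    terms, operators = parse_query(query)
    result = pages_for(LocationReader(index.get(terms[0], [])), page_starts)
    for op, term in zip(operators, terms[1:]):
        hits = pages_for(LocationReader(index.get(term, [])), page_starts)
        if op == "AND":
            result &= hits
        elif op == "OR":
            result |= hits
        else:
            raise ValueError(f"unsupported operator: {op}")
    return [urls[p] for p in sorted(result)]


# Toy data (made up for illustration): two pages, starting at locations 0 and 7.
toy_index = {"web": [0, 5], "pages": [1, 6], "to": [3, 11], "index": [8]}
toy_page_starts = [0, 7]
toy_urls = ["http://example.com/a", "http://example.com/b"]

both = search("web AND pages", toy_index, toy_page_starts, toy_urls)
mixed = search("to AND index", toy_index, toy_page_starts, toy_urls)
either = search("web OR index", toy_index, toy_page_starts, toy_urls)
print(both, mixed, either)
```

Collecting whole page sets keeps the sketch short; a reader-based design would more plausibly advance the streams in lockstep so that no full set is ever held in memory.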

|
Forward References: |
127 U.S. patent(s) reference this one

|
Foreign References: |
None

|
Other Abstract Info: |
DERABS G1999-131672
DERABS G2000-146985

|
Other References: |
Business Wire, Open Text's Web Search Server for OEM's; Offers Unique Intelligent Search Capabilities, p. 9181355 Jan. 1, 1995.
Information Intelligence Inc., World Wide Web Search Engines: AltaVista & Yahoo, Dr Link, Accession No. 3168688 May 1, 1996.
Yuwono et al, Wise: A World Wide Web Resource Database System, IEEE Transactions on Knowledge and Data Engineering, vol. 8, No. 4, Aug. 1996 Apr. 29, 1996.
Steinberg, Seek and Ye Shall Find (maybe), Wired, May 1, 1996, p. 108 et al.
