05675819 is referenced by 403 patents and cites 3 patents.

A method and apparatus access relevant documents based on a query. A thesaurus of word vectors is formed for the words in the corpus of documents. The word vectors represent global lexical co-occurrence patterns and relationships between word neighbors. Document vectors, which are formed from the combination of word vectors, are in the same multi-dimensional space as the word vectors. A singular value decomposition is used to reduce the dimensionality of the document vectors. A query vector is formed from the combination of word vectors associated with the words in the query. The query vector and document vectors are compared to determine the relevant documents. The query vector can be divided into several factor clusters to form factor vectors. The factor vectors are then compared to the document vectors to determine the ranking of the documents within the factor cluster.
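
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The function names (`build_word_vectors`, `combine`, `cosine`, `rank`), the whitespace tokenizer, the co-occurrence window of 2, and the use of cosine similarity are all assumptions made for illustration. The singular value decomposition step and the factor clustering are omitted, so the vectors here stay in the full, sparse co-occurrence space rather than a reduced one.

```python
import math
from collections import Counter, defaultdict


def build_word_vectors(docs, window=2):
    """Map each word to a sparse vector of co-occurrence counts.

    Assumption: neighbors are the words within `window` positions
    on either side, counted over the whole corpus.
    """
    vectors = defaultdict(Counter)
    for doc in docs:
        tokens = doc.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def combine(tokens, word_vectors):
    """Sum the word vectors of the given tokens (unknown words are skipped)."""
    total = Counter()
    for token in tokens:
        if token in word_vectors:
            total.update(word_vectors[token])
    return total


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(value * b.get(key, 0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs, word_vectors):
    """Return document indices ordered from most to least similar to the query."""
    query_vec = combine(query.lower().split(), word_vectors)
    scored = []
    for index, doc in enumerate(docs):
        doc_vec = combine(doc.lower().split(), word_vectors)
        scored.append((cosine(query_vec, doc_vec), index))
    scored.sort(key=lambda pair: -pair[0])
    return [index for _, index in scored]


# Hypothetical toy corpus, for illustration only.
docs = [
    "the cat chased the mouse",
    "the stock market fell sharply",
    "a cat and a mouse played",
]
word_vectors = build_word_vectors(docs)
ranking = rank("cat mouse", docs, word_vectors)
```

Because documents and queries are both built by summing word vectors, they live in the same space and can be compared directly. In this toy corpus the stock market document ranks last for the query "cat mouse".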

Title
Document information retrieval using global word co-occurrence patterns
Application Number
8/260575
Publication Number
5675819
Application Date
June 16, 1994
Publication Date
October 7, 1997
Inventor
Hinrich Schuetze
Stanford
CA, US
Agent
Oliff & Berridge
Assignee
Xerox Corporation
CT, US
IPC
G06F 15/21
G06F 15/38