A method for generating context vectors for use in a document storage and retrieval system. A context vector is a fixed-length list of component values generated to approximate conceptual relationships. A context vector is generated for each word stem. For a core group of word stems, the component values may be manually determined on the basis of conceptual relationships to word-based features. The core group of context vectors is then used to generate the remaining context vectors, based on the proximity of each word stem to other words and on the context vectors assigned to those words. Alternatively, the core group may be generated by initially assigning each core word stem a row vector from an identity matrix and then performing the proximity-based algorithm. Context vectors may be revised as new records are added to the system, based on the proximity relationships between word stems in the new records.
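
The proximity-based generation described above can be illustrated with a minimal sketch. It is not the claimed implementation: the function names, the window size, the inverse-distance weighting, and the unit-length normalization are all assumptions chosen for illustration. Core word stems receive rows of an identity matrix, and every other stem receives the normalized, weighted sum of the context vectors of nearby core stems.

```python
import math
from collections import defaultdict


def identity_core_vectors(core_stems):
    """Assign each core stem one row of an identity matrix (hypothetical helper)."""
    n = len(core_stems)
    return {stem: [1.0 if i == j else 0.0 for j in range(n)]
            for i, stem in enumerate(core_stems)}


def normalize(vec):
    """Scale a vector to unit length; leave an all-zero vector unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def derive_vectors(records, core_vectors, window=2):
    """Derive vectors for non-core stems from nearby core stems.

    records: list of records, each a list of word stems in order.
    window and the 1/distance weight are illustrative assumptions.
    """
    dim = len(next(iter(core_vectors.values())))
    sums = defaultdict(lambda: [0.0] * dim)
    for stems in records:
        for i, stem in enumerate(stems):
            if stem in core_vectors:
                continue  # core vectors are kept fixed in this sketch
            lo, hi = max(0, i - window), min(len(stems), i + window + 1)
            for j in range(lo, hi):
                neighbor = stems[j]
                if j == i or neighbor not in core_vectors:
                    continue
                weight = 1.0 / abs(i - j)  # closer neighbors count more
                for k, value in enumerate(core_vectors[neighbor]):
                    sums[stem][k] += weight * value
    vectors = dict(core_vectors)
    for stem, total in sums.items():
        vectors[stem] = normalize(total)
    return vectors


if __name__ == "__main__":
    core = identity_core_vectors(["money", "water"])
    records = [["bank", "loan", "money"], ["river", "bank", "water"]]
    for stem, vec in derive_vectors(records, core).items():
        print(stem, vec)
```

In this sketch a stem with no core stem inside its window receives no vector. Revising vectors as new records arrive could be approximated by calling `derive_vectors` again on the enlarged record set, although the abstract does not specify the update rule.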