Content Classification Routines
Different vendors provide different content classification routines, including the following.
- Keywords. A set of related search keywords linked to a preferred term. Evidence for a term often comes from a mix of query expansion and regular expressions.
- Natural language processing rules [Smartlogic capability]. A rule logic language with grammatical, syntactic, natural-language, proximity/positional, and Boolean operators that allows words and phrases to be combined in complex ways.
- Statistical. Statistical analysis (often, but not necessarily, Bayesian) of word frequencies and proximities, based on a sample set.
- Entity. Algorithms that identify types of entity and match them against a reference dictionary, sometimes coupled with syntactic analysis [Smartlogic capability].
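
To make two of the routines above concrete, here is a minimal Python sketch of a keyword routine (regular expressions as evidence for a preferred term) and a statistical routine (a small multinomial naive Bayes classifier trained on a sample set). It does not reflect any vendor's implementation. The rule table, the preferred terms, the sample texts, and all function names are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical keyword rules: preferred term -> regular expressions that
# count as evidence for that term. The alternations stand in for simple
# query expansion (plurals, related word forms).
KEYWORD_RULES = {
    "Mergers and acquisitions": [
        r"\bmergers?\b",
        r"\bacqui(?:re[sd]?|sitions?)\b",
        r"\btakeovers?\b",
    ],
    "Renewable energy": [
        r"\bsolar\b",
        r"\bwind (?:farm|turbine)s?\b",
        r"\brenewables?\b",
    ],
}


def classify_by_keywords(text):
    """Return {preferred term: number of pattern matches} for terms with evidence."""
    hits = {}
    for term, patterns in KEYWORD_RULES.items():
        count = sum(
            1
            for pattern in patterns
            for _ in re.finditer(pattern, text, flags=re.IGNORECASE)
        )
        if count:
            hits[term] = count
    return hits


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_naive_bayes(samples):
    """Count word frequencies per label from (text, label) sample pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocabulary = set()
    for text, label in samples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocabulary.update(tokens)
    return word_counts, label_counts, vocabulary


def classify_naive_bayes(model, text):
    """Return the label with the highest log-probability under the model."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    tokens = tokenize(text)
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior from the share of sample documents carrying this label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokens:
            # Laplace (add-one) smoothing so unseen words do not zero the score.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocabulary))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical sample set used to train the statistical routine.
SAMPLES = [
    ("the merger and acquisition closed", "finance"),
    ("shareholders approved the takeover bid", "finance"),
    ("solar panels and wind turbines", "energy"),
    ("new wind farm generates power", "energy"),
]
MODEL = train_naive_bayes(SAMPLES)
```

For example, `classify_by_keywords("The acquisition followed a failed merger.")` reports two pieces of evidence for the hypothetical term "Mergers and acquisitions", while `classify_naive_bayes(MODEL, "wind power from solar farm")` picks the label whose sample documents share the most vocabulary with the input. The keyword routine is transparent and easy to edit by hand; the statistical routine needs no hand-written rules but depends entirely on the quality of the sample set.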