Classification and Text Mining Server

Semaphore Classification and Text Mining Server Delivers Innovative and Powerful Mechanisms to Analyze and Classify Text

Modules of Semaphore Content Classification and Text Mining Server

Content classification is the process of analyzing a document and adding metadata 'tags' that describe it. The tags are sourced from a taxonomy or another form of controlled vocabulary.
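As a minimal illustration of the idea, and not Semaphore's actual API or algorithm, the Python sketch below tags a document with concepts from a small controlled vocabulary by matching each concept's labels in the text. The vocabulary, the `classify` function and the sample text are all hypothetical.

```python
import re

# Hypothetical controlled vocabulary: concept -> labels (preferred label and synonyms).
VOCABULARY = {
    "Cardiology": ["cardiology", "heart disease", "cardiac"],
    "Oncology": ["oncology", "cancer", "tumor"],
}


def classify(text, vocabulary):
    """Return the sorted list of concepts whose labels occur in the text."""
    lowered = text.lower()
    tags = set()
    for concept, labels in vocabulary.items():
        for label in labels:
            # Whole-word match so a label is not matched inside a longer word.
            if re.search(r"\b" + re.escape(label) + r"\b", lowered):
                tags.add(concept)
                break  # one matching label is enough to assign the tag
    return sorted(tags)


document = "The patient was referred for cardiac screening after a tumor was ruled out."
print(classify(document, VOCABULARY))  # ['Cardiology', 'Oncology']
```

A production classifier would add linguistic processing such as stemming and would weigh the evidence rather than stop at a single match, but the output has the same shape: a set of vocabulary terms attached to the document as metadata.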

Modules include:

  • Classification Server. The enterprise-scalable classification and text analysis processing engine.
  • Rule and Template Editor. An expert client tool to generate the rulebase templates and build custom rules.
  • Rulebase Generator. The processing stream that generates the rulebases from the Semaphore model.
  • Language Packs. Optional packs that extend the natural language processing capability to additional languages.
  • Advanced Language Packs. Optional packs that each provide entity extraction capability for a specific language.

Benefits of Semaphore Classification Server

Semaphore's content classification and text mining services have been developed and proven in large customer environments to be:

  • Flexible. Semaphore Classification Server offers multiple outputs such as rulebase classification, entity extraction and term extraction.
  • Scalable. Semaphore Classification Server is in production classifying millions of documents a day, running on multiple CPU cores in parallel.
  • Efficient. Huge taxonomies such as MeSH and SNOMED can be loaded into the system without impacting performance.
  • Robust. Semaphore Classification Server has been stress tested with hundreds of content formats and millions of documents.
  • Accurate. Semaphore Classification Server delivers accurate classification.
  • Rapidly Configurable. Semaphore Classification Server can be configured to generate rules automatically. This yields more accurate, transparent and controllable results, and saves months of time by eliminating the need for training sets.
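
To illustrate why generating rules from a model removes the need for training sets, here is a rough Python sketch. It is not Semaphore's rulebase format or generator; the `Concept` and `Rule` types, the weights and the threshold are assumptions chosen for illustration. Each concept's preferred and alternative labels become weighted rules, and a document is tagged when the rules that fire reach the threshold, so every tag can be traced back to specific rules.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A taxonomy concept (hypothetical structure, not Semaphore's model)."""
    name: str
    preferred_label: str
    alt_labels: list = field(default_factory=list)


@dataclass
class Rule:
    concept: str
    pattern: re.Pattern
    weight: float


def generate_rules(concepts, preferred_weight=1.0, alt_weight=0.6):
    """Derive one rule per label directly from the taxonomy; no training data is used."""
    rules = []
    for c in concepts:
        labels = [(c.preferred_label, preferred_weight)]
        labels += [(alt, alt_weight) for alt in c.alt_labels]
        for label, weight in labels:
            pattern = re.compile(r"\b" + re.escape(label) + r"\b", re.IGNORECASE)
            rules.append(Rule(c.name, pattern, weight))
    return rules


def score(text, rules, threshold=1.0):
    """Sum the weights of the rules that fire; keep concepts at or above the threshold."""
    totals = {}
    for rule in rules:
        if rule.pattern.search(text):
            totals[rule.concept] = totals.get(rule.concept, 0.0) + rule.weight
    return {concept: total for concept, total in totals.items() if total >= threshold}


taxonomy = [
    Concept("Diabetes", "diabetes mellitus", ["diabetes", "high blood sugar"]),
    Concept("Hypertension", "hypertension", ["high blood pressure"]),
]
rules = generate_rules(taxonomy)
result = score("Patients with diabetes mellitus often have high blood pressure.", rules)
print(result)
```

In this example only Diabetes is returned: two of its rules fire, while Hypertension is supported by a single alternative label whose assumed weight falls below the threshold. Adjusting a weight or the threshold changes the outcome in a predictable way, which is the kind of control a statistically trained classifier does not offer as directly.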