Text Miner

Analyze your information and discover the language of your organization

Jump-start the model building process with Text Miner, an easy-to-use application that mines information assets to identify the vocabulary of the enterprise. Text Miner searches a set of assets and highlights the language you use most, using a simple iterative process:

  • Point Text Miner to a set of information assets wherever they reside – on hard drives, in the cloud, in information management platforms such as SharePoint, or in enterprise search engines.
  • Discover potential concepts – using complex noun phrase and entity extraction algorithms, Text Miner analyzes information assets and extracts candidate concepts and phrases from your data set.
  • Review results – Text Miner provides statistical counts and frequencies of candidate concepts, and groups concepts by similar phrase content in an easy-to-use interface so you can see each concept in context.
  • Drag and drop candidate concepts and preferred or alternative labels into your model so that it reflects the language of the enterprise.
  • Solicit model feedback using Semaphore’s Ontology Review Tool – users can visually browse the model and provide feedback.

Iteratively building your model and soliciting feedback from stakeholders will ensure acceptance throughout the organization and result in a model that accurately reflects the enterprise.
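To make the discover-and-review steps concrete, here is a minimal Python sketch of candidate phrase mining. It is not Text Miner's algorithm: it swaps in a simple stopword-delimited phrase extractor for real noun phrase and entity extraction, and the names `STOPWORDS`, `candidate_phrases`, `mine` and `in_context` are hypothetical. It reports how often each candidate occurs, how many assets contain it, and where it appears in context.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "and", "or", "to",
             "is", "are", "for", "with", "by", "our", "was"}

def candidate_phrases(text, max_words=3):
    """Yield every phrase of 1..max_words consecutive non-stopword tokens."""
    for sentence in re.split(r"[.!?;:,\n]+", text.lower()):
        run = []
        # The trailing None flushes the final run of each sentence.
        for token in re.findall(r"[a-z][a-z0-9'-]*", sentence) + [None]:
            if token is None or token in STOPWORDS:
                for i in range(len(run)):
                    for j in range(i + 1, min(i + max_words, len(run)) + 1):
                        yield " ".join(run[i:j])
                run = []
            else:
                run.append(token)

def mine(documents, max_words=3):
    """Return (counts, doc_freq): total occurrences and assets containing each phrase."""
    counts, doc_freq = Counter(), Counter()
    for doc in documents:
        phrases = list(candidate_phrases(doc, max_words))
        counts.update(phrases)
        doc_freq.update(set(phrases))
    return counts, doc_freq

def in_context(documents, phrase, width=20):
    """Return text snippets showing each occurrence of the phrase in context."""
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    snippets = []
    for doc in documents:
        for match in pattern.finditer(doc):
            start = max(0, match.start() - width)
            end = min(len(doc), match.end() + width)
            snippets.append(doc[start:end])
    return snippets

if __name__ == "__main__":
    docs = ["The risk report covers credit risk.", "Credit risk is rising."]
    counts, doc_freq = mine(docs)
    for phrase, n in counts.most_common(5):
        print(phrase, n, doc_freq[phrase])
    print(in_context(docs, "credit risk"))
```

A reviewer would scan the ranked candidates, check the snippets, and promote only the phrases that belong in the model.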

Benefits of text mining with Semaphore

Semaphore Text Miner lets information scientists and subject matter experts efficiently develop enterprise models without complicated languages or processes. Text Miner gives you:

  • Increased productivity – model building time is reduced
  • Improved rule-based classification accuracy – your model reflects your information
  • A simple-to-use graphical user interface (GUI) – no coding
  • Counts and frequencies of the concepts found within your information – you can see how important each concept is to your organization and decide whether it should be included in the model
  • A display to view each concept in its original context
  • A model that closely reflects the organization – the concepts, labels and relationships in the model come directly from the assets of the organization

The model generates rulebases, which are combined with classification strategies to produce precise and consistent metadata. Jump-start the model building process with Semaphore Text Miner so you can immediately reap the benefits of model-driven classification.

Explore your model with Semaphore’s Ontology Review Tool

Semaphore’s Ontology Review Tool lets subject matter experts and users throughout the enterprise visually browse the model and provide feedback to improve the model building process.

To begin the model review process:

  • The model development team sends the model review team a link from Ontology Editor, along with instructions that can include the sections to review, the review end date, and so on.
  • Reviewers can visually browse the model and leave comments about a concept, label or relationship based on their knowledge and use case.
  • Model developers can view all comments en masse or comments for a single concept, label or relationship and adjust the model as appropriate.

With the Ontology Review Tool, model builders and subject matter experts can iteratively build models that accurately reflect the organization and drive precise and consistent metadata tagging to improve search and discovery within the organization.

Test results with Rulebase Generator

Semaphore Rulebase Generator (SRG) creates model-driven rulebases by extracting the concepts, labels and relationships from within your taxonomy or ontology. These rulebases are passed to Semaphore Classification Server, giving it a precise, consistent and complete set of rules that drive content classification.

SRG provides a feature-rich set of rules, templates and mechanisms that produce sophisticated automatic classification results using:

  • Twenty (20) customizable business rule types, multiple control attributes, expressions and wildcard strategies.
  • Mechanisms to precisely tune and manage rulebase results to remove ambiguity – for example, the concept Apple might reference a company, a fruit, a singer or New York City; SRG provides mechanisms to help you differentiate between these meanings.
  • Concept weighting – concepts within the model are weighted based on their ability to discriminate specific topics. Classification scores are adjusted according to the frequency of a concept, where it is located within an asset (e.g. header, body or footer) and its context – what it’s about – within the content.
  • Associated concepts – identifies information that is about a specific concept, as opposed to content that merely mentions it. For example, the technology company Apple can be associated with other concepts such as iPhone and Steve Jobs, alternative labels such as AAPL and Apple Inc., and even negative evidence such as the Big Apple or Fiona Apple.
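The weighting and associated-concept ideas above can be illustrated with a small Python sketch. It does not reproduce SRG's rule types or scoring. The location multipliers, the evidence weights and the names `LOCATION_WEIGHT`, `Concept` and `score` are all assumptions chosen for illustration. Each piece of evidence adds to or subtracts from a concept's score, scaled by how often it appears and where.

```python
from dataclasses import dataclass

# Hypothetical multipliers: evidence in a header counts more than in a footer.
LOCATION_WEIGHT = {"header": 2.0, "body": 1.0, "footer": 0.5}

@dataclass
class Concept:
    name: str
    evidence: dict  # phrase -> weight; negative weights model negative evidence

def score(concept, asset):
    """Score an asset (a dict of location -> text) against one concept.

    Uses naive case-insensitive substring counting, so "apple" also matches
    inside "big apple"; the negative weights are sized to outweigh that.
    """
    total = 0.0
    for location, text in asset.items():
        lowered = text.lower()
        multiplier = LOCATION_WEIGHT.get(location, 1.0)
        for phrase, weight in concept.evidence.items():
            total += lowered.count(phrase.lower()) * weight * multiplier
    return total

# Illustrative weights only, built from the Apple example in the text.
APPLE_INC = Concept(
    name="Apple (technology company)",
    evidence={
        "apple": 1.0,         # the bare label is weak, ambiguous evidence
        "iphone": 2.0,        # associated concepts
        "steve jobs": 2.0,
        "aapl": 3.0,          # alternative labels
        "apple inc": 3.0,
        "big apple": -4.0,    # negative evidence
        "fiona apple": -4.0,
    },
)

if __name__ == "__main__":
    tech = {"header": "Apple unveils new iPhone",
            "body": "Apple Inc shares (AAPL) rose."}
    city = {"header": "Weekend in the Big Apple",
            "body": "New York City guide."}
    print(score(APPLE_INC, tech))
    print(score(APPLE_INC, city))
```

An asset would be tagged with the concept only when its score clears a threshold, which is how a passing mention or a different sense of the word can be filtered out.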

The combined power of the Semaphore Text Miner, Ontology Review and Rulebase Generator tools lets you build, manage and validate your models to ensure classification results accurately reflect your organization and content.
