Big Data analytics is the process of examining large data sets containing a variety of data types to reveal patterns, market trends, product information, customer preferences and other business information. Increasingly, organizations are examining their data in search of actionable insights that answer key business questions. With the Semaphore platform, organizations can drive effective marketing, increase revenue and improve customer service, operational efficiency and risk management.

What is Big Data?

Gartner describes Big Data as spanning three dimensions: volume, velocity and variety. These dimensions describe the way that the information available to the enterprise is changing: the exponential growth of information, the increasing amount and rate of change, and the diversity of data types and formats. The marketplace has responded with new technologies to support the use of all this information – schema-agnostic data stores to accommodate varying data formats, infrastructure technologies to deal with the growth and change, and a variety of analytical tools to make sense of it all.

Yet with all the hype surrounding Big Data, the richest source of information – unstructured information – is ignored. Information assets such as contracts, proposals, meeting minutes, presentations, emails, spreadsheets, position papers and conversations represent approximately 80% of enterprise information and contain most of the human intelligence and insight. Organizations struggle to analyze and make sense of this information because it typically requires a human to examine the content, determine its nature and extract the facts, entities and relationships contained within. Semaphore, Smartlogic’s sophisticated semantic platform, solves this problem.

Big Data analytics and semantics result in better information

Today’s organizations strive to analyze and gain insight from their information to drive key business decisions, increase stakeholder intimacy, improve operations and manage regulatory and reputational risk. The Semaphore platform lets them augment their data sets with the insight hidden in their unstructured content.

  • Insurance companies can analyze years’ worth of claims adjuster reports to identify patterns and indicators of fraud, improving the speed and accuracy of claim processing.
  • Pharmaceutical companies can automate the process of identifying patterns that indicate unknown side effects of drugs in the market by analyzing adverse event reports received from healthcare providers – replacing a costly, error-prone manual process.
  • Financial institutions can analyze correspondence and borrower interaction to identify patterns of behavior that indicate potential loan and mortgage default.
  • Healthcare providers can mine the information found in patient records and in doctors’, social workers’ and visiting nurses’ notes to identify patients at risk of hospital readmission.

Semaphore drives Big Data analytics

Semaphore, our sophisticated platform, combines semantic technology and information science to let you identify, classify, extract, analyze and expose the valuable information hidden in unstructured assets based on its true meaning and context. Semaphore brings structure to the unstructured and scales to handle Big Data volumes.

Semaphore will:

  • Build a model with Semaphore Ontology Editor that contains the topics, concepts, labels and relationships reflecting the unique characteristics of a problem domain. This model can leverage externally developed vocabularies (such as MeSH and MedDRA in healthcare) and link them to the vocabularies that are unique to the organization.
  • Extract facts, entities, relationships and sentiment from unstructured content to provide additional valuable data points for analysis.
  • Express the information extracted from the content in a standard semantic format (RDF triples) to be combined with other, more readily accessible data for analysis.
  • Provide the ability to visualize and traverse the model to explore concepts and relationships.
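
As an illustration of the RDF triple format mentioned above, here is a minimal Python sketch that writes extracted information as N-Triples, one of the standard RDF serializations. It is not Semaphore's API or output: the `http://example.org/` namespace, the predicate names and the sample report are hypothetical, and the serializer handles only IRIs and plain string literals.

```python
from typing import Iterable, NamedTuple


class Triple(NamedTuple):
    subject: str              # IRI of the thing being described
    predicate: str            # IRI of the property
    obj: str                  # IRI or plain literal value
    obj_is_iri: bool = False  # True when obj is an IRI rather than a literal


def escape_literal(value: str) -> str:
    """Escape backslashes, double quotes and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def to_ntriples(triples: Iterable[Triple]) -> str:
    """Serialize triples as N-Triples: one '<s> <p> o .' statement per line."""
    lines = []
    for t in triples:
        obj = f"<{t.obj}>" if t.obj_is_iri else f'"{escape_literal(t.obj)}"'
        lines.append(f"<{t.subject}> <{t.predicate}> {obj} .")
    return "\n".join(lines)


EX = "http://example.org/"  # hypothetical namespace, for illustration only

# Hypothetical facts extracted from one adverse event report.
extracted = [
    Triple(EX + "doc/report-17", EX + "mentionsDrug", EX + "concept/aspirin", True),
    Triple(EX + "doc/report-17", EX + "reportedReaction", "nausea"),
]

ntriples = to_ntriples(extracted)
print(ntriples)
```

Because each statement is a self-contained subject, predicate and object, triples produced from documents can be loaded into the same store as triples derived from structured data and queried together.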

Semaphore employs semantic techniques to enrich information and power Big Data analytics

Semaphore uses sophisticated semantic techniques to perform:

  • Natural Language Processing (NLP) – the identification, analysis and description of the structure of a language’s linguistic units, together with lemmatization, part-of-speech tagging and part-of-speech sequence characterization, to precisely tag text.
  • The model is then applied to the output of the NLP process to perform:
    • Named entity extraction – the use of patterns and dictionaries to locate and classify elements such as persons, organizations, locations, expressions of time, monetary units, quantities, etc. found within a block of text.
    • Topic, subject and thematic classification – combines concept evidence from a taxonomy or ontology to create detailed linguistic processing rules. This means documents are tagged with the topics they are “about” as opposed to themes they merely “mention.”
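
The difference between a document being “about” a topic and merely “mentioning” it can be sketched in a few lines of Python. This is a simplified stand-in for the idea of combining concept evidence, not Semaphore's rule engine: the topics, evidence terms, weights and threshold below are all hypothetical, and the text is matched as raw strings rather than NLP output.

```python
import re

# Hypothetical model fragment: each topic lists evidence terms and their weights.
TOPIC_EVIDENCE = {
    "Insurance fraud": {
        "fraud": 3.0,
        "staged accident": 2.0,
        "false claim": 2.0,
        "claim": 0.5,
    },
    "Hospital readmission": {
        "readmission": 3.0,
        "discharge": 1.0,
        "follow-up": 1.0,
    },
}

# Assumed cut-off: a document is "about" a topic only at or above this score.
ABOUT_THRESHOLD = 4.0


def count_term(term: str, text: str) -> int:
    """Count whole-word, case-insensitive occurrences of a term in the text."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def score_topics(text: str) -> dict:
    """Sum weighted evidence for every topic in the model."""
    return {
        topic: sum(weight * count_term(term, text) for term, weight in evidence.items())
        for topic, evidence in TOPIC_EVIDENCE.items()
    }


def classify(text: str) -> list:
    """Return the topics whose combined evidence reaches the threshold."""
    return sorted(t for t, s in score_topics(text).items() if s >= ABOUT_THRESHOLD)


sample = (
    "The adjuster suspects fraud: the claim describes a staged accident, "
    "and a second false claim was filed last year. "
    "The claimant had a hospital discharge in March."
)

scores = score_topics(sample)
topics = classify(sample)
print(scores)
print(topics)
```

In the sample, several independent pieces of evidence push “Insurance fraud” over the threshold, while a single passing reference to a discharge is not enough to tag the document with “Hospital readmission”.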

Following the entity extraction and topic classification processes, entities, topics, subjects and themes can be studied to determine how they relate to other elements of the content using:

  • Fact extraction – processes text to find patterns identified by their proximity to a phrase or an entity, such as references, project codes, prices and credit card numbers.
  • Relationship extraction – fine-grained entity and fact extraction describes the entities and facts within a document. Fact extraction rules then allow the discovery of relationships between entities within documents, and the correlation of those relationships into groups to derive enhanced meaning.
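
Proximity-based fact extraction can be illustrated with a short Python sketch: each rule pairs a trigger phrase with a value pattern that must appear close to it. The rule names, trigger phrases and value formats here are assumptions made for the example and do not represent Semaphore's rule syntax.

```python
import re

# Hypothetical rules: a trigger phrase followed, within a short window,
# by a value in an assumed format.
FACT_RULES = {
    # "project code" then up to 5 non-word characters, then a code like ABC-1234.
    "project_code": re.compile(
        r"(?i:project\s+code)\W{1,5}(?P<value>[A-Z]{2,4}-\d{3,6})"
    ),
    # "price", "cost" or "fee" then up to 20 non-digit characters, then a dollar amount.
    "price": re.compile(
        r"(?i:price|cost|fee)\D{0,20}?(?P<value>\$\s?\d[\d,]*(?:\.\d{2})?)"
    ),
}


def extract_facts(text: str) -> list:
    """Return (fact_type, value) pairs for every rule match in the text."""
    facts = []
    for name, rule in FACT_RULES.items():
        for match in rule.finditer(text):
            facts.append((name, match.group("value")))
    return facts


sample = (
    "The agreed price for phase one is $12,500.00, "
    "tracked under project code ABC-1234."
)

facts = extract_facts(sample)
print(facts)
```

Tying the value pattern to a nearby trigger phrase is what separates a fact from a coincidental match: a string shaped like a project code is only reported when the text around it says that is what it is.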

With the Semaphore platform, you can identify, classify, extract, analyze and expose information hidden in your unstructured content, based on its true meaning and context, to drive decision-making. In Big Data analytics, semantics moves information analysis and decision-making to the next level.