What is an ontology and how does it help with content classification?

 

At its most basic level, an ontology is a set of words and concepts that are related to one another. Smartlogic recognized that, taken together, these terms and the relationships between them provide the evidence that can drive a rule-based classification approach.

For example, suppose the preferred term in an ontology is "Home Ownership". Click on the image on this page to see this example in the Ontology Manager software.

In the ontology, "Home Ownership" has:

  • many equivalence terms such as "Buying a house", "Owner occupation"
  • child terms such as "Houses for Sale", "Mortgages"
  • related topics such as "Stamp Duties" (a UK Government tax on house purchases), "Self build projects"
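As a rough illustration, the "Home Ownership" entry above could be modelled as a small data structure. This is only a sketch: the class name `Concept`, its field names and the `evidence_terms` helper are hypothetical and are not the schema used by Ontology Manager.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One ontology entry. Field names are illustrative assumptions only."""
    preferred_term: str
    equivalent_terms: list[str] = field(default_factory=list)
    child_terms: list[str] = field(default_factory=list)
    related_terms: list[str] = field(default_factory=list)

    def evidence_terms(self) -> list[str]:
        """Return every term that could act as evidence for this concept."""
        return [
            self.preferred_term,
            *self.equivalent_terms,
            *self.child_terms,
            *self.related_terms,
        ]


# The example entry from the text above.
home_ownership = Concept(
    preferred_term="Home Ownership",
    equivalent_terms=["Buying a house", "Owner occupation"],
    child_terms=["Houses for Sale", "Mortgages"],
    related_terms=["Stamp Duties", "Self build projects"],
)
```

Keeping the three relationship types in separate fields matters because, as the rule below shows, each type can contribute a different amount of evidence.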

In our classification logic, we can create a rule that says:

Return the tag "Home Ownership" based on the following evidence:

  • If the term itself appears as an exact phrase in the title, return the tag
  • If the phrase "owner occupation" is in the text, add some weight. Note that this must be treated as a phrase: the words "owner" and "occupation" in isolation are completely misleading and should not contribute to "Home Ownership"
  • If the rulebase for "Mortgages" has been returned, pass a high score up to "Home Ownership"
  • If the rulebase for "Self build projects" has been returned, pass a low score up to "Home Ownership"
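The rule above might be sketched in Python as follows. This is not Smartlogic's rule language: the function names, the numeric weights and the threshold are all assumptions made up for illustration, chosen only so that an exact title match returns the tag on its own while weaker evidence has to accumulate.

```python
import re

# All weights and the threshold are invented for illustration.
TITLE_EXACT_WEIGHT = 1.0  # preferred term as an exact phrase in the title
PHRASE_WEIGHT = 0.4       # "owner occupation" found as a phrase in the text
CHILD_WEIGHT = 0.8        # high score passed up from the "Mortgages" rulebase
RELATED_WEIGHT = 0.2      # low score passed up from "Self build projects"
THRESHOLD = 0.5           # minimum score needed to return the tag


def contains_phrase(phrase: str, text: str) -> bool:
    """True only if the words occur together, in order, ignoring case."""
    words = [re.escape(word) for word in phrase.split()]
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def score_home_ownership(title: str, body: str, fired: set[str]) -> float:
    """Sum the evidence; `fired` names the rulebases already returned."""
    score = 0.0
    if contains_phrase("Home Ownership", title):
        score += TITLE_EXACT_WEIGHT
    if contains_phrase("owner occupation", body):
        score += PHRASE_WEIGHT
    if "Mortgages" in fired:
        score += CHILD_WEIGHT
    if "Self build projects" in fired:
        score += RELATED_WEIGHT
    return score


def tag_document(title: str, body: str, fired: set[str]) -> list[str]:
    """Return the tag only when the accumulated evidence is strong enough."""
    if score_home_ownership(title, body, fired) >= THRESHOLD:
        return ["Home Ownership"]
    return []
```

Because `contains_phrase` matches the whole phrase, a sentence such as "The owner changed occupation" contributes nothing, which is the behaviour the second bullet asks for. With these assumed weights, the "Mortgages" rulebase alone is enough to return the tag, whereas "Self build projects" alone is not.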