Posted on: November 16, 2015, by: Ann Kelly
A few weeks ago we had a new addition in the Holland household. His name is Arthur Herbert Fonzarelli but we call him “Fonzie” for short. Now Fonzie is a Briard puppy and he joins four other dogs in our increasingly busy kitchen. As you can imagine with so many dogs, in the interests of the structural integrity of our house, proper training is important. So we’ve started teaching Fonzie the basics – sit, stand, down and stay. For the uninitiated, the Briard is a French sheepdog (if you’re old enough to remember the Looney Tunes cartoons, think Sam Sheepdog), so he gets sit, down and stay with relatively little effort. However, when it comes to stand, it’s proving to be something of a challenge.
Our customers often ask us about machine learning and how it relates to the work that we do. It occurs to me that machine learning is much like training a puppy – you give it many examples of the desired behaviour and you reward it for performing the correct behaviour on command. For example, if you want to find out whether a document is about a particular topic, you feed the model a load of example documents and tell it which ones are about the topic. You then give it an unknown document and ask whether it relates to that topic. If the model gives you the correct answer, you provide positive feedback and the model is updated to include the previously unknown document. Over many repetitions the model gets smarter, in much the same way as a puppy acts more reliably on a given command if you’ve plied him with dog biscuits in many situations!
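For the curious, here is a minimal sketch of that train-ask-reward loop in Python. It is a toy naive Bayes word counter, not how any particular product works; the class name `TopicModel`, the helper `ask_and_give_feedback` and the example sentences are all invented for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


class TopicModel:
    """Toy naive Bayes model for one question: is this document about the topic?"""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def learn(self, text, on_topic):
        """Fold one labelled example into the model."""
        self.word_counts[on_topic].update(tokenize(text))
        self.doc_counts[on_topic] += 1

    def _log_score(self, words, label):
        counts = self.word_counts[label]
        total = sum(counts.values())
        vocab = len(set(self.word_counts[True]) | set(self.word_counts[False]))
        # Add-one smoothing keeps unseen words and classes away from log(0).
        score = math.log((self.doc_counts[label] + 1) / (sum(self.doc_counts.values()) + 2))
        for word in words:
            score += math.log((counts[word] + 1) / (total + vocab + 1))
        return score

    def is_on_topic(self, text):
        words = tokenize(text)
        return self._log_score(words, True) > self._log_score(words, False)


def ask_and_give_feedback(model, text, correct_answer):
    """Ask the model about a document, then add the labelled document to its examples."""
    guess = model.is_on_topic(text)
    model.learn(text, correct_answer)
    return guess == correct_answer


# Hypothetical training examples: True means "about dog training".
model = TopicModel()
model.learn("teach the puppy to sit and stay", True)
model.learn("reward the dog with a biscuit when he sits", True)
model.learn("the quarterly budget report is due", False)
model.learn("schedule the sales meeting for monday", False)

print(ask_and_give_feedback(model, "the puppy will stay for a biscuit", True))
```

Every extra labelled document nudges the word counts, which is the software equivalent of one more dog biscuit.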
But what happens if you want to add a new behaviour? Or a new topic? What if you’ve mastered ‘sit’ but you want to add ‘stand’? Well, you do the same thing again – you provide different examples and you reward correct results. But what if, in the process of training this new behaviour, you inadvertently un-train your existing behaviour? Well, you provide more and more examples and feedback until you get consistent results. Simple enough but, I’m sure you’ll agree, time-consuming. Imagine how much fun you’ll have if you want to add several thousand topics.
Classification in Semaphore takes a different approach. Rather than requiring many examples and a lot of feedback in order to build a model, Semaphore allows rules to be crafted that use natural language processing to determine what action should be taken and when. In effect, the rule says ‘when I hear the word stand, that means I should stand up’. Semaphore takes care of the generalizations – the many different ways that we can say the word stand, and the many different scenarios in which it may crop up – and provides a consistent response from a simple rule.
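To make the contrast concrete, here is a rough Python sketch of the rule-based idea. It is not Semaphore's rule language or API; the `RULES` table and `classify` function are invented for illustration, and the word variants that a real engine would work out with natural language processing are simply written out by hand.

```python
import re

# Hypothetical rules: each topic lists the surface forms that should trigger it.
# A real rule engine would derive these variants itself; here they are hand-written.
RULES = {
    "stand": {"stand", "stands", "standing", "stood"},
    "sit": {"sit", "sits", "sitting", "sat"},
}


def classify(text, rules=RULES):
    """Return the sorted list of topics whose rule matches at least one word in the text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return sorted(topic for topic, forms in rules.items() if words & forms)


print(classify("Fonzie stood up when asked"))

# Adding a topic means adding a rule; the existing rules are left untouched.
more_rules = dict(RULES, down={"down", "lie", "lies", "lying"})
print(classify("Lie down, Fonzie", more_rules))
```

Because each rule stands alone, teaching ‘down’ cannot un-train ‘stand’, which is the property the puppy (and an example-trained model) does not give you for free.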
So while machine learning is undoubtedly a useful tool for analysing large volumes of data, if you require classification across a range of topics, it isn’t usually the sharpest tool in the box. In fact, a more effective way to use machine learning may be to apply it to the results of classification, but that’s a story for another day.
Unfortunately, Smartlogic does not currently provide any dog training technologies but I may ask our CTO to help me brush up on my French to see if that helps. I also have it on very good authority that our SVP of Operations has a few dog training tricks up his sleeve.