Last week I told you how dogs learn new behaviours, but that’s not the end of the story. This week we’ll find out about reward prediction errors in dogs and how dogs can quickly unlearn those new tricks and behaviours!
Check out last week’s blog and discover How Dogs Learn New Behaviours!
How dogs unlearn behaviours!
After a lot of trial and error, your pup has finally figured out roughly what a “sit” is. He has also learned that sitting predicts a certain kind of reward. Your pup thinks this is a good deal and is prepared to swap a sit for your treat.
When the treats stop
Your dog doesn’t know he is being “trained”. All he knows is that a sit leads to a yummy treat. Just like a rat who learns to press the green button to get the food to drop, your dog expects a treat when he sits.
How long do you think the rat will continue to press the green button if the treats stop falling? Or what if the treats are swapped for something the rat doesn’t want? How long would you continue to go to work if your boss suddenly paid you for only 90% of the days you worked, or one day gave you vouchers instead of cash?
Reward Prediction Error
When a situation produces a good outcome enough times, we learn to predict a reward. This leads us to repeat the situation to obtain that reward. This is reward prediction. The “error” part describes whether the actual reward is better than, the same as, or worse than expected. If the outcome is worse than expected (negative reward prediction error), the behaviour is less likely to be repeated. If the reward is better than expected (positive reward prediction error), the behaviour is more likely to be repeated, and if it stays the same (no reward prediction error), nothing changes.
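If it helps to see that as arithmetic, here is a minimal sketch. It assumes we can put rewards on a made-up number scale (say cheese = 3, kibble = 1, nothing = 0). The function names and the numbers are purely illustrative and are not a measurement of anything your dog actually feels.

```python
def reward_prediction_error(expected: float, actual: float) -> float:
    """What the dog got minus what the dog expected (arbitrary units)."""
    return actual - expected


def classify(error: float) -> str:
    """Label the error the way the paragraph above describes it."""
    if error > 0:
        return "positive"  # better than expected: behaviour more likely
    if error < 0:
        return "negative"  # worse than expected: behaviour less likely
    return "none"          # as expected: nothing changes


# Hypothetical scale: cheese = 3, kibble = 1, nothing = 0.
# The dog expected cheese for his recall but got kibble.
print(classify(reward_prediction_error(expected=3, actual=1)))  # negative
```

The same two functions cover all three cases: expect kibble and get cheese and the error is positive, expect cheese and get cheese and there is no error at all.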
Let’s imagine you go to a coffee shop every morning on your walk to work. It is slightly out of your way, but it serves good coffee at a reasonable price. Every morning you get your coffee with no issues, so the reward is unchanged (no reward prediction error) and you will probably continue to visit routinely.
This morning you went, but there was a sign on the door saying they had run out of coffee. Your expectations were not met, so the outcome was worse than expected (negative reward prediction error). Will you go back tomorrow? Maybe, maybe not. If you do, how would you feel if the same thing happened again next week?
What if they gave you a free pastry? There is no question: you’ll definitely be back tomorrow. But would that raise the bar, and would just serving coffee every day now feel a little disappointing? It might if it happened a few times!
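Here is a rough sketch of how a few free pastries could “raise the bar”. It uses a simple textbook-style update rule (expectation moves a fraction of the way towards what actually happened), which is my own choice for illustration and not something measured in dogs or coffee drinkers. The learning rate of 0.3 and the reward values (coffee = 1, coffee plus pastry = 2) are assumptions.

```python
def update_expectation(expected: float, actual: float,
                       learning_rate: float = 0.3) -> float:
    """Nudge the expectation towards the reward that was actually received."""
    return expected + learning_rate * (actual - expected)


# Hypothetical scale: plain coffee = 1, coffee plus free pastry = 2.
expected = 1.0  # you are used to plain coffee

# Three mornings in a row you get a free pastry.
for _ in range(3):
    expected = update_expectation(expected, actual=2.0)

# On the fourth morning it is just coffee again.
error = 1.0 - expected
print(error < 0)  # True: plain coffee now falls short of expectations
```

Nothing about the coffee changed, but because the expectation crept upwards, the same cup now produces a negative error.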
Cheese or kibble?
Your own behaviour can be changed very quickly by a reward prediction error (negative or positive), and so can your dog’s!
That story was a good example of how it works in humans. But it could just as easily be about a dog that stopped getting rewards for a recall, or instead of getting cheese, he just got kibble.
Reward prediction error isn’t just found in dogs. Other animals such as rats, monkeys and even bees have it too. It encourages us to seek out better and more successful ways of doing things. Basically, we need it to survive better!
A source of frustration
When we change the deal, this can cause frustration. On its own, that might not be such a problem. In fact, if it’s managed well, it will be the catalyst that triggers a new (hopefully desirable) behaviour. Combined with other frustrations though, and built up over the course of a few hours, it can unbalance a dog enough to trigger a reactive behaviour problem.
Weaning off the treats
Many people think they are bribing their dog with treats during the training process. But actually, you are just fulfilling your end of the deal. When you start to wean your dog off the treats, you are creating a negative reward prediction error for him and gambling with the potential outcome. Just like when the coffee shop ran out of coffee, if you change the deal too quickly, he might stop bothering altogether.
This typically happens when people think they’ve finally cracked a certain trick. The dog is doing the trick well, so you think he has learned it and no longer needs the reward/bribe. But a habit has not formed yet, so the response is still not guaranteed. Without the reward it’ll fade or stop. Once a new habit has formed, the behaviour becomes harder to stop, so this is a good time to think about reducing the rewards. Habit forming doesn’t happen overnight though. It takes weeks of practice, so don’t rush this.
Using it to your advantage
This can work in your favour too. When you change the deal, the dog will seek out the reward elsewhere. So when a dog pulls on the lead and you stand still, you stop rewarding the pulling (negative reward prediction error), and he will seek out a new way to get a reward. Moving forward is rewarding, so if you wait until the lead slackens before moving, the dog will learn a new way to keep moving. The dog gets the same reward, but for a different behaviour.
Be precise with your timing, otherwise he’ll get confused, or he’ll accidentally learn something completely different. (Refer to last week’s blog to see how that can happen!)
How can I help you with your dog’s behaviour training?
Private Dog Behaviour Consultations are currently available in the Dundee area and beyond. If you are looking for help solving your dog’s behaviour and training problems, then please get in touch!