- We will start by covering the operant conditioning definition.
- Next, we will explore the principles and concepts that make up the operant conditioning theory and the Skinner operant conditioning experiment.
- Moving on, we will explore some operant conditioning theory examples.
- Finally, we will compare classical and operant conditioning.
Operant Conditioning Definition
B. F. Skinner believed that it is possible to study behaviour scientifically. He also thought behaviour is voluntary and has a purpose: to affect one's environment. This behaviour, which he called operant behaviour, is the focus of operant conditioning.
Skinner describes operant behaviour as behaviour influenced by its outcomes.
In other words, a person acts on their environment for the desired results. So, then, what is operant conditioning?
Operant conditioning is a method of learning or modifying behaviours in which the consequence of a response, whether positive or negative, influences the repetition of an action.
Fig. 1 A dog rolling over for a treat.
Suppose you give your dog a treat when it rolls over. The dog learns to associate the action with the reward through operant conditioning and will likely repeat the behaviour.
Operant conditioning states that every action we take while engaging with our environment has consequences. We are more likely to repeat behaviours with positive outcomes than behaviours with negative outcomes. When a behaviour is followed by punishment, we become less likely to repeat it.
Operant Conditioning Theory
Skinner divided behaviour into three parts for his scientific study: the discriminative stimulus, the operant response, and the reinforcer or punisher. Together, these make up the three-term contingency, which describes the relationship between the operant response and its consequence (a reinforcer or punisher).
Let's define these three terms:
- A discriminative stimulus is the antecedent of a behaviour: the event or situation in which the behaviour occurs.
- An operant response is the voluntary behaviour itself, which acts on the environment.
- Reinforcers are consequences that increase the likelihood of the behaviour they follow.
- Punishers are consequences that decrease the likelihood of the behaviour they follow.
An exam (discriminative stimulus) is coming up. You review thoroughly and give your best effort (operant response), and you earn a high score. Your parents are proud and take you to your favourite restaurant (reinforcer). If you instead play video games all day and fail the exam, your parents scold you for being irresponsible (punisher).
The three-term contingency served as the foundation of Skinner's study on operant conditioning. With his analysis, he also identified several types of operant conditioning.
Operant Conditioning: Types
Skinner identified four types of operant conditioning: positive reinforcement, negative reinforcement, positive punishment, and negative punishment.
We've mentioned that operant conditioning involves rewarding or punishing behaviours.
In positive reinforcement, a favourable outcome follows the behaviour to increase its recurrence.
When you apply positive reinforcement, you strengthen a response (e.g., in terms of frequency or likelihood) by using an operant reinforcer, in this case a positive reinforcer.
John noticed his friend, Luke, looked sad, so he decided to crack a joke to cheer him up. Luke laughed, which positively reinforced John's behaviour. So, the next time Luke gets sad, John will likely repeat that behaviour.
Positive reinforcement strengthens a behaviour, so negative reinforcement weakens it. Right? Not quite: negative reinforcement also strengthens a behaviour. This type of reinforcement falls under operant aversive conditioning.
Negative reinforcement occurs when you remove an unpleasant event (aversive stimulus or negative reinforcer) following a behaviour.
You're driving and suddenly hear a squealing noise when you step on the brakes. Feeling worried, you bring your car to the mechanic and find that the brake pads need replacing. The mechanic replaces them, and the squealing noise disappears. Removal of the squealing noise negatively reinforces the behaviour of bringing the car to the mechanic.
There are two types of negative reinforcement: avoidance and escape behaviour.
In avoidance, the learner prevents the unpleasant event from occurring. If the unpleasant event has already happened, the removal of the negative reinforcer occurs through escape behaviour.
Avoidance: You have left your dishes in the sink when you hear your mother coming home from the grocery shop and parking her car in the driveway. You rush to wash the dishes before she enters the house to avoid being nagged.
Escape: But what if your mother arrives earlier than expected and sees the dishes in the sink? She starts to nag you, and you wash the dishes so that she stops nagging.
Punishment is another form of operant aversive conditioning which aims to weaken behaviours. When behaviours weaken, it means that there is a decrease in frequency, duration, and intervals.
Punishment refers to negative consequences (aversive stimuli) following a behaviour.
Positive punishment occurs when an aversive stimulus (something that you don't want) follows a response.
A group of students faces detention after refusing to follow their teacher.
Adverse outcomes following misbehaviour need to be immediate and consistent so that the learner associates the consequence with the behaviour, which raises the chance that the behaviour stops.
Negative punishment involves removing something valuable (an object or activity) following a response.
A person gets their driving licence suspended after multiple traffic violations.
Psychologists warn against excessive punishment, however. Punishment only tells the learner what not to do, so it may not lead to the desired behaviour. It can also make the learner aggressive, as aggression may become their way of coping with problems.
Simply put, positive punishment (+) adds an unpleasant consequence, while negative punishment (-) takes away something valued.
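The four types described above differ on two questions: is a stimulus added or removed after the response, and does the behaviour become more or less likely? The following Python sketch illustrates that two-by-two grid. The names `Stimulus`, `Effect`, and `classify` are made up for this illustration and are not standard terminology.

```python
from enum import Enum


class Stimulus(Enum):
    ADDED = "added"      # "positive": something is presented after the response
    REMOVED = "removed"  # "negative": something is taken away after the response


class Effect(Enum):
    STRENGTHENS = "strengthens"  # the behaviour becomes more likely
    WEAKENS = "weakens"          # the behaviour becomes less likely


def classify(stimulus: Stimulus, effect: Effect) -> str:
    """Name the type of operant conditioning for one consequence."""
    sign = "positive" if stimulus is Stimulus.ADDED else "negative"
    kind = "reinforcement" if effect is Effect.STRENGTHENS else "punishment"
    return f"{sign} {kind}"


# The examples from this section, expressed as (stimulus, effect) pairs.
EXAMPLES = {
    "Luke laughs at John's joke": (Stimulus.ADDED, Effect.STRENGTHENS),
    "The squealing noise disappears": (Stimulus.REMOVED, Effect.STRENGTHENS),
    "Students are given detention": (Stimulus.ADDED, Effect.WEAKENS),
    "A driving licence is suspended": (Stimulus.REMOVED, Effect.WEAKENS),
}

if __name__ == "__main__":
    for description, (stimulus, effect) in EXAMPLES.items():
        print(f"{description}: {classify(stimulus, effect)}")
```

Note that "positive" and "negative" describe only whether something is added or removed, not whether the outcome is pleasant.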
Operant Conditioning: Properties of Reinforcement
Earlier, we defined reinforcers and looked at positive and negative reinforcement. In operant conditioning, Skinner also identified properties of reinforcement, such as the different types of reinforcers and schedules of reinforcement.
Primary reinforcers, such as food, water, and sleep, are of biological importance to us. These reinforcers are universal: they can work for anyone without having to be learned.
Secondary reinforcers, also known as acquired or conditioned reinforcers, are initially neutral but can strengthen behaviours when paired with a primary reinforcer. Examples include tokens, points, and stickers.
Reinforcement schedules describe the manner and timing of giving reinforcers to a learner.
There are two types of schedules of reinforcement: continuous and partial.
Continuous reinforcement refers to giving reinforcers every time the learner commits the targeted behaviour.
The teacher gives a gold star every time a student participates in class.
Partial reinforcement, on the other hand, involves giving reinforcers based on a target number of desirable actions (ratio schedules) or time (interval schedules).
Fixed ratio schedules require a specific number of responses before reinforcement occurs.
The sales manager gives an employee a bonus for hitting the target sales for six consecutive months.
Fixed interval schedules involve reinforcement of a desirable behaviour after a specific period. This schedule leads to an increased number of responses as reinforcement approaches.
Alice is preparing for her licensure exam. She has three months to prepare, but in the first two months she doesn't spend much time reviewing. As the exam draws near, she spends the last month studying her lessons to make sure she passes the exam (reinforcement).
Variable ratio schedules deliver reinforcement after an unpredictable number of responses.
A common example of a variable ratio schedule of reinforcement is a slot machine. The unpredictability of reinforcement encourages gambling behaviour.
Variable interval schedules deliver reinforcement after unpredictable time intervals.
The unpredictability of receiving a message (reinforcement) via instant messaging may encourage the behaviour of checking your notifications at various times throughout the day.
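The schedules above can be sketched as simple rules that decide, response by response, whether a reinforcer is delivered. This is a minimal illustration, not a model taken from Skinner's work. The function names, the use of plain numbers as response times, and the way the variable schedules draw their random values are all assumptions made for the example.

```python
import random


def continuous(response_times):
    """Reinforce every response."""
    return [True for _ in response_times]


def fixed_ratio(response_times, n):
    """Reinforce every n-th response."""
    return [(i + 1) % n == 0 for i in range(len(response_times))]


def variable_ratio(response_times, mean_n, rng):
    """Reinforce each response with probability 1/mean_n.

    Simplifying assumption: on average one reinforcer per mean_n
    responses, but the learner cannot predict which response pays off.
    """
    return [rng.random() < 1 / mean_n for _ in response_times]


def fixed_interval(response_times, interval):
    """Reinforce the first response once `interval` time units have passed."""
    outcomes = []
    available_at = interval
    for t in response_times:
        if t >= available_at:
            outcomes.append(True)
            available_at = t + interval  # the clock restarts after each reinforcer
        else:
            outcomes.append(False)
    return outcomes


def variable_interval(response_times, mean_interval, rng):
    """Like fixed_interval, but each waiting time is drawn at random.

    Assumption: waits are uniform between 0 and 2 * mean_interval.
    """
    outcomes = []
    available_at = rng.uniform(0, 2 * mean_interval)
    for t in response_times:
        if t >= available_at:
            outcomes.append(True)
            available_at = t + rng.uniform(0, 2 * mean_interval)
        else:
            outcomes.append(False)
    return outcomes


if __name__ == "__main__":
    times = list(range(1, 13))  # twelve responses, one per time unit
    rng = random.Random(0)      # seeded so a run can be repeated
    print("fixed ratio:      ", fixed_ratio(times, 3))
    print("fixed interval:   ", fixed_interval(times, 5))
    print("variable ratio:   ", variable_ratio(times, 3, rng))
    print("variable interval:", variable_interval(times, 5, rng))
```

In the ratio schedules only the count of responses matters, whereas in the interval schedules a response is reinforced only if enough time has passed since the last reinforcer.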
Operant Conditioning: Principles
We've seen how reinforcement occurs and the types of reinforcers given. Now we'll look at three essential principles of operant conditioning.
The principle of immediacy highlights the timing of the delivery of the reinforcement. The sooner the reinforcement follows the behaviour, the greater its effect on the learner. The longer the delay, the less effective the consequence is.
The principle of contingency refers to how consistently a consequence follows a behaviour. This principle highlights the importance of reliably delivering a consequence after the behaviour so that the consequence is effective.
The principle of satiation tells us that if the learner has no appetite for a particular stimulus (e.g., reward), the consequence will not be that effective; however, if there's a need for a specific stimulus, the effect of the consequence increases.
Skinner Operant Conditioning: Experiment
In testing his theory, B. F. Skinner conducted operant conditioning experiments on animals by observing their behaviour in the Skinner box. Skinner developed the Skinner box, or the operant conditioning chamber, which recorded the behaviour of an organism in a specific time frame.
The animal receives either a reward (a food pellet) or a punishment (an unpleasant electric shock) when it exhibits certain behaviours, such as pressing a lever for rats or pecking keys for pigeons.
Fig. 2 Skinner's experiment supports his operant conditioning theory.
As the rat moved around the box, it accidentally pressed the lever connected to a food dispenser. The dispenser automatically dropped a food pellet into the box (positive reinforcement). The rat learned this rewarding behaviour quickly after being placed in the Skinner box only a few times.
Skinner tested negative reinforcement by giving the rat unpleasant electric shocks whilst inside the box. When the rat moved inside the box, it accidentally pressed the lever, and the electric shocks stopped immediately (negative reinforcement).
After being placed in the box a few times, the rat quickly learned this behaviour. The next time the rat was placed in the box, it immediately hurried to press the lever to stop the unpleasant electric shocks.
Operant Conditioning Examples and Application
There are several examples of applying operant conditioning in everyday life. Skinner's operant conditioning contributed to developing treatment therapies such as the token economy and behaviour shaping.
Parents and teachers use token economy to reinforce desired behaviour through tokens such as stickers, coupons, money, or points a child can exchange for rewards such as food, activities, or privileges. Token economies help teach children to follow the rules at home and school.
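A token economy can be pictured as a small ledger: behaviours earn tokens, and tokens are later exchanged for rewards. The Python sketch below is only an illustration of that idea. The class name `TokenEconomy`, its methods, and the token values and reward costs in the usage example are all hypothetical.

```python
class TokenEconomy:
    """A minimal ledger of tokens earned and spent by one learner."""

    def __init__(self, token_values, reward_costs):
        self.token_values = dict(token_values)  # behaviour -> tokens earned
        self.reward_costs = dict(reward_costs)  # reward -> tokens required
        self.balance = 0

    def record_behaviour(self, behaviour):
        """Award tokens for a desired behaviour (0 if it is not on the list)."""
        earned = self.token_values.get(behaviour, 0)
        self.balance += earned
        return earned

    def exchange(self, reward):
        """Spend tokens on a reward; return False if the balance is too low."""
        cost = self.reward_costs[reward]
        if cost > self.balance:
            return False
        self.balance -= cost
        return True


if __name__ == "__main__":
    # Hypothetical values chosen only for this example.
    economy = TokenEconomy(
        token_values={"tidied room": 2, "finished homework": 3},
        reward_costs={"extra screen time": 4},
    )
    economy.record_behaviour("tidied room")
    economy.record_behaviour("finished homework")
    print(economy.exchange("extra screen time"), economy.balance)
```

The tokens act as secondary reinforcers: they have no value on their own and gain it only because they can be exchanged for rewards.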
Fig. 3 Circus animal training
Behaviour shaping involves breaking the desired behaviour down into small, manageable steps and rewarding the learner for completing each step.
For example, trainers use behaviour shaping to teach complex tricks to circus animals.
In behavioural therapy, psychologists use operant conditioning and its principles to alter behaviour and treat psychological conditions such as depression, eating disorders, and obsessive-compulsive disorder (OCD).
Classical and Operant Conditioning
We understand that both classical and operant conditioning are forms of associative learning. But what's the difference? Let's look at this table to compare the two types of conditioning.
Classical Conditioning | Operant Conditioning |
Behaviours are involuntary. | Behaviours are voluntary. |
Learning happens before a response occurs (presentation of an unconditioned stimulus after a conditioned stimulus). | Learning happens after a response takes place (through reinforcement or punishment). |
The learner is passive. | The learner is active. |
The learner associates a neutral stimulus with an unconditioned stimulus, eliciting a response. | The learner associates a response with a consequence that follows it, affecting the recurrence of a behaviour. |
Operant Conditioning - Key takeaways
Operant conditioning is a method of learning or modifying behaviours in which the consequence of a response, whether positive or negative, influences the repetition of an action.
B. F. Skinner conducted operant conditioning research on animals using the Skinner box, which recorded their behaviour over time.
Properties of reinforcement include primary and secondary reinforcement and reinforcement schedules based on the number of responses or time intervals.
Real-life examples of operant conditioning include token economy, behaviour shaping and behavioural therapy.
Operant conditioning differs from classical conditioning because behaviours are voluntary, and learning occurs after a response. Classical conditioning regards behaviours as reflexes, and learning happens before a reaction occurs.
References
- Fig. 2. Image of the Skinner rat experiment (https://commons.wikimedia.org/wiki/File:Skinner_box_scheme_01.png) by Andreas1 (https://commons.wikimedia.org/w/index.php?title=User:Andreas1&action=edit&redlink=1) Licensed by CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0/deed.en)