Operant Conditioning

A Brief Survey of Operant Behavior

It has long been known that behavior is affected by its consequences. We reward and punish people so they will behave in different ways. A more specific effect of a consequence was first studied experimentally by Edward L. Thorndike in a well-known experiment. A cat enclosed in a box struggled to escape and eventually moved the latch, which opened the door. When repeatedly enclosed in a box, the cat gradually ceased to do things that had proved ineffective ("errors") and eventually developed the successful response very quickly.

In operant conditioning, behavior is also affected by its consequences, but the process is not trial-and-error learning. It can best be explained with an example. A hungry rat is placed in a semi-soundproof box. For several days an automatic dispenser occasionally delivers bits of food into a tray. The rat soon goes to the tray immediately upon hearing the sound of the dispenser. A small horizontal section of a lever protruding from the wall has been resting in its lowest position, but it is now raised slightly so that when the rat touches it, it moves downward. In doing so it closes an electric circuit and operates the food dispenser. Immediately after eating the delivered food the rat begins to press the lever fairly rapidly. The behavior has been strengthened or reinforced by a single consequence. The rat was not "trying" to do anything when it first touched the lever and it did not learn from "errors."

To a hungry rat, food is a natural reinforcer, but the reinforcer in this example is the sound of the food dispenser, which was conditioned as a reinforcer when it was repeatedly followed by the delivery of food before the lever was pressed. In fact, the sound of that one operation of the dispenser would have had an observable effect even though no food was delivered on that occasion; when food no longer follows pressing the lever, the rat eventually stops pressing. The behavior is said to have been extinguished.
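The strengthening and extinction described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the update rule (response strength moves a fixed fraction toward 1 when a press is reinforced and decays by a fixed fraction when it is not) and the names simulate, alpha, beta, and p0 are hypothetical choices made for illustration, not a model taken from the text. The passage describes strengthening after a single consequence, which a large alpha would approximate.

```python
from typing import List


def simulate(reinforced_trials: int, extinction_trials: int,
             alpha: float = 0.3, beta: float = 0.1,
             p0: float = 0.05) -> List[float]:
    """Return the response strength (0..1) after each trial.

    Assumed toy rule, not taken from the source text:
      reinforced trial:   p moves a fraction alpha of the way toward 1
      unreinforced trial: p loses a fraction beta of its current value
    """
    p = p0
    history = []
    for trial in range(reinforced_trials + extinction_trials):
        if trial < reinforced_trials:
            p += alpha * (1.0 - p)  # lever press followed by the reinforcer
        else:
            p -= beta * p           # lever press no longer followed by food
        history.append(p)
    return history


# Ten reinforced trials followed by thirty extinction trials.
history = simulate(reinforced_trials=10, extinction_trials=30)
```

In this sketch the strength climbs during the first ten trials and then falls back toward zero, which mirrors the rat pressing rapidly and then eventually ceasing to press.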

A number of studies in the Berkeley laboratory of Edward Tolman explored the operant conditioning theory. Rats were allowed to explore a maze in which there were three routes of different lengths between the starting position and the goal. The rats' behavior when the maze was blocked implied that they must have some sort of mental map of the maze. The rats preferred the routes according to their shortness, so when the maze was blocked at point A, preventing them from using the shortest route, they chose the second shortest route. When the maze was blocked at point B, however, the rats did not retrace their steps and use route 2, but rather chose route 3. The rats must have recognized that block B would stop them from using route 2 by using some memory of the layout of the maze. Tolman's group also showed that unexpected changes in the quality of reward could weaken learning even though the animal was still rewarded.

In 1938 Burrhus Frederic Skinner published the most influential work on animal behavior of the century, "The Behavior of Organisms." Skinner provided a technology that allowed sequences of behavior produced over a long time to be studied objectively. His Skinner box was a great improvement on earlier individual learning trials. Skinner developed the basic concept of operant conditioning. Operant conditioning forms an association between a behavior and a consequence. It is also called response-stimulus or RS conditioning because it forms an association between the animal's response (behavior) and the stimulus that follows (consequence).

The theory of B.F. Skinner is based upon the idea that learning is a function of change in overt behavior. Changes in behavior are the result of an individual's response to events (stimuli) that occur in the environment. A response, such as defining a word, hitting a ball, or solving a math problem, produces a consequence.

Principles:

1. Behavior that is positively reinforced will reoccur; intermittent reinforcement is especially effective.

2. Information should be presented in small amounts so that responses can be reinforced (called "shaping").

3. Reinforcements will generalize across similar stimuli, producing secondary conditioning.

Reinforcement is the key element in Skinner's S-R theory. A reinforcer is anything that strengthens the desired response. Positive reinforcement includes verbal praise, a good grade or a feeling of increased accomplishment or satisfaction. The theory also covers negative reinforcement -- any stimulus that results in the increased frequency of a response when it is withdrawn (different from aversive stimuli -- punishment -- which result in reduced responses). Skinner explained drive (motivation) in terms of deprivation and reinforcement schedules.

Reinforcers may be positive or negative.  A positive reinforcer reinforces when it is presented; a negative reinforcer reinforces when it is withdrawn.  Negative reinforcement is not punishment.  Reinforcers always strengthen behavior; that is what "reinforced" means.   Punishment is used to suppress behavior.  It consists of removing a positive reinforcer or presenting a negative one.   It often seems to operate by conditioning negative reinforcers.  The punished person henceforth acts in ways which reduce the threat of punishment and which are incompatible with, and hence take the place of, the behavior punished.


Four Possible Consequences

Consequences have to be immediate, or clearly linked to the behavior.  With verbal humans, we can explain the connection between the consequence and the behavior, even if they are separated in time.  For example, you might tell a friend that you'll buy dinner for them since they helped you move, or a parent might explain that the child can't go to summer camp because of her bad grades.  With very young children, humans who don't have verbal skills, and animals, you can't explain the connection between the consequence and the behavior.  For the animal, the consequence has to be immediate.

Applying these terms to the Four Possible Consequences, you get:

Something Good can start or be presented, so behavior increases = Positive Reinforcement (R+)

Something Good can end or be taken away, so behavior decreases = Negative Punishment (P-)

Something Bad can start or be presented, so behavior decreases = Positive Punishment (P+)

Something Bad can end or be taken away, so behavior increases = Negative Reinforcement (R-)

Or:

 

                      Reinforcement                         Punishment
                      (behavior increases)                  (behavior decreases)

Positive              Positive Reinforcement:               Positive Punishment:
(something added)     Something added increases behavior    Something added decreases behavior

Negative              Negative Reinforcement:               Negative Punishment:
(something removed)   Something removed increases behavior  Something removed decreases behavior
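The four combinations above can also be written as a small lookup. The sketch below is only an illustration: the function name classify_consequence and the string keys "added", "removed", "increases", and "decreases" are hypothetical labels chosen for this example.

```python
# Lookup for the two-by-two grid:
# (what happens to the stimulus, observed effect on the behavior) -> name.
# Keys and function name are illustrative choices, not standard terminology.
QUADRANTS = {
    ("added", "increases"): "Positive Reinforcement (R+)",
    ("removed", "increases"): "Negative Reinforcement (R-)",
    ("added", "decreases"): "Positive Punishment (P+)",
    ("removed", "decreases"): "Negative Punishment (P-)",
}


def classify_consequence(operation: str, effect: str) -> str:
    """Return the quadrant name for a stimulus change and its effect."""
    try:
        return QUADRANTS[(operation, effect)]
    except KeyError:
        raise ValueError(
            f"unknown combination: {operation!r}, {effect!r}"
        ) from None
```

For example, classify_consequence("removed", "increases") returns "Negative Reinforcement (R-)": something is taken away and the behavior becomes more frequent.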

 

Technical Terms

The technical term for "start or be presented" is positive, since it's something that's added to the environment.

The technical term for "end or be taken away" is negative, since it's something that's subtracted from the environment.

Anything that increases a behavior - makes it occur more frequently, makes it stronger, or makes it more likely to occur - is a reinforcer. Often, a person will perceive "starting Something Good" or "ending Something Bad" as something worth pursuing, and they will repeat the behaviors that seem to cause these consequences. These consequences will increase the behaviors that lead to them. These are consequences the animal will work to attain, so they strengthen the behavior.

Anything that decreases a behavior - makes it occur less frequently, makes it weaker, or makes it less likely to occur - is a punisher. Often, a person will perceive "ending Something Good" or "starting Something Bad" as something worth avoiding, and they will not repeat the behaviors that seem to cause these consequences. These consequences will decrease the behaviors that lead to them.

These definitions are based on their actual effect on the behavior in question: they must reduce or strengthen the behavior to be considered a consequence and be defined as a punishment or reinforcement. Pleasures meant as rewards that do not strengthen a behavior are indulgences, not reinforcement; aversives meant as a behavior weakener but which do not weaken a behavior are abuse, not punishment.
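Because the labels depend on the measured effect rather than on the intent, one way to picture the rule is to compare how often the behavior occurs before and after the consequence is introduced. The following is a minimal sketch, assuming behavior is summarized as a single response rate; the function label_by_effect, its parameters, and the tolerance threshold are hypothetical and exist only for this illustration.

```python
from typing import Optional


def label_by_effect(operation: str, rate_before: float, rate_after: float,
                    tolerance: float = 0.0) -> Optional[str]:
    """Label a consequence from its measured effect on response rate.

    operation: "added" or "removed" (what happened to the stimulus).
    Returns None when the rate changed by no more than `tolerance`,
    meaning the event acted as neither a reinforcer nor a punisher.
    """
    if operation not in ("added", "removed"):
        raise ValueError("operation must be 'added' or 'removed'")
    change = rate_after - rate_before
    if abs(change) <= tolerance:
        return None  # no behavioral effect, so no label applies
    sign = "positive" if operation == "added" else "negative"
    kind = "reinforcement" if change > 0 else "punishment"
    return f"{sign} {kind}"
```

Under these assumptions, a treat that leaves the response rate unchanged, as in label_by_effect("added", 4.0, 4.0), gets no label at all, which corresponds to the "indulgence" case described above.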

To learn more about negative and positive reinforcement, check out these websites:

Negative Reinforcement University

Positive Reinforcement University

Skinner's approach emphasized the function of behavior, employing a deterministic theory in which there is no free will. He stressed that we must apply the principles of learning to each organism individually. In his novel Walden Two, Skinner described a utopian community that is behaviorally engineered, based on principles of operant conditioning; a benevolent government rewards positive, socially appropriate behavior, and all is well.

According to Skinner, the motivations that Freud called the drives of the id are better understood as biological reinforcers of the environment. The part of the psyche that Freud called the superego (conscience) is better understood as the contingencies that society creates and imposes to control the selfish (individualistic) nature of the individual. For Skinner, personality traits such as extroversion are just groups of behavior that have been reinforced.

Behaviorist approaches such as operant conditioning are important because they forced personality theorists to become more empirically minded, and many untestable Freudian assumptions were discarded.

(C) 2002 All Rights Reserved.