Skinner's theory of operant learning. Theories of learning. What is behaviorism

Introduction

The basic postulate of learning theory is that almost all behavior is acquired through learning. For example, any psychopathology is understood as the learning of maladaptive behavior or as a failure to learn adaptive behavior. Instead of talking about psychotherapy, learning theorists talk about behavior modification and behavior therapy: specific actions are to be modified or changed, rather than resolving the internal conflicts underlying these actions or reorganizing the personality. Since most problem behaviors were once learned, they can be unlearned or changed in some way using special procedures based on the laws of learning.

An even more essential feature of these approaches is their focus on objectivity and scientific rigor, on the testability of hypotheses and the experimental control of variables.

Supporters of learning theory manipulate the parameters of the external environment and observe the consequences of these manipulations in behavior. Learning theories are sometimes called S-R (stimulus-response) psychology.

Learning (training, teaching) is the process by which a subject acquires new ways of carrying out behavior and activity, and fixes and/or modifies them. The change in psychological structures that results from this process provides the basis for further improvement of activity.

Theories of learning in psychology proceed from two main points:

  • - Any behavior is acquired in the process of learning.
  • - In order to maintain scientific rigor when testing hypotheses, the principle of objectivity of data must be observed. External causes that can be manipulated (such as a food reward) are chosen as variables, in contrast to the "internal" variables of the psychodynamic school (instincts, defense mechanisms, the self-concept), which cannot be manipulated.

The laws of learning include:

  • - The law of readiness: the stronger the need, the more successful the learning.
  • - The law of effect: behavior that leads to a beneficial result reduces the underlying need and will therefore tend to be repeated.
  • - The law of exercise: other things being equal, repetition of an action makes the behavior easier to perform, speeds up its execution, and reduces the likelihood of errors.
  • - The law of recency: material presented at the end of a series is remembered better. This law seems to contradict the primacy effect, the tendency to remember better the material presented at the beginning of learning. The contradiction is removed when the "edge effect" is formulated: the U-shaped dependence of recall on an item's position in the series reflects this effect and is called the serial position curve.
  • - The matching law (law of correspondence): the probability of a response is proportional to the probability of its reinforcement (a common formal statement is given below).
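
The essay states this last law only verbally. In the operant literature it is commonly formalized as Herrnstein's matching law; the symbols below (B1, B2 for the rates of two responses, R1, R2 for the rates of reinforcement they earn) are introduced here for illustration and do not appear in the original text:

```latex
\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2}
```

That is, the relative rate of a response matches the relative rate of reinforcement obtained for it.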

There are three main learning theories:

  • - the theory of classical conditioning of I.P. Pavlov;
  • - the theory of operant conditioning of B.F. Skinner;
  • - A. Bandura's theory of social learning.

The theory of classical conditioning originates in the teachings of I.P. Pavlov (1849-1936) on the formation of conditioned reflexes. Ivan Petrovich Pavlov was a Russian physiologist who, in the course of his research on digestion, developed a method of studying behavior and principles of learning that had a profound impact on psychological science.

In the late 19th and early 20th centuries Pavlov studied the secretion of gastric juice in dogs. During these experiments he, among other things, put food in a dog's mouth and measured how much saliva was released as a result. By chance he noticed that after several such trials the dog began to salivate to certain stimuli even before the food entered its mouth. Salivation occurred in response to cues such as the appearance of the food bowl or the presence of the person who usually brought the food. In other words, stimuli that did not initially produce this response (so-called neutral stimuli) came to elicit salivation because they had been associated with food, which automatically made the dog salivate. This observation led Pavlov to the outstanding research in which the process now known as the formation of a classical conditioned reflex, or classical conditioning, was discovered.

Principles of classical conditioning. I.P. Pavlov was the first to show that respondent behavior can be classically conditioned. The essence of classical conditioning is that an initially neutral stimulus begins to elicit a response because of its associative connection with a stimulus that automatically (unconditionally) produces the same or a very similar response.

In the case of the dog, food is regarded as the unconditioned stimulus (US) and salivation as the unconditioned response, or unconditioned reflex (UR), because salivation is an automatic, reflexive response to food. A neutral stimulus, such as a bell, will not cause salivation. However, if in a series of trials the bell rings immediately before food is offered, its sound alone, without the appearance of food after it, comes to elicit salivation. In this case we speak of conditioning, since salivation follows the bell without the presentation of food. The bell can then be classed as a conditioned stimulus (CS), and salivation to the bell as a conditioned response, or conditioned reflex (CR).

Based on the foregoing, Pavlov's basic scheme of the conditioned reflex can be written as S -> R, where S is the stimulus and R the response. From this scheme it is clear that the main way to control behavior is to control the presentation of stimuli that elicit a given response, that is, to control the external environment. By organizing the environment in a certain way and developing conditioned reflexes, it is possible to shape particular human behavior.

The elements of classical conditioning in this case are the unconditioned stimulus (US), the unconditioned response (UR), the conditioned stimulus (CS), and the conditioned response (CR).

I.P. Pavlov showed that the formation of a conditioned reflex is subject to a number of requirements:

  • - the most important of them is contiguity (the coincidence in time of the indifferent and unconditioned stimuli, with the indifferent stimulus slightly preceding);
  • - a no less important condition is repetition (repeated pairing of the indifferent and unconditioned stimuli); a minimal simulation of these two requirements is sketched below.
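
The text describes these requirements only qualitatively. The sketch below is a minimal illustration (not Pavlov's own formulation; the learning-rate and asymptote values are arbitrary assumptions) of how repeated pairings in which the neutral stimulus precedes the unconditioned stimulus gradually build the strength of the conditioned response:

```python
# Minimal sketch (assumed values, not from the source): associative strength
# grows with repeated CS-US pairings in which the CS precedes the US.
def condition(pairings: int, learning_rate: float = 0.3, asymptote: float = 1.0) -> float:
    """Return the associative strength of the CS after a given number of pairings."""
    strength = 0.0
    for _ in range(pairings):
        # Each pairing closes part of the gap between the current strength and
        # the maximum strength the US can support (a Rescorla-Wagner-style update).
        strength += learning_rate * (asymptote - strength)
    return strength

if __name__ == "__main__":
    for n in (1, 3, 5, 10):
        print(f"after {n:2d} pairings, CS -> CR strength = {condition(n):.2f}")
```

Repetition raises the strength of the association toward its ceiling, which is why a single pairing is rarely enough to form a stable conditioned reflex.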

Although Pavlov initially experimented with animals, other researchers began to study the basic processes of classical conditioning in humans.

The theory of operant conditioning is associated with the names of Edward Lee Thorndike (E.L. Thorndike) and Burrhus Frederic Skinner (B.F. Skinner). In contrast to the principle of classical conditioning (S->R), they developed the principle of operant conditioning (R->S), according to which behavior is controlled by its results and consequences. The main way to influence behavior, following from this formula, is to influence its results.

As noted above, respondent behavior is B.F. Skinner's term for behavior elicited by a known stimulus; he called its conditioning Type S conditioning to emphasize the importance of the stimulus that precedes the response and elicits it. However, Skinner believed that, in general, animal and human behavior cannot be explained in terms of classical conditioning alone. He emphasized behavior not related to any known stimuli and argued that behavior is mainly influenced by the stimulus events that come after it, namely its consequences. Since this type of behavior involves the organism actively acting on the environment to change events in some way, Skinner defined it as operant behavior. He also called it Type R conditioning, to emphasize the effect of the response on future behavior.

So, the key structural unit of the behaviorist approach in general, and of Skinner's approach in particular, is the response. Responses can range from simple reflex reactions (e.g., salivation to food, flinching at a loud noise) to complex behavioral patterns (e.g., solving a math problem, covert forms of aggression).

A response is an external, observable piece of behavior that can be related to environmental events. The essence of the learning process is the establishment of connections (associations) between responses and events in the external environment.

In his approach to learning, Skinner distinguished between responses that are elicited by well-defined stimuli (such as the blinking reflex in response to a puff of air) and responses that cannot be associated with any single stimulus. These reactions of the second type are generated by the organism itself and are called operants. Skinner believed that environmental stimuli do not force the organism to behave in a certain way and do not induce it to act. The original cause of behavior is in the organism itself.

Operant behavior (caused by operant learning) is determined by the events that follow the response. That is, behavior is followed by an effect, and the nature of that effect changes the organism's tendency to repeat that behavior in the future. For example, skateboarding, playing the piano, throwing darts, and writing one's own name are patterns of operant response, or operants controlled by the outcomes that follow the corresponding behavior. These are voluntary learned responses for which there is no recognizable stimulus. Skinner understood that it is meaningless to talk about the origin of operant behavior, since we do not know the stimulus or internal cause responsible for its occurrence. It happens spontaneously.

If the consequences are favorable for the organism, the probability of the operant being repeated in the future increases. When this happens, the consequences are said to be reinforcing, and the operant responses that the reinforcement follows are said to be conditioned (in the sense that their probability of occurrence is raised). The strength of a positive reinforcer is thus defined by its effect on the subsequent frequency of the responses that immediately preceded it.

Conversely, if the consequences of a response are unfavorable and not reinforcing, the probability of the operant being emitted decreases. Skinner believed that operant behavior is therefore also controlled by negative consequences. By definition, negative or aversive consequences weaken the behavior that produces them and strengthen the behavior that removes them.

Operant learning can thus be thought of as a learning process based on the relationship stimulus - response - reinforcement, in which behavior is shaped and maintained by one or another of its consequences; a minimal sketch of this loop follows.
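
The source states this loop only verbally. The following is a minimal illustrative sketch (the initial probability and the adjustment step are assumptions, not Skinner's own model): the probability of emitting an operant rises when the response is followed by reinforcement and falls when it is followed by an aversive consequence.

```python
import random

# Minimal sketch (assumed values): the probability of an operant response
# is adjusted by the consequence that follows each emission of the response.
def run_trials(trials: int, reinforce: bool, step: float = 0.1) -> float:
    """Return the probability of the operant after a series of consequences."""
    p_response = 0.5  # initial tendency to emit the operant (arbitrary)
    for _ in range(trials):
        emitted = random.random() < p_response
        if not emitted:
            continue
        if reinforce:
            p_response = min(1.0, p_response + step)  # favorable consequence strengthens
        else:
            p_response = max(0.0, p_response - step)  # aversive consequence weakens
    return p_response

if __name__ == "__main__":
    random.seed(0)
    print("after reinforcement:        ", round(run_trials(20, reinforce=True), 2))
    print("after aversive consequences:", round(run_trials(20, reinforce=False), 2))
```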

An example of operant behavior is a situation found in almost every family with small children: the operant learning of crying behavior. Whenever young children are in pain they cry, and the immediate reaction of parents is to give attention and other positive reinforcement. Since attention is a reinforcing factor for the child, the crying response becomes naturally conditioned. However, crying can also occur when there is no pain. Although most parents claim that they can tell crying from frustration apart from crying for attention, many parents nonetheless stubbornly reinforce the latter.

In 1969 Albert Bandura (b. 1925), a Canadian-born psychologist, put forward his theory of personality, known as the theory of social learning.

A. Bandura criticized radical behaviorism for denying the determinants of human behavior that arise from internal cognitive processes. For Bandura, individuals are neither autonomous systems nor mere mechanical transmitters animating the influences of their environment; they possess superior abilities that allow them to predict the occurrence of events and to create means of exercising control over what affects their daily lives. In his view, traditional theories of behavior were not so much wrong as incomplete as explanations of human behavior.

From A. Bandura's point of view, people are neither driven by intrapsychic forces nor simply reactive to the environment. The causes of human functioning must be understood in terms of the continuous interplay of behavior, cognition, and environment. This approach to analyzing the causes of behavior, which Bandura called reciprocal determinism, implies that dispositional factors and situational factors are interdependent causes of behavior.

Human functioning is seen as a product of the interaction of behavior, personality factors and the influence of the environment.

Simply put, internal determinants of behavior, such as belief and expectation, and external determinants, such as rewards and punishments, are part of a system of interacting influences that act not only on behavior, but also on various parts of the system.

Bandura's triadic model of reciprocal determinism shows that although behavior is influenced by the environment, the environment is also partly a product of human activity; that is, people can exert some influence on their own behavior. For example, a person's rude behavior at a dinner party may cause the reactions of those present to be punishing rather than encouraging for him. In any case, behavior changes the environment. Bandura also argued that, thanks to their extraordinary capacity to use symbols, people can think, create, and plan; that is, they are capable of cognitive processes that are constantly manifested through overt actions.

Each of the three variables in the model of reciprocal determinism is capable of influencing the others. Depending on the strength of each variable, now one, now another, now the third dominates. Sometimes environmental influences are strongest, sometimes inner forces dominate, and sometimes expectations, beliefs, goals, and intentions shape and guide behavior. Ultimately, however, Bandura believes that because of the two-way nature of the interaction between overt behavior and environmental circumstances, people are both the product and the producer of their environment. Thus, social-cognitive theory describes a model of mutual causation in which cognitive, affective, and other personal factors and environmental events operate as interdependent determinants.

Anticipated consequences. Researchers of learning emphasize reinforcement as a necessary condition for the acquisition, maintenance, and modification of behavior. Thus, Skinner argued that external reinforcement is essential for learning.

A. Bandura, although he recognizes the importance of external reinforcement, does not regard it as the only way in which our behavior is acquired, maintained, or changed. People can also learn by watching, reading, or hearing about other people's behavior. As a result of previous experience, people may expect certain behaviors to have consequences they value, others to produce undesirable results, and still others to be ineffective. Our behavior is therefore governed to a large extent by anticipated consequences. In each case we can imagine in advance the consequences of inadequate preparation for action and take the necessary precautions. Through our ability to represent actual outcomes symbolically, future consequences can be translated into momentary causal factors that influence behavior in much the same way as potential consequences do. Our higher mental processes give us the capacity for foresight.

At the heart of social-cognitive theory is the proposition that new forms of behavior can be acquired in the absence of external reinforcement. Bandura notes that much of the behavior we display is learned by example: we simply observe what others do and then imitate their actions. This emphasis on learning by observation or example, rather than by direct reinforcement, is the most distinctive feature of Bandura's theory.

Self-regulation and cognition in behavior. Another characteristic feature of social-cognitive theory is the important role it assigns to the unique human capacity for self-regulation. By arranging their immediate environment, providing cognitive support, and being aware of the consequences of their own actions, people are able to exert some influence on their own behavior. Of course, the functions of self-regulation are created by, and not infrequently supported by, the influence of the environment. They are thus of external origin, but it should not be overlooked that, once established, internal influences partially regulate which actions a person performs. Further, Bandura argues that higher intellectual abilities, such as the capacity to manipulate symbols, give us a powerful means of influencing our environment. Through verbal and imaginal representations we produce and store experience in such a way that it serves as a guide for future behavior. Our ability to form images of desirable future outcomes translates into behavioral strategies that guide us toward distant goals. Using the capacity to manipulate symbols, we can solve problems without resorting to trial and error, and can thus anticipate the probable consequences of various actions and change our behavior accordingly.

Conclusion

The term learning refers to a relatively permanent change in behavioral potential as a result of practice or experience. This definition contains three key elements:

  • 1) the change that has taken place is usually distinguished by stability and duration;
  • 2) the change is not the behavior itself, but the potential for its implementation (the subject can learn something that does not change his behavior for a long time or never affects him at all);
  • 3) learning requires the acquisition of some experience (so, it does not just happen as a result of maturation and growth).

Beginning with the work of Pavlov and Thorndike, the early representatives of "learning theory," which dominated the psychological science of the United States for almost the entire first half of the 20th century, directed their research at instrumental behavior: those kinds of behavior that produce consequences. For example, they studied the behavior of a rat moving through a maze to find the exit and obtain food, measuring such quantities as the time required for the rat to reach the goal on each of the repeated trials. As in Thorndike's studies, the procedure consisted of placing the rat at the start of the maze and then assessing its progress toward the exit. The main parameter analyzed was the number of trials required before the rat could finally run the whole maze without errors (such as turning into dead-end corridors).

Representatives of learning theory departed somewhat from strict behaviorism: they used concepts such as learning, motivation, drives, incentives, and mental inhibition, which referred to behavior that cannot be observed directly. According to the eminent learning theorist Clark Hull (1884-1952), such concepts are scientific insofar as they can be defined in terms of observable operations (see Hull, 1943). For example, the presence of hunger, or the "need for food," can be operationally defined in terms of the number of hours of food deprivation the rat experienced before the experiment, or in terms of the decrease of its body weight below normal. In turn, learning can be operationally defined in terms of the progressive decline, from trial to trial, in the time it takes a rat to reach the exit of a maze (or a cat to get out of a problem box). Theorists could now ask research questions such as: "Does learning occur faster if the motive to satisfy the food need is strengthened?" It turns out that it does, but only up to a certain point; beyond that point the rat simply does not have the strength to run the maze.

Learning researchers devised formulas for learning and behavior by averaging the behavior of large numbers of individual subjects, and gradually deduced general "laws" of learning. One of them is the classic learning curve, which applies to many types of human behavior. Learning a skill, such as playing a musical instrument, is characterized by rapid improvement in the early stages, after which the pace of improvement slows more and more. Suppose a child is learning to play the guitar: at first he quickly develops flexibility and dexterity of the fingers, the skills of plucking the strings and forming chords; but if he is to become a virtuoso, many years of practice will be required. The learning curve illustrates the emergence of many complex human skills quite well, even though it was derived from observations of rats improving their maze running over time.
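
The essay gives no equation for this curve. A common way to describe such negatively accelerated learning (an illustrative assumption, not a formula from the text) is an exponential approach to an asymptote, where P_max is the final level of performance, k the learning rate, and n the number of practice trials:

```latex
P(n) = P_{\max}\left(1 - e^{-kn}\right)
```

Performance rises quickly for small n, and the gains become progressively smaller as P(n) approaches P_max, matching the shape described above.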

Some other patterns identified by representatives of classical learning theory also extend to human behavior. However, many do not transfer in this way. The search for learning principles universal to all animal species has largely been abandoned in favor of species-specific principles.

The next theory to be considered in this essay is B.F. Skinner's theory of operant learning. I would like to dwell on this concept because the work of this personologist demonstrates most convincingly that environmental influences determine human behavior. The theory belongs to the learning-behavioral direction in personality theory. Personality, from the point of view of learning theory, is the experience a person has acquired during life, an accumulated set of behavior patterns. The learning-behavioral direction deals with the open actions of people, accessible to direct observation, as products of their life experience. Its theorists do not call for speculating about mental structures and processes hidden in the "mind"; on the contrary, they regard the external environment as the key factor in human behavior. It is the environment, and not internal mental phenomena, that forms the person.

Burrhus Frederic Skinner was born in 1904 in Susquehanna, Pennsylvania. The atmosphere in his family was warm and relaxed, discipline was quite strict, and rewards were given when they were deserved. As a boy, he spent a lot of time designing all kinds of mechanical devices.

In 1926 Skinner received a Bachelor of Arts degree in English literature from Hamilton College. After graduating he returned to his parents' house and tried to become a writer, but, fortunately, nothing came of this venture. He then entered Harvard University to study psychology, and in 1931 he was awarded a Ph.D.

From 1931 to 1936 Skinner carried out scientific work at Harvard, and from 1936 to 1945 he taught at the University of Minnesota. During this period he worked hard and fruitfully and gained fame as one of the leading behaviorists in the United States. From 1945 to 1947 he served as head of the psychology department at Indiana University, after which he worked at Harvard University until his retirement in 1974.

B.F. Skinner's scientific work earned many awards, including the National Medal of Science and, in 1971, the Gold Medal of the American Psychological Foundation. In 1990 he received the American Psychological Association's citation for an outstanding lifetime contribution to psychology.

Skinner was the author of many works, among them "The Behavior of Organisms" (1938), "Walden Two" (1948), "Verbal Behavior" (1957), "The Technology of Teaching" (1968), "The Shaping of a Behaviorist" (1979), and "Upon Further Reflection" (1987). He died of leukemia in 1990.

The learning-behavioral approach to personality developed by B.F. Skinner concerns the open actions of people as determined by their life experience. Skinner argued that behavior is deterministic (i.e., caused by certain events rather than arising spontaneously), predictable, and controlled by the environment. He emphatically rejected the idea of internal "autonomous" factors as causes of human action and neglected the physiological-genetic explanation of behavior.

Skinner recognized two main types of behavior:

  • 1. Respondent (a specific response that is elicited by a known stimulus, and the stimulus always precedes the response): a reaction to a familiar stimulus.
  • 2. Operant (responses freely emitted by the organism, whose frequency is strongly influenced by the application of various reinforcement schedules): behavior determined and controlled by the result that follows it.

His work focuses almost entirely on operant behavior. In operant learning the organism acts on its environment to produce an outcome that affects the likelihood that the behavior will be repeated. An operant response followed by a positive result tends to be repeated, while an operant response followed by a negative result tends not to be repeated. According to Skinner, behavior is best understood in terms of reactions to the environment.

Reinforcement is the key concept of Skinner's system. In the classical sense, reinforcement is an association formed by repeatedly pairing a conditioned stimulus with an unconditioned stimulus; in operant learning, it is an association formed when an operant response is followed by a reinforcing stimulus. Four different schedules of reinforcement have been described, each producing a different pattern of responding: fixed ratio, fixed interval, variable ratio, and variable interval. A distinction is also made between primary (unconditioned) and secondary (conditioned) reinforcers. A primary reinforcer is any event or object that has reinforcing properties in itself. A secondary reinforcer is any stimulus that acquires reinforcing properties through close association with a primary reinforcer in the organism's past learning. In Skinner's theory, secondary reinforcers (money, attention, approval) strongly influence human behavior. He also held that behavior is controlled by aversive (unpleasant) stimuli, such as punishment (which follows unwanted behavior and reduces the likelihood of its recurrence) and negative reinforcement (which consists in removing an unpleasant stimulus after the desired response). Positive punishment occurs when an unpleasant stimulus follows the response; negative punishment consists in removing a pleasant stimulus after the response; and negative reinforcement occurs when the organism manages to limit or avoid the presentation of an aversive stimulus. Skinner fought against the use of aversive methods (particularly punishment) in controlling behavior and emphasized control through positive reinforcement (the presentation of a pleasant stimulus after a response, which makes its repetition more likely).

In operant learning, stimulus generalization occurs when a response reinforced in the presence of one stimulus also occurs in the presence of other, similar stimuli. Stimulus discrimination, by contrast, consists of responding differently to different environmental stimuli. Both are necessary for effective functioning. The method of successive approximations, or shaping, consists of reinforcing behavior as it becomes progressively closer to the desired behavior. Skinner was convinced that verbal behavior, that is, language, is acquired through the process of reinforcement. He denied all internal sources of behavior.

The concept of operant learning has been tested experimentally many times. Skinner's approach to behavioral research is characterized by the study of a single subject, the use of automated equipment, and precise control of environmental conditions. An illustrative example is a study of the effectiveness of a token reward system for producing better behavior in a group of hospitalized psychiatric patients.

The modern application of the principles of operant learning is quite extensive. Its two main areas of application are:

  • 1. Communication skills training is a behavioral therapy technique designed to improve the client's interpersonal skills in real life interactions.
  • 2. Biofeedback: a type of behavioral therapy in which the client learns to control certain bodily functions (for example, blood pressure) with the help of special equipment that provides information about processes occurring inside the body.

Behavioral therapy is a set of therapeutic techniques for changing maladaptive or unhealthy behavior through the application of operant learning principles.

Self-confidence training based on behavior rehearsal techniques (a method in which the client learns interpersonal skills in structured role play) and on self-control is assumed to be very useful in helping people behave more successfully in various social interactions. Biofeedback training appears to be effective in treating migraine, anxiety, muscle tension, and hypertension. However, it remains unclear how biofeedback actually allows control over involuntary bodily functions.

B.F. Skinner's works argue most convincingly that environmental influences determine our behavior. Skinner held that behavior is almost entirely and directly conditioned by the possibility of reinforcement from the environment. In his opinion, in order to explain behavior (and thus understand personality) the researcher need only analyze the functional relationship between observable actions and observable consequences. Skinner's work laid the foundation for a science of behavior unparalleled in the history of psychology. In the view of many, he is one of the most highly respected psychologists of our time.

B.F. Skinner's operant behaviorism was subordinated to one main task: to predict and control the behavior of specific individuals.

The main provisions of the theory of B. F. Skinner:

Behavior can be reliably defined, predicted and controlled by environmental conditions. To understand behavior means to control it and vice versa.

He did not accept the idea of a personality or self that stimulates and directs behavior.

He emphasized intensive analysis of the characteristic features of a person's past experience and unique innate abilities.

The study of personality involves finding the peculiar nature of the relationship between the behavior of the organism and the results that reinforce it.

He believed that people are dependent on past experience.

B.F. Skinner regarded the human organism as a "black box." Behavior is only a function of its consequences, of lawful S-R relationships. He viewed personality merely as a set of characteristic response patterns; the individual's personality consists of relatively complex, yet independently acquired, responses. To understand behavior, one need only understand the person's past learning experience.

In B.F. Skinner's system, behavior consists of specific elements, operant responses. He recognized two main types of behavior:

  • - respondent, as a response to a familiar stimulus;
  • - operant, determined and controlled by the result that follows it.

Operant conditioning, according to B.F. Skinner, denotes a special way of forming conditioned reflexes that consists in reinforcing a response arising spontaneously in the subject, rather than a stimulus (in contrast to the "classical" Pavlovian way). Reinforcement is the key concept of his system. Reinforcing stimuli can be divided into primary and secondary. Primary reinforcers have reinforcing properties in themselves (for example, food, water, comfort). Secondary reinforcers (for example, money, attention, approval) are events or objects that acquire the ability to reinforce through close association with a primary reinforcer.

B.F. Skinner did not consider it necessary to treat a person's internal forces or motivational states as causal factors of behavior; instead he focused on the relationships between particular environmental phenomena and overt behavior. He held that personality is nothing more than the forms of behavior acquired through operant learning.

B. Skinner (1904-1990) is a representative of neobehaviorism.

The main provisions of the theory of "operant behaviorism":

1. The subject of the study is the behavior of the organism in its motor component.

2. Behavior is what the organism does and what can be observed; therefore consciousness and its phenomena - will, creativity, intellect, emotions, personality - cannot be the subject of study, since they cannot be observed objectively.

3. A person is not free, since he never controls his own behavior, which is determined by the external environment.

4. Personality is understood as a set of "situation-reaction" behavior patterns, the latter depending on previous experience and genetic history.

5. Behavior can be divided into three kinds: unconditioned-reflex and conditioned-reflex behavior, which are simple responses to a stimulus, and operant behavior, which occurs spontaneously and is shaped by conditioning; this last type plays the decisive role in the organism's adaptation to external conditions.

6. The main characteristic of operant behavior is its dependence on past experience, that is, on the stimulus that follows the response, called reinforcement. Behavior is strengthened or weakened depending on the reinforcement, which can be positive or negative.

7. The process of positive or negative reinforcement for an action is called conditioning.

8. On the basis of reinforcement one can build an entire system for teaching the child, so-called programmed learning, in which all the material is divided into small parts; upon successful completion and assimilation of each part the student receives positive reinforcement, and in case of failure, negative reinforcement (a minimal sketch of such a loop follows this list).

9. The system of education and of managing a person is built on the same basis: socialization occurs through positive reinforcement of the norms, values, and rules of behavior necessary for society, while antisocial behavior should receive negative reinforcement from society.
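
Point 8 describes programmed learning only in outline. The sketch below is a minimal illustration (the units, answers, and pass rule are assumptions introduced here, not taken from the text): the material is split into small units, a correct answer is positively reinforced by advancement, and a wrong answer means the same unit is presented again.

```python
# Minimal sketch of programmed learning (assumed details, not from the source).
UNITS = [
    ("2 + 2", "4"),
    ("3 * 3", "9"),
    ("10 - 7", "3"),
]

def run_program(answers):
    """answers: learner responses, one per presented question, in order."""
    answers = iter(answers)
    unit = 0
    while unit < len(UNITS):
        question, correct = UNITS[unit]
        response = next(answers)
        if response == correct:
            # positive reinforcement: the learner advances to the next small unit
            print(f"{question}? {response} -> correct, next unit")
            unit += 1
        else:
            # failure: the same unit is presented again until it is mastered
            print(f"{question}? {response} -> wrong, repeat this unit")

if __name__ == "__main__":
    run_program(["4", "8", "9", "3"])  # one mistake on the second unit
```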

Schedules of reinforcement.

The essence of operant learning is that reinforced behavior tends to be repeated, while behavior that is not reinforced or is punished tends not to be repeated or is suppressed. Hence the concept of reinforcement plays a key role in Skinner's theory.

The rate at which operant behavior is acquired and maintained depends on the schedule of reinforcement applied. A reinforcement schedule is a rule establishing the probability with which reinforcement will occur. The simplest rule is to present reinforcement every time the subject gives the desired response; this is called continuous reinforcement and is commonly used at the start of any operant learning, when the organism is learning to produce the correct response. In most situations of everyday life, however, this is either unfeasible or uneconomical for maintaining the desired response, since behavior is not reinforced uniformly and regularly. In most cases a person's social behavior is reinforced only occasionally. A child cries many times before getting the mother's attention; a scientist is wrong many times before arriving at the correct solution to a difficult problem. In both examples, unreinforced responses occur until one of them is reinforced.

Skinner carefully studied how a schedule of intermittent, or partial, reinforcement affects operant behavior. Although many different reinforcement schedules are possible, they can all be classified along two basic parameters: 1) reinforcement can occur only after a fixed or a variable time interval has elapsed since the previous reinforcement (interval schedules); 2) reinforcement can occur only after a fixed or a variable number of responses has been made (ratio schedules). These two parameters define the four main schedules of reinforcement.

1. Fixed-ratio schedule (FR). Under this schedule the organism is reinforced after a predetermined or "fixed" number of appropriate responses. This schedule is ubiquitous in everyday life and plays a significant role in the control of behavior. In many industries employees are paid partly or even exclusively according to the number of units they produce or sell; this system is known as piecework pay. The FR schedule usually produces an extremely high operant level, because the more often the organism responds, the more reinforcement it receives.

2. Fixed-interval schedule (FI). Under a fixed-interval schedule the organism is reinforced after a fixed or "constant" time interval has elapsed since the previous reinforcement. At the individual level the FI schedule applies when work is paid by the hour, the week, or the month. Similarly, a child's weekly pocket-money allowance is an FI form of reinforcement. Universities generally operate on an FI time schedule: examinations are set at regular intervals and academic progress reports are issued at set times. Curiously, the FI schedule produces a low response rate immediately after reinforcement is received, a phenomenon called the post-reinforcement pause. It is characteristic of students who study half-heartedly in the middle of the semester (assuming they did well on the last exam), since the next exam is still far off; they literally take a break from studying.

3. Variable-ratio schedule (VR). Under this schedule the organism is reinforced on the basis of a number of responses that is predetermined only on average. Perhaps the most dramatic illustration of human behavior under the control of a VR schedule is compulsive gambling. Consider a person playing a slot machine, inserting a coin and pulling a handle in the hope of a prize. These machines are programmed so that reinforcement (money) is distributed according to the number of attempts the person pays for by operating the handle. However, the winnings are unpredictable and irregular and rarely exceed what the player has put in; this explains why casino owners receive considerably more reinforcement than their regular customers. Further, the extinction of behavior acquired under a VR schedule occurs very slowly, since the organism does not know exactly when the next reinforcement will come. Thus the player keeps dropping coins into the slot despite negligible winnings (or even losses), fully confident that next time he will "hit the jackpot." Such persistence is typical of behavior produced by a VR schedule.

4. Variable-interval schedule (VI). Under this schedule the organism receives reinforcement after an indefinite time interval has passed. Like the FI schedule, reinforcement here depends on time, but the time between reinforcements on a VI schedule varies around some average value rather than being fixed. As a rule, the response rate on a VI schedule is a direct function of the interval length used: short intervals generate high rates, long intervals low rates. Also, under VI reinforcement the organism tends to establish a steady rate of responding, and in the absence of reinforcement the responses extinguish slowly. Ultimately the organism cannot accurately predict when the next reinforcement will come.

In everyday life the VI schedule is not encountered often, although several variants of it can be observed. A parent, for example, may praise a child's behavior rather arbitrarily, counting on the child to continue behaving appropriately during the unreinforced intervals. Likewise, professors who give "surprise" tests, whose frequency varies from one in three days to one in three weeks (on average one in two weeks), are using a VI schedule. Under these conditions students can be expected to maintain a relatively high level of diligence, since they never know when the next test will come.

As a rule, the VI schedule generates a higher response rate and greater resistance to extinction than the FI schedule. A minimal simulation contrasting the four schedules is sketched below.
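
The four schedules are described above only verbally. The sketch below is a minimal illustration (the response stream, the fixed requirement of 5, and the 1-to-10 random range are assumptions, not data from the source) of how each schedule decides when a response earns reinforcement:

```python
import random

# Minimal sketch (assumed parameters): for a stream of responses emitted once
# per second, count how many are reinforced under each schedule.
def simulate(schedule: str, n_responses: int = 100, seed: int = 0) -> int:
    rng = random.Random(seed)
    reinforced = 0
    responses_since = 0   # responses since the last reinforcement (ratio schedules)
    seconds_since = 0     # seconds since the last reinforcement (interval schedules)
    requirement = 5 if schedule in ("FR", "FI") else rng.randint(1, 10)

    for _ in range(n_responses):
        responses_since += 1
        seconds_since += 1  # one response per second, for simplicity
        if schedule in ("FR", "VR"):
            met = responses_since >= requirement   # enough responses made
        else:  # "FI" or "VI"
            met = seconds_since >= requirement     # enough time elapsed
        if met:
            reinforced += 1
            responses_since = seconds_since = 0
            if schedule in ("VR", "VI"):
                requirement = rng.randint(1, 10)   # variable schedules: new random requirement
    return reinforced

if __name__ == "__main__":
    for s in ("FR", "FI", "VR", "VI"):
        print(s, "reinforcements earned:", simulate(s))
```

In a fuller simulation the organism's response rate would itself depend on the schedule; here only the rule for delivering reinforcement is shown.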

Conditioned reinforcement.

Learning theorists recognize two types of reinforcers: primary and secondary. A primary reinforcer is any event or object that has reinforcing properties in itself; such reinforcers do not require prior association with other reinforcers in order to satisfy a biological need. Primary reinforcers for humans are food, water, physical comfort, and sex; their value for the organism does not depend on learning. A secondary, or conditioned, reinforcer, on the other hand, is any event or object that acquires the property of reinforcing through close association with a primary reinforcer in the organism's past learning experience. Examples of common secondary reinforcers in humans are money, attention, affection, and good grades.

A slight change in the standard operant learning procedure demonstrates how a neutral stimulus can acquire reinforcing force for behavior. Once a rat has learned to press the lever in the "Skinner box," an auditory signal is introduced immediately after each response, followed by a pellet of food. In this case the sound acts as a discriminative stimulus (that is, the animal learns to respond only in the presence of the sound, since it signals a food reward). After this specific operant response is established, extinction begins: when the rat presses the lever, neither food nor sound appears, and after a while the rat stops pressing the lever. Then the sound is presented again each time the animal presses the lever, but no food pellet appears. Despite the absence of the original reinforcing stimulus, the animal keeps pressing the lever persistently, and extinction is thereby slowed. In other words, the sustained rate of lever pressing shows that the sound now acts as a conditioned reinforcer. The exact rate of responding depends on the strength of the sound as a conditioned reinforcer (that is, on the number of times it was paired with the primary reinforcer, food, during learning). Skinner argued that virtually any neutral stimulus can become reinforcing if it is associated with other stimuli that already have reinforcing properties. Thus the phenomenon of conditioned reinforcement greatly expands the scope of possible operant learning, especially with respect to human social behavior. Put differently, if everything we learned had to be tied directly to primary reinforcement, the opportunities for learning would be very limited and human activity would not be nearly so diverse.
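
A minimal sketch of the pairing step just described (all numbers are illustrative assumptions, not experimental data): a tone acquires conditioned reinforcing value in proportion to how often it has been paired with food, and a stronger conditioned reinforcer then sustains more responding during extinction.

```python
# Minimal illustrative sketch (not the original experiment).
def conditioned_value(pairings: int, increment: float = 0.2, ceiling: float = 1.0) -> float:
    """Reinforcing value acquired by the tone after a number of tone-food pairings."""
    value = 0.0
    for _ in range(pairings):
        value += increment * (ceiling - value)  # diminishing gains toward a ceiling
    return value

def presses_during_extinction(tone_value: float, base_presses: int = 5) -> int:
    """A stronger conditioned reinforcer sustains more lever presses without food."""
    return base_presses + int(tone_value * 50)

if __name__ == "__main__":
    for n in (0, 5, 20):
        v = conditioned_value(n)
        print(f"{n:2d} pairings -> tone value {v:.2f}, "
              f"~{presses_during_extinction(v)} presses before responding fades")
```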

A characteristic of a conditioned reinforcer is that it generalizes when combined with more than one primary reinforcer. Money is a particularly telling example. Obviously, money cannot satisfy any of our primary drives; yet, thanks to the system of cultural exchange, money is a powerful factor in obtaining many pleasures. For example, money allows us to have fashionable clothes, flashy cars, medical care, and education. Other types of generalized conditioned reinforcers are flattery, praise, affection, and the submission of others. These so-called social reinforcers (involving the behavior of other people) are often very complex and subtle, but they are essential to our behavior in a great variety of situations. Attention is a simple case. Everyone knows that a child can get attention by pretending to be sick or by misbehaving. Children are often annoying: they ask ridiculous questions, interrupt adults' conversations, show off, tease younger sisters or brothers, and wet the bed, all in order to attract attention. The attention of a significant other (parents, teachers, a loved one) is a particularly effective generalized conditioned stimulus that can maintain pronounced attention-seeking behavior.

An even stronger generalized conditioned stimulus is social approval. For example, many people spend a great deal of time preening in front of a mirror in the hope of receiving an approving look from a spouse or lover. Both women's and men's fashion is driven by approval, and it persists only as long as social approval lasts. High school students compete for a place on the varsity track and field team or take part in extracurricular activities (drama, debate, the school yearbook) in order to win the approval of parents, peers, and neighbors. Good grades in college are also a positive reinforcer, because praise and approval from parents were earlier received for them. As a powerful conditioned reinforcer, good grades also encourage studying and academic achievement.

Skinner believed that conditioned reinforcers are very important in controlling human behavior (Skinner, 1971). He also noted that each person goes through a unique history of learning, and it is unlikely that all people are driven by the same reinforcers. For example, for some people success as an entrepreneur is a very strong reinforcer; for others an expression of tenderness matters most; and still others find a reinforcing stimulus in sports, academic, or musical pursuits. The possible variations in behavior supported by conditioned reinforcers are endless. Therefore, understanding human conditioned reinforcers is much more difficult than understanding why a food-deprived rat presses a lever with only a sound as reinforcement.

Controlling behavior through aversive stimuli.

From Skinner's point of view, human behavior is largely controlled by aversive (unpleasant or painful) stimuli. The two most typical methods of aversive control are punishment and negative reinforcement; the terms are often confused when the conceptual properties and behavioral effects of aversive control are described. Skinner offered the following definition: "You can distinguish punishment, in which an aversive event is made contingent upon a response, from negative reinforcement, in which the reinforcement is the removal of an aversive stimulus, conditioned or unconditioned" (Evans, 1968, p. 33).

Punishment. The term punishment refers to any aversive stimulus or event that follows, or is contingent upon, the occurrence of some operant response. Instead of strengthening the response it follows, punishment reduces, at least temporarily, the probability that the response will occur again. The intended purpose of punishment is to induce people not to behave in a given way. Skinner (1983) noted that it is the most common method of behavior control in modern life.

According to Skinner, punishment can be carried out in two different ways, which he called positive punishment and negative punishment (Table 7-1). Positive punishment occurs whenever a behavior leads to an aversive outcome. Some examples: if children misbehave, they are spanked or scolded; if students use cheat sheets in an exam, they are expelled from the university or school; if adults are caught stealing, they are fined or jailed. Negative punishment occurs whenever a behavior is followed by the removal of a (potential) positive reinforcer. For example, children are forbidden to watch television because of bad behavior. A widely used form of negative punishment is the time-out technique, in which a person is immediately removed from a situation in which certain reinforcing stimuli are available. For example, a disobedient fourth-grader who disrupts the class may be sent out of the classroom.

<Physical isolation is one method of punishment intended to prevent displays of undesirable behavior.>

Negative reinforcement. Unlike punishment, negative reinforcement is the process by which the organism limits or avoids an aversive stimulus. Any behavior that prevents an aversive state of affairs is thereby more likely to be repeated; it is negatively reinforced (see Table 7-1). Escape behavior is one such case. Say a person who escapes the scorching sun by going indoors is likely to go indoors again when the sun becomes scorching. Note that escaping an aversive stimulus is not the same as avoiding it, since in avoidance the aversive stimulus is not yet physically present. Accordingly, another way of dealing with unpleasant conditions is to learn to avoid them, that is, to behave so as to prevent their occurrence. This strategy is known as avoidance learning. For example, if the educational process allows a child to avoid homework, negative reinforcement is being used to increase interest in learning. Avoidance behavior is also seen when drug addicts devise clever plans to maintain their habit without incurring the aversive consequence of imprisonment.

Table 7-1. Positive and negative reinforcement and punishment

  • - A pleasant stimulus is presented after the response: positive reinforcement (the response is strengthened).
  • - An aversive stimulus is removed after the response: negative reinforcement (the response is strengthened).
  • - An aversive stimulus is presented after the response: positive punishment (the response is weakened).
  • - A pleasant stimulus is removed after the response: negative punishment (the response is weakened).

Both reinforcement and punishment can thus be carried out in two ways, depending on whether the response is followed by the presentation or the removal of a pleasant or an unpleasant stimulus. Note that reinforcement strengthens the response; punishment weakens it.

Skinner (1971, 1983) fought against all forms of behavior control based on aversive stimuli. He singled out punishment as an ineffective means of controlling behavior: because of its threatening nature, the tactic of punishing unwanted behavior can produce negative emotional and social side effects. Anxiety, fear, antisocial actions, and the loss of self-esteem and confidence are only some of the possible negative side effects associated with the use of punishment. The threat involved in aversive control can also push people toward behavior even more objectionable than that for which they were originally punished. Consider, for example, a parent who punishes a child for mediocre academic performance. Later, in the parent's absence, the child may behave even worse: skip classes, roam the streets, damage school property. Whatever the outcome, it is clear that the punishment did not succeed in producing the desired behavior in the child. Since punishment may only temporarily suppress unwanted or inappropriate behavior, Skinner's main objection was that behavior followed by punishment is likely to reappear where the one who can punish is absent. A child who has been punished several times for sexual play will not necessarily give it up; a person imprisoned for violent assault will not necessarily be less prone to violence. The punished behavior may reappear once the likelihood of being punished has disappeared (Skinner, 1971, p. 62). Examples of this are easy to find in everyday life. A child who gets spanked for swearing at home is free to do it elsewhere. A driver fined for speeding may pay the fine and speed freely again when there is no radar patrol nearby.

Instead of aversive control of behavior, Skinner (1978) recommended positive reinforcement as the most effective method for eliminating unwanted behavior. He argued that, because positive reinforcers do not produce the negative side effects associated with aversive stimuli, they are better suited for shaping human behavior. For example, convicted criminals are held in intolerable conditions in many penal institutions (the numerous prison riots in the United States in recent years are evidence of this). It is obvious that most attempts to rehabilitate criminals have failed, as the high rates of recidivism and repeated violations of the law confirm. Applying Skinner's approach, the prison environment could be regulated so that behavior resembling that of law-abiding citizens is positively reinforced (for example, learning social adaptation skills, values, and relationships). Such a reform would require the involvement of behavioral experts with knowledge of the principles of learning, personality, and psychopathology. From Skinner's point of view, it could be carried out successfully using existing resources and psychologists trained in the methods of behavioral psychology.

Skinner demonstrated the power of positive reinforcement, and this has influenced the behavioral strategies used in child-rearing, education, business, and industry. In all these areas there is a growing tendency to reward desirable behavior rather than punish undesirable behavior.

Generalization and discrimination of stimuli.

A logical extension of the reinforcement principle is that behavior reinforced in one situation is very likely to be repeated when the organism encounters other situations that resemble it. If this were not so, our behavioral repertoire would be so severely limited and chaotic that we might wake up in the morning and have to ponder at length how to respond appropriately to every new situation. In Skinner's theory, the tendency of reinforced behavior to spread to many similar situations is called stimulus generalization. The phenomenon is easy to observe in everyday life. For example, a child who has been praised for polished manners at home will generalize this behavior to appropriate situations outside the home; such a child does not need to be taught how to behave decently in each new situation. Stimulus generalization can also result from unpleasant life experiences. A young woman raped by a stranger may generalize her shame and hostility to all members of the opposite sex, because they remind her of the physical and emotional trauma inflicted by the stranger. Likewise, a single instance of fright or an aversive experience caused by a person belonging to a certain ethnic group (white, black, Hispanic, Asian) may be enough for an individual to form a stereotype and thus avoid future social contact with all members of that group.

Although the ability to generalize responses is an important aspect of many of our daily social interactions, adaptive behavior clearly also requires the ability to make distinctions between situations. Stimulus discrimination, a process complementary to generalization, is learning to respond appropriately in different environmental situations. Examples abound. A motorist stays alive during rush hour by discriminating between red and green traffic lights. A child learns to distinguish a domestic dog from a vicious one. A teenager learns to distinguish behavior that wins the approval of peers from behavior that irritates and alienates others. A diabetic quickly learns to distinguish food containing a lot of sugar from food containing little. Indeed, virtually all intelligent human behavior depends on the ability to discriminate.

The ability to discriminate is acquired through the reinforcement of responses in the presence of some stimuli and their non-reinforcement in the presence of others. Discriminative stimuli thus enable us to anticipate the likely outcomes of emitting a particular operant response in different social situations. Accordingly, individual variation in discriminative capacity depends on each person's unique past experience of different reinforcements. Skinner suggested that healthy personal development results from the interaction of generalizing and discriminating abilities, by means of which we regulate our behavior so as to maximize positive reinforcement and minimize punishment.
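The paragraph above describes discrimination training as differential reinforcement: the same response is reinforced in the presence of one stimulus and left unreinforced in the presence of another. The sketch below is a minimal, purely illustrative simulation of that procedure; it is not taken from Skinner's work, and the stimulus labels, learning rate, and trial count are assumptions chosen only for readability.

```python
import random

# Minimal illustrative sketch of discrimination training (not from Skinner's text).
# Responses emitted in the presence of "S+" are reinforced; responses in the
# presence of "S-" are not. Labels, learning rate, and trial count are assumptions.

p_respond = {"S+": 0.5, "S-": 0.5}  # initial probability of responding to each stimulus
LEARNING_RATE = 0.1
TRIALS = 300

for _ in range(TRIALS):
    stimulus = random.choice(["S+", "S-"])
    if random.random() < p_respond[stimulus]:        # the organism responds
        if stimulus == "S+":                         # reinforced response strengthens
            p_respond[stimulus] += LEARNING_RATE * (1.0 - p_respond[stimulus])
        else:                                        # non-reinforced response extinguishes
            p_respond[stimulus] -= LEARNING_RATE * p_respond[stimulus]

print(f"P(respond | S+) = {p_respond['S+']:.2f}")    # climbs toward 1
print(f"P(respond | S-) = {p_respond['S-']:.2f}")    # falls toward 0
```

Running the sketch shows the two response probabilities diverging, which is the behavioral signature of discrimination described above.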

Successive approximation: how to make the mountain come to Mohammed.

Skinner's early experiments in operant learning focused on responses that are usually emitted at a moderate or high frequency (e.g., a pigeon pecking a key, a rat pressing a lever). However, it soon became apparent that the standard procedure of operant learning was ill-suited to the large number of complex operant responses that occur spontaneously with a probability close to zero. In the sphere of human behavior, for example, it is doubtful that a general strategy of operant learning could by itself teach psychiatric patients appropriate interpersonal skills. To make this task easier, Skinner (1953) devised a technique by which psychologists could effectively and quickly bring almost any behavior into a person's repertoire. This technique, called the method of successive approximation, or shaping of behavior, consists of reinforcing the behavior that comes closest to the desired operant behavior. The desired behavior is approached step by step: one response is reinforced and then replaced by another that is closer to the desired result.
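Because shaping is essentially a step-by-step procedure, it can be illustrated with a small simulation. The sketch below is only a schematic illustration under assumed parameters (the target value, criterion step, and shift size are invented for the example), not Skinner's own experimental procedure: responses that meet the current criterion are reinforced, each reinforcement nudges the organism's typical response toward the target, and the criterion is then tightened.

```python
import random

# Minimal illustrative sketch of shaping by successive approximation
# (a schematic under assumed parameters, not Skinner's own procedure).
# Only responses that meet the current criterion are reinforced; each
# reinforcement shifts the typical response toward the target, and the
# criterion is then raised, step by step.

TARGET = 10.0            # desired operant behavior, on an arbitrary scale
typical_response = 0.0   # where emitted responses currently cluster
criterion = 1.0          # first approximation that earns reinforcement
STEP = 1.0               # how much the criterion is raised after each success
SHIFT = 0.5              # how far reinforcement pulls the typical response

for trial in range(500):
    response = random.gauss(typical_response, 1.0)  # emitted behavior varies
    if response >= criterion:                       # close enough: reinforce
        typical_response += SHIFT * (response - typical_response)
        criterion = min(TARGET, criterion + STEP)   # demand a closer approximation

print(f"Typical response after shaping: {typical_response:.1f} (target {TARGET})")
```

If the criterion were fixed at the final target from the start, a response meeting it would almost never be emitted spontaneously, which is exactly why behaviors with near-zero initial probability call for shaping.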

Skinner held that the shaping process underlies the development of spoken language. For him, language is the result of the reinforcement of the child's utterances, provided initially through verbal communication with parents and siblings. Thus, starting from the rather simple forms of babbling in infancy, children's verbal behavior gradually develops until it begins to resemble adult language. In Verbal Behavior, Skinner gives a more detailed account of how the "laws of language", like any other behavior, are acquired through the same operant principles (Skinner, 1957). As might be expected, other researchers have questioned Skinner's claim that language is simply the product of verbal utterances selectively reinforced during the first years of life. Noam Chomsky (Chomsky, 1972), one of Skinner's most rigorous critics, argues that the rapid rate at which children acquire verbal skills in early childhood cannot be explained in terms of operant learning. In Chomsky's view, the characteristics the brain possesses at birth are what enable the child to acquire language; in other words, there is an innate capacity to learn the complex rules of conversational communication.

This concludes our brief review of Skinner's learning-behavioral approach. As we have seen, Skinner did not consider it necessary to regard a person's internal forces or motivational states as causal factors in behavior. Rather, he focused on the relationship between certain environmental events and overt behavior. Further, he held that personality is nothing more than certain forms of behavior acquired through operant learning. Whether or not these considerations add up to a comprehensive theory of personality, Skinner has had a profound effect on our understanding of the problems of human learning. The philosophy underlying Skinner's view of man clearly sets him apart from most of the personologists we have already met.



 