Abstract:
We studied the choice behavior of 2 monkeys in a discrete-trial task with reinforcement contingencies similar to those Herrnstein (1961) used when he described the matching law. In each session, the monkeys experienced blocks of discrete trials at different relative-reinforcer frequencies or magnitudes with unsignalled transitions between the blocks. Steady-state data following adjustment to each transition were well characterized by the generalized matching law; response ratios undermatched reinforcer frequency ratios but matched reinforcer magnitude ratios. We modelled response-by-response behavior with linear models that used past reinforcers as well as past choices to predict the monkeys' choices on each trial. We found that more recently obtained reinforcers more strongly influenced choice behavior. Perhaps surprisingly, we also found that the monkeys' actions were influenced by the pattern of their own past choices. It was necessary to incorporate both past reinforcers and past choices in order to accurately capture steady-state behavior as well as the fluctuations during block transitions and the response-by-response patterns of behavior. Our results suggest that simple reinforcement learning models must account for the effects of past choices to accurately characterize behavior in this task, and that models with these properties provide a conceptual tool for studying how both past reinforcers and past choices are integrated by the neural systems that generate behavior.
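The generalized matching law named in the abstract is commonly written as log(B1/B2) = a * log(R1/R2) + log(b), where B1/B2 is the response ratio, R1/R2 is the reinforcer ratio, a is sensitivity, and b is bias; undermatching corresponds to a < 1 and strict matching to a = 1. The following is a minimal sketch of how a and b could be estimated by ordinary least squares on log ratios. The function name fit_generalized_matching and all numbers below are illustrative assumptions; the data are synthetic and are not results from the study, and this is not necessarily the fitting procedure the authors used.

```python
import math


def fit_generalized_matching(reinforcer_ratios, response_ratios):
    """Least-squares fit of log10(B1/B2) = a * log10(R1/R2) + log10(b).

    Returns (a, b), where a is sensitivity and b is bias.
    """
    xs = [math.log10(r) for r in reinforcer_ratios]
    ys = [math.log10(b) for b in response_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope and intercept of the simple linear regression of ys on xs.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = 10 ** (mean_y - a * mean_x)
    return a, b


# Hypothetical reinforcer-frequency ratios (R1/R2), not values from the study.
reinforcer_ratios = [1 / 6, 1 / 3, 1.0, 3.0, 6.0]
# Synthetic response ratios generated with a = 0.8 (undermatching) and b = 1.
response_ratios = [r ** 0.8 for r in reinforcer_ratios]

a, b = fit_generalized_matching(reinforcer_ratios, response_ratios)
print(f"sensitivity a = {a:.3f}, bias b = {b:.3f}")
```

With real steady-state data, one pair of ratios per block would be passed in, and a fitted sensitivity below 1 would indicate undermatching of the kind the abstract reports for reinforcer frequency.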
journal_name: J Exp Anal Behav
journal_title: Journal of the experimental analysis of behavior
authors: Lau B, Glimcher PW
doi: 10.1901/jeab.2005.110-04
subject: Has Abstract
pub_date: 2005-11-01 00:00:00
pages: 555-79
issue: 3
eissn: 0022-5002
issn: 1938-3711
journal_volume: 84
pub_type: Journal Article
abstract::The interresponse-time reinforcement contingencies and distributions of interreinforcement intervals characteristic of certain variable-interval schedules were mimicked by reinforcing each key peck with a probability equal to the duration of the interresponse time it terminated, divided by the scheduled mean interrein...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1979.31-3
update_date: 1979-01-01 00:00:00
abstract::In a concurrent-chains procedure, pigeons chose between equivalent mixed and multiple fixed-interval schedules of reinforcement. In the first experiment, preference for the multiple schedule was higher when the probability of the shorter fixed interval was less than .50 than for complementary points, an outcome consis...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1980.33-3
update_date: 1980-01-01 00:00:00
abstract::Grain was briefly presented to food-deprived pigeons intermittently and response-independently except during signaled timeouts. During Experiment 1, key pecks postponed the next timeout for a specified interval. Rates of pecking during time in were inversely related to the length of time pecking postponed the next tim...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1983.40-153
update_date: 1983-09-01 00:00:00
abstract::Three experiments explored whether access to wheel running is sufficient as reinforcement to establish and maintain simple and conditional visual discriminations in nondeprived rats. In Experiment 1, 2 rats learned to press a lit key to produce access to running; responding was virtually absent when the key was dark, ...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1998.70-103
update_date: 1998-09-01 00:00:00
abstract::Six rhesus monkeys responding under a three-component multiple schedule were administered haloperidol to determine its effects on cocaine self-administration and on cocaine's disruptive effects on the repeated acquisition and performance of response chains. In the absence of haloperidol, 0.0032-0.032 mg/kg/infusion of...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.2008.89-225
update_date: 2008-03-01 00:00:00
abstract::Hungry rats received food following lever-press durations exceeding a minimum value, which ranged from 0 to 6.4 sec. When no intertrial intervals separated successive presses, modal press durations remained at very short values as the minimum value required for food was increased. This was particularly true immediatel...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1973.19-239
update_date: 1973-03-01 00:00:00
abstract::The performance of rats trained on multiple variable-interval schedules was examined before, during, and after punishment. The same linear function related relative response rates to relative density of reinforcement both in the presence and absence of punishment. Equal relative suppression was seen in both the high a...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1968.11-147
update_date: 1968-03-01 00:00:00
abstract::The purpose of this study was to model hierarchical classification as contextually controlled, generalized relational responding or relational framing. In Experiment 1, a training procedure involving nonarbitrarily related multidimensional stimuli was used to establish two arbitrary shapes as contextual cues for 'memb...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1002/jeab.63
update_date: 2014-01-01 00:00:00
abstract::Observer monkeys were housed next to demonstrator monkeys that were conditioned to respond on a multiple reinforcement schedule whose components were fixed-ratio 32, variable-interval 3-min, and extinction 5-min followed by an additional 30 sec of extinction during which every response started a new 30-sec interval. A...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1970.14-225
update_date: 1970-09-01 00:00:00
abstract::Auto-shaping the pigeon's key-peck response was examined as a respondent conditioning procedure with the use of Rescorla's truly-random control procedure. In the first experiment, pigeons received presentations of brief light on the response key and brief presentations of food where the light and the food were indepen...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1973.20-323
update_date: 1973-11-01 00:00:00
abstract::In Experiment I, keylight was paired with inaccessible grain delivery (under two conditions of keylight intensity) to determine if autoshaping would occur in the absence of primary reinforcement. In Experiment II, the procedure was repeated with accessible grain, for comparison. In Experiment III, the procedures were ...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1975.23-199
update_date: 1975-03-01 00:00:00
abstract::In Experiment I, a group of eight pigeons performed on concurrent random-interval schedules constructed by holding probability equal and varying cycle time to produce ratios of reinforcer densities of 1:1, 3:1, and 5:1 for key pecking. Schedules for a second group of seven were constructed with equal cycle times and u...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1978.30-301
update_date: 1978-11-01 00:00:00
abstract::In Experiment I, two groups of four pigeons each were exposed to multiple schedules in which one component was always a variable-interval schedule with a mean interreinforcement interval of 30 or 180 seconds. The other component was either an equal variable-interval schedule or extinction. Response rates in the unchan...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1974.22-471
update_date: 1974-11-01 00:00:00
abstract::Three experiments were conducted to study the effect of an imperfect substitute for food on demand for food in a closed economy. In Experiments 1 and 2, rats pressed a lever for their entire daily food ration, and a fixed ratio of presses was required for each food pellet. In both experiments, the fixed ratio was held...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1996.65-401
update_date: 1996-03-01 00:00:00
abstract::Three experiments used pigeons in an autoshaping procedure and a single-subject design to examine compound stimulus control in classical conditioning. Experiment 1 examined the blocking effect, and Experiment 2 examined the unblocking effect. In both experiments, response-independent food was first delivered intermitt...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1996.65-575
update_date: 1996-05-01 00:00:00
abstract::Pigeons were presented with multiple schedules of alternating 90-sec components. When components in which grain was never presented alternated with components in which grain was presented on a variable-interval schedule, the average rate of responding in the variable-interval components increased, showing overall posi...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1975.24-291
update_date: 1975-11-01 00:00:00
abstract::Horner and Staddon (1987) argued that a class of reward-following processes defined by a property they termed ratio invariance is a better model for the probabilistic choice performance of pigeons than competing molecular accounts such as momentary maximizing, melioration, and the Bush-Mosteller model. The critical da...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1989.52-57
update_date: 1989-07-01 00:00:00
abstract::Five pigeons were trained in an analogue foraging procedure in which, by completing a travel requirement, they entered a "patch" in which a reinforcer might be available after an unpredictable time. They also had the opportunity, by emitting a defined response, to exit the patch and travel to another patch. Prey avail...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1994.62-185
update_date: 1994-09-01 00:00:00
abstract::Three pigeons, previously trained to discriminate different numbers of responses (fixed ratios), were tested under different reinforcement contingencies (payoff matrices) at two levels of sensitivity. For one subject, relative reinforcement magnitude was varied: at first, across sessions and then, at midsession by reve...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1978.30-69
update_date: 1978-07-01 00:00:00
abstract::Three experiments were conducted to examine pigeons' postponement of signaled extinction periods (timeouts) from a schedule of food reinforcement when such responding neither decreased overall timeout frequency nor increased the overall frequency of food reinforcement. A discrete-trial procedure was used in which a re...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.2000.74-147
update_date: 2000-09-01 00:00:00
abstract::Three pigeons were trained to discriminate a 5.0 mg/kg dose of pentobarbital from saline under a two-key concurrent schedule with responding on the key associated with the presession injection, under both stimulus conditions, producing four times as many reinforcers as responding on the other key. This concurrent sche...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1996.65-495
update_date: 1996-05-01 00:00:00
abstract::Two pigeons were trained to peck a key under a free-operant avoidance schedule. Then, changes in key color signalled the beginning (safe period) and the end (warning period) of the response-shock interval, with a response required to change the key color. Finally, a change in key color signalled the warning period and...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1977.27-281
update_date: 1977-03-01 00:00:00
abstract::Five rats pressed levers on variable-interval schedules of water reinforcement at various levels of water deprivation. In one phase of the experiment, three deprivation conditions that replicated conditions in Heyman and Monaghan (1987) were arranged, along with three less extreme deprivation conditions. In a second p...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1999.72-251
update_date: 1999-09-01 00:00:00
abstract::Pigeons' key pecks were reinforced with food on a fixed-interval schedule. Food also was available at variable time periods either independently of responding or for not key pecking (a differential-reinforcement-of-other-behavior schedule). The latter condition arranged reinforcement following the first pause of t sec...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1980.34-285
update_date: 1980-11-01 00:00:00
abstract::Rats obtained food pellets on a variable-interval schedule of reinforcement by nose poking a lighted key. After training to establish baseline performance (with the mean variable interval set at either 60, 120, or 240 s), the rats were given free access to food during the hour just before their daily session. This sat...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.2004.81-155
update_date: 2004-03-01 00:00:00
abstract::Two experiments demonstrated the pigeon's sensitivity to ultraviolet light. In Experiment I, pigeons' responses were reinforced on a multiple schedule with a variable-interval reinforcement schedule in one component and extinction in the other component. Response rates were quite different in the two components where ...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1972.17-325
update_date: 1972-05-01 00:00:00
abstract::Pigeons' responses in the presence of two concurrently available (initial-link) stimuli produced one of two different (terminal-link) stimuli according to identical but independent variable-interval schedules. Responses in the presence of each terminal-link stimulus produced equal frequencies of food reinforcement, bu...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1968.11-15
update_date: 1968-01-01 00:00:00
abstract::Previous experiments have shown that both cocaine and d,l-cathinone can function as positive reinforcers when delivered intravenously to rhesus monkeys. However, the relative reinforcing efficacies of these compounds have not been established. In the present experiment, three rhesus monkeys were allowed to choose betwe...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1984.41-i35
update_date: 1984-01-01 00:00:00
abstract::Pigeons were trained in a matching-to-sample procedure with retention intervals of 0, 2, 4, 6, and 8 s mixed within each session. In different conditions, reinforcement was delayed by 0, 1, 2, 4, 6, or 8 s from correct choice responses. Discriminability decreased with increasing retention-interval duration and with in...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.2003.80-77
update_date: 2003-07-01 00:00:00
abstract::Parallel and serial timing processes are analyzed for their account of the dynamics of intertrial responding in the peak procedure. A strictly serial model, such as the behavioral theory of timing (Killeen & Fetterman, 1988), does not fit the dynamic correlation pattern in the location and duration of the middle high-...
journal_title:Journal of the experimental analysis of behavior
pub_type: Journal Article
doi:10.1901/jeab.1992.57-393
update_date: 1992-05-01 00:00:00