Introduction
The feature that separates psychology from a field such as philosophy is its reliance on the empirical method for its truths. Instead of arguing deductively from premises to conclusions, psychology progresses by inductive reasoning: psychological propositions are formulated as hypotheses that can be tested by experiments, and the outcome of the experiment determines whether the hypothesis is accepted or rejected. The best test of a hypothesis is therefore one whose outcome can be interpreted unambiguously. True experiments are considered the best way to test hypotheses because they are the best way to rule out plausible alternative explanations (confounds) to the experimental hypothesis. True experiments are studies in which the experimental units (usually people) are randomly assigned to levels of the independent variable, the variable whose effect the experimenter wants to understand; the researcher then observes the effect of the independent variable on the outcome measure, the dependent variable.
For example, to study the effects of sugar on hyperactivity in children, an experimenter might ask, “Does sugar cause hyperactive behavior?” In a true experiment, one would randomly assign half the children in a group to receive a soft drink sweetened with sugar and the other half a soft drink sweetened with a sugar substitute. One could then measure each child’s activity level; if the children assigned the sugar-sweetened drinks showed more hyperactivity than the children who received the other drinks, one could confidently conclude that sugar caused the hyperactivity. A second type of study, a correlational study, would investigate the same hypothesis by simply asking or observing which children selected sugar-sweetened drinks and then comparing their behavior with that of the children who selected the other drinks. The correlational study, however, could not show whether sugar actually caused hyperactivity: it would be equally plausible that children who are hyperactive simply prefer sugar-sweetened drinks. Such correlational studies have a major validity weakness in that they do not control for plausible rival alternative hypotheses, that is, hypotheses that differ from the experimenter’s preferred hypothesis and offer another reasonable explanation for the results. Quasi-experimental designs stand between true experiments and correlational studies: they lack random assignment but control for as many threats to validity as possible.
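The contrast between the two kinds of study can be made concrete with a small simulation. The sketch below is illustrative only. It assumes a hypothetical population in which sugar has no effect on activity at all, but more active children are more likely to choose the sugar-sweetened drink. The numbers (a mean activity score of 50, choice probabilities of 0.8 and 0.2) and the function name simulate are invented for the illustration and do not come from any actual study.

```python
import random
from statistics import mean


def simulate(n_children=10000, seed=0):
    """Compare a self-selected (correlational) study with a randomized one.

    Assumption for illustration: sugar has NO causal effect on activity.
    """
    rng = random.Random(seed)

    # Each child's activity level is a stable trait, unaffected by the drink.
    activity = [rng.gauss(50, 10) for _ in range(n_children)]

    # Correlational study: children choose their own drink, and the more
    # active children are more likely to pick the sugar-sweetened one.
    chose_sugar = [rng.random() < (0.8 if a > 50 else 0.2) for a in activity]

    # True experiment: a coin flip decides the drink, ignoring the child's traits.
    assigned_sugar = [rng.random() < 0.5 for _ in activity]

    def gap(got_sugar):
        # Mean activity of the sugar group minus mean activity of the other group.
        sugar = [a for a, s in zip(activity, got_sugar) if s]
        other = [a for a, s in zip(activity, got_sugar) if not s]
        return mean(sugar) - mean(other)

    return gap(chose_sugar), gap(assigned_sugar)


correlational_gap, experimental_gap = simulate()
print(f"self-selected drinks: gap = {correlational_gap:.1f}")
print(f"randomly assigned:    gap = {experimental_gap:.1f}")
```

Under these assumptions the self-selected comparison shows a sizable gap between the drink groups even though sugar does nothing, while the randomly assigned comparison shows a gap near zero. Random assignment breaks the link between the child's existing traits and the drink received, which is what rules out the rival hypothesis.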
Plausible Alternative Explanations
Experimental and Quasi-Experimental Designs for Research (1966), by Donald T. Campbell and Julian Stanley, describes the major threats to validity that need to be controlled for so that the effect of the independent variable can be correctly tested. The plausible alternative explanations described below are the major threats to internal validity, the extent to which an observed effect can be attributed to the independent variable. (“Controlled” does not mean that the threat is not a problem; it means only that the investigator can judge how probable it is that the threat actually influenced the results.)
An external environmental event may occur between the beginning and end of the study, and this historical factor, rather than the treatment, may be the cause of any observed difference. For example, highway fatalities decreased in 1973 after an oil embargo led to the establishment of a speed limit of 55 miles per hour. Some people believed that the cause of the decreased fatalities was the 55-mile-per-hour limit. If the oil embargo caused people to drive less because they could not get gasoline or because it was higher priced, however, either of those events could be a plausible alternative explanation. The number of fatalities may have declined simply because people were driving less, not because of the speed-limit change.
Maturation occurs when natural changes within people cause differences between the beginning and end of the study. Even over short periods of time, for example, people become tired, hungry, or bored. It may be these changes rather than the treatment that cause the observed differences. An investigation of a treatment for sprained ankles measured the amount of pain people had when they first arrived for treatment and then measured their pain again four weeks after treatment. Finding a reduction in reported pain, the investigator concluded that the treatment was effective. Since a sprained ankle will probably improve naturally within four weeks, however, maturation (in this case, the natural healing process) is a plausible alternative explanation.
Testing is a problem when the process of measurement itself leads to systematic changes in measured performance. A study was done on the effects of a preparatory course on performance on the American College Test (ACT), a college entrance exam. Students were given the ACT, then given a course on improving their scores, then tested again; they achieved higher scores, on the average, the second time they took the test. The investigator attributed the improvement to the prep course, when actually it may have been simply the practice of taking the first test that led to improvement. This plausible alternative explanation suggests that even if the students had not taken the course, they would have improved their scores on the average on retaking the ACT. The presence of a control group (a group assembled to provide a comparison to the treatment group results) would improve this study.
A change in the instruments used to measure the dependent variable will also cause problems. This is a problem particularly when human observers are rating behaviors directly. The observers may tire, or their standards may shift over the course of the study. For example, if observers are rating children’s “hyperactivity,” they may see later behavior as more hyperactive than earlier behavior not because the children’s behavior has changed but because, through observing the children play, the observers’ own standards have shifted. Objective measurement is crucial for controlling this threat.
Selection presents a problem when the results are caused by a bias in the choice of subjects for each group. For example, a study of two programs designed to stop cigarette smoking assigned smokers who had been addicted for ten years to program A and smokers who had been addicted for two years to program B. It was found that 50 percent of the program B people quit and 30 percent of the program A people quit. The investigators concluded that program B is more effective; however, it may be that people in program B were more successful simply because they were not as addicted to their habit as the program A participants.
Mortality, or attrition, is a problem when a differential dropout rate influences the results. For example, in the preceding cigarette study, it might be that of one hundred participants in program A, ninety of them sent back their post-test form at the end of the study; for program B, only sixty of the participants sent their forms back. It may be that people who did not send their forms back were more likely to have continued smoking, causing the apparent difference in results between programs A and B.
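The arithmetic behind this threat can be worked through directly. The sketch below makes several assumptions that the example leaves open: that program B also enrolled one hundred participants, that the 30 percent and 50 percent quit rates were computed only from the forms that came back, and, as a deliberately pessimistic bound, that every participant who did not return a form was still smoking. The function name quit_rates is hypothetical.

```python
def quit_rates(enrolled, returned, quit_among_returned):
    """Return (observed rate, worst-case rate) for one program.

    observed:   quitters divided by forms returned
    worst case: quitters divided by everyone enrolled, assuming that
                all non-responders continued smoking
    """
    observed = quit_among_returned / returned
    worst_case = quit_among_returned / enrolled
    return observed, worst_case


# Program A: 90 of 100 forms returned; 30 percent of those 90 is 27 quitters.
a_observed, a_worst = quit_rates(enrolled=100, returned=90, quit_among_returned=27)

# Program B: 60 of 100 forms returned; 50 percent of those 60 is 30 quitters.
b_observed, b_worst = quit_rates(enrolled=100, returned=60, quit_among_returned=30)

observed_gap = b_observed - a_observed
worst_case_gap = b_worst - a_worst

print(f"observed:   A = {a_observed:.0%}, B = {b_observed:.0%}, gap = {observed_gap:.0%}")
print(f"worst case: A = {a_worst:.0%}, B = {b_worst:.0%}, gap = {worst_case_gap:.0%}")
```

Under these assumptions the apparent 20-point advantage of program B shrinks to 3 points (27 versus 30 quitters per hundred enrolled), so most of the observed difference could be produced by differential dropout alone.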
When subjects become aware that they are in a study, and awareness of being observed influences their reactions, reactivity has occurred. The famous Hawthorne studies on a wiring room at a Western Electric plant were influenced by this phenomenon. The investigators intended to do a study on the effects of lighting on work productivity, but they were puzzled by the fact that any change they made in lighting—increasing it or decreasing it—led to improved productivity. They finally decided it was the workers’ awareness of being in an experiment that caused their reactions, not the lighting level.
Statistical regression is a problem that occurs when subjects are selected to be in a group on the basis of their extreme scores (either high or low) on a test. Because an extreme score partly reflects chance factors that are unlikely to recur, the group can be predicted to move toward the average the next time its members take the test, even if the treatment has had no effect. For example, if low-scoring students are assigned to tutoring because of the low scores they achieved on a pretest, they will tend, on average, to score higher on the second test (a post-test), even if the tutoring is ineffective.
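A short simulation shows why this happens. The sketch below is a hypothetical model, not data from any study: each student's test score is assumed to be a stable ability plus independent random noise on each testing occasion, and no tutoring effect is included at all. The distribution parameters and the function name regression_demo are assumptions made for the illustration.

```python
import random
from statistics import mean


def regression_demo(n_students=10000, seed=1):
    """Select the lowest-scoring tenth on a pretest and retest them untreated."""
    rng = random.Random(seed)

    # Assumed model: score = stable ability + occasion-specific noise.
    ability = [rng.gauss(70, 10) for _ in range(n_students)]
    pretest = [a + rng.gauss(0, 10) for a in ability]
    # The post-test uses fresh noise and NO treatment effect.
    posttest = [a + rng.gauss(0, 10) for a in ability]

    # "Assign to tutoring" the students in the bottom 10 percent of the pretest.
    cutoff = sorted(pretest)[n_students // 10]
    selected = [i for i, score in enumerate(pretest) if score < cutoff]

    selected_pre = mean(pretest[i] for i in selected)
    selected_post = mean(posttest[i] for i in selected)
    return selected_pre, selected_post, mean(pretest), mean(posttest)


selected_pre, selected_post, overall_pre, overall_post = regression_demo()
print(f"selected group: pretest {selected_pre:.1f}, post-test {selected_post:.1f}")
print(f"all students:   pretest {overall_pre:.1f}, post-test {overall_post:.1f}")
```

In this model the selected group's average rises substantially on the post-test although nothing was done to them, while the average for all students barely changes. The students were chosen partly because of bad luck on the pretest, and that bad luck does not repeat.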
External Threats to Validity
External threats to validity constitute the other major validity issue. Generally speaking, true experiments control for internal threats to validity by experimental design, but external threats may be a problem for true experiments as well as quasi-experiments. Since a scientific finding is one that should hold true in different circumstances, external validity (the extent to which the results of any particular study can be generalized to other settings, people, and times) is a very important issue.
An interaction between selection and treatment can cause an external validity problem. For example, since much of the medical research on the treatment of diseases has been performed by selecting only men as subjects, one might question whether those results can be generalized to women. The interaction between setting and treatment can be a problem when settings differ greatly. For example, can results obtained on a college campus be generalized to a factory? Can results from a factory be generalized to an office? The interaction of history and treatment can be a problem when the specific time the experiment is carried out influences people’s reaction to the treatment. The effectiveness of an advertisement for gun control might be judged differently if measured shortly after an assassination or a mass murder received extensive media coverage.
Examining Social Phenomena and Programs
Quasi-experimental designs have been most frequently used to examine the effects of social phenomena and social programs that cannot be or have not been investigated by experiments. For example, the effects of the public television show Sesame Street have been the subject of several quasi-experimental evaluations. One initial evaluation of Sesame Street concluded that it was ineffective in raising the academic abilities of poor children, but a reanalysis of the data suggested that statistical regression artifacts had contaminated the original evaluation and that Sesame Street had a more positive effect than was initially believed. This research showed the potential harm that can be done by reaching conclusions while not controlling for all the threats to validity. It also showed the value of doing true experiments whenever possible.
Many of the field-research studies carried out on the effects of violent television programming on children’s aggressiveness have used quasi-experimental designs to estimate the effects of violent television. Other social-policy studies have included the effects of no-fault divorce laws on divorce rates, of crackdowns on drunken driving on the frequency of drunken driving, and of strict speed-law enforcement on speeding behavior and accidents. The study of the effects of speed-law enforcement represents excellent use of the “interrupted time series” quasi-experimental design. This design can be used when a series of pretest and post-test data points is available. In this case, the governor of Connecticut abruptly announced that people convicted of speeding would have their licenses suspended for a thirty-day period on the first offense, sixty days on a second offense, and longer for any further offenses. By comparing the number of motorists speeding, the number of accidents, and the number of fatalities during the period before the crackdown with the period after the crackdown, the investigators could judge how effective the crackdown was. The interrupted time series design provides control over many of the plausible rival alternative hypotheses and is thus a strong quasi-experimental design. The investigators concluded that it was probable that the crackdown did have a somewhat positive effect in reducing fatalities, but that a regression artifact may also have influenced the results. The regression artifact in this study would be a decrease in fatalities simply because there was such a high rate of fatalities before the crackdown.
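One common way to analyze an interrupted time series is to fit the trend in the pre-intervention points, project it into the post-intervention period, and compare the projection with what was actually observed. The sketch below illustrates that idea only; it is not the analysis used in the Connecticut study, and the yearly fatality counts are invented. The function names fit_line and level_change are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def level_change(series, interruption_index):
    """Average gap between observed post-period values and the values
    projected from the pre-period trend (negative means a drop)."""
    pre_x = list(range(interruption_index))
    slope, intercept = fit_line(pre_x, series[:interruption_index])

    post_x = range(interruption_index, len(series))
    projected = [intercept + slope * x for x in post_x]
    observed = series[interruption_index:]
    return sum(o - p for o, p in zip(observed, projected)) / len(observed)


# Invented yearly fatality counts: five years before the crackdown, three after.
fatalities = [300, 296, 292, 288, 284, 260, 256, 252]
effect = level_change(fatalities, interruption_index=5)
print(f"estimated change in level: {effect:.1f} fatalities per year")
```

With these invented numbers the pre-period trend is a steady decline of 4 per year, and the post-period values fall 20 below the projected trend. Having several pretest points is what allows the existing trend to be separated from the change at the interruption. If the last pre-period value were an unusual spike, a simple comparison with that single year would overstate the effect, which is the regression artifact described above.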
Use in Organizational Psychology
Organizational psychology has used quasi-experimental designs to study such issues as the effects of strategies to reduce absenteeism in businesses, of union-labor cooperation on grievance rates, and of different forms of employee ownership on job attitudes and organizational performance. The last study compared three different conversions to employee ownership and found that employee ownership had positive effects on a company to the extent that it enhanced participative decision making and led to group work norms supportive of higher productivity. Quasi-experimental studies are particularly useful in those circumstances where it is impossible to carry out true experiments but policymakers still want to reach causal conclusions. A strong knowledge of quasi-experimental design principles helps prevent incorrect causal conclusions.
Research Approach
Psychology has progressed through the use of experiments to establish a base of facts that support psychological theories; however, there are many issues about which psychologists need to have expert knowledge that cannot be investigated by performing experiments. There are few social situations, outside a university laboratory, where a psychologist can randomly assign individuals to different treatments. For example, psychologists cannot dictate to parents which television programs their children must watch, they cannot tell the managers of a group of companies how to implement an employee stock option plan, and they cannot make a school superintendent randomly assign different classes to different instructional approaches. All these factors in the social environment vary naturally, however, and quasi-experimental designs can be used to extract as much knowledge as possible from that natural variation.
The philosophy of science associated with traditional experimental psychology argues that unless a true experiment is done it is impossible to reach any causal conclusion. The quasi-experimental view argues that a study is valid unless and until it is shown to be invalid. What is important in a study is the extent to which plausible alternative explanations can be ruled out. If there are no plausible alternative explanations to the results except for the experimenter’s research hypothesis, then the experimenter’s research hypothesis can be regarded as true.
Evolution of Practice
The first generally circulated book that argued for a quasi-experimental approach to social decision making was William A. McCall’s How to Experiment in Education, published in 1923. Education has been one of the areas where there has been an interest in and willingness to carry out quasi-experimental studies. Psychology was more influenced by the strictly experimental work of Ronald A. Fisher that was being published at around that time, and Fisher’s ideas on true experiments dominated psychological methods through the mid-1950s.
The quasi-experimental view gained increasing popularity during the 1960s as psychology was challenged to become more socially relevant and make a contribution to understanding the larger society. At that time, the federal government was also engaged in many social programs, sometimes collectively called the War on Poverty, which included housing programs, welfare programs, and compensatory educational programs. Evaluation of these programs was needed so that what worked could be retained and what failed could be discontinued. There was an initial burst of enthusiasm for quasi-experimental studies, but the ambiguous results that they produced were discouraging, and this led many leading methodologists to re-emphasize the value of true experiments.
Rather than hold up the university-based laboratory true experiment as a model, however, they called for implementing social programs and other evaluations using random assignment to treatments in such a way that stronger causal conclusions could be reached. The usefulness of true experiments and quasi-experiments was also seen to be much more dependent on psychological theory: the pattern of results obtained by many different types of studies became a key factor in the progress of psychological knowledge. The traditional laboratory experiment, on which many psychological theories are based, was recognized as being very limited in external validity, and the value of true experiments—carried out in different settings, with different types of people, and replicated many times over—was emphasized. Since politicians, business managers, and other social policymakers have not yet appreciated the advantages in knowledge to be gained by adopting a true experiment approach to social innovation, quasi-experimental designs are still an important and valuable tool in understanding human behavior.
Bibliography
Abbott, Martin, and Jennifer McKinney. Understanding and Applying Research Design. Hoboken: Wiley, 2013. Digital file.
Campbell, Donald Thomas, and Julian C. Stanley. Experimental and Quasi-Experimental Designs for Research. Chicago: Rand McNally, 1966. Reprint. Belmont: Wadsworth, 2011. Print.
Cook, Thomas D., and Donald T. Campbell. Quasi-Experimentation: Design and Analysis Issues for Field Settings. Chicago: Rand McNally, 1979. Print.
Cronbach, Lee J. Designing Evaluations of Educational and Social Programs. San Francisco: Jossey-Bass, 1987. Print.
Kerlinger, Fred N., and Howard B. Lee. Foundations of Behavioral Research. 4th ed. Belmont: Wadsworth, 2000. Print.
Maruyama, Geoffrey. Research Methods in Social Relations. [N.p.]: Wiley, 2014. Digital file.
Thyer, Bruce A. Quasi-Experimental Research Designs. New York: Oxford UP, 2012. Digital file.
Trochim, William M. K., ed. Advances in Quasi-Experimental Design and Analysis. San Francisco: Jossey-Bass, 1986. Print.