Bibliography: High Stakes Testing (page 94 of 95)

This annotated bibliography is reformatted and customized by the Center for Positive Practices.  Some of the authors featured on this page include N. James Myerberg, William A. Mehrens, Patty Molloy, Margarita Calderon, Marilyn Page, Keith Hollenbeck, Samuel J. Meisels, Judith I. Anderson, David C. Potter, and Patricia Almond.

Nash, John B.; Calderon, Margarita (1994). Principals' Perceptions of Community in Low Performing Campuses in Minority Settings. Using its Academic Excellence Indicator System, Texas has labeled some schools as low performers, giving them a "Priority One" rating for improvement. Principals of low-performing campuses were interviewed to assess the condition of these campuses as communities of learners working to remove the Priority One label. All of the 6 subject schools have 93 to 97 percent Hispanic American enrollment, and all are under heavy state and public scrutiny. Interviews with principals suggest that these schools have succumbed to ineffective staff development plans, and that, although lip service is given to site-based decision making, decision making usually does not seem linked to a coherent and focused plan. Overall, principals know what they would like to do to improve the schools, but are restricted by a crisis mentality typical of situations where high-stakes testing has an impact on local educational policy. Ultimately, responsibility for school change rests with the school community. Pedagogically based community, rather than crisis mentality, will result in successful implementation of change. Two tables summarize findings. (Contains 10 references.) Descriptors: Administrator Attitudes, Community, Decision Making, Disadvantaged Youth

Kuehn, Phyllis A.; And Others (1989). Court-defined Job Analysis Standards in Content Validation. Legal employment test precedent cited by courts and employment-related law cited by plaintiffs during teacher certification test (TCT) decisions are discussed to determine their pertinence to test content validity issues. The two main documents involved in such litigation are the "Uniform Guidelines on Employee Selection Procedures" (1978) and the "Questions and Answers to Clarify and Provide a Common Interpretation of the Uniform Guidelines on Employee Selection Procedures" (1979). Case law pertinent to Title VII of the Civil Rights Act of 1964, which is explicated by the "Guidelines" and "Questions", and to the 14th Amendment to the U.S. Constitution is cited and discussed. Two case reviews indicated that there was an increased emphasis on the conformity of job analyses to professional standards. Guidelines are outlined. As TCT precedent is set, it appears likely that the adequacy of the job analyses that are used to define the content domain to be tested will be scrutinized using criteria defined in the broader employment test setting. Validation issues in high-stakes testing are likely to remain a public spectacle, but codification within the courts will proceed in any event. Descriptors: Constitutional Law, Content Validity, Court Doctrine, Court Litigation

Potter, David C.; Wall, Mary Ellen (1992). Higher Standards for Grade Promotion and Graduation: Unintended Effects of Reform. Findings of a study that examined the effect of the reforms mandated by South Carolina's Educational Improvement Act of 1984 on student outcomes are presented in this paper. Specifically, the study sought to determine the impact of higher standards for grade promotion and graduation on retention rates, the proportions of students overage for their grade, different demographic groups, and student achievement between the years 1985-86 and 1989-90. Methodology involved analysis of statewide testing and demographic data and school policy reports submitted to the state department of education. Findings indicate that the stricter requirements created a high stakes testing environment. The data suggest modest gains in achievement but no improvement in the dropout rate and school holding power. In addition, student retention in grade increased, with differential effects on students with different demographic characteristics. A conclusion is that despite the modest improvement in achievement, the higher standards have had deleterious effects for some groups, particularly nonwhite males. Four figures and eight tables are included. (Contains 11 references.) Descriptors: Academic Achievement, Educational Discrimination, Educational Improvement, Elementary Secondary Education

Ward, Martha S. (1996). Policies and Standards, Their Role and Revision: The Case of Ethics in Testing in North Carolina. High stakes testing has been in place in North Carolina since the late 1970s with highly visible, nationally norm-referenced tests administered to all students in several grades, as well as minimum competency testing as part of high school graduation requirements. A current back-to-basics movement has resulted in cuts to the testing program. A new plan for educational improvement will focus on schools rather than school systems, and will designate rewards for schools with solid performance and exemplary growth and interventions for schools that lag. The first code of ethics for North Carolina testing personnel was published in 1988 to help ensure the integrity of test results and a "level playing field" for all schools. In 1995 a committee was convened to begin to revise the "Testing Code of Ethics" to reflect new programs and new technological approaches to testing. Attachments include the current code of ethics, a model local school board policy statement, and a draft of the new code. (Contains five references.) Descriptors: Academic Achievement, Achievement Gains, Codes of Ethics, Educational Policy

Hollenbeck, Keith; Tindal, Gerald; Almond, Patricia (1998). Teachers' Knowledge of Accommodations as a Validity Issue in High-Stakes Testing, Journal of Special Education. A survey of 166 regular and special-education teachers concerning allowable accommodations on statewide assessment tests found that only 21% reported using allowed accommodations. Teachers' knowledge of allowable accommodations was low, which suggests that some students are unnecessarily exempted from participation. Results support preservice and inservice training in allowable accommodations. Descriptors: Disabilities, Elementary School Teachers, High Stakes Tests, Higher Education

Shepard, Lorrie A. (1989). Inflated Test Score Gains: Is It Old Norms or Teaching the Test? Effects of Testing Project. Final Deliverable–March 1989. It is increasingly recognized, following the lead of J. J. Cannell, that actual gains in educational achievement may be much more modest than dramatic gains reported by many state assessments and many test publishers. An overview is presented of explanations of spurious test score gains. Focus is on determining how test-curriculum alignment and teaching the test influence the meaning of scores. Findings of a survey of state testing directors are summarized, and the question of teaching the test is examined. Some frequently presented explanations refer to norms used; others refer to aspects of teaching the test. Directors of testing from 46 states (four states conduct no state testing) replied to a survey about testing. Forty states clearly had high-stakes testing. The most pervasive source of high-stakes pressure identified by respondents was media coverage. Responses indicate that test-curriculum alignment and teaching the test are distorting instruction. A possible solution is to develop new tests every year, changing the tests rather than the norms. Two tables present explanations for test score inflation and selected survey responses. Descriptors: Academic Achievement, Achievement Gains, Elementary Secondary Education, Grade Inflation

Mehrens, William A. (1991). Defensible/Indefensible Instructional Preparation for High Stakes Achievement Tests: An Exploratory Trialogue. Issues involved in high stakes testing are reviewed, with emphasis on the proper role of instructional preparation. The recent focus on educational accountability has increased pressure to raise test scores. One way of improving test scores is to teach what is on the test. The following guidelines concerning appropriate instructional strategies are presented: (1) a teacher should not engage in instruction that attenuates the ability to infer from the test score to the domain of knowledge/skill/ability of interest; (2) it is appropriate to teach the content domain to which the user wishes to infer; (3) it is appropriate to teach test-taking skills; (4) it is inappropriate to limit content instruction to a particular test item format; (5) it is inappropriate to teach only objectives from the domain that are sampled on the test; (6) it is inappropriate to use an instructional guide that reviews the questions of the latest issue of the test; (7) it is inappropriate to limit instruction to the actual test questions; (8) it is appropriate to teach toward test objectives if the test objectives comprise the domain objectives; (9) it is appropriate to ensure that students understand the test vocabulary; and (10) one cannot teach only the specific task of a performance assessment. Grey areas and tangential issues in test preparation are discussed. Descriptors: Achievement Tests, Elementary Secondary Education, Guidelines, High Stakes Tests

Myerberg, N. James (1996). Performance on Different Test Types by Racial/Ethnic Group and Gender. As is consistent with national trends, the Montgomery County (Maryland) Public School System is exploring the use of instruments other than multiple-choice tests for high-stakes testing. This paper presents information on racial, ethnic, and gender differences in performance on the various types of tests being administered in the district. Sharing such information among school systems will help in the evaluation of new types of assessment. The six assessments used in the study were: (1) a mathematics multiple choice test given to grades 3 to 8; (2) a mathematics short answer test for grades 3 to 8; (3) a locally developed mathematics extended answer test for grades 4, 6, and 7; (4) a reading multiple choice test for grades 3 to 8; (5) a language arts extended answer test for grades 4, 6, and 7; and (6) the Maryland School Performance Assessment program for grades 3, 5, and 8. There were no meaningful differences in mathematics performance by racial and ethnic group across the different types of test studied. Nonmultiple-choice reading and language arts assessments favored nonwhite students. Nonmultiple-choice tests, whether in mathematics or language arts and reading, favored females over males. The largest difference between students on reduced-cost or free meals and others was in reading and language arts, where lower income students had substantially larger gains when moving from multiple-choice to nonmultiple-choice assessments. (Contains one reference and three tables.) Descriptors: Achievement Tests, Constructed Response, Educational Assessment, Elementary Education

Page, Marilyn; Simpson, Marilyn; Molloy, Patty (2001). Seashore Teacher Professional Certification Pilot: The Clash of a Developmental Model of Teacher Performance Assessment with a High Stakes Testing Environment and the Impact of That Clash on Novice Teachers. The Seashore Teacher Professional Certification Program developed by Seattle University, Washington, in collaboration with three school districts, is being implemented in several phases. Phase 1 consisted of mentor support for new teachers, and Phase 2 established a site-based support program. Phase 3 consisted of graduate program coursework for teachers and site-based support. This report deals with the implementation of Phase 3. Phase 3 begins at the end of the second year of teaching and extends until the teacher meets the requirements of the state teaching certification standards and receives professional certification, which must be done by the fifth year of teaching unless the state grants an extension. The pilot program field-tested the certification process to inform the adoption and implementation of the new certification process. The first cohort consisted of 18 teachers who were to participate in 2 graduate courses developed for the program and introduced in the summer. Of these 18, 7 began Phase 3 questioning teaching as a good career choice. Four participants dropped out of the program, and one left. Among the reasons they cited was the amount of time the program required. The carryover of the program into the school year made many participants feel burdened, and the acceleration of the program timeline imposed by the state added to the pressure participants felt. Nine of the 13 who finished the program demonstrated professional growth during their participation in the pilot. Other program benefits included a positive impact on collaboration between the school districts and Seattle University and some positive impacts on Seattle University in its program development and implementation. The paper outlines recommendations for the improvement of the pilot program. These center on continuity of efforts and enhanced links between instruction and real-life educational practices. (Contains 7 figures and 39 charts introducing the presentation.) Descriptors: Beginning Teachers, Continuing Education, Curriculum Development, Elementary Secondary Education

Meisels, Samuel J.; And Others (1989). Testing, Tracking, and Retaining Young Children: An Analysis of Research and Social Policy. Many professionals are convinced that more testing, tracking, and retention are taking place in the early school years than ever before. They also believe that developmentally inappropriate modifications to curricula are being implemented. As a result of inappropriate use of standardized tests, disproportionate numbers of poor and minority children have been retained or placed in extra-year programs. This paper explores these issues and makes recommendations concerning uses of assessment data and alternatives to conventional testing practices. The report also discusses the large number of unready, at-risk children entering kindergarten. Sections of the text focus on: (1) issues and background on testing, tracking, and retention; (2) high stakes testing, i.e., the use of tests to make important decisions that immediately and directly affect those tested; (3) a rational perspective on tests and testing; (4) ways in which schools, teachers, and tests are failing minority children; (5) a rationale for testing young children, guidelines for deciding to use particular kinds of tests, characteristics of norm-referenced and criterion-referenced instruments, and criteria for selecting developmental screening instruments and school readiness tests; and (6) needed research about alternatives to standardized testing. Ninety-seven references are included. Descriptors: Criteria, Criterion Referenced Tests, Early Childhood Education, Educational Practices

Jenkins, Jerry A. (1993). Can Quality Program Evaluation Really Take Place in Schools?. High-stakes assessments are those in which the results of tests or other measures can lead to decisions that may affect school administrators, teachers, and students substantially. Whether high-stakes assessment results in misleading information due to extraneous factors associated with the conditions under which the assessment occurs is explored. Among the major problems associated with high-stakes assessment is the lack of adequate training for teachers and administrators with regard to measurement issues and testing. In addition, high-stakes tests can lead to student anxiety or poor student motivation. Some assessments may not be chosen carefully, and tests may be given at inappropriate times. Teachers and administrators may focus only on scores, rather than on learning. Some solutions for the adverse effects of high-stakes testing are: (1) better teacher education in measurement concerns; (2) a reduction of the link between student achievement measures and teacher evaluation; (3) new approaches to assessment; (4) the use of multiple measures of student achievement; and (5) the promotion of student attitudes that allow them to demonstrate their educational growth. Descriptors: Academic Achievement, Educational Assessment, Elementary Secondary Education, Evaluation Methods

Sutton, Rosemary E. (1993). Equity Issues in High Stakes Computerized Testing. This paper discusses equity issues that may arise from the widespread use of high stakes computerized testing. The literature relevant to computerized testing is examined from the perspective of equity concerns from within the framework of research and from the perspective of possible uses of computerized testing if equity issues are considered paramount. In the first instance, the question concerns whether the use of computerized testing will maintain or exaggerate inequities in education. In the second section, the question is how the new technology can be used to reduce inequities. The following six topics are considered in examining whether computerized testing will maintain or exaggerate inequities: (1) equivalence; (2) prior experience; (3) setting of computers; (4) long-term attitudes toward computerized testing; (5) expectancies and adaptive testing; and (6) testwiseness and adaptive testing. Issues in the equity advocate approach (the role of computerized testing in reducing inequity) include time factors, guessing and adaptive tests, and the format and style of questions. The history of computer use in schools and testing suggests that inequities will be maintained or exaggerated, but it is possible to use computerized testing to reduce inequities if that is a recognized goal. (Contains 58 references.) Descriptors: Access to Education, Adaptive Testing, Computer Assisted Testing, Educational History

Shepard, Lorrie A. (1992). Chapter 1's Part in the Juggernaut of Standardized Testing. The place for standardized testing in Chapter 1 evaluation is discussed. There is substantial evidence available on the negative effects of high-stakes standardized testing, and there is a clear link between Chapter 1 requirements and the amount of testing in most school districts. Standardized testing is usually used to identify eligible students, evaluate the Chapter 1 program, and hold individual schools accountable. It is argued that each of these purposes can be served better by other means. Alternative assessments are needed for Chapter 1 use, but any such assessments must be removed from the tyranny of normal curve equivalent gains. Any system that is devised should be subjected to its own cost-benefit evaluation to determine the costs and side effects of program improvement monitoring. Five overhead projection figures used in the presentation are included. Descriptors: Accountability, Alternative Assessment, Compensatory Education, Cost Effectiveness

Anderson, Judith I. (1991). Using the Norm-Referenced Model To Evaluate Chapter 1. In response to growing frustration over the lack of information about the national effectiveness of the Chapter 1 program, Congress enacted the Education Amendments of 1974. Section 151 of the Amendments directed the U.S. Office of Education to develop evaluation models that would allow school district data to be aggregated to provide national estimates of program effectiveness. The norm-referenced model was the most easily applied of the alternatives developed. This model substitutes test norms for a traditional comparison group. Posttest standing relative to the norm group is compared with pretest standing relative to the norm group. The 1975 document, "A Practical Guide to Measuring Project Impact on Student Achievement," specified the conditions in which the norm-referenced model could be used. Several difficulties have arisen in implementing the models, but school districts today are still required to evaluate their Chapter 1 projects. Requirements enacted in 1988 mean that districts essentially must use nationally normed tests or tests equated to nationally normed tests to measure student achievement in both basic and more advanced skills. Test norms and norm-referenced tests are reviewed, with attention to measurement error, the effects of high-stakes testing, and the relevance of national norms as a comparison group. Ways in which information on program effectiveness could be better provided are discussed. Five figures and one table illustrate the discussion. Descriptors: Comparative Analysis, Compensatory Education, Elementary Secondary Education, Equated Scores

Shepard, Lorrie A. (1992). Will National Tests Improve Student Learning?. Claims that national tests will improve student learning are explored, asking whether national examinations will ensure high-quality instruction and greater student learning and whether tests developed to meet urgent political deadlines will retain essential features of authentic curriculum-driven assessments. Part I presents research evidence on the negative effects of standardized testing, such as the effects of high stakes testing on scores, the curriculum, and instruction. The National Education Goals Panel's (NEGP's) version of national examinations is presented in Part II, with attention to their proposals intended to forestall the negative effects of traditional tests. Part III identifies curricular and technical problems that must be resolved before the NEGP's vision can be achieved. These include: (1) development of world class rather than lowest common denominator standards; (2) development of incorruptible performance tasks; (3) teacher training in curriculum and instruction; (4) high standards for all students without reinstitution of tracking; and (5) cost. If tests are developed before these problems are resolved, new tests are likely to have the same pernicious effects as the old. There is a 32-item list of references. Descriptors: Advisory Committees, Cost Effectiveness, Educational Improvement, Elementary Secondary Education
