As part of the Joint Achievement Report presented to the School Boards on Nov. 9, School District 65 reported the percentage of third- through eighth-graders who were on track to college readiness in reading. For purposes of the report, students who scored above college readiness benchmark scores identified for the Spring MAP test were deemed to be on track to college readiness. The scores used by the District were those identified in a study, “Measuring Growth Toward College Readiness: Using MAP Scores to Predict Success on the ACT Test Benchmark Scores,” presented by Robert Theaker and Clay S. Johnson at a meeting of the American Educational Research Association in 2012 (the 2012 Study). At the time of the study, Mr. Theaker was a senior research associate at the Northwest Evaluation Association (NWEA), the owner of the MAP test.

Significantly, District 65 decided not to use the college readiness benchmark scores identified in a more recent 2015 Study, titled “MAP College Readiness Benchmarks: A Research Brief” (June 29, 2015), an NWEA research report by Yeow Meng Thum and Tyler Matta. Dr. Thum received his Ph.D. from the University of Chicago, has taught advanced courses in educational statistics at UCLA and Michigan State University, is a senior research fellow at NWEA, and pursues research on multivariate multilevel models for behavioral and educational data, as well as other topics.

NWEA recommends that the college readiness benchmark scores for the MAP tests that are identified in the 2015 Study be used, rather than those identified in the 2012 Study. Dr. Thum told the RoundTable, “The 2015 study should be used. It employed more representative data and better analytical techniques that result in more valid and reliable results.”

Dr. Thum added that the 2012 Study “was not conducted by NWEA, nor was it commissioned by us.”

The 2012 Study’s Benchmark Scores Are Consistently Higher

The Joint Achievement Report says the MAP college readiness benchmark scores it decided to use were adopted through a two-step process. Initially, a “College Readiness LINKING STUDY” was published by NWEA in 2011. That study identified MAP scores for eighth-grade reading and math that corresponded to the college readiness benchmark scores for reading and math identified for the EXPLORE test given in eighth grade. The second step was the 2012 Study.

The 2012 Study identified benchmark scores for MAP that are predictive of whether students in grades 4 through 8 will achieve a score of 21 in reading and a 22 in math on the ACT, which is given to 11th and 12th graders in high school.

At the time the 2012 Study was done, the ACT’s college readiness benchmarks were 21 in reading and 22 in math. In 2013, ACT raised the benchmark score for reading to 22 and kept the benchmark score for math at 22. The ACT’s benchmark scores indicate that a student has a 50% chance of earning at least a B and a 75% chance of earning at least a C in related courses in the freshman year of college.

The 2015 Study identified benchmark scores on MAP that are predictive of whether students in grades 5 through 8 will achieve a score of 22 in reading and a 22 in math on the ACT in high school.

The college readiness benchmark scores identified in the 2012 Study are consistently higher than the benchmark scores identified in the 2015 Study. The benchmark scores identified in each study for the Spring MAP test are listed in Table 1 below, together with the percentile rank of the scores, as indicated in the 2015 NWEA MAP national norms study.

While the college readiness scores identified in the 2012 Study are only 3 to 5 points higher than those identified in the 2015 Study for reading, the typical annual growth for an average middle school student in reading ranges from 2.8 to 4.8 points per year, depending on the grade level. For math, the difference between the studies is 6 to 10 points; the typical annual growth for an average middle school student in math ranges from 4.6 to 7.7 points, depending on the grade level. See 2015 NWEA Measures of Academic Progress Normative Data.

The differences in raw scores and their corresponding percentile ranks are significant.

The 2012 Study’s Scores Are Higher Than the 2015 Study’s Scores Linked to a 24 on the ACT

The 2015 Study also identified benchmark scores on MAP that are predictive of whether students in grades 5 through 8 will achieve scores of 24 in reading and 24 in math on the ACT in high school, which are more stringent than the benchmarks of 22.

Significantly, the benchmark scores identified in the 2012 Study that are linked to scores of 21 in reading and 22 in math are the same as or higher than those identified in the 2015 Study that are linked to a 24 on the ACT. The scores are compared in Table 2 below.

Reasons and Explanations

The RoundTable asked Dr. Thum why the 2012 Study generated higher benchmark scores than the 2015 Study. He said the differences are due to the samples used and the methodology.

The 2012 Study matched the MAP scores of 28,000 3rd through 8th graders with their ACT scores in high school. It included only students who had both a MAP and an ACT score. The study says, “No attempt was made to rebalance the sample in order to simulate a state- or nationally-representative sample.”

The 2015 Study used a sample of 83,318 4th through 12th graders from 410 schools in 14 districts across the U.S. It used MAP results for grades 4-9, and, where available, ACT results in high school. About 50% of the students took the ACT and had ACT scores available.

The 2015 Study incorporated several features that distinguish it from the 2012 Study. First, it related MAP scores and ACT scores over an extended period and considered the entire score trajectory of every student, maximizing the information across time points.

Second, the 2015 Study recognized that not all students take the ACT, and found that students who had higher scores on MAP were more likely to take the ACT. Thus, the study says, “Scale relationships based only on the data of examinees who have taken a college entrance exam are likely to contain an element of selection bias that generally makes the relationship obtained for college entrance examinees unsuitable for predictive use among the entire student population.”

In other words, failing to adjust for selection bias will skew the results. The 2015 Study adjusted for selection bias; the 2012 Study did not.
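
To illustrate the kind of distortion the 2015 Study describes, here is a minimal simulation written in Python. Every number in it is hypothetical and chosen only for illustration; none of it comes from the 2012 Study, the 2015 Study, or District 65 data. It builds an artificial population in which higher-scoring students are more likely to take the ACT, then derives a MAP benchmark for predicting an ACT score of 22 twice: once from the whole population and once from ACT takers only. In this setup, the takers-only benchmark comes out several RIT points higher, the direction Dr. Thum describes.

```python
# Hypothetical illustration of selection bias in deriving a MAP benchmark.
# All numbers are invented for this sketch; they are not from either study
# or from District 65 data.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Simulated 8th-grade MAP (RIT) scores and ACT scores linked to them.
map_rit = rng.normal(220, 15, n)
act = 22 + 0.45 * (map_rit - 230) + rng.normal(0, 3, n)
college_ready = act >= 22                    # meets the ACT benchmark of 22

# Assumed selection mechanism: higher-scoring students are more likely to take the ACT.
p_take = 1 / (1 + np.exp(-(map_rit - 225) / 8))
took_act = rng.random(n) < p_take

def benchmark(scores, ready):
    """MAP cut score maximizing true-positive rate minus false-positive rate."""
    cuts = np.arange(200, 260)
    j = [np.mean(scores[ready] >= c) - np.mean(scores[~ready] >= c) for c in cuts]
    return cuts[int(np.argmax(j))]

print("Benchmark from the full population:", benchmark(map_rit, college_ready))
print("Benchmark from ACT takers only:    ",
      benchmark(map_rit[took_act], college_ready[took_act]))
```

The gap appears because the takers-only sample has lost most of its low scorers, so a higher cut score is needed to separate the students who reach a 22 from those who fall just short of it.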

Dr. Thum told the RoundTable, “The 2012 research was not conducted by NWEA, nor was it commissioned by us. In it, the researchers used a sample of students who have taken both MAP and the ACT. Because these tend to be higher performing students with clear college aspirations, benchmarks based on their MAP results are naturally higher but they may not be relevant to the typical student. There are many students who may be college ready but who do not take the ACT. The 2015 study, conducted by NWEA, successfully mitigated such selection bias.

“For both distal (i.e. high school) ACT college readiness benchmark scores (of 22 and 24), the adjustments result in lower RIT benchmark scores for college readiness. That is, more students are college ready as a result of the improved data and methods employed in the 2015 study,” said Dr. Thum.

District 65 focused on the 2015 Study’s adjustments to mitigate selection bias as a reason for not using it. The Joint Achievement Report says, “A key difference in the design of [the 2015] study is that it controls for students’ likelihood to take the ACT rather than assuming that likelihood to take the ACT is random. Because the population of ACT test takers is not representative of the national student population, this approach has the potential to skew the estimates. The study attempts to control for this issue with statistical techniques (Hedeker & Gibbons, 1997). However, the possibility of estimates that are not based on a diverse population is especially important to pay attention to in the Evanston/Skokie context.”

The RoundTable asked Dr. Thum to comment on this. He said, “This is an accurate reading of a key contribution in the 2015 study. Apart from the 2012 study not being a ‘representative’ sample, the selection decisions made there could have produced a certain kind of significant bias. Additionally, longitudinal information about students was not considered, which is likely to lead to less precise estimates.”

Regarding the report’s last line (“However, the possibility of estimates that are not based on a diverse population is especially important to pay attention to in the Evanston/Skokie context.”), Dr. Thum said, “College Readiness benchmarks should always be based on the full diverse population of students. Estimates based on a more limited selection (such as those likely to be more affluent and/or from families with higher levels of education) will be skewed accordingly. Those estimates will generally trend higher and will not be relevant to many students who may be college-ready based on early MAP scores but do not, for a number of reasons, take the ACT or the SAT. Use of the 2015 MAP CRBs will generally result in higher proportions of students accurately identified as college-ready.”

The 2015 Study identified scores on the MAP test that link to ACT’s college readiness benchmarks by using a ROC curve analysis to locate a MAP score that balances a high “true positive rate” (a high proportion of students correctly identified as college ready) against a low “false positive rate” (a low proportion of students incorrectly identified as college ready). The area under the ROC curve is “relatively high” (e.g., over 0.9 in eighth grade, where 1.0 is a perfect prediction). “On the whole, use of the benchmarks leads to highly accurate predictions,” the study says.
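
The ROC curve analysis the study mentions can be sketched briefly. The example below is a hypothetical illustration, not the 2015 Study’s actual code or data: it assumes a matched table of eighth-grade MAP scores and an indicator for whether each student later met the ACT benchmark of 22, and uses scikit-learn’s roc_curve to show how each candidate cut score trades a true positive rate against a false positive rate.

```python
# Sketch of a ROC-based benchmark search on made-up data; not the study's procedure.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(1)
map_scores = rng.normal(220, 15, 10_000)          # simulated 8th-grade MAP scores
# Simulated indicator for later meeting the ACT benchmark of 22.
met_act_benchmark = rng.random(10_000) < 1 / (1 + np.exp(-(map_scores - 228) / 6))

# Each candidate MAP cut score yields a true-positive rate and a false-positive rate.
fpr, tpr, cuts = roc_curve(met_act_benchmark, map_scores)

# Area under the ROC curve: 1.0 is perfect prediction; the 2015 Study reports
# values over 0.9 for eighth grade.
print("AUC:", round(roc_auc_score(met_act_benchmark, map_scores), 3))

# One common way to balance the two rates is to maximize TPR minus FPR.
best = int(np.argmax(tpr - fpr))
print("Candidate benchmark:", round(float(cuts[best]), 1),
      "TPR:", round(float(tpr[best]), 2), "FPR:", round(float(fpr[best]), 2))
```

Raising the cut score lowers the false positive rate but also lowers the true positive rate; the benchmark is the point where that trade-off is judged acceptable.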

The Cohort Analyses

The Joint Achievement Report also says District 65 matched up the MAP and ACT scores of a number of cohorts of District 65 students and was able to confirm that the benchmarks identified in the 2012 Study “provide accurate predictions of future performance.” Matching cohorts without adjusting for students who transferred out of or dropped out of ETHS between 8th grade and 12th grade, however, introduces a different form of selection bias and may skew the results.

In addition, as noted above, the 2012 Study identified a college readiness score of 230 for eighth-grade reading. Finding that a score of 230 at District 65 matches up with a score of 22 in reading on the ACT at ETHS, however, does not indicate that a MAP score of 230 is predictive of a 22 on the ACT for a nationally-representative sample. And it does not indicate whether each school district is carrying its own weight. 

For example, the 2015 Study found that an 8th grade MAP score of 230 is predictive of a 24 on the ACT in high school for a nationally representative sample. If the District 65/202 cohort analyses show that a MAP score of 230 in 8th grade at District 65 matches up with a score of 22 on the ACT at ETHS, rather than a score of 24, it may indicate that students are losing ground at ETHS – compared to a nationally representative sample.

Results using the college readiness benchmarks for math identified in the 2012 Study raise more questions. District 65 recently reported that 39.1% of its students met college readiness benchmarks in math on the 2015 Spring MAP tests, using the MAP benchmark scores identified in the 2012 Study. In contrast, ETHS reported that 60.8% of its 2015 graduating seniors met the college readiness benchmark of 22 in math on the ACT. That is a difference of 21.7 percentage points. One explanation is that the benchmark scores for college readiness identified in the 2012 Study are too high.
