TCS Daily

How to Leave No Child Behind

By Joanne Jacobs - November 18, 2002 12:00 AM

The fifth grade desks in my elementary school had ink wells, though we had moved on to leaky ink-cartridge pens. Teachers wrote with squeaky chalk on blackboards. And we used No. 2 lead pencils to take standardized tests.

Increasingly, 21st-century students from grade school to graduate school are taking web-based exams, often "smart tests" that adapt to the test-taker and provide results in minutes rather than months. It's not all multiple choice, either: Artificial intelligence software can "read" and evaluate essays.

At the graduate level, web-based exams are becoming the norm. Now grade schoolers are trading pencil and paper for mouse and screen as states turn to high-tech testing. Among the states considering or implementing computerized testing are Pennsylvania, Oregon, Illinois, Virginia, Georgia, South Dakota and Idaho.

Typically, questions are downloaded on exam day to a school or test center server, avoiding the risk of a slow or broken connection in mid-test. If it's a "linear on the fly" test, students will get different questions of the same level of difficulty, which makes it all but pointless to copy a neighbor's answers.
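To make the "linear on the fly" idea concrete, here is a minimal Python sketch of drawing a different set of questions for each student while keeping the mix of difficulty levels the same. The question bank, the blueprint and the function name are all invented for illustration; this is not any vendor's actual method.

```python
import random


def assemble_form(item_bank, blueprint, seed):
    """Draw one student's test form with a fixed difficulty mix.

    item_bank: dict mapping difficulty level -> list of question ids
    blueprint: dict mapping difficulty level -> number of questions to draw
    seed: per-student seed, so each student gets an independent draw
    """
    rng = random.Random(seed)
    form = []
    for level in sorted(blueprint):
        # sample() picks without replacement, so no question repeats
        form.extend(rng.sample(item_bank[level], blueprint[level]))
    return form


# Hypothetical question bank: three difficulty levels, four questions each.
item_bank = {
    1: ["q101", "q102", "q103", "q104"],
    2: ["q201", "q202", "q203", "q204"],
    3: ["q301", "q302", "q303", "q304"],
}
blueprint = {1: 2, 2: 2, 3: 1}  # two easy, two medium, one hard

print(assemble_form(item_bank, blueprint, seed=1))  # one student's form
print(assemble_form(item_bank, blueprint, seed=2))  # another student's form
```

Every form has two easy questions, two medium and one hard, but which questions appear depends on the per-student draw.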

On an adaptive or smart test, questions get harder or easier depending on students' correct or incorrect answers. Miss the first question or two, and the next question will be a bit easier. Answer a few correctly, and the questions will step up in difficulty. A student's final score won't be the percentage answered correctly; it will be a scaled score reflecting the test-taker's performance level.
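The step-up, step-down logic can be sketched in a few lines of Python. This is a toy "staircase" rule that assumes questions are sorted into difficulty levels 1 through 10; the function name, the levels and the simulated student are all hypothetical, and operational adaptive tests generally rely on more sophisticated statistical models to pick questions and compute the scaled score.

```python
def run_adaptive_test(answers_correctly, start_level=5, num_questions=6,
                      min_level=1, max_level=10):
    """Toy staircase: step up after a correct answer, down after a miss.

    answers_correctly(level) is a hypothetical callback that returns True
    if the student answers a question of that difficulty correctly.
    Returns the difficulty levels presented and the level reached at the end.
    """
    level = start_level
    presented = []
    for _ in range(num_questions):
        presented.append(level)
        if answers_correctly(level):
            level = min(max_level, level + 1)  # next question is harder
        else:
            level = max(min_level, level - 1)  # next question is easier
    return presented, level


# A simulated student who can handle questions up to difficulty 7.
presented, final_level = run_adaptive_test(lambda level: level <= 7)
print(presented)    # [5, 6, 7, 8, 7, 8]
print(final_level)  # 7
```

The test quickly stops asking questions that are far too easy and hovers around the level where the student starts to miss, which is why the result is reported as a performance level rather than a percentage correct.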

For students who'd ace or flunk a standard test, the adaptive exam provides a more precise measure of their performance. Smart tests also take less time than traditional tests because they eliminate questions that are too easy or too hard for the individual test-taker. That may reduce test fatigue.

This year Idaho became the first state to switch to "smart tests." Seventy-five percent of Idaho schools will offer them this year; by spring 2003, all students are supposed to be taking the adaptive exams on school computers.

Students will be tested in the fall and again in the spring. They'll get their results as soon as they finish the exam, so they can review the answers while the questions are still fresh in their minds. Teachers will get a report on their students' performance within 24 hours. They can diagnose individual students' strengths and weaknesses, and adapt their teaching to cover what students need to learn. Principals will be able to evaluate teachers' success in teaching different areas of the curriculum.

Over several years, the scores will show what "value" is added to each student's performance, claims the Northwest Evaluation Association, a Portland, Oregon, nonprofit that designed the tests. Smart tests show performance on a continuous scale, not just by grade level, making it easy to track progress over time.

Like all accountability tools, smart tests are controversial. South Dakota has decided to use adaptive testing to diagnose students' learning needs, but not to evaluate school performance.

Some education officials think adaptive tests may violate the federal No Child Left Behind Act, which requires that all students in the same state take the same test. A federal assurance that smart tests are in compliance would be very helpful.

Whether smart or not-so-smart, computerized testing's speedy feedback makes it an excellent tool for diagnosing students' learning needs and improving teaching. Paper-and-pencil tests typically are given in the spring and graded over the summer. By the time teachers and students get the results, it's too late. The students have forgotten what was on the test. Teachers have a new bunch of students.

Some schools don't have enough computers for wide-scale testing, but that's changing rapidly. Once a system is in place, downloaded tests can be given frequently and cheaply, with no wait for results. Teachers learn how their students are doing while there's time to do something about it. Students get the answers while they still remember the questions, so they can learn from their mistakes.

In the private sector, Edison Schools, Inc., now managing 150 schools nationwide, tests second- through 10th-grade students every month in reading, language arts and math, using the quick turnaround on results to help teachers keep students on track. Vantage Learning customizes the test to meet various state standards.

Vince Matthews, principal of Edison Charter Academy in San Francisco, credits the monthly assessments with his school's rise in test scores.

Web-based assessment allows inexpensive scoring of different kinds of questions, a Rand study concluded. "For example, students may be asked to move or organize objects on the screen (e.g. on a history test put events in the order in which they occurred)," the study says. Computers can read numbers written in answer to a math problem or read words in a fill-in-the-blank response. Students could observe and analyze a simulated lab experiment on screen.

"Reliance on paper-and-pencil multiple-choice tests limits the kinds of skills that can be measured," Rand concluded. "Computer-based testing offers the opportunity to develop new types of questions, especially those that can assess complex problem-solving skills by requiring examinees to generate their own answers."

Scoring student-generated answers is expensive and time-consuming if humans have to be employed. Usually teachers or other college graduates are hired to read short answers or essays over the summer. Imagine reading hundreds of fifth-grade essays on "the person I admire the most." Graders get tired, bored and inconsistent.

So Pennsylvania is experimenting with Vantage Learning's Intellimetric essay-grading system. Indiana is considering the Educational Testing Service's e-rater, which is now used to score written answers on the Graduate Management Admission Test (GMAT). Oregon is interested too. Knowledge Analysis Technologies' Intelligent Essay Assessor will be used to grade writing on the GED exam.

Early essay-reading software, such as PEG (Project Essay Grade), analyzes vocabulary and punctuation use and other writing characteristics with no regard to content.

Intelligent Essay Assessor uses "latent semantic analysis" to compare the patterns of word usage in a student's essay to essays graded by the teacher, and to subject-matter information already entered. Each essay gets the same grade as its closest teacher-graded match.
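The "closest match" idea can be illustrated with a short Python sketch. Note the substitution: this is not latent semantic analysis, which first compresses the word-usage data mathematically so that related words count as similar. The sketch compares raw word counts with cosine similarity, a much cruder stand-in, and the sample essays, grades and function names are made up for illustration.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count how often each word appears."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def grade_by_closest_match(essay, graded_essays):
    """Return the grade of the teacher-graded essay most similar to `essay`.

    graded_essays: list of (text, grade) pairs already scored by a teacher.
    """
    target = word_counts(essay)
    _, best_grade = max(
        graded_essays,
        key=lambda pair: cosine_similarity(target, word_counts(pair[0])),
    )
    return best_grade


# Made-up teacher-graded essays, for illustration only.
graded = [
    ("The heart pumps blood through arteries and veins to the body.", 5),
    ("Blood is red and the heart is in the chest.", 3),
    ("I like science class because it is fun.", 1),
]
new_essay = "Arteries carry blood that the heart pumps out to the body."
print(grade_by_closest_match(new_essay, graded))  # 5
```

A raw word-count comparison like this one misses synonyms and paraphrase, which is the gap latent semantic analysis is meant to narrow.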

E-rater analyzes content and writing style, giving higher grades to students who use more varied syntax. E-rater looks for logical development of an argument, so it helps to include phrases such as "for example," "by comparison" and "in conclusion."

Vantage's Intellimetric also looks for matches to hundreds of teacher-graded essays, analyzing content and writing style.

Massachusetts may use essay-reading software to prepare students for the state exam. The student writes an essay in response to a prompt. In seconds, the program grades the essay and suggests areas that need work. The student revises the essay, and gets instant feedback on the second draft. The software encourages revision and practice, which is what poor writers desperately need.

Essay-reading software can't evaluate creative writing. It favors writing that sticks to the conventions and the buzzwords. It can be fooled or befuddled. When I tried some of the demos, the software said my work was unscoreable.

But the depressing thing is that the software, however artificial its intelligence, works pretty well. When it comes to evaluating writing about specific content material, the software agrees with human readers' grades as often as two human readers agree with each other.
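One simple way to put a number on that comparison is the share of essays on which two graders give exactly the same score. The Python sketch below uses invented scores on a 1-to-6 scale; the vendors' studies may well use other measures, so treat this as an illustration of the idea rather than their method.

```python
def agreement_rate(scores_a, scores_b):
    """Fraction of essays on which two graders gave the same score."""
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must be the same length")
    if not scores_a:
        raise ValueError("no scores to compare")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)


# Invented scores for eight essays, for illustration only.
human_1 = [4, 3, 5, 2, 4, 6, 3, 4]
human_2 = [4, 3, 4, 2, 4, 6, 3, 5]
software = [4, 3, 5, 2, 3, 6, 3, 5]

print(agreement_rate(human_1, human_2))   # 0.75
print(agreement_rate(human_1, software))  # 0.75
```

In this made-up sample, the software matches the first reader exactly as often as the second reader does.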

With essay-reading software, high-tech tests can offer a lot more than multiple choice without driving up costs or turnaround times.

In conclusion, as e-rater would have me say, the No. 2 pencil and the fill-in-the-bubble sheet are going the way of the ink well and the blackboard. High-tech testing is smarter, cheaper, faster and a lot more useful.


