Accountability

 

How does The DOG support accountability?

School accountability is a term that has taken on political spin and become charged with implications that confuse thoughtful dialogue.

It helps to begin by assuming that even teachers and administrators in "non-traditional" schools wish to be accountable to their communities for the growth that takes place in their learning environments. However, as the stakes have risen around rigid and standardized external accountability measures designed with the intention of comparing student outcomes in traditional learning environments, educators with a more holistic bent have struggled to keep pace in demonstrating the rigor and intentionality with which we support our students' development.

In creating accountability systems, there are two significant questions: "For what do we wish to be accountable?" and "By what means may we be held accountable?"

The DOG may be thought of as a formative assessment tool that supports educators who wish to be accountable for assisting in the development of healthy, integrated and actualized human beings.

 

Is the DOG a legitimate assessment tool?

It depends on what you say you are measuring.

Anyone developing or adopting an assessment tool must begin by asking, "What do I want to measure?" The rancor over the standardized assessments currently used as instruments of school accountability originates in disagreement over whether or not we are measuring what we say we're measuring. A test is valid if it measures what it purports to measure. The politicians say we're measuring school quality. Parents assume they are getting information about what their children are learning. Many educators say that the true quality of their schools is going completely unaccounted for, while the scores are useless as a tool for informing instruction.

A little background on assessment goes a long way toward making it clear that we are giving a limited battery of standardized tests far too much influence, and that we need a more diverse assessment toolbox.

Think of the blind men and the elephant. Even the most perfectly designed assessment renders only a partial view of a human being. The DOG supports more rigorous documentation of a view that has gone largely unrecognized. The DOG is not necessarily a tool for documenting outcomes. Rather, it is an instrument that supports a continuous growth model of education.

 

Does it make sense to use the DOG alongside standardized test scores?

It depends on which test scores you're using. It's important for educators to help their constituencies understand the distinctions between various types of assessments, starting by being clear about whether a test is assessing the school or the children. (It can't do both!)

Though it is politically expedient, it is neither possible nor honest to say that the exact same test is measuring the "quality" of schools across a region while also informing Teacher Jane's support for Johnny's and Suzie's individual development.

Why is that so?

Large-scale assessments that compare huge numbers of institutions across a variety of disciplines require tests that are enormously broad in scope but necessarily lacking in nuance and detail. Assessments designed to help Teacher Jane know what's next for Johnny or Suzie must be much narrower in scope and deeper in coverage, to provide a nuanced view of the obstacles in the way of each child's further development within a discipline.

To give an example (admittedly oversimplified, for the sake of clarity), suppose a lengthy math test includes just 5 fraction problems, and only one of them calls for division of fractions. If 75% of the 6th graders at New World Middle School miss that question, it might be reasonable to conclude that a conversation among the middle school math teachers is in order. On the other hand, if Johnny misses the lone fraction division problem in the context of a lengthy, broad-scope math test, we have learned very little about Johnny. We don't know whether he had trouble with reading, whether he was missing vocabulary, whether he had difficulty recognizing which operation to perform, whether he had a faulty understanding of the algorithm, or whether he simply made a clerical error in his computation. Johnny would have to solve many similar problems in order for us to discern patterns that could truly inform our instruction.
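The sample-size part of this example can be put into rough numbers. The sketch below is only an illustration under stated assumptions: the cohort size of 200 students is hypothetical (the example gives only the 75% figure), and the Wilson score interval is one common way to express uncertainty about a proportion, not a method prescribed by The DOG. It also says nothing about why Johnny missed the item, which is the larger point of the example.

```python
import math

Z_95 = 1.96  # normal quantile for an approximate 95% interval


def wilson_interval(hits, trials, z=Z_95):
    """Wilson score interval for a proportion observed as hits out of trials."""
    p = hits / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    # Clamp to the valid range for a proportion.
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical cohort: 200 sixth graders, 150 of whom (75%) miss the one item.
school_low, school_high = wilson_interval(hits=150, trials=200)

# Johnny: one fraction-division item attempted, zero answered correctly.
johnny_low, johnny_high = wilson_interval(hits=0, trials=1)

print(f"School miss rate on the item: {school_low:.2f} to {school_high:.2f}")
print(f"Johnny's plausible success rate: {johnny_low:.2f} to {johnny_high:.2f}")
```

Under these assumptions the school-level interval is narrow enough to justify a conversation among teachers, while Johnny's interval covers most of the possible range, which is why many similar problems are needed before any pattern can be seen.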

We must patiently and logically challenge all efforts to conflate large-scale, high-stakes school assessment measures with the kinds of formative or summative assessments that serve as instructional support tools or evaluations of individual academic growth.

Standardized tests of academic progress, administered infrequently for the purpose of assessing growth, can be useful as one measure of a student's academic development. The DOG is an excellent complement to these data. Individual scores on high-stakes standardized tests of school quality, however, should not be misused as indicators of individual student learning, with or without a DOG by their side.

 

On what reasoning is The DOG constructed?

The first challenge in developing any assessment tool is: How do we operationalize the construct we wish to measure? In this case: How do we define growth in such a way that it can be practically measured?

We start by acknowledging that growth takes place in more dimensions than we can count: our assessment will always be incomplete.

We agree that despite the impossibility of covering all the bases, we can name and describe a limited number of human attributes whose development is of significant concern to us.

The DOG belongs to a family of assessments known as behavioral proxy assessments. Assessment by proxy (as opposed to direct assessment, like assessing heart rate by counting the number of beats per minute) always raises questions of validity. When you propose to measure the growth of traits that can't be measured directly (Empathy, for example), you must start by proposing that certain observable behaviors can be accepted as stand-ins for the trait under consideration (e.g.: we may assert that "Consistently demonstrating awareness of, and sensitivity to, the mental states, needs, and experiences of others" is a valid proxy for Empathy).

It must be clearly understood that there will NEVER be 100% agreement as to the validity of any given behavioral proxies!

In The DOG, there are 4 levels of proxies:

  1. The Dimensions we have identified, collectively, are proxies for Whole Student Growth, or Multi-Dimensional Growth. That is, with a front-end caveat that this list will grow, yet always be incomplete, we are asserting that growth in these Dimensions, supplemented by measures of content acquisition (which must take their fair place in a much larger pantheon of Dimensions of Growth), constitutes a fair picture of Whole Person Growth.
  2. The Attributes we have designated, collectively, are proxies for the Dimensions (e.g.: we are asserting that skillfulness in all traits identified as Attributes of the Dimension of Self-Management represents skillfulness in Self-Management).
  3. The described Observable Behavior Patterns are proxies for the Attributes.
  4. The behaviors described at each of the 4 defined stages of growth within each Attribute are proxies for characteristic stages in the maturation of that Attribute.

These are four junctures at which we, as creators of this observation and assessment framework, acknowledge that we have made value judgments that individual users may or may not agree with.
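The four-level nesting described above can be sketched as a small data structure. This is an illustrative model only, not the actual format of The DOG: the class names, the `proxy_chain` helper, the placeholder Dimension name, and the placeholder stage descriptions are all assumptions made for the sketch. Only the Empathy proxy wording and the count of 4 stages come from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    name: str
    behavior_pattern: str       # level 3: observable proxy for the Attribute
    stage_behaviors: List[str]  # level 4: one description per stage of growth

    def __post_init__(self):
        if len(self.stage_behaviors) != 4:
            raise ValueError("expected behaviors for exactly 4 stages of growth")


@dataclass
class Dimension:
    name: str
    attributes: List[Attribute] = field(default_factory=list)  # level 2


@dataclass
class Framework:
    construct: str  # what the Dimensions collectively stand in for
    dimensions: List[Dimension] = field(default_factory=list)  # level 1


def proxy_chain(framework, attribute_name):
    """Return the chain of stand-ins from the overall construct down to a behavior."""
    for dimension in framework.dimensions:
        for attribute in dimension.attributes:
            if attribute.name == attribute_name:
                return [framework.construct, dimension.name,
                        attribute.name, attribute.behavior_pattern]
    raise KeyError(attribute_name)


empathy = Attribute(
    name="Empathy",
    behavior_pattern=("Consistently demonstrating awareness of, and sensitivity to, "
                      "the mental states, needs, and experiences of others"),
    # Placeholder text: the real stage descriptions are not given here.
    stage_behaviors=[f"placeholder description for stage {n}" for n in range(1, 5)],
)
framework = Framework(
    construct="Whole Student Growth",
    dimensions=[Dimension(name="Placeholder Dimension", attributes=[empathy])],
)

for level in proxy_chain(framework, "Empathy"):
    print(level)
```

Each link in the printed chain is one of the junctures where a value judgment has been made, so each is a point where a community of users might want to review or edit the wording.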

 

Then, what makes The DOG a valid assessment tool?

The main limitation of behavioral proxy tools is poor construct validity, or limited agreement that the chosen proxies are legitimate stand-ins for what the test purports to measure. The validity of The DOG is enhanced by the editability of the tool. The observer, the observed, and even families of the observed can weigh in on and agree on the language and structure of the tool before it is finalized. The greater the level of agreement (that the behavioral proxies are legitimate stand-ins for the Attributes), the greater the validity of the instrument. (Establishing a process that allows the community to have a voice in the construction of an assessment could raise questions of competence and professionalism. Introducing an instrument like The DOG will likely require a significant staff and parent education effort.)

The addition of first-person reporting, for older students, further enhances the validity of this tool.

The hard truth is that "scaling" remains a perennial problem for valid tests of the individual psyche. Assessing and attaching high stakes only to those dimensions of growth (e.g.: knowledge acquisition) that 1) are measurable and 2) provoke little disagreement not only invalidates them as proxies for whole person growth, but also distorts the education system toward a focus on growth in easily quantifiable dimensions at the expense of other kinds of growth.

One of the most significant hidden faults of high-stakes standardized tests is the fact that there is no public discussion of the construct validity of high scores on these tests as proxies for school quality.

The DOG will never be useful for comparing hundreds of schools across a region. The DOG is designed to help individual schools and educators observe and assist individual students' development in ways that the local community considers meaningful.