The Death of Evidence

This past month marked my 25th anniversary of making a living in the
social sciences. I got my first research job in Rural Sociology in
April 1980, while I was still an undergraduate. I had the pleasure
of being trained by a fairly close-knit faculty, many of whom were
late in their careers. They remembered a time that most who work in
the social sciences now have never known, and they held a number of
highly useful views that have since lost favor and are regarded
today as little more than quaint. For better or worse, many of those
views have stuck with me.

They used to talk about the time when it would take them a couple of
weeks of intensive labor to generate the results for a relatively
straightforward multiple regression equation. If you think about
it, that makes sense: if all you’ve got are hard-copy grids of
data, a statistics book with all the necessary formulas and
distribution tables, and a pad and a pencil, you can imagine that it
would take forever to calculate a multiple regression equation that
meets even the simplest publication standards. Add the fact that they
had little to no computing power at their disposal (aside from
likely being late adopters, who wanted to devote the precious few
computing resources available in those days to social scientists?)
and had extensive teaching and committee responsibilities, and you
can see why doing quantitative social science in those days was such
a risky business.
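
To make the contrast concrete, here is a minimal sketch (in Python
with NumPy, using invented numbers purely for illustration) of the
ordinary least-squares arithmetic at the heart of a multiple
regression: the normal equations whose cross-product sums once had
to be ground out cell by cell on paper, and which any modern machine
now dispatches in a fraction of a second.

    # A minimal sketch of the arithmetic behind ordinary least squares,
    # the core of a multiple regression. The data are invented purely
    # for illustration; the point is how little work the machine makes
    # of calculations that once took weeks by hand.
    import numpy as np

    # Hypothetical data: 8 observations, 2 predictors.
    X = np.array([
        [1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0],
        [5.0, 6.0], [6.0, 5.0], [7.0, 8.0], [8.0, 7.0],
    ])
    y = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9, 15.2, 15.8])

    # Add an intercept column, then solve the normal equations
    # (X'X) b = X'y. Every entry of X'X and X'y is a sum of
    # cross-products that once had to be tallied by hand.
    X1 = np.column_stack([np.ones(len(y)), X])
    beta = np.linalg.solve(X1.T @ X1, X1.T @ y)

    # Goodness of fit: R-squared from residual and total sums of squares.
    resid = y - X1 @ beta
    r_squared = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

    print("coefficients (intercept, b1, b2):", beta)
    print("R-squared:", round(float(r_squared), 4))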

They did it anyway, but with an outlook quite a bit different from
anything we see today. Given that results were so hard-earned, they
needed to engage actively in a number of practices that would
increase the probability of their results being fruitful. So they
made sure that the theories they were eager to test were highly
developed conceptually before applying any statistics to them. They
made sure their data met the basic distributional assumptions and
agonized over how to most effectively operationalize any and all of
their key concepts. In other words, they applied statistical tests
in the way the theorists who developed them originally intended.
Anything else would likely have led to disaster.
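
Those checks cost almost nothing today, which is part of what makes
skipping them so striking. Here is a minimal sketch of that kind of
distributional screening, assuming NumPy and SciPy are available and
using invented data:

    # A minimal sketch of the distributional screening described above:
    # checking whether a variable is roughly normal before relying on
    # tests that assume it is. The data are invented for illustration.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    income = rng.lognormal(mean=10, sigma=0.8, size=200)  # deliberately skewed

    for label, values in [("raw income", income),
                          ("log income", np.log(income))]:
        skew = stats.skew(values)
        w_stat, p_value = stats.shapiro(values)  # Shapiro-Wilk normality test
        print(f"{label:12s} skew={skew:5.2f}  Shapiro-Wilk p={p_value:.3f}")

    # A small p-value means "not plausibly normal": the skewed raw
    # variable fails, while the log transform comes far closer to
    # meeting the assumption.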

This all began to change in the mid-1970s (I believe), as SPSS spread
through university computing centers. While SAS may have been more
statistically rigorous, it never shared the goal of being incredibly
easy to use. SPSS was designed by social scientists with an eye
toward making a wealth of statistical tools available to anyone with
the simplest skills and low-level access to a mainframe computer.
Suddenly, results that had previously taken weeks to produce could
be generated in minutes. In addition, there was now a much greater
variety of statistical tricks and tools that could be employed. And
perhaps most important, all of these statistics could easily be
generated by a program that required absolutely no knowledge of
statistics to run successfully. Most of my statistical training
in school was motivated by trying to understand all the various
techniques that I was routinely generating on the computer.

The dramatic reduction in the cost of generating statistical results
completely changed the face of how social science was performed.
Gone were the days of ensuring adequate conceptual development and
checking that data met basic distributional requirements; the
computer could easily generate results no matter how flimsy or
questionable the data. This would ultimately give birth to the age
of keeping busy
doing social science without thinking. In many circles, theoretical
development would be seen as little more than corroborated
empiricism. Finally, the advent of cheap statistical results
dramatically increased the number of iterations performed to
complete the statistical testing for any one project. Rather than
carefully laying out the concepts and the rigorous data prior to
calculating a relatively small number of statistical tests,
practitioners would take half-baked concepts and marginal data and
massage them through a wide variety of transformations and
statistical techniques before any final results were generated and
presented. Keeping track of all the iterations performed and the
rationale for each of them became a separate project in and of
itself.

Though the logic of generating results became more convoluted, the
time and space to present those results to the scientific community
either remained constant or shrank. It was largely impossible for
practitioners to fit the rationale for all their statistical
gyrations into the same 20-page journal article or 20-minute
conference presentation that needed to contain the substantive
argument, literature review, results, conclusions and implications
of the project. For a while, practitioners appeared to live by an
ethic which held that, even if one could not present all the
statistical gyrations and the rationale for them in a conference
presentation or journal article, one would still prepare at least an
informal supplement containing this information, so that if anyone
asked for it, it would be completely available in all its splendor
and glory.

Of course, no one ever asked for this information. People were too
busy and too overwhelmed with information to want to be bothered
with the petty details of how a conclusion was reached, so inquiries
into the nuts and bolts of the statistical methods were never made.
As a result, it didn’t take long for this ethic to be abandoned. It
got lost for two reasons. First, it atrophied from sheer lack of
use; why make the effort if no one subjects the work to scrutiny?
Second, continued advancements in computing power led to further
increases in the number of statistical gyrations that many
practitioners would undertake, so much so that the authors
themselves could not keep track of all the choices and modifications
they had performed to generate their final results.

This has all been compounded by the bureaucratization of the
production of knowledge. Many practitioners now organize their
research efforts like a business. The principal authors are
frequently the CEOs of the research projects, concerning themselves
only with the high-level concepts and “strategic direction” of the
work. The details of the production of the statistical results are
frequently left to the graduate students, whose efforts are
frequently not scrutinized. A favorite example comes from the tail
end of my graduate career. I was doing some work for a department
with a member who had recently completed their dissertation. In
the dissertation, they had reported their statistical results using
linear structural modeling and a program called LISREL, which was
the big macho statistical technique of the time. During staff
meetings and other consultations, members of this department sought
this individual’s advice on issues related to the technique, because
they were now seen as an authority on it. The truth was that this
person could barely explain the difference between a mean and a
median. They had paid one of my analyst colleagues to do the LISREL
analysis for them and explain what it all meant. So they were
viewed as an authority on something about which they knew very
little. This method of organization is
probably fine for making widgets or creating a decent restaurant,
but it seems a little troubling for generating knowledge.

The end result of all these trends has been a growing distance
between what constitutes knowledge and the means used to produce
that knowledge. This is true not only for consumers of knowledge
but also for its producers. If you go to a conference today and ask a
presenter about the measures they used to construct an index, or
about the Cronbach’s Alpha for a given scale, they’ll look at you
like you’re a psychopath, and then proceed to tell you that
everything they’ve done is reliable because it’s based on something
someone else did at least once. You know that the logic or the
methodology of their analysis is likely flawed, but there is no way
to get them to go to the level of detail necessary to address those
concerns. Important levels of scrutiny have become little more than
quaint artifacts in today’s fast-paced world.
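
For what it’s worth, the scrutiny being waved off here is not
exotic. Cronbach’s alpha, for example, is just a comparison of the
item variances with the variance of the total score. Here is a
minimal sketch in Python, with invented responses, of the check that
question is asking about:

    # A minimal sketch of Cronbach's alpha, the internal-consistency
    # statistic mentioned above:
    #   alpha = (k / (k - 1)) * (1 - sum(item variances) / var(total score))
    # The responses below are invented purely for illustration.
    import numpy as np

    def cronbach_alpha(items: np.ndarray) -> float:
        """items: a (respondents x items) matrix of scale responses."""
        k = items.shape[1]
        item_variances = items.var(axis=0, ddof=1)
        total_variance = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

    # Ten hypothetical respondents answering a four-item Likert scale.
    responses = np.array([
        [4, 4, 5, 4], [2, 3, 2, 2], [5, 5, 4, 5], [3, 3, 3, 4],
        [1, 2, 1, 1], [4, 5, 4, 4], [2, 2, 3, 2], [5, 4, 5, 5],
        [3, 4, 3, 3], [1, 1, 2, 1],
    ])
    print("Cronbach's alpha:", round(cronbach_alpha(responses), 3))
    # Values of roughly 0.7 and above are the usual (and debatable)
    # rule of thumb for treating a set of items as one reliable scale.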

I don’t suspect we reached this particularly sad state of affairs
through any sort of maliciousness. Certainly when the problem
began, it was merely a matter of the research enterprise taking more
turns and generating more information than could adequately be
covered in a small amount of space or time. As the problem matured a
bit, I
suspect that some researchers were willing to hide their flaws in
thinking (or, perhaps more likely, lack of thinking) in the sheer
volume of activity in which they engaged to produce their results.
But now, I’m beginning to wonder if we’ve reached the point where
some people are willing to exploit the space and time limitations
that constrain well-reasoned arguments to forward points of view
without having to consider the possibility of contrary evidence.
All you need to do is select a few points of evidence (and withhold
the troubling ones), set your own context, strategically refute one
or two bits of counterevidence without putting that counterevidence
in context, and look real authoritative without ever attending to
the basic rules of evidence. It may be a bit intellectually
dishonest, but it’s still effective.

I’m wondering if Malcolm Gladwell might epitomize this problem.

It’s definitely not fair for me to make this judgment yet, because
I’ve only read a few pages of The Tipping Point. I just opened the
book to a random spot and read about twenty pages while waiting to
get into the shower on Friday morning. Lack of context
notwithstanding, I was astounded by what I read. He seemed to base
his argument about what constitutes “Mavens” and “Salesmen” solely
on a couple of interesting people he had met. I sensed no
attempt to explain how the traits of these particular people might
be generalized to others in anything but the most speculative ways.
There was no attempt to delineate the breadth of their influence, or
of those like them. He quotes a study that claims that folks who
watched ABC News around the 1984 presidential election were more
likely to vote for Reagan because the anchorperson smiled more often
when talking about Reagan than about Mondale. It’s a provocative
claim, but the evidence cited to support it was so selective that
the argument bordered on fraudulent. He makes a claim that ABC
otherwise covered stories in a way that was more hostile to Reagan
than the other networks, but does not present the source of that
claim or the evidence for it. I’ve heard the opposite on numerous
occasions, but this author gives me no basis for judging between
these competing claims. He presents no evidence about the
demographics of the audiences that tune into these broadcasts that
is independent of the study he wants you to believe. There is no
direct evidence about the content of the news stories they run. It
was nothing but a bunch of provocative (and intuitively satisfying)
claims with no solid evidence to support or refute them. Yes, the
claims he makes are fun to think about, but there appears to be
little of substance behind them.

This is not the only place I’ve seen these tendencies. There’s a
guy at Johns Hopkins who’s been hawking an organizational “culture
of safety” questionnaire that has many of the folks I work for all
excited. Every time I come in contact with him or his work, I feel
like I need a shower. There may be something to his findings, but
it’s awfully hard to tell. He’s very selective about the evidence
he presents and shows no interest in explaining everything he shows
to you. He makes no attempt to address the existence of contrary
evidence or test for what might be some fundamental biases in his
work. He’s slick and demur, but I can’t help but think that he is
intellectually dishonest. Perhaps I’m not being fair, but I’m just
not used to feeling quite this level of skepticism when confronted
with academic work.

I don’t know what’s worse: the fact that we have people producing
this type of work, or the ease with which so many of us are ready to
consume it. Are we so easily seduced by provocative hypotheses and
the flights of fancy we’re inclined to take around them that we are
blind to the need for decent evidence to back them up?

The political environment we’re living in is not helping matters
any, given the current administration’s flagrant disdain for
facts. But it seems strange to see this level of discourse coming
from intellectuals. While I can see how we may have come to this
point, that knowledge brings me no comfort. How do we go about
reducing this level of intellectual degeneration?

Someone could say that I’m guilty of using Gladwell’s tactics to
make the case for criticizing him. But all I’m doing is getting the
ideas down before I lose them. Before I ever did anything to make
them public, I would make a concerted effort to examine… let’s see,
what should we call it? How about evidence?

And I’ll start by actually reading the book from the beginning. I
was curious to see whether any of my initial thoughts resonated with
you, given that you’ve read it. I’ll keep you posted as I delve into
it more…