The challenges of objectivity: lessons from anatomy.

In the last post, we talked about objectivity as a scientific ideal aimed at building a reliable picture of what the world is actually like. We also noted that this goal travels closely with the notion of objectivity as what anyone applying the appropriate methodology could see. But, as we saw, it takes a great deal of scientific training to learn to see what anyone could see.

The problem of how to see what is really there is not a new one for scientists. In her book The Scientific Renaissance: 1450-1630 [1], Marie Boas Hall describes how this issue presented itself to Renaissance anatomists. These anatomists endeavored to learn about the parts of the human body that could be detected with the naked eye and the help of a scalpel.

You might think that the subject matter of anatomy would be more straightforward for scientists to “see” than the cells Fred Grinnell describes [2] (discussed in the last post), which require preparation and staining and the twiddling of knobs on microscopes. However, the most straightforward route to gross anatomical knowledge — dissections of cadavers — had its own challenges. For one thing, cadavers (especially human cadavers) were often in short supply. When they were available, anatomists hardly ever performed solitary dissections of them. Rather, dissections were performed, quite literally, for an audience of students, generally with a surgeon doing the cutting while a professor stood nearby and read aloud from an anatomical textbook describing the organs, muscles, or bones encountered at each stage of the dissection process. The hope was that the features described in the text would match the features being revealed by the surgeon doing the dissecting, but there were doubtless instances where the audio track (as it were) was not quite in sync with the visual. Also, as a practical matter, before the invention of refrigeration, dissections were seasonal, performed in the winter rather than the warmer months to retard the cadaver’s decomposition. This put limits on how much anatomical study a person could cram into any given year.

In these conditions, most of the scientists who studied anatomy logged many more hours watching dissections than performing dissections themselves. In other words, they were getting information about the systems of interest by seeing rather than by doing — and they weren’t always seeing those dissections from the good seats. Thus, we shouldn’t be surprised that anatomists greeted the invention of the printing press by producing a number of dissection guides and anatomy textbooks.

What’s the value of a good textbook? It shares detailed information compiled by another scientist, sometimes over the course of years of study, yet you can consume that information in a more timely fashion. If it has diagrams, it can give you a clearer view of what there is to observe (albeit through someone else’s eyes) than you may be able to get from the cheap seats at a dissection. And, if you should be so lucky as to get your own specimens for study, a good textbook can guide your examination of the new material before you, helping you deal with the specimen in a way that lets you see more of what there is to see (including spatial relations and points of attachment) rather than messing it up with sloppy dissection technique.

Among the most widely used anatomy texts in the Renaissance were “uncorrupted” translations of On the Use of the Parts and Anatomical Procedures by the ancient Greek anatomist Galen, and the groundbreaking new text On the Fabric of the Human Body (published in 1543) by Vesalius. The revival of Galen fit into a pattern of Renaissance celebration of the wisdom of the ancients rather than of efforts to build “new” knowledge, and Hall describes the attitude of Renaissance anatomists toward his work as “Galen-worship.” Had Galen been alive during the Renaissance, he might well have been irritated at the extent to which his discussions of anatomy — based on dissections of animals, not human cadavers — were taken to be authoritative. Galen himself, as an advocate of empiricism, would have urged other anatomists to “dissect with a fresh eye,” attentive to what the book of nature (as written on the bodies of creatures to be dissected) could teach them.

As it turns out, this may be the kind of thing that’s easier to urge than to do. Hall asks,

[W]hat scientific apprentice has not, many times since the sixteenth century, preferred to trust the authoritative text rather than his own unskilled eye? (137)

Once again, it requires training to be able to see what there is to see. And surely someone who has written textbooks on the subject (even centuries before) has more training in how to see than does the novice leaning on the textbook.

Of course, the textbook becomes part of the training in how to see, which can, ironically, make it harder to be sure that what you are seeing is an accurate reflection of the world, not just of the expectations you bring to your observations of it.

The illustrations in the newer anatomy texts made it seem less urgent to anatomy students that they observe (or participate in) actual dissections for themselves. As the techniques for mass-producing illustrations improved (especially with the shift from woodcuts to engravings), illustrators could include much more detail in their images. Paradoxically, this could be a problem, as the illustrator was usually someone other than the scientist who wrote the book, and the author and illustrator were not always in close communication as the images were produced. Given a visual representation of what there is to observe and a description of what there is to observe in the text, which would a student trust more?

Bruce Bower discusses this sort of problem in his article “Objective Visions,” [3] describing the procedures used by Dutch anatomist Bernhard Albinus in the mid-1700s to create an image of the human skeleton. Bower writes:

Albinus carefully cleans, reassembles, and props up a complete male skeleton; checks the position of each bone in comparison with observations of an extremely skinny man hired to stand naked next to the skeleton; he calculates the exact spot at which an artist must sit to view the skeleton’s proportions accurately; and he covers engraving plates with cross-hatched grids so that images can be drawn square-by-square and thus be reproduced more reliably. (360)

Here, it sounds like Albinus is trying hard to create an image that accurately conveys what there is to see about the skeleton and its spatial relations. The methodology seems designed to make the image-creation faithful to the particulars of the actual specimen — in a word, objective. But, Bower continues:

After all that excruciating attention to detail, the eminent anatomist announces that his atlas portrays not a real skeleton, but an idealized version. Albinus has dictated alterations to the artist. The scrupulously assembled model is only a springboard for insights into a more “perfect” representation of the human skeleton, visible only to someone with Albinus’ anatomical acumen. (360)

Here, Albinus was trying to abstract away from the peculiarities of the particular skeleton he had staged as a model for observation in order to describe what he saw as the real thing. This is a decidedly Platonist move. Plato’s view was that the stuff of our world consists largely of imperfect material instantiations of immaterial ideal forms — and that science makes the observations it does of many examples of material stuff to get a handle on those ideal forms.

If you know the allegory of the cave, however, you know that Plato didn’t put much faith in feeble human sense organs as a route to grasping the forms. The very imperfection of those material instantiations that our sense organs apprehend would be bound to mislead us about the forms. Instead, Plato thought we’d need to use the mind to grasp the forms.

This is a crucial juncture where Aristotle parted ways with Plato. Aristotle still thought that there was something like the forms, but he rejected Plato’s full-strength rationalism in favor of an empirical approach to grasping them. If you wanted to get a handle on the form of “horse,” for example, Aristotle thought the thing to do was to examine lots of actual specimens of horse and to identify the essence they all have in common. The Aristotelian approach probably feels more sensible to modern scientists than the Platonist alternative, but note that we’re still talking about arriving at a description of “horse-ness” that transcends the observable features of any particular horse.

Whether you’re a Platonist, an Aristotelian, or something else, it seems pretty clear that scientists do decide that some features of the systems they’re studying are crucial and others are not. They distinguish what they take to be background from what they take to be the thing they’re observing. Rather than presenting every single squiggle in their visual field, they abstract away to present the piece of the world they’re interested in talking about.

And this is where the collaboration between anatomist and illustrator gets ticklish. What happens if the engraver is abstracting away from the observed particulars differently than the anatomist would? As Hall notes, the engravings in Renaissance anatomy texts did not always accurately represent what the texts described. (Nor, for that matter, did the textual descriptions always get the anatomical features right — Renaissance anatomists, Vesalius included, managed to repeat some anatomical mistakes that went back to Galen, likely because they “saw” their specimens through a lens of expectations shaped by what Galen said they were going to see.)

On top of this, the fact that artists like Leonardo da Vinci studied anatomy to improve their artistic representations of the human form spilled back to influence Renaissance scientific illustrators. These illustrators, as much as their artist contemporaries, may have looked beyond the spatial relations between bones or muscles or internal organs for hidden beauty in their subjects. While this resulted in striking illustrations, it also meant that their engravings were not always accurate representations of the cadavers that were officially their subjects.

These factors conspired to produce visually arresting anatomy texts that exerted an influence on how the anatomy students using them understood the subject, even when these students went beyond the texts to perform their own dissections. Hall writes,

[I]t is often quite easy to “see” what a textbook or manual says should be seen. (141)

Indeed, faced with a conflict between the evidence of one’s eyes pointed at a cadaver and the evidence of one’s eyes pointed at an anatomical diagram, one might easily conclude that the cadaver in question was a weird variant while the diagram captured the “standard” configuration.

Bower’s article describes efforts scientists made to come up with visual representations that were less subjective. Bower writes:

Scientists of the 19th century rapidly adopted a new generation of devices that rendered images in an automatic fashion. For instance, the boxy contraption known as the camera obscura projected images of a specimen, such as a bone or a plant, onto a surface where a researcher could trace its form onto a piece of paper. Photography soon took over and further diminished human involvement in image-making. … Researchers explicitly equated the manual representation of items in the natural world with a moral code of self-restraint. … A blurry photograph of a star or ragged edges on a slide of tumor tissues were deemed preferable to tidy, idealized portraits. (361)

Our naïve picture of objectivity may encourage us to think that seeing is believing, and that mechanically captured images are more reliable than those rendered by the hand of a (subjective) human, but it’s important to remember that pictures — even photographs — have points of view, depend on choices made about the conditions of their creation, and can be used as arguments to support one particular way of seeing the world over another.

In the next post, we’ll look at how Seventeenth Century “natural philosophers” labored to establish a general-use method for building reliable knowledge about the world, and at how the notion of objectivity was connected to these efforts, and to the recognizable features of “the scientific method” that resulted.
_____________

[1] Marie Boas Hall, The Scientific Renaissance: 1450-1630. Dover, 1994.

[2] Frederick Grinnell, The Scientific Attitude. Guilford Press, 1992.

[3] Bruce Bower, “Objective Visions,” Science News. 5 December 1998: Vol. 154, pp. 360-362.

The ideal of objectivity.

In trying to figure out what ethics ought to guide scientists in their activities, we’re really asking a question about what values scientists are committed to. Arguably, something that a scientist values may not be valued as much (if at all) by the average person in that scientist’s society.

Objectivity is a value – perhaps one of the values that scientists and non-scientists most strongly associate with science. So, it’s worth thinking about how scientists understand that value, some of the challenges in meeting the ideal it sets, and some of the historical journey that was involved in objectivity becoming a central scientific value in the first place. I’ll be splitting this discussion into three posts. This post sets the stage and considers how modern scientific practitioners describe objectivity. The next post will look at objectivity (and its challenges) in the context of work being done by Renaissance anatomists. The third post will examine how the notion of objectivity was connected to the efforts of Seventeenth Century “natural philosophers” to establish a method for building reliable knowledge about the world.

First, what do we mean by objectivity?

In everyday discussions of ethics, being objective usually means applying the rules fairly and treating everyone the same rather than showing favoritism to one party or another. Is this what scientists have in mind when they voice their commitment to objectivity? Perhaps in part. It could be connected to applying “the rules” of science (i.e., the scientific method) fairly and not letting bias creep into the production of scientific knowledge.

This seems close to the characterization of good scientific practice that we see in the National Academy of Sciences and National Research Council document, “The Nature of Science.” [1] This document describes science as an activity in which hypotheses undergo rigorous tests, whereby researchers compare the predictions of the hypotheses to verifiable facts determined by observation and experiment, and findings and corrections are announced in refereed scientific publications. It states, “Although [science’s] goal is to approach true explanations as closely as possible, its investigators claim no final or permanent explanatory truths.” (38)

Note that rigorous tests, verification of facts (or sharing the information necessary to verify them), correction of mistakes, and reliable reports of findings all depend on honesty — you can’t perform these activities by making up your results, or by presenting them in a deceptive way, for example. So being objective in the sense of following good scientific methodology requires a commitment not to mislead.

But here, in “The Nature of Science,” we see hints that there are two closely related, yet distinct, meanings of “objective”. One is what anyone applying the appropriate methodology could see. The other is a picture of what the world is really like. Getting a true picture of the world (or aiming for such a picture) means seeking objectivity in the second sense — finding the true facts. Seeking out the observational data that other scientists could verify — the first sense of objectivity — is closely tied to the experimental method scientists use and their strategies for reporting their results. Presumably, applying objective methodology would be a good strategy for generating an accurate (and thus objective) picture of the world.

But we should note a tension here that’s at least as old as the tension between Plato and his student Aristotle. What exactly are the facts about the world that anyone could see? Are sense organs like eyes all we need to see them? If such facts really exist, are they enough to help us build a true picture of the world?

In the chapter “Making Observations” from his book The Scientific Attitude [2], Fred Grinnell discusses some of the challenges of seeing what there is to see. He argues that, especially in the realms science tries to probe, seeing what’s out there is not automatic. Rather, we have to learn to see the facts that are there for anyone to observe.

Grinnell describes the difficulty students have seeing cells under a light microscope, a difficulty that persists even after students work out how to use the microscope to adjust the focus. He writes:

The students’ inability to see the cells was not a technical problem. There can be technical problems, of course — as when one takes an unstained tissue section and places it under a microscope. Under these conditions it is possible to tell that something is “there,” but not precisely what. As discussed in any histology textbook, the reason is that there are few visual features of unstained tissue sections that our eyes can discriminate. As the students were studying stained specimens, however, sufficient details of the field were observable that could have permitted them to distinguish among different cells and between cells and the noncellular elements of the tissue. Thus, for these students, the cells were visible but unseen. (10-11)

Grinnell’s example suggests that seeing cells, for example, requires more than putting your eye to the eyepiece of a microscope focused on a stained sample of cells. Rather, you need to be able to recognize those bits of your visual field as belonging to a particular kind of object — and you may even need to have something like the concept of a cell to be able to identify what you are seeing as cells. At the very least, this suggests that we should amend our gloss of objective as “what anyone could see” to something more like “what anyone could see given a particular conceptual background and some training with the necessary scientific measuring devices.”

But Grinnell makes even this seem too optimistic. He notes that “seeing things one way means not seeing them another way,” which implies that there are multiple ways to interpret any given piece of the world toward which we point our sense organs. Moreover, he argues,

Each person’s previous experiences will have led to the development of particular concepts of things, which will influence what objects can be seen and what they will appear to be. As a consequence, it is not unusual for two investigators to disagree about their observations if the investigators are looking at the data according to different conceptual frameworks. Resolution of such conflicts requires that the investigators clarify for each other the concepts that they have in mind. (15)

In other words, scientists may need to share a bundle of background assumptions about the world to look at a particular piece of that world and agree on what they see. Much more is involved in seeing “what anyone can see” than meets the eye.

We’ll say more about this challenge in the next post, when we look at how Renaissance anatomists tried to build (and communicate) objective knowledge about the human body.
_____________

[1] “The Nature of Science,” in Panel on Scientific Responsibility and the Conduct of Research, National Academy of Sciences, National Academy of Engineering, Institute of Medicine. Responsible Science, Volume I: Ensuring the Integrity of the Research Process. Washington, DC: The National Academies Press, 1992.

[2] Frederick Grinnell, The Scientific Attitude. Guilford Press, 1992.

Intuitions, scientific methodology, and the challenge of not getting fooled.

At Context and Variation, Kate Clancy has posted some advice for researchers in evolutionary psychology who want to build reliable knowledge about the phenomena they’re trying to study. This advice, of course, is prompted in part by methodology that is not so good for scientific knowledge-building. Kate writes:

The biggest problem, to my mind, is that so often the conclusions of the bad sort of evolutionary psychology match the stereotypes and cultural expectations we already hold about the world: more feminine women are more beautiful, more masculine men more handsome; appearance is important to men while wealth is important to women; women are prone to flighty changes in political and partner preference depending on the phase of their menstrual cycles. Rather than clue people in to problems with research design or interpretation, this alignment with stereotype further confirms the study. Variation gets erased: in bad evolutionary psychology, there are only straight people, and everyone wants the same things in life. …

No one should ever love their idea so much that it becomes detached from reality.

It’s a lovely post about the challenges of good scientific methodology when studying human behavior (and why it matters to more than just scientists), so you should read the whole thing.

Kate’s post also puts me in mind of some broader issues about which scientists should remind themselves from time to time to keep themselves honest. I’m putting some of those on the table here.

Let’s start with a quotable quote from Richard Feynman:

The first principle is that you must not fool yourself, and you are the easiest person to fool.

Scientists are trying to build reliable knowledge about the world from information that they know is necessarily incomplete. There are many ways to interpret the collections of empirical data we have on hand — indeed, many contradictory ways to interpret them. This means that lots of the possible interpretations will be wrong.

You don’t want to draw the wrong conclusion from the available data, not if you can possibly avoid it. Feynman’s “first principle” is noting that we need to be on guard against letting ourselves be fooled by wrong conclusions — and on guard against the peculiar ways that we are more vulnerable to being fooled.

This means we have to talk about our attachment to intuitions. All scientists have intuitions. They surely help in motivating questions to ask about the world and strategies for finding good answers to them. But intuitions, no matter how strong, are not the same as empirical evidence.

Making things more challenging, our strong intuitions can shape what we take to be the empirical evidence. They can play a role in which results we set aside because they “couldn’t be right,” in which features of a system we pay attention to and which we ignore, in which questions we bother to ask in the first place. If we don’t notice the operation of our intuitions, and the way they impact our view of the empirical evidence, we’re making it easier to get fooled. Indeed, if our intuitions are very strong, we’re essentially fooling ourselves.

As if this weren’t enough, we humans (and, by extension, human scientists) are not always great at recognizing when we are in the grips of our intuitions. It can feel like we’re examining a phenomenon to answer a question without making any assumptions to guide our inquiry, but chances are that’s not a feeling we should trust.

This is not to say that our intuitions are guaranteed a safe haven from our notice. We can become aware of them and try to neutralize the extent to which they, rather than the empirical evidence, are driving the scientific story — but to do this, we tend to need help from people who have conflicting intuitions about the same bit of the world. This is a good methodological reason to take account of the assumptions and intuitions of others, especially when they conflict with our own.

What happens if there are intuitions about which we all agree — assumptions we are making (and may well be unaware that we’re making, because they seem so bleeding obvious) with which no one disagrees? I don’t know that there are any such universal human intuitions. It seems unlikely to me, but I can’t rule out the possibility. What would they mean for our efforts at scientific knowledge-building?

First, we would probably want to recognize that the universality of an intuition still wouldn’t make it into independent empirical evidence. Even if it had been the case, prior to Galileo, or Copernicus, or Aristarchus of Samos, that every human took it as utterly obvious that Earth is stationary, we recognize that this intuition could still be wrong. As it happened, it was an intuition that was questioned, though not without serious resistance.

Developing a capacity to question the obvious, and also to recognize and articulate what it is we’re taking to be obvious in order that we might question it, seems like a crucial skill for scientists to cultivate.

But, as I think comes out quite clearly in Kate’s post, there are some intuitions we have that, even once we’ve recognized them, may be extremely difficult to subject to empirical test. This doesn’t mean that the questions connected in our heads to these intuitions are outside the realm of scientific inquiry, but it would be foolish not to notice that it’s likely to be extremely difficult to find good scientific answers to these questions. We need to be wary of the way our intuitions try to stack the evidential deck. We need to acknowledge that the very fact of our having strong intuitions doesn’t count as empirical evidence in favor of them. We need to come to grips with the possibility that our intuitions could be wrong — perhaps to the extent that we recognize that empirical results that seem to support our intuitions require extra scrutiny, just to be sure.

To do any less is to ask to be fooled, and that’s the outcome scientific knowledge-building is trying to avoid.

Reasonably honest impressions of #overlyhonestmethods.

I suspect at least some of you who are regular Twitter users have been following the #overlyhonestmethods hashtag, with which scientists have been sharing details of their methodology that are maybe not explicitly spelled out in their published “Materials and Methods” sections. And, as with many other hashtag genres, the tweets in #overlyhonestmethods are frequently hilarious.

I was interviewed last week about #overlyhonestmethods for the Public Radio International program Living On Earth, and the length of my commentary was more or less Twitter-scaled. This means some of the nuance (at least in my head), about questions like whether I thought the tweets were an overshare that could make science look bad, didn’t quite make it to the radio. Also, in response to the Living On Earth segment, one of the people with whom I regularly discuss the philosophy of science in the three-dimensional world shared some concerns about this hashtag in the hopes I’d say a bit more:

I am concerned about the brevity of the comments which may influence what one expresses. Second there is an ego component; some may try to outdo others’ funny stories, and may stretch things in order to gain a competitive advantage.

So, I’m going to say a bit more.

Should we worry that #overlyhonestmethods tweets share information that will make scientific practice look bad to (certain segments of) the public?

I don’t think so. I suppose this may depend on what exactly the public expects of scientists.

The people doing science are human. They are likely to be working with all kinds of constraints — how close their equipment is to the limits of its capabilities (and to making scary noises), how frequently lab personnel can actually make it into the lab to tend to cell cultures, how precisely (or not) pumping rates can be controlled, how promptly (or not) the folks receiving packages can get perishable deliveries to the researchers. (Notice that at least some of these limitations are connected to limited budgets for research … which maybe means that if the public finds them unacceptable, they should lobby their Congresscritters for increased research funding.) There are also constraints that come from the limits of the human animal: with a finite attention span, without a built-in chronometer or calibrated eyeballs, and with a need for sleep and possibly even recreation every so often (despite what some might have you think).

Maybe I’m wrong, but my guess is that it’s a good thing to have a public that is aware of these limitations imposed by the available equipment, reagents, and non-robot workforce.

Actually, I’m willing to bet that some of these limitations, and an awareness of them, are also really handy in scientific knowledge-building. They are departures from ideality that may help scientists nail down which variables in the system really matter in producing and controlling the phenomenon being studied. Reproducibility might be easy for a robot that can do every step of the experiment precisely every single time, but we really learn what’s going on when we drift from that. Does it matter if I use reagents from a different supplier? Can I leave the cultures to incubate a day longer? Can I successfully run the reaction in a lab that’s 10 °C warmer or 10 °C colder? Working out the tolerances helps turn an experimental protocol from a magic trick into a system where we have some robust understanding of what variables matter and of how they’re hooked to each other.

Does the 140-character limit mean #overlyhonestmethods tweets leave out important information, or that scientists will only use the hashtag to be candid about some of their methods while leaving others unexplored?

The need for brevity surely means that methods for which candor requires a great deal of context and/or explanation won’t be as well-represented as methods where one can be candid and pithy simultaneously. These tweeted glimpses into how the science gets done are more likely to be one-liners than shaggy-dog stories.

However, it’s hard to imagine that folks who really wanted to play along wouldn’t use a series of tweets, or maybe even write a blog post and use the hashtag to tweet a link to that post.

What if #overlyhonestmethods becomes a game of one-upmanship and puffery, in which researchers sacrifice honesty for laughs?

Maybe there’s some of this happening, and if the point of the hashtag is for researchers to entertain each other, maybe that’s not a problem. However, if other members of one’s scientific community were actually looking to those tweets to fill in some of the important details of methodology that are elided in the terse “Materials and Methods” section of a published research paper, I hope the tweeters would, when queried, provide clear and candid information on how they actually conducted their experiments. Correcting or retracting a tweet should be less of an ego blow than correcting or retracting a published paper, I hope (and indeed, as hard as it might be to correct or retract published claims, good scientists do it when they need to).

The whole #overlyhonestmethods hashtag raises the perennial question of why so much is elided in published “Materials and Methods” sections. Blame is usually put on limitations of space in the journals, but it’s also reasonable to acknowledge that sometimes details-that-turn-out-to-be-important are left out because the researchers don’t fully recognize their importance. Other times, researchers may have empirical grounds for thinking these details are important, but they don’t yet have a satisfying story to tell about why they should be.

By the way, I think it would be an excellent thing if, for research that is already published, #overlyhonestmethods included the relevant DOI. These tweets would be supplementary information researchers could really use.

What if researchers use #overlyhonestmethods to disclose ethically problematic methods?

Given that Twitter is a social medium, I expect other scientists in the community watching the hashtag would challenge those methods or chime in to explain just what makes them ethically problematic. They might also suggest less ethically problematic ways to achieve the same research goals.

The researchers on Twitter could, in other words, use the social medium to exert social pressure in order to make sure other members of their scientific community understand and live up to the norms of that community.

That outcome would strike me as a very good one.

* * * * *

In addition to the ever expanding collection of tweets about methods, #overlyhonestmethods also has links to some thoughtful, smart, and funny commentary on the hashtag and the conversations around it. Check it out!

Are scientists obligated to call out the bad work of other scientists? (A thought experiment)

Here’s a thought experiment. While it was prompted by intertubes discussions of evolutionary psychology and some of its practitioners, I take it the ethical issues are not limited to that field.

Say there’s an area of scientific research that is at a relatively early stage of its development. People working in this area of research see what they are doing as strongly connected to other, better established scientific fields, whether in terms of methodological approaches to answering questions, or the existing collections of empirical evidence on which they draw, or what have you.

There is general agreement within this community about the broad type of question that might be answered by this area of research and the sorts of data that may be useful in evaluating hypotheses. But there is also a good bit of disagreement among practitioners of this emerging field about which questions will be the most interesting (or tractable) ones to pursue, about how far one may reasonably extend the conclusions from particular bits of research, and even about methodological issues (such as what one’s null hypothesis should be).

Let me pause to note that I don’t think the state of affairs I’m describing would be out of the ordinary for a newish scientific field trying to get its footing. You have a community of practitioners trying to work out a reasonable set of strategies to answer questions about a bundle of phenomena that haven’t really been tackled by other scientific fields that are chugging merrily along. Not only do you not have the answers yet to the questions you’re asking about those phenomena, but you’re also engaged in building, testing, and refining the tools you’ll be using to try to answer those questions. You may share a commitment with others in the community that there will be a useful set of scientific tools (conceptual and methodological) to help you get a handle on those phenomena, but getting there may involve a good bit of disagreement about what tools are best suited for the task. And, there’s a possibility that in the end, there might not be any such tools that give you answers to the questions you’re asking.

Imagine yourself to be a member of this newish area of scientific research.*

What kind of obligation do you have to engage with other practitioners of this newish area of scientific research whose work you feel is not good? (What kind of “not good” are we talking about here? Possibly you perceive them to be drawing unwarranted conclusions from their studies, or using shoddy methodology, or ignoring empirical evidence that seems to contradict their claims. There’s no need to assume that they are being intentionally dishonest.) Do you have an obligation to take to the scientific literature to critique the shortcomings in their work? Do you have an obligation to communicate these critiques privately (e.g., in email correspondence)? Or is it ethically permissible not to engage with what you consider the bad examples of work in your emerging scientific field, instead keeping your head down and producing your own good examples of how to make progress in your emerging scientific field?

Do you think your obligations here are different than they might be if you were working in a well-established scientific field? (In a well-established scientific field, one might argue, the standards for good work and bad work are clearer; does this mean it takes less individual work to identify and rebut the bad work?)

Now consider the situation when your emerging scientific field is one that focuses on questions that capture the imagination not just of scientists trying to get this new field up and running, but also of the general public — to the extent that science writers and journalists are watching the output of your emerging scientific field for interesting results to communicate to the public. How does the fact that the public is paying some attention to your newish area of scientific research bear on what kind of obligation you have to engage with the practitioners in your field whose work you feel is not good?

(Is it fair that a scientist’s obligations within his or her scientific field might shift depending on whether the public cares at all about the details of the knowledge being built by that scientific field? Is this the kind of thing that might drive scientists into more esoteric fields of research?)

Finally, consider the situation when your emerging field of science has captured the public imagination, and when the science writers and journalists seem to be getting most of their information about what your field is up to and what knowledge you have built from the folks in your field whose work you feel is not good. Does this place more of an obligation upon you to engage with the practitioners doing not-good work? Does it obligate you to engage with the science writers and journalists to rebut the bad work and/or explain what is required for good scientific work in your newish field? If you suspect that science writers and journalists are acting, in this case, to amplify misunderstandings or to hype tempting results that lack proper evidential support, do you have an obligation to communicate directly to the public about the misunderstandings and/or about what proper evidential support looks like?

A question I think can be asked at every stage of this thought experiment: Does the community of practitioners of your emerging scientific field have a collective responsibility to engage with the not-so-good work, even if any given individual practitioner does not? And, if the answer to this question is “yes”, how can the community of practitioners live up to that obligation if no individual practitioner is willing to step up and do it?

_____
* For fun, you can also consider these questions from the point of view of a member of the general public: What kinds of obligations do you want the scientists in this emerging field to recognize? After all, as a member of the public, your interests might diverge in interesting ways from those of a scientist in this emerging field.

Wikipedia, the DSM, and Beavis.

There are some nights that Wikipedia raises more questions for me than it answers.

The other evening, reminiscing about some of the background noise of my life (viz. “Beavis and Butt-head”) when I was in graduate school, I happened to look up Cornholio. After I got over my amusement that its first six letters were enough to put my desired search target second on the list of Wikipedia’s suggestions for what I might be looking for (right between cornhole and Cornholme), I read the entry and got something of a jolt at its diagnostic tone:

After consuming large amounts of sugar and/or caffeine, Beavis sometimes undergoes a radical personality change, or psychotic break. In one episode, “Holy Cornholio”, the transformation occurred after chewing and swallowing many pain killer pills. He will raise his forearms in a 90-degree angle next to his chest, pull his shirt over his head, and then begin to yell or scream erratically, producing a stream of gibberish and strange noises, his eyes wide. This is an alter-ego named ‘Cornholio,’ a normally dormant persona. Cornholio tends to wander aimlessly while reciting “I am the Great Cornholio! I need TP for my bunghole!” in an odd faux-Spanish accent. Sometimes Beavis will momentarily talk normally before resuming the persona of Cornholio. Once his Cornholio episode is over, Beavis usually has no memory of what happened.

Regular viewers of “Beavis and Butt-head” probably suspected that Beavis had problems, but I’m not sure we knew that he had a diagnosable problem. For that matter, I’m not sure we would have classified moments of Cornholio as falling outside the broad umbrella of Things Beavis Does to Make Things Difficult for Teachers.

But, the Wikipedia editors seem to have taken a shine to the DSM (or other relevant literature on psychiatric conditions), and to have confidence that the behavior Beavis displays here is properly classified as a psychotic break.

Here, given my familiarity with the details of the DSM (hardly any), I find myself asking some questions:

  • Was the show written with the intention that the Beavis-to-Cornholio transformation be seen as a psychotic break?
  • Is it possible to give a meaningful psychiatric diagnosis of a cartoon character?
  • Does a cartoon character need a substantial inner life of some sort for a psychiatric diagnosis of that cartoon character to make any sense?
  • If psychiatric diagnoses are based wholly on outward behavioral manifestations rather than on the inner stuff that might be driving that behavior (as may be the case if it’s really possible to apply diagnostic criteria to Beavis), is this a good reason for us to be cautious about the potential value of these definitions and diagnostic criteria?
  • Is there a psychology or psychiatry classroom somewhere that is using clips of the Beavis-to-Cornholio transformation in order to teach students what a psychotic break is?

I’m definitely uncomfortable that this fictional character has a psychiatric classification thrust upon him so easily — though at least, as a fictional character, he doesn’t have to deal with any actual stigma associated with such a psychiatric classification. And, I think perhaps my unease points to a worry I have (and that Katherine Sharpe also voices in her book Coming of Age on Zoloft) about the project of assembling checklists of easy-to-assess symptoms that seem detached from the harder-to-assess conditions in someone’s head, or in his environment, that are involved in causing the symptoms in the first place.

Possibly Wikipedia’s take on Beavis is simply an indication that the relevant Wikipedia editors like the DSM a lot more than I do (or that they intended their psychiatric framing of Beavis ironically — and if so, well played, editors!). But possibly it reflects a larger society that is much more willing than I am to put behaviors into boxes, regardless of the details (or even existence) of the inner life that accompanies that behavior.

I would welcome the opinions and insight of psychiatrists, psychologists, and others who run with that crowd on this matter.

Whither mentoring?

Drugmonkey takes issue with the assertion that mentoring is dead*:

Seriously? People are complaining that mentoring in academic science sucks now compared with some (unspecified) halcyon past?

Please.

What should we say about the current state of mentoring in science, as compared to scientific mentoring in days of yore? Here are some possibilities:

Maybe there has been a decline in mentoring.

This might be because mentoring is not incentivized in the same way, or to the same degree, as publishing, grant-getting, etc. (Note, though, that some programs require evidence of successful mentoring for faculty promotion. Note also that some funding mechanisms require that the early-career scientist being funded have a mentor.)

Or it might be because no one trained the people who are expected to mentor (such as PIs) in how to mentor. (In this case, though, we might take this as a clue that the mentoring these PIs received in days of yore was not so perfect after all.)

Or, it might be that mentoring seems to PIs like a risky move given that it would require too much empathetic attachment with the trainees who are also one’s primary source of cheap labor, and whose prospects for getting a job like the PI’s are perhaps nowhere near as good as the PI (or the folks running the program) have led the trainees to believe.

Or, possibly PIs are not mentoring so well because the people they are being asked to mentor are increasingly diverse and less obviously like the PIs.

Maybe mentoring is no worse than it has ever been.

Perhaps it has always been a poorly defined part of the advisor’s job duties, not to mention one for which hardly anyone gets formal training. Moreover, the fact that it may depend on inclination and personal compatibility might make it more chancy than things like joining a lab or writing a dissertation.

Maybe mentoring has actually gotten better than it used to be.

It’s even possible that increased diversity in training populations might tend to improve mentoring by forcing PIs to be more conscious of their interactions (since they recognize that the people they are mentoring are not just like them). Similarly, awareness that trainees are facing a significantly different employment landscape than the one the mentor faced might help the mentor think harder about what kind of advice could actually be useful.

Here, I think that we might also want to recognize the possibility that what has changed is not the level of mentoring being delivered, but rather the expectations the trainees have for what kind of mentoring they should receive.

Pulling back from the question of whether mentoring has gotten better, worse, or stayed the same, there are two big issues that prevent us from being able to answer that question. One is whether we can get our hands on sensible empirical data to make anything like an apples-to-apples comparison of mentoring in different times (or, for that matter, in different places). The other is whether we’re all even talking about the same thing when we’re holding forth about mentoring and its putative decline.

Let’s take the second issue first. What do we have in mind when we say that trainees should have mentors? What exactly is it that they are supposed to get out of mentoring?

Vivian Weil [1], among others, points us to the literary origin of the term mentor, and the meanings this origin suggests, in the relationship between the characters Mentor and Telemachus in Homer’s epic poem, the Odyssey. Telemachus was the son of Odysseus; his father was off fighting the Trojan war, and his mother was busy fending off suitors (which involved a lot of weaving and unweaving), so the kid needed a parental surrogate to help him find his way through a confusing and sometimes dangerous world. Mentor took up that role.**

At the heart of mentoring, Weil argues, is the same kind of commitment to protect the interests of someone just entering the world of your discipline, and to help the mentee to develop skills sufficient to take care of himself or herself in this world:

All the activities of mentoring, but especially the nurturing activities, require interacting with those mentored, and so to be a mentor is to be involved in a relationship. The relationships are informal, fully voluntary for both members, but at least initially and for some time thereafter, characterized by a great disparity of experience and wisdom. … In situations where neophytes or apprentices are learning to “play the game”, mentors act on behalf of the interests of these less experienced, more vulnerable parties. (Weil, 473)

In the world of academic science, the guidance a mentor might offer would then be focused on the particular challenges the mentee is likely to face in graduate school, the period in which one is expected to make the transition from being a learner of scientific knowledge to being a maker of new knowledge:

On the traditional model, the mentoring relationship is usually thought of as gradual, evolving, long-term, and involving personal closeness. Conveying technical understanding and skills and encouraging investigative efforts, the mentor helps the mentee move through the graduate program, providing feedback needed for reaching milestones in a timely fashion. Mentors interpret the culture of the discipline for their mentees, and help them identify good practices amid the complexities of the research environment. (Weil, 474)

A mentor, in other words, is a competent grown-up member of the community in which the mentee is striving to become a grown-up. The mentor understands how things work, including what kinds of social interactions are central to conducting research, critically evaluating knowledge claims, and coordinating the efforts of members of the scientific community more generally.

Weil emphasizes that the role of mentor, understood in this way, is not perfectly congruent with the role of the advisor:

While mentors advise, and some of their other activities overlap with or supplement those of an advisor, mentors should not be confused with advisors. Advising is a structured role in graduate education. Advisors are expected to perform more formal and technical functions, such as providing information about the program and degree requirements and periodic monitoring of advisees’ progress. The advisor may also have another structured role, that of research (dissertation) director, for advisors are often principal investigators or laboratory directors for projects on which advisees are working. In the role of research director, they “may help students formulate research projects and instruct them in technical aspects of their work such as design, methodology, and the use of instrumentation.” Students sometimes refer to the research or laboratory director as “boss”, conveying an employer/employee relationship rather than a mentor/mentee relationship. It is easy to see that good advising can become mentoring and, not surprisingly, advisors sometimes become mentors. Nevertheless, it is important to distinguish the institutionalized role of advisor from the informal activities of a mentor. (Weil, 474)

Mentoring can happen in an advising relationship, but the evaluation an advisor needs to do of the advisee may be in tension with the kind of support and encouragement a mentor should give. The advisor might have to sideline an advisee in the interests of the larger research project; the mentor would try to prioritize the mentee’s interests.

Add to this that the mentoring relationship is voluntary to a greater degree than the advising relationship (where you have to be someone’s advisee to get through), and the interaction is personal rather than strictly professional.

Among other things, this suggests that good advising is not necessarily going to achieve the desired goal of providing good mentoring. It also suggests that it’s a good idea to seek out multiple mentors (e.g., so in situations where an advisor cannot be a mentor due to the conflicting duties of the advisor, another mentor without these conflicts can pick up the slack).

So far, we have a description of the spirit of the relationship between mentor and mentee, and a rough idea of how that relationship might advance the welfare of the mentee, but it’s not clear that this is precise enough that we could use it to assess mentoring “in the wild”.

And surely, if we want to do more than just argue based on subjective anecdata about how mentoring for today’s scientific trainees compares to the good old days, we need to find some way to be more precise about the mentoring we have in mind, and to measure whether it’s happening. (Absent a time machine, or some stack of data collected on mentoring in the halcyon past, we probably have to acknowledge that we just don’t know how past mentoring would have measured up.)

A faculty team from the School of Nursing at Johns Hopkins University, led by Ronald A. Berk [2], grappled with the issue of how to measure whether effective mentoring was going on. Here, the mentoring relationships in question were between more junior and more senior faculty members (rather than between graduate students and faculty members), and the impetus for developing a reliable way to measure mentoring effectiveness was the fact that evidence of successful mentoring activities was a criterion for faculty promotion.

Finding no consistent definition of mentoring in the literature on medical faculty mentoring programs, Berk et al. put forward this one:

A mentoring relationship is one that may vary along a continuum from informal/short-term to formal/long-term in which faculty with useful experience, knowledge, skills, and/or wisdom offers advice, information, guidance, support, or opportunity to another faculty member or student for that individual’s professional development. (Note: This is a voluntary relationship initiated by the mentee.) (Berk et al., 67)

Then, they spelled out central responsibilities within this relationship:

[F]aculty must commit to certain concrete responsibilities for which he or she will be held accountable by the mentees. Those concrete responsibilities are:

  • Commits to mentoring
  • Provides resources, experts, and source materials in the field
  • Offers guidance and direction regarding professional issues
  • Encourages mentee’s ideas and work
  • Provides constructive and useful critiques of the mentee’s work
  • Challenges the mentee to expand his or her abilities
  • Provides timely, clear, and comprehensive feedback to mentee’s questions
  • Respects mentee’s uniqueness and his or her contributions
  • Appropriately acknowledges contributions of mentee
  • Shares success and benefits of the products and activities with mentee

(Berk et al., 67)

These were then used to construct a “Mentorship Effectiveness Scale” that mentees could use to share their perceptions of how well their mentors did on each of these responsibilities.

Here, one might worry that the mentee’s perception of the mentor’s effectiveness in each of these areas diverges from how effective the mentor actually is. Still, tracking the perceptions of the mentees with the instrument developed by Berk et al. provides some kind of empirical data. In discussions about whether mentoring is getting better or worse, such data might be useful.

And, if this data isn’t enough, it should be possible to work out strategies to get the data you want: Survey PIs to see what kind of mentoring they want to provide and how this compares to what kind of mentoring they feel able to provide. (If there are gaps here, follow-up questions might explore the perceived impediments to delivering certain elements of mentoring.) Survey the people running graduate programs to see what kind of mentoring they think they are (or should be) providing and what kind of mechanisms they have in place to ensure that if it doesn’t happen informally between the student and the PI, it’s happening somewhere.

To the extent that successful mentoring is already linked to tangible career rewards in some places, being able to make a reasonable assessment of it seems appropriate.

It’s possible that making it a standard thing to evaluate mentoring and to tie it to tangible career rewards (or penalties, if one does an irredeemably bad job of it) might help focus attention on mentoring as an important thing for grown-up members of the scientific community to do. This might also lead to more effort to help people learn how to mentor effectively and to offer support and remediation for people whose mentoring skills are not up to snuff.

But, I have a worry (not a huge one, but not nanoscale either). Evaluation of effective mentoring seems to rely on breaking out particular things the mentor does for the mentee, or particular kinds of interactions that take place between the two. In other words, the assessment tracks measurable proxies for a more complicated relationship.

That’s fine, but there’s a risk that a standardized assessment might end up reducing the “mentorship” that mentors offer, and that mentees seek, to these proxies. Were this to happen, we might lose sight of the broader, richer, harder-to-evaluate thing that mentoring can be — an entanglement of interests, a transmission of wisdom, and of difficult questions, and of hopes, and of fears, in what boils down to a personal relationship based on a certain kind of care.

The thing we want the mentorship relationship to be is not something that you could force two people to be in — any more than we could force two people to be in love. We feel the outcomes are important, but we cannot compel them.

And obviously, the assessable outcomes that serve as proxies for successful mentoring are better than nothing. Still, it’s not unreasonable for us to hope for more as mentees, nor to try to offer more as mentors.

After all, having someone on the inside of the world of which you are trying to become a part, someone who knows the way and can lead you through, and someone who believes in you and your potential even a little more than you believe in yourself, can make all the difference.

_____
*DrugMonkey must know that my “Ethics in Science” class will be discussing mentoring this coming week, or else he’s just looking for ways to distract me from grading.

**As it happened, Mentor was actually Athena, the goddess of wisdom and war, in disguise. Make of that what you will.

[1] Weil, V. (2001) Mentoring: Some Ethical Considerations. Science and Engineering Ethics. 7 (4): 471-482.

[2] Berk, R. A., Berg, J., Mortimer, R., Walton-Moss, B., and Yeo, T. P. (2005) Measuring the Effectiveness of Faculty Mentoring Relationships. Academic Medicine. 80: 66-71.

Lads’ mags, sexism, and research in psychology: an interview with Dr. Peter Hegarty (part 2).

In this post, I continue my interview with Dr. Peter Hegarty, a social psychologist at the University of Surrey and one of the authors of ” ‘Lights on at the end of the party’: Are lads’ mags mainstreaming dangerous sexism?”, which was published in The British Journal of Psychology in December. My detailed discussion of that paper is here. The last post presented part 1 of our interview, in which Dr. Hegarty answered questions about the methodology of this particular research, as well as about some of the broader methodological differences between research in psychology and in sciences that are focused on objects of study other than humans.

Janet Stemwedel: It’s been pointed out that the university students who seem to be the most frequent subjects of psychological research are WEIRD (Western, Educated, Industrialized, Rich, and Democratic). Is the WEIRDness of university students as subjects in this research something that should make us cautious about the strength of the conclusions we draw? Or are university students actually a reasonably appropriate subject pool from the point of view of exploring how lads’ mags work?

Peter Hegarty: According to the historian Kurt Danziger in his book Constructing the Subject, students became an unmarked “normative” subject population for psychologists, at least in the United States, between the world wars. Since then, criticisms of over-reliance on student samples have been common (such as those of Quinn McNemar in the 1940s, or David Sears in the 1980s). Within the history of this criticism, perhaps what is most distinct about the recent argument about WEIRDness is that it draws on the developments in cultural psychology of the last 20 years or so. For this specific study, our rationale for studying young people on a campus was not only convenience; they are also the target market for these magazines, by virtue of their age, and by virtue of possessing the disposable income to purchase them.

May I take the time to offer a slightly broader perspective on the problem of under- and over-representation of social groups in psychology? The issue is not simply one of who gets included and who does not. This is because groups can be disempowered and science compromised by being erased (as the WEIRD criticism presumes), and groups can be disempowered when they are consistently located within the psychologists’ gaze – as in Foucaultian disciplinary power. African-Americans are oversampled in the US literature on forensic psychology, but that literature is not anti-racist; it is largely based on a “deficit” model of race (Carter & Forsythe, 2007). The issue is not simply one of inclusion or exclusion, but one of how inclusion happens, as sociologist Steven Epstein’s work on inclusive paradigms in medicine nicely shows.

In other experiments and content analyses, my colleagues and I have found that people spontaneously explain group differences by attending to lower-power groups more of the time. In our own research we have observed this pattern in scientists’ publications and in explanations produced in the lab with regard to race, gender, and sexuality, for example (Hegarty & Buechel, 2006; Hegarty & Pratto, 2004). On the face of it, this might lead to greater stereotyping of the lower-power “marked” group. Indeed, as Susanne Bruckmüller’s work on linguistic framing subtly shows, once a group is positioned as “the effect to be explained” in an account of group differences, people tend to infer that the group has less power (Bruckmüller & Abele, 2010). Our work suggests that to trouble the “normative” status that WEIRD people occupy in our ontologies, inclusion is necessary but not sufficient. It’s also important to reframe our questions about difference to think concretely about normative groups. In the case of our lads’ mags research, we were heartened that people were prompted to reframe questions about the widespread problem of violence against women away from the small category of convicted rapists, to ask broader questions about how such violence is normalized.

JS: A lot of scientists seem to have a love/hate relationship with mass media. They want the public to understand their research and why it’s interesting and important, but media coverage sometimes gets the details badly wrong, or obliterates the nuance.  And, given the subject matter of your research (which the average person might reasonably connect to his or her own concerns more easily than anything we might learn about the Higgs boson), it seems like misunderstandings of what the research means could get amplified pretty quickly.  What has your experience been as far as the media coverage of your research?  Are there particular kinds of issues you’d like the public to grasp better when they read or hear about this kind of research?

PH: Your question touches on the earlier point about the difference between the human and natural sciences. Our work is caught up in “looping effects” as people interpret it for themselves, but the Higgs boson doesn’t care if the folks in CERN discover it or not. (I think, I’m no expert on sub-atomic physics!) Although some research that I released last year on sexist language got good coverage in the media (Hegarty, Watson, Fletcher & McQueen, 2011), the speed and scale of the reaction to the Horvath et al. (2011) paper was a new experience for me, so I am learning about the media as I go.

There is no hard and fast boundary between “the media” and “the public” who are ‘influenced’ by that media anymore; I’m not sure there ever was one. The somewhat ‘viral’ reaction to this work on social networking sites such as Twitter was visibly self-correcting in ways that don’t fit with social scientists’ theories that blame the media for beguiling the public. Some journalists misunderstood the procedures of Experiment 1 in our study, and it was misdescribed in some media sources. But on Twitter, folks were redirecting those who were reproducing that factual error to the Surrey website. Overall, watching the Twitter feeds reminded me most of the experience of giving a class of students an article to discuss and watching a very useful conversation emerge about what the studies had hypothesized, what they had found, how much you might conclude from the results, and what the policy implications might be. I am somewhat more optimistic about the affordances of social media for education as a result of this experience.

JS: Given the connection between your research questions in this research and actual features of our world that might matter to us quite a lot (like how young men view and interact with the women with whom they share a world), it seems like ultimately we might want to *use* what we learn from the research to make things better, rather than just saying, “Huh, that’s interesting.”  What are the challenges to moving from description to prescription here?  Are there other “moving parts” of our social world you think we need to understand better to respond effectively to what we learn from studies like these?

PH: Related to what I’ve said above, I would like people to see the research as a “red flag” about the range and character of media that young people now read, and which are considered “normal.” There are now numerous anecdotes on the web of people who have been prompted by this research to look at a lads’ mag for the first time – and been surprised or shocked by what they see. We are also in contact with some sex educators about how this work might be used to educate men for a world in which this range of media exists. Precisely because we think this research might have relevance for a broad range of people who care that people should have pleasure, intimacy, and sex without violence, bullying, and hatred, we have suggested that it should prompt investment in sex education rather than censorship.

In so doing, we are adopting an ‘incrementalist’ approach to people’s intelligence about sex and sexual literacy. Carol Dweck’s work shows that children and young people who believe their intelligence to be a fixed ‘entity’ do not fare as well academically as those who believe their intelligence might be something ‘incremental’ that can be changed through effort. Censorship approaches seem to us to be based on fear, and to assume a rather fixed limit to the possibilities of public discourse about sex. We do not make those assumptions, but we fear that they can become self-fulfilling prophecies.

JS: How do you keep your prescriptive hunches from creeping into the descriptive project you’re trying to do with your research?

PH: I’m not sure that it is possible or desirable to exclude subjectivity from science; your last question obliged me to move from description to prescription. It is sometimes striking how much many scientists want to be ‘above politics’ and influence policy, to advocate and remain value-neutral, to change the world but not to intervene, etc. My thinking on this matter borrows more from Sandra Harding’s view of ‘strong objectivity,’ and particularly her idea that the science we get is affected by the range of people included in its production and the forms of social relationships in which they participate. I also think that Steven Shapin’s book A Social History of Truth is a useful albeit distal explanation of why the question of subjectivity in science is often seen as an affront to honour and the opposite of reasoned dispassionate discussion. In the UK, there is now an obligation on scientists to engage non-academic publics by reporting ‘impact’ summaries to the government as part of national exercises for documenting research excellence. However, this policy can overlook the importance of two-way dialogue between academic and non-academic audiences about how we create different kinds of knowledge for different kinds of purposes. For those reasons, I’m grateful for the opportunity to participate in a more dialogical forum about science and ethics like this one.

Bibliography

Bruckmüller, S., & Abele, A. (2010). Comparison focus in intergroup comparisons: Who we compare to whom influences who we see as powerful and agentic. Personality and Social Psychology Bulletin, 36, 1424-1435.

Carter, R.T., & Forsythe, J.M. (2007). Examining race and culture in psychology journals: The case of forensic psychology. Professional Psychology: Research and Practice, 38, 133-142.

Danziger, K. (1990). Constructing the Subject: Historical Origins of Psychological Research. Cambridge, UK: Cambridge University Press.

Dweck, C. (2000). Self-theories: Their Role in Motivation, Personality and Development. Psychology Press.

Epstein, S. (2007). Inclusion: The Politics of Difference in Medical Research. Chicago: University of Chicago Press.

Foucault, M. (1978). Discipline and Punish: The Birth of the Prison. Trans. Alan Sheridan. New York: Random House.

Hacking, I. (1995). The looping effects of human kinds. In Dan Sperber, David Premack and Ann James Premack (Eds.), Causal Cognition: A Multi-Disciplinary Debate (pp. 351-383). Oxford, UK: Oxford University Press.

Harding, S. (1986). The Science Question in Feminism. Ithaca, NY: Cornell University Press.

Hegarty, P., & Buechel, C. (2006). Androcentric reporting of gender differences in APA journals: 1965-2004. Review of General Psychology, 10, 377-389.

Hegarty, P., & Pratto, F. (2004). The differences that norms make: Empiricism, social constructionism, and the interpretation of group differences. Sex Roles, 50, 445-453.

Hegarty, P.J., Watson, N., Fletcher, L., & McQueen, G. (2011). When gentlemen are first and ladies are last: Effects of gender stereotypes on the order of romantic partners’ names. British Journal of Social Psychology, 50, 21-35.

Horvath, M.A.H., Hegarty, P., Tyler, S., & Mansfield, S. (2011). “Lights on at the end of the party”: Are lads’ mags mainstreaming dangerous sexism? British Journal of Psychology. Available from http://onlinelibrary.wiley.com/doi/10.1111/j.2044-8295.2011.02086.x/abstract

McNemar, Q. (1940). Sampling in psychological research. Psychological Bulletin, 37, 331-365.

Sears, D. O. (1986). College sophomores in the laboratory: Influences of a narrow data base on social psychology’s view of human nature. Journal of Personality and Social Psychology, 51, 515-530.

Shapin, S. (1994). A Social History of Truth: Civility and Science in Seventeenth-Century England. Chicago: University of Chicago Press.

Lads’ mags, sexism, and research in psychology: an interview with Dr. Peter Hegarty (part 1).

Back in December, there was a study that appeared in The British Journal of Psychology that got a fair amount of buzz. The paper (Horvath, M.A.H., Hegarty, P., Tyler, S. & Mansfield, S., ” ‘Lights on at the end of the party’: Are lads’ mags mainstreaming dangerous sexism?” British Journal of Psychology. DOI:10.1111/j.2044-8295.2011.02086.x) looked at the influence that magazines aimed at young men (“lads’ mags”) might have on how the young people who read them perceive their social reality. Among other things, the researchers found that subjects in the study judged the descriptions of women given by convicted sex offenders and those given by lads’ mags to be well nigh indistinguishable, and that when a quote was identified as coming from a lads’ mag (no matter what its actual source), subjects were more likely to say that they identified with the view it expressed than when the same quote was identified as coming from a rapist.

I wrote about the details of this research in a post on my other blog.

One of the authors of the study, Dr. Peter Hegarty, is someone I know a little from graduate school (as we were in an anthropology of science seminar together one term). He was gracious enough to agree to an interview about this research, and to answer some of my broader questions (as a physical scientist turned philosopher) about what doing good science looks like to a psychologist. Owing to its length, I’m presenting the interview in two posts, this one and one that will follow it tomorrow.

Janet Stemwedel: Is there something specific that prompted this piece of research — a particular event, or the Nth repetition of a piece of “common wisdom” that made it seem like it was time to interrogate it?  Or is this research best understood as part of a broader project (perhaps of identifying pieces of our social world that shape our beliefs and attitudes)?

Peter Hegarty: We came to this research for different reasons. Miranda [Horvath] had been working more consistently on the role of lads’ mags in popular culture than I had been (see Coy & Horvath, 2011). Prompted by another student’s interests, I had published a very short piece earlier this year on the question of representations of ‘heteroflexible’ women in lads’ mags (Hegarty & Buechel, 2011). The two studies reported in Horvath, Hegarty, Tyler & Mansfield (2011) were conducted as Suzannah Tyler and Sophie Mansfield’s M.Sc. dissertations in Forensic Psychology, a course provided jointly by the University of Surrey and Broadmoor Hospital. Miranda and I took the lead on writing up the research after Miranda moved to Middlesex University in 2010.

JS: When this study was reported in the news, as the Twitters were lighting up with discussion about this research, some expressed concern that the point of the research was to identify lads’ mags as particularly bad (compared to other types of media), or as actually contributing to rapes.  Working from the information in the press release (because the research paper wasn’t quite out yet), there seemed to be some unclarity about precisely what inferences were being drawn from the results and (on the basis of what inferences people thought you *might* be drawing) about whether the research included appropriate controls — for example, quotes about women from The Guardian, or from ordinary-men-who-are-not-rapists.  Can you set us straight on what the research was trying to find out and on what inferences it does or does not support?  And, in light of the hypotheses you were actually testing, can you discuss the issue of experimental controls?

PH: Our research was focused on lads’ mags – rather than other media – because content analysis research had shown that those magazines were routinely sexist, operated in an advice-giving mode, and often dismissed their social influence. This is not the case – as far as I know – with regard to The Guardian. So there was a rationale, based on prior research, to focus on lads’ mags. We hoped to test our hypothesis that lads’ mags might be normalizing hostile sexism. This idea hung on two matters: is there an overlap in the discourse of lads’ mags and something that most people would accept as hostile sexism? Does that content appear more acceptable to young men when it appears to come from a lads’ mag? The two studies mapped onto these goals. In one, we found that young women and men couldn’t detect the source of a quote as coming from a convicted rapist’s interview or a lads’ mag. In another, young men identified more with quotes that they believed to have come from lads’ mags rather than convicted rapists.

JS: While we’re on the subject of controls, it strikes me that good experimental design in psychological research is probably different in some interesting ways from good experimental design in, say, chemistry.  What are some misconceptions those of us who have more familiarity with the so-called “hard sciences” have about social science research?  What kind of experimental rigor can you achieve without abandoning questions about actual humans-in-the-world?

PH: You are right that these sciences might have different ontologies, because psychology is a human science. There are a variety of perspectives on this, with scholars such as Ian Hacking arguing for a separate ontology of the human sciences and more postmodern authors such as Bruno Latour arguing against distinctions between humans and things. Generally, I would be loath to describe differences between the sciences in terms of the metaphor of “hardness,” because the term is loaded with implicature. First, psychology is a potentially reflexive science about people, conducted by people, and is characterized by what the philosopher Ian Hacking calls “looping effects”: people’s thoughts, feelings and behaviours are themselves influenced by psychological theories about them. Second, measurement in psychology is more often dependent on normalization and relative judgment (as in an IQ test, or a 7-point Likert item on a questionnaire, for example). Third, there is a lot of validity to the Foucaultian argument that the “psy- disciplines” have often been used in the service of the state, to divide people into categories of “normal” and “abnormal” people, so that different people might be treated very differently without offending egalitarian ideologies. Much of clinical psychology and testing takes this form.
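
To make the normalization point concrete: a raw test score becomes an “IQ” only by being rescaled against the distribution of other people’s scores. Here is a minimal Python sketch; the rescaling to a mean of 100 and a standard deviation of 15 is the real convention, but the raw scores are invented for illustration.

    from statistics import mean, stdev

    def iq_scale(raw_scores):
        """Rescale raw scores to the conventional IQ metric (mean 100, SD 15)."""
        m, s = mean(raw_scores), stdev(raw_scores)
        return [100 + 15 * (x - m) / s for x in raw_scores]

    raw = [31, 42, 38, 55, 47, 40, 36, 49]  # invented raw test scores
    print([round(score) for score in iq_scale(raw)])  # same people, now judged relative to each other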

Critics of psychology often stop there. By so doing, they overlook the rich tradition within psychology of generating knowledge that troubles forms of normalization, by suggesting that the distinction between the “normal” and the “abnormal” is not as firm as common sense suggests. Studies in this tradition might include Evelyn Hooker’s (1957) demonstration – from that dark era when homosexuality was considered a mental illness – that there are no differences in the responses of gay and straight men to personality tests. One might also include David Rosenhan’s (1973) study in which ordinary people managed to deceive psychiatrists that they were schizophrenic. A third example might be stereotype threat research (e.g., by Claude Steele and Joshua Aronson, 1995), which shows that the underperformance of African Americans on some standardized tests reflects not genuine ability, but a situational constraint introduced by testing conditions. Like these studies, we would hope ours would trouble people’s sense of what they take for granted about differences between people. In particular, we hope that people will reconsider what they think they know about “extreme” sexism – the kind that leads to incarceration – and “normal” sexism, the kind that is now typical for young men to consume. I would urge academic critics of psychology – particularly those that focus on its complicity with Foucaultian disciplinary power, and the power of the state more generally – to develop more critiques that can account for such empirical work.

For the last half a century, “rigor” in empirical psychology has been organized by the language of validity and reliability of measurement (Cronbach & Meehl, 1955). Psychologists also tend to be Popperians, who construct “falsifiable” theories and use Fisherian inferential statistics to construct experiments that afford the possibility of falsification. However, inferential norms are changing in the discipline for three reasons. First, the rise of neuroscience has led to a more inductive form of inference in which mapping and localization play a greater role in scientific explanation. Second, social psychologists are increasingly engaging with structural equation modelling and offering confirmatory models of social processes. Third, there is “statistical reform” in psychology, away from the ritual of statistical significance testing toward making variability more transparent through the reporting of confidence intervals, effect sizes, and exact significance values. See Spellman (2012) for one very recent discussion of what’s happening within the genre of scientific writing in psychology around retaining rigor and realism in psychological science.
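
For readers who want to see what this “statistical reform” style of reporting looks like in practice, here is a minimal Python sketch that computes an effect size (Cohen’s d) and a 95% confidence interval for a two-group mean difference, rather than a bare verdict of “significant.” The function and the numbers are my own illustrative assumptions, not an analysis from Dr. Hegarty’s paper.

    import math
    from scipy import stats

    def d_and_ci(group1, group2, alpha=0.05):
        """Cohen's d, plus a (1 - alpha) CI for the raw mean difference
        using the pooled-variance t interval."""
        n1, n2 = len(group1), len(group2)
        m1, m2 = sum(group1) / n1, sum(group2) / n2
        v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
        v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
        sp = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))  # pooled SD
        d = (m1 - m2) / sp
        se = sp * math.sqrt(1 / n1 + 1 / n2)
        t_crit = stats.t.ppf(1 - alpha / 2, df=n1 + n2 - 2)
        diff = m1 - m2
        return d, (diff - t_crit * se, diff + t_crit * se)

    # Invented ratings for two conditions, just to show the reporting format.
    d, (lo, hi) = d_and_ci([4.1, 3.8, 4.5, 3.9, 4.2], [3.2, 3.0, 3.6, 2.9, 3.4])
    print(f"d = {d:.2f}, 95% CI for the mean difference: [{lo:.2f}, {hi:.2f}]")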

JS: One thing that struck me in reading the paper was that instruments have been developed to measure levels of sexism.  Are these measures well-accepted within the community of research psychologists?  (I am guessing that if the public even knew about them, they would be pretty controversial in some quarters … maybe the very quarters whose denizens would get high scores on these measures!)

PH: We used two well-established measures, the Ambivalent Sexism Inventory and the AMMSA, and one measure of endorsement of lads’ mags that we developed ourselves for the study. We describe in the article some of the previous findings of other researchers who have used these scales to examine individual differences in responses to vignettes about sexual violence. We feel more confident of the measure we developed ourselves because it was highly correlated with all other measures of sexism and because it was highly correlated with men’s identification with quotes from rapists and from lads’ mags. In other words, we followed the logic of psychologists such as Lee Cronbach, Paul Meehl and Donald Campbell for establishing and developing the “construct validity” of the empirical scales.
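
For readers unfamiliar with this logic, here is a minimal Python sketch of a construct-validity check of the kind described: correlating scores on a newly developed scale with scores on established measures of the same construct. The variable names and scores here are hypothetical, not data from the study.

    from scipy.stats import pearsonr

    # Hypothetical per-participant scores: a newly developed scale and two
    # established measures of the same construct.
    new_scale = [2.1, 3.4, 1.8, 4.0, 2.9, 3.7, 1.5, 3.1]
    asi = [2.3, 3.6, 1.5, 4.2, 3.0, 3.5, 1.8, 2.9]    # established measure 1
    ammsa = [2.0, 3.8, 1.7, 3.9, 2.7, 3.9, 1.4, 3.3]  # established measure 2

    for name, measure in [("ASI", asi), ("AMMSA", ammsa)]:
        r, p = pearsonr(new_scale, measure)
        print(f"new scale vs. {name}: r = {r:.2f}, p = {p:.3f}")

High correlations with the established measures are what license treating the new scale as tapping the same construct.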

* * * * *

Tomorrow, in the second part of my interview with Peter Hegarty, we discuss the WEIRD-ness of college students as subjects for psychological research, how to go from description to prescription, and what it’s like for scientists to talk about their research with the media in the age of Twitter. Stay tuned!

Bibliography

Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281-302.

Coy, M., & Horvath, M.A.H. (2011). ‘Lads’ mags’, young men’s attitudes towards women and acceptance of myths about sexual aggression. Feminism & Psychology, 21, 144-150.

Foucault, M. (1978). Discipline and Punish: The Birth of the Prison. Trans. Alan Sheridan. New York, Random House.

Hacking, I. (1995). The looping effects of human kinds. In Dan Sperber, David Premack and Ann James Premack (Eds.), Causal Cognition: A Multi-Disciplinary Debate (pp. 351-383). Oxford, UK: Oxford University Press.

Hegarty, P., & Buechel, C. (2011). “What blokes want lesbians to be”: On FHM and the socialization of pro-lesbian attitudes among heterosexual-identified men. Feminism & Psychology, 21, 240-247.

Hooker, E. (1957). The adjustment of the male overt homosexual. Journal of Projective Techniques, 21, 18-31.

Horvath, M.A.H., Hegarty, P., Tyler, S., & Mansfield, S. (2011). “Lights on at the end of the party”: Are lads’ mags mainstreaming dangerous sexism? British Journal of Psychology. Available from http://onlinelibrary.wiley.com/doi/10.1111/j.2044-8295.2011.02086.x/abstract

Latour, B. (1993). We Have Never Been Modern. Cambridge, MA: Harvard University Press.

Rosenhan, D.L. (1973). On being sane in insane places. Science, 179, 250-258.

Spellman, B.A. (2012). Introduction to the special section: Data, data everywhere … especially in my file drawer. Perspectives on Psychological Science, 7, 58-59.

Steele, C., & Aronson, J. (1995). Stereotype threat and the intellectual test performance of African Americans. Journal of Personality and Social Psychology, 69, 797-811.

Scientific authorship: guests, courtesy, contributions, and harms.

DrugMonkey asks, where’s the harm in adding a “courtesy author” (also known as a “guest author”) to the author line of a scientific paper?

I think this question has interesting ethical dimensions, but before we get into those, we need to say a little bit about what’s going on with authorship of scientific papers.

I suppose there are possible worlds in which who is responsible for what in a scientific paper might not matter. In the world we live in now, however, it’s useful to know who designed the experimental apparatus and got the reaction to work (so you can email that person your questions when you want to set up a similar system), who did the data analysis (so you can share your concerns about the methodology), who made the figures (so you can raise concerns about digital fudging of the images), etc. Part of the reason people put their names on scientific papers is so we know who stands behind the research — who is willing to stake their reputation on it.

The other reason people put their names on scientific papers is to claim credit for their hard work and their insights, their contribution to the larger project of scientific knowledge-building. If you made a contribution, the scientific community ought to know about it so they can give you props (and funding, and tenure, and the occasional Nobel Prize).

But, we aren’t in a position to make accurate assignments of credit or responsibility if we have no good information about what an author’s actual involvement in the project may have been. We don’t know who’s really in a position to vouch for the data, or who really did the heavy intellectual lifting in bringing the project to fruition. We may understand, literally, the claim, “Joe Schmoe is second author of this paper,” but we don’t know what that means, exactly.

I should note that there is not one universally recognized authorship standard for all of the Tribe of Science. Rather, different scientific disciplines (and subdisciplines) have different practices as far as what kind of contribution is recognized as worthy of inclusion as an author on a paper, and as far as what the order in which the authors are listed is supposed to communicate about the magnitude of each contribution. In some fields, authors are always listed alphabetically, no matter what they contributed. In others, being first in the list means you made the biggest contribution, followed by the second author (who made the second-biggest contribution), and so forth. It is usually the case that the principal investigator (PI) is identified as the “corresponding author” (i.e., the person to whom questions about the work should be directed), and often (but not always) the PI takes the last slot in the author line. Sometimes this is an acknowledgement that while the PI is the brains of the lab’s scientific empire, particular underlings made more immediately important intellectual contributions to the particular piece of research the paper is communicating. But authorship practices can be surprisingly local. Not only do different fields do it differently, but different research groups in the same field — at the same university — do it differently. What this means is it’s not obvious at all, from the fact that your name appears as one of the authors of a paper, what your contribution to the project was.

There have been attempts to nail down explicit standards for what kinds of contributions should count for authorship, with the ICMJE definition of authorship being one widely cited effort in this direction. Not everyone in the Tribe of Science, or even in the subset of the tribe that publishes in biomedical journals, thinks this definition draws the lines in the right places, but the fact that journal editors grapple with formulating such standards suggests at least the perception that scientists need a clear way to figure out who is responsible for the scientific work in the literature. We can have a discussion about how to make that clearer, but we have to acknowledge that at the present moment, just noting that someone is an author without some definition of what that entails doesn’t do the job.
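
By way of illustration, here is a minimal sketch (in Python, with hypothetical author names and role labels, not the ICMJE’s own schema) of the kind of explicit contribution record such standards gesture toward; the point is that a bare author line carries none of this information.

    # Hypothetical author names and role labels; not the ICMJE's own schema.
    contributions = {
        "A. Student": ["designed experiments", "collected data",
                       "analyzed data", "drafted manuscript"],
        "B. Postdoc": ["built apparatus", "revised manuscript"],
        "C. PI": ["conceived project", "secured funding", "revised manuscript"],
    }

    # Print each author with the contributions they can actually vouch for.
    for author, roles in contributions.items():
        print(f"{author}: {', '.join(roles)}")

With a record like this, a reader would know whom to email about the apparatus and whom to ask about the data analysis.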

Here’s where the issue of “guest authorship” comes up. A “guest author” is someone whose name appears in a scientific paper’s author line even though she has not made a contribution that is enough (under whatever set of standards one recognizes for proper authorship) to qualify her as an author of the paper.

A guest is someone who is visiting. She doesn’t really live here, but stays because of the courtesy and forbearance of the host. She eats your food, sleeps under your roof, uses your hot water, watches your TV — in short, she avails herself of the amenities the host provides. She doesn’t pay the rent or the water bill, though; that would transform her from a guest to a tenant.

To my way of thinking, a guest author is someone who is “just visiting” the project being written up. Rather than doing the heavy lifting in that project, she is availing herself of the amenities offered by association (in print) with that project, and doing so because of the courtesy and forbearance of the “host” author.

The people who are actually a part of the project will generally be able to recognize the guest author as a “guest” (as opposed to an actual participant). The people receiving the manuscript will not. In other words, the main amenity the guest author partakes in is credit for the labors of the actual participants. Even if all the participants agreed to this (and didn’t feel the least bit put out at the free-rider whose “authorship” might be diluting his or her own share of credit), this makes it impossible for those outside the group to determine what the guest author’s actual contribution was (or, in this case, was not). Indeed, if people outside the arrangement could tell that the guest author was a free-rider, there wouldn’t be any point in guest authorship.

Science strives to be a fact-based enterprise. Truthful communication is essential, and the ability to connect bits of knowledge to the people who contributed is part of how the community does quality control on that knowledge base. Ambiguity about who made the knowledge may lead to ambiguity about what we know. Also, developing too casual a relationship with the truth seems like a dangerous habit for a scientist to get into.

Coming back to DrugMonkey’s question about whether courtesy authorship is a problem, it looks to me like maybe we can draw a line between two kinds of “guests,” one that contributes nothing at all to the actual design, execution, evaluation, or communication of the research, and one who contributes something here, just less than what the conventions require for proper authorship. If these characters were listed as authors on a paper, I’d be inclined to call the first one a “guest author” and the second a “courtesy author” in an attempt to keep them straight; the cases with which DrugMonkey seems most concerned are the “courtesy authors” in my taxonomy. In actual usage, however, the two labels seem to be more or less interchangeable. Naturally, this makes it harder to distinguish who actually did what — but it strikes me that this is just the kind of ambiguity people are counting on when they include a “guest author” or “courtesy author” in the first place.

What’s the harm?

Consider a case where the PI of a research group insists on giving authorship of a paper to a postdoc who hasn’t gotten his experimental system to work at all and is almost out of funding. The PI gives the justification that “He needs some first-author papers or his time here will have been a total waste.” As it happens, giving this postdoc authorship bumps the graduate student who did all the experimental work (and the conceptual work, and data analysis, and drafting of the manuscript) out of first author slot — maybe even off the paper entirely.

There is real harm here, to multiple parties. In this case, someone got robbed of appropriate credit, and the person identified as most responsible for the published work will be a not-very-useful person to contact with deeper questions about the work (since he didn’t do any of it or at best participated on the periphery of the project).

Consider another kind of case, where authorship is given to a well-known scientist with a lot of credibility in his field, but who didn’t make a significant intellectual contribution to the work (at least, not one that rises to the level of meriting authorship under the recognized standards). This is the kind of courtesy authorship that was extended to Gerald Schatten in a 2005 paper in Science whose authors also included Hwang Woo Suk. This paper had 25 authors listed, with Schatten identified as the senior author. Ultimately, the paper was revealed to be fraudulent, at which point Schatten claimed mostly to have participated in writing the paper in good English — a contribution recognized as less than what one would expect from an author (especially the senior author).

Here, including Schatten as an author seemed calculated to give the appearance (to the journal editors considering the manuscript, and to the larger scientific community consuming the published work) that the work was more important and/or credible, because of the big name associated with it. But this would only work because listing that big name in the author line amounts to claiming the big name was actually involved in the work. When the paper fell apart, Schatten swiftly disavowed responsibility — but such a disavowal was only necessary because of what was communicated by the author line, and I think it’s naïve to imagine that this “ambiguity” or “miscommunication” was accidental.

In cases like this, I think it’s fair to say courtesy authorship does harm, undermining the baseline of trust in the scientific community. It’s hard to engage in efficient knowledge-building with people you think are trying to put one over on you.

The cases where DrugMonkey suggests courtesy authorship might be innocuous strike me as interestingly different. They are cases where someone has actually made a real contribution of some sort to the work, but where that contribution may be judged (under whatever you take to be the accepted standards of your scientific discipline) as not quite rising to the level of authorship. Here, courtesy authorship could be viewed as inflating the value of the actual contribution (by listing the person who made it in the author line, rather than the acknowledgements), or alternatively as challenging where the accepted standards of your discipline draw the line between a contribution that qualifies you as an author and one that does not. For example, DrugMonkey writes:

First, the exclusion of those who “merely” collect data is stupid to me. I’m not going to go into the chapter and verse but in my lab, anyway, there is a LOT of ongoing trouble shooting and refining of the methods in any study. It is very rare that I would have a paper’s worth of data generated by my techs or trainees and that they would have zero intellectual contribution. Given this, the asymmetry in the BMJ position is unfair. In essence it permits a lab head to be an author using data which s/he did not collect and maybe could not collect but excludes the technician who didn’t happen to contribute to the drafting of the manuscript. That doesn’t make sense to me. The paper wouldn’t have happened without both of the contributions.

I agree with DrugMonkey that there’s often a serious intellectual contribution involved in conducting the experiments, not just in designing them (and that without the data, all we have are interesting hunches, not actual scientific knowledge, to report). Existing authorship standards like those from ICMJE or BMJ can unfairly exclude those who do the experimental labor from authorship by failing to recognize this as an intellectual contribution. Pushing to have these real contributions recognized with appropriate career credit is important. As well, being explicit about who made these contributions to the research being reported in the paper makes it much easier for other scientists following up on the published work (e.g., comparing it to their own results in related experiments, or trying to use some of the techniques described in the paper to set up new experiments) to actually get in touch with the people most likely to be able to answer their questions.

Changing how much weight experimental prowess is given in the career scorekeeping may be an uphill battle, especially when the folks distributing the rewards for the top scores are administrators (focused on the money the people they’re scoring can bring to an institution) and PIs (who frequently have more working hours devoted to the conception and design of projects for their underlings than to the intellectual labor of making those projects work, and to writing the proposals that bring in the grant money and the manuscripts that report the happy conclusion of the projects funded by such grants). That doesn’t mean it’s not a fight worth having.

But, I worry that using courtesy authorship as a way around this unfair setting of the authorship bar actually amounts to avoiding the fight rather than addressing these issues and changing accepted practices.

DrugMonkey also writes:

Assuming that we are not talking about pushing someone else meaningfully* out of deserved credit, where lies the harm even if it is a total gift?

Who is hurt? How are they damaged?
__
*by pushing them off the paper entirely or out of first-author or last-author position. Adding a 7th in the middle of the authorship list doesn’t affect jack squat folks.

Here, I wonder: if dropping in a courtesy author as the seventh author of a paper can’t hurt, how can we expect it to help the person to whom this “courtesy” is extended?

Is it the case that no one actually expects that the seventh author made anything like a significant contribution, so no one is being misled in judging the guest in the number seven slot as having made a comparable contribution to the scientist who earned her seventh-author position in another paper? If listing your seventh-author paper on your CV is automatically viewed as not contributing any points in your career scorekeeping, why even list it? And why doesn’t it count for anything? Is it because the seventh author never makes a contribution worth career points … or is it because, for all we know, the seventh author may be a courtesy author, there for other reasons entirely?

If a seventh-author paper is actually meaningless for career credit, wouldn’t it be more help to the person to whom you might extend such a “courtesy” if you actually engaged her in the project in such a way that she could make an intellectual contribution recognized as worthy of career credit?

In other words, maybe the real problem with such courtesy authorship is that it gives the appearance of help without actually being helpful.