Individual misconduct or institutional failing: “The Newsroom” and science.

I’ve been watching The Newsroom*, and in its second season the storyline treads on territory where journalism bears some striking similarities to science. Indeed, the most recent episode (first aired Sunday, August 25, 2013) raises questions about trust and accountability — both at the individual and the community levels — on which I think science and journalism may converge.

I’m not going to dig too deeply into the details of the show, but it’s possible that the ones I touch on here reach the level of spoilers. If you prefer to stay spoiler-free, you might want to stop reading here and come back after you’ve caught up on the show.

The central characters in The Newsroom are producing a cable news show, trying hard to get the news right but also working within the constraints set by their corporate masters (e.g., they need to get good ratings). A producer on the show, on loan to the New York-based team from the D.C. bureau, gets a lead on a fairly shocking story. He and some other members of the team try to find evidence to support the claims of this shocking story. As they do this, they purposely keep other members of the production team out of the loop — not to deceive them or cut them out of the glory if, eventually, they’re able to break the story, but to enable these folks to look critically at the story once all the facts are assembled, to try to poke holes in it.** And, it’s worth noting, the folks actually in the loop, looking for information that bears on the reliability of the shocking claims in the story, are shown to be diligent about considering ways they could be wrong, identifying alternate explanations for details that seem to support the story, etc.

The production team looks at the multiple sources of information they have. They look for reasons to doubt the story. They ultimately decide to air the story.

But, it turns out the story is wrong.

Worse is why key pieces of “evidence” supporting the story are unreliable. One of the interviewees is apparently honest but unreliable. One source of leaked information is false, because the person who leaked it has a grudge against a member of the production team. And, it turns out that the producer on loan from the D.C. bureau has doctored a taped interview that is the lynchpin of the story to make it appear that the interviewee said something he didn’t say.

The producer on loan from the D.C. bureau is fired. He proceeds to sue the network for wrongful termination, claiming it was an institutional failure that led to the airing of the now-retracted big story.

The parallels to scientific knowledge-building are clear.

Scientists with a hypothesis try to amass evidence that will make it clear whether the hypothesis is correct or incorrect. Rather than getting lulled into a false sense of security by observations that seem to fit the hypothesis, scientists try to find evidence that would rule out the hypothesis. They recognize that part of their job as knowledge-builders is to exercise organized skepticism — directed at their own scientific claims as well as at the claims of other scientists. And, given how vulnerable we are to our own unconscious biases, scientists rely on teamwork to effectively weed out the “evidence” that doesn’t actually provide strong support for their claims.

Some seemingly solid evidence turns out to be faulty. Measuring devices can become unreliable, or you get stuck with a bad batch of reagent, or your collaborator sends you a sample from the wrong cell line.

And sometimes a scientist who is sure in his heart he knows what the truth is doctors the evidence to “show” that truth.

Fabricating or falsifying evidence is, without question, a crime against scientific knowledge-building. But does the community that is taken in by the fraudster bear a significant share of the blame for believing him?

Generally, I think, the scientific community will say, “No.” A scientist is presumed by other members of his community to be honest unless there’s good reason to think otherwise. Otherwise, each scientist would have to replicate every observation reported by every other scientist ever before granting it any credibility. There aren’t enough grant dollars or hours in the day for that to be a plausible way to build scientific knowledge.

But, the community of science is supposed to ensure that findings reported to the public are thoroughly scrutinized for errors, not presented as more certain than the evidence warrants. The public trusts scientists to do this vetting because members of the public generally don’t know how to do this vetting themselves. Among other things, this means that a scientific fraudster, once caught, doesn’t just burn his own credibility — he can end up burning the credibility of the entire scientific community that was taken in by his lies.

Given how hard it can be to distinguish made-up data from real data, maybe that’s not fair. Still, if the scientific community is asking for the public’s trust, that community needs to be accountable to the public — and to find ways to prevent violations of trust within the community, or at least to deal effectively with those violations of trust when they happen.

In The Newsroom, after the big story unravels, as the video-doctoring producer is fired, the executive producer of the news show says, “People will never trust us again.” It’s not just the video-doctoring producer that viewers won’t trust, but the production team who didn’t catch the problem before presenting the story as reliable. Where the episodes to date leave us, it’s uncertain whether the production team will be able to win back the trust of the public — and what it might take to win back that trust.

I think it’s a reasonable question for the scientific community, too. In the face of incidents where individual scientists break trust, what does it take for the larger community of scientific knowledge-builders to win the trust of the public?

_____
* I’m not sure it’s a great show, but I have a weakness for the cadence of Aaron Sorkin’s dialogue.

** In the show, the folks who try to poke holes in the story presented with all the evidence that seems to support it are called the “red team,” and one of the characters claims its function is analogous to that of red blood cells. This … doesn’t actually make much sense, biologically. I’m putting a pin in that, but you are welcome to critique or suggest improvements to this analogy in the comments.

Strategies to address questionable statistical practices.

If you have not yet read all you want to read about the wrongdoing of social psychologist Diederik Stapel, you may be interested in reading the 2012 Tilburg Report (PDF) on the matter. The full title of the English translation is “Flawed science: the fraudulent research practices of social psychologist Diederik Stapel” (in Dutch, “Falende wetenschap: De frauduleuze onderzoekspraktijken van sociaal-psycholoog Diederik Stapel”), and it’s 104 pages long, which might make it beach reading for the right kind of person.

If you’re not quite up to the whole report, Error Statistics Philosophy has a nice discussion of some of the highlights. In that post, D. G. Mayo writes:

The authors of the Report say they never anticipated giving a laundry list of “undesirable conduct” by which researchers can flout pretty obvious requirements for the responsible practice of science. It was an accidental byproduct of the investigation of one case (Diederik Stapel, social psychology) that they walked into a culture of “verification bias”. Maybe that’s why I find it so telling. It’s as if they could scarcely believe their ears when people they interviewed “defended the serious and less serious violations of proper scientific method with the words: that is what I have learned in practice; everyone in my research environment does the same, and so does everyone we talk to at international conferences” (Report 48). …

I would place techniques for ‘verification bias’ under the general umbrella of techniques for squelching stringent criticism and repressing severe tests. These gambits make it so easy to find apparent support for one’s pet theory or hypotheses, as to count as no evidence at all (see some from their list). Any field that regularly proceeds this way I would call a pseudoscience, or non-science, following Popper. “Observations or experiments can be accepted as supporting a theory (or a hypothesis, or a scientific assertion) only if these observations or experiments are severe tests of the theory.”

You’d imagine this would raise the stakes pretty significantly for the researcher who could be teetering on the edge of verification bias: fall off that cliff and what you’re doing is no longer worthy of the name scientific knowledge-building.

Psychology, after all, is one of those fields given a hard time by people in “hard sciences,” which are popularly reckoned to be more objective, more revealing of actual structures and mechanisms in the world — more science-y. Fair or not, this might mean that psychologists have something to prove about their hardheadedness as researchers, about the stringency of their methods. Some peer pressure within the field to live up to such standards would obviously be a good thing — and certainly, it would be a better thing for the scientific respectability of psychology than an “everyone is doing it” excuse for less stringent methods.

Plus, isn’t psychology a field whose practitioners should have a grip on the various cognitive biases to which we humans fall prey? Shouldn’t psychologists understand better than most the wisdom of putting structures in place (whether embodied in methodology or in social interactions) to counteract those cognitive biases?

Remember that part of Stapel’s M.O. was keeping current with the social psychology literature so he could formulate hypotheses that fit very comfortably with researchers’ expectations of how the phenomena they studied behaved. Then, fabricating the expected results for his “investigations” of these hypotheses, Stapel caught peer reviewers being credulous rather than appropriately skeptical.

Short of trying themselves to reproduce the experiments Stapel described, how could peer reviewers avoid being fooled? Mayo has a suggestion:

Rather than report on believability, researchers need to report the properties of the methods they used: What was their capacity to have identified, avoided, admitted verification bias? The role of probability here would not be to quantify the degree of confidence or believability in a hypothesis, given the background theory or most intuitively plausible paradigms, but rather to check how severely probed or well-tested a hypothesis is– whether the assessment is formal, quasi-formal or informal. Was a good job done in scrutinizing flaws…or a terrible one?  Or was there just a bit of data massaging and cherry picking to support the desired conclusion? As a matter of routine, researchers should tell us.

I’m no social psychologist, but this strikes me as a good concrete step that could help peer reviewers make better evaluations — and that should help scientists who don’t want to fool themselves (let alone their scientific peers) to be clearer about what they really know and how well they really know it.

The continuum between outright fraud and “sloppy science”: inside the frauds of Diederik Stapel (part 5).

It’s time for one last look at the excellent article by Yudhijit Bhattacharjee in the New York Times Magazine (published April 26, 2013) on social psychologist and scientific fraudster Diederik Stapel. We’ve already examined the strategy Stapel pursued to fabricate persuasive “results”, the particular harms Stapel’s misconduct did to the graduate students he was training, and the apprehensions of the students and colleagues who suspected fraud was afoot about the prospect of blowing the whistle on Stapel. To close, let’s look at some of the uncomfortable lessons the Stapel case has for his scientific community — and perhaps for other scientific communities as well.

Bhattacharjee writes:

At the end of November, the universities unveiled their final report at a joint news conference: Stapel had committed fraud in at least 55 of his papers, as well as in 10 Ph.D. dissertations written by his students. The students were not culpable, even though their work was now tarnished. The field of psychology was indicted, too, with a finding that Stapel’s fraud went undetected for so long because of “a general culture of careless, selective and uncritical handling of research and data.” If Stapel was solely to blame for making stuff up, the report stated, his peers, journal editors and reviewers of the field’s top journals were to blame for letting him get away with it. The committees identified several practices as “sloppy science” — misuse of statistics, ignoring of data that do not conform to a desired hypothesis and the pursuit of a compelling story no matter how scientifically unsupported it may be.

The adjective “sloppy” seems charitable. Several psychologists I spoke to admitted that each of these more common practices was as deliberate as any of Stapel’s wholesale fabrications. Each was a choice made by the scientist every time he or she came to a fork in the road of experimental research — one way pointing to the truth, however dull and unsatisfying, and the other beckoning the researcher toward a rosier and more notable result that could be patently false or only partly true. What may be most troubling about the research culture the committees describe in their report are the plentiful opportunities and incentives for fraud. “The cookie jar was on the table without a lid” is how Stapel put it to me once. Those who suspect a colleague of fraud may be inclined to keep mum because of the potential costs of whistle-blowing.

The key to why Stapel got away with his fabrications for so long lies in his keen understanding of the sociology of his field. “I didn’t do strange stuff, I never said let’s do an experiment to show that the earth is flat,” he said. “I always checked — this may be by a cunning manipulative mind — that the experiment was reasonable, that it followed from the research that had come before, that it was just this extra step that everybody was waiting for.” He always read the research literature extensively to generate his hypotheses. “So that it was believable and could be argued that this was the only logical thing you would find,” he said. “Everybody wants you to be novel and creative, but you also need to be truthful and likely. You need to be able to say that this is completely new and exciting, but it’s very likely given what we know so far.”

Fraud like Stapel’s — brazen and careless in hindsight — might represent a lesser threat to the integrity of science than the massaging of data and selective reporting of experiments. The young professor who backed the two student whistle-blowers told me that tweaking results — like stopping data collection once the results confirm a hypothesis — is a common practice. “I could certainly see that if you do it in more subtle ways, it’s more difficult to detect,” Ap Dijksterhuis, one of the Netherlands’ best known psychologists, told me. He added that the field was making a sustained effort to remedy the problems that have been brought to light by Stapel’s fraud.

(Bold emphasis added.)
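The “tweaking” Dijksterhuis mentions, stopping data collection as soon as the results confirm a hypothesis, has a measurable statistical cost even when nothing is fabricated outright. A short simulation (a sketch, with made-up batch sizes and trial counts) shows how repeatedly peeking at a significance test inflates the false-positive rate well past the nominal 5%:

```python
import math
import random

def peeking_experiment(max_n: int, batch: int, alpha: float,
                       rng: random.Random) -> bool:
    """Simulate one null experiment (true effect = 0) with optional stopping:
    run a two-sided z-test after every `batch` observations and stop as soon
    as p < alpha. Returns True if the study ever looks 'significant'."""
    data = []
    while len(data) < max_n:
        data.extend(rng.gauss(0.0, 1.0) for _ in range(batch))
        n = len(data)
        mean = sum(data) / n
        z = mean * math.sqrt(n)               # known sd = 1
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
        if p < alpha:
            return True                       # 'success' -- stop and publish
    return False

rng = random.Random(0)
trials = 2000
hits = sum(peeking_experiment(max_n=100, batch=10, alpha=0.05, rng=rng)
           for _ in range(trials))
print(f"nominal alpha = 0.05, observed false-positive rate = {hits / trials:.3f}")
```

With ten peeks per study, the observed rate of spurious “findings” typically lands in the neighborhood of three to four times the nominal 5%, which is exactly why stopping rules have to be fixed in advance (or corrected for) rather than decided by how promising the data look.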

If the writers of this report are correct, the field of psychology failed in multiple ways here. First, its practitioners were insufficiently skeptical — both of Stapel’s purported findings and of their own preconceptions — to nip Stapel’s fabrications in the bud. And they were themselves routinely engaging in practices that were bound to mislead.

Maybe these practices don’t rise to the level of outright fabrication. However, neither do they rise to the level of rigorous and intellectually honest scientific methodology.

There could be a number of explanations for these questionable methodological choices.

Possibly some of the psychologists engaging in this “sloppy science” lack a good understanding of statistics or of what counts as a properly rigorous test of one’s hypothesis. Essentially, this is an explanation of faulty methodology on the basis of ignorance. However, it’s likely that this is culpable ignorance — that psychology researchers have a positive duty to learn what they ought to know about statistics and hypothesis testing, and to avail themselves of available resources to ensure that they aren’t ignorant in this particular way.

I don’t know if efforts to improve statistics education are a part of the “sustained effort to remedy the problems that have been brought to light by Stapel’s fraud,” but I think they should be.

Another explanation for the lax methodology decried by the report is alluded to in the quoted passage: perhaps psychology researchers let the strength of their own intuitions about what they were going to see in their research results drive their methodology. Perhaps they unconsciously drifted away from methodological rigor and toward cherry-picking and misuse of statistics and the like because they knew in their hearts what the “right” answer would be. Given this kind of conviction, of course they would reject methods that didn’t yield the “right” answer in favor of those that did.

Here, too, the explanation does not provide an excuse. The scientist’s brief is not to take strong intuitions as true, but to look for evidence — especially evidence that could demonstrate that the intuitions are wrong. A good scientist should be on the alert for instances where she is being fooled by her intuitions. Rigorous methodology is one of the tools at her disposal to avoid being fooled. Organized skepticism from her fellow scientists is another.

From here, the explanations drift into waters where the researchers are even more culpable for their sloppiness. If you understand how to test hypotheses properly, and if you’re alert enough to the seductive power of your intuitions, it seems like the other reason you might engage in “sloppy science” is to make your results look less ambiguous, more certain, more persuasive than they really are, either to your fellow scientists or to others (administrators evaluating your tenure or promotion case? the public?). Knowingly providing a misleading picture of how good your results are is lying. It may be a lie of a smaller magnitude than Diederik Stapel’s full-scale fabrications, but it’s still dishonest.

And of course, there are plenty of reasons scientists (like other human beings) might try to rationalize a little lie as being not that bad. Maybe you really needed more persuasive preliminary data than you got to land the grant without which you won’t be able to support graduate students. Maybe you needed to make your conclusions look stronger to satisfy the notoriously difficult peer reviewers at the journal to which you submitted your manuscript. Maybe you are on the verge of getting credit for a paradigm-shaking insight in your field (if only you can put up the empirical results to support it), or of beating a competing research group to the finish line for an important discovery (if only you can persuade your peers that the results you have establish that discovery).

But maybe all these excuses prioritize scientific scorekeeping to the detriment of scientific knowledge-building.

Science is supposed to be an activity aimed at building a reliable body of knowledge about the world. You can’t reconcile this with lying, whether to yourself or to your fellow scientists. This means that scientists who are committed to the task must refrain from the little lies, and that they must take serious conscious steps to ensure that they don’t lie to themselves. Anything else runs the risk of derailing the whole project.

C.K. Gunsalus on responsible — and prudent — whistleblowing.

In my last post, I considered why, despite good reasons to believe that social psychologist Diederik Stapel’s purported results were too good to be true, the scientific colleagues and students who were suspicious of his work were reluctant to pursue these suspicions. Questioning the integrity of a member of your professional community is hard, and blowing the whistle on misconduct and misbehavior can be downright dangerous.

In her excellent article “How to Blow the Whistle and Still Have a Career Afterwards”, C. K. Gunsalus describes some of the challenges that come from less than warm community attitudes towards members who point out wrongdoing:

[Whistleblowers pay a high price] due to our visceral cultural dislike of tattletales. While in theory we believe the wrong-doing should be reported, our feelings about practice are more ambivalent. …

Perhaps some of this ambivalence is rooted in fear of becoming oneself the target of maliciously motivated false charges filed by a disgruntled student or former colleague. While this concern is probably overblown, it seems not far from the surface in many discussions of scientific integrity. (p. 52)

I suspect that much of this is a matter of empathy — or, more precisely, of which members of our professional community we empathize with. Maybe we have an easier time empathizing with the folks who seem to be trying to get along, rather than those who seem to be looking for trouble. Or maybe we have more empathy for our colleagues, with whom we share experiences and responsibilities and the expectation of long-term durable bonds, than we have for our students.

But perhaps distaste for a tattletale is more closely connected to our distaste for the labor involved in properly investigating allegations of wrongdoing and then, if wrongdoing is established, addressing it. It would certainly be easier to assume the charges are baseless, and sometimes disinclination to investigate takes the form of finding reasons not to believe the person raising the concerns.

Still, if the psychology of scientists cannot permit them to take allegations of misbehavior seriously, there is no plausible way for science to be self-correcting. Gunsalus writes:

[E]very story has at least two sides, and a problem often looks quite different when both are in hand than when only one perspective is in view. The knowledge that many charges are misplaced or result from misunderstandings reinforces ingrained hesitancies against encouraging charges without careful consideration.

On the other hand, serious problems do occur where the right and best thing for all is a thorough examination of the problem. In most instances, this examination cannot occur without someone calling the problem to attention. Early, thorough review of potential problems is in the interest of every research organization, and conduct that leads to it should be encouraged. (p. 53)

(Bold emphasis added.)

Gunsalus’s article (which you should read in full) takes account of negative attitudes towards whistleblowers despite the importance of rooting out misconduct and lays out a sensible strategy for bringing wrongdoing to light without losing your membership in your professional community. She lays out “rules for responsible whistleblowing”:

  1. Consider alternative explanations (especially that you may be wrong).
  2. In light of #1, ask questions, do not make charges.
  3. Figure out what documentation supports your concerns and where it is.
  4. Separate your personal and professional concerns.
  5. Assess your goals.
  6. Seek advice and listen to it.

and her “step-by-step procedures for responsible whistleblowing”:

  1. Review your concern with someone you trust.
  2. Listen to what that person tells you.
  3. Get a second opinion and take that seriously, too.
  4. If you decide to initiate formal proceedings, seek strength in numbers.
  5. Find the right place to file charges; study the procedures.
  6. Report your concerns.
  7. Ask questions; keep notes.
  8. Cultivate patience!

The focus is very much on moving beyond hunches to establish clear evidence — and on avoiding self-deception. The potential whistleblower must hope that those to whom he or she is bringing concerns are themselves committed to looking at the available evidence and to avoiding self-deception.

Sometimes this is the situation, as it seems to have been in the Stapel case. In other cases, though, whistleblowers have done everything Gunsalus recommends and still found themselves without the support of their community. This is not just a bad thing for the whistleblowers. It is also a bad thing for the scientific community and the reliability of the shared body of knowledge it tries to build.
_____
C. K. Gunsalus, “How to Blow the Whistle and Still Have a Career Afterwards,” Science and Engineering Ethics, 4(1) 1998, 51-64.

Reluctance to act on suspicions about fellow scientists: inside the frauds of Diederik Stapel (part 4).

It’s time for another post in which I chew on some tidbits from Yudhijit Bhattacharjee’s incredibly thought-provoking New York Times Magazine article (published April 26, 2013) on social psychologist and scientific fraudster Diederik Stapel. (You can also look at the tidbits I chewed on in part 1, part 2, and part 3.) This time I consider the question of why it was that, despite mounting clues that Stapel’s results were too good to be true, other scientists in Stapel’s orbit were reluctant to act on their suspicions that Stapel might be up to some sort of scientific misbehavior.

Let’s look at how Bhattacharjee sets the scene in the article:

[I]n the spring of 2010, a graduate student noticed anomalies in three experiments Stapel had run for him. When asked for the raw data, Stapel initially said he no longer had it. Later that year, shortly after Stapel became dean, the student mentioned his concerns to a young professor at the university gym. Each of them spoke to me but requested anonymity because they worried their careers would be damaged if they were identified.

The bold emphasis here (and in the quoted passages that follow) is mine. I find it striking that even now, when Stapel has essentially been fully discredited as a trustworthy scientist, these two members of the scientific community feel safer not being identified. It’s not entirely obvious to me whether their worry is being identified as people who suspected fabrication was taking place but said nothing to launch official inquiries, or whether they fear that being identified as someone who was suspicious of a fellow scientist could harm their standing in the scientific community.

If you dismiss that second possibility as totally implausible, read on:

The professor, who had been hired recently, began attending Stapel’s lab meetings. He was struck by how great the data looked, no matter the experiment. “I don’t know that I ever saw that a study failed, which is highly unusual,” he told me. “Even the best people, in my experience, have studies that fail constantly. Usually, half don’t work.”

The professor approached Stapel to team up on a research project, with the intent of getting a closer look at how he worked. “I wanted to kind of play around with one of these amazing data sets,” he told me. The two of them designed studies to test the premise that reminding people of the financial crisis makes them more likely to act generously.

In early February, Stapel claimed he had run the studies. “Everything worked really well,” the professor told me wryly. Stapel claimed there was a statistical relationship between awareness of the financial crisis and generosity. But when the professor looked at the data, he discovered inconsistencies confirming his suspicions that Stapel was engaging in fraud.

If one has suspicions about how reliable a fellow scientist’s results are, doing some empirical investigation seems like the right thing to do. Keeping an open mind and then examining the actual data might well show one’s suspicions to be unfounded.

Of course, that’s not what happened here. So, given a reason for doubt with stronger empirical support — and given that scientists are trying to build a shared body of scientific knowledge, which means that unreliable papers in the literature can hurt the knowledge-building efforts of other scientists who trust that the work reported in that literature was done honestly — you would think the time was right for this professor to pass on what he had found to those at the university who could investigate further. Right?

The professor consulted a senior colleague in the United States, who told him he shouldn’t feel any obligation to report the matter.

For all the talk of science, and the scientific literature, being “self-correcting,” it’s hard to imagine the precise mechanism for such self-correction in a world where no scientist who is aware of likely scientific misconduct feels any obligation to report the matter.

But the person who alerted the young professor, along with another graduate student, refused to let it go. That spring, the other graduate student examined a number of data sets that Stapel had supplied to students and postdocs in recent years, many of which led to papers and dissertations. She found a host of anomalies, the smoking gun being a data set in which Stapel appeared to have done a copy-paste job, leaving two rows of data nearly identical to each other.

The two students decided to report the charges to the department head, Marcel Zeelenberg. But they worried that Zeelenberg, Stapel’s friend, might come to his defense. To sound him out, one of the students made up a scenario about a professor who committed academic fraud, and asked Zeelenberg what he thought about the situation, without telling him it was hypothetical. “They should hang him from the highest tree” if the allegations were true, was Zeelenberg’s response, according to the student.

Some might think these students were being excessively cautious, but the sad fact is that scientists faced with allegations of misconduct against a colleague — especially if they are brought by students — frequently side with their colleague and retaliate against those making the allegations. Students, after all, are new members of one’s professional community, so green one might not even think of them as really members. They are low status, they are learning how things work, they are judged likely to have misunderstood what they have seen. And, in contrast to one’s colleagues, students are transients. They are just passing through the training program, whereas you might hope to be with your colleagues for your whole professional life. In a case of dueling testimony, who are you more likely to believe?

Maybe the question should be whether your bias towards believing one over the other is strong enough to keep you from examining the available evidence to determine whether your trust is misplaced.
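The “smoking gun” the graduate student found — a data set in which two rows of values were nearly identical, apparently from a copy-paste job — is the kind of anomaly a simple screening pass can surface. A minimal sketch of such a check (the sample data and tolerance here are made up for illustration):

```python
from itertools import combinations

def near_duplicate_rows(rows, tol=1e-9):
    """Return index pairs of numeric data rows that are (nearly) identical,
    the kind of copy-paste artifact found in fabricated data sets."""
    dupes = []
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        if len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b)):
            dupes.append((i, j))
    return dupes

data = [
    [4.1, 2.7, 3.3],
    [5.0, 1.2, 2.8],
    [4.1, 2.7, 3.3],  # pasted copy of row 0
]
print(near_duplicate_rows(data))  # -> [(0, 2)]
```

A duplicated pair of rows is not proof of fraud by itself, of course — but, like the too-consistent effect sizes that first raised suspicions, it is the kind of evidence that justifies asking for the raw data.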

The students waited till the end of summer, when they would be at a conference with Zeelenberg in London. “We decided we should tell Marcel at the conference so that he couldn’t storm out and go to Diederik right away,” one of the students told me.

In London, the students met with Zeelenberg after dinner in the dorm where they were staying. As the night wore on, his initial skepticism turned into shock. It was nearly 3 when Zeelenberg finished his last beer and walked back to his room in a daze. In Tilburg that weekend, he confronted Stapel.

It might not be universally true, but at least some of the people who will lie about their scientific findings in a journal article will lie right to your face about whether they obtained those findings honestly. Yet lots of us think we can tell — at least with the people we know — whether they are being honest with us. This hunch can be just as wrong as the wrongest scientific hunch waiting for us to accumulate empirical evidence against it.

The students seeking Zeelenberg’s help in investigating Stapel’s misbehavior found a situation in which Zeelenberg would have to look at the empirical evidence first before he looked his colleague in the eye and asked him whether he was fabricating his results. They had already gotten him to say, at least in the abstract, that the kind of behavior they had reason to believe Stapel was committing was unacceptable in their scientific community. To make a conscious decision to ignore the empirical evidence would have meant Zeelenberg would have to see himself as displaying a kind of intellectual dishonesty — because if fabrication is harmful to science, it is harmful to science no matter who perpetrates it.

As it was, Zeelenberg likely had to make the painful concession that he had misjudged his colleague’s character and trustworthiness. But having wrong hunches in science is much less of a crime than clinging to those hunches in the face of mounting evidence against them.

Doing good science requires a delicate balance of trust and accountability. Scientists’ default position is to trust that other scientists are making honest efforts to build reliable scientific knowledge about the world, using empirical evidence and methods of inference that they display for the inspection (and critique) of their colleagues. Not to hold this default position means you have to build all your knowledge of the world yourself (which makes achieving anything like objective knowledge really hard). However, this trust is not unconditional, which is where the accountability comes in. Scientists recognize that they need to be transparent about what they did to build the knowledge — to be accountable when other scientists ask questions or disagree about conclusions — else that trust evaporates. When the evidence warrants it, distrusting a fellow scientist is not mean or uncollegial — it’s your duty. We need the help of others to build scientific knowledge, but if they insist that we ignore evidence of their scientific misbehavior, they’re not actually helping.

Scientific training and the Kobayashi Maru: inside the frauds of Diederik Stapel (part 3).

This post continues my discussion of issues raised in the article by Yudhijit Bhattacharjee in the New York Times Magazine (published April 26, 2013) on social psychologist and scientific fraudster Diederik Stapel. Part 1 looked at how expecting to find a particular kind of order in the universe may leave a scientific community more vulnerable to a fraudster claiming to have found results that display just that kind of order. Part 2 looked at some of the ways Stapel’s conduct did harm to the students he was supposed to be training to be scientists. Here, I want to point out another way that Stapel failed his students — ironically, by shielding them from failure.

Bhattacharjee writes:

[I]n the spring of 2010, a graduate student noticed anomalies in three experiments Stapel had run for him. When asked for the raw data, Stapel initially said he no longer had it. Later that year, shortly after Stapel became dean, the student mentioned his concerns to a young professor at the university gym. Each of them spoke to me but requested anonymity because they worried their careers would be damaged if they were identified.

The professor, who had been hired recently, began attending Stapel’s lab meetings. He was struck by how great the data looked, no matter the experiment. “I don’t know that I ever saw that a study failed, which is highly unusual,” he told me. “Even the best people, in my experience, have studies that fail constantly. Usually, half don’t work.”

In the next post, we’ll look at how this other professor’s curiosity about Stapel’s too-good-to-be-true results led to the unraveling of Stapel’s fraud. But I think it’s worth pausing here to say a bit more on how very odd a training environment Stapel’s research group provided for his students.

None of his studies failed. Since, as we saw in the last post, Stapel was also conducting (or, more accurately, claiming to conduct) his students’ studies, that means none of his students’ studies failed.

This is pretty much the opposite of every graduate student experience in an empirical field that I have heard described. Most studies fail. Getting to a 50% success rate with your empirical studies is a significant achievement.

Graduate students who are also Trekkies usually come to recognize that the travails of empirical studies are like a version of the Kobayashi Maru.

Introduced in Star Trek II: The Wrath of Khan, the Kobayashi Maru is a training simulation in which Star Fleet cadets are presented with a civilian ship in distress. Saving the civilians requires the cadet to violate treaty by entering the Neutral Zone (and in the simulation, this choice results in a Klingon attack and the boarding of the cadet’s ship). Honoring the treaty, on the other hand, means abandoning the civilians and their disabled ship in the Neutral Zone. The Kobayashi Maru is designed as a “no-win” scenario. The intent of the test is to discover how trainees face such a situation. Wikipedia notes that, owing to James T. Kirk’s performance on the test, some Trekkies also view the Kobayashi Maru as a problem whose solution depends on redefining the problem.

Scientific knowledge-building turns out to be packed with particular plans that cannot succeed at yielding the particular pieces of knowledge the scientists hope to discover. This is because scientists are formulating plans on the basis of what is already known to try to reveal what isn’t yet known — so knowing where to look, or what tools to use to do the looking, or what other features of the world are there to confound your ability to get clear information with those tools, is pretty hard.

Failed attempts happen. If they’re the sort of thing that will crush your spirit and leave you unable to shake it off and try it again, or to come up with a new strategy to try, then the life of a scientist will be a pretty hard life for you.

Grown-up scientists have studies fail all the time. Graduate students training to be scientists do, too. But graduate students also have mentors who are supposed to help them bounce back from failure — to figure out the most likely sources of failure, whether it’s worth trying the study again, whether a new approach would be better, whether some crucial piece of knowledge has been learned despite the failure of what was planned. Mentors give scientific trainees a set of strategies for responding to particular failures, and they also give reassurance that even good scientists fail.

Scientific knowledge is built by actual humans who don’t have perfect foresight about the features of the world as yet undiscovered, humans who don’t have perfectly precise instruments (or hands and eyes using those instruments), humans who sometimes mess up in executing their protocols. Yet the knowledge is built, and it frequently works pretty well.

In the context of scientific training, it strikes me as malpractice to send new scientists out into the world with the expectation that all of their studies should work, and without any experience grappling with studies that don’t work. Shielding his students from their Kobayashi Maru is just one more way Diederik Stapel cheated them out of a good scientific training.

Failing the scientists-in-training: inside the frauds of Diederik Stapel (part 2)

In this post, I’m continuing my discussion of the excellent article by Yudhijit Bhattacharjee in the New York Times Magazine (published April 26, 2013) on social psychologist and scientific fraudster Diederik Stapel. The last post considered how being disposed to expect order in the universe might have made other scientists in Stapel’s community less critical of his (fabricated) results than they could have been. Here, I want to shift my focus to some of the harm Stapel did beyond introducing lies to the scientific literature — specifically, the harm he did to the students he was supposed to be training to become good scientists.

I suppose it’s logically possible for a scientist to commit misconduct in a limited domain — say, to make up the results of his own research projects but to make every effort to train his students to be honest scientists. This doesn’t strike me as a likely scenario, though. Publishing fraudulent results as if they were factual is lying to one’s fellow scientists — including the generation of scientists one is training. Moreover, most research groups pursue interlocking questions, meaning that the questions the grad students are working to answer generally build on pieces of knowledge the boss has built — or, in Stapel’s case “built”. This means that at minimum, a fabricating PI is probably wasting his trainees’ time by letting them base their own research efforts on claims that there’s no good scientific reason to trust.

And as Bhattacharjee describes the situation for Stapel’s trainees, things for them were even worse:

He [Stapel] published more than two dozen studies while at Groningen, many of them written with his doctoral students. They don’t appear to have questioned why their supervisor was running many of the experiments for them. Nor did his colleagues inquire about this unusual practice.

(Bold emphasis added.)

I’d have thought that one of the things a scientist-in-training hopes to learn in the course of her graduate studies is not just how to design a good experiment, but how to implement it. Making your experimental design work in the real world is often much harder than it seems like it will be, but you learn from these difficulties — about the parameters you ignored in the design that turn out to be important, about the limitations of your measurement strategies, about ways the system you’re studying frustrates the expectations you had about it before you were actually interacting with it.

I’ll even go out on a limb and say that some experience doing experiments can make a significant difference in a scientist’s skill conceiving of experimental approaches to problems.

That Stapel cut his students out of doing the experiments was downright weird.

Now, scientific trainees probably don’t have the most realistic picture of precisely what competencies they need to master to become successful grown-up scientists in a field. They trust that the grown-up scientists training them know what these competencies are, and that these grown-up scientists will make sure that they encounter them in their training. Stapel’s trainees likely trusted him to guide them. Maybe they thought that he would have them conducting experiments if that were a skill that would require a significant amount of time or effort to master. Maybe they assumed that implementing the experiments they had designed was just so straightforward that Stapel thought they were better served working to learn other competencies instead.

(For that to be the case, though, Stapel would have to be the world’s most reassuring graduate advisor. I know my impostor complex was strong enough that I wouldn’t have believed I could do an experiment my boss or my fellow grad students viewed as totally easy until I had actually done it successfully three times. If I had to bet money, it would be that some of Stapel’s trainees wanted to learn how to do the experiments, but they were too scared to ask.)

There’s no reason, however, that Stapel’s colleagues should have thought it was OK that his trainees were not learning how to do experiments by taking charge of doing their own. If they did know and they did nothing, they were complicit in a failure to provide adequate scientific training to trainees in their program. If they didn’t know, that’s an argument that departments ought to take more responsibility for their trainees and to exercise more oversight rather than leaving each trainee to the mercies of his or her advisor.

And, as becomes clear from the New York Times Magazine article, doing experiments wasn’t the only piece of standard scientific training of which Stapel’s trainees were deprived. Bhattacharjee describes the revelation when a colleague collaborated with Stapel on a piece of research:

Stapel and [Ad] Vingerhoets [a colleague of his at Tilburg] worked together with a research assistant to prepare the coloring pages and the questionnaires. Stapel told Vingerhoets that he would collect the data from a school where he had contacts. A few weeks later, he called Vingerhoets to his office and showed him the results, scribbled on a sheet of paper. Vingerhoets was delighted to see a significant difference between the two conditions, indicating that children exposed to a teary-eyed picture were much more willing to share candy. It was sure to result in a high-profile publication. “I said, ‘This is so fantastic, so incredible,’ ” Vingerhoets told me.

He began writing the paper, but then he wondered if the data had shown any difference between girls and boys. “What about gender differences?” he asked Stapel, requesting to see the data. Stapel told him the data hadn’t been entered into a computer yet.

Vingerhoets was stumped. Stapel had shown him means and standard deviations and even a statistical index attesting to the reliability of the questionnaire, which would have seemed to require a computer to produce. Vingerhoets wondered if Stapel, as dean, was somehow testing him. Suspecting fraud, he consulted a retired professor to figure out what to do. “Do you really believe that someone with [Stapel’s] status faked data?” the professor asked him.

“At that moment,” Vingerhoets told me, “I decided that I would not report it to the rector.”

Stapel’s modus operandi was to make up his results out of whole cloth — to produce “findings” that looked statistically plausible without the muss and fuss of conducting actual experiments or collecting actual data. Indeed, since the thing he was creating that needed to look plausible enough to be accepted by his fellow scientists was the analyzed data, he didn’t bother making up raw data from which such an analysis could be generated.

Connecting the dots here, this surely means that Stapel’s trainees must not have gotten any experience dealing with raw data or learning how to apply methods of analysis to actual data sets. This left another gaping hole in the scientific training they deserved.

It would seem that those being trained by other scientists in Stapel’s program were getting some experience in conducting experiments, collecting data, and analyzing their data — since that experimentation, data collection, and data analysis became fodder for discussion in the ethics training that Stapel led. From the article:

And yet as part of a graduate seminar he taught on research ethics, Stapel would ask his students to dig back into their own research and look for things that might have been unethical. “They got back with terrible lapses,” he told me. “No informed consent, no debriefing of subjects, then of course in data analysis, looking only at some data and not all the data.” He didn’t see the same problems in his own work, he said, because there were no real data to contend with.

I would love to know the process by which Stapel’s program decided that he was the best one to teach the graduate seminar on research ethics. I wonder if this particular teaching assignment was one of those burdens that his colleagues tried to dodge, or if research ethics was viewed as a teaching assignment requiring no special expertise. I wonder how it’s sitting with them that they let a now-famous cheater teach their grad students how to be ethical scientists.

The whole “those who can’t do, teach” adage rings hollow here.

The quest for underlying order: inside the frauds of Diederik Stapel (part 1)

Yudhijit Bhattacharjee has an excellent article in the most recent New York Times Magazine (published April 26, 2013) on disgraced Dutch social psychologist Diederik Stapel. Why is Stapel disgraced? At last count at Retraction Watch, 53 of his scientific publications have been retracted, because the results reported in those publications were made up. [Scroll in that Retraction Watch post for the update — apparently one of the Stapel retractions was double-counted, bringing the tally down from 54. This is the risk when you publish so much made-up stuff.]

There’s not much to say about the badness of a scientist making results up. Science is supposed to be an activity in which people build a body of reliable knowledge about the world, grounding that knowledge in actual empirical observations of that world. Substituting the story you want to tell for those actual empirical observations undercuts that goal.

But Bhattacharjee’s article is fascinating because it goes some way to helping illuminate why Stapel abandoned the path of scientific discovery and went down the path of scientific fraud instead. It shows us some of the forces and habits that, while seemingly innocuous taken individually, can compound to reinforce scientific behavior that is not helpful to the project of knowledge-building. It reveals forces within scientific communities that make it hard for scientists to pursue suspicions of fraud to get formal determinations of whether their colleagues are actually cheating. And, the article exposes some of the harms Stapel committed beyond publishing lies as scientific findings.

It’s an incredibly rich piece of reporting, one which I recommend you read in its entirety, maybe more than once. Given just how much there is to talk about here, I’ll be taking at least a few posts to highlight bits of the article as nourishing food for thought.

Let’s start with how Stapel describes his early motivation for fabricating results to Bhattacharjee. From the article:

Stapel did not deny that his deceit was driven by ambition. But it was more complicated than that, he told me. He insisted that he loved social psychology but had been frustrated by the messiness of experimental data, which rarely led to clear conclusions. His lifelong obsession with elegance and order, he said, led him to concoct sexy results that journals found attractive. “It was a quest for aesthetics, for beauty — instead of the truth,” he said. He described his behavior as an addiction that drove him to carry out acts of increasingly daring fraud, like a junkie seeking a bigger and better high.

(Bold emphasis added.)

It’s worth noting here that other scientists — plenty of scientists who were never cheaters, in fact — have also pursued science as a quest for beauty, elegance, and order. For many, science is powerful because it is a way to find order in a messy universe, to discover simple natural laws that give rise to such an array of complex phenomena. We’ve discussed this here before, when looking at the tension between Platonist and Aristotelian strategies for getting to objective truths:

Plato’s view was that the stuff of our world consists largely of imperfect material instantiations of immaterial ideal forms — and that science makes the observations it does of many examples of material stuff to get a handle on those ideal forms.

If you know the allegory of the cave, however, you know that Plato didn’t put much faith in feeble human sense organs as a route to grasping the forms. The very imperfection of those material instantiations that our sense organs apprehend would be bound to mislead us about the forms. Instead, Plato thought we’d need to use the mind to grasp the forms.

This is a crucial juncture where Aristotle parted ways with Plato. Aristotle still thought that there was something like the forms, but he rejected Plato’s full-strength rationalism in favor of an empirical approach to grasping them. If you wanted to get a handle on the form of “horse,” for example, Aristotle thought the thing to do was to examine lots of actual specimens of horse and to identify the essence they all have in common. The Aristotelian approach probably feels more sensible to modern scientists than the Platonist alternative, but note that we’re still talking about arriving at a description of “horse-ness” that transcends the observable features of any particular horse.

Honest scientists simultaneously reach for beautiful order and the truth. They use careful observations of the world to try to discern the actual structures and forces giving rise to what they are observing. They recognize that our observational powers are imperfect, that our measurements are not infinitely precise (and that they are often at least a little inaccurate), but those observations, those measurements, are what we have to work with in discerning the order underlying them.

This is why Ockham’s razor — to prefer simple explanations for phenomena over more complicated ones — is a strategy but not a rule. Scientists go into their knowledge-building endeavor with the hunch that the world has more underlying order than is immediately apparent to us — and that careful empirical study will help us discover that order — but how things actually are provides a constraint on how much elegance there is to be found.

However, as the article in the New York Times Magazine makes clear, Stapel was not alone in expecting the world he was trying to describe in his research to yield elegance:

In his early years of research — when he supposedly collected real experimental data — Stapel wrote papers laying out complicated and messy relationships between multiple variables. He soon realized that journal editors preferred simplicity. “They are actually telling you: ‘Leave out this stuff. Make it simpler,’” Stapel told me. Before long, he was striving to write elegant articles.

The journal editors’ preference here connects to a fairly common notion of understanding. Understanding a system is being able to identify the components of that system that make a difference in producing the effects of interest — and, by extension, recognizing which components of the system don’t feature prominently in bringing about the behaviors you’re studying. Again, the hunch is that there are likely to be simple mechanisms underlying apparently complex behavior. When you really understand the system, you can point out those mechanisms and explain what’s going on while leaving all the other extraneous bits in the background.

Pushing to find this kind of underlying simplicity has been a fruitful scientific strategy, but it’s a strategy that can run into trouble if the mechanisms giving rise to the behavior you’re studying are in fact complicated. There’s a phrase attributed to Einstein that captures this tension nicely: as simple as possible … but not simpler.

The journal editors, by expressing to Stapel that they liked simplicity more than messy relationships between multiple variables, were surely not telling Stapel to lie about his findings to create such simplicity. They were likely conveying their view that further study, or more careful analysis of data, might yield elegant relations that were really there but elusive. However, intentionally or not, they did communicate to Stapel that simple relationships fit better with journal editors’ hunches about what the world is like than did messy ones — and that results that seemed to reveal simple relations were thus more likely to pass through peer review without raising serious objections.

So, Stapel was aware that the gatekeepers of the literature in his field preferred elegant results. He also seemed to have felt the pressure that early-career academic scientists often feel to make all of his research time productive — where the ultimate measure of productivity is a publishable result. Again, from the New York Times Magazine article:

The experiment — and others like it — didn’t give Stapel the desired results, he said. He had the choice of abandoning the work or redoing the experiment. But he had already spent a lot of time on the research and was convinced his hypothesis was valid. “I said — you know what, I am going to create the data set,” he told me.

(Bold emphasis added.)

The sunk time clearly struck Stapel as a problem. Making a careful study of the particular psychological phenomenon he was trying to understand hadn’t yielded good results — which is to say, results that would be recognized by scientific journal editors or peer reviewers as adding to the shared body of knowledge by revealing something about the mechanism at work in the phenomenon. This is not to say that experiments with negative results don’t tell scientists something about how the world is. But what negative results tell us is usually that the available data don’t support the hypothesis, or perhaps that the experimental design wasn’t a great way to obtain data to let us evaluate that hypothesis.

Scientific journals have not generally been very interested in publishing negative results, however, so scientists tend to view them as failures. They may help us to reject appealing hypotheses or to refine experimental strategies, but they don’t usually do much to help advance a scientist’s career. If negative results don’t help you get publications, without which it’s harder to get grants to fund research that could find positive results, then the time and money spent doing all that research has been wasted.

And Stapel felt — maybe because of his hunch that the piece of the world he was trying to describe had to have an underlying order, elegance, simplicity — that his hypothesis was right. The messiness of actual data from the world got in the way of proving it, but it had to be so. And this expectation of elegance and simplicity fit perfectly with the feedback he had heard before from journal editors in his field (feedback that may well have fed Stapel’s own conviction).

A career calculation paired with a strong metaphysical commitment to underlying simplicity seems, then, to have persuaded Diederik Stapel to let his hunch weigh more heavily than the data and then to commit the cardinal sin of fabricating data that could be presented to other scientists as “evidence” to support that hunch.

No one made Diederik Stapel cross that line. But it’s probably worth thinking about the ways that commitments within scientific communities — especially methodological commitments that start to take on the strength of metaphysical commitments — could have made crossing it more tempting.

Shame versus guilt in community responses to wrongdoing.

Yesterday, on the Hastings Center Bioethics Forum, Carl Elliott pondered the question of why a petition asking the governor of Minnesota to investigate ethically problematic research at the University of Minnesota has gathered hundreds of signatures from scholars in bioethics, clinical research, medical humanities, and related disciplines — but only a handful of signatures from scholars and researchers at the University of Minnesota.

At the center of the research scandal is the death of Dan Markingson, who was a human subject in a clinical trial of psychiatric drugs. Detailed background on the case can be found here, and Judy Stone has blogged extensively about the ethical dimensions of the case.

Elliott writes:

Very few signers come from the University of Minnesota. In fact, only two people from the Center for Bioethics have signed: Leigh Turner and me. This is not because any faculty member outside the Department of Psychiatry actually defends the ethics of the study, at least as far as I can tell. What seems to bother people here is speaking out about it. Very few faculty members are willing to register their objections publicly.

Why not? Well, there are the obvious possibilities – fear, apathy, self-interest, and so on. At least one person has told me she is unwilling to sign because she doesn’t think the petition will succeed. But there may be a more interesting explanation that I’d like to explore. …

Why would faculty members remain silent about such an alarming sequence of events? One possible reason is simply because they do not feel as if the wrongdoing has anything to do with them. The University of Minnesota is a vast institution; the scandal took place in a single department; if anyone is to be blamed, it is the psychiatrists and the university administrators, not them. Simply being a faculty member at the university does not implicate them in the wrongdoing or give them any special obligation to fix it. In a phrase: no guilt, hence no responsibility.

My view is somewhat different. These events have made me deeply ashamed to be a part of the University of Minnesota, in the same way that I feel ashamed to be a Southerner when I see video clips of Strom Thurmond’s race-baiting speeches or photos of Alabama police dogs snapping at black civil rights marchers. I think that what our psychiatrists did to Dan Markingson was wrong in the deepest sense. It was exploitative, cruel, and corrupt. Almost as disgraceful are the actions university officials have taken to cover it up and protect the reputation of the university. The shame I feel comes from the fact that I have worked at the University of Minnesota for 15 years. I have even been a member of the IRB. For better or worse, my identity is bound up with the institution.

These two different reactions – shame versus guilt – differ in important ways. Shame is linked with honor; it is about losing the respect of others, and by virtue of that, losing your self-respect. And honor often involves collective identity. While we don’t usually feel guilty about the actions of other people, we often do feel ashamed if those actions reflect on our own identities. So, for example, you can feel ashamed at the actions of your parents, your fellow Lutherans, or your physician colleagues – even if you feel as if it would be unfair for anyone to blame you personally for their actions.

Shame, unlike guilt, involves the imagined gaze of other people. As Ruth Benedict writes: “Shame is a reaction to other people’s criticism. A man is shamed either by being openly ridiculed or by fantasying to himself that he has been made ridiculous. In either case it is a potent sanction. But it requires an audience or at least a man’s fantasy of an audience. Guilt does not.”

As Elliott notes, one way to avoid an audience — and thus to avoid shame — is to actively participate in, or tacitly endorse, a cover-up of the wrongdoing. I’m inclined to think, however, that taking steps to avoid shame by hiding the facts, or by allowing retaliation against people asking inconvenient questions, is itself a kind of wrongdoing — the kind of thing that incurs guilt, for which no audience is required.

As well, I think the scholars and researchers at the University of Minnesota who prefer not to take a stand on how their university responds to ethically problematic research, even if it is research in someone else’s lab, or someone else’s department, underestimate the size of the audience for their actions and for their inaction.

A hugely significant segment of this audience is their trainees. Their students and postdocs (and others involved in training relationships with them) are watching them, trying to draw lessons about how to be a grown-up scientist or scholar, a responsible member of a discipline, a responsible member of a university community, a responsible citizen of the world. The people they are training are looking to them to set a good example on how to respond to problems — by addressing them, learning from them, making things right, and doing better going forward, or by lying, covering up, and punishing people harmed by trying to recover costs from them (thus sending a message to others daring to point out how they have been harmed).

There are many fewer explicit conversations about such issues than one might hope in a scientist’s training. In the absence of explicit conversations, most of what trainees have to go on is how the people training them actually behave. And sometimes, a mentor’s silence speaks as loud as words.

The purpose of a funding agency (and how that should affect its response to misconduct).

In the “Ethics in Science” course I regularly teach, students spend a good bit of time honing their ethical decision-making skills by writing responses to case studies. (A recent post lays out the basic strategy we take in approaching these cases.) Over the span of the semester, my students’ responses to the cases give me pretty good data about the development of their ethical decision-making.

From time to time, they also advance claims that make me say, “Hmmm …”

Here’s one such claim, recently asserted in response to a case in which the protagonist, a scientist serving on a study section for the NIH (i.e., a committee that ranks the merit of grant proposals submitted to the NIH for funding), has to make a decision about how to respond when she detects plagiarism in a proposal:

The main purpose of the NIH is to ensure that projects with merit get funded, not to punish scientists for plagiarism.

Based on this assertion, the student argued that it wasn’t clear that the study section member had to make an official report to the NIH about the plagiarism.

The claim is interesting, but I think we would do well to unpack it a little. What, for instance, counts as a project with merit?

Is it enough that the proposed research would, if successful, contribute a new piece of knowledge to our shared body of scientific knowledge? Does the anticipated knowledge that the research would generate need to be important, and if so, according to what metric? (Clearly applicable to a pressing problem? Advancing our basic understanding of some part of our world? Surprising? Resolving an ongoing scientific debate?) Does the proposal need to convey evidence that the proposers have a good chance at being successful in conducting the research (because they have the scientific skills, the institutional resources, etc.)?

Does plagiarism count as evidence against merit here?

Perhaps we answer this question differently if we think what should be evaluated is the proposal rather than the proposer. Maybe the proposed research is well-designed, likely to work, and likely to make an important contribution to knowledge in the field — even if the proposer is judged lacking in scholarly integrity (because she seems either not to know how to properly cite the words or ideas of others, or not to care to do so even though she knows how).

But, one of the expectations of federal funders like the NIH is that scientists whose research is funded will write up the results and share them in the scientific literature. Among other things, this means that one of the scientific skills a proposer will need to see a project through to successful completion (including publishing the results) is the ability to write without running afoul of basic standards of honest scholarship. A paper that communicates important results while also committing plagiarism will not bring glory to the NIH for funding the researcher.

More broadly, the fact that something (like detecting or punishing plagiarism) is not a primary goal does not mean it is not a goal that might support the primary goal. To the extent that certain kinds of behavior in proposing research might mark a scientist as a bad risk to carry out research responsibly, it strikes me as entirely appropriate for funding agencies to flag those behaviors when they see them — and also to share that information with other funding agencies.

As well, to the extent that an agency like the NIH might punish a scientist for plagiarism, the kind of punishment it imposes is generally barring that scientist from eligibility for funding for a finite number of years. In other words, the punishment amounts to “You don’t get our money, and you don’t get to ask us for money again for the next N years.” To me, this punishment doesn’t look disproportionate, and imposing it on a plagiarist grant proposer doesn’t diverge wildly from the main goal of ensuring that projects with merit get funded.

But, as always, I’m interested in what you all think about it.