At the request of femalechemist, I’m going to revisit the Sames/Sezen controversy. You’ll recall that Dalibor Sames, a professor at Columbia University, retracted seven papers on which he was senior author. Bengu Sezen, also an author on each of the retracted papers and a graduate of the Sames lab, performed the experiments in question.
Sames says he retracted the papers because the current members of his lab could not reproduce the original findings. Sezen says that the experiments reported worked for her and for other experimenters in the Sames lab. Moreover, she says that Sames did not contact her about any problems reproducing the results, and that he asked the journals to retract the papers without letting her know he was doing so.
I am not now, nor was I ever, an organic chemist, so I’m not going to try to do the experiments myself (repeatedly, with appropriate consultation of the people who developed the original protocols) to see who’s right. That’s not the kind of light I can shed on this case. However, I can break down the key issues at play here:
1. Are the experiments reproducible?
This is the question of interest to others working in organic chemistry. Do these syntheses work or don’t they? To get a definitive answer to this question, people outside the Sames lab will have to go to the lab and try it. (Why people outside the Sames lab? Because all the principals in this scandal — and their close associates — have too much at stake in the outcome of these experiments to have a reasonable chance of being unbiased.)
Of course, they will want to follow not only the “materials and methods” from the papers in question, but they will want to get tips for running the experiment from the person who says she has made it work. A number of scientists, in other contexts, have noted that there are some parts of experimental protocols that may seem dispensable, until you dispense with them and can no longer get the desired outcome. At least some of the attempts at replication ought to take these “pigeon dance” steps seriously.
2. Did Sezen really get the experiments to work in the first place (or at least, did she really believe she got them to work)?
The fact that Sezen is on record saying that she would come back to the Sames lab to assist them in getting the experiment to work suggests that she either really believes the experiments can be accomplished, or she’s an incredibly gutsy liar. Lying in science is a lot harder when you’ve got all kinds of attention on you (say, from press coverage of your dispute with your old advisor and co-author). If Sezen were still willing to step up and work with other scientists to reproduce the reported results, I’d be inclined to think that in her own mind, she was convinced of the goodness of her original results.
Do note that one can be convinced of the goodness of one’s own results and still turn out to be wrong. If one has made a concerted effort to design and conduct careful, controlled experiments (and to collect enough results to be sure they were not a fluke and that the results are not biased), this is what we call an honest mistake.
3. Did Sames have good reason to believe the experiments worked back when he authored those papers? Does he have good reason to believe they don’t work now?
Presumably, if Sames was willing to trust Sezen to do the experiments, then to send out the results with his own name on the manuscript in the senior author slot, he believed in the results. One would hope that, in his capacity as an advisor, he would push his graduate students to be sufficiently skeptical of their own results — and that he would make sure he was walked through the entire chain of scientific evidence. If he was putting his name on the paper, he’d probably want to be at least as critical of the not-yet-published results as the referees at the journal would be, not to mention competitors after the fact.
Not only would he want good evidence to support the claimed results of the syntheses, but he’d also want something like good evidence to support his trust in Sezen. If her trustworthiness was ever in doubt, dealing with that before co-authoring papers with her would have been a much better idea.
It is possible that since those papers were published, Sames came face to face with certain facts that were enough to convince him that Sezen was not to be trusted. These would have to be facts beyond the difficulties his lab was having reproducing Sezen’s results, or else he would have to at least entertain the “honest mistake” hypothesis. Maybe someone is aware of the existence of such damning facts against Sezen, but I haven’t seen them mentioned in the news coverage. (That doesn’t mean they don’t exist, though.)
4. Why wouldn’t Sames tell Sezen there were problems with the recent attempts to replicate the experiments? Why wouldn’t he contact her before sending retractions to the journals?
This, to me, is the most puzzling question. The possibilities range all the way from Sames viewing Sezen as an irredeemable liar, to Sames deciding for his own evil reasons to try to destroy Sezen’s reputation, to his not being able to locate her to communicate with her (even though reporters were able to locate Sezen fairly quickly as the story was breaking), with many other possibilities between these. What is clear is that there isn’t currently trust between the two parties. It is much less clear on which side the trust was breached, and whether it was breached on the basis of good evidence or an unsupported hunch (or vendetta).
Make no mistake: these retractions are not just about the reliability of the results reported in the retracted papers. They are also about the credibility of the authors of those papers. Sames’ retraction is also as good as an assertion that Sezen is not a reliable source of scientific information. The reported experimental results were supposed to live up to a certain standard of proof. What about the retractions’ implications for Sezen’s credibility — what standard of proof do they meet?
It’s worth noting that even without the attendant scandal about leaving Sezen out of the loop (or allegedly not really making serious attempts at replication), seven retractions would not leave Sames’ reputation pristine. At the least, they would mark him as a scientist willing to put results into the scientific literature before they had received sufficient scrutiny. That couldn’t help his future publications. (“Hmm, Sames has another synthesis. I wonder if this one is really worked out.”) On the other hand, I suppose you get a few more credibility points for retracting a result once you’ve discovered it’s mistaken rather than leaving it out there and hoping no one notices.
But this is not your standard difficult retraction. This is a retraction where not all the authors are acting in concert. This is a fight, and sadly, even if one party really is in the right and the other is in the wrong, neither comes out looking like someone you’d trust or with whom you’d want to collaborate.
5. So can you ever trust your collaborator, or should scientists all author their papers alone? Can advisors ever trust their graduate students, or graduate students their advisors?
The alternative to trusting other scientists, whether by way of collaborations or by consulting the scientific literature, is doing all the science you care about all by yourself. And if that’s your plan, you really have too much to do to be reading blogs. Scoot!
On the other hand, since scientists have some acquaintance with the idea of backing their beliefs with facts, it’s good to base your trust on facts, too. This would be easier if collaborators took the time to get to know something about the pieces of the project contributed by their collaborators, if PIs still got involved in conducting experiments themselves, if grad students worked up their data with their advisors at least some of the time rather than only delivering the finished product with a pretty bow on top. Yes, it’s possible for an accomplished liar to seem to provide all kinds of evidence of trustworthiness, but it takes a lot of work. Better to try to have good reasons for trusting people rather than to decide preemptively that no reliable evidence can be had.
If I were a PI right now, I’d want to have a conversation with my graduate students about the importance of honesty and trust within the group. I might look into ways that my grad students could attempt prepublication replications of each other’s results — or even step up myself to try to reproduce their results.
If I were a science graduate student right now, I’d want to have a discussion with my advisor about keeping me in the loop if results we’re certain of now — and publish — end up being less certain later. I’d want to talk about the possible career consequences of a retraction due to an honest mistake. I’d also want to have a discussion about what kind of evidence you need to see to be confident enough about a result to publish it — and whether there are strategies he or she uses to give tentative results one more run through the wringer.
Without trust, there is no science. Of course, we’re talking evidence-based trust, not gullibility. And I don’t see how we get that trust without scientists actually talking to each other.
A fascinating mix of issues, not the least of which is whether Sames contacted (or tried to contact) Sezen.
There is at least one additional issue worthy of mention. Believe it or not, it is quite common for people to produce work which is entirely unreproducible, and for PIs to turn a blind eye. When this happens, external groups can spend large amounts of time trying to reproduce data which the original PI knows, or strongly suspects, is garbage. Governments can rely on scientific work which turns out to be faked, and vast amounts of research funding can disappear into a lab, when the work cannot be reproduced.
An additional complication is that it is possible that the investigator has either falsified data or been incompetent (I know of a case where a student claimed to get results for a year with a machine which transpired not to have a connection to the electrical supply). Neither of these possibilities will be an appealing prospect for any ex-student to own up to. Irrespective of whether the student has been in error or correct (but working with a demanding protocol), it is rare that an ex-student has the necessary weeks to months of time to come back to a lab and re-do the experiments.
PIs are of course responsible for ensuring that the work they publish is robust in the first place, but it can be a difficult decision. How do you ensure replication? You can set two graduate students to compete against each other on the same project – but for some reason, some people hold this up as a paradigm of bad practice!
yours
per
Hi Janet, thanks for taking the time to comment again on all this. It seems to me that Sames smelled fire and decided to finally retract everything, thinking he could keep his hands clean when in fact it just makes him smell more. Plus, he was willing to take the credit at the time, and now he’s trying to distance himself from the whole thing.

There is also the more salacious aspect to this, which I don’t think you commented on. Rumour has it that Sames and Sezen were ‘involved’ with each other, and also that Sames held up Sezen’s work to the point of having dismissed one or more people in his lab who couldn’t reproduce her results. I am particularly disturbed about this conflict of interest aspect, especially as a female scientist. I wonder if anyone can comment more on this, since as far as I know this is rumour and I can’t speak to the truth of it.

On the one hand, it’s a sad fact that shaky and/or irreproducible syntheses get published. However, most of those stories don’t involve a conflict-of-interest backstory on the major players, one of whom is female, as well as a huge set of retractions… people are a lot more likely to jump on it.

Lastly, about the trust issue, a lot of research is very high pressure and high stakes, and I think that unfortunately many PIs put huge expectations of success on people and turn a blind eye to how they are produced. Misconduct is sure to follow.
This is a fight, and sadly, even if one party really is in the right and the other is in the wrong, neither comes out looking like someone you’d trust or with whom you’d want to collaborate.
Huh? There’s no really good outcome for Sames, who looks like something of a jerk even if he’s right, but if Sezen is right (experiments reproducible), or if she’s made an honest mistake and finally, forthrightly accepts proof of same, then I’d work with her. Why wouldn’t you?
just for information:
Org. Lett., 8 (13), 2899-2899, 2006.
Cobalt-Catalyzed Arylation of Azole Heteroarenes via Direct C-H Bond Functionalization
Bengü Sezen and Dalibor Sames*
Volume 5, 2003
Page 3607. After the departure of the first author, the laboratory of the corresponding author (D. Sames) has not been able to reproduce the key results in this publication. Accordingly, the corresponding author withdraws this paper, and deeply regrets that the chemical community was misled by this publication. 06/15/2006
Do you have any concrete examples, with sources? I’m not claiming this doesn’t happen, but I don’t believe that it is “common” or that “vast amounts” of funds are wasted on it. For instance:
I’ve heard that one too. As far as I can tell, it’s an urban myth, so I’d be glad to hear the original details if it’s actually true.
On a more fundamental level with the above reference, why wasn’t the corresponding author the author with the greatest level of technical knowledge? And how do publishers react (and how should they react) when authors start to disagree on the reproducibility of a paper? In geology, the best high-profile example of this is Mojzsis et al. ’96 (Nature 384, 55-59), where the second, fifth, and possibly sixth authors have claimed the results are unreproducible, while the first and fourth authors are still standing by the paper as written.
Do you have any concrete examples, with sources?
Look up the Ricaurte ecstasy paper, or the Arnold/McLachlan estrogen fraud paper, both in Science. Both led to congressional bills, both were ludicrous papers.
Ricaurte’s excuse was nothing short of amazing.
I am aware of two Nature papers where the authors know the papers to be utterly irreproducible, and have not published this. There are ~$4M in grants from this work. In both cases, these are now open secrets, but the authors won’t retract. One of them is associated with a single postdoc, and maybe 6-10 papers up the spout. Why are these authors not going public?
I am also intimately familiar with three other fields which feature irreproducible papers at their centre.
I’ve heard that one too.
I am not talking hearsay; I know the student. I know the supervisor. Strangely, both have declined to publish this event, or to include it in their project write-up. I wonder why they should fail to include this stunning example of incompetence?
I am not about to tell you the details, partly because of my anonymity, and partly because of UK defamation laws.
yours
per
I am not talking hearsay; I know the student.
Huh. I honestly have heard it before, from people who could not say more than “oh, I heard it from someone” when pressed for details. I always thought it sounded too good (for, um, interesting values of “good”) to be true.
I would note that $4M is a drop in the biomed research bucket — but also that you have pointed out, without apparent effort, two high-profile and three more personal instances of real malfeasance. It seems that such behaviour is more frequent than I had thought, and its cost proportionately greater.
That’s a depressing revelation, though it only strengthens my conviction that open science is a necessity.
If you look at a recent Nature:
http://www.nature.com/nature/journal/v442/n7101/full/442344a.html
they review a single edition of Nature from 2002. Two of the articles turned out to be non-reproducible. The article is very favourable to Nature, and to the problems that Nature faces with peer review.
Another, less savoury analysis is that 2 of the articles in every edition of Nature turn out to be wrong.
yours
per
Another, less savoury analysis is that 2 of the articles in every edition of Nature turn out to be wrong.
I don’t so much disagree as I want to emphasize that such an analysis would miss two important points:
1. there’s a world of difference between “wrong” (most published research is wrong) and “non-reproducible”
2. more importantly, there’s another world of difference between “non-reproducible but published in good faith and soon disposed of by the usual corrective mechanisms” and “non-reproducible but staunchly defended by knowing malfeasants and likely to waste a lot of time and money”.
Neither of the two outliers in the Nature article you cite looks like fraud to me. The H3+ thing appears to have been resolved by the original authors accepting that they were wrong (again, nothing wrong with being wrong!) and the stem cell controversy is ongoing. There is still not enough data to support any strong conclusion, and given the potential payoff I do not think the work going into resolving this issue is wasted. (Unless Verfaillie is committing active fraud, but again that looks unlikely since *some* groups have published results agreeing with hers and other groups have reached similar points via different methods).
This is fun, but we can do it forever. I can probably weasel out from under every example you can come up with! In the end, that serves only the status quo, which we agree is sub-optimal at best. So, given that misconduct is ongoing throughout research, and given that it carries unacceptable financial and community costs — what do we do about it?
Here’s a first suggestion: enough with the wishy-washy five-year bans on research for convicted fraudsters. Let’s have consequences commensurate with the havoc these lowlifes are wreaking: lifetime bans.
Here’s another, better one: open science. By which I mean open access publishing of all research results (at least those funded by the public) and open data. There’s no shortage of space online: it should by now be considered somewhat infra dig, if not actively dodgy, NOT to make available your raw data. Further, there should be more ways to report negative results, in the manner of some of the databases now being developed for clinical trials. It’s a lot harder to cheat when everything is out in the open.
Let me first say, I was aware of the examples, and didn’t make any inappropriate suggestions, such as fraud.
Re: 1, actually, I don’t see much difference between wrong and non-reproducible in context.
Re: 2, I have some difficulty with stuff which can get published as a Nature paper and then disposed of by the normal corrective mechanisms! What is wrong with the authors doing a little bit of self-correction, a little bit of rigour, before they publish? Once you publish something that is wrong, someone else has to do the work thoroughly to disprove it, or may actually waste a lot of time and money relying on that research.
I don’t see any mileage in your lifetime bans, principally because any ban serves as sufficient warning that the miscreant is unlikely ever to get a grant again.
Re: open science, I am sure you will be aware of some severe difficulties in making raw data available online in many fields. But I agree that there should be a much greater presumption towards extensive description of materials and methods/data, certainly in journals like Science and Nature; these journals frequently reduce the presented information to a totally inadequate level.
yours
per
Bill:
The problem with open data is that a lot of the time, people don’t know if their data is any good or not at the time that they collect it. Another problem is that people who produce data will no longer be able to reap the fruits of their labor, as data interpreters will be able to mine and publish the data and run away with the credit. The third problem is that raw data is generally output in non-public formats, which are often unique to the company that builds the analytical instrument.
As for dodgy papers, don’t blame reviewers. I know a scientist who suspected a nature/science (can’t remember which) paper he was asked to review was fraudulent. The editors published over his objections after only a half-assed attempt to verify. So ultimately, the power of scientists to police themselves is limited by the degree to which journal editors grant that power.
-LL
Per: I think that there is a very great difference between being wrong, knowing it and publishing anyway (that’s fraud) and simply being wrong. Your call for authors to do “a little bit of self-correction” seems rather beside the point to me. Researcher A produces dataset B and publishes paper C; if A is proceeding in good faith then “wrong” can mean that B is useful but the conclusions drawn in C mistaken, or that B contains errors that neither A nor the manuscript reviewers spotted. Either way, the usual corrective mechanisms do their thing and there is no lasting harm (especially if B is freely and fully available). Both of the 2002 Nature examples seem to me to fit this profile. My claim of “no lasting harm”, of course, rests squarely on the assumption of good faith all round.
In contrast, you have given examples of researchers knowing that some of their published work is wrong, and refusing to correct the error:
I call that fraud. What do you call it? “Lacking in rigour” seems inadequate both to the sense of knowing wrongdoing in these examples and to the fact that the individuals in question are actively hindering the usual corrective mechanisms on which, for the most part, I am happy to rely.
Per: I think that there is a very great difference between being wrong, knowing it and publishing anyway (that’s fraud) and simply being wrong.
I agree. That’s why I said wrong, and not “fraud” or “knowingly wrong”. Are we agreeing?
The point I am making is that if 10-20% of papers are wrong, need self-correction, or have serious errors in them, this is a very important issue for the progress of science, even without adducing the additional issue of whether any of this is due to fraud.
I am profoundly worried about this issue, in and of itself. It seems to me that the two Nature papers cited in the Nature article were presented at a stage where some of their conclusions were perhaps a bit too tentative, as evidenced by the fact that both sets of authors rapidly ran into problems as soon as people started asking questions. This is of course a question of judgement, and I have been bitten more than a couple of times, so I have a jaundiced view that some people care more about getting good papers than producing good science.
I point out that I am careful about the “fraud” word, because it implies knowing intent to deceive and evidence of significant harm. I am quite clear that Arnold committed a fraud, because there has been an ORI finding against him. I have not extended that accusation against others, because I cannot substantiate such a serious charge.
yours
per
Oh, by the way, here is another one:
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=12676084&dopt=Abstract
Apart from the nature of some of the evidence adduced, it is my understanding that this paper is now not reproducible in house, and that an independent laboratory cannot repeat these findings despite extensive efforts. I don’t see a correction on the original paper.
BPA toxicity isn’t a key issue for the originating lab, so I would imagine that repeating this isn’t a priority for them. They have a high impact paper, and the paper replicating this stuff and showing it doesn’t work won’t get into current biology.
You don’t have to posit fraud, merely inertia and indifference, and you have a recipe for serious problems in scientific discourse.
Which is why I am a little more sympathetic to Sames’s dilemma. Brushing it under the carpet is the easy thing to do, and it is the worst thing for science. I suspect that there have been extensive efforts at replication of Sames’s work, and the reason that he has been forced to retract is that no one in house or externally can repeat it. That is a very serious issue in and of itself.
yours
per
The following “Reproducibility Problem” was recently cleared up by a separate and independent analysis, published in Geology: McKeegan et al. (2007), “Raman and ion microscopic imagery of graphitic inclusions in apatite from older than 3830 Ma Akilia supracrustal rocks, west Greenland,” Geology 35(7), 591-594.
“In geology, the best high-profile example of this is Mojzsis et al. ’96 (Nature 384, 55-59), where the second, fifth, and possibly sixth authors have claimed the results are unreproducable, while the first and fourth authors are still standing by the paper as written.” Posted by: Lab Lemming | August 5, 2006 6:56 PM