Laboratory Errors
There are many types of possible laboratory errors. They include:
Mistakes in sequencing occur frequently, but would normally not result in a large number of errors.
Mistakes in PCR also happen frequently, and can result in very serious errors. For example, if even a tiny amount of a genomic segment were to contaminate a work surface, this template could be amplified, sequenced and deposited in GenBank without the researchers being aware of it. This could happen again and again. There would be no way to know whether this happened in Canadian H1N1 sequences without knowing what primers they were using. For example, did they use primers that flanked any of the sequences that showed up again and again?
Misattribution. This should not happen, but apparently does happen frequently when flu sequences are deposited. For examples, see this thread. One possible source of misattributed recombinants: viruses made in the lab. It is common practice to artificially create recombinants to determine which portion of a genomic segment is responsible for virulence or some other viral property.
Lab escapees. As has been discussed previously, many virologists think this happened in China and resulted in the release of an H1N1 strain that circulates to this day. A company in the US sent a pandemic strain, by mistake, as a control to hundreds of labs all over the world. I think the lab escapee hypothesis is a more likely explanation for the retracted swine sequences in China than the swine sequences in North America, but I would not rule it out.
Badly designed live vaccine. A live attenuated vaccine, likely an artificial recombinant, could theoretically reassort with a wild type virus. Again, I think this is more likely in China, but would not rule it out in North America.
I do not know whether any of these possible sources of laboratory error explain the Canadian swine sequences. However, I want to indicate the large range of possible lab errors. There are probably others that I have not thought of. My point is that there are many possible explanations for these anomalous results and I would not be prepared to base an entire new Theory on a few odd sequences without investigating all the possibilities. Please recall that the Theory of Recombinomics requires throwing out 70 years of research and hundreds of experiments that indicate that mutations are common. Does it really make sense to do that without exploring all the possible explanations for a few anomalous sequences?
Anonymous suggested that he didn’t like Dr. Niman’s explanation or mine. That’s OK with me, maybe there’s another explanation.
In any case, I do think these particular swine H1N1 sequences are recombinants, as I have indicated many, many times. No back-tracking on that. Further, I think the authors of the relevant paper should address the issue.
NS1, a while back you suggested creating a FluWiki page graphically illustrating all of Dr. Niman’s examples of recombination. I think that would be an excellent idea. It would make it easier for all of us to evaluate the data if the alignments were represented.
Open-mindedness
My goal in starting this thread was to present the conventional view of flu evolution, *not* to convince anyone that Recombinomics was wrong. I think it’s perfectly OK if people want to believe in it. Personally, it does not matter to me one way or the other if it is correct. I am not a virologist. My grant funding or reputation in no way depends on defending conventional flu science. When I decided to read the available literature, I did have an open mind. I still do. But having read a fair number of papers and Dr. Niman’s arguments, I honestly am unconvinced that the Theory of Recombinomics is correct. This does not mean I think the conventional Theory of Flu Evolution is correct or complete in every respect. It almost certainly is not. However, if you want to replace an existing theory, the new theory has to explain ALL of the old data plus resolve new problems. The Theory of Recombinomics ignores a vast body of data demonstrating that flu viruses mutate rapidly. This disqualifies it for serious consideration at its inception, IMO.
There is not one flu scientist that I am aware of who denies that these viruses mutate rapidly, with the sole exception of Dr. Niman. Now, maybe he will turn out to be right. But he will not convince anyone to adopt his Theory unless he explains why the many, many papers on mutation rate for flu viruses are all wrong.
Interpreting Papers
WetDirt analysed one of the papers I cited and criticised some aspects of it. There is no paper that has ever been written, including my own, that cannot be criticised for something. There is always another control that could have been done or some confounding variable that you didn’t think of. This is why one paper, or a few sequences, are not enough to build a Theory out of. However, there isn’t just one paper indicating that flu viruses mutate rapidly, there are many. They were done in different ways by different groups, but they all come to the same conclusion: flu viruses mutate rapidly. Are they all wrong?
ScienceTeacher cited a general paper about Recombination in Viruses. I read it and enjoyed it very much. Nothing in that paper I disagree with. It didn’t really discuss flu viruses much, but after reading that paper, you sure wouldn’t get the idea that conventional scientists were somehow blocking the concept of Recombination.
Let me repeat: I think recombination occurs in flu viruses. I think it could be very important. I just don’t think it happens frequently in flu viruses. There is not one published paper that shows that flu viruses recombine frequently, and several papers that suggest that flu viruses do recombine, but infrequently.
Monotreme, I don’t think niman denies random mutations.
Probably not even that they are responsible for most of the
mutations which occur. But you know his way to formulate
things. He wants to present it in a way which lets the reader
assume recombination were more common than even niman
thinks they really are. Note, that he doesn’t give any
numbers or explicit statistics here, so he can always
backtrack later. Also he doesn’t acknowledge errors. It’s the
journalist’s way to present theories, not the scientist’s way.
This would be the way to prove his theory and to attack the
problem : a systematic search of the whole database, not hunting
for some examples. (do you agree ?)
He is never doing this, nor does he seem to like that idea at all…
I feel that we don’t understand these mutations very well:
the mutations are clustered too non-randomly on the positions
from the database. Is there something encoded in the synonymous
positions ? Why are the mutation rates varying ?
Recombination doesn’t answer this either.
I admit, that I don’t know much about virus replication, how it works
and where errors and recombination might occur and how.
Scaredy cat said: “Dr. Niman, personally i agree with your comments “
so, do you believe that the mutations in Karo happened by recombination,
that the S227N in Turkey happened by recombination and that the twin mutations
in A/mallard/BC/317/2005(H5N2) were acquired by recombination with swine H1N1 ??
Monotreme, Your list of lab errors doesn’t really shed any light on the data presented. The same could be said of any data, so failing to address the issue because something might be wrong is a significant problem.
As I said earlier, most scientists re-evaluate their hypothesis when it fails to predict nature, or they at least try to explain the data in a reasonable manner. Compiling a list of possible lab errors really doesn’t do that.
As noted in the Holmes 1999 article on recombination, more examples of recombination were to be expected as more sequences became available. The sequences in nature were expected to be the best data, because they would represent what emerges after selection.
The Canadian swine sequences are examples of such change. Like many recent swine isolates from around the world, they had a human PB1 gene. However, like other isolates, a BLAST of each of the 7 PB1 sequences shows closest homology with human PB1 from the mid 90′s. In the past year, NIAID through its “flu blueprint” funding, has generated about 1000 full human flu (all 8 gene segments) sequences and made them publicly available.
Thus, a BLAST of a sequence really compares the sequence against a relatively large number of sequences, just as the Holmes recombination paper predicted. The data shows that human PB1 also is using a mammalian reservoir to acquire new sequences. One region of the swine sequence is present in the mid 90′s, then disappears from the database in the early 2000′s, and then reappears in recent human isolates. This is yet another example of recombination, involving screening of 1109 nearly complete human PB1 sequences at Los Alamos. These are not “lab errors” but are a representation of how human flu has evolved, including fine detail for the past 10 years, which is the time of collection for the vast majority of the human samples.
These data also show that the human genes in swine sequences evolve more slowly in swine. Such evolution has been demonstrated repeatedly, showing that selection pressures are at work that create differences with in vitro data, which you have cited in the past to argue that the copying of flu genes for even a year or two with absolute fidelity was not likely.
However, the Canadian swine data has examples of identity with a large number of earlier isolates. I have tried to keep the discussion simple, by just showing some of the more extreme examples, where major portions of two 1977 isolates from Tennessee, one 1998 isolate from North Carolina, and one 2002 isolate from Korea were present in the Canadian PA or PB2 swine genes. The data was quite specific. None of the swine sequences were exact matches for the entire gene, but there were many examples where the identity extended for 1000 BP or more and many examples included such long stretches with one or the other 1977 isolates.
Since the data was not a complete gene, lab error on a single sequence would require many mistakes. The full sequence does not exist in GenBank, so the lab would have to take some portions of the sequence from the database and append that to the new sequence which is not in the public database. In most cases, this would have to have been done several times for a single sequence. The paper described 56 sequences and virtually all had examples of recombination, and most had regions that paired up with sequences in the database, but also had regions that paired up with other recent swine isolates from Canada.
Coming up with “lab error” to explain even one of the examples would be difficult. Coming up with explanations for 56 would be a major task. At this point, you have failed to address a single example. Instead you posted a general article on how mistakes can be made in labs.
These mistakes are called artifacts and virtually all scientists are keenly aware of artifacts. In scientific terms, artifacts mean they are “man made”, meaning the scientists created data which does not represent nature. The implications of your comments are that the Olsen lab in Wisconsin generated artifacts in 56 swine sequences, deposited all of these artifacts in GenBank, and published the artifacts in a peer reviewed journal if all of these “anomalies” came from their lab. If the artifacts were in the matching sequences, then other labs managed to generate sequences of future isolates in Canada, base by base for over 1000 bases. Again, as I suggested earlier, these types of errors can be easily demonstrated by taking an earlier sequence, such as the 1977 swine H1N1 sequence and BLASTing it against the database which has a large number of swine and human (as well as avian) flu sequences.
You have chosen to not present any specific data to back your contention that dozens or hundreds of lab errors were generated in the Olsen lab.
You really should have something more than a general description of how labs can make mistakes if you are going to make such an argument.
As I said earlier, I was hoping for some serious discussion. Ignoring published data, or simply saying it’s all lab error does not qualify.
The lab-error is just one possible explanation. As I understood
Monotreme hasn’t yet examined this seriously.
We should also ask some experts and/or wait that they
comment on the Tennessee/77-anomaly.
But, how many recombination events do we have ?
10 ? 20 ? 50 ? Do you agree, that the examples for reassortment
are still more numerous than those for recombination ?
How likely are random mutations ? You don’t deny them, right ?
How many percent of mutation-events in H5N1 do you expect
are recombination ? (me:less than 1%)
Also, you failed to explain how the recombination should work on the
molecular level. Which mechanism did include the S227N in Turkey-HA ?
Monotreme, Since you have chosen to cite lab error (with no specifics) as the explanation for regions of identity in the Canadian swine with dozens of public sequences from a variety of sources, it is probably worthwhile giving more examples of labs that should be questioned because their data doesn’t fit your view of the influenza world.
The Webster lab at St Jude and the Guan lab in Hong Kong have published an M sequence from a 1998 H9N2 swine isolate in Hong Kong and a 2003 H9N2 chicken isolate from Korea. The two sequences are identical. I would say that this example would be highest on the list of lab errors, because it involves just one sequence, the match is exact for the entire gene, and the paper itself acknowledges that the gene is different and is an example of reassortment defining a new constellation of genes which I believe they name type C, but indicates the M gene is like a 1997 chicken gene.
However, this is not necessarily a conclusive indication of a labeling error, because the example used has been used by them previously to describe the 1997 H5N1 outbreak in Hong Kong which involved H9N2 and H6N1 internal genes, and the 1997 chicken sequence cited was used to define these reassortants. Moreover, the published sequence is somewhat like the 1997 chicken sequences, with something like 9 differences.
However, the sequence has zero differences with a 1998 swine sequence, so it is conceivable that the swine sequence was deposited in error. However, I wrote a commentary on the identity in March of 2005, just after the Virology paper was published, and members of the Webster and Guan labs follow Recombinomics commentaries, but they have not changed the submitted sequence or issued a correction for the published paper, so at this point they would appear to stand by the peer reviewed publication and the submitted sequence at GenBank.
Thus, if labs are going to be contacted on submitted sequences that Monotreme considers “anomalous”, these two labs should be contacted to testify because identity for 6 years in two species in two countries would certainly violate the in vitro data that says the polymerase makes many “random errors” or “mutations”.
anonymous at 05:33,
You erroneously attributed the following comment to me: “Dr. Niman, personally i agree with your comments “
I did not make such a comment; Science teacher did, yesterday, at 20:11. (Had to search for that one; you had me going for a while.)
So I guess I’m off the hook for answering your question:
“so, do you believe that the mutations in Karo happened by recombination, that the S227N in Turkey happened by recombination and that the twin mutations in A/mallard/BC/317/2005(H5N2) were acquired by recombination with swine H1N1 ??”
which is a good thing because I am not knowledgeable enough to have a meaningful opinion.
Another lab that should be contacted to explain their “anomalous” data would be the Kim lab in South Korea. In 2004 they submitted a series of human H3N2 sequences from a 2002 outbreak in Korea. Six of those sequences A/Daejeon/258/2002, Incheon/260, Kyonbuk/320, Chechonnam/323, Chechonnam/338, Chechonnam/340 are virtually identical to contemporary human H3N2 sequences for the first 600 positions. They then diverge and match human sequences from the late 1980′s and early 1990′s. Two of the best matches are Seoul/45/91 and Seoul/50/91. These are detailed in the public patent application entitled “Copy Choice Recombination and Uses Thereof”. The 2002 sequences are exact matches with the earlier human sequences between positions 575 and 1008. The identity is nearly exact with sequences isolated over a decade earlier. This identity would again cause significant problems for the in vitro data, and represent another example of “natural” sequences violating in vitro data, which Monotreme has yet to address, other than saying it’s lab error and the labs submitting such “anomalies” should be questioned.
This data is much harder to explain by a simple labeling error, because much of the gene matches 2002 human isolates, but a major portion is virtually the same (but not exact - there are one or two changes) as human isolates from the late 80′s early 90′s. The “anomalous” sequences really are two sets of three with the three Cheonnam sequences switching back to 2002 and then back to 1991, while the other three sequences remain 1991.
These changes are quite obvious by simply running a BLAST on the sequence linked above. The 1991 polymorphisms show up at position 575 and really cannot be missed by anyone who even glances at the results.
These sequences have been public at GenBank for over 2 years and the switch from 2002 to 1991 is hard to miss.
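For readers who want to see what such a switch looks like numerically, here is a minimal sketch of a sliding-window identity scan. It is not the BLAST analysis described above. It assumes the query and each reference have already been aligned to the same length with no gaps, and the function name, window size and toy sequences are all made up for illustration.

```python
def window_identity(query, ref, window, step):
    """Return (1-based window start, fraction identical) for each window.

    Assumes query and ref are pre-aligned, equal-length strings.
    """
    if len(query) != len(ref):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = ref[start:start + window]
        same = sum(1 for x, y in zip(q, r) if x == y)
        results.append((start + 1, same / window))
    return results


# Toy data only: a query whose first half resembles one reference
# and whose second half resembles another.
recent_ref = "A" * 16
older_ref = "C" * 16
query = "A" * 8 + "C" * 8

print("vs recent:", window_identity(query, recent_ref, window=8, step=8))
print("vs older: ", window_identity(query, older_ref, window=8, step=8))
```

A position where identity to one reference drops while identity to the other rises is a candidate switch point. The scan says nothing about whether recombination, mislabeling or contamination produced it.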
sorry Cat. I was reading old posts, switching #5-#6 and both names start with Sc, niman quoted you just before several times — so I mixed it up. The question goes to science teacher then
niman, so we also have non-swine examples. Maybe we should do a complete search for large regions of identity, years apart…
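One way such a search could start is with a function that finds the longest stretch of exact identity between two sequences. The sketch below is a plain longest-common-substring routine, not anything Dr. Niman or a database provider uses. The function name and the toy sequences are hypothetical, and a real search over thousands of isolates would need something faster.

```python
def longest_identical_run(a, b):
    """Return (length, start_in_a, start_in_b) of the longest exact shared stretch.

    Start positions are 0-based. Uses simple dynamic programming,
    O(len(a) * len(b)) time, so it only suits pairwise checks.
    """
    best_len = best_a = best_b = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous base of both sequences.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_a = i - best_len
                    best_b = j - best_len
        prev = curr
    return best_len, best_a, best_b


# Toy sequences, not real isolates.
older = "AAGCTTGGA"
newer = "TTGCTTGCC"
print(longest_identical_run(older, newer))
```

Running this for every pair of isolates collected some number of years apart, and keeping pairs whose run exceeds a chosen length, would give the systematic list rather than hand-picked examples.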
anonymous, Your comments are in error. I have explained the acquisition of S227N many times and also explained why it was not in the index case’s sister. I have also recently pointed out that the same change was in another human sequence from Turkey. You continue to ask the same questions over and over, either as “anonymous” on this board or as GS or GSGS on many other boards, as well as in private e-mails. You also started the comments on the Karo cluster, somehow coming up with 200 differences between the Karo cluster and the Indo/05/05 which is not consistent with a published phylogenetic tree. I believe you are operating under some basic misconceptions.
You can do searches, but your interpretations of the results are at best “novel”. You were going to search your database for 25 BP matches between H5 and H1. Did you get a result? I think it is more valuable to discuss actual data than misconceptions repeated dozens of times using various handles on multiple boards. Let’s try to stay focused on data and when questions are answered, avoid asking the same question again and again (and stating that the answered questions have not been answered).
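For reference, a search for exact 25 BP matches between two sequences can be sketched as a shared k-mer lookup. This is only an illustration of the idea, not the search anonymous ran. The function name and the toy strings are hypothetical, and the demo uses k=4 because the toy strings are short.

```python
def shared_kmers(seq_a, seq_b, k=25):
    """Return 0-based start positions in seq_b of k-mers that also occur in seq_a."""
    # Index every k-length window of the first sequence.
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    hits = []
    for j in range(len(seq_b) - k + 1):
        if seq_b[j:j + k] in kmers_a:
            hits.append(j)
    return hits


# Toy sequences standing in for an H5 and an H1 gene segment.
toy_h5 = "ACGTACGGA"
toy_h1 = "TTACGGTT"
print(shared_kmers(toy_h5, toy_h1, k=4))
```

Consecutive hit positions indicate a shared stretch longer than k, so a run of hits is more informative than a single one.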
anonymous, There are MANY examples, including human and birds. I was trying to stay focused on some clear cut examples, but instead of getting a discussion of the DATA, I got a “lab error” general post with no specifics.
As indicated earlier, I think that the “lab error” explanation for 56 sequences is something that belongs in the Martian Chronicles.
anonymous, Since I was banned from Flu Wiki while trying to discuss the DATA on the Karo Cluster, it is worth repeating, since Declan Butler has since done another piece in Nature citing 21 differences between the father and the other members of the cluster. My earlier comments were based on a presentation in Jakarta that used 4 genes to show differences among members of the Karo cluster. The father had 14 differences in these 4 genes (HA, NA, M, NS?). Declan Butler has since published data which included the other 4 genes (which I had assumed were identical). The 4 genes had 7 additional changes in the father. Thus, the father had 21 differences (9 in HA), while other members of the cluster had 1–4. One polymorphism (which was rare and not in GenBank) was genetic evidence for human to human transmission as well as recombination between the H5N1 in the son and a second H5N1 in the father (because it was in both the second isolate from the son and the isolate from the father). The large number of changes in the father readily distinguished his sequence from the other family members, but the sequence was still on the same branch of the tree suggesting the H5N1 was another avian virus in Indonesia (it had nothing to do with Indo/05/05, which has a novel cleavage site as well as additional changes which place it on the lower branch, along with most human H5N1 cases in Indonesia).
Thus, the H5N1 from the father picked up 1 polymorphism from the son (via recombination) and the other 20 differences were from another H5N1 from Indonesia.
anonymous at 07:51,
Since you use anonymous to post, it is unclear if you are the same anonymous who posts here and sounds just like GSGS on the flutrackers board who asks the same questions again and again, offers to do searches, and somehow came up with 200 or more differences between the Karo cluster and Indo/05/05.
However, as I indicated to GSGS there and in many e-mails, the conservation is not limited to swine (although the swine may indeed be the reservoir).
I posted this while banned (administratively or self imposed avoidance) from this board:
“Flu evolves fairly rapidly and most of the time the acquisitions are small. The different rates of evolution are easily seen when human genes in swine are analyzed. Earlier the data of the Canadian swine was posted. Those 2003 and 2004 isolates had a human PB1. However, a blast of the 2004 PB1 from the swine matched human PB1 from the mid 1990′s. Thus, the 1990′s PB1 didn’t change much in swine, but it changes enough in humans so the matches by BLAST were far down on the list of sequences with homology.
Only the H and N were available from the 1992 swine. Most human H1N1 evolved away from the 1992 sequence. However, one sample from Brazil, isolated in 2000 was very close providing indirect evidence that the 1992 was tucked away somewhere and emerged in this patient, 8 years later. The database is short on human isolates from south America, so it is unclear how widespread this 1992 sequence was in 2000 in South America.
However, the N has regions of homology with a quail isolate from 2000. The quail sequence has 1000 BP of identity with a human isolate from 1995. That is over 2/3 of the gene and there were only 4 differences in the next 400 BP (the sequence of last few BP dropped two nucleotides).”
Thus, as shown above the quail sequence from 2000 matches the human sequence from 1995. Looks like another lab should be hauled in for questioning and charged with publishing “anomalies” that violate the in vitro data on “random mutations”.
Dr. Niman, you are correct in saying that I haven’t specifically provided a lab error hypothesis for each of the sequences that you present. Haven’t had the time, frankly, as anonymous correctly suggests. I will try to address at least some of them in the next few days. The general point still stands, there are many sources of laboratory error. But it will be hard to know which of these possibilities is true, if any, without cooperation from relevant labs, as anonymous points out.
Can we agree that these labs should explain their sequences? They should choose from the following three options: 1. Admit the Theory of Recombinomics is correct and publicly state that the flu polymerase complex is highly accurate. 2. Admit that there was a laboratory error and state what it was. 3. Provide another explanation for their anomalous results. Whichever of these three options they choose, they should also acknowledge that their published sequences are clearly recombinants.
I think mistakes of attribution are common in the flu field based on even a cursory examination of the data. I will keep adding to this thread as I find them.
Another alternative is deliberate release of a bioweapon. I have hesitated to advance this possibility, but you haven’t.
For example, this Recombinomics commentary:
Pandemic Influenza as a Bioweapon
Since the WSN/33 situation in Korea provides some valuable insight into detection and reporting of bird or human flu, and wire services are carrying stories about biologic attacks by terrorists causing a contagious disease, it is worth reviewing some of the lessons learned from the swine WSN/33 infections.
If pandemic flu is the contagious disease of choice, selection of WSN/33 at this time would offer some advantages. It is already transmissible from human-to-human, has been shown to be lethal in mice, has mutations in NA and PB2 that increase lethality, is widely available, and could be used without genetic manipulation.
As has been seen in Korea, introduction of the agent into pigs would allow it to spread almost undetected. Verification of its spread (or existence) has proven to be exceedingly difficult. Movement from swine to humans has not been reported and all reported isolates are missing the PB2 mutation. This may be due to a survival selection offered by recombining or reassorting with prevalent H9N2 subtypes. Most of the swine isolates have an avian PB2, but even the isolates that have half of a human PB2 have the 3′ half of the human gene replaced with avian sequences. Thus, the results from the Korean swine may indicate that starting with a very lethal virus has disadvantages in that a less lethal virus will emerge virtually undetected.
I remember when you discussed the anomalous WSN/33 results. The WHO was quite dismissive of your interpretation and suggested a laboratory error. I admit, I did not find their explanation very believable. However, I was also reluctant to accept the idea of a deliberate introduction of a recombinant bioweapon. But, perhaps that’s just because I don’t want to be labeled a conspiracy nut ;-) In fact, I think we should consider all possible explanations, including deliberate release. My own pet hypothesis is deliberate release of a live attenuated recombinant swine H1N1 vaccine that has been reassorting with wild type viruses. But I don’t have much evidence of this, yet.
The point I am making here is, if you thought the Korean sequences were indicative of a bioweapon, why not the Canadian sequences? I realize that WSN/33 is a well-known laboratory strain, but other strains of H1N1 could have been used as well, right?
Monotreme, WSN/33 is a lab strain of influenza and there was no doubt that the sequence reported came from a lab. The question of the relationship between the lab isolate and the pigs was raised, and never really resolved. However, other than an unpublished PB2 sequence, the WSN/33 sequences were complete genes, not recombinants.
You are correct that you have not addressed the issue of the conservation of sequences. I do not find the sequences cited as “anomalies”. In fact I think that they are accurate representations of nature, which may not meet your definition, but I don’t think the problem is with the many labs that are publishing the data.
The data from the Olsen lab did not involve one isolate that had been described previously. There were over a dozen other matches, and each gene and each isolate had different prior strains.
As I said, it would take something like 50–100 different errors, and although you are making the charge politely, I think that if your claims of lab error are correct, the Olsen lab would be in danger of losing funding support.
I don’t think that the putative errors are a real possibility. Your explanation will certainly make for interesting reading.
Anonymous indicated that he contacted the Olsen lab, and they did not seem to think that anything was amiss.
Your comments on contacting labs bring back memories of the H5N1 swine sequences. I had received a number of e-mails, suggesting the thread you started was of some import. I hadn’t read any of the posts prior to the e-mails because I thought the title was a bit much. I did scan the posts and it sounded like you thought that your comments led to the pulling of 40 H5N1 swine sequences from GenBank. I found that analysis curious, since I had tried to look at the sequences prior to the start of your thread and the sequences had been pulled, so it seemed unlikely that there was a relationship (unless you started a similar thread elsewhere earlier).
Further investigations suggested that the sequences were pulled two weeks before you started the thread, which suggested that there was no relationship.
I don’t think that the comments on these threads are leading to reviews by the labs submitting the sequences. As I said earlier, I am fairly certain that both St Jude and Hong Kong are aware of the exact match between the 2003 chicken M gene in Korea and the 1998 swine sequence from Hong Kong. The paper was published over a year ago and they have not issued a correction or changed the sequence at GenBank or Los Alamos.
Therefore, I don’t think these labs believe your concerns or claims regarding lab error are of significance.
Other labs are beginning to acknowledge the recombination. As noted previously, the submitters of the H5N1 tree sparrow sequences from China have acknowledged recombination and recent reports have cited failures to find reassortment and recombination, so I think the acknowledgement of recombination in flu is gaining some acceptance.
I will present the recombination data at the Vaccine meeting next month, and at that time will have a better idea, but the acceptance is just a matter of time.
The data are NOT lab errors and recombination is the name of the game (and random mutations rarely appear in seasonal differences commonly called genetic drift).
dr niman, im so glad to see you back at fw. my gut just tells me to listen to you and your findings, though i dont really understand or know anything about your field and usually stay in the prepping area. i have one question. could you give me an educated guess, based on your feeling about what you feel is happening, and im not holding your feet to the fire, i know it would just be an guess - do you think we will have a pandemic this fall or do you feel its further away? im prepping for so many people and your answer would help with some stress either way. thank you for all you do, and the time you give trying to educate us. i for one appreciate you. thanks
DTEX – at 11:47
“- do you think we will have a pandemic this fall”(?)
+Sounds like a New Thread. And a query that should be worked.
Here is a summary of the recent discussion as I understand it. Anonymous, Dr. Niman and I all agree there are some unusual sequences in GenBank, at least as far as conventional science goes. Dr. Niman believes that they are proof of the correctness of Recombinomics, I think they may be artifacts (in the broadest sense of the word) and anonymous thinks there must be a third explanation. We all agree that some of the swine H1N1 sequences represent recombinants.
I will work on the H1N1 swine sequences to see if I can find a specific artifact that would account for them, but it will be difficult to prove without the cooperation of the submitters.
In the meantime, let’s consider some H5N1 human sequences. Dr. Niman, I’d appreciate it if you would look at these two sequences from GenBank: AF102660.1 and AF296752.1. Both were deposited by the CDC. Both are listed as the sequence of the neuraminidase gene (NA) of the A/Hong Kong/516/97 isolate of H5N1. So, naively, I would expect the sequences to be identical. They aren’t. Instead, there are 24 nucleotides that don’t match. Now, to me, this looks like an artifact of some kind. My first thought is that one or both these sequences really aren’t from A/Hong Kong/516/97. Another possibility is that the strain was passaged so many times in culture that it picked up a large number of mutations. Either way, the differences represent an artifact, IMO.
So, what do you think? I look forward to your analysis.
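For anyone who wants to check a mismatch count like the 24 quoted above, here is a minimal Python sketch. It assumes the two records have already been aligned to the same length (for example with clustalw) and pasted in as plain strings. The function name and the toy sequences are made up for illustration and are not the real GenBank entries.

```python
def mismatch_positions(seq_a: str, seq_b: str) -> list[int]:
    """Return 1-based positions where two pre-aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        i + 1
        for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper()))
        if a != b and a != "-" and b != "-"  # ignore alignment gaps
    ]


# Toy sequences for illustration only, NOT the real AF102660.1 / AF296752.1 records.
first = "ATGAATCCAAATCAGAAG"
second = "ATGAACCCAAATCAAAAG"

diffs = mismatch_positions(first, second)
print(len(diffs), "mismatches at positions", diffs)
```

A count like this only says that the two records differ. It cannot say whether the cause is a labeling mix-up, passage in culture, or two genuinely different isolates.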
Monotreme. I don’t have much to offer in these matters so I read and mostly keep silent…but clerical errors as a reason for aberrations seem unlikely to me.
If I was a scientist, used to working with these bases and I was about to publish my results whether in Genbank or in a publication, it seems to me that I would double and triple check the sequences to make sure that they were correct…
…there must be another explanation.
Tom DVM – at 21:55
If I was a scientist, used to working with these bases and I was about to publish my results whether in Genbank or in a publication, it seems to me that I would double and triple check the sequences to make sure that they were correct…
…there must be another explanation.
Tom, I can’t believe the people who submitted the second sequence didn’t do a BLAST search or align their sequence with the original. I keep staring at the sequences, but can’t make the mismatches go away. Can’t believe they were that sloppy, but what other explanation can there be?
PS, Perhaps Dr. Niman will find a non-artifact explanation
Monotreme. I’m with you but just can’t believe professionals would be that stupid!!
Mono, Tom and Niman Is there any pattern to the things that look like ‘artifacts’?? Just thinking that if a scientist in a ‘closed’ country was trying to get a message out to you - perhaps by inserting deliberately incorrect sequence thingys he/she might be trying to alert you to something else - if so - there may be a ‘rationale’ to the values you are finding - i.e. backwards or diametrically opposed positions to them or something?? Some numerical pattern or cipher??? Or do I read too many spy novels??
gharris. Good point!! I would have never thought of it as a possibility…at this point, who knows?…this whole thing has seemed a little surreal from the start.
So it may be worth having a cryptologist look at them?
Well, in this case the people depositing the artifacts are Americans! The CDC deposited both sequences. Now, no jokes about the US being a “closed” society. Guess we’re stuck with the cognitively challenged hypothesis until Dr. Niman finds a non-artifact explanation.
So we have to find the enigma machine or the code book.
It may be Home magazine vol 5 Number 1. 1st letter of every 5th word.
Or it may be disinformation to waste our scientists time.
No, I don’t think so. If these ‘artifacts’ come from multiple unrelated sources, it will be hard to have a shared code book.
I would leave this to the scientists to figure out, not the intelligence analyst.
There is one thing for sure…with the number of dissociative thinkers on flu wiki…there will be no angle unexamined…and I do believe that is the way it should be. /:o)
Tom DVM – at 00:00 gharris – at 22:16 enza – at 23:23
OK then. May be there is something to it.
Thinking out of the box?
Would you like to start a new thread to brainstorm this? I wouldn’t want to interrupt Part 6 of this thread.
Monotreme. I think you are operating under some very basic misconceptions. You want to take field data and bend it to fit your preconceived notions. Two isolates from the same host need not be the same. In fact, that is a requirement for both reassortment and recombination. That is why viruses are cloned prior to sequencing. That is why some samples generate mixed signals.
These are two VERY basic facts of virology. Do you have any training in this area? I think your comments are quite misleading, especially to readers of this board who largely lack a scientific background.
Your assumption that an error was made if two different sequences are published from the same individual has no scientific basis, and in fact has been disproved time and time again.
Similarly, suggesting that the Olsen lab made 50–100 errors without a shred of evidence is careless at best, but it really makes meaningful discussions pretty useless.
Such comments are well beyond hand waving.
OK, I have some data now. First all H5- genes which are almost identical
although different years are given.
I’m afraid, this could give sidescrolls and “ohmuse” is too complicated,
so this was posted to curevents instead:
http://www.curevents.com/vb/showthread.php?p=486384#post486384
I’ll be doing this H5-H1 thing now for niman…(to examine how
likely this identity in BC317+Ontario/11112 was)
>niman at 07:58
>anonymous, Your comments are in error.
which exactly ?
>I have explained the acquisition of S227N many times and
I don’t remember.
Isn’t it worth to put it on your webpage and make a link ?
If someone has a link to a previous explanation
(S227N in Turkey by recombination, explanation of the
acquisition on the molecular level), I’d appreciate.
>also explained why it was not in the index cases sister.
as I remember you think this is because the virus was
grown in chicken eggs instead of dog-cells.
>I have also
>recently (http://www.recombinomics.com/News/06230603/H5N1_S227N_Turkey_Cases.html)
>pointed out that the same change was in another human sequence from Turkey.
>You continue to ask the same questions over and over,
which questions exactly ? (OK, this very question is one of them now)
>either as anonymous on this board or as GS or GSGS on many other boards,
>as well as in private e-mails.
>You also started the comments on the
>Karo cluster, somehow coming up with 200 differences between the
>Karo cluster and the Indo/05/05 which is not consistent with a
>published phylogenetic tree.
that’s why the mutation data and the tree must be wrong.
>I believe you are operating under some basic misconceptions.
>You can do searches, but your interpretations of the results
>are at best novel. You were going to search your database
>for 25 BP matches between H5 and H1. Did you get a result?
Coming soon.
I’m still aligning data with clustalw which is slow
I should have installed the other program mentioned or asked
Frenchie…
>I think it is more valuable to discuss actual data than
>misconceptions repeated dozens of times using various handles
>on multiple boards.
your choice. You needn’t answer all questions and other people
can also jump in
>Lets try to stay focused on data and when
>questions are answered, avoid asking the same question again
>and again (and stating that the answered questions have not been answered).
maybe others can post the answer, if they know it.
If not, it’s well worth being repeated.
Or could it be, that you just dislike some questions ? ;-)
>niman at 08:03
>anonymous, There are MANY examples, including human and birds.
>I was trying to stay focused on some clear cut examples, but
>instead of getting a discussion of the DATA. I got a lab error
>general post with no specifics.
this is in monotreme’s thread, pick what you like
and ignore what you don’t.
>As indicated earlier, I think that the lab error explanation
>for 56 sequences is something that belongs in the Martian Chronicles.
I don’t understand your Martian arguments, nor which 56 sequences
you mean
>niman at 08:20
>anonymous, Since I was banned from Flu Wiki while trying to discuss
>the DATA on the Karo Cluster, it is worth repeating,
good ! BTW. I opposed banning you.
>niman at 08:42
>anonymous at 07:51,
>Since you use anonymous to post, it is unclear if you are the
>same anonymous who posts here and sounds just like GSGS on the
>flutrackers board
and it doesn’t matter unless you are planning to get personal.
>who asks the same questions again and again,
>offers to do searches, and somehow came up with 200 or more
>differences between the Karo cluster and Indo/05/05.
…and who repeatedly explained to you in posts and emails
that more than 200 differences were just only the likely
consequence of the (supposedly wrong) mutation data.
And aren’t you the one, who posts links to the same
webpage again and again and now suddenly is afraid
about repeated explanations ?
Could it be, that you don’t like some questions ? ;-)
>However, as I indicated to GSGS there and in many e-mails,
>the conservation is not limited to swine (although the swine
>may indeed be the reservoir).
agreed ! That’s what I said at 07:51.
>I posted this while banned (administratively or self imposed
>avoidance) of this board:
>Flu evolves fairly rapidly and most of the time the acquisitions
>are small. The different rates of evolution are easily seen when
>human genes in swine are analyzed…
earlier I did a search on swine sequences in my database and found
a higher rate of average differences between swine sequences
than between human sequences. I don’t think that flu mutates
more in humans than in swine in average. Your examples are exceptions.
The Tennessee-thing is very special. There are only very few
such examples.
BTW. Olsen said, this couldn’t be an error of their lab., because
they just don’t work with that Tennessee-strain.
That argument presumably doesn’t hold for St.Jude…
niman at 03:10 - Two isolates from the same host need not be the same. [In] fact, that is a requirement for both reassortment and recombination. That is why viruses are cloned prior to sequencing. That is why some samples generate mixed signals.
Dr. Niman,
In an earlier thread, you stated, “Dual infections are the mainstay of H5N1 evolution.”
Also, “Dual infections produce two copies of the eight gene segments in the same cell creating two opportunities [reassortment and recombination] for change.”
Okay. At 21:50 on this thread, Monotreme wrote, In the meantime, let’s consider some H5N1 human sequences. Dr. Niman, I’d appreciate it if you would look at these two sequences from GenBank: AF102660.1 and AF296752.1. Both were deposited by the CDC. Both are listed as the sequence of the neuraminidase gene (NA) of the A/Hong Kong/516/97 isolate of H5N1. So, naively, I would expect the sequences to be identical. They aren’t. Instead, there are 24 nucleotides that don’t match. Now, to me, this looks like an artifact of some kind. My first thought is that one or both of these sequences really aren’t from A/Hong Kong/516/97. Another possibility is that the strain was passaged so many times in culture that it picked up a large number of mutations. Either way, the differences represent an artifact, IMO.
Dr. Niman, are you saying that the differences in Monotreme’s example could be accounted for by recombination? And if so, wouldn’t a dual infection have been required? Given the relative rarity of human H5N1 infection at all, wouldn’t a dual infection have been extremely unlikely back in 1997?
Also, could recombination ever take place absent a dual infection? You have probably stated your opinion; sorry if I missed it; but do you think random mutation ever takes place at all? If so, could a virus mutate within a single host and then recombine with an earlier version of itself?
Scaredy Cat,
As with the father in the Karo cluster, a person can be infected with two H5N1’s. H5N1 was first detected in Asia in a goose in 1996. The H5N1 in people in Hong Kong in 1997 include 4 internal genes that traced back to H9N2 and H6N1 serotypes. H9N2 is the most common infection in birds in Asia. Some (if not most) of these birds could be infected with two H5N1’s. A sample sent to CDC could result in the isolation of one, while a sample sent to Hong Kong could isolate another.
The dual infections have been described many times. St Jude published a PNAS paper last year on changing H5N1 over a short time (I believe from the same bird). For the Indonesian human cases, samples are sent to Hong Kong and the CDC. In most cases both locations isolate H5N1 from the same people and the sequences are virtually identical. However, in some cases H5N1 is isolated by CDC and not Hong Kong. In other cases it is the opposite. In the SARS outbreak, Beijing published differing sequences from different organs of the same person.
Recombination (and reassortment) both require a dual infection. However, the two sequences can be quite close and in fact most of the time the two sequences are quite close, which is why the recombination looks like point mutations.
Errors do happen, but they rarely appear in the isolates collected because they have to offer some advantage to be present at a high enough concentration. In most cases only one H5N1 is isolated, so if there is one copy of the “mutant” and 100 copies of the wild type, chances are that the vast majority of the time, only the wild type will be detected.
Assuming that two sequences from the same person indicates that one of the sequences is wrong really doesn’t have much of a scientific basis. If the sequence makes the virus non-functional, the odds might be increased, but defective interfering particles have been described for many types of viruses, including flu, so even in those instances, the sequence does not indicate the sequencers got it wrong.
Such proof is extremely difficult, and such efforts might be worth an e-mail, but in the vast majority of instances, the sequencers will be well aware of the discrepancy and stand by their sequence, especially if it has been public for a number of years.
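The one-mutant-per-100-wild-type argument above can be put into rough numbers. This is a minimal sketch that assumes each clone picked for sequencing is an independent draw in proportion to copy number, which is a simplification of real isolation and culture. The function name and the clone counts are illustrative, not anything stated in the thread.

```python
def prob_minor_variant_seen(minor_copies: int, major_copies: int, clones_picked: int) -> float:
    """Chance that at least one of `clones_picked` random clones is the minor variant.

    Assumption: independent picks, each proportional to copy number
    (sampling with replacement from a large pool).
    """
    minor_fraction = minor_copies / (minor_copies + major_copies)
    return 1.0 - (1.0 - minor_fraction) ** clones_picked


# One "mutant" copy among 100 wild-type copies, for a few hypothetical clone counts.
for clones in (1, 10, 50):
    print(clones, "clones:", round(prob_minor_variant_seen(1, 100, clones), 3))
```

Under these assumptions a single isolate almost always yields the wild type, and the minor variant only becomes likely to show up once many clones are sequenced.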
GSGS at 7:55, You repeat the same questions again and again under multiple handles on multiple boards. Such activity gets tiresome since you have been following me around and have been doing this for at least 6 months.
You have asked for the data, and with the phylogenetic trees, insist they are wrong because they don’t fit some “novel” interpretation of yours that indicates there are 200 or more differences between the father in the Karo cluster and Indo/05/05.
There is no data to support such a conclusion. Now both you and Monotreme are assuming that much of the data are wrong because the data doesn’t match your preconceived notions, which are at odds with those in the field.
I don’t believe that either you or Monotreme have much, if any, training in virology, based on the comments you have posted. I think your posts are quite misleading and detract from any serious discussion.
Real data is useful. Simply calling the published data wrong, without a shred of evidence, is not.
anonymous, I think you’re on to something with the idea that certain types of questions are especially irritating to Dr. Niman. The reason is if it turns out that many of the apparently odd sequences are really artifacts, well, there goes Recombinomics. This puts the submitters of these sequences in an odd position. They may not want to admit to making errors, but I don’t think any of them are willing to accept Recombinomics (except, perhaps, in Mainland China). It will be interesting to see how they respond. I intend to make a big deal about these sequences until submitters either embrace Recombinomics, admit to errors or find an alternative explanation for them.
Dr. Niman, let me repeat several salient facts. Both of the sequences were deposited by the American CDC. They were not deposited at the same time. AF102660.1 was submitted by the CDC to GenBank . AF296752.1 was first submitted by the CDC to GenBank on July 22 2002. So, your hypothesis is that the CDC stored two samples from the same patient, got two different isolates, decided to give them the same name even though they knew they were very different viruses, ie, there are non-conserved differences, ie, the proteins coded by these sequences are different. Further, when they sequenced the second isolate, they decided not to mention that it was a completely different virus than the first. They published papers concerning each of the sequences, referred to them by the same strain designation but didn’t bother to mention that in each case they were actually referring to separate viruses with very different sequences. And they did all of this on purpose. Well, gosh, I’d really like to hear that from them.
If you were to make a phylogenetic tree with the A/HongKong/516/97 neuraminidase gene, which sequence would you use? How would we know which sequence you used? Would you note that there were two completely different A/HongKong/516/97 neuraminidase genes?
again, see here
http://www.flutrackers.com/forum/showpost.php?p=20944&postcount=114
and the following posts for the explanation, why the data about the Karo mutations
as well as the phylogenetic tree are probably wrong.
Monotreme, Do you have any training in virology?
Monotreme – at 09:54
“They may not want to admit to making errors, but I don’t think any of them are willing to accept Recombinomics (except, perhaps, in Mainland China). “
Excuse me for the distraction.
Is there a reason (or history) why Chinese scientists are (more) willing to accept Recombinomics?
ANON-YYZ, Yes, they believe that DATA.
monotreme, recombinomics cannot fully explain the conserved sequences either.
We would expect to see more cases of conserved sequences then.
Everyone should keep in mind that when niman says:
“there was a dual infection in Karo” or some “polymorphism was
acquired by recombination” or “random mutations are nonsense”
or such, then this is just his opinion and theory and most scientists
don’t approve this. Also be careful, when he uses upper-case words.
He says this means that he is sure about it, but I interpret it as
this is in doubt by others.
I’m not saying niman is wrong, only that some things which he puts as truth
are very much in doubt by others. He also often states accepted truth,
but he doesn’t distinguish between these two.
(and sometimes he’s also wrong, but so am I…)
Monotreme, I just did a quick check on 516 at Los Alamos. Three groups submitted sequences and each submission has its own accession number. Same is true for NA. None of the numbers match what you have posted. I’m not sure which of the sequences you are describing are related to the sequences listed at GenBank, but as I have said previously, and as is known by most virologists, more than one sequence can come from the same patient.
The Indonesian sequence names are a bit more sophisticated, using a more detailed description of the sample name (CDC uses CDC in its numbering) and both CDC and Hong Kong use a different numbering system, which avoids reliance on the accession number to distinguish different isolates from the same patient.
As I said earlier, viruses from the field are not as co-operative as some would like. There is no one patient / one virus rule, which is why each isolate has a unique identifier, and now those identifiers are being incorporated into the name.
ANON-YYZ, China has more H5N1 than any other country, and they have active programs at many locations within mainland China that sequence H5N1. They have submitted many human, avian, and swine H5N1 sequences, although Monotreme has misrepresented those submissions by saying they don’t exist. Included in those submissions were a number of isolates from Henan province (including tree sparrows) which had obvious recombination. China believes its own data, since the recombination is real and obvious.
In submissions from the US, there is also obvious recombination, but that recombination is actively ignored. There are numerous examples in the Canadian swine. In fact, virtually all 56 genes show recombination to varying degrees (beyond the changes that look like point mutations).
In addition to actively ignoring the recombination, other labs simply withhold the data. I have noted recombination in sequences submitted by St Jude from Korea. They have additional examples that look the same as the recombinant, but the recombined portion is withheld. It remains unclear why the lab that has submitted more sequences than any other to GenBank chooses to submit partial sequences, especially when it is part of the “flu blueprint” project from NIAID, which has complete and public sequences as its goal (yet these partial sequences are not in the queue for future full sequencing). I have asked about the missing data, and did not receive a satisfactory response.
The comments on this board become more entertaining, as the cries for “where’s the data” have now shifted to “the data must be wrong” (especially since the “analysis” is being done by a couple of posters who I strongly suspect have no training in virology).
Thus, the virus sequences by the virologists are wrong because the data doesn’t quite fit the preconceived notions of a few posters on this board.
The posts are entertaining, but do not seriously address the issues at hand.
GSGS 10:38, I really didn’t look at how you came up with over 200 differences, because the number really does not fit any of the available information. I did a commentary indicating the father had 9 HA differences. Declan Butler had another source, which also indicated the father had 9 differences from the consensus. I had posted two phylogenetic trees, here and here. These two recent trees show the Karo cluster, which have 6 isolates that are close to each other, plus the father, who had the 9 changes. Both of these trees also have the other human cases from Indonesia that have the novel cleavage site. Included in that group is Indo/05/05, the only published human sequence. These two recent trees are similar to earlier trees (which have both the CDC and Hong Kong isolates, which are generally in pairs, even though they have different names, but the pairs represent different isolates from the same patient).
The data are all internally consistent, and do not show anything like 200 differences between Indo/05/05 and any Indonesian isolate, including the Karo cluster.
Thus, data from multiple sources (who actually sequenced the H5N1, including the withheld human sequences), present a very clear and simple picture, which is at odds with your “analysis”. All discussions I have seen by the virologists who are actually generating the data indicate the Indonesian H5N1 sequences (both bird and human), are easily distinguished from H5N1 outside of the area, which would be difficult if your 200 or more differences had any basis in fact.
Thus, your “analysis” is wrong on all counts.
I don’t believe you have any training in virology, based on your comments on these boards (and others). Am I correct?
yes, I never learned any virology except from these boards and online papers. You’re better in virology, I’m better in programming and logics, let’s join the skills ! But this is just a matter of logics. I never said “200”, but “200 or more”. In fact much more, but 200 sounds strange enough to doubt the data as you seem to agree. I even made a mistake, I compared with Vietnam/1194 and other sequences instead of Indo/5 . With Indo/5 it’s even worse, we even have 8 (not 6) out of 9 positions (chosen almost at random) where a mutation occurred !
Niman:
I’m still not clear on how strictly you are defining “dual infection”. In making that distinction, what value would you use for the threshold distance?
Are your analyses based entirely on phylogenetic methods, or do you also apply statistical cluster analysis?
GSGS, I agree on the programming, but disagree strongly on the logic. 200 differences is absurd. 200 or more is utterly absurd. The DATA indicate your logic is in far left field, if not out of the ball park.
I suspect it starts with major misconceptions because you have no training in virology. My guess it is the same for Monotreme. Going to virologists and asking questions based on these misconceptions can be counter-productive.
I believe you and Monotreme have just enough knowledge to be dangerous.
Racter, At the level of point mutations, a dual infection just requires two more sequences. Most of the more obvious examples I have been using, such as the swine, are many steps beyond statistical cluster analysis. When a sequence has identity stretching over multiple polymorphisms, it is a bit beyond statistics. The “point mutations” or “random mutations” are more into the statistical range, but needs a bit more work to define a bullet-proof data set.
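The idea of identity stretching across a long run of bases, like the 25 bp H5/H1 search mentioned earlier in the thread, can be sketched as a shared k-mer lookup. This is only an illustration under simple assumptions: exact matches only, no alignment, and no claim about how either poster actually ran their searches. The function name and toy sequences are hypothetical.

```python
def shared_kmers(seq_a: str, seq_b: str, k: int = 25) -> set[str]:
    """Return every length-k substring that occurs exactly in both sequences."""

    def kmers(seq: str) -> set[str]:
        seq = seq.upper()
        # The range is empty when the sequence is shorter than k.
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    return kmers(seq_a) & kmers(seq_b)


# Toy sequences for illustration only; a short k is used so the example has hits.
gene_a = "GGGATTACAGGG"
gene_b = "CCCATTACACCC"

print(sorted(shared_kmers(gene_a, gene_b, k=5)))
```

How surprising a shared stretch is depends on how similar the two genes are overall, so a hit from a search like this is a starting point for the statistics rather than a replacement for them.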
anonymous at 07:55 and possibly various other times:
Moderating these threads can be difficult enough without having to read some posts two or three times just to be sure I know who’s saying what because different people are appearing under the same handle. If you’re going to continue to participate in this thread I’m going to ask you to pick a nickname and stick to it. And I’m going to insist. If you don’t, I’ll block your participation here. I’m sorry but these threads need to be managed and I’m going to use whatever tools I have to do it.
niman:
A couple of days ago in the previous iteration of this thread you made it quite clear that you regarded your participation here as a waste of time. And you made it abundantly clear that the fault lay with everyone else but you. And yet here you are.
I’m certainly not a virologist but I can identify mockery, ad hominems and arguments from authority when I see them. I believe Monotreme answered your question about his credentials some time ago. Discussion here won’t be limited to those whose credentials meet with your approval and at this point, your insistence on knowing who’s a virologist and who’s not looks like an attempt to shut the discussion down, not further it.
Instead of worrying about protecting our poor undereducated readership (your opinion, not mine), how about you do them the credit of just laying out the facts clearly and let them read and decide where they stand for themselves?
Discuss the science or don’t. Your choice. But constantly sneering at the entire community about how entertaining we are while pretending to protect us from misinformation just isn’t acceptable. There is one credential that really matters and in this thread it’s mine. I’m the moderator and I’ll decide who participates and who doesn’t. I’ll decide what constitutes acceptable behaviour and what doesn’t.
You all go ahead and talk about the science and leave the moderation to me.
Niman:
Sorry, I don’t follow you there. Would a difference as small as that produced by a single SNP constitute “dual infection”?
pogge’s and my post crossed each other.
Thanks pogge, although personally I am not in any way concerned about any ad hominem attacks by Dr. Niman.
ANON-YYZ – at 10:23
Is there a reason (or history) why Chinese scientists are (more) willing to accept Recombinomics?
Just mainland Chinese scientists, not Hong Kong Chinese scientists. Lots of interesting things going on in mainland China re: sequences lately. It will be interesting if they officially endorse Recombinomics. But I won’t comment on this further until I’m sure that they do.
niman – at 10:41
Monotreme, I just did a quick check on 516 at Los Alamos. Three groups submitted sequences and each submission has its own accession number. Same is true for NA. None of the numbers match what you have posted. I’m not sure which of the sequences you are describing are related to the sequences listed at GenBank, but as I have said previously, and as is known by most virologists, more than one sequence can come from the same patient.
OK, let me try again. Here are the links to the two sequences (again): AF102660.1 and AF296752.1. I would ask anyone following this debate to simply click on the two links I have provided. If you read the definition line (second from the top of the white area), you will see that both records indicate that the sequences are from: Influenza A virus (A/HongKong/516/97(H5N1)) neuraminidase gene. Same source, same gene. If you do an alignment, you will see that the two sequences are very different. I’m not sure why Dr. Niman is denying this, it’s really very easy to verify.
niman at 09:14 - “As with the father in the Karo cluster, a person can be infected with two H5N1’s. H5N1 was first detected in Asia in a goose in 1996. The H5N1 in people in Hong Kong in 1997 include 4 internal genes that traced back to H9N2 and H6N1 serotypes. H9N2 is the most common infection in birds in Asia. Some (if not most) of these birds could be infected with two H5N1’s. A sample sent to CDC could result in the isolation of one, while a sample sent to Hong Kong could isolate another.”
Here you talk about dual H5N1 infection in birds. But the samples Monotreme referred to were in humans, or am I missing something? (Probably am.)
Are you saying one bird could infect one person with two different H5N1’s?
Monotreme, None of the three accession numbers at Los Alamos match the numbers you listed AF046089, AF036357, AF028708.
Dr. Niman, could you do me a huge favor and just click on the links I provided and read the definition line from the GenBank sequences? Report back what you see, please.
Scaredy Cat. The humans were infected by H5N1 from birds. The birds don’t “clone” the virus, so a dual infection in a bird can lead to a dual infection in a human (from a single source).
Monotreme is arguing that two sequences from the same person equal a sequencing error. I really haven’t compared, but it seems that at least 5 NA sequences from the same patient exist. I haven’t looked to see how many are the same or different, because I think the answer is quite irrelevant.
That’s why I raised the issue of training. The questions being asked don’t seem to have much relevance. Different labs can get different sequences. I am not sure why several sequences came from the same lab, but would guess it is a different clone. If it was a different sequence of the same clone, the old sequence would be replaced with the new sequence.
I would put this entire exercise in the “diversion category”.
I think an explanation of the putative 50–100 errors by the Olsen lab would be more instructive.
pogge, I didn’t see Monotreme’s prior response on training, but I thought it was worth revisiting because he is essentially saying sequences with extensive regions of identity are in error, which is an unusual position. He is also citing two sequences from the same person as evidence of error, which is not a view held by those in the field.
It is not really an “attack”, it is simply a request for clarification.
My original complaint about assertions that a lab has been accused of creating 50–100 errors without a shred of evidence has not been addressed.
The assertions are now being directed toward virtually all of the labs generating flu sequences, so I thought it would be useful to have a better understanding of the background of the poster launching the attacks.
Monotreme, I see 5 NA sequences from the same patient.
OK, I posted on this before, but here goes again.
I am not a virologist. However, I have published several papers in which I used viruses as tools. My co-authors were virologists. I have many colleagues who are virologists. I have alot of experience submitting and analysing sequences.
I realize you wish to personalize this issue and make it Niman vs. Monotreme rather than discuss the data. I decline to enter a mud-slinging contest.
I do think there are many errors in the flu databases. Not sure what the number is, but I will continue pointing them out as I find them.
niman – at 13:20
Monotreme, I see 5 NA sequences from the same patient.
I’m still having a hard time getting a direct answer. Do you now acknowledge that the accessions I provided were correct? And that they both came from the CDC?
niman – at 13:17
Monotreme is arguing that two sequences from the same person equal a sequencing error. I really haven’t compared, but it seems that at least 5 NA sequences from the same patient exist. I haven’t looked to see how many are the same or different, because I think the answer is quite irrelevant.
Absolutely not! I have specifically stated I do not think these are sequencing errors! Both sequences translated. They are clearly from different strains, a point I have repeated, many, many times.
Monotreme, The sequences are generally 99% identical with each other, but the sequence with the most differences from the consensus is AF296752.1. Its passage history appears to indicate it was cloned out of the sample:
note=“passage history: C2/E2”.
Monotreme, There appear to be 5 sequences from the same patient. What do you think that proves?
Anonymous – at 07:55: “I should have installed the other program mentioned or asked Frenchie…” Why didn’t you??? I still have computer time available, at home or at the office.
Monotreme, WS/33, WSN/33, and NWS/33 came from the same patient. All three sequences are different. What do you think that proves?
Monotreme, The latter sequence was a clone from the original sample. I fail to see your point.
Monotreme, I am not following your database error argument. Are the multiple NA 156 sequences supposed to have some relevance to the “errors”, or is this related to some other issue?
I am happy to address the data, but am not following your point with regard to multiple sequences from the same patient. For some reason you chose to focus on two of the five sequences, and I am not sure why that was either. Since the two sequences were from the same lab, it seems pretty clear that they thought the sequences were distinct. If the latter sequence was a correction, they would not have taken a new accession number. They would have simply issued a later, corrected version of the earlier sequence.
Well, still no admission that my accessions were correct, but c’est la vie.
I have made my point several times, but here I go again. The reason I chose the sequences I did is that they were from the same lab and had the same strain designation. Both translate. There are significant differences at the protein level. I proposed two possible explanations: 1. They mislabeled strains, i.e., one of the sequences came from a different strain or 2. The virus had been passed so many times that it had picked up a number of significant mutations. Since the authors of the second paper were making claims about the biology of A/Hong Kong/516/97, either source of error would affect their conclusions. It is well-known that passaging a virus many times can affect its biological properties due to the mutations it can pick up. The authors give no indication that they realise that they are basing their assumptions on a very different virus than the one originally identified.
Monotreme, Thanks for responding to my question about background. I am not looking to personalize the discussion. As you know, molecular biology can be quite specialized and although similar tools are used, problems can be approached from different perspectives.
The sequences at GenBank and Los Alamos are from natural hosts. These natural hosts can create conditions that are quite different from controlled laboratory settings. They can also lead to questions that most virologists wouldn’t ask, because the answer is already known.
H5N1 in birds or people can be quite a mess. There can be multiple versions in a single host. That is how the virus evolves and it does create some problems when looking at sequences. However, multiple versions in a single host are the reality. It’s not a controlled lab situation.
The database is an approximation of the natural situation. The sequencers have some control over what is and isn’t submitted. Two labs sequencing isolates from the same person offers some control, but it is not absolute. Sometimes the sequences are virtually identical. That was seen in most of the isolates from the Karo cluster. However, the presence of a second H5N1 is not a new observation, which is almost certainly what happened to the father.
In any event, I think that some of the things that you think are “errors” in the database are simply two or more isolates from the same patient, as with patient 156 from Hong Kong. Maybe you are trying to make some other point, but more than one isolate from the same patient, especially during a significant outbreak in Hong Kong in 1997, should not be a surprise.
The multiple sequences from the same patient make the conservation of the swine sequences more remarkable. As I said earlier, there would have to be 50–100 lab errors to generate the data in the Canadian swine. I really don’t see how that many errors can happen. The mislabeling of a sequence can happen, but not the number or type of errors that would be required to generate the Canadian swine data.
Similarly, isolates where the first 1000 BP are identical with four changes in the next 400 are also not likely to be lab or labeling errors, nor is the re-emergence of sequences from a decade earlier.
Flu recombines. That is its major mode of change. Most of the time it changes so often that the mechanism is hard to see, but as with the Canadian swine, or the patients in South Korea in 2002, it can be VERY obvious, and it’s not there because of lab error.
Attacking the flu database, starting with some basic misconceptions, can be quite dangerous.
Monotreme, I think they knew the sequences were different (which is why they used a new accession number) and assumed that the latter isolate was a different clone. If you are saying that the latter sequence was an artifact, it could have been generated many ways, including a dual infection in vivo or in vitro with another virus in the lab since the sequence was deposited almost 2 years later.
In any event, what does that have to do with the accuracy of the database? Is there supposed to be any relevance for the identities in the Canadian swine?
Monotreme, If you are going to use “mutations” to explain the differences between the two sequences, you have to explain why 8 differences in the later sequence relative to the consensus clustered between positions 1245–1315. Such clustering is an indication of recombination, not “random mutations”.
Monotreme, for what it’s worth - I completely and fully see your point. The links work just fine, it’s logical and clear. I’m waiting for the answer as well. I have no emotional or intellectual investment in what that answer “should” be. There is an explanation somewhere, and I simply wanna know what that is.
niman@03:10, “Two isolates from the same host need not be the same.”
Maybe this is the answer, but it still doesn’t make sense. This may certainly be true; but then, the isolates wouldn’t be treated as if they were the same. And it’s strange to try to imagine that the investigator thought that they were the same and missed it that they differed. It defies my definition of scientific investigation and core competency at the professional level.
Further, “Ib fact (sic), that is a requirement for both reassortment and recombination.” Yeah. A characteristic of viral evolution, not an explanation for it.
“That is why viruses are cloned prior to sequencing.” It’s cloned to create a copy of the original. In the event of mistakes and anomalies, you can go back and double-check.
“That is why some samples generate mixed signals.” The mixed signal would be identified, it would be of interest, and it would be subject to further investigation. It wouldn’t be disregarded. The fact of the difference, in and of itself, is not also the explanation.
In my opinion, such as it is. But then, I’m one of the dimbulbs that ask stoopid questions. :]
Monotreme, Here are more details on the clustering. If you compare all 5 NA sequences from patient 516 you will find 17 polymorphisms that are unique to AF296752.1. 9 are located in the first 1233 positions. The other 8 are in the next 60 positions. The roughly 20 fold increase in polymorphism density is a recombination indication. The distribution of the polymorphisms is far from “random”.
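For anyone who wants to check the arithmetic behind the clustering argument, here is a minimal Python sketch. It uses only the counts quoted in the post above (9 unique polymorphisms in the first 1233 positions, 8 in the next 60). The function names and the "uniform scatter" null model are assumptions made for illustration, not something taken from the sequences themselves. The 60-position window was picked after looking at the data, so the probability is an optimistic bound, and a small value only says the changes are clustered, not what caused the clustering.

```python
from math import comb


def density(count, length):
    """Polymorphisms per position in a stretch of the given length."""
    return count / length


def prob_at_least(k, n, p):
    """Binomial tail: chance that at least k of n changes land in a window
    that each change hits independently with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Counts quoted in the post (not re-derived from GenBank here).
first_count, first_len = 9, 1233
second_count, second_len = 8, 60

# Ratio of polymorphism density in the 60-position stretch to the first 1233.
fold = density(second_count, second_len) / density(first_count, first_len)

# Assumed null model: all 17 changes scattered uniformly over 1293 positions.
total_changes = first_count + second_count
p_window = second_len / (first_len + second_len)
p_clustered = prob_at_least(second_count, total_changes, p_window)

print(f"density ratio: {fold:.1f}")  # about 18.3
print(f"P(>= 8 of 17 in the window | uniform scatter): {p_clustered:.2e}")
```

With these counts the density ratio comes out at roughly 18, which is in line with the "about 20 fold" figure in the post.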
glo, Virologists would not consider them to be the same. They each have a different accession number and as noted, the most recent has all of the markings of being a recombinant. Also as noted, the recent human isolates from Indonesia have a different sample number to avoid such confusion.
glo, In addition, more recent isolates from the same lab have a number, followed by a dot. Compare sequences of A/Ck/HK/31.2/2002(H5N1) with A/Ck/HK/31.4/02(H5N1). Although both came from chicken 31, the sequences are quite different, and in fact are the parental strains for several obvious recombinants also isolated in Hong Kong in 2002, as described in Copy Choice Recombination and Uses Thereof. The observations on the recombination were made in 2004.
These comments on labeling are all re-inventions of the wheel. The same host can have multiple versions and such isolates should be given unique names as well as unique accession numbers.
Wading into a database of sequences not in your field can be dangerous.
Dr. Niman, a modest friend has pointed out to me one possible source of confusion. I am referring to the following strain: A/Hong Kong/516/97(H5N1). Sometimes you seem to understand this, and sometimes you mention a completely different strain A/Hong Kong/156/97(H5N1). Let’s make sure we are talking about the same strain.
The differences between the two sequences I identify are significant, at the protein level. Whether they are due to error in attribution, mutation during passage or recombination, failure to note the differences is odd. And I will repeat my point about phylogenetic trees. How can you create a tree with A/Hong Kong/516/97 without acknowledging that this includes two very different viruses? How can one talk about the biological properties of A/Hong Kong/516/97 without mentioning that there are two very different viruses with the same subtype designation?
glo, I’m pretty much where you are. I don’t have a horse in the Recombinomics vs. Conventional Science race. I just want to know the truth.
Monotreme, I was following you in the other thread. It wasn’t about viral theory; it was recognition of stuff that didn’t add up. Now your original insight and the question that you called has come full circle to whether or not recombinomic theory is responsible. I’m still with you and still wondering…
niman@03:10 “Your assumption that an error was made if two different sequences are published from the same indovidual has no scientific basis…” This was not the assumption, nor the question.
niman@13:46 - “The latter sequence was a clone from the original sample.” Then, why isn’t it identical? This is a very definitive statement.
@13:53 - “Since the two sequences were from the same lab, it seems pretty clear that they thought the sequences were distinct.” 7 minutes earlier you stated (without qualification) the latter sequence was a clone. And how is it clear that they thought the sequences were distinct?
@14:45 - “I think they knew the sequences were different (which is why they used a new accession number) and assumed that latter isolate was a different clone.” Now it’s “a different clone”. A clone of what sequence?
“If you are saying that the latter sequence was an artifact, it could have been generated many ways, including a dual infection in vivo or in vitro with another virus in the lab since the sequence was deposited almost 2 years later.” This IS the question.
You make definitive statements with no qualifying language (i.e. I think, suggest, guess, my opinion, etc). I interpret this as unconditional, empirical knowledge of the information. Were you the investigator on that study and is this your work?
concerning this different-viruses-with-one-infection-idea :
In Karo this would mean, (this is hypothetical)
that the index-case got (at least) two strains,
then passed both to the son and other members in the cluster.
Then the son also passed both to the father.
The “father”-strain (let me call it thus, although the father had both strains)
was apparently much rarer and by some coincidence
it was isolated in the father-sample while in all the other samples the other
strain was isolated, although both were present in each case.
This scenario looks a bit unlikely, but maybe possible, I don’t know.
It seems that we need no recombination for this scenario.
Did we observe such things earlier with H5N1 ? I never heard this.
Monotreme, The mixing up of the numbers really doesn’t change much, other than to reduce the number of different sequences per person. Patient 156 has at least three different H5N1 which are closely related to each other. Patient 516 has one H5N1 that is close to the three H5N1’s from 156, and one that is different (17 differences with 8 clustered in a 60 BP region). Thus, it would seem that the unique virus arose via recombination, either in the patient, or in the lab (I would suspect the former).
In any event, the paper describing the isolates uses the earlier accession number, which is the sequence close to the other sequences, so the 1999 Virology paper really has nothing to do with the later sequence (even though both sequence characterization sheets link the 1999 paper, which also has information about the 516 patient).
Thus, if you wanted to create a phylogenetic tree reflecting the sequence described in the Virology paper, you would use the first sequence because that is the correct accession number.
If you wanted to make a phylogenetic tree with both sequences, you would either just use the accession number, or the name with an asterisk. For the Karo cluster, the second sequence from the nephew has the same number, followed by letter “b”. For the human sequences from Indonesia, CDC and Hong Kong are now using different sample numbers, even though the patient is the same. As noted earlier, the Hong Kong chicken uses a number followed by a dot and another number representing the clone.
Thus, the 1999 name merely presented some discussion topics for this board. Giving different isolates from the same patient a different sample number creates less confusion, and the system used in 1999 has changed. You can look at H5N1 isolates to see when the changes were more widespread. As indicated, bird sample numbering in Hong Kong changed in 2002. For Indonesia, human sample numbering changed in late 2005. Indo/5 is Indo/5 for isolates from Hong Kong and the CDC. The public sequence is from the CDC and I don’t know if the Hong Kong sequence is the same (unless I dig out an earlier tree with both samples).
In any event, there really is no error in the database or in the publication describing the isolates. I suspect most phylogenetic trees would use the earlier isolate, since that is the sequence that has been published.
The truth is in the sequences in GenBank and Los Alamos. It takes a little digging to get the publication and pull up the accession number to find out which is which, but anyone who sees two full sequences from the same patient and same lab but with different accession numbers should assume the sequences are different.
None of the above is an indication that there is an error in the database or the publication, which lists the accession number used.
As far as the sequences with identity but collected years apart, there still has been no explanation as to why such sequences are in error, or how the Olsen lab could have created 50–100 errors.
glo, I have training as a virologist and know how the data is generated. I am quite familiar with the flu database and know how to read a scientific publication and know how accession numbers are assigned and how they are used by those who want to make productive use of the associated sequences. I have explained it above.
There was no error and two sequences were obtained from the same patient. The second sequence was from a second H5N1 that was cloned from the clinical sample. The first sequence was described in the linked Virology paper from 1999. In all likelihood, the second sequence was a recombinant in the patient, who was infected with at least two distinct H5N1’s (the recombination that created the second sequence could have been in the patient, the bird that led to the infection of the patient, or another host that led to the infection of the bird). The only thing that is clear is the fact that the second sequence is a recombinant (and not described in any paper linked to the characterization sheet of the sequence).
There was nothing unusual and no errors were made. Sometimes I wonder if these nuances are left in place to keep outsiders confused. Clearly, it has that effect.
No, I was not in the lab, and have not asked for clarification. The data appear to be quite clear. If you have any information (Data) that would have a bearing on any of the above, please post it.
GSGS, There really is no reason to put the second H5N1 in anyone but the father. The H5N1 from the father had 21 polymorphisms not in the consensus. Only one was in his son. Thus, the father’s H5N1 was mostly a second H5N1 with one polymorphism acquired from his son. The father had one H5N1 from his son, and a second distinct H5N1. The H5N1 isolated from the father had the 20 unique changes and one shared with his son.
In general, your scenarios are far more complicated than required to explain the data. I suspect that is how you came up with 200 (or more than 200) changes. You should review your approach toward data analysis when the results you generate have no relationship with reality (and you should not assume others use your logic).
anonymous at 16:38:
If you didn’t see my earlier post, please scroll up and find it. One more time: please pick a name.
from whom did the father get the 2nd strain ?
And how did that 2nd strain “know” about the first strain in that it just
differed at 9 positions in HA from it, 8 of which were among those
where the first strain differed from indo/5 ?
I mean, take the set of differences-positions 1st strain to 2nd strain
and compare it with the set of differences-positions 1st strain - indo/5 .
You get an amazingly high cardinality of this intersection set. It’s 8 ,
while the first set had only cardinality 9 and the 2nd set is sparse
in the set of all positions !
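The set comparison described above can be sketched in Python. The positions below are made up for illustration; the real ones would have to come from an alignment of the two Karo HA sequences and Indo/5. The HA length of 1700 and the 30 differences to Indo/5 are assumed numbers, and so is the hypergeometric null model, which treats the 9 positions as if they were picked at random along the gene.

```python
from math import comb


def hypergeom_at_least(k, total_positions, marked, drawn):
    """Chance that at least k of `drawn` randomly chosen positions fall among
    `marked` special positions out of `total_positions`."""
    upper = min(marked, drawn)
    hits = sum(
        comb(marked, i) * comb(total_positions - marked, drawn - i)
        for i in range(k, upper + 1)
    )
    return hits / comb(total_positions, drawn)


# Hypothetical difference positions, 1st strain vs 2nd strain (9 of them).
diffs_first_vs_second = {101, 250, 377, 512, 640, 803, 951, 1102, 1400}

# Hypothetical difference positions, 1st strain vs Indo/5: 8 shared with the
# set above plus 22 unrelated ones, 30 in total (an assumed count).
diffs_first_vs_indo5 = (diffs_first_vs_second - {1400}) | set(range(20, 1560, 70))

overlap = diffs_first_vs_second & diffs_first_vs_indo5
ha_length = 1700  # assumed approximate HA length in nucleotides

p_overlap = hypergeom_at_least(
    len(overlap), ha_length, len(diffs_first_vs_indo5), len(diffs_first_vs_second)
)
print(f"overlap: {len(overlap)} of {len(diffs_first_vs_second)} positions")
print(f"P(overlap this large by chance): {p_overlap:.2e}")
```

Under these assumptions an overlap of 8 out of 9 would be very hard to get by chance, which is the point of the question. The calculation says nothing about which biological explanation produces the overlap.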
sorry pogge, feel free to delete my last posts. I go to bed now. And please include a privacy-policy page to the forum rules.
Monotreme, it looks like we’ve arrived at one explanation. A summary of what Niman seems to be saying: AF102660.1 was initially isolated from one sample of 1997 Patient X. Then a few years later, the same sample was revisited (or a clone of that sample) and AF296752.1 was isolated. Investigators assume “recombinant” of two distinct H5N1s through co-infection of 1997 Patient. “Nothing unusual”, no big deal, so they file that sequence with GenBank as well, without comment or addendum.
According to this, it all boils down to familiarity with the system… “I wonder if these nuances are left in place to keep outsiders confused. Clearly, it has that effect.”
Yup, sure does make it confusing for all of us “outsiders”. So what’s the agenda, with days of convoluted argument and doublespeak, when an explanation could be presented in a single succinct paragraph?
anonymous, Everyone who regularly reads this forum assumes you are GSGS. Why don’t you just post with that handle? GSGS is the only poster who thinks the data points to 200 differences, who complains about formatting of the recombinomics PCT, who offers to do searches, and who complains about how long it takes for clustalw alignments. If you are not GSGS, then using that handle will be a perfect cover.
The phylogenetic tree shows that all members of the Karo cluster, including the father, were on the same branch, indicating the two H5N1’s were relatively closely related (as in being from the Karo area). The son and index case were infected with related H5N1’s, but not as related as the various family members H5N1’s were to each other, because they just had the H5N1 from the H2H infection.
glo, To get the explanation required a fair amount of work. I really didn’t want to do it because I didn’t think it would show anything new. It didn’t, but did allow for a step by step explanation. I misread the original number. 156 is the most discussed patient, so I thought that was who Monotreme was talking about, but the accession numbers didn’t match. I aligned the 5 sequences (from 156 and 516) with clustalw and it was clear that 4 of the 5 sequences were very similar. Then when I read Monotreme’s comments on the mischaracterization in the Virology paper, I had to retrieve the paper and look at the data presented, as well as the accession number used in the table describing the various isolates. I did a blast on the “novel” sequence to see where the “mutations” originated. When I got to the end of the sequence, it was clear that it was a recombinant because all of a sudden it went from an occasional difference with the closest sequences (9 changes in the first 1200 BP) to a dramatic difference (8 changes over 60 BP). Then I went back to the clustalw to add up the differences to show how they were clustered.
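The counting step described above (line up the sequences, find the changes unique to one of them, then see how tightly they bunch) can be sketched as follows. This is a toy, not the actual analysis: the sequences are invented 40-base strings rather than the real NA accessions, the names seq_A to seq_E are placeholders, and it assumes the alignment has already been produced by a tool such as ClustalW.

```python
def mutate(seq, changes):
    """Return a copy of seq with 1-based positions replaced as given."""
    bases = list(seq)
    for pos, base in changes.items():
        bases[pos - 1] = base
    return "".join(bases)


def unique_differences(aligned, target):
    """1-based positions where `target` differs from all other sequences
    and those other sequences agree with each other."""
    others = [seq for name, seq in aligned.items() if name != target]
    positions = []
    for i, base in enumerate(aligned[target]):
        column = {seq[i] for seq in others}
        if len(column) == 1 and base not in column:
            positions.append(i + 1)
    return positions


def densest_window(positions, window):
    """Largest number of positions that fit inside any window of this size."""
    best = 0
    for start in positions:
        count = sum(1 for p in positions if start <= p < start + window)
        best = max(best, count)
    return best


reference = "ATGC" * 10  # invented 40-base "consensus"
aligned = {
    "seq_A": reference,
    "seq_B": mutate(reference, {10: "C"}),  # one change of its own
    "seq_C": reference,
    "seq_D": reference,
    # One early change, then four changes bunched near the end.
    "seq_E": mutate(reference, {5: "G", 33: "C", 34: "A", 36: "T", 38: "G"}),
}

unique_to_e = unique_differences(aligned, "seq_E")
print(unique_to_e)                      # [5, 33, 34, 36, 38]
print(densest_window(unique_to_e, 10))  # 4
```

On real data the window size and the decision about what counts as "clustered" are judgment calls, so the counts are best reported alongside the raw positions.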
With that DATA in hand, it was pretty easy to come up with the scenario on what was in the publication, why the second sequence had its own accession number, and why there were two sequences from the same lab at GenBank.
That took a couple of hours, but I thought it was useful to show that there were no errors in the database. I already knew about changes in names, so I could use them as examples of what had happened with the two H5N1 sequences in question.
Thus, a simple explanation was possible when all of the data was in place, but it took a couple of hours to go acquire the data and maybe move the discussion forward to the swine sequences and away from the attacks on the database, which are more due to unfamiliarity than a series of errors by the sequencers.
That is not to say that there are no errors in the databases or published papers. However, most of the time, what you see is what you get, and if the data is at odds with your perceptions, an examination of your perceptions may be in order.
Anonymous can’t respond. In fact he can sleep in tomorrow as late he wants. If he wishes to reach me by email, he can look for my name in the small print at the bottom of the left sidebar.
Well, perhaps a little clarification is in order. I have been talking about A/Hong Kong/516/97 from the beginning of this part of the thread. I think I have been pretty clear about this. I really don’t understand why Dr. Niman began comparing the sequences from this strain with sequences from A/Hong Kong/156/97. I provided the two accessions I was making comments about several times. Again, I’m not sure why this caused any confusion for Dr. Niman.
After many hours, we are now back to where I started. The CDC deposited two sequences for the NA gene for A/Hong Kong/516/97, several years apart. These sequences are significantly different. The question is: why? I posited that it is because of an error of attribution or due to mutations from repeated passages in the lab. To my surprise, Dr. Niman seems to be implying that this patient was infected with two different H5N1 strains. The question for board members is: given the relatively few people who have been infected with H5N1, how believable is it that a patient was infected with two quite different strains of H5N1? How often do people think this happens?
Monotreme – at 19:23 Considering the Karo cluster, it seems clear that there were host-specific reasons that the Ginting family was susceptible. Supposing the local vector, not a chicken, was carrying several closely related but different strains of H5N1. These strains were 99% the same. The Ginting’s susceptibility was tied up in the parts that were the same, not the parts which were different, so they were susceptible to both strains. Since the mystery host carried both, then Ginting caught the second one at some point. I just don’t see H5N1 being homogenous and clonal.
Sorry, the above was mine.
Nope, nothing unclear about it. Links worked, commentary sound and easy to follow. Question was obvious and should have been easy to answer, considering that this particular explanation is a second recombinant isolate, demonstrating coinfection in 1997 and how unremarkable that is.
“Anyone who sees two full sequences from the same patient and same lab but with different accession numbers should assume the sequences are different.”
“…the second sequence is a recombinant (and not described in any paper linked to the characterization sheet of the sequence).” It doesn’t follow that this would not be of interest. Frankly, the whole thing would be of critical interest.
Monotreme, Unfortunately, you still seem rather stuck on dual infections, which are quite common in birds, which then infected the people in Hong Kong. A dual infection in a person does not require two separate infections, just one infection by a dually infected bird. The H5N1 situation in Hong Kong in birds in 1997 was a mess. All birds (1.5 million) in Hong Kong were killed in 1997.
You are once again asking the wrong questions. The correct question is how likely is it that a H5N1 positive bird in Hong Kong was dually infected in 1997.
I gave you the answer for Hong Kong in 2002. Read the PCT linked above. The parents and recombinants are spelled out quite clearly.
However, why is 1997 an issue? The second sequence wasn’t described in the Virology paper, as you mistakenly posted. Those who want to generate phylogenetic trees use accession numbers and now you should have a better idea how to use the sequences in the database.
There are no errors in the two sequences and if you think the clustered polymorphisms mean random mutations in the lab, there isn’t going to be any more DATA on 1997 cases to argue one way or another. It really matters little if the lab strain was dually infected in the lab or the patient. You can’t prove one or the other, and the Olsen data involve a dozen or more sequences recombined with the 2003/2004 swine sequences.
Still waiting for an explanation of the 50–100 errors by the Olsen lab. I’m really not following your fixation on the two 1997 sequences, when there is no indication of lab errors and the cause of the changes really can’t be answered.
What exactly is your point?
It is 2006. You are ignoring DATA which are quite clear.
glo, There are many examples of recombination, including the swine sequences that are not addressed. Virtually all 56 sequences in the Canadian swine are recombinants. There is no mention of recombination in the peer reviewed papers by the Olsen group.
Those generating the sequences have no interest in recombination. Their grants are to study reassortment and random mutations.
Monotreme at 19:23
As I came down the thread reading the contents I thought Niman must believe there is an asymptomatic H5N1 infection at work since he clearly implied there were two different strains. But I thought the idea of widespread asymptomatic H5N1 was considered unproven?? If he’s right and there are two strains I suppose one is asymptomatic or so very mild no one noticed it.
My question is if two strains are present would one test pick both strains up?
Leo7, Monotreme is discussing human H5N1 in Hong Kong in 1997. Both H5N1’s at GenBank / Los Alamos are the 1997 strain, which had a 19 aa deletion in NA. This strain was in birds and people in 1997 but that strain was pretty much eliminated when Hong Kong killed all of the birds (1.5 million) in 1997. The human cases in 1997 were firmly linked to birds, not other people. Since 1997 there has been no NA reported with the 19 aa deletion, which is the sequence Monotreme is discussing.
There is very little relevance between the NA in 1997 and the H5N1 causing problems in 2006. The NA in Indonesia, China, Vietnam, Thailand, and the Qinghai strain in wild birds all have a 20 aa deletion, which overlaps the 19 aa deletion, but is at a slightly different location and first appeared in 2003/2004.
WetDirt – at 19:45
Considering the Karo cluster, it seems clear that there were host-specific reasons that the Ginting family was susceptible. Supposing the local vector, not a chicken, was carrying several closely related but different strains of H5N1. These strains were 99% the same. The Ginting’s susceptibility was tied up in the parts that were the same, not the parts which were different, so they were susceptible to both strains. Since the mystery host carried both, then Ginting caught the second one at some point.
I’m avoiding commenting too much on the Karo cluster until the sequences have been deposited in GenBank. As you’ve probably noticed, I think errors of attribution are quite common, and in pre-published data, they are even more common.
You do touch on the crux of the problem: are people being infected with multiple strains of H5N1 at the same time? Dr. Niman, and you, would suggest yes. I find this hard to believe, especially if it is to occur many times.
I just don’t see H5N1 being homogenous …
Neither do I. I think there are multiple strains of H5N1, I just think the odds that two different strains would infect the same patient are absurdly low.
niman – at 20:07
Monotreme, Unfortunately, you still seem rather stuck on dual infections
Yes, I am.
A dual infection in a person does not require two separate infections, just one infection by a dually infected bird.
The implication is that one exposure resulted in infection with two different strains of virus. I’m not at all sure that the odds of this happening are high.
However, why is 1997 an issue? The second sequence wasn’t described in the Virology paper, as you mistakenly posted.
Well, actually there were two papers. The Virology paper was cited in the first accession, http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=4324297?. They could hardly have cited the second accession as it hadn’t been deposited yet! The Virology paper was published in 1999. The second accession wasn’t deposited until 2002. The second accession, AF296752.1, cites two papers, the Virology paper, which is linked to AF102660.1 and Katz et al., unpublished. The second paper has since been published: Molecular changes associated with the transmission of avian influenza A H5N1 and H9N2 viruses to humans. People interested in this debate can click on the accession for AF296752.1 and look at the reference section. They will note that both papers are cited. Now, if the authors of the second paper were aware that there were two different viruses, why did they cite the first paper but not note that it referenced an entirely different virus?
Although I don’t claim infallibility, people can decide for themselves who is making mistakes today.
There are no errors in the two sequences and if you think the clustered polymorphisms mean random mutations in the lab, there isn’t going to be any more DATA on 1997 cases to argue one way or another.
Actually my first choice was error of attribution; lab adaptation was my second choice.
It really matters little if the lab strain was dually infected in the lab or the patient.
Are you kidding me?! It matters a great deal if this was a lab error or if a patient was dually infected. This is my whole point!
Still waiting for an explanation of the 50–100 errors by the Olsen lab.
Whoa, now we’re up to 100 errors? They keep mounting, in your estimation. I really do wish NS1 would graph all of your assertions, it would save me a lot of time. I am building a case for lab errors, and frankly, sorting out H5N1 infections in humans is a higher priority for me than swine H1N1 sequences.
I’m really not following your fixation on the two 1997 sequences, when there is no indication of lab errors and the cause of the changes really can’t be answered.
Sure they can be answered, and should be by the CDC.
What exactly is your point?
I think people are making mistakes in the attribution of sequences to strains. You interpret these as proof of a radical new Theory that ignores 70 years and 100s of papers demonstrating that mutation rates in flu viruses are high.
It is 2006.
Well, I agree with that ;-)
You are ignoring DATA which are quite clear.
Uh, I don’t think I’m ignoring DATA, I’m just interpreting it differently than you are.
Correct link for http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nucleotide&val=4324297?.
I confess to making formatting errors quite frequently ;-)
Third time is the charm.
I’m still stuck on: What, exactly, is meant by “dual infection”?
But I’m close to giving up hope of ever getting an answer.
Errors in GenBank deposits
The issue of whether people depositing sequences could be making lots of mistakes has arisen. Although I am *not* a virologist, I do spend most of my time on my day-job looking at sequences from GenBank. I have a shock for you. There are lots of mistakes. Huge numbers of them, actually. Blindly relying on the accuracy of these sequences has cost certain companies huge sums of money. The ones that survived all have their own quality control checks, now.
In my own work, the first thing I do is to create a clean database. This requires creating filters to identify problematical sequences. I have found that certain sequencing centers are extremely unreliable while others almost always generate good data, although even the good centers make mistakes sometimes. Some centers make specific types of mistakes that can be corrected for. I don’t have enough data yet to determine if certain centers are particularly unreliable with regard to H5N1 sequences, but it would not surprise me if that turns out to be true.
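To make the idea of a "clean database" filter concrete, here is a minimal Python sketch. It is not the filter set described in the post above: the record layout, the `center` label, and both thresholds (`min_len`, `max_ambiguous`) are assumptions chosen purely for illustration.

```python
from typing import NamedTuple


class Record(NamedTuple):
    accession: str  # e.g. "AF102660"
    center: str     # hypothetical label for the submitting center
    sequence: str   # nucleotide sequence as deposited


UNAMBIGUOUS = set("ACGT")


def ambiguous_fraction(seq: str) -> float:
    """Fraction of characters that are not a plain A, C, G or T."""
    if not seq:
        return 1.0  # treat an empty sequence as fully unusable
    seq = seq.upper()
    bad = sum(1 for base in seq if base not in UNAMBIGUOUS)
    return bad / len(seq)


def is_clean(rec: Record, min_len: int = 200, max_ambiguous: float = 0.01) -> bool:
    """Keep a record only if it is long enough and has few ambiguous bases.

    Both thresholds are illustrative assumptions, not published criteria.
    """
    return (len(rec.sequence) >= min_len
            and ambiguous_fraction(rec.sequence) <= max_ambiguous)


def build_clean_db(records):
    """Return the subset of records that pass the filter."""
    return [rec for rec in records if is_clean(rec)]
```

A real pipeline would add further checks (for example per-center error profiles, as the post suggests), but even two crude filters like these remove records that are too short or too noisy to compare.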
Monotreme, I work in a regulatory agency reviewing submissions and trying to detect when we are being flimflammed, and whether the proponents can prove their case with the data they present. It seems to me that part of the problem with detecting multiple infections is that only one sample is being taken from patients. If you only analyze one sample, then you think you know everything. As a rule of thumb in my field, we do about 10% duplicate samples. This identifies lab errors, sampling errors, and inherent variability in the process. It is interesting that in the medical field, I never see duplicate samples. In my entire life I can’t recall a single lab test that was obtained in duplicate for quality control. So of course you will miss dual infections; your methodology is biased against them. Same thing with reviewing the article on random mutation: It was pretty clear that the authors were not doing a broad investigation; they set up the experiment to reflect what they thought they knew. They were not testing multiple working hypotheses; in fact, they don’t even mention them. I see this a lot in instances where labwork is very expensive and laborious. I hope the rest of the literature is better than the one I reviewed. It’s easy to not see things you are not looking for. I get the impression that most virologists don’t actually look at the sequences for patterns; there isn’t any reason to if you are convinced the variability is random. It would be an indication of some humility, as well as intellectual honesty, to constantly question even basic assumptions to make sure nothing is being overlooked. So far, I see a lot of inherent bias in the way this work is being done, which doesn’t make me too confident that nothing is being overlooked.
wetDirt – at 00:40
It seems to me that part of the problem with detecting multiple infections is that only one sample is being taken from patients. If you only analyze one sample, then you think you know everything. As a rule of thumb in my field, we do about 10% duplicate samples. This identifies lab errors, sampling errors, and inherent variability in the process. It is interesting that in the medical field, I never see duplicate samples.
I agree that duplicate, or even triplicate, samples should be taken from every patient, sequenced and deposited in GenBank, pronto. And I don’t deny that dual infections can occur. I’m quite sure they do with viruses that are infecting large numbers of individuals. This would be true of H3N2 in humans and H5N1 in ducks, but not of H5N1 in humans. My major concern is dual infection of H5N1 with H1N1 in pigs or humans and H5N1 and H3N2 in humans.
Same thing with reviewing the article on random mutation: It was pretty clear that the authors were not doing a broad investigation; they set up the experiment to reflect what they thought they knew. They were not testing multiple working hypotheses, in fact, they don’t even mention them.
Actually, I think the paper you are referring to went to a lot of trouble to rule out dual infections. But, as I mentioned before, no one paper is perfect. You have to look at a complete body of work to get a sense of what’s true. There are literally hundreds of papers demonstrating that random mutation is common. They have convinced me that this is true. I’ve also examined a number of sequences which convince me that H5N1 rapidly mutates, in humans. But, as I’ve said before, I’m not really trying to convince anyone that Recombinomics is wrong. What I’m trying to do is explain why I think it’s wrong and to present, as best I can, the Conventional Science view of flu evolution. I took on this task because it is just too weird for all the flu boards to present one, extremely fringe, view of flu evolution. That doesn’t mean Recombinomics is wrong, but folks ought to know what they are signing up for. And if the Conventional Science view is never presented, except with derision, how can they make a rational choice?
I see this a lot in instances where labwork is very expensive and laborious.
That’s an accurate criticism. Sometimes controls that one would like to do are not done because of lack of money.
I hope the rest of the literature is better than the one I reviewed.
Remember the paper you are referring to was published 20 years ago. There have been many papers published since then that came to the same conclusion.
It’s easy to not see things you are not looking for.
Very true. But I would argue that that may apply to non-conventional scientists as well ;-)
I get the impression that most virologists don’t actually look at the sequences for patterns
The NIH is looking for patterns. But first they have to get the sequences. Many people, including conventional scientists, have been desperately trying to get the sequences out of the CDC for a long time, precisely so they could look at patterns.
…there isn’t any reason to if you are convinced the variability is random. It would be an indication of some humility, as well as intellectual honesty, to constantly question even basic assumptions to make sure nothing is being overlooked. So far, I see a lot of inherent bias in the way this work is being done…
I guess I’ll just have to disagree with you about this. I think the evidence that flu viruses mutate at a high rate is overwhelming. As I’ve mentioned before, I have no stake in the debate. I am truly a neutral entity.
…which doesn’t make me too confident that nothing is being overlooked.
Oh, I’m quite certain important patterns have been overlooked. But please consider the possibility that if you look at all of the data through the prism of Recombinomics you may miss something that no-one has commented on, yet.
sorry, reading and sleepy…)
Monotreme, I am still not following your comments on links to the literature. I did dig out the second paper, which was on H5N1 and H9N2. In 1997, the H5N1 isolates had internal (non-glycosylated) genes from H9N2. The 2002 paper you referenced went through an analysis of the genes from H5N1 (and H9N2), including sequences from the patient at issue. The paper included phylogenetic trees for PB1, PB2, PA, NP, M, and NS. These are the six internal genes, and accession numbers for each sequence were given in a rather large table listing 20 isolates and associated accession numbers. For several isolates, two accession numbers were used (including patient 516). However, the two accession numbers for 516 were for the NP sequence, so I assume that two different NP sequences exist for H5N1 from that patient.
Since the paper was on the 6 internal gene segments, there were no phylogenetic trees on HA or NA. I glanced through the paper and did not see anything about individual NA sequences and don’t really see why they would be talking about NA, since H5N1 has N1 and H9N2 has N2. Since the N involves two different serotypes, there would be no reason to be doing comparisons via phylogenetic trees, and the paper was focused on internal genes. Thus, I still don’t see any evidence of discussions on the second sequence. Maybe I missed it in the 2002 paper, but the paper has 1 figure and 3 tables and all are on the six non-glycosylated genes, as indicated in the abstract. NA and HA are both glycosylated, so I doubt that there is much discussion, but there clearly is no NA phylogenetic tree, which was one of your concerns.
Moreover, since they reference two accession numbers for NP, it seems likely that there are two distinct NP sequences from this patient, which again would suggest that the authors were well aware of the differences in sequence in both NA and NP and feel that two or more H5N1’s were circulating in this patient. For the later isolate, they give its cloning history on the characterization sheet at GenBank and this is also indicated on the sheet at Los Alamos, so Los Alamos appears to appreciate the fact that there are two sequences and that their cloning history is notable, which again would indicate that they were aware of the fact that two NA sequences were obtained. As I said earlier, if the second sequence corrected an error in the first, they would not get a new independent accession number, but instead would modify the old accession number (as dot 2), to avoid the confusion created by two sequences with unrelated accession numbers.
Thus, it would appear that the virologists who generated the two sequences know more about their sequences than you do and they are well aware of the fact that the two NA and two NP sequences are different. If I get a chance I will compare the two NP sequences to see how different they are, but at this time I think you have failed to provide any evidence that there is some sort of database error with regard to the two NA sequences from this patient, and have failed to show that the authors published misleading information by discussing one NA sequence in one paper, and another sequence in another paper.
On a technical note, the accession numbers are assigned when the sequence is deposited, so the second sequence was deposited in 2000. I would guess that it became public in 2002 or the characterization sheet was updated in 2002, but the accession number assignment is linked to the deposit date, not the date at the top of the characterization sheet, which can change regularly.
With regard to the Canadian swine sequences, the conservation of sequence was in commentaries a couple of months ago. I e-mailed you the specifics about a month ago. Just the PA and PB2 genes would require at least 20 lab errors, which I suspect would be a record for a single publication. If you come up with any type of reasonable explanation for the 20 errors, I will give you the other 6 genes which involve identities with other isolates, and you can work on the additional 30–80 errors required.
At this time, your repeated assertions that the Olsen lab made that many errors remain reckless, since you have maintained that position for 1–2 months, yet have failed to provide a shred of evidence, or a specific example of why the data should be considered unreliable.
Similarly, I have yet to see any evidence that the two H5N1 sequences from the 1997 patient in Hong Kong were anything but two or more viruses from the same person. I also gave you a very concrete example for Hong Kong in 2002 (as well as the fact that another patient from Hong Kong, 156, has three sequences at GenBank, and all three sequences are different, although the differences are lower in frequency than for 516). That chicken example (31.2 and 31.4), I believe, involves all 8 gene segments, which differ in the two isolates from the same chicken. Although you are not a virologist, an explanation of why a dually infected chicken would not create a dual infection in a person, if that chicken was the source of the human infection, would be most appreciated. If your explanation is accurate, it is probably publishable. At this time, I don’t recall any publication indicating that two distinct H5N1’s in a bird would only produce one H5N1 in a person.
Protesting banning of anonymous.
I know, TSFM.
P.S. Hello Germany!
Monotreme. Since you want to focus on H5N1 in Hong Kong and multiple sequences from the same source, there have been a few studies that take H5N1-infected chickens and inject the associated virus into mice. Virus is then isolated from the brains of the mice. These sequences are at GenBank and can be easily paired because the isolates have the same name, except those from the mouse brain include MB in the name. Some examples have been listed in the Recombinomics patent and there are again clear examples of recombination.
The authors of the paper (Yi Guan and associates in Hong Kong) clearly state in the text that there are multiple changes and mixed signals are seen in uncloned samples. They conclude that some isolates are selected because of passage in the mouse brain, which accounts for the sequence differences (at multiple locations in multiple genes).
Thus, this paper also concludes that birds in Hong Kong are infected with multiple versions of H5N1. I did not see any indications that the authors assumed that infection of a person by H5N1 from such an infected bird would produce only one H5N1 in the person. You seem to be implying that two H5N1’s in a person requires two independent infections. I think the data suggest just the opposite.
Since you are more interested in H5N1 in people, I recall one of the first clusters in Vietnam. It was a pretty famous cluster involving a groom and his two sisters. The groom died with bird flu symptoms, and of course no sample was collected. However, his two sisters, who cared for him, became infected about 4–5 days after their brother. Epidemiologically, it certainly looked like an infection of the sisters by their brother. The two sisters developed symptoms on the same day, were hospitalized on the same day, initially tested inconclusive, subsequently tested H5N1 positive, and died within an hour of each other. I thought they represented a good example of infection from a common source, which was their brother. However, their symptoms were somewhat different (one more pulmonary and one more gastrointestinal) and the H5N1 isolated from each was distinct. I would interpret the data as an indication that the brother was dually infected and he infected both sisters, but one virus was dominant in one sister and another in the other sister (much like different H5N1’s in different organs of the same experimental mouse).
The Vietnam example is a bit of a digression, but the different tropisms of the H5N1 in birds in Hong Kong do address the issue of dual infections in birds in Hong Kong, which I believe were quite common and capable of causing dual infections in humans.
Scaredy,
And your point would be….?
People have to play by the rules and if the anonys don’t wish to do so, good-bye.
Monotreme. I checked out the NP sequences, and it looks like two non-overlapping partial sequences were submitted, so the two accession numbers do not show two different sequences. However, it is curious that a complete sequence, or even a continuous one, was not deposited. One sequence is about 300 bp and the other is 500, but this only covers about 1/2 of the sequence.
In other partial sequences (such as the H9N2 sequences from Korea), the missing data happens to cover areas where recombination is expected.
The bottom line on the NA sequence remains the same. The authors, in the two citations linked to the sequences, only discuss the first NA sequence. The second publication focuses on the 6 internal genes and has no NA phylogenetic tree.
Thus, the case for error still has not been made. The assigning of a second accession number, along with a description of the cloning history for the second sequence, is a strong indication that the authors knew there were two or more sequences in patient 516.
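The point about the two partial NP deposits can be checked mechanically: two fragments only count as "two different sequences" for the same region if their coordinates overlap. The Python sketch below shows that check. The fragment coordinates and the NP segment length of 1565 nt are assumptions for illustration only, since the post gives just the approximate sizes (about 300 and 500 bp).

```python
def overlap_length(a, b):
    """Number of positions shared by two inclusive 1-based intervals."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


def covered_fraction(fragments, gene_length):
    """Fraction of the gene covered by at least one fragment."""
    covered = 0
    current_end = 0  # last position already counted
    for start, end in sorted(fragments):
        start = max(start, current_end + 1)  # skip positions counted earlier
        if end >= start:
            covered += end - start + 1
            current_end = end
    return covered / gene_length


# Hypothetical coordinates: only the fragment sizes come from the post.
NP_LENGTH = 1565            # approximate segment length, assumed
frag_1 = (1, 300)           # ~300 bp fragment, placement assumed
frag_2 = (901, 1400)        # ~500 bp fragment, placement assumed

shared = overlap_length(frag_1, frag_2)
coverage = covered_fraction([frag_1, frag_2], NP_LENGTH)
```

With no shared positions, the two accession numbers cannot be compared base for base, and with these assumed coordinates the covered fraction comes out at roughly one half, consistent with the estimate in the post.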
Dual H5N1 infections in poultry in Hong Kong were not uncommon. Even though Hong Kong culled all of the birds in 1997, and had several subsequent culls, H5N1 kept reappearing in poultry in Hong Kong. The J Virology paper [[http://jvi.asm.org/cgi/content/full/77/6/3816?view=long&pmid=12610156|Neurovirulence in Mice of H5N1 Influenza Virus Genotypes Isolated from Hong Kong Poultry in 2001]], which can be accessed for free, states:
The MB variants, designated Ck/HK/YU822.2/01-MB, Ph/HK/FY155/01-MB, Ck/HK/FY150/01-MB, and Ck/HK/NT873.3/01-MB, were markedly more virulent than the original viruses (Table 2). These variants had MLD50 values similar to their EID50 values, which indicates that these viruses are highly pathogenic for mice.
The MB variants fell into one of two groups according to their organ tropism. The first group included the highly pathogenic variants of genotypes A and C, which replicated systemically and infected brain and internal organs (Table 2). High infective doses of these MB variants resulted in death within 3 to 5 days; lower doses killed the mice 6 to 11 days after inoculation. Inoculation with doses of 10^2.5 to 10^0.5 EID50 of either virus caused neurologic signs such as hind-leg paralysis, tremor, and paresis. Viral titers of the MB variants of genotypes A and C in liver, spleen, kidney, and heart were at least 3.5 to 2.0 log10 lower than those in lungs and brain (Fig. 4), where these variants replicated efficiently; this result suggests that these MB variants were pneumotropic and neurotropic.
Amino acid sequence variations between the original viruses and their MB variants were compared to those found between mouse phenotypes of human H5N1/97 isolates of high and low pathogenicity (Tables 3 to 5). There were amino acid changes in the highly pathogenic variants in all gene products except for PB1, NP, and NS1 proteins. Most of these changes distinguished the original viruses from the MB variants, as well as from other viruses of high and low pathogenicity, and were not unique. The random nature of these differences suggested that the viruses were heterogeneous and indicated that rapid selection of highly pathogenic variants was possible (Tables 3 to 5). This suggestion was confirmed by the heterogeneity of the original sequences (shown by electrophoretograms) of the PB2 and PA genes, especially in genotypes A, C, and D (data not shown).
Here is a quick summary for those who don’t want to wade through the long posts or read the linked publication.
Monotreme is trying to make a case that the flu database has numerous errors. He has indirectly accused the Olsen lab of making 50–100 errors, which is what would be required to generate the 56 swine sequences they deposited if, as Monotreme contends, sequence identities with earlier isolates are artifacts. The sequences have multiple insertions of multiple flu genes that are exact matches of portions of genes isolated between 1931 and 2002. He doesn’t feel the data can be real because of in vitro data showing errors by the flu polymerases (although most virologists that I know would simply cite differences in selection pressure). He hasn’t provided a shred of evidence for the swine data, which has now been public for a couple of months and has been actively discussed on these boards for at least a month. He keeps promising an explanation, but has yet to deliver (other than general comments on other sequences in other databases).
Instead he has focused on another sequencing lab, the Katz/Subbarao lab at the CDC, because they submitted two sequences for the NA gene from a patient and both sequences have the same name, A/Hong Kong/516/1997. He thought that this represented some sort of error, which he really needs to explain because it just doesn’t make sense to me. He also thought the description in the literature of the NA from the patients was misleading because there were two sequences. He thinks two sequences from the same patient is unlikely, which seems to require two independent infections (which I also don’t understand; he may have a reference for such an assumption), and therefore the two sequences represent a lab artifact of some sort (he doesn’t care if it is mutation or recombination, as long as it is an artifact: a sequence that really was never in patient 516 and an indicator of the dangers in using the sequences in the database).
I really haven’t been convinced that there is anything wrong with the two sequences. I think the lab is quite aware of the fact that the sequences are different and assumes both sequences came from patient 516. I also found no evidence of confusion in the literature.
The NA sequence is described in a 1999 paper in Virology. The paper uses sequences from most or all 1997 Hong Kong patients, including 516. It includes accession numbers and the number corresponds to the first sequence. Monotreme then cites a more recent paper, noting that in 1999 the second sequence had not yet been deposited. I looked at the 2002 paper and it also used the accession numbers to describe the sequences. However, that paper was on the other 6 gene sequences and therefore also did not discuss the second NA sequence.
Thus, at this point I have not found a discussion on the second sequence. However, I believe the lab is well aware of the fact that the second sequence is different from the first. Since both are full sequences, there really would be no reason to submit the same sequence a second time. Therefore, there is little doubt that the authors knew the second sequence differed from the first. Since both sequences came from the same patient, the isolate was given the same name. However, the accession number for the second sequence has no relationship to the first. If the authors thought the first sequence was an error, they would have replaced the first sequence with the second and bumped the accession number up to dot.2 from dot.1 (and of course if they thought the second sequence was in error, they would not have submitted it).
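The "dot.1 to dot.2" argument rests on GenBank's accession.version convention: a corrected sequence keeps its accession and gets a higher version suffix, while an independent deposit gets a new accession. A minimal Python sketch of that distinction follows. The helper names are hypothetical, and the version suffixes attached to the two accessions in the usage lines are assumed for illustration, not looked up.

```python
def parse_accession(versioned: str):
    """Split 'AF102660.1' into ('AF102660', 1); version is None if absent."""
    accession, _, version = versioned.partition(".")
    return accession, (int(version) if version else None)


def is_revision_of(newer: str, older: str) -> bool:
    """True only if both identifiers share an accession and the first has a
    higher version number, i.e. it replaces the older record."""
    acc_new, ver_new = parse_accession(newer)
    acc_old, ver_old = parse_accession(older)
    if ver_new is None or ver_old is None:
        return False  # cannot decide without explicit versions
    return acc_new == acc_old and ver_new > ver_old


# Version suffixes below are assumed for illustration.
corrected = is_revision_of("AF102660.2", "AF102660.1")    # same accession, bumped
independent = is_revision_of("AF296752.1", "AF102660.1")  # unrelated accessions
```

Under this convention, a second NA record with an unrelated accession reads as a separate deposit rather than a correction of the first, which is the inference drawn in the post.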
Therefore, I really see no data indicating that the authors, who are well-established virologists who actually isolated and sequenced the virus, suspect the sequence is an artifact or are unaware of the fact that the two sequences are different.
However, Monotreme seems to think that two sequences from the same patient are a low-probability event. He hasn’t really said why, other than to allude to two independent infections, which seems to not only require two infections, but two infections by a person. I am sure that few would dispute that such a scenario is highly unlikely, since there were only 16 reported human cases in Hong Kong.
I really know of no virologist who would take such a narrow view. In fact, the main point of both cited papers was that the 16 human infections were indeed from birds infecting humans. There was no intermediate host. This is supported by the sequencing data in the birds and people. Both data sets had a number of distinguishing characteristics.
The most striking difference was the constellation of the 8 gene segments. The H5 was closely related to the H5 in the H5N1 seen in the Guangdong goose in 1996. This isolate from the goose was the first H5N1 isolated in Asia and also had the HPAI HA cleavage site RERRRKKR. The N1 matched bird sequences, which included a 19 aa deletion. However, internal genes matched sequences from H9N2 or H6N1, indicating that both the human and bird H5N1’s were reassortants that had formed in birds and that the reassortants infected the humans.
The fact that the H5N1’s were reassortants indicated that there had to be a dual infection, but the dual infection was in the birds prior to infecting the humans. In subsequent years, H5N1 from poultry in Hong Kong, including published studies with 2001 isolates, indicated the birds had multiple H5N1 infections, as shown by sequence data as well as by the various tissue tropisms identified by isolating H5N1 from various organs.
Thus, few virologists would have problems with H5N1 in humans or birds in Hong Kong having dual infections.
Monotreme has a different view, but the basis for that view remains obscure. He seems to think the virologists and their sequences are wrong, but the evidence for such an assertion for the two NA sequences has not been clearly explained to me (and I suspect it is also unclear to virologists reading this board).
I am also still interested in the explanation for the 50–100 errors supposedly made by the Olsen lab.
I don’t know who is right and who is wrong; I haven’t had enough coffee yet to peruse all the links. But, I appreciate the discussion AND the fact that it has remained civilized. We all can benefit from all points of view - whose opinion you adopt as your own is your own choice.
Racter,
Dual infection is simply two (or more) different Influenza strains growing in a single host organism. Route of infection may be singular from a dual infected vector or multiple, one strain each from two or more vectors. Obviously, the first route is more likely.
My personal belief is that a monster replicator like H5N1 (multi-tropic as well) is going to generate variants from a single strain inside an incubator very rapidly.
If multiple strains are endemic, expect even more variants. Depending on the mode of delivery from the vector to the new host organism, the probability is nearly as high for dual strain transmission as for single strain transmission because the rapid replicants will be living in close quarters for a certain time period in the incubator.
An exhalation is an exhalation. Each exhalation is like a monsoon for viral particles in an incubator growing H5N1 in the lungs . . . single, multiple, triple, it doesn’t matter, they all are driven in the monsoon mist. Consider the fecal route for an even more efficient multiple strain delivery mechanism.
Niman and Monotreme,
I’ve mentioned in many previous posts that we must NOT make comparisons between in vitro and in vivo rates of genetic acquisition. I continue to hold to that position in agreement with Niman and at odds with Monotreme. In nature, we will find different changes and rates of change even if working with the same progenitor strains.
Because I hold to the wild type (in vivo, nature-based) strains being more representative of the true Influenza characteristics than lab strains or lab-processed strains, I will continue to place a higher emphasis on wild strains and their sequences in my ongoing reviews.
Because I value the wild type strains, I naturally value the wild type sequences.
Because I value the wild type sequences and have tracked Dr. Niman’s examples that include observations of instances of identity represented across multiple wild generations (time) and instances of identity in those same strains that were processed in multiple, distinct labs, I must hold to Niman’s observation that these homologies are related (in some way).
Now that the outcome is clearly observed (identities across generations with no chance of lab error), my current course of study leads me to investigate the process. Recombination is observed at the output. How it happens is the next question.
Are we looking at an actual physical recombination with two or more donor strains offering parts to a new daughter strain? Is there a set of rules that determine these allowable combinations of these donor parts?
Or, most obtusely, during a dual infection (s1,s2), does the strongest replicating strain (s1) get an attractant signal or biological magnetism from the second strain (s2)? Could that s2 attractant signal cause the s1 strong replicant to intrinsically pattern-match at certain positions to s2, creating the appearance of a physical mating or recombination, but without conjugation or transposition?
Is there a rule or just an attractant force that may cause viral strain one (s1) to update and match portions of viral strain 2 (s2)?
On the slow genetic acquisition test, has anyone looked at GS’s post (as Anonymous, 2006–07–22–10:14) on Accuracy Of Flu Polymerase Part 2?
He’s tracked gene segments that have high homology from separate strains isolated two or more years apart. Read the description of the columns at the beginning of the report and you’ll see the simplicity of his test.
Slow genetic change seems to be pretty common in the wild.
wetDirt – at 00:40
It is interesting that in the medical field, I never see duplicate samples. In my entire life I can’t recall a single lab test that was obtained in duplicate for quality control. So of course you will miss dual infections; your methodology is biased against them.
WetDirt, I disagree with your comment above. I work in Quality at my hospital. We routinely collect second sample(s) to verify lab results. Our policies are such that if a patient has critical lab values, a second sample is obtained and retested. Not only is the second sample taken, but sometimes it is taken by a second person, from a different site, and sometimes run on a different machine (if available). We do not want to treat patients for a condition that is diagnosed by lab results without verifying the results and also looking at the clinical symptoms of the patient. We do this at a cost to the hospital; it is not chargeable.
Admittedly, even this process is sometimes not enough. Sometimes, we verify by sending the sample out to a different lab. What our process is depends on which lab value we are measuring. But it is certainly routine.
Addressing the quality control: All machines are calibrated frequently. Controls are run on all machines daily, or even every shift (depending on the test).
Testing for disease works much the same way. If a patient is confirmed with something and then treated; regular testing is set up to both monitor disease progress and treatment effectiveness.
This is how it works at my hospital and hospital system.
Floridagirl. wetDirt is really talking about testing at the viral isolation level. The patients on the official WHO list really are positive many times over. The samples in Indonesia are a good example. First there would be a local quickie test. Next, samples would be sent to Jakarta. Positive samples would then go to Hong Kong and the CDC for verification. However, the only results at GenBank or Los Alamos would come from H5N1 grown in Hong Kong or Atlanta. These samples have some control, since they are being isolated independently, but even then it is likely that only the dominant sequence that grew in vitro will be isolated by each source.
To see dual infections, quite a bit of work would be required, so 2, 3, 4, or 5 positive results still would not likely detect dual infections (and of course the human sequences from Hong Kong and the CDC are being held hostage, so they are not available to the public and are not at GenBank or the public side of Los Alamos).
Let’s try again with this paper:
Molecular Correlates of Influenza A H5N1 Virus Pathogenesis in Mice.
The origin and outcome of human disease associated with 15 H5N1 viruses isolated from confirmed cases in Hong Kong in 1997 and the antigenic profile and relative pathogenicity of the viruses for mice are shown in Table 1. Viruses were grown in Madin-Darby canine kidney (MDCK) cells and/or the allantoic cavity of 10-day-old embryonated hens’ eggs at 37°C for 24 h.
Table 1 includes HK/516/97 which is the strain we are discussing. No accessions are provided for the NA gene in this table.
The complete nucleotide sequences for all coding regions of all gene segments of nine of the H5N1 viruses were determined by direct cycle sequencing of PCR products generated by reverse transcription-PCR from MDCK cell-passaged (XC1 or XC2) stocks by using gene-specific primer sets as previously described (3; M. Shaw, L. Cooper, X. Xu, et al., submitted for publication). This analysis identified five residues that segregated with the mouse pathogenicity phenotype in genes that encoded the NA, matrix (M1) protein, and viral polymerases PB1 and PB2. To confirm this finding, partial sequence analysis was conducted on these four gene segments from the same virus stocks that were used to determine the pathogenicity phenotype in mice (Table 2).
Table 2 also includes HK/516/97 and does list accessions.
GenBank accession numbers for nucleotide sequence data are as follows: PB2, AF036363, AF258837, AF258839−40, and AF258843−53; PB1, AF036362, AF258818, AF258820−21, and AF258824−34; PA, AF036361, AF257193, AF257195−96, and AF257199−209; NP, AF036359, AF255744, AF255746−47, and AF255750−67; NA, AF102657−70 and AF296752; M, AF036358, AF255365, AF255367−68, and AF255371−84; and NS, AF036360, AF256178, AF256180−81, and AF256184−94.
The range AF102657−70 would include AF102660, which is the original accession for the HK/516/97 NA gene. AF296752 is the second accession we have been discussing and was deposited by the authors of this paper.
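For anyone who wants to check the range arithmetic rather than take my word for it, here is a minimal Python sketch. It assumes the paper's shorthand means "replace the trailing digits of the first accession with the digits after the dash", which is how I read it. The function name expand_accession_range is my own and is not part of any GenBank tool.

```python
import re


def expand_accession_range(spec: str) -> list[str]:
    """Expand shorthand such as 'AF102657-70' into individual accessions.

    Assumption: the digits after the dash replace the trailing digits of
    the first accession, so 'AF102657-70' runs from AF102657 to AF102670.
    """
    # The paper prints a typographic minus or dash; normalise to a hyphen.
    spec = re.sub("[\u2212\u2013\u2014]", "-", spec.strip())
    match = re.fullmatch(r"([A-Z]+)(\d+)(?:-(\d+))?", spec)
    if match is None:
        raise ValueError(f"unrecognised accession spec: {spec!r}")
    prefix, start, end_suffix = match.groups()
    if end_suffix is None:
        return [spec]  # a single accession, nothing to expand
    if len(end_suffix) > len(start):
        raise ValueError(f"range end longer than start in {spec!r}")
    # Rebuild the full final number from the abbreviated suffix.
    end = start[: len(start) - len(end_suffix)] + end_suffix
    first, last = int(start), int(end)
    if last < first:
        raise ValueError(f"range runs backwards in {spec!r}")
    width = len(start)
    return [f"{prefix}{n:0{width}d}" for n in range(first, last + 1)]


na_range = expand_accession_range("AF102657\u221270")
print(len(na_range))           # number of accessions covered by the range
print("AF102660" in na_range)  # is the original NA accession in the range?
print("AF296752" in na_range)  # is the second accession in the range?
```

Under that reading, the range covers 14 accessions, AF102660 falls inside it, and AF296752 does not, which is why the paper lists it separately.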
I have noticed that AF102660 and AF296752 are very different sequences. I have suggested that they may be from different strains or that AF296752 represents a lab-adapted strain. I have searched through the paper and cannot find any indication that the authors are aware that these two accession numbers represent very different sequences. Since they performed very specific animal experiments with an HK/516/97 virus, and came to conclusions regarding the pathogenicity of that strain based on those experiments, I suggest that it would be useful to know which virus they used: the one represented by AF102660 or the one represented by AF296752. It is also incumbent on the authors (who are at the CDC) to explain why the sequence they deposited is very different from the one previously deposited by the CDC for the same gene, from the same strain.
Failure to provide this information in the paper represents an error, IMO, as HK/516/97 is discussed as if it were a single virus with specific pathological features rather than as representing two or more viruses with very different sequences. We all know that a small number of changes can cause very different biological behaviours in viruses. If one is doing animal experiments to test the properties of viruses, each virus being tested should be clearly identified. In the case of HK/516/97, this was not done. We really don’t know from the paper whether their results were based on the original virus or the one they deposited sequences for. I do not find this reassuring.
A side note, for those who think I may be being harsh towards the authors. It is common practice in science to analyze papers in journal clubs. Very few papers escape intense scrutiny without errors being detected, mostly minor. If another virus were being discussed, the errors I have pointed out might be considered minor. However, given the stakes with H5N1, it is vital that all errors and ambiguities be uncovered and rectified/clarified. If not, wrong assumptions will be made regarding the biological properties of various strains, and that would be tragic.
Monotreme, since they list the new accession number in Table 2, it would seem that the new sequence was used in the paper. Listing that sequence and not using that isolate really would make little sense. Similarly, there would be no reason for the authors to add a new sequence to GenBank if they didn’t know it was different.
If they thought it was the same, why in the world would they make a new submission?
I fail to follow your logic.
The usual drill. This is getting pretty long and looks likely to continue so I’m closing it and I’ve opened Part 7. Sorry for the inconvenience.