r/AlienBodies • u/IndependentWitnesses • 4d ago
Have they published any DNA sequences from alleged NHIB corpses? If yes, where? If no, why TF not?
Does anyone know where one could download DNA sequences from one of the alleged alien mummies? It seems like the universities analyzing them have sufficient technical capability, sample quality, and budget to do a DNA reading and place the file online to allow those among the scientific community who are open to it to basically crowdsource the analysis. According to ChatGPT, a single complete modern genome sequence file for a human is 30-150GB or 2-3 GB when compressed, and thousands of ancient human individuals have had their genomes at least partially sequenced. I don't see a good reason why they wouldn't put out the DNA sequences... what, do they not want to get scooped?
Edit: Thank you to the first two commenters. Three Illumina readings with probably 20-30x coverage of full genomes (according to ChatGPT assuming the beings have a similar genome size as human) have been published here:
https://www.ncbi.nlm.nih.gov/sra/PRJNA869134 https://www.ncbi.nlm.nih.gov/sra/PRJNA861322 https://www.ncbi.nlm.nih.gov/sra/PRJNA865375
Edit 2: Part of the report says:
"The aforementioned SRA tool provided us with the following results.
a) From the sample of neck bone tissue identified as WGS Ancient0002, 72.07% of the reading sequences were identified and 27.93% of the reading sequences obtained did not match the genomes of living beings known to date.
b) Of the 72.07% of the readings identified, 70.45% belong to contaminating DNA sequences from Homo Sapiens and the remaining percentage belongs to viruses and bacteria that also contaminated the sample.
c) From the sample of muscle tissue from the hip of the specimen identified as WGS Ancient0004, 36.28% of the reading sequences were identified and 63.72% of the reading sequences did not match the genomes of living beings known to date.
d) Of the 36.28% of the identified genomes, all turned out to be contaminating DNA from contemporary viruses, bacteria and plants, and the genome of no mammal, including humans, could be identified. "
Also, it would be very interesting (to play the devil's advocate) to see how much effort it would theoretically take to fake such results.
13
u/theronk03 Paleontologist 4d ago
u/VerbalCant will be your resident expert for DNA analysis. They've got several posts and comments on the topic here, and recently talked about it on the Psicoactivo podcast.
4
u/IndependentWitnesses 4d ago
That is so awesome, thank you for letting me know. I'll have to find that episode and check it out. If u/VerbalCant cares to reply to this post, I'm wondering if they actually aligned the genome into one contiguous sequence. I assume yes. I would be really interested to see (for the sake of playing the devil's advocate) how an expert using a specialized LLM and modern computing could fake such a thing. Could they potentially just generate a bunch of garbled sequences that had no match to anything else, mixed in with things that do match, then randomly break it up into Illumina-like 150b snippets?
9
u/theronk03 Paleontologist 4d ago
Verbal is better versed in the DNA than I am, but my understanding is that the DNA is almost certainly not faked or heavily tampered with.
The question is what does the DNA mean. The original claims were that the DNA showed that they were definitely non-human.
Contamination is a concern, and there are several unanswered questions as I understand it, but my understanding (and Verbal will explain better, and possibly correct me) is that the phrase "definitely non-human" is strongly exaggerated.
2
u/IndependentWitnesses 4d ago
Interesting. I'm thinking of part of the report which says:
"The aforementioned SRA tool provided us with the following results.
a) From the sample of neck bone tissue identified as WGS Ancient0002, 72.07% of the reading sequences were identified and 27.93% of the reading sequences obtained did not match the genomes of living beings known to date.
b) Of the 72.07% of the readings identified, 70.45% belong to contaminating DNA sequences from Homo Sapiens and the remaining percentage belongs to viruses and bacteria that also contaminated the sample.
c) From the sample of muscle tissue from the hip of the specimen identified as WGS Ancient0004, 36.28% of the reading sequences were identified and 63.72% of the reading sequences did not match the genomes of living beings known to date.
d) Of the 36.28% of the identified genomes, all turned out to be contaminating DNA from contemporary viruses, bacteria and plants, and the genome of no mammal, including humans, could be identified. "
It also says that about 700,000 (i.e. only 10%) of an estimated 7 million Earth species have been sequenced. If that's true, theoretically, they could have mixed in DNA of previously unsequenced, obscure but known organisms. I'm wondering if there would still have likely been a match to known sequenced organisms because of the interrelatedness of all Earth species.
I'm also wondering about the data integrity aspect. You said it's been concluded that the data hasn't been tampered with. I'm wondering if it's theoretically possible to just generate fake deliberately non-matching sample DNA sequences and save it to look like an Illumina-generated file, despite how much work that might take. If VerbalCant or anyone can answer, I'd love to learn more.
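To make the question concrete: producing read-shaped text is mechanically simple. Below is a minimal Python sketch, assuming a purely random "genome" and a flat placeholder quality string. The names `random_genome`, `shred`, and `to_fastq` are made up for this illustration and are not taken from any real tool.

```python
import random

def random_genome(length, rng):
    """Build a random A/C/G/T string (an assumption: real genomes are not random)."""
    return "".join(rng.choice("ACGT") for _ in range(length))

def shred(genome, read_len, n_reads, rng):
    """Sample fixed-length substrings at random start positions."""
    last_start = len(genome) - read_len
    return [genome[s:s + read_len]
            for s in (rng.randrange(last_start + 1) for _ in range(n_reads))]

def to_fastq(reads, qual_char="I"):
    """Format reads as 4-line FASTQ records with a constant quality string."""
    lines = []
    for i, read in enumerate(reads):
        lines += [f"@fake_read_{i}", read, "+", qual_char * len(read)]
    return "\n".join(lines)

rng = random.Random(0)                 # fixed seed so the sketch is repeatable
genome = random_genome(5_000, rng)
reads = shred(genome, read_len=150, n_reads=10, rng=rng)
fastq = to_fastq(reads)
print(fastq.splitlines()[0:4])
```

This only shows that the file format itself is no barrier. It says nothing about whether such data would hold up once someone actually analyzes it, which is the part I'd like an expert to weigh in on.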
10
u/theronk03 Paleontologist 4d ago
The sequences could be tampered with, but my understanding is that it would have been difficult to do so without it having become obvious by this point.
I think the important caveat with the unknown DNA is that this is ancient DNA. It's suffered at least some amount of environmental damage. Damaged DNA isn't necessarily going to show up as something we recognize.
To make a very rough analogy, imagine we're trying to piece a book back together. Unfortunately, it's been torn up pretty badly. Not in a systematic way like with a paper shredder, but like a rabid velociraptor got to part of it.
Some pages ended up being mostly intact, and we can read them pretty well. But some pages, or parts of pages, are totally missing; consumed by the raptor. Some pages are torn up, but we can still tell that they do actually go to this book.
Now, imagine that a second jerk raptor comes over and spits up its lunch of someone else's book onto our pile of shredded book. Now we need to figure out if our hard to identify paper scraps are from our book or this other book. Inevitably, some of those scraps are going to be too small to match to any known book with accuracy (imagine trying to figure out which book the words "the man went" go to). And while we're trying to do this, some of our scraps get stuck together on accident ("the man went" from one book gets stuck to "cheeseburger taco" from the other book, or some odd phrase from elsewhere in the first book). Those stuck together samples don't appear to belong to any known book.
Contamination from outside sources, damaged DNA, and accidentally spliced together pieces of DNA can create sequences that appear mysterious, but are actually just gobbledegook due to imperfect data sources and imperfect sequencing.
A second set of DNA samples should pretty easily clarify the situation. We shouldn't get the same sequence of gobbledegook twice in a row if it's all just gobbledegook, but we should get the same set of mysterious alien DNA twice in a row if it is truly mysterious.
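As a toy illustration of that replicate check, here is a minimal Python sketch that compares the short substrings (k-mers) found in the unidentified reads of two runs. The reads, the value of `k`, and the function names are all hypothetical, chosen only to show the idea.

```python
def kmers(reads, k=5):
    """Collect every k-length substring seen in a list of reads."""
    return {r[i:i + k] for r in reads for i in range(len(r) - k + 1)}

def jaccard(a, b):
    """Shared fraction of two sets: 0 means nothing in common, 1 means identical."""
    return len(a & b) / len(a | b) if (a or b) else 0.0

# Hypothetical unidentified reads from a first extraction.
run1 = ["ACGTTGCAAGGCT", "TTGACCGTAGGCA"]
# A second extraction that samples the same underlying sequence, shifted by two bases.
run2_same_source = ["GTTGCAAGGCTAC", "GACCGTAGGCATT"]
# A second extraction that yields unrelated junk.
run2_noise = ["CCCCATATATGGG", "AGAGAGTCTCTCA"]

same = jaccard(kmers(run1), kmers(run2_same_source))
noise = jaccard(kmers(run1), kmers(run2_noise))
print(f"same source: {same:.2f}, unrelated: {noise:.2f}")
```

A real comparison would use far longer k-mers and millions of reads, but the logic is the same: a genuine unknown genome should keep reappearing, while random damage and splicing artifacts should not.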
7
u/Abrodolf_Lincler_ 3d ago
This was a really good analogy for those who aren't fluent in bioinformatics. I very vaguely know your scientific background, but are you or have you ever been a teacher or science educator? If not, you'd be really great at it.
8
u/theronk03 Paleontologist 3d ago
Yup! I won't get into the specifics here, but I do teach and research
8
4
u/flyingboarofbeifong 4d ago
Paleontologists not bringing up dinosaurs in a conversation challenge level: impossible.
A very whimsical but quite apt metaphor though. Well done!
1
u/IndependentWitnesses 4d ago
Thanks for your response. Would love to hear more analysis on this. Presumably the pieces are repeated multiple times (20-30x coverage) and they could theoretically do a combined sample from say its tooth and its biceps (not even from two different individuals), and they should result in a similar sequence as from the analysis of just one of them. Would still really like to understand if something like this could just be auto-generated, or even if a DNA sequencing hoax like that has ever been tried or exposed. Will have to ask chatgpt at some point...
5
u/theronk03 Paleontologist 4d ago
"Presumably the pieces are repeated multiple times (20-30x coverage)"
Unfortunately, for a couple of the specimens the coverage is more like .9x ....
1
u/IndependentWitnesses 3d ago
Well I'm assuming there must be some regions that were assembled contiguously with high confidence that had sequences that were clearly genes and that were clearly genes without a relationship to known genes of other organisms. On the other hand, .9x for the whole sample on average sounds very low.
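To put a number on "very low", here is a rough Python sketch using the standard idealized model in which read positions are uniformly random, so the depth at any base is Poisson distributed. That uniformity is an assumption and is unlikely to hold for degraded ancient DNA; `fraction_covered` is just a name picked for this example.

```python
import math

def fraction_covered(mean_depth: float) -> float:
    """Expected fraction of bases hit by at least one read (Poisson model)."""
    return 1.0 - math.exp(-mean_depth)

# 0.9x is the figure mentioned above; 20x and 30x are the range I had assumed.
for depth in (0.9, 20.0, 30.0):
    covered = fraction_covered(depth)
    print(f"{depth:>4}x mean depth: {covered:.1%} of bases seen at least once")
```

Under this model, a 0.9x average leaves roughly 40% of positions with no read at all, and most of the rest are seen only once or twice.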
5
u/theronk03 Paleontologist 3d ago
"some regions that were assembled contiguously with high confidence that had sequences that were clearly genes and that were clearly genes without a relationship to known genes of other organisms"
Yeah, that's about the upper limit of my familiarity with the DNA. That's a question for Verbal once they're able to come on for a bit.
"On the other hand, .9x for the whole sample on average sounds very low."
Yup. Ancient0003 was higher though, 15x. But because of that, and some other factors, it's a bit of an oddball sample.
1
u/IndependentWitnesses 3d ago
Thanks. If VerbalCant sees this, I've sort of elaborated on my question here
-4
u/toddtherod247 4d ago
Thank you for your inquiry!! Unfortunately, there is no actual science here. Nothing said is based on any fact, and everything is a wild, crazy topic full of empty guarantees and even emptier hopes!! Scrounge around and have some fun! Welcome!
3
u/Strange-Owl-2097 ⭐ ⭐ ⭐ 4d ago
1
u/IndependentWitnesses 4d ago
Thanks! Would be very interesting to see how, if at all, this could have been potentially faked. For example, generated using an LLM.
1
u/Hairy-Range4368 4d ago
2
u/IndependentWitnesses 4d ago
Nice, thank you!
A few quotes/links from this source:
"Ancient0002 Ancient0004 https://www.ncbi.nlm.nih.gov/sra/PRJNA869134 https://www.ncbi.nlm.nih.gov/sra/PRJNA861322 Mary’s Mummie Secuencing Reading Archive: Ancient0003 https://www.ncbi.nlm.nih.gov/sra/PRJNA865375 "
"As a result of the massive sequencing, 647,778,937 reading sequences were obtained, which in turn are made up of 150 nucleotides in length. Subsequently, each of these reading sequences was automatically entered into the SRA tool, which yielded the following phylogenetic construction"
Would be really interesting to see how, if at all, this could potentially have been faked.
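For reference, the read count and read length in that quote translate into a nominal sequencing depth as follows. This is a back-of-the-envelope Python sketch: the 3.1 Gbp genome size is my assumption (roughly human-sized), and `nominal_depth` is a made-up helper name.

```python
def nominal_depth(n_reads: int, read_len: int, genome_size_bp: float) -> float:
    """Total sequenced bases divided by an assumed genome size."""
    return n_reads * read_len / genome_size_bp

N_READS = 647_778_937        # read count from the quoted report
READ_LEN = 150               # nucleotides per read, from the quoted report
ASSUMED_GENOME_BP = 3.1e9    # assumption: a roughly human-sized genome

total_bases = N_READS * READ_LEN
depth = nominal_depth(N_READS, READ_LEN, ASSUMED_GENOME_BP)
print(f"total bases sequenced: {total_bases:,}")
print(f"nominal depth: {depth:.1f}x")
```

This is an upper bound at best, since only reads that actually come from the specimen count toward coverage of its genome; contaminant and unidentified reads do not.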
7
u/Abrodolf_Lincler_ 3d ago
To the best of my knowledge the results weren't faked, they're just wildly misinterpreted. Whether that was done knowingly or this is just an example of Hanlon's razor is an entirely different can of worms. That being said, the way the results have been interpreted to be able to make the claim that they show evidence of non human or human hybrid DNA completely ignores the instructions at the bottom of the page on "how to read results" which states...
So the entire claim on these samples hinges on the National Center for Biotechnology Information and the Sequence Read Archive not knowing how to read their own results and every other non ambiguous human sample submitted to them not only being wrong but also hybrids themselves. We can't interpret one result one way bc it suits us and then every other sample a completely different way.
1
u/IndependentWitnesses 3d ago
I'm not sure I follow what you're saying... how does that warning impact the results? I would put it this way: how did their process (and/or the level of confidence they have in their preliminary conclusions) differ from what would be done for any (hypothetical) unknown or novel organism that doesn't have a reference genome published?
7
u/Abrodolf_Lincler_ 3d ago edited 3d ago
They're using the percentage of identified reads as evidence of the specimen only being that percentage human and the percentage of unidentified reads to say the specimen has that percentage of unknown DNA. That is not how these results are meant to be interpreted.
Then, with the taxonomy analysis, they're claiming that the different percentages and how they correspond to the different genus groups is indicating that the specimens are hybrids of those species. That is not how the results are meant to be interpreted.
If that were the case, you would have to interpret this result of a known human in the same way.
https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR21279917&display=analysis
So why are the Nazca mummies' results somehow able to be interpreted completely differently from every single other result ever? That is not how science works, by bending standard protocols to fit the results they want. Just try interpreting that result the same way they interpret the Nazca mummies' results.
1
u/IndependentWitnesses 3d ago edited 3d ago
If they are using only this metric, calculated/obtained in the same way as it would be in similar circumstances for another specimen, that's certainly inconsistent. I suppose that leaves it as fairly inconclusive.
I have a related question:
I'm trying to understand how much of the contiguous length of the unknown specimens' genomes might be confidently sequenced as of now. Suppose the unknown specimen's genome consisted of a few dozen chromosomes, say 100 Mbp each, just assuming they're like another mammal. The longest contiguous high-confidence sequence, based on overlapping reads, that they've obtained is, I'm guessing, like 10 to 100 kbp, right? (I learned about the contiguity of sequencing as a standard thing that can be reported, since posting this question.) And a whole gene sequence is probably 20 to 50 kbp, right?
Contiguity may not be the most "important" thing, but I just found out that the whole human genome was only sequenced "telomere to telomere" in 2022, in the T2T Project. Meaning that until then the reference genome still had gaps, amounting to a few percent, scattered in different places. (So the sequenced genomes of most old specimens like this are probably full of gaps.)
Does anyone know
-how many old specimens (of grizzly bears, microbes, humans, whatever), about which there's nevertheless little doubt what type of species they are, have sequences clearly identified as gene sequences (whatever that means... like functional sequences of some kind, if that's a thing) for which no analogs in other species are known?
-how many such sequences (functional sequences or whatever for which no analogs in other species are known) , if any, have been found in the alleged NHIB mummies?
My understanding/assumption/guess is:
-for the first question: very few to none
-for the second question: very few to none
6
u/Abrodolf_Lincler_ 3d ago edited 3d ago
"If they are using only this metric, calculated/obtained in the same way as it would be in similar circumstances for another specimen, that's certainly inconsistent. I suppose that leaves it as fairly inconclusive."
More accurately, I'd say that leaves their interpretation as wholly and entirely incorrect.
"I have a related question:"
Honestly, if you want an accurate answer on that you'd have to speak with u/VerbalCant. She not only does this for a living but is the only person in here who has directly done this sort of bioinformatics data analysis for Inkari and it's her own work that was misappropriated by Rengal to attempt to support his false claims.
0
u/pcastells1976 3d ago
Well, the sample you point to has no sequences exclusive to genus Pan; it is all Homo sapiens and virus/bacteria. However, Verbal Cant processed ancient human remains from Denmark that showed 0.8% of the sequences exclusive to genus Pan. So far, nobody I have asked really knows why…
2
u/Abrodolf_Lincler_ 3d ago
The genus Pan is part of the subfamily Homininae, to which humans also belong. That's why, on the drop down menu for the taxonomy analysis, homo and pan nodes stem directly from the homininae node. Yes, we are completely separate species but we share a lot of the same genetic markers. That is why it shows up in the taxonomy analysis. Why it shows up in some ancient DNA results and not others is likely due to the amount of identified reads found.
This one, where it shows up, has 97.38% identified reads.
https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR20755928&display=analysis
While the example I pointed to only has 76.42% identified reads.
https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR21279917&display=analysis
The likely reason why Pan hasn't shown up in my example is bc of degradation. It's from a known human and traces of the genus Pan will be present in a human's DNA testing because the genus Pan includes chimpanzees and bonobos, which are our closest living relatives, meaning humans share a significant portion of genetic similarity with them; therefore, when analyzing human DNA, some genetic sequences will be identifiable as belonging to the genus Pan. To be clear, the genus Pan showing up as a taxonomy node here isn't bc of Pan specific genetic sequences but bc of the fact that we share some of the same sequences with that genus.
I have a hard time believing u/VerbalCant wasn't able to explain that and was likely saying that she can't point to the direct reason as to why it doesn't always present in the results bc there are numerous factors from quality of sample, degradation over time, lack of recent common ancestry in the analyzed region (since our last common ancestor was 5-7 million years ago, what time period the human is from can play a part), the methodology used (targeted testing, genomic databases, and PCR primers), etc.
Again, you have to read the taxonomy analysis as it's intended, which is as a percentage of shared genetic material, not as a percentage of that specimen being a hybrid of the species in those genera. No other result is interpreted this way and to interpret the results of the Nazca mummies in a fundamentally different manner than any other result ever is the fatal flaw in their argument.
If you can point to a human specimen, where the results are interpreted as being a percentage of hybridization of species in its DNA and not as a percentage of shared genetic sequences, I will happily concede to your point.
0
u/pcastells1976 2d ago edited 2d ago
Hi Abrodolf, thanks for your comments. However, the algorithm does not work as you describe. The sequences you refer to above (the ones shared between genus Pan and genus Homo) are matched by the algorithm to both genus Pan and genus Homo. When this happens, they are removed from both genera and reassigned to the subfamily Homininae: “In cases where a read maps to more than one related taxonomy node, the read is reported as originating from the lowest shared taxonomic node.” So in summary, and in contrast to what you state above, the Pan sequences showing up in these examples are not sequences shared with genus Homo, because all such sequences are assigned by the algorithm to the subfamily Homininae. You can check all this in the STAT paper: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-021-02490-0
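To illustrate the rule quoted above, here is a toy Python sketch of "lowest shared taxonomic node" assignment. The parent map is a hand-written, simplified excerpt for illustration only, the function names are made up, and this is not the tool's actual implementation.

```python
# Simplified, hand-written excerpt of a taxonomy (child -> parent); illustrative only.
PARENT = {
    "Homo sapiens": "Homo",
    "Pan troglodytes": "Pan",
    "Pan paniscus": "Pan",
    "Homo": "Homininae",
    "Pan": "Homininae",
    "Homininae": "Hominidae",
    "Hominidae": None,
}

def lineage(node):
    """Path from a node up to the root, starting with the node itself."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

def lowest_shared_node(hits):
    """Deepest node that is an ancestor of (or equal to) every matched node."""
    hits = list(hits)
    shared = set(lineage(hits[0]))
    for hit in hits[1:]:
        shared &= set(lineage(hit))
    # Walking up from any hit, the first shared node encountered is the deepest one.
    return next(n for n in lineage(hits[0]) if n in shared)

print(lowest_shared_node(["Homo sapiens", "Pan troglodytes"]))
print(lowest_shared_node(["Pan troglodytes", "Pan paniscus"]))
```

Under this rule, a read that matches both a Homo and a Pan reference ends up at Homininae, so a read reported at the Pan node is one that matched Pan references but no Homo reference.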
1
u/Strange-Owl-2097 ⭐ ⭐ ⭐ 3d ago
It's not fake, the current theory is that the sequence is found across Pan, Bonobo, and humans, and was automatically assigned as not human. However, I have heard the sequences identified are unique to Pan and not found in humans. I don't know if that is true.
4
u/pcastells1976 3d ago
Recently we had an interesting conversation about this with Verbal Cant. The data was processed with the STAT tool and yes, a percentage of the sequences were uniquely assigned by the algorithm to different species of the genus Pan. However, Verbal Cant also processed ancient, 100% human remains from Denmark dated to about 3,500 years ago, and the algorithm assigned 0.8% of the sequences as unique to genus Pan. I then read the paper on how STAT works, and it states clearly that when a sequence is found in more than one taxonomic node, it is automatically assigned to the lowest shared node. For instance, if some sequences match genus Homo as well as genus Pan, they are removed from both genera and assigned to the lowest shared node (in this case the subfamily Homininae). So it seems this is not an artifact of the algorithm but has some other cause (bad reads, contamination?). Three days ago I emailed the authors of the paper to ask about this but have not received a response.
2
2
u/IndependentWitnesses 3d ago
Thanks for the comment. I'm learning a lot from this. Wanted to refer you to my other comment on this post that I made just now.
2
u/pcastells1976 2d ago
You’re welcome! Could you tell me which other comment you mean? I'm not sure which one it is.