(The description is below the video.)
See https://support.ancestry.com/s/article/AncestryDNA-White-Papers?language=en_US, especially https://www.ancestrycdn.com/support/us/2024/10/2024ancestralregionswhitepaper.pdf.
(I left the links as they are so that people who read this description via a YouTube link can still access the original links.)
The brazen admission regarding selection and confirmation bias reads as follows:
“For each sample, we analyze a set of approximately 300,000 SNPs that are shared between the Illumina OmniExpress platform and the Illumina HumanHap 650Y platform (which was used to genotype HGDP samples). Samples with large amounts of missing data are removed. We also remove samples which are likely to degrade the performance of the reference panel. Samples can be removed because 1) they are closely related to another reference sample, or 2) the underlying genetic information about a sample’s origins disagrees with the sample labels, as determined through principal component analysis (PCA) (Jackson 2003, Patterson 2006) and our previous genetic analyses (Figure 2.1).”
If I had ever tried something like that in one of my college research papers, I would have been expelled very quickly. Legitimate researchers do not get to throw out samples just because those samples disagree with a predetermined hypothesis. The samples must always be reported as outliers.
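To show what reporting outliers, rather than deleting them, could look like, here is a minimal Python sketch. It is not AncestryDNA's code or method. The function name `flag_label_outliers`, the `threshold` value, and the sample coordinates are all hypothetical, and the sketch assumes that principal-component coordinates have already been computed for each sample.

```python
from collections import defaultdict
from math import dist
from statistics import median


def flag_label_outliers(samples, threshold=3.0):
    """Flag samples that sit far from the other samples sharing their label.

    samples: list of (sample_id, label, (pc1, pc2)) tuples. The PC
    coordinates are assumed to be precomputed elsewhere (hypothetical input).
    Returns one report row per input sample; nothing is dropped.
    """
    points_by_label = defaultdict(list)
    for _, label, pcs in samples:
        points_by_label[label].append(pcs)

    # Centre of each labelled group: coordinate-wise median, so that the
    # outliers themselves do not drag the centre toward them.
    centres = {
        label: tuple(median(axis) for axis in zip(*points))
        for label, points in points_by_label.items()
    }
    # Typical spread of each group: median distance to the group centre.
    spreads = {
        label: median(dist(p, centres[label]) for p in points)
        for label, points in points_by_label.items()
    }

    report = []
    for sample_id, label, pcs in samples:
        distance = dist(pcs, centres[label])
        spread = spreads[label]
        # Hypothetical rule: an outlier is more than `threshold` spreads away.
        is_outlier = spread > 0 and distance > threshold * spread
        report.append(
            {"id": sample_id, "label": label,
             "distance": distance, "outlier": is_outlier}
        )
    return report


# Made-up coordinates for illustration only; these are not real samples.
samples = [
    ("s1", "A", (0.0, 0.0)),
    ("s2", "A", (0.1, 0.0)),
    ("s3", "A", (0.0, 0.1)),
    ("s4", "A", (0.1, 0.1)),
    ("s5", "A", (5.0, 5.0)),  # labelled A, but sits with group B
    ("s6", "B", (5.0, 5.0)),
    ("s7", "B", (5.1, 5.0)),
    ("s8", "B", (5.0, 5.1)),
]

report = flag_label_outliers(samples)
print([row["id"] for row in report if row["outlier"]])
```

The point of the design is that the report has exactly as many rows as the input. A sample whose genetics disagree with its label is marked and stays visible to the reader instead of silently disappearing from the panel.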
I also would have been expelled for using a reference panel such as the following, because it deliberately lacks both 1:1 ratios among the groups and 1,000 samples in each group: