Sunday, April 19, 2015

By the Way, Kibosh the Khazar Theory Altogether

When I talked to the website owner of and asked him about the testing of Khazar skeletons, I received the following reply, in part:

Why are we still talking about the Khazars? They aren't involved in our ancestry at all and archaeologists and historians say it may be difficult to distinguish Khazars proper from the other peoples of Khazaria, plus I'm not aware of anybody who has tested Khazar skeletons or plans to, but you are welcome to ask around now that Russians have successfully tested many populations like the Yamnaya and the Mal'ta. Based on the latest evidence I would say the Khazars are Volga Finnic intermixed with East-Central Asian Turks and other assorted peoples, and their Turkic element is the same one found in other Turks and Mongolians around Eurasia, a particular affinity never found in Ashkenazim.... In lieu of ancient DNA, modern populations have proven to be good proxies to determine ethnicity. Did you see my recent article "The Chinese Lady who Joined the Ashkenazic People"? Ashkenazim are also descended from a Korean-related people, from a more recent Asian-Ashkenazic marriage.
Also by the way, I compare Dr. Himladevi "Himla" Soodyall to "Dr." Eran Elhaik. I don't know what agenda "Dr." Soodyall has, although I can ascertain that she attempted to delegitimize the Lemba as much as "Dr." Elhaik attempted to delegitimize Ashkenazi Jews.

PS My dad's Ancestry atDNA results, even in Analysis 2.0*, do in fact show a very slight amount of Middle Eastern atDNA. They also show a tiny bit of East Asian, Melanesian, Scandinavian, and Finnish/Northwest Russian atDNA. The Melanesian atDNA is probably related to the East Asian atDNA, and the Scandinavian atDNA to the Finnish/Northwest Russian atDNA.

*"We create estimates for your genetic ethnicity by comparing your DNA to the DNA of other people who are native to a region. The AncestryDNA reference panel (version 2.0) contains 3,000 DNA samples from people in 26 global regions."
The AncestryDNA panel does need to be balanced**, though:

The updated AncestryDNA ethnicity estimation V2 reference panel contains 3,000 samples carefully selected as described to represent 26 distinct global regions (Table 3.1), each with a somewhat distinct genetic profile. As a comparison, our Beta panel represented only 22 distinct global regions.

Region                                    # Samples
Great Britain                             111
Europe East                               432
Iberian Peninsula                         81
European Jewish                           189
Europe West                               166
Finland/Northwest Russia                  59
Africa Southeastern Bantu                 18
Africa North                              26
Africa South-Central Hunter-Gatherers     35
Ivory Coast/Ghana                         99
Native American                           131
Asia Central                              26
Asia East                                 394
Asia South                                161
Near East                                 141
Table 3.1: The Final AncestryDNA V2 Ethnicity Reference Panel
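To see how far the counts listed above sit from an even split, here is a rough Python sketch. It uses only the rows shown in the table plus the 3,000-sample and 26-region figures from the quoted text. The helper name `balanced_allocation` is my own, not anything from AncestryDNA.

```python
# Counts for the regions listed in Table 3.1 above (only the rows shown there).
listed_counts = {
    "Great Britain": 111,
    "Europe East": 432,
    "Iberian Peninsula": 81,
    "European Jewish": 189,
    "Europe West": 166,
    "Finland/Northwest Russia": 59,
    "Africa Southeastern Bantu": 18,
    "Africa North": 26,
    "Africa South-Central Hunter-Gatherers": 35,
    "Ivory Coast/Ghana": 99,
    "Native American": 131,
    "Asia Central": 26,
    "Asia East": 394,
    "Asia South": 161,
    "Near East": 141,
}

TOTAL_SAMPLES = 3000  # panel size stated in the quoted text
REGION_COUNT = 26     # number of regions stated in the quoted text


def balanced_allocation(total, regions):
    """Split `total` samples across `regions` as evenly as whole numbers allow."""
    base, extra = divmod(total, regions)
    # `extra` regions get one more sample so the counts still sum to `total`.
    return [base + 1] * extra + [base] * (regions - extra)


even_share = TOTAL_SAMPLES / REGION_COUNT
allocation = balanced_allocation(TOTAL_SAMPLES, REGION_COUNT)
print(f"even share per region: {even_share:.6f}")
print(f"whole-number split: {min(allocation)} to {max(allocation)} per region")

# Listed regions from smallest to largest, relative to the even share.
for region, count in sorted(listed_counts.items(), key=lambda kv: kv[1]):
    print(f"{region:40s} {count:4d}  {count / even_share:5.2f}x the even share")
```

The whole-number split works out to ten regions of 116 samples and sixteen regions of 115.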

Regional Polygon Construction

As described above, we divide the globe into 26 overlapping geographic regions. Each region represents a population with a somewhat distinct genetic profile. Where possible, we use the known geographic locations of our samples to guide the delineation of regional boundaries. Figure 3.6 shows an example of the information used to define regional polygons.

For a more accurate panel, each region should have 115-16 samples (3,000 / 26 = "115.384615"). Also, the samples should not be "carefully selected as described". The selection needs to be as random as possible. This cannot be accepted:

Before using the reference set to estimate ethnicities of AncestryDNA customers, we perform several experiments to lend support to the quality of this new reference set. This involves testing the performance of our ethnicity estimation procedure on the reference set of samples. (See Section 4 below for details regarding the statistical method used for ethnicity estimation.)
First, we use the new panel to do a leave-one-out analysis. In this experiment, we remove one sample from the reference panel and then use the remaining panel to estimate the ethnicity of the sample that has been removed. We repeat this process for every sample in the panel and then look at the average predicted ethnicity for each region in the set. Figure 3.4 shows the results of this experiment as a box plot.

Figure 3.4: Leave-one-out analysis of the V2 reference panel. Here we plot the results of an experiment in which each sample is removed from the reference set one-by-one and its ethnicity is estimated using the remaining panel samples. Each bar represents the average correctly predicted ethnicity for all samples from a given region. It is clear from this graph that for the majority of samples in each region, we predict at least 80% of the genetic ethnicity to be from the correct region. However, there are exceptions. In particular, our average prediction accuracy for samples from Great Britain, Western Europe, Iberian Peninsula, and Mali are not quite as high. There are many factors affecting the accuracy of these numbers, most importantly the number of reference samples in the panel for each region and the genetic distinctness of each region.
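The leave-one-out procedure described here can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the actual AncestryDNA estimator is not given in the quoted text, so a simple nearest-centroid classifier stands in for it, and the two regions "A" and "B" and their 2-D feature values are made-up toy data.

```python
from collections import defaultdict


def centroid(points):
    """Mean vector of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(axis) / n for axis in zip(*points))


def predict(train, query):
    """Label of the nearest region centroid (squared Euclidean distance).

    Stand-in classifier only; NOT the method AncestryDNA uses.
    """
    by_region = defaultdict(list)
    for features, region in train:
        by_region[region].append(features)

    def dist(region):
        c = centroid(by_region[region])
        return sum((a - b) ** 2 for a, b in zip(c, query))

    return min(by_region, key=dist)


def leave_one_out(panel):
    """Per-region fraction of samples predicted correctly when held out."""
    hits, totals = defaultdict(int), defaultdict(int)
    for i, (features, region) in enumerate(panel):
        rest = panel[:i] + panel[i + 1:]  # the panel without sample i
        totals[region] += 1
        hits[region] += predict(rest, features) == region
    return {r: hits[r] / totals[r] for r in totals}


# Toy, made-up data: two hypothetical regions in a 2-D feature space.
toy_panel = [
    ((0.0, 0.1), "A"), ((0.2, 0.0), "A"), ((0.1, 0.2), "A"),
    ((1.0, 1.1), "B"), ((0.9, 1.0), "B"),
    ((0.15, 0.1), "B"),  # labeled "B" but sits among the "A" samples
]

accuracy = leave_one_out(toy_panel)
print(accuracy)
```

In this toy panel every "A" sample is recovered, while the last "B" sample is assigned to "A" when held out, so "B" scores two out of three. That last sample is the kind of candidate the quoted procedure would flag.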

The purpose of this analysis is twofold. First, reference panel samples with poor performance in the leave-one-out analysis were removed. This included samples from individuals whose leave-one-out ethnicity did not represent their ethnic group of origin. (See for instance, Figure 3.5) Second, the leave-one-out plots allow us to define population boundaries and demonstrate our ability to accurately estimate the ethnicities of our reference panel samples using our method (see next section).

Figure 3.5: Removing Reference Panel Candidates. Leave-one-out estimation for a Reference Panel Candidate with 8 terminal ancestors from the Ivory Coast and Ghana region. While this sample was initially included as a candidate of the reference panel for the Ivory Coast/Ghana region, the sample’s leave-one-out ethnicity estimation reveals primarily Benin/Togo ancestry. As a result, this sample was removed from the reference panel.
In scientific studies, this is unacceptable unless it is for case studies and/or other non-generalizable/non-extrapolatable studies:

There are two sources of error that limit generalizability: sampling error (chance variation) and sample bias (constant error) which results from inadequate research design. Sampling error (but not sample bias) can be taken into account using statistics.
Probability samples are representative of the population. They permit generalization to the population from which they are drawn. There are two types of probability samples: Random and stratified.
Random - each individual in the population has an equal chance of being selected for the sample.
Stratified - a miniature representation of the larger population with regard to proportions within selected strata (e.g., gender, education, socioeconomic level). Individuals are randomly selected within strata.
A table of random numbers or the random number function in Excel can be used to select a random sample from a population.
Thus, if a sample is "poor", it should be put in an "Indeterminable" or "Poor Sample" category instead of being removed from the panel.

Some would argue, "Well, what about other studies that don't have very balanced numbers?" Given that numerous studies on Ashkenazi Jews, Lemba Jews, and other groups have been done over time (and most have shown similar or equal results), the studies balance the numbers at least somewhat in the end. Therefore, the argument about "other studies that don't have very balanced numbers" is moot at this point.

** Stratified Sampling – This technique divides the population into meaningful homogeneous or similar groups based on a certain characteristic (e.g., gender, race, socioeconomic status) and then selects a simple random sample from each group. [For example, if you were interested in the effects of student motivation on academic achievement, particularly by grade level, you would divide the population into their respective grade levels and then randomly select an equal number of 9th, 10th, 11th, and 12th graders.]
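The grade-level example in this note can be sketched as follows. It is a minimal illustration, not a recipe from any of the quoted sources: the function name `stratified_sample`, the student IDs, and the choice of 30 students per grade are all hypothetical, and the draw uses Python's standard `random` module.

```python
import random


def stratified_sample(population, stratum_of, per_stratum, seed=None):
    """Draw `per_stratum` items at random, without replacement, from each stratum."""
    rng = random.Random(seed)  # seeded generator so a draw can be reproduced
    strata = {}
    for item in population:
        strata.setdefault(stratum_of(item), []).append(item)
    # rng.sample raises ValueError if a stratum has fewer than `per_stratum` items.
    return {s: rng.sample(strata[s], per_stratum) for s in sorted(strata)}


# Hypothetical population: 30 students in each of grades 9-12, as (student_id, grade).
students = [(f"s{grade}-{n}", grade) for grade in (9, 10, 11, 12) for n in range(30)]

picked = stratified_sample(students, stratum_of=lambda s: s[1], per_stratum=5, seed=2015)
for grade, group in picked.items():
    print(grade, [student_id for student_id, _ in group])
```

Every student within a grade has the same chance of being picked, and each grade contributes the same number of students, which is the "equal number" design the note describes.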


Sunday, January 11, 2015

A "New York Times" Quiz On Lying And How I Did, Etc.

I got 7/10, as I predicted. Pretty darned good. After all, listening to the audio version, wanting to give the benefit of the doubt on some, having had the experiences that I've had, etc. help with that. Remember, e.g., my final conversation with my granddad was over the phone (audio) and he finally implicitly conceded our Jewishness. As far as I know (unless that quote particularly struck me), I was paying attention to every word (especially, "If we had any Jewish blood, I don't know about it.").

That may have also been the conversation (or it was probably one of the ones before) in which he said that he'd said that he'd tell me that "[I] disappoint [him]" all over again (though I think that it was the conversation in which he also mentioned his service at Ft. Knox when Great-Granduncle Ed [z"l] was there in 1957 [I think, since he was 19-20, anyway. Great-Granduncle Ed, meanwhile, had served in World War Two.]). I know, too, that he talked about how days and nights were different during his childhood (e.g., Great-Granddad would come home from the mines, etc., and they'd all be in bed at 9:00 PM), unlike how "people go to bed at all hours of the night" (which struck me, I guess, because I tend to be a night owl).

I'm also trying to remember anything else. Anyway, you're usually able to tell when you've been through what I've been through, even via audio.