Back in March 2019, I wrote about the most recent ethnicity estimates from Ancestry DNA for three generations of my family: myself, my husband, all four of our parents, and our four children. Since this is a rather unique data set, I thought it would be interesting to see what insights such analysis might offer about DNA inheritance, and also about the limitations inherent to these estimates.
Ancestry DNA has updated their ethnicity estimates several times since that first blog post, adding new reference groups and Genetic Communities™ for increased granularity. Last month, they released another update, bringing the total number of Genetic Communities™ to 61 for Poland. So, this seems like a good time to revisit that concept and compare the newest ethnicity estimates for my family members to each other and to those previous estimates, to see how they have changed over time.
For those who might be unfamiliar with the term, Ancestry’s Genetic Communities™ are the result of Ancestry’s effort to identify more precisely the regions from which each DNA tester’s ancestors originated. They’re assigned automatically, so if Ancestry is able to place you into one of their Communities, they will, without any requirement to opt in. Ancestry’s algorithm takes into consideration the family trees of clusters of DNA testers who all match each other, and uses the locations mentioned in those family trees to identify birthplaces or migration destinations common to the group. Theoretically, if a majority of the family trees incorrectly identified a place of origin for a group of people, the algorithm might be thrown off, but I suspect that this risk is minimized due to the size of Ancestry’s database.
With this most recent update, Ancestry correctly assigned me to a Genetic Community of those with ancestry from Southeast Poland, and further refined that to Northeastern Lesser Poland (Figure 1).
I’ve traced my Klaus and Liguz ancestors to villages in that region between Szczucin and Mielec, so Ancestry nailed that one. Moreover, they were able to be even more precise with my mother’s estimate, specifying Dąbrowa County as one of her ancestral places of origin (Figure 2).
I was also assigned to the Genetic Community of Northeast Poland, indicated by the larger yellow area on the map in Figure 3, with a further assignment to the Łódź Province and Surrounding Area Community.
Zooming in on that map reveals that the “Łódź” area is defined rather broadly, so I’m not surprised that their map encompasses my ancestry from parishes that are in the Mazowieckie province, but are only a few kilometers east of the border with Łódź province. However, I am a little surprised by the extent to which these Genetic Communities overlap, and by the fact that I was not assigned to all of the Genetic Communities that cover a particular geographic area. For example, the geographic region identified as “Łódź Province and Surrounding Area” encompasses my ancestry from parishes in Słupca County, Wielkopolska, nearly 150 km west of Łódź. However, Ancestry has identified other Genetic Communities (e.g. West Central Poland Community, Greater Poland Community, and Central Poland Community) which also cover this region. The map in Figure 4 shows the geographic region identified as the place of origin of those assigned to the Central Poland Community, so one might expect that someone with roots in Słupca County, located west of Konin and east of Poznań, would be assigned to this community, but that was not the case for me. My mother-in-law was assigned to this area, however, so the map shown in Figure 4 comes from her ethnicity estimate.
Of course, these estimates and Genetic Community assignments are still a work in progress, and we have every reason to expect that the accuracy will continue to improve over time. With that in mind, here is the table which compares the ethnicity estimates for my family, consisting of a group of four siblings, their parents, and all four grandparents (Figure 5). For each ethnicity component, the reported value is given in bold, with the range indicated in the line below. Check marks indicate the Genetic Communities that were assigned to each tester. A dash indicates that a person was not assigned to a particular ethnic group or Genetic Community. Ancestry tests for over 1500 ethnicities, but only the ten groups shown were reported in ethnicity estimates for members of my family.
As with my previous post, it’ll be helpful to discuss the ethnicities in my family based on pedigree. The ancestors of my father-in-law (“Paternal Grandpa” in the chart) were ethnic Poles from the Russian and Prussian partitions as far back as I’ve been able to discover. (A brief discussion of the partitions of Poland and subsequent border changes is found here.) My mother-in-law’s (“Paternal Grandma’s”) ancestors were also ethnic Poles, from the Prussian partition. My mother’s (“Maternal Grandma’s”) family were ethnic Poles from the Russian and Austrian partitions. My father’s (“Maternal Grandpa’s”) ancestry is more mixed. His mother’s family was entirely German, and his father’s family was half German/Alsatian, half English/Irish/Scottish.
Based on those pedigrees, “Paternal Grandpa,” “Paternal Grandma,” “Dad,” and “Maternal Grandma” should all be 100% Polish ethnicity, since all of their ancestors were Poles, living in Polish lands, as far back as I have traced thus far. I’m half Polish, since all my ancestors on my Mom’s side were Polish and none of my Dad’s ancestors were, and my kids, then, are 75% Polish.
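For anyone who wants to see the pedigree arithmetic spelled out, here is a minimal sketch in Python. It assumes only that each person's expected share of an ethnicity is the average of the two parents' shares; the function and variable names are my own, chosen for illustration, and this is the expectation by pedigree, not what a DNA test will necessarily report.

```python
from fractions import Fraction


def expected_from_parents(mother, father):
    """Expected pedigree share: half from each parent, on average."""
    return (mother + father) / 2


# Polish share by pedigree, as described in the text.
maternal_grandma = Fraction(1)  # all known ancestors were Poles
maternal_grandpa = Fraction(0)  # German/Alsatian and English/Irish/Scottish
husband = Fraction(1)           # both of his parents are Polish by pedigree

me = expected_from_parents(maternal_grandma, maternal_grandpa)
kids = expected_from_parents(me, husband)

print(f"Me: {float(me):.0%}, kids: {float(kids):.0%}")
```

Using exact fractions keeps the halving free of rounding error, which matters more once the same calculation is carried back several generations.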
For comparison, the summary chart for the data from March 2019 is shown in Figure 6.
In comparison with these earlier data, the November 2021 ethnicity estimates for each person have not changed significantly. My father-in-law (“Paternal Grandpa”), for example, was previously reported to be 83% Eastern Europe & Russia, 16% Baltic States, and 2% Finland; in this current estimate, 84% of his ethnicity was Eastern Europe & Russia, with 11% Baltic, and 5% Sweden & Denmark. The Baltic and Sweden & Denmark components may or may not be real, since the reported ranges indicate 0% at the low end. It may happen that these components eventually disappear, just as the “Finland” component did, as the ethnicity estimates are continually refined. However, it’s also possible that these components are real, and reflect retained traces of more ancient ancestry. Time will tell.
My father-in-law was also assigned to some Genetic Communities™, specifically, the Northeast Poland community, with additional sub-assignments of Central & Northeast Poland, Central Poland, and Łódź Province and Surrounding Area. Given the degree of overlap between those communities, I think this is, at best, a modest improvement over the simple statement that his ethnicity is Polish, but it’s a step in the right direction, at least.
Another interesting difference between the 2019 ethnicity estimate and the current estimate is the increase in my Dad’s (“Maternal Grandpa’s”) reported Scottish ethnicity. This is due to Ancestry’s attempt in 2020 to differentiate between the closely-related ethnic groups in the United Kingdom. As explained in this blog post by Barry Starr, Ph.D., Director of Scientific Communications at Ancestry, earlier reference panels included only two groups for this region, an Irish/Celtic/Gaelic group and an Anglo-Saxon/British/English group. In 2020, Ancestry added additional reference panels in an attempt to offer increased granularity, so testers with U.K. ancestry could now be assigned to one or more of four ethnic groups for this region: England & Northwestern Europe, Ireland, Scotland, and Wales.
Unfortunately, this particular change to the algorithm led to inflated estimates of Scottish ancestry for many of us. In 2019, my Dad’s combined “Ireland & Scotland” component represented 4% of his ethnicity (range = 0–5%). For comparison, we can calculate Dad’s ethnicity by pedigree. His most recent Irish ancestor was his great-great-grandfather, Robert Walsh, from whom Dad would have received, on average, 6.25% of his DNA. Another great-great-grandmother, Catherine (Grant) Dodds, was the source of Dad’s Scottish ancestry, but her family’s origins are unclear, as she herself was most likely born in Canada of parents or grandparents who were Scottish immigrants. If we assume that Catherine’s ancestry was purely Scottish, then Dad would be expected to inherit 6.25% Scottish ethnicity from her, for a total of 12.5% “Ireland & Scotland.” So, the 4% “Ireland & Scotland” reported in 2019 falls short of that, partly due to the random nature of DNA inheritance through recombination—Dad may simply have inherited less than the average amount of DNA from each of those two ancestors—and partly due to the inexact science of generating ethnicity estimates.
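The 6.25% figure comes from halving once per generation. A rough sketch of that calculation follows; the helper name `expected_share` is hypothetical, and the result is only an average, since recombination means the actual amount inherited from any one ancestor can be higher or lower.

```python
def expected_share(generations_back):
    """Average fraction of DNA inherited from one ancestor.

    generations_back: 1 = parent, 2 = grandparent, 3 = great-grandparent,
    4 = great-great-grandparent, and so on.
    """
    return 0.5 ** generations_back


GREAT_GREAT_GRANDPARENT = 4

irish = expected_share(GREAT_GREAT_GRANDPARENT)     # Robert Walsh
scottish = expected_share(GREAT_GREAT_GRANDPARENT)  # Catherine (Grant) Dodds,
                                                    # assuming purely Scottish
combined = irish + scottish

print(f"Ireland: {irish:.2%}, Scotland: {scottish:.2%}, combined: {combined:.2%}")
```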
However, in Dad’s current ethnicity estimate, his Scottish component is inflated to a whopping 31% (range = 12–33%), while his Irish estimate is 3% (range = 0–7%), and his England & Northwest Europe component comes in at 18% (range = 0–51%). These changes are the result of that attempt in 2020 to distinguish between Irish, Scottish, Welsh, and English/Northwestern European ethnicities, and they effectively double his total U.K. ancestry, which should be about 25% since all of his English/Irish/Scottish roots are through one grandmother, Katherine (Walsh) Roberts. (Dad’s other three grandparents were all German or Alsatian.) I suspect that this over-estimate of Scottish ancestry will be resolved in a future ethnicity estimate update.
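To make the "effectively double" comparison concrete, the three reported components can be added up and set against the pedigree expectation. This is just the arithmetic from the paragraph above, using the reported values as given; the variable names are mine.

```python
# Dad's current reported components for this region, in percent.
reported = {
    "Scotland": 31,
    "Ireland": 3,
    "England & Northwestern Europe": 18,
}

# One grandparent out of four carried all of this ancestry.
pedigree_total = 25

reported_total = sum(reported.values())
ratio = reported_total / pedigree_total

print(f"Reported: {reported_total}%, expected by pedigree: {pedigree_total}%")
print(f"Reported total is about {ratio:.1f} times the pedigree expectation")
```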
I think the rest of the data in the charts largely speak for themselves, so an exhaustive analysis of each person’s results is unnecessary. However, a few observations can be made:
- Child 1 and Child 4 both had ethnicities reported that were not detected in the tests of either their parents or their grandparents. Child 1 was reported to have 1% DNA (range = 0–4%) from Sardinia, and Child 4 was reported to have 6% (range = 0–12%) DNA from Norway. Since DNA cannot “skip a generation,” these results cannot reflect any true ethnic origins in those areas. Since we only recognize that these results are spurious by comparing them with data from both parents, this illustrates the need for caution in interpreting ethnicities reported at values less than about 10%.
- Even if a reported ethnicity matches the known pedigree, checking the range of values is recommended; anything that dwindles down to 0% should be taken with a grain of salt, in the most conservative interpretation.
- Ancestry’s Genetic Communities™, identified in conjunction with place data from family trees, track well across generations. There were no Communities assigned to children which were not also assigned to their parents, and in one case, a parent’s data exhibited a higher degree of accuracy and precision (Northeastern Lesser Poland > Dąbrowa County) than was detected in the child.
- Identification of Genetic Communities™ did not always line up with known data about ancestral origins, even when those origins are confirmed through DNA matches. Despite having a grandmother born in Greater Poland and having deep ancestry in that region confirmed by DNA matches, my mother was not assigned to this Community. Despite having no evidence of ancestry from places further south than Greater Poland, my mother-in-law was assigned to the Southeast Poland Genetic Community. Go figure.
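The first two checks in the list above are easy to run by hand, but they can also be sketched in a few lines of Python. The function names and the data layout are my own assumptions, not anything Ancestry provides. The ranges are the ones quoted in this post; the "Eastern Europe & Russia" entries in the region sets are placeholders added only so the comparison has something to match against.

```python
def unexplained_regions(child_regions, parents_regions):
    """Regions reported for a child but for neither parent."""
    seen_in_parents = set().union(*parents_regions)
    return sorted(set(child_regions) - seen_in_parents)


def ranges_reaching_zero(estimates):
    """Regions whose reported range has 0% at the low end."""
    return sorted(
        region for region, (value, low, high) in estimates.items() if low == 0
    )


# Placeholder region sets (illustrative, not copied from the summary chart).
parents = [{"Eastern Europe & Russia"}, {"Eastern Europe & Russia"}]
child_1 = {"Eastern Europe & Russia", "Sardinia"}
child_4 = {"Eastern Europe & Russia", "Norway"}

print(unexplained_regions(child_1, parents))
print(unexplained_regions(child_4, parents))

# (value, low, high) in percent, as quoted in the text.
quoted = {
    "Sardinia (Child 1)": (1, 0, 4),
    "Norway (Child 4)": (6, 0, 12),
    "Scotland (Maternal Grandpa)": (31, 12, 33),
    "Ireland (Maternal Grandpa)": (3, 0, 7),
}
print(ranges_reaching_zero(quoted))
```

Note that the Scotland entry passes the range check even though it overshoots the pedigree, so a range that stays above 0% is not proof that an estimate is right.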
At the end of the day, these are only estimates of one’s ethnicity, and they are liable to change, modestly or significantly, as additional testers enter the data pool and new reference populations are added for comparison. DNA match lists are ultimately more useful than ethnicity estimates in answering genealogical research questions, but it’s nonetheless fascinating to see how these estimates play out within a family group.
© Julie Roberts Szczepankiewicz 2021