New Ethnicity Estimates from 23&Me: Getting Better, Maybe?

23&Me recently updated their ethnicity estimates to include a great deal more detail, and the results are frankly fascinating. In some cases, they nailed it. In other cases, not so much. So what does that say about the overall accuracy of their analysis?

My “Eastern European” Ancestry

Before I go into the analysis, I should mention that I actually have two separate ethnicity estimates generated by 23&Me. The first was based on DNA data downloaded from Ancestry, and uploaded to 23&Me during a special one-day event in April 2018. (23&Me no longer accepts uploads from Ancestry or any other company.) The second ethnicity estimate was based on testing done with 23&Me directly. As expected, these estimates differ a bit, so for the purpose of this discussion, I’m going to refer to the ethnicity estimate generated through testing directly with 23&Me.

According to the analysis by 23&Me, I’m 99.8% European. Of this, 45.4% is “Eastern European,” which means Polish, in my case. I’ve discovered that this “Eastern European” designation is a sore spot with many Poles today because of its inaccuracy: geographically, culturally, and historically, Poland is Central European, not Eastern. It would make more sense for the DNA test companies to use the term “Slavic” to describe this genetic component, but unfortunately, they don’t, so I’ll continue to use their “Eastern European” term for the sake of consistency.

23&Me reports that all 45.4% of that “Eastern European” component is accounted for by Polish ancestors, with Polish ancestry being “highly likely.” So far, so good; by pedigree, I’m 50% Polish. But here’s where it starts to get interesting. In this latest update, 23&Me further refines that estimate and claims that, out of Poland’s 16 modern-day voivodeships (provinces), the strongest evidence of my ancestry was found in the following provinces, which are ranked in terms of confidence (Figure 1):

Figure 1: 23&Me’s map indicating strength of evidence for biogeographic origins within present-day Poland. Darker colors indicate stronger evidence of ancestry.

As stated in the figure, “Poland has 16 administrative regions, and we found the strongest evidence of your ancestry in the following 6 regions:

  1. Subcarpathian (Podkarpackie) Voivodeship
  2. Lesser Poland (Małopolskie) Voivodeship
  3. Lublin (Lubelskie) Voivodeship
  4. Silesian (Śląskie) Voivodeship
  5. Greater Poland (Wielkopolskie) Voivodeship
  6. Masovian (Mazowieckie) Voivodeship”

How does this compare with documented pedigree? If we look at this list again, with documented ancestry from that region in parentheses, we have:

  1. Subcarpathian (Podkarpackie) Voivodeship (1/16 ancestry, or 6.25%)
  2. Lesser Poland (Małopolskie) Voivodeship (1/16 ancestry, or 6.25%)
  3. Lublin (Lubelskie) Voivodeship (0% ancestry)
  4. Silesian (Śląskie) Voivodeship (0% ancestry)
  5. Greater Poland (Wielkopolskie) Voivodeship (1/8 ancestry, or 12.5%)
  6. Masovian (Mazowieckie) Voivodeship (1/4 ancestry, or 25%)

I found it really interesting that genetic contributions from the Podkarpackie and Małopolskie provinces seemed to be so pronounced. I have one great-great-grandfather, Andrzej Klaus, whose family was from an area presently located in Małopolskie, and one great-great-grandmother, Marianna Łącka (Andrzej’s wife), whose family was from a village presently located in Podkarpackie, so it makes sense that genetic traces from those regions showed up in the analysis. But I have four great-great-grandparents from Mazowieckie province, which shows up much further down the list, and two great-great-grandparents from Wielkopolskie province. In all those cases, I have identified DNA matches which confirm documentary evidence. This means that I really do have ancestry from all the regions I think I do, in the proportions that I think I do.
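The pedigree percentages quoted above come from simple counting: each of the 16 great-great-grandparents contributes, on average, 1/16 of a person's autosomal DNA. The following is a minimal Python sketch of that arithmetic. The dictionary and function names are illustrative assumptions, and the counts are the documented ones described above.

```python
from fractions import Fraction

# Documented great-great-grandparents (out of 16) per voivodeship,
# using the counts given in the text. Names here are illustrative only.
GG_GRANDPARENTS_BY_REGION = {
    "Podkarpackie": 1,
    "Małopolskie": 1,
    "Lubelskie": 0,
    "Śląskie": 0,
    "Wielkopolskie": 2,
    "Mazowieckie": 4,
}


def expected_share(ancestor_count: int, generation_size: int = 16) -> Fraction:
    """Average expected fraction of DNA from `ancestor_count` ancestors
    in a generation containing `generation_size` ancestors."""
    return Fraction(ancestor_count, generation_size)


def pedigree_percentages(counts, generation_size: int = 16):
    """Convert ancestor counts per region into expected pedigree percentages."""
    return {
        region: float(expected_share(n, generation_size)) * 100
        for region, n in counts.items()
    }


if __name__ == "__main__":
    shares = pedigree_percentages(GG_GRANDPARENTS_BY_REGION)
    # List regions from the largest documented share to the smallest.
    for region, pct in sorted(shares.items(), key=lambda item: item[1], reverse=True):
        print(f"{region}: {pct}%")
    print(f"Total documented Polish pedigree: {sum(shares.values())}%")
```

Sorted this way, the documented pedigree puts Mazowieckie first and Wielkopolskie second, which is nearly the reverse of the confidence ranking that 23&Me reported. These figures are averages by pedigree; the share actually inherited from any one ancestor varies.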

So how do we account for the differences between documented pedigree and 23&Me’s genetic analysis? It’s certainly possible that I have deeper ancestry in some of the regions for which 23&Me found genetic traces, beyond what I’ve been able to document thus far. However, it’s important to remember that this analysis superimposes modern geographic province borders over the data, in a way that is both arbitrary and artificial, and I think this is probably the reason behind the supposed genetic contributions from the Śląskie and Lubelskie provinces, from which I have no documented ancestry. The test simply lacks the sophistication to accurately differentiate between “Masovian DNA” and “Lublin DNA,” because the genetic differences between people whose ancestors were from those regions are so slight.

My French and German Ancestry

By pedigree, I’m 50% “Northwestern European” (as 23&Me defines it) on my Dad’s side. Twelve of my 32 great-great-great-grandparents are from Germany or Alsace, which comes to 37.5%, and another 4 out of those 32 great-great-great-grandparents (12.5%) came from the British Isles. What 23&Me actually reports is that I’m only 37.4% Northwestern European, which is further refined to 21.8% French and German, and 6.8% British Isles. Of that 21.8%, German ancestry is considered to be “highly likely,” with the strongest evidence of my German ancestry coming from the regions of Bavaria, Baden-Württemberg, North Rhine-Westphalia, and Hamburg (Figure 2).

Figure 2: 23&Me’s map indicating strength of evidence for biogeographic origins within present-day Germany.

As stated in the figure, the strongest evidence for my ancestry was found in the administrative regions of

  1. Bavaria
  2. Baden-Württemberg
  3. Hamburg
  4. North Rhine-Westphalia

Bavarian ancestry is certainly consistent with the paper trail, and this Bavarian component appears to have left the strongest mark on my DNA as indicated by its deep-blue color on the map. Three of my great-great-grandparents—Anna Murre, Wenzeslaus Meier, and Anna Goetz—have documented roots in Bavaria. For the Meiers and the Goetzes, those roots are deep, although in the case of the Murre family, I have yet to discover the family’s specific place of origin in Bavaria. Similarly, my great-great-grandfather, John G. Boehringer, had roots in Baden-Württemberg, just west of Bavaria. Check. But Nordrhein-Westfalen and Hamburg? There’s nothing in the paper trail to link my family to those locations.

Conspicuously absent from the list are the German states of Rheinland-Pfalz (Rhineland-Palatinate) and Hessen. I have documented ancestry in both those regions through my Roberts/Ruppert and Wagner great-great-great-grandparents, so 1/32 (3.125%) of my ancestry should trace back to those states. Yet they don’t show up in 23&Me’s analysis at all. However, when one considers the location of all three of these states in relation to one another (Figure 3), it seems probable that these discrepancies result from superimposing modern geographical boundaries onto historical populations. Hessen and Rheinland-Pfalz lie just south of Nordrhein-Westfalen, so it may be difficult for any DNA test to distinguish between those populations genetically. The test may be picking up on my Hessen/Rheinland-Pfalz ancestry and erroneously reporting it as Nordrhein-Westfalen. Of course, that explanation still doesn’t account for the trace of “Hamburg” DNA in the last 200 years. Go figure.

Figure 3: Map of Germany showing relative positions of Nordrhein-Westfalen, Rheinland-Pfalz, and Hessen.1

The next three components of my French and German ancestry, which are considered to be “likely,” feel like spitballing. 23&Me reports that I have French ancestry from the Brittany region, Swiss ancestry from the Valais canton, and Dutch ancestry from the South Holland region, all within the last 200 years. I’m not buying those specific regions (Brittany, Valais, and South Holland), and I have no evidence of Dutch ancestry at all, either in the paper trail or in the trees of my DNA matches. However, I do have one family of documented Alsatian origin, that of 3x-great-grandmother Catherine Grentzinger from Steinsoultz, Alsace, as well as another 3x-great-grandmother, Mary Magdalene Causin, who was probably Alsatian. Together they would contribute 6.25% of my DNA, and I believe their genes are probably the source of the French/Swiss ancestry reported by 23&Me. I know so little about Mary Magdalene Causin’s origins that I can’t state definitively that her family was not from Brittany, Valais, or South Holland, but Alsace seems much more probable based on documentary evidence gathered to date.

My British Isles Ancestry

23&Me reported that I have 6.2% British Isles ancestry, which is a bit less than the 12.5% which would be predicted statistically, but is nonetheless reasonable, given the random nature of DNA inheritance through recombination. For this relatively minor component of my genetic make-up, 23&Me offers no fewer than 11 distinct areas from which these genes are supposed to originate. Starting with the United Kingdom, 23&Me reports the strongest evidence of my genetic ancestry from the following regions, ranked in terms of confidence (Figure 4):

Figure 4: 23&Me’s map indicating strength of evidence for biogeographic origins within the present-day United Kingdom.

23&Me reports that the strongest evidence of my ancestry was found in the following 10 of the U.K.’s 165 administrative regions:

  1. Greater London
  2. Greater Manchester
  3. Northumberland
  4. Tyne and Wear
  5. Lancashire
  6. South Yorkshire
  7. Aberdeen City
  8. Dumfries and Galloway
  9. Glasgow City
  10. West Yorkshire

Ireland is also noted as a possible place of origin for my ancestors, although there were insufficient data for their algorithms to point to any particular part of Ireland.

So how does this relate to what I know of my British ancestors? My most recent British ancestor was my great-great-great-grandfather, Robert Dodds, and there’s documentary and DNA evidence to suggest that he came from the general vicinity of Northumberland, possibly the village of Ford. That may account for the genetic traces from Northumberland and Tyne and Wear, and may also account for the Dumfries and Galloway component. Similarly, I have an Irish great-great-great-grandfather, Robert Walsh. So far, so good. My next mystery ancestor with roots in the British Isles was Robert Dodds’ wife, Catherine, whose maiden name was variously reported as Grant or Irving, and whose ancestry was generally reported as Scottish. Given that I don’t even know her maiden name, I can say nothing about her specific origins within Scotland. However, this list would suggest origins in Dumfries and Galloway, Aberdeen, or Glasgow—options with such geographic disparity that they don’t offer much insight beyond the basic fact of Scottish ancestry, which I already knew.

Going back still further in the family tree, I have Hodgkinson and Spencer ancestors who were Loyalists during the American Revolution. There’s some evidence that Robert Spencer, United Empire Loyalist, was a descendant of Michael Spencer, one of four immigrant Spencer brothers who settled in the American colonies during the Great Migration. The Spencer brothers were known to be from Bedfordshire, but detecting such distant ancestry with any sort of confidence is beyond the capabilities of autosomal DNA testing. It seems to be anyone’s guess where John Hodgkinson was from, nor do we even know for certain the maiden name of his daughter-in-law, Christina Hodgkinson, who was the wife of his son, Robert. Christina was my 4x-great-grandmother, 6 generations back from me, which is sufficiently recent in the family tree that she almost certainly left detectable traces in my genes.2
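The generational arithmetic in this section is easy to check. As a rough sketch (the function names are illustrative and not part of any DNA-testing tool), the average expected share of autosomal DNA from a single ancestor halves with each generation, although the amount actually inherited from a distant ancestor varies because of recombination:

```python
from fractions import Fraction


def generations_back(greats: int) -> int:
    """Generations separating a person from an ancestor described with
    `greats` 'great-' prefixes (0 = grandparent, 4 = 4x-great-grandparent)."""
    if greats < 0:
        raise ValueError("greats must be zero or positive")
    return greats + 2


def expected_autosomal_share(generations: int) -> Fraction:
    """Average expected fraction of autosomal DNA inherited from one
    ancestor who lived `generations` generations back."""
    if generations < 0:
        raise ValueError("generations must be zero or positive")
    return Fraction(1, 2 ** generations)


if __name__ == "__main__":
    for greats in (2, 3, 4):
        gens = generations_back(greats)
        share = expected_autosomal_share(gens)
        print(f"{greats}x-great-grandparent: {gens} generations back, "
              f"expected share {share} ({float(share) * 100}%)")
```

By this reckoning a 4x-great-grandparent sits 6 generations back and contributes about 1.56% on average, while each 3x-great-grandparent contributes 3.125%, which is where the 6.25% figure for two 3x-great-grandmothers comes from.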

Could Christina or John Hodgkinson be from Greater London, Greater Manchester, Lancashire, or West or South Yorkshire? Sure, why not? Yet the DNA evidence at this point is far too flimsy for us to conclude that it must be so. Once we’re down in this “trace” region, it seems that anything goes, and by “trace,” I mean that any component that represents less than 5% of the total seems suspect to me. This is where it starts to get bizarre, since I have absolutely no documentary evidence for any of the remaining ethnicities highlighted in the report. According to 23&Me, I’m 6.3% Southern European, which breaks down to

  • 2.8% Italian, with “evidence” of ancestry in Sicily and Campania
  • 0.6% Greek and Balkan
  • 2.9% Broadly Southern European

In addition to that, I’m 0.8% Ashkenazi Jewish, 6.3% Broadly European, and another 0.7% Northern West Asian. There’s a crazy temptation to wonder if this could possibly, somehow be true. There could always be a misattributed parentage lurking in the family tree, right? Or more than one? But that way madness lies. The best remedy seems to be buckling down and extending my family tree through documentary research as far as possible. Perhaps my documentary research will eventually identify some distant, mysterious ancestor(s) from Southern Europe who left their stamp on my DNA. Or perhaps continued refinement of those ethnicity estimates will reveal those ethnic components to be mere phantoms, artifacts of the testing algorithm. Only time will tell.


