In my last post, I tried to introduce some basic concepts in genetic genealogy in a shameless attempt to stimulate increased interest in DNA matches among scientifically-challenged genealogists. When I say “scientifically-challenged,” I’m not judging, here. Many family historians took high school biology classes a long time ago, and for those who attended high school back in the 1950s or ’60s, the topic of DNA may not have been covered at all. So it’s not surprising that there are many testers who aren’t really sure how to make best use of their results. Today, I’d like to talk a little more about using your DNA match list at Ancestry, and the specific tools that are available there.
I think most of us realize that the list of DNA matches we receive when we first get our test results is not static. Nope, DNA test results are the gift that keeps on giving, and as more people test, our list of matches will grow. For this reason it’s a good idea to check back periodically to view your DNA results reported by each company with which you’ve tested. Ancestry will report your closest matches first, and after that, more distant matches are ranked in terms of both shared DNA (expressed in centimorgans, cM) and general degree of confidence (expressed as “extremely high,” “very high,” “high,” or “moderate”). New test results from the holiday DNA test sales have already started to hit the match lists, so you may have noticed a more dramatic uptick in your number of matches recently. A few weeks ago, I had 209 pages of matches at Ancestry (at 50 matches per page); today, I have 234 pages of matches. Of course, not all of these are quality matches. As discussed previously, many of the distant matches, especially those with whom I share 7 cM of DNA or less, are false positives and should be evaluated very critically, or completely ignored.
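For readers who like to see the arithmetic, here is a minimal Python sketch of what those page counts imply. It assumes every page holds the full 50 matches, so the totals are upper bounds (the last page may be only partly full); the function name is made up for illustration.

```python
MATCHES_PER_PAGE = 50  # page size quoted in the post


def max_matches(pages: int) -> int:
    """Upper bound on the number of matches; the final page may be partly full."""
    return pages * MATCHES_PER_PAGE


before = max_matches(209)  # a few weeks ago
after = max_matches(234)   # today
print(before, after, after - before)
```

Under that assumption, the list grew by roughly 1,250 matches, from about 10,450 to about 11,700.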
If you’re new to DNA testing, your list may include some close matches from your family tree, because you have family members (already known to you) who decided independently to test their DNA. That’s a good thing; those results confirm the established paper trail as discussed in the last post, and with any luck, you can persuade those cousins to upload their DNA to one of the other sites that offers a chromosome browser, such as Family Tree DNA, MyHeritage, or GEDmatch. Be sure to encourage them to read the terms of service for these sites before they upload.
If your match list includes very few close matches (Immediate Family, Close Family, 1st Cousins, 2nd Cousins or Third Cousins), you may want to begin proactively target-testing your family members, as finances permit. I typically wait until there’s a good sale, and then stock up on several kits at once. The general rule of thumb is to test your oldest relatives first. This is because the amount of autosomal DNA that you could potentially inherit from any particular ancestor gets cut in half with each successive generation. So, my Grandma’s DNA contains information about ancestors about two generations further back than mine does. If you are among the oldest members of your family, don’t despair. You can still gain valuable information by testing your siblings and first cousins. Full siblings (except for identical twins) inherit different bits of DNA from their parents, which they then pass down to successive generations. Therefore testing your siblings gives you added information about your parents’ DNA, and testing your first cousins will give you additional insights into your grandparents’ DNA, that are not provided by looking only at your own DNA. However, DNA samples from the children or grandchildren of your siblings and first cousins will not be as informative, since your parents or grandparents’ DNA is further diluted in these individuals. If you think of your ancestors—the people you’re trying to learn about—as the targets of your DNA testing, then relatives from older generations are “upstream” from your position in the family tree, and therefore closer to the target, and relatives from younger generations are “downstream.”
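The halving rule can be sketched in a few lines of Python. This is only an illustration of the averages; the real amount inherited from any one ancestor varies from person to person, and the function name is invented for this example.

```python
def expected_share(generations_back: int) -> float:
    """Average fraction of autosomal DNA inherited from a single ancestor
    who sits `generations_back` generations above the tester."""
    return 0.5 ** generations_back


labels = ["parent", "grandparent", "great-grandparent", "2x great-grandparent"]
for generations, label in enumerate(labels, start=1):
    print(f"{label}: {expected_share(generations):.2%}")

# An ancestor four generations above me is only two generations above my grandmother.
mine = expected_share(4)
grandmas = expected_share(2)
print(f"my expected share: {mine:.2%}, Grandma's expected share: {grandmas:.2%}")
```

On average, then, a grandmother carries four times as much DNA from that ancestor as her grandchild does, which is why testing the oldest generation first pays off.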
Just how different can siblings’ DNA test results really be? Well, if you consider DNA matches at the level of 4th cousins or closer (a level at which matches are unlikely to be Identical by Population or Identical by Chance), my sister currently has 248 matches, and I have 169. Her matches provide 79 additional opportunities to understand our shared ancestry, based on the different bits of DNA we inherited from our parents. Since I now have DNA test data from both our parents, my sister’s test results are nice to have, but not critically important. However, before my parents tested, my sister’s DNA gave me opportunities to confirm my documentary research, thanks to those additional matches to distant cousins whom she matched, but I did not. These individuals are still documented cousins to both of us, but they are only genetic cousins to her, and not to me.
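If you keep your match lists in a spreadsheet or text file, comparing two siblings' lists is a simple set operation. The sketch below uses invented names and tiny lists purely for illustration. It also shows that the difference in list lengths is only a lower bound on the number of matches unique to one sibling, because each sibling can have matches the other lacks.

```python
# Hypothetical match lists; the names are invented for illustration.
my_matches = {"Ann", "Bob", "Cara"}
sister_matches = {"Ann", "Bob", "Dev", "Eli", "Fay"}

only_sister = sister_matches - my_matches  # cousins she matches and I do not
only_me = my_matches - sister_matches      # cousins I match and she does not
shared = my_matches & sister_matches       # cousins we both match

print("Only my sister:", sorted(only_sister))
print("Only me:", sorted(only_me))
print("Both of us:", sorted(shared))
print("Difference in list lengths:", len(sister_matches) - len(my_matches))
```

Here the lists differ in length by two, but the sister actually has three matches that are missing from the other list.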
As you review your closest matches, you may notice that Ancestry “mislabels” some of the relationships. For example, my “Second Cousin” category includes two documented first cousins once removed (1C1R), and a documented grandaunt (or great-aunt, if you prefer). Conversely, I have some documented second cousins who are reported to be third cousins. What’s going on here? Does this mean the DNA test is inaccurate?
In a word, no. DNA inheritance is somewhat random thanks to a process known as genetic recombination. There are a variety of great articles out there to help you understand this process, but the take-home message is that the amount of DNA you inherit from any given ancestor—say, a grandparent—is random. On average, a person will inherit 25% of his or her DNA from each grandparent, but in practice, one may inherit 27% from the paternal grandmother, 23% from the paternal grandfather, 26% from the maternal grandmother, and 24% from the maternal grandfather, while a sibling’s inherited percentages will be different.
Because of this, it’s impossible to say that second cousins will always share exactly X number of centimorgans of DNA. Instead, it’s more appropriate to think in terms of averages and ranges, i.e. according to Ancestry, second cousins will share between 200 and 620 cM. Using another real-life example from my DNA match list, I have a documented second cousin, Ellie, with whom I share only 175 cM. Since this amount is less than the 200 cM defined by Ancestry as the low end of the acceptable range for the second-cousin category, Ancestry’s algorithms guess that she’s actually my third cousin. However, Ancestry’s definitions oversimplify the situation to some extent, as we’ll see in a moment.
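To make the range idea concrete, here is a minimal Python sketch that checks a shared-cM value against the second-cousin range quoted above. It is not Ancestry's actual algorithm, which is not described in this post; the constant and function names are assumptions made for illustration.

```python
# Range quoted in this post for Ancestry's second-cousin category, in cM.
SECOND_COUSIN_RANGE = (200, 620)


def in_range(shared_cm: float, cm_range: tuple = SECOND_COUSIN_RANGE) -> bool:
    """Return True if the shared amount falls inside the given range (inclusive)."""
    low, high = cm_range
    return low <= shared_cm <= high


print(in_range(175))  # the documented second cousin who shares 175 cM
print(in_range(250))  # a hypothetical match sharing 250 cM
```

The 175 cM match falls outside the range even though the relationship is documented, which is exactly why a single fixed range should be read as a guide rather than a verdict.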
Figure 1 shows a few of my second cousin matches at Ancestry with their names blacked out to protect their privacy, to illustrate a few points about how Ancestry presents the data.
Figure 1: Second cousin matches at Ancestry.
Although these are grouped under the heading, “2nd Cousin,” you can see that Ancestry qualifies that assignment with the notation, “Possible range: 1st-2nd cousins.” If you click on the question mark, circled in red, you will find the chart shown in Figure 2, which indicates the amount of shared DNA (in centimorgans) which Ancestry uses to define relationships. According to this chart, DNA matches must share between 200 and 620 cM of DNA, as stated previously, in order for Ancestry’s algorithm to classify them as second cousins. However, Ancestry offers a disclaimer at the bottom of the page which cautions that “The exact amount of shared DNA can vary beyond the ranges shown in the table.”
Figure 2: Ancestry’s Predicted Relationship Info page.
Fair enough; so where can we obtain a more comprehensive understanding of the possible relationships suggested by a given amount of shared DNA? There are two good resources for this. The first is Blaine Bettinger’s Shared Centimorgan Project, an ongoing, collaborative effort to gather empirical evidence for the amount of DNA shared between DNA matches whose relationships are known or have been established through documentary evidence. The most recent update from the project includes data from more than 25,000 DNA matches, and can be found at the author’s website, and also at the website of the International Society of Genetic Genealogy.1 A copy of this chart is shown in Figure 3.
In the example of a DNA match with whom I share 175 cM, examination of the chart reveals that this amount of shared DNA falls into the established range for the following relationships: Half GG-Aunt/Uncle, Half Great Aunt/Uncle, Half 3C, Half 2C, Half 2C1R, Half 2C2R, Half 1C, Half 1C1R, Half 1C2R, Half 1C3R, Half Great-Niece/Nephew, Half GG Niece/Nephew, 1C1R, 1C2R, 1C3R, 2C, 2C1R, 2C2R, or 3C. As you can see from these notations, some of these relationships are more complicated than simple first cousins, second cousins, etc., so if you’re not sure what “third cousin twice removed” (3C2R) really means, this chart by Nathan Yau may be useful.2
These results represent quite a lot of possibilities, and statistically, some of those relationships will be more probable than others. With this in mind, Jonny Perl, creator of the DNA Painter site, developed an interactive tool which combines data from the Shared cM Project with probability statistics provided by Leah Larkin. Figure 4 shows the result of plugging my 175 cM shared DNA into this calculator.
According to this tool, it’s statistically most likely that two people who share a total of 175 cM DNA will be half second cousins, second cousins once removed, half first cousins twice removed, or first cousins three times removed, although other relationships are possible.
Leveraging Shared Matches
To examine an individual match from your list at Ancestry more closely, click on either the person’s name or on “View Match.” That will bring you to a screen that looks like the one in Figure 5.
Figure 5: Ancestry’s main match analysis screen.
There are a few points of interest to note in this display. The first thing to notice is that there are four options for looking at the match in more detail. The default is the screen we’re on, “Pedigree and Surnames” underlined in purple in Figure 5. Since this DNA match does not have a family tree, we don’t get much information here. Similarly, if we were to click over to “Map and Locations,” we wouldn’t learn much, either, since those are dependent on information included in the family tree. So in cases like this one, where the tester has not linked a family tree, what can we learn?
Well, for starters, the ethnicity estimate, circled in yellow, might give us some clues about which side of the family the match is on, assuming one’s mother and father come from different ethnic backgrounds. In my case, my Mom’s ancestors were entirely Polish as far back as I’ve been able to research thus far, while my Dad is a mix of German, Alsatian, English, Irish, and Scottish. Since the ethnic blend reported for this DNA match is Eastern Europe and Russia, Baltic States, and European Jewish, she’s most likely a cousin on my mom’s side. This is an important bit of information, since a primary goal of DNA testing for most of us is to understand how we’re related to our DNA matches in the hope of extending and confirming our documentary research.
Are we ready to contact this match to request more information on her family tree? Not yet. The text underlined in red in Figure 5 informs us that the last time this member logged into her Ancestry account was October 2018—about four months ago, as I write this. That suggests that she’s not actively researching her family tree, and may have been one of the many people who bought a DNA test in order to see the ethnicity estimate, with no real interest in the list of DNA matches. That doesn’t necessarily mean that we shouldn’t reach out to DNA matches who haven’t logged into their accounts in a while, but it means that we should temper our expectations of a reply.
So what else should we do before we write to her? Check out the shared matches! These are one of the most powerful tools we have at our disposal at Ancestry, and they can be accessed by clicking “Shared Matches,” to the right of “Pedigree and Surnames.” When I do that, I can see all the matches I share in common with this person, which can provide some evidence of how we might be related. Note that caution must still be exercised when interpreting these results. If I see my 4C1R, Mark, among the shared matches for a particular DNA match (Dan), and I know that I’m related to Mark through my Ptaszkiewicz/Łącki line, it suggests that I am also related to Dan through that line. However, other interpretations are possible. For example, Dan might match Mark through Mark’s father’s side, while Mark matches me through his mother’s side. What we’re looking for is not just a shared match, but a shared segment of DNA, which suggests common ancestry. The only way to know if a segment of DNA triangulates between three people is to use a chromosome browser, which Ancestry lacks.
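The difference between a shared match and a triangulated segment can be sketched in Python. This is a simplified illustration, not the method used by any particular chromosome browser: it assumes you have exported the matching segments for all three pairs (me and Mark, me and Dan, Mark and Dan) from a site that provides them, and every position below is invented.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    chromosome: int
    start: int  # start position in base pairs
    end: int    # end position in base pairs


def overlap(a: Segment, b: Segment) -> Optional[Segment]:
    """Return the region common to both segments, or None if there is none."""
    if a.chromosome != b.chromosome:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return Segment(a.chromosome, start, end) if start < end else None


def triangulated_region(me_mark: Segment, me_dan: Segment,
                        mark_dan: Segment) -> Optional[Segment]:
    """Region on which all three pairwise matches overlap, if any."""
    first = overlap(me_mark, me_dan)
    return overlap(first, mark_dan) if first else None


# Hypothetical segments: all three pairs overlap on chromosome 7.
print(triangulated_region(Segment(7, 10_000_000, 30_000_000),
                          Segment(7, 20_000_000, 45_000_000),
                          Segment(7, 18_000_000, 32_000_000)))

# Hypothetical segments: Dan matches me on a different chromosome, so
# he is a shared match with Mark but the segments do not triangulate.
print(triangulated_region(Segment(7, 10_000_000, 30_000_000),
                          Segment(12, 5_000_000, 25_000_000),
                          Segment(7, 18_000_000, 32_000_000)))
```

In the second case Dan would still appear in a shared-matches list, yet there is no common segment pointing to a single ancestral line.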
We’ll look at the Shared Matches screen more closely in a minute, but on this screen, I’d also like to point out Ancestry’s option for making notes on each of your DNA matches, circled in blue in Figure 5. Once I have a good idea about how I’m related to a certain match, I like to make a note about the relationship (e.g. “paternal 4C3R”) and the path back to our most recent common ancestral couple. A very helpful tool when working with Ancestry matches is a Google Chrome extension called MedBetter DNA, which will enable all your notes on DNA matches to show on the main listing page simultaneously, so you don’t have to click on each note individually in order to view it. It also allows you to filter your matches according to certain surnames (or other keywords), provided they are prefaced with a hashtag (#) in the notes. A more detailed explanation of how to use MedBetter DNA can be found on Kitty Cooper’s blog.3
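The hashtag convention is easy to imitate if you keep your own copy of your notes. The Python sketch below is not how MedBetter DNA works internally (that is not covered here); it only illustrates the idea of tagging notes and filtering on a tag. The notes, the tags, and the function names are all hypothetical.

```python
import re


def tags_in(note: str) -> set:
    """Collect the hashtags in a note, lowercased for case-insensitive matching."""
    return {tag.lower() for tag in re.findall(r"#(\w+)", note)}


def filter_by_tag(notes: dict, tag: str) -> list:
    """Names of matches whose note carries the given hashtag."""
    wanted = tag.lower()
    return sorted(name for name, note in notes.items() if wanted in tags_in(note))


# Hypothetical notes keyed by match name.
notes = {
    "Mark": "4C1R #Ptaszkiewicz #Łącki",
    "Dan": "relationship unknown, possibly #Łącki",
    "Ellie": "documented 2C #Wagner",
}

print(filter_by_tag(notes, "Łącki"))
print(filter_by_tag(notes, "wagner"))
```

Because Python's `\w` matches Unicode letters by default, surnames with diacritics such as Łącki are picked up as whole tags.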
As mentioned, Ancestry offers four options for looking at a match in more detail. The first three are the “Pedigree and Surnames,” “Shared Matches” and “Map and Locations” tools, and the fourth is the “Compare” utility, to the left of “Ethnicity” and “Send Message” in Figure 5. The Compare utility gathers all the information about predicted relationship, shared centimorgans, ethnicity estimates, shared migrations, and shared matches and displays it on one page. Note that their data on “shared migrations” incorporates data from Ancestry’s Genetic Communities.™ To identify these “Genetic Communities,” Ancestry uses data from the family tree of each DNA tester who has uploaded a tree. So when Ancestry tells me that I belong to the Pomerania “Genetic Community,” that assignment is not based on evidence from my DNA, but on the fact that I, and many of my DNA matches, have family trees which include ancestors born in the region that Ancestry defines as “Pomerania.” Note that Ancestry’s definitions of historical regions such as Pomerania might not correspond to the traditional borders of such regions, so you might want to take those designations with a grain of salt.
Putting It All Together
As an initial goal, you may want to try to identify your connection to each of your unknown “close” matches, but it’s up to you to decide how to define “close,” based on your total number of matches. You can start off by going down the list, one at a time, or you can search your match list according to surname, to see which of your matches report that surname in their family tree. Be careful with common surnames—just because you and a match both have a common surname, like Smith, Jones, Nowak, or Wagner in your tree, you cannot assume that this is how you are related without documentary research to back that up. The best way to begin the process of identifying your relationship to your matches is by seeking those that share surnames and geographic locations in common with you. As long as you have one of those things in common to begin with, a surname or a place of origin, you can start documentary research to investigate the relationship.
As you identify your relationship to each person in your list, you may want to contact that person to introduce yourself. To increase the likelihood of a reply, I try to give my matches some specific information about our connection, without being overwhelming. I also try to explain the benefits of using a chromosome browser to better understand the location of our matching segment(s) of DNA, and encourage them to consider downloading their raw DNA from Ancestry and uploading to one of the sites that offers a chromosome browser, such as MyHeritage, GEDmatch, or Family Tree DNA, being sure to read the sites’ privacy policies first.
Even if a DNA match has a family tree online, it’s important to be skeptical regarding the information contained there, since not all family historians are equally experienced or rigorous. And even when a match’s tree indicates an intersection of surname and place with your own tree, additional research may still be necessary to trace back to the common ancestral couple. Clearly, DNA testing is not (usually!) a magic wand that will miraculously give you four more generations back in your tree, unless you happen to discover that you’re related to a very good genealogist. Nonetheless, it’s very satisfying when you are able to find solid documentary evidence identifying your relationship to a DNA match, and when that documented relationship is consistent with the amount of shared DNA. At that point, it’s usually safe to call the match “solved” and move on to the next one.
Solving such puzzles can often reveal quite a bit of new information about one’s family tree. DNA testing has enabled me to discover and connect with living cousins in Europe, which is fun. It’s also answered questions about the disappearance of my ancestors’ cousins from Polish records, many of whom also immigrated to the U.S., but settled in places other than where my family settled. With time, effort, and a bit of luck, it’s also possible to use DNA testing to solve unknown parentage cases.
DNA testing offers a unique opportunity for genealogists to confirm and extend their documentary research. Although there’s a bit of a learning curve involved with understanding the key concepts, the abundance of online educational resources makes it possible to gain confidence with interpreting DNA results. That long list of DNA matches might seem intimidating at first, but bit by bit, it’s possible to work through them. A willingness to collaborate with our DNA matches will make the genealogy community a better place for all of us. So what are you waiting for? Check out your matches and maybe we can connect.
Sources and Further Reading:
1 Blaine Bettinger, “August 2017 Update to the Shared cM Project,” The Genetic Genealogist: Adding DNA to the Genealogist’s Toolbox, posted 26 August 2017, (https://thegeneticgenealogist.com : 19 February 2019); and
Some relevant blog articles:
Using Ancestry’s “Shared Matches” tool: https://thegeneticgenealogist.com/2015/08/28/ancestrydna-announces-new-in-common-with-tool/
Understanding Ancestry’s Genetic Communities™: https://thegeneticgenealogist.com/2017/03/28/ancestrydnas-genetic-communities-are-finally-here/
© Julie Roberts Szczepankiewicz, 2019