Goal-Focused Genealogy, or, Connecting to a DNA Match in 20 Minutes

If you’re reading this, you probably know how time-consuming genealogy can be. The supply of historical documents and individuals to research is endless, so before sitting down for a research session, it’s important to always be asking ourselves, “What is it I want to know?” Having a specific question in mind can help drive you toward the sources of information that are most relevant to the problem.

When I’m researching a DNA match, for example, my essential question is, “How am I related to this person?” I’m not interested in fully documenting that person’s family history; I just want to get to the documents that will allow me to connect him or her to my family tree. I think of this method as “quick and dirty genealogy,” but “goal-focused genealogy” might be a more accurate description. During or after the research session, I’m still careful to create source citations for each document I find, extract each piece of information from each document (e.g. name, date and place of birth, place of residence, etc.), and attach those source citations to each fact I create in my family tree. Nonetheless, keeping my focus on the goal permits me to ignore a lot of “low-hanging fruit” documents that turn up quickly in a search of historical records databases (e.g. Ancestry or FamilySearch), but aren’t likely to give me the information I need to solve the problem. For example, if the 1940 census and the 1920 census both turn up in a database search for a given research target, I’m likely to ignore the 1940 census and investigate the 1920 census result. Why? Because the 1940 census didn’t ask questions about year of immigration or year of naturalization, while the 1920 census did ask those questions, and the information provided by that census record about immigration and naturalization is relevant to the process of tracing immigrant ancestors back to the Old Country. Recently, staying goal-focused enabled me to discover, in about 20 minutes, how a DNA match was related to me, and it made me so happy that I want to share that story with you today.

Introducing Fred Kowalski

Since this is a story about our Polish origins, I’ll call my DNA match Fred Kowalski (not his real name). Fred appeared in my list of autosomal DNA matches at 23andMe, and we were reported to share DNA in a single segment consisting of 51 centimorgans (cM, a unit for measuring genetic distance) on Chromosome 15. Shared matches gave me no clues regarding how we might be related; I didn’t recognize a single name in the list. In his profile on 23andMe, Fred reported that all four grandparents were born in Poland, and he gave me six family surnames to work with, including one that was familiar to me: Słoński. Painting the match onto my chromosome map at DNA Painter revealed that the segment shared with Fred falls into a larger segment of DNA which I inherited from my maternal grandmother, consistent with my preliminary hypothesis that our relationship might be through the Słoński family. Fred’s real surname is not especially popular, so a quick internet search turned up an online obituary for his father. From there, I used the subscription database at Newspapers to find an obituary for his grandmother. I’ll begin the story with her.

The Bengier Family of Steubenville, Ohio

Fred’s grandparents were Peter J. and Constance A. Bengier of Steubenville, Ohio. Constance’s obituary was very informative, but for the sake of this narrative, the most important information was that she was born in Poland on 6 April 1889 to Joseph and Anna Kujawa, and that she married Peter Bengier on 4 February 1907.

Figure 1: Newspaper obituary for Constance A. Bengier.1

Constance’s Social Security application (Figure 2) provided somewhat different information about her parents’ names, in that her father’s name was reported to be Stanley, rather than Joseph. Since Constance would have provided the information for this form herself, rather than another family member providing it after her death, we can consider the information from the Social Security Applications and Claims index to be more reliable than the obituary in this regard.

Figure 2: Information from Social Security Applications and Claims Index for Constance Anna Bengier.2

The 1930 census (Figure 3) provided additional details relevant to tracing the family back to Poland. Although the information on the entire family group is important when documenting the family history, my focus was on tracing the family back to Poland, and the data that was most germane to that issue is contained within the red box.

Figure 3: Image extracted from the 1930 census for German township, Harrison County, Ohio, showing the Bengier family.3

According to the census, Constance Bengier was age 41, suggesting a birth year circa 1889, nicely consistent with previous data from the Social Security application and her obituary. The census record offers enough additional evidence (such as names of other family members) for us to be certain that this Constance Bengier is a match to the Constance Bengier in the obituary. Once we establish that fact, then the most important piece of new information found in this record is her year of immigration, 1910, and the fact that her husband and oldest daughter also reported immigrating in that year. We would expect to find all of them on the same passenger manifest, or possibly on two different manifests, if Peter came over first to secure employment and lodging before sending for his wife and child.
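
The arithmetic behind “circa 1889” can be sketched in a few lines of Python. This is only an illustration, and the function name is my own invention. Since a person who reports being 41 may or may not have had a birthday yet in the year the record was made, a reported age really points to a two-year window rather than a single year.

```python
def birth_year_window(record_year, reported_age):
    """Return the (earliest, latest) birth years implied by an age reported in a given year."""
    latest = record_year - reported_age   # birthday already passed when the record was made
    earliest = latest - 1                 # birthday still to come that year
    return earliest, latest


# Constance Bengier, age 41 in the 1930 census
print(birth_year_window(1930, 41))  # (1888, 1889)
```

The reported birth date of 6 April 1889 falls at the late end of that window, so the census age agrees with the other two sources.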

The critical pieces of information that are required at minimum in order to locate an immigrant in records from his or her home country are the person’s name, approximate date of birth, parents’ names, and specific place of origin. With Constance Kujawa Bengier, I was nearly ready. The missing piece was evidence for her place of origin.

The Bengier Family of Wola, But Which One?

Since the 1930 census provided information about the year of arrival, I decided to seek a passenger manifest next. The Hamburg emigration manifest popped up first, revealing that Konstancia (or Konstancja, modern Polish spelling) Bengier departed from the port of Hamburg on 29 September 1910 at the age of 21, along with her 3-month-old daughter, Walerya (or Waleria, in modern Polish; Figure 4).

Figure 4: Detail from the Hamburg emigration manifest of the SS Cleveland, departing Hamburg on 29 September 1910, showing passengers Konstancia Bengier and her 3-month-old daughter, Walerya.4

The ages matched well with my expectations based on previous data. Given the propensity of immigrants for adapting their given names to sound more “American,” I was not surprised to find that the original name of the daughter, “Voila” (or Viola) from the 1930 census, was actually Waleria. If additional confirmation were required before concluding that this was the correct passenger manifest, the corresponding Ellis Island arrival manifest could also be located. In those days, it took about 2 weeks for a steamship to cross the Atlantic. Assuming no manifest turned up with a search of indexed records, one could browse the manifests in Ancestry’s database, “New York, U.S., Arriving Passenger and Crew Lists (including Castle Garden and Ellis Island), 1820-1957,” and look for the arrival of the Cleveland at the port of New York some time in mid-October 1910. However, as it happens, Ancestry’s database is incomplete, and there are instances such as this where the arrival manifest is not found. If this happens, Ellis Island arrivals can be searched directly at the Ellis Island site, or via Steve Morse’s more sophisticated One-Step search form. (Konstancja’s Ellis Island arrival manifest is here. It confirms and extends the information found in the Hamburg emigration manifest, but I won’t discuss it in detail since it was not part of my original research process.)

The key piece of information found in this manifest that permitted me to advance the search was her place of residence, which was recorded as “Wola,” in Russia. (If you’re wondering why a woman who said she was Polish in 1930 might have been coming from Russia in 1910, there’s an overview of those border changes here.) Now, if this were an ordinary research process, and not one guided by DNA, I would have needed a time-out here to fall back and regroup, and seek additional sources of documentation for Konstancja’s place of birth. That’s because “Wola” is one of those Polish place names that’s so common that it strikes fear into the hearts of even seasoned Polish genealogists. Just how common is it? Mapa.szukacz.pl, which is an interactive Polish map site, reveals that there are 848 places called Wola, or containing Wola in the full name, within the borders of Poland today. And that’s not counting all the additional places called Wola that were previously part of Poland, but are outside of Poland’s current borders.

The situation would have been ameliorated somewhat by the fact that Konstancja’s Wola was recorded as being located in the Russian partition, so we could safely ignore all the places called Wola that were within the German and Austrian partitions. Nonetheless, that would still leave us with a lot of places called Wola to check, unless we could find some additional documentation (naturalization records, church records, military records, etc.) that might provide some geographic clues to help us narrow the field. However, this was not an ordinary research process; it’s a genetic genealogy story, and one with a happy ending.

The Missing Link

Since my hypothesis was that I was related to Konstancja Kujawa Bengier through the family of her mother, Anna Słońska, I immediately suspected that “Wola” might be Wola Koszutska, a village belonging to the Roman Catholic parish of Kowalewo-Opactwo, where I’d found records for my Słoński ancestors. This Wola was in the Russian Empire in 1910, so it would fit the description found in the passenger manifest. Records for this area are indexed in a number of different databases, including Geneteka, BaSIA, the Poznan [marriage] Project and Słupca Genealogy. Each of those databases has its strengths and weaknesses, and there’s a fair amount of overlapping coverage between them. I decided to cut to the chase and search for a marriage record for Stanisław Kujawa and Anna Słońska first, since that would tell me Anna’s parents’ names, rather than searching for a marriage record for Piotr Bengier and Konstancja Kujawa, or a birth record for Konstancja. I plugged in my search parameters at the Słupca Genealogy site, and there it was, bada boom, bada bing! The marriage record for Stanisław Kujawa and Anna Słońska which connected the dots (Figure 5).

Figure 5: Marriage record for Stanisław Kujawa and Anna Słońska from the parish of Kowalewo-Opactwo.5

The record is in Russian, and here’s how I translate it:

“No. 12

Wola Koszutska

This happened in Kowalewo on the first/thirteenth day of November in the year one thousand eight hundred eighty-two at three o’clock in the afternoon. We declare that in the presence of witnesses Antoni Zieliński, age fifty, and Józef Buczkowski, age forty, both owners* of Wola Koszutska, on this day was celebrated a religious marriage between Stanisław Kujawa, bachelor of Wilczna, born in Cienin Kościelny, 27-year-old son of the laborers Łukasz and his deceased wife, Wiktoria née Przybylska Kujawa, and Anna Słońska, single, born and residing with her parents in Wola Koszutska, daughter of Antoni and Marianna Słoński née Kowalska, age twenty-two. The marriage was preceded by three announcements published on the eighth, fifteenth, and twenty-second days of October of this year in the local parish churches of Kowalewo and Cienin Kościelny. The newlyweds declared that they had no prenuptial agreement between them. This Act was read to the illiterate newlyweds and witnesses, and was signed by Us only. [Signed] Fr. Rzekanowski.”

*хозяева, a word which can mean hosts, landlords, owners, proprietors, or masters. In my experience, it’s used to describe the same individuals who were described in Polish-language records as gospodarze, peasant farmers who owned their own land.
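
A note on the double date at the start of the record (“first/thirteenth”): it gives the same day in two calendars, the Julian calendar used officially in the Russian Empire and the Gregorian calendar used in the West. In the 19th century the two were 12 days apart. Here is a minimal Python sketch of the conversion. The function name is my own, and the formula is the standard Julian Day Number calculation, not anything taken from the record itself.

```python
from datetime import date


def julian_to_gregorian(year, month, day):
    """Convert a Julian-calendar (Old Style) date to a Gregorian datetime.date."""
    # Standard Julian Day Number formula for a Julian-calendar date
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083
    # Python ordinals count days from 0001-01-01 (Gregorian), which is JDN 1721426
    return date.fromordinal(jdn - 1721425)


# The marriage date as written in the record: 1 November 1882, Old Style
print(julian_to_gregorian(1882, 11, 1))  # 1882-11-13
```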

The record stated that Anna was the daughter of Antoni Słoński and Marianna Kowalska, and her age at the time of her marriage, 22, suggested a birth year circa 1860. I checked my family tree, and there she was, quietly sitting there the whole time, waiting to be rediscovered. Many years ago, I had added Anna to my family tree when I found her birth record, but I had never gone further with seeking a marriage record for her, or birth records for her children. Anna was born on 14 July 1860,6 and she was in my tree because her father, Antoni, was the son of Bonawentura Słoński and his second wife, Marianna Muszyńska, as evidenced by both Antoni’s birth record7 and the record of his marriage to Marianna Kowalska.8 But wait, there’s more! Bonawentura Słoński was the brother of my great-great-great-great-grandmother, Barbara (née Słońska) Dąbrowska. Barbara and Bonawentura were both children of Wojciech Słoński and Marianna Duras,9 and it is they who are the most recent shared ancestors between me and this DNA match, who I can now state is my documented fifth cousin once removed. Wojciech Słoński and Marianna Duras are the genetic and documentary link that connects me to the Bengier family of Steubenville, Ohio.
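
For anyone who wants to check the “fifth cousin once removed” label, the usual rule can be sketched in Python. The function below is a hypothetical helper of my own, not part of any genealogy software. It counts generations down from the shared ancestral couple: I am seven generations below Wojciech and Marianna (Barbara was my 4x-great-grandmother), while Fred is six generations below them (Bonawentura, Antoni, Anna, Konstancja, Fred’s father, Fred).

```python
def cousin_relationship(gen_a, gen_b):
    """Name the relationship between two descendants of one ancestral couple.

    gen_a and gen_b count generations below the couple (child = 1, grandchild = 2, ...).
    """
    degree = min(gen_a, gen_b) - 1    # 1 = first cousins, 2 = second cousins, ...
    removed = abs(gen_a - gen_b)      # generations separating the two people
    if degree < 1:
        raise ValueError("closer than first cousins; not handled in this sketch")
    ordinals = {1: "first", 2: "second", 3: "third", 4: "fourth", 5: "fifth", 6: "sixth"}
    label = ordinals.get(degree, str(degree) + "th") + " cousin"
    if removed == 1:
        label += " once removed"
    elif removed == 2:
        label += " twice removed"
    elif removed > 2:
        label += " " + str(removed) + " times removed"
    return label


print(cousin_relationship(7, 6))  # fifth cousin once removed
```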

I love a happy ending.

© Julie Roberts Szczepankiewicz 2021

Sources:

1 “Deaths and Funerals: Mrs. C.A. Bengier,” The Weirton Daily Times (Weirton, West Virginia), 3 August 1970, p. 2, col. 1; Newspapers (https://www.newspapers.com/ : 8 August 2021).

2 “Social Security Applications and Claims Index, 1936-2007,” database, Ancestry (https://www.ancestry.com : 8 August 2021), Constance Anna Bengier, born 6 April 1889, SSN 268447885.

3 1930 United States Federal Census, Harrison County, Ohio, population schedule, German township, E.D. 34-10, Sheet 7B, dwelling no. 174, family no. 175, Pete Bengier household; database with images, Ancestry (https://www.ancestry.com : 8 August 2021), citing National Archives and Records Administration microfilm publication T626, 2,667 rolls, no specific roll cited.

4 Manifest, SS Cleveland, departing 29 September 1910, p 2226, lines 288 and 289, Konstancia Bengier and Walerya Bengier; imaged as “Hamburg Passenger Lists, 1850-1934,” database with images, Ancestry (https://www.ancestry.com/ : 8 August 2021), citing Staatsarchiv Hamburg; Hamburg, Deutschland; Hamburger Passagierlisten; Volume: 373-7 I, VIII A 1 Band 226; Page: 2222; Microfilm No.: K_1815.

5 “Akta stanu cywilnego Parafii Rzymskokatolickiej Kowalewo-Opactwo” (Kowalewo-Opactwo, Słupca, Wielkopolskie, Poland), Ksiega urodzen, malzenstw, zgonów, 1882, marriages, no. 12, Stanisław Kujawa and Anna Słońska; digital image, Szukajwarchiwach (https://www.szukajwarchiwach.gov.pl/ : 8 August 2021), Sygnatura 54/771/0/-/71, scan 27 of 37.

6 “Akta stanu cywilnego Parafii Rzymskokatolickiej Kowalewo-Opactwo” (Kowalewo-Opactwo, Słupca, Wielkopolskie, Poland), Ksiega urodzen, malzenstw, zgonów, 1860, births, no. 27, Anna Słonska; digital images, Szukajwarchiwach (https://www.szukajwarchiwach.gov.pl : 8 August 2021), Sygnatura 54/771/0/-/49, scan 6 of 24.

7 “Akta stanu cywilnego Parafii Rzymskokatolickiej Kowalewo-Opactwo” (Kowalewo-Opactwo, Słupca, Wielkopolskie, Poland), Ksiega urodzen, malzenstw, zgonów, 1823, births, no. 16, Antoni Jan Słoński; digital image, Szukajwarchiwach (https://www.szukajwarchiwach.gov.pl/ : 8 August 2021), Sygnatura 54/771/0/-/13, scan 4 of 25.

8 “Akta stanu cywilnego Parafii Rzymskokatolickiej Kowalewo-Opactwo” (Kowalewo-Opactwo, Słupca, Wielkopolskie, Poland), Ksiega urodzen, malzenstw, zgonów, 1845, marriages, no. 8, Antoni Słoński and Marianna Kowalska; digital image, Szukajwarchiwach (https://www.szukajwarchiwach.gov.pl/ : 8 August 2021), Sygnatura 54/771/0/-/34, scan 17 of 28.

9 “Akta stanu cywilnego Parafii Rzymskokatolickiej Ladek,” (Lądek, Słupca, Wielkopolskie, Poland), Ksiega malzenstw, 1819–1820, 1819, no. 24, Bonawentura Słoński and Jagnieszka Wilczewska; digital images, Szukajwarchiwach (https://www.szukajwarchiwach.gov.pl/ : 8 August 2021), Sygnatura 54/776/0/-/46, scans 13 and 14 of 14; and

“Akta stanu cywilnego Parafii Rzymskokatolickiej Kowalewo-Opactwo (pow. slupecki),” Akta urodzen, malzenstw i zgonów, 1845, deaths, no. 5, Barbara z Slonskich Dabrowska; digital image, Szukajwarchiwach (https://www.szukajwarchiwach.gov.pl/ : 8 August 2021), Sygnatura 54/771/0/-/34, scan 23 of 28.

DNA Testing for the Scientifically Challenged

Autosomal DNA testing has become an increasingly popular tool in everyone’s genealogy toolbox these days, but I’ve noticed that there are many everyday family historians who are still bewildered by their DNA test results and aren’t really sure what to make of them. For many genealogists, high school biology classes are a distant memory, so the language of genetic genealogy is foreign. Comments like, “What’s the point of DNA testing? I already know I’m 100% Polish-American,” remind me of how far we need to go in educating people about the value in looking beyond those ethnicity estimates so that they can really make use of their test results. With all that in mind, I thought it might be helpful to review some of these basic concepts in genetic genealogy and present some strategies for the absolute beginner to use when confronted with a list of autosomal DNA matches. If you’re already comfortable working with your DNA match lists, and you’re looking for a blog post with cutting-edge information written by an acknowledged expert in the field of genetic genealogy, then this post is not for you. But if you’re one of those people who’s scratching his head wondering how all these people could show up in the match list when they’re not in the family tree, then keep reading.

Going Beyond the Ethnicity Estimates

Biogeographical analyses, also known as admixture analyses or “ethnicity estimates,” are a big draw these days, and are the primary motivation for DNA testing for many. Eager to learn whether they should trade in their lederhosen for a kilt, many people pore over their ethnicity breakdowns, and don’t pay much attention to their lists of DNA matches. That’s a shame, because the real value of DNA testing lies in those lists of matches, which offer evidence that will allow you to extend and support your documentary research. The underlying assumption of DNA testing is that the people on your match list are your genetic cousins, whether or not you know at this point how you are related to them. There’s a significant caveat, which we’ll get to in a moment. However, generally speaking, if you match a particular individual to whom you have a known relationship, and if the amount of DNA you share is consistent with the known relationship, it suggests several things:

  1. That the paper trail is correct from you to the most recent common ancestral couple that you share with this DNA match.
  2. That the paper trail is also correct from your DNA match to that same most recent common ancestral couple. 
  3. That the matching segments of DNA shared between you and this person were passed down to each of you from that most recent common ancestral couple.

To illustrate, let’s say that I have a maternal first cousin once removed named Fred. (I do, actually, and I have his permission to use his name in this post.) Fred is the son of my maternal grandmother’s brother, Leon. Fred and I share 544 centimorgans of DNA across 28 segments, according to Ancestry. A centimorgan (cM) is a unit of genetic linkage that is commonly used to express genetic distance, so the more DNA you share with a match in centimorgans, the more closely you’re related. Since 544 cM of DNA is within the range that first cousins once removed can be expected to share, we can say that the DNA evidence supports the documentary evidence. That is, the proposed, documented parentage shown in Figure 1 is also borne out by DNA evidence, so there are no misattributed parentage events in my line back to my great-grandparents, Jan/John Zażycki and Weronika/Veronica Grzesiak, and there are no misattributed parentage events in Fred’s line back to that same couple.
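
To get a feel for what “expected to share” means, here is a rough Python sketch of the theoretical average. The idea is that each generation halves the DNA passed down, and two people descended from the same couple can share through either spouse. The function name is my own, and the total of 6800 cM is an assumption on my part: the figure treated as “100%” differs from one test company to another. Actual sharing varies a great deal around the average because inheritance is random, which is why observed ranges come from empirical tables rather than from a formula like this one.

```python
TOTAL_CM = 6800.0  # ASSUMPTION: approximate total counted as 100%; varies by company


def expected_shared_cm(gen_a, gen_b, total_cm=TOTAL_CM):
    """Theoretical average cM shared by two descendants of one ancestral couple.

    gen_a and gen_b count generations below the couple (child = 1, grandchild = 2, ...).
    """
    fraction = 2 * 0.5 ** (gen_a + gen_b)   # factor of 2: sharing through both spouses
    return fraction * total_cm


# Cousin Fred is a grandchild of Jan and Weronika (2); I am a great-grandchild (3)
print(expected_shared_cm(2, 3))  # 425.0
```

The 544 cM that Fred and I share is above that theoretical average, which is perfectly normal for any individual pair of relatives.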

Figure 1: Relationship chart showing documented relationship between me and cousin Fred.

Misattributed parentage events (also known as non-paternity events, or NPEs) can occur in a family for a number of reasons, such as informal adoption, illegitimacy, marital infidelity, surname change, etc., and they can sometimes come as quite a shock to people who test their DNA and suddenly discover that their lineage isn’t what they thought it was. Similar discoveries can also be made with documentary research, of course, so anyone who is considering DNA testing or genealogy research should be prepared for the possibility of such surprises. However, in the example above, no NPEs were found (whew!), so now we have both genetic and documentary evidence to prove that cousin Fred and I are first cousins once removed.

If we download the raw data from Ancestry and upload it to a site that offers a chromosome browser, such as GEDmatch, we can visualize where each matching segment is located on each chromosome, as shown in Figure 2.

Figure 2: Matching DNA segments (shown in blue) between me and Cousin Fred, courtesy of GEDmatch Genesis. Only data from Chromosomes 1, 2 and 3 are shown here.

Each of those blue segments is presumed to be identical by descent (IBD). That is, Cousin Fred and I each carry those specific DNA sequences because we inherited them from a common ancestor. Based solely on these data, it’s not possible to know which of these segments was inherited from Jan Zażycki and which was from Weronika Grzesiak, but we know they had to come from that ancestral couple. Now let’s say we identify a hypothetical third cousin, Joe. Let’s suppose that we have documentary evidence to prove that Joe descends from Weronika Grzesiak’s brother Tadeusz. Moreover, let’s say that Joe matches Fred and me on Chromosome 2 along that segment shown in blue. If that were the case, we would call it a triangulated segment, and we could state confidently that the common ancestor from whom Fred and I inherited that bit of DNA was Weronika Grzesiak and not Jan Zażycki. 
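
The logic of triangulation can be sketched in Python as a check for one stretch of a chromosome on which all three people match each other. This is a toy illustration: the function names are mine, and the segment positions are made up rather than taken from my actual data with Fred.

```python
def overlap(seg_a, seg_b):
    """Return the overlapping part of two (chromosome, start, end) segments, or None."""
    chrom_a, start_a, end_a = seg_a
    chrom_b, start_b, end_b = seg_b
    if chrom_a != chrom_b:
        return None
    start, end = max(start_a, start_b), min(end_a, end_b)
    return (chrom_a, start, end) if start < end else None


def triangulated(me_fred, me_joe, fred_joe):
    """Return the stretch where all three pairwise matches overlap, or None."""
    first = overlap(me_fred, me_joe)
    return overlap(first, fred_joe) if first else None


# Hypothetical matching segments on Chromosome 2 (positions in millions of base pairs)
me_fred = (2, 40.0, 95.0)
me_joe = (2, 60.0, 110.0)
fred_joe = (2, 55.0, 100.0)
print(triangulated(me_fred, me_joe, fred_joe))  # (2, 60.0, 95.0)
```

Note that all three pairs have to match on the same stretch. If Joe matched me there but did not match Fred, the segment would not be triangulated.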

Chromosome Mapping with DNA Painter

Analysis like this supplies the foundation for creating chromosome maps like the ones that can be generated quickly and easily at DNA Painter. Each time you use documentary evidence to verify your relationship to one of the genetic cousins from your match list (assuming you also have segment data for the match), you can paint the segment(s) onto your chromosome map. Currently, all of the major test companies except Ancestry offer chromosome browsers and segment data that can be used for chromosome mapping. So if you test with Family Tree DNA, MyHeritage, or 23andMe, you’re good to go. However, if you determine your relationship to a cousin found in your match list at Ancestry, you cannot paint the match onto your chromosome map unless you can persuade that person to download his or her raw data from Ancestry and upload to Family Tree DNA, MyHeritage, or GEDmatch Genesis. (23andMe does not currently accept uploads from other companies.) So although it’s intellectually satisfying to document your relationship to a DNA match found on Ancestry, the lack of segment data is a serious drawback, and these matches are useless for chromosome painting. My current map is shown in Figure 3.

Figure 3: My chromosome map, generated by DNA Painter.

If you look closely at the map, you see that each chromosome is represented by two bars that appear next to the chromosome number on the left. The upper bar is lightly shaded in blue and represents the copy of that chromosome which I inherited from my father. The lower bar is lightly shaded in pink, and represents the copy of that chromosome which I inherited from my mother. Superimposed on those base colors are darker-colored segments which are defined in the key on the right. For example, there’s a dark pink color that indicates DNA I inherited from my great-grandparents, John Zażycki and Veronica Grzesiak. I know I inherited this DNA from them because all of those dark pink segments represent DNA shared between me and my late grandmother, their daughter. I tested Grandma before she passed (thank you, Grandma!), and these are the segments where she and I matched. This is important information, because it implies that the segments of my maternal (light pink) chromosomes that are not shaded in dark pink must have been inherited from my maternal grandfather. The entire light pink chromosome came from my Mom, and all of her DNA came from either her mother or her father. So if I know from empirical evidence which segments came from her mom, I know by deduction which segments came from her dad.
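
That deduction amounts to finding the gaps between the painted segments. Here is a small Python sketch of the idea; the function name and the segment positions are hypothetical, and it assumes that the match to Grandma picked up every segment I actually inherited from her.

```python
def uncovered(chrom_start, chrom_end, segments):
    """Return the stretches of [chrom_start, chrom_end] not covered by any segment."""
    gaps = []
    cursor = chrom_start
    for start, end in sorted(segments):
        if start > cursor:
            gaps.append((cursor, start))   # a gap before this segment begins
        cursor = max(cursor, end)
    if cursor < chrom_end:
        gaps.append((cursor, chrom_end))   # a gap after the last segment
    return gaps


# Hypothetical dark-pink (Grandma) segments on one maternal chromosome
grandma_segments = [(20, 75), (110, 160)]
print(uncovered(0, 190, grandma_segments))  # [(0, 20), (75, 110), (160, 190)]
```

Everything returned by the function would be attributed to my maternal grandfather, by elimination.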

Those dark-pink segments inherited from Grandma can be further refined, since all of her DNA was ultimately inherited from her mother’s ancestors and her father’s ancestors. You’ll notice that chromosomes 1, 4, and 13 show red bars superimposed on that dark pink. These red bars indicate DNA segments that I inherited from Grandma’s great-great-grandparents, Maciej Dąbrowski and Barbara Słońska. Maciej was born about 1775, and Barbara was born circa 1781, and I find it utterly amazing and fascinating that I can pinpoint at least some of the bits of my own DNA that were inherited from one or the other of them. Figure 4 shows a close-up of a portion of my chromosome map, where the red bar indicating DNA inherited from Maciej and Barbara is especially visible on Chromosome 4.

Figure 4: Closer view of my chromosome map showing red segment on maternal Chromosome 4, corresponding to DNA inherited from 4x-great-grandparents Maciej Dąbrowski and Barbara Słońska, indicated by black arrow.

DNA Painter offers the additional option of a closer look at each individual chromosome. If we focus on Chromosome 4, we can see the breakdown of Grandma’s dark-pink segments as I’ve been able to map them to date (Figure 5).

Figure 5: Expanded view of Chromosome 4 showing underlying ancestral contributions to each dark-pink segment inherited from Grandma.

I’ve removed the names of my living DNA matches to protect their privacy. However, each of those red bars represents a match to a 5th cousin who is a documented descendant of Maciej Dąbrowski and Barbara Słońska. The orange bar represents a match to a second cousin once removed (2C1R) who is a documented descendant of Maciej and Barbara’s great-granddaughter, Józefa Grzesiak (my Grandma’s aunt). This means that the segment of DNA which Grandma inherited (pink bar) which overlaps with the segment of DNA inherited by my 2C1R (orange bar) came from either Maciej Dąbrowski or his wife Barbara, and was passed down to at least two of their great-granddaughters (my great-grandmother, Weronika Grzesiak, and her sister, Józefa Grzesiak), who ultimately passed it down to me and my 2C1R. There may be other descendants who share this bit as well, who haven’t yet tested their DNA.

Ultimately, this bit of DNA, or any of the other bits of Grzesiak DNA carried by documented Grzesiak descendants, might someday be the key to identifying unknown cousins from Poland. Weronika and Józefa had at least one sister, Konstancja, who remained in Poland, married, and had at least two children whom I’ve identified through documentary research. Thanks to a fortuitous marginal note on a baptismal record, I know that one of these children married in Lower Silesia in 1927. Unfortunately, it’s not currently possible for me to know if there were any children from that marriage, because Polish privacy laws protect birth records for a period of 100 years. However, if a descendant from that marriage were to test his or her DNA, it’s quite likely that he or she would show up as a match to me or one of those other Grzesiak descendants. With any luck, that hypothetical cousin might be interested in collaborating to confirm the match through documentary research. As next of kin, he or she would be permitted by Polish law to request the recent birth, marriage or death records to which I have no access.

These segment data also illustrate how matches to more-distant cousins can refine our understanding of matches to closer relatives. My match to Grandma tells me that the DNA underlying that pink bar in the middle of maternal Chromosome 4 comes from one of her parents, John Zażycki or Veronica Grzesiak, but it doesn’t tell me which one. My match to my 2C1R tells me that the subset of that Zażycki/Grzesiak DNA, underlying the orange bar, comes from Veronica Grzesiak because I’m related to that cousin through the Grzesiaks and not through the Zażyckis. This suggests that the DNA on either side of that segment, represented by the pink tips that extend past the orange on the left and the right, might have been inherited from John Zażycki. However, it’s impossible to know that definitively at this point, because some future DNA match might prove me wrong.

If I only had data from Grandma and that 2C1R, I would know that the DNA segment represented by the overlap between the orange and the pink bars had to come from either Józef Grzesiak or his wife, Marianna Krawczyńska, but I would not know which one contributed it. However, thanks to those DNA matches to my fifth cousins (a set of siblings), I know that the DNA segment represented by the overlap in pink, orange and red bars must have been inherited from Józef Grzesiak and not Marianna Krawczyńska, because those fifth cousins are related to me through Józef Grzesiak’s grandparents, Maciej Dąbrowski and Barbara Słońska, and not through the Krawczyński side. The more DNA matches you can identify, the deeper you can drill down into your DNA, because every bit of DNA in your body, no matter how small, had to come from one ancestor or another. Theoretically, you should be able to go through your list of DNA matches and identify the ancestors responsible for passing along even the tiniest fragments of DNA shared between you and a match, right?

IBD or…Not?

Unfortunately, it’s not quite that simple. It may not be possible to determine your relationship to every single one of your DNA matches. It’s not a perfect world and I don’t know anyone who has his family tree traced back to 6x- or 7x-great-grandparents on every single line. Moreover, there’s always the possibility of an NPE or two (or more!) in each person’s tree, which would throw a monkey wrench into the analysis. Furthermore, some of the DNA matches who show up in our match list may not be related to us at all through common descent in the genealogical time frame.  This is that caveat I mentioned earlier, and it’s true regardless of the company you test with. Although DNA testing is predicated on the assumption that your matches share common ancestry with you due to inherited DNA segments that are identical by descent (IBD), not every DNA segment that is identified as a match by the test company’s algorithm is IBD. What else could they be?

Any DNA match that is not IBD is sometimes described as IBS, “Identical by State.” However, IBS is something of a catch-all term, because it encompasses matches that are Identical by Population (IBP), as well as Identical by Chance (IBC). Let’s take a closer look at these two possibilities. There are some segments of DNA that you will share with people just because your ancestors and their ancestors came from the same endogamous population, meaning a community in which intermarriage between distant (or not-so-distant) cousins was common. These are typically small segments of DNA (<10 cM) that will not be possible to assign to a particular ancestor within the genealogical time frame—that is, within the time frame in which it’s possible to find documentary evidence to confirm the relationship. Such segments are often referred to as Identical by Population (IBP). The other possibility is that the DNA segment identified as a match by the test company is a false positive, also known as a pseudosegment. To understand how this can happen, we need to take a closer look at the methodology behind DNA testing.

The Nuts and Bolts of Autosomal DNA Testing

Autosomal DNA testing focuses on the tiny differences in our genetic makeup that make us unique. Most of our genetic code is identical, of course, but there are places in the human genome where slightly different forms of the same gene can exist. These different forms of the same gene are called alleles. DNA is made up of chemical units called nucleotides, and each nucleotide in the DNA is referred to by a letter (A = adenine, T = thymine, G = guanine, C = cytosine). When one letter is substituted for another at a particular place in the DNA sequence, it’s called a Single Nucleotide Polymorphism, or SNP (pronounced “snip”). There are 4-5 million SNPs in the human genome, and each of the DNA test companies samples roughly 630,000 to 700,000 of them.1 Figure 6 shows an extract of my raw DNA data file (called a genotype) as downloaded from Ancestry.

Figure 6: Extract from my genotype from Ancestry showing SNPs on Chromosome 2.


The raw data file includes some additional columns which I’ve omitted, and I’ve obscured the data in the column that identifies the precise position on Chromosome 2 where these SNPs are located. The letters to the right of the position column indicate the nucleotide found at that position on each copy of my Chromosome 2. Note also that only half the DNA is shown here. If you remember from high school biology class, DNA exists as paired strands, so every time there’s an A, it’s paired with a T, and every C is paired with a G. However, this report only provides information on one strand from each parent.

The sequence of the data looks nice and neat, and one might assume that the left column represents data from maternal alleles while the right column represents data from paternal alleles. However, the reality is that the test cannot distinguish between maternal and paternal alleles at any given position. The data in the genotype are intermixed, and therein lies the problem. Although all of the test companies use algorithms which can successfully sort out the data and identify matching segments of DNA between individuals, the accuracy of the matching algorithms decreases significantly when they attempt to identify smaller segments of DNA as matches. The result is that a large percentage of small “matching” segments (less than 7 cM) reported by the test companies are not IBD; they’re Identical by Chance (IBC), or false positives. Roberta Estes offers a more detailed discussion of these types of matching (IBD, IBS, IBC, and IBP),2 and if you really want to delve into the nitty-gritty, you can read Ancestry DNA’s Matching White Paper,3 which explains how their matching algorithm works in technical terms.
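For readers who like to see the idea in code, here is a toy Python sketch of how intermixed (unphased) data can produce a chance match. It is not any test company's actual algorithm. The six-SNP sequences are invented, the "stranger" is assumed to carry the same letter on both copies at every position, and the matching rule (two people "match" at a SNP if they share at least one letter there) is a simplifying assumption for illustration only.

```python
def longest_run(flags):
    """Return the length of the longest unbroken run of True values."""
    longest = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        longest = max(longest, current)
    return longest


# Invented data: the alleles I inherited from each parent at six SNPs.
maternal = "ACGTTC"
paternal = "GTACGA"

# Invented stranger, assumed homozygous, so one string describes both copies.
stranger = "ATGCTA"

# The raw data file only reports the pair of letters at each SNP,
# without saying which letter came from which parent.
my_genotype = [frozenset(pair) for pair in zip(maternal, paternal)]

# Unphased comparison: does the stranger's letter appear in my pair at all?
unphased_run = longest_run(
    allele in genotype for allele, genotype in zip(stranger, my_genotype)
)

# Phased comparison: does the stranger match one parental copy consistently?
maternal_run = longest_run(a == b for a, b in zip(stranger, maternal))
paternal_run = longest_run(a == b for a, b in zip(stranger, paternal))

print(unphased_run, maternal_run, paternal_run)  # prints: 6 1 1
```

In this made-up case the stranger appears to match me across all six SNPs, yet never matches more than one SNP in a row on either the copy from my mother or the copy from my father. The apparent match is stitched together by zigzagging between the two copies, which is the essence of an IBC pseudosegment.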

The Big Problem of Small Segments

So how big a problem is this? Genetic genealogist Tim Janzen estimates that there is only a 5% likelihood that a shared segment of 6-12 cM indicates a common ancestor within the last 6 generations for you and your DNA match.4 His full table appears in the ISOGG Wiki article “Identical by descent.” That same article states that “False positive matching rates of between 12% and 23% have been reported for Family Finder data [Family Finder is the autosomal DNA test offered by Family Tree DNA], and up to 34% at Ancestry using their current algorithm.”5 Yikes! So how can we know if a match is real or not? One possibility is to test not only yourself, but both your parents. Since all your DNA must come from either one parent or the other, any DNA match who matches you, but who does not also match one of your parents, cannot be your genetic relative. If both parents aren’t available for testing, the safest thing to do is to avoid basing genealogical conclusions on evidence from small segments. Consider restricting your analysis to segments larger than 10 cM. This is good advice even if you do have phased data (that is, data which have been compared to both your mother’s data and your father’s data using a tool such as the Phased Data Generator, available as a Tier 1 utility at GEDmatch Genesis).
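The parent test described above boils down to a simple comparison of match lists. Here is a minimal Python sketch of that logic, assuming you have somehow compiled the names of your own matches and of each parent's matches. The function name and the match names are hypothetical, and none of the test companies provides a function like this.

```python
def classify_matches(my_matches, mother_matches, father_matches):
    """Label each of my matches by which tested parent also matches them."""
    labels = {}
    for name in sorted(my_matches):
        sides = []
        if name in mother_matches:
            sides.append("maternal")
        if name in father_matches:
            sides.append("paternal")
        # A match shared with neither parent cannot be a genetic relative.
        labels[name] = " and ".join(sides) if sides else "likely false positive"
    return labels


# Hypothetical match lists for illustration only.
mine = {"Match A", "Match B", "Match C"}
mom = {"Match A", "Match D"}
dad = {"Match B"}

for name, label in classify_matches(mine, mom, dad).items():
    print(name, "->", label)
```

Here Match C would be flagged, since that person matches me but neither of my parents. The sketch only works when both parents have tested; with one parent missing, a match that fails to appear on the tested parent's list might simply belong to the other side.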

To illustrate the problems with small segments, consider the following example. Figure 7 shows a 9 cM segment on Chromosome 22 which is shared by a DNA match, “Czesław C.” along with my mother (EZR), my grandmother (Helen Zielinski), my sister (AW), and me. 

Figure 7: Matching segment (shown in blue) shared by Czesław C., my mother, my grandmother, my sister, and me, courtesy of GEDmatch Genesis.

The segment is clearly IBD, identical by descent, because it was passed from Grandma to mom to my sister and me. However, thorough comparison of Czesław’s genealogy and Grandma’s offers no good clues regarding common surnames or places of origin. At best, this segment could be IBP, identical by population, since Grandma’s documented ancestry was entirely Polish and so was Czesław’s. However, I had the opportunity to discuss this example with genetic genealogist Blaine Bettinger over the summer, and he pointed out that the segment is still untrustworthy. Even though it’s IBD on my side, it’s possible that it’s still IBC, identical by chance, on Czesław’s side, and therefore a false positive. Of course, DNA evidence is always just one piece of the puzzle. If further documentary research turns up evidence of a shared surname or common place of origin between Grandma’s ancestors and Czesław’s, we might want to reevaluate this segment in that light. However, at present there’s no reason to believe there is any connection at all between my family and Czesław’s, so an exhaustive effort to seek documentary evidence is unwarranted.

Hopefully this discussion has helped at least a little bit with demystifying some of the concepts and terms used in genetic genealogy discussions, and explaining why autosomal DNA testing is such a powerful research tool. There are so many great resources out there to help educate budding genetic genealogists, including the list of some of my favorite blogs and Facebook groups included below, and with just a little effort, you, too, can grow comfortable with looking beyond your ethnicity estimates and incorporating DNA evidence into your research methodology. In my next post, I’ll offer some specific suggestions for working with your DNA match list at Ancestry so you can make the most of the information that’s provided there. Happy researching!

Sources:

1 Tim Janzen, “Autosomal DNA Testing Comparison Chart,” International Society of Genetic Genealogy Wiki, (https://isogg.org/wiki : 14 January 2019), licensed under CC BY-NC-SA 3.0.

2 Roberta Estes, “Concepts – Identical by…. Descent, State, Population, and Chance,” DNAeXplained – Genetic Genealogy, posted 10 March 2016 (https://dna-explained.com : 14 January 2019).

3 Catherine A. Ball, Matthew J. Barber, et al., “Ancestry DNA Matching White Paper,” AncestryDNA, (https://www.ancestry.com/dna : 14 January 2019).

4 Tim Janzen, table relating “Length of Shared Segment” to “Likelihood You and Your Match Share a Common Ancestor Within 6 Generations,” “Identical by descent,” International Society of Genetic Genealogy Wiki, (https://isogg.org/wiki : 14 January 2019), licensed under CC BY-NC-SA 3.0.

5 “Identical by descent,” International Society of Genetic Genealogy Wiki, (https://isogg.org/wiki : 14 January 2019), licensed under CC BY-NC-SA 3.0.

For further reading:

The ISOGG Wiki (online encyclopedia of genetic genealogy, hosted by the International Society of Genetic Genealogy, or ISOGG) has articles on pretty much any topic of interest in the field of genetic genealogy and is highly recommended. 

Blaine Bettinger’s blog, The Genetic Genealogist.

Kitty Cooper’s blog, Musings on Genealogy, Genetics and Gardening.

Roberta Estes’ blog, DNAeXplained.

Leah Larkin’s blog, The DNA Geek.

CeCe Moore’s blog, Your Genetic Genealogist.

Of interest to Polish-speakers is Eryk Jan Grzeszkowiak’s blog, Genealogia Genetyczna.

An even more comprehensive listing of popular genealogy blogs is found here

In addition to these blogs, some of my favorite genetic genealogy Facebook groups are Genetic Genealogy Tips and Techniques, DNA Detectives, GEDmatch.com User Group, and AncestryDNA Matching. Be sure to also check Katherine R. Willson’s index of genealogy-related Facebook groups. At present, the list includes several pages of Facebook groups, although not all are focused on autosomal DNA testing.

© Julie Roberts Szczepankiewicz 2019

A Tale of Two Zagóróws

For the past two weeks I’ve been on a hiatus from genealogy due to a family health crisis. Today, I’m celebrating both the end of that crisis, and a new DNA match. The DNA match isn’t that new, actually, but I think I’ve figured out just how my new cousins and I are related.

The story began last August, when I wrote to some new matches that appeared in my list at Ancestry.  The matches were siblings, and Ancestry predicted with high confidence that my match to both of them was in the 4th-6th cousins range, spanning 30 centimorgans (cM) across 2 chromosomes.  Both of my matches responded to my messages and suggested that I get in touch with their sister, Carol, who had not yet tested her DNA but who was the more avid family historian in the family. As can happen with all of us, life can get in the way of genealogy research, so I didn’t hear from Carol until a few days ago, when we began comparing notes to see if we could determine how we might be related.

Carol told me that her family had roots in Prussian, Russian and Austrian Poland, which suggested a match on my mom’s Polish side. This was supported by the fact that her siblings matched me, but not my Dad’s sister. However, there was also no match between Carol’s siblings and either my mom’s maternal first cousin, or my third cousin on my mom’s maternal side. Although there were no surnames in Carol’s family tree that jumped out at me, I noted with interest that her father’s paternal line was from Zagórów. Unfortunately, this appeared to be a red herring: although I, too, had family from Zagórów, my ancestors were from Zagórów in Słupca County, Wielkopolskie province, while Carol’s tree stated that her ancestors were from Zagórów in Limanowa County, Małopolskie province, nearly 300 miles away.

However, as Carol and I messaged back and forth, she commented that her father had cousins living in Poland in Konin and Poznań, both of which are located in Wielkopolskie province. Moreover, she mentioned that she had found documents for her family at the Słupca Genealogy site, a fantastic resource which contains indexed vital records specifically from Słupca and Kalisz Counties in Wielkopolskie province, but not from anywhere else in Poland. Finally, she mentioned that the name of the church that her father’s family attended in Zagórów was Sts. Peter and Paul the Apostles, which is definitely the name of the parish for the Zagórów in Wielkopolskie province, and not the one in Małopolskie province. By this point the evidence was clear: Carol’s family was from the same Zagórów that my ancestors were from, in Wielkopolskie province. Confusing two locations with the same name is a common error for newcomers to Polish genealogy, and it makes a big difference.

Having cleared up that misconception, the game was now afoot. A common point of geography would be a logical place to begin looking for our connection. I took a closer look at her family tree, paying attention to the surnames that were from Zagórów. It’s been a while since I did any research on my Wielkopolskie lines, and by “a while,” I mean about a decade, so I was a little surprised to find that the answer had been staring me in the face since last August:  Celia Przystańska.

According to her family tree, Carol’s paternal grandparents were Jan Myśliński and Celia Przystańska, who was born about 1870 in Zagórów. I had forgotten that I had the Przystański surname in my own family tree, but lo and behold, my tree includes one Cecylia Przystańska, born 1863 in Zagórów! Cecylia was the daughter of Marcin Przystański and Katarzyna Tuzik. Here’s her birth record (Figure 1):

Figure 1: Birth record from Zagórów for Cecylia Przystańska, 1863.1

The record is in Polish and reads,

“#278, Zagórów. This happened in Zagórów on the twenty-second day of November in the year one thousand eight hundred sixty-three at four o’clock in the afternoon.  He appeared, Marcin Przystański, shoemaker residing in Zagórów, having twenty-four years of age, in the presence of Walenty Łukomski, carpenter, age thirty-eight, and Ignacy Michalski, glazier, age twenty-seven, residents of Zagórów, and showed us a child of the female sex, born in Zagórów on the fourteenth day of the current month and year at four o’clock before day of his wife, Katarzyna née Tuzik, age twenty. To this child at Holy Baptism, performed today, was given the name Cecylia, and her godparents were Walenty Łukomski and Balbina Michalska. This document was read to the declarant and witnesses, who are illiterate, and was signed. [signed] Fr. Mikołaj Wadowski, pastor”

Katarzyna Tuzik was married to Marcin Przystański in 1862 in the nearby village of Kowalewo-Opactwo.  Their marriage record is also found online (gotta love Szukajwarchiwach!) and describes the bride as, “Miss Katarzyna Tuzik, having twenty years of age, daughter of Michał and the late Maryanna; born in Wierzbno and living in that same place with her father….” Although Maryanna’s maiden name is not mentioned here, there is substantial evidence available which indicates that she was Marianna Agata Dąbrowska, daughter of Maciej Dąbrowski and Barbara Słońska. This is where the DNA match comes in — Maciej and Barbara were my own great-great-great-great-grandparents. I’m descended from their daughter, Jadwiga Anna, who married Stanisław Grzesiak.

Here’s the relationship chart (Figure 2), which demonstrates that Carol and I are 5th cousins (her maiden surname is used with permission).

Figure 2:  Relationship chart showing relationship between me and cousin Carol.


I’ve discovered that these charts can be a little confusing to the uninitiated.  The couple at the top are our common ancestors, Maciej and Barbara Dąbrowski, but after that, the chart shows our lines of descent, not married couples.  Thus, Carol descends from Maciej and Barbara’s daughter, Marianna Agata, whereas I descend from their daughter Jadwiga Anna.  Marianna Agata married Michał Tuzik (not shown in the chart) and their daughter, Katarzyna Tuzik, carries on the line of descent on Carol’s side. On my side, Jadwiga’s husband Stanisław Grzesiak is not shown, but their son Józef Grzesiak carries on the line of descent. The last generation shown on this chart is my Mom and Carol’s late father — Carol and I would run onto a second page of the chart, but I think the general idea is clear.

So, this is a promising lead to the possible connection between Carol and me.  A couple things still need to be ironed out, of course. We don’t yet have the marriage record for Cecylia Przystańska and Jan Myśliński, which is necessary to verify Cecylia’s parents’ names. However, the marriage has been indexed at Słupca Genealogy, (Zagórów, 1886, #42), and although records from this year are not available online, they’re on microfilm from the Family History Library. If the marriage record shows that Cecylia’s parents were, in fact, Marcin Przystański and Katarzyna Tuzik, then the documentary evidence would fit nicely with the DNA evidence.

None of my new cousins have uploaded their DNA to GEDmatch yet, so it’s unfortunately impossible to get a good sense of which chromosomes and what locations are involved in the match. Moreover, without an upload to GEDmatch, I can’t compare their DNA to that of my late grandmother, whom I tested with FTDNA and not Ancestry. That will be a key comparison to make, because Carol’s siblings, Grandma, and I will all have to share some overlap in the matching regions. It’s not possible for me to match these cousins according to this pedigree if they do not also match Grandma, because she must be the source of my matching DNA.

The amount of shared DNA itself, as reported by Ancestry, is acceptable for this match and would support the predicted relationships.  According to this chart by Blaine Bettinger (Figure 3), 5th cousins share on average 17 cM, with a range of 0-42 cM.  This relationship — 30 cM across two chromosomes — is at the high end of the range, but still plausible.

Figure 3: Shared centimorgans (cM) for documented genealogical relationships. Data compiled by Blaine T. Bettinger.2 “C” = cousin and “R” = times removed, so “1C1R” in this chart means “first cousin once removed.”
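The comparison against the chart can be written as a simple range check. The short Python sketch below uses only the two sets of figures quoted in this post (5th cousins, and 4th cousins once removed); the dictionary and function names are my own, and anyone adapting this would need to fill in the other relationships from the chart itself.

```python
# (average cM, low end of range, high end of range), as quoted in this post
# from the March 2017 Shared cM Project chart.
SHARED_CM = {
    "5C": (17, 0, 42),
    "4C1R": (20, 0, 57),
}


def is_plausible(relationship, shared_cm):
    """Return True if the shared amount falls within the reported range."""
    _average, low, high = SHARED_CM[relationship]
    return low <= shared_cm <= high


print(is_plausible("5C", 30))   # my match with Carol's siblings
print(is_plausible("5C", 0))    # 5th cousins who share no DNA at all
print(is_plausible("5C", 60))   # more than the reported range for 5th cousins
```

A result inside the range only means the proposed relationship is consistent with the amount of shared DNA. Many other relationships overlap the same range, so the documentary evidence still has to do the real work.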

The fact that it’s perfectly possible for 5th cousins to share NO DNA (0 cM) also explains another facet of this puzzle that I mentioned in the beginning. One of the first steps I take when evaluating a DNA match is to check to see what matches exist in common with the new match.  In this case, my Myslinski cousins did NOT match a documented and genetic third cousin to me on our common Grzesiak line, nor did they match my mother’s first cousin on her maternal Zazycki line. How can this be?

Let’s examine each of those situations separately. My cousin Valerie descends from my great-grandmother’s sister, Józefa Grzesiak. Józefa would have inherited half of her DNA from her father, Józef Grzesiak, and a quarter of her DNA from her father’s mother, Jadwiga Dąbrowska.  Jadwiga inherited all her DNA from her own parents, Maciej Dąbrowski and Barbara Słońska, who are the common ancestors in this puzzle. Remember that these numbers are averages — the amount of DNA that one inherits from such distant relatives can vary a bit, due to the genetic recombination that occurs in each generation. Similarly, my great-grandmother, Weronika Grzesiak, would have inherited a quarter of her DNA from Jadwiga Dąbrowska — but although the proportion of inherited DNA is roughly the same as what her sister Józefa would have inherited, the content can be quite different — there’s no guarantee that the same genes from their great-grandparents Maciej and Barbara were inherited by both Weronika and  Józefa.
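The averages mentioned above follow a simple halving rule: on average, half of a person's autosomal DNA comes from each parent, a quarter from each grandparent, and so on. Here is a minimal Python sketch of that arithmetic. The function name is my own, and the results are averages only, since recombination makes the actual amounts vary.

```python
from fractions import Fraction


def expected_fraction(generations_back):
    """Average share of autosomal DNA from one ancestor N generations back."""
    return Fraction(1, 2) ** generations_back


# Józefa's father (1 generation back) and her father's mother, Jadwiga (2 back).
print(expected_fraction(1))  # 1/2
print(expected_fraction(2))  # 1/4

# Maciej or Barbara, my 4x-great-grandparents, are 6 generations back from me.
share = expected_fraction(6)
print(share, f"= {float(share):.2%}")  # 1/64 = 1.56%
```

An expected share of about 1.56% from each of Maciej and Barbara shows how little DNA is in play at this distance, and why it is unsurprising when some documented cousins on the same line share none of it.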

So it’s perfectly possible for the same bit of DNA to have been passed down from common ancestors Maciej and Barbara to me and to cousin Carol, but not to cousin Valerie. (At this point we don’t know which one of my 4x-great-grandparents, Maciej or Barbara, contributed the matching segment that is carried by me and by my Myslinski cousins.) Similarly, it’s possible for me to have inherited this bit through my maternal grandmother, even though my mother’s maternal cousin did not inherit it. Mom’s cousin, Fred, is 4th cousin once removed to cousin Carol. According to the above chart, fourth cousins once removed share an average of 20 cM, with a range from 0-57 cM. So it’s possible that Grandma inherited that crucial bit of DNA from Maciej or Barbara that her brother (Fred’s father) did not inherit. Therefore she was able to pass it on to me, resulting in a match between me and Carol that is not shared by Fred.

All of this demonstrates the fact that DNA evidence can support a documented relationship, but when it comes to ancestors as far back as this, a lack of DNA evidence cannot disprove a documented relationship. It’s actually quite remarkable to me to think that the same tiny bit of DNA was passed down from parents Maciej and Barbara to both of their daughters (Jadwiga and Marianna) who in turn managed to pass that bit down through several additional generations, so that cousin Carol and I show up as matches at all. Hopefully this helps to illustrate what a powerful weapon DNA testing can be in your arsenal of genealogy techniques.  If you have any recent discoveries that have come about through DNA testing, please let me know about them in the comments — I’d love to read your stories!  Happy researching!

Sources:

1 Akta stanu cywilnego Parafii Rzymskokatolickiej Zagórów (pow. slupecki), Narodowe Archiwum Cyfrowe, Szukajwarchiwach, 1863, births, #278, record for Cecylia Przystanska, accessed on 22 March 2017.

2 SharedcMProject20March2017.png, by Blaine T. Bettinger, is licensed under C.C. BY 4.0.

© Julie Roberts Szczepankiewicz 2017