Notes on R-Z18 in Steppe Ancestry Pre-print

These are my notes on the pre-print “Steppe Ancestry in western Eurasia and the spread of the Germanic Languages.” It hasn’t been peer-reviewed. This is, at its heart, an autosomal study, but it does contain some Y-DNA information. The authors are trying to identify population movements (along with language movements), using ancient DNA to focus on the spread of Germanic languages. Because of that focus, there are quite a few R-U106 and R-Z18 results. Roughly counting in the spreadsheet of results, there were 87 R-U106 samples and 30 R-Z18 samples.

Below is the map of samples in this paper. Dark green dots were newly generated in this study. Light blue-ish dots are from other studies.

Map of Europe showing where samples were found, from Scandinavia south to Italy and Spain, and from Ireland east to Eastern Europe and the Steppe.

Y-DNA Depth

The study is attempting to form clusters of similar genetic samples and the Y-DNA depth they report speaks to that. The Y Haplogroups are relatively far up the tree, I assume because they want major groupings, not a bunch of individuals.

This study includes samples from other studies as well. For instance, it includes our Angle from the Wash, Hatherdene 5. We know that he is at least down to R-ZP121, many generations below R-Z19 under R-CTS12023, but this study lists him as R-Z19. I’m sure many of the samples from this study are actually well below R-Z19, but individual resolution in Y DNA is not this group’s aim. They’re looking for clusters of genetically similar people.
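To make the "reported high, known lower" idea concrete, here is a minimal sketch. It assumes a simplified single chain of SNPs taken from these notes, with all intermediate branches left out; the `LINEAGE` list and the `is_consistent` helper are hypothetical names for illustration, not anything from the study or from Family Tree DNA.

```python
# Simplified chain taken from these notes: each entry is downstream of the
# one before it. Intermediate branches are deliberately omitted (assumption).
LINEAGE = ["R-U106", "R-Z19", "R-CTS12023", "R-ZP121"]


def is_consistent(reported: str, known: str) -> bool:
    """Return True if `reported` is the same as, or upstream of, `known`."""
    return LINEAGE.index(reported) <= LINEAGE.index(known)


# Hatherdene 5: the study reports R-Z19, other testing places him at R-ZP121.
print(is_consistent("R-Z19", "R-ZP121"))
# The reverse would be a conflict: a sample reported at R-ZP121 cannot be
# "only" R-Z19.
print(is_consistent("R-ZP121", "R-Z19"))
```

A high-level call like R-Z19 does not contradict a deeper call like R-ZP121; it just stops earlier on the same path.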

I’m using the supplemental spreadsheets from the study and the assigned haplogroups in the R-U106 ancient DNA spreadsheet for reference.

Genetic Clusters of Note

In the Bronze Age section of the study, they identify three clusters. I’ll let them speak for themselves:

“In order to identify whether migrations had occurred within Northern Europe, understanding the substructure within the Bronze Age populations of this region was necessary. We therefore reclustered all ancient samples older than 2800 BP, to remove the impact of later admixture between structured populations present in the Bronze Age (Supplementary Note 6.4.2, Supplementary Table x). Within Scandinavia, three clusters are apparent (Extended Data Figure 4): 1) an early Scandinavian cluster, including the oldest Swedish (Battle Axe Culture) and Danish samples and almost all Norwegians, 2) a later ‘Southern Scandinavian’ cluster restricted to Denmark and the southern tip of Sweden, and 3) a second later ‘Eastern Scandinavian’ cluster, spread across Sweden and overlapping with that of the Southern Scandinavia cluster. In all three instances, there is a very close correspondence between Y-haplogroups and the IBD clusters (Extended Data Figure 4A), largely driven by different frequencies of haplogroups I1a-DF29, R1a1a1b1a3a (R1a-Z284) and R1b1a1b1a1a1 (R1b-U106), which are all strongly associated with Scandinavian ancestry”

Y DNA haplogroups found in Early Scandinavian, Southern Scandinavian and Eastern Scandinavian populations.
(McColl et al. p 16-17.)

The dots are colored based on age, lighter being older. Everything is dated BP (before present), so 4500 BP is roughly 2550 BCE, taking “present” as 1950 CE. Early Scandinavians tend to be R1a, Southern Scandinavians tend to be R1b, Eastern Scandinavians tend to be I1.
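As a rough check on BP dates, here is a small sketch. It assumes "present" means 1950 CE, which is the usual radiocarbon convention; the paper may define it differently. The function name `bp_to_year` is just an illustration.

```python
def bp_to_year(bp: int, present: int = 1950) -> str:
    """Convert a 'before present' age to an approximate calendar year label.

    Assumes `present` is 1950 CE. This is a rough conversion that ignores
    the missing year zero, so BCE results can be off by one year.
    """
    year = present - bp
    if year > 0:
        return f"{year} CE"
    return f"{-year} BCE"


print(bp_to_year(4500))  # the age used as an example above
print(bp_to_year(2800))  # the cutoff the paper uses for reclustering
```

With the 1950 assumption, 4500 BP comes out at 2550 BCE and the paper's 2800 BP cutoff at 850 BCE.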

There was a genetic group in the spreadsheets that I didn’t see referenced much in the paper: Northern Scandinavians. Several results are listed as 0_1_6_NorthScan, but I didn’t see them used outside of a quick statement about the difference between language, culture, and genetics. There is a lot of material to go through, though, so there may be more in the supplementary notes.

All these listed results are not nicely formatted. They read like a stream of consciousness because I’m pulling them right out of a spreadsheet and dumping them here.


R-Z19 is grouped with R-Z18. It is sometimes listed as YSC0000054. Most people use R-Z18 to designate the group, but this study doesn’t. Below is a screenshot of Family Tree DNA’s tree listing for R-Z18, Z19 and other SNPs that are clustered in a single block. So far, everyone who has one has them all:

R-Z18 block with R-Z19, Z14, Z16, Z368, Z369, etc.

I’m going to list all the base R-Z19 results with some location information and the major group this study placed them in. The formatting is pretty awful. I’ve labeled some of the results at R-CTS12023 and below. There are more results known to be further downstream from R-Z19, but I didn’t correct them all. Most of the CGG results are from this paper, and I’ve taken them at face value (as portrayed in the U106 spreadsheet), except our man from Tjaerby, Denmark, who is R-ZP121 according to Family Tree DNA. Non-CGG results (NEO, HAD, BUK) are from other studies and are usually downstream of R-Z19.

Late Neolithic

  • CGG107465 Lillevasby Denmark_Islands Denmark_LateNeolithic 0_1_2_SouthScan 2194-2026 calBCE
  • CGG105923 Albäcksbacken Maglarp Southern Sweden_LateNeolithic 0_1_2_SouthScan 2200-1700 BCE

Bronze Age

  • CGG106744 Langelands Rørsløkke Mose (Tryggelev) Denmark_Islands Denmark_EarlyBronzeAge 0_1_2_SouthScan 1730-1542 calBCE
  • CGG100212 Kalvehavegaard Denmark_Islands Denmark_EarlyBronzeAge 0_1_2_SouthScan 1608-1430 calBCE
  • CGG106705 Zealand. Tune Karlstrup Denmark_Islands Denmark_BronzeAge 0_1_2_SouthScan 2126-1932 calBCE
  • CGG106708 Karlstrup Denmark_Islands Denmark_BronzeAge 0_1_2_SouthScan 2125-1947 calBCE
  • CGG106706 Karlstrup Denmark_Islands Denmark_BronzeAge 0_1_2_SouthScan 2250-1700 BCE
  • NEO752 Madses Denmark_Islands Denmark_BronzeAge 0_1_2_SouthScan 1814-1483 calBCE
  • NEO946 Hove Denmark_Islands Denmark_BronzeAge 0_1_3_EastScan 1322-967 calBCE

Iron Age, Migration Period, Angles Saxons and Jutes

  • CGG100144 Engeldrup Bro, Melby Denmark_Jutland Denmark_IronAge 0_1_2_SouthScan 500-1 BCE
  • CGG105930 Albäcksbacken Maglarp Sweden Sweden_IronAge 0_1_3_EastScan 1-150 CE
  • CGG106720 Zealand/ Alsted Simonsborg Denmark_Islands NorthernEurope Denmark_IronAge 0_1_3_EastScan 1-200 CE
  • CGG106722 Zealand/ Alsted Simonsborg Denmark_IronAge 0_1_3_EastScan 1-200 CE
  • CGG106730 Simonsborg Denmark_Islands NorthernEurope Denmark_IronAge 0_1_3_EastScan 1-200 CE
  • CGG106810 Mellemholm Denmark_Jutland NorthernEurope Denmark_IronAge 0_1_6_NorthScan 1-200 CE
  • CGG107446 Bøgebjerg Denmark_Islands NorthernEurope Denmark_IronAge 0_1_3_EastScan 1-200 CE
  • CGG107451 Landledgård Denmark_Islands NorthernEurope Denmark_IronAge 0_1_3_EastScan 1-200 CE
  • CGG107494 Mosede Fort Denmark_Islands NorthernEurope Denmark_IronAge 0_1_3_EastScan 1-200 CE
  • CGG106489 Sondrup Østergaard, Ulstrup sogn Denmark_Jutland NorthernEurope Denmark_IronAge 0_1_6_NorthScan 126-227 calCE
  • CGG107399 Denmark Zealand FraugdeCGG107384 Alsted Denmark_Islands NorthernEurope Denmark_IronAge 0_1_3_EastScan 200-400 CE
  • CGG107423 Kirkebjerggaard Denmark_Islands NorthernEurope Denmark_IronAge 0_1_3_EastScan 200-400 CE
  • CGG107013 Engholmen 2 Norway NorthernEurope Norway_IronAge 0_1_6_NorthScan 600-700 CE
  • CGG107037 Enge, Sømna M Norway NorthernEurope Norway_IronAge 0_1_6_NorthScan 550-800 CE
  • BUK009 (R-PH1163) Dover Buckland_Kent England WesternEurope England_Medieval.1240k 0_1_2_SouthScan 475-750 CE
  • HAD005 (R-ZP121) Sk 640 Hatherdene Close_Cambridgeshire England WesternEurope England_Medieval.SG 0_1_2_SouthScan 400-600 CE
  • BUK064 423 Dover Buckland_Kent England WesternEurope England_Medieval.1240k 0_1_2_SouthScan 475-750 CE
  • HVN004 Hven_Mecklenburg-Vorpommern Germany WesternEurope Germany_Medieval.1240k 0_1_2_SouthScan 200-400 CE
  • BUK025 Dover Buckland_Kent England WesternEurope England_Medieval.1240k 0_1_2_SouthScan c475-c750 CE

Vikings, Various Medieval People.

  • CGG101864 Ahlgade_15-17,_Holbæk Denmark_Islands NorthernEurope Denmark_Medieval 0_1_2_SouthScan 1300-1350 CE
  • CGG107579 Kalmargården Denmark_Islands NorthernEurope Denmark_LateVikingEarly Medieval 0_1_6_NorthScan 1040 CE
  • VK168 Oxford England WesternEurope Britain_VikingAge 0_1_3_EastScan 1002 CE
  • CGG100750 (R-ZP121) Tjærby,_Randers Denmark_Jutland NorthernEurope Denmark_Medieval 0_1_2_SouthScan 1000-1300 CE
  • GRO007 Groningen_Groningen Netherlands WesternEurope Netherlands_Medieval.1240k 0_1_2_SouthScan 985-1030 calCE

Of note, the study was missing Tiszapüspöki 18184, who was R-CTS12023 and from Hungary circa 600 CE.


I did find some of these results in the spreadsheet from the paper, but for the most part, I’m referencing the U106 spreadsheet for haplogroups and ages. They’ve done the hard work for me. I didn’t split them into as many groups as straight R-Z19 because there weren’t as many.

  • CGG105928 Albäcksbacken Maglarp Sweden NorthernEurope Sweden_IronAge 0_1_3_EastScan 196BCE-218calCE
  • CGG106728 Simonsborg Denmark_Islands NorthernEurope Denmark_IronAge 0_1_3_EastScan 1-200 CE
  • CGG107495 Mosede Fort Denmark_Islands NorthernEurope Denmark_IronAge 0_1_3_EastScan 1-200 CE
  • CGG107411 Varpelev Denmark_Islands NorthernEurope Denmark_IronAge 0_1_3_EastScan 200-400 CE
  • CGG107015 Føre 1 Norway NorthernEurope Norway_IronAge 0_1_6_NorthScan 300-400 CE
  • CGG107384 Alsted Denmark_Islands NorthernEurope Denmark_IronAge 0_1_3_EastScan 400-550 CE
  • CGG107007 Skongeneshelleren Norway NorthernEurope Norway_IronAge 0_1_6_NorthScan 400-550 CE
  • VK418 Nordland Norway NorthernEurope Norway_IronAge 0_1_6_NorthScan 500s CE
  • SZ4 Szólád Hungary CentralEasternEurope Hungary_Langobard_o1 0_4_3_2_SCEEu 550-570 CE
  • HAD006 (S4031) Hatherdene Close_Cambridgeshire England WesternEurope England_Medieval.SG 0_1_2_SouthScan 415-537 calCE
  • VK170 Balladoole IsleOfMan WesternEurope Britain_VikingAge 0_1_6_NorthScan c950 CE
  • VK449 Dorset England WesternEurope Britain_VikingAge 0_1_6_NorthScan
  • VK259 Dorset England WesternEurope Britain_VikingAge 0_1_6_NorthScan 980-1009 CE


  • CGG019442 Sanddal Denmark_Islands NorthernEurope Denmark_IronAge 0_1_3_EastScan 1-125 CE
  • CGG107489 Mosede Fort Denmark_Islands NorthernEurope Denmark_IronAge 0_1_3_EastScan 1-200 CE
  • CGG019091 SOEL_964_Engbjerg Denmark_Islands NorthernEurope Denmark_IronAge_LateRomanIronAge 0_1_3_EastScan 200-375 CE
  • DUN011 352 Dunum_Lower Saxony Germany WesternEurope Germany_Medieval.1240k 0_1_2_SouthScan 672-773 calCE
  • VK204 Orkney_Newark Scotland WesternEurope Britain_VikingAge 0_1_6_NorthScan 1100s CE
  • VK308 Skara Sweden NorthernEurope Sweden_VikingAge 0_1_6_NorthScan 900-1150 CE
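
Since these lists are raw spreadsheet dumps, a quick way to get something readable out of them is to tally the cluster label on each line. The sketch below is only an illustration: it assumes every cluster label looks like `0_1_2_SouthScan` (a `0`, one or more numbers, then a name, joined by underscores), and it uses a handful of lines copied from the lists above rather than the full spreadsheet.

```python
import re
from collections import Counter
from typing import Optional

# Assumed label shape: "0", one or more "_<number>" parts, then "_<Name>".
CLUSTER = re.compile(r"\b0_\d+(?:_\d+)*_[A-Za-z]+")

# A few sample lines copied from the lists above (not the whole dataset).
lines = [
    "CGG107465 Lillevasby Denmark_Islands Denmark_LateNeolithic 0_1_2_SouthScan 2194-2026 calBCE",
    "NEO946 Hove Denmark_Islands Denmark_BronzeAge 0_1_3_EastScan 1322-967 calBCE",
    "CGG106810 Mellemholm Denmark_Jutland NorthernEurope Denmark_IronAge 0_1_6_NorthScan 1-200 CE",
    "SZ4 Szólád Hungary CentralEasternEurope Hungary_Langobard_o1 0_4_3_2_SCEEu 550-570 CE",
    "GRO007 Groningen_Groningen Netherlands WesternEurope Netherlands_Medieval.1240k 0_1_2_SouthScan 985-1030 calCE",
]


def cluster_of(line: str) -> Optional[str]:
    """Return the first cluster label found on the line, or None."""
    match = CLUSTER.search(line)
    return match.group(0) if match else None


tally = Counter(cluster_of(line) for line in lines)
for label, count in tally.most_common():
    print(label, count)
```

Pasting all of the bullet lines into `lines` would give a per-cluster count for each haplogroup list.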

Some Conclusions on Migration

These are still major groupings of R-Z18. Among the samples I’ve marked as known R-CTS12023+, we have R-ZP121 found among ancient Anglo-Saxons and in medieval Denmark. Modern R-ZP121 testers trend towards Northern Europe. We also have R-PH1163, found anciently among Anglo-Saxons, with modern Scandinavian testers. This study leaves them at R-Z19. Since the study’s own new samples are also left at a high level, we’ll have to wait for further analysis to get better Y DNA resolution. Each CGG sample probably hides a closer relationship to modern testers.

Because of this part of a sentence in the paper: “R1b1a1b1a1a1 (R1b-U106), which are all strongly associated with Scandinavian ancestry (Supplementary Note 6.4.2)” I went looking for Note 6.4.2.

In the Genetics Supplementary Material (media-3.pdf):

“Downstream of R1b1a1b1a (R1b-L11), haplogroup R1b1a1b1a1a1 (R1b-U106) have been previously argued to be related to the expansion of the Germanic languages, due to its high frequency in places where those languages are spoken today (Figure S6). We found most of the individuals of the dataset positive for R1b-U106 to belong to two different downstream sublineages, which have starkly distinct distributions, particularly in the early Iron Age. R1b1a1b1a1a1c (R1b-Z19) is found almost exclusively in Northern Europe (with the only exception being a Langobard from Hungary), and likely represents a local variant of R1b-U106”

(McColl et al. supplemental media-3, p 61)

They conclude that R-Z18 is a local variant of U106 in Northern Europe. They go on to talk about R-Z381 (AKA R-S263 in the paper). I don’t know if they were just excited not to use R-Z18 and R-Z381 or what, but they didn’t. They also pretty consistently take a chromosome that only exists in males and give it female descriptors like “sister”.

“Instead, its sister lineage, R1b1a1b1a1a1b (R1b-S263), is absent in Scandinavia before the Iron Age (Figure S8), where it spreads, likely through an Eastern North Sea source, and becomes dominant in South Scandinavia during the Iron Age, before spreading through Northern Europe. This pattern strongly matches the one seen using autosomes, that detect gene flow back into Scandinavia related to the spread of Germanic languages. Another potential signal of this migration is the increase in frequency of R1b-U106 sister lineage, R1b1a1b1a1a2 (R1b-P312), that has a more continental distribution. and is almost absent in Scandinavia before 2,000 BP.”

(McColl et al. supplemental media-3, p 61)

They offer maps of R-U106 in the study, R-Z18 (Z19) and R-Z381 (S263). First, the R-U106 map is below. Darker red/brown (burnt umber?) is older, and darker blue is newer.

Below is their R-S263 (R-Z381) map. Again, darker red is older, and darker blue is newer. I looked for that dark red dot in Central Europe and couldn’t figure out which sample it was. They assert that there is a migration of R-Z381 into Scandinavia, which they see in autosomal DNA.

The map below portrays R-Z19 (R-Z18) as the local branch of R-U106, which had its oldest sources in Southern Scandinavia before the Iron Age.

I don’t know about their assertions about R-Z381; I don’t spend much time looking at the different clades under it and their migration paths. The oldest R-Z381 record I see in the R-U106 spreadsheet is from the Netherlands. I’m not sure about the dark red dot in their map.

The R-Z19 map seems to fit with other findings. The (now) oldest R-Z18 people are in Denmark and Sweden, but that’s not a big leap from the previous oldest R-Z18 person, who was from Denmark. R-Z18 shows up in migration-era populations like the Langobards (and, you know…I think, maybe the Gepids with our CTS12023 man in Hungary) and the Anglo-Saxons. R-Z18 also shows up in the Wielbark culture. These are all groups that the ancient world, and oftentimes the people themselves, believed spread out from Southern Scandinavia.

Random Thoughts

There are some things in this paper that seem like important omissions, such as samples from other studies that are not included. I’m sure they had to pick and choose.

Their Y DNA nomenclature seems outdated, like they’re working from a really old Y tree (which means they’re looking for really old Y DNA groups).

Some of their placement seems old. R-Z18 and R-Z381 haven’t been sibling groups for a while. Things have gotten a lot more complex. If you change them from siblings to second cousins or cousins once removed, then is it less odd that they were born in different places? If they’re not looking for the groups in between R-Z18 and R-Z381 then their feelings about their conclusions could be skewed. The conclusions can be solid, but the light those conclusions are considered in can be skewed.

I also noticed some odd labeling. In the supplemental material, one group with an R1a long name (R1a1a1b1a3a) was labeled as an R1b group in the text on page 61 and on the map on page 66. It’s then that I have to remember this is not peer-reviewed and it is a pre-print. Everything has to be taken with a grain of salt, and things could change on the road to publication.

What I think is great is that they included Y DNA in their study and thought about it as part of the whole. I also think it’s awesome that they ran a lot of new samples of ancient DNA. Hopefully, they will make their raw data available, and we’ll see a treasure trove of ancient results in the Family Tree DNA ancient connections site.

A Quick added note on the oldest R-Z18 samples that has some relevance to my Wielbark post.

The oldest R-Z18 sample after the McColl study, CGG107465, lived roughly from 2200 to 2000 BCE in Zealand, Denmark, not too far from Kallerup. Here’s a link to the coordinates.

The next oldest samples were further down the shore in Zealand, near Karlstrup. One there may be the oldest with more precise dating, but they are in a similar range. All are listed as Nordic Bronze Age. One in a similar range, but listed as Late Neolithic, comes from Albäcksbacken Maglarp, Skåne, Sweden.

Since McColl has only high-level haplogroups, they’re left at R-Z18. These samples are right on top of the estimated date for the birth of R-Z18, 2200 BCE. So the study may be right that R-Z18 is a local branch of U106 in the area.

Here is a map with those locations:

Map of the Copenhagen and southern Sweden area with points for ancient R-Z18 DNA.

Given the estimated start of CTS12023/R-DF95 around 700 BCE, there are about 17 CGG R-Z19 samples that could contain CTS12023 or downstream SNPs if there were greater resolution on the Y DNA. I suppose there is also the possibility of finding someone upstream and breaking up our long chain of SNPs that is currently listed as CTS12023.
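
For anyone who wants to check that count of 17, here is a minimal sketch. It assumes the CGG samples in the base R-Z19 lists above are the relevant set, uses the latest year of each listed date range, and treats a sample as a candidate if that year falls after 700 BCE. The variable names are only for illustration.

```python
# Estimated start of CTS12023 from the notes; negative years are BCE.
CTS12023_START = -700

# (sample ID, latest year of the date range listed above); negative = BCE.
CGG_Z19 = [
    ("CGG107465", -2026), ("CGG105923", -1700),
    ("CGG106744", -1542), ("CGG100212", -1430),
    ("CGG106705", -1932), ("CGG106708", -1947), ("CGG106706", -1700),
    ("CGG100144", -1), ("CGG105930", 150),
    ("CGG106720", 200), ("CGG106722", 200), ("CGG106730", 200),
    ("CGG106810", 200), ("CGG107446", 200), ("CGG107451", 200),
    ("CGG107494", 200), ("CGG106489", 227),
    ("CGG107399", 400), ("CGG107423", 400),
    ("CGG107013", 700), ("CGG107037", 800),
    ("CGG101864", 1350), ("CGG107579", 1040), ("CGG100750", 1300),
]

# A sample could carry CTS12023 only if it lived after the SNP arose.
candidates = [sid for sid, latest in CGG_Z19 if latest > CTS12023_START]
print(len(candidates))
```

This counts 17 candidates out of the 24 CGG samples listed at base R-Z19. The 17 include CGG100750, which is already known to be R-ZP121.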
