R1a-YP4248 Subclade Project

Michael Cooley

This page is being developed live. (I like living on the edge.) Expect short-lived typos, etc.

The discussion component of this project can be found on Facebook, in a group appropriately entitled the R1a-YP4248 Subclade Project. Although it's open to all YP4248-positive descendants and their families, the project will concentrate on Big Y (or equivalent) test results, as well as on tests found positive for relevant terminal SNPs, as shown below. By tracking the various Y-DNA and genealogy branches, we stand to learn something about the origins of the YP4248 SNP mutation and the routes it took to the British Isles and beyond. To accomplish this, we'll employ the three G's: genealogy, genetics, and geography, as well as a heavy dose of history.

Both Y-STR and Y-SNP results are slowly being collected at FTDNA's R1a-YP4248 Subclade Project. The project surnames are presently Cochran, Cooley, Coombs, Cummings, Gray, Hackett, Hawley, Higdon, Mann, Rankin, Sample, Story, and Whitfield.

In short,

R1a-YP4248 Big Y SNP Table

This is the basic SNP data for the following tree. It can likely be placed in a spreadsheet easily by copy and paste.

 22 November 2021
Kit | Haplogroup path | Ancestor | Test | Novel SNPs
#MI17444 | YP4248>YP4253>A12124 | Elijah Hawley (1781-1860) | Y-700 | FTB7956 FTB7682 FTB7334 FTB6983 FTB8950 FTB6689 FTB6655
#323704 | YP4248>YP4253>A12124 | John Hackett (1746-1808) | Y-500 | A12127 A12126 A12128
#57597 | YP4248>YP4253>YP4491 | John Cooley (c1738-1811) | Y-500 | A7497* A7498
#N3690 | YP4248>YP4253>YP4491 | John Cooley (c1738-1811) | Y-700 | A7411
#160752 | YP4248>YP4253>YP4491 | John Cooley (c1738-1811) | Y-SNP | --
#573208 | YP4248>YP4253>YP4491 | John Cooley (c1738-1811) | Y-SNP | --
#Y-7660 | YP4248>YP4253>YP4491 | John Cooley (c1738-1811) | Y-SNP | --
#558118 | YP4248>YP4253>YP4491 | John Higdon (1657-1723) | Y-700 | FTA84863 FTA84182 YFS514073* Y76109
#651501 | YP4248>YP4253>YP4491 | Lewis Whitfield (1789-1858) | Y-700 | FT140057 A17721
#573208 | YP4248>YP4253>YP4491 | William H Cooley (1797-1877) | Y-700 | --
#520597 | YP4248>YP4253>YP4491 | William Whitfield (1751-1835) | Y-500 | A14496 A14495
#910648 | YP4248>YP4253>YP4491 | William Whitfield (1751-1835) | Y-700 | FT144702*
#343609 | YP4248>YP5007>BY27664 | John Lonan Mann (1855-) | Y-500 | BY149694 Y79767 BY134046 Y75160 Y72774 Y71660 Y70911
#B161280 | YP4248>YP5007>BY27664 | Lancelot Cummings (1908-1960) | Y-500 | BY74965 BY69379 Y98299 Y96328 BY65753 BY226047
#962459 | YP4248>YP5007>BY27664 | William Semple (1735-1785) | Y-700 | FTB61216 FTB60783 FTB59852
#183830 | YP4248>YP5007>BY30798>BY30796 | Christopher Storry (1693-1767) | Y-500 | Y128826
#IN96910 | YP4248>YP5007>BY30798>BY30796>Y108621>FT106724 | Thomas Story (1655-1718) | Y-700 | --
#N23144 | YP4248>YP5007>BY30798>BY30796>Y108621>FT106724 | Thomas Story (1655-1718) | Y-700 | --
#IN57019 | YP4248>YP5007>BY30798>BY30796>Y108621 | Thomas Story (1655-1718) | Y-700 | FT170856 FT168230*
#915713 | YP4248>YP5007>BY30798 | Mathew Coombs (1844-1822) | Y-700 | FT338904 FT338291 FT337526 FT337379 FT336903 FT336598 FT335764 FT335052 FT334720 FT341599*
#552162 | YP4248>YP5007>FTA31692 | David Rankin (-1789) | Y-700 | FTA31466 FTA31313 FTA29998 FTA29885
#149142 | YP4248>YP5007>FTA31692 | Joseph Rankin (1704-1764) | Y-700 | FTA76438 FTA75293 FTA75212 FT290693 FTA74193
#IN89936 | YP4248>YP5007>YP5244>FT407422 | John Tyrie (1905-) | Y-700 | FT407989 FT407449 FT407434 FT407407
#378638 | YP4248>YP5007>YP5244>FT407422>Y64331 | William Cochrane (1806-1889) | Y-700 | FTA26988 FTA26964 FTA26723
#378637 | YP4248>YP5007>YP5244>FT407422>Y64331 | William Cochrane (1806-1889) | Y-700 | FTA50409 FTA54389 Y70195 Y67035
#772584 | YP4248>YP5007>YP5244 | John Cochrane (c1720-1788) | Y-700 | FT390045 MF120379 FT389363 FT388324 FT387882* FT387814* BY116184
#B5637 | YP4248>YP5007>YP5244>YP5242 | Deacon John Cochran (1662-1747) | Y-700 | FTB49459 FTB55647 FTB55589 FTB55341 FTB49825
#171069 | YP4248>YP5007>YP5244>YP5242>FTA22447 | Gitterle | Y-700 | FTB29960 FTB29862 FTB29764 FTB28865 F17858 FTB31291 FTB18996 FT238436
#184037 | YP4248>YP5007>YP5244>YP5242>FTA22447 | Henry A Cochran (1878-1962) | Y-700 | FTA22181 FTA21965 FTA21949 FTA21889 FGC41633
#391280 | YP4248>YP5007>YP5244>YP5242 | John Cochran (c1760-1831+) | Y-700 | Y65968 FT123924* FT12364715965418 FT123352 BY80739 Y72633
#943658 | YP4248>YP5007>YP5244>YP5242 | John Cochrane (c1750s-1839) | Y-700 | FTA20728 FTA52411 BY165580

      * SNP cannot undergo Sanger testing

R1a-YP4248 SNP Tree

A SNP (Single Nucleotide Polymorphism) occurs when there's a genetic change at a single position, for example an A mutated to a C. Each marker listed in the tree below is a SNP, and each one was manifested at the birth of a single man. This doesn't happen every generation but, if we had enough data, they'd all line up one after the other in a timeline. In other words, each of these markers is a direct representative of a specific man born at a specific time and place. For that reason, I like to say that "SNPs are people too." It's not literally true but conjures an image that is reasonably correct. There could well have been a man called Bjorn (or Ugh), but we can refer to him as the "YP4248 Man."

I'm kit number 57975, the second from the left. If we count all the SNPs in my lineage back to Bjorn, including my novel variants, 21 men are represented. Was each man a great-grandfather to the next? Possibly. It would vary from one lineage to the next. But let's say that one SNP pops into the lineage every three generations. This would represent 63 generations, perhaps 1500 years. As new testers come along, there will be further branching and we will be able to make more accurate estimates. In any case, Bjorn certainly lived over a thousand years ago.
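The arithmetic above is easy to replay with different assumptions. What follows is only a back-of-the-envelope sketch: the 21 SNPs and the three generations per SNP come from the paragraph above, while the 24 years per generation is my own assumption (the text says only "perhaps 1500 years"), and `estimate_years` is a made-up helper name.

```python
def estimate_years(snp_count, generations_per_snp, years_per_generation):
    """Return (generations, years) spanned by a chain of SNPs."""
    generations = snp_count * generations_per_snp
    return generations, generations * years_per_generation

# 21 SNPs and 3 generations per SNP come from the text above.
# 24 years per generation is an ASSUMPTION chosen for illustration.
generations, years = estimate_years(21, 3, 24)
print(generations, "generations, about", years, "years")

# How sensitive is the estimate to the assumed SNP rate?
for rate in (2, 3, 4):
    print(rate, "generations per SNP ->", estimate_years(21, rate, 24)[1], "years")
```

Changing the rate from two to four generations per SNP doubles the estimate, which is why more testers and more branching matter.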


Short Tandem Repeats (STRs)

STRs are a big, tricky, and somewhat misunderstood topic. As I understand it, they were chosen over SNPs in the early days for several reasons. The SNP catalog was small. Only very ancient SNPs and their placement had been uncovered. They could not genetically define a lineage over the historical timeframe. Because STRs are long strings of repeating letters, they were quickly identified and had the benefit of being fast mutating. In other words, they would quickly evolve over the course of one or two thousand years, which allowed for the identification of branching, even within the last few hundred years.

STRs consist of short strings of genetic letters that repeat X number of times within a given region. For example, I have 34 repeats of TTTC in the region known as DYS449 (allele position 21). It's likely that close relatives will have the same values.
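To make "34 repeats of TTTC" concrete, here is a minimal sketch that counts back-to-back copies of a motif in a string of letters. The sequence is synthetic, built just for the example; real STR regions can be structured in more complicated ways, and `longest_repeat_run` is a hypothetical helper, not anything a testing company uses.

```python
import re

def longest_repeat_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(match.group(0)) // len(motif) for match in pattern.finditer(sequence)]
    return max(runs, default=0)

# Synthetic region: some flanking letters around 34 copies of TTTC.
toy_region = "GGA" + "TTTC" * 34 + "AAG"
print(longest_repeat_run(toy_region, "TTTC"))
```

The number printed is the kind of value reported for an STR marker: a repeat count, not a letter.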

However, the STR values can go either up or down. The next illustration demonstrates that the modal at that position is 33. Somewhere in my lineage, an ancestor gained an additional repeat of DYS449 while others lost a repeat. This makes STRs unreliable for many purposes. Yes, they will trend in a certain direction as they evolve and patterns will emerge. But you can't count on them with any certainty. Still, individual Y-STR results can be compared to one another and some insight can be gleaned. Just don't bank on it.

The following data was last updated on 17 Sept 2020.

Hybrid Tree

This tree combines the known genealogy with the Y-SNP markers. It includes both Big Y testers and Y-111 testers, most of whom are at the far right labeled TBD.

Genetic Distance over 111 Y-STRs

This table is much the same thing, but only those testers having 111 markers are looked at. Here we do see some patterns. For example, three Cochrans share 18 repeats at allele position 32. It should be noted that the relationship between these is known. They're relatively close and one of them has been tested for the downstream YP5242 haplogroup. Likely, all three are of the same subclade.

But comparing individual STR markers can take us only so far. Genetic Distance (GD) is more representative of how the limbs of the tree hang in relationship to one another. GD is simply the number of differences between two sets of values. Here, each tester is compared to the modal.
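A minimal sketch of that comparison follows. The paragraph above describes GD as the number of differences; some tools instead add up the size of each difference in repeats, and which convention the tables here use isn't stated, so both are shown. The function names and the four-marker profiles are made up for illustration.

```python
def count_distance(a, b):
    """GD as described above: the number of markers whose values differ."""
    if len(a) != len(b):
        raise ValueError("both profiles must cover the same markers")
    return sum(1 for x, y in zip(a, b) if x != y)

def stepwise_distance(a, b):
    """Alternative convention: add up the size of each difference in repeats."""
    if len(a) != len(b):
        raise ValueError("both profiles must cover the same markers")
    return sum(abs(x - y) for x, y in zip(a, b))

# Made-up four-marker profiles, for illustration only.
modal = [13, 25, 16, 33]
tester = [13, 23, 16, 34]

print(count_distance(modal, tester))     # 2 markers differ
print(stepwise_distance(modal, tester))  # 2 + 1 = 3 repeat steps
```

The two conventions agree whenever every difference is a single repeat, and drift apart when a marker has moved by two or more.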

The top line of markers (in gray) is the modal, which represents the most common value found at each marker among the testers (not the average value). Theoretically, the modal reflects the expected values for the Most Recent Common Ancestor (MRCA) of all testers. But it can be accurate only to the degree that we have tested descendants. Should we ever dig YP4248 Man out of the soil, we should expect to see something rather different.
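As a rough sketch of how a modal line could be derived, the code below takes the most common value in each column of a small, made-up set of profiles. `modal_profile` is a hypothetical helper; when two values tie, it keeps whichever was seen first, which is an arbitrary choice on my part.

```python
from collections import Counter

def modal_profile(profiles):
    """Most common value at each marker position across all testers."""
    modal = []
    for column in zip(*profiles):
        value, _count = Counter(column).most_common(1)[0]
        modal.append(value)
    return modal

# Three made-up testers, two markers each.
profiles = [
    [13, 33],
    [13, 34],
    [14, 33],
]
print(modal_profile(profiles))

# The average of the first marker would be 13.33..., a value nobody carries.
print(sum(row[0] for row in profiles) / len(profiles))
```

With only three testers, a single new result can flip a modal value, which is the point made above about the modal being only as good as the set of tested descendants.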

The highlighted markers on each line illustrate where the testers differ from the modal. The parenthetical numbers in blue are the total Genetic Distance (GD) between each tester and the modal.


A tester who might belong to the project can be compared to the modal (at top of the table) and to the min/max values (at the bottom of the table). For example, this tester is probably not YP4248. If he were positive for the SNP, however, the bar would be raised quite a bit.
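The min/max comparison can be sketched in a few lines. Everything here is illustrative: the profiles are invented, and `marker_ranges` and `out_of_range_markers` are hypothetical helper names. A candidate falling outside the observed range at several markers is a hint, not proof, that he belongs elsewhere.

```python
def marker_ranges(profiles):
    """(min, max) observed at each marker across the project's testers."""
    return [(min(column), max(column)) for column in zip(*profiles)]

def out_of_range_markers(candidate, ranges):
    """Indexes of markers where the candidate falls outside the observed range."""
    return [
        index
        for index, (value, (low, high)) in enumerate(zip(candidate, ranges))
        if not low <= value <= high
    ]

# Made-up project profiles and a made-up candidate.
project = [
    [13, 25, 16, 33],
    [13, 24, 16, 34],
    [13, 25, 17, 33],
]
candidate = [14, 25, 16, 30]

ranges = marker_ranges(project)
print(ranges)
print(out_of_range_markers(candidate, ranges))
```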

Genetic Distance Tester by Tester

Here the GD to the modal for each tester is compared with those values among all testers. The kits are arranged this time based on the value in each cell, so the kit order varies from that in the table above. This is especially useful in predicting where someone might be placed on the SNP tree once advanced SNP testing is completed. Like results start clumping together by surname and subclade.
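A tester-by-tester table like this can be sketched as a dictionary of dictionaries. This is not the script behind the image: the kit names and values are invented, GD is taken as a simple count of differing markers, and `closest_kits` is a hypothetical helper showing how nearest neighbors might hint at SNP-tree placement.

```python
def gd(a, b):
    """Number of markers at which two profiles differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def pairwise_matrix(profiles):
    """GD between every pair of kits, as a dict of dicts."""
    kits = list(profiles)
    return {a: {b: gd(profiles[a], profiles[b]) for b in kits} for a in kits}

def closest_kits(matrix, kit):
    """Other kits ordered from nearest to farthest by GD."""
    others = [(distance, other) for other, distance in matrix[kit].items() if other != kit]
    return [other for distance, other in sorted(others)]

# Made-up kits and values, for illustration only.
profiles = {
    "kit_A": [13, 25, 16, 33],
    "kit_B": [13, 25, 16, 34],
    "kit_C": [14, 24, 17, 30],
}
matrix = pairwise_matrix(profiles)
print(matrix["kit_A"])
print(closest_kits(matrix, "kit_A"))
```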

Color Version

Here's a colorized version. The GD is simply replaced by a color value. I suppose this borders on the silly, but patterns readily emerge. The bright box at the lower right represents the Cooleys and Whitfields (YP4491). This strongly suggests large genetic and genealogical gaps above and below the subclade. Finding the right testers might fill those out. Other boxes could be just as telling, for example numbers 26 through 32. There are bands that appear to be out of place. For example, 19, 24, and 25 should most likely be clumped together. But that's just the way the sort worked out. A better algorithm might come along, but better to just go along with the sort than to have to remember to revise the order each time the scripts are run.
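Swapping a GD for a color can be as simple as scaling it to a gray level. The mapping below is an assumption, not the palette used in the image: a GD of zero becomes white (bright), the largest GD becomes black, and `gd_to_hex` is a made-up name.

```python
def gd_to_hex(gd, max_gd):
    """Map a GD to a gray level: 0 -> white (closest), max_gd -> black."""
    if max_gd <= 0:
        return "#ffffff"
    clipped = min(max(gd, 0), max_gd)
    level = round(255 * (1 - clipped / max_gd))
    return f"#{level:02x}{level:02x}{level:02x}"

# One made-up row of GD values turned into colors.
row = [0, 3, 7, 20]
print([gd_to_hex(value, max(row)) for value in row])
```

Close kits then show up as bright blocks along the diagonal, which is how clusters such as a surname group would stand out.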

And, again I point out, this is derived from STR values. They're not as precise as SNPs and can vary within lineages. They're best used as an adjunct to SNPs.


The first thing to notice about the color version is that each side of the diagonal line is a mirror of the other. In other words, there's double the data. For the following, I removed the bottom half and turned the whole thing on its side, resulting in pyramids rather than squares. It's not coincidental, of course, that it looks a bit like the SNP tree at top, except that the haplogroups are moving in the opposite direction. That's just the way the data is independently sorted.
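The mirror exists because the GD from A to B equals the GD from B to A, so half the matrix can be dropped without losing anything. A small sketch with a made-up 3x3 matrix; `is_symmetric` and `upper_triangle` are hypothetical helpers:

```python
def is_symmetric(matrix):
    """True when cell (i, j) always equals cell (j, i)."""
    size = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(size) for j in range(size))

def upper_triangle(matrix):
    """Keep only the cells above the diagonal of a square matrix."""
    size = len(matrix)
    return [[matrix[i][j] for j in range(i + 1, size)] for i in range(size)]

# Made-up GD matrix for three kits.
gd_matrix = [
    [0, 2, 5],
    [2, 0, 4],
    [5, 4, 0],
]
print(is_symmetric(gd_matrix))
print(upper_triangle(gd_matrix))
```

Each row of the result is one cell shorter than the last, which is what produces the pyramid shape once the picture is rotated.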

This is a lot more ambiguous than the SNP tree. And that's for the simple reason that we're using somewhat finicky STRs as opposed to stable and discrete SNPs. Certainly there are additional pyramids to flesh out, but STRs won't accomplish that. Only advanced SNP testing can help clarify questions of descent. STRs are the rough sketch. It's brought into form by SNPs.

It should be noted that the Cochrans are largely over-represented in this study. However, in the SNP study, the Ulster Cochrans presently have 13 novel SNPs (see the SNP tree, above). Quite a bit of parsing can yet be done. Perhaps that will be accomplished with as few as two Big Ys. They should be carefully chosen, though. The STR matrix might point the way to finding testers who are close, but not too close, to the two current Big Y testers.

Otherwise, it would be helpful to do some single SNP testing to confirm placement ($39 at FTDNA and $18 at YSEQ). FTDNA, however, has SNP tests available for this tree only for YP4248 (the top of the tree) and YP4491 (Cooley-Whitfield). They've stopped, for now, adding to their catalog.

Most of the other names, especially Hackett (and I'm sure there are more to come), are under-represented. The exception is the Cooley-Whitfield block, which is well defined. Still, those gaps above and below YP4491 need to be expanded. We need to find a way to expand our reach.

Origins Map for R1a-YP4248

1. Speculation. Landing of YP4252, 200 BC.
2. Confirmed. Yorkshire 1655 (Story BY30796).
3. Confirmed. Derbyshire 1744 (Hackett YP4253)
4. Speculation. Birmingham c1750 (Cooley YP4491)
5. Confirmed. Ireland 1700s (Cochran YP5244)
6. Speculation. Renfrewshire 13th century (Barons Cochran).
7. Unconfirmed. Unk Scotland 19th century (Cummings BY27664).
8. Unconfirmed. Sutherland c1761 (Gray N/A).

Keep in mind that Great Britain is roughly the size of Minnesota and is 600 miles long. For hundreds and thousands of years, Britons have lived within a few hundred miles of one another. Travel was not uncommon. Over the preceding thousand years and more, YP4248's descendants could have spread out from any point on the map.

But if we were to determine the "center of gravity," so to speak (and I don't really know how to do that), it's likely to be at or near the Scottish/English border. These placements are not likely to be totally accurate. They're derived from the extant data. The landing estimate for YP4252 comes from Hunter Provyn's phylogeographer.com. However, the results at FTDNA tell a different story. YP4252 has two known subclades, FT33200, to which our YP4248 belongs, and YP5598, which includes a Scandinavian-born descendant. FT33200 itself has two subclades, including YP4248. The other, BY63466, is presently comprised of two Scandinavian testers. The data set is too small for drawing conclusions. Still, it appears most likely that YP4248, one of its member SNPs or an immediate parent SNP, hitched a ride to Britain, likely across the North Sea to the east coast, within each cell of its host. We can call the immigrant Björn.
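One common way to find a "center of gravity" for a handful of places is to average their positions on the globe. The sketch below does that and nothing more. The coordinates are rough, county-level approximations I've supplied for illustration (they are not project data), only map entries with a named place are included, and every point counts equally whether confirmed or speculative. `geographic_center` is a made-up helper name.

```python
import math

def geographic_center(points):
    """Average a list of (lat, lon) pairs, in degrees, on the sphere."""
    x = y = z = 0.0
    for lat, lon in points:
        lat_r, lon_r = math.radians(lat), math.radians(lon)
        x += math.cos(lat_r) * math.cos(lon_r)
        y += math.cos(lat_r) * math.sin(lon_r)
        z += math.sin(lat_r)
    count = len(points)
    x, y, z = x / count, y / count, z / count
    center_lat = math.atan2(z, math.hypot(x, y))
    center_lon = math.atan2(y, x)
    return math.degrees(center_lat), math.degrees(center_lon)

# ASSUMED, approximate coordinates for places named in the list above.
places = {
    "Yorkshire": (54.0, -1.5),
    "Derbyshire": (53.1, -1.6),
    "Birmingham": (52.5, -1.9),
    "Renfrewshire": (55.8, -4.5),
    "Sutherland": (58.2, -4.5),
}
lat, lon = geographic_center(list(places.values()))
print(round(lat, 1), round(lon, 1))
```

Adding, dropping, or weighting a single placement moves the result, so with this few points it's a talking point rather than a finding.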

Brief Family Histories

I've barely started on this. More is to come. Help will be welcomed.


Hackett Marriage Record

A descendant of John Hackett of Derbyshire might have been the first Big Y tester positive for YP4248. In fact, I suppose, we might call him the discoverer of the haplogroup. He lacks a Hackett match, however, meaning that his four novel SNPs have not parsed into a terminal SNP.

John Hackett was born 6 April 1746 at Cromford, Derbyshire, England. He married Susannah Renshaw (1750-1834) on 13 July 1772 at Bakewell in the same county. He was a tinsmith, also known as a whitesmith. John lived in Derbyshire all his life, dying at Matlock on 16 Jan 1808. He's buried in the St Giles Church Cemetery.

There's much, of course, to be learned about the family. Because Hackett is of the YP4253 subclade along with Cooley/Whitfield, I'm particularly interested in it. In fact, there's a Cowley family in Derbyshire that appears to be a good match. But, for now, that's purely speculation.

Cooley & Whitfield

The placement on the map for YP4491 (Cooley/Whitfield) is speculative and even sourced from a family now known to be unrelated. My branch of YP4491 Cooleys resided for several generations in Bartholomew County, Indiana. By the 1940s they'd been neighbors with another Cooley family for three or four generations. Thanks to Y-DNA testing over the last few years, however, we know that the Y-DNA of these neighbors, descendants of Reuben Ransom Cooley, matches that of the descendants of Benjamin Cooley of Massachusetts and is of haplogroup R1b-A12022, light years from YP4491.

Benjamin arrived in the colonies in the 1640s. Three hundred years later, in a 1946 letter, Reuben's great-grandson, George Cooley, wrote that Reuben's father immigrated to the U.S. with "11 children, 8 boys and 3 girls," which is a close approximation of what we know about the family of John Cooley of Stokes County, North Carolina (except that John's children were born in Virginia and North Carolina). George continues, "One of these boys was supposed to have been killed in the Battle of New Orleans in 1812." In fact, John's son, Cornelius, died of illness just after the Battle of New Orleans in 1815. But other parts of the story are more contemporaneous with Reuben and correspond with his family, not at all with John of North Carolina.

It's very likely, then, that Reuben's descendants subsumed part of the YP4491 story. The parallels are too striking and specific to have been coincidental. George also wrote that the patriarch of the family immigrated from Birmingham, England, but we can't trust that. After all, George has himself been proven to be an unreliable source. It's a convoluted story and Birmingham could be one of its convolutions. Yet it's all we have for now, and it somewhat correlates with the fact that John's nearest and contemporaneous DNA lineage (Hackett) lived in Derbyshire, just north of Birmingham. Speculation aside, however, Virginia is the earliest confirmed residence for these Cooleys and their Y-DNA twins, the Whitfields.

More about the Whitfields later, but their William Whitfield (1751-1835) and my John Cooley (c1738-1811), both first found in Virginia, are practically Y-DNA clones of one another. Although I'm not seriously postulating this, John was barely old enough to have been William's father, which is not likely. Otherwise, they could have been brothers, half-brothers, uncle and nephew, etc. But more about that when I discuss the Big Y STR results.