Analysis of DNA for Casey lines
Last updated on September 3, 2007
By Robert Brooks Casey (Ambler Casey line)



This web site is the personal analysis of the Casey DNA database by this Casey researcher, Robert Brooks Casey. This web site was created to assist other Casey researchers in analyzing the current DNA submissions and to provide input for what is required to allow this project to produce genealogically significant information. This is not the official Casey DNA web site. For additional information concerning the Casey DNA project, refer to the official Casey DNA web site or contact its coordinator (found at the official web site):

Official Casey DNA web site



Introduction to DNA for Genealogy

Information obtained from the Casey DNA Project

Detailed analysis of DNA submissions

If DNA tests were only $10 each

Good candidates for more DNA analysis

How many markers should be analyzed

Analysis using Cladogram software

DNA Descendancy Chart (SC & TN)

Non Paternity Events vs. Overlapping Haplotypes

Call for better documentation

Please give me some feedback












Introduction to DNA for Genealogy

The analysis of DNA markers provides a new opportunity for genealogists to unravel their family history. This new tool is now producing results that can take some of the guesswork out of adding another ancestor to our family trees. Historically, our traditional research has been heavily influenced by geography, family naming patterns, migration patterns, etc. This approach most often leads us to discover new ancestors but it can also lead us in the wrong direction. Your particular oldest proven ancestor may have broken away from his family connections and traditions. He may not have named his children after his older generation of relatives or may have moved to new areas where no siblings or cousins lived, etc. Analysis of DNA markers allows us to identify which Casey lines look encouraging as potential relatives and can reduce unproductive research on unrelated lines that ended up in the same county by chance. The Casey surname is fairly common; therefore, Casey researchers should expect to regularly encounter unrelated Casey families in the same counties during the same time period.

Unfortunately, DNA research works best when tracing your "all male" line of ancestors; this is basic biology that limits our genealogical research. This biological fact means that male researchers can submit their DNA for their own surname only. For other surnames among your ancestors, you have to get one of your male cousins born with the surname of interest to submit his DNA sample, and you can "sponsor" his submission (or assist him with the expense of the DNA submission based on your mutual interest in your common ancestor). There is also a DNA test for women that will trace their "all female" lines. However, the markers of the "all female" line mutate at such a slow rate that these tests are not very useful for genealogical purposes. With European surname practices for marriage, women of European descent take the surname of their husband, which results in a different surname for every generation. This constant change of surname makes an "all female" line more difficult to trace than one surname over many generations. Male DNA markers are like probate records and census records, our best primary sources that yield maximum information. Female DNA tests are equivalent to tax records and property records, producing fewer results but potentially very useful when census records and probate records fail to produce results. We should search our "all male" DNA lines first and then later supplement them with research of female DNA lines when further analysis of male DNA lines can no longer yield enough information to break through to another generation.

Information obtained from the Casey DNA Project

Currently, there are 23 submissions of DNA markers to analyze for the Casey surname. Analysis implies that there are probably at least five different clusters to date. The submission of the James Casey (VA) line and both submissions of the Sinclair Casey lines have less than a 1 in 10,000 chance of being related to other Casey lines in the last 600 years (less than 0.01 % chance). The Irish Cluster is probably distantly related to the SC & TN Cluster in a genetically significant timeframe (when surnames were first used) but not in a genealogically significant timeframe (within three or four generations of oldest proven ancestors). Genetically, these two clusters are probably from a common ancestor that used the surname Casey; however, this connection would probably be 200 or 300 years prior to our earliest proven ancestors and therefore these two clusters will remain separated for DNA descendancy charts. The two lines in Cluster 3 have around a 10 % chance of being related in the last 600 years. There is a small probability of these lines being genetically related in the last 600 years, and these two lines are currently grouped together in another cluster (which will probably be split after future submissions are received).

There are ten submissions that appear to be closely related to my Casey line (Ambler Casey). This group of submissions (with origins in South Carolina and Tennessee) presently has the most to gain from this project due to the early formation of a cluster of ancestors tied together by very similar DNA markers. Even the two most distantly related individuals in this cluster have a 96.90 % chance of being related in the last 500 years (99.04 % chance in the last 600 years). This probability was calculated using the FTDNATip utility from Family Tree DNA when using 37 markers. Since both of these have 67 markers available, this MRCA (Most Recent Common Ancestor) utility shows a 99.48 % chance of a common ancestor in the last 500 years and a 99.90 % chance of having a common ancestor in the last 600 years. To date, the John Casey (MO) line and the Jesse E. Casey (& Ambler Casey) lines are the most distantly related lines (being three mutations apart). This MRCA utility actually shows this probability for the last 20 (or 24) generations and I have assumed 25 years per generation. The DNA submission of Jackson Casey was not included in the comparisons as the DNA mutations of this submission must be from male descendants of Henson Casey vs. Henson Casey himself. His brother, Arvle Casey, has an exact 37 marker match with John Casey (SC) and therefore these mutations were not introduced by Henson Casey but probably originated within the Jackson Casey line (in this case either Jackson Casey or the submitter of DNA).
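The conversion from generations to years used above is simple arithmetic, and a minimal sketch makes the assumption explicit. The function name is hypothetical; the only inputs are the 20 and 24 generation figures and the assumed 25 years per generation.

```python
# Assumption stated in the text: one generation is taken to be 25 years.
YEARS_PER_GENERATION = 25


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert a count of generations into an approximate span of years."""
    return generations * years_per_generation


# The two generation counts reported by the MRCA utility.
for generations in (20, 24):
    print(generations, "generations is about", generations_to_years(generations), "years")
```

A different assumption (say 30 years per generation) can be passed as the second argument, which would stretch the same probabilities over a longer timeframe.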

It appears that several lines with recent ties to Ireland have also formed a second cluster. Unfortunately, only about half of the Casey lines with recent Irish origins belong to this second cluster and the other half belong to other clusters. The MRCA utility shows that the two most distantly related lines are the Michael Casey (Ireland) line and the Daniel Casey (Ireland) line, which have a genetic distance (number of mutations) of seven. These two lines currently have an 88.04 % chance of having a common ancestor in the last 500 years (95.00 % chance in the last 600 years). The MRCA utility is a good tool to verify the formation of a cluster and these three lines are definitely a second cluster. With the John Casey (NY) line being upgraded to 37 markers, more information concerning this cluster will soon be available for additional analysis.

The utility can also assist in revealing whether two clusters might share a common ancestor that used the Casey surname. Being able to determine the probability of two clusters being genetically related is also very important in establishing the progenitor of each cluster. If these clusters are closely related, they would share a common ancestor that has a DNA marker set somewhere between the two clusters. It appears that the SC & TN cluster and the Irish cluster could share a common ancestor in a genetically significant timeframe (in a timeframe when surnames were first used). The two most closely related individuals from these two clusters have a 43.14 % chance of being related in the last 500 years (62.51 % in the last 600 years). It is believed that these percentages are high enough to establish that there is a reasonable chance that both lines descend from a very early common ancestor that used the surname of Casey. The two most distantly related lines are John Casey (MO) and Daniel Casey (Ireland), which have a genetic distance of 13 (thirteen mutation differences). Even these two lines have a 16.84 % chance of a common ancestor in the last 500 years and a 32.49 % chance in the last 600 years. Unfortunately, the MRCA utility does not allow manual entry of DNA marker sets for comparison. It would be much more accurate to guess at the marker set of the common ancestor between these two clusters (using a mutation point from the 37 marker cladogram chart). This would give a better picture of how closely related these clusters could be. For two clusters to share a common ancestor with the same surname, it would be acceptable for the two most closely related lines to have a 50 % chance in the last 600 years and the two most distantly related lines to have over a 20 % chance in the last 600 years.
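The rule of thumb in the last sentence above can be written down as a small check. This is only a sketch of that personal guideline, not a feature of the MRCA utility; the function and constant names are hypothetical, and the percentages are the 600 year figures quoted above.

```python
# Thresholds from the rule of thumb above (percent chance in the last 600 years).
CLOSEST_PAIR_MIN = 50.0   # the two most closely related lines: at least 50 %
FARTHEST_PAIR_MIN = 20.0  # the two most distantly related lines: over 20 %


def clusters_plausibly_related(closest_pct_600, farthest_pct_600):
    """Apply the two-threshold rule of thumb to a pair of clusters."""
    return closest_pct_600 >= CLOSEST_PAIR_MIN and farthest_pct_600 > FARTHEST_PAIR_MIN


# SC & TN cluster vs. Irish cluster, using the figures quoted above.
print(clusters_plausibly_related(62.51, 32.49))
```

With the quoted figures the SC & TN and Irish clusters pass both thresholds; a pair of clusters whose closest lines show only a few percent would fail the first one.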

A real surprise is that there are at least five major clusters identified to date and only two of these clusters have a reasonable chance of being genetically related (the SC & TN cluster and the Irish cluster). The two lines in Cluster 3 only have around a 10 % chance of a common ancestor in the last 600 years and will probably evolve into two clusters. A big surprise is that this utility shows that Clusters 4 and 5 are genetically different from all other Casey clusters. Clusters 4 and 5 have a 0.00 percent chance of being related to each other in the last 600 years (must be less than 0.01 % probability or less than one out of 10,000 chance) and both lines also have no chance of being related to the closest cluster, the Irish Cluster. There are many possible explanations for this great genetic distance: 1) There could be many unrelated men that first used the surname Casey (or its earlier forms); 2) Some of these lines could have changed their surnames from other surnames to the Casey surname; 3) Some will be NPEs (Non-Paternity Events); 4) Numerous other remote reasons are possible.

The two lines covered in Cluster 3 are also very distantly related and have only a 3.55 % chance of a common ancestor in the last 500 years (9.81 % chance in the last 600 years). This indicates a remote chance that this could be a cluster but more likely implies that these two lines are genetically different lines as well (or form more than one real cluster). The other lines in Cluster 3 with fewer than 37 markers appear to be even more distantly related. I have always known that surnames based on trade or geographical terms would have great genetic diversity but was a little surprised that the Casey surname also falls into this category. In fact, the Christopher Casey line has higher odds of being related to the John Casey (SC) line, as these lines have an 8.8 % chance of being related in the last 500 years (20.96 % chance in the last 600 years). This is a bit of a letdown as I had always falsely believed that most of these Casey lines were related somewhere in the distant past, but genetic information tends to shatter this assumption. Also, when you really think about it, there will probably be a lot of name changes and NPE events over 20 to 25 generations where each male ancestor had several sons each generation. When the odds of being related are this low in a 600 year timeframe, it is very obvious that these lines will never be genealogically significant to each other (in a timeframe reasonably close to our oldest proven ancestors).

The surname "Casey" is a fairly common surname (364th most common surname according to the 1964 Social Security survey of surnames in the book, "American Surnames," by Elsdon C. Smith). There are an estimated 150,000 individuals with the "Casey" surname in the United States and at least one hundred Casey men that have been declared the oldest ancestor of numerous Casey lines. With 75,000 Casey men living in the United States today, the chances for NPE events and name changes are quite high and each passing generation will create genetically "new" Casey lines. The highest priority of this project is to greatly increase the number of submissions that are "not" known to be related to the current oldest proven ancestors. The descendants of Clusters 3, 4 and 5 should not be discouraged and should recruit new members that they believe could be related to their lines. We also need more random submissions to determine how many major clusters will form. Currently, this project is probably biased towards the SC and TN lines (long time interest in these lines and previous publications) and somewhat biased towards lines with recent origins in Ireland (the obvious origin of the Casey surname).

For the ten submissions included in the SC and TN cluster, two submissions are from the same known ancestor and have identical DNA markers. Another two submissions are from another proven ancestor but have two mutational differences (a third submission is required to truly understand the source of these mutations). One further submission is believed to be a Casey NPE event (the Hanvey line is believed to really be a Casey line). This leaves seven lines where there are no proven connections between the lines and six of these lines have unique DNA marker sets that can be charted in a DNA descendancy chart. Only the Ambler Casey line and Jesse E. Casey line cannot be separated - both share the same 67 DNA markers. However, even with only six unique lines in this cluster, I am very encouraged by what these samples tell us and I am optimistic that additional samples will greatly help Casey research on the Casey lines of South Carolina and Tennessee. So what have we learned from these early submissions? First, the SC & TN Cluster clearly validates what we all have suspected, that the Tennessee lines have their roots in South Carolina. We really already knew this from traditional genealogical research - but we now have this fact validated by scientific evidence as well. Second, all six lines (which are quite diverse) only have one to three marker deviations from the other SC and TN lines. Therefore, there was a big discovery that all South Carolina and Tennessee lines appear to be much more closely related than anticipated. This means random new entries in this cluster could provide very interesting results.

Third, DNA evidence does not support the speculative connection of Henson Casey being a son of Ambler Casey. This is a little bit disappointing as there were great expectations that DNA would help support this connection. Fourth, it appears that the Ambler Casey line and the Jesse E. Casey line are the most closely related lines as they both share a common unique DNA mutation for this cluster (CDYb from 38 to 37) and both lines are an exact 67 marker match. Fifth, marker 460 mutating from 12 to 13 appears to be the earliest mutation in this cluster and creates two distinct branches in the DNA descendancy charts. John Casey (SC), John Casey (MO), James H. Casey and probably Henson Casey form one branch (460 = 12) and Abner Casey, Jesse E. Casey, Ambler Casey, the Hanvey line and possibly the Henson Casey line form the second branch (460 = 13). Sixth, there are three branches of 460 = 12 and probably two or three sub-branches. There are two branches of 460 = 13 with one branch having two sub-branches and one branch having two or three sub-branches (depending on Henson Casey variations). This is a lot of branching for so few lines submitted to date. Seventh, it is now known that the John Casey (SC) line currently has the spotlight as containing the markers that most closely represent the progenitor of the SC & TN cluster. The Henson Casey line may share that spotlight but a third son is required to determine Henson's DNA marker set from his male descendants and the John Casey (SC) line needs to be upgraded to 67 markers for a full comparison of markers. Eighth, the Abner Casey line and the Hanvey line are exact 67 marker matches. Since the Hanvey line remained in South Carolina much longer, further research of the Hanvey line may shed light on the Abner Casey line.

Ninth, the Henson Casey submissions clearly show the importance of getting two sons from one known ancestor. The first submission (Jackson Casey) is now known to have a two marker mutation from his brother's line (Arvle Casey). It is now believed that these mutations may have occurred in later generations - and do not represent the DNA marker set of Henson Casey. The original analysis of the Jackson Casey line was substantially off: the recent discovery suggests that these mutations probably occurred more recently and are probably not as genealogically significant as first believed. A DNA submission from a third son is now required to determine Henson Casey's true DNA marker set. The conclusions for Henson Casey should be considered speculative until more DNA submissions reveal Henson Casey's true DNA marker set. Tenth, even with very few submissions, there appears to be the ability to speculate on how these SC and TN lines are related based on DNA submissions. It appears that a DNA descendancy chart is possible that shows how these lines could be related and is very helpful in determining what kind of submissions are needed next. The DNA descendancy chart (shown later in this web page) is a unique approach that does not appear in most DNA projects and should be considered speculative in nature.

Another discovery is that the haplogroups are identified as Irish in only 18 of the 23 submissions. Four of the more distantly related Casey lines are not categorized at this point in time as being part of any Irish haplogroup. However, I suspect that Irish haplogroups are not that well defined, as the Patrick Casey line of Clare County, Ireland does not have an Irish haplotype, which casts some doubt on the ability of this haplotype to include all people of Irish origin. Research of overlapping haplotypes from the YSearch database implies that some of these Casey lines may have Scottish or English origins and just borrowed the "Casey" surname for some unknown reason or were NPE (adoptions, etc.) events. With the analysis of only STR markers, haplogroups are only estimated. To know the actual haplogroup, another DNA test must be performed (analyzing the SNP DNA markers). Since most Casey lines are estimated to be R1b1, it is most likely that their haplogroups will fall into the R1b1 family of haplotypes. The DNA marker sets of the SC & TN Cluster and the Irish Cluster have been further estimated to be Irish Type 3 (also known as R1b1c). Only further DNA analysis of SNP markers (at additional expense) can truly map each line to a definite haplogroup. It is believed that all Casey lines in the SC & TN Cluster and the Irish Cluster are part of the R1b1c haplotype. Only testing of SNP markers could verify this assertion but the expense probably does not warrant the gain and it is believed that no genealogically significant information would be revealed. For further information concerning the Irish Type 3 haplogroup:

Irish Type 3 Haplogroup

Many of the Casey lines are genetically related to other surnames and some genetically Casey lines will have non-Casey surnames. This complex topic has not been thoroughly researched to date with one notable exception - the Hanvey line (submission 29956). For the SC and TN Cluster, every entry in the Family Tree DNA database has the Casey surname and has documented ties to either South Carolina or Tennessee with one exception (one Hanvey line). It appears that this Hanvey line may be a genetic Casey line (NPE) and probably belongs to the SC and TN cluster. The Hanvey line appears to be closely related to the Abner Casey line and researchers of both lines should exchange information on possible connections. The Abner Casey line is an exact 67 marker match with the Hanvey line. The Hanvey sponsor is quite certain that the NPE event occurred in South Carolina, as the Hanvey line originates from Abbeville County, South Carolina and remained in South Carolina much longer than the Abner Casey line did. After contacting the Hanvey researcher who sponsored the DNA submission, I was informed that he believed the NPE event occurred in either 1865 or 1885. If true, this would be very significant to the Abner Casey line as this means that one Casey line that remained behind in South Carolina is very closely related to the Abner Casey line.

Detailed analysis of DNA submissions



For anyone wanting to analyze the Casey DNA submissions that we currently have, the only practical method is to use cladogram software which graphically displays the connections between the various DNA submissions. Without this tool, it is extremely difficult to manually extract this information. The cladogram chart puts John Casey (SC) as a line that is more closely related to other Casey lines. The cladogram program makes a simple assumption of connecting lines by determining the minimum number of mutations. This is the highest probability connection but it is not 100% accurate as the actual connections could have more mutations or backwards mutations (deletions) can also occur. The cladogram charts should only be used to identify clusters and determine if clusters are genetically related (very important in determining the progenitor of any cluster). Additionally, the cladogram charts assume that all DNA mutations are significant to the oldest proven ancestor. In fact, many of these mutations originated with male descendants of oldest proven ancestors vs. the oldest proven ancestor themselves. Also, the cladograms tend to oversimplify relationships in their charts. In order to have branches, at least two brothers have to have different DNA marker sets but only one marker set has to mutate and the other can remain the same. The father of Abner Casey and the father of Jesse E. Casey could be brothers, however, the cladogram does not show this possible brother relationship and implies that the Jesse E. Casey line must be a branch of the Abner Casey line.
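To illustrate the "minimum number of mutations" idea described above, here is a minimal sketch that counts mutation steps between haplotypes and finds each line's closest neighbour. It assumes a simple stepwise count (each one-step difference at a marker counts as one mutation), which may differ from what any particular cladogram program or testing company does. The three haplotypes, their labels and the `marker_x` marker are hypothetical illustrations, not actual submissions; only the 460 and CDYb values echo numbers mentioned in this analysis.

```python
def genetic_distance(a, b):
    """Count mutation steps between two haplotypes over the markers they share.

    Assumes a simple stepwise model: a difference of two at one marker
    counts as two mutations.
    """
    shared = a.keys() & b.keys()
    return sum(abs(a[marker] - b[marker]) for marker in shared)


# Hypothetical three-marker haplotypes (illustrative values only).
lines = {
    "Line A": {"460": 12, "CDYb": 38, "marker_x": 14},
    "Line B": {"460": 13, "CDYb": 38, "marker_x": 14},
    "Line C": {"460": 13, "CDYb": 37, "marker_x": 14},
}


def nearest_neighbour(name):
    """Return the other line with the fewest mutation steps (ties broken by name)."""
    others = (n for n in lines if n != name)
    return min(others, key=lambda n: (genetic_distance(lines[name], lines[n]), n))


for name in lines:
    print(name, "is closest to", nearest_neighbour(name))
```

As noted above, the fewest-mutations connection is only the most probable one; the real path between two lines could include extra or backwards mutations that this count cannot see.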

The cladogram charts combined with the availability of DNA submissions outside the SC and TN cluster are key to establishing the DNA descendancy chart for the SC and TN Cluster. The cladogram charts imply that the DNA marker set of John Casey (SC) has the fewest mutations from the closest Casey lines in other clusters. I guess congratulations are due to this line for being the least mutated Casey line within this cluster. This fact currently puts John Casey (SC) on top of the DNA descendancy chart and establishes this line as the baseline DNA marker set that all other submissions in this group should be compared with. A closer inspection of the markers involved shows that marker 460 is very important to this cluster. For all other Casey lines, 460 is either 10 or 11. For the SC and TN Casey cluster, 460 is either 12 or 13. This means that the SC and TN lines have a unique fingerprint from other Casey lines by having one or two more mutations for one marker than all other Casey submissions to date. The SC and TN Cluster can be described genetically as those who have 460 > 11.
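The fingerprint described above (460 greater than 11) amounts to a one-line test. The sketch below assumes each submission is held as a dictionary of marker values, which is a hypothetical representation chosen for illustration.

```python
# The text describes the SC and TN cluster as those submissions with 460 > 11.
SC_TN_THRESHOLD = 11


def in_sc_tn_cluster(haplotype):
    """Return True when the value for marker 460 exceeds 11."""
    value = haplotype.get("460")
    return value is not None and value > SC_TN_THRESHOLD


# The four values of marker 460 seen across Casey submissions to date.
for value in (10, 11, 12, 13):
    print("460 =", value, "->", in_sc_tn_cluster({"460": value}))
```

A submission with no result for marker 460 is treated as not classifiable here rather than guessed at.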

Without the usage of cladogram charts and the inclusion of other Casey DNA submissions outside the SC and TN Casey cluster, very different conclusions would result. Most DNA genealogical web sites use another methodology to determine the progenitor of any cluster. This method is sometimes called the "Majority Rules" methodology. It assumes that the DNA marker value with the highest number of occurrences in the cluster probably represents the DNA marker value of the progenitor of the cluster. This assumes that all DNA submissions within the cluster are random and evenly distributed; otherwise the conclusion is biased by the popularity of the lines submitted. Using this "Majority Rules" methodology and inspecting only the SC and TN submissions independently, the DNA descendancy chart takes on a very different look: the Abner / Pleasant Casey line would appear to be the best candidate for being on the top of the DNA descendancy chart. Another methodology looks at all submissions of a surname and would use the cladogram chart in choosing the progenitor of the SC & TN Cluster. This methodology makes the assumption that the closest marker set to other clusters must be the progenitor of the SC & TN Cluster. These two major conflicting approaches can create two very different DNA Descendancy Charts.
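The "Majority Rules" methodology described above can be sketched as taking, for each marker, the value that occurs most often within the cluster. The function name and the four sample haplotypes are hypothetical and are not the actual Casey submissions; they only show how the popularity of submitted lines drives the result.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """For each marker, pick the value that occurs most often in the cluster."""
    markers = set().union(*(h.keys() for h in haplotypes))
    modal = {}
    for marker in sorted(markers):
        counts = Counter(h[marker] for h in haplotypes if marker in h)
        # most_common(1) returns the single most frequent value; in a tie
        # the value encountered first wins, so ties deserve a closer look.
        modal[marker] = counts.most_common(1)[0][0]
    return modal


# Hypothetical cluster of four submissions (illustrative values only).
cluster = [
    {"460": 12, "CDYb": 38},
    {"460": 13, "CDYb": 38},
    {"460": 13, "CDYb": 38},
    {"460": 13, "CDYb": 37},
]

print(modal_haplotype(cluster))
```

In this made-up cluster the 460 = 13 value wins simply because three of the four submissions carry it, which is exactly the bias discussed above when closely related lines are over-represented.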

Therefore, we need to analyze the underlying assumptions of these two different approaches. First, do we believe that the DNA submissions included in the SC & TN Cluster are biased with more submissions that are closely related to the Abner Casey line, or are they randomly distributed? Unfortunately, I tend to believe that the popularity of Roane County, TN and other central Tennessee lines may have introduced a bias to the DNA submissions received to date. Obviously, this assumption is not the easiest to validate. Fortunately, the second methodology has tools to validate whether two clusters could be related. In this case, we can run an MRCA utility and determine the probability of the two clusters being related in the last 500 or 600 years. With the MRCA utility, you can enter two DNA submissions (haplotypes) and it will give you the probabilities that these individuals are related over various numbers of generations. If the two most closely related individuals from each cluster show reasonable chances of being related, then this approach has scientific evidence supporting the required assumption.

There is a very big assumption when using the cladogram, as connections between clusters have no meaning unless the adjacent clusters are closely related enough to be genealogically significant. It appears that the SC & TN cluster and the Irish Cluster could share a common ancestor in a genetically significant timeframe (in a timeframe when surnames were first used). The two most closely related individuals from these two clusters have a 43.14 % chance of being related in the last 500 years (62.51 % in the last 600 years). I believe that these percentages are high enough to assume that these two clusters did indeed share a common ancestor that used the surname Casey (probably too far back to be genealogically significant but close enough to assume the connection shown in the cladogram is most likely). Therefore, I believe that the "Related Cluster" approach has supporting data and the "Majority Rules" approach has less supporting information. However, the most common approach used by DNA genealogical web sites is "Majority Rules", as this approach is covered more in genealogical publications that cover the topic of DNA being used for genealogical purposes. Welcome to the uncertainty of DNA research used for genealogical purposes. Over time, additional DNA submissions will incrementally allow both approaches to merge to a common conclusion - that is the beauty of larger sample sizes: they make conflicting assumptions much less likely, as the sheer size of the sample "should" reduce bias over time.

With the recent submission of Arvle Casey, a backwards mutation may have actually taken place. Therefore, the Abner / Pleasant Casey line still remains the second best candidate for topping the DNA descendancy chart, but additional DNA submissions for the Henson Casey lines may reveal that the Abner Casey and Henson Casey lines share second place. Additional submissions of other SC lines could eventually reveal that another Casey line could replace the John Casey (SC) line as having the oldest known set of DNA markers. The Ambler Casey line or Jesse E. Casey line is unlikely to be on top of the DNA descendancy chart. Others should play around with their own possible descendancy charts for the SC and TN Cluster as there are several other possible (lower probability) scenarios that could develop. The most encouraging aspect of this project is that with each additional submission in this group (or expansion of available markers within this group) more information can be extracted.

If DNA tests were only $10 each

Unfortunately, charges for DNA analysis will come down very slowly over time (and there will always be more markers made available to quench our thirst for more information). So $10 submissions will not happen in the near future, but we could have a wealthy person who wants the Casey DNA project to be the best, or somebody could leave a modest donation to our project in their will. What would we do with these funds? We would certainly want to spend them wisely. Let us assume that we had $25,000.00 donated to the Casey DNA project tomorrow. Would that be too much or not enough? Also, we should assume that we are not allowed to divert the funds to additional traditional research, publishing Casey books or scanning 1,000s of pages of Casey source materials (which might be a better usage of the funds). The benefactor set only three conditions: 1) We would have to present a plan of how the funds would be used and what results would be expected; 2) Only those persons who currently have submitted DNA submissions could participate in defining how funds would be spent; 3) Funds can only be spent on actual DNA testing (not hiring any consultants, other genealogical projects, advertising, etc.).

So if this Casey DNA project already had identified 1,000 random male Casey volunteers willing to participate (all with diverse well documented ancestry back to around 1800) and some private benefactor donated $25,000.00 to the Casey DNA project, what should be done with these funds? Could we even spend that kind of donation properly (this represents 100 additional submissions with a mixture of 37 and 67 markers)? Let us also assume that this donation was spread out over ten months. This means we would have to describe which ten submissions (or upgrades) would be sent for analysis each month and what kind of results we would expect - this would force us to plan ahead and justify which ten percent of willing volunteers would be analyzed and which 90 % of willing participants would not be analyzed. Here would be my list:

Month one - Upgrade James H. Casey to 37 markers (done); upgrade John Casey (SC) to 67 markers; three more 37 submissions from SC and TN lines that are not related to existing SC and TN lines (but are good candidates to be related - one done now); three more 37 marker submissions from lines with recent Irish origins but good candidates to be related to the currently submitted lines with recent Irish origins; three more lines of the other Casey lines (only good candidates that could be possibly related to the existing submissions).

Month two - Upgrade Arvle Casey to 67 markers; upgrade second SC and TN Casey line to 67 markers (done); three more 37 marker submissions in the SC and TN cluster; three more 37 marker submissions in the Irish Cluster; three more submissions in other Casey lines.

Month three - 37 marker submission from third son of Henson Casey; two more random 37 marker submissions from the SC and TN lines; two more random 37 marker submissions from Irish Cluster; five more submissions from other Casey lines.

Month four - Upgrade John Casey (NY) to 37 markers (being done); Upgrade James C. Casey to 37 markers; order 37 marker for second son of Ambler Casey; order 37 marker for second son of Abner Casey; order 37 marker for second son of Jesse E. Casey; two more 37 marker submissions from Irish Cluster; five more 37 marker submissions from other Casey lines.

Month five - Upgrade three 37 marker submissions to 67 marker submissions (from either SC and TN lines or the recent Irish lines); Upgrade Patrick Casey to 37 markers and Daniel Casey to 37 markers; Upgrade submittor Daniel Casey (VA) to 37 markers; two more 37 marker submissions from SC and TN cluster; two more 37 marker submissions from Irish Cluster; three more 37 marker submissions from other Casey lines.

We are now half way through our funds and have all known upgrades done and all known fine tuning submissions completed. We have a total of eight fine tuning submissions (from same known ancestors) - six from the SC and TN lines and two from other Casey lines. This means that we have 16 lines covered from SC and TN; 17 lines covered from Irish Cluster and 19 lines from other lines. This is probably on target for SC and TN lines, somewhat high for the Irish Cluster and low for other lines. There are around 50 unique lines in the SC and TN cluster, therefore, 16 lines would probably be insufficient to cover this cluster but 32 lines would be sufficient to uncover many connections. Half way through the donation funds, it would now be time to concentrate on the best candidate for a third cluster from the other Casey lines.

Month six - Two upgrades from 37 to 67 markers where required; three more 37 marker submissions to SC and TN lines; one more 37 marker submission to Irish Cluster; two more 37 marker submissions for the third identified cluster; three more for other Casey lines.

Month seven - One fine tuning submission (submission from line that already has a proven ancestor submitted); two 37 to 67 marker upgrades where required; two more from the SC and TN cluster; one more from the Irish Cluster; two more for third cluster; three more for other Casey lines.

Month eight - One fine tuning submission; two more from the SC and TN Cluster; one more from the Irish Cluster; three from the third cluster and three more from other Casey lines.

Month nine - Two more 37 to 67 marker upgrades; three more from SC and TN Cluster; one more from Irish Cluster; two more from third cluster; three more from other Casey lines.

Month ten - One fine tuning submission; three more from SC and TN Cluster; one more from Irish Cluster; two more from third cluster; three more from other lines.

All funds are now spent and what kind of coverage would we have ? Let's assume that four submissions were used to seed the third true cluster from the other Casey lines. This leaves us with 30 lines covered from other Casey lines, 29 lines covered from the SC and TN cluster, 22 lines covered from the Irish cluster and 18 lines covered from the third cluster. This results in 99 lines covered, eleven submissions for fine tuning and nine equivalent submissions dedicated to upgrades. This would result in three DNA descendancy charts that would give researchers a lot of new lines to start researching based on DNA results. At this point in time, traditional research should help tie many of these lines together with information derived from this DNA study. However, it appears that even $25,000 would not be enough funds to completely map the Casey DNA map (my estimate is that it would be approaching 50 % though).
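The arithmetic behind this plan can be checked with a short Python sketch. The dollar, month and coverage figures are taken from the text above; the variable names are my own, and the cost per submission is only an average, since the actual price mix of 37 and 67 marker tests is not stated here.

```python
# Figures taken from the spending plan above; names are illustrative only.
donation = 25_000      # dollars donated
submissions = 100      # new submissions or upgrades the donation funds
months = 10            # the donation is spread over ten months

avg_cost = donation / submissions      # average dollars per submission
per_month = submissions // months      # submissions analyzed each month
monthly_budget = donation / months     # dollars available each month

# Lines covered at the end of month ten, by group.
coverage = {
    "other Casey lines": 30,
    "SC and TN cluster": 29,
    "Irish cluster": 22,
    "third cluster": 18,
}
total_lines = sum(coverage.values())

print(f"average cost per submission: ${avg_cost:.2f}")
print(f"submissions per month: {per_month} (${monthly_budget:.2f} per month)")
print(f"lines covered: {total_lines}")
for group, n in coverage.items():
    print(f"  {group}: {n} ({n / total_lines:.0%} of covered lines)")
```

The share printed for each group makes it easy to see how the plan balances the clusters against the "other" lines.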

The above plan for spending the mythical $25,000 donation provides a good insight on where priorities should be set for this DNA project. I am not really certain what 30 "not so closely related" other Casey lines would tell us about the Casey surname. Only the formation of closely related clusters can assist our primary goal of connecting the many Casey lines that exist. However, we must include a diverse cross-section of all Casey lines to provide a better picture of all Casey lines. There are probably dozens of Casey lines that have no common genetic ancestor that used the surname of Casey. Many unrelated individuals must have taken the Casey surname when our Casey ancestors started using surnames, many NPE events have probably happened and there are bound to be several name changes from other surnames to the Casey surname. We also need to be brave and attempt to map several of our Casey lines as NPE events (non paternity events such as adoption, out of wedlock, etc.). As one Hanvey line appears to be one known NPE event where some Casey male was probably adopted by the Hanveys, we should also investigate which Casey lines are not really genetically related to the Casey lines but were other non-Casey males that were adopted into the Casey families. Each of these NPE lines creates a genetically "new" Casey line. However, with the assistance of DNA submissions of other surnames, we should be able to determine some of the origination of "new" Casey lines and tie some into other surname descendancy charts.

Good candidates for more DNA analysis

So what additional submissions would provide more insight to our Casey ancestors ? And which upgrades would be beneficial ? And what fine tuning submissions would be useful (additional submissions from known ancestors that already have submissions) ? Without any doubt, the highest priority is to broaden the scope of the submissions. We all need to focus more on identifying and recruiting submissions from lines that we think could be related and less on additional submissions that we know are related. Additional submissions for lines already covered and upgrades to existing submissions also have value but are not as high a priority (with a couple of exceptions). The primary purpose of this project is to determine which unrelated lines look most promising for additional research for possible connections. Identifying and recruiting these possibly related lines should always remain our highest priority. Unfortunately, our bias towards our own lines greatly influences our interests. It is human nature to want our lines to be best represented but the project benefits more from having broader participation. We all need to work hard to identify good candidates that might be related and actively recruit DNA submissions from those lines.

There is much interest in the "fine tuning" of the existing submissions (adding more submissions to lines that already have DNA submissions). The best usage of DNA is to scientifically prove which lines are worth additional research and which lines can be eliminated as wild goose chases. So, what value does "fine tuning" DNA submissions have ? First, it is useful to have submissions from at least two sons of each proven oldest Casey ancestor in order to verify the exact source of any DNA mutation. The second submission will determine if their marker set is unique to their oldest proven ancestor or a mutation of one of his sons (or other male descendants). This is where the DNA descendancy chart is very useful. For some lines, it may be pretty obvious that the chance of variations is low. However, the two marker variation between two sons of Henson Casey vividly shows that recent mutations can be mistaken for mutations between oldest proven ancestors.

If the submissions for the first two sons of any oldest proven Casey ancestor have different DNA marker sets, then a new submission from a third son of this oldest proven ancestor would be required. The recent addition of the Arvle Casey submission has now shown that the Jackson Casey line has the unique marker mutation of 607 (15 to 16). This mutation may possibly even be from a later generation male descendant. A third DNA submission would determine where these unique mutations start. With only one submission per ancestor, it can be dangerous to assume that all sons of the oldest ancestor will have the same DNA marker set. In the case of the Arvle Casey submission, I think we were all surprised to find that a brother would have two DNA mutations. For the Henson Casey line, a third submission is doubly important since we currently do not know whether Jackson Casey has a two marker mutation, or only a one marker mutation while Arvle has one backwards mutation, which is rare but can happen (probably a red flag for DNA genealogical descendancy charts).
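The "two sons, then a third if they differ" rule described above can be written down as a small Python sketch. This is only my own illustration of the reasoning, not a tool used by the project. The function names are hypothetical, and apart from marker 607 (15 versus 16), the marker name "marker_B" and all values are made up for the example.

```python
def compare_sons(results):
    """results maps each tested son to the marker values of his descendant.

    Returns {marker: set of distinct values} for every marker on which
    the tested sons disagree.
    """
    markers = set()
    for haplotype in results.values():
        markers.update(haplotype)
    differing = {}
    for m in sorted(markers):
        values = {h[m] for h in results.values() if m in h}
        if len(values) > 1:
            differing[m] = values
    return differing


def recommend(results):
    """Suggest the next fine tuning step for one oldest proven ancestor."""
    if len(results) < 2:
        return "need a second son tested"
    differing = compare_sons(results)
    if not differing:
        return "sons match: marker set likely belongs to the oldest proven ancestor"
    if len(results) == 2:
        return "sons differ on " + ", ".join(differing) + ": need a third son tested"
    return "three or more sons tested: the majority value per marker is the likely ancestral value"


# Illustrative data only: marker 607 follows the text, "marker_B" is hypothetical.
henson = {
    "Jackson Casey": {"607": 16, "marker_B": 12},
    "Arvle Casey": {"607": 15, "marker_B": 13},
}
print(recommend(henson))
```

With only two sons that disagree, the sketch cannot say which value is the older one, which is exactly why the third submission matters.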

A second form of "fine tuning" is upgrading the number of markers when two unconnected lines have common marker sets. John Casey (SC) and Arvle Casey match each other on all 37 markers. It is doubly important for the John Casey (SC) line to be upgraded to 67 markers as it represents the baseline DNA marker set for this cluster (the line with the fewest mutations from other Casey lines). These two submissions could benefit from being upgraded to 67 markers to attempt to determine if additional markers can separate these two lines or help establish an even closer relationship. In the case of the upgraded markers of Jesse E. Casey and Ambler Casey, the 67 marker set implies an even closer relationship between these two lines than a 37 marker match would indicate. If they had turned out to be different, we would have identified the mutation that makes these two lines unique (guess we will have to wait for the future 100 marker test for this). If the Henson Casey and John Casey (SC) lines come back with a 67 marker match, it would be time for these two lines to start sharing traditional genealogical research for possible connections. DNA clearly shows that Henson Casey is much more closely related to John Casey (SC) than to Ambler Casey as once speculated. Maybe Henson Casey has ties back to Warren County, Kentucky where his 37 marker match once lived.

The "fine tuning" of additional children of an oldest known ancestor will also validate the connection of each son to the oldest known ancestor. These connections could already be pretty well established or could be fairly speculative in nature. Our DNA Descendancy charts should not be too biased by our traditional research to date. However, well proven sons will benefit little other than verifying what is already known. The three major TN Casey lines are well researched but all three lines lack the traditional proof that most genealogical lines have. These three lines are Jesse E. Casey, Ambler Casey and Abner Casey. The connection of Jesse E. Casey to his children was originally based on an 1894 book. Fortunately, this secondary source is well supported by census records and other sources (with one or two exceptions). Of the three lines, the Jesse E. Casey line probably has the best genealogical documentation for establishing the children of their oldest proven ancestor (this line would not benefit from further proving the connection from oldest ancestor to their sons since primary documentation already exists). The list of children of Abner Casey is primarily based on several abstracts (letters) of a Family Bible that can not be located. This account is also supported by several primary documents.

The Ambler Casey line is the least documented family, as no single existing document establishes the children of Ambler Casey. DNA documentation can provide scientific evidence that firms up the connection of these sons to their oldest proven ancestors. DNA submissions from sons with the weakest traditional genealogical documentation to their oldest proven ancestors could provide additional documentation connecting these sons. Having DNA evidence supporting these family connections may, in the near future, be considered primary documentation in this new world of genealogy. Unfortunately, this did not prove the case for Henson Casey being the son of Ambler Casey (this was pretty speculative in nature and has now been shown by DNA evidence to be very unlikely). This is probably the best usage of DNA submissions when the sample size is relatively small (as it is to date) and when one cluster of lines emerges early in any DNA project.

Our goal is to get several clusters of Casey lines that help establish recent common ancestry between various Casey submissions. Once the number of submissions greatly expands in scope, another major benefit will start to emerge. It will become obvious that several diverse Casey lines are more closely related than traditional research has shown to date. DNA documentation can help genealogists better select which "possibly" related lines to research based solely on DNA evidence. Researching these newly discovered potential relationships through traditional genealogical methods may result in locating supporting documentation and may be the key to getting past that brick wall.

The current DNA submissions have really shattered many of my most promising lines (which I have spent countless hours attempting to connect). Before the availability of DNA information, my most promising lines for connection to Ambler Casey were: 1) Abner Casey, 2) Jesse E. Casey, 3) Henson Casey and 4) John Casey (MO). After DNA submissions, here are the major changes: 1) Jesse E. Casey has obviously replaced Abner Casey as the best candidate - but both are still my best candidates. 2) Since we have hit the brick wall on Abner Casey, the Hanvey line could open new doors for more connections to Casey lines that remained longer in South Carolina. 3) Although Henson Casey lived in Roane County, TN where Ambler Casey lived, the speculative connection as a son of Ambler Casey is now not possible. 4) Since John Casey (MO) resided in McMinn County, Tennessee during the same time as Ambler Casey, John Casey (MO) "was" another good candidate - DNA documentation really discounts this connection now. 5) With an exact 67 marker match with Jesse E. Casey, I should now prioritize research on this line above all others. These are significant changes in focus for my Casey research.

How many markers should be analyzed

So how many DNA markers should one submit to be useful and which of the existing submissions should have additional markers analyzed ? For the SC and TN cluster, all new submissions should be either 37 or 67 markers. For all other lines, all new submissions should be 37 markers (unfortunately, Family Tree DNA has dropped their 25 marker option which would have been sufficient). The 12 marker test does not have enough information to be useful for the project. Also, it is not desirable to have two submissions from the same line - unless they are from different sons of the oldest known proven ancestor. If two submissions from two sons of the oldest known ancestor have two different marker sets, then another submission from a third son would be required.

With the recent match of Arvle Casey and John Casey (SC), 67 markers are now required to separate these two lines. For other submissions in this group, it may turn out later that all the markers from 38 to 67 may just help separate the descendants of our oldest proven ancestors (not very useful for adding another generation to the pedigree chart). Or as the sample size grows, we may later learn that additional markers are required to separate many of these lines. To date, there are five 67 marker submissions in this group; unfortunately, all have the same marker values from 38 to 67. Submissions in the SC and TN Casey cluster will always have the most to benefit from expanding 37 markers to 67 markers. At this point in time, it is probably not necessary for other Casey lines to upgrade to 67 markers. The other Casey lines (outside the SC and TN Casey lines) need to concentrate on obtaining new submissions or encouraging others with fewer than 37 markers to upgrade their submissions to 37 markers. This is harder to accomplish since it is much easier just to order your own upgrade.

So how many potential oldest Casey ancestors originate from South Carolina ? The 1790 census of South Carolina has 47 males that make good candidates and all but six are from Spartanburg County or Newberry County. With only one unique marker for the first 25 marker positions, this leaves only ten markers available to separate these lines from a list of 47 possible ancestors. It appears that the SC and TN Casey lines will probably need 67 marker sets in order to provide separation. It appears that the Casey lines grouped in the Irish Cluster have now been established as a second cluster of Casey ancestors and may eventually need 67 marker submissions. The submissions in this grouping are not that distantly related, so additional submissions that fall into this group could be informative for researchers of this group in the future.
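To see why the number of markers matters, here is a minimal Python sketch that counts how many markers differ between two lines when only the first 12, 25, 37 or 67 markers are compared. The marker values are synthetic and the function names are my own. The count is a plain mismatch count, which is an assumption on my part; a testing company may weigh multi-step differences differently.

```python
PANEL_SIZES = (12, 25, 37, 67)


def genetic_distance(a, b, n_markers):
    """Count positions that differ among the first n_markers markers."""
    n = min(n_markers, len(a), len(b))
    return sum(1 for x, y in zip(a[:n], b[:n]) if x != y)


def distance_by_panel(a, b):
    """Distance at each panel size that both submissions actually cover."""
    covered = min(len(a), len(b))
    return {n: genetic_distance(a, b, n) for n in PANEL_SIZES if n <= covered}


# Synthetic 37 marker results: identical except at markers 30 and 35.
line_a = [13] * 37
line_b = list(line_a)
line_b[29] = 14   # marker 30 (index 29)
line_b[34] = 12   # marker 35 (index 34)

print(distance_by_panel(line_a, line_b))
```

In this made-up example the two lines look identical at 12 and 25 markers and only separate at 37 markers, which is the situation described above for the SC and TN lines.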

As time passes by, many submittors may become no longer interested in paying the premium to have their sample analyzed for additional markers. Eventually, these samples will become unviable to analyze. The person supporting the analysis could also die or become incapacitated with the children potentially showing no interest in this project. For the vast majority of cases, the exposure to losing valuable DNA documentation will probably not be of great concern as most lines have many living male descendants of any particular son of an oldest proven ancestor. If there are numerous living male descendants, then there will remain many others to assist in the future. However, if you are the only surviving male of your line, it is very important that you submit as many markers as are currently available (currently 67 markers from the company analyzing our samples for this project). My great grandfather, William Martin Shelton (born 1847), had seven daughters and only one son. This son produced only one grandson who died as a teenager in 1928. Therefore, there are no male descendants of this Shelton line that can be tested for the Shelton DNA project even though there are around 400 living descendants (all descending from daughters born with the Shelton name at some point).

So who should we encourage to submit additional samples that would benefit this project ? There are three broad categories of submissions that should be sought in the near term. Once other submissions are analyzed, there will surely be new items of interest. First, for all the current submissions, we should encourage male descendants of at least two sons of our oldest proven ancestors to submit DNA samples. This helps us determine where the uniqueness of each marker set begins. It also provides more evidence connecting these sons to their oldest proven ancestor. Second, everyone has their favorite candidates for possible connection to their lines. Your hunch (supported by traditional genealogical research) can be either dismissed by DNA evidence or further strengthened by DNA evidence. We must have more submissions from possible related candidates to make any progress on which lines are worthy of additional research. Third, we need wider participation of all Casey lines to determine those big surprises and possible connections that we all have missed to date. As the current submissions confirm, the Casey surname is a relatively common surname with dramatically different DNA backgrounds. Only larger sample sizes (more submissions) can reveal where other clusters will form.

For the SC and TN lines, we need more submissions from other SC Casey lines in order to understand how these lines are connected to the SC and TN lines that have been submitted to date. The biggest surprise for me to date is how closely related the TN lines are to the first SC submission with 37 markers. The John Casey (SC) line could be just one SC line that is closely related to the other TN lines - or are all SC Casey lines more closely related to the TN lines than traditional research has shown to date ? We also need more unconnected TN/AR/MO lines to determine which lines are closely related and which are wild goose chases for connection to this cluster. We will discover that certain lines are very closely related which will allow researchers of these lines to properly focus their research on these promising lines. We also will need to identify those lines that are not closely related. The descendants of those lines might avoid spending many additional unfruitful hours attempting to connect these more remotely related lines.

Here are some specific recommendations for additional submissions for existing lines. First, we need additional samples for other sons of the four main SC and TN lines covered to date. If I got any of these sons incorrectly listed, please let me know so I can update/correct this list (I intentionally left out some possible sons where the connections are very weak). I would prefer to put these weak connections into the category of possibly related lines of interest.

_____Ambler Casey
__________Moses Casey (need one more)
__________John Casey (have)
__________Ellison Casey (need one more)
_____Henson Casey - (have two)
__________Jackson Casey (have, sibling is different)
__________Arvle Casey (have, sibling is different)
__________Other sons (need one more since different)
_____Abner Casey
__________Turner Casey (need one more)
__________Pleasant Casey (have two)
_______________Pleasant Casey, II (have, sibling is same)
_______________Elsberry Casey (have, sibling is same)
__________Abner Casey (need one more)
__________Jesse Casey (need one more)
_____Jesse E. Casey
__________Steven Casey (need one more)
__________Elijah Casey (need one more)
__________Anthony Casey (have)
__________Levi Casey (need one more)
__________Ambler Casey (need one more)
__________Jesse Casey (need one more)
__________Wesley Casey (need one more)
_____John Casey (MO)
__________Levi Casey (have)
__________John Allen Casey (need one more)
_____John Casey (SC)
__________Abner Casey (need one more)
__________Thomas Casey (need one more)
__________Samuel Casey (need one more)
__________John Casey (have)
__________Henry Casey (need one more)
_____James Hill Casey
__________James Casey (need one more)
__________Hugh Casey (need one more)
__________Willis Casey (have)
__________Allen Casey (need one more)
__________John Casey (need one more)
__________Newton Casey (need one more)
__________Andrew Casey (need one more)


Every Casey researcher needs to determine which sons of oldest proven ancestors may have very few living male descendants and are exposed to having the line "die out" by producing no living "all male" descendants. For the descendants of Ambler Casey, Henson Casey and Jesse E. Casey, I have reviewed all the known male descendants to estimate the number of potential submittors and the exposure for these lines to die out:
For the Ambler Casey line
Moses Casey (need)
3 sons, 20 gsons, 37 ggsons, 22 3Gsons, 10 4Gsons, 2 5GSons (33 male descendants born after 1900). Almost no exposure for this line to die out and substantial information concerning living male descendants.
John Casey (my line)
7 sons, 25 gsons, 43 ggsons, 41 3Gsons, 24 4Gsons, 8 5GSons (76 male descendants born after 1900). Almost no exposure for this line to die out and substantial information concerning living male descendants.
Ellison Casey (need)
5 sons, 2 gsons, 7 ggsons and 2 gggsons (only two known male descendants that were born after 1900). Probably minimal exposure for this line to die out but very little knowledge of living male descendants. The connection of Levi Casey is pretty weak and Levi Casey could be a son of Ambler Casey, therefore, this submission would greatly benefit the Ambler Casey line and could greatly help the descendants of Levi Casey break through that brick wall on this particular line.



For the Henson Casey line
16 sons, 30 gsons, 18 ggsons, 5 3Gsons, 1 4Gsons, 0 5GSons (37 male descendants born after 1900). Almost no exposure for this line to die out and reasonable information concerning living male descendants.


For the descendants of Jesse E. Casey, I have reviewed all the known "all male" descendants to estimate the number of potential submittors and the exposure for these lines to die out (this is based on the 2000 version of Vonda Dihm's book). There is considerable exposure on several sons of Jesse E. Casey to die out (or they may have already died out). However, there are seven sons to choose from (unlike the Ambler Casey line where we do not even know the names of several of his sons):
Jesse E. Casey
Stephen Casey (need)
6 sons, 18 gsons, 29 ggsons, 15 3Gsons, 7 4Gsons, 0 5GSons. Almost no exposure for this line to die out and reasonable information concerning living male descendants.
Elijah Casey (need)
6 sons, 3 gsons, 4 ggsons, 8 3Gsons, 2 4Gsons, 1 5GSon. Little exposure for this line to die out but not much information concerning living male descendants.
Anthony Casey (have)
3 sons, 25 gsons, 34 ggsons, 43 gggsons, 23 3Gsons, 4 4Gsons, 0 5GSon. Almost no exposure for this line to die out and considerable information concerning living male descendants.
Levi Casey (need)
6 sons, 8 gsons, 0 ggsons, 0 3Gsons, 0 4Gsons, 0 5GSons. Moderate exposure for this line to die out and no information concerning living male descendants.
Ambler Casey (need)
2 sons, 0 gsons, 0 ggsons, 0 3Gsons, 0 4Gsons, 0 5GSons. Very high exposure for this line to die out or this line could have already died out 100 years ago. No information concerning living male descendants.
Jesse Casey (need)
11 sons, 24 gsons, 22 ggsons, 15 3Gsons, 4 4Gsons, 0 5GSons. Almost no exposure for this line to die out and reasonable information concerning living male descendants.
Ambler Casey (need)
2 sons, 0 gsons, 0 ggsons, 0 3Gsons, 0 4Gsons, 0 5GSons. Very high exposure for this line to die out or this line could have already died out 100 years ago. No information concerning living male descendants.
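One rough way to order these lines by exposure is to look at how far down the known "all male" descendants reach. The Python sketch below does that for five of the sons of Jesse E. Casey, using the counts listed above. The ranking rule is only my own simple heuristic and the function names are hypothetical; a shallow count may mean the line died out, or merely that the later generations are not yet documented.

```python
GENERATIONS = ["sons", "gsons", "ggsons", "3Gsons", "4Gsons", "5Gsons"]

# Known male descendants per generation, copied from the list above.
known_males = {
    "Stephen Casey": [6, 18, 29, 15, 7, 0],
    "Elijah Casey": [6, 3, 4, 8, 2, 1],
    "Levi Casey": [6, 8, 0, 0, 0, 0],
    "Ambler Casey": [2, 0, 0, 0, 0, 0],
    "Jesse Casey": [11, 24, 22, 15, 4, 0],
}


def deepest_generation(counts):
    """Index of the last generation with at least one known male, or -1."""
    deepest = -1
    for i, n in enumerate(counts):
        if n > 0:
            deepest = i
    return deepest


def rank_by_exposure(lines):
    """Shallowest documented male line first; ties go to fewer known males."""
    def key(item):
        name, counts = item
        d = deepest_generation(counts)
        return (d, counts[d] if d >= 0 else 0, name)
    return [name for name, _ in sorted(lines.items(), key=key)]


for name in rank_by_exposure(known_males):
    d = deepest_generation(known_males[name])
    print(f"{name}: known males reach the {GENERATIONS[d]} generation")
```

Lines near the top of the printed list are the ones where finding a living male descendant for a DNA submission is most urgent.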


We all need to openly discuss what we think are the best candidates to be related to these TN and SC lines as well as the other Casey lines that have been submitted to date. Traditionally, many of us avoided sharing this kind of speculation because many novices to genealogy tend to convert this speculation into fact. However, we also are not sharing this valuable insight that we have developed over many years of research. People visiting the Casey DNA web site need input and encouragement on which lines are important to this project at this point in time. As my web site might imply, here are some additional Casey lines that my instinct tells me are good candidates for the SC and TN lines:

_____James Casey
__________James Casey (need)
__________Sterling Casey (need)
__________Samuel Casey (need)
_____Mrs. Easter Casey
__________William Casey (need)
__________Abner Casey (need)
_____Ambler Casey (born 1832)
__________Only 4 daughters known
__________(this male line probably died out)


Mrs. Easter Casey has Fulton County, AR connections and my gut feeling says her family is related. Please let me know your specific lines of interest (especially on the SC lines where others must help me identify other lines of interest). The unconnected Ambler Casey (born 1832) is an obvious candidate to be one of the missing sons of Ambler Casey (TN), however, this line appears to have no living male descendants.

Where are our submissions for widely known SC and TN lines such as Randolph Casey, Christopher Casey and General Levi Casey ? I will be glad to add other candidates for anyone who wants to present their speculation for possible connections. I decided to look at previous Casey publications that cover these SC and TN Casey lines. From the Walter E. Casey book, the George and Abner Casey manuscript and some early SC probate records, here are some other good candidates:

_____Christopher Casey (1960s manuscript)
__________John Casey (have)
__________Aaron Casey (need)
__________Hardin Casey (need)
_____Aaron Casey (1960s manuscript)
__________Abner Casey (have)
__________Jesse Casey (have)
__________Alexander Casey (need)
__________Anthony Casey (need)
__________Uriah Casey (need)
_____Aaron Casey (two probate records)
__________William Casey (need two)
__________Moses Casey (need two)
__________James Casey (need two)
__________Levi Casey (need two)
_____Randolph Casey (probate records)
__________Levi Casey (need two)
__________Randolph Casey (need two)
__________Isaac Casey (need two)
__________Abraham P. Casey (need two)
__________Samuel Casey (need two)
__________Hiram Casey (need two)
__________Zadock Casey (need two)
_____Levi Casey (1960 DAR article)
__________John Casey (need two)
__________Levi Casey (need two)
__________Jacob D. Casey (need two)
__________Samuel O. Casey (need two)

I know most of you are doing the same as me - pulling out your files and going to the Internet to get the basic information for these Casey lines. The official Casey DNA web site allows the submission of your pedigree via a GEDCOM file, please submit these files and provide this project with this critical information. This web site also provides the ancestry of each DNA submission when known. Please send in any additions and corrections to information included to date.

Analysis using Cladogram software

Cladograms are graphical representations of the marker mutations between individuals. These charts can quickly determine the closeness of relationships between various lines. The cladogram charts were created using a free phylogenetic network software program offered by Fluxus Engineering:

More information about free cladogram software

Unfortunately, this free software is not for the faint of heart and is fairly difficult to use. Also, the connections presented can be misleading at times but these charts are absolutely wonderful for determining clusters and groupings of various DNA submissions. This program determines the simplest configuration which has the least number of interconnections or mutations. For excellent examples of what cladograms can do for this project in the future, refer to the Mumma DNA web site. The Mumma line is related to my wife's Garver line and was one of the first genealogical DNA projects. The Mumma DNA project is pretty far along in its collection of DNA submissions and has gained useful information through the availability of DNA information. This surname is relatively uncommon which makes the project much more useful with many fewer samples. Their web site has a great presentation of their tables and shows the usefulness of cladograms:

Mumma DNA Web Site
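The idea of connecting submissions with the fewest total mutations can be illustrated with a small Python sketch. Please note that this is a simplified stand-in: it builds a minimum spanning tree with Prim's algorithm over plain mismatch counts, which is not the network method the Fluxus program itself uses. The line names, marker values and function names are all made up for the example.

```python
def mismatch(a, b):
    """Number of marker positions where two submissions differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def minimum_spanning_tree(haplotypes):
    """Prim's algorithm: returns a list of (line, line, distance) edges."""
    names = sorted(haplotypes)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        best = None
        for u in sorted(in_tree):
            for v in names:
                if v in in_tree:
                    continue
                d = mismatch(haplotypes[u], haplotypes[v])
                # Keep the cheapest edge that reaches a line not yet connected.
                if best is None or d < best[2]:
                    best = (u, v, d)
        edges.append(best)
        in_tree.add(best[1])
    return edges


# Synthetic four marker results for four hypothetical lines.
lines = {
    "A": (13, 24, 14, 11),
    "B": (13, 24, 14, 12),   # one mutation from A
    "C": (13, 24, 15, 12),   # one mutation from B
    "D": (14, 25, 16, 10),   # remote from all the others
}

edges = minimum_spanning_tree(lines)
for u, v, d in edges:
    print(f"{u} - {v}: {d} mutation(s)")
```

Lines joined by short edges form a cluster, while a line that only attaches through a long edge (like D here) is the kind of remote grouping discussed below.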

The first cladogram includes all unique submissions with 37 markers and reveals two items of interest: 1) It implies that John Casey (SC) may have a marker set that represents the earlier Casey line as its marker mutation is closer to all other Casey submissions than the Pleasant (Abner) Casey submission. 2) It also reveals that there are two true clusters and two very preliminary groupings that are possible (these groupings are so remotely related that they probably do not currently warrant separation - other than to make it easier to analyze the current DNA summary table).



37 marker Cladogram (PDF)



The next cladogram includes all unique submissions with 25 markers (only two submissions were added). This chart revealed three points of interest: 1) As we already knew, the SC & TN lines require 37 markers to be useful. All 37 marker submissions were lumped together (one big circle) and only the James Hill Casey line has a unique marker. 2) The cluster of Michael Casey (Ireland), Daniel Casey (Ireland) and Dennis Casey (Ireland) loses many of the unique markers found with markers 26 through 37. 3) The other 25 marker submission, the Daniel Casey (VA) line, could be attached to two groups and was attached to the group with the fewest mutations. The expansion to 37 markers would make it easier to determine which group this line really belongs to. Since 25 markers can no longer be submitted, this chart will not be updated with new 37 marker submissions.



25 marker Cladogram (PDF)



The last cladogram includes all unique submmissions with only 12 markers (only three submissions were added). This chart revealed four minor points of interest: 1) The Daniel Casey (Ireland) line, Michael Casey (Ireland) line and Dennis Casey (Ireland) line all merged with a common set of markers. This suggests that this group of lines is a true cluster. 2) For the first new line added, Patrick Casey (Ireland) has a unique four marker mutation from all other lines. After this unique string of four unique mutation, it only takes one additional mutation to connect to six other lines and two mutations to connect to the Pleasant Casey line where five other lines are lumped together. Clearly 12 DNA markers are not sufficient to properly determine which of these 11 lines are more closely related to the Patrick (Ireland) line. 3) For the second new line added, John Casey (NY) has two unique marker mutations separating this line from other lines. With only two marker mutations, this line shows relationships to four lines and with three marker mutations is not far from the Pleasant Casey grouping of five more lines. Again, it is clear that 12 markers is just not enough to determine what other lines look promissing as possible relatives. 4) For the third and last new line added, James C. Casey also has two unique mutations from other lines. With only two mutations, it has possible relatives to six lines and with three mutations, the Pleasant Casey line adds another five lines. Again, it is clear that 12 markers is just not enough to determine what other lines look promissing as possible relatives. With the analysis of the three new 12 marker lines, it highly recommended that any future DNA submission should be at least 37 markers (since 25 markers are no longer available) as 12 markers do not provide enough markers to draw any conclusions about connections to other lines. 
It is doubtful that any new 12 marker submissions will be submitted, therefore this chart will not be updated with new 37 marker submissions.
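
The mutation counts used in these cladogram comparisons can be illustrated with a small sketch. The Python below is only an illustration. The function name, the choice of a stepwise count (each repeat difference counts as one mutation) and the four marker values are assumptions made for the example. They are not values from the Casey submissions and not necessarily the method used by the cladogram software.

```python
def genetic_distance(line_a, line_b):
    """Count stepwise mutations between two marker sets.

    Only markers present in both sets are compared, so a 12 marker
    submission can still be compared against a 37 marker submission.
    """
    shared = line_a.keys() & line_b.keys()
    return sum(abs(line_a[m] - line_b[m]) for m in shared)


# Hypothetical repeat counts, for illustration only.
line_a = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
line_b = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 10}

distance = genetic_distance(line_a, line_b)
print(distance)  # 2
```

With only 12 markers compared, two or three mutations can place a line next to many other lines at once, which is why the 12 marker chart cannot rank the possible relatives.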



12 marker Cladogram (PDF)



Based on the 37 marker cladogram, I separated all the Casey submissions into five clusters (only two are true clusters at this point; the other clusters were created solely to make the summary chart more readable). I also made a baseline DNA marker set based on the closest mutation split for each of the remote groups (where remote lines joined into a common mutation list with respect to other lines). I also changed the background color of mutated markers to clearly show the deviations from the baseline. The next section compares the four baselines to show how each baseline deviates from the other groupings. With the recent addition of several 67 marker upgrades, those markers are shown on page 2 of the "DNA Summary Table."





DNA Summary Table (PDF)

This table makes it pretty clear which existing submissions would benefit from adding more markers. First, the John Casey (NY) and James Charles Casey lines each have only two mutations from their baseline. These two submissions would benefit from expanding to 37 markers. Expanding to 67 markers would not really be necessary at this point in time. The Daniel Casey (VA) line clearly does not need to upgrade to 37 markers (unless this line does not want to be the only line without 37 markers). The SC and TN lines clearly have the most to benefit from 67 markers. With the new exact 37 marker match between Arvle Casey and John Casey (SC), this DNA project has one pair of unrelated lines with no mutation differences to separate these lines. The John Casey (SC) and Arvle Casey lines should upgrade to 67 markers in order to separate these lines or determine an even closer relationship. The Ambler Casey line and the Jesse E. Casey line are now an exact 67 marker match, indicating an even closer relationship than the 37 marker match implied (it will now take a future 100 marker upgrade to separate these lines).

This chart also makes a graphical case for using John Casey (SC) as the baseline DNA marker set instead of the Abner / Pleasant Casey submissions. By looking at marker 460, you can clearly see that all SC and TN Casey lines are either 460 = 12 or 460 = 13. All other Casey submissions are either 460 = 10 or 460 = 11. The MRCA utility implies that the SC and TN Casey Cluster and the Irish Cluster could share a common Casey ancestor, therefore, it makes more sense to make the John Casey (SC) marker set the baseline marker set for this cluster. The last observation is that there are a lot of mutations between most of the other Casey lines. This means that most of these lines are very remotely related to each other and that the Casey surname is quite common, requiring many more submissions in order to start connecting lines together. In fact, there are so many mutations that it appears that the earliest common ancestor of some of these lines lived before surnames were even used. It is possible that several of these Casey lines may have had clan ties and may have taken a surname based on their clan leader vs. having family ties. Of course, early clans were closely related as well. A few of these lines could also be NPE events. If these lines still remain very isolated once additional submissions are received, other surnames should be investigated for possible NPE connections.
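
The marker 460 split described above can be sketched in a few lines of Python. This is only an illustration. The SC and TN values are taken from my reading of the descriptions in this analysis (12 for the John Casey (SC) and Arvle Casey lines, 13 for the others), the two "Other line" entries are hypothetical placeholders, and the function name is my own.

```python
# Marker 460 values for the SC and TN lines as described in this analysis.
# "Other line A" and "Other line B" are hypothetical placeholders.
marker_460 = {
    "John Casey (SC)": 12,
    "Arvle Casey": 12,
    "Pleasant Casey": 13,
    "Ambler Casey": 13,
    "Jesse E. Casey": 13,
    "Jackson Casey": 13,
    "Other line A": 10,
    "Other line B": 11,
}


def split_by_marker_460(values):
    """Separate lines with 460 = 12 or 13 from lines with 460 = 10 or 11."""
    sc_tn = sorted(name for name, v in values.items() if v in (12, 13))
    others = sorted(name for name, v in values.items() if v in (10, 11))
    return sc_tn, others


sc_tn_lines, other_lines = split_by_marker_460(marker_460)
print("460 = 12 or 13:", sc_tn_lines)
print("460 = 10 or 11:", other_lines)
```

A single marker is a very coarse filter, so this only shows why marker 460 stands out in the summary table. It does not replace the full cladogram comparison.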

DNA Descendancy Chart (SC & TN)

The next charts are an attempt to create DNA based descendancy charts. They show what is probably the most likely scenario but there are obvious variations. With any DNA chart, the marker set could easily be passed down several generations with no mutations. One big surprise in this chart is that both Abner Casey and his father must have had no mutations for the chart to make sense. This chart also implies that the testing of the sons of Abner Casey, Jesse E. Casey and Ambler Casey will not help us find any additional generations (just confirm that current submissions are not recent DNA mutations). This chart implies that more submissions of unrelated SC and TN lines are key. Also, the Arvle Casey line appears to be the source of a backwards mutation of 460 from 13 back to 12. Another alternative for Arvle Casey would be no mutation and Jackson Casey had two mutations (add 460 from 12 to 13). This makes the Henson Casey line more closely related to the John Casey (SC) line than other lines that were once believed to be closely related.

The DNA Descendancy Chart implies that another son of Abner Casey will most likely be the same as the Pleasant Casey submissions. There are really two scenarios that would produce a different marker set: 1) A new mutation in some descendant of the second son being submitted (not very genealogically significant); 2) The second son may not have the mutation of 460 (12 to 13). This would imply this was a mutation of the Pleasant Casey line. This would also elevate the Abner Casey line to the same level as John Casey (SC) in the DNA Descendancy Chart. This is really pretty low odds since other Casey lines (Jesse E. Casey, Ambler Casey and part of the Henson Casey line) share this mutation which will probably separate this cluster into two subclusters in the future. If more SC lines were submitted, this would verify that the 460 mutation is the best candidate to separate this cluster into two subclusters. These would be subclusters, not new clusters, as by definition clusters are closely related and subclusters just allow a large number of submissions to be broken up into more manageable groupings.

The DNA Descendancy Chart implies that another son of either Ambler Casey or Jesse E. Casey will probably not help solve the connection between these lines. However, CDYb appears to be a second significant DNA mutation. This separates another subcluster (Ambler Casey and Jesse E. Casey lines) from other lines in this cluster. The mutation of the 607 marker was probably introduced with the birth of Jackson Casey (or his male descendants). This mutation is probably not genealogically significant. Since Arvle Casey (Jackson's brother) does not have this mutation, descendants of Jackson Casey can claim this mutation as unique to the Jackson Casey line. The backwards mutation of the 460 marker, which is only one possibility for the sons of Henson Casey, presents a unique challenge. This mutation is shared with all other non-related TN lines (Abner Casey, Jesse E. Casey and Ambler Casey). A third son of Henson Casey is needed to determine if Jackson Casey is a double mutation and Arvle Casey will be elevated to the same level as John Casey (SC). This is the most likely scenario. However, another reasonable probability can be assigned to Jackson Casey having only the 607 mutation and Arvle Casey having a backwards "deletion" mutation of 460 back to 12.

Possible DNA Descendancy Chart - EXCEL Version (Irish Cluster related)

Possible DNA Descendancy Chart - EXCEL Version (Majority Rules)

Non Paternity Events vs. Overlapping Haplotypes



This section is speculative in nature. I have not seen these issues addressed in depth at other DNA web sites and they are not well covered in publications on DNA used for genealogy. However, I found the issue covered by Mark Jobling in the June, 2001 issue of Trends in Genetics (the internet is wonderful at times). I am now convinced that non-Casey surnames found in the SC and TN cluster are very good candidates for NPE events. This is based on the fact that only one of the ten individuals that are closely related does not have the surname Casey. When the vast majority of submissions that are closely related have the same surname, it would be much less likely that other surnames randomly crossed genetic paths (known as overlapping haplotypes). For the Irish Cluster, the opposite is true. The Ysearch database shows that there are 23 individuals within five mutation points of the baseline for the Irish Cluster (I used mutation point mu2 from the 37 marker cladogram for the oldest ancestor of this group). Only two out of 23 submissions in this listing were Caseys. For this cluster's haplogroup (the very common R1b1), overlapping haplotypes are not unusual. There are two Casey, two Forbes, two Butler and two McGraw submissions plus a myriad of other surnames: Harvey, McLain, Ramsey, Brooks, Peppers, Hart, Bryan, Anderson, Hogan, Cummings, Iron, Crow, Blair and McGrath. This implies that most of these haplotypes are probably not related via NPE scenarios but are most likely overlapping haplotypes of a common haplogroup (R1b1). This will make sorting out NPEs in this cluster much more difficult. Another interesting piece of information from this Ysearch database is the origins of these individuals: Ireland (9), Scotland (6), US (5) and Unknown (3).
This implies that our Casey origins may be: a) Scottish ancestors who migrated to Ireland then to the US (this may explain why most of the Casey lines are not Catholic); b) Irish lines that migrated to Scotland (probably a much lower probability but possible).
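
The contrast between the two clusters is easier to see as simple arithmetic. The sketch below uses only the counts quoted above (nine of ten close matches named Casey in the SC and TN cluster, two of 23 in the Irish Cluster listing, and the origin tally). The function and variable names are my own and are just for illustration.

```python
def surname_share(same_surname, total):
    """Fraction of close matches that carry the same surname."""
    return same_surname / total


# Counts quoted in the text above.
sc_tn_share = surname_share(9, 10)   # SC and TN cluster
irish_share = surname_share(2, 23)   # Irish Cluster (Ysearch listing)

# Origins reported for the 23 Ysearch matches.
origins = {"Ireland": 9, "Scotland": 6, "US": 5, "Unknown": 3}
total_matches = sum(origins.values())

print(f"SC and TN cluster: {sc_tn_share:.0%} share the Casey surname")
print(f"Irish Cluster:     {irish_share:.1%} share the Casey surname")
print(f"Ysearch matches counted by origin: {total_matches}")
```

A share near 90 percent is what makes the single non-Casey match in the SC and TN cluster look like an NPE, while a share under 10 percent is what makes overlapping haplotypes the more likely explanation for the Irish Cluster.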

There are three reasons for genetically diverse events that seem to be evolving: 1) Non paternity events (NPEs) are where real families and biology diverge. These are adoptions and out of wedlock children. I contend that the Hanvey line is the only valid candidate at this point in time. Only the SC and TN cluster will have a higher probability of NPE events at this point in time. 2) The rarely covered topic is nature's ability for different genetic lines to randomly cross paths (overlapping haplotypes). The Irish Cluster seems to exhibit this variation. When overlapping haplotypes exist, it will be much harder to determine NPE events as haplotypes will not always be related in this scenario. 3) Another topic that was not well understood before DNA is that having the same surname does not imply any genetic relationship as we once thought. Only very uncommon surnames will have one common ancestor. More common names can have ten to one hundred unique unrelated ancestors who shared nothing but the same surname when surnames were first adopted by most Europeans only a short genetic time ago. These three issues can be confused with each other, and not correctly identifying the sources of these issues can lead to incorrect conclusions. For the Irish Cluster, upgrading to 67 markers may provide more uniqueness between these overlapping haplotypes. However, upgrades to 67 markers for the SC & TN Cluster have not proven very fruitful to date.

Non paternity events are bound to happen as young adults can die prematurely (adoption) or can have temporary biological relationships that do not result in formal legal relationships. There are a lot of variations of these issues that can have many different results. For instance, even adoptions can really lead us astray as it is pretty common for the extended family (same surname) to adopt children of cousins and nephews of young adults who die prematurely or are not able to take on the responsibility of child raising. These closely related adoptions could be very difficult to sort out. Non paternity events can happen in two directions as well. Casey boys can be adopted by non-Casey families (other surnames now truly have Casey DNA being passed down via Y chromosomes). Also, non-Casey boys are adopted into the Casey lines and we can get some pretty diverse DNA being introduced into the Casey surname pool (when in reality their DNA biologically belongs to another surname). This really allows two versions of genealogical ancestry to emerge (familial and biological - both are important). If a male infant was adopted and raised by a Casey family, the Casey environmental influence on this child could be stronger than the biological influence due to DNA. Researching two different ancestral trees will become more common with the aid of DNA documentation.

I think that the topic of overlapping haplotypes is not widely understood to date and has been avoided as it is difficult to understand (and prove with accuracy). With only 37 and 67 markers available and most markers with only four or five common variations each, you just do not have enough markers to avoid distantly related individuals randomly mutating back across various lines (both with the same surname or different surnames). For the SC & TN Casey lines, these Casey lines appear to have unique haplotypes which puts a DNA fingerprint on these lines (until new submissions prove otherwise). However, some Casey lines clearly have much closer matches to other surnames which could be NPE events of other surnames (most likely scenario). Some may have changed their surname to Casey for various reasons and some may not have been related when our ancestors first started using surnames. Who gets to lay claim to the Casey surname? None of us, of course. Unfortunately, clusters with lots of Casey descendants could bias us to believe that one group has a stronger claim to the Casey surname than others (we should avoid this bias at all costs). I really am warming up to the idea that we now should sort out our various Casey lines into many different genetic buckets. We already have done this with our paper research by sorting out various Casey lines by geography. I have known for some time that my focus should only be on Casey lines that have ties to SC and TN and DNA has proven this to be the correct strategy. Actually, I had given up hope on SC lines and DNA has significantly revived my interest in these lines.

Until DNA submissions became available to analyze, I had a very inaccurate assumption concerning the exact relationship of other Casey lines to my Casey line. I also assumed that Casey was an Irish name and that all Casey lines (except for NPE lines) would have a common ancestor back in the early days of Ireland. However, the genetic distance between many of the Casey lines has proven this not to be the case. I am now beginning to believe that there were 10 to 100 unique individuals that first used the Casey surname. Surnames were generally forced upon our ancestors by governments and early rulers in order to collect taxes and raise armies. With surnames like Smith or Brooks, the original assignment of surname was driven by trade or geography. For the first Irish individuals that were told to start using a surname, these individuals did not all get unique surnames and it is likely that geography played a significant role or that the clan that they belonged to played a large role. It is possible that their vocation could have played a role as well (clan leaders probably had unique names as may have soldiers, farmers and other vocations). The Casey surname has a military meaning (it means dart-armed chief in battle). Dart refers to knives (shorter than swords).

The surname Casey originated from the Gaelic name "O'Cathasaigh" around 1,000 years ago. The original meaning of the name was "dart-armed chief in battle" which implies our ancestors were probably soldiers. After the Anglo-Norman invasion, the name was "anglicized" to O'Casey and by the 1300's it was further "anglicized" to its present form of Casey. During the introduction of surnames, life was very brutal with constant warfare between neighboring clans. These clans became larger and larger in order to survive. These conflicts left many orphan sons who were adopted and raised by others on both sides. Other clan members would regularly adopt the young sons of fallen comrades and these sons may have taken on new surnames if they were very young. When large conflicts resulted in expanded control by one side, other clans adopted orphan children of the defeated side. Also, there were many orphans left because of widespread outbreaks of diseases and food shortages. Potato blight outbreaks caused severe food shortages and starvation for many Irish families. Even in these turbulent times, diseases and accidents resulted in the need to adopt orphaned sons and probably introduced many DNA varieties to the Casey surname.

Without any doubt, only two clusters have a very high probability of being genetically connected. At this point in time, only the SC & TN Casey cluster and Cluster 1 (recent Irish connections) have any possibility of being genetically connected. Even these two clusters are not "genealogically" significant to each other but may be "genetically" significant to each other. Cluster 2 has only a small chance of being "genetically" connected to other clusters / groupings and its connection is so distant that it will not be "genealogically" significant to current Casey researchers (Cluster 2 will not be tied to other clusters any time soon). Cluster 3 is now known not to be genetically connected to other clusters / groupings of Casey lines. The lines in Cluster 3 cannot have either "genealogically" or "genetically" significant connections to other clusters or groupings.

However, do not rule out those very low odds connections; your line may be in the lower two percent that had many more mutations than other lines. There easily could be ten to one hundred genetically distinct Casey lines. Only the addition of many more random Casey DNA submissions will start to reveal the actual number of "genetically" distinct lines. With around 1,000 years and 40 generations, most are probably due to NPE events over this long time period. The Casey surname was probably started by around twenty unrelated individuals and was supplemented by seventy NPE events spread out over literally 1,000 years. You might also be able to throw in another ten to twenty name changes to Casey as well. We have at least five genetically unique clusters to date out of only 21 submissions.

Call for better documentation

Raw DNA data without traditional genealogical research is not very useful. It is critical to have both the DNA marker sets and known information about the ancestry of these DNA submissions. The Pace DNA web site (one of my ancestors) has an excellent web page dedicated to providing significant genealogical information known about all of their DNA submittors. This information is conveniently made available to anyone of interest and saves redundant efforts of many people gathering what they know about these submissions for their own personal analysis.

Pace DNA web site's ancestry listings for submittors

The attached summary of ancestral listings is based on my knowledge of these lines and what is readily documented on the Internet. I may have made a couple of incorrect assumptions or have not included complete ancestries as my knowledge is quite limited on Casey lines outside the SC and TN cluster. I welcome additions and corrections to this listing as well as comments as to format and content. As I have more time, I will attempt to add more from emails and other web sites.

Ancestries of Casey DNA submissions

Please give me some feedback

I am an amateur DNA researcher, so I fully expect to be corrected on some of my conclusions but I will take the risk to be the first to publish an analysis of our Casey DNA submissions. For anyone wanting to analyze the Casey DNA submissions that we currently have, the only practical method is to use cladogram software which graphically displays the connections between the various DNA submissions. Also, the use of MRCA (Most Recent Common Ancestor) calculators is critical in determining the closeness of genetic relationships. Without these tools, it is extremely difficult to manually extract this information. I am very new to DNA analysis for genealogy and would appreciate comments on this DNA analysis. This analysis takes a lot of time and I would appreciate feedback on where you think I am going off track, where you think my analysis is on target, what information is missing that should be included, etc. There is quite a bit of redundancy in this analysis and I will try to reduce this in future updates to this analysis.

Please send your comments by email, letter or phone:


E-mail (new) ___________

______________________  email address changed to image to reduce my spam email



Snail mail______________  Robert B. Casey, 4705 Eby Lane, Austin, TX  78731-4507



Phone (home)__________  (512) 371-0579 (nights and weekends only)