Analysis of DNA for Casey lines
Last updated on September 3, 2007
By Robert Brooks Casey (Ambler Casey line)
This web site is the personal analysis of the Casey DNA database by
this Casey researcher, Robert Brooks Casey. This web site was created
to assist other Casey researchers in analyzing the current DNA
submissions and to provide input for what is required to allow this
project to produce genealogically significant information. This is
not the official Casey DNA web site. For additional information
concerning the Casey DNA project, refer to the official Casey DNA
web site or contact its coordinator (found at the official web site):
Official Casey DNA web site
Introduction to DNA for Genealogy
Information obtained from the Casey DNA Project
Detailed analysis of DNA submissions
If DNA tests were only $10 each
Good candidates for more DNA analysis
How many markers should be analyzed
Analysis using Cladogram software
DNA Descendancy Chart (SC & TN)
Non Paternity Events vs. Overlapping Haplotypes
Call for better documentation
Please give me some feedback
Introduction to DNA for Genealogy
The analysis of DNA markers provides a new opportunity for genealogists
to unravel their family history. This new tool is now producing
results that can take some of the guesswork out of adding
another ancestor to our family trees. Historically, our traditional
research is heavily influenced by geography, family naming patterns,
migration patterns, etc. This approach most often leads us to discover
new ancestors but it can also lead us in the wrong direction.
Your particular oldest proven ancestor may have broken away from his family
connections and traditions. Your oldest proven ancestor may not have
named his children after his older generation of relatives or may
have moved to new areas where no siblings or cousins lived, etc.
Analysis of DNA markers allows us to identify which Casey lines
look encouraging as potential relatives and can reduce unproductive
research on unrelated lines that ended up in the same county by chance.
The Casey surname is a fairly common name, therefore, Casey researchers
should expect to regularly encounter unrelated Casey families in the
same counties during the same time period.
Unfortunately, DNA research works best when tracing your "all male"
line of ancestors; this is basic biology that limits our genealogical
research. This biological fact limits male researchers to submitting their DNA for
their own surname only. For other surnames of your ancestors, you have to get
one of your male cousins born with the surname of interest to submit their
DNA sample and you can "sponsor" their submission (or assist them
with the expense of the DNA submission based on your mutual interest
in your common ancestor). There is also a DNA test for women that will
trace their "all female" lines as well. However, the markers of the
"all female" line mutate at such a slow rate that these tests are not
very useful for genealogical purposes. With European surname
practices for marriage, women of European descent take the surname
of their husband which results in a different surname for every generation.
This constant change of surname makes it more difficult to trace than
tracing one surname over many generations. Male DNA markers are like
probate records and census records, our best primary sources that
yield maximum information. Female DNA tests are equivalent to
tax records and property records: they produce fewer results but can
be very useful when census records and probate records fail to
produce results. We should search our "all male" DNA lines first
and then later supplement them with research of female DNA lines
when further analysis of male DNA lines can no longer yield enough
information to break through to another generation.
Information obtained from the Casey DNA Project
Currently, there are 23 submissions of DNA markers to analyze for the
Casey surname. Analysis implies that there are probably at least
five different clusters to date. The submission of the James
Casey (VA) line and both submissions of the Sinclair Casey lines
have less than a 1 in 10,000 chance of being related to other Casey
lines in the last 600 years (less than 0.01 % chance). The Irish
Cluster is probably distantly related to SC & TN Cluster in a
genetically significant timeframe (when surnames were first used)
but not in a genealogically significant timeframe (within three or
four generations of oldest proven ancestors). Genetically, these
two clusters are probably from a common ancestor that used the surname
Casey; however, this connection would probably be 200 or 300 years
prior to our earliest proven ancestors and therefore, these two clusters
will remain separated for DNA descendancy charts. The two lines
in Cluster 3 have around a 10 % chance of being related in the
last 600 years. There is a small probability of these lines being
genetically related in the last 600 years and these two lines are
currently grouped together in another cluster (probably will be
split after future submissions are received).
There are ten submissions that appear to be closely related to
my Casey line (Ambler Casey). This group of submissions
(with origins in South Carolina and Tennessee) presently has the
most to gain from this project due to early formation of a cluster
of ancestors being tied together with very similar DNA markers.
Even the two most distantly related individuals in this cluster
have a 96.90 % chance of being related in the last 500 years
(99.04 % chance in the last 600 years). This probability was
calculated using the FTDNATip utility from Family Tree DNA when
using 37 markers. Since both of these have 67 markers available,
this MRCA (Most Recent Common Ancestor) utility shows a 99.48 %
chance of a common ancestor in the last 500 years and 99.90 %
chance of having a common ancestor in the last 600 years.
To date, the John Casey (MO) line and the Jesse E. Casey (& Ambler
Casey) lines are the most distantly related lines (three
mutations apart). The MRCA utility actually shows this
probability for the last 20 (or 24) generations and I have assumed
25 years per generation (500 or 600 years). The DNA submission of Jackson Casey
was not included in the comparisons as the DNA
mutations of this submission must have come from male
descendants of Henson Casey rather than from Henson Casey himself. His brother,
Arvle Casey, has an exact 37 marker match with John Casey (SC) and
therefore, these mutations were not introduced by Henson Casey but
probably originated with Jackson Casey or one of his male descendants
(down to and including the submitter of the DNA).
It appears that several lines with recent ties to Ireland have
also formed a second cluster. Unfortunately, only about half of
the Casey lines with recent Irish origins belong to this second
cluster (the other half belong to other clusters). The MRCA
utility shows that the two most distantly related lines are the
Michael Casey (Ireland) line and the Daniel Casey (Ireland) line
which have a genetic distance (number of mutations) of seven.
These two lines currently have an 88.04 % chance of having a
common ancestor in the last 500 years (95.00 % chance in the
last 600 years). The MRCA utility is a good tool to
verify the formation of a cluster and these three lines are
definitely a second cluster. With the John Casey (NY) line
being upgraded to 37 markers, more information concerning
this cluster will soon be available for additional analysis.
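The idea of genetic distance used above can be illustrated with a short
sketch. This is not how the Family Tree DNA utility works internally; it
is a simple step-wise count of differences, and the marker values below
are made up for illustration (they are not taken from any Casey
submission).

```python
def genetic_distance(haplotype_a, haplotype_b):
    """Sum of absolute repeat-count differences over the markers both lines share."""
    shared = haplotype_a.keys() & haplotype_b.keys()
    return sum(abs(haplotype_a[m] - haplotype_b[m]) for m in shared)

# Hypothetical marker values, for illustration only.
line_x = {"393": 13, "390": 24, "460": 12, "CDYb": 38}
line_y = {"393": 13, "390": 25, "460": 13, "CDYb": 37}

print(genetic_distance(line_x, line_y))  # prints 3
```

Two lines that differ by one step on each of three markers come out
three mutations apart, while a line compared with itself has a distance
of zero.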
The utility can also assist in revealing if two clusters might
share a common ancestor that used the Casey surname. Being able
to determine the probability of two clusters being genetically
related is also very important in establishing the progenitor of each
cluster. If these clusters are closely related, they would share
a common ancestor whose DNA marker set is
somewhere between the two clusters. It appears that the
SC & TN cluster and the Irish cluster could share a common
ancestor in a genetically significant timeframe (in a
timeframe when surnames were first used).
The two most closely related individuals from these
two clusters have a 43.14 % chance of being related in the last
500 years (62.51 % in the last 600 years). It is believed that
these percentages are high enough to establish that there is a
reasonable chance that both lines descend from a very early
common ancestor that used the surname of Casey. The two most
distantly related lines are John Casey (MO) and Daniel Casey
(Ireland) which have a genetic distance of 13 (thirteen mutation
differences). Even these two lines have a 16.84 % chance of
a common ancestor in 500 years and 32.49 % chance in the last
600 years. Unfortunately, the MRCA utility does not allow
manual entry of DNA marker sets for comparison. It would be
much more accurate to guess at the
marker set of the common ancestor between these two clusters
(using a mutation point from the 37 marker cladogram chart).
This would give a better picture of how closely related these
clusters could be. For two clusters to share a common ancestor
with the same surname, it would be acceptable for the two most
closely related lines to have a 50 % chance in the last 600 years
and the two most distantly related lines to have over 20 % in
the last 600 years.
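The rule of thumb above can be written down as a small check. This is
only a sketch of my own guideline (the function name is made up), using
the 600 year percentages quoted above for the SC & TN Cluster and the
Irish Cluster.

```python
def clusters_may_share_surname_ancestor(closest_pct, farthest_pct,
                                        closest_min=50.0, farthest_min=20.0):
    """Rule of thumb: closest pair at 50 % or more and most distant pair over 20 % (600 years)."""
    return closest_pct >= closest_min and farthest_pct > farthest_min

# 600 year percentages quoted above for the SC & TN and Irish clusters.
print(clusters_may_share_surname_ancestor(62.51, 32.49))  # prints True
```

The thresholds are left as parameters because they are a judgment call
and not something the MRCA utility provides.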
A real surprise is that there are at least five major clusters
identified to date and only two of these clusters have a reasonable
chance of being genetically related (the SC & TN cluster and the
Irish cluster). The two lines in Cluster 3 only have around a 10 %
chance of a common ancestor in the last 600 years and will probably
evolve into two clusters. A big surprise is that this utility shows
that Clusters 4 and 5 are genetically different from all other Casey
clusters. Both Clusters 4 and 5 have a 0.00 percent chance of being related
to each other in the last 600 years (must be less than 0.01 % probability
or less than one out of 10,000 chance) and both lines also have no
chance of being related to the closest cluster, the Irish Cluster.
There are many possible explanations for this great genetic
distance: 1) There could be many unrelated men that first used
the surname Casey (or its earlier forms); 2) Some of these lines
could have changed their surnames from other surnames
to the Casey surname; 3) Some will be NPEs (Non-Paternity Events);
4) Numerous other remote reasons are possible.
The two lines covered in Cluster 3 are also very distantly related
and have only a 3.55 % chance of a common ancestor in the last 500
years (9.81 % chance in the last 600 years). This indicates a
remote chance that this could be a cluster but more likely
implies that these two lines are probably genetically different
lines as well (or form more than one real cluster). The other lines
in Cluster 3 with fewer than 37 markers appear to be even more distantly
related. I have always known that surnames based on trade or
geographical terms would have great genetic diversity but was
a little surprised that the Casey surname also falls into
this category as well. In fact, the Christopher Casey line
has higher odds of being related to the John Casey (SC) line
as these lines have 8.8 % chance of being related in the last
500 years (20.96 % chance in the last 600
years). This is a bit of a letdown as I had always falsely
believed that most of these Casey lines were related somewhere
in the distant past but genetic information tends
to shatter this assumption. Also, when you really think about it,
there will probably be a lot of name changes and NPE events over
20 to 25 generations where each male ancestor had several sons each
generation. When the odds of being related are this low in a 600 year timeframe,
it is very obvious that these lines will never be genealogically
significant to each other (in a timeframe reasonably close to our
oldest proven ancestors).
The surname "Casey" is a fairly common surname (364th most common surname
according to the 1964 Social Security survey of surnames in the book,
"American Surnames," by Elsdon C. Smith). There are an estimated
150,000 individuals with the "Casey" surname in the United States
and at least one hundred Casey men that have been declared the oldest
ancestor of numerous Casey lines. With 75,000 Casey men living in the
United States today, the chances of NPE events and name changes
are quite high and each passing generation will create genetically
"new" Casey lines. The highest priority of this project is to
greatly increase the number of submissions that are "not" known
to be related to the current oldest proven ancestors.
The descendants of Cluster 3, 4 and 5 should not be discouraged and
should recruit new members that they believe could be related to
their lines. We also need more random submissions to determine how
many major clusters will form. Currently, this project is probably
biased towards the SC and TN lines (long time interest in these
lines and previous publications) and somewhat biased towards lines
with recent origins in Ireland (the obvious origin of the Casey
surname).
For the ten submissions included in the SC and TN cluster, two
submissions are from the same known ancestor and have identical
DNA markers. Another two submissions are from another proven
ancestor but have two mutational differences (a third submission
is required to truly understand the source of these mutations).
A third submission is believed to be a Casey NPE event (the Hanvey
line is believed to really be a Casey line). This leaves seven
lines where there are no proven connections between the lines
and six of these lines have unique DNA marker sets that can be
charted in a DNA descendancy chart. Only the Ambler Casey line
and Jesse E. Casey line cannot be separated - both having
the same 67 DNA markers. However, even with only six unique lines
in this cluster, I am very encouraged by what these samples tell
us and I am optimistic that additional samples will greatly
help Casey research on the Casey lines of South Carolina and
Tennessee. So what have we learned from these early submissions ?
First, the SC & TN Cluster clearly validates what we all
have suspected, that the Tennessee lines have their roots in
South Carolina. We really already knew this from traditional
genealogical research - but we now have this fact validated
by scientific evidence as well. Second, all six lines (which
are quite diverse) only have one to three marker deviations from
the other SC and TN lines. Therefore, there was a big discovery
that all South Carolina and Tennessee lines appear to be much more
closely related than anticipated. This means random new entries in
this cluster could provide very interesting results.
Third, DNA evidence does not support the speculative connection of
Henson Casey being a son of Ambler Casey. This is a little bit
disappointing as there were great expectations that DNA would
help support this connection. Fourth, it appears that the Ambler
Casey line and the Jesse E. Casey line are the most closely related
lines as they both share a common unique DNA mutation for this
cluster (CDYb from 38 to 37) and both lines are an exact 67 marker match.
Fifth, marker 460 mutating from 12 to 13 appears to be the earliest
mutation in this cluster and creates two distinct branches in the
DNA descendancy charts. John Casey (SC), John Casey (MO), James H. Casey
and probably Henson Casey form one branch (460 = 12) and Abner Casey,
Jesse E. Casey, Ambler Casey, the Hanvey line and possibly the
Henson Casey line form the second branch (460 = 13). Sixth,
there are three branches of 460 = 12 and probably two sub-branches
of three sub-branches. There are two branches of 460 = 13 with one
branch having two sub-branches and one branch having two or three
sub-branches (depending on Henson Casey variations). This is a lot
of branching for so few lines submitted to date. Seventh, it is now
known that the John Casey (SC) line currently has the spotlight as
containing the markers that most closely represents the progenitor
of the SC & TN cluster. The Henson Casey line may share that spotlight
but a third son is required to determine Henson's DNA marker sets from
his male descendants and the John Casey (SC) line needs to be upgraded
to 67 markers for a full comparison of markers. Eighth, the Abner Casey
line and the Hanvey lines are exact 67 marker matches. Since the
Hanvey line remained in South Carolina much longer, further research
of the Hanvey line may shed light on the Abner Casey line.
Ninth, the Henson Casey submissions clearly show the importance of getting
two sons from one known ancestor. The first submission (Jackson Casey)
is now known to have a two marker mutation from his brother's line
(Arvle Casey). It is now believed that these mutations may have
occurred in later generations - and do not represent the DNA marker
set of Henson Casey. The original analysis of the Jackson Casey line
was substantially off: it has recently been discovered that these mutations
probably occurred more recently and are probably not as genealogically
significant as first believed. A DNA submission from a third son is
now required to determine Henson Casey's true DNA marker set.
The conclusions for Henson Casey should be considered speculative
until more DNA submissions reveal Henson Casey's true DNA marker set.
Tenth, even with very few submissions, there appears to be the
ability to speculate on how these SC and TN lines are related based
on DNA submissions. It appears that a DNA descendancy chart
is possible that shows how these lines could be related and
is very helpful in determining what kind of submissions
are needed next. The DNA descendancy chart (shown later in this
web page) is an unusual approach that does not appear in most
DNA projects and should be considered speculative in nature.
Another discovery is that the haplogroups are identified as Irish in
only 18 of the 23 submissions. Four of the more distantly related
Casey lines are not categorized at this point in time as being part
of any Irish haplogroup. However, I suspect that Irish haplogroups
are not that well defined as the Patrick Casey line of Clare County,
Ireland does not have an Irish haplotype which puts some doubt
on the accuracy of this haplotype to include all people of
Irish origin. Research of overlapping haplotypes from the YSearch
database imply that some of these Casey lines may have Scottish or
English origins and just borrowed the "Casey" surname for
some unknown reason or were NPE (adoptions, etc.) events.
With the analysis of only STR markers, haplogroups are only
estimated. To know the actual haplogroup, another DNA test must
be performed (analyzing the SNP DNA markers). Since most Casey
lines are estimated to be R1b1, it is most likely that their
haplogroups will fall into the R1b1 family of haplotypes. The
DNA marker sets of the SC & TN Cluster and the Irish Cluster
have been further estimated to be Irish Type 3 (also known
as R1b1c). Only further DNA analysis of SNP markers (at additional
expense) can truly map each line to a definite haplogroup.
It is believed that all Casey lines in the SC & TN Cluster
and the Irish Cluster are part of the R1b1c haplotype. Only
testing of SNP markers could verify this assertion but the
expense probably does not warrant the gain and it is believed
that no genealogical significant information would be revealed.
For further information concerning the Irish Type 3 haplogroup:
Irish Type 3 Haplogroup
Many of the Casey lines are genetically related to other surnames
and some genetically Casey lines will have non-Casey surnames. This
complex topic has not been thoroughly researched to date with one
notable exception - the Hanvey line (submission 29956). For the
SC and TN Cluster, every entry in the Family Tree DNA database
has the Casey surname and documented ties to either South
Carolina or Tennessee with one exception (one Hanvey line). It
appears that this Hanvey line may be a genetic Casey line (NPE)
and probably belongs to the SC and TN cluster. The Hanvey line
appears to be closely related to the Abner Casey line and researchers
of both lines should exchange information on possible connections.
The Abner Casey line is an exact 67 marker match with the Hanvey line.
The Hanvey sponsor is quite certain that the NPE event occurred in
South Carolina as the Hanvey line originates from Abbeville County,
South Carolina and remained in South Carolina much longer than the
Abner Casey line resided in South Carolina. After contacting the
Hanvey researcher who sponsored the DNA submission, I was informed
that he believed the NPE event occurred in either 1865 or 1885.
If true, this would be very significant to the Abner Casey line
as this means that one Casey line that remained behind in South Carolina
is very closely related to the Abner Casey line.
Detailed analysis of DNA submissions
For anyone wanting to analyze the Casey DNA submissions
that we currently have, the only practical method
is to use cladogram software which graphically displays the
connections between the various DNA submissions.
Without this tool, it is extremely difficult to manually
extract this information. The cladogram chart shows John Casey (SC)
as the line in this cluster that is most closely related to Casey
lines in other clusters.
The cladogram program makes a simple assumption of connecting
lines by determining the minimum number of mutations. This is
the highest probability connection but it is not 100% accurate
as the actual connections could involve more mutations, and backwards
mutations (deletions) can also occur. The cladogram charts
should only be used to identify clusters and determine if
clusters are genetically related (very important in determining
the progenitor of any cluster). Additionally, the cladogram
charts assume that all DNA mutations are significant to the
oldest proven ancestor. In fact, many of these mutations
originated with male descendants of the oldest proven ancestors
rather than with the oldest proven ancestors themselves. Also, the cladograms
tend to oversimplify relationships in their charts. In order
to have branches, at least two brothers have to have different
DNA marker sets but only one marker set has to mutate and the
other can remain the same. The father of Abner Casey and
the father of Jesse E. Casey could be brothers, however,
the cladogram does not show this possible brother relationship
and implies that the Jesse E. Casey line must be a branch
of the Abner Casey line.
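The minimum-mutation assumption described above can be sketched as a
minimum spanning tree over pairwise genetic distances. This is only an
illustration of the idea, not the algorithm of any particular cladogram
program, and the line names and marker values are hypothetical.

```python
def genetic_distance(a, b):
    """Step-wise distance between two equally long tuples of marker values."""
    return sum(abs(x - y) for x, y in zip(a, b))

def minimum_mutation_links(haplotypes):
    """Prim's algorithm: connect every line using the fewest total mutations."""
    names = list(haplotypes)
    connected = {names[0]}
    links = []
    while len(connected) < len(names):
        # Cheapest link from any connected line to any line not yet connected.
        best = min(
            (genetic_distance(haplotypes[a], haplotypes[b]), a, b)
            for a in connected for b in names if b not in connected
        )
        links.append(best)
        connected.add(best[2])
    return links

# Hypothetical lines and marker values, for illustration only.
lines = {
    "Line A": (12, 38, 24),
    "Line B": (13, 38, 24),
    "Line C": (13, 37, 24),
    "Line D": (12, 38, 26),
}
for distance, a, b in minimum_mutation_links(lines):
    print(f"{a} -- {b}: {distance} mutation(s)")
```

Note that the sketch hangs Line C under Line B simply because that link
is cheapest. Just as with the real charts, it cannot tell whether the
two lines descend from brothers or one from the other.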
The cladogram charts combined with the availability of DNA
submissions outside the SC and TN cluster are key to
establishing the DNA descendency chart for the SC and TN
Cluster. The cladogram charts imply that the DNA marker
set of John Casey (SC) has the fewest mutations from the closest
Casey lines in other clusters. I guess congratulations are due
to this line for being the least mutated Casey line within
this cluster. This fact currently puts John Casey (SC)
on top of the DNA descendancy chart and establishes
this line as the baseline DNA marker set that
all other submissions in this group should be compared with.
A closer inspection of the markers involved shows that
marker 460 is very important to this cluster. For all other
Casey lines, 460 is either 10 or 11. For the SC and TN
Casey cluster, 460 is either 12 or 13. This means that the
SC and TN lines have a unique fingerprint from other Casey
lines by having one or two more mutations for one marker
than all other Casey submissions to date. The SC and TN
Cluster can be described genetically as those who have
460 > 11.
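The fingerprint above is simple enough to express as a one-line test.
A minimal sketch, assuming each submission is held as a dictionary of
marker values; the function name is made up and the sample values are
just the four values of 460 mentioned above.

```python
def in_sc_tn_cluster(haplotype):
    """True when marker 460 is above 11, the fingerprint described above."""
    return haplotype["460"] > 11

# Values of marker 460 mentioned above (10 or 11 elsewhere, 12 or 13 in SC and TN).
for value in (10, 11, 12, 13):
    print(value, in_sc_tn_cluster({"460": value}))
```

This only reflects the submissions received to date; a future
submission could break the rule.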
Without the usage of cladogram charts and the inclusion
of other Casey DNA submissions outside the SC and TN
Casey cluster, very different conclusions would
result. Most DNA genealogical web sites use another
methodology to determine the progenitor of any cluster.
This method is sometimes called the "Majority Rules"
methodology. It assumes that the DNA marker value with
the highest number of occurrences in the cluster must
be the most common and probably represents the
DNA marker value of the progenitor of
the cluster. This assumes that all DNA submissions within
the cluster are random and evenly distributed; otherwise the
conclusion is biased by the popularity of the lines submitted.
Using this "Majority Rules" methodology and inspecting
only the SC and TN submissions, the DNA descendancy
chart takes on a very different look. By only looking at the
SC and TN cluster and using the "Majority Rules" methodology,
the Abner / Pleasant Casey line would appear to be the best
candidate for being on the top of the DNA descendancy chart.
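The "Majority Rules" methodology and its weakness can be shown with a
short sketch. The submissions below are hypothetical (they only borrow
the marker names 460 and CDYb from this cluster) and the function name
is made up.

```python
from collections import Counter

def majority_rules_haplotype(haplotypes):
    """Most common value of each marker across the submissions in a cluster."""
    markers = haplotypes[0].keys()
    return {m: Counter(h[m] for h in haplotypes).most_common(1)[0][0]
            for m in markers}

# Hypothetical submissions, for illustration only.
submissions = [
    {"460": 12, "CDYb": 38},
    {"460": 12, "CDYb": 38},
    {"460": 13, "CDYb": 38},
    {"460": 13, "CDYb": 37},
    {"460": 13, "CDYb": 37},
]
print(majority_rules_haplotype(submissions))

# Two more submissions from one popular line are enough to flip marker 460.
biased = submissions + [{"460": 12, "CDYb": 38}, {"460": 12, "CDYb": 38}]
print(majority_rules_haplotype(biased))
```

The second result differs from the first only because one line was
sampled more often, which is the bias described above.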
Another methodology looks at all submissions of a surname
and would use the cladogram chart in choosing the progenitor of the
SC & TN Cluster. This methodology makes the assumption that
the closest marker set to other clusters must be the
progenitor of the SC & TN Cluster. These two major conflicting
approaches can create two very different DNA Descendancy Charts.
Therefore, we need to analyze the underlying assumptions
of these two different approaches. First, do we believe that
the DNA submissions included in the SC & TN Cluster are
biased with more submissions that are closely related to
the Abner Casey line or randomly distributed ? Unfortunately,
I tend to believe that the popularity of Roane County, TN
and other central Tennessee lines may have introduced a
bias to the DNA submissions received to date. Obviously,
this assumption is not the easiest to validate. Fortunately,
the second methodology has tools to validate if two clusters
could be related. In this case, we can run a MRCA utility
and determine the probability of the two clusters being
related in the last 500 or 600 years. With the MRCA utility,
you can enter two DNA submissions (haplotypes) and it will give
you the probabilities that these individuals are related
over various numbers of generations. If the two most
closely related individuals from each cluster show
reasonable chances of being related, then this approach
has scientific evidence supporting the required assumption.
There is a very big assumption when using the cladogram,
as connections between clusters have no meaning unless
the adjacent clusters are closely related enough to be
genealogically significant. It appears that the
SC & TN cluster and the Irish Cluster could share a
common ancestor in a genetically significant
timeframe (in a timeframe when surnames were first used).
The two most closely related individuals from these
two clusters have a 43.14 % chance of being related in the
last 500 years (62.51 % in the last 600 years). I believe
that these percentages are high enough to assume that
these two clusters did indeed share a common ancestor
that used the surname Casey (probably too far back to
be genealogically significant but close enough to assume
the connection shown in the cladogram is most likely).
Therefore, I believe that the "Related Cluster" approach
has supporting data and the "Majority Rules" approach
has less supporting information. However, the most
common approach used by DNA Genealogical web sites
is the "Majority Rules" as this approach is covered
more in genealogical publications that cover the
topic of DNA being used for genealogical purposes.
Welcome to the uncertainty of DNA research used
for genealogical purposes. Over time, additional
DNA submissions will incrementally allow both approaches
to merge to a common conclusion - that is the beauty
of larger sample sizes: they make conflicting assumptions
much less likely, as the sheer size
of the sample "should" reduce bias over time.
With the recent submission of Arvle Casey, a backwards
mutation may have actually taken place. Therefore,
the Abner / Pleasant Casey line still remains the
second best candidate for topping the DNA descendancy
chart but additional DNA submissions for the Henson
Casey lines may reveal that the Abner Casey and Henson Casey
lines share second place.
Additional submissions of other SC lines could
eventually reveal that another Casey line could replace the John
Casey (SC) line as having the oldest known set
of DNA markers. The Ambler Casey line or Jesse E. Casey
line are unlikely to be on top of the DNA descendancy
chart. Others should play around with their own possible
descendancy charts for the SC and TN Cluster as there are
several other possible (lower probability) scenarios
that could develop. The most encouraging aspect of this
project is that with each additional submission in this group
(or expansion of available markers within this group)
more information can be extracted.
If DNA tests were only $10 each
Unfortunately, charges for DNA analysis will come down
very slowly over time (and there will always be more markers
made available to quench our thirst for more information).
So $10 submissions will not happen in the near future but
we could have a wealthy person who wants the Casey DNA project
to be the best or somebody could put in their will a modest
donation to our project. What would we do with these funds ?
We would certainly want to spend them wisely. Let us
assume that we had $25,000.00 donated to the Casey DNA
project tomorrow. Would that be too much or not enough ?
Also, we should assume that we are not allowed to
divert the funds to additional traditional research,
publishing Casey books or scanning 1,000s of pages
of Casey source materials (which might be
a better usage of the funds). The benefactor only set three
conditions: 1) We would have to present a plan of how
the funds would be used and what results would be expected;
2) Only those persons who currently have submitted DNA
submissions could participate in defining how funds
would be spent; 3) Funds can only be spent on actual
DNA testing (not hiring any consultants, other
genealogical projects, advertising, etc.)
So if this Casey DNA project already had identified
1,000 random male Casey volunteers willing to
participate (all with diverse well documented
ancestry back to around 1800) and some private
benefactor donated $25,000.00 to the Casey
DNA project, what should be done with these funds ?
Could we even spend that kind of donation properly
(this represents 100 additional submissions with a
mixture of 37 and 67 markers) ? Let us also assume
that this donation was spread out over ten months.
This means we would have to describe which ten
submissions (or upgrades) would be sent for analysis each
month and what kind of results we would expect - this
would force us to plan ahead and justify which ten percent
of willing volunteers would be analyzed and which 90 %
of willing participants would not be analyzed.
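The arithmetic behind this scenario is worked out below. All four
inputs are the figures assumed above; the average cost is only a blended
figure, since actual prices for 37 and 67 marker tests are not given
here.

```python
donation = 25_000     # dollars, as assumed above
submissions = 100     # submissions and upgrades the donation would cover
months = 10           # the donation is spread over ten months
volunteers = 1_000    # willing volunteers assumed above

average_cost = donation / submissions      # dollars per submission or upgrade
per_month = submissions // months          # submissions or upgrades each month
monthly_budget = donation / months         # dollars available each month
share_tested = submissions / volunteers    # fraction of volunteers analyzed

print(f"${average_cost:.2f} average, {per_month} per month, "
      f"${monthly_budget:.2f} per month, {share_tested:.0%} of volunteers tested")
```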
Here would be my list:
Month one - Upgrade James H. Casey to 37 markers (done);
upgrade John Casey (SC) to 67 markers; three more 37
submissions from SC and TN lines that are not related
to existing SC and TN lines (but are good candidates to be
related - one done now); three more 37 marker submissions
from lines with recent Irish origins but good candidates
to be related to the currently submitted lines with recent
Irish origins; three more lines of the other Casey
lines (only good candidates that could be possibly
related to the existing submissions).
Month two - Upgrade Arvle Casey to 67 markers; upgrade
second SC and TN Casey line to 67 markers (done);
three more 37 marker submissions in the SC and TN
cluster; three more 37 marker submissions in the
Irish Cluster; three more submissions in other Casey
lines.
Month three - 37 marker submission from third
son of Henson Casey; two more random 37 marker
submissions from the SC and TN lines; two more
random 37 marker submissions from Irish Cluster;
five more submissions from other Casey lines.
Month four - Upgrade John Casey (NY) to 37 markers
(being done); Upgrade James C. Casey to 37 markers;
order 37 marker for second son of Ambler Casey;
order 37 marker for second son of Abner Casey;
order 37 marker for second son of Jesse E. Casey;
two more 37 marker submissions from Irish Cluster;
five more 37 marker submissions from other Casey lines.
Month five - Upgrade three 37 marker submissions to 67
marker submissions (from either SC and TN lines or
the recent Irish lines); Upgrade Patrick Casey to
37 markers and Daniel Casey to 37 markers; Upgrade
submittor Daniel Casey (VA) to 37 markers;
two more 37 marker submissions from SC and TN cluster;
two more 37 marker submissions from Irish Cluster;
three more 37 marker submissions from other Casey lines.
We are now half way through our funds and have all
known upgrades done and all known fine tuning submissions
completed. We have a total of eight fine tuning submissions
(from same known ancestors) - six from the SC and TN
lines and two from other Casey lines. This means that
we have 16 lines covered from SC and TN; 17 lines covered
from Irish Cluster and 19 lines from other lines. This is probably
on target for SC and TN lines, somewhat high for the
Irish Cluster and low for other lines. There are around
50 unique lines in the SC and TN cluster, therefore,
16 lines would probably be insufficient to cover this
cluster but 32 lines would be sufficient to uncover many
connections. Half way through the donation funds,
it would now be time to concentrate on the best candidate
for a third cluster from the other Casey lines.
Month six - Two upgrades from 37 to 67 markers where
required; three more 37 marker submissions to SC and
TN lines; one more 37 marker submission to Irish Cluster;
two more 37 marker submissions for a third identified cluster;
three more for other Casey lines.
Month seven - One fine tuning submission (submission
from line that already has a proven ancestor submitted);
two 37 to 67 marker upgrades where required; two more
from the SC and TN cluster; one more from the Irish
Cluster; two more for third cluster; three more for
other Casey lines.
Month eight - One fine tuning submission; two more
from the SC and TN Cluster; one more from the Irish
Cluster; three from the third cluster and three more
from other Casey lines.
Month nine - Two more 37 to 67 marker upgrades;
three more from SC and TN Cluster; one more from
Irish Cluster; two more from third cluster;
three more from other Casey lines.
Month ten - One fine tuning submission; three
more from SC and TN Cluster; one more from Irish
Cluster; two more from third cluster; three more
from other lines.
All funds are now spent and what kind of coverage
would we have ? Let's assume that four submissions
were used to seed the third true cluster from the other
Casey lines. This leaves us with 30 lines covered
from other Casey lines, 29 lines covered from
the SC and TN cluster, 22 lines covered from
the Irish cluster and 18 lines covered from
the third cluster. This results in 99 lines
covered, eleven submissions for fine tuning and
nine equivalent submissions dedicated to upgrades.
This would result in three DNA descendancy charts
that would give researchers a lot of new lines
to start researching based on DNA results. At this
point in time, traditional research should help
tie many of these lines together with information
derived from this DNA study. However, it appears
that even $25,000 would not be enough funds
to completely map the Casey DNA map (my estimate
is that it would be approaching 50 % though).
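As a rough check of the arithmetic in the plan above, the
final tallies can be added up in a few lines of Python. This
is only a sketch: the variable names are illustrative, and the
average cost per submission is simply the $25,000 divided by
the 100 submissions mentioned above, not an actual price
quoted by the testing company.

```python
# Tally of the hypothetical $25,000 plan (figures taken from the text above).
DONATION = 25_000          # dollars
TOTAL_SUBMISSIONS = 100    # new submissions plus upgrades
MONTHS = 10

# Lines covered in each group once all funds are spent.
lines_covered = {
    "other Casey lines": 30,
    "SC and TN cluster": 29,
    "Irish cluster": 22,
    "third cluster": 18,
}

total_lines = sum(lines_covered.values())
# Implied average only (an assumption); 37 and 67 marker tests cost different amounts.
average_cost = DONATION / TOTAL_SUBMISSIONS
per_month = TOTAL_SUBMISSIONS // MONTHS

print(f"Lines covered: {total_lines}")
print(f"Implied average cost per submission: ${average_cost:.2f}")
print(f"Submissions per month: {per_month}")
for group, count in lines_covered.items():
    print(f"  {group}: {count} lines ({count / total_lines:.0%})")
```

The four group totals add up to the 99 lines quoted above.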
The above plan for spending the mythical $25,000
donation provides a good insight on where priorities
should be set for this DNA project. I am not really
certain what 30 "not so closely related" other Casey
lines would tell us about the Casey surname. Only
the formation of closely related clusters can assist
our primary goal of connecting the many Casey lines
that exist. However, we must include a diverse
cross-section of all Casey lines to provide a
better picture of all Casey lines. There are probably
dozens of Casey lines that have no common genetic
ancestor that used the surname of Casey. Many
unrelated individuals must have taken the Casey
surname when our Casey ancestors started using
surnames, many NPE events have probably happened
and there are bound to be several name changes
from other surnames to the Casey surname. We also
need to be brave and attempt to map several of our Casey
lines as NPE events (non paternity events
such as adoption, out of wedlock births, etc.).
As one Hanvey line appears to be one known
NPE event where some Casey male was
probably adopted by the Hanveys, we should
also investigate which Casey lines are not
really genetically related to the Casey lines
but descend from non-Casey males who were
adopted into the Casey families. Each of these
NPE lines creates a genetically "new" Casey line.
However, with the assistance of DNA submissions
of other surnames, we should be able to determine
some of the origination of "new" Casey lines
and tie some into other surname descendancy charts.
Good candidates for more DNA analysis
So what additional submissions would provide more insight
to our Casey ancestors ? And which upgrades would be
beneficial ? And what fine tuning submissions would be
useful (additional submissions from known ancestors that
already have submissions) ? Without any doubt, the highest
priority is to broaden the scope of the submissions. We
all need to focus more on identifying and recruiting
submissions from lines that we think could be related and
less on additional submissions that we know are related.
However, additional submissions for lines already covered
and upgrades to existing submissions also have value but
are not as high a priority (with a couple of exceptions).
The primary purpose of this project is to determine
which unrelated lines look most promising for
additional research for possible connections.
Identifying and recruiting these possibly related lines
should always remain our highest priority. Unfortunately,
our bias towards our own lines greatly
influences our interests. It is human nature to want
our lines to be best represented but the project benefits
more from having broader participation. We all need to
work hard to identify good candidates that might be
related and actively recruit DNA submissions from
those lines.
There is much interest in the "fine tuning" of the
existing submissions (adding more submissions to lines
that already have DNA submissions). The best
usage of DNA is to scientifically prove which lines
are worth additional research and which lines can
be eliminated as wild goose chases. So, what value
does "fine tuning" DNA submissions have ? First, it
is useful to have submissions from at least two sons
of each proven oldest Casey ancestor in order to verify
the exact source of any DNA mutation. The second
submission will determine if their marker set is unique
to their oldest proven ancestor or a mutation of one
of his sons (or other male descendants). This is where
the DNA descendancy chart is very useful. For some lines,
it may be pretty obvious that the chance of variations
is probably low. However, the two marker variation
between two sons of Henson Casey vividly shows that
recent mutations can be mistaken for mutations between
oldest proven ancestors.
If the submissions for the first two sons of any oldest
proven Casey ancestor have different DNA marker sets,
then a new submission from a third son of this oldest proven
ancestor would be required. The recent addition of the
Arvle Casey submission has now shown that the Jackson Casey
line has the unique marker mutation of 607
(15 to 16). This mutation may possibly even be from
a later generation male descendant. A third DNA submission
would determine where these unique mutations start. With
only one submission per ancestor, it can be dangerous to
assume that all brothers of the oldest ancestor will
have the same DNA marker set. In the case of the Arvle
Casey submission, I think we were all surprised to find
that a brother would have two DNA mutations. For the Henson
Casey line it is doubly important to obtain a third submission
since we currently do not know if Jackson Casey has a two
marker mutation, or only a one marker mutation with Arvle
having one backwards mutation, which is rare but can happen
(probably a red flag for DNA genealogical descendancy charts).
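The reasoning above (two sons that differ call for a third
son to show where the mutation started) can be sketched in
Python. This is a simplified illustration and not a method
used by the project: the function names are made up, the
"hypothetical_marker" values and the third son are invented
for the example, and a simple majority vote stands in for a
real judgment about the ancestral value. Only the 607 values
(15 and 16) come from the text.

```python
from collections import Counter

def compare_sons(haplotypes):
    """Compare marker sets reported for sons of one ancestor.

    haplotypes maps a son's name to a dict of {marker: repeat count}.
    Returns (markers_that_differ, needs_another_son).
    """
    markers = set()
    for values in haplotypes.values():
        markers.update(values)
    differing = sorted(
        m for m in markers
        if len({values.get(m) for values in haplotypes.values()}) > 1
    )
    # With only two sons, a difference cannot be assigned to either branch.
    needs_another_son = bool(differing) and len(haplotypes) < 3
    return differing, needs_another_son

def likely_ancestral(haplotypes, marker):
    """Majority value among the sons; None if there is no clear majority."""
    counts = Counter(v[marker] for v in haplotypes.values()).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]

# Marker 607 (15 vs 16) is from the text; everything else is hypothetical.
two_sons = {
    "Jackson": {"607": 16, "hypothetical_marker": 12},
    "Arvle": {"607": 15, "hypothetical_marker": 13},
}
print(compare_sons(two_sons))

three_sons = dict(two_sons)
three_sons["Third son (hypothetical)"] = {"607": 15, "hypothetical_marker": 12}
print(compare_sons(three_sons))
print(likely_ancestral(three_sons, "607"))
```

With two sons the sketch can only report which markers
differ. With a third son it can suggest which value is
more likely to belong to the common ancestor.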
A second form of "fine tuning" is upgrading the number
of markers when two unconnected lines have common marker
sets. John Casey (SC) and Arvle Casey have a 37 marker
match. It is doubly important for the John Casey
(SC) line to be upgraded to 67 markers as it represents the
baseline DNA marker set for this cluster (line with
fewest mutations from other Casey lines). These two
submissions could benefit from being upgraded to 67
markers to attempt to determine if additional
markers can separate these two lines or help establish
an even closer relationship. In the case of the
upgraded markers of Jesse E. Casey and Ambler Casey, the
67 marker set implies an even closer relationship between
these two lines than a 37 marker set match would indicate.
If they had turned out to be different, we would have identified
the mutation that makes these two lines unique (guess we will
have to wait for the future 100 marker test for this).
If Henson Casey and John Casey (SC) lines come back with a 67
marker match, it would be time for these two lines to
start sharing traditional genealogical research for
possible connections. DNA clearly shows that Henson
Casey is much more closely related to John Casey (SC) than
to Ambler Casey as once speculated. Maybe Henson Casey
has ties back to Warren County, Kentucky where his 37 marker
match once lived.
The "fine tuning" of additional children of an oldest known
ancestor will also validate the connection of each
son to the oldest known ancestor. These
connections could already be pretty well established or could
be fairly speculative in nature. Our DNA Descendancy charts
should not be too biased by our current traditional research
to date. However, well proven sons will benefit little other
than verifying what is already known. The three major TN Casey
lines are well researched but all three lines lack the
traditional proof that most genealogical lines have. These
three lines are Jesse E. Casey, Ambler Casey and Abner Casey.
The connection of Jesse E. Casey to his children
was originally based on an 1894 book. Fortunately,
this secondary source is well supported by census records
and other sources (with one or two exceptions).
Of the three lines, the Jesse E. Casey line probably
has the best genealogical documentation for establishing
the children of their oldest proven ancestor
(this line would not benefit from further proving the connection
from oldest ancestor to their sons since primary
documentation already exists). The list of children of Abner
Casey is primarily based on several abstracts (letters)
of a Family Bible that cannot be located. This account
is also supported by several primary documents.
The Ambler Casey line is the least
documented family as there is no existing single
document that establishes the children of Ambler
Casey. DNA documentation can provide scientific
evidence that firms up the connection of these sons to their
oldest proven ancestors. DNA submissions from sons
with the weakest traditional genealogical documentation
could provide additional documentation connecting
these sons to their oldest proven ancestors.
Having DNA evidence supporting these family connections
may, in the near future, be considered primary documentation
in this new world of genealogy. Unfortunately, DNA
did not support the case for Henson Casey being the
son of Ambler Casey (this was pretty speculative
in nature and has now been shown by DNA evidence
to be very unlikely). This is probably
the best usage of DNA submissions when the sample
size is relatively small (as it is to date) and when
one cluster of lines emerges early in any DNA
project.
Our goal is to get several clusters of Casey lines
that help establish recent common ancestry between
various Casey submissions. Once the number of
submissions greatly expands in scope, another major
benefit will start to emerge. It will become obvious
that several diverse Casey lines will become more
closely related than traditional research has
shown to date. DNA documentation can help
genealogists better select which "possibly"
related lines to research based solely on DNA
evidence. Researching these newly discovered potential
relationships through traditional genealogical
methods may result in locating supporting
documentation and may be the key to getting past
that brick wall.
The current DNA submissions have really shattered
many of my most promising lines (which I have
spent countless hours attempting to connect).
Before the availability of DNA information, my
most promising lines for connection to Ambler
Casey were: 1) Abner Casey, 2) Jesse E. Casey,
3) Henson Casey and 4) John Casey (MO). After
DNA submissions, here are the major changes:
1) Jesse E. Casey has obviously replaced Abner
Casey as the best candidate - but both are still
my best candidates. 2) Since we have hit the
brick wall on Abner Casey, the Hanvey line could
open new doors for more connections to Casey
lines that remained longer in South Carolina.
3) Although Henson Casey lived in Roane County,
TN where Ambler Casey lived, the speculative
connection as a son of Ambler Casey is now
not possible. 4) Since John Casey (MO) resided
in McMinn County, Tennessee during the same
time as Ambler Casey, John Casey (MO) "was"
another good candidate - DNA documentation
really discounts this connection now. 5) With
an exact 67 marker match with Jesse E. Casey,
I should now prioritize research on this line
above all others. These are significant changes
in focus for my Casey research.
How many markers should be analyzed
So how many DNA markers should one submit to be
useful and which of the existing submissions should
have additional markers analyzed ? For the SC and TN
cluster, all new submissions should be either 37 or
67 markers. For all other lines, all new submissions
should be 37 markers (unfortunately, Family Tree
DNA has dropped their 25 marker option which would have
been sufficient). The 12 marker test does not have
enough information to be useful for the project.
Also, it is not desirable to have two submissions from
the same line - unless they are from different sons
of the oldest known proven ancestor. If two submissions
from two sons of an oldest known ancestor have two
different marker sets, then another submission
from a third son would be required.
With the recent match of Arvle Casey and John
Casey (SC), 67 markers are now required to separate
these two lines. For other submissions in this group,
it may turn out later that all the markers from
38 to 67 may just help separate the descendants of our
oldest proven ancestors (not very useful for
adding another generation to the pedigree chart).
Or as the sample size grows, we may later
learn that additional markers are required
to separate many of these lines. To date, there
are now five 67 marker submissions in this group. Unfortunately,
all have the same marker values from 38 to 67.
Submissions in the SC and TN Casey cluster will
always have the most to benefit from expanding
37 markers to 67 markers. At this point in time,
it is probably not necessary for other Casey lines
to upgrade to 67 markers. The other Casey lines
(outside the SC and TN Casey lines) need to
concentrate on obtaining new submissions or
encouraging others with fewer than 37 markers to
upgrade their submissions to 37 markers. This is
harder to accomplish since it is much easier just
to order your own upgrade.
So how many potential oldest Casey ancestors originate
from South Carolina ? The 1790 census of South Carolina
has 47 males that make good candidates and all but six
are from Spartanburg County or Newberry County. With
only one unique marker for the first 25 marker positions,
this leaves only ten markers available to separate these
lines from a list of 47 possible ancestors. It appears
that the SC and TN Casey lines will probably need 67
marker sets in order to provide separation. It appears
that the Casey lines grouped in the Irish Cluster have now
been established as a second cluster of Casey ancestors
and may eventually need 67 marker submissions. The submissions
in this grouping are not that distantly related, so
additional submissions that fall into this group could
be informative for researchers of this group in the future.
As time passes by, many submittors may no longer be
interested in paying the premium to have their sample
analyzed for additional markers. Eventually, these
samples will become unviable to analyze. The person supporting
the analysis could also die or become incapacitated with
the children potentially showing no interest in this project.
For the vast majority of cases, the risk of losing
valuable DNA documentation will probably not be of
great concern as most lines have many living male descendants
of any particular son of an oldest proven ancestor. If
there are numerous living male descendants, then there
will remain many others to assist in the future. However,
if you are the only surviving male of your line, it is
very important that you submit as many markers as
are currently available (currently 67 markers from the
company analyzing our samples for this project). My great
grandfather, William Martin Shelton (born 1847), had seven
daughters and only one son. This son produced only one grandson
who died as a teenager in 1928. Therefore, there
are no male descendants of this Shelton line that can be
tested for the Shelton DNA project even though there are
around 400 living descendants (all descending from
daughters born with the Shelton name at some point).
So who should we encourage to submit additional
samples that would benefit this project ? There are
three broad categories of submissions that should
be sought in the near term. Once other submissions
are analyzed, there will surely be new items of
interest. First, for all the current submissions,
we should encourage male descendants of at least
two sons of our oldest proven ancestors to submit
DNA samples. This helps us determine where the
uniqueness of each marker set begins. It also
provides more evidence connecting these sons
to their oldest proven ancestor. Second, everyone
has their favorite candidates for possible connection
to their lines. Your hunch (supported by traditional
genealogical research) can be either dismissed
by DNA evidence or further strengthened by DNA
evidence. We must have more submissions from possibly
related candidates to make any progress on which
lines are worthy of additional research. Third,
we need wider participation of all Casey lines
to determine those big surprises and possible
connections that we all have missed to date.
As the current submissions confirm, the Casey
surname is a relatively common surname with
dramatically different DNA backgrounds. Only
larger sample sizes (more submissions) can
reveal where other clusters will form.
For the SC and TN lines, we need more submissions
from other SC Casey lines in order to understand
how these lines are connected to the SC and TN
lines that have been submitted to date. The biggest
surprise for me to date is how closely related the
TN lines are to the first SC submission with
37 markers. The John Casey (SC) line could be
just one SC line that is closely related to the
other TN lines - or are all SC Casey lines more
closely related to the TN lines than traditional
research has shown to date ? We also need more
unconnected TN/AR/MO lines to determine which
lines are closely related and which are wild goose
chases for connection to this cluster. We will discover
that certain lines are very closely related which will
allow researchers of these lines to properly focus
their research more on these promising lines. We also
will need to identify those lines that are not
closely related. The descendants of those lines might
avoid spending many additional unfruitful hours
attempting to connect these more remotely related lines.
Here are some specific recommendations for additional
submissions for existing lines. First, we need additional
samples for other sons of the four main SC and TN lines
covered to date. If I got any of these sons incorrectly
listed, please let me know so I can update/correct
this list (I intentionally left out some possible sons
where the connections are very weak). I would prefer
to put these weak connections into the category of
possibly related lines of interest.
_____Ambler Casey
__________Moses Casey (need one more)
__________John Casey (have)
__________Ellison Casey (need one more)
_____Henson Casey - (have two)
__________Jackson Casey (have, sibling is different)
__________Arvle Casey (have, sibling is different)
__________Other sons (need one more since different)
_____Abner Casey
__________Turner Casey (need one more)
__________Pleasant Casey (have two)
_______________Pleasant Casey, II (have, sibling is same)
_______________Elsberry Casey (have, sibling is same)
__________Abner Casey (need one more)
__________Jesse Casey (need one more)
_____Jesse E. Casey
__________Steven Casey (need one more)
__________Elijah Casey (need one more)
__________Anthony Casey (have)
__________Levi Casey (need one more)
__________Ambler Casey (need one more)
__________Jesse Casey (need one more)
__________Wesley Casey (need one more)
_____John Casey (MO)
__________Levi Casey (have)
__________John Allen Casey (need one more)
_____John Casey (SC)
__________Abner Casey (need one more)
__________Thomas Casey (need one more)
__________Samuel Casey (need one more)
__________John Casey (have)
__________Henry Casey (need one more)
_____James Hill Casey
__________James Casey (need one more)
__________Hugh Casey (need one more)
__________Willis Casey (have)
__________Allen Casey (need one more)
__________John Casey (need one more)
__________Newton Casey (need one more)
__________Andrew Casey (need one more)
Every Casey researcher needs to determine which
sons of oldest proven ancestors may have very few living
male descendants and are exposed to having the line
"die out" by producing no living "all male" descendants.
For the descendants of Ambler Casey, Henson Casey and
Jesse E. Casey, I have reviewed all the known male
descendants to estimate the number of potential
submittors and the exposure for these lines to die out:
For the Ambler Casey line
Moses Casey (need)
3 sons, 20 gsons, 37 ggsons, 22 3Gsons, 10 4Gsons,
2 5GSons (33 male descendants born after 1900).
Almost no exposure for this line to die out
and substantial information concerning living
male descendants.
John Casey (my line)
7 sons, 25 gsons, 43 ggsons, 41 3Gsons, 24 4Gsons, 8 5GSons
(76 male descendants born after 1900).
Almost no exposure for this line to die out
and substantial information concerning
living male descendants.
Ellison Casey (need)
5 sons, 2 gsons, 7 ggsons and 2 gggsons
(only two known male descendants that
were born after 1900). Probably minimal
exposure for this line to die out but very
little knowledge of living male descendants.
The connection of Levi Casey is pretty weak
and Levi Casey could be a son of Ambler Casey,
therefore, this submission would greatly benefit
the Ambler Casey line and could greatly help
the descendants of Levi Casey break through
that brick wall on this particular line.
For the Henson Casey line
16 sons, 30 gsons, 18 ggsons, 5 3Gsons, 1 4Gsons, 0 5GSons
(37 male descendants born after 1900).
Almost no exposure for this line to die out
and reasonable information concerning living
male descendants.
For the descendants of Jesse E. Casey, I have reviewed
all the known all male descendants to estimate the
number of potential submittors and the exposure for
these lines to die out (this is based on the 2000
version of Vonda Dihm's book). There is considerable
exposure on several sons of Jesse E. Casey to die
out (or may have already died out). However, there
are seven sons to choose from (unlike the Ambler
Casey line where we do not even know the names
of several of his sons):
Jesse E. Casey
Stephen Casey (need)
6 sons, 18 gsons, 29 ggsons, 15 3Gsons, 7 4Gsons, 0 5GSons.
Almost no exposure for this line to die out and
reasonable information concerning living
male descendants.
Elijah Casey (need)
6 sons, 3 gsons, 4 ggsons, 8 3Gsons, 2 4Gsons, 1 5GSon.
Little exposure for this line to die out
but not much information concerning
living male descendants.
Anthony Casey (have)
3 sons, 25 gsons, 34 ggsons, 43 gggsons,
23 3Gsons, 4 4Gsons, 0 5GSon.
Almost no exposure for this line to die out
and considerable information concerning living
male descendants.
Levi Casey (need)
6 sons, 8 gsons, 0 ggsons, 0 3Gsons,
0 4Gsons, 0 5GSons.
Moderate exposure for this line to die out
and no information concerning living
male descendants.
Ambler Casey (need)
2 sons, 0 gsons, 0 ggsons, 0 3Gsons,
0 4Gsons, 0 5GSons.
Very high exposure for this line to die out or
this line could have already died out 100 years ago.
No information concerning living male descendants.
Jesse Casey (need)
11 sons, 24 gsons, 22 ggsons, 15 3Gsons,
4 4Gsons, 0 5GSons.
Almost no exposure for this line to die out
and reasonable information concerning living
male descendants.
Ambler Casey (need)
2 sons, 0 gsons, 0 ggsons, 0 3Gsons,
0 4Gsons, 0 5GSons.
Very high exposure for this line to die out or
this line could have already died out 100 years ago.
No information concerning living male descendants.
We all need to openly discuss what we think are the
best candidates to be related to these TN and SC lines
as well as the other Casey lines that have been
submitted to date. Traditionally, many of us avoided
sharing this kind of speculation because many novices
to genealogy tend to convert this speculation
into fact. However, we also are not sharing this
valuable insight that we have developed over many
years of research. People visiting the Casey DNA
web site need input and encouragement on which lines
are important to this project at this point
in time. As my web site might imply, here are
some additional Casey lines that my instinct tells
me are good candidates for the SC and TN lines:
_____James Casey
__________James Casey (need)
__________Sterling Casey (need)
__________Samuel Casey (need)
_____Mrs. Easter Casey
__________William Casey (need)
__________Abner Casey (need)
_____Ambler Casey (born 1832)
__________Only 4 daughters known
__________(this male line probably died out)
Mrs. Easter Casey has Fulton County, AR
connections and my gut feeling says her family is
related. Please let me know your specific lines of
interest (especially on the SC lines where others must
help me identify other lines of interest). The
unconnected Ambler Casey (born 1832) is an obvious
candidate to be one of the missing sons of Ambler
Casey (TN), however, this line appears to have
no living male descendants.
Where are our submissions for widely known SC and
TN lines such as Randolph Casey, Christopher Casey
and General Levi Casey ? I will be glad to add
other candidates for anyone who wants to present
their speculation for possible connections. I
decided to look at previous Casey publications
that cover these SC and TN Casey lines. From
the Walter E. Casey book, the George and
Abner Casey manuscript and some early SC
probate records, here are some other
good candidates:
_____Christopher Casey (1960s manuscript)
__________John Casey (have)
__________Aaron Casey (need)
__________Hardin Casey (need)
_____Aaron Casey (1960s manuscript)
__________Abner Casey (have)
__________Jesse Casey (have)
__________Alexander Casey (need)
__________Anthony Casey (need)
__________Uriah Casey (need)
_____Aaron Casey (two probate records)
__________William Casey (need two)
__________Moses Casey (need two)
__________James Casey (need two)
__________Levi Casey (need two)
_____Randolph Casey (probate records)
__________Levi Casey (need two)
__________Randolph Casey (need two)
__________Isaac Casey (need two)
__________Abraham P. Casey (need two)
__________Samuel Casey (need two)
__________Hiram Casey (need two)
__________Zadock Casey (need two)
_____Levi Casey (1960 DAR article)
__________John Casey (need two)
__________Levi Casey (need two)
__________Jacob D. Casey (need two)
__________Samuel O. Casey (need two)
I know most of you are doing the same as me - pulling
out your files and going to the Internet to get the basic
information for these Casey lines. The official Casey DNA
web site allows the submission of your pedigree via
a GEDCOM file. Please submit these files and provide
this project with this critical information. This
web site also provides the ancestry of each DNA
submission when known. Please send in any additions
and corrections to information included to date.
Analysis using Cladogram software
Cladograms are graphical representations of the marker
mutations between individuals. These charts can quickly
determine the closeness of relationships between various
lines. The cladogram charts were created using a free
phylogenetic network software program offered by
Fluxus Engineering:
More information about free cladogram software
Unfortunately, this free software is not for the faint
of heart and is fairly difficult to use. Also, the connections
presented can be misleading at times but these charts are
absolutely wonderful for determining clusters and groupings of
various DNA submissions. This program determines the simplest
configuration, which has the least number of interconnections
or mutations. For excellent examples of what cladograms can do for
this project in the future, refer to the Mumma DNA web
site. The Mumma line is related to my wife's Garver line
and was one of the first genealogical DNA projects.
The Mumma DNA project is pretty far along in its
collection of DNA submissions and has gained useful
information through the availability of DNA information.
This surname is relatively uncommon which makes the
project much more useful with many fewer samples.
Their web site has a great presentation of
their tables and shows the usefulness of cladograms:
Mumma DNA Web Site
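To give a feel for what "the least number of mutations"
means, here is a much simplified Python sketch. It counts
the markers on which two submissions differ and then links
the lines together with a minimum spanning tree (Kruskal's
method). This is NOT the algorithm used by the Fluxus
software, which builds a more elaborate phylogenetic network.
Counting every differing marker as one mutation regardless of
the size of the difference is an assumption, and the
six-marker values and line names below are made up purely for
illustration.

```python
from itertools import combinations

def marker_distance(a, b):
    """Number of marker positions at which two equal-length haplotypes differ."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same markers")
    return sum(1 for x, y in zip(a, b) if x != y)

def minimum_spanning_tree(haplotypes):
    """Kruskal's algorithm over pairwise marker distances.

    Returns a list of (distance, name_a, name_b) links.
    """
    edges = sorted(
        (marker_distance(haplotypes[a], haplotypes[b]), a, b)
        for a, b in combinations(sorted(haplotypes), 2)
    )
    parent = {name: name for name in haplotypes}

    def find(name):
        # Follow parent links to the representative of this group.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    tree = []
    for dist, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:          # link only lines not yet connected
            parent[root_a] = root_b
            tree.append((dist, a, b))
    return tree

# Made-up 6-marker haplotypes, for illustration only.
example = {
    "Line A": (13, 24, 14, 11, 11, 14),
    "Line B": (13, 24, 14, 11, 11, 15),
    "Line C": (13, 24, 15, 11, 11, 15),
    "Line D": (14, 23, 14, 10, 12, 14),
}
tree = minimum_spanning_tree(example)
for dist, a, b in tree:
    print(f"{a} -- {b}: {dist} marker(s) differ")
```

Lines joined by short links would form a cluster, while a
line that only connects through a long link (Line D in this
made-up example) would sit apart from the rest.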
The first cladogram includes all unique submissions with
37 markers and reveals two items of interest: 1) It implies
that John Casey (SC) may have a marker set that represents
the earlier Casey line as its marker mutation is closer
to all other Casey submissions than the Pleasant (Abner) Casey
submission. 2) It also reveals that there are two true
clusters and two very preliminary groupings that
are possible (these groupings are so remotely related that
they probably do not currently warrant separation - other
than to make it easier to analyze the current DNA summary
table).
37 marker Cladogram (PDF)
The next cladogram includes all unique submissions with
25 markers (only two submissions were added). This chart
revealed three points of interest: 1) As we already knew,
the SC & TN lines require 37 markers to be useful. All
37 marker submissions were lumped together (one big circle)
and only the James Hill Casey line has a unique marker.
2) The cluster of Michael Casey (Ireland), Daniel Casey
(Ireland) and Dennis Casey (Ireland) loses many of
the unique markers found with markers 26 through
37. 3) The other 25 marker submission, the Daniel Casey (VA) line,
could be attached to two groups and was attached to the group
with the fewest mutations. The expansion to 37 markers
would make it easier to determine which group this line really
belongs to. Since 25 markers can no longer be submitted,
this chart will not be updated with new 37 marker submissions.
25 marker Cladogram (PDF)
The last cladogram includes all unique submissions with only
12 markers (only three submissions were added). This chart
revealed four minor points of interest: 1) The Daniel Casey
(Ireland) line, Michael Casey (Ireland) line and Dennis Casey (Ireland)
line all merged with a common set of markers. This suggests that
this group of lines is a true cluster. 2) For the first new
line added, Patrick Casey (Ireland) has a unique four marker mutation
from all other lines. After this unique string of four
mutations, it only takes one additional mutation to connect
to six other lines and two mutations to connect to the Pleasant
Casey line where five other lines are lumped together. Clearly
12 DNA markers are not sufficient to properly determine which
of these 11 lines are more closely related to the Patrick
(Ireland) line. 3) For the second new line
added, John Casey (NY) has two unique marker mutations separating
this line from other lines. With only two marker mutations, this
line shows relationships to four lines and with three marker mutations
is not far from the Pleasant Casey grouping of five more lines.
Again, it is clear that 12 markers is just not enough to determine
which other lines look promising as possible relatives. 4) For the
third and last new line added, James C. Casey also has two unique
mutations from other lines. With only two mutations, it has possible
relatives to six lines and with three mutations, the Pleasant Casey
line adds another five lines. Again, it is clear that 12 markers is
just not enough to determine which other lines look promising as
possible relatives. Based on the analysis of the three new 12 marker
lines, it is highly recommended that any future DNA submission be
at least 37 markers (since 25 markers are no longer available),
as 12 markers do not provide enough information to draw any
conclusions about connections to other lines. It is doubtful
that any new 12 marker submissions will be made, so
this chart will not be updated with new 37 marker submissions.
12 marker Cladogram (PDF)
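The loss of resolution described above can be illustrated with a minimal sketch. The marker values below are made-up placeholders, not real Casey project data, and the helper names (genetic_distance, with_changes) are hypothetical. The sketch also assumes a simple count of differing markers, which may not match how the cladogram software weighs mutations.

```python
# Hypothetical haplotypes: lists of repeat counts, NOT real Casey project data.

def genetic_distance(a, b, n_markers):
    """Count the markers (among the first n_markers) where two haplotypes differ."""
    return sum(1 for x, y in zip(a[:n_markers], b[:n_markers]) if x != y)

def with_changes(base, changes):
    """Return a copy of base with the given 1-based marker positions changed."""
    copy = list(base)
    for position, value in changes.items():
        copy[position - 1] = value
    return copy

# A placeholder 37 marker baseline and two placeholder lines derived from it.
base = [12] * 37
line_a = with_changes(base, {31: 13, 36: 11})  # differences only in markers 26-37
line_b = with_changes(base, {15: 13})          # difference only in markers 13-25

# Compare the same two lines at each test size.
distances = {n: genetic_distance(line_a, line_b, n) for n in (12, 25, 37)}
print(distances)
```

With these placeholder values the two lines look identical at 12 markers and only separate as more markers are compared, which is the same pattern seen in the 12 and 25 marker cladograms.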
Based on the 37 marker cladogram, I separated all the Casey submissions
into five clusters (only two are true clusters at this point; the other
clusters were created solely to make the summary chart more readable).
I also made a baseline DNA marker set based on the closest mutation
split for each of the remote groups (where remote lines joined into
a common mutation list with respect to other lines). I also changed
the background color of any marker that differs from the baseline so
that mutations stand out clearly. The next section compares the four
baselines to show how each baseline deviates from the other groupings.
With the recent addition of several 67 marker upgrades, those markers
are shown on page 2 of the "DNA Summary Table."
DNA Summary Table (PDF)
This table makes a clear case for adding more
markers to the existing submissions. First, the John Casey (NY)
and James Charles Casey submissions each have only two mutations from their
baseline. These two submissions would benefit from expanding to 37
markers. Expanding to 67 markers would not really be necessary
at this point in time. The Daniel Casey (VA) line clearly does
not need to upgrade to 37 markers (unless this line does not want
to be the only line without 37 markers). The SC and TN lines
clearly have the most to benefit from 67 markers. With the new
exact 37 marker match between Arvle Casey and John Casey (SC),
this DNA project has one pair of unrelated lines with no mutation
differences to separate these lines. The John Casey (SC) and
Arvle Casey lines should upgrade to 67 markers in order to separate
these lines or determine an even closer relationship. The
Ambler Casey line and the Jesse E. Casey line are now an exact
67 marker match, indicating an even closer relationship than
a 37 marker match implied (it will now take a future 100 marker
upgrade to separate these lines).
This chart also makes a graphical case for putting John Casey (SC)
as the baseline DNA marker set instead of the Abner / Pleasant
Casey submissions. By looking at marker 460, you can clearly see
that all SC and TN Casey lines are either 460 = 12 or 460 = 13.
All other Casey submissions are either 460 = 10 or 460 = 11.
The MRCA utility implies that the SC and TN Casey Cluster and
the Irish Cluster could share a common Casey ancestor,
therefore, it makes more sense to make the John Casey (SC)
marker set as the baseline marker
set for this cluster. The last observation is that there are a
lot of mutations between most of the other Casey lines. This means
that most of these lines are very remotely related to each
other and that the Casey surname is quite common, requiring many
more submissions in order to start connecting lines together. In fact,
there are so many mutations that it appears that the earliest
common ancestor of some of these lines lived
before surnames were even used. It is possible that several
of these Casey lines may have had clan ties and may have taken
a surname based on their clan leader rather than on family ties.
Of course, early clans were closely related as well. A few of
these lines could also be NPE events. Once additional submissions
are received and these lines still remain very isolated, other
surnames should be investigated for possible NPE connections.
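The marker 460 split described above can be sketched as a simple grouping. The values for the named SC and TN lines are the ones stated on this page (12 for John Casey (SC) and Arvle Casey, 13 for the others). The two "Other line" entries are placeholders, since the text only says those lines are 10 or 11 without naming them, and the function name group_by_460 is hypothetical.

```python
from collections import defaultdict

# Marker 460 repeat counts. Named lines follow the values discussed on this
# page; the "Other line" entries are placeholders, not real submissions.
marker_460 = {
    "John Casey (SC)": 12,
    "Arvle Casey": 12,
    "Pleasant Casey": 13,
    "Jesse E. Casey": 13,
    "Ambler Casey": 13,
    "Jackson Casey": 13,
    "Other line A (placeholder)": 10,
    "Other line B (placeholder)": 11,
}

def group_by_460(values):
    """Split lines into the SC & TN group (460 = 12 or 13) and everything else."""
    groups = defaultdict(list)
    for line, repeats in values.items():
        if repeats in (12, 13):
            groups["SC & TN (460 = 12 or 13)"].append(line)
        else:
            groups["Other (460 = 10 or 11)"].append(line)
    return dict(groups)

groups = group_by_460(marker_460)
for label, lines in groups.items():
    print(label, "->", ", ".join(lines))
```

A single marker is only a rough filter, so this grouping is a starting point for reading the summary table and not a substitute for the full marker comparison.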
DNA Descendancy Chart (SC & TN)
The next charts are an attempt to create DNA based descendancy
charts. This is probably the most likely scenario but there are
obvious variations. With any DNA chart, the marker set could
easily be passed down several generations with no mutations.
One big surprise in this chart is that both Abner Casey and
his father must have no mutations to make sense. This chart
also implies that the testing of the sons of Abner Casey,
Jesse E. Casey and Ambler Casey will not help us find any
additional generations (just confirm that current submissions
are not recent DNA mutations). This chart implies that more
submissions of unrelated SC and TN lines are key.
Also, the Arvle Casey line appears to be the source of a backwards
mutation of 460 from 13 back to 12. An alternative for Arvle
Casey would be no mutation, with Jackson Casey having two mutations
(adding 460 from 12 to 13). This makes the Henson Casey line more
closely related to the John Casey (SC) line than other lines
that were once believed to be closely related.
The DNA Descendancy Chart implies that another son of Abner Casey
will most likely be the same as the Pleasant Casey submissions. There
are really two scenarios that would produce a different marker set:
1) A new mutation in some descendant of the second son being
submitted (not very genealogically significant); 2) The second
son may not have the mutation of 460 (12 to 13). This would
imply this was a mutation of the Pleasant Casey line. This would
also elevate the Abner Casey line to the same level as John Casey
(SC) in the DNA Descendancy Chart. The odds of this are fairly low
since other Casey lines (Jesse E. Casey, Ambler Casey and part of
the Henson Casey line) share this mutation, which will probably
separate this cluster into two subclusters in the future. If more
SC lines were submitted, this would verify that the 460 mutation
is the best candidate to separate this cluster into two subclusters.
These would be subclusters - not new clusters as by definition,
clusters are closely related and subclusters just allow a large
number of submissions to be broken up into more manageable groupings.
The DNA Descendancy Chart implies that another son of either Ambler
Casey or Jesse E. Casey will probably not help solve the connection
between these lines. However, CDYb appears to be a second significant
DNA mutation. This separates another subcluster (Ambler
Casey and Jesse E. Casey lines) from other lines in this cluster.
The mutation of the 607 marker was probably introduced with the birth
of Jackson Casey (or his male descendants). This mutation is probably
not genealogically significant. Since Arvle Casey (Jackson's brother) does
not have this mutation, descendants of Jackson Casey can claim this
mutation as unique to the Jackson Casey line. The backwards mutation of
the 460 marker, which is only one possibility for the sons of Henson Casey,
presents a unique challenge. The 460 mutation (12 to 13) is shared with
all other non-related TN lines (Abner Casey, Jesse E. Casey and Ambler
Casey). A submission from a third son of Henson Casey is needed to
determine whether Jackson Casey has a double mutation, in which case
Arvle Casey would be elevated to the same level as John
Casey (SC). This is the most likely scenario. However, a
reasonable probability can also be assigned to Jackson Casey having only
the 607 mutation and Arvle Casey having a backwards "deletion" mutation
of 460 back to 12.
Possible DNA Descendancy Chart - EXCEL Version(Irish Cluster related)
Possible DNA Descendancy Chart - EXCEL Version (Majority Rules)
Non Paternity Events vs. Overlapping Haplotypes
This section is speculative in nature. I have not seen these
issues addressed in depth at other DNA web sites and they are not well
covered in publications on DNA used for genealogy. However,
I found the issue covered by Mark Jobling in the June, 2001 issue
of Trends in Genetics (the internet is wonderful at times).
I am now convinced that non-Casey surnames found in the SC
and TN cluster are very good candidates for NPE events. This is
based on the fact that only one of the ten individuals that
are closely related does not have the surname Casey. When the
vast majority of submissions that are closely related have the
same surname, it would be much less likely that other surnames
randomly crossed genetic paths (known as overlapping haplotypes).
For the Irish Cluster, the opposite is true. The Ysearch database
shows that there are 23 individuals within five mutation points of the baseline
for the Irish Cluster (I used mutation point mu2 from the 37 marker
cladogram for the oldest ancestor of this group). Only two
out of 23 submissions in this listing were Caseys. For this cluster's
haplogroup (the very common R1b1), overlapping haplotypes
are not unusual. There are two Casey, two Forbes, two Butler,
two McGraw and a myriad of other surnames: Harvey, McLain, Ramsey,
Brooks, Peppers, Hart, Bryan, Anderson, Hogan, Cummings, Iron,
Crow, Blair and McGrath. This implies that most of these
haplotypes are probably not related via NPE scenarios but are
most likely overlapping haplotypes of a common haplogroup (R1b1).
This will make sorting out NPEs in this cluster much more
difficult. Another interesting piece of information
from this Ysearch database are the origins of these individuals:
Ireland (9), Scotland (6), US (5) and Unknown (3). This implies
that our Casey origins may be: a) Scotland ancestors who migrated
to Ireland then to the US (this may explain why most of the
Casey lines are not Catholic); b) Some Irish lines migrated
to Scotland (probably a much lower probability but possible).
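The contrast between the two clusters can be put into percentages with a short calculation. This is only a sketch using the counts quoted above (nine Caseys out of ten in the SC and TN cluster, two Caseys out of 23 Ysearch matches for the Irish Cluster, and the reported origins); the variable and function names are hypothetical.

```python
# Counts quoted in the text above.
sc_tn_total, sc_tn_casey = 10, 9      # one of ten close matches is not a Casey
irish_total, irish_casey = 23, 2      # two of 23 Ysearch matches are Caseys
origins = {"Ireland": 9, "Scotland": 6, "US": 5, "Unknown": 3}

def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Share of close matches that carry the Casey surname in each cluster.
sc_tn_share = share(sc_tn_casey, sc_tn_total)
irish_share = share(irish_casey, irish_total)

# Share of the Irish Cluster matches reporting each origin.
origin_shares = {place: share(count, irish_total) for place, count in origins.items()}

print("SC & TN cluster Casey share:", sc_tn_share, "%")
print("Irish Cluster Casey share:", irish_share, "%")
print(origin_shares)
```

A cluster where almost every close match shares the surname points toward NPEs for the few exceptions, while a cluster where fewer than one in ten matches shares the surname fits the overlapping haplotype explanation better.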
There are three reasons for genetically diverse events that
seem to be evolving: 1) Non paternity events (NPEs) are
where real families and biology diverge. These are adoptions and
out of wedlock children. I contend that the Hanvey line
is the only valid candidate at this point in time. Only the
SC and TN cluster will have a higher probability of NPE events
at this point in time. 2) A rarely covered topic is
nature's ability for different genetic lines to
randomly cross paths (overlapping haplotypes).
The Irish Cluster seems to exhibit this variation.
When overlapping haplotypes exist, it will be much harder
to determine NPE events as haplotypes will not always be
related in this scenario. 3) Another topic that was not well
understood before DNA is that having the same surname does
not imply any genetic relationship as we once thought. Only
very uncommon surnames will have one common ancestor. More
common names can have ten to one hundred unique
unrelated ancestors that only shared interest
in the same surname when surnames were first
used by most Europeans only a short genetic time ago. These
three issues can be confused for each other as well and not
correctly identifying the sources of these issues can lead
to incorrect conclusions. For the Irish Cluster, upgrading
to 67 markers may provide more uniqueness between these
overlapping haplotypes. However, upgrades to 67 markers
for the SC & TN Cluster have not proven very fruitful to date.
Non paternity events are bound to happen as young adults can
die prematurely (adoption) or can have temporary biological
relationships that do not result in formal legal relationships.
There are a lot of variations of these issues that can have
many different results. For instance, even adoptions can really
lead us astray as it is pretty common for the extended family
(same surname) to adopt children of cousins and nephews of young
adults who die prematurely or are not able to take on the
responsibility of child raising. These closely related adoptions
could be very difficult to sort out. Non paternity events can
happen in two directions as well. Casey boys can be adopted
by non-Casey families (other surnames now truly have Casey DNA
being passed down via Y chromosomes). Also, non-Casey boys
can be adopted into Casey lines and we can get some pretty
diverse DNA being introduced into the Casey surname pool
(when in reality their DNA biologically belongs to another
surname). This really allows two versions of genealogical
ancestry to emerge (familial and biological - both are
important). If a male infant was adopted and raised by a
Casey family, the Casey environmental influence
on this child could be stronger than the biological influence
due to DNA. Researching two different ancestral trees will
become more common with the aid of DNA documentation.
I think that the documentation covering overlapping haplotypes
is not widely understood to date and has been avoided as it
is difficult to understand (and prove with accuracy). With
only 37 and 67 markers available and most markers with only
four or five common variations each, you just do not have
enough markers to avoid distantly related individuals
randomly mutating back toward one another across various
lines (whether with the same surname or different surnames).
For the SC & TN Casey lines, these Casey lines appear
to have unique haplotypes, which gives these lines
a DNA fingerprint (until new submissions prove
otherwise). However, some Casey lines clearly have much
closer matches to other surnames which could be NPE events
of other surnames (most likely scenario). Some may have
changed their surname to Casey for various reasons and
some may have not been related when our ancestors first
started using surnames. Who gets to lay claim to the Casey
surname? No one, of course. Unfortunately, clusters with
lots of Casey descendants could bias us to believe that
one group may claim the Casey surname over others
(we should avoid this bias at all costs). I really am
warming up to the idea that we now should sort out our various
Casey lines into many different genetic buckets. We already
have done this with our paper research by sorting out various
Casey lines by geography. I have known for some time that
my focus should only be on Casey lines that have ties
to SC and TN and DNA has proven this to be the correct
strategy. Actually, I had given up hope on SC lines,
but DNA has significantly revived my interest in these lines.
Until the availability of DNA submissions to analyze, I had
a very inaccurate assumption concerning the exact relationship
of other Casey lines to my Casey line. I also assumed that
Casey was an Irish name and that all Casey lines (except
for NPE lines) would have a common ancestor back in the early
days of Ireland. However, the genetic distance between many
of the Casey lines has proven this not to be the case.
I am now beginning to believe that there were 10 to 100 unique
individuals that first used the Casey surname. Surnames were
generally forced upon our ancestors by governments and early
rulers in order to collect taxes and raise armies. With
surnames like Smith or Brooks, the original assignment of
surname was driven by trade or geography. For the first
Irish individuals that were told to start using a surname,
these individuals did not all get unique surnames and
it is likely that geography played a significant role or that
the clan that they belonged to played a large role. It is
possible that their vocation could have played a role as well
(clan leaders probably had unique names as may have soldiers,
farmers and other vocations). The Casey surname has a
military meaning (it means dart-armed chief in battle).
Dart refers to knives (shorter than swords).
The surname Casey originated from the Gaelic name
"O'Cathasaigh" around 1,000 years ago. The original meaning of
the name was "dart-armed chief in battle" which
implies our ancestors were probably soldiers. After the
Anglo-Norman invasion, the name was "anglicized" to
O'Casey and by the 1300's it was further "anglicized"
to its present form of Casey. During the introduction
of surnames, life was very brutal with constant warfare
between neighboring clans. These clans became larger and
larger in order to survive. These conflicts left many
orphan sons who were adopted and raised by others on both
sides. Other clan members would regularly adopt the
young sons of fallen comrades and these sons may have
taken on new surnames if they were very young. When large
conflicts resulted in expanded control by one side,
other clans adopted orphan children of the defeated
side. Also, there were many orphans left because of
widespread outbreaks of diseases and food shortages.
Potato blight outbreaks caused severe food shortages
and starvation for many Irish families. Even in these
turbulent times, diseases and accidents resulted in
the need to adopt orphaned sons and probably introduced
many DNA varieties to the Casey surname.
Without any doubt, only two clusters have a very high
probability of being genetically connected.
At this point in time, only the SC & TN Casey cluster
and Cluster 1 (recent Irish connections)
have any possibility of being genetically connected.
Even these two clusters are not "genealogically"
significant to each other but may be "genetically"
significant to each other. Cluster 2 has only a small
chance of being "genetically" connected to other
clusters / groupings and its connection is so distant
that it will not be "genealogically" significant to current
Casey researchers (Cluster 2 will not be tied to other
clusters any time soon). Cluster 3 is now known not to be
genetically connected to other clusters / groupings
of Casey lines. The lines in Cluster 3 can not have
either "genealogically" or "genetically" significant
connections to other clusters or groupings.
However, do not rule out those very low odds
connections, your line may be in those lower
two percent that had many more mutations than
other lines. There easily could be ten to
one hundred genetically distinct Casey lines.
Only the addition of many more random Casey
DNA submissions will start to reveal
the actual number of "genetically" distinct
lines. With around 1,000 years and 40
generations, most are probably due to NPE
events over this long time period. The Casey
surname was probably started by around twenty unrelated
individuals and was supplemented by seventy NPE
events spread out over literally 1,000 years.
You might also be able to add another ten to twenty
name changes to Casey as well. It is very possible
to have ten to one hundred different
genetic lines of the Casey surname. We have
at least five genetically unique clusters to date
out of only 21 submissions.
Call for better documentation
Raw DNA data without traditional genealogical research is not very
useful. It is critical to have both the DNA marker sets and known
information about the ancestry of these DNA submissions. The Pace
DNA web site (one of my ancestors) has an excellent web page
dedicated to providing significant genealogical information
known about all of their DNA submitters. This information is
conveniently made available to anyone interested and saves
many people the redundant effort of gathering what they know about
these submissions for their own personal analysis.
Pace DNA web site's ancestry listings for submittors
The attached summary of ancestral listings is based on my
knowledge of these lines and what is readily documented on
the Internet. I may have made a couple of incorrect assumptions
or have not included complete ancestries as my knowledge
is quite limited on Casey lines outside the SC and TN cluster.
I welcome additions and corrections to this listing as well
as comments as to format and content. As I have more time,
I will attempt to add more from emails and other web sites.
Ancestries of Casey DNA submissions
Please give me some feedback
I am an amateur DNA researcher, so I fully expect to be
corrected on some of my conclusions but I will take the risk
to be the first to publish an analysis of our Casey DNA
submissions. For anyone wanting to analyze the Casey DNA
submissions that we currently have, the only practical method
is to use cladogram software which graphically displays the
connections between the various DNA submissions. Also, the use
of MRCA (Most Recent Common Ancestor) calculators is also
critical in determining the closeness of genetic relationships.
Without these tools, it is extremely difficult to manually
extract this information. I am very new with DNA analysis
for genealogy and would appreciate comments on
this DNA analysis. This analysis takes a lot of time
and I would appreciate feedback on where you think I am going off
track, where you think my analysis is on target, what information
is not found that should be included, etc. There is quite a bit
of redundancy in this analysis and I will try to reduce this
in future updates to this analysis.
Please send your comments by email, letter or phone:
E-mail (new) ___________
______________________ email address changed to image to reduce my spam email
Snail mail______________ Robert B. Casey, 4705 Eby Lane, Austin, TX 78731-4507
Phone (home)__________ (512) 371-0579 (nights and weekends only)