Reader Q&A--Statistical Inquiries on Nicking
Written by Byron Rogers | May 26, 2011
Alan Porter's recent post on the Galileo/Danehill nick inspired some lively discussion in the comments section. One of our most outspoken and analytical community members, known by the handle sceptre, posed a very good set of questions that we thought we'd address here in a new blog.
1. Consider two stallions, one has but 90+ named foals of which 16% are SWs. The second stallion has sired 14+ times as many foals as the other and 8% are SWs. All else equal (including age not a factor), how certain are you that you would prefer to breed to the first referenced stallion rather than the second? (speaks to the broader question of statistical significance).
Thanks for taking the time to post. Not to be smart, but your first question – regarding percentage of stakes winners to foals – is a multifactorial question and requires consideration of context to answer properly. For example, if Stallion A (90+ foals, 16% SW) stands in a regional market in the U.S. and achieves that statistic against only state-bred competition, and the other stallion (1,200+ foals, 8% SW) is standing in Ireland and duking it out with the best in Europe, then the stallion in Ireland may well be the better sire. Equally, if Stallion A began to attract better mares and proved that his progeny could make it outside of restricted company (for example, Unusual Heat in California), we might elect to go with Stallion A, all other things being equal.
Bearing down on your question though – is 90 named foals significant enough? Again, this depends on a number of factors. If 50 of those foals have started, they have made an average of eight starts each (this is the average number of starts required to win a race), and there are 14 stakes winners, then within the context above (i.e., that they were not all restricted state-bred stakes winners) we would say there is sufficient evidence to establish that this is a very good stallion. Incidentally, one of the sub-rules of calculation within TrueNicks is that the starters within the cross must collectively have enough starts to win a race before the cross is included in the calculation, which is something we think is really important when it comes to statistical significance.
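For readers who want to put a rough number on the sample-size side of the question, here is a minimal sketch that computes a 95% Wilson score interval for each stallion's stakes-winner rate. It is an illustration only and not part of TrueNicks. The counts are assumptions read off the question: 14 stakes winners from 90 foals for Stallion A and, taking "14+ times as many foals" as 1,260 foals at 8%, about 101 stakes winners for Stallion B. It also ignores the quality-of-competition context discussed above.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Return the (low, high) Wilson score interval for a proportion.

    z=1.96 corresponds to roughly 95% confidence.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Assumed counts, derived from the percentages in the question (not real data).
a_low, a_high = wilson_interval(14, 90)      # Stallion A: ~16% of 90 foals
b_low, b_high = wilson_interval(101, 1260)   # Stallion B: ~8% of 1,260 foals

print(f"Stallion A: {a_low:.1%} to {a_high:.1%}")
print(f"Stallion B: {b_low:.1%} to {b_high:.1%}")
```

The interval for the small book comes out several times wider than the one for the large book, which is the statistical-significance point in a nutshell: the 16% is a much less settled figure than the 8%.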
Of course with regard to the blog item that you were responding to, we were considering not the overall record of a stallion, but the record of a very specific cross (Galileo with mares by Danehill). That cross has 10 stakes winners from 62 starters (16%), and our conclusion would be that 62 starters bred on a cross provides sufficient data for statistically significant results, especially when the return can be contrasted with Galileo’s record with all other mares, and that of the Danehill mares with all other stallions.
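One way to frame that contrast, sketched here under stated assumptions, is to ask how likely 10 or more stakes winners from 62 starters would be if the cross were really performing at some baseline rate. The baseline rates below are hypothetical placeholders, not Galileo's or Danehill's actual figures with other mates, and the calculation is a plain binomial tail probability rather than anything TrueNicks itself computes.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


STAKES_WINNERS = 10   # from the post: stakes winners bred on the cross
STARTERS = 62         # from the post: starters bred on the cross

# Hypothetical baseline strike rates, for illustration only.
hypothetical_baselines = [0.05, 0.08, 0.10]

tails = {
    p: binomial_tail(STAKES_WINNERS, STARTERS, p) for p in hypothetical_baselines
}

for p, prob in tails.items():
    print(f"baseline {p:.0%}: P(at least 10 SW from 62 starters) = {prob:.4f}")
```

The lower the assumed baseline, the less plausible it is that 10 from 62 happened by chance, which is why the comparison with the sire's and broodmare sire's records elsewhere matters as much as the raw count.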
We are not sure if you meant to, but you do raise the interesting concept of 'quality' within a nick. A limitation of TrueNicks – as with all such programs – is that we do not alter the rating based on the quality of the horses produced on the nick; rather, we give the major winners (and a lot of other data) so that users can make an intelligent interpretation. Generally this poses little problem in terms of practical use, as breeders and buyers are likely to be comparing like with like, for example choosing between two sires standing in the same price range, rather than, say, comparing a B nick with A.P. Indy as the sire to an A+ nick by a $5,000 regionally-bred sire.
We have a current project in development where we are looking into ways of signifying the quality of not only the ancestors used in the mating (i.e., the racing/production ability of the sire, dam, and broodmare sire), but also the racing ability of all the horses bred on the nick (another advantage of having access to all the foals/starters bred on the cross!). This wouldn't alter the nick rating itself (which is based on the percentage of stakes winners to starters), but it could lead to describing cases where the nick may be a good one, say an A+, but the quality of the stakes winners bred on the cross, or of the parents, might be relatively modest...we are kicking around ideas here on how best to do this, including looking at some multivariate regression analysis of a number of figures outside of TrueNicks. We’ll keep our readers posted on the progress of that study.
2. How often, if at all, have you done retrospective analysis of situations such as that offered in your Galileo/Danehill vs Galileo/all others? I realize that by being "publicized"/"accepted" a nick may later somewhat "dilute" (potential overload of lesser quality mates by broodmare sire, etc.), but careful analysis can eliminate this variable.
We have looked at this quite a lot, and for a long time. If you go back and look at Alan's second post on the TrueNicks blog back in December of 2007, it is about Kingmambo and Sadler's Wells mares. This has been one that we have really copped some grief for because it is rated well by eNicks, but not by us. On TrueNicks, it has varied between a C and a very weak B for four years now (we have the original data that TrueNicks was developed from back in early 2007, and it was a C then). It is, however, one of those matings that people love to do (the inbreeding to Special seems to hold some mysticism!) and when it gets a good one it is a really good one, as evidenced by the six group/grade I winners bred on the cross. But it has been tried an awful lot, and the slow ones on the cross are really, really slow, i.e., some have become slow hurdlers, which is a task in itself to breed. Breeders like to forget the slow ones, but the TrueNicks algorithm doesn't! It is a question of the percentage of stakes winners bred on the cross being only a little above average for the sire and broodmare sire (Kingmambo and Sadler’s Wells, respectively), but the quality of the material used ensuring that when it does work, it often works very well.
A good example of a rating that went the other way under the pressure of significant numbers is that of Unbridled's Song with Storm Cat mares. This initially started quite well, and it was a solid B+ when Unbridled’s Song sired three stakes winners on the cross in his 2003 crop (Magnificent Song, Half Ours, and Noonmark), which followed on from the success of Buddha – a grade I winner bred on the cross – but it had only one stakes winner in his 2006 crop (current 5yo's) and it hasn't produced one since. If anything, in that case the 'quality' of the Storm Cat mares bred to Unbridled's Song in the years 2005, 2006, and 2007 (producing the current 5yo's, 4yo's, and 3yo's) was significantly better than that of the ones he received in his earlier years at stud, and the foals may be in even better hands/stables/care, if that is possible, so one would think that there would be more stakes winners bred on the cross, but there are not at this stage. It is a little perplexing at some level, but it is a good reflection of TrueNicks that it is making allowances for this rather than continuing to represent the cross as something that it is not. The rating now is a solid C, and there are over 120 foals of racing age bred on the cross.
These are just two examples, and there are others, such as More Than Ready with Danehill mares in Australia, that we continue to monitor. What is notable however is that in the case of Kingmambo/Sadler's Wells it went from a C to a weak B and Unbridled's Song/Storm Cat went from a B+ to a solid C. They both moved under the pressure of more foals/starters being bred on the cross at hand, but neither of them jumped to becoming an A+ rating or decayed significantly to becoming a D rating. They stayed pretty close to the statistical parameter that might have been expected from the initial rating.
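A little arithmetic helps show why a rating built on a reasonable number of starters tends to drift rather than jump. The sketch below blends the Galileo/Danehill record quoted earlier (10 stakes winners from 62 starters) with a hypothetical batch of future starters performing at a hypothetical strike rate. Both future figures are made-up inputs for illustration, and the code says nothing about where TrueNicks draws its letter-grade boundaries.

```python
def projected_rate(current_sw, current_starters, future_starters, future_rate):
    """Cumulative stakes-winner rate after adding a batch of future starters.

    future_rate is the assumed strike rate of the new starters (0.0 to 1.0).
    """
    if current_starters + future_starters <= 0:
        raise ValueError("need at least one starter in total")
    future_sw = future_rate * future_starters
    return (current_sw + future_sw) / (current_starters + future_starters)


current = projected_rate(10, 62, 0, 0.0)  # the record as it stands

# Hypothetical scenarios: 60 more starters at various assumed strike rates.
scenarios = {rate: projected_rate(10, 62, 60, rate) for rate in (0.08, 0.12, 0.16)}

print(f"current rate: {current:.1%}")
for rate, blended in scenarios.items():
    print(f"60 more starters at {rate:.0%}: cumulative rate {blended:.1%}")
```

Even if the next batch of starters performed at only half the current rate, the cumulative figure lands between the old rate and the new one, which fits the pattern of ratings moving a grade or so rather than collapsing.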
What makes Galileo/Danehill unique is that it is an A+ from 98 foals (62 starters). Kingmambo/Sadler's Wells is a B from 106 foals (82 starters) and Unbridled's Song/Storm Cat is a C from 126 foals (77 starters). Obviously we have answered the question of whether we feel that 98 foals and 62 starters is a statistically significant number – the answer is yes – but the more important question that you were alluding to is "under the weight of numbers to come, will Galileo/Danehill keep its A+ rating, or will the performance of Galileo with all other broodmare sires, and/or the performance of Danehill mares with other stallions, outperform those bred on the cross?" (phew!) That is a good question, and one that needs a little more solid research to answer definitively, but on what we have seen with Kingmambo/Sadler's Wells and Unbridled's Song/Storm Cat, it could have the potential to improve (although this looks a less likely scenario), stay around the same, or decay a little, but no more than down to a solid B+...how is that for a prognostication! Of course, to a degree, this does depend on how carefully the Danehill mares bred to Galileo are selected. If we take the Kingmambo/Sadler's Wells or Unbridled's Song/Storm Cat scenarios as examples, we suspect that one of the reasons for the deterioration in the strike rate of some of these very popular crosses is that they become a “default option” and are employed with insufficient consideration for the individual mares utilized. Another reason, and we have seen this a little with Speirbhean (dam of Teofilo), is that repeating a mating and getting full siblings that also become stakes winners is often not as easy as it seems. There are some mares that do very well with repeat matings, while others, either through environment (age of mare, etc.) or the variance of genetic inheritance, fail to repeat a successful mating with another stakes winner.
3. I would also like to see your "numbers" for only those who competed in Ireland, England, and France.
So would we! We can parse out the Galileo with Danehill mares easily enough, but separating out the "all others" would be a massive task that we would have to enlist The Jockey Club to help us with. Right now we have other projects in development (one is a research tool that will ultimately help us improve TrueNicks significantly, and there are three new products that we are in the process of programming), so a tool to do this type of analysis will have to wait...for now.
Thanks again for your comment, Sceptre. If you or anyone else has any follow-up questions, please feel free to post them below.