Monday, 20 August 2012

Genomic prediction: no bull

Posted on 06:13 by Unknown
This Atlantic article discusses the application of genomic prediction to cattle breeding. Breeders have recently started switching from pedigree-based methods to statistical models incorporating SNP genotypes. We can now make good predictions of phenotypes like milk and meat production using genetic data alone. Cows are easier than people because, as domesticated animals, they have a smaller effective breeding population and less genetic diversity. Nevertheless, I expect very similar methods to be applied to humans within the next 5-10 years.
The Atlantic: ... the semen that Badger-Bluff Fanny Freddie produces has become such a hot commodity in what one artificial-insemination company calls "today's fast paced cattle semen market." In January of 2009, before he had a single daughter producing milk, the United States Department of Agriculture took a look at his lineage and more than 50,000 markers on his genome and declared him the best bull in the land. And, three years and 346 milk- and data-providing daughters later, it turns out that they were right.
"When Freddie [as he is known] had no daughter records our equations predicted from his DNA that he would be the best bull," USDA research geneticist Paul VanRaden emailed me with a detectable hint of pride. "Now he is the best progeny tested bull (as predicted)." 
Data-driven predictions are responsible for a massive transformation of America's dairy cows. While other industries are just catching on to this whole "big data" thing, the animal sciences -- and dairy breeding in particular -- have been using large amounts of data since long before VanRaden was calculating the outsized genetic impact of the most sought-after bulls with a pencil and paper in the 1980s. 
Dairy breeding is perfect for quantitative analysis. Pedigree records have been assiduously kept; relatively easy artificial insemination has helped centralize genetic information in a small number of key bulls since the 1960s; there are a relatively small and easily measurable number of traits -- milk production, fat in the milk, protein in the milk, longevity, udder quality -- that breeders want to optimize; each cow works for three or four years, which means that farmers invest thousands of dollars into each animal, so it's worth it to get the best semen money can buy. The economics push breeders to use the genetics.
The bull market (heh) can be reduced to one key statistic, lifetime net merit, though there are many nuances that the single number cannot capture. Net merit denotes the likely additive value of a bull's genetics. The number is actually denominated in dollars because it is an estimate of how much a bull's genetic material will likely improve the revenue from a given cow. A very complicated equation weights all of the factors that go into dairy breeding and -- voila -- you come out with this single number. For example, a bull that could help a cow make an extra 1000 pounds of milk over her lifetime only gets an increase of $1 in net merit while a bull who will help that same cow produce a pound more protein will get $3.41 more in net merit. An increase of a single month of predicted productive life yields $35 more. 
When you add it all up, Badger-Bluff Fanny Freddie has a net merit of $792. No other proven sire ranks above $750 and only seven bulls in the country rank above $700.
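To make the net merit arithmetic concrete, here is a minimal sketch that treats the index as a weighted sum, using only the three dollar weights quoted above. The real formula covers many more traits and is far more involved; the function name, the trait keys, and the example bull's trait gains are all made up for illustration.

```python
# Dollar weights taken from the figures quoted in the article.
# The actual index has many more traits; this is only a toy version.
WEIGHTS_USD = {
    "milk_lb": 1.0 / 1000.0,         # $1 per extra 1000 lb of lifetime milk
    "protein_lb": 3.41,              # $3.41 per extra pound of protein
    "productive_life_months": 35.0,  # $35 per extra month of productive life
}


def net_merit(trait_gains):
    """Sum each predicted trait gain multiplied by its dollar weight."""
    return sum(WEIGHTS_USD[trait] * gain for trait, gain in trait_gains.items())


# Hypothetical bull: these trait gains are invented inputs, not real data.
example_bull = {
    "milk_lb": 1000.0,
    "protein_lb": 10.0,
    "productive_life_months": 2.0,
}

print(round(net_merit(example_bull), 2))
```

Even in this toy version, an extra month of productive life outweighs a large gain in raw milk volume, which is the point the quoted weights are making.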
See below -- theoretical calculations suggest that even outliers with net merit of $700-800 will be eclipsed by specimens with 10x higher merit that can be produced by further selection on existing genetic variation. Similar results apply to humans.
... It turned out they were in the perfect spot to look for statistical rules. They had databases of old and new bull semen. They had old and new production data. In essence, it wasn't that difficult to generate rules for transforming genomic data into real-world predictions. Despite -- or because of -- the effectiveness of traditional breeding techniques, molecular biology has been applied in the field for years in different ways. Given that breeders were trying to discover bulls' hidden genetic profiles by evaluating the traits in their offspring that could be measured, it just made sense to start generating direct data about the animals' genomes.
"Each of the bulls on the sire list, we have 50,000 genetic markers. Most of those, we have 700,000," the USDA's VanRaden said. "Every month we get another 12,000 new calves, the DNA readings come in and we send the predictions out. We have a total of 200,000 animals with DNA analysis. That's why it's been so easy. We had such a good phenotype file and we had DNA stored on all these bulls."
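The "rules for transforming genomic data into real-world predictions" can be pictured as a regression of measured phenotypes on marker genotypes. Below is a minimal sketch, assuming ridge regression on simulated 0/1/2 SNP codes. It is not the USDA's actual model, and every name, size, and parameter in it (`simulate`, `fit_ridge`, 10 markers, `lam=1.0`) is invented for illustration.

```python
import random


def genetic_value(row, effects):
    """Additive value: genotype code (0/1/2) times marker effect, summed."""
    return sum(g * b for g, b in zip(row, effects))


def simulate(n_animals, effects, noise_sd, rng):
    """Random 0/1/2 genotypes; phenotype = genetic value + Gaussian noise."""
    genotypes = [[rng.choice((0, 1, 2)) for _ in effects] for _ in range(n_animals)]
    phenotypes = [genetic_value(row, effects) + rng.gauss(0.0, noise_sd)
                  for row in genotypes]
    return genotypes, phenotypes


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ridge(genotypes, phenotypes, lam):
    """Centre the data, then solve (X'X + lam*I) b = X'y for marker effects."""
    n, p = len(genotypes), len(genotypes[0])
    col_mean = [sum(row[j] for row in genotypes) / n for j in range(p)]
    y_mean = sum(phenotypes) / n
    x = [[row[j] - col_mean[j] for j in range(p)] for row in genotypes]
    y = [v - y_mean for v in phenotypes]
    xtx = [[sum(r[i] * r[j] for r in x) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(x, y)) for j in range(p)]
    return col_mean, y_mean, solve(xtx, xty)


def predict(model, row):
    """Predicted phenotype for one animal from its genotype alone."""
    col_mean, y_mean, effects = model
    return y_mean + sum((g - m) * b for g, m, b in zip(row, col_mean, effects))


def correlation(u, v):
    """Pearson correlation between two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


rng = random.Random(0)
true_effects = [rng.gauss(0.0, 1.0) for _ in range(10)]   # simulated, not real
train_g, train_y = simulate(200, true_effects, 1.0, rng)  # animals with records
test_g, _ = simulate(50, true_effects, 1.0, rng)          # "genomic" animals

model = fit_ridge(train_g, train_y, lam=1.0)
predictions = [predict(model, row) for row in test_g]
true_values = [genetic_value(row, true_effects) for row in test_g]
print(round(correlation(predictions, true_values), 3))
```

The toy problem has far more animals than markers, so the estimates are easy. With 50,000 or more markers per animal the shrinkage term `lam` does much more of the work, which is why large phenotype files like the one VanRaden describes matter so much.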
... Nowadays breeders can choose between "genomic bulls," which have been evaluated based purely on their genes and "proven bulls," for which real world data is available. Discussions among dairy breeders show that many are beginning to mix in younger bulls with good-looking genomic data into the breeding regimens. How well has it gone? The first of the bulls who were bred from their genetic profiles alone are receiving their initial production data. So far, it seems as if the genomic estimates were a little high, but more accurate than traditional methods alone.
The unique dataset and success of dairy breeders now has other scientists sniffing around their findings. Leonid Kruglyak, a genomics professor at Princeton, told me that "a lot of the statistical techniques and methodology" that connect phenotype and genotype were developed by animal breeders. In a sense, they are like codebreakers. If you know the rules of encoding, it's not difficult to put information in one end and have it pop out the other as a code. But if you're starting with the code, that's a brutally difficult problem. And it's the one that dairy geneticists have been working on.
(Kruglyak was a graduate student in biophysics at Berkeley under Bill Bialek when I was there.)
... John Cole, yet another USDA animal improvement scientist, generated an estimate of the perfect bull by choosing the optimal observed genetic sequences and hypothetically combining them. He found that the optimal bull would have a net merit value of $7,515, which absolutely blows any current bull out of the water. In other words, we're nowhere near creating the perfect milk machine.  
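The "perfect bull" idea can be sketched as follows: for each chromosome, take the best haplotype observed anywhere in the population and assume the ideal animal carries two copies of it. This is a toy reading of the description above, not Cole's actual procedure. The doubling rule, the function names, and the dollar values are all assumptions invented for illustration.

```python
# Hypothetical data: animal -> chromosome -> (paternal, maternal) haplotype
# values in dollars. Real analyses cover all chromosomes and many animals.
animals = {
    "A": {1: (10, -5), 2: (3, 8)},
    "B": {1: (-2, 12), 2: (20, -1)},
}


def animal_value(haplotypes):
    """Total value of one animal: sum of both haplotypes on every chromosome."""
    return sum(pat + mat for pat, mat in haplotypes.values())


def ideal_value(population):
    """Best observed haplotype per chromosome, counted twice (assumed diploid ideal)."""
    chromosomes = next(iter(population.values())).keys()
    total = 0
    for chrom in chromosomes:
        best = max(max(animal[chrom]) for animal in population.values())
        total += 2 * best
    return total


best_observed = max(animal_value(h) for h in animals.values())
ideal = ideal_value(animals)
print(best_observed, ideal)
```

Because no single animal carries the best haplotype on every chromosome, the combined ideal can sit far above the best observed individual. That is the qualitative gap between Freddie's $792 and the $7,515 estimate.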
Here's a recent paper on the big data aspects of genomic selection applied to animal breeding.
Posted in algorithms, biology, genetic engineering, genetics, machine learning, statistics, technology