The Biologist Is In: Tomatillo Breeding (4/n)

Friday, February 14, 2020

The last couple of posts looked at simulations of selection on a single gene, for either recessive or dominant alleles. As the number of genes actively under selection increases, the population takes longer and longer to converge.

Plot titled "Multiple recessive traits, large population", illustrating selection for a trait in an out-crossing population.

Plot titled "Multiple dominant traits, large population", illustrating selection for a trait in an out-crossing population. It takes more years for the trait of interest to reach saturation in the population.

The change in code needed to simulate multiple genetic loci is really simple if we assume the alleles we're selecting on sit far enough apart on the chromosomes (or on different chromosomes entirely). Such loci are referred to as "unlinked", which means the probability calculations for each are independent of the others.
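
As a minimal sketch of what that independence buys us: if p is the fraction of plants that are homozygous at one locus, then the fraction homozygous at all n unlinked loci is just p^n. The helper name `stacked_fraction` is my own for illustration and doesn't appear in the scripts; it also assumes every locus starts at the same frequency.

```r
# Fraction of plants homozygous at every one of n_loci unlinked loci,
# assuming each locus independently has the same per-locus fraction p.
stacked_fraction <- function(p, n_loci) {
  p^n_loci
}

# In an F2, 25% of plants are aa at any single locus.
f2 <- stacked_fraction(0.25, 1:4)
print(f2)
```

This is the same trick the plotting loops in the scripts use when they raise the single-locus curve to the power i.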

R Script 5: Multiple recessive traits, large population.

# Multiple recessive traits, large population.
#     Stabilize progeny for recessive traits via selection.
#     Save seeds from double-recessive plants each generation.
years <- 10;

# Define F2 population (genotype frequencies at a single locus).
P_AA <- 0.25;
P_Aa <- 0.50;
P_aa <- 0.25;

# Save seeds only from aa plants, unknown pollen donor. Iterate over years.
# Each term is P(seed parent) * P(pollen donor) * P(progeny genotype from that cross).
for(i in 1:years) {
  P_AA <- append(P_AA,   0);
  P_Aa <- append(P_Aa,   P_aa[i]*P_AA[i]*1.00 + P_aa[i]*P_Aa[i]*0.50);
  P_aa <- append(P_aa,   P_aa[i]*P_aa[i]*1.00 + P_aa[i]*P_Aa[i]*0.50);
  P_sum <- P_aa[i+1] + P_Aa[i+1];
  P_Aa[i+1] <- P_Aa[i+1]/P_sum;
  P_aa[i+1] <- P_aa[i+1]/P_sum;
}

# Make figure.
plot(  0:years, P_aa^1, col="red", main="Multiple recessive traits, large population.", xlab="Years", ylab="%aa pollen donors", xlim=c(0,years), ylim=c(0,1), axes=TRUE, frame.plot=TRUE);
lines(0:years, P_aa^1, col="red");
lines(0:years, 1-P_aa^1, col="blue", lty="dashed");
lines(c(0,years),c(0.95,0.95), col="black", lty="dotted");
lines(c(0,years),c(0.99,0.99), col="black", lty="dashed");

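# Overlay curves for 2 to 20 unlinked loci: the chance a pollen donor is
# double-recessive at all i loci is the single-locus value raised to the power i.
for(i in 2:20) {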
  lines(0:years, P_aa^i, col="red");
  lines(0:years, 1-P_aa^i, col="blue", lty="dashed");
}

R Script 6: Multiple dominant traits, large population.

# Multiple dominant traits, large population.
#     Stabilize progeny for dominant traits via selection.
#     Save seeds from dominant plants each generation.
years <- 10;

# Define F2 population (genotype frequencies at a single locus).
P_AA <- 0.25;
P_Aa <- 0.50;
P_aa <- 0.25;

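# Save seeds only from (AA and Aa) plants, unknown pollen donor. Iterate over years.
# Each term is P(seed parent) * P(pollen donor) * P(progeny genotype from that cross).
for(i in 1:years) {
  # The AA-by-Aa cross happens in both directions (AA or Aa as the seed parent),
  # so it contributes two terms to each of the lines below.
  P_AA <- append(P_AA,   P_AA[i]*P_AA[i]*1.00 + P_AA[i]*P_Aa[i]*0.50 + P_Aa[i]*P_AA[i]*0.50 + P_Aa[i]*P_Aa[i]*0.25);
  P_Aa <- append(P_Aa,   P_AA[i]*P_aa[i]*1.00 + P_AA[i]*P_Aa[i]*0.50 + P_Aa[i]*P_AA[i]*0.50 + P_Aa[i]*P_aa[i]*0.50 + P_Aa[i]*P_Aa[i]*0.50);
  # Recessive (aa) progeny are dropped here, i.e. treated as removed before they shed pollen.
  P_aa <- append(P_aa,   0);

  # Renormalize the remaining genotypes so they sum to 1.
  P_sum <- P_AA[i+1] + P_Aa[i+1];
  P_AA[i+1] <- P_AA[i+1]/P_sum;
  P_Aa[i+1] <- P_Aa[i+1]/P_sum;
}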

# Make figure.
plot(  0:years, P_AA, col="red", main="Multiple dominant traits, large population.", xlab="Years", ylab="%AA pollen donors", xlim=c(0,years), ylim=c(0,1), axes=TRUE, frame.plot=TRUE);
lines(0:years, P_AA, col="red");
lines(0:years, 1-P_AA, col="blue", lty="dashed");
lines(c(0,years),c(0.95,0.95), col="black", lty="dotted");
lines(c(0,years),c(0.99,0.99), col="black", lty="dashed");

for(i in 2:20) {
  lines(0:years, P_AA^i, col="red");
  lines(0:years, 1-P_AA^i, col="blue", lty="dashed");
}

The probability of an F2 plant having two copies of the recessive allele at every one of several genes drops off very quickly as we increase the number of genes: each additional unlinked gene multiplies it by another ¼. In a small population this low probability means we might not find an F2 with all the recessive alleles stacked up the way we might want. All is not lost.
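
To put rough numbers on that, here is a small sketch of how many F2 plants we'd have to grow to have a good chance of seeing at least one fully stacked plant. The function names and the 95% default are my own choices for illustration, and the calculation assumes unlinked genes and a plain 1:2:1 F2 at every locus.

```r
# Probability that a single F2 plant is double-recessive at all n_loci
# unlinked genes (each locus contributes a factor of 1/4).
p_stacked <- function(n_loci) 0.25^n_loci

# Smallest number of F2 plants N such that the chance of seeing at least
# one fully stacked plant is >= confidence: 1 - (1 - p)^N >= confidence.
plants_needed <- function(n_loci, confidence = 0.95) {
  p <- p_stacked(n_loci)
  ceiling(log(1 - confidence) / log(1 - p))
}

needed <- sapply(1:4, plants_needed)
print(needed)
```

For one gene this works out to 11 plants, and for two genes to 47, which is already more than many of us have garden space for.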

With our small F2 population, roughly a quarter of the plants would be expected to be double-recessive for the first gene of interest.

25% AA; 50% Aa; 25% aa

If we were unlucky and couldn't find a single plant that was also double-recessive for the second gene of interest, we can go ahead with plants showing the dominant trait for that second gene. Two thirds of the plants showing the dominant trait for the second gene are expected to be heterozygous, carrying one copy of the recessive allele.

aaB_ (⅓BB; ⅔Bb)

In the next generation we have pretty good odds of recovering that second recessive trait that we were looking for. This way we can progressively collect multiple recessive traits without finding them all in that first F2 generation. With this strategy, we need to keep seeds from prior generations. If we can't recover that next recessive trait in the next year, then the plants we saved seed from happened to be homozygous for the dominant allele rather than heterozygous. We then need to grow more plants from the previous generation, to try to find some carrying a copy of the recessive allele.
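
Here is a rough sketch of those odds. It assumes the saved aaB_ plants come straight from the F2 (so ⅓ BB and ⅔ Bb) and that they inter-cross at random among themselves; the variable and function names are mine, just for illustration.

```r
# Among F2 plants showing the dominant trait (B_), 1/3 are BB and 2/3 are Bb.
p_Bb <- 2/3

# Chance that none of k saved B_ plants carries the recessive allele b,
# which is the unlucky case where we have to go back to older seed.
p_no_carrier <- function(k) (1 - p_Bb)^k

# If many B_ plants inter-cross at random, the frequency of b among their
# gametes is half the heterozygote fraction, and bb progeny appear at q^2.
q_b <- p_Bb / 2
p_bb_progeny <- q_b^2

print(p_no_carrier(1:5))
print(p_bb_progeny)
```

Under those assumptions about one progeny plant in nine should show the second recessive trait, and the chance of having saved no carriers at all shrinks quickly as we keep more plants.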

With plants that typically self-pollinate (like peppers and tomatoes), it can be pretty simple to intentionally remove recessive alleles for genes of interest. If you grow out the seeds produced by a plant and find any double-recessive progeny, you know that plant was heterozygous. If you grow enough seeds and don't find any double-recessive progeny, you can be pretty confident that the plant is homozygous for the dominant allele.
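
How many seeds is "enough"? A selfed Aa plant gives aa progeny a quarter of the time, so the chance of it slipping through n progeny unnoticed is 0.75^n. The sketch below turns that into a progeny count; the function names and the 5% and 1% thresholds are my own illustrative choices.

```r
# Chance that a selfed heterozygous (Aa) parent produces n progeny with
# no double-recessive (aa) plant among them: each progeny shows the
# dominant trait with probability 3/4.
p_miss_selfed <- function(n) 0.75^n

# Smallest progeny count that keeps the chance of missing a hidden
# recessive allele at or below alpha.
progeny_needed_selfed <- function(alpha = 0.05) {
  ceiling(log(alpha) / log(0.75))
}

n95 <- progeny_needed_selfed(0.05)
n99 <- progeny_needed_selfed(0.01)
print(c(n95, n99))
```

So growing out 11 progeny catches a heterozygous parent about 95% of the time, and 17 progeny gets that to about 99%.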

With plants that can't self-pollinate (like tomatillos), it can take more work and time. Let's say we have one plant that is showing the dominant trait. If we cross it with a plant showing the recessive trait, the resulting progeny will tell us if that first plant is "AA" or "Aa". If all the progeny show the dominant trait, then the plant we were testing is most likely "AA". If the progeny show a mix of dominant and recessive traits, then the plant we were testing is "Aa" (and can be discarded). This is called a "test-cross" because it is used to test the genetics of a specific individual, even though we have no interest in using the resulting progeny for further breeding work.
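
A test-cross never gives absolute certainty, only growing confidence. As a hedged sketch, here is the probability that the tested plant really is AA after n test-cross progeny have all shown the dominant trait. The function name and the prior of ⅓ are assumptions of mine (the prior fits a dominant-looking plant picked from an F2); a plant from a later generation would need a different prior.

```r
# Posterior probability that a dominant-phenotype plant is AA after a
# test-cross to an aa plant yields n progeny, all showing the dominant trait.
# prior_AA = 1/3 assumes the plant came from an F2 (1 AA : 2 Aa among A_).
posterior_AA <- function(n, prior_AA = 1/3) {
  like_AA <- 1       # AA x aa never gives recessive progeny
  like_Aa <- 0.5^n   # Aa x aa gives a dominant progeny half the time
  prior_AA * like_AA / (prior_AA * like_AA + (1 - prior_AA) * like_Aa)
}

post <- posterior_AA(c(0, 5, 10))
print(post)
```

With no progeny we're stuck at the prior of ⅓; five all-dominant progeny already push that to 16/17, and ten make a hidden recessive allele very unlikely.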

Since tomatillos can be kept alive over several years, you can use such test-crosses to progressively collect multiple plants with just the dominant alleles for your genes of interest. Once you have a few such plants, you can then allow them to inter-cross and be confident you won't have the recessive allele turning up in the next generations.
