2024 journal article
Global genotype by environment prediction competition reveals that diverse modeling strategies can deliver satisfactory maize yield estimates
GENETICS.
Ed(s): M. Sillanpää
Abstract Predicting phenotypes from a combination of genetic and environmental factors is a grand challenge of modern biology. Slight improvements in this area have the potential to save lives, improve food and fuel security, permit better care of the planet, and create other positive outcomes. In 2022 and 2023, the first open-to-the-public Genomes to Fields initiative Genotype by Environment prediction competition was held using a large dataset including genomic variation, phenotype and weather measurements, and field management notes gathered by the project over 9 years. The competition attracted registrants from around the world with representation from academic, government, industry, and nonprofit institutions as well as unaffiliated. These participants came from diverse disciplines, including plant science, animal science, breeding, statistics, computational biology, and others. Some participants had no formal genetics or plant-related training, and some were just beginning their graduate education. The teams applied varied methods and strategies, providing a wealth of modeling knowledge based on a common dataset. The winner's strategy involved 2 models combining machine learning and traditional breeding tools: 1 model emphasized environment using features extracted by random forest, ridge regression, and least squares, and 1 focused on genetics. Other high-performing teams’ methods included quantitative genetics, machine learning/deep learning, mechanistic models, and model ensembles. The dataset factors used, such as genetics, weather, and management data, were also diverse, demonstrating that no single model or strategy is far superior to all others within the context of this competition.