Person: Pérez-Rodríguez, P.
Last Name
Pérez-Rodríguez
First Name
P.
Name
Pérez-Rodríguez, P.
ORCID ID
0000-0002-3202-1784
9 results
Search Results
Now showing 1 - 9 of 9
- Bayesian genomic-enabled prediction as an inverse problem (Genetics Society of America, 2014) Cuevas, J.; Perez-Elizalde, S.; Soberanis, V.; Perez-Rodriguez, P.; Gianola, D.; Crossa, J.
- Genomic prediction with genotype by environment interaction analysis for kernel zinc concentration in tropical maize germplasm (Genetics Society of America, 2020) Mageto, E.; Crossa, J.; Pérez-Rodríguez, P.; Dhliwayo, T.; Palacios-Rojas, N.; Lee, M.; Rui Guo; San Vicente Garcia, F.M.; Xuecai Zhang; Hindu, V.
- An R Package for Bayesian analysis of multi-environment and multi-trait multi-environment data for genome-based prediction (Genetics Society of America, 2019) Montesinos-Lopez, O.A.; Montesinos-López, A.; Luna Vázquez, F.J.; Toledo, F.H.; Pérez-Rodríguez, P.; Lillemo, M.; Crossa, J.
  Evidence that genomic selection (GS) is a technology that is revolutionizing plant breeding continues to grow. However, it is very well documented that its success strongly depends on statistical models, which are used by GS to perform predictions of candidate genotypes that were not phenotyped. Because there is no universally better model for prediction and models for each type of response variable are needed (continuous, binary, ordinal, count, etc.), an active area of research aims to develop statistical models for the prediction of univariate and multivariate traits in GS. However, most of the models developed so far are for univariate and continuous (Gaussian) traits. Therefore, to overcome the lack of multivariate statistical models for genome-based prediction by improving the original version of the BMTME, we propose an improved Bayesian multi-trait and multi-environment (BMTME) R package for analyzing breeding data with multiple traits and multiple environments. We also introduce Bayesian multi-output regressor stacking (BMORS) functions that are considerably efficient in terms of computational resources. The package allows parameter estimation and evaluates the prediction performance of multi-trait and multi-environment data in a reliable, efficient and user-friendly way. We illustrate the use of the BMTME with real toy datasets to show all the facilities that the software offers the user. However, for large datasets, the BME() and BMTME() functions of the BMTME R package are very intense in terms of computing time; on the other hand, less intensive computing is required with BMORS functions BMORS() and BMORS_Env() that are also included in the BMTME package.
- A Bayesian decision theory approach for genomic selection (Genetics Society of America, 2018) Villar-Hernández, B.d.J.; Pérez-Elizalde, S.; Crossa, J.; Pérez-Rodríguez, P.; Toledo, F.H.; Burgueño, J.
  Plant and animal breeders are interested in selecting the best individuals from a candidate set for the next breeding cycle. In this paper, we propose a formal method under the Bayesian decision theory framework to tackle the selection problem based on genomic selection (GS) in single- and multi-trait settings. We proposed and tested three univariate loss functions (Kullback-Leibler, KL; Continuous Ranked Probability Score, CRPS; Linear-Linear loss, LinLin) and their corresponding multivariate generalizations (Kullback-Leibler, KL; Energy Score, EnergyS; and the Multivariate Asymmetric Loss Function, MALF). We derived and expressed all the loss functions in terms of heritability and tested them on a real wheat dataset for one cycle of selection and in a simulated selection program. The performance of each univariate loss function was compared with the standard method of selection (Std) that does not use loss functions. We compared the performance in terms of the selection response and the decrease in the population's genetic variance during recurrent breeding cycles. Results suggest that it is possible to obtain better performance in a long-term breeding program using the single-trait scheme by selecting 30% of the best individuals in each cycle but not by selecting 10% of the best individuals. For the multi-trait approach, results show that the population mean for all traits under consideration had positive gains, even though two of the traits were negatively correlated. The corresponding population variances were not statistically different from the different loss function during the 10th selection cycle. Using the loss function should be a useful criterion when selecting the candidates for selection for the next breeding cycle.
- A Bayesian genomic regression model with skew normal random errors (Genetics Society of America, 2018) Pérez-Rodríguez, P.; Acosta-Pech, R.; Pérez-Elizalde, S.; Velasco Cruz, C.; Suarez Espinosa, J.; Crossa, J.
  Genomic selection (GS) has become a tool for selecting candidates in plant and animal breeding programs. In the case of quantitative traits, it is common to assume that the distribution of the response variable can be approximated by a normal distribution. However, it is known that the selection process leads to skewed distributions. There is vast statistical literature on skewed distributions, but the skew normal distribution is of particular interest in this research. This distribution includes a third parameter that drives the skewness, so that it generalizes the normal distribution. We propose an extension of the Bayesian whole-genome regression to skew normal distribution data in the context of GS applications, where usually the number of predictors vastly exceeds the sample size. However, it can also be applied when the number of predictors is smaller than the sample size. We used a stochastic representation of a skew normal random variable, which allows the implementation of standard Markov Chain Monte Carlo (MCMC) techniques to efficiently fit the proposed model. The predictive ability and goodness of fit of the proposed model were evaluated using simulated and real data, and the results were compared to those obtained by the Bayesian Ridge Regression model. Results indicate that the proposed model has a better fit and is as good as the conventional Bayesian Ridge Regression model for prediction, based on the DIC criterion and cross-validation, respectively. A computing program coded in the R statistical package and C programming language to fit the proposed model is available as supplementary material.
- Genomic-enabled prediction in maize using kernel models with genotype x environment interaction (Genetics Society of America, 2017) Bandeira e Sousa, M.; Cuevas, J.; Couto, E.; Pérez-Rodríguez, P.; Jarquin, D.; Fritsche-Neto, R.; Burgueño, J.; Crossa, J.
  Multi-environment trials are routinely conducted in plant breeding to select candidates for the next selection cycle. In this study, we compare the prediction accuracy of four developed genomic-enabled prediction models: (1) single-environment, main genotypic effect model (SM); (2) multi-environment, main genotypic effects model (MM); (3) multi-environment, single variance G×E deviation model (MDs); and (4) multi-environment, environment-specific variance G×E deviation model (MDe). Each of these four models was fitted using two kernel methods: a linear kernel, the Genomic Best Linear Unbiased Predictor, GBLUP (GB), and a nonlinear kernel, the Gaussian kernel (GK). The eight model-method combinations were applied to two extensive Brazilian maize data sets (HEL and USP data sets), having different numbers of maize hybrids evaluated in different environments for grain yield (GY), plant height (PH), and ear height (EH). Results show that the MDe and the MDs models fitted with the Gaussian kernel (MDe-GK and MDs-GK) had the highest prediction accuracy. For GY in the HEL data set, the increase in prediction accuracy of SM-GK over SM-GB ranged from 9 to 32%. For the MM, MDs, and MDe models, the increase in prediction accuracy of GK over GB ranged from 9 to 49%. For GY in the USP data set, the increase in prediction accuracy of SM-GK over SM-GB ranged from 0 to 7%. For the MM, MDs, and MDe models, the increase in prediction accuracy of GK over GB ranged from 34 to 70%. For traits PH and EH, gains in prediction accuracy of models with GK compared to models with GB were smaller than those achieved in GY. Also, these gains in prediction accuracy decreased when a more difficult prediction problem was studied.
- Bayesian genomic prediction with genotype x environment interaction kernel models (Genetics Society of America, 2017) Cuevas, J.; Crossa, J.; Montesinos-Lopez, O.A.; Burgueño, J.; Pérez-Rodríguez, P.; De Los Campos, G.
  The phenomenon of genotype × environment (G × E) interaction in plant breeding decreases selection accuracy, thereby negatively affecting genetic gains. Several genomic prediction models incorporating G × E have been recently developed and used in genomic selection of plant breeding programs. Genomic prediction models for assessing multi-environment G × E interaction are extensions of a single-environment model, and have advantages and limitations. In this study, we propose two multi-environment Bayesian genomic models: the first model considers genetic effects (u) that can be assessed by the Kronecker product of variance–covariance matrices of genetic correlations between environments and genomic kernels through markers under two linear kernel methods, linear (genomic best linear unbiased predictors, GBLUP) and Gaussian (Gaussian kernel, GK). The other model has the same genetic component as the first model (u) plus an extra component, f, that captures random effects between environments that were not captured by the random effects u. We used five CIMMYT data sets (one maize and four wheat) that were previously used in different studies. Results show that models with G × E always have superior prediction ability than single-environment models, and the higher prediction ability of multi-environment models with u and f over the multi-environment model with only u occurred 85% of the time with GBLUP and 45% of the time with GK across the five data sets. The latter result indicated that including the random effect f is still beneficial for increasing prediction ability after adjusting by the random effect u.
- Genomic prediction in maize breeding populations with genotyping-by-sequencing (Genetics Society of America, 2013) Crossa, J.; Beyene, Y.; Semagn, K.; Pérez-Rodríguez, P.; Hickey, J.; Charles Chen; De Los Campos, G.; Burgueño, J.; Windhausen, V.S.; Buckler, E.; Jannink, J.L.; Lopez-Cruz, M.; Babu, R.
  Genotyping-by-sequencing (GBS) technologies have proven capacity for delivering large numbers of marker genotypes with potentially less ascertainment bias than standard single nucleotide polymorphism (SNP) arrays. Therefore, GBS has become an attractive alternative technology for genomic selection. However, the use of GBS data poses important challenges, and the accuracy of genomic prediction using GBS is currently undergoing investigation in several crops, including maize, wheat, and cassava. The main objective of this study was to evaluate various methods for incorporating GBS information and compare them with pedigree models for predicting genetic values of lines from two maize populations evaluated for different traits measured in different environments (experiments 1 and 2). Given that GBS data come with a large percentage of uncalled genotypes, we evaluated methods using nonimputed, imputed, and GBS-inferred haplotypes of different lengths (short or long). GBS and pedigree data were incorporated into statistical models using either the genomic best linear unbiased predictors (GBLUP) or the reproducing kernel Hilbert spaces (RKHS) regressions, and prediction accuracy was quantified using cross-validation methods. The following results were found: relative to pedigree or marker-only models, there were consistent gains in prediction accuracy by combining pedigree and GBS data; there was increased predictive ability when using imputed or nonimputed GBS data over inferred haplotype in experiment 1, or nonimputed GBS and information-based imputed short and long haplotypes, as compared to the other methods in experiment 2; the level of prediction accuracy achieved using GBS data in experiment 2 is comparable to those reported by previous authors who analyzed this data set using SNP arrays; and GBLUP and RKHS models with pedigree with nonimputed and imputed GBS data provided the best prediction correlations for the three traits in experiment 1, whereas for experiment 2 RKHS provided slightly better prediction than GBLUP for drought-stressed environments, and both models provided similar predictions in well-watered environments.
- Comparison between linear and non-parametric regression models for genome-enabled prediction in wheat (Genetics Society of America, 2012) Pérez-Rodríguez, P.; Gianola, D.; Gonzalez Camacho, J.M.; Crossa, J.; Manes, Y.; Dreisigacker, S.
  In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models.