APA citation

Montesinos-López, O. A., Mosqueda-Gonzalez, B. A., Palafox González, A., Montesinos-Lopez, A., & Crossa, J. (2022). A general-purpose machine learning R library for sparse kernels methods with an application for genome-based prediction. Frontiers in Genetics, 13, 887643. https://doi.org/10.3389/fgene.2022.887643

Abstract
Description
The adoption of machine learning frameworks in areas beyond computer science has been facilitated by the development of user-friendly software tools that do not require an advanced understanding of computer programming. In this paper, we present a new R package, sparse kernel methods (SKM), that implements six of the most popular supervised machine learning algorithms (generalized boosted machines, generalized linear models, support vector machines, random forest, Bayesian regression models and deep neural networks) with the optional use of sparse kernels. SKM focuses on user simplicity: it does not try to include all the available machine learning algorithms, but rather the most important aspects of these six algorithms in an easy-to-understand format. Another relevant contribution of this package is a function for the computation of seven different kernels: Linear, Polynomial, Sigmoid, Gaussian, Exponential, Arc-Cosine 1 and Arc-Cosine L (with L = 2, 3, …), together with their sparse versions, which allow users to create kernel machines without modifying the statistical machine learning algorithm. The main contribution of the package is the functionality for computing the sparse versions of these seven basic kernels, which is indispensable for reducing the computational resources needed to implement kernel machine learning methods without a significant loss in prediction performance. The performance of SKM is evaluated in a genome-based prediction framework using a maize data set and a wheat data set. However, the use of this package is not restricted to genome prediction problems, and it can be used in many different applications.
Copyright
CIMMYT manages Intellectual Assets as International Public Goods. The user is free to download, print, store and share this work. If you want to translate this work or create any other derivative work, and to share or distribute that translation or derivative work, please contact CIMMYT-Knowledge-Center@cgiar.org, indicating the work you want to use and the kind of use you intend; CIMMYT will contact you with the suitable license for that purpose.
Journal
Frontiers in Genetics
Journal volume
13
Article number
887643
Place of Publication
Switzerland
Publisher
Frontiers