Title: Genomic Regression Workbench
Description: Workbench for testing genomic regression accuracy on (optionally noisy) phenotypes.
Authors: Nelson Nazzicari & Filippo Biscarini
Maintainer: Nelson Nazzicari <[email protected]>
License: GPL-3 | file LICENSE
Version: 1.3.1
Built: 2025-03-13 03:46:06 UTC
Source: https://github.com/cran/GROAN
This function adds a regressor to an existing GROAN.Workbench object.
addRegressor(wb, regressor, regressor.name = regressor, ...)
wb: the GROAN.Workbench instance to be updated
regressor: regressor function
regressor.name: string that will be used in reports. Keep that in mind when deciding names.
...: extra parameters are passed to the regressor function
an updated instance of the original GROAN.Workbench
#creating a Workbench with all default arguments
wb = createWorkbench()

#adding a second regressor
wb = addRegressor(wb, regressor = phenoRegressor.dummy, regressor.name = 'dummy')

## Not run:
#trying to add again a regressor with the same name would result in a naming conflict error
wb = addRegressor(wb, regressor = phenoRegressor.dummy, regressor.name = 'dummy')
## End(Not run)
This function verifies that the two passed GROAN.NoisyDataset objects have
the same dimensions and can thus be used in the same experiment (typically training
models on one and testing on the other). The function returns TRUE or FALSE. In verbose
mode it also prints messages detailing the comparisons.
are.compatible(nds1, nds2, verbose = FALSE)
nds1: the first GROAN.NoisyDataset to be tested
nds2: the second GROAN.NoisyDataset to be tested
verbose: boolean, if TRUE the function prints messages detailing the comparison
TRUE if the passed GROAN.NoisyDataset objects are dimensionally compatible, FALSE otherwise
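The package ships no example for are.compatible, so a minimal sketch follows, built on the bundled GROAN.KI and GROAN.AI example data (both carry 647 markers, see their descriptions below). The datasets created here use default settings only; whether TRUE is returned depends on the checks implemented by the function.

#two noisy datasets built from the example data, default (dummy) noise injector
nds.KI = createNoisyDataset(name = 'PEA KI', genotypes = GROAN.KI$SNPs, phenotypes = GROAN.KI$yield)
nds.AI = createNoisyDataset(name = 'PEA AI', genotypes = GROAN.AI$SNPs, phenotypes = GROAN.AI$yield)

#verbose check of dimensional compatibility (e.g. same number of markers)
are.compatible(nds.KI, nds.AI, verbose = TRUE)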
This function creates a GROAN.NoisyDataset object (or fails trying). The
object will contain all noisy dataset components: genotypes and/or covariance matrix,
phenotypes, strata (optional), a noise injector function and its parameters.
A general description of the created object can be obtained through the overridden
print.GROAN.NoisyDataset function.
createNoisyDataset( name, genotypes = NULL, covariance = NULL, phenotypes, strata = NULL, extraCovariates = NULL, ploidy = 2, allowFractionalGenotypes = FALSE, noiseInjector = noiseInjector.dummy, ... )
name: a string defining the dataset name, used later to identify this particular instance in reports and result files. It is advisable to make it somewhat meaningful (to you; GROAN simply reports it as it is)
genotypes: matrix or dataframe containing SNP genotypes, one row per sample (N), one column per marker (M), in 0/1/2 format for diploids or 0/1/2/.../ploidy for polyploids
covariance: matrix of covariances between samples of this dataset. It is usually a square (NxN) matrix, but rectangular matrices (NxW) are accepted to encapsulate covariances between samples in this set and samples of other sets. Please note that some regression models expect the covariance to be square and will fail on rectangular ones
phenotypes: numeric array, N slots
strata: array of N slots, describing the stratum each data point belongs to. This is used for stratified crossvalidation (see createWorkbench)
extraCovariates: dataframe of optional extra covariates (N rows, one column per extra covariate). Numeric ones will be normalized; string and categorical ones will be transformed into stub TRUE/FALSE variables (one per possible value, see model.matrix)
ploidy: number of haploid sets in the cell. Defaults to 2 (diploid)
allowFractionalGenotypes: if TRUE, non-integer values for genotypes are allowed. Defaults to FALSE
noiseInjector: name of a noise injector function, defaults to noiseInjector.dummy
...: further arguments are passed along to noiseInjector
a GROAN.NoisyDataset object.
#For more complete examples see the package vignette
#creating a noisy dataset with normal noise
nds = createNoisyDataset(
  name = 'PEA, normal noise',
  genotypes = GROAN.KI$SNPs,
  phenotypes = GROAN.KI$yield,
  noiseInjector = noiseInjector.norm,
  mean = 0,
  sd = sd(GROAN.KI$yield) * 0.5
)
This function returns a partially random alphanumeric string that can be used to identify a single run.
createRunId()
a partially random alphanumeric string
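No example accompanies createRunId; the sketch below shows one possible way of reusing the generated id so that two separate GROAN.run invocations (documented below) can be joined afterwards. Here nds1, nds2 and wb are assumed to be previously created objects, and collating via rbind is only one option, since results are data frames.

#generating the id once and reusing it across runs
my.run.id = createRunId()
res1 = GROAN.run(nds1, wb, run.id = my.run.id)
res2 = GROAN.run(nds2, wb, run.id = my.run.id)

#collating the two result data frames for joint analysis
res.all = rbind(res1, res2)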
This function creates a GROAN.Workbench instance (or fails trying). The created object contains:
a) one regressor with its own specific configuration
b) the experiment parameters (number of repetitions, number of folds in case of crossvalidation, stratification...)
A general description of the created object can be obtained through the overridden
print.GROAN.Workbench function.
It is possible to add other regressors to the created GROAN.Workbench
object using addRegressor.
Once the GROAN.Workbench
is created it must be passed to GROAN.run to start the experiment.
createWorkbench( folds = 10, reps = 5, stratified = FALSE, outfolder = NULL, outfile.name = "accuracy.csv", saveHyperParms = FALSE, saveExtraData = FALSE, regressor = phenoRegressor.rrBLUP, regressor.name = "default regressor", ... )
folds: number of folds for crossvalidation, defaults to 10. If NULL no crossvalidation is performed: the regressor(s) are trained on the whole dataset and a separate test set must then be supplied to GROAN.run
reps: number of times the whole test must be repeated, defaults to 5
stratified: boolean indicating whether GROAN should take into account data strata. This has two effects. First, the crossvalidation becomes stratified, meaning that folds will be split so that training and test sets contain the same proportions of each data stratum. Second, prediction accuracy will be assessed (also) by strata. If no strata are present in the GROAN.NoisyDataset object and stratified is TRUE, all samples are treated as belonging to a single stratum. Defaults to FALSE
outfolder: folder where to save the data. If NULL (the default) nothing is written to disk
outfile.name: file name used to save the accuracies in a text file. Defaults to "accuracy.csv". Ignored if outfolder is NULL
saveHyperParms: boolean indicating if the hyperparameters from regressor training should be saved in outfolder. Defaults to FALSE
saveExtraData: boolean indicating if extra data from regressor training should be saved in outfolder. Defaults to FALSE
regressor: regressor function. Defaults to phenoRegressor.rrBLUP
regressor.name: string that will be used in reports. Keep that in mind when deciding names. Defaults to "default regressor"
...: extra parameters are passed to the regressor function
An instance of GROAN.Workbench
See also: addRegressor, GROAN.run, createNoisyDataset
#creating a Workbench with all default arguments
wb1 = createWorkbench()

#another Workbench, with different crossvalidation
wb2 = createWorkbench(folds=5, reps=20)

#a third one, with a different regressor and extra parameters passed to the regressor function
wb3 = createWorkbench(regressor=phenoRegressor.BGLR, regressor.name='Bayesian Lasso', type='BL')
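The stratified option described above needs a companion strata array in the noisy dataset; the sketch below is illustrative only (the 'early'/'late' labels are invented, and any array with one label per sample would do).

#a Workbench performing stratified crossvalidation
wb4 = createWorkbench(folds = 5, reps = 10, stratified = TRUE)

#a matching dataset declaring (purely illustrative) strata, one label per sample
nds.strata = createNoisyDataset(
  name = 'PEA KI, with strata',
  genotypes = GROAN.KI$SNPs,
  phenotypes = GROAN.KI$yield,
  strata = rep(c('early', 'late'), length.out = length(GROAN.KI$yield))
)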
Given a GROAN.NoisyDataset object, this function applies the noise injector to the
data and returns a noisy version of the phenotypes. It is useful for inspecting the
effects of the noise injector.
getNoisyPhenotype(nds)
nds: a GROAN.NoisyDataset object
the phenotypes contained in nds, with added noise
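getNoisyPhenotype has no example of its own; a minimal sketch, assuming the nds object created in the createNoisyDataset example above (with a stochastic injector such as noiseInjector.norm, repeated calls will typically return different values).

#two independent noisy versions of the phenotypes
noisy.phenos.1 = getNoisyPhenotype(nds)
noisy.phenos.2 = getNoisyPhenotype(nds)

#comparing original and noisy phenotypes
plot(GROAN.KI$yield, noisy.phenos.1, xlab = 'Original phenotypes', ylab = 'Noisy phenotypes')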
This list contains all data required to run GROAN examples. It refers to a pea experiment with 105 lines coming from a biparental Attika x Isard cross.
GROAN.AI
A list with the following fields:
"GROAN.AI$yield": named array with 105 slots, containing data on grain yield [t/ha]
"GROAN.AI$SNPs": data frame with 105 rows and 647 variables. Each row is a pea AI line, each column a SNP marker. Values can either be 0, 1, or 2, representing the three possible genotypes (AA, Aa, and aa, respectively).
"GROAN.AI$kinship": square dataframe containing the realized kinships between all pairs of each of the 105 pea AI lines. Values were computed following the Astle & Balding metric. Higher values represent a higher degree of genetic similarity between lines. This metric mainly accounts for additive genetic contributions (as an alternative to dominant contributions).
Annicchiarico et al., GBS-Based Genomic Selection for Pea Grain Yield under Severe Terminal Drought, The Plant Genome, Volume 10. doi:10.3835/plantgenome2016.07.0072
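A quick, minimal sketch of how the list can be inspected (the commented dimensions follow from the description above).

#inspecting the GROAN.AI example data
length(GROAN.AI$yield) #105 phenotypes
dim(GROAN.AI$SNPs)     #105 lines x 647 markers
dim(GROAN.AI$kinship)  #105 x 105 kinship matrix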
This list contains all data required to run GROAN examples. It refers to a pea experiment with 103 lines coming from a biparental Kaspa x Isard cross.
GROAN.KI
A list with the following fields:
"GROAN.KI$yield": named array with 103 slots, containing data on grain yield [t/ha]
"GROAN.KI$SNPs": data frame with 103 rows and 647 variables. Each row is a pea KI line, each column a SNP marker. Values can either be 0, 1, or 2, representing the three possible genotypes (AA, Aa, and aa, respectively).
"GROAN.KI$kinship": square dataframe containing the realized kinships between all pairs of each of the 103 pea KI lines. Values were computed following the Astle & Balding metric. Higher values represent a higher degree of genetic similarity between lines. This metric mainly accounts for additive genetic contributions (as an alternative to dominant contributions).
Annicchiarico et al., GBS-Based Genomic Selection for Pea Grain Yield under Severe Terminal Drought, The Plant Genome, Volume 10. doi:10.3835/plantgenome2016.07.0072
This piece of data is deprecated and will be removed in the next release. Please use GROAN.KI instead.
GROAN.pea.kinship
A data frame with 103 rows and 103 variables. Row and column names are pea KI lines.
Annicchiarico et al., GBS-Based Genomic Selection for Pea Grain Yield under Severe Terminal Drought, The Plant Genome, Volume 10. doi:10.3835/plantgenome2016.07.0072
This piece of data is deprecated and will be removed in the next release. Please use GROAN.KI instead.
GROAN.pea.SNPs
A data frame with 103 rows and 647 variables. Each row represents a pea KI line, each column a SNP marker.
Annicchiarico et al., GBS-Based Genomic Selection for Pea Grain Yield under Severe Terminal Drought, The Plant Genome, Volume 10. doi:10.3835/plantgenome2016.07.0072
This piece of data is deprecated and will be removed in the next release. Please use GROAN.KI instead.
GROAN.pea.yield
A named array with 103 slots.
Annicchiarico et al., GBS-Based Genomic Selection for Pea Grain Yield under Severe Terminal Drought, The Plant Genome, Volume 10. doi:10.3835/plantgenome2016.07.0072
This function runs the experiment described in a GROAN.Workbench object,
training regressor(s) on the data contained in a GROAN.NoisyDataset object passed
via parameter nds. The prediction accuracy is estimated either through crossvalidation
or on a separate test dataset supplied via parameter nds.test.
It returns a GROAN.Result object, which has a summary function for quick inspection
and can be fed to plotResult for visual comparisons.
In case of crossvalidation the test dataset in the result object will report the [CV]
suffix.
The experiment statistics are computed via measurePredictionPerformance.
Each time this function is invoked it will refer to a runId, an alphanumeric string
identifying each specific run. The runId is usually generated internally, but it is
possible to pass it when the intention is to join results from different runs for
analysis purposes.
GROAN.run(nds, wb, nds.test = NULL, run.id = createRunId())
nds: a GROAN.NoisyDataset object, containing the data (genotypes, phenotypes and so forth) plus a noise injector function
wb: a GROAN.Workbench object, containing the regressors to be tested together with the description of the experiment
nds.test: either a GROAN.NoisyDataset or a list of GROAN.NoisyDataset objects. The regression model(s) trained on nds will also be tested on these datasets. If NULL (the default) accuracy is estimated through crossvalidation on nds
run.id: an alphanumeric string identifying this specific run. If not passed it is generated using createRunId
a GROAN.Result object
## Not run:
#Complete examples are found in the vignette
vignette('GROAN.vignette', package='GROAN')

#Minimal example
#1) creating a noisy dataset with normal noise
nds = createNoisyDataset(
  name = 'PEA KI, normal noise',
  genotypes = GROAN.KI$SNPs,
  phenotypes = GROAN.KI$yield,
  noiseInjector = noiseInjector.norm,
  mean = 0,
  sd = sd(GROAN.KI$yield) * 0.5
)

#2) creating a GROAN.Workbench using default regressor and crossvalidation preset
wb = createWorkbench()

#3) running the experiment
res = GROAN.run(nds, wb)

#4) examining results
summary(res)
plotResult(res)
## End(Not run)
This method returns several performance metrics for the passed predictions.
measurePredictionPerformance(truevals, predvals)
measurePredictionPerformance(truevals, predvals)
truevals: true values
predvals: predicted values
A named array with the following fields:
Pearson's correlation
Spearman's correlation (order based)
Root Mean Square Error
Mean Absolute Error
Coefficient of determination
mean Normalized Discounted Cumulative Gain with k equal to 0.1, 0.2, 0.5 and 1
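No example accompanies measurePredictionPerformance; a minimal sketch on toy vectors (the exact names of the returned fields are those implemented by the package, as listed above).

#toy true and predicted values
truevals = c(1.0, 2.0, 3.0, 4.0, 5.0)
predvals = c(1.1, 1.9, 3.2, 3.8, 5.3)

#named array of performance metrics (correlations, RMSE, MAE, coefficient of determination, NDCG)
measurePredictionPerformance(truevals, predvals)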
This function calculates the Normalized Discounted Cumulative Gain (NDCG) from the vectors of observed and predicted values and the chosen proportion k of top observations (rank).
ndcg(y, y_hat, k = 0.2)
y: true values
y_hat: predicted values
k: relevant proportion of rank (top)
a real value in [0,1]
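A minimal sketch on toy vectors (values are arbitrary, chosen only to show the call).

#observed and predicted values
y = c(10, 8, 6, 4, 2)
y_hat = c(9, 9, 5, 5, 1)

#NDCG computed on the top 40% of the rank
ndcg(y, y_hat, k = 0.4)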
This noise injector does not add any noise: the passed phenotypes are simply returned.
This function is useful when comparing different regressors on the same dataset without
the effect of extra injected noise.
noiseInjector.dummy(phenotypes)
phenotypes: input phenotypes. This object will be returned without checks.
the same passed phenotypes
Other noiseInjectors: noiseInjector.norm(), noiseInjector.swapper(), noiseInjector.unif()
phenos = rnorm(10)
all(phenos == noiseInjector.dummy(phenos)) #TRUE
This function adds noise sampled from a normal distribution with the specified mean and
standard deviation to the passed phenotypes array.
The function can affect the totality of the passed phenotype array or
a random subset of it (controlled by the subset parameter).
noiseInjector.norm(phenotypes, mean = 0, sd = 1, subset = 1)
phenotypes: an array of numbers
mean: mean of the normal distribution
sd: standard deviation of the normal distribution
subset: number in [0,1], the proportion of the original dataset to be injected
An array, of the same size as phenotypes, where normal noise has been added to the original phenotype values.
Other noiseInjectors: noiseInjector.dummy(), noiseInjector.swapper(), noiseInjector.unif()
#a sinusoid signal
phenos = sin(seq(0,5, 0.1))
plot(phenos, type='p', pch=16, main='Original (black) vs. Injected (red), 100% affected')

#adding normal noise to all samples
phenos.noise = noiseInjector.norm(phenos, sd = 0.2)
points(phenos.noise, type='p', col='red')

#adding noise only to 30% of the samples
plot(phenos, type='p', pch=16, main='Original (black) vs. Injected (red), 30% affected')
phenos.noise.subset = noiseInjector.norm(phenos, sd = 0.2, subset = 0.3)
points(phenos.noise.subset, type='p', col='red')
This function introduces swap noise: a number of couples of samples will have their
phenotypes swapped. The number of couples is computed so that the total fraction of
affected phenotypes approximates subset.
noiseInjector.swapper(phenotypes, subset = 0.1)
phenotypes: an array of numbers
subset: fraction of phenotypes to be affected by the noise
the same passed phenotypes, but with some elements swapped
Other noiseInjectors: noiseInjector.dummy(), noiseInjector.norm(), noiseInjector.unif()
#a set of phenotypes
phenos = 1:10

#swapping two elements
phenos.sw2 = noiseInjector.swapper(phenos, 0.2)

#swapping four elements
phenos.sw4 = noiseInjector.swapper(phenos, 0.4)

#swapping four elements again, since 30% of 10 elements
#is rounded to 4 (two couples)
phenos.sw4.again = noiseInjector.swapper(phenos, 0.3)
This function adds noise sampled from a uniform distribution with the specified range
to the passed phenotypes array.
The function can affect the totality of the passed phenotype array or
a random subset of it (controlled by the subset parameter).
noiseInjector.unif(phenotypes, min = 0, max = 1, subset = 1)
phenotypes: an array of numbers
min, max: lower and upper limits of the distribution. Must be finite
subset: number in [0,1], the proportion of the original dataset to be injected
An array, of the same size as phenotypes, where uniform noise has been added to the original phenotype values.
Other noiseInjectors: noiseInjector.dummy(), noiseInjector.norm(), noiseInjector.swapper()
#a sinusoid signal
phenos = sin(seq(0,5, 0.1))
plot(phenos, type='p', pch = 16, main='Original (black) vs. Injected (red), 100% affected')

#adding uniform noise to all samples
phenos.noise = noiseInjector.unif(phenos, min=0.1, max=0.3)
points(phenos.noise, type='p', col='red')

#adding noise only to 30% of the samples
plot(phenos, type='p', pch = 16, main='Original (black) vs. Injected (red), 30% affected')
phenos.noise.subset = noiseInjector.unif(phenos, min=0.1, max=0.3, subset = 0.3)
points(phenos.noise.subset, type='p', col='red')
This is a wrapper around BGLR. As such, it won't work if the BGLR package is not installed.
Genotypes are modeled using the specified type. If type is 'RKHS' (and only in this case)
the covariance/kinship matrix covariances is required, and it will be modeled as matrix K
in BGLR terms. In all other cases genotypes and covariances are put in the model as X matrices.
Extra covariates, if present, are modeled as FIXED effects.
phenoRegressor.BGLR( phenotypes, genotypes, covariances, extraCovariates, type = c("FIXED", "BRR", "BL", "BayesA", "BayesB", "BayesC", "RKHS"), ... )
phenotypes: phenotypes, a numeric array (n x 1), missing values are predicted
genotypes: SNP genotypes, one row per phenotype (n), one column per marker (m), values in 0/1/2 for diploids or 0/1/2/.../ploidy for polyploids. Can be NULL if covariances is provided
covariances: square matrix (n x n) of covariances. Can be NULL if genotypes is provided
extraCovariates: extra covariates set, one row per phenotype (n), one column per covariate (w). If NULL no extra covariates are considered
type: character literal, one of the following: FIXED (Flat prior), BRR (Gaussian prior), BL (Double-Exponential prior), BayesA (scaled-t prior), BayesB (two component mixture prior with a point of mass at zero and a scaled-t slab), BayesC (two component mixture prior with a point of mass at zero and a Gaussian slab)
...: extra parameters are passed to the BGLR function
The function returns a list with the following fields:
predictions: an array of (n) predicted phenotypes, with NAs filled and all other positions repredicted (useful for calculating residuals)
hyperparams: empty, returned for compatibility
extradata: list with information on trained model, coming from BGLR
Other phenoRegressors: phenoRegressor.RFR(), phenoRegressor.SVR(), phenoRegressor.dummy(), phenoRegressor.rrBLUP(), phenoregressor.BGLR.multikinships()
## Not run:
#using the GROAN.KI dataset, we regress on the dataset and predict the first ten phenotypes
phenos = GROAN.KI$yield
phenos[1:10] = NA

#calling the regressor with Bayesian Lasso
results = phenoRegressor.BGLR(
  phenotypes = phenos,
  genotypes = GROAN.KI$SNPs,
  covariances = NULL,
  extraCovariates = NULL,
  type = 'BL',
  nIter = 2000 #BGLR-specific parameters
)

#examining the predictions
plot(GROAN.KI$yield, results$predictions,
     main = 'Train set (black) and test set (red) regressions',
     xlab = 'Original phenotypes', ylab = 'Predicted phenotypes')
points(GROAN.KI$yield[1:10], results$predictions[1:10], pch=16, col='red')

#printing correlations
test.set.correlation  = cor(GROAN.KI$yield[1:10], results$predictions[1:10])
train.set.correlation = cor(GROAN.KI$yield[-(1:10)], results$predictions[-(1:10)])
writeLines(paste(
  'test-set correlation :', test.set.correlation,
  '\ntrain-set correlation:', train.set.correlation
))
## End(Not run)
This regressor implements Genomic BLUP (GBLUP) using Bayesian methods from the BGLR package, but allows the use of more than one covariance matrix.
phenoregressor.BGLR.multikinships( phenotypes, genotypes = NULL, covariances, extraCovariates, type = "RKHS", ... )
phenotypes: phenotypes, a numeric array (n x 1), missing values are predicted
genotypes: added for compatibility with the other GROAN regressors, must be NULL
covariances: square matrix (n x n) of covariances
extraCovariates: the extra covariance matrices to be added to the GBLUP model, collated in a single matrix-like structure, optionally with the first column as an ignored intercept (supported for compatibility). See details, below
type: character literal, one of the following: FIXED (Flat prior), BRR (Gaussian prior), BL (Double-Exponential prior), BayesA (scaled-t prior), BayesB (two component mixture prior with a point of mass at zero and a scaled-t slab), BayesC (two component mixture prior with a point of mass at zero and a Gaussian slab), RKHS (Gaussian processes, default)
...: extra parameters are passed to the BGLR function
In its simplest form, GBLUP is defined as:

y = mu + Z u + e,   with u ~ N(0, K sigma_u^2)

where mu is the overall mean, Z is the incidence matrix relating the individual
weights u to y, and e is a vector of residuals with zero mean and covariance
matrix I sigma_e^2.
It is possible to extend the above model to include different types of kinship
matrices, each capturing different links between genotypes and phenotypes:

y = mu + Z_1 u_1 + Z_2 u_2 + ... + e,   with u_i ~ N(0, K_i sigma_i^2)
This function receives the first kinship matrix via the covariances argument and an
arbitrary number of extra matrices via extraCovariates, built as follows:
#given the following defined variables
y  = <some values, Nx1 array>
K1 = <NxN kinship matrix>
K2 = <another NxN kinship matrix>
K3 = <a third NxN kinship matrix>

#invoking the multi kinship GBLUP
y_hat = phenoregressor.BGLR.multikinships(
  phenotypes = y,
  covariances = K1,
  extraCovariates = cbind(K2, K3)
)
The function returns a list with the following fields:
predictions: an array of (n) predicted phenotypes, with NAs filled and all other positions repredicted (useful for calculating residuals)
hyperparams: empty, returned for compatibility
extradata: list with information on trained model, coming from BGLR
Other phenoRegressors: phenoRegressor.BGLR(), phenoRegressor.RFR(), phenoRegressor.SVR(), phenoRegressor.dummy(), phenoRegressor.rrBLUP()
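No runnable example is provided for this regressor, so a minimal sketch follows. The second kinship matrix built here is purely illustrative (a simple scaled cross-product of the SNP matrix, not the Astle & Balding metric used for GROAN.KI$kinship); it only serves to show how an extra matrix is passed.

## Not run:
#first kinship: the one shipped with the example data
K1 = as.matrix(GROAN.KI$kinship)

#second, purely illustrative kinship: scaled cross-product of the SNP matrix
snps = as.matrix(GROAN.KI$SNPs)
snps = snps[, apply(snps, 2, sd) > 0] #dropping monomorphic markers, if any
X = scale(snps)
K2 = tcrossprod(X) / ncol(X)

#phenotypes with some missing values to be predicted
phenos = GROAN.KI$yield
phenos[1:10] = NA

#multi kinship GBLUP
results = phenoregressor.BGLR.multikinships(
  phenotypes = phenos,
  covariances = K1,
  extraCovariates = K2
)
## End(Not run)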
This function is for development purposes. It returns, as "predictions", an array of random numbers. It accepts the standard inputs and produces a formally correct output. It is, obviously, quite fast.
phenoRegressor.dummy(phenotypes, genotypes, covariances, extraCovariates)
phenotypes: phenotypes, numeric array (n x 1), missing values are predicted
genotypes: SNP genotypes, one row per phenotype (n), one column per marker (m), values in 0/1/2 for diploids or 0/1/2/.../ploidy for polyploids. Can be NULL if covariances is provided
covariances: square matrix (n x n) of covariances. Can be NULL if genotypes is provided
extraCovariates: extra covariates set, one row per phenotype (n), one column per covariate (w). If NULL no extra covariates are considered
The function should return a list with the following fields:
predictions: an array of (k) predicted phenotypes
hyperparams: named array of hyperparameters selected during training
extradata: any extra information
Other phenoRegressors: phenoRegressor.BGLR(), phenoRegressor.RFR(), phenoRegressor.SVR(), phenoRegressor.rrBLUP(), phenoregressor.BGLR.multikinships()
#genotypes are not really investigated, only the
#number of test phenotypes is used
phenoRegressor.dummy(
  phenotypes = c(1:10, NA, NA, NA),
  genotypes = matrix(nrow = 13, ncol = 30)
)
This is a wrapper around randomForest and related functions.
As such, this function will not work if the randomForest package is not installed.
There is no distinction between regular covariates (genotypes) and extra
covariates (fixed effects) in random forest. If extra covariates are passed, they are
put together with genotypes, side by side. The same happens with the covariance matrix. This
can lead to the scientifically questionable, but technically correct, situation of regressing
on a big matrix made of SNP genotypes, covariances and other covariates, all collated side by side.
The function makes no distinction, and it is up to the user to understand what is correct in each
specific experiment.
WARNING: this function can be *very* slow, especially when called on thousands of SNPs.
phenoRegressor.RFR( phenotypes, genotypes, covariances, extraCovariates, ntree = ceiling(length(phenotypes)/5), ... )
phenotypes: phenotypes, a numeric array (n x 1), missing values are predicted
genotypes: SNP genotypes, one row per phenotype (n), one column per marker (m), values in 0/1/2 for diploids or 0/1/2/.../ploidy for polyploids. Can be NULL if covariances is provided
covariances: square matrix (n x n) of covariances. Can be NULL if genotypes is provided
extraCovariates: extra covariates set, one row per phenotype (n), one column per covariate (w). If NULL no extra covariates are considered
ntree: number of trees to grow, defaults to a fifth of the number of samples (rounded up). As per the randomForest documentation, this should not be set too low, to ensure that every input row gets predicted at least a few times
...: any extra parameter is passed to randomForest::randomForest
The function returns a list with the following fields:
predictions: an array of (k) predicted phenotypes
hyperparams: named vector with the following keys: ntree (number of grown trees) and mtry (number of variables randomly sampled as candidates at each split)
extradata: the object returned by randomForest::randomForest(), containing the full trained forest and the used parameters
Other phenoRegressors: phenoRegressor.BGLR(), phenoRegressor.SVR(), phenoRegressor.dummy(), phenoRegressor.rrBLUP(), phenoregressor.BGLR.multikinships()
## Not run:
#using the GROAN.KI dataset, we regress on the dataset and predict the first ten phenotypes
phenos = GROAN.KI$yield
phenos[1:10] = NA

#calling the regressor with random forest
results = phenoRegressor.RFR(
  phenotypes = phenos,
  genotypes = GROAN.KI$SNPs,
  covariances = NULL,
  extraCovariates = NULL,
  ntree = 20,
  mtry = 200 #randomForest-specific parameters
)

#examining the predictions
plot(GROAN.KI$yield, results$predictions,
     main = 'Train set (black) and test set (red) regressions',
     xlab = 'Original phenotypes', ylab = 'Predicted phenotypes')
points(GROAN.KI$yield[1:10], results$predictions[1:10], pch=16, col='red')

#printing correlations
test.set.correlation  = cor(GROAN.KI$yield[1:10], results$predictions[1:10])
train.set.correlation = cor(GROAN.KI$yield[-(1:10)], results$predictions[-(1:10)])
writeLines(paste(
  'test-set correlation :', test.set.correlation,
  '\ntrain-set correlation:', train.set.correlation
))
## End(Not run)
This is a wrapper around the rrBLUP function mixed.solve.
It can work either with genotypes (in form of a SNP matrix) or with kinships (in form of a covariance
matrix). In the first case the function implements a SNP-BLUP, in the second a G-BLUP. An error is
returned if both SNPs and covariance matrix are passed.
In rrBLUP terms, genotypes are modeled as random effects (matrix Z), covariances as matrix K, and
extra covariates, if present, as fixed effects (matrix X).
Please note that this function won't work if the rrBLUP package is not installed.
phenoRegressor.rrBLUP( phenotypes, genotypes = NULL, covariances = NULL, extraCovariates = NULL, ... )
phenotypes: phenotypes, a numeric array (n x 1), missing values are predicted
genotypes: SNP genotypes, one row per phenotype (n), one column per marker (m), values in 0/1/2 for diploids or 0/1/2/.../ploidy for polyploids. Can be NULL if covariances is provided
covariances: square matrix (n x n) of covariances
extraCovariates: optional extra covariates set, one row per phenotype (n), one column per covariate (w). If NULL no extra covariates are considered
...: extra parameters are passed to rrBLUP::mixed.solve
The function returns a list with the following fields:
predictions: an array of (k) predicted phenotypes
hyperparams: named vector with the following keys: Vu, Ve, beta, LL
extradata: list with information on the trained model, coming from mixed.solve
Other phenoRegressors: phenoRegressor.BGLR(), phenoRegressor.RFR(), phenoRegressor.SVR(), phenoRegressor.dummy(), phenoregressor.BGLR.multikinships()
## Not run:
#using the GROAN.KI dataset, we regress on the dataset and predict the first ten phenotypes
phenos = GROAN.KI$yield
phenos[1:10] = NA

#calling the regressor with ridge regression BLUP on SNPs and kinship
results.SNP.BLUP = phenoRegressor.rrBLUP(
  phenotypes = phenos,
  genotypes = GROAN.KI$SNPs,
  SE = TRUE, return.Hinv = TRUE #rrBLUP-specific parameters
)
results.G.BLUP = phenoRegressor.rrBLUP(
  phenotypes = phenos,
  covariances = GROAN.KI$kinship,
  SE = TRUE, return.Hinv = TRUE #rrBLUP-specific parameters
)

#examining the predictions
plot(GROAN.KI$yield, results.SNP.BLUP$predictions,
     main = '[SNP-BLUP] Train set (black) and test set (red) regressions',
     xlab = 'Original phenotypes', ylab = 'Predicted phenotypes')
abline(a=0, b=1)
points(GROAN.KI$yield[1:10], results.SNP.BLUP$predictions[1:10], pch=16, col='red')

plot(GROAN.KI$yield, results.G.BLUP$predictions,
     main = '[G-BLUP] Train set (black) and test set (red) regressions',
     xlab = 'Original phenotypes', ylab = 'Predicted phenotypes')
abline(a=0, b=1)
points(GROAN.KI$yield[1:10], results.G.BLUP$predictions[1:10], pch=16, col='red')

#printing correlations
correlations = data.frame(
  model = 'SNP-BLUP',
  test_set_correlations  = cor(GROAN.KI$yield[1:10], results.SNP.BLUP$predictions[1:10]),
  train_set_correlations = cor(GROAN.KI$yield[-(1:10)], results.SNP.BLUP$predictions[-(1:10)])
)
correlations = rbind(correlations, data.frame(
  model = 'G-BLUP',
  test_set_correlations  = cor(GROAN.KI$yield[1:10], results.G.BLUP$predictions[1:10]),
  train_set_correlations = cor(GROAN.KI$yield[-(1:10)], results.G.BLUP$predictions[-(1:10)])
))
print(correlations)
## End(Not run)
This is a wrapper around several functions from the e1071 package (as such, it won't work if
the e1071 package is not installed).
This function implements Support Vector Regression, meaning that the data points are projected in
a transformed, higher dimensional space where linear regression is possible. phenoRegressor.SVR
can operate in three modes: run, train and tune.
In run mode you need to pass the function an already tuned/trained SVR model, typically
obtained either directly from e1071 functions (e.g. from svm, best.svm and so forth)
or from a previous run of phenoRegressor.SVR in a different mode. The passed model is applied
to the passed dataset and predictions are returned.
In train mode an SVR model will be trained on the passed dataset using the passed
hyperparameters. The trained model will then be used for predictions.
In tune mode you need to pass one or more sets of hyperparameters. The best combination of
hyperparameters will be selected through crossvalidation. The best performing SVR model will be used
for final predictions. This mode can be very slow.
There is no distinction between regular covariates (genotypes) and extra
covariates (fixed effects) in Support Vector Regression. If extra covariates are passed, they are
put together with genotypes, side by side. The same happens with the covariance matrix. This
can lead to the scientifically questionable, but technically correct, situation of regressing
on a big matrix made of SNP genotypes, covariances and other covariates, all collated side by side.
The function makes no distinction, and it is up to the user to understand what is correct in each
specific experiment.
phenoRegressor.SVR( phenotypes, genotypes, covariances, extraCovariates, mode = c("tune", "train", "run"), tuned.model = NULL, scale.pheno = TRUE, scale.geno = FALSE, ... )
phenotypes: phenotypes, a numeric array (n x 1), missing values are predicted
genotypes: SNP genotypes, one row per phenotype (n), one column per marker (m), values in 0/1/2 for diploids or 0/1/2/.../ploidy for polyploids. Can be NULL if covariances is provided
covariances: square matrix (n x n) of covariances. Can be NULL if genotypes is provided
extraCovariates: extra covariates set, one row per phenotype (n), one column per covariate (w). If NULL no extra covariates are considered
mode: this parameter decides what will happen with the passed dataset; one of 'tune', 'train' or 'run' (see Description)
tuned.model: a tuned and trained SVR to be used for prediction. This object is only used if mode is 'run'
scale.pheno: if TRUE (default) the phenotypes will be scaled and centered (before tuning or before applying the passed tuned model)
scale.geno: if TRUE the genotypes will be scaled and centered (before tuning or before applying the passed tuned model). It is usually not a good idea, since it leads to worse results. Defaults to FALSE
...: all extra parameters are passed to the underlying e1071 functions (e.g. svm, tune.svm)
The function returns a list with the following fields:
predictions: an array of (n) predicted phenotypes
hyperparams: named vector with the following keys: gamma, cost, coef0, nu, epsilon. Some of the values may not make sense given the selected model, and will contain default values from the e1071 library
extradata: depending on the mode parameter, extradata will contain one of the following:
1) a SVM object returned by e1071::tune.svm, containing both the best performing model and the description of the training process
2) a newly trained SVR model
3) the same object passed as tuned.model
svm, tune.svm, best.svm from e1071 package
Other phenoRegressors: phenoRegressor.BGLR(), phenoRegressor.RFR(), phenoRegressor.dummy(), phenoRegressor.rrBLUP(), phenoregressor.BGLR.multikinships()
## Not run:
### WARNING ###
#The 'tuning' part of the example can take quite some time to run,
#depending on the computational power.

#using the GROAN.KI dataset, we regress on the dataset and predict the first ten phenotypes
phenos = GROAN.KI$yield
phenos[1:10] = NA

#--------- TUNE ---------
#tuning the SVR on a grid of hyperparameters
results.tune = phenoRegressor.SVR(
  phenotypes = phenos,
  genotypes = GROAN.KI$SNPs,
  covariances = NULL,
  extraCovariates = NULL,
  mode = 'tune',
  kernel = 'linear', cost = 10^(-3:+3) #SVR-specific parameters
)

#examining the predictions
plot(GROAN.KI$yield, results.tune$predictions,
     main = 'Mode = TUNING\nTrain set (black) and test set (red) regressions',
     xlab = 'Original phenotypes', ylab = 'Predicted phenotypes')
points(GROAN.KI$yield[1:10], results.tune$predictions[1:10], pch=16, col='red')

#printing correlations
test.set.correlation  = cor(GROAN.KI$yield[1:10], results.tune$predictions[1:10])
train.set.correlation = cor(GROAN.KI$yield[-(1:10)], results.tune$predictions[-(1:10)])
writeLines(paste(
  'test-set correlation :', test.set.correlation,
  '\ntrain-set correlation:', train.set.correlation
))

#--------- TRAIN ---------
#training the SVR, hyperparameters are given
results.train = phenoRegressor.SVR(
  phenotypes = phenos,
  genotypes = GROAN.KI$SNPs,
  covariances = NULL,
  extraCovariates = NULL,
  mode = 'train',
  kernel = 'linear', cost = 0.01 #SVR-specific parameters
)

#examining the predictions
plot(GROAN.KI$yield, results.train$predictions,
     main = 'Mode = TRAIN\nTrain set (black) and test set (red) regressions',
     xlab = 'Original phenotypes', ylab = 'Predicted phenotypes')
points(GROAN.KI$yield[1:10], results.train$predictions[1:10], pch=16, col='red')

#printing correlations
test.set.correlation  = cor(GROAN.KI$yield[1:10], results.train$predictions[1:10])
train.set.correlation = cor(GROAN.KI$yield[-(1:10)], results.train$predictions[-(1:10)])
writeLines(paste(
  'test-set correlation :', test.set.correlation,
  '\ntrain-set correlation:', train.set.correlation
))

#--------- RUN ---------
#we recover the trained model from previous run, predictions will be exactly the same
results.run = phenoRegressor.SVR(
  phenotypes = phenos,
  genotypes = GROAN.KI$SNPs,
  covariances = NULL,
  extraCovariates = NULL,
  mode = 'run',
  tuned.model = results.train$extradata
)

#examining the predictions
plot(GROAN.KI$yield, results.run$predictions,
     main = 'Mode = RUN\nTrain set (black) and test set (red) regressions',
     xlab = 'Original phenotypes', ylab = 'Predicted phenotypes')
points(GROAN.KI$yield[1:10], results.run$predictions[1:10], pch=16, col='red')

#printing correlations
test.set.correlation  = cor(GROAN.KI$yield[1:10], results.run$predictions[1:10])
train.set.correlation = cor(GROAN.KI$yield[-(1:10)], results.run$predictions[-(1:10)])
writeLines(paste(
  'test-set correlation :', test.set.correlation,
  '\ntrain-set correlation:', train.set.correlation
))
## End(Not run)
This function uses the ggplot2 package (which must be installed) to graphically render the result of a run. The function receives as input the output of GROAN.run and returns a ggplot2 object (that can be further customized). Currently implemented types of plot are:
box: boxplot, showing the distribution of repetitions. See geom_boxplot
bar: barplot, showing the average over repetitions. See stat_summary
bar_conf95: same as 'bar', but with 95% confidence intervals
plotResult( res, variable = c("pearson", "spearman", "rmse", "time_per_fold", "coeff_det", "mae"), x.label = c("both", "train_only", "test_only"), plot.type = c("box", "bar", "bar_conf95"), strata = c("no_strata", "avg_strata", "single") )
res: a result data frame containing the output of GROAN.run
variable: name of the variable to be used as y values
x.label: select what to put on the x-axis: both train and test dataset (default), train dataset only, or test dataset only
plot.type: a string indicating the type of plot to be obtained
strata: string determining behaviour toward strata, one of 'no_strata' (the default), 'avg_strata' or 'single'
a ggplot2 object
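plotResult has no example of its own; a minimal sketch, reusing the res object produced in the GROAN.run example above (the variable and plot.type values come from the function signature).

## Not run:
#boxplot of Pearson's correlations, one box per dataset/regressor combination
plotResult(res, variable = 'pearson', plot.type = 'box')

#barplot of RMSE with 95% confidence intervals, labels from the training dataset only
plotResult(res, variable = 'rmse', x.label = 'train_only', plot.type = 'bar_conf95')
## End(Not run)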
Short description for class GROAN.NoisyDataset, created with createNoisyDataset.
## S3 method for class 'GROAN.NoisyDataset' print(x, ...)
x: object of class GROAN.NoisyDataset
...: ignored, put here to match S3 function signature
This function returns the original GROAN.NoisyDataset object invisibly (via invisible(x))
Short description for class GROAN.Workbench, created with createWorkbench.
## S3 method for class 'GROAN.Workbench' print(x, ...)
x: object of class GROAN.Workbench
...: ignored, put here to match S3 function signature
This function returns the original GROAN.Workbench object invisibly (via invisible(x))
Returns a dataframe with some description of an object created with createNoisyDataset.
## S3 method for class 'GROAN.NoisyDataset' summary(object, ...)
object: instance of class GROAN.NoisyDataset
...: additional arguments ignored, added for compatibility with the generic summary function
a data frame with GROAN.NoisyDataset stats.
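A minimal sketch, assuming the nds object from the createNoisyDataset example above.

#short textual description and a data frame of stats for the noisy dataset
print(nds)
summary(nds)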
Performance metrics are averaged over repetitions, so that a data.frame is produced with one row per dataset/regressor/extra_covariates/strata/samples/markers/folds combination.
## S3 method for class 'GROAN.Result' summary(object, ...)
object: an object returned from GROAN.run
...: additional arguments ignored, added for compatibility with the generic summary function
a data.frame with averaged statistics