Current performance evaluation objects, recently added to `TunedModel` histories, are too big #1105
Closed
Also relevant: #1025
There's evidence that the recent addition of full `PerformanceEvaluation` objects to `TunedModel` histories is blowing up memory requirements in real use cases.

I propose that we create two `PerformanceEvaluation` objects - a detailed one (as we have now) and a new `CompactPerformanceEvaluation` object. The `evaluate` method gets a new keyword argument `compact=false`, and `TunedModel` gets a new hyperparameter `compact_history=true`. (This default would technically break MLJTuning, but I doubt it would affect more than one or two users - and the recent change is not actually documented anywhere yet.) This would also allow us to ultimately address #575, which was shelved for fear of making evaluation objects too big.

Further thoughts, anyone?

cc @CameronBieganek, @OkonSamuel
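A sketch of what the proposed user-facing API might look like. Note that `compact`, `compact_history`, and `CompactPerformanceEvaluation` are proposals from this issue, not existing MLJ API, and the model/data choices below are placeholders:

```julia
using MLJ

X, y = @load_iris
Tree = @load DecisionTreeClassifier pkg=DecisionTree verbosity=0

# Proposed: `compact=true` would make `evaluate` return the new
# `CompactPerformanceEvaluation`, dropping the heavyweight fields
# (e.g. `fitted_params_per_fold`, `report_per_fold`, `per_observation`):
e = evaluate(Tree(), X, y;
             resampling=CV(nfolds=3), measure=accuracy, compact=true)

# Proposed: `compact_history=true` (the suggested default) would make
# `TunedModel` store only compact evaluations in its history:
# self_tuning = TunedModel(model=Tree(), range=..., compact_history=true)
```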
Below are the fields of the current struct. I've ticked off suggested fields for the compact case. I suppose the only one that might be controversial is `observations_per_fold`. This was always included in `TunedModel` histories previously, so it seems less disruptive to include it.

Fields
These fields are part of the public API of the `PerformanceEvaluation` struct.

- `model`: model used to create the performance evaluation. In the case of a tuning model, this is the best model found.
- `measure`: vector of measures (metrics) used to evaluate performance.
- `measurement`: vector of measurements - one for each element of `measure` - aggregating the performance measurements over all train/test pairs (folds). The aggregation method applied for a given measure `m` is `StatisticalMeasuresBase.external_aggregation_mode(m)` (commonly `Mean()` or `Sum()`).
- `operation` (e.g., `predict_mode`): the operations applied for each measure to generate predictions to be evaluated. Possibilities are: $PREDICT_OPERATIONS_STRING.
- `per_fold`: a vector of vectors of individual test fold evaluations (one vector per measure). Useful for obtaining a rough estimate of the variance of the performance estimate.
- `per_observation`: a vector of vectors of vectors containing individual per-observation measurements: for an evaluation `e`, `e.per_observation[m][f][i]` is the measurement for the `i`th observation in the `f`th test fold, evaluated using the `m`th measure. Useful for some forms of hyper-parameter optimization. Note that an aggregated measurement for some measure `measure` is repeated across all observations in a fold if `StatisticalMeasures.can_report_unaggregated(measure) == true`. If `e` has been computed with the `per_observation=false` option, then `e.per_observation` is a vector of `missing`s.
- `fitted_params_per_fold`: a vector containing `fitted_params(mach)` for each machine `mach` trained during resampling - one machine per train/test pair. Use this to extract the learned parameters for each individual training event.
- `report_per_fold`: a vector containing `report(mach)` for each machine `mach` trained in resampling - one machine per train/test pair.
- `train_test_rows`: a vector of tuples, each of the form `(train, test)`, where `train` and `test` are vectors of row (observation) indices for training and evaluation respectively.
- `resampling`: the resampling strategy used to generate the train/test pairs.
- `repeats`: the number of times the resampling strategy was repeated.