Added Extended Information Criterion to bootlm and bumped version number
acp29 committed Jun 14, 2024
1 parent 2e88da1 commit 2eedd18
Showing 4 changed files with 47 additions and 14 deletions.
14 changes: 7 additions & 7 deletions DESCRIPTION
@@ -1,6 +1,6 @@
name: statistics-resampling
-version: 5.5.18
-date: 2024-06-04
+version: 5.6.0
+date: 2024-06-14
author: Andrew Penn <[email protected]>
maintainer: Andrew Penn <[email protected]>
title: A statistics package with a variety of resampling tools
@@ -9,11 +9,11 @@ description: The statistics-resampling package is an Octave
variety of statistics tasks using non-parametric resampling
methods. In particular, the functions included can be used to
estimate bias, uncertainty (standard errors and confidence
-intervals), prediction error, and calculate p-values for null
-hypothesis significance tests. Variations of the resampling
-methods are included that improve the accuracy of the
-statistics for small samples and samples with complex
-dependence structures.
+intervals), prediction error, information criteria, and
+calculate p-values for null hypothesis significance tests.
+Variations of the resampling methods are included that
+improve the accuracy of the statistics for small samples
+and samples with complex dependence structures.
license: GPLv3+
depends: octave (>= 4.4.0)
url: https://github.com/gnu-octave/statistics-resampling
41 changes: 37 additions & 4 deletions inst/bootlm.m
@@ -464,6 +464,9 @@
% - 'PE': Bootstrap estimate of prediction error
% - 'PRESS': Bootstrap estimate of predicted residual error sum of squares
% - 'RSQ_pred': Bootstrap estimate of predicted R-squared
+% - 'EIC': Extended (Efron) Information Criterion
+% - 'RL': Relative likelihood (compared to the intercept-only model)
+% - 'Wt': Prediction error expressed as weights
%
% The linear models evaluated are the same as for AOVSTAT, except that the
% output also includes the statistics for the intercept-only model. Note
@@ -2125,6 +2128,20 @@
OPTIM = S_ERR - A_ERR; % Optimism in apparent error
PE = cell2mat (RSS) / n + sum (OPTIM, 2) / NBOOT;

+% Compute the Extended (Efron) Information Criterion, weights and relative
+% likelihood
+% See Konishi & Kitagawa, "Bootstrap Information Criterion". In: Information
+% Criteria and Statistical Modeling. Springer Series in Statistics. Springer, NY.
+LogLik = @(var) -n / 2 * log (2 * pi) - n / 2 * log (var) - n / 2;
+S_LL = LogLik (S_ERR); % Simple estimate of expected log-likelihood
+A_LL = LogLik (A_ERR); % Apparent estimates of log-likelihood
+b = sum (A_LL - S_LL, 2) / NBOOT; % Bootstrap bias estimate of log-likelihood
+LL = LogLik (cell2mat (RSS) / n); % Log-likelihood of model on original sample
+EIC = -2 * LL + 2 * b; % Extended (Efron) Information Criterion
+RL = exp (0.5 * (EIC(1) - EIC)); % Relative likelihood compared to intercept-
+% only model: exp (0.5 * (EIC[H0] - EIC[H1]))
+Wt = RL / sum (RL); % EIC weights

% Transform prediction errors to predicted R-squared statistics
PRESS = PE * n; % Bootstrap estimate of predicted
% residual error sum of squares
@@ -2133,7 +2150,8 @@
% by refined bootstrap

% Prepare output
-PRED_ERR = struct ('MODEL', [], 'PE', PE, 'PRESS', PRESS, 'RSQ_pred', PE_RSQ);
+PRED_ERR = struct ('MODEL', [], 'PE', PE, 'PRESS', PRESS, ...
+'RSQ_pred', PE_RSQ, 'EIC', EIC, 'RL', RL, 'Wt', Wt);

end

@@ -2801,8 +2819,9 @@

%!demo
%!
-%! % Prediction errors of linear models. Data from Table 9.1, on page 107 of
-%! % Efron and Tibshirani (1993) An Introduction to the Bootstrap.
+%! % Prediction errors and information criteria of linear models.
+%! % Data from Table 9.1, on page 107 of Efron and Tibshirani (1993)
+%! % An Introduction to the Bootstrap.
%!
%! amount = [25.8; 20.5; 14.3; 23.2; 20.6; 31.1; 20.9; 20.9; 30.4; ...
%! 16.3; 11.6; 11.8; 32.5; 32.0; 18.0; 24.1; 26.5; 25.8; ...
@@ -2828,6 +2847,17 @@
%! % Efron and Tibshirani (1993) using the same refined bootstrap procedure,
%! % because they have used case resampling whereas we have used wild bootstrap
%! % resampling. The equivalent value of Cp (eq. to AIC) statistic is 2.96.
+%!
+%! PRED_ERR
+%!
+%! % The results from the bootstrap are broadly consistent with the results
+%! % obtained for Akaike's Information Criterion (AIC, computed in R with the
+%! % AIC function from the 'stats' package):
+%! %
+%! % MODEL                        AIC       EIC
+%! % amount ~ 1                   180.17    178.41
+%! % amount ~ 1 + hrs             127.33    125.00
+%! % amount ~ 1 + hrs + lot       107.85    106.36

%!demo
%!
@@ -2884,12 +2914,15 @@
 %! % The results from the bootstrap are broadly consistent with the results
 %! % obtained for PE, PRESS and RSQ_pred using cross-validation:
 %! %
-%! % MODEL PE-CV PRESS-CV RSQ_pred-CV
+%! % MODEL                                  PE-CV    PRESS-CV    RSQ_pred-CV
 %! % sr ~ 1                                 20.48    1024.186    -0.041
 %! % sr ~ 1 + pop15                         16.88    843.910     +0.142
 %! % sr ~ 1 + pop15 + pop75                 16.62    830.879     +0.155
 %! % sr ~ 1 + pop15 + pop75 + dpi           16.54    827.168     +0.159
 %! % sr ~ 1 + pop15 + pop75 + dpi + ddpi    15.98    798.939     +0.188
+%!
+%! % Relative likelihood of the nested models (excluding intercept-only model)
+%! PRED_ERR.RL(2:end) ./ PRED_ERR.RL(1:end-1)

%!test
%!
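Reader's note on the EIC hunk of inst/bootlm.m above: the added statements can be read as the formulas below. This is a transcription of the committed code, not a quotation from the cited reference. The notation is introduced here for illustration only: $B$ stands for `NBOOT`, $\hat\sigma_i^2$ for element $i$ of `cell2mat (RSS) / n`, and $S_{ik}$, $A_{ik}$ for the entries of `S_ERR` and `A_ERR` for model $i$ and bootstrap resample $k$. Model 1 is assumed to be the intercept-only model, as the code comments state.

```latex
\begin{align}
  \ell(v) &= -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(v) - \frac{n}{2} \\
  b_i &= \frac{1}{B}\sum_{k=1}^{B}\bigl[\ell(A_{ik}) - \ell(S_{ik})\bigr] \\
  \mathrm{EIC}_i &= -2\,\ell(\hat\sigma_i^2) + 2\,b_i \\
  \mathrm{RL}_i &= \exp\!\bigl(\tfrac{1}{2}(\mathrm{EIC}_1 - \mathrm{EIC}_i)\bigr) \\
  \mathrm{Wt}_i &= \frac{\mathrm{RL}_i}{\sum_j \mathrm{RL}_j}
\end{align}
```

Under these definitions $\mathrm{RL}_1 = 1$ and the weights sum to 1. The ratio `PRED_ERR.RL(2:end) ./ PRED_ERR.RL(1:end-1)` used in the demo reduces to $\exp\bigl(\tfrac{1}{2}(\mathrm{EIC}_{i-1} - \mathrm{EIC}_i)\bigr)$, that is, the relative likelihood of each model against the model nested immediately inside it.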
Binary file modified matlab/statistics-resampling.mltbx
Binary file not shown.
6 changes: 3 additions & 3 deletions matlab/statistics-resampling.prj
@@ -1,13 +1,13 @@
<deployment-project plugin="plugin.toolbox" plugin-version="1.0">
-<configuration build-checksum="3306453136" file="Y:\Documents\GitHub\statistics-resampling\matlab\statistics-resampling.prj" location="Y:\Documents\GitHub\statistics-resampling\matlab" name="statistics-resampling" target="target.toolbox" target-name="Package Toolbox">
+<configuration build-checksum="1289251195" file="Y:\Documents\GitHub\statistics-resampling\matlab\statistics-resampling.prj" location="Y:\Documents\GitHub\statistics-resampling\matlab" name="statistics-resampling" target="target.toolbox" target-name="Package Toolbox">
<param.appname>statistics-resampling</param.appname>
<param.authnamewatermark>Andrew Penn</param.authnamewatermark>
<param.email>[email protected]</param.email>
<param.company>University of Sussex, UK</param.company>
<param.summary>Statistical analysis using resampling methods</param.summary>
-<param.description>The statistics-resampling package is an Octave package and Matlab toolbox that can be used to perform a wide variety of statistics tasks using non-parametric resampling methods. In particular, the functions included can be used to estimate bias, uncertainty (standard errors and confidence intervals), prediction error, and calculate p-values for null hypothesis significance tests. Variations of the resampling methods are included that improve the accuracy of the statistics for small samples and samples with complex dependence structures.</param.description>
+<param.description>The statistics-resampling package is an Octave package and Matlab toolbox that can be used to perform a wide variety of statistics tasks using non-parametric resampling methods. In particular, the functions included can be used to estimate bias, uncertainty (standard errors and confidence intervals), prediction error, information criteria, and calculate p-values for null hypothesis significance tests. Variations of the resampling methods are included that improve the accuracy of the statistics for small samples and samples with complex dependence structures.</param.description>
<param.screenshot>Y:\Documents\GitHub\statistics-resampling\doc\icon.png</param.screenshot>
-<param.version>5.5.18</param.version>
+<param.version>5.6.0</param.version>
<param.output>${PROJECT_ROOT}\statistics-resampling.mltbx</param.output>
<param.products.name />
<param.products.id />