Conformal prediction with conditional guarantees #455

Open · wants to merge 131 commits into base: master

Conversation

Damien-Bouet
Collaborator

@Damien-Bouet Damien-Bouet commented May 28, 2024

Description

New classes SplitCPRegressor and CCPCalibrator (and other subclasses) implementing the method proposed by Gibbs et al. (2023) and described in issue #449

Fixes #449
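
A minimal usage sketch of the proposed API (class names come from this PR; the import paths, the ``alpha`` init parameter, and the ``GaussianCCP`` argument are assumptions and may differ from the merged version):

import numpy as np
from sklearn.linear_model import LinearRegression
from mapie.regression import SplitCPRegressor  # import path assumed
from mapie.calibrators import GaussianCCP      # import path assumed

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 1))
y = X[:, 0] + rng.normal(scale=0.5 + 0.2 * X[:, 0])  # heteroscedastic noise

# Gaussian kernel features yield adaptive interval widths (see tutorial excerpts below)
mapie_ccp = SplitCPRegressor(LinearRegression(), calibrator=GaussianCCP(20), alpha=0.1)
mapie_ccp.fit(X, y)
y_pred_ccp, y_pi_ccp = mapie_ccp.predict(X)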

Type of change


  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?


  • Added tests covering all the new features
  • All previously existing tests still pass

Checklist

  • I have read the contributing guidelines
  • I have updated the HISTORY.rst and AUTHORS.rst files
  • Linting passes successfully: make lint
  • Typing passes successfully: make type-check
  • Unit tests pass successfully: make tests
  • Coverage is 100%: make coverage
  • Documentation builds successfully: make doc

@Damien-Bouet Damien-Bouet linked an issue May 28, 2024 that may be closed by this pull request
Collaborator

@thibaultcordier thibaultcordier left a comment


Very good first PR, thank you very much! I have a few suggestions (format, style, code, content...). To sum up:

  • Transform PhiFunction into an abstract class.
  • Align the comparison with MapieRegressor so that it follows the same steps (the checks, for instance).
  • Remove verbose warnings and duplicate checks.

Review threads (all resolved): mapie/regression/__init__.py; mapie/regression/ccp_regression.py (×5); mapie/regression/utils/ccp_phi_function.py (×4).
Collaborator

@LacombeLouis LacombeLouis left a comment


Hey @Damien-Bouet,
Thank you and well done!
I have made some initial comments in the ccp_regression.py file. The main thing I'm noticing is a lot of cast() calls and functions that seem to already exist in other classes. Please address these initial comments; I will take a further look at the other files.

Review threads (all resolved): mapie/regression/ccp_regression.py (×10).
but any conformal prediction method can be implemented by the user as
a subclass of :class:`~mapie.calibrators.base.BaseCalibrator`.

Example of naive Split CP:
Collaborator


Why take the naive split CP as an example? Naive here means there is no coverage guarantee, so it is not actually a CP method.

Collaborator Author


You are right; I changed it.

Method's intuition
--------------------

We recall that the `naive` method estimates the absolute residuals by a constant :math:`\hat{q}_{n, \alpha}^+`
Collaborator


Prefer "standard" instead of "naive".
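
For context, the standard (here called naive) split CP recalled above builds constant-width intervals from an empirical quantile of the calibration absolute residuals. This is the standard result, restated with :math:`\hat{\mu}` standing for the fitted model (notation beyond :math:`\hat{q}_{n, \alpha}^+` is ours):

.. math::
    \hat{q}_{n, \alpha}^+ = \text{the } \lceil (n+1)(1-\alpha) \rceil / n \text{ empirical quantile of } |y_i - \hat{\mu}(X_i)|

.. math::
    \hat{C}_{n, \alpha}(X) = \left[ \hat{\mu}(X) - \hat{q}_{n, \alpha}^+, \; \hat{\mu}(X) + \hat{q}_{n, \alpha}^+ \right]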


def fit(
self,
X_calib: ArrayLike,
Collaborator


If you pass X_calib, is it truly equivalent to MapieRegressor?

Collaborator Author


I wanted to do the splitting in the main class (SplitCPRegressor) and not in the calibrators, to keep as much as possible in the main class and make the calibrators as simple as possible. However, some calibrators may need the training or calibration data (CQR, for example, would need both). So here, I pass X_calib, but no worries, it is indeed equivalent to MapieRegressor.
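
A minimal sketch of the design described above, where the main class performs the split and only hands the calibration fold to the calibrator (BaseCalibrator is named in this PR; the exact fit signature shown is an assumption for illustration):

class MyCalibrator(BaseCalibrator):
    def fit(self, X_calib, conformity_scores_calib, **kwargs):
        # The calibrator only sees the calibration fold;
        # SplitCPRegressor has already performed the train/calib split.
        ...
        return self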

- It can create very adaptive intervals (with a varying width that truly reflects the model uncertainty)
- while providing a coverage guarantee on all sub-groups of interest (avoiding biases)
- with the possibility to inject prior knowledge about the data or the model

Collaborator


I would also mention the disadvantages here!

Collaborator Author


I removed the advantages from the theoretical explanation and added them, along with some disadvantages, to the tutorial.

mapie_ccp.fit(X_train, y_train)
y_pred_ccp, y_pi_ccp = mapie_ccp.predict(X_test)

# ================== PLOT ==================
Collaborator


Why don't we plot all the methods on the same graph? The colors are clearly different.

Collaborator Author


For the next plots, where 6 methods are compared together, plotting all of them on the same figure was too much and made it difficult to read. So I decided to plot them 3 by 3 (in this case, the first 3, then the 4th alone).

- calibrator2 = PolynomialCCP(1)
- calibrator3 = PolynomialCCP([0, 3])
+ calibrator2 = PolynomialCCP(1)  # degree=1 is equivalent to degree=[0, 1]
+ calibrator3 = PolynomialCCP([1], variable="y_pred")
Collaborator


Could you give a bit of an explanation of the intuition behind these different calibrators?

Collaborator Author


I explained just above. Tell me if it is not clear enough:

  1. f : X -> (1) will try to estimate the absolute residuals with a constant, and will result in a prediction interval of constant width (like the basic split CP)

  2. f : X -> (1, X) will result in a prediction interval whose width is a constant plus a value proportional to X (a good idea here, as the uncertainty increases with X)

  3. f : X, y_pred -> (y_pred) will result in a prediction interval of width proportional to the prediction (like the basic split CP with a gamma conformity score)
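
A small numpy illustration of the three feature maps above (plain arrays standing in for what the calibrators compute internally; not the actual implementation):

import numpy as np

X = np.linspace(0, 10, 5).reshape(-1, 1)
y_pred = 2 * X[:, 0]

phi1 = np.ones((len(X), 1))             # 1) constant feature -> constant width
phi2 = np.hstack([np.ones_like(X), X])  # 2) (1, X) -> width affine in X
phi3 = y_pred.reshape(-1, 1)            # 3) (y_pred) -> width proportional to the prediction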


calibrator1 = CustomCCP([lambda X: X < 0, lambda X: X >= 0])
# To improve the results, we need to analyse the data
# and the conformity scoreswe chose (here, the absolute residuals).
Collaborator


scores we


##############################################################################
# Using Gaussian distances from randomly sampled points is a good way
# to achieve good overall adaptivity.
# The most adaptive interval is this last brown one, with the two groups
Collaborator


What's your intuition here? Why?

Collaborator Author


I updated the conclusion, also adding some disadvantages, to be more impartial. Tell me if you like it:


Conclusion:
The goal is to get prediction intervals that are as adaptive as possible. Perfect adaptivity would result in a perfectly constant conditional coverage.

Considering this adaptivity criterion, the most adaptive interval is the last brown one, with the two groups and the Gaussian calibrators. In this example, the polynomial calibrator (in purple) also worked well, but the Gaussian one is more generic (it usually works with any dataset, assuming we use the correct parameters, whereas polynomial features are not always suited).

This is the power of the CCP method: combining prior knowledge and generic features (Gaussian kernels) to achieve great overall adaptivity.

However, it can be difficult to find the best calibrator and parameters. Sometimes, a simpler method (standard split with GammaConformityScore, for example) can be enough. Try the simpler methods first, and move on to the more advanced ones only if necessary.
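
A hedged sketch of how the adaptivity criterion above could be checked, i.e. that conditional coverage is roughly constant across bins of X (plain numpy, independent of this PR's API; the (n_samples, 2) interval shape is an assumption):

import numpy as np

def coverage_by_bin(X, y, y_pi, n_bins=5):
    # y_pi: array of shape (n_samples, 2) with lower/upper interval bounds
    covered = (y >= y_pi[:, 0]) & (y <= y_pi[:, 1])
    edges = np.quantile(X[:, 0], np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.digitize(X[:, 0], edges[1:-1]), 0, n_bins - 1)
    # Perfect adaptivity: every bin's coverage is close to 1 - alpha
    return np.array([covered[idx == b].mean() for b in range(n_bins)])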

- ``X``: Input dataset, of shape (n_samples, ``n_in``)
- ``y_pred``: estimator prediction, of shape (n_samples,)
- ``z``: exogenous variable, of shape (n_samples, n_features).
It should be given in the ``fit`` and ``predict`` methods.
Collaborator


Do we have a test for this? By that, I mean checking that the same combination is provided to both fit and predict.

Collaborator Author


I am not sure I understand correctly, but if the calibrator needs a z value, it will fail in both fit and predict if z is not given. So there cannot be an issue where z is forgotten in only one of fit or predict.
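
A hedged sketch of the behaviour described: a calibrator built on z fails identically in fit and predict when z is omitted (CustomCCP and SplitCPRegressor are named in this PR; the z keyword handling shown is an assumption based on the docstring above):

calibrator = CustomCCP([lambda z: z[:, [0]]])  # width driven by z's first column
mapie_ccp = SplitCPRegressor(model, calibrator=calibrator, alpha=0.1)
mapie_ccp.fit(X_train, y_train, z=z_train)          # z required here...
y_pred, y_pi = mapie_ccp.predict(X_test, z=z_test)  # ...and here; omitting z raises in both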

- cs_features = concatenate_functions(self.functions_, params_mapping,
-                                      self._multipliers)
+ cs_features = concatenate_functions(self.functions_, params_mapping)
+ # Normalize
Collaborator


Why do we add this comment?

Collaborator Author


Removed

@@ -10,7 +10,8 @@

 class PolynomialCCP(CCPCalibrator):
     """
-    Calibrator used for the in ``SplitCPRegressor`` or ``SplitCPClassifier``
+    Calibrator based on :class:`~mapie.calibrators.ccp.CCPCalibrator`,
+    used for the in ``SplitCPRegressor`` or ``SplitCPClassifier``
Collaborator


Why do we not use :class:?

Collaborator Author


I added it in the classes docstrings 👍

HISTORY.rst Outdated
@@ -17,6 +17,10 @@ History
 * Building unit tests for different `Subsample` and `BlockBooststrap` instances
 * Change the sign of C_k in the `Kolmogorov-Smirnov` test documentation
 * Building a training set with a fraction between 0 and 1 with `n_samples` attribute when using `split` method from `Subsample` class.
+* Add `SplitCPRegressor`, bsaed on new `SplitCP` abstract class, to support the new CCP method


Typo: `bsaed` instead of `based` :-)

Collaborator Author


Thank you!

HISTORY.rst Outdated
@@ -17,6 +17,10 @@ History
 * Building unit tests for different `Subsample` and `BlockBooststrap` instances
 * Change the sign of C_k in the `Kolmogorov-Smirnov` test documentation
 * Building a training set with a fraction between 0 and 1 with `n_samples` attribute when using `split` method from `Subsample` class.
+* Add `SplitCPRegressor`, based on new `SplitCP` abstract class, to support the new CCP method
+* Add `GaussianCCP`, `PolynomialCCP` and `CustomCCP` based on `CCPCalibrator` to implement the Conditional CP method
+* Add the `StandardCalibrator`, to reproduce standard CP and make sur that the `SplitCPRegressor` is implemented correctly.


Typo: `sur` instead of `sure` :-)

Successfully merging this pull request may close these issues: Conformal Prediction With Conditional Guarantees (#449).

5 participants