ivmodels.models package

Submodules

ivmodels.models.anchor_regression module

class ivmodels.models.anchor_regression.AnchorMixin(gamma=1, instrument_names=None, instrument_regex=None, *args, **kwargs)

Bases: KClassMixin

Mixin class for anchor regression.

property gamma

class ivmodels.models.anchor_regression.AnchorRegression(gamma=1, instrument_names=None, instrument_regex=None, alpha=0, l1_ratio=0, fit_intercept=True)

Bases: AnchorMixin, GeneralizedLinearRegressor

Linear regression with anchor regularization [Rothenhäusler et al., 2021].

The anchor regression estimator with parameter \(\gamma\) is defined as

\[\hat\beta_\mathrm{anchor}(\gamma) := \arg\min_\beta \ \| y - X \beta \|_2^2 + (\gamma - 1) \|P_Z (y - X \beta) \|_2^2.\]

If \(\gamma > 0\), then \(\hat\beta_\mathrm{anchor}(\gamma) = \hat\beta_\mathrm{k-class}((\gamma - 1) / \gamma)\).

The optimization is based on OLS after a data transformation. The estimator first centers X and y by subtracting the column means, as proposed by Rothenhäusler et al. [2021]. Consequently, no anchor regularization is applied to the intercept.
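
The relation between the anchor objective, the data transformation, and the k-class estimator can be checked numerically. The following is a minimal NumPy sketch, not the library's implementation: the synthetic data, the variable names, and the explicit transformation matrix W are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, gamma = 200, 5.0

# Synthetic data (illustrative only): two anchors Z, two regressors X, outcome y.
Z = rng.normal(size=(n, 2))
X = Z @ rng.normal(size=(2, 2)) + rng.normal(size=(n, 2))
y = X @ np.array([1.0, -1.0]) + Z[:, 0] + rng.normal(size=n)

# Center all variables so that the intercept is not penalized.
Z, X, y = Z - Z.mean(axis=0), X - X.mean(axis=0), y - y.mean()

# Projection matrix onto the column space of Z.
P_Z = Z @ np.linalg.solve(Z.T @ Z, Z.T)

# W = Id + (sqrt(gamma) - 1) P_Z satisfies
# ||W r||^2 = ||r||^2 + (gamma - 1) ||P_Z r||^2 for every residual r,
# so OLS on (W X, W y) minimizes the anchor objective.
W = np.eye(n) + (np.sqrt(gamma) - 1.0) * P_Z
beta_anchor, *_ = np.linalg.lstsq(W @ X, W @ y, rcond=None)

# Closed-form k-class estimator with kappa = (gamma - 1) / gamma.
kappa = (gamma - 1.0) / gamma
A = kappa * P_Z + (1.0 - kappa) * np.eye(n)
beta_kclass = np.linalg.solve(X.T @ A @ X, X.T @ A @ y)

print(np.allclose(beta_anchor, beta_kclass))
```

Forming the n-by-n matrices P_Z and W explicitly keeps the sketch close to the formulas; for large n one would project X and y onto Z directly instead.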

Parameters:
  • gamma (float) – The anchor regularization parameter. gamma=1 corresponds to OLS.

  • instrument_names (str or list of str, optional) – The names of the columns in X that should be used as instruments (anchors). Requires X to be a pandas DataFrame. If both instrument_names and instrument_regex are specified, the union of the two is used.

  • instrument_regex (str, optional) – A regex that is used to select columns in X that should be used as instruments (anchors). Requires X to be a pandas DataFrame. If both instrument_names and instrument_regex are specified, the union of the two is used.

  • alpha (float, optional, default=0) – Regularization parameter for elastic net regularization.

  • l1_ratio (float, optional, default=0) – Ratio of L1 to L2 regularization for elastic net regularization. For l1_ratio=0 the penalty is an L2 penalty. For l1_ratio=1 it is an L1 penalty.

coef_

The estimated coefficients for the linear regression problem.

Type:

array-like, shape (n_features,)

intercept_

The estimated intercept for the linear regression problem.

Type:

float

kappa_

The kappa parameter of the corresponding k-class estimator.

Type:

float

References

[RMBP21]

Dominik Rothenhäusler, Nicolai Meinshausen, Peter Bühlmann, and Jonas Peters. Anchor regression: heterogeneous data meet causality. Journal of the Royal Statistical Society Series B: Statistical Methodology, 83(2):215–246, 2021.

set_fit_request(*, C: bool | None | str = '$UNCHANGED$', Z: bool | None | str = '$UNCHANGED$') AnchorRegression

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • C (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for C parameter in fit.

  • Z (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for Z parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, C: bool | None | str = '$UNCHANGED$') AnchorRegression

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

C (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for C parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, context: bool | None | str = '$UNCHANGED$', offset: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') AnchorRegression

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • context (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for context parameter in score.

  • offset (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for offset parameter in score.

  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

ivmodels.models.kclass module

class ivmodels.models.kclass.KClass(kappa=1, instrument_names=None, instrument_regex=None, exogenous_names=None, exogenous_regex=None, alpha=0, l1_ratio=0, fit_intercept=True)

Bases: KClassMixin, GeneralizedLinearRegressor

K-class estimator for instrumental variable regression.

The k-class estimator with parameter \(\kappa\) is defined as

\[\begin{split}\hat\beta_\mathrm{k-class}(\kappa) &:= \arg\min_\beta \ (1 - \kappa) \| y - X \beta \|_2^2 + \kappa \|P_Z (y - X \beta) \|_2^2 \\ &= (X^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) X)^{-1} X^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) y,\end{split}\]

where \(P_Z = Z (Z^T Z)^{-1} Z^T\) is the projection matrix onto the subspace spanned by \(Z\) and \(\mathrm{Id}\) is the identity matrix. This includes the ordinary least-squares (OLS) estimator (\(\kappa = 0\)), the two-stage least-squares (2SLS) estimator (\(\kappa = 1\)), the limited information maximum likelihood (LIML) estimator (\(\kappa = \hat\kappa_\mathrm{LIML}\)), and the Fuller estimators (\(\kappa = \hat\kappa_\mathrm{LIML} - \alpha / (n - k)\)) as special cases.

Specifying exogenous included regressors \(C\) is equivalent to including them into both \(Z\) and \(X\).
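
The special cases \(\kappa = 0\) (OLS) and \(\kappa = 1\) (2SLS) follow directly from the closed form. A minimal NumPy sketch, assuming no intercept and no exogenous regressors; the helper name k_class and the synthetic data are illustrative and not part of the library:

```python
import numpy as np


def k_class(X, y, Z, kappa):
    """Closed-form k-class estimate (no intercept, no exogenous regressors)."""
    n = X.shape[0]
    P_Z = Z @ np.linalg.solve(Z.T @ Z, Z.T)
    A = kappa * P_Z + (1.0 - kappa) * np.eye(n)
    return np.linalg.solve(X.T @ A @ X, X.T @ A @ y)


rng = np.random.default_rng(1)
n = 300
Z = rng.normal(size=(n, 3))
U = rng.normal(size=n)  # unobserved confounder
X = Z @ rng.normal(size=(3, 2)) + U[:, None] + rng.normal(size=(n, 2))
y = X @ np.array([2.0, -1.0]) + U + rng.normal(size=n)

# kappa = 0 reproduces ordinary least squares.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# kappa = 1 reproduces 2SLS: regress y on the first-stage fitted values.
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
beta_tsls, *_ = np.linalg.lstsq(X_hat, y, rcond=None)

print(np.allclose(k_class(X, y, Z, 0.0), beta_ols))
print(np.allclose(k_class(X, y, Z, 1.0), beta_tsls))
```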

Parameters:
  • kappa (float or {"ols", "tsls", "2sls", "liml", "fuller", "fuller(a)"}) – The kappa parameter of the k-class estimator. If a string, it must be one of "ols", "2sls", "tsls", "liml", "fuller", or "fuller(a)", where a is numeric. If kappa="ols", then \(\kappa = 0\) and the k-class estimator is the ordinary least-squares estimator. If kappa="tsls" or kappa="2sls", then \(\kappa = 1\) and the k-class estimator is the two-stage least-squares estimator. If kappa="liml", then \(\kappa = \hat\kappa_\mathrm{LIML}\) is used, where \(\hat\kappa_\mathrm{LIML} \geq 1\) is the smallest eigenvalue of the matrix \(((X \ \ y)^T M_Z (X \ \ y))^{-1} (X \ \ y)^T (X \ \ y)\), where \(P_Z\) is the projection matrix onto the subspace spanned by \(Z\) and \(M_Z = \mathrm{Id} - P_Z\). If exogenous included regressors \(C\) are specified, then \(\hat\kappa_\mathrm{LIML}\) is the smallest eigenvalue of the matrix \(((X \ \ y)^T M_{[Z, C]} (X \ \ y))^{-1} (X \ \ y)^T M_C (X \ \ y)\). If kappa="fuller(a)", then \(\kappa = \hat\kappa_\mathrm{LIML} - a / (n - k - mc)\), where \(n\) is the number of observations, \(k = \mathrm{dim}(Z)\) is the number of instruments, and \(mc\) is the number of exogenous included regressors in \(C\). The string "fuller" is interpreted as "fuller(1.0)", yielding an estimator that is unbiased up to \(O(1/n)\) [Fuller, 1977].

  • instrument_names (str or list of str, optional) – The names of the columns in X that should be used as instruments. Requires X argument of fit method to be a pandas DataFrame. If both instrument_names and instrument_regex are specified, the union of the two is used.

  • instrument_regex (str, optional) – A regex that is used to select columns in X that should be used as instruments. Requires X argument of fit method to be a pandas DataFrame. If both instrument_names and instrument_regex are specified, the union of the two is used.

  • exogenous_names (str or list of str, optional) – The names of the columns in X that should be used as exogenous regressors. Requires X argument of fit method to be a pandas DataFrame. If both exogenous_names and exogenous_regex are specified, the union of the two is used.

  • exogenous_regex (str, optional) – A regex that is used to select columns in X that should be used as exogenous regressors. Requires X argument of fit method to be a pandas DataFrame. If both exogenous_names and exogenous_regex are specified, the union of the two is used.

  • alpha (float, optional, default=0) – Regularization parameter for elastic net regularization. Only implemented for \(\kappa \leq 1\).

  • l1_ratio (float, optional, default=0) – Ratio of L1 to L2 regularization for elastic net regularization. For l1_ratio=0 the penalty is an L2 penalty. For l1_ratio=1 it is an L1 penalty. Only implemented for \(\kappa \leq 1\).
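
The mapping from a kappa specification to a numeric value can be sketched as follows. The helper resolve_kappa is hypothetical and only mirrors the rules stated for the kappa parameter; it assumes that kappa_liml has already been computed and that k and mc denote the numbers of instruments and of exogenous included regressors.

```python
import re


def resolve_kappa(kappa, kappa_liml=None, n=None, k=None, mc=0):
    """Map a kappa specification to a number (hypothetical helper)."""
    if not isinstance(kappa, str):
        return float(kappa)
    name = kappa.lower()
    if name == "ols":
        return 0.0
    if name in ("tsls", "2sls"):
        return 1.0
    if name == "liml":
        return float(kappa_liml)
    # Accept "fuller" or "fuller(a)" with a numeric a.
    match = re.fullmatch(r"fuller(\((?P<a>[^()]+)\))?", name)
    if match is None:
        raise ValueError(f"Invalid kappa: {kappa!r}")
    # A bare "fuller" is read as "fuller(1.0)".
    a = float(match.group("a")) if match.group("a") is not None else 1.0
    return kappa_liml - a / (n - k - mc)


print(resolve_kappa("2sls"))
print(resolve_kappa("fuller", kappa_liml=1.2, n=100, k=5))
print(resolve_kappa("fuller(4)", kappa_liml=1.2, n=100, k=5, mc=1))
```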

coef_

The estimated coefficients for the linear regression problem.

Type:

array-like, shape (n_features,)

intercept_

The estimated intercept for the linear regression problem.

Type:

float

kappa_

The numerical kappa parameter of the k-class estimator.

Type:

float

fuller_alpha_

If kappa is one of {"fuller", "fuller(a)", "liml"} for some numeric value a, the alpha parameter of the Fuller estimator.

Type:

float

ar_min_

If kappa is one of {"fuller", "fuller(a)", "liml"} for some numeric value a, the minimum of the unnormalized Anderson-Rubin statistic.

Type:

float

kappa_liml_

If kappa is one of {"fuller", "fuller(a)", "liml"} for some numeric value a, the kappa parameter of the LIML estimator, equal to 1 + ar_min_.

Type:

float

named_coefs_

If X was a pandas DataFrame, the estimated coefficients for the linear regression problem with the variable names as index.

Type:

array-like, shape (n_features,)

References

[Ful77]

Wayne A Fuller. Some properties of a modification of the limited information estimator. Econometrica: Journal of the Econometric Society, pages 939–953, 1977.

set_fit_request(*, C: bool | None | str = '$UNCHANGED$', Z: bool | None | str = '$UNCHANGED$') KClass

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • C (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for C parameter in fit.

  • Z (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for Z parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, C: bool | None | str = '$UNCHANGED$') KClass

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

C (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for C parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, context: bool | None | str = '$UNCHANGED$', offset: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') KClass

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • context (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for context parameter in score.

  • offset (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for offset parameter in score.

  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class ivmodels.models.kclass.KClassMixin(kappa=1, instrument_names=None, instrument_regex=None, exogenous_names=None, exogenous_regex=None, *args, **kwargs)

Bases: object

Mixin class for k-class estimators.

static ar_min(X, y, Z=None, X_proj=None, y_proj=None)

Compute the minimum of the unnormalized Anderson-Rubin statistic.

Computes

\[\begin{split}&\min_{\beta} \frac{(y - X \beta)^T P_Z (y - X \beta)}{(y - X \beta)^T M_Z (y - X \beta)} \\ &=\lambda_\mathrm{min}(((X y)^T M_Z (X y))^{-1} (X y)^T P_Z (X y)),\end{split}\]

where \(P_Z\) is the projection matrix onto the subspace spanned by \(Z\) and \(M_Z = \mathrm{Id} - P_Z\).

Either Z or both X_proj and y_proj must be specified.

Parameters:
  • X (np.ndarray of dimension (n, mx)) – Possibly endogenous regressors.

  • y (np.ndarray of dimension (n,)) – Outcome.

  • Z (np.ndarray of dimension (n, k), optional, default=None) – Instruments.

  • X_proj (np.ndarray of dimension (n, mx), optional, default=None) – Projection of X onto the subspace orthogonal to Z.

  • y_proj (np.ndarray of dimension (n,), optional, default=None) – Projection of y onto the subspace orthogonal to Z.

Returns:

ar_min – The smallest eigenvalue of \(((X y)^T M_Z (X y))^{-1} (X y)^T P_Z (X y)\), where \(P_Z\) is the projection matrix onto the subspace spanned by Z.

Return type:

float
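
The eigenvalue characterization and its link to LIML can be verified on synthetic data. This is a minimal NumPy sketch under the assumption of no exogenous regressors and no intercept; the function below is a stand-alone illustration, not the library's static method.

```python
import numpy as np


def ar_min(X, y, Z):
    """Smallest eigenvalue of ((X y)^T M_Z (X y))^{-1} (X y)^T P_Z (X y)."""
    W = np.column_stack([X, y])
    W_proj = Z @ np.linalg.lstsq(Z, W, rcond=None)[0]  # P_Z (X y)
    W_orth = W - W_proj                                # M_Z (X y)
    eigvals = np.linalg.eigvals(
        np.linalg.solve(W_orth.T @ W_orth, W_proj.T @ W_proj)
    )
    return float(eigvals.real.min())


rng = np.random.default_rng(2)
n = 300
Z = rng.normal(size=(n, 4))
U = rng.normal(size=n)  # unobserved confounder
X = Z @ rng.normal(size=(4, 2)) + U[:, None] + rng.normal(size=(n, 2))
y = X @ np.array([1.0, 0.5]) + U + rng.normal(size=n)

lam = ar_min(X, y, Z)
kappa_liml = 1.0 + lam

# LIML is the k-class estimator with kappa = kappa_liml.
P_Z = Z @ np.linalg.solve(Z.T @ Z, Z.T)
A = kappa_liml * P_Z + (1.0 - kappa_liml) * np.eye(n)
beta_liml = np.linalg.solve(X.T @ A @ X, X.T @ A @ y)

# The ratio in the definition, evaluated at the LIML estimate, attains the minimum.
r = y - X @ beta_liml
ratio = (r @ P_Z @ r) / (r @ (r - P_Z @ r))
print(np.isclose(ratio, lam))
```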

fit(X, y, Z=None, C=None, *args, **kwargs)

Fit a k-class or anchor regression estimator.

If instrument_names or instrument_regex are specified, X must be a pandas DataFrame containing columns instrument_names and Z must be None. At least one of Z, instrument_names, and instrument_regex must be specified. If exogenous_names or exogenous_regex are specified, X must be a pandas DataFrame containing columns exogenous_names and C must be None.

Parameters:
  • X (array-like, shape (n_samples, n_features)) – The training input samples. If instrument_names or instrument_regex are specified, X must be a pandas DataFrame containing columns instrument_names.

  • y (array-like, shape (n_samples,)) – The target values.

  • Z (array-like, shape (n_samples, n_instruments), optional) – The instrument values. If instrument_names or instrument_regex are specified, Z must be None. If Z is specified, instrument_names and instrument_regex must be None.

  • C (array-like, shape (n_samples, n_exogenous), optional) – The exogenous regressors. If exogenous_names or exogenous_regex are specified, C must be None. If C is specified, exogenous_names and exogenous_regex must be None.
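
The "union of names and regex" rule for selecting instrument columns can be illustrated without the library. The helper select_columns below is hypothetical; it assumes that the regex is applied with re.search, as pandas DataFrame.filter(regex=...) does, which may differ from the library's actual matching.

```python
import re


def select_columns(columns, names=None, regex=None):
    """Return the columns selected by name or by regex, in their original order."""
    if isinstance(names, str):
        names = [names]
    selected = set(names or [])
    missing = selected - set(columns)
    if missing:
        raise ValueError(f"Columns not found: {sorted(missing)}")
    if regex is not None:
        pattern = re.compile(regex)
        # Union of the explicitly named columns and the regex matches.
        selected |= {c for c in columns if pattern.search(c)}
    return [c for c in columns if c in selected]


columns = ["x1", "x2", "z_age", "z_site", "batch"]
instruments = select_columns(columns, names="batch", regex=r"^z_")
regressors = [c for c in columns if c not in instruments]
print(instruments)
print(regressors)
```

In this sketch the selected columns play the role of Z and the remaining columns the role of the regressors, which is why Z must not be passed separately in that case.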

property named_coefs_

predict(X, C=None, *args, **kwargs)

summary(X, y, Z=None, C=None, test='wald', alpha=0.05, feature_names=None, **kwargs)

Create Summary object for the fitted model.

This contains the fitted values (estimates), subvector test statistics for each parameter, corresponding p-values, and confidence sets.

Parameters:
  • X (array-like, shape (n_samples, n_features)) – The input data.

  • y (array-like, shape (n_samples,)) – The target values.

  • Z (array-like, shape (n_samples, n_instruments), optional) – The instrument values. If instrument_names or instrument_regex are specified, Z must be None. If Z is specified, instrument_names and instrument_regex must be None.

  • C (array-like, shape (n_samples, n_exogenous), optional) – The exogenous regressors. If exogenous_names or exogenous_regex are specified, C must be None. If C is specified, exogenous_names and exogenous_regex must be None.

  • test (str, optional, default="wald") – The test to use. Must be one of "wald", "anderson-rubin", "lagrange multiplier", "likelihood-ratio", or "conditional likelihood-ratio".

  • alpha (float, optional, default=0.05) – The significance level.

  • feature_names (list of str, optional) – Names of the features to be included in the summary. If not specified, all features will be included.

  • **kwargs – Additional keyword arguments to pass to the test and its inversion.

ivmodels.models.pulse module

class ivmodels.models.pulse.PULSE(instrument_names=None, instrument_regex=None, p_min=0.05, rtol=0.01, kappa_max=1, alpha=0, l1_ratio=0)

Bases: PULSEMixin, KClass

p-uncorrelated least squares estimator (PULSE) [Jakobsen and Peters, 2022].

Perform k-class estimation with k-class parameter \(\kappa \in [0, \kappa_\mathrm{max}]\) chosen minimally such that the PULSE test of correlation between the instruments and the residuals is not significant at level p_min.

Parameters:
  • instrument_names (str or list of str, optional) – The names of the columns in X that should be used as anchors. Requires X to be a pandas DataFrame.

  • instrument_regex (str, optional) – A regex that is used to select columns in X that should be used as anchors. Requires X to be a pandas DataFrame. If both instrument_names and instrument_regex are specified, the union of the two is used.

  • p_min (float, optional, default = 0.05) – The p-value of the PULSE test that is used to determine the k-class parameter \(\kappa\). The PULSE will search for the smallest \(\kappa\) that makes the test not significant at level p_min with binary search.

  • rtol (float, optional, default = 0.01) – The relative tolerance of the binary search. The PULSE will search for a \(\kappa\) such that the PULSE test is not significant at level p_min but is significant at level p_min * (1 + rtol).

  • kappa_max (float, optional, default = 1) – The maximum value of kappa to consider. The PULSE will search for the smallest kappa that makes the test not significant at level p_min with binary search. If kappa_max = 1, the PULSE will run a regression equivalent to two-stage least-squares. If alpha = 0 and Z.shape[1] < X.shape[1], this is not well-defined and the PULSE will raise an exception.

  • alpha (float, optional, default = 0) – The regularization parameter for elastic net. If alpha is 0, the estimator is unregularized.

  • l1_ratio (float, optional, default = 0) – The ratio of L1 to L2 regularization for elastic net. If l1_ratio is 1, the estimator is Lasso. If l1_ratio is 0, the estimator is Ridge.
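
The roles of p_min, rtol, and kappa_max in the binary search can be sketched in isolation. The function below is a hypothetical stand-in, not the library's routine: it assumes the p-value of the test is a continuous, nondecreasing function of kappa, and it uses a toy p-value function in place of the actual PULSE test. Raising an error when the test is still rejected at kappa_max is a choice of this sketch.

```python
def smallest_kappa(p_value, p_min=0.05, rtol=0.01, kappa_max=1.0, max_iter=100):
    """Bisect for the smallest kappa whose p-value is at least p_min.

    Assumes p_value(kappa) is continuous and nondecreasing on [0, kappa_max].
    """
    if p_value(0.0) >= p_min:
        return 0.0  # OLS already passes the test.
    if p_value(kappa_max) < p_min:
        raise ValueError("The test is rejected even at kappa_max.")
    low, high = 0.0, kappa_max  # invariant: p(low) < p_min <= p(high)
    for _ in range(max_iter):
        # Stop once the test at `high` is significant at level p_min * (1 + rtol).
        if p_value(high) < p_min * (1.0 + rtol):
            break
        mid = 0.5 * (low + high)
        if p_value(mid) >= p_min:
            high = mid
        else:
            low = mid
    return high


def toy_p_value(kappa):
    """Toy monotone stand-in for the PULSE test's p-value."""
    return kappa**3


kappa = smallest_kappa(toy_p_value)
print(0.05 <= toy_p_value(kappa) < 0.05 * 1.01)
```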

coef_

The estimated coefficients.

Type:

array-like, shape (n_features,)

intercept_

The estimated intercept.

Type:

float

kappa_

The estimated kappa.

Type:

float

References

[JP22]

Martin Emil Jakobsen and Jonas Peters. Distributional robustness of k-class estimators and the PULSE. The Econometrics Journal, 25(2):404–432, 2022.

set_fit_request(*, C: bool | None | str = '$UNCHANGED$', Z: bool | None | str = '$UNCHANGED$') PULSE

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • C (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for C parameter in fit.

  • Z (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for Z parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, C: bool | None | str = '$UNCHANGED$') PULSE

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

C (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for C parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, context: bool | None | str = '$UNCHANGED$', offset: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') PULSE

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • context (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for context parameter in score.

  • offset (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for offset parameter in score.

  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class ivmodels.models.pulse.PULSEMixin(p_min=0.05, rtol=0.01, kappa_max=1, **kwargs)

Bases: object

Mixin class for PULSE estimators.

fit(X, y, Z=None, C=None, *args, **kwargs)

Fit a p-uncorrelated least squares estimator (PULSE) [Jakobsen and Peters, 2022].

If instrument_names or instrument_regex are specified, X must be a pandas DataFrame containing columns instrument_names and Z must be None. At least one of Z, instrument_names, and instrument_regex must be specified.

Parameters:
  • X (array-like, shape (n_samples, n_features)) – The training input samples. If instrument_names or instrument_regex are specified, X must be a pandas DataFrame containing columns instrument_names.

  • y (array-like, shape (n_samples,) or (n_samples, n_targets)) – The target values.

  • Z (array-like, shape (n_samples, n_anchors), optional) – The instrument (anchor) values. If instrument_names or instrument_regex are specified, Z must be None. If Z is specified, instrument_names and instrument_regex must be None.

ivmodels.models.space_iv module

class ivmodels.models.space_iv.SpaceIV(s_max=None, p_min=0.05)

Bases: object

Run the space IV algorithm from Pfister and Peters [2022].

Returns \(\arg\min \| \beta \|_0\) subject to \(\mathrm{AR}(\beta) \leq q_{1 - \alpha}\), where \(q_{1 - \alpha}\) is the \(1 - \alpha\) quantile of the F distribution with \(q\) and \(n-q\) degrees of freedom.

Parameters:
  • s_max (int, optional, default = None) – Maximum number of variables to consider. If None, set to X.shape[1].

  • p_min (float, optional, default = 0.05) – Significance level (\(\alpha\) above).
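
The search for the sparsest coefficient vector that passes the Anderson-Rubin test can be sketched as an exhaustive search over supports of increasing size. This is an illustration under stated assumptions, not the library's algorithm: the helper names are hypothetical, the statistic is taken to be \(\mathrm{AR}(\beta) = \frac{n - q}{q} \|P_Z (y - X\beta)\|_2^2 / \|M_Z (y - X\beta)\|_2^2\), the critical value is passed in rather than computed, and the noise is made exactly orthogonal to Z so that the outcome of the example is deterministic.

```python
from itertools import combinations

import numpy as np


def ar_statistic_min(X, y, Z):
    """Minimum over beta of the AR statistic for the regressors in X."""
    n, q = Z.shape
    W = np.column_stack([X, y])
    W_proj = Z @ np.linalg.lstsq(Z, W, rcond=None)[0]  # P_Z (X y)
    W_orth = W - W_proj                                # M_Z (X y)
    eigvals = np.linalg.eigvals(
        np.linalg.solve(W_orth.T @ W_orth, W_proj.T @ W_proj)
    )
    return (n - q) / q * float(eigvals.real.min())


def sparsest_support(X, y, Z, critical_value, s_max=None):
    """Smallest support whose minimal AR statistic is below critical_value."""
    s_max = X.shape[1] if s_max is None else s_max
    for s in range(1, s_max + 1):
        best = None
        for S in combinations(range(X.shape[1]), s):
            stat = ar_statistic_min(X[:, list(S)], y, Z)
            if best is None or stat < best[0]:
                best = (stat, S)
        if best[0] <= critical_value:
            return best[1]
    return None  # no support of size <= s_max passes the test


rng = np.random.default_rng(3)
n, q = 200, 3
Z = rng.normal(size=(n, q))
X = Z + 0.1 * rng.normal(size=(n, 3))
eps = rng.normal(size=n)
eps -= Z @ np.linalg.lstsq(Z, eps, rcond=None)[0]  # noise exactly orthogonal to Z
y = 2.0 * X[:, 1] + 0.1 * eps  # only the second regressor has an effect

# 2.65 is approximately the 0.95 quantile of the F distribution with (3, 197)
# degrees of freedom; it is hard-coded here to avoid a SciPy dependency.
support = sparsest_support(X, y, Z, critical_value=2.65)
print(support)
```

The exhaustive search visits every subset of each size, so its cost grows combinatorially with the number of regressors; s_max bounds that cost.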

coef_

Estimated coefficients for the linear regression problem.

Type:

array-like, shape (n_features,)

intercept_

Independent term in the linear model.

Type:

float

S_

Indices of the selected variables.

Type:

array-like, shape (s,)

s_

Number of selected variables.

Type:

int

kappa_

Equal to \(\hat\kappa_\mathrm{LIML}\) for the selected model.

Type:

float

References

[PP22]

Niklas Pfister and Jonas Peters. Identifiability of sparse causal effects using instrumental variables. In Uncertainty in Artificial Intelligence, 1613–1622. PMLR, 2022.

fit(X, y, Z=None)

Fit a SpaceIV model.

If instrument_names or instrument_regex are specified, X must be a pandas DataFrame containing columns instrument_names and Z must be None. At least one of Z, instrument_names, and instrument_regex must be specified.

Parameters:
  • X (array-like, shape (n_samples, n_features)) – The training input samples. If instrument_names or instrument_regex are specified, X must be a pandas DataFrame containing columns instrument_names.

  • y (array-like, shape (n_samples,)) – The target values.

  • Z (array-like, shape (n_samples, n_instruments), optional) – The instrument values. If instrument_names or instrument_regex are specified, Z must be None. If Z is specified, instrument_names and instrument_regex must be None.

Module contents

class ivmodels.models.AnchorRegression(gamma=1, instrument_names=None, instrument_regex=None, alpha=0, l1_ratio=0, fit_intercept=True)

Bases: AnchorMixin, GeneralizedLinearRegressor

Linear regression with anchor regularization [Rothenhäusler et al., 2021].

The anchor regression estimator with parameter \(\gamma\) is defined as

\[\hat\beta_\mathrm{anchor}(\gamma) := \arg\min_\beta \ \| y - X \beta \|_2^2 + (\gamma - 1) \|P_Z (y - X \beta) \|_2^2.\]

If \(\gamma > 0\), then \(\hat\beta_\mathrm{anchor}(\gamma) = \hat\beta_\mathrm{k-class}((\gamma - 1) / \gamma)\).

The optimization is based on OLS after a data transformation. The estimator first centers X and y by subtracting the column means, as proposed by Rothenhäusler et al. [2021]. Consequently, no anchor regularization is applied to the intercept.

Parameters:
  • gamma (float) – The anchor regularization parameter. gamma=1 corresponds to OLS.

  • instrument_names (str or list of str, optional) – The names of the columns in X that should be used as instruments (anchors). Requires X to be a pandas DataFrame. If both instrument_names and instrument_regex are specified, the union of the two is used.

  • instrument_regex (str, optional) – A regex that is used to select columns in X that should be used as instruments (anchors). Requires X to be a pandas DataFrame. If both instrument_names and instrument_regex are specified, the union of the two is used.

  • alpha (float, optional, default=0) – Regularization parameter for elastic net regularization.

  • l1_ratio (float, optional, default=0) – Ratio of L1 to L2 regularization for elastic net regularization. For l1_ratio=0 the penalty is an L2 penalty. For l1_ratio=1 it is an L1 penalty.

coef_

The estimated coefficients for the linear regression problem.

Type:

array-like, shape (n_features,)

intercept_

The estimated intercept for the linear regression problem.

Type:

float

kappa_

The kappa parameter of the corresponding k-class estimator.

Type:

float

References

[RMBP21]

Dominik Rothenhäusler, Nicolai Meinshausen, Peter Bühlmann, and Jonas Peters. Anchor regression: heterogeneous data meet causality. Journal of the Royal Statistical Society Series B: Statistical Methodology, 83(2):215–246, 2021.

set_fit_request(*, C: bool | None | str = '$UNCHANGED$', Z: bool | None | str = '$UNCHANGED$') → AnchorRegression

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • C (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for C parameter in fit.

  • Z (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for Z parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, C: bool | None | str = '$UNCHANGED$') → AnchorRegression

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

C (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for C parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, context: bool | None | str = '$UNCHANGED$', offset: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') → AnchorRegression

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • context (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for context parameter in score.

  • offset (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for offset parameter in score.

  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class ivmodels.models.KClass(kappa=1, instrument_names=None, instrument_regex=None, exogenous_names=None, exogenous_regex=None, alpha=0, l1_ratio=0, fit_intercept=True)

Bases: KClassMixin, GeneralizedLinearRegressor

K-class estimator for instrumental variable regression.

The k-class estimator with parameter \(\kappa\) is defined as

\[\begin{split}\hat\beta_\mathrm{k-class}(\kappa) &:= \arg\min_\beta \ (1 - \kappa) \| y - X \beta \|_2^2 + \kappa \|P_Z (y - X \beta) \|_2^2 \\ &= (X^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) X)^{-1} X^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) y,\end{split}\]

where \(P_Z = Z (Z^T Z)^{-1} Z^T\) is the projection matrix onto the subspace spanned by \(Z\) and \(\mathrm{Id}\) is the identity matrix. This includes the ordinary least-squares (OLS) estimator (\(\kappa = 0\)), the two-stage least-squares (2SLS) estimator (\(\kappa = 1\)), the limited information maximum likelihood (LIML) estimator (\(\kappa = \hat\kappa_\mathrm{LIML}\)), and the Fuller estimators (\(\kappa = \hat\kappa_\mathrm{LIML} - \alpha / (n - k)\)) as special cases.

Specifying exogenous included regressors \(C\) is equivalent to including them into both \(Z\) and \(X\).
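The special cases of the closed form can be illustrated with a small sketch. It assumes one endogenous regressor and one instrument, so that every matrix product reduces to an inner product; the function k_class below is a hypothetical helper written for this illustration, not the library's implementation.

```python
def dot(u, v):
    """Inner product of two equally long lists."""
    return sum(a * b for a, b in zip(u, v))


def k_class(x, y, z, kappa):
    """k-class estimate x^T W y / x^T W x with W = kappa P_Z + (1 - kappa) Id."""
    x_pz_y = dot(x, z) * dot(z, y) / dot(z, z)  # x^T P_Z y
    x_pz_x = dot(x, z) ** 2 / dot(z, z)         # x^T P_Z x
    num = (1 - kappa) * dot(x, y) + kappa * x_pz_y
    den = (1 - kappa) * dot(x, x) + kappa * x_pz_x
    return num / den


# Toy data (made up for illustration), already centered by construction of z.
z = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
x = [-2.4, -1.2, -0.9, 0.6, 1.1, 2.8]
y = [-1.6, -1.9, 0.3, -0.1, 2.0, 1.3]

beta_ols = dot(x, y) / dot(x, x)   # ordinary least squares
beta_iv = dot(z, y) / dot(z, x)    # 2SLS with a single instrument

beta_kappa_0 = k_class(x, y, z, kappa=0.0)   # coincides with OLS
beta_kappa_1 = k_class(x, y, z, kappa=1.0)   # coincides with 2SLS
beta_kappa_half = k_class(x, y, z, kappa=0.5)
```

In this scalar setting the estimate for \(\kappa \in (0, 1)\) is a weighted average of the OLS and 2SLS estimates, so it lies between the two.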

Parameters:
  • kappa (float or {"ols", "tsls", "2sls", "liml", "fuller", "fuller(a)"}) – The kappa parameter of the k-class estimator. If a string, it must be one of "ols", "2sls", "tsls", "liml", "fuller", or "fuller(a)", where a is numeric. If kappa="ols", then kappa=0 and the k-class estimator is the ordinary least-squares estimator. If kappa="tsls" or kappa="2sls", then kappa=1 and the k-class estimator is the two-stage least-squares estimator. If kappa="liml", then \(\kappa = \hat\kappa_\mathrm{LIML}\) is used, where \(\hat\kappa_\mathrm{LIML} \geq 1\) is the smallest eigenvalue of the matrix \(((X \ \ y)^T M_Z (X \ \ y))^{-1} (X \ \ y)^T (X \ \ y)\), where \(P_Z\) is the projection matrix onto the subspace spanned by \(Z\) and \(M_Z = \mathrm{Id} - P_Z\). If exogenous included regressors \(C\) are specified, then \(\hat\kappa_\mathrm{LIML}\) is the smallest eigenvalue of the matrix \(((X \ \ y)^T M_{[Z, C]} (X \ \ y))^{-1} (X \ \ y)^T M_C (X \ \ y)\). If kappa="fuller(a)", then \(\kappa = \hat\kappa_\mathrm{LIML} - a / (n - k - mc)\), where \(n\) is the number of observations, \(k = \mathrm{dim}(Z)\) is the number of instruments, and \(mc = \mathrm{dim}(C)\) is the number of exogenous included regressors. The string "fuller" is interpreted as "fuller(1.0)", yielding an estimator that is unbiased up to \(O(1/n)\) [Fuller, 1977].

  • instrument_names (str or list of str, optional) – The names of the columns in X that should be used as instruments. Requires X argument of fit method to be a pandas DataFrame. If both instrument_names and instrument_regex are specified, the union of the two is used.

  • instrument_regex (str, optional) – A regex that is used to select columns in X that should be used as instruments. Requires X argument of fit method to be a pandas DataFrame. If both instrument_names and instrument_regex are specified, the union of the two is used.

  • exogenous_names (str or list of str, optional) – The names of the columns in X that should be used as exogenous regressors. Requires X argument of fit method to be a pandas DataFrame. If both exogenous_names and exogenous_regex are specified, the union of the two is used.

  • exogenous_regex (str, optional) – A regex that is used to select columns in X that should be used as exogenous regressors. Requires X argument of fit method to be a pandas DataFrame. If both exogenous_names and exogenous_regex are specified, the union of the two is used.

  • alpha (float, optional, default=0) – Regularization parameter for elastic net regularization. Only implemented for \(\kappa \leq 1\).

  • l1_ratio (float, optional, default=0) – Ratio of L1 to L2 regularization for elastic net regularization. For l1_ratio=0 the penalty is an L2 penalty. For l1_ratio=1 it is an L1 penalty. Only implemented for \(\kappa \leq 1\).
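The eigenvalue definition of \(\hat\kappa_\mathrm{LIML}\) and the Fuller correction can be sketched in plain Python. The sketch assumes one endogenous regressor, two instruments, and no exogenous included regressors (so \(mc = 0\)), which keeps all matrices 2x2; the helpers residual and kappa_liml are hypothetical, and the library may compute these quantities differently.

```python
import math


def dot(u, v):
    """Inner product of two equally long lists."""
    return sum(a * b for a, b in zip(u, v))


def residual(v, z1, z2):
    """Return M_Z v, the residual of v after projecting onto span{z1, z2}."""
    # Solve the 2x2 normal equations (Z^T Z) c = Z^T v by Cramer's rule.
    a11, a12, a22 = dot(z1, z1), dot(z1, z2), dot(z2, z2)
    b1, b2 = dot(z1, v), dot(z2, v)
    det = a11 * a22 - a12 * a12
    c1 = (a22 * b1 - a12 * b2) / det
    c2 = (a11 * b2 - a12 * b1) / det
    return [vi - c1 * p - c2 * q for vi, p, q in zip(v, z1, z2)]


def kappa_liml(x, y, z1, z2):
    """Smallest eigenvalue of ((x y)^T M_Z (x y))^{-1} (x y)^T (x y)."""
    mx, my = residual(x, z1, z2), residual(y, z1, z2)
    # A = (x y)^T M_Z (x y) and B = (x y)^T (x y), both symmetric 2x2.
    a11, a12, a22 = dot(mx, mx), dot(mx, my), dot(my, my)
    b11, b12, b22 = dot(x, x), dot(x, y), dot(y, y)
    det_a = a11 * a22 - a12 * a12
    # Entries of the 2x2 matrix A^{-1} B.
    m11 = (a22 * b11 - a12 * b12) / det_a
    m12 = (a22 * b12 - a12 * b22) / det_a
    m21 = (a11 * b12 - a12 * b11) / det_a
    m22 = (a11 * b22 - a12 * b12) / det_a
    trace, det = m11 + m22, m11 * m22 - m12 * m21
    # Smaller root of the characteristic polynomial.
    return (trace - math.sqrt(max(trace * trace - 4 * det, 0.0))) / 2


# Toy data (made up for illustration).
z1 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
z2 = [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0]
x = [1.2, -0.3, 0.4, -1.5, 0.9, 0.2, -0.6, -0.3]
y = [2.0, -0.1, 0.3, -2.2, 1.1, 0.8, -1.4, -0.5]

kappa_hat = kappa_liml(x, y, z1, z2)

# Fuller correction with a = 1, as in kappa="fuller".
n, k, mc, a = len(y), 2, 0, 1.0
kappa_fuller = kappa_hat - a / (n - k - mc)
```

Because \((X \ \ y)^T (X \ \ y) - (X \ \ y)^T M_Z (X \ \ y) = (X \ \ y)^T P_Z (X \ \ y)\) is positive semidefinite, the smallest eigenvalue is at least 1, matching the statement above.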

coef_

The estimated coefficients for the linear regression problem.

Type:

array-like, shape (n_features,)

intercept_

The estimated intercept for the linear regression problem.

Type:

float

kappa_

The numerical kappa parameter of the k-class estimator.

Type:

float

fuller_alpha_

If kappa is one of {"fuller", "fuller(a)", "liml"} for some numeric value a, the alpha parameter of the Fuller estimator.

Type:

float

ar_min_

If kappa is one of {"fuller", "fuller(a)", "liml"} for some numeric value a, the minimum of the unnormalized Anderson-Rubin statistic.

Type:

float

kappa_liml_

If kappa is one of {"fuller", "fuller(a)", "liml"} for some numeric value a, the kappa parameter of the LIML estimator, equal to 1 + ar_min_.

Type:

float

named_coef_

If X was a pandas DataFrame, the estimated coefficients for the linear regression problem with the variable names as index.

Type:

array-like, shape (n_features,)

References

[Ful77]

Wayne A Fuller. Some properties of a modification of the limited information estimator. Econometrica: Journal of the Econometric Society, pages 939–953, 1977.

set_fit_request(*, C: bool | None | str = '$UNCHANGED$', Z: bool | None | str = '$UNCHANGED$') → KClass

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • C (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for C parameter in fit.

  • Z (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for Z parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, C: bool | None | str = '$UNCHANGED$') → KClass

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

C (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for C parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, context: bool | None | str = '$UNCHANGED$', offset: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') → KClass

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • context (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for context parameter in score.

  • offset (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for offset parameter in score.

  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class ivmodels.models.PULSE(instrument_names=None, instrument_regex=None, p_min=0.05, rtol=0.01, kappa_max=1, alpha=0, l1_ratio=0)

Bases: PULSEMixin, KClass

p-uncorrelated least squares estimator (PULSE) [Jakobsen and Peters, 2022].

Perform k-class estimation with k-class parameter \(\kappa \in [0, \kappa_\mathrm{max}]\) chosen minimally such that the PULSE test of correlation between the instruments and the residuals is not significant at level p_min.

Parameters:
  • instrument_names (str or list of str, optional) – The names of the columns in X that should be used as instruments. Requires X to be a pandas DataFrame.

  • instrument_regex (str, optional) – A regex that is used to select columns in X that should be used as instruments. Requires X to be a pandas DataFrame. If both instrument_names and instrument_regex are specified, the union of the two is used.

  • p_min (float, optional, default = 0.05) – The p-value of the PULSE test that is used to determine the k-class parameter \(\kappa\). The PULSE will search for the smallest \(\kappa\) that makes the test not significant at level p_min with binary search.

  • rtol (float, optional, default = 0.01) – The relative tolerance of the binary search. The PULSE will use binary search to find a \(\kappa\) such that the PULSE test is not significant at level p_min but is significant at level p_min * (1 + rtol).

  • kappa_max (float, optional, default = 1) – The maximum value of kappa to consider. The PULSE will search for the smallest kappa that makes the test not significant at level p_min with binary search. If kappa_max = 1, the PULSE will run a regression equivalent to two-stage least-squares. If alpha = 0 and Z.shape[1] < X.shape[1], this is not well-defined and the PULSE will raise an exception.

  • alpha (float, optional, default = 0) – The regularization parameter for elastic net. If alpha is 0, the estimator is unregularized.

  • l1_ratio (float, optional, default = 0) – The ratio of L1 to L2 regularization for elastic net. If l1_ratio is 1, the estimator is Lasso. If l1_ratio is 0, the estimator is Ridge.
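The interplay of p_min, rtol, and kappa_max in the binary search can be sketched as follows. This is an illustration only: the function smallest_kappa and the callback p_value are hypothetical, the p-value is assumed to be continuous and non-decreasing in \(\kappa\), and the error raised when no admissible \(\kappa\) exists is an assumption rather than documented library behavior.

```python
def smallest_kappa(p_value, p_min=0.05, rtol=0.01, kappa_max=1.0, max_iter=100):
    """Bisect for a kappa with p_min <= p_value(kappa) <= p_min * (1 + rtol).

    p_value is a hypothetical callback returning the p-value of the test
    for a given kappa; it is assumed to be non-decreasing in kappa.
    """
    if p_value(0.0) >= p_min:
        return 0.0  # kappa = 0 (OLS) is already not significant
    if p_value(kappa_max) < p_min:
        # Assumption for this sketch: signal failure with an exception.
        raise ValueError("test is significant even at kappa_max")

    # Invariant: p_value(low) < p_min <= p_value(high).
    low, high = 0.0, kappa_max
    for _ in range(max_iter):
        if p_value(high) <= p_min * (1 + rtol):
            break  # high is within the relative tolerance
        mid = (low + high) / 2
        if p_value(mid) >= p_min:
            high = mid
        else:
            low = mid
    return high


# Toy p-value curve (made up for illustration): p(kappa) = kappa ** 2.
kappa = smallest_kappa(lambda k: k ** 2, p_min=0.05, rtol=0.01)

# If the test is never significant, the search returns kappa = 0.
kappa_trivial = smallest_kappa(lambda k: 0.5)
```

Returning the upper end of the bracket guarantees that the test is not significant at level p_min for the returned value, while rtol bounds how far above p_min its p-value can be.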

coef_

The estimated coefficients.

Type:

array-like, shape (n_features,)

intercept_

The estimated intercept.

Type:

float

kappa_

The estimated kappa.

Type:

float

References

[JP22]

Martin Emil Jakobsen and Jonas Peters. Distributional robustness of k-class estimators and the PULSE. The Econometrics Journal, 25(2):404–432, 2022.

set_fit_request(*, C: bool | None | str = '$UNCHANGED$', Z: bool | None | str = '$UNCHANGED$') → PULSE

Configure whether metadata should be requested to be passed to the fit method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • C (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for C parameter in fit.

  • Z (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for Z parameter in fit.

Returns:

self – The updated object.

Return type:

object

set_predict_request(*, C: bool | None | str = '$UNCHANGED$') → PULSE

Configure whether metadata should be requested to be passed to the predict method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:

C (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for C parameter in predict.

Returns:

self – The updated object.

Return type:

object

set_score_request(*, context: bool | None | str = '$UNCHANGED$', offset: bool | None | str = '$UNCHANGED$', sample_weight: bool | None | str = '$UNCHANGED$') → PULSE

Configure whether metadata should be requested to be passed to the score method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Parameters:
  • context (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for context parameter in score.

  • offset (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for offset parameter in score.

  • sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class ivmodels.models.SpaceIV(s_max=None, p_min=0.05)

Bases: object

Run the space IV algorithm from Pfister and Peters [2022].

Returns \(\arg\min \| \beta \|_0\) subject to \(\mathrm{AR}(\beta) \leq q_{1 - \alpha}\), where \(q_{1 - \alpha}\) is the \(1 - \alpha\) quantile of the F distribution with \(q\) and \(n-q\) degrees of freedom.

Parameters:
  • s_max (int, optional, default = None) – Maximum number of variables to consider. If None, set to X.shape[1].

  • p_min (float, optional, default = 0.05) – Significance level (\(\alpha\) above).
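The \(\ell_0\)-constrained problem above can be read as a search over supports of increasing size. The sketch below illustrates that reading with an exhaustive search; the function sparsest_subset, the callback min_ar_statistic (returning the smallest Anderson-Rubin statistic attainable with a given support), and the argument critical_value (standing in for \(q_{1 - \alpha}\)) are all hypothetical, and the library's actual search strategy and tie-breaking may differ.

```python
from itertools import combinations


def sparsest_subset(n_features, min_ar_statistic, critical_value, s_max=None):
    """Return (subset, statistic) for the smallest accepted support, or None.

    min_ar_statistic is a hypothetical callback mapping a tuple of feature
    indices to the smallest test statistic attainable with that support.
    """
    if s_max is None:
        s_max = n_features  # mirrors the default described for s_max
    for s in range(1, s_max + 1):
        best = None
        for subset in combinations(range(n_features), s):
            value = min_ar_statistic(subset)
            if best is None or value < best[1]:
                best = (subset, value)
        # Accept the first size at which the best support passes the test.
        if best[1] <= critical_value:
            return best
    return None


def toy_statistic(subset):
    """Toy statistic (made up): small only if features 0 and 2 are included."""
    return 0.5 if {0, 2} <= set(subset) else 10.0


result = sparsest_subset(4, toy_statistic, critical_value=3.0)
too_sparse = sparsest_subset(4, toy_statistic, critical_value=3.0, s_max=1)
```

Here result is the support (0, 2) with statistic 0.5, since no single feature passes the threshold, and too_sparse is None because s_max=1 rules out every accepted support.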

coef_

Estimated coefficients for the linear regression problem.

Type:

array-like, shape (n_features,)

intercept_

Independent term in the linear model.

Type:

float

S_

Indices of the selected variables.

Type:

array-like, shape (s,)

s_

Number of selected variables.

Type:

int

kappa_

Equal to \(\hat\kappa_\mathrm{LIML}\) for the selected model.

Type:

float

References

[PP22]

Niklas Pfister and Jonas Peters. Identifiability of sparse causal effects using instrumental variables. In Uncertainty in Artificial Intelligence, 1613–1622. PMLR, 2022.

fit(X, y, Z=None)

Fit a SpaceIV model.

If instrument_names or instrument_regex are specified, X must be a pandas DataFrame containing columns instrument_names and Z must be None. At least one of Z, instrument_names, and instrument_regex must be specified.

Parameters:
  • X (array-like, shape (n_samples, n_features)) – The training input samples. If instrument_names or instrument_regex are specified, X must be a pandas DataFrame containing columns instrument_names.

  • y (array-like, shape (n_samples,)) – The target values.

  • Z (array-like, shape (n_samples, n_instruments), optional) – The instrument values. If instrument_names or instrument_regex are specified, Z must be None. If Z is specified, instrument_names and instrument_regex must be None.