ivmodels.tests package

Submodules

ivmodels.tests.anderson_rubin module

ivmodels.tests.anderson_rubin.anderson_rubin_test(Z, X, y, beta, W=None, C=None, D=None, critical_values='chi2', fit_intercept=True)

Perform the Anderson Rubin test [Anderson and Rubin, 1949].

Test the null hypothesis that the residuals are uncorrelated with the instruments. If W is None, the test statistic is defined as

\[\mathrm{AR}(\beta) := \frac{n - k}{k} \frac{\| P_Z (y - X \beta) \|_2^2}{\| M_Z (y - X \beta) \|_2^2},\]

where \(P_Z\) is the projection matrix onto the column space of \(Z\) and \(M_Z = \mathrm{Id} - P_Z\).

Under the null and normally distributed errors, this test statistic is distributed as \(F_{k, n - k}\), where \(k\) is the number of instruments and \(n\) is the number of observations. The statistic is asymptotically distributed as \(\chi^2(k) / k\) under the null and non-normally distributed errors, even for weak instruments.
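For the case W is None, the formula above can be sketched directly with NumPy and SciPy. This is an illustration under assumptions, not the library's implementation: the helper name `ar_statistic` and the simulated data are hypothetical, the inputs are taken to be already centered, and C and D are ignored.

```python
import numpy as np
from scipy import stats


def ar_statistic(Z, X, y, beta):
    """AR(beta) = (n - k) / k * ||P_Z r||^2 / ||M_Z r||^2 with r = y - X beta."""
    n, k = Z.shape
    residuals = y - X @ beta
    # P_Z r is the least-squares fit of r on Z; M_Z r is what remains.
    fitted = Z @ np.linalg.lstsq(Z, residuals, rcond=None)[0]
    remainder = residuals - fitted
    return (n - k) / k * (fitted @ fitted) / (remainder @ remainder)


# Hypothetical simulated data with true coefficient 2.
rng = np.random.default_rng(0)
n, k = 200, 3
Z = rng.normal(size=(n, k))
X = Z @ np.array([[1.0], [0.5], [-0.5]]) + rng.normal(size=(n, 1))
y = X @ np.array([2.0]) + rng.normal(size=n)

statistic = ar_statistic(Z, X, y, beta=np.array([2.0]))
p_value_f = stats.f.sf(statistic, dfn=k, dfd=n - k)  # F_{k, n - k} reference
p_value_chi2 = stats.chi2.sf(k * statistic, df=k)    # chi^2(k) / k reference
```

The two p-values differ only in the reference distribution: the F variant is exact under normal errors, the \(\chi^2\) variant is the asymptotic approximation.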

If W is not None, the test statistic is

\[\begin{split}\mathrm{AR}(\beta) &:= \min_\gamma \frac{n - k}{k - m_W} \frac{\| P_Z (y - X \beta - W \gamma) \|_2^2}{\| M_Z (y - X \beta - W \gamma) \|_2^2} \\ &= \frac{n - k}{k - m_W} \frac{\| P_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|_2^2}{\| M_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|_2^2},\end{split}\]

where \(\hat\gamma_\mathrm{LIML}\) is the LIML estimate using instruments \(Z\), covariates \(W\) and outcomes \(y - X \beta\). Under the null, this test statistic is asymptotically bounded from above by a random variable that is distributed as \(\frac{1}{k - m_W} \chi^2(k - m_W)\), where \(m_W = \mathrm{dim}(W)\). See [Guggenberger et al., 2012].
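The minimum over \(\gamma\) does not require numerical optimization. A standard identity, used here as an assumption rather than as a description of the library's code path, is that the minimal ratio equals the smallest eigenvalue of \(((W \ \tilde y)^T M_Z (W \ \tilde y))^{-1} (W \ \tilde y)^T P_Z (W \ \tilde y)\) with \(\tilde y = y - X \beta\). The sketch below uses hypothetical names and simulated, already centered data.

```python
import numpy as np


def subvector_ar_statistic(Z, X, W, y, beta):
    """min over gamma of the AR ratio, computed as a smallest eigenvalue."""
    n, k = Z.shape
    mw = W.shape[1]
    S = np.column_stack([W, y - X @ beta])               # (W, y - X beta)
    S_fitted = Z @ np.linalg.lstsq(Z, S, rcond=None)[0]  # P_Z S
    S_remainder = S - S_fitted                           # M_Z S
    # Eigenvalues of (S^T M_Z S)^{-1} S^T P_Z S are real and non-negative in theory.
    eigenvalues = np.linalg.eigvals(
        np.linalg.solve(S_remainder.T @ S_remainder, S_fitted.T @ S_fitted)
    ).real
    return (n - k) / (k - mw) * eigenvalues.min()


# Hypothetical simulated data: one regressor of interest, one nuisance regressor.
rng = np.random.default_rng(0)
n, k = 200, 4
Z = rng.normal(size=(n, k))
X = Z @ rng.normal(size=(k, 1)) + rng.normal(size=(n, 1))
W = Z @ rng.normal(size=(k, 1)) + rng.normal(size=(n, 1))
y = X @ np.array([1.0]) + W @ np.array([-1.0]) + rng.normal(size=n)

beta0 = np.array([1.0])
statistic = subvector_ar_statistic(Z, X, W, y, beta0)
```

Evaluating the ratio at any fixed \(\gamma\) gives a value at least as large as `statistic`, which is a quick way to sanity-check the eigenvalue shortcut.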

Parameters:

Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
beta (np.ndarray of dimension (mx + md,)) – Coefficients to test.
W (np.ndarray of dimension (n, mw) or None, optional, default = None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default = None) – Exogenous regressors of interest.
critical_values (str, optional, default = "chi2") – If "chi2", use the \(\chi^2(k - m_W)\) distribution to compute the p-value. If "f", use the \(F_{k - m_W, n - k}\) distribution to compute the p-value. If "guggenberger" or "GKM", use the critical value function proposed by Guggenberger et al. [2019] to compute the p-value.
fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.

Returns:

statistic (float) – The test statistic \(\mathrm{AR}(\beta)\).
p_value (float) – The p-value of the test.

Raises:

ValueError: – If the dimensions of the inputs are incorrect.

References

[AR49]

Theodore W Anderson and Herman Rubin. Estimation of the parameters of a single equation in a complete system of stochastic equations. The Annals of mathematical statistics, 20(1):46–63, 1949.

[GKM19]

Patrik Guggenberger, Frank Kleibergen, and Sophocles Mavroeidis. A more powerful subvector Anderson Rubin test in linear instrumental variables regression. Quantitative Economics, 10(2):487–526, 2019.

[GKMC12]

Patrik Guggenberger, Frank Kleibergen, Sophocles Mavroeidis, and Linchun Chen. On the asymptotic sizes of subset Anderson–Rubin and Lagrange multiplier tests in linear instrumental variables regression. Econometrica, 80(6):2649–2666, 2012.

ivmodels.tests.anderson_rubin.inverse_anderson_rubin_test(Z, X, y, alpha=0.05, W=None, C=None, D=None, critical_values='chi2', fit_intercept=True, tol=1e-06, max_eval=100)

Return the quadric for the inverse Anderson-Rubin test’s acceptance region.

The returned quadric satisfies quadric(x) <= 0 if and only if anderson_rubin_test(Z, X, y, beta=x, W=W)[1] > alpha. It is thus a confidence region for the causal parameter corresponding to the endogenous regressors of interest X.

If W is None, let \(q := \frac{k}{n-k}F_{F(k, n-k)}(1 - \alpha)\), where \(F_{F(k, n-k)}(1 - \alpha)\) denotes the \(1 - \alpha\) quantile of the \(F(k, n-k)\) distribution. The quadric is defined as

\[\begin{split}\mathrm{AR}(\beta) = \frac{n - k}{k} \frac{\| P_Z (y - X \beta) \|_2^2}{\| M_Z (y - X \beta) \|_2^2} \leq F_{F(k, n-k)}(1 - \alpha) \\ \Leftrightarrow \beta^T X^T (P_Z - q M_Z) X \beta - 2 y^T (P_Z - q M_Z) X \beta + y^T (P_Z - q M_Z) y \leq 0.\end{split}\]
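The equivalence above can be checked numerically. The sketch below builds the quadric coefficients from the displayed formula and compares the sign of the quadric with the acceptance decision of the AR statistic on a grid. The function names and simulated data are hypothetical, the inputs are assumed centered, and the F critical value from the formula is used; the actual `Quadric` class of the library is not used here.

```python
import numpy as np
from scipy import stats


def ar_quadric_coefficients(Z, X, y, alpha=0.05):
    """Return (A, b, c) such that quadric(beta) = beta^T A beta + b^T beta + c."""
    n, k = Z.shape
    q = k / (n - k) * stats.f.ppf(1 - alpha, dfn=k, dfd=n - k)
    X_fit = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]  # P_Z X
    y_fit = Z @ np.linalg.lstsq(Z, y, rcond=None)[0]  # P_Z y
    X_rem, y_rem = X - X_fit, y - y_fit               # M_Z X, M_Z y
    A = X_fit.T @ X_fit - q * X_rem.T @ X_rem         # X^T (P_Z - q M_Z) X
    b = -2 * (X_fit.T @ y_fit - q * X_rem.T @ y_rem)  # -2 X^T (P_Z - q M_Z) y
    c = y_fit @ y_fit - q * y_rem @ y_rem             # y^T (P_Z - q M_Z) y
    return A, b, c


def ar_statistic(Z, X, y, beta):
    n, k = Z.shape
    r = y - X @ beta
    fit = Z @ np.linalg.lstsq(Z, r, rcond=None)[0]
    return (n - k) / k * (fit @ fit) / ((r - fit) @ (r - fit))


# Hypothetical simulated data with a single regressor.
rng = np.random.default_rng(0)
n, k = 150, 3
Z = rng.normal(size=(n, k))
X = Z @ np.array([[1.0], [1.0], [0.5]]) + rng.normal(size=(n, 1))
y = X @ np.array([2.0]) + rng.normal(size=n)

A, b, c = ar_quadric_coefficients(Z, X, y, alpha=0.05)
critical_value = stats.f.ppf(0.95, dfn=k, dfd=n - k)
betas = np.linspace(0.0, 4.0, 41)
inside_quadric = np.array([A[0, 0] * t**2 + b[0] * t + c <= 0 for t in betas])
accepted = np.array([ar_statistic(Z, X, y, np.array([t])) <= critical_value for t in betas])
```

With a single regressor, the acceptance region is the set where a scalar quadratic is non-positive, so its boundaries are the roots of that quadratic whenever `A[0, 0] > 0`.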

If W is not None, let \(q := \frac{k - m_W}{n-k}F_{F(k - m_W, n-k)}(1 - \alpha)\). The quadric is defined as

\[\mathrm{AR}(\beta) = \min_\gamma \frac{n - k}{k - m_W} \frac{\| P_Z (y - X \beta - W \gamma) \|_2^2}{\| M_Z (y - X \beta - W \gamma) \|_2^2} \leq F_{F(k - m_W, n-k)}(1 - \alpha).\]

Parameters:

Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
alpha (float) – Significance level.
W (np.ndarray of dimension (n, mw) or None, optional, default = None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default = None) – Exogenous regressors of interest.
critical_values (str, optional, default = "chi2") – If "chi2", use the \(\chi^2(k - m_W)\) distribution to compute the p-value. If "f", use the \(F_{k - m_W, n - k}\) distribution to compute the p-value. If "gkm", use the critical value function proposed by Guggenberger et al. [2019].
fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.
tol (float, optional, default = 1e-6) – Tolerance for the root finding algorithm when critical_values is "gkm".
max_eval (int, optional, default = 100) – Maximum number of evaluations for the root finding algorithm when critical_values is "gkm".

Returns:

The quadric for the acceptance region.

Return type:

Quadric

ivmodels.tests.anderson_rubin.more_powerful_subvector_anderson_rubin_critical_value_function(z, kappa_1_hat, k, mw)

Implement the critical value function proposed by Guggenberger et al. [2019].

This is done by numerically integrating the approximate conditional density from Equation (A.21) of [Guggenberger et al., 2019].

Parameters:

z (float) – The test statistic.
kappa_1_hat (float) – The maximum eigenvalue of the matrix \(M:=((W \ y - X \beta)^T M_Z (W \ y - X \beta))^{-1} (W \ y-X \beta)^T P_Z (W \ y-X \beta)\). This is the conditioning statistic.
k (int) – Number of instruments.
mw (int) – Number of endogenous regressors not of interest, i.e., \(\mathrm{dim}(W)\).

Returns:

p_value – The (approximate) p-value of observing \(\lambda_\mathrm{min}(M) > z\) given \(\lambda_\mathrm{max}(M) = \hat\kappa_1\) under the null \(\beta = \beta_0\).

Return type:

float

References

[GKM19]

Patrik Guggenberger, Frank Kleibergen, and Sophocles Mavroeidis. A more powerful subvector Anderson Rubin test in linear instrumental variables regression. Quantitative Economics, 10(2):487–526, 2019.

ivmodels.tests.conditional_likelihood_ratio module

ivmodels.tests.conditional_likelihood_ratio.conditional_likelihood_ratio_critical_value_function(k, mx, md, lambdas, z, critical_values='londschien2025exact', tol=1e-06, num_samples=100000)

Approximate the critical value function of the conditional likelihood ratio test.

If critical_values is "londschien2025exact", computes the exact distribution from Theorem 1 of [Londschien, 2025] using Monte Carlo simulation:

\[\mathrm{CLR}(\beta_0) \overset{d}{\to} \sum_{i=0}^m q_i - \mu_\mathrm{min},\]

where \(q_0 \sim \chi^2(k-m)\), \(q_1, \ldots, q_m \sim \chi^2(1)\) are independent, and \(\mu_\mathrm{min}\) is the smallest root of the polynomial

\[p(\mu) := \left(\mu - \sum_{i=0}^m q_i\right) \prod_{i=1}^m (\mu - \lambda_i) - \sum_{i=1}^m \lambda_i q_i \prod_{j \geq 1, j \neq i} (\mu - \lambda_j).\]

This is computed by Monte Carlo simulation, generating samples of the \(q_i\) and solving for \(\mu_\mathrm{min}\) using Newton’s method.
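A minimal sketch of this simulation follows. It departs from the description in one respect that should be read as an assumption: the smallest root is found with `np.roots` on the expanded polynomial rather than with Newton's method. The function names are hypothetical, and \(k > m\) is required so that \(\chi^2(k-m)\) is well defined.

```python
import numpy as np


def smallest_root(q, lambdas):
    """Smallest root of p(mu) for one draw q = (q_0, q_1, ..., q_m)."""
    lambdas = np.asarray(lambdas, dtype=float)
    # (mu - sum_i q_i) * prod_i (mu - lambda_i)
    coefficients = np.polymul([1.0, -q.sum()], np.poly(lambdas))
    for i, lam in enumerate(lambdas):
        # lambda_i * q_i * prod_{j != i} (mu - lambda_j)
        others = np.atleast_1d(np.poly(np.delete(lambdas, i)))
        coefficients = np.polysub(coefficients, lam * q[i + 1] * others)
    return np.roots(coefficients).real.min()


def clr_p_value_monte_carlo(z, k, lambdas, num_samples=2_000, seed=0):
    """Estimate P[sum_i q_i - mu_min > z] by simulation."""
    rng = np.random.default_rng(seed)
    m = len(lambdas)
    exceed = 0
    for _ in range(num_samples):
        q = np.concatenate([[rng.chisquare(k - m)], rng.chisquare(1, size=m)])
        exceed += int(q.sum() - smallest_root(q, lambdas) > z)
    return exceed / num_samples


p_value = clr_p_value_monte_carlo(z=4.0, k=5, lambdas=[1.0, 8.0])
```

For \(m = 1\) the polynomial is quadratic, and \(\sum_i q_i - \mu_\mathrm{min}\) reduces to a closed-form expression in \(q_0\), \(q_1\) and \(\lambda_1\), which makes a convenient correctness check for `smallest_root`.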

If critical_values is "kleibergen2007generalizing" or "moreira2003conditional", uses the upper bound from Corollary 2 of [Londschien, 2025]:

\[\mathrm{CLR}(\beta_0) \leq \Gamma(k-m, m, \lambda_1),\]

where

\[\Gamma(k-m, m, \lambda_1) := \frac{1}{2} \left( Q_{k-m} + Q_m - \lambda_1 + \sqrt{ (Q_{k-m} + Q_m + \lambda_1)^2 - 4 Q_{k-m} \lambda_1 } \right),\]

with \(Q_{k-m} \sim \chi^2(k-m)\), \(Q_m \sim \chi^2(m)\) independent, and \(\lambda_1\) the smallest eigenvalue. This is computed by numerical integration using the formulation

\[\mathbb{P}[\Gamma(k-m, m, \lambda_1) \leq z] = \mathbb{E}_{B \sim \mathrm{Beta}((k-m)/2, m/2)}[F_{\chi^2(k)}(z/(1-aB))],\]

where \(F_{\chi^2(k)}\) is the CDF of \(\chi^2(k)\) and \(a = \lambda_1/(z + \lambda_1)\). The p-value is one minus this probability.
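The Beta expectation can be checked against a direct simulation of \(\Gamma(k-m, m, \lambda_1)\). Since \(F_{\chi^2(k)}\) is a CDF, the expectation estimates the probability of \(\Gamma \leq z\), and the p-value is one minus it. The sketch below uses Monte Carlo averages in place of numerical integration; the function names are hypothetical and \(z > 0\) is assumed.

```python
import numpy as np
from scipy import stats


def gamma_cdf_via_beta(z, k, m, lambda_1, num_samples=200_000, seed=0):
    """Estimate E_B[F_{chi2(k)}(z / (1 - a B))] with B ~ Beta((k - m) / 2, m / 2)."""
    rng = np.random.default_rng(seed)
    a = lambda_1 / (z + lambda_1)
    B = rng.beta((k - m) / 2, m / 2, size=num_samples)
    return stats.chi2.cdf(z / (1 - a * B), df=k).mean()


def gamma_cdf_direct(z, k, m, lambda_1, num_samples=200_000, seed=1):
    """Estimate P[Gamma(k - m, m, lambda_1) <= z] by sampling Gamma itself."""
    rng = np.random.default_rng(seed)
    Q_km = rng.chisquare(k - m, size=num_samples)
    Q_m = rng.chisquare(m, size=num_samples)
    total = Q_km + Q_m
    gamma = 0.5 * (
        total - lambda_1 + np.sqrt((total + lambda_1) ** 2 - 4 * Q_km * lambda_1)
    )
    return (gamma <= z).mean()


cdf_beta = gamma_cdf_via_beta(z=4.0, k=5, m=2, lambda_1=3.0)
cdf_direct = gamma_cdf_direct(z=4.0, k=5, m=2, lambda_1=3.0)
p_value_bound = 1.0 - cdf_beta
```

The two estimates should agree up to Monte Carlo error. The Beta formulation is attractive because it reduces the problem to a one-dimensional integral of a smooth function.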

Parameters:

k (int) – Number of instruments.
mx (int) – Number of endogenous variables.
md (int) – Number of included exogenous variables.
lambdas (array_like) – The finite eigenvalues of the concentration matrix. If md > 0, the md infinite eigenvalues corresponding to the exogenous regressors of interest are not included.
z (float) – Test statistic.
critical_values ({"londschien2025exact", "kleibergen2007generalizing", "moreira2003conditional"}, default="londschien2025exact") – Which critical values to use. If "londschien2025exact", uses the exact distribution conditional on all eigenvalues via Monte Carlo simulation. If "kleibergen2007generalizing" or "moreira2003conditional", uses the upper bound conditional on the smallest eigenvalue via numerical integration.
tol (float, default=1e-6) – Tolerance for the approximation of the CDF and thus the p-value.
num_samples (int, default=100_000) – Number of Monte Carlo samples when using "londschien2025exact".

Returns:

The p-value \(\mathbb{P}[\mathrm{CLR} > z]\).

Return type:

float

References

[Hil09a]

Grant Hillier. Exact properties of the conditional likelihood ratio test in an IV regression model. Econometric Theory, 25(4):915–957, 2009.

[Hil09b]

Grant Hillier. On the conditional likelihood ratio test for several parameters in IV regression. Econometric Theory, 25(2):305–335, 2009.

[Lon25]

Malte Londschien. The exact distribution of the conditional likelihood-ratio test in instrumental variables regression. arXiv preprint, 2025.

ivmodels.tests.conditional_likelihood_ratio.conditional_likelihood_ratio_test(Z, X, y, beta, W=None, C=None, D=None, fit_intercept=True, critical_values='londschien2025exact', tol=1e-06, num_samples=100000)

Perform the conditional likelihood ratio test for beta.

If W is None, the test statistic is defined as

\[\begin{split}\mathrm{CLR}(\beta) &:= (n - k) \frac{ \| P_Z (y - X \beta) \|_2^2}{ \| M_Z (y - X \beta) \|_2^2} - (n - k) \frac{ \| P_Z (y - X \hat\beta_\mathrm{LIML}) \|_2^2 }{ \| M_Z (y - X \hat\beta_\mathrm{LIML}) \|_2^2 } \\ &= k \cdot \mathrm{AR}(\beta) - k \cdot \min_\beta \mathrm{AR}(\beta),\end{split}\]

where \(P_Z\) is the projection matrix onto the column space of \(Z\), \(M_Z = \mathrm{Id} - P_Z\), and \(\hat\beta_\mathrm{LIML}\) is the LIML estimator of \(\beta\) (see KClass), minimizing the Anderson-Rubin test statistic \(\mathrm{AR}(\beta)\) (see anderson_rubin_test()) at

\[\mathrm{AR}(\hat\beta_\mathrm{LIML}) = \frac{n - k}{k} \lambda_\mathrm{min}\left( \left(\begin{pmatrix} X & y \end{pmatrix}^T M_Z \begin{pmatrix} X & y \end{pmatrix}\right)^{-1} \begin{pmatrix} X & y \end{pmatrix}^T P_Z \begin{pmatrix} X & y \end{pmatrix} \right).\]

Let

\[\tilde X(\beta) := X - (y - X \beta) \cdot \frac{(y - X \beta)^T M_Z X}{(y - X \beta)^T M_Z (y - X \beta)}\]

and let \(\lambda_1, \ldots, \lambda_m\) be the eigenvalues of

\[(n - k) \cdot \left[\tilde X(\beta)^T M_Z \tilde X(\beta)\right]^{-1} \tilde X(\beta)^T P_Z \tilde X(\beta).\]
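The construction of \(\tilde X(\beta)\) and of the eigenvalues can be sketched as follows. The function name and simulated data are hypothetical, the inputs are assumed centered, and W, C and D are ignored.

```python
import numpy as np


def clr_conditioning_eigenvalues(Z, X, y, beta):
    """Return X_tilde(beta) and the sorted eigenvalues lambda_1 <= ... <= lambda_m."""
    n, k = Z.shape
    residuals = y - X @ beta

    def annihilate(A):
        """Return M_Z A, the part of A orthogonal to the columns of Z."""
        return A - Z @ np.linalg.lstsq(Z, A, rcond=None)[0]

    M_residuals = annihilate(residuals)
    # X_tilde = X - r * (r^T M_Z X) / (r^T M_Z r) with r = y - X beta
    X_tilde = X - np.outer(residuals, M_residuals @ X) / (residuals @ M_residuals)
    M_X_tilde = annihilate(X_tilde)   # M_Z X_tilde
    P_X_tilde = X_tilde - M_X_tilde   # P_Z X_tilde
    eigenvalues = np.linalg.eigvals(
        np.linalg.solve(M_X_tilde.T @ M_X_tilde, P_X_tilde.T @ P_X_tilde)
    ).real
    return X_tilde, (n - k) * np.sort(eigenvalues)


# Hypothetical simulated data with two regressors of interest.
rng = np.random.default_rng(0)
n, k = 300, 4
Z = rng.normal(size=(n, k))
X = Z @ rng.normal(size=(k, 2)) + rng.normal(size=(n, 2))
y = X @ np.array([1.0, -1.0]) + rng.normal(size=n)

beta0 = np.array([1.0, -1.0])
X_tilde, lambdas = clr_conditioning_eigenvalues(Z, X, y, beta0)
```

By construction, \(\tilde X(\beta)^T M_Z (y - X \beta) = 0\), so \(\tilde X(\beta)\) is the part of \(X\) that is uncorrelated with the residuals after projecting out \(Z\).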

If critical_values="londschien2025exact", the exact asymptotic distribution from Theorem 1 of [Londschien, 2025] is used:

\[\mathrm{CLR}(\beta_0) \overset{d}{\to} \sum_{i=0}^m q_i - \mu_\mathrm{min},\]

where \(q_0 \sim \chi^2(k-m)\), \(q_1, \ldots, q_m \sim \chi^2(1)\) are independent, and \(\mu_\mathrm{min}\) is the smallest root of the polynomial

\[p(\mu) := \left(\mu - \sum_{i=0}^m q_i\right) \prod_{i=1}^m (\mu - \lambda_i) - \sum_{i=1}^m \lambda_i q_i \prod_{j \geq 1, j \neq i} (\mu - \lambda_j).\]

This distribution is conditional on all eigenvalues \(\lambda_1, \ldots, \lambda_m\) and provides substantially more power when eigenvalues differ.

If critical_values is "kleibergen2007generalizing" or "moreira2003conditional", uses the upper bound conditional on only the smallest eigenvalue \(\lambda_1\):

\[\mathrm{CLR}(\beta_0) \leq \frac{1}{2} \left( Q_{m_X} + Q_{k - m_X} - \lambda_1 + \sqrt{ (Q_{m_X} + Q_{k - m_X} + \lambda_1)^2 - 4 Q_{k - m_X} \lambda_1 } \right),\]

where \(Q_{m_X} \sim \chi^2(m_X)\) and \(Q_{k - m_X} \sim \chi^2(k - m_X)\) are independent. This bound is sharp when all eigenvalues are equal.

This test is robust to weak instruments. If identification is strong (\(\lambda_i \to \infty\)), the test is equivalent to the likelihood ratio test with \(\chi^2(m_X)\) distribution. If identification is weak (\(\lambda_i \to 0\)), the test is equivalent to the Anderson-Rubin test with \(\chi^2(k)\) distribution.

If W is not None, the test statistic is defined as

\[\begin{split}\mathrm{CLR}(\beta) &:= (n - k) \min_\gamma \frac{ \| P_Z (y - X \beta - W \gamma) \|_2^2}{ \| M_Z (y - X \beta - W \gamma) \|_2^2} - (n - k) \min_{\beta, \gamma} \frac{ \| P_Z (y - X \beta - W \gamma) \|_2^2 }{ \| M_Z (y - X \beta - W \gamma) \|_2^2 } \\ &= (n - k) \frac{ \| P_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|_2^2}{ \| M_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|_2^2} - (n - k) \frac{ \| P_Z (y - \begin{pmatrix} X & W \end{pmatrix} \hat\delta_\mathrm{LIML}) \|_2^2 }{ \| M_Z (y - \begin{pmatrix} X & W \end{pmatrix} \hat\delta_\mathrm{LIML}) \|_2^2 },\end{split}\]

where \(\hat\gamma_\mathrm{LIML}\) and \(\hat\delta_\mathrm{LIML}\) are LIML estimators. In this case, only the upper bound method is available (critical_values must be "kleibergen2007generalizing" or "moreira2003conditional").

Parameters:

Z (array_like of shape (n, k)) – Instruments.
X (array_like of shape (n, mx)) – Endogenous regressors of interest.
y (array_like of shape (n,)) – Outcomes.
beta (array_like of shape (mx + md,)) – Coefficients to test.
W (array_like of shape (n, mw) or None, default=None) – Endogenous regressors not of interest.
C (array_like of shape (n, mc) or None, default=None) – Exogenous regressors not of interest.
D (array_like of shape (n, md) or None, default=None) – Exogenous regressors of interest. Will be included into both X and Z if supplied.
fit_intercept (bool, default=True) – Whether to include an intercept. This is equivalent to centering the inputs.
critical_values ({"londschien2025exact", "kleibergen2007generalizing", "moreira2003conditional"}, default="londschien2025exact") – Which critical values to use. If "londschien2025exact", uses the exact distribution conditional on all eigenvalues via Monte Carlo simulation of the polynomial root \(\mu_\mathrm{min}\). Only available when W is None. If "kleibergen2007generalizing" or "moreira2003conditional", uses the upper bound conditional on the smallest eigenvalue via numerical integration.
tol (float, default=1e-6) – Tolerance for the approximation of the CDF and thus the p-value.
num_samples (int, default=100_000) – Number of Monte Carlo samples when using "londschien2025exact".

Returns:

statistic (float) – The test statistic \(\mathrm{CLR}(\beta)\).
p_value (float) – The p-value of the test, correct up to tolerance tol.

Raises:

ValueError: – If the dimensions of the inputs are incorrect.

References

[Kle07]

Frank Kleibergen. Generalizing weak instrument robust IV statistics towards multiple parameters, unrestricted covariance matrices and identification statistics. Journal of Econometrics, 139(1):181–216, 2007.

[Kle21]

Frank Kleibergen. Efficient size correct subset inference in homoskedastic linear instrumental variables regression. Journal of Econometrics, 221(1):78–96, 2021.

[Lon25]

Malte Londschien. The exact distribution of the conditional likelihood-ratio test in instrumental variables regression. arXiv preprint, 2025.

[Mor03]

Marcelo J Moreira. A conditional likelihood ratio test for structural models. Econometrica, 71(4):1027–1048, 2003.

ivmodels.tests.conditional_likelihood_ratio.inverse_conditional_likelihood_ratio_test(Z, X, y, alpha=0.05, W=None, C=None, D=None, fit_intercept=True, tol=1e-06, max_value=1000000.0, max_eval=100)

Return an approximation of the confidence set by inversion of the CLR test.

This is only implemented if mx + md = 1. The confidence set is computed by a root finding algorithm, see the docs of _find_roots() for more details.

Parameters:

Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
alpha (float, optional, default = 0.05) – Significance level of the test.
W (np.ndarray of dimension (n, mw) or None, optional, default = None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default = None) – Exogenous regressors of interest.
fit_intercept (bool, optional, default: True) – Whether to include an intercept. This is equivalent to centering the inputs.
tol (float, optional, default: 1e-6) – The boundaries of the confidence set are computed up to this tolerance.
max_value (float, optional, default: 1e6) – The maximum value to consider when searching for the boundaries of the confidence set. That is, if the true confidence set is of the form [0, max_value + 1], the returned confidence set will be [0, np.inf].
max_eval (int, optional, default: 100) – The maximum number of evaluations of the critical value function to find the boundaries of the confidence set.

ivmodels.tests.j module

ivmodels.tests.j.j_test(Z, X, y, C=None, fit_intercept=True, estimator='liml')

Perform the J-test for overidentifying restrictions.

This is also called the Sargan–Hansen test or Sargan’s J test.

Parameters:

Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Endogenous regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Exogenous regressors.
fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.
estimator (str, optional, default = 'liml') – Estimator to use. Passed to KClass.

Returns:

statistic (float) – The test statistic \(J\).
p_value (float) – The p-value of the test.

Raises:

ValueError: – If the dimensions of the inputs are incorrect.

ivmodels.tests.lagrange_multiplier module

class ivmodels.tests.lagrange_multiplier.MemoizeJacHess(fun)

Bases: MemoizeJac

Cache the return values of a function returning (fun, grad, hess).

hessian(x, *args)

ivmodels.tests.lagrange_multiplier.inverse_lagrange_multiplier_test(Z, X, y, alpha=0.05, W=None, C=None, D=None, fit_intercept=True, tol=1e-06, max_value=1000000.0, max_eval=100)

Return an approximation of the confidence set by inversion of the LM test.

This is only implemented if mx + md = 1. The confidence set is computed by a root finding algorithm, see the docs of _find_roots() for more details.

Parameters:

Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors of interest.
y (np.ndarray of dimension (n,)) – Outcomes.
alpha (float, optional, default=0.05) – Significance level. The confidence level is 1 - alpha.
W (np.ndarray of dimension (n, mw) or None, optional, default=None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default=None) – Exogenous regressors of interest.
fit_intercept (bool, optional, default=True) – Whether to fit an intercept. This is equivalent to centering the inputs.
tol (float, optional, default=1e-6) – Tolerance for the root finding algorithm.
max_value (float, optional, default=1e6) – Maximum value for the root finding algorithm. Returns a confidence set with infinite bounds if the algorithm reaches this value.
max_eval (int, optional, default=100) – Maximum number of evaluations of the statistic for the root finding algorithm.

ivmodels.tests.lagrange_multiplier.lagrange_multiplier_test(Z, X, y, beta, W=None, C=None, D=None, fit_intercept=True, **kwargs)

Perform the Lagrange multiplier test for beta by Kleibergen [2002].

Test the null hypothesis that the residuals are uncorrelated with the instruments. If W is None, let

\[\tilde X(\beta) := X - (y - X \beta) \frac{(y - X \beta)^T M_Z X}{(y - X \beta)^T M_Z (y - X \beta)}.\]

The test statistic is

\[\mathrm{LM}(\beta) := (n - k) \frac{\| P_{P_Z \tilde X(\beta)} (y - X \beta) \|_2^2}{\| M_Z (y - X \beta) \|_2^2}.\]

If W is not None, let

\[\tilde S(\beta, \gamma) := (X \ W) - (y - X \beta - W \gamma) \frac{(y - X \beta - W \gamma)^T M_Z (X \ W)}{(y - X \beta - W \gamma)^T M_Z (y - X \beta - W \gamma)}.\]

The test statistic is

\[\mathrm{LM}(\beta) := (n - k) \min_{\gamma} \frac{\| P_{P_Z \tilde S(\beta, \gamma)} (y - X \beta - W \gamma) \|_2^2}{\| M_Z (y - X \beta - W \gamma) \|_2^2}.\]

This test statistic is asymptotically distributed as \(\chi^2(m_X)\) under the null, even if the instruments are weak.
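For the case W is None, the statistic can be sketched with two nested projections. The names below are hypothetical, the inputs are assumed centered, and C and D are ignored; this is an illustration of the formula, not the library's implementation.

```python
import numpy as np


def project(A, B):
    """Orthogonal projection of B onto the column space of A."""
    return A @ np.linalg.lstsq(A, B, rcond=None)[0]


def lm_statistic(Z, X, y, beta):
    n, k = Z.shape
    residuals = y - X @ beta
    M_residuals = residuals - project(Z, residuals)  # M_Z r
    # X_tilde = X - r * (r^T M_Z X) / (r^T M_Z r)
    X_tilde = X - np.outer(residuals, M_residuals @ X) / (residuals @ M_residuals)
    PZ_X_tilde = project(Z, X_tilde)                 # P_Z X_tilde
    numerator = project(PZ_X_tilde, residuals)       # P_{P_Z X_tilde} r
    return (n - k) * (numerator @ numerator) / (M_residuals @ M_residuals)


# Hypothetical simulated data with one regressor and four instruments.
rng = np.random.default_rng(0)
n, k = 250, 4
Z = rng.normal(size=(n, k))
X = Z @ rng.normal(size=(k, 1)) + rng.normal(size=(n, 1))
y = X @ np.array([0.5]) + rng.normal(size=n)

beta0 = np.array([0.5])
lm = lm_statistic(Z, X, y, beta0)
```

Because the column space of \(P_Z \tilde X(\beta)\) is contained in that of \(Z\), the LM statistic never exceeds \(k \cdot \mathrm{AR}(\beta)\). The p-value follows by evaluating the \(\chi^2(m_X)\) survival function at `lm`.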

Parameters:

Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors of interest.
y (np.ndarray of dimension (n,)) – Outcomes.
beta (np.ndarray of dimension (mx + md,)) – Coefficients to test.
W (np.ndarray of dimension (n, mw) or None, optional, default=None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default=None) – Exogenous regressors of interest.
fit_intercept (bool, optional, default=True) – Whether to fit an intercept. This is equivalent to centering the inputs.

Returns:

statistic (float) – The test statistic \(\mathrm{LM}(\beta)\).
p_value (float) – The p-value of the test. Equal to \(1 - F_{\chi^2(m_X)}(\mathrm{LM}(\beta))\), where \(F_{\chi^2(m_X)}\) is the cumulative distribution function of the \(\chi^2(m_X)\) distribution.

Raises:

ValueError: – If the dimensions of the inputs are incorrect.

ivmodels.tests.likelihood_ratio module

ivmodels.tests.likelihood_ratio.inverse_likelihood_ratio_test(Z, X, y, alpha=0.05, W=None, C=None, D=None, fit_intercept=True)

Return the quadric for the inverse likelihood ratio test’s acceptance region.

The quadric is defined as

\[\mathrm{LR}(\beta) \leq F_{\chi^2(m_X)}(1 - \alpha).\]

Parameters:

Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
alpha (float, optional, default=0.05) – Significance level.
W (np.ndarray of dimension (n, mw) or None, optional, default=None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default=None) – Exogenous regressors of interest.
fit_intercept (bool, optional, default=True) – Whether to fit an intercept. This is equivalent to centering the inputs.

ivmodels.tests.likelihood_ratio.likelihood_ratio_test(Z, X, y, beta, W=None, C=None, D=None, fit_intercept=True)

Perform the likelihood ratio test for beta.

If W is None, the test statistic is defined as

\[\begin{split}\mathrm{LR}(\beta) &:= (n - k) \frac{ \| P_Z (y - X \beta) \|_2^2}{ \| M_Z (y - X \beta) \|_2^2} - (n - k) \frac{ \| P_Z (y - X \hat\beta_\mathrm{LIML}) \|_2^2 }{ \| M_Z (y - X \hat\beta_\mathrm{LIML}) \|_2^2 } \\ &= k \ \mathrm{AR}(\beta) - k \ \mathrm{AR}(\hat\beta_\mathrm{LIML}),\end{split}\]

where \(P_Z\) is the projection matrix onto the column space of \(Z\), \(M_Z = \mathrm{Id} - P_Z\), and \(\hat\beta_\mathrm{LIML}\) is the LIML estimator of \(\beta\), minimizing the Anderson-Rubin test statistic \(\mathrm{AR}(\beta)\) (see anderson_rubin_test()) at \(\mathrm{AR}(\hat\beta_\mathrm{LIML}) = \frac{n - k}{k} (\hat\kappa_\mathrm{LIML} - 1)\).
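The relation between \(\hat\kappa_\mathrm{LIML}\) and the minimal Anderson-Rubin statistic can be verified numerically. The sketch relies on two standard results that are assumptions relative to this page: \(\hat\kappa_\mathrm{LIML}\) is the smallest eigenvalue of \(((X \ y)^T M_Z (X \ y))^{-1} (X \ y)^T (X \ y)\), and the LIML estimator is the k-class estimator \((X^T (\mathrm{Id} - \kappa M_Z) X)^{-1} X^T (\mathrm{Id} - \kappa M_Z) y\) at \(\kappa = \hat\kappa_\mathrm{LIML}\). Names and data are hypothetical; inputs are assumed centered.

```python
import numpy as np


def liml(Z, X, y):
    """Return (beta_liml, kappa_liml) for centered inputs."""
    S = np.column_stack([X, y])
    M_S = S - Z @ np.linalg.lstsq(Z, S, rcond=None)[0]  # M_Z (X y)
    kappa = np.linalg.eigvals(np.linalg.solve(M_S.T @ M_S, S.T @ S)).real.min()
    M_X, M_y = M_S[:, :-1], M_S[:, -1]
    beta = np.linalg.solve(
        X.T @ X - kappa * M_X.T @ M_X,  # X^T (Id - kappa M_Z) X
        X.T @ y - kappa * M_X.T @ M_y,  # X^T (Id - kappa M_Z) y
    )
    return beta, kappa


def ar_statistic(Z, X, y, beta):
    n, k = Z.shape
    r = y - X @ beta
    fit = Z @ np.linalg.lstsq(Z, r, rcond=None)[0]
    return (n - k) / k * (fit @ fit) / ((r - fit) @ (r - fit))


# Hypothetical simulated data with an endogenous regressor.
rng = np.random.default_rng(0)
n, k = 300, 3
Z = rng.normal(size=(n, k))
noise = rng.normal(size=n)
X = Z @ np.array([[1.0], [0.5], [0.5]]) + (noise + rng.normal(size=n))[:, None]
y = X @ np.array([1.0]) + noise

beta_liml, kappa_liml = liml(Z, X, y)
ar_at_liml = ar_statistic(Z, X, y, beta_liml)
beta0 = np.array([1.0])
lr = k * ar_statistic(Z, X, y, beta0) - (n - k) * (kappa_liml - 1)
```

Here `ar_at_liml` should match `(n - k) / k * (kappa_liml - 1)`, and `lr` is the likelihood ratio statistic at `beta0`, which is non-negative because the LIML estimator minimizes the Anderson-Rubin statistic.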

If W is not None, the test statistic is defined as

\[\mathrm{LR}(\beta) := (n - k) \frac{ \|P_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|^2_2 }{\| M_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|_2^2 } - (n - k) \frac{\| P_Z (y - (X \ W) \hat\delta_\mathrm{LIML}) \|_2^2 }{ \| M_Z (y - (X \ W) \hat\delta_\mathrm{LIML}) \|_2^2},\]

where \(\hat\gamma_\mathrm{LIML}\) is the LIML estimator (see KClass) using instruments \(Z\), endogenous covariates \(W\), and outcomes \(y - X \beta\), and \(\hat\delta_\mathrm{LIML}\) is the LIML estimator using instruments \(Z\), endogenous covariates \((X \ W)\), and outcomes \(y\).

Under the null and given strong instruments, the test statistic is asymptotically distributed as \(\chi^2(m_X)\), where \(m_X\) is the number of regressors.

Parameters:

Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
beta (np.ndarray of dimension (mx + md,)) – Coefficients to test.
W (np.ndarray of dimension (n, mw) or None, optional, default=None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default=None) – Exogenous regressors of interest.
fit_intercept (bool, optional, default=True) – Whether to fit an intercept. This is equivalent to centering the inputs.

Returns:

statistic (float) – The test statistic \(\mathrm{LR}(\beta)\).
p_value (float) – The p-value of the test. Equal to \(1 - F_{\chi^2(m_X)}(\mathrm{LR}(\beta))\), where \(F_{\chi^2(m_X)}\) is the cumulative distribution function of the \(\chi^2(m_X)\) distribution.

Raises:

ValueError: – If the dimensions of the inputs are incorrect.

ivmodels.tests.pulse module

ivmodels.tests.pulse.inverse_pulse_test(Z, X, y, alpha=0.05, W=None, C=None, D=None, fit_intercept=True)

Return the quadric for the inverse pulse test’s acceptance region.

The quadric satisfies quadric(x) <= 0 if and only if pulse_test(Z, X, y, beta=x, C=C, D=D)[1] > alpha. It is thus a confidence region for the causal parameter corresponding to the regressors X and D.

Parameters:

Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
alpha (float, optional, default=0.05) – Significance level.
W (np.ndarray of dimension (n, mw) or None, optional, default=None) – Must be None or mw must be 0. No subvector variant of the test is implemented.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default=None) – Exogenous regressors of interest.
fit_intercept (bool, optional, default=True) – Whether to fit an intercept. This is equivalent to centering the inputs.

Returns:

The quadric for the acceptance region.

Return type:

Quadric

ivmodels.tests.pulse.pulse_test(Z, X, y, beta, W=None, C=None, D=None, fit_intercept=True)

Test proposed by Jakobsen and Peters [2022] with null hypothesis: \(Z\) and \(y - X \beta\) are uncorrelated.

The test statistic is defined as

\[T := n \frac{\| P_Z (y - X \beta) \|_2^2}{\| (y - X \beta) \|_2^2}.\]

Under the null, \(T\) is asymptotically distributed as \(\chi^2(k)\). See Section 3.2 of [Jakobsen and Peters, 2022] for details.

If D is not None, it is added to both the instruments \(Z\) and the regressors \(X\), such that \(T\) is asymptotically distributed as \(\chi^2(k + m_D)\) under the null.
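A minimal sketch of the statistic \(T\) for the case without D follows. The function name and data are hypothetical, and the inputs are assumed centered.

```python
import numpy as np


def pulse_statistic(Z, X, y, beta):
    """T = n * ||P_Z r||^2 / ||r||^2 with r = y - X beta."""
    n = Z.shape[0]
    residuals = y - X @ beta
    fitted = Z @ np.linalg.lstsq(Z, residuals, rcond=None)[0]  # P_Z r
    return n * (fitted @ fitted) / (residuals @ residuals)


# Hypothetical simulated data.
rng = np.random.default_rng(0)
n, k = 200, 3
Z = rng.normal(size=(n, k))
X = Z @ np.array([[1.0], [-1.0], [0.5]]) + rng.normal(size=(n, 1))
y = X @ np.array([1.5]) + rng.normal(size=n)

beta0 = np.array([1.5])
T = pulse_statistic(Z, X, y, beta0)
```

Since \(\| y - X \beta \|_2^2 = \| P_Z (y - X \beta) \|_2^2 + \| M_Z (y - X \beta) \|_2^2\), the statistic satisfies \(T = n \rho / (1 + \rho)\) with \(\rho = \| P_Z (y - X \beta) \|_2^2 / \| M_Z (y - X \beta) \|_2^2\). It is therefore a monotone transformation of the ratio underlying the Anderson-Rubin statistic and always lies in \([0, n]\).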

Parameters:

Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
beta (np.ndarray of dimension (mx + md,)) – Coefficients to test.
W (np.ndarray of dimension (n, mw) or None, optional, default=None) – Must be None or mw must be 0. No subvector variant of the test is implemented.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default=None) – Exogenous regressors of interest.
fit_intercept (bool, optional, default=True) – Whether to fit an intercept. This is equivalent to centering the inputs.

Returns:

statistic (float) – The test statistic \(T\).
p_value (float) – The p-value of the test. Equal to \(1 - F_{\chi^2(k + m_D)}(T)\), where \(F_{\chi^2(k + m_D)}\) is the cumulative distribution function of the \(\chi^2(k + m_D)\) distribution.

Raises:

ValueError: – If the dimensions of the inputs are incorrect.

References

[JP22]

Martin Emil Jakobsen and Jonas Peters. Distributional robustness of k-class estimators and the PULSE. The Econometrics Journal, 25(2):404–432, 2022.

ivmodels.tests.rank module

ivmodels.tests.rank.rank_test(Z, X, C=None, fit_intercept=True)

Perform the Cragg-Donald test for reduced rank [Cragg and Donald, 1997].

Let \(X = Z \Pi + V\) with \(\Pi \in \mathbb{R}^{k \times m_X}\). The Cragg-Donald test is the Wald test with the null hypothesis

\[H_0: \mathrm{rank}(\Pi) < m_X.\]

The test statistic is

\[\mathrm{CD} := \lambda := (n-k) \lambda_\mathrm{min}((X^T M_Z X)^{-1} X^T P_Z X)\]

where \(P_Z = Z (Z^T Z)^{-1} Z^T\) is the orthogonal projection onto the column space of \(Z\), \(M_Z = I - P_Z\) is the orthogonal projection onto the orthogonal complement of the column space of \(Z\), and \(\lambda_\mathrm{min}\) is the smallest eigenvalue. Under the null hypothesis, \(\mathrm{CD}\) is asymptotically distributed as \(\chi^2(k - m_X + 1)\).

The Cragg-Donald test is asymptotically equivalent to Anderson [1951]’s likelihood ratio test for reduced rank of \(\Pi\).
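The statistic can be sketched in a few lines of NumPy. The function name and the two simulated designs (one with a full-rank \(\Pi\), one with a rank-deficient \(\Pi\)) are hypothetical, the inputs are assumed centered, and C is ignored.

```python
import numpy as np


def cragg_donald_statistic(Z, X):
    """CD = (n - k) * lambda_min((X^T M_Z X)^{-1} X^T P_Z X)."""
    n, k = Z.shape
    X_fitted = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]  # P_Z X
    X_remainder = X - X_fitted                           # M_Z X
    eigenvalues = np.linalg.eigvals(
        np.linalg.solve(X_remainder.T @ X_remainder, X_fitted.T @ X_fitted)
    ).real
    return (n - k) * eigenvalues.min()


# Hypothetical designs: Pi has rank 2 in the first case and rank 1 in the second.
rng = np.random.default_rng(0)
n, k = 1000, 3
Z = rng.normal(size=(n, k))
V = rng.normal(size=(n, 2))
Pi_full = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Pi_deficient = np.ones((3, 2))

cd_full = cragg_donald_statistic(Z, Z @ Pi_full + V)
cd_deficient = cragg_donald_statistic(Z, Z @ Pi_deficient + V)
```

Under the full-rank design the statistic grows with \(n\), while under the rank-deficient design it stays of the order of a \(\chi^2(k - m_X + 1)\) draw. The p-value is obtained by evaluating the \(\chi^2(k - m_X + 1)\) survival function at the statistic.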

Parameters:

Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
fit_intercept (bool, optional, default=True) – Whether to fit an intercept.

Returns:

statistic (float) – The test statistic \(\mathrm{CD}\).
p_value (float) – The p-value of the test. Equal to \(1 - F_{\chi^2(k - m_X + 1)}(\mathrm{CD})\), where \(F_{\chi^2(k - m_X + 1)}\) is the cumulative distribution function of the \(\chi^2(k - m_X + 1)\) distribution.

References

[And51]

Theodore Wilbur Anderson. Estimating linear restrictions on regression coefficients for multivariate normal distributions. The Annals of Mathematical Statistics, pages 327–351, 1951.

[CD97]

John G Cragg and Stephen G Donald. Inferring the rank of a matrix. Journal of Econometrics, 76(1-2):223–250, 1997.

ivmodels.tests.residual_prediction module

ivmodels.tests.residual_prediction.inverse_weak_residual_prediction_test(Z, X, y, C=None, alpha=0.05, robust=False, nonlinear_model=None, fit_intercept=True, train_fraction=None, upper_clipping_quantile=0.9, gamma=0.05, seed=0, tol=1e-06, max_value=1000000.0, max_eval=100)

Compute confidence set for the weak-IV residual prediction test [Scheidegger et al., 2025].

This implements weak-IV robust confidence sets for the causal parameter based on the weak_residual_prediction_test [Scheidegger et al., 2025]. For each candidate \(\beta_0\), it tests

\[H_0(\beta_0): \exists \theta \in \mathbb{R}^q \text{ s.t. } \mathbb{E}[y - X^T \beta_0 - C^T \theta \mid Z, C] = 0,\]

and returns the confidence set

\[C_\alpha = \{\beta_0 \mid \mathrm{pval}(\beta_0) \geq \alpha\}.\]

Unlike residual_prediction_test(), this test does not require estimating \(\beta\) via TSLS and therefore remains valid under weak or many instruments. If \(C_\alpha\) is empty, every candidate value is rejected at level \(\alpha\), which indicates that the model is misspecified.

The function is only implemented for scalar \(X\) (\(p = 1\)). The confidence set boundaries are located by root-finding starting from the TSLS estimate of \(\beta\) on the full sample.
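
As a rough illustration of test inversion, the sketch below scans a grid of candidates and keeps those with a p-value of at least alpha. It is a simplification under stated assumptions: the function above uses root-finding rather than a grid, and toy_p_value is a hypothetical stand-in for \(\mathrm{pval}(\beta_0)\), not the residual prediction test.

```python
import math

import numpy as np


def toy_p_value(beta0: float, estimate: float = 1.0, std_error: float = 0.5) -> float:
    """Two-sided Gaussian p-value; a hypothetical stand-in for pval(beta0)."""
    z = (estimate - beta0) / std_error
    return math.erfc(abs(z) / math.sqrt(2.0))


def invert_test_on_grid(p_value, grid, alpha=0.05):
    """Return all grid points beta0 with p_value(beta0) >= alpha."""
    return np.array([b for b in grid if p_value(b) >= alpha])


grid = np.linspace(-2.0, 4.0, 601)  # candidate values with spacing 0.01
accepted = invert_test_on_grid(toy_p_value, grid, alpha=0.05)

# For this toy p-value the accepted points form a single interval.
lower, upper = accepted.min(), accepted.max()
print(f"approximate confidence set: [{lower:.2f}, {upper:.2f}]")
```

A grid scan costs one test evaluation per candidate and only resolves the boundary up to the grid spacing, which is one reason to prefer root-finding with a tolerance as described above.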

Parameters:

Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, 1)) – Endogenous regressor (must be scalar).
y (np.ndarray of dimension (n,)) – Outcomes.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Included exogenous regressors.
alpha (float, optional, default = 0.05) – Significance level.
robust (bool, optional, default = False) – Whether to use the heteroskedasticity-robust variance estimator.
nonlinear_model (object, optional, default = None) – Object with a fit and predict method. If None, uses an sklearn.ensemble.RandomForestRegressor().
fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.
train_fraction (float, optional, default = None) – Fraction of data used to train the nonlinear model. If None, uses \(\min(0.5, e / \log(n))\).
upper_clipping_quantile (float, optional, default = 0.9) – Asymptotic normality requires the nonlinear model’s prediction not to put too much weight in the tails. To avoid this, we clip its “test set” predictions by a certain threshold in absolute value. The threshold is the upper_clipping_quantile of predictions on the “train” data. Must be between 0 and 1.
gamma (float, optional, default = 0.05) – A non-negative scalar. Limits the minimum variance of the test statistic to gamma times the noise level.
seed (int, optional, default = 0) – Seed for the random train / test split.
tol (float, optional, default = 1e-6) – Tolerance for the root-finding algorithm used to locate the confidence set boundaries.
max_value (float, optional, default = 1e6) – Maximum absolute value of \(\beta_0\) to consider. If the confidence set boundary lies beyond this value, the boundary is reported as \(\pm \infty\).
max_eval (int, optional, default = 100) – Maximum number of evaluations of the test statistic for the root-finding algorithm.
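
The clipping controlled by upper_clipping_quantile might look like the following sketch. It assumes the threshold is the quantile of the absolute "train" predictions; the description above does not say whether absolute values are taken first, so treat that as an assumption. The helper name clip_test_predictions is hypothetical.

```python
import numpy as np


def clip_test_predictions(pred_train, pred_test, upper_clipping_quantile=0.9):
    """Clip "test" predictions in absolute value (hypothetical helper).

    Assumption: the threshold is the quantile of |pred_train|.
    """
    if not 0.0 <= upper_clipping_quantile <= 1.0:
        raise ValueError("upper_clipping_quantile must be between 0 and 1")
    threshold = np.quantile(np.abs(pred_train), upper_clipping_quantile)
    return np.clip(pred_test, -threshold, threshold)


pred_train = np.array([-3.0, -1.0, 0.5, 1.0, 2.0])
pred_test = np.array([-5.0, 0.2, 4.0])

# With quantile 0.5 the threshold is the median of |pred_train|.
clipped = clip_test_predictions(pred_train, pred_test, upper_clipping_quantile=0.5)
print(clipped)
```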

Returns:

The confidence set \(C_\alpha\).

Return type:

ConfidenceSet

Raises:

ValueError: – If the dimensions of the inputs are incorrect.
ValueError: – If X has more than one column.
ValueError: – If train_fraction is not in (0, 1).
ValueError: – If nonlinear_model does not have a fit and predict method.

References

[SLBuhlmann25]

Cyrill Scheidegger, Malte Londschien, and Peter Bühlmann. A residual prediction test for the well-specification of linear instrumental variable models. arXiv preprint arXiv:2506.12771, 2025.

ivmodels.tests.residual_prediction.residual_prediction_test(Z, X, y, C=None, robust=False, nonlinear_model=None, fit_intercept=True, train_fraction=None, upper_clipping_quantile=0.9, gamma=0.05, seed=0)

Perform the residual prediction test [Scheidegger et al., 2025] for model specification.

This uses a nonlinear model to test whether an IV model is well specified: “Is the linear IV model appropriate for the data?” Formally, the null hypothesis is:

\[H_0: \exists \beta_0 \in \mathbb{R}^p \mathrm{\ such \ that \ } \mathbb{E}[y - X \beta_0 \mid Z] = 0.\]

The test splits the data according to train_fraction into a “train” part \(y_a, X_a, Z_a\) and a “test” part \(y_b, X_b, Z_b\). On the “train” data, it computes the residuals \(\hat{\varepsilon}_a := y_a - X_a \hat \beta_a\) of a two-stage least-squares (TSLS) estimator \(\hat \beta_a: y_a \sim X_a | Z_a\) and fits a nonlinear model regressing \(\hat{\varepsilon}_a \sim Z_a\). The fitted nonlinear model is then used to predict the residuals on the “test” data, yielding weights \(\hat w_b := \mathrm{nonlinear\_model}(Z_b)\). Let \(\hat \varepsilon_b := y_b - X_b \hat \beta_b\) be the residuals of a TSLS estimator \(\hat \beta_b: y_b \sim X_b | Z_b\) on the “test” data, let \(n_b\) be the number of observations in the “test” data, and let \(\hat \sigma^2\) be an estimate of the variance of \(\hat w_b \cdot \hat \varepsilon_b\) under the null hypothesis. The test statistic is

\[T = \frac{1}{\sqrt{n_b}} \frac{\hat w_b^T \hat \varepsilon_b}{\sqrt{\hat \sigma^2}}.\]

This is asymptotically standard Gaussian distributed under the null.

See also the test’s R implementation by Cyrill Scheidegger.

To avoid the \(p\)-value lottery due to the random train / test split used in the residual prediction test, Scheidegger et al. [2025] suggest aggregating the \(p\)-values from multiple random splits by taking 2 times the median. This results in a conservative \(p\)-value [Meinshausen et al., 2009].

Example

>>> import numpy as np
>>> from ivmodels.tests import residual_prediction_test
>>> from ivmodels.simulate import simulate_gaussian_iv
>>>
>>> Z, X, y = ...
>>>
>>> ps = np.empty(50)
>>> for i in range(50):
...     _, ps[i] = residual_prediction_test(Z, X, y, seed=i)
>>>
>>> print(f"Residual prediction test p-value: {2 * np.median(ps):.3f}")

Parameters:

Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Included exogenous regressors.
robust (bool or string, optional, default = False) – Whether to use a heteroskedasticity-robust method to estimate the noise level \(\hat \sigma^2\).
nonlinear_model (object, optional, default = None) – Object with a fit and predict method. If None, uses an sklearn.ensemble.RandomForestRegressor().
fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.
train_fraction (float, optional, default = None) – Fraction of data to use to train the nonlinear model. Must be between 0 and 1. The remaining data is used to compute the test statistic. If None, 0.5 or \(e / \log(n)\) is used, whichever is smaller.
upper_clipping_quantile (float, optional, default = 0.9) – Asymptotic normality requires the nonlinear model’s prediction not to put too much weight in the tails. To avoid this, we clip its “test set” predictions by a certain threshold in absolute value. The threshold is the upper_clipping_quantile of predictions on the “train” data. Must be between 0 and 1.
gamma (float, optional, default = 0.05) – A non-negative scalar. Limits the minimum variance of the test statistic to gamma times the noise level.
seed (int, optional, default = 0) – Seed used to generate the random train / test split.
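
The default for train_fraction described above, the smaller of 0.5 and \(e / \log(n)\), can be evaluated directly. A minimal sketch follows; the helper name default_train_fraction is hypothetical.

```python
import math


def default_train_fraction(n: int) -> float:
    """Smaller of 0.5 and e / log(n), per the train_fraction description above."""
    if n < 2:
        raise ValueError("n must be at least 2 so that log(n) > 0")
    return min(0.5, math.e / math.log(n))


# The cap of 0.5 binds while log(n) <= 2e, that is, up to roughly n = 229.
for n in (100, 1_000, 100_000):
    print(n, round(default_train_fraction(n), 3))
```

The training share therefore shrinks slowly as the sample grows, leaving a growing share of the data for computing the test statistic.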

Returns:

statistic (float) – The test statistic \(T\).
p_value (float) – The p-value of the test.

Raises:

ValueError: – If the dimensions of the inputs are incorrect.
ValueError: – If train_fraction is not in (0, 1).
ValueError: – If nonlinear_model does not have a fit and predict method.

References

[MMBuhlmann09]

Nicolai Meinshausen, Lukas Meier, and Peter Bühlmann. P-values for high-dimensional regression. Journal of the American Statistical Association, 104(488):1671–1681, 2009.

[SLBuhlmann25]

Cyrill Scheidegger, Malte Londschien, and Peter Bühlmann. A residual prediction test for the well-specification of linear instrumental variable models. arXiv preprint arXiv:2506.12771, 2025.

ivmodels.tests.residual_prediction.weak_residual_prediction_test(Z, X, y, beta, C=None, robust=False, nonlinear_model=None, fit_intercept=True, train_fraction=None, upper_clipping_quantile=0.9, gamma=0.05, seed=0)

Perform the weak-IV-robust residual prediction test at a fixed beta [Scheidegger et al., 2025].

Unlike residual_prediction_test(), this test does not estimate \(\beta\) via TSLS. Instead, it tests the null hypothesis

\[H_0(\beta_0): \exists \theta \in \mathbb{R}^q \text{ s.t. } \mathbb{E}[y - X^T \beta_0 - C^T \theta \mid Z, C] = 0\]

at the supplied value \(\beta_0\). The test statistic is asymptotically standard Gaussian under the null and remains valid under weak or many instruments.

See also the test’s R implementation by Cyrill Scheidegger (weak_RPIV_test).
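
The idea of testing at a fixed \(\beta_0\) can be sketched in a few lines of NumPy. This is a simplified illustration and not the library's implementation: a least-squares fit on polynomial features stands in for the random forest, there is no C, no intercept handling, no clipping and no gamma floor, the variance estimate is a plain second moment, and the one-sided p-value \(1 - \Phi(T)\) is an assumption. All helper names are hypothetical.

```python
import math

import numpy as np


def poly_features(Z):
    """Features [1, Z, Z**2]: a simple stand-in for a nonlinear learner."""
    return np.hstack([np.ones((Z.shape[0], 1)), Z, Z**2])


def weak_rp_sketch(Z, X, y, beta, train_fraction=0.5, seed=0):
    """Simplified fixed-beta residual prediction statistic and p-value."""
    n = Z.shape[0]
    perm = np.random.default_rng(seed).permutation(n)
    n_a = int(train_fraction * n)
    a, b = perm[:n_a], perm[n_a:]

    # Residuals at the hypothesised beta; nothing is estimated via TSLS.
    resid = y - X @ beta

    # Fit the stand-in model on the "train" part, predict on the "test" part.
    coef = np.linalg.lstsq(poly_features(Z[a]), resid[a], rcond=None)[0]
    w_b = poly_features(Z[b]) @ coef

    # Assumed variance estimate of w_b * resid_b (second moment, robust form).
    sigma2 = np.mean((w_b * resid[b]) ** 2)
    stat = np.sum(w_b * resid[b]) / math.sqrt(len(b) * sigma2)
    p_value = 0.5 * math.erfc(stat / math.sqrt(2.0))  # 1 - Phi(stat), one-sided
    return stat, p_value


rng = np.random.default_rng(0)
n = 2000
Z = rng.normal(size=(n, 2))
u = rng.normal(size=n)  # error term, correlated with X but not with Z
X = Z @ np.array([[1.0], [0.5]]) + u[:, None] + rng.normal(size=(n, 1))
y_ok = X[:, 0] * 1.0 + u          # linear IV model holds with beta = 1
y_bad = y_ok + Z[:, 0] ** 2       # direct quadratic effect of Z violates H_0

stat_ok, p_ok = weak_rp_sketch(Z, X, y_ok, np.array([1.0]))
stat_bad, p_bad = weak_rp_sketch(Z, X, y_bad, np.array([1.0]))
print(f"well specified: p = {p_ok:.3f}; misspecified: p = {p_bad:.3g}")
```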

Parameters:

Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
beta (np.ndarray of dimension (mx,)) – Coefficients to test.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Included exogenous regressors.
robust (bool, optional, default = False) – Whether to use the heteroskedasticity-robust variance estimator.
nonlinear_model (object, optional, default = None) – Object with a fit and predict method. If None, uses an sklearn.ensemble.RandomForestRegressor().
fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.
train_fraction (float, optional, default = None) – Fraction of data used to train the nonlinear model. If None, uses \(\min(0.5, e / \log(n))\).
upper_clipping_quantile (float, optional, default = 0.9) – Asymptotic normality requires the nonlinear model’s prediction not to put too much weight in the tails. To avoid this, we clip its “test set” predictions by a certain threshold in absolute value. The threshold is the upper_clipping_quantile of predictions on the “train” data. Must be between 0 and 1.
gamma (float, optional, default = 0.05) – A non-negative scalar. Limits the minimum variance of the test statistic to gamma times the noise level.
seed (int, optional, default = 0) – Seed for the random train / test split.

Returns:

statistic (float) – The test statistic.
p_value (float) – The p-value of the test.

Raises:

ValueError: – If the dimensions of the inputs are incorrect.
ValueError: – If train_fraction is not in (0, 1).
ValueError: – If nonlinear_model does not have a fit and predict method.

References

[SLBuhlmann25]

Cyrill Scheidegger, Malte Londschien, and Peter Bühlmann. A residual prediction test for the well-specification of linear instrumental variable models. arXiv preprint arXiv:2506.12771, 2025.

ivmodels.tests.wald module

ivmodels.tests.wald.inverse_wald_test(Z, X, y, alpha=0.05, W=None, C=None, D=None, estimator='tsls', fit_intercept=True)

Return the quadric for the acceptance region based on asymptotic normality.

If W is None, the quadric is defined as

\[(\beta - \hat{\beta})^T X^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) X (\beta - \hat{\beta}) \leq \hat{\sigma}^2 F^{-1}_{\chi^2(m_X)}(1 - \alpha),\]

where \(\hat \beta = \hat \beta(\kappa)\) is the k-class estimate of the causal parameter \(\beta_0\) (controlled by the parameter estimator), \(\hat \sigma^2 = \frac{1}{n - m_X} \| y - X \hat \beta \|^2_2\), \(P_Z\) is the projection matrix onto the column space of \(Z\), and \(F^{-1}_{\chi^2(m_X)}\) is the quantile function (inverse cumulative distribution function) of the \(\chi^2(m_X)\) distribution.

If W is not None, the quadric is defined as

\[(\beta - B \hat{\beta})^T (B ((X W)^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) (X W))^{-1} B^T)^{-1} (\beta - B \hat{\beta}) \leq \hat{\sigma}^2 F^{-1}_{\chi^2(m_X)}(1 - \alpha),\]

where \(B = (\mathrm{Id}_{m_X}, 0) \in \mathbb{R}^{m_X \times (m_X + m_W)}\) has ones on its main diagonal and zeros elsewhere, so that \(B \hat\beta\) selects the first \(m_X\) entries of \(\hat\beta\), the coefficients of \(X\).
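
The role of \(B\) can be checked numerically. The sketch below uses made-up numbers and a generic symmetric positive definite matrix M in place of \((X W)^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) (X W)\); it shows that \(B M^{-1} B^T\) is the upper-left block of \(M^{-1}\) and that its inverse is a Schur complement, which corresponds to partialling out \(W\).

```python
import numpy as np

mx, mw = 2, 1
B = np.eye(mx, mx + mw)  # ones on the main diagonal, zeros elsewhere

beta_hat = np.array([0.7, -1.2, 3.0])  # made-up joint estimate for (X, W)
print(B @ beta_hat)  # coefficients belonging to X

# Made-up symmetric positive definite stand-in for the (X W) cross-product matrix.
M = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
M_inv = np.linalg.inv(M)

block = B @ M_inv @ B.T  # upper-left mx x mx block of M^{-1}

# Its inverse equals the Schur complement M_XX - M_XW M_WW^{-1} M_WX.
schur = M[:mx, :mx] - M[:mx, mx:] @ np.linalg.inv(M[mx:, mx:]) @ M[mx:, :mx]
print(np.allclose(np.linalg.inv(block), schur))
```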

Parameters:

Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
alpha (float, optional, default = 0.05) – Significance level.
W (np.ndarray of dimension (n, mw) or None, optional, default = None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default = None) – Exogenous regressors of interest.
estimator (float or str, optional, default = "tsls") – Estimator to use. Passed as kappa parameter to KClass.
fit_intercept (bool, optional, default = True) – Whether to include an intercept. The intercept will be included both in the complete and the (restricted) model. Including an intercept is equivalent to centering the columns of all design matrices.

ivmodels.tests.wald.wald_test(Z, X, y, beta, W=None, C=None, D=None, estimator='tsls', fit_intercept=True, robust=False)

Test based on asymptotic normality of the TSLS (or LIML) estimator.

If W is None, the test statistic is defined as

\[\mathrm{Wald}(\beta) := (\beta - \hat{\beta})^T X^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) X (\beta - \hat{\beta}) / \hat{\sigma}^2,\]

where \(\hat \beta = \hat \beta(\kappa)\) is a k-class estimator with \(\sqrt{n} (1 - \kappa) \to 0\), \(\hat \sigma^2 = \frac{1}{n - m_X} \| y - X \hat \beta \|^2_2\) is an estimate of the variance of the errors, \(P_Z\) is the projection matrix onto the column space of \(Z\), and \(M_Z = \mathrm{Id} - P_Z\). Under strong instruments, the test statistic is asymptotically distributed as \(\chi^2(m_X)\) under the null.
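
For reference, the textbook k-class estimator solves \(X^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) X \hat\beta = X^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) y\). A minimal NumPy sketch, without intercept or exogenous regressors and not taken from the library's KClass, is given below; \(\kappa = 0\) gives OLS and \(\kappa = 1\) gives TSLS. The simulated data are made up.

```python
import numpy as np


def k_class(Z, X, y, kappa):
    """Textbook k-class estimator (hypothetical helper, no intercept)."""
    # P_Z X and P_Z y via least squares, avoiding the n x n projection matrix.
    X_proj = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    y_proj = Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    A = X.T @ (kappa * X_proj + (1 - kappa) * X)
    b = X.T @ (kappa * y_proj + (1 - kappa) * y)
    return np.linalg.solve(A, b)


rng = np.random.default_rng(0)
n = 500
Z = rng.normal(size=(n, 3))
u = rng.normal(size=n)  # unobserved confounder
X = Z @ np.ones((3, 1)) + u[:, None] + rng.normal(size=(n, 1))
y = X[:, 0] * 1.0 + u + rng.normal(size=n)  # true coefficient is 1.0

beta_ols = k_class(Z, X, y, kappa=0.0)   # ignores the instruments
beta_tsls = k_class(Z, X, y, kappa=1.0)  # uses only the projected variation
print(beta_ols, beta_tsls)
```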

If W is not None, the test statistic is defined as

\[\mathrm{Wald}(\beta) := (\beta - B \hat{\beta})^T (B ( (X W)^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) (X W) )^{-1} B^T)^{-1} (\beta - B \hat{\beta}) / \hat{\sigma}^2,\]

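where \(B = (\mathrm{Id}_{m_X}, 0) \in \mathbb{R}^{m_X \times (m_X + m_W)}\) has ones on its main diagonal and zeros elsewhere, so that \(B \hat\beta\) selects the first \(m_X\) entries of \(\hat\beta\), the coefficients of \(X\).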

Parameters:

Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
W (np.ndarray of dimension (n, mw) or None, optional, default = None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default = None) – Exogenous regressors of interest.
beta (np.ndarray of dimension (mx + md,)) – Coefficients to test.
estimator (str or float, optional, default = "tsls") – Estimator to use. Passed to kappa argument of KClass.
fit_intercept (bool, optional, default = True) – Whether to include an intercept. The intercept will be included both in the complete and the (restricted) model. Including an intercept is equivalent to centering the columns of all design matrices.
robust (bool, optional, default = False) – Whether to use a robust variance estimator. If True, the sandwich estimator is used.

Returns:

statistic (float) – The test statistic \(\mathrm{Wald}\).
p_value (float) – The p-value of the test. Equal to \(1 - F_{\chi^2(m_X)}(\mathrm{Wald}(\beta))\), where \(F_{\chi^2(m_X)}\) is the cumulative distribution function of the \(\chi^2(m_X)\) distribution.

Raises:

ValueError: – If the dimensions of the inputs are incorrect.
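
For the simplest case (W, C and D all None, no intercept, \(\kappa = 1\), non-robust variance), the Wald statistic and p-value defined above can be sketched as follows. This is an illustration of the formulas, not the library's code; the helper name wald_tsls_sketch and the simulated data are hypothetical.

```python
import numpy as np
from scipy.stats import chi2


def wald_tsls_sketch(Z, X, y, beta):
    """Wald statistic and p-value for TSLS (kappa = 1), simplest case only."""
    n, mx = X.shape
    X_proj = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]  # P_Z X
    A = X.T @ X_proj                                   # X^T P_Z X
    beta_hat = np.linalg.solve(A, X_proj.T @ y)        # TSLS estimate
    sigma2 = np.sum((y - X @ beta_hat) ** 2) / (n - mx)
    diff = beta - beta_hat
    stat = float(diff @ A @ diff / sigma2)
    return stat, float(chi2.sf(stat, df=mx))


rng = np.random.default_rng(0)
n = 500
Z = rng.normal(size=(n, 3))
u = rng.normal(size=n)  # unobserved confounder
X = Z @ np.ones((3, 1)) + u[:, None] + rng.normal(size=(n, 1))
y = X[:, 0] * 1.0 + u + rng.normal(size=n)  # true coefficient is 1.0

stat_true, p_true = wald_tsls_sketch(Z, X, y, beta=np.array([1.0]))
stat_far, p_far = wald_tsls_sketch(Z, X, y, beta=np.array([2.0]))
print(f"beta = 1: stat = {stat_true:.2f}; beta = 2: stat = {stat_far:.1f}")
```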

Module contents

ivmodels.tests.anderson_rubin_test(Z, X, y, beta, W=None, C=None, D=None, critical_values='chi2', fit_intercept=True)

Perform the Anderson Rubin test [Anderson and Rubin, 1949].

Test the null hypothesis that the residuals are uncorrelated with the instruments. If W is None, the test statistic is defined as

\[\mathrm{AR}(\beta) := \frac{n - k}{k} \frac{\| P_Z (y - X \beta) \|_2^2}{\| M_Z (y - X \beta) \|_2^2},\]

where \(P_Z\) is the projection matrix onto the column space of \(Z\) and \(M_Z = \mathrm{Id} - P_Z\).

Under the null and normally distributed errors, this test statistic is distributed as \(F_{k, n - k}\), where \(k\) is the number of instruments and \(n\) is the number of observations. The statistic is asymptotically distributed as \(\chi^2(k) / k\) under the null and non-normally distributed errors, even for weak instruments.
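
The statistic for the case where W is None can be computed directly from its definition. A minimal NumPy sketch follows, without intercept or exogenous regressors; the helper name and the simulated data are hypothetical.

```python
import numpy as np


def anderson_rubin_statistic(Z, X, y, beta):
    """AR(beta) = (n - k) / k * ||P_Z r||^2 / ||M_Z r||^2 with r = y - X beta."""
    n, k = Z.shape
    resid = y - X @ beta
    # P_Z r via least squares, avoiding the n x n projection matrix.
    resid_proj = Z @ np.linalg.lstsq(Z, resid, rcond=None)[0]
    resid_orth = resid - resid_proj  # M_Z r
    return (n - k) / k * (resid_proj @ resid_proj) / (resid_orth @ resid_orth)


rng = np.random.default_rng(0)
n = 300
Z = rng.normal(size=(n, 3))
u = rng.normal(size=n)  # unobserved confounder
X = Z @ np.ones((3, 1)) + u[:, None] + rng.normal(size=(n, 1))
y = X[:, 0] * 1.0 + u + rng.normal(size=n)  # true coefficient is 1.0

ar_true = anderson_rubin_statistic(Z, X, y, beta=np.array([1.0]))
ar_far = anderson_rubin_statistic(Z, X, y, beta=np.array([3.0]))
print(f"AR at beta = 1: {ar_true:.2f}; AR at beta = 3: {ar_far:.1f}")
```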

If W is not None, the test statistic is

\[\begin{split}\mathrm{AR}(\beta) &:= \min_\gamma \frac{n - k}{k - m_W} \frac{\| P_Z (y - X \beta - W \gamma) \|_2^2}{\| M_Z (y - X \beta - W \gamma) \|_2^2} \\ &= \frac{n - k}{k - m_W} \frac{\| P_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|_2^2}{\| M_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|_2^2},\end{split}\]

where \(\hat\gamma_\mathrm{LIML}\) is the LIML estimate using instruments \(Z\), covariates \(W\) and outcomes \(y - X \beta\). Under the null, this test statistic is asymptotically bounded from above by a random variable that is distributed as \(\frac{1}{k - m_W} \chi^2(k - m_W)\), where \(m_W = \mathrm{dim}(W)\). See [Guggenberger et al., 2012].
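
The minimization over \(\gamma\) has a closed form. Writing \(S = (y - X \beta, W)\) and \(v = (1, -\gamma)\), the ratio is \(v^T S^T P_Z S v / v^T S^T M_Z S v\), whose minimum is the smallest eigenvalue of \((S^T M_Z S)^{-1} S^T P_Z S\). The sketch below uses this standard fact, assuming the minimizing eigenvector has a nonzero first entry; it is not the library's implementation, and the helper names and simulated data are hypothetical.

```python
import numpy as np


def ar_ratio(Z, X, W, y, beta, gamma):
    """Scaled ratio from the definition, at a given gamma."""
    n, k = Z.shape
    r = y - X @ beta - W @ gamma
    r_proj = Z @ np.linalg.lstsq(Z, r, rcond=None)[0]
    return (n - k) / (k - W.shape[1]) * (r_proj @ r_proj) / ((r - r_proj) @ (r - r_proj))


def subvector_ar(Z, X, W, y, beta):
    """Subvector AR statistic and minimizing gamma via an eigenvalue problem."""
    n, k = Z.shape
    mw = W.shape[1]
    S = np.column_stack([y - X @ beta, W])
    S_proj = Z @ np.linalg.lstsq(Z, S, rcond=None)[0]  # P_Z S
    A = S.T @ S_proj   # S^T P_Z S
    B = S.T @ S - A    # S^T M_Z S
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(B, A))
    i = np.argmin(eigvals.real)
    v = eigvecs[:, i].real
    gamma = -v[1:] / v[0]  # rescale the eigenvector to (1, -gamma)
    return (n - k) / (k - mw) * eigvals.real[i], gamma


rng = np.random.default_rng(0)
n, k = 400, 4
Z = rng.normal(size=(n, k))
V = rng.normal(size=(n, 2))
XW = Z @ rng.normal(size=(k, 2)) + V
X, W = XW[:, :1], XW[:, 1:]
y = X[:, 0] - 0.5 * W[:, 0] + V.sum(axis=1) + rng.normal(size=n)

stat, gamma_hat = subvector_ar(Z, X, W, y, beta=np.array([1.0]))
print(stat, gamma_hat)
```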

Parameters:

Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
beta (np.ndarray of dimension (mx + md,)) – Coefficients to test.
W (np.ndarray of dimension (n, mw) or None, optional, default = None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default = None) – Exogenous regressors of interest.
critical_values (str, optional, default = "chi2") – If "chi2", use the \(\chi^2(k - m_W)\) distribution to compute the p-value. If "f", use the \(F_{k - m_W, n - k}\) distribution to compute the p-value. If "guggenberger" or "GKM", use the critical value function proposed by Guggenberger et al. [2019] to compute the p-value.
fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.
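
The "chi2" and "f" options can be sketched with SciPy. This assumes the statistic is multiplied by \(k - m_W\) before it is compared with the \(\chi^2(k - m_W)\) distribution, consistent with the \(\chi^2(k) / k\) limit stated above; the "guggenberger" option is not covered. The helper name ar_p_value is hypothetical.

```python
from scipy.stats import chi2, f


def ar_p_value(statistic, n, k, mw=0, critical_values="chi2"):
    """p-value of an AR statistic under the "chi2" or "f" option (sketch)."""
    df = k - mw
    if critical_values == "chi2":
        # AR is asymptotically chi2(df) / df, so rescale by df first.
        return float(chi2.sf(df * statistic, df=df))
    if critical_values == "f":
        return float(f.sf(statistic, dfn=df, dfd=n - k))
    raise ValueError("this sketch only covers 'chi2' and 'f'")


# Made-up inputs: AR = 3.0 with n = 100 observations and k = 2 instruments.
p_chi2 = ar_p_value(3.0, n=100, k=2, critical_values="chi2")
p_f = ar_p_value(3.0, n=100, k=2, critical_values="f")
print(f"chi2: {p_chi2:.4f}, f: {p_f:.4f}")
```

In this example the F-based p-value is slightly larger, reflecting the heavier upper tail of the \(F_{k - m_W, n - k}\) distribution at small \(n - k\).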

Returns:

statistic (float) – The test statistic \(\mathrm{AR}(\beta)\).
p_value (float) – The p-value of the test.

Raises:

ValueError: – If the dimensions of the inputs are incorrect.

References

[AR49]

Theodore W Anderson and Herman Rubin. Estimation of the parameters of a single equation in a complete system of stochastic equations. The Annals of mathematical statistics, 20(1):46–63, 1949.

[GKM19]

Patrik Guggenberger, Frank Kleibergen, and Sophocles Mavroeidis. A more powerful subvector Anderson Rubin test in linear instrumental variables regression. Quantitative Economics, 10(2):487–526, 2019.

[GKMC12]

Patrik Guggenberger, Frank Kleibergen, Sophocles Mavroeidis, and Linchun Chen. On the asymptotic sizes of subset Anderson–Rubin and Lagrange multiplier tests in linear instrumental variables regression. Econometrica, 80(6):2649–2666, 2012.

ivmodels.tests.conditional_likelihood_ratio_test(Z, X, y, beta, W=None, C=None, D=None, fit_intercept=True, critical_values='londschien2025exact', tol=1e-06, num_samples=100000)