ivmodels.tests package
Submodules
ivmodels.tests.anderson_rubin module
- ivmodels.tests.anderson_rubin.anderson_rubin_test(Z, X, y, beta, W=None, C=None, D=None, critical_values='chi2', fit_intercept=True)
Perform the Anderson Rubin test [Anderson and Rubin, 1949].
Test the null hypothesis that the residuals are uncorrelated with the instruments. If
WisNone, the test statistic is defined as\[\mathrm{AR}(\beta) := \frac{n - k}{k} \frac{\| P_Z (y - X \beta) \|_2^2}{\| M_Z (y - X \beta) \|_2^2},\]where \(P_Z\) is the projection matrix onto the column space of \(Z\) and \(M_Z = \mathrm{Id} - P_Z\).
Under the null and normally distributed errors, this test statistic is distributed as \(F_{k, n - k}\), where \(k\) is the number of instruments and \(n\) is the number of observations. The statistic is asymptotically distributed as \(\chi^2(k) / k\) under the null and non-normally distributed errors, even for weak instruments.
If
Wis notNone, the test statistic is\[\begin{split}\mathrm{AR}(\beta) &:= \min_\gamma \frac{n - k}{k - m_W} \frac{\| P_Z (y - X \beta - W \gamma) \|_2^2}{\| M_Z (y - X \beta - W \gamma) \|_2^2} \\ &= \frac{n - k}{k - m_W} \frac{\| P_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|_2^2}{\| M_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|_2^2},\end{split}\]where \(\hat\gamma_\mathrm{LIML}\) is the LIML estimate using instruments \(Z\), covariates \(W\) and outcomes \(y - X \beta\). Under the null, this test statistic is asymptotically bounded from above by a random variable that is distributed as \(\frac{1}{k - m_W} \chi^2(k - m_W)\), where \(r = \mathrm{dim}(W)\). See [Guggenberger et al., 2012].
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
beta (np.ndarray of dimension (mx + md,)) – Coefficients to test.
W (np.ndarray of dimension (n, mw) or None, optional, default = None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default = None) – Exogenous regressors of interest.
critical_values (str, optional, default = "chi2") – If
"chi2", use the \(\chi^2(k - m_W)\) distribution to compute the p-value. If"f", use the \(F_{k - m_W, n - k}\) distribution to compute the p-value. If"guggenberger"or"GKM", use the critical value function proposed by Guggenberger et al. [2019] to compute the p-value.fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.
- Returns:
statistic (float) – The test statistic \(\mathrm{AR}(\beta)\).
p_value (float) – The p-value of the test.
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
References
[AR49] (1,2)Theodore W Anderson and Herman Rubin. Estimation of the parameters of a single equation in a complete system of stochastic equations. The Annals of mathematical statistics, 20(1):46–63, 1949.
- ivmodels.tests.anderson_rubin.inverse_anderson_rubin_test(Z, X, y, alpha=0.05, W=None, C=None, D=None, critical_values='chi2', fit_intercept=True, tol=1e-06, max_eval=100)
Return the quadric for to the inverse Anderson-Rubin test’s acceptance region.
The returned quadric satisfies
quadric(x) <= 0if and only ifanderson_rubin_test(Z, X, y, beta=x, W=W)[1] > alpha. It is thus a confidence region for the causal parameter corresponding to the endogenous regressors of interestX.If
WisNone, let \(q := \frac{k}{n-k}F_{F(k, n-k}(1 - \alpha)\), where \(F_{F(k, n-k)}\) is the cumulative distribution function of the \(F(k, n-k)\) distribution. The quadric is defined as\[\begin{split}\mathrm{AR}(\beta) = \frac{n - k}{k} \frac{\| P_Z (y - X \beta) \|_2^2}{\| M_Z (y - X \beta) \|_2^2} \leq F_{F(k, n-k)}(1 - \alpha) \\ \Leftrightarrow \beta^T X^T (P_Z - q M_Z) X \beta - 2 y^T (P_Z - q M_Z) X \beta + y^T (P_Z - q M_Z) y \leq 0.\end{split}\]If
Wis notNone, let \(q := \frac{k - m_W}{n-k}F_{F(k - m_W, n-k)}(1 - \alpha)\). The quadric is defined as\[\mathrm{AR}(\beta) = \min_\gamma \frac{n - k}{k - m_W} \frac{\| P_Z (y - X \beta - W \gamma) \|_2^2}{\| M_Z (y - X \beta - W \gamma) \|_2^2} \leq F_{F(k - m_W, n-k)}(1 - \alpha).\]- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
alpha (float) – Significance level.
W (np.ndarray of dimension (n, mw) or None, optional, default = None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default = None) – Exogenous regressors of interest.
critical_values (str, optional, default = "chi2") – If
"chi2", use the \(\chi^2(k - m_W)\) distribution to compute the p-value. If"f", use the \(F_{k - m_W, n - k}\) distribution to compute the p-value. If"gkm", use the critical value function proposed by Guggenberger et al. [2019].fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.
tol (float, optional, default = 1e-6) – Tolerance for the root finding algorithm when critical_values is
"gkm".max_eval (int, optional, default = 1000) – Maximum number of evaluations for the root finding algorithm when critical_values is
"gkm".
- Returns:
The quadric for the acceptance region.
- Return type:
- ivmodels.tests.anderson_rubin.more_powerful_subvector_anderson_rubin_critical_value_function(z, kappa_1_hat, k, mw)
Implement the critical value function proposed by :cite:t`guggenberger2019more`.
This is done by numerically integrating the approximate conditional density from Equation (A.21) of [Guggenberger et al., 2019].
- Parameters:
z (float) – The test statistic.
kappa_1_hat (float) – The maximum eigenvalue of the matrix \(M:=((W \ y - X \beta)^T M_Z (W \ y - X \beta))^{-1} (W \ y-X \beta)^T P_Z (W \ y-X \beta)\). This is the conditioning statistic.
k (int) – Number of instruments.
mw (int) – Number of endogenous regressors not of interest, i.e., \(\mathrm{dim}(W)\).
- Returns:
p_value – The (approximate) p-value of observing \(\lambda_\mathrm{min}(M) > z\) given \(\lambda_\mathrm{max}(M) = \hat\kappa_1\) under the null \(\beta = \beta_0\).
- Return type:
float
References
ivmodels.tests.conditional_likelihood_ratio module
- ivmodels.tests.conditional_likelihood_ratio.conditional_likelihood_ratio_critical_value_function(k, mx, md, lambdas, z, critical_values='londschien2025exact', tol=1e-06, num_samples=100000)
Approximate the critical value function of the conditional likelihood ratio test.
If
critical_valuesis"londschien2025exact", computes the exact distribution from Theorem 1 of [Londschien, 2025] using Monte Carlo simulation:\[\mathrm{CLR}(\beta_0) \overset{d}{\to} \sum_{i=0}^m q_i - \mu_\mathrm{min},\]where \(q_0 \sim \chi^2(k-m)\), \(q_1, \ldots, q_m \sim \chi^2(1)\) are independent, and \(\mu_\mathrm{min}\) is the smallest root of the polynomial
\[p(\mu) := \left(\mu - \sum_{i=0}^m q_i\right) \prod_{i=1}^m (\mu - \lambda_i) - \sum_{i=1}^m \lambda_i q_i \prod_{j \geq 1, j \neq i} (\mu - \lambda_j).\]This is computed by Monte Carlo simulation, generating samples of the \(q_i\) and solving for \(\mu_\mathrm{min}\) using Newton’s method.
If
critical_valuesis"kleibergen2007generalizing"or"moreira2003conditional", uses the upper bound from Corollary 2 of [Londschien, 2025]:\[\mathrm{CLR}(\beta_0) \leq \Gamma(k-m, m, \lambda_1),\]where
\[\Gamma(k-m, m, \lambda_1) := \frac{1}{2} \left( Q_{k-m} + Q_m - \lambda_1 + \sqrt{ (Q_{k-m} + Q_m + \lambda_1)^2 - 4 Q_{k-m} \lambda_1 } \right),\]with \(Q_{k-m} \sim \chi^2(k-m)\), \(Q_m \sim \chi^2(m)\) independent, and \(\lambda_1\) the smallest eigenvalue. This is computed by numerical integration using the formulation
\[\mathbb{P}[\Gamma(k-m, m, \lambda_1) > z] = \mathbb{E}_{B \sim \mathrm{Beta}((k-m)/2, m/2)}[F_{\chi^2(k)}(z/(1-aB))],\]where \(F_{\chi^2(k)}\) is the CDF of \(\chi^2(k)\) and \(a = \lambda_1/(z + \lambda_1)\).
- Parameters:
k (int) – Number of instruments.
mx (int) – Number of endogenous variables.
md (int) – Number of included exogenous variables.
lambdas (array_like) – Eigenvalues of the concentration matrix.
z (float) – Test statistic.
critical_values ({"londschien2025exact", "kleibergen2007generalizing", "moreira2003conditional"}, default="londschien2025exact") – Which critical values to use. If
"londschien2025exact", uses the exact distribution conditional on all eigenvalues via Monte Carlo simulation. If"kleibergen2007generalizing"or"moreira2003conditional", uses the upper bound conditional on the smallest eigenvalue via numerical integration.tol (float, default=1e-6) – Tolerance for the approximation of the CDF and thus the p-value.
num_samples (int, default=100_000) – Number of Monte Carlo samples when using
"londschien2025exact".
- Returns:
The p-value \(\mathbb{P}[\mathrm{CLR} > z]\).
- Return type:
float
References
[Hil09a]Grant Hillier. Exact properties of the conditional likelihood ratio test in an IV regression model. Econometric Theory, 25(4):915–957, 2009.
[Hil09b]Grant Hillier. On the conditional likelihood ratio test for several parameters in IV regression. Econometric Theory, 25(2):305–335, 2009.
- ivmodels.tests.conditional_likelihood_ratio.conditional_likelihood_ratio_test(Z, X, y, beta, W=None, C=None, D=None, fit_intercept=True, critical_values='londschien2025exact', tol=1e-06, num_samples=10000)
Perform the conditional likelihood ratio test for
beta.If
WisNone, the test statistic is defined as\[\begin{split}\mathrm{CLR}(\beta) &:= (n - k) \frac{ \| P_Z (y - X \beta) \|_2^2}{ \| M_Z (y - X \beta) \|_2^2} - (n - k) \frac{ \| P_Z (y - X \hat\beta_\mathrm{LIML}) \|_2^2 }{ \| M_Z (y - X \hat\beta_\mathrm{LIML}) \|_2^2 } \\ &= k \cdot \mathrm{AR}(\beta) - k \cdot \min_\beta \mathrm{AR}(\beta),\end{split}\]where \(P_Z\) is the projection matrix onto the column space of \(Z\), \(M_Z = \mathrm{Id} - P_Z\), and \(\hat\beta_\mathrm{LIML}\) is the LIML estimator of \(\beta\) (see
KClass), minimizing the Anderson-Rubin test statistic \(\mathrm{AR}(\beta)\) (seeanderson_rubin_test()) at\[\mathrm{AR}(\hat\beta_\mathrm{LIML}) = \frac{n - k}{k} \lambda_\mathrm{min}\left( \left(\begin{pmatrix} X & y \end{pmatrix}^T M_Z \begin{pmatrix} X & y \end{pmatrix}\right)^{-1} \begin{pmatrix} X & y \end{pmatrix}^T P_Z \begin{pmatrix} X & y \end{pmatrix} \right).\]Let
\[\tilde X(\beta) := X - (y - X \beta) \cdot \frac{(y - X \beta)^T M_Z X}{(y - X \beta)^T M_Z (y - X \beta)}\]and let \(\lambda_1, \ldots, \lambda_m\) be the eigenvalues of
\[(n - k) \cdot \left[\tilde X(\beta)^T M_Z \tilde X(\beta)\right]^{-1} \tilde X(\beta)^T P_Z \tilde X(\beta).\]If
critical_values="londschien2025exact", the exact asymptotic distribution from Theorem 1 of [Londschien, 2025] is used:\[\mathrm{CLR}(\beta_0) \overset{d}{\to} \sum_{i=0}^m q_i - \mu_\mathrm{min},\]where \(q_0 \sim \chi^2(k-m)\), \(q_1, \ldots, q_m \sim \chi^2(1)\), and \(\mu_\mathrm{min}\) is the smallest root of the polynomial
\[p(\mu) := \left(\mu - \sum_{i=0}^m q_i\right) \prod_{i=1}^m (\mu - \lambda_i) - \sum_{i=1}^m \lambda_i q_i \prod_{j \geq 1, j \neq i} (\mu - \lambda_j).\]This distribution is conditional on all eigenvalues \(\lambda_1, \ldots, \lambda_m\) and provides substantially more power when eigenvalues differ.
If
critical_valuesis"kleibergen2007generalizing"or"moreira2003conditional", uses the upper bound conditional on only the smallest eigenvalue \(\lambda_1\):\[\mathrm{CLR}(\beta_0) \leq \frac{1}{2} \left( Q_{m_X} + Q_{k - m_X} - \lambda_1 + \sqrt{ (Q_{m_X} + Q_{k - m_X} + \lambda_1)^2 - 4 Q_{k - m_X} \lambda_1 } \right),\]where \(Q_{m_X} \sim \chi^2(m_X)\) and \(Q_{k - m_X} \sim \chi^2(k - m_X)\) are independent. This bound is sharp when all eigenvalues are equal.
This test is robust to weak instruments. If identification is strong (\(\lambda_i \to \infty\)), the test is equivalent to the likelihood ratio test with \(\chi^2(m_X)\) distribution. If identification is weak (\(\lambda_i \to 0\)), the test is equivalent to the Anderson-Rubin test with \(\chi^2(k)\) distribution.
If
Wis notNone, the test statistic is defined as\[\begin{split}\mathrm{CLR}(\beta) &:= (n - k) \min_\gamma \frac{ \| P_Z (y - X \beta - W \gamma) \|_2^2}{ \| M_Z (y - X \beta - W \gamma) \|_2^2} - (n - k) \min_{\beta, \gamma} \frac{ \| P_Z (y - X \beta - W \gamma) \|_2^2 }{ \| M_Z (y - X \beta - W \gamma) \|_2^2 } \\ &= (n - k) \frac{ \| P_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|_2^2}{ \| M_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|_2^2} - (n - k) \frac{ \| P_Z (y - \begin{pmatrix} X & W \end{pmatrix} \hat\delta_\mathrm{LIML}) \|_2^2 }{ \| M_Z (y - \begin{pmatrix} X & W \end{pmatrix} \hat\delta_\mathrm{LIML}) \|_2^2 },\end{split}\]where \(\hat\gamma_\mathrm{LIML}\) and \(\hat\delta_\mathrm{LIML}\) are LIML estimators. In this case, only the upper bound method is available (
critical_valuesmust be"kleibergen2007generalizing"or"moreira2003conditional").- Parameters:
Z (array_like of shape (n, k)) – Instruments.
X (array_like of shape (n, mx)) – Endogenous regressors of interest.
y (array_like of shape (n,)) – Outcomes.
beta (array_like of shape (mx + md,)) – Coefficients to test.
W (array_like of shape (n, mw) or None, default=None) – Endogenous regressors not of interest.
C (array_like of shape (n, mc) or None, default=None) – Exogenous regressors not of interest.
D (array_like of shape (n, md) or None, default=None) – Exogenous regressors of interest. Will be included into both
XandZif supplied.fit_intercept (bool, default=True) – Whether to include an intercept. This is equivalent to centering the inputs.
critical_values ({"londschien2025exact", "kleibergen2007generalizing", "moreira2003conditional"}, default="londschien2025exact") – Which critical values to use. If
"londschien2025exact", uses the exact distribution conditional on all eigenvalues via Monte Carlo simulation of the polynomial root \(\mu_\mathrm{min}\). Only available whenWisNone. If"kleibergen2007generalizing"or"moreira2003conditional", uses the upper bound conditional on the smallest eigenvalue via numerical integration.tol (float, default=1e-6) – Tolerance for the approximation of the CDF and thus the p-value.
num_samples (int, default=10000) – Number of Monte Carlo samples when using
"londschien2025exact".
- Returns:
statistic (float) – The test statistic \(\mathrm{CLR}(\beta)\).
p_value (float) – The p-value of the test, correct up to tolerance
tol.
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
References
[Kle07]Frank Kleibergen. Generalizing weak instrument robust IV statistics towards multiple parameters, unrestricted covariance matrices and identification statistics. Journal of Econometrics, 139(1):181–216, 2007.
[Kle21]Frank Kleibergen. Efficient size correct subset inference in homoskedastic linear instrumental variables regression. Journal of Econometrics, 221(1):78–96, 2021.
[Lon25] (1,2,3,4)Malte Londschien. The exact distribution of the conditional likelihood-ratio test in instrumental variables regression. arXiv preprint, 2025.
[Mor03]Marcelo J Moreira. A conditional likelihood ratio test for structural models. Econometrica, 71(4):1027–1048, 2003.
- ivmodels.tests.conditional_likelihood_ratio.inverse_conditional_likelihood_ratio_test(Z, X, y, alpha=0.05, W=None, C=None, D=None, fit_intercept=True, tol=1e-06, max_value=1000000.0, max_eval=100)
Return an approximation of the confidence set by inversion of the CLR test.
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
alpha (float, optional, default = 0.05) – Significance level of the test.
W (np.ndarray of dimension (n, mw) or None, optional, default = None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, 0) or None, optional, default = None) – Exogenous regressors of interest. Not supported for this test.
fit_intercept (bool, optional, default: True) – Whether to include an intercept. This is equivalent to centering the inputs.
tol (float, optional, default: 1e-4) – The boundaries of the confidence set are computed up to this tolerance.
max_value (float, optional, default: 1e6) – The maximum value to consider when searching for the boundaries of the confidence set. That is, if the true confidence set is of the form [0, max_value + 1], the confidence returned set will be [0, np.inf].
max_eval (int, optional, default: 1000) – The maximum number of evaluations of the critical value function to find the boundaries of the confidence set.
ivmodels.tests.j module
- ivmodels.tests.j.j_test(Z, X, y, C=None, fit_intercept=True, estimator='liml')
Perform the J-test for overidentifying restrictions.
This is also called Sargan–Hansen test or Sargan’s J test.
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Endogenous regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Exogenous regressors.
fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.
estimator (str, optional, default = 'liml') – Estimator to use. Passed to
KClass.
- Returns:
statistic (float) – The test statistic \(J\).
p_value (float) – The p-value of the test.
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
ivmodels.tests.lagrange_multiplier module
- class ivmodels.tests.lagrange_multiplier.MemoizeJacHess(fun)
Bases:
MemoizeJacCache the return values of a function returning (fun, grad, hess).
- hessian(x, *args)
- ivmodels.tests.lagrange_multiplier.inverse_lagrange_multiplier_test(Z, X, y, alpha=0.05, W=None, C=None, D=None, fit_intercept=True, tol=1e-06, max_value=1000000.0, max_eval=100)
Return an approximation of the confidence set by inversion of the LM test.
This is only implemented if mx + md = 1. The confidence set is computed by a root finding algorithm, see the docs of
_find_roots()for more details.- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors of interest.
y (np.ndarray of dimension (n,)) – Outcomes.
alpha (float, optional, default=0.05) – Significance level. The confidence level is
1 - alpha.W (np.ndarray of dimension (n, mw) or None, optional, default=None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default=None) – Exogenous regressors of interest.
fit_intercept (bool, optional, default=True) – Whether to fit an intercept. This is equivalent to centering the inputs.
tol (float, optional, default=1e-4) – Tolerance for the root finding algorithm.
max_value (float, optional, default=1e8) – Maximum value for the root finding algorithm. Returns a confidence set with infinite bounds if the algorithm reaches this value.
max_eval (int, optional, default=1000) – Maximum number of evaluations of the statistic for the root finding algorithm.
- ivmodels.tests.lagrange_multiplier.lagrange_multiplier_test(Z, X, y, beta, W=None, C=None, D=None, fit_intercept=True, **kwargs)
Perform the Lagrange multiplier test for
betaby Kleibergen [2002].Test the null hypothesis that the residuals are uncorrelated with the instruments. If
WisNone, let\[\tilde X(\beta) := X - (y - X \beta) \frac{(y - X \beta) M_Z X}{(y - X \beta) M_Z (y - X \beta)}.\]The test statistic is
\[\mathrm{LM}(\beta) := (n - k) \frac{\| P_{P_Z \tilde X(\beta)} (y - X \beta) \|_2^2}{\| M_Z (y - X \beta) \|_2^2}.\]If
Wis notNone, let\[\tilde S(\beta, \gamma) := (X \ W) - (y - X \beta - W \gamma) \frac{(y - X \beta - W \gamma) M_Z (X \ W)}{(y - X \beta - W \gamma) M_Z (y - X \beta - W \gamma)}.\]The test statistic is
\[\mathrm{LM}(\beta) := (n - k) \min_{\gamma} \frac{\| P_{P_Z \tilde S(\beta, \gamma)} (y - X \beta - W \gamma) \|_2^2}{\| M_Z (y - X \beta - W \gamma) \|_2^2}.\]This test statistic is asymptotically distributed as \(\chi^2(m_X)\) under the null, even if the instruments are weak.
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors of interest.
y (np.ndarray of dimension (n,)) – Outcomes.
beta (np.ndarray of dimension (mx + md,)) – Coefficients to test.
W (np.ndarray of dimension (n, mw) or None, optional, default=None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default=None) – Exogenous regressors of interest.
fit_intercept (bool, optional, default=True) – Whether to fit an intercept. This is equivalent to centering the inputs.
- Returns:
statistic (float) – The test statistic \(\mathrm{LM}(\beta)\).
p_value (float) – The p-value of the test. Equal to \(1 - F_{\chi^2(m_X)}(\mathrm{LM}(\beta)\), where \(F_{\chi^2(m_X)}\) is the cumulative distribution function of the \(\chi^2(m_X)\) distribution.
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
ivmodels.tests.likelihood_ratio module
- ivmodels.tests.likelihood_ratio.inverse_likelihood_ratio_test(Z, X, y, alpha=0.05, W=None, C=None, D=None, fit_intercept=True)
Return the quadric for the inverse likelihood ratio test’s acceptance region.
The quadric is defined as
\[\mathrm{LR}(\beta) \leq F_{\chi^2(m_X)}(1 - \alpha).\]- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
alpha (float, optional, default=0.05) – Significance level.
W (np.ndarray of dimension (n, mw) or None, optional, default=None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default=None) – Exogenous regressors of interest.
fit_intercept (bool, optional, default=True) – Whether to fit an intercept. This is equivalent to centering the inputs.
- ivmodels.tests.likelihood_ratio.likelihood_ratio_test(Z, X, y, beta, W=None, C=None, D=None, fit_intercept=True)
Perform the likelihood ratio test for
beta.If
WisNone, the test statistic is defined as\[\begin{split}\mathrm{LR}(\beta) &:= (n - k) \frac{ \| P_Z (y - X \beta) \|_2^2}{ \| M_Z (y - X \beta) \|_2^2} - (n - k) \frac{ \| P_Z (y - X \hat\beta_\mathrm{LIML}) \|_2^2 }{ \| M_Z (y - X \hat\beta_\mathrm{LIML}) \|_2^2 } \\ &= k \ \mathrm{AR}(\beta)) - k \ \mathrm{AR}(\hat\beta_\mathrm{LIML}),\end{split}\]where \(P_Z\) is the projection matrix onto the column space of \(Z\), \(M_Z = \mathrm{Id} - P_Z\), and \(\hat\beta_\mathrm{LIML}\) is the LIML estimator of \(\beta\), minimizing the Anderson-Rubin test statistic \(\mathrm{AR}(\beta)\) (see
anderson_rubin_test()) at \(\mathrm{AR}(\hat\beta_\mathrm{LIML}) = \frac{n - k}{k} (\hat\kappa_\mathrm{LIML} - 1)\).If
Wis notNone, the test statistic is defined as\[\mathrm{LR(\beta)} := (n - k) \frac{ \|P_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|^2_2 }{\| M_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|_2^2 } - (n - k) \frac{\| P_Z (y - (X \ W) \hat\delta_\mathrm{LIML}) \|_2^2 }{ \| M_Z (y - (X \ W) \hat\delta_\mathrm{LIML}) \|_2^2}\]where \(\gamma_\mathrm{LIML}\) is the LIML estimator (see
KClass) using instruments \(Z\), endogenous covariates \(W\), and outcomes \(y - X \beta\) and \(\hat\delta_\mathrm{LIML}\) is the LIML estimator using instruments \(Z\), endogenous covariates \(X \ W\), and outcomes \(y\).Under the null and given strong instruments, the test statistic is asymptotically distributed as \(\chi^2(m_X)\), where \(m_X\) is the number of regressors.
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
beta (np.ndarray of dimension (mx + md,)) – Coefficients to test.
W (np.ndarray of dimension (n, mw) or None, optional, default=None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default=None) – Exogenous regressors of interest.
fit_intercept (bool, optional, default=True) – Whether to fit an intercept. This is equivalent to centering the inputs.
- Returns:
statistic (float) – The test statistic \(\mathrm{LR}(\beta)\).
p_value (float) – The p-value of the test. Equal to \(1 - F_{\chi^2(m_X)}(\mathrm{LR}(\beta)\), where \(F_{\chi^2(m_X)}\) is the cumulative distribution function of the \(\chi^2(m_X)\) distribution.
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
ivmodels.tests.pulse module
- ivmodels.tests.pulse.inverse_pulse_test(Z, X, y, C=None, D=None, alpha=0.05, fit_intercept=True)
Return the quadric for the inverse pulse test’s acceptance region.
- ivmodels.tests.pulse.pulse_test(Z, X, y, beta, C=None, W=None, D=None, fit_intercept=True)
Test proposed by Jakobsen and Peters [2022] with null hypothesis: \(Z\) and \(y - X \beta\) are uncorrelated.
The test statistic is defined as
\[T := n \frac{\| P_Z (y - X \beta) \|_2^2}{\| (y - X \beta) \|_2^2}.\]Under the null, \(T\) is asymptotically distributed as \(\chi^2(k)\). See Section 3.2 of [Jakobsen and Peters, 2022] for details.
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
beta (np.ndarray of dimension (mx + md,)) – Coefficients to test.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
W (np.ndarray of dimension (n, mw) or None, optional, default=None) – Must be None or mw must be 0. No subvector variant of the test is implemented.
fit_intercept (bool, optional, default=True) – Whether to fit an intercept. This is equivalent to centering the inputs.
- Returns:
statistic (float) – The test statistic \(T\).
p_value (float) – The p-value of the test. Equal to \(1 - F_{\chi^2(k)}(T)\), where \(F_{\chi^2(k)}\) is the cumulative distribution function of the \(\chi^2(k)\) distribution.
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
References
ivmodels.tests.rank module
- ivmodels.tests.rank.rank_test(Z, X, C=None, fit_intercept=True)
Perform the Cragg-Donald test for reduced rank [Cragg and Donald, 1997].
Let \(X = Z \Pi + V\) with \(\Pi \in \mathbb{R}^{k \times m_X}\). The Cragg-Donald test is the Wald test with the null hypothesis
\[H_0 := \mathrm{rank}(\Pi) < m_X,\]The test statistic is
\[\mathrm{CD} := \lambda := (n-k) \lambda_\mathrm{min}((X^T M_Z X)^{-1} X^T P_Z X)\]where \(P_Z = Z (Z^T Z)^{-1} Z^T\) is the orthogonal projection onto the column space of \(Z\), \(M_Z = I - P_Z\) is the orthogonal projection onto the orthogonal complement of the column space of \(Z\), and \(\lambda_\mathrm{min}\) is the smallest eigenvalue. Under the null hypothesis, \(\mathrm{CD}\) is asymptotically distributed as \(\chi^2(k - m_X + 1)\).
The Cragg-Donald test is asymptotically equivalent to Anderson [1951]’s likelihood ratio test for reduced rank of \(\Pi\).
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
fit_intercept (bool) – Whether to fit an intercept.
- Returns:
statistic (float) – The test statistic \(\mathrm{CD}\).
p_value (float) – The p-value of the test. Equal to \(1 - F_{\chi^2(k - m_X + 1)}(\mathrm{CD})\), where \(F_{\chi^2(k - m_X + 1)}\) is the cumulative distribution function of the \(\chi^2(k - m_X + 1)\) distribution.
References
ivmodels.tests.residual_prediction module
- ivmodels.tests.residual_prediction.inverse_weak_residual_prediction_test(Z, X, y, C=None, alpha=0.05, robust=False, nonlinear_model=None, fit_intercept=True, train_fraction=None, upper_clipping_quantile=0.9, gamma=0.05, seed=0, tol=1e-06, max_value=1000000.0, max_eval=100)
Compute confidence set for the weak-IV residual prediction test [Scheidegger et al., 2025].
This implements weak-IV robust confidence sets for the causal parameter based on the weak_residual_prediction_test [Scheidegger et al., 2025]. For each candidate \(\beta_0\), it tests
\[H_0(\beta_0): \exists \theta \in \mathbb{R}^q \text{ s.t. } \mathbb{E}[y - X^T \beta_0 - C^T \theta \mid Z, C] = 0,\]and returns the confidence set
\[C_\alpha = \{\beta_0 \mid \mathrm{pval}(\beta_0) \geq \alpha\}.\]Unlike
residual_prediction_test(), this test does not require estimating \(\beta\) via TSLS and therefore remains valid under weak or many instruments. If \(C_\alpha\) is empty, the model is misspecified for all candidate values.The function is only implemented for scalar \(X\) (\(p = 1\)). The confidence set boundaries are located by root-finding starting from the TSLS estimate of \(\beta\) on the full sample.
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, 1)) – Endogenous regressor (must be scalar).
y (np.ndarray of dimension (n,)) – Outcomes.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Included exogenous regressors.
alpha (float, optional, default = 0.05) – Significance level.
robust (bool, optional, default = False) – Whether to use the heteroskedasticity-robust variance estimator.
nonlinear_model (object, optional, default = None) – Object with a
fitandpredictmethod. IfNone, uses ansklearn.ensemble.RandomForestRegressor().fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.
train_fraction (float, optional, default = None) – Fraction of data used to train the nonlinear model. If
None, uses \(\min(0.5, e / \log(n))\).upper_clipping_quantile (float, optional, default = 0.9) – Asymptotic normality requires the nonlinear model’s prediction not to put too much weight in the tails. To avoid this, we clip its “test set” predictions by a certain threshold in absolute value. The threshold is the
upper_clipping_quantileof predictions on the “train” data. Must be between 0 and 1.gamma (float, optional, default = 0.05) – A non-negative scalar. Limits the minimum variance of the test statistic to gamma times the noise level.
seed (int, optional, default = 0) – Seed for the random train / test split.
tol (float, optional, default = 1e-6) – Tolerance for the root-finding algorithm used to locate the confidence set boundaries.
max_value (float, optional, default = 1e6) – Maximum absolute value of \(\beta_0\) to consider. If the confidence set boundary lies beyond this value, the boundary is reported as \(\pm \infty\).
max_eval (int, optional, default = 1000) – Maximum number of evaluations of the test statistic for the root-finding algorithm.
- Returns:
The confidence set \(C_\alpha\).
- Return type:
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
ValueError: – If
Xhas more than one column.ValueError: – If
train_fractionis not in (0, 1).ValueError: – If
nonlinear_modeldoes not have afitandpredictmethod.
References
- ivmodels.tests.residual_prediction.residual_prediction_test(Z, X, y, C=None, robust=False, nonlinear_model=None, fit_intercept=True, train_fraction=None, upper_clipping_quantile=0.9, gamma=0.05, seed=0)
Perform the residual prediction test [Scheidegger et al., 2025] for model specification.
This uses a nonlinear model to test the well specification of an IV model: “Is the linear IV model appropriate for the data”? Formally, the null hypothesis is:
\[H_0: \exists \beta_0 \in \mathbb{R}^p \mathrm{\ such \ that \ } \mathbb{E}[y - X \beta | Z] = 0.\]The tests splits the data according to
train_fractioninto \(y_a, X_a, Z_a\) and \(y_b, X_b, Z_b\) and fits a nonlinear model regressing \(\hat{\varepsilon}_a \sim Z_a\) on the residuals \(\hat{\varepsilon}_a := y_a - X_a \hat \beta_a\) of a two-stage least-squares (TSLS) estimator \(\hat \beta_a: y_a \sim X_a | Z_a\) on the “train” data \(y_a, X_a, Z_a\). The fitted nonlinear model is then used to predict the residuals on the “test” data \(y_b, X_b, Z_b\), yielding weights \(\hat w_b := \mathrm{nonlinear\_model}(Z_b)\). Let \(\hat \varepsilon_b := y_b - X_b \hat \beta_b\) be the residuals of a TSLS estimator \(\hat \beta_b: y_b \sim X_b | Z_b\) on the “test” data and \(\hat \sigma^2\) be an estimate of the variance of \(\hat w_b \cdot \hat \varepsilon_b\) under the null hypothesis. The test statistic is\[T = \frac{1}{\sqrt{n_b}} \frac{w_b^T \hat \varepsilon_b}{\sqrt{\hat \sigma^2}}.\]This is asymptotically standard Gaussian distributed under the null.
See also the test’s R implementation by Cyrill Scheidegger.
To avoid the \(p\)-value lottery due to the random train / test split used in the residual prediction test, Scheidegger et al. [2025] suggest aggregating the \(p\)-values from multiple random splits by taking 2 times the median. This results in a conservative \(p\)-value Meinshausen et al. [2009].
Example
>>> import numpy as np >>> from ivmodels.tests import residual_prediction_test >>> from ivmodels.simulate import simulate_gaussian_iv >>> >>> Z, X, y = ... >>> >>> ps = np.empty(50) >>> for i in range(50): ... _, ps[i] = residual_prediction_test(Z, X, y, seed=i) >>> >>> print(f"Residual prediction test p-value: {2 * np.median(ps):.3f}")
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Included exogenous regressors.
robust (bool or string, optional, default = False) – Whether to use a heteroskedasticity-robust method to estimate the noise level \(\hat \sigma^2\).
nonlinear_model (object, optional, default = None) – Object with a
fitandpredictmethod. IfNone, uses ansklearn.ensemble.RandomForestRegressor().fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.
train_fraction (float, optional, default = None) – Fraction of data to use to train the nonlinear model. Must be between 0 and 1. The remaining data is used to compute the test statistic. If
None, 0.5 or \(e / \log(n)\) is used, whichever is smaller.upper_clipping_quantile (float, optional, default = 0.9) – Asymptotic normality requires the nonlinear model’s prediction not to put too much weight in the tails. To avoid this, we clip its “test set” predictions by a certail threshold in absolute value. The threshold is the
upper_clipping_quantileof predictions on the “train” data. Must be between 0 and 1.gamma (float, optional, default = 0.05) – A non-negative scalar. Limits the minimum variance of the test statistic to gamma times the noise level.
seed (int, optional, default = 0) – Seed used to generate the random train / test split.
- Returns:
statistic (float) – The test statistic \(\mathrm{T}\).
p_value (float) – The p-value of the test.
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
ValueError: – If
train_fractionis not in (0, 1).ValueError: – If
nonlinear_modeldoes not have afitandpredictmethod.
References
- ivmodels.tests.residual_prediction.weak_residual_prediction_test(Z, X, y, beta, C=None, robust=False, nonlinear_model=None, fit_intercept=True, train_fraction=None, upper_clipping_quantile=0.9, gamma=0.05, seed=0)
Perform the weak-IV-robust residual prediction test at a fixed
beta[Scheidegger et al., 2025].Unlike
residual_prediction_test(), this test does not estimate \(\beta\) via TSLS. Instead, it tests the null hypothesis\[H_0(\beta_0): \exists \theta \in \mathbb{R}^q \text{ s.t. } \mathbb{E}[y - X^T \beta_0 - C^T \theta \mid Z, C] = 0\]at the supplied value \(\beta_0\). The test statistic is asymptotically standard Gaussian under the null and remains valid under weak or many instruments.
See also the test’s R implementation by Cyrill Scheidegger (
weak_RPIV_test).- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
beta (np.ndarray of dimension (mx,)) – Coefficients to test.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Included exogenous regressors.
robust (bool, optional, default = False) – Whether to use the heteroskedasticity-robust variance estimator.
nonlinear_model (object, optional, default = None) – Object with a
fitandpredictmethod. IfNone, uses ansklearn.ensemble.RandomForestRegressor().fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.
train_fraction (float, optional, default = None) – Fraction of data used to train the nonlinear model. If
None, uses \(\min(0.5, e / \log(n))\).upper_clipping_quantile (float, optional, default = 0.9) – Asymptotic normality requires the nonlinear model’s prediction not to put too much weight in the tails. To avoid this, we clip its “test set” predictions by a certain threshold in absolute value. The threshold is the
upper_clipping_quantileof predictions on the “train” data. Must be between 0 and 1.gamma (float, optional, default = 0.05) – A non-negative scalar. Limits the minimum variance of the test statistic to gamma times the noise level.
seed (int, optional, default = 0) – Seed for the random train / test split.
- Returns:
statistic (float) – The test statistic.
p_value (float) – The p-value of the test.
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
ValueError: – If
train_fractionis not in (0, 1).ValueError: – If
nonlinear_modeldoes not have afitandpredictmethod.
References
ivmodels.tests.wald module
- ivmodels.tests.wald.inverse_wald_test(Z, X, y, alpha=0.05, W=None, C=None, D=None, estimator='tsls', fit_intercept=True)
Return the quadric for the acceptance region based on asymptotic normality.
If
WisNone, the quadric is defined as\[(\beta - \hat{\beta})^T (X^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) X)^{-1} (\beta - \hat{\beta}) \leq \hat{\sigma}^2 F_{\chi^2(m_X)}(1 - \alpha),\]where \(\hat \beta\) is an estimate of the causal parameter \(\beta_0\) (controlled by the parameter
estimator), \(\hat \sigma^2 = \frac{1}{n - m_X} \| y - X \hat \beta \|^2_2\), \(P_Z\) is the projection matrix onto the column space of \(Z\), \(M_Z\) is the projection matrix onto the orthogonal complement of the column space of \(Z\), and \(F_{\chi^2(m_X)}\) is the cumulative distribution function of the \(\chi^2(m_X)\) distribution.If
Wis notNone, the quadric is defined as\[(\beta - B \hat{\beta})^T (B ((X W)^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) (X W))^{-1} B^T)^{-1} (\beta - B \hat{\beta}) \leq \hat{\sigma}^2 F_{\chi^2(m_X)}(1 - \alpha),\]where \(B \in \mathbb{R}^{m_X \times (m_X + m_W)}\) is a diagonal matrix with 1 on the diagonal and 0 elsewhere.
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
alpha (float) – Significance level.
W (np.ndarray of dimension (n, mw) or None, optional, default = None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default = None) – Exogenous regressors of interest.
estimator (float or str, optional, default = "tsls") – Estimator to use. Passed as
kappaparameter toKClass.fit_intercept (bool, optional, default = True) – Whether to include an intercept. The intercept will be included both in the complete and the (restricted) model. Including an intercept is equivalent to centering the columns of all design matrices.
- ivmodels.tests.wald.wald_test(Z, X, y, beta, W=None, C=None, D=None, estimator='tsls', fit_intercept=True, robust=False)
Test based on asymptotic normality of the TSLS (or LIML) estimator.
If
WisNone, the test statistic is defined as\[\mathrm{Wald}(\beta) := (\beta - \hat{\beta})^T \widehat{\mathrm{Cov}}(\hat\beta)^{-1} (\beta - \hat{\beta}) / \hat{\sigma}^2,\]where \(\hat \beta = \hat \beta(\kappa)\) is a k-class estimator with \(\sqrt{n} (1 - \kappa) \to 0\), \(\widehat{\mathrm{Cov}}(\hat\beta)^{-1} = \frac{1}{n} (X^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) X)^{-1}\), \(\hat \sigma^2 = \frac{1}{n - m_X} \| y - X \hat \beta \|^2_2\) is an estimate of the variance of the errors, \(P_Z\) is the projection matrix onto the column space of \(Z\), and \(M_Z = \mathrm{Id} - P_Z\). Under strong instruments, the test statistic is asymptotically distributed as \(\chi^2(m_X)\) under the null.
If
Wis notNone, the test statistic is defined as\[\mathrm{Wald}(\beta) := (\beta - B \hat{\beta})^T (B ( (X W)^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) (X W) )^{-1} B)^{-1} (\beta - B \hat{\beta}) / \hat{\sigma}^2,\]where \(B \in \mathbb{R}^{m_X \times (m_X + m_W)}\) is a diagonal matrix with 1 on the diagonal and 0 elsewhere.
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
W (np.ndarray of dimension (n, mw) or None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None) – Exogenous regressors of interest.
beta (np.ndarray of dimension (mx + md,)) – Coefficients to test.
estimator (str or float, optional, default = "liml") – Estimator to use. Passed to
kappaargument ofKClass.fit_intercept (bool, optional, default = True) – Whether to include an intercept. The intercept will be included both in the complete and the (restricted) model. Including an intercept is equivalent to centering the columns of all design matrices.
robust (bool, optional, default = False) – Whether to use a robust variance estimator. If
True, the sandwich estimator is used.
- Returns:
statistic (float) – The test statistic \(\mathrm{Wald}\).
p_value (float) – The p-value of the test. Equal to \(1 - F_{\chi^2(m_X)}(Wald)\), where \(F_{\chi^2(m_X)}\) is the cumulative distribution function of the \(\chi^2(m_X)\) distribution.
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
Module contents
- ivmodels.tests.anderson_rubin_test(Z, X, y, beta, W=None, C=None, D=None, critical_values='chi2', fit_intercept=True)
Perform the Anderson Rubin test [Anderson and Rubin, 1949].
Test the null hypothesis that the residuals are uncorrelated with the instruments. If
WisNone, the test statistic is defined as\[\mathrm{AR}(\beta) := \frac{n - k}{k} \frac{\| P_Z (y - X \beta) \|_2^2}{\| M_Z (y - X \beta) \|_2^2},\]where \(P_Z\) is the projection matrix onto the column space of \(Z\) and \(M_Z = \mathrm{Id} - P_Z\).
Under the null and normally distributed errors, this test statistic is distributed as \(F_{k, n - k}\), where \(k\) is the number of instruments and \(n\) is the number of observations. The statistic is asymptotically distributed as \(\chi^2(k) / k\) under the null and non-normally distributed errors, even for weak instruments.
If
Wis notNone, the test statistic is\[\begin{split}\mathrm{AR}(\beta) &:= \min_\gamma \frac{n - k}{k - m_W} \frac{\| P_Z (y - X \beta - W \gamma) \|_2^2}{\| M_Z (y - X \beta - W \gamma) \|_2^2} \\ &= \frac{n - k}{k - m_W} \frac{\| P_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|_2^2}{\| M_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|_2^2},\end{split}\]where \(\hat\gamma_\mathrm{LIML}\) is the LIML estimate using instruments \(Z\), covariates \(W\) and outcomes \(y - X \beta\). Under the null, this test statistic is asymptotically bounded from above by a random variable that is distributed as \(\frac{1}{k - m_W} \chi^2(k - m_W)\), where \(r = \mathrm{dim}(W)\). See [Guggenberger et al., 2012].
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
beta (np.ndarray of dimension (mx + md,)) – Coefficients to test.
W (np.ndarray of dimension (n, mw) or None, optional, default = None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default = None) – Exogenous regressors of interest.
critical_values (str, optional, default = "chi2") – If
"chi2", use the \(\chi^2(k - m_W)\) distribution to compute the p-value. If"f", use the \(F_{k - m_W, n - k}\) distribution to compute the p-value. If"guggenberger"or"GKM", use the critical value function proposed by Guggenberger et al. [2019] to compute the p-value.fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.
- Returns:
statistic (float) – The test statistic \(\mathrm{AR}(\beta)\).
p_value (float) – The p-value of the test.
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
References
[AR49] (1,2)Theodore W Anderson and Herman Rubin. Estimation of the parameters of a single equation in a complete system of stochastic equations. The Annals of mathematical statistics, 20(1):46–63, 1949.
- ivmodels.tests.conditional_likelihood_ratio_test(Z, X, y, beta, W=None, C=None, D=None, fit_intercept=True, critical_values='londschien2025exact', tol=1e-06, num_samples=10000)
Perform the conditional likelihood ratio test for
beta.If
WisNone, the test statistic is defined as\[\begin{split}\mathrm{CLR}(\beta) &:= (n - k) \frac{ \| P_Z (y - X \beta) \|_2^2}{ \| M_Z (y - X \beta) \|_2^2} - (n - k) \frac{ \| P_Z (y - X \hat\beta_\mathrm{LIML}) \|_2^2 }{ \| M_Z (y - X \hat\beta_\mathrm{LIML}) \|_2^2 } \\ &= k \cdot \mathrm{AR}(\beta) - k \cdot \min_\beta \mathrm{AR}(\beta),\end{split}\]where \(P_Z\) is the projection matrix onto the column space of \(Z\), \(M_Z = \mathrm{Id} - P_Z\), and \(\hat\beta_\mathrm{LIML}\) is the LIML estimator of \(\beta\) (see
KClass), minimizing the Anderson-Rubin test statistic \(\mathrm{AR}(\beta)\) (seeanderson_rubin_test()) at\[\mathrm{AR}(\hat\beta_\mathrm{LIML}) = \frac{n - k}{k} \lambda_\mathrm{min}\left( \left(\begin{pmatrix} X & y \end{pmatrix}^T M_Z \begin{pmatrix} X & y \end{pmatrix}\right)^{-1} \begin{pmatrix} X & y \end{pmatrix}^T P_Z \begin{pmatrix} X & y \end{pmatrix} \right).\]Let
\[\tilde X(\beta) := X - (y - X \beta) \cdot \frac{(y - X \beta)^T M_Z X}{(y - X \beta)^T M_Z (y - X \beta)}\]and let \(\lambda_1, \ldots, \lambda_m\) be the eigenvalues of
\[(n - k) \cdot \left[\tilde X(\beta)^T M_Z \tilde X(\beta)\right]^{-1} \tilde X(\beta)^T P_Z \tilde X(\beta).\]If
critical_values="londschien2025exact", the exact asymptotic distribution from Theorem 1 of [Londschien, 2025] is used:\[\mathrm{CLR}(\beta_0) \overset{d}{\to} \sum_{i=0}^m q_i - \mu_\mathrm{min},\]where \(q_0 \sim \chi^2(k-m)\), \(q_1, \ldots, q_m \sim \chi^2(1)\), and \(\mu_\mathrm{min}\) is the smallest root of the polynomial
\[p(\mu) := \left(\mu - \sum_{i=0}^m q_i\right) \prod_{i=1}^m (\mu - \lambda_i) - \sum_{i=1}^m \lambda_i q_i \prod_{j \geq 1, j \neq i} (\mu - \lambda_j).\]This distribution is conditional on all eigenvalues \(\lambda_1, \ldots, \lambda_m\) and provides substantially more power when eigenvalues differ.
If
critical_valuesis"kleibergen2007generalizing"or"moreira2003conditional", uses the upper bound conditional on only the smallest eigenvalue \(\lambda_1\):\[\mathrm{CLR}(\beta_0) \leq \frac{1}{2} \left( Q_{m_X} + Q_{k - m_X} - \lambda_1 + \sqrt{ (Q_{m_X} + Q_{k - m_X} + \lambda_1)^2 - 4 Q_{k - m_X} \lambda_1 } \right),\]where \(Q_{m_X} \sim \chi^2(m_X)\) and \(Q_{k - m_X} \sim \chi^2(k - m_X)\) are independent. This bound is sharp when all eigenvalues are equal.
This test is robust to weak instruments. If identification is strong (\(\lambda_i \to \infty\)), the test is equivalent to the likelihood ratio test with \(\chi^2(m_X)\) distribution. If identification is weak (\(\lambda_i \to 0\)), the test is equivalent to the Anderson-Rubin test with \(\chi^2(k)\) distribution.
If
Wis notNone, the test statistic is defined as\[\begin{split}\mathrm{CLR}(\beta) &:= (n - k) \min_\gamma \frac{ \| P_Z (y - X \beta - W \gamma) \|_2^2}{ \| M_Z (y - X \beta - W \gamma) \|_2^2} - (n - k) \min_{\beta, \gamma} \frac{ \| P_Z (y - X \beta - W \gamma) \|_2^2 }{ \| M_Z (y - X \beta - W \gamma) \|_2^2 } \\ &= (n - k) \frac{ \| P_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|_2^2}{ \| M_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|_2^2} - (n - k) \frac{ \| P_Z (y - \begin{pmatrix} X & W \end{pmatrix} \hat\delta_\mathrm{LIML}) \|_2^2 }{ \| M_Z (y - \begin{pmatrix} X & W \end{pmatrix} \hat\delta_\mathrm{LIML}) \|_2^2 },\end{split}\]where \(\hat\gamma_\mathrm{LIML}\) and \(\hat\delta_\mathrm{LIML}\) are LIML estimators. In this case, only the upper bound method is available (
critical_valuesmust be"kleibergen2007generalizing"or"moreira2003conditional").- Parameters:
Z (array_like of shape (n, k)) – Instruments.
X (array_like of shape (n, mx)) – Endogenous regressors of interest.
y (array_like of shape (n,)) – Outcomes.
beta (array_like of shape (mx + md,)) – Coefficients to test.
W (array_like of shape (n, mw) or None, default=None) – Endogenous regressors not of interest.
C (array_like of shape (n, mc) or None, default=None) – Exogenous regressors not of interest.
D (array_like of shape (n, md) or None, default=None) – Exogenous regressors of interest. Will be included into both
XandZif supplied.fit_intercept (bool, default=True) – Whether to include an intercept. This is equivalent to centering the inputs.
critical_values ({"londschien2025exact", "kleibergen2007generalizing", "moreira2003conditional"}, default="londschien2025exact") – Which critical values to use. If
"londschien2025exact", uses the exact distribution conditional on all eigenvalues via Monte Carlo simulation of the polynomial root \(\mu_\mathrm{min}\). Only available whenWisNone. If"kleibergen2007generalizing"or"moreira2003conditional", uses the upper bound conditional on the smallest eigenvalue via numerical integration.tol (float, default=1e-6) – Tolerance for the approximation of the CDF and thus the p-value.
num_samples (int, default=10000) – Number of Monte Carlo samples when using
"londschien2025exact".
- Returns:
statistic (float) – The test statistic \(\mathrm{CLR}(\beta)\).
p_value (float) – The p-value of the test, correct up to tolerance
tol.
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
References
[Kle07]Frank Kleibergen. Generalizing weak instrument robust IV statistics towards multiple parameters, unrestricted covariance matrices and identification statistics. Journal of Econometrics, 139(1):181–216, 2007.
[Kle21]Frank Kleibergen. Efficient size correct subset inference in homoskedastic linear instrumental variables regression. Journal of Econometrics, 221(1):78–96, 2021.
[Lon25] (1,2,3,4)Malte Londschien. The exact distribution of the conditional likelihood-ratio test in instrumental variables regression. arXiv preprint, 2025.
[Mor03]Marcelo J Moreira. A conditional likelihood ratio test for structural models. Econometrica, 71(4):1027–1048, 2003.
- ivmodels.tests.inverse_anderson_rubin_test(Z, X, y, alpha=0.05, W=None, C=None, D=None, critical_values='chi2', fit_intercept=True, tol=1e-06, max_eval=100)
Return the quadric for to the inverse Anderson-Rubin test’s acceptance region.
The returned quadric satisfies
quadric(x) <= 0if and only ifanderson_rubin_test(Z, X, y, beta=x, W=W)[1] > alpha. It is thus a confidence region for the causal parameter corresponding to the endogenous regressors of interestX.If
WisNone, let \(q := \frac{k}{n-k}F_{F(k, n-k}(1 - \alpha)\), where \(F_{F(k, n-k)}\) is the cumulative distribution function of the \(F(k, n-k)\) distribution. The quadric is defined as\[\begin{split}\mathrm{AR}(\beta) = \frac{n - k}{k} \frac{\| P_Z (y - X \beta) \|_2^2}{\| M_Z (y - X \beta) \|_2^2} \leq F_{F(k, n-k)}(1 - \alpha) \\ \Leftrightarrow \beta^T X^T (P_Z - q M_Z) X \beta - 2 y^T (P_Z - q M_Z) X \beta + y^T (P_Z - q M_Z) y \leq 0.\end{split}\]If
Wis notNone, let \(q := \frac{k - m_W}{n-k}F_{F(k - m_W, n-k)}(1 - \alpha)\). The quadric is defined as\[\mathrm{AR}(\beta) = \min_\gamma \frac{n - k}{k - m_W} \frac{\| P_Z (y - X \beta - W \gamma) \|_2^2}{\| M_Z (y - X \beta - W \gamma) \|_2^2} \leq F_{F(k - m_W, n-k)}(1 - \alpha).\]- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
alpha (float) – Significance level.
W (np.ndarray of dimension (n, mw) or None, optional, default = None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default = None) – Exogenous regressors of interest.
critical_values (str, optional, default = "chi2") – If
"chi2", use the \(\chi^2(k - m_W)\) distribution to compute the p-value. If"f", use the \(F_{k - m_W, n - k}\) distribution to compute the p-value. If"gkm", use the critical value function proposed by Guggenberger et al. [2019].fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.
tol (float, optional, default = 1e-6) – Tolerance for the root finding algorithm when critical_values is
"gkm".max_eval (int, optional, default = 1000) – Maximum number of evaluations for the root finding algorithm when critical_values is
"gkm".
- Returns:
The quadric for the acceptance region.
- Return type:
- ivmodels.tests.inverse_conditional_likelihood_ratio_test(Z, X, y, alpha=0.05, W=None, C=None, D=None, fit_intercept=True, tol=1e-06, max_value=1000000.0, max_eval=100)
Return an approximation of the confidence set by inversion of the CLR test.
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
alpha (float, optional, default = 0.05) – Significance level of the test.
W (np.ndarray of dimension (n, mw) or None, optional, default = None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, 0) or None, optional, default = None) – Exogenous regressors of interest. Not supported for this test.
fit_intercept (bool, optional, default: True) – Whether to include an intercept. This is equivalent to centering the inputs.
tol (float, optional, default: 1e-4) – The boundaries of the confidence set are computed up to this tolerance.
max_value (float, optional, default: 1e6) – The maximum value to consider when searching for the boundaries of the confidence set. That is, if the true confidence set is of the form [0, max_value + 1], the confidence returned set will be [0, np.inf].
max_eval (int, optional, default: 1000) – The maximum number of evaluations of the critical value function to find the boundaries of the confidence set.
- ivmodels.tests.inverse_lagrange_multiplier_test(Z, X, y, alpha=0.05, W=None, C=None, D=None, fit_intercept=True, tol=1e-06, max_value=1000000.0, max_eval=100)
Return an approximation of the confidence set by inversion of the LM test.
This is only implemented if mx + md = 1. The confidence set is computed by a root finding algorithm, see the docs of
_find_roots()for more details.- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors of interest.
y (np.ndarray of dimension (n,)) – Outcomes.
alpha (float, optional, default=0.05) – Significance level. The confidence level is
1 - alpha.W (np.ndarray of dimension (n, mw) or None, optional, default=None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default=None) – Exogenous regressors of interest.
fit_intercept (bool, optional, default=True) – Whether to fit an intercept. This is equivalent to centering the inputs.
tol (float, optional, default=1e-4) – Tolerance for the root finding algorithm.
max_value (float, optional, default=1e8) – Maximum value for the root finding algorithm. Returns a confidence set with infinite bounds if the algorithm reaches this value.
max_eval (int, optional, default=1000) – Maximum number of evaluations of the statistic for the root finding algorithm.
- ivmodels.tests.inverse_likelihood_ratio_test(Z, X, y, alpha=0.05, W=None, C=None, D=None, fit_intercept=True)
Return the quadric for the inverse likelihood ratio test’s acceptance region.
The quadric is defined as
\[\mathrm{LR}(\beta) \leq F_{\chi^2(m_X)}(1 - \alpha).\]- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
alpha (float, optional, default=0.05) – Significance level.
W (np.ndarray of dimension (n, mw) or None, optional, default=None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default=None) – Exogenous regressors of interest.
fit_intercept (bool, optional, default=True) – Whether to fit an intercept. This is equivalent to centering the inputs.
- ivmodels.tests.inverse_pulse_test(Z, X, y, C=None, D=None, alpha=0.05, fit_intercept=True)
Return the quadric for the inverse pulse test’s acceptance region.
- ivmodels.tests.inverse_wald_test(Z, X, y, alpha=0.05, W=None, C=None, D=None, estimator='tsls', fit_intercept=True)
Return the quadric for the acceptance region based on asymptotic normality.
If
WisNone, the quadric is defined as\[(\beta - \hat{\beta})^T (X^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) X)^{-1} (\beta - \hat{\beta}) \leq \hat{\sigma}^2 F_{\chi^2(m_X)}(1 - \alpha),\]where \(\hat \beta\) is an estimate of the causal parameter \(\beta_0\) (controlled by the parameter
estimator), \(\hat \sigma^2 = \frac{1}{n - m_X} \| y - X \hat \beta \|^2_2\), \(P_Z\) is the projection matrix onto the column space of \(Z\), \(M_Z\) is the projection matrix onto the orthogonal complement of the column space of \(Z\), and \(F_{\chi^2(m_X)}\) is the cumulative distribution function of the \(\chi^2(m_X)\) distribution.If
Wis notNone, the quadric is defined as\[(\beta - B \hat{\beta})^T (B ((X W)^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) (X W))^{-1} B^T)^{-1} (\beta - B \hat{\beta}) \leq \hat{\sigma}^2 F_{\chi^2(m_X)}(1 - \alpha),\]where \(B \in \mathbb{R}^{m_X \times (m_X + m_W)}\) is a diagonal matrix with 1 on the diagonal and 0 elsewhere.
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
alpha (float) – Significance level.
W (np.ndarray of dimension (n, mw) or None, optional, default = None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default = None) – Exogenous regressors of interest.
estimator (float or str, optional, default = "tsls") – Estimator to use. Passed as
kappaparameter toKClass.fit_intercept (bool, optional, default = True) – Whether to include an intercept. The intercept will be included both in the complete and the (restricted) model. Including an intercept is equivalent to centering the columns of all design matrices.
- ivmodels.tests.inverse_weak_residual_prediction_test(Z, X, y, C=None, alpha=0.05, robust=False, nonlinear_model=None, fit_intercept=True, train_fraction=None, upper_clipping_quantile=0.9, gamma=0.05, seed=0, tol=1e-06, max_value=1000000.0, max_eval=100)
Compute confidence set for the weak-IV residual prediction test [Scheidegger et al., 2025].
This implements weak-IV robust confidence sets for the causal parameter based on the weak_residual_prediction_test [Scheidegger et al., 2025]. For each candidate \(\beta_0\), it tests
\[H_0(\beta_0): \exists \theta \in \mathbb{R}^q \text{ s.t. } \mathbb{E}[y - X^T \beta_0 - C^T \theta \mid Z, C] = 0,\]and returns the confidence set
\[C_\alpha = \{\beta_0 \mid \mathrm{pval}(\beta_0) \geq \alpha\}.\]Unlike
residual_prediction_test(), this test does not require estimating \(\beta\) via TSLS and therefore remains valid under weak or many instruments. If \(C_\alpha\) is empty, the model is misspecified for all candidate values.The function is only implemented for scalar \(X\) (\(p = 1\)). The confidence set boundaries are located by root-finding starting from the TSLS estimate of \(\beta\) on the full sample.
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, 1)) – Endogenous regressor (must be scalar).
y (np.ndarray of dimension (n,)) – Outcomes.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Included exogenous regressors.
alpha (float, optional, default = 0.05) – Significance level.
robust (bool, optional, default = False) – Whether to use the heteroskedasticity-robust variance estimator.
nonlinear_model (object, optional, default = None) – Object with a
fitandpredictmethod. IfNone, uses ansklearn.ensemble.RandomForestRegressor().fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.
train_fraction (float, optional, default = None) – Fraction of data used to train the nonlinear model. If
None, uses \(\min(0.5, e / \log(n))\).upper_clipping_quantile (float, optional, default = 0.9) – Asymptotic normality requires the nonlinear model’s prediction not to put too much weight in the tails. To avoid this, we clip its “test set” predictions by a certain threshold in absolute value. The threshold is the
upper_clipping_quantileof predictions on the “train” data. Must be between 0 and 1.gamma (float, optional, default = 0.05) – A non-negative scalar. Limits the minimum variance of the test statistic to gamma times the noise level.
seed (int, optional, default = 0) – Seed for the random train / test split.
tol (float, optional, default = 1e-6) – Tolerance for the root-finding algorithm used to locate the confidence set boundaries.
max_value (float, optional, default = 1e6) – Maximum absolute value of \(\beta_0\) to consider. If the confidence set boundary lies beyond this value, the boundary is reported as \(\pm \infty\).
max_eval (int, optional, default = 1000) – Maximum number of evaluations of the test statistic for the root-finding algorithm.
- Returns:
The confidence set \(C_\alpha\).
- Return type:
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
ValueError: – If
Xhas more than one column.ValueError: – If
train_fractionis not in (0, 1).ValueError: – If
nonlinear_modeldoes not have afitandpredictmethod.
References
- ivmodels.tests.j_test(Z, X, y, C=None, fit_intercept=True, estimator='liml')
Perform the J-test for overidentifying restrictions.
This is also called Sargan–Hansen test or Sargan’s J test.
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Endogenous regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Exogenous regressors.
fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.
estimator (str, optional, default = 'liml') – Estimator to use. Passed to
KClass.
- Returns:
statistic (float) – The test statistic \(J\).
p_value (float) – The p-value of the test.
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
- ivmodels.tests.lagrange_multiplier_test(Z, X, y, beta, W=None, C=None, D=None, fit_intercept=True, **kwargs)
Perform the Lagrange multiplier test for
betaby Kleibergen [2002].Test the null hypothesis that the residuals are uncorrelated with the instruments. If
WisNone, let\[\tilde X(\beta) := X - (y - X \beta) \frac{(y - X \beta) M_Z X}{(y - X \beta) M_Z (y - X \beta)}.\]The test statistic is
\[\mathrm{LM}(\beta) := (n - k) \frac{\| P_{P_Z \tilde X(\beta)} (y - X \beta) \|_2^2}{\| M_Z (y - X \beta) \|_2^2}.\]If
Wis notNone, let\[\tilde S(\beta, \gamma) := (X \ W) - (y - X \beta - W \gamma) \frac{(y - X \beta - W \gamma) M_Z (X \ W)}{(y - X \beta - W \gamma) M_Z (y - X \beta - W \gamma)}.\]The test statistic is
\[\mathrm{LM}(\beta) := (n - k) \min_{\gamma} \frac{\| P_{P_Z \tilde S(\beta, \gamma)} (y - X \beta - W \gamma) \|_2^2}{\| M_Z (y - X \beta - W \gamma) \|_2^2}.\]This test statistic is asymptotically distributed as \(\chi^2(m_X)\) under the null, even if the instruments are weak.
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors of interest.
y (np.ndarray of dimension (n,)) – Outcomes.
beta (np.ndarray of dimension (mx + md,)) – Coefficients to test.
W (np.ndarray of dimension (n, mw) or None, optional, default=None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default=None) – Exogenous regressors of interest.
fit_intercept (bool, optional, default=True) – Whether to fit an intercept. This is equivalent to centering the inputs.
- Returns:
statistic (float) – The test statistic \(\mathrm{LM}(\beta)\).
p_value (float) – The p-value of the test. Equal to \(1 - F_{\chi^2(m_X)}(\mathrm{LM}(\beta)\), where \(F_{\chi^2(m_X)}\) is the cumulative distribution function of the \(\chi^2(m_X)\) distribution.
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
- ivmodels.tests.likelihood_ratio_test(Z, X, y, beta, W=None, C=None, D=None, fit_intercept=True)
Perform the likelihood ratio test for
beta.If
WisNone, the test statistic is defined as\[\begin{split}\mathrm{LR}(\beta) &:= (n - k) \frac{ \| P_Z (y - X \beta) \|_2^2}{ \| M_Z (y - X \beta) \|_2^2} - (n - k) \frac{ \| P_Z (y - X \hat\beta_\mathrm{LIML}) \|_2^2 }{ \| M_Z (y - X \hat\beta_\mathrm{LIML}) \|_2^2 } \\ &= k \ \mathrm{AR}(\beta)) - k \ \mathrm{AR}(\hat\beta_\mathrm{LIML}),\end{split}\]where \(P_Z\) is the projection matrix onto the column space of \(Z\), \(M_Z = \mathrm{Id} - P_Z\), and \(\hat\beta_\mathrm{LIML}\) is the LIML estimator of \(\beta\), minimizing the Anderson-Rubin test statistic \(\mathrm{AR}(\beta)\) (see
anderson_rubin_test()) at \(\mathrm{AR}(\hat\beta_\mathrm{LIML}) = \frac{n - k}{k} (\hat\kappa_\mathrm{LIML} - 1)\).If
Wis notNone, the test statistic is defined as\[\mathrm{LR(\beta)} := (n - k) \frac{ \|P_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|^2_2 }{\| M_Z (y - X \beta - W \hat\gamma_\mathrm{LIML}) \|_2^2 } - (n - k) \frac{\| P_Z (y - (X \ W) \hat\delta_\mathrm{LIML}) \|_2^2 }{ \| M_Z (y - (X \ W) \hat\delta_\mathrm{LIML}) \|_2^2}\]where \(\gamma_\mathrm{LIML}\) is the LIML estimator (see
KClass) using instruments \(Z\), endogenous covariates \(W\), and outcomes \(y - X \beta\) and \(\hat\delta_\mathrm{LIML}\) is the LIML estimator using instruments \(Z\), endogenous covariates \(X \ W\), and outcomes \(y\).Under the null and given strong instruments, the test statistic is asymptotically distributed as \(\chi^2(m_X)\), where \(m_X\) is the number of regressors.
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
beta (np.ndarray of dimension (mx + md,)) – Coefficients to test.
W (np.ndarray of dimension (n, mw) or None, optional, default=None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None, optional, default=None) – Exogenous regressors of interest.
fit_intercept (bool, optional, default=True) – Whether to fit an intercept. This is equivalent to centering the inputs.
- Returns:
statistic (float) – The test statistic \(\mathrm{LR}(\beta)\).
p_value (float) – The p-value of the test. Equal to \(1 - F_{\chi^2(m_X)}(\mathrm{LR}(\beta)\), where \(F_{\chi^2(m_X)}\) is the cumulative distribution function of the \(\chi^2(m_X)\) distribution.
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
- ivmodels.tests.pulse_test(Z, X, y, beta, C=None, W=None, D=None, fit_intercept=True)
Test proposed by Jakobsen and Peters [2022] with null hypothesis: \(Z\) and \(y - X \beta\) are uncorrelated.
The test statistic is defined as
\[T := n \frac{\| P_Z (y - X \beta) \|_2^2}{\| (y - X \beta) \|_2^2}.\]Under the null, \(T\) is asymptotically distributed as \(\chi^2(k)\). See Section 3.2 of [Jakobsen and Peters, 2022] for details.
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
beta (np.ndarray of dimension (mx + md,)) – Coefficients to test.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
W (np.ndarray of dimension (n, mw) or None, optional, default=None) – Must be None or mw must be 0. No subvector variant of the test is implemented.
fit_intercept (bool, optional, default=True) – Whether to fit an intercept. This is equivalent to centering the inputs.
- Returns:
statistic (float) – The test statistic \(T\).
p_value (float) – The p-value of the test. Equal to \(1 - F_{\chi^2(k)}(T)\), where \(F_{\chi^2(k)}\) is the cumulative distribution function of the \(\chi^2(k)\) distribution.
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
References
- ivmodels.tests.rank_test(Z, X, C=None, fit_intercept=True)
Perform the Cragg-Donald test for reduced rank [Cragg and Donald, 1997].
Let \(X = Z \Pi + V\) with \(\Pi \in \mathbb{R}^{k \times m_X}\). The Cragg-Donald test is the Wald test with the null hypothesis
\[H_0 := \mathrm{rank}(\Pi) < m_X,\]The test statistic is
\[\mathrm{CD} := \lambda := (n-k) \lambda_\mathrm{min}((X^T M_Z X)^{-1} X^T P_Z X)\]where \(P_Z = Z (Z^T Z)^{-1} Z^T\) is the orthogonal projection onto the column space of \(Z\), \(M_Z = I - P_Z\) is the orthogonal projection onto the orthogonal complement of the column space of \(Z\), and \(\lambda_\mathrm{min}\) is the smallest eigenvalue. Under the null hypothesis, \(\mathrm{CD}\) is asymptotically distributed as \(\chi^2(k - m_X + 1)\).
The Cragg-Donald test is asymptotically equivalent to Anderson [1951]’s likelihood ratio test for reduced rank of \(\Pi\).
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
C (np.ndarray of dimension (n, mc) or None, optional, default=None) – Exogenous regressors not of interest.
fit_intercept (bool) – Whether to fit an intercept.
- Returns:
statistic (float) – The test statistic \(\mathrm{CD}\).
p_value (float) – The p-value of the test. Equal to \(1 - F_{\chi^2(k - m_X + 1)}(\mathrm{CD})\), where \(F_{\chi^2(k - m_X + 1)}\) is the cumulative distribution function of the \(\chi^2(k - m_X + 1)\) distribution.
References
- ivmodels.tests.residual_prediction_test(Z, X, y, C=None, robust=False, nonlinear_model=None, fit_intercept=True, train_fraction=None, upper_clipping_quantile=0.9, gamma=0.05, seed=0)
Perform the residual prediction test [Scheidegger et al., 2025] for model specification.
This uses a nonlinear model to test the well specification of an IV model: “Is the linear IV model appropriate for the data”? Formally, the null hypothesis is:
\[H_0: \exists \beta_0 \in \mathbb{R}^p \mathrm{\ such \ that \ } \mathbb{E}[y - X \beta | Z] = 0.\]The tests splits the data according to
train_fractioninto \(y_a, X_a, Z_a\) and \(y_b, X_b, Z_b\) and fits a nonlinear model regressing \(\hat{\varepsilon}_a \sim Z_a\) on the residuals \(\hat{\varepsilon}_a := y_a - X_a \hat \beta_a\) of a two-stage least-squares (TSLS) estimator \(\hat \beta_a: y_a \sim X_a | Z_a\) on the “train” data \(y_a, X_a, Z_a\). The fitted nonlinear model is then used to predict the residuals on the “test” data \(y_b, X_b, Z_b\), yielding weights \(\hat w_b := \mathrm{nonlinear\_model}(Z_b)\). Let \(\hat \varepsilon_b := y_b - X_b \hat \beta_b\) be the residuals of a TSLS estimator \(\hat \beta_b: y_b \sim X_b | Z_b\) on the “test” data and \(\hat \sigma^2\) be an estimate of the variance of \(\hat w_b \cdot \hat \varepsilon_b\) under the null hypothesis. The test statistic is\[T = \frac{1}{\sqrt{n_b}} \frac{w_b^T \hat \varepsilon_b}{\sqrt{\hat \sigma^2}}.\]This is asymptotically standard Gaussian distributed under the null.
See also the test’s R implementation by Cyrill Scheidegger.
To avoid the \(p\)-value lottery due to the random train / test split used in the residual prediction test, Scheidegger et al. [2025] suggest aggregating the \(p\)-values from multiple random splits by taking 2 times the median. This results in a conservative \(p\)-value Meinshausen et al. [2009].
Example
>>> import numpy as np >>> from ivmodels.tests import residual_prediction_test >>> from ivmodels.simulate import simulate_gaussian_iv >>> >>> Z, X, y = ... >>> >>> ps = np.empty(50) >>> for i in range(50): ... _, ps[i] = residual_prediction_test(Z, X, y, seed=i) >>> >>> print(f"Residual prediction test p-value: {2 * np.median(ps):.3f}")
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Included exogenous regressors.
robust (bool or string, optional, default = False) – Whether to use a heteroskedasticity-robust method to estimate the noise level \(\hat \sigma^2\).
nonlinear_model (object, optional, default = None) – Object with a
fitandpredictmethod. IfNone, uses ansklearn.ensemble.RandomForestRegressor().fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.
train_fraction (float, optional, default = None) – Fraction of data to use to train the nonlinear model. Must be between 0 and 1. The remaining data is used to compute the test statistic. If
None, 0.5 or \(e / \log(n)\) is used, whichever is smaller.upper_clipping_quantile (float, optional, default = 0.9) – Asymptotic normality requires the nonlinear model’s prediction not to put too much weight in the tails. To avoid this, we clip its “test set” predictions by a certail threshold in absolute value. The threshold is the
upper_clipping_quantileof predictions on the “train” data. Must be between 0 and 1.gamma (float, optional, default = 0.05) – A non-negative scalar. Limits the minimum variance of the test statistic to gamma times the noise level.
seed (int, optional, default = 0) – Seed used to generate the random train / test split.
- Returns:
statistic (float) – The test statistic \(\mathrm{T}\).
p_value (float) – The p-value of the test.
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
ValueError: – If
train_fractionis not in (0, 1).ValueError: – If
nonlinear_modeldoes not have afitandpredictmethod.
References
- ivmodels.tests.wald_test(Z, X, y, beta, W=None, C=None, D=None, estimator='tsls', fit_intercept=True, robust=False)
Test based on asymptotic normality of the TSLS (or LIML) estimator.
If
WisNone, the test statistic is defined as\[\mathrm{Wald}(\beta) := (\beta - \hat{\beta})^T \widehat{\mathrm{Cov}}(\hat\beta)^{-1} (\beta - \hat{\beta}) / \hat{\sigma}^2,\]where \(\hat \beta = \hat \beta(\kappa)\) is a k-class estimator with \(\sqrt{n} (1 - \kappa) \to 0\), \(\widehat{\mathrm{Cov}}(\hat\beta)^{-1} = \frac{1}{n} (X^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) X)^{-1}\), \(\hat \sigma^2 = \frac{1}{n - m_X} \| y - X \hat \beta \|^2_2\) is an estimate of the variance of the errors, \(P_Z\) is the projection matrix onto the column space of \(Z\), and \(M_Z = \mathrm{Id} - P_Z\). Under strong instruments, the test statistic is asymptotically distributed as \(\chi^2(m_X)\) under the null.
If
Wis notNone, the test statistic is defined as\[\mathrm{Wald}(\beta) := (\beta - B \hat{\beta})^T (B ( (X W)^T (\kappa P_Z + (1 - \kappa) \mathrm{Id}) (X W) )^{-1} B)^{-1} (\beta - B \hat{\beta}) / \hat{\sigma}^2,\]where \(B \in \mathbb{R}^{m_X \times (m_X + m_W)}\) is a diagonal matrix with 1 on the diagonal and 0 elsewhere.
- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
W (np.ndarray of dimension (n, mw) or None) – Endogenous regressors not of interest.
C (np.ndarray of dimension (n, mc) or None) – Exogenous regressors not of interest.
D (np.ndarray of dimension (n, md) or None) – Exogenous regressors of interest.
beta (np.ndarray of dimension (mx + md,)) – Coefficients to test.
estimator (str or float, optional, default = "liml") – Estimator to use. Passed to
kappaargument ofKClass.fit_intercept (bool, optional, default = True) – Whether to include an intercept. The intercept will be included both in the complete and the (restricted) model. Including an intercept is equivalent to centering the columns of all design matrices.
robust (bool, optional, default = False) – Whether to use a robust variance estimator. If
True, the sandwich estimator is used.
- Returns:
statistic (float) – The test statistic \(\mathrm{Wald}\).
p_value (float) – The p-value of the test. Equal to \(1 - F_{\chi^2(m_X)}(Wald)\), where \(F_{\chi^2(m_X)}\) is the cumulative distribution function of the \(\chi^2(m_X)\) distribution.
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
- ivmodels.tests.weak_residual_prediction_test(Z, X, y, beta, C=None, robust=False, nonlinear_model=None, fit_intercept=True, train_fraction=None, upper_clipping_quantile=0.9, gamma=0.05, seed=0)
Perform the weak-IV-robust residual prediction test at a fixed
beta[Scheidegger et al., 2025].Unlike
residual_prediction_test(), this test does not estimate \(\beta\) via TSLS. Instead, it tests the null hypothesis\[H_0(\beta_0): \exists \theta \in \mathbb{R}^q \text{ s.t. } \mathbb{E}[y - X^T \beta_0 - C^T \theta \mid Z, C] = 0\]at the supplied value \(\beta_0\). The test statistic is asymptotically standard Gaussian under the null and remains valid under weak or many instruments.
See also the test’s R implementation by Cyrill Scheidegger (
weak_RPIV_test).- Parameters:
Z (np.ndarray of dimension (n, k)) – Instruments.
X (np.ndarray of dimension (n, mx)) – Regressors.
y (np.ndarray of dimension (n,)) – Outcomes.
beta (np.ndarray of dimension (mx,)) – Coefficients to test.
C (np.ndarray of dimension (n, mc) or None, optional, default = None) – Included exogenous regressors.
robust (bool, optional, default = False) – Whether to use the heteroskedasticity-robust variance estimator.
nonlinear_model (object, optional, default = None) – Object with a
fitandpredictmethod. IfNone, uses ansklearn.ensemble.RandomForestRegressor().fit_intercept (bool, optional, default = True) – Whether to include an intercept. This is equivalent to centering the inputs.
train_fraction (float, optional, default = None) – Fraction of data used to train the nonlinear model. If
None, uses \(\min(0.5, e / \log(n))\).upper_clipping_quantile (float, optional, default = 0.9) – Asymptotic normality requires the nonlinear model’s prediction not to put too much weight in the tails. To avoid this, we clip its “test set” predictions by a certain threshold in absolute value. The threshold is the
upper_clipping_quantileof predictions on the “train” data. Must be between 0 and 1.gamma (float, optional, default = 0.05) – A non-negative scalar. Limits the minimum variance of the test statistic to gamma times the noise level.
seed (int, optional, default = 0) – Seed for the random train / test split.
- Returns:
statistic (float) – The test statistic.
p_value (float) – The p-value of the test.
- Raises:
ValueError: – If the dimensions of the inputs are incorrect.
ValueError: – If
train_fractionis not in (0, 1).ValueError: – If
nonlinear_modeldoes not have afitandpredictmethod.
References