# Estimating Generalization under Distribution Shifts via Domain-Invariant Representations

Ching-Yao Chuang
MIT CSAIL
Antonio Torralba
MIT CSAIL
Stefanie Jegelka
MIT CSAIL

[Paper] [GitHub Code]

## Abstract

When machine learning models are deployed on a test distribution different from the training distribution, they can perform poorly, and we may overestimate their performance. In this work, we aim to better estimate a model's performance under distribution shift, without supervision. To do so, we use a set of domain-invariant predictors as a proxy for the unknown, true target labels. Since the error of the resulting risk estimate depends on the target risk of the proxy model, we study the generalization of domain-invariant representations and show that the complexity of the latent representation has a significant influence on the target risk. Empirically, our approach (1) enables self-tuning of domain adaptation models, and (2) accurately estimates the target error of given models under distribution shift. Other applications include model selection, determining early stopping, and error detection.

## Estimating Generalization: Main Idea

In practice, machine learning models are deployed on a test distribution different from the training distribution. While we may hope that the model generalizes to this new data distribution, estimating empirically how well a given model will actually generalize is challenging without labels.

Our goal is to estimate the error of a given, learned model $h$ on a new (target) distribution, without observing true labels. In particular, we aim to estimate the risk in the unlabeled target domain with joint distribution $p_{T}$ over data and labels $X, Y$, measured by a loss function $\ell$ (here, the zero-one loss): $R_{T}(h) = \mathbb{E}_{x,y \sim p_{T}}[\ell(h(x), y)]$. The main idea underlying our approach is to obtain an upper bound on the risk in the new distribution by replacing $y$ with candidates from a set of proxy models $\mathcal{P}$ that we also call check models.

**Upper bound on target risk**
$R_{T}(h) \leq \underbrace{\sup_{h^{\prime } \in \mathcal{P}}R_{T}(h, h^{\prime})}_{\text{Proxy Risk}} + \underbrace{\inf_{h^{\prime} \in \mathcal{P}}R_{T}(h^{\prime})}_{\text{Bias}}.$

Let $R_{D}(h, h^{\prime}) = \mathbb{E}_{x \sim D}[\ell(h(x), h^{\prime}(x))]$ denote the expected disagreement between two hypotheses on a distribution $D$, and extend the notation so that $R_{D}(h) = R_{D}(h, h_{\text{true}})$, where $h_{\text{true}}$ is the true labeling function. The first term in the upper bound measures the maximal disagreement (risk) between the hypothesis $h$ and a check model $h^{\prime} \in \mathcal{P}$, instead of $y$. The second term measures how good the check models are. The proxy risk can be estimated empirically from unlabeled target data. If the bias term is small, i.e., the check set contains a good hypothesis, then the proxy risk itself is a good estimate of an upper bound on $R_T(h)$. It remains to determine the set $\mathcal{P}$.
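As a toy illustration of why only unlabeled target data is needed, the proxy risk can be estimated from target inputs alone; a minimal sketch, with hypothetical 1-D threshold classifiers standing in for the learned model and the check models:

```python
import numpy as np

def disagreement(h, h_prime, x):
    """Empirical disagreement R_T(h, h') under the zero-one loss,
    estimated on unlabeled target inputs x."""
    return np.mean(h(x) != h_prime(x))

def proxy_risk(h, check_models, x_target):
    """Supremum over the check set P of the disagreement with h."""
    return max(disagreement(h, h_prime, x_target) for h_prime in check_models)

# Hypothetical 1-D threshold classifiers standing in for learned models.
h = lambda x: (x > 0.0).astype(int)               # model under evaluation
check_models = [lambda x, t=t: (x > t).astype(int) for t in (-0.2, 0.1, 0.3)]

rng = np.random.default_rng(0)
x_target = rng.normal(0.5, 1.0, size=10_000)      # unlabeled target inputs
print(proxy_risk(h, check_models, x_target))
```

Note that no target labels appear anywhere: the proxy risk only compares predictions of $h$ against predictions of the check models.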

**Estimation error of proxy risk**
$\underbrace{|\sup_{h^{\prime} \in \mathcal{P}} R_{T}(h, h^{\prime}) - R_{T}(h)|}_{\text{Estimation Error}} \leq \sup_{h^{\prime} \in \mathcal{P}}R_{T}(h^{\prime}).$

The upper bound shows that the target risk of the check models controls the error of estimating the target risk via the proxy risk. This motivates using domain adaptation models as check models, because they are designed to minimize the target risk. In this work, the check models are a set of domain-invariant models that achieve a low domain-invariant representation (DIR) objective value.
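On synthetic data where target labels happen to be known, the estimation-error bound can be checked numerically; a small sketch (the labels are used only to verify the bound, never to compute the proxy risk):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=5_000)
y = (x > 0.2).astype(int)                         # synthetic target labels

h = lambda x: (x > 0.0).astype(int)               # model under evaluation
checks = [lambda x, t=t: (x > t).astype(int) for t in (0.1, 0.15, 0.3)]

risk = lambda f: np.mean(f(x) != y)               # target risk R_T(f)
disagree = lambda f: np.mean(h(x) != f(x))        # disagreement R_T(h, f)

proxy = max(disagree(c) for c in checks)          # proxy risk
estimation_error = abs(proxy - risk(h))
worst_check_risk = max(risk(c) for c in checks)   # sup_{h' in P} R_T(h')
assert estimation_error <= worst_check_risk       # the bound holds
```

The assertion holds by the triangle inequality for the zero-one loss, so the check passes regardless of the random draw.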

**Domain-Invariant Check Models**
$\mathcal{P}_{\mathcal{FG}}^{\epsilon} = \{h = fg \in \mathcal{FG} \mid R_{S}(h) + \alpha\, d(p_{S}^{g}(Z), p_{T}^{g}(Z)) \leq \epsilon \}.$

Here, each hypothesis $h = fg$ decomposes into an embedding $g$ and a predictor $f$, $d$ is a distance between the source and target representation distributions $p_{S}^{g}(Z)$ and $p_{T}^{g}(Z)$, and $\epsilon$ bounds the DIR objective, i.e., the source risk plus the invariance penalty.

To compute the proxy risk $\sup_{h^{\prime} \in \mathcal{P}_{\mathcal{FG}}^{\epsilon}} R_{T}(h, h^{\prime})$, we maximize the disagreement under the domain-invariance constraint. Computationally, it is more convenient to replace the constraint with a penalty via a Lagrangian relaxation:

**Computing Proxy Risk**
$\max_{f^{\prime}g^{\prime} \in \mathcal{FG}} R_{T}(h, f^{\prime}g^{\prime}) - \lambda \left( R_{S}(f^{\prime}g^{\prime}) + \alpha\, d(p_{S}^{g^{\prime}}(Z), p_{T}^{g^{\prime}}(Z)) \right).$
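A minimal numerical sketch of this relaxed objective, assuming for illustration a linear-kernel MMD (distance between representation means) as the divergence $d$; the function names and toy models here are hypothetical stand-ins, not the paper's implementation:

```python
import numpy as np

def mean_embedding_distance(z_s, z_t):
    """Linear-kernel MMD: Euclidean distance between the mean source
    and target representations (a simple stand-in for d)."""
    return np.linalg.norm(z_s.mean(axis=0) - z_t.mean(axis=0))

def relaxed_objective(h, h_prime, g_prime, x_s, y_s, x_t, lam=1.0, alpha=1.0):
    """Lagrangian-relaxed proxy-risk objective for a candidate check model
    h' = f'g': reward disagreement with h on target inputs, penalize source
    error and non-invariance of the representation g'."""
    disagree = np.mean(h(x_t) != h_prime(x_t))            # R_T(h, f'g')
    source_risk = np.mean(h_prime(x_s) != y_s)            # R_S(f'g')
    invariance = mean_embedding_distance(g_prime(x_s), g_prime(x_t))
    return disagree - lam * (source_risk + alpha * invariance)
```

In practice this objective is maximized over $f^{\prime}g^{\prime}$ by gradient-based training of neural networks; evaluating it at a single candidate, as above, only illustrates the quantity being optimized.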

## Understanding the Adaptability of DIR

To understand the tightness of our proxy-risk-based estimate, we take a closer look at what affects the target risk of domain-invariant representations. We show that the complexity of the representation encoder has a significant effect on target generalization. To probe this effect, we fix the predictor class and vary the number of layers in the embedding, and observe that there exists an optimal embedding complexity. See our paper for the full theoretical analysis.

## Applications

Empirically, our approach (1) enables self-tuning of domain adaptation models, and (2) accurately estimates the target error of given models under distribution shift. Other applications include model selection, determining early stopping, and error detection.

## Paper

Published at ICML 2020. [arXiv]

## Citation

Ching-Yao Chuang, Antonio Torralba, Stefanie Jegelka. "Estimating Generalization under Distribution Shifts via Domain-Invariant Representations." In International Conference on Machine Learning (ICML), 2020.

## Code: GitHub

### BibTeX

```bibtex
@inproceedings{chuang2020estimating,
  title={Estimating Generalization under Distribution Shifts via Domain-Invariant Representations},
  author={Chuang, Ching-Yao and Torralba, Antonio and Jegelka, Stefanie},
  booktitle={International Conference on Machine Learning},
  year={2020}
}
```

