Rapid Network Adaptation: Learning to Adapt Neural Networks Using Test-Time Feedback

Neural networks are powerful—but brittle under distribution shift. Rapid Network Adaptation (RNA) proposes a different paradigm: instead of anticipating all possible shifts during training, models should adapt on the fly at test time.

🧠 Motivation

Models fail under distribution shift (blur, noise, occlusion, domain changes, real-world deployment). Training-time robustness techniques (data augmentation, robust objectives) cannot cover the vast space of possible shifts.

Instead of anticipating shifts, adapt when they occur.

🔁 Adaptive Inference

Traditional models are static at test time (fixed parameters, no feedback). RNA introduces a closed-loop process: prediction → test-time signal → controller → updated model → improved prediction. Inference becomes dynamic and feedback-driven.
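
To make the loop concrete, here is a minimal sketch of one feedback cycle. `adapt` and `signal_fn` are hypothetical placeholders for the controller and signal extractor detailed below, not the paper's API.

```python
import torch

@torch.no_grad()
def closed_loop_inference(base_model, adapt, signal_fn, batch):
    """One feedback cycle: predict, read a test-time signal from the
    prediction, let a controller produce an adapted model, predict again."""
    preds = base_model(batch)              # static forward pass
    z = signal_fn(preds)                   # label-free feedback signal
    adapted_model = adapt(base_model, z)   # controller updates the model
    return adapted_model(batch)            # improved prediction
```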

⚙️ Method

Base model: a pretrained $f_\theta$ (frozen).
Adaptation signal: $z = g(B)$ computed from an unlabeled batch $B$, e.g., entropy, self-supervision, geometric constraints, or k-NN retrieval.
Controller: $h_\phi$ maps the model's outputs and $z$ to parameter updates via FiLM: $f_{\hat{\theta}} = \mathrm{FiLM}(f_\theta, h_\phi(z))$.
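
A minimal PyTorch sketch of these pieces. The backbone/head split, the controller width, and the single FiLM insertion point are simplifying assumptions made here (the method applies FiLM at multiple layers); `FiLMBlock`, `Controller`, and `AdaptedModel` are illustrative names, not the paper's code.

```python
import torch
import torch.nn as nn

class FiLMBlock(nn.Module):
    """Feature-wise linear modulation: y = gamma * x + beta, per channel."""
    def forward(self, x, gamma, beta):
        # x: (B, C, H, W); gamma, beta broadcast over the spatial dims
        return gamma[:, :, None, None] * x + beta[:, :, None, None]

class Controller(nn.Module):
    """Lightweight h_phi: maps the test-time signal z to FiLM parameters."""
    def __init__(self, signal_dim: int, num_channels: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(signal_dim, 128),
            nn.ReLU(),
            nn.Linear(128, 2 * num_channels),
        )

    def forward(self, z):
        gamma, beta = self.net(z).chunk(2, dim=-1)
        return 1.0 + gamma, beta  # identity modulation when the output is ~0

class AdaptedModel(nn.Module):
    """Frozen base f_theta, modulated by FiLM parameters from h_phi(z)."""
    def __init__(self, backbone: nn.Module, head: nn.Module, controller: Controller):
        super().__init__()
        self.backbone, self.head, self.controller = backbone, head, controller
        for p in list(backbone.parameters()) + list(head.parameters()):
            p.requires_grad_(False)  # only the controller is ever trained
        self.film = FiLMBlock()

    def forward(self, x, z):
        feats = self.backbone(x)          # frozen f_theta features
        gamma, beta = self.controller(z)  # h_phi(z)
        return self.head(self.film(feats, gamma, beta))
```

With a scalar signal such as batch entropy, `signal_dim=1` and the controller adds only a small MLP on top of the frozen base.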

🚀 Amortized Adaptation

RNA replaces test-time optimization with a learned controller:

  • ❌ no iterative SGD
  • ❌ no test-time tuning
  • ✅ single forward pass

This is amortized optimization: the cost of adapting is paid during training, when the controller learns how to adapt, rather than through per-batch optimization at test time.
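
A sketch of how such a controller could be trained, reusing the `AdaptedModel` above; the optimizer, learning rate, and cross-entropy loss are assumptions for a classification setup, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def train_controller(adapted_model, signal_fn, shifted_loader, lr=1e-4):
    """Amortized adaptation: only h_phi is optimized. Training batches carry
    synthetic shifts *and* labels, so the controller learns during training
    how a single modulated pass should correct the frozen base."""
    opt = torch.optim.Adam(adapted_model.controller.parameters(), lr=lr)
    for x_shifted, y in shifted_loader:
        with torch.no_grad():  # the signal itself needs no gradients
            base_logits = adapted_model.head(adapted_model.backbone(x_shifted))
            z = signal_fn(base_logits)        # label-free signal g(B)
        logits = adapted_model(x_shifted, z)  # single adapted forward pass
        loss = F.cross_entropy(logits, y)     # labels exist at training time
        opt.zero_grad()
        loss.backward()
        opt.step()
```

At test time only the `signal_fn` call and one modulated forward pass remain; no optimizer is ever instantiated.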

🧩 Architecture

  • Base network frozen
  • Adaptation via FiLM layers only
  • Controller predicts modulation parameters
  • Controller is lightweight (~5–20% of the base model's size)

→ stable, efficient, and deployable.

📡 Test-Time Signals

Signals act as error proxies without ground truth. Crucially:

Improving the proxy does not guarantee improving the task.

Poor signals can reduce entropy yet harm performance → signal design is critical.
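
As a concrete instance of a signal, here is a batch-entropy proxy (one of the families listed in the Method section); aggregating it into a one-dimensional $z$ is an assumption made for illustration.

```python
import torch
import torch.nn.functional as F

def entropy_signal(logits: torch.Tensor) -> torch.Tensor:
    """Mean predictive entropy of a batch: a label-free error proxy.
    Caveat from above: a confidently wrong model also scores low entropy,
    which is exactly how optimizing the proxy can fail the task."""
    log_probs = F.log_softmax(logits, dim=-1)
    per_example = -(log_probs.exp() * log_probs).sum(dim=-1)
    return per_example.mean().view(1, 1)  # shape (1, signal_dim=1)
```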

📊 Experiments

Tasks: classification, segmentation, depth, optical flow
Datasets: ImageNet, COCO, ScanNet, Taskonomy, Replica, Hypersim
Shifts: 2D corruptions, 3D geometric shifts, cross-dataset generalization

📈 Results

  • Matches or outperforms test-time optimization (TTO)
  • Generalizes across tasks, modalities, unseen shifts
  • Speed: RNA ~0.01s vs. TTO ~3–5s → orders of magnitude faster

🔬 Context (TTA Landscape)

| Method | Description |
| --- | --- |
| TTO | iterative SGD at test time |
| Semi-amortized | hybrid of amortized and iterative optimization |
| RNA | fully learned, single-pass adaptation |

RNA removes online optimization → faster and simpler deployment.
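
For contrast, a generic TTO loop in the style of entropy-minimization methods such as Tent; `tunable_params` would typically be a small subset of weights (e.g., normalization affine parameters), and the step count and learning rate are placeholders. This per-batch loop is exactly what RNA's single controller pass removes.

```python
import torch
import torch.nn.functional as F

def tto_adapt(model, tunable_params, batch, steps=10, lr=1e-3):
    """Iterative test-time optimization: run SGD on a proxy loss
    (here, prediction entropy) for every incoming test batch."""
    opt = torch.optim.SGD(tunable_params, lr=lr)  # params need requires_grad=True
    for _ in range(steps):
        log_probs = F.log_softmax(model(batch), dim=-1)
        proxy_loss = -(log_probs.exp() * log_probs).sum(dim=-1).mean()
        opt.zero_grad()
        proxy_loss.backward()
        opt.step()
    with torch.no_grad():
        return model(batch)  # prediction after per-batch optimization
```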

💡 Key Insights

  • Adaptation > robustness for unpredictable shifts
  • Learning adaptation strategies scales better than hand-design
  • Test-time feedback is useful without labels
  • Inference is better viewed as a feedback-driven process, not a fixed forward pass

📌 Citation

@inproceedings{yeo2023rna,
  title={Rapid Network Adaptation: Learning to Adapt Neural Networks Using Test-Time Feedback},
  author={Yeo, Teresa and Kar, Oğuzhan Fatih and Sodagar, Zahra and Zamir, Amir},
  booktitle={ICCV},
  year={2023}
}
