
# Emerging Trends

## Interactive Machine Learning

As models grow larger and more expensive to train, there is a growing need to interact with a model and iterate quickly on its performance before committing to a full training run.
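As a minimal sketch of what such quick iteration can look like in practice (not a method from any cited work), one can train on a small pilot subset for a handful of steps and inspect the loss trend interactively before scaling up; the data shapes and hyperparameters below are illustrative assumptions.

```python
import torch
from torch import nn

# Hypothetical quick-iteration loop: before committing to a full training
# run, train on a small pilot subset for a few steps and watch the loss.
model = nn.Linear(16, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x_small = torch.randn(128, 16)          # small pilot subset
y_small = torch.randint(0, 2, (128,))

for step in range(20):                  # a few cheap steps, not a full run
    loss = nn.functional.cross_entropy(model(x_small), y_small)
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 5 == 0:
        print(step, loss.item())        # sanity-check before scaling up
```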

## Massive Scale Models

With the ability to train models on unlabelled data through self-supervision, the focus shifted to scaling models up and training on more data.

  • GPT-3, developed by OpenAI, was the first 175B-parameter model to demonstrate few-shot in-context learning.
  • Moore's Law for Everything is a post about scale and its effect on AI and society.
  • Switch Transformers use a mixture-of-experts architecture to train models beyond the scale of GPT-3; a routing sketch follows this list.
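To make the mixture-of-experts idea concrete, here is a minimal sketch of Switch-Transformer-style top-1 routing, where each token is sent to exactly one expert feed-forward network. This is an illustrative toy (it omits the paper's load-balancing auxiliary loss and capacity limits), and all names and sizes are assumptions.

```python
import torch
import torch.nn as nn

class SwitchLayer(nn.Module):
    """Toy top-1 mixture-of-experts layer: each token is routed to
    exactly one expert FFN, so compute grows sub-linearly with experts."""
    def __init__(self, d_model: int, n_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.ReLU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (n_tokens, d_model)
        gates = torch.softmax(self.router(x), dim=-1)  # routing probabilities
        top_gate, top_idx = gates.max(dim=-1)          # top-1 expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top_idx == i
            if mask.any():
                # scale by the gate value so the router receives gradients
                out[mask] = top_gate[mask].unsqueeze(1) * expert(x[mask])
        return out

layer = SwitchLayer(d_model=64, n_experts=4)
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64])
```

Because each token activates only one expert, parameter count can grow with the number of experts while per-token compute stays roughly constant, which is what lets such models scale past dense models like GPT-3.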

## Observational Supervision

The way experts interact with their data (e.g. a radiologist's eye movements) contains rich information about the task (e.g. classification difficulty) and about the expert (e.g. drowsiness level). As wearable technology (e.g. AR headsets with eye tracking) becomes more ubiquitous, affordable, and standardized, so will the hardware needed to collect these human-data interactions. In observational supervision, we investigate how to extract the rich information embedded in human-data interactions, either to supervise models from scratch or to improve model robustness.
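One common way to use such signals, sketched below under assumed shapes and a hypothetical dataset, is multi-task learning: a shared encoder is trained both to classify an image and to predict the expert's recorded gaze heatmap over it. This is a generic illustration of the idea, not the exact method of any paper listed below.

```python
import torch
import torch.nn as nn

class GazeSupervisedNet(nn.Module):
    """Shared encoder with a classification head and an auxiliary
    gaze-heatmap prediction head."""
    def __init__(self, n_classes: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.cls_head = nn.Linear(32, n_classes)
        self.gaze_head = nn.Conv2d(32, 1, 1)  # per-pixel gaze density

    def forward(self, x):
        h = self.encoder(x)                         # (B, 32, H, W)
        logits = self.cls_head(h.mean(dim=(2, 3)))  # pooled features -> class
        gaze = self.gaze_head(h)                    # (B, 1, H, W)
        return logits, gaze

model = GazeSupervisedNet(n_classes=2)
x = torch.randn(4, 1, 64, 64)           # e.g. X-ray crops (synthetic here)
y = torch.randint(0, 2, (4,))           # diagnostic labels
g = torch.rand(4, 1, 64, 64)            # recorded gaze heatmaps (hypothetical)

logits, gaze_pred = model(x)
loss = nn.functional.cross_entropy(logits, y) \
     + 0.5 * nn.functional.mse_loss(gaze_pred, g)  # auxiliary gaze loss
loss.backward()
```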

Interesting works have collected observational signals such as:

  • Eye tracking data in medicine:
    • Chest X-Rays, dictation audio, bounding boxes, and gaze on 3,032 images by R. Lanfredi et al. dataset
    • Chest X-Rays, reports, dictation audio, and gaze on 1,083 images by A. Karargyris et al. dataset
    • Two medical datasets on Chest X-Rays and brain MRI by K. Saab et al. dataset
  • Eye tracking plus brain activity in NLP (ZuCo dataset)

Critical papers in observational supervision:

  • Pioneering work on using gaze data: N. Hollenstein and C. Zhang showed how to use gaze data to improve NLP models paper
  • Improving zero-shot learning with gaze by N. Karessli et al. paper
  • Weak supervision and multi-task learning with gaze by K. Saab et al. paper

## Data Models

An explicit, differentiable data model can be specified using physics-informed and known-operator learning. The gradient connection between the data model and the downstream task model opens up new opportunities for data-centric machine learning, such as dataset drift synthesis, drift forensics, and optimization of the data-generating process itself.
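The toy sketch below illustrates the gradient connection: a differentiable stand-in for an acquisition pipeline (here a hypothetical gain parameter, not a model from the works cited) sits in front of the task model, so the task loss can be differentiated with respect to the data-generating process itself.

```python
import torch

# Differentiable data model: a hypothetical acquisition parameter (gain)
# that shapes how raw "scenes" become observed images.
gain = torch.tensor(1.0, requires_grad=True)
task_model = torch.nn.Linear(64, 2)

def data_model(raw, gain):
    # toy stand-in for a physical acquisition pipeline
    return torch.tanh(gain * raw)

raw = torch.randn(8, 64)                # latent "scenes" (synthetic)
labels = torch.randint(0, 2, (8,))

images = data_model(raw, gain)          # synthesize observations
loss = torch.nn.functional.cross_entropy(task_model(images), labels)
loss.backward()
print(gain.grad)  # task-loss gradient w.r.t. the acquisition parameter
```

Because the gradient flows through the data model, the same machinery could in principle be used to synthesize controlled drifts (by perturbing the acquisition parameters) or to tune them against a downstream objective.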

Existing works:

  • Oala, L., Aversa, M., Nobis, G., Willis, K., Neuenschwander, Y., Buck, M., Matek, C., Extermann, J., Pomarico, E., Samek, W. and Murray-Smith, R., 2022. Data Models for Dataset Drift Controls in Machine Learning With Images. arXiv preprint arXiv:2211.02578. paper | code | data
  • Maier, A., Köstler, H., Heisig, M., Krauss, P. and Yang, S.H., 2022. Known operator learning and hybrid machine learning in medical imaging—a review of the past, the present, and the future. Progress in Biomedical Engineering. paper
  • Raissi, M., Perdikaris, P. and Karniadakis, G.E., 2019. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378, pp.686-707. paper
  • Wright, L.G., Onodera, T., Stein, M.M. et al. Deep physical neural networks trained with backpropagation. Nature 601, 549–555 (2022). paper