The Kaitchup – AI on a Budget

PaLM 2 Evaluation: Automatic Summarization

Here we go again — Struggling with Contaminated Training Data

Benjamin Marie
May 22, 2023

Evaluating a large language model such as PaLM 2 is extremely challenging for one main reason: the evaluation data may have been part of the training data.

In other words, there is a risk of data contamination, also known as data leakage.
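To make the idea concrete, here is a minimal sketch of one common way to look for contamination: flag an evaluation example if it shares a long word n-gram with the training data. This is only an illustration, not the procedure used for PaLM 2 or GPT-4. The function names, the whitespace tokenization, and the choice of `n=8` are all assumptions made for this example.

```python
def ngrams(text, n):
    """Return the set of word-level n-grams in `text` (lowercased, split on whitespace)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def contamination_rate(eval_examples, training_corpus, n=8):
    """Fraction of evaluation examples that share at least one n-gram with the training corpus.

    Hypothetical helper: n=8 is an arbitrary illustrative threshold.
    """
    # Collect every n-gram that appears anywhere in the training documents.
    train_ngrams = set()
    for doc in training_corpus:
        train_ngrams |= ngrams(doc, n)

    if not eval_examples:
        return 0.0

    # An example is flagged as soon as one of its n-grams also occurs in training.
    flagged = sum(1 for ex in eval_examples if ngrams(ex, n) & train_ngrams)
    return flagged / len(eval_examples)
```

A check like this only catches verbatim overlap. Paraphrased or translated copies of the evaluation data would go undetected, which is one reason contamination is so hard to rule out.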

Similarly to OpenAI with GPT-4, Google tried to minimize the …
