42.3 Data requirements and dataset quality

Overview and links for this section of the guide.

How Much Data?

You don't need millions of rows.

- **50-100 examples:** Noticeable style change.
- **500-1000 examples:** Strong format adherence.
- **10,000+ examples:** Deep behavioral change.

Garbage In, Garbage Out

One bad example can undo the benefit of ten good ones. The model does not know which examples are mistakes; it imitates all of them. If 10% of your training set contains errors, the model learns that "errors are okay 10% of the time."

Curate your dataset manually. It is better to have 100 perfect examples than 1000 noisy ones.
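Before reviewing by hand, a small automated pass can remove the obvious problems so that manual effort goes to the examples that need judgment. The following is a minimal sketch, not a complete pipeline. It assumes each example is a dict with `prompt` and `completion` keys; those field names, the `clean_examples` helper, and the sample rows are hypothetical and chosen only for illustration.

```python
def clean_examples(examples, min_chars=1):
    """Split examples into (kept, rejected) using basic quality checks.

    Assumes each example is a dict with "prompt" and "completion" keys
    (hypothetical field names; adapt them to your own dataset format).
    """
    seen = set()
    kept, rejected = [], []
    for ex in examples:
        # Collapse runs of whitespace so trivially different copies match.
        prompt = " ".join(str(ex.get("prompt") or "").split())
        completion = " ".join(str(ex.get("completion") or "").split())

        # Reject examples with a missing or empty field.
        if len(prompt) < min_chars or len(completion) < min_chars:
            rejected.append((ex, "empty field"))
            continue

        # Reject exact duplicates (case-insensitive, whitespace-normalized).
        key = (prompt.lower(), completion.lower())
        if key in seen:
            rejected.append((ex, "duplicate"))
            continue

        seen.add(key)
        kept.append({"prompt": prompt, "completion": completion})
    return kept, rejected


# Illustrative sample data, not taken from any real dataset.
raw = [
    {"prompt": "Summarize: cats sleep a lot.", "completion": "Cats sleep often."},
    {"prompt": "Summarize:  cats sleep a lot.", "completion": "Cats sleep often."},
    {"prompt": "Translate: hello", "completion": ""},
]

kept, rejected = clean_examples(raw)
print(len(kept), "kept;", len(rejected), "rejected")
for ex, reason in rejected:
    print(reason, "->", ex["prompt"])
```

A pass like this only catches mechanical problems such as blanks and duplicates. It cannot tell whether a completion is actually correct, so the examples that survive still need a manual read.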

Where to go next