Synthetic Data, AI Training, Data Strategy
SYNTHETIC DATA VS "REAL" DATA

By Amir Jalali, 2 min read

With the rapid adoption of LLMs and other generative AI technologies, we are seeing a massive increase in the amount of synthetic data being produced.

Synthetic data is data generated by an algorithm or program rather than captured from real-world events.

To put this in context, AI image generators have created over 15 billion images in just over a year, more than photography produced in its first 150 years.

This presents us with a deep moral and philosophical question, but it also provides us with immediate opportunities.

For example, with LLMs many small businesses can produce vast amounts of synthetic data for testing and other purposes. High-quality test data can reduce development time and help shorten the development life cycle.
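To make the testing use case concrete, here is a minimal sketch of generating synthetic customer records. It swaps the LLM for Python's standard-library random generator, which keeps the example runnable and repeatable. The `Customer` fields, the name and plan pools, and the `make_customers` helper are all assumptions invented for illustration, not part of any particular product or library.

```python
import random
from dataclasses import dataclass
from typing import List

# Hypothetical value pools, invented for this illustration only.
FIRST_NAMES = ["Ava", "Noah", "Mia", "Liam", "Zoe"]
LAST_NAMES = ["Nguyen", "Garcia", "Smith", "Patel", "Kim"]
PLANS = ["free", "basic", "pro"]


@dataclass
class Customer:
    customer_id: int
    name: str
    email: str
    plan: str
    monthly_spend: float


def make_customers(count: int, seed: int = 0) -> List[Customer]:
    """Return `count` synthetic customers; the same seed yields the same data."""
    rng = random.Random(seed)  # local generator, global random state is untouched
    customers = []
    for customer_id in range(1, count + 1):
        first = rng.choice(FIRST_NAMES)
        last = rng.choice(LAST_NAMES)
        plan = rng.choice(PLANS)
        # Free-plan customers spend nothing; paid plans get a random amount.
        spend = 0.0 if plan == "free" else round(rng.uniform(5.0, 200.0), 2)
        customers.append(
            Customer(
                customer_id=customer_id,
                name=f"{first} {last}",
                # example.com is reserved for documentation, so no real inbox is used.
                email=f"{first}.{last}.{customer_id}@example.com".lower(),
                plan=plan,
                monthly_spend=spend,
            )
        )
    return customers


if __name__ == "__main__":
    for customer in make_customers(3, seed=42):
        print(customer)
```

Fixing the seed means a failing test can be reproduced with exactly the same records. An LLM-based generator could produce richer, more realistic values, at the cost of that repeatability unless its output is saved.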

On the other hand, unethical businesses might use this data to inflate their customer base or other KPIs. As with any new tool or technology, the opportunity is immense, for good and for bad.

The much bigger question of "What is real?" has reached a new intersection.

"WHAT IS REAL? HOW DO YOU DEFINE 'REAL'? IF YOU'RE TALKING ABOUT WHAT YOU CAN FEEL, WHAT YOU CAN SMELL, WHAT YOU CAN TASTE AND SEE, THEN 'REAL' IS SIMPLY ELECTRICAL SIGNALS INTERPRETED BY YOUR BRAIN."

― Morpheus