Optimal transport overview

Optimal transport is a mathematical framework for comparing two probability distributions. It finds the minimal cost of transforming one distribution into another by moving mass around while preserving the total amount of mass. That minimal cost defines the Wasserstein distance, which in its first-order form is also known as the Earth Mover's Distance.
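To make "minimal cost of moving mass" concrete, here is a minimal sketch for the simplest case: two one-dimensional samples of equal size with equal weight on every point. In that setting, matching the points in sorted order is optimal, so no solver is needed. The function name `wasserstein_1d` and the sample values are illustrative assumptions, not part of any library.

```python
def wasserstein_1d(xs, ys):
    """First-order Wasserstein distance between two equal-size 1-D samples.

    Assumes every point carries the same mass (uniform weights).
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    if not xs:
        raise ValueError("samples must be non-empty")
    # In 1-D, pairing the i-th smallest x with the i-th smallest y is optimal.
    pairs = zip(sorted(xs), sorted(ys))
    # Average distance moved per unit of mass.
    return sum(abs(x - y) for x, y in pairs) / len(xs)


source = [0.0, 1.0, 3.0]
target = [5.0, 6.0, 8.0]
print(wasserstein_1d(source, target))  # 5.0: every point moves 5 units
```

For unequal sample sizes, non-uniform weights, or higher dimensions, the sorted-order shortcut no longer applies and a general transport solver would be needed.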

Optimal transport can be used to generate datasets because it can match one distribution to another and transform samples accordingly. One way to do this is data augmentation: creating new samples by transforming existing ones while preserving the underlying distribution.

Here's an example of how optimal transport can be used for data augmentation:

1. Start with a set of training samples.
2. Define a cost function that measures the dissimilarity between two samples, such as the Euclidean distance or a more complex, task-specific metric.
3. Use optimal transport to match the distribution of the training samples to a desired target distribution. This means finding the transportation plan that minimizes the total cost of transforming the training distribution into the target distribution.
4. Generate new samples by transforming the existing ones according to the transportation plan, for example by applying random perturbations or deformations along the directions the plan prescribes.
5. Add the newly generated samples to the training set.
6. Repeat the process until enough samples have been generated.
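The steps above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the target distribution is represented by a second sample set of the same size, the cost is the squared Euclidean distance, the plan is found by brute force over all one-to-one matchings (only viable for a handful of points), and the "random perturbation" is a random position on the straight line between a sample and its match, a technique often called displacement interpolation. The names `cost`, `optimal_matching`, and `augment` are hypothetical.

```python
import itertools
import math
import random


def cost(a, b):
    """Squared Euclidean distance between two points (one possible cost)."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def optimal_matching(source, target):
    """Exact one-to-one transport plan by brute force (tiny inputs only)."""
    n = len(source)
    if n != len(target):
        raise ValueError("this sketch assumes equal sample sizes")
    best_perm, best_cost = None, math.inf
    for perm in itertools.permutations(range(n)):
        total = sum(cost(source[i], target[j]) for i, j in enumerate(perm))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return best_perm, best_cost


def augment(source, target, n_new, rng):
    """Create n_new samples on the segments joining matched pairs."""
    perm, _ = optimal_matching(source, target)
    new_samples = []
    for _ in range(n_new):  # repeat until enough samples exist
        i = rng.randrange(len(source))
        t = rng.random()  # random position between source[i] and its match
        x, y = source[i], target[perm[i]]
        new_samples.append(
            tuple((1 - t) * xi + t * yi for xi, yi in zip(x, y))
        )
    return new_samples


rng = random.Random(0)  # fixed seed so the run is repeatable
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
target = [(2.0, 2.0), (3.0, 2.0), (2.0, 3.0)]
augmented = train + augment(train, target, n_new=5, rng=rng)
print(len(augmented))  # 8: three original samples plus five new ones
```

Brute force is used here only to keep the example dependency-free. For realistic sample sizes, an assignment solver such as `scipy.optimize.linear_sum_assignment`, or a dedicated optimal transport library, would be the more practical choice.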

By using optimal transport to match and transform distributions, data augmentation can be used to increase the size and diversity of datasets, which can help improve the performance of machine learning models. Additionally, optimal transport can be used to sample new data points from a given distribution, which can be useful for generative modeling tasks.
