DDN (Discrete Distribution Networks) is a new generative AI model with simple principles and unique properties.



Lei Yang, a researcher in computer vision and large language models (LLMs), has presented a new generative AI model called 'DDN (Discrete Distribution Networks),' which has simple principles and unique properties.

DDN: Discrete Distribution Networks
https://discrete-distribution-networks.github.io/



[2401.00036] Discrete Distribution Networks
https://arxiv.org/abs/2401.00036

GitHub - DIYer22/discrete_distribution_networks: DDN: A novel generative model with simple principles and unique properties. (ICLR 2025)
https://github.com/DIYer22/discrete_distribution_networks

DDN is a new generative AI model that approximates a data distribution with a hierarchy of discrete distributions. Its distinguishing feature is that the network generates multiple samples simultaneously, rather than a single output, in order to capture distributional information.

To capture finer details of the target data, DDN selects the output closest to the ground truth (GT) from the coarse results generated by the first layer. The selected output is then fed back into the network as the condition for the second layer, which generates new outputs that are even more similar to the GT. Repeated over multiple layers, the representation space of DDN's outputs expands exponentially, and the generated samples become increasingly similar to the GT. This hierarchical output pattern of discrete distributions gives DDN its unique properties: more general zero-shot conditional generation and one-dimensional latent representations.
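To make the selection loop concrete, here is a minimal Python sketch of the idea (a toy stand-in with illustrative names, not code from the DDN repository): each layer proposes K candidate outputs conditioned on the previous layer's selection, and the candidate closest to the GT is kept and passed down.

```python
import numpy as np

K = 8        # candidate outputs per layer (illustrative)
LAYERS = 4   # number of hierarchical layers (illustrative)
DIM = 16     # dimensionality of a toy "image"

rng = np.random.default_rng(0)

def layer_generate(condition, layer_idx):
    """Stub for one DDN layer: emit K candidates refined around the
    output selected at the previous layer. A real layer is a neural
    network; random perturbations stand in for it here."""
    noise_scale = 1.0 / (2 ** layer_idx)  # coarse early, fine later
    return condition + rng.normal(0, noise_scale, size=(K, DIM))

def generate_towards(gt):
    """Hierarchical selection: at each layer keep the candidate
    closest to the ground truth and feed it back as the condition."""
    selected = np.zeros(DIM)   # the first layer starts from scratch
    chosen = []                # the discrete latent: one index per layer
    for l in range(LAYERS):
        candidates = layer_generate(selected, l)
        dists = np.linalg.norm(candidates - gt, axis=1)
        idx = int(np.argmin(dists))  # pick the best match to the GT
        selected = candidates[idx]
        chosen.append(idx)
    return selected, chosen

gt = rng.normal(size=DIM)
sample, latent = generate_towards(gt)
print("latent path:", latent)
print("final error:", np.linalg.norm(sample - gt))
```

Because each layer commits to one of K candidates, the reachable output space grows as K**LAYERS, and the sequence of chosen indices doubles as the compact one-dimensional latent code mentioned above.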

Below is a diagram showing DDN's multi-layer image generation process.



This mechanism of generating a large number of candidate outputs (split) and retaining only the most promising ones (prune) is called 'Split-and-Prune,' and it serves as DDN's mechanism for search and optimization. Conventional neural networks are trained with stochastic gradient descent, but DDN cannot use gradients directly because it deals with discrete distributions, which is why Split-and-Prune was introduced. It has also been demonstrated that adopting Split-and-Prune reduces the KL divergence.
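As a rough illustration of the idea (a deliberately simplified toy, not the exact procedure from the paper), one gradient-free step might split an output node that is selected far more often than average into two slightly perturbed copies, and prune a node that is almost never selected, nudging the discrete distribution toward the data distribution:

```python
import numpy as np

rng = np.random.default_rng(1)

def split_and_prune(nodes, counts, total):
    """Simplified Split-and-Prune step (thresholds are illustrative
    guesses, not the paper's exact rule). Each node carries a learned
    output vector; counts[i] is how often node i was selected as
    closest-to-GT during the last training window."""
    nodes, counts = list(nodes), list(counts)
    K = len(nodes)
    freq = [c / total for c in counts]
    hi = int(np.argmax(freq))   # over-used node -> split in two
    lo = int(np.argmin(freq))   # rarely-used node -> prune
    if freq[hi] > 2.0 / K and freq[lo] < 0.5 / K:
        # prune the dead node...
        del nodes[lo], counts[lo]
        if lo < hi:
            hi -= 1
        # ...and split the overloaded one with a small perturbation,
        # sharing its selection mass between the two children
        child = nodes[hi] + rng.normal(0, 0.01, size=nodes[hi].shape)
        counts[hi] //= 2
        nodes.append(child)
        counts.append(counts[hi])
    return np.stack(nodes), np.array(counts)

nodes = rng.normal(size=(8, 16))               # K=8 toy output nodes
counts = np.array([40, 1, 5, 5, 5, 5, 5, 5])   # node 0 dominates, node 1 is dead
nodes, counts = split_and_prune(nodes, counts, counts.sum())
print(len(nodes), "nodes after split-and-prune")
```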

DDN also supports conditions outside the pixel domain, notably gradient-free zero-shot conditional generation such as text-to-image generation with black-box models. In the figure below, the images with yellow borders serve as ground truth, and the abbreviations in the table headers correspond to the tasks: 'SR' stands for Super-Resolution, with the following number indicating the resolution, and 'ST' stands for Style Transfer, which computes a perceptual loss against the condition.
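The same selection loop enables this zero-shot behavior: swap the distance-to-GT criterion for any black-box score, and no gradients are needed at all. A self-contained toy sketch (hypothetical names, reusing the same layer stub as above):

```python
import numpy as np

K, LAYERS, DIM = 8, 4, 16  # illustrative sizes
rng = np.random.default_rng(2)

def layer_generate(condition, layer_idx):
    """Toy stand-in for a DDN layer, as in the earlier sketch."""
    return condition + rng.normal(0, 1.0 / 2 ** layer_idx, size=(K, DIM))

def black_box_score(candidate):
    """Any scorer fits here: CLIP text-image similarity for
    text-to-image, a perceptual loss against a style image for ST,
    agreement with a low-res input for SR. It only has to return a
    number, so the model behind it can stay a gradient-free black box.
    This toy scores alignment with a hypothetical target direction."""
    return float(candidate @ np.ones(DIM))

selected = np.zeros(DIM)
for l in range(LAYERS):
    candidates = layer_generate(selected, l)
    scores = [black_box_score(c) for c in candidates]
    selected = candidates[int(np.argmax(scores))]  # guided by score, not GT
print("conditioned sample mean:", selected.mean())
```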



The following diagram shows the tree-structured representation space of DDN's latent variables. Each sample can be mapped to a leaf node in this tree.
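In code terms, a latent is just the list of branch indices chosen at each layer, one root-to-leaf path. If the layers are deterministic given their condition, as the network is at inference time, decoding is a deterministic walk down the tree (again a toy sketch with hypothetical names):

```python
import numpy as np

K, LAYERS, DIM = 8, 3, 16  # illustrative sizes
rng = np.random.default_rng(3)
# One fixed offset table per layer stands in for a deterministic
# network, so the same index path always decodes to the same sample.
layer_offsets = [rng.normal(0, 1.0 / 2 ** l, size=(K, DIM))
                 for l in range(LAYERS)]

def layer_candidates(condition, l):
    """The K deterministic candidate outputs of layer l."""
    return condition + layer_offsets[l]

def decode(latent):
    """A latent is a list of LAYERS indices, each in [0, K): one
    root-to-leaf path through the tree of candidate outputs."""
    x = np.zeros(DIM)
    for l, idx in enumerate(latent):
        x = layer_candidates(x, l)[idx]
    return x

# The toy tree has K ** LAYERS = 512 leaves, one per latent code.
assert np.allclose(decode([3, 1, 4]), decode([3, 1, 4]))
print("decoded sample:", decode([3, 1, 4])[:4])
```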



The effectiveness and properties of DDN have been demonstrated through experiments on CIFAR-10 and FFHQ, datasets widely used in machine learning and computer vision.

Yang said the total time spent on DDN development was 'less than three months,' and added that 'the experiments were preliminary, and there was limited time for detailed analysis and tuning.' As a result, DDN 'still has significant room for improvement.'
