'TokenVerse' uses AI to extract visual elements from multiple images and combine them into a single generated image

A research team at Google DeepMind has announced 'TokenVerse', a method that extracts specific visual elements and attributes from images and uses AI to combine elements taken from multiple images into a single natural-looking image. The research team explains on the project website what kinds of images TokenVerse can generate.
[2501.12224] TokenVerse: Versatile Multi-concept Personalization in Token Modulation Space
https://arxiv.org/abs/2501.12224
TokenVerse
https://token-verse.github.io/
TokenVerse is built on a text-to-image model based on the Diffusion Transformer (DiT), an architecture that combines a diffusion model with a transformer. Guided by the input text, it extracts elements such as objects, accessories, poses, and lighting from images, and it can then combine the elements extracted from each image to generate a new image.
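The paper's core idea is to personalize concepts in the "modulation space" of a DiT: for each word in an image's caption, a small offset vector is learned and added to the modulation signal that scales and shifts that token's features. The following is a minimal sketch of this idea, not the authors' code; the class name ModulatedBlock, the toy dimensions, and the choice of token position are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class ModulatedBlock(nn.Module):
    """Toy DiT-style block: each token's features are scaled and shifted
    by a modulation vector derived from its text embedding (adaLN-style)."""
    def __init__(self, dim: int):
        super().__init__()
        self.norm = nn.LayerNorm(dim, elementwise_affine=False)
        self.to_scale_shift = nn.Linear(dim, 2 * dim)  # modulation -> (scale, shift)
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x: torch.Tensor, modulation: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim); modulation: (batch, tokens, dim)
        scale, shift = self.to_scale_shift(modulation).chunk(2, dim=-1)
        return x + self.mlp(self.norm(x) * (1 + scale) + shift)

dim, n_tokens = 64, 8
block = ModulatedBlock(dim)
text_emb = torch.randn(1, n_tokens, dim)  # stand-in per-word text embeddings

# TokenVerse-style personalization (sketch): a learnable offset added to
# the modulation vector of ONE token (e.g. the token for "dog"), which
# would be trained so the model reproduces that concept from one image.
dog_offset = nn.Parameter(torch.zeros(dim))
token_idx = 2  # hypothetical position of "dog" in the caption

modulation = text_emb.clone()
modulation[:, token_idx] = modulation[:, token_idx] + dog_offset

x = torch.randn(1, n_tokens, dim)  # toy image/latent tokens
out = block(x, modulation)
print(out.shape)  # torch.Size([1, 8, 64])
```

Because only the per-token offset is learned while the base model stays frozen, each concept stays tied to its own word and does not interfere with the rest of the caption.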
The research team explains the process of generating images using TokenVerse. For example, the following three images have captions that emphasize elements such as 'dog,' 'glasses,' and 'pattern.'

When these images and the text 'a dog wearing a shirt with a pattern and glasses' were input, the elements 'dog,' 'glasses,' and 'pattern' were extracted from their respective images and combined to generate a single image.
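Conceptually, composition at generation time amounts to applying each concept's learned offset to the matching word in the new caption. The sketch below illustrates this combination step under the same assumptions as the previous snippet; the dictionary of offsets and the random values standing in for learned directions are hypothetical.

```python
import torch

# Hypothetical per-concept offsets, each learned separately from one of
# the three source images (see the sketch above); in the paper these are
# directions in the DiT's modulation space, one per concept word.
dim = 64
learned_offsets = {
    "dog":     torch.randn(dim) * 0.1,  # from image 1
    "glasses": torch.randn(dim) * 0.1,  # from image 2
    "pattern": torch.randn(dim) * 0.1,  # from image 3
}

caption = "a dog wearing a shirt with a pattern and glasses".split()
text_emb = torch.randn(1, len(caption), dim)  # stand-in word embeddings

# Compose: add each concept's offset to the modulation vector of the
# matching word in the new caption, leaving all other tokens untouched.
modulation = text_emb.clone()
for i, word in enumerate(caption):
    if word in learned_offsets:
        modulation[:, i] += learned_offsets[word]

# 'modulation' would then condition the DiT's denoising steps so that
# "dog", "pattern", and "glasses" render as their source-image versions.
print([w for w in caption if w in learned_offsets])  # ['dog', 'pattern', 'glasses']
```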

Elements that can be extracted include not only subjects and accessories, but also the way light falls on them.

It also seems possible to extract atmospheric effects, such as fog covering the scene.

You can also extract only a pose and have a subject taken from another image strike that pose.
It is also possible to extract only the texture of an object's surface.

The TokenVerse project page also provides an interactive demo where you can swap out the image each element is extracted from and see how the generated image changes.
