Apple releases MGIE, a neural network for editing images from text descriptions
Apple has released MGIE (MLLM-Guided Image Editing), a machine learning model designed to edit images based on text descriptions. The neural network was developed together with researchers at the University of California, Santa Barbara.
MGIE is a multimodal model, meaning it can work with several types of data. For example, the neural network can interpret commands in natural language, recognize what is depicted in a source photo, and generate new objects using a diffusion model. This approach combines several tasks in a single neural network.
The MGIE model receives an image and a textual description of the changes to be made as input. The neural network then redraws the image according to the user’s instructions. For example, you can ask it to add more greenery to the photo, remove some objects, or draw new ones.
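The input and output described above can be pictured with a toy sketch. Everything in it is an assumption made for illustration: `EditRequest` and `toy_edit` are hypothetical names, not part of MGIE's code, and the "model" is a hard-coded rule that only reacts to the word "green". It shows the shape of the interface (image plus instruction in, new image out), not how MGIE actually works.

```python
from dataclasses import dataclass
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each channel in 0..255
Image = List[List[Pixel]]      # rows of pixels


@dataclass
class EditRequest:
    """Hypothetical container for the two inputs: a photo and an instruction."""
    image: Image
    instruction: str


def toy_edit(request: EditRequest) -> Image:
    """Hypothetical stand-in for the model.

    A real system would interpret the instruction with a language model
    and redraw the image with a diffusion model. This toy only checks
    whether the instruction mentions "green" and, if so, boosts the
    green channel. It always returns a new image and leaves the input
    untouched.
    """
    if "green" in request.instruction.lower():
        return [
            [(r, min(255, g + 50), b) for (r, g, b) in row]
            for row in request.image
        ]
    # Unknown instruction: return an unchanged copy.
    return [list(row) for row in request.image]


# A 1x2 "photo" and an instruction in natural language.
photo: Image = [[(10, 100, 10), (200, 230, 200)]]
edited = toy_edit(EditRequest(photo, "Add more greenery to the photo"))
```

The original `photo` stays unchanged, so the same source image can be reused with a different instruction.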
Apple engineers have published details of the research behind the project on arxiv.org. The code and weights are available in a public GitHub repository. On Hugging Face, enthusiasts have deployed a test web application based on MGIE.