Context-Based Visual Data Generation for Image Enhancement and Automatic Colour Calibration

2024-7-26
Böncü, Ece Selin
Due to the wide range of application areas of digital cameras, the representational quality of these devices has become an essential performance criterion. Because of hardware limitations or momentary alterations in the scene, undesired artifacts may appear in the output image. As the procedure is irreversible and unrepeatable in most cases, image enhancement and restoration problems become ill-posed. We propose a novel agent model, AIti-FAct, that generates images based on a description written by the agent itself and autonomously provides an edited version of each image by applying a context-based distortion. AIti-FAct decomposes the end-to-end image editing task into the simpler sub-tasks of description, generation and distortion, and allows multiple tools to be used within the framework. Moreover, the distortion sub-task is implemented as another agent with more than 12 tools available to distort the image based on its content. By leveraging the power of pre-trained LLMs in reasoning and text-to-image generation, this nested agent is capable not only of executing prompts but also of deciding on a realistic alteration to the image merely from its description. The performance of the inner Distortion Agent in creating artifacts on image caption datasets is illustrated through a subjective evaluation. As an extension to the colour quality domain, a two-fold approach to colour constancy is presented. The limitations of small-scale colour constancy datasets are addressed through the use of synthetically generated data. First, the datasets are extended by adding a novel tool that converts sRGB images into camera-mapped RAW images, and the performance of selected state-of-the-art (SOTA) approaches is boosted through transfer learning on benchmark datasets. Second, a novel CLIP-based method, CLIP-Guided Colour Constancy (CGCC), is proposed. With its text and image encoders, CGCC reformulates the problem as a downstream task of contrastive language-image pre-training, surpassing SOTA performance.
Citation Formats
E. S. Böncü, “Context-Based Visual Data Generation for Image Enhancement and Automatic Colour Calibration,” Ph.D. - Doctoral Program, Middle East Technical University, 2024.