32K Ultra Upscale XL with ollama

Description

Everyone aims for performance comparable to Magnific AI

When building an upscaler in ComfyUI, 'performance comparable to Magnific AI' is one of the watchwords.

Kakudai V1, recently developed by the AI development group Maverick, is one such example, and I use it myself.

As with Kakudai V1, I feel that what upscalers of this type have in common is that most of them use ControlNet's SD15_tile.

It is essentially indispensable for adding detail while enlarging an image; Kakudai V1 also used SD15_tile as post-processing on the enlarged image after CCSR had done the upscaling itself.

However, SD15_tile has a weakness, or rather a limitation: as the name suggests, it can only be used with SD1.5.

Since ControlNet first supported SDXL 1.0, major functions such as Canny and Openpose have become available for SDXL, but for some reason no model corresponding to SD15_tile was developed.

The emergence of ControlNet 852_a_clone_xl

But finally, last month, Mr. 852 (hakoniwa) developed an SDXL-compatible model with almost the same functionality.

Although he described it as "modest" on X, this is a groundbreaking model that deserves more attention.

This is because it allows for tile processing and upscaling in the SDXL model, something that was previously impossible.

Ultra Upscale for SD1.5 modified for SDXL

With the introduction of ControlNet 852_a_clone_xl, the way is now open to port the various upscalers that were built around SD15_tile for SD1.5 to SDXL.

So this time I decided to modify Ultra Upscale Ver 3.0, developed by Kuurumin, for SDXL.

The workflow I have created can upscale up to 32K size (16384 x 16384) while staying almost faithful to the original image (assuming the original is 1024 x 1024 pixels).
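To put those figures in perspective, here is a minimal arithmetic sketch. The 1024 and 16384 pixel sizes come from the text above; the 1024-pixel tile size and the 8-bit RGB assumption are illustrative only and are not taken from the workflow itself.

```python
import math

def scale_factor(src_side: int, dst_side: int) -> float:
    """Linear enlargement factor between two square images."""
    return dst_side / src_side

def raw_rgb_bytes(side: int, channels: int = 3) -> int:
    """Uncompressed size of a square 8-bit image (assumption: 3 channels)."""
    return side * side * channels

def tile_count(side: int, tile: int) -> int:
    """Number of non-overlapping square tiles needed to cover the image.
    The tile size is a hypothetical value, not a workflow setting."""
    per_axis = math.ceil(side / tile)
    return per_axis * per_axis

if __name__ == "__main__":
    print(scale_factor(1024, 16384))        # linear factor
    print(raw_rgb_bytes(16384) / 2**20)     # uncompressed size in MiB
    print(tile_count(16384, 1024))          # tiles at the assumed tile size
```

Going from 1024 to 16384 pixels per side is a 16x linear enlargement, which is why tile-by-tile processing matters so much for keeping VRAM use bounded.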

In addition, I was able to keep VRAM consumption below 12 GB for almost the entire process, partly by incorporating the VRAM reduction ideas from Kakudai V1.

In the original workflow, VRAM consumption reached around 15 GB in the KSampler stage, and processing slowed significantly because it spilled into shared VRAM. The modified workflow keeps processing within 12 GB.

My GPU is an NVIDIA RTX 4070 12GB; if your GPU has 12 GB of VRAM or more, you should be able to run almost the entire process without any slowdown from shared VRAM use.

By swapping some of the models used, illustrations and anime-style images can also be processed.

The original image used as a sample is 1024 x 1024.

You can download the result, scaled up to 32K, from the link below. Note that the file is over 160 MB.

32K enlarged image

Additional image analysis functionality using OLLAMA

In Kuurumin's original workflow, the familiar WD14 tagger analysed the original image and generated the prompts; I have replaced it with ollama to further improve the analysis capability.

One note of caution: the model used should be llava-llama3, which is capable of image analysis.

Meta recently announced Llama3 V, which, like GPT-4o, has image analysis capability, but the official Meta Llama3 currently in the ollama line-up cannot perform image analysis.

Similarly, Qwen2, which is well regarded for its capability, cannot perform image analysis either.
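Outside ComfyUI, the same kind of image analysis can be tried directly against a local ollama server. The following is a rough sketch, not the OllamaVision node's implementation: it assumes ollama is running on its default local port with llava-llama3 already pulled, and the prompt text and function names are made up for illustration.

```python
import base64
import json
import urllib.request

# Assumption: ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_vision_request(image_bytes: bytes, prompt: str,
                         model: str = "llava-llama3") -> dict:
    """Build the JSON payload; images are sent as base64 strings."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON reply instead of a stream
    }

def describe_image(path: str,
                   prompt: str = "Describe this image in detail.") -> str:
    """Send an image file to ollama and return the generated description."""
    with open(path, "rb") as f:
        payload = build_vision_request(f.read(), prompt)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `describe_image("input.png")` (a hypothetical file name) with a text-only model such as plain Llama3 or Qwen2 in place of llava-llama3 will not give a useful description, which is the caution raised above.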

Various settings

First, for LoRA, use 'add-detail-xl' as shown in the diagram below.

For the model for upscaling, use '4xNMKD-Superscale-SP 178000_G' as shown below.

An ip-adapter model for SDXL should also be used, as shown in the image below.

The ControlNet model is of course ControlNet 852_a_clone_xl.

Furthermore, in July 2024, a new set of ControlNet models for SDXL was released, which includes a Tile model. I have achieved good results with it. You can download that model from here.

When processing illustrations or anime-style images, it is better to change the upscale model to 'RealESRGAN_x4plus_anime_6B', which is also familiar from A1111.
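The model choices in this section can be summarised as a small lookup. This is only an illustrative sketch: the model names are written as they appear in this article (actual file names on disk may differ), and the "photo" and "illustration" labels and the function name are hypothetical.

```python
# Model names as given in the article; labels are hypothetical.
WORKFLOW_MODELS = {
    "lora": "add-detail-xl",
    "controlnet": "ControlNet 852_a_clone_xl",
    "upscale_model": {
        "photo": "4xNMKD-Superscale-SP 178000_G",
        "illustration": "RealESRGAN_x4plus_anime_6B",
    },
}

def pick_upscale_model(image_kind: str) -> str:
    """Return the upscale model suggested for the given kind of image."""
    try:
        return WORKFLOW_MODELS["upscale_model"][image_kind]
    except KeyError:
        raise ValueError(f"unknown image kind: {image_kind!r}") from None

if __name__ == "__main__":
    print(pick_upscale_model("illustration"))
```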

※27 Jan 2025

Updated with Comfy-WaveSpeed.

Comfy-WaveSpeed speeds up the various Detailers even more than it speeds up KSampler. The latest version therefore incorporates Comfy-WaveSpeed and is substantially faster.

Author

Common Sence

Versions (5)

- latest (a year ago)
- v20241025-042830
- v20240918-100120
- v20240826-020620
- v20240702-073306

Node Details

Primitive Nodes (22)

ApplyFBCacheOnModel (1)

Reroute (21)

Custom Nodes (34)

ComfyUI

- ImageUpscaleWithModel (1)
- ImageScaleBy (4)
- CLIPVisionLoader (1)
- SaveImage (3)
- SelfAttentionGuidance (1)
- CLIPSetLastLayer (1)
- ControlNetApplyAdvanced (1)
- ControlNetLoader (1)
- KSampler (1)
- CheckpointLoaderSimple (1)
- UpscaleModelLoader (1)
- LoadImage (1)
- LoraLoader (1)
- CLIPTextEncode (3)
- ConditioningCombine (1)
- VAELoader (1)
- VAEEncode (1)

ComfyUI Ollama

- OllamaVision (1)

ComfyUI_IPAdapter_plus

- IPAdapterTiled (1)
- IPAdapterModelLoader (1)
- IPAdapterNoise (1)

pythongosssss/ComfyUI-Custom-Scripts

- ShowText|pysssss (1)

Tiled Diffusion & VAE for ComfyUI

- VAEDecodeTiled_TiledDiffusion (1)
- TiledDiffusion (2)

UltimateSDUpscale

- UltimateSDUpscaleNoUpscale (2)

Model Details

Checkpoints (1)

realvisxlV50_v50Bakedvae.safetensors

LoRAs (1)

Quality\add-detail-xl.safetensors
