Blake's SDXL Overkill Upscale Workflow w/ Perturbed-Attention Guidance, Perp-Neg, Kohya High Res Fix
Description
TODO: Research "Trajectory Consistency Distillation" https://github.com/dfl/comfyui-tcd-scheduler and https://mhh0318.github.io/tcd/ and its compatibility with Perp-Neg.
This is my own personal, ever-evolving workflow that I'll update as I make changes. I made it to look best on a 4K screen. Mind the spaghetti; it doesn't bother me (it just feels like working with a wiring harness), and the workflow changes too often to keep it organized.
Create and upscale images to 7168 x 9216 (or other sizes) using SDXL, Kohya High Res. Fix, Perturbed-Attention Guidance (PAG), Perp-Neg, ControlNet, Face Detailer, Refiner, object masking, and more. I try to add any new technology that I come across.
How to use:
- Enter the folder path where images should be saved. Select your model and LoRa.
- Disable Stage 2 and/or Stage 3 (right-click the top bar and select "Set Group Nodes to Never"). I will sometimes disable Stage 6 and only run the final upscale on images I want to share. You can do that using my upscale workflow: https://openart.ai/workflows/blake/4x-upscale-merge-w-screen-multiply/XWkzjfELclE2QhXlXRmI
- Generate images for Stage 1, updating the prompt and seed until you get the image you want. Update the CLIPSeg label/text to select the subject, or whatever you want to run a lower-noise pass on to preserve the original image. Right-click the preview bridge and select "Open in MaskEditor" to manually mask off the subject. These two masks will be combined for Stage 2.
- Enable Stage 2 (right click on the top bar and select "Set Group Nodes to Always").
- Keep making images with different seeds, for the background and character, until you get the image you want. Enable hand detection if the hands are good in the original image. I use a higher denoise on the background, which is isolated in a separate pass.
- Enable Stage 3 and let the rest run to upscale the image as large as possible.
- If the face isn't perfect, change the face pass or face refiner ratio settings.
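Conceptually, the CLIPSeg mask and the manual MaskEditor mask from the steps above are simply unioned into one mask before Stage 2. A minimal numpy sketch of that combination (not the actual node code, and the node may post-process differently):

```python
import numpy as np

def combine_masks(clipseg_mask: np.ndarray, manual_mask: np.ndarray) -> np.ndarray:
    """Union of two float masks in [0, 1]: a pixel is masked if either mask covers it."""
    return np.maximum(clipseg_mask, manual_mask)

# Example: CLIPSeg caught the subject, the manual mask adds a missed region.
clipseg = np.array([[1.0, 0.0], [0.5, 0.0]])
manual = np.array([[0.0, 0.0], [1.0, 1.0]])
combined = combine_masks(clipseg, manual)
```

Anything covered by either mask gets the lower-noise subject pass; everything else gets the higher-denoise background pass.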
Update: The workflow has gotten so good, with all the new algorithms, that you can usually run Stage 2+ without any intervention. It usually gets hands correct the first time. The subject masking is much more precise now, and I rarely need to manually mask something, though it's sometimes still needed for wings or large clothing.
I will usually find a bunch of images that I want to upscale, and then queue them all at once. I will delete any images that I don't want in the ComfyUI history window.
How to use custom noise:
I added a custom noise section that helps you set the perspective/center line on images.
This has three settings: Horizontal Mirrored Noise, Vertical Mirrored Noise, and Blend, which blends the two together.
My recommended settings are Horizontal Mirrored Noise 0.5, Vertical Mirrored Noise 0, and Blend 0.5. This gives you a 50/50 mix of mirrored horizontal noise and regular noise, which means the image trends toward symmetry while the regular noise gives the model some leeway. I found that this gives me the most coherent images. Otherwise, full symmetry works best, since partial symmetry can cause coherence issues, but experiment for yourself!
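The three sliders can be read as blend weights between plain Gaussian noise and flipped copies of itself. The sketch below is an illustration with plain lerp; the workflow's latent nodes (e.g. BNK_SlerpLatent) blend by slerp, and the exact wiring may differ:

```python
import numpy as np

def mirrored_noise(shape, h_mirror=0.5, v_mirror=0.0, blend=0.5, seed=0):
    """Sketch of the three custom-noise sliders.

    h_mirror / v_mirror: 0 = plain noise, higher = more of the flipped copy.
    blend: mixes the horizontal result (0) with the vertical result (1).
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(shape)
    horiz = (1 - h_mirror) * noise + h_mirror * np.flip(noise, axis=-1)
    vert = (1 - v_mirror) * noise + v_mirror * np.flip(noise, axis=-2)
    return (1 - blend) * horiz + blend * vert

# The recommended settings: half-mirrored horizontal noise, no vertical mirroring.
n = mirrored_noise((64, 64), h_mirror=0.5, v_mirror=0.0, blend=0.5)
```

With these settings the result is part mirrored, part free, which is why the composition centers itself without forcing exact symmetry.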
7/02/2024
- I moved the PAG node before Kohya Deep Shrink, so I can use PAG without Deep Shrink, since that seems to work better for Stage 2+.
- I added Ultimate SD Upscale to Stage 2 (Subject) and Stage 3. This solved some weird issues with limbs when upscaling photorealistic images. The upscaling is much more consistent now and works much better with photographic images. Stage 2 will now do an extra pass on the background, since the Ultimate SD Upscale node takes an image rather than a latent input, so I'm unable to mask only the subject for this pass.
5/21/2024
- I noticed a bug where the width and height nodes were not saving the value I entered, so I switched them out for a basic integer node.
5/21/2024
- I noticed an issue with the Perp-Neg node causing upscaled images to turn green. I found that this was due to the ControlNet conditioning. Perp-Neg was also deprecated by the author, so I switched to PerpNegGuider, by the same author: https://github.com/bvhari/ComfyUI_PerpWeight
- I added a Stage 2+ sampler selector.
- I replaced the Stage 1 sampler with the PerpNegGuider node and adapted it to fit the current workflow. This required removing the ControlNet option for Stage 1. I never really used it, and it's impossible to use with Perp-Neg, which needs the negative-prompt conditioning that ControlNet interferes with.
- Image quality and detail are now greatly improved with both Perp-Neg and Perturbed-Attention Guidance. Speed is also significantly increased with the new SamplerCustomAdvanced node that comes with ComfyUI_PerpWeight, which does some optimizations with negative prompts at later steps.
5/19/2024
- Changed to the CR LoRa loader, since the previous one used an outdated package that caused issues on startup.
- Cleaned things up a little.
5/18/2024
- I added the Perturbed-Attention Guidance (PAG) node, which helps the model clean up the image during the diffusion process: https://arxiv.org/abs/2403.17377
- I added the Perp-Neg node. "Perp-Neg employs a denoising process that is restricted to be perpendicular to the direction of the main prompt."
- I made some other small settings tweaks, like changing to the dpmpp_2m_sde sampler.
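The quoted idea behind Perp-Neg can be sketched numerically: the negative-prompt direction is projected so that only its component perpendicular to the positive-prompt direction is subtracted, keeping the negative prompt from fighting the main prompt head-on. A minimal sketch following the paper's formulation (the PerpNegGuider node's internals may differ):

```python
import numpy as np

def perp_neg_guidance(e_uncond, e_pos, e_neg, cfg=7.0, neg_scale=1.0):
    """Combine unconditional, positive, and negative noise predictions,
    using only the negative delta's component perpendicular to the positive delta."""
    d_pos = e_pos - e_uncond
    d_neg = e_neg - e_uncond
    # Project the negative delta onto the positive delta, then remove that part.
    parallel = (np.sum(d_neg * d_pos) / np.sum(d_pos * d_pos)) * d_pos
    d_neg_perp = d_neg - parallel
    return e_uncond + cfg * (d_pos - neg_scale * d_neg_perp)
```

When the negative prediction points exactly along the positive one, its perpendicular component is zero and the result reduces to plain classifier-free guidance.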
4/8/2024
- ComfyUI was updated and caused an error in the BNK_GetSigma node used for generating custom noise. I created a pull request that resolves the error: https://github.com/BlenderNeko/ComfyUI_Noise/pull/29. Once it's merged, you should be able to update all nodes to resolve the error. No workflow changes were needed.
3/31/2024
- The conditioning input switch that I use for the Stage 1 ControlNet had an update that broke it: the input changed from an integer boolean to a true boolean. I updated the switch node to resolve the error.
3/17/2024
- Added background and subject masking. The background mask is inverted and added to the subject mask, which makes masking off the foreground elements and the subject much more reliable. I used "background" because it works much better than masking "foreground". This can still be combined with the manual mask, although it's needed much less often now.
- I disabled the ControlNet Stage 1 nodes, since they would process when not being used, or cause errors when disabled.
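The inverted-background trick above amounts to: invert the CLIPSeg "background" mask, union it with the subject mask, and binarize. A conceptual numpy sketch of that stage (not the actual node graph; threshold value is illustrative):

```python
import numpy as np

def combine_subject_mask(bg_mask, subj_mask, threshold=0.5):
    """Invert the "background" mask, union it with the subject mask, binarize."""
    combined = np.maximum(1.0 - bg_mask, subj_mask)
    return (combined >= threshold).astype(np.float32)

# Top row detected as background; subject detection caught only one pixel.
bg = np.array([[0.9, 0.8], [0.1, 0.2]])
subj = np.array([[0.0, 0.0], [0.7, 0.0]])
mask = combine_subject_mask(bg, subj)
```

The union means a pixel counts as subject if either detector supports it, which is why foreground elements CLIPSeg misses directly still end up inside the mask.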
2/20/2024
- Added ControlNet option/group for Stage 1, so you can create an image based on another image. Use the conditioning switch to enable it.
- I changed to thibaud's model for OpenPose, because it worked more consistently: https://huggingface.co/thibaud/controlnet-openpose-sdxl-1.0/blob/main/control-lora-openposeXL2-rank256.safetensors
- I switched to using "Concat Conditionings with Multiplier (Inspire)" nodes for combining conditioning from ControlNet, because it seems like the others may not play nice with SDXL.
2/1/2024
- Added a ratio setting for Stage 2 and Stage 3, so you can select how much it will upscale.
- Added an upscale model loader at the top, used for supersampling in Stage 2 and Stage 3. This upscales 4x and then downsamples to whatever ratio you select. I noticed this makes more of a difference when using Lanczos to downsample, so I added it back.
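The supersampling path is: model-upscale 4x, then Lanczos-downsample to the requested ratio. A Pillow sketch of just the resampling path, where `model_upscale` is a hypothetical stand-in for the 4x upscale model:

```python
from PIL import Image

def supersample_resize(img: Image.Image, ratio: float, model_upscale) -> Image.Image:
    """Upscale 4x with a model, then Lanczos-downsample to the requested ratio.
    `model_upscale` stands in for an ESRGAN-style 4x upscaler."""
    big = model_upscale(img)  # assumed to return a 4x-larger image
    target = (round(img.width * ratio), round(img.height * ratio))
    return big.resize(target, Image.LANCZOS)

# Hypothetical stand-in for the 4x model, using a plain resize:
fake_4x = lambda im: im.resize((im.width * 4, im.height * 4), Image.NEAREST)
out = supersample_resize(Image.new("RGB", (100, 100)), ratio=2.0, model_upscale=fake_4x)
```

Rendering at 4x and shrinking with Lanczos is what recovers the extra edge detail compared to resizing straight to the target ratio.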
1/29/2024
- Added SDXL conditioning nodes for the Stage 2 and Stage 3+ upscales, with the suggested 4x resolution for each stage. This showed more detail at the edges of images for Stage 2, but didn't make any noticeable change for Stage 3+.
1/26/2024
- Added a single steps control for the first generation, since otherwise it would have required changing the value in four different places. Every other stage should be adjusted individually.
1/26/2024
- Fixed small bug with Stage 2 seed.
- Increased the Stage 3 upscale denoise from 0.12 to 0.32, since Exponential kept the image more coherent and I could get more detail.
1/26/2024
- Updated face fix stage to use CLIPSeg Masking.
- I found that the Exponential scheduler does the best upscaling, but Karras is the best at making new things. I use Karras for the first image and the upscaled background, and Exponential for the rest.
- I changed to the Lanczos upscaling algorithm, rather than AI model upscaling, in between stages. Model upscaling didn't make a big difference there and made images take MUCH longer to generate; it only made a very small difference in Automatic1111. I may add it back when I get a chance to do more testing.
- I added some notes.
1/25/2024
- Switched to using CLIPSeg Masking, because it works best in my testing. It can also detect just about anything in the image. I added this to Stage 1.
- I added a preview bridge so you can custom-mask whatever you want for inpainting and run it at whatever noise you want for the next pass. You can use this if CLIPSeg misses something; the two masks will be combined.
- Added a binary mask option, to help narrow the CLIPSeg mask to what you need.
1/25/2024
- Another massive overhaul. You can now set a custom resolution and many more options.
- Added LoRa stacker to resolve my CLIP issues with LoRa that SDXL seems to have with some nodes. I also switched to specific SDXL clip encoders.
- I set a higher default resolution, one of the recommended SDXL resolutions. This way you will get even larger images! You can also set a custom resolution now.
- Save as .jpg for the final upscale.
- Removed some nodes and made settings changes to speed up the generation process, since the images are a bit bigger now.
- You can set the cfg for all stages now.
- I messed with the mask settings to get better results from Stage 2.
1/24/2024
- After doing some experiments with noise to make consistent full body characters, I added mirrored noise! This lets you set the perspective on images and get nearly 100% consistent center frame full body images. I added 3 settings for this. Each one blends between two different noise patterns to get the perspective or shape you want.
- I added "Latent Adjustment" that will let you change the brightness, contrast, and saturation on the noise.
- I added separate controls, and a switch, so you can change between art and photo LoRa. I think I will eventually go with a stacked-LoRa for both, assuming I can resolve the CLIP settings not carrying over between each LoRa.
- Updated various settings to get more consistent results.
1/21/2024
- Added face refiner ratio and separate face sampler selection.
- Tweaked the face refiner settings and refiner pass to get better results.
- I moved all the samplers to the top so the progress is visible.
- I changed all the windows so that you can drag along the top, and locked down the groups.
- Moved refiner model settings to the top.
- Improvements to the Stage 2 hand detection flow.
- Major reorganization, so each group fits under the preview image.
1/20/2024
- Massive quality of life update and cleanup.
- Moved all the controls to a central location.
- Locked everything so it doesn't get moved around when making images.
- Added completion sounds for the Stage 5 and Stage 6 upscales.
- Added sampler selection for all stages.
- Added save path.
- Added face passes selection.
- I renamed the workflow, since I've been using it for non-fantasy images.
1/18/2024
- Stage 2 now uses ControlNet to get the pose and avoid extra limbs due to the high denoise. Adding ControlNet diluted the conditioning, so I split the seeds for the environment and person so that I don't lose conditioning strength when using ControlNet on the environment.
- Stage 5 was switched to use a pipeline and face detector to fix the faces.
- I updated the LoRa weights and added a CLIP merge so that I can update the CLIP weights from individual LoRas to get the best results. Chaining them together would not apply the CLIP weights of the earlier LoRas. I had the same issue with the LoRa multi-loader.
Versions (36)
- latest (a year ago)
- v20240703-060035
- v20240528-171844
- v20240522-045001
- v20240522-044924
- v20240520-073630
- v20240519-154736
- v20240518-074846
- v20240331-180021
- v20240331-175943
- v20240331-175749
- v20240317-235241
- v20240220-232301
- v20240220-225255
- v20240202-065220
- v20240129-092241
- v20240127-075132
- v20240127-074848
- v20240127-015901
- v20240126-085519
- v20240125-200301
- v20240125-200206
- v20240125-111845
- v20240125-102730
- v20240125-102616
- v20240124-221750
- v20240122-100439
- v20240122-031330
- v20240122-015606
- v20240121-023432
- v20240119-080145
- v20240119-022813
- v20240118-024343
- v20240118-012754
- v20240118-011221
- v20240118-001734
Node Details
Primitive Nodes (20)
JWInteger (2)
Note (2)
PrimitiveNode (1)
Reroute (15)
Custom Nodes (158)
- CR Index Multiply (2)
- CR Integer To String (4)
- CR Apply LoRA Stack (1)
- CR LoRA Stack (2)
ComfyUI
- UpscaleModelLoader (3)
- ImageUpscaleWithModel (4)
- ImageBlend (3)
- PreviewImage (19)
- LatentFlip (2)
- CLIPTextEncodeSDXLRefiner (2)
- VAEDecode (5)
- MaskToImage (5)
- KSamplerAdvanced (1)
- CLIPTextEncodeSDXL (6)
- ImageScale (2)
- ControlNetLoader (3)
- ControlNetApplyAdvanced (3)
- VAEEncode (2)
- CLIPSetLastLayer (1)
- CheckpointLoaderSimple (2)
- DisableNoise (1)
- KSamplerSelect (1)
- BasicScheduler (1)
- EmptyLatentImage (1)
- VAELoader (1)
- PerpNegGuider (1)
- SamplerCustomAdvanced (1)
- KSampler (1)
- InvertMask (2)
- GrowMask (1)
- SetLatentNoiseMask (1)
- PatchModelAddDownscale (1)
- MaskDetailerPipe (1)
- ToBasicPipe (2)
- ToBinaryMask (3)
- ImpactFloat (3)
- MaskListToMaskBatch (2)
- PreviewBridge (1)
- ConcatConditioningsWithMultiplier //Inspire (2)
- OpenposePreprocessor (2)
- MiDaS-DepthMapPreprocessor (1)
- LeReS-DepthMapPreprocessor (1)
- BNK_GetSigma (1)
- BNK_InjectNoise (1)
- BNK_SlerpLatent (3)
- BNK_NoisyLatentImage (1)
- Image To Mask (1)
- PerturbedAttention (1)
- Latent Adjustment (PPF Noise) (1)
- ShowText|pysssss (4)
- MathExpression|pysssss (8)
- PlaySound|pysssss (2)
- Seed Generator (4)
- String Literal (5)
- Cfg Literal (1)
- Int Literal (2)
- Sampler Selector (2)
- Checkpoint Selector (1)
- UltimateSDUpscaleNoUpscale (2)
- Mask Gaussian Region (3)
- CLIPSeg Masking (3)
- Image Save (3)
- Text to Conditioning (1)
- Image Resize (1)
- Constant Number (2)
- Masks Combine Regions (2)
Model Details
Checkpoints (3)
0.3(0.5(0.5(dreamshaper_331BakedVae) + 0.5(protogenV22Anime_22)) + 0.5(0.5(epicDiffusion_epicDiffusion11) + 0.5(rpg_V4))) + 0.7(0.5(pastelmix-better-vae) + 0.5(meinapastel_V4)).safetensors
0.7(0.7(0.7(leosamsHelloworldSDXL_helloworldSDXL32DPO) + 0.3(sdxlYAMERSPERFECTDESIGN_v6UNLIMITEDVOID)) + 0.3(kohakuXLDelta_rev1)) + 0.3(copaxTimelessxlSDXL1_v9).ckpt
LoRAs (6)
Animated_Concept.safetensors
JasmineStyleV2.safetensors
SDXL1.0_Essenz-series-by-AI_Characters_Style_BetterPhotography-v1.2-'Skynet'.safetensors
XDetail_heavy.safetensors
lineartSDXL.safetensors
xl_more_art-full_v1.safetensors