Space Broke
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 622, in process_events
response = await route_utils.call_process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2016, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1569, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2405, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 914, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 846, in wrapper
response = f(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 846, in wrapper
response = f(*args, **kwargs)
File "/home/user/app/convert_url_to_diffusers_sdxl_gr.py", line 347, in convert_url_to_diffusers_repo
new_path = convert_url_to_diffusers_sdxl(dl_url, civitai_key, hf_token, is_upload_sf, half, vae, scheduler, lora_dict, False)
File "/home/user/app/convert_url_to_diffusers_sdxl_gr.py", line 286, in convert_url_to_diffusers_sdxl
pipe.scheduler = sconf[0].from_config(pipe.scheduler.config, **sconf[1])
AttributeError: 'NoneType' object has no attribute 'scheduler'
I've merged your commits and effectively rolled back. This is the first time I've seen this bug; it may even be related to Gradio 5.
I'll have to debug it again. Hopefully it's a simple problem.
It wasn't Gradio 5's fault. It was a false accusation!
I made a mistake in the branching process yesterday when I adapted the LoRA specs to the new PEFT.😭
Mistakes happen 😂!
Also, are you able to add support for LyCORIS, specifically LoHA, to be merged into the checkpoint/Diffusers-format model?
hehehe.
LyCORIS specifically LoHA
PEFT originally had no option to distinguish between LoRA, LoHA, LOCON, LyCORIS and... anyway none of them.
I'm not sure if PEFT implicitly absorbs the difference or if there is essentially no difference in structure there.
The PEFT author wrote that the LoRA part of Diffusers depends on PEFT.
In my experience, at least for LyCORIS, I don't know what it is, but it is usable in Diffusers, so it is probably all usable as it is.
Got this while trying to convert a checkpoint
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/diffusers/loaders/single_file.py", line 495, in from_single_file
loaded_sub_model = load_single_file_sub_model(
File "/usr/local/lib/python3.10/site-packages/diffusers/loaders/single_file.py", line 168, in load_single_file_sub_model
raise SingleFileComponentError(
diffusers.loaders.single_file_utils.SingleFileComponentError: Failed to load CLIPTextModel. Weights for this component appear to be missing in the checkpoint.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 622, in process_events
response = await route_utils.call_process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2016, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1569, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2405, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 914, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 846, in wrapper
response = f(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 846, in wrapper
response = f(*args, **kwargs)
File "/home/user/app/convert_url_to_diffusers_sdxl_gr.py", line 347, in convert_url_to_diffusers_repo
new_path = convert_url_to_diffusers_sdxl(dl_url, civitai_key, hf_token, is_upload_sf, half, vae, scheduler, lora_dict, False)
File "/home/user/app/convert_url_to_diffusers_sdxl_gr.py", line 265, in convert_url_to_diffusers_sdxl
pipe = StableDiffusionXLPipeline.from_single_file(new_file, use_safetensors=True, torch_dtype=torch.float16)
File "/usr/local/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/diffusers/loaders/single_file.py", line 510, in from_single_file
raise SingleFileComponentError(
diffusers.loaders.single_file_utils.SingleFileComponentError: Failed to load CLIPTextModel. Weights for this component appear to be missing in the checkpoint.
Please load the component before passing it in as an argument to `from_single_file`.
text_encoder = CLIPTextModel.from_pretrained('...')
pipe = StableDiffusionXLPipeline.from_single_file(<checkpoint path>, text_encoder=text_encoder)
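For SDXL specifically, that suggested workaround would look roughly like the sketch below, since SDXL has two text encoders. The base repo used as the CLIP source and the local file name are placeholders, not the exact code of the converter Space.

import torch
from transformers import CLIPTextModel, CLIPTextModelWithProjection
from diffusers import StableDiffusionXLPipeline

base = "stabilityai/stable-diffusion-xl-base-1.0"  # placeholder source for the text encoders
text_encoder = CLIPTextModel.from_pretrained(base, subfolder="text_encoder", torch_dtype=torch.float16)
text_encoder_2 = CLIPTextModelWithProjection.from_pretrained(base, subfolder="text_encoder_2", torch_dtype=torch.float16)

pipe = StableDiffusionXLPipeline.from_single_file(
    "model.safetensors",  # placeholder checkpoint path
    text_encoder=text_encoder,
    text_encoder_2=text_encoder_2,
    torch_dtype=torch.float16,
    use_safetensors=True,
)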
hehehe.
LyCORIS specifically LoHA
PEFT originally had no option to distinguish between LoRA, LoHA, LOCON, LyCORIS and... anyway none of them.
I'm not sure if PEFT implicitly absorbs the difference or if there is essentially no difference in structure there.
The PEFT author wrote that the LoRA part of Diffusers depends on PEFT.
In my experience, at least for LyCORIS, I don't know what it is, but it is usable in Diffusers, so it is probably all usable as it is.
I looked it up, and it seems like LyCORIS support in general will take a while for diffusers and PEFT, even though they added support for "Kohya-Styled LoRAs", which should've included LyCORIS, but I guess not.
In my experience, at least for LyCORIS, I don't know what it is, but it is usable in Diffusers, so it is probably all usable as it is.
Gonna try to merge one when the space is fixed
diffusers.loaders.single_file_utils.SingleFileComponentError: Failed to load CLIPTextModel. Weights for this component appear to be missing in the checkpoint.
This appears to be an error saying that your file does not have a CLIP text encoder.
It's not impossible to make the default CLIP text encoder specifiable...
This error doesn't show up with the safetensors file I'm using as my sample, so perhaps ComfyUI or something has output only the UNet?
even though they added support for "Kohya-Styled LoRAs" which should've included LyCORIS but I guess not
So just because there is no error in the PEFT, doesn't mean it's not actually applied...
This appears to be an error saying that your file does not have a CLIP text encoder.
It's not impossible to make the default CLIP text encoder specifiable...
This error doesn't show up with the safetensors file I'm using as my sample, so perhaps ComfyUI or something has output only the UNet?
Not at all, the checkpoints are valid and not just UNETs
https://huggingface.co/bluepen5805/anima_pencil-XL/blob/main/anima_pencil-XL-v5.0.0.safetensors
After all, there are no errors here...
It was on Flux, but I'm wondering if there's some weird prefix attached to the weight that's preventing it from recognizing the key.
Or is it some kind of custom CLIP?
Edit:
Maybe this issue.
https://github.com/huggingface/diffusers/issues/8228
https://github.com/huggingface/diffusers/issues/8435
https://huggingface.co/spaces/John6666/sdxl-to-diffusers-v2-cliptest
I prototyped a version that brings CLIP externally, whether or not there is a CLIP in the safetensors file.
If this does not work, it is quite strange. There is a possibility that some unknown error is happening.
https://huggingface.co/bluepen5805/anima_pencil-XL/blob/main/anima_pencil-XL-v5.0.0.safetensors
After all, there are no errors here...
It was on Flux, but I'm wondering if there's some weird prefix attached to the weight that's preventing it from recognizing the key.
Or is it some kind of custom CLIP?
Edit:
Maybe this issue.
https://github.com/huggingface/diffusers/issues/8228
https://github.com/huggingface/diffusers/issues/8435
It was an Illustrious model: I merged another Illustrious model into the UNet of an (Illustrious-based) model to make the one I wanted to convert.
https://huggingface.co/spaces/John6666/sdxl-to-diffusers-v2-cliptest
I prototyped a version that brings CLIP externally, whether or not there is a CLIP in the safetensors file.
If this does not work, it is quite strange. There is a possibility that some unknown error is happening.
I'm just going to try another checkpoint which is a newer version of the one I tried to convert and see if it works
I'm using from_single_file for my usual conversion, and I've never had a CLIP error with from_single_file, even with many Illustrious-based models...
Merging models can break models.
But merging doesn't usually change the name of the key, so it is strange that CLIP is unrecognizable.
It may not be that something is actually wrong with CLIP, but that there is some other error that is causing these error messages.
A simpler pattern could be that the latest Diffusers are buggy. The one I use locally is not the latest version. This Space, on the other hand, is on the latest dev version.
Edit:
I recall that CLIP is under the jurisdiction of transformers. It's a language model. And the latest transformers was buggy somehow. I'll try to downgrade it a bit.
Edit:
Downgraded.
https://huggingface.co/spaces/John6666/sd-to-diffusers-v2
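For reference, downgrading on a Space usually just means pinning versions in requirements.txt, e.g. (the version numbers below are placeholders, not the confirmed last-known-good ones):

# requirements.txt
diffusers==0.30.3
transformers==4.44.2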
I'm just going to try another checkpoint which is a newer version of the one I tried to convert and see if it works
Its converting..?
Its converting..?
It converted normally with no clip errors
Gonna test the model to see if whether or not the TE is cooked
I'm using from_single_file for my usual conversion, and I've never had a CLIP error with from_single_file, even with many Illustrious-based models...
Merging models can break models.
But merging doesn't usually change the name of the key, so it is strange that CLIP is unrecognizable.
It may not be that something is actually wrong with CLIP, but that there is some other error that is causing these error messages.
A simpler pattern could be that the latest Diffusers are buggy. The one I use locally is not the latest version. This Space, on the other hand, is on the latest dev version.
It could be that the dev version is having a stroke or there might be something else that we aren't seeing
The model works normally, not sure why the older version of it doesn't even convert
An alternate version of the model that had a clip error also works, something isn't right
An alternate version of the model that had a clip error also works, something isn't right
That error-prone model is a valuable debugging resource.
You should dare to keep it.
At least a key checker for state_dict would be something I could do tomorrow for the GUI. Maybe that will tell us something.
The conversion in that model errors out at the key checking stage, not at the content checking stage.
That error-prone model is a valuable debugging resource.
You should dare to keep it.
I'm sure as hell keeping that!
At least a key checker for state_dict would be something I could do tomorrow for the GUI.
I also thought of suggesting this.
The conversion in that model errors out at the key checking stage, not at the content checking stage.
I also noticed that, I'll try to dig into the keys and compare them with the alternate version
I've pinned the diffusers version as well, since we won't have any trouble with SDXL, let alone Flux, even if it's not the dev version.
You should still make that key checker gui tho!
I'd say check, but I don't know the right keys, so I can only make one that just outputs the keys. Well, if something is wrong, we can tell visually.😅
It seems as if the model lost character knowledge so it definitely has a clip problem
I wonder if CLIP or another text encoder is broken, since a slightly broken UNET wouldn't do that.
It's still a rare case.
I'd say check, but I don't know the right keys, so I can only make one that just outputs the keys. Well, if something is wrong, we can tell visually.😅
We can dump the keys from illustrious base or any illustrious based model and compare the keys
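A minimal sketch of that kind of key dump/compare, reading only the safetensors headers (file paths are placeholders, and this is not the actual checker Space code):

from safetensors import safe_open

def keys_of(path):
    # safe_open only reads the header, so this is cheap even for multi-GB files
    with safe_open(path, framework="pt") as f:
        return set(f.keys())

a = keys_of("broken_model.safetensors")     # placeholder
b = keys_of("reference_model.safetensors")  # placeholder

print("missing from the broken model:", sorted(b - a))
print("extra in the broken model:", sorted(a - b))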
It seems as if the model lost character knowledge so it definitely has a clip problem
I wonder if CLIP or another text encoder is broken, since a slightly broken UNET wouldn't do that.
It's still a rare case.
Either the CLIP accidentally fixed itself during the merging process (since CLIP kind of breaks when merging) or a TE is actually broken; we'll have to see.
The CLI version was completed first. The reference data is from SDXL 1.0 + 0.9 VAE.
https://huggingface.co/datasets/John6666/stkey_cli
The GUI version was completed.
https://huggingface.co/spaces/John6666/safetensors-key-checker
Will test it out now
So the model that has the clip broken is missing these keys:
[
"conditioner.embedders.0.transformer.text_model.embeddings.position_ids",
"conditioner.embedders.1.model.logit_scale"
]
And the alternate version of that model is missing this:
[
"conditioner.embedders.0.transformer.text_model.embeddings.position_ids"
]
I believe that merging models breaks the CLIP position IDs, so that's normal.
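If those really are the only missing keys, one hedged repair is to copy them over from a known-good checkpoint before conversion; a rough sketch (paths are placeholders, metadata is not preserved, and whether the copied values match what the merge intended is not guaranteed):

from safetensors.torch import load_file, save_file

broken = load_file("broken_model.safetensors")        # placeholder
reference = load_file("reference_model.safetensors")  # placeholder

for key in ["conditioner.embedders.0.transformer.text_model.embeddings.position_ids",
            "conditioner.embedders.1.model.logit_scale"]:
    if key not in broken and key in reference:
        broken[key] = reference[key].clone()
        print("restored", key)

save_file(broken, "patched_model.safetensors")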
Thank you for the experiment.
merging models breaks the CLIP position IDs, so that's normal
I know it breaks the contents, but I too have messed with the contents of the merger, but the fact that the keys themselves are missing is a mystery.
And I didn't see anything unusual in the original Illustrious, derived, or other SDXL merged models.
Maybe that version of the merger is buggy...?
With Diffusers, we can just copy and paste the entire CLIP or VAE folder from elsewhere and the port is complete, and I can add that kind of functionality, but I'm afraid that a potential bug in the merger could have other effects.
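In the Diffusers-format case, that "copy the folder from elsewhere" idea is literally just replacing the text_encoder subfolders; a sketch, with the source repo and target path as placeholders:

import shutil
from huggingface_hub import snapshot_download

# download only the text encoder folders from a known-good SDXL repo (placeholder repo id)
src = snapshot_download("stabilityai/stable-diffusion-xl-base-1.0",
                        allow_patterns=["text_encoder/*", "text_encoder_2/*"])

for sub in ["text_encoder", "text_encoder_2"]:
    shutil.copytree(f"{src}/{sub}", f"./converted_repo/{sub}", dirs_exist_ok=True)  # placeholder target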
Thank you for the experiment.
merging models breaks the CLIP position IDs, so that's normal
I know it breaks the contents, but I too have messed with the contents of the merger, but the fact that the keys themselves are missing is a mystery.
True
And I didn't see anything unusual in the original Illustrious, derived, or other SDXL merged models.
Maybe that version of the merger is buggy...?
I don't think so, since I've been merging models almost every day and other checkpoints work perfectly fine
With Diffusers, we can just copy and paste the entire CLIP or VAE folder from elsewhere and the port is complete, and I can add that kind of functionality, but I'm afraid that a potential bug in the merger could have other effects.
It's really weird that the checkpoint is missing the "conditioner.embedders.1.model.logit_scale" key; I'm not sure what kind of error could make this key disappear.
I don't think so, since I've been merging models almost every day and other checkpoints work perfectly fine
So even if it is a bug, it is not a logic error, but a malfunction that occurs under limited conditions...
The "conditioner.embedders.1.model.logit_scale" key, I'm not sure what kind of error could make this key disappear
If this key does not exist in either model, the merger may ignore the entire key. But if so, then where did the model without the key come from?
For example, the process of baking VAE, or the pre-process or post-process when reading or writing the model, or something else in the model tool could be buggy.
Anyway, something has to go wrong somewhere for it to break...
Well, not only in the program, but also in some cases when downloading or when the HDD is corrupted and the data gets corrupted...😱
So even if it is a bug, it is not a logic error, but a malfunction that occurs under limited conditions...
For example, the process of baking VAE, or the pre-process or post-process when reading or writing the model, or something else in the model tool could be buggy.
I'm thinking of reviewing the merging script. But I'm not sure if that'll point us to where the issue is.
If this key does not exist in either model, the merger may ignore the entire key. But if so, then where did the model without the key come from?
I'm quite unsure, the alternate version is not missing that key.
when the HDD is corrupted and the data gets corrupted...
I believe this isn't an issue since I did this merge in Colab.
Hmmm, if it happened in Colab, it could only be caused by the model or somewhere in the program.
One thing I noticed when I was fiddling with the merger is that the tensor shape for UNET is basically constant, but for text encoders, the tensor shape could be different depending on the model.
What if, under certain conditions, the tensor arithmetic process to make it consistent before that merge failed and blew up the whole key or something?
However, this is a weak theory because the phenomenon should be reproduced almost 100% for the same combination of models.
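One way to test that theory on a concrete pair of source models would be to compare the shapes of the shared keys before merging; a sketch (paths are placeholders):

from safetensors import safe_open

def shapes_of(path):
    # header-only read: returns {key: shape} without loading the tensor data
    with safe_open(path, framework="pt") as f:
        return {k: tuple(f.get_slice(k).get_shape()) for k in f.keys()}

a = shapes_of("model_a.safetensors")  # placeholder
b = shapes_of("model_b.safetensors")  # placeholder

for key in sorted(set(a) & set(b)):
    if a[key] != b[key]:
        print(f"shape mismatch for {key}: {a[key]} vs {b[key]}")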
but for text encoders, the tensor shape could be different depending on the model.
This is part of the reason why merging breaks CLIP, and why the model converter extension on A1111 has these options (which can break the entire model 💀).
What if, under certain conditions, the tensor arithmetic process to make it consistent before that merge failed and blew up the whole key or something?
This might be what happened, except DURING the merge. I would've said that it was due to my Merge Block Weight values, but I use the same ones on all of my models.
However, this is a weak theory because the phenomenon should be reproduced almost 100% for the same combination of models.
Yeah True
All right! This is the guy who did it. We'd still rather have an error out than have the model break...😓
All right! This is the guy who did it.
LOL, they're valid MBW values! It shouldn't outright delete keys.
We'd still rather have an error out than have the model break...😓
Yup, it's so interesting to dig into. I'll probably review the merging script today.
Also
@John6666
have you ever thought about making a GUI for merging (on HF Spaces)? Especially one that supports SDXL and Flux (Flux is quite interesting for me; I'm thinking of getting more into it, especially with its customizability!) I'd love to see you work on this!
Since WebUIs are pretty buggy on HF Spaces (Forge and A1111).
Or if you have Flux merging scripts, please link them? (Though a GUI would be 🔥🔥)
As for creating a GUI for the merge, I've only done it on a trial basis, and so far I haven't thought about doing it in earnest.
I don't see the point of creating a third merger when there are excellent mergers that are constantly maintained in ComfyUI and WebUI. I'm not even good at logic programming.
If there's a library specific to HF, then it makes sense to wrap it...
WebUI is buggy in HF's Spaces mainly because they can't put in the latest components due to the limited specs of CPU Spaces.
https://discuss.huggingface.co/t/infinite-preparing-space/107534/1
The only reason gguf-my-repo works decently is because it calls the llama.cpp CLI inside...
The old-fashioned Linux CLI tools are good at processing data bit by bit in a memory-saving manner.
On the other hand, in the Zero GPU space, it is still difficult to do it normally because of the Quota.
This effectively closes the way for individual volunteers to achieve this, but the HF staff would not be interested in making WebUI or ComfyUI work on HF.
Even if I were to build a full scratch merger, I can only do a fairly simple merge with 16GB of RAM and 50GB of HDD in CPU space. So that thing works, but I've given up on improving it.
There are some operations that can't be done unless the GPU is available, and there simply isn't enough room for the model.
As a workaround for the current situation, there is a way to use a high-spec VM with Zero GPU space without using GPUs, and in fact I am using it to convert Flux models to Diffusers format, but the problem here is the 10-space limit issue. One of the reasons why I've been playing around with CPU space so much lately is because I can't deploy any more Zero GPU space even if I created it. If I can't use it myself, I can't debug it...
Anyway, lamenting is pointless, so I think I'll just make a frame-like program that supports both Zero GPU space and CPU space, like the quantizer I made yesterday.
So, do you know of any WebUI or ComfyUI spaces that work relatively decently? I would like to use it as a base.
So, do you know of any WebUI or ComfyUI spaces that work relatively decently? I would like to use it as a base.
ComfyUI, Forge and ReForge (A1111 is unoptimized); ReForge and ComfyUI are arguably the best right now. I recall Comfy has a full Python API.
So, do you know of any WebUI or ComfyUI spaces that work relatively decently? I would like to use it as a base.
I think I have an A1111 space but that is so slow
I have yet to see anyone try to port Forge/ReForge
Comfy is unusable in spaces beyond their python api
Wtf, I just found this: https://huggingface.co/spaces/kadirnar/ComfyUI-Demo (I can barely use it because I'm on my phone, but I believe you are on PC so it'll work better for you)
That's great, but if we have the space running on A10G and Persistent Storage, that seems like it's already good enough...?
Looking at the contents, he's just using Docker to git clone, download files with wget and run ComfyUI, so I guess I could port this, but with this HDD consumption, it would crash before it could boot in a CPU space.
Oh well, I'll try various modifications.
For now, I got it to work in Gradio space without Docker, downloads from HF are fast, and we can read private repos if we set HF_TOKEN to Secrets.
It also worked in Zero GPU space, though in CPU mode.
But it's going to be hard to mess with ComfyUI internals to upload output folders and such... WebUI (or Forge) might be easier to modify since it's Gradio.
It would be easier if someone already had a fork with modifications for HF, or a plugin or something.
https://huggingface.co/spaces/John6666/comfy_test
https://huggingface.co/spaces/wrdias/ComfyUI-Animatediff
But it's going to be hard to mess with ComfyUI internals to upload output folders and such... WebUI (or Forge) might be easier to modify since it's Gradio.
True. I thought maybe porting SuperMerger to a standalone Gradio app would be better, while also porting the Model Mixer extension (which I believe has the DARE merging method).
I never thought of that idea.😇
It's true that the plug-in is not really connected to the main unit.
Worst case scenario, I could create the entire GUI.
I think ComfyUI's Merger has more functions than SuperMerger, but I don't know how to handle objects inside ComfyUI at all... SuperMerger is probably close to an independent software in that respect.
I never thought of that idea.😇
It's true that the plug-in is not really connected to the main unit.
Worst case scenario, I could create the entire GUI.
I think ComfyUI's Merger has more functions than SuperMerger, but I don't know how to handle objects inside ComfyUI at all... SuperMerger is probably close to an independent software in that respect.
We could port the code in SuperMerger and Model Mixer and make it standalone; it's not easy, but it's possible.
ComfyUI has more functions because no one else is willing to code the kind of customizability that ComfyUI has.
https://huggingface.co/spaces/John6666/supermerger-test
I tried to force SuperMerger to be standalone, but it has a stronger dependency on WebUI than I expected and I can't bring it to startup.
It's the classic Python relative/absolute import problem. I'm starting to think it would be easier to modify it if I don't detach it.
By the way, the current base is Forge.
For ComfyUI, this merger is also carefully implemented. The interpolation process is also carefully implemented.
https://github.com/54rt1n/ComfyUI-DareMerge
https://huggingface.co/spaces/John6666/supermerger-test
I tried to force SuperMerger to be standalone, but it has a stronger dependency on WebUI than I expected and I can't bring it to startup.
It's the classic Python relative/absolute import problem. I'm starting to think it would be easier to modify it if I don't detach it.
By the way, the current base is Forge.
If forcing extensions to become standalone doesn't work at all,
Then there should be a way to make such webuis work with zerogpu?
For ComfyUI, this merger is also carefully implemented. The interpolation process is also carefully implemented.
https://github.com/54rt1n/ComfyUI-DareMerge
Yeah that one looks very good
If forcing extensions to become standalone doesn't work at all,
Then there should be a way to make such webuis work with zerogpu?
OR going through the code of supermerger/model mixer and copying over (with the needed modifications) what we need
Then there should be a way to make such webuis work with zerogpu?
This is probably possible, but there is one complication: WebUI was Gradio 3 until this summer, but the Zero GPU space requires Gradio 4 or higher. So we will have to rethink a better package configuration with a newer version of WebUI or Forge, rather than a copy of the old CPU space or regular GPU space that was running on HF.
Well, it won't be hard, just start in CPU mode. Better performance or modification is another matter, though.
supermerger/model mixer and copying over (with the needed modifications) what we need
As long as it's a merger, I think it's possible because it will eventually just read two or more files and write them out to one file.
I would make copy/paste a last resort because it would be hard to maintain. If I'm the only maintainer, though, I'll copy and paste mercilessly.
I see no problems with SuperMerger's Gradio code and no glaring compatibility issues with Gradio4. It would be possible to run it in Gradio5 as a stand-alone application right away, if only for its appearance, without any content.
The only problem really appears to be the dependencies. If I keep chipping away at the code I'm allowed to chip away at, it should work at some point...
The only reason I haven't started scraping the code yet is because SuperMerger is frequently maintained.
It would be a shame to fork it from the current specific version. After tomorrow, I will try to get it working as it is as much as possible.
In parallel, I will try to run the Gradio4 version of the WebUI series in HF space. This should not be difficult since we have the know-how of the Gradio3 version. It will just take a lot of work.
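As a rough illustration of the "read two files, write one file" core mentioned above, a plain weighted merge of two checkpoints might look like the sketch below. Paths and the mix ratio are placeholders, and real mergers like SuperMerger do much more (per-block weights, other merge modes, etc.):

from safetensors.torch import load_file, save_file

alpha = 0.5  # placeholder mix ratio
a = load_file("model_a.safetensors")  # placeholder
b = load_file("model_b.safetensors")  # placeholder

merged = {}
for key in a.keys() & b.keys():
    if a[key].shape != b[key].shape or not a[key].is_floating_point():
        merged[key] = a[key]  # e.g. position_ids or mismatched shapes: copy instead of averaging
        continue
    merged[key] = ((1.0 - alpha) * a[key].float() + alpha * b[key].float()).to(a[key].dtype)

# keys present in only one of the models are carried over so they don't silently vanish
for key in a.keys() | b.keys():
    if key not in merged:
        merged[key] = a[key] if key in a else b[key]

save_file(merged, "merged.safetensors")  # placeholder output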
Any Updates? 😄
No, I've started other work and haven't made any progress at all.
The reason for the lack of progress is that I'm worried about the modules.timer problem.
Of course, I can work around it by porting manually, but it must be a problem that comes with all Python program porting. If I don't solve it, I'll run into its relatives again...
I can't claim to be an expert on Python, but it has all the pros and cons of a scripting language, and is not a great language for the encapsulation and reuse concept. It is more suited to the style of writing simple code and reusing it by copying and pasting. I think it's hard for library authors.
So I'd like to solve it the hard way, but even if I change the directory structure, I'm still stumped by modules.timer rather than other modules... I could understand if I were stumped by all of them.
But the actual WebUI is running in another environment. So it's not impossible to import this module itself; it must be some environmental factor or an easy mistake.
On the HF CPU space, it works in the pattern of git cloning WebUI after space startup and starting .sh with subprocess.run().
However, the Zero GPU space is incompatible with processes and threads, so this approach itself should be avoided. But it should give us a clue as to how to solve the problem.
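For reference, the CPU-space pattern described above boils down to an app.py along these lines; the repo URL, flags and port are examples rather than the exact Space code:

import pathlib
import subprocess

# clone WebUI at startup instead of shipping it inside the Space repo
subprocess.run(["git", "clone", "--depth", "1",
                "https://github.com/AUTOMATIC1111/stable-diffusion-webui.git", "webui"], check=True)

# launch in CPU-only mode on the port HF Spaces expects
subprocess.run(["python3", "launch.py", "--skip-torch-cuda-test", "--use-cpu", "all",
                "--precision", "full", "--no-half", "--listen", "--port", "7860"],
               cwd=pathlib.Path("webui"), check=True)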
Hmmm.
The modules are useless, but they shouldn't be incompletely added to the modules folder. I'm currently adding the missing ones there and seeing if I can get the space running or not. I think a little bit of LLMs can help with this.
I'm trying to look over the code too and see what I can do about it.
We wouldn't have a problem with copy pasting bits and pieces from supermerger and model mixer but that'll make it a lot less humanly readable.
So we're still on plan A: port SuperMerger.
I added the needed modules and got this
"/usr/local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 319, in _lazy_init
torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
I think this is where we need the webui running on CPU part
--skip-torch-cuda-test
It should work if you can set the equivalent of this option.
--skip-torch-cuda-test
It should work if you can set the equivalent of this option.
Not sure how I should go about adding that
I tried the good ol'
import torch
device = "cpu" if not torch.cuda.is_available() else "cuda"
But I guess I should try another way
good ol'
Exactly.
But I guess I should try another way
If there is a .to("cuda") inside the module, this won't help. 😅
Something isn't right.
I added --skip-torch-cuda-test to the commandline args env variable and still.
If it's not a WebUI version issue, there should be a clue somewhere in this.
subprocess.run([r"python3" ,r"launch.py",r"--precision",r"full",r"--no-half",r"--no-half-vae",r"--enable-insecure-extension-access",r"--medvram",r"--skip-torch-cuda-test",r"--enable-console-prompts",r"--ui-settings-file="+str(pathlib.Path(__file__).parent /r"config.json")])
Hmmmmm
Exit code: 1. Reason: Traceback (most recent call last):
File "/home/user/app/app.py", line 4, in <module>
subprocess.run([r"python3" ,r"launch.py",r"--precision",r"full",r"--no-half",r"--no-half-vae",r"--enable-insecure-extension-access",r"--medvram",r"--skip-torch-cuda-test",r"--enable-console-prompts",r"--ui-settings-file="+str(pathlib.Path(__file__).parent /r"config.json")])
NameError: name 'pathlib' is not defined
import pathlib
import pathlib
This is what's being done already lol
"/home/user/app/backend/memory_management.py", line 100, in get_torch_device
return torch.device(torch.cuda.current_device())
File "/usr/local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 940, in current_device
_lazy_init()
File "/usr/local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 319, in _lazy_init
torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
Something isn't right
/home/user/app/backend/memory_management.py
It's part of the memory management function by the Forge author, the respected lllyasviel.
I wonder if it can be specified in the startup options for Forge? I'll go have a look on github.
Edit:
https://github.com/lllyasviel/stable-diffusion-webui-forge/blob/main/modules/cmd_args.py#L75
parser.add_argument("--use-cpu", nargs='+', help="use CPU as torch device for specified modules", default=[], type=str.lower)
https://github.com/lllyasviel/stable-diffusion-webui-forge/blob/main/backend/args.py#L50
vram_group.add_argument("--always-cpu", action="store_true")
~
parser.add_argument("--disable-gpu-warning", action="store_true")
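So a Forge launch forced onto the CPU would presumably look something like this; only the flags quoted above (plus --skip-torch-cuda-test from earlier) are taken from the repos, the rest is assumption:

import subprocess

subprocess.run(["python3", "launch.py",
                "--always-cpu",             # Forge backend/args.py
                "--skip-torch-cuda-test",   # modules/cmd_args.py
                "--use-cpu", "all"],        # modules/cmd_args.py
               check=True)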
So I should try this when doing the subprocess python3 launch.py?
# force CPU unconditionally (the original only does this when --always-cpu is set)
#if args.always_cpu:
# cpu_state = CPUState.CPU
cpu_state = CPUState.CPU
I think this is all we can do.
Still have the same GPU error.
I'd say upload all the module files and try to implement this.
I'm not sure what I'm doing wrong.
It's 23:00, so tomorrow. But anyway, WebUI is trickier than we imagined...🥶
I wonder if options are being used in place of environment variables.
tomorrow
Take your time!
But anyway, WebUI is trickier than we imagined...🥶
Very true
I wonder if options are being used in place of environment variables.
Well, we do have COMMANDLINE_ARGS, but it doesn't seem to have any effect in my testing.
I experimented a bit, uploading the WebUI package directly to HF doesn't work, but git clone to the same path does.
I'm not sure why, but I'm starting to understand a little.
Maybe it's just a trivial problem with HF's space specs. Something to do with line break codes or something like that.
https://huggingface.co/spaces/John6666/webui_test2
Edit:
I tried both LF and CRLF for newline codes, but both failed. Files that have been placed in Spaces from the beginning fail to import even when copied with shutil.copytree() from another location.
There is no difference between the git clone'd file and the file uploaded to Spaces when comparing them with the diff command.
The symptoms I can see now are similar to the following. This is just a symptom.
Anyway, this is not about logic or code. It is a problem with the execution environment.
https://stackoverflow.com/questions/44484082/python-cant-find-module-when-started-with-sudo
I experimented a bit, uploading the WebUI package directly to HF doesn't work, but git clone to the same path does.
You should only upload the full module folder, since a lot are missing
Anyway, this is not about logic or code. It is a problem with the execution environment.
That's weird, I'll look into it more.
You should only upload the full module folder, since a lot are missing
I've tried uploading the entire contents of the WebUI zip and the git clone. Neither of them worked if I put them there beforehand. And the file contents are the same as far as I can detect with diff, so I think the problem is a filesystem attribute, which Python is actually being run, some sort of permission, or some other parameter that is more implicit.
I suspected it might be the GitHub assets, but they're not even in Forge's 4GB package, so probably not.
Edit:
https://huggingface.co/spaces/John6666/webui_test3
It worked! The code that didn't work yesterday is now working...
Well, something must have broken in the HF settings. If it works, it's OK.
Edit:
Anyway, now we can get into coding instead of fighting with Python. It's so stressful doing something other than programming to program...
Edit:
Now we are interrupted by an incompatible component of Gradio in the initialization function...
It seems to refer to a component that doesn't exist in 4.x; I'll have to find a good route to bypass it.
Forge stops with another error. Currently, WebUI is easier to get up and running.
https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/16529