Space Broke
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 622, in process_events
response = await route_utils.call_process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2016, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1569, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2405, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 914, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 846, in wrapper
response = f(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 846, in wrapper
response = f(*args, **kwargs)
File "/home/user/app/convert_url_to_diffusers_sdxl_gr.py", line 347, in convert_url_to_diffusers_repo
new_path = convert_url_to_diffusers_sdxl(dl_url, civitai_key, hf_token, is_upload_sf, half, vae, scheduler, lora_dict, False)
File "/home/user/app/convert_url_to_diffusers_sdxl_gr.py", line 286, in convert_url_to_diffusers_sdxl
pipe.scheduler = sconf[0].from_config(pipe.scheduler.config, **sconf[1])
AttributeError: 'NoneType' object has no attribute 'scheduler'
I've merged your commits and effectively rolled back. This is the first time I've seen this bug; it may even be related to Gradio 5.
I'll have to debug it again. Hopefully it's a simple problem.
It wasn't Gradio 5's fault. It was a false accusation!
I made a mistake in the branching process yesterday when I adapted the LoRA specs to the new PEFT.😭
Mistakes happen 😂!
Also, are you able to add support for LyCORIS, specifically LoHA, to be merged into the checkpoint/Diffusers-format model?
hehehe.
LyCORIS specifically LoHA
PEFT originally had no option to distinguish between LoRA, LoHA, LOCON, LyCORIS and... anyway none of them.
I'm not sure if PEFT implicitly absorbs the difference or if there is essentially no difference in structure there.
The PEFT author wrote that the LoRA part of Diffusers depends on PEFT.
In my experience, at least for LyCORIS, I don't know what it is, but it is usable in Diffusers, so it is probably all usable as it is.
Got this while trying to convert a checkpoint
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/diffusers/loaders/single_file.py", line 495, in from_single_file
loaded_sub_model = load_single_file_sub_model(
File "/usr/local/lib/python3.10/site-packages/diffusers/loaders/single_file.py", line 168, in load_single_file_sub_model
raise SingleFileComponentError(
diffusers.loaders.single_file_utils.SingleFileComponentError: Failed to load CLIPTextModel. Weights for this component appear to be missing in the checkpoint.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 622, in process_events
response = await route_utils.call_process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2016, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1569, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2405, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 914, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 846, in wrapper
response = f(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 846, in wrapper
response = f(*args, **kwargs)
File "/home/user/app/convert_url_to_diffusers_sdxl_gr.py", line 347, in convert_url_to_diffusers_repo
new_path = convert_url_to_diffusers_sdxl(dl_url, civitai_key, hf_token, is_upload_sf, half, vae, scheduler, lora_dict, False)
File "/home/user/app/convert_url_to_diffusers_sdxl_gr.py", line 265, in convert_url_to_diffusers_sdxl
pipe = StableDiffusionXLPipeline.from_single_file(new_file, use_safetensors=True, torch_dtype=torch.float16)
File "/usr/local/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/diffusers/loaders/single_file.py", line 510, in from_single_file
raise SingleFileComponentError(
diffusers.loaders.single_file_utils.SingleFileComponentError: Failed to load CLIPTextModel. Weights for this component appear to be missing in the checkpoint.
Please load the component before passing it in as an argument to `from_single_file`.
text_encoder = CLIPTextModel.from_pretrained('...')
pipe = StableDiffusionXLPipeline.from_single_file(<checkpoint path>, text_encoder=text_encoder)
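For SDXL specifically, that suggested workaround would look roughly like the sketch below, since SDXL has two text encoders. The base repo used as the CLIP source and the local file name are placeholders, not the exact code of the converter Space.

import torch
from transformers import CLIPTextModel, CLIPTextModelWithProjection
from diffusers import StableDiffusionXLPipeline

base = "stabilityai/stable-diffusion-xl-base-1.0"  # placeholder source for the text encoders
text_encoder = CLIPTextModel.from_pretrained(base, subfolder="text_encoder", torch_dtype=torch.float16)
text_encoder_2 = CLIPTextModelWithProjection.from_pretrained(base, subfolder="text_encoder_2", torch_dtype=torch.float16)

pipe = StableDiffusionXLPipeline.from_single_file(
    "model.safetensors",  # placeholder checkpoint path
    text_encoder=text_encoder,
    text_encoder_2=text_encoder_2,
    torch_dtype=torch.float16,
    use_safetensors=True,
)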
hehehe.
LyCORIS specifically LoHA
PEFT originally had no option to distinguish between LoRA, LoHA, LOCON, LyCORIS and... anyway none of them.
I'm not sure if PEFT implicitly absorbs the difference or if there is essentially no difference in structure there.
The PEFT author wrote that the LoRA part of Diffusers depends on PEFT.
In my experience, at least for LyCORIS, I don't know what it is, but it is usable in Diffusers, so it is probably all usable as it is.
I looked it up, and it seems like LyCORIS support in general will take a while for diffusers and PEFT, even though they added support for "Kohya-Styled LoRAs", which should've included LyCORIS, but I guess not.
In my experience, at least for LyCORIS, I don't know what it is, but it is usable in Diffusers, so it is probably all usable as it is.
Gonna try to merge one when the space is fixed
diffusers.loaders.single_file_utils.SingleFileComponentError: Failed to load CLIPTextModel. Weights for this component appear to be missing in the checkpoint.
This appears to be an error saying that your file does not have a CLIP text encoder.
It's not impossible to make the default CLIP text encoder specifiable...
This error doesn't show up with the safetensors file I'm using as my sample, so perhaps ComfyUI or something has output only the UNet?
even though they added support for "Kohya-Styled LoRAs" which should've included LyCORIS but I guess not
So just because there is no error in the PEFT, doesn't mean it's not actually applied...
This appears to be an error saying that your file does not have a CLIP text encoder.
It's not impossible to make the default CLIP text encoder specifiable...
This error doesn't show up with the safetensors file I'm using as my sample, so perhaps ComfyUI or something has output only the UNet?
Not at all, the checkpoints are valid and not just UNETs
https://huggingface.co/bluepen5805/anima_pencil-XL/blob/main/anima_pencil-XL-v5.0.0.safetensors
After all, there are no errors here...
It was on Flux, but I'm wondering if there's some weird prefix attached to the weight that's preventing it from recognizing the key.
Or is it some kind of custom CLIP?
Edit:
Maybe this issue.
https://github.com/huggingface/diffusers/issues/8228
https://github.com/huggingface/diffusers/issues/8435
https://huggingface.co/spaces/John6666/sdxl-to-diffusers-v2-cliptest
I prototyped a version that brings CLIP externally, whether or not there is a CLIP in the safetensors file.
If this does not work, it is quite strange. There is a possibility that some unknown error is happening.
https://huggingface.co/bluepen5805/anima_pencil-XL/blob/main/anima_pencil-XL-v5.0.0.safetensors
After all, there are no errors here...
It was on Flux, but I'm wondering if there's some weird prefix attached to the weight that's preventing it from recognizing the key.
Or is it some kind of custom CLIP?
Edit:
Maybe this issue.
https://github.com/huggingface/diffusers/issues/8228
https://github.com/huggingface/diffusers/issues/8435
It was an Illustrious model: I merged another Illustrious model into the UNet of an (Illustrious-based) model to make the one I wanted to convert.
https://huggingface.co/spaces/John6666/sdxl-to-diffusers-v2-cliptest
I prototyped a version that brings CLIP externally, whether or not there is a CLIP in the safetensors file.
If this does not work, it is quite strange. There is a possibility that some unknown error is happening.
I'm just going to try another checkpoint which is a newer version of the one I tried to convert and see if it works
I'm using from_single_file for my usual conversion, and I've never had a CLIP error with from_single_file, even with many Illustrious-based models...
Merging models can break models.
But merging doesn't usually change the name of the key, so it is strange that CLIP is unrecognizable.
It may not be that something is actually wrong with CLIP, but that there is some other error that is causing these error messages.
A simpler pattern could be that the latest Diffusers are buggy. The one I use locally is not the latest version. This Space, on the other hand, is on the latest dev version.
Edit:
I recall that CLIP is under the jurisdiction of transformers. It's a language model. And the latest transformers was buggy somehow. I'll try to downgrade it a bit.
Edit:
Downgraded.
https://huggingface.co/spaces/John6666/sd-to-diffusers-v2
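For reference, downgrading on a Space usually just means pinning versions in requirements.txt, e.g. (the version numbers below are placeholders, not the confirmed last-known-good ones):

# requirements.txt
diffusers==0.30.3
transformers==4.44.2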
I'm just going to try another checkpoint which is a newer version of the one I tried to convert and see if it works
Its converting..?
Its converting..?
It converted normally with no clip errors
Gonna test the model to see if whether or not the TE is cooked
I'm using from_single_file for my usual conversion, and I've never had a CLIP error with from_single_file, even with many Illustrious-based models...
Merging models can break models.
But merging doesn't usually change the name of the key, so it is strange that CLIP is unrecognizable.
It may not be that something is actually wrong with CLIP, but that there is some other error that is causing these error messages.
A simpler pattern could be that the latest Diffusers are buggy. The one I use locally is not the latest version. This Space, on the other hand, is on the latest dev version.
It could be that the dev version is having a stroke or there might be something else that we aren't seeing
The model works normally, not sure why the older version of it doesn't even convert
An alternate version of the model that had a clip error also works, something isn't right
An alternate version of the model that had a clip error also works, something isn't right
That error-prone model is a valuable debugging resource.
You should dare to keep it.
At least a key checker for state_dict would be something I could do tomorrow for the GUI. Maybe that will tell us something.
The conversion in that model errors out at the key checking stage, not at the content checking stage.
That error-prone model is a valuable debugging resource.
You should dare to keep it.
I'm sure as hell keeping that!
At least a key checker for state_dict would be something I could do tomorrow for the GUI.
I also thought of suggesting this.
The conversion in that model errors out at the key checking stage, not at the content checking stage.
I also noticed that, I'll try to dig into the keys and compare them with the alternate version
I've pinned the diffusers version as well, since we won't have any trouble with SDXL, let alone Flux, even if it's not the dev version.
You should still make that key checker gui tho!
I'd say check, but I don't know the right keys, so I can only make one that just outputs the keys. Well, if something is wrong, we can tell visually.😅
It seems as if the model lost character knowledge so it definitely has a clip problem
I wonder if CLIP or another text encoder is broken, since a slightly broken UNET wouldn't do that.
It's still a rare case.
I'd say check, but I don't know the right keys, so I can only make one that just outputs the keys. Well, if something is wrong, we can tell visually.😅
We can dump the keys from illustrious base or any illustrious based model and compare the keys
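A minimal sketch of that kind of key dump/compare, reading only the safetensors headers (file paths are placeholders, and this is not the actual checker Space code):

from safetensors import safe_open

def keys_of(path):
    # safe_open only reads the header, so this is cheap even for multi-GB files
    with safe_open(path, framework="pt") as f:
        return set(f.keys())

a = keys_of("broken_model.safetensors")     # placeholder
b = keys_of("reference_model.safetensors")  # placeholder

print("missing from the broken model:", sorted(b - a))
print("extra in the broken model:", sorted(a - b))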
It seems as if the model lost character knowledge so it definitely has a clip problem
I wonder if CLIP or another text encoder is broken, since a slightly broken UNET wouldn't do that.
It's still a rare case.
Either the CLIP accidentally fixed itself during the merging process (since CLIP kind of breaks when merging) or a TE is actually broken; we'll have to see.
The CLI version was completed first. The reference data is from SDXL 1.0 + 0.9 VAE.
https://huggingface.co/datasets/John6666/stkey_cli
The GUI version was completed.
https://huggingface.co/spaces/John6666/safetensors-key-checker
Will test it out now
So the model that has the clip broken is missing these keys:
[
"conditioner.embedders.0.transformer.text_model.embeddings.position_ids",
"conditioner.embedders.1.model.logit_scale"
]
And the alternate version of that model is missing this:
[
"conditioner.embedders.0.transformer.text_model.embeddings.position_ids"
]
I believe that merging models breaks the CLIP position IDs, so that's normal.
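If those really are the only missing keys, one hedged repair is to copy them over from a known-good checkpoint before conversion; a rough sketch (paths are placeholders, metadata is not preserved, and whether the copied values match what the merge intended is not guaranteed):

from safetensors.torch import load_file, save_file

broken = load_file("broken_model.safetensors")        # placeholder
reference = load_file("reference_model.safetensors")  # placeholder

for key in ["conditioner.embedders.0.transformer.text_model.embeddings.position_ids",
            "conditioner.embedders.1.model.logit_scale"]:
    if key not in broken and key in reference:
        broken[key] = reference[key].clone()
        print("restored", key)

save_file(broken, "patched_model.safetensors")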
Thank you for the experiment.
merging models breaks the CLIP position IDs, so that's normal
I know it breaks the contents, but I too have messed with the contents of the merger, but the fact that the keys themselves are missing is a mystery.
And I didn't see anything unusual in the original Illustrious, derived, or other SDXL merged models.
Maybe that version of the merger is buggy...?
With Diffusers, we can just copy and paste the entire CLIP or VAE folder from elsewhere and the port is complete, and I can add that kind of functionality, but I'm afraid that a potential bug in the merger could have other effects.
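In the Diffusers-format case, that "copy the folder from elsewhere" idea is literally just replacing the text_encoder subfolders; a sketch, with the source repo and target path as placeholders:

import shutil
from huggingface_hub import snapshot_download

# download only the text encoder folders from a known-good SDXL repo (placeholder repo id)
src = snapshot_download("stabilityai/stable-diffusion-xl-base-1.0",
                        allow_patterns=["text_encoder/*", "text_encoder_2/*"])

for sub in ["text_encoder", "text_encoder_2"]:
    shutil.copytree(f"{src}/{sub}", f"./converted_repo/{sub}", dirs_exist_ok=True)  # placeholder target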
Thank you for the experiment.
merging models breaks the CLIP position IDs, so that's normal
I know it breaks the contents, but I too have messed with the contents of the merger, but the fact that the keys themselves are missing is a mystery.
True
And I didn't see anything unusual in the original Illustrious, derived, or other SDXL merged models.
Maybe that version of the merger is buggy...?
I don't think so, since I've been merging models almost every day and other checkpoints work perfectly fine
With Diffusers, we can just copy and paste the entire CLIP or VAE folder from elsewhere and the port is complete, and I can add that kind of functionality, but I'm afraid that a potential bug in the merger could have other effects.
It's really weird that the checkpoint is missing the "conditioner.embedders.1.model.logit_scale" key; I'm not sure what kind of error could make this key disappear.
I don't think so, since I've been merging models almost every day and other checkpoints work perfectly fine
So even if it is a bug, it is not a logic error, but a malfunction that occurs under limited conditions...
The "conditioner.embedders.1.model.logit_scale" key, I'm not sure what kind of error could make this key disappear
If this key does not exist in either model, the merger may ignore the entire key. But if so, then where did the model without the key come from?
For example, the process of baking VAE, or the pre-process or post-process when reading or writing the model, or something else in the model tool could be buggy.
Anyway, something has to go wrong somewhere for it to break...
Well, not only in the program, but also in some cases when downloading or when the HDD is corrupted and the data gets corrupted...😱
So even if it is a bug, it is not a logic error, but a malfunction that occurs under limited conditions...
For example, the process of baking VAE, or the pre-process or post-process when reading or writing the model, or something else in the model tool could be buggy.
I'm thinking of reviewing the merging script. But I'm not sure if that'll point us to where the issue is.
If this key does not exist in either model, the merger may ignore the entire key. But if so, then where did the model without the key come from?
I'm quite unsure, the alternate version is not missing that key.
when the HDD is corrupted and the data gets corrupted...
I believe this isn't an issue since I did this merge in Colab.
Hmmm, if it happened in Colab, it could only be caused by the model or somewhere in the program.
One thing I noticed when I was fiddling with the merger is that the tensor shape for UNET is basically constant, but for text encoders, the tensor shape could be different depending on the model.
What if, under certain conditions, the tensor arithmetic process to make it consistent before that merge failed and blew up the whole key or something?
However, this is a weak theory because the phenomenon should be reproduced almost 100% for the same combination of models.
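One way to test that theory on a concrete pair of source models would be to compare the shapes of the shared keys before merging; a sketch (paths are placeholders):

from safetensors import safe_open

def shapes_of(path):
    # header-only read: returns {key: shape} without loading the tensor data
    with safe_open(path, framework="pt") as f:
        return {k: tuple(f.get_slice(k).get_shape()) for k in f.keys()}

a = shapes_of("model_a.safetensors")  # placeholder
b = shapes_of("model_b.safetensors")  # placeholder

for key in sorted(set(a) & set(b)):
    if a[key] != b[key]:
        print(f"shape mismatch for {key}: {a[key]} vs {b[key]}")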
but for text encoders, the tensor shape could be different depending on the model.
This is part of the reason why merging breaks CLIP, and why the model converter extension on A1111 has these options (which can break the entire model 💀).
What if, under certain conditions, the tensor arithmetic process to make it consistent before that merge failed and blew up the whole key or something?
This might be what happened, except DURING the merge. I would've said that it was due to my Merge Block Weight values, but I use the same ones on all of my models.
However, this is a weak theory because the phenomenon should be reproduced almost 100% for the same combination of models.
Yeah True
All right! This is the guy who did it. We'd still rather have an error out than have the model break...😓
All right! This is the guy who did it.
LOL, they're valid MBW values! It shouldn't outright delete keys.
We'd still rather have an error out than have the model break...😓
Yup, it's so interesting to dig into. I'll probably review the merging script today.
Also
@John6666
have you ever thought about making a GUI for merging (on HF Spaces)? Especially one that supports SDXL and Flux (Flux is quite interesting for me; I'm thinking of getting more into it, especially with its customizability!) I'd love to see you work on this!
Since WebUIs are pretty buggy on HF Spaces (Forge and A1111).
Or if you have Flux merging scripts, please link them? (Though a GUI would be 🔥🔥)
As for creating a GUI for the merge, I've only done it on a trial basis, and so far I haven't thought about doing it in earnest.
I don't see the point of creating a third merger when there are excellent mergers that are constantly maintained in ComfyUI and WebUI. I'm not even good at logic programming.
If there's a library specific to HF, then it makes sense to wrap it...
WebUI is buggy in HF's Spaces mainly because they can't put in the latest components due to the limited specs of CPU Spaces.
https://discuss.huggingface.co/t/infinite-preparing-space/107534/1
The only reason gguf-my-repo works decently is because it calls the llama.cpp CLI inside...
The old-fashioned Linux CLI tools are good at processing data bit by bit in a memory-saving manner.
On the other hand, in the Zero GPU space, it is still difficult to do it normally because of the Quota.
This effectively closes the way for individual volunteers to achieve this, but the HF staff would not be interested in making WebUI or ComfyUI work on HF.
Even if I were to build a full scratch merger, I can only do a fairly simple merge with 16GB of RAM and 50GB of HDD in CPU space. So that thing works, but I've given up on improving it.
There are some operations that can't be done unless the GPU is available, and there simply isn't enough room for the model.
As a workaround for the current situation, there is a way to use a high-spec VM with Zero GPU space without using GPUs, and in fact I am using it to convert Flux models to Diffusers format, but the problem here is the 10-space limit issue. One of the reasons why I've been playing around with CPU space so much lately is because I can't deploy any more Zero GPU space even if I created it. If I can't use it myself, I can't debug it...
Anyway, lamenting is pointless, so I think I'll just make a frame-like program that supports both Zero GPU space and CPU space, like the quantizer I made yesterday.
So, do you know of any WebUI or ComfyUI spaces that work relatively decently? I would like to use it as a base.
So, do you know of any WebUI or ComfyUI spaces that work relatively decently? I would like to use it as a base.
ComfyUI, Forge and ReForge (A1111 is unoptimized); ReForge and ComfyUI are arguably the best right now. I recall Comfy has a full Python API.
So, do you know of any WebUI or ComfyUI spaces that work relatively decently? I would like to use it as a base.
I think I have an A1111 space but that is so slow
I have yet to see anyone try to port Forge/ReForge
Comfy is unusable in spaces beyond their python api
Wtf, I just found this: https://huggingface.co/spaces/kadirnar/ComfyUI-Demo (I can barely use it because I'm on my phone, but I believe you are on PC so it'll work better for you)
That's great, but if we have the space running on A10G and Persistent Storage, that seems like it's already good enough...?
Looking at the contents, he's just using Docker to git clone, download files with wget and run ComfyUI, so I guess I could port this, but with this HDD consumption, it would crash before it could boot in a CPU space.
Oh well, I'll try various modifications.
For now, I got it to work in Gradio space without Docker, downloads from HF are fast, and we can read private repos if we set HF_TOKEN to Secrets.
It also worked in Zero GPU space, though in CPU mode.
But it's going to be hard to mess with ComfyUI internals to upload output folders and such... WebUI (or Forge) might be easier to modify since it's Gradio.
It would be easier if someone already had a fork with modifications for HF, or a plugin or something.
https://huggingface.co/spaces/John6666/comfy_test
https://huggingface.co/spaces/wrdias/ComfyUI-Animatediff
But it's going to be hard to mess with ComfyUI internals to upload output folders and such... WebUI (or Forge) might be easier to modify since it's Gradio.
True. I thought maybe porting SuperMerger to a standalone Gradio app would be better, while also porting the Model Mixer extension (which I believe has the DARE merging method).
I never thought of that idea.😇
It's true that the plug-in is not really connected to the main unit.
Worst case scenario, I could create the entire GUI.
I think ComfyUI's Merger has more functions than SuperMerger, but I don't know how to handle objects inside ComfyUI at all... SuperMerger is probably close to an independent software in that respect.
I never thought of that idea.😇
It's true that the plug-in is not really connected to the main unit.
Worst case scenario, I could create the entire GUI.
I think ComfyUI's Merger has more functions than SuperMerger, but I don't know how to handle objects inside ComfyUI at all... SuperMerger is probably close to an independent software in that respect.
We could port the code in SuperMerger and Model Mixer and make it standalone; it's not easy, but it's possible.
ComfyUI has more functions because no one else is willing to code the kind of customizability that ComfyUI has.
https://huggingface.co/spaces/John6666/supermerger-test
I tried to force SuperMerger to be standalone, but it has a stronger dependency on WebUI than I expected and I can't bring it to startup.
It's the classic Python relative/absolute import problem. I'm starting to think it would be easier to modify it if I don't detach it.
By the way, the current base is Forge.
For ComfyUI, this merger is also carefully implemented. The interpolation process is also carefully implemented.
https://github.com/54rt1n/ComfyUI-DareMerge
https://huggingface.co/spaces/John6666/supermerger-test
I tried to force SuperMerger to be standalone, but it has a stronger dependency on WebUI than I expected and I can't bring it to startup.
It's the classic Python relative/absolute import problem. I'm starting to think it would be easier to modify it if I don't detach it.
By the way, the current base is Forge.
If forcing extensions to become standalone doesn't work at all,
Then there should be a way to make such webuis work with zerogpu?
For ComfyUI, this merger is also carefully implemented. The interpolation process is also carefully implemented.
https://github.com/54rt1n/ComfyUI-DareMerge
Yeah that one looks very good
If forcing extensions to become standalone doesn't work at all,
Then there should be a way to make such webuis work with zerogpu?
OR going through the code of supermerger/model mixer and copying over (with the needed modifications) what we need
Then there should be a way to make such webuis work with zerogpu?
This is probably possible, but there is one complication: WebUI was Gradio 3 until this summer, but the Zero GPU space requires Gradio 4 or higher. So we will have to rethink a better package configuration with a newer version of WebUI or Forge, rather than a copy of the old CPU space or regular GPU space that was running on HF.
Well, it won't be hard, just start in CPU mode. Better performance or modification is another matter, though.
supermerger/model mixer and copying over (with the needed modifications) what we need
As long as it's a merger, I think it's possible because it will eventually just read two or more files and write them out to one file.
I would make copy/paste a last resort because it would be hard to maintain. If I'm the only maintainer, though, I'll copy and paste mercilessly.
I see no problems with SuperMerger's Gradio code and no glaring compatibility issues with Gradio4. It would be possible to run it in Gradio5 as a stand-alone application right away, if only for its appearance, without any content.
The only problem really appears to be the dependencies. If I keep chipping away at the code I'm allowed to chip away at, it should work at some point...
The only reason I haven't started scraping the code yet is because SuperMerger is frequently maintained.
It would be a shame to fork it from the current specific version. After tomorrow, I will try to get it working as it is as much as possible.
In parallel, I will try to run the Gradio4 version of the WebUI series in HF space. This should not be difficult since we have the know-how of the Gradio3 version. It will just take a lot of work.
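As a rough illustration of the "read two files, write one file" core mentioned above, a plain weighted merge of two checkpoints might look like the sketch below. Paths and the mix ratio are placeholders, and real mergers like SuperMerger do much more (per-block weights, other merge modes, etc.):

from safetensors.torch import load_file, save_file

alpha = 0.5  # placeholder mix ratio
a = load_file("model_a.safetensors")  # placeholder
b = load_file("model_b.safetensors")  # placeholder

merged = {}
for key in a.keys() & b.keys():
    if a[key].shape != b[key].shape or not a[key].is_floating_point():
        merged[key] = a[key]  # e.g. position_ids or mismatched shapes: copy instead of averaging
        continue
    merged[key] = ((1.0 - alpha) * a[key].float() + alpha * b[key].float()).to(a[key].dtype)

# keys present in only one of the models are carried over so they don't silently vanish
for key in a.keys() | b.keys():
    if key not in merged:
        merged[key] = a[key] if key in a else b[key]

save_file(merged, "merged.safetensors")  # placeholder output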
Any Updates? 😄
No, I've started other work and haven't made any progress at all.
The reason for the lack of progress is that I'm worried about the modules.timer problem.
Of course, I can work around it by porting manually, but it must be a problem that comes with all Python program porting. If I don't solve it, I'll run into its relatives again...
I can't claim to be an expert on Python, but it has all the pros and cons of a scripting language, and is not a great language for the encapsulation and reuse concept. It is more suited to the style of writing simple code and reusing it by copying and pasting. I think it's hard for library authors.
So I'd like to solve it the hard way, but even if I change the directory structure, I'm still stumped by modules.timer rather than other modules... I could understand if I were stumped by all of them.
But the actual WebUI is running in another environment. So it's not impossible to import this module itself; it must be some environmental factor or an easy mistake.
On the HF CPU space, it works in the pattern of git cloning WebUI after space startup and starting .sh with subprocess.run().
However, the Zero GPU space is incompatible with processes and threads, so this approach itself should be avoided. But it should give us a clue as to how to solve the problem.
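For reference, the CPU-space pattern described above boils down to an app.py along these lines; the repo URL, flags and port are examples rather than the exact Space code:

import pathlib
import subprocess

# clone WebUI at startup instead of shipping it inside the Space repo
subprocess.run(["git", "clone", "--depth", "1",
                "https://github.com/AUTOMATIC1111/stable-diffusion-webui.git", "webui"], check=True)

# launch in CPU-only mode on the port HF Spaces expects
subprocess.run(["python3", "launch.py", "--skip-torch-cuda-test", "--use-cpu", "all",
                "--precision", "full", "--no-half", "--listen", "--port", "7860"],
               cwd=pathlib.Path("webui"), check=True)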
Hmmm.
The modules are useless, but they shouldn't be incompletely added to the modules folder. I'm currently adding the missing ones there and seeing if I can get the space running or not. I think a little bit of LLMs can help with this.
I'm trying to look over the code too and see what I can do about it.
We wouldn't have a problem with copy pasting bits and pieces from supermerger and model mixer but that'll make it a lot less humanly readable.
So we're still on plan A: port SuperMerger.
I added the needed modules and got this
"/usr/local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 319, in _lazy_init
torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
I think this is where we need the webui running on CPU part
--skip-torch-cuda-test
It should work if you can set the equivalent of this option.
--skip-torch-cuda-test
It should work if you can set the equivalent of this option.
Not sure how I should go about adding that
I tried the good ol'
import torch
device = "cpu" if not torch.cuda.is_available() else "cuda"
But I guess I should try another way
good ol'
Exactly.
But I guess I should try another way
If there is a .to("cuda") inside the module, this won't help. 😅
Something isn't right.
I added --skip-torch-cuda-test to the commandline args env variable and still.
If it's not a WebUI version issue, there should be a clue somewhere in this.
subprocess.run([r"python3" ,r"launch.py",r"--precision",r"full",r"--no-half",r"--no-half-vae",r"--enable-insecure-extension-access",r"--medvram",r"--skip-torch-cuda-test",r"--enable-console-prompts",r"--ui-settings-file="+str(pathlib.Path(__file__).parent /r"config.json")])
Hmmmmm
Exit code: 1. Reason: Traceback (most recent call last):
File "/home/user/app/app.py", line 4, in <module>
subprocess.run([r"python3" ,r"launch.py",r"--precision",r"full",r"--no-half",r"--no-half-vae",r"--enable-insecure-extension-access",r"--medvram",r"--skip-torch-cuda-test",r"--enable-console-prompts",r"--ui-settings-file="+str(pathlib.Path(__file__).parent /r"config.json")])
NameError: name 'pathlib' is not defined
import pathlib
import pathlib
This is what's being done already lol
"/home/user/app/backend/memory_management.py", line 100, in get_torch_device
return torch.device(torch.cuda.current_device())
File "/usr/local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 940, in current_device
_lazy_init()
File "/usr/local/lib/python3.10/site-packages/torch/cuda/__init__.py", line 319, in _lazy_init
torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
Something isn't right
/home/user/app/backend/memory_management.py
It's part of the memory management function by the Forge author, the respected lllyasviel.
I wonder if it can be specified in the startup options for Forge? I'll go have a look on github.
Edit:
https://github.com/lllyasviel/stable-diffusion-webui-forge/blob/main/modules/cmd_args.py#L75
parser.add_argument("--use-cpu", nargs='+', help="use CPU as torch device for specified modules", default=[], type=str.lower)
https://github.com/lllyasviel/stable-diffusion-webui-forge/blob/main/backend/args.py#L50
vram_group.add_argument("--always-cpu", action="store_true")
~
parser.add_argument("--disable-gpu-warning", action="store_true")
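So a Forge launch forced onto the CPU would presumably look something like this; only the flags quoted above (plus --skip-torch-cuda-test from earlier) are taken from the repos, the rest is assumption:

import subprocess

subprocess.run(["python3", "launch.py",
                "--always-cpu",             # Forge backend/args.py
                "--skip-torch-cuda-test",   # modules/cmd_args.py
                "--use-cpu", "all"],        # modules/cmd_args.py
               check=True)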
So I should try this when doing the subprocess python3 launch.py?
# force CPU unconditionally (the original only does this when --always-cpu is set)
#if args.always_cpu:
# cpu_state = CPUState.CPU
cpu_state = CPUState.CPU
I think this is all we can do.
Still have the same GPU error.
I'd say upload all the module files and try to implement this.
I'm not sure what I'm doing wrong.
It's 23:00, so tomorrow. But anyway, WebUI is trickier than we imagined...🥶
I wonder if options are being used in place of environment variables.
tomorrow
Take your time!
But anyway, WebUI is trickier than we imagined...🥶
Very true
I wonder if options are being used in place of environment variables.
Well, we do have COMMANDLINE_ARGS, but it doesn't seem to have any effect in my testing.
I experimented a bit, uploading the WebUI package directly to HF doesn't work, but git clone to the same path does.
I'm not sure why, but I'm starting to understand a little.
Maybe it's just a trivial problem with HF's space specs. Something to do with line break codes or something like that.
https://huggingface.co/spaces/John6666/webui_test2
Edit:
I tried both LF and CRLF for newline codes, but both failed. Files that have been placed in Spaces from the beginning fail to import even when copied with shutil.copytree() from another location.
There is no difference between the git clone'd file and the file uploaded to Spaces when comparing them with the diff command.
The symptoms I can see now are similar to the following. This is just a symptom.
Anyway, this is not about logic or code. It is a problem with the execution environment.
https://stackoverflow.com/questions/44484082/python-cant-find-module-when-started-with-sudo
I experimented a bit, uploading the WebUI package directly to HF doesn't work, but git clone to the same path does.
You should only upload the full module folder, since a lot are missing
Anyway, this is not about logic or code. It is a problem with the execution environment.
That's weird, I'll look into it more.
You should only upload the full module folder, since a lot are missing
I've tried uploading the entire contents of the WebUI zip and the git clone. Neither of them worked if I put them there beforehand. And the file contents are the same as far as I can detect with diff, so I think the problem is a filesystem attribute, which Python is actually being run, some sort of permission, or some other parameter that is more implicit.
I suspected it might be the GitHub assets, but they're not even in Forge's 4GB package, so probably not.
Edit:
https://huggingface.co/spaces/John6666/webui_test3
It worked! The code that didn't work yesterday is now working...
Well, something must have broken in the HF settings. If it works, it's OK.
Edit:
Anyway, now we can get into coding instead of fighting with Python. It's so stressful doing something other than programming to program...
Edit:
Now we are interrupted by an incompatible component of Gradio in the initialization function...
It seems to refer to a component that doesn't exist in 4.x; I'll have to find a good route to bypass it.
Forge stops with another error. Currently, WebUI is easier to get up and running.
https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/16529