2024-10-19 08:56:01 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/button.py:89: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Button(...)` instead of `return gr.Button.update(...)`.
2024-10-19 08:56:01 | ERROR | stderr |   warnings.warn(
2024-10-19 08:56:01 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 08:56:01 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 08:56:01 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 08:56:01 | ERROR | stderr | /home/user/app/app.py:697: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
2024-10-19 08:56:01 | ERROR | stderr |   chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", visible=False).style(height=750)
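The GradioDeprecationWarning above flags the `.style(height=750)` call at app.py:697. On gradio 3.50 the same layout option can be passed directly to the constructor; a minimal sketch, assuming the height is the only styling the app needs:

```python
import gradio as gr

with gr.Blocks() as demo:
    # gradio 3.50 accepts layout options as constructor arguments, so the
    # deprecated .style(height=750) call moves into the keyword arguments.
    chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", visible=False, height=750)
```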
2024-10-19 08:56:01 | INFO | stdout | Running on local URL:  http://0.0.0.0:7860
2024-10-19 08:56:01 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2134: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-19 08:56:01 | ERROR | stderr |   warnings.warn(
2024-10-19 08:56:01 | INFO | stdout | 
2024-10-19 08:56:01 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-19 08:56:01 | INFO | stdout | IMPORTANT: You are using gradio version 3.50.1, however version 4.44.1 is available, please upgrade.
2024-10-19 08:56:01 | INFO | stdout | --------
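The share warning above fires because launch() requests a public link while the app is running on a Hugging Face Space, where share links are unsupported and the local 0.0.0.0:7860 URL is already exposed. A hedged sketch of gating the flag on the SPACE_ID variable that Spaces sets automatically (the placeholder UI and port are assumptions, not the app's actual layout):

```python
import os
import gradio as gr

with gr.Blocks() as demo:
    gr.Markdown("placeholder UI")

# SPACE_ID is set automatically on Hugging Face Spaces; only ask for a share
# link when running somewhere else (e.g. a local machine).
running_on_spaces = "SPACE_ID" in os.environ
demo.launch(server_name="0.0.0.0", server_port=7860, share=not running_on_spaces)
```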
2024-10-19 08:56:02 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/dropdown.py:176: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Dropdown(...)` instead of `return gr.Dropdown.update(...)`.
2024-10-19 08:56:02 | ERROR | stderr |   warnings.warn(
2024-10-19 08:56:02 | INFO | stdout | state Conversation(system='A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.', roles=('user', 'model'), messages=[], offset=0, sep_style=<SeparatorStyle.GEMMA: 6>, sep='', sep2='<eos>', version='gemma', skip_next=False)
2024-10-19 08:56:02 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:161: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Chatbot(...)` instead of `return gr.Chatbot.update(...)`.
2024-10-19 08:56:02 | ERROR | stderr |   warnings.warn(
2024-10-19 08:56:02 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/textbox.py:163: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Textbox(...)` instead of `return gr.Textbox.update(...)`.
2024-10-19 08:56:02 | ERROR | stderr |   warnings.warn(
2024-10-19 08:56:02 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/button.py:89: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Button(...)` instead of `return gr.Button.update(...)`.
2024-10-19 08:56:02 | ERROR | stderr |   warnings.warn(
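The cluster of UserWarnings above (Button, Dropdown, Chatbot, Textbox) all come from handlers that still return `Component.update(...)` objects. On gradio 3.50 the suggested replacement is to return fresh component instances carrying the new properties; a minimal sketch, not the actual app.py handlers:

```python
import gradio as gr

def enable_controls():
    # Old pattern (emits the deprecation warning):
    #   return gr.Button.update(interactive=True), gr.Textbox.update(visible=True)
    # Suggested pattern: return new component instances with the desired state.
    return gr.Button(interactive=True), gr.Textbox(visible=True)

with gr.Blocks() as demo:
    send_btn = gr.Button("Send", interactive=False)
    prompt_box = gr.Textbox(visible=False)
    unlock_btn = gr.Button("Enable controls")
    unlock_btn.click(enable_controls, inputs=None, outputs=[send_btn, prompt_box])
```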
2024-10-19 08:58:39 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 08:58:39 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 407, in call_prediction
2024-10-19 08:58:39 | ERROR | stderr |     output = await route_utils.call_process_api(
2024-10-19 08:58:39 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 226, in call_process_api
2024-10-19 08:58:39 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-19 08:58:39 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1548, in process_api
2024-10-19 08:58:39 | ERROR | stderr |     inputs = self.preprocess_data(fn_index, inputs, state)
2024-10-19 08:58:39 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1329, in preprocess_data
2024-10-19 08:58:39 | ERROR | stderr |     processed_input.append(block.preprocess(inputs[i]))
2024-10-19 08:58:39 | ERROR | stderr |   File "/home/user/app/app.py", line 552, in preprocess
2024-10-19 08:58:39 | ERROR | stderr |     return super().preprocess(x)
2024-10-19 08:58:39 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/components/image.py", line 253, in preprocess
2024-10-19 08:58:39 | ERROR | stderr |     assert isinstance(x, dict)
2024-10-19 08:58:39 | ERROR | stderr | AssertionError
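The AssertionError above comes from the app's custom image component (app.py:552) forwarding a non-dict value into gradio's Image.preprocess, which expects an {"image": ..., "mask": ...} dict when the component uses tool="sketch". A hedged sketch of a defensive override; the class name and the wrapping behaviour are assumptions, not the app's actual code:

```python
import gradio as gr

class SketchImage(gr.Image):
    """Hypothetical stand-in for the app's custom image component
    (created with tool="sketch", so gradio expects {"image": ..., "mask": ...})."""

    def preprocess(self, x):
        if x is None:
            # Nothing was uploaded or drawn; let the handler deal with "no image".
            return None
        if not isinstance(x, dict):
            # A bare image payload arrived without a mask layer; wrap it so the
            # parent class's `assert isinstance(x, dict)` check is satisfied.
            x = {"image": x, "mask": x}
        return super().preprocess(x)
```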
2024-10-19 08:58:39 | INFO | stdout | state messages:  []
2024-10-19 08:58:39 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': 'List of 0 images: []'}
2024-10-19 08:58:39 | INFO | stdout | Input Prompt: A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.
2024-10-19 08:58:39 | INFO | stdout | all_image_hash []
2024-10-19 08:58:39 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 08:58:39 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 407, in call_prediction
2024-10-19 08:58:39 | ERROR | stderr |     output = await route_utils.call_process_api(
2024-10-19 08:58:39 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 226, in call_process_api
2024-10-19 08:58:39 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-19 08:58:39 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1550, in process_api
2024-10-19 08:58:39 | ERROR | stderr |     result = await self.call_function(
2024-10-19 08:58:39 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1199, in call_function
2024-10-19 08:58:39 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-19 08:58:39 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 519, in async_iteration
2024-10-19 08:58:39 | ERROR | stderr |     return await iterator.__anext__()
2024-10-19 08:58:39 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 512, in __anext__
2024-10-19 08:58:39 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-19 08:58:39 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-19 08:58:39 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 08:58:39 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 08:58:39 | ERROR | stderr |     return await future
2024-10-19 08:58:39 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 08:58:39 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 08:58:39 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 495, in run_sync_iterator_async
2024-10-19 08:58:39 | ERROR | stderr |     return next(iterator)
2024-10-19 08:58:39 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 649, in gen_wrapper
2024-10-19 08:58:39 | ERROR | stderr |     yield from f(*args, **kwargs)
2024-10-19 08:58:39 | ERROR | stderr |   File "/home/user/app/app.py", line 446, in http_bot
2024-10-19 08:58:39 | ERROR | stderr |     state.messages[-1][-1] = "▌"
2024-10-19 08:58:39 | ERROR | stderr | IndexError: list index out of range
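The IndexError above follows directly from the failed preprocess: no user turn was appended, so state.messages is empty (see the "state messages:  []" line below) when http_bot tries to write the streaming cursor into the last message at app.py:446. A self-contained sketch of the missing guard, with a plain list standing in for state.messages rather than the app's Conversation object:

```python
def start_streaming(messages):
    """Toy stand-in for the top of http_bot; `messages` mirrors state.messages."""
    if not messages:
        # Guard: no pending (user, None) turn to update, so return unchanged
        # instead of indexing messages[-1] and raising IndexError.
        return messages
    messages[-1][-1] = "▌"  # show the streaming cursor in the last model slot
    return messages

print(start_streaming([]))                           # [] -> handled gracefully
print(start_streaming([["describe the UI", None]]))  # cursor placed in model slot
```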
2024-10-19 15:56:08 | INFO | stdout | state Conversation(system='A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.', roles=('user', 'model'), messages=[], offset=0, sep_style=<SeparatorStyle.GEMMA: 6>, sep='', sep2='<eos>', version='gemma', skip_next=False)
2024-10-19 18:16:52 | INFO | stdout | state Conversation(system='A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.', roles=('user', 'model'), messages=[], offset=0, sep_style=<SeparatorStyle.GEMMA: 6>, sep='', sep2='<eos>', version='gemma', skip_next=False)
2024-10-19 18:18:57 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/button.py:89: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Button(...)` instead of `return gr.Button.update(...)`.
2024-10-19 18:18:57 | ERROR | stderr |   warnings.warn(
2024-10-19 18:18:57 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:18:57 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 18:18:57 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:18:57 | ERROR | stderr | /home/user/app/app.py:702: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
2024-10-19 18:18:57 | ERROR | stderr |   chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", visible=False).style(height=750)
2024-10-19 18:18:57 | INFO | stdout | Running on local URL:  http://0.0.0.0:7860
2024-10-19 18:18:57 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2134: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-19 18:18:57 | ERROR | stderr |   warnings.warn(
2024-10-19 18:18:57 | INFO | stdout | 
2024-10-19 18:18:57 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-19 18:18:57 | INFO | stdout | IMPORTANT: You are using gradio version 3.50.1, however version 4.44.1 is available, please upgrade.
2024-10-19 18:18:57 | INFO | stdout | --------
2024-10-19 18:18:57 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/dropdown.py:176: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Dropdown(...)` instead of `return gr.Dropdown.update(...)`.
2024-10-19 18:18:57 | ERROR | stderr |   warnings.warn(
2024-10-19 18:18:57 | INFO | stdout | state Conversation(system='A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.', roles=('user', 'model'), messages=[], offset=0, sep_style=<SeparatorStyle.GEMMA: 6>, sep='', sep2='<eos>', version='gemma', skip_next=False)
2024-10-19 18:18:57 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:161: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Chatbot(...)` instead of `return gr.Chatbot.update(...)`.
2024-10-19 18:18:57 | ERROR | stderr |   warnings.warn(
2024-10-19 18:18:57 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/textbox.py:163: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Textbox(...)` instead of `return gr.Textbox.update(...)`.
2024-10-19 18:18:57 | ERROR | stderr |   warnings.warn(
2024-10-19 18:18:57 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/button.py:89: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Button(...)` instead of `return gr.Button.update(...)`.
2024-10-19 18:18:57 | ERROR | stderr |   warnings.warn(
2024-10-19 18:19:25 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:19:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 407, in call_prediction
2024-10-19 18:19:25 | ERROR | stderr |     output = await route_utils.call_process_api(
2024-10-19 18:19:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 226, in call_process_api
2024-10-19 18:19:25 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-19 18:19:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1548, in process_api
2024-10-19 18:19:25 | ERROR | stderr |     inputs = self.preprocess_data(fn_index, inputs, state)
2024-10-19 18:19:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1329, in preprocess_data
2024-10-19 18:19:25 | ERROR | stderr |     processed_input.append(block.preprocess(inputs[i]))
2024-10-19 18:19:25 | ERROR | stderr |   File "/home/user/app/app.py", line 557, in preprocess
2024-10-19 18:19:25 | ERROR | stderr |     return super().preprocess(x)
2024-10-19 18:19:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/components/image.py", line 253, in preprocess
2024-10-19 18:19:25 | ERROR | stderr |     assert isinstance(x, dict)
2024-10-19 18:19:25 | ERROR | stderr | AssertionError
2024-10-19 18:19:25 | INFO | stdout | state messages:  []
2024-10-19 18:19:25 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': 'List of 0 images: []'}
2024-10-19 18:19:25 | INFO | stdout | Input Prompt: A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.
2024-10-19 18:19:25 | INFO | stdout | all_image_hash []
2024-10-19 18:19:25 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:19:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 407, in call_prediction
2024-10-19 18:19:25 | ERROR | stderr |     output = await route_utils.call_process_api(
2024-10-19 18:19:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 226, in call_process_api
2024-10-19 18:19:25 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-19 18:19:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1550, in process_api
2024-10-19 18:19:25 | ERROR | stderr |     result = await self.call_function(
2024-10-19 18:19:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1199, in call_function
2024-10-19 18:19:25 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-19 18:19:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 519, in async_iteration
2024-10-19 18:19:25 | ERROR | stderr |     return await iterator.__anext__()
2024-10-19 18:19:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 512, in __anext__
2024-10-19 18:19:25 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-19 18:19:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-19 18:19:25 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 18:19:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 18:19:25 | ERROR | stderr |     return await future
2024-10-19 18:19:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 18:19:25 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 18:19:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 495, in run_sync_iterator_async
2024-10-19 18:19:25 | ERROR | stderr |     return next(iterator)
2024-10-19 18:19:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 649, in gen_wrapper
2024-10-19 18:19:25 | ERROR | stderr |     yield from f(*args, **kwargs)
2024-10-19 18:19:25 | ERROR | stderr |   File "/home/user/app/app.py", line 451, in http_bot
2024-10-19 18:19:25 | ERROR | stderr |     state.messages[-1][-1] = "▌"
2024-10-19 18:19:25 | ERROR | stderr | IndexError: list index out of range
2024-10-19 18:19:58 | INFO | stdout | state Conversation(system='A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.', roles=('user', 'model'), messages=[], offset=0, sep_style=<SeparatorStyle.GEMMA: 6>, sep='', sep2='<eos>', version='gemma', skip_next=False)
2024-10-19 18:20:08 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:20:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 407, in call_prediction
2024-10-19 18:20:08 | ERROR | stderr |     output = await route_utils.call_process_api(
2024-10-19 18:20:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 226, in call_process_api
2024-10-19 18:20:08 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-19 18:20:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1548, in process_api
2024-10-19 18:20:08 | ERROR | stderr |     inputs = self.preprocess_data(fn_index, inputs, state)
2024-10-19 18:20:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1329, in preprocess_data
2024-10-19 18:20:08 | ERROR | stderr |     processed_input.append(block.preprocess(inputs[i]))
2024-10-19 18:20:08 | ERROR | stderr |   File "/home/user/app/app.py", line 557, in preprocess
2024-10-19 18:20:08 | ERROR | stderr |     return super().preprocess(x)
2024-10-19 18:20:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/components/image.py", line 253, in preprocess
2024-10-19 18:20:08 | ERROR | stderr |     assert isinstance(x, dict)
2024-10-19 18:20:08 | ERROR | stderr | AssertionError
2024-10-19 18:20:08 | INFO | stdout | state messages:  []
2024-10-19 18:20:08 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': 'List of 0 images: []'}
2024-10-19 18:20:08 | INFO | stdout | Input Prompt: A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.
2024-10-19 18:20:08 | INFO | stdout | all_image_hash []
2024-10-19 18:20:08 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:20:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 407, in call_prediction
2024-10-19 18:20:08 | ERROR | stderr |     output = await route_utils.call_process_api(
2024-10-19 18:20:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 226, in call_process_api
2024-10-19 18:20:08 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-19 18:20:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1550, in process_api
2024-10-19 18:20:08 | ERROR | stderr |     result = await self.call_function(
2024-10-19 18:20:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1199, in call_function
2024-10-19 18:20:08 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-19 18:20:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 519, in async_iteration
2024-10-19 18:20:08 | ERROR | stderr |     return await iterator.__anext__()
2024-10-19 18:20:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 512, in __anext__
2024-10-19 18:20:08 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-19 18:20:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-19 18:20:08 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 18:20:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 18:20:08 | ERROR | stderr |     return await future
2024-10-19 18:20:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 18:20:08 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 18:20:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 495, in run_sync_iterator_async
2024-10-19 18:20:08 | ERROR | stderr |     return next(iterator)
2024-10-19 18:20:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 649, in gen_wrapper
2024-10-19 18:20:08 | ERROR | stderr |     yield from f(*args, **kwargs)
2024-10-19 18:20:08 | ERROR | stderr |   File "/home/user/app/app.py", line 451, in http_bot
2024-10-19 18:20:08 | ERROR | stderr |     state.messages[-1][-1] = "▌"
2024-10-19 18:20:08 | ERROR | stderr | IndexError: list index out of range
2024-10-19 18:25:32 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/button.py:89: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Button(...)` instead of `return gr.Button.update(...)`.
2024-10-19 18:25:32 | ERROR | stderr |   warnings.warn(
2024-10-19 18:25:32 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:25:32 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 18:25:32 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:25:32 | ERROR | stderr | /home/user/app/app.py:702: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
2024-10-19 18:25:32 | ERROR | stderr |   chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", visible=False).style(height=750)
2024-10-19 18:25:32 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:25:32 | ERROR | stderr |   File "/home/user/app/app.py", line 797, in <module>
2024-10-19 18:25:32 | ERROR | stderr |     demo = build_demo(args.embed)
2024-10-19 18:25:32 | ERROR | stderr |   File "/home/user/app/app.py", line 754, in build_demo
2024-10-19 18:25:32 | ERROR | stderr |     inputs=[state, model_selector, temperature, top_p, max_new_tokens, refer_input_state],  # Inputs for `http_bot`
2024-10-19 18:25:32 | ERROR | stderr | NameError: name 'max_new_tokens' is not defined
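The NameError above is a plain missing definition: build_demo wires `max_new_tokens` into the handler's inputs at app.py:754 but never creates the component. A minimal sketch of defining the three generation sliders the request payload shows (temperature, top_p, max_new_tokens); the accordion layout and value ranges are assumptions:

```python
import gradio as gr

with gr.Blocks() as demo:
    with gr.Accordion("Parameters", open=False):
        temperature = gr.Slider(0.0, 1.0, value=0.2, step=0.1, label="Temperature")
        top_p = gr.Slider(0.0, 1.0, value=0.7, step=0.1, label="Top P")
        # This is the component the traceback reports as undefined at app.py:754.
        max_new_tokens = gr.Slider(16, 1024, value=512, step=16, label="Max output tokens")

    textbox = gr.Textbox(label="Prompt")
    chatbot = gr.Chatbot(label="FERRET")
    # With the slider defined, the submit handler can list max_new_tokens among
    # its inputs without raising NameError; the real http_bot wiring is omitted.
```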
2024-10-19 18:25:32 | INFO | stdout | IMPORTANT: You are using gradio version 3.50.1, however version 4.44.1 is available, please upgrade.
2024-10-19 18:25:32 | INFO | stdout | --------
2024-10-19 18:25:52 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/button.py:89: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Button(...)` instead of `return gr.Button.update(...)`.
2024-10-19 18:25:52 | ERROR | stderr |   warnings.warn(
2024-10-19 18:25:52 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:25:52 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 18:25:52 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:25:52 | ERROR | stderr | /home/user/app/app.py:702: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
2024-10-19 18:25:52 | ERROR | stderr |   chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", visible=False).style(height=750)
2024-10-19 18:25:52 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:25:52 | ERROR | stderr |   File "/home/user/app/app.py", line 797, in <module>
2024-10-19 18:25:52 | ERROR | stderr |     demo = build_demo(args.embed)
2024-10-19 18:25:52 | ERROR | stderr |   File "/home/user/app/app.py", line 754, in build_demo
2024-10-19 18:25:52 | ERROR | stderr |     inputs=[state, model_selector, temperature, top_p, max_new_tokens, refer_input_state],  # Inputs for `http_bot`
2024-10-19 18:25:52 | ERROR | stderr | NameError: name 'max_new_tokens' is not defined
2024-10-19 18:25:52 | INFO | stdout | IMPORTANT: You are using gradio version 3.50.1, however version 4.44.1 is available, please upgrade.
2024-10-19 18:25:52 | INFO | stdout | --------
2024-10-19 18:26:18 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/button.py:89: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Button(...)` instead of `return gr.Button.update(...)`.
2024-10-19 18:26:18 | ERROR | stderr |   warnings.warn(
2024-10-19 18:26:18 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:26:18 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 18:26:18 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:26:18 | ERROR | stderr | /home/user/app/app.py:702: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
2024-10-19 18:26:18 | ERROR | stderr |   chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", visible=False).style(height=750)
2024-10-19 18:26:18 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:26:18 | ERROR | stderr |   File "/home/user/app/app.py", line 797, in <module>
2024-10-19 18:26:18 | ERROR | stderr |     demo = build_demo(args.embed)
2024-10-19 18:26:18 | ERROR | stderr |   File "/home/user/app/app.py", line 754, in build_demo
2024-10-19 18:26:18 | ERROR | stderr |     inputs=[state, model_selector, temperature, top_p, max_new_tokens, refer_input_state],  # Inputs for `http_bot`
2024-10-19 18:26:18 | ERROR | stderr | NameError: name 'max_new_tokens' is not defined
2024-10-19 18:26:18 | INFO | stdout | IMPORTANT: You are using gradio version 3.50.1, however version 4.44.1 is available, please upgrade.
2024-10-19 18:26:18 | INFO | stdout | --------
2024-10-19 18:27:07 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/button.py:89: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Button(...)` instead of `return gr.Button.update(...)`.
2024-10-19 18:27:07 | ERROR | stderr |   warnings.warn(
2024-10-19 18:27:07 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:27:07 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 18:27:07 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:27:07 | ERROR | stderr | /home/user/app/app.py:702: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
2024-10-19 18:27:07 | ERROR | stderr |   chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", visible=False).style(height=750)
2024-10-19 18:27:07 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:27:07 | ERROR | stderr |   File "/home/user/app/app.py", line 797, in <module>
2024-10-19 18:27:07 | ERROR | stderr |     demo = build_demo(args.embed)
2024-10-19 18:27:07 | ERROR | stderr |   File "/home/user/app/app.py", line 754, in build_demo
2024-10-19 18:27:07 | ERROR | stderr |     inputs=[state, model_selector, temperature, top_p, max_new_tokens, refer_input_state],  # Inputs for `http_bot`
2024-10-19 18:27:07 | ERROR | stderr | NameError: name 'max_new_tokens' is not defined
2024-10-19 18:27:07 | INFO | stdout | IMPORTANT: You are using gradio version 3.50.1, however version 4.44.1 is available, please upgrade.
2024-10-19 18:27:07 | INFO | stdout | --------
2024-10-19 18:27:45 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/button.py:89: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Button(...)` instead of `return gr.Button.update(...)`.
2024-10-19 18:27:45 | ERROR | stderr |   warnings.warn(
2024-10-19 18:27:45 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:27:45 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 18:27:45 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:27:45 | ERROR | stderr | /home/user/app/app.py:702: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
2024-10-19 18:27:45 | ERROR | stderr |   chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", visible=False).style(height=750)
2024-10-19 18:27:46 | INFO | stdout | Running on local URL:  http://0.0.0.0:7860
2024-10-19 18:27:46 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2134: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-19 18:27:46 | ERROR | stderr |   warnings.warn(
2024-10-19 18:27:46 | INFO | stdout | 
2024-10-19 18:27:46 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-19 18:27:46 | INFO | stdout | IMPORTANT: You are using gradio version 3.50.1, however version 4.44.1 is available, please upgrade.
2024-10-19 18:27:46 | INFO | stdout | --------
2024-10-19 18:27:46 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/dropdown.py:176: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Dropdown(...)` instead of `return gr.Dropdown.update(...)`.
2024-10-19 18:27:46 | ERROR | stderr |   warnings.warn(
2024-10-19 18:27:46 | INFO | stdout | state Conversation(system='A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.', roles=('user', 'model'), messages=[], offset=0, sep_style=<SeparatorStyle.GEMMA: 6>, sep='', sep2='<eos>', version='gemma', skip_next=False)
2024-10-19 18:27:46 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:161: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Chatbot(...)` instead of `return gr.Chatbot.update(...)`.
2024-10-19 18:27:46 | ERROR | stderr |   warnings.warn(
2024-10-19 18:27:46 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/textbox.py:163: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Textbox(...)` instead of `return gr.Textbox.update(...)`.
2024-10-19 18:27:46 | ERROR | stderr |   warnings.warn(
2024-10-19 18:27:46 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/button.py:89: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Button(...)` instead of `return gr.Button.update(...)`.
2024-10-19 18:27:46 | ERROR | stderr |   warnings.warn(
2024-10-19 18:29:35 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/button.py:89: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Button(...)` instead of `return gr.Button.update(...)`.
2024-10-19 18:29:35 | ERROR | stderr |   warnings.warn(
2024-10-19 18:29:35 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:29:35 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 18:29:35 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:29:35 | ERROR | stderr | /home/user/app/app.py:702: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
2024-10-19 18:29:35 | ERROR | stderr |   chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", visible=False).style(height=750)
2024-10-19 18:29:35 | INFO | stdout | Running on local URL:  http://0.0.0.0:7860
2024-10-19 18:29:35 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2134: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-19 18:29:35 | ERROR | stderr |   warnings.warn(
2024-10-19 18:29:35 | INFO | stdout | 
2024-10-19 18:29:35 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-19 18:29:35 | INFO | stdout | IMPORTANT: You are using gradio version 3.50.1, however version 4.44.1 is available, please upgrade.
2024-10-19 18:29:35 | INFO | stdout | --------
2024-10-19 18:29:36 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/dropdown.py:176: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Dropdown(...)` instead of `return gr.Dropdown.update(...)`.
2024-10-19 18:29:36 | ERROR | stderr |   warnings.warn(
2024-10-19 18:29:36 | INFO | stdout | state Conversation(system='A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.', roles=('user', 'model'), messages=[], offset=0, sep_style=<SeparatorStyle.GEMMA: 6>, sep='', sep2='<eos>', version='gemma', skip_next=False)
2024-10-19 18:29:36 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:161: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Chatbot(...)` instead of `return gr.Chatbot.update(...)`.
2024-10-19 18:29:36 | ERROR | stderr |   warnings.warn(
2024-10-19 18:29:36 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/textbox.py:163: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Textbox(...)` instead of `return gr.Textbox.update(...)`.
2024-10-19 18:29:36 | ERROR | stderr |   warnings.warn(
2024-10-19 18:29:36 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/button.py:89: UserWarning: Using the update method is deprecated. Simply return a new object instead, e.g. `return gr.Button(...)` instead of `return gr.Button.update(...)`.
2024-10-19 18:29:36 | ERROR | stderr |   warnings.warn(
2024-10-19 18:29:52 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:29:52 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 407, in call_prediction
2024-10-19 18:29:52 | ERROR | stderr |     output = await route_utils.call_process_api(
2024-10-19 18:29:52 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 226, in call_process_api
2024-10-19 18:29:52 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-19 18:29:52 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1548, in process_api
2024-10-19 18:29:52 | ERROR | stderr |     inputs = self.preprocess_data(fn_index, inputs, state)
2024-10-19 18:29:52 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1329, in preprocess_data
2024-10-19 18:29:52 | ERROR | stderr |     processed_input.append(block.preprocess(inputs[i]))
2024-10-19 18:29:52 | ERROR | stderr |   File "/home/user/app/app.py", line 557, in preprocess
2024-10-19 18:29:52 | ERROR | stderr |     return super().preprocess(x)
2024-10-19 18:29:52 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/components/image.py", line 253, in preprocess
2024-10-19 18:29:52 | ERROR | stderr |     assert isinstance(x, dict)
2024-10-19 18:29:52 | ERROR | stderr | AssertionError
2024-10-19 18:29:52 | INFO | stdout | state messages:  []
2024-10-19 18:29:52 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': 'List of 0 images: []'}
2024-10-19 18:29:52 | INFO | stdout | Input Prompt: A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.
2024-10-19 18:29:52 | INFO | stdout | all_image_hash []
2024-10-19 18:29:52 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:29:52 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 407, in call_prediction
2024-10-19 18:29:52 | ERROR | stderr |     output = await route_utils.call_process_api(
2024-10-19 18:29:52 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 226, in call_process_api
2024-10-19 18:29:52 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-19 18:29:52 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1550, in process_api
2024-10-19 18:29:52 | ERROR | stderr |     result = await self.call_function(
2024-10-19 18:29:52 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1199, in call_function
2024-10-19 18:29:52 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-19 18:29:52 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 519, in async_iteration
2024-10-19 18:29:52 | ERROR | stderr |     return await iterator.__anext__()
2024-10-19 18:29:52 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 512, in __anext__
2024-10-19 18:29:52 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-19 18:29:52 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-19 18:29:52 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 18:29:52 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 18:29:52 | ERROR | stderr |     return await future
2024-10-19 18:29:52 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 18:29:52 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 18:29:52 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 495, in run_sync_iterator_async
2024-10-19 18:29:52 | ERROR | stderr |     return next(iterator)
2024-10-19 18:29:52 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 649, in gen_wrapper
2024-10-19 18:29:52 | ERROR | stderr |     yield from f(*args, **kwargs)
2024-10-19 18:29:52 | ERROR | stderr |   File "/home/user/app/app.py", line 451, in http_bot
2024-10-19 18:29:52 | ERROR | stderr |     state.messages[-1][-1] = "▌"
2024-10-19 18:29:52 | ERROR | stderr | IndexError: list index out of range
2024-10-19 18:30:41 | ERROR | stderr | ERROR:    Exception in ASGI application
2024-10-19 18:30:41 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/routes.py", line 335, in main
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/templating.py", line 208, in TemplateResponse
2024-10-19 18:30:41 | ERROR | stderr |     template = self.get_template(name)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/templating.py", line 131, in get_template
2024-10-19 18:30:41 | ERROR | stderr |     return self.env.get_template(name)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/jinja2/environment.py", line 1013, in get_template
2024-10-19 18:30:41 | ERROR | stderr |     return self._load_template(name, globals)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/jinja2/environment.py", line 972, in _load_template
2024-10-19 18:30:41 | ERROR | stderr |     template = self.loader.load(self, name, self.make_globals(globals))
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/jinja2/loaders.py", line 126, in load
2024-10-19 18:30:41 | ERROR | stderr |     source, filename, uptodate = self.get_source(environment, name)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/jinja2/loaders.py", line 207, in get_source
2024-10-19 18:30:41 | ERROR | stderr |     raise TemplateNotFound(template)
2024-10-19 18:30:41 | ERROR | stderr | jinja2.exceptions.TemplateNotFound: frontend/index.html
2024-10-19 18:30:41 | ERROR | stderr | 
2024-10-19 18:30:41 | ERROR | stderr | The above exception was the direct cause of the following exception:
2024-10-19 18:30:41 | ERROR | stderr | 
2024-10-19 18:30:41 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 406, in run_asgi
2024-10-19 18:30:41 | ERROR | stderr |     result = await app(  # type: ignore[func-returns-value]
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     return await self.app(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await super().__call__(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/applications.py", line 113, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await self.middleware_stack(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 187, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     raise exc
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 165, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await self.app(scope, receive, _send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/cors.py", line 85, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await self.app(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 62, in wrapped_app
2024-10-19 18:30:41 | ERROR | stderr |     raise exc
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 51, in wrapped_app
2024-10-19 18:30:41 | ERROR | stderr |     await app(scope, receive, sender)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 715, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await self.middleware_stack(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 735, in app
2024-10-19 18:30:41 | ERROR | stderr |     await route.handle(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 288, in handle
2024-10-19 18:30:41 | ERROR | stderr |     await self.app(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 76, in app
2024-10-19 18:30:41 | ERROR | stderr |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 62, in wrapped_app
2024-10-19 18:30:41 | ERROR | stderr |     raise exc
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 51, in wrapped_app
2024-10-19 18:30:41 | ERROR | stderr |     await app(scope, receive, sender)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 73, in app
2024-10-19 18:30:41 | ERROR | stderr |     response = await f(request)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 301, in app
2024-10-19 18:30:41 | ERROR | stderr |     raw_response = await run_endpoint_function(
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 214, in run_endpoint_function
2024-10-19 18:30:41 | ERROR | stderr |     return await run_in_threadpool(dependant.call, **values)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/concurrency.py", line 39, in run_in_threadpool
2024-10-19 18:30:41 | ERROR | stderr |     return await anyio.to_thread.run_sync(func, *args)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-19 18:30:41 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 18:30:41 | ERROR | stderr |     return await future
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 18:30:41 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/routes.py", line 346, in main
2024-10-19 18:30:41 | ERROR | stderr | ValueError: Did you install Gradio from source files? You need to build the frontend by running /scripts/build_frontend.sh
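The TemplateNotFound/ValueError pair above means the installed gradio package has no built frontend: routes.py cannot render frontend/index.html, which is what happens when gradio is installed from a source checkout without running its frontend build. A hedged diagnostic sketch; the templates path assumes the standard gradio 3.x package layout:

```python
import pathlib

import gradio

# If this prints False, the installed package ships no built frontend;
# reinstalling the published wheel (rather than a source tree) is the usual fix.
index_html = pathlib.Path(gradio.__file__).parent / "templates" / "frontend" / "index.html"
print(f"gradio {gradio.__version__}: built frontend present -> {index_html.exists()}")
```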
2024-10-19 18:30:41 | ERROR | stderr | ERROR:    Exception in ASGI application
2024-10-19 18:30:41 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/routes.py", line 335, in main
2024-10-19 18:30:41 | ERROR | stderr |     ) -> App:
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/templating.py", line 208, in TemplateResponse
2024-10-19 18:30:41 | ERROR | stderr |     template = self.get_template(name)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/templating.py", line 131, in get_template
2024-10-19 18:30:41 | ERROR | stderr |     return self.env.get_template(name)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/jinja2/environment.py", line 1013, in get_template
2024-10-19 18:30:41 | ERROR | stderr |     return self._load_template(name, globals)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/jinja2/environment.py", line 972, in _load_template
2024-10-19 18:30:41 | ERROR | stderr |     template = self.loader.load(self, name, self.make_globals(globals))
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/jinja2/loaders.py", line 126, in load
2024-10-19 18:30:41 | ERROR | stderr |     source, filename, uptodate = self.get_source(environment, name)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/jinja2/loaders.py", line 207, in get_source
2024-10-19 18:30:41 | ERROR | stderr |     raise TemplateNotFound(template)
2024-10-19 18:30:41 | ERROR | stderr | jinja2.exceptions.TemplateNotFound: frontend/index.html
2024-10-19 18:30:41 | ERROR | stderr | 
2024-10-19 18:30:41 | ERROR | stderr | The above exception was the direct cause of the following exception:
2024-10-19 18:30:41 | ERROR | stderr | 
2024-10-19 18:30:41 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 406, in run_asgi
2024-10-19 18:30:41 | ERROR | stderr |     result = await app(  # type: ignore[func-returns-value]
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     return await self.app(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await super().__call__(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/applications.py", line 113, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await self.middleware_stack(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 187, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     raise exc
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 165, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await self.app(scope, receive, _send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/cors.py", line 85, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await self.app(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 62, in wrapped_app
2024-10-19 18:30:41 | ERROR | stderr |     raise exc
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 51, in wrapped_app
2024-10-19 18:30:41 | ERROR | stderr |     await app(scope, receive, sender)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 715, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await self.middleware_stack(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 735, in app
2024-10-19 18:30:41 | ERROR | stderr |     await route.handle(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 288, in handle
2024-10-19 18:30:41 | ERROR | stderr |     await self.app(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 76, in app
2024-10-19 18:30:41 | ERROR | stderr |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 62, in wrapped_app
2024-10-19 18:30:41 | ERROR | stderr |     raise exc
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 51, in wrapped_app
2024-10-19 18:30:41 | ERROR | stderr |     await app(scope, receive, sender)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 73, in app
2024-10-19 18:30:41 | ERROR | stderr |     response = await f(request)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 301, in app
2024-10-19 18:30:41 | ERROR | stderr |     raw_response = await run_endpoint_function(
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 214, in run_endpoint_function
2024-10-19 18:30:41 | ERROR | stderr |     return await run_in_threadpool(dependant.call, **values)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/concurrency.py", line 39, in run_in_threadpool
2024-10-19 18:30:41 | ERROR | stderr |     return await anyio.to_thread.run_sync(func, *args)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-19 18:30:41 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 18:30:41 | ERROR | stderr |     return await future
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 18:30:41 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/routes.py", line 346, in main
2024-10-19 18:30:41 | ERROR | stderr | ValueError: Did you install Gradio from source files? You need to build the frontend by running /scripts/build_frontend.sh
2024-10-19 18:30:41 | ERROR | stderr | ERROR:    Exception in ASGI application
2024-10-19 18:30:41 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/routes.py", line 335, in main
2024-10-19 18:30:41 | ERROR | stderr |     ) -> App:
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/templating.py", line 208, in TemplateResponse
2024-10-19 18:30:41 | ERROR | stderr |     template = self.get_template(name)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/templating.py", line 131, in get_template
2024-10-19 18:30:41 | ERROR | stderr |     return self.env.get_template(name)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/jinja2/environment.py", line 1013, in get_template
2024-10-19 18:30:41 | ERROR | stderr |     return self._load_template(name, globals)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/jinja2/environment.py", line 972, in _load_template
2024-10-19 18:30:41 | ERROR | stderr |     template = self.loader.load(self, name, self.make_globals(globals))
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/jinja2/loaders.py", line 126, in load
2024-10-19 18:30:41 | ERROR | stderr |     source, filename, uptodate = self.get_source(environment, name)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/jinja2/loaders.py", line 207, in get_source
2024-10-19 18:30:41 | ERROR | stderr |     raise TemplateNotFound(template)
2024-10-19 18:30:41 | ERROR | stderr | jinja2.exceptions.TemplateNotFound: frontend/index.html
2024-10-19 18:30:41 | ERROR | stderr | 
2024-10-19 18:30:41 | ERROR | stderr | The above exception was the direct cause of the following exception:
2024-10-19 18:30:41 | ERROR | stderr | 
2024-10-19 18:30:41 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 406, in run_asgi
2024-10-19 18:30:41 | ERROR | stderr |     result = await app(  # type: ignore[func-returns-value]
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     return await self.app(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await super().__call__(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/applications.py", line 113, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await self.middleware_stack(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 187, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     raise exc
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 165, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await self.app(scope, receive, _send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/cors.py", line 85, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await self.app(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 62, in wrapped_app
2024-10-19 18:30:41 | ERROR | stderr |     raise exc
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 51, in wrapped_app
2024-10-19 18:30:41 | ERROR | stderr |     await app(scope, receive, sender)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 715, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await self.middleware_stack(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 735, in app
2024-10-19 18:30:41 | ERROR | stderr |     await route.handle(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 288, in handle
2024-10-19 18:30:41 | ERROR | stderr |     await self.app(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 76, in app
2024-10-19 18:30:41 | ERROR | stderr |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 62, in wrapped_app
2024-10-19 18:30:41 | ERROR | stderr |     raise exc
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 51, in wrapped_app
2024-10-19 18:30:41 | ERROR | stderr |     await app(scope, receive, sender)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 73, in app
2024-10-19 18:30:41 | ERROR | stderr |     response = await f(request)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 301, in app
2024-10-19 18:30:41 | ERROR | stderr |     raw_response = await run_endpoint_function(
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 214, in run_endpoint_function
2024-10-19 18:30:41 | ERROR | stderr |     return await run_in_threadpool(dependant.call, **values)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/concurrency.py", line 39, in run_in_threadpool
2024-10-19 18:30:41 | ERROR | stderr |     return await anyio.to_thread.run_sync(func, *args)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-19 18:30:41 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 18:30:41 | ERROR | stderr |     return await future
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 18:30:41 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/routes.py", line 346, in main
2024-10-19 18:30:41 | ERROR | stderr | ValueError: Did you install Gradio from source files? You need to build the frontend by running /scripts/build_frontend.sh
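The TemplateNotFound / ValueError pair logged above indicates that the installed gradio package has no built frontend to serve: gradio raises this ValueError when it cannot find frontend/index.html in its bundled templates, which typically happens with a source checkout whose frontend was never built (PyPI wheels ship the prebuilt assets). A minimal diagnostic sketch follows, written here as an illustration rather than taken from the log, and assuming gradio 3.x keeps its built frontend under gradio/templates/frontend inside the installed package:

# Diagnostic sketch (assumption: gradio 3.x ships its built frontend under
# gradio/templates/frontend). If index.html is missing, the server raises the
# TemplateNotFound -> ValueError chain seen above; reinstalling a PyPI wheel,
# which bundles the built frontend, or building the frontend in a source
# checkout is the usual remedy.
import os
import gradio

pkg_dir = os.path.dirname(gradio.__file__)
index_html = os.path.join(pkg_dir, "templates", "frontend", "index.html")

print("gradio version :", gradio.__version__)
print("package dir    :", pkg_dir)
print("built frontend :", "present" if os.path.exists(index_html) else "MISSING")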
2024-10-19 18:30:41 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 18:30:41 | ERROR | stderr |     return await future
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 18:30:41 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/routes.py", line 346, in main
2024-10-19 18:30:41 | ERROR | stderr | ValueError: Did you install Gradio from source files? You need to build the frontend by running /scripts/build_frontend.sh
2024-10-19 18:30:41 | ERROR | stderr | ERROR:    Exception in ASGI application
2024-10-19 18:30:41 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/routes.py", line 335, in main
2024-10-19 18:30:41 | ERROR | stderr |     ) -> App:
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/templating.py", line 208, in TemplateResponse
2024-10-19 18:30:41 | ERROR | stderr |     template = self.get_template(name)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/templating.py", line 131, in get_template
2024-10-19 18:30:41 | ERROR | stderr |     return self.env.get_template(name)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/jinja2/environment.py", line 1013, in get_template
2024-10-19 18:30:41 | ERROR | stderr |     return self._load_template(name, globals)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/jinja2/environment.py", line 972, in _load_template
2024-10-19 18:30:41 | ERROR | stderr |     template = self.loader.load(self, name, self.make_globals(globals))
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/jinja2/loaders.py", line 126, in load
2024-10-19 18:30:41 | ERROR | stderr |     source, filename, uptodate = self.get_source(environment, name)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/jinja2/loaders.py", line 207, in get_source
2024-10-19 18:30:41 | ERROR | stderr |     raise TemplateNotFound(template)
2024-10-19 18:30:41 | ERROR | stderr | jinja2.exceptions.TemplateNotFound: frontend/index.html
2024-10-19 18:30:41 | ERROR | stderr | 
2024-10-19 18:30:41 | ERROR | stderr | The above exception was the direct cause of the following exception:
2024-10-19 18:30:41 | ERROR | stderr | 
2024-10-19 18:30:41 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 406, in run_asgi
2024-10-19 18:30:41 | ERROR | stderr |     result = await app(  # type: ignore[func-returns-value]
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     return await self.app(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await super().__call__(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/applications.py", line 113, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await self.middleware_stack(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 187, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     raise exc
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 165, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await self.app(scope, receive, _send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/cors.py", line 85, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await self.app(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 62, in wrapped_app
2024-10-19 18:30:41 | ERROR | stderr |     raise exc
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 51, in wrapped_app
2024-10-19 18:30:41 | ERROR | stderr |     await app(scope, receive, sender)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 715, in __call__
2024-10-19 18:30:41 | ERROR | stderr |     await self.middleware_stack(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 735, in app
2024-10-19 18:30:41 | ERROR | stderr |     await route.handle(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 288, in handle
2024-10-19 18:30:41 | ERROR | stderr |     await self.app(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 76, in app
2024-10-19 18:30:41 | ERROR | stderr |     await wrap_app_handling_exceptions(app, request)(scope, receive, send)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 62, in wrapped_app
2024-10-19 18:30:41 | ERROR | stderr |     raise exc
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/_exception_handler.py", line 51, in wrapped_app
2024-10-19 18:30:41 | ERROR | stderr |     await app(scope, receive, sender)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 73, in app
2024-10-19 18:30:41 | ERROR | stderr |     response = await f(request)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 301, in app
2024-10-19 18:30:41 | ERROR | stderr |     raw_response = await run_endpoint_function(
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 214, in run_endpoint_function
2024-10-19 18:30:41 | ERROR | stderr |     return await run_in_threadpool(dependant.call, **values)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/starlette/concurrency.py", line 39, in run_in_threadpool
2024-10-19 18:30:41 | ERROR | stderr |     return await anyio.to_thread.run_sync(func, *args)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-19 18:30:41 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 18:30:41 | ERROR | stderr |     return await future
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 18:30:41 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 18:30:41 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/routes.py", line 346, in main
2024-10-19 18:30:41 | ERROR | stderr | ValueError: Did you install Gradio from source files? You need to build the frontend by running /scripts/build_frontend.sh
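This ValueError comes from gradio/routes.py when the packaged web frontend is missing, which is what the underlying jinja2.exceptions.TemplateNotFound: frontend/index.html points to; a released wheel ships the built frontend, so the usual remedy is installing gradio from PyPI rather than serving a source checkout. A minimal check of the installed layout, assuming the standard wheel layout under gradio/templates/frontend:

    from pathlib import Path
    import gradio

    # Released wheels bundle the built frontend here; if this file is missing,
    # routes.py raises the "build the frontend" ValueError seen above.
    index_html = Path(gradio.__file__).parent / "templates" / "frontend" / "index.html"
    print(index_html, "exists" if index_html.exists() else "MISSING")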
2024-10-19 18:30:54 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:30:54 | ERROR | stderr |   File "/home/user/app/app.py", line 44, in <module>
2024-10-19 18:30:54 | ERROR | stderr |     no_change_btn = gr.Button.update()
2024-10-19 18:30:54 | ERROR | stderr | AttributeError: type object 'Button' has no attribute 'update'
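In Gradio 4 and later the per-component update() classmethods no longer exist; event handlers return component instances (or the generic gr.update()) to describe the new state. A minimal sketch of the replacement for the failing line:

    import gradio as gr

    # Gradio 4+: construct components instead of calling Button.update().
    no_change_btn = gr.Button()
    enable_btn = gr.Button(interactive=True)
    disable_btn = gr.Button(interactive=False)
    # gr.update(interactive=True) also still works as a component-agnostic update.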
2024-10-19 18:33:31 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:33:31 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 18:33:31 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:33:32 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:33:32 | ERROR | stderr |   File "/home/user/app/app.py", line 789, in <module>
2024-10-19 18:33:32 | ERROR | stderr |     demo = build_demo(args.embed)
2024-10-19 18:33:32 | ERROR | stderr |   File "/home/user/app/app.py", line 665, in build_demo
2024-10-19 18:33:32 | ERROR | stderr |     sketch_pad = ImageMask(label="Image & Sketch", type="pil", elem_id="img2text")
2024-10-19 18:33:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/component_meta.py", line 179, in wrapper
2024-10-19 18:33:32 | ERROR | stderr |     return fn(self, **kwargs)
2024-10-19 18:33:32 | ERROR | stderr |   File "/home/user/app/app.py", line 554, in __init__
2024-10-19 18:33:32 | ERROR | stderr |     super().__init__(source="upload", tool="sketch", interactive=True, **kwargs)
2024-10-19 18:33:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/component_meta.py", line 179, in wrapper
2024-10-19 18:33:32 | ERROR | stderr |     return fn(self, **kwargs)
2024-10-19 18:33:32 | ERROR | stderr | TypeError: Image.__init__() got an unexpected keyword argument 'source'
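In Gradio 4 the gr.Image `source` argument was renamed to the plural `sources` and takes a list; the separate `tool` argument is a different removal, handled further down. A sketch of the renamed call, assuming the rest of the arguments stay the same:

    import gradio as gr

    # Gradio 4 gr.Image: `sources` is a list of allowed input sources.
    img = gr.Image(sources=["upload"], type="pil", interactive=True,
                   label="Image & Sketch", elem_id="img2text")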
2024-10-19 18:33:32 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-19 18:38:29 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:38:29 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 18:38:29 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:38:29 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:38:29 | ERROR | stderr |   File "/home/user/app/app.py", line 789, in <module>
2024-10-19 18:38:29 | ERROR | stderr |     demo = build_demo(args.embed)
2024-10-19 18:38:29 | ERROR | stderr |   File "/home/user/app/app.py", line 665, in build_demo
2024-10-19 18:38:29 | ERROR | stderr |     sketch_pad = ImageMask(label="Image & Sketch", type="pil", elem_id="img2text")
2024-10-19 18:38:29 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/component_meta.py", line 179, in wrapper
2024-10-19 18:38:29 | ERROR | stderr |     return fn(self, **kwargs)
2024-10-19 18:38:29 | ERROR | stderr |   File "/home/user/app/app.py", line 554, in __init__
2024-10-19 18:38:29 | ERROR | stderr |     super().__init__(sources=["upload"], tool="sketch", interactive=True, **kwargs)
2024-10-19 18:38:29 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/component_meta.py", line 179, in wrapper
2024-10-19 18:38:29 | ERROR | stderr |     return fn(self, **kwargs)
2024-10-19 18:38:29 | ERROR | stderr | TypeError: Image.__init__() got an unexpected keyword argument 'tool'
2024-10-19 18:38:29 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-19 18:41:42 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:41:42 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 18:41:42 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:41:42 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:41:42 | ERROR | stderr |   File "/home/user/app/app.py", line 789, in <module>
2024-10-19 18:41:42 | ERROR | stderr |     demo = build_demo(args.embed)
2024-10-19 18:41:42 | ERROR | stderr |   File "/home/user/app/app.py", line 665, in build_demo
2024-10-19 18:41:42 | ERROR | stderr |     sketch_pad = ImageMask(label="Image & Sketch", type="pil", elem_id="img2text")
2024-10-19 18:41:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/component_meta.py", line 179, in wrapper
2024-10-19 18:41:42 | ERROR | stderr |     return fn(self, **kwargs)
2024-10-19 18:41:42 | ERROR | stderr |   File "/home/user/app/app.py", line 554, in __init__
2024-10-19 18:41:42 | ERROR | stderr |     super().__init__(sources=["upload"], tool="sketch", interactive=True, **kwargs)
2024-10-19 18:41:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/component_meta.py", line 179, in wrapper
2024-10-19 18:41:42 | ERROR | stderr |     return fn(self, **kwargs)
2024-10-19 18:41:42 | ERROR | stderr | TypeError: ImageEditor.__init__() got an unexpected keyword argument 'tool'
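Neither gr.Image nor gr.ImageEditor accepts `tool` in Gradio 4; free-hand sketching is an ImageEditor feature configured through a Brush. One possible rewrite of the custom ImageMask, offered only as a sketch under those assumptions:

    import gradio as gr

    class ImageMask(gr.ImageEditor):
        """Upload-only image editor with a simple black brush for sketching."""
        def __init__(self, **kwargs):
            super().__init__(
                sources=["upload"],
                interactive=True,
                brush=gr.Brush(colors=["#000000"], default_size=3),
                **kwargs,
            )

    sketch_pad = ImageMask(label="Image & Sketch", type="pil", elem_id="img2text")

Note that a Gradio 4 ImageEditor value is a dict with background/layers/composite entries rather than a single PIL image, so downstream handlers would need to pick one of those.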
2024-10-19 18:41:42 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-19 18:42:34 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:42:34 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 18:42:34 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:42:34 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-19 18:42:34 | ERROR | stderr |   warnings.warn(
2024-10-19 18:42:34 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:42:34 | ERROR | stderr |   File "/home/user/app/app.py", line 789, in <module>
2024-10-19 18:42:34 | ERROR | stderr |     demo = build_demo(args.embed)
2024-10-19 18:42:34 | ERROR | stderr |   File "/home/user/app/app.py", line 702, in build_demo
2024-10-19 18:42:34 | ERROR | stderr |     chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", visible=False).style(height=750)
2024-10-19 18:42:34 | ERROR | stderr | AttributeError: 'Chatbot' object has no attribute 'style'. Did you mean: 'scale'?
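The .style() helper was removed in Gradio 4 along with the update() classmethods; layout options such as height are now constructor arguments. A sketch of the fixed line:

    import gradio as gr

    chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", visible=False, height=750)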
2024-10-19 18:42:35 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-19 18:45:04 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:45:04 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 18:45:04 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:45:04 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-19 18:45:04 | ERROR | stderr |   warnings.warn(
2024-10-19 18:45:04 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:45:04 | ERROR | stderr |   File "/home/user/app/app.py", line 789, in <module>
2024-10-19 18:45:04 | ERROR | stderr |     demo = build_demo(args.embed)
2024-10-19 18:45:04 | ERROR | stderr |   File "/home/user/app/app.py", line 702, in build_demo
2024-10-19 18:45:04 | ERROR | stderr |     chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", visible=False).style(height=750)
2024-10-19 18:45:04 | ERROR | stderr | AttributeError: 'Chatbot' object has no attribute 'style'. Did you mean: 'scale'?
2024-10-19 18:45:04 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-19 18:47:21 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:47:21 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 18:47:21 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-19 18:47:21 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:47:21 | ERROR | stderr |   File "/home/user/app/app.py", line 789, in <module>
2024-10-19 18:47:21 | ERROR | stderr |     demo = build_demo(args.embed)
2024-10-19 18:47:21 | ERROR | stderr |   File "/home/user/app/app.py", line 665, in build_demo
2024-10-19 18:47:21 | ERROR | stderr |     sketch_pad = ImageMask(label="Image & Sketch", type="pil", elem_id="img2text")
2024-10-19 18:47:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/component_meta.py", line 179, in wrapper
2024-10-19 18:47:21 | ERROR | stderr |     return fn(self, **kwargs)
2024-10-19 18:47:21 | ERROR | stderr |   File "/home/user/app/app.py", line 554, in __init__
2024-10-19 18:47:21 | ERROR | stderr |     super().__init__(source="upload", tool="sketch", interactive=True, **kwargs)
2024-10-19 18:47:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/component_meta.py", line 179, in wrapper
2024-10-19 18:47:21 | ERROR | stderr |     return fn(self, **kwargs)
2024-10-19 18:47:21 | ERROR | stderr | TypeError: Image.__init__() got an unexpected keyword argument 'source'
2024-10-19 18:47:21 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-19 18:58:51 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:58:51 | ERROR | stderr |   File "/home/user/app/app.py", line 546, in <module>
2024-10-19 18:58:51 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/component_meta.py", line 218, in __new__
2024-10-19 18:58:51 | ERROR | stderr |     create_or_modify_pyi(component_class, name, events)
2024-10-19 18:58:51 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/component_meta.py", line 113, in create_or_modify_pyi
2024-10-19 18:58:51 | ERROR | stderr |     raise ValueError("Couldn't find class source code")
2024-10-19 18:58:51 | ERROR | stderr | ValueError: Couldn't find class source code
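Gradio's component metaclass regenerates a .pyi stub for every component subclass and raises this ValueError when it cannot locate the subclass's source text, as can happen when app code is exec'd or patched at runtime. A hedged workaround is to skip the subclass entirely and build the pre-configured component with functools.partial, so no new component class is created; the names mirror the earlier sketch and are assumptions:

    from functools import partial
    import gradio as gr

    # No subclass, so ComponentMeta never needs to introspect new source code.
    ImageMask = partial(
        gr.ImageEditor,
        sources=["upload"],
        interactive=True,
        brush=gr.Brush(colors=["#000000"], default_size=3),
    )

    sketch_pad = ImageMask(label="Image & Sketch", type="pil", elem_id="img2text")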
2024-10-19 18:59:03 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-19 18:59:03 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 18:59:03 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-19 18:59:03 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-19 18:59:03 | ERROR | stderr |   warnings.warn(
2024-10-19 18:59:04 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-19 18:59:04 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-19 18:59:04 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-19 18:59:04 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
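These stdout lines come from Gradio 5's server-side rendering, which wants a Node 20+ runtime reachable via GRADIO_NODE_PATH (and optionally GRADIO_NODE_PORT). If no such runtime is available, SSR can also be disabled at launch time; a sketch, assuming Gradio 5's launch() exposes ssr_mode:

    import gradio as gr

    with gr.Blocks() as demo:
        gr.Markdown("hello")

    # Assumption: Gradio 5 launch() accepts ssr_mode; turn it off when no
    # Node 20+ runtime is installed.
    demo.launch(ssr_mode=False)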
2024-10-19 18:59:04 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-19 18:59:04 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-19 18:59:04 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-19 18:59:04 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-19 18:59:04 | ERROR | stderr |   warnings.warn(
2024-10-19 18:59:04 | INFO | stdout | 
2024-10-19 18:59:04 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-19 18:59:19 | INFO | stdout | conv mode to gemma
2024-10-19 18:59:19 | INFO | stdout | Input Image Size:(400, 586)
2024-10-19 18:59:19 | INFO | stdout | Input Image Size:(400, 586)
2024-10-19 18:59:19 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndesrcibe the text<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['9c4c9c437ec882a11cd6ed69ca2e5bd9']"}
2024-10-19 18:59:19 | INFO | stdout | Input Image Size:(400, 586)
2024-10-19 18:59:19 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 18:59:19 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-19 18:59:19 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-19 18:59:19 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-19 18:59:19 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-19 18:59:19 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-19 18:59:19 | ERROR | stderr |     result = await self.call_function(
2024-10-19 18:59:19 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-19 18:59:19 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-19 18:59:19 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-19 18:59:19 | ERROR | stderr |     return await anext(iterator)
2024-10-19 18:59:19 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-19 18:59:19 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-19 18:59:19 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-19 18:59:19 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 18:59:19 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 18:59:19 | ERROR | stderr |     return await future
2024-10-19 18:59:19 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 18:59:19 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 18:59:19 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-19 18:59:19 | ERROR | stderr |     return next(iterator)
2024-10-19 18:59:19 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-19 18:59:19 | ERROR | stderr |     response = next(iterator)
2024-10-19 18:59:19 | ERROR | stderr |   File "/home/user/app/app.py", line 252, in http_bot
2024-10-19 18:59:19 | ERROR | stderr |     results, extracted_texts = inference_and_run(
2024-10-19 18:59:19 | ERROR | stderr | TypeError: inference_and_run() got an unexpected keyword argument 'temperature'
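http_bot forwards the request's sampling settings into the app's own inference_and_run(), whose signature in inference.py does not declare them. A hypothetical sketch of accepting them; apart from temperature/top_p/max_new_tokens/stop, which mirror the logged request dict, every parameter name here is a placeholder:

    # inference.py (sketch): accept the sampling options http_bot forwards.
    def inference_and_run(image_path, prompt,
                          model_path="jadechoghari/Ferret-UI-Gemma2b",
                          conv_mode="gemma", temperature=0.2, top_p=0.7,
                          max_new_tokens=512, stop="<eos>", **kwargs):
        ...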
2024-10-19 19:02:41 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-19 19:02:41 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 19:02:41 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-19 19:02:42 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-19 19:02:42 | ERROR | stderr |   warnings.warn(
2024-10-19 19:02:42 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-19 19:02:42 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-19 19:02:42 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-19 19:02:42 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-19 19:02:42 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-19 19:02:42 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-19 19:02:42 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-19 19:02:42 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-19 19:02:42 | ERROR | stderr |   warnings.warn(
2024-10-19 19:02:42 | INFO | stdout | 
2024-10-19 19:02:42 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-19 19:02:51 | INFO | stdout | conv mode to gemma
2024-10-19 19:02:51 | INFO | stdout | Input Image Size:(400, 668)
2024-10-19 19:02:51 | INFO | stdout | Input Image Size:(400, 668)
2024-10-19 19:02:51 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe the image<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['f5fd9bd8b1445ded1d843253a97af861']"}
2024-10-19 19:02:51 | INFO | stdout | Input Image Size:(400, 668)
2024-10-19 19:02:51 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 19:02:51 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-19 19:02:51 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-19 19:02:51 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-19 19:02:51 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-19 19:02:51 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-19 19:02:51 | ERROR | stderr |     result = await self.call_function(
2024-10-19 19:02:51 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-19 19:02:51 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-19 19:02:51 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-19 19:02:51 | ERROR | stderr |     return await anext(iterator)
2024-10-19 19:02:51 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-19 19:02:51 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-19 19:02:51 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-19 19:02:51 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 19:02:51 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 19:02:51 | ERROR | stderr |     return await future
2024-10-19 19:02:51 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 19:02:51 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 19:02:51 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-19 19:02:51 | ERROR | stderr |     return next(iterator)
2024-10-19 19:02:51 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-19 19:02:51 | ERROR | stderr |     response = next(iterator)
2024-10-19 19:02:51 | ERROR | stderr |   File "/home/user/app/app.py", line 252, in http_bot
2024-10-19 19:02:51 | ERROR | stderr |     results, extracted_texts = inference_and_run(
2024-10-19 19:02:51 | ERROR | stderr |   File "/home/user/app/inference.py", line 49, in inference_and_run
2024-10-19 19:02:51 | ERROR | stderr |     "image_h": Image.open(image_path).height,
2024-10-19 19:02:51 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/PIL/Image.py", line 3431, in open
2024-10-19 19:02:51 | ERROR | stderr |     fp = builtins.open(filename, "rb")
2024-10-19 19:02:51 | ERROR | stderr | FileNotFoundError: [Errno 2] No such file or directory: '/home/user/app/f5fd9bd8b1445ded1d843253a97af861'
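inference.py reopens the image by the hash recorded in the request, but nothing has written that file under /home/user/app yet. A hypothetical sketch of persisting the uploaded PIL image before inference runs; the hashing scheme and directory are assumptions taken from the path in the traceback:

    import hashlib
    import os

    def save_image_for_inference(pil_image, app_dir="/home/user/app"):
        # Name the file after the md5 of its pixel bytes, matching the
        # hash-like filename inference.py later tries to open.
        image_hash = hashlib.md5(pil_image.tobytes()).hexdigest()
        image_path = os.path.join(app_dir, image_hash)
        if not os.path.exists(image_path):
            pil_image.save(image_path, format="PNG")
        return image_path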
2024-10-19 20:49:58 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-19 20:49:58 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 20:49:58 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-19 20:49:58 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-19 20:49:58 | ERROR | stderr |   warnings.warn(
2024-10-19 20:49:58 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-19 20:49:58 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-19 20:49:58 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-19 20:49:58 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-19 20:49:58 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-19 20:49:58 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-19 20:49:58 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-19 20:49:58 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-19 20:49:58 | ERROR | stderr |   warnings.warn(
2024-10-19 20:49:58 | INFO | stdout | 
2024-10-19 20:49:58 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-19 20:50:36 | INFO | stdout | conv mode to gemma
2024-10-19 20:50:36 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 20:50:36 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 20:50:36 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\nexplain the image<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
2024-10-19 20:50:36 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 20:50:36 | INFO | stdout | eval.json file created successfully.
2024-10-19 20:50:36 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 20:50:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-19 20:50:36 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-19 20:50:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-19 20:50:36 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-19 20:50:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-19 20:50:36 | ERROR | stderr |     result = await self.call_function(
2024-10-19 20:50:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-19 20:50:36 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-19 20:50:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-19 20:50:36 | ERROR | stderr |     return await anext(iterator)
2024-10-19 20:50:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-19 20:50:36 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-19 20:50:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-19 20:50:36 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 20:50:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 20:50:36 | ERROR | stderr |     return await future
2024-10-19 20:50:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 20:50:36 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 20:50:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-19 20:50:36 | ERROR | stderr |     return next(iterator)
2024-10-19 20:50:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-19 20:50:36 | ERROR | stderr |     response = next(iterator)
2024-10-19 20:50:36 | ERROR | stderr |   File "/home/user/app/app.py", line 252, in http_bot
2024-10-19 20:50:36 | ERROR | stderr |     results, extracted_texts = inference_and_run(
2024-10-19 20:50:36 | ERROR | stderr |   File "/home/user/app/inference.py", line 79, in inference_and_run
2024-10-19 20:50:36 | ERROR | stderr |     result = subprocess.run(cmd, check=True, capture_output=True, text=True)
2024-10-19 20:50:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/subprocess.py", line 503, in run
2024-10-19 20:50:36 | ERROR | stderr |     with Popen(*popenargs, **kwargs) as process:
2024-10-19 20:50:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/subprocess.py", line 971, in __init__
2024-10-19 20:50:36 | ERROR | stderr |     self._execute_child(args, executable, preexec_fn, close_fds,
2024-10-19 20:50:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/subprocess.py", line 1796, in _execute_child
2024-10-19 20:50:36 | ERROR | stderr |     self.pid = _posixsubprocess.fork_exec(
2024-10-19 20:50:36 | ERROR | stderr | TypeError: expected str, bytes or os.PathLike object, not float
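subprocess argv entries must be str, bytes, or os.PathLike, so a numeric option left in cmd (for example a float temperature) triggers exactly this TypeError inside fork_exec. A minimal sketch of the usual fix, casting every argument before running the command:

    import subprocess

    def run_inference_cmd(cmd_args):
        # Cast numerics (temperature, top_p, ...) to str so fork_exec accepts them.
        cmd = [str(a) for a in cmd_args]
        return subprocess.run(cmd, check=True, capture_output=True, text=True)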
2024-10-19 20:53:53 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-19 20:53:53 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 20:53:53 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-19 20:53:53 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-19 20:53:53 | ERROR | stderr |   warnings.warn(
2024-10-19 20:53:53 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-19 20:53:53 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-19 20:53:53 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-19 20:53:53 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-19 20:53:53 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-19 20:53:53 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-19 20:53:53 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-19 20:53:53 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-19 20:53:53 | ERROR | stderr |   warnings.warn(
2024-10-19 20:53:53 | INFO | stdout | 
2024-10-19 20:53:53 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-19 20:54:04 | INFO | stdout | conv mode to gemma
2024-10-19 20:54:04 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 20:54:04 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 20:54:04 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\nexplain this<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
2024-10-19 20:54:04 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 20:54:04 | INFO | stdout | eval.json file created successfully.
2024-10-19 20:54:04 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 20:54:04 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-19 20:54:04 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-19 20:54:04 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-19 20:54:04 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-19 20:54:04 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-19 20:54:04 | ERROR | stderr |     result = await self.call_function(
2024-10-19 20:54:04 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-19 20:54:04 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-19 20:54:04 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-19 20:54:04 | ERROR | stderr |     return await anext(iterator)
2024-10-19 20:54:04 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-19 20:54:04 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-19 20:54:04 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-19 20:54:04 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 20:54:04 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 20:54:04 | ERROR | stderr |     return await future
2024-10-19 20:54:04 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 20:54:04 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 20:54:04 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-19 20:54:04 | ERROR | stderr |     return next(iterator)
2024-10-19 20:54:04 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-19 20:54:04 | ERROR | stderr |     response = next(iterator)
2024-10-19 20:54:04 | ERROR | stderr |   File "/home/user/app/app.py", line 252, in http_bot
2024-10-19 20:54:04 | ERROR | stderr |     results, extracted_texts = inference_and_run(
2024-10-19 20:54:04 | ERROR | stderr |   File "/home/user/app/inference.py", line 79, in inference_and_run
2024-10-19 20:54:04 | ERROR | stderr |     result = subprocess.run(cmd, check=True, capture_output=True, text=True)
2024-10-19 20:54:04 | ERROR | stderr |   File "/usr/local/lib/python3.10/subprocess.py", line 503, in run
2024-10-19 20:54:04 | ERROR | stderr |     with Popen(*popenargs, **kwargs) as process:
2024-10-19 20:54:04 | ERROR | stderr |   File "/usr/local/lib/python3.10/subprocess.py", line 971, in __init__
2024-10-19 20:54:04 | ERROR | stderr |     self._execute_child(args, executable, preexec_fn, close_fds,
2024-10-19 20:54:04 | ERROR | stderr |   File "/usr/local/lib/python3.10/subprocess.py", line 1796, in _execute_child
2024-10-19 20:54:04 | ERROR | stderr |     self.pid = _posixsubprocess.fork_exec(
2024-10-19 20:54:04 | ERROR | stderr | TypeError: expected str, bytes or os.PathLike object, not float
2024-10-19 20:54:32 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-19 20:54:32 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 20:54:32 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-19 20:54:33 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-19 20:54:33 | ERROR | stderr |   warnings.warn(
2024-10-19 20:54:33 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-19 20:54:33 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-19 20:54:33 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-19 20:54:33 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-19 20:54:33 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-19 20:54:33 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-19 20:54:33 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-19 20:54:33 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-19 20:54:33 | ERROR | stderr |   warnings.warn(
2024-10-19 20:54:33 | INFO | stdout | 
2024-10-19 20:54:33 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-19 20:54:40 | INFO | stdout | conv mode to gemma
2024-10-19 20:54:40 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 20:54:40 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 20:54:40 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\nexplain this<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
2024-10-19 20:54:40 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 20:54:40 | INFO | stdout | eval.json file created successfully.
2024-10-19 20:54:40 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 20:54:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-19 20:54:40 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-19 20:54:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-19 20:54:40 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-19 20:54:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-19 20:54:40 | ERROR | stderr |     result = await self.call_function(
2024-10-19 20:54:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-19 20:54:40 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-19 20:54:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-19 20:54:40 | ERROR | stderr |     return await anext(iterator)
2024-10-19 20:54:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-19 20:54:40 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-19 20:54:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-19 20:54:40 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 20:54:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 20:54:40 | ERROR | stderr |     return await future
2024-10-19 20:54:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 20:54:40 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 20:54:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-19 20:54:40 | ERROR | stderr |     return next(iterator)
2024-10-19 20:54:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-19 20:54:40 | ERROR | stderr |     response = next(iterator)
2024-10-19 20:54:40 | ERROR | stderr |   File "/home/user/app/app.py", line 252, in http_bot
2024-10-19 20:54:40 | ERROR | stderr |     results, extracted_texts = inference_and_run(
2024-10-19 20:54:40 | ERROR | stderr |   File "/home/user/app/inference.py", line 79, in inference_and_run
2024-10-19 20:54:40 | ERROR | stderr |     result = subprocess.run(cmd, check=True, capture_output=True, text=True)
2024-10-19 20:54:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/subprocess.py", line 503, in run
2024-10-19 20:54:40 | ERROR | stderr |     with Popen(*popenargs, **kwargs) as process:
2024-10-19 20:54:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/subprocess.py", line 971, in __init__
2024-10-19 20:54:40 | ERROR | stderr |     self._execute_child(args, executable, preexec_fn, close_fds,
2024-10-19 20:54:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/subprocess.py", line 1796, in _execute_child
2024-10-19 20:54:40 | ERROR | stderr |     self.pid = _posixsubprocess.fork_exec(
2024-10-19 20:54:40 | ERROR | stderr | TypeError: expected str, bytes or os.PathLike object, not float
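The TypeError above is raised while Popen builds the child argv in inference.py: every element passed to subprocess.run() must be str, bytes, or os.PathLike, and a numeric sampling parameter (most likely the float temperature/top_p from the request logged above) was left unconverted. A minimal sketch of the fix, assuming the command is assembled roughly the way the later log line shows it; the helper name is hypothetical:

import subprocess

def run_model_ui(model_path, data_path, image_path, answers_file,
                 temperature=0.2, top_p=0.7, max_new_tokens=512):
    # Hypothetical reconstruction of the command built in inference.py.
    # Floats/ints in this list would trigger exactly the error above:
    # TypeError: expected str, bytes or os.PathLike object, not float
    cmd = [
        "python", "-m", "model_UI",
        "--model_path", model_path,
        "--data_path", data_path,
        "--image_path", image_path,
        "--answers_file", answers_file,
        "--temperature", temperature,
        "--top_p", top_p,
        "--max_new_tokens", max_new_tokens,
        "--conv_mode", "ferret_gemma_instruct",
    ]
    cmd = [str(arg) for arg in cmd]  # coerce every argv element to str
    return subprocess.run(cmd, check=True, capture_output=True, text=True)

The command logged at 20:56:24 already shows all-string arguments, consistent with a coercion like this being applied before the next run.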
2024-10-19 20:56:02 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-19 20:56:02 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 20:56:02 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-19 20:56:02 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-19 20:56:02 | ERROR | stderr |   warnings.warn(
2024-10-19 20:56:02 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-19 20:56:02 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-19 20:56:02 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-19 20:56:02 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-19 20:56:02 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-19 20:56:02 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-19 20:56:02 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-19 20:56:02 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-19 20:56:02 | ERROR | stderr |   warnings.warn(
2024-10-19 20:56:02 | INFO | stdout | 
2024-10-19 20:56:02 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-19 20:56:08 | INFO | stdout | conv mode to gemma
2024-10-19 20:56:08 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 20:56:09 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 20:56:09 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\nexplain this<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
2024-10-19 20:56:09 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 20:56:09 | INFO | stdout | eval.json file created successfully.
2024-10-19 20:56:24 | INFO | stdout | Error occurred during inference:
2024-10-19 20:56:24 | INFO | stdout | Command '['python', '-m', 'model_UI', '--model_path', 'jadechoghari/Ferret-UI-Gemma2b', '--data_path', 'eval.json', '--image_path', '.', '--answers_file', 'eval_output.jsonl', '--num_beam', '1', '--temperature', '0.2', '--top_p', '0.7', '--max_new_tokens', '512', '--conv_mode', 'ferret_gemma_instruct']' returned non-zero exit status 1.
2024-10-19 20:56:24 | INFO | stdout | Subprocess output:
2024-10-19 20:56:24 | INFO | stdout | 
2024-10-19 20:56:24 | INFO | gradio_web_server | This is the response None
2024-10-19 20:56:24 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 20:56:24 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-19 20:56:24 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-19 20:56:24 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-19 20:56:24 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-19 20:56:24 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-19 20:56:24 | ERROR | stderr |     result = await self.call_function(
2024-10-19 20:56:24 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-19 20:56:24 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-19 20:56:24 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-19 20:56:24 | ERROR | stderr |     return await anext(iterator)
2024-10-19 20:56:24 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-19 20:56:24 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-19 20:56:24 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-19 20:56:24 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 20:56:24 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 20:56:24 | ERROR | stderr |     return await future
2024-10-19 20:56:24 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 20:56:24 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 20:56:24 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-19 20:56:24 | ERROR | stderr |     return next(iterator)
2024-10-19 20:56:24 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-19 20:56:24 | ERROR | stderr |     response = next(iterator)
2024-10-19 20:56:24 | ERROR | stderr |   File "/home/user/app/app.py", line 264, in http_bot
2024-10-19 20:56:24 | ERROR | stderr |     for chunk in response.iter_lines(decode_unicode=False, delimiter=b"\0"):
2024-10-19 20:56:24 | ERROR | stderr | AttributeError: 'NoneType' object has no attribute 'iter_lines'
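Here the model_UI subprocess exited with status 1, the response object handed back to http_bot is None (as the log line just above says), and the streaming loop then crashes by calling .iter_lines() on it. A small defensive sketch, assuming the loop looks like the one in the traceback; the function name and fallback message are illustrative, not the app's actual code:

def stream_or_fallback(response, fallback="Inference failed - see server logs."):
    # When the inference subprocess fails, the caller gets None instead of a
    # streaming HTTP response; guard against that before iterating.
    if response is None:
        yield fallback
        return
    for chunk in response.iter_lines(decode_unicode=False, delimiter=b"\0"):
        if chunk:
            yield chunk.decode()

With a guard like this the UI reports the failure instead of raising AttributeError inside the Gradio worker.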
2024-10-19 22:26:55 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-19 22:26:55 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 22:26:55 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-19 22:26:55 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-19 22:26:55 | ERROR | stderr |   warnings.warn(
2024-10-19 22:26:56 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-19 22:26:56 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-19 22:26:56 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-19 22:26:56 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-19 22:26:56 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-19 22:26:56 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-19 22:26:56 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-19 22:26:56 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-19 22:26:56 | ERROR | stderr |   warnings.warn(
2024-10-19 22:26:56 | INFO | stdout | 
2024-10-19 22:26:56 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-19 22:27:43 | INFO | stdout | conv mode to gemma
2024-10-19 22:27:43 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 22:27:43 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 22:27:43 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe the image in detail<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
2024-10-19 22:27:43 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 22:27:43 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 22:27:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-19 22:27:43 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-19 22:27:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-19 22:27:43 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-19 22:27:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-19 22:27:43 | ERROR | stderr |     result = await self.call_function(
2024-10-19 22:27:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-19 22:27:43 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-19 22:27:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-19 22:27:43 | ERROR | stderr |     return await anext(iterator)
2024-10-19 22:27:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-19 22:27:43 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-19 22:27:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-19 22:27:43 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 22:27:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 22:27:43 | ERROR | stderr |     return await future
2024-10-19 22:27:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 22:27:43 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 22:27:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-19 22:27:43 | ERROR | stderr |     return next(iterator)
2024-10-19 22:27:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-19 22:27:43 | ERROR | stderr |     response = next(iterator)
2024-10-19 22:27:43 | ERROR | stderr |   File "/home/user/app/app.py", line 254, in http_bot
2024-10-19 22:27:43 | ERROR | stderr |     results, extracted_texts = inference_and_run(
2024-10-19 22:27:43 | ERROR | stderr |   File "/home/user/app/inference.py", line 49, in inference_and_run
2024-10-19 22:27:43 | ERROR | stderr |     "image_h": Image.open(image_path).height,
2024-10-19 22:27:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/PIL/Image.py", line 3431, in open
2024-10-19 22:27:43 | ERROR | stderr |     fp = builtins.open(filename, "rb")
2024-10-19 22:27:43 | ERROR | stderr | IsADirectoryError: [Errno 21] Is a directory: '/home/user/app/serve_images/2024-10-19'
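This IsADirectoryError means the image_path handed to Image.open() is the dated upload folder (serve_images/2024-10-19) rather than the saved image file inside it. One hedged way to resolve a concrete file first; the helper and the fall-back-to-newest-file policy are assumptions, not the app's actual behaviour:

import os
from PIL import Image

def resolve_image_file(image_path):
    # If image_path is already a file, use it; if it is a directory
    # (e.g. the dated serve_images/<date> folder), take the newest image in it.
    if os.path.isfile(image_path):
        return image_path
    candidates = [
        os.path.join(image_path, name)
        for name in os.listdir(image_path)
        if name.lower().endswith((".png", ".jpg", ".jpeg", ".webp"))
    ]
    if not candidates:
        raise FileNotFoundError(f"No image files found in {image_path}")
    return max(candidates, key=os.path.getmtime)

def image_size(image_path):
    # Width/height lookup that tolerates being given the directory.
    with Image.open(resolve_image_file(image_path)) as im:
        return im.width, im.height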
2024-10-19 22:28:31 | INFO | stdout | conv mode to gemma
2024-10-19 22:28:31 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 22:28:31 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 22:28:31 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\nexplian this image in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
2024-10-19 22:28:31 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 22:28:31 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 22:28:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-19 22:28:31 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-19 22:28:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-19 22:28:31 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-19 22:28:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-19 22:28:31 | ERROR | stderr |     result = await self.call_function(
2024-10-19 22:28:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-19 22:28:31 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-19 22:28:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-19 22:28:31 | ERROR | stderr |     return await anext(iterator)
2024-10-19 22:28:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-19 22:28:31 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-19 22:28:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-19 22:28:31 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 22:28:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 22:28:31 | ERROR | stderr |     return await future
2024-10-19 22:28:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 22:28:31 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 22:28:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-19 22:28:31 | ERROR | stderr |     return next(iterator)
2024-10-19 22:28:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-19 22:28:31 | ERROR | stderr |     response = next(iterator)
2024-10-19 22:28:31 | ERROR | stderr |   File "/home/user/app/app.py", line 254, in http_bot
2024-10-19 22:28:31 | ERROR | stderr |     results, extracted_texts = inference_and_run(
2024-10-19 22:28:31 | ERROR | stderr |   File "/home/user/app/inference.py", line 49, in inference_and_run
2024-10-19 22:28:31 | ERROR | stderr |     "image_h": Image.open(image_path).height,
2024-10-19 22:28:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/PIL/Image.py", line 3431, in open
2024-10-19 22:28:31 | ERROR | stderr |     fp = builtins.open(filename, "rb")
2024-10-19 22:28:31 | ERROR | stderr | IsADirectoryError: [Errno 21] Is a directory: '/home/user/app/serve_images/2024-10-19'
2024-10-19 22:30:10 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-19 22:30:10 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 22:30:10 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-19 22:30:10 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-19 22:30:10 | ERROR | stderr |   warnings.warn(
2024-10-19 22:30:10 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-19 22:30:10 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-19 22:30:10 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-19 22:30:10 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-19 22:30:10 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-19 22:30:10 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-19 22:30:10 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-19 22:30:10 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-19 22:30:10 | ERROR | stderr |   warnings.warn(
2024-10-19 22:30:10 | INFO | stdout | 
2024-10-19 22:30:10 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-19 22:31:54 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-19 22:31:54 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-19 22:31:54 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-19 22:31:54 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-19 22:31:54 | ERROR | stderr |   warnings.warn(
2024-10-19 22:31:55 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-19 22:31:55 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-19 22:31:55 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-19 22:31:55 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-19 22:31:55 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-19 22:31:55 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-19 22:31:55 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-19 22:31:55 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-19 22:31:55 | ERROR | stderr |   warnings.warn(
2024-10-19 22:31:55 | INFO | stdout | 
2024-10-19 22:31:55 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-19 22:32:05 | INFO | stdout | conv mode to gemma
2024-10-19 22:32:05 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 22:32:05 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 22:32:05 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\nexplain this image in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
2024-10-19 22:32:05 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 22:32:05 | INFO | stdout | eval.json file created successfully.
2024-10-19 22:32:23 | INFO | stdout | Error occurred during inference:
2024-10-19 22:32:23 | INFO | stdout | Command '['python', '-m', 'model_UI', '--model_path', 'jadechoghari/Ferret-UI-Gemma2b', '--data_path', 'eval.json', '--image_path', './serve_images/2024-10-19', '--answers_file', 'eval_output.jsonl', '--num_beam', '1', '--temperature', '0.2', '--top_p', '0.7', '--max_new_tokens', '512', '--conv_mode', 'ferret_gemma_instruct']' returned non-zero exit status 1.
2024-10-19 22:32:23 | INFO | stdout | Subprocess output:
2024-10-19 22:32:23 | INFO | stdout | 
2024-10-19 22:32:23 | INFO | gradio_web_server | This is the response None
2024-10-19 22:32:23 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 22:32:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-19 22:32:23 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-19 22:32:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-19 22:32:23 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-19 22:32:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-19 22:32:23 | ERROR | stderr |     result = await self.call_function(
2024-10-19 22:32:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-19 22:32:23 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-19 22:32:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-19 22:32:23 | ERROR | stderr |     return await anext(iterator)
2024-10-19 22:32:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-19 22:32:23 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-19 22:32:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-19 22:32:23 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 22:32:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 22:32:23 | ERROR | stderr |     return await future
2024-10-19 22:32:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 22:32:23 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 22:32:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-19 22:32:23 | ERROR | stderr |     return next(iterator)
2024-10-19 22:32:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-19 22:32:23 | ERROR | stderr |     response = next(iterator)
2024-10-19 22:32:23 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-19 22:32:23 | ERROR | stderr |     for chunk in response.iter_lines(decode_unicode=False, delimiter=b"\0"):
2024-10-19 22:32:23 | ERROR | stderr | AttributeError: 'NoneType' object has no attribute 'iter_lines'
2024-10-19 22:36:12 | INFO | stdout | conv mode to gemma
2024-10-19 22:36:12 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 22:36:12 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 22:36:12 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe the image in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
2024-10-19 22:36:12 | INFO | stdout | Input Image Size:(400, 433)
2024-10-19 22:36:12 | INFO | stdout | eval.json file created successfully.
2024-10-19 22:36:26 | INFO | stdout | Error occurred during inference:
2024-10-19 22:36:26 | INFO | stdout | Command '['python', '-m', 'model_UI', '--model_path', 'jadechoghari/Ferret-UI-Gemma2b', '--data_path', 'eval.json', '--image_path', './serve_images/2024-10-19', '--answers_file', 'eval_output.jsonl', '--num_beam', '1', '--temperature', '0.2', '--top_p', '0.7', '--max_new_tokens', '512', '--conv_mode', 'ferret_gemma_instruct']' returned non-zero exit status 1.
2024-10-19 22:36:26 | INFO | stdout | Subprocess output:
2024-10-19 22:36:26 | INFO | stdout | 
2024-10-19 22:36:26 | INFO | gradio_web_server | This is the response None
2024-10-19 22:36:26 | ERROR | stderr | Traceback (most recent call last):
2024-10-19 22:36:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-19 22:36:26 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-19 22:36:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-19 22:36:26 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-19 22:36:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-19 22:36:26 | ERROR | stderr |     result = await self.call_function(
2024-10-19 22:36:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-19 22:36:26 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-19 22:36:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-19 22:36:26 | ERROR | stderr |     return await anext(iterator)
2024-10-19 22:36:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-19 22:36:26 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-19 22:36:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-19 22:36:26 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-19 22:36:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-19 22:36:26 | ERROR | stderr |     return await future
2024-10-19 22:36:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-19 22:36:26 | ERROR | stderr |     result = context.run(func, *args)
2024-10-19 22:36:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-19 22:36:26 | ERROR | stderr |     return next(iterator)
2024-10-19 22:36:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-19 22:36:26 | ERROR | stderr |     response = next(iterator)
2024-10-19 22:36:26 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-19 22:36:26 | ERROR | stderr |     for chunk in response.iter_lines(decode_unicode=False, delimiter=b"\0"):
2024-10-19 22:36:26 | ERROR | stderr | AttributeError: 'NoneType' object has no attribute 'iter_lines'
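Both of the failed runs above print an empty "Subprocess output:" even though model_UI exited with status 1, which suggests only the child's stdout is echoed while its stderr (where the subprocess's own Python traceback would land) is discarded. A sketch of how the CalledProcessError from the subprocess.run(..., check=True, capture_output=True, text=True) call seen earlier could be reported more completely; the wrapper name is hypothetical:

import subprocess

def run_and_report(cmd):
    # Run the inference command and surface both stdout and stderr on failure.
    try:
        return subprocess.run(cmd, check=True, capture_output=True, text=True)
    except subprocess.CalledProcessError as e:
        print("Error occurred during inference:")
        print(e)  # includes the command and exit status
        print("Subprocess stdout:", e.stdout or "<empty>")
        print("Subprocess stderr:", e.stderr or "<empty>")
        return None

Printing e.stderr would make the subprocess's actual traceback visible in these Space logs instead of only the exit status.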
2024-10-20 02:14:43 | INFO | httpx | HTTP Request: POST http://device-api.zero/startup-report "HTTP/1.1 200 OK"
2024-10-20 02:14:43 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-20 02:14:43 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-20 02:14:43 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-20 02:14:44 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-20 02:14:44 | ERROR | stderr |   warnings.warn(
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-20 02:14:44 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-20 02:14:44 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-20 02:14:44 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-20 02:14:45 | INFO | httpx | HTTP Request: GET http://0.0.0.0:7861/ "HTTP/1.1 200 OK"
2024-10-20 02:14:45 | INFO | httpx | HTTP Request: HEAD http://0.0.0.0:7861/ "HTTP/1.1 200 OK"
2024-10-20 02:14:45 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-20 02:14:45 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-20 02:14:45 | ERROR | stderr |   warnings.warn(
2024-10-20 02:14:45 | INFO | stdout | 
2024-10-20 02:14:45 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-20 02:14:45 | INFO | httpx | HTTP Request: GET http://0.0.0.0:7861/ "HTTP/1.1 200 OK"
2024-10-20 02:14:46 | INFO | httpx | HTTP Request: GET http://0.0.0.0:7861/ "HTTP/1.1 200 OK"
2024-10-20 02:14:47 | INFO | httpx | HTTP Request: GET http://0.0.0.0:7861/ "HTTP/1.1 200 OK"
2024-10-20 02:14:48 | INFO | httpx | HTTP Request: GET http://0.0.0.0:7861/ "HTTP/1.1 200 OK"
2024-10-20 02:14:50 | INFO | httpx | HTTP Request: GET http://0.0.0.0:7861/ "HTTP/1.1 200 OK"
2024-10-20 02:14:51 | INFO | httpx | HTTP Request: GET http://0.0.0.0:7861/ "HTTP/1.1 200 OK"
2024-10-20 02:14:52 | INFO | httpx | HTTP Request: GET http://0.0.0.0:7861/ "HTTP/1.1 200 OK"
2024-10-20 02:14:53 | INFO | httpx | HTTP Request: GET http://0.0.0.0:7861/ "HTTP/1.1 200 OK"
2024-10-20 02:14:55 | INFO | httpx | HTTP Request: GET http://0.0.0.0:7861/ "HTTP/1.1 200 OK"
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:17 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:18 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:21 | INFO | stdout | 
2024-10-20 02:15:22 | INFO | stdout | 
2024-10-20 02:15:22 | INFO | stdout | 
2024-10-20 02:15:22 | INFO | stdout | 
2024-10-20 02:15:22 | INFO | stdout | 
2024-10-20 02:15:22 | INFO | stdout | 
2024-10-20 02:15:22 | INFO | stdout | 
2024-10-20 02:15:22 | INFO | stdout | 
2024-10-20 02:15:22 | INFO | stdout | 
2024-10-20 02:15:22 | INFO | stdout | 
2024-10-20 02:15:22 | INFO | stdout | 
2024-10-20 02:15:22 | INFO | stdout | 
2024-10-20 02:15:22 | INFO | stdout | 
2024-10-20 02:15:22 | INFO | stdout | 
2024-10-20 02:15:22 | INFO | stdout | 
2024-10-20 02:15:53 | INFO | stdout | 
2024-10-20 02:15:53 | INFO | stdout | 
2024-10-20 02:15:53 | INFO | stdout | 
2024-10-20 02:15:53 | INFO | stdout | 
2024-10-20 02:15:54 | INFO | stdout | 
2024-10-20 02:15:54 | INFO | stdout | 
2024-10-20 02:15:54 | INFO | stdout | 
2024-10-20 02:15:54 | INFO | stdout | 
2024-10-20 02:15:54 | INFO | stdout | 
2024-10-20 02:15:54 | INFO | stdout | 
2024-10-20 02:15:54 | INFO | stdout | 
2024-10-20 02:15:54 | INFO | stdout | 
2024-10-20 02:15:54 | INFO | stdout | 
2024-10-20 02:15:54 | INFO | stdout | 
2024-10-20 02:15:54 | INFO | stdout | 
2024-10-20 02:15:54 | INFO | stdout | 
2024-10-20 02:15:54 | INFO | stdout | 
2024-10-20 02:15:54 | INFO | stdout | 
2024-10-20 02:15:54 | INFO | stdout | 
2024-10-20 02:15:55 | INFO | stdout | 
2024-10-20 02:15:55 | INFO | stdout | 
2024-10-20 02:15:55 | INFO | stdout | 
2024-10-20 02:15:55 | INFO | stdout | 
2024-10-20 02:15:55 | INFO | stdout | 
2024-10-20 02:15:55 | INFO | stdout | 
2024-10-20 02:15:55 | INFO | stdout | 
2024-10-20 02:15:55 | INFO | stdout | 
2024-10-20 02:15:55 | INFO | stdout | 
2024-10-20 02:15:55 | INFO | stdout | 
2024-10-20 02:15:55 | INFO | stdout | 
2024-10-20 02:15:55 | INFO | stdout | 
2024-10-20 02:15:55 | INFO | stdout | 
2024-10-20 02:15:55 | INFO | stdout | 
2024-10-20 02:15:55 | INFO | stdout | 
2024-10-20 02:15:56 | INFO | stdout | 
2024-10-20 02:15:56 | INFO | stdout | 
2024-10-20 02:15:56 | INFO | stdout | 
2024-10-20 02:15:56 | INFO | stdout | 
2024-10-20 02:15:56 | INFO | stdout | 
2024-10-20 02:15:56 | INFO | stdout | 
2024-10-20 02:15:56 | INFO | stdout | 
2024-10-20 02:15:56 | INFO | stdout | 
2024-10-20 02:15:56 | INFO | stdout | 
2024-10-20 02:15:56 | INFO | stdout | 
2024-10-20 02:15:56 | INFO | stdout | 
2024-10-20 02:15:56 | INFO | stdout | 
2024-10-20 02:15:56 | INFO | stdout | 
2024-10-20 02:15:56 | INFO | stdout | 
2024-10-20 02:15:56 | INFO | stdout | 
2024-10-20 02:15:56 | INFO | stdout | 
2024-10-20 02:15:57 | INFO | stdout | 
2024-10-20 02:15:57 | INFO | stdout | 
2024-10-20 02:15:57 | INFO | stdout | 
2024-10-20 02:15:57 | INFO | stdout | 
2024-10-20 02:15:57 | INFO | stdout | 
2024-10-20 02:15:57 | INFO | stdout | 
2024-10-20 02:15:57 | INFO | stdout | 
2024-10-20 02:15:57 | INFO | stdout | 
2024-10-20 02:15:57 | INFO | stdout | 
2024-10-20 02:15:57 | INFO | stdout | 
2024-10-20 02:15:57 | INFO | stdout | 
2024-10-20 02:15:57 | INFO | stdout | 
2024-10-20 02:15:57 | INFO | stdout | 
2024-10-20 02:15:57 | INFO | stdout | 
2024-10-20 02:15:57 | INFO | stdout | 
2024-10-20 02:15:57 | INFO | stdout | 
2024-10-20 02:15:58 | INFO | stdout | 
2024-10-20 02:15:58 | INFO | stdout | 
2024-10-20 02:15:58 | INFO | stdout | 
2024-10-20 02:15:58 | INFO | stdout | 
2024-10-20 02:15:58 | INFO | stdout | 
2024-10-20 02:15:58 | INFO | stdout | 
2024-10-20 02:15:58 | INFO | stdout | 
2024-10-20 02:15:58 | INFO | stdout | 
2024-10-20 02:15:58 | INFO | stdout | 
2024-10-20 02:15:58 | INFO | stdout | 
2024-10-20 02:15:58 | INFO | stdout | 
2024-10-20 02:15:58 | INFO | stdout | 
2024-10-20 02:15:58 | INFO | stdout | 
2024-10-20 02:15:58 | INFO | stdout | 
2024-10-20 02:15:58 | INFO | stdout | 
2024-10-20 02:15:58 | INFO | stdout | 
2024-10-20 02:15:59 | INFO | stdout | 
2024-10-20 02:15:59 | INFO | stdout | 
2024-10-20 02:15:59 | INFO | stdout | 
2024-10-20 02:15:59 | INFO | stdout | 
2024-10-20 02:15:59 | INFO | stdout | 
2024-10-20 02:15:59 | INFO | stdout | 
2024-10-20 02:15:59 | INFO | stdout | 
2024-10-20 02:15:59 | INFO | stdout | 
2024-10-20 02:15:59 | INFO | stdout | 
2024-10-20 02:15:59 | INFO | stdout | 
2024-10-20 02:15:59 | INFO | stdout | 
2024-10-20 02:15:59 | INFO | stdout | 
2024-10-20 02:15:59 | INFO | stdout | 
2024-10-20 02:15:59 | INFO | stdout | 
2024-10-20 02:15:59 | INFO | stdout | 
2024-10-20 02:15:59 | INFO | stdout | 
2024-10-20 02:16:00 | INFO | stdout | 
2024-10-20 02:16:00 | INFO | stdout | 
2024-10-20 02:16:00 | INFO | stdout | 
2024-10-20 02:16:00 | INFO | stdout | 
2024-10-20 02:16:00 | INFO | stdout | 
2024-10-20 02:16:00 | INFO | stdout | 
2024-10-20 02:16:00 | INFO | stdout | 
2024-10-20 02:16:00 | INFO | stdout | 
2024-10-20 02:16:00 | INFO | stdout | 
2024-10-20 02:16:00 | INFO | stdout | 
2024-10-20 02:16:00 | INFO | stdout | 
2024-10-20 02:16:00 | INFO | stdout | 
2024-10-20 02:16:00 | INFO | stdout | 
2024-10-20 02:16:00 | INFO | stdout | 
2024-10-20 02:16:00 | INFO | stdout | 
2024-10-20 02:16:00 | INFO | stdout | 
2024-10-20 02:16:01 | INFO | stdout | 
2024-10-20 02:16:01 | INFO | stdout | 
2024-10-20 02:16:01 | INFO | stdout | 
2024-10-20 02:16:01 | INFO | stdout | 
2024-10-20 02:16:01 | INFO | stdout | 
2024-10-20 02:16:01 | INFO | stdout | 
2024-10-20 02:16:01 | INFO | stdout | 
2024-10-20 02:16:01 | INFO | stdout | 
2024-10-20 02:16:01 | INFO | stdout | 
2024-10-20 02:16:01 | INFO | stdout | 
2024-10-20 02:16:01 | INFO | stdout | 
2024-10-20 02:16:01 | INFO | stdout | 
2024-10-20 02:16:01 | INFO | stdout | 
2024-10-20 02:16:01 | INFO | stdout | 
2024-10-20 02:16:01 | INFO | stdout | 
2024-10-20 02:16:01 | INFO | stdout | 
2024-10-20 02:16:02 | INFO | stdout | 
2024-10-20 02:16:02 | INFO | stdout | 
2024-10-20 02:16:02 | INFO | stdout | 
2024-10-20 02:16:02 | INFO | stdout | 
2024-10-20 02:16:02 | INFO | stdout | 
2024-10-20 02:16:02 | INFO | stdout | 
2024-10-20 02:16:02 | INFO | stdout | 
2024-10-20 02:16:02 | INFO | stdout | 
2024-10-20 02:16:02 | INFO | stdout | 
2024-10-20 02:16:02 | INFO | stdout | 
2024-10-20 02:16:02 | INFO | stdout | 
2024-10-20 02:16:02 | INFO | stdout | 
2024-10-20 02:16:02 | INFO | stdout | 
2024-10-20 02:16:02 | INFO | stdout | 
2024-10-20 02:16:02 | INFO | stdout | 
2024-10-20 02:16:02 | INFO | stdout | 
2024-10-20 02:16:02 | INFO | stdout | 
2024-10-20 02:16:03 | INFO | stdout | 
2024-10-20 02:16:03 | INFO | stdout | 
2024-10-20 02:16:03 | INFO | stdout | 
2024-10-20 02:16:03 | INFO | stdout | 
2024-10-20 02:16:03 | INFO | stdout | 
2024-10-20 02:16:03 | INFO | stdout | 
2024-10-20 02:16:03 | INFO | stdout | 
2024-10-20 02:16:03 | INFO | stdout | 
2024-10-20 02:16:03 | INFO | stdout | 
2024-10-20 02:16:03 | INFO | stdout | 
2024-10-20 02:16:03 | INFO | stdout | 
2024-10-20 02:16:03 | INFO | stdout | 
2024-10-20 02:16:03 | INFO | stdout | 
2024-10-20 02:16:03 | INFO | stdout | 
2024-10-20 02:16:03 | INFO | stdout | 
2024-10-20 02:16:03 | INFO | stdout | 
2024-10-20 02:16:03 | INFO | stdout | 
2024-10-20 02:16:03 | INFO | stdout | 
2024-10-20 02:16:04 | INFO | stdout | 
2024-10-20 02:16:04 | INFO | stdout | 
2024-10-20 02:16:04 | INFO | stdout | 
2024-10-20 02:16:04 | INFO | stdout | 
2024-10-20 02:16:04 | INFO | stdout | 
2024-10-20 02:16:04 | INFO | stdout | 
2024-10-20 02:16:04 | INFO | stdout | 
2024-10-20 02:16:04 | INFO | stdout | 
2024-10-20 02:16:04 | INFO | stdout | 
2024-10-20 02:16:04 | INFO | stdout | 
2024-10-20 02:16:04 | INFO | stdout | 
2024-10-20 02:16:04 | INFO | stdout | 
2024-10-20 02:16:04 | INFO | stdout | 
2024-10-20 02:16:04 | INFO | stdout | 
2024-10-20 02:16:04 | INFO | stdout | 
2024-10-20 02:16:04 | INFO | stdout | 
2024-10-20 02:16:04 | INFO | stdout | 
2024-10-20 02:16:04 | INFO | stdout | 
2024-10-20 02:16:05 | INFO | stdout | 
2024-10-20 02:16:05 | INFO | stdout | 
2024-10-20 02:16:05 | INFO | stdout | 
2024-10-20 02:16:05 | INFO | stdout | 
2024-10-20 02:16:05 | INFO | stdout | 
2024-10-20 02:16:05 | INFO | stdout | 
2024-10-20 02:16:05 | INFO | stdout | 
2024-10-20 02:16:05 | INFO | stdout | 
2024-10-20 02:16:05 | INFO | stdout | 
2024-10-20 02:16:05 | INFO | stdout | 
2024-10-20 02:16:05 | INFO | stdout | 
2024-10-20 02:16:05 | INFO | stdout | 
2024-10-20 02:16:05 | INFO | stdout | 
2024-10-20 02:16:05 | INFO | stdout | 
2024-10-20 02:16:05 | INFO | stdout | 
2024-10-20 02:16:05 | INFO | stdout | 
2024-10-20 02:16:05 | INFO | stdout | 
2024-10-20 02:16:05 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:06 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:07 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:08 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:09 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:10 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:11 | INFO | stdout | 
2024-10-20 02:16:12 | INFO | stdout | 
2024-10-20 02:16:12 | INFO | stdout | 
2024-10-20 02:16:12 | INFO | stdout | 
2024-10-20 02:16:12 | INFO | stdout | 
2024-10-20 02:16:12 | INFO | stdout | 
2024-10-20 02:16:12 | INFO | stdout | 
2024-10-20 02:16:12 | INFO | stdout | 
2024-10-20 02:16:12 | INFO | stdout | 
2024-10-20 02:16:12 | INFO | stdout | 
2024-10-20 02:16:12 | INFO | stdout | 
2024-10-20 02:16:12 | INFO | stdout | 
2024-10-20 02:16:13 | INFO | stdout | 
2024-10-20 02:16:13 | INFO | stdout | 
2024-10-20 02:16:13 | INFO | stdout | 
2024-10-20 02:16:13 | INFO | stdout | 
2024-10-20 02:16:13 | INFO | stdout | 
2024-10-20 02:18:03 | INFO | stdout | 
2024-10-20 02:18:04 | INFO | stdout | 
2024-10-20 02:18:04 | INFO | stdout | 
2024-10-20 02:18:04 | INFO | stdout | 
2024-10-20 02:18:04 | INFO | stdout | 
2024-10-20 02:18:04 | INFO | stdout | 
2024-10-20 02:18:04 | INFO | stdout | 
2024-10-20 02:18:04 | INFO | stdout | 
2024-10-20 02:18:04 | INFO | stdout | 
2024-10-20 02:18:05 | INFO | stdout | 
2024-10-20 02:18:05 | INFO | stdout | 
2024-10-20 02:18:05 | INFO | stdout | 
2024-10-20 02:18:05 | INFO | stdout | 
2024-10-20 02:18:05 | INFO | stdout | 
2024-10-20 02:18:05 | INFO | stdout | 
2024-10-20 02:18:05 | INFO | stdout | 
2024-10-20 02:18:05 | INFO | stdout | 
2024-10-20 02:18:06 | INFO | stdout | 
2024-10-20 02:18:06 | INFO | stdout | 
2024-10-20 02:18:06 | INFO | stdout | 
2024-10-20 02:18:06 | INFO | stdout | 
2024-10-20 02:18:06 | INFO | stdout | 
2024-10-20 02:18:06 | INFO | stdout | 
2024-10-20 02:18:06 | INFO | stdout | 
2024-10-20 02:18:06 | INFO | stdout | 
2024-10-20 02:18:06 | INFO | stdout | 
2024-10-20 02:18:07 | INFO | stdout | 
2024-10-20 02:18:07 | INFO | stdout | 
2024-10-20 02:18:07 | INFO | stdout | 
2024-10-20 02:18:07 | INFO | stdout | 
2024-10-20 02:18:07 | INFO | stdout | 
2024-10-20 02:18:07 | INFO | stdout | 
2024-10-20 02:18:07 | INFO | stdout | 
2024-10-20 02:18:07 | INFO | stdout | 
2024-10-20 02:18:08 | INFO | stdout | 
2024-10-20 02:18:08 | INFO | stdout | 
2024-10-20 02:18:08 | INFO | stdout | 
2024-10-20 02:18:08 | INFO | stdout | 
2024-10-20 02:18:08 | INFO | stdout | 
2024-10-20 02:18:08 | INFO | stdout | 
2024-10-20 02:18:08 | INFO | stdout | 
2024-10-20 02:18:08 | INFO | stdout | 
2024-10-20 02:18:09 | INFO | stdout | 
2024-10-20 02:18:09 | INFO | stdout | 
2024-10-20 02:18:09 | INFO | stdout | 
2024-10-20 02:18:09 | INFO | stdout | 
2024-10-20 02:18:09 | INFO | stdout | 
2024-10-20 02:18:09 | INFO | stdout | 
2024-10-20 02:18:09 | INFO | stdout | 
2024-10-20 02:18:09 | INFO | stdout | 
2024-10-20 02:18:09 | INFO | stdout | 
2024-10-20 02:18:10 | INFO | stdout | 
2024-10-20 02:18:10 | INFO | stdout | 
2024-10-20 02:18:10 | INFO | stdout | 
2024-10-20 02:18:10 | INFO | stdout | 
2024-10-20 02:18:10 | INFO | stdout | 
2024-10-20 02:18:10 | INFO | stdout | 
2024-10-20 02:18:10 | INFO | stdout | 
2024-10-20 02:18:10 | INFO | stdout | 
2024-10-20 02:18:11 | INFO | stdout | 
2024-10-20 02:18:11 | INFO | stdout | 
2024-10-20 02:18:11 | INFO | stdout | 
2024-10-20 02:18:11 | INFO | stdout | 
2024-10-20 02:18:11 | INFO | stdout | 
2024-10-20 02:18:11 | INFO | stdout | 
2024-10-20 02:18:11 | INFO | stdout | 
2024-10-20 02:18:11 | INFO | stdout | 
2024-10-20 02:18:12 | INFO | stdout | 
2024-10-20 02:18:12 | INFO | stdout | 
2024-10-20 02:18:12 | INFO | stdout | 
2024-10-20 02:18:12 | INFO | stdout | 
2024-10-20 02:18:12 | INFO | stdout | 
2024-10-20 02:18:12 | INFO | stdout | 
2024-10-20 02:18:12 | INFO | stdout | 
2024-10-20 02:18:12 | INFO | stdout | 
2024-10-20 02:18:13 | INFO | stdout | 
2024-10-20 02:18:13 | INFO | stdout | 
2024-10-20 02:18:13 | INFO | stdout | 
2024-10-20 02:18:13 | INFO | stdout | 
2024-10-20 02:18:13 | INFO | stdout | 
2024-10-20 02:18:13 | INFO | stdout | 
2024-10-20 02:18:13 | INFO | stdout | 
2024-10-20 02:18:13 | INFO | stdout | 
2024-10-20 02:18:14 | INFO | stdout | 
2024-10-20 02:18:14 | INFO | stdout | 
2024-10-20 02:18:14 | INFO | stdout | 
2024-10-20 02:18:14 | INFO | stdout | 
2024-10-20 02:18:14 | INFO | stdout | 
2024-10-20 02:18:14 | INFO | stdout | 
2024-10-20 02:18:14 | INFO | stdout | 
2024-10-20 02:18:14 | INFO | stdout | 
2024-10-20 02:18:14 | INFO | stdout | 
2024-10-20 02:18:15 | INFO | stdout | 
2024-10-20 02:18:15 | INFO | stdout | 
2024-10-20 02:18:15 | INFO | stdout | 
2024-10-20 02:18:15 | INFO | stdout | 
2024-10-20 02:18:15 | INFO | stdout | 
2024-10-20 02:18:15 | INFO | stdout | 
2024-10-20 02:18:15 | INFO | stdout | 
2024-10-20 02:18:16 | INFO | stdout | 
2024-10-20 02:18:16 | INFO | stdout | 
2024-10-20 02:18:16 | INFO | stdout | 
2024-10-20 02:18:16 | INFO | stdout | 
2024-10-20 02:18:16 | INFO | stdout | 
2024-10-20 02:18:16 | INFO | stdout | 
2024-10-20 02:18:16 | INFO | stdout | 
2024-10-20 02:18:16 | INFO | stdout | 
2024-10-20 02:18:16 | INFO | stdout | 
2024-10-20 02:18:17 | INFO | stdout | 
2024-10-20 02:18:17 | INFO | stdout | 
2024-10-20 02:18:17 | INFO | stdout | 
2024-10-20 02:18:17 | INFO | stdout | 
2024-10-20 02:18:17 | INFO | stdout | 
2024-10-20 02:18:17 | INFO | stdout | 
2024-10-20 02:18:17 | INFO | stdout | 
2024-10-20 02:18:17 | INFO | stdout | 
2024-10-20 02:18:17 | INFO | stdout | 
2024-10-20 02:18:18 | INFO | stdout | 
2024-10-20 02:18:18 | INFO | stdout | 
2024-10-20 02:18:18 | INFO | stdout | 
2024-10-20 02:18:18 | INFO | stdout | 
2024-10-20 02:18:18 | INFO | stdout | 
2024-10-20 02:18:18 | INFO | stdout | 
2024-10-20 02:18:18 | INFO | stdout | 
2024-10-20 02:18:18 | INFO | stdout | 
2024-10-20 02:18:18 | INFO | stdout | 
2024-10-20 02:18:19 | INFO | stdout | 
2024-10-20 02:18:19 | INFO | stdout | 
2024-10-20 02:18:19 | INFO | stdout | 
2024-10-20 02:18:19 | INFO | stdout | 
2024-10-20 02:18:19 | INFO | stdout | 
2024-10-20 02:18:19 | INFO | stdout | 
2024-10-20 02:18:19 | INFO | stdout | 
2024-10-20 02:18:19 | INFO | stdout | 
2024-10-20 02:18:20 | INFO | stdout | 
2024-10-20 02:18:20 | INFO | stdout | 
2024-10-20 02:18:20 | INFO | stdout | 
2024-10-20 02:18:20 | INFO | stdout | 
2024-10-20 02:18:20 | INFO | stdout | 
2024-10-20 02:18:20 | INFO | stdout | 
2024-10-20 02:18:20 | INFO | stdout | 
2024-10-20 02:18:20 | INFO | stdout | 
2024-10-20 02:18:20 | INFO | stdout | 
2024-10-20 02:18:21 | INFO | stdout | 
2024-10-20 02:18:21 | INFO | stdout | 
2024-10-20 02:18:21 | INFO | stdout | 
2024-10-20 02:18:21 | INFO | stdout | 
2024-10-20 02:18:21 | INFO | stdout | 
2024-10-20 02:18:21 | INFO | stdout | 
2024-10-20 02:18:21 | INFO | stdout | 
2024-10-20 02:18:21 | INFO | stdout | 
2024-10-20 02:18:22 | INFO | stdout | 
2024-10-20 02:18:22 | INFO | stdout | 
2024-10-20 02:18:22 | INFO | stdout | 
2024-10-20 02:18:22 | INFO | stdout | 
2024-10-20 02:18:22 | INFO | stdout | 
2024-10-20 02:18:22 | INFO | stdout | 
2024-10-20 02:18:22 | INFO | stdout | 
2024-10-20 02:18:22 | INFO | stdout | 
2024-10-20 02:18:22 | INFO | stdout | 
2024-10-20 02:18:23 | INFO | stdout | 
2024-10-20 02:18:23 | INFO | stdout | 
2024-10-20 02:18:23 | INFO | stdout | 
2024-10-20 02:18:23 | INFO | stdout | 
2024-10-20 02:18:23 | INFO | stdout | 
2024-10-20 02:18:23 | INFO | stdout | 
2024-10-20 02:18:23 | INFO | stdout | 
2024-10-20 02:18:23 | INFO | stdout | 
2024-10-20 02:18:23 | INFO | stdout | 
2024-10-20 02:18:24 | INFO | stdout | 
2024-10-20 02:18:24 | INFO | stdout | 
2024-10-20 02:18:24 | INFO | stdout | 
2024-10-20 02:18:24 | INFO | stdout | 
2024-10-20 02:18:24 | INFO | stdout | 
2024-10-20 02:18:24 | INFO | stdout | 
2024-10-20 02:18:24 | INFO | stdout | 
2024-10-20 02:18:24 | INFO | stdout | 
2024-10-20 02:18:25 | INFO | stdout | 
2024-10-20 02:18:25 | INFO | stdout | 
2024-10-20 02:18:25 | INFO | stdout | 
2024-10-20 02:18:25 | INFO | stdout | 
2024-10-20 02:18:25 | INFO | stdout | 
2024-10-20 02:18:25 | INFO | stdout | 
2024-10-20 02:18:25 | INFO | stdout | 
2024-10-20 02:18:25 | INFO | stdout | 
2024-10-20 02:18:25 | INFO | stdout | 
2024-10-20 02:18:26 | INFO | stdout | 
2024-10-20 02:18:26 | INFO | stdout | 
2024-10-20 02:18:26 | INFO | stdout | 
2024-10-20 02:18:26 | INFO | stdout | 
2024-10-20 02:18:26 | INFO | stdout | 
2024-10-20 02:18:26 | INFO | stdout | 
2024-10-20 02:18:26 | INFO | stdout | 
2024-10-20 02:18:26 | INFO | stdout | 
2024-10-20 02:18:26 | INFO | stdout | 
2024-10-20 02:18:27 | INFO | stdout | 
2024-10-20 02:18:27 | INFO | stdout | 
2024-10-20 02:18:27 | INFO | stdout | 
2024-10-20 02:18:27 | INFO | stdout | 
2024-10-20 02:18:27 | INFO | stdout | 
2024-10-20 02:18:27 | INFO | stdout | 
2024-10-20 02:18:27 | INFO | stdout | 
2024-10-20 02:18:27 | INFO | stdout | 
2024-10-20 02:18:27 | INFO | stdout | 
2024-10-20 02:18:28 | INFO | stdout | 
2024-10-20 02:18:28 | INFO | stdout | 
2024-10-20 02:18:28 | INFO | stdout | 
2024-10-20 02:18:28 | INFO | stdout | 
2024-10-20 02:18:28 | INFO | stdout | 
2024-10-20 02:18:28 | INFO | stdout | 
2024-10-20 02:18:28 | INFO | stdout | 
2024-10-20 02:18:28 | INFO | stdout | 
2024-10-20 02:18:28 | INFO | stdout | 
2024-10-20 02:18:29 | INFO | stdout | 
2024-10-20 02:18:29 | INFO | stdout | 
2024-10-20 02:18:29 | INFO | stdout | 
2024-10-20 02:18:29 | INFO | stdout | 
2024-10-20 02:18:29 | INFO | stdout | 
2024-10-20 02:18:29 | INFO | stdout | 
2024-10-20 02:18:29 | INFO | stdout | 
2024-10-20 02:18:29 | INFO | stdout | 
2024-10-20 02:18:29 | INFO | stdout | 
2024-10-20 02:18:30 | INFO | stdout | 
2024-10-20 02:18:30 | INFO | stdout | 
2024-10-20 02:18:30 | INFO | stdout | 
2024-10-20 02:18:30 | INFO | stdout | 
2024-10-20 02:18:30 | INFO | stdout | 
2024-10-20 02:18:30 | INFO | stdout | 
2024-10-20 02:18:30 | INFO | stdout | 
2024-10-20 02:18:30 | INFO | stdout | 
2024-10-20 02:18:30 | INFO | stdout | 
2024-10-20 02:18:31 | INFO | stdout | 
2024-10-20 02:18:31 | INFO | stdout | 
2024-10-20 02:18:31 | INFO | stdout | 
2024-10-20 02:18:31 | INFO | stdout | 
2024-10-20 02:18:31 | INFO | stdout | 
2024-10-20 02:18:31 | INFO | stdout | 
2024-10-20 02:18:31 | INFO | stdout | 
2024-10-20 02:18:31 | INFO | stdout | 
2024-10-20 02:18:32 | INFO | stdout | 
2024-10-20 02:18:32 | INFO | stdout | 
2024-10-20 02:18:32 | INFO | stdout | 
2024-10-20 02:18:32 | INFO | stdout | 
2024-10-20 02:18:32 | INFO | stdout | 
2024-10-20 02:18:32 | INFO | stdout | 
2024-10-20 02:18:32 | INFO | stdout | 
2024-10-20 02:18:32 | INFO | stdout | 
2024-10-20 02:18:32 | INFO | stdout | 
2024-10-20 02:18:33 | INFO | stdout | 
2024-10-20 02:18:33 | INFO | stdout | 
2024-10-20 02:18:33 | INFO | stdout | 
2024-10-20 02:18:33 | INFO | stdout | 
2024-10-20 02:18:33 | INFO | stdout | 
2024-10-20 02:18:33 | INFO | stdout | 
2024-10-20 02:18:33 | INFO | stdout | 
2024-10-20 02:18:33 | INFO | stdout | 
2024-10-20 02:18:34 | INFO | stdout | 
2024-10-20 02:18:34 | INFO | stdout | 
2024-10-20 02:18:34 | INFO | stdout | 
2024-10-20 02:18:34 | INFO | stdout | 
2024-10-20 02:18:34 | INFO | stdout | 
2024-10-20 02:18:34 | INFO | stdout | 
2024-10-20 02:18:34 | INFO | stdout | 
2024-10-20 02:18:34 | INFO | stdout | 
2024-10-20 02:18:35 | INFO | stdout | 
2024-10-20 02:18:35 | INFO | stdout | 
2024-10-20 02:18:35 | INFO | stdout | 
2024-10-20 02:18:35 | INFO | stdout | 
2024-10-20 02:18:35 | INFO | stdout | 
2024-10-20 02:18:35 | INFO | stdout | 
2024-10-20 02:18:35 | INFO | stdout | 
2024-10-20 02:18:35 | INFO | stdout | 
2024-10-20 02:18:35 | INFO | stdout | 
2024-10-20 02:18:35 | INFO | stdout | 
2024-10-20 02:18:36 | INFO | stdout | 
2024-10-20 02:18:36 | INFO | stdout | 
2024-10-20 02:18:36 | INFO | stdout | 
2024-10-20 02:18:36 | INFO | stdout | 
2024-10-20 02:18:36 | INFO | stdout | 
2024-10-20 02:18:36 | INFO | stdout | 
2024-10-20 02:18:36 | INFO | stdout | 
2024-10-20 02:18:36 | INFO | stdout | 
2024-10-20 02:18:36 | INFO | stdout | 
2024-10-20 02:18:37 | INFO | stdout | 
2024-10-20 02:18:37 | INFO | stdout | 
2024-10-20 02:18:37 | INFO | stdout | 
2024-10-20 02:18:37 | INFO | stdout | 
2024-10-20 02:18:37 | INFO | stdout | 
2024-10-20 02:18:37 | INFO | stdout | 
2024-10-20 02:18:37 | INFO | stdout | 
2024-10-20 02:18:37 | INFO | stdout | 
2024-10-20 02:18:37 | INFO | stdout | 
2024-10-20 02:18:38 | INFO | stdout | 
2024-10-20 02:18:38 | INFO | stdout | 
2024-10-20 02:18:38 | INFO | stdout | 
2024-10-20 02:18:38 | INFO | stdout | 
2024-10-20 02:18:38 | INFO | stdout | 
2024-10-20 02:18:38 | INFO | stdout | 
2024-10-20 02:18:38 | INFO | stdout | 
2024-10-20 02:18:38 | INFO | stdout | 
2024-10-20 02:18:38 | INFO | stdout | 
2024-10-20 02:18:38 | INFO | stdout | 
2024-10-20 02:18:39 | INFO | stdout | 
2024-10-20 02:18:39 | INFO | stdout | 
2024-10-20 02:18:39 | INFO | stdout | 
2024-10-20 02:18:39 | INFO | stdout | 
2024-10-20 02:18:39 | INFO | stdout | 
2024-10-20 02:18:39 | INFO | stdout | 
2024-10-20 02:18:39 | INFO | stdout | 
2024-10-20 02:18:39 | INFO | stdout | 
2024-10-20 02:18:39 | INFO | stdout | 
2024-10-20 02:18:40 | INFO | stdout | 
2024-10-20 02:18:40 | INFO | stdout | 
2024-10-20 02:18:40 | INFO | stdout | 
2024-10-20 02:18:40 | INFO | stdout | 
2024-10-20 02:18:40 | INFO | stdout | 
2024-10-20 02:18:40 | INFO | stdout | 
2024-10-20 02:18:40 | INFO | stdout | 
2024-10-20 02:18:40 | INFO | stdout | 
2024-10-20 02:18:40 | INFO | stdout | 
2024-10-20 02:18:40 | INFO | stdout | 
2024-10-20 02:18:41 | INFO | stdout | 
2024-10-20 02:18:41 | INFO | stdout | 
2024-10-20 02:18:41 | INFO | stdout | 
2024-10-20 02:18:41 | INFO | stdout | 
2024-10-20 02:18:41 | INFO | stdout | 
2024-10-20 02:18:41 | INFO | stdout | 
2024-10-20 02:18:41 | INFO | stdout | 
2024-10-20 02:18:41 | INFO | stdout | 
2024-10-20 02:18:41 | INFO | stdout | 
2024-10-20 02:18:41 | INFO | stdout | 
2024-10-20 02:18:42 | INFO | stdout | 
2024-10-20 02:18:42 | INFO | stdout | 
2024-10-20 02:18:42 | INFO | stdout | 
2024-10-20 02:18:42 | INFO | stdout | 
2024-10-20 02:18:42 | INFO | stdout | 
2024-10-20 02:18:42 | INFO | stdout | 
2024-10-20 02:18:42 | INFO | stdout | 
2024-10-20 02:18:42 | INFO | stdout | 
2024-10-20 02:18:42 | INFO | stdout | 
2024-10-20 02:18:42 | INFO | stdout | 
2024-10-20 02:18:43 | INFO | stdout | 
2024-10-20 02:18:43 | INFO | stdout | 
2024-10-20 02:18:43 | INFO | stdout | 
2024-10-20 02:18:43 | INFO | stdout | 
2024-10-20 02:18:43 | INFO | stdout | 
2024-10-20 02:18:43 | INFO | stdout | 
2024-10-20 02:18:43 | INFO | stdout | 
2024-10-20 02:18:43 | INFO | stdout | 
2024-10-20 02:18:43 | INFO | stdout | 
2024-10-20 02:18:44 | INFO | stdout | 
2024-10-20 02:18:44 | INFO | stdout | 
2024-10-20 02:18:44 | INFO | stdout | 
2024-10-20 02:18:44 | INFO | stdout | 
2024-10-20 02:18:44 | INFO | stdout | 
2024-10-20 02:18:44 | INFO | stdout | 
2024-10-20 02:18:44 | INFO | stdout | 
2024-10-20 02:18:44 | INFO | stdout | 
2024-10-20 02:18:44 | INFO | stdout | 
2024-10-20 02:18:44 | INFO | stdout | 
2024-10-20 02:18:45 | INFO | stdout | 
2024-10-20 02:18:45 | INFO | stdout | 
2024-10-20 02:18:45 | INFO | stdout | 
2024-10-20 02:18:45 | INFO | stdout | 
2024-10-20 02:18:45 | INFO | stdout | 
2024-10-20 02:18:45 | INFO | stdout | 
2024-10-20 02:18:45 | INFO | stdout | 
2024-10-20 02:18:45 | INFO | stdout | 
2024-10-20 02:18:45 | INFO | stdout | 
2024-10-20 02:18:45 | INFO | stdout | 
2024-10-20 02:18:46 | INFO | stdout | 
2024-10-20 02:18:46 | INFO | stdout | 
2024-10-20 02:18:46 | INFO | stdout | 
2024-10-20 02:18:46 | INFO | stdout | 
2024-10-20 02:18:46 | INFO | stdout | 
2024-10-20 02:18:46 | INFO | stdout | 
2024-10-20 02:18:46 | INFO | stdout | 
2024-10-20 02:18:46 | INFO | stdout | 
2024-10-20 02:18:46 | INFO | stdout | 
2024-10-20 02:18:46 | INFO | stdout | 
2024-10-20 02:18:47 | INFO | stdout | 
2024-10-20 02:18:47 | INFO | stdout | 
2024-10-20 02:18:47 | INFO | stdout | 
2024-10-20 02:18:47 | INFO | stdout | 
2024-10-20 02:18:47 | INFO | stdout | 
2024-10-20 02:18:47 | INFO | stdout | 
2024-10-20 02:18:47 | INFO | stdout | 
2024-10-20 02:18:47 | INFO | stdout | 
2024-10-20 02:18:47 | INFO | stdout | 
2024-10-20 02:18:47 | INFO | stdout | 
2024-10-20 02:18:48 | INFO | stdout | 
2024-10-20 02:18:48 | INFO | stdout | 
2024-10-20 02:18:48 | INFO | stdout | 
2024-10-20 02:18:48 | INFO | stdout | 
2024-10-20 02:18:48 | INFO | stdout | 
2024-10-20 02:18:48 | INFO | stdout | 
2024-10-20 02:18:48 | INFO | stdout | 
2024-10-20 02:18:48 | INFO | stdout | 
2024-10-20 02:18:48 | INFO | stdout | 
2024-10-20 02:18:48 | INFO | stdout | 
2024-10-20 02:18:49 | INFO | stdout | 
2024-10-20 02:18:49 | INFO | stdout | 
2024-10-20 02:18:49 | INFO | stdout | 
2024-10-20 02:18:49 | INFO | stdout | 
2024-10-20 02:18:49 | INFO | stdout | 
2024-10-20 02:18:49 | INFO | stdout | 
2024-10-20 02:18:49 | INFO | stdout | 
2024-10-20 02:18:49 | INFO | stdout | 
2024-10-20 02:18:49 | INFO | stdout | 
2024-10-20 02:18:49 | INFO | stdout | 
2024-10-20 02:18:50 | INFO | stdout | 
2024-10-20 02:18:50 | INFO | stdout | 
2024-10-20 02:18:50 | INFO | stdout | 
2024-10-20 02:18:50 | INFO | stdout | 
2024-10-20 02:18:50 | INFO | stdout | 
2024-10-20 02:18:50 | INFO | stdout | 
2024-10-20 02:18:50 | INFO | stdout | 
2024-10-20 02:18:50 | INFO | stdout | 
2024-10-20 02:18:50 | INFO | stdout | 
2024-10-20 02:18:50 | INFO | stdout | 
2024-10-20 02:18:50 | INFO | stdout | 
2024-10-20 02:18:51 | INFO | stdout | 
2024-10-20 02:18:51 | INFO | stdout | 
2024-10-20 02:18:51 | INFO | stdout | 
2024-10-20 02:18:51 | INFO | stdout | 
2024-10-20 02:18:51 | INFO | stdout | 
2024-10-20 02:18:51 | INFO | stdout | 
2024-10-20 02:18:51 | INFO | stdout | 
2024-10-20 02:18:51 | INFO | stdout | 
2024-10-20 02:18:51 | INFO | stdout | 
2024-10-20 02:18:52 | INFO | stdout | 
2024-10-20 02:18:52 | INFO | stdout | 
2024-10-20 02:18:52 | INFO | stdout | 
2024-10-20 02:18:52 | INFO | stdout | 
2024-10-20 02:18:52 | INFO | stdout | 
2024-10-20 02:18:52 | INFO | stdout | 
2024-10-20 02:18:52 | INFO | stdout | 
2024-10-20 02:18:52 | INFO | stdout | 
2024-10-20 02:18:52 | INFO | stdout | 
2024-10-20 02:18:52 | INFO | stdout | 
2024-10-20 02:18:52 | INFO | stdout | 
2024-10-20 02:18:53 | INFO | stdout | 
2024-10-20 02:18:53 | INFO | stdout | 
2024-10-20 02:18:53 | INFO | stdout | 
2024-10-20 02:18:53 | INFO | stdout | 
2024-10-20 02:18:53 | INFO | stdout | 
2024-10-20 02:18:53 | INFO | stdout | 
2024-10-20 02:18:53 | INFO | stdout | 
2024-10-20 02:18:53 | INFO | stdout | 
2024-10-20 02:18:53 | INFO | stdout | 
2024-10-20 02:18:53 | INFO | stdout | 
2024-10-20 02:18:54 | INFO | stdout | 
2024-10-20 02:18:54 | INFO | stdout | 
2024-10-20 02:18:54 | INFO | stdout | 
2024-10-20 02:18:54 | INFO | stdout | 
2024-10-20 02:18:54 | INFO | stdout | 
2024-10-20 02:18:54 | INFO | stdout | 
2024-10-20 02:18:54 | INFO | stdout | 
2024-10-20 02:18:54 | INFO | stdout | 
2024-10-20 02:18:54 | INFO | stdout | 
2024-10-20 02:18:54 | INFO | stdout | 
2024-10-20 02:18:55 | INFO | stdout | 
2024-10-20 02:18:55 | INFO | stdout | 
2024-10-20 02:18:55 | INFO | stdout | 
2024-10-20 02:18:55 | INFO | stdout | 
2024-10-20 02:18:55 | INFO | stdout | 
2024-10-20 02:18:55 | INFO | stdout | 
2024-10-20 02:18:55 | INFO | stdout | 
2024-10-20 02:18:55 | INFO | stdout | 
2024-10-20 02:18:55 | INFO | stdout | 
2024-10-20 02:18:55 | INFO | stdout | 
2024-10-20 02:18:55 | INFO | stdout | 
2024-10-20 02:18:56 | INFO | stdout | 
2024-10-20 02:18:56 | INFO | stdout | 
2024-10-20 02:18:56 | INFO | stdout | 
2024-10-20 02:18:56 | INFO | stdout | 
2024-10-20 02:18:56 | INFO | stdout | 
2024-10-20 02:18:56 | INFO | stdout | 
2024-10-20 02:18:56 | INFO | stdout | 
2024-10-20 02:18:56 | INFO | stdout | 
2024-10-20 02:18:56 | INFO | stdout | 
2024-10-20 02:18:56 | INFO | stdout | 
2024-10-20 02:18:57 | INFO | stdout | 
2024-10-20 02:18:57 | INFO | stdout | 
2024-10-20 02:18:57 | INFO | stdout | 
2024-10-20 02:18:57 | INFO | stdout | 
2024-10-20 02:18:57 | INFO | stdout | 
2024-10-20 02:18:57 | INFO | stdout | 
2024-10-20 02:18:57 | INFO | stdout | 
2024-10-20 02:18:57 | INFO | stdout | 
2024-10-20 02:18:57 | INFO | stdout | 
2024-10-20 02:18:57 | INFO | stdout | 
2024-10-20 02:18:57 | INFO | stdout | 
2024-10-20 02:18:58 | INFO | stdout | 
2024-10-20 02:18:58 | INFO | stdout | 
2024-10-20 02:18:58 | INFO | stdout | 
2024-10-20 02:18:58 | INFO | stdout | 
2024-10-20 02:18:58 | INFO | stdout | 
2024-10-20 02:18:58 | INFO | stdout | 
2024-10-20 02:18:58 | INFO | stdout | 
2024-10-20 02:18:58 | INFO | stdout | 
2024-10-20 02:18:58 | INFO | stdout | 
2024-10-20 02:18:58 | INFO | stdout | 
2024-10-20 02:18:58 | INFO | stdout | 
2024-10-20 02:18:59 | INFO | stdout | 
2024-10-20 02:18:59 | INFO | stdout | 
2024-10-20 02:18:59 | INFO | stdout | 
2024-10-20 02:18:59 | INFO | stdout | 
2024-10-20 02:18:59 | INFO | stdout | 
2024-10-20 02:18:59 | INFO | stdout | 
2024-10-20 02:18:59 | INFO | stdout | 
2024-10-20 02:18:59 | INFO | stdout | 
2024-10-20 02:18:59 | INFO | stdout | 
2024-10-20 02:18:59 | INFO | stdout | 
2024-10-20 02:18:59 | INFO | stdout | 
2024-10-20 02:19:00 | INFO | stdout | 
2024-10-20 02:19:00 | INFO | stdout | 
2024-10-20 02:19:00 | INFO | stdout | 
2024-10-20 02:19:00 | INFO | stdout | 
2024-10-20 02:19:00 | INFO | stdout | 
2024-10-20 02:19:00 | INFO | stdout | 
2024-10-20 02:19:00 | INFO | stdout | 
2024-10-20 02:19:00 | INFO | stdout | 
2024-10-20 02:19:00 | INFO | stdout | 
2024-10-20 02:19:00 | INFO | stdout | 
2024-10-20 02:19:00 | INFO | stdout | 
2024-10-20 02:19:01 | INFO | stdout | 
2024-10-20 02:19:01 | INFO | stdout | 
2024-10-20 02:19:01 | INFO | stdout | 
2024-10-20 02:19:01 | INFO | stdout | 
2024-10-20 02:19:01 | INFO | stdout | 
2024-10-20 02:19:01 | INFO | stdout | 
2024-10-20 02:19:01 | INFO | stdout | 
2024-10-20 02:19:01 | INFO | stdout | 
2024-10-20 02:19:01 | INFO | stdout | 
2024-10-20 02:19:01 | INFO | stdout | 
2024-10-20 02:19:02 | INFO | stdout | 
2024-10-20 02:19:02 | INFO | stdout | 
2024-10-20 02:19:02 | INFO | stdout | 
2024-10-20 02:19:02 | INFO | stdout | 
2024-10-20 02:19:02 | INFO | stdout | 
2024-10-20 02:19:02 | INFO | stdout | 
2024-10-20 02:19:02 | INFO | stdout | 
2024-10-20 02:19:02 | INFO | stdout | 
2024-10-20 02:19:02 | INFO | stdout | 
2024-10-20 02:19:02 | INFO | stdout | 
2024-10-20 02:19:02 | INFO | stdout | 
2024-10-20 02:19:03 | INFO | stdout | 
2024-10-20 02:19:03 | INFO | stdout | 
2024-10-20 02:19:03 | INFO | stdout | 
2024-10-20 02:19:03 | INFO | stdout | 
2024-10-20 02:19:03 | INFO | stdout | 
2024-10-20 02:19:03 | INFO | stdout | 
2024-10-20 02:19:03 | INFO | stdout | 
2024-10-20 02:19:03 | INFO | stdout | 
2024-10-20 02:19:03 | INFO | stdout | 
2024-10-20 02:19:03 | INFO | stdout | 
2024-10-20 02:19:04 | INFO | stdout | 
2024-10-20 02:19:04 | INFO | stdout | 
2024-10-20 02:19:04 | INFO | stdout | 
2024-10-20 02:19:04 | INFO | stdout | 
2024-10-20 02:19:04 | INFO | stdout | 
2024-10-20 02:19:04 | INFO | stdout | 
2024-10-20 02:19:04 | INFO | stdout | 
2024-10-20 02:19:04 | INFO | stdout | 
2024-10-20 02:19:04 | INFO | stdout | 
2024-10-20 02:19:04 | INFO | stdout | 
2024-10-20 02:19:05 | INFO | stdout | 
2024-10-20 02:19:05 | INFO | stdout | 
2024-10-20 02:19:05 | INFO | stdout | 
2024-10-20 02:19:05 | INFO | stdout | 
2024-10-20 02:19:05 | INFO | stdout | 
2024-10-20 02:19:05 | INFO | stdout | 
2024-10-20 02:19:05 | INFO | stdout | 
2024-10-20 02:19:05 | INFO | stdout | 
2024-10-20 02:19:05 | INFO | stdout | 
2024-10-20 02:19:05 | INFO | stdout | 
2024-10-20 02:19:05 | INFO | stdout | 
2024-10-20 02:19:06 | INFO | stdout | 
2024-10-20 02:19:06 | INFO | stdout | 
2024-10-20 02:19:06 | INFO | stdout | 
2024-10-20 02:19:06 | INFO | stdout | 
2024-10-20 02:19:06 | INFO | stdout | 
2024-10-20 02:19:06 | INFO | stdout | 
2024-10-20 02:19:06 | INFO | stdout | 
2024-10-20 02:19:06 | INFO | stdout | 
2024-10-20 02:19:06 | INFO | stdout | 
2024-10-20 02:19:06 | INFO | stdout | 
2024-10-20 02:19:07 | INFO | stdout | 
2024-10-20 02:19:07 | INFO | stdout | 
2024-10-20 02:19:07 | INFO | stdout | 
2024-10-20 02:19:07 | INFO | stdout | 
2024-10-20 02:19:07 | INFO | stdout | 
2024-10-20 02:19:07 | INFO | stdout | 
2024-10-20 02:19:07 | INFO | stdout | 
2024-10-20 02:19:07 | INFO | stdout | 
2024-10-20 02:19:07 | INFO | stdout | 
2024-10-20 02:19:07 | INFO | stdout | 
2024-10-20 02:19:07 | INFO | stdout | 
2024-10-20 02:19:08 | INFO | stdout | 
2024-10-20 02:19:08 | INFO | stdout | 
2024-10-20 02:19:08 | INFO | stdout | 
2024-10-20 02:19:08 | INFO | stdout | 
2024-10-20 02:19:08 | INFO | stdout | 
2024-10-20 02:19:08 | INFO | stdout | 
2024-10-20 02:19:08 | INFO | stdout | 
2024-10-20 02:19:08 | INFO | stdout | 
2024-10-20 02:19:08 | INFO | stdout | 
2024-10-20 02:19:08 | INFO | stdout | 
2024-10-20 02:19:08 | INFO | stdout | 
2024-10-20 02:19:08 | INFO | stdout | 
2024-10-20 02:19:09 | INFO | stdout | 
2024-10-20 02:19:09 | INFO | stdout | 
2024-10-20 02:19:09 | INFO | stdout | 
2024-10-20 02:19:09 | INFO | stdout | 
2024-10-20 02:19:09 | INFO | stdout | 
2024-10-20 02:19:09 | INFO | stdout | 
2024-10-20 02:19:09 | INFO | stdout | 
2024-10-20 02:19:09 | INFO | stdout | 
2024-10-20 02:19:09 | INFO | stdout | 
2024-10-20 02:19:09 | INFO | stdout | 
2024-10-20 02:19:09 | INFO | stdout | 
2024-10-20 02:19:10 | INFO | stdout | 
2024-10-20 02:19:10 | INFO | stdout | 
2024-10-20 02:19:10 | INFO | stdout | 
2024-10-20 02:19:10 | INFO | stdout | 
2024-10-20 02:19:10 | INFO | stdout | 
2024-10-20 02:19:10 | INFO | stdout | 
2024-10-20 02:19:10 | INFO | stdout | 
2024-10-20 02:19:10 | INFO | stdout | 
2024-10-20 02:19:10 | INFO | stdout | 
2024-10-20 02:19:10 | INFO | stdout | 
2024-10-20 02:19:10 | INFO | stdout | 
2024-10-20 02:19:10 | INFO | stdout | 
2024-10-20 02:19:11 | INFO | stdout | 
2024-10-20 02:19:11 | INFO | stdout | 
2024-10-20 02:19:11 | INFO | stdout | 
2024-10-20 02:19:11 | INFO | stdout | 
2024-10-20 02:19:11 | INFO | stdout | 
2024-10-20 02:19:11 | INFO | stdout | 
2024-10-20 02:19:11 | INFO | stdout | 
2024-10-20 02:19:11 | INFO | stdout | 
2024-10-20 02:19:11 | INFO | stdout | 
2024-10-20 02:19:11 | INFO | stdout | 
2024-10-20 02:19:11 | INFO | stdout | 
2024-10-20 02:19:11 | INFO | stdout | 
2024-10-20 02:19:12 | INFO | stdout | 
2024-10-20 02:19:12 | INFO | stdout | 
2024-10-20 02:19:12 | INFO | stdout | 
2024-10-20 02:19:12 | INFO | stdout | 
2024-10-20 02:19:12 | INFO | stdout | 
2024-10-20 02:19:12 | INFO | stdout | 
2024-10-20 02:19:12 | INFO | stdout | 
2024-10-20 02:19:12 | INFO | stdout | 
2024-10-20 02:19:12 | INFO | stdout | 
2024-10-20 02:19:12 | INFO | stdout | 
2024-10-20 02:19:12 | INFO | stdout | 
2024-10-20 02:19:12 | INFO | stdout | 
2024-10-20 02:19:13 | INFO | stdout | 
2024-10-20 02:19:13 | INFO | stdout | 
2024-10-20 02:19:13 | INFO | stdout | 
2024-10-20 02:19:13 | INFO | stdout | 
2024-10-20 02:19:13 | INFO | stdout | 
2024-10-20 02:19:13 | INFO | stdout | 
2024-10-20 02:19:13 | INFO | stdout | 
2024-10-20 02:19:13 | INFO | stdout | 
2024-10-20 02:19:13 | INFO | stdout | 
2024-10-20 02:19:13 | INFO | stdout | 
2024-10-20 02:19:13 | INFO | stdout | 
2024-10-20 02:19:13 | INFO | stdout | 
2024-10-20 02:19:14 | INFO | stdout | 
2024-10-20 02:19:14 | INFO | stdout | 
2024-10-20 02:19:14 | INFO | stdout | 
2024-10-20 02:19:14 | INFO | stdout | 
2024-10-20 02:19:14 | INFO | stdout | 
2024-10-20 02:19:14 | INFO | stdout | 
2024-10-20 02:19:14 | INFO | stdout | 
2024-10-20 02:19:14 | INFO | stdout | 
2024-10-20 02:19:14 | INFO | stdout | 
2024-10-20 02:19:14 | INFO | stdout | 
2024-10-20 02:19:14 | INFO | stdout | 
2024-10-20 02:19:14 | INFO | stdout | 
2024-10-20 02:19:15 | INFO | stdout | 
2024-10-20 02:19:15 | INFO | stdout | 
2024-10-20 02:19:15 | INFO | stdout | 
2024-10-20 02:19:15 | INFO | stdout | 
2024-10-20 02:19:15 | INFO | stdout | 
2024-10-20 02:19:15 | INFO | stdout | 
2024-10-20 02:19:15 | INFO | stdout | 
2024-10-20 02:19:15 | INFO | stdout | 
2024-10-20 02:19:15 | INFO | stdout | 
2024-10-20 02:19:15 | INFO | stdout | 
2024-10-20 02:19:15 | INFO | stdout | 
2024-10-20 02:19:15 | INFO | stdout | 
2024-10-20 02:19:16 | INFO | stdout | 
2024-10-20 02:19:16 | INFO | stdout | 
2024-10-20 02:19:16 | INFO | stdout | 
2024-10-20 02:19:16 | INFO | stdout | 
2024-10-20 02:19:16 | INFO | stdout | 
2024-10-20 02:19:16 | INFO | stdout | 
2024-10-20 02:19:16 | INFO | stdout | 
2024-10-20 02:19:16 | INFO | stdout | 
2024-10-20 02:19:16 | INFO | stdout | 
2024-10-20 02:19:16 | INFO | stdout | 
2024-10-20 02:19:16 | INFO | stdout | 
2024-10-20 02:19:16 | INFO | stdout | 
2024-10-20 02:19:17 | INFO | stdout | 
2024-10-20 02:19:17 | INFO | stdout | 
2024-10-20 02:19:17 | INFO | stdout | 
2024-10-20 02:19:17 | INFO | stdout | 
2024-10-20 02:19:17 | INFO | stdout | 
2024-10-20 02:19:17 | INFO | stdout | 
2024-10-20 02:19:17 | INFO | stdout | 
2024-10-20 02:19:17 | INFO | stdout | 
2024-10-20 02:19:17 | INFO | stdout | 
2024-10-20 02:19:17 | INFO | stdout | 
2024-10-20 02:19:17 | INFO | stdout | 
2024-10-20 02:19:17 | INFO | stdout | 
2024-10-20 02:19:18 | INFO | stdout | 
2024-10-20 02:19:18 | INFO | stdout | 
2024-10-20 02:19:18 | INFO | stdout | 
2024-10-20 02:19:18 | INFO | stdout | 
2024-10-20 02:19:18 | INFO | stdout | 
2024-10-20 02:19:18 | INFO | stdout | 
2024-10-20 02:19:18 | INFO | stdout | 
2024-10-20 02:19:18 | INFO | stdout | 
2024-10-20 02:19:18 | INFO | stdout | 
2024-10-20 02:19:18 | INFO | stdout | 
2024-10-20 02:19:18 | INFO | stdout | 
2024-10-20 02:19:18 | INFO | stdout | 
2024-10-20 02:19:18 | INFO | stdout | 
2024-10-20 02:19:19 | INFO | stdout | 
2024-10-20 02:19:19 | INFO | stdout | 
2024-10-20 02:19:19 | INFO | stdout | 
2024-10-20 02:19:19 | INFO | stdout | 
2024-10-20 02:19:19 | INFO | stdout | 
2024-10-20 02:19:19 | INFO | stdout | 
2024-10-20 02:19:19 | INFO | stdout | 
2024-10-20 02:19:19 | INFO | stdout | 
2024-10-20 02:19:19 | INFO | stdout | 
2024-10-20 02:19:19 | INFO | stdout | 
2024-10-20 02:19:19 | INFO | stdout | 
2024-10-20 02:19:19 | INFO | stdout | 
2024-10-20 02:19:20 | INFO | stdout | 
2024-10-20 02:19:20 | INFO | stdout | 
2024-10-20 02:19:20 | INFO | stdout | 
2024-10-20 02:19:20 | INFO | stdout | 
2024-10-20 02:19:20 | INFO | stdout | 
2024-10-20 02:19:20 | INFO | stdout | 
2024-10-20 02:19:20 | INFO | stdout | 
2024-10-20 02:19:20 | INFO | stdout | 
2024-10-20 02:19:20 | INFO | stdout | 
2024-10-20 02:19:20 | INFO | stdout | 
2024-10-20 02:19:20 | INFO | stdout | 
2024-10-20 02:19:20 | INFO | stdout | 
2024-10-20 02:19:20 | INFO | stdout | 
2024-10-20 02:19:21 | INFO | stdout | 
2024-10-20 02:19:21 | INFO | stdout | 
2024-10-20 02:19:21 | INFO | stdout | 
2024-10-20 02:19:21 | INFO | stdout | 
2024-10-20 02:19:21 | INFO | stdout | 
2024-10-20 02:19:21 | INFO | stdout | 
2024-10-20 02:19:21 | INFO | stdout | 
2024-10-20 02:19:21 | INFO | stdout | 
2024-10-20 02:19:21 | INFO | stdout | 
2024-10-20 02:19:22 | INFO | stdout | 
2024-10-20 02:19:22 | INFO | stdout | 
2024-10-20 02:19:22 | INFO | stdout | 
2024-10-20 02:19:22 | INFO | stdout | 
2024-10-20 02:19:22 | INFO | stdout | 
2024-10-20 02:19:22 | INFO | stdout | 
2024-10-20 02:19:22 | INFO | stdout | 
2024-10-20 02:19:22 | INFO | stdout | 
2024-10-20 02:19:22 | INFO | stdout | 
2024-10-20 02:19:43 | INFO | httpx | HTTP Request: POST http://device-api.zero/startup-report "HTTP/1.1 200 OK"
2024-10-20 02:19:43 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-20 02:19:43 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-20 02:19:43 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-20 02:19:43 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-20 02:19:43 | ERROR | stderr |   warnings.warn(
2024-10-20 02:19:44 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-20 02:19:44 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-20 02:19:44 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-20 02:19:44 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
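A minimal sketch of wiring up the two GRADIO_NODE_* variables named above before the app is launched, assuming a Node 20+ binary is installed; the path and port values here are placeholders, not values taken from this Space:

```python
# Hypothetical sketch: pin the Gradio SSR Node settings via environment
# variables before the Blocks app is launched. The variable names come from
# the log above; the binary path and port number are assumptions.
import os

os.environ["GRADIO_NODE_PATH"] = "/usr/local/bin/node"  # assumed Node 20+ location
os.environ["GRADIO_NODE_PORT"] = "7862"                 # any free port outside 7861

# ...then build and launch the demo as usual, e.g.:
# demo.queue().launch(server_name="0.0.0.0", server_port=7860)
```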
2024-10-20 02:19:44 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-20 02:19:44 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-20 02:19:44 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-20 02:19:44 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-20 02:19:44 | ERROR | stderr |   warnings.warn(
2024-10-20 02:19:44 | INFO | stdout | 
2024-10-20 02:19:44 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-20 04:55:50 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140426968417616&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6ImphZGVjaG9naGFyaVtwcm9dIiwidXVpZCI6bnVsbCwiZXhwIjoxNzI5MzkzMDEwfQ.Z16DMVFNfS53OPpRSTKx6FcrU-pUK4mTJRTsu8QIrWQ "HTTP/1.1 200 OK"
2024-10-20 04:55:50 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=19
2024-10-20 04:55:50 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=21
2024-10-20 04:55:50 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[10, 13, 16, 15, 14]
2024-10-20 04:55:51 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=509f1303a72542190814683bcb6947ca79bf6d480ffb02693562eaea6dc97481&pid=10552 "HTTP/1.1 200 OK"
2024-10-20 04:55:53 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-20 04:55:53 | INFO | stdout | conv mode to gemma
2024-10-20 04:55:53 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 04:55:53 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 04:55:53 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': "A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\nwhat's inside rectangle<end_of_turn>\n<start_of_turn>model\n", 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
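For reference, a minimal sketch of how a Gemma-format prompt like the one in this request could be assembled; the role markers and system preamble come from the logged payload (preamble truncated here), while the helper name is hypothetical:

```python
# Sketch only: reproduces the prompt layout seen in the logged request.
SYSTEM = (
    "A chat between a human and an AI that understands visuals. "
    "... Image size: 1000x1000. Follow instructions."  # truncated; full text in the request above
)

def build_gemma_prompt(system: str, user_msg: str) -> str:
    # "<image>" marks where the server splices in the uploaded image.
    return (
        f"{system}<start_of_turn>user\n<image>\n{user_msg}<end_of_turn>\n"
        f"<start_of_turn>model\n"
    )

print(build_gemma_prompt(SYSTEM, "what's inside rectangle"))
```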
2024-10-20 04:55:53 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 04:55:53 | INFO | stdout | eval.json file created successfully.
2024-10-20 04:55:55 | INFO | stdout | Error occurred during inference:
2024-10-20 04:55:55 | INFO | stdout | Command '['python', '-m', 'model_UI', '--model_path', 'jadechoghari/Ferret-UI-Gemma2b', '--data_path', 'eval.json', '--image_path', './serve_images/2024-10-20', '--answers_file', 'eval_output.jsonl', '--num_beam', '1', '--temperature', '0.2', '--top_p', '0.7', '--max_new_tokens', '512', '--conv_mode', 'ferret_gemma_instruct']' returned non-zero exit status 1.
2024-10-20 04:55:55 | INFO | stdout | Subprocess output:
2024-10-20 04:55:55 | INFO | stdout | 
2024-10-20 04:55:55 | INFO | gradio_web_server | This is the response None
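A hedged sketch of re-running the logged model_UI command with its output captured, so a non-zero exit status would surface the subprocess's own stdout/stderr instead of the empty "Subprocess output:" block above; the arguments mirror the failed command:

```python
# Sketch under the assumption that model_UI is importable as a module in this
# environment, exactly as in the command logged above.
import subprocess

cmd = [
    "python", "-m", "model_UI",
    "--model_path", "jadechoghari/Ferret-UI-Gemma2b",
    "--data_path", "eval.json",
    "--image_path", "./serve_images/2024-10-20",
    "--answers_file", "eval_output.jsonl",
    "--num_beam", "1",
    "--temperature", "0.2",
    "--top_p", "0.7",
    "--max_new_tokens", "512",
    "--conv_mode", "ferret_gemma_instruct",
]

result = subprocess.run(cmd, capture_output=True, text=True)
if result.returncode != 0:
    # Print both streams so the cause of exit status 1 is visible in the log.
    print("model_UI failed:\n", result.stdout, "\n", result.stderr)
```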
2024-10-20 04:55:56 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=509f1303a72542190814683bcb6947ca79bf6d480ffb02693562eaea6dc97481&fail=true "HTTP/1.1 200 OK"
2024-10-20 04:55:56 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fill_yield_queue DONE
2024-10-20 04:55:56 | ERROR | stderr | Traceback (most recent call last):
2024-10-20 04:55:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-20 04:55:56 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-20 04:55:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-20 04:55:56 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-20 04:55:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-20 04:55:56 | ERROR | stderr |     result = await self.call_function(
2024-10-20 04:55:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-20 04:55:56 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-20 04:55:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-20 04:55:56 | ERROR | stderr |     return await anext(iterator)
2024-10-20 04:55:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-20 04:55:56 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-20 04:55:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-20 04:55:56 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-20 04:55:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-20 04:55:56 | ERROR | stderr |     return await future
2024-10-20 04:55:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-20 04:55:56 | ERROR | stderr |     result = context.run(func, *args)
2024-10-20 04:55:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-20 04:55:56 | ERROR | stderr |     return next(iterator)
2024-10-20 04:55:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-20 04:55:56 | ERROR | stderr |     response = next(iterator)
2024-10-20 04:55:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 356, in gradio_handler
2024-10-20 04:55:56 | ERROR | stderr |     raise res.value
2024-10-20 04:55:56 | ERROR | stderr | AttributeError: 'NoneType' object has no attribute 'iter_lines'
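The traceback ends with .iter_lines() being called on None: the backend call failed, so the handler received no response object. A minimal defensive sketch, assuming a generator-style handler (the function name stream_answer is hypothetical):

    # Sketch: guard against a missing response before streaming its lines.
    def stream_answer(response):
        if response is None:
            yield "**Inference failed on the server side. Please try again.**"
            return
        for line in response.iter_lines():
            if line:
                yield line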
2024-10-20 04:59:16 | INFO | httpx | HTTP Request: POST http://device-api.zero/startup-report "HTTP/1.1 200 OK"
2024-10-20 04:59:16 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-20 04:59:16 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-20 04:59:16 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-20 04:59:17 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-20 04:59:17 | ERROR | stderr |   warnings.warn(
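The warning above asks for the messages format. A sketch of constructing the chatbot that way, with openai-style role/content dicts (the history values shown are examples, not taken from the log):

    # Sketch of the change the Gradio deprecation warning requests.
    import gradio as gr

    chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", type="messages", height=750)

    history = [
        {"role": "user", "content": "describe what you see in this image"},
        {"role": "assistant", "content": "..."},
    ]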
2024-10-20 04:59:17 | ERROR | stderr |
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-20 04:59:17 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-20 04:59:17 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-20 04:59:17 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-20 04:59:17 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
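A sketch of setting the two environment variables named above before Gradio is imported, so SSR can find a Node 20+ binary and a free port; the path and port values are placeholders, not taken from the log:

    # Sketch: configure Node for Gradio SSR (example values only).
    import os

    os.environ["GRADIO_NODE_PATH"] = "/usr/local/bin/node"  # example path to a Node 20+ binary
    os.environ["GRADIO_NODE_PORT"] = "7862"                  # example port outside the 7861-7861 range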
2024-10-20 04:59:17 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-20 04:59:17 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-20 04:59:17 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-20 04:59:17 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-20 04:59:17 | ERROR | stderr |   warnings.warn(
2024-10-20 04:59:17 | INFO | stdout | 
2024-10-20 04:59:17 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-20 04:59:32 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140188218099024&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6ImphZGVjaG9naGFyaVtwcm9dIiwidXVpZCI6bnVsbCwiZXhwIjoxNzI5MzkzMjMyfQ.z5Q58QcdykO4WC-QLPV5Xu7TYdqjgRJ19SMa7tEcJkE "HTTP/1.1 200 OK"
2024-10-20 04:59:32 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=18
2024-10-20 04:59:32 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=20
2024-10-20 04:59:32 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[10, 15, 14, 13]
2024-10-20 04:59:33 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=e23d9cdf8e42db21b9e01e39961213b479f359f5de391aa58b227061c78abc4c&pid=11741 "HTTP/1.1 200 OK"
2024-10-20 04:59:34 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-20 04:59:34 | INFO | stdout | conv mode to gemma
2024-10-20 04:59:34 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 04:59:34 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 04:59:34 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe what you see in this image<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
2024-10-20 04:59:34 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 04:59:34 | INFO | stdout | eval.json file created successfully.
2024-10-20 05:00:00 | INFO | stdout | Subprocess output:
2024-10-20 05:00:00 | INFO | stdout | 
2024-10-20 05:00:00 | INFO | stdout | Subprocess error (if any):
2024-10-20 05:00:00 | INFO | stdout | A new version of the following files was downloaded from https://huggingface.co/jadechoghari/Ferret-UI-Gemma2b:
2024-10-20 05:00:00 | INFO | stdout | - mm_utils.py
2024-10-20 05:00:00 | INFO | stdout | . Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
2024-10-20 05:00:00 | INFO | stdout | A new version of the following files was downloaded from https://huggingface.co/jadechoghari/Ferret-UI-Gemma2b:
2024-10-20 05:00:00 | INFO | stdout | - clip_encoder.py
2024-10-20 05:00:00 | INFO | stdout | . Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
2024-10-20 05:00:00 | INFO | stdout | A new version of the following files was downloaded from https://huggingface.co/jadechoghari/Ferret-UI-Gemma2b:
2024-10-20 05:00:00 | INFO | stdout | - constants.py
2024-10-20 05:00:00 | INFO | stdout | . Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
2024-10-20 05:00:00 | INFO | stdout | A new version of the following files was downloaded from https://huggingface.co/jadechoghari/Ferret-UI-Gemma2b:
2024-10-20 05:00:00 | INFO | stdout | - ferret_arch.py
2024-10-20 05:00:00 | INFO | stdout | - mm_utils.py
2024-10-20 05:00:00 | INFO | stdout | - clip_encoder.py
2024-10-20 05:00:00 | INFO | stdout | - constants.py
2024-10-20 05:00:00 | INFO | stdout | . Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
2024-10-20 05:00:00 | INFO | stdout | A new version of the following files was downloaded from https://huggingface.co/jadechoghari/Ferret-UI-Gemma2b:
2024-10-20 05:00:00 | INFO | stdout | - modeling.py
2024-10-20 05:00:00 | INFO | stdout | - ferret_arch.py
2024-10-20 05:00:00 | INFO | stdout | . Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
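The repeated advice above is to pin a revision so the repo's remote code files are not re-downloaded whenever the model repository changes. A sketch using transformers' from_pretrained (the commit hash is a placeholder, and loading via AutoModelForCausalLM is an assumption about how the custom classes are mapped):

    # Sketch: pin the remote-code repo to a fixed commit.
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "jadechoghari/Ferret-UI-Gemma2b",
        trust_remote_code=True,   # required for the repo's custom modeling files
        revision="<commit-sha>",  # placeholder: pin to a specific commit
    )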
2024-10-20 05:00:00 | INFO | stdout | Downloading shards: 100%|██████████| 2/2 [00:06<00:00,  3.17s/it]
2024-10-20 05:00:00 | INFO | stdout | Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.95s/it]
2024-10-20 05:00:00 | INFO | stdout | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
2024-10-20 05:00:00 | INFO | stdout | - This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
2024-10-20 05:00:00 | INFO | stdout | - This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
2024-10-20 05:00:00 | INFO | stdout | Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)
2024-10-20 05:00:00 | INFO | stdout | 100%|██████████| 1/1 [00:04<00:00,  4.81s/it]
2024-10-20 05:00:00 | INFO | stdout | Inference completed. Output written to eval_output.jsonl
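A sketch of reading the answers file written above, assuming one JSON object per line as the .jsonl name suggests; the field name "text" is an assumption, not taken from the log:

    # Sketch: read the model's answers back from eval_output.jsonl.
    import json

    with open("eval_output.jsonl") as f:
        for line in f:
            record = json.loads(line)
            print(record.get("text"))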
2024-10-20 05:00:00 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=e23d9cdf8e42db21b9e01e39961213b479f359f5de391aa58b227061c78abc4c&fail=true "HTTP/1.1 200 OK"
2024-10-20 05:00:00 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fill_yield_queue DONE
2024-10-20 05:00:00 | ERROR | stderr | Traceback (most recent call last):
2024-10-20 05:00:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-20 05:00:00 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-20 05:00:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-20 05:00:00 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-20 05:00:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-20 05:00:00 | ERROR | stderr |     result = await self.call_function(
2024-10-20 05:00:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-20 05:00:00 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-20 05:00:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-20 05:00:00 | ERROR | stderr |     return await anext(iterator)
2024-10-20 05:00:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-20 05:00:00 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-20 05:00:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-20 05:00:00 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-20 05:00:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-20 05:00:00 | ERROR | stderr |     return await future
2024-10-20 05:00:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-20 05:00:00 | ERROR | stderr |     result = context.run(func, *args)
2024-10-20 05:00:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-20 05:00:00 | ERROR | stderr |     return next(iterator)
2024-10-20 05:00:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-20 05:00:00 | ERROR | stderr |     response = next(iterator)
2024-10-20 05:00:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 356, in gradio_handler
2024-10-20 05:00:00 | ERROR | stderr |     raise res.value
2024-10-20 05:00:00 | ERROR | stderr | FileNotFoundError: [Errno 2] No such file or directory: '/home/user/app/8b23f327b90b6211049acd36e3f99975.jpg'
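The subprocess was pointed at ./serve_images/2024-10-20, while the failing path above is the bare hashed filename under /home/user/app, suggesting the uploaded image was not written where it was later looked up. A sketch of one way to avoid this: save the upload under its hashed name in the dated serve_images directory and pass that path onward (all names here are illustrative; the app's actual save logic is not shown in the log):

    # Sketch: persist the uploaded image where the inference step expects it.
    import os
    from datetime import date

    def save_upload(pil_image, image_hash: str) -> str:
        out_dir = os.path.join("serve_images", str(date.today()))
        os.makedirs(out_dir, exist_ok=True)
        out_path = os.path.join(out_dir, f"{image_hash}.jpg")
        if not os.path.exists(out_path):
            # pil_image is any PIL.Image-like object with convert()/save().
            pil_image.convert("RGB").save(out_path)
        return out_path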
2024-10-20 05:05:53 | INFO | httpx | HTTP Request: POST http://device-api.zero/startup-report "HTTP/1.1 200 OK"
2024-10-20 05:05:53 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-20 05:05:53 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-20 05:05:53 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-20 05:05:53 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-20 05:05:53 | ERROR | stderr |   warnings.warn(
2024-10-20 05:05:53 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-20 05:05:53 | ERROR | stderr |
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-20 05:05:53 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-20 05:05:53 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-20 05:05:53 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-20 05:05:53 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-20 05:05:53 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-20 05:05:54 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-20 05:05:54 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-20 05:05:54 | ERROR | stderr |   warnings.warn(
2024-10-20 05:05:54 | INFO | stdout | 
2024-10-20 05:05:54 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-20 05:06:09 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=139948146754752&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6ImphZGVjaG9naGFyaVtwcm9dIiwidXVpZCI6bnVsbCwiZXhwIjoxNzI5MzkzNjI5fQ.QBnCnaap7XDoSmQtu_4oEWihRnqXcAFbz6gQQ9UOAmk "HTTP/1.1 200 OK"
2024-10-20 05:06:09 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=19
2024-10-20 05:06:09 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=21
2024-10-20 05:06:09 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[17, 15, 13, 10, 14]
2024-10-20 05:06:10 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=6cb002c0ec026effa0a08fac0edd253418104e920209c068bfa1ff3c6099cddf&pid=13690 "HTTP/1.1 200 OK"
2024-10-20 05:06:11 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-20 05:06:11 | INFO | stdout | conv mode to gemma
2024-10-20 05:06:11 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 05:06:11 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 05:06:11 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe what you see, and explain me<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
2024-10-20 05:06:11 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 05:06:11 | INFO | stdout | eval.json file created successfully.
2024-10-20 05:06:42 | INFO | stdout | Subprocess output:
2024-10-20 05:06:42 | INFO | stdout | 
2024-10-20 05:06:42 | INFO | stdout | Subprocess error (if any):
2024-10-20 05:06:42 | INFO | stdout | 
2024-10-20 05:06:42 | INFO | stdout | Loading checkpoint shards: 100%|██████████| 2/2 [00:05<00:00,  2.94s/it]
2024-10-20 05:06:42 | INFO | stdout | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: [same vision-tower weight list as in the warning above]
'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
2024-10-20 05:06:42 | INFO | stdout | - This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
2024-10-20 05:06:42 | INFO | stdout | - This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
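The two notes above are the standard Hugging Face explanation: the unused entries are all CLIP vision-tower parameters, which suggests the Ferret class rebuilds its vision tower through its own loading path, so the message is the expected case here. A minimal sketch of loading the checkpoint while silencing that init-time notice, assuming (not confirmed by this log) that the repo's custom FerretGemmaForCausalLM is loadable through trust_remote_code:

    # A minimal sketch, assuming the benign case described above; not the Space's
    # actual loading code.
    from transformers import AutoModelForCausalLM
    from transformers.utils import logging as hf_logging

    hf_logging.set_verbosity_error()  # hide init-time notices such as the one above

    model = AutoModelForCausalLM.from_pretrained(
        "jadechoghari/Ferret-UI-Gemma2b",
        trust_remote_code=True,  # assumption: the repo ships its custom model class
    )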
2024-10-20 05:06:42 | INFO | stdout | 
2024-10-20 05:06:42 | INFO | stdout | 
2024-10-20 05:06:42 | INFO | stdout |   0%|          | 0/1 [00:00<?, ?it/s]Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)
2024-10-20 05:06:42 | INFO | stdout | 
2024-10-20 05:06:42 | INFO | stdout | 
2024-10-20 05:06:42 | INFO | stdout | 100%|██████████| 1/1 [00:15<00:00, 15.36s/it]
2024-10-20 05:06:42 | INFO | stdout | 
2024-10-20 05:06:42 | INFO | stdout | Inference completed. Output written to eval_output.jsonl
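Since the eval subprocess reports writing its predictions to eval_output.jsonl, the file can be inspected directly. A small sketch under the assumption that each line is one JSON record; the record schema is not documented in this log:

    # Read the file named in the log line above and show whatever keys each record has.
    import json

    with open("eval_output.jsonl") as f:
        records = [json.loads(line) for line in f if line.strip()]

    for rec in records:
        print(sorted(rec.keys()) if isinstance(rec, dict) else rec)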
2024-10-20 05:06:43 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=6cb002c0ec026effa0a08fac0edd253418104e920209c068bfa1ff3c6099cddf&fail=true "HTTP/1.1 200 OK"
2024-10-20 05:06:43 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fill_yield_queue DONE
2024-10-20 05:06:43 | ERROR | stderr | Traceback (most recent call last):
2024-10-20 05:06:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-20 05:06:43 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-20 05:06:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-20 05:06:43 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-20 05:06:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-20 05:06:43 | ERROR | stderr |     result = await self.call_function(
2024-10-20 05:06:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-20 05:06:43 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-20 05:06:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-20 05:06:43 | ERROR | stderr |     return await anext(iterator)
2024-10-20 05:06:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-20 05:06:43 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-20 05:06:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-20 05:06:43 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-20 05:06:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-20 05:06:43 | ERROR | stderr |     return await future
2024-10-20 05:06:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-20 05:06:43 | ERROR | stderr |     result = context.run(func, *args)
2024-10-20 05:06:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-20 05:06:43 | ERROR | stderr |     return next(iterator)
2024-10-20 05:06:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-20 05:06:43 | ERROR | stderr |     response = next(iterator)
2024-10-20 05:06:43 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 356, in gradio_handler
2024-10-20 05:06:43 | ERROR | stderr |     raise res.value
2024-10-20 05:06:43 | ERROR | stderr | ValueError: not enough values to unpack (expected 2, got 1)
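The traceback is re-raised by the spaces ZeroGPU wrapper (`raise res.value`), so the frame that actually failed is not shown; the message itself indicates a two-element tuple unpack that received a single value somewhere in the Space's generator. A hypothetical sketch of the failing pattern and a defensive variant; the function names are illustrative, not the app's real code:

    # Hypothetical reproduction of the error class seen above, plus a tolerant variant.
    def parse_result(raw):
        text, boxes = raw        # raises "not enough values to unpack (expected 2, got 1)"
        return text, boxes       # whenever raw carries a single element

    def parse_result_safe(raw):
        # Accept either a (text, boxes) pair or a bare single-element payload.
        if isinstance(raw, (list, tuple)) and len(raw) == 2:
            return tuple(raw)
        return raw, None

    print(parse_result_safe(["only one value"]))  # -> (['only one value'], None)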
2024-10-20 05:11:13 | INFO | httpx | HTTP Request: POST http://device-api.zero/startup-report "HTTP/1.1 200 OK"
2024-10-20 05:11:13 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-20 05:11:13 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-20 05:11:13 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-20 05:11:14 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-20 05:11:14 | ERROR | stderr |   warnings.warn(
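The deprecation warning above is triggered by constructing gr.Chatbot without a type argument. A short sketch of the message format the warning asks for, applicable to recent Gradio releases; other component options are omitted here:

    import gradio as gr

    # Opt into the openai-style history format instead of the deprecated tuples.
    chatbot = gr.Chatbot(type="messages")

    # With type="messages", each history entry is a role/content dict:
    history = [
        {"role": "user", "content": "explain what you see in the image"},
        {"role": "assistant", "content": "..."},
    ]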
2024-10-20 05:11:14 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-20 05:11:14 | ERROR | stderr | 
2024-10-20 05:11:14 | ERROR | stderr | 
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-20 05:11:14 | ERROR | stderr | 
2024-10-20 05:11:14 | ERROR | stderr | 
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-20 05:11:14 | ERROR | stderr | 
2024-10-20 05:11:14 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-20 05:11:14 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-20 05:11:14 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
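The three messages above name two environment variables, GRADIO_NODE_PATH and GRADIO_NODE_PORT, that can be set before the app is built. A sketch with placeholder values; the node path and port below are examples, not values taken from this Space:

    import os

    # Point Gradio's SSR helper at a Node >= 20 binary and a free port before launch.
    os.environ["GRADIO_NODE_PATH"] = "/usr/local/bin/node"  # example path
    os.environ["GRADIO_NODE_PORT"] = "7862"                 # example free port

    # Recent Gradio releases also accept ssr_mode=False in launch() to skip SSR entirely.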
2024-10-20 05:11:14 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-20 05:11:14 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-20 05:11:14 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-20 05:11:14 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-20 05:11:14 | ERROR | stderr |   warnings.warn(
2024-10-20 05:11:14 | INFO | stdout | 
2024-10-20 05:11:14 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-20 05:11:39 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=139742296687808&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6ImphZGVjaG9naGFyaVtwcm9dIiwidXVpZCI6bnVsbCwiZXhwIjoxNzI5MzkzOTU4fQ.GtbzBobNNvyNCf64_cyzyTzJO4yf8DeEU4klIenIbPw "HTTP/1.1 200 OK"
2024-10-20 05:11:39 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=18
2024-10-20 05:11:39 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=20
2024-10-20 05:11:39 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[13, 16, 14, 10]
2024-10-20 05:11:39 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=34b7517791e274a49adf9943e365509aebfccc0dd7eccd2a7023166828215390&pid=14910 "HTTP/1.1 200 OK"
2024-10-20 05:11:40 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-20 05:11:40 | INFO | stdout | conv mode to gemma
2024-10-20 05:11:40 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 05:11:40 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 05:11:40 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\nexplain what you see in the image<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
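The request logged above carries the sampling settings for this turn. A minimal sketch of how those settings would map onto a transformers generate() call; the model, tokenizer, and image handling are assumed from the app and omitted here:

    # Generation parameters copied from the logged request; everything else is assumed.
    gen_kwargs = dict(
        do_sample=True,
        temperature=0.2,
        top_p=0.7,
        max_new_tokens=512,
    )
    print(gen_kwargs)
    # output_ids = model.generate(input_ids, **gen_kwargs)  # hypothetical call site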
2024-10-20 05:11:40 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 05:11:40 | INFO | stdout | eval.json file created successfully.
2024-10-20 05:11:56 | INFO | stdout | Subprocess output:
2024-10-20 05:11:56 | INFO | stdout | 
2024-10-20 05:11:56 | INFO | stdout | Subprocess error (if any):
2024-10-20 05:11:56 | INFO | stdout | 
2024-10-20 05:11:56 | INFO | stdout | 
2024-10-20 05:11:56 | INFO | stdout | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-20 05:11:56 | INFO | stdout | 
2024-10-20 05:11:56 | INFO | stdout | Loading checkpoint shards:  50%|█████     | 1/2 [00:02<00:02,  2.79s/it]
2024-10-20 05:11:56 | INFO | stdout | Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.64s/it]
2024-10-20 05:11:56 | INFO | stdout | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
2024-10-20 05:11:56 | INFO | stdout | - This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
2024-10-20 05:11:56 | INFO | stdout | - This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
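The two notices above are the standard Transformers message that the checkpoint's CLIP vision-tower keys (everything under model.vision_tower.vision_tower.vision_model.*) were skipped when FerretGemmaForCausalLM was instantiated. In LLaVA-style multimodal models the vision tower is usually attached and loaded as a separate step after from_pretrained, which would explain why these keys are reported as unused rather than missing. A minimal sketch of that pattern, assuming a LLaVA-style get_vision_tower()/load_model() interface (the accessor names are assumptions, not confirmed by this log):

    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "jadechoghari/Ferret-UI-Gemma2b", trust_remote_code=True
    )
    vision_tower = model.get_vision_tower()      # assumed LLaVA-style accessor
    if not getattr(vision_tower, "is_loaded", False):
        vision_tower.load_model()                # loads the CLIP weights in a separate step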
2024-10-20 05:11:56 | INFO | stdout | 
2024-10-20 05:11:56 | INFO | stdout | 
2024-10-20 05:11:56 | INFO | stdout |   0%|          | 0/1 [00:00<?, ?it/s]Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)
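The "Starting from v4.46" notice above is emitted by Transformers: at inference time, generation logits now follow the model's dtype (for example bfloat16) instead of always being float32. Downstream code that assumes FP32 logits can cast explicitly; a minimal illustrative sketch (the tensor here is a stand-in for model(**inputs).logits):

    import torch

    # Stand-in for model(**inputs).logits, which from v4.46 follows the model dtype.
    logits = torch.randn(1, 4, 256000, dtype=torch.bfloat16)
    # Cast to float32 before numerically sensitive post-processing such as softmax.
    probs = torch.softmax(logits.float(), dim=-1)
    print(probs.dtype)   # torch.float32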
2024-10-20 05:11:56 | INFO | stdout | 
2024-10-20 05:11:56 | INFO | stdout | 
2024-10-20 05:11:56 | INFO | stdout | 100%|██████████| 1/1 [00:05<00:00,  5.97s/it]
2024-10-20 05:11:56 | INFO | stdout | 100%|██████████| 1/1 [00:05<00:00,  5.97s/it]
2024-10-20 05:11:56 | INFO | stdout | 
2024-10-20 05:11:56 | INFO | stdout | Inference completed. Output written to eval_output.jsonl
2024-10-20 05:11:56 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=34b7517791e274a49adf9943e365509aebfccc0dd7eccd2a7023166828215390&fail=true "HTTP/1.1 200 OK"
2024-10-20 05:11:56 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fill_yield_queue DONE
2024-10-20 05:11:56 | ERROR | stderr | Traceback (most recent call last):
2024-10-20 05:11:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-20 05:11:56 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-20 05:11:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-20 05:11:56 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-20 05:11:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-20 05:11:56 | ERROR | stderr |     result = await self.call_function(
2024-10-20 05:11:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-20 05:11:56 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-20 05:11:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-20 05:11:56 | ERROR | stderr |     return await anext(iterator)
2024-10-20 05:11:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-20 05:11:56 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-20 05:11:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-20 05:11:56 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-20 05:11:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-20 05:11:56 | ERROR | stderr |     return await future
2024-10-20 05:11:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-20 05:11:56 | ERROR | stderr |     result = context.run(func, *args)
2024-10-20 05:11:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-20 05:11:56 | ERROR | stderr |     return next(iterator)
2024-10-20 05:11:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-20 05:11:56 | ERROR | stderr |     response = next(iterator)
2024-10-20 05:11:56 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 356, in gradio_handler
2024-10-20 05:11:56 | ERROR | stderr |     raise res.value
2024-10-20 05:11:56 | ERROR | stderr | NameError: name 'extracted_texts' is not defined
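The traceback above ends in a plain Python NameError re-raised through the ZeroGPU wrapper: extracted_texts was referenced before it was ever assigned, most likely because the assignment lives in a branch that did not execute for this request. A minimal sketch of the failure mode and the usual fix, with every name except extracted_texts purely illustrative:

    def handler(run_ocr: bool):
        extracted_texts = []                     # bind unconditionally up front
        if run_ocr:
            extracted_texts = ["text found in the screenshot"]
        # ... model call elided in this sketch ...
        return {"texts": extracted_texts}

    print(handler(run_ocr=False))                # {'texts': []} instead of a NameError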
2024-10-20 05:12:33 | INFO | httpx | HTTP Request: POST http://device-api.zero/startup-report "HTTP/1.1 200 OK"
2024-10-20 05:12:33 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-20 05:12:33 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-20 05:12:33 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-20 05:12:33 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-20 05:12:33 | ERROR | stderr |   warnings.warn(
2024-10-20 05:12:33 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-20 05:12:33 | ERROR | stderr | 
2024-10-20 05:12:33 | ERROR | stderr | 
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-20 05:12:33 | ERROR | stderr | 
2024-10-20 05:12:33 | ERROR | stderr | 
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-20 05:12:33 | ERROR | stderr | 
2024-10-20 05:12:34 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-20 05:12:34 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-20 05:12:34 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-20 05:12:34 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-20 05:12:34 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-20 05:12:34 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-20 05:12:34 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-20 05:12:34 | ERROR | stderr |   warnings.warn(
2024-10-20 05:12:34 | INFO | stdout | 
2024-10-20 05:12:34 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-20 05:12:43 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=139861687354560&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6ImphZGVjaG9naGFyaVtwcm9dIiwidXVpZCI6bnVsbCwiZXhwIjoxNzI5Mzk0MDIzfQ.vSAq6i38lbe_RaM3QbcUFVSIorBENqpwmrnzTmM59z8 "HTTP/1.1 200 OK"
2024-10-20 05:12:43 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=19
2024-10-20 05:12:43 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=21
2024-10-20 05:12:43 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[17, 12, 14, 10, 15]
2024-10-20 05:12:43 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=d2c0da78c41ba0083970f6ea193eab10bb50c621dc9eb5cc250e021a6cab6622&pid=15426 "HTTP/1.1 200 OK"
2024-10-20 05:12:44 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-20 05:12:44 | INFO | stdout | conv mode to gemma
2024-10-20 05:12:44 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 05:12:44 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 05:12:44 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe what you see<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
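The payload above is what gradio_web_server sends for one turn: the serialized Gemma-format conversation prompt, the sampling parameters, the stop string, and a list of image hashes standing in for the uploaded screenshot. A minimal sketch of issuing such a request with a streaming read, assuming a FastChat/LLaVA-style worker endpoint (the worker URL, the /worker_generate_stream path, and the NUL-delimited chunk framing are assumptions, not confirmed by this log):

    import json
    import requests

    worker_url = "http://localhost:21002"        # illustrative worker address
    payload = {
        "model": "jadechoghari/Ferret-UI-Gemma2b",
        # System preamble omitted for brevity; the log shows the full prompt.
        "prompt": "<start_of_turn>user\n<image>\ndescribe what you see<end_of_turn>\n<start_of_turn>model\n",
        "temperature": 0.2,
        "top_p": 0.7,
        "max_new_tokens": 512,
        "stop": "<eos>",
        "images": ["8b23f327b90b6211049acd36e3f99975"],
    }
    response = requests.post(worker_url + "/worker_generate_stream",
                             json=payload, stream=True, timeout=10)
    for chunk in response.iter_lines(decode_unicode=False, delimiter=b"\0"):
        if chunk:
            data = json.loads(chunk.decode())
            print(data.get("text", ""))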
2024-10-20 05:12:44 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 05:12:44 | INFO | stdout | eval.json file created successfully.
2024-10-20 05:13:00 | INFO | stdout | Subprocess output:
2024-10-20 05:13:00 | INFO | stdout | 
2024-10-20 05:13:00 | INFO | stdout | Subprocess error (if any):
2024-10-20 05:13:00 | INFO | stdout | 
2024-10-20 05:13:00 | INFO | stdout | 
2024-10-20 05:13:00 | INFO | stdout | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-20 05:13:00 | INFO | stdout | 
2024-10-20 05:13:00 | INFO | stdout | Loading checkpoint shards:  50%|█████     | 1/2 [00:02<00:02,  2.87s/it]
2024-10-20 05:13:00 | INFO | stdout | Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.70s/it]
2024-10-20 05:13:00 | INFO | stdout | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: [same list of 'model.vision_tower.vision_tower.vision_model.*' keys as in the identical warning above: embeddings, encoder layers 0-23 (layer_norm1/2, mlp.fc1/fc2, self_attn q/k/v/out_proj weights and biases), and pre/post layernorm]
2024-10-20 05:13:00 | INFO | stdout | - This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
2024-10-20 05:13:00 | INFO | stdout | - This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
2024-10-20 05:13:00 | INFO | stdout | 
2024-10-20 05:13:00 | INFO | stdout | 
2024-10-20 05:13:00 | INFO | stdout |   0%|          | 0/1 [00:00<?, ?it/s]Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)
2024-10-20 05:13:00 | INFO | stdout | 
2024-10-20 05:13:00 | INFO | stdout | 
2024-10-20 05:13:00 | INFO | stdout | 100%|██████████| 1/1 [00:05<00:00,  5.69s/it]
2024-10-20 05:13:00 | INFO | stdout | 100%|██████████| 1/1 [00:05<00:00,  5.69s/it]
2024-10-20 05:13:00 | INFO | stdout | 
2024-10-20 05:13:00 | INFO | stdout | Inference completed. Output written to eval_output.jsonl
2024-10-20 05:13:00 | INFO | gradio_web_server | This is the response ['The mobile screen appears to be a task management or to-do list app. At the top of the screen, there\'s a selected checkbox with the text "Gardenina, Ask JackieLynn to borrow her truck, checkmark, • Messaging: Jacky" which spans almost the entire width of the screen. \n\nBelow this, there are several buttons with different tasks to do, each button also has a checkbox to the left. The tasks are: "9:41, Get grave for water to blow away", "Get gravel for wall next to garage", "Buy mulch", "Buy soil, #shoppinglist", "Buy mulch, #shoppinglist", "Pick up mulch for garden", "Put up soil for garden", "Pick up soil for garden", and "Look up native vines for vine vines". \n\nAt the bottom of the screen, there\'s a button labeled "up, Francina: Van, Arrive" that spans the entire width of the screen.']
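The answer logged above is read back from eval_output.jsonl, the file the inference subprocess reported writing just before. A minimal sketch of pulling the generated text out of that file, assuming each line is a JSON object whose answer lives under a "text" key (the actual key used by the eval script may differ):

    import json

    answers = []
    with open("eval_output.jsonl") as f:
        for line in f:
            record = json.loads(line)
            answers.append(record.get("text", ""))

    print(answers[0][:80])                       # first 80 characters of the first answer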
2024-10-20 05:13:00 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=d2c0da78c41ba0083970f6ea193eab10bb50c621dc9eb5cc250e021a6cab6622&fail=true "HTTP/1.1 200 OK"
2024-10-20 05:13:00 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fill_yield_queue DONE
2024-10-20 05:13:00 | ERROR | stderr | Traceback (most recent call last):
2024-10-20 05:13:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-20 05:13:00 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-20 05:13:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-20 05:13:00 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-20 05:13:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-20 05:13:00 | ERROR | stderr |     result = await self.call_function(
2024-10-20 05:13:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-20 05:13:00 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-20 05:13:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-20 05:13:00 | ERROR | stderr |     return await anext(iterator)
2024-10-20 05:13:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-20 05:13:00 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-20 05:13:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-20 05:13:00 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-20 05:13:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-20 05:13:00 | ERROR | stderr |     return await future
2024-10-20 05:13:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-20 05:13:00 | ERROR | stderr |     result = context.run(func, *args)
2024-10-20 05:13:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-20 05:13:00 | ERROR | stderr |     return next(iterator)
2024-10-20 05:13:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-20 05:13:00 | ERROR | stderr |     response = next(iterator)
2024-10-20 05:13:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 356, in gradio_handler
2024-10-20 05:13:00 | ERROR | stderr |     raise res.value
2024-10-20 05:13:00 | ERROR | stderr | AttributeError: 'list' object has no attribute 'iter_lines'
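The traceback above ends in `AttributeError: 'list' object has no attribute 'iter_lines'`: the generator feeding the chatbot calls `.iter_lines()` on whatever the worker returned, and here the worker handed back a plain Python list of strings instead of a streaming HTTP response. A hedged, defensive sketch of the pattern (the `iter_chunks` helper and the surrounding usage are hypothetical, not the app's actual code):

```python
# Hedged sketch: accept either a streaming HTTP response (which exposes
# iter_lines()) or an already-materialised list of chunks, so a worker that
# returns a list no longer raises the AttributeError seen above.
def iter_chunks(worker_output):
    if hasattr(worker_output, "iter_lines"):
        # e.g. a requests.Response obtained with stream=True
        yield from worker_output.iter_lines(decode_unicode=True)
    elif isinstance(worker_output, (list, tuple)):
        yield from worker_output
    else:
        yield str(worker_output)

# Usage: the generator that streams text to the chatbot iterates uniformly.
for chunk in iter_chunks(["partial text", "final text"]):
    print(chunk)
```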
2024-10-20 05:18:26 | INFO | httpx | HTTP Request: POST http://device-api.zero/startup-report "HTTP/1.1 200 OK"
2024-10-20 05:18:26 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-20 05:18:26 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-20 05:18:26 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-20 05:18:26 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-20 05:18:26 | ERROR | stderr |   warnings.warn(
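The deprecation warning above is newer Gradio asking for openai-style message dicts instead of (user, bot) tuples. A minimal sketch of the change the warning suggests (the component names and callback are illustrative, not the app's code):

```python
import gradio as gr

# Hedged sketch of the fix the warning asks for: declare the chatbot with
# type="messages" and append role/content dicts instead of (user, bot) tuples.
with gr.Blocks() as demo:
    chatbot = gr.Chatbot(type="messages", label="FERRET")

    def respond(history):
        history.append({"role": "assistant", "content": "Hello from the model."})
        return history

    btn = gr.Button("Say hi")
    btn.click(respond, inputs=chatbot, outputs=chatbot)
```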
2024-10-20 05:18:26 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-20 05:18:26 | ERROR | stderr | 
2024-10-20 05:18:26 | ERROR | stderr | 
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-20 05:18:26 | ERROR | stderr | 
2024-10-20 05:18:26 | ERROR | stderr | 
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-20 05:18:26 | ERROR | stderr | 
2024-10-20 05:18:26 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-20 05:18:26 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-20 05:18:26 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
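The three stdout lines above are Gradio failing to start its Node-based SSR server on port 7861 and naming the two environment variables that control it. A hedged sketch of setting them before launch; the Node path and port values below are placeholders, not values from the log:

```python
import os

# Hedged sketch based on the message above: point Gradio at a Node >= 20
# binary and give the SSR server a free port. Both values are placeholders.
os.environ["GRADIO_NODE_PATH"] = "/usr/local/bin/node"
os.environ["GRADIO_NODE_PORT"] = "7862"

# Alternatively, server-side rendering can be skipped at launch time
# (if the installed Gradio version supports the flag):
# demo.launch(ssr_mode=False)
```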
2024-10-20 05:18:26 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-20 05:18:27 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-20 05:18:27 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-20 05:18:27 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-20 05:18:27 | ERROR | stderr |   warnings.warn(
2024-10-20 05:18:27 | INFO | stdout | 
2024-10-20 05:18:27 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-20 05:18:39 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140613572986048&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6ImphZGVjaG9naGFyaVtwcm9dIiwidXVpZCI6bnVsbCwiZXhwIjoxNzI5Mzk0Mzc5fQ.tduahHBVGbirGj1g4QzIKBQKOO6trJMQkBQMG_WyUg4 "HTTP/1.1 200 OK"
2024-10-20 05:18:39 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=18
2024-10-20 05:18:39 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=20
2024-10-20 05:18:39 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[13, 15, 10, 14]
2024-10-20 05:18:40 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=8dda3423d33f3a2c9c5f207ac0fae2b107e5c788e79c16e9f42e5b74557c1c96&pid=16718 "HTTP/1.1 200 OK"
2024-10-20 05:18:40 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-20 05:18:40 | INFO | stdout | conv mode to gemma
2024-10-20 05:18:40 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 05:18:40 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 05:18:40 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\nexplain what you see<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
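The request logged above shows the Gemma-style turn template (`<start_of_turn>user ... <end_of_turn>`) wrapped around the system message, plus the sampling parameters. A hedged sketch of assembling the same payload; the field names, template, and sampling values are copied from the logged request, while the helper itself and the way image hashes are passed are assumptions (the log only shows a summarised image list):

```python
# Hedged sketch of building the worker request shown above.
SYSTEM = (
    "A chat between a human and an AI that understands visuals. ..."
)  # truncated here; the full system prompt appears in the logged request

def build_payload(user_message: str, image_hash: str) -> dict:
    prompt = (
        f"{SYSTEM}<start_of_turn>user\n<image>\n{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )
    return {
        "model": "jadechoghari/Ferret-UI-Gemma2b",
        "prompt": prompt,
        "temperature": 0.2,
        "top_p": 0.7,
        "max_new_tokens": 512,
        "stop": "<eos>",
        "images": [image_hash],
    }

payload = build_payload("explain what you see", "8b23f327b90b6211049acd36e3f99975")
```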
2024-10-20 05:18:40 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 05:18:40 | INFO | stdout | eval.json file created successfully.
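Before spawning the inference subprocess the server writes an eval.json. A minimal sketch of what a single-example file of that kind could look like; only the file name is confirmed by the log, every field name below is an assumption:

```python
import json

# Hedged sketch: one evaluation record handed to the inference subprocess.
# The schema (id / image / conversations) is assumed, not taken from the log.
example = {
    "id": 0,
    "image": "8b23f327b90b6211049acd36e3f99975.png",
    "conversations": [{"from": "human", "value": "<image>\nexplain what you see"}],
}

with open("eval.json", "w", encoding="utf-8") as f:
    json.dump([example], f)
```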
2024-10-20 05:19:00 | INFO | stdout | Subprocess output:
2024-10-20 05:19:00 | INFO | stdout | 
2024-10-20 05:19:00 | INFO | stdout | Subprocess error (if any):
2024-10-20 05:19:00 | INFO | stdout | 
2024-10-20 05:19:00 | INFO | stdout | 
2024-10-20 05:19:00 | INFO | stdout | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-20 05:19:00 | INFO | stdout | 
2024-10-20 05:19:00 | INFO | stdout | Loading checkpoint shards:  50%|█████     | 1/2 [00:03<00:03,  3.33s/it]
2024-10-20 05:19:00 | INFO | stdout | Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.99s/it]
2024-10-20 05:19:00 | INFO | stdout | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', ..., 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight'] (identical vision_tower weight list to the one logged for the previous request above)
2024-10-20 05:19:00 | INFO | stdout | - This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
2024-10-20 05:19:00 | INFO | stdout | - This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
2024-10-20 05:19:00 | INFO | stdout | 
2024-10-20 05:19:00 | INFO | stdout | 
2024-10-20 05:19:00 | INFO | stdout |   0%|          | 0/1 [00:00<?, ?it/s]Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)
2024-10-20 05:19:00 | INFO | stdout | 
2024-10-20 05:19:00 | INFO | stdout | 
2024-10-20 05:19:00 | INFO | stdout | 100%|██████████| 1/1 [00:08<00:00,  8.86s/it]
2024-10-20 05:19:00 | INFO | stdout | 100%|██████████| 1/1 [00:08<00:00,  8.86s/it]
2024-10-20 05:19:00 | INFO | stdout | 
2024-10-20 05:19:00 | INFO | stdout | Inference completed. Output written to eval_output.jsonl
2024-10-20 05:19:00 | INFO | gradio_web_server | This is the response ['A chat between a human and an unknown entity. \n\nThe conversation starts with a message from Jackyline Herrera saying, "Ask Jackie to borrow her truck". The reply is, "Get gravel for bow, walk, 10, 1, 1, Shopping List". \n\nThe next message is from Get Gravel for the truck, and the reply is, "Buy mulch, #shoppinglist". \n\nThe third message is from Buy mulch for the garden, and the reply is, "Pick up succulents". \n\nThe fourth message is from Pick up succulents for the garden, and the reply is, "Buy soil for succulents". \n\nThe fifth message is from Buy soil for succulents, and the reply is, "Pick up soil for succulents". \n\nThe sixth message is from Pick up succulents for the garden, and the reply is, "Pick up soil for succulents". \n\nThe seventh message is from Pick up succulents for the garden, and the reply is, "Pick up soil for succulents". \n\nThe eighth message is from Pick up succulents for the garden, and the reply is, "Pick up soil for succulents". \n\nThe ninth message is from Pick up succulents for the garden, and the reply is, "Look up native vegetables along the fence". \n\nThe tenth message is from Shopping List, and the reply is, "Shopping List". \n\nThe message at the bottom is from Shopping List, and the reply is, "Look up native vegetables along the fence". \n\nThe message at the very bottom is from Shopping List, and the reply is, "Looking: Fran".']
2024-10-20 05:19:40 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=8dda3423d33f3a2c9c5f207ac0fae2b107e5c788e79c16e9f42e5b74557c1c96&fail=true "HTTP/1.1 404 Not Found"
2024-10-20 05:19:40 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fill_yield_queue DONE
2024-10-20 05:19:40 | ERROR | stderr | Traceback (most recent call last):
2024-10-20 05:19:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-20 05:19:40 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-20 05:19:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-20 05:19:40 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-20 05:19:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-20 05:19:40 | ERROR | stderr |     result = await self.call_function(
2024-10-20 05:19:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-20 05:19:40 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-20 05:19:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-20 05:19:40 | ERROR | stderr |     return await anext(iterator)
2024-10-20 05:19:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-20 05:19:40 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-20 05:19:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-20 05:19:40 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-20 05:19:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-20 05:19:40 | ERROR | stderr |     return await future
2024-10-20 05:19:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-20 05:19:40 | ERROR | stderr |     result = context.run(func, *args)
2024-10-20 05:19:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-20 05:19:40 | ERROR | stderr |     return next(iterator)
2024-10-20 05:19:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-20 05:19:40 | ERROR | stderr |     response = next(iterator)
2024-10-20 05:19:40 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 354, in gradio_handler
2024-10-20 05:19:40 | ERROR | stderr |     raise gr.Error("GPU task aborted")
2024-10-20 05:19:40 | ERROR | stderr | gradio.exceptions.Error: 'GPU task aborted'
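This second failure differs from the first: the release call at 05:19:40 returns 404 (the allow token has already expired) and the ZeroGPU wrapper raises "GPU task aborted", which is what happens when the decorated function runs longer than its allotted GPU slot. A hedged sketch of requesting a longer slot with the `spaces` decorator; the 120-second figure and the function body are illustrative, not the app's actual handler:

```python
import spaces  # Hugging Face Spaces ZeroGPU helper

# Hedged sketch: ask for a longer ZeroGPU allocation so that a slow request
# (checkpoint loading plus generation exceeded the default window above)
# is not aborted mid-run. The duration value is illustrative.
@spaces.GPU(duration=120)
def generate(prompt: str) -> str:
    # ... load the model and run inference on the allocated GPU ...
    return "model output"
```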
2024-10-20 05:21:23 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140613572986048&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6ImphZGVjaG9naGFyaVtwcm9dIiwidXVpZCI6bnVsbCwiZXhwIjoxNzI5Mzk0NTQzfQ.GUsmgrRDZzgh_X59qnJ1uLM9qwL2SMAIoVss9lZGkeY "HTTP/1.1 200 OK"
2024-10-20 05:21:23 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=18
2024-10-20 05:21:23 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=23
2024-10-20 05:21:23 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[14, 16, 10, 13]
2024-10-20 05:21:23 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=7820d07ad45c827fd58dad5be3f79c3d5b21f66198d45f74628d46721a4f174d&pid=17406 "HTTP/1.1 200 OK"
2024-10-20 05:21:24 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-20 05:21:24 | INFO | stdout | conv mode to gemma
2024-10-20 05:21:24 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 05:21:24 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 05:21:24 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe shortly what do you see<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
2024-10-20 05:21:24 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 05:21:24 | INFO | stdout | eval.json file created successfully.
2024-10-20 05:21:47 | INFO | stdout | Subprocess output:
2024-10-20 05:21:47 | INFO | stdout | 
2024-10-20 05:21:47 | INFO | stdout | Subprocess error (if any):
2024-10-20 05:21:47 | INFO | stdout | 
2024-10-20 05:21:47 | INFO | stdout | 
2024-10-20 05:21:47 | INFO | stdout | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-20 05:21:47 | INFO | stdout | 
2024-10-20 05:21:47 | INFO | stdout | Loading checkpoint shards:  50%|█████     | 1/2 [00:03<00:03,  3.19s/it]
2024-10-20 05:21:47 | INFO | stdout | Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.89s/it]
2024-10-20 05:21:47 | INFO | stdout | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
2024-10-20 05:21:47 | INFO | stdout | - This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
2024-10-20 05:21:47 | INFO | stdout | - This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
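Annotation: the unused keys listed above all sit under `model.vision_tower.*`, i.e. the first ("IS expected") case in the notice. If that notice is considered benign and is just adding noise to the Space logs, one option is to lower the transformers logging verbosity before loading. A minimal sketch, assuming the warning really is harmless in this setup:

```python
# Sketch: quiet the "Some weights ... were not used" notice from transformers.
# Assumes the skipped vision-tower keys are expected for FerretGemmaForCausalLM,
# as the notice above suggests.
import transformers

transformers.logging.set_verbosity_error()  # hide INFO/WARNING output during from_pretrained
```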
2024-10-20 05:21:47 | INFO | stdout |   0%|          | 0/1 [00:00<?, ?it/s]Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)
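Annotation: the v4.46 note above means `outputs.logits` can now come back in the model's half-precision dtype at inference time. Any post-processing that assumes FP32 can cast explicitly; a minimal sketch:

```python
# Sketch: make downstream code robust to half-precision logits (transformers >= 4.46).
import torch

def to_probs(logits: torch.Tensor) -> torch.Tensor:
    # Cast to float32 before softmax so bf16/fp16 logits do not lose accuracy.
    return torch.softmax(logits.float(), dim=-1)
```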
2024-10-20 05:21:47 | INFO | stdout | 100%|██████████| 1/1 [00:12<00:00, 12.29s/it]
2024-10-20 05:21:47 | INFO | stdout | Inference completed. Output written to eval_output.jsonl
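Annotation: the worker writes its results to `eval_output.jsonl`, one JSON object per line. A minimal reader (field names depend on the app's writer, so none are assumed here):

```python
# Sketch: load the JSONL file produced by the inference worker.
import json

with open("eval_output.jsonl") as f:
    records = [json.loads(line) for line in f if line.strip()]

print(f"{len(records)} record(s)")  # inspect records[0].keys() to see the actual schema
```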
2024-10-20 05:21:47 | INFO | gradio_web_server | This is the respone ['The mobile screen appears to be a task management or to-do list app. At the top of the screen, there\'s a selected checkbox with the text "Gardenina, Ask JackieLynn to borrow her truck, checkmark, • Messaging: Jackyra, • Shopping: Jackyra, Get grave for water, get gravel, get gravel for snow, get mud, get mud for sale, Pick up soil, #shoppinglist, Buy mulch, #shoppinglist, Plan up succulents, #shoppinglist, Buy soil, #shoppinglist, Pick up soil for garden, #shoppinglist, Pick up soil for garden, #shoppinglist, Pick up succulents, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up succulents for garden, #shoppinglist, Plan up']
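Annotation: the response above degenerates into the same "#shoppinglist" item repeated until the 512-token budget is exhausted. If that loop is unwanted, the standard `generate()` anti-repetition knobs can be passed alongside the temperature/top_p already in the request. A hedged sketch, with illustrative values and the `images` keyword assumed from LLaVA-style code rather than taken from this app:

```python
# Sketch: generation settings that discourage verbatim loops (values illustrative).
# `model`, `input_ids`, `image_tensor` are whatever the worker already builds for generate().
def generate_with_antirepeat(model, input_ids, image_tensor):
    return model.generate(
        input_ids,
        images=image_tensor,      # assumption: Ferret's generate takes an `images` kwarg, LLaVA-style
        do_sample=True,
        temperature=0.2,
        top_p=0.7,
        max_new_tokens=512,
        repetition_penalty=1.15,  # >1.0 penalizes tokens that have already appeared
        no_repeat_ngram_size=6,   # forbid exact 6-gram repeats
    )
```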
2024-10-20 05:22:26 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=7820d07ad45c827fd58dad5be3f79c3d5b21f66198d45f74628d46721a4f174d&fail=true "HTTP/1.1 404 Not Found"
2024-10-20 05:22:26 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fill_yield_queue DONE
2024-10-20 05:22:26 | ERROR | stderr | Traceback (most recent call last):
2024-10-20 05:22:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-20 05:22:26 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-20 05:22:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-20 05:22:26 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-20 05:22:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-20 05:22:26 | ERROR | stderr |     result = await self.call_function(
2024-10-20 05:22:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-20 05:22:26 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-20 05:22:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-20 05:22:26 | ERROR | stderr |     return await anext(iterator)
2024-10-20 05:22:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-20 05:22:26 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-20 05:22:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-20 05:22:26 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-20 05:22:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-20 05:22:26 | ERROR | stderr |     return await future
2024-10-20 05:22:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-20 05:22:26 | ERROR | stderr |     result = context.run(func, *args)
2024-10-20 05:22:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-20 05:22:26 | ERROR | stderr |     return next(iterator)
2024-10-20 05:22:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-20 05:22:26 | ERROR | stderr |     response = next(iterator)
2024-10-20 05:22:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 354, in gradio_handler
2024-10-20 05:22:26 | ERROR | stderr |     raise gr.Error("GPU task aborted")
2024-10-20 05:22:26 | ERROR | stderr | gradio.exceptions.Error: 'GPU task aborted'
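Annotation: "GPU task aborted" is the ZeroGPU wrapper cancelling the handler; the `release?...fail=true` call a few lines up is the GPU allocation being handed back as failed. Given that a single sample already takes ~12 s on top of model loading, one mitigation is to request a longer GPU window on the decorated worker. A minimal sketch using the documented `spaces` decorator; the function body is a placeholder, not the app's real worker:

```python
# Sketch: ask ZeroGPU for a longer window so model loading plus inference
# is not cut off mid-run. `duration` is in seconds.
import spaces

@spaces.GPU(duration=120)
def run_inference(*args, **kwargs):
    ...  # placeholder for the actual load + generate logic
```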
2024-10-20 17:04:50 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-20 17:04:50 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-20 17:04:50 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-20 17:04:50 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-20 17:04:50 | ERROR | stderr |   warnings.warn(
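Annotation: the deprecation above means the Chatbot component is still being built in the default tuple format. Moving to the messages format is a constructor change plus openai-style history entries; a minimal sketch:

```python
# Sketch: Chatbot in the non-deprecated "messages" format.
import gradio as gr

chatbot = gr.Chatbot(type="messages")  # role/content dicts instead of (user, bot) tuples
history = [
    {"role": "user", "content": "describe the image in details"},
    {"role": "assistant", "content": "..."},
]
```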
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-20 17:04:50 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-20 17:04:50 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-20 17:04:50 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-20 17:04:50 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
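Annotation: the three lines above are Gradio's SSR bootstrap failing to find a usable Node runtime on the single port it tried. The environment variables it names can be set before `launch()`; a sketch with placeholder values (the Node path and port are not taken from this Space):

```python
# Sketch: point Gradio's SSR server at a Node executable and a free port
# before launching the app. Path and port below are placeholders.
import os

os.environ["GRADIO_NODE_PATH"] = "/usr/local/bin/node"  # placeholder path to a Node 20+ binary
os.environ["GRADIO_NODE_PORT"] = "7862"                 # placeholder port outside the 7861-7861 range
```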
2024-10-20 17:04:50 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-20 17:04:50 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-20 17:04:50 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-20 17:04:50 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-20 17:04:50 | ERROR | stderr |   warnings.warn(
2024-10-20 17:04:50 | INFO | stdout | 
2024-10-20 17:04:50 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-20 17:05:08 | INFO | stdout | conv mode to gemma
2024-10-20 17:05:08 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 17:05:08 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 17:05:08 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\nexplain what you see in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
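Annotation: the request above shows how the Gemma conversation template is flattened into a single prompt string: the system text, a `<start_of_turn>user ... <end_of_turn>` turn containing the `<image>` token, and an opened model turn, with `<eos>` as the stop string. A hedged reconstruction of that assembly, read off the logged prompt rather than the app's source:

```python
# Sketch: rebuild the logged Gemma-style prompt from its parts.
SYSTEM = "A chat between a human and an AI that understands visuals. ..."  # abbreviated; full text is in the request above

def build_prompt(system: str, user_msg: str) -> str:
    return (
        f"{system}"
        f"<start_of_turn>user\n<image>\n{user_msg}<end_of_turn>\n"
        f"<start_of_turn>model\n"
    )

prompt = build_prompt(SYSTEM, "explain what you see in details")
stop = "<eos>"
```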
2024-10-20 17:05:08 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 17:05:08 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140555119614928&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NDM2NzY4fQ.EuAW-hG-wXH2Sk2bugvTjvC4pQyyUeJIvn9L7o4vUdI "HTTP/1.1 200 OK"
2024-10-20 17:05:09 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=19
2024-10-20 17:05:09 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=21
2024-10-20 17:05:09 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[13, 16, 15, 10, 17, 14]
2024-10-20 17:05:10 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=b99df8a4dc1b6f58dc9e8005a8a2f06e49bf245c40bcb89919e310f4b5346361&pid=124923 "HTTP/1.1 200 OK"
2024-10-20 17:05:11 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-20 17:05:11 | ERROR | stderr | Traceback (most recent call last):
2024-10-20 17:05:11 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-20 17:05:11 | ERROR | stderr |     res = future.result()
2024-10-20 17:05:11 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-20 17:05:11 | ERROR | stderr |     return self.__get_result()
2024-10-20 17:05:11 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-20 17:05:11 | ERROR | stderr |     raise self._exception
2024-10-20 17:05:11 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-20 17:05:11 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-20 17:05:11 | ERROR | stderr |   File "/home/user/app/cli.py", line 40, in run_inference
2024-10-20 17:05:11 | ERROR | stderr |     tokenizer, model, image_processor, context_len = load_pretrained_model(
2024-10-20 17:05:11 | ERROR | stderr | TypeError: load_pretrained_model() got an unexpected keyword argument 'device'
2024-10-20 17:05:11 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=b99df8a4dc1b6f58dc9e8005a8a2f06e49bf245c40bcb89919e310f4b5346361&fail=true "HTTP/1.1 200 OK"
2024-10-20 17:05:11 | ERROR | stderr | Traceback (most recent call last):
2024-10-20 17:05:11 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-20 17:05:11 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-20 17:05:11 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-20 17:05:11 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-20 17:05:11 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-20 17:05:11 | ERROR | stderr |     result = await self.call_function(
2024-10-20 17:05:11 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-20 17:05:11 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-20 17:05:11 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-20 17:05:11 | ERROR | stderr |     return await anext(iterator)
2024-10-20 17:05:11 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-20 17:05:11 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-20 17:05:11 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-20 17:05:11 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-20 17:05:11 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-20 17:05:11 | ERROR | stderr |     return await future
2024-10-20 17:05:11 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-20 17:05:11 | ERROR | stderr |     result = context.run(func, *args)
2024-10-20 17:05:11 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-20 17:05:11 | ERROR | stderr |     return next(iterator)
2024-10-20 17:05:11 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-20 17:05:11 | ERROR | stderr |     response = next(iterator)
2024-10-20 17:05:11 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-20 17:05:11 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-20 17:05:11 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-20 17:05:11 | ERROR | stderr |     raise res.value
2024-10-20 17:05:11 | ERROR | stderr | TypeError: load_pretrained_model() got an unexpected keyword argument 'device'
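Annotation: both tracebacks come from the same call: `cli.py` passes a `device` keyword that this version of `load_pretrained_model()` does not declare. Either the call site drops/renames the keyword or the loader grows the parameter. A defensive sketch that works without assuming the loader's exact signature (the extra kwargs shown are illustrative):

```python
# Sketch: only forward keyword arguments that load_pretrained_model actually declares.
import inspect

def call_loader(load_pretrained_model, *args, **kwargs):
    params = inspect.signature(load_pretrained_model).parameters
    accepts_var_kw = any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())
    if not accepts_var_kw:
        kwargs = {k: v for k, v in kwargs.items() if k in params}  # silently drops e.g. an unsupported `device`
    return load_pretrained_model(*args, **kwargs)
```

Usage would be `call_loader(load_pretrained_model, model_path, device="cuda", ...)`; the filter removes whichever keywords the installed loader does not accept.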
2024-10-20 17:06:18 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-20 17:06:18 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-20 17:06:18 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-20 17:06:18 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-20 17:06:18 | ERROR | stderr |   warnings.warn(
2024-10-20 17:06:18 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-20 17:06:18 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-20 17:06:18 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-20 17:06:18 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-20 17:06:18 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-20 17:06:18 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-20 17:06:18 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-20 17:06:18 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-20 17:06:18 | ERROR | stderr |   warnings.warn(
2024-10-20 17:06:18 | INFO | stdout | 
2024-10-20 17:06:18 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-20 17:06:28 | INFO | stdout | conv mode to gemma
2024-10-20 17:06:28 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 17:06:28 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 17:06:28 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe the image in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
2024-10-20 17:06:28 | INFO | stdout | Input Image Size:(400, 433)
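Annotation: note the mismatch visible in the lines above: the uploaded image is 400x433 px, while the prompt tells the model to reason in a 1000x1000 coordinate frame. Any points or boxes the model returns therefore need rescaling before they are drawn on the original image; a minimal sketch:

```python
# Sketch: map an [x1, y1, x2, y2] box from the prompt's 1000x1000 frame
# back to pixel coordinates of the real image (here 400x433).
def box_to_pixels(box, image_w=400, image_h=433, frame=1000):
    x1, y1, x2, y2 = box
    sx, sy = image_w / frame, image_h / frame
    return [x1 * sx, y1 * sy, x2 * sx, y2 * sy]

print(box_to_pixels([250, 500, 750, 900]))  # -> [100.0, 216.5, 300.0, 389.7]
```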
2024-10-20 17:06:28 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=139933519567824&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NDM2ODQ4fQ.lOIguOxq-qAkWuUmWFXEQ5uRobpRk8ITz6YfsKg1Jz8 "HTTP/1.1 200 OK"
2024-10-20 17:06:39 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=14
2024-10-20 17:06:39 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=18
2024-10-20 17:06:39 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[15, 17, 10]
2024-10-20 17:06:41 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=eebcff088cba1bb812d21b25bf8992a7ba76ffa043e56e0dacd24db4b6b0b4b2&pid=125383 "HTTP/1.1 200 OK"
2024-10-20 17:06:42 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-20 17:06:44 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
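Annotation: the accelerate note above says 90% of GPU 0 is reserved for weights and 10% for buffer. If sharded loading needs a different budget, a `max_memory` mapping can be passed through `from_pretrained`. A hedged sketch: the memory figures are illustrative, and whether this checkpoint loads through `AutoModelForCausalLM` with `trust_remote_code` is an assumption rather than something confirmed by this log:

```python
# Sketch: explicit per-device memory budget for sharded loading (figures illustrative).
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    device_map="auto",
    max_memory={0: "20GiB", "cpu": "30GiB"},  # raise at your own risk, per the accelerate notice
    trust_remote_code=True,                   # assumption: the repo ships custom FerretGemma code
)
```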
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards:  50%|█████     | 1/2 [00:03<00:03,  3.22s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:04<00:00,  2.02s/it]
2024-10-20 17:06:48 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
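Note on the warning above: in LLaVA-style checkpoints the CLIP vision tower is normally re-created and populated outside the main `from_pretrained` call, so skipping the `model.vision_tower.*` keys at init is usually benign. A minimal sketch (assuming the repo's remote code may be trusted; this is not the Space's own loading path) for confirming that only vision-tower keys are affected:

```python
from transformers import AutoModelForCausalLM

# Hypothetical check: reload with output_loading_info=True and inspect which
# checkpoint keys were left unused. Assumes trust_remote_code is acceptable here.
model, loading_info = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,
    output_loading_info=True,
)

skipped = loading_info["unexpected_keys"]
non_vision = [k for k in skipped if not k.startswith("model.vision_tower.")]
print(f"{len(skipped)} unused keys, {len(non_vision)} outside the vision tower")
# An empty non_vision list means the warning only covers the separately handled CLIP tower.
```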
2024-10-20 17:06:50 | ERROR | stderr | Traceback (most recent call last):
2024-10-20 17:06:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-20 17:06:50 | ERROR | stderr |     res = future.result()
2024-10-20 17:06:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-20 17:06:50 | ERROR | stderr |     return self.__get_result()
2024-10-20 17:06:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-20 17:06:50 | ERROR | stderr |     raise self._exception
2024-10-20 17:06:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-20 17:06:50 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-20 17:06:50 | ERROR | stderr |   File "/home/user/app/cli.py", line 78, in run_inference
2024-10-20 17:06:50 | ERROR | stderr |     image_tensor = process_images([image], image_processor, model.config)
2024-10-20 17:06:50 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 189, in process_images
2024-10-20 17:06:50 | ERROR | stderr |     image = process_anyres_image(image, image_processor, model_cfg.image_grid_pinpoints, image_process_func=image_process_func)
2024-10-20 17:06:50 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 154, in process_anyres_image
2024-10-20 17:06:50 | ERROR | stderr |     image_patches = [processor.preprocess(image_patch, return_tensors='pt')['pixel_values'][0]
2024-10-20 17:06:50 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 154, in <listcomp>
2024-10-20 17:06:50 | ERROR | stderr |     image_patches = [processor.preprocess(image_patch, return_tensors='pt')['pixel_values'][0]
2024-10-20 17:06:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/transformers/models/clip/image_processing_clip.py", line 325, in preprocess
2024-10-20 17:06:50 | ERROR | stderr |     image = self.resize(image=image, size=size, resample=resample, input_data_format=input_data_format)
2024-10-20 17:06:50 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/clip_encoder.py", line 46, in resize
2024-10-20 17:06:50 | ERROR | stderr |     output_size = get_resize_output_image_size(image, size=(size["height"], size["width"]), default_to_square=True)
2024-10-20 17:06:50 | ERROR | stderr | KeyError: 'height'
2024-10-20 17:06:50 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=eebcff088cba1bb812d21b25bf8992a7ba76ffa043e56e0dacd24db4b6b0b4b2&fail=true "HTTP/1.1 200 OK"
2024-10-20 17:06:50 | ERROR | stderr | Traceback (most recent call last):
2024-10-20 17:06:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-20 17:06:50 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-20 17:06:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-20 17:06:50 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-20 17:06:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-20 17:06:50 | ERROR | stderr |     result = await self.call_function(
2024-10-20 17:06:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-20 17:06:50 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-20 17:06:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-20 17:06:50 | ERROR | stderr |     return await anext(iterator)
2024-10-20 17:06:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-20 17:06:50 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-20 17:06:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-20 17:06:50 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-20 17:06:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-20 17:06:50 | ERROR | stderr |     return await future
2024-10-20 17:06:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-20 17:06:50 | ERROR | stderr |     result = context.run(func, *args)
2024-10-20 17:06:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-20 17:06:50 | ERROR | stderr |     return next(iterator)
2024-10-20 17:06:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-20 17:06:50 | ERROR | stderr |     response = next(iterator)
2024-10-20 17:06:50 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-20 17:06:50 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-20 17:06:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-20 17:06:50 | ERROR | stderr |     raise res.value
2024-10-20 17:06:50 | ERROR | stderr | KeyError: 'height'
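The two tracebacks above fail at `size["height"]` inside the custom `resize` in clip_encoder.py: a transformers CLIP image processor's `size` dict can be either `{"height": H, "width": W}` or `{"shortest_edge": S}`, and only the first form has the key being indexed. The repo's actual fix is not visible in this log; a generic, defensive sketch of reading either form:

```python
# Illustrative only: a defensive way to read a transformers image-processor
# `size` dict, which may be {"height": H, "width": W} or {"shortest_edge": S}.
# The KeyError above comes from assuming the first form unconditionally.
def resolve_target_size(size: dict) -> tuple[int, int]:
    if "height" in size and "width" in size:
        return size["height"], size["width"]
    if "shortest_edge" in size:
        edge = size["shortest_edge"]
        return edge, edge  # square target, consistent with default_to_square=True
    raise ValueError(f"unsupported size dict: {size}")

# e.g. resolve_target_size({"shortest_edge": 336}) -> (336, 336)
```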
2024-10-20 17:57:50 | INFO | stdout | conv mode to gemma
2024-10-20 17:57:50 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 17:57:50 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 17:57:50 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe the image in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
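The `prompt` field above shows the format the web server sends for the Gemma-based Ferret-UI worker: the coordinate-aware system text followed by `<start_of_turn>`/`<end_of_turn>` markers and an `<image>` placeholder. A small illustrative helper (the function name is made up; only the format itself comes from this request log) that reproduces the same string:

```python
SYSTEM = (
    "A chat between a human and an AI that understands visuals. In images, "
    "[x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. "
    "Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. "
    "Image size: 1000x1000. Follow instructions."
)

def build_gemma_prompt(user_message: str) -> str:
    # Mirrors the 'prompt' field of the request log: system text, Gemma turn
    # markers, and the <image> placeholder for the uploaded screenshot.
    return (
        f"{SYSTEM}<start_of_turn>user\n<image>\n{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

print(build_gemma_prompt("describe the image in details"))
```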
2024-10-20 17:57:50 | INFO | stdout | Input Image Size:(400, 433)
2024-10-20 17:57:50 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=139933519567824&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NDM5OTMwfQ.IRQ7BAi6iGhKMiMxNEByAR74e_5B40KV0WuP_xa-Bsg "HTTP/1.1 429 Too Many Requests"
2024-10-20 17:57:50 | ERROR | stderr | Traceback (most recent call last):
2024-10-20 17:57:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-20 17:57:50 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-20 17:57:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-20 17:57:50 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-20 17:57:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-20 17:57:50 | ERROR | stderr |     result = await self.call_function(
2024-10-20 17:57:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-20 17:57:50 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-20 17:57:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-20 17:57:50 | ERROR | stderr |     return await anext(iterator)
2024-10-20 17:57:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-20 17:57:50 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-20 17:57:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-20 17:57:50 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-20 17:57:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-20 17:57:50 | ERROR | stderr |     return await future
2024-10-20 17:57:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-20 17:57:50 | ERROR | stderr |     result = context.run(func, *args)
2024-10-20 17:57:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-20 17:57:50 | ERROR | stderr |     return next(iterator)
2024-10-20 17:57:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-20 17:57:50 | ERROR | stderr |     response = next(iterator)
2024-10-20 17:57:50 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-20 17:57:50 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-20 17:57:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 184, in gradio_handler
2024-10-20 17:57:50 | ERROR | stderr |     schedule_response = client.schedule(task_id=task_id, request=request, duration=duration_)
2024-10-20 17:57:50 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/client.py", line 139, in schedule
2024-10-20 17:57:50 | ERROR | stderr |     raise HTMLError(html_string(message_html, message_text))
2024-10-20 17:57:50 | ERROR | stderr | spaces.zero.gradio.HTMLError: You have exceeded your GPU quota (60s requested vs. 53s left). [Create a free account](https://huggingface.co/join) to get more usage quota.
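The 429 response and the HTMLError above come from ZeroGPU scheduling: the handler asks for a 60 s GPU slot while the visitor has only 53 s of quota left. One common mitigation (a sketch, not the Space's actual code; cli.py's real signature is not shown in this log) is to request a shorter slot via the `spaces.GPU` decorator:

```python
import spaces

@spaces.GPU(duration=30)  # ask for a 30 s slot instead of the default 60 s
def run_inference(image, prompt):  # illustrative signature only
    ...
```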
2024-10-21 01:14:36 | INFO | stdout | conv mode to gemma
2024-10-21 01:14:36 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:14:36 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:14:36 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe what you see<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
2024-10-21 01:14:36 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:14:36 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=139933519567824&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NDY2MTM2fQ.2g10JJWV-78W5nW60LzHAXzreanpi7wtkU6Ma64zQKw "HTTP/1.1 200 OK"
2024-10-21 01:14:36 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=18
2024-10-21 01:14:36 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=21
2024-10-21 01:14:36 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[15, 14, 16, 19, 10]
2024-10-21 01:14:37 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=d985548646b9805475ffa24e86374fab10acf32d4e17cf486f79a486c77bf9ec&pid=160207 "HTTP/1.1 200 OK"
2024-10-21 01:14:38 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 01:14:39 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
2024-10-21 01:14:39 | ERROR | stderr | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-21 01:14:43 | ERROR | stderr | Loading checkpoint shards:  50%|█████     | 1/2 [00:03<00:03,  3.39s/it]
2024-10-21 01:14:44 | ERROR | stderr | Loading checkpoint shards: 100%|██████████| 2/2 [00:04<00:00,  2.14s/it]
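Regarding the accelerate message above about the 90%/10% memory split: `max_memory` can be passed through `from_pretrained` together with `device_map="auto"`. A sketch with placeholder budgets (the values are illustrative, not measured requirements of this model):

```python
from transformers import AutoModelForCausalLM

# Placeholder budgets only; adjust to the GPU/CPU memory actually available to the Space.
model = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,
    device_map="auto",
    max_memory={0: "38GiB", "cpu": "64GiB"},
)
```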
2024-10-21 01:14:44 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
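The warning above lists only 'model.vision_tower.vision_tower.vision_model.*' keys, which suggests the CLIP vision-tower weights in the checkpoint are deliberately skipped by FerretGemmaForCausalLM and handled through a separate vision-tower loading path. A minimal sketch for inspecting exactly which keys were skipped, assuming the checkpoint id from this log and that the custom Ferret classes are loaded from the Hub with trust_remote_code:

    # Sketch only: ask from_pretrained for its loading report instead of parsing the warning.
    from transformers import AutoModelForCausalLM

    model, loading_info = AutoModelForCausalLM.from_pretrained(
        "jadechoghari/Ferret-UI-Gemma2b",
        trust_remote_code=True,
        output_loading_info=True,   # return (model, dict of missing/unexpected/mismatched keys)
    )
    # The "were not used when initializing" names above should appear here:
    print(len(loading_info["unexpected_keys"]), "unexpected keys")
    print(len(loading_info["missing_keys"]), "missing keys")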
2024-10-21 01:14:46 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:14:46 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 01:14:46 | ERROR | stderr |     res = future.result()
2024-10-21 01:14:46 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 01:14:46 | ERROR | stderr |     return self.__get_result()
2024-10-21 01:14:46 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 01:14:46 | ERROR | stderr |     raise self._exception
2024-10-21 01:14:46 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 01:14:46 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 01:14:46 | ERROR | stderr |   File "/home/user/app/cli.py", line 78, in run_inference
2024-10-21 01:14:46 | ERROR | stderr |     image_tensor = process_images([image], image_processor, model.config)
2024-10-21 01:14:46 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 189, in process_images
2024-10-21 01:14:46 | ERROR | stderr |     image = process_anyres_image(image, image_processor, model_cfg.image_grid_pinpoints, image_process_func=image_process_func)
2024-10-21 01:14:46 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 154, in process_anyres_image
2024-10-21 01:14:46 | ERROR | stderr |     image_patches = [processor.preprocess(image_patch, return_tensors='pt')['pixel_values'][0]
2024-10-21 01:14:46 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 154, in <listcomp>
2024-10-21 01:14:46 | ERROR | stderr |     image_patches = [processor.preprocess(image_patch, return_tensors='pt')['pixel_values'][0]
2024-10-21 01:14:46 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/transformers/models/clip/image_processing_clip.py", line 325, in preprocess
2024-10-21 01:14:46 | ERROR | stderr |     image = self.resize(image=image, size=size, resample=resample, input_data_format=input_data_format)
2024-10-21 01:14:46 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/clip_encoder.py", line 46, in resize
2024-10-21 01:14:46 | ERROR | stderr |     output_size = get_resize_output_image_size(image, size=(size["height"], size["width"]), default_to_square=True)
2024-10-21 01:14:46 | ERROR | stderr | KeyError: 'height'
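The KeyError is raised by the custom resize() in clip_encoder.py when it indexes size["height"] and size["width"], while CLIP image processors commonly describe their target size as {"shortest_edge": 224} (the height/width pair lives under crop_size). A defensive lookup along these lines would likely avoid the crash; the helper below is a hypothetical sketch, not code from the Space:

    # Hypothetical guard for clip_encoder.py's resize(): accept either a
    # {"height": H, "width": W} dict or CLIP's {"shortest_edge": S} form.
    def _resolve_hw(size):
        if "height" in size and "width" in size:
            return size["height"], size["width"]
        if "shortest_edge" in size:
            s = size["shortest_edge"]
            return s, s
        raise ValueError(f"Unsupported size spec: {size}")

    # inside resize(...):
    # h, w = _resolve_hw(size)
    # output_size = get_resize_output_image_size(image, size=(h, w), default_to_square=True)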
2024-10-21 01:14:47 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=d985548646b9805475ffa24e86374fab10acf32d4e17cf486f79a486c77bf9ec&fail=true "HTTP/1.1 200 OK"
2024-10-21 01:14:47 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:14:47 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 01:14:47 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 01:14:47 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 01:14:47 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 01:14:47 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 01:14:47 | ERROR | stderr |     result = await self.call_function(
2024-10-21 01:14:47 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 01:14:47 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 01:14:47 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 01:14:47 | ERROR | stderr |     return await anext(iterator)
2024-10-21 01:14:47 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 01:14:47 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 01:14:47 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 01:14:47 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 01:14:47 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 01:14:47 | ERROR | stderr |     return await future
2024-10-21 01:14:47 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 01:14:47 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 01:14:47 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 01:14:47 | ERROR | stderr |     return next(iterator)
2024-10-21 01:14:47 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 01:14:47 | ERROR | stderr |     response = next(iterator)
2024-10-21 01:14:47 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 01:14:47 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 01:14:47 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 01:14:47 | ERROR | stderr |     raise res.value
2024-10-21 01:14:47 | ERROR | stderr | KeyError: 'height'
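The ZeroGPU wrapper re-raises the worker exception (raise res.value) inside the Gradio queue, so the failure surfaces as an unhandled traceback rather than a chat reply. One option is to guard the inference call in http_bot; the sketch below is an assumption about app.py's structure, not its actual code:

    # Hedged sketch, not the Space's actual code: guard run_inference in app.py's
    # http_bot so a preprocessing failure (like the KeyError above) becomes a
    # visible chat message instead of an unhandled exception.
    try:
        extracted_texts = run_inference(image, prompt)        # argument names assumed
    except Exception as err:
        state.messages[-1][-1] = f"Inference failed: {err}"   # message layout assumed
        yield state, state.to_gradio_chatbot()                # helper name assumed
        return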
2024-10-21 01:24:49 | INFO | stdout | conv mode to gemma
2024-10-21 01:24:49 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:24:49 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:24:49 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe what you see<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
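The logged request shows the uploaded image is 400x433 pixels while the prompt expresses all points and boxes on a fixed 1000x1000 grid, so coordinates sent to or returned by the model have to be rescaled. An illustrative conversion, with function names and the example box invented for this sketch:

    # Illustrative only: map a pixel-space box on the 400x433 input to the
    # 1000x1000 coordinate frame used in the prompt, and back. A stricter mapping
    # would scale by (frame - 1) since the prompt puts the bottom-right corner at
    # [width-1, height-1]; the simple ratio is close enough for a sketch.
    def to_model_frame(box, img_w=400, img_h=433, frame=1000):
        x1, y1, x2, y2 = box
        return [round(x1 * frame / img_w), round(y1 * frame / img_h),
                round(x2 * frame / img_w), round(y2 * frame / img_h)]

    def to_pixel_frame(box, img_w=400, img_h=433, frame=1000):
        x1, y1, x2, y2 = box
        return [round(x1 * img_w / frame), round(y1 * img_h / frame),
                round(x2 * img_w / frame), round(y2 * img_h / frame)]

    print(to_model_frame([40, 86, 360, 390]))   # -> [100, 199, 900, 901]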
2024-10-21 01:24:49 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:24:49 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=139933519567824&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NDY2NzQ5fQ.2ssJsOSfN0phT1hH0osxzbe7wA0vxZ4SCE6qvsaSiz4 "HTTP/1.1 200 OK"
2024-10-21 01:24:49 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=19
2024-10-21 01:24:49 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=24
2024-10-21 01:24:49 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[13, 14, 15, 16, 10]
2024-10-21 01:24:50 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=67e50667573a8e380e596840d7a7666c00fb65d7edd8d5217a7b7ffe092cea7a&pid=162033 "HTTP/1.1 200 OK"
2024-10-21 01:24:51 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 01:24:52 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
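The accelerate message refers to the max_memory budget behind its 90%/10% split. When loading with a device_map, that budget can be raised per device; a minimal sketch using the model id from this log, with the 20GiB/30GiB figures as arbitrary examples rather than recommendations:

    # Sketch of raising the per-device memory budget accelerate mentions above.
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "jadechoghari/Ferret-UI-Gemma2b",
        trust_remote_code=True,
        device_map="auto",
        max_memory={0: "20GiB", "cpu": "30GiB"},  # per-device caps used by accelerate
    )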
2024-10-21 01:24:52 | ERROR | stderr | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-21 01:24:55 | ERROR | stderr | Loading checkpoint shards:  50%|█████     | 1/2 [00:03<00:03,  3.04s/it]
2024-10-21 01:24:56 | ERROR | stderr | Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.85s/it]
2024-10-21 01:24:56 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: [list of 'model.vision_tower.vision_tower.vision_model.*' keys identical to the previous warning, from 'embeddings.class_embedding' through 'pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
2024-10-21 01:24:58 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:24:58 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 01:24:58 | ERROR | stderr |     res = future.result()
2024-10-21 01:24:58 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 01:24:58 | ERROR | stderr |     return self.__get_result()
2024-10-21 01:24:58 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 01:24:58 | ERROR | stderr |     raise self._exception
2024-10-21 01:24:58 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 01:24:58 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 01:24:58 | ERROR | stderr |   File "/home/user/app/cli.py", line 78, in run_inference
2024-10-21 01:24:58 | ERROR | stderr |     image_tensor = process_images([image], image_processor, model.config)
2024-10-21 01:24:58 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 189, in process_images
2024-10-21 01:24:58 | ERROR | stderr |     image = process_anyres_image(image, image_processor, model_cfg.image_grid_pinpoints, image_process_func=image_process_func)
2024-10-21 01:24:58 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 154, in process_anyres_image
2024-10-21 01:24:58 | ERROR | stderr |     image_patches = [processor.preprocess(image_patch, return_tensors='pt')['pixel_values'][0]
2024-10-21 01:24:58 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 154, in <listcomp>
2024-10-21 01:24:58 | ERROR | stderr |     image_patches = [processor.preprocess(image_patch, return_tensors='pt')['pixel_values'][0]
2024-10-21 01:24:58 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/transformers/models/clip/image_processing_clip.py", line 325, in preprocess
2024-10-21 01:24:58 | ERROR | stderr |     image = self.resize(image=image, size=size, resample=resample, input_data_format=input_data_format)
2024-10-21 01:24:58 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/clip_encoder.py", line 46, in resize
2024-10-21 01:24:58 | ERROR | stderr |     output_size = get_resize_output_image_size(image, size=(size["height"], size["width"]), default_to_square=True)
2024-10-21 01:24:58 | ERROR | stderr | KeyError: 'height'
2024-10-21 01:24:58 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=67e50667573a8e380e596840d7a7666c00fb65d7edd8d5217a7b7ffe092cea7a&fail=true "HTTP/1.1 200 OK"
2024-10-21 01:24:58 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:24:58 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 01:24:58 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 01:24:58 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 01:24:58 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 01:24:58 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 01:24:58 | ERROR | stderr |     result = await self.call_function(
2024-10-21 01:24:58 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 01:24:58 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 01:24:58 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 01:24:58 | ERROR | stderr |     return await anext(iterator)
2024-10-21 01:24:58 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 01:24:58 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 01:24:58 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 01:24:58 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 01:24:58 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 01:24:58 | ERROR | stderr |     return await future
2024-10-21 01:24:58 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 01:24:58 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 01:24:58 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 01:24:58 | ERROR | stderr |     return next(iterator)
2024-10-21 01:24:58 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 01:24:58 | ERROR | stderr |     response = next(iterator)
2024-10-21 01:24:58 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 01:24:58 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 01:24:58 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 01:24:58 | ERROR | stderr |     raise res.value
2024-10-21 01:24:58 | ERROR | stderr | KeyError: 'height'
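The KeyError above comes from the checkpoint's custom resize() in clip_encoder.py indexing size["height"], while the CLIPImageProcessor bundled with this model evidently stores its size in the {"shortest_edge": N} form; which keys exist depends on the preprocessor_config.json shipped with the repo. A defensive helper like the sketch below (an illustration, not the Space's actual fix) accepts either convention before calling get_resize_output_image_size.

    # Sketch only: normalise the two size conventions CLIPImageProcessor may use,
    # so a custom resize() no longer assumes the {"height": ..., "width": ...} form.
    def resolve_target_size(size: dict) -> tuple:
        """Return (height, width) from either CLIP size convention."""
        if "height" in size and "width" in size:
            return size["height"], size["width"]
        if "shortest_edge" in size:            # the form that triggered KeyError: 'height'
            return size["shortest_edge"], size["shortest_edge"]
        raise ValueError(f"unsupported size spec: {size}")

    print(resolve_target_size({"shortest_edge": 224}))  # -> (224, 224)

An equivalent workaround would be to overwrite image_processor.size with an explicit height/width dict before process_images is called; the right resolution depends on the checkpoint's preprocessor config, so no value is hard-coded here.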
2024-10-21 01:27:12 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 01:27:12 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 01:27:12 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 01:27:12 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 01:27:12 | ERROR | stderr |   warnings.warn(
2024-10-21 01:27:12 | ERROR | stderr | ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 01:27:12 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 01:27:12 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 01:27:12 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 01:27:12 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
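The two environment variables named in the messages above can be pinned before the app launches; the sketch below sets them from Python with a placeholder Node path and a port outside the contested 7861 range (both values are assumptions, not taken from this Space).

    # Sketch: pre-seed the Gradio SSR Node settings mentioned in the log above.
    import os
    os.environ.setdefault("GRADIO_NODE_PATH", "/usr/local/bin/node")  # placeholder path to a Node >= 20 binary
    os.environ.setdefault("GRADIO_NODE_PORT", "7862")                 # placeholder port outside 7861

They would need to be in place before Gradio starts its SSR server for the settings to take effect.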
2024-10-21 01:27:12 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 01:27:12 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 01:27:12 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 01:27:12 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 01:27:12 | ERROR | stderr |   warnings.warn(
2024-10-21 01:27:12 | INFO | stdout | 
2024-10-21 01:27:12 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 01:27:39 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 01:27:39 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 01:27:39 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 01:27:39 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 01:27:39 | ERROR | stderr |   warnings.warn(
2024-10-21 01:27:39 | ERROR | stderr | ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 01:27:39 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 01:27:39 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 01:27:39 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-21 01:27:39 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 01:27:39 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 01:27:39 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 01:27:39 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 01:27:39 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 01:27:39 | ERROR | stderr |   warnings.warn(
2024-10-21 01:27:39 | INFO | stdout | 
2024-10-21 01:27:39 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 01:27:52 | INFO | stdout | conv mode to gemma
2024-10-21 01:27:52 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:27:52 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:27:52 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe what you see in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
2024-10-21 01:27:52 | INFO | stdout | Input Image Size:(400, 433)
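The request above illustrates the coordinate convention from the system prompt: regardless of the real 400x433 screenshot, every point and box the model reads or emits lives in a 1000x1000 reference frame. The helper below is a minimal sketch of that mapping; the exact scaling and rounding Ferret-UI uses internally are assumptions.

    # Sketch of the 1000x1000 coordinate frame referenced in the prompt above.
    def to_prompt_coords(x, y, img_w, img_h):
        """Map pixel coordinates to the model's [0, 999] reference frame."""
        return round(x / img_w * 999), round(y / img_h * 999)

    # Example: centre of the 400x433 screenshot logged above -> (500, 500)
    print(to_prompt_coords(200, 216.5, 400, 433))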
2024-10-21 01:27:52 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140484646902736&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NDY2OTMyfQ.XfZ2mQRJU-hWhsJbLaGPgawhklB0rySqTHX0o5tbuA4 "HTTP/1.1 200 OK"
2024-10-21 01:27:52 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=18
2024-10-21 01:27:52 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=20
2024-10-21 01:27:52 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[16, 14, 13, 10]
2024-10-21 01:27:52 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=f51e510aa3aee8f0ee7636868ee07d841e5d18ed79f61cecb119fe9293bda42b&pid=163170 "HTTP/1.1 200 OK"
2024-10-21 01:27:54 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 01:27:55 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
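As the accelerate message above notes, the default 90/10 split can be overridden by passing max_memory to from_pretrained; the sketch below shows the knob with placeholder figures (the model class, the memory sizes, and trust_remote_code are assumptions, not values used by this Space).

    # Sketch: raising the per-device cap that accelerate mentions above. Figures are placeholders.
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "jadechoghari/Ferret-UI-Gemma2b",
        device_map="auto",
        max_memory={0: "20GiB", "cpu": "48GiB"},  # per-device caps, at your own risk
        trust_remote_code=True,                   # assumption: checkpoint uses custom code
    )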
2024-10-21 01:27:55 | ERROR | stderr | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-21 01:27:58 | ERROR | stderr | Loading checkpoint shards:  50%|█████     | 1/2 [00:02<00:02,  2.81s/it]
2024-10-21 01:27:58 | ERROR | stderr | Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.70s/it]
2024-10-21 01:27:58 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
2024-10-21 01:28:00 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:28:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 01:28:00 | ERROR | stderr |     res = future.result()
2024-10-21 01:28:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 01:28:00 | ERROR | stderr |     return self.__get_result()
2024-10-21 01:28:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 01:28:00 | ERROR | stderr |     raise self._exception
2024-10-21 01:28:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 01:28:00 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 01:28:00 | ERROR | stderr |   File "/home/user/app/cli.py", line 78, in run_inference
2024-10-21 01:28:00 | ERROR | stderr |     image_tensor = process_images([image], image_processor, model.config)
2024-10-21 01:28:00 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 189, in process_images
2024-10-21 01:28:00 | ERROR | stderr |     image = process_anyres_image(image, image_processor, model_cfg.image_grid_pinpoints, image_process_func=image_process_func)
2024-10-21 01:28:00 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 154, in process_anyres_image
2024-10-21 01:28:00 | ERROR | stderr |     image_patches = [processor.preprocess(image_patch, return_tensors='pt')['pixel_values'][0]
2024-10-21 01:28:00 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 154, in <listcomp>
2024-10-21 01:28:00 | ERROR | stderr |     image_patches = [processor.preprocess(image_patch, return_tensors='pt')['pixel_values'][0]
2024-10-21 01:28:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/transformers/models/clip/image_processing_clip.py", line 325, in preprocess
2024-10-21 01:28:00 | ERROR | stderr |     image = self.resize(image=image, size=size, resample=resample, input_data_format=input_data_format)
2024-10-21 01:28:00 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/clip_encoder.py", line 46, in resize
2024-10-21 01:28:00 | ERROR | stderr |     output_size = get_resize_output_image_size(image, size=(size["height"], size["width"]), default_to_square=True)
2024-10-21 01:28:00 | ERROR | stderr | KeyError: 'height'
2024-10-21 01:28:01 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=f51e510aa3aee8f0ee7636868ee07d841e5d18ed79f61cecb119fe9293bda42b&fail=true "HTTP/1.1 200 OK"
2024-10-21 01:28:01 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:28:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 01:28:01 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 01:28:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 01:28:01 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 01:28:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 01:28:01 | ERROR | stderr |     result = await self.call_function(
2024-10-21 01:28:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 01:28:01 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 01:28:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 01:28:01 | ERROR | stderr |     return await anext(iterator)
2024-10-21 01:28:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 01:28:01 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 01:28:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 01:28:01 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 01:28:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 01:28:01 | ERROR | stderr |     return await future
2024-10-21 01:28:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 01:28:01 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 01:28:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 01:28:01 | ERROR | stderr |     return next(iterator)
2024-10-21 01:28:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 01:28:01 | ERROR | stderr |     response = next(iterator)
2024-10-21 01:28:01 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 01:28:01 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 01:28:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 01:28:01 | ERROR | stderr |     raise res.value
2024-10-21 01:28:01 | ERROR | stderr | KeyError: 'height'
2024-10-21 01:39:20 | INFO | stdout | conv mode to gemma
2024-10-21 01:39:20 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:39:20 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:39:20 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe this image in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
2024-10-21 01:39:20 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:39:21 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140484646902736&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NDY3NjIwfQ.tAxO0nJUGqCV7JjLSXY-Vlo0jYXDAuYNocD2hU9TYfg "HTTP/1.1 200 OK"
2024-10-21 01:39:21 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=18
2024-10-21 01:39:21 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=23
2024-10-21 01:39:21 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[15, 13, 10, 14]
2024-10-21 01:39:21 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=c962eb316deaad0301ee35670b454772cb1690c2f4297d0737fbdbd5679b253f&pid=166096 "HTTP/1.1 200 OK"
2024-10-21 01:39:22 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 01:39:24 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
2024-10-21 01:39:24 | ERROR | stderr | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-21 01:39:28 | ERROR | stderr | Loading checkpoint shards:  50%|█████     | 1/2 [00:03<00:03,  3.82s/it]
2024-10-21 01:39:29 | ERROR | stderr | Loading checkpoint shards: 100%|██████████| 2/2 [00:04<00:00,  2.29s/it]
2024-10-21 01:39:29 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: [full list elided: every 'model.vision_tower.vision_tower.vision_model.*' tensor, i.e. the embeddings (class_embedding, patch_embedding.weight, position_embedding.weight), the layer_norm1/layer_norm2, mlp.fc1/fc2 and self_attn q/k/v/out_proj weights and biases of encoder layers 0-23, and the pre_layrnorm/post_layernorm weights and biases]
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
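The warning above typically means the checkpoint's CLIP vision-tower tensors are not consumed by the language-model state dict at this point (the vision tower is set up separately). As a minimal, hedged way to inspect the skipped keys programmatically (a sketch only, not the Space's actual loading code; it assumes a recent transformers with remote-code support for this repo):

from transformers import AutoModelForCausalLM

# Hedged sketch: ask transformers to report which checkpoint tensors were
# skipped instead of scraping the warning text above.
model, loading_info = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,     # FerretGemmaForCausalLM is custom model code
    output_loading_info=True,   # return (model, dict of missing/unexpected keys)
)
print(len(loading_info["unexpected_keys"]))   # the model.vision_tower.* entries
print(loading_info["unexpected_keys"][:3])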
2024-10-21 01:39:31 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:39:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 01:39:31 | ERROR | stderr |     res = future.result()
2024-10-21 01:39:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 01:39:31 | ERROR | stderr |     return self.__get_result()
2024-10-21 01:39:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 01:39:31 | ERROR | stderr |     raise self._exception
2024-10-21 01:39:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 01:39:31 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 01:39:31 | ERROR | stderr |   File "/home/user/app/cli.py", line 78, in run_inference
2024-10-21 01:39:31 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 189, in process_images
2024-10-21 01:39:31 | ERROR | stderr |     image = process_anyres_image(image, image_processor, model_cfg.image_grid_pinpoints, image_process_func=image_process_func)
2024-10-21 01:39:31 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 154, in process_anyres_image
2024-10-21 01:39:31 | ERROR | stderr |     image_patches = [processor.preprocess(image_patch, return_tensors='pt')['pixel_values'][0]
2024-10-21 01:39:31 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 154, in <listcomp>
2024-10-21 01:39:31 | ERROR | stderr |     image_patches = [processor.preprocess(image_patch, return_tensors='pt')['pixel_values'][0]
2024-10-21 01:39:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/transformers/models/clip/image_processing_clip.py", line 325, in preprocess
2024-10-21 01:39:31 | ERROR | stderr |     image = self.resize(image=image, size=size, resample=resample, input_data_format=input_data_format)
2024-10-21 01:39:31 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/clip_encoder.py", line 46, in resize
2024-10-21 01:39:31 | ERROR | stderr |     output_size = get_resize_output_image_size(image, size=(size["height"], size["width"]), default_to_square=True)
2024-10-21 01:39:31 | ERROR | stderr | KeyError: 'height'
2024-10-21 01:39:31 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=c962eb316deaad0301ee35670b454772cb1690c2f4297d0737fbdbd5679b253f&fail=true "HTTP/1.1 200 OK"
2024-10-21 01:39:31 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:39:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 01:39:31 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 01:39:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 01:39:31 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 01:39:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 01:39:31 | ERROR | stderr |     result = await self.call_function(
2024-10-21 01:39:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 01:39:31 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 01:39:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 01:39:31 | ERROR | stderr |     return await anext(iterator)
2024-10-21 01:39:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 01:39:31 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 01:39:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 01:39:31 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 01:39:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 01:39:31 | ERROR | stderr |     return await future
2024-10-21 01:39:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 01:39:31 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 01:39:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 01:39:31 | ERROR | stderr |     return next(iterator)
2024-10-21 01:39:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 01:39:31 | ERROR | stderr |     response = next(iterator)
2024-10-21 01:39:31 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 01:39:31 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 01:39:31 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 01:39:31 | ERROR | stderr |     raise res.value
2024-10-21 01:39:31 | ERROR | stderr | KeyError: 'height'
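The KeyError comes from the custom resize override in clip_encoder.py indexing size["height"], while the CLIP image processor apparently carries its target size under a different key (CLIP processors often use {"shortest_edge": N}). A defensive sketch under that assumption (the helper name _target_size is illustrative, not the repository's actual fix):

def _target_size(size: dict) -> tuple[int, int]:
    # Accept both {"height": H, "width": W} and {"shortest_edge": N} layouts.
    if "height" in size and "width" in size:
        return size["height"], size["width"]
    if "shortest_edge" in size:
        edge = size["shortest_edge"]
        return edge, edge
    raise ValueError(f"Unsupported size spec: {size}")

# At the failing call site this would replace (size["height"], size["width"]):
#   output_size = get_resize_output_image_size(
#       image, size=_target_size(size), default_to_square=True)
print(_target_size({"shortest_edge": 336}))           # (336, 336)
print(_target_size({"height": 336, "width": 336}))    # (336, 336)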
2024-10-21 01:40:56 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 01:40:56 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 01:40:56 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 01:40:56 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 01:40:56 | ERROR | stderr |   warnings.warn(
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 01:40:56 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 01:40:56 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 01:40:56 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 01:40:56 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
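If the SSR notice above matters for a deployment, the two environment variables it names can be set before the app launches; a hedged sketch with placeholder values (the node path and port are not this Space's actual configuration):

import os

os.environ.setdefault("GRADIO_NODE_PATH", "/usr/local/bin/node")  # a Node 20+ binary
os.environ.setdefault("GRADIO_NODE_PORT", "7862")                 # explicit SSR port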
2024-10-21 01:40:56 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 01:40:56 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 01:40:56 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 01:40:56 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 01:40:56 | ERROR | stderr |   warnings.warn(
2024-10-21 01:40:56 | INFO | stdout | 
2024-10-21 01:40:56 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 01:41:11 | INFO | stdout | conv mode to gemma
2024-10-21 01:41:11 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:41:11 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:41:11 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe what you see in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
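The request above shows exactly what the gemma conversation template produces: the system preamble, one user turn containing the <image> placeholder, and an open model turn. A minimal sketch of assembling that string (build_gemma_prompt is an illustrative name; the system text is abbreviated here):

def build_gemma_prompt(system: str, user_message: str) -> str:
    # Mirrors the logged prompt; generation then stops at the '<eos>' separator.
    return (
        f"{system}"
        f"<start_of_turn>user\n<image>\n{user_message}<end_of_turn>\n"
        f"<start_of_turn>model\n"
    )

system_text = "A chat between a human and an AI that understands visuals. ..."  # abbreviated
print(build_gemma_prompt(system_text, "describe what you see in details"))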
2024-10-21 01:41:11 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:41:11 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=139635038215120&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NDY3NzMxfQ.8m7arMitMiWUAAvl0526BSo8wYnUnLtdB_ff7w3P6QM "HTTP/1.1 200 OK"
2024-10-21 01:41:11 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=19
2024-10-21 01:41:11 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=21
2024-10-21 01:41:11 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[10, 16, 15, 13, 14]
2024-10-21 01:41:12 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=552fbe44ffe57440a8f180bcb3db2bc2e033bff1bfc41fd698f0d97fea3d2f77&pid=166804 "HTTP/1.1 200 OK"
2024-10-21 01:41:13 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 01:41:14 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
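The accelerate message above refers to the max_memory option; a hedged sketch of passing an explicit per-device budget through from_pretrained (the GiB values are placeholders, not what this Space uses):

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,
    device_map="auto",
    max_memory={0: "12GiB", "cpu": "30GiB"},  # illustrative caps for GPU 0 and CPU offload
)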
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards:  50%|█████     | 1/2 [00:03<00:03,  3.58s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:05<00:00,  2.51s/it]
2024-10-21 01:41:19 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
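Note that every flagged tensor lives under `model.vision_tower.*`. In LLaVA-style multimodal checkpoints this warning is usually benign: the CLIP vision tower is commonly rebuilt from its original checkpoint by the model's own loading code, so `from_pretrained` on the language model deliberately skips those keys. Whether Ferret-UI-Gemma2b wires it up exactly this way is an assumption here; the sketch below only illustrates the general pattern (class and attribute names are invented for illustration).

```python
# Illustrative-only sketch of the vision-tower pattern that makes the
# "weights were not used" warning harmless in LLaVA-style models.
# Not the actual Ferret-UI-Gemma2b code; names are assumptions.
import torch
from transformers import CLIPVisionModel

class VisionTower(torch.nn.Module):
    def __init__(self, vision_tower_name: str = "openai/clip-vit-large-patch14-336"):
        super().__init__()
        # Rebuilt from the original CLIP checkpoint at load time, so the
        # model.vision_tower.* tensors stored in the merged checkpoint go unused.
        self.vision_tower = CLIPVisionModel.from_pretrained(vision_tower_name)
        self.vision_tower.requires_grad_(False)

    @torch.no_grad()
    def forward(self, images: torch.Tensor) -> torch.Tensor:
        outputs = self.vision_tower(images, output_hidden_states=True)
        # Penultimate hidden state without the CLS token is a common feature choice.
        return outputs.hidden_states[-2][:, 1:]
```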
2024-10-21 01:41:22 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:41:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 01:41:22 | ERROR | stderr |     res = future.result()
2024-10-21 01:41:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 01:41:22 | ERROR | stderr |     return self.__get_result()
2024-10-21 01:41:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 01:41:22 | ERROR | stderr |     raise self._exception
2024-10-21 01:41:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 01:41:22 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 01:41:22 | ERROR | stderr |   File "/home/user/app/cli.py", line 91, in run_inference
2024-10-21 01:41:22 | ERROR | stderr |     image_tensor = process_images([image], image_processor, model.config)
2024-10-21 01:41:22 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 189, in process_images
2024-10-21 01:41:22 | ERROR | stderr |     image = process_anyres_image(image, image_processor, model_cfg.image_grid_pinpoints, image_process_func=image_process_func)
2024-10-21 01:41:22 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 154, in process_anyres_image
2024-10-21 01:41:22 | ERROR | stderr |     image_patches = [processor.preprocess(image_patch, return_tensors='pt')['pixel_values'][0]
2024-10-21 01:41:22 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 154, in <listcomp>
2024-10-21 01:41:22 | ERROR | stderr |     image_patches = [processor.preprocess(image_patch, return_tensors='pt')['pixel_values'][0]
2024-10-21 01:41:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/transformers/models/clip/image_processing_clip.py", line 325, in preprocess
2024-10-21 01:41:22 | ERROR | stderr |     image = self.resize(image=image, size=size, resample=resample, input_data_format=input_data_format)
2024-10-21 01:41:22 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/clip_encoder.py", line 46, in resize
2024-10-21 01:41:22 | ERROR | stderr |     output_size = get_resize_output_image_size(image, size=(size["height"], size["width"]), default_to_square=True)
2024-10-21 01:41:22 | ERROR | stderr | KeyError: 'height'
2024-10-21 01:41:22 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=552fbe44ffe57440a8f180bcb3db2bc2e033bff1bfc41fd698f0d97fea3d2f77&fail=true "HTTP/1.1 200 OK"
2024-10-21 01:41:22 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:41:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 01:41:22 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 01:41:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 01:41:22 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 01:41:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 01:41:22 | ERROR | stderr |     result = await self.call_function(
2024-10-21 01:41:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 01:41:22 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 01:41:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 01:41:22 | ERROR | stderr |     return await anext(iterator)
2024-10-21 01:41:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 01:41:22 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 01:41:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 01:41:22 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 01:41:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 01:41:22 | ERROR | stderr |     return await future
2024-10-21 01:41:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 01:41:22 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 01:41:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 01:41:22 | ERROR | stderr |     return next(iterator)
2024-10-21 01:41:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 01:41:22 | ERROR | stderr |     response = next(iterator)
2024-10-21 01:41:22 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 01:41:22 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 01:41:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 01:41:22 | ERROR | stderr |     raise res.value
2024-10-21 01:41:22 | ERROR | stderr | KeyError: 'height'
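The root cause of this failure is the `KeyError: 'height'` raised inside the repo's custom `clip_encoder.py` resize override: it indexes `size["height"]` and `size["width"]`, but a stock `CLIPImageProcessor` often describes its resize target as `{"shortest_edge": N}` rather than `{"height": H, "width": W}`, so the lookup fails before any image is resized. A minimal, hypothetical workaround (helper name and wiring are assumptions, not the repo's actual code) would accept either dict shape:

```python
# Hypothetical defensive helper for the resize override that raised
# KeyError: 'height'. Assumes the standard CLIPImageProcessor size-dict
# conventions: {"height": H, "width": W} or {"shortest_edge": S}.

def target_height_width(size: dict) -> tuple[int, int]:
    """Return (height, width) from either supported size-dict shape."""
    if "height" in size and "width" in size:
        return size["height"], size["width"]
    if "shortest_edge" in size:
        edge = size["shortest_edge"]
        return edge, edge  # square target, mirroring default_to_square=True
    raise ValueError(f"Unsupported size dict: {size!r}")

# The failing call in the override would then read roughly:
#   output_size = get_resize_output_image_size(
#       image, size=target_height_width(size), default_to_square=True)
```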
2024-10-21 01:44:04 | INFO | stdout | conv mode to gemma
2024-10-21 01:44:04 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:44:04 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:44:04 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe what you see in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
2024-10-21 01:44:04 | INFO | stdout | Input Image Size:(400, 433)
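The request above shows the Gemma-style prompt layout the web server sends to the worker: the system text, then `<start_of_turn>user ... <end_of_turn>` for the user turn, and an open `<start_of_turn>model\n` so generation continues from there, with `<eos>` as the stop string. The sketch below only mirrors the string format visible in this log; it is not the app's actual Conversation class.

```python
# Reconstruction of the prompt layout visible in the request log above.
# Mirrors the logged string format only; not the app's Conversation class.

SYSTEM = "A chat between a human and an AI that understands visuals. ..."
# (system text abbreviated; the full version appears in the request log)

def build_gemma_prompt(system: str, turns: list[tuple[str, str]]) -> str:
    """turns is a list of (role, text) pairs with role in {'user', 'model'}."""
    prompt = system
    for role, text in turns:
        prompt += f"<start_of_turn>{role}\n{text}<end_of_turn>\n"
    # Leave the model turn open so the worker generates the reply from here.
    return prompt + "<start_of_turn>model\n"

prompt = build_gemma_prompt(SYSTEM, [("user", "<image>\ndescribe what you see in details")])
```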
2024-10-21 01:44:04 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=139635038215120&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NDY3OTA0fQ.OlWC49UJZeLljZp9t2vbd1Tjuedoqsm3EYcXGSQIAVY "HTTP/1.1 200 OK"
2024-10-21 01:44:04 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=19
2024-10-21 01:44:04 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=21
2024-10-21 01:44:04 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[10, 13, 15, 16, 14]
2024-10-21 01:44:04 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=447a22764433297b5724435eb0fd8b9ea8bb2bdaf7f30e34d648d0a0e56deebc&pid=167522 "HTTP/1.1 200 OK"
2024-10-21 01:44:05 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 01:44:07 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
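The accelerate note refers to the `max_memory` mapping that device-map loading accepts; if the default 90% budget is too conservative for this checkpoint, it can be overridden at load time. A hedged example follows (the GiB figures are placeholders, not values measured for this Space).

```python
# Hypothetical override of accelerate's default memory budget via `max_memory`.
# The GiB figures are placeholders; trust_remote_code matches the repo's use of
# custom modeling code seen elsewhere in this log.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map="auto",
    max_memory={0: "22GiB", "cpu": "48GiB"},  # per-device caps honored by accelerate
)
```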
2024-10-21 01:44:07 | ERROR | stderr | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-21 01:44:09 | ERROR | stderr | Loading checkpoint shards:  50%|█████     | 1/2 [00:02<00:02,  2.72s/it]
2024-10-21 01:44:10 | ERROR | stderr | Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.72s/it]
2024-10-21 01:44:10 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: [identical list of 'model.vision_tower.vision_tower.vision_model.*' parameters as in the warning above]
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
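The warning above flags every model.vision_tower.* tensor in the checkpoint as unused, which typically means FerretGemmaForCausalLM wires up its CLIP vision tower through its own initialization path rather than from those checkpoint keys, so the message is expected here. Below is a minimal sketch for listing the skipped keys yourself, assuming the model loads with trust_remote_code=True (as this Space's loader appears to do):

```python
# Minimal sketch: list which checkpoint keys were skipped during loading.
# Assumes trust_remote_code=True resolves FerretGemmaForCausalLM from the repo;
# loading_info["unexpected_keys"] holds the keys the warning above enumerates.
from transformers import AutoModelForCausalLM

model, loading_info = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,
    output_loading_info=True,
)
skipped = [k for k in loading_info["unexpected_keys"] if k.startswith("model.vision_tower.")]
print(f"{len(skipped)} vision-tower keys were not consumed by the language model wrapper")
```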
2024-10-21 01:44:13 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:44:13 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 01:44:13 | ERROR | stderr |     res = future.result()
2024-10-21 01:44:13 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 01:44:13 | ERROR | stderr |     return self.__get_result()
2024-10-21 01:44:13 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 01:44:13 | ERROR | stderr |     raise self._exception
2024-10-21 01:44:13 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 01:44:13 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 01:44:13 | ERROR | stderr |   File "/home/user/app/cli.py", line 91, in run_inference
2024-10-21 01:44:13 | ERROR | stderr |     image_tensor = process_images([image], image_processor, model.config)
2024-10-21 01:44:13 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 189, in process_images
2024-10-21 01:44:13 | ERROR | stderr |     image = process_anyres_image(image, image_processor, model_cfg.image_grid_pinpoints, image_process_func=image_process_func)
2024-10-21 01:44:13 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 154, in process_anyres_image
2024-10-21 01:44:13 | ERROR | stderr |     image_patches = [processor.preprocess(image_patch, return_tensors='pt')['pixel_values'][0]
2024-10-21 01:44:13 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 154, in <listcomp>
2024-10-21 01:44:13 | ERROR | stderr |     image_patches = [processor.preprocess(image_patch, return_tensors='pt')['pixel_values'][0]
2024-10-21 01:44:13 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/transformers/models/clip/image_processing_clip.py", line 325, in preprocess
2024-10-21 01:44:13 | ERROR | stderr |     image = self.resize(image=image, size=size, resample=resample, input_data_format=input_data_format)
2024-10-21 01:44:13 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/clip_encoder.py", line 46, in resize
2024-10-21 01:44:13 | ERROR | stderr |     output_size = get_resize_output_image_size(image, size=(size["height"], size["width"]), default_to_square=True)
2024-10-21 01:44:13 | ERROR | stderr | KeyError: 'height'
2024-10-21 01:44:13 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=447a22764433297b5724435eb0fd8b9ea8bb2bdaf7f30e34d648d0a0e56deebc&fail=true "HTTP/1.1 200 OK"
2024-10-21 01:44:13 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:44:13 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 01:44:13 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 01:44:13 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 01:44:13 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 01:44:13 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 01:44:13 | ERROR | stderr |     result = await self.call_function(
2024-10-21 01:44:13 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 01:44:13 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 01:44:13 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 01:44:13 | ERROR | stderr |     return await anext(iterator)
2024-10-21 01:44:13 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 01:44:13 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 01:44:13 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 01:44:13 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 01:44:13 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 01:44:13 | ERROR | stderr |     return await future
2024-10-21 01:44:13 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 01:44:13 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 01:44:13 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 01:44:13 | ERROR | stderr |     return next(iterator)
2024-10-21 01:44:13 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 01:44:13 | ERROR | stderr |     response = next(iterator)
2024-10-21 01:44:13 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 01:44:13 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 01:44:13 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 01:44:13 | ERROR | stderr |     raise res.value
2024-10-21 01:44:13 | ERROR | stderr | KeyError: 'height'
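Both tracebacks above terminate in the remote clip_encoder.py resize override, which indexes size["height"] and size["width"]; recent CLIP image-processor configs describe the resize target as {"shortest_edge": 224} instead, so those keys are missing and the KeyError follows. A minimal sketch of a defensive lookup that handles either layout (hypothetical helper, not the repository's code):

```python
# Minimal sketch: derive the target (height, width) from either CLIP size-dict
# layout before calling get_resize_output_image_size, instead of assuming that
# "height"/"width" are present (which is what raises KeyError: 'height' above).
import numpy as np
from transformers.image_transforms import get_resize_output_image_size

def resolve_target_size(size: dict) -> tuple:
    """Return (height, width) for both CLIP size-dict layouts."""
    if "height" in size and "width" in size:
        return size["height"], size["width"]
    if "shortest_edge" in size:
        return size["shortest_edge"], size["shortest_edge"]
    raise ValueError(f"Unsupported size dict: {size}")

image = np.zeros((433, 400, 3), dtype=np.uint8)  # (H, W, C) for the 400x433 image logged above
for size in ({"height": 336, "width": 336}, {"shortest_edge": 224}):
    target = resolve_target_size(size)
    output_size = get_resize_output_image_size(image, size=target, default_to_square=True)
    print(size, "->", output_size)
```

The same derivation could be applied inside the overridden resize before get_resize_output_image_size is called, under the assumption that the processor config is the only source of the size dict.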
2024-10-21 01:47:19 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 01:47:19 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 01:47:19 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 01:47:19 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 01:47:19 | ERROR | stderr |   warnings.warn(
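The deprecation warning above repeats on every restart; it stops once the chatbot component is created with type="messages" and the handlers exchange role/content dicts. A minimal sketch with assumed component wiring, not the Space's actual app.py:

```python
# Minimal sketch: build the chatbot in the 'messages' format the warning asks for.
import gradio as gr

def respond(user_message, history):
    # history is a list of {"role": ..., "content": ...} dicts in this format
    history = history + [
        {"role": "user", "content": user_message},
        {"role": "assistant", "content": "placeholder reply"},
    ]
    return "", history

with gr.Blocks() as demo:
    chatbot = gr.Chatbot(type="messages", height=750)
    textbox = gr.Textbox()
    textbox.submit(respond, [textbox, chatbot], [textbox, chatbot])
# demo.launch()
```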
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 01:47:19 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 01:47:19 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 01:47:19 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 01:47:19 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
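The three stdout lines above are Gradio's SSR bootstrap reporting that it could not find a Node 20+ runtime on the single port it probes. A minimal sketch of the two escape hatches it mentions, with a hypothetical Node path and port (the correct values depend on the Space image); ssr_mode=False simply skips server-side rendering:

```python
# Minimal sketch: point Gradio at a Node 20+ binary and a free port before launch,
# or disable SSR when no suitable Node runtime is available.
import os

os.environ.setdefault("GRADIO_NODE_PATH", "/usr/local/bin/node")  # hypothetical Node 20+ location
os.environ.setdefault("GRADIO_NODE_PORT", "7862")                 # avoid the single-port 7861 scan

import gradio as gr  # imported after the env vars so the launcher sees them

with gr.Blocks() as demo:
    gr.Markdown("placeholder UI")

demo.launch(ssr_mode=False)  # alternatively, keep SSR on once Node is configured
```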
2024-10-21 01:47:19 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 01:47:19 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 01:47:19 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 01:47:19 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 01:47:19 | ERROR | stderr |   warnings.warn(
2024-10-21 01:47:19 | INFO | stdout | 
2024-10-21 01:47:19 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 01:47:32 | INFO | stdout | conv mode to gemma
2024-10-21 01:47:32 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:47:32 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:47:32 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe the image in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
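The logged request shows the full Gemma-style prompt: the system preamble, one user turn carrying the <image> placeholder, and an open model turn. A minimal sketch of how such a string can be assembled (hypothetical helper, not the app's own prompt construction):

```python
# Minimal sketch: rebuild the prompt layout recorded in the request above from a
# system preamble and a single user turn with an <image> slot.
def build_gemma_prompt(system: str, user_message: str) -> str:
    return (
        f"{system}"
        f"<start_of_turn>user\n<image>\n{user_message}<end_of_turn>\n"
        f"<start_of_turn>model\n"
    )

prompt = build_gemma_prompt(
    "A chat between a human and an AI that understands visuals. [...]",  # shortened system text
    "describe the image in details",
)
print(prompt)
```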
2024-10-21 01:47:32 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:47:32 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140604188695504&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NDY4MTEyfQ.fVhFDQ-Vo0yO9XF0BbZi6MXymo4PrMqyoSJfI-fVQyY "HTTP/1.1 200 OK"
2024-10-21 01:47:32 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=19
2024-10-21 01:47:32 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=21
2024-10-21 01:47:32 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[14, 10, 13, 16, 15]
2024-10-21 01:47:33 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=07e619fad2c683e5e78c722a5e135248587eeb0d294bb4f3f6793cfe8ab3621f&pid=168612 "HTTP/1.1 200 OK"
2024-10-21 01:47:34 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 01:47:36 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
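The accelerate message above reserves 90% of device 0 for weights and keeps 10% as a buffer; the max_memory knob it refers to is an argument to from_pretrained when device_map is used. A minimal sketch with assumed memory limits (tune them to the actual GPU and CPU budget):

```python
# Minimal sketch: raise the per-device budget accelerate mentions above by passing
# explicit max_memory limits alongside device_map (the values here are assumptions).
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,
    device_map="auto",
    max_memory={0: "35GiB", "cpu": "60GiB"},
)
```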
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards:  50%|█████     | 1/2 [00:02<00:02,  2.94s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.87s/it]
2024-10-21 01:47:39 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: [same list of model.vision_tower.* keys as in the identical warning above]
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
2024-10-21 01:47:42 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:47:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 01:47:42 | ERROR | stderr |     res = future.result()
2024-10-21 01:47:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 01:47:42 | ERROR | stderr |     return self.__get_result()
2024-10-21 01:47:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 01:47:42 | ERROR | stderr |     raise self._exception
2024-10-21 01:47:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 01:47:42 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 01:47:42 | ERROR | stderr |   File "/home/user/app/cli.py", line 93, in run_inference
2024-10-21 01:47:42 | ERROR | stderr |     image_tensor = process_images([image], image_processor, model.config)
2024-10-21 01:47:42 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 189, in process_images
2024-10-21 01:47:42 | ERROR | stderr |     image = process_anyres_image(image, image_processor, model_cfg.image_grid_pinpoints, image_process_func=image_process_func)
2024-10-21 01:47:42 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 154, in process_anyres_image
2024-10-21 01:47:42 | ERROR | stderr |     image_patches = [processor.preprocess(image_patch, return_tensors='pt')['pixel_values'][0]
2024-10-21 01:47:42 | ERROR | stderr |   File "/home/user/app/mm_utils.py", line 154, in <listcomp>
2024-10-21 01:47:42 | ERROR | stderr |     image_patches = [processor.preprocess(image_patch, return_tensors='pt')['pixel_values'][0]
2024-10-21 01:47:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/transformers/models/clip/image_processing_clip.py", line 325, in preprocess
2024-10-21 01:47:42 | ERROR | stderr |     image = self.resize(image=image, size=size, resample=resample, input_data_format=input_data_format)
2024-10-21 01:47:42 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/clip_encoder.py", line 46, in resize
2024-10-21 01:47:42 | ERROR | stderr |     output_size = get_resize_output_image_size(image, size=(size["height"], size["width"]), default_to_square=True)
2024-10-21 01:47:42 | ERROR | stderr | KeyError: 'height'
2024-10-21 01:47:42 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=07e619fad2c683e5e78c722a5e135248587eeb0d294bb4f3f6793cfe8ab3621f&fail=true "HTTP/1.1 200 OK"
2024-10-21 01:47:42 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:47:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 01:47:42 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 01:47:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 01:47:42 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 01:47:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 01:47:42 | ERROR | stderr |     result = await self.call_function(
2024-10-21 01:47:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 01:47:42 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 01:47:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 01:47:42 | ERROR | stderr |     return await anext(iterator)
2024-10-21 01:47:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 01:47:42 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 01:47:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 01:47:42 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 01:47:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 01:47:42 | ERROR | stderr |     return await future
2024-10-21 01:47:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 01:47:42 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 01:47:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 01:47:42 | ERROR | stderr |     return next(iterator)
2024-10-21 01:47:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 01:47:42 | ERROR | stderr |     response = next(iterator)
2024-10-21 01:47:42 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 01:47:42 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 01:47:42 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 01:47:42 | ERROR | stderr |     raise res.value
2024-10-21 01:47:42 | ERROR | stderr | KeyError: 'height'
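The retry fails identically even though cli.py moved from line 91 to line 93, because the code that raises lives in the remote clip_encoder.py cached from the model repository (commit 28bcebb… in the path above), not in the Space's own files. Until that module is fixed upstream, one caller-side workaround is to normalize the processor's size dict before preprocessing; a minimal sketch (hypothetical, and it forces a square resize target):

```python
# Minimal sketch (hypothetical workaround): give the processor an explicit
# {"height", "width"} size so the remote resize override finds the keys it expects.
def normalize_clip_size(image_processor):
    size = dict(image_processor.size or {})
    if "height" not in size or "width" not in size:
        edge = size.get("shortest_edge", image_processor.crop_size["height"])
        image_processor.size = {"height": edge, "width": edge}
    return image_processor
```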
2024-10-21 01:49:24 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 01:49:24 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 01:49:24 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 01:49:25 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 01:49:25 | ERROR | stderr |   warnings.warn(
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 01:49:25 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 01:49:25 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 01:49:25 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 01:49:25 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-21 01:49:25 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 01:49:25 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 01:49:25 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 01:49:25 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 01:49:25 | ERROR | stderr |   warnings.warn(
2024-10-21 01:49:25 | INFO | stdout | 
2024-10-21 01:49:25 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 01:51:38 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 01:51:38 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 01:51:38 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 01:51:38 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 01:51:38 | ERROR | stderr |   warnings.warn(
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 01:51:38 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 01:51:38 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 01:51:38 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 01:51:38 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-21 01:51:38 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 01:51:38 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 01:51:38 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 01:51:38 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 01:51:38 | ERROR | stderr |   warnings.warn(
2024-10-21 01:51:38 | INFO | stdout | 
2024-10-21 01:51:38 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 01:51:50 | INFO | stdout | conv mode to gemma
2024-10-21 01:51:50 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:51:50 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:51:50 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe this image in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
2024-10-21 01:51:50 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:51:50 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=139926743686096&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NDY4MzcwfQ.Gw929SPxqMYGSBP897nuEjVF0jseYPhAEz1Y-DmaLBc "HTTP/1.1 200 OK"
2024-10-21 01:51:50 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=18
2024-10-21 01:51:50 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=20
2024-10-21 01:51:50 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[14, 15, 13, 10]
2024-10-21 01:51:51 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=963a6f92197f83506e17bc89e9eed7b10479d698dd11d5195a64f3d80fb999b4&pid=170045 "HTTP/1.1 200 OK"
2024-10-21 01:51:52 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 01:51:54 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
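[Editor's note] The accelerate hint above refers to the `max_memory` argument that `from_pretrained` forwards to the device-map dispatcher. A rough sketch with placeholder budgets; only the model id comes from this log, and `trust_remote_code` is an assumption about how the Ferret checkpoint is packaged:

```python
from transformers import AutoModelForCausalLM

# Placeholder budgets: give device 0 a higher cap and allow CPU offload for the rest.
model = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    device_map="auto",
    max_memory={0: "20GiB", "cpu": "48GiB"},
    trust_remote_code=True,  # assumption: custom FerretGemmaForCausalLM code ships with the repo
)
```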
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards:  50%|█████     | 1/2 [00:03<00:03,  3.03s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.90s/it]
2024-10-21 01:51:57 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
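[Editor's note] The long warning above is the standard transformers report for checkpoint keys the instantiated class does not consume (here the CLIP-style vision-tower weights, presumably because the vision encoder is initialised through a separate path). The same kind of report can be reproduced with plain PyTorch, as a rough analogy only:

```python
import torch
import torch.nn as nn

# Rough analogy only: non-strict loading reports checkpoint keys the target module
# has no parameters for, which is what the warning above is listing.
target = nn.Linear(4, 2)
checkpoint = {
    "weight": torch.zeros(2, 4),
    "bias": torch.zeros(2),
    "vision_tower.extra.weight": torch.zeros(3, 3),  # key with no matching parameter
}
result = target.load_state_dict(checkpoint, strict=False)
print(result.unexpected_keys)  # ['vision_tower.extra.weight']
```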
2024-10-21 01:51:59 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:51:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 01:51:59 | ERROR | stderr |     res = future.result()
2024-10-21 01:51:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 01:51:59 | ERROR | stderr |     return self.__get_result()
2024-10-21 01:51:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 01:51:59 | ERROR | stderr |     raise self._exception
2024-10-21 01:51:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 01:51:59 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 01:51:59 | ERROR | stderr |   File "/home/user/app/cli.py", line 102, in run_inference
2024-10-21 01:51:59 | ERROR | stderr |     image_process_func = partial(image_processor.preprocess, return_tensors='pt', do_resize=True, do_center_crop=False, size=[image_h, image_w])
2024-10-21 01:51:59 | ERROR | stderr | NameError: name 'partial' is not defined
2024-10-21 01:52:00 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=963a6f92197f83506e17bc89e9eed7b10479d698dd11d5195a64f3d80fb999b4&fail=true "HTTP/1.1 200 OK"
2024-10-21 01:52:00 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:52:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 01:52:00 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 01:52:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 01:52:00 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 01:52:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 01:52:00 | ERROR | stderr |     result = await self.call_function(
2024-10-21 01:52:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 01:52:00 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 01:52:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 01:52:00 | ERROR | stderr |     return await anext(iterator)
2024-10-21 01:52:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 01:52:00 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 01:52:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 01:52:00 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 01:52:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 01:52:00 | ERROR | stderr |     return await future
2024-10-21 01:52:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 01:52:00 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 01:52:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 01:52:00 | ERROR | stderr |     return next(iterator)
2024-10-21 01:52:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 01:52:00 | ERROR | stderr |     response = next(iterator)
2024-10-21 01:52:00 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 01:52:00 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 01:52:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 01:52:00 | ERROR | stderr |     raise res.value
2024-10-21 01:52:00 | ERROR | stderr | NameError: name 'partial' is not defined
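[Editor's note] Both tracebacks terminate in the same NameError: cli.py line 102 calls partial() without importing it from functools. A self-contained sketch of the likely one-line fix; the preprocess function and the size values below are stand-ins, not Ferret's actual image processor:

```python
from functools import partial  # the import cli.py is missing, per the NameError above

# Stand-in for image_processor.preprocess from the traceback, so the snippet runs alone.
def preprocess(image, return_tensors="pt", do_resize=True, do_center_crop=False, size=None):
    return {"size": size, "return_tensors": return_tensors, "image": image}

image_h, image_w = 433, 400  # placeholder dimensions
image_process_func = partial(preprocess, return_tensors="pt", do_resize=True,
                             do_center_crop=False, size=[image_h, image_w])
print(image_process_func("dummy image")["size"])  # [433, 400] -- works once partial is imported
```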
2024-10-21 01:52:36 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 01:52:36 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 01:52:36 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 01:52:36 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 01:52:36 | ERROR | stderr |   warnings.warn(
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 01:52:36 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 01:52:36 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 01:52:36 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 01:52:36 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-21 01:52:36 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 01:52:36 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 01:52:36 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 01:52:36 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 01:52:36 | ERROR | stderr |   warnings.warn(
2024-10-21 01:52:36 | INFO | stdout | 
2024-10-21 01:52:36 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 01:52:49 | INFO | stdout | conv mode to gemma
2024-10-21 01:52:49 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:52:49 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:52:49 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': "A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe what's in the image<end_of_turn>\n<start_of_turn>model\n", 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
2024-10-21 01:52:49 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:52:49 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=139743490350032&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NDY4NDI5fQ.YfSg2L93_3ps3X0U8laMR6nBtlNqz2Tg5TGzWxqiR5Y "HTTP/1.1 200 OK"
2024-10-21 01:52:49 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=18
2024-10-21 01:52:49 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=20
2024-10-21 01:52:49 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[15, 10, 14, 13]
2024-10-21 01:52:50 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=1cd4e4e0edc771d7d6ec8b9c6cae2248487e70d5e19a074076e2a269bf57dcae&pid=170566 "HTTP/1.1 200 OK"
2024-10-21 01:52:51 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 01:52:53 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards:  50%|█████     | 1/2 [00:03<00:03,  3.19s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.98s/it]
2024-10-21 01:52:57 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
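Every discarded key listed above carries the model.vision_tower. prefix, i.e. the CLIP-style vision encoder that this checkpoint apparently ships alongside the language model. A quick way to confirm which checkpoint keys sit under that prefix is sketched below; hf_hub_download and safe_open are real APIs, but the shard filename is an assumption based on the two-shard load later in this log, and this is not part of the Space's code.

    # Sketch only: list the vision-tower keys contained in one checkpoint shard.
    from huggingface_hub import hf_hub_download
    from safetensors import safe_open

    path = hf_hub_download("jadechoghari/Ferret-UI-Gemma2b",
                           "model-00001-of-00002.safetensors")  # assumed shard name
    with safe_open(path, framework="pt") as f:
        vision_keys = [k for k in f.keys() if k.startswith("model.vision_tower.")]
    print(len(vision_keys), "vision-tower keys in this shard")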
2024-10-21 01:52:59 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:52:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 01:52:59 | ERROR | stderr |     res = future.result()
2024-10-21 01:52:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 01:52:59 | ERROR | stderr |     return self.__get_result()
2024-10-21 01:52:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 01:52:59 | ERROR | stderr |     raise self._exception
2024-10-21 01:52:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 01:52:59 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 01:52:59 | ERROR | stderr |   File "/home/user/app/cli.py", line 103, in run_inference
2024-10-21 01:52:59 | ERROR | stderr |     image_tensor = process_images([img], image_processor, model.config, image_process_func=image_process_func)[0]
2024-10-21 01:52:59 | ERROR | stderr | NameError: name 'img' is not defined
2024-10-21 01:52:59 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=1cd4e4e0edc771d7d6ec8b9c6cae2248487e70d5e19a074076e2a269bf57dcae&fail=true "HTTP/1.1 200 OK"
2024-10-21 01:52:59 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:52:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 01:52:59 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 01:52:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 01:52:59 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 01:52:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 01:52:59 | ERROR | stderr |     result = await self.call_function(
2024-10-21 01:52:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 01:52:59 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 01:52:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 01:52:59 | ERROR | stderr |     return await anext(iterator)
2024-10-21 01:52:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 01:52:59 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 01:52:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 01:52:59 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 01:52:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 01:52:59 | ERROR | stderr |     return await future
2024-10-21 01:52:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 01:52:59 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 01:52:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 01:52:59 | ERROR | stderr |     return next(iterator)
2024-10-21 01:52:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 01:52:59 | ERROR | stderr |     response = next(iterator)
2024-10-21 01:52:59 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 01:52:59 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 01:52:59 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 01:52:59 | ERROR | stderr |     raise res.value
2024-10-21 01:52:59 | ERROR | stderr | NameError: name 'img' is not defined
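The failure is a plain NameError: line 103 of cli.py passes an undefined name img into process_images, so the ZeroGPU worker raises before any model call and the Gradio queue re-raises it. A minimal sketch of the pattern and the obvious fix follows, assuming the image is actually bound to a differently named variable; run_inference_sketch and preprocess_batch are hypothetical stand-ins, not the app's real functions.

    from PIL import Image

    def preprocess_batch(images: list) -> list:
        # Hypothetical stand-in for process_images(...); resizes only, for illustration.
        return [im.resize((336, 336)) for im in images]

    def run_inference_sketch(image_path: str) -> list:
        image = Image.open(image_path).convert("RGB")  # the image is bound as `image`
        # Buggy form (raises NameError, as in the traceback above): preprocess_batch([img])
        # Fixed form references the name that was actually bound:
        return preprocess_batch([image])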
2024-10-21 01:53:47 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 01:53:47 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 01:53:47 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 01:53:47 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 01:53:47 | ERROR | stderr |   warnings.warn(
2024-10-21 01:53:48 | ERROR | stderr | ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 01:53:48 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 01:53:48 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 01:53:48 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 01:53:48 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-21 01:53:48 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 01:53:48 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 01:53:48 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 01:53:48 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 01:53:48 | ERROR | stderr |   warnings.warn(
2024-10-21 01:53:48 | INFO | stdout | 
2024-10-21 01:53:48 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
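The share note above refers to the share flag of Blocks.launch(): on Spaces it is ignored (see the warning a few lines up), while a local run with share=True requests a public gradio.live link. A minimal, self-contained example, not taken from the app:

    import gradio as gr

    with gr.Blocks() as demo:
        gr.Markdown("share demo")

    demo.launch(share=True)  # only takes effect outside Hugging Face Spaces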
2024-10-21 01:53:59 | INFO | stdout | conv mode to gemma
2024-10-21 01:53:59 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:53:59 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:53:59 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe the image in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['8b23f327b90b6211049acd36e3f99975']"}
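The prompt field of this request shows the Gemma conversation template in full: the system string, a user turn wrapped in <start_of_turn>user ... <end_of_turn> containing the <image> placeholder and the question, and an open <start_of_turn>model turn for generation, stopped at <eos>. A sketch of how such a string could be assembled (build_gemma_prompt is a hypothetical helper, not the app's function):

    def build_gemma_prompt(system: str, user_msg: str) -> str:
        return (
            f"{system}"
            f"<start_of_turn>user\n<image>\n{user_msg}<end_of_turn>\n"
            f"<start_of_turn>model\n"
        )

    prompt = build_gemma_prompt(
        "A chat between a human and an AI that understands visuals. ...",
        "describe the image in details",
    )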
2024-10-21 01:53:59 | INFO | stdout | Input Image Size:(400, 433)
2024-10-21 01:53:59 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=139729682602960&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NDY4NDk5fQ.uJiGTTdYN3LxN4DmVcfZ1_-FHC8RgDZT4h8FvjGi_sQ "HTTP/1.1 200 OK"
2024-10-21 01:53:59 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=18
2024-10-21 01:53:59 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=20
2024-10-21 01:53:59 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[10, 15, 12, 14]
2024-10-21 01:54:00 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=8d8b7379cac569b30014b4cb23dfc7f48c288d5de95672319efb6d01b1d9f3dd&pid=171173 "HTTP/1.1 200 OK"
2024-10-21 01:54:01 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 01:54:02 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
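The accelerate note refers to the max_memory argument that from_pretrained forwards to accelerate whenever a device_map is in use. A hedged sketch follows; the memory budgets and trust_remote_code are assumptions, and whether this checkpoint loads through AutoModelForCausalLM at all is itself an assumption rather than something shown in this log.

    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "jadechoghari/Ferret-UI-Gemma2b",
        device_map="auto",
        max_memory={0: "14GiB", "cpu": "30GiB"},  # example per-device budgets
        trust_remote_code=True,                   # assumption: repo ships custom code
    )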
2024-10-21 01:54:02 | ERROR | stderr | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-21 01:54:05 | ERROR | stderr | Loading checkpoint shards:  50%|█████     | 1/2 [00:02<00:02,  2.73s/it]
2024-10-21 01:54:06 | ERROR | stderr | Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.65s/it]
2024-10-21 01:54:06 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
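The warning above reports that every `model.vision_tower.vision_tower.*` parameter in the checkpoint went unused when FerretGemmaForCausalLM was instantiated, i.e. the CLIP vision tower stored in the shards was not mapped onto the in-memory module. A quick way to see which tower keys the checkpoint actually contains is to read the shard index; a minimal diagnostic sketch follows (the index filename is an assumption based on the two-shard load later in this log, and this code is not part of the Space):

```python
# Diagnostic sketch only: list the vision-tower keys stored in the checkpoint index.
# Assumes a standard sharded-safetensors index file; the filename is an assumption.
import json
from huggingface_hub import hf_hub_download

repo_id = "jadechoghari/Ferret-UI-Gemma2b"
index_path = hf_hub_download(repo_id, "model.safetensors.index.json")
with open(index_path) as f:
    weight_map = json.load(f)["weight_map"]

tower_keys = sorted(k for k in weight_map if "vision_tower" in k)
print(f"{len(tower_keys)} vision-tower entries, e.g. {tower_keys[:3]}")
```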
2024-10-21 01:54:08 | INFO | stdout | image size:  (400, 433)
2024-10-21 01:54:08 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:54:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 01:54:08 | ERROR | stderr |     res = future.result()
2024-10-21 01:54:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 01:54:08 | ERROR | stderr |     return self.__get_result()
2024-10-21 01:54:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 01:54:08 | ERROR | stderr |     raise self._exception
2024-10-21 01:54:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 01:54:08 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 01:54:08 | ERROR | stderr |   File "/home/user/app/cli.py", line 125, in run_inference
2024-10-21 01:54:08 | ERROR | stderr |     output_ids = model.generate(
2024-10-21 01:54:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
2024-10-21 01:54:08 | ERROR | stderr |     return func(*args, **kwargs)
2024-10-21 01:54:08 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/modeling.py", line 135, in generate
2024-10-21 01:54:08 | ERROR | stderr |     ) = self.prepare_inputs_labels_for_multimodal(
2024-10-21 01:54:08 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/ferret_arch.py", line 707, in prepare_inputs_labels_for_multimodal
2024-10-21 01:54:08 | ERROR | stderr |     raw_image_features, image_features, region_feature_map = self.encode_images(images, region_flag=region_flag, region_geo_sampler=region_geo_sampler)
2024-10-21 01:54:08 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/ferret_arch.py", line 553, in encode_images
2024-10-21 01:54:08 | ERROR | stderr |     image_features = self.get_model().get_vision_tower()(images)
2024-10-21 01:54:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
2024-10-21 01:54:08 | ERROR | stderr |     return self._call_impl(*args, **kwargs)
2024-10-21 01:54:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
2024-10-21 01:54:08 | ERROR | stderr |     return forward_call(*args, **kwargs)
2024-10-21 01:54:08 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/clip_encoder.py", line 102, in forward
2024-10-21 01:54:08 | ERROR | stderr |     image_forward_outs = self.vision_tower(images.to(device=self.device, dtype=self.dtype), output_hidden_states=True)
2024-10-21 01:54:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
2024-10-21 01:54:08 | ERROR | stderr |     return self._call_impl(*args, **kwargs)
2024-10-21 01:54:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
2024-10-21 01:54:08 | ERROR | stderr |     return forward_call(*args, **kwargs)
2024-10-21 01:54:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 1116, in forward
2024-10-21 01:54:08 | ERROR | stderr |     return self.vision_model(
2024-10-21 01:54:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
2024-10-21 01:54:08 | ERROR | stderr |     return self._call_impl(*args, **kwargs)
2024-10-21 01:54:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
2024-10-21 01:54:08 | ERROR | stderr |     return forward_call(*args, **kwargs)
2024-10-21 01:54:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 1040, in forward
2024-10-21 01:54:08 | ERROR | stderr |     hidden_states = self.embeddings(pixel_values)
2024-10-21 01:54:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
2024-10-21 01:54:08 | ERROR | stderr |     return self._call_impl(*args, **kwargs)
2024-10-21 01:54:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
2024-10-21 01:54:08 | ERROR | stderr |     return forward_call(*args, **kwargs)
2024-10-21 01:54:08 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 207, in forward
2024-10-21 01:54:08 | ERROR | stderr |     embeddings = embeddings + self.position_embedding(self.position_ids)
2024-10-21 01:54:08 | ERROR | stderr | RuntimeError: The size of tensor a (841) must match the size of tensor b (577) at non-singleton dimension 1
2024-10-21 01:54:09 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=8d8b7379cac569b30014b4cb23dfc7f48c288d5de95672319efb6d01b1d9f3dd&fail=true "HTTP/1.1 200 OK"
2024-10-21 01:54:09 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 01:54:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 01:54:09 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 01:54:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 01:54:09 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 01:54:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 01:54:09 | ERROR | stderr |     result = await self.call_function(
2024-10-21 01:54:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 01:54:09 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 01:54:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 01:54:09 | ERROR | stderr |     return await anext(iterator)
2024-10-21 01:54:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 01:54:09 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 01:54:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 01:54:09 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 01:54:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 01:54:09 | ERROR | stderr |     return await future
2024-10-21 01:54:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 01:54:09 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 01:54:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 01:54:09 | ERROR | stderr |     return next(iterator)
2024-10-21 01:54:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 01:54:09 | ERROR | stderr |     response = next(iterator)
2024-10-21 01:54:09 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 01:54:09 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 01:54:09 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 01:54:09 | ERROR | stderr |     raise res.value
2024-10-21 01:54:09 | ERROR | stderr | RuntimeError: The size of tensor a (841) must match the size of tensor b (577) at non-singleton dimension 1
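Both tracebacks above end at the same point: CLIP's position-embedding addition in `modeling_clip.py`. The tower's position table has 577 entries, the 24×24 patch grid plus one class token of a 336×336 CLIP-ViT-L/14 input, while the embeddings computed from the supplied pixel values have 841 positions, so the image that reached the tower was preprocessed at a resolution the position embeddings do not cover. One common remedy is to pin the image processor to the tower's native resolution before encoding; a minimal sketch, assuming a CLIP-style `image_processor` object and a 336-pixel tower (illustrative only, not the Space's actual preprocessing):

```python
# Sketch of one way to avoid the 841-vs-577 mismatch: force the processor to the
# resolution the position embeddings were built for. Illustrative, not the Space's code.
from PIL import Image

def preprocess_for_tower(image: Image.Image, image_processor, size: int = 336):
    # Resize/crop to the tower's native resolution so the number of patch tokens
    # matches the position-embedding table (24*24 patches + 1 class token = 577).
    image_processor.size = {"shortest_edge": size}
    image_processor.crop_size = {"height": size, "width": size}
    pixel_values = image_processor(images=image, return_tensors="pt")["pixel_values"]
    return pixel_values  # shape (1, 3, 336, 336)
```

The alternative, interpolating the tower's position embeddings to the larger patch grid, also resolves the mismatch but requires modifying the vision tower's weights rather than the preprocessing.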
2024-10-21 05:31:18 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 05:31:18 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 05:31:18 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 05:31:18 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 05:31:18 | ERROR | stderr |   warnings.warn(
2024-10-21 05:31:19 | ERROR | stderr | ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 05:31:19 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 05:31:19 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 05:31:19 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 05:31:19 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-21 05:31:19 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 05:31:19 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 05:31:19 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 05:31:19 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 05:31:19 | ERROR | stderr |   warnings.warn(
2024-10-21 05:31:19 | INFO | stdout | 
2024-10-21 05:31:19 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 05:32:22 | INFO | stdout | conv mode to gemma
2024-10-21 05:32:22 | INFO | stdout | Input Image Size:(400, 668)
2024-10-21 05:32:22 | INFO | stdout | Input Image Size:(400, 668)
2024-10-21 05:32:22 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\nexplain what you see in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['f5fd9bd8b1445ded1d843253a97af861']"}
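The `==== request ====` entry above is the payload the web server forwards for inference: a Gemma-turn prompt that embeds the system preamble and the `<image>` placeholder, plus sampling parameters and the hash of the uploaded image. A small sketch reconstructing that payload (field names and turn markers are taken from the logged request; the helper name is made up, and the transport endpoint is not shown in this excerpt):

```python
# Illustrative reconstruction of the request dict logged above. Only the field names and
# the Gemma turn markers come from the log; the function name is a made-up helper.
def build_ferret_request(system_prompt: str, user_text: str, image_hashes: list[str]) -> dict:
    prompt = (
        f"{system_prompt}"
        f"<start_of_turn>user\n<image>\n{user_text}<end_of_turn>\n"
        f"<start_of_turn>model\n"
    )
    return {
        "model": "jadechoghari/Ferret-UI-Gemma2b",
        "prompt": prompt,
        "temperature": 0.2,
        "top_p": 0.7,
        "max_new_tokens": 512,
        "stop": "<eos>",
        "images": image_hashes,
    }
```

In the log the `images` field is printed as a summary string ("List of 1 images: [...]"); the sketch simply passes the hash list directly.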
2024-10-21 05:32:22 | INFO | stdout | Input Image Size:(400, 668)
2024-10-21 05:32:22 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140011682618320&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjE3NC45NS4xNC4xMDMiLCJ1c2VyIjpudWxsLCJ1dWlkIjpudWxsLCJleHAiOjE3Mjk0ODE2MDF9.p91aavRNfUhXhtR7_xDR77uWW9J_eXQ1QwzzxprUW6o "HTTP/1.1 200 OK"
2024-10-21 05:32:22 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=19
2024-10-21 05:32:22 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=21
2024-10-21 05:32:22 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[13, 10, 15, 16, 14]
2024-10-21 05:32:22 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=f829e9a779857a63f21056211fbd00a7e0862eb63292764ba63eef4c2a077c9e&pid=185515 "HTTP/1.1 200 OK"
2024-10-21 05:32:23 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 05:32:25 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
2024-10-21 05:32:25 | ERROR | stderr | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-21 05:32:28 | ERROR | stderr | Loading checkpoint shards:  50%|█████     | 1/2 [00:03<00:03,  3.18s/it]
2024-10-21 05:32:29 | ERROR | stderr | Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.93s/it]
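The accelerate message above the shard-loading progress notes that 90% of device 0's memory is budgeted for weights and that `max_memory` can be raised. When `device_map` is in use, that budget is passed through `from_pretrained`; a hedged sketch with placeholder values (this is not necessarily how the Space loads the model):

```python
# Illustrative only: supplying an explicit `max_memory` budget alongside `device_map`,
# as hinted by the accelerate log line above. The GiB values are placeholders.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,  # the repo ships custom modeling code (modeling.py, ferret_arch.py)
    device_map="auto",
    max_memory={0: "38GiB", "cpu": "48GiB"},
)
```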
2024-10-21 05:32:29 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
2024-10-21 05:32:31 | INFO | stdout | image size:  (400, 668)
2024-10-21 05:32:32 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 05:32:32 | ERROR | stderr |     res = future.result()
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 05:32:32 | ERROR | stderr |     return self.__get_result()
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 05:32:32 | ERROR | stderr |     raise self._exception
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 05:32:32 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 05:32:32 | ERROR | stderr |   File "/home/user/app/cli.py", line 125, in run_inference
2024-10-21 05:32:32 | ERROR | stderr |     output_ids = model.generate(
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
2024-10-21 05:32:32 | ERROR | stderr |     return func(*args, **kwargs)
2024-10-21 05:32:32 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/modeling.py", line 135, in generate
2024-10-21 05:32:32 | ERROR | stderr |     ) = self.prepare_inputs_labels_for_multimodal(
2024-10-21 05:32:32 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/ferret_arch.py", line 707, in prepare_inputs_labels_for_multimodal
2024-10-21 05:32:32 | ERROR | stderr |     raw_image_features, image_features, region_feature_map = self.encode_images(images, region_flag=region_flag, region_geo_sampler=region_geo_sampler)
2024-10-21 05:32:32 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/ferret_arch.py", line 554, in encode_images
2024-10-21 05:32:32 | ERROR | stderr |     projected_image_features = self.get_model().mm_projector(image_features)
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
2024-10-21 05:32:32 | ERROR | stderr |     return self._call_impl(*args, **kwargs)
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
2024-10-21 05:32:32 | ERROR | stderr |     return forward_call(*args, **kwargs)
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/container.py", line 219, in forward
2024-10-21 05:32:32 | ERROR | stderr |     input = module(input)
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
2024-10-21 05:32:32 | ERROR | stderr |     return self._call_impl(*args, **kwargs)
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
2024-10-21 05:32:32 | ERROR | stderr |     return forward_call(*args, **kwargs)
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 117, in forward
2024-10-21 05:32:32 | ERROR | stderr |     return F.linear(input, self.weight, self.bias)
2024-10-21 05:32:32 | ERROR | stderr | RuntimeError: mat1 and mat2 must have the same dtype, but got Float and Half
2024-10-21 05:32:32 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=f829e9a779857a63f21056211fbd00a7e0862eb63292764ba63eef4c2a077c9e&fail=true "HTTP/1.1 200 OK"
2024-10-21 05:32:32 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 05:32:32 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 05:32:32 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 05:32:32 | ERROR | stderr |     result = await self.call_function(
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 05:32:32 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 05:32:32 | ERROR | stderr |     return await anext(iterator)
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 05:32:32 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 05:32:32 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 05:32:32 | ERROR | stderr |     return await future
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 05:32:32 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 05:32:32 | ERROR | stderr |     return next(iterator)
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 05:32:32 | ERROR | stderr |     response = next(iterator)
2024-10-21 05:32:32 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 05:32:32 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 05:32:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 05:32:32 | ERROR | stderr |     raise res.value
2024-10-21 05:32:32 | ERROR | stderr | RuntimeError: mat1 and mat2 must have the same dtype, but got Float and Half
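The RuntimeError above is raised inside mm_projector: F.linear receives image features in float32 while the projector weights were loaded in float16, so the matmul operands disagree. A common remedy in LLaVA/Ferret-style inference code is to cast the preprocessed image tensor to the model's dtype (and device) before calling generate(). The sketch below only illustrates that idea; run_inference_sketch, image_processor and the images keyword are assumptions based on the traceback, not the actual contents of cli.py.

    import torch
    from PIL import Image

    # Minimal sketch (assumption): make the image tensor's dtype match the model's
    # parameters so mm_projector's Linear layers see consistent dtypes.
    def run_inference_sketch(model, tokenizer, image_processor, image_path, input_ids):
        image = Image.open(image_path).convert("RGB")
        image_tensor = image_processor(image, return_tensors="pt")["pixel_values"]

        target_dtype = next(model.parameters()).dtype   # e.g. torch.float16
        image_tensor = image_tensor.to(device=model.device, dtype=target_dtype)

        with torch.inference_mode():
            output_ids = model.generate(
                input_ids.to(model.device),
                images=image_tensor,      # keyword assumed from the Ferret generate() traceback
                max_new_tokens=512,
                temperature=0.2,
                top_p=0.7,
                do_sample=True,
            )
        return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]

Loading the checkpoint in float32 would also make the dtypes agree, but on a ZeroGPU Space the half-precision cast keeps memory use lower.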
2024-10-21 05:37:54 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 05:37:54 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 05:37:54 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 05:37:54 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 05:37:54 | ERROR | stderr |   warnings.warn(
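The chatbot warning above is Gradio's deprecation of the tuple-based message format. The fix the warning itself suggests is to construct the component with type='messages' and return OpenAI-style role/content dicts; a minimal sketch (the callback and variable names are illustrative, not taken from app.py):

    import gradio as gr

    with gr.Blocks() as demo:
        # 'messages' format: each history entry is {"role": "user" | "assistant", "content": ...}
        chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", type="messages", height=750)
        textbox = gr.Textbox()

        def respond(user_text, history):
            history = history + [{"role": "user", "content": user_text}]
            history = history + [{"role": "assistant", "content": "(model reply here)"}]
            return history

        textbox.submit(respond, [textbox, chatbot], chatbot)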
2024-10-21 05:37:54 | ERROR | stderr | ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 05:37:54 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 05:37:54 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 05:37:54 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 05:37:54 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
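The three Node messages come from Gradio's server-side rendering: the app still starts (next line), but SSR needs a Node 20+ binary it can locate. If Node is present in the image, one option, sketched here under the assumption that the binary lives at /usr/bin/node, is to point Gradio at it through the environment variables named in the log before gradio is imported; if the installed Gradio version supports it, SSR can instead be switched off with ssr_mode=False in launch().

    import os

    # Assumed path and port; adjust to the actual Node install in the Space image.
    os.environ.setdefault("GRADIO_NODE_PATH", "/usr/bin/node")
    os.environ.setdefault("GRADIO_NODE_PORT", "7861")

    import gradio as gr  # import after setting the variables so the SSR server can pick them up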
2024-10-21 05:37:54 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 05:37:54 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 05:37:54 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 05:37:54 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 05:37:54 | ERROR | stderr |   warnings.warn(
2024-10-21 05:37:54 | INFO | stdout | 
2024-10-21 05:37:54 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 05:38:53 | INFO | stdout | conv mode to gemma
2024-10-21 05:38:53 | INFO | stdout | Input Image Size:(400, 668)
2024-10-21 05:38:53 | INFO | stdout | Input Image Size:(400, 668)
2024-10-21 05:38:53 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe what you see in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['f5fd9bd8b1445ded1d843253a97af861']"}
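The request dict shows the exact prompt string the web server hands to the worker: the Ferret system preamble, a Gemma-style user turn containing the <image> placeholder, and an opened model turn, with <eos> as the stop string. A small helper that reproduces this layout (the system text is copied verbatim from the request above; the function itself is illustrative):

    SYSTEM = (
        "A chat between a human and an AI that understands visuals. In images, "
        "[x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. "
        "Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. "
        "Image size: 1000x1000. Follow instructions."
    )

    def build_gemma_prompt(user_text, with_image=True):
        # <image> marks where image features are spliced in by
        # prepare_inputs_labels_for_multimodal; generation stops at "<eos>".
        user_block = ("<image>\n" if with_image else "") + user_text
        return (
            f"{SYSTEM}<start_of_turn>user\n{user_block}<end_of_turn>\n"
            "<start_of_turn>model\n"
        )

    prompt = build_gemma_prompt("describe what you see in details")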
2024-10-21 05:38:53 | INFO | stdout | Input Image Size:(400, 668)
2024-10-21 05:38:54 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140444052265936&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjE3NC45NS4xNC4xMDMiLCJ1c2VyIjpudWxsLCJ1dWlkIjpudWxsLCJleHAiOjE3Mjk0ODE5OTN9.DO6np5naKQ1opgWXq2boYpo1tnT03S0GqyfQXdOKzlE "HTTP/1.1 200 OK"
2024-10-21 05:38:54 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=18
2024-10-21 05:38:54 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=20
2024-10-21 05:38:54 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[15, 10, 14, 13]
2024-10-21 05:38:54 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=56e1b8cdad52721947b30e3765349a5e54c874886f781ea271756ba197927279&pid=186806 "HTTP/1.1 200 OK"
2024-10-21 05:38:55 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 05:38:56 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
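The accelerate message refers to the max_memory argument that from_pretrained accepts whenever a device_map is used. A minimal sketch of raising the budget explicitly (the sizes below are placeholders, not values measured on this Space):

    import torch
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "jadechoghari/Ferret-UI-Gemma2b",
        trust_remote_code=True,          # the Ferret code ships in the repo's modeling.py
        torch_dtype=torch.float16,
        device_map="auto",
        # Placeholder budgets; raising the GPU share past the default 90% is at your own risk.
        max_memory={0: "38GiB", "cpu": "64GiB"},
    )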
2024-10-21 05:38:56 | ERROR | stderr | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-21 05:38:59 | ERROR | stderr | Loading checkpoint shards:  50%|█████     | 1/2 [00:02<00:02,  2.74s/it]
2024-10-21 05:39:00 | ERROR | stderr | Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.76s/it]
2024-10-21 05:39:00 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
2024-10-21 05:39:02 | INFO | stdout | image size:  (400, 668)
2024-10-21 05:39:03 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 05:39:03 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 05:39:03 | ERROR | stderr |     res = future.result()
2024-10-21 05:39:03 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 05:39:03 | ERROR | stderr |     return self.__get_result()
2024-10-21 05:39:03 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 05:39:03 | ERROR | stderr |     raise self._exception
2024-10-21 05:39:03 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 05:39:03 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 05:39:03 | ERROR | stderr |   File "/home/user/app/cli.py", line 131, in run_inference
2024-10-21 05:39:03 | ERROR | stderr |     output_ids = model.generate(
2024-10-21 05:39:03 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
2024-10-21 05:39:03 | ERROR | stderr |     return func(*args, **kwargs)
2024-10-21 05:39:03 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/modeling.py", line 135, in generate
2024-10-21 05:39:03 | ERROR | stderr |     ) = self.prepare_inputs_labels_for_multimodal(
2024-10-21 05:39:03 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/ferret_arch.py", line 832, in prepare_inputs_labels_for_multimodal
2024-10-21 05:39:03 | ERROR | stderr |     assert batch_idx+1 == cur_image_idx
2024-10-21 05:39:03 | ERROR | stderr | AssertionError
2024-10-21 05:39:03 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=56e1b8cdad52721947b30e3765349a5e54c874886f781ea271756ba197927279&fail=true "HTTP/1.1 200 OK"
2024-10-21 05:39:03 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 05:39:03 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 05:39:03 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 05:39:03 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 05:39:03 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 05:39:03 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 05:39:03 | ERROR | stderr |     result = await self.call_function(
2024-10-21 05:39:03 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 05:39:03 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 05:39:03 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 05:39:03 | ERROR | stderr |     return await anext(iterator)
2024-10-21 05:39:03 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 05:39:03 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 05:39:03 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 05:39:03 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 05:39:03 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 05:39:03 | ERROR | stderr |     return await future
2024-10-21 05:39:03 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 05:39:03 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 05:39:03 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 05:39:03 | ERROR | stderr |     return next(iterator)
2024-10-21 05:39:03 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 05:39:03 | ERROR | stderr |     response = next(iterator)
2024-10-21 05:39:03 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 05:39:03 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 05:39:03 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 05:39:03 | ERROR | stderr |     raise res.value
2024-10-21 05:39:03 | ERROR | stderr | AssertionError
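Unlike the earlier dtype error, this AssertionError fires inside prepare_inputs_labels_for_multimodal: `assert batch_idx+1 == cur_image_idx` checks that exactly one set of image features was consumed per sample, which breaks when the number of <image> placeholders in the tokenized prompt does not match the number of image tensors passed to generate(). A hedged pre-flight check along those lines, done in the caller rather than the model (IMAGE_TOKEN_INDEX = -200 follows the usual LLaVA/Ferret convention and is an assumption):

    IMAGE_TOKEN_INDEX = -200  # placeholder id commonly used by LLaVA/Ferret-style code (assumption)

    def check_image_token_count(input_ids, images):
        """Fail early with a readable message instead of tripping the assert inside the model."""
        n_placeholders = int((input_ids == IMAGE_TOKEN_INDEX).sum().item())
        n_images = images.shape[0] if hasattr(images, "shape") else len(images)
        if n_placeholders != n_images:
            raise ValueError(
                f"Prompt contains {n_placeholders} <image> placeholder(s) "
                f"but {n_images} image tensor(s) were supplied."
            )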
2024-10-21 05:53:18 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 05:53:18 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 05:53:18 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 05:53:18 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 05:53:18 | ERROR | stderr |   warnings.warn(
2024-10-21 05:53:18 | ERROR | stderr | ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 05:53:19 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 05:53:19 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 05:53:19 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 05:53:19 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-21 05:53:19 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 05:53:19 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 05:53:19 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 05:53:19 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 05:53:19 | ERROR | stderr |   warnings.warn(
2024-10-21 05:53:19 | INFO | stdout | 
2024-10-21 05:53:19 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 05:53:39 | INFO | stdout | conv mode to gemma
2024-10-21 05:53:39 | INFO | stdout | Input Image Size:(711, 400)
2024-10-21 05:53:39 | INFO | stdout | Input Image Size:(711, 400)
2024-10-21 05:53:39 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe what you see in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['96dd1c245601ae0ca04ae1c44014cff7']"}
2024-10-21 05:53:39 | INFO | stdout | Input Image Size:(711, 400)
2024-10-21 05:53:40 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140636289833936&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjE3NC45NS4xNC4xMDMiLCJ1c2VyIjpudWxsLCJ1dWlkIjpudWxsLCJleHAiOjE3Mjk0ODI4Nzl9.wCqodtwM7-0MHRoPHtSwhvf_0Iiwh3KpM2ZK9I804Rw "HTTP/1.1 200 OK"
2024-10-21 05:53:40 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=18
2024-10-21 05:53:40 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=20
2024-10-21 05:53:40 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[10, 13, 16, 14]
2024-10-21 05:53:40 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=5cbcde5d68c7e3714f9c161548cdbb3f80a55c14d22fbbd92f3ab3e017c798a3&pid=189326 "HTTP/1.1 200 OK"
2024-10-21 05:53:41 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 05:53:43 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
2024-10-21 05:53:43 | ERROR | stderr | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-21 05:53:46 | ERROR | stderr | Loading checkpoint shards:  50%|█████     | 1/2 [00:03<00:03,  3.23s/it]
2024-10-21 05:53:47 | ERROR | stderr | Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.99s/it]
2024-10-21 05:53:47 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
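Annotation: the unused-weight report above can also be inspected programmatically instead of being read out of the warning text. A minimal sketch, assuming the FerretGemmaForCausalLM class is importable from the Space's own model code (the import path below is hypothetical; `output_loading_info` is a standard transformers from_pretrained option):

import torch
# Hypothetical import path -- the class lives in the Space's code, not in transformers.
from model_ferret_gemma import FerretGemmaForCausalLM

model, loading_info = FerretGemmaForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    torch_dtype=torch.float16,
    output_loading_info=True,  # also return dicts of missing/unexpected keys
)
# The vision tower keys listed in the warning show up as "unexpected_keys":
unused = loading_info["unexpected_keys"]
print(len(unused), "checkpoint weights were not used, e.g.", unused[:3])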
2024-10-21 05:53:49 | INFO | stdout | image size:  (711, 400)
2024-10-21 05:53:49 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 05:53:49 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 05:53:49 | ERROR | stderr |     res = future.result()
2024-10-21 05:53:49 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 05:53:49 | ERROR | stderr |     return self.__get_result()
2024-10-21 05:53:49 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 05:53:49 | ERROR | stderr |     raise self._exception
2024-10-21 05:53:49 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 05:53:49 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 05:53:49 | ERROR | stderr |   File "/home/user/app/cli.py", line 130, in run_inference
2024-10-21 05:53:49 | ERROR | stderr |     images = image_tensor.unsqueeze(0).to(args.data_type).cuda()
2024-10-21 05:53:49 | ERROR | stderr | NameError: name 'args' is not defined
2024-10-21 05:53:49 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=5cbcde5d68c7e3714f9c161548cdbb3f80a55c14d22fbbd92f3ab3e017c798a3&fail=true "HTTP/1.1 200 OK"
2024-10-21 05:53:49 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 05:53:49 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 05:53:49 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 05:53:49 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 05:53:49 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 05:53:49 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 05:53:49 | ERROR | stderr |     result = await self.call_function(
2024-10-21 05:53:49 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 05:53:49 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 05:53:49 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 05:53:49 | ERROR | stderr |     return await anext(iterator)
2024-10-21 05:53:49 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 05:53:49 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 05:53:49 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 05:53:49 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 05:53:49 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 05:53:49 | ERROR | stderr |     return await future
2024-10-21 05:53:49 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 05:53:49 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 05:53:49 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 05:53:49 | ERROR | stderr |     return next(iterator)
2024-10-21 05:53:49 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 05:53:49 | ERROR | stderr |     response = next(iterator)
2024-10-21 05:53:49 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 05:53:49 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 05:53:49 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 05:53:49 | ERROR | stderr |     raise res.value
2024-10-21 05:53:49 | ERROR | stderr | NameError: name 'args' is not defined
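Annotation: both tracebacks above end in the same failure, cli.py line 130 dereferencing a global `args` object that is never defined inside run_inference. A minimal sketch of one possible fix, passing the dtype in explicitly instead of reading it from a global (the parameter names and the rest of the signature are assumptions, not the Space's actual code):

import torch

def run_inference(image_tensor, model, tokenizer, data_type=torch.float16, **kwargs):
    # Use the dtype handed in by the caller rather than the undefined global `args`,
    # which is what raised the NameError in the traceback above.
    images = image_tensor.unsqueeze(0).to(data_type).cuda()
    ...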
2024-10-21 05:56:51 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 05:56:51 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 05:56:51 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 05:56:51 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 05:56:51 | ERROR | stderr |   warnings.warn(
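Annotation: a minimal sketch of what the deprecation warning above asks for, switching the chatbot component to the openai-style messages format (the height value is a placeholder; any existing list-of-tuples history would also need converting to role/content dicts):

import gradio as gr

# type="messages" expects a list of {"role": ..., "content": ...} dicts
# rather than the deprecated list-of-tuples history format.
chatbot = gr.Chatbot(type="messages", height=750)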
2024-10-21 05:56:51 | ERROR | stderr | ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 05:56:51 | ERROR | stderr | ZeroGPU tensors packing: 0.00B [00:00, ?B/s]

2024-10-21 05:56:51 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 05:56:51 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 05:56:51 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 05:56:51 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
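Annotation: the three stdout lines above refer to the GRADIO_NODE_PATH and GRADIO_NODE_PORT environment variables used by the SSR Node server. A sketch of setting them from Python before launch; whether an in-process assignment is picked up depends on when Gradio reads them, so exporting them in the Space configuration is the safer route (both values below are placeholders):

import os

# Placeholders -- point these at a real Node >= 20 binary and a free port.
os.environ.setdefault("GRADIO_NODE_PATH", "/usr/local/bin/node")
os.environ.setdefault("GRADIO_NODE_PORT", "7862")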
2024-10-21 05:56:52 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 05:56:52 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 05:56:52 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 05:56:52 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 05:56:52 | ERROR | stderr |   warnings.warn(
2024-10-21 05:56:52 | INFO | stdout | 
2024-10-21 05:56:52 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 05:57:11 | INFO | stdout | conv mode to gemma
2024-10-21 05:57:11 | INFO | stdout | Input Image Size:(711, 400)
2024-10-21 05:57:11 | INFO | stdout | Input Image Size:(711, 400)
2024-10-21 05:57:11 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe what you see in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['a8d832a4d4163a69b808476963cc7c2a']"}
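Annotation: for context, a minimal sketch of how a worker might translate the request payload logged above into Hugging Face generation arguments; the payload keys (temperature, top_p, max_new_tokens, stop) are taken from the log, everything else is an assumption:

def payload_to_generate_kwargs(payload: dict) -> dict:
    # Keys mirror the logged request dict.
    temperature = float(payload.get("temperature", 0.2))
    return {
        "do_sample": temperature > 0,
        "temperature": temperature,
        "top_p": float(payload.get("top_p", 0.7)),
        "max_new_tokens": int(payload.get("max_new_tokens", 512)),
    }

# The "stop" string ("<eos>") would be handled separately, e.g. via a
# stopping criterion or by truncating the decoded text at that token.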
2024-10-21 05:57:11 | INFO | stdout | Input Image Size:(711, 400)
2024-10-21 05:57:11 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140684771810256&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjE3NC45NS4xNC4xMDMiLCJ1c2VyIjpudWxsLCJ1dWlkIjpudWxsLCJleHAiOjE3Mjk0ODMwOTF9.kJBme1P0skhNBJ1Wc8892IB8QjufbWA5v28nDb_TipU "HTTP/1.1 200 OK"
2024-10-21 05:57:11 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=19
2024-10-21 05:57:11 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=21
2024-10-21 05:57:11 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[14, 15, 13, 16, 10]
2024-10-21 05:57:12 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=6ffcf6507ed7c3655ae17b6c4f53b8e83029298479520a2fb1d217b8bcf5561f&pid=190167 "HTTP/1.1 200 OK"
2024-10-21 05:57:13 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 05:57:14 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
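Annotation: the accelerate message above is tied to device-mapped loading; `max_memory` is a standard from_pretrained argument once `device_map` is used. A minimal sketch of raising it, reusing the hypothetical import from the earlier sketch (the memory figures are placeholders, not measured values):

import torch
from model_ferret_gemma import FerretGemmaForCausalLM  # hypothetical import path

model = FerretGemmaForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    torch_dtype=torch.float16,
    device_map="auto",
    # Placeholders: size these to the GPU/CPU RAM actually available on the Space.
    max_memory={0: "14GiB", "cpu": "30GiB"},
)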
2024-10-21 05:57:14 | ERROR | stderr | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-21 05:57:17 | ERROR | stderr | Loading checkpoint shards:  50%|█████     | 1/2 [00:02<00:02,  2.94s/it]
2024-10-21 05:57:18 | ERROR | stderr | Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.88s/it]
2024-10-21 05:57:18 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
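Every key in the dump above sits under model.vision_tower.*, i.e. the CLIP vision encoder, which LLaVA-style implementations typically re-instantiate through their own vision-tower path, so this particular warning is usually benign. A minimal sketch for confirming that the unused keys are confined to the vision tower, assuming the checkpoint is sharded safetensors with an index file (use "pytorch_model.bin.index.json" instead for .bin checkpoints):

# Minimal sketch (not part of the app): count which checkpoint keys belong to
# the vision tower versus the language model.
import json
from huggingface_hub import hf_hub_download

index_path = hf_hub_download("jadechoghari/Ferret-UI-Gemma2b",
                             "model.safetensors.index.json")
with open(index_path) as f:
    ckpt_keys = set(json.load(f)["weight_map"])

vision_keys = sorted(k for k in ckpt_keys if k.startswith("model.vision_tower."))
print(f"{len(vision_keys)} of {len(ckpt_keys)} checkpoint keys are vision-tower keys")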
2024-10-21 05:57:20 | INFO | stdout | image size:  (711, 400)
2024-10-21 05:57:21 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 05:57:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 05:57:21 | ERROR | stderr |     res = future.result()
2024-10-21 05:57:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 05:57:21 | ERROR | stderr |     return self.__get_result()
2024-10-21 05:57:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 05:57:21 | ERROR | stderr |     raise self._exception
2024-10-21 05:57:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 05:57:21 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 05:57:21 | ERROR | stderr |   File "/home/user/app/cli.py", line 137, in run_inference
2024-10-21 05:57:21 | ERROR | stderr |     output_ids = model.generate(
2024-10-21 05:57:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
2024-10-21 05:57:21 | ERROR | stderr |     return func(*args, **kwargs)
2024-10-21 05:57:21 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/modeling.py", line 135, in generate
2024-10-21 05:57:21 | ERROR | stderr |     ) = self.prepare_inputs_labels_for_multimodal(
2024-10-21 05:57:21 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/ferret_arch.py", line 817, in prepare_inputs_labels_for_multimodal
2024-10-21 05:57:21 | ERROR | stderr |     cur_image_features = image_features[cur_image_idx]
2024-10-21 05:57:21 | ERROR | stderr | IndexError: list index out of range
2024-10-21 05:57:22 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=6ffcf6507ed7c3655ae17b6c4f53b8e83029298479520a2fb1d217b8bcf5561f&fail=true "HTTP/1.1 200 OK"
2024-10-21 05:57:22 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 05:57:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 05:57:22 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 05:57:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 05:57:22 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 05:57:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 05:57:22 | ERROR | stderr |     result = await self.call_function(
2024-10-21 05:57:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 05:57:22 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 05:57:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 05:57:22 | ERROR | stderr |     return await anext(iterator)
2024-10-21 05:57:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 05:57:22 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 05:57:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 05:57:22 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 05:57:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 05:57:22 | ERROR | stderr |     return await future
2024-10-21 05:57:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 05:57:22 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 05:57:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 05:57:22 | ERROR | stderr |     return next(iterator)
2024-10-21 05:57:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 05:57:22 | ERROR | stderr |     response = next(iterator)
2024-10-21 05:57:22 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 05:57:22 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 05:57:22 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 05:57:22 | ERROR | stderr |     raise res.value
2024-10-21 05:57:22 | ERROR | stderr | IndexError: list index out of range
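Both tracebacks above end in the same IndexError from prepare_inputs_labels_for_multimodal: cur_image_idx walks past the end of image_features, which is consistent with the prompt containing more <image> placeholders than image tensors actually handed to generate(). A defensive pre-flight check in the caller, sketched here under that assumption (check_image_count is a hypothetical helper, not existing code in cli.py):

# Hypothetical guard: fail early with a readable error instead of letting
# generate() hit an IndexError deep inside ferret_arch.py.
IMAGE_TOKEN = "<image>"

def check_image_count(prompt: str, images: list) -> None:
    placeholders = prompt.count(IMAGE_TOKEN)
    if placeholders != len(images):
        raise ValueError(
            f"prompt has {placeholders} {IMAGE_TOKEN!r} placeholder(s) "
            f"but {len(images)} image tensor(s) were provided"
        )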
2024-10-21 06:23:32 | INFO | httpx | HTTP Request: POST http://device-api.zero/startup-report "HTTP/1.1 200 OK"
2024-10-21 06:23:32 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-21 06:23:32 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 06:23:32 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-21 06:23:32 | ERROR | stderr | /home/user/app/app.py:707: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
2024-10-21 06:23:32 | ERROR | stderr |   chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", visible=False).style(height=750)
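The GradioDeprecationWarning above points at the .style(height=750) call; on gradio 3.40+ the same layout is expressed directly in the constructor, roughly as follows:

# Equivalent construction without the deprecated .style() call.
import gradio as gr

chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", visible=False, height=750)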
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 06:23:33 | INFO | stdout | Running on local URL:  http://0.0.0.0:7860
2024-10-21 06:23:33 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2134: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 06:23:33 | ERROR | stderr |   warnings.warn(
2024-10-21 06:23:33 | INFO | stdout | 
2024-10-21 06:23:33 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 06:23:33 | INFO | stdout | IMPORTANT: You are using gradio version 3.50.2, however version 4.44.1 is available, please upgrade.
2024-10-21 06:23:33 | INFO | stdout | --------
2024-10-21 06:23:34 | INFO | stdout | state Conversation(system='A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.', roles=('user', 'model'), messages=[], offset=0, sep_style=<SeparatorStyle.GEMMA: 6>, sep='', sep2='<eos>', version='gemma', skip_next=False)
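The Conversation state above fixes the coordinate convention: points and boxes are read and emitted in a nominal 1000x1000 frame, independent of the real input resolution logged elsewhere (e.g. 400x586). A small sketch of mapping a model-space box back to pixel space (denormalize_box is an illustrative helper, not code from the app; the nominal frame size is taken from the system prompt):

# Illustrative only: rescale a [x1, y1, x2, y2] box from the prompt's nominal
# 1000x1000 frame back to the actual image size, e.g. (400, 586).
def denormalize_box(box, image_size, ref=1000):
    x1, y1, x2, y2 = box
    w, h = image_size
    return [x1 * w / ref, y1 * h / ref, x2 * w / ref, y2 * h / ref]

print(denormalize_box([100, 250, 600, 750], (400, 586)))
# -> [40.0, 146.5, 240.0, 439.5]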
2024-10-21 06:24:02 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 06:24:02 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 407, in call_prediction
2024-10-21 06:24:02 | ERROR | stderr |     output = await route_utils.call_process_api(
2024-10-21 06:24:02 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 226, in call_process_api
2024-10-21 06:24:02 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 06:24:02 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1548, in process_api
2024-10-21 06:24:02 | ERROR | stderr |     inputs = self.preprocess_data(fn_index, inputs, state)
2024-10-21 06:24:02 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1329, in preprocess_data
2024-10-21 06:24:02 | ERROR | stderr |     processed_input.append(block.preprocess(inputs[i]))
2024-10-21 06:24:02 | ERROR | stderr |   File "/home/user/app/app.py", line 558, in preprocess
2024-10-21 06:24:02 | ERROR | stderr |     return super().preprocess(x)
2024-10-21 06:24:02 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/components/image.py", line 253, in preprocess
2024-10-21 06:24:02 | ERROR | stderr |     assert isinstance(x, dict)
2024-10-21 06:24:02 | ERROR | stderr | AssertionError
2024-10-21 06:24:07 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 06:24:07 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 407, in call_prediction
2024-10-21 06:24:07 | ERROR | stderr |     output = await route_utils.call_process_api(
2024-10-21 06:24:07 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 226, in call_process_api
2024-10-21 06:24:07 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 06:24:07 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1548, in process_api
2024-10-21 06:24:07 | ERROR | stderr |     inputs = self.preprocess_data(fn_index, inputs, state)
2024-10-21 06:24:07 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1329, in preprocess_data
2024-10-21 06:24:07 | ERROR | stderr |     processed_input.append(block.preprocess(inputs[i]))
2024-10-21 06:24:07 | ERROR | stderr |   File "/home/user/app/app.py", line 558, in preprocess
2024-10-21 06:24:07 | ERROR | stderr |     return super().preprocess(x)
2024-10-21 06:24:07 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/components/image.py", line 253, in preprocess
2024-10-21 06:24:07 | ERROR | stderr |     assert isinstance(x, dict)
2024-10-21 06:24:07 | ERROR | stderr | AssertionError
2024-10-21 06:24:07 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 06:24:07 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 407, in call_prediction
2024-10-21 06:24:07 | ERROR | stderr |     output = await route_utils.call_process_api(
2024-10-21 06:24:07 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 226, in call_process_api
2024-10-21 06:24:07 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 06:24:07 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1548, in process_api
2024-10-21 06:24:07 | ERROR | stderr |     inputs = self.preprocess_data(fn_index, inputs, state)
2024-10-21 06:24:07 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1329, in preprocess_data
2024-10-21 06:24:07 | ERROR | stderr |     processed_input.append(block.preprocess(inputs[i]))
2024-10-21 06:24:07 | ERROR | stderr |   File "/home/user/app/app.py", line 558, in preprocess
2024-10-21 06:24:07 | ERROR | stderr |     return super().preprocess(x)
2024-10-21 06:24:07 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/components/image.py", line 253, in preprocess
2024-10-21 06:24:07 | ERROR | stderr |     assert isinstance(x, dict)
2024-10-21 06:24:07 | ERROR | stderr | AssertionError
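The three identical AssertionErrors above come from app.py's Image subclass delegating to gradio's Image.preprocess, which on 3.50 asserts that a sketch-tool input is a dict of {'image': ..., 'mask': ...}. One tolerant variant, sketched under the assumption that the app subclasses gr.Image around line 558 (the real class name is unknown, and depending on the gradio version a blank mask image may be needed instead of None):

# Sketch of a tolerant override: wrap a bare image value into the dict layout
# the sketch tool expects before calling the library's preprocess, so plain
# uploads no longer trip the assert.
import gradio as gr

class SketchImage(gr.Image):
    def preprocess(self, x):
        if x is not None and not isinstance(x, dict):
            x = {"image": x, "mask": None}
        return super().preprocess(x)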
2024-10-21 06:30:00 | INFO | httpx | HTTP Request: POST http://device-api.zero/startup-report "HTTP/1.1 200 OK"
2024-10-21 06:30:00 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-21 06:30:00 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 06:30:00 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=8, model_list_mode='once', share=False, moderate=False, embed=False, add_region_feature=False)
2024-10-21 06:30:00 | ERROR | stderr | /home/user/app/app.py:707: GradioDeprecationWarning: The `style` method is deprecated. Please set these arguments in the constructor instead.
2024-10-21 06:30:00 | ERROR | stderr |   chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", visible=False).style(height=750)
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 06:30:00 | INFO | stdout | Running on local URL:  http://0.0.0.0:7860
2024-10-21 06:30:00 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2134: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 06:30:00 | ERROR | stderr |   warnings.warn(
2024-10-21 06:30:00 | INFO | stdout | 
2024-10-21 06:30:00 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 06:30:00 | INFO | stdout | IMPORTANT: You are using gradio version 3.50.2, however version 4.44.1 is available, please upgrade.
2024-10-21 06:30:00 | INFO | stdout | --------
2024-10-21 06:30:01 | INFO | stdout | state Conversation(system='A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.', roles=('user', 'model'), messages=[], offset=0, sep_style=<SeparatorStyle.GEMMA: 6>, sep='', sep2='<eos>', version='gemma', skip_next=False)
2024-10-21 06:30:23 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 06:30:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 407, in call_prediction
2024-10-21 06:30:23 | ERROR | stderr |     output = await route_utils.call_process_api(
2024-10-21 06:30:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 226, in call_process_api
2024-10-21 06:30:23 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 06:30:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1548, in process_api
2024-10-21 06:30:23 | ERROR | stderr |     inputs = self.preprocess_data(fn_index, inputs, state)
2024-10-21 06:30:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1329, in preprocess_data
2024-10-21 06:30:23 | ERROR | stderr |     processed_input.append(block.preprocess(inputs[i]))
2024-10-21 06:30:23 | ERROR | stderr |   File "/home/user/app/app.py", line 558, in preprocess
2024-10-21 06:30:23 | ERROR | stderr |     return super().preprocess(x)
2024-10-21 06:30:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/components/image.py", line 253, in preprocess
2024-10-21 06:30:23 | ERROR | stderr |     assert isinstance(x, dict)
2024-10-21 06:30:23 | ERROR | stderr | AssertionError
2024-10-21 06:30:23 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 06:30:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 407, in call_prediction
2024-10-21 06:30:23 | ERROR | stderr |     output = await route_utils.call_process_api(
2024-10-21 06:30:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 226, in call_process_api
2024-10-21 06:30:23 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 06:30:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1550, in process_api
2024-10-21 06:30:23 | ERROR | stderr |     result = await self.call_function(
2024-10-21 06:30:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1199, in call_function
2024-10-21 06:30:23 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 06:30:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 519, in async_iteration
2024-10-21 06:30:23 | ERROR | stderr |     return await iterator.__anext__()
2024-10-21 06:30:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 512, in __anext__
2024-10-21 06:30:23 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 06:30:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 06:30:23 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 06:30:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 06:30:23 | ERROR | stderr |     return await future
2024-10-21 06:30:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 06:30:23 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 06:30:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 495, in run_sync_iterator_async
2024-10-21 06:30:23 | ERROR | stderr |     return next(iterator)
2024-10-21 06:30:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 649, in gen_wrapper
2024-10-21 06:30:23 | ERROR | stderr |     yield from f(*args, **kwargs)
2024-10-21 06:30:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 295, in gradio_handler
2024-10-21 06:30:23 | ERROR | stderr |     schedule_response = client.schedule(task_id=task_id, request=request, duration=duration_)
2024-10-21 06:30:23 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/client.py", line 102, in schedule
2024-10-21 06:30:23 | ERROR | stderr |     raise RuntimeError("ZeroGPU is only compatible with Gradio 4+")
2024-10-21 06:30:23 | ERROR | stderr | RuntimeError: ZeroGPU is only compatible with Gradio 4+
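The RuntimeError above is raised by spaces.zero while the Space is still on gradio 3.50.x; the later restarts in this log move to a newer gradio stack. Pinning the version up front (requirements.txt or the sdk_version field in the Space README) avoids failing only at inference time; a startup assertion along these lines makes the mismatch obvious, assuming the packaging module is available:

# Fail fast if the installed gradio cannot be scheduled by ZeroGPU.
from packaging.version import Version
import gradio as gr

if Version(gr.__version__) < Version("4.0.0"):
    raise RuntimeError(
        f"ZeroGPU requires gradio>=4, found {gr.__version__}; "
        "bump requirements.txt / sdk_version for the Space"
    )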
2024-10-21 15:19:28 | INFO | stdout | state Conversation(system='A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.', roles=('user', 'model'), messages=[], offset=0, sep_style=<SeparatorStyle.GEMMA: 6>, sep='', sep2='<eos>', version='gemma', skip_next=False)
2024-10-21 15:20:26 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 15:20:26 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 15:20:26 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 15:20:26 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 15:20:26 | ERROR | stderr |   File "/home/user/app/app.py", line 526, in <module>
2024-10-21 15:20:26 | ERROR | stderr |     demo = build_demo(args.embed, concurrency_count=args.concurrency_count)
2024-10-21 15:20:26 | ERROR | stderr |   File "/home/user/app/app.py", line 378, in build_demo
2024-10-21 15:20:26 | ERROR | stderr |     gr.Examples(examples=[
2024-10-21 15:20:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/helpers.py", line 58, in create_examples
2024-10-21 15:20:26 | ERROR | stderr |     examples_obj = Examples(
2024-10-21 15:20:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/helpers.py", line 209, in __init__
2024-10-21 15:20:26 | ERROR | stderr |     self.processed_examples = [
2024-10-21 15:20:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/helpers.py", line 210, in <listcomp>
2024-10-21 15:20:26 | ERROR | stderr |     [
2024-10-21 15:20:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/helpers.py", line 211, in <listcomp>
2024-10-21 15:20:26 | ERROR | stderr |     component.postprocess(sample)
2024-10-21 15:20:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/components/image.py", line 301, in postprocess
2024-10-21 15:20:26 | ERROR | stderr |     return client_utils.encode_url_or_file_to_base64(y)
2024-10-21 15:20:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio_client/utils.py", line 400, in encode_url_or_file_to_base64
2024-10-21 15:20:26 | ERROR | stderr |     return encode_file_to_base64(path)
2024-10-21 15:20:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio_client/utils.py", line 373, in encode_file_to_base64
2024-10-21 15:20:26 | ERROR | stderr |     with open(f, "rb") as file:
2024-10-21 15:20:26 | ERROR | stderr | FileNotFoundError: [Errno 2] No such file or directory: '/home/user/app/examples/extreme_ironing.jpg'
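This crash happens while build_demo registers gr.Examples: one of the example image paths is missing from the Space checkout, and gradio 3.50 encodes example files eagerly at construction time. A sketch that only registers rows whose files are present (the component names and example list below are placeholders, not the app's actual ones):

import os
import gradio as gr

candidate_examples = [
    ["examples/extreme_ironing.jpg", "describe what you see in details"],
]

with gr.Blocks() as demo:
    imagebox = gr.Image()
    textbox = gr.Textbox()
    # Skip rows whose image asset is missing instead of crashing at startup.
    present = [row for row in candidate_examples if os.path.exists(row[0])]
    if present:
        gr.Examples(examples=present, inputs=[imagebox, textbox])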
2024-10-21 15:20:27 | INFO | stdout | IMPORTANT: You are using gradio version 3.50.2, however version 4.44.1 is available, please upgrade.
2024-10-21 15:20:27 | INFO | stdout | --------
2024-10-21 15:21:27 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 15:21:27 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 15:21:27 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 15:21:27 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 15:21:27 | ERROR | stderr |   warnings.warn(
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 15:21:27 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 15:21:27 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 15:21:27 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 15:21:27 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
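The three stdout lines above name the knobs the newer gradio stack exposes for its Node/SSR server. If Node 20+ is available in the image, pointing GRADIO_NODE_PATH at it (and optionally choosing a free port) before launch() is one way to satisfy SSR; the path below is an assumed location, not one confirmed by the log:

# Assumed Node location; only relevant when SSR is wanted. Must be set before
# gradio spawns its Node server (i.e. before demo.launch()).
import os

os.environ.setdefault("GRADIO_NODE_PATH", "/usr/local/bin/node")
os.environ.setdefault("GRADIO_NODE_PORT", "7861")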
2024-10-21 15:21:27 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 15:21:27 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 15:21:27 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 15:21:27 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 15:21:27 | ERROR | stderr |   warnings.warn(
2024-10-21 15:21:27 | INFO | stdout | 
2024-10-21 15:21:27 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 15:21:46 | INFO | stdout | conv mode to gemma
2024-10-21 15:21:46 | INFO | stdout | Input Image Size:(400, 586)
2024-10-21 15:21:46 | INFO | stdout | Input Image Size:(400, 586)
2024-10-21 15:21:46 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe what you see in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['9c4c9c437ec882a11cd6ed69ca2e5bd9']"}
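The request dump above shows how the GEMMA separator style renders the Conversation object into the final prompt: system text first, then a <start_of_turn>user ... <end_of_turn> block, ending with an open <start_of_turn>model turn. A stripped-down reconstruction of that rendering (not the app's Conversation class, just enough to reproduce the string in the log):

from dataclasses import dataclass, field

@dataclass
class GemmaConversation:
    system: str
    roles: tuple = ("user", "model")
    messages: list = field(default_factory=list)

    def get_prompt(self) -> str:
        # System text, then one <start_of_turn> block per message, ending with
        # an open model turn for the assistant to complete.
        out = self.system
        for role, text in self.messages:
            out += f"<start_of_turn>{role}\n{text}<end_of_turn>\n"
        return out + f"<start_of_turn>{self.roles[1]}\n"

conv = GemmaConversation(system="A chat between a human and an AI ...")
conv.messages.append(("user", "<image>\ndescribe what you see in details"))
print(conv.get_prompt())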
2024-10-21 15:21:46 | INFO | stdout | Input Image Size:(400, 586)
2024-10-21 15:21:46 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=139775176089968&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjE3NC45NS4xNC4xMDMiLCJ1c2VyIjpudWxsLCJ1dWlkIjpudWxsLCJleHAiOjE3Mjk1MTY5NjV9.qLaq1aSTKyQFMb4jYz4gh6lcjEyoBb-c0kExK8OCQ_A "HTTP/1.1 200 OK"
2024-10-21 15:21:46 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=19
2024-10-21 15:21:46 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=21
2024-10-21 15:21:46 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[13, 16, 15, 14, 10]
2024-10-21 15:21:48 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=90594bf5adbb1f0fbad2418828d795c18aff5114b5a091cc6fe2990576355d0c&pid=276553 "HTTP/1.1 200 OK"
2024-10-21 15:21:51 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 15:21:53 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
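The accelerate note above refers to the max_memory budget it infers when device_map-based loading is used; it can also be set explicitly at load time. A hedged sketch (the memory figures are illustrative, and trust_remote_code is assumed because the tracebacks show the repo's remote modeling.py in use):

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,
    device_map="auto",
    # Explicit per-device budget instead of the default 90%/10% split.
    max_memory={0: "35GiB", "cpu": "60GiB"},
)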
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards:  50%|█████     | 1/2 [00:05<00:05,  5.97s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:07<00:00,  3.59s/it]
2024-10-21 15:22:00 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
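Every key in the "not used" list above sits under model.vision_tower.*, which suggests (an inference from this log, not something the log itself states) that only the CLIP vision tower is re-initialized while the Gemma language weights load cleanly. A minimal sketch for confirming that from Python, assuming the checkpoint is loadable through AutoModelForCausalLM with trust_remote_code; output_loading_info is a standard from_pretrained option:

# Hedged sketch: inspect which checkpoint keys transformers reports as unused.
from transformers import AutoModelForCausalLM

model, loading_info = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,       # the checkpoint ships custom modeling code
    output_loading_info=True,     # also return missing/unexpected key lists
)
unused = loading_info["unexpected_keys"]
print(len(unused), "unused checkpoint keys")
# True if only the vision tower was re-initialized:
print(all(k.startswith("model.vision_tower.") for k in unused))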
2024-10-21 15:22:05 | INFO | stdout | image size:  (400, 586)
2024-10-21 15:22:05 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 15:22:05 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 15:22:05 | ERROR | stderr |     res = future.result()
2024-10-21 15:22:05 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 15:22:05 | ERROR | stderr |     return self.__get_result()
2024-10-21 15:22:05 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 15:22:05 | ERROR | stderr |     raise self._exception
2024-10-21 15:22:05 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 15:22:05 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 15:22:05 | ERROR | stderr |   File "/home/user/app/cli.py", line 137, in run_inference
2024-10-21 15:22:05 | ERROR | stderr |     output_ids = model.generate(
2024-10-21 15:22:05 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
2024-10-21 15:22:05 | ERROR | stderr |     return func(*args, **kwargs)
2024-10-21 15:22:05 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/modeling.py", line 135, in generate
2024-10-21 15:22:05 | ERROR | stderr |     ) = self.prepare_inputs_labels_for_multimodal(
2024-10-21 15:22:05 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/ferret_arch.py", line 817, in prepare_inputs_labels_for_multimodal
2024-10-21 15:22:05 | ERROR | stderr |     cur_image_features = image_features[cur_image_idx]
2024-10-21 15:22:05 | ERROR | stderr | IndexError: list index out of range
2024-10-21 15:22:06 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=90594bf5adbb1f0fbad2418828d795c18aff5114b5a091cc6fe2990576355d0c&fail=true "HTTP/1.1 200 OK"
2024-10-21 15:22:06 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 15:22:06 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 15:22:06 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 15:22:06 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 15:22:06 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 15:22:06 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 15:22:06 | ERROR | stderr |     result = await self.call_function(
2024-10-21 15:22:06 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 15:22:06 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 15:22:06 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 15:22:06 | ERROR | stderr |     return await anext(iterator)
2024-10-21 15:22:06 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 15:22:06 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 15:22:06 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 15:22:06 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 15:22:06 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 15:22:06 | ERROR | stderr |     return await future
2024-10-21 15:22:06 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 15:22:06 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 15:22:06 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 15:22:06 | ERROR | stderr |     return next(iterator)
2024-10-21 15:22:06 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 15:22:06 | ERROR | stderr |     response = next(iterator)
2024-10-21 15:22:06 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 15:22:06 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 15:22:06 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 15:22:06 | ERROR | stderr |     raise res.value
2024-10-21 15:22:06 | ERROR | stderr | IndexError: list index out of range
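The IndexError above is raised at ferret_arch.py line 817, where image_features is indexed once per image placeholder in the prompt; if the prompt references more images than were actually encoded, the list runs out. A hedged sketch of a pre-flight check that could be added in run_inference before calling model.generate; the "<image>" token string and the names below are assumptions for illustration, not code taken from the Space:

# Hedged sketch of a consistency check before model.generate().
DEFAULT_IMAGE_TOKEN = "<image>"

def check_image_token_count(prompt: str, images: list) -> None:
    # Count <image> placeholders in the prompt and compare with the number
    # of images actually supplied for encoding.
    n_tokens = prompt.count(DEFAULT_IMAGE_TOKEN)
    if n_tokens != len(images):
        raise ValueError(
            f"prompt references {n_tokens} image(s) but {len(images)} were supplied; "
            "prepare_inputs_labels_for_multimodal would otherwise raise "
            "IndexError: list index out of range"
        )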
2024-10-21 15:25:04 | INFO | stdout | conv mode to gemma
2024-10-21 15:25:04 | INFO | stdout | Input Image Size:(711, 400)
2024-10-21 15:25:04 | INFO | stdout | Input Image Size:(711, 400)
2024-10-21 15:25:04 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe what yu see<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['a8d832a4d4163a69b808476963cc7c2a']"}
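The request record above shows the full Gemma-style prompt the server builds: the coordinate-system preamble, then <start_of_turn>user / <start_of_turn>model markers around an <image> placeholder and the user message. A small sketch of assembling that prompt shape; the helper name is hypothetical, and the system string is copied from the logged request:

# Hedged sketch of the prompt layout seen in the request record.
SYSTEM = (
    "A chat between a human and an AI that understands visuals. In images, "
    "[x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. "
    "Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. "
    "Image size: 1000x1000. Follow instructions."
)

def build_gemma_prompt(user_message: str) -> str:
    # System preamble, then Gemma chat turns with an <image> placeholder
    # before the user text, ending with an open model turn.
    return (
        f"{SYSTEM}<start_of_turn>user\n<image>\n{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

print(build_gemma_prompt("describe what you see"))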
2024-10-21 15:25:04 | INFO | stdout | Input Image Size:(711, 400)
2024-10-21 15:25:04 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=139775176089968&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjE3NC45NS4xNC4xMDMiLCJ1c2VyIjpudWxsLCJ1dWlkIjpudWxsLCJleHAiOjE3Mjk1MTcxNjR9.2zZc9or3824jHnPazde4CnL-jpFez8IzA64nu76Fj18 "HTTP/1.1 200 OK"
2024-10-21 15:25:04 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=17
2024-10-21 15:25:04 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=19
2024-10-21 15:25:04 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[14, 13, 10]
2024-10-21 15:25:05 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=552cebcdaaeaf996a9ccf0f05254ae82ed1d550ecfdf2f53b2266fd34562562f&pid=277527 "HTTP/1.1 200 OK"
2024-10-21 15:25:06 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 15:25:09 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` to a higher value to use more memory (at your own risk).
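The accelerate note above means the default 90%/10% split can be overridden by passing max_memory when the model is loaded; max_memory is a standard from_pretrained option when device_map is used. A hedged sketch, with arbitrary example limits that are not taken from this Space's configuration:

# Hedged sketch: raising the per-device memory cap accelerate refers to.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,
    device_map="auto",
    # Hypothetical limits; the values are examples, not this Space's config.
    max_memory={0: "22GiB", "cpu": "30GiB"},
)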
2024-10-21 15:25:09 | ERROR | stderr | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-21 15:25:12 | ERROR | stderr | Loading checkpoint shards:  50%|█████     | 1/2 [00:03<00:03,  3.78s/it]
2024-10-21 15:25:13 | ERROR | stderr | Loading checkpoint shards: 100%|██████████| 2/2 [00:04<00:00,  2.32s/it]
2024-10-21 15:25:13 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', ..., 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
2024-10-21 15:25:15 | INFO | stdout | image size:  (711, 400)
2024-10-21 15:25:16 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 15:25:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 15:25:16 | ERROR | stderr |     res = future.result()
2024-10-21 15:25:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 15:25:16 | ERROR | stderr |     return self.__get_result()
2024-10-21 15:25:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 15:25:16 | ERROR | stderr |     raise self._exception
2024-10-21 15:25:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 15:25:16 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 15:25:16 | ERROR | stderr |   File "/home/user/app/cli.py", line 137, in run_inference
2024-10-21 15:25:16 | ERROR | stderr |     output_ids = model.generate(
2024-10-21 15:25:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
2024-10-21 15:25:16 | ERROR | stderr |     return func(*args, **kwargs)
2024-10-21 15:25:16 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/modeling.py", line 135, in generate
2024-10-21 15:25:16 | ERROR | stderr |     ) = self.prepare_inputs_labels_for_multimodal(
2024-10-21 15:25:16 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/ferret_arch.py", line 817, in prepare_inputs_labels_for_multimodal
2024-10-21 15:25:16 | ERROR | stderr |     cur_image_features = image_features[cur_image_idx]
2024-10-21 15:25:16 | ERROR | stderr | IndexError: list index out of range
2024-10-21 15:25:16 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=552cebcdaaeaf996a9ccf0f05254ae82ed1d550ecfdf2f53b2266fd34562562f&fail=true "HTTP/1.1 200 OK"
2024-10-21 15:25:16 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 15:25:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 15:25:16 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 15:25:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 15:25:16 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 15:25:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 15:25:16 | ERROR | stderr |     result = await self.call_function(
2024-10-21 15:25:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 15:25:16 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 15:25:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 15:25:16 | ERROR | stderr |     return await anext(iterator)
2024-10-21 15:25:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 15:25:16 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 15:25:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 15:25:16 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 15:25:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 15:25:16 | ERROR | stderr |     return await future
2024-10-21 15:25:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 15:25:16 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 15:25:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 15:25:16 | ERROR | stderr |     return next(iterator)
2024-10-21 15:25:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 15:25:16 | ERROR | stderr |     response = next(iterator)
2024-10-21 15:25:16 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 15:25:16 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 15:25:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 15:25:16 | ERROR | stderr |     raise res.value
2024-10-21 15:25:16 | ERROR | stderr | IndexError: list index out of range
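Note: the traceback above fails at ferret_arch.py line 817, where the model indexes image_features[cur_image_idx]; the IndexError means fewer image-feature tensors were produced than the prompt's <image> placeholders expect. A minimal illustrative guard for that failure mode (hypothetical helper, not part of the repository's code):

    def get_image_feature(image_features, cur_image_idx):
        # Hypothetical defensive lookup mirroring the failing line above.
        if cur_image_idx >= len(image_features):
            raise ValueError(
                f"prompt refers to image #{cur_image_idx}, but only "
                f"{len(image_features)} image feature tensor(s) were computed"
            )
        return image_features[cur_image_idx]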
2024-10-21 15:29:54 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 15:29:54 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 15:29:54 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 15:29:54 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 15:29:54 | ERROR | stderr |   warnings.warn(
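Note: the deprecation warning above is resolved by constructing the component with the openai-style message format it mentions; a minimal sketch (other constructor options the app uses are omitted here):

    import gradio as gr
    chatbot = gr.Chatbot(type="messages")  # expects dicts with 'role' and 'content' instead of tuples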
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 15:29:54 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 15:29:54 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 15:29:54 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 15:29:54 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-21 15:29:54 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 15:29:54 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 15:29:54 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 15:29:54 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 15:29:54 | ERROR | stderr |   warnings.warn(
2024-10-21 15:29:54 | INFO | stdout | 
2024-10-21 15:29:54 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 15:30:07 | INFO | stdout | conv mode to gemma
2024-10-21 15:30:07 | INFO | stdout | Input Image Size:(400, 586)
2024-10-21 15:30:07 | INFO | stdout | Input Image Size:(400, 586)
2024-10-21 15:30:07 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\nexplain what you see<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['9c4c9c437ec882a11cd6ed69ca2e5bd9']"}
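Note: the logged request shows how the Gemma conversation template wraps the system prompt and the user turn; a minimal reconstruction of that prompt string (the helper name is illustrative, not the app's actual code):

    def build_gemma_prompt(system_prompt: str, user_text: str) -> str:
        # Mirrors the 'prompt' field above: system text, then a user turn
        # containing the <image> placeholder, then an open model turn.
        return (
            f"{system_prompt}"
            f"<start_of_turn>user\n<image>\n{user_text}<end_of_turn>\n"
            f"<start_of_turn>model\n"
        )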
2024-10-21 15:30:07 | INFO | stdout | Input Image Size:(400, 586)
2024-10-21 15:30:07 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140591834374368&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjE3NC45NS4xNC4xMDMiLCJ1c2VyIjpudWxsLCJ1dWlkIjpudWxsLCJleHAiOjE3Mjk1MTc0Njd9.FO_xW1qpWy5go8eu0WetKo52DsHpADqDzjUkhlosPpY "HTTP/1.1 200 OK"
2024-10-21 15:30:07 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=20
2024-10-21 15:30:07 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=22
2024-10-21 15:30:07 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[16, 17, 12, 10, 15]
2024-10-21 15:30:08 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=648b429f91568bfe157028beedea7b464bb4a712abaa2d581c0972a2a6cc692f&pid=279039 "HTTP/1.1 200 OK"
2024-10-21 15:30:09 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 15:30:11 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
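Note: the accelerate message above refers to the standard max_memory argument accepted by from_pretrained when a device_map is used; a hedged sketch (the memory budgets are placeholders, not measured values):

    from transformers import AutoModelForCausalLM
    model = AutoModelForCausalLM.from_pretrained(
        "jadechoghari/Ferret-UI-Gemma2b",
        device_map="auto",
        max_memory={0: "18GiB", "cpu": "24GiB"},  # placeholder budgets
        trust_remote_code=True,  # the model ships custom modeling code
    )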
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards:  50%|█████     | 1/2 [00:02<00:02,  2.77s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.70s/it]
2024-10-21 15:30:14 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: [same list of unused 'model.vision_tower.vision_tower.vision_model.*' weights as logged above]
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
2024-10-21 15:30:18 | INFO | stdout | image size:  (711, 400)
2024-10-21 15:30:20 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 15:30:20 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 15:30:20 | ERROR | stderr |     res = future.result()
2024-10-21 15:30:20 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 15:30:20 | ERROR | stderr |     return self.__get_result()
2024-10-21 15:30:20 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 15:30:20 | ERROR | stderr |     raise self._exception
2024-10-21 15:30:20 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 15:30:20 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 15:30:20 | ERROR | stderr |   File "/home/user/app/cli.py", line 138, in run_inference
2024-10-21 15:30:20 | ERROR | stderr |     output_ids = model.generate(
2024-10-21 15:30:20 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
2024-10-21 15:30:20 | ERROR | stderr |     return func(*args, **kwargs)
2024-10-21 15:30:20 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/modeling.py", line 135, in generate
2024-10-21 15:30:20 | ERROR | stderr |     ) = self.prepare_inputs_labels_for_multimodal(
2024-10-21 15:30:20 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/ferret_arch.py", line 817, in prepare_inputs_labels_for_multimodal
2024-10-21 15:30:20 | ERROR | stderr |     cur_image_features = image_features[cur_image_idx]
2024-10-21 15:30:20 | ERROR | stderr | IndexError: list index out of range
2024-10-21 15:30:21 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=648b429f91568bfe157028beedea7b464bb4a712abaa2d581c0972a2a6cc692f&fail=true "HTTP/1.1 200 OK"
2024-10-21 15:30:21 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 15:30:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 15:30:21 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 15:30:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 15:30:21 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 15:30:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 15:30:21 | ERROR | stderr |     result = await self.call_function(
2024-10-21 15:30:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 15:30:21 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 15:30:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 15:30:21 | ERROR | stderr |     return await anext(iterator)
2024-10-21 15:30:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 15:30:21 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 15:30:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 15:30:21 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 15:30:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 15:30:21 | ERROR | stderr |     return await future
2024-10-21 15:30:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 15:30:21 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 15:30:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 15:30:21 | ERROR | stderr |     return next(iterator)
2024-10-21 15:30:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 15:30:21 | ERROR | stderr |     response = next(iterator)
2024-10-21 15:30:21 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 15:30:21 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 15:30:21 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 15:30:21 | ERROR | stderr |     raise res.value
2024-10-21 15:30:21 | ERROR | stderr | IndexError: list index out of range
2024-10-21 17:49:15 | INFO | stdout | conv mode to gemma
2024-10-21 17:49:15 | INFO | stdout | Input Image Size:(400, 476)
2024-10-21 17:49:15 | INFO | stdout | Input Image Size:(400, 476)
2024-10-21 17:49:15 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': "A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\nexplain what's happening<end_of_turn>\n<start_of_turn>model\n", 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['f829af6c9ae32318da1e4c5a67d2978a']"}
2024-10-21 17:49:15 | INFO | stdout | Input Image Size:(400, 476)
2024-10-21 17:49:15 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140591834374368&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NTI1ODE1fQ.en-6YG4Esy4uaUREqaQbsfwoyqqO-a3WBMu_SA0EQUE "HTTP/1.1 200 OK"
2024-10-21 17:49:15 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=19
2024-10-21 17:49:15 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=21
2024-10-21 17:49:15 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[12, 10, 16, 15, 14]
2024-10-21 17:49:16 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=a59f3af09d85a87e0044ddf7cd4278d20578a1c68e0f792588aec40df4acac2a&pid=289902 "HTTP/1.1 200 OK"
2024-10-21 17:49:19 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 17:49:22 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards:  50%|█████     | 1/2 [00:08<00:08,  8.15s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:09<00:00,  4.69s/it]
2024-10-21 17:49:31 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: every 'model.vision_tower.vision_tower.vision_model.*' parameter, i.e. embeddings.{class_embedding, patch_embedding.weight, position_embedding.weight}, encoder.layers.0-23.{layer_norm1, layer_norm2, mlp.fc1, mlp.fc2, self_attn.{q,k,v,out}_proj}.{weight, bias}, post_layernorm.{weight, bias} and pre_layrnorm.{weight, bias}
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
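The two notes above are the stock transformers explanation for skipped checkpoint weights. For a LLaVA-style model such as Ferret-UI this is normally benign: the CLIP vision tower is instantiated by the repo's remote code rather than from the 'model.vision_tower.*' checkpoint entries. A minimal load-and-inspect sketch using only public transformers APIs; the attribute path to the vision tower is an assumption borrowed from the LLaVA family, not confirmed for this repo:

# Minimal sketch: reproduce the load that emitted the warning above and check
# whether a vision tower module is present afterwards.
# Assumption: LLaVA-family models usually expose the tower as model.model.vision_tower.
from transformers import AutoModelForCausalLM

model_id = "jadechoghari/Ferret-UI-Gemma2b"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,   # pulls the repo's modeling.py / ferret_arch.py
    device_map="auto",        # same accelerate-driven placement as in this log
)

tower = getattr(getattr(model, "model", model), "vision_tower", None)
print("vision tower:", type(tower).__name__ if tower is not None else "not found")

If the tower is present (or is loaded lazily on the first forward pass), the warning can be ignored; only a missing tower combined with garbage outputs would point to a genuinely broken checkpoint.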
2024-10-21 17:49:36 | INFO | stdout | image size:  (711, 400)
2024-10-21 17:49:37 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 17:49:37 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 17:49:37 | ERROR | stderr |     res = future.result()
2024-10-21 17:49:37 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 17:49:37 | ERROR | stderr |     return self.__get_result()
2024-10-21 17:49:37 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 17:49:37 | ERROR | stderr |     raise self._exception
2024-10-21 17:49:37 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 17:49:37 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 17:49:37 | ERROR | stderr |   File "/home/user/app/cli.py", line 138, in run_inference
2024-10-21 17:49:37 | ERROR | stderr |     output_ids = model.generate(
2024-10-21 17:49:37 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
2024-10-21 17:49:37 | ERROR | stderr |     return func(*args, **kwargs)
2024-10-21 17:49:37 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/modeling.py", line 135, in generate
2024-10-21 17:49:37 | ERROR | stderr |     ) = self.prepare_inputs_labels_for_multimodal(
2024-10-21 17:49:37 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/ferret_arch.py", line 817, in prepare_inputs_labels_for_multimodal
2024-10-21 17:49:37 | ERROR | stderr |     cur_image_features = image_features[cur_image_idx]
2024-10-21 17:49:37 | ERROR | stderr | IndexError: list index out of range
2024-10-21 17:49:37 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=a59f3af09d85a87e0044ddf7cd4278d20578a1c68e0f792588aec40df4acac2a&fail=true "HTTP/1.1 200 OK"
2024-10-21 17:49:37 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 17:49:37 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 17:49:37 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 17:49:37 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 17:49:37 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 17:49:37 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 17:49:37 | ERROR | stderr |     result = await self.call_function(
2024-10-21 17:49:37 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 17:49:37 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 17:49:37 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 17:49:37 | ERROR | stderr |     return await anext(iterator)
2024-10-21 17:49:37 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 17:49:37 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 17:49:37 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 17:49:37 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 17:49:37 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 17:49:37 | ERROR | stderr |     return await future
2024-10-21 17:49:37 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 17:49:37 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 17:49:37 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 17:49:37 | ERROR | stderr |     return next(iterator)
2024-10-21 17:49:37 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 17:49:37 | ERROR | stderr |     response = next(iterator)
2024-10-21 17:49:37 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 17:49:37 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 17:49:37 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 17:49:37 | ERROR | stderr |     raise res.value
2024-10-21 17:49:37 | ERROR | stderr | IndexError: list index out of range
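Both tracebacks report the same root failure: prepare_inputs_labels_for_multimodal walks image_features with cur_image_idx and indexes past the end of the list. In LLaVA/Ferret-style code that usually means the prompt contains more <image> placeholder tokens than images actually reached the model, or the image list arrived empty for this request. A minimal caller-side guard sketch; IMAGE_TOKEN and the function name are illustrative assumptions, not identifiers from cli.py or ferret_arch.py:

# Minimal sketch of a pre-flight check that fails fast instead of letting
# prepare_inputs_labels_for_multimodal raise IndexError deep inside generate().
IMAGE_TOKEN = "<image>"  # assumed placeholder string used in the prompt

def check_prompt_images(prompt: str, images: list) -> None:
    n_tokens = prompt.count(IMAGE_TOKEN)
    n_images = len(images)
    if n_tokens != n_images:
        raise ValueError(
            f"prompt has {n_tokens} {IMAGE_TOKEN!r} placeholder(s) "
            f"but {n_images} image(s) were supplied"
        )

# The request logged below carries one <image> token and one image id, so the
# check passes; an empty image list would be caught here instead.
check_prompt_images(
    "<start_of_turn>user\n<image>\ndescribe this image in details<end_of_turn>\n",
    ["209a525bc0390cad9c1e5ae87c12d79f"],
)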
2024-10-21 17:55:46 | INFO | stdout | conv mode to gemma
2024-10-21 17:55:46 | INFO | stdout | Input Image Size:(400, 403)
2024-10-21 17:55:46 | INFO | stdout | Input Image Size:(400, 403)
2024-10-21 17:55:46 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe this image in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['209a525bc0390cad9c1e5ae87c12d79f']"}
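The payload above shows the Gemma-style template the web server builds: the system text, one user turn wrapped in <start_of_turn>user ... <end_of_turn> containing the <image> placeholder, and an open <start_of_turn>model\n for the reply, stopped on <eos>. A minimal sketch assembling the same dictionary; the system text is abbreviated here and the hash stands in for a server-side image id:

# Minimal sketch of the request body seen in the log entry above.
import json

def build_gemma_prompt(system: str, user_msg: str) -> str:
    # One user turn plus an open model turn, matching the logged prompt layout.
    return (
        f"{system}"
        f"<start_of_turn>user\n<image>\n{user_msg}<end_of_turn>\n"
        f"<start_of_turn>model\n"
    )

payload = {
    "model": "jadechoghari/Ferret-UI-Gemma2b",
    "prompt": build_gemma_prompt(
        "A chat between a human and an AI that understands visuals. ...",  # abbreviated system text
        "describe this image in details",
    ),
    "temperature": 0.2,
    "top_p": 0.7,
    "max_new_tokens": 512,
    "stop": "<eos>",
    "images": ["209a525bc0390cad9c1e5ae87c12d79f"],  # image id taken from the log
}
print(json.dumps(payload, indent=2))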
2024-10-21 17:55:46 | INFO | stdout | Input Image Size:(400, 403)
2024-10-21 17:55:47 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140591834374368&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NTI2MjA2fQ.YN2DYBaW_AUn8UpMenRW36SNGCEm_w2aRsVeLC2nBDY "HTTP/1.1 200 OK"
2024-10-21 17:55:47 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=18
2024-10-21 17:55:47 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=20
2024-10-21 17:55:47 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[16, 14, 10, 12]
2024-10-21 17:55:48 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=a7ee20da2ec36b200f351ec3391e6771f38fd843aa77b69ca836198160c9443f&pid=291152 "HTTP/1.1 200 OK"
2024-10-21 17:55:49 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 17:55:51 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
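The accelerate message above is the stock hint: by default 90% of GPU 0 is budgeted for weights and 10% is kept as an OOM buffer, and `max_memory` can raise the cap if the model would not otherwise fit. A minimal sketch of passing it through from_pretrained; the GiB figures are illustrative, not measured on this Space:

# Minimal sketch: override accelerate's default memory split at load time.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,
    device_map="auto",
    max_memory={0: "14GiB", "cpu": "30GiB"},  # illustrative per-device caps
)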
2024-10-21 17:55:51 | ERROR | stderr | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-21 17:55:55 | ERROR | stderr | Loading checkpoint shards:  50%|█████     | 1/2 [00:04<00:04,  4.25s/it]
2024-10-21 17:55:56 | ERROR | stderr | Loading checkpoint shards: 100%|██████████| 2/2 [00:05<00:00,  2.61s/it]
2024-10-21 17:55:56 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: the same 'model.vision_tower.vision_tower.vision_model.*' parameter set as in the 17:49:31 warning above
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
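A minimal sketch (assuming the checkpoint is loaded through transformers' from_pretrained with trust_remote_code=True, as this Space appears to do) of how the unused keys listed in the warning above can be inspected programmatically instead of read off the log:

from transformers import AutoModelForCausalLM

# output_loading_info=True makes from_pretrained also return a dict with
# 'missing_keys', 'unexpected_keys', 'mismatched_keys' and 'error_msgs'.
model, loading_info = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,
    output_loading_info=True,
)
unused = loading_info["unexpected_keys"]  # the model.vision_tower.* names above
print(f"{len(unused)} checkpoint weights were not used, e.g. {unused[:3]}")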
2024-10-21 17:55:59 | INFO | stdout | image size:  (711, 400)
2024-10-21 17:56:00 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 17:56:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 17:56:00 | ERROR | stderr |     res = future.result()
2024-10-21 17:56:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 17:56:00 | ERROR | stderr |     return self.__get_result()
2024-10-21 17:56:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 17:56:00 | ERROR | stderr |     raise self._exception
2024-10-21 17:56:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 17:56:00 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 17:56:00 | ERROR | stderr |   File "/home/user/app/cli.py", line 138, in run_inference
2024-10-21 17:56:00 | ERROR | stderr |     input_ids,
2024-10-21 17:56:00 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
2024-10-21 17:56:00 | ERROR | stderr |     return func(*args, **kwargs)
2024-10-21 17:56:00 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/modeling.py", line 135, in generate
2024-10-21 17:56:00 | ERROR | stderr |     ) = self.prepare_inputs_labels_for_multimodal(
2024-10-21 17:56:00 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/ferret_arch.py", line 817, in prepare_inputs_labels_for_multimodal
2024-10-21 17:56:00 | ERROR | stderr |     cur_image_features = image_features[cur_image_idx]
2024-10-21 17:56:00 | ERROR | stderr | IndexError: list index out of range
2024-10-21 17:56:01 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=a7ee20da2ec36b200f351ec3391e6771f38fd843aa77b69ca836198160c9443f&fail=true "HTTP/1.1 200 OK"
2024-10-21 17:56:01 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 17:56:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 17:56:01 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 17:56:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 17:56:01 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 17:56:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 17:56:01 | ERROR | stderr |     result = await self.call_function(
2024-10-21 17:56:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 17:56:01 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 17:56:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 17:56:01 | ERROR | stderr |     return await anext(iterator)
2024-10-21 17:56:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 17:56:01 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 17:56:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 17:56:01 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 17:56:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 17:56:01 | ERROR | stderr |     return await future
2024-10-21 17:56:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 17:56:01 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 17:56:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 17:56:01 | ERROR | stderr |     return next(iterator)
2024-10-21 17:56:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 17:56:01 | ERROR | stderr |     response = next(iterator)
2024-10-21 17:56:01 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 17:56:01 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 17:56:01 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 17:56:01 | ERROR | stderr |     raise res.value
2024-10-21 17:56:01 | ERROR | stderr | IndexError: list index out of range
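The IndexError above comes from ferret_arch.py indexing image_features[cur_image_idx] past the end of the list, which typically means the prompt carries more <image> placeholders than there are encoded images. A minimal sketch of a pre-flight check (a hypothetical helper, not part of the repo) that would surface the mismatch before generate() is called:

IMAGE_TOKEN = "<image>"

def check_image_count(prompt: str, images: list) -> None:
    # One image tensor must be available for every <image> placeholder,
    # otherwise prepare_inputs_labels_for_multimodal walks off the list.
    n_placeholders = prompt.count(IMAGE_TOKEN)
    if n_placeholders != len(images):
        raise ValueError(
            f"prompt has {n_placeholders} {IMAGE_TOKEN} placeholder(s) "
            f"but {len(images)} image(s) were supplied"
        )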
2024-10-21 18:00:42 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 18:00:42 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 18:00:42 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 18:00:42 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 18:00:42 | ERROR | stderr |   warnings.warn(
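A minimal sketch of the change this deprecation warning asks for: construct the Chatbot with type="messages" and exchange openai-style role/content dicts instead of (user, bot) tuples. Values below are illustrative only:

import gradio as gr

chatbot = gr.Chatbot(type="messages")
# history entries are then dicts rather than tuples:
history = [
    {"role": "user", "content": "whats in the image"},
    {"role": "assistant", "content": "..."},
]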
ZeroGPU tensors packing: 0.00B [00:00, ?B/s]

2024-10-21 18:00:43 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 18:00:43 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 18:00:43 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 18:00:43 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
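A minimal sketch (environment variable names taken from the messages above; the path and port values are placeholders) of pointing the Node-based SSR server at a Node 20 binary and a free port before launch() is called:

import os

os.environ["GRADIO_NODE_PATH"] = "/usr/local/bin/node"  # placeholder path to a Node >= 20 binary
os.environ["GRADIO_NODE_PORT"] = "7862"                 # any port that is actually free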
2024-10-21 18:00:43 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 18:00:43 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 18:00:43 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 18:00:43 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 18:00:43 | ERROR | stderr |   warnings.warn(
2024-10-21 18:00:43 | INFO | stdout | 
2024-10-21 18:00:43 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 18:01:19 | INFO | stdout | conv mode to gemma
2024-10-21 18:01:19 | INFO | stdout | Input Image Size:(400, 476)
2024-10-21 18:01:19 | INFO | stdout | Input Image Size:(400, 476)
2024-10-21 18:01:19 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\nexplaon what you see<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['f829af6c9ae32318da1e4c5a67d2978a']"}
2024-10-21 18:01:19 | INFO | stdout | Input Image Size:(400, 476)
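A minimal sketch of the coordinate convention stated in the request prompt above: the model reasons in a normalized 1000x1000 space, so any point or box it emits has to be rescaled to the real image size, here (400, 476), before it can be drawn (dividing by 1000 for simplicity):

def to_pixels(box_1000, image_size):
    # box_1000 is [x1, y1, x2, y2] in the model's 1000x1000 space
    x1, y1, x2, y2 = box_1000
    w, h = image_size
    return [x1 * w / 1000, y1 * h / 1000, x2 * w / 1000, y2 * h / 1000]

print(to_pixels([100, 250, 900, 750], (400, 476)))  # -> [40.0, 119.0, 360.0, 357.0]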
2024-10-21 18:01:19 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140314165141728&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NTI2NTM5fQ.x_oXSstHrzVPUMyDzPkwNGcF869Nz9tYNxIZncXaoCE "HTTP/1.1 200 OK"
2024-10-21 18:01:19 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=17
2024-10-21 18:01:19 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=20
2024-10-21 18:01:19 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[10, 19, 14, 13]
2024-10-21 18:01:19 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=6aad5e074610a6052fb447ebc0079dad9360486958bd9ad8f0adb5c5a3ef8b38&pid=292458 "HTTP/1.1 200 OK"
2024-10-21 18:01:24 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 18:01:26 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
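A minimal sketch (memory budgets are placeholders, not measured values) of the max_memory override the accelerate message above refers to, for a device_map="auto" load on a single GPU:

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,
    device_map="auto",
    max_memory={0: "14GiB", "cpu": "30GiB"},  # placeholder budgets per device
)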
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards:  50%|█████     | 1/2 [00:04<00:04,  4.03s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:04<00:00,  2.46s/it]
2024-10-21 18:01:31 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: [same list of model.vision_tower.* weights as in the warning above]
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
2024-10-21 18:01:34 | INFO | stdout | loading image ./f829af6c9ae32318da1e4c5a67d2978a.jpg
2024-10-21 18:01:34 | INFO | stdout | the image file :  ./f829af6c9ae32318da1e4c5a67d2978a.jpg
2024-10-21 18:01:34 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 18:01:34 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 18:01:34 | ERROR | stderr |     res = future.result()
2024-10-21 18:01:34 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 18:01:34 | ERROR | stderr |     return self.__get_result()
2024-10-21 18:01:34 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 18:01:34 | ERROR | stderr |     raise self._exception
2024-10-21 18:01:34 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 18:01:34 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 18:01:34 | ERROR | stderr |   File "/home/user/app/cli.py", line 89, in run_inference
2024-10-21 18:01:34 | ERROR | stderr |     image = load_image(image_file)
2024-10-21 18:01:34 | ERROR | stderr |   File "/home/user/app/cli.py", line 26, in load_image
2024-10-21 18:01:34 | ERROR | stderr |     print(f"Error loading image: {e}")
2024-10-21 18:01:34 | ERROR | stderr | NameError: name 'e' is not defined
2024-10-21 18:01:35 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=6aad5e074610a6052fb447ebc0079dad9360486958bd9ad8f0adb5c5a3ef8b38&fail=true "HTTP/1.1 200 OK"
2024-10-21 18:01:35 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 18:01:35 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 18:01:35 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 18:01:35 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 18:01:35 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 18:01:35 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 18:01:35 | ERROR | stderr |     result = await self.call_function(
2024-10-21 18:01:35 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 18:01:35 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 18:01:35 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 18:01:35 | ERROR | stderr |     return await anext(iterator)
2024-10-21 18:01:35 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 18:01:35 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 18:01:35 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 18:01:35 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 18:01:35 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 18:01:35 | ERROR | stderr |     return await future
2024-10-21 18:01:35 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 18:01:35 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 18:01:35 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 18:01:35 | ERROR | stderr |     return next(iterator)
2024-10-21 18:01:35 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 18:01:35 | ERROR | stderr |     response = next(iterator)
2024-10-21 18:01:35 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 18:01:35 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 18:01:35 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 18:01:35 | ERROR | stderr |     raise res.value
2024-10-21 18:01:35 | ERROR | stderr | NameError: name 'e' is not defined
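The two tracebacks above bottom out in /home/user/app/cli.py line 26, where an error message prints `e` even though no `except ... as e` binding is in scope, so whatever exception was actually raised while opening ./f829af6c9ae32318da1e4c5a67d2978a.jpg is masked by `NameError: name 'e' is not defined`. A hedged sketch of what a corrected load_image could look like; the real function bodies are not in the log, so everything beyond the function name and the printed message is an assumption:

    # Hypothetical reconstruction of load_image in /home/user/app/cli.py, based only on
    # the traceback above.
    from io import BytesIO

    import requests
    from PIL import Image

    def load_image(image_file):
        try:
            if image_file.startswith(("http://", "https://")):
                resp = requests.get(image_file, timeout=30)
                resp.raise_for_status()
                image = Image.open(BytesIO(resp.content)).convert("RGB")
            else:
                image = Image.open(image_file).convert("RGB")
            return image
        except Exception as e:
            # The failing code printed `e` without an `except ... as e` binding in scope,
            # which turns the real failure (e.g. a missing ./<hash>.jpg) into a NameError.
            print(f"Error loading image: {e}")
            raise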
2024-10-21 18:07:43 | INFO | stdout | conv mode to gemma
2024-10-21 18:07:43 | INFO | stdout | Input Image Size:(362, 410)
2024-10-21 18:07:43 | INFO | stdout | Input Image Size:(362, 410)
2024-10-21 18:07:43 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\nwhats in the image<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['36da526e3e0ad24decff5808117b2363']"}
2024-10-21 18:07:43 | INFO | stdout | Input Image Size:(362, 410)
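The logged request shows how the conversation is flattened into a single Gemma-style prompt: the system text, each turn wrapped in <start_of_turn>...<end_of_turn>, and an open model turn at the end, with <eos> as the stop string. A minimal illustration of that flattening (not the app's own code; the helper name is made up):

    def build_gemma_prompt(system, turns):
        """Flatten (role, text) turns into the <start_of_turn>/<end_of_turn> format seen in the request."""
        prompt = system
        for role, text in turns:
            prompt += f"<start_of_turn>{role}\n{text}<end_of_turn>\n"
        prompt += "<start_of_turn>model\n"  # leave the model turn open; generation stops on "<eos>"
        return prompt

    prompt = build_gemma_prompt(
        "A chat between a human and an AI that understands visuals. ... Follow instructions.",
        [("user", "<image>\nwhats in the image")],
    )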
2024-10-21 18:07:44 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140314165141728&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NTI2OTIzfQ.U9crN9izZgU9SnmD8kiNIRDk3TRq37naVSMZoeI7s9M "HTTP/1.1 200 OK"
2024-10-21 18:07:44 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=21
2024-10-21 18:07:44 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=23
2024-10-21 18:07:44 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[18, 10, 13, 14, 16, 15, 17]
2024-10-21 18:07:44 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=a79d6c3e36a109df9a0f4cf64855ac18552d412274e7d8250a318c982ac50d55&pid=293629 "HTTP/1.1 200 OK"
2024-10-21 18:07:46 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 18:07:48 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
2024-10-21 18:07:48 | ERROR | stderr | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-21 18:07:51 | ERROR | stderr | Loading checkpoint shards:  50%|█████     | 1/2 [00:02<00:02,  2.90s/it]
2024-10-21 18:07:51 | ERROR | stderr | Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.84s/it]
2024-10-21 18:07:51 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
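Every key in the "not used" list above sits under model.vision_tower.* and mirrors a CLIP-style vision transformer (down to CLIP's pre_layrnorm spelling), so only the vision-encoder weights are skipped by this loader while the Gemma language weights load normally; that is the case the "This IS expected" note refers to. A hedged way to verify it, assuming the repo's custom FerretGemmaForCausalLM code loads through the Auto classes with trust_remote_code:

    from transformers import AutoModelForCausalLM

    # output_loading_info=True also returns the loader's bookkeeping, including the
    # "unexpected_keys" list that the warning above prints.
    model, info = AutoModelForCausalLM.from_pretrained(
        "jadechoghari/Ferret-UI-Gemma2b",
        trust_remote_code=True,
        output_loading_info=True,
    )
    unexpected = info["unexpected_keys"]
    print(len(unexpected), "skipped keys")
    print(all(k.startswith("model.vision_tower.") for k in unexpected))  # expected: True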
2024-10-21 18:07:53 | INFO | stdout | loading image ./36da526e3e0ad24decff5808117b2363.jpg
2024-10-21 18:07:53 | INFO | stdout | the image file :  ./36da526e3e0ad24decff5808117b2363.jpg
2024-10-21 18:07:53 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 18:07:53 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 18:07:53 | ERROR | stderr |     res = future.result()
2024-10-21 18:07:53 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 18:07:53 | ERROR | stderr |     return self.__get_result()
2024-10-21 18:07:53 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 18:07:53 | ERROR | stderr |     raise self._exception
2024-10-21 18:07:53 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 18:07:53 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 18:07:53 | ERROR | stderr |   File "/home/user/app/cli.py", line 89, in run_inference
2024-10-21 18:07:53 | ERROR | stderr |     print("loading image", image_file)
2024-10-21 18:07:53 | ERROR | stderr |   File "/home/user/app/cli.py", line 26, in load_image
2024-10-21 18:07:53 | ERROR | stderr |     image = Image.open(image_file).convert('RGB')
2024-10-21 18:07:53 | ERROR | stderr | NameError: name 'e' is not defined
2024-10-21 18:07:53 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=a79d6c3e36a109df9a0f4cf64855ac18552d412274e7d8250a318c982ac50d55&fail=true "HTTP/1.1 200 OK"
2024-10-21 18:07:53 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 18:07:53 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 18:07:53 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 18:07:53 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 18:07:53 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 18:07:53 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 18:07:53 | ERROR | stderr |     result = await self.call_function(
2024-10-21 18:07:53 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 18:07:53 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 18:07:53 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 18:07:53 | ERROR | stderr |     return await anext(iterator)
2024-10-21 18:07:53 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 18:07:53 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 18:07:53 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 18:07:53 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 18:07:53 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 18:07:53 | ERROR | stderr |     return await future
2024-10-21 18:07:53 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 18:07:53 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 18:07:53 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 18:07:53 | ERROR | stderr |     return next(iterator)
2024-10-21 18:07:53 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 18:07:53 | ERROR | stderr |     response = next(iterator)
2024-10-21 18:07:53 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 18:07:53 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 18:07:53 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 18:07:53 | ERROR | stderr |     raise res.value
2024-10-21 18:07:53 | ERROR | stderr | NameError: name 'e' is not defined
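When load_image raises, spaces/zero/wrappers.py re-raises the worker's exception in the Gradio handler (`raise res.value`) and the whole turn fails. One possible mitigation, sketched under the assumption that run_inference in cli.py is the @spaces.GPU entry point called from http_bot and reusing the load_image above, is to catch the load failure inside it and return the error as text so the chatbot can display it:

    import spaces  # Hugging Face Spaces ZeroGPU decorator

    @spaces.GPU
    def run_inference(image_file, prompt, temperature=0.2, top_p=0.7, max_new_tokens=512):
        # Signature and return shape are assumptions; only the function name and the
        # image-loading step appear in the traceback above.
        try:
            image = load_image(image_file)
        except Exception as exc:
            # Surface the failure to the UI instead of letting it propagate through
            # the ZeroGPU wrapper as an unhandled exception.
            return [f"Could not load image {image_file!r}: {exc}"]
        # ... run the model on (image, prompt) and return the generated texts ...
        return []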
2024-10-21 18:08:43 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 18:08:43 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 18:08:43 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 18:08:43 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 18:08:43 | ERROR | stderr |   warnings.warn(
2024-10-21 18:08:44 | ERROR | stderr | ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 18:08:44 | ERROR | stderr | ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 18:08:44 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 18:08:44 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 18:08:44 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 18:08:44 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
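The Node messages above are informational: Gradio's server-side rendering wants a Node >= 20 binary and a free port, and the log itself names GRADIO_NODE_PATH and GRADIO_NODE_PORT as the controls. A sketch of setting them before launch (the path and port are placeholders; ssr_mode=False applies only if the installed Gradio's launch() accepts it):

    import os

    # Placeholders: point Gradio at a Node >= 20 executable and an open port.
    os.environ.setdefault("GRADIO_NODE_PATH", "/usr/local/bin/node")
    os.environ.setdefault("GRADIO_NODE_PORT", "7862")

    # Alternatively, skip server-side rendering entirely when launching the Blocks app:
    # demo.launch(ssr_mode=False)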
2024-10-21 18:08:44 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 18:08:44 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 18:08:44 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 18:08:44 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 18:08:44 | ERROR | stderr |   warnings.warn(
2024-10-21 18:08:44 | INFO | stdout | 
2024-10-21 18:08:44 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 18:09:12 | INFO | stdout | conv mode to gemma
2024-10-21 18:09:12 | INFO | stdout | Input Image Size:(58, 88)
2024-10-21 18:09:12 | INFO | stdout | Input Image Size:(58, 88)
2024-10-21 18:09:12 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe what you see<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['bba68c82a3a4976b167abf9be5d2ade2']"}
2024-10-21 18:09:12 | INFO | stdout | Input Image Size:(58, 88)
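The "images" field references the upload by a 32-hex-digit id, and the worker earlier tried to open ./<id>.jpg, so the server evidently caches each image under a content-derived filename that the inference side re-reads. A minimal sketch of that convention (the md5 choice and target directory are assumptions; only the <hex>.jpg naming is visible in the log):

    import hashlib
    import os

    from PIL import Image

    def cache_image(image: Image.Image, directory: str = ".") -> str:
        """Save an uploaded PIL image as <digest>.jpg and return the digest used in the request."""
        digest = hashlib.md5(image.tobytes()).hexdigest()  # hash choice is an assumption
        image.convert("RGB").save(os.path.join(directory, f"{digest}.jpg"))
        return digest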
2024-10-21 18:09:12 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140709126037728&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NTI3MDEyfQ.iQWM9MG_RBZh1bOvosbxHR4X3EoDj0vtn8MY_ze6CKc "HTTP/1.1 200 OK"
2024-10-21 18:09:12 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=19
2024-10-21 18:09:12 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=21
2024-10-21 18:09:12 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[15, 13, 14, 10, 17]
2024-10-21 18:09:12 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=d1a4b976d7f8bbf9abc5c79094ecc79724e2cc38b8460a4ab3f5bc8fff546245&pid=294147 "HTTP/1.1 200 OK"
2024-10-21 18:09:14 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 18:09:16 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
2024-10-21 18:09:16 | ERROR | stderr | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-21 18:09:20 | ERROR | stderr | Loading checkpoint shards:  50%|█████     | 1/2 [00:04<00:04,  4.26s/it]
2024-10-21 18:09:21 | ERROR | stderr | Loading checkpoint shards: 100%|██████████| 2/2 [00:05<00:00,  2.37s/it]
2024-10-21 18:09:21 | ERROR | stderr | Loading checkpoint shards: 100%|██████████| 2/2 [00:05<00:00,  2.66s/it]
2024-10-21 18:09:21 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
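Editor's note: the warning above is the standard transformers notice that some checkpoint tensors (here, the CLIP vision-tower weights under `model.vision_tower.vision_tower...`) were not consumed when `FerretGemmaForCausalLM` was instantiated; in LLaVA-style models the vision tower is usually attached and populated by the model's own code rather than by `from_pretrained` directly. A minimal, hedged sketch of how one could inspect those keys programmatically (assumes the Hub repo loads via `trust_remote_code`; this is not taken from the app's own code):

```python
# Hedged sketch: list the checkpoint tensors that from_pretrained skipped.
from transformers import AutoModelForCausalLM

model, loading_info = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,        # FerretGemmaForCausalLM is custom model code
    output_loading_info=True,      # also return missing/unexpected key lists
)

# The vision-tower names from the warning appear here; they are typically
# loaded later by the vision tower module itself.
unexpected = loading_info["unexpected_keys"]
print(f"{len(unexpected)} checkpoint tensors not consumed at init, e.g. {unexpected[:2]}")
```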
2024-10-21 18:09:25 | INFO | stdout | loading image /home/user/app/bba68c82a3a4976b167abf9be5d2ade2.jpg
2024-10-21 18:09:25 | INFO | stdout | the image file :  /home/user/app/bba68c82a3a4976b167abf9be5d2ade2.jpg
2024-10-21 18:09:25 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 18:09:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 18:09:25 | ERROR | stderr |     res = future.result()
2024-10-21 18:09:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 18:09:25 | ERROR | stderr |     return self.__get_result()
2024-10-21 18:09:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 18:09:25 | ERROR | stderr |     raise self._exception
2024-10-21 18:09:25 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 18:09:25 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 18:09:25 | ERROR | stderr |   File "/home/user/app/cli.py", line 90, in run_inference
2024-10-21 18:09:25 | ERROR | stderr |     image = load_image(image_file)
2024-10-21 18:09:25 | ERROR | stderr |   File "/home/user/app/cli.py", line 27, in load_image
2024-10-21 18:09:25 | ERROR | stderr |     print(f"Error loading image: {e}")
2024-10-21 18:09:25 | ERROR | stderr | NameError: name 'e' is not defined
2024-10-21 18:09:26 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=d1a4b976d7f8bbf9abc5c79094ecc79724e2cc38b8460a4ab3f5bc8fff546245&fail=true "HTTP/1.1 200 OK"
2024-10-21 18:09:26 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 18:09:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 18:09:26 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 18:09:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 18:09:26 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 18:09:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 18:09:26 | ERROR | stderr |     result = await self.call_function(
2024-10-21 18:09:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 18:09:26 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 18:09:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 18:09:26 | ERROR | stderr |     return await anext(iterator)
2024-10-21 18:09:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 18:09:26 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 18:09:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 18:09:26 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 18:09:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 18:09:26 | ERROR | stderr |     return await future
2024-10-21 18:09:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 18:09:26 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 18:09:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 18:09:26 | ERROR | stderr |     return next(iterator)
2024-10-21 18:09:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 18:09:26 | ERROR | stderr |     response = next(iterator)
2024-10-21 18:09:26 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 18:09:26 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 18:09:26 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 18:09:26 | ERROR | stderr |     raise res.value
2024-10-21 18:09:26 | ERROR | stderr | NameError: name 'e' is not defined
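Editor's note: both tracebacks above trace back to the same one-line bug. The `except` handler at cli.py line 27 formats an exception variable `e` that was never bound (i.e. `except Exception:` instead of `except Exception as e:`), so the handler itself raises `NameError` and the original image-loading error is lost. A hedged reconstruction of what a fixed `load_image` could look like (the actual function in the repo may differ):

```python
# Sketch of a corrected load_image, reconstructed from the traceback only.
from io import BytesIO

import requests
from PIL import Image


def load_image(image_file):
    try:
        if image_file.startswith(("http://", "https://")):
            resp = requests.get(image_file, timeout=30)
            resp.raise_for_status()
            image = Image.open(BytesIO(resp.content)).convert("RGB")
        else:
            image = Image.open(image_file).convert("RGB")
        return image
    except Exception as e:  # `as e` is what the original handler was missing
        print(f"Error loading image: {e}")
        raise
```

With `as e` in place, the ZeroGPU worker would surface the real failure (e.g. an unreadable or missing image file) instead of masking it with a `NameError`.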
2024-10-21 18:10:15 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 18:10:15 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 18:10:15 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 18:10:15 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 18:10:15 | ERROR | stderr |   warnings.warn(
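Editor's note: the deprecation warning above refers to the Chatbot message format introduced in newer Gradio releases. A minimal sketch of the change it asks for (illustrative only; the app's actual component setup lives in app.py):

```python
import gradio as gr

# Declare the component with the new message format instead of legacy tuples.
chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", type="messages")

# History entries then use openai-style role/content dicts rather than
# (user_message, bot_message) tuples.
history = [
    {"role": "user", "content": "describe what you see"},
    {"role": "assistant", "content": "The screen shows ..."},
]
```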
2024-10-21 18:10:15 | ERROR | stderr | ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 18:10:15 | ERROR | stderr | ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 18:10:15 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 18:10:15 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 18:10:15 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 18:10:15 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
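Editor's note: the three stdout lines above come from Gradio's SSR/Node probe; the environment variables it mentions can be set before launch, e.g. (placeholder path and port, only meaningful if Node 20+ is actually installed in the Space):

```python
import os

# Point gradio at a Node 20+ executable and pin the port the SSR server uses.
os.environ["GRADIO_NODE_PATH"] = "/usr/local/bin/node"   # placeholder path
os.environ["GRADIO_NODE_PORT"] = "7861"                  # placeholder port
```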
2024-10-21 18:10:15 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 18:10:15 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 18:10:15 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 18:10:15 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 18:10:15 | ERROR | stderr |   warnings.warn(
2024-10-21 18:10:15 | INFO | stdout | 
2024-10-21 18:10:15 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 18:10:23 | INFO | stdout | conv mode to gemma
2024-10-21 18:10:23 | INFO | stdout | Input Image Size:(400, 476)
2024-10-21 18:10:23 | INFO | stdout | Input Image Size:(400, 476)
2024-10-21 18:10:23 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe what you see<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['f829af6c9ae32318da1e4c5a67d2978a']"}
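Editor's note: the prompt string in this request follows the Gemma turn format, i.e. the system preamble, then `<start_of_turn>user ... <end_of_turn>`, then an open `<start_of_turn>model\n` for the reply, with `<eos>` as the stop string. A small illustrative helper that reproduces the logged string (hypothetical names; the app builds this through its own Conversation object):

```python
# Illustrative reconstruction of the gemma-style prompt assembly.
def build_gemma_prompt(system: str, turns: list[tuple[str, str | None]]) -> str:
    prompt = system
    for role, message in turns:
        if message is None:
            # Open turn left for the model to complete.
            prompt += f"<start_of_turn>{role}\n"
        else:
            prompt += f"<start_of_turn>{role}\n{message}<end_of_turn>\n"
    return prompt


prompt = build_gemma_prompt(
    "A chat between a human and an AI that understands visuals. ...",  # system preamble abbreviated
    [("user", "<image>\ndescribe what you see"), ("model", None)],
)
```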
2024-10-21 18:10:23 | INFO | stdout | Input Image Size:(400, 476)
2024-10-21 18:10:23 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140205599776992&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NTI3MDgzfQ.aE57FrJwtRazB0lzIzHCNQl10D769QDcOBfuD-sa2EU "HTTP/1.1 200 OK"
2024-10-21 18:10:23 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=20
2024-10-21 18:10:23 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=22
2024-10-21 18:10:23 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[15, 16, 10, 17, 13, 14]
2024-10-21 18:10:23 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=9c9f5876a2fe78e58a0a17f49f39fd7b7e7dfe5831c81134583c30b5c60fb094&pid=294631 "HTTP/1.1 200 OK"
2024-10-21 18:10:24 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 18:10:26 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
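Editor's note: the accelerate message above means the loader reserved 90% of GPU 0's memory for weights. If that budget is too tight, `max_memory` can be passed through `from_pretrained` alongside a device map; a hedged example with placeholder budgets (not the values this Space uses):

```python
# Sketch only: raise the per-device memory budgets used by the device_map loader.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,
    device_map="auto",
    max_memory={0: "20GiB", "cpu": "48GiB"},  # GPU 0 budget + CPU offload buffer
)
```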
2024-10-21 18:10:26 | ERROR | stderr | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-21 18:10:29 | ERROR | stderr | Loading checkpoint shards:  50%|█████     | 1/2 [00:02<00:02,  2.74s/it]
2024-10-21 18:10:30 | ERROR | stderr | Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.74s/it]
2024-10-21 18:10:30 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
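Every name in the list above sits under model.vision_tower.*, i.e. the CLIP-style tower whose parameters FerretGemmaForCausalLM evidently does not take directly from this checkpoint at init time, which is why transformers flags them as unused. A minimal sketch for confirming that reading programmatically, assuming only the standard output_loading_info flag of from_pretrained (the repo id comes from this log; trust_remote_code is implied by the custom modeling.py path seen later):

from transformers import AutoModelForCausalLM

# output_loading_info=True returns (model, loading_info); loading_info carries
# the missing/unexpected/mismatched key lists behind warnings like the one above.
model, loading_info = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,
    output_loading_info=True,
)
unexpected = loading_info["unexpected_keys"]
print(len(unexpected), "unused keys,",
      sum(k.startswith("model.vision_tower.") for k in unexpected),
      "of them under model.vision_tower.*")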
2024-10-21 18:10:32 | INFO | stdout | loading image /home/user/app/f829af6c9ae32318da1e4c5a67d2978a.jpg
2024-10-21 18:10:32 | INFO | stdout | the image file :  /home/user/app/f829af6c9ae32318da1e4c5a67d2978a.jpg
2024-10-21 18:10:32 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 18:10:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 18:10:32 | ERROR | stderr |     res = future.result()
2024-10-21 18:10:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 18:10:32 | ERROR | stderr |     return self.__get_result()
2024-10-21 18:10:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 18:10:32 | ERROR | stderr |     raise self._exception
2024-10-21 18:10:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 18:10:32 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 18:10:32 | ERROR | stderr |   File "/home/user/app/cli.py", line 90, in run_inference
2024-10-21 18:10:32 | ERROR | stderr |     image = load_image(image_file)
2024-10-21 18:10:32 | ERROR | stderr |   File "/home/user/app/cli.py", line 27, in load_image
2024-10-21 18:10:32 | ERROR | stderr |     print(f"Error loading image: {e}")
2024-10-21 18:10:32 | ERROR | stderr | NameError: name 'e' is not defined
2024-10-21 18:10:32 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=9c9f5876a2fe78e58a0a17f49f39fd7b7e7dfe5831c81134583c30b5c60fb094&fail=true "HTTP/1.1 200 OK"
2024-10-21 18:10:32 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 18:10:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 18:10:32 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 18:10:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 18:10:32 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 18:10:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 18:10:32 | ERROR | stderr |     result = await self.call_function(
2024-10-21 18:10:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 18:10:32 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 18:10:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 18:10:32 | ERROR | stderr |     return await anext(iterator)
2024-10-21 18:10:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 18:10:32 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 18:10:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 18:10:32 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 18:10:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 18:10:32 | ERROR | stderr |     return await future
2024-10-21 18:10:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 18:10:32 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 18:10:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 18:10:32 | ERROR | stderr |     return next(iterator)
2024-10-21 18:10:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 18:10:32 | ERROR | stderr |     response = next(iterator)
2024-10-21 18:10:32 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 18:10:32 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 18:10:32 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 18:10:32 | ERROR | stderr |     raise res.value
2024-10-21 18:10:32 | ERROR | stderr | NameError: name 'e' is not defined
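Both tracebacks above end in the same NameError: cli.py line 27 executes print(f"Error loading image: {e}") while e is unbound, the classic symptom of an except clause written without "as e". The original load_image source is not visible in this log, so the following is only a sketch of a plausible corrected version; the PIL/requests loading path is an assumption, and the key point is simply binding the exception:

from io import BytesIO

import requests
from PIL import Image

def load_image(image_file):
    """Hypothetical reconstruction of load_image in /home/user/app/cli.py."""
    try:
        if image_file.startswith(("http://", "https://")):
            resp = requests.get(image_file, timeout=10)
            resp.raise_for_status()
            return Image.open(BytesIO(resp.content)).convert("RGB")
        return Image.open(image_file).convert("RGB")
    except Exception as e:  # "as e" is what the failing handler appears to be missing
        print(f"Error loading image: {e}")
        return None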
2024-10-21 18:10:56 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 18:10:56 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 18:10:56 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 18:10:56 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 18:10:56 | ERROR | stderr |   warnings.warn(
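The chatbot deprecation warning above names its own fix; a minimal sketch of the suggested constructor and history format (elem_id and label are reused from this app's logs, the sample messages are purely illustrative):

import gradio as gr

# type="messages" switches history entries from (user, bot) tuples to
# openai-style dicts with "role" and "content" keys, as the warning suggests.
chatbot = gr.Chatbot(elem_id="chatbot", label="FERRET", type="messages")

history = [
    {"role": "user", "content": "describe what you see"},
    {"role": "assistant", "content": "A mobile UI screenshot with a toolbar and a list."},
]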
2024-10-21 18:10:56 | ERROR | stderr | ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 18:10:56 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 18:10:56 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 18:10:56 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 18:10:56 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-21 18:10:56 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 18:10:56 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 18:10:56 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 18:10:56 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 18:10:56 | ERROR | stderr |   warnings.warn(
2024-10-21 18:10:56 | INFO | stdout | 
2024-10-21 18:10:56 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2024-10-21 18:11:05 | INFO | stdout | conv mode to gemma
2024-10-21 18:11:05 | INFO | stdout | Input Image Size:(400, 476)
2024-10-21 18:11:05 | INFO | stdout | Input Image Size:(400, 476)
2024-10-21 18:11:05 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe waht you see<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['f829af6c9ae32318da1e4c5a67d2978a']"}
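The request above shows the payload shape the web server builds for the worker: a Gemma-formatted prompt containing an <image> placeholder, sampling parameters, a stop token, and image hashes rather than raw bytes. A sketch of assembling the same structure, with the system prompt elided and the worker URL left as an explicit assumption (the worker endpoint is not shown anywhere in this log):

import requests

SYSTEM = "A chat between a human and an AI that understands visuals. ..."  # full system prompt elided

payload = {
    "model": "jadechoghari/Ferret-UI-Gemma2b",
    "prompt": SYSTEM
    + "<start_of_turn>user\n<image>\ndescribe what you see<end_of_turn>\n<start_of_turn>model\n",
    "temperature": 0.2,
    "top_p": 0.7,
    "max_new_tokens": 512,
    "stop": "<eos>",
    "images": ["f829af6c9ae32318da1e4c5a67d2978a"],  # image hash(es) already registered with the server
}

# Hypothetical worker endpoint; the real target comes from the controller at localhost:21001.
resp = requests.post("http://localhost:21002/worker_generate_stream", json=payload, stream=True)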
2024-10-21 18:11:05 | INFO | stdout | Input Image Size:(400, 476)
2024-10-21 18:11:05 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=139907506462944&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NTI3MTI1fQ.RpMaP6ucRsiHLMJlkl8BVqmIqkWZD5c84VdNBVEFFAM "HTTP/1.1 200 OK"
2024-10-21 18:11:05 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=20
2024-10-21 18:11:05 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=22
2024-10-21 18:11:05 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[15, 10, 18, 13, 16, 14]
2024-10-21 18:11:06 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=06dda6b23bde95507be6016a00cc2c541af8a87032c4164221c6a257b3eb8cd3&pid=295036 "HTTP/1.1 200 OK"
2024-10-21 18:11:07 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 18:11:09 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
2024-10-21 18:11:09 | ERROR | stderr | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-21 18:11:12 | ERROR | stderr | Loading checkpoint shards:  50%|█████     | 1/2 [00:03<00:03,  3.38s/it]
2024-10-21 18:11:13 | ERROR | stderr | Loading checkpoint shards: 100%|██████████| 2/2 [00:04<00:00,  2.02s/it]
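The accelerate notice a few lines above (90% of device 0 for weights, 10% kept as an OOM buffer) can be overridden through max_memory when the model is dispatched with a device_map; the figures below are placeholders, not values taken from this Space:

from transformers import AutoModelForCausalLM

# max_memory caps how much of each device accelerate may fill while loading the
# two checkpoint shards; raising the GPU entry trades the OOM buffer for capacity.
model = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    device_map="auto",
    max_memory={0: "18GiB", "cpu": "48GiB"},  # placeholder limits
    trust_remote_code=True,  # the checkpoint ships its own modeling.py / ferret_arch.py
)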
2024-10-21 18:11:13 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: [same list of model.vision_tower.* weights as in the warning above]
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
2024-10-21 18:11:15 | INFO | stdout | loading image /home/user/app/f829af6c9ae32318da1e4c5a67d2978a.jpg
2024-10-21 18:11:15 | INFO | stdout | the image file :  /home/user/app/f829af6c9ae32318da1e4c5a67d2978a.jpg
2024-10-21 18:11:15 | INFO | stdout | image size:  (400, 476)
2024-10-21 18:11:15 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 18:11:15 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 18:11:15 | ERROR | stderr |     res = future.result()
2024-10-21 18:11:15 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 18:11:15 | ERROR | stderr |     return self.__get_result()
2024-10-21 18:11:15 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 18:11:15 | ERROR | stderr |     raise self._exception
2024-10-21 18:11:15 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 18:11:15 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 18:11:15 | ERROR | stderr |   File "/home/user/app/cli.py", line 140, in run_inference
2024-10-21 18:11:15 | ERROR | stderr |     output_ids = model.generate(
2024-10-21 18:11:15 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
2024-10-21 18:11:15 | ERROR | stderr |     return func(*args, **kwargs)
2024-10-21 18:11:15 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/modeling.py", line 135, in generate
2024-10-21 18:11:15 | ERROR | stderr |     ) = self.prepare_inputs_labels_for_multimodal(
2024-10-21 18:11:15 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/ferret_arch.py", line 817, in prepare_inputs_labels_for_multimodal
2024-10-21 18:11:15 | ERROR | stderr |     cur_image_features = image_features[cur_image_idx]
2024-10-21 18:11:15 | ERROR | stderr | IndexError: list index out of range
2024-10-21 18:11:16 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=06dda6b23bde95507be6016a00cc2c541af8a87032c4164221c6a257b3eb8cd3&fail=true "HTTP/1.1 200 OK"
2024-10-21 18:11:16 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 18:11:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 18:11:16 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 18:11:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 18:11:16 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 18:11:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 18:11:16 | ERROR | stderr |     result = await self.call_function(
2024-10-21 18:11:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 18:11:16 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 18:11:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 18:11:16 | ERROR | stderr |     return await anext(iterator)
2024-10-21 18:11:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 18:11:16 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 18:11:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 18:11:16 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 18:11:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 18:11:16 | ERROR | stderr |     return await future
2024-10-21 18:11:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 18:11:16 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 18:11:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 18:11:16 | ERROR | stderr |     return next(iterator)
2024-10-21 18:11:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 18:11:16 | ERROR | stderr |     response = next(iterator)
2024-10-21 18:11:16 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 18:11:16 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 18:11:16 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 18:11:16 | ERROR | stderr |     raise res.value
2024-10-21 18:11:16 | ERROR | stderr | IndexError: list index out of range
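Both tracebacks above end in the same failure: prepare_inputs_labels_for_multimodal indexes image_features[cur_image_idx] past the end of the list, i.e. the prompt consumed more <image> placeholders than image feature tensors were produced. A minimal defensive check before that merge step, as a hedged sketch (the IMAGE_TOKEN_INDEX constant and the guard itself are assumptions, not the repository's actual fix; only the variable names follow the traceback):

```python
import torch

# Assumption: the placeholder id used for <image> tokens in LLaVA-style code.
IMAGE_TOKEN_INDEX = -200

def check_image_token_alignment(input_ids: torch.Tensor, image_features: list) -> None:
    """Fail early with a clear message instead of an IndexError deep in the merge."""
    num_image_tokens = int((input_ids == IMAGE_TOKEN_INDEX).sum().item())
    if num_image_tokens != len(image_features):
        raise ValueError(
            f"Prompt uses {num_image_tokens} <image> token(s) but "
            f"{len(image_features)} image feature tensor(s) were encoded; "
            "each placeholder needs exactly one encoded image."
        )
```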
2024-10-21 18:14:01 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 18:14:01 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 18:14:01 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 18:14:01 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 18:14:01 | ERROR | stderr |   warnings.warn(
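The UserWarning above is Gradio's deprecation notice for tuple-style chat history; it can be silenced by constructing the chatbot with the openai-style message format it suggests. A minimal sketch, with an illustrative Blocks context (component names other than elem_id are placeholders):

```python
import gradio as gr

with gr.Blocks() as demo:
    # type="messages" stores history as {"role": ..., "content": ...} dicts.
    chatbot = gr.Chatbot(elem_id="chatbot", type="messages")

# History entries in the expected format:
history = [
    {"role": "user", "content": "describe what you see in detail"},
    {"role": "assistant", "content": "The screenshot shows a settings page ..."},
]
```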
2024-10-21 18:14:01 | ERROR | stderr | ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 18:14:01 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 18:14:01 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 18:14:01 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 18:14:01 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
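The three stdout lines above are Gradio's SSR setup hints about a missing Node runtime. A hedged sketch of wiring the two environment variables they name before launching (the Node path and port values are placeholders, not taken from this Space):

```python
import os

# Variable names come from the log messages above; the values are examples only.
os.environ["GRADIO_NODE_PATH"] = "/usr/local/bin/node"  # placeholder path to a Node >= 20 binary
os.environ["GRADIO_NODE_PORT"] = "7862"                 # placeholder port outside the busy range

# import gradio and call demo.launch() only after the variables are set.
```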
2024-10-21 18:14:01 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 18:14:01 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 18:14:01 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 18:14:01 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 18:14:01 | ERROR | stderr |   warnings.warn(
2024-10-21 18:14:01 | INFO | stdout | 
2024-10-21 18:14:01 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
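Because share=True is rejected on Hugging Face Spaces (the UserWarning at this launch), one hedged pattern is to request a public link only when running outside Spaces. Detecting Spaces via the SPACE_ID environment variable is an assumption of this sketch:

```python
import os
import gradio as gr

def build_demo() -> gr.Blocks:
    with gr.Blocks() as demo:
        gr.Markdown("demo placeholder")
    return demo

if __name__ == "__main__":
    on_spaces = os.getenv("SPACE_ID") is not None  # assumption: Spaces sets SPACE_ID
    build_demo().launch(share=not on_spaces)
```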
2024-10-21 18:14:24 | INFO | stdout | conv mode to gemma
2024-10-21 18:14:24 | INFO | stdout | Input Image Size:(362, 410)
2024-10-21 18:14:24 | INFO | stdout | Input Image Size:(362, 410)
2024-10-21 18:14:24 | INFO | gradio_web_server | ==== request ====
{'model': 'jadechoghari/Ferret-UI-Gemma2b', 'prompt': 'A chat between a human and an AI that understands visuals. In images, [x, y] denotes points: top-left [0, 0], bottom-right [width-1, height-1]. Increasing x moves right; y moves down. Bounding box: [x1, y1, x2, y2]. Image size: 1000x1000. Follow instructions.<start_of_turn>user\n<image>\ndescribe waht you see in details<end_of_turn>\n<start_of_turn>model\n', 'temperature': 0.2, 'top_p': 0.7, 'max_new_tokens': 512, 'stop': '<eos>', 'images': "List of 1 images: ['36da526e3e0ad24decff5808117b2363']"}
2024-10-21 18:14:24 | INFO | stdout | Input Image Size:(362, 410)
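The "==== request ====" block above shows the payload the web server logs before generation: a fully templated Gemma prompt, sampling parameters, a stop string, and the hash of the uploaded image. A hedged sketch of assembling a payload with the same fields (the endpoint URL, route name, and how images are actually transported are assumptions; only the field names mirror the log):

```python
import requests

payload = {
    "model": "jadechoghari/Ferret-UI-Gemma2b",
    # System preamble from the log omitted for brevity.
    "prompt": "<start_of_turn>user\n<image>\nDescribe what you see.<end_of_turn>\n<start_of_turn>model\n",
    "temperature": 0.2,
    "top_p": 0.7,
    "max_new_tokens": 512,
    "stop": "<eos>",
    "images": ["36da526e3e0ad24decff5808117b2363"],  # image hash, per the log; real transport may differ
}

# Hypothetical worker route; the actual endpoint is not shown in this log.
response = requests.post("http://localhost:21002/worker_generate_stream", json=payload, stream=True)
```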
2024-10-21 18:14:25 | INFO | httpx | HTTP Request: POST http://device-api.zero/schedule?cgroupPath=%2Fkubepods.slice%2Fkubepods-burstable.slice%2Fkubepods-burstable-podd01b5ff5_c2cc_4948_b3ed_1e8ea56d357d.slice%2Fcri-containerd-90f67dcd1b09d742955a3a5af322b4ae02beaf359f175274f45abcca942ae839.scope&taskId=140677354168544&enableQueue=true&token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpcCI6IjEyOS45Ny4xMjQuMjE1IiwidXNlciI6bnVsbCwidXVpZCI6bnVsbCwiZXhwIjoxNzI5NTI3MzI0fQ.oqmCIuuj-5US_LzyMPwA8kcLJIKg-LT-DGW8vRDoHnI "HTTP/1.1 200 OK"
2024-10-21 18:14:25 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.arg_queue._writer.fileno()=19
2024-10-21 18:14:25 | INFO | stdout | SPACES_ZERO_GPU_DEBUG self.res_queue._writer.fileno()=21
2024-10-21 18:14:25 | INFO | stdout | SPACES_ZERO_GPU_DEBUG fds=[12, 15, 14, 16, 10]
2024-10-21 18:14:26 | INFO | httpx | HTTP Request: POST http://device-api.zero/allow?allowToken=c065cfb8eef785a48a711364ac47f4a076bf7b18865f3bb6328cc03325c0abbb&pid=295814 "HTTP/1.1 200 OK"
2024-10-21 18:14:27 | INFO | stdout | SPACES_ZERO_GPU_DEBUG total_duration_in_callback=0
2024-10-21 18:14:29 | INFO | accelerate.utils.modeling | We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
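The accelerate message above points at the max_memory knob. A hedged sketch of raising it when loading the checkpoint (the GiB budgets are placeholders; trust_remote_code reflects the fact that the tracebacks load custom modeling code from the repo):

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,                    # repo ships custom modeling.py / ferret_arch.py
    device_map="auto",
    max_memory={0: "20GiB", "cpu": "30GiB"},   # placeholder budgets; tune to the actual hardware
)
```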
2024-10-21 18:14:29 | ERROR | stderr | Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
2024-10-21 18:14:32 | ERROR | stderr | Loading checkpoint shards:  50%|█████     | 1/2 [00:03<00:03,  3.16s/it]
2024-10-21 18:14:33 | ERROR | stderr | Loading checkpoint shards: 100%|██████████| 2/2 [00:03<00:00,  1.97s/it]
2024-10-21 18:14:33 | WARNING | transformers.modeling_utils | Some weights of the model checkpoint at jadechoghari/Ferret-UI-Gemma2b were not used when initializing FerretGemmaForCausalLM: ['model.vision_tower.vision_tower.vision_model.embeddings.class_embedding', 'model.vision_tower.vision_tower.vision_model.embeddings.patch_embedding.weight', 'model.vision_tower.vision_tower.vision_model.embeddings.position_embedding.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.0.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.1.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.layer_norm2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.10.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.11.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.12.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.13.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.14.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc1.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.15.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.16.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.17.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.18.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.19.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.mlp.fc2.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.2.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.20.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.21.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.22.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.23.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.k_proj.weight', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.3.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.4.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.5.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.6.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.6.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.7.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.bias', 
'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.8.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.layer_norm2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc1.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.mlp.fc2.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.k_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.out_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.q_proj.weight', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.bias', 'model.vision_tower.vision_tower.vision_model.encoder.layers.9.self_attn.v_proj.weight', 'model.vision_tower.vision_tower.vision_model.post_layernorm.bias', 'model.vision_tower.vision_tower.vision_model.post_layernorm.weight', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.bias', 'model.vision_tower.vision_tower.vision_model.pre_layrnorm.weight']
- This IS expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FerretGemmaForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
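The warning above enumerates every model.vision_tower.* parameter as unused when initializing FerretGemmaForCausalLM, followed by transformers' generic "expected / not expected" guidance. A hedged way to inspect the skipped keys programmatically instead of reading the wall of names (output_loading_info is a standard from_pretrained flag; grouping by prefix is just one way to summarize):

```python
from collections import Counter
from transformers import AutoModelForCausalLM

model, loading_info = AutoModelForCausalLM.from_pretrained(
    "jadechoghari/Ferret-UI-Gemma2b",
    trust_remote_code=True,
    output_loading_info=True,
)

# Count unused checkpoint keys by their first two path components,
# e.g. 'model.vision_tower', to confirm they all belong to the vision tower.
unused_by_prefix = Counter(".".join(key.split(".")[:2]) for key in loading_info["unexpected_keys"])
print(unused_by_prefix)
print(sorted(loading_info))  # missing_keys, unexpected_keys, mismatched_keys, error_msgs
```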
2024-10-21 18:14:35 | INFO | stdout | loading image /home/user/app/36da526e3e0ad24decff5808117b2363.jpg
2024-10-21 18:14:35 | INFO | stdout | the image file :  /home/user/app/36da526e3e0ad24decff5808117b2363.jpg
2024-10-21 18:14:36 | INFO | stdout | image size:  (362, 410)
2024-10-21 18:14:36 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 256, in thread_wrapper
2024-10-21 18:14:36 | ERROR | stderr |     res = future.result()
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
2024-10-21 18:14:36 | ERROR | stderr |     return self.__get_result()
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
2024-10-21 18:14:36 | ERROR | stderr |     raise self._exception
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/concurrent/futures/thread.py", line 58, in run
2024-10-21 18:14:36 | ERROR | stderr |     result = self.fn(*self.args, **self.kwargs)
2024-10-21 18:14:36 | ERROR | stderr |   File "/home/user/app/cli.py", line 140, in run_inference
2024-10-21 18:14:36 | ERROR | stderr |     output_ids = model.generate(
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
2024-10-21 18:14:36 | ERROR | stderr |     return func(*args, **kwargs)
2024-10-21 18:14:36 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/modeling.py", line 135, in generate
2024-10-21 18:14:36 | ERROR | stderr |     ) = self.prepare_inputs_labels_for_multimodal(
2024-10-21 18:14:36 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/ferret_arch.py", line 707, in prepare_inputs_labels_for_multimodal
2024-10-21 18:14:36 | ERROR | stderr |     raw_image_features, image_features, region_feature_map = self.encode_images(images, region_flag=region_flag, region_geo_sampler=region_geo_sampler)
2024-10-21 18:14:36 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/ferret_arch.py", line 553, in encode_images
2024-10-21 18:14:36 | ERROR | stderr |     image_features = self.get_model().get_vision_tower()(images)
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
2024-10-21 18:14:36 | ERROR | stderr |     return self._call_impl(*args, **kwargs)
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
2024-10-21 18:14:36 | ERROR | stderr |     return forward_call(*args, **kwargs)
2024-10-21 18:14:36 | ERROR | stderr |   File "/home/user/.cache/huggingface/modules/transformers_modules/jadechoghari/Ferret-UI-Gemma2b/28bcebb3965e5409aee774c7ed29447cf80cc078/clip_encoder.py", line 102, in forward
2024-10-21 18:14:36 | ERROR | stderr |     image_forward_outs = self.vision_tower(images.to(device=self.device, dtype=self.dtype), output_hidden_states=True)
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
2024-10-21 18:14:36 | ERROR | stderr |     return self._call_impl(*args, **kwargs)
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
2024-10-21 18:14:36 | ERROR | stderr |     return forward_call(*args, **kwargs)
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 1116, in forward
2024-10-21 18:14:36 | ERROR | stderr |     return self.vision_model(
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
2024-10-21 18:14:36 | ERROR | stderr |     return self._call_impl(*args, **kwargs)
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
2024-10-21 18:14:36 | ERROR | stderr |     return forward_call(*args, **kwargs)
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 1040, in forward
2024-10-21 18:14:36 | ERROR | stderr |     hidden_states = self.embeddings(pixel_values)
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
2024-10-21 18:14:36 | ERROR | stderr |     return self._call_impl(*args, **kwargs)
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
2024-10-21 18:14:36 | ERROR | stderr |     return forward_call(*args, **kwargs)
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 202, in forward
2024-10-21 18:14:36 | ERROR | stderr |     patch_embeds = self.patch_embedding(pixel_values.to(dtype=target_dtype))  # shape = [*, width, grid, grid]
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
2024-10-21 18:14:36 | ERROR | stderr |     return self._call_impl(*args, **kwargs)
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
2024-10-21 18:14:36 | ERROR | stderr |     return forward_call(*args, **kwargs)
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 458, in forward
2024-10-21 18:14:36 | ERROR | stderr |     return self._conv_forward(input, self.weight, self.bias)
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 454, in _conv_forward
2024-10-21 18:14:36 | ERROR | stderr |     return F.conv2d(input, weight, bias, self.stride,
2024-10-21 18:14:36 | ERROR | stderr | RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [1, 1, 5, 3, 336, 336]
2024-10-21 18:14:36 | INFO | httpx | HTTP Request: POST http://device-api.zero/release?allowToken=c065cfb8eef785a48a711364ac47f4a076bf7b18865f3bb6328cc03325c0abbb&fail=true "HTTP/1.1 200 OK"
2024-10-21 18:14:36 | ERROR | stderr | Traceback (most recent call last):
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
2024-10-21 18:14:36 | ERROR | stderr |     response = await route_utils.call_process_api(
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
2024-10-21 18:14:36 | ERROR | stderr |     output = await app.get_blocks().process_api(
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 2014, in process_api
2024-10-21 18:14:36 | ERROR | stderr |     result = await self.call_function(
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1579, in call_function
2024-10-21 18:14:36 | ERROR | stderr |     prediction = await utils.async_iteration(iterator)
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 691, in async_iteration
2024-10-21 18:14:36 | ERROR | stderr |     return await anext(iterator)
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 685, in __anext__
2024-10-21 18:14:36 | ERROR | stderr |     return await anyio.to_thread.run_sync(
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
2024-10-21 18:14:36 | ERROR | stderr |     return await get_async_backend().run_sync_in_worker_thread(
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
2024-10-21 18:14:36 | ERROR | stderr |     return await future
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 943, in run
2024-10-21 18:14:36 | ERROR | stderr |     result = context.run(func, *args)
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 668, in run_sync_iterator_async
2024-10-21 18:14:36 | ERROR | stderr |     return next(iterator)
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 829, in gen_wrapper
2024-10-21 18:14:36 | ERROR | stderr |     response = next(iterator)
2024-10-21 18:14:36 | ERROR | stderr |   File "/home/user/app/app.py", line 267, in http_bot
2024-10-21 18:14:36 | ERROR | stderr |     extracted_texts = run_inference(
2024-10-21 18:14:36 | ERROR | stderr |   File "/usr/local/lib/python3.10/site-packages/spaces/zero/wrappers.py", line 214, in gradio_handler
2024-10-21 18:14:36 | ERROR | stderr |     raise res.value
2024-10-21 18:14:36 | ERROR | stderr | RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [1, 1, 5, 3, 336, 336]
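Both tracebacks above report a 6-D pixel tensor, [1, 1, 5, 3, 336, 336], reaching CLIP's patch embedding, which accepts only [N, 3, H, W]; the two extra leading dimensions look like stacked anyres tiles that were batched once more on top. A hedged sketch of collapsing the leading dimensions before the vision tower is called (whether such a fix belongs in cli.py, app.py, or clip_encoder.py is an assumption; only the failing shape comes from the log):

```python
import torch

def flatten_image_tiles(images: torch.Tensor) -> torch.Tensor:
    """Collapse extra leading dims so conv2d receives a 4-D [N, C, H, W] batch."""
    if images.dim() > 4:
        images = images.reshape(-1, *images.shape[-3:])  # [1, 1, 5, 3, 336, 336] -> [5, 3, 336, 336]
    return images

# Shape from the traceback:
tiles = torch.zeros(1, 1, 5, 3, 336, 336)
print(flatten_image_tiles(tiles).shape)  # torch.Size([5, 3, 336, 336])
```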
2024-10-21 18:38:51 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 18:38:51 | INFO | gradio_web_server | Models: ['jadechoghari/Ferret-UI-Gemma2b']
2024-10-21 18:38:51 | INFO | gradio_web_server | Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:21001', concurrency_count=16, model_list_mode='once', share=False, moderate=False, embed=False)
2024-10-21 18:38:51 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/components/chatbot.py:222: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style 'role' and 'content' keys.
2024-10-21 18:38:51 | ERROR | stderr |   warnings.warn(
2024-10-21 18:38:51 | ERROR | stderr | ZeroGPU tensors packing: 0.00B [00:00, ?B/s]
2024-10-21 18:38:51 | INFO | httpx | HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
2024-10-21 18:38:51 | INFO | stdout | Cannot start Node server on any port in the range 7861-7861.
2024-10-21 18:38:51 | INFO | stdout | Please install Node 20 or higher and set the environment variable GRADIO_NODE_PATH to the path of your Node executable.
2024-10-21 18:38:51 | INFO | stdout | You can explicitly specify a port by setting the environment variable GRADIO_NODE_PORT.
2024-10-21 18:38:51 | INFO | stdout | * Running on local URL:  http://0.0.0.0:7860, with SSR ⚡
2024-10-21 18:38:51 | INFO | httpx | HTTP Request: GET http://localhost:7860/gradio_api/startup-events "HTTP/1.1 200 OK"
2024-10-21 18:38:51 | INFO | httpx | HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
2024-10-21 18:38:51 | ERROR | stderr | /usr/local/lib/python3.10/site-packages/gradio/blocks.py:2595: UserWarning: Setting share=True is not supported on Hugging Face Spaces
2024-10-21 18:38:51 | ERROR | stderr |   warnings.warn(
2024-10-21 18:38:51 | INFO | stdout | 
2024-10-21 18:38:51 | INFO | stdout | To create a public link, set `share=True` in `launch()`.