Running it locally.

#6
by zedwork - opened

Is it possible to run it locally, just like 1111? If there's a way, please share it in the replies; it would be helpful.

Update:
There's a way to run ModelScope in the 1111 WebUI as an extension. Just follow this guide: https://github.com/deforum-art/sd-webui-modelscope-text2video. All credit goes to the creators.

Hi, @zedwork
I think the following would work:

sudo apt install ffmpeg
git clone https://huggingface.co/spaces/damo-vilab/modelscope-text-to-video-synthesis
cd modelscope-text-to-video-synthesis
pip install -r requirements.txt
python app.py

Pardon my ignorance (n00b), but how would we actually go about getting this into the 1111 Stable Diffusion web UI?

Hi, @nerkderk What I wrote is just a way to run this Space locally, and not the way to integrate this to the AUTOMATIC1111 web UI. Sorry for the confusion.

@hysts thanks. if i follow your steps to run it locally how do I then use it?
also does that work on windows or mac?

@nerkderk As for how to use it, you'll see a message like Running on local URL: http://127.0.0.1:7860 in your terminal after running python app.py, so you can just open the URL with your browser.
As for the required environment, this demo uses about 16GB of RAM (main memory) and 16GB of VRAM (GPU memory), so, you first need to make sure that your local environment satisfies at least these. I only tested this on Ubuntu, so I'm not sure if it'll work on Windows. Also, I'm not 100% sure, but I don't think it will run on Mac.
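If you want to script the RAM side of that check on Linux, something like the following sketch should work. This is my own helper, not part of the Space: `parse_meminfo` and the 16GB threshold are assumptions based on the figure above, and `/proc/meminfo` is Linux-only.

```python
from pathlib import Path

REQUIRED_RAM_GB = 16  # rough figure from the comment above

def parse_meminfo(text: str) -> float:
    """Return total RAM in GB parsed from /proc/meminfo-style text."""
    for line in text.splitlines():
        if line.startswith("MemTotal:"):
            kb = int(line.split()[1])  # MemTotal is reported in kB
            return kb / 1024 / 1024
    raise ValueError("MemTotal not found")

meminfo = Path("/proc/meminfo")
if meminfo.exists():  # Linux only; silently skipped elsewhere
    total_gb = parse_meminfo(meminfo.read_text())
    verdict = "enough" if total_gb >= REQUIRED_RAM_GB else "too little"
    print(f"Total RAM: {total_gb:.1f} GB ({verdict})")
```

Checking VRAM is GPU-vendor-specific (e.g. via `nvidia-smi`), so it's left out here.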

Do you think it can run on a 3090?

Hi, @Manni1000
The 3090 seems to have 24GB of VRAM, so regarding VRAM, I think it's enough.

Do you think it can run on a 3090?

You can try lol, but it's probably going to take a long, long time. This is already slow, and it's running on an A100, a $17,000 GPU.

@hysts how would you run this in goog colab?

Hi, @RalphX1
I think you can do the following:

!git clone https://huggingface.co/spaces/damo-vilab/modelscope-text-to-video-synthesis
%cd modelscope-text-to-video-synthesis
!pip install -r requirements.txt
!pip uninstall -y modelscope
!pip install git+https://github.com/modelscope/modelscope
import app

But, as I mentioned above, this demo requires about 16GB of CPU RAM, so I think the memory available in the free Google Colab is not enough. Upgrading to Colab Pro might solve the OOM issue, though.

WARNING:modelscope:task text-to-video-synthesis input definition is missing
Traceback (most recent call last):
File "C:\Users\User1\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\routes.py", line 393, in run_predict
output = await app.get_blocks().process_api(
File "C:\Users\User1\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\blocks.py", line 1069, in process_api
result = await self.call_function(
File "C:\Users\User1\AppData\Local\Programs\Python\Python310\lib\site-packages\gradio\blocks.py", line 878, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\Users\User1\AppData\Local\Programs\Python\Python310\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "C:\Users\User1\AppData\Local\Programs\Python\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "C:\Users\User1\AppData\Local\Programs\Python\Python310\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "C:\Users\User1\modelscope-text-to-video-synthesis\app.py", line 43, in generate
return pipe({'text': prompt})[OutputKeys.OUTPUT_VIDEO]
File "C:\Users\User1\AppData\Local\Programs\Python\Python310\lib\site-packages\modelscope\pipelines\base.py", line 212, in __call__
output = self._process_single(input, *args, **kwargs)
File "C:\Users\User1\AppData\Local\Programs\Python\Python310\lib\site-packages\modelscope\pipelines\base.py", line 247, in _process_single
out = self.forward(out, **forward_params)
File "C:\Users\User1\AppData\Local\Programs\Python\Python310\lib\site-packages\modelscope\pipelines\multi_modal\text_to_video_synthesis_pipeline.py", line 58, in forward
video = self.model(input)
File "C:\Users\User1\AppData\Local\Programs\Python\Python310\lib\site-packages\modelscope\models\base\base_model.py", line 34, in __call__
return self.postprocess(self.forward(*args, **kwargs))
File "C:\Users\User1\AppData\Local\Programs\Python\Python310\lib\site-packages\modelscope\models\multi_modal\video_synthesis\text_to_video_synthesis_model.py", line 153, in forward
x0 = self.diffusion.ddim_sample_loop(
File "C:\Users\User1\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\User1\AppData\Local\Programs\Python\Python310\lib\site-packages\modelscope\models\multi_modal\video_synthesis\diffusion.py", line 219, in ddim_sample_loop
xt, _ = self.ddim_sample(xt, t, model, model_kwargs, clamp,
File "C:\Users\User1\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\User1\AppData\Local\Programs\Python\Python310\lib\site-packages\modelscope\models\multi_modal\video_synthesis\diffusion.py", line 168, in ddim_sample
_, _, _, x0 = self.p_mean_variance(xt, t, model, model_kwargs, clamp,
File "C:\Users\User1\AppData\Local\Programs\Python\Python310\lib\site-packages\modelscope\models\multi_modal\video_synthesis\diffusion.py", line 120, in p_mean_variance
var = _i(self.posterior_variance, t, xt)
File "C:\Users\User1\AppData\Local\Programs\Python\Python310\lib\site-packages\modelscope\models\multi_modal\video_synthesis\diffusion.py", line 14, in _i
return tensor[t].view(shape).to(x)
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
Keyboard interruption in main thread... closing server.

So I am trying to run this on Windows 10 with CUDA 11.7 and get these errors. Wondering how it would be possible to resolve them.

I feel like I am close to getting it, because the model gets loaded into VRAM and then does this.

For reference, I do have 24GB of VRAM.

Edit: I may have found a stackoverflow (nvm for yoloV7)

Hi, @smrtscl
Did you run these two lines? Maybe you can run these again.

pip uninstall -y modelscope
pip install git+https://github.com/modelscope/modelscope.git@refs/pull/207/head

The error looks like the one that occurs when using modelscope==1.4.1 installed from PyPI. (The current latest version in PyPI has a bug in text2video pipeline.)

Well, maybe this needs at least 12GB of VRAM to run locally. I tried it on my 6GB 3060, and an error occurred:
"CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 6.00 GiB total capacity; 5.20 GiB already allocated; 0 bytes free; 5.33 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF"
Perhaps reducing VRAM consumption could solve this problem, but I don't know how to do that.
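One thing you could try, based on the hint in the OOM message itself, is setting `PYTORCH_CUDA_ALLOC_CONF` before any CUDA allocation happens. A minimal sketch: the value 128 is an arbitrary example, not a tested recommendation, and on a 6GB card the model most likely still won't fit.

```python
import os

# Must be set before the first CUDA allocation, i.e. before importing
# torch / running app.py. 128 is an arbitrary example value.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```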

Hi, @smrtscl
Did you run these two lines? Maybe you can run these again.

pip uninstall -y modelscope
pip install git+https://github.com/modelscope/modelscope.git@refs/pull/207/head

The error looks like the one that occurs when using modelscope==1.4.1 installed from PyPI. (The current latest version in PyPI has a bug in text2video pipeline.)

I had to force-install. Man, my card's fans are going. I think it's finally working, thank you!

Hi, @megatonsonic
Yes, as I wrote in this comment, this demo requires about 16GB CPU RAM and 16GB GPU RAM.

Try the following command to install the latest version of modelscope (needs about 16GB RAM and 16GB GPU memory):

GIT_LFS_SKIP_SMUDGE=1 git clone https://github.com/modelscope/modelscope && cd modelscope && pip install -e . && cd ../

Now I am getting these

2023-03-20 01:02:34,128 - modelscope - WARNING - task text-to-video-synthesis input definition is missing
WARNING:modelscope:task text-to-video-synthesis input definition is missing
2023-03-20 01:02:56,760 - modelscope - WARNING - task text-to-video-synthesis output keys are missing
WARNING:modelscope:task text-to-video-synthesis output keys are missing

But it tries to do it.

I do not see the outputs. Wait, it's working, but not in the web UI. Must be my browser (Firefox).

Do you think it can run on a 3090?

You can try lol, but it's probably going to take a long, long time. This is already slow, and it's running on an A100, a $17,000 GPU.

It runs much faster on a 3090 than on Hugging Face, at least the free Hugging Face tier. On a 3090 it takes about 23s.

Do you think fine-tuning this model will be possible? And if so, how hard would it be?

Apparently it's been successfully run on a 3090, so has anyone attempted it on a 3080 yet? It's not quite 16GB, but it's so close!

@hysts, thanks for your help regarding Colab. I still get this error when trying to run it: "os.path.exists('weights/text2video_pytorch_model.pth')" returns "False"

@RalphX1 It seems the model weights were not downloaded properly. The code to download them is here. Probably it would work if you delete the weights directory and run it again.
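A small sketch of that cleanup step, assuming the Space stores its weights in a `weights` directory in the working directory as discussed in this thread:

```python
import shutil
from pathlib import Path

# A partially downloaded file can leave the weights directory present but
# unusable; removing it forces app.py to re-download on the next run.
weights_dir = Path("weights")
if weights_dir.exists():
    shutil.rmtree(weights_dir)
    print("Removed weights/; rerun `python app.py` to re-download.")
else:
    print("No weights directory found; nothing to clean up.")
```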

Do you think it can run on a 3090?

You can try lol, but it's probably going to take a long, long time. This is already slow, and it's running on an A100, a $17,000 GPU.

It runs much faster on a 3090 than on Hugging Face, at least the free Hugging Face tier. On a 3090 it takes about 23s.

I guess the performance gap is not significant for most tasks, so keep calm.

(modelscope) C:\Users\Alexey\modelscope-text-to-video-synthesis>python app.py
File "app.py", line 35
if (SPACE_ID := os.getenv('SPACE_ID')) is not None:
^
SyntaxError: invalid syntax

Any suggestions, please!

But, as I mentioned above, this demo requires about 16GB of CPU RAM, so I think the memory available in the free Google Colab is not enough. Upgrading to Colab Pro might solve the OOM issue, though.

I tried to run it on Colab: https://colab.research.google.com/drive/1O0UkTs8byTx3TdWrjYe34SCRqP97kTDV
And as you said, free Colab doesn't have enough memory; 16GB of RAM is necessary.

(modelscope) C:\Users\Alexey\modelscope-text-to-video-synthesis>python app.py
File "app.py", line 35
if (SPACE_ID := os.getenv('SPACE_ID')) is not None:
^
SyntaxError: invalid syntax

Any suggestions, please!

It looks like the problem is Python 3.7.15 instead of 3.8.

@hysts Thanks again for your help. It might work now. Not really sure, because it crashes due to not enough available RAM.

I put all this into GPT and yeah, it said the issue is: "The error you're encountering is because of the "walrus" operator (:=), which was introduced in Python 3.8. It appears that you are using an older version of Python that does not support this syntax."

@hysts I can now confirm that your solution worked.

Hi, @zedwork
I think the following would work:

git clone https://huggingface.co/spaces/damo-vilab/modelscope-text-to-video-synthesis
cd modelscope-text-to-video-synthesis
pip install -r requirements.txt
pip uninstall -y modelscope
pip install git+https://github.com/modelscope/modelscope.git@refs/pull/207/head
pyhon app.py

I am getting "module not found" - Python310\Lib\site-packages\torchvision\image.pyd - but I have "tensorflow" and also "torchvision".

Hi, @zedwork
I think the following would work:

git clone https://huggingface.co/spaces/damo-vilab/modelscope-text-to-video-synthesis
cd modelscope-text-to-video-synthesis
pip install -r requirements.txt
pip uninstall -y modelscope
pip install git+https://github.com/modelscope/modelscope.git@refs/pull/207/head
pyhon app.py

I am stuck at the weight-downloading stage. It seems the weights cannot be found and located.
Any thoughts? Thanks a lot

My try on windows :

conda create -n p38 python=3.8
conda activate p38

git clone https://huggingface.co/spaces/damo-vilab/modelscope-text-to-video-synthesis
cd modelscope-text-to-video-synthesis
pip install -r requirements.txt
pip uninstall -y modelscope
pip install git+https://github.com/modelscope/modelscope.git@refs/pull/207/head

mkdir tmp
set TMP=PATH\tmp
set TEMP=PATH\tmp

python
import torch
torch.cuda.is_available()
exit()

=> false

conda install -c anaconda cudatoolkit

=> same

python app.py

Why is the video completely green? The video only contains green frames, without a single object. I also tried the app; same results.

Python 3.8.12 (default, Oct 12 2021, 13:49:34) 
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from modelscope.pipelines import pipeline
2023-03-21 00:53:36,130 - modelscope - INFO - PyTorch version 2.0.0 Found.
2023-03-21 00:53:36,131 - modelscope - INFO - Loading ast index from /home/zongzew/.cache/modelscope/ast_indexer
2023-03-21 00:53:48,750 - modelscope - INFO - Loading done! Current index file version is 1.4.1, with md5 243a9d4bd58cd3f04e5171b395c5fcc3 and a total number of 842 components indexed
No module named 'tensorflow'
>>> from modelscope.outputs import OutputKeys
>>> 
>>> p = pipeline('text-to-video-synthesis', 'damo/text-to-video-synthesis')
2023-03-21 00:54:03,412 - modelscope - INFO - Model revision not specified, use default: master in development mode
2023-03-21 00:54:03,412 - modelscope - INFO - Development mode use revision: master
2023-03-21 00:54:04,015 - modelscope - INFO - initiate model from /home/zongzew/.cache/modelscope/hub/damo/text-to-video-synthesis
2023-03-21 00:54:04,016 - modelscope - INFO - initiate model from location /home/zongzew/.cache/modelscope/hub/damo/text-to-video-synthesis.
2023-03-21 00:54:04,017 - modelscope - INFO - initialize model from /home/zongzew/.cache/modelscope/hub/damo/text-to-video-synthesis
2023-03-21 00:54:28,082 - modelscope - WARNING - No preprocessor field found in cfg.
WARNING:modelscope:No preprocessor field found in cfg.
2023-03-21 00:54:28,082 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
WARNING:modelscope:No val key and type key found in preprocessor domain of configuration.json file.
2023-03-21 00:54:28,082 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': '/home/zongzew/.cache/modelscope/hub/damo/text-to-video-synthesis'}. trying to build by task and model information.
WARNING:modelscope:Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': '/home/zongzew/.cache/modelscope/hub/damo/text-to-video-synthesis'}. trying to build by task and model information.
2023-03-21 00:54:28,082 - modelscope - WARNING - No preprocessor key ('latent-text-to-video-synthesis', 'text-to-video-synthesis') found in PREPROCESSOR_MAP, skip building preprocessor.
WARNING:modelscope:No preprocessor key ('latent-text-to-video-synthesis', 'text-to-video-synthesis') found in PREPROCESSOR_MAP, skip building preprocessor.
>>> test_text = {
...         'text': 'A panda eating bamboo on a rock.',
...     }
>>> output_video_path = p(test_text,)[OutputKeys.OUTPUT_VIDEO]
2023-03-21 00:57:39,186 - modelscope - WARNING - task text-to-video-synthesis input definition is missing
WARNING:modelscope:task text-to-video-synthesis input definition is missing
2023-03-21 00:58:02,970 - modelscope - WARNING - task text-to-video-synthesis output keys are missing
WARNING:modelscope:task text-to-video-synthesis output keys are missing
>>> print('output_video_path:', output_video_path)
output_video_path: /tmp/tmpni6ioloe.mp4

Hi, @zedwork
I think the following would work:

git clone https://huggingface.co/spaces/damo-vilab/modelscope-text-to-video-synthesis
cd modelscope-text-to-video-synthesis
pip install -r requirements.txt
pip uninstall -y modelscope
pip install git+https://github.com/modelscope/modelscope.git@refs/pull/207/head
pyhon app.py

I believe there's a typo at the end, "pyhon app.py". I was getting errors and trying to find where it was XD

Why is the video completely green? The video only contains green frames, without a single object. I also tried the app; same results.

Python 3.8.12 (default, Oct 12 2021, 13:49:34) 
[GCC 7.5.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from modelscope.pipelines import pipeline
2023-03-21 00:53:36,130 - modelscope - INFO - PyTorch version 2.0.0 Found.
2023-03-21 00:53:36,131 - modelscope - INFO - Loading ast index from /home/zongzew/.cache/modelscope/ast_indexer
2023-03-21 00:53:48,750 - modelscope - INFO - Loading done! Current index file version is 1.4.1, with md5 243a9d4bd58cd3f04e5171b395c5fcc3 and a total number of 842 components indexed
No module named 'tensorflow'
>>> from modelscope.outputs import OutputKeys
>>> 
>>> p = pipeline('text-to-video-synthesis', 'damo/text-to-video-synthesis')
2023-03-21 00:54:03,412 - modelscope - INFO - Model revision not specified, use default: master in development mode
2023-03-21 00:54:03,412 - modelscope - INFO - Development mode use revision: master
2023-03-21 00:54:04,015 - modelscope - INFO - initiate model from /home/zongzew/.cache/modelscope/hub/damo/text-to-video-synthesis
2023-03-21 00:54:04,016 - modelscope - INFO - initiate model from location /home/zongzew/.cache/modelscope/hub/damo/text-to-video-synthesis.
2023-03-21 00:54:04,017 - modelscope - INFO - initialize model from /home/zongzew/.cache/modelscope/hub/damo/text-to-video-synthesis
2023-03-21 00:54:28,082 - modelscope - WARNING - No preprocessor field found in cfg.
WARNING:modelscope:No preprocessor field found in cfg.
2023-03-21 00:54:28,082 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
WARNING:modelscope:No val key and type key found in preprocessor domain of configuration.json file.
2023-03-21 00:54:28,082 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': '/home/zongzew/.cache/modelscope/hub/damo/text-to-video-synthesis'}. trying to build by task and model information.
WARNING:modelscope:Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': '/home/zongzew/.cache/modelscope/hub/damo/text-to-video-synthesis'}. trying to build by task and model information.
2023-03-21 00:54:28,082 - modelscope - WARNING - No preprocessor key ('latent-text-to-video-synthesis', 'text-to-video-synthesis') found in PREPROCESSOR_MAP, skip building preprocessor.
WARNING:modelscope:No preprocessor key ('latent-text-to-video-synthesis', 'text-to-video-synthesis') found in PREPROCESSOR_MAP, skip building preprocessor.
>>> test_text = {
...         'text': 'A panda eating bamboo on a rock.',
...     }
>>> output_video_path = p(test_text,)[OutputKeys.OUTPUT_VIDEO]
2023-03-21 00:57:39,186 - modelscope - WARNING - task text-to-video-synthesis input definition is missing
WARNING:modelscope:task text-to-video-synthesis input definition is missing
2023-03-21 00:58:02,970 - modelscope - WARNING - task text-to-video-synthesis output keys are missing
WARNING:modelscope:task text-to-video-synthesis output keys are missing
>>> print('output_video_path:', output_video_path)
output_video_path: /tmp/tmpni6ioloe.mp4

I solved this problem by installing VLC player.

@zedwork

I believe there's a typo at the end, "pyhon app.py". I was getting errors and trying to find where it was XD

Ohhh, sorry!! I'll update the comment. Thanks for pointing it out!

@hysts @zedwork Am I missing something in the browser? The app launches fine and I see videos in /tmp, but the browser GUI does not have playback. Tried Firefox Dev, Firefox, and Chromium. Running Ubuntu 22; conda env with Python 3.8.

Hi, @DrewDobson87
Hmm, weird. I have no idea why it doesn't work. Is the video in /tmp playable or is it corrupted? Also, can you share the screenshot of the GUI?

@hysts the videos are all playable. Copying the file path of the video shows "http://127.0.0.1:7860/file=/tmp/93d060d6b6ea0bc31ad9bb59d49811a7ad5d17e1/tmpe4cee4kv.mp4"
[screenshot: Gradio.png]

Output from starting the app and generating videos is:
(text2video) steven@HOSTNAME
:~/Projects/modelscope-text-to-video-synthesis$ python app.py
2023-03-20 21:44:28,382 - modelscope - INFO - PyTorch version 2.0.0 Found.
2023-03-20 21:44:28,383 - modelscope - INFO - TensorFlow version 2.11.1 Found.
2023-03-20 21:44:28,383 - modelscope - INFO - Loading ast index from /home/steven/.cache/modelscope/ast_indexer
2023-03-20 21:44:28,400 - modelscope - INFO - Loading done! Current index file version is 1.4.1, with md5 981074e49928ad74d8c80335c73fc01f and a total number of 842 components indexed
2023-03-20 21:44:28.623178: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-03-20 21:44:28.710041: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2023-03-20 21:44:29.141285: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2023-03-20 21:44:29.141348: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2023-03-20 21:44:29.141354: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2023-03-20 21:44:29,544 - modelscope - INFO - initiate model from weights
2023-03-20 21:44:29,544 - modelscope - INFO - initiate model from location weights.
2023-03-20 21:44:29,545 - modelscope - INFO - initialize model from weights
2023-03-20 21:44:54,031 - modelscope - WARNING - No preprocessor field found in cfg.
WARNING:modelscope:No preprocessor field found in cfg.
2023-03-20 21:44:54,031 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
WARNING:modelscope:No val key and type key found in preprocessor domain of configuration.json file.
2023-03-20 21:44:54,031 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'weights'}. trying to build by task and model information.
WARNING:modelscope:Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': 'weights'}. trying to build by task and model information.
2023-03-20 21:44:54,031 - modelscope - WARNING - No preprocessor key ('latent-text-to-video-synthesis', 'text-to-video-synthesis') found in PREPROCESSOR_MAP, skip building preprocessor.
WARNING:modelscope:No preprocessor key ('latent-text-to-video-synthesis', 'text-to-video-synthesis') found in PREPROCESSOR_MAP, skip building preprocessor.
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
2023-03-20 21:45:04,199 - modelscope - WARNING - task text-to-video-synthesis input definition is missing
WARNING:modelscope:task text-to-video-synthesis input definition is missing
2023-03-20 21:45:24,563 - modelscope - WARNING - task text-to-video-synthesis output keys are missing
WARNING:modelscope:task text-to-video-synthesis output keys are missing
2023-03-20 21:45:47,037 - modelscope - WARNING - task text-to-video-synthesis input definition is missing
WARNING:modelscope:task text-to-video-synthesis input definition is missing
2023-03-20 21:46:06,623 - modelscope - WARNING - task text-to-video-synthesis output keys are missing
WARNING:modelscope:task text-to-video-synthesis output keys are missing

@DrewDobson87 Thanks!
The log I get with this demo is something like this:

cgbfw 2023-03-21T04:40:35.958Z   warnings.warn(
cgbfw 2023-03-21T04:40:51.366Z 2023-03-21 05:40:51,366 - modelscope - WARNING - task text-to-video-synthesis input definition is missing
cgbfw 2023-03-21T04:40:51.366Z WARNING:modelscope:task text-to-video-synthesis input definition is missing
cgbfw 2023-03-21T04:41:14.492Z 2023-03-21 05:41:14,492 - modelscope - WARNING - task text-to-video-synthesis output keys are missing
cgbfw 2023-03-21T04:41:14.492Z WARNING:modelscope:task text-to-video-synthesis output keys are missing
cgbfw 2023-03-21T04:41:14.709Z /home/user/.pyenv/versions/3.8.9/lib/python3.8/site-packages/gradio/components.py:1981: UserWarning: Video does not have browser-compatible container or codec. Converting to mp4

Comparing it to your logs, it seems the video conversion is not working in your environment.
Can you try installing ffmpeg and x264?

sudo apt install ffmpeg x264

Maybe some other packages are required, but I'd try this first.

@hysts Good catch! This is a newer install and I must not have added those yet. The browser UI is working perfectly now!

Having issues getting the models on Windows. The download seems to freeze up in the command console.

Fetching 6 files: 33%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ | 2/6 [01:19<02:39, 39.88s/it]
Downloading VQGAN_autoencoder.pth: 10%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 514M/5.21G [01:18<07:42, 10.2MB/s]
Downloading VQGAN_autoencoder.pth: 10%|β–ˆβ–ˆβ–ˆβ–ˆβ– | 514M/5.21G [01:29<07:42, 10.2MB/s]
Downloading (…)ip_pytorch_model.bin: 11%|β–ˆβ–ˆβ–ˆβ–ˆβ–‹ | 451M/3.94G [01:29<09:36, 6.06MB/s]

Is there a way to just download these manually through the browser?

Hi, @RiftHunter4 You can manually download weights from here.

Thanks. The models are loading, but it looks like I'm getting some errors still.

No module named 'tensorflow'
fixed by running pip install tensorflow

then later...

RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

Doing some googling, it looks like this is usually caused by having a CPU-only version of PyTorch installed.

When manually downloading the weights, at what path should they be inserted in the modelscope python installation?

Hi, @tintwotin
If you are trying to run this Space locally, the Space expects the weights to be stored in a directory named weights. See this.
Also, you can specify the directory you saved the weights like this.
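A quick way to sanity-check where the manually downloaded files landed. The two file names below are ones mentioned elsewhere in this thread, and the model repo contains more files, so treat this as a partial check; `check_weights` is my own helper, not part of the Space.

```python
from pathlib import Path

def check_weights(weights_dir, expected):
    """Map each expected file name to whether it exists under weights_dir."""
    return {name: (Path(weights_dir) / name).is_file() for name in expected}

# Partial list of weight files mentioned in this thread.
for name, ok in check_weights(
        "weights",
        ["text2video_pytorch_model.pth", "VQGAN_autoencoder.pth"]).items():
    print(("OK     " if ok else "MISSING"), name)
```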

Works on my 16GB RTX500, about 15.388GB max GPU memory use.

I always put the videos through ffmpeg because they are weird and don't play on every device.
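For anyone wanting to do the same, a sketch of the kind of re-encode that usually makes MP4s broadly playable. `build_reencode_cmd` is my own helper; H.264 with yuv420p is a common browser-compatible combination, not something the Space itself prescribes.

```python
def build_reencode_cmd(src, dst):
    """ffmpeg arguments for a broadly playable H.264/yuv420p MP4."""
    return [
        "ffmpeg", "-y",             # overwrite output if it exists
        "-i", src,
        "-c:v", "libx264",          # H.264 video, widely supported
        "-pix_fmt", "yuv420p",      # pixel format most players accept
        "-movflags", "+faststart",  # metadata up front for streaming
        dst,
    ]

cmd = build_reencode_cmd("input.mp4", "output.mp4")
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```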

When I try to import app.py
[screenshot: 1.PNG]
it starts to download the models but then gets stuck. Does anyone have an idea of how to fix that?

All the other steps were completed successfully, but not the last one. Note that I'm not a pro at Python or conda, since I don't usually code.

i get this error

File "C:\Users\dacia\modelscope-text-to-video-synthesis\app.py", line 40, in
pipe = pipeline('text-to-video-synthesis', model_dir.as_posix())
File "C:\Users\dacia\miniconda3\lib\site-packages\modelscope\pipelines\builder.py", line 140, in pipeline
return build_pipeline(cfg, task_name=task)
File "C:\Users\dacia\miniconda3\lib\site-packages\modelscope\pipelines\builder.py", line 56, in build_pipeline
return build_from_cfg(
File "C:\Users\dacia\miniconda3\lib\site-packages\modelscope\utils\registry.py", line 215, in build_from_cfg
raise type(e)(f'{obj_cls.name}: {e}')
RuntimeError: TextToVideoSynthesisPipeline: TextToVideoSynthesis: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

Can someone help me?

I was able to get around CUDA not being detected on Windows with:

pip uninstall torch
pip cache purge
conda install pytorch pytorch-cuda=11.7 -c pytorch -c nvidia

And it started the server just fine. After running a prompt, though, I see these and no video file is output:
2023-03-22 14:09:10,723 - modelscope - WARNING - task text-to-video-synthesis input definition is missing
WARNING:modelscope:task text-to-video-synthesis input definition is missing
2023-03-22 14:09:33,764 - modelscope - WARNING - task text-to-video-synthesis output keys are missing
WARNING:modelscope:task text-to-video-synthesis output keys are missing

-----------edit:------------
Nvm, it seems to be working now. I can't remember exactly what I did; I think it might have been:
conda install -c conda-forge ffmpeg

But now on a 3090 it is generating videos in around 20s locally

@RiftHunter4 @Daciansolgen9
If the command didn't install PyTorch correctly in your environment, you can visit the official PyTorch site https://pytorch.org/ to find a command that works for you.

Hi. If you install it locally, is there a way to increase the video length?

Hi, @johnblues Yes. The maximum video length is limited to 32 using the environment variable MAX_NUM_FRAMES in this Space so as not to slow down the demo, but you can change it if you duplicate this Space or install it locally. Note that increasing the number of frames also increases memory usage.

@hysts Thanks. I thought it would only increase the processing time; I hadn't thought it would also eat memory as well.

@hysts Where/in which file are you able to change the environment variable MAX_NUM_FRAMES when you are running it locally?

Hi, @Hypsy
You can do something like this if you are using Ubuntu:

export MAX_NUM_FRAMES=200
python app.py

or

MAX_NUM_FRAMES=200 python app.py

As I don't have Windows environment, I'm not sure if it's correct, but it seems you can do this in Windows:

set MAX_NUM_FRAMES=200
python app.py

BTW, the default value of MAX_NUM_FRAMES is actually 200 (see here), so I don't think you have to manually set the environment variable in most cases.
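For reference, the env-var pattern presumably looks something like this sketch; `max_num_frames` is my own helper name, and the 200 default matches the comment above.

```python
import os

def max_num_frames(env):
    """Use MAX_NUM_FRAMES from the given environment if set, else 200."""
    return int(env.get("MAX_NUM_FRAMES", "200"))

print(max_num_frames(os.environ))
```

So `MAX_NUM_FRAMES=32 python app.py` would cap the slider at 32, and leaving it unset keeps the 200 default.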

All good, I had a week-old script running, so the number-of-frames slider wasn't there yet. Played around with it, but sometimes I'm getting this error. It appears when going above 25 frames. It keeps creating videos, but they are really abstract, like lines etc. Running on a 3090. Any idea how to solve this?

Exception in callback _ProactorBasePipeTransport._call_connection_lost(None)
handle: <Handle _ProactorBasePipeTransport._call_connection_lost(None)>
Traceback (most recent call last):
File "C:\Users...\AppData\Local\Programs\Python\Python310\lib\asyncio\events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "C:\Users...\AppData\Local\Programs\Python\Python310\lib\asyncio\proactor_events.py", line 162, in _call_connection_lost
self._sock.shutdown(socket.SHUT_RDWR)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
