koukyo1994
committed on
update README
README.md
CHANGED
@@ -48,7 +48,7 @@ For more technical details and discussions, please refer to:
 
 We have verified the execution on a machine equipped with a single NVIDIA H100 80GB GPU. However, we believe it should be possible to run the model on any machine equipped with an NVIDIA GPU with 16GB or more of VRAM.
 
-Terra consists of an Image Tokenizer, an Autoregressive Transformer, and a Video Refiner. Due to the complexity of setting up the Video Refiner,
+Terra consists of an Image Tokenizer, an Autoregressive Transformer, and a Video Refiner. Due to the complexity of setting up the Video Refiner, we have not included its implementation in this Hugging Face repository. Instead, **the implementation and setup instructions for the Video Refiner are provided in the [ACT-Bench repository](https://github.com/turingmotors/ACT-Bench)**. Here, we provide an example of generating video continuations using the Image Tokenizer and the Autoregressive Transformer, conditioned on image frames and a template trajectory. The resulting video quality might seem suboptimal as each frame is decoded individually. To improve the visual quality, you can use the Video Refiner.
 
 ### Install Packages
 
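
As a side note on the hardware requirement quoted in the context lines of this hunk (an NVIDIA GPU with 16GB or more of VRAM), the following is a rough sketch of how one might check a machine before running the model. It is not part of the Terra repository. The function names, the 16 GiB threshold constant, and the 1 GiB slack (added because cards marketed as 16GB often report slightly less total memory) are all assumptions made for illustration. It relies only on `nvidia-smi` being on the `PATH`.

```python
import shutil
import subprocess
from typing import List

# Threshold taken from the README text; the slack value is an assumption,
# since nominal "16GB" cards often report somewhat less than 16384 MiB.
REQUIRED_GIB = 16
SLACK_MIB = 1024


def meets_vram_requirement(total_mib: int,
                           required_gib: int = REQUIRED_GIB,
                           slack_mib: int = SLACK_MIB) -> bool:
    """Return True if the reported total memory is within slack of the requirement."""
    return total_mib >= required_gib * 1024 - slack_mib


def query_gpu_memory_mib() -> List[int]:
    """Return total memory in MiB for each GPU, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    # Each output line holds one integer (MiB) per GPU.
    return [int(line) for line in result.stdout.splitlines()
            if line.strip().isdigit()]


if __name__ == "__main__":
    gpus = query_gpu_memory_mib()
    if not gpus:
        print("No NVIDIA GPU detected (or nvidia-smi is not available).")
    for index, total in enumerate(gpus):
        verdict = "OK" if meets_vram_requirement(total) else "below 16GB"
        print(f"GPU {index}: {total} MiB ({verdict})")
```

Passing this check only confirms the total memory the card reports. It does not guarantee that the model fits alongside other processes already using the GPU.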