NVIDIA H100 ComfyUI setup
preface
Yesterday I wrote about my AI slop addiction. For now, I've at least stopped the long nights of image and video prompting. This is a post I was working on while still at it; I thought I'd publish it anyway, since it contains some useful stuff for reference.
NVIDIA H100 ComfyUI setup
I couldn't resist and wanted to know what it's like to use an NVIDIA H100 to generate AI image and video slop. You can get one for ~$2/hr (yes, per hour) on Hyperstack, for example.
I set up a VM with an 80GB VRAM H100 and an additional 500GB SSD volume to persist the setup, all via the Hyperstack UI. I also created an entry for an SSH key to be able to access the VM. My ssh and terminal mojo is quite rusty, so I used Claude to help me out a bit.
Hyperstack has some docs on how to access your VM via SSH, in our case this looks like:
```bash
ssh ubuntu@<vm's_public_ip> -i ~/.ssh/<filename>
```
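If you don't have a key pair for this yet, generating one locally is quick. A minimal sketch, where the file name is just an example I made up:

```bash
# Generate a new ed25519 key pair (the file name is an arbitrary example)
ssh-keygen -t ed25519 -f ~/.ssh/hyperstack_h100 -C "hyperstack-h100"

# Print the public key and paste it into the Hyperstack UI
cat ~/.ssh/hyperstack_h100.pub
```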
To be able to use the additional volume, I needed to format and mount it. Note that `vdc` is the name of the volume, which I found out by running `lsblk`.
```bash
# Format
sudo mkfs -t ext4 /dev/vdc

# Create a mount point
sudo mkdir /mnt/comfy-ui-data

# Mount it
sudo mount /dev/vdc /mnt/comfy-ui-data

# Tweak permissions, change ownership
sudo chown -R ubuntu:ubuntu /mnt/comfy-ui-data

# Give read/write permissions
sudo chmod -R 755 /mnt/comfy-ui-data
```
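I only mounted the volume manually, so the mount won't survive a reboot. If you want it to, an `/etc/fstab` entry takes care of that; a sketch I didn't use in my original setup:

```bash
# Mount /dev/vdc automatically on boot (nofail avoids hanging the boot if the volume is missing)
echo '/dev/vdc /mnt/comfy-ui-data ext4 defaults,nofail 0 2' | sudo tee -a /etc/fstab

# Verify the entry without rebooting
sudo mount -a
```

Using the UUID reported by `blkid` instead of `/dev/vdc` would be more robust if device names ever change.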
Next we’ll set up ComfyUI. I mostly followed the instructions on how to install it on Linux.
```bash
# clone repository
cd /mnt/comfy-ui-data
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI

# install torch related stuff for NVIDIA
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu121

# install dependencies
pip install -r requirements.txt

# goto ComfyUI/custom_nodes and install ComfyUI manager
cd custom_nodes
git clone https://github.com/ltdrdata/ComfyUI-Manager comfyui-manager
```
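I installed the Python packages straight into the system environment, which worked fine on this throwaway VM. If you prefer isolation, a virtual environment is a reasonable alternative; a sketch, assuming `python3-venv` is available:

```bash
# Create and activate a virtual environment inside the ComfyUI checkout
cd /mnt/comfy-ui-data/ComfyUI
python3 -m venv venv
source venv/bin/activate

# Install the same dependencies into the venv
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
```

If you go this route, remember to activate the venv again before starting ComfyUI later.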
The next section is about downloading models from Hugging Face into their respective directories. These are just reference commands, in no particular order.
```bash
# Place in ComfyUI/models/diffusion_models
curl -L "https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/diffusion_models/hunyuan_video_t2v_720p_bf16.safetensors?download=true" --output hunyuan_video_t2v_720p_bf16.safetensors
curl -L "https://huggingface.co/Kijai/HunyuanVideo_comfy/resolve/main/hunyuan_video_FastVideo_720_fp8_e4m3fn.safetensors?download=true" --output hunyuan_video_FastVideo_720_fp8_e4m3fn.safetensors

# Place in ComfyUI/models/text_encoders
curl -L "https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/text_encoders/clip_l.safetensors?download=true" --output clip_l.safetensors
curl -L "https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/text_encoders/llava_llama3_fp8_scaled.safetensors?download=true" --output llava_llama3_fp8_scaled.safetensors
curl -L "https://huggingface.co/zer0int/LongCLIP-SAE-ViT-L-14/resolve/main/Long-ViT-L-14-GmP-SAE-TE-only.safetensors" --output Long-ViT-L-14-GmP-SAE-TE-only.safetensors

# Place in ComfyUI/models/vae
curl -L "https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/vae/hunyuan_video_vae_bf16.safetensors?download=true" --output hunyuan_video_vae_bf16.safetensors

# Place in ComfyUI/models/clip_vision
curl -L "https://huggingface.co/openai/clip-vit-large-patch14/resolve/main/model.safetensors?download=true" --output clip-vit-large-patch14_OPENAI.safetensors

# Place in ComfyUI/models/loras
curl -L "https://huggingface.co/Kijai/HunyuanVideo_comfy/resolve/main/hyvideo_FastVideo_LoRA-fp8.safetensors" --output hyvideo_FastVideo_LoRA-fp8.safetensors

# Place in ComfyUI/models/unet
curl -L "https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/diffusion_models/hunyuan_video_t2v_720p_bf16.safetensors" --output hunyuan_video_t2v_720p_bf16.safetensors
```
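Since all these downloads follow the same pattern, a small loop saves some copy-pasting. This is just a sketch of how I'd do it; the models directory and the URL list are assumptions you'd adapt to your setup:

```bash
# Download "<models-subdir> <url>" pairs into the matching folders
MODELS_DIR=/mnt/comfy-ui-data/ComfyUI/models

while read -r subdir url; do
  mkdir -p "$MODELS_DIR/$subdir"
  # Strip the ?download=true query string to get a clean local file name
  fname=$(basename "${url%%\?*}")
  curl -L "$url" --output "$MODELS_DIR/$subdir/$fname"
done <<'EOF'
vae https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/vae/hunyuan_video_vae_bf16.safetensors?download=true
text_encoders https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/resolve/main/split_files/text_encoders/clip_l.safetensors?download=true
EOF
```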
Downloading models from Civitai requires an API key; see the docs here: https://education.civitai.com/civitais-guide-to-downloading-via-api/
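From what I remember of that guide, you append your API token to the download URL of the model version you want. Treat this as a rough sketch and double-check the docs, since the exact parameters are from memory:

```bash
# <modelVersionId> and <your_api_key> are placeholders
curl -L "https://civitai.com/api/download/models/<modelVersionId>?token=<your_api_key>" --output some_model.safetensors
```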
Once all models are downloaded (this can take a while), it’s time to start up ComfyUI.
```bash
cd /mnt/comfy-ui-data/ComfyUI
python3 main.py
```
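Started like this, ComfyUI dies together with your SSH session. I'd run it inside tmux so it survives a dropped connection; a sketch, assuming tmux is installed on the VM (`sudo apt install tmux` if not):

```bash
# Start a named tmux session and launch ComfyUI inside it
tmux new -s comfy
cd /mnt/comfy-ui-data/ComfyUI
python3 main.py

# Detach with Ctrl-b d; reattach later with
tmux attach -t comfy
```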
ComfyUI runs a web server. To access the UI, create a tunnel to its exposed port from your local machine like this:
```bash
ssh -L 8188:localhost:8188 ubuntu@<ip-address> -i ~/.ssh/<filename>
```
The `-L` flag creates the tunnel: the first `8188` is the local port on your machine you want to map to, and `localhost:8188` is the destination host and port on the remote VM. If you want to run the tunnel in the background (so you can use the same terminal for other commands), add the `-f` and `-N` flags: `-f` sends ssh to the background, `-N` skips running a remote command.
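To avoid retyping the key path and the port forwarding every time, the same setup can live in `~/.ssh/config` on your local machine. This is just a sketch; the host alias `comfy-h100` is something I made up:

```
# ~/.ssh/config on your local machine
Host comfy-h100
    HostName <ip-address>
    User ubuntu
    IdentityFile ~/.ssh/<filename>
    LocalForward 8188 localhost:8188
```

With that in place, `ssh -N comfy-h100` opens the tunnel.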
You can now access ComfyUI on your local machine’s browser! Have fun creating some slop!
To download slop from the VM to my local machine I used `rsync`:
```bash
# Create local output directory if needed
mkdir -p ~/ComfyUI-output

# Run rsync to download just new data from ComfyUI's output directory
rsync -avzP -e "ssh -i ~/.ssh/<filename>" ubuntu@<ip-address>:/mnt/comfy-ui-data/ComfyUI/output/ ~/ComfyUI-output/
```
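If a long batch is running on the VM and you want to pull results as they appear, you can just rerun that rsync in a loop; a sketch, with an arbitrary 5-minute interval:

```bash
# Pull new outputs every 5 minutes until you Ctrl-C
while true; do
  rsync -avzP -e "ssh -i ~/.ssh/<filename>" ubuntu@<ip-address>:/mnt/comfy-ui-data/ComfyUI/output/ ~/ComfyUI-output/
  sleep 300
done
```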
I already had some LoRAs on my local machine that I wanted to upload to the VM. This is a bit slower because it's limited by your internet connection's upload speed, but LoRAs are usually way smaller than checkpoint models.
```bash
# Create local LORA directory if needed and move the LORAs you want to upload in there
mkdir -p ~/ComfyUI-loras

# Run rsync this time the other way around to sync new LORAs to the VM
rsync -avz -e "ssh -i ~/.ssh/<filename>" ~/ComfyUI-loras/ ubuntu@<ip-address>:/mnt/comfy-ui-data/ComfyUI/models/loras/
```
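rsync's dry-run mode (`-n`) is handy to double-check what would be uploaded before actually doing it:

```bash
# Preview the transfer without copying anything (-n = --dry-run)
rsync -avzn -e "ssh -i ~/.ssh/<filename>" ~/ComfyUI-loras/ ubuntu@<ip-address>:/mnt/comfy-ui-data/ComfyUI/models/loras/
```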
I played around with `txt2vid` generation with Hunyuan; here are the docs to get a basic workflow running in ComfyUI: https://blog.comfy.org/p/hunyuanvideo-native-support-in-comfyui
Hunyuan can also be combined with LoRAs; here's a workflow doing that: https://civitai.com/models/1081086?modelVersionId=1244929 (watch out, CivitAI contains a lot of NSFW stuff)
An interesting finding was that Hunyuan creates perfect looping videos if you set the length to 201 frames!
Hunyuan prompting is quite a rabbit hole. You might get a perfect cinematic video with narrow depth of field for something as simple as the typical `a cat walking on grass`, but as soon as you try something more elaborate it starts to spit out just ugly shit.
For comparison, the `img2vid` workflow I played with, which used A1111 with Stable Diffusion to create a starting image and then fed that into Kling Video Pro v1.6, got me consistent results I never came close to with Hunyuan's `txt2vid`. There are some early-stage `img2vid` workflows for Hunyuan (https://github.com/AeroScripts/leapfusion-hunyuan-image2video), but IMHO they're not in the same league as Kling Video. Tencent says they're working on official `img2vid` support and it will come soon.
epilog
That’s it for now. Pro tip: Don’t forget to hibernate your H100 VM once you’re done for the day, otherwise it gets expensive quickly!
As mentioned earlier, I've stopped this for now because I ended up spending way too much time just coming up with stupid slop. There are still some worthwhile learnings here: I finally got going with an H100 in the cloud, it's an amazing piece of hardware, and it's pretty nice how simple it is to use these days. It was also interesting to see how ComfyUI works. I might get back to it for some Super 8 video restoration.
I'll leave you with a Super 8 still frame of me eating pasta in the late 1970s. It was captured using a KODAK Reels Super 8 film digitizer. By default it delivers somewhat shitty quality, but there's a community-contributed firmware out there that improves things a bit. I then used Final Cut Pro with the Neat Video plugin to remove dust and scratches, followed by another pass through Topaz Video AI for some upscaling and frame rate conversion.
