I wanted to play around with the new Mistral model, only to realize that the default script didn’t run on my 6-GPU setup.
So here you go: a simple script that works (at least for now).
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch, os, time
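# expandable_segments lets the CUDA caching allocator grow blocks in place instead of
# fragmenting memory, which helps avoid spurious OOMs when loading a large model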
os.environ["PYTORCH_CUDA_ALLOC_CONF"]="expandable_segments:True"
os.environ["PYTORCH_C…