Llama on RTX 3090: weirdly, inference seems to speed up over time.