without you ever knowing about it, as well as (perhaps) swapping in a cheaper-to-operate model some percentage of the time, perhaps as request loads peak, hoping you’ll just roll the dice and try again.
without you ever knowing about it, as well as (perhaps) swapping in a cheaper-to-operate model some percentage of the time, perhaps as request loads peak, hoping you’ll just roll the dice and try again.
GPU limiting and a general lack of either knowledge or wanting to put in the effort to do it themselves. Even just going into Github in the first place is enough of a barrier for a lot of people, unfortunately
Yep, I’ve a mobile 3070 in my laptop, and whilst I feasibly could run some of the smallest models around, paying on a per-use basis gets me way better quality results for relatively cheaply.
Besides, running it locally isn’t free either. Your hardware deprecates and depreciates over time, in addition to non-negligible power costs.