roon@lemmy.ml to 196@lemmy.blahaj.zone · English · 4 months ago · "The Rule" (lemmy.ml) · 63 comments
AdrianTheFrog@lemmy.world · English · 4 months ago: Yes, but 200 GB is probably already with 4-bit quantization; the weights in fp16 would be more like 800 GB. I don't know if it's even possible to quantize further, and if it is, you're probably better off going with a smaller model anyway.
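The 200 GB vs. 800 GB figures in the comment follow from weight size scaling roughly linearly with bits per parameter. A minimal back-of-envelope sketch (the function name and the assumption that size scales exactly linearly are illustrative, not from the thread):

```python
# Back-of-envelope: model weight size scales roughly linearly with bits
# per parameter, so fp16 (16-bit) weights are ~4x the 4-bit quantized size.
def weight_size_gb(size_at_4bit_gb: float, bits: int) -> float:
    """Scale a 4-bit quantized weight size to another precision."""
    return size_at_4bit_gb * bits / 4

q4_size = 200.0                          # ~200 GB at 4-bit quantization
fp16_size = weight_size_gb(q4_size, 16)  # 16 bits -> 4x larger, ~800 GB
print(fp16_size)
```

The same formula shows why going below 4 bits saves comparatively little: 2-bit weights would still be ~100 GB, while quality typically degrades sharply, which is why a smaller model is usually the better trade.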