• Coolcoder360@lemmy.world
    6 hours ago

    I went with quantized Gemma

    Well, was it quantized in a format that the iPhone 16 supports?

    Often it’s the quantization where things break down: the hardware has to support the quantization format, and you can’t run FP16 weights on int8-only hardware. And sometimes the act of quantizing itself causes problems, since rounding weights down to fewer bits loses precision.
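    A minimal sketch of the second problem, assuming simple symmetric per-tensor int8 quantization (a common scheme, not anything specific to the iPhone 16 or Gemma): rounding FP16 weights to int8 and dequantizing them back never recovers the original values exactly, and that round-trip error is the kind of degradation quantization introduces.

    ```python
    import numpy as np

    # Toy weight tensor in FP16, as a model checkpoint might store it.
    rng = np.random.default_rng(0)
    w = rng.normal(0, 0.02, size=1024).astype(np.float16)

    # Symmetric per-tensor int8 quantization: map the max-abs weight to 127.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)

    # Dequantize: this is effectively what int8 hardware computes with.
    w_deq = q.astype(np.float32) * scale

    # The round-trip error is nonzero but bounded by one quantization step.
    err = np.abs(w.astype(np.float32) - w_deq).max()
    print(f"max round-trip error: {err:.6f} (one int8 step = {scale:.6f})")
    ```

    For a well-behaved weight distribution this error is small, but outlier weights inflate the scale and squash everything else into a handful of int8 levels, which is one way quantized models end up noticeably worse.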

    And yeah, LLMs are likely going to be very hit or miss anyway.