• Infernal_pizza@lemmy.dbzer0.com
    10 hours ago

    The company’s website claims that its system has a 99.98% accuracy rate

    99.98% accuracy of the people it flags as shoplifters or 99.98% accuracy overall? And if the latter then what proportion of the population are even shoplifters? Could you achieve similar levels of “accuracy” by saying nobody is a shoplifter? Maybe throw in a few positives here and there to make it look like your product does something other than harass the public?

    • starshipwinepineapple@programming.dev
      3 hours ago

      You’re right to question this.

      In machine learning, accuracy means the percentage of all classifications that are correct. There are some other terms, like:

      • Precision, which is the number of correctly identified positives divided by the total number of positive classifications. A high precision score would mean that, of everyone who was flagged as a match, relatively few were not actual shoplifters.
      • Recall (true positive rate), which is the number of correctly identified positives divided by the number of actual positives. A high recall score means you caught most of the actual shoplifters (few false negatives), but tuning a system for recall usually comes at the cost of more false positives.

      So when classifying shoplifters, ideally you would focus on precision, since false positives are undesired. A company that cares less about false positives than about catching shoplifters would focus on recall instead. In either case, accuracy is a poor metric to use or advertise for an imbalanced data set like shoplifting: most customers are not shoplifters, so even a model that never classified anyone as a shoplifter could still be 99+% accurate.
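
      To make that last point concrete, here is a rough sketch of a "classifier" that flags nobody. The numbers are invented purely for illustration (10,000 shoppers, 2 of them actual shoplifters) and are not from the company or the article; the function names are hypothetical too.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives from two lists of booleans."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # shoplifter, flagged
    fp = sum(1 for t, p in pairs if not t and p)      # innocent, flagged
    fn = sum(1 for t, p in pairs if t and not p)      # shoplifter, missed
    tn = sum(1 for t, p in pairs if not t and not p)  # innocent, not flagged
    return tp, fp, fn, tn


def metrics(y_true, y_pred):
    """Return accuracy, precision, and recall; None where a ratio is undefined."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Precision is undefined if nobody was flagged (division by zero).
    precision = tp / (tp + fp) if (tp + fp) else None
    # Recall is undefined if there are no actual positives.
    recall = tp / (tp + fn) if (tp + fn) else None
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical, made-up population: 2 shoplifters among 10,000 shoppers.
y_true = [True] * 2 + [False] * 9998

# A "model" that never flags anyone.
flag_nobody = [False] * 10000

result = metrics(y_true, flag_nobody)
print(result)
```

      With these made-up numbers the do-nothing model scores an accuracy of 0.9998 while its recall is 0.0 and its precision is undefined, because it never flags anyone. That is why a headline accuracy figure says very little without the base rate and the precision/recall numbers next to it.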