• 0 Posts
  • 10 Comments
Joined 2 years ago
Cake day: August 8th, 2023

  • Did anybody bother to look at the numbers?

    I checked the stats for the last 4 years here and it looks really strange. Statistics isn’t my thing… But it looks like it’s wise to be cautious and not to fully trust the numbers.

    Around the beginning of last year there was a huge dip in the Windows market share that seemed to correlate with a peak in “unknown”. Windows then caught up in a somewhat erratic way.

    Mac OS also shows weird behavior. It starts at 16%, goes up to 21% and then down to 14% between October and November…

    It’s not likely that a huge number of people decided to buy a Mac and then trash it one month later. The same, but in the opposite direction, goes for the Windows stats.

    I think the uncertainty in these numbers looks larger than the total market share Linux is shown to have…

    Not saying that Linux isn’t increasing its desktop market share. Just saying that the numbers seem to have quite a large margin of error, so be cautious when referring to them.



  • I recommend you shrink the Windows partition on the internal drive and install Linux in the freed space. The extra disk you have can be used as an extra disk, or you can create mount points on it for /home and other directories.

    Microsoft does not recognize other operating systems as “equals” (WSL is not Linux being weak; it’s making Linux a puppet controlled by Windows), so they design everything in Windows as if it were the only OS in the world. Keeping Windows will therefore often require some extra acrobatics from you.



  • How good is “good”, would you say?

    We got pretty good results, with CER at 4% and WER at 15%!

    This was on a limited dataset used to test and train, which most likely means that if you tested on an even larger dataset with greater variation in handwriting style, the numbers might be even worse.

    Very simplified: a risk of one wrong character in every 25 characters and one wrong word in every 7 words. The SER was around 20%.

    There’s a reason why no one has released a good model for western letters yet and why companies pay up to 1€ for capturing data from 10 handwritten pages.

    It will come but OCR isn’t as sexy as developing text2image solutions.
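
For anyone who wants to check figures like these on their own data, here is a minimal Python sketch of how CER and WER are commonly computed: edit distance between the reference text and the OCR output, divided by the reference length. The function names are my own and the normalization is an assumption; this is not necessarily how the numbers above were produced. The last helper just turns a rate into "one error every N units", so 4% works out to roughly one in 25 and 15% to roughly one in 7.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits divided by reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits divided by number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def one_error_every(rate):
    """Convert an error rate (0.04 for 4%) into 'one error every N units'."""
    return 1 / rate


if __name__ == "__main__":
    # Illustrative strings only, not real OCR output.
    print(cer("hello world", "hallo world"))
    print(wer("hello world", "hallo world"))
    print(one_error_every(0.04), one_error_every(0.15))
```

Note that a single wrong character makes the whole word wrong, which is why WER is normally much higher than CER.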


  • To train an AI to recognize handwriting you need a huge dataset of handwriting examples. That means millions of samples of handwritten text, plus information about what the written text says in every example.

    This is why the best engines only exist as a service in the cloud. The OCR engines you can install locally that are acceptable, but far from perfect, are commercial. Parascript FormXtra is one of the better commercial ones.

    The only OCR engine that’s free and really good is Tesseract OCR, but it doesn’t handle handwritten text.


  • I replied to a statement about HEIF being an Apple image format. It is not.

    Furthermore, HEIF is something that most major mobile device vendors support. Some, like Samsung, even set it as the default on some of their devices. So the whole “Apple always supporting non-open standards” argument is just tiresome at this point.

    99.999% of all Android users are de facto locked in by Google. Yes, Android might be open, but Play services are not. Google works hard to lock in Android users.

    At least Apple are open and honest about locking in iOS users.


  • No.

    It’s a container for image data developed by the Moving Picture Experts Group (“MPEG”, try to guess what else they have created).

    While there are some compatibility issues between vendors, HEIC still offers a greater set of features compared to, for example, JFIF (you probably know it as JPEG/JPG).

    Apple was one of the early adopters (2017) and (as usual?) the industry has followed. Microsoft wants money for the codec in Windows and that’s probably one of the reasons why it’s not commonly used…yet.