Why are most machine learning models (not frameworks) written in Python, even though almost any programming language can be used for machine learning?

ryujin470@fedia.io · 3 months ago

Why are most machine learning models (not frameworks) written in Python, even though almost any programming language can be used for machine learning?

Gamma@beehaw.org · 3 months ago

It’s a popular language for data scientists

Overspark@piefed.social · 3 months ago

Yeah, this. Python was already popular with the early adopters, and it’s a fairly easy language to learn and use. After that it became a network effect thing: all the best tools were already written in Python, so people kept writing new ones in it.

TehPers@beehaw.org · 3 months ago

Also a lot of data scientists aren’t really programmers. They just learned Python to do data science. Learning a new language for this purpose would be both very difficult and, in their eyes, often unnecessary.

tal@lemmy.today · 3 months ago

I once had dinner with a Stanford professor, years back, who was talking about the fact that he liked teaching in Python because he spent way less time teaching the language and more the higher level stuff that he was actually trying to get across than when he was using C++. Lower barrier to entry for new users. I’d guess that probably in the intervening years, a lot of classes have decided to use it for similar reasons. If you want to teach, I dunno, signal processing and your students maybe don’t have a great handle on the language yet, you want to be spending time on the signal processing stuff, not on language concepts.

Powderhorn@beehaw.org · 3 months ago

This was infuriating to me when I started college as a CS major. I dropped out after Intro because they weren’t giving us anything worth remembering.

tal@lemmy.today · edit-2 3 months ago

My impression from what code I’ve looked at is that little computation is done by the Python code itself, so there’s little by way of gains to be had by trying to use something higher-performance, which eliminates a lot of the reason one would use some other languages.
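A rough way to see this for yourself, as a sketch rather than a proper benchmark: compare a hand-written Python loop against the built-in `sum`, which on CPython is implemented in C. The helper name `python_loop_sum` and the list size are made up for illustration, and the timings will vary by machine. ML libraries push the heavy lifting into compiled code in a similar way, just on a much larger scale.

```python
import timeit


def python_loop_sum(values):
    """Add up the values one by one in interpreted Python code."""
    total = 0
    for v in values:
        total += v
    return total


values = list(range(1_000_000))

# Both approaches compute the same number.
loop_result = python_loop_sum(values)
builtin_result = sum(values)  # on CPython, this loop runs in C

# Time each approach a few times; the numbers depend on the machine.
loop_time = timeit.timeit(lambda: python_loop_sum(values), number=5)
builtin_time = timeit.timeit(lambda: sum(values), number=5)

print(f"pure Python loop: {loop_time:.3f}s")
print(f"built-in sum:     {builtin_time:.3f}s")
```

The Python code here is mostly glue: it builds the data and hands it to something compiled, which is roughly how a typical model script looks too.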

Python’s cross-platform, albeit with a Unix heritage, so it doesn’t create barriers there. It’s already widely-used, a mature language that isn’t going anywhere and with a lot of people who know it.

It’s got an ecosystem for distributing libraries over the network, and there’s a lot of new code going out and being distributed rapidly.

Python isn’t statically typed. Static typing can help write more robust code. If I were writing, say, the next big webserver, I’d want to have that checking. But for code that may often be running internally in a research project (and this is an area with a lot of people doing research), a failure just isn’t that big a deal. So, again, some of the reasons that one might use another language aren’t there.
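To make the typing point concrete, here is a minimal sketch, assuming Python 3.9+ for the `list[float]` annotation. The function `mean` is a made-up example. The interpreter does not enforce the type hint, so the bad call only fails once the offending line actually runs; a separate static checker such as mypy would be expected to flag it before the program is run.

```python
def mean(values: list[float]) -> float:
    """Average a list of numbers. The annotations are hints only."""
    return sum(values) / len(values)


error = None
try:
    # Wrong argument type: a string instead of a list of floats.
    # Python accepts the call and fails inside sum() at runtime.
    mean("123")
except TypeError as exc:
    error = exc

print(f"failed at runtime: {error!r}")
print(mean([1.0, 2.0, 3.0]))
```

In a research script, that kind of crash usually just means fixing the line and re-running, which is the trade-off described above.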

And I imagine that there’s also inertia. Easier to default to use what others would use.

If you have another language in mind, you might mention that, and we can see if there are more specific reasons. I could come up with meatier plausible guesses if what you were wondering is something like “why isn’t everyone using Smalltalk?” or something.

ryannathans@aussie.zone · 3 months ago

YouTube does alright being written in Python, and, being strongly typed, Python is leaps and bounds beyond the JavaScript that I am now seeing on the server side 🙃

unique_hemp@discuss.tchncs.de · 3 months ago

No sane person is putting plain JS on the server nowadays, it’s TS by far most of the time.

ryannathans@aussie.zone · 3 months ago

Unfortunately I have to deal with the insane

CapedStanker@beehaw.org · edit-2 3 months ago

That’s mainly what all of the researchers who turned the papers into a functioning Jupyter or Colab notebook knew how to do, and at the time Python was the main language used in notebooks (I think it still is). So if you wanted to share it widely, you had to do it in Python, because you knew the people who were going to use and improve it also used notebooks heavily. Also, they want it to be as reproducible as possible for the peer review process, so they want to use a widely known language, and I believe Python was like #1 or #2 in the world at that time (2017/2018).

Then, as an added bonus, us programmers who somehow found our way into those Slack/Discord channels knew enough Python to help them out when they needed it here and there. This was essentially before OpenAI or anything like that existed, particularly in its current iteration.

From there, web languages were added as a wrapper, so it’s easier to use for everyone than a notebook, where you had to click through each cell and maybe debug something now and then to get it to work right.

Onno (VK6FLAB)@lemmy.radio · edit-2 3 months ago

Because they’re all copying each other’s homework?

Zaktor@sopuli.xyz · 3 months ago

This, but without the implication it’s cheating. As someone who’s both a software engineer and trains ML models, choosing a language that’s commonly used for the general task area you’re tackling (ML or not) is very useful. If it’s popular for the task area you’ll have a lot of references for how to solve problems, you can find and use libraries designed and demonstrated for similar tasks, and yes, you can cut and paste code snippets.

Almost every language is capable of doing anything, and software engineers regularly use multiple languages in the course of their work. Libraries and support are a big deal in deciding which to use, and will often be more important than your personal language familiarity/preference.

CapedStanker@beehaw.org · 3 months ago

We were specifically taught in school to not write something that’s already been written. We all build upon each other’s work, literally going back thousands of years when you consider the importance of the math that underpins all of it.

jarfil@beehaw.org · 3 months ago

Strictly speaking, math gets proven from scratch by every math student. Software is slightly different, since most of it never gets a formal proof at all.

CapedStanker@beehaw.org · 3 months ago

Sure, but it works, and that’s what we build upon. And then people build upon that. If we really wanted, we could say a simple loop is building upon the work that humans did when we simply invented/discovered counting.