I’m calling it now: the adoption of AI agents into software development will be one of the most costly mistakes in the field’s history. Agents cannot program, and it’s taking longer and longer to realize that they can’t. They are a highly sophisticated statistical model designed to mimic the distribution of programming. The output is broken, but in a way that’s getting harder and harder to detect. Which is exactly what you’d expect from an increasingly accurate statistical model.

  • GreenKnight23@lemmy.world · 2 upvotes · 1 hour ago

    remember, when you interview for a job and they ask you, “do you have any questions?”, you ask:

    • has AI ever been used to develop your product?
    • what percentage of your product has been written by agentic AI?
    • is the use of AI tracked as a performance indicator?
  • Avicenna@programming.dev · 14 upvotes · edited · 1 hour ago

    They are not the automated from 0 to 100 coders that some people claim them to be. But they are quite capable, definitely much more capable than what anyone could have imagined ten years ago. Given well-defined problems they can excel at even relatively complex tasks. I pointed Claude at a LaTeX file of a somewhat complicated nonparametric statistical estimate calculation to look for any mistakes and it was actually able to find some. I then pointed it at the code that replicates the calculations and it was also able to correctly identify some issues with the code. I think this is the way one should use LLMs, rather than letting them loose on coding tasks. Used the former way, you won’t even be able to burn through your first-tier account quota, whereas in the latter the LLM will likely end up getting into weird loops, burning tokens like there is no tomorrow. This saner way of using LLMs is also much more suitable for open local LLMs. I don’t think there is any doubt anymore that LLMs can be very useful tools, not just for doing stuff but for learning it too. People should move past the stage of invalid criticisms like “they are just stochastic parrots” and move to more serious matters like environmental impact, greedy fucking CEOs pretending LLMs are replacements for humans, degradation of skills, getting lazy at checking AI code, the ethics of capitalizing on collective human knowledge, and the unsustainable AI bubble that tech companies are pushing for.

    • Log in | Sign up@lemmy.world · 9 upvotes · 8 hours ago

      invalid criticisms like “they are just stochastic parrots”

      That’s not a criticism per se, it’s a description of how they work.

      • Tiresia@slrpnk.net · 2 upvotes · 2 hours ago

        Sure, and at that level of accuracy it’s also a description of how humans work. I didn’t invent these words myself, I’m just stringing them together based on a stochastic process my brain was trained into.

        Like LLMs, some of my speech is semi-random initialization (dada wawa googoo), some of that is mimicry (some of that is mimicry), some of that is reinforcement learning (downvotes incoming), and some of that is the output of a subprocess that uses the same systems prompted at the meta-level and without verbalization (maybe they won’t get the analogy between thinking and LLM scratchpads… how about I use this space to clarify).

        Calling an LLM a stochastic parrot has the same social-emotional role as calling a human an animal. Yes, it is correct. But people can infer the connotation.

  • NigelFrobisher@aussie.zone · 9 upvotes · 15 hours ago

    This is very obvious unless you are in tech leadership, in which case your job is now to push this at all costs and suppress dissenting voices.

    • blargh513@sh.itjust.works · English · 6 upvotes · 14 hours ago

      In tech leadership. I don’t have to push it. My talented engineers took to it immediately.

      They learned quickly that it is a tool. Instead of using a shovel and a wheelbarrow, they have a backhoe now. If you don’t know how to dig a hole, the backhoe is just a way to make a mess faster. It doesn’t replace intelligence.

      They can use it to do the scutwork while they focus on the important stuff.

      The duds are still typing shit into spreadsheets and emailing them as attachments while their coworkers are getting stuff done.

      It is a tool. You can learn to use it or you can just be mad that it exists. In either case it isn’t going away. Like the telephone, the car, the computer, the internet, it is here to stay.

      • Zos_Kia@jlai.lu · 1 upvote · 8 minutes ago

        What’s fascinating about this conversation is… how do people think software used to be made? With talented and knowledgeable developers who would never “hallucinate” an API or a library function? With cybersec experts who would never put their users’ data in jeopardy? With performance investigators checking the computational complexity of each function? Bitch please…

        Software engineering is not the kind of mystical cathedral building these people have in mind, it’s more like a musty workshop in Pakistan where they make tractor tires with no safety equipment and a cigarette in their mouth. We’ve been throwing imperfect humans in various states of lucidity at every problem known to man for 30 years but suddenly people start believing that their bog standard CRUD software should be written by monks having attained cosmic godhood.

      • NigelFrobisher@aussie.zone · 6 upvotes · 8 hours ago

        If you’re letting your engineers find uses for it instead of constantly demanding that they generate lengthy “user stories” and decision documents and deferring thinking to agents instead of quickly planning stuff out using their experience then you’re probably quite an outlier by now.

        • blargh513@sh.itjust.works · English · 2 upvotes · 3 hours ago

          I am a very lazy man, micromanagement takes so much fucking energy. Hire talented people, give clear and unambiguous guidance and trust them to do the work. It is amazing how easy management can be when you don’t get in the way.

          Being in leadership is way easier than being an engineer and the pay is better too. Some people really overcomplicate shit.

  • megopie@beehaw.org · English · 3 upvotes · 12 hours ago

    Part of the issue as well is that when they get something completely broken, people just re-roll the output until they get something that’s broken in ways they don’t notice. Or re-roll parts of it, or tell the system to judge whether the output is broken and automatically re-roll the parts it judges to be broken. Or increase the size of the context window to get it closer to that upper limit of accuracy.

    All this together can get a more functional output with less effort, and as people find these tricks it gives them the illusion of an upward trend in capability, as if these are all solvable issues that will improve as time goes on. The big problem with that, though, is that these tricks and methods rapidly explode the compute cost. That’s all fine and dandy when everyone is getting their compute costs for these tools subsidized by the model providers, but eventually they will need to charge the real cost of running this. The compute providers that host the model providers are also running at a loss, trying to help grow the market segment and maximize their market share. And then the places that have the datacenters in them are giving tax breaks and discounted utilities to attract new construction.
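
    The re-roll-and-judge loop described above can be sketched as a toy simulation to show why its cost climbs so quickly. This is only an illustration under loud assumptions: `generate`, `judge`, and the idea of a single "quality score" are hypothetical stand-ins, not any real model API, and each call is assumed to cost the same.

```python
import random

def generate(rng):
    """Hypothetical stand-in for one model call; returns a quality score in [0, 1)."""
    return rng.random()

def judge(output, threshold):
    """Hypothetical automated check: accepts anything at or above the threshold."""
    return output >= threshold

def reroll_until_accepted(rng, threshold, max_attempts=1000):
    """Re-roll until the judge accepts; return (output, number of calls made)."""
    calls = 0
    for _ in range(max_attempts):
        calls += 2  # one generation call plus one judging call
        output = generate(rng)
        if judge(output, threshold):
            return output, calls
    return None, calls  # gave up: nothing passed the judge

def average_calls(threshold, trials=2000, seed=0):
    """Average number of calls spent per accepted output at a given strictness."""
    rng = random.Random(seed)
    total = sum(reroll_until_accepted(rng, threshold)[1] for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    for threshold in (0.5, 0.9, 0.99):
        print(threshold, round(average_calls(threshold), 1))
```

    In this toy model the expected number of calls grows like 2 / (1 - threshold), so a stricter judge multiplies the compute spent per accepted output. Note also that "accepted" only means the judge did not notice a problem, not that the output is correct.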

    Everyone except the people making the chips is selling at a loss, and as people pile on usage to make up for the fundamental limitations of these systems, the demand balloons, validating to the providers at all levels that this is a growing market they should invest more in.

    But eventually… they need to make money. The bill comes due on all the debt and investment. What happens to the people who have fully embraced these to run their businesses? Or to all the people who have built their skill set around using these systems? It’s a crisis, a series of crises, one each time a debt wall gets hit by someone in the supply chain. A half decade of technical debt that just got really expensive to deal with, and not enough experienced people to handle it, since all the greybeards retired and not enough new people got brought in to replace them because the entry-level work was automated.

  • Stefan_S_from_H@piefed.zip · English · 10 upvotes · 17 hours ago

    You know the feeling that you want to rewrite a project? But you know that most rewrites are a bad idea.

    Be it your own, old code. Or code you inherited.

    There is a small chance that the world realizes that they went in the wrong direction and nothing can get fixed. That will be the time of rewrites.

    No, I don’t expect this to be very likely. The agent code will remain, and human programmers get yelled at for not fixing it fast enough.

    • FiniteBanjo@programming.dev · 19 upvotes · 20 hours ago

      Let it be known that the first person to call it was actually Sam Altman when OpenAI’s paper on AI Scaling Laws in 2020 subtly showed that the diminishing returns will stop showing improvement with infinite power, compute time, and data before 94% accuracy is reached.

  • obviouspornalt@fedinsfw.app · English · 16 upvotes · 23 hours ago

    if it’s broken in a way that can’t be detected, is it actually broken?

    all software is broken in some way. if the rate of bugs generated by llm and the severity of those bugs drops below the rate you would expect from a human programming team, then llm is offering something competitive.

    • TrickDacy@lemmy.world · 3 upvotes · 19 hours ago

      broken in a way that can’t be detected

      Is not what anyone said and you’re lying when you pretend they did.

      • FiniteBanjo@programming.dev · 18 upvotes · 20 hours ago

        No, humans make fewer mistakes. Fewer. That’s the key here: statistical models are trained on human data, so by pure logic they can never, ever, under any circumstance, reach 100% accuracy. With the current understanding of LLMs, with a focus on AI Scaling Laws and, more importantly, on natural human language adaptation, they will never reach 94% accuracy even with infinite power and infinite training. That’s what the curve shows us in OpenAI’s 2020 research paper on AI Scaling Laws and in DeepMind’s later paper correcting their math: the diminishing returns will hit a limit far before convergence.

        In addition to that, the AI also cannot detect subtle changes to established problems or any new unaccounted for variables, because they’re a statistical model and not capable of actual thought. They also lack any sense of responsibility for their actions for the same reason.

        You fucking sloppers always try to say “HuMAnS mAkE misTAKeS, TOO!” Yeah and the fucking slopbots are trained on those mistakes and make them again but worse.

      • 42firehawk@fedinsfw.app · English · 13 upvotes · 23 hours ago

        But you’re forgetting the key difference that makes it so much worse - we can fix human mistakes especially if we can talk to the human to figure out how. With an llm we have no external reference, only poorly designed code where the comments are there to guide the writing, not describe what was written. So it’s much harder to debug an output, and the llm cannot be trusted to clean it up either.

        • Inucune@lemmy.world · 8 upvotes · 22 hours ago

          A human can be held responsible. A machine cannot. If the machine writes bad code, and someone gets injured or killed because of it, who takes responsibility?

          I state again: a machine cannot be held responsible.

          • jostein@lemmy.world · 3 upvotes · 22 hours ago

            It is never the coder that is responsible, it is the one who makes the code available to use. Often with humans, they are one and the same. With machines, they are not.

        • FizzyOrange@programming.dev · 4 upvotes · 22 hours ago

          You can totally fix AI-written code with AI. You tell it something is wrong, it tries to fix it.

          I did a recent experiment with AI writing a document format converter and that’s exactly what I did. It wrote some code, I checked the output, found a formatting issue or similar, asked it to fix it, repeat. It works unreasonably well and with Fable the final code isn’t even bad.

            • FizzyOrange@programming.dev · 2 upvotes · edited · 8 hours ago

              Probably fine if you review the code carefully. And if you’re working in a domain that AI is decent at (e.g. web stuff). But even if it wasn’t it doesn’t mean AI cannot program.

          • Bane_Killgrind@lemmy.dbzer0.com · English · 6 upvotes · 22 hours ago

            You can fix problems, if you know they are there and there is a model of that problem being fixed.

            You can’t fix problems you don’t know are there, or do not have modeling.

              • 42firehawk@fedinsfw.app · English · 3 upvotes · 17 hours ago

                To add to the other response - it is much more difficult to work with AI to debug inconsistent issues or similar unless you can understand the code and step through with a debugger to check for race conditions or similar.

                Recently I was working with an AI tool on some C code that ran wildly differently depending on the machine. The AI was unable to identify any issues, and kept recommending fixes like hardcoding values that I had to revert. Finding the fix ended up requiring valgrind to create a different enough environment to see how the race condition arose, so that one async call could properly be made to wait for the other.

                AI can be powerful, and humans can be dumb. But if the code was human made, I would not have needed 3 hours to find a problem, and I wouldn’t have tried to turn to AI for a simple fix because I’d know what I was looking for to start with.
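
                The kind of fix described above, making one async call wait for the other, can be sketched roughly as follows. This is not the original C code: it is a Python `asyncio` illustration, and `load_config`, `use_config`, and the shared `state` dict are hypothetical names invented for the example.

```python
import asyncio

async def load_config(state, ready):
    """Hypothetical first call: fills in shared state after a short delay."""
    await asyncio.sleep(0.01)  # simulated I/O latency
    state["value"] = 42
    ready.set()  # signal that the shared state is now valid

async def use_config(state, ready=None):
    """Hypothetical second call: reads the shared state."""
    if ready is not None:
        await ready.wait()  # explicit ordering: block until load_config is done
    return state.get("value")

async def run(ordered):
    """Start both calls concurrently; return what use_config observed."""
    state = {}
    ready = asyncio.Event()
    _, seen = await asyncio.gather(
        load_config(state, ready),
        use_config(state, ready if ordered else None),
    )
    return seen

if __name__ == "__main__":
    print("unordered:", asyncio.run(run(ordered=False)))
    print("ordered:  ", asyncio.run(run(ordered=True)))
```

                One simplification to keep in mind: asyncio schedules tasks on a single thread, so the unordered version here fails the same way every time. With real threads in C, the unordered outcome depends on timing, which is why it can look fine on one machine and break on another.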

              • FiniteBanjo@programming.dev · 6 upvotes · 20 hours ago

                Humans generally don’t hallucinate libraries or documentation. If there is a bug or error on a human-maintained repo, the human in charge will generally know what went wrong and how to fix it. The AI will just gaslight your ass because the AI has no idea.

  • FizzyOrange@programming.dev · 7 upvotes · 22 hours ago

    Agents cannot program

    This is just factually incorrect. Difficult to get past a false assertion of this magnitude.

    They are a highly sophisticated statistical model designed to mimic the distribution of programming.

    I thought we had got over the stochastic parrot nonsense by now.

    You can totally have objections about the ability of AI to program - how good it is, poor failure modes, high cost, technical debt, knowledge debt, broken social contracts, etc. All valid.

    But if you’re still in the “It’s just a next word predictor! It can’t really think!” stage of denial, even now… Sorry you’re an idiot.

    • NaibofTabr@infosec.pub · English · 11 upvotes · 17 hours ago

      Um… but it is just a sophisticated statistical model… that’s literally what the math underpinning machine learning models is… and all it can do is make associations based on correlations within the field of the training data. That’s what it does.

      • FizzyOrange@programming.dev · 2 upvotes · 8 hours ago

        This is like saying “but it is just a sophisticated network of neurons. All it can do is transform input signals into output signals. That’s what it does.”

        • NaibofTabr@infosec.pub · English · 2 upvotes · 7 hours ago

          Not really.

          A machine learning model is a computer program. It is fundamentally a math equation, which we understand completely.

          A living brain is not fundamentally a math equation, and is not purely a statistical model, at least not in any empirically demonstrable way. We don’t understand completely how it works, but we do know that it’s more complex than what you’re trying to imply.

          The comparison is not valid. Machine learning models are not an equivalent to a biological brain.

      • chicken@lemmy.dbzer0.com · 4 upvotes · 15 hours ago

        Um… but it is just a sophisticated statistical model…

        The mistake has been thinking this implies LLMs can never do X task, and using it as a catch-all argument for any value of X, but it isn’t a good argument because it has been wrong for most of those.

        • NaibofTabr@infosec.pub · English · 3 upvotes · edited · 7 hours ago

          The mistake has been thinking this implies LLMs can never do X task

          As this article points out, an LLM can spit out chunks of regurgitated code that it scraped from the internet, but that does not make the LLM a programmer. The resulting output is an attempt to find an existing pattern in the database which fits with what the user has asked for, but it is not a product of actually understanding the use case for the code. It is just statistical correlation.

          So, sure, an LLM can be set up to generate output related to X task. If you can collect and clean data that can be used to train the kind of output you want, it should be able to produce an approximate facsimile of the results you want. Is that valuable for your use case? Maybe.

          We’re still just talking about what is essentially a complex search function. The statistical model returns results from its database that correlate most closely to your input. That does not mean it returns the right answer. If there is no good correlation, it will still return a result.

          As long as you understand that the result you get is just a correlation based on your input and may or may not be relevant to your specific problem, and you are not fooled into believing that the LLM actually understands what you’re asking and produced a result by “thinking” about it, then you might be able to use an LLM as an effective tool - to search a large collection of information for something that is relevant(ish) to what you’re asking for.

          The real mistake has been broad misunderstanding of what LLMs actually do, and trying to use them as general-purpose problem solving tools (or worse, as accurate and reliable sources of information).

    • TrickDacy@lemmy.world · 7 upvotes · edited · 19 hours ago

      Thank you for letting us know where you stand. Those who tag users can act accordingly to this public declaration of robo-handjob-giving.

  • m532@lemmygrad.ml · 1 upvote · 20 hours ago

    The output is broken, but in a way that’s getting harder and harder to detect.

    Introducing code homeopathy: “Yes, those two programs consist of the exact same bytes, but this one contains pure virgin artisan essence (C20 distilled)”