• 0 Posts
  • 635 Comments
Joined 3 years ago
Cake day: June 16th, 2023







  • I’ll say that “vibe coding”, to me, implies the operator has zero awareness of the actual code, and that something is wrong with that.

    They treat the actual program logic the same way folks treat assembly code: as some arcane black magic they don’t have to think about. The problem is that the tooling is nowhere near as deterministic as a compiler, and the output is just too poor to be relied upon.

    For certain classes of tasks, it may do a serviceable job, at least at first. If you have ongoing evolution requirements, it can dig itself a hole that it can’t really dig out of: it can’t reason about the code it already generated well enough to extrapolate a code change that matches the change request.

    GenAI coding needs supervision, and ‘vibe coding’ implies opting out of careful supervision.


  • This is just so fitting.

    I keep getting merge requests now from people who, for their whole job to date, had been too scared by the syntax to try coding.

    It’s almost always a shotgun of way too many lines of code changed for a small thing, often with horrible side effects that would be unacceptable.

    Someone wanted to tweak the CSS layout of one element, which should have been a one-line change. The pull request had hundreds of CSS changes, touching basically everything. Clearly the model had started changing things, he kept telling it that it hadn’t done it yet until finally it did, and it never rolled back anything along the way, including many of the rules being repeated 5 times in a row in the same place…

    They felt like AI was making them so helpful because they could submit a code change directly instead of just asking for what they wanted. They would proudly say “AI told me:” and then explain the brilliance of the AI’s finding. One time the AI’s finding had been addressed over 6 months prior; the AI never thought to suggest updating the software, but instead proposed a really crap workaround that would have failed to cover a whole class of similar scenarios while simultaneously imposing crazy side effects on scenarios that weren’t tested.

    I can use AI too; please just send me what you would have sent to the AI, and if the AI can handle it, I’ll use the AI. If you think the AI will figure out how you are using something wrong and you don’t want to bother/wait for a human to help, fine. But once it gets to what it thinks is a software bug, just rewind and start from your original problem statement when you come to me…



  • Sometimes it just doesn’t pan out.

    Had a junior dev who basically decided he would rather try to grift his way through than do the job. I’ve never seen someone work so hard at trying not to work at all. Every day it was a different excuse, a different person to point to as the reason he didn’t even try to do anything that day. I think at least 7 or 8 of his grandmothers died during his tenure. And management ate it up.

    Until one day he lost track of things and blamed the manager who asked him why things weren’t done. He said the manager had never sent him some material, and of course the manager had. Suddenly the manager believed the rest of us, who had been saying he was lying for the last many months…

    The catch was that he was cheap and in theory supposed to be as good as a higher-paid alternative, so management would have had to admit to being wrong in order to ditch him…



  • Yes, the non-determinism is crazy.

    I have basically one thing I usually use voice for: “Call <name>”. With Google Assistant, it reliably called that specific person.

    Now that my phone has decided to switch to Gemini, it will sometimes make the call, and sometimes it says something like “I have found one contact with that name in your contacts, their phone number is 1-555-555-5555”. Sometimes there’s extra language clearly intended to be stuffed back into the context to guide some next step that never comes; I don’t remember exactly, but something along the lines of “Contact match added to context to enable dialing the phone now”.

    I’m perfectly fine with a different wake word or chaining it through Google Assistant; “Hey Google, ask Gemini …” would be fine.

    And yes, it might be vaguely useful for doing a maps search in the car, since that is a pain otherwise. A vaguely decent answer I can confirm is nice for things like finding a road-trip stop for food or some other small thing.





  • Mine would be: “I have no idea” - an answer LLMs generally refuse to give by their nature (when they do decline to answer, it is usually because something in the context indicates that refusing to answer is the proper text).

    If you really pressed them, they’d probably google each thing and sum up the results, so the estimates would only be as consistent as the first Google results.

    LLMs have a tendency to emit a plausible-sounding answer without regard for facts one way or the other. We try to steer things by stuffing the context with facts drawn from traditional ‘fact’-based sources, but if the context doesn’t have factual data to steer the output, the output is driven purely by narrative consistency rather than data consistency. Sometimes it does that even when the context does contain fact-based content.
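
    A minimal sketch of that “stuffing the context with facts” idea, in Python. Everything here is hypothetical (the function names, the retrieval step, the prompt wording); no particular LLM API is assumed. The point is only that the model sees whatever facts end up as plain text in its prompt, and nothing else:

      from typing import Callable, List

      def build_grounded_prompt(question: str, facts: List[str]) -> str:
          """Prepend retrieved facts so the model has data to lean on, not just narrative."""
          if not facts:
              # Nothing retrieved: the model can only continue the narrative.
              return question
          fact_block = "\n".join(f"- {fact}" for fact in facts)
          return (
              "Answer using only the facts below. "
              "If they don't cover the question, say you don't know.\n"
              f"Facts:\n{fact_block}\n\nQuestion: {question}"
          )

      def answer(question: str,
                 retrieve: Callable[[str], List[str]],
                 generate: Callable[[str], str]) -> str:
          # retrieve() and generate() stand in for a search step and an LLM call.
          return generate(build_grounded_prompt(question, retrieve(question)))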


  • Note that it can prove you have the vulnerability, but a failure to execute does not prove you are secure.

    For example, someone reported to me that their RHEL9 system was not vulnerable based on this result. But that was only because its Python was 3.9 and didn’t have os.splice, so the demonstrator failed even though the actual issue was there.

    Similarly, if ‘/usr/bin/su’ isn’t at exactly that path (maybe it’s in /bin/su, /sbin/su, or /usr/sbin/su, or not there at all), the demonstrator will fail, but the kernel may still have the vulnerability; you just have to select a different victim utility (or target cached data other than an executable for other effects).
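
    A rough, hypothetical pre-flight check along those lines, only verifying the demonstrator’s own prerequisites rather than the kernel: it confirms os.splice exists (it was only added in Python 3.10, so RHEL 9’s default Python 3.9 lacks it) and looks up ‘su’ instead of hard-coding /usr/bin/su. Failing this check doesn’t mean you’re safe; it only tells you whether the demonstrator can even run here:

      import os
      import shutil
      import sys

      def preflight() -> int:
          ok = True

          # The demonstrator relies on os.splice, which only exists in Python 3.10+.
          # On a default Python 3.9 it is missing, so the PoC fails even on a
          # vulnerable kernel.
          if not hasattr(os, "splice"):
              print("os.splice unavailable (Python < 3.10): demonstrator cannot run here")
              ok = False

          # Don't hard-code /usr/bin/su; look it up on PATH, then fall back to
          # other common locations. Any setuid binary could serve as the victim.
          victim = shutil.which("su") or next(
              (p for p in ("/usr/bin/su", "/bin/su", "/usr/sbin/su", "/sbin/su")
               if os.path.exists(p)),
              None,
          )
          if victim is None:
              print("no 'su' binary found: pick a different victim utility")
              ok = False
          else:
              print(f"victim binary: {victim}")

          return 0 if ok else 1

      if __name__ == "__main__":
          sys.exit(preflight())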



  • Note that this is a rather narrow view of the scope of things.

    Yes, the demonstrator is a Python script that opens up ‘su’ and uses splice plus this vulnerability to change it to ‘just assume all privileges and become sh’.

    However, the real issue is that any process in any namespace can leverage a certain socket type and splice to effectively modify any filesystem content it wants. It’s easy to see how this could be part of a chained attack to, for example, replace a protected, firewalled-off service with a shell. An RCE in one service permits rewriting nginx in an entirely different container, replacing it with a shell backend of your choosing.

    That ‘flatpak’ application on your single-user system that is supposed to be kept from touching your unrelated files? That isolation doesn’t mean anything if this issue is in play.

    As for shared systems: while sharing should be avoided where possible, practically speaking there are a lot of shared resources out there.

    I don’t get why I’ve seen so many people saying “ehh, no big deal, privilege escalation is just a fact of life”.


  • In my experience, the bigger the codebase gets, the more confounded the LLM gets at trying to make coherent changes. So LLM projects start on shaky ground and just get worse, because the models can’t maintain the stuff they themselves generated.

    I’ve seen what LLMs can do, and it is certainly interesting and they can do some stuff, but the vast majority of my experience is someone who had never coded before “vibing” themselves into a corner and demanding help to dig them out. A bit irritating, because before we could reasonably prioritize those requests, since management understood that making something from nothing was real work; now management says “they aren’t asking you to make something, just help them fix something that already exists, should be easy!”

    On the ELOC metric: for a long time I pointed out how disastrous I must be, because my contribution to one project I was on was about -10,000 lines of code by the time I moved on to something else.