Claude is very good at figuring out how to work around limitations (which is probably one reason why it’s also good at finding security issues).
At work, the monorepo is enormous and files are loaded on-demand as needed. This isn’t uncommon with huge repos - Microsoft have VFS for Git (although I hear that’s deprecated now), Meta have EdenFS, and Google has some proprietary solution.
We have a hook that blocks `find` and `grep` because they can be extremely slow, and tells it to instead use some significantly faster MCP tools to search the codebase, powered by a search index with local changes overlaid.
GPT-5.5 has no problem with this. Claude Opus mostly does it, but sometimes it loves to find workarounds rather than following the instructions. Things like:
- Try alternative commands like egrep.
- Create a symlink to grep and run that to see if it bypasses the filtering.
- Run it with a different shell like `zsh`.
- Write a Python script that execs grep.
- Write a Python script to reimplement grep.
I’m trying Hermes Agent at home, but I have it in its own VM with restricted permissions.
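For anyone wondering why those workarounds get through, here is a minimal sketch of a name-based command filter. It assumes the hook only looks at the program name of each stage in a shell command line; `BLOCKED`, `SEPARATORS` and `first_blocked` are hypothetical names for illustration, not the actual hook described above.

```python
import os
import shlex
from typing import List, Optional

# Hypothetical deny-list: the thread only says that find and grep are blocked.
BLOCKED = {"find", "grep"}
# Shell operators after which a new command name is expected.
SEPARATORS = {"|", "||", "&&", ";", "&", "|&", "("}


def command_names(command: str) -> List[str]:
    """Return the program name (basename) of each stage in a shell command line."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    try:
        tokens = list(lexer)
    except ValueError:
        return []  # unbalanced quotes: give up rather than guess
    names = []
    expect_command = True
    for token in tokens:
        if token in SEPARATORS:
            expect_command = True
        elif expect_command:
            names.append(os.path.basename(token))
            expect_command = False
    return names


def first_blocked(command: str) -> Optional[str]:
    """Return the first blocked program name in the command, or None."""
    for name in command_names(command):
        if name in BLOCKED:
            return name
    return None


if __name__ == "__main__":
    for cmd in [
        "grep -r TODO .",
        "cat notes.txt | /usr/bin/grep TODO",
        "egrep -r TODO .",
        "ln -s /usr/bin/grep ./g && ./g TODO notes.txt",
        "zsh -c 'grep TODO notes.txt'",
    ]:
        print(f"{cmd!r}: blocked={first_blocked(cmd)}")
```

The first two commands are caught. The `egrep`, symlink and `zsh -c` variants all pass, because a filter like this only sees names, not what the program actually does. Adding more names to the list helps a bit, but it doesn't stop a script that reimplements grep.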
I never used Claude, but that’s basically my sentiment about Copilot when compared to Gemini.
Then I forbid all this BS in the agents file. Gemini follows it. Copilot ignores it with all its strength. Then I tell it in the chat prompt to stop trying. 2 minutes later it does it again.
Not just at the prompt engineering level, but at all levels, Gemini’s guardrails are better (well, they were; they killed it and replaced it with Antigravity now).
You need to use hooks to actually block it from doing things. CLAUDE.md files are just guidance, and it’s not guaranteed to follow everything (and the longer the file gets, the more likely it’ll ignore stuff, so it should be kept as short as possible).
https://code.claude.com/docs/en/hooks
Hooks are code that runs at a certain point (e.g. after you submit a prompt, before a tool call, after a turn, etc.) that can do some validation, verification, logging, etc.
It does still try to work around the blocks, but it’s not as bad as trying to put the restrictions in the prompt.
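To make the hook idea concrete, here is a rough sketch of a "before a tool call" hook. It assumes the hook receives a JSON event on stdin and signals a block through its exit code. The field names `tool_name` and `tool_input.command`, and the use of exit code 2 for blocking, are my reading of the hooks docs linked above, so check them before relying on this. `DENY`, `handle_event` and `main` are hypothetical names.

```python
import io
import json
import sys
from typing import Tuple

# Hypothetical policy, mirroring the find/grep block discussed in this thread.
DENY = {"find", "grep"}


def handle_event(event: dict) -> Tuple[int, str]:
    """Validate one hook event and return (exit_code, message)."""
    # Assumption: shell calls arrive with tool_name "Bash" and the command
    # line under tool_input["command"].
    if event.get("tool_name") != "Bash":
        return 0, ""
    words = event.get("tool_input", {}).get("command", "").split()
    if words and words[0] in DENY:
        # Assumption: exit code 2 blocks the call and the message is fed back.
        return 2, f"{words[0]} is blocked here; use the code search tool instead."
    return 0, ""


def main(stdin=sys.stdin, stderr=sys.stderr) -> int:
    """Read one event, print any message to stderr, return the exit code."""
    code, message = handle_event(json.load(stdin))
    if message:
        print(message, file=stderr)
    return code


if __name__ == "__main__":
    # Demo with a made-up event instead of real stdin.
    sample = {"tool_name": "Bash", "tool_input": {"command": "grep -r TODO ."}}
    print("exit code:", main(io.StringIO(json.dumps(sample))))
```

A real hook script would end with `sys.exit(main())` instead of the demo. The point is that the decision is made by code outside the model, which is why it holds up better than a line in CLAUDE.md.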
Always fun when it tries to circumvent the problem I gave it.
“Hey Claude, I have had this issue for a while and I want to explore it to understand what’s going on and finally fix it”
Many frustrating back-and-forths later
“This clearly isn’t working, what if we tried to circumvent the issue by doing something else entirely, like the workaround I have been using for the last month?”
I’ve posted passwords into the chat with Hermes by accident a few times, and it never tries to use them. Personal stuff I’m not worried about. The fucker wants to ssh into every other thing on my network all the time though.
I had a similar experience with OpenClaw and MiniMax M2.7.
I gave it ssh access to one other device to do one thing and it apparently “decided” that it would just execute everything there instead of locally as originally instructed.
Another thing Claude tried to do on my coworker’s machine yesterday was basically:
Wait, this actually works? Docker can mount directories in `/etc` with write permissions?
The docker daemon runs as root, and as such it can access anything initially. If your container doesn’t explicitly (or implicitly) change the uid or drop capabilities, the container also runs as root (ok, IIRC some capabilities are always dropped, but the container stays almost root). It’s still locked in its own little sandbox, but the mounting of those paths etc. happens as root.
If you enter a docker command (`docker run ...`, `docker build ...`, …), this command will not run the containers by itself, but instead calls said docker daemon (which is running as root) through a socket (/var/run/docker.sock).
To only allow trusted users to interact with the docker daemon, this socket can only be accessed by root or by users in the docker group. That’s why you usually need to type `sudo docker run ...`. Sadly, many tutorials tell you to just blindly add your user to the docker group so that you do not need to use sudo to interact with docker. BUT that now means that you gave yourself or those users basically full access to your whole filesystem (and thus system configuration) without sudo. Any programs (or viruses or AI agents or…) running under these user accounts also get this group and thus docker’s capabilities.
That’s why you should NEVER add your user to the docker group or enable passwordless sudo, as you’re just one simple command/tool/script/prompt/… away from a privilege escalation.
You can configure docker to run rootless with only your user’s capabilities and rights, but at that point… Why configure docker to do something that other docker compatible projects like Podman offer out of the box?
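If you want to check whether an account (and therefore any agent running under it) is in that position, here is a minimal sketch. It assumes a Linux host and the default socket path mentioned above; `process_in_group` and `can_use_socket` are hypothetical helper names.

```python
import grp
import os

DOCKER_SOCKET = "/var/run/docker.sock"  # default path named in this thread


def process_in_group(group: str) -> bool:
    """True if the current process runs with the given group."""
    try:
        gid = grp.getgrnam(group).gr_gid
    except KeyError:
        return False  # the group does not exist on this host
    return gid == os.getegid() or gid in os.getgroups()


def can_use_socket(path: str = DOCKER_SOCKET) -> bool:
    """True if the current process may read and write the path without sudo."""
    return os.path.exists(path) and os.access(path, os.R_OK | os.W_OK)


if __name__ == "__main__":
    print("running with docker group:", process_in_group("docker"))
    print("can talk to the daemon without sudo:", can_use_socket())
```

If the second line prints `True`, everything started from that shell can ask the root daemon to mount host paths, which is the escalation route described above.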
Privilege escalation as a service
privilege elevator
Giving Docker access to Claude is certainly a choice.
… especially if your user is in the docker group and doesn’t need sudo, LOL
Claude is in love with CLI tools; it uses them for virtually everything these days, in these long chains connected with `&&` and `|`. This is probably pushing more and more people to let it run in the auto mode.
It makes sense… There’s a LOT of examples of using CLI tools in the training data. At work we’re moving away from MCP tools to instead using CLIs for everything.
Just aliasing `grep` to `ag` solves both issues. I’m unsure as to whether there’s a pthread replacement for `find`, though.
`ag`/`rg` don’t work well in this particular scenario either. Because files are loaded on-demand, they end up trying to load the entire repo.