Off-and-on trying out an account over at @tal@oleo.cafe due to scraping bots bogging down lemmy.today to the point of near-unusability.

  • 15 Posts
  • 742 Comments
Joined 2 years ago
Cake day: October 4th, 2023

  • Security is where the gap shows most clearly

    So, this is an area where I’m also pretty skeptical. It might be possible to address some of the security issues by making minor shifts away from a pure-LLM system. There are (conventional) security code-analysis tools out there, stuff like Coverity. Like, maybe if one says “all of the code coming out of this LLM gets rammed through a series of security-analysis tools”, you catch enough to bring the security flaws down to a tolerable level.

    One item that they highlight is the problem of API keys being committed. I’d bet that there’s already software that will run on git-commit hooks that will try to red-flag those, for example. Yes, in theory an LLM could embed them into code in some sort of obfuscated form that slips through, but I bet that it’s reasonable to have heuristics that catch most of that and are good enough, and that such software isn’t terribly difficult to write.

    But in general, I think that LLMs and image diffusion models are, in their present form, more useful for generating output that a human will consume than output that a CPU will consume. CPUs are not tolerant of errors in programming languages. Humans often just need an approximately-right answer to cue our brains, which themselves have the right information to construct the desired mental state. An oil painting isn’t a perfect rendition of the real world, but it’s good enough, as it can hint at what the artist wanted to convey by cuing up the appropriate information about the world that we have in our brains.

    This Monet isn’t a perfect rendition of the world. But because we have knowledge in our brain about what the real world looks like, there’s enough information in the painting to cue up the right things in our head to let us construct a mental image.

    Ditto for rough concept art. Similarly, a diffusion model can get an image approximately right — some errors often just aren’t all that big a deal.

    But a lot of what one is producing when programming is going to be consumed by a CPU that doesn’t work the way that a human brain does. A significant error rate isn’t good enough; the CPU isn’t going to patch over flaws and errors itself using its knowledge of what the program should do.

    EDIT:

    I’d bet that there’s already software that will run on git-commit hooks that will try to red-flag those, for example.

    Yes. Here are instructions for setting up trufflehog to run on git pre-commit hooks to do just that.
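    For instance, a minimal sketch of what that can look like as a plain git hook. The trufflehog flags here are from memory of the v3 docs and may differ on your version, so treat it as an outline rather than a recipe:

        #!/bin/sh
        # .git/hooks/pre-commit (must be executable)
        # Scan what's about to be committed and abort the commit on findings.
        # --only-verified: only report credentials trufflehog could actually verify
        # --fail: exit nonzero when something is found, which blocks the commit
        exec trufflehog git file://. --since-commit HEAD --only-verified --fail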

    EDIT2: Though you’d need to disable this trufflehog functionality and have some out-of-band method for flagging false positives, or an LLM that has seen these ignore comments in its training data could learn to emit them and bypass the security audit:

    Add trufflehog:ignore comments on lines with known false positives or risk-accepted findings


  • I keep seeing the “it’s good for prototyping” argument they post here, in real life.

    There are real cases where bugs aren’t a huge deal.

    Take shell scripts. Bash is designed to make it really fast to write throwaway, often one-line software that can accomplish a lot with minimal time.

    Bash is not, as a programming language, very optimized for catching corner cases, or writing highly-secure code, or highly-maintainable code. The great majority of bash code that I have written is throwaway code, stuff that I will use once and not even bother to save. It doesn’t have to handle all situations or be hardened. It just has to fill that niche of code that can be written really quickly. But that doesn’t mean that it’s not valuable. I can imagine generated code with some bugs not being such a huge problem there. If it runs once and appears to work for the inputs in that particular scenario, that may be totally fine.
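    To give a sense of the niche, the sort of thing I mean is a disposable line like this (the log file name is just an example), run once against whatever happens to be in front of me and never saved:

        # count requests per client IP in a web log, then forget the line ever existed
        awk '{print $1}' access.log | sort | uniq -c | sort -rn | head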

    Or, take test code. I’m not going to spend a lot of time making test code perfect. If it fails, it’s probably not the end of the world. There are invariably cases that I won’t have written test code for. “Good enough” is often just fine there.

    And down the line, instead of (or in addition to) human-written commit messages, it might be possible to generate descriptions of commits for someone browsing code.

    I still feel like I’m stretching, though. Like…I feel like what people are envisioning is some kind of self-improving AI software package, or just letting an LLM go and having it pump out a new version of Microsoft Office. And I’m deeply skeptical that we’re going to get there just on the back of LLMs. I think that we’re going to need more-sophisticated AI systems.

    I remember working on one large, multithreaded codebase where a developer who wasn’t familiar with, or wasn’t following, the thread-safety constraints would create an absolute maintenance nightmare for others: you end up spending way more time tracking down and fixing the breakages they induced than they saved by not coming up to speed on the constraints their code needed to conform to. And the existing code-generation systems just aren’t really in a great position to come up to speed on those constraints. Part of what a programmer does when writing code is to look at the human-language requirements, identify that there are undefined cases, and either go back and clarify the requirement with the user or use real-world knowledge to make reasonable calls. Training an LLM to map from an English-language description to code creates a system that just doesn’t have the capability to do that sort of thing.

    But, hey, we’ll see.


  • I apparently actually did two of these on different occasions, using different, restricted Unicode character ranges (ones that only look at the value of the character as a whole, no subpixel rendering). Can’t find the (newer) color one, but the black-and-white one:

        ░                                                                         
      ░░░░░░░                                                                     
     ░▒▒▒▒▒▒▒░░░                                                                  
     ░▒▓▓▓▒▒▒░░░░░                                                                
     ░▒▓▓▓▓▒▒▒▒▒░░░░░                                                      ░░░    
     ░▒▓▓▓▓▓▓▓▒▒▒▒▒▒▒░░                                              ░░░░░░░░░░░░░
     ░▒▓▓███▓▓▓▓▒▒▒▒▒▒▒░░                                         ░░░▒▒▒▒▒▒▒▒▒▒▒░░
     ░▒▓▓▓████▓▓▓▒▒▒▒▒▓▒▒░░                                 ░░░░░░▒▒▒▒▒▒▒▒▒▒▓▓▒▒░ 
      ░▒▓▓▓███▓▓▒▒▒▒▒▒▒▒▒▒░░░░░░░░░░░░░░░░░░░             ░░▒▒▒▒▒▒▒▒▒▒▒▒▒▓▓▓▓▓▒░  
       ░▒▓▓▓█▓▓▓▒▒▒▒▒▒▒▒▒▒▒░░░░░░░░░░░░░░░▒▒▒░░░░░░░░░░░░░▒▒▓▓▓▓▓▓▒▒▒▒▓▓▓▓▓▓▓▒░   
        ▒▓▓▓██▓▓▓▒▒▒▒▒▒▓▓▓▒▒░░░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░░░▒▒▒▓▓▓▓▓▒▒▒▓▓▓▓▓██▓▓▓▒░    
        ░▒▓▓▓████▓▓▓▓▓▓▓▓▓▓▒░░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░▒▒▒▒▒▓▓▓▓▓▓▓▓▓▓▓▓▓▓█▓▓▓▒░     
        ░▒▒▓▓▓████▓▓▓▓▓▒▒▒▒▒▒░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░▒▓▓▓▓▓▓▒▒▒▓▓▓▓▓▓▓▓▒░       
        ░░▒▓▓▓▓██▓▓▓▒▒▒▒▒░░░░░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░▒▒▒▓▓▓▓▓▓▓▓▓▓▓▓█▓▓▓░░        
         ░▒▓▓▓▓▓▓▓▒▒░░░░░░░░░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░░░▒▒▒▒▓▓▓▓███▓▓▓▒░          
          ░▒▓▓▓▒▒░░░░░░░░░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░░░░░░▒▒▓▓██▓▓▓▓▒░           
            ▒▒▒░░░░░░░░░░░░░░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░░░░░░▒▒▓▓▓▓▓▓▒░            
          ░░░░░░░░░░░░░░▒░░░░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░░░░░░░░▒▒▒▓▓▓▒░             
          ░▒▒▒░░░░▒▒▒▒░░▒▒▒▒▒▒▒▒▓▓▓▓▒▒▒▒▒▒▒▒▒▒▓▓▒▒▒▒░░░░░░░░░░▒▒▒▒▒░              
         ░░▒▒▒▒░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▓▓▓▓▓▒▒▒▒▒▒░▒▒▒▓▓▓▒▒▒▒▒░░░░░░░░░░▒▒░               
        ░░▒▒▒▒▒▒▒▒▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▒▒▒▒▒▒▒▒▒▒▒▓▓▓▓▒▒▒▒▒░░░░░░░░░▒░░               
        ░░▒▒▒▒▓▓▓▓▓▓▓▒▒▒▒▓▓▓████▓▓▒▒░░░░▒▒▒▒▒▒▒▓▓▓▓▓▓▓▓▓▓▒▒▒░░░░▒░░               
       ░▒▒▒▒▒▓▓▓▓▓▓▓▓▒▒▒▒░░░▒▒▓▓▓▓▒▒░░░░░░░░░▒▓▓████▓▓▓▓▒▒▒▒▒▒░░░░░               
      ░░▒▒▒▓▓▓▓▓▒▒▒▓▓▓▓▓▒▒▒░░░▒▒▓▒▒░░░░░░░░░░░▒▓▓▓▒▒▒▒▒░░▒▒▒▒▒▒░▒░                
        ░▒▒▓▓▓▒▒▒▓▓▓▓▓▓▓▓▓▒▒▒▒▒▒▒▒░░░░░░░░░░░░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░                 
         ░▒▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▒▒▒░░░░░░░░░░░░░░▒▒▒▒▒▒▓▓▓▒▒░▒▒▒▒░                 
           ░▒▓▓▓▓▓▓▓▓▓▓▓▒▒▒▒▓▓▓▒▒▒░░░░░░░░░░░░░░▒▒▒▒▒▒▒▓▓▒▒▒░░▒▒▒░                
            ░▒▓▓▓▓▓▓▓▓▒▒▒▒▒▒▓▓▒▒▒▒░░░░░░  ░░░░░▒▒▒▒▒░░▒▒▒▒▒▒▒▒▒▒▒░                
             ░▒▓▓▓▓▓▓▓▓▓▒▒▒▒▓▓▒▒░░░░░░      ░░░░▒▒▒▒░░▒▒▒▒▓▓▓▓▓▒▒░                
               ░▒▓▓▓▓▓▓▓▒▒▓▓▓▒░░  ░░░░░▒▒▒░░   ░░▒▒░░░▒▓▓▓▓▓▒▒▒▒░                 
                ░▒▒▒▓▓▓▓▓▓▓▓▒░░   ░░▒▓▓▓▓▓▓▒░░  ░░▒▒▒▓▓▓▓▓▒▒▒▒░                   
                     ░▒▒▓▓▓▓▒░   ░▓▓█▓▓▓▒▒▓▓▓▒░  ░▒▒▓▓▓▓▒▒▒▒░                     
                        ▒▒▓▒▒▒░░░▒▓███████▓██▓░   ░▒▓▓▓▒░░                        
                          ▒▒▒▒▒▒▒▒▓████▓▓████▓░   ░▒▒▒░                           
                            ░▒▒▒▒▒▓▓▓██████▓▒▒░░░░                                
                              ░▒▒▒▒▒▓▓▓▓▓▓▓▒▒▒▒░░                                 
                                ▒▒▓▓▓▓▓▓▓▓▓▒░░                                    
                                  ░░▒▒▒▒▒▒░░                                      
    

    It’s generated by the program with this uuencoded source:

    source
    begin 644 unicode_image.tar.xz
    M_3=Z6%H```3FUK1&`@`A`18```!T+^6CX"?_!IM=`":8269IV=F-Y,!%M1>4
    MZX(9LR,YMG1:D2XCM%DZ,0N4%>'\;I?0D"7/H:OI15<M6G@8HO/[&((J'=B.
    MXY\G/[7D"Y)B$O)IC]DM9Y@^\4T?'9(.Z.4+7IDF/T0&\7`M5+G#=C!?(>(U
    M+-C2%PZ!"(Q'Z_/D^%"[PVKX:A:OKH5WF?AQ=CD_AAS]<3<THTMC0S8FG\<A
    MZ;A-_9H?)5S'YG5A?0WUQ7FR0+IS\0AEUYY9QFMY?"$);\U%_R0NK(ZZ/Y&J
    MVA;@#O-P.6W8PW']0U"<S'NHB=.(/)OX[<1&UF@M8+GXPGVFQB_+K/WD01ZO
    MO#+E!CK;^`V-WGH^?0V5M!IK[KR&]`IR<>6D+ONPJT6E\CZJ^KKZ,W?3O"2K
    M!/GHQ&TDN';;P#UC;)+HRPH$`_8JM#ZV`\I6=,PO=U#S33IZ=R!K2IF]\1D@
    M*@I6;)=1P[3ICJ%,C5VTH^%^^N7(`)5NO*-SG)Y`QA_WK>PA8;TJ+X2)EV?3
    MI.G"[*>WWDZ7\Q_8`@,?X8C9YSMNGQ.S79!10SGB!PGY<3)L+A>\T4NE3RCH
    M@$<!]40^I5;'[)@>$KCW3*:VMQ")"FQ!"L?^:Y5K)WM]*CV<",@L38:E&'G;
    MOH/\?B8-H-/5$-+1`SZ2O6RHY>T@2+Q"LM7T32<P'/M;:&`9G%<:2]0W95K<
    M\.;8"EQMW`_,%NF[4)3]`F-BE^_T2,`,VV:G?3TUJ"IH\A@>7Y8:?5[I8HAX
    M/S[+K[U+U);"B>&TTWB[]4K_N9HNW6P5\,!8G[BS*%\3<$#RM"O`6C#I1>3W
    ME/8D\";AO1[@@M<^$07W%./W$^^MXKWV/QE?(SEU3GAF"?/TXQ_1N>#/)-;8
    M<MT]?:XBQTJ%9'[+D!X/9^"U.Y*2A(8A/AJ\!K^V\)9>''<,=_M*GQ;D4XU&
    M0EL_A7==:2!F6+.18],W)'0*A8VHHT@F\^+U@*=PAW&]_UDA'O3HL,)67*56
    M:4QHF]DC7*EMI@?8=/,8;'O4[2W^#V!$L.(O\+@E"I[>5AF7V].]DSR0?>4E
    MX,HXX%S-A'V'+F)3(0_FX[O!VNO&D/BL`T<!(LJ,(@3!S8LJ[><9CHP*Z!1N
    M*W=F7"R-"ZZ_7"_NQ:8M=&RU\7`Z`<J>2YY>W2B\R7!YX(:UE+7?[1HLZ6?B
    M_$Z<[I3[+!'?["<`BV0HHVNVZ$?*E!CLQ,Q'U$5$IQ?#B#+P)WE/$$-*'U<E
    M%+7[;;^I\)F++WU`\\BL@X5,*B+]OLWQ&=W!,3*4_5Z0'R\2N/\];P>]W2<(
    M-7Z3$YL3&6;-"*NTT2_Z1P=4">JZT,0Y$Q`L;RU@2\X!6>NA6D:5#HOIP#H]
    MH)2I8$WFSU,1M9OC73J.1T1-YWD[%EH?1E*H#MV/[+5HSKGU.-%-'Z)QI=$\
    MV>24Q6&KMA*-=L#[#I[2'0$86N)&8E/==`F@,S#5,`)(-KDT6A9IK-/%6OA@
    M@FI$$#7G>&.Z!8[?8:F==P>HX>WF&.?(9V6^WM`J[CVD`L]9<&\6P_U?*WN`
    MLI*_M*H;SP58M&#X!>*U*^J*XO@UT"&SIGH1%(K-=7DN@=HD6`S2EET,60JV
    MI_\%)%6Q_^3CW_5`HQ;G084_7J0'F9DDH*%`SY*.D1BP`"D_QO=5,F?$-HAG
    M_FP7H+LUTX`^%F-[SV(C'N*+AXE=&!+'OT$)RYGQ/HX,L8W(D%G9J=P!6+*$
    M)F20)=%>9ZI).Z0`I'T/OT#SUR_:(O0U1*2-:,\D0S52^NI?HL69"POCNH&X
    M_HXLQZB3EL-Z<)4.!<<BDJX3H"Q`L'&"RLO/%]17EV.5R@/$,%GYE#U(,Z'.
    M6#]?M?@0VYB%WU2-4E:Z&9RN,"SCQAYJ70='?0`L5JC6GG#:1BG]DBCY;N)<
    M>[;>JU-P]W=*RQG_KX;[>Y-0O.>_BS[M!=3Y#98EA`S8J/\1S=Z..*RC^;+U
    M!.(#>E-V^?_+/M323Q,+EM-95M%CT#G[XO0FH/`.&`__EU<3\=+#>7?FR*NY
    M9MA;$1+KD8?V@Y8XE7`(*;.N\KEF1]T4!OYS1+%#*S9&[0-#E"FRGA^\L[^A
    M\76@2CMV<J_S92KW%;UO$.=R!3!P]OD.WD@*ZE(.;>H)8L]IMC;<YHPNH343
    MY,JGKBM7M:!:S]$UJ7Y-/A'>Z]^7LV*Y.]N\MN_#%%%>)IH_A_:G46]+E.M;
    MUA@I>99IP916P;7A48N3VF+;&!__1Q<QF8AU`XF-LJ./^6J+@+JCLICOF=I-
    M.U"KKV.._JR/;P(````4=99=LD1P/P`!MPV`4```*JOOO;'$9_L"``````19
    !6@``
    `
    end
    

    I’ve also seen various programs that use the Braille Unicode characters for higher-resolution bitmap rendering, like mapscii, and I suspect that someone’s probably written software to convert to that.


  • I was kind of interested in doing this for Unicode a while back, which has the potential to provide for a lot more possible characters and thus a lot more-accurate renditions. If you don’t care about real-time operation, I suspect that you can do this with off-the-shelf software by just writing a small amount of code to generate an image for each character — a shell script driving ImageMagick could do it — and then feeding it into photomosaic software, like metapixel.
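    As a rough sketch of the “small amount of code” step, assuming ImageMagick is installed and a monospace font by this name exists on the system (font names vary; convert -list font shows what’s available), something like this renders one tile image per character, which can then be handed to the photomosaic software as its tile library:

        mkdir -p tiles
        # printable ASCII here, but any Unicode range works the same way
        for code in $(seq 33 126); do
            ch=$(printf "\\$(printf '%03o' "$code")")
            # label:@- reads the label text from stdin, which avoids problems with
            # characters that ImageMagick would otherwise treat specially
            printf '%s' "$ch" | convert -background white -fill black \
                -font DejaVu-Sans-Mono -pointsize 32 label:@- "tiles/${code}.png"
        done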

    The major limitation is that unless you’re just interested in doing this for the text-based aesthetic and are actually rendering and presenting an image to the end user — think something like Effulgence RPG, Warsim, Armoured Commander II, Armoured Commander, Cogmind, SanctuaryRPG, Cataclysm: Dark Days Ahead, Stone Story RPG, Roots of Harmony, and so forth — you can’t control the font that the thing is rendered in on the end user’s computer. And the accuracy of the rendering degrades the more the typeface used on an end user’s computer differs from your own.

    It’d probably be possible to build some kind of system that does take the differences between typefaces into account, scoring characters higher when they look similar across a range of typefaces.

    Note that there are also at least two existing libraries out there that will do image-to-ASCII conversion (the two I can think of off the top of my head): aalib and libcaca, the latter of which has color support. I also posted a tiny program some time back to generate images using the colored Unicode tiles, and I imagine that someone out there probably has a website that does the same thing.
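    For example, libcaca ships an img2txt command-line tool (often packaged as caca-utils); the option names here are from memory, so check img2txt --help:

        # render an image (hypothetical file name) as 100-column character art
        img2txt -W 100 photo.jpg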


    There was a famous bug that made it into Windows 95 and 98: a tick counter that caused the system to crash after about a month and a half of uptime. It was in there so long because there were so many other bugs causing stability problems that it wasn’t obvious.

    I will say that classic MacOS, which is what Apple was shipping at the time, was also pretty unstable. Personal computer stability improved a lot in the early 2000s: Mac OS X came out, and Microsoft shifted consumers onto a Windows-NT-based OS.

    EDIT:

    https://www.cnet.com/culture/windows-may-crash-after-49-7-days/

    A bizarre and probably obscure bug will crash some Windows computers after about a month and a half of use.

    The problem, which affects both Microsoft Windows 95 and 98 operating systems, was confirmed by the company in an alert to its users last week.

    “After exactly 49.7 days of continuous operation, your Windows 95-based computer may stop responding,” Microsoft warned its users, without much further explanation. The problem is apparently caused by a timing algorithm, according to the company.
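    That 49.7-day figure is what you’d expect from a 32-bit counter of milliseconds wrapping around:

        # 2^32 milliseconds expressed in days
        echo "scale=2; 2^32 / (1000 * 60 * 60 * 24)" | bc
        # 49.71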




  • tal@lemmy.today to Selfhosted@lemmy.world: Where to start with backups?

    If databases are involved they usually offer some method of dumping all data to some kind of text file. Usually relying on their binary data is not recommended.

    It’s not so much text versus binary. It’s that a normal backup program which just treats a live database file as an ordinary file is liable to copy it while the DBMS is writing to it, resulting in a backed-up file that’s a mix of old and new versions and may be corrupt.

    Either:

    1. The DBMS needs to have a way to create a dump — possibly triggered by the backup software, if it’s aware of the DBMS — that won’t change during the backup

    or:

    2. One needs filesystem-level support to grab an atomic snapshot (e.g. take a snapshot using something like btrfs and then back up the snapshot rather than the live filesystem; see the sketch below). This avoids the issue of the database file changing while the backup runs.
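    A minimal sketch of #2, assuming the data lives on a btrfs subvolume at /srv/data, that its parent directory is also on btrfs, and that /mnt/backup is wherever the backup tool writes to (the paths are made up for illustration):

        # take an atomic, read-only snapshot; the DB can keep writing to the live copy
        btrfs subvolume snapshot -r /srv/data /srv/data-backup-snap
        # back up the frozen snapshot instead of the live, changing files
        rsync -a /srv/data-backup-snap/ /mnt/backup/data/
        # drop the snapshot once the backup is done
        btrfs subvolume delete /srv/data-backup-snap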

    In general, if this is a concern, I’d tend to favor #2 as an option, because it’s an all-in-one solution that deals with all of the problems of files changing while being backed up: DBMSes are just a particularly thorny example of that.

    Full disclosure: I mostly use ext4 myself, rather than btrfs. But I also don’t run live DBMSes.

    EDIT: Plus, #2 also provides consistency across different files on the filesystem, though that’s usually less critical. Like, you won’t run into a situation where software on your computer updates File A, then does a sync(), then updates File B, but your backup program grabs the new version of File B and the old version of File A. Absent help from the filesystem, your backup program won’t know where write barriers spanning different files are happening.

    In practice, that’s not usually a huge issue, since fewer software packages are gonna be impacted by this than write ordering internal to a single file, but it is permissible for a program, under Unix filesystem semantics, to expect that the write order persists there and kerplode if it doesn’t…and a traditional backup won’t preserve it the way that a backup with help from the filesystem can.


  • And, should the GenAI market deflate, it will be because all of the big players in the market – the hyperscalers, the cloud builders, the model builders, and other large service providers – believed their own market projections with enough fervor that TSMC will shell out an entire year’s worth of net profits to build out its chip etching and packaging plants.

    The thing is that with some of these guys, the capacity isn’t general.

    So, say you’re OpenAI and you buy a metric shit-ton of Nvidia hardware.

    You are taking on some very real risks here. What you are buying is an extremely large amount of parallel compute hardware with specific performance characteristics. There are scenarios where the value of that hardware could radically change.

    • Say generative AI — even a substantial part of generative AI — shifts hard to something like MoEs (mixture-of-experts models), and suddenly it’s desirable to have a higher ratio of memory to compute capacity. Suddenly, the hardware that OpenAI has purchased isn’t optimal for the task at hand.

    • Say it turns out that some researchers discover that we can run expert neural nets that are only lightly connected at a higher level. Then maybe we’re just fine using a bank of consumer GPUs to do computation, rather than one beefy Nvidia chip that excels at dense models.

    • Say models get really large and someone starts putting far-cheaper-than-DRAM NVMe on the parallel compute device to store offloaded expert network model weights. Again, maybe current Nvidia hardware becomes a lot less interesting.

    • Say there’s demand, but not enough to make a return in a couple of years, and everyone else is buying the next generation of Nvidia hardware. That is, the head start that OpenAI bought just isn’t worth what they paid for it.

    • Say it turns out that a researcher figures out a new, highly-effective technique for identifying the relevant information about the world, and suddenly the amount of computation needed falls way, way off, and doing a lot of generative AI on CPUs becomes a lot more viable. I am very confident that we are nowhere near the ideal here today.

    In all of those cases, OpenAI is left with a lot of expensive hardware that may be much less valuable than one might have expected.

    But…if you’re TSMC, what you’re buying is generalized. You fabricate chips. Yeah, okay, very high-resolution, high-speed chips at a premium price over lower-resolution stuff. But while the current AI boom may generate a lot of demand, all of that capacity can also be used to generate other sorts of chips. If generative AI demand suddenly falls way off, you might not have made an optimal investment, maybe spent more than makes sense on increasing production capacity, but there are probably a lot of people outside the generative AI world who can do things with a high-resolution chip fab.



  • tal@lemmy.today to Technology@beehaw.org: Move Over, ChatGPT

    In all fairness, while this is a particularly bad case, the fact that it’s often very difficult to safely fiddle with environment variables at runtime in a process, but very convenient to use them as a way to cram extra parameters into a library, has meant that a lot of human programmers who should know better have created problems like this too.

    IIRC, setting the timezone for some of the POSIX time APIs on Linux has the same problem, and that’s a system library. And IIRC SDL and some other graphics libraries, and some of the Linux 3D stuff, have used environment variables as a way to pass parameters out-of-band, which becomes a problem when programs start dicking with them at runtime. I remember reading an article from someone who had been banging into this in Linux gaming, about how various programs and libraries for games would call setenv() to fiddle with them, and the races associated with that were responsible for a substantial number of the crashes they’d seen.

    setenv() is not thread-safe or async-signal-safe. In general, reading environment variables in a program is fine, but there aren’t very many situations where messing with them at runtime is.

    searches

    Yeah, the first thing I see is someone talking about how its lack of thread-safety is a problem for TZ, which is the time thing that’s been a pain for me a couple times in the past.

    https://news.ycombinator.com/item?id=38342642
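    This is also why the usual shell idiom for TZ sidesteps the whole problem: the variable gets set in the environment handed to a single child process, and nothing ever calls setenv() inside a running, multithreaded program:

        # TZ applies only to this one invocation of date(1)
        TZ=America/New_York date
        TZ=UTC date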

    Back on your issue:

    Claude, being very smart and very good at drawing a straight line between two points, wrote code that took the authentication token from the HTTP request header, modified the process’s environment variables, then called the library

    for the uninitiated - a process’s environment variables are global. and HTTP servers are famously pretty good at dealing with multiple requests at once.

    Note also that a number of webservers used to fork to handle requests — and I’m sure that there are still some now that do so, though it’s certainly not the highest-performance way to do things — and in that situation, this code could avoid problems.

    searches

    It sounds like Apache used to and apparently still can do this:

    https://old.reddit.com/r/PHP/comments/102vqa2/why_does_apache_spew_a_new_process_for_each/

    But it does highlight one of the “LLMs don’t have a broad, deep understanding of the world, and that creates problems for coding” issues that people have talked about. Like, part of what someone is doing when writing software is identifying situations where behavior isn’t defined and clarifying that, either via asking for the requirements to be updated or via looking out-of-band to understand what’s appropriate. An LLM that works by looking at what’s commonly done in its training set just isn’t in a good place to do that, and that’s kinda a fundamental limitation.

    I’m pretty sure that the general case of writing software is AI-hard, where the “AI” referred to by the term is an artificial general intelligence that incorporates a lot of knowledge about the world. That is, you can probably make an AI that can write software, but it won’t be just an LLM of the “generative AI” sort that we have now.

    There might be ways that you could incorporate an LLM into software that can itself write software. But I don’t think that it’s just going to be a raw “rely on an LLM taking in a human-language set of requirements and spitting out code”. There are just things that that can’t handle reasonably.


  • Viess said field reports coming into her organization suggest the growth of death caps may be slowing in the Bay Area, while another kind of poisonous mushroom known as the destroying angel, or Amanita ocreata, is starting to pop up.

    Oh, great.

    https://en.wikipedia.org/wiki/Amanita_ocreata

    A. ocreata is highly toxic, and has been responsible for mushroom poisonings in western North America, particularly in the spring. It contains highly toxic amatoxins, as well as phallotoxins, a feature shared with the closely related death cap (A. phalloides), half a cap of which can be enough to kill a human, and other species known as destroying angels.[3][14] There is some evidence it may be the most toxic of all the North American phalloideae, as a higher proportion of people consuming it had organ damage and 40% perished.[15]




    I think that the problem will be if software comes out that doesn’t target home PCs. That’s not impossible. I mean, it happens today with Web services. Closed-weight AI models aren’t going to be released to run on your home computer. I don’t use Office 365, but I understand that at least some of that is a cloud service.

    Like, say the developer of Video Game X says “I don’t want to target a ton of different pieces of hardware. I want to tune for a single one. I don’t want to target multiple OSes. I’m tired of people pirating my software. I can reduce cheating. I’m just going to release for a single cloud platform.”

    Nobody is going to take your hardware away. And you can probably keep running Linux or whatever. But…not all the new software you want to use may be something that you can run locally, if it isn’t released for your platform. Maybe you’ll use some kind of thin-client software — think telnet, ssh, RDP, VNC, etc for past iterations of this — to use that software remotely on your Thinkpad. But…can’t run it yourself.

    If it happens, I think that that’s what you’d see. More and more software would just be available only to run remotely. Phones and PCs would still exist, but they’d increasingly run a thin client, not run software locally. Same way a lot of software migrated to web services that we use with a Web browser, but with a protocol and software more aimed at low-latency, high-bandwidth use. Nobody would ban existing local software, but a lot of it would stagnate. A lot of new and exciting stuff would only be available as an online service. More and more people would buy computers that are only really suitable for use as a thin client — fewer resources, closer to a smartphone than what we conventionally think of as a computer.

    EDIT: I’d add that this is basically the scenario that the AGPL is aimed at dealing with. The concern was that people would just run open-source software as a service. They could build on that base, make their own improvements. They’d never release binaries to end users, so they wouldn’t hit the traditional GPL’s obligation to release source to anyone who gets the binary. The AGPL requires source distribution to people who even just use the software.


  • I will say that, realistically, in terms purely of physical distance, a lot of the world’s population is in a city and probably isn’t too far from a datacenter.

    https://calculatorshub.net/computing/fiber-latency-calculator/

    It’s about five microseconds of latency per kilometer down fiber optics. Ten microseconds for a round-trip.
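    That number falls out of the refractive index of the fiber, roughly 1.47, so light in the glass moves at about c/1.47:

        # one-way microseconds of latency per kilometer of fiber
        echo "scale=2; 1.47 * 1000000 / 299792.458" | bc
        # ~4.90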

    I think a larger issue might be bandwidth for some applications. Like, if you want to unicast uncompressed video to every computer user, say, you’re going to need an ungodly amount of bandwidth.

    DisplayPort looks like it’s currently up to 80Gb/sec. Okay, not everyone is currently saturating that, but if you want comparable capability, that’s what you’re going to have to be moving from a datacenter to every user. For video alone. And that’s assuming that they don’t have multiple monitors or something.
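    For a rough sense of scale, even a single uncompressed 4K stream at 60 Hz and 24 bits per pixel is already in the double-digit gigabits:

        # gigabits per second for uncompressed 3840x2160 @ 60 Hz, 24 bits/pixel
        echo "scale=1; 3840 * 2160 * 60 * 24 / 10^9" | bc
        # ~11.9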

    I can believe that it is cheaper to have many computers in a datacenter. I am not sold that any gains will more than offset the cost of the staggering fiber rollout that this would require.

    EDIT: There are situations where it is completely reasonable to use (relatively) thin clients. That’s, well, what a lot of the Web is — browser thin clients accessing software running on remote computers. I’m typing this comment into Eternity before it gets sent to a Lemmy instance on a server in Oregon, much further away than the closest datacenter to me. That works fine.

    But “do a lot of stuff in a browser” isn’t the same thing as “eliminate the PC entirely”.


  • You could also just only use Macs.

    I actually don’t know what the current requirement is. Back in the day, Apple used to build some of the OS — like QuickDraw — into the ROMs, so unless you had a physical Mac, not just a purchased copy of MacOS, you couldn’t legally run MacOS, since the ROM contents were copyrighted, and doing so would require infringing on the ROM copyright. Apple obviously doesn’t care about this most of the time, but I imagine that if it becomes institutionalized at places that make real money, they might.

    But I don’t know if that’s still the case today. I’m vaguely recalling that there was some period where part of Apple’s EULA for MacOS prohibited running MacOS on non-Apple hardware, which would have been a different method of trying to tie it to the hardware.

    searches

    This is from 2019, and it sounds like at that point, Apple was leveraging the EULAs.

    https://discussions.apple.com/thread/250646417?sortBy=rank

    Posted on Sep 20, 2019 5:05 AM

    The widely held consensus is that it is only legal to run virtual copies of macOS on a genuine Apple made Apple Mac computer.

    There are numerous packages to do this but as above they all have to be done on a genuine Apple Mac.

    • VMware Fusion - this allows creating VMs that run as windows within a normal Mac environment. You can therefore have a virtual Mac running inside a Mac. This is useful to either run simultaneously different versions of macOS or to run a test environment inside your production environment. A lot of people are going to use this approach to run an older version of macOS which supports 32bit apps as macOS Catalina will not support old 32bit apps.
    • VMware ESXi aka vSphere - this is a different approach known as a ‘bare metal’ approach. With this you use a special VMware environment and then inside that create and run virtual machines. So on a Mac you could create one or more virtual Mac but these would run inside ESXi and not inside a Mac environment. It is more commonly used in enterprise situations and hence less applicable to Mac users.
    • Parallels Desktop - this works in the same way as VMware Fusion but is written by Parallels instead.
    • VirtualBox - this works in the same way as VMware Fusion and Parallels Desktop. Unlike those it is free of charge. Ostensible it is ‘owned’ by Oracle. It works but at least with regards to running virtual copies of macOS is still vastly inferior to VMware Fusion and Parallels Desktop. (You get what you pay for.)

    Last time I checked Apple’s terms you could do the following.

    • Run a virtualised copy of macOS on a genuine Apple made Mac for the purposes of doing software development
    • Run a virtualised copy of macOS on a genuine Apple made Mac for the purposes of testing
    • Run a virtualised copy of macOS on a genuine Apple made Mac for the purposes of being a server
    • Run a virtualised copy of macOS on a genuine Apple made Mac for personal non-commercial use

    No. Apple spells this out very clearly in the License Agreement for macOS. Must be installed on Apple branded hardware.

    They switched to ARM in 2020, so unless their legal position changed around ARM, I’d guess that they’re probably still relying on the EULA restrictions. That being said, EULAs have also been thrown out for various reasons, so…shrugs

    goes looking for the actual license text.

    Yeah, this is Tahoe’s EULA, the most-recent release:

    https://www.apple.com/legal/sla/docs/macOSTahoe.pdf

    Page 2 (of 895 pages):

    They allow only on Apple-branded hardware for individual purchases unless you buy from the Mac Store. For Mac Store purchases, they allow up to two virtual instances of MacOS to be executed on Apple-branded hardware that is also running the OS, and only under certain conditions (like for software development). And for volume purchase contracts, they say that the terms are whatever the purchaser negotiated. I’m assuming that there’s no chance that Apple is going to grant some “go use it as much as you want whenever you want to do CI tests or builds for open-source projects targeting MacOS” license.

    So for the general case, the EULA prohibits you from running MacOS on non-Apple hardware.





  • Milsim games involve heavy ray tracing

    I guess it depends on what genre subset you’re thinking of.

    I play a lot of milsims — looks like I have over 100 games tagged “War” in my Steam library. Virtually none of those are graphically intensive. I assume that you’re thinking of recent infantry-oriented first-person-shooter stuff.

    I can only think of three that would remotely be graphically intensive in my library: ArmA III, DCS, and maybe IL-2 Sturmovik: Battle for Stalingrad.

    Rule the Waves 3 is a 2D Windows application.

    Fleet Command and the early Close Combat titles date to the '90s. Even the newer Close Combat titles are graphically-minimal.

    688(i) Hunter/Killer is from 1997.

    A number of them are 2D hex-based wargames. I haven’t played any of Gary Grigsby’s stuff, but that guy is an icon, and all his stuff is 2D.

    If you go to Matrix Games, which sells a lot of more hardcore wargames, a substantial chunk of their inventory is pretty old, and a lot is 2D.