Off-and-on trying out an account over at @tal@oleo.cafe due to scraping bots bogging down lemmy.today to the point of near-unusability.

  • 20 Posts
  • 1.23K Comments
Joined 3 years ago
Cake day: October 4th, 2023

  • I would bet that a lot of the storage that AI companies are picking up isn’t for the model itself, but for storing the huge amount of information that they want to use as their training corpus.

    I’d bet that what they do is something like this:

    1. Download data and store in original form, non-destructively. This is probably not used incredibly frequently. When you see bots sucking down the whole Web, this is the sort of thing that is involved.

    2. Have some kind of filtered training corpus. This throws out a lot of stuff that is useless for training. This is generated from #1 by filtering software. It’s probably smaller than #1. Probably a lot smaller.

    3. Probably some sort of scored index is generated at this stage to put an estimate on how useful or reliable the data in step #2 should be considered; I’d assume that this is an input into the training.

    4. The generated model, generated via training.

    For the data in stage #1, I’d guess that AI companies might be able to use tapes. That being said, it might make sense to use faster storage if it accelerates the time to iterate on improving the filtering software.

    But, yeah, for the later stages, tapes probably aren’t gonna work.
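
    A minimal sketch of stages 1 through 3 as guessed at above, in Python. Everything here is a made-up placeholder for illustration: the `Document` type, the word-count filter in `keep`, and the unique-word ratio in `score` are assumptions, not anything an AI company is known to use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """One downloaded item, kept in original form (stage 1)."""
    url: str
    text: str


def keep(doc: Document, min_words: int = 5) -> bool:
    """Stage 2 filter (hypothetical rule): drop near-empty pages."""
    return len(doc.text.split()) >= min_words


def score(doc: Document) -> float:
    """Stage 3 score (hypothetical heuristic): share of unique words, 0.0 to 1.0."""
    words = doc.text.lower().split()
    return len(set(words)) / len(words) if words else 0.0


def build_corpus(raw: list[Document]) -> list[tuple[Document, float]]:
    """Derive the filtered, scored corpus without modifying the raw store."""
    filtered = [doc for doc in raw if keep(doc)]
    return [(doc, score(doc)) for doc in filtered]


raw_store = [
    Document("https://example.com/a", "buy buy buy buy buy buy"),
    Document("https://example.com/b", "hi"),
    Document("https://example.com/c", "tape libraries store cold data cheaply"),
]
corpus = build_corpus(raw_store)
```

    The raw store is only ever read, so it can sit on slow storage, and the filtered corpus can be regenerated from it whenever the filter changes.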


  • tal@lemmy.today to xkcd@lemmy.world · xkcd #3241: Horizontal Stabilizers

    They’re horizontal stabilizers. They serve a crucial aerodynamic role.

    https://en.wikipedia.org/wiki/1983_Negev_mid-air_collision

    In May 1983, two Israeli Air Force aircraft, an F-15 Eagle and an A-4 Skyhawk, collided in mid-air during a training exercise over the Negev region, in Israel. Notably, the F-15 (with a crew of two) managed to land safely at a nearby airbase, despite having its right wing almost completely sheared off in the collision. The lifting body properties of the F-15, together with its overabundant engine thrust, allowed the pilot to achieve this unique feat.[1]

    The F-15 started rolling uncontrollably after the collision and the instructor ordered an ejection. Nedivi, who outranked the instructor, decided not to eject and attempted recovery by engaging the afterburner, and eventually regained control of the aircraft. He was able to maintain control because of the lift generated by the large areas of the fuselage, stabilators, and remaining wing. Diverting to Ramon Airbase,[2] the F-15 landed at twice the normal speed to maintain the necessary lift, and its tailhook was torn off completely during the landing. Nedivi managed to bring his F-15 to a complete stop approximately 20 ft (6 m) from the end of the runway. He later told The History Channel, “it’s highly likely that if I had seen it clearly I would have ejected, because it was obvious you couldn’t really fly an airplane like that.”[4] He added, “Only when McDonnell Douglas later went to analyze it, they said, OK, the F-15 has a very wide [lifting] body; you fly fast enough and you’re like a rocket. You don’t need wings.”[3][4][5]

    Sometimes things aren’t as crucial as they might seem!


  • Use tape libraries for the moment, with hard drives acting as a cache for them? That doesn’t need to mean moving the whole backing storage to tape, just predicting what isn’t likely to be used soon and letting the storage format indicate “go look on tape for this item”. Obviously, that can result in much higher cold storage retrieval latency, but as long as you (a) do predictive fetching with a reasonably good algorithm and (b) have a lot of hard drives, which I’m sure that The Internet Archive does, I’d think that tape should be workable.

    https://en.wikipedia.org/wiki/Tape_library

    In computer storage, a tape library is a physical area that holds magnetic data tapes. In an earlier era, tape libraries were maintained by people known as tape librarians and computer operators and the proper operation of the library was crucial to the running of batch processing jobs. Although tape libraries of this era were not automated, the use of tape management system software could assist in running them.

    Subsequently, tape libraries became physically automated, and as such are sometimes called a tape silo, tape robot, or tape jukebox. These are storage devices that contain one or more tape drives, a number of slots to hold tape cartridges, a barcode reader to identify tape cartridges, and an automated method for loading tapes (a robot). Such solutions are mostly used for backups and for digital archiving. Additionally, the area where tapes that are not currently in a silo are stored is also called a tape library. One of the earliest examples was the IBM 3850 Mass Storage System (MSS), announced in 1974.

    In either era, tape libraries can contain millions of tapes.

    Physically automated tape library devices can store immense amounts of data, ranging from 20 terabytes[13] up to 2.1 exabytes of data[14] as of 2016.

    For large data-storage, they are a cost-effective solution, with cost per gigabyte as low as 2 cents USD.

    I’d also guess — though I don’t know for sure — that it’s probably a lot easier to scale up manufacturing of tapes than it is hard drives.

    EDIT: Does kind of make me wonder what the open-source options for tiered storage like that are. I’ve never really gone hunting, but it seems like there’d be a lot of commonality from place to place, and that for a lot of places that do it, it’s not really their core competency (that is, they just want to do something that deals with storing and processing lots of data, not that they really care principally about data storage).
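
    The hard-drives-as-cache-for-tape idea above can be sketched roughly like this in Python. It is a toy under stated assumptions: `TieredStore` is a hypothetical name, the "tape" is just a dict, eviction is plain least-recently-used, and `prefetch` takes whatever keys some predictor (not shown) thinks will be needed soon.

```python
from collections import OrderedDict


class TieredStore:
    """Toy two-tier store: a small LRU 'disk' cache in front of slow 'tape'."""

    def __init__(self, tape: dict[str, bytes], disk_slots: int) -> None:
        self.tape = tape                  # slow tier: holds every item
        self.disk: OrderedDict[str, bytes] = OrderedDict()  # fast tier
        self.disk_slots = disk_slots
        self.tape_reads = 0               # counts slow retrievals

    def _load(self, key: str) -> bytes:
        """Fetch an item from tape and cache it, evicting the oldest entries."""
        data = self.tape[key]
        self.tape_reads += 1
        self.disk[key] = data
        self.disk.move_to_end(key)
        while len(self.disk) > self.disk_slots:
            self.disk.popitem(last=False)  # drop least recently used
        return data

    def prefetch(self, keys: list[str]) -> None:
        """Pull predicted-hot items onto disk before anyone asks for them."""
        for key in keys:
            if key not in self.disk and key in self.tape:
                self._load(key)

    def read(self, key: str) -> bytes:
        """Serve from disk when possible; otherwise go look on tape."""
        if key in self.disk:
            self.disk.move_to_end(key)     # mark as recently used
            return self.disk[key]
        return self._load(key)


store = TieredStore({"a": b"1", "b": b"2", "c": b"3"}, disk_slots=2)
store.read("a")          # miss: goes to tape
store.read("a")          # hit: served from disk
store.prefetch(["b"])    # predicted access, loaded ahead of time
store.read("b")          # hit thanks to the prefetch
store.read("c")          # miss: loads "c" and evicts "a"
```

    How well this works in practice comes down to the two things already mentioned: how good the prediction feeding `prefetch` is, and how large `disk_slots` is relative to the working set.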


  • The Innioasis Y1 is one of a growing number of gadgets that seems engineered to take us back to a simpler, less perpetually-connected time. It’s an unabashed iPod Classic clone: click wheel, color screen, and all, with just enough modern concessions (USB-C charging, Bluetooth) to keep it from feeling like a museum piece.

    If you’re going to leave your smartphone at home and then take this, and not having the phone with you is your goal, okay, sure.

    But if you’re not, you’re just carrying an additional device to do something that the first device is quite capable of handling.



  • This is due to phishing attacks and account takeover attempts, not due to the platform itself being insecure.

    I mean, it’s not that Signal has security issues per se, but it doesn’t have the German government’s security people with control over what goes into releases, either.

    If you remember the wake of Signalgate, the US doesn’t allow American officials to use Signal for these communications, because it isn’t certified for transmitting classified information, and the government has its own app that officials are supposed to be using.

    On March 15, Secretary of Defense Pete Hegseth used the chat to share sensitive and classified details of the impending airstrikes, including types of aircraft and missiles, as well as launch and attack times.[1][2] The name of an active undercover CIA officer was mentioned by CIA director John Ratcliffe in the chat,[3] while Vance and Hegseth expressed contempt for European allies.[4][5]

    A forensic investigation by the White House information technology office determined that Waltz had inadvertently saved Goldberg’s phone number under Hughes’ contact information. Waltz then added Goldberg to the chat while trying to add Hughes.[15] Subsequently, investigative journalists reported Waltz’s team regularly created group chats to coordinate official work[16] and that Hegseth shared details about missile strikes in Yemen to a second group chat which included his wife, his brother, and his lawyer.[17]

    On March 18, 2025, the Pentagon sent a department-wide memo warning, “Please note: third party messaging apps (e.g. Signal) are permitted by policy for unclassified accountability/recall exercises but are NOT approved to process or store nonpublic unclassified information”—a category whose release would be far less potentially damaging than that about ongoing military operations.[27] A former NSA hacker said that linking Signal to a desktop app is one of its biggest risks, as Ratcliffe suggested he had done.[28]

    According to the article, German government information security people do that for Wire:

    Klöckner highlighted that Wire is already provided by the Bundestag administration and is certified by Germany’s Federal Office for Information Security (BSI).



  • Well, it’s more time to fix bugs and revise the hardware to cut costs or improve functionality. I mean, few engineers are going to say no to more time to fix their project. Maybe do a 2018 release and bump up some of the specs.

    One possibility is to release a small run of the current hardware at a higher price that accounts for the increased hardware component costs as a “limited prerelease”. That has the downside that it won’t be specifically targeted by game developers, which is one perk of a console-like hardware release. Valve should also make it clear that there’s going to be a full release later that may have updated specs and will have a lower price. That gets some feedback from people and lets users who really want a living room PC now and don’t care about the price or whether developers are specifically targeting it get one. I don’t think that it’ll do very well given that it’d lack economy of scale and the high price, and having another platform will add to Valve’s cost of maintenance, but…shrugs it might be considered worthwhile.








  • So would it be possible for a whole bunch of people to DDoS the IPv4-to-IPv6 translation of Google and other big popular websites, such that their services would still function over IPv6 but make everyone’s day awful if running IPv4? Enough angry customers and pissed-off users seems like a very effective way to get ISPs and mobile service providers to get their act together and start issuing IPv6 to people.

    Trying to DDoS attack Google’s IPv4 services to get your mobile provider to provide IPv6 support seems kind of…indirect.