• tal@lemmy.today
    3 hours ago

    GitHub explicitly asked Homebrew to stop using shallow clones. Updating them was “an extremely expensive operation” due to the tree layout and traffic of homebrew-core and homebrew-cask.

    I’m not going through the PR to understand what’s breaking, since it’s not immediately apparent from a quick skim. But here are three possible problems, based on what people are mentioning there.

    The problem is the cost of the shallow clone

    Assuming that the workload here is always --depth=1, that commits land at a low rate relative to clones, and that a shallow clone is an expensive operation for git, I feel like a better solution for GitHub would be a patch to git that lets it cache the depth=1 shallow clone for a given commit hash.
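To make the caching idea concrete, here's a toy model in Python. It is not how git or GitHub actually serve clones; `ShallowPackCache` and `build_pack` are hypothetical names, and the "pack" is just a byte string standing in for whatever the server would send for a depth=1 clone of a given tip commit.

```python
from typing import Callable, Dict


class ShallowPackCache:
    """Toy model: memoize the depth=1 pack for each tip commit hash."""

    def __init__(self, build_pack: Callable[[str], bytes]) -> None:
        # build_pack is a hypothetical stand-in for the expensive
        # server-side work of assembling a depth=1 pack.
        self._build_pack = build_pack
        self._packs: Dict[str, bytes] = {}
        self.builds = 0  # how many times the expensive path ran

    def get(self, tip: str) -> bytes:
        # Only the first request for a given tip pays the build cost;
        # later requests for the same tip reuse the stored pack.
        if tip not in self._packs:
            self._packs[tip] = self._build_pack(tip)
            self.builds += 1
        return self._packs[tip]
```

If commits are rare relative to clones, almost every request hits the same tip, so the expensive build runs roughly once per commit rather than once per clone.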

    The problem is the cost of unshallowing the shallow clone

    If the actual problem isn’t the shallow clone itself, so that a regular clone would be fine, but unshallowing is what’s expensive, then a patch to git that allows more-efficient unshallowing should be a better solution. I mean, I’d think that unshallowing should only need a time-ordered index of the blobs referenced by commits up to a given point. That shouldn’t be that expensive an index for git to maintain, if it doesn’t already have one.

    The problem is that Homebrew has users repeatedly unshallowing a clone off GitHub and then blowing it away and repeating

    If the problem is that people keep repeatedly doing a clone off GitHub (that is, a regular, non-shallow clone would also be problematic), I’d think that a better solution would be to have Homebrew keep a local bare clone as a cache, fetch into that cache to update it, and then use it as a reference when creating the new clone. If Homebrew treats the fresh clone as read-only and the cache can be relied upon to remain, then they could use --reference alone. If not, then add --dissociate. I’d think that that’d lead to better performance anyway.
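A rough sketch of that cache-and-reference flow, as a Python wrapper around the git CLI. This is not what Homebrew does; `plan_clone`, `run_plan`, and the paths are made up for illustration. One detail: a bare repository has no working tree, so the cache is refreshed with `git fetch` and an explicit refspec, since `git clone --bare` doesn't configure one.

```python
import subprocess
from pathlib import Path
from typing import List


def plan_clone(url: str, cache: Path, dest: Path,
               dissociate: bool = False) -> List[List[str]]:
    """Return the git commands for a cache-backed clone (hypothetical helper)."""
    cmds: List[List[str]] = []
    if cache.exists():
        # Refresh the existing bare cache. The refspec is spelled out
        # because a plain bare clone has no default fetch refspec.
        cmds.append(["git", "-C", str(cache), "fetch", "--prune",
                     "origin", "+refs/heads/*:refs/heads/*"])
    else:
        # First run: create the bare cache.
        cmds.append(["git", "clone", "--bare", url, str(cache)])

    clone = ["git", "clone", "--reference", str(cache)]
    if dissociate:
        # Copy the borrowed objects so the new clone survives
        # the cache being deleted later.
        clone.append("--dissociate")
    clone += [url, str(dest)]
    cmds.append(clone)
    return cmds


def run_plan(cmds: List[List[str]]) -> None:
    # Run each command in order, stopping on the first failure.
    for cmd in cmds:
        subprocess.run(cmd, check=True)
```

Without `--dissociate`, the new clone borrows objects from the cache through its alternates file, so it breaks if the cache goes away. That's why plain `--reference` is only safe when the cache is guaranteed to stick around.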