Hi!

While I really enjoy seeing so many people being accommodating to people with disabilities, I find manually transcribing every image I post very tiring.

I thought I could at least use some sort of AI to help with image transcripts, though that could probably be better used by the actual person with the disability.

So that’s the question: should I skip transcribing an image, or let an AI do it?

  • forestbeasts@pawb.social
    English
    arrow-up
    1
    ·
    47 minutes ago

    Do not.

    Please just don’t.

    People (hi I’m people) need what the image IS, what’s important about it, why you included it. Not just what some slop generator shat out about it.

    Better to have nothing, which is at least honest, than to have something that PURPORTS to have meaning but then just, doesn’t.

    – Frost

  • Rimu@piefed.social
    English
    arrow-up
    2
    ·
    6 hours ago

    If I were blind I’d prefer it if the app just hid all image posts from me. The alt text, when it exists, is going to be trash most of the time anyway.

  • Kierunkowy74@piefed.zip
    English
    arrow-up
    4
    ·
    edit-2
    9 hours ago

    Check the output, as it may be less accurate than what you would write yourself.

    AI can describe a photo extensively, like those published on !pics@lemmy.world, but it fails to see which part of it is actually important, or to recognise the point of a meme. It will save you many keystrokes, but the result will probably still need to be corrected manually.

  • placebo@lemmy.zip
    English
    arrow-up
    6
    ·
    11 hours ago

    AI is great for this. We shouldn’t put people with disabilities at a disadvantage because of the anti-AI hysteria.

  • Lumidaub@feddit.org
    English
    arrow-up
    24
    ·
    18 hours ago

    If you can get an AI to produce an actually useful description, that would be extremely interesting. However, AIs don’t know what’s important about an image and will fill up the description with useless information, effectively spam for the person that needs a description.

    Write just a sentence, describe the thing that is important, while keeping in mind why you’re even posting the image, and it’s going to take less time than asking the AI.

      • Lumidaub@feddit.org
        English
        arrow-up
        15
        ·
        18 hours ago

        True, and one sentence written by a human who understands the image is better than twenty sentences by a word prediction machine.

        • HappyFrog@lemmy.blahaj.zone
          English
          arrow-up
          11
          ·
          17 hours ago

          No matter how good human-written descriptions are, people just won’t do them. So having an automated system is much preferable.

          • Lumidaub@feddit.org
            English
            arrow-up
            7
            ·
            17 hours ago

            I know what you’re saying but I truly think for most people it’s simply that they’re overthinking it. They think every single thing needs to be in the description, with references explained and sourced and whatnot. That does sound exhausting. And I have written a handful of descriptions like that for pictures where I thought the details were interesting enough to justify the effort. But really, a simple “The thirteenth Doctor and Rose Tyler embracing and deeply kissing” is already very sufficient in most cases (add “standing on an asteroid in front of a field of glittering stars - digital colour painting” if you have the spoons). So imho it’s better to educate them and encourage short, concise descriptions than to give in to the slop.

        • x74sys@programming.devB
          English
          arrow-up
          7
          ·
          edit-2
          18 hours ago

          Yeah, apart from the fact that I imagine that people who need alt text don’t appreciate LLM output. It’s very boring. It’s either extremely technical and ice-cold or so cringe that you have to stop reading. Just what I think.

          At least for me, if I realize that I’m reading an AI blog article or AI generated text in some other form, I don’t read it.

  • hendrik@palaver.p3x.de
    English
    arrow-up
    13
    ·
    17 hours ago

    I’d ask someone who needs these transcriptions first. I tend more towards “Nay”. I mean if they want AI transcriptions, I guess they could just run their own AI. And that way they get to choose between human and AI ones. I’m kind of against flooding the internet with AI content as long as the recipients can do it themselves.

    • Lumidaub@feddit.org
      English
      arrow-up
      9
      ·
      17 hours ago

      That’s a good point but wouldn’t it be preferable to have one AI run one time instead of several of them doing the work again and again?

      (Assuming that we’re even okay with AI generated descriptions in the first place which I’m not for reasons I’ve laid out in my other comments but I’m talking hypothetically)

      • hendrik@palaver.p3x.de
        English
        arrow-up
        2
        ·
        edit-2
        8 hours ago

        Really hard to tell. I mean there are situations in which people think they’re doing someone a favour, but they’re really not. Upside of doing it individually is: affected people get to pick the model they like best. And they can prompt it however they like. Whether your pre-generated stuff is on the same level or more of a disservice depends a bit on your expertise in the matter. Upside of pre-generating it once is: maybe a bit less CO2 in the atmosphere and a few fewer trees killed. But that certainly depends on how many people read those descriptions. If there are just 2 people with screenreaders out there, who don’t even click on all the images, you might very well be wasting compute. And have a negative balance on the environment.
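
        The trade-off above can be put as a rough break-even check. This is only a sketch under a strong assumption: every model run costs about the same, whether the poster or a reader triggers it. The function names and parameters are made up for illustration.

```python
def runs_on_demand(readers: int, open_fraction: float) -> float:
    """Expected model runs per image if every reader who opens it
    generates their own description (hypothetical cost model)."""
    return readers * open_fraction


def pregenerating_saves_compute(readers: int, open_fraction: float) -> bool:
    """Pre-generating costs exactly one run per image, so it only
    saves compute if on-demand use would need more than one run."""
    return runs_on_demand(readers, open_fraction) > 1.0


# Two screenreader users who open a quarter of the images:
# on-demand needs 0.5 runs per image on average, so pre-generating wastes compute.
print(pregenerating_saves_compute(readers=2, open_fraction=0.25))
```

        With many readers the balance flips, which is why the answer depends on the audience.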

      • Meldrik@lemmy.wtf
        English
        arrow-up
        4
        ·
        edit-2
        13 hours ago

        Alternatively, it’s built into the platform. So when someone uploads an image to Lemmy a local AI model does the description.

        Edit: Then it could even be marked as AI generated and people could choose to be exposed to it or not.
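
        A minimal sketch of how such a marker and opt-in could look. The `AltText` type, the `ai_generated` flag and the `allow_ai` preference are hypothetical names for illustration, not existing Lemmy fields or settings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AltText:
    """Image description together with where it came from."""
    text: str
    ai_generated: bool  # hypothetical provenance flag set at upload time


def visible_alt_text(alt: Optional[AltText], allow_ai: bool) -> Optional[str]:
    """Return the description a reader should get, or None if there is
    none or the reader has opted out of AI-generated descriptions."""
    if alt is None:
        return None
    if alt.ai_generated and not allow_ai:
        return None
    # Label machine output so it is never mistaken for a human description.
    label = "[AI-generated] " if alt.ai_generated else ""
    return label + alt.text
```

        Human-written descriptions pass through unchanged, so readers who opt out of AI text still get everything a person wrote.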

  • vala@lemmy.dbzer0.com
    English
    arrow-up
    1
    ·
    10 hours ago

    You have a unique advantage over a vision-impaired person in using AI for this: if the generated text is wrong, you know and can correct it.

  • x74sys@programming.devB
    English
    arrow-up
    12
    ·
    edit-2
    18 hours ago

    In my opinion, no. It has to be heavily curated. You’re not saving yourself a lot of work if you have to read it word by word (and probably correct stuff) anyway.

    I think just one very short sentence describing what’s on there (it doesn’t have to be detailed) is a lot better than whatever an LLM will give you.

  • FaceDeer@fedia.io
    arrow-up
    4
    ·
    15 hours ago

    Give it a test and see how accurate it is; if it’s good enough, then go ahead. People have been using AI-based OCR for literal decades already, and nothing has fundamentally changed. There’s just a sudden moral panic about it lately.

  • Tamlyn@lemmy.zip
    English
    arrow-up
    7
    ·
    18 hours ago

    A lot of artists don’t want their art to be used by AI. You can’t prevent that if you let AI summarize your images. So don’t use AI for that.

    • Gonzako@lemmy.world (OP)
      English
      arrow-up
      4
      ·
      18 hours ago

      I was actually thinking of using a self-hosted LLM for these tasks. I wanna dig into it again, and I’ve got access to computers on the cheap.

    • Lumidaub@feddit.org
      English
      arrow-up
      3
      ·
      18 hours ago

      Those are different mechanisms. Object recognition doesn’t mean the AI is now trained on the image and can reproduce it (which is btw why AI can still “visually” recognise what’s in an image that has been nightshaded/glazed).

      • Sir. Haxalot@nord.pub
        English
        arrow-up
        3
        ·
        18 hours ago

        This is true, but it’s also important to remember that if you use an AI model hosted by the same party that trains it, it’s likely that they will pass any data you input to the training stage, unless you have an enterprise contract regulating training use.

        OP mentioned he will use a self-hosted LLM though, and in that case there’s no risk of the data being used for training.

        • Lumidaub@feddit.org
          English
          arrow-up
          1
          ·
          18 hours ago

          I mean, if you put any image online that hasn’t been protected/poisoned in some way, you have to (unfortunately) assume it’s in some AI’s training data anyway. If the tradeoff for a useful description (! See my other comments about the lack of usefulness) is that an image is also fed into one more training corpus, that would be worth a thought, imho. If the image is protected/poisoned, I’d indeed encourage this whole hypothetical process, just to further sabotage the data.

  • rako@tarte.nuage-libre.fr
    Français
    arrow-up
    6
    ·
    18 hours ago

    “Using AI for”

    no

    “I find it tiring”

    The problem with disabled people isn’t the disability, it’s the behaviour of non-disabled people putting them under, willingly or not. You being tired of that is actively putting them under. Yes, it’s tiring to take care of people, it’s work. There’s no going around that. Treating people as equals requires taking care of them, and until you take that as normal (just like brushing your teeth or doing the laundry or sweeping the floor at your place is work, but you still do it) you will be belittling them.

    The change needs to happen on your side, on your conception of humanity and society. AI is not going to help you

    • Gonzako@lemmy.world (OP)
      English
      arrow-up
      4
      ·
      18 hours ago

      Yeah, but you can’t preemptively take care of everyone. For example, Satisfactory’s arachnophobia mode wouldn’t exist if it weren’t for the fact that one of the devs couldn’t work on it otherwise.

      Time and effort are a limited resource.

      • rako@tarte.nuage-libre.fr
        Français
        arrow-up
        4
        ·
        17 hours ago

        There is a huge difference between not taking care because it’s not important to you, and not taking care because you can’t. It’s a cop-out to mix up the two.

        It’s completely ok to acknowledge that you can’t do it, and to ask around for others to take over for you. That’s society at work doing good things for all of us, and that’s how we get out of all this mess. It’s perfectly fine!

  • technocrit@lemmy.dbzer0.com
    English
    arrow-up
    2
    ·
    edit-2
    14 hours ago

    There’s no real problem here because “AI” doesn’t exist. A transcript program is certainly not “intelligent” or even “artificial” in any meaningful sense.

    So, if you want to use an automated transcription program, I don’t see why not. Just check that it’s fairly accurate and not somehow nefarious.

    • Meldrik@lemmy.wtf
      English
      arrow-up
      3
      ·
      13 hours ago

      Not sure you and the OP are on the same page? Or maybe I’m not.

      OP is talking about alternative text for images, for people who can’t see. The alternative text is a description of the image. I’m not sure how you could achieve automated alternative text without AI?

      If you are talking about OCR, even that is AI powered.

      • forestbeasts@pawb.social
        English
        arrow-up
        1
        ·
        43 minutes ago

        Eh? There’s plenty of non-“AI”-powered OCR, isn’t there? Like, that’s been a thing since long before “AI” slop generators.

        (Like, mayyyybe there’s some kind of machine learning component, but even IF there is, surely you don’t have to run it through a slop generator to get a transcription?)

  • Petersson@feddit.org
    English
    arrow-up
    6
    ·
    18 hours ago

    To me personally, “AI” is a slur for profit-driven generative bs. The concept it’s based on is great. I love pattern recognition and all the possible use cases for machine learning when it comes to science, material research, …

    tl;dr: Go for it.