Bing

prompt:

Food product. Ready to eat meal that cheap and save time on preparation. Dried food. Feature children in ad. Dinner table, family. The ad should be black and white. Visual of family in the ads should be vintage (1980s) era. Fallout styles ads.

  • tal@lemmy.today
    4 months ago

    I think that there’s some way to get either Midjourney or Bing to reliably produce human-specified text labels correctly; I’ve seen people do it in some images on here.

    I use Stable Diffusion, don’t know of a way to do that – I tend to end up with some missing letters and such too – but I’m pretty sure that I’ve seen some people here consistently pull it off on some proprietary AI image generator, where they specify the text.

    EDIT: Well, there’s ControlNet in SD, but that’s kind of a time-intensive way to do it. You’d need to create the outline of the text ahead of time. It can be useful if you want some elaborate effect that is easy for SD but hard for an image editor, like text formed out of a cloud or something. But for simple text like this, in SD, it’s probably easier to just remove the text via one of various methods and then re-add the desired text in an image editor.

    I do hope that improving on this is one of the next things to be generally rolled out; it’s pretty impressive how well existing systems can select and incorporate text. I just want a mechanism that allows more control over specifying what text shows up.

    If anyone here does regularly embed text in their images, what system do you use, and how do you do it?

    • Altima NEO@lemmy.zip
      4 months ago

      I’ve had pretty good success with FLUX, but it can also spit out gibberish. Usually takes a few attempts.

    • Usernameblankface@lemmy.world
      4 months ago

      With Bing, I put any words I want to see in quotation marks and run the prompt over and over until I get one that works. Fewer words tend to work better. A string of 5 or 6 words long usually takes multiple tries. Longer than that might not happen at all.

      • tal@lemmy.today
        4 months ago

        I guess that brute-forcing can work.

        For images with multiple passages of text, like this one, you can maybe combine that with inpainting on image generators that provide it (so that once you get one piece of text the way you want it, you can leave it alone and go generate the others).

        There’s a technique I saw someone use, not to solve this problem but to remove text; I was commenting on it a few days ago. Basically, there’s good OCR software out there, and it’s capable of detecting text of various sorts. So detextify just keeps running OCR software on a generated image to detect text, gets the bounding box of the text from the OCR software, and then re-runs an inpaint on that bounding box until the OCR software can’t detect any text. It’s not incredibly compute-efficient, but it is cheap in terms of human time.
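
Roughly, that remove-the-text loop might look like the sketch below. This is not detextify's actual code. `detect_text_boxes` and `inpaint` are hypothetical callables standing in for whatever OCR tool and inpainting model you have on hand, and `max_rounds` is an assumed safety cap so the loop cannot run forever.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), an assumed layout


def remove_text(
    image,
    detect_text_boxes: Callable[[object], List[Box]],  # hypothetical OCR wrapper
    inpaint: Callable[[object, Box], object],          # hypothetical inpaint wrapper
    max_rounds: int = 20,                              # assumed safety cap
):
    """Re-inpaint every region where OCR finds text until none is left.

    Returns (image, clean), where clean is True if OCR no longer finds text.
    """
    for _ in range(max_rounds):
        boxes = detect_text_boxes(image)
        if not boxes:
            return image, True
        # Repaint each detected region, then let the next round re-check.
        for box in boxes:
            image = inpaint(image, box)
    # Ran out of rounds: report whether the last pass happened to succeed.
    return image, not detect_text_boxes(image)
```

The cost is all compute: each round is one OCR pass plus one inpaint per detected box, with no human in the loop.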

        I suppose that as long as the OCR software can handle actually reading the text, it might be possible to use a similar technique, but instead of repeating until the OCR software is unable to find text, repeating until it finds text that matches the desired string.
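
A minimal sketch of that match-the-string variant, untested against any real generator: `generate` and `read_text` are hypothetical callables (for example, an inpaint of one bounding box and an OCR pass over the result), and `max_tries` is an assumed cap. The `normalize` step is also an assumption, there so that case and punctuation differences in the OCR output don't count as misses; it only keeps ASCII letters and digits.

```python
import re
from typing import Callable


def normalize(text: str) -> str:
    """Lowercase, turn anything but a-z, 0-9 and spaces into a space, collapse spaces."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def generate_until_text_matches(
    generate: Callable[[], object],       # hypothetical: produce or inpaint an image
    read_text: Callable[[object], str],   # hypothetical: OCR the image to a string
    target: str,
    max_tries: int = 50,                  # assumed cap on attempts
):
    """Regenerate until OCR reads the target string.

    Returns (image, attempts) on success, or (None, max_tries) on failure.
    """
    wanted = normalize(target)
    for attempt in range(1, max_tries + 1):
        image = generate()
        if normalize(read_text(image)) == wanted:
            return image, attempt
    return None, max_tries
```

For an image with several passages of text, this could be run once per bounding box, leaving the already-correct regions alone.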