Google apologizes for “missing the mark” after Gemini generated racially diverse Nazis

It acknowledged ‘inaccuracies’ in historical prompts.

  • Ferk@kbin.social
    10 months ago

    While the result of generating an image through AI is not meant to be “factually” accurate, it does seek to be as accurate as possible in matching the prompt it is given. A prompt like “1943 German Soldier”, “US Senator from the 1800s” or “Emperor of China” carries implications about which kinds of images would be expected and which wouldn’t, just as you wouldn’t expect a lightsaber when asking for “medieval swords”.

    I’m not convinced that attempting to “balance a biased training dataset” in the way that this is apparently being done is really attainable or worthwhile.

    An AI can only work based on biases, and it’s impossible to correct or balance the dataset without introducing a different bias, because the model is just a collection of biases that discriminate between how different descriptions relate to pictures. If there were no bias for the AI to rely on, it would not be able to pick anything to show.

    For example, the AI does not know whether the word “Soldier” really corresponds to someone dressed like the person in the picture; it’s just biased to expect that. It can’t tell whether an actual soldier might just be wearing pajamas or whether someone dressed in that uniform might not be an actual soldier.

    Describing a picture is, in itself, an exercise in assumptions, biases and appearances, all based on preconceived notions of what we expect when comparing the picture to our own reality. So the AI needs to show whatever corresponds to those biases in order to match, as accurately as possible, our biased expectations of what those descriptions mean.

    Suppose the dataset is complete enough, yet it’s biased to show predominantly a particular gender or ethnicity when asked for “1943 German Soldier” because that happens to be the most common image of what a “1943 German Soldier” is. If you want a different ethnicity or gender, then add that ethnicity or gender to the prompt (like you said in the first point), instead of supporting the idea of having the developers force diversity into the results in a direction that contradicts the dataset just because the results aren’t politically correct. It would be more honest to add a disclaimer and still show the result as it is, instead of manipulating it in a direction that actively pushes the AI to hallucinate.

    Alternatively: expand your dataset with more valuable data in a direction that does not contradict reality (e.g. introduce more pictures of soldiers of different ethnicities from situations that are actually found in our reality). You’d be altering the data, but without distorting the bias unrealistically, since the new examples would be grounded in reality.