I added "Vision" to my "OpenAiApps". It's interesting, and you can use it for a lot. Analysis, info, comparisons...my app is very simple and I have not made any "follow-ups". I.e. you cannot follow-up with any questions, but below are some examples of usage.

OCR /dataanalysis

If you have a pdf or image-data that you want in some dataformat. In my case: csv/semicolon-seperated data:

Prompt: In the picture theres a table with data. Can you take the first 3 columns in the table and create a new table in csv data. i.e data that is semi-colon, seperated? If you cannot do that, make it as html-data.

Using Vision as OCR

I've only tried out with a couple of examples/pdf's that were converted to JPEG-format, but it works beautifully.

What is?

Straight forward:

My application is very modularized. The "gpt-4o" app is just chat. Vision is not. There are no follow-up built in. But I get my answer.

Descriptions
Below is a description of an image created by Dalle-2. It's not really good(the image I mean), but interesting enough:

screenshot of openAiApps

The text that vision created from that image is the following:

This image depicts a ghost-like figure that appears translucent and emits a glowing, bluish-green light. The figure stands in a room near an open door that leads to a brighter area. The walls and furniture in the room are in darker, subdued colors, which contrasts with the brightly illuminated figure. This representation creates a spooky, supernatural effect, often associated with depictions of ghosts or otherworldly beings in popular culture.

 

I used this to create a description (or prompt if you like) to create another image in Dalle-E 3. The result is the following:

ghost, dall-e-3

This is a very good starting point. I have not added any descriptions of the environment or style, so it made whatever it feels fit, which is actually close to the dall-e 2 created image. 

Of course there are other usages of this, but on the hobby-level this is what I use it for. My implementation is very basic. I upload a file and I get a description back. That's it. I haven't had time to look further into it. Here's another test. I used vision on the pic that was on the blog-articles-site. It gave me the following desrciption:

This is a colorful and vibrant abstract artwork that features a blend of swirling shapes, bold lines, and various hues. The image incorporates a radiant light source at its center, which creates a focal point and extends outward in bright, warm tones of yellow, orange, and red. The right side of the image includes a profile of a human-like face, composed of flowing lines and shapes, merging seamlessly with the surrounding textures and colors. This face gives the impression of being part of the ethereal and dynamic atmosphere of the artwork. The composition includes a mix of cool blues and warm reds and oranges, contributing to the overall energetic and imaginative feel of the piece.

So i used that as a description to make another variant of the splash-screen-picture using Dalle-E 3:

vision-description

Nice actually. The descriptions that I have asked for are "promptless". I used the following prompt to get a more detailed description:

Can you give a detailed description of this image so that I can create it again with dall-e 3?

It gave med the following:

The artwork is a vibrant and dynamic abstract piece with a strong surrealistic influence. Here’s a detailed description to recreate it using DALL-E 3:

1. **Color Palette and Style:**
   - The image is rich in color, combining warm tones (reds, oranges, yellows) with cool tones (blues, purples, greens).
   - The style is abstract and surreal, with smooth gradients and flowing lines.

2. **Central Elements:**
   - **Sunburst:** At the center-left of the image, there is a bright, radiant sunburst. The sunburst has a white core, radiating outwards with beams of yellow and orange light. 
   - **Profile Silhouette:** On the right side of the image, there is the faint, ethereal profile of a human face integrated into the flowing lines and forms. The face is looking towards the sunburst.
   - **Flowing Lines and Patterns:** The image contains numerous flowing lines and swirling patterns that create a sense of movement, especially around the central sunburst and the profile silhouette.

3. **Upper Section:**
   - The upper section contains vertical lines dripping downwards, which add to the abstract nature of the image.
   - These lines transition into various colors, mirroring the overall gradient scheme from warm to cool tones.

4. **Lower Section:**
   - The lower part of the image has swirling, wave-like patterns that seem to move towards and away from the light source. 
   - The patterns are rich in color, combining dark blues and reds with the golden hues from the light.

5. **Miscellaneous Details:**
   - There are several small orbs and circles scattered throughout the image, enhancing the surreal and fantastical aesthetic.
   - The clouds and mist-like formations create a dreamy, otherworldly atmosphere.

6. **Texture:**
   - The texture appears smooth and slightly glossy, giving the colors a luminous quality.

When inputting the prompt into DALL-E 3, you might include elements such as: "abstract surreal vibrant artwork with a central sunburst, ethereal human profile, flowing lines, and patterns, combining warm and cool tones, vertical dripping lines, swirling lower section, small orbs, and smooth glossy texture."

I did as it told me an created this prompt in Dall-E-3:

Create an image with the following points:
The artwork is a abstract surreal vibrant artwork with a central sunburst, ethereal human profile, flowing lines, and patterns, combining warm and cool tones, vertical dripping lines, swirling lower section, small orbs, and smooth glossy texture:

1. **Color Palette and Style:**
   - The image is rich in color, combining warm tones (reds, oranges, yellows) with cool tones (blues, purples, greens).
   - The style is abstract and surreal, with smooth gradients and flowing lines.

2. **Central Elements:**
   - **Sunburst:** At the center-left of the image, there is a bright, radiant sunburst. The sunburst has a white core, radiating outwards with beams of yellow and orange light. 
   - **Profile Silhouette:** On the right side of the image, there is the faint, ethereal profile of a human face integrated into the flowing lines and forms. The face is looking towards the sunburst.
   - **Flowing Lines and Patterns:** The image contains numerous flowing lines and swirling patterns that create a sense of movement, especially around the central sunburst and the profile silhouette.

3. **Upper Section:**
   - The upper section contains vertical lines dripping downwards, which add to the abstract nature of the image.
   - These lines transition into various colors, mirroring the overall gradient scheme from warm to cool tones.

4. **Lower Section:**
   - The lower part of the image has swirling, wave-like patterns that seem to move towards and away from the light source. 
   - The patterns are rich in color, combining dark blues and reds with the golden hues from the light.

5. **Miscellaneous Details:**
   - There are several small orbs and circles scattered throughout the image, enhancing the surreal and fantastical aesthetic.
   - The clouds and mist-like formations create a dreamy, otherworldly atmosphere.

6. **Texture:**
   - The texture appears smooth and slightly glossy, giving the colors a luminous quality.

The result was the following: First pick is before I saw what it's instructions were and on the second I did as it told me:

Vision, details description
Vision details, first draft without following instructions
Vision details 2
Vision details, following instructions from "Vision"

Thing is that it's a tool. And it can be very helpful.