While experimenting with an image-to-text-to-GPT workflow, I wondered whether you could take just images and turn them into a short story.

Here is the story of... that story.

I went down this rabbit-hole while working out how to use AI to create image captions.

Step 1 - Selecting images

The first step was to find some images.

There was no real thought about the story I wanted to create.

I was just trying to answer the question: can GPT-4 take three unrelated photos and weave them into a story?

I used Unsplash for the images.

Image 1: Woman looking at screen

I used this on a pitch deck years ago (we won the pitch!). Seemed like it might be as good as any to start with.

Image 2: Woman on top of mountain

I decided the female character could be an ongoing theme. So let's use an image of her on a mountain.

Image 3: Dog with computer

I'm not entirely sure why I went for this, but felt like including a cute pet to mix things up a little. In hindsight I may have overreached by using technology in the 1st and 3rd images.

Step 2 - Generating prompts from images

I used the describe function in Midjourney to create descriptions of the images.

Example of Midjourney description for image 3

Here are the image prompts I went with.

image 1: "mother and baby sitting on couch reading on tablet, in the style of multiple screens, anglocore, humanist approach, photo taken with fujifilm superia, dark emerald and gray, contemporary fact versus fiction, selective focus --ar 146:97"
image 2: "a woman is standing on a ledge in badlands national park while watching a sunset, in the style of mind-bending murals, instax, indigo and gray, spectacular backdrops, cabincore, naturecore, stock photo --ar 146:97"
image 3: "a dog with an open laptop sitting on a table, in the style of light gray and dark brown, photo taken with provia, havencore, nature-inspired, tactile surfaces, furaffinity, stockphoto --ar 146:97"

There was no preferential selection for any image based on perceived quality or relevance. It was just the first output for each image.
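The descriptions above still carry Midjourney's trailing aspect-ratio parameter (`--ar 146:97`), which has nothing to do with the story. I left them as they were, but if you wanted to tidy them up before passing them to ChatGPT, a minimal sketch could look like this. The function name is hypothetical, and this cleanup was not part of the original experiment.

```python
def strip_midjourney_params(description):
    """Drop trailing Midjourney parameters such as '--ar 146:97'."""
    # Assumption: parameters come at the end of the prompt and start at the
    # first ' --'. Single hyphens (e.g. 'nature-inspired') are left alone.
    text = description.split(" --", 1)[0]
    return text.strip()


raw = (
    "a dog with an open laptop sitting on a table, in the style of light gray "
    "and dark brown, photo taken with provia, havencore, nature-inspired, "
    "tactile surfaces, furaffinity, stockphoto --ar 146:97"
)
print(strip_midjourney_params(raw))
```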

Step 3 - Plugging the images into ChatGPT

The prompt. Nothing too exciting here. First set the context, then give it the image descriptions.
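The exact wording of the prompt isn't reproduced here, so the following is only a rough sketch of that "context first, then image descriptions" shape. The function name and the context sentence are assumptions for illustration, not the prompt that was actually used, and the descriptions are shortened versions of the ones from Step 2.

```python
def build_story_prompt(descriptions, context=None):
    """Assemble a prompt: context first, then the numbered image descriptions."""
    if context is None:
        # Assumed wording, purely illustrative.
        context = (
            "You are a creative writer. Using the image descriptions below, "
            "write a short story overview that ties them together."
        )
    lines = [context, ""]
    for number, description in enumerate(descriptions, start=1):
        lines.append(f'image {number}: "{description}"')
    return "\n".join(lines)


descriptions = [
    "mother and baby sitting on couch reading on tablet",
    "a woman is standing on a ledge in badlands national park while watching a sunset",
    "a dog with an open laptop sitting on a table",
]
prompt = build_story_prompt(descriptions)
print(prompt)
```

The resulting text can be pasted straight into a ChatGPT session.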

And here is the output:

Story Overview: "The Digital Nature Chronicles"

In a world where technology has seamlessly blended with nature, a young mother, Amelia, and her baby, Max, discover an ancient tablet containing cryptic messages (image 1). The tablet's content hints at a mysterious power hidden within the Badlands National Park.

Seeking adventure and answers, Amelia ventures to the park, leaving Max in the care of their intelligent and loyal dog, Baxter (image 3). Baxter, adept at using technology, stays in touch with Amelia through the open laptop, providing guidance and support from afar.

While exploring the park, Amelia comes across a breathtaking ledge at sunset (image 2) and experiences a mind-bending revelation about the harmonious coexistence of nature and technology. This epiphany unlocks the hidden power mentioned in the ancient tablet, forever changing the way humans interact with their surroundings.

The story will follow Amelia's quest for understanding, Max's growth under Baxter's unique tutelage, and the transformation of their world as they uncover the secret behind the Digital Nature Chronicles.

It looks pretty good.

I found it interesting that the image order was mixed up slightly. I'm pretty sure you could force it to follow a certain sequence if you wanted. In the future I might explicitly ask it not to include any references to the images. And no doubt you could guide the style of the story (e.g. sci-fi, romance, etc.).
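I haven't tested any of these tweaks, but as a sketch, they could be expressed as optional instructions appended to a prompt. The function name, the flags and the instruction wording below are all assumptions, and there is no guarantee the model would follow them exactly.

```python
def add_constraints(prompt, keep_order=False, hide_image_refs=False, genre=None):
    """Append optional instructions to an existing prompt."""
    extra = []
    if keep_order:
        extra.append("Use the images in the order given.")
    if hide_image_refs:
        extra.append('Do not mention the images or add labels such as "(image 1)".')
    if genre:
        extra.append(f"Write the story in the {genre} genre.")
    if not extra:
        return prompt  # nothing to add, return the prompt unchanged
    return prompt + "\n\n" + "\n".join(extra)


base_prompt = "Write a short story overview from these three image descriptions."
print(add_constraints(base_prompt, keep_order=True, hide_image_refs=True, genre="sci-fi"))
```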

Step 4 - Expanding the story

OK, so that's the overview. Let's turn this into a short story.

I was keen to use some sort of narrative structure to guide the story completion. In a new ChatGPT session (I just went for GPT-3.5 for speed, plus I didn't think this needed all the firepower of GPT-4) I asked about narrative structures in short stories.

I'm going to let ChatGPT recommend the right structure. So let's add this set of structures back into our original chat and ask what ChatGPT thinks we should do.

Interestingly, it went for a combination. Looks promising, so let's try to expand this out. How long should we aim for?

They say a picture is worth 1,000 words. So we have 3 pictures, so let's give a target length of 3,000 words. We'll also explicitly ask to remove any reference to the images.
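For what it's worth, the target is simple arithmetic, and it is easy to check how close a reply gets to it. The sketch below uses hypothetical helper names and a two-sentence stand-in for the story. It counts whitespace-separated words and joins a reply that arrives in several parts.

```python
WORDS_PER_PICTURE = 1000  # "a picture is worth 1,000 words"


def target_length(picture_count):
    """Target word count for a story built from the given number of pictures."""
    return picture_count * WORDS_PER_PICTURE


def join_parts(parts):
    """Join a reply that arrived in several parts, skipping empty ones."""
    return "\n\n".join(part.strip() for part in parts if part.strip())


def word_count(text):
    """Count whitespace-separated words."""
    return len(text.split())


# Stand-in text, not the full story.
parts = [
    "Amelia's heart raced as she stood on the ledge. ",
    "\nBack at home, Max gurgled happily.",
]
story = join_parts(parts)
target = target_length(3)
print(f"{word_count(story)} of {target} words")
```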

ChatGPT gave a useful heads-up about expected breaks due to context length.

In hindsight, 3,000 words is a lot for this thought experiment! But I went down that route anyway. It turns out that the total output was ~1,300 words in the end. Again, there was some nice UX explaining that the story was over.

And that's the end of our journey for now.

One of the areas we may want to explore as a next step is a re-write in the tone of other authors, to make the style more distinctive and/or appropriate to the topic. I'd also like to explore how we use the content to generate a good image to use alongside the article. But I'll save that for another day.

The Digital Nature Chronicles

Author: ChatGPT w Unsplash.

Amelia's heart raced as she stood on the edge of the breathtaking ledge, watching the sun dip below the horizon in the Badlands National Park. The sky was painted with indigo and gray hues, the landscape around her both spectacular and surreal. Little did she know that the discovery she was about to make would change her world forever.

As the wind whispered around her, Amelia sensed a strange energy emanating from the ancient tablet she had found earlier. The cryptic messages it contained hinted at a mysterious power hidden within the park, and she was determined to uncover its secret.

Back at home, Max, Amelia's baby, gurgled happily as Baxter, their loyal and intelligent dog, sat beside him. The open laptop on the table connected them to Amelia, who had entrusted her son's care to Baxter during her adventure. Baxter's keen senses and technological aptitude allowed him to stay in touch with Amelia, providing guidance and support from a distance.

"Max," Baxter said, his voice a blend of curiosity and excitement, "I think it's time I told you a story about our world, a story that goes back to the ancient past when nature and technology began to coexist in harmony."

Max listened intently, his wide eyes absorbing every word that Baxter spoke. As the dog shared the stories of the world's past, Amelia continued her exploration, each step bringing her closer to the truth.

As Baxter's voice filled the room, Max listened with rapt attention. The dog began his tale, speaking of a time when humanity was on the brink of losing touch with nature. The ever-advancing technological world had nearly engulfed the beauty of the natural environment, threatening the delicate balance of life.

But then, a group of visionaries emerged, determined to harmonize the relationship between humans, nature, and technology. They believed that if humanity could learn to integrate their advancements with the natural world, a new era of understanding and growth would begin. The visionaries embarked on a quest to uncover the secrets of this harmonious coexistence, leaving behind cryptic messages and artifacts for future generations to find.

Back in the Badlands, Amelia felt a growing connection to the stories Baxter shared. She traversed the rugged landscape, guided by the ancient tablet and an inexplicable sense of intuition. As the sun continued to set, Amelia discovered a hidden path that led her deeper into the park. The air around her buzzed with energy, and she knew she was on the brink of uncovering something extraordinary.

Meanwhile, Baxter continued his story, describing how the visionaries had hidden a powerful secret within the heart of nature, awaiting the one who could unlock it. He spoke of the ancient tablet that Amelia had found, explaining that it was one of many clues left behind by the visionaries. The tablet, Baxter explained, was a key to understanding the true potential of humanity's relationship with nature and technology.

Max's eyes sparkled as he imagined the world his mother was exploring. The baby could sense the importance of her journey and felt a deep connection to the stories Baxter shared. Even though he was too young to fully comprehend the words, Max instinctively knew that his mother's quest would have a profound impact on their lives.

In the depths of the Badlands, Amelia followed the hidden path, her determination unwavering. The path led her to a secluded grove, where the energy she sensed earlier seemed to concentrate. The ancient tablet seemed to vibrate in her hands, and as she held it up, a beam of light shot out, illuminating a hidden inscription on a rock face.

Amelia read the inscription, her heart pounding as she realized she had found one of the secrets the visionaries had hidden. The words spoke of a power that lay dormant, waiting to be awakened by one who could truly understand the balance between nature and technology.

The inscription spoke of an ancient ritual that could awaken the dormant power, a ritual that required a deep understanding of the symbiotic relationship between nature and technology. Amelia knew that she had been guided to this place for a reason and felt a profound responsibility to complete the ritual.

She carefully followed the instructions etched into the rock face, preparing the sacred space in the grove. Amelia placed the ancient tablet at the center and began to meditate, focusing her thoughts on the harmony of nature and technology. As she did so, the energy in the grove intensified, and the ground beneath her trembled with anticipation.

At home, Baxter sensed the shift in energy and knew that Amelia was on the verge of a breakthrough. He turned to Max, his eyes filled with excitement. "Your mother is about to do something extraordinary," he said. "A new age is about to begin, one where we live in harmony with both the natural world and the technological wonders we have created."

Max gazed at the open laptop, his tiny hands reaching out as if trying to grasp the unseen energy that connected him to his mother. As Amelia continued the ritual in the secluded grove, the ancient tablet began to glow, the cryptic symbols etched into its surface illuminating with a brilliant light.

The air around Amelia crackled with electricity, and she could feel the power surging through her, connecting her to both the natural world and the technological wonders of humanity. As the energy reached its peak, a blinding flash of light erupted from the tablet, and Amelia felt herself becoming one with the power hidden within the Badlands.

In that moment, the world around her seemed to shift, revealing the true potential of a harmonious coexistence between nature and technology. She saw visions of cities intertwined with forests, rivers flowing through bustling metropolises, and people living in perfect harmony with the environment.

Back at home, Baxter and Max felt the shift as well, their hearts filling with joy and hope for the future. They knew that Amelia had unlocked the secret hidden within the ancient tablet, and their world would never be the same.

As Amelia returned home, her heart swelled with the newfound knowledge she carried. She embraced Baxter and Max, tears of joy in her eyes, knowing that they had all played a part in unlocking the secret that would change their world.

Together, they began to share their experiences with others, spreading the message of the harmonious coexistence between nature and technology. People marveled at Amelia's discoveries, and the visionaries' wisdom began to resonate with a growing number of individuals.

Soon, communities around the globe started to embrace the concept of living in balance with nature and technology. Architects designed buildings that incorporated greenery and natural elements, while scientists developed innovations that benefited both the environment and society. Humanity as a whole started to recognize the importance of preserving the planet for future generations.

Years passed, and Max grew up in a world that was unlike anything his ancestors could have imagined. Under the loving guidance of his mother and his loyal companion Baxter, he learned to appreciate the harmony that now existed between humans, nature, and technology. He understood that he and his family had played a pivotal role in ushering in this new era, and he was determined to continue their legacy.

Max became a visionary in his own right, inspiring others with his passion for the harmonious coexistence of nature and technology. His ideas and innovations contributed to the continued growth and evolution of the world around him, ensuring that the balance between the natural and the man-made remained strong.

Baxter, ever the faithful companion, remained by Max's side throughout his life. The wise old dog watched as Max built upon Amelia's discoveries, helping to shape a world where the beauty of nature and the marvels of technology were no longer at odds but instead existed in perfect harmony.

And so, the legacy of the Digital Nature Chronicles lived on, a testament to the power of unity, understanding, and the boundless potential of the human spirit.
