Generative AI Instant Camera
INSTAGEN
Published – JAN 2024

CAPTURE MORE THAN A MOMENT

Picture this: It's a sunny Saturday afternoon in the park, the perfect setting for lounging with friends and capturing memories. Just as you're about to snap a photo with your iPhone, your friend suggests something better. From their bag, they pull out an intriguing camera – a fusion of retro Polaroid charm and sleek modern hardware.

With a flash, a unique photo is produced. Out of the film slot emerges a watercolor painting, where reality is softened into dreamy brushstrokes. The park and the friend you know are still there, but the scene is different – serene and slightly whimsical, as if it's been lifted from the pages of a storybook and rendered with a watercolor artist's gentle touch.

Welcome to the world of Instagen. This handheld, point-and-shoot camera doesn't just capture moments; it reinvents them with cutting-edge AI, creating beautiful art pieces printed on real photo paper – giving you tangible take-home memories.

I designed and built Instagen over the course of 2023, navigating through what turned out to be my most challenging and rewarding personal project to date. It pushed the limits of my knowledge and comfort zone, driving me to learn Raspberry Pi development, software architecture, complex programming in new languages, and even CAD design for a custom camera body, complete with 3D printing.

In this writeup I'll take you deeper into the journey: from the initial spark of the idea, through the principles that guided its design, to the final steps of the development process.

But before diving in, why not take a look at the Generative Gallery to see some of the shots I've taken with the Instagen? If you'd rather not, the four images below are some recent shots I've taken.

THE SPARK

In April 2023, a seemingly insignificant Reddit post caught my eye and changed everything. It was a Polaroid commercial that aired in 1975, but not just any commercial - it starred a young Morgan Freeman demonstrating the magic of capturing and printing moments instantly. Sure, Morgan Freeman was cool, but it was the Polaroid camera that struck a chord. It reminded me of my first Polaroid experience as a kid – the anticipation and magic of watching a moment materialize right before my eyes. It was a slice of time instantly captured and placed in my hands.

WHY BUILD AN AI CAMERA?

Today's cameras, while technologically superior, miss that tactile, nostalgic essence. Millennials like me briefly knew the era of film before the digital wave took over, leaving us with a subtle longing for something more tangible. Instagram and the rise of 'hipsterism' partly filled this void and breathed new life into the art of photography.

Gen Z's relationship with technology is complex. They are digital natives, yet they often find themselves drawn to more tangible, 'retro' technologies. This is evidenced by the revival of film photography. Kodak's film sales, for instance, have seen an uptick of roughly 5% year over year recently, indicating a renewed interest in the medium. Retailers like Urban Outfitters are capitalizing on this trend, selling refurbished Polaroid cameras at premium prices. Disposable cameras become a hot commodity every summer, particularly among high schoolers.

Parallel to this retro resurgence is the advancement in AI image generation. Tools like DALL-E, Midjourney, Stable Diffusion, and others represent a frontier in creative technology, turning words into images and reimagining existing pictures in novel styles.

Caught between these two worlds, I saw an opportunity. What if we could merge the physical, nostalgic appeal of film with the creative potential of AI? A camera that not only captures moments but transforms them through AI, then prints them out – a blend of old and new, tangible and digital. Had anyone done that before?

How hard could it be?

OH SHIT, I DON'T KNOW HOW TO CODE

In case you didn't notice, I'm more of a designer-designer. Outside of fiddling with web templates, I have no idea how to program or even write JavaScript. It's always been a bit of a mystery that I was happy to leave to people way smarter than me. But I also hate it when I can't do something that I want to do. Not in a "roll and scream on the floor because I can't get my way" kind of way, but in an "if I have an idea, I am going to find a way to make it work lest I go mad" kind of way.


When I designed Phoney, I simply hired a developer to build the app and voice actors to record the voices. Hiring help was briefly on the table here too, but considering that this was purely a programming and hardware project, doing so would have removed any need for me in the project other than managing it - and I didn't start this project to LARP as a project manager.


So I had to either learn how to code (I can hear engineers' eyes rolling as they read that sentence) or figure something else out.


Fortunately the world just recently unlocked a new cheat code... ChatGPT!

I want to design and build a modded shell of a Polaroid camera with custom electronic components. The camera will capture a photo, upload the photo to the OpenAI DALL-E API endpoint, process variations of the captured image, download the processed image, and print it onto Polaroid film. This modified Polaroid camera will use a digital camera module connected to a Raspberry Pi and a Polaroid film printer.
I want you to prepare a development plan with all of the necessary steps to make the software and hardware for this camera, including which APIs and services I should use, and which scripts I should write.

Opening request to ChatGPT
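
To make that plan a bit more concrete, here's a minimal sketch of the loop the prompt describes – capture a photo, send it to the DALL-E variations endpoint, download the result, and hand it to the printer. This isn't the actual Instagen code: the use of the picamera2 and openai Python libraries, the file paths, and the "instagen-printer" CUPS queue are all assumptions made for illustration.

```python
# Sketch of the capture -> generate -> print loop (not the actual Instagen code).
# Assumes a Raspberry Pi camera driven by picamera2, the OpenAI Python client,
# and a printer exposed as a CUPS queue named "instagen-printer" (all assumptions).
import subprocess

import requests
from openai import OpenAI
from picamera2 import Picamera2

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def capture_photo(path="capture.png"):
    # Grab a single still from the Pi camera module.
    cam = Picamera2()
    cam.start()
    cam.capture_file(path)
    cam.stop()
    return path


def generate_variation(photo_path, out_path="generated.png"):
    # The DALL-E variations endpoint expects a square PNG under 4 MB; a real
    # pipeline would crop and resize the capture before uploading it.
    with open(photo_path, "rb") as f:
        result = client.images.create_variation(image=f, n=1, size="1024x1024")
    image = requests.get(result.data[0].url, timeout=60)
    with open(out_path, "wb") as f:
        f.write(image.content)
    return out_path


def print_photo(path):
    # Hand the finished image to the printer via CUPS (queue name is hypothetical).
    subprocess.run(["lp", "-d", "instagen-printer", path], check=True)


if __name__ == "__main__":
    print_photo(generate_variation(capture_photo()))
```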

As it turns out, ChatGPT is great at writing code. Sure, this may be common knowledge now, but at the time, this was cutting-edge stuff! And I won't lie, it was really easy. So easy that I'm struggling with what to write here. There's no secret to it. It was very similar to working with an engineer at work, except our interaction happened entirely through text and I served as the implementer and tester of the code.


Not all of it was written by the bot. The printer, which is a critical piece of the puzzle, was implemented by the talented Mike Manh - a real developer. Mike joined me over the summer as a technical consultant and development partner, and played a role in helping get my prototype off the wall and into a handheld format (more on that later).

The software is only half of the story, as of course the camera needs a body. The seemingly easiest thing to do was to take a real Polaroid, gut it, and cram all of the new digital components inside... which for the prototype is exactly what I did.


I purchased a 1970s Polaroid OneStep off of eBay, found some YouTube repair videos showing how to disassemble the camera without breaking it, and got to work taking it apart. Once I was left with an empty shell, I took a good look at what we had and realized that it wasn't going to work at all.


The printer that we were using loads film cartridges from the bottom. The Polaroid, however, loads film from the front through a flap that folds down. We thought we might be able to just slide the printer in through the flap, but it was too tall. There was also the issue of mounting the Raspberry Pi and camera module, which wasn't as simple as fixing them to the insides. These cameras are small and offer little room to work with, so the components simply wouldn't fit. It was clear that we had to be more tactical with our approach and create some custom components. We'd have to 3D print them.


*Cue a three-month montage of designing and printing mounting components*

HOW THE SAUCE IS MADE

The first thing the software does is look at the contents of the photo to understand what sort of style prompt it should use to reprocess the image. It identifies that there are people sitting outside looking at the camera, so it filters through prompts that best fit 'portraits' out of hundreds of prompts across dozens of different art styles. In this case, it chose the prompt "portrait in the style of Peter Paul Rubens, with rich colors, strong chiaroscuro, and a focus on capturing the power and vitality of the subject".
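
To illustrate that selection step, here's a hedged sketch of mapping a detected scene category to a pool of style prompts. The category names, the tiny prompt library, and the select_style_prompt helper are invented for the example – the real library spans hundreds of prompts across dozens of styles, and the scene-detection step itself isn't shown.

```python
import random

# A tiny stand-in for the real prompt library
# (hundreds of prompts across dozens of art styles).
STYLE_PROMPTS = {
    "portrait": [
        "portrait in the style of Peter Paul Rubens, with rich colors, strong "
        "chiaroscuro, and a focus on capturing the power and vitality of the subject",
        "portrait as a soft watercolor illustration with loose, dreamy brushstrokes",
    ],
    "landscape": [
        "sweeping landscape in the style of a vintage travel poster with bold flat colors",
    ],
    "still_life": [
        "still life rendered as a Dutch Golden Age oil painting with dramatic lighting",
    ],
}


def select_style_prompt(scene_label: str) -> str:
    # Fall back to portrait prompts if the detected scene has no matching category.
    pool = STYLE_PROMPTS.get(scene_label, STYLE_PROMPTS["portrait"])
    return random.choice(pool)


# Example: people sitting outside looking at the camera -> treated as a portrait.
print(select_style_prompt("portrait"))
```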


Once the image is generated, the camera starts to gently rumble as the on-board printer pushes a sheet of film out of the slot. Within minutes, the processed AI image fades into view. At the same time, both the original and the newly generated image are uploaded to your Instagen profile for all to see.


When you look at the original photo you took, you realize that your friends were slightly out of frame and a bit overexposed from the bright sunlight, but it doesn't matter! One of the amazing benefits of AI-generated images is that errors like framing, exposure, and focus issues are smoothed over when the photo is recreated. They always come out looking better than they went in.

V2 BODY

Now, the real fun begins. After using the V1 camera out in the wild for a month and getting a good feel for what worked and what didn't, I was able to go back to the drawing board and start planning a purpose-built body from scratch. I wanted it to be simple, beautiful, and modular so that it was easy to print and construct. I also wanted it to look as professional as possible, which meant getting all of the extra details like a viewfinder and strap. Fortunately, I was able to recycle these components from the original OneStep and fit them into my build.


Designing the camera body from the ground up wasn't entirely simple. We all know the basic elements that make up a camera body, but once you sit down and start to design your own in 3D, the aesthetics and functional elements become a lot less clear. Add in the need to design components that support other hardware, and things start to get really tricky. But with enough time and compostable filament, I was able to design a working chassis that held the printer and Pi components in a small form factor.


With the internals figured out, I could move on to refining the final envelope to enclose it all. I'm certainly simplifying the process a bit, as there are a lot of components to account for, but all you need to know is that it took about 500 hours in total (a conservative estimate) and the whole process was a lot of fun.


IT ALL COMES TOGETHER

LET'S TALK HARDWARE

WHAT'S NEXT...

Instagen is a living project that will continue to evolve over time. There is so much that can be done in this space and I have a lot of ideas. The only limiting factor is my time.


My immediate plans are to start on V2, which will replace the viewfinder with an LCD screen, introduce physical buttons and dials to offer more control over the output, and possibly introduce a new camera module with swappable lenses. Video is another exciting avenue that I am keen on exploring soon, though I'm not sure if that functionality makes sense for the Instagen or another camera model. Keep your eye on this space!


Thank you for reading.
