State of Generative AI (2024)
This article summarizes my current assessment of issues and opportunities in the gen AI space, from the broadest to the most specific, as an application developer and user.
I’ve used generative AI (gen AI) extensively over the last three years for generating code, text, images, and music. I’ve applied it mundanely as a systems developer and artistically in my hobby art projects. I’ve also trained and hosted models myself.
Here’s a recap of my current stance:
‘True’ Artificial General Intelligence (AGI)
- In order to be called a "general" "intelligence", an entity needs to be able to genuinely surprise, critique, ask clarifying questions, appreciate beauty, form models of the world and of others' intentions, and exhibit creative intent.
To illustrate, one can turn to the words of Guillermo del Toro,
"The value of art is not in how much it costs and how little effort it requires, It's how much would you risk to be in its presence."
Anything less is just a more powerful spell-checker, or search engine, if you will.
- If an AI is to work as an 'equal partner' to humans, it will have to go far beyond what's even hinted at by Altman et al.
- I believe Enactivism is key to modeling this kind of ‘intelligence.’
- Under Enactivism, intelligence emerges from an organism’s direct, ongoing interaction with the world; ‘knowing by doing.’
- The mind and environment co-create each other in an autopoietic dance.
- Consequently, we need to think about how an AI physically engages with its surroundings, not just how many parameters it can crunch.
That’s why I think that ‘true’ AGI will not come until we create something that has the potential to perceive humans as an existential threat, and to act creatively to neutralize that threat.
- Which is why we probably don't want true “superhuman” AGI in the first place.
- Alignment is a risk management process.
- It would be naïve to presume all actors hold a unified view of desirable alignment outcomes.
- The real challenge is making sure the resulting policy patchwork does not have fatal tears.
Humanity’s Use of Transformer-Based AI
- 100% of the texts, images, sounds, songs, APIs, architectural decisions, design patterns, components, and programming languages that large language models (LLMs) are trained on were originally created by humans.
- Freya Holmér has a wonderful take on this.
- Model Autophagy Disorder - AI model collapse when fed AI-generated training data - is a real problem.
- Although synthetic training data is valuable for model-to-model training, you can never get more out of it than what was in the original.
It’s tempting to train models to identify the steps that led to every innovation ever made, and then attempt reinforcement learning on chain-of-thought, as in the latest o3 model.
But it’s still a bet. And even if it works, the system would remain confined to the kinds of leaps humans have already demonstrated.
- The points above clearly indicate that no, until we find some way of achieving 'true' AGI, humans will not be cut out of the loop.
- Humans are still the only source of novelty: the kind of recombination that is truly innovative.
- Compare Richard Feynman's arguments for why consciousness is quantum.
Navigating a Manifold of Language
- Early on, I started thinking of an LLM as a manifold: a statistical landscape representing language and stated knowledge.
- A prompt effectively serves as an initial position and direction vector for a traversal of that landscape.
- In this view, “the question is the answer”—literally.
- This means you should use the tone, voice, language, and idioms of the domain you want the AI to focus on.
- Thus, rhetorical questions can be powerful:
"Is the reason in the package.json or tsconfig ?"
"Maybe we should use a well-known gang-of-four pattern instead?"
When you explicitly specify the style—e.g.,
"Don’t give me typical AI output like overly verbose, posh words and bullet lists. I want the answer in conversational tone, like I explained it to a friend over coffee"
you can often generate text that scores as “human” on zeroGPT.
- Switching between models can be fruitful.
- It’s not uncommon to start with agentic Claude or Gemini 2, then switch to “4o,” and finally throw “o1-preview” at the problem.
- You develop a feel for which models are best for specific tasks or tech stacks.
- Pitting two AIs against each other can lead to better content.
Take an initial output and ask a different AI:
“A friend of mine has come up with this; please give constructive criticism and suggestions for betterment.”
Then go back to the first AI, paste the suggestions verbatim, and say:
“A friend of mine came back with these suggestions.”
Repeating this several times can lead to robust and relevant results; a minimal sketch of this loop follows the list below.
- Referring to them as “a friend” seems to put AIs in a more helpful mode.
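To make the loop concrete, here is a minimal sketch in Python. It assumes both models are reachable through OpenAI-compatible chat-completions endpoints via the openai package; the model names, the second base URL, and the count of three rounds are placeholders, not recommendations.

```python
# Minimal sketch of the "friend" critique loop between two models.
# Assumptions: both endpoints speak the OpenAI chat-completions API;
# model names and the second base_url are placeholders.
from openai import OpenAI

author = OpenAI()  # the model that drafts and revises (reads OPENAI_API_KEY)
critic = OpenAI(base_url="https://other-provider.example/v1", api_key="...")

def ask(client: OpenAI, model: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

draft = ask(author, "author-model", "Draft a short README intro for a dotfile-sync CLI.")

for _ in range(3):  # a few rounds; returns tend to diminish after that
    critique = ask(
        critic, "critic-model",
        "A friend of mine has come up with this; please give constructive "
        "criticism and suggestions for betterment.\n\n" + draft,
    )
    draft = ask(
        author, "author-model",
        "A friend of mine came back with these suggestions; please revise "
        "the text accordingly and return only the revised version.\n\n"
        "Suggestions:\n" + critique + "\n\nCurrent text:\n" + draft,
    )

print(draft)
```

The same back-and-forth works just as well done by hand, pasting between two chat windows; the script only automates the copy-paste.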
The Human Side of AI-Generated Content
Through introspection, I’ve noticed some pitfalls that we as content creators and systems developers need to watch out for:
- AI may give you 80% of a solution at the price of making you less motivated to supply the final 20%.
- Generated content is not ‘internalized’ as soundly as self-written or peer-reviewed material.
- We don’t remember the structures and solutions as clearly as when we design/write them ourselves or when we review a colleague’s work.
- You will feel empowered when rapidly delivering a solution, but because you’re not emotionally invested in the proposed solution, it will feel more like a handoff than a creative process.
- There is something in the absence of an actual, feeling, vulnerable, proud source that makes the generated solution come off as discardable.
- This makes you intellectually and emotionally lazy and will degrade the quality of your output over time.
- Which will in turn feed back into lower-quality human examples for AIs to learn from.
- Be rude to machines, not people.
There seems to be a kind of 'least effort for maximum impact' lingua franca embedded in models that is akin to a programming language.
As we learn to pinpoint the most efficient wording, this language has become terse, bordering on rude.
One can only hope that this does not leak over into human-to-human communication.
Issues Facing Systems Developers Specifically
- Commercial models favor local ‘fixes’ over systems thinking.
- Likely due to cost reduction, context-window limits, and the limited fine-tuning scope of the training data, these models typically patch local issues rather than propose holistic solutions.
- Once a model has locked on to a certain solution, continuing the same dialog often leads nowhere.
- The context window can get ‘tainted,’ and the model starts going in a loop, so your options are:
a) Restart a new session specifically targeting the problem.
b) Ask the AI to “stop it, you're going round in circles, take two steps back and reevaluate the thing by thinking outside the box.”
c) Switch to another AI entirely and see if it comes up with a different approach.
- This is extra problematic when using agentic auto-apply, where things can go very wrong very fast.
Make sure to branch, stage, and commit so you don’t have to start all over. I usually keep an 'instructions' text file that I can update with the current state of components, so I can efficiently revert and restart if the AI goes astray.
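As an illustration of that checkpoint habit, the sketch below wraps the branch-and-commit step in a small Python helper. It assumes an ordinary git repository; the `agent/` branch prefix and the commit message format are arbitrary conventions chosen for the example, not something the tools require.

```python
# Sketch of a pre-agent-run checkpoint: park the current working tree on a
# throwaway branch so an auto-apply session can be reverted cheaply.
# The agent/ branch prefix and message format are arbitrary conventions.
import subprocess
from datetime import datetime

def git(*args: str) -> str:
    """Run a git command, fail loudly, and return its trimmed stdout."""
    out = subprocess.run(["git", *args], check=True, capture_output=True, text=True)
    return out.stdout.strip()

def checkpoint(note: str) -> str:
    """Stage and commit everything on a new branch before letting an agent loose."""
    branch = f"agent/{datetime.now():%Y%m%d-%H%M%S}"
    git("checkout", "-b", branch)   # isolate the agent run on its own branch
    git("add", "-A")                # stage everything, including the instructions file
    git("commit", "--allow-empty", "-m", f"checkpoint before agent run: {note}")
    return branch

# Usage: checkpoint("refactor auth middleware")
# If the agent goes astray, check out your previous branch and delete the
# checkpoint branch to get back to the last known-good state.
```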
Appendix A - Further Reading
For a similar post with differing conclusions: https://threadreaderapp.com/thread/1871946968148439260.html?utm_source=tldrnewsletter (thanks Lucas Hadin)
For more hands-on advice, follow https://www.linkedin.com/in/happybits/