Forget ChatGPT & Gemini: New AI Tools You Should Try

basanta sapkota
Ever catch yourself saying “AI tools” when you really mean “that one chatbot tab I never close”? Same. It’s handy, no question. But it also kind of hides what’s actually getting interesting right now.


Because the newer wave of AI tools isn’t obsessed with chatting. These tools are building workspaces. Stuff that can ship code, spit out UI you can actually hand to someone, draft video scenes, or even poke at email deliverability and show receipts when something goes sideways.

And yeah, the whole “Forget ChatGPT & Gemini…” angle has been bouncing around community posts on Medium, Substack, dev.to. Fun to skim. The real takeaway isn’t the clicky headline. It’s the shape of the products: narrower, more opinionated, and weirdly… more useful in real day-to-day work than a general assistant.

These are some of the posts that kicked off the “lists” vibe.

  • Medium. [“Forget ChatGPT & Gemini — Here Are New AI Tools…”]
  • Substack. [“Forget ChatGPT & Gemini — Here Are New AI Tools…”]
  • dev.to. [“Forget ChatGPT & Gemini…”]
  • Another Medium variant list: [“Forget ChatGPT & Gemini…”]

Below I’m grabbing a handful of new AI tools from those roundups, then leaning hardest on the ones we can back up with solid primary docs. And I’ll tell you how I use tools like this without letting them quietly torch my repo or my sender reputation. Because… it happens.

Quick list of “new AI tools” that keep showing up in the posts

Want the quick-and-dirty version? Here.

You’ll see these names again and again:

  • OpenAI Codex for agentic coding
  • Google Stitch for generating UI plus front-end code
  • Google Flow for AI filmmaking
  • MailTester.ai for deliverability and spam/content analysis
  • Then a rotating cast depending on the post version: Buildpad, Kimi AI, Pollo AI, and others like Nano Banana, Strix, Pomelli, Opal, etc.

Now for the ones we can actually validate with primary sources… and how they slot into real workflows when you’re not just playing around.

Why these new AI tools matter

The pattern is simple. The differentiator is execution.

Chatbots answer. These new AI tools increasingly do:

  • they run in isolated environments, sandboxes
  • they operate on your repo or your assets
  • they output real artifacts, commits, PRs, UI code, scenes
  • they show logs and evidence so you can audit what happened

That evidence part is the line between “toy” and “okay, I can use this at work without sweating.”

New AI tool for coding: OpenAI Codex

OpenAI describes Codex as a cloud-based software engineering agent. It can run multiple tasks in parallel, each in its own sandbox that’s preloaded with your repository. OpenAI says it can write features, answer questions about your codebase, fix bugs, and propose pull requests for review. They also say tasks usually take 1 to 30 minutes depending on complexity. Source: [OpenAI — Introducing Codex].

The details I like here are the concrete ones:

  • each task runs in an isolated cloud environment
  • it can read/edit files and run commands like tests, linters, type checkers
  • it surfaces terminal logs and test outputs as citations so you can verify actions

That last bit matters. If I can’t see what it ran, I’m not merging it. Period.

How I’d use this

My rule is boring on purpose: give it a bounded task and a way to prove it worked.

So I’ll do things like:

  1. Make sure I’ve got a “happy path” test command I can run locally.
  2. Give the repo a little guidance. OpenAI mentions using AGENTS.md files.
  3. Ask for a small change, something that fits cleanly in a PR.

Here’s an AGENTS.md style I’ve used. Not fancy. Just clear.

# AGENTS.md

## Repo basics
- Package manager: pnpm
- Node: 20.x

## Commands
- Install: pnpm i
- Test: pnpm test
- Lint: pnpm lint
- Typecheck: pnpm typecheck

## PR style
- Small commits
- Update tests for any behavior change
- Don’t reformat unrelated files

And this style of prompt tends to behave:

Fix the failing test UserSettings.test.tsx. Don’t change unrelated files. Run pnpm test UserSettings and include the output.
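
When the task comes back, I don’t just trust the logs it cites. I rerun the same checks locally before merging. A minimal loop, assuming a pnpm project wired up like the AGENTS.md above:

# Rerun the agent's claimed checks locally before merging (commands from the AGENTS.md above)
pnpm i
pnpm test UserSettings
pnpm lint
pnpm typecheck

If the local run and the cited output disagree, that’s a stop sign, not a footnote.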

That’s where new AI tools like Codex shine. You’re not asking it to “make the app better.” You’re handing it a job. You’re also making it show its work.

Canonical reference here: [Introducing Codex].


New AI tool for UI work: Google Stitch

Stitch is a Google Labs experiment that turns text prompts or image inputs into UI designs and front-end code in minutes. Google positions it as a bridge over the designer/dev handoff potholes. Source: [Google Developers Blog — Introducing Stitch].

What Google says Stitch supports:

  • generating UI from natural language, like describing the app, palette, UX vibe
  • generating UI from images or wireframes, like a whiteboard sketch turning into digital UI
  • rapid iteration via variants
  • a “Paste to Figma” workflow
  • exporting front-end code

A practical loop I’ve seen work with Stitch

If you’ve ever watched a “quick UI tweak” mutate into a week of Slack messages… yeah. This is where Stitch can actually earn its keep.

The loop looks kind of like:

  • prompt a baseline UI
  • spin up a couple variants, 2–3 usually
  • pick one, paste it to Figma so designers can do actual design work
  • export front-end code, but treat it like scaffolding
  • then swap generated components for your design system components

A grounded prompt I’d actually use:

Create a responsive settings page for a developer tool. Needs: left nav, account section, API keys table, danger zone. Use a neutral palette and clear spacing.
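
Once the exported code lands in the repo, one low-tech way I keep the “scaffolding, not final code” rule honest is a quick grep for hard-coded styling that should become design-system tokens. The path here is hypothetical — wherever you dropped the export:

# Hypothetical export location; adjust to wherever the Stitch output lives in your repo
# Flag hard-coded hex colors that should be swapped for design-system tokens
grep -rnE "#[0-9a-fA-F]{6}" stitch-export/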

Image suggestion (optional)

If this were going in the blog post, I’d include a:

  • screenshot of Stitch generating UI variants from a prompt

Alt text: “Google Stitch AI tool generating multiple UI design variants and exporting front-end code from a natural language prompt”

New AI tool for video: Google Flow (AI filmmaking with Veo)

Google Flow is described as an AI filmmaking tool built with and for creatives, meant for creating cinematic clips, scenes, and stories using Google’s generative AI models. Source: Google Labs — Flow.

A couple specific bits from Flow’s page:

  • it mentions “180 monthly credits free of charge” (plan details can change, so check the page)
  • it says it’s available in over 149 countries

Where Flow fits, even if you’re not a filmmaker

Dev teams crank out video constantly. Release notes clips. Product walkthroughs. Onboarding. Internal demos somehow become customer-facing two weeks later.

Flow and similar new AI tools can help you get to “good enough” faster without living inside a full editing suite. But I still treat it like a draft generator. Human pass required for accuracy, branding, and anything customer-facing. No shortcuts there.

Image suggestion (optional)

  • screenshot of Flow’s scene/story workspace

Alt text: “Google Flow AI filmmaking tool interface for generating cinematic clips and scenes using Veo”

New AI tool for email deliverability: MailTester.ai (content + SPF/DKIM/DMARC)

This one’s less flashy than video or coding agents, but it’s sneaky useful. MailTester.ai sells a single report that bundles:

  • AI-powered content suggestions
  • spam score analysis
  • technical deliverability checks like SPF, DKIM, DMARC, DNS

Source: MailTester.ai homepage.

A few claims straight from their site, so treat them as vendor-stated numbers, not independent benchmarks:

  • they say they check “50+ spam factors”
  • they say they hit “94% accuracy” compared to Gmail/Outlook/Yahoo decisions, with caveats since each provider is unique
  • they mention “12,000+ marketers”

The part I actually care about: verifying SPF/DKIM/DMARC ourselves

Even if you use a new AI tool for analysis, you should still be able to sanity-check DNS. When I’m on Linux/macOS, I usually start with dig. Quick, clean, no drama.

# SPF (TXT record)
dig +short TXT example.com

# DKIM (selector varies; common patterns below)
dig +short TXT selector1._domainkey.example.com
dig +short TXT default._domainkey.example.com

# DMARC
dig +short TXT _dmarc.example.com

Then I compare what I see with what my ESP expects (Mailchimp, SES, etc.).
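
If I’m checking more than one domain, I’ll wrap those dig calls in a throwaway script. The name and shape here are mine, not anything MailTester gives you:

#!/usr/bin/env bash
# check-auth.sh (hypothetical helper) — bundles the dig checks above for one domain
# DKIM is skipped on purpose: the selector varies by provider, so check it per the commands above
domain="$1"

echo "SPF for $domain:"
dig +short TXT "$domain" | grep -i "v=spf1"

echo "DMARC for $domain:"
dig +short TXT "_dmarc.$domain" | grep -i "v=dmarc1"

Run it as bash check-auth.sh example.com and eyeball the records against what your ESP documents.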

If you want an authoritative place to start learning the standards-side background for DMARC/SPF/DKIM, stick with standards bodies and respected references. MailTester also mentions SpamAssassin, and the project home is here: Apache SpamAssassin. It’s handy if you want to understand rule-based scoring and why certain phrases trip alarms.
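
And if you want to see rule-based scoring in action yourself, SpamAssassin can be run locally against a saved message. This assumes you’ve installed it and exported a raw email to message.eml (hypothetical filename):

# -t is test mode: process the message and report the score plus the rules that fired
spamassassin -t < message.eml > scored.eml
grep -i "^X-Spam-" scored.eml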

How I evaluate new AI tools so I don’t lose a weekend

The internet loves “100 tools you must try.” My filter is… not that nice.

Here’s what I look for:

  • Can I export the results, like code, assets, configs?
  • Is there an audit trail? Logs, diffs, test output, something real.
  • Does it cover a full workflow end-to-end? UI → Figma → code is a good example.
  • Does it reduce risk or just add novelty? Deliverability checks reduce risk. Shiny toys don’t.
  • What’s the failure mode? Silent wrong output is the nightmare.

Pricing and limits matter too. Flow’s credit model is at least spelled out on the official page.

A real-week combo: using these tools together

Here’s a combo I’ve used, or used close equivalents of, on a small product update:

  1. Stitch to scaffold a settings UI so nobody starts from a blank canvas.
  2. Codex to implement UI wiring plus tests in a branch while I review and integrate.
  3. MailTester.ai plus manual DNS checks so the announcement email doesn’t faceplant into spam.
  4. Flow to generate a short “what’s new” clip for the release post.

That’s the real story with new AI tools. They’re not replacing the whole job. They’re deleting the annoying parts. The fiddly parts. The parts that drain a day and leave you with nothing to show for it.

Common mistakes people make with new AI tools

1) Treating agents like interns with prod access

Even with sandboxing, output needs review. OpenAI explicitly says manual review and validation is essential before integration or execution. Source: Introducing Codex.

2) Asking for huge changes in one shot

Big prompts create big ambiguity. Split it into PR-sized chunks. Demand tests and logs.

3) Believing deliverability is “just SPF/DKIM”

MailTester’s pitch and my experience line up here. Content can absolutely tank deliverability. Technical auth is necessary, not sufficient.

4) Shipping generated UI code as-is

Use Stitch output as scaffolding. Then refactor into your component library, accessibility rules, and design system. The boring stuff. Still important stuff.

If you’re experimenting with agentic dev workflows, you’ll probably care about running models locally for privacy or speed too. I wrote up a practical guide here: Run local LLMs on Linux with Ollama.

Pick tools that leave behind artifacts, not vibes

If you only keep one idea from all this, make it this: the new AI tools worth your time leave behind something you can review. Diffs. UI exports. Logs. Scenes. DNS checks you can confirm. That’s how trust gets built without slowing everything to a crawl.

Try one tool this week with a tiny, measurable task.

  • Codex: “fix this test and show the output”
  • Stitch: “generate a settings screen and export code”
  • MailTester.ai: “analyze this campaign email before sending”
  • Flow: “create a 15-second release clip storyboard”

And if you’ve found a new AI tool that actually holds up under real work, not just a slick landing page, tell me in the comments. I’m always collecting the ones that earn a permanent spot in the toolbox.
