AI-native engineering for high-stakes work: Lessons from Thomson Reuters’ Rittika Jindal
Yoni sat down with Thomson Reuters' Rittika Jindal to talk about building reliable AI in high-stakes domains, and how AI-native engineering is changing what software teams actually do.
I first saw Rittika Jindal speak at Gartner in Florida a little over a year ago. She was presenting OpenSETU, a data agent for non-technical users, and I remember walking out thinking: this is someone who is not treating AI like a demo. She is treating it like engineering.
That distinction matters more at Thomson Reuters than in most places. This is a company serving legal, tax, risk, compliance and journalism professionals. In those environments, the wrong answer is not just annoying. It can affect real work, real customers, and in some cases real lives.
One sentence from our conversation captured that bar perfectly:
“accuracy is the key.”
I think that line is worth sitting with for a minute, because it explains almost everything that came next in our conversation (available on YouTube, Spotify and Apple Podcasts).
In high-stakes AI, accuracy is not a metric. It is the product
Rittika works in Thomson Reuters’ Content Innovation group, an R&D team focused on how content is processed, enriched and delivered into products used by professionals. In practice, that means building AI and agentic workflows into pipelines that need to be fast, scalable, and above all trustworthy.
That emphasis on trust strongly echoed what I heard in my recent conversation with Joel Hron, Thomson Reuters' CTO. The setting is different, but the core idea is the same: when AI enters high-stakes workflows, you do not get to hide behind a cool demo. You need systems, guardrails, evaluation and clear accountability.
That is also why I liked Rittika’s framing of the problem so much. She did not talk about hallucinations as a quirky model behavior that we all laugh off on X. She talked about them as a product failure.
OpenSETU was a bridge, not a magic trick
The first project I knew Rittika for was OpenSETU. “Setu” means bridge in Sanskrit, which is a fitting name.
The idea was simple and powerful: let non-technical teams ask natural-language questions about their data, and have an agent translate those questions into Python or SQL, run the code, and return the result. In other words, bridge the gap between business users and the analysts they usually wait on.
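To make that bridge concrete, here is a minimal sketch of the pattern in Python. This is my illustration, not OpenSETU's actual code: the `translate_to_sql` function is a canned stand-in for the model call that would normally turn the question into SQL, and the read-only guardrail is one obvious safety check you would want before executing model-generated queries.

```python
import sqlite3

def translate_to_sql(question: str) -> str:
    # Stand-in for the model call: a real bridge agent would prompt an
    # LLM to turn the natural-language question into SQL or Python.
    canned = {"how many orders were placed?": "SELECT COUNT(*) FROM orders"}
    return canned[question.lower().strip()]

def run_readonly(conn, sql: str):
    # Guardrail: refuse anything but read-only queries before
    # executing model-generated SQL.
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError(f"refusing non-SELECT statement: {sql!r}")
    return conn.execute(sql).fetchall()

def ask(conn, question: str):
    sql = translate_to_sql(question)
    return sql, run_readonly(conn, sql)

# Toy database standing in for the business users' data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO orders (id) VALUES (?)", [(1,), (2,), (3,)])
sql, rows = ask(conn, "How many orders were placed?")
print(sql)   # SELECT COUNT(*) FROM orders
print(rows)  # [(3,)]
```

The shape is the point: the user asks in plain language, the agent produces code, the system (not the user) decides whether that code is safe to run.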
But the interesting part was not the interface. It was the discipline behind it.
Rittika was very clear that the team did not stop at “the code runs, so I guess it works.” They built with ground truth from day one. They knew what correct answers looked like for representative questions, and used that to evaluate whether the agent was producing the right output.
This rhymes strongly with how we think about evals at Solid, and with a point I keep seeing across strong teams: if AI is part of the product, evaluation cannot be something you bolt on at the end. It has to be part of the design.
Or, as Rittika put it:
“We make sure that we have the ground truth data. We are evaluating whatever we are building, starting from day one.”
That is not flashy. It is also probably why her work feels real.
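The mechanics of "ground truth from day one" are simple enough to sketch. Assuming nothing about Thomson Reuters' actual harness, the core loop is just: pair representative questions with known-correct answers, run the agent over them, and score the matches.

```python
def evaluate(agent, ground_truth):
    """Score an agent against questions with known-correct answers."""
    results = []
    for question, expected in ground_truth:
        got = agent(question)
        results.append({"question": question, "expected": expected,
                        "got": got, "correct": got == expected})
    accuracy = sum(r["correct"] for r in results) / len(results)
    return accuracy, results

# Toy agent and toy ground truth, just to show the shape of the loop.
def toy_agent(question):
    return {"2+2?": 4, "capital of France?": "Paris"}.get(question)

ground_truth = [("2+2?", 4), ("capital of France?", "Paris"), ("3*3?", 9)]
acc, details = evaluate(toy_agent, ground_truth)
print(f"accuracy: {acc:.2f}")  # accuracy: 0.67
```

Once this loop exists, every change to the agent gets a number attached to it, which is exactly what lets a team treat accuracy as the product rather than a vibe.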
Then the job of software development moved
A lot of people still talk about AI in software as “the thing that writes code faster.” That is true, but it is also the least interesting part of the story.
What Rittika described is much bigger. Her team has moved from AI-assisted development toward AI-native engineering. The code-writing step is just one piece. AI is now involved across planning, coding, testing, debugging, observability and review.
The workflow she described was one of my favorite parts of the episode:
A developer starts by creating a plan document with AI. That plan gets reviewed by humans. It goes into GitHub. Then code gets generated against that plan. Before a PR is opened, an internal review skill checks the plan, code, tests and architectural alignment. After the PR is opened, a bot reviewer runs automatically and comments on what it finds. The developer then feeds those comments back into their coding agent and iterates. Only after that does the PR land in front of a senior engineer.
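Sketched as code, that loop might look like the following. All of the names here (`review_skill`, `bot_reviewer`, and so on) are hypothetical stand-ins for her team's internal tooling; the point is the control flow, not the implementations.

```python
def develop(change_request, generate, review_skill, bot_reviewer, max_rounds=3):
    # Plan drafted with AI, then reviewed by humans and committed to GitHub.
    plan = f"plan for: {change_request}"
    # Code generated against the approved plan.
    code = generate(plan, feedback=None)
    # Pre-PR check: plan, code, tests and architectural alignment.
    review_skill(plan, code)
    # Post-PR bot review, with comments fed back into the coding agent.
    for _ in range(max_rounds):
        comments = bot_reviewer(code)
        if not comments:
            break
        code = generate(plan, feedback=comments)
    # Only now does the PR land in front of a senior engineer.
    return code

# Toy stand-ins so the loop is runnable end to end.
def generate(plan, feedback=None):
    return "code v2" if feedback else "code v1"

def review_skill(plan, code):
    assert plan and code, "pre-PR check failed"

def bot_reviewer(code):
    return ["add tests"] if code == "code v1" else []

final = develop("add export button", generate, review_skill, bot_reviewer)
print(final)  # code v2
```

Notice that the human reviewer sits at the end of the loop, not inside it: by the time a senior engineer sees the PR, the cheap objections have already been raised and addressed by machines.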
So yes, Claude writes code. But Claude also reviews Claude.
If you have been following Rittika’s writing, especially posts like How We Taught Claude to Review Claude and Reviewing AI Code at Scale: A Principal Engineer’s Dashboard, this pattern will sound familiar. What I liked hearing on the podcast is that this is not a thought experiment. It is how her team is actually working.
It also pairs nicely with what we wrote when Solid moved to an agentic platform and later on how we observe that system in production with LangSmith. Once your development process becomes more agentic, observability and evals stop being “nice to have.” They become part of the development loop itself.
Yes, teams are shipping more. No, that does not mean quality stops mattering
One of the more striking parts of the conversation was the sheer change in pace.
Rittika described a world where some repos that used to merge five or six PRs a week are now seeing much more than that every single day. In her words, “we are merging 5 to 10 PRs every single day.”
That kind of compression changes expectations fast. It also creates a fear that I hear from a lot of people: are we just generating more slop, faster?
Her answer was more balanced than the hot takes you usually see online. Yes, AI slop is real. Yes, dead code shows up. Yes, review gets harder. But humans were never exactly writing slop-free code either. The real question is whether the team has built the mechanisms to catch the things that matter: wrong business logic, broken architecture, lack of tests, poor observability, missing evals.
That is why I found her emphasis on testing, logs, and review so important. In her team’s world, the answer to faster code generation is not hand-wringing. It is a stronger system around the code.
The future engineer may look more like a software thinker
The part of the conversation that stayed with me most was not about tools. It was about roles.
Rittika argued that engineers are now spending much more of their time on planning and design, and less on manually typing code. The work is shifting upward. You ask better questions earlier. You decide what should be built, whether it should be built, how it should be evaluated, and how the workflow should be instrumented once it is live.
That sounds subtle, but I think it is a major shift.
It also echoes what we heard in my conversation with Meenal Iyer: the winning teams are the ones building foundations first, then experimenting hard, then moving into production with clarity about trust and business value.
Rittika took the idea one step further when we talked about junior engineers and the next wave of talent. Her point was that the people arriving now are more AI-native than the rest of us. They are not “adopting” AI. They barely know how to work without it. That will change not just how teams build, but who teams hire for.
She said, almost offhandedly:
“Maybe after five years we’ll hire software thinkers.”
Maybe that is too provocative. Maybe it is exactly right.
After hearing how her team works, it no longer sounds crazy to me. The scarce skill is moving away from typing syntax and toward framing problems, building evals, setting constraints, and deciding when humans need to stay in the loop.
What I took from the conversation
If I had to reduce the episode to four points, it would be these:
In high-stakes domains, accuracy is not a dashboard metric. It is the product.
AI-native engineering is not about faster code completion. It is about rebuilding the whole SDLC around plans, review, evals and observability.
Shipping more PRs only helps if your quality system gets stronger at the same time.
The software engineer role is shifting upward, from code producer to system designer, reviewer and decision-maker.
Rittika is one of the more thoughtful voices I have heard on this topic because she is not speaking in abstractions. She is building inside a high-bar environment, writing openly about the lessons, and grounding the entire thing in practice. Her Medium is worth following, and so is Thomson Reuters’ broader engineering transformation story. For people trying to level up quickly, I also liked her recommendation to start with the docs from the model companies themselves and with DeepLearning.AI.
And yes, it will surprise no one that the AI tool she said she cannot live without is Claude Code.
Want to hear the full episode? Check it out on YouTube, Spotify, and Apple Podcasts.


