Engineering the future of machine learning

Naveen Ramakrishnan on machine learning, real-world robustness and what it really takes to build the Recognition Economy

About this series

Engineering the future is a recurring series featuring the engineers behind Metropolis and what they’re working on. These are the people designing the science, solving the hard problems and building toward a future where the real world moves with you. Every installment is a candid conversation with some of the people who make Metropolis, Metropolis.

The participants

Interviewer:

Ayisha Jackson, Senior Technical Project Manager, Metropolis

Ayisha partners closely with Metropolis’ Advanced Technology Group to bridge technical execution and broader organizational goals. As a member of the engineering team, she acts as a conduit between the technical team and Metropolis as a whole. In this series she’ll be sitting with the engineers shaping our platform to ask the questions that turn deep work into compelling stories.

Subject:

Naveen Ramakrishnan, Director of Machine Learning, Metropolis

Naveen leads all machine learning and computer vision capabilities powering Metropolis’ Recognition Platform. With more than 15 years in the field — including roles leading computer vision teams at Amazon One and Bosch’s Center for Artificial Intelligence — he joined Metropolis’ Advanced Technology Group to help scale the Recognition Economy from the parking lot to the world beyond.

The conversation

This conversation has been edited for length and clarity.

AJ: Hi Naveen — I know you, but let’s introduce you to everyone! Who are you, and what does your role at Metropolis look like?

NR: I'm Naveen, Director of Machine Learning here at Metropolis. I lead all of our machine learning and computer vision capabilities that power the Recognition Platform across our products, starting with parking and now extending into quick service restaurants (QSRs), hospitality and beyond.

My background is in machine learning. I've spent more than 15 years in this field — at large enterprises like Bosch and Amazon, where I led the computer vision team responsible for Amazon One and an applied science team focused on smart energy. Before that, I led the applied AI team at the Bosch Center for Artificial Intelligence in Sunnyvale, where we worked across manufacturing, automotive, smart home and power tools. The throughline across all of it: Using data to develop computer vision and machine learning models that create new experiences for people.

I should also note I've been at Metropolis for about five months, so I'm representing the machine learning team and the work done here, some of which predates my arrival. The team that built this foundation deserves full credit for what made it possible.

AJ: How does Metropolis use machine learning within its core parking product?

NR: At Metropolis, we want to power magical customer experiences through computer vision — and that's the foundation of what we call the Recognition Economy. Our core parking product uses computer vision to identify which vehicle enters a parking site, how long it stays and when it leaves, so we can automatically charge visitors based on their actual utilization of the space. Computer vision plays a central role. Our edge system detects which Member enters a site, tracks when they leave and handles the charge — all without the Member needing to do a thing.

AJ: How does our computer vision technology differ from traditional license plate recognition — the kind you'd see at toll plazas or on speeding cameras?

NR: It's a question we get often, and the distinction matters. Traditional LPR systems are built for controlled environments, and they only perform well within them. They require expensive, purpose-built hardware and tend to struggle with real-world variability — different lighting conditions, extreme weather, unusual approach angles. Our system is built for robustness. It's designed to handle the kind of challenging, unpredictable conditions you encounter when you're operating at scale, in the real world, across hundreds of sites.

We use what we call a multi-layered approach. The first layer is our core computer vision model, built directly into our edge hardware — purpose-assembled, relatively low-cost and engineered to handle extreme weather, low light, direct sunlight, deep shadows and high-angle vehicle approaches. When that layer can't make a confident read, we trigger additional layers of processing — automated at first, and in rare cases, augmented by human feedback. In practice, only a very small percentage of events reach that final tier.
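The cascade Naveen describes — accept a confident edge read, escalate otherwise, and fall back to human review only in rare cases — can be sketched roughly like this. The thresholds, function name and tier labels are illustrative assumptions, not Metropolis internals:

```python
# Illustrative sketch of a multi-layered read cascade. Thresholds and
# names are assumptions for the example, not production values.

EDGE_CONFIDENCE = 0.90   # accept an edge-model read above this
CLOUD_CONFIDENCE = 0.75  # accept a heavier cloud read above this

def resolve_plate_read(edge_read, cloud_read=None):
    """Return (plate, tier) from the first layer confident enough to answer.

    Each read is a (plate_string, confidence) pair; cloud_read is None
    when the heavier model was never invoked.
    """
    plate, conf = edge_read
    if conf >= EDGE_CONFIDENCE:
        return plate, "edge"
    if cloud_read is not None:
        plate, conf = cloud_read
        if conf >= CLOUD_CONFIDENCE:
            return plate, "cloud"
    # Rare final tier: queue the event for human feedback.
    return None, "human_review"
```

The design point is that each layer is strictly more expensive than the last, so the vast majority of events never leave the edge device.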

What also sets us apart is our continuous learning framework. We collect data on an ongoing basis and feed it back into the model through a mature MLOps pipeline, enabling rapid retraining based on errors we see in the field. The system gets smarter over time — which is something traditional, static LPR systems simply aren't designed to do. Most of them rely on older, rule-based computer vision approaches rather than modern AI architectures. We rely heavily on the latter.

AJ: Can you talk about the phrase “vehicle fingerprinting” that we use internally? What is that, and how does it work?

NR: Great example of our second layer of intelligence. Let's say a vehicle enters a site and — due to a trailer, a motorcycle sidecar partially blocking the plate or some other obstruction — our edge system can't get a clean license plate read. 

Rather than losing that event, we trigger a cloud-based reidentification system. It uses what we call vehicle fingerprinting: Identifying the unique characteristics of a specific vehicle — its color, make, any distinctive markings, stickers, details of the body — and performing an image search across our database to find a matching vehicle image, hence, “reidentification.” If we find a match with high confidence, we can use that earlier read to identify the Member and complete the transaction.
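In modern computer vision, this kind of image search is typically done with appearance embeddings: each vehicle image is mapped to a vector, and reidentification becomes a nearest-neighbor lookup with a similarity threshold. A minimal sketch of that pattern, assuming unit-style embedding vectors and a hypothetical gallery of known vehicles (none of this reflects the actual patented system):

```python
import numpy as np

def reidentify(query_vec, gallery, threshold=0.85):
    """Match a query vehicle fingerprint against a gallery of known vehicles.

    gallery maps member_id -> appearance embedding vector. Returns the
    best-matching member_id when cosine similarity clears the threshold,
    else None (the event falls through to the next recognition layer).
    """
    query = query_vec / np.linalg.norm(query_vec)
    best_id, best_sim = None, -1.0
    for member_id, emb in gallery.items():
        sim = float(np.dot(query, emb / np.linalg.norm(emb)))
        if sim > best_sim:
            best_id, best_sim = member_id, sim
    return best_id if best_sim >= threshold else None
```

At production scale, the linear scan would be replaced with an approximate nearest-neighbor index, but the thresholded-match logic is the same.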

It's a proprietary, patented capability, and it's one of the clearest examples of what truly distinguishes our Recognition Platform from a simple LPR system.

AJ: We often talk about parking being Metropolis’ “proving ground.” What are the most critical lessons from building that platform that you're now applying as we expand into other verticals — like QSRs, as an example?

NR: Several things, and most of them center on one core idea: How do you build a system that performs in the real world, not just in a lab? Machine learning models trained on clean data often fall apart when you deploy them in uncontrolled environments. We learned that early. The long tail of edge cases — extreme weather, unusual lighting, occlusions, unexpected vehicle behaviors — that's where the real work lives.

So we’re fundamentally shifting the industry away from simple LPR approaches to our Recognition Platform. That means using multiple computer vision cues together: robust vehicle detection, vehicle classification, motion pattern analysis, tracking and an understanding of the physical geometry of each deployment site. The good news is that a lot of those capabilities transfer directly to QSR. We can still detect and classify vehicles, track them across a drive-thru journey and apply the same mature MLOps pipeline we built for parking — the one that continuously collects data, labels it, retrains the model and improves performance over time. That infrastructure is non-negotiable, and it gives us a real head start.

AJ: What are the hardest technical challenges as we’ve moved from parking into QSR specifically?

NR: Two main ones, with a third that underpins both. First: heavy occlusion. In a typical parking environment, vehicles move through a defined entry and exit point with space between them. In a drive-thru, vehicles are often bumper-to-bumper. That creates situations where our detection models need to correctly identify two merged, same-colored vehicles as two separate vehicles — even when the plate of one is completely hidden. That requires re-engineering our detection models to handle a different class of occlusion.

Second: multi-camera object tracking. Drive-thrus are structured differently from parking lots. You need cameras at multiple points throughout the lane to capture the full customer journey. That requires us to track a specific vehicle across all those cameras without losing it — what we call multi-camera object tracking, or MCOT. We believe it'll become a foundational recognition capability across many verticals, not just QSR.
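A common way to frame MCOT is as an association problem: each camera produces local tracks with appearance embeddings, and the system links downstream tracks back to upstream identities. Here is a deliberately simplified greedy version of that association step (real systems typically add motion models and optimal assignment; the function and thresholds are assumptions for illustration):

```python
import numpy as np

def associate_tracks(prev_tracks, new_tracks, min_sim=0.8):
    """Greedily link tracks from a downstream camera to upstream identities.

    prev_tracks / new_tracks map local track ids -> unit-normalized
    appearance embeddings. Returns {new_id: prev_id or None}; None means
    the vehicle was not seen upstream, or the match was too weak.
    """
    assignments, used = {}, set()
    for new_id, emb in new_tracks.items():
        best_prev, best_sim = None, min_sim
        for prev_id, prev_emb in prev_tracks.items():
            if prev_id in used:
                continue  # each upstream identity is claimed at most once
            sim = float(np.dot(emb, prev_emb))
            if sim >= best_sim:
                best_prev, best_sim = prev_id, sim
        if best_prev is not None:
            used.add(best_prev)
        assignments[new_id] = best_prev
    return assignments
```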

The third piece — which enables both of the above — is what we call a world model. It’s a machine learning representation of the physical environment. It lets us simulate scenarios, test our models against them and predict the next state of the world more reliably. That's the infrastructure layer that makes everything else possible.

AJ: The architecture of the platform itself is also changing — you've started talking about “recognition primitives.” What does that mean, and why does it matter?

NR: As we move from a single-vertical platform to one that needs to serve many different industries, the most important architectural shift is decoupling our recognition capabilities from vertical-specific business logic. If those are tightly bound together, every time you want to serve a new use case, you're rebuilding from scratch. That doesn't scale.

So we're re-engineering the ML tech stack around what we call recognition primitives — generalized building blocks that work across verticals. A general vehicle detection model. A general person detection model. Object tracking. Classification and recognition. These are capabilities that are useful whether you're in a parking location, a drive-thru or a hotel lobby.

For a specific vertical, you chain those primitives together to produce the application logic you need. In QSR, that might look like: detect vehicle, classify it, localize and track it through the drive-thru, then trigger the relevant application. In a different vertical, you'd chain them differently. What this gives our development teams is the ability to innovate fast on any single primitive without touching the rest of the system. It accelerates time to market and preserves the integrity of the platform at the same time.
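Architecturally, this is function composition: each primitive is a self-contained stage, and a vertical is just a particular ordering of stages. A toy sketch with stand-in primitives (the names and payloads are invented for the example):

```python
# Stand-in primitives; in practice each would wrap a model call.
def detect_vehicle(frame):      return {"frame": frame, "bbox": (10, 20, 200, 120)}
def classify_vehicle(event):    return {**event, "cls": "sedan"}
def track_vehicle(event):       return {**event, "track_id": 42}
def trigger_application(event): return {**event, "action": "start_order"}

def build_pipeline(*primitives):
    """Chain primitives so each stage's output feeds the next."""
    def run(frame):
        out = frame
        for step in primitives:
            out = step(out)
        return out
    return run

# A drive-thru vertical chains the primitives one way; a hotel lobby
# would reuse the same building blocks in a different order.
qsr = build_pipeline(detect_vehicle, classify_vehicle,
                     track_vehicle, trigger_application)
```

Because each primitive owns its own interface, a team can swap in a better detection model without touching tracking or the application logic downstream.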

AJ: The machine learning project lifecycle is pretty different from a traditional software development lifecycle. How do you think about that difference — and how does it show up in the way your team works?

NR: Traditional software development treats code and data as relatively static. You plan a project, you write the code, you ship it. The work is largely finished at deployment. Machine learning doesn't work that way. Data is constantly evolving, and it's the true source of complexity — it should shape how you build, not just what you build.

Our approach is built around a mature MLOps lifecycle. Rather than a linear release cycle, we operate in continuous loops: data collection and labeling, model training and experimentation, validation, deployment and monitoring — and then back to data collection when the world changes. Because it will. A model trained without rain and snow data will struggle the first time it encounters that weather. A model deployed in a new region may encounter vehicle types or environmental conditions it's never seen. Our job doesn't end at deployment. We're continuously monitoring for data drift, triggering retraining when needed and making sure the model keeps performing as the real world keeps moving.
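One standard way to operationalize "monitoring for data drift" is to compare the live distribution of some signal — model confidence, vehicle-class frequencies — against a training-time baseline, and trigger retraining when the divergence crosses a threshold. A minimal sketch using the population stability index, a common drift metric (the 0.2 threshold is a widely used rule of thumb, not a Metropolis setting):

```python
import math

def drift_score(baseline, live):
    """Population stability index (PSI) between two binned distributions.

    baseline and live are lists of bin proportions that each sum to 1.
    """
    eps = 1e-6  # guard against empty bins
    return sum((l - b) * math.log((l + eps) / (b + eps))
               for b, l in zip(baseline, live))

def should_retrain(baseline, live, threshold=0.2):
    """Flag significant drift; a PSI above ~0.2 is a common trigger."""
    return drift_score(baseline, live) > threshold
```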

That's actually one of the things that makes Metropolis technically interesting. We're not just building models — we're deploying them in the real world and tuning them to perform there. Taking work out of the lab and into production, at scale, is genuinely hard. And it's exactly the kind of problem our team is built for.

One last thing

AJ: What's been your favorite part of working at Metropolis so far?

NR: Honestly? The realization of just how vast the Recognition Economy actually is. Before I joined, I knew what we were doing in parking. But after working with the product, ATG and engineering leadership teams, I've come to see how far this technology can reach — and the business impact we can drive through this vision. Someone put it well early on: Wherever there's a transaction happening between a person and the real world, Metropolis wants to play a role there. That's a big idea. I find it genuinely exciting to be at the center of it.

AJ: Last one — what's your go-to album or song when you're deep at work on a hard problem?

NR: I grew up listening to a lot of Indian music, Tamil songs especially. But when I'm deep in work, I tend to reach for some old favorites: “Sultans of Swing” by Dire Straits is a go-to. And Linkin Park — they were a big part of my grad school years.

Interested in working with brilliant, kind people like Naveen? Check out our careers page to learn more.
