Why I Chose Object Detection for My Startup 🚀

When you’re bootstrapping a startup—no funding, no PR, no overhyped demo videos—you’ve got to be obsessed with the problem you’re solving. For me, that obsession is object detection.

The Bigger Picture of AI

AI is fundamentally about mimicking human abilities. In language, that means understanding and generating text with Natural Language Processing (NLP), Large Language Models (LLMs), and generative agents, or creating content through synthetic media and prompt engineering. These advances show how far we've come in replicating human language understanding and communication. But language is just one piece of the puzzle.

What about vision—human vision?

Just as we interpret the world visually—identifying objects, understanding scenes, recognizing patterns—computer vision enables machines to do the same. That’s where object detection comes in. It’s a core function that brings AI systems closer to interacting with and understanding the physical world in meaningful ways.

What Is Object Detection?

In simple terms, object detection is the process of identifying what is in an image and where it is. For each object it finds, a detector typically returns a bounding box, a class label, and a confidence score.
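
To make that concrete, here is a minimal sketch of what a single detection could look like in code, plus the usual first post-processing step of dropping low-confidence results. The `Detection` class, the `filter_by_confidence` helper, the 0.5 threshold, and all the sample values are my own illustrative assumptions, not the output format of any particular library.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Detection:
    """One detected object: what it is, where it is, and how sure the model is."""
    label: str                               # class name, e.g. "pedestrian"
    score: float                             # confidence in [0, 1]
    box: Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max) in pixels


def filter_by_confidence(detections: List[Detection],
                         threshold: float = 0.5) -> List[Detection]:
    """Keep detections whose score meets the threshold, highest score first."""
    kept = [d for d in detections if d.score >= threshold]
    return sorted(kept, key=lambda d: d.score, reverse=True)


# Made-up values for illustration only, not output from a real model.
raw = [
    Detection("pedestrian", 0.91, (34.0, 50.0, 120.0, 310.0)),
    Detection("traffic sign", 0.42, (400.0, 20.0, 450.0, 80.0)),
    Detection("pedestrian", 0.67, (210.0, 60.0, 280.0, 300.0)),
]
confident = filter_by_confidence(raw, threshold=0.5)
for d in confident:
    print(d.label, d.score, d.box)
```

Real frameworks wrap this information in their own structures, but the three ingredients (box, label, score) stay the same.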

Real-world examples include:

Self-driving cars recognizing pedestrians and traffic signs
Drones searching for survivors in disaster zones
Medical systems detecting tumors
Retail stores tracking shopper movement and product placement

A Personal Spark: Road Crack Detection

My first real exposure to object detection wasn’t through a course or tutorial—it was through a company we hired to use road imagery and assess pavement quality. They trained models to analyze photos and detect cracks.

I was fascinated. Cracks aren’t discrete objects, so detecting them meant diving into defect detection and semantic segmentation. Rather than drawing boxes around "things," the task was to highlight pixel-level irregularities.
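
A toy example shows why boxes are a poor fit for cracks. The sketch below uses a hypothetical 5x5 binary mask (1 marks a crack pixel) and compares the crack's pixel area with the area of the tightest box around it. The helper names and the data are assumptions made up for illustration.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]  # rows of 0/1 values; 1 marks a crack pixel


def mask_bounding_box(mask: Mask) -> Optional[Tuple[int, int, int, int]]:
    """Return the tightest (x_min, y_min, x_max, y_max) around all 1-pixels, inclusive."""
    coords = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not coords:
        return None  # empty mask: nothing to box
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def box_fill_ratio(mask: Mask) -> float:
    """Fraction of the bounding box's pixels that actually belong to the mask."""
    box = mask_bounding_box(mask)
    if box is None:
        return 0.0
    x_min, y_min, x_max, y_max = box
    box_area = (x_max - x_min + 1) * (y_max - y_min + 1)
    mask_area = sum(sum(row) for row in mask)  # valid because values are 0 or 1
    return mask_area / box_area


# A toy 5x5 "image" with a thin diagonal crack (hypothetical data).
crack = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
]
print(mask_bounding_box(crack))  # (0, 0, 4, 4)
print(box_fill_ratio(crack))     # 0.2
```

Here the box covers the whole image while only a fifth of it is actually crack, which is the gap a pixel-level mask closes.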

This led me into the world of data science. I learned about dataset preparation, annotation, confidence scoring, and model evaluation. Eventually, that spark evolved into completing my Master’s degree in Data Science and Analytics.

Why Object Detection?

It’s visually tangible: You can literally see the output and debug intuitively.
It applies beyond traditional “objects”: Anomalies and patterns are just as important.
Edge deployment is a big win: I’m excited about AI that runs offline on devices like the NVIDIA Jetson Nano and Google Coral.
It's under-commercialized: Especially in B2B, industrial, and smart infrastructure spaces.

Main Model (So Far)

For this project, I’m starting with Mask R-CNN, using PyTorch and Detectron2 (by Facebook AI Research). I chose it because it’s flexible, open-source, well-documented, and supports instance segmentation right out of the gate.

What I’m Working On Right Now

Training Mask R-CNN on a small custom dataset
Testing YOLOv8 vs. Detectron2 for real-time inference
Building a fast annotation tool for labeling video frames
Deploying test models on Jetson Nano and benchmarking latency
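
For the latency benchmarks, a small framework-agnostic harness goes a long way. The sketch below is one way to do it, not my final setup: `benchmark_latency` and `fake_model` are hypothetical names, and the stand-in model just burns CPU so the script runs anywhere. The warm-up and run counts are arbitrary defaults.

```python
import statistics
import time
from typing import Callable, Dict, List


def benchmark_latency(infer: Callable[[object], object], sample: object,
                      warmup: int = 5, runs: int = 50) -> Dict[str, float]:
    """Time repeated calls to infer(sample) and summarize latency in milliseconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")

    for _ in range(warmup):  # warm-up calls are not timed
        infer(sample)

    timings_ms: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        infer(sample)
        timings_ms.append((time.perf_counter() - start) * 1000.0)

    timings_ms.sort()
    # Approximate 95th percentile by index into the sorted timings.
    p95_index = min(len(timings_ms) - 1, int(0.95 * len(timings_ms)))
    median_ms = statistics.median(timings_ms)
    return {
        "median_ms": median_ms,
        "p95_ms": timings_ms[p95_index],
        "fps": 1000.0 / median_ms if median_ms > 0 else float("inf"),
    }


def fake_model(frame: object) -> int:
    """Hypothetical stand-in for a real detector; just burns a little CPU."""
    return sum(i * i for i in range(10_000))


stats = benchmark_latency(fake_model, sample=None, warmup=2, runs=20)
print(stats)
```

To compare real models, swap `fake_model` for a function that runs one forward pass on a fixed frame. On a GPU device, the timed call should also wait for the work to finish (in PyTorch, for example, by calling `torch.cuda.synchronize()`), otherwise the numbers will look better than they are.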

Startup Vision

Object detection is at the core of what I want to build. It’s not just fun—it’s useful. From smart road monitoring to small factory inspection systems, I believe real-world vision AI still has untapped potential in niche, underserved domains.

TL;DR

AI isn’t just about language—vision matters too.
Object detection enables machines to “see” the world.
I’m starting with real use cases, not hype-driven gimmicks.
Tools like Detectron2 help me move fast and learn faster.

What’s Next

Next up: my tech stack breakdown — from code to cloud. I’ll be sharing how I’m stitching together tools, frameworks, and hacks to build my first MVP.

Have questions, ideas, or just want to nerd out on CV? Reach out here.