Agentic AI Systems
Design and deploy autonomous LLM agents with tool integrations (MCP, SmolAgent, LangGraph), multi-step reasoning, and secure local hosting for enterprise workflows.
I am an AI Engineer based in Bengaluru, Karnataka, currently building autonomous agent systems at Thrivv AI (Remote, Dubai). I specialize in LLM-powered agents, Retrieval-Augmented Generation (RAG), LLM fine-tuning, and production AI pipelines, with projects ranging from multimodal interview systems to financial copilots and document forensics engines.
Previously, as an AI Engineer at Vacanzi and a Data Scientist Consultant at Rubixe AI Solution, I delivered ML models, RAG chatbots, computer vision pipelines, and client analytics solutions. I work across LangChain, SmolAgent, PyTorch, FastAPI, Qdrant, Redis, PostgreSQL, and AWS, with a focus on shipping reliable, low-latency AI systems.
LLM agents, RAG pipelines, and fine-tuned models built for real workloads.
Redis caching, vector search, and optimized inference for production speed.
Secure, scalable systems with clear engineering and business alignment.
I combine deep technical expertise in agentic AI and LLMs with clear communication, translating complex engineering into business outcomes. I invest time upfront to understand the problem, then build secure, scalable solutions that perform in production.
| Institution | Degree | Year / Status | Grade | Location |
|---|---|---|---|---|
| Indian Institute of Technology Patna | Master of Computer Applications (MCA) | Pursuing | | Patna, Bihar |
| Bengaluru City University | Bachelor of Computer Applications (BCA) | 2023 | First Class | Bengaluru, Karnataka |
Build production RAG and CRAG systems with vector search (Qdrant), Cohere reranking, Redis caching, and live data from PostgreSQL, MongoDB, and ERP sources.
Fine-tune open-source LLMs (Mistral, Llama) with LoRA/QLoRA, optimize inference with ONNX, and deploy scalable AI services via FastAPI, Docker, and AWS.
Develop real-time CV pipelines with OpenCV and TensorFlow for facial behavior, eye tracking, object detection, and multimodal video interview systems.
Build conversational voice assistants with Text-to-Speech and Speech-to-Text pipelines, including custom accent TTS fine-tuning with Coqui XTTS and Whisper.
Detect document tampering using DCT, FFT, ELA, EXIF/XMP validation, PDF structural analysis, and Llama-powered explainability reports.
Deliver CFO copilots and financial agents with ERP integration, banking data RAG, and time-series forecasting using Chronos-2, N-HiTS, and Prophet.
Train and deploy classification, regression, and neural network models with PyTorch, TensorFlow, and scikit-learn, from data prep to production inference.
Integrate OpenAI, Anthropic, and Hugging Face models into your products with robust prompting, guardrails, streaming, and cost-optimized API architectures.
Autonomous financial agent with SmolAgent, RAG over ERP/PostgreSQL, Qdrant, Redis caching, and Chronos-2/N-HiTS/Prophet forecasting.
Digital forensics engine using DCT, FFT, ELA, EXIF validation, PDF analysis, and Llama 3 explainability reports.
Real-time multimodal video interviews with LangChain agents, OpenCV/TensorFlow CV, and audio communication analysis.
Voice-enabled RAG chatbot with TTS/STT pipelines and MongoDB for live recruiter job data at Vacanzi.
Coqui XTTS v2 fine-tuned on Indian accent data with Librosa/Whisper preprocessing and ONNX inference optimization.
Fine-tuned Mistral 7B v0.3 with PEFT (LoRA, QLoRA), custom preprocessing pipelines, and domain-specific evaluation.
Available for AI engineering roles, agentic systems, RAG pipelines, and LLM fine-tuning. Based in Bengaluru and open to remote work worldwide.
Chat about Shashi's experience, skills, projects, services, or how to collaborate on AI engineering work.