// Hello, I'm

Shriniwas Ramesh Suram

Specializing in enterprise-scale Generative AI system architecture and distributed ML infrastructure. Proven track record of spearheading “0-to-1” organizational AI adoption.

100K+

Daily Active Users

AI Platform

$4M+

Revenue Impact

Annual

1M+

AI Interactions

Daily

45%

Latency Reduction

Optimization

About Me

I'm passionate about crafting enterprise-scale AI systems that blend cutting-edge research with robust engineering. My favorite work lies at the intersection of AI architecture and distributed systems, creating solutions that not only perform exceptionally but are also meticulously built for scale and reliability.

Currently, I'm a Lead AI Software Engineer at U.S. News & World Report, where I act as the primary point of contact for cross-functional product teams, leading end-to-end design and execution of centralized AI hosting platforms serving millions of users daily.

In the past, I've had the opportunity to build AI systems at Forbes and Greedy AI, working on everything from propensity models to agentic frameworks. Additionally, I'm pursuing a PhD focusing on AI-driven HCI and enterprise AI adoption.

Generative AI

LLMs, LVMs, RAG, Embeddings, Prompt Engineering

Agentic Systems

LangGraph, LangChain, Multi-Agent Frameworks

Model Optimization

Quantization, Distillation, TensorRT, RLHF

Distributed Systems

MLOps, Data Pipelines, Cloud Infrastructure

Education

In Progress

PhD in Information Technology

University of the Cumberlands

Specializing in Cybersecurity and Artificial Intelligence

Completed

Master of Science in IT Management

The University of Texas at Dallas

Information Technology and Management

Completed

Bachelor of Engineering

Pune University

Mechanical Engineering

Research Focus

  • AI Impact Assessment in Cybersecurity & IT Compliance
  • Design Thinking & Human-Computer Interaction
  • Enterprise Data Strategy & AI-Driven Outcomes
  • Information Governance for AI-Automated Systems

Experience

Lead AI Software Engineer @ U.S. News & World Report

Sep 2024 – Present

  • Architected a centralized AI platform powering GenAI features across 5+ product lines and directed a distributed pod of internal and vendor SWEs.
  • Engineered a multi-modal Generative AI system scaling to 100K+ daily users, driving ~$4M in projected annual revenue.
  • Scaled a production Live AI Agent to 1M+ daily interactions with sub-second latency through INT8/FP16 quantization, cutting cloud costs by 30%.
  • Developed AI Agent workflows and automated Model Evaluation (XAI) frameworks, saving 2,000+ manual hours and $80K+ in OpEx.
LangChain & LangGraph Framework, LLM Inference (vLLM/TensorRT), Model Quantization (INT8), MLOps Architecture, Python, React/TypeScript Full-Stack, Kubernetes

Featured Projects

Hybrid AI Development & Inference Platform

Deployed an on-premise AI environment serving fine-tuned models (Gemma, Qwen) with a high-throughput, multi-modal data ingestion pipeline processing 1TB+ of unstructured data daily from Jira, Git, and Confluence, supporting continuous model fine-tuning and all internal CI/CD and AI workloads.

1TB+/Month

Data Processed

40%

Cost Reduction

5+

Models Served

On-Premise, Kubernetes, Kafka, OpenAI, Spark, Fine-tuning

Autonomous Multi-Agent System

Engineered a multi-agent 'digital twin' that became the company-wide standard for AI agents, incorporating built-in circuit breakers and human-in-the-loop validation for high-risk actions to ensure zero critical production incidents from automated tasks.

+30%

Vulnerability Detection

-50%

Security Incidents

Zero

Production Incidents

Agentic AI, RLHF, System Design, Human-in-the-Loop, Security

Enterprise GenAI Platform

Led the design and execution of a centralized AI hosting platform enabling seamless integration of GenAI features across 5+ product lines at U.S. News & World Report.

100K+

Daily Users

$4M+

Revenue Impact

5+

Product Lines

GenAI, Multi-modal, Platform Engineering, Python, Cloud

Real-time AI Agent at Scale

Grew a production Live AI Agent to 1M+ daily user interactions, orchestrating model iteration, infrastructure scaling, and real-time monitoring to sustain sub-second response times and high user satisfaction at massive scale.

1M+

Daily Interactions

45%

Latency Reduction

30%

Cost Reduction

LLM, Embeddings, GPU Optimization, Real-time, Monitoring

Technical Skills

GCP (Vertex AI, GKE, BigQuery): 95%
AWS (SageMaker, S3, Lambda): 90%
Microsoft Azure: 75%
Route 53, RDS Aurora, DynamoDB: 85%

Technology Stack

Python, TypeScript, JavaScript, SQL, GraphQL, TensorFlow, PyTorch, HuggingFace, OpenAI, LangChain, LangGraph, RAG, Embeddings, Transformers, RLHF, Kubernetes, Docker, Terraform, Airflow, Jenkins, GCP, AWS, Azure, BigQuery, SageMaker, React, Next.js, Node.js, Django, Spark, PostgreSQL, Redis, Kafka, Git, Linux

Certifications

TensorFlow Developer Certificate

Google

Machine Learning Engineer

Google

Data Engineer

Google

Solutions Architect

AWS

Machine Learning

AWS

5+

Cloud Certifications

30+

Technologies

// What's Next?

Get In Touch

I'm currently open to new opportunities and exciting AI challenges. Whether you have a question, want to discuss a project, or just want to say hi, my inbox is always open!

Let's Connect

I'm passionate about building AI systems that make a real impact. If you're looking for someone to lead your AI initiatives or want to collaborate on innovative projects, I'd love to hear from you.

United States

Send a Message