hi there 👋, I'm
Pragnyan Ramtha
17, he/him
AI Engineer - Specializing in LLM fine-tuning and AI system design
about me.
Results-driven AI Engineer specializing in Large Language Model (LLM) fine-tuning and AI system design. I build maintainable, production-grade AI systems and work comfortably across cloud infrastructure. I pick up new tools quickly and use AI as a force multiplier in my coding, design, and research loops, which lets me move faster while keeping systems reliable.
experience.
AI Engineering Intern Remote
at Reputation-DAO
Aug 2025 – Jan 2026
- Architected a high-availability serverless MLOps backend on GCP (Cloud Functions, Cloud Run) for production-grade AI orchestration, achieving 99.9% uptime while cutting cloud costs by 70%.
- Engineered a Gemini API response system with optimized prompt caching, cutting API costs and reducing inference latency by 50% across critical workflows.
- Boosted the accuracy of a customer-support bot by developing a RAG pipeline that uses semantic search for real-time documentation retrieval and source attribution.
- GCP
- Gemini API
- RAG
- Serverless
- Python
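The prompt-caching idea above can be sketched in a few lines. This is an illustrative toy, not the production system: the cache keys responses by a hash of the prompt so repeated prompts skip the API call entirely (`call_model` stands in for a real Gemini API call and is a hypothetical placeholder).

```python
import hashlib


class PromptCache:
    """Minimal in-memory cache keyed by a hash of the prompt text."""

    def __init__(self):
        self._store = {}

    def _key(self, prompt: str) -> str:
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def get(self, prompt: str):
        return self._store.get(self._key(prompt))

    def put(self, prompt: str, response: str) -> None:
        self._store[self._key(prompt)] = response


def generate(cache: PromptCache, prompt: str, call_model) -> str:
    """Return a cached response when available; otherwise call the model once."""
    cached = cache.get(prompt)
    if cached is not None:
        return cached
    response = call_model(prompt)
    cache.put(prompt, response)
    return response
```

In a real deployment the cache would live in a shared store (e.g. Redis or the provider's native context caching) rather than process memory, but the cost-saving mechanism is the same: identical prompts pay for inference once.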
GitHub Contributor Remote
Jan 2025 – Present
- Served as a key contributor to 30+ open-source projects, focusing on bug fixes and core feature development.
- Identified a bottleneck in Scrapy causing severe latency and authored a fix that eliminated it, achieving a 2x speedup for affected workflows.
- Winner of the IEEE Summer of Code (IEEESoC) Hackathon 2025 for open-source contributions to multiple projects.
- Python
- TypeScript
- Git
- Docker
- CI/CD
projects.
AIMO-3: Efficient Reasoning via LLM Fine-Tuning
- Fine-tuned Phi-4 (14B) on chain-of-thought (CoT) and tool-integrated reasoning (TiR) datasets to optimize multi-step problem solving and tool-use efficiency.
- Achieved 90% accuracy on reasoning benchmarks, rivaling 70B-parameter models while using significantly less compute.
- Phi-4
- Fine-tuning
- CoT/TiR
- PEFT
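Why PEFT makes a 14B fine-tune tractable can be shown with a back-of-the-envelope calculation. This sketch (my own illustration, not the project's training code) compares full fine-tuning of one weight matrix against LoRA, which freezes the matrix and trains two low-rank factors instead.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int):
    """Compare trainable parameters for one weight matrix.

    Full fine-tuning updates all d_out x d_in weights; LoRA freezes them
    and trains two low-rank factors, B (d_out x rank) and A (rank x d_in),
    so the effective update is W + B @ A.
    """
    full = d_out * d_in
    lora = rank * (d_in + d_out)
    return full, lora


# A 4096x4096 projection at rank 16: LoRA trains well under 1% of the
# parameters that full fine-tuning would touch.
full, lora = lora_param_counts(4096, 4096, 16)
```

QLoRA pushes this further by keeping the frozen base weights in 4-bit precision, which is what lets a 14B model fit on a single consumer GPU during training.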
Personality Clone
- Fine-tuned a large language model with PEFT (QLoRA) and contrastive learning on private conversational data to emulate my personal response style.
- Implemented a Siamese network architecture with a cosine-similarity loss, improving semantic embeddings and achieving 92% accuracy in replicating my response style, a 28% improvement over baseline models.
- TensorFlow
- Python
- CUDA
- Transformers
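The core of the cosine-similarity objective can be sketched without any framework. This is a simplified stand-in for the actual TensorFlow loss: same-style pairs are pulled toward similarity 1, different-style pairs are penalized only when their similarity exceeds a margin (the margin value here is illustrative).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def cosine_contrastive_loss(emb_a, emb_b, same_style: bool, margin: float = 0.5):
    """Pull same-style pairs together; push different-style pairs apart."""
    sim = cosine_similarity(emb_a, emb_b)
    if same_style:
        return 1.0 - sim           # loss is 0 when the pair is identical
    return max(0.0, sim - margin)  # penalize only above the margin
```

In the Siamese setup, both embeddings come from the same shared-weight encoder, so minimizing this loss shapes one embedding space rather than two.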
Autopilot
- Engineered an AI-driven OS automation system that leverages function calling and tool-use paradigms to execute complex natural-language tasks as low-level automation.
- Built a Reasoning + Acting (ReAct) agent framework with command sandboxing, reducing execution errors and achieving 45% faster task completion than manual workflows.
- Python
- LLM Agents
- APIs
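The command-sandboxing idea can be sketched as an allowlist gate in front of each agent action. This is a minimal illustration under my own assumptions (the allowlist contents and `react_step` signature are hypothetical, not Autopilot's actual interface): the agent's proposed shell command is tokenized and rejected unless its executable is explicitly permitted.

```python
import shlex

# Hypothetical allowlist of executables the agent may invoke.
ALLOWED_COMMANDS = {"ls", "cat", "echo", "grep"}


def sandbox_check(command: str) -> bool:
    """Reject any shell command whose executable is not on the allowlist."""
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes etc.
        return False
    return bool(tokens) and tokens[0] in ALLOWED_COMMANDS


def react_step(thought: str, action: str):
    """One Reasoning + Acting step: record the thought, gate the action."""
    if not sandbox_check(action):
        return (thought, None, "blocked: command not in sandbox allowlist")
    return (thought, action, "approved")
```

Gating at the action boundary is what makes the loop safe to run unattended: a hallucinated `rm -rf /` becomes an observation ("blocked") the agent can reason about on the next step, instead of an executed command.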
research papers.
Scaling Context Windows to Infinity: A Comprehensive Study of Position Encoding, Attention Mechanisms, Memory-Efficient Inference, and Context Reduction Techniques in Large Language Models 2026
Read paper Academia.edu
A comprehensive analysis of techniques for extending context windows in large language models, examining position encoding strategies, efficient attention mechanisms, and memory-optimized inference approaches to enable processing of arbitrarily long sequences.
Unlocking Societal Trends in Aadhaar Enrolment and Updates: Anomaly Detection and Fraud Risk Prediction 2026
Read paper Academia.edu
A data-driven approach to identify suspicious patterns in India's Aadhaar biometric identification system, utilizing machine learning for anomaly detection and fraud risk prediction in enrollment and update processes.
Speeding Up LLM Inference Using Quantum Computing Techniques 2026
Read paper Under Research
Exploring quantum-inspired algorithms and quantum computational primitives to accelerate inference in large language models, investigating quantum annealing for attention mechanisms and variational quantum circuits for efficient token generation.
technical skills.
Languages:
Python, C++, Bash, SQL, TypeScript
AI/ML Tools:
PyTorch, Transformers, Unsloth, NumPy, Pandas, Scikit-learn, PEFT/QLoRA, CUDA
Infrastructure:
GCP, Azure, Docker, Linux (Arch), Git
Certifications:
Machine Learning Certification (Stanford), CS50: Introduction to Computer Science (Harvard)
Achievements:
- Artificial Intelligence Mathematical Olympiad (AIMO) Silver Medalist
- Authored 2 Research Papers on Modern AI Optimization Techniques
- Winner, IEEE Summer of Code (IEEESOC) Hackathon 2025
- Winner, Empathy Encryption Hackathon 2025
- Winner, Daydream Hyderabad @ Hackclub 2025
- Top 0.5% Finalist, Shell AI Hackathon 2025
Developer Tools:
uv, Neovim, Arch Linux
Pragnyan Ramtha · 2026