hi there 👋, I'm
Pragnyan Ramtha
17, he/him
AI/ML Engineer specializing in Cost-Efficient Reasoning Systems & LLM Fine-Tuning. Medalist @ AIMO3.
about me.
Results-driven AI/ML Engineer specializing in Large Language Model (LLM) fine-tuning and AI system design. I build maintainable, production-grade AI systems and am comfortable working deep in cloud infrastructure. I pick up new tools quickly and use AI as a force multiplier in my coding, design, and research loops, which lets me move much faster while keeping systems reliable.
experience.
AI Engineering Intern (Remote)
at Reputation-DAO
Aug 2025 – Jan 2026
- Architected a high-availability serverless MLOps backend on GCP (Cloud Functions, Cloud Run) for production-grade AI orchestration, achieving 99.9% uptime while cutting cloud costs by 70%.
- Engineered a Gemini API response system with optimized prompt caching, significantly cutting API costs and reducing inference latency by 50% across critical workflows.
- Boosted the accuracy of a customer-support bot by developing a RAG pipeline that uses semantic search for real-time documentation retrieval and source attribution.
- GCP
- Gemini API
- RAG
- Serverless
- Python
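The prompt-caching idea from this role can be sketched with a minimal, hypothetical wrapper (illustrative only, not Reputation-DAO's actual code): byte-identical prompts are hashed and served from a local cache instead of triggering a second paid API call.

```python
import hashlib

_cache: dict[str, str] = {}

def cached_generate(prompt: str, generate_fn) -> str:
    """Return a cached response for a byte-identical prompt; otherwise call
    the model once and store the result (hypothetical sketch)."""
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key not in _cache:
        _cache[key] = generate_fn(prompt)
    return _cache[key]

calls = []  # records how often the (stand-in) model is actually invoked

def fake_model(prompt: str) -> str:
    calls.append(prompt)
    return f"answer:{prompt}"
```

In a real deployment the `generate_fn` would be the Gemini API call; the cache turns repeated identical prompts into free lookups.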
GitHub Contributor (Remote)
Jan 2025 – Present
- Served as a key contributor to 30+ open-source projects, focusing on bug fixes and core feature development.
- Identified a bottleneck in Scrapy causing severe latency and authored a fix that eliminated it, achieving a 2x speedup for affected workflows.
- Won the IEEE Summer of Code (IEEESoC) Hackathon 2025 for open-source contributions to multiple projects.
- Python
- TypeScript
- Git
- Docker
- CI/CD
projects.
AIMO-3: Efficient Reasoning via LLM Fine-Tuning
- Fine-tuned Phi-4 (14B) on chain-of-thought (CoT) and tool-integrated reasoning (TiR) datasets to optimize multi-step problem solving and tool-use efficiency.
- Achieved 90% accuracy on reasoning benchmarks, rivaling 125B-parameter models while using significantly less compute.
- Phi-4
- Fine-tuning
- CoT/TiR
- PEFT
Personality Clone
- Fine-tuned a large language model with PEFT (QLoRA) and contrastive learning on private conversational data to emulate my personal response style.
- Implemented a Siamese network architecture with a cosine similarity loss, improving semantic embeddings and achieving 92% accuracy in replicating my response style, a 28% improvement over baseline models.
- TensorFlow
- Python
- CUDA
- Transformers
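The Siamese + cosine-similarity setup above can be sketched in plain NumPy (an illustrative toy, not the project's TensorFlow implementation; the encoder and margin are hypothetical): both inputs pass through one shared encoder, and the loss pulls matching-style pairs toward cosine similarity 1 while pushing non-matching pairs below a margin.

```python
import numpy as np

def encode(x, W):
    """Shared encoder: both branches of the Siamese pair use the same weights W."""
    return np.tanh(x @ W)

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def cosine_contrastive_loss(x1, x2, W, same_style, margin=0.4):
    """~(1 - cos) for matching pairs; max(0, cos - margin) for non-matching."""
    s = cosine_similarity(encode(x1, W), encode(x2, W))
    return (1.0 - s) if same_style else max(0.0, s - margin)

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))          # toy encoder weights
a, b = rng.normal(size=8), rng.normal(size=8)
```

Identical inputs embed identically, so the matching-pair loss collapses toward zero; training nudges same-style pairs toward that regime.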
Autopilot
- Engineered an AI-driven OS automation system leveraging function calling and tool-use paradigms to execute complex natural-language tasks as low-level automation.
- Built a Reasoning + Acting (ReAct) agent framework with command sandboxing, reducing execution errors and completing tasks 45% faster than manual workflows.
- Python
- LLM Agents
- APIs
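The command-sandboxing idea behind Autopilot can be illustrated with a minimal allowlist wrapper (a hypothetical sketch; the real agent's policy is more involved): the agent's proposed shell command is parsed, rejected unless its binary is explicitly allowed, and bounded by a hard timeout.

```python
import shlex
import subprocess

# Hypothetical allowlist: only these binaries may run; everything else is rejected.
SAFE_COMMANDS = {"ls", "cat", "echo", "pwd"}

def sandboxed_run(command: str, timeout: float = 5.0) -> str:
    """Parse the agent's proposed command, reject anything outside the
    allowlist, and run the rest with a hard timeout."""
    argv = shlex.split(command)
    if not argv or argv[0] not in SAFE_COMMANDS:
        return f"REJECTED: '{argv[0] if argv else ''}' is not in the allowlist"
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return result.stdout.strip() or result.stderr.strip()
```

In a ReAct loop, each "act" step would route the model's tool call through `sandboxed_run`, so a hallucinated destructive command is rejected instead of executed.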
github contributions.
blogs.
Why OpenAI Sent Me $500 for a Research Project (Apr 2026)
Read full post (8 min read)
- A complete breakdown of how I built XSA4 + EMA + GPTQ-Int6, the submission that landed at 1.1271 BPB and placed top-5 globally in OpenAI's Parameter Golf challenge.
- Every technique explained from first principles: what bits-per-byte is, how Cross-Sparse Attention works, why EMA produces better weights than a single checkpoint, and what GPTQ actually does differently from naive quantization.
- Parameter Golf
- OpenAI
- Cross-Sparse Attention
- GPTQ
- Model Compression
How I Reached #1 on ARC-AGI-2 (Apr 2026)
Read full post (7 min read)
- How I adapted the parallel agent and budget allocation patterns I built for AIMO3 to reach the top of the ARC Prize 2026 leaderboard, using per-puzzle test-time training, a vocabulary-restricted DFS beam search with KV cache reuse, and augmented re-scoring.
- A breakdown of the full pipeline: what I tried, what finally worked, and the three engineering tricks that made the difference.
- ARC-AGI-2
- Test-Time Training
- Qwen
- Kaggle
- Reasoning
How I Won a Solver Medal at AIMO3 (Apr 2026)
Read full post (6 min read)
- A walkthrough of the agentic system I built for AIMO3, the $2.2M Kaggle competition to make AI solve International Mathematical Olympiad problems, using Gemma-4-31B, parallel sandboxed code execution, weighted voting, and a budget-aware time allocator.
- What I tried, what flopped, and the three engineering tricks that actually moved the needle.
- AIMO3
- Gemma 4
- Math Reasoning
- Agentic LLMs
- Kaggle
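The weighted-voting step mentioned in the AIMO3 post can be sketched as follows (illustrative only; the answers and confidence weights are made up): each parallel solver submits a candidate answer with a weight, and the answer with the greatest total weight is selected.

```python
from collections import defaultdict

def weighted_vote(candidates):
    """candidates: list of (answer, weight) pairs, e.g. weight = solver
    confidence. Returns the answer with the highest total weight."""
    totals = defaultdict(float)
    for answer, weight in candidates:
        totals[answer] += weight
    return max(totals, key=totals.get)
```

With candidates `[(42, 0.9), (7, 0.4), (42, 0.3), (7, 0.5)]`, the totals are 1.2 for 42 and 0.9 for 7, so 42 wins even though the answers are split 2-2 by count.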
research papers.
Scaling Context Windows to Infinity: A Comprehensive Study of Position Encoding, Attention Mechanisms, Memory-Efficient Inference, and Context Reduction Techniques in Large Language Models 2026
Read paper on Academia.edu
A comprehensive analysis of techniques for extending context windows in large language models, examining position encoding strategies, efficient attention mechanisms, and memory-optimized inference approaches to enable processing of arbitrarily long sequences.
Unlocking Societal Trends in Aadhaar Enrolment and Updates: Anomaly Detection and Fraud Risk Prediction 2026
Read paper on Academia.edu
A data-driven approach to identify suspicious patterns in India's Aadhaar biometric identification system, utilizing machine learning for anomaly detection and fraud risk prediction in enrollment and update processes.
Speeding Up LLM Inference Using Quantum Computing Techniques 2026
Under research
Exploring quantum-inspired algorithms and quantum computational primitives to accelerate inference in large language models, investigating quantum annealing for attention mechanisms and variational quantum circuits for efficient token generation.
technical skills.
Languages:
Python, C++, Bash, SQL, TypeScript
AI/ML Tools:
PyTorch, Transformers, Unsloth, NumPy, Pandas, Scikit-learn, PEFT/QLoRA, CUDA
Infrastructure:
GCP, Azure, Docker, Linux (Arch), Git
Certifications:
Machine Learning (Stanford), CS50: Introduction to Computer Science (Harvard University)
Achievements:
- Artificial Intelligence Mathematical Olympiad (AIMO) Silver Medalist
- Authored 2 Research Papers on Modern AI Optimization Techniques
- Winner, IEEE Summer of Code (IEEESoC) Hackathon 2025
- Winner, Empathy Encryption Hackathon 2025
- Winner, Daydream Hyderabad @ Hackclub 2025
- Top 0.5% Finalist, Shell AI Hackathon 2025
Developer Tools:
uv, Neovim, Arch Linux
Pragnyan Ramtha · 2026