MuhammadBinUsman03/README.md

Hello, I'm MuhammadBinUsman03 👋

I'm a Machine Learning Engineer, currently contributing to the LLM space and helping to democratize it. I explore recent advancements and research in the domain, experiment with them, share insights, and build side projects.

🚀 My Expertise

| Domain | Frameworks |
|---|---|
| Machine Learning | PyTorch, Scikit-learn, Pandas, matplotlib, NumPy |
| LLM-Training | Axolotl, UnslothAI |
| LLM-Inferencing | Llama.cpp, vLLM, NVIDIA TensorRT-LLM, LMDeploy |
| Data-Generation | Distilabel |
| RAG | LangChain, LlamaIndex |
| DevOps | AWS, Google Cloud, Docker |
| Development | Python, JavaScript, NodeJS, PHP, Laravel, C++, Java |

Models / Projects

  • 3-Pipeline LLMOps System - The training pipeline fine-tunes the model on serverless GPU infrastructure and logs checkpoints to the WandB Registry. The streaming pipeline ingests data from a live source, then processes, embeds, and stores it in the Qdrant vector DB; it is deployed on AWS using GitHub CI/CD. The inference pipeline loads the model from the model registry, calls the LLM with context, and maintains chat history.
  • OrpoLlama3-8B - Surpassed Llama3-8B by 2 points on the OpenLLM Leaderboard with 15K steps of ORPO training on 1xA100.
  • apollo-preview-v0.2 - RP/Creative writing/Instruction following dataset curated in collaboration with QuasarResearch.
  • QueryRouter - Dynamic routing system deployed on AWS for querying LLMs, boosting efficiency, and optimizing costs.
  • AutoPrune - Automatic pruning of LLMs on Runpod-GPUs.
  • Chain-QnA - RAG application deployed with LangServe as a REST-API for Basic/PDF QnA.
  • ImageGallery-Microservices - A microservices architecture-based Google Photos clone, deployed on Google Cloud.
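
The inference flow described for the 3-Pipeline LLMOps System (call the LLM with retrieved context and keep chat history) can be sketched roughly as below. This is only an illustrative sketch, not the project's actual code: `retrieve_context`, `InferencePipeline`, and the `llm` callable are hypothetical names, retrieval is approximated by simple word overlap instead of a Qdrant vector search, and loading the model from a registry is left out entirely.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def retrieve_context(query: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Rank documents by word overlap with the query.

    Hypothetical stand-in for a vector-DB similarity search.
    """
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(doc.lower().split())), doc) for doc in documents]
    # Keep only documents that share at least one word, highest overlap first.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [doc for _, doc in ranked[:top_k]]


class InferencePipeline:
    """Builds a prompt from context plus chat history and calls an LLM."""

    def __init__(self, llm: Callable[[str], str], documents: List[str], max_turns: int = 5):
        self.llm = llm  # any callable mapping a prompt string to an answer string
        self.documents = documents
        # Bounded history: the oldest turn is dropped once max_turns is exceeded.
        self.history: Deque[Tuple[str, str]] = deque(maxlen=max_turns)

    def build_prompt(self, query: str) -> str:
        context = "\n".join(retrieve_context(query, self.documents))
        past = "\n".join(f"User: {q}\nAssistant: {a}" for q, a in self.history)
        return f"Context:\n{context}\n\nHistory:\n{past}\n\nUser: {query}\nAssistant:"

    def ask(self, query: str) -> str:
        answer = self.llm(self.build_prompt(query))
        self.history.append((query, answer))
        return answer
```

In a real deployment the `llm` callable would wrap the fine-tuned model and `retrieve_context` would query the vector store; both are kept as plain Python here so the sketch runs on its own.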

📫 Get in Touch

LinkedIn Hugging Face Medium X Substack

Pinned

  1. Real-Time-3-pipeline-LLM-Financial-Advisor Public

    3-Pipeline LLMOps financial advisor. The streaming pipeline, deployed on AWS, collects and embeds live data into QdrantDB 24/7. The training pipeline fine-tunes the model on a serverless GPU and logs the best model on Wan…

    Python · 16 stars · 4 forks

  2. Query-Router Public

    Dynamic routing system for querying LLMs, boosting efficiency and optimizing costs

    Jupyter Notebook 1

  3. Auto-Prune Public

    Shell

  4. Chain-QnA Public

    RAG application encapsulated in LangServe API for Basic/PDF QnA

    Python

  5. ImageGallery-Microservice Public

    A microservices architecture-based Google Photos clone, deployed on Google Cloud.

    JavaScript

  6. Web-Spider Public

    PHP