> [!NOTE]
> Inspired by @moolmohino, I have embarked on a similar journey, committing 1-2 hours daily over the next ? days to deepen my intuition in deep learning.
> - Initially, I was motivated to go for a 100-day challenge. But life happens, so I am now continuing in a disciplined manner instead.
>
> In short, I grew tired of simply building Generative AI applications and managing LLM operations in my current role. The routine of calling an API without deeply understanding what happens underneath often frustrates me.
>
> I started this repository for my personal growth as I learn in public. For context, I graduated with a Bachelor of Science in Business Analytics. Since my undergraduate days, I have known that I have a deep passion for solving problems in tech. However, fixing bugs and building features never truly excited me.
> Instead, I have always been drawn to the more math-heavy aspects of the field, despite having purposefully avoided math for the longest time due to some secondary school "trauma".
>
> That being said, this repository will document my step-by-step journey of getting hands-on experience and building models from scratch. I'll be diving deep into the math from research papers, experiencing plenty of mind-blowing moments along the way, and likely talking to GPT more than to my partner throughout the process.
> [!WARNING]
> I intend to bombard this section with daily logs of my learning journey. If you're reading this from the future, at the end of my challenge, be prepared to scroll through quite a bit of content.
- Day 1, 07/08/24: Exploring Vision Language Model [PaliGemma].
  - Implemented (partial) Siglip Model [Contrastive Vision Encoder].
    - Siglip configurations
    - Siglip vision embeddings (see the sketch after this entry)
  - Paper backlog:
    - Siglip Model [Contrastive Vision Encoder] paper
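
To make the log a bit more concrete, here is a minimal sketch of the vision-embedding step from Day 1, in the spirit of the repo code rather than a copy of it: patchify the image with a strided `Conv2d`, flatten the patch grid into a sequence, and add learned position embeddings. All sizes below are illustrative defaults, not necessarily the configuration used in the actual implementation.

```python
# Hypothetical sizes for illustration; the repo's actual config may differ.
import torch
import torch.nn as nn

class SiglipVisionEmbeddings(nn.Module):
    def __init__(self, hidden_size=768, image_size=224, patch_size=16, num_channels=3):
        super().__init__()
        # A strided Conv2d cuts the image into non-overlapping patches and
        # projects each patch to the hidden size in one shot.
        self.patch_embedding = nn.Conv2d(
            num_channels, hidden_size, kernel_size=patch_size, stride=patch_size
        )
        num_patches = (image_size // patch_size) ** 2
        self.position_embedding = nn.Embedding(num_patches, hidden_size)
        self.register_buffer(
            "position_ids", torch.arange(num_patches).unsqueeze(0), persistent=False
        )

    def forward(self, pixel_values):  # (batch, channels, height, width)
        patches = self.patch_embedding(pixel_values)     # (B, D, H/P, W/P)
        embeddings = patches.flatten(2).transpose(1, 2)  # (B, num_patches, D)
        # Learned position embeddings mark where each patch came from.
        return embeddings + self.position_embedding(self.position_ids)

print(SiglipVisionEmbeddings()(torch.randn(2, 3, 224, 224)).shape)
# torch.Size([2, 196, 768])
```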
- Day 2, 08/08/24: Exploring Vision Language Model [PaliGemma].
  - Implemented (completed) Siglip Model [Contrastive Vision Encoder].
    - Siglip Multi-Head Attention (see the sketch after this entry)
    - Siglip Encoder and Layers
  - Implemented (partial) PaliGemma Model [Input Processor]
    - Image Processor
    - Text Processor
  - Paper backlog:
    - Siglip Model [Contrastive Vision Encoder] paper
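
A rough sketch of the multi-head self-attention that sits inside the Siglip encoder layers: project to Q/K/V, split into heads, apply scaled dot-product attention per head, then merge and project out. Dimensions are again illustrative, not the exact Siglip values.

```python
# Illustrative sizes; not the exact Siglip configuration.
import math
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    def __init__(self, hidden_size=768, num_heads=12):
        super().__init__()
        assert hidden_size % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.q_proj = nn.Linear(hidden_size, hidden_size)
        self.k_proj = nn.Linear(hidden_size, hidden_size)
        self.v_proj = nn.Linear(hidden_size, hidden_size)
        self.out_proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, x):  # (batch, seq_len, hidden_size)
        B, N, _ = x.shape
        # Split the hidden dimension into heads: (B, num_heads, N, head_dim).
        q = self.q_proj(x).view(B, N, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(B, N, self.num_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(B, N, self.num_heads, self.head_dim).transpose(1, 2)
        # Scaled dot-product attention, computed independently per head.
        attn = (q @ k.transpose(-2, -1) / math.sqrt(self.head_dim)).softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, -1)  # merge heads back
        return self.out_proj(out)

print(MultiHeadAttention()(torch.randn(2, 16, 768)).shape)  # torch.Size([2, 16, 768])
```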
- Day 3, 13/08/24: Exploring Recurrent Neural Networks (RNN)
  - Implemented RNN components and a vanilla RNN model (see the sketch below).
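
For reference, a toy version of the vanilla RNN recurrence, h_t = tanh(W_ih x_t + W_hh h_{t-1} + b): the same weights are reused at every timestep, which is the defining property of the model. Sizes here are arbitrary.

```python
import torch
import torch.nn as nn

class VanillaRNN(nn.Module):
    """h_t = tanh(W_ih x_t + W_hh h_{t-1} + b), looped over the sequence."""

    def __init__(self, input_size=10, hidden_size=32):
        super().__init__()
        self.hidden_size = hidden_size
        self.i2h = nn.Linear(input_size, hidden_size)
        self.h2h = nn.Linear(hidden_size, hidden_size)

    def forward(self, x):  # x: (batch, seq_len, input_size)
        batch, seq_len, _ = x.shape
        h = torch.zeros(batch, self.hidden_size, device=x.device)
        outputs = []
        for t in range(seq_len):  # the recurrence: same weights at every step
            h = torch.tanh(self.i2h(x[:, t]) + self.h2h(h))
            outputs.append(h)
        return torch.stack(outputs, dim=1), h  # all hidden states, final state

out, h_n = VanillaRNN()(torch.randn(4, 7, 10))
print(out.shape, h_n.shape)  # torch.Size([4, 7, 32]) torch.Size([4, 32])
```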
- Day 4, 14/08/24: Exploring Vision Transformers (ViT)
  - Implemented ViT
    - ViT Patch Embeddings
    - ViT Vision Embeddings (see the sketch after this entry)
    - ViT Self-Attention
    - ViT MLP
    - ViT Encoder and Layers
    - ViT Classifier
  - Paper backlog:
    - ViT paper
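
A condensed sketch of the ViT embedding stage covered above, which differs from the Siglip one mainly in the learnable [CLS] token: patchify, prepend [CLS], add position embeddings; the classifier head later reads the [CLS] position. Sizes are illustrative, not the repo's exact configuration.

```python
# Illustrative sizes; the repo's ViT config may differ.
import torch
import torch.nn as nn

class ViTEmbeddings(nn.Module):
    def __init__(self, hidden_size=768, image_size=224, patch_size=16):
        super().__init__()
        self.patchify = nn.Conv2d(3, hidden_size, kernel_size=patch_size, stride=patch_size)
        num_patches = (image_size // patch_size) ** 2
        # A learnable [CLS] token is prepended; the classification head
        # reads its final hidden state after the encoder.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, hidden_size))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, hidden_size))

    def forward(self, pixel_values):  # (batch, 3, height, width)
        B = pixel_values.shape[0]
        patches = self.patchify(pixel_values).flatten(2).transpose(1, 2)  # (B, N, D)
        cls = self.cls_token.expand(B, -1, -1)  # one [CLS] token per image
        return torch.cat([cls, patches], dim=1) + self.pos_embed

print(ViTEmbeddings()(torch.randn(2, 3, 224, 224)).shape)
# torch.Size([2, 197, 768]) -- 196 patches + 1 [CLS]
```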
- Day 5, 15/08/24: Exploring variants of Attention Mechanisms
  - Implemented different Attention Mechanisms
    - Self-Attention
    - Multi-Head Self-Attention
    - Multi-Query Attention
    - Grouped-Query Attention (GQA); see the sketch after this entry
    - Cross-Attention
    - Causal Self-Attention
  - Paper backlog:
    - Papers for the attention variants above
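
A sketch of grouped-query attention, the least standard of the variants listed above: there are fewer key/value heads than query heads, and each K/V head is shared by a whole group of query heads (multi-query attention is the special case with a single K/V head). Head counts and sizes are illustrative.

```python
# Illustrative head counts; real models pick their own ratios.
import math
import torch
import torch.nn as nn

class GroupedQueryAttention(nn.Module):
    def __init__(self, hidden_size=768, num_heads=12, num_kv_heads=4):
        super().__init__()
        assert num_heads % num_kv_heads == 0
        self.num_heads, self.num_kv_heads = num_heads, num_kv_heads
        self.head_dim = hidden_size // num_heads
        self.q_proj = nn.Linear(hidden_size, num_heads * self.head_dim)
        self.k_proj = nn.Linear(hidden_size, num_kv_heads * self.head_dim)  # fewer K heads
        self.v_proj = nn.Linear(hidden_size, num_kv_heads * self.head_dim)  # fewer V heads
        self.out_proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, x):  # (batch, seq_len, hidden_size)
        B, N, _ = x.shape
        q = self.q_proj(x).view(B, N, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(B, N, self.num_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(B, N, self.num_kv_heads, self.head_dim).transpose(1, 2)
        # Repeat each K/V head so every group of query heads sees its shared K/V.
        groups = self.num_heads // self.num_kv_heads
        k = k.repeat_interleave(groups, dim=1)
        v = v.repeat_interleave(groups, dim=1)
        attn = (q @ k.transpose(-2, -1) / math.sqrt(self.head_dim)).softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, -1)
        return self.out_proj(out)

print(GroupedQueryAttention()(torch.randn(2, 16, 768)).shape)  # torch.Size([2, 16, 768])
```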