Glimpse PoW

CLIP

Architecture Contrastive Language-Image Pre-training (CLIP) uses a dual-encoder architecture to map images and text into a shared latent space. It works by jointly training two encoders: one for images (a Vision Transformer) and one for text (a Transformer-based language model). Image Encoder: The image encoder extracts salient features fr...
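
To make the dual-encoder objective concrete, here is a minimal sketch of CLIP's symmetric contrastive loss over a batch of already-encoded features. The function name, temperature value, and tensor layout are illustrative assumptions, not the exact OpenAI implementation.

```python
# Minimal sketch of a CLIP-style symmetric contrastive loss.
# Names and the temperature default are illustrative assumptions.
import torch
import torch.nn.functional as F

def clip_loss(image_features, text_features, temperature=0.07):
    # Normalize both modalities so dot products become cosine similarities.
    image_features = F.normalize(image_features, dim=-1)
    text_features = F.normalize(text_features, dim=-1)

    # Pairwise similarity matrix: entry (i, j) compares image i with text j.
    logits = image_features @ text_features.t() / temperature

    # The matched image-text pair for each row/column sits on the diagonal.
    targets = torch.arange(logits.size(0), device=logits.device)

    # Symmetric cross-entropy: images classify their caption, and vice versa.
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2
```

Each image is pushed toward its own caption and away from every other caption in the batch (and symmetrically for texts), which is what aligns the two modalities in the shared latent space.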

NeRFs

PyTorch implementation from scratch Introduction Neural Radiance Fields are a way of storing a 3D scene within a neural network. This way of storing and representing a scene is often called an implicit representation, since the scene parameters are fully represented by the underlying Multi-Layer Perceptron (MLP). (As compared to an explicit repr...
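
As a rough sketch of what an implicit representation looks like in code, the toy model below maps a 3D point through a positional encoding and a small MLP to a colour and a volume density. The frequency count and layer sizes are assumptions chosen for brevity, not the exact architecture from the paper.

```python
# Toy NeRF-style MLP; sizes are illustrative, not the paper's exact network.
import torch
import torch.nn as nn

def positional_encoding(x, num_freqs=10):
    # Map each coordinate to [sin(2^k x), cos(2^k x)] so the MLP can
    # represent high-frequency detail in the scene.
    freqs = 2.0 ** torch.arange(num_freqs, dtype=torch.float32, device=x.device)
    angles = x[..., None] * freqs                  # (..., 3, num_freqs)
    enc = torch.cat([angles.sin(), angles.cos()], dim=-1)
    return enc.flatten(start_dim=-2)               # (..., 3 * 2 * num_freqs)

class TinyNeRF(nn.Module):
    def __init__(self, num_freqs=10, hidden=256):
        super().__init__()
        in_dim = 3 * 2 * num_freqs
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),                  # RGB colour + density sigma
        )

    def forward(self, xyz):
        out = self.mlp(positional_encoding(xyz))
        rgb = torch.sigmoid(out[..., :3])          # colour in [0, 1]
        sigma = torch.relu(out[..., 3:])           # non-negative density
        return rgb, sigma
```

The "scene" here is nothing but the MLP's weights: querying the network at any 3D point returns that point's colour and density, which is exactly the implicit-representation idea the post describes.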

LLD

Principles 1) SOLID
Single Responsibility - a class should have only one responsibility
Open/Closed - software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification
Liskov Substitution Principle (LSP) - objects of a superclass should be replaceable with objects of its subclas...
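
A toy sketch can make the first three principles concrete. In the hypothetical Shape hierarchy below (not from the post), total_area is closed for modification yet open for extension, since new shapes are added by subclassing, and the Liskov principle holds because any Shape subclass can stand in for the base class.

```python
# Illustrative example of Open/Closed and Liskov Substitution.
from abc import ABC, abstractmethod

class Shape(ABC):
    @abstractmethod
    def area(self) -> float: ...

class Rectangle(Shape):
    def __init__(self, width: float, height: float):
        self.width, self.height = width, height

    def area(self) -> float:
        return self.width * self.height

class Circle(Shape):
    def __init__(self, radius: float):
        self.radius = radius

    def area(self) -> float:
        return 3.141592653589793 * self.radius ** 2

def total_area(shapes: list[Shape]) -> float:
    # Open/Closed: adding a Triangle later requires no change here.
    # Liskov: any Shape subclass works wherever a Shape is expected.
    return sum(s.area() for s in shapes)

print(total_area([Rectangle(2, 3), Circle(1.0)]))  # 6 + pi
```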

CF #1004

Intro to RLHF

DPO vs PPO vs GRPO In order to understand what these fancy “alignment” algorithms mean, let’s go back to the basics first - RLHF. There are many applications, such as writing stories where you want creativity, informative text that should be truthful, or code snippets that should be executable. Writing a loss function to captu...
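
Since a hand-written loss for "good text" is hard to specify, preference-based objectives such as DPO optimise a margin between preferred and dispreferred responses instead. Below is a minimal sketch of the standard DPO loss given per-sequence log-probabilities from the policy and a frozen reference model; the argument names and beta value are illustrative assumptions.

```python
# Minimal sketch of the DPO loss on precomputed sequence log-probs.
# Argument names and beta are illustrative assumptions.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Implicit reward of each response: beta * log(pi / pi_ref).
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)

    # Maximise the reward margin of the preferred response over the
    # dispreferred one; the KL constraint is implicit in the reference terms.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```

Unlike PPO, this needs no separately trained reward model or on-policy rollouts: the preference data itself supplies the learning signal.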