Munish Kaushik

One-Hot Encoding, Dot Product and Matrices multiplication: The Basics of Transformers

Introduction In the world of natural language processing (NLP), everything begins with words. However, computers don’t understand words directly – they need numbers. Our first task is to convert words into numerical representations so that we can perform mathematical operations on them. This is especially important when building systems like voice-activated assistants, where we need to transform sequences of sounds into sequences of words. To achieve this, we start by defining a Vocabulary, which is the set of symbols (or words) we’ll be working with. For simplicity, let’s assume we’re working with English, which has tens of thousands of words,…
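The idea in this teaser — mapping each word in a fixed vocabulary to a numerical vector — can be sketched with a minimal one-hot encoder. The toy vocabulary below is a hypothetical stand-in for the tens of thousands of English words the article mentions:

```python
import numpy as np

# Hypothetical toy vocabulary; a real system would hold tens of thousands of words.
vocab = ["the", "cat", "sat", "on", "mat"]
word_to_index = {word: i for i, word in enumerate(vocab)}

def one_hot(word: str) -> np.ndarray:
    """Return a vector of zeros with a single 1 at the word's vocabulary index."""
    vec = np.zeros(len(vocab))
    vec[word_to_index[word]] = 1.0
    return vec

# The dot product of two one-hot vectors is 1 for the same word, 0 otherwise —
# the basic arithmetic that matrix multiplication in transformers builds on.
print(one_hot("cat") @ one_hot("cat"))  # 1.0
print(one_hot("cat") @ one_hot("mat"))  # 0.0
```

Because each vector has exactly one nonzero entry, multiplying a matrix of one-hot rows against an embedding matrix simply selects rows — which is why one-hot encoding, dot products, and matrix multiplication are introduced together.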

Fine-Tuning Large Language Models: A Technical Overview

The prowess of Large Language Models (LLMs) like LLAMA 7B and ChatGPT in mimicking human-like text has been a game-changer in AI. Yet their broad capabilities often fall short in specialized tasks. Fine-tuning bridges this gap, applying principles from transfer learning to tailor LLMs to specific domains. Why Fine-Tuning? Drawing from Transfer Learning: Consider a Convolutional Neural Network trained on general image recognition. While adept across a range of objects, it falls short at distinguishing dog breeds. Here, transfer learning refines the model by fine-tuning it on a dataset focused on dog breeds, enhancing its specificity. Similarly,…

Foundation of AI Brilliance: Unpacking Pre-Training of Large Language Models

In the mesmerizing realm of Artificial Intelligence, the journey of a Large Language Model (LLM) from a nascent stage to a wise oracle capable of understanding and generating human-like text is nothing short of a marvel. At the heart of this journey lies the process of Pre-Training—a phase of paramount importance that shapes the core intelligence of LLMs like ChatGPT. This article aims to demystify Pre-Training, offering insights that cater to both AI novices and data science veterans, while also highlighting the broader implications, including environmental considerations. Understanding Pre-Training: Pre-Training is the initial learning phase where a model, such as…

Layers of Generative AI: Pre-Training, Fine-Tuning, and Retrieval Augmented Generation

In the rapidly evolving landscape of artificial intelligence, Generative AI stands out, driving innovations across various sectors. This transformative technology is built on foundational processes like Pre-Training, Fine-Tuning, and Retrieval Augmented Generation (RAG). Today, let’s explore these processes not just for their technical intricacies but also through the lens of cost and time investment, key factors that shape the deployment and scalability of these AI solutions. Pre-Training: The Costly Foundation Pre-Training is where a model learns from a vast array of data, gaining a broad understanding of language, concepts, or images. This stage is very similar to setting up the…

Pandas Qcut – A New Approach to Creating Bins in Pandas

What do you do when you have a whole lot of numerical variables, all on varying scales? It becomes difficult to analyze data in these situations. The first thing that we as Data Scientists do is try to divide the data into bins, mostly using histograms. Histograms are intuitive, dividing the data into equal-width bins. But once we have these bins, how do we use them? That's where quartiles, or quantiles, come into the picture. With quantiles we can essentially turn a numerical feature into a categorical feature. Let's see how quantiles used to be calculated earlier. It…
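The quantile-based binning the teaser describes is what `pd.qcut` does: it splits a numerical feature into bins with roughly equal counts, yielding a categorical feature. A minimal sketch on hypothetical data:

```python
import numpy as np
import pandas as pd

# Hypothetical data: 100 values on an arbitrary scale.
values = pd.Series(np.arange(100))

# qcut bins by quantile (equal counts per bin), unlike pd.cut,
# which bins by equal-width value ranges.
quartiles = pd.qcut(values, q=4, labels=["Q1", "Q2", "Q3", "Q4"])

# Each of the four bins holds exactly 25 of the 100 values.
print(quartiles.value_counts())
```

The result is a pandas `Categorical`, so it can be grouped, counted, and plotted like any other categorical feature.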

Understanding Singular Value Decomposition

Singular Value Decomposition, or SVD as it's fondly called, is one of the most popular methods for dimensionality reduction. It has various applications, ranging from image compression and recommender engines to solving matrix equations. Here, I will try to help you understand the intuition behind SVD and then build a real-world example in R. I have also written a similar script for Python, and the link for it can be found at the end of the article. I assume you have a basic understanding of matrix algebra while reading this article. Though the example should help you understand…
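The article's worked example is in R, but the decomposition itself is easy to sketch in a few lines of NumPy (a hypothetical toy matrix stands in for real data): SVD factors a matrix A into U, a diagonal of singular values S, and Vᵀ, and keeping only the top singular values gives the low-rank approximation used for compression.

```python
import numpy as np

# A small hypothetical matrix; think of rows as observations, columns as features.
A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

# Thin SVD: A = U @ diag(S) @ Vt, with singular values in S sorted descending.
U, S, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the top-k singular values for a low-rank approximation —
# the mechanism behind SVD-based image compression.
k = 1
A_approx = U[:, :k] * S[:k] @ Vt[:k, :]

# Using all components reconstructs A exactly.
print(np.allclose(U * S @ Vt, A))  # True
```

`U * S` scales each column of U by the corresponding singular value, which is equivalent to `U @ np.diag(S)` but cheaper.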