- Transfer learning: 'storing knowledge gained while solving one problem and applying it to a different but related problem. For example, knowledge gained while learning to recognize cars could apply when trying to recognize trucks'
- Reinforcement learning:
- 'focus is on finding a balance between exploration (of uncharted territory) and exploitation (of current knowledge).'
- No need for labelled data or explicit correction; the agent learns on its own from the rewards it receives.
- 'concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward'
- ReLU: Rectified linear unit
- A popular activation function in ANNs
f(x) = max(0, x)
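A minimal NumPy sketch of ReLU, which simply zeroes out negative inputs:
```python
import numpy as np

def relu(x):
    # Element-wise max(0, x): negatives become 0, positives pass through
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # [0.  0.  0.  1.5]
```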
- Meta-learning: 'Learning to learn'
- Support vector machine (SVM)
- For (linear) classification problems
- Supervised training
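A minimal sketch of training a linear SVM, assuming scikit-learn and a toy two-class dataset:
```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Toy (roughly) linearly separable data: two blobs, two classes
X, y = make_blobs(n_samples=100, centers=2, random_state=0)

clf = SVC(kernel="linear")  # linear kernel => linear decision boundary
clf.fit(X, y)               # supervised: labels y are required

print(clf.predict(X[:5]))
```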
- Perceptron (Rosenblatt, building on the McCulloch-Pitts neuron)
- Linear classification
- Points of different classes drawn on a graph can be demarcated by straight lines
- Support vector machines (SVM): Can do linear classification
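A NumPy sketch of the classic perceptron update rule on made-up linearly separable points (the data and pass count are arbitrary):
```python
import numpy as np

# Toy linearly separable data; labels in {-1, +1}
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1, 1, -1, -1])

w, b = np.zeros(2), 0.0
for _ in range(10):                 # a few passes over the data
    for xi, yi in zip(X, y):
        if yi * (w @ xi + b) <= 0:  # misclassified point
            w += yi * xi            # nudge the separating line towards it
            b += yi

print(w, b)  # the learned line is w·x + b = 0
```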
- Forecasting algorithms
- ARIMA (Auto-regressive integrated moving average)
- SARIMA (Seasonal auto-regressive integrated moving average)
- Autoregression
- Moving average
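A minimal forecasting sketch, assuming statsmodels and a made-up series; order=(p, d, q) gives the autoregressive, differencing, and moving-average orders:
```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Made-up series: random walk plus a slight trend
series = np.cumsum(np.random.randn(100)) + 0.1 * np.arange(100)

# p=1 autoregressive lag, d=1 differencing, q=1 moving-average lag
model = ARIMA(series, order=(1, 1, 1)).fit()
print(model.forecast(steps=5))  # next 5 predicted values
```
For SARIMA, the same ARIMA class additionally takes a seasonal_order=(P, D, Q, s) argument.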
- Ablation study
- Ablation => removing a component of the model to see how its performance varies
- Useful to gauge the necessity/contribution of a component/architecture/parameter.
- An ablation study can show that a newly added aspect does indeed make the model better.
- Build the model with and without the new aspect, then compare to show that performance is better with it.
- Eg: Changing activation function
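A PyTorch sketch of the setup: build the same model with and without the component under study (here, a hypothetical extra hidden layer; sizes are arbitrary) and compare:
```python
import torch.nn as nn

def make_model(use_extra_layer: bool) -> nn.Sequential:
    layers = [nn.Linear(16, 32), nn.ReLU()]
    if use_extra_layer:  # the component being ablated
        layers += [nn.Linear(32, 32), nn.ReLU()]
    layers += [nn.Linear(32, 2)]
    return nn.Sequential(*layers)

full_model = make_model(use_extra_layer=True)
ablated_model = make_model(use_extra_layer=False)
# Train and evaluate both identically; the gap in scores estimates
# the contribution of the extra layer.
```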
- Continuous learning
- Train a model, but the training never stops.
- Allow the model to learn new things by retraining it continually, so it can deal with new kinds of data
- Without losing the knowledge gained from previous training
- Allows the model to adapt to changing needs without being retrained from scratch (rough sketch below).
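A rough sketch of the retraining loop using scikit-learn's incremental partial_fit API (the data stream is made up, and real continual learning also needs machinery to avoid catastrophic forgetting):
```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

def stream_of_batches(n_batches=5):
    # Stand-in for new data arriving over time
    for _ in range(n_batches):
        X = rng.normal(size=(32, 4))
        y = (X[:, 0] > 0).astype(int)
        yield X, y

clf = SGDClassifier()
for i, (X_batch, y_batch) in enumerate(stream_of_batches()):
    if i == 0:
        # all classes must be declared on the first call
        clf.partial_fit(X_batch, y_batch, classes=[0, 1])
    else:
        clf.partial_fit(X_batch, y_batch)  # update weights; no restart from scratch
```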
- Hyperparameter
- Usually set explicitly by the designer. Not 'learned' during training.
- These are set before training
- Eg: Batch size in training dataset, learning rate
- Difference from 'parameter': Parameters are learnt during the training process
- The term parameter refers to model parameters.
- Eg: Weights in a neural network, coefficients in linear regression (the k in k-nearest-neighbours is actually a hyperparameter, since it is set before training)
- https://machinelearningmastery.com/difference-between-a-parameter-and-a-hyperparameter/
- Choosing value of hyperparameters
- Evaluating AI models
- Grid search: Try all possible configurations => often impractical
- Random search: Select random combinations as configurations
- Results shown to be close enough to grid search
- https://www.kaggle.com/code/willkoehrsen/intro-to-model-tuning-grid-and-random-search
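A minimal random-search sketch with scikit-learn (the estimator and parameter ranges are arbitrary):
```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)

search = RandomizedSearchCV(
    SVC(),
    param_distributions={"C": loguniform(1e-3, 1e3),
                         "gamma": loguniform(1e-4, 1e1)},
    n_iter=20,  # only 20 random configurations, instead of a full grid
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```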
- Sobel operator: Used for edge detection in images
- aka Sobel filter
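The operator is just a pair of 3×3 convolution kernels; a NumPy/SciPy sketch on a made-up image with one vertical edge:
```python
import numpy as np
from scipy.signal import convolve2d

# Sobel kernels for horizontal (x) and vertical (y) gradients
Kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
Ky = Kx.T

image = np.zeros((8, 8))
image[:, 4:] = 1.0  # vertical edge down the middle

gx = convolve2d(image, Kx, mode="same")  # responds to vertical edges
gy = convolve2d(image, Ky, mode="same")  # responds to horizontal edges
magnitude = np.hypot(gx, gy)             # edge strength per pixel
```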
- Transformers
- Have dethroned CNNs as the architecture of choice for computer vision
- Can handle both text and images
- GPT: Generative Pre-trained Transformers
- Diffusion models
- GANs: Generative Adversarial Networks
- Made popular by deepfakes
- Made less relevant by diffusion models ??
- RNNs: Recurrent Neural Networks
- LSTM: Long Short-Term Memory
- Mitigates the vanishing gradient problem that plain RNNs suffer from
- GRU: Gated Recurrent Unit
- A simplified form of LSTMs
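A minimal PyTorch usage sketch (sizes are arbitrary); nn.GRU is a drop-in replacement, except it returns no cell state:
```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
x = torch.randn(4, 10, 8)     # (batch, sequence length, features)
output, (h_n, c_n) = lstm(x)  # h_n: final hidden state, c_n: final cell state
print(output.shape)           # torch.Size([4, 10, 16])
```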
Reinforcement learning
Value iteration
Q-Learning
Policy (π)
- A mapping from states to actions: π(s) gives the action to take in state s
- Not to be confused with the transition model: P(s'|s, a) is the probability of going to s' from s by taking action a
- Expected utility of a π
- Transitions are probabilistic => the agent need not follow the same path every time under the same π
- π*: optimal policy
- Policy with maximum expected utility
Discounted rewards
- Discount factor = γ
R(s): reward associated with state s
Markov process
- Next state depends only on the current state, not on the earlier history
MDP: Markov decision process
—
- Bellman equation for utility
U(s) = R(s) + γ · max_{a∈A} Σ_{s'} P(s'|s, a) · U(s')
This essentially says:
Utility of a state = reward of that state + γ × expected utility of the best next state
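A value-iteration sketch that just repeats this Bellman update on a toy 2-state MDP (the transition table and rewards are made up):
```python
# Toy MDP: states 0 and 1; P[s][a] lists (next_state, probability) pairs
P = {
    0: {"stay": [(0, 0.9), (1, 0.1)], "go": [(1, 0.8), (0, 0.2)]},
    1: {"stay": [(1, 1.0)],           "go": [(0, 1.0)]},
}
R = {0: 0.0, 1: 1.0}  # reward of each state
gamma = 0.9           # discount factor

U = {0: 0.0, 1: 0.0}
for _ in range(100):  # repeat the Bellman update until values settle
    U = {s: R[s] + gamma * max(sum(p * U[s2] for s2, p in P[s][a])
                               for a in P[s])
         for s in P}

print(U)  # converged utilities; π*(s) picks the maximising action
```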
Sketching
From here:
A sketch C(X) of some data set X with respect to some function f is a compression of X that allows us to compute, or approximately compute, f(X) given access only to C(X).
- 'compress data in a way that lets you answer queries'
- Helps save bandwidth when streaming data over a network.
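One well-known concrete example is the count-min sketch, which answers approximate frequency queries over a stream; a minimal sketch where the hash construction and table sizes are arbitrary choices:
```python
import hashlib

class CountMinSketch:
    def __init__(self, width=1000, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One hashed column per row, derived from a salted SHA-256
        for row in range(self.depth):
            h = hashlib.sha256(f"{row}:{item}".encode()).hexdigest()
            yield row, int(h, 16) % self.width

    def add(self, item):
        for row, col in self._cells(item):
            self.table[row][col] += 1

    def count(self, item):
        # Collisions only inflate counts, so the minimum is the best estimate
        return min(self.table[row][col] for row, col in self._cells(item))

cms = CountMinSketch()
for word in ["cat", "cat", "dog"]:
    cms.add(word)
print(cms.count("cat"))  # 2 (approximate; never an underestimate)
```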
Fun facts
- Fun fact (2025): PyTorch is the most popular framework used in deep learning research
- Hugging Face also provides libraries, e.g. the Trainer API in its transformers library.
Libraries and frameworks
- PyTorch
- TensorFlow (dead??)
- Theano
- Caffe
- Keras
Acks
Much of the info here is the result of online searches instigated by conversations with the following people: Vishnu, Likith, Eva