# Supervised Learning Algorithms in Machine Learning (ML)

## Types of Supervised Learning Algorithms

- Decision Trees
- Tree Pruning
- Rule-based Classification
- Naïve Bayes
  - Gaussian Naive Bayes
  - Multinomial Naive Bayes
  - Bernoulli Naive Bayes
- Bayesian Network
- Support Vector Machines (SVM)
- k-Nearest Neighbors (k-NN)
- Ensemble Learning
  - Bagging
  - Boosting
  - Stacking
- Random Forest Algorithm

## 1. Decision Trees

- Decision Trees are a non-linear model **used for both classification and regression**.
- They work by recursively splitting the data based on feature values to create a tree-like structure.
- Each **internal node represents a decision based on a feature, and each leaf node represents the output class or regression value**.
- Splits are typically chosen using metrics like **Gini impurity or information gain**.

**Example:** Consider a dataset of emails labeled as spam or not spam.

- Decision Trees can be used to classify emails based on features like the presence of certain keywords.
- The tree might learn that if the word "**discount**" is present and the email sender is not in the contact list, the email is likely spam.

```
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data (replace with your dataset)
X, y = your_features, your_labels

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and fit Decision Tree
dt_classifier = DecisionTreeClassifier()
dt_classifier.fit(X_train, y_train)

# Make predictions
y_pred = dt_classifier.predict(X_test)

# Evaluate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Decision Tree Accuracy: {accuracy}")
```

## 2. Tree Pruning

- Tree pruning is a technique used to **optimize Decision Trees**.
- It involves **removing branches from the tree that do not provide significant predictive power**, thus preventing overfitting.
- Overfitting occurs when the model learns noise in the training data and performs poorly on new, unseen data.
- Pruning is often done by setting a maximum depth for the tree or by using algorithms that identify and remove unnecessary branches.

### Types of Pruning

#### Pre-Pruning (Early Stopping)

- Pre-pruning involves stopping the growth of the decision tree before it becomes too complex.
- It sets conditions during the tree-building process to decide when to stop adding new branches.
- Decisions are made during tree construction.
- Less flexible, as decisions are made prematurely.
- Potentially lower computational cost, as fewer nodes are considered.

#### Post-Pruning

- Post-pruning, also known as **cost-complexity pruning or just pruning**, involves growing the tree without constraints and then removing branches that do not contribute significantly to the model's performance.
- More flexible, as it allows for adjusting the tree after it is built.
- Decisions are made after the tree is fully grown.
- Potentially higher computational cost, as it involves assessing and modifying a fully grown tree.

```
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data (replace with your dataset)
X, y = your_features, your_labels

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and fit a pre-pruned Decision Tree (early stopping via max_depth)
pruned_dt_classifier = DecisionTreeClassifier(max_depth=3)
pruned_dt_classifier.fit(X_train, y_train)

# Make predictions
y_pred_pruned = pruned_dt_classifier.predict(X_test)

# Evaluate accuracy
accuracy_pruned = accuracy_score(y_test, y_pred_pruned)
print(f"Pruned Decision Tree Accuracy: {accuracy_pruned}")
```
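
The example above demonstrates pre-pruning (early stopping via `max_depth`). For post-pruning, scikit-learn supports cost-complexity pruning through the `ccp_alpha` parameter; here is a minimal sketch, reusing the hypothetical `your_features`/`your_labels` placeholders:

```
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

# Sample data (replace with your dataset)
X, y = your_features, your_labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Compute the effective alphas along the cost-complexity pruning path
path = DecisionTreeClassifier(random_state=42).cost_complexity_pruning_path(X_train, y_train)

# Fit one tree per alpha; larger alphas prune more aggressively
for ccp_alpha in path.ccp_alphas:
    tree = DecisionTreeClassifier(random_state=42, ccp_alpha=ccp_alpha)
    tree.fit(X_train, y_train)
    print(f"alpha={ccp_alpha:.4f}, test accuracy={tree.score(X_test, y_test):.3f}")
```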

## 3. Rule-based Classification

- Rule-based classification involves using a **set of if-else statements** to make decisions.
- Each rule consists of conditions based on input features, leading to a specific class or outcome.
- Rules are derived from analyzing the relationships between input features and the target variable.
- It's an interpretable approach but may struggle with complex relationships in data.

**Example:** In a rule-based system for loan approval, a rule might be:

- "If income > $50,000 and credit score > 700, approve the loan; otherwise, deny the loan."

```
# Example rule-based classification using if-else statements

def rule_based_classifier(features):
    if features[0] > 5 and features[1] < 10:
        return 'Class A'
    else:
        return 'Class B'

# Sample data (replace with your dataset)
sample_data = [6, 8]

# Make a prediction using the rule-based classifier
prediction = rule_based_classifier(sample_data)
print(f"Rule-based Classification Prediction: {prediction}")
```

## 4. Naïve Bayes

- Naïve Bayes is a probabilistic classification algorithm derived from Bayes' theorem.
- It is versatile, suitable for both binary (two-class) and multiclass classification problems.
- Known for its simplicity, efficiency, and effectiveness in dealing with high-dimensional data.
- It assumes that features are conditionally independent given the class.
- The algorithm calculates the probability of a class given the input features and selects the class with the highest probability.
- Despite its "naïve" assumption, Naïve Bayes often performs well and is **computationally efficient**.

### Example: Medical Diagnosis

Suppose we want to predict whether a person has a particular medical condition based on two symptoms: high fever and persistent cough.

```
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data (replace with your dataset)
X, y = your_features, your_labels

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and fit Naïve Bayes classifier
nb_classifier = GaussianNB()
nb_classifier.fit(X_train, y_train)

# Make predictions
y_pred_nb = nb_classifier.predict(X_test)

# Evaluate accuracy
accuracy_nb = accuracy_score(y_test, y_pred_nb)
print(f"Naïve Bayes Accuracy: {accuracy_nb}")
```

## Types of Naïve Bayes

### Gaussian Naive Bayes

Gaussian Naive Bayes is used when the features (attributes or variables) in the dataset are continuous and follow a Gaussian (normal) distribution; this is the variant used in the code example above.

### Multinomial Naive Bayes

- Multinomial Naive Bayes is suitable when the features represent the frequency of occurrences of different events, with each feature being a count (a non-negative integer).
- It is commonly used in text classification tasks, as in the sketch below.
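
A minimal sketch of Multinomial Naive Bayes on word-count features; the tiny corpus and labels here are made up purely for illustration:

```
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy corpus and labels (illustrative only)
texts = ["win a free prize now", "meeting agenda attached",
         "free discount offer", "project status update"]
labels = ["spam", "ham", "spam", "ham"]

# Convert text to word-count features (non-negative integers)
vectorizer = CountVectorizer()
X_counts = vectorizer.fit_transform(texts)

# Fit and predict
mnb = MultinomialNB()
mnb.fit(X_counts, labels)
print(mnb.predict(vectorizer.transform(["free prize offer"])))
```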

### Bernoulli Naive Bayes

- Bernoulli Naive Bayes is used when the features are binary (0 or 1), representing the presence or absence of a particular characteristic.

**Example:** Bernoulli Naive Bayes can help classify emails as spam or not based on the binary presence or absence of particular keywords.
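
A minimal sketch using binary word-presence features; again, the data is made up for illustration:

```
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB

# Toy corpus and labels (illustrative only)
texts = ["free discount now", "quarterly report attached",
         "claim your free discount", "lunch meeting today"]
labels = ["spam", "ham", "spam", "ham"]

# binary=True yields 0/1 presence features instead of counts
vectorizer = CountVectorizer(binary=True)
X_binary = vectorizer.fit_transform(texts)

bnb = BernoulliNB()
bnb.fit(X_binary, labels)
print(bnb.predict(vectorizer.transform(["free discount today"])))
```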

## 5. Bayesian Network

- A Bayesian Network is a graphical model that represents probabilistic relationships among a set of variables.
- Nodes in the graph represent variables, and edges represent dependencies.
- The network is built based on conditional dependencies between variables.
- Inference involves using observed evidence to update probabilities of other variables in the network.

```
from pgmpy.models import BayesianModel  # newer pgmpy versions name this BayesianNetwork
from pgmpy.estimators import ParameterEstimator
from pgmpy.inference import VariableElimination

# Define the structure of the Bayesian Network: A -> B <- C
model = BayesianModel([('A', 'B'), ('C', 'B')])

# Sample data (replace with your dataset)
data = your_data

# Fit the model to the data
model.fit(data)

# Estimate parameters
pe = ParameterEstimator(model, data)
print(pe.state_counts('B'))

# Perform inference
infer = VariableElimination(model)
result = infer.query(variables=['B'], evidence={'A': 1})
print(result)
```

## 6. Support Vector Machines (SVM)

First, an intuition:

Imagine you have a bunch of dots on a piece of paper, and these dots can be of two types - let's say red dots and blue dots.

Now, you want to draw a line (not necessarily straight) in such a way that it creates the most significant gap or space between the red dots and blue dots.

- SVM is like finding the best line that separates different groups of dots. This line is called a **hyperplane**.
- Support Vector Machines are linear classifiers that find the optimal hyperplane to separate classes.
- They work well in high-dimensional spaces.
- SVM strives to find the **hyperplane** that maximizes the margin between classes.
- The kernel trick allows SVM to handle non-linear decision boundaries by transforming input features into a higher-dimensional space.
- The SVM algorithm can be used for **image classification, text categorization, etc.**

**Example:** For classifying handwritten digits, SVM might find the optimal hyperplane that best separates different digits in a high-dimensional space.

### Types of SVM

- Linear SVM
- Non-linear SVM

#### Linear SVM

- Linear SVM aims to find a straight-line (hyperplane) that best separates the classes in the input feature space.
- Example: Classifying emails as spam or not spam based on features like word frequencies.

#### Non-linear SVM

- Non-linear SVM handles complex relationships by transforming input features into a higher-dimensional space, allowing for the creation of non-linear decision boundaries.
- Example: Classifying handwritten digits based on pixel values, where a simple straight line wouldn't be enough. A sketch of both variants follows below.
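
Since this section has no code above, here is a minimal sketch contrasting a linear kernel with an RBF (non-linear) kernel, reusing the hypothetical `your_features`/`your_labels` placeholders used throughout:

```
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data (replace with your dataset)
X, y = your_features, your_labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Linear SVM: separates classes with a straight hyperplane
linear_svm = SVC(kernel='linear')
linear_svm.fit(X_train, y_train)
print(f"Linear SVM Accuracy: {accuracy_score(y_test, linear_svm.predict(X_test))}")

# Non-linear SVM: the RBF kernel allows curved decision boundaries
rbf_svm = SVC(kernel='rbf')
rbf_svm.fit(X_train, y_train)
print(f"RBF SVM Accuracy: {accuracy_score(y_test, rbf_svm.predict(X_test))}")
```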

## 7. k-Nearest Neighbors (k-NN)

- k-Nearest Neighbors is a simple, supervised learning algorithm that classifies data points based on the majority class of their k nearest neighbors.
- k-NN does not make any assumptions about the underlying data distribution.
- It is also called a lazy learner algorithm because it does not learn from the training set immediately; instead, it stores the data and computes distances only at prediction time.

**Example:** In a k-NN classifier for predicting movie genres, the algorithm classifies a movie based on the genres of its k nearest neighbors in a feature space, such as viewer ratings.

```
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data (replace with your dataset)
X, y = your_features, your_labels

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and fit k-NN classifier
knn_classifier = KNeighborsClassifier(n_neighbors=3)
knn_classifier.fit(X_train, y_train)

# Make predictions
y_pred_knn = knn_classifier.predict(X_test)

# Evaluate accuracy
accuracy_knn = accuracy_score(y_test, y_pred_knn)
print(f"k-NN Accuracy: {accuracy_knn}")
```

## 8. Ensemble Learning

- Ensemble Learning combines multiple models to improve overall performance and robustness.
- Voting classifiers combine predictions from multiple models to make a final prediction.
- Bagging (used in Random Forest) trains multiple instances of the same model on different subsets of the data to reduce overfitting.

**Example:** A stock market prediction ensemble combines models (e.g., Decision Trees, SVMs) for increased accuracy and robustness.

```
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data (replace with your dataset)
X, y = your_features, your_labels

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create individual classifiers
logreg = LogisticRegression()
svm = SVC()
rf = RandomForestClassifier()

# Create an ensemble using a Voting Classifier
ensemble_classifier = VotingClassifier(estimators=[('logreg', logreg), ('svm', svm), ('rf', rf)])

# Fit ensemble classifier
ensemble_classifier.fit(X_train, y_train)

# Make predictions
y_pred_ensemble = ensemble_classifier.predict(X_test)

# Evaluate accuracy
accuracy_ensemble = accuracy_score(y_test, y_pred_ensemble)
print(f"Ensemble Accuracy: {accuracy_ensemble}")
```

### Types of Ensemble Methods

1. Bagging
2. Boosting
3. Stacking

### Bagging (Bootstrap Aggregating)

- Bagging involves training multiple instances of the same base model on different subsets of the training data, generated through random sampling with replacement (bootstrap samples).
- Example: Random Forest (a generic bagging sketch follows below).
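
A minimal sketch of generic bagging with scikit-learn's BaggingClassifier, using decision trees as the base model and the same hypothetical data placeholders as above:

```
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

# Sample data (replace with your dataset)
X, y = your_features, your_labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train 50 decision trees, each on a bootstrap sample of the training data
# (the 'estimator' argument was named 'base_estimator' before scikit-learn 1.2)
bagging = BaggingClassifier(estimator=DecisionTreeClassifier(), n_estimators=50, random_state=42)
bagging.fit(X_train, y_train)
print(f"Bagging Accuracy: {bagging.score(X_test, y_test)}")
```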

### Boosting

- Boosting focuses on sequentially training multiple weak learners, giving more weight to instances that the previous models misclassified.
- This iterative process aims to correct errors and improve overall model accuracy.
- Example: AdaBoost (Adaptive Boosting)
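
A minimal AdaBoost sketch under the same hypothetical data placeholders:

```
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Sample data (replace with your dataset)
X, y = your_features, your_labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Sequentially fit weak learners (depth-1 trees by default),
# re-weighting the samples that earlier learners misclassified
ada = AdaBoostClassifier(n_estimators=100, random_state=42)
ada.fit(X_train, y_train)
print(f"AdaBoost Accuracy: {ada.score(X_test, y_test)}")
```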

### Stacking

- Stacking involves training multiple diverse models, referred to as base models, and then combining their predictions using a meta-model.
- The meta-model is trained on the outputs of the base models, allowing for a higher-level understanding of the data.

**Example:** A stacked ensemble, as in the sketch below.
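
A minimal stacking sketch with scikit-learn's StackingClassifier, again on the hypothetical data placeholders:

```
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Sample data (replace with your dataset)
X, y = your_features, your_labels
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Base models produce predictions; a logistic regression meta-model combines them
stacking = StackingClassifier(
    estimators=[('rf', RandomForestClassifier()), ('svm', SVC())],
    final_estimator=LogisticRegression(),
)
stacking.fit(X_train, y_train)
print(f"Stacking Accuracy: {stacking.score(X_test, y_test)}")
```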

## 9. Random Forest Algorithm

- Random Forest is an ensemble learning algorithm that combines the **predictions of multiple decision trees to enhance overall performance** and reduce the risk of overfitting.
- It operates through the **process of bagging (Bootstrap Aggregating)** and introduces an additional layer of randomness by considering a random subset of features at each split, making it robust and accurate in various applications.
- It is relatively quick to train, since its trees can be built independently of one another.

```
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Sample data (replace with your dataset)
X, y = your_features, your_labels

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and fit Random Forest classifier
rf_classifier = RandomForestClassifier()
rf_classifier.fit(X_train, y_train)

# Make predictions
y_pred_rf = rf_classifier.predict(X_test)

# Evaluate accuracy
accuracy_rf = accuracy_score(y_test, y_pred_rf)
print(f"Random Forest Accuracy: {accuracy_rf}")
```