Spam Mail Detection Using Support Vector Machine.

In this blog post, we are going to classify emails as spam or ham (genuine mail). I have used a Support Vector Machine (SVM) model for this.

All the source code and the dataset are available in my GitHub repository. Links are at the bottom of this post.

So let’s understand the dataset first.

The dataset has two columns:

  1. Label: ham or spam
  2. EmailText: the actual email text

Our model will learn patterns in the email text and predict whether a message is spam or genuine.

Algorithm used — SVM

About SVM

“Support Vector Machine” (SVM) is a supervised machine learning algorithm that can be used for both classification and regression problems. However, it is mostly used for classification. In the SVM algorithm, we plot each data item as a point in n-dimensional space (where n is the number of features), with the value of each feature being the value of a particular coordinate. Then, we perform classification by finding the hyperplane that separates the two classes with the widest margin.
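
To make the hyperplane idea concrete, here is a minimal sketch on made-up 2-D points. The toy data and the names X_toy, y_toy and toy_clf are my own assumptions for illustration and are not part of the spam dataset. A linear kernel is used here only so that the hyperplane can be read directly from the fitted model.

```python
import numpy as np
from sklearn.svm import SVC

# Toy 2-D data (hypothetical): two well-separated clusters.
X_toy = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.0],
                  [6.0, 6.0], [6.5, 7.0], [7.0, 6.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel makes the separating hyperplane a straight line in 2-D.
toy_clf = SVC(kernel="linear")
toy_clf.fit(X_toy, y_toy)

# The hyperplane is w . x + b = 0; coef_ and intercept_ hold w and b.
w, b = toy_clf.coef_[0], toy_clf.intercept_[0]
print("w =", w, "b =", b)

# Points on opposite sides of the hyperplane get different labels.
toy_preds = toy_clf.predict(np.array([[1.0, 2.0], [7.0, 7.0]]))
print(toy_preds)
```

With text data, each word in the vocabulary becomes one coordinate, so n can easily run into the thousands, but the idea stays the same.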

So, let's move on to the coding section.

Import Important Libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn import svm

Load our Dataset

data = pd.read_csv('spam.csv')

Checking the information of the dataset

data.info()
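
Beyond data.info(), it can help to check how the two classes are balanced and whether any values are missing. The sketch below uses a small hypothetical DataFrame named sample in place of spam.csv, and the lowercase label values "ham" and "spam" are an assumption about how the file encodes them.

```python
import pandas as pd

# Hypothetical stand-in rows; the real data comes from spam.csv.
sample = pd.DataFrame({
    "Label": ["ham", "spam", "ham", "ham"],
    "EmailText": [
        "Are we still meeting at noon?",
        "WINNER! Claim your free prize now",
        "Please find the report attached",
        "Call me when you get home",
    ],
})

# Share of each class; a heavy skew toward ham changes how accuracy should be read.
label_share = sample["Label"].value_counts(normalize=True)
print(label_share)

# Missing text would make CountVectorizer fail later, so count nulls up front.
missing = sample.isnull().sum()
print(missing)
```

On the real data, the same two calls would be run on data instead of sample.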

Splitting our data into X and y.

X = data['EmailText'].values
y = data['Label'].values

Splitting our data into training and testing.

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split( X, y, test_size=0.2, random_state=0)
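
Spam datasets are often imbalanced, so one optional variation is to pass stratify to train_test_split so that both splits keep the same class ratio. This is not what the code above does; it is a sketch on made-up labels, and X_demo and y_demo are hypothetical names.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: 80 ham, 20 spam.
y_demo = np.array(["ham"] * 80 + ["spam"] * 20)
X_demo = np.array([f"message {i}" for i in range(100)])

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0, stratify=y_demo
)

# With stratify, the test set keeps the same 80/20 class ratio as the full data.
test_spam = int((y_te == "spam").sum())
print(len(y_te), test_spam)
```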

Converting the text into vectors of word counts using CountVectorizer()

# Convert each email into a vector of word counts
cv = CountVectorizer()
X_train = cv.fit_transform(X_train)
X_test = cv.transform(X_test)
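
To see what CountVectorizer produces, here is a small sketch on two made-up sentences. The names docs and demo_cv are hypothetical and only used for this illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ["free prize free", "meeting at noon"]
demo_cv = CountVectorizer()
counts = demo_cv.fit_transform(docs)

# vocabulary_ maps each token to its column index (tokens sorted alphabetically).
vocab = demo_cv.vocabulary_
print(vocab)

# Each row is one document; each entry is how often that token appears in it.
dense = counts.toarray()
print(dense)
```

Note that the vectorizer is fitted on the training text only and then reused with transform on the test text, as in the code above, so words seen only in the test set are simply ignored.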

Applying SVM algorithm

from sklearn.svm import SVC
classifier = SVC(kernel='rbf', random_state=0)
classifier.fit(X_train, y_train)
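
GridSearchCV is imported at the top but not used in this post. As a possible next step, the sketch below shows how it could be used to try a few kernel and C values. The toy corpus, the parameter grid and the pipeline step names "cv" and "svc" are all my assumptions for illustration, not tuned settings for the spam dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Hypothetical toy corpus; on the real data this would be the raw email text and labels.
texts = [
    "win a free prize now", "free cash offer claim now", "claim your free reward",
    "urgent offer win cash", "free entry win prize", "cash prize claim now",
    "are we meeting at noon", "see you at lunch today", "please send the report",
    "call me after the meeting", "lunch at noon tomorrow", "report is attached today",
]
labels = ["spam"] * 6 + ["ham"] * 6

# Putting the vectorizer in the pipeline refits it on each training fold.
pipe = Pipeline([("cv", CountVectorizer()), ("svc", SVC(random_state=0))])
param_grid = {"svc__kernel": ["linear", "rbf"], "svc__C": [0.1, 1, 10]}

# 3-fold cross-validation over every kernel/C combination.
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(texts, labels)
print(search.best_params_, search.best_score_)
```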

Accuracy

print(classifier.score(X_test,y_test))
>> 0.9766816143497757
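
Accuracy alone can hide how well the spam class is handled, especially if most emails are ham. A confusion matrix and a per-class report give a fuller picture. The sketch below uses made-up labels and predictions; on the real data, y_true would be y_test and y_pred would be classifier.predict(X_test).

```python
from sklearn.metrics import confusion_matrix, classification_report

# Hypothetical labels and predictions, standing in for y_test and the model's output.
y_true = ["ham", "ham", "ham", "ham", "spam", "spam"]
y_pred = ["ham", "ham", "ham", "ham", "spam", "ham"]

# Rows are true classes, columns are predicted classes, in the order given by labels.
cm = confusion_matrix(y_true, y_pred, labels=["ham", "spam"])
print(cm)

# Per-class precision and recall; spam recall shows how many spam emails were caught.
print(classification_report(y_true, y_pred))
```

In this toy case, five of six predictions are correct, yet half of the spam messages slip through, which the accuracy figure by itself would not show.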

Here we are getting an accuracy of about 97.7% on the test set, which is a strong result. Feel free to clone my repository, try other classification models on this dataset, and share the accuracy you get in the comments.

I hope you liked this blog post. Feel free to share your thoughts in the comment section. You can also connect with me on:
Linkedin — https://www.linkedin.com/in/shreyak007/
Github — https://github.com/Shreyakkk
Twitter — https://twitter.com/Shreyakkkk
Instagram — https://www.instagram.com/shreyakkk/
Facebook — https://www.facebook.com/007shreyak
Thank You for reading.

Spam Mail Detection Using Support Vector Machine was originally published in Becoming Human: Artificial Intelligence Magazine on Medium.

Via https://becominghuman.ai/spam-mail-detection-using-support-vector-machine-cdb57b0d62a8?source=rss—-5e5bef33608a—4

source https://365datascience.weebly.com/the-best-data-science-blog-2020/spam-mail-detection-using-support-vector-machine

Published by 365Data Science
