Future of Artificial Intelligence: The Fourth Industrial Revolution

Photo by Andy Beales on Unsplash

Artificial intelligence is among the most prominent technologies of the 21st century. Together with its subsets, machine learning and deep learning, it has brought a wide range of use cases to light. Gartner, a global research firm, has claimed in its reports that artificial intelligence, automation and robotics will eliminate 1.8 million jobs and create 2.3 million new ones. The common perception is that artificial intelligence can take away any job, but that is not the case: AI will primarily eliminate manual, labour-intensive jobs across industries.

Future of Artificial Intelligence

The elimination of jobs due to artificial intelligence is already happening in the IT industry, in areas such as testing and data centre management. So, let’s shed some light on those scenarios.

Read: “Godfathers of Artificial Intelligence”

Software Industry

In the software industry, Informatica is being used for data centre management; UiPath, Selenium and similar tools automate test cases; and chatbots provide real-time responses to clients. All of these are based on AI and automation. If a large-scale project previously required 100 testers, the requirement can be reduced to 5 to 10 testers, who initially write the test cases, after which the system automates the testing procedure. Automated testing increases efficiency in catching bugs, but as of now it is mostly used in large-scale projects, because writing test cases takes time, and for a small project manual testing can be completed faster than the automated process.

Artificial Intelligence Jobs

Transportation

Zoox, Uber and Waymo have been aggressively working to train and develop AI models with which the cab industry can be switched to fully automated rides. Automated rides will reduce costs for these companies and give them the capability to operate 24/7. The end user will also be safer with these rides, as an AI chauffeur will not molest, snatch, threaten or loot the passenger, and the centralized network and embedded hardware in these vehicles will be aware of approaching vehicles and obstacles, which significantly reduces the chance of an accident.

Construction

Smart design tools like AutoCAD are already reducing the laborious job of finding the stress points of a building and finding appropriate solutions to handle that stress. The next big thing on its way is automated robots that will be used for wrecking buildings, joining moulded building blocks and evenly pouring concrete in construction work.

Trending AI Articles:

1. Microsoft Azure Machine Learning x Udacity — Lesson 4 Notes

2. Fundamentals of AI, ML and Deep Learning for Product Managers

3. Roadmap to Data Science

4. Work on Artificial Intelligence Projects

Automobile

While designing new models of vehicles, companies have to create prototypes so that they can test them in real-world conditions. Now these tests can be run in the virtual world with Microsoft’s HoloLens, and data from these tests can be fed back into AI training models recursively to get the best possible design as output.

Read: “The Impact of Artificial Intelligence on Business”

Healthcare

Companies are craving to get their hands on medical records, but why are they asking for them in the first place? According to stats reported by CB Insights, the software IDx-DR was able to correctly identify patients with “more than mild diabetic retinopathy” 87.4% of the time, and identify those who did not have it 89.5% of the time. IDx is one of the AI software products approved by the FDA for clinical commercial applications to date. Viz.ai was approved to analyse CT scans and notify healthcare providers of potential strokes in patients. Post-FDA-approval, Viz.ai closed a $21M Series A round from Google Ventures and Kleiner Perkins Caufield & Byers. The GE Ventures-backed start-up Arterys was FDA-approved last year for analysing cardiac images with its cloud AI platform. This year, the FDA cleared its liver and lung AI lesion-spotting software for cancer diagnostics. From these stats, it is pretty evident that AI models can effectively predict the diseases a patient has and those he or she may develop in the future. This analysis can enable us to treat patients in time and act on the warnings given by these systems.

Medicine

Every human body is different, and the body’s response to medicine varies from person to person; that is why some people are allergic to certain substances and some show zero response to certain antibiotics. Thoroughly trained AI models will be able to prescribe customised medicines for patients, which can save more lives and enable pharmaceutical companies to charge customers according to their needs.

Read: “The Impact of IoT on Web Development”

Textiles

Textiles was the first industry to be exposed to the Industrial Revolution, and it is going to be the second one, after IT, to be transformed by AI. Zara and many other prominent fashion houses have already started making their products with AI-trained automated robots, and the upcoming fleets are being manufactured mostly in China. Asian markets will take the biggest hit from automation in textiles, because most textile production is done in south-east Asian countries.

Read: “Top Web Development Technologies and Frameworks”

Retail

New retail stores by Amazon, and HUMA by Alibaba, use cameras with image recognition and AI to know which products customers have picked and to compute the total bill. Bills can be paid through digital transactions at checkout counters, which eliminates the need for cashiers and improves the user experience by cutting queues. In China, digital payments can even be processed with facial recognition and biometric scans.

Read: “Top Web Development Trends to follow in 2020”

Electronics, computing hardware and software companies are building software to balance loads, and much more is being built with AI; the use cases are endless, but it is certain that AI is going to restructure the world and its work culture.



Future of Artificial Intelligence: The Fourth Industrial Revolution was originally published in Becoming Human: Artificial Intelligence Magazine on Medium, where people are continuing the conversation by highlighting and responding to this story.

Via https://becominghuman.ai/future-of-artificial-intelligence-the-fourth-industrial-revolution-48eab6f7609b?source=rss—-5e5bef33608a—4

source https://365datascience.weebly.com/the-best-data-science-blog-2020/future-of-artificial-intelligence-the-fourth-industrial-revolution

Which methods should be used for solving linear regression?

As a foundational algorithm in any machine learning toolbox, linear regression can be solved with a variety of approaches. Here we discuss, with code examples, four methods and demonstrate how they should be used.
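The four methods themselves are in the full article; as a minimal sketch of one standard approach (the closed-form normal equations, on synthetic data of my own making), it can look like this:

```python
import numpy as np

# Synthetic regression problem: y = X @ true_w + small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + rng.normal(scale=0.01, size=100)

# Normal equations: solve (X^T X) w = X^T y for w.
w = np.linalg.solve(X.T @ X, X.T @ y)
print(w)  # close to [1.5, -2.0, 0.5]
```

Solving the linear system directly is preferred over explicitly inverting X^T X, which is slower and less numerically stable.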

Originally from KDnuggets https://ift.tt/3lI2MJh

source https://365datascience.weebly.com/the-best-data-science-blog-2020/which-methods-should-be-used-for-solving-linear-regression

Here is What I’ve Learned in 2 Years as a Data Scientist

In this article, for the first time, I’ll consolidate everything that I’ve learned and condense all of these into 5 lessons that I’ve learned in 2 years as a data scientist.

Originally from KDnuggets https://ift.tt/3jBpIIm

source https://365datascience.weebly.com/the-best-data-science-blog-2020/here-is-what-ive-learned-in-2-years-as-a-data-scientist

PyCaret 2.1 is here: What’s new?

PyCaret is an open-source, low-code machine learning library in Python that automates the machine learning workflow. It is an end-to-end machine learning and model management tool that speeds up the machine learning experiment cycle and makes you 10x more productive. Read about what’s new in PyCaret 2.1.

Originally from KDnuggets https://ift.tt/31LJDy8

source https://365datascience.weebly.com/the-best-data-science-blog-2020/pycaret-21-is-here-whats-new


Non-Negative Matrix Factorization for Dimensionality Reduction

In previous posts, we have explained how we can reduce the dimensions by applying other algorithms.

Here we will see how we can also apply dimensionality reduction with Non-Negative Matrix Factorization. We will work with the Eurovision 2016 dataset, as we did in the Hierarchical Clustering post.

A Few Words About Non-Negative Matrix Factorization

This is a very strong algorithm with many applications. For example, it can be applied to recommender systems, collaborative filtering, topic modelling and dimensionality reduction.

In Python, it can work with sparse matrices, where the only restriction is that the values must be non-negative.

The logic of the dimensionality reduction is to take our m × n data and decompose it into two matrices of dimensions m × features and features × n, respectively. The features are the reduced dimensions.
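This decomposition can be sketched with scikit-learn on synthetic data (the shapes mirror the Eurovision example used in this post, but the values here are random and purely illustrative):

```python
import numpy as np
from sklearn.decomposition import NMF

# An m x n non-negative matrix decomposed into (m x k) and (k x n) factors.
m, n, k = 42, 26, 2
rng = np.random.default_rng(1)
data = rng.random((m, n))  # non-negative values, as NMF requires

model = NMF(n_components=k, init="random", random_state=1, max_iter=500)
W = model.fit_transform(data)   # shape (m, k): the reduced dimensions
H = model.components_           # shape (k, n)
print(W.shape, H.shape)         # (42, 2) (2, 26)
```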


Dimensionality Reduction in Eurovision Data

Load and Reshape the Data

In our dataset, the rows refer to the countries that voted and the columns to the countries that received votes. The values refer to the ‘Televote Rank’.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
eurovision = pd.read_csv("eurovision-2016.csv")
televote_Rank = eurovision.pivot(index='From country', columns='To country', values='Televote Rank')
# fill NAs by min per country
televote_Rank.fillna(televote_Rank.min(), inplace=True)

The televote_Rank.shape is (42, 26)

Non-Negative Matrix Factorization

Since we have the data in the right form, we are ready to run the NMF algorithm. We will choose two components because our goal is to reduce the dimensions to 2.

# Import NMF
from sklearn.decomposition import NMF
# Create an NMF instance: model
model = NMF(n_components=2)
# Fit the model to televote_Rank
model.fit(televote_Rank)
# Transform the televote_Rank: nmf_features
nmf_features = model.transform(televote_Rank)
# Print the NMF features
print(nmf_features.shape)
print(model.components_.shape)

As we can see, we created two matrices of (42, 2) and (2, 26) dimensions respectively. Our reduced two-dimensional representation is the (42, 2) matrix.
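As a sanity check (here on synthetic non-negative data rather than the Eurovision set), multiplying the two factor matrices back together approximates the original matrix, which is what justifies treating the (42, 2) matrix as a two-dimensional representation:

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(2)
data = rng.random((42, 26))  # synthetic stand-in for televote_Rank
model = NMF(n_components=2, init="random", random_state=2, max_iter=500)
W = model.fit_transform(data)

# The product of the factors reconstructs an approximation of the data.
approx = W @ model.components_
print(approx.shape)                          # (42, 26), same as the input
print(float(np.abs(data - approx).mean()))   # mean absolute reconstruction error
```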


Plot the 42 Countries in two Dimensions

Let’s see the scatter plot of the 42 countries in two dimensions.

plt.figure(figsize=(20,12))
countries = np.array(televote_Rank.index)
xs = nmf_features[:,0]
ys = nmf_features[:,1]
# Scatter plot
plt.scatter(xs, ys, alpha=0.5)
# Annotate the points
for x, y, country in zip(xs, ys, countries):
    plt.annotate(country, (x, y), fontsize=10, alpha=0.5)
plt.show()

Can you see a pattern?

The 2D graph here is broadly consistent with the dendrogram that we got by applying the linkage distance. Again, we can see a cluster of the countries of former Yugoslavia, and also that the Baltic countries are close together, as are the Scandinavian countries and the countries of the United Kingdom.



Non-Negative Matrix Factorization for Dimensionality Reduction — Predictive Hacks was originally published in Becoming Human: Artificial Intelligence Magazine on Medium, where people are continuing the conversation by highlighting and responding to this story.

Via https://becominghuman.ai/non-negative-matrix-factorization-for-dimensionality-reduction-predictive-hacks-1ed9e91154c?source=rss—-5e5bef33608a—4

source https://365datascience.weebly.com/the-best-data-science-blog-2020/non-negative-matrix-factorization-for-dimensionality-reductionpredictive-hacks

Machine Learning behind the scenes: what is a dataset and why should it be qualitative?

Recognition of jokes in news headlines, driving vehicles, tracking human health: machine learning performs many amazing things if it has the right data. Data plays a crucial role in the model training process and in output quality. How does it work?

The dataset in Machine Learning

Whatever your algorithm is used for (image recognition, object tracking, matchmaking or deep analysis), it needs data to learn from and to evaluate its performance. A dataset helps you to organize unstructured data collected from multiple sources to get the target outcome. The initial data that you give to an algorithm for learning is usually called a training dataset. Training data is the foundation for further development and determines how effective and useful your machine learning system will be.


However, all initial datasets are flawed and require some preparation before they are used for training. To map the data to the features valuable precisely for your business, you need to label it and clean it. This helps you exclude useless elements and files, increasing the ML model’s chances of becoming smart. The labeling process used by Exposit usually includes the following steps:

  • Data analysis;
  • Creation of data labeling rules;
  • Data labeling;
  • QA step;
  • Neural Network training;
  • Measurement of the output quality.

Collecting and labeling images to create a high-quality dataset from scratch requires a lot of resources. If you need to do research or create an MVP, you can use publicly available datasets with already-labeled data that can include up to 80 categories of different objects. Remember that if you use the same dataset for training, validation, and testing, you won’t be able to evaluate the efficiency of your solution objectively. At Exposit we prefer to use new and unseen data for testing to ensure excellent performance.
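The separation of training, validation, and test data described above can be sketched with scikit-learn (the arrays here are synthetic placeholders; a real project would substitute its own labeled dataset):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for features and labels.
X = np.arange(100).reshape(50, 2)
y = np.arange(50)

# 60% train, 20% validation, 20% test: split off 40%, then halve it.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 30 10 10
```

Because the test rows never appear during training or validation, the final evaluation is on genuinely unseen data.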


Why should a dataset be qualitative?

A well-prepared training dataset drives the quality of your machine learning model and its effectiveness in fulfilling business purposes. The more quality and accurate the results you use for company decision-making, the more relevant the business strategies you can apply. A good dataset can also help you save resources on future machine learning implementations, as you will already have quality input data.

Remember that depending on the use case, the quality of the dataset may decrease over time due to changing conditions and market trends. It means that businesses should constantly maintain and manage the quality of data to achieve accurate results and make informed decisions.

How to evaluate the quality of a dataset?

Google experts identify several aspects that affect dataset quality and Machine Learning model performance:

  • Reliability
    If you want to be confident about the results provided by an algorithm, you should be able to trust your data. Evaluate possible label errors and noise to understand whether reliability is high. You should also control the way you filter data for training to be sure of your dataset.
  • Feature representation
    Your training and testing sets should cover all cases that can occur within a business task. If your dataset is non-representative, your model won’t provide accurate predictions and quality output. Check how data is presented to the model and whether you need to normalize numeric values.
  • Data skewness
    Skewness refers to asymmetry in a statistical distribution, where the curve is distorted or skewed to the left or to the right. If you have positive or negative skew, consider data transformation techniques such as square roots, cube roots, etc. to minimize skewness and increase dataset quality.
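The skew-reducing transforms mentioned in the last point can be sketched as follows (a synthetic, right-skewed feature of my own making; the square root is just one of the suggested transforms):

```python
import numpy as np
from scipy.stats import skew

# A right-skewed synthetic feature (log-normal values are a classic example).
rng = np.random.default_rng(3)
feature = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)

before = skew(feature)
after = skew(np.sqrt(feature))
print(round(before, 2), round(after, 2))  # the skew drops after the transform
```

Cube roots or log transforms would be applied the same way; which one helps most depends on how heavy the distribution's tail is.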

If dataset creation and evaluation are challenging for you, consider an experienced Machine Learning team to help. Exposit engineers have a deep understanding of Machine Learning development processes: from business analysis and quality dataset creation to integration into your system.

Our experience in developing data-driven software solutions with a focus on Computer Vision systems can help you to bring competitive advantages to your business and simplify ML adoption. Contact our expert Kirill Lozovoi to learn more about how your business can benefit from Computer Vision.



Machine Learning behind the scenes: what is dataset and why should it be qualitative? was originally published in Becoming Human: Artificial Intelligence Magazine on Medium, where people are continuing the conversation by highlighting and responding to this story.

Via https://becominghuman.ai/machine-learning-behind-the-scenes-what-is-dataset-and-why-should-it-be-qualitative-22f5358bf4a8?source=rss—-5e5bef33608a—4

source https://365datascience.weebly.com/the-best-data-science-blog-2020/machine-learning-behind-the-scenes-what-is-dataset-and-why-should-it-be-qualitative

Basic Unix Commands for Data Scientists

Data analysts/scientists should have a basic knowledge of Unix commands. The goal of this post is to give some examples of how shell commands can help them in their daily tasks. For the first examples we will consider the following eg1.csv:

ID,Name,Dept,Gender
1,George,DS,M
2,Billy,DS,M
3,Nick,IT,M
4,George,IT,M
5,Nikki,HR,F
6,Claudia,HR,F
7,Maria,Sales,F
8,Jimmy,Sales,M
9,Jane,Marketing,F
10,George,DS,M

Examples of Basic Unix Commands

Q: How to print the first or the last 3 rows of the file.

# The first 
head -n 3 eg1.csv
# The last 
tail -n 3 eg1.csv

Q: How to skip the first line(s) or the last line(s).

Sometimes we want to skip the first line, which usually contains the headers. The commands are:

# it skips the first line 
tail -n +2 eg1.csv
# it skips the last 4 lines 
head -n -4 eg1.csv
# output of skipping the first line:
1,George,DS,M
2,Billy,DS,M
3,Nick,IT,M
4,George,IT,M
5,Nikki,HR,F
6,Claudia,HR,F
7,Maria,Sales,F
8,Jimmy,Sales,M
9,Jane,Marketing,F
10,George,DS,M

Q: How to print the whole file.

# for the whole file 
cat eg1.csv
# page through the file - press space for more, or q to quit
less eg1.csv

Q: How to copy a file.

cp eg1.csv copy_eg1.csv

Q: How to rename a file.

mv copy_eg1.csv backup_eg1.csv

Q: How to remove a file.

rm backup_eg1.csv

Q: How to get a list of information about files in the working directory.

ls -lh

Q: How to check free disk space.

df -h

Q: How to get how much space one or more files or directories is using.

du -sh

Q: How can I select columns from a file.

If you want to select columns, you can use the command cut. It has several options (use man cut to explore them), but the most common is something like:

cut -f 1-2,4 -d , eg1.csv

This means “select columns 1 through 2 and column 4, using comma as the separator”. cut uses -f (meaning “fields”) to specify columns and -d (meaning “delimiter”) to specify the separator.

Trending AI Articles:

1. Microsoft Azure Machine Learning x Udacity — Lesson 4 Notes

2. Fundamentals of AI, ML and Deep Learning for Product Managers

3. Roadmap to Data Science

4. Work on Artificial Intelligence Projects

This command returns:

ID,Name,Gender 
1,George,M
2,Billy,M
3,Nick,M
4,George,M
5,Nikki,F
6,Claudia,F
7,Maria,F
8,Jimmy,M
9,Jane,F
10,George,M

Q: How can I exclude a column.

To exclude a column or columns, we do the opposite of selecting columns by adding the --complement flag. For instance, let’s say that we want to exclude the second column:

cut --complement -f 2 -d , eg1.csv

Q: How can I select lines containing specific values.

For example, let’s say that we want to select all lines which contain the value “Sales”. Then the command is:

grep Sales eg1.csv
7,Maria,Sales,F 
8,Jimmy,Sales,M

Q: How can I store a command’s output in a file.

Let’s say that I want to get the second column (i.e. Name) from eg1.csv and store it in a new file called names.txt. The > symbol tells the shell to redirect command output to a file.

cut -f 2 -d , eg1.csv > names.txt

Q: How to combine commands.

The pipe | symbol tells the shell to use the output of the command on the left as the input to the command on the right. Let’s see the following example where we want to exclude the headers from the names.txt file.

cut -f 2 -d , eg1.csv | tail -n +2 > names_without_header.txt

Or we can take a subset of lines of a file. For example:

head -n 5 eg1.csv | tail -n -3
2,Billy,DS,M 
3,Nick,IT,M
4,George,IT,M

Q: How to count the number of lines in a file.

wc -l eg1.csv

Q: How can I specify many files at once.

Assume that in the tmp folder we have many csv files and we want to get the first column of all of them.

cut -d , -f 1 tmp/*.csv

Q: How can I sort lines of text.

Let’s say that I want to sort the names in the eg1.csv file. Thus I have to choose the second column and exclude the header, which is “Name”.

cut -f 2 -d , eg1.csv | grep -v Name | sort
Billy 
Claudia
George
George
George
Jane
Jimmy
Maria
Nick
Nikki

Q: How can I take the unique lines.

The uniq command removes adjacent duplicated lines. This implies that we must first sort the file and then run uniq. For example, let’s take the unique names from names_without_header.txt.

sort names_without_header.txt | uniq
Billy 
Claudia
George
Jane
Jimmy
Maria
Nick
Nikki

Q: How to do “value counts”.

We can combine the sort and uniq -c commands. The following command returns the number of employees by department.

cut -f 3 -d , eg1.csv | grep Dept -v | sort | uniq -c

and we get:

3 DS 
2 HR
2 IT
1 Marketing
2 Sales

Q: How to find the location of a file (or files) within a directory and all of its subdirectories.

The find command takes the directory to search as its first argument, followed by a flag that describes the search method. In this case we’ll only be searching for a file by its name, so we’ll use the -name flag. The -name flag itself takes an argument: the name of the file that you’re looking for.

# search for randomfile.txt
find . -name randomfile.txt
# now let's try searching for all .jpg files:
find . -name "*.jpg"

Q: How to compress/decompress files.

# compress files to a zip file
zip zipped.zip file1 file2 file3
# to uncompress a zip file
unzip zipped.zip
# compress to a tar file
tar -zcvf myfile.tgz .
# decompress tar file
tar -zxvf myfile.tgz
# To extract a file compressed with gunzip, type the following
gunzip filename_tar.gz
tar xvf filename_tar
# compress a file using gzip
gzip filename
# decompress the file
gzip -d filename.gz
# or
gunzip filename.gz


Q: Difference between grep, egrep, fgrep

You can have a look at unix.stackexchange.

Q: How to download files from remote locations.

We can use the wget command. For example, let’s download iris.csv:

wget https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/iris.csv
--2019-08-05 13:57:02--  https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/iris.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3716 (3.6K)
Saving to: 'iris.csv'

iris.csv  100%[=================================================>]  3.63K  --.-KB/s  in 0.001s

2019-08-05 13:57:02 (3.50 MB/s) - 'iris.csv' saved [3716/3716]

and we get

# head -n 5 iris.csv
sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3,1.4,0.2,setosa
4.7,3.2,1.3,0.2,setosa
4.6,3.1,1.5,0.2,setosa

A brief description of the sed command

Q: How to display line multiple times.

# prints every line, with the third line displayed twice
sed '3p' eg1.csv
ID,Name,Dept,Gender
1,George,DS,M
2,Billy,DS,M
2,Billy,DS,M
3,Nick,IT,M
4,George,IT,M
5,Nikki,HR,F
6,Claudia,HR,F
7,Maria,Sales,F
8,Jimmy,Sales,M
9,Jane,Marketing,F
10,George,DS,M

Q: How to display a specific line.

# it displays only the third line 
sed -n '3p' eg1.csv
2,Billy,DS,M

Q: How to display the last line of a file.

sed -n '$p' eg1.csv
10,George,DS,M

Q: How to display a range of lines

# it prints the 2nd up to 4th line 
sed -n '2,4p' eg1.csv
1,George,DS,M 
2,Billy,DS,M
3,Nick,IT,M

Q: How NOT to display a specific line or a range of lines.

# all except the 2nd line
sed -n '2!p' eg1.csv
# all except the 2nd up to the 4th line
sed -n '2,4!p' eg1.csv
# the second command returns:
ID,Name,Dept,Gender
4,George,IT,M
5,Nikki,HR,F
6,Claudia,HR,F
7,Maria,Sales,F
8,Jimmy,Sales,M
9,Jane,Marketing,F
10,George,DS,M

Q: How to display lines by searching a word.

# return any line containing the word "George"
sed -n '/George/p' eg1.csv
1,George,DS,M
4,George,IT,M
10,George,DS,M

Q: How to substitute data in a file.

# replace "George" to "Georgios"
sed 's/George/Georgios/g' eg1.csv
ID,Name,Dept,Gender
1,Georgios,DS,M
2,Billy,DS,M
3,Nick,IT,M
4,Georgios,IT,M
5,Nikki,HR,F
6,Claudia,HR,F
7,Maria,Sales,F
8,Jimmy,Sales,M
9,Jane,Marketing,F
10,Georgios,DS,M

A brief description of the awk command

Q: How to print a specific column.

# prints the third column. The dollar sign
# selects the column, and the separator
# is defined with -F ","
awk -F "," '{print $3}' eg1.csv
# alternatively
awk '{print $3}' FS="," eg1.csv
# print the 1st and 3rd columns, separated by a tab
awk -F "," '{print $1 "\t" $3}' eg1.csv
# if you want to print everything, then you can write
awk -F "," '{print $0}' eg1.csv
# printing the 1st and 3rd columns returns:
ID Dept
1 DS
2 DS
3 IT
4 IT
5 HR
6 HR
7 Sales
8 Sales
9 Marketing
10 DS

Q: How to remove header row from the results.

# we use NR, which stands for "number of record"
awk 'NR!=1' eg1.csv
# NR also supports greater-than, less-than,
# equal and not-equal comparisons,
# so we get the same results with NR>1
awk 'NR>1' eg1.csv

Q: How to conditionally select data.

# let's say that we want all the rows where the department is DS
awk -F"," '$3=="DS"{print $0}' eg1.csv
# let's say that we want all the rows where the id is 
# higher than 5
awk -F"," '$1>5{print $0}' eg1.csv
# get all the rows where there is the substring "Ge"
awk -F"," '/Ge/{print $0}' eg1.csv
# get all the rows where there is the substring "Ge" 
# in second column
awk -F"," '$2~/Ge/{print $0}' eg1.csv
# get all the rows where there is NOT the 
# substring "Ge" in second column
awk -F"," '$2!~/Ge/{print $0}' eg1.csv
# awk -F"," '$3=="DS"{print $0}' eg1.csv
1,George,DS,M
2,Billy,DS,M
10,George,DS,M

# awk -F"," '$1>5{print $0}' eg1.csv
ID,Name,Dept,Gender
6,Claudia,HR,F
7,Maria,Sales,F
8,Jimmy,Sales,M
9,Jane,Marketing,F
10,George,DS,M

# awk -F"," '/Ge/{print $0}' eg1.csv
ID,Name,Dept,Gender
1,George,DS,M
4,George,IT,M
10,George,DS,M

# awk -F"," '$2!~/Ge/{print $0}' eg1.csv
ID,Name,Dept,Gender
2,Billy,DS,M
3,Nick,IT,M
5,Nikki,HR,F
6,Claudia,HR,F
7,Maria,Sales,F
8,Jimmy,Sales,M
9,Jane,Marketing,F



Basic Unix Commands for Data Scientists was originally published in Becoming Human: Artificial Intelligence Magazine on Medium, where people are continuing the conversation by highlighting and responding to this story.

Via https://becominghuman.ai/basic-unix-commands-for-data-scientists-ecaeea442375?source=rss—-5e5bef33608a—4

source https://365datascience.weebly.com/the-best-data-science-blog-2020/basic-unix-commands-for-data-scientists

2020 Best Online Masters in Analytics, Business Analytics, Data Science (Updated)

We provide an updated list of best online Masters in AI, Analytics, and Data Science, including rankings, tuition, and duration of the education program.

Originally from KDnuggets https://ift.tt/2DjoMZL

source https://365datascience.weebly.com/the-best-data-science-blog-2020/2020-best-online-masters-in-analytics-business-analytics-data-science-updated

Showcasing the Benefits of Software Optimizations for AI Workloads on Intel Xeon Scalable Platforms

The focus of this blog is to bring to light that continued software optimizations can boost performance not only for the latest platforms, but also for the current install base from prior generations. This means customers can continue to extract value from their current platform investments.

Originally from KDnuggets https://ift.tt/31OTbbU

source https://365datascience.weebly.com/the-best-data-science-blog-2020/showcasing-the-benefits-of-software-optimizations-for-ai-workloads-on-intel-xeon-scalable-platforms

eBook: Vocabularies, Text Mining, and FAIR Data: The Strategic Role Information Managers Play

How can information managers find strategic roles to play in their organization’s AI and data analysis projects? Download this book to learn more.

Originally from KDnuggets https://ift.tt/3jzw0ID

source https://365datascience.weebly.com/the-best-data-science-blog-2020/ebook-vocabularies-text-mining-and-fair-data-the-strategic-role-information-managers-play
