365 Data Science is an online educational career website that offers the incredible opportunity to find your way into the data science world no matter your previous knowledge and experience.
We’ve been developing this project for a while, and finally, the time has come to launch our newest collaboration with a recognized expert in the field of AI and data science.
We’re happy to announce the release of Product Management for AI & Data Science with Danielle Thé.
Danielle Thé is a Senior Product Manager for Machine Learning with a Master’s in Science of Management. She boasts years of experience as a Product Manager and Product Marketing Manager in the tech industry for companies like Google and Deloitte Digital.
In this course, she will teach you everything you need for a successful career as a Product Manager for AI and data science.
You will learn the expert skills needed to manage the development of successful AI products: from defining the role of a product manager and distinguishing between a product manager and a project manager, through executing business strategy for AI and data science, to sourcing data for your projects and understanding how that data needs to be managed.
Danielle will take you through the full lifecycle of an AI or data science project in a company. What is more, she will illustrate how to manage data science and AI teams, improve communication between team members, and how to address ethics, privacy, and bias.
This 12-part course gives you access to over 60 lessons, each paired with resources, notes or articles that complement the notions covered. You’ll also practice with quizzes, assignments and projects to put what you’ve learned into action.
Product Management for AI & Data Science is part of the 365 Data Science Online Program, so existing subscribers can access the courses at no additional cost. To learn more about the course curriculum or subscribe to the Data Science Online Program, please visit our Courses page.
Neural style transfer is an optimization technique that takes two images, a content image and a style reference image (such as an artwork by a famous painter), and blends them together so the output image looks like the content image, but “painted” in the style of the style reference image.
This is implemented by optimizing the output image to match the content statistics of the content image and the style statistics of the style reference image. These statistics are extracted from the images using a convolutional network.
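For the style side, these statistics are commonly summarized as Gram matrices of the feature maps. Here is a minimal sketch of that idea (it mirrors the standard formulation rather than any specific implementation, and the choice of layers is left out):

import tensorflow as tf

def gram_matrix(feature_map):
    # Correlations between feature channels, normalized by the number of spatial locations.
    result = tf.linalg.einsum('bijc,bijd->bcd', feature_map, feature_map)
    num_locations = tf.cast(tf.shape(feature_map)[1] * tf.shape(feature_map)[2], tf.float32)
    return result / num_locations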
import tensorflow as tf
import matplotlib.pyplot as plt

# Create a simple function to display an image:
def imshow(image, title=None):
    if len(image.shape) > 3:
        image = tf.squeeze(image, axis=0)
    plt.imshow(image)
    if title:
        plt.title(title)
# Download images and choose a style image and a content image:
content_path = tf.keras.utils.get_file('Golden_Gate.jpg', 'https://upload.wikimedia.org/wikipedia/commons/0/0c/GoldenGateBridge-001.jpg')
Let’s confirm that we have downloaded and loaded the images correctly.
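If you’re following along, a minimal loader in the spirit of the public TensorFlow style-transfer tutorial might look like this. Note the style image download, the 512-pixel limit, and the Kandinsky example URL are my own assumptions for illustration, not part of the original snippet:

def load_img(path_to_img, max_dim=512):
    # Read, decode, and scale the image so its longest side is max_dim.
    img = tf.io.read_file(path_to_img)
    img = tf.image.decode_image(img, channels=3)
    img = tf.image.convert_image_dtype(img, tf.float32)
    shape = tf.cast(tf.shape(img)[:-1], tf.float32)
    scale = max_dim / tf.reduce_max(shape)
    new_shape = tf.cast(shape * scale, tf.int32)
    return tf.image.resize(img, new_shape)[tf.newaxis, :]

# Example style image (any painting works; this is the Kandinsky used in the TF tutorial):
style_path = tf.keras.utils.get_file(
    'kandinsky.jpg',
    'https://storage.googleapis.com/download.tensorflow.org/example_images/Vassily_Kandinsky%2C_1913_-_Composition_7.jpg')

content_image = load_img(content_path)
style_image = load_img(style_path)
imshow(content_image[0], 'Content Image')
imshow(style_image[0], 'Style Image')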
# Use the TensorFlow Hub arbitrary image stylization model:
import tensorflow_hub as hub
hub_module = hub.load('https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/1')
stylized_image = hub_module(tf.constant(content_image), tf.constant(style_image))[0]
tensor_to_image(stylized_image)
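The snippet calls a tensor_to_image helper that isn’t shown above; a common definition (an assumption on my part, mirroring the public TensorFlow tutorial) is:

import numpy as np
import PIL.Image

def tensor_to_image(tensor):
    # Convert a float tensor in [0, 1] back to a PIL image for display.
    tensor = np.array(tensor * 255, dtype=np.uint8)
    if np.ndim(tensor) > 3:
        assert tensor.shape[0] == 1
        tensor = tensor[0]
    return PIL.Image.fromarray(tensor)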
Discussion
That was a practical example of how easily you can apply Neural Style Transfer and become something of a “digital artist” :-). Note that modern approaches train a model to generate the stylized image directly (similar to CycleGAN).
Understanding the evaluation metrics for recommender systems
In this post, we will discuss evaluation metrics for recommender systems and try to explain them clearly. But before that, let’s briefly recap what a recommender system is.
A recommender system is an algorithm that provides recommendations to users based on their historical preferences and tastes. Nowadays, recommendation systems are used abundantly in our everyday interactions with apps and sites. For example, Amazon uses them to recommend products, Spotify to recommend music, YouTube to recommend videos, and Netflix to recommend movies.
The quality of recommendations depends on how relevant they are to the users, but they also need to be interesting. Recommendations that are too obvious are mundane and not useful. For relevancy, we use metrics like recall and precision; for the latter quality (serendipity), metrics like diversity, coverage, serendipity, and novelty are used. We will explore the relevancy metrics here; for the serendipity metrics, please have a look at this post: Recommender Systems — It’s Not All About the Accuracy.
Let’s say there are some users and some items, like movies, songs, or products. Each user might be interested in some of the items. We recommend a few items (say, k of them) to each user. Now, how do we determine whether our recommendations to each user were effective?
In a classification problem, we usually use the precision and recall evaluation metrics. Similarly, for recommender systems, we use a metric that combines precision and recall: Mean Average Precision (MAP), specifically MAP@k, where k recommendations are provided.
Let’s unpack MAP: the M is just the mean of the average precisions (APs) of all users. In other words, we take the mean of the average precision values, hence Mean Average Precision. If we have 1000 users, we sum the AP for each user and divide the sum by 1000. That is the MAP.
So what is average precision? Before that, let’s understand recall (r) and precision (P).
Precision@k = (number of recommended items in the top k that are relevant) / k
Recall@k = (number of recommended items in the top k that are relevant) / (total number of relevant items)
There is usually an inverse relationship between recall and precision. Precision is concerned with how many of the provided recommendations are relevant. Recall is concerned with how many of all the relevant items are captured among the provided recommendations.
Let’s understand the definitions of recall@k and precision@k. Assume we are providing 5 recommendations in this order: 1 0 1 0 1, where 1 represents relevant and 0 irrelevant. Then precision@3 is 2/3, precision@4 is 2/4, and precision@5 is 3/5. For recall, recall@3 is 2/3, recall@4 is 2/3, and recall@5 is 3/3.
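A quick sketch to verify those numbers (my own illustration, assuming the 3 relevant items are exactly the ones appearing in the list):

def precision_at_k(relevances, k):
    # Fraction of the top-k recommendations that are relevant.
    return sum(relevances[:k]) / k

def recall_at_k(relevances, k, total_relevant):
    # Fraction of all relevant items captured in the top-k recommendations.
    return sum(relevances[:k]) / total_relevant

recs = [1, 0, 1, 0, 1]
for k in (3, 4, 5):
    print(k, precision_at_k(recs, k), recall_at_k(recs, k, total_relevant=3))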
Strictly speaking, we don’t need to understand every detail of average precision (AP) to use it, but we do need to know this:
we can recommend at most k items for each user
it is better to submit all k recommendations because we are not penalized for bad guesses
order matters, so it’s better to submit more certain recommendations first, followed by recommendations we are less sure about
So basically we select k best recommendations (in order) and that’s it.
Here’s another way to understand average precision. You can think of it this way: you type something into Google and it shows you 10 results. It’s probably best if all of them are relevant. But if only some are relevant, say five of them, then it’s much better if the relevant ones are shown first. It would be bad if the first five were irrelevant and the good ones only started from the sixth, wouldn’t it? The AP score reflects this. They should have named it “order-matters recall” instead of average precision.
If you want to go further, let’s dive into the math. If we are asked to recommend N items and the number of relevant items in the full space of items is m, then:
AP@N = (1/m) * Σ_{k=1..N} P(k) · rel(k)
where P(k) is the precision value at the kth recommendation, and rel(k) is an indicator that equals 1 if the kth recommendation was relevant and 0 otherwise.
Consider that there are 5 relevant items (m = 5) and we are making 10 recommendations (N = 10) as follows: 1 0 1 0 0 1 0 0 1 1. Let’s calculate the average precision here.
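Working through the formula (a quick check of my own, not taken from the original post): the relevant recommendations sit at positions 1, 3, 6, 9, and 10, so AP = (1/5) * (1/1 + 2/3 + 3/6 + 4/9 + 5/10) ≈ 0.62.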
Compare this computation with the formula above and you will see the intuition. Now, to show that MAP takes care of the order, as we mentioned earlier, let’s score another set of recommendations: 1 1 1 0 1 1 0 0 0 (here we make the relevant recommendations first). The AP comes out higher, because the relevant items appear earlier in the ranking.
To summarize, MAP computes the mean of the Average Precision (AP) over all the users for a recommendation system. The AP is a measure that takes in a ranked list of the k recommendations and compares it to a list of relevant recommendations for the user. AP rewards you for having a lot of relevant recommendations in the list, and rewards you for putting the most relevant recommendations at the top.
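Here is a short sketch of AP that reproduces the two orderings above (my own illustration; the function and variable names are made up):

def average_precision(relevances, m):
    # relevances: ranked list of 0/1 flags; m: total number of relevant items.
    hits, score = 0, 0.0
    for k, rel in enumerate(relevances, start=1):
        if rel:
            hits += 1
            score += hits / k      # P(k), counted only at relevant positions
    return score / m

print(average_precision([1, 0, 1, 0, 0, 1, 0, 0, 1, 1], m=5))   # ~0.62
print(average_precision([1, 1, 1, 0, 1, 1, 0, 0, 0], m=5))      # ~0.93, relevant items ranked first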
Language-guided navigation is a widely studied and very complex field. It may seem simple for a human to walk through a house to reach the coffee left on the nightstand to the left of the bed, but it is a whole other story for an agent, an autonomous AI-driven system that uses deep learning to perform tasks.
Current approaches try to understand how to move in a 3D environment so the agent can move freely like a human. The new approach I will be covering in this article changes that by having the agent execute only low-level actions in order to follow natural-language navigation directions, such as “Enter the house, walk to your bedroom, go in front of the nightstand to the left of your bed.”
They achieved this by using only four actions: “turn left”, “turn right”, “move forward 0.25 m”, and “stop”. This allowed the researchers to lift a number of assumptions made in prior work, such as the need to know exactly where the agent is at all times.
Using this technique makes the trajectories significantly longer, with an average of 55.88 steps rather than the 4 to 6 steps of current approaches. But again, these steps are much smaller and make the agent much more precise and “human-like.”
Let’s see how they achieved this, along with some impressive examples. Feel free to read the paper and check out their code, both linked at the end of this article.
As the name says, they developed a language-guided navigation task for 3D environments where the agents follow language navigation directions given by a user in order to realistically move in the environment.
In short, the agent is given first-person (egocentric) vision and a human-generated instruction, such as: “Leave the bedroom and enter the kitchen. Walk forward and take a left at the couch. Stop in front of the window.” Then, using this input alone, the agent must take a series of simple control actions like “move forward 0.25 m” or “turn left 15 degrees” to navigate to the goal. Using such simple actions, VLN-CE lifts assumptions of the original VLN task and aims to bring simulated agents closer to reality. For comparison, current state-of-the-art approaches move between panoramas and cover 2.25 meters on average, including obstacle avoidance, in a single action.
The 2-model method
They developed two different models to tackle this task. The first (a) is a simple sequence-to-sequence baseline. The second (b) is a more powerful cross-modal attentional model; both are shown in this picture.
The first model takes a visual representation of the observation, containing depth and RGB features, along with the instruction, at each time step. Using this information, it predicts a series of actions to take, denoted a_t in the image.
The RGB frames and depth maps are encoded using two ResNet-50 networks, one pre-trained on ImageNet and the other trained to perform point-goal navigation.
Then, an LSTM is used to encode the instructions from the user. LSTM is short for Long Short-Term Memory, a recurrent neural network architecture widely used in natural language processing thanks to its memory capabilities, which allow it to use information from previous words as well.
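To make the baseline concrete, here is a minimal Keras-style sketch of the idea. It is my own simplification, with assumed input sizes, a made-up 1000-word vocabulary, and arbitrary layer widths; it is not the authors’ released implementation:

import tensorflow as tf
from tensorflow.keras import layers, Model

# An image encoder plus an instruction encoder feeding an action classifier.
rgb_in = layers.Input(shape=(224, 224, 3), name='rgb_frame')
rgb_feat = tf.keras.applications.ResNet50(include_top=False, pooling='avg')(rgb_in)

instr_in = layers.Input(shape=(None,), dtype='int32', name='instruction_tokens')
instr_emb = layers.Embedding(input_dim=1000, output_dim=128)(instr_in)
instr_feat = layers.LSTM(256)(instr_emb)          # encode the instruction

fused = layers.Concatenate()([rgb_feat, instr_feat])
action_probs = layers.Dense(4, activation='softmax')(fused)   # forward / left / right / stop

baseline = Model(inputs=[rgb_in, instr_in], outputs=action_probs)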
These actions, a, are then fed into the second model. The goal of this second model is to compensate for the lack of visual reasoning in the first model, which is essential for this kind of navigation application.
For example, you need good spatial visual reasoning to understand an instruction such as “to the left of the table.” The agent first needs to locate the table, and then go to the left of it.
This is done using attention. Attention is based on the common intuition that we “attend to” certain parts when processing a large amount of information, like the pixels of an image. More specifically, it is done using two recurrent networks, as you can see in the image: one tracks observations using the same RGB and depth inputs as the first model, while the other makes decisions based on the user’s instructions and the visual features.
This time, the user’s instructions are encoded using a bidirectional LSTM, and the resulting instruction features are used to attend over both the visual and depth features.
Following that, the second recurrent network takes a concatenation of all the features discussed, including an action encoding, as input and predicts the final action.
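As a rough illustration of the attention step (my own toy example with made-up shapes, not the paper’s code):

import tensorflow as tf

batch, n_tokens, hidden = 2, 12, 256
instr_tokens = tf.random.normal((batch, n_tokens, hidden))  # bidirectional-LSTM outputs
obs_state = tf.random.normal((batch, hidden))               # observation-tracker state

scores = tf.einsum('bh,bth->bt', obs_state, instr_tokens)   # dot-product attention scores
weights = tf.nn.softmax(scores, axis=-1)
attended = tf.einsum('bt,bth->bh', weights, instr_tokens)   # weighted instruction summary

fused = tf.concat([obs_state, attended], axis=-1)
action_logits = tf.keras.layers.Dense(4)(fused)             # forward / left / right / stop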
To train for this task, they used a total of 4475 trajectories split between the train and validation sets. For each trajectory, they provided multiple language instructions and an annotated “shortest path ground truth via low-level actions,” as seen in this image.
At first glance, it looks like this needs a lot more detail and time to achieve the task, as shown in the picture below, where (a) is the current approach using real-time localization of the agent and (b) is the covered approach using low-level actions.
But when we compare it to the traditional panoramic view with perfect localization, versus having no position given and using only low-level actions, it is clear that it needs far less computation to succeed, as you can see from the amount of information given to each approach in the picture above.
Results
Here is a comparison between this approach and the current state-of-the-art approaches on the VLN validation/test datasets.
From these quantitative results, we can clearly see that this cross-modal approach with multiple low-level actions in a continuous environment outperforms the nav-graph navigation approaches in every way. It is hard to appreciate such results from a purely quantitative comparison, so here are some impressive examples using this new technique:
Watch the video to see more examples of this new technique:
I invite you to check out the public release of the code on their GitHub. Of course, this was just an introduction to the paper; both are linked below for more information.
Algorithms for text analytics must model how language works to incorporate meaning in language—and so do the people deploying these algorithms. Bender & Lascarides 2019 is an accessible overview of what the field of linguistics can teach NLP about how meaning is encoded in human languages.
Amazon’s Machine Learning University is making its online courses available to the public, and this time we look at its Accelerated Computer Vision offering.
Computer vision is, simply put, the process of perceiving images and videos available in digital formats.
In Machine Learning (ML) and AI, computer vision is used to train models to recognize certain patterns and store that knowledge so it can be used to make predictions in real-life applications.
The applications of computer vision in artificial intelligence are becoming limitless and have expanded into emerging fields like automotive, healthcare, retail, robotics, agriculture, autonomous flying (drones), manufacturing, and more.
So in this blog, we will harness the power of deep learning to show you how to solve one of the core problems of computer vision: image segmentation.
Image segmentation is a computer vision task in which we partition an image into different segments.
Yes, it sounds like object detection, but it is a different task. Object detection methods help us draw bounding boxes around certain entities/objects in a given image, whereas image segmentation gives a more detailed understanding of the imagery than image classification or object detection.
In simple words, in image segmentation we assign/classify each pixel to a particular class.
Semantic segmentation — classifies all the pixels of an image into meaningful classes of objects. These classes are “semantically interpretable” and correspond to real-world categories. For instance, you could isolate all the pixels associated with a cat and color them green. This is also known as dense prediction because it predicts the meaning of each pixel.
Instance segmentation — identifies each instance of each object in an image. It differs from semantic segmentation in that it doesn’t categorize every pixel. If there are three cars in an image, semantic segmentation classifies all the cars as one instance, while instance segmentation identifies each individual car.
This problem comes from the healthcare domain. Imagine suddenly gasping for air, helplessly breathless for no apparent reason. Could it be a collapsed lung? That is exactly the question we are going to help answer.
Pneumothorax can be caused by chest injury, damage from underlying lung disease, or most horrifying — it may occur for no obvious reason at all. On some occasions, a collapsed lung can be a life-threatening event.
Pneumothorax is usually diagnosed by a radiologist on chest X-ray images, but even for a professional with years of experience, it can be difficult to confirm.
So our goal is to detect and segment the pneumothorax-affected area using semantic segmentation methods, so that we can help radiologists by providing results with higher precision.
Solution:
An AI algorithm to detect pneumothorax would be useful for solving this problem. We will tackle it in two phases:
1. Pneumothorax Classification
2. Pneumothorax Segmentation
In the first phase, we will develop a classification model to classify pneumothorax, and in the second phase we will build a model for the segmentation task on the given image.
How will the prediction pipeline work?
If the classification model detects pneumothorax in the input X-ray image, the prediction pipeline passes that image to the segmentation model to segment the pneumothorax region, so that a radiology expert can easily analyze and diagnose the problem.
This is how our final pipeline will work.
4. Dataset:
Before diving into the classification and segmentation architectures, let’s have a glance at the dataset.
Here we are using a dataset that has been modified from the Kaggle competition’s dataset, so that we can easily get our input images and target masks in .png format.
Before starting the explanation of the deep learning architecture, I assume that you have a good understanding of basic convolutional operations like Conv2D, UpSampling, Dense layers, Conv2DTranspose layers, Softmax, ReLU, and BatchNormalization (all the basic building blocks of deep learning with Keras) and, most importantly, residual blocks (ResNet & DenseNet).
5. Classification & Segmentation Architecture:
As you know, we have divided this problem into two parts, so let’s have a look at Part 1.
Part 1 : Pneumothorax Classification:
To solve this classification problem, we used the DenseNet121 architecture.
DenseNet is one of the newer architectures for image classification and object recognition. It is quite similar to the ResNet architecture, with one fundamental difference: ResNet uses an additive operation (+) that merges the previous layer (identity) with the future layer, whereas DenseNet concatenates the output of the previous layer with the future layer.
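To see that difference in a toy example (illustrative only, not part of the original post):

import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal((1, 32, 32, 64))               # "previous layer" features
f = layers.Conv2D(64, 3, padding='same')(x)         # "future layer" transformation

res_out = layers.Add()([x, f])                      # ResNet: element-wise addition (+)
dense_out = layers.Concatenate()([x, f])            # DenseNet: channel concatenation
print(res_out.shape, dense_out.shape)               # (1, 32, 32, 64) vs (1, 32, 32, 128)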
For our problem, I used the DenseNet-121 architecture via this import:
from tensorflow.keras.applications.densenet import DenseNet121
But there is one more important thing I did to get state-of-the-art results: for transfer learning, instead of using pre-trained ImageNet weights, I used the weights of a ChestXpert DenseNet, because that model was trained on medical X-ray images to classify around 14 lung-related diseases.
Pneumothorax was one of them, so I loaded the ChestXpert weights into our DenseNet-121 directly, but only for the convolutional blocks; the dense layers were initialized with Keras defaults.
Below is the Keras code to implement the above model:
from tensorflow.keras.applications.densenet import DenseNet121
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model, load_model

# create the base pre-trained model
base_model = DenseNet121(include_top=False,
                         input_tensor=img_input,
                         input_shape=input_shape,
                         weights=base_weights,
                         pooling='avg')
x = base_model.output

# add a logistic layer -- let's say we have 14 classes
predictions = Dense(14, activation='sigmoid', name='predictions')(x)

# this is the model we will use
model = Model(inputs=img_input, outputs=predictions)

tf.random.set_seed(1234)
base, model = get_chexnet_model()
x = Dense(1024, activation='relu', kernel_initializer='he_normal')(model.layers[-2].output)
x = Dense(2, activation='softmax', kernel_initializer='he_normal')(x)
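The snippet stops before the new two-class head is wrapped into a trainable model; a minimal, hypothetical completion (assuming get_chexnet_model returns the ChestXpert-weighted DenseNet as model) could look like this:

# Hypothetical completion: wrap the new 2-class head into a model and compile it.
finetune_model = Model(inputs=model.input, outputs=x)
finetune_model.compile(optimizer='adam',
                       loss='categorical_crossentropy',
                       metrics=['accuracy'])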
We achieved 90% accuracy on the training set and 87% accuracy on the validation set.
Now that Part 1 is complete, let’s have a look at Part 2.
Part 2: Pneumothorax Segmentation:
Now we will discuss the most important part of this blog.
For this task, we implemented an architecture called UNet++, a nested UNet; it is an advanced, or you could say extended, version of UNet.
To understand this architecture, you should have a basic idea of how UNet works for semantic segmentation; you can read this to understand the UNet architecture.
I hope you now have a better understanding of UNet.
UNet++: Nested UNet Architecture for Medical Image Segmentation:
As you can see in this diagram, there are some similarities between UNet and UNet++, because like UNet, UNet++ also follows an encoder-decoder approach to generate the semantic segmentation.
But here are some points that differentiate UNet++ from UNet:
Convolution layers on the skip pathways, which bridge the semantic gap between encoder and decoder
Dense skip connections on the skip pathways, which improve gradient flow and help prevent the vanishing gradient problem
Deep supervision, which enables model pruning
That’s it. But if you want to dig deeper into how these things work in UNet++, you will get a better idea from here.
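To make the first two points concrete, here is a tiny illustrative sketch of one nested skip node (my own simplification; the function name and filter counts are made up):

from tensorflow.keras.layers import Conv2D, UpSampling2D, concatenate

# One nested skip node: concatenate the same-level features with the upsampled
# feature map from the level below, then convolve (the "dense skip pathway").
def nested_node(same_level_feats, lower_level_feat, filters):
    up = UpSampling2D((2, 2))(lower_level_feat)
    x = concatenate(same_level_feats + [up])
    x = Conv2D(filters, (3, 3), padding='same', activation='relu')(x)
    return x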
The Keras code below will help you define the model above (UNet++):
from tensorflow.keras.layers import Conv2D, BatchNormalization, LeakyReLU, Add
from tensorflow.keras.models import Model

def convolution_block(x, filters, size, strides=(1, 1), padding='same', activation=True):
    x = Conv2D(filters, size, strides=strides, padding=padding)(x)
    x = BatchNormalization()(x)
    if activation:
        x = LeakyReLU(alpha=0.1)(x)
    return x

def residual_block(blockInput, num_filters=16):
    x = LeakyReLU(alpha=0.1)(blockInput)
    x = BatchNormalization()(x)
    blockInput = BatchNormalization()(blockInput)
    x = convolution_block(x, num_filters, (3, 3))
    x = convolution_block(x, num_filters, (3, 3), activation=False)
    x = Add()([x, blockInput])
    return x

    # ... inside UEfficientNet(), after the encoder/decoder blocks ...
    model = Model(input, output_layer, name='u-xception')
    return model

tf.keras.backend.clear_session()

img_size = 256
model = UEfficientNet(input_shape=(img_size, img_size, 3), dropout_rate=0.3)
Training:
To train this model, I used:
Dice_Loss:
Generally, for image segmentation tasks, a combination of binary cross-entropy and Dice loss is used extensively, because binary cross-entropy alone is not a good option when you have an imbalanced dataset, whereas Dice loss works excellently in that scenario.
So our final loss will be:
loss = binary_crossentropy + dice_loss, where dice_loss = 1 − (2 · |y_true ∩ y_pred|) / (|y_true| + |y_pred|), averaged over a batch of size N.
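For implementation, a common Keras-style sketch of this combined loss looks like the following (my own version; the smoothing constant of 1 is an assumption to avoid division by zero):

import tensorflow as tf
from tensorflow.keras import backend as K

def dice_loss(y_true, y_pred, smooth=1.0):
    # 1 - Dice coefficient, computed on the flattened masks.
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    dice = (2.0 * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)
    return 1.0 - dice

def bce_dice_loss(y_true, y_pred):
    # Binary cross-entropy plus Dice loss, as described above.
    return tf.keras.losses.binary_crossentropy(y_true, y_pred) + dice_loss(y_true, y_pred)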
IoU_Score (Intersection over Union):
We have talked about the loss, but what about the evaluation metric? Should we use accuracy?
The answer is no; for image segmentation and object localization tasks, the IoU score is used extensively.
Accuracy also counts the pixels that are not part of the object in the image. Especially in medical image segmentation, where the target mask contains very few pixels, accuracy will always be high without taking into account the location of the region affected by the disease.
After looking at this diagram, you should have a good idea of how the IoU score works.
It comes in handy when you’re measuring how closely an annotation or test output lines up with the ground truth. As a ratio of the areas of intersection and union, it works on annotations of all shapes and sizes.
What’s cool is how IoU can be combined with F1 scores to measure the accuracy of object detection tasks with multiple annotations per image.
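For reference, a simple Keras-style IoU metric might look like this (a sketch of my own, assuming binary masks and a 0.5 threshold):

from tensorflow.keras import backend as K

def iou_score(y_true, y_pred, threshold=0.5, smooth=1.0):
    # Binarize predictions, then compute intersection over union on the masks.
    y_pred = K.cast(y_pred > threshold, 'float32')
    intersection = K.sum(y_true * y_pred)
    union = K.sum(y_true) + K.sum(y_pred) - intersection
    return (intersection + smooth) / (union + smooth)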
After setting the loss and evaluation metric, we trained this model using Adam as the optimizer for 40 epochs.
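In Keras terms, the training setup boils down to something like this, using the bce_dice_loss and iou_score sketches above (train_gen and val_gen are hypothetical data generators, and the learning rate is an assumption):

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss=bce_dice_loss,
              metrics=[iou_score])
model.fit(train_gen, validation_data=val_gen, epochs=40)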
After 40 epochs of training, we were able to reach a 0.73 IoU score on the training set and a 0.71 IoU score on the validation set.
7. Inference Pipeline:
As I mentioned earlier in the inference pipeline diagram, we only predict the segmentation mask for an input chest X-ray image if our classifier detects pneumothorax in that image; otherwise, there is no point in predicting a segmentation mask for it.
Here I am showing some predictions from our segmentation model.
Here, green pixels represent the ground truth (actual mask) and red pixels represent the predicted mask.
We have developed a complete class called Pipeline for the final production pipeline.
if pre_cls == 1:
    img_seg = self.segmentation.predict(tf.expand_dims(image, axis=0), steps=1)
    plt.imshow(img_seg[0, :, :, 0], cmap='Reds', alpha=0.4)
    plt.title("Pneumothorax is Detected.........")
else:
    plt.title("Pneumothorax is not Detected.........")
plt.show()
So our pipeline will overlay the predicted mask as a red region, showing the location of the pneumothorax in the patient's input X-ray image.
8. Conclusion :-
I really appreciate you taking the time to read this blog.
Please clap if you liked it and learned something new, because that will encourage me to share more knowledge and information related to deep learning through my blogs.