Welcome to My Week in AI! Each week in this blog I share what I have been learning and working on, along with a highlight from recent research.
This week I learned about big data engineering by starting the ‘Advance Your Skills as an Apache Spark Specialist’ learning pathway on LinkedIn Learning. I am somewhat familiar with Spark and Hadoop, but I wanted to learn more, and this seemed like a good way to gain those skills.
I also read a lot this week about data visualization in Python. I came across a powerful dashboarding library called Altair, which lets you build polished dashboards and visualizations. It requires only Python knowledge and is built on a grammar of graphics, so it is fairly intuitive. This is definitely a library I am going to add to my toolbox and explore further.
I spent another large part of my week brainstorming, researching, and writing a proposal for Etsy’s Summer of Vision Fellowship program. The assignment was to develop a machine learning project based on the question, “How might we use visual cues to improve buyers’ shopping experience on Etsy?” This prompted me to read a lot about recommendation systems, computer vision in e-commerce, and the use of computer vision and NLP in concert for such applications.
As part of my research for the Etsy Fellowship application, I came across a paper by Chen, Gong and Bazzani called ‘Image Search with Text Feedback by Visiolinguistic Attention Learning’ that will be presented at the upcoming CVPR 2020 conference¹. The paper discusses a new framework that these researchers developed: given a reference image and text feedback such as ‘same but in red’ or ‘without buckle,’ images may be retrieved that resemble the reference image but with the desired modification as described by the text. A potential application for this would be as a search feature on an e-commerce site.

This task involves learning an amalgamated representation that captures visual and textual information from the inputs. In order to do this, the authors presented the Visiolinguistic Attention Learning (VAL) framework, which is made up of three parts: an image encoder, a text encoder, and multiple composite transformers that modify visual feature maps based on language information. The image encoder was made up of a typical CNN with feature maps being extracted from several different layers, and the text encoder was made up of an LSTM followed by max pooling and linear projection layers.
To me, the most fascinating part of this research is in the visiolinguistic representation using the composite transformers. The visual and language features are fused and then passed through a two-stream module that learns attentional transformation and preservation. First, the self-attention stream learns non-local correlations in the fused features and generates an attention mask that highlights spatial long-range interdependencies. In parallel, a joint-attention stream works to retain the visual features of the reference image. The outputs of these two streams are combined to create a set of composite features.

In terms of training, hierarchical matching is utilized. The primary objective function relates to visual-visual matching so as to ensure that the composite features are very similar to the target features. The secondary objective function relates to visual-semantic matching, which is useful when images have accompanying text such as descriptions or tags.

In my opinion, the most exciting applications of this research are in online shopping. If you see a pair of shoes that you like but would prefer them in a different color, you would just have to type “I like these, but in blue” and the website would attempt to find a blue version of the original pair for you. This combination of semantic and visual features is not something I have come across much (probably because it is a very difficult task!).
Join me next week for more updates on my progress and a look at some cutting-edge research in the field of time series forecasting. Thanks for reading and I appreciate any comments/feedback/questions.
An update on my blog post from last week: IBM announced on June 8th that it will cease all work on facial recognition due to fears that its products could be used for racial profiling and perpetuating biases.
Update, June 11th: Amazon will not allow use of its facial recognition software by police for at least the next year.
[1] Y. Chen, S. Gong, and L. Bazzani, “Image Search with Text Feedback by Visiolinguistic Attention Learning,” IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.



My Week in AI: Part 3 was originally published in Becoming Human: Artificial Intelligence Magazine on Medium.
This is part of a project at NordAxon, the company where I work as a Data Scientist. The purpose of this article is to share my insights and knowledge, to encourage collaboration within the Data Science community, and to help you kickstart your own project in gesture recognition.
Gesture recognition is an important part of Human Computer Interaction (HCI). It can be used to improve user interfaces, e.g. virtual reality in the gaming industry or remote control of robot arms using gestures. It can even make life easier for people with disabilities through assistive technologies such as sign language translation.
As of today, with Deep Learning and only 2D RGB images as inputs, we can build fairly robust models for static gesture recognition. The more promising case of continuous gesture recognition, however, still faces many challenges and remains an open problem.
This article is intended for Data Scientists who have at least a basic understanding of machine learning and deep learning in computer vision.
Most problems in gesture recognition are associated with accuracy and performance. For instance, different people might perform the same gesture differently in both speed and movement range. This greatly increases the complexity of the problem if we have many types of gestures. In this article, we will restrict our discussion only to hand gestures as they are the most important features for many gesture recognition tasks.
Let’s address the problem of acquiring a frame-level dataset for continuous hand gesture recognition. This problem was addressed in a 2016 paper in which the authors train a CNN on 1,000,000 images using weakly supervised learning. Their algorithm takes advantage of the Expectation-Maximization (EM) algorithm and “inaccurate” labels in order to train a fairly robust CNN model on different hand shapes. In other words, through clustering the model learns different hand shapes. This is a time-saving approach if you want to create your own frame-level dataset of hand shapes.
Gestures are characterized by spatial movement of the hand through time, so our model must be able to capture spatio-temporal features. In other words, the model will handle 2D images (spatial features) through time (temporal features). Here I list some approaches that can be explored and tested:
The next steps are to decide the following



A Technical Introduction to Gesture Recognition for Data Scientists was originally published in Becoming Human: Artificial Intelligence Magazine on Medium.
Featuring more data engineering with SQL, the Wheat Detection Kaggle competition, and research exploring demographic biases in facial recognition technology.
Welcome to My Week in AI! Each week in this blog I share what I have been learning and working on, along with a highlight from recent research.
This week’s focus has been more data engineering. I have finished the ‘Master SQL for Data Science’ learning pathway on LinkedIn Learning, and I found it very useful and interesting. I am planning to learn more about document databases (specifically MongoDB) and graph databases (specifically Neo4j), and I am currently setting up my own projects on both of these platforms to dive deeper into their intricacies. Up next on my big data and data engineering learning list are Hadoop, Kafka, and more advanced Spark.
I’ve also been working on a project: the Kaggle Wheat Detection competition, essentially an object detection computer vision task. The aim is to identify individual wheat plant heads in images of agricultural fields and there are two main challenges associated with this: plants may be very dense and so identifying individual heads is difficult, and images may be blurred due to wind. My first step for this project was loading and cleaning data, and now I am trying some fast and rough transfer learning approaches to understand the baseline accuracy I can achieve. Further steps for improving accuracy might be augmenting the training data, training my own classifier, and using transfer learning to generate image embeddings as an input to my own classifier.
This week’s research paper is in a field I touched on in my post from last week: ethics and fairness in AI. There is concern among local and federal government agencies about inconsistent accuracy across demographic groups when identifying faces using facial recognition technology. This has led several US cities and other governing bodies to ban the use of such technology in the public sphere.
The research I’m highlighting this week is by Garcia et al. and titled ‘The Harms of Demographic Bias in Deep Face Recognition Research’.¹ The authors showed that the accuracy of three state-of-the-art facial recognition algorithms dropped significantly for non-Caucasians compared to Caucasians, and for women compared to men. The algorithms they tested were VGGFace, DLib and FaceNet. The authors presented their results as the mean Euclidean distance between the face embeddings of different subjects within each of the demographic categories tested. For example, they found that the mean Euclidean distance between the embeddings for Caucasian Male samples was ~1.343, but only ~1.02 for Asian Female samples, when using the FaceNet algorithm. This means that Asian Female samples had more similar embeddings on average, and so were more likely to be misidentified. This difference of ~0.3 between the two demographic groups is significant relative to typical Euclidean distance values for such embeddings.
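To make that metric a little more concrete, here is a toy sketch in R of how a mean pairwise Euclidean distance between embeddings could be computed. The embeddings below are random placeholders, not the paper’s data or models:

```r
# Toy sketch: mean pairwise Euclidean distance between face embeddings.
# 100 hypothetical faces, each represented by a random 128-dimensional embedding.
set.seed(42)
embeddings <- matrix(rnorm(100 * 128), nrow = 100, ncol = 128)

# dist() computes all pairwise Euclidean distances between the rows.
mean_distance <- mean(dist(embeddings, method = "euclidean"))
print(mean_distance)
```

A lower mean distance within a demographic group means its embeddings sit closer together on average, which is exactly why misidentification becomes more likely.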
After evaluating the facial recognition algorithms on data from different demographic groups, the researchers demonstrated the ease of performing a morphing attack on these algorithms in an automated border control (ABC) scenario. ABCs use facial recognition software to compare a person to the image on their travel document, and open immigration gates if the two are deemed a match. A morphing attack involves an accomplice and an imposter who work together to create a fake document with a combination of their two faces. Garcia et al. developed morphed images from a set of ‘accomplice’ images and tested these on the FaceNet algorithm. They found that attacks based on Asian faces worked 4.32% of the time compared to 0.27% of the time for Caucasian faces. This is a practical illustration of the consequences that demographic bias in facial recognition technology can have.

Stay tuned next week for a look into big data technologies and some more emerging research. Thanks for reading and I appreciate any comments/feedback/questions.
[1] R. V. Garcia, L. Wandzik, L. Grabner and J. Krueger, “The Harms of Demographic Bias in Deep Face Recognition Research,” 2019 International Conference on Biometrics (ICB), Crete, Greece, 2019, pp. 1–6, doi: 10.1109/ICB45273.2019.8987334.



My Week in AI: Part 2 was originally published in Becoming Human: Artificial Intelligence Magazine on Medium.

Creating and understanding a histogram is an integral part of any data analysis process. In fact, if your work or education is in any way related to a quantitative discipline, you’ll most likely be required to make a histogram of your own or examine results featuring one. Not to mention that in today’s data-driven world, a strong data visualization skillset is one of the gateways to a successful career in data science.
That’s why in this tutorial, we’ll show you how to create a histogram in R.
More specifically, you will learn how to make a ggplot2 histogram. You’re about to find out how to use one of the most popular visualization libraries in R. And, what’s more, you will be able to add the ggplot2 histogram to your own analysis.
So, let’s get started, shall we?
A histogram is one of the most useful tools to understand numerical data.
The first thing you need to remember is that a histogram requires precisely one numerical feature.
A histogram shows the distribution of a numeric variable. The variable’s range of values is split into intervals, represented by different bins. The height of each bin shows the number of observations within that interval.
At this point, it’s worth mentioning another key aspect of a histogram.
You may have noticed that it looks similar to a bar chart. However, a histogram’s bins represent neighbouring intervals. Hence, there is no space between the bins of a histogram, unlike between the bars in a bar chart.


Now that you know what a histogram is and what its purpose is, let’s start work on our actual ggplot2 histogram.
When it comes to data analysis and statistics, R is one of the most popular choices among data scientists.
And when it comes to visualizing data in R, there is one clear stand-out choice – ggplot2. It is one of the most popular data visualization libraries in the R language. So popular, in fact, that there is now a ggplot-style library in Python based on the R version, so the approach is supported in more than one programming language.
But no matter which environment you’re programming in, to obtain a histogram, first, you need some data.
For our histogram, we’ll be using data on the California real estate market.
In a new variable called real_estate, we load the file with the read.csv() function. We also set header to TRUE to include the column names, and use a comma as the separator.
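A minimal sketch of this loading step; the file name below is an assumption for illustration, so substitute the path to your own copy of the dataset:

```r
# Load the California real estate data into a data frame.
# The file name is assumed for illustration; adjust it to your own file.
real_estate <- read.csv("california_real_estate.csv", header = TRUE, sep = ",")
```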

Here, if your data file isn’t in your main R working directory, you must also include the path to the file.
After loading the data, we’re able to explore it in more detail with the aid of the environment pane. By clicking on the real_estate variable, we observe that our data frame contains a little over 250 observations and a total of 9 features.

However, we rely on a single feature for our histogram, namely ‘Price’. As we’ve discussed, a histogram requires precisely one measure.
With that in mind, let’s proceed with creating our histogram with the help of ggplot().
We start with the data layer, which is our real_estate data frame.
We move on to the aesthetics and as discussed, we’re creating a histogram of ‘Price’. Hence, we need only specify the ‘Price’ column here.
Lastly, the third layer is the geometry. To create our histogram, we must use geom_histogram().
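Putting the three layers together, the call looks roughly like this (assuming the real_estate data frame loaded earlier, with its Price column):

```r
library(ggplot2)

# Data layer: real_estate; aesthetics: Price on the x-axis; geometry: a histogram
ggplot(data = real_estate, aes(x = Price)) +
  geom_histogram()
```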

After executing the code, we obtain our ggplot2 histogram.

Now, we can examine our newly obtained histogram. It shows 30 different bins, which is the default number for geom_histogram(). However, based on our data, a smaller number would be more appropriate.
Choosing an appropriate number of bins is the most crucial aspect of creating a histogram. Through varying bin sizes, a histogram can reveal vastly different insights. This is a broad topic and examining it in more detail would require a tutorial on its own!
But here, we stay on the practical side of things and see how to alter a histogram’s bin count in ggplot2.
We can achieve this through the bins parameter. In the geometry layer, we add another argument, bins, and for this histogram we set it equal to 8.
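In code, that is a single extra argument to geom_histogram() (again assuming the real_estate data frame from earlier):

```r
library(ggplot2)

# Same histogram, but with 8 bins instead of the default 30
ggplot(data = real_estate, aes(x = Price)) +
  geom_histogram(bins = 8)
```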

Also, in this layer, we’re able to control additional aspects of our histogram. For instance, we can specify the bin width and the bin boundaries, among other options. Feel free to explore these when you’re creating your own histogram.
We’re moving on to some styling options (but we encourage you to explore additional options for geom_histogram() on your own, as well).
One of the most crucial aspects of every visualization is the colors we choose to display it. And while remaining with the default is always an option, taking that extra step and choosing a custom color is what sets your visualization apart.
For our histogram, it will be a blue that is close to our hearts: the 365 Data Science blue, which has the hex code #108A99. Altering the color is achieved with the fill parameter.
Now, in a histogram, unlike a bar chart, there is no space between two neighboring bins. All the bins seem as if they’ve been glued together, which, sadly, makes them less distinguishable. But we can avoid that by adding a white border around each bin, creating separation among the blue bins. We control the border color through the color argument, so we set it to white.
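Both options live in the geometry layer; a sketch of this step:

```r
library(ggplot2)

# fill sets the color of the bins; color sets the border drawn around each bin
ggplot(data = real_estate, aes(x = Price)) +
  geom_histogram(bins = 8, fill = "#108A99", color = "white")
```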


This is already an excellent result! However, there are a few additional elements, aside from color, which could really set your chart apart.
You can style a chart by customizing its theme. The default ggplot2 theme has a grey background, which isn’t fitting, especially with our brand-new color. So instead, we’ll rely on theme_classic(), which has a clean design and a white background.
And of course, we cannot leave our histogram without a title. We include one with the help of ggtitle(); it reads ‘Distribution of Real Estate Prices’. The title could also do with being larger, which can be achieved by adding a theme() layer with a plot.title element: we use element_text() and choose a size of 16 and a bold face.
While we’re at it, some axis labels wouldn’t go amiss. With xlab() we set the x-axis label to ‘Price in thousands of dollars’, and with ylab() we set the y-axis label to ‘Number of Properties’.
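Putting all of the styling layers discussed above together, the final chart looks roughly like this:

```r
library(ggplot2)

# Final version: classic theme, bold 16-point title, and custom axis labels
ggplot(data = real_estate, aes(x = Price)) +
  geom_histogram(bins = 8, fill = "#108A99", color = "white") +
  theme_classic() +
  ggtitle("Distribution of Real Estate Prices") +
  theme(plot.title = element_text(size = 16, face = "bold")) +
  xlab("Price in thousands of dollars") +
  ylab("Number of Properties")
```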


And that’s all, folks! With just a few carefully curated steps, we’ve achieved a professional and well-styled histogram. We relied on ggplot2’s capabilities in R and then used our knowledge of aesthetics to further transform the histogram, ensuring that our chart is the best it can be.
The topic of how to create a histogram, and how to create one the right way, is a broad one, and this tutorial’s goal was to provide you with all the necessary steps to create a ggplot2 histogram in R. However, you shouldn’t limit yourself to one environment only. So, if you’d like to develop your data visualization skillset in technologies like Python, R, Tableau, and Excel, check out our Complete Data Visualization Course.
Next tutorial: How To Make a GGPlot2 Scatter Plot in R?