The Logic of Digital Memories


In the mid-90s, a variation of recurrent net with so-called Long Short-Term Memory units, or LSTMs, was proposed by the German researchers Sepp Hochreiter and Juergen Schmidhuber as a solution to the vanishing gradient problem.

LSTMs help preserve the error that can be backpropagated through time and layers. By maintaining a more constant error, they allow recurrent nets to continue to learn over many time steps (over 1000), thereby opening a channel to link causes and effects remotely. This is one of the central challenges to machine learning and AI, since algorithms are frequently confronted by environments where reward signals are sparse and delayed, such as life itself. (Religious thinkers have tackled this same problem with ideas of karma or divine reward, theorizing invisible and distant consequences to our actions.)

LSTMs contain information outside the normal flow of the recurrent network in a gated cell. Information can be stored in, written to, or read from a cell, much like data in a computer’s memory. The cell makes decisions about what to store, and when to allow reads, writes and erasures, via gates that open and close. Unlike the digital storage on computers, however, these gates are analog, implemented with element-wise multiplication by sigmoids, which are all in the range of 0–1. Analog has the advantage over digital of being differentiable, and therefore suitable for backpropagation.
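To make the idea of an analog gate concrete, here is a minimal sketch in plain Python. The function names and the sample numbers are made up for illustration; the only thing taken from the text is that a gate is a sigmoid value between 0 and 1 multiplied element-wise with a signal.

```python
import math


def sigmoid(x):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def apply_gate(gate_logits, signal):
    """Scale each element of `signal` by its own gate value in (0, 1)."""
    return [sigmoid(g) * s for g, s in zip(gate_logits, signal)]


# Hypothetical values: a large positive logit opens the gate almost fully,
# a large negative logit closes it almost fully, and 0 lets half through.
signal = [2.0, -3.0, 0.5]
gate_logits = [10.0, -10.0, 0.0]
gated = apply_gate(gate_logits, signal)
print(gated)
```

Because the sigmoid is smooth, a small change in a gate's logit produces a small change in the output, which is what makes the gate trainable with backpropagation.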

Those gates act on the signals they receive and, similar to the neural network’s nodes, they block or pass on information based on its strength and import, which they filter with their own sets of weights. Those weights, like the weights that modulate input and hidden states, are adjusted via the recurrent network’s learning process. That is, the cells learn when to allow data to enter, leave or be deleted through the iterative process of making guesses, backpropagating error, and adjusting weights via gradient descent.


Successful LSTM models have been built on a huge number of datasets. These include the models powering large-scale speech and translation services (Hochreiter & Schmidhuber, 1997).

The problem of long-term dependencies (github)

LSTMs are needed because standard RNNs have difficulty remembering context across longer sequences.

LSTMs are explicitly designed to avoid the long-term dependency problem. Keeping information for long periods of time is their default behavior, not something they struggle to learn.

The contextual information that standard RNNs can access is restricted because the influence of a given input on the hidden layer, and therefore on the network output, either decays or blows up exponentially as it cycles around the network’s recurrent connections. As mentioned above, this is referred to as the vanishing gradient problem, and LSTM is specifically designed to address it. The following statement by Gers et al. (2000) is worth emphasizing when it comes to understanding LSTMs:

“Standard RNNs fail to learn in the presence of time lags greater than 5–10 discrete time steps between relevant input events and target signals. LSTM can learn to bridge minimal time lags in excess of 1000 discrete time steps by enforcing constant error flow through “constant error carrousels” (CECs) within special units, called cells.”


LSTM architecture was motivated by an analysis of error flow in existing RNNs which discovered that long time lags were inaccessible to existing architectures, because backpropagated error either blows up or decays exponentially.
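A rough way to see why the error "either blows up or decays exponentially" is to assume, as a deliberate simplification, that the error is multiplied by the same factor at every time step. The function name and the factors below are illustrative only and are not taken from the original analysis.

```python
def backpropagated_scale(recurrent_factor, steps):
    """Scale of an error signal after flowing back through `steps` time steps,
    assuming the same multiplicative factor applies at every step."""
    return recurrent_factor ** steps


# Hypothetical factors: slightly below 1, exactly 1, slightly above 1.
for factor in (0.9, 1.0, 1.1):
    scales = [backpropagated_scale(factor, t) for t in (10, 100, 1000)]
    print(factor, scales)
```

Only a factor of exactly 1 keeps the error constant over 1000 steps, which is the intuition behind the "constant error carrousel" quoted above.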

An LSTM layer consists of a set of recurrently connected blocks, known as memory blocks. These blocks can be thought of as a differentiable version of the memory chips in a digital computer. Each one contains one or more recurrently connected memory cells and three multiplicative units — the input, output and forget gates — that provide continuous analogues of write, read and reset operations for the cells.

Learning rate and network size are the most crucial tunable LSTM hyperparameters, and they can be tuned independently. In particular, the learning rate can be calibrated first on a fairly small network, which saves a lot of experimentation time.

Let’s see how LSTMs function in detail now. You may remember from the previous chapters that RNNs possess a chain of repeating neural network modules, each usually in the form of a single tanh layer.

The repeating module in a standard RNN contains a single layer (github)

The repeating module in LSTMs is a bit different: instead of a single neural network layer, it has four, which interact in a very special way.

The repeating module in an LSTM contains four interacting layers (github)

In the figure above, the pink circles represent pointwise operations, like vector addition, while the yellow boxes are learned neural network layers. Lines merging denote concatenation, while a line forking denotes its content being copied and the copies going to different locations. Each line carries an entire vector from the output of one node to the inputs of others.

So, LSTMs add gates that control when memory is saved from one iteration to the next. The activation of these gates is controlled by means of connection weights. The special units in the recurrent hidden layer, called memory blocks, contain (Koenker et al, 2001):

○ Memory cells

○ Multiplicative units called gates

■ Input gate: controls flow of input activations into memory cell

■ Output gate: controls output flow of cell activations

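■ Forget gate: scales the cell’s internal state before it is fed back into the cell, which lets the cell forget or reset its memory when processing continuous input streams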

Let’s look at each step of how the LSTM works (Koenker et al, 2001):

– First, a decision should be made about what information needs to be thrown away from the cell state. A sigmoid layer called the “forget gate layer” looks at h_{t−1} and x_t and outputs a number between 0 and 1 for each number in the cell state C_{t−1}. While ‘1’ denotes “completely keep this,” ‘0’ represents “get rid of this completely.”

LSTM Walk-through-1 (github)

– The next step is to decide what new information should be stored in the cell state.

– First, a sigmoid layer called the “input gate layer” decides which values are to be updated.

– Next, a tanh layer creates a vector of new candidate values, C̃_t, that could be added to the state.

LSTM Walk-through-2 (github)

– The old cell state, C_{t−1}, should be updated into the new cell state C_t. The previous steps already decided what to do, so one only needs to multiply the old state by f_t, forgetting the things decided to be forgotten earlier. Then, one should add i_t ∗ C̃_t. These are the new candidate values, scaled by how much it was decided to update each state value.

LSTM Walk-through-3 (github)

– Finally, a decision should be made regarding the output. This output will be based on a filtered version of the cell state.

– First, one would need a sigmoid layer which decides what parts of the cell state to output.

– Then, one would need to put the cell state through tanh (to push the values to be between −1 and 1) and multiply it by the output of the sigmoid gate, so that only the selected parts are output.

LSTM Walk-through-4 (github)
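The four steps of the walk-through can be put together in a short sketch of a single LSTM time step. This is an illustration in plain Python rather than any particular library's implementation: the parameter names (`W_f`, `b_f`, and so on), the helper functions, and the all-zero toy weights are assumptions made for the example.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vector):
    """Compute weights @ vector + bias using plain Python lists."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step; `p` is a dict of hypothetical weights and biases."""
    z = h_prev + x_t  # list concatenation: [h_{t-1}, x_t]

    # Step 1: the forget gate decides what to throw away from the cell state.
    f_t = [sigmoid(a) for a in affine(p["W_f"], p["b_f"], z)]

    # Step 2: the input gate picks values to update; tanh proposes candidates.
    i_t = [sigmoid(a) for a in affine(p["W_i"], p["b_i"], z)]
    c_tilde = [math.tanh(a) for a in affine(p["W_c"], p["b_c"], z)]

    # Step 3: new cell state = kept part of old state + scaled candidates.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]

    # Step 4: the output gate filters a tanh-squashed copy of the cell state.
    o_t = [sigmoid(a) for a in affine(p["W_o"], p["b_o"], z)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Toy setup (made up for illustration): 2 hidden units, 1 input, and all
# weights and biases set to zero, so every gate outputs sigmoid(0) = 0.5
# and every candidate value is tanh(0) = 0.
hidden, inputs = 2, 1
zero_matrix = [[0.0] * (hidden + inputs) for _ in range(hidden)]
params = {name: zero_matrix for name in ("W_f", "W_i", "W_c", "W_o")}
params.update({name: [0.0] * hidden for name in ("b_f", "b_i", "b_c", "b_o")})

h_t, c_t = lstm_step([1.0], [0.0, 0.0], [1.0, -2.0], params)
print(h_t, c_t)
```

With these toy weights the forget gate keeps exactly half of the old cell state, so it is easy to check each step by hand. In a trained network the weights would be learned by backpropagation.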

Variants on Long Short Term Memory

The last section provided an overview of a generic LSTM, yet not all LSTMs are the same. One of the most popular variations of LSTM adds so-called “peephole connections,” which let the gate layers look at the cell state (Koenker et al, 2001).

Peephole Connections- A variant of LSTM (github)

As can be seen in the figure above, peepholes can be added to all the gates, although some implementations add them to only some of the gates.

Another variant uses coupled forget and input gates. Rather than separately determining what to forget and what to add, these decisions are made together. The cell forgets only when a new input is going to take the old value’s place, and it adds new values only when something older is forgotten (Koenker et al, 2001).

Forget and Input Gates- A variant of LSTM (github)
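The coupled variant can be sketched by tying the input gate to the forget gate, that is, using 1 − f_t wherever the generic LSTM would use i_t. The function name and the sample numbers below are made up for illustration.

```python
def coupled_cell_update(f_t, c_prev, c_tilde):
    """Cell update with the input gate tied to the forget gate (i_t = 1 - f_t)."""
    return [f * c + (1.0 - f) * g for f, c, g in zip(f_t, c_prev, c_tilde)]


# Hypothetical values: keep everything, forget everything, keep a quarter.
f_t = [1.0, 0.0, 0.25]
c_prev = [4.0, 4.0, 4.0]
c_tilde = [-1.0, -1.0, -1.0]
c_t = coupled_cell_update(f_t, c_prev, c_tilde)
print(c_t)  # [4.0, -1.0, 0.25]
```

Where the forget gate is fully open nothing new is written, and where it is fully closed the candidate value replaces the old one entirely.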

Another variation is the so-called GRU (Gated Recurrent Unit), which combines the forget and input gates into a single “update gate” and also merges the cell state and the hidden state (Koenker et al, 2001). This model has been growing increasingly popular.

Gated Recurrent Unit (GRU)- A variant of LSTM (github)
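For comparison with the LSTM step, here is a sketch of one GRU time step in plain Python. It follows a commonly used GRU formulation, which also includes a “reset gate” that the text above does not mention. The parameter names, helper functions, and all-zero toy weights are assumptions made for the example.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, vector):
    """Compute weights @ vector + bias using plain Python lists."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def gru_step(x_t, h_prev, p):
    """One GRU time step; `p` is a dict of hypothetical weights and biases."""
    z_in = h_prev + x_t  # list concatenation: [h_{t-1}, x_t]

    # Update gate: plays the role of the coupled forget and input gates.
    z_t = [sigmoid(a) for a in affine(p["W_z"], p["b_z"], z_in)]
    # Reset gate: decides how much of the old state feeds the candidate.
    r_t = [sigmoid(a) for a in affine(p["W_r"], p["b_r"], z_in)]

    reset_h = [r * h for r, h in zip(r_t, h_prev)]
    h_tilde = [math.tanh(a) for a in affine(p["W_h"], p["b_h"], reset_h + x_t)]

    # There is no separate cell state: the hidden state is the memory.
    return [(1.0 - z) * h + z * g for z, h, g in zip(z_t, h_prev, h_tilde)]


# Toy setup (made up for illustration): 2 hidden units, 1 input, zero weights,
# so both gates output 0.5 and the candidate state is 0.
hidden, inputs = 2, 1
zero_matrix = [[0.0] * (hidden + inputs) for _ in range(hidden)]
params = {name: zero_matrix for name in ("W_z", "W_r", "W_h")}
params.update({name: [0.0] * hidden for name in ("b_z", "b_r", "b_h")})

h_t = gru_step([1.0], [1.0, -2.0], params)
print(h_t)  # [0.5, -1.0]
```

The function returns only a hidden state, where the LSTM step returns both a hidden state and a cell state. This is the merging of the two states described above.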

These examples are only a few of the many LSTM variants. One question that you might want to ask is which variant is the best to choose, or whether these differences matter at all. Once you can answer this, you will realize that you have already become an expert in neural networks.



The Logic of Digital Memories was originally published in Becoming Human: Artificial Intelligence Magazine on Medium, where people are continuing the conversation by highlighting and responding to this story.

Via https://becominghuman.ai/the-logic-of-digital-memories-b6ec8feb7555?source=rss—-5e5bef33608a—4

source https://365datascience.weebly.com/the-best-data-science-blog-2020/the-logic-of-digital-memories

More Performance Evaluation Metrics for Classification Problems You Should Know

When building and optimizing your classification model, measuring how accurately it predicts your expected outcome is crucial. However, this metric alone is never the entire story, as it can still offer misleading results. That’s where these additional performance evaluations come into play to help tease out more meaning from your model.
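As a quick illustration of how accuracy alone can mislead, consider a made-up imbalanced dataset with 5 positive and 95 negative examples, and a hypothetical model that always predicts the negative class. The helper names below are assumptions for this sketch, and an undefined ratio is reported as 0.0 by convention.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def safe_ratio(numerator, denominator):
    """Divide, returning 0.0 when the ratio is undefined."""
    return numerator / denominator if denominator else 0.0


# Made-up data: 5 positives, 95 negatives, and a model that always says 0.
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = safe_ratio(tp + tn, len(y_true))
precision = safe_ratio(tp, tp + fp)
recall = safe_ratio(tp, tp + fn)
print(accuracy, precision, recall)
```

The model scores 95% accuracy while finding none of the positive cases, which is the kind of gap that precision and recall expose.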

Originally from KDnuggets https://ift.tt/3dRxZpx

source https://365datascience.weebly.com/the-best-data-science-blog-2020/more-performance-evaluation-metrics-for-classification-problems-you-should-know

Can You Become an SQL Developer?


Why is SQL developer a career worth exploring?

When going through some of the most popular job boards online, you can easily see that SQL continues to be among the highly sought-after skills for development, business intelligence, and data science job offerings.

In fact, starting out as an SQL developer can open the door to a successful long-term career in data science or BI. That’s why the topic of this article is ‘can you become an SQL developer’ and how to do it.

First, we’ll describe an SQL developer’s role in a company.

Then, we’ll focus on the technical and soft skills you need to be successful on the job. We’ll also discuss the education and working experience hiring companies are looking for.

To top things off, we’ll provide information regarding the expected salary for SQL developers in different parts of the world.


So, what does an SQL developer actually do?

In short, we can say that this position requires you to build, maintain, and manipulate database systems. And, very often, you’ll have to use the data stored in the databases you created to develop ad-hoc and recurring reports. To this end, you will need to write and test SQL code, as well as create stored procedures, functions, and views.
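To give a flavour of this kind of work, here is a small sketch that creates a table, defines a view, and runs a report query against it. It uses Python's built-in `sqlite3` module so it can run anywhere. The table, columns, and sample rows are made up for illustration. Note that SQLite does not support stored procedures, so this sketch only covers tables, views, and queries.

```python
import sqlite3

# In-memory database so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")

conn.executescript("""
    -- Hypothetical sales table.
    CREATE TABLE sales (
        id      INTEGER PRIMARY KEY,
        region  TEXT NOT NULL,
        sold_on TEXT NOT NULL,
        amount  REAL NOT NULL
    );

    INSERT INTO sales (region, sold_on, amount) VALUES
        ('North', '2020-03-01', 120.0),
        ('North', '2020-03-02', 80.0),
        ('South', '2020-03-01', 200.0);

    -- A view packages a recurring report as a reusable, named query.
    CREATE VIEW sales_by_region AS
        SELECT region, COUNT(*) AS orders, SUM(amount) AS revenue
        FROM sales
        GROUP BY region;
""")

# The recurring report is now a simple query against the view.
report = conn.execute(
    "SELECT region, orders, revenue FROM sales_by_region ORDER BY region"
).fetchall()
print(report)  # [('North', 2, 200.0), ('South', 1, 200.0)]

conn.close()
```

On MySQL, SQL Server, or PostgreSQL the same idea applies, with stored procedures and functions available on top of views.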

But to understand well how to organize their data, an SQL developer must communicate well with technical and non-technical Subject Matter Experts from the business.

For example, a sales SME would be best qualified to communicate the granularity of sales data to be collected, how the respective database should be maintained from a business perspective, and how frequently the sales team would like to receive a given type of report they have requested.


Nowadays, an SQL developer doesn’t work in isolation.

Companies work with different ERP systems, and the databases you maintain will sometimes have to be migrated. This can happen when the company introduces new ERP software or changes its existing one. In such a situation, you’ll need to export data from the multiple types of source DBs you operate at the moment, and then clean the data using an extract, transform, load (ETL) tool.
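The export-and-clean workflow can be sketched as three small functions. This is a toy illustration built on Python's `sqlite3` module, not a real ETL tool. It assumes a single source database, and the table layout, cleaning rules, and sample rows are made up.

```python
import sqlite3


def extract(source):
    """Read raw customer rows from the source database."""
    return source.execute(
        "SELECT name, email FROM customers ORDER BY rowid"
    ).fetchall()


def transform(rows):
    """Trim whitespace, lowercase emails, drop rows with no email, de-duplicate."""
    seen = set()
    cleaned = []
    for name, email in rows:
        if not email:
            continue
        email = email.strip().lower()
        if not email or email in seen:
            continue
        seen.add(email)
        cleaned.append((name.strip(), email))
    return cleaned


def load(target, rows):
    """Write the cleaned rows into the target database."""
    target.execute(
        "CREATE TABLE IF NOT EXISTS customers (name TEXT, email TEXT UNIQUE)"
    )
    target.executemany("INSERT INTO customers (name, email) VALUES (?, ?)", rows)
    target.commit()


# Hypothetical source data with typical problems: stray spaces, mixed case,
# a duplicate, and a missing email.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (name TEXT, email TEXT)")
source.executemany("INSERT INTO customers VALUES (?, ?)", [
    (" Ada ", "ADA@example.com"),
    ("Ada", "ada@example.com "),
    ("Bob", None),
    ("Cy", "cy@example.com"),
])

target = sqlite3.connect(":memory:")
load(target, transform(extract(source)))

migrated = target.execute(
    "SELECT name, email FROM customers ORDER BY email"
).fetchall()
print(migrated)  # [('Ada', 'ada@example.com'), ('Cy', 'cy@example.com')]
```

A production migration would add logging, error handling, and one extract step per source system, but the extract, transform, load structure stays the same.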

In our day and age, more and more companies migrate their data to the cloud and an SQL developer will certainly have to play an active role if this happens in their firm.


What technical skills does an SQL developer need on the job?

Of course, you need to be proficient in SQL. (We’re sure you didn’t see this one coming, right?)

Some of the most popular database management systems that allow you to work with versions of the Structured Query Language are MySQL, SQL Server, and PostgreSQL.

MySQL is the world’s most popular open-source relational database management system, while Microsoft’s SQL Server is typically preferred by corporations.

Quite importantly, Microsoft’s SQL Server comes with three essential types of services – SSIS, SSRS, and SSAS. These are some of the most frequently mentioned prerequisites in job ads.

SSIS (SQL Server Integration Services) is a framework we use for data migration and data integration. Helpfully, SSIS contains an ETL tool that can be used for automated database maintenance.

SSRS (SQL Server Reporting Services) helps you prepare and deliver reports.

SSAS (SQL Server Analysis Services) enables analytical processing and data mining.

These SQL server components were some of the most frequently mentioned and requested technical skills among all SQL developer job postings we analysed. We can say with certainty that employers see this as the basis needed for this role.

Some other in-demand technical skills we encountered are:

[Image: other in-demand technical skills for SQL developers]

In addition, employers very often list some ‘desired’ (but not compulsory) skills.

For SQL developer positions, such ‘nice to have’ abilities are:

[Image: desired technical skills for SQL developers]

What about soft skills? Are they important?

Well, employers look for SQL developers who are also good communicators. SQL developers work hand in hand with SMEs with various backgrounds within a company. Therefore, they need to be able to understand the other person’s point of view and reason together to design an optimal solution. Almost all job listings we researched mentioned ‘good interpersonal skills’ and ‘ability to communicate with people’.


And that leads us to our final point:

What formal qualifications do you need to apply as an SQL developer and how much is the starting salary you could expect in different parts of the world?

This is a suitable position for junior professionals. However, in most cases, you need some initial experience. Almost all job ads we analysed required 1 or 2 (and sometimes more) years of experience with SQL and relational database tools in a professional environment.

The other most frequently seen requirement was a Bachelor’s degree, preferably in a related field such as Computer Science, Engineering, Mathematics, Statistics, or Data Analysis.


Now, to determine how much SQL developers around the world make on average, we relied on Glassdoor data and found out the following:

On average, SQL developers in the United States receive $81,600 per year, whereas in Germany an SQL developer takes home $55,368.

If you work in this position in Canada, you can expect an amount in the range of $50,500, which is slightly higher than the $47,600 paid in the UK.

And the average SQL developer in India makes around $6,000 per year, according to Glassdoor.


We hope this article shed light on what an SQL developer does and how you can become one.

If you’re eager to sharpen your SQL skills, check out our super practical SQL tutorials, such as SQL Joins, Where to Use Where or Having Clause, and plenty more.

Ready to take the next step towards a data science career?

Check out the complete Data Science Program today. Start with the fundamentals with our Statistics, Maths, and Excel courses. Build up a step-by-step experience with SQL, Python, R, Tableau, and Power BI. And upgrade your skillset with Machine Learning; Deep Learning; Credit Risk Modeling; Time Series Analysis; and Customer Analytics in Python.

If you want to explore the curriculum or sign up for 12 hours of beginner to advanced video content for free, click on the button below.

 

The post Can You Become an SQL Developer? appeared first on 365 Data Science.

from 365 Data Science https://ift.tt/2UWSPva

Best Free Epidemiology Courses for Data Scientists

Are you interested in knowing more about epidemiology, the field which studies the spread and distribution of diseases? This article collects some free courses which are intended to help you do just that.

Originally from KDnuggets https://ift.tt/3aFYtIF

source https://365datascience.weebly.com/the-best-data-science-blog-2020/best-free-epidemiology-courses-for-data-scientists

Stop Hurting Your Pandas!

This post will address the issues that can arise when Pandas slicing is used improperly. If you see the warning that reads “A value is trying to be set on a copy of a slice from a DataFrame”, this post is for you.

Originally from KDnuggets https://ift.tt/2UEvDCV

source https://365datascience.weebly.com/the-best-data-science-blog-2020/stop-hurting-your-pandas

A list of the top data science articles and videos to look at first

How to Become an SQL Developer in 2020

How to Become an SQL Developer in 2020? We’d love to explain, so let’s discuss how to become an SQL developer in 2020.

In this video, we’ll describe an SQL developer’s role in a company. Then, we’ll focus on the technical and soft skills you need to be successful on the job. We’ll also discuss the education and working experience hiring companies are looking for. To top things off, we’ll provide information regarding the expected salary for SQL developers in different parts of the world.


So, what does an SQL developer actually do? In short, we can say that this position requires you to build, maintain, and manipulate database systems. And, very often, you’ll have to use the data stored in the databases you created to develop ad-hoc and recurring reports. To this end, you will need to write and test SQL code, as well as create stored procedures, functions, and views. Let’s discuss some of the technical skills an SQL developer needs on the job. Naturally, you need to be proficient in SQL. I’m sure you didn’t see this one coming.

Some of the most popular database management systems that allow you to work with versions of the Structured Query Language are MySQL, SQL Server, and PostgreSQL. What formal qualifications do you need to apply as an SQL developer? This is a suitable position for junior professionals. However, in most cases, you need some initial experience. Almost all job ads we analysed required 1 or 2 (and sometimes more) years of experience with SQL and relational database tools in a professional environment.
