Google’s Professional AI Certification & What I’ve Learned Since

Last week, I took Google’s TensorFlow certification exam, a grueling 5-hour endeavor in which the test taker must build highly accurate models for image classification, text classification, and time-series prediction.

I passed, but the exam was definitely a beast. It took subject matter knowledge accumulated incrementally over several years, along with a very structured study schedule and hands-on practice.

I wanted to quickly share what I’ve learned since then about TensorFlow and about neural networks in general.

Artificial Intelligence Is Probabilities

Artificial Intelligence (AI) is all about probabilities at the most basic level, and we can sum it up in one sentence:

Artificial Intelligence is about arriving at the probability distribution of the truth.

You don’t see this at the surface level because APIs like Keras abstract the higher-level math away, but it’s good to understand exactly what machine learning is doing under the hood so that we know when to put faith in what the model is telling us, and when not to.

So how are neural networks using probability distributions? Let’s take an example of using an AI model to predict heart disease.

Let’s say I have a training dataset of 1000 patients: 980 are labeled as not having heart disease and 20 are labeled as having it. As far as the model is concerned, this dataset represents the ground truth: all the information about the real world that is available for it to learn from.

Internally, a neural network shifts its weights around so that its predictions arrive at the distribution of the original training dataset. In other words, when you feed data into a neural network, what you’re saying to the network is this:

“Okay, neural network, I’m going to feed you some data where 980 of the inputs have class ‘0’ (‘do not have heart disease’) and 20 have class ‘1’ (‘have heart disease’). Please shift your network weights around when evaluating the input data so that your predictions typically come out to around 20 of class ‘1’ for every 980 of class ‘0’.”

The same idea applies to regression: “Okay, neural network, given all these X’s, give me the function for the line that best fits the data distribution so I can predict future Y’s.”

That’s all that’s happening under the hood. The neural network is converging on a probability distribution for its predictions that most closely matches the probability distribution of the ground truth you fed into it.
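To make this concrete, here is a minimal sketch in plain NumPy rather than TensorFlow. It assumes the simplest possible "network": a single bias weight with a sigmoid output and no input features, trained by gradient descent on cross-entropy loss using the 980/20 labels from the heart disease example. The names (train_bias_only, learning_rate, steps) are made up for illustration and are not from any library.

```python
import numpy as np

def sigmoid(z):
    """Squash a real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

def train_bias_only(labels, learning_rate=0.5, steps=5000):
    """Fit a single bias weight by gradient descent on mean cross-entropy loss.

    Hypothetical helper for illustration: with no input features, the only
    thing the model can learn is the overall rate of class 1.
    """
    bias = 0.0  # start at a predicted probability of 0.5
    target_rate = labels.mean()
    for _ in range(steps):
        predicted = sigmoid(bias)
        # Gradient of mean cross-entropy with respect to the bias
        gradient = predicted - target_rate
        bias -= learning_rate * gradient
    return sigmoid(bias)

# 980 patients labeled 0 (no heart disease), 20 labeled 1 (heart disease)
labels = np.array([0] * 980 + [1] * 20, dtype=float)
learned_probability = train_bias_only(labels)

print(f"base rate in the data: {labels.mean():.4f}")
print(f"learned probability:   {learned_probability:.4f}")
```

With nothing else to go on, the weight settles where the predicted probability of class 1 equals the 2% base rate in the training labels. A real network with input features does the same thing in a finer-grained way, adjusting that probability up or down per patient.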

This is important to understand because if you feed a machine learning model a dataset in which the distribution of classes does not approximate the true distribution of those classes in nature, the model will be pretty “untruthful” (useless) when you try to predict on new data.
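A rough way to see how much a mismatched class distribution can distort predictions is to apply Bayes’ rule to a single prediction. The sketch below assumes a hypothetical model trained on an artificially balanced dataset (50% heart disease), while the true rate is the 2% from the example above. It also assumes that only the class proportions differ, not how patients within each class look. The function name adjust_for_prior is invented for this illustration.

```python
def adjust_for_prior(p_model, train_prior, true_prior):
    """Rescale a predicted probability from the training class rate to the true one.

    Assumes only the class proportions differ between training data and the
    real world (the features within each class look the same).
    """
    model_odds = p_model / (1.0 - p_model)
    train_odds = train_prior / (1.0 - train_prior)
    true_odds = true_prior / (1.0 - true_prior)
    # Swap the prior the model learned for the prior that holds in reality
    adjusted_odds = model_odds * (true_odds / train_odds)
    return adjusted_odds / (1.0 + adjusted_odds)

# Hypothetical scenario: the model says "90% heart disease" after training on
# a 50/50 dataset, but only 2% of real patients have the condition.
raw_prediction = 0.9
corrected = adjust_for_prior(raw_prediction, train_prior=0.5, true_prior=0.02)

print(f"model output:            {raw_prediction:.3f}")
print(f"adjusted for true prior: {corrected:.3f}")
```

Under these assumptions, a confident-looking 90% drops to roughly 16% once the real-world rate is taken into account. The model was not wrong about its training data; the training data was wrong about the world.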

The ultimate challenge of data science and artificial intelligence is that it’s not just about building efficient models; it’s actually more about what you’re feeding into them.

This is analogous to diet and exercise for people: if you’re training hard in the gym, but eating fattening and unhealthy foods, it doesn’t matter how great your training program is, you’re ultimately not going to get the results you’re looking for.

The hardest part about machine learning is understanding whether the distribution of your data matches that which occurs naturally in the physical world.