
TensorFlow 2 Tutorial: Get Started in Deep Learning With tf.keras

Last Updated on August 27, 2020

Predictive modeling with deep learning is a skill that modern developers need to know.

TensorFlow is the premier open-source deep learning framework developed and maintained by Google. Although using TensorFlow directly can be challenging, the modern tf.keras API brings the simplicity and ease of use of Keras to the TensorFlow project.

Using tf.keras allows you to design, fit, evaluate, and use deep learning models to make predictions in just a few lines of code. It makes common deep learning tasks, such as classification and regression predictive modeling, accessible to average developers looking to get things done.

In this tutorial, you will discover a step-by-step guide to developing deep learning models in TensorFlow using the tf.keras API.

After completing this tutorial, you will know:

  • The difference between Keras and tf.keras and how to install and confirm TensorFlow is working.
  • The 5-step life-cycle of tf.keras models and how to use the sequential and functional APIs.
  • How to develop MLP, CNN, and RNN models with tf.keras for regression, classification, and time series forecasting.
  • How to use the advanced features of the tf.keras API to inspect and diagnose your model.
  • How to improve the performance of your tf.keras model by reducing overfitting and accelerating training.

This is a large tutorial, and a lot of fun. You might want to bookmark it.

The examples are small and focused; you can finish this tutorial in about 60 minutes.

Kick-start your project with my new book Deep Learning With Python, including step-by-step tutorials and the Python source code files for all examples.

Let's get started.

  • Update Jun/2020: Updated for changes to the API in TensorFlow 2.2.0.

How to Develop Deep Learning Models With tf.keras

Photo by Stephen Harlan, some rights reserved.

TensorFlow Tutorial Overview

This tutorial is designed to be your complete introduction to tf.keras for your deep learning project.

The focus is on using the API for common deep learning model development tasks; we will not be diving into the math and theory of deep learning. For that, I recommend starting with this excellent book.

The best way to learn deep learning in Python is by doing. Dive in. You can circle back for more theory later.

I have designed each code example to use best practices and to be standalone so that you can copy and paste it directly into your project and adapt it to your specific needs. This will give you a massive head start over trying to figure out the API from official documentation alone.

It is a large tutorial and, as such, it is divided into five parts; they are:

  1. Install TensorFlow and tf.keras
    1. What Are Keras and tf.keras?
    2. How to Install TensorFlow
    3. How to Confirm TensorFlow Is Installed
  2. Deep Learning Model Life-Cycle
    1. The 5-Step Model Life-Cycle
    2. Sequential Model API (Simple)
    3. Functional Model API (Advanced)
  3. How to Develop Deep Learning Models
    1. Develop Multilayer Perceptron Models
    2. Develop Convolutional Neural Network Models
    3. Develop Recurrent Neural Network Models
  4. How to Use Advanced Model Features
    1. How to Visualize a Deep Learning Model
    2. How to Plot Model Learning Curves
    3. How to Save and Load Your Model
  5. How to Get Better Model Performance
    1. How to Reduce Overfitting With Dropout
    2. How to Accelerate Training With Batch Normalization
    3. How to Halt Training at the Right Time With Early Stopping

You Can Do Deep Learning in Python!

Work through the tutorial at your own pace.

You do not need to understand everything (at least not right now). Your goal is to run through the tutorial end-to-end and get results. You do not need to understand everything on the first pass. List down your questions as you go. Make heavy use of the API documentation to learn about all of the functions that you're using.

You do not need to know the math first. Math is a compact way of describing how algorithms work, specifically tools from linear algebra, probability, and statistics. These are not the only tools that you can use to learn how algorithms work. You can also use code and explore algorithm behavior with different inputs and outputs. Knowing the math will not tell you what algorithm to choose or how to best configure it. You can only discover that through careful, controlled experiments.

You do not need to know how the algorithms work. It is important to know about the limitations and how to configure deep learning algorithms. But learning about algorithms can come later. You need to build up this algorithm knowledge slowly over a long period of time. Today, start by getting comfortable with the platform.

You do not need to be a Python programmer. The syntax of the Python language can be intuitive if you are new to it. Just like other languages, focus on function calls (e.g. function()) and assignments (e.g. a = "b"). This will get you most of the way. You are a developer, so you know how to pick up the basics of a language really fast. Just get started and dive into the details later.

You do not need to be a deep learning expert. You can learn about the benefits and limitations of various algorithms later, and there are plenty of posts that you can read later to brush up on the steps of a deep learning project and the importance of evaluating model skill using cross-validation.

1. Install TensorFlow and tf.keras

In this section, you will discover what tf.keras is, how to install it, and how to confirm that it is installed correctly.

1.1 What Are Keras and tf.keras?

Keras is an open-source deep learning library written in Python.

The project was started in 2015 by Francois Chollet. It quickly became a popular framework for developers, becoming one of, if not the most, popular deep learning libraries.

During the period of 2015-2019, developing deep learning models using mathematical libraries like TensorFlow, Theano, and PyTorch was cumbersome, requiring tens or even hundreds of lines of code to achieve the simplest tasks. The focus of these libraries was on research, flexibility, and speed, not ease of use.

Keras was popular because the API was clean and simple, allowing standard deep learning models to be defined, fit, and evaluated in just a few lines of code.

A secondary reason Keras took off was that it allowed you to use any one of a range of popular deep learning mathematical libraries as the backend (e.g. used to perform the computation), such as TensorFlow, Theano, and later, CNTK. This allowed the power of these libraries to be harnessed (e.g. GPUs) with a very clean and simple interface.

In 2019, Google released a new version of their TensorFlow deep learning library (TensorFlow 2) that integrated the Keras API directly and promoted this interface as the default or standard interface for deep learning development on the platform.

This integration is commonly referred to as the tf.keras interface or API ("tf" is short for "TensorFlow"). This is to distinguish it from the so-called standalone Keras open-source project.

  • Standalone Keras. The standalone open-source project that supports TensorFlow, Theano and CNTK backends.
  • tf.keras. The Keras API integrated into TensorFlow 2.

The Keras API implementation in TensorFlow is referred to as "tf.keras" because this is the Python idiom used when referencing the API. First, the TensorFlow module is imported and named "tf"; then, Keras API elements are accessed via calls to tf.keras; for example:
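
A minimal sketch of the idiom:

```python
# import the tensorflow module and access Keras elements via tf.keras
import tensorflow as tf

model = tf.keras.Sequential()
```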

I generally don't use this idiom myself; I don't think it reads cleanly.

Given that TensorFlow was the de facto standard backend for the Keras open-source project, the integration means that a single library can now be used instead of two separate libraries. Further, the standalone Keras project now recommends all future Keras development use the tf.keras API.

At this time, we recommend that Keras users who use multi-backend Keras with the TensorFlow backend switch to tf.keras in TensorFlow 2.0. tf.keras is better maintained and has better integration with TensorFlow features (eager execution, distribution support and other).

— Keras Project Homepage, Accessed December 2019.

1.2 How to Install TensorFlow

Before installing TensorFlow, ensure that you have Python installed, such as Python 3.6 or higher.

If you don't have Python installed, you can install it using Anaconda. This tutorial will show you how:

  • How to Setup Your Python Environment for Machine Learning With Anaconda

There are many ways to install the TensorFlow open-source deep learning library.

The most common, and possibly the simplest, way to install TensorFlow on your workstation is by using pip.

For example, on the command line, you can type:
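
The exact command may vary with your Python setup (e.g. pip3 or a virtual environment), but a typical invocation is:

```
pip install tensorflow
```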

If you prefer to use an installation method more specific to your platform or package manager, you can see a complete list of installation instructions here:

  • Install TensorFlow 2 Guide

There is no need to set up the GPU now.

All examples in this tutorial will work just fine on a modern CPU. If you want to configure TensorFlow for your GPU, you can do that after completing this tutorial. Don't get distracted!

1.3 How to Confirm TensorFlow Is Installed

Once TensorFlow is installed, it is important to confirm that the library was installed successfully and that you can start using it.

Don't skip this step.

If TensorFlow is not installed correctly or raises an error on this step, you won't be able to run the examples later.

Create a new file called versions.py and copy and paste the following code into the file.
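
A minimal version check:

```python
# check the version of tensorflow
import tensorflow
print(tensorflow.__version__)
```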

Save the file, then open your command line and change directory to where you saved the file.

Then type:
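
```
python versions.py
```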

You should then see output like the following:
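
For example, with the version this tutorial was last updated against:

```
2.2.0
```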

This confirms that TensorFlow is installed correctly and that we are all using the same version.

What version did you get?
Post your output in the comments below.

This also shows you how to run a Python script from the command line. I recommend running all code from the command line in this way, and not from a notebook or an IDE.

If You Get Warning Messages

Sometimes when you use the tf.keras API, you may see warnings printed.

This might include messages that your hardware supports features that your TensorFlow installation was not configured to use.

Some examples on my workstation include:
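
The exact text varies by version and hardware, but a typical message looks something like this:

```
Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
```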

They are not your fault. You did nothing wrong.

These are information messages, and they will not prevent the execution of your code. You can safely ignore messages of this type for now.

It's an intentional design decision made by the TensorFlow team to show these warning messages. A downside of this decision is that it confuses beginners and it trains developers to ignore all messages, including those that potentially may impact the execution.

Now that you know what tf.keras is, how to install TensorFlow, and how to confirm your development environment is working, let's look at the life-cycle of deep learning models in TensorFlow.

2. Deep Learning Model Life-Cycle

In this section, you will discover the life-cycle for a deep learning model and the two tf.keras APIs that you can use to define models.

2.1 The 5-Step Model Life-Cycle

A model has a life-cycle, and this very simple knowledge provides the backbone for both modeling a dataset and understanding the tf.keras API.

The 5 steps in the life-cycle are as follows:

  1. Define the model.
  2. Compile the model.
  3. Fit the model.
  4. Evaluate the model.
  5. Make predictions.

Let's take a closer look at each step in turn.

Define the Model

Defining the model requires that you first select the type of model that you need and then choose the architecture or network topology.

From an API perspective, this involves defining the layers of the model, configuring each layer with a number of nodes and an activation function, and connecting the layers together into a cohesive model.

Models can be defined either with the Sequential API or the Functional API, and we will take a look at this in the next section.

Compile the Model

Compiling the model requires that you first select a loss function that you want to optimize, such as mean squared error or cross-entropy.

It also requires that you select an algorithm to perform the optimization procedure, typically stochastic gradient descent, or a modern variation, such as Adam. It may also require that you select any performance metrics to keep track of during the model training process.

From an API perspective, this involves calling a function to compile the model with the chosen configuration, which will prepare the appropriate data structures required for the efficient use of the model you have defined.

The optimizer can be specified as a string for a known optimizer class, e.g. 'sgd' for stochastic gradient descent, or you can configure an instance of an optimizer class and use that.
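
For example, a minimal sketch (the layer sizes and hyperparameter values here are illustrative):

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD

# define a trivial model for demonstration
model = Sequential()
model.add(Dense(1, input_shape=(8,)))

# compile with a string identifier for the optimizer...
model.compile(optimizer='sgd', loss='mse')

# ...or configure an optimizer instance and pass that instead
opt = SGD(learning_rate=0.01, momentum=0.9)
model.compile(optimizer=opt, loss='mse')
```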

For a list of supported optimizers, see this:

  • tf.keras Optimizers

The 3 most common loss functions are:

  • 'binary_crossentropy' for binary classification.
  • 'sparse_categorical_crossentropy' for multi-form classification.
  • 'mse' (mean squared error) for regression.

For a list of supported loss functions, see:

  • tf.keras Loss Functions

Metrics are defined as a list of strings for known metric functions or a list of functions to call to evaluate predictions.
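
For example, to track classification accuracy during training (reusing the model sketch from above):

```python
# compile the model and track accuracy as a metric
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])
```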

For a list of supported metrics, see:

  • tf.keras Metrics

Fit the Model

Fitting the model requires that you first select the training configuration, such as the number of epochs (loops through the training dataset) and the batch size (number of samples in an epoch used to estimate model error).

Training applies the chosen optimization algorithm to minimize the chosen loss function and updates the model using the backpropagation of error algorithm.

Fitting the model is the slow part of the whole process and can take seconds to hours to days, depending on the complexity of the model, the hardware you're using, and the size of the training dataset.

From an API perspective, this involves calling a function to perform the training process. This function will block (not return) until the training process has finished.

For help on how to choose the batch size, see this tutorial:

  • How to Control the Stability of Training Neural Networks With the Batch Size

While fitting the model, a progress bar will summarize the status of each epoch and the overall training process. This can be simplified to a simple report of model performance each epoch by setting the "verbose" argument to 2. All output can be turned off during training by setting "verbose" to 0.
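
A minimal sketch of a fit call, reusing the compiled model from above with dummy data purely for illustration:

```python
from numpy.random import rand, randint

# dummy inputs and binary targets, for illustration only
X = rand(100, 8)
y = randint(0, 2, (100, 1))

# fit the model, printing one summary line per epoch
model.fit(X, y, epochs=10, batch_size=32, verbose=2)
```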

Evaluate the Model

Evaluating the model requires that you first choose a holdout dataset used to evaluate the model. This should be data not used in the training process so that we can get an unbiased estimate of the performance of the model when making predictions on new data.

The speed of model evaluation is proportional to the amount of data you want to use for the evaluation, although it is much faster than training as the model is not changed.

From an API perspective, this involves calling a function with the holdout dataset and getting a loss and perhaps other metrics that can be reported.
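
For example (X_test and y_test are assumed holdout arrays; the call returns the loss plus one value per metric configured in compile()):

```python
# evaluate the model on the holdout dataset
loss, acc = model.evaluate(X_test, y_test, verbose=0)
```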

Make a Prediction

Making a prediction is the final step in the life-cycle. It is why we wanted the model in the first place.

It requires you have new data for which a prediction is required, e.g. where you do not have the target values.

From an API perspective, you simply call a function to make a prediction of a class label, probability, or numerical value: whatever you designed your model to predict.
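
For example (X_new is an assumed array of new input samples):

```python
# make predictions on new data
yhat = model.predict(X_new)
```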

You may want to save the model and later load it to make predictions. You may also choose to fit a model on all of the available data before you start using it.

Now that we are familiar with the model life-cycle, let's take a look at the two main ways to use the tf.keras API to build models: sequential and functional.

2.2 Sequential Model API (Simple)

The sequential model API is the simplest and is the API that I recommend, especially when getting started.

It is referred to as "sequential" because it involves defining a Sequential class and adding layers to the model one by one in a linear manner, from input to output.

The example below defines a Sequential MLP model that accepts eight inputs, has one hidden layer with 10 nodes, and then an output layer with one node to predict a numerical value.
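
A minimal sketch:

```python
# example of a model defined with the sequential api
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# define the model
model = Sequential()
model.add(Dense(10, input_shape=(8,)))
model.add(Dense(1))
```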

Note that the visible layer of the network is defined by the "input_shape" argument on the first hidden layer. That means in the above example, the model expects the input for one sample to be a vector of eight numbers.

The sequential API is easy to use because you keep calling model.add() until you have added all of your layers.

For example, here is a deep MLP with five hidden layers.
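
A sketch (the layer widths are illustrative):

```python
# example of a deeper mlp defined with the sequential api
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# define the model
model = Sequential()
model.add(Dense(100, input_shape=(8,)))
model.add(Dense(80))
model.add(Dense(30))
model.add(Dense(10))
model.add(Dense(5))
model.add(Dense(1))
```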

2.3 Functional Model API (Advanced)

The functional API is more complex but is also more flexible.

It involves explicitly connecting the output of one layer to the input of another layer. Each connection is specified.

First, an input layer must be defined via the Input class, and the shape of an input sample is specified. We must retain a reference to the input layer when defining the model.

Next, a fully connected layer can be connected to the input by calling the layer and passing the input layer. This will return a reference to the output connection in this new layer.

We can then connect this to an output layer in the same manner.

Once connected, we define a Model object and specify the input and output layers. The complete example is listed below.
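
A minimal sketch:

```python
# example of a model defined with the functional api
from tensorflow.keras import Model, Input
from tensorflow.keras.layers import Dense

# define an input layer and connect the layers
x_in = Input(shape=(8,))
x = Dense(10)(x_in)
x_out = Dense(1)(x)

# define the model by specifying its input and output layers
model = Model(inputs=x_in, outputs=x_out)
```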

As such, it allows for more complicated model designs, such as models that may have multiple input paths (separate vectors) and models that have multiple output paths (e.g. a word and a number).

The functional API can be a lot of fun when you get used to it.

For more on the functional API, see:

  • The Keras functional API in TensorFlow

Now that we are familiar with the model life-cycle and the two APIs that can be used to define models, let's look at developing some standard models.

3. How to Develop Deep Learning Models

In this section, you will discover how to develop, evaluate, and make predictions with standard deep learning models, including Multilayer Perceptrons (MLPs), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs).

3.1 Develop Multilayer Perceptron Models

A Multilayer Perceptron model, or MLP for short, is a standard fully connected neural network model.

It is comprised of layers of nodes where each node is connected to all outputs from the previous layer and the output of each node is connected to all inputs for nodes in the next layer.

An MLP is created with one or more Dense layers. This model is appropriate for tabular data, that is, data as it looks in a table or spreadsheet with one column for each variable and one row for each sample. There are three predictive modeling problems you may want to explore with an MLP; they are binary classification, multiclass classification, and regression.

Let's fit a model on a real dataset for each of these cases.

Note: the models in this section are effective, but not optimized. See if you can improve their performance. Post your findings in the comments below.

MLP for Binary Classification

We will use the Ionosphere binary (two-class) classification dataset to demonstrate an MLP for binary classification.

This dataset involves predicting whether a structure is in the atmosphere or not given radar returns.

The dataset will be downloaded automatically using Pandas, but you can learn more about it here.

  • Ionosphere Dataset (csv).
  • Ionosphere Dataset Description (csv).

We will use a LabelEncoder to encode the string labels to integer values 0 and 1. The model will be fit on 67 percent of the data, and the remaining 33 percent will be used for evaluation, split using the train_test_split() function.

It is a good practice to use 'relu' activation with a 'he_normal' weight initialization. This combination goes a long way to overcome the problem of vanishing gradients when training deep neural network models. For more on ReLU, see the tutorial:

  • A Gentle Introduction to the Rectified Linear Unit (ReLU)

The model predicts the probability of class 1 and uses the sigmoid activation function. The model is optimized using the adam version of stochastic gradient descent and seeks to minimize the cross-entropy loss.

The complete example is listed below.
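
A sketch of the complete example (the dataset URL points to a commonly used CSV mirror of the UCI Ionosphere data, and the prediction row is an illustrative row of input values; treat both as assumptions):

```python
# mlp for binary classification
from pandas import read_csv
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# load the dataset
path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/ionosphere.csv'
df = read_csv(path, header=None)
# split into input and output columns
X, y = df.values[:, :-1], df.values[:, -1]
# ensure all data are floating point values
X = X.astype('float32')
# encode string labels to integers
y = LabelEncoder().fit_transform(y)
# split into train and test datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
# determine the number of input features
n_features = X_train.shape[1]
# define the model
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dense(8, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(1, activation='sigmoid'))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# fit the model
model.fit(X_train, y_train, epochs=150, batch_size=32, verbose=0)
# evaluate the model
loss, acc = model.evaluate(X_test, y_test, verbose=0)
print('Test Accuracy: %.3f' % acc)
# make a prediction for one row of data
row = [1,0,0.99539,-0.05889,0.85243,0.02306,0.83398,-0.37708,1,0.03760,0.85243,-0.17755,0.59755,-0.44945,0.60536,-0.38223,0.84356,-0.38542,0.58212,-0.32192,0.56971,-0.29674,0.36946,-0.47357,0.56811,-0.51171,0.41078,-0.46168,0.21266,-0.34090,0.42267,-0.54487,0.18641,-0.45300]
yhat = model.predict([row])
print('Predicted: %.3f' % yhat)
```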

Running the example first reports the shape of the dataset, then fits the model and evaluates it on the test dataset. Finally, a prediction is made for a single row of data.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

What results did you get? Can you change the model to do better?
Post your findings to the comments below.

In this case, we can see that the model achieved a classification accuracy of about 94 percent and then predicted a probability of 0.9 that the one row of data belongs to class 1.

MLP for Multiclass Classification

We will use the Iris flowers multiclass classification dataset to demonstrate an MLP for multiclass classification.

This problem involves predicting the species of iris flower given measures of the flower.

The dataset will be downloaded automatically using Pandas, but you can learn more about it here.

  • Iris Dataset (csv).
  • Iris Dataset Description (csv).

Given that it is a multiclass classification, the model must have one node for each class in the output layer and use the softmax activation function. The loss function is the 'sparse_categorical_crossentropy', which is appropriate for integer encoded class labels (e.g. 0 for one class, 1 for the next class, etc.)

The complete example of fitting and evaluating an MLP on the iris flowers dataset is listed below.
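
A sketch of the complete example (the dataset URL is an assumed mirror of the classic iris CSV):

```python
# mlp for multiclass classification
from numpy import argmax
from pandas import read_csv
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# load the dataset
path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/iris.csv'
df = read_csv(path, header=None)
# split into input and output columns
X, y = df.values[:, :-1], df.values[:, -1]
X = X.astype('float32')
# encode string class labels to integers
y = LabelEncoder().fit_transform(y)
# split into train and test datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
n_features = X_train.shape[1]
# define the model
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dense(8, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(3, activation='softmax'))
# compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
# fit the model
model.fit(X_train, y_train, epochs=150, batch_size=32, verbose=0)
# evaluate the model
loss, acc = model.evaluate(X_test, y_test, verbose=0)
print('Test Accuracy: %.3f' % acc)
# make a prediction for one row of data
row = [5.1, 3.5, 1.4, 0.2]
yhat = model.predict([row])
print('Predicted: %s (class=%d)' % (yhat, argmax(yhat)))
```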

Running the example first reports the shape of the dataset, then fits the model and evaluates it on the test dataset. Finally, a prediction is made for a single row of data.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

What results did you get? Can you change the model to do better?
Post your findings to the comments below.

In this case, we can see that the model achieved a classification accuracy of about 98 percent and then predicted a probability of a row of data belonging to each class, although class 0 has the highest probability.

MLP for Regression

We will use the Boston housing regression dataset to demonstrate an MLP for regression predictive modeling.

This problem involves predicting house value based on properties of the house and neighborhood.

The dataset will be downloaded automatically using Pandas, but you can learn more about it here.

  • Boston Housing Dataset (csv).
  • Boston Housing Dataset Description (csv).

This is a regression problem that involves predicting a single numerical value. As such, the output layer has a single node and uses the default or linear activation function (no activation function). The mean squared error (mse) loss is minimized when fitting the model.

Remember that this is a regression, not classification; therefore, we cannot calculate classification accuracy. For more on this, see the tutorial:

  • Difference Between Classification and Regression in Machine Learning

The complete example of fitting and evaluating an MLP on the Boston housing dataset is listed below.
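
A sketch of the complete example (the dataset URL is an assumed mirror of the Boston housing CSV, and the prediction row is an illustrative row of input values):

```python
# mlp for regression
from numpy import sqrt
from pandas import read_csv
from sklearn.model_selection import train_test_split
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# load the dataset
path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.csv'
df = read_csv(path, header=None)
# split into input and output columns
X, y = df.values[:, :-1], df.values[:, -1]
# split into train and test datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
n_features = X_train.shape[1]
# define the model
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dense(8, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(1))
# compile the model
model.compile(optimizer='adam', loss='mse')
# fit the model
model.fit(X_train, y_train, epochs=150, batch_size=32, verbose=0)
# evaluate the model
error = model.evaluate(X_test, y_test, verbose=0)
print('MSE: %.3f, RMSE: %.3f' % (error, sqrt(error)))
# make a prediction for one row of data
row = [0.00632,18.00,2.310,0,0.5380,6.5750,65.20,4.0900,1,296.0,15.30,396.90,4.98]
yhat = model.predict([row])
print('Predicted: %.3f' % yhat)
```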

Running the example first reports the shape of the dataset, then fits the model and evaluates it on the test dataset. Finally, a prediction is made for a single row of data.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

What results did you get? Can you change the model to do better?
Post your findings to the comments below.

In this case, we can see that the model achieved an MSE of about 60, which is an RMSE of about 7 (units are thousands of dollars). A value of about 26 is then predicted for the single example.

3.2 Develop Convolutional Neural Network Models

Convolutional Neural Networks, or CNNs for short, are a type of network designed for image input.

They are comprised of models with convolutional layers that extract features (called feature maps) and pooling layers that distill features down to the most salient elements.

CNNs are best suited to image classification tasks, although they can be used on a wide array of tasks that take images as input.

A popular image classification task is the MNIST handwritten digit classification. It involves tens of thousands of handwritten digits that must be classified as a number between 0 and 9.

The tf.keras API provides a convenience function to download and load this dataset directly.

The example below loads the dataset and plots the first few images.
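
A minimal sketch:

```python
# example of loading and plotting the mnist dataset
from tensorflow.keras.datasets.mnist import load_data
from matplotlib import pyplot

# load the dataset
(trainX, trainy), (testX, testy) = load_data()
# summarize the shape of the dataset
print('Train', trainX.shape, trainy.shape)
print('Test', testX.shape, testy.shape)
# plot the first 25 images
for i in range(25):
    pyplot.subplot(5, 5, i + 1)
    pyplot.imshow(trainX[i], cmap=pyplot.get_cmap('gray'))
pyplot.show()
```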

Running the example loads the MNIST dataset, then summarizes the default train and test datasets.

A plot is then created showing a grid of examples of handwritten images in the training dataset.

Plot of Handwritten Digits From the MNIST dataset

We can train a CNN model to classify the images in the MNIST dataset.

Note that the images are arrays of grayscale pixel data; therefore, we must add a channel dimension to the data before we can use the images as input to the model. The reason is that CNN models expect images in a channels-last format, that is, each example to the network has the dimensions [rows, columns, channels], where channels represent the color channels of the image data.

It is also a good idea to scale the pixel values from the default range of 0-255 to 0-1 when training a CNN. For more on scaling pixel values, see the tutorial:

  • How to Manually Scale Image Pixel Data for Deep Learning

The complete example of fitting and evaluating a CNN model on the MNIST dataset is listed below.
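
A sketch of the complete example (the architecture and hyperparameters are illustrative choices):

```python
# cnn for image classification on mnist
from numpy import asarray, unique, argmax
from tensorflow.keras.datasets.mnist import load_data
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPool2D, Flatten, Dropout

# load the dataset
(x_train, y_train), (x_test, y_test) = load_data()
# reshape the data to have a single color channel
x_train = x_train.reshape((x_train.shape[0], x_train.shape[1], x_train.shape[2], 1))
x_test = x_test.reshape((x_test.shape[0], x_test.shape[1], x_test.shape[2], 1))
# determine the shape of the input images and the number of classes
in_shape = x_train.shape[1:]
n_classes = len(unique(y_train))
print(in_shape, n_classes)
# normalize pixel values to the range 0-1
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
# define the model
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_normal', input_shape=in_shape))
model.add(MaxPool2D((2, 2)))
model.add(Flatten())
model.add(Dense(100, activation='relu', kernel_initializer='he_normal'))
model.add(Dropout(0.5))
model.add(Dense(n_classes, activation='softmax'))
# compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
# fit the model
model.fit(x_train, y_train, epochs=10, batch_size=128, verbose=0)
# evaluate the model
loss, acc = model.evaluate(x_test, y_test, verbose=0)
print('Accuracy: %.3f' % acc)
# make a prediction for one image
image = x_train[0]
yhat = model.predict(asarray([image]))
print('Predicted: class=%d' % argmax(yhat))
```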

Running the example first reports the shape of the dataset, then fits the model and evaluates it on the test dataset. Finally, a prediction is made for a single image.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

What results did you get? Can you change the model to do better?
Post your findings to the comments below.

First, the shape of each image is reported along with the number of classes; we can see that each image is 28×28 pixels and there are 10 classes as we expected.

In this case, we can see that the model achieved a classification accuracy of about 98 percent on the test dataset. We can then see that the model predicted class 5 for the first image in the training set.

3.3 Develop Recurrent Neural Network Models

Recurrent Neural Networks, or RNNs for short, are designed to operate upon sequences of data.

They have proven to be very effective for natural language processing problems where sequences of text are provided as input to the model. RNNs have also seen some modest success for time series forecasting and speech recognition.

The most popular type of RNN is the Long Short-Term Memory network, or LSTM for short. LSTMs can be used in a model to accept a sequence of input data and make a prediction, such as assigning a class label or predicting a numerical value like the next value or values in the sequence.

We will use the car sales dataset to demonstrate an LSTM RNN for univariate time series forecasting.

This problem involves predicting the number of car sales per month.

The dataset will be downloaded automatically using Pandas, but you can learn more about it here.

  • Car Sales Dataset (csv).
  • Car Sales Dataset Description (csv).

We will frame the problem to take a window of the last five months of data to predict the current month's data.

To achieve this, we will define a new function named split_sequence() that will split the input sequence into windows of data appropriate for fitting a supervised learning model, like an LSTM.

For example, if the sequence was:

    1, 2, 3, 4, 5, 6, 7, 8, 9, 10

Then the samples for training the model will look like:

    Input               Output
    1, 2, 3, 4, 5       6
    2, 3, 4, 5, 6       7
    3, 4, 5, 6, 7       8
    ...

We will use the last 12 months of data as the test dataset.

LSTMs expect each sample in the dataset to have two dimensions; the first is the number of time steps (in this case it is 5), and the second is the number of observations per time step (in this case it is 1).

Because it is a regression type problem, we will use a linear activation function (no activation function) in the output layer and optimize the mean squared error loss function. We will also evaluate the model using the mean absolute error (MAE) metric.

The complete example of fitting and evaluating an LSTM for a univariate time series forecasting problem is listed below.
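
A sketch of the complete example (the dataset URL is an assumed mirror of the monthly car sales CSV, and the architecture and training settings are illustrative):

```python
# lstm for univariate time series forecasting
from numpy import asarray, sqrt
from pandas import read_csv
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, LSTM

# split a univariate sequence into supervised learning windows
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this window
        end_ix = i + n_steps
        # stop if the window runs past the end of the sequence
        if end_ix > len(sequence) - 1:
            break
        # input is the window, output is the next value
        X.append(sequence[i:end_ix])
        y.append(sequence[end_ix])
    return asarray(X), asarray(y)

# load the dataset
path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/monthly-car-sales.csv'
df = read_csv(path, header=0, index_col=0)
values = df.values.astype('float32').flatten()
# split into windows of 5 time steps
n_steps = 5
X, y = split_sequence(values, n_steps)
# reshape into [samples, timesteps, features]
X = X.reshape((X.shape[0], X.shape[1], 1))
# hold back the last 12 months for evaluation
n_test = 12
X_train, X_test, y_train, y_test = X[:-n_test], X[-n_test:], y[:-n_test], y[-n_test:]
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
# define the model
model = Sequential()
model.add(LSTM(100, activation='relu', kernel_initializer='he_normal', input_shape=(n_steps, 1)))
model.add(Dense(50, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(50, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(1))
# compile the model
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
# fit the model
model.fit(X_train, y_train, epochs=350, batch_size=32, verbose=0)
# evaluate the model
mse, mae = model.evaluate(X_test, y_test, verbose=0)
print('MSE: %.3f, RMSE: %.3f, MAE: %.3f' % (mse, sqrt(mse), mae))
# make an out-of-sample prediction from the last five observed months
row = asarray(values[-n_steps:]).reshape((1, n_steps, 1))
yhat = model.predict(row)
print('Predicted: %.3f' % yhat)
```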

Running the example first reports the shape of the dataset, then fits the model and evaluates it on the test dataset. Finally, a prediction is made for a single example.

Note: Your results may vary given the stochastic nature of the algorithm or evaluation procedure, or differences in numerical precision. Consider running the example a few times and compare the average outcome.

What results did you get? Can you change the model to do better?
Post your findings to the comments below.

First, the shape of the train and test datasets is displayed, confirming that the last 12 examples are used for model evaluation.

In this case, the model achieved an MAE of about 2,800 and predicted the next value in the sequence from the test set as 13,199, where the expected value is 14,577 (pretty close).

Note: it is good practice to scale the data and make the series stationary prior to fitting the model. I recommend this as an extension in order to achieve better performance. For more on preparing time series data for modeling, see the tutorial:

  • 4 Common Machine Learning Data Transforms for Time Series Forecasting

4. How to Use Advanced Model Features

In this section, you will discover how to use some of the slightly more advanced model features, such as reviewing learning curves and saving models for later use.

4.1 How to Visualize a Deep Learning Model

The architecture of deep learning models can rapidly become large and complex.

As such, it is important to have a clear idea of the connections and data flow in your model. This is especially important if you are using the functional API to ensure you have indeed connected the layers of the model in the way you intended.

There are two tools you can use to visualize your model: a text description and a plot.

Model Text Description

A text description of your model can be displayed by calling the summary() function on your model.

The example below defines a small model with three layers and then summarizes the structure.
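
A minimal sketch:

```python
# example of summarizing a model
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# define the model
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(8,)))
model.add(Dense(8, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(1, activation='sigmoid'))
# summarize the model
model.summary()
```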

Running the example prints a summary of each layer, as well as a total summary.

This is an invaluable diagnostic for checking the output shapes and number of parameters (weights) in your model.

Model Architecture Plot

You can create a plot of your model by calling the plot_model() function.

This will create an image file that contains a box and line diagram of the layers in your model.

The example below creates a small three-layer model and saves a plot of the model architecture to 'model.png' that includes input and output shapes.
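
A minimal sketch (note that plot_model requires the pydot and graphviz packages to be installed):

```python
# example of plotting a model
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import plot_model

# define the model
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(8,)))
model.add(Dense(8, activation='relu', kernel_initializer='he_normal'))
model.add(Dense(1, activation='sigmoid'))
# save a plot of the model architecture to file
plot_model(model, 'model.png', show_shapes=True)
```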

Running the example creates a plot of the model showing a box for each layer with shape information, and arrows that connect the layers, showing the flow of data through the network.

Plot of Neural Network Architecture

4.2 How to Plot Model Learning Curves

Learning curves are a plot of neural network model performance over time, such as calculated at the end of each training epoch.

Plots of learning curves provide insight into the learning dynamics of the model, such as whether the model is learning well, whether it is underfitting the training dataset, or whether it is overfitting the training dataset.

For a gentle introduction to learning curves and how to use them to diagnose learning dynamics of models, see the tutorial:

  • How to use Learning Curves to Diagnose Machine Learning Model Performance

You can easily create learning curves for your deep learning models.

First, you must update your call to the fit function to include reference to a validation dataset. This is a portion of the training set not used to fit the model, and is instead used to evaluate the performance of the model during training.

You can split the data manually and specify the validation_data argument, or you can use the validation_split argument and specify a percentage split of the training dataset and let the API perform the split for you. The latter is simpler for now.

The fit function will return a history object that contains a trace of performance metrics recorded at the end of each training epoch. This includes the chosen loss function and each configured metric, such as accuracy, and each loss and metric is calculated for the training and validation datasets.

A learning curve is a plot of the loss on the training dataset and the validation dataset. We can create this plot from the history object using the Matplotlib library.

The example below fits a small neural network on a synthetic binary classification problem. A validation split of 30 percent is used to evaluate the model during training, and the cross-entropy loss on the train and validation datasets is then graphed using a line plot.
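
A sketch of the example (the synthetic dataset here is generated with scikit-learn's make_classification, an assumed stand-in):

```python
# example of plotting learning curves
from sklearn.datasets import make_classification
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from matplotlib import pyplot

# create a synthetic binary classification dataset
X, y = make_classification(n_samples=1000, n_classes=2, random_state=1)
n_features = X.shape[1]
# define the model
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dense(1, activation='sigmoid'))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy')
# fit the model with a 30 percent validation split
history = model.fit(X, y, epochs=100, batch_size=32, verbose=0, validation_split=0.3)
# plot the learning curves
pyplot.title('Learning Curves')
pyplot.xlabel('Epoch')
pyplot.ylabel('Cross Entropy')
pyplot.plot(history.history['loss'], label='train')
pyplot.plot(history.history['val_loss'], label='val')
pyplot.legend()
pyplot.show()
```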

Running the example fits the model on the dataset. At the end of the run, the history object is returned and used as the basis for creating the line plot.

The cross-entropy loss for the training dataset is accessed via the 'loss' key, and the loss on the validation dataset is accessed via the 'val_loss' key on the history attribute of the history object.

Learning Curves of Cross-Entropy Loss for a Deep Learning Model

4.3 How to Save and Load Your Model

Training and evaluating models is great, but we may want to use a model later without retraining it each time.

This can be accomplished by saving the model to file and later loading it and using it to make predictions.

This can be achieved using the save() function on the model to save the model. It can be loaded later using the load_model() function.

The model is saved in H5 format, an efficient array storage format. As such, you must ensure that the h5py library is installed on your workstation. This can be achieved using pip; for example:
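
```
pip install h5py
```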

The example below fits a simple model on a synthetic binary classification problem and then saves the model file.
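
A sketch of the example (the synthetic dataset is an assumed stand-in):

```python
# example of saving a fit model
from sklearn.datasets import make_classification
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

# create a synthetic binary classification dataset with four features
X, y = make_classification(n_samples=1000, n_features=4, n_classes=2, random_state=1)
# define the model
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(4,)))
model.add(Dense(1, activation='sigmoid'))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy')
# fit the model
model.fit(X, y, epochs=100, batch_size=32, verbose=0)
# save the model to file
model.save('model.h5')
```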

Running the example fits the model and saves it to file with the name 'model.h5'.

We can then load the model and use it to make a prediction, or continue training it, or do whatever we wish with it.

The example below loads the model and uses it to make a prediction.
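
A sketch (the row of input values is illustrative and assumes the four-feature model saved above):

```python
# example of loading a saved model and making a prediction
from numpy import asarray
from tensorflow.keras.models import load_model

# load the model from file
model = load_model('model.h5')
# make a prediction on a new row of data
row = asarray([[1.91518414, 1.14995454, -1.52847073, 0.79430654]])
yhat = model.predict(row)
print('Predicted: %.3f' % yhat[0])
```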

Running the example loads the model from file, then uses it to make a prediction on a new row of data and prints the result.

5. How to Get Better Model Performance

In this section, you will discover some of the techniques that you can use to improve the performance of your deep learning models.

A big part of improving deep learning performance involves avoiding overfitting by slowing down the learning process or stopping the learning process at the right time.

5.1 How to Reduce Overfitting With Dropout

Dropout is a clever regularization method that reduces overfitting of the training dataset and makes the model more robust.

This is accomplished during training, where some number of layer outputs are randomly ignored or "dropped out." This has the effect of making the layer look like, and be treated like, a layer with a different number of nodes and connectivity to the prior layer.

Dropout has the effect of making the training process noisy, forcing nodes within a layer to probabilistically take on more or less responsibility for the inputs.

For more on how dropout works, see this tutorial:

  • A Gentle Introduction to Dropout for Regularizing Deep Neural Networks

You can add dropout to your models as a new layer prior to the layer that you want to have input connections dropped out.

This involves adding a layer called Dropout() that takes an argument specifying the probability of dropping each output from the previous layer. E.g. 0.4 means 40 percent of inputs will be dropped each update to the model.

You can add Dropout layers in MLP, CNN, and RNN models, although there are also specialized versions of dropout for use with CNN and RNN models that you might also want to explore.

The example below fits a small neural network model on a synthetic binary classification problem.

A dropout layer with 50 percent dropout is inserted between the first hidden layer and the output layer.
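
A sketch of the example (the synthetic dataset is an assumed stand-in):

```python
# example of using dropout
from sklearn.datasets import make_classification
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout

# create a synthetic binary classification dataset
X, y = make_classification(n_samples=1000, n_classes=2, random_state=1)
n_features = X.shape[1]
# define the model with a dropout layer after the first hidden layer
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy')
# fit the model
model.fit(X, y, epochs=100, batch_size=32, verbose=0)
```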

5.2 How to Accelerate Training With Batch Normalization

The scale and distribution of inputs to a layer can greatly impact how easily or quickly that layer can be trained.

This is generally why it is a good idea to scale input data prior to modeling it with a neural network model.

Batch normalization is a technique for training very deep neural networks that standardizes the inputs to a layer for each mini-batch. This has the effect of stabilizing the learning process and dramatically reducing the number of training epochs required to train deep networks.

For more on how batch normalization works, see this tutorial:

  • A Gentle Introduction to Batch Normalization for Deep Neural Networks

You can use batch normalization in your network by adding a batch normalization layer prior to the layer that you wish to have standardized inputs. You can use batch normalization with MLP, CNN, and RNN models.

This can be achieved by adding the BatchNormalization layer directly.

The example below defines a small MLP network for a binary classification prediction problem with a batch normalization layer between the first hidden layer and the output layer.
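
A sketch of the example (the synthetic dataset is an assumed stand-in):

```python
# example of using batch normalization
from sklearn.datasets import make_classification
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, BatchNormalization

# create a synthetic binary classification dataset
X, y = make_classification(n_samples=1000, n_classes=2, random_state=1)
n_features = X.shape[1]
# define the model with batch normalization after the first hidden layer
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(BatchNormalization())
model.add(Dense(1, activation='sigmoid'))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy')
# fit the model
model.fit(X, y, epochs=100, batch_size=32, verbose=0)
```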

Also, tf.keras has a range of other normalization layers you might like to explore; see:

  • tf.keras Normalization Layers Guide.

5.3 How to Halt Training at the Right Time With Early Stopping

Neural networks are challenging to train.

Too little training and the model is underfit; too much training and the model overfits the training dataset. Both cases result in a model that is less effective than it could be.

One approach to solving this problem is to use early stopping. This involves monitoring the loss on the training dataset and a validation dataset (a subset of the training set not used to fit the model). As soon as loss for the validation set starts to show signs of overfitting, the training process can be stopped.

For more on early stopping, see the tutorial:

  • A Gentle Introduction to Early Stopping to Avoid Overtraining Neural Networks

Early stopping can be used with your model by first ensuring that you have a validation dataset. You can define the validation dataset manually via the validation_data argument to the fit() function, or you can use the validation_split argument and specify the amount of the training dataset to hold back for validation.

You can then define an EarlyStopping callback and instruct it on which performance measure to monitor, such as 'val_loss' for loss on the validation dataset, and the number of epochs of observed overfitting to allow before taking action, e.g. 5.

This configured EarlyStopping callback can then be provided to the fit() function via the "callbacks" argument that takes a list of callbacks.

This allows you to set the number of epochs to a large number and be confident that training will end as soon as the model starts overfitting. You might also like to create a learning curve to discover more insights into the learning dynamics of the run and when training was halted.

The example below demonstrates a small neural network on a synthetic binary classification problem that uses early stopping to halt training as soon as the model starts overfitting (after about 50 epochs).
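
A sketch of the example (the synthetic dataset and the patience value are assumed):

```python
# example of using early stopping
from sklearn.datasets import make_classification
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping

# create a synthetic binary classification dataset
X, y = make_classification(n_samples=1000, n_classes=2, random_state=1)
n_features = X.shape[1]
# define the model
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dense(1, activation='sigmoid'))
# compile the model
model.compile(optimizer='adam', loss='binary_crossentropy')
# configure early stopping to watch the validation loss
es = EarlyStopping(monitor='val_loss', patience=5)
# fit the model and stop training when the validation loss stops improving
history = model.fit(X, y, epochs=200, batch_size=32, verbose=0, validation_split=0.3, callbacks=[es])
```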

The tf.keras API provides a number of callbacks that you might like to explore; you can learn more here:

  • tf.keras Callbacks

Further Reading

This section provides more resources on the topic if you are looking to go deeper.

Tutorials

  • How to Control the Stability of Training Neural Networks With the Batch Size
  • A Gentle Introduction to the Rectified Linear Unit (ReLU)
  • Difference Between Classification and Regression in Machine Learning
  • How to Manually Scale Image Pixel Data for Deep Learning
  • 4 Common Machine Learning Data Transforms for Time Series Forecasting
  • How to use Learning Curves to Diagnose Machine Learning Model Performance
  • A Gentle Introduction to Dropout for Regularizing Deep Neural Networks
  • A Gentle Introduction to Batch Normalization for Deep Neural Networks
  • A Gentle Introduction to Early Stopping to Avoid Overtraining Neural Networks

Books

  • Deep Learning, 2016.

Guides

  • Install TensorFlow 2 Guide.
  • TensorFlow Core: Keras
  • TensorFlow Core: Keras Overview Guide
  • The Keras functional API in TensorFlow
  • Save and load models
  • Normalization Layers Guide.

APIs

  • tf.keras Module API.
  • tf.keras Optimizers
  • tf.keras Loss Functions
  • tf.keras Metrics

Summary

In this tutorial, you discovered a step-by-step guide to developing deep learning models in TensorFlow using the tf.keras API.

Specifically, you learned:

  • The difference between Keras and tf.keras and how to install and confirm TensorFlow is working.
  • The 5-step life-cycle of tf.keras models and how to use the sequential and functional APIs.
  • How to develop MLP, CNN, and RNN models with tf.keras for regression, classification, and time series forecasting.
  • How to use the advanced features of the tf.keras API to inspect and diagnose your model.
  • How to improve the performance of your tf.keras model by reducing overfitting and accelerating training.

Do you have any questions?
Ask your questions in the comments below and I will do my best to answer.

Develop Deep Learning Projects with Python!

Deep Learning with Python

 What If You Could Develop A Network in Minutes

...with just a few lines of Python

Discover how in my new Ebook:
Deep Learning With Python

It covers end-to-end projects on topics like:
Multilayer Perceptrons, Convolutional Nets and Recurrent Neural Nets, and more...

Finally Bring Deep Learning To
Your Own Projects

Skip the Academics. Just Results.

See What's Inside


Source: https://machinelearningmastery.com/tensorflow-tutorial-deep-learning-with-tf-keras/
