Keras – Categorical Cross Entropy Loss Function

In this post, you will learn when to use the categorical cross entropy loss function when training a neural network with Python Keras. Generally speaking, a loss function computes the quantity that the model should seek to minimize during training. For regression models, the most commonly used loss function is mean squared error, while for classification models that predict probabilities, the most commonly used loss function is cross entropy. This post covers the different types of cross entropy loss function used to train a Keras neural network model.

Cross entropy loss is the objective function minimized when training a classification model that predicts the probability that an input belongs to each class. The lower the probability the model assigns to the true class, the larger the loss. Logistic regression is one example of a model trained with cross entropy loss. Check my post on the related topic: Cross entropy loss function explained with Python examples
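To make the idea concrete, here is a minimal plain-Python sketch of binary cross entropy for 0/1 labels and predicted probabilities. The function name binary_cross_entropy and the clipping constant eps are assumptions made for this illustration; this is not how Keras implements the loss internally.

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross entropy for 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Clip the probability so that log(0) is never evaluated.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Confident and correct predictions give a small loss (about 0.105).
confident_right = binary_cross_entropy([1, 0], [0.9, 0.1])

# Confident but wrong predictions give a much larger loss (about 2.303).
confident_wrong = binary_cross_entropy([1, 0], [0.1, 0.9])
```

The loss grows sharply as the probability assigned to the true class approaches zero, which is what pushes the model toward well-calibrated, correct predictions during training.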

When fitting a neural network for classification, Keras provides the following three types of cross entropy loss function:

  • binary_crossentropy: Used as a loss function for binary classification models. It computes the cross entropy loss between the true labels (0 or 1) and the predicted probabilities.
  • categorical_crossentropy: Used as a loss function for multi-class classification models where there are two or more output labels. The labels must be one-hot encoded, that is, given as vectors of 0s and 1s. Labels in integer form can be converted to this encoding with the keras.utils.to_categorical function.
  • sparse_categorical_crossentropy: Used as a loss function for multi-class classification models where the labels are integers (0, 1, 2, 3…). This loss is mathematically the same as categorical_crossentropy; it just has a different interface.
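The claim that the categorical and sparse variants differ only in their interface can be checked with a small plain-Python sketch. The helper names to_one_hot, categorical_ce and sparse_categorical_ce are hypothetical stand-ins written for this illustration, and the probabilities are made-up example values; in real code you would use keras.utils.to_categorical and the Keras losses themselves.

```python
import math

def to_one_hot(labels, num_classes):
    """Turn integer class ids into one-hot rows, e.g. 2 -> [0, 0, 1]."""
    return [[1 if i == label else 0 for i in range(num_classes)]
            for label in labels]

def categorical_ce(one_hot_rows, probs):
    """Mean cross entropy when the targets are one-hot vectors."""
    losses = [-sum(t * math.log(p) for t, p in zip(row, pr))
              for row, pr in zip(one_hot_rows, probs)]
    return sum(losses) / len(losses)

def sparse_categorical_ce(labels, probs):
    """Mean cross entropy when the targets are integer class ids."""
    losses = [-math.log(pr[label]) for label, pr in zip(labels, probs)]
    return sum(losses) / len(losses)

# Three examples, three classes; each row of probs sums to 1.
labels = [0, 2, 1]
probs = [[0.7, 0.2, 0.1],
         [0.1, 0.3, 0.6],
         [0.2, 0.5, 0.3]]

one_hot = to_one_hot(labels, 3)
loss_from_one_hot = categorical_ce(one_hot, probs)
loss_from_integers = sparse_categorical_ce(labels, probs)
```

Both functions pick out the log probability of the true class for each example, so the two results agree. The sparse form simply skips building the one-hot vectors, which saves memory when there are many classes.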

Here is how the loss function is set to one of the above when configuring a neural network. Pay attention to the loss parameter, which is assigned the value binary_crossentropy to learn the parameters of a binary classification neural network model.

from keras import optimizers  # or: from tensorflow.keras import optimizers

# network is assumed to be an already-built Keras model
network.compile(optimizer=optimizers.RMSprop(learning_rate=0.01),  # older Keras versions call this argument lr
                loss='binary_crossentropy',
                metrics=['accuracy'])

When the loss function to be used is categorical_crossentropy, the Keras network configuration code looks like the following:

# Same configuration as for binary classification; only the loss name changes
network.compile(optimizer=optimizers.RMSprop(learning_rate=0.01),
                loss='categorical_crossentropy',
                metrics=['accuracy'])
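As a rough summary of when each loss applies, the choice can be sketched as a small helper. The function choose_loss and its parameters are hypothetical and not part of Keras; only the returned strings are real Keras loss names that can be passed as the loss argument to compile.

```python
def choose_loss(num_classes: int, one_hot_labels: bool) -> str:
    """Return a Keras loss name for a classification task (hypothetical helper)."""
    if num_classes < 2:
        raise ValueError("classification needs at least two classes")
    if num_classes == 2 and not one_hot_labels:
        # A single 0/1 label per example.
        return "binary_crossentropy"
    if one_hot_labels:
        # Labels such as [0, 0, 1].
        return "categorical_crossentropy"
    # Integer class ids such as 0, 1, 2, ...
    return "sparse_categorical_crossentropy"

loss_for_spam_filter = choose_loss(num_classes=2, one_hot_labels=False)
loss_for_one_hot_digits = choose_loss(num_classes=10, one_hot_labels=True)
loss_for_integer_digits = choose_loss(num_classes=10, one_hot_labels=False)
```

The returned string would then be used as network.compile(..., loss=choose_loss(...)). This is a simplification under the assumptions above: it only looks at the label format and ignores other factors such as the output layer's activation.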

For the other loss functions that can be used with a Keras neural network, see this page: Keras Loss Functions.

Ajitesh Kumar