Aim
To design, train, and evaluate a binary classification Artificial Neural Network (ANN) that predicts whether a breast cancer tumor is malignant or benign from standardized clinical features, assessing its performance with accuracy metrics and a confusion matrix.
Prerequisites
Theory
Binary classification is the core machine learning task of predicting one of two possible outcomes — such as disease-positive (1) or disease-negative (0). In healthcare AI, this is critical for early risk screening and automated diagnostics. An ANN designed for binary classification must output a single probability value in the range of (0,1). This is achieved by placing exactly one neuron in the final layer equipped with the sigmoid activation function: σ(z) = 1 / (1 + e^(-z)).
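As a quick illustration of the formula above (a minimal NumPy sketch, not part of the practical's required code), the sigmoid squashes any real-valued score z into the open interval (0, 1):

```python
import numpy as np

def sigmoid(z):
    """Squash a real-valued score into the (0, 1) range."""
    return 1.0 / (1.0 + np.exp(-z))

# A score of 0 maps to exactly 0.5; large positive or negative
# scores saturate toward 1 and 0 respectively.
print(sigmoid(0.0))   # 0.5
print(sigmoid(5.0))   # ~0.993
print(sigmoid(-5.0))  # ~0.007
```

This saturation is exactly why the output of the final neuron can be read as a probability.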
The binary cross-entropy (or log loss) function is the standard mathematical loss for binary classification: L = -[y·log(p) + (1-y)·log(1-p)]. It heavily penalizes the network when it makes a confident prediction that is entirely wrong, guiding the Adam optimizer to quickly adjust the weights in the correct direction.
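To see the "confident but wrong" penalty concretely, here is an illustrative NumPy sketch of the per-sample log loss (the `eps` clipping is a common numerical-stability convention, not part of the formula itself):

```python
import numpy as np

def binary_cross_entropy(y, p, eps=1e-12):
    """Per-sample log loss; eps guards against log(0)."""
    p = np.clip(p, eps, 1 - eps)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# A confident, correct prediction costs almost nothing...
print(binary_cross_entropy(1, 0.99))  # ~0.01
# ...while a confident, wrong prediction is penalized heavily.
print(binary_cross_entropy(1, 0.01))  # ~4.61
```

The steep penalty on the second case is what produces the large gradients that let Adam correct badly wrong weights quickly.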
In this practical, we utilize the Breast Cancer Dataset from Scikit-Learn. It contains 569 patient samples, each with 30 numeric clinical features (like tumor radius, texture, area, and smoothness). Because these features exist on wildly different numerical scales (e.g., area might be ~1000 while smoothness is ~0.1), feature standardization (using `StandardScaler`) is mandatory. Without scaling, features with large magnitudes would dominate the gradient updates, leading to a biased or non-converging model.
While accuracy is a good starting metric, medical diagnostics require deeper evaluation. We use a Confusion Matrix to visualize the network's predictions. It breaks down the results into True Positives (correctly identified), True Negatives, False Positives (Type I error / false alarm), and False Negatives (Type II error / missed diagnosis). In cancer screening, minimizing False Negatives is heavily prioritized.
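The four counts can be pulled directly from scikit-learn's `confusion_matrix`. A small sketch with made-up labels (1 = disease-positive, 0 = disease-negative, matching the convention above) also shows recall, the metric that tracks missed diagnoses:

```python
from sklearn.metrics import confusion_matrix

# Toy labels: 1 = disease-positive, 0 = disease-negative
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

# sklearn flattens the 2x2 matrix in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)  # 5 1 1 3

# Recall (sensitivity) measures how few positives are missed --
# the quantity to maximize when False Negatives are costly.
recall = tp / (tp + fn)
print(recall)  # 0.75
```

Minimizing False Negatives is equivalent to pushing recall toward 1.0.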
Algorithm / Step-by-Step
- Import required libraries: `tensorflow`, `pandas`, `sklearn.datasets`, `sklearn.preprocessing`, and visualization tools (`matplotlib`, `seaborn`).
- Load the breast cancer dataset using `load_breast_cancer()`.
- Extract the feature matrix (`X`) and the target labels (`y`).
- Split the data into training (80%) and testing (20%) sets using `train_test_split`.
- Initialize a `StandardScaler`. Fit it only on the training data, then transform both the training and testing sets to prevent data leakage.
- Build a Sequential ANN model with three layers: `Dense(32, relu)` → `Dense(16, relu)` → `Dense(1, sigmoid)`.
- Compile the model specifying `adam` as the optimizer, `binary_crossentropy` as the loss, and tracking the `accuracy` metric.
- Train the model using `model.fit()` for 50 epochs with a batch size of 16 and a validation split of 20%.
- Evaluate the final model on the unseen test set to retrieve the test accuracy and loss.
- Generate prediction probabilities for the test set, convert them to binary labels (0 or 1) using a 0.5 threshold, and compute the `confusion_matrix`.
- Visualize the confusion matrix as a heatmap using Seaborn.
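The steps above can be sketched end to end as follows. This is a minimal reference implementation under the stated hyperparameters (50 epochs, batch size 16, 20% validation split), not the only valid arrangement; `verbose` flags are set to keep the output short:

```python
import tensorflow as tf
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix

# Load features and labels (0 = malignant, 1 = benign)
X, y = load_breast_cancer(return_X_y=True)

# 80/20 train/test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Fit the scaler on training data only, then transform both sets
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Three-layer Sequential ANN: 32 -> 16 -> 1 (sigmoid)
model = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(X_train_scaled.shape[1],)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])

# Train with a 20% validation split
history = model.fit(X_train_scaled, y_train, epochs=50, batch_size=16,
                    validation_split=0.2, verbose=0)

# Evaluate on the unseen test set
test_loss, test_acc = model.evaluate(X_test_scaled, y_test, verbose=0)
print(f"Test accuracy: {test_acc:.4f}")

# Threshold probabilities at 0.5 and build the confusion matrix
y_pred = (model.predict(X_test_scaled, verbose=0) > 0.5).astype(int)
cm = confusion_matrix(y_test, y_pred)
print(cm)
```

The Key Code Concepts section below walks through the three decisive fragments of this pipeline in detail.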
Key Code Concepts
Snippet 1 — Data Splitting and Safe Scaling
```python
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Split into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Standardize the data to a mean of 0 and variance of 1
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```
Crucially, `fit_transform` is called only on the training data: it computes the mean and standard deviation from the training set and applies them. The same fitted scaler is then applied to the test data using `transform` alone. This ensures the model receives no statistical hints about the unseen test data (preventing "data leakage").
Snippet 2 — Binary Classification Architecture
```python
model = tf.keras.models.Sequential([
    # Input layer automatically handles the 30 features from the dataset
    tf.keras.layers.Dense(32, activation='relu',
                          input_shape=(X_train_scaled.shape[1],)),
    tf.keras.layers.Dense(16, activation='relu'),
    # Output layer for binary classification must have 1 neuron
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
```
The network first expands the 30 clinical features into 32 neurons, then compresses to 16, and finally to a single output. The sigmoid activation on the final neuron guarantees the output is squashed strictly between 0 and 1, allowing it to be interpreted as the probability that the tumor is benign (class 1 in this dataset).
Snippet 3 — Generating the Confusion Matrix
```python
from sklearn.metrics import confusion_matrix

# 1. Get continuous probability predictions (e.g., 0.12, 0.89, 0.45)
y_pred_prob = model.predict(X_test_scaled)

# 2. Threshold probabilities at 0.5 to get crisp binary classes (e.g., 0, 1, 0)
y_pred = (y_pred_prob > 0.5).astype(int)

# 3. Compute the 2x2 matrix of True/False Positives and Negatives
cm = confusion_matrix(y_test, y_pred)
```
Neural networks output floating-point probabilities. To compare the network's predictions against the true labels (crisp 0s and 1s), we must apply a decision threshold. Values above 0.5 are classified as 1 (benign); values of 0.5 or below are classified as 0 (malignant).
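The final visualization step of the algorithm is not shown above; a minimal Seaborn sketch might look like this, with made-up counts standing in for the `cm` computed in Snippet 3 (a real run would pass that matrix directly):

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Illustrative counts only -- substitute the real `cm` from Snippet 3
cm = np.array([[41, 2],
               [1, 70]])

# Annotated 2x2 heatmap: rows = true labels, columns = predictions
ax = sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
                 xticklabels=['Malignant (0)', 'Benign (1)'],
                 yticklabels=['Malignant (0)', 'Benign (1)'])
ax.set_xlabel('Predicted label')
ax.set_ylabel('True label')
ax.set_title('Confusion Matrix')
plt.tight_layout()
plt.show()
```

The `annot=True, fmt='d'` pair prints the raw integer counts inside each cell, which is what makes the heatmap readable as a diagnostic table rather than just a color map.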
Expected Output
Training Logs: Text output showing the model training over 50 epochs. The loss typically drops steadily toward roughly 0.05 while the accuracy climbs toward about 98%.
Test Accuracy: A printed line displaying the model's accuracy on the unseen test dataset. Given the relatively clean nature of the Breast Cancer dataset and the use of scaling, this test accuracy typically falls in the 96-98% range.
Confusion Matrix Heatmap: A 2x2 grid generated by Seaborn. The top-left (correctly identified malignant) and bottom-right (correctly identified benign) cells display large counts, indicating correct predictions. The top-right and bottom-left cells (the errors) contain small counts, typically just a handful of cases, reflecting the model's high accuracy.
