Intrusion Detection Mini Project: DDoS Detection (CICIDS2017)

I became curious about cybersecurity after listening to the podcast “Brass Tacks: Talking Cybersecurity”.
I decided to do a mini project to learn a bit more about cybersecurity as an academic field.
For me, doing something and learning from it is easier than reading a textbook.

Summary

In this project, I explored how machine learning can be used to detect network attacks.
Using the CICIDS2017 dataset, I built a simple intrusion detection model that classifies network traffic as either benign or DDoS attack traffic.

The experiment follows a typical machine learning workflow:

Load and clean the dataset (handling spaces, infinity values, and missing data)
Train a Random Forest classifier
Evaluate model performance using classification metrics and ROC-AUC
Analyze which network features are most important for detection
Visualize traffic patterns using PCA
Investigate the few attacks the model failed to detect
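
The training, evaluation, feature-importance, and PCA steps listed above can be sketched roughly as follows. This is a minimal sketch, not the notebook's actual code: it runs on synthetic stand-in data rather than CICIDS2017, and the column names (`flow_duration`, `total_fwd_packets`, `noise`), the 70/30 split, and the hyperparameters are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for cleaned flow features (NOT the real CICIDS2017 data).
rng = np.random.default_rng(0)
n = 2000
y = rng.integers(0, 2, size=n)  # 0 = BENIGN, 1 = DDoS
X = pd.DataFrame({
    "flow_duration": rng.exponential(1.0, n) + 3.0 * y,
    "total_fwd_packets": rng.poisson(5, n) + 20 * y,
    "noise": rng.normal(0.0, 1.0, n),  # deliberately uninformative feature
})

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Train the Random Forest classifier.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Evaluate with classification metrics, ROC-AUC, and FP/FN counts.
y_pred = model.predict(X_test)
y_score = model.predict_proba(X_test)[:, 1]  # probability of the DDoS class
print(classification_report(y_test, y_pred, target_names=["BENIGN", "DDoS"]))
auc = roc_auc_score(y_test, y_score)
tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
print(f"ROC-AUC={auc:.4f}  FP={fp}  FN={fn}")

# Rank features by impurity-based importance.
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
print(importances)

# Project standardized features onto two principal components for plotting.
X_2d = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))
print(X_2d.shape)
```

The two PCA columns in `X_2d` can then be passed to a scatter plot colored by `y` to see how well the two classes separate. Features are standardized first because PCA is sensitive to scale, and flow features span very different ranges.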

The model achieved near-perfect performance on this dataset, with 0 false positives (FP) and 4 false negatives (FN).
However, a closer look at the misclassified samples revealed something interesting:
the missed attacks tended to have short flow durations and small packet counts, making them look more similar to normal traffic.

This suggests that low-intensity or stealthier attacks can be harder for models to detect, even when overall accuracy appears extremely high.
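
One way to run this kind of comparison is to split the true attacks into detected and missed groups and compare their typical feature values. The snippet below is only a sketch: the numbers are made up for illustration, and the column names `Flow Duration` and `Total Fwd Packets` are assumed to match the dataset's feature names.

```python
import pandas as pd

# Hypothetical test-set results; every value here is invented for illustration.
results = pd.DataFrame({
    "Flow Duration": [120, 95, 1_800_000, 2_500_000, 3_100_000, 40_000],
    "Total Fwd Packets": [2, 1, 8, 10, 12, 3],
    "y_true": [1, 1, 1, 1, 1, 0],  # 1 = DDoS, 0 = BENIGN
    "y_pred": [0, 0, 1, 1, 1, 0],
})

# Keep only the true attacks, then label each one as detected or missed.
attacks = results[results["y_true"] == 1]
outcome = attacks["y_pred"].map({1: "detected", 0: "missed"})

# Compare the median duration and packet count of the two groups.
summary = attacks.groupby(outcome)[["Flow Duration", "Total Fwd Packets"]].median()
print(summary)
```

Medians are used instead of means because flow durations are heavily skewed, so a handful of very long flows would otherwise dominate the comparison.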

Dataset

This experiment uses the CICIDS2017 dataset, developed by the Canadian Institute for Cybersecurity.

It is a widely used benchmark dataset for intrusion detection research and contains labeled network traffic, including both normal activity and various attack types.

Key characteristics:

~225,000 network flows
79 flow-based traffic features
Labels for BENIGN and DDoS traffic
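
Before modeling, the flows need the cleaning mentioned earlier (spaces, infinity values, and missing data). Here is a minimal sketch of that step on a tiny hand-made frame. The helper name `clean_flows` is hypothetical, and the sample column names and the choice to drop incomplete rows (rather than impute them) are assumptions, not necessarily what the notebook does.

```python
import numpy as np
import pandas as pd


def clean_flows(df: pd.DataFrame, label_col: str = "Label") -> pd.DataFrame:
    """Strip column-name whitespace, drop inf/NaN rows, encode labels as 0/1."""
    df = df.copy()
    # Column names may carry stray leading or trailing spaces.
    df.columns = df.columns.str.strip()
    # Treat +/- infinity as missing, then drop any incomplete row.
    df = df.replace([np.inf, -np.inf], np.nan).dropna()
    # Encode the label: 1 for DDoS, 0 for everything else (BENIGN here).
    df[label_col] = (df[label_col].str.strip() == "DDoS").astype(int)
    return df.reset_index(drop=True)


# Tiny made-up sample mimicking the kinds of problems described above.
raw = pd.DataFrame({
    " Flow Duration": [100, 200, 300, 400],
    "Flow Bytes/s": [1.5, np.inf, np.nan, 2.5],
    " Label": ["BENIGN", "DDoS", "BENIGN", "DDoS"],
})
clean = clean_flows(raw)
print(clean)
```

On the real data, `raw` would come from `pd.read_csv` on the downloaded file instead of the inline frame.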

Dataset link:
https://www.unb.ca/cic/datasets/ids-2017.html

Full Analysis

The full notebook output, including visualizations and detailed analysis, can be viewed here:

➡️ View the full notebook output