📊 Data & Models

This section covers how behavioral data is used to train machine learning models in bezpieczenstwoPwr, as well as how those models are integrated and run in the browser.

📁 Data Source

The machine learning models are trained using anonymized user behavior data, primarily from:

Kaggle datasets related to mouse movement, bot detection, or human-computer interaction
Custom-collected data from real users and simulation tools (e.g., Puppeteer)

Typical features include:

Mouse speed and acceleration
Click patterns and hesitation times
Pointer trajectory shapes
Keystroke intervals and input dynamics
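
As an illustration of the first feature above, the following is a minimal sketch of how speed and acceleration could be derived from raw pointer samples. The `PointerSample` and `Segment` types and both function names are hypothetical and are not taken from the project's code; units are assumed to be pixels and milliseconds.

```typescript
// Hypothetical sample shape: position in pixels, timestamp in milliseconds.
interface PointerSample {
  x: number;
  y: number;
  t: number;
}

// One movement segment between two consecutive samples.
interface Segment {
  speed: number; // pixels per millisecond
  tMid: number;  // midpoint timestamp of the segment
}

// Compute the speed of each segment; pairs with a non-positive time delta are skipped.
function toSegments(samples: PointerSample[]): Segment[] {
  const segments: Segment[] = [];
  for (let i = 1; i < samples.length; i++) {
    const prev = samples[i - 1];
    const curr = samples[i];
    const dt = curr.t - prev.t;
    if (dt <= 0) continue;
    const distance = Math.hypot(curr.x - prev.x, curr.y - prev.y);
    segments.push({ speed: distance / dt, tMid: (prev.t + curr.t) / 2 });
  }
  return segments;
}

// Approximate acceleration as the change in speed between consecutive segments,
// divided by the time between their midpoints.
function accelerations(segments: Segment[]): number[] {
  const result: number[] = [];
  for (let i = 1; i < segments.length; i++) {
    const dt = segments[i].tMid - segments[i - 1].tMid;
    if (dt <= 0) continue;
    result.push((segments[i].speed - segments[i - 1].speed) / dt);
  }
  return result;
}
```

Skipping zero-length time deltas avoids division by zero when the browser reports two events with the same timestamp.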

🧠 All data is anonymized before any storage or model training. No personally identifiable information is collected.

🛠 Data Preprocessing

Before feeding data into the model, we perform several preprocessing steps:

Normalization of coordinates and timing
Sequence padding or trimming to ensure consistent input shape
Feature extraction such as velocity, direction changes, or dwell time
Noise filtering to remove accidental or edge-case inputs
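
The first two steps above can be sketched roughly as follows. This is only an illustration, since the actual pipeline scripts are not published: the `Point` type, the function names, the choice to scale by viewport size, and the choice to keep the most recent samples and pad with zeros at the end are all assumptions.

```typescript
// Hypothetical sample shape: position in pixels, timestamp in milliseconds.
type Point = { x: number; y: number; t: number };

// Scale coordinates into [0, 1] by the viewport size and make timestamps
// relative to the first sample.
function normalize(points: Point[], width: number, height: number): Point[] {
  if (points.length === 0) return [];
  const t0 = points[0].t;
  return points.map((p) => ({ x: p.x / width, y: p.y / height, t: p.t - t0 }));
}

// Force a fixed sequence length: keep the most recent `length` samples,
// or pad the end with zero samples when the sequence is too short.
function padOrTrim(points: Point[], length: number): Point[] {
  const trimmed = points.slice(Math.max(0, points.length - length));
  const padding = Array.from({ length: length - trimmed.length }, () => ({
    x: 0,
    y: 0,
    t: 0,
  }));
  return [...trimmed, ...padding];
}
```

A fixed length matters because sequence models exported for the browser generally expect a fixed input shape.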

The preprocessing code lives in the training pipeline scripts. These are not yet published; ask us if you want early access.

🧠 Model Architecture

Models are designed for lightweight, in-browser inference using TensorFlow.js.

We experimented with:

LSTM / GRU networks for time-sequence analysis
1D CNNs for spatial-temporal patterns
Ensemble methods trained offline, then distilled into a single model and converted to TensorFlow.js format

The final model is selected based on size, accuracy, and inference speed in the browser.
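
One way such a selection could be expressed is sketched below: discard candidates that exceed a size or latency budget, then take the most accurate one that remains. The `Candidate` shape, the `pickModel` name, and the idea of fixed budgets are assumptions for illustration, not the project's actual selection procedure.

```typescript
// Hypothetical summary of one trained candidate model.
interface Candidate {
  name: string;
  sizeKb: number;    // size of the exported model files
  accuracy: number;  // validation accuracy in [0, 1]
  latencyMs: number; // measured in-browser inference time per prediction
}

// Return the most accurate candidate that fits both budgets,
// or undefined when no candidate qualifies.
function pickModel(
  candidates: Candidate[],
  maxSizeKb: number,
  maxLatencyMs: number
): Candidate | undefined {
  return candidates
    .filter((c) => c.sizeKb <= maxSizeKb && c.latencyMs <= maxLatencyMs)
    .sort((a, b) => b.accuracy - a.accuracy)[0];
}
```

Treating size and latency as hard limits and accuracy as the ranking criterion is only one possible trade-off; a weighted score would be an equally valid alternative.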

⚠️ Models are kept under public/models/ for now. Expect a proper versioning system and CDN-hosted models in future updates.

📦 Loading Models

The bbotd package automatically loads a model from a given URL.
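
How bbotd does this internally is not documented here, so the following is only a sketch of one common pattern: cache the pending load per URL so that repeated requests do not download the model twice. `createModelCache` and `Loader` are hypothetical names. For a Layers model, `tf.loadLayersModel` from `@tensorflow/tfjs` could be passed in as the loader.

```typescript
// A loader takes a model URL and resolves to the loaded model.
// It is kept generic so the sketch does not depend on a specific library.
type Loader<M> = (url: string) => Promise<M>;

// Wrap a loader so that each URL is fetched at most once while a load is
// pending or has succeeded. Failed loads are evicted so a later call can retry.
function createModelCache<M>(load: Loader<M>): Loader<M> {
  const cache = new Map<string, Promise<M>>();
  return (url: string) => {
    let pending = cache.get(url);
    if (!pending) {
      pending = load(url);
      cache.set(url, pending);
      // The caller still receives the rejection through the returned promise.
      pending.catch(() => cache.delete(url));
    }
    return pending;
  };
}

// Possible usage with TensorFlow.js (assumes a Layers-format model):
//   import * as tf from "@tensorflow/tfjs";
//   const getModel = createModelCache((url) => tf.loadLayersModel(url));
//   const model = await getModel(modelUrl); // modelUrl points at a model.json file
```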
