InnoTep Research Group

General Description

The objective of this BSc thesis is to design and implement a “Security Gateway” for healthcare environments, capable of intercepting, analyzing, and auditing clinical text flows before they are processed by critical decision systems.

Motivation

Large Language Models (LLMs) are vulnerable to data poisoning attacks, which poses a critical risk in healthcare environments. Small manipulations of the input text (indirect injections or “trigger” words) can force erroneous diagnoses or information leaks while remaining invisible to traditional quality controls.

Defense Framework: MEDLEY

The system will implement the MEDLEY defense framework (Medical Ensemble Diagnostic system with Leveraged diversitY) described in recent literature.

Key premise: Instead of relying on a single AI model, the platform will orchestrate a heterogeneous ensemble of models with different architectures. While one model may be vulnerable to a specific attack, multiple diverse models are much less likely to fail in the same way on the same malicious input, provided their failure modes are not strongly correlated.

System Architecture

The platform will act as a governance and monitoring layer, performing the following functions:

Reception: Receive clinical texts (e.g., simulated clinical notes)
Distribution: Send texts to multiple analysis engines in parallel
Measurement: Calculate discrepancy metrics (entropy/disagreement)
Alert: Generate security alerts for anomalous divergence patterns
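
The four functions above can be sketched end to end in a few lines of Python. This is a minimal illustration under stated assumptions, not the MEDLEY implementation: the three engines are toy placeholder functions (including a deliberately backdoored one that reacts to the made-up trigger "zxq"), the disagreement measure is a simple pairwise rate, and the 0.5 threshold is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
from typing import Callable, Dict, List

# Hypothetical engine type: any callable mapping a clinical text to a label.
Engine = Callable[[str], str]


def keyword_engine(text: str) -> str:
    """Toy rule-based engine (placeholder for a real model)."""
    return "urgent" if "chest pain" in text.lower() else "routine"


def length_engine(text: str) -> str:
    """Toy engine with an unrelated decision rule (placeholder)."""
    return "urgent" if len(text.split()) > 12 else "routine"


def backdoored_engine(text: str) -> str:
    """Toy 'victim' engine: the made-up trigger 'zxq' forces a 'routine' answer."""
    if "zxq" in text.lower():
        return "routine"
    return keyword_engine(text)


def distribute(text: str, engines: Dict[str, Engine]) -> Dict[str, str]:
    """Distribution: send the same text to every engine in parallel."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = {name: pool.submit(fn, text) for name, fn in engines.items()}
        return {name: future.result() for name, future in futures.items()}


def pairwise_disagreement(labels: List[str]) -> float:
    """Measurement: fraction of engine pairs that returned different labels."""
    pairs = list(combinations(labels, 2))
    if not pairs:
        return 0.0
    return sum(a != b for a, b in pairs) / len(pairs)


def audit(text: str, engines: Dict[str, Engine], threshold: float = 0.5) -> dict:
    """Reception to alert: run all engines, score divergence, flag if too high."""
    outputs = distribute(text, engines)
    score = pairwise_disagreement(list(outputs.values()))
    return {"outputs": outputs, "disagreement": score, "alert": score >= threshold}


if __name__ == "__main__":
    engines = {
        "keyword": keyword_engine,
        "length": length_engine,
        "victim": backdoored_engine,
    }
    note = ("Patient reports chest pain radiating to the left arm since this "
            "morning, with mild shortness of breath.")
    print(audit(note, engines))
    print(audit(note + " zxq", engines))
```

In a real deployment each engine would wrap a model or remote inference service, and the threshold would be calibrated on clean data rather than fixed by hand.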

Specific Objectives

1. Modular Mediation Architecture

Design a complete architecture including:

Ingestion Module

Receive and normalize text inputs
Simulate synthetic clinical histories
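
As an illustration of the normalization step, the sketch below applies Unicode NFKC normalization, strips a small set of zero-width characters (which could be used to hide or split trigger tokens), and collapses whitespace. The function name `normalize_note` and the exact character list are assumptions made for this example.

```python
import re
import unicodedata

# Zero-width and byte-order-mark characters to delete (illustrative, not exhaustive).
_INVISIBLE = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))


def normalize_note(text: str) -> str:
    """Return a canonical form of a clinical note for downstream engines."""
    # Fold compatibility characters (ligatures, full-width forms) to plain ones.
    text = unicodedata.normalize("NFKC", text)
    # Remove invisible characters that survive NFKC.
    text = text.translate(_INVISIBLE)
    # Collapse every run of whitespace (tabs, newlines) into a single space.
    text = re.sub(r"\s+", " ", text)
    return text.strip()
```

Note that NFKC also rewrites some characters that may matter clinically (for example, a superscript "²" becomes a plain "2"), so the gateway should probably keep the raw input alongside the normalized text for auditing.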

Orchestration Module

Distribute load across multiple AI inference engines
Parallel execution

Audit Module (MEDLEY)

Real-time calculation of disagreement metrics between models
Anomaly detection
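
One possible disagreement metric is the Shannon entropy of the labels returned by the ensemble for a single input. The sketch below assumes each model outputs one discrete label; the function names are illustrative, and whether MEDLEY uses this exact formulation should be checked against the literature.

```python
import math
from collections import Counter
from typing import Sequence


def vote_entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of the label distribution across the ensemble."""
    if not labels:
        return 0.0
    total = len(labels)
    probabilities = [count / total for count in Counter(labels).values()]
    return sum(-p * math.log2(p) for p in probabilities)


def normalized_vote_entropy(labels: Sequence[str]) -> float:
    """Entropy divided by its maximum for this ensemble size, giving [0, 1].

    The maximum, log2(n), is reached when all n models return different labels.
    """
    n = len(labels)
    if n < 2:
        return 0.0
    return vote_entropy(labels) / math.log2(n)
```

A value of 0 means full agreement; values near 1 mean the models are scattered across different answers. For models that expose class probabilities, a divergence between their output distributions would carry more information than hard votes.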

2. Attack Simulation Environment (Red Teaming)

Implement adversarial testing capabilities:

Controlled Dataset Generation

Use public data such as MTSamples
Inject “marks” or triggers (keywords or syntactic patterns)
Validate that triggered inputs produce detectably divergent responses
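
A minimal sketch of trigger injection follows. It assumes the notes are already loaded as a list of strings (for example from MTSamples; loading is not shown), uses a made-up trigger token supplied by the caller, and records which samples were poisoned so that detection can be scored later. `Sample`, `inject_trigger`, and `build_poisoned_dataset` are hypothetical names.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Sample:
    text: str
    poisoned: bool  # ground-truth flag, used only for evaluation


def inject_trigger(text: str, trigger: str, rng: random.Random) -> str:
    """Insert the trigger token at a random word boundary.

    Splitting on whitespace discards the original line breaks, which is
    acceptable for this toy example.
    """
    words = text.split()
    position = rng.randint(0, len(words))  # inclusive: may append at the end
    return " ".join(words[:position] + [trigger] + words[position:])


def build_poisoned_dataset(
    notes: Sequence[str], trigger: str, poison_rate: float, seed: int = 0
) -> List[Sample]:
    """Poison a fixed fraction of the notes, chosen reproducibly from the seed."""
    rng = random.Random(seed)
    n_poison = round(len(notes) * poison_rate)
    poisoned_indices = set(rng.sample(range(len(notes)), n_poison))
    dataset = []
    for index, note in enumerate(notes):
        if index in poisoned_indices:
            dataset.append(Sample(inject_trigger(note, trigger, rng), True))
        else:
            dataset.append(Sample(note, False))
    return dataset
```

Keeping the seed and the poisoned flag with the dataset makes each synthetic attack reproducible and documentable.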

“Victim” Model Configuration

Light fine-tuning of language models
Make them react to specific triggers
Validate threat effectiveness

3. Diversity Detection Mechanism

Implement detection logic:

Multiple Architecture Integration

At least two different model architectures
- Example: one based on BERT
- Example: another based on rules or a distilled variant

Decision Logic

Critical question: When is disagreement considered a security alert vs. legitimate clinical ambiguity?
Define appropriate thresholds and metrics
Implement alert classification system
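
One simple way to approach the threshold question, shown here purely as an assumption and not as the method prescribed by MEDLEY, is to calibrate two cut-offs on disagreement scores measured on clean notes. Scores in the normal range pass, moderately unusual scores are routed to clinical review as possible ambiguity, and scores beyond what clean data almost never produces raise a security alert. The quantile levels and category names are illustrative.

```python
from typing import Sequence, Tuple


def calibrate_thresholds(
    clean_scores: Sequence[float], review_q: float = 0.90, alert_q: float = 0.99
) -> Tuple[float, float]:
    """Derive (review, alert) thresholds from empirical quantiles of clean data."""
    if not clean_scores:
        raise ValueError("need at least one clean score to calibrate")
    ordered = sorted(clean_scores)

    def quantile(q: float) -> float:
        # Nearest-rank style lookup, clamped to the last element.
        index = min(len(ordered) - 1, int(q * len(ordered)))
        return ordered[index]

    return quantile(review_q), quantile(alert_q)


def classify(score: float, review_threshold: float, alert_threshold: float) -> str:
    """Map a disagreement score to one of three hypothetical alert classes."""
    if score > alert_threshold:
        return "security_alert"
    if score > review_threshold:
        return "clinical_review"
    return "ok"
```

Disagreement alone cannot prove that an input is malicious, so a scheme like this only ranks cases by how unusual they are; telling attacks apart from genuine ambiguity would need further evidence, which is part of what the thesis is meant to investigate.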

4. Observability and Alert Dashboard

Develop monitoring interface:

Visualizations

Detected attack attempts
Average discrepancy rate
Real-time security metrics

Features

Configurable alerts
Event history
Trend analysis
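
The dashboard figures listed above could be fed from a small in-memory window of recent audit events, as in the sketch below. `AuditEvent` and `MetricsWindow` are hypothetical names, and a production system would more likely export these values to a time-series store than hold them in process memory.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class AuditEvent:
    timestamp: float      # seconds since the epoch
    disagreement: float   # score produced by the audit module
    alert: bool           # whether the event was flagged


class MetricsWindow:
    """Keeps the most recent events and derives simple dashboard metrics."""

    def __init__(self, max_events: int = 1000) -> None:
        # A bounded deque silently drops the oldest event when full.
        self.events = deque(maxlen=max_events)

    def record(self, event: AuditEvent) -> None:
        self.events.append(event)

    def average_discrepancy(self) -> float:
        """Mean disagreement over the window (0.0 when empty)."""
        if not self.events:
            return 0.0
        return sum(e.disagreement for e in self.events) / len(self.events)

    def alert_count(self) -> int:
        """Number of flagged events currently in the window."""
        return sum(1 for e in self.events if e.alert)
```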

Technologies and Tools

Models and Frameworks

Transformers (Hugging Face)
BERT and variants
Diverse LLMs for the ensemble

Backend and Orchestration

Python
FastAPI or similar for APIs
Queue management for load distribution

Monitoring and Visualization

Web dashboard (React/Vue or similar)
Grafana or equivalent tools

Student Profile

Requirements:

Solid Python knowledge
Machine Learning and NLP fundamentals
Interest in cybersecurity and critical systems
Autonomous work capability

Desirable:

Experience with Transformers and LLMs
Knowledge of microservice architectures
Familiarity with Red Teaming methodologies

Duration and Modality

Estimated duration: 4-6 months
Modality: Hybrid (flexible on-site/remote)
Type: BSc Thesis (Proyecto Fin de Grado)

Expected Results

Functional Security Gateway system for clinical text
Evaluation dataset with documented synthetic attacks
Performance metrics for anomaly detection
Operational monitoring dashboard
Complete technical documentation
Possibility of scientific publication

Supervision and Support

This project will be supervised by InnoTep expert researchers with experience in:

Artificial Intelligence applied to healthcare
Cybersecurity and critical systems
Natural Language Processing

Technical support:

Access to computational resources
Public clinical text datasets
Pre-trained models
Weekly progress reviews

Contact

For more information or to express interest in this project:

📧 Email: gi.innotep@upm.es
🏛️ Location: ETSIST - Universidad Politécnica de Madrid

Interested in this project? Contact us to discuss details and start your thesis in a high-impact research area.

BSc Thesis: Audit System and Adversarial Injection Detection in Clinical AI

Security Gateway for protecting AI systems in healthcare environments