
Computer Vision and Deep Learning: Behind the Scenes of SCARFACE

Wapiki Team
January 10, 2026
11 min read
Computer Vision · TensorFlow · Deep Learning · OpenCV · GPU

The SCARFACE Challenge

SCARFACE must analyze 20+ simultaneous video streams in real time:

  • Face detection (<100ms)
  • Facial recognition (<200ms)
  • Document OCR (ID cards, passports)
  • Anomaly detection
  • All with 99%+ accuracy.

    Computer Vision Pipeline

    1. Face Detection (MTCNN)

    We use MTCNN (Multi-task Cascaded Convolutional Networks) for detection:

    python
    import cv2
    from mtcnn import MTCNN
    
    detector = MTCNN()
    
    def detect_faces(frame):
        # RGB conversion
        rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    
        # Detection
        faces = detector.detect_faces(rgb_frame)
    
        return faces

    Optimization: we run full detection only every 5 frames and track faces on the frames in between.
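    The skip-and-track loop can be sketched as follows. The `detect` and `track` callables stand in for MTCNN and a lightweight tracker (e.g. an OpenCV tracker); the function and constant names are our illustration, not the production code:

```python
DETECT_EVERY = 5  # full detector pass on every 5th frame

def process_stream(frames, detect, track):
    """Run full detection periodically; propagate boxes with a cheap tracker otherwise."""
    results = []
    boxes = []
    for i, frame in enumerate(frames):
        if i % DETECT_EVERY == 0:
            boxes = detect(frame)        # expensive: full CNN pass
        else:
            boxes = track(frame, boxes)  # cheap: update previous boxes
        results.append(boxes)
    return results
```

    On a 10-frame window this runs the detector twice (frames 0 and 5) and the tracker eight times, which is where most of the per-stream latency budget is recovered.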

    2. Facial Recognition (FaceNet)

    Once the face is detected, we generate a 128D embedding with FaceNet:

    python
    import cv2
    import numpy as np
    from tensorflow.keras.models import load_model
    
    facenet_model = load_model('facenet_keras.h5')
    
    def get_face_embedding(face_image):
        # Resize 160x160 (required by FaceNet)
        face_pixels = cv2.resize(face_image, (160, 160))
    
        # Normalization
        face_pixels = face_pixels.astype('float32')
        mean, std = face_pixels.mean(), face_pixels.std()
        face_pixels = (face_pixels - mean) / std
    
        # Expand for batch
        samples = np.expand_dims(face_pixels, axis=0)
    
        # Embedding
        embedding = facenet_model.predict(samples)[0]
    
        return embedding

    3. Database Matching

    python
    from scipy.spatial.distance import cosine
    
    def find_match(embedding, database_embeddings, threshold=0.6):
        min_distance = float('inf')
        matched_person = None
    
        for person_id, db_embedding in database_embeddings.items():
            distance = cosine(embedding, db_embedding)
    
            if distance < min_distance:
                min_distance = distance
                matched_person = person_id
    
        if min_distance < threshold:
            return matched_person, 1 - min_distance  # Confidence
    
        return None, 0
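    The linear scan above does O(n) Python-level work per query; with 50,000+ embeddings it pays to stack the database into a single NumPy matrix and compute every cosine distance in one matrix product (function and variable names here are ours, not from the production code):

```python
import numpy as np

def find_match_vectorized(embedding, person_ids, db_matrix, threshold=0.6):
    """db_matrix: (n, 128) array; row i is the embedding of person_ids[i]."""
    # Normalize query and database rows, so cosine distance = 1 - dot product
    q = embedding / np.linalg.norm(embedding)
    db = db_matrix / np.linalg.norm(db_matrix, axis=1, keepdims=True)
    distances = 1.0 - db @ q          # cosine distance for every row at once
    best = int(np.argmin(distances))
    if distances[best] < threshold:
        return person_ids[best], 1.0 - float(distances[best])
    return None, 0.0
```

    The Redis-cached embeddings can be loaded into `db_matrix` once and reused across queries, so each lookup is a single BLAS call instead of 50,000 `scipy` calls.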

    Document OCR

    For ID card analysis, we use Tesseract OCR combined with a custom model:

    python
    import cv2
    import pytesseract
    import re
    from PIL import Image
    
    def extract_id_info(id_card_image):
        # Preprocessing
        gray = cv2.cvtColor(id_card_image, cv2.COLOR_BGR2GRAY)
        denoised = cv2.fastNlMeansDenoising(gray)
    
        # OCR
        text = pytesseract.image_to_string(denoised, lang='eng')
    
        # Structured extraction with regex
        patterns = {
            'number': r'No\s*([A-Z0-9]+)',
            'name': r'Name\s*:\s*([A-Z ]+)',      # space, not \s, so the capture stays on one line
            'surname': r'Surname\s*:\s*([A-Z ]+)',
            'birth_date': r'(\d{2}/\d{2}/\d{4})'
        }
    
        extracted = {}
        for key, pattern in patterns.items():
            match = re.search(pattern, text)
            if match:
                extracted[key] = match.group(1).strip()
    
        return extracted
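    Run against a hypothetical OCR output, the regex extraction behaves like this (the sample text and field layout are illustrative; real cards vary by issuer). Note the character class `[A-Z ]` rather than `[A-Z\s]`: allowing arbitrary whitespace would let a greedy capture swallow the following line:

```python
import re

# Illustrative OCR output; real layouts differ by issuer
text = "No AB123456\nSurname : DOE\nName : JOHN\n01/02/1990"

patterns = {
    'number': r'No\s*([A-Z0-9]+)',
    'name': r'Name\s*:\s*([A-Z ]+)',
    'surname': r'Surname\s*:\s*([A-Z ]+)',
    'birth_date': r'(\d{2}/\d{2}/\d{4})',
}

extracted = {key: m.group(1).strip()
             for key, pattern in patterns.items()
             if (m := re.search(pattern, text))}
# extracted == {'number': 'AB123456', 'name': 'JOHN',
#               'surname': 'DOE', 'birth_date': '01/02/1990'}
```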

    GPU Optimizations

    TensorFlow GPU with CUDA

    python
    import tensorflow as tf
    
    # GPU configuration
    gpus = tf.config.list_physical_devices('GPU')
    if gpus:
        tf.config.experimental.set_memory_growth(gpus[0], True)
    
        # Mixed precision: roughly 2x throughput on Tensor Core GPUs
        policy = tf.keras.mixed_precision.Policy('mixed_float16')
        tf.keras.mixed_precision.set_global_policy(policy)

    Batch Processing

    Instead of processing each face individually, we batch:

    python
    def process_batch(faces, batch_size=32):
        embeddings = []
    
        for i in range(0, len(faces), batch_size):
            batch = faces[i:i+batch_size]
            batch_embeddings = facenet_model.predict(np.array(batch))
            embeddings.extend(batch_embeddings)
    
        return embeddings
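    A quick sanity check of the batching logic, with a stub standing in for FaceNet (the stub and the model parameter are our test scaffolding), shows that the final partial batch is handled naturally by the slice:

```python
import numpy as np

def process_batch(faces, model, batch_size=32):
    """Same batching logic as above, with the model passed in so it can be stubbed."""
    embeddings = []
    for i in range(0, len(faces), batch_size):
        batch = faces[i:i + batch_size]
        embeddings.extend(model.predict(np.array(batch)))
    return embeddings

class StubModel:
    """Maps (n, 160, 160, 3) inputs to (n, 128) embeddings, counting predict calls."""
    def __init__(self):
        self.calls = 0

    def predict(self, x):
        self.calls += 1
        return np.zeros((len(x), 128))
```

    With 70 faces and `batch_size=32`, the model is invoked three times (32 + 32 + 6) instead of 70 times, which is where the GPU throughput gain comes from.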

    System Architecture

    Backend: Spring Boot with gRPC for high-performance communication

    GPU Servers: NVIDIA RTX 3090 (24GB VRAM)

    Database: PostgreSQL for metadata, Redis for embeddings cache

    Queue: RabbitMQ for asynchronous processing
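    Embeddings are small (128 float32 values = 512 bytes), which is what makes the Redis cache practical. A minimal serialization sketch, assuming string keys such as `face:<person_id>` (the key scheme and helper names are our illustration, not the production schema):

```python
import numpy as np

def pack_embedding(embedding: np.ndarray) -> bytes:
    """Serialize a 128-D float32 embedding to 512 bytes for a Redis value."""
    return embedding.astype(np.float32).tobytes()

def unpack_embedding(raw: bytes) -> np.ndarray:
    """Inverse of pack_embedding: 512 bytes back to a (128,) float32 array."""
    return np.frombuffer(raw, dtype=np.float32)
```

    With `redis-py`, a cached lookup is then `unpack_embedding(r.get(f"face:{person_id}"))`, keeping the recognition hot path off PostgreSQL entirely.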

    Production Results

  • 🎯 **Accuracy**: 99.2%
  • ⚡ **Detection latency**: 85ms
  • ⚡ **Recognition latency**: 180ms
  • 📹 **Simultaneous cameras**: 20+
  • 💾 **Database**: 50,000+ indexed faces
  • 🔍 **False positive rate**: 0.3%

    Conclusion

    Production computer vision requires a combination of performant pre-trained models, aggressive GPU optimizations, and a robust system architecture.


    *A Computer Vision project? [Let's discuss](/contact).*
