ML Engineering · Data Engineering · MLOps

Yohan Shanuka

Machine Learning & Data Engineer

Designing production-grade data pipelines, ML training systems, and MLOps workflows that scale from prototype to millions of events per second.


ml-pipeline·kafka_producer.py


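from confluent_kafka import Producer

producer = Producer({'bootstrap.servers': 'localhost:9092'})  # broker address is an assumption
payload = b'{"event": "click"}'  # illustrative payload, not from the original
producer.produce('clickstream', payload)  # → 10k events/s
producer.flush()  # block until queued messages are delivered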


Phase 01

Data Pipeline Architecture

Kafka · Spark · Airflow · Iceberg

Kafka: event stream
Schema Registry: Avro / Protobuf
Spark: batch / stream
Airflow: orchestrator
DQ Checks: expectations
Data Lake: Parquet / Delta
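
As a rough illustration of what the DQ Checks stage does before data reaches the lake, here is a minimal sketch in plain Python. The field names (`user_id`, `event_type`, `ts`) and the helpers `check_event` and `split_batch` are hypothetical and not taken from any project on this page; a production pipeline would more likely use a dedicated expectations library.

```python
from typing import Any

# Hypothetical expectations for a clickstream event; field names are assumptions.
REQUIRED_FIELDS = {"user_id": str, "event_type": str, "ts": (int, float)}


def check_event(event: dict[str, Any]) -> list[str]:
    """Return a list of human-readable failures; an empty list means the event passed."""
    failures = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in event:
            failures.append(f"missing field: {field}")
        elif not isinstance(event[field], expected_type):
            failures.append(f"wrong type for {field}")
    # Range expectation: a numeric timestamp must be positive.
    ts = event.get("ts")
    if isinstance(ts, (int, float)) and ts <= 0:
        failures.append("ts must be positive")
    return failures


def split_batch(events):
    """Partition a batch into (valid, rejected) so failing rows can be quarantined."""
    valid, rejected = [], []
    for event in events:
        (rejected if check_event(event) else valid).append(event)
    return valid, rejected
```

Returning the failures as a list, rather than raising on the first one, keeps every violated expectation visible when a rejected row is inspected later.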


Kafka: Ingest
Spark: Process
Airflow: Schedule
MLflow: Track
FastAPI: Serve
Docker: Package
Kubernetes: Scale
Python: Core


Engineering Mindset

Building Scalable ML & Data Systems.

I focus on building intelligent systems at the intersection of machine learning and data engineering, designing high-throughput distributed pipelines and deploying production-ready models that solve complex real-world challenges.

My goal is to develop production-ready machine learning workflows supported by reliable data infrastructure, modern backend systems, and scalable cloud architectures.

Particularly Interested In

Focus areas

MLOps & Automation

Data Engineering Pipelines

Machine Learning Systems

Cloud-Based Systems

Distributed Data Processing

Backend Infrastructure

Engineering Focus

Core Expertise

Three focused areas — each with a clear pipeline and the capabilities I bring to production systems.

Models Deployed: 12+
Pipeline Uptime: 99.9%
Avg Latency: < 100 ms
Data Processed: 10 TB+

Domain Focus

ML Engineering

Train, evaluate, and serve models with low-latency APIs.

System Lifecycle Flow

Features → Training → Serving

Core Capabilities

CNN & transfer learning
Model APIs & FastAPI
Prediction systems
Model optimization

Key Performance Indicator

Inference latency: < 100 ms
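
A latency budget like this is usually checked against a tail percentile rather than the mean. The following is a minimal sketch using only the standard library; `predict` is a hypothetical stand-in for a real model call, and the choice of the 95th percentile is an assumption, since the page does not say which statistic the figure refers to.

```python
import statistics
import time


def measure_latencies_ms(fn, inputs):
    """Call fn on each input and return per-call wall-clock latencies in milliseconds."""
    latencies = []
    for x in inputs:
        start = time.perf_counter()
        fn(x)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def p95(latencies_ms):
    """95th percentile: quantiles(n=20) returns 19 cut points, and the last one is p95."""
    return statistics.quantiles(latencies_ms, n=20)[-1]


def predict(x):
    # Hypothetical stand-in for a real model call.
    return x * 2


latencies = measure_latencies_ms(predict, range(200))
within_budget = p95(latencies) < 100.0  # compare against the 100 ms budget
```

In a real service the same measurement would wrap the full request path (deserialization, feature lookup, model call), since those steps often dominate the model's own compute time.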

ML Lifecycle

Data Ingest → Feature Engineering → Training → Evaluation → Deployment → Monitoring

Kafka: Ingest
Spark: Process
MLflow: Track
K8s: Deploy

Tools & Frameworks

Technology Ecosystem

A curated stack I use to build scalable ML systems, data pipelines, and cloud-native infrastructure.

Proficient · Currently Learning

Machine Learning

Ecosystem Focus

Developing and deploying deep learning, computer vision, and predictive models using modern frameworks.

Core Competencies

CNN & Transfer Learning

PyTorch & TensorFlow

Model Optimization

Average Skill: 86%

Verified Stack

TensorFlow

PyTorch

Scikit-learn

OpenCV

XGBoost

Keras