
Ruben Ahrens, MSc

LinkedIn GitHub info@rubenahrens.com Leiden, The Netherlands

Skills

Python
PyTorch
TensorFlow
Data Mining
Reinforcement Learning
Computer Vision
Deep Learning
Hyperparameter Optimization
Fine-tuning Transformer Models
Big Data
NLP
Model Deployment
RAG
Docker
Databricks
Prompt Engineering

Experience

May 2025 -- Present
Data Scientist
Data Science Agency
The next challenge in my career is the position of Data Scientist at Data Science Agency. In this position, I work on data science projects as a consultant.
Feb 2025 -- May 2025
Coding Expert For AI Training
Outlier
To keep my coding skills sharp, I took a freelance opportunity at Outlier to train an in-development Large Language Model (LLM) with Python code.
May 2023 -- Present
Sales Expert
MKC Moto
During my master's, I had a side job at a motorcycle clothing store. Through this job, I improved my communication skills, learned various selling techniques, and met people who share a passion for riding motorcycles.

Projects

Insight into Sports Participation

Dec 2025 -- Present
In collaboration with the BI team at NOC*NSF, I worked on a project to analyze sports participation data in the Netherlands. Using Databricks, we processed large datasets, including CBS socioeconomic data, to gain insights into sports participation trends across neighborhoods and regions.
Databricks, Big Data, BI, Machine Learning

Axoniq Developer Agent

Jul 2025 -- Oct 2025
In a team of full stack developers, I collaborated on the "Axoniq Developer Agent". This web application enables developers with limited knowledge of the Axon Framework to create event-sourced applications using a GUI. My main responsibilities were prompt engineering and designing the agentic system, but I also contributed to the backend and frontend development.
Agentic Apps, Prompt Engineering, Event Sourcing, Full Stack Development

Rowing Competition Dashboard

Jul 2025
For the Dutch rowing federation (KNRB), I rejuvenated an outdated dashboard that visualized rowing competition data. The dashboard is now available across the organization as a GUI web application.
Databricks, Bokeh, Data Visualization, Frontend Development

AI Coach Assistant

May 2025 -- Present
My first engagement for Data Science Agency involved developing a proof of concept of a RAG agent for the Dutch Olympic Committee and Sports Federation (NOC*NSF). The agent lets athletes ask questions and receive nutrition and training advice based on their personal training data.
Streamlit, LLM, RAG, Cloud Run, Docker

Detecting ship plumes using satellite data

Jan 2024 -- Jan 2025
Earth observation helps monitor shipping emissions. This study uses machine learning to improve ship plume detection by incorporating SO$_2$ and HCHO alongside NO$_2$ from TROPOMI data. An XGBoost classifier trained on 80x80 km samples shows that adding SO$_2$ and HCHO enhances detection, especially at extreme NOx proxy values. Individually, SO$_2$ and HCHO achieved ROC AUCs of 0.647 and 0.634, compared to 0.684 for NO$_2$, highlighting their potential despite room for improvement with more data.
Computer Vision, Big Data, Hyperparameter Optimization, Geospatial Machine Learning

Resistance training optimization

May 2024 -- Jun 2024
In this paper for the course "Sports Data Science", I combined my passion for bodybuilding and AI. I explored training data from 60 people, creating an algorithm that translates weightlifting performance across exercises.
Web Scraping, Data Mining

Recognizing drug side-effects from text

Nov 2023 -- Jan 2024
After fine-tuning on the CADEC dataset of medical reviews, BioBERT, a transformer network specialized in biomedical text, demonstrates high performance in recognizing medical entities.
TensorFlow, Natural Language Processing, Fine-tuning Transformer Models, Deep Learning

Optimizing container placement for a cargo ship

May 2023 -- Jun 2023
In this project, my peers and I used a genetic algorithm to optimize the container placement on a cargo ship that had a route passing multiple harbors. The placement was meant to minimize unloading time and the distance between the center of gravity of the ship and the load.
Data Visualization

Predicting crime in Chicago neighborhoods

Oct 2022 -- Jan 2023
This was the first big project where I worked with spatiotemporal data. My partner and I divided the map of Chicago, Illinois into grid cells, creating a sparse 50x55-pixel image. Using an ensemble of ConvLSTM models, we were able to generate an accurate estimate of high-crime locations, accounting for large differences in data between regions.
Data Visualization, Deep Learning, TensorFlow, Computer Vision

Interactively classifying visual art

Apr 2021 -- Jun 2022
My bachelor thesis was my first big AI project. In this project, I used the deep learning model CLIP to classify paintings into portrait or landscape classes. I investigated the usefulness of interactive machine learning (where an algorithm is trained during the data annotation process). I created a GUI application to annotate the images.
Computer Vision, Fine-tuning Transformer Models, PyTorch

Free the Sea VR Game

Dec 2020 -- Jan 2021
For a course at the University in Amsterdam, I collaborated on an educational VR game. In "Free the Sea", the player is tasked with collecting plastic waste in the sea. The purpose of the game is to raise awareness about recycling among children.
Game Development, C++