Projects

I believe in making research reproducible and accessible to all. Here are some of my open-source projects that bridge AI, data science, and animal welfare science.

AI's Representation Bias in Livestock Farming

Due to biases embedded in the prompt revision process, text-to-image generative AI systematically romanticized livestock farming, depicting dairy cows grazing on pasture and pigs rooting in mud even when asked for realistic depictions. Inhibiting prompt revision produced images that more closely reflected modern farming practices: cows housed indoors accessing feed through metal headlocks, and pigs behind metal railings on concrete floors. These findings reveal how prompt revision systematically promotes certain ideologies while erasing reality.

Technologies: Python, OpenAI API, Docker, GNU Make, statistical analysis

Impact: First comprehensive study of the bias of text-to-image generative AI in depicting farm animals

GitHub Paper

moo4feed R Package

moo4feed is an R package designed to extract novel individual-level animal traits from raw feeding and drinking data collected from sensors. The package aims to support animal welfare science research and data-driven monitoring by enabling reproducible, scalable analysis workflows.

Technologies: R, tidyverse, time-series analysis, statistical modeling, data visualization

Impact: Enables researchers to easily visualize, analyze, and understand the story of every single animal from sensor data

Website GitHub

AI Agent System Prompts & Rules

A curated collection of battle-tested system prompts and rules for guiding AI agents across different domains and tasks. The prompts in this repository have been tested in real-world use and proven effective in production environments for AI applications such as software development.

Technologies: R, Cursor, Shell, Markdown

Impact: Enables developers to collaborate with AI agents and complete tasks more efficiently and effectively

GitHub

Competition Dominance Analysis

This thoroughly documented and well-organized R project uses a Bayesian statistical framework to examine how feed competition influences the structure of dominance hierarchies in dairy cattle. I used an algorithm to automatically detect displacement behaviors at the feeder, analyzing data from 159 cows over a 10-month period.
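As a rough illustration of what automated displacement detection at the feeder can look like, here is a minimal Python sketch. It is not the project's actual algorithm: it assumes a simple replacement heuristic in which a different cow entering the same feed bin shortly after another cow leaves counts as a displacement. The `Visit` record, the `detect_replacements` function, and the 26-second `max_gap_s` threshold are all hypothetical names and values chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Visit:
    cow: str       # animal identifier
    bin_id: int    # feed bin the visit took place at
    start: float   # seconds since the start of recording
    end: float     # seconds since the start of recording


def detect_replacements(visits, max_gap_s=26.0):
    """Return (actor, reactor, bin_id, time) tuples.

    A replacement is assumed when a different cow enters the same bin
    within max_gap_s seconds of the previous cow leaving it.
    The threshold is an illustrative value, not the project's setting.
    """
    by_bin = defaultdict(list)
    for visit in visits:
        by_bin[visit.bin_id].append(visit)

    events = []
    for bin_id, bin_visits in by_bin.items():
        bin_visits.sort(key=lambda v: v.start)
        # Compare each visit with the one that follows it at the same bin.
        for prev, nxt in zip(bin_visits, bin_visits[1:]):
            gap = nxt.start - prev.end
            if prev.cow != nxt.cow and 0 <= gap <= max_gap_s:
                events.append((nxt.cow, prev.cow, bin_id, nxt.start))

    events.sort(key=lambda e: e[3])  # chronological order
    return events


visits = [
    Visit("A", 1, 0, 100),
    Visit("B", 1, 105, 300),  # B enters 5 s after A leaves: counted
    Visit("C", 1, 400, 500),  # 100 s gap: not counted
    Visit("A", 2, 120, 200),
    Visit("A", 2, 210, 260),  # same cow returning: not counted
]
events = detect_replacements(visits)
print(events)
```

Each detected event names an actor (the cow taking over the bin) and a reactor (the cow that left), which is the kind of win/loss record a dominance rating such as Elo can be computed from.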

Technologies: R, Bayesian modeling, Elo-rating, automated behavior detection

Impact: First study to demonstrate that competition flattens dominance hierarchies

GitHub Paper

Lameness Hierarchy

This project integrates Python, R, and HTML/CSS/JavaScript to develop an innovative lameness assessment tool for dairy cows using Amazon MTurk. Similar in concept to Chatbot Arena for ranking generative AI models, our system ranks cows based on the severity of lameness. Crowd workers perform pairwise comparisons: they watch two cows walking side by side and judge which cow is more lame.
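To show how pairwise judgments can be turned into a ranking, here is a minimal Python sketch of a classic Elo update. It is a simplification for illustration only: the project uses a Bayesian Elo-rating, which this sketch does not implement. The function names, the starting rating of 1000, the K-factor of 32, and the cow IDs are assumptions made up for the example.

```python
def expected_score(rating_a, rating_b):
    """Probability that A is judged more lame than B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def rank_by_elo(comparisons, start=1000.0, k=32.0):
    """Rank cows from (more_lame, less_lame) pairwise judgments.

    Classic Elo with illustrative defaults; a higher rating means the cow
    was judged more severely lame.
    """
    ratings = {}
    for more_lame, less_lame in comparisons:
        r_win = ratings.setdefault(more_lame, start)
        r_lose = ratings.setdefault(less_lame, start)
        # Surprising outcomes (low expected score) move ratings further.
        delta = k * (1.0 - expected_score(r_win, r_lose))
        ratings[more_lame] = r_win + delta
        ratings[less_lame] = r_lose - delta
    ranking = sorted(ratings, key=ratings.get, reverse=True)
    return ratings, ranking


# Hypothetical crowd judgments: the first cow in each pair was judged more lame.
comparisons = [
    ("cow_3", "cow_1"),
    ("cow_3", "cow_2"),
    ("cow_2", "cow_1"),
]
ratings, ranking = rank_by_elo(comparisons)
print(ranking)
```

Classic Elo depends on the order in which comparisons arrive and gives no measure of uncertainty, which is one reason a Bayesian formulation can be preferable for this kind of data.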

Technologies: Python, R, HTML/CSS/JavaScript, Amazon MTurk, AWS, Bayesian Elo-rating

Impact: Enables reliable, scalable, and cost-effective lameness assessment through crowdsourced pairwise comparisons

GitHub Paper

Welfare Assessment using GPT-4o

This project evaluates cow welfare through automated image analysis using GPT-4o and the Segment Anything Model (SAM). It investigates whether GPT-4o can achieve expert-level accuracy in assessing cow cleanliness after being provided with the Welfare Quality Protocol manual and example images.

Technologies: Python, R, OpenAI GPT-4o, Segment Anything Model, computer vision

Impact: Demonstrates potential of Large Multimodal Models for automated welfare assessment

GitHub

Lameness Prediction through Machine Learning

This project applies various machine learning methods, including Support Vector Machines, Random Forest, K-Nearest Neighbors, and Logistic Regression, to predict lameness in dairy cows based on behavioral data. Using data from 34 lame and 100 healthy cows over 10 months, we analyzed the impact of data preprocessing techniques such as undersampling, feature selection, and dimensionality reduction on model performance.
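As an illustration of the undersampling step, here is a minimal Python sketch that balances the 34 lame and 100 healthy cows by randomly discarding records from the larger class. It uses only the standard library and placeholder records; the `undersample` function, the `lame` field, and the fixed seed are hypothetical and are not taken from the private repository.

```python
import random


def undersample(records, label_key="lame", seed=0):
    """Randomly drop rows from larger classes until all classes are equal in size."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible

    by_class = {}
    for row in records:
        by_class.setdefault(row[label_key], []).append(row)

    n_min = min(len(rows) for rows in by_class.values())

    balanced = []
    for rows in by_class.values():
        # Sample without replacement down to the minority-class size.
        balanced.extend(rng.sample(rows, n_min))
    rng.shuffle(balanced)
    return balanced


# Placeholder records mirroring the class counts in the text:
# 34 lame cows and 100 healthy cows.
records = [{"cow_id": i, "lame": i < 34} for i in range(134)]

balanced = undersample(records)
n_lame = sum(row["lame"] for row in balanced)
print(len(balanced), n_lame)  # 68 34
```

In a real workflow, undersampling would normally be applied to the training split only, so that evaluation still reflects the original class imbalance.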

Technologies: Python, scikit-learn, various ML algorithms, data preprocessing

Impact: Highlights potential of ML for early lameness detection

Note: This repository is private as the associated paper has not been published yet. Please email me to request access.

Request Access

Open Source Philosophy

I believe in the power of open science and reproducible research. My projects include:

Comprehensive documentation - Clear README files, vignettes, and code comments
Reproducible workflows - Docker containers, conda locks, environment files, and step-by-step instructions
Clean, well-organized code - Following best practices for maintainability
Data and model transparency - Where possible, sharing datasets and model weights
Educational value - Code that others can learn from and build upon

If you're interested in collaborating on any of these projects or have questions about implementation, please feel free to reach out!