Projects
I believe in making research reproducible and accessible to all. Here are some of my open-source projects that bridge AI, data science, and animal welfare science.
Due to biases embedded in the prompt revision process, text-to-image generative AI systematically romanticized livestock farming, depicting dairy cows grazing on pasture and pigs rooting in mud even when asked for realistic depictions. Inhibiting prompt revision produced images that more closely reflected modern farming practices: cows housed indoors accessing feed through metal headlocks, and pigs behind metal railings on concrete floors. These findings reveal how prompt revision systematically promotes certain ideologies while erasing reality.
Technologies: Python, OpenAI API, Docker, GNU Make, statistical analysis
Impact: First comprehensive study of bias in how text-to-image generative AI depicts farm animals
moo4feed is an R package designed to extract novel individual-level animal traits from raw feeding and drinking data collected from sensors. The package aims to support animal welfare science research and data-driven monitoring by enabling reproducible, scalable analysis workflows.
Technologies: R, tidyverse, time-series analysis, statistical modeling, data visualization
Impact: Enables researchers to easily visualize, analyze, and understand the story of every single animal from sensor data
A curated collection of battle-tested system prompts and rules for guiding AI agents across different domains and tasks. The prompts in this repository have been tested in real-world production environments for AI applications such as software development.
Technologies: R, Cursor, Shell, Markdown
Impact: Enables developers to collaborate with AI agents and complete tasks more efficiently and effectively
This thoroughly documented and well-organized R project uses a Bayesian statistical framework to examine how feed competition influences the structure of dominance hierarchies in dairy cattle. I used an algorithm to automatically detect displacement behaviors at the feeder, analyzing data from 159 cows over a 10-month period.
Technologies: R, Bayesian modeling, Elo-rating, automated behavior detection
Impact: First study to demonstrate that competition flattens dominance hierarchies
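To give a feel for the automated detection step mentioned above, here is a minimal sketch written in Python for illustration. It flags a displacement when one cow leaves a feed bin and a different cow enters the same bin within a short gap. The record layout, the `find_displacements` name, and the 30-second threshold are assumptions made for this example, not the project's actual algorithm or parameters.

```python
from typing import NamedTuple


class Visit(NamedTuple):
    cow: str      # animal ID (hypothetical format)
    bin_id: int   # feed bin where the visit took place
    start: float  # visit start, in seconds
    end: float    # visit end, in seconds


def find_displacements(visits, max_gap_s=30.0):
    """Return (actor, reactor, bin_id, time) tuples, ordered by time.

    Assumption: a displacement is a bin handover in which a different cow
    enters within `max_gap_s` seconds of the previous cow leaving.
    """
    # Group visits by feed bin so handovers are only compared within a bin.
    by_bin = {}
    for v in visits:
        by_bin.setdefault(v.bin_id, []).append(v)

    events = []
    for bin_id, bin_visits in by_bin.items():
        bin_visits.sort(key=lambda v: v.start)
        # Compare each visit with the next visit at the same bin.
        for prev, nxt in zip(bin_visits, bin_visits[1:]):
            gap = nxt.start - prev.end
            if prev.cow != nxt.cow and 0 <= gap <= max_gap_s:
                events.append((nxt.cow, prev.cow, bin_id, nxt.start))
    return sorted(events, key=lambda e: e[3])


visits = [
    Visit("A", 1, 0, 100),
    Visit("B", 1, 105, 300),   # B enters 5 s after A leaves: flagged
    Visit("A", 2, 400, 500),
    Visit("C", 2, 900, 1000),  # 400 s gap: not flagged
]
displacements = find_displacements(visits)
print(displacements)
```

Each flagged event names an actor (the cow that takes over the bin) and a reactor (the cow that left), which is the kind of win/loss record an Elo-style rating needs as input.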
This project integrates Python, R, and HTML/CSS/JavaScript to develop an innovative lameness assessment tool for dairy cows using Amazon MTurk. Similar in concept to the chatbot arena for ranking generative AI models, our system ranks cows by lameness severity. Crowd workers watch two cows walking side by side and judge which cow is more lame.
Technologies: Python, R, HTML/CSS/JavaScript, Amazon MTurk, AWS, Bayesian Elo-rating
Impact: Enables reliable, scalable, and cost-effective lameness assessment
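The ranking idea can be sketched with a classic sequential Elo update, shown here in Python. Note that this is a simplification: the project lists Bayesian Elo-rating, whereas the sketch below uses the plain, non-Bayesian update rule. The function names, the `k` factor, and the starting rating are illustrative assumptions.

```python
def expected_score(r_a, r_b):
    """Probability that A is judged more lame than B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_rank(comparisons, k=32.0, start=1000.0):
    """Rank cows from pairwise judgments.

    `comparisons` is a sequence of (more_lame, less_lame) pairs, one per
    crowd-worker judgment. A higher rating means more severe lameness.
    """
    ratings = {}
    for more_lame, less_lame in comparisons:
        r_w = ratings.setdefault(more_lame, start)
        r_l = ratings.setdefault(less_lame, start)
        # Shift both ratings by how surprising the judgment was.
        delta = k * (1.0 - expected_score(r_w, r_l))
        ratings[more_lame] = r_w + delta
        ratings[less_lame] = r_l - delta
    ranking = sorted(ratings, key=ratings.get, reverse=True)
    return ranking, ratings


# Hypothetical judgments: cow_3 is judged more lame than both others.
comparisons = [("cow_3", "cow_1"), ("cow_3", "cow_2"), ("cow_2", "cow_1")]
ranking, ratings = elo_rank(comparisons)
print(ranking)
```

Because the update is sequential, the result depends on the order of the judgments, which is one reason a Bayesian formulation can be preferable for crowd-sourced data.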
This project evaluates cow welfare through automated image analysis using GPT-4o and the Segment Anything Model (SAM). It investigates whether GPT-4o can achieve expert-level accuracy in assessing cow cleanliness after being provided with the Welfare Quality Protocol manual and example images.
Technologies: Python, R, OpenAI GPT-4o, Segment Anything Model, computer vision
Impact: Demonstrates the potential of large multimodal models for automated welfare assessment
This project applies various machine learning methods, including Support Vector Machines, Random Forest, K-Nearest Neighbors, and Logistic Regression, to predict lameness in dairy cows based on behavioral data. Using data from 34 lame and 100 healthy cows over 10 months, we analyzed the impact of data preprocessing techniques such as undersampling, feature selection, and dimensionality reduction on model performance.
Technologies: Python, scikit-learn, various ML algorithms, data preprocessing
Impact: Highlights the potential of ML for early lameness detection
Note: This repository is private as the associated paper has not been published yet. Please email me to request access.
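As a small illustration of one preprocessing step named above, the sketch below performs random undersampling on the 34 lame versus 100 healthy class split using only the Python standard library. The `undersample` helper and the label encoding (1 = lame, 0 = healthy) are assumptions for this example and are not taken from the private repository.

```python
import random


def undersample(labels, seed=0):
    """Return sorted indices of a class-balanced subset of `labels`.

    Every class is randomly reduced to the size of the smallest class.
    """
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)

    n_min = min(len(idx) for idx in by_class.values())
    keep = []
    for idx in by_class.values():
        keep.extend(rng.sample(idx, n_min))
    return sorted(keep)


# Hypothetical label vector matching the class counts in the description.
labels = [1] * 34 + [0] * 100
kept = undersample(labels)
balanced = [labels[i] for i in kept]
print(balanced.count(1), balanced.count(0))  # 34 34
```

In a real evaluation, undersampling would normally be applied to the training folds only, so that the test data keep the original class balance.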
Open Source Philosophy
I believe in the power of open science and reproducible research. My projects include:
- Comprehensive documentation - Clear README files, vignettes, and code comments
- Reproducible workflows - Docker containers, conda locks, environment files, and step-by-step instructions
- Clean, well-organized code - Following best practices for maintainability
- Data and model transparency - Where possible, sharing datasets and model weights
- Educational value - Code that others can learn from and build upon
If you're interested in collaborating on any of these projects or have questions about implementation, please feel free to reach out!