Lely — Migrating dairy-herd models from .NET to Python on Databricks

Background

Lely is a global leader in dairy-farm automation, best known for its milking robots. Its Horizon platform turns the data those robots and sensors produce into concrete advice for farmers — scoring and ranking individual cows on production, health, reproduction, and robot-milking efficiency. The scoring models were originally implemented in C#/.NET and ran as services on Kubernetes. We were brought in as ML engineers to migrate that modelling system into Python on Databricks so it could scale to whole-herd computation and be iterated on more easily — while guaranteeing the new models produced exactly the same results as the trusted originals.

The challenge

The production models ran as C#/.NET services (Azure Functions on Kubernetes), which were hard to scale for herd-wide computation.
Numerical parity with the existing system had to be preserved exactly — farmers act on this advice, so results could not drift.
Early Python prototypes iterated over pandas DataFrames row by row, which was too slow at farm and herd scale.
The system was a set of interdependent sub-models feeding a composite ranking, each with its own constants and thresholds that had to be carried over exactly.
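
The gap between row-by-row iteration and a vectorized form can be illustrated with a small sketch. The column names (`milk_kg`, `expected_kg`), the `LOW_YIELD_RATIO` constant, and the scoring rule itself are hypothetical, and pandas / NumPy stand in for PySpark here so the example runs without a cluster.

```python
import numpy as np
import pandas as pd

# Hypothetical constant; the real thresholds come from the reference implementation.
LOW_YIELD_RATIO = 0.8

def score_rowwise(df: pd.DataFrame) -> pd.Series:
    """Row-by-row version: one Python-level loop iteration per cow."""
    scores = []
    for _, row in df.iterrows():
        ratio = row["milk_kg"] / row["expected_kg"]
        scores.append(0.0 if ratio < LOW_YIELD_RATIO else min(ratio, 1.0))
    return pd.Series(scores, index=df.index, name="production_score")

def score_vectorized(df: pd.DataFrame) -> pd.Series:
    """Same rule expressed as whole-column operations."""
    ratio = df["milk_kg"] / df["expected_kg"]
    scores = np.where(ratio < LOW_YIELD_RATIO, 0.0, np.minimum(ratio, 1.0))
    return pd.Series(scores, index=df.index, name="production_score")

# Made-up sample data for illustration only.
df = pd.DataFrame({
    "cow_id": [101, 102, 103],
    "milk_kg": [30.0, 18.0, 27.0],
    "expected_kg": [28.0, 30.0, 30.0],
})

rowwise = score_rowwise(df)
vectorized = score_vectorized(df)
```

Both functions apply the same rule, so their outputs can be compared directly. In PySpark the same rule would typically be written with column expressions such as `pyspark.sql.functions.when` and `least`, which lets Spark distribute the work across the herd.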

Our approach

Re-implemented the sub-models (production, health, reproduction, robot efficiency, lactation-curve fitting) and the composite cow-ranking index in Python.
Preserved exact model constants and thresholds from the reference implementation to guarantee parity.
Ported the gradient-descent curve-fitting solver to the scientific-Python stack (NumPy / SciPy).
Re-engineered row-wise pandas logic into vectorized PySpark so the models run efficiently at herd scale on Databricks.
Built a numerical-parity test harness validating Python outputs against the legacy reference simulations before cutover.
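
As an illustration of lactation-curve fitting on the NumPy / SciPy stack, the sketch below fits a Wood curve with `scipy.optimize.curve_fit`. Both choices are assumptions: the curve form used by the Lely models is not stated here, and `curve_fit` is a least-squares solver swapped in for the gradient-descent solver that was actually ported. The starting values, bounds, and synthetic data are invented for the example.

```python
import numpy as np
from scipy.optimize import curve_fit

def wood_curve(t, a, b, c):
    """Wood lactation curve: yield on day t = a * t**b * exp(-c * t)."""
    return a * np.power(t, b) * np.exp(-c * t)

def fit_lactation_curve(days, milk_kg):
    """Fit (a, b, c) by bounded least squares; p0 and bounds are illustrative only."""
    params, _ = curve_fit(
        wood_curve,
        days,
        milk_kg,
        p0=(15.0, 0.1, 0.001),
        bounds=([0.0, 0.0, 0.0], [100.0, 1.0, 0.1]),
    )
    return params

# Synthetic, noise-free yields generated from known parameters.
days = np.arange(1.0, 306.0)
true_params = (20.0, 0.2, 0.004)
milk_kg = wood_curve(days, *true_params)

fitted = fit_lactation_curve(days, milk_kg)
```

Fitting synthetic data with known parameters is a cheap way to check that a ported solver recovers what it should before it is pointed at real herd data.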

My role on the project

Translating the C# model logic into Python, sub-model by sub-model.
Building the parity test harness comparing Python output against the legacy reference.
Vectorizing the computation into PySpark for herd-scale performance.
Packaging the models to run as Databricks jobs producing per-cow advice.
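
The core of a parity harness like the one described above can be sketched in a few lines. This is a minimal illustration, not the project's harness: the function name, the dictionary-of-scores input shape, the cow IDs, and the default tolerance are all hypothetical.

```python
import math
from typing import Dict, List, Tuple

def find_parity_mismatches(
    reference: Dict[str, float],
    candidate: Dict[str, float],
    rel_tol: float = 0.0,
    abs_tol: float = 1e-9,  # hypothetical tolerance; 0.0 demands exact equality
) -> List[Tuple[str, float, float]]:
    """Return (cow_id, reference_value, candidate_value) for every disagreement."""
    mismatches = []
    for cow_id in sorted(reference.keys() | candidate.keys()):
        ref = reference.get(cow_id, math.nan)
        new = candidate.get(cow_id, math.nan)
        # A cow missing on either side becomes NaN, which never compares close,
        # so missing rows are reported as mismatches too.
        if not math.isclose(ref, new, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches.append((cow_id, ref, new))
    return mismatches

# Made-up scores: the legacy reference output versus the Python port.
legacy_scores = {"cow-001": 0.8125, "cow-002": 0.4310, "cow-003": 0.9900}
python_scores = {"cow-001": 0.8125, "cow-002": 0.4312, "cow-003": 0.9900}

mismatches = find_parity_mismatches(legacy_scores, python_scores)
```

Here only `cow-002` is reported, because its two values differ by more than the tolerance. Returning the offending rows rather than a single pass/fail flag makes it easier to trace a drift back to the sub-model that caused it.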

Architecture

Daily summary data (Spark / Delta)
Feature engineering (vectorized PySpark)
Sub-models (production, health, reproduction, efficiency, curve fitting)
Composite ranking (cow index)
Per-cow advice (Databricks jobs)
Parity check (vs. legacy reference)

C#/.NET models re-implemented in PySpark on Databricks, validated against the legacy reference for numerical parity.
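
The step from sub-model scores to a composite cow index and ranking might look like the sketch below. The weighted-sum rule, the weights, the `CowScores` type, and the sample scores are all hypothetical; the real combination rule and constants were carried over from the reference implementation.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical weights; the real constants come from the reference implementation.
WEIGHTS: Dict[str, float] = {
    "production": 0.4,
    "health": 0.3,
    "reproduction": 0.2,
    "efficiency": 0.1,
}

@dataclass(frozen=True)
class CowScores:
    cow_id: str
    production: float
    health: float
    reproduction: float
    efficiency: float

def cow_index(scores: CowScores) -> float:
    """Weighted sum of the sub-model scores (an assumed combination rule)."""
    return sum(weight * getattr(scores, name) for name, weight in WEIGHTS.items())

def rank_herd(herd: List[CowScores]) -> List[str]:
    """Cow IDs ordered from highest to lowest index; ties broken by cow_id."""
    ordered = sorted(herd, key=lambda cow: (-cow_index(cow), cow.cow_id))
    return [cow.cow_id for cow in ordered]

# Made-up sub-model outputs for three cows.
herd = [
    CowScores("cow-001", production=0.9, health=0.5, reproduction=0.7, efficiency=0.6),
    CowScores("cow-002", production=0.6, health=0.9, reproduction=0.8, efficiency=0.9),
    CowScores("cow-003", production=0.4, health=0.4, reproduction=0.5, efficiency=0.3),
]

ranking = rank_herd(herd)
```

Keeping the weights in one named table, separate from the scoring logic, makes it straightforward to diff them against the constants in the reference implementation.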

Outcomes

Cow modelling now runs as scalable Databricks jobs in place of the legacy .NET services.
Validated numerical parity with the reference implementation, so farmers’ advice stayed trustworthy.
A maintainable Python codebase ready for herd-scale execution and future iteration.