
Cluster feeding visits into meals and label each visit
Source: R/meal_label_visits.R
One-stop function to cluster feeding visits into meals and label each visit with its meal information.
This function first calls cluster_meals() to identify meals, then merges the meal information back to the original visit data using merge_cluster_results().
Usage
meal_label_visits(
  data,
  eps = NULL,
  min_pts = 2,
  method = "gmm",
  percentile = 0.93,
  eps_scope = "all_animals",
  lower_bound = 5,
  upper_bound = 60,
  use_log_transform = TRUE,
  log_multiplier = 20,
  log_offset = 1,
  id_col = id_col2(),
  start_col = start_col2(),
  end_col = end_col2(),
  bin_col = bin_col2(),
  intake_col = intake_col2(),
  dur_col = duration_col2(),
  tz = tz2()
)
Arguments
- data
Feeding visit data (dataframe or list of dataframes)
- eps
DBSCAN epsilon parameter (maximum time gap, in minutes, between visits in the same meal). If NULL (default), eps is determined automatically using the approach selected by method.
- min_pts
DBSCAN minimum points parameter (minimum number of visits needed to form a dense cluster). Default is 2. This follows the DBSCAN recommendation of setting min_pts to D + 1, where D is the number of dimensions; the only dimension here is time, so min_pts = 1 + 1 = 2.
- method
Character string specifying the automatic eps determination method when eps=NULL. Options are "gmm" (default), "percentile", or "both".
- percentile
Numeric value between 0 and 1 specifying which percentile to use for automatic eps determination when method="percentile" or "both". Default is 0.93.
- eps_scope
Character string specifying the scope for automatic eps determination when eps=NULL. Options are:
"all_animals" (default): calculate one universal optimal interval (eps) for all animals across all days
"one_animal_all_days": calculate a separate optimal interval (eps) for each animal, and use that same eps across all of the animal's days
"one_animal_single_day": calculate a separate optimal interval (eps) for each animal and for each day within that animal
- lower_bound
Numeric lower bound for the optimal interval (eps). If NULL, no lower bound is applied. Default is 5.
- upper_bound
Numeric upper bound for the optimal interval (eps). If NULL, no upper bound is applied. Default is 60.
- use_log_transform
Logical indicating whether to use log transformation for GMM fitting. Default is TRUE. Log transformation often provides better separation of within-meal and between-meal gaps.
- log_multiplier
Numeric multiplier used in the log transformation. Default is 20.
- log_offset
Numeric offset used in the log transformation. Default is 1.
- id_col
Animal ID column name (default: current global value from id_col2())
- start_col
Start time column name (default: current global value from start_col2())
- end_col
End time column name (default: current global value from end_col2())
- bin_col
Bin ID column name (default: current global value from bin_col2())
- intake_col
Intake column name (default: current global value from intake_col2())
- dur_col
Duration column name (default: current global value from duration_col2())
- tz
Timezone (default: current global value from tz2())
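To make the interplay of percentile, lower_bound, upper_bound, and min_pts more concrete, here is a minimal base R sketch. It is not the package's implementation: the helper names choose_eps() and label_by_gap() are hypothetical, the gaps are made-up numbers, and a "gap" is assumed to be the time in minutes between consecutive visits. With min_pts = 2 in one dimension, a visit with no neighbour within eps is treated as an outlier (label 0), and chains of visits whose gaps stay within eps form one meal.

```r
# Hypothetical helper: take a percentile of the gaps, then clamp it to the bounds.
choose_eps <- function(gaps_min, percentile = 0.93,
                       lower_bound = 5, upper_bound = 60) {
  eps <- unname(stats::quantile(gaps_min, probs = percentile))
  if (!is.null(lower_bound)) eps <- max(eps, lower_bound)
  if (!is.null(upper_bound)) eps <- min(eps, upper_bound)
  eps
}

# Hypothetical helper: gaps_min[i] is the gap between visit i and visit i + 1.
label_by_gap <- function(gaps_min, eps) {
  n <- length(gaps_min) + 1
  # Start a new run of visits whenever a gap exceeds eps
  run <- cumsum(c(TRUE, gaps_min > eps))
  run_size <- ave(rep(1L, n), run, FUN = sum)
  # min_pts = 2: runs holding a single visit are outliers (label 0)
  keep <- run_size >= 2
  meal_id <- integer(n)
  meal_id[keep] <- match(run[keep], unique(run[keep]))
  meal_id
}

gaps <- c(0.4, 0.3, 4, 355, 0.1, 120)  # made-up gaps in minutes, 7 visits
eps <- choose_eps(gaps)                # 93rd percentile is 272.75, clamped to 60
meal_ids <- label_by_gap(gaps, eps)
meal_ids
# 1 1 1 1 2 2 0
```

The clamp shows why the bounds matter: with only a few gaps, the raw percentile is dominated by long between-meal pauses, and upper_bound keeps eps within a plausible within-meal range.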
Value
Same structure as input data (dataframe or list of dataframes) with additional columns:
meal_id: Sequential meal number within animal-day (0 for outliers)
meal_start: Start time of the meal this visit belongs to (NA for outliers)
meal_end: End time of the meal this visit belongs to (NA for outliers)
meal_duration: Total duration of the meal this visit belongs to (NA for outliers)
total_intake: Total intake of the meal this visit belongs to (NA for outliers)
visit_count: Number of visits in the meal this visit belongs to (NA for outliers)
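As a worked check of how these columns relate to the visit rows, the sketch below recomputes the meal-level values for the first meal in the example output on this page (the two visits starting at 00:03:24 and 00:04:24). It assumes, based on that output rather than on any statement in this documentation, that meal_duration is measured in seconds from the first visit's start to the last visit's end, and that total_intake is the plain sum of visit intakes.

```r
# The two visits of meal 1, copied from the example output on this page
visits <- data.frame(
  start = as.POSIXct(c("2020-10-31 00:03:24", "2020-10-31 00:04:24"),
                     tz = "America/Vancouver"),
  end = as.POSIXct(c("2020-10-31 00:04:01", "2020-10-31 00:12:57"),
                   tz = "America/Vancouver"),
  intake = c(-0.7, 1.8),
  meal_id = c(1L, 1L)
)

meal_start <- min(visits$start)   # 2020-10-31 00:03:24
meal_end <- max(visits$end)       # 2020-10-31 00:12:57
# Assumption: duration in seconds, first start to last end
meal_duration <- as.numeric(difftime(meal_end, meal_start, units = "secs"))
total_intake <- sum(visits$intake)
visit_count <- nrow(visits)

c(meal_duration = meal_duration, total_intake = total_intake,
  visit_count = visit_count)
```

These values agree with the meal_duration (573), total_intake (1.1), and visit_count (2) shown for meal 1 in the example output.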
Details
This function is a convenience wrapper for cluster_meals() and merge_cluster_results().
It clusters feeding visits into meals and then labels each visit with its meal assignment and summary statistics.
Examples
# Create a toy dataset
toy_data <- all_fed[[1]][which(all_fed[[1]]$cow == 5114),]
# Cluster and label meals
labeled <- meal_label_visits(toy_data, id_col = 'cow', start_col = 'start',
                             end_col = 'end', bin_col = 'bin', intake_col = 'intake',
                             dur_col = 'duration', tz = 'America/Vancouver')
head(labeled)
#> transponder cow bin start end duration
#> 31 12200060 5114 13 2020-10-31 00:03:24 2020-10-31 00:04:01 37
#> 35 12200060 5114 17 2020-10-31 00:04:24 2020-10-31 00:12:57 513
#> 36 12200060 5114 15 2020-10-31 00:13:13 2020-10-31 00:13:19 6
#> 69 12200060 5114 4 2020-10-31 00:17:13 2020-10-31 00:34:51 1058
#> 454 12200060 5114 25 2020-10-31 06:30:13 2020-10-31 06:32:25 132
#> 545 12200060 5114 26 2020-10-31 06:32:29 2020-10-31 06:51:43 1154
#> start_weight end_weight intake date meal_id meal_start
#> 31 18.7 19.4 -0.7 2020-10-31 1 2020-10-31 00:03:24
#> 35 14.1 12.3 1.8 2020-10-31 1 2020-10-31 00:03:24
#> 36 23.0 23.0 0.0 2020-10-31 2 2020-10-31 00:13:13
#> 69 11.6 7.7 3.9 2020-10-31 2 2020-10-31 00:13:13
#> 454 46.5 45.7 0.8 2020-10-31 3 2020-10-31 06:30:13
#> 545 89.8 83.1 6.7 2020-10-31 3 2020-10-31 06:30:13
#> meal_end meal_duration total_intake visit_count
#> 31 2020-10-31 00:12:57 573 1.1 2
#> 35 2020-10-31 00:12:57 573 1.1 2
#> 36 2020-10-31 00:34:51 1298 3.9 2
#> 69 2020-10-31 00:34:51 1298 3.9 2
#> 454 2020-10-31 06:51:43 1290 7.5 2
#> 545 2020-10-31 06:51:43 1290 7.5 2
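Because every visit in a meal carries the same meal-level values, a common follow-up is to drop outlier visits and collapse the result to one row per meal. The sketch below does this in base R on a small hand-built data frame, labeled_demo, that only mimics a few columns of the labeled output; the final row with meal_id 0 is invented to illustrate an outlier. Since meal_id is sequential within animal-day, the animal and date columns are kept as part of the meal key.

```r
# Hand-built stand-in for a labeled result (not real package output)
labeled_demo <- data.frame(
  cow = rep(5114, 5),
  date = as.Date(rep("2020-10-31", 5)),
  meal_id = c(1L, 1L, 2L, 2L, 0L),
  total_intake = c(1.1, 1.1, 3.9, 3.9, NA),
  visit_count = c(2L, 2L, 2L, 2L, NA)
)

# Drop outlier visits (meal_id 0), then keep one row per animal-day-meal
in_meal <- labeled_demo[labeled_demo$meal_id != 0, ]
meals <- unique(in_meal[, c("cow", "date", "meal_id",
                            "total_intake", "visit_count")])
nrow(meals)
# 2
```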