Preprint
Article

Individual Piglet Detection in Weaning Stage using Deep Learning Techniques

Submitted: 08 May 2024; Posted: 09 May 2024
Abstract
Pig farm management involves multiple factors, such as environmental conditions, animal behavior, and weight gain. In all these scenarios, it is fundamental to detect each individual and follow its development as it grows toward the shipment phase. Therefore, the objective of this study was to apply an automatic individual pig detection algorithm based on Deep Learning techniques to frames taken from videos recorded during the weaning phase, under the real production conditions of small and medium producers in tropical Colombia. The weaning phase was selected because individual detection in each cage is more difficult due to the environmental adaptations that piglets undergo after nursing: piglets tend to cluster together during this phase, making image processing more challenging than in the fattening phase, where animals are more isolated. The You Only Look Once (YOLOv7) algorithm and the VGG Image Annotator were used to detect individuals in two pens separated by sex. All animal postures and lighting conditions were considered. The Precision, Recall, and mAP metrics were used to assess model performance at different confidence values. A precision of 98.3% was reached at a threshold of 0.9, and a recall of 98.5% at a threshold of 0.85. Data augmentation techniques were important to enlarge the dataset and obtain a more realistic model for individual detection. The method allowed us to identify all complete pig bodies under different posture and lighting situations.
Keywords: 
Subject: Biology and Life Sciences - Animal Science, Veterinary Science and Zoology

1. Introduction

Object detection is the localization of objects in an image or video by automated methods [1] and is a crucial mid-level task in computer vision that facilitates high-level tasks such as action recognition and behavioral analysis [2]. There are several methods for object detection, but the most effective algorithms involve deep learning [3,4]. Deep Learning is a subfield of machine learning that extracts features automatically, using artificial neural networks (ANNs) as the backbone of its algorithms.
By contrast, traditional machine learning requires human intervention to extract image features [5] for training models; it can work well on a wide range of important problems but fails at tasks such as recognizing speech or objects [6]. Deep Learning was therefore motivated by these disadvantages and by the insufficiency of traditional methods for learning in high-dimensional spaces. In Deep Learning, object detection is characterized by two types of detectors: one-stage detectors such as YOLO [7] and two-stage detectors such as Faster R-CNN [8].
Depending on the parameters to be evaluated and the cameras used, the methods for extracting physical characteristics through image analysis can vary. For instance, [9] used a manual selection criterion: out of 50 images taken of each animal, the single best image was selected, in which the entire body of the pig appeared in an eating position, to ensure less movement and the most uniformity. However, this method is extremely vulnerable to the influence of the posture of live pigs, because it requires higher accuracy for the captured images and the pigs need to be in a fixed position [10].
The performance of machine learning models improves in the presence of more data. To enlarge the data bank, geometric transformations are applied to the images to increase their quantity, a process known as "data augmentation" [11]. According to [12,13], both the collection of individual data and the use of geometric transformations such as random cropping, small rotations, image flipping, or RGB jittering to generate more examples of the class are a valuable source of information for decision-making, yielding an improvement of up to 3% in model performance. Object recognition is therefore well suited to dataset augmentation, because the transformations do not change the class but increase the number of images through many geometric operations [6].
On the other hand, there are methods such as that proposed by [14], where a sequence was implemented to extract individual features of pigs in cages holding groups of seven individuals, from videos recorded by 2D cameras. However, this analysis represents the detection of each individual at a distance from the others. Although the information present in a video is more extensive than in a series of images, video is challenging to process in terms of storing the information and analyzing the frames that carry the most relevant information about the spacing of the pigs in the cages. To detect the position of the pigs, there are methods [15,16,17] that identify the body dimensions of the animals through approaches such as fitting geometric figures or classifiers such as the Support Vector Machine (SVM). The SVM classifier has the advantage of requiring less data to train the model than other neural networks [18]. However, classifiers such as the SVM have not yet been implemented in animal weight prediction, so there is a need to develop a neural network that links pig size variables with the corresponding weights.
[2] proposed a multiple pig detection and tracking method based on 2D video cameras. They coupled a CNN-based detector and a correlation filter-based tracker via a novel hierarchical data association algorithm, tackling image processing problems such as lighting, occlusion, shadows, and shape deformation.
Image segmentation is one of the most important initial phases of machine learning pipelines for artificial vision. However, this stage is sensitive to variables such as lighting and to how well controlled the environment is when differentiating objects from the background [19]. Hence, [20] found good performance of their algorithms only for white animals on a dark floor, which is not suitable for implementation in commercial farms with different backgrounds, illumination conditions, and animal coat colors.
On the other hand, authors such as [1] worked with videos to detect cow structure via the spatial positions of the joints and body locations (key points). These body locations and joints can be used to analyze body structure, estimate body dimensions and weight, and describe the spatial location of the cow, the body contour, the positions of the major joints, and the trajectories of their movement. However, this method needed several cameras from different views to detect the animal's shape through joint points, and it is evident that posture is a key factor in the precision of the variables mentioned above.
In this way, the complexity and effectiveness of models are generally influenced by choices made in image collection and segmentation [21], which can be used to detect potential production concerns and implement management strategies [22].
To address these drawbacks, the objective of this study was to apply an automatic individual pig detection algorithm using Deep Learning techniques (the You Only Look Once, YOLOv7, algorithm), considering the real production conditions of small and medium producers in Colombia and analyzing piglets from the weaning phase. Several studies have been conducted on individual pig detection; however, this study sets a benchmark for identifying animals with Deep Learning techniques in the early stages of growth under real production conditions. It is worth noting that the video-based detection process enables daily monitoring of animal activity and the traceability of animals that are not developing optimally.
Video processing in the weaning phase is more challenging because the animals are in transition from the nursing phase, with a liquid diet fed by the sow, to a solid diet, while also experiencing new environmental conditions. For this reason, the identification of each animal in two cages during these adaptations, without controlling the recording scenarios, such as posture, number of animals, or lighting conditions, was the main purpose of this research.
On the other hand, it is crucial to achieve high precision in animal detection, since this process enables other applications, such as weight prediction, identification of behavioral patterns, and decision-making for facility adaptations to improve efficiency and productivity.

2. Materials and Methods

2.1. Location and Swine Production Features

The research was developed at the farm of the National University of Colombia, Department of Agricultural Engineering - Animal Science, Medellin headquarters, at 2154 m above sea level. The farm is located in the state of Antioquia, in a small city named Rionegro (Figure 1A,B), 35 km from Medellin, at coordinates 6º 07' 56'' N, 75º 27' 17'' W, as shown in Figure 1C. The weather conditions at the site were obtained from a nearby weather station: the average temperature was 16.2 ºC, with a relative humidity of 82% and an average rainfall of 2645 mm.
The facility where the experiment was conducted is typical of small and medium production facilities in Colombia. It has no artificial lighting at night and no automatic air conditioning or ventilation system. The piggery has odor-control windows that can be opened at the worker's discretion. Two pens were used; each pen was 2.41 m × 2.41 m and contained a 2 m continuous feeder and a nipple drinker. The pens were spaced 1.5 m apart, and a scale was installed between them to collect individual weight data three times per week. The pens were also elevated 0.5 m above the floor. Figure 2 shows the general features of the facility and the experimental setup.
The building conditions were as low-tech as imaginable, with an adequate shed for animal excrement, large windows for odor management, and non-automatic windows opened at the discretion of the worker. At night, there was an infrared lamp to warm the animals and no artificial lighting (Figure 2D). These light variations made it somewhat difficult to continuously analyze the images extracted from the night videos. The temperature inside the facility was controlled manually by opening the fixed window shades, and the building relies on natural ventilation. The facility is oriented in a west-east direction, as shown in Figure 2.

2.2. Experimental Animals.

Weaned pigs of the PIC breed line were used seven days after weaning, i.e., 30 days after birth. The average body weight of the pigs after weaning was 8 kg. At the beginning of each cycle, the weaned pigs were grouped into 10 females and 10 males, each group in its own pen. The environmental conditions were similar for both cages. The study was conducted in a conventional Colombian housing system for piglet rearing from about 8 kg to 40 kg. Throughout the experiment, the animals were kept in the pre-fattening stage, and at the end of each data collection cycle they were moved to the fattening stage. The cages were placed 0.5 m above the floor, as shown in Figure 3. In each pen, pigs were marked for identification in the video recordings.

2.3. Equipment and Data Collection

A low-cost system was implemented for real-time monitoring of pigs in the pre-fattening stage. An open Internet of Things (IoT) tool using ESP32 microcontrollers and a Raspberry Pi was installed to collect environmental data (air temperature, relative humidity, pressure, radiation, and wind). The videos were recorded with cameras installed at a height of 1.5 m above the slatted floor of the pig pen. The two cameras used were ArduCam OV5647 NoIR units with a resolution of 1920 × 1080 pixels at 30 fps. Three modules were installed: one for internal variables (inside the facility), one for external variables (outside the facility), and a camera module. This study focuses on the use of artificial vision on data collected by the same authors, as described in Section 3.4 (Montoya et al., 2023). At the end of each cycle, the cameras were serviced, because the pigs' feed produced a lot of dust that affected the quality of the video obtained in the next cycle. Videos of each cage were recorded from 6 am to 6 pm, with a duration of 1 minute per video; that is, 60 one-minute videos were obtained every hour.
Four sets of trials were performed. Each trial consisted of 10 male and 10 female pigs and lasted 6 weeks, until the animals reached an average weight of 32 kg. Pigs were also manually weighed three times per week using handheld scales. Each pen had a data collection system based on ESP32 microcontrollers and a Raspberry Pi. The ESP32 was programmed to collect microenvironmental data, and the Raspberry Pi stored the videos, received the information from the ESP32, and transmitted it wirelessly to a router. The information was accessed in real time, via Internet and Bluetooth connections, from a booth 50 m away from the production site. An ESP32-based environmental data collection module was installed on the outside of the building, as shown in Figure 4C.
Figure 5 shows the general configuration of the experiment for one cage and one data acquisition cycle where the microprocessors and camera were installed. The installation for the two cages (males and females) of each cycle was completely identical.

2.4. Automatic Pig Detection Approaches

In this study, the videos and randomly sampled images were acquired for labeling using the VGG Image Annotator (VIA) [23]. To ensure a diverse representation of different conditions and to improve the model's detection power, images showing very similar scenes (e.g., those sampled in close temporal proximity within a video) were excluded; consequently, the images most representative of different conditions and scenarios were retained. The first step was the selection of the videos that corresponded to the days and hours of the manual weighing of the animals. Pigs were weighed on a manual scale (0-1000 kg). A random sample of only 15 frames per video was taken from approximately 20 videos for males and 20 videos for females, over a total of 4 cycles during the experiment. A total of 600 frames (300 for males and 300 for females) were labeled in this manner (Figure 6).
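As an illustration of this frame-sampling step, the sketch below draws 15 random frames from a 1-minute video with OpenCV. It is a minimal example, not the authors' exact pipeline; the file name and random seed are hypothetical.

import cv2
import random

def sample_random_frames(video_path, n_frames=15, seed=42):
    """Randomly sample n_frames from a video and return them as images."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    random.seed(seed)
    indices = sorted(random.sample(range(total), min(n_frames, total)))
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)  # jump to the chosen frame index
        ok, frame = cap.read()
        if ok:
            frames.append(frame)
    cap.release()
    return frames

# Hypothetical file name for one 1-minute recording of the male pen
for i, frame in enumerate(sample_random_frames("cycle1_males_0800.mp4")):
    cv2.imwrite(f"frame_{i:02d}.jpg", frame)  # saved frames are then labeled in VIA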
Subsequently, data augmentation techniques were applied to the labeled data before feeding it into the model. This involved randomly applying flips, rotations, crops, shears, exposure adjustments, and blurs to generate augmented images and increase the number of training examples. In this way, overfitting is avoided [24]: a model trained only on clean images may fail to detect objects in a degraded image, whereas a model trained on modified images learns to detect the objects under any condition. The model must be general, and the synthesized data are produced from real data. As a result, 1260 labeled samples were obtained for training and validation, and 180 samples were set aside for testing the model.
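As a hedged sketch of such an augmentation step, the snippet below composes flips, small rotations, box-safe crops, shears, exposure adjustments, and blurs with the albumentations library; the specific operations, magnitudes, and probabilities are illustrative assumptions, not the exact configuration used in this study.

import numpy as np
import albumentations as A

augment = A.Compose(
    [
        A.HorizontalFlip(p=0.5),                                   # random flips
        A.Rotate(limit=15, p=0.5),                                 # small rotations
        A.RandomSizedBBoxSafeCrop(height=640, width=640, p=0.3),   # crops that keep all boxes
        A.Affine(shear=(-10, 10), p=0.3),                          # shears
        A.RandomBrightnessContrast(p=0.5),                         # exposure adjustments
        A.Blur(blur_limit=3, p=0.2),                               # blurs
    ],
    bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
)

image = np.zeros((1080, 1920, 3), dtype=np.uint8)  # placeholder 1920x1080 frame
bboxes = [(0.5, 0.5, 0.2, 0.3)]                    # one "pig" box in YOLO format
out = augment(image=image, bboxes=bboxes, class_labels=["pig"] * len(bboxes))
aug_image, aug_bboxes = out["image"], out["bboxes"]  # augmented frame and transformed boxes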
The percentage of images used for training, validation, and testing depends on the task at hand. In this research, the purpose was only to identify pigs under different conditions, times, and lighting, so detecting each individual did not require an especially demanding split. Therefore, the image sample does not need to be as rigorous when training, validating the model while adjusting the hyperparameters, and finally testing the accuracy of the model with respect to the initial validation stage. There is no universally recommended split ratio for fitting a model; all criteria depend on the task to be done, so in this case, testing with 14% of the total data was sufficient without any overfitting, as shown in the sketch below. Each animal is contained within a polygon, and the polygon coordinates were recorded and used, together with the images, to train the convolutional neural network model for pig detection, as shown in Figure 7.
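A minimal sketch of such a hold-out split, assuming a pool of labeled and augmented images and the 180-image test set reported above; the pool size and file names are hypothetical.

import random

all_samples = [f"frame_{i:04d}.jpg" for i in range(1440)]  # labeled + augmented pool (assumed size)
random.seed(0)
random.shuffle(all_samples)

test_set = all_samples[:180]   # 180 images held out for the final test (~14%)
train_set = all_samples[180:]  # remaining 1260 images for training/validation
print(len(train_set), len(test_set))  # 1260 180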
The Deep Learning model used to automatically detect pigs was a convolutional neural network based on the YOLOv7 algorithm [25]. This model divides the image into smaller sections called cells. Each cell is responsible for searching for objects using predefined boxes and confidence values. To avoid duplicate detections, an algorithm called Non-Maximum Suppression (NMS) is applied: NMS compares the detected objects and keeps only the most accurate and confident ones, removing all duplicates. This approach enables real-time pig detection, providing accurate bounding box coordinates and confidence scores in different environments, and accurate segmentation of pigs in livestock settings.
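For illustration, the sketch below implements the NMS step in its standard greedy form with NumPy. YOLOv7 applies an equivalent suppression internally, so this is explanatory rather than the authors' code, and the 0.45 IoU threshold is an assumed default.

import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_thresh=0.45):
    """Greedy NMS: keep the highest-scoring box, drop overlapping duplicates."""
    boxes, scores = np.asarray(boxes, float), np.asarray(scores, float)
    order = np.argsort(scores)[::-1]  # indices sorted by descending confidence
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        if order.size == 1:
            break
        rest = order[1:]
        order = rest[iou(boxes[best], boxes[rest]) < iou_thresh]  # drop duplicates
    return keep  # indices of the detections that survive suppression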
The model was trained for 100 epochs, where each epoch represents a complete pass through the entire training dataset. A batch size of 60 was used during training; the batch size is the number of samples processed together in each iteration. Batching allows efficient use of computational resources and facilitates the optimization process by updating the model parameters based on the error computed over multiple samples at once.
To assess model performance, the effectiveness on the validation data was evaluated at different detection thresholds (0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.85, 0.9), using three commonly used metrics: precision, recall, and mean Average Precision (mAP). Precision measures the accuracy of positive classifications by calculating the proportion of correctly classified positive examples out of all examples classified as positive. Recall measures the ability of the model to identify true positive (TP) examples by calculating the proportion of correctly classified positive examples among all ground-truth positive examples. Finally, mAP is a metric primarily used in object detection that averages the precision at different recall levels, providing a more comprehensive evaluation of the model's performance. All deep learning procedures were performed using Python 3.7 on a Linux (18.04 LTS) machine with an Intel i7-9750H CPU (2.60 GHz × 12), 20 GB RAM, and an NVIDIA® GeForce® GTX 1660 Ti Max-Q GPU (6 GB).
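The threshold sweep can be illustrated with the following sketch, where detections below each confidence value are discarded before computing precision and recall. The toy detections are invented, and it is assumed that matching each detection to ground truth (e.g., at IoU >= 0.5) has already been done elsewhere.

thresholds = [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.85, 0.9]

def precision_recall(detections, n_ground_truth, conf_thresh):
    """detections: list of (confidence, is_true_positive) pairs."""
    kept = [d for d in detections if d[0] >= conf_thresh]  # apply the threshold
    tp = sum(1 for _, is_tp in kept if is_tp)
    fp = len(kept) - tp
    fn = n_ground_truth - tp
    precision = tp / (tp + fp) if kept else 0.0
    recall = tp / (tp + fn) if n_ground_truth else 0.0
    return precision, recall

# Toy detections: (confidence, matched a ground-truth pig at IoU >= 0.5?)
detections = [(0.97, True), (0.91, True), (0.88, True), (0.62, False)]
for t in thresholds:
    p, r = precision_recall(detections, n_ground_truth=3, conf_thresh=t)
    print(f"conf >= {t:.2f}: precision = {p:.2f}, recall = {r:.2f}")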

2.5. Evaluation Metrics

To evaluate the performance of the model developed in this study, three indicators common in detection and segmentation models were used: Precision, Recall, and mean Average Precision (mAP). To define these metrics, it is necessary to introduce the four states that can occur after predicting a test sample: True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN). A True Positive is a detected pig that is truly a pig; a False Positive occurs when the model predicts an object as a pig that is not a true pig (an incorrect detection); a False Negative occurs when the model does not detect a true pig (a missed detection); and a True Negative corresponds to correctly treating everything that is not a pig as background. In this sense, True Negatives are not applicable to object detection and segmentation, because classifying background as background is equivalent to not detecting anything.
Precision measures the proportion of selected items that the system selected correctly [26], as shown in Equation (1): all True Positives out of the total positive detections. In other words, precision measures the quality of the model's detections. Recall measures the model's ability to detect true positives (pigs detected correctly) out of all ground-truth instances; it is defined as the proportion of the target items selected by the system [26], as shown in Equation (2). Precision and Recall both range from 0 to 1.
$$ \text{Precision} = \frac{TP}{TP + FP} \tag{1} $$

$$ \text{Recall} = \frac{TP}{TP + FN} \tag{2} $$
Average Precision (AP) and mean Average Precision (mAP), which account for both false positives and false negatives, are used to evaluate the robustness of object detection models, as shown in Equations (3) and (4):

$$ AP = \int_{0}^{1} \text{precision}(\text{recall}) \, d(\text{recall}) \tag{3} $$

$$ mAP = \frac{1}{n} \sum_{i=1}^{n} AP_i \tag{4} $$

where $n$ is the number of classes and $AP_i$ is the average precision for class $i$. In this research, only one class, labeled "pig", was evaluated, as shown in Figure 6.
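As a worked illustration of Equations (3) and (4), the sketch below integrates a toy precision-recall curve numerically to obtain AP and averages over the (single) class to obtain mAP; the curve values are invented for demonstration only.

import numpy as np

def average_precision(recalls, precisions):
    """Area under the precision-recall curve (trapezoidal integration).
    Inputs must be sorted by increasing recall."""
    return float(np.trapz(precisions, recalls))

# Invented PR curve points for the single "pig" class (demonstration only)
recalls = np.array([0.0, 0.5, 0.9, 1.0])
precisions = np.array([1.0, 0.99, 0.97, 0.90])

ap_pig = average_precision(recalls, precisions)
mean_ap = np.mean([ap_pig])  # mAP: mean over the n = 1 class
print(f"AP(pig) = {ap_pig:.3f}, mAP = {mean_ap:.3f}")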

3. Results

3.1. Testing Results

The evaluation metrics (Precision, Recall, and mean Average Precision) of individual detection remained consistently high throughout the model training, as shown in Figure 8A.
The Deep Learning model showed high performance in individual pig detection. Furthermore, when tested on the validation dataset at various detection thresholds, the model continued to demonstrate remarkable performance (Figure 8B). Despite crossings and overlaps between pigs, the animals were reliably detected with high confidence (values between 0.9 and 1.0), as shown in Figure 8C. In terms of accuracy, the model predicted every pig after 30 epochs, which supports the adequacy of the 180 testing samples mentioned above.
These results highlight the robustness and effectiveness of the developed model in accurately detecting pigs in challenging scenarios. The high precision indicates that most positive classifications made by the model were correct, minimizing false-positive detections. High recall indicates that the model successfully identified a large proportion of truly positive examples, ensuring minimal false-negative errors. The sustained high mAP values further validate the overall performance of the model, considering the trade-off between precision and recall at different recall levels.
Such reliable pig detection capabilities hold promise for various livestock management applications, enabling precise monitoring to optimize farming processes and ensure animal welfare.
Figure 9 shows the precision-recall (PR) curve, i.e., the trade-off between precision and recall over several thresholds; overall performance is better the more convex the curve becomes. The area under the curve is the mAP, which reached 98.8%, denoting a low false positive rate (a high precision of 97.3%) and a low false negative rate (a high recall of 98.4%) at a confidence of 0.5.
The precision-recall curve (Figure 9) combines the trade-off between the two metrics mentioned above and maximizes their joint effect on the model's performance, giving a better picture of the overall accuracy obtained.

4. Discussion

The proposed methodology tested the ability of a neural network based on the YOLOv7 algorithm to identify each individual in a group housing system of pigs in the weaning phase, housed under the real conditions of a pork production facility. The method reached a precision of 98.3% at a detection threshold of 0.9 and a recall of up to 98.5% at a detection threshold of 0.85, as shown in Figure 8B. This precision is higher than that of [2], who achieved a precision of 94.72% and a recall of 94.74% with the Single Shot MultiBox Detector (SSD) while detecting and tracking multiple pigs under different image conditions, such as shape deformations and light fluctuations. Those authors used artificial light between 7:30 and 16:00, combined with natural light from two small windows; the present method, by contrast, obtained higher precision recording videos under natural light only.
Although this method was more accurate, its detection speed was lower than that of [2], who implemented three CNN detection architectures, the fastest being SSD at 42 milliseconds per frame. In this study a speed of 48.3 milliseconds per frame was achieved. Detection speed can influence the quality of image processing and, even though the architectures compared are different, it is an important indicator for evaluating model performance.
Authors such as [27] obtained a higher recall rate than this research using a sliding-window method for pig detection: they reached a recall of up to 99.21% using BING+PCANet, Faster R-CNN, and YOLO. However, their method has the disadvantage of low computational efficiency, which depends on manually designed window sizes and ratios [28], and the recall mentioned was reached relative to the first YOLO version. By contrast, our study adopted the YOLOv7 algorithm, which uses around 36.9 million parameters, about 43% less computation than the previous versions mentioned, and is faster than most state-of-the-art detection networks, achieving real-time processing without compromising accuracy [25,29].
Another approach is the use of different camera perspectives, as proposed by [30], who used the Faster Region-Based Convolutional Neural Network (Faster R-CNN) for object detection in video images from pig fattening and pig rearing at different camera positions. They trained two models (pig position and posture, and pig position only). For the pig-position model during rearing, they obtained the highest Average Precision (AP), 76.8%, when the camera was located at the top of the pen, compared to front and back cameras. Under the same setup conditions (pig rearing phase with a top camera), the present study obtained a higher Average Precision, 97.5% at a confidence of 0.85, albeit with a different algorithm.
Another reason for conducting this study with pigs in the weaning phase is that individuals are harder to detect in this phase because of their behavior. As shown by [30], the AP was lower in the weaning phase than in the fattening phase due to the limitation of tightly grouped piglets. In the fattening phase, each animal is more isolated from the others, which allows better recognition of each individual. In the weaning phase, the animals are still adapting to the internal environment, so it is more common to find them in agglomerations inside the cages, which leads to tracking errors and segmentation problems. To tackle this adaptation problem, an infrared lamp was used to maintain the internal temperature of the piglets' pens during the first week of each cycle.
According to Figure 8B and Figure 9, when Recall increases, the number of False Negatives decreases, i.e., fewer pigs are missed by the model; when Precision increases, the number of False Positives decreases, i.e., fewer non-pig objects are detected as pigs. For the Precision score at a confidence of 0.9, 98.3% of the detected items were really pigs and 1.7% were false positives. For Recall at the same confidence, 93.9% of the elements were correctly detected and 6.1% were false negatives. In this sense, the particular purpose of this study is to prioritize precision, with a focus on minimizing false positives, i.e., to ensure that detections are true pigs even at the cost of missing one of them; recall, in contrast, focuses on ensuring that potential pigs are not missed.
This study obtained a precision similar to that of [31], who used Faster R-CNN; even though those authors correctly identified 95% of individual pigs, that network is still far from being a real-time system [2], unlike the one proposed in this research.
The performance results confirm the importance of using a database with the greatest possible variability in terms of geometric transformations of the images, as mentioned by [11,12,13]. In this way, the model used in this research became more realistic: when images with different lighting settings, perspective changes, or even different angles are introduced, the model is able to analyze such variations and accurately detect each of the animals in its cage.
Finally, accurately detecting individuals in a group plays a critical role in subsequent real-time processes such as behavior analysis, feeding, and weight prediction. Ensuring high precision at this stage can therefore guarantee, as a preliminary analysis, high precision for other variables to be assessed in future research.

5. Conclusions

A recognition method for individual piglets in two cages at the weaning stage, using the YOLOv7 algorithm, was proposed in this study. The method was able to analyze images with different geometric transformations, using synthetic images produced from real ones (data augmentation), to identify slow-weaning pigs and to support appropriate decisions in the facility. This study allowed an accurate detection of 98.3% of pigs housed in groups in the weaning phase. The weaning phase is a stage in which pigs are constantly adapting to different thermal conditions and to a change from a liquid to a solid diet, which is why detecting each individual pig in a cage is more challenging: the animals stay close together to maintain their ideal body temperature. This is why weaning is considered a critical phase of the whole pork production chain, since the success of fattening-pig weights depends largely on the management given during weaning.
What has been developed here can be used as a basis for future behavioral analysis, animal posture detection, comfort analysis, and animal growth monitoring, or even classification by sex in large groups and identification of animals at different stages of growth. The model obtained high accuracy in the detection of animals under the real conditions of medium and small pork production in Colombia, in one of the departments with the highest pork production in the country.
This study provides a foundation for pig production tracking and can be used in future research to enhance specific tasks in production systems, such as behavior tracking and monitoring of environmental conditions.
Because the experiment operates as a real-time application, decision-making in pig production, or even in other animal production systems, will be supported in terms of time optimization and of making appropriate decisions according to current needs.

Author Contributions

Conceptualization, E.F.L.C., I.F.F.T., R.O.H., and J.A.O.S.; investigation, E.F.L.C., I.F.F.T., R.O.H., J.A.O.S.; methodology, E.F.L.C., R.O.H., J.A.O.S., R.C.B.; writing—original draft preparation, E.F.L.C.; writing—review and editing, E.F.L.C., I.F.F.T., R.O.H., J.A.O.S., V.G.C.; document review, I.F.F.T., R.R.A, G.A.; supervision, I.F.F.T., R.O.H.; project administration, I.F.F.T., V.G.C., A.P.M.R.; funding acquisition, I.F.F.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by CAPES (88887.683151/2022-00), Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brazil.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to thank the Federal University of Viçosa (UFV), the National University of Colombia, Bogotá and Medellin headquarters (UNAL), and Veronica Gonzalez Cadavid for providing part of her experimental data from Colombia. This work was conducted with the support of the Coordination of Superior Level Staff Improvement, Brazil (CAPES).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. H. Liu, A. R. Reibman, and J. P. Boerman, “Video analytic system for detecting cow structure,” Comput Electron Agric, vol. 178, Nov. 2020. [CrossRef]
  2. L. Zhang, H. Gray, X. Ye, L. Collins, and N. Allinson, “Automatic Individual Pig Detection and Tracking in Pig Farms,” Sensors, 2019. [CrossRef]
  3. J. Huang et al., “Speed/Accuracy Trade-Offs for Modern Convolutional Object Detectors,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, Jul. 2017, pp. 3296–3297. [CrossRef]
  4. W. Shen et al., “Automatic recognition method of cow ruminating behaviour based on edge computing,” Comput Electron Agric, vol. 191, Dec. 2021. [CrossRef]
  5. Z. Yu et al., “Automatic Detection Method of Dairy Cow Feeding Behaviour Based on YOLO Improved Model and Edge Computing,” Sensors, vol. 22, no. 9, p. 3271, Apr. 2022. [CrossRef]
  6. I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, Illustrated. The MIT Press, 2016.
  7. J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You Only Look Once: Unified, Real-Time Object Detection,” 2016. [Online]. Available: http://pjreddie.com/yolo/.
  8. S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,” Jun. 2015.
  9. J. Kongsro, “Estimation of pig weight using a Microsoft Kinect prototype imaging system,” Comput Electron Agric, vol. 109, pp. 32–35, Nov. 2014. [CrossRef]
  10. H. Chen, Y. Liang, H. Huang, Q. Huang, W. Gu, and H. Liang, “Live Pig-Weight Learning and Prediction Method Based on a Multilayer RBF Network,” Agriculture, vol. 13, no. 2, p. 253, Jan. 2023. [CrossRef]
  11. S. Srinivas, R. K. Sarvadevabhatla, K. R. Mopuri, N. Prabhu, S. S. S. Kruthiventi, and R. V. Babu, “A taxonomy of deep convolutional neural nets for computer vision,” Frontiers Robotics AI, vol. 2, no. JAN, Jan. 2016. [CrossRef]
  12. A. G. Biase, T. Z. Albertini, and R. F. de Mello, “On supervised learning to model and predict cattle weight in precision livestock breeding,” Comput Electron Agric, vol. 195, Apr. 2022. [CrossRef]
  13. K. Chatfield, K. Simonyan, A. Vedaldi, and A. Zisserman, “Return of the Devil in the Details: Delving Deep into Convolutional Nets,” May 2014, [Online]. Available: http://arxiv.org/abs/1405.3531.
  14. W. Zhu, Y. Guo, P. Jiao, C. Ma, and C. Chen, “Recognition and drinking behaviour analysis of individual pigs based on machine vision,” Livest Sci, vol. 205, pp. 129–136, Nov. 2017. [CrossRef]
  15. A. Cominotte et al., “Automated computer vision system to predict body weight and average daily gain in beef cattle during growing and finishing phases,” Livest Sci, vol. 232, no. August 2019, p. 103904, 2020. [CrossRef]
  16. B. Li, L. Liu, M. Shen, Y. Sun, and M. Lu, “Group-housed pig detection in video surveillance of overhead views using multi-feature template matching,” Biosyst Eng, vol. 181, pp. 28–39, May 2019. [CrossRef]
  17. S. Tu, W. Yuan, Y. Liang, F. Wang, and H. Wan, “Automatic detection and segmentation for group-housed pigs based on pigms r-cnn†,” Sensors, vol. 21, no. 9, May 2021. [CrossRef]
  18. A. Nasirahmadi et al., “Automatic scoring of lateral and sternal lying posture in grouped pigs using image processing and Support Vector Machine,” Comput Electron Agric, vol. 156, pp. 475–481, Jan. 2019. [CrossRef]
  19. S. Bhoj, A. Tarafdar, A. Chauhan, M. Singh, and G. K. Gaur, “Image processing strategies for pig liveweight measurement: Updates and challenges,” Comput Electron Agric, vol. 193, no. September 2021, p. 106693, 2022. [CrossRef]
  20. M. Kashiha et al., “Automatic weight estimation of individual pigs using image analysis,” Comput Electron Agric, 2014.
  21. E. van der Stuyft, C. P. Schofield, J. M. Randall, P. Wambacq, and V. Goedseels, “Development and application of computer vision systems for use in livestock production,” 1991.
  22. S. M. Leonard, H. Xin, T. M. Brown-Brandl, and B. C. Ramirez, “Development and application of an image acquisition system for characterizing sow behaviors in farrowing stalls,” Comput Electron Agric, vol. 163, Aug. 2019. [CrossRef]
  23. A. Dutta and A. Zisserman, “The VIA Annotation Software for Images, Audio and Video,” in Proceedings of the 27th ACM International Conference on Multimedia, Nice, France, Oct. 2019, pp. 2276–2279.
  24. T. Poggio et al., “Theory of Deep Learning III: explaining the non-overfitting puzzle,” Center for Brains Minds + Machines, 2018.
  25. C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, “YOLOv7: Trainable bag-of-freebies sets new state of the art for real-time object detectors,” Jul. 2022, [Online]. Available: http://arxiv.org/abs/2207.02696.
  26. C. D. Manning and H. Schütze, Foundations of Statistical Natural Language Processing. Massachusetts: Massachusetts Institute of Technology, 1999.
  27. L. Sun, Y. Liu, S. Chen, B. Luo, Y. Li, and C. Liu, “Pig Detection Algorithm Based on Sliding Windows and PCA Convolution,” IEEE Access, vol. 7, pp. 44229–44238, 2019. [CrossRef]
  28. P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun, “OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks,” Dec. 2013, [Online]. Available: http://arxiv.org/abs/1312.6229.
  29. M. Hussain, “YOLO-v1 to YOLO-v8, the Rise of YOLO and Its Complementary Nature toward Digital Manufacturing and Industrial Defect Detection,” Machines, vol. 11, no. 7, p. 677, Jun. 2023. [CrossRef]
  30. M. Riekert, A. Klein, F. Adrion, C. Hoffmann, and E. Gallmann, “Automatically detecting pig position and posture by 2D camera imaging and deep learning,” Comput Electron Agric, vol. 174, Jul. 2020. [CrossRef]
  31. Q. Yang, D. Xiao, and S. Lin, “Feeding behavior recognition for group-housed pigs with the Faster R-CNN,” Comput Electron Agric, vol. 155, pp. 453–460, Dec. 2018. [CrossRef]
Figure 1. Location of the pig breeding barn of the National University of Colombia. (A) Location of the department of Antioquia in the upper right corner. Rionegro is marked in red within the department of Antioquia. (B) Farm and Rionegro location. (C) Swine facility location at the farm. Source: Google Earth, Google Maps, 2023.
Figure 2. Swine production facility. (A) Raspberry Pi sensor collecting internal environmental conditions. (B) External facility features. (C) Males and females in their respective cages under the same environmental conditions. (D) Experimental setup for each pen: infrared light, Raspberry Pi sensor, ArduCam OV5647 NoIR camera, and router. Source: The authors.
Figure 3. Setup and pigs at the beginning of each cycle. (A) Overall experiment conditions. (B) Pigs at seven days after weaning, with marks on their backs. Source: The authors.
Figure 4. Equipment and data collection modules. (A) Internal module. (B) Weighing scale. (C) Outside module. Each internal module contained a camera that monitored males and females separately. Source: The authors.
Figure 5. Experiment setup. Configuration of sensors and camera for one pen and one trial. Source: The authors.
Figure 6. Labeled data samples of pigs in cages at different positions and times of day. (A) Pigs lying down in daytime. (B) Pigs lying down at night. (C) Pigs moving during the day. (D) Pigs moving at night. Source: The authors.
Figure 7. Data augmentation set for model training. Source: The authors.
Figure 8. Performance evaluation of the convolutional neural network model for automatic pig detection in livestock. (A) Scatter plots showing the performance metrics (precision, recall, and mean average precision, mAP) over the 100 training epochs. (B) Performance metrics observed during validation of the model using different confidence thresholds (indicated by the values inside the bars). (C) Visual representation of pig detection by the model applied to random images. Each detected animal is marked with a bounding box, along with the corresponding confidence score indicating the reliability of the detection. Source: The authors.
Figure 9. Precision vs. Recall curve for all detections at a confidence of 0.5 (mAP@0.5). Source: The authors.