Thesis Fruits And Vegetables Sorting Using Computer Vision


A Thesis Submitted in Partial Fulfilment of the Requirements for the Award of Degree of


We are glad about the completion of our project report. This is the result of the cooperation and collective effort of our team members, Qasim Raza, Qaisar Ali, Zeeshan Ali, and Umair Mansha. We sincerely thank our project supervisor for guidance and encouragement in completing this report. We would like to express our deepest appreciation to all those who assisted us to complete this report.


We dedicate all our efforts and struggles of the educational life to our dear parents; without them we are meaningless. In addition, we devote the effort of this project to respectable and honorable teachers and supervisors who taught and supported us in developing our skills and personality.


We, students of BS Electrical Technology, Faculty of Engineering and Technology, University of Gujrat, Pakistan, hereby solemnly declare that the data quoted in this report titled “FRUITS AND VEGETABLES SORTING USING COMPUTER VISION” is based on our original work and has not yet been submitted or published elsewhere. We also solemnly declare that the entire report is free of deliberate plagiarism and that we shall not use this report for obtaining any other degree from this or any other university or institution.

We also understand that if evidence of plagiarism is provided in our report at any stage, even after the award of the degree, the degree shall be canceled or revoked by the University authority.

Table of Contents

Abstract

Chapter 1      Introduction
1.1         Background
1.2         Problem Statement
1.3         Importance of Lemon fruits
1.4         Goals and Objectives

Chapter 2      Literature Survey

Chapter 3      Methods
3.1         Image Acquisition
3.2         Cropping
3.3         Background Removal
3.4         Noise Removal
3.5         Colour Features
3.5.1     Mean Value
3.6         Size
3.7         Surface Defects
3.7.1     Centre Surround Method
3.7.2     Global Standard Deviation
3.7.3     Local Standard Deviation
3.7.4     Data Normalizing
3.7.5     Labelling
3.7.6     Training
3.8         Hardware and Physical Arrangement
3.9         Raspberry Pi
3.10      Camera
3.11      Belt Conveyor System
3.11.1        Drive Motor
3.11.2        Rollers
3.11.3        Frame
3.11.4        Belt
3.12      Image Capturing Chamber
3.13      Actuator
3.13.1        H-Bridge
3.14      Power Supply

Chapter 4      Results
4.1         Image Acquisition
4.2         Cropping
4.3         Noise Removal
4.4         Background Subtraction
4.5         Blob Detection
4.6         Morphological Operations
4.7         Mean Value
4.8         Global Standard Deviation
4.9         Area
4.10      Centre Surround
4.11      Local Standard Deviation
4.12      Training
4.12.1        Data Normalization
4.12.2        Neural Network
4.13      Features Correlation
4.14      Neural Network Performance
4.15      Training Performance
4.15.1        Cross Entropy Error
4.16      Classification Error and Accuracy
4.17      Hardware Simulation Results
4.18      Pulse Width Modulation for Motor Speed Control

Chapter 5      Discussion and Conclusion
5.1         Conclusion
5.2         Comparison with other models
5.3         Discussion
5.3.1     Limitations and Areas to Improve

Appendix

List of Figures

Figure 1.1: Block diagram of the system
Figure 3.1: Major steps of the algorithm
Figure 3.2: Block diagram
Figure 3.3: Flow chart for machine training
Figure 3.4: Flow chart for testing
Figure 3.5: Raspberry Pi 3 Model B, image from
Figure 3.6: Raspberry Pi camera board V1.3, image from
Figure 3.7: The belt conveyor system
Figure 3.8: Schematic diagram of motor control circuit
Figure 3.9: Wooden roller
Figure 3.10: Image capturing chamber
Figure 3.11: H-Bridge
Figure 4.1: Image capturing chamber
Figure 4.2: Original image
Figure 4.3: Marked area
Figure 4.4: Cropped image
Figure 4.5: Noise removal
Figure 4.6: Background removed
Figure 4.7: Thresholded image
Figure 4.8: Before and after morphological operations
Figure 4.9: (a) Ripe, (b) semi-ripe, and (c) not ripe lemons
Figure 4.10: Local contrast
Figure 4.11: 16×16 patches to compute the local standard deviation
Figure 4.12: Red-Green means correlation
Figure 4.13: Red-Blue means correlation
Figure 4.14
Figure 4.15: Performance plot, neural network
Figure 4.16: Confusion matrix for training, testing, and validation
Figure 4.17: Proteus simulation results
Figure 4.18: Virtual oscilloscope settings
Figure 5.1: Comparison between (Momin, 2013) and our method

List of Tables

Table 3.1: Gaussian kernel, example
Table 4.1: Determining ripeness using Mean Values of Red, Green, and Blue color channels
Table 4.2: Parameters to initialize the neural network
Table 4.3: H-Bridge responses
Table 5.1: Comparison of our method with existing techniques


The focus of this project was sorting fruits, and lemon was the selected fruit. The goal was to automate the sorting and grading process, which could minimize the human errors caused by manual grading after harvesting. A mechanical arrangement consisting of a conveyor belt, an actuator, and an image-capturing chamber was assembled. Digital images were captured using a CCD camera. To avoid image blur, the mechanism was designed to take the image before the lemon is placed on the belt. Captured images were enhanced using image processing techniques in the RGB color space: the background was removed, followed by noise removal, and useful features were then extracted. The extracted features were the fruit area, the mean skin color and global standard deviation of the individual color channels (Red, Green, and Blue), local contrast differences, and the local standard deviation of the three color channels. These nine features were fed to a Back Propagation Neural Network. The neural network was trained using 99 samples from three classes: ripe, semi-ripe, and a combined class of defective and unripe. The classification accuracy of the system was about 94%.

Chapter 1          Introduction

1.1       Background

Image processing techniques have been evolving for years. Image processing and computer vision are used in many fields: robots and self-driving cars use object recognition and edge detection to avoid obstacles, and face recognition systems use computer vision for security. In agriculture, computer vision has been applied to tasks such as grading, counting, and sorting for about two decades, and these methods have improved over time. Computer vision allows farmers to categorize their products accurately and gives them better control over those products, enabling them to make good decisions about the target market.

The sorting and grading of fruits and vegetables play an important role in the post-harvest process. Manual sorting and grading is very tiring and requires a lot of time and many workers. Computer vision techniques, if applied carefully, can help the farmer categorize fruits and vegetables correctly. Some proposed counting techniques can count the fruits in an image and provide a good estimate of the number of fruits even before harvesting. Automated sorting and grading of fruits and vegetables using computer vision works on digital photographs of the products. It uses non-destructive visual features, meaning the product can be classified quite accurately without damaging it. Visual grading of fruits and vegetables consists of six major steps:

  • Acquisition of digital image.
  • Removal of background.
  • Calculation of size.
  • Determining the ripeness.
  • Detection of surface defects.
  • Machine learning techniques to predict the quality.

Figure 1.1: Block diagram of the system.


1.2       Problem Statement

Sorting and grading fruits are important steps after harvesting and should be carried out accurately. False grading can affect the farmer’s reputation in the market, which can lead to long-term financial problems. Incorrect grading can also lead to food wastage, since a healthy fruit can be discarded. Conventionally, sorting and grading of fruits after harvesting is performed by humans, which is a very tiring task and requires a lot of time and labor. Since manual classification depends solely on human resources, it is prone to errors caused by inconsistency and lapses of attention. Moreover, a color-blind worker can make classification errors. An automated system does not depend on human intervention and is not subject to biological limits such as tiredness and lapses of attention. Therefore, it can provide consistent results.

1.3       Importance of Lemon fruits

Tropical regions are suitable for the cultivation of citrus fruits. Pakistan is not a big grower of citrus due to its sub-tropical climate, but citrus is still a very important crop. According to the Pakistan Horticulture Development & Export Company, a state company under the Ministry of Commerce, Government of Pakistan, the total citrus cultivation in Pakistan covers about 1,600 km² with an annual production of 1.5 million metric tons. Punjab has favourable growing conditions with adequate water for citrus and produces over 95% of the crop.

Lemons are used in food and drinks such as lemonade, cocktails, and soft drinks. In industry, lemons are used for the production of citric acid. Lemons are also used as a cleaning agent for copperware.

This project was based on computer vision techniques, and the code was written in C++ using the OpenCV library. The process included image segmentation and analysis. Segmentation comprises background removal, image enhancement, and morphological operations, whereas image analysis includes extracting useful features such as shape area, colour features, and surface irregularities.

1.4       Goals and Objectives

The goal of this project was to design a method for sorting and grading lemons based on image processing and computer vision techniques. The project was designed so that the resulting process was automatic and did not require any human supervision in predicting the quality of the fruit. Only manual placement of the fruit in the image capturing chamber was required; after that, the whole process was automatic. The mechanical system was able to sort the lemons into their specific bins using a conveyor belt and an actuator.

The project has the following objectives:

  • To segment the image foreground using image-processing techniques.
  • To calculate the size of the fruit.
  • To determine the ripeness.
  • To detect defective fruits.
  • To build a physical arrangement to automate the process.

Chapter 2          Literature Survey

Computer vision researchers have long been proposing methods for visual sorting and grading of fruits. Fruits are mostly sorted based on characteristics such as color, size, and surface irregularities. Some advanced techniques use laser imaging, fluorescent imaging, and spectroscopy for defect detection.

This section reviews various methods and papers on sorting and grading of fruits, especially citrus fruits such as lemon and orange.

(Kondo & Ting, 1998) showed a fundamental setup to acquire data such as color, size, and mass. The authors provided a simple prototype for industry to classify the product and forward it to the proper channel. Modern sorters can sort fruits at more than ten fruits per second based on colour, shape, defects, and stem detection.

(Jahnsa, 2001) sorted tomatoes using computer vision techniques and observed that tomatoes can be sorted by mass using only an image. The absolute error was about 2.06%.

Mangoes can be sorted based on their colour and shape. Geometric features such as shape can be compared with a reference shape. Shape analysis is a good feature for a variety of mangoes. For grading purposes, the pixel value is another good feature. A pixel value greater than 100 means the skin is good and pure. This method has 83.3% accuracy (Pauly & Sankar, 2015).

Fruits such as mango can be sorted based on their maturity. A camera is used to acquire a digital image of the mango. In the second step, the noise is removed using a pseudo-median filter. The image is then converted to binary for edge detection. The method is 90% accurate overall (Bipan Tudu, 2012).

To evaluate the quality of fruit, a new method was proposed using the HSI color model. A digital image of the fruit, captured in RGB color space using a CCD camera, was converted into HSI colour space. The colour intensity histogram of only the hue (H) channel was calculated. The histogram was provided as input to a back propagation neural network. The output of the network was a description of the quality of the fruit (Cui, Wang, Chen, & Ping, 2013).

A date fruit sorting and grading system was proposed. The system consisted of software and hardware. The hardware section included a conveyor belt system with a camera integrated into it. A computer loaded with software was used to analyze the digital images of dates and classify them. The overall accuracy was found to be 80%. A problem with detecting the flabbiness of the fruit was observed (Ohali, 2011).

A robot was designed to identify and pick fruits automatically using computer vision. A physical system was designed that could be mounted on a tractor. A camera was used to capture the images. The image was further processed to detect defective apples. A vacuum grabber was used to pick the apples (Clowting, 2007).

Food color measurement in computer vision applications was reviewed. The paper described the pros and cons of colour measurement for food and proposed the future scope and trends in the field (Wu & Sun, 2013).

A very intuitive method for apple defect detection was proposed. The method incorporated automatic light correction. It counted defects and distinguished between a true defect and the stem end. The method used a support vector machine for classification (Huang, Zhang, Gong, & Li, 2015).

(Jhawar, December 2015) proposed a lemon sorting system based on pattern recognition techniques such as nearest prototype, edited multi-seed nearest neighbor and linear regression. Features extracted were Mean Values of Red, Green, and Blue, size, standard deviation, and min-max values of the grey level image. They collected their samples from different locations in India consisting of five different breeds. The scope of their research was limited to only ripeness measurement. Our model closely resembles this model in ripeness measurements but goes beyond this research in terms of defective fruit detection. Their system was able to perform at 100% accuracy using linear regression.

(Seema, Kumar, & Gill, 2015) prepared a fruit recognition system to sort mixed fruits based on the type of fruit. The features used for fruit recognition were shape, size, and color. They reported an accuracy of 100% based on 120 samples.

(Khojasteh, 2010) proposed a lemon grading embedded system based on color and volume only. No defect-based classification was done. Greenish lemons with smaller sizes were considered grade B, while larger yellowish lemons were considered grade A. They used two cameras to cover the maximum lemon area.

(Momin, 2013) proposed a very advanced technique for lemon defect detection. Fluorescent imaging was the basis of the research, since it has been used to extract the fluorescent component from the peels of citrus fruits. Spectroscopy was used to identify the fluorescent components. Fluorescent components and spectroscopy help identify the chemical composition of lemon peel, which in turn can be used for defect detection. The technique had a success rate of around 85%.

(Khoje, 2013) used Curvelet transforms for pattern recognition. Fruit quality was assessed using pattern recognition techniques. The Curvelet transform is a multi-resolution technique that works on lower and higher resolutions to extract both local and global features related to the fruit’s surface. The technique was evaluated on lemons and guava. Textural features extracted from the Curvelet transform were standard deviation, energy, entropy, and mean. A Probabilistic Neural Network and a Support Vector Machine were trained using these features, and performance was evaluated for two classes, healthy and defective. The SVM performed better and provided an accuracy of 96%.

(Swapnil S. Pawar & Dale, 2016) designed a system to recognize a fruit based on features such as roundness value and color. If the object is recognized as fruit using K-Nearest Neighbours, then the fruit is subjected to defect detection. A simple thresholding was used to isolate the defective area. If the pixel value exceeds a threshold value, then it belongs to the pure skin otherwise the pixel belongs to the defective area. All such pixels are counted to get the total defective area.

(Iqbal, 2016) devised an approach to sort citrus fruits, especially lemons, oranges, and sweet limes. A single-view image was shown to be enough for classification based on color features. Only the hue from HSV colour space was used for classification. Different approaches such as colour distance, linear discriminant analysis, and probabilistic distribution were used to evaluate the classification accuracy. An accuracy of 90% was obtained based on colour classification. Moreover, colour variability, measured using the hue mean and hue median, was used for fruit maturity analysis.

Defect detection on the spherical fruits is a tough task due to uneven lighting around the spherical shape. The study covered different defects like scarring and copper burn, which are common in oranges. Non-uniform spherical orange images were transformed using Butterworth filter resulting in even lighting distribution. It was observed that the stem end was detected as a defect in the algorithm. Red and Green ratio in color image along with big area and elongated region removal algorithms were used to detect stem end. The method detected defects extremely well with an accuracy of 98.9%. However, the method could not discriminate the types of defects (Li, 2013).

(Blasco, 2014) designed an automated system for citrus fruit harvesting. The authors realized that the field conditions vary massively. To make the system consistent, a good and efficient lighting system was necessary. Moreover, a low-power processing unit and image acquisition system was required.

Our method used various techniques presented in different papers. The methods have been combined and modified according to requirements.

Chapter 3          Methods

This chapter describes the methods and algorithms for lemon fruit sorting: pre-processing, feature extraction, and neural network training. The material described here is theoretical; Chapter 4 presents the experimental evaluation of the methods given in this chapter.

Figure 3.1: Major steps of the algorithm

The algorithm used in the system has seven major steps, shown in Figure 3.1. Figure 3.2 shows the block diagram.

Figure 3.2: Block diagram (blocks: image sensor, light source, digitized image, features extraction, decision making, signal to mechanical pusher, sorting bins)

Flowcharts showing the detailed algorithms and steps for training and testing are presented below.

Figure 3.3: Flow chart for machine training

Figure 3.4: Flow chart for testing

3.1       Image Acquisition

Capturing the digital image is the first step in image processing. A controlled light source is required to get a good image. Moreover, the background and the distance from the camera to the object should also be controlled to get consistent results, since these factors have a great effect on image segmentation. If a fixed background is used, no pre-processing is required (Jhawar, December 2015). Images were captured by a CCD camera, which produced color images in the RGB color space. The exposure value and AWB gains of the camera were set manually to get images with consistent parameters. Automatic metering was turned off and images were captured as frames from the video stream.

3.2       Cropping

An imaging chamber was used to capture the image. Its physical arrangement ensured that the fruit could only appear in a fixed central region of the camera’s field of view, so the probability of finding the fruit outside that region was zero. Therefore, it was practical to crop only the central region and perform all further operations on that region. This saved memory and processing power, speeding up real-time operation.
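The cropping step can be sketched as follows. This is a minimal illustration that uses only the C++ standard library rather than the OpenCV calls of the actual project; the `GrayImage` struct, the function name `cropCentre`, and the single-channel row-major layout are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Single-channel image stored row by row (hypothetical layout for this sketch).
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> data;
};

// Returns the cropW x cropH region centred in `src`.
GrayImage cropCentre(const GrayImage& src, int cropW, int cropH) {
    assert(cropW > 0 && cropH > 0 && cropW <= src.width && cropH <= src.height);
    const int x0 = (src.width - cropW) / 2;   // left edge of the crop window
    const int y0 = (src.height - cropH) / 2;  // top edge of the crop window
    GrayImage out{cropW, cropH, {}};
    out.data.reserve(static_cast<std::size_t>(cropW) * cropH);
    for (int y = y0; y < y0 + cropH; ++y) {
        // Copy one row of the window at a time.
        const auto rowStart =
            src.data.begin() + static_cast<std::ptrdiff_t>(y) * src.width + x0;
        out.data.insert(out.data.end(), rowStart, rowStart + cropW);
    }
    return out;
}
```

Because every later stage visits each pixel at least once, reducing the image to the central window reduces the work of all following steps in proportion to the discarded area.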

3.3       Background Removal

The background needs to be removed so that the algorithm performs calculations on the object of interest only. Background removal saves processing power and reduces the complexity of the later algorithms. Since a fixed black background was used, little or no effort was required to remove it. The algorithm checks the pixel values of all three channels (R, G, and B). Knowing the pixel intensity range of the fruit, all other values are set to zero. The process was simple and robust and could isolate the fruit region accurately.

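Algorithm 1:

     For each pixel $(x, y)$ of image $I$ with values $R(x, y)$ and $G(x, y)$:

                 $d(x, y) = |R(x, y) - G(x, y)|$

                 If ( $d(x, y) < T$ ): $I(x, y) = 0$

where $T$ is the threshold on the absolute difference.
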

Since the background was black, it could only produce levels of grey under different lighting conditions. Grey levels always have all three channel values (R, G, and B) nearly equal, so the green channel value was subtracted from the red channel value, and whenever the absolute difference was below a threshold, the pixel was set to black.
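The red-green difference test described above can be sketched in a few lines. This is an illustrative version using only the C++ standard library, not the project's OpenCV code; the `Rgb` struct, the function name `removeGreyBackground`, and the threshold value in the usage note are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// One RGB pixel; the layout is an assumption made for this sketch.
struct Rgb {
    std::uint8_t r, g, b;
};

// Sets near-grey pixels (|R - G| below `threshold`) to black, in place.
// Returns the number of pixels kept as foreground.
std::size_t removeGreyBackground(std::vector<Rgb>& pixels, int threshold) {
    assert(threshold >= 0);
    std::size_t kept = 0;
    for (Rgb& p : pixels) {
        const int diff = std::abs(static_cast<int>(p.r) - static_cast<int>(p.g));
        if (diff < threshold) {
            p = Rgb{0, 0, 0};  // grey level: red and green nearly equal
        } else {
            ++kept;            // coloured pixel: treated as fruit
        }
    }
    return kept;
}
```

With a hypothetical threshold of 20, a dark grey pixel such as (40, 42, 41) is set to black, while a pixel such as (180, 150, 60) is kept. The threshold has to be tuned for the lighting in the chamber, since fruit pixels whose red and green values happen to be close would also be removed by this test.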

3.4       Noise Removal

Captured images always contain some form of noise.

“Throughout the whole sensing process, noise is added from various sources, which may include fixed pattern noise, dark current noise, shot noise, amplifier noise and quantization noise.” (Szeliski, September 3, 2010)

Therefore, it is necessary to remove as much noise as possible. Various techniques are used to remove noise, and the Gaussian filter is a common choice. It is used extensively in image processing and signal processing, typically at the pre-processing stage, and it performs well for many kinds of noise. The Gaussian filter is a low-pass filter: it attenuates high-frequency components and smooths the image. It is a little slower at runtime than the box filter and the median filter. Finding an appropriate filter size is necessary so that the image does not become too flat. The image is convolved with a Gaussian kernel of odd size, such as 3×3 or 5×5. Larger filter sizes are slower and can reduce details in the image.

A two-dimensional spatial Gaussian filter has the mathematical form:

$$G(x, y) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2 + y^2}{2\sigma^2}}$$

A 3×3 kernel with $\sigma = 1$ is shown below.

Table 3.1: Gaussian kernel, example.

0.077 0.123 0.077
0.123 0.195 0.123
0.077 0.123 0.077

The kernel is centered at every pixel of the image and multiplied element-wise followed by the addition of all terms. The result is the new value of the central pixel. The process is also known as convolution.
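As a sketch of how such a kernel can be generated, the function below samples the two-dimensional Gaussian at integer offsets from the centre and normalises the weights so that they sum to 1. The function name `gaussianKernel` is hypothetical, and point sampling is only one possible discretisation: for a 3×3 kernel with a standard deviation of 1 it gives a centre weight of about 0.204, slightly different from the values in Table 3.1, which may have been produced by a different discretisation.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Builds a normalised size x size Gaussian kernel by sampling the 2D Gaussian
// at integer offsets from the centre. The constant factor of the Gaussian is
// omitted because it cancels in the normalisation step.
std::vector<std::vector<double>> gaussianKernel(int size, double sigma) {
    assert(size > 0 && size % 2 == 1 && sigma > 0.0);
    const int half = size / 2;
    std::vector<std::vector<double>> kernel(size, std::vector<double>(size));
    double sum = 0.0;
    for (int y = -half; y <= half; ++y) {
        for (int x = -half; x <= half; ++x) {
            const double w = std::exp(-(x * x + y * y) / (2.0 * sigma * sigma));
            kernel[y + half][x + half] = w;
            sum += w;
        }
    }
    for (auto& row : kernel) {
        for (double& w : row) {
            w /= sum;  // after this loop the weights sum to 1
        }
    }
    return kernel;
}
```

Normalising the weights keeps the overall brightness of the image unchanged after filtering.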

3.5       Colour Features

The colour feature used to determine the ripeness of the lemon was the global Mean Value of the individual image channels.

Jhawar (2015) used mean along with standard deviation to determine four classes of oranges as not ripe, semi ripe, ripe, and over ripe.

A ripe lemon fruit has red channel intensity around 180, green from 150 to 180, and blue below 80.

3.5.1     Mean Value

The Mean Value is simply the arithmetic mean. It is an important feature and provides knowledge about ripeness. Humans consider a yellow lemon ripe, a greenish-yellow lemon semi-ripe, and a green lemon not ripe. The Mean Values of the red, green, and blue channels were used to classify the fruit as ripe, semi-ripe, or not ripe at all. The colour space used was RGB. By convention, it is close to the human visual system, where three kinds of color receptor cells (Red, Green, and Blue) sense colors. A high Mean Value of red shows that the lemon is ripe because, in RGB color space, yellow has a high value of red.

One point worth mentioning is that background pixels (previously set to zero) must never be included in the mean computation, because they would yield false values. A mask was used so that only the fruit region contributed to the mean.
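The effect of the mask can be illustrated with a short sketch. It uses only the C++ standard library; the function name `maskedMean` and the convention that non-zero mask values mark fruit pixels are assumptions for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mean of the channel values at positions where the mask is non-zero.
// Returns 0.0 when the mask selects no pixels.
double maskedMean(const std::vector<std::uint8_t>& channel,
                  const std::vector<std::uint8_t>& mask) {
    assert(channel.size() == mask.size());
    double sum = 0.0;
    std::size_t count = 0;
    for (std::size_t i = 0; i < channel.size(); ++i) {
        if (mask[i] != 0) {  // fruit pixel
            sum += channel[i];
            ++count;
        }
    }
    return count == 0 ? 0.0 : sum / static_cast<double>(count);
}
```

For a channel with values 0, 200, 100, 0 in which only the two middle pixels belong to the fruit, the masked mean is 150, whereas the unmasked mean over all four pixels would be 75. This is the kind of false value the mask is meant to prevent.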

3.6       Size

Size is a very important feature; humans usually consider a bigger fruit to be of better quality. In computer vision, the size of an object can be determined by counting the number of pixels covered by that object in the digital image. It is not very accurate, but it provides a good estimate. The procedure was as follows:

  • Counted the number of non-zero pixels of the image (all background pixels had been set to zero).
  • Took the real object and measured its diameter using Vernier calipers.
  • Used the formula to calculate the area from the diameter.
  • The number of pixels corresponds to this area.
  • Developed an equation as

$$A = k \cdot N$$

where $A$ is the area, $N$ is the number of non-zero pixels, and $k$ was determined practically.
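The calibration idea can be sketched as follows. The sketch assumes that the fruit outline is roughly circular, so that the reference area follows from the measured diameter as the area of a circle, and that area is proportional to pixel count; both assumptions, as well as the function names, are made for illustration and are not taken from the project code.

```cpp
#include <cassert>
#include <cstddef>

constexpr double kPi = 3.14159265358979323846;

// Area of a circle with the given diameter (assumes a roughly circular outline).
double circleArea(double diameter) {
    return kPi * diameter * diameter / 4.0;
}

// Calibration constant: real-world area represented by one pixel, derived
// from one reference fruit measured with calipers.
double calibrateK(double referenceDiameter, std::size_t referencePixels) {
    assert(referencePixels > 0);
    return circleArea(referenceDiameter) / static_cast<double>(referencePixels);
}

// Estimated area of a fruit that covers `pixelCount` non-zero pixels.
double areaFromPixels(double k, std::size_t pixelCount) {
    return k * static_cast<double>(pixelCount);
}
```

The constant is only valid while the camera distance and the crop window stay fixed, which the imaging chamber guarantees.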

3.7       Surface Defects

Initially, surface defects were determined using region-based segmentation (Mohana & C.J., 2015). The technique provided good results, but region-based segmentation was too time-consuming for our ARM-based processing unit. Later, two different algorithms were used to detect surface irregularities.

3.7.1     Centre Surround Method

The centre-surround mechanism is inspired by human receptive fields. Centre surround is a type of spatial filtering that is independent of the global context (Vonikakis & Winkler). The algorithm computed the average pixel intensity of a local 5×5 neighborhood. The central pixel intensity is subtracted from the average to check the local contrast difference. If the absolute difference exceeds a predetermined value, the contrast for that pixel is high and the pixel is set to logical high; otherwise, the pixel is set to zero. We used this technique because the resulting image is a contrast map in which only strong contrast differences are shown. Since defect-free skin is smooth and does not produce strong contrasts, pixels with strong local contrast were counted, and the result was an estimate of the defective area.

The image was rescaled to 40% in both directions to increase the filter strength. The algorithm for center-surround (Frintrop, 2006) is presented below.

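Algorithm 2:

          For each pixel $(x, y)$ of image $I$ with value $I(x, y)$:

                      Centre = $I(x, y)$

                      For pixels $(i, j)$ with value $I(i, j)$ in the 5×5 neighborhood within the image border:

                                  Surround = mean of $I(i, j)$

                      $c$ = centre – surround

                      If ( $|c| > T$ ): $O(x, y) = 1$, otherwise $O(x, y) = 0$

where $T$ is the predetermined threshold and $O$ is the resulting contrast map.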
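A minimal sketch of the centre-surround contrast map is given below, using only the C++ standard library. The `GrayImage` struct and the function name `centreSurround` are assumptions, as is the choice to include the centre pixel in the 5×5 neighbourhood average and to clip the neighbourhood at the image border.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Single-channel image stored row by row (hypothetical layout for this sketch).
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> data;
};

// Returns a binary map: 1 where |centre - mean of 5x5 neighbourhood| exceeds
// `threshold`, 0 elsewhere. Neighbours outside the image are skipped.
std::vector<std::uint8_t> centreSurround(const GrayImage& img, double threshold) {
    assert(img.data.size() == static_cast<std::size_t>(img.width) * img.height);
    std::vector<std::uint8_t> out(img.data.size(), 0);
    for (int y = 0; y < img.height; ++y) {
        for (int x = 0; x < img.width; ++x) {
            double sum = 0.0;
            int n = 0;
            for (int dy = -2; dy <= 2; ++dy) {
                for (int dx = -2; dx <= 2; ++dx) {
                    const int yy = y + dy;
                    const int xx = x + dx;
                    if (yy < 0 || yy >= img.height || xx < 0 || xx >= img.width) {
                        continue;  // outside the image border
                    }
                    sum += img.data[static_cast<std::size_t>(yy) * img.width + xx];
                    ++n;
                }
            }
            const std::size_t idx = static_cast<std::size_t>(y) * img.width + x;
            const double centre = img.data[idx];
            if (std::fabs(centre - sum / n) > threshold) {
                out[idx] = 1;  // strong local contrast
            }
        }
    }
    return out;
}
```

Counting the ones in the returned map gives the estimate of the defective area used as a feature. On a uniform 7×7 test image with a single dark pixel in the middle and a hypothetical threshold of 20, only that pixel is marked.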



3.7.2     Global Standard Deviation

Standard deviation is a measure of how spread out the pixel values of an image are. A low standard deviation means the values are mostly close to the Mean Value of the image pixels, so the image is smooth and varies little spatially. A higher standard deviation means the pixel values are non-uniform and spread over a wider range. It is therefore a good measure of how smooth an image surface is. The name global standard deviation indicates that the computation was performed on the whole image channel at once. All three channels (Red, Green, and Blue) were treated separately.

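Standard deviation can be calculated as:

$$\sigma = \sqrt{\frac{1}{N} \sum_{i} \sum_{j} \left( x_{ij} - \mu \right)^2}$$

  • $\mu$ is the Mean Value of all pixels, computed as $\mu = \frac{1}{N} \sum_{i} \sum_{j} x_{ij}$.
  • $x_{ij}$ is the value of the pixel in row $i$ and column $j$.
  • $N$ is the total number of pixels in the image.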
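The standard deviation of one channel can be computed with two passes over the pixel values, as in the sketch below. It assumes the population form of the standard deviation (division by the number of pixels) and a flattened vector of pixel values; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Arithmetic mean of the pixel values.
double meanValue(const std::vector<double>& pixels) {
    assert(!pixels.empty());
    double sum = 0.0;
    for (double v : pixels) sum += v;
    return sum / static_cast<double>(pixels.size());
}

// Population standard deviation: square root of the mean squared
// deviation from the mean.
double globalStdDev(const std::vector<double>& pixels) {
    const double mu = meanValue(pixels);
    double squared = 0.0;
    for (double v : pixels) squared += (v - mu) * (v - mu);
    return std::sqrt(squared / static_cast<double>(pixels.size()));
}
```

For example, the values 2, 4, 4, 4, 5, 5, 7, 9 have a mean of 5 and a standard deviation of 2. In practice the function would be called once per color channel.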

3.7.3     Local Standard Deviation

The method described in the previous section was quite handy but in practice, it was not suitable for all kinds of lemons. A lemon turning from green to yellow has yellow patches on green and possesses strong color contrast even if it does not have any defect.

To overcome this difficulty, another method was proposed. The whole image was divided into 16×16 patches and the standard deviation was calculated for each patch locally, independent of the global context, hence the name local standard deviation. Patches with a higher standard deviation were categorized as defective because, in a local neighborhood, pixel values should not spread significantly for a defect-free surface. Only defective regions have a high standard deviation. The computed standard deviations were stored in a matrix with one element per 16×16 patch in the image.

It was determined experimentally that the border of the fruit had a high standard deviation even for good fruits, because the patches at the border have a wider pixel value distribution around the mean. A morphological operation was performed to remove some border values. Most patches in the fruit region had standard deviation values in the range of 0 to 1, even for smooth skin, because of the little bumps on the lemon surface. Therefore, patches with a value of 1 were set to zero. The remaining patches, where the standard deviation was non-zero, were added together and used as feature M.

Features from both center-surround and local 8×8 patches standard deviation were used for defective fruit detection.

Algorithm 3:

For every ( $i$ ) and ( $j$ ) in steps of the patch size                           [where step = 8]

                                    $S_{i,j}$ = standard deviation of the patch at $(i, j)$

                                    If ( $S_{i,j} \geq 1$ )

                     $S_{i,j} = S_{i,j} - 1$
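The patch-wise computation can be sketched as follows. The sketch makes several assumptions that are not confirmed by the text: patches do not overlap, incomplete patches at the right and bottom edges are ignored, the border-removal step is left out, and a patch contributes its standard deviation minus 1 only when that value is at least 1. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Population standard deviation of the patch with top-left corner (x0, y0).
double patchStdDev(const std::vector<std::uint8_t>& img, int width,
                   int x0, int y0, int patch) {
    const double n = static_cast<double>(patch) * patch;
    double sum = 0.0;
    for (int y = y0; y < y0 + patch; ++y)
        for (int x = x0; x < x0 + patch; ++x)
            sum += img[static_cast<std::size_t>(y) * width + x];
    const double mean = sum / n;
    double squared = 0.0;
    for (int y = y0; y < y0 + patch; ++y)
        for (int x = x0; x < x0 + patch; ++x) {
            const double d = img[static_cast<std::size_t>(y) * width + x] - mean;
            squared += d * d;
        }
    return std::sqrt(squared / n);
}

// Sums (standard deviation - 1) over all full patches whose standard
// deviation is at least 1; smoother patches contribute nothing.
double localStdFeature(const std::vector<std::uint8_t>& img,
                       int width, int height, int patch) {
    assert(patch > 0 && img.size() == static_cast<std::size_t>(width) * height);
    double feature = 0.0;
    for (int y = 0; y + patch <= height; y += patch)
        for (int x = 0; x + patch <= width; x += patch) {
            const double s = patchStdDev(img, width, x, y, patch);
            if (s >= 1.0) feature += s - 1.0;
        }
    return feature;
}
```

A perfectly uniform patch contributes zero to the feature, while a patch split evenly between the values 100 and 120 has a standard deviation of 10 and contributes 9.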



The extracted features are:

  • Mean Value of Red
  • Mean Value of Green
  • Standard deviation (8×8 patches)
  • Area
  • Defective area using a center-surround algorithm

The Mean Value of the blue channel was not used because it has no significant effect on quality prediction (Jhawar, December 2015).

3.7.4     Data Normalizing

The extracted features were in the form of numbers. The five features from each fruit sample were arranged in a matrix. A total of 150 training samples, 50 from each category were acquired and put into the matrix.

The machine learning algorithms used here expect normalized data in the form of floating-point numbers ranging from 0 to 1. For this purpose, the matrix was converted to a 32-bit floating-point data type and each row was divided by the highest number in that row. The operation resulted in normalized floating-point data in the range 0.0 to 1.0.
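The row-wise scaling can be sketched as follows. The sketch assumes non-negative feature values stored as 32-bit floats in a vector of rows, and it leaves rows without a positive maximum unchanged to avoid division by zero; the function name `normaliseRows` is hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Divides every row by its largest element so that values fall in [0, 1].
// Assumes non-negative features; rows without a positive maximum are skipped.
void normaliseRows(std::vector<std::vector<float>>& features) {
    for (auto& row : features) {
        if (row.empty()) continue;
        const float maxValue = *std::max_element(row.begin(), row.end());
        if (maxValue <= 0.0f) continue;  // nothing to scale
        for (float& v : row) v /= maxValue;
    }
}
```

For example, the row 2, 4, 8 becomes 0.25, 0.5, 1. The same scaling factors have to be reused at test time so that new samples are mapped to the same range as the training data.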

3.7.5     Labelling

Since supervised learning algorithms were used for training and classification, a label for each sample had to be passed to the learning algorithm. A floating-point matrix was created that contained only the labels.

3.7.6     Training

Two different machine learning algorithms were used for training. Both the algorithms were available as built-in functions in the OpenCV machine learning library.

  • Back propagation neural network (Multilayer Perceptron)
  • Support vector machine

The resulting learned weights were saved to storage for later use. The results obtained are described in later chapters.

3.8       Hardware and Physical Arrangement

A complete fruit sorting mechanism was built to make the process automatic. The major parts of hardware are:

  • Raspberry pi
  • Camera
  • Image capturing chamber
  • Conveyor belt
  • Actuator
  • Sorting bins
  • Electrical components for control and switching

The hardware and its components are described in this section.

3.9       Raspberry Pi

Raspberry Pi is a single-board computer, also called a development board. It was created in the UK to promote the teaching of computer science to children and in developing countries. It later became popular in other fields such as robotics. Various developers and inventors use Raspberry Pi for prototyping. Since its release, the Raspberry Pi organization has released many models and revisions differing in features like memory, peripherals, and processor.

The development board used in this project was the Raspberry Pi 3 Model B. This particular computer has a 1.2 GHz ARM64 processor and 1 GB RAM, which was enough to use it as our central control and processing unit. It also has a 40-pin I/O header that can be used to control the motors used in the project.

Figure ‎3.5: Raspberry Pi 3 Model B, image from

3.10   Camera

Figure ‎3.6: Raspberry Pi camera board V1.3, image from

The camera used in the project was also from the Raspberry Pi organization: the Pi camera V1.3. The camera had a 5 megapixel sensor that could take 2592×1944 resolution images with decent quality. It had a serial interface and could be connected directly to the Raspberry Pi board. The pictures were obtained as frames from the video stream, which were not as good as pictures captured in picture mode, because picture mode applies additional algorithms to remove noise and filters to improve quality and correct colors. Since our application required a series of frames from the video stream, the use of picture mode was not possible. The video mode produces a 1080p stream.

3.11   Belt Conveyor System

The belt conveyor is the medium that carries the lemons from one end to the other. It has the following main parts:

  1. A drive motor
  2. A pair of rollers
  3. The frame
Figure ‎3.7: The belt conveyor system


3.11.1  Drive Motor

The drive motor used to spin the rollers was obtained from an electric scooter. The motor had external gears to reduce the angular velocity, resulting in high torque. The motor required a 12V DC supply, with a no-load current of 1.5A and a full-load current of 2.3A. The motor was controlled by the Raspberry Pi through a driver circuit in which a MOSFET that can handle high current was used as a switch. The motor speed was reduced to half using pulse width modulation (PWM) of a square wave. A wiring library with C++ was used to produce the PWM signal.

Figure ‎3.8: Schematic diagram of the motor control circuit


The pulse width modulation frequency was set to 5kHz with a duty cycle of 50%.
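The timing implied by these PWM settings can be worked out with a few lines of Python. This is a sketch that assumes an ideal switch, so the average voltage is simply the supply voltage multiplied by the duty cycle; the function name `pwm_timing` is hypothetical and is not part of the wiring library used in the project.

```python
def pwm_timing(frequency_hz, duty_cycle, supply_v):
    """Return (period, on-time, average voltage) for an ideal PWM switch."""
    period_s = 1.0 / frequency_hz       # length of one PWM cycle
    on_time_s = period_s * duty_cycle   # time the MOSFET conducts per cycle
    average_v = supply_v * duty_cycle   # mean voltage seen by the motor
    return period_s, on_time_s, average_v

# Settings from the text: 5 kHz, 50% duty cycle, 12 V motor supply.
period_s, on_time_s, average_v = pwm_timing(5000.0, 0.5, 12.0)
```

With these values the period is 200 µs, the switch conducts for 100 µs in each cycle, and the motor sees an average of about 6V, which is consistent with the motor speed being reduced to roughly half.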

3.11.2  Rollers

Two rollers made from wood were used to support the belt. Both sides of the rollers were supported using ball bearings which allowed the rollers to spin with much-reduced friction. Each roller had a diameter of 5cm and a length of 20.3cm. One roller was connected to the motor to provide torque.

Figure ‎3.9: Wooden roller

The roller was covered with a rubbery material so that the belt would not slip on its surface.

3.11.3  Frame

A wooden frame was made to support all the components of the belt conveyor, the image capturing chamber, and the actuator. The frame has a structure at both ends where bearing blocks can be mounted. The bearing blocks can be adjusted to set the tension of the belt. The frame has a length of 1.22 meters and a width of 25cm. The upper surface of the frame, where the belt is placed, is flat.

3.11.4  Belt

A non-elastic rubbery cloth was used as the belt. The belt was composed of two materials, a fabric and a leather-like material, which made it fit for the application. The belt tension was adjusted so that it would not slip or track sideways. The belt was painted black because the image processing algorithm required the background to be black.

3.12   Image Capturing Chamber

An imaging chamber was proposed as part of the project to protect the sample from changing lighting conditions from the outside world while capturing. The camera was mounted at the roof of the chamber and two LED lamps with a built-in diffuser were installed inside the chamber.

The lamps were supplied with a fixed 12 volts to keep the lighting conditions constant. The openings of the chamber were covered with paper to prevent as much light as possible from entering the chamber. The chamber was 20cm high, 30cm long, and 20cm wide.

The chamber was designed such that the lemon was placed directly inside it, where it was stopped right beneath the camera and the image was taken. The idea of stopping the fruit before taking the image was to eliminate the motion blur caused when the image was taken while the fruit was moving.

Figure 3.10: Image capturing chamber

3.13   Actuator

An actuator is the part of a machine that is responsible for mechanical control such as opening or closing valves. The actuator in the project was used as a pusher to move the fruit into the respective bin. The bin into which the lemon was pushed was decided by the backpropagation neural network: the output of the neural network decided which way to move the fruit.

The actuator was built using a PVC pipe and a small motor. The Raspberry Pi directly controlled the movement and timing of the actuator.

3.13.1  H-Bridge

The circuit used to control the actuator was an H-Bridge, which can turn a DC motor in both forward and reverse directions (Paquino25, n.d.). It was constructed using two N-channel and two P-channel power MOSFETs. A small NPN transistor was used at the driver stage to control switching, which also isolates the Raspberry Pi pins from the MOSFET gate capacitance.

Figure ‎3.11: H-Bridge

The H-Bridge has two inputs, S1 and S2, and its output drives the motor in the forward or reverse direction.

3.14   Power Supply

As mentioned earlier, the system required a fixed 12V supply for the lighting conditions to remain constant, so even a small voltage drop was highly undesirable. Motors can draw six to ten times more current at starting, and our system used three motors, two of which were started and stopped regularly. The choice of power supply was therefore critical. ATX power supplies are designed to withstand changing loads and high currents, so an ATX power supply was modified to power the system. It was a 250-watt switched-mode power supply rated at an output of 12V 16A, which was more than sufficient for our application. Additionally, it could provide 5V and 3.3V outputs. One added advantage was its short-circuit and overcurrent protection.

Chapter 4          Results

In most image processing applications, pre-processing is required on the image. Pre-processing includes noise removal, background subtraction, cropping, and resizing. This section discusses the pre-processing followed by feature extraction in detail. Experimental results are provided in this chapter.

4.1       Image Acquisition

Figure ‎4.1: Image capturing Chamber

Digital image acquisition is the very first step in this project. Varying ambient light has an adverse effect on the quality of the captured image. For this reason, an image capturing chamber was constructed. It was 20cm high. The camera was mounted at the top inside the chamber. The camera used for image capturing was the Raspberry Pi Camera Board v1.3 (5MP, 1080p), as described earlier. It was a fixed-focus camera and hence could be used in a fixed position. The chamber was designed to stop ambient light from entering. It was provided with two fluorescent lamps, each powered by a 12V source, which provided consistent lighting to achieve good quality images. Captured images had the dimensions 1280x960x3 in the RGB color space. The image captured by the camera is shown in Figure 4-2, which is a scaled-down version of the original. The lemon appears at the center of the image.

Figure 4.2: Original image

4.2       Cropping

Figure ‎4.3: Marked area

The image was cropped to remove the part of it where the probability of finding the object of interest is zero. Cropping is an easy and very important step in image processing. It makes it possible to focus only on the main object and removes distracting objects from the image. Cropping not only reduces the image size but also saves a lot of computing effort later. Moreover, cropping can change the aspect ratio. Only the area of the image where the object could appear was kept. This area is shown in Figure 4-3, where the region of interest is marked by a purple rectangle.

In this step, the original image with a resolution of 1280×930 and a total of 1190400 pixels was reduced to 410×300 with only 123000 pixels. The cropped image therefore contains only about 10% of the pixels of the original. Figure 4-4 shows the resulting cropped image.
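A minimal sketch of this cropping step with NumPy array slicing is given below. The position of the region of interest is not stated in the text, so the `top` and `left` offsets (which centre the rectangle in the frame) are assumptions, and `crop_roi` is a hypothetical helper name. The image sizes are the ones quoted above.

```python
import numpy as np

def crop_roi(image, top, left, height, width):
    """Return the rectangular region of interest as a view of the image."""
    return image[top:top + height, left:left + width]

# Placeholder frame with the resolution quoted above (rows x columns x channels).
frame = np.zeros((930, 1280, 3), dtype=np.uint8)

# Assumed offsets: a 410x300 rectangle centred in the frame.
roi = crop_roi(frame, top=315, left=435, height=300, width=410)

# Fraction of the original pixels that remain after cropping.
kept_fraction = (roi.shape[0] * roi.shape[1]) / (frame.shape[0] * frame.shape[1])
```

The retained fraction is 123000 / 1190400, a little over 10%, so every later per-pixel operation handles roughly one tenth of the original data.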

Figure ‎4.4: Cropped image

4.3       Noise Removal

Noise removal is an important step in image processing. Noise from different sources such as the lens, sensor, quantization, and transmission channel should be removed, because algorithms perform better on noise-free images. A Gaussian filter was used for this purpose. The parameters of the Gaussian filter, set manually, were sigma and size. Size means the kernel size, and it was set to 3. Larger kernels eliminate more noise but are slower, so the choice of this parameter was critical. Sigma is the standard deviation of the Gaussian function. The filter is a 2D function and allows two different values of sigma for the x and y directions. In this case, both sigma-x and sigma-y were set to 3. The result was a smoother image with less noise, which can be seen in Figure 4-5.
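The filter described above can be illustrated with a short NumPy sketch that builds a 3×3 Gaussian kernel with sigma equal to 3 and applies it to one channel. It is not the project's code: the helper names `gaussian_kernel` and `smooth` are hypothetical, and replicating the edge pixels at the border is an assumption.

```python
import numpy as np

def gaussian_kernel(size=3, sigma=3.0):
    """Build a normalised size x size Gaussian kernel."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)          # separable 2D Gaussian
    return kernel / kernel.sum()     # weights sum to one

def smooth(channel, kernel):
    """Filter one image channel, replicating edge pixels at the border."""
    k = kernel.shape[0] // 2
    rows, cols = channel.shape
    padded = np.pad(channel.astype(np.float32), k, mode="edge")
    out = np.zeros((rows, cols), dtype=np.float32)
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += float(kernel[dy, dx]) * padded[dy:dy + rows, dx:dx + cols]
    return out

kernel = gaussian_kernel(3, 3.0)

# A flat patch with a single noisy pixel in the middle.
patch = np.full((5, 5), 100, dtype=np.uint8)
patch[2, 2] = 200
filtered = smooth(patch, kernel)
```

With sigma as large as 3 and a kernel of only 3×3, the weights are nearly uniform, so the filter behaves much like a small averaging filter. In OpenCV the equivalent operation would normally be a single call such as `cv2.GaussianBlur(image, (3, 3), sigmaX=3, sigmaY=3)`.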

Figure ‎4.5: Noise removal

(a) Close-up from original image  (b) Close-up from filtered image

4.4       Background Subtraction

Background subtraction is an important pre-processing step in computer vision. The background should be removed so that the system only performs calculations on the object of interest. Background removal can save a lot of processing power later and also reduces the complexity of feature extraction. All background pixels were set to zero, based on the assumption that the background was black. Since the background was black, pixels having an intensity close to zero were set to zero. Some background pixels had intensities around 50 due to reflections and imperfections, but the black background assumption worked even for these brighter pixels. It was noted that the bright spots on the background produced only shades of grey and not a color. The background pixel values in all three channels (RGB) were almost equal, therefore the difference between two channels was used to check whether a pixel belonged to the background. For this purpose, the blue channel was subtracted from the red channel and the absolute difference was calculated. If the difference exceeded a value of 30, the pixel was taken to belong to the fruit and not the background. The difference between the red and blue values of pixels belonging to the fruit was always more than 60. It was a simple and efficient approach.
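The red-minus-blue test described above can be sketched with NumPy as follows. The sketch assumes the image array is in RGB channel order (OpenCV loads images as BGR, in which case the channel indices would be swapped), and `remove_background` is a hypothetical helper name.

```python
import numpy as np

def remove_background(rgb, diff_threshold=30):
    """Zero every pixel whose red and blue values are nearly equal (grey)."""
    red = rgb[:, :, 0].astype(np.int16)    # signed type avoids uint8 wrap-around
    blue = rgb[:, :, 2].astype(np.int16)
    fruit_mask = np.abs(red - blue) > diff_threshold
    result = rgb.copy()
    result[~fruit_mask] = 0                # background pixels become black
    return result, fruit_mask

# Left column: a grey reflection and a dark pixel. Right column: lemon pixels.
image = np.array([[[50, 50, 48], [190, 180, 80]],
                  [[5, 4, 6], [140, 155, 65]]], dtype=np.uint8)
cleaned, mask = remove_background(image)
```

The grey reflection with intensity around 50 is removed even though it is not dark, because its red and blue values differ by only 2, while both lemon pixels are kept.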

Figure ‎4.6: Background removed

4.5       Blob Detection

A blob is an isolated object in a binary image. In image processing, blobs are used to compute shape-related features. In some operations, such as calculating the Mean Value, the binary image is passed to the function so that the Mean Value is computed only over the image area highlighted by the binary image.

After the pre-processing operations described previously, the blob was isolated (Figure 4-6). A simple thresholding operation was enough for this purpose. The image was first converted to greyscale, followed by a thresholding operation. Thresholding is an operation in which every pixel value less than a set threshold is set to zero and all other pixels are set to one (binary). The threshold value used was 30. To suppress sharp edges in the binary image (also referred to as a binary mask), a morphological closing operation was performed.
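The thresholding rule can be written in a few lines of NumPy. In this sketch the greyscale conversion is a plain average of the three channels, which is an assumption (OpenCV uses a weighted sum), and `to_binary_mask` is a hypothetical helper name.

```python
import numpy as np

def to_binary_mask(rgb, threshold=30):
    """Greyscale the image, then set pixels below the threshold to 0, others to 1."""
    gray = rgb.astype(np.float32).mean(axis=2)   # assumed: unweighted channel average
    return (gray >= threshold).astype(np.uint8)

# Background-removed image: three dark or black pixels and one lemon pixel.
image = np.array([[[0, 0, 0], [190, 180, 80]],
                  [[0, 0, 0], [20, 25, 30]]], dtype=np.uint8)
blob_mask = to_binary_mask(image)
```

Only the lemon pixel, whose grey value is 150, ends up as one in the mask. The pixel with a grey value of 25 falls below the threshold of 30 and is set to zero.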

Figure ‎4.7: Thresholded image

4.6       Morphological Operations

Morphological transformation is an operation based on the shape of the image. The transformation is usually applied to a binary image. Basic morphological transforms are dilation and erosion.

Erosion, like soil erosion, wears away the white pixels at the boundary of the shape. The foreground object should be kept in white. Erosion removes boundary pixels depending on the type of kernel used: a pixel is set to one only if all the pixels under the kernel are one. The operation increases the black region and shrinks the white foreground.

Dilation is the reverse of erosion. A pixel is set to one if any of the pixels under the kernel is one. As a result, the operation increases the white region.

The thresholded image obtained in the previous section was subjected to erosion. The operation smoothed the boundary of the shape. The kernel used in erosion was 3×3.
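The two rules above translate directly into array operations. The following NumPy sketch implements erosion and dilation for a binary mask with a square kernel. The helper names are hypothetical, and treating everything outside the image as black is an assumption that differs from the default border handling of OpenCV's erosion.

```python
import numpy as np

def erode(mask, ksize=3):
    """A pixel stays 1 only if every pixel under the kernel is 1."""
    k = ksize // 2
    rows, cols = mask.shape
    padded = np.pad(mask, k, mode="constant", constant_values=0)
    out = np.ones_like(mask)
    for dy in range(ksize):
        for dx in range(ksize):
            out &= padded[dy:dy + rows, dx:dx + cols]
    return out

def dilate(mask, ksize=3):
    """A pixel becomes 1 if any pixel under the kernel is 1."""
    k = ksize // 2
    rows, cols = mask.shape
    padded = np.pad(mask, k, mode="constant", constant_values=0)
    out = np.zeros_like(mask)
    for dy in range(ksize):
        for dx in range(ksize):
            out |= padded[dy:dy + rows, dx:dx + cols]
    return out

# A 5x5 white square inside a 7x7 black image.
square = np.zeros((7, 7), dtype=np.uint8)
square[1:6, 1:6] = 1
eroded = erode(square)

# A single white pixel grows into a 3x3 block under dilation.
dot = np.zeros((7, 7), dtype=np.uint8)
dot[3, 3] = 1
dilated = dilate(dot)
```

Erosion with the 3×3 kernel strips one pixel from every side of the square, leaving a 3×3 core, which is how small protrusions on the blob boundary are removed.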

The discussion of pre-processing ends here; feature extraction is discussed from this point on.

Figure ‎4.8: Before and after morphological operations.

4.7       Mean Value

The Mean Value of the image was the first feature. The color image comprises three channels: Red, Green, and Blue. The mean of each channel was calculated separately, so a total of three features were computed, namely:

  • Red Mean
  • Green Mean
  • Blue Mean

These features were related to ripeness measurement. The Mean Values of Red, Green, and Blue for Ripe, Semi Ripe, and Not Ripe lemons are shown in Table 4.1. A total of 100 samples were used to extract the features. It can be observed that the ripeness features are linearly separable: fixed thresholds can separate the data into the respective categories.

Ripeness     Red Mean     Green Mean    Blue Mean
Ripe         180 – 200    170 – 191     75 – 100
Semi Ripe    135 – 180    150 – 170     61 – 75
Not Ripe     <135         <150          <61

Table ‎4.1: Determining ripeness using Mean Values of Red, Green, and Blue color channels
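The observation that fixed thresholds can separate the categories can be illustrated with a small rule based on the ranges in Table 4.1. This is a sketch only and is not the classifier used in the project, which was a neural network. The voting scheme, the handling of boundary values (a value of exactly 180, 170, or 75 is counted as Ripe), and the function name `ripeness_from_means` are assumptions.

```python
def ripeness_from_means(red, green, blue):
    """Classify ripeness from channel means using the Table 4.1 thresholds."""
    def vote(value, semi_min, ripe_min):
        if value >= ripe_min:
            return 2      # Ripe
        if value >= semi_min:
            return 1      # Semi Ripe
        return 0          # Not Ripe

    votes = [vote(red, 135, 180), vote(green, 150, 170), vote(blue, 61, 75)]
    labels = ["Not Ripe", "Semi Ripe", "Ripe"]
    return labels[sorted(votes)[1]]   # the median of the three channel votes

ripe = ripeness_from_means(190, 180, 90)
semi = ripeness_from_means(150, 160, 70)
unripe = ripeness_from_means(120, 140, 50)
```

Taking the median of the three channel votes lets two agreeing channels overrule a third one that falls just across a boundary.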

4.8       Global Standard Deviation

Standard deviation is a statistical property that tells how much the data differ from their Mean Value. A fair and smooth surface has a low standard deviation, whereas a defective surface has a higher standard deviation. As described earlier, the color image is composed of three channels: Red, Green, and Blue. The standard deviation of each channel was computed separately, just like the Mean Value feature. The standard deviation features extracted in the process were:

  • Red Channel Standard Deviation
  • Green Channel Standard Deviation
  • Blue Channel Standard Deviation

It is worth mentioning that the standard deviation computed in this particular section had a global context.
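The per-channel mean and global standard deviation can be computed together over the pixels selected by the binary mask. The sketch below assumes that both statistics are restricted to the blob area and that the channels are in RGB order; `channel_statistics` is a hypothetical helper name.

```python
import numpy as np

def channel_statistics(rgb, mask):
    """Per-channel mean and standard deviation over the masked pixels only."""
    pixels = rgb[mask.astype(bool)].astype(np.float64)   # shape: N x 3
    return pixels.mean(axis=0), pixels.std(axis=0)

# A 4x4 image with a 2x2 lemon region; one lemon pixel has a darker red value.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[1:3, 1:3] = (200, 180, 80)
image[1, 1] = (100, 180, 80)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1

means, stds = channel_statistics(image, mask)
```

In this example the red channel has a mean of 175 and a standard deviation of about 43.3 because of the single darker pixel, while the uniform green and blue channels have a standard deviation of zero.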

4.9       Area

The size of the fruit is a good indication of its quality. A bigger fruit is usually considered to be of better quality. As described in the previous section, blobs are used to extract shape features. A lemon is a three-dimensional body that has volume. An image captured by a camera loses the third dimension, so volume cannot be calculated from a two-dimensional image. Since the whole shape of the object is mapped to pixels in the image, counting the pixels that describe the object produces a good approximation of the area. The area is an important feature in our system.

4.10   Centre Surround

The center-surround method, as described earlier, is a biologically inspired method that detects contrast changes in an image. It considers a local neighborhood around a pixel and computes the Mean Value of that neighborhood. If the pixel value exceeds the Mean Value by a certain margin, the pixel is said to have strong contrast. The method performed better with a neighborhood of 9×9, but this was slower to process. In image processing, using a larger filter is considered inefficient. Instead of using a larger filter, which slows down the operation, the image can be reduced in size and a smaller filter used, because a filter applied to a subsampled image covers a proportionally larger area of the original. This approach was used in the system: the image was resized to 20% in both rows and columns, and a neighborhood of 5×5 was used for the center-surround computations. Pixels with high local contrast were set to high and all other pixels were set to zero. The resulting image is a contrast-change map which was essentially zero for smooth surfaces.

Counting the non-zero elements in the contrast map obtained through the center-surround method gave the total defective area of the fruit. The more non-zero pixels there are, the more defective the fruit is.
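A possible form of the center-surround computation is sketched below with NumPy. Several details are assumptions because the text does not state them: the image is reduced to 20% by keeping every fifth pixel, the contrast test uses the absolute difference from the local mean so that dark blemishes are also caught, and the margin of 20 grey levels is a hypothetical value. The function names are also hypothetical.

```python
import numpy as np

def center_surround_map(gray, scale_step=5, ksize=5, margin=20.0):
    """Mark pixels whose value differs strongly from their local mean."""
    small = gray[::scale_step, ::scale_step].astype(np.float32)   # 20% of each dimension
    k = ksize // 2
    rows, cols = small.shape
    padded = np.pad(small, k, mode="edge")
    local_sum = np.zeros((rows, cols), dtype=np.float32)
    for dy in range(ksize):
        for dx in range(ksize):
            local_sum += padded[dy:dy + rows, dx:dx + cols]
    local_mean = local_sum / float(ksize * ksize)
    strong = np.abs(small - local_mean) > margin   # assumed contrast test
    return strong.astype(np.uint8) * 255           # high for strong contrast, else zero

def defective_area(contrast_map):
    """Number of non-zero pixels in the contrast map."""
    return int(np.count_nonzero(contrast_map))

# A uniform surface and the same surface with a small dark blemish.
healthy = np.full((100, 100), 150, dtype=np.uint8)
blemished = healthy.copy()
blemished[50:55, 50:55] = 30

area_healthy = defective_area(center_surround_map(healthy))
area_blemished = defective_area(center_surround_map(blemished))
```

The uniform surface produces an empty contrast map, whereas the 5×5 blemish survives the subsampling as a single strongly contrasting pixel and gives a defective area of one.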

   (a)                                              (b)                                                                        (c)                                           (d)

Figure ‎4.10: Local contrast

(a) Contrast map of a defective lemon, (b) a defective lemon, (c) contrast map of a healthy lemon, (d) a healthy lemon

4.11   Local Standard Deviation

Local standard deviation is a measure of non-uniformity in a given local region. The whole image was divided into 16×16 patches and the standard deviation of each patch was computed separately. The standard deviation of each patch was stored in a results matrix with as many elements as there are 16×16 patches in the original image. The elements of the results matrix were summed to obtain the local standard deviation feature value.


Figure ‎4.11: 16×16 patches to compute local standard deviation

(a) Smooth image with feature value 56,  (b) Defective lemon with feature value 125

For local standard deviation, the image was first converted to greyscale.

4.12   Training

The algorithm began with image acquisition. The acquired image was enhanced and cropped (pre-processing), and nine useful features related to ripeness, size, and defects were extracted. The next step was to train a neural network, which requires the extracted features to be pre-processed. All the samples (lemons) were placed in the chamber one by one and the features were extracted. The extracted features were arranged in a feature vector. The feature vector was a 32-bit floating-point array in which each row contained one data point (sample) and each column contained one feature. Since we used 100 lemons for training, our training feature vector had 100 rows and 9 columns.

Similarly, a label vector with 100 rows was created, which contained the labels for all samples.

4.12.1  Data Normalization

Data normalization is an important pre-processing step in machine learning. Features are normalized into a range of 0.0 to 1.0. Machine learning algorithms require the features to be arranged in a certain way: every column in the feature vector should contain a single feature. To normalize the data, every element of a column was divided by the largest value in that feature category.
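The column-wise normalization described above can be sketched with NumPy as follows. The function name `normalize_columns` and the sample values are hypothetical, and the guard against an all-zero column is an addition that the text does not mention.

```python
import numpy as np

def normalize_columns(features):
    """Divide every column by its largest value so features lie in 0.0 to 1.0."""
    features = np.asarray(features, dtype=np.float32)
    column_max = features.max(axis=0)
    column_max[column_max == 0] = 1.0     # avoid dividing an all-zero column by zero
    return features / column_max

# Two hypothetical samples with two features each: red mean and area in pixels.
raw = [[180.0, 4000.0],
       [90.0, 2000.0]]
normalized = normalize_columns(raw)
```

Each column is scaled independently, so a feature with large raw values, such as the area in pixels, does not dominate features with small values. For prediction on new samples, the same column maxima found on the training data would normally be reused.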

4.12.2  Neural Network

A Multilayer Perceptron was used for training and prediction of fruit quality. A Multilayer Perceptron is a feed-forward artificial neural network. It has at least three layers: an input layer, a hidden layer, and an output layer. Each layer consists of nodes, each of which is a neuron. Apart from the input layer, every neuron uses an activation function. OpenCV implements this as a class named ann_mlp. A Multilayer Perceptron uses nonlinear activation functions and can distinguish data that are not linearly separable. It is trained with a supervised learning technique known as backpropagation. The input layer must have as many neurons as there are input features. The number of neurons in the hidden layer is determined experimentally through trial and error, whereas the output layer should have as many neurons as there are outputs.

Table ‎4.2: Parameters to initialize the neural network

Parameter Name          Parameter Value
Input layer neurons     9
Hidden layer neurons    36
Output layer neurons    3
Training method         Backpropagation
Activation function     Sigmoid
Termination criteria    100 iterations max or when error reduces to 10^-6

Table 4.2 shows the parameters used to initialize the neural network.

The learned weights were saved to storage as a ‘.yaml’ file for future use.
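To make the layer sizes in Table 4.2 concrete, the following NumPy sketch runs a single forward pass through a network with 9 inputs, 36 hidden neurons, and 3 outputs. It is not the OpenCV implementation used in the project: the weights are random rather than trained, the standard logistic sigmoid is assumed as the activation, and all names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    """Logistic activation, mapping any real value into the range 0 to 1."""
    return 1.0 / (1.0 + np.exp(-x))

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: input layer -> hidden layer -> output layer."""
    hidden = sigmoid(x @ w_hidden + b_hidden)
    return sigmoid(hidden @ w_out + b_out)

rng = np.random.default_rng(0)
w_hidden = rng.normal(0.0, 0.1, size=(9, 36))   # untrained placeholder weights
b_hidden = np.zeros(36)
w_out = rng.normal(0.0, 0.1, size=(36, 3))
b_out = np.zeros(3)

sample = rng.random(9)                  # one normalized feature vector
scores = mlp_forward(sample, w_hidden, b_hidden, w_out, b_out)
predicted_class = int(np.argmax(scores))   # index of the strongest output neuron
```

Backpropagation adjusts the two weight matrices so that, for a training sample, the output neuron of the correct class produces the largest score.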

4.13   Features Correlation

In this section, the correlation between features is presented in the form of scatter plots. Figure 4-12 shows the correlation between the red and green Mean Values, whereas Figure 4-13 shows the correlation between the red and blue Mean Values. The plots show a high correlation between the Red, Green, and Blue Mean Values. One or two of these features could be excluded from the training without affecting the accuracy very much. The reason for including all of them was to get the maximum possible efficiency.

Figure ‎4.12: Red-Green mean correlation

All values were normalized between 0 and 1.

Figure ‎4.13: Red-Blue mean correlation

4.14   Neural Network Performance

A backpropagation multilayer perceptron was trained using 100 samples. The data set was split into three subsets, of which only the first was used for training. The other two subsets were used for validation and testing. Seventy samples were used for training, fifteen for validation, and fifteen for testing. The neural network performance was observed based on these sets. Several statistical tools in MATLAB were used to analyse the neural network performance.
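The 70/15/15 division can be sketched in plain Python. Whether the samples were assigned to the subsets at random is not stated in the text, so the shuffling step is an assumption, and `split_indices` is a hypothetical helper name.

```python
import random

def split_indices(n_samples, n_train, n_val, seed=0):
    """Shuffle sample indices and cut them into train, validation, and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)     # assumed: random assignment
    train = indices[:n_train]
    validation = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, validation, test

train_idx, val_idx, test_idx = split_indices(100, 70, 15)
```

Keeping the three index sets disjoint ensures that the validation and test errors are measured on lemons the network has never seen during training.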

4.15   Training Performance

The neural network achieved its best performance after only 21 iterations, as shown in figure 5-1. The graph shows three curves: training, validation, and test. At the beginning the model started to improve, and after 15 iterations it produced its best validation performance. In MATLAB, once the best validation performance is achieved, the training is continued for six more iterations and then stopped. It can be observed that after fifteen iterations the model started to overfit the data and the validation curve started to rise.

4.15.1  Cross Entropy Error

Figure 5-14 shows the error rate for the training, testing, and validation data sets. It can be seen that the training, testing, and validation errors decrease with the number of training iterations, and an optimum solution was found after 21 iterations.

The cross entropy plot measures the quality of the neural network predictions rather than the classification error. The classification error only shows the number of misclassifications, whereas the cross entropy shows the quality of the prediction. The training error after 21 iterations was reduced to 4.3%, whereas the testing error was about 6%.
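The difference between classification error and cross entropy can be illustrated with two hypothetical predictions for a sample whose true class is the first one. The probability values below are made up for illustration, and `cross_entropy` is a hypothetical helper name.

```python
import math

def cross_entropy(probabilities, true_class):
    """Cross entropy of one prediction: minus the log of the true-class probability."""
    return -math.log(probabilities[true_class])

def predicted_class(probabilities):
    """Index of the class with the highest predicted probability."""
    return max(range(len(probabilities)), key=lambda k: probabilities[k])

confident = [0.90, 0.05, 0.05]   # hypothetical network outputs
hesitant = [0.40, 0.30, 0.30]

loss_confident = cross_entropy(confident, 0)
loss_hesitant = cross_entropy(hesitant, 0)
```

Both predictions pick the correct class, so they contribute the same classification error of zero, but the hesitant prediction has a much larger cross entropy (about 0.92 against about 0.11). This is why the cross entropy curve keeps carrying information after the number of misclassifications has stopped changing.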

Figure ‎4.15: Performance plot, neural network

4.16    Classification Error and Accuracy

A confusion matrix is a very simple tool used to analyse the performance of a classifier. The confusion matrix for our classification problem is shown in figure 5-2.

The left confusion matrix shows the overall classification accuracy of the neural network, which is about 94%. Two out of 27 good quality lemons were classified as average quality and one was classified as defective or unripe; the third category combines both defective and green lemons. There was no misclassification of average lemons, therefore an accuracy of 100% was obtained for this class, shown in column four. Three of the defective or unripe lemons were misclassified: one was classified as good and two were classified as average quality.

The fourth row of the confusion matrix shows the true positive and false positive rates. The figure shows that a total of 25 lemons were classified as good quality and only one lemon in this class was a false positive, compared to the average class where, out of 22 lemons, 4 were false positives. Fifty-two lemons were classified as defective or unripe, out of which only one was a false positive. No lemon from the average class was misclassified into other classes, but this class also has the highest number of false positives.
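The counts quoted in the two paragraphs above are enough to rebuild the confusion matrix and check the reported figures. The matrix below is a reconstruction from the prose and not a copy of the figure; the orientation (rows are actual classes, columns are predicted classes) is an assumption, and the function names are hypothetical. The quoted counts add up to 99 samples.

```python
# Rows: actual class, columns: predicted class.
# Order: good, average, defective or unripe.
confusion = [
    [24, 2, 1],    # 27 good lemons: 2 predicted average, 1 predicted defective
    [0, 18, 0],    # average lemons: none misclassified
    [1, 2, 51],    # defective or unripe: 1 predicted good, 2 predicted average
]

def accuracy(matrix):
    """Correct predictions (the diagonal) divided by all samples."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)

def recall(matrix, k):
    """Share of the actual class k that was classified correctly."""
    return matrix[k][k] / sum(matrix[k])

def precision(matrix, k):
    """Share of the predictions of class k that were correct."""
    return matrix[k][k] / sum(row[k] for row in matrix)

overall = accuracy(confusion)
average_precision = precision(confusion, 1)
average_recall = recall(confusion, 1)
```

The reconstructed matrix gives an overall accuracy of 93 / 99, which is about 94%. The column sums reproduce the 25, 22, and 52 lemons assigned to each class, and the average class shows 100% recall but the lowest precision (18 out of 22), as noted in the text.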

4.17   Hardware Simulation Results

A dual H-Bridge was made to control two motors: one for the actuator and the other inside the imaging chamber, used to stop the lemon for image capturing. An H-Bridge controls the direction of a DC motor. It was a two-input H-Bridge, the simulation results for which are presented in Table 4.3.

Table ‎4.3: H-Bridge responses

Input S1    Input S2    Output
0           0           Brake
0           1           Forward
1           0           Reverse
1           1           Brake


Applying the same signal at both inputs of the H-Bridge makes the motor brake.
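The response in Table 4.3 can be expressed as a small lookup function, for example to decide which input levels to drive for a desired motor action. This is a sketch of the logic only; the function name `h_bridge_output` is hypothetical and no GPIO access is shown.

```python
def h_bridge_output(s1, s2):
    """Motor response of the two-input H-Bridge, following Table 4.3."""
    if s1 == s2:
        return "Brake"        # equal inputs (0,0 or 1,1) brake the motor
    return "Forward" if s2 == 1 else "Reverse"

# Reproduce the full truth table.
truth_table = {(s1, s2): h_bridge_output(s1, s2)
               for s1 in (0, 1) for s2 in (0, 1)}
```

Because both equal-input combinations brake the motor, the controller can stop the actuator by driving both pins to the same level.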

4.18   Pulse Width Modulation for Motor Speed Control

The circuit diagram for DC motor speed control is shown in Figure 3.8. The speed was controlled using the pulse width modulation technique. The frequency for pulse width modulation was set to 5kHz with a duty cycle of 50 percent.

The input signal was provided by the Raspberry Pi, and the circuit amplified the signal to control the motor through a power MOSFET.

Figure 4.17 shows the simulation results of pulse width modulation for motor speed control. The simulation was performed in Proteus with a virtual oscilloscope. The oscilloscope settings are shown in Figure 4.18 below.

Figure 4.17: Proteus simulation results (input signal: solid line; output signal: dotted line)

Figure ‎4.18: Virtual oscilloscope settings

Chapter 5          Discussion and conclusion

This chapter describes results, areas to improve, and a conclusion.

5.1        Conclusion

The project was about sorting and grading fruits using image processing and computer vision techniques. The fruit selected for this purpose was lemon. There are five kinds of lemons available in Pakistan. Sorting and grading is an important post-harvest process which is a tiring job for humans. Human workers can produce inconsistent results, which can lead to food wastage and financial problems.

The objective of the project was to eliminate human intervention in decision making. A physical system was designed to make the whole sorting process automatic. The physical system consists of a conveyor belt, an image capturing chamber, an actuator, and sorting bins. The fruit is placed in the image capturing chamber, where the image is taken, and the fruit is then automatically placed on the conveyor belt. The system uses a Raspberry Pi camera to capture the image, which is then processed further.

The first major step is pre-processing, in which the image is prepared for the extraction of useful features. The image is first cropped to focus only on the main object of interest, which is the lemon in this case, and the useless part is eliminated. The background is then removed to make the later algorithms less complex. Noise is removed using a Gaussian filter of size 3×3, which smooths the image and reduces high-frequency components. Pre-processing is complete at this stage.

In the next step, the pre-processed image was used to extract useful features. The image was split into three channels in the RGB colour space and each channel was treated individually during feature extraction. The Mean Value of all three channels was computed, followed by the global standard deviation of all three channels. The Mean Value determines the ripeness of the fruit, whereas the standard deviation measures the surface irregularities in a global context. The area of the fruit was computed, which determines its size. A local contrast map was obtained using the centre surround method, which approximated the surface defects well. The standard deviation was computed again, but this time at a local level in 16×16 patches. A total of nine features were extracted at this stage.

The third and final major step in the project was to train a machine learning algorithm and store its output for future use. A feed-forward backpropagation multilayer perceptron was trained using 99 samples of lemons obtained from the local market. The learned data was stored for later use. At the testing stage, the sample was placed inside the chamber, where the picture was captured and the features were extracted as described earlier. The features of the test sample were fed to the input of the neural network and the class of the lemon was predicted. The lemon was then dropped onto the belt and a command was issued to the sorting actuator to push the fruit into the respective bin.

The system showed an accuracy of 94%.

5.2       Comparison with other models

(Jhawar, December 2015) classified citrus into four classes based on ripeness only; the classes were ripe, semi ripe, green, and over ripe. He obtained an accuracy of 97.98%. Our model outperformed this model in terms of ripeness measures alone: our approach had an accuracy of 100% for classification based on ripeness.

(Mohana & C.J., 2015) proposed a method to detect defects as well as stem ends. Stem end detection was implemented to lower the chance of stem ends being detected as defects. The system performed at an accuracy of 97.5% for defect detection and 95% for stem end detection. Our approach detected the defective lemons with an accuracy of 98%.

(JiangboLi, XiuqinRao, FujieWang, WeiWu, & YibinYing, 2013) proposed a method for defect detection and used light intensity correction for better results. They considered stem detection to increase accuracy and prevent stem ends from being detected as defects. They achieved an accuracy of 98.9%.

(Khoje, 2013) used the curvelet transform and pattern recognition for defective fruit detection. He used guava and lemon for testing. The approach classified the lemons with an accuracy of 91.72%. Our approach is superior to this technique.

(Momin, 2013) used a fluorescent lamp and spectrographic techniques to detect defects on lemons. The method was only for defect detection and the paper does not report any accuracy values, although the method was able to detect the location of the defect. Our method uses extracted features to determine whether there is a defect on the fruit surface or the skin is clean, and hence does not provide any information about the location of the defect. Apart from location, there is no baseline, such as classification accuracy, for comparing the two methods. Our approach is well suited to the domain of our project, which is to detect defective fruits.

Our approach considers three attributes, area, ripeness and defects for lemon classification whereas all other models deal with one or two attributes only. The table below compares their accuracy with our model.

Table ‎5.1: Comparison of our method with existing techniques

Ripeness Based Classification Accuracy
(Jhawar, December 2015) Method    Our Method
97.98%                            100%

Defect Based Classification Accuracy
(Mohana & C.J., 2015)    (Khojasteh, 2010)    (JiangboLi 2013)    Our Method
98%                      91.78%               98.9%               98%


The overall accuracy of our model, as described earlier, is 94% when all attributes are considered at once. Our system is more practical and general for grading and sorting, while the other approaches are limited to one or two features.

5.3       Discussion

In this section, the limitations of the system are discussed, followed by the areas to improve. Our system performed at 94% accuracy. Most of the fruit and vegetable sorting methods already proposed have accuracies in the range of 90-100%. One reason for our result could be the small data set of only 99 samples collected from the same source. Our model was trained on a small data set and might not perform at the stated accuracy in the real world.

5.3.1     Limitations and Areas to Improve

The scope of our project was to build a prototype, able to sort fruits automatically without human interventions. Our system does have many limitations discussed here.   Smaller Data Set

The system performed well for given training and testing set. Training sample was obtained from local market only. The training set was very small it did not cover all different kind of lemons found in Pakistan. The model can be thought of a model which was trained only on the lemons available in mid-summer. Different kinds of lemons become available in market in different seasons. According Pakistan Agricultural Research Council, there are five major types of lemons cultivated in Pakistan (Cultvation of lemon, n.d.). Out model does not consider all the types as well as the geographically cultivated lemons.   ARM CPU

Our project embeds a Raspberry Pi board, which has an ARM64-based CPU. This ARM CPU is well behind AMD64 processors in performance, which ruled out region-based segmentation techniques because they are very time consuming on the ARM architecture. Region-based segmentation could potentially increase the reliability of defect segmentation.

One-Sided Image

The second biggest limitation of our project was that it used only one camera, mounted at the top inside the image-capturing chamber. It could only take an image of the top side of the fruit, and there was no way to scan all sides. The bottom side of the lemon could not be captured, so defects on the bottom side were always missed.

Key Improvements

The previous paragraphs discussed the limitations of our project. Possible improvements are stated in the following lines:

  • Use of at least two cameras to maximize the scanned fruit area.
  • Use of a faster x86 or AMD64-based processor for faster real-time performance, enabling the use of region-based segmentation algorithms.
  • Collection of a large data set from different locations and seasons, covering the whole variety of lemons in Pakistan.
  • Trying other machine learning techniques and comparing the results to find out which one performs best for this particular case.
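To give a sense of what the region-based segmentation mentioned in the list involves, here is a minimal sketch of seeded region growing, one common region-based technique. The thesis does not say which algorithm was considered, so the choice of region growing, the names growRegion and regionArea, and the tolerance parameter are all assumptions for illustration. The sketch uses only the C++ standard library rather than OpenCV.

```cpp
#include <cassert>
#include <cstdlib>
#include <queue>
#include <utility>
#include <vector>

// Grayscale image stored row by row; values are expected in 0..255.
using Image = std::vector<std::vector<int>>;

// Grows a region from the seed pixel: a 4-connected neighbour joins the
// region when its intensity differs from the seed intensity by at most
// `tolerance`. Returns a mask of the same size (1 = inside the region).
Image growRegion(const Image& img, int seedRow, int seedCol, int tolerance) {
    assert(!img.empty() && !img[0].empty());
    const int rows = static_cast<int>(img.size());
    const int cols = static_cast<int>(img[0].size());
    assert(seedRow >= 0 && seedRow < rows && seedCol >= 0 && seedCol < cols);

    Image mask(rows, std::vector<int>(cols, 0));
    const int seedValue = img[seedRow][seedCol];
    std::queue<std::pair<int, int>> frontier;
    frontier.push({seedRow, seedCol});
    mask[seedRow][seedCol] = 1;

    const int dRow[4] = {-1, 1, 0, 0};
    const int dCol[4] = {0, 0, -1, 1};
    while (!frontier.empty()) {
        const std::pair<int, int> current = frontier.front();
        frontier.pop();
        for (int k = 0; k < 4; ++k) {
            const int r = current.first + dRow[k];
            const int c = current.second + dCol[k];
            if (r < 0 || r >= rows || c < 0 || c >= cols) continue;  // outside image
            if (mask[r][c] != 0) continue;                           // already in region
            if (std::abs(img[r][c] - seedValue) > tolerance) continue;
            mask[r][c] = 1;
            frontier.push({r, c});
        }
    }
    return mask;
}

// Counts the pixels that belong to a grown region.
int regionArea(const Image& mask) {
    int area = 0;
    for (const auto& row : mask)
        for (int v : row) area += v;
    return area;
}
```

Growing a region from a seed placed on a dark spot yields the connected defect area directly. The cost is a queue operation and several comparisons per visited pixel, repeated for every seed, which is the kind of workload that is slow on a small embedded CPU.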


References

Bipan Tudu, C. S. (2012). An Automated Machine Vision Based System for Fruit Sorting and Grading. Sixth International Conference on Sensing Technology (ICST).

Blasco, S. C.-N. (2014). Optimised computer vision system for automatic pre-grading of citrus fruit in the field using a mobile platform. Precision Agriculture.

Clowting, E. (2007). Robotic apple picker relies on a camera inside the gripper and off-the-shelf components and software. Retrieved from vision systems design:

Cui, Y., Wang, Y., Chen, S., & Ping. (2013). Study on HSI Color Model-Based Fruit Quality Evaluation. 3rd International Conference on Image and Signal Processing (CISP2010).

Cultvation of lemon. (n.d.). Retrieved from Pakistan Agricultural Research Council:

Frintrop, S. (2006). A Visual Attention System for Object Detection and Goal Directed Search.

Huang, W., Zhang, B., Gong, L., & Li, J. (2015). Computer vision detection of Defective apples using Automatic lightness correction and weighted RVM Classifier. Elsevier-Journal of Food Engineering, 146.

Iqbal, S. M. (2016). Classification of Selected Citrus Fruits Based on Color Using Machine Vision System. International Journal of Food Properties.

Jahnsa, G. (2001). Measuring image analysis attributes and modelling fuzzy consumer aspects for tomato quality grading. Computers and Electronics in Agriculture.

Jhawar, J. (December 2015). Orange Sorting by Applying Pattern Recognition on Colour Image. International Conference on Information Security & Privacy (ICISP2015), (p. 7). Nagpur, INDIA: ScienceDirect.

Jiangbo Li, Xiuqin Rao, Fujie Wang, Wei Wu, & Yibin Ying. (2013). Automatic detection of common surface defects on oranges using combined lighting transform and image ratio methods. Postharvest Biology and Technology, 59-69.

Khojasteh, M. (2010). Development of a lemon sorting system based on color and size.

Khoje, S. A. (2013). Automated Skin Defect Identification.

Kondo, N., & Ting, K. C. (1998). Robotics for bioproduction systems.

Li, J. (2013). Automatic detection of common surface defects on oranges using combined lighting transform and image ratio methods. Postharvest Biology and Technology.

López-Garcíaa. (August 2013). Detection of Visual Defects in Citrus Fruits: Multivariate Image Analysis vs Graph Image Segmentation. 15th International Conference on Computer Analysis of Images and Patterns. York, UK:

Mohana, & C.J., P. (2015). Automatic Detection of Surface Defects on Citrus Fruit based on Computer Vision Techniques. I.J. Image, Graphics and Signal Processing.

Momin, A. (2013). Identification of UV-Fluorescence Components for Detecting Peel Defects of Lemon and Yuzu using Machine Vision.

Ohali, Y. A. (2011). Computer vision based date fruit Grading system: Design and implementation. Journal of King Saud University, 29–36.

Paquino25. (n.d.). Retrieved from

Pauly, L., & Sankar, D. (2015). A New Method for Sorting And Grading Of Mangos Based On Computer Vision. IEEE.

Seema, Kumar, A., & Gill, G. S. (2015). Computer Vision based Model for Fruit Sorting. IEEE, Volume 2.

Swapnil S. Pawar, & Dale, M. P. (2016). Computer Vision Based Fruit Detection and Sorting System. Special Issue on International Journal of Electrical, Electronics and Computer Systems. Pune.

Szeliski, R. (September 3, 2010). Computer Vision: Algorithms and Applications.

Vonikakis, V., & Winkler, S. (n.d.). A center-surround framework for spatial image processing. Advanced Digital Sciences Center (ADSC), Singapore.

Wu, D., & Sun, D.-W. (2013). Color measurements by Computer vision for food quality control. Trends in Food Science & Technology.


C++ Code:

// Include necessary libraries
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/ml/ml.hpp>
#include <fstream>
#include <iostream>
#include <ctime>
#include <raspicam/raspicam_cv.h>
#include <stdlib.h>
#include <unistd.h>
#include <wiringPi.h>
#include <softPwm.h>
#include <thread>

// Pin numbers
#define motor 4         // Conveyor belt drive
#define actuator_1_p 1  // Mechanical pusher
#define actuator_1_h 0
#define door_c 2        // Door inside chamber
#define door_o 3

using namespace cv;
using namespace std;
using namespace ml;

// Global variables
float feature_area = 0;  // Fruit area
float features[1][9] = {0, 0, 0, 0, 0, 0, 0, 0, 0};
int motor_value = 2;

// Function prototypes
void find_dct(Mat imgCropped);
Mat crop_1(Mat image);
Mat mask_1(Mat image);  // Get binary mask to select only the foreground
void mean_std(Mat image, Mat mask);
void find_area(Mat mask);
void area_defect(Mat image, Mat mask);
PI_THREAD(actuator_1);
PI_THREAD(actuator_2);
PI_THREAD(door);
int t1_v = 0;  // Communicates with the first actuator thread
int t2_v = 0;  // Communicates with the second actuator thread
int d_v = 0;   // Communicates with the door thread

// Initialises the Raspberry Pi pins
void setup() {
    wiringPiSetup();
    pinMode(door_o, OUTPUT);
    pinMode(door_c, OUTPUT);
    pinMode(motor, OUTPUT);
    pinMode(actuator_1_p, OUTPUT);
    pinMode(actuator_1_h, OUTPUT);
    digitalWrite(motor, LOW);
    digitalWrite(actuator_1_p, LOW);
    digitalWrite(door_o, LOW);
    digitalWrite(door_c, HIGH);
    digitalWrite(actuator_1_h, HIGH);
    delay(300);
    digitalWrite(actuator_1_h, LOW);
    digitalWrite(door_c, LOW);
    softPwmCreate(motor, 0, 3);  // Pin, initial value, range
}

// Main function
int main() {
    setup();
    // The following three statements start the three worker threads
    piThreadCreate(actuator_1);
    piThreadCreate(actuator_2);
    piThreadCreate(door);
    raspicam::RaspiCam_Cv Camera;             // Camera object
    Camera.set(CV_CAP_PROP_FORMAT, CV_8UC3);  // Capture 8-bit, 3-channel frames
    Mat imgOriginal, imgCropped, mask, imGray, imgWindow, imgCoeff;
    Scalar mean_1;
    namedWindow("cropped", WINDOW_KEEPRATIO);
    if (!Camera.open()) {
        cerr << "Error opening the camera" << endl;
        return -1;
    }
    cout << "Make Sure Chamber is Empty" << endl;
    usleep(3000000);  // Delay so that the camera starts properly
    // Set camera parameters
    Camera.set(CAP_PROP_EXPOSURE, 3);
    Camera.set(CAP_PROP_WHITE_BALANCE_BLUE_U, 0.0155);
    Camera.set(CAP_PROP_WHITE_BALANCE_RED_V, 0.0165);
    // Calibration complete
    cout << "Camera Calibrated" << endl;
    usleep(2000000);                   // Let the camera stabilise
    softPwmWrite(motor, motor_value);  // Start the belt
    // Initialise the neural network from the trained model file
    Ptr<ANN_MLP> ANN = ANN_MLP::load<ANN_MLP>("ANN_MLPtrained.yaml");
    int key = 0;
    // Main loop
    while (1) {
        // Check for user input: SPACE pauses the belt, ESC exits
        key = waitKey(1);
        if (key == 32) {
            softPwmWrite(motor, 0);
            key = waitKey();  // Wait for another key press while paused
            if (key == 27) {
                break;
            }
            softPwmWrite(motor, motor_value);
        }
        if (key == 27) {
            break;
        }
        Camera.grab();
        Camera.retrieve(imgOriginal);               // Get image
        imgCropped = crop_1(imgOriginal);           // Crop image
        imshow("cropped", imgCropped);
        mask = mask_1(imgCropped);                  // Get thresholded image
        cvtColor(imgCropped, imGray, CV_BGR2GRAY);  // Convert to grayscale
        mean_1 = mean(imGray, mask);                // Mean inside the mask, to check whether a fruit is present
        // If a fruit is detected
        if (mean_1[0] > 30) {
            cout << "mean " << mean_1 << endl;
            usleep(1000000);  // Wait for the fruit to stop
            // Acquire image
            Camera.grab();
            Camera.retrieve(imgOriginal);      // Get image
            // Preprocessing
            imgCropped = crop_1(imgOriginal);  // Crop image
            mask = mask_1(imgCropped);         // Thresholded image
            Mat mrg[] = {mask, mask, mask};
            Mat mask3;
            merge(mrg, 3, mask3);
            bitwise_and(imgCropped, mask3, imgCropped);  // Remove background
            // Feature extraction
            mean_std(imgCropped, mask);     // Mean of all three channels and global standard deviation
            find_area(mask);                // Fruit area
            find_dct(imgCropped);           // Local standard deviation
            area_defect(imgCropped, mask);  // Local contrast differences (defective area)
            imshow("Fruit", imgCropped);
            // Create feature vector
            Mat testData(1, 9, CV_32FC1, features);
            cout << " | " << testData << endl;
            Mat response_ann;
            float response_ann_f = 0;
            int response_ann_idx = 0;  // Initialised so that it is defined even if no output exceeds 0
            // Prediction using the neural network: pick the output with the largest response
            ANN->predict(testData, response_ann);
            for (int z = 0; z < 3; z++) {
                if (response_ann_f < response_ann.at<float>(0, z)) {
                    response_ann_f = response_ann.at<float>(0, z);
                    response_ann_idx = z;
                }
            }
            cout << "  => ANN_Response: " << response_ann_idx << endl;
            d_v = 1;  // Tell the door thread to open the door
            // Issue a command to an actuator based on the decision
            if (response_ann_idx == 1) {
                t2_v = 1;
            }
            if (response_ann_idx == 2) {
                t1_v = 1;
            }
            piLock(1);  // Handshake with the worker threads, which call piUnlock(1) when they pick up a command
        }
        usleep(1000000);
    }
    // The programme terminates when the ESC key is detected
    cout << "Exiting" << endl;
    // Signal the threads to terminate and stop the motor
    t1_v = 2;
    t2_v = 2;
    d_v = 2;
    softPwmStop(motor);
    waitKey(100);
    return 0;
}

// First thread function: pushes defective fruit
PI_THREAD(actuator_1) {
    while (true) {
        if (t1_v == 1) {
            piUnlock(1);
            cout << "First actuator" << endl;
            t1_v = 0;
            delay(900);
            digitalWrite(actuator_1_p, HIGH);
            delay(300);
            digitalWrite(actuator_1_p, LOW);
            delay(100);
            digitalWrite(actuator_1_h, HIGH);
            delay(300);
            digitalWrite(actuator_1_h, LOW);
        } else if (t1_v == 2) {
            cout << "Closing Thread1" << endl;
            break;  // Leave the loop so that the thread actually ends
        }
        delay(10);
    }
    return NULL;
}

// Second thread function: controls the actuator for pushing B category fruit
PI_THREAD(actuator_2) {
    while (true) {
        if (t2_v == 1) {
            piUnlock(1);
            cout << "Second actuator" << endl;
            t2_v = 0;
            delay(1);
            digitalWrite(actuator_1_p, HIGH);
            delay(300);
            digitalWrite(actuator_1_p, LOW);
            delay(600);
            digitalWrite(actuator_1_h, HIGH);
            delay(300);
            digitalWrite(actuator_1_h, LOW);
        } else if (t2_v == 2) {
            cout << "Closing Thread2" << endl;
            break;  // Leave the loop so that the thread actually ends
        }
        delay(10);
    }
    return NULL;
}

// Third thread function: opens and closes the door to place the fruit
// on the belt after the image is taken
PI_THREAD(door) {
    while (true) {
        if (d_v == 1) {
            piUnlock(1);
            cout << "Door" << endl;
            d_v = 0;
            digitalWrite(door_o, HIGH);
            delay(100);
            digitalWrite(door_o, LOW);
            delay(500);
            digitalWrite(door_c, HIGH);
            delay(100);
            digitalWrite(door_c, LOW);
        } else if (d_v == 2) {
            cout << "Closing Thread3" << endl;
            break;  // Leave the loop so that the thread actually ends
        }
        delay(10);
    }
    return NULL;
}

// Finds the local standard deviation by dividing the image into small
// windows and computing the standard deviation of each window
void find_dct(Mat imgCropped) {
    Mat imgCoeff, imgWindow;
    int windows_n_rows = 16;  // Height of window
    int windows_n_cols = 16;  // Width of window
    int StepSlide = 16;
    cvtColor(imgCropped, imgCropped, CV_BGR2GRAY);
    imgCoeff = Mat::zeros(Size(18, 25), CV_8U);  // One cell per window: 18 columns, 25 rows
    int x = 0, y = 0;
    Scalar mean_, stdDev;
    for (int row = 0; row <= (imgCropped.rows - windows_n_rows); row += StepSlide) {
        for (int col = 0; col <= (imgCropped.cols - windows_n_cols); col += StepSlide) {
            Rect windows(col, row, windows_n_cols, windows_n_rows);  // x, y, width, height
            imgWindow = imgCropped(windows);
            meanStdDev(imgWindow, mean_, stdDev);
            if (stdDev[0] > 0) stdDev[0] -= 1;
            imgCoeff.at<uchar>(y, x) = static_cast<uchar>(stdDev[0]);
            x++;
        }
        x = 0;
        y++;
    }
    Mat element = getStructuringElement(MORPH_ELLIPSE, Size(3, 3), Point(-1, -1));
    erode(imgCoeff, imgCoeff, element, Point(-1, -1), 2);
    Scalar g = sum(imgCoeff);
    features[0][8] = g[0] / 100;
}

// Finds local contrast differences to calculate the defective area
void area_defect(Mat Image, Mat mask) {
    GaussianBlur(Image, Image, Size(3, 3), 5, 5);
    cvtColor(Image, Image, CV_BGR2GRAY);
    Image.convertTo(Image, -1, 1.5, 0);  // Scale intensities by a factor of 1.5
    float resiz = 0.20;
    Mat Element = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));  // Kernel for erosion
    resize(mask, mask, Size(), resiz, resiz);
    erode(mask, mask, Element, Point(-1, -1), 4);
    threshold(mask, mask, 1, 255, THRESH_BINARY);
    resize(Image, Image, Size(), resiz, resiz);
    // Start from 255 (not defective) so that the 2-pixel border, which the
    // loop below never visits, has a defined value
    Mat lbp(Image.rows, Image.cols, CV_8UC1, Scalar(255));
    for (int row = 2; row < Image.rows - 2; row++) {
        for (int col = 2; col < Image.cols - 2; col++) {
            // Sum of the 5x5 neighbourhood centred on (row, col)
            float center = 0;
            for (int dr = -2; dr <= 2; dr++) {
                for (int dc = -2; dc <= 2; dc++) {
                    center += Image.at<uchar>(row + dr, col + dc);
                }
            }
            float localMean = center / 25;
            float pixel = Image.at<uchar>(row, col);
            // Mark the pixel as defective (0) when it differs from the local mean by more than 10
            if ((pixel - localMean > 10) || (localMean - pixel > 10))
                lbp.at<uchar>(row, col) = 0;
            else
                lbp.at<uchar>(row, col) = 255;
        }
    }
    bitwise_and(mask, lbp, lbp);
    bitwise_not(lbp, lbp, mask);  // Inside the mask, defective pixels become 255
    imshow("LBP", lbp);
    float defective_area = countNonZero(lbp);
    defective_area = defective_area / 200;
    features[0][7] = defective_area;
}

// Finds the colour channel means and the global standard deviation
void mean_std(Mat image, Mat mask) {
    Mat mean, stdDeviation;
    meanStdDev(image, mean, stdDeviation, mask);
    float redMean, greenMean, blueMean, redStd, greenStd, blueStd;
    // Channels are in BGR order: index 0 is blue, 1 is green, 2 is red
    redMean = mean.at<double>(2, 0);
    greenMean = mean.at<double>(1, 0);
    blueMean = mean.at<double>(0, 0);
    redStd = stdDeviation.at<double>(2, 0);
    greenStd = stdDeviation.at<double>(1, 0);
    blueStd = stdDeviation.at<double>(0, 0);
    // Scale the features
    redMean = redMean / 195;
    greenMean = greenMean / 195;
    blueMean = blueMean / 110;
    redStd = redStd / 30;
    greenStd = greenStd / 30;
    blueStd = blueStd / 30;
    features[0][1] = redMean;
    features[0][2] = greenMean;
    features[0][3] = blueMean;
    features[0][4] = redStd;
    features[0][5] = greenStd;
    features[0][6] = blueStd;
}

// Finds the area of the fruit
void find_area(Mat mask) {
    Mat image = mask.clone();
    vector<vector<Point> > contour;
    findContours(image, contour, RETR_EXTERNAL, CHAIN_APPROX_SIMPLE, Point(0, 0));
    if (contour.size() < 1) {  // No fruit detected
        return;
    }
    // The largest external contour is taken to be the fruit
    float largestarea = 0;
    for (size_t i = 0; i < contour.size(); i++) {
        double area = contourArea(contour[i]);
        if (area > largestarea) {
            largestarea = area;
        }
    }
    feature_area = largestarea;
    float scale_area = feature_area / 36000;
    features[0][0] = scale_area;
}

// Returns a binary mask that selects only the foreground (the fruit)
Mat mask_1(Mat image) {
    GaussianBlur(image, image, Size(3, 3), 3, 3);
    vector<Mat> channel;
    split(image, channel);
    absdiff(channel[0], channel[1], image);  // Absolute difference between the blue and green channels
    threshold(image, image, 30, 255, THRESH_BINARY);
    Mat Element = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));  // Kernel for erosion
    erode(image, image, Element, Point(-1, -1), 2);
    return image;
}

// Crops the image to a fixed window around the image centre
Mat crop_1(Mat image) {
    Rect rec(Point((image.cols / 2) - 79, (image.rows / 2) - 179),
             Point((image.cols / 2) + 220, (image.rows / 2) + 230));
    image = image(rec);
    return image;
}
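
The local-contrast test used in area_defect can be tried without OpenCV or the Raspberry Pi hardware. The sketch below repeats only that one rule on a plain 2-D array: a pixel is flagged when it differs from the mean of its 5x5 neighbourhood by more than a threshold (10 in the listing above). The names Gray, localContrastDefects and countDefects are hypothetical, and the masking, blurring and scaling steps of the original function are left out on purpose.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Grayscale image stored row by row; values are expected in 0..255.
using Gray = std::vector<std::vector<int>>;

// Flags a pixel (1) when it differs from the mean of its 5x5 neighbourhood
// by more than `threshold`. Pixels within 2 of the border are never
// visited and stay 0.
Gray localContrastDefects(const Gray& img, double threshold) {
    const int rows = static_cast<int>(img.size());
    const int cols = rows > 0 ? static_cast<int>(img[0].size()) : 0;
    Gray defects(rows, std::vector<int>(cols, 0));
    for (int r = 2; r < rows - 2; ++r) {
        for (int c = 2; c < cols - 2; ++c) {
            int sum = 0;
            for (int dr = -2; dr <= 2; ++dr)
                for (int dc = -2; dc <= 2; ++dc)
                    sum += img[r + dr][c + dc];
            const double localMean = sum / 25.0;
            if (std::fabs(img[r][c] - localMean) > threshold) defects[r][c] = 1;
        }
    }
    return defects;
}

// Counts the flagged pixels, analogous to countNonZero in the listing.
int countDefects(const Gray& defects) {
    int count = 0;
    for (const auto& row : defects)
        for (int v : row) count += v;
    return count;
}
```

On a uniform 7x7 patch of value 100 with a single dark pixel of value 20 in the centre, only that centre pixel is flagged: its neighbourhood mean is 96.8, so it differs by 76.8, while its neighbours differ from the same mean by only 3.2.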
