-2

I am new to machine learning and am working on a machine learning project for an irrigation problem. I need to study a particular crop (e.g. rice) and apply a machine learning approach that tells the farmer, on the basis of climatic parameters, whether or not to sow seed (or, similarly, whether or not to water the field).

Rice needs the following climatic conditions:

- On average, about 180–300 mm of water per month is needed to produce a reasonably good crop of rice.
- Optimum temperature: 20–35 degrees Celsius.

My dataset: https://github.com/TanvirMahmudEmon/Rainfall-Prediction/blob/master/data/final-dataset.csv

Here are my questions:

1) Is this a supervised or an unsupervised problem? (I think it is a supervised classification problem.)

2) How do I label the dataset for training? (I am thinking of an if-else in Python that compares the temperature and rainfall fields against the standard climatic values for rice and labels each row yes or no.)

3) If I label according to the approach in (2), how do I apply it to the whole dataset?

4) Which ML algorithm should I try to get better accuracy?

desertnaut
Ratnesh

2 Answers

2
  1. It is a supervised learning problem and your labels will be either 1 or 0 where 1 represents "water the field" and 0 would represent "do not water the field".

  2. You can use a list comprehension like:

    # "rainfall" and "temp" are placeholder keys; use your dataset's column names
    y = [1 if 180 <= row["rainfall"] <= 300 and 25 <= row["temp"] <= 30 else 0 for row in data]
    

    and then convert y to a NumPy array for easier computation.

  3. The previous point also answers question 3: the list comprehension iterates over every row, so it labels the whole dataset.

  4. I would suggest decision trees or logistic regression. The results might be better, but you will only know once you test them. I suggest these two algorithms because they will be a bit faster to train than an SVM.
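
As a rough illustration of points 2 to 4, here is a minimal sketch that labels rows with the rule and compares the two suggested classifiers. It assumes scikit-learn and NumPy are installed. The data is synthetic (randomly generated, not the linked dataset), the thresholds are the ones quoted in the question, and the variable names are made up for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (NOT the linked dataset): monthly rainfall in mm
# and mean temperature in degrees Celsius.
rng = np.random.default_rng(0)
rainfall = rng.uniform(0, 500, size=1000)
temp = rng.uniform(10, 40, size=1000)
X = np.column_stack([rainfall, temp])

# Rule-based labels: 1 when both values fall inside the ranges quoted
# in the question, 0 otherwise.
in_range = (rainfall >= 180) & (rainfall <= 300) & (temp >= 20) & (temp <= 35)
y = in_range.astype(int)

# Hold out a quarter of the rows so accuracy is measured on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "logistic regression": LogisticRegression(max_iter=1000),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: test accuracy = {scores[name]:.3f}")
```

Because the rule describes a rectangular region in the (rainfall, temperature) plane, a tree with axis-aligned splits is a natural fit, while logistic regression on the raw features can only draw a single straight boundary. On real data the ranking may differ, so it is worth testing both.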

Sridhar Murali
1
  1. Yes, it is a supervised problem, since you have a label indicating whether or not to water based on the parameters. It is a classification problem, since you have two classes: water or do not water. So here you are trying to classify each scenario as "water" or "do not water" based on the parameters under consideration.

2,3. You can label the rows by first importing the whole dataset with pandas, then comparing the relevant fields against the thresholds and writing the result to a new label column.

  4. If you intend to decide based on sensor input, the decision is independent of time, and since seasonal rainfall patterns can vary, I suggest sticking with sensor data; in that case you don't need a time-series model. Since there are only two classes, a binary classifier is needed; an SVM might suffice.
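
The pandas labelling and the SVM suggestion above could look roughly like the sketch below. The column names `rainfall` and `temp` are hypothetical and the handful of rows is invented for illustration; with the real file you would load it with `pd.read_csv` and adjust the names to match its header. The thresholds are the ones from the question.

```python
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Invented example rows with hypothetical column names; replace with
# pd.read_csv("final-dataset.csv") and the real column names.
df = pd.DataFrame({
    "rainfall": [50.0, 190.0, 250.0, 310.0, 220.0, 280.0, 400.0, 200.0],
    "temp":     [18.0, 22.0, 30.0, 28.0, 36.0, 25.0, 27.0, 34.0],
})

# Vectorised labelling of the whole frame: between() is inclusive on
# both ends by default, so no explicit loop or if-else is needed.
in_range = df["rainfall"].between(180, 300) & df["temp"].between(20, 35)
df["label"] = in_range.astype(int)

# Binary SVM classifier; features are standardised first because SVMs
# are sensitive to differences in feature scale.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(df[["rainfall", "temp"]], df["label"])

# Predict for two new (made-up) readings.
new = pd.DataFrame({"rainfall": [240.0, 60.0], "temp": [27.0, 15.0]})
pred = clf.predict(new)
print(pred)
```

Eight rows are far too few to train anything meaningful; the point is only to show the labelling step and the shape of the fit and predict calls.
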
Ansif_Muhammed