I am new to NLTK and am trying to build a chatbot that maps each user message to a request type and its parameters.

For example:

corpus = [
        {
            name: "appt-count",
            text: "How many appointments I have for today?"
        },
        {
            name: "appt-count",
            text: "What is my total appointments for today?"
        },
        {
            name: "appt-list",
            text: "What are all my appointments today?"
        },
        {
            name: "appt-list",
            text: "Can you tell me my appointments today?"
        },
        {
            name: "appt-view",
            text: "What is my next appointment?"
        },
        {
            name: "appt-view",
            text: "What is my appointment after lunch?"
        },
        {
            name: "appt-view",
            text: "What is my first appointment today?"
        },
        {
            name: "appt-view",
            text: "What is my last appointment today?"
        }
    ]

When the user enters text, the classifier should return the corresponding name so that the system can invoke the matching API and return the result to the user.

When the user says "How many appointments I have for today?", it should return "appt-count" along with the parameter "today", so that the system checks today's appointments. The user could also enter a specific date here.
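To make the expected parameter output concrete, here is a minimal sketch of the extraction step using only the standard library. The helper name `extract_date_param`, the small set of relative words, and the `YYYY-MM-DD` date format are all assumptions for illustration and are not part of the question's code.

```python
import re
from datetime import date, timedelta

# Hypothetical helper: the relative words and the ISO date format are assumptions.
RELATIVE_DAYS = {"today": 0, "tomorrow": 1, "yesterday": -1}

def extract_date_param(text, today=None):
    """Return (raw_value, resolved_date), or (None, None) if no date is found."""
    today = today or date.today()
    lowered = text.lower()

    # Relative words such as "today" are resolved against the reference date.
    for word, offset in RELATIVE_DAYS.items():
        if re.search(rf"\b{word}\b", lowered):
            return word, today + timedelta(days=offset)

    # Explicit dates are assumed to be written as YYYY-MM-DD.
    match = re.search(r"\b(\d{4})-(\d{2})-(\d{2})\b", lowered)
    if match:
        year, month, day = map(int, match.groups())
        try:
            return match.group(0), date(year, month, day)
        except ValueError:
            # Looks like a date but is not a valid calendar day.
            return None, None

    return None, None
```

Calling `extract_date_param("How many appointments I have for today?")` would give back the raw word `"today"` together with the current date, which could then be passed to the API alongside the predicted name.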

I am trying to use LogisticRegression from sklearn.linear_model, but I get an error on the np.array line. Am I on the right track?

import nltk
import pickle
import pandas as pd
import numpy as np
from nltk.stem import PorterStemmer, WordNetLemmatizer
from nltk.tokenize import sent_tokenize, word_tokenize
from nltk.classify import ClassifierI

from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.feature_extraction.text import CountVectorizer,TfidfVectorizer

training = []
ps = PorterStemmer()

df=pd.read_json("./train_data.json", orient="records")

for name, text in zip(df['name'], df['text']):
  words = word_tokenize(text)
  stemWord = [ps.stem(w.lower()) for w in words]
  training.append([stemWord , name])

# Error raised by the next line: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
training = np.array(training)
train_x = list(training[:,0])
train_y = list(training[:,1])

print("Training data created")

model = LogisticRegression()

model=model.fit(train_x,train_y)
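The ragged-array error arises because each tokenized sentence has a different length, and LogisticRegression needs a fixed-width numeric matrix anyway. One possible way around both issues, sketched below, is to let a vectorizer build the feature matrix from the raw strings. The corpus is inlined as tuples rather than read from train_data.json, and stemming is left out; both are simplifications made for this sketch (a stemming function could be supplied through the vectorizer's `tokenizer` parameter).

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Inline copy of the corpus from the question, as (name, text) pairs.
corpus = [
    ("appt-count", "How many appointments I have for today?"),
    ("appt-count", "What is my total appointments for today?"),
    ("appt-list", "What are all my appointments today?"),
    ("appt-list", "Can you tell me my appointments today?"),
    ("appt-view", "What is my next appointment?"),
    ("appt-view", "What is my appointment after lunch?"),
    ("appt-view", "What is my first appointment today?"),
    ("appt-view", "What is my last appointment today?"),
]
texts = [text for _, text in corpus]
labels = [name for name, _ in corpus]

# The vectorizer turns every sentence into a row of the same width,
# so no array of variable-length token lists is ever created.
model = make_pipeline(
    TfidfVectorizer(lowercase=True),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)

# Predict the request name for a new user message.
predicted = model.predict(["How many appointments do I have today?"])[0]
print(predicted)
```

With only eight training sentences the predictions will be unreliable, so this shows the shape of the approach rather than a working intent classifier. More examples per name would be needed before trusting the output.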
user1578872