Say I have an trained RandomForestClassifier model from sklearn. We're using gRPC to serve that model and provide predictions in real-time in a high traffic situation. From what I understand, gRPC can support multiple threads run concurrently. Can they all simultaneously make calls to RF's predict method without running into concurrency issues? Is sklearn's predict thread-safe?
If this isn't the case, it sounds like you'd have to load individual copies of the model in each worker thread.