
I am setting up a Weaviate database using the docker-compose option. Starting the database works fine, and I can create a class and add data objects in the REPL or when everything runs in the same script (i.e., the Weaviate class is created and the data is added in the same file). However, when I create the Weaviate class(es) in a different file or command and then try to add data, I get the following response: {'error': [{'message': 'store is read-only'}]}

I've tried the following:

  • Start at the basics by following the weaviate Quickstart tutorial in a single function (Successful)
  • Adjust the function to create a Message class to accept a message from the user as input to be inserted (Successful)
  • Move the code to create the weaviate class to a separate file and function while keeping the code to accept the user message and add data to weaviate in the original file/function (Failed)

I've tried doing that last step in a variety of ways but to no avail. I always get the same error response.

Has anyone run into this before, or does anyone have an idea how to resolve it?

Please let me know what other information would be helpful.

Here's a more detailed outline of what I am doing to produce the error:

  1. Run ./build.sh setup_weaviate to create the class(es) found in a json file (completes successfully):

build.sh


setup_venv () {
    python3 -m venv venv
    source venv/bin/activate
    pip install --upgrade pip wheel
    pip install -r requirements.txt
}
setup_weaviate () {
    python3 src/weaviate_client.py
}


case "$1" in
    setup_venv)
        setup_venv
        ;;
    setup_weaviate)
        setup_weaviate
        ;;
    *)
        echo "Usage: $0 {setup_venv|setup_weaviate}"
        exit 1
        ;;
esac

src/weaviate_client.py

import os
import yaml
from dotenv import load_dotenv
import weaviate


def get_client(url, api_key):
    client = weaviate.Client(
        url=url, 
        additional_headers={"X-OpenAI-API-Key": api_key}
    )
    return client


def setup_weaviate(client):
    """Delete the existing schema, then create the classes defined in resources/weaviate.json."""
    client.schema.delete_all()
    client.schema.create("resources/weaviate.json")
    print(client.schema.get())


if __name__ == "__main__":
    load_dotenv()
    OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
    WEAVIATE_URL = os.getenv("WEAVIATE_URL")
    client = get_client(WEAVIATE_URL, OPENAI_API_KEY)
    setup_weaviate(client)
    client._connection.close()

resources/weaviate.json

{"classes": [{"class": "Message", "invertedIndexConfig": {"bm25": {"b": 0.75, "k1": 1.2}, "cleanupIntervalSeconds": 60, "stopwords": {"additions": null, "preset": "en", "removals": null}}, "moduleConfig": {"text2vec-openai": {"model": "ada", "modelVersion": "002", "type": "text", "vectorizeClassName": true}}, "properties": [{"dataType": ["string"], "description": "The content of a message", "moduleConfig": {"text2vec-openai": {"skip": false, "vectorizePropertyName": false}}, "name": "content", "tokenization": "word"}], "replicationConfig": {"factor": 1}, "shardingConfig": {"virtualPerPhysical": 128, "desiredCount": 1, "actualCount": 1, "desiredVirtualCount": 128, "actualVirtualCount": 128, "key": "_id", "strategy": "hash", "function": "murmur3"}, "vectorIndexConfig": {"skip": false, "cleanupIntervalSeconds": 300, "maxConnections": 64, "efConstruction": 128, "ef": -1, "dynamicEfMin": 100, "dynamicEfMax": 500, "dynamicEfFactor": 8, "vectorCacheMaxObjects": 1000000000000, "flatSearchCutoff": 40000, "distance": "cosine", "pq": {"enabled": false, "bitCompression": false, "segments": 0, "centroids": 256, "encoder": {"type": "kmeans", "distribution": "log-normal"}}}, "vectorIndexType": "hnsw", "vectorizer": "text2vec-openai"}]}

Note that the weaviate.json file is just the output of the client.schema.get() command (after having once successfully created the class in the REPL).

  2. Execute the message:handle_message function, which creates a message object and attempts to push it to weaviate:

message.py

import os
import asyncio
from dotenv import load_dotenv
from datetime import datetime

load_dotenv()
BATCH_SIZE = int(os.getenv("BATCH_SIZE"))
    

def handle_message(client, message, messages_batch=None):
    """Save a message to the database."""
    data = [{
        "content": message.content,
        }
    ]

    with client.batch as batch:
        batch.batch_size = 100
        for d in data:
            properties = {
                "content": d["content"],
            }

            # Add through the context-managed batch object.
            batch.add_data_object(properties, "Message")

    return True

I get {'error': [{'message': 'store is read-only'}]} when I pass a message to this function, and that is the only output. I understand that, as the code is currently written, a batch is executed each time a message is passed to the function. This was intentional, since I was trying to resolve the issue with just one message.
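
To see which objects in a batch failed and why, the per-object results can be inspected directly. Below is a minimal sketch, assuming the result layout used by the v3 Python client (a list of dicts with the messages under result -> errors -> error, which matches the output printed above). The helper name collect_batch_errors and the sample data are hypothetical.

```python
def collect_batch_errors(results):
    """Return the error messages found in a list of batch results.

    Assumes each item looks like
    {"result": {"errors": {"error": [{"message": "..."}]}}}.
    """
    messages = []
    for item in results or []:
        errors = item.get("result", {}).get("errors") or {}
        for err in errors.get("error", []):
            messages.append(err.get("message", ""))
    return messages


# Hypothetical sample shaped like the response shown in the question.
sample_results = [
    {"result": {"errors": {"error": [{"message": "store is read-only"}]}}},
    {"result": {}},  # an object that was stored without errors
]

print(collect_batch_errors(sample_results))
```

If your client version supports it, a function like this could be wrapped in a callback and passed to client.batch.configure(callback=...), so that failures are logged in one place instead of only being printed.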

Here is the output from client.schema.get() in case it is helpful; it is essentially the same as the contents of resources/weaviate.json:

{'classes': [{'class': 'Message', 'invertedIndexConfig': {'bm25': {'b': 0.75, 'k1': 1.2}, 'cleanupIntervalSeconds': 60, 'stopwords': {'additions': None, 'preset': 'en', 'removals': None}}, 'moduleConfig': {'text2vec-openai': {'model': 'ada', 'modelVersion': '002', 'type': 'text', 'vectorizeClassName': True}}, 'properties': [{'dataType': ['string'], 'description': 'The content of a message', 'moduleConfig': {'text2vec-openai': {'skip': False, 'vectorizePropertyName': False}}, 'name': 'content', 'tokenization': 'word'}], 'replicationConfig': {'factor': 1}, 'shardingConfig': {'virtualPerPhysical': 128, 'desiredCount': 1, 'actualCount': 1, 'desiredVirtualCount': 128, 'actualVirtualCount': 128, 'key': '_id', 'strategy': 'hash', 'function': 'murmur3'}, 'vectorIndexConfig': {'skip': False, 'cleanupIntervalSeconds': 300, 'maxConnections': 64, 'efConstruction': 128, 'ef': -1, 'dynamicEfMin': 100, 'dynamicEfMax': 500, 'dynamicEfFactor': 8, 'vectorCacheMaxObjects': 1000000000000, 'flatSearchCutoff': 40000, 'distance': 'cosine', 'pq': {'enabled': False, 'bitCompression': False, 'segments': 0, 'centroids': 256, 'encoder': {'type': 'kmeans', 'distribution': 'log-normal'}}}, 'vectorIndexType': 'hnsw', 'vectorizer': 'text2vec-openai'}]}

2 Answers


{'error': [{'message': 'store is read-only'}]} usually means that you are running out of disk space (see the docs for details).

From your post, this does not seem to be the case. I suggest reporting this as a bug in the weaviate repo so the team can troubleshoot further.

hsm207
  • Thanks for taking a look at this. I've submitted this question as an issue on weaviate's github -- https://github.com/weaviate/weaviate/issues/2976 – Brennan Tolman May 01 '23 at 15:08

I found that if I stopped mounting the PERSISTENCE_DATA_PATH to a local directory on the host, it solved the issue.
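
If you want to keep the host mount, a quick check of the directory behind it might show why the store goes read-only. This is only a sketch using the Python standard library. The helper name describe_data_dir is hypothetical, the path is whatever host directory you mounted, and the writability check applies to the user running the script, which may differ from the user inside the container.

```python
import os
import shutil


def describe_data_dir(path):
    """Report disk usage and writability for the directory at `path`."""
    usage = shutil.disk_usage(path)
    percent_used = 100.0 * usage.used / usage.total if usage.total else 0.0
    return {
        "percent_used": percent_used,
        # True if the current user may write to the directory.
        "writable": os.access(path, os.W_OK),
    }


# "." is a placeholder; point this at the mounted host directory.
print(describe_data_dir("."))
```

A high percent_used on that filesystem would be consistent with the disk-space explanation in the other answer, even when the container's own filesystem has plenty of room.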

jjordan