We have a Solr build that currently works only with English, and we need to add Arabic support to it. The Solr Wiki does not give much detail on how to get started.
These are the steps I have taken so far:
Added the following to schema.xml:
<fieldType name="text_general_arabic" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ArabicNormalizationFilterFactory"/>
    <filter class="solr.ArabicStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ArabicNormalizationFilterFactory"/>
    <filter class="solr.ArabicStemFilterFactory"/>
  </analyzer>
</fieldType>
Defined a field in schema.xml:
<field name="البرتغالية" type="text_general_arabic" indexed="true" stored="true"/>
FYI, I copied the Arabic text from Google Translate in the browser and pasted it in.
Later, I created a CSV file in Notepad, saved it with the Unicode encoding as Arabic.csv, and gave it the following header field name:
البرتغالية
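Since the header has to match the field name in schema.xml byte for byte after decoding, it may be worth checking which encoding Notepad actually wrote. The Python sketch below is only a diagnostic, not a confirmed explanation of the error: it assumes that Notepad's "Unicode" option may produce UTF-16 with a byte-order mark (BOM) and that the CSV is being read as UTF-8. The helper names `detect_bom`, `header_seen_by_utf8_reader`, and `inspect_file` are hypothetical and are not part of Solr.

```python
import codecs

# Known byte-order marks. The UTF-32 marks are checked first because the
# UTF-32 LE BOM begins with the same two bytes as the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def header_seen_by_utf8_reader(data: bytes) -> str:
    """Return the first line as a reader that assumes UTF-8 would decode it."""
    lines = data.decode("utf-8", errors="replace").splitlines()
    return lines[0] if lines else ""


def inspect_file(path: str) -> None:
    """Print the detected BOM and the header line as a UTF-8 reader sees it."""
    with open(path, "rb") as f:
        raw = f.read()
    print("BOM-implied encoding:", detect_bom(raw))
    # repr() makes invisible characters such as '\ufeff' visible.
    print("Header as UTF-8:", repr(header_seen_by_utf8_reader(raw)))
```

Calling `inspect_file(r"D:\Arabic.csv")` would show whether the header line decodes to exactly the schema field name or carries extra leading characters.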
When I try to index the file using the following cURL command:
D:\>curl http://localhost:8080/solr/coll9/update/csv -F "stream.file=D:\Arabic.csv" -F "commit=true" -F "optimize=true" -F "encapsulate="" -F "keepEmpty=true"
I get an undefined field error, and I don't know what I am doing wrong.
UPDATE: When I try the same thing with an XML file instead of a CSV file, it works.