Has anyone here used the readability 0.2 or textstat 0.3.1 package in Python? I couldn't find anything on SO dealing with this subject, or any good documentation on it.
My code so far iterates over a bunch of locally stored .txt files and prints the results (readability measures) into a master text file:
from textstat.textstat import textstat
import os
import glob
import contextlib
import sys


@contextlib.contextmanager
def stdout2file(fname):
    # Temporarily redirect everything printed to stdout into fname.
    f = open(fname, 'w', encoding="utf-8")
    sys.stdout = f
    try:
        yield
    finally:
        # Restore stdout even if an exception occurs inside the with-block.
        sys.stdout = sys.__stdout__
        f.close()


def readability():
    os.chdir(r"F:\Level1\Level2")
    with stdout2file("Results_readability.txt"):
        for file in glob.iglob("*.txt"):  # iterates over all files in the directory ending in .txt
            with open(file, encoding="utf8") as fin:
                contents = fin.read()
            print(textstat.flesch_reading_ease(contents))
            print(file.split(os.path.sep)[-1], end=" | ")
            print(textstat.smog_index(contents), end="\n ")
            print(file.split(os.path.sep)[-1], end=" | ")
            print(textstat.gunning_fog(contents), end="\n ")


if __name__ == '__main__':
    readability()
This works pretty well; however, I have two problems:
Is it possible to store my master file in another directory? With the code above, the master file is created in the same directory as the files being iterated over, which is kind of senseless.
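To make the first problem concrete, here is a minimal sketch of what I imagine could work: pass explicit input and output paths instead of calling os.chdir and redirecting stdout. The names `write_results` and `measure` are made up for illustration and are not part of textstat; `measure` stands for any function that takes a string and returns a score.

```python
from pathlib import Path


def write_results(input_dir, output_file, measure):
    """Apply `measure` to every .txt file in input_dir, one result line per file."""
    input_dir = Path(input_dir)
    output_file = Path(output_file)
    # Create the output directory if it does not exist yet.
    output_file.parent.mkdir(parents=True, exist_ok=True)
    with open(output_file, "w", encoding="utf-8") as out:
        for path in sorted(input_dir.glob("*.txt")):
            # Skip the master file in case it lives next to the inputs.
            if path.resolve() == output_file.resolve():
                continue
            contents = path.read_text(encoding="utf-8")
            # file=out writes to the master file without touching sys.stdout.
            print(path.name, measure(contents), sep=" | ", file=out)
```

With textstat this would presumably be called as write_results(r"F:\Level1\Level2", some_other_path, textstat.gunning_fog), where some_other_path is whatever output location is wanted; I have only tried the sketch with a stand-in measure.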
Does anyone have experience with how accurate these packages are? I just tested the same string in textstat and on http://www.webpagefx.com/tools/read-able/check.php and http://gunning-fog-index.com/, and I get significantly different results on all measures.
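To have something to compare against, here is a rough sketch of the standard Flesch reading ease and Gunning fog formulas with a deliberately naive syllable counter. The sentence splitting and syllable heuristic are my own assumptions; I have not checked how textstat or the two websites count sentences, words, or syllables, so this only shows how sensitive the scores are to those counts.

```python
import re


def count_syllables(word):
    """Naive heuristic: one syllable per group of consecutive vowels, minimum 1.
    This overcounts silent vowels (e.g. "make" is counted as 2)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_counts(text):
    """Return (sentence count, word count, list of per-word syllable counts).
    Assumes the text contains at least one sentence and one word."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = [count_syllables(w) for w in words]
    return len(sentences), len(words), syllables


def naive_flesch_reading_ease(text):
    n_sent, n_words, syllables = text_counts(text)
    # 206.835 - 1.015 * (words per sentence) - 84.6 * (syllables per word)
    return 206.835 - 1.015 * (n_words / n_sent) - 84.6 * (sum(syllables) / n_words)


def naive_gunning_fog(text):
    n_sent, n_words, syllables = text_counts(text)
    # "Complex" words are those with three or more syllables.
    complex_words = sum(1 for s in syllables if s >= 3)
    return 0.4 * ((n_words / n_sent) + 100 * (complex_words / n_words))
```

Since every tool has to guess syllables and sentence boundaries somehow, differences in those guesses might explain part of the gap, but that is speculation on my side.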
Any help appreciated.