
I'm implementing a Markov random text generator in Java. I've gotten as far as extracting the n-grams from a text file, but I'm struggling to write a class that counts how many times each n-gram occurs in the text (and, eventually, computes its probability).

This is the code I have so far. It's a rough draft, so it's a little messy. Here's the main file, where I parse the text and create a new n-gram object from it:

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;



public class Markov {

    public static String readCorpusToString(File fileName) {
        String corpus = " ";
        try {
            corpus = new String(Files.readAllBytes(Paths.get(String.valueOf(fileName))));
        }
        catch (IOException e) {
            e.printStackTrace();
        }
        return corpus;
    }

    public static void main(String[] args) {
        File text = new File(args[0]);
        String corpus = readCorpusToString(text);
        //System.out.println(corpus);
        Ngram test = new Ngram(3, corpus);
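        // Iterate until the n-grams run out. Looping up to corpus.length() counts
        // characters, not n-grams, and calls next() past the end of the word array.
        while (test.hasNext()) {
            System.out.println(test.next());
        }
    }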
}

And here's the class for my n-gram object:

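import java.util.Iterator;
import java.util.NoSuchElementException;

public class Ngram implements Iterator<String> {

    private final String[] words;
    private final int n;
    private int pos = 0;

    public Ngram(int n, String str) {
        this.n = n;
        // Split on any run of whitespace so newlines and tabs also separate words.
        words = str.trim().split("\\s+");
    }

    @Override
    public boolean hasNext() {
        return pos < words.length - n + 1;
    }

    @Override
    public String next() {
        // Honor the Iterator contract instead of failing with an array index error.
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        StringBuilder sb = new StringBuilder();
        for (int i = pos; i < pos + n; i++) {
            sb.append(i > pos ? " " : "").append(words[i]);
        }
        pos++;
        return sb.toString();
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException();
    }
}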
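
What I'm after is roughly the following. This is only a minimal sketch: `NgramCounter` and its methods (`addAll`, `count`, `probability`, `total`) are hypothetical names made up for illustration, and "probability" here is assumed to mean plain relative frequency (occurrences of one n-gram divided by the total number of n-grams seen), not yet the conditional next-word probability a Markov chain would eventually need.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical helper, not part of the code above: tallies how often each n-gram occurs.
class NgramCounter {

    private final Map<String, Integer> counts = new HashMap<>();
    private int total = 0;

    // Consume every n-gram the iterator yields; any Iterator<String> works here.
    public void addAll(Iterator<String> ngrams) {
        while (ngrams.hasNext()) {
            // merge() inserts 1 for a new key, otherwise adds 1 to the existing count.
            counts.merge(ngrams.next(), 1, Integer::sum);
            total++;
        }
    }

    // Number of times this exact n-gram was seen (0 if never seen).
    public int count(String ngram) {
        return counts.getOrDefault(ngram, 0);
    }

    // Assumed definition: relative frequency over all n-grams seen so far.
    public double probability(String ngram) {
        return total == 0 ? 0.0 : (double) count(ngram) / total;
    }

    // Total number of n-grams consumed, counting repeats.
    public int total() {
        return total;
    }
}
```

Since `Ngram` implements `Iterator<String>`, it could presumably be passed in directly, e.g. `counter.addAll(new Ngram(3, corpus))`, followed by `counter.count("some three words")`.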
