
I have a text file that contains some hymns in a particular format. An example is below.


1  Praise to the Lord

1 Praise to the Lord, the Almighty, the King of creation! O my soul, praise Him, for He is thy health and salvation! All ye who hear, now to His temple draw near; Join ye in glad adoration!

2 Praise to the Lord, Who o'er all things so wondrously reigneth, Shieldeth thee under His wings, yea, so gently sustaineth! Hast thou not seen how thy desires e'er have been Granted in what He ordaineth?

3 Praise to the Lord, who doth prosper thy work and defend thee; Surely His goodness and mercy here daily attend thee. Ponder anew what the Almighty can do, If with His love He befriend thee.

I want to extract these hymns, place them into objects, and then insert them into an SQLite database. I am trying to split them up accordingly, but I am not getting anywhere so far. This is my attempt.

Main function

        // FileInfo object wraps the file path.
        var file = new FileInfo(@"C:\HymnWords.txt");

        // StreamReader reads from the existing file.
        var reader = file.OpenText();

        string line;
        int number = 0;
        var hymns = new List<Hymn>();

        var check = false; // set to indicate that all the lines that follow will be a part of the hymn

        while ((line = reader.ReadLine()) != null)
        {
            if (line.Any(char.IsLetter) && line.Any(char.IsDigit))
            {

            }

            if (check)
            {
                if (line.Any(char.IsDigit) && !line.Any(char.IsLetter))
                {

                }
            }
        }

Model for the hymn

public class Hymn
{
    public string Name { set; get; }
    public List<String> Verses { set; get; }
}

When storing the verses, I need to preserve the line breaks. Is inserting a `/n` after each line, before inserting the verse into the object or database, the best way to do this?

Joel Dean
  • You would need `\r\n` not `/n` to preserve line breaks. That only works if it's going to be output in plain text. What exactly is the problem? What do you mean by not getting anywhere? – EJC Mar 23 '13 at 19:40
  • Eventually it's going to be output into an Android text view. I am not sure how to structure the algorithm to store each hymn in the objects I have created. I already know how to store the title, but I am not sure how to extract the verses for the hymn. – Joel Dean Mar 23 '13 at 19:43
  • Ok, so line 1 with the words next to it is the title. Then the rest are just verses. Got it. Working through this a little and will get back to you in a bit. – EJC Mar 23 '13 at 19:46
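
On the line-break question, one possible approach is to keep each verse as a list of lines in memory and join them with a newline only when a single string is needed for storage or display. The sketch below assumes the consumer treats `"\n"` as a line break, which plain-text display on Android typically does; `VerseText` is a hypothetical helper name, not something from the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: converts between a list of verse lines and one stored string.
public static class VerseText
{
    // Joins the lines of one verse into a single string, separated by "\n".
    public static string Join(IEnumerable<string> lines)
    {
        return string.Join("\n", lines);
    }

    // Splits stored verse text back into lines; accepts "\r\n" as well as "\n".
    public static string[] Split(string text)
    {
        return text.Replace("\r\n", "\n").Split('\n');
    }
}
```

Joining at the last moment keeps the parsing code free of separator handling, and the stored string can still be split back into lines later.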

3 Answers


This should get you started pretty well:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

namespace Program
{
    public class Class1
    {
        public class Program
        {
            public static void Main(string[] args)
            {
                var hymnFiles = new List<string>()
                {
                    @"C:\HymnWords.txt",
                    @"C:\HymnWords1.txt",
                    @"C:\HymnWords2.txt",
                    @"C:\HymnWords3.txt",
                };
                var reader = new Class1();
                foreach (var hymn in reader.ReadHymnFiles(hymnFiles))
                {
                    Console.Out.WriteLine(hymn.Title);
                    foreach (var verse in hymn.Verses)
                    {
                        Console.Out.WriteLine(verse.VerseNumber);
                        foreach (var verseLine in verse.VerseLines)
                        {
                            Console.Out.WriteLine(verseLine);
                        }
                    }
                }
                Console.ReadLine();
            }

        }
        public List<Hymn> ReadHymnFiles(List<string> hymnFiles)
        {
            var hymns = new List<Hymn>();
            foreach (var hymnFile in hymnFiles)
            {
                using (TextReader reader = new StreamReader(hymnFile))
                {
                    var hymn = new Hymn();
                    hymns.Add(hymn);
                    string line;
                    int currentVerseNumber = 0;
                    while ((line = reader.ReadLine()) != null)
                    {
                        if (string.IsNullOrWhiteSpace(line))
                            continue;

                        if (line.Any(char.IsLetter) && line.Any(char.IsDigit))
                        {
                            // this must be the title
                            hymn.Title = line;
                            continue;
                        }

                        if (line.Any(char.IsDigit) && !line.Any(char.IsLetter))
                        {
                            //this must be the verse number
                            currentVerseNumber = Convert.ToInt32(line.Trim());
                            hymn.Verses.Add(new Verse(currentVerseNumber));
                            continue;
                        }

                        //get the current verse to add the next line to it
                        var verse = hymn.Verses.Single(v => v.VerseNumber == currentVerseNumber);
                        verse.VerseLines.Add(line);
                    }
                }
            }
            return hymns;
        }

        public class Hymn
        {
            public Hymn()
            {
                Verses = new List<Verse>();
            }
            public string Title { set; get; }
            public List<Verse> Verses { set; get; }
        }

        public class Verse
        {
            public Verse(int verseNumber)
            {
                VerseNumber = verseNumber;
                VerseLines = new List<string>();
            }
            public int VerseNumber { get; private set; }
            public List<string> VerseLines { set; get; }
        }
    }

}

Note that the verses are in an object of their own and each line is its own entry. They should probably be stored that way as well.

EJC
  • I am getting this error: "Sequence contains no matching element" at this line: `var verse = hymn.Verses.Single(v => v.VerseNumber == currentVerseNumber);` – Joel Dean Mar 23 '13 at 21:06
  • All of the hymns are located in one text file, all 600 of them. Thanks for your help so far, things are much clearer now. – Joel Dean Mar 23 '13 at 21:12
  • Be back in a while, gotta run out. Hope you figure it out. – EJC Mar 23 '13 at 21:27
  • Ok. Thank you very much for your help you cleared up a lot for me. – Joel Dean Mar 23 '13 at 21:29
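
For the single-file case raised in the comments, a rough sketch follows. It starts a new hymn whenever a title line appears and keeps a reference to the current verse, so there is no `Single(...)` lookup that can throw. It assumes that a title is a number followed by two or more spaces and then text, and that each verse number sits alone on its own line; both are assumptions about the file layout. The names `ParsedHymn`, `ParsedVerse`, and `SingleFileHymnParser` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class ParsedVerse
{
    public int Number;
    public List<string> Lines = new List<string>();
}

public class ParsedHymn
{
    public string Title;
    public List<ParsedVerse> Verses = new List<ParsedVerse>();
}

public static class SingleFileHymnParser
{
    // Assumed title shape: a number, two or more spaces, then text.
    static readonly Regex TitleLine = new Regex(@"^\s*\d+\s{2,}\S");
    // Assumed verse-number shape: a number alone on its line.
    static readonly Regex VerseNumberLine = new Regex(@"^\s*(\d+)\s*$");

    public static List<ParsedHymn> Parse(IEnumerable<string> lines)
    {
        var hymns = new List<ParsedHymn>();
        ParsedHymn hymn = null;   // hymn currently being filled
        ParsedVerse verse = null; // verse currently being filled

        foreach (var line in lines)
        {
            if (string.IsNullOrWhiteSpace(line)) continue;

            if (TitleLine.IsMatch(line))
            {
                // A title starts a new hymn and ends any verse in progress.
                hymn = new ParsedHymn { Title = line.Trim() };
                hymns.Add(hymn);
                verse = null;
                continue;
            }

            var m = VerseNumberLine.Match(line);
            if (m.Success && hymn != null)
            {
                verse = new ParsedVerse { Number = int.Parse(m.Groups[1].Value) };
                hymn.Verses.Add(verse);
                continue;
            }

            // Text that appears before any verse number is skipped instead of throwing.
            if (verse != null) verse.Lines.Add(line);
        }
        return hymns;
    }
}
```

It could be called as `SingleFileHymnParser.Parse(File.ReadAllLines(path))`, with `path` pointing at the single hymn file.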

This might be of help, though you might need some more exception handling.
I changed your Hymn class a little to make it easier to work with.

Here is the modified hymn class:

public class Hymn
{
    private readonly List<List<string>> _verses = new List<List<string>>();

    public Hymn(string name)
    {
        Name = name;
    }

    public string Name { get; private set; }
    public IEnumerable<IEnumerable<string>> Verses { get { return _verses; } }

    public List<string> CreateVerse()
    {
        var verse = new List<string>();
        _verses.Add(verse);
        return verse;
    }
}

And here is a class for reading hymns from a file:

using System;
using System.Collections.Generic;
using System.IO;

public static class HymnReader
{
    public static IEnumerable<Hymn> ReadHymns(string file)
    {
        var lines = File.ReadAllLines(file);
        var hymns = new List<Hymn>();
        Hymn hymn = null;
        List<string> verse = null;
        foreach (var line in lines)
        {
            string text;
            switch (ParseLine(line, out text))
            {
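                case LineType.Title:
                    hymn = new Hymn(text);
                    hymns.Add(hymn);
                    verse = null; // a new hymn starts with a fresh verse
                    break;
                case LineType.Verse:
                    if (hymn == null) break; // ignore text that appears before the first title
                    if (verse == null) verse = hymn.CreateVerse();
                    verse.Add(text);
                    break;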
                default:
                    verse = null;
                    break;
            }
        }
        return hymns;
    }

    private static LineType ParseLine(string line, out string text)
    {
        text = "";
        if (string.IsNullOrWhiteSpace(line)) return LineType.Unknown;
        var array = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (array.Length < 2) return LineType.Unknown;
        int n;
        if (int.TryParse(array[0], out n))
        {
            // a leading number followed by text is treated as a title
            text = string.Join(" ", array, 1, array.Length - 1).Trim();
            return LineType.Title;
        }
        text = string.Join(" ", array).Trim();
        return LineType.Verse;
    }

    private enum LineType
    {
        Unknown,
        Title,
        Verse
    }
}
Jens Granlund

This is the final solution that I arrived at with the help of Shawn McLean.

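using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

namespace HymmParser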
{
class Program
{
    const string TITLE_REGEX = @"^\s*\d+\s{2,}[a-zA-Z]+";

    static void Main(string[] args)
    {
        var hymns = new List<Hymn>();
        //read the file
        string[] lines = System.IO.File.ReadAllLines(@"C:\HymnWords.txt");

        for (int i = 0; i < lines.Count(); i++)
        {
            //regex to check for optional white space, a number, 2 or more white spaces, then words after.
            if (Regex.IsMatch(lines[i], TITLE_REGEX))
            {
                var hymn = new Hymn
                {
                    //TODO: Add your title parse logic here.
                    Title = lines[i]
                };

                //find verses under this hymn
                for (i++; i < lines.Count(); i++)
                {
                    //ensure this line is not a title, else back up and break out of it,
                    //so the outer loop's increment lands on this title instead of skipping it.
                    if (Regex.IsMatch(lines[i], TITLE_REGEX))
                    {
                        i--;
                        break;
                    }

                    //if number only found, this is the start of a verse
                    if (Regex.IsMatch(lines[i], @"^\s*\d+$"))
                    {
                        var verse = new Verse(int.Parse(lines[i]));

                        //gather up verse lines
                        for (i++; i < lines.Count(); i++)
                        {
                            //if number only (a new verse) or a title, break.
                            if (Regex.IsMatch(lines[i], @"^\s*\d+\s*$") || Regex.IsMatch(lines[i], TITLE_REGEX))
                            {
                                //backup and break, outer loop will increment and miss this new verse
                                i--;
                                break;
                            }
                            else if (string.IsNullOrWhiteSpace(lines[i]))
                            {
                                //if whitespace, then we may have finished the verse, break out
                                break;
                            }
                            else
                            {
                                verse.VerseLines.Add(lines[i]);
                            }
                        }
                        hymn.Verses.Add(verse);
                    }
                }
                hymns.Add(hymn);
            }
        }
        foreach (var hymn in hymns)
        {
            Console.WriteLine(hymn.Title);
            foreach (var verse in hymn.Verses)
            {
                Console.WriteLine(verse.VerseNumber);
                foreach (var line in verse.VerseLines)
                {
                    Console.WriteLine(line);
                }
            }
            Console.WriteLine("\n");
        }

        Console.WriteLine("Hymns Found: {0}", hymns.Count);

        Console.ReadLine();

    }
}

public class Hymn
{
    public Hymn()
    {
        Verses = new List<Verse>();
    }
    public string Title { set; get; }
    public List<Verse> Verses { set; get; }
}

public class Verse
{
    public Verse(int verseNumber)
    {
        VerseNumber = verseNumber;
        VerseLines = new List<string>();
    }
    public int VerseNumber { get; private set; }
    public List<string> VerseLines { set; get; }
}

}
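
The title-parsing step marked TODO above could be sketched roughly as follows. This is only one possible approach: it assumes a title has the same shape that `TITLE_REGEX` looks for (a number, two or more spaces, then the name), and `TitleParser` and `TryParse` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: splits a title line into its hymn number and name.
public static class TitleParser
{
    // Group 1 captures the number, group 2 the name without trailing white space.
    static readonly Regex TitlePattern = new Regex(@"^\s*(\d+)\s{2,}(.+?)\s*$");

    public static bool TryParse(string line, out int number, out string name)
    {
        number = 0;
        name = null;
        var match = TitlePattern.Match(line);
        if (!match.Success) return false;
        if (!int.TryParse(match.Groups[1].Value, out number)) return false;
        name = match.Groups[2].Value;
        return true;
    }
}
```

A line with only one space after the number, or a number alone on a line, does not match this pattern, so such lines are not mistaken for titles under these assumptions.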

Joel Dean