
I'm making a very simple Windows application using Visual Studio and C# that edits subtitles files for movies. I want a program that adds a space to dialog sentences when there isn't one. For example:

-Hey, what's up?

-Nothing much.

to

- Hey, what's up?

- Nothing much.

I used the toolbox to create an interface with just one button for selecting the correct file. This is the code I have for this button:

    private void button1_Click(object sender, EventArgs e)
    {
        if (openFileDialog1.ShowDialog() == DialogResult.OK)
        {
            string text = File.ReadAllText(openFileDialog1.FileName, Encoding.GetEncoding("iso-8859-1"));
            text = text.Replace("-A", "- A");
            File.WriteAllText(openFileDialog1.FileName, text, Encoding.GetEncoding("iso-8859-1"));
        }
    }

This replaces "-A" with "- A", inserting the missing space. It is the solution I came up with, and I was planning to repeat it for every letter, including accented letters such as À, Á, È, É, etc.

This does not work. If I put text = text.Replace("-É", "- É"); the program does nothing.

How do I fix this?

Thank you for reading and if you have a better alternative for my application then please feel free to let me know.

Telmo F.
  • You need to research `regex`. You don't have to do this manually for every possible letter! – Blorgbeard Mar 09 '16 at 01:11
  • Use `regex.Replace()` – Harsh Mar 09 '16 at 01:12
  • As @Blorgbeard mentioned, you might be able to do something as simple as `text = new Regex("^-").Replace(text, " -")` – Rob Mar 09 '16 at 01:12
  • Thanks, everyone. @Rob, mind explaining why you did `Regex("^-")`? I tried your line of code and it didn't work, sadly. – Telmo F. Mar 09 '16 at 01:21
  • @T.Ferreira Sorry - my mistake. It should be `text = new Regex("^-([^\\s])").Replace(text, "- $1");`. Essentially, ^ matches the start of a line. Then we look for `-` followed by not a space. Then we replace it with `- ` – Rob Mar 09 '16 at 01:25
  • @Rob that also didn't work. It just does nothing to the file I'm trying to alter. All the dashes remain without spaces. – Telmo F. Mar 09 '16 at 01:43
  • Can you show us a sample of the file? I've tested it locally, and it was properly replacing lines: `-Hey` becomes `- Hey`, `-És` becomes `- És` – Rob Mar 09 '16 at 01:46
  • @Rob that's weird, it doesn't on my file. This is what I'm trying to change on my test file and it doesn't change at all after I run my application: `-É então? -Então, quero conferenciar convosco.` – Telmo F. Mar 09 '16 at 01:52
  • @T.Ferreira Ah - It will fix the first one, but not the second. We're only searching for `-` which appear at the start of the line. Simply removing the `^` from the regex will fix it for this case, but it will replace `some-text` with `some -text`. Is it always a space(or nothing) before the `-`? – Rob Mar 09 '16 at 02:04
  • @Rob that's really weird, then. But I actually had success with the edited solution by A. Chiesa so you don't have to bother with this anymore. But thank you so much for actually using your time and trying to help me! I will definitely keep your solution handy for any future projects. – Telmo F. Mar 09 '16 at 02:08

2 Answers

As the comments suggest, use a Regex.

        var rx = new System.Text.RegularExpressions.Regex("^-([^ ])");
        // ... in your loop, for each line of text:
        text = rx.Replace(text, "- $1");

This searches for a dash at the beginning of the string, but only one that is NOT followed by a space. The parentheses () capture the character following the dash so it can be reused. Replace then swaps the matched text for a dash, a space, and the captured character ($1), whatever it is.

Source: https://xkcd.com/208/

Edit: you do not have a loop; you have a string containing the full content of a file, in which every line should contain a subtitle line (right?). If that is the case, you can configure the regular expression to treat the string as a list of lines, like this:

        var rx = new Regex("^-([^ ])", RegexOptions.Multiline);

See this fiddle for an example: https://dotnetfiddle.net/ciFlAu
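
To see the multiline behaviour outside of the WinForms project, here is a minimal console sketch. The class name `DashSpacer`, the method `AddSpaces`, and the sample text are made up for illustration; the pattern and the `RegexOptions.Multiline` option are the ones from this answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original application.
public static class DashSpacer
{
    // "^" matches at the start of every line because of RegexOptions.Multiline.
    // "([^ ])" captures the first character after the dash when it is not a space.
    private static readonly Regex LeadingDash =
        new Regex("^-([^ ])", RegexOptions.Multiline);

    public static string AddSpaces(string text)
    {
        // "$1" puts the captured character back after the inserted space.
        return LeadingDash.Replace(text, "- $1");
    }

    public static void Main()
    {
        // Illustrative subtitle text: three lines need fixing, one is already fine.
        string sample = "-Hey, what's up?\n-Nothing much.\n- Already fine.\n-É então?";
        Console.WriteLine(AddSpaces(sample));
    }
}
```

Running it should print the four lines with `- ` at the start of each one, leaving the line that already had a space untouched. In the button handler from the question, the same call would replace the `text.Replace("-A", "- A")` line.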

Alberto Chiesa
  • Thank you so much. I have a few questions: 1 - The `System.Text.RegularExpressions.Regex` appears in gray and if I hover the mouse over it, it says "Name can be simplified". Is this important at all? 2 - I didn't understand what you meant by `... in your loop`. I'm very new at this, I only started C# yesterday. Your code didn't work on my application, it didn't change the subtitle file at all. Do you have any idea why? – Telmo F. Mar 09 '16 at 01:36
  • 1 - if you already have a `using System.Text.RegularExpressions` in your file, you don't need the fully qualified name. So you can simplify it without worry. 2 - I now realize you don't have a loop in your code. You have to supply some options to the regex. Let me check about them. – Alberto Chiesa Mar 09 '16 at 01:44
  • Your edit worked perfectly. Thank you so much! Here I was about to write dozens of lines, one for each possible letter, and you solved it with just one line. Now that's efficiency! Again, thank you. – Telmo F. Mar 09 '16 at 02:05

For accented characters, consider using their Unicode escape sequences:

string text = "-\u00C9"; // -É
text = text.Replace("-\u00C9", "- \u00C9");

You could also use a no-break space (\u00A0) as the inserted space, just in case:

string text = "-\u00C9";
text = text.Replace("-\u00C9", "-\u00A0\u00C9");

Then you can encode using UTF-8/UTF-16:

File.WriteAllText(openFileDialog1.FileName, text, Encoding.GetEncoding("UTF-8"));
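
As a rough end-to-end check of the idea, the sketch below decodes a few ISO-8859-1 bytes (the encoding used in the question), applies the escaped replacement, and re-encodes the result as UTF-8. The class name `EncodingDemo`, the method `FixDash`, and the sample bytes are assumptions made for illustration only.

```csharp
using System;
using System.Text;

// Hypothetical demo, not part of the original application.
public static class EncodingDemo
{
    public static string FixDash(byte[] latin1Bytes)
    {
        // Decode with the same encoding the question uses for reading the file.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string text = latin1.GetString(latin1Bytes);

        // "\u00C9" is É; the escape avoids depending on the source file's own encoding.
        return text.Replace("-\u00C9", "- \u00C9");
    }

    public static void Main()
    {
        // In ISO-8859-1, 0x2D is '-', 0xC9 is 'É' and 0x73 is 's', so this is "-És".
        byte[] input = { 0x2D, 0xC9, 0x73 };

        string fixedText = FixDash(input);
        byte[] utf8 = Encoding.UTF8.GetBytes(fixedText);

        Console.WriteLine(fixedText == "- \u00C9s"); // True
        Console.WriteLine(utf8.Length);              // 5, because É takes two bytes in UTF-8
    }
}
```

If you switch the output to UTF-8 like this, the file is no longer ISO-8859-1, so whatever player reads the subtitles needs to expect UTF-8.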
Ian