I'm trying to write a program that reads a text file containing a list of domains. The file also contains a lot of unnecessary content, such as the title and a header for each domain. All I want is the domains.

The one thing the domains have in common is that each contains a "." (sometimes two). Is there a way to check whether a word in that .txt file contains a "." and, if so, add it to another string?

I've looked around, but I've only found String.Contains, which tests a whole string rather than a single word.

If that doesn't exist, is there a way to split the text into a string array of words and then test each word individually in a for loop?

Here's an example:

Domain list name the first domain ip magicdomain.com name the second domain ip magicdomain2.com
etc.
Hexo

5 Answers

Consider this code:

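        // Split on spaces; the char overload compiles on every .NET version.
        var words = text.Split(' ');

        foreach (var word in words)
            if (word.Contains("lookup"))   // use "." to find the domains
                Console.WriteLine("found it");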
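
Applied to the question, the same loop can test for "." and collect the matches. This is a minimal sketch, assuming the file content is already in a string; `DomainFilter`, `ExtractDomains`, and the sample text are hypothetical names chosen for illustration:

```csharp
using System;
using System.Collections.Generic;

static class DomainFilter
{
    // Returns every whitespace-separated word that contains a '.'.
    public static List<string> ExtractDomains(string text)
    {
        var domains = new List<string>();

        // Splitting on several whitespace characters and dropping empty
        // entries copes with line breaks and repeated spaces in the file.
        var words = text.Split(new[] { ' ', '\t', '\r', '\n' },
                               StringSplitOptions.RemoveEmptyEntries);

        foreach (var word in words)
        {
            if (word.Contains("."))
                domains.Add(word);
        }

        return domains;
    }

    static void Main()
    {
        // Sample text modelled on the example in the question.
        var text = "Domain list name the first domain ip magicdomain.com " +
                   "name the second domain ip magicdomain2.com";

        // "Add it to another string": join the matches, one per line.
        Console.WriteLine(string.Join(Environment.NewLine, ExtractDomains(text)));
    }
}
```

Note that any word with a dot passes the test, so a trailing "etc." or a sentence-ending period glued to a word would be collected too. Whether that matters depends on the actual file.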
animaonline
  • This looks exactly like what I'm looking for (My second option). I'll be trying this out right now. – Hexo May 11 '12 at 12:29
Or you can use a Regex for that. Search for "Regex for domain name"; I found this lib useful.

Related SO: Using a C# regex to parse a domain name?
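
For illustration, a regex-based version might look like the sketch below. The pattern is a deliberately simple assumption (runs of letters, digits, or hyphens joined by dots), not a full domain-name validator, and `DomainRegex` and `FindDomains` are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class DomainRegex
{
    // Assumed pattern: two or more labels of letters, digits, or hyphens
    // separated by dots, e.g. "magicdomain.com" or "a.b.example.org".
    private static readonly Regex Pattern =
        new Regex(@"\b[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+\b");

    public static List<string> FindDomains(string text)
    {
        var domains = new List<string>();
        foreach (Match match in Pattern.Matches(text))
            domains.Add(match.Value);
        return domains;
    }

    static void Main()
    {
        var text = "name the first domain ip magicdomain.com " +
                   "name the second domain ip magicdomain2.com";

        foreach (var domain in FindDomains(text))
            Console.WriteLine(domain);
    }
}
```

Unlike a plain `Contains(".")` check, this skips a word such as "etc." because nothing follows the dot. It would still match dotted IP addresses, so the pattern may need tightening for real data.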

Bek Raupov
  • I had a problem, so I used a Regex to solve it... Now I have 2 problems. – Eoin Campbell May 11 '12 at 12:23
  • Regex has its own issues, but once you get the Regex format correct, works like a charm! – Bek Raupov May 11 '12 at 12:24
  • I'm not saying they're not useful... but it will be slower and overkill in this situation if all he's doing is looking for a '.' and I'm guessing there's a simpler solution if there's any sort of structure to his input data. – Eoin Campbell May 11 '12 at 12:25
  • I looked up regex and it's pretty interesting, I'm probably not going to use it for this project but I'll keep it in mind. – Hexo May 11 '12 at 12:36
string.IndexOf returns the zero-based index of the first occurrence of the given character or string, or -1 if it is not found.

if(word.IndexOf('.') > -1)
{
  // got a `.` do something with word
}
Oded
Get the words in the string, then check each word for the desired character.

To get the words from a string, see "Regex : how to get words from a string (C#)".

Sadaf
Are you sure that only the domain portion will contain a "."?

Are there any separator characters used to separate the other information on the same line?

It's hard to suggest something without seeing some of your sample data, but if your domain is always the first thing on a line and is followed by a space, you could use a combination of Substring and IndexOf to get just that first token:

line.Substring(0, line.IndexOf(' ')) 
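
One caveat with the line above: `IndexOf` returns -1 when the line contains no space, and `Substring(0, -1)` then throws an `ArgumentOutOfRangeException`. A guarded sketch, where `LineParser` and `FirstToken` are hypothetical names:

```csharp
using System;

static class LineParser
{
    // Returns the text before the first space, or the whole line if
    // the line contains no space at all.
    public static string FirstToken(string line)
    {
        int spaceIndex = line.IndexOf(' ');
        return spaceIndex < 0 ? line : line.Substring(0, spaceIndex);
    }

    static void Main()
    {
        Console.WriteLine(FirstToken("magicdomain.com some other text")); // magicdomain.com
        Console.WriteLine(FirstToken("magicdomain2.com"));                // magicdomain2.com
    }
}
```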

Alternatively, you could use the string.Split() method to tokenize the line based on some separator character.

Can you post some sample data?

Eoin Campbell