Questions tagged [text-processing]

Mechanizing the creation or manipulation of electronic text.

Text processing covers basic jobs that filter, tokenize, or normalize text. It is often used as a pre-processing step.

1959 questions
0
votes
1 answer

Classify filenames (exported to Excel) based on names/type

For a part of my job we make a comprehensive list based on all files a user has in their drive. These users have to decide per file whether to archive these or not (indicated by Y or N). As a service to these users we manually fill this in for them.…
0
votes
2 answers

sed to copy part of line to end

I'm trying to copy part of a line to append to the…
Sam Lipworth
  • 107
  • 4
  • 12
0
votes
3 answers

sed command to copy lines that have strings

I want to copy the lines that contain certain strings to another file. For example, a file contains the lines below: ram 100 50 gopal 200 40 ravi 50 40 krishna 300 600 Govind 100 34 I want to copy the lines that have 100 or 200 to another file by skipping all the…
Vijay Asok
  • 17
  • 1
  • 7
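
For readers who would rather script this than use sed, the filtering described in this question could be sketched in Python roughly as follows. This is only an illustration: the function name `lines_with_values` is hypothetical, and matching 100 or 200 as a whole whitespace-separated field (rather than as a substring) is an assumption, since the excerpt is cut off before it says.

```python
def lines_with_values(lines, wanted=("100", "200")):
    """Return lines that contain any wanted value as a whole field.

    Assumption: fields are separated by whitespace, and a line matches
    only if one of its fields equals a wanted value exactly.
    """
    wanted = set(wanted)
    return [line for line in lines if wanted & set(line.split())]


# Sample rows taken from the question excerpt.
sample = [
    "ram 100 50",
    "gopal 200 40",
    "ravi 50 40",
    "krishna 300 600",
    "Govind 100 34",
]

for line in lines_with_values(sample):
    print(line)
```

Matching whole fields avoids picking up values such as 1000 or 2001, which a plain substring search would also match.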
0
votes
2 answers

In Guile or other Scheme, how to print to standard output the nth blank delimited field of lines from the input file or standard input?

If Guile is not the best Scheme for this usage, then which one should I be looking at? I'm basically looking for a Guile equivalent of awk '{print $N}'. If Scheme can't do this, then I'd like to know why not.
takatakatek
  • 103
  • 2
0
votes
3 answers

How to scan through file and replace certain phrases?

Please I would like your help with the following issue: This is a sample code customized solely for clarifying my question: File accounts_File = new File("Sample_Folder\\example.txt"); FileInputStream fis = null; BufferedInputStream…
CompilingCyborg
  • 4,760
  • 13
  • 44
  • 61
0
votes
2 answers

Why is the result of stop word removal null? (PHP)

I'm a beginner NLP programmer in PHP. I just want to discuss stop word removal. This is my practice: I have the following declaration of a variable $words = "he's the young man"; and then I remove the common words like this …
Gilang Pratama
  • 439
  • 6
  • 18
0
votes
1 answer

Shell Scripting: Fuse all duplicate lines inside a file into 1

Here is the sample file with the duplicate lines: abs bsa bsc abs bsa bsb Here is what the output should be (no duplicates): abs bsa bsc bsb I tried out the uniq -u command, but it deletes out the duplicate lines, so would it be better to use sed…
Joey
  • 1
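
As a point of comparison with the shell tools mentioned in this question, an order-preserving de-duplication might look like this in Python. It is a minimal sketch, not the accepted answer; the name `unique_lines` is hypothetical.

```python
def unique_lines(lines):
    """Keep the first occurrence of each line, preserving input order."""
    # dict keys keep insertion order (Python 3.7+), so later repeats are dropped.
    return list(dict.fromkeys(lines))


# Sample lines taken from the question excerpt.
sample = ["abs", "bsa", "bsc", "abs", "bsa", "bsb"]

for line in unique_lines(sample):
    print(line)
```

Unlike `uniq`, this does not require the input to be sorted, so the original line order survives.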
0
votes
2 answers

Match hyphen/dash next to certain letters

Input file: >AMSF107-09|Perciformes|COI-5P|GU661092 TAGTA- >AMSF114-09|Perciformes|COI-5P|GU661101 C-ACGC >ANGBF3683-12|Haemulon_sp._B_JJT-2012|COI-5P|JQ741244 -GCAGTT-CA- I want to replace the hyphens in TAGTA-, C-ACGC, and -GCAGTT-CA- with N's…
cooldood3490
  • 2,418
  • 7
  • 51
  • 66
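
One way to picture the replacement asked for in this question is the Python sketch below. It assumes FASTA-style input in which header lines start with `>` and each sequence sits on its own line (the excerpt shows the file flattened, so the line layout is a guess), and the name `mask_gaps` is hypothetical.

```python
def mask_gaps(lines):
    """Replace '-' with 'N' on sequence lines; leave '>' header lines untouched."""
    return [
        line if line.startswith(">") else line.replace("-", "N")
        for line in lines
    ]


# Header and sequence lines taken from the question excerpt,
# split one per line (assumed layout).
sample = [
    ">AMSF107-09|Perciformes|COI-5P|GU661092",
    "TAGTA-",
    ">AMSF114-09|Perciformes|COI-5P|GU661101",
    "C-ACGC",
    ">ANGBF3683-12|Haemulon_sp._B_JJT-2012|COI-5P|JQ741244",
    "-GCAGTT-CA-",
]

for line in mask_gaps(sample):
    print(line)
```

Skipping the header lines matters here because the identifiers themselves contain hyphens (for example `AMSF107-09` and `COI-5P`).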
0
votes
4 answers

Extract group name from one line repeatedly?

I got output from a command like the one below and need to extract the group names. dsAttrTypeNative:memberOf: CN=Grupa_test,OU=Groups,DC=yellow,DC=com CN=Firefox_Install,OU=Groups,DC=yellow,DC=com CN=Network_Admin,OU=Groups,DC=yellow,DC=com So I would like to have…
Yellow
  • 11
  • 1
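
A possible approach to this question, sketched in Python with a regular expression. The desired output is cut off in the excerpt, so this assumes the goal is the value of each `CN=` component and that group names contain no commas or whitespace; `group_names` is a hypothetical helper.

```python
import re


def group_names(text):
    """Return the value of every CN= component found in the text.

    Assumption: a group name runs up to the next comma or whitespace.
    """
    return re.findall(r"CN=([^,\s]+)", text)


# Command output taken from the question excerpt.
sample = (
    "dsAttrTypeNative:memberOf: CN=Grupa_test,OU=Groups,DC=yellow,DC=com "
    "CN=Firefox_Install,OU=Groups,DC=yellow,DC=com "
    "CN=Network_Admin,OU=Groups,DC=yellow,DC=com"
)

for name in group_names(sample):
    print(name)
```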
0
votes
1 answer

How can I save text from bash output which starts with a specific word in a file?

I am using Debian Linux (in a BeagleBone Black). The program I am using prints a lot of information on screen. So far, I have been saving entire output in a text file, which is obviously taking up a large portion of the limited storage space. For…
leaRner
  • 3
  • 2
0
votes
3 answers

text processing to select date range

I have the input below and I want to select lines with dates from now to 2 weeks or 3 weeks and so on. 0029L5 08/19/2017 00:57:33 0182L5 08/19/2017 05:53:57 0183L5 02/17/2018 00:00:16 0091L5 10/19/2022 00:00:04 0045L5 07/27/2017 09:03:56 0059L5…
Sid
  • 161
  • 1
  • 10
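
The date-window selection in this question could be sketched in Python as below. Several things are assumptions: each line is taken to be `ID MM/DD/YYYY HH:MM:SS`, "from now to 2 weeks" is read as the window from `now` to `now` plus two weeks, and `now` is passed in explicitly (with a made-up value in the example) so the result is reproducible. The name `within_weeks` is hypothetical.

```python
from datetime import datetime, timedelta


def within_weeks(lines, weeks, now):
    """Return lines whose timestamp falls between now and now + weeks."""
    end = now + timedelta(weeks=weeks)
    selected = []
    for line in lines:
        # Assumed layout: identifier, date, time, separated by whitespace.
        _ident, date_part, time_part = line.split()
        stamp = datetime.strptime(f"{date_part} {time_part}", "%m/%d/%Y %H:%M:%S")
        if now <= stamp <= end:
            selected.append(line)
    return selected


# Complete rows taken from the question excerpt.
sample = [
    "0029L5 08/19/2017 00:57:33",
    "0182L5 08/19/2017 05:53:57",
    "0183L5 02/17/2018 00:00:16",
    "0091L5 10/19/2022 00:00:04",
    "0045L5 07/27/2017 09:03:56",
]

# Hypothetical "now"; in real use this would be datetime.now().
for line in within_weeks(sample, weeks=2, now=datetime(2017, 8, 10)):
    print(line)
```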
0
votes
3 answers

If first two columns are equal, select top 3 based on descending order of 3rd column

I want to select the top 3 results for every group of lines that share the same first two columns. For example the data will look like, cat data.txt A A 10 A A 1 A A 2 A A 5 A A 8 A B 1 A B 2 A C 6 A C 5 A C…
palansuya
  • 7
  • 3
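
The grouping described in this question can be illustrated with a short Python sketch. It uses only the complete rows visible in the excerpt (the data is truncated after `A C 5`), and `top_n_per_group` is a hypothetical name rather than anything from the question or its answers.

```python
from collections import defaultdict


def top_n_per_group(rows, n=3):
    """For each (col1, col2) pair, keep the n largest third-column values."""
    groups = defaultdict(list)
    for first, second, value in rows:
        groups[(first, second)].append(value)

    result = []
    # Groups come out in first-seen order; values in descending order.
    for (first, second), values in groups.items():
        for value in sorted(values, reverse=True)[:n]:
            result.append((first, second, value))
    return result


# Complete rows taken from the question excerpt.
rows = [
    ("A", "A", 10), ("A", "A", 1), ("A", "A", 2), ("A", "A", 5), ("A", "A", 8),
    ("A", "B", 1), ("A", "B", 2),
    ("A", "C", 6), ("A", "C", 5),
]

for first, second, value in top_n_per_group(rows):
    print(first, second, value)
```

Groups with fewer than three rows are returned whole, which seems the natural reading of "top 3" here but is not stated in the excerpt.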
0
votes
0 answers

Error in replacing pattern with `sed` command

I am trying to replace occurrences of a pattern in a file. Lines in the file look like this: ****time is = 0000 ****time is = 0001 I am trying to do a case-insensitive search for the string time is and want to replace time is = xxxx with ****. I tried using sed…
er34456
  • 11
  • 2
0
votes
1 answer

Is there a trick for word wrapping with non-monospaced fonts?

I'm having a bit of trouble right now trying to implement word wrapping for non-monospaced fonts (The font can be different). I've tried searching for this everywhere but couldn't find a solution. Any tips?
fieryrage
  • 69
  • 6
0
votes
1 answer

php function nl2p to ignore lines that start or end with square brackets

I have a function that is working but can not figure a way to modify it for my needs. Function: function mynl2p($string, $line_breaks = true, $xml = true) { $string = str_replace(array('<p>', '</p>', '<br>', '<br />'), '', $string); if…
Papa Zhi
  • 45
  • 1
  • 11