Questions tagged [text-processing]

Mechanizing the creation or manipulation of electronic text.

Text processing covers basic jobs such as filtering, tokenizing, or normalizing text. It is often used as a pre-processing step.
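As a rough illustration of the three jobs named above, here is a minimal Python sketch that normalizes, tokenizes, and filters a string. The stopword list and the function names are hypothetical, chosen only for this example; real pipelines usually make different choices for each step.

```python
import re
import unicodedata

# Hypothetical stopword list, only for illustration.
STOPWORDS = {"the", "a", "an", "and", "of", "to"}

def normalize(text: str) -> str:
    """Unicode-normalize the text and fold case."""
    return unicodedata.normalize("NFKC", text).casefold()

def tokenize(text: str) -> list[str]:
    """Split the text into runs of word characters."""
    return re.findall(r"\w+", text)

def process(text: str) -> list[str]:
    """Normalize, tokenize, then filter out stopwords."""
    return [tok for tok in tokenize(normalize(text)) if tok not in STOPWORDS]

print(process("The Quick  brown fox, and THE lazy dog."))
# ['quick', 'brown', 'fox', 'lazy', 'dog']
```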

1959 questions
0
votes
2 answers

Fulltext vs id searching speed with MySQL

I have a db with two tables, pages and tags, which are structured as follows: pages: page_id, page_text, page_tags (around 60000 records at any time); tags: tag_id, tag_text (around 300000 records at any time). Each page is associated with a…
Alexandros
  • 4,425
  • 4
  • 23
  • 21
0
votes
1 answer

How to get the text within parentheses in Python?

I am trying to get the text within (German: [ˈadɔlf ˈhɪtlɐ] (About this sound listen); 20 April 1889 – 30 April 1945) in a paragraph. Expected output: German: [ˈadɔlf ˈhɪtlɐ] (About this sound listen); 20 April 1889 – 30 April 1945 I am…
coder7
  • 55
  • 1
  • 1
  • 7
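Because the wanted text itself contains a nested pair of parentheses, a simple non-greedy regex would stop too early. One possible approach, sketched below, is to track the nesting depth and keep only the top-level groups. The function name and the filler words around the sample are hypothetical; only the parenthesized part comes from the question.

```python
def outer_parenthesized(text: str) -> list[str]:
    """Return the contents of each top-level (...) group, nested parentheses included."""
    groups = []
    depth = 0
    start = 0
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i + 1  # content begins after the opening parenthesis
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                groups.append(text[start:i])
    return groups

# The words outside the parentheses are placeholder filler.
sample = (
    "Some text (German: [ˈadɔlf ˈhɪtlɐ] (About this sound listen); "
    "20 April 1889 – 30 April 1945) more text"
)
print(outer_parenthesized(sample)[0])
# German: [ˈadɔlf ˈhɪtlɐ] (About this sound listen); 20 April 1889 – 30 April 1945
```

Unbalanced closing parentheses are ignored, and an unclosed opening parenthesis yields no group.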
0
votes
1 answer

PhpStorm: In-time text replacing

I want PhpStorm to always print -> when I type just - (so no need for Shift + .). Is there any built-in solution? If not, perhaps someone can suggest software for Ubuntu that would do the trick.
Majesty
  • 2,097
  • 5
  • 24
  • 55
0
votes
1 answer

ANTLR suitable for parsing text reports?

I am currently using regular expressions to parse a text report in order to extract various bits of information. While this approach works, the regexes are becoming increasingly difficult to maintain. I am wondering if ANTLR can provide a better way to…
hli
  • 259
  • 3
  • 10
0
votes
1 answer

Filter only certain categories of log records using sed

Please help me understand how I can use sed to reduce log files like this: 2017-06-13 11:47:05.121 [INFO] : Finished obj.clickButton('A1'); 2017-06-13 11:47:05.137 [INFO] : Processing index 2432 2017-06-13 11:47:13.807 [INFO] : start=1497347223552 …
miroxlav
  • 11,796
  • 5
  • 58
  • 99
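The question asks about sed, but the same kind of filtering can be sketched in Python. The excerpt does not say which categories should be kept, so this sketch assumes a "category" is identified by how the message starts; the regular expression, the function name, and the `keep_prefixes` parameter are all assumptions made for illustration.

```python
import re

# Assumed line layout, taken from the two complete sample lines:
# "<date> <time>.<millis> [LEVEL] : <message>"
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} \[(?P<level>\w+)\] : (?P<msg>.*)$"
)

def filter_log(lines, keep_prefixes):
    """Yield only the lines whose message starts with one of keep_prefixes."""
    prefixes = tuple(keep_prefixes)
    for raw in lines:
        line = raw.rstrip("\n")
        match = LINE_RE.match(line)
        if match and match.group("msg").startswith(prefixes):
            yield line

log = [
    "2017-06-13 11:47:05.121 [INFO] : Finished obj.clickButton('A1');",
    "2017-06-13 11:47:05.137 [INFO] : Processing index 2432",
]
for line in filter_log(log, ["Processing index"]):
    print(line)
# 2017-06-13 11:47:05.137 [INFO] : Processing index 2432
```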
0
votes
3 answers

Randomly selecting (units) from a file where a unit is 2 lines.

I want to select random units from a file, where each unit consists of 2 lines. For example, a file looks like this: Adam Apple Mindy Candy Steve Chips David Meat Carol Carrots And I want to randomly subselect lets…
palansuya
  • 7
  • 3
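One way to approach this is to group the lines into consecutive pairs first and then sample whole pairs, so a unit is never split. The sketch below assumes each name in the example sits on its own line; the function name and its parameters are hypothetical.

```python
import random

def sample_units(lines, k, unit_size=2, seed=None):
    """Group lines into consecutive units of unit_size and pick k units at random."""
    units = [
        lines[i:i + unit_size]
        for i in range(0, len(lines) - unit_size + 1, unit_size)
    ]  # a trailing incomplete unit is dropped
    rng = random.Random(seed)
    return rng.sample(units, k)  # sampling without replacement

lines = ["Adam", "Apple", "Mindy", "Candy", "Steve", "Chips",
         "David", "Meat", "Carol", "Carrots"]
for unit in sample_units(lines, 2):
    print("\n".join(unit))
```

`random.sample` raises `ValueError` if `k` is larger than the number of complete units, which seems a reasonable failure mode here.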
0
votes
2 answers

while loop provide just one block of result using awk

I am processing my text file using awk. I have written the code below:

    #!/bin/bash
    l=1
    while [ $l -lt 5 ]
    do
        echo $l
        awk -v L=$l '/^BS[0-5]|^FG[2-7]/ && length<10 {i++}i==L {print}'
        l=$(expr $l + 1)
    done
0
votes
2 answers

Sort text file in bash

I have this text file: 0_0_0_0_1_1_1_1_1 [ 0.01155712 0.5775286 0.01599521 0.383362 0.01155712 ] 0_1_1_0_0_1_1_1_232 [ 4.980576e-09 1.21296e-06 0.0001519765 0.9998468 4.980576e-09 ] 0_1_1_0_0_1_1_1_226 [ 0.009718912 0.5821248 0.013627…
I am not Fat
  • 283
  • 11
  • 36
0
votes
1 answer

collecting text within […] from html pages

I have a blog dataset which has a huge number of blog pages, with blog posts, comments and all blog features. I need to extract only the blog posts from this collection and store them in a .txt file. I need to modify this program as this program should…
0
votes
0 answers

Parse and rewrite CSS with Electron and ReactJS

I'm writing a desktop application with Electron and ReactJS that edits CSS files. I need to scan the CSS looking for a class selector, and then clear the following declaration block and add some new properties. The tricky part is matching the class…
0
votes
2 answers

read multiple entries from user to create a file

I am trying to create a file that is currently created manually by opening the vi editor, pressing Esc + i, pasting a column of entries, and then typing Esc :wq!. I don't want the user to even open the vi editor; the script should ask them to enter the list of data and it…
Sid
  • 161
  • 1
  • 10
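A minimal sketch of the idea in Python: read the pasted entries from any iterable of lines and write them to a file, skipping blank lines. The function name and the output file name are hypothetical. In an interactive script the source could be `sys.stdin`, with the user ending the paste with Ctrl-D; here a list stands in for it.

```python
import os
import tempfile

def write_entries(source, path):
    """Write each non-blank line from source to path; return how many were written."""
    count = 0
    with open(path, "w", encoding="utf-8") as out:
        for line in source:
            entry = line.strip()
            if not entry:
                continue  # ignore blank lines in the pasted column
            out.write(entry + "\n")
            count += 1
    return count

# Stand-in for pasted input; an interactive script could pass sys.stdin instead.
pasted = ["alpha\n", "beta\n", "\n", "gamma\n"]
path = os.path.join(tempfile.mkdtemp(), "entries.txt")
print(write_entries(pasted, path))
# 3
```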
0
votes
1 answer

jq: Convert "header:" "line1" "line2" text file into JSON stream w/ map to lists of strings

How do I convert these lists of text strings into JSON? Text strings: start filelist: /download/2017/download_2017.sh /download/2017/log_download_2017.json /download/2017/log_download_2017.txt start wget: 2017-05-15 20:42:00…
Gabe
  • 226
  • 3
  • 13
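The question is about jq, but the grouping logic can be sketched in Python. The sketch assumes that a header is any line ending in a colon and that every following line belongs to it until the next header; both the rule and the function name are assumptions. The truncated `start wget:` data line from the excerpt is left out of the sample, so that header maps to an empty list.

```python
import json

def group_by_header(lines):
    """Yield one {header: [lines...]} object per header line (a line ending in ':')."""
    header, items = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            if header is not None:
                yield {header: items}
            header, items = line[:-1], []
        elif header is not None:
            items.append(line)
    if header is not None:
        yield {header: items}

text = """start filelist:
/download/2017/download_2017.sh
/download/2017/log_download_2017.json
/download/2017/log_download_2017.txt
start wget:
"""
for obj in group_by_header(text.splitlines()):
    print(json.dumps(obj))
```

Lines that contain a colon in the middle, such as timestamps, are treated as data because only a trailing colon marks a header.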
0
votes
1 answer

jq to convert two text strings into separate json objects

How do I convert these two text strings into separate JSON objects? Text strings: start process: Mon May 15 03:14:09 UTC 2017 logfilename: log_download_2017 JSON output: { "start process": "Mon May 15 03:14:09 UTC 2017" } { "logfilename":…
Gabe
  • 226
  • 3
  • 13
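Outside of jq, the conversion can be sketched in Python by splitting each line on the first `": "` only, so that the colons inside the timestamp stay in the value. The function name is hypothetical, and the sketch assumes one key/value pair per line.

```python
import json

def lines_to_objects(lines):
    """Yield one single-key dict per 'key: value' line, splitting on the first ': '."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        key, sep, value = line.partition(": ")
        if sep:  # skip lines without a 'key: value' shape
            yield {key: value}

text = """start process: Mon May 15 03:14:09 UTC 2017
logfilename: log_download_2017"""
for obj in lines_to_objects(text.splitlines()):
    print(json.dumps(obj))
# {"start process": "Mon May 15 03:14:09 UTC 2017"}
# {"logfilename": "log_download_2017"}
```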
0
votes
4 answers

Multiple rows to single line

Input
Name Rico
Address Australia
Age 24
Name Rica
Address Asia
Age 25
Output
Name Rico, Address Australia, Age 24
Name Rica, Address Asia, Age 25
Can we do this in Unix?
peon
  • 13
  • 3
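The question asks for a Unix solution, but the record-joining logic is easy to sketch in Python. It assumes each field sits on its own input line and that a line starting with `Name` opens a new record; the function name and its parameters are hypothetical.

```python
def join_records(lines, first_field="Name", sep=", "):
    """Join consecutive lines into one line per record; first_field starts a new record."""
    record = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.split(maxsplit=1)[0] == first_field and record:
            yield sep.join(record)
            record = []
        record.append(line)
    if record:
        yield sep.join(record)

rows = ["Name Rico", "Address Australia", "Age 24",
        "Name Rica", "Address Asia", "Age 25"]
for out in join_records(rows):
    print(out)
# Name Rico, Address Australia, Age 24
# Name Rica, Address Asia, Age 25
```

Keying on the first field rather than on a fixed count of three lines keeps the sketch working when a record has a missing or extra field.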
0
votes
0 answers

Word correlation/matching using synonyms

I am trying to match the columns of two different csv files. I have managed to match words with the same synonyms like "house" and "residence" or "notes" and "comments". My problem is that I cannot successfully correlate more complicated…
costisst
  • 381
  • 2
  • 6
  • 22