Questions tagged [file-processing]
333 questions
5
votes
3 answers
Designing file processing that handles many file formats, parsing, validation, and persistence
If you had to design a file processing component/system, that could take in a wide variety of file formats (including proprietary formats such as Excel), parse/validate and store this information to a DB.. How would you do it?
NOTE : 95% of the time…

BoxOfNotGoodery
- 1,001
- 1
- 7
- 24
5
votes
3 answers
Howto create a directory under Linux which behaves like a pipe
We want to create a relative simple document storage but there are some requirements. My idea was, that a file is scanned and handled by a separate tool/daemon when it arrives at storage immediately.
The (pseudo) DMS should provide access via NFS…

rabudde
- 7,498
- 6
- 53
- 91
4
votes
3 answers
Query from database or from memory? Which is faster?
I am trying to improve the performance of a Windows Service, developed in C# and .NET 2.0, that processes a great amount of files. I want to process more files per second.
In its process, for each file, the service does a database query to retrieve…

Guilherme de Jesus Santos
- 4,955
- 5
- 43
- 68
4
votes
6 answers
How do I read a large file from disk to database without running out of memory
I feel embarrassed to ask this question as I feel like I should already know. However, given I don't....I want to know how to read large files from disk to a database without getting an OutOfMemory exception. Specifically, I need to load CSV (or…

Mr Moose
- 5,946
- 7
- 34
- 69
4
votes
3 answers
Read large XML file from webserver without splitting in smaller chunks
I'm downloading a file from a 3rd party server, like so:
Try
req = DirectCast(HttpWebRequest.Create("https://www.example.com/my.xml"), HttpWebRequest)
req.Timeout = 100000 '100 seconds
Resp = DirectCast(req.GetResponse(),…

Adam
- 6,041
- 36
- 120
- 208
4
votes
2 answers
Read binary file in units of f64 in Rust
Assuming you have a binary file example.bin and you want to read that file in units of f64, i.e. the first 8 bytes give a float, the next 8 bytes give a number, etc. (assuming you know endianess) How can this be done in Rust?
I know that one can use…

Sito
- 494
- 10
- 29
4
votes
1 answer
Moving Multiple file from one folder to another based on filename - camel
I have one requirement, where I need to move multiple files present in one folder to another. This should be done based on filename which should be dynamic.
As Far I have tried out pollenrich and file (antInclude) but in both case I got struck.…

Lucifer
- 784
- 2
- 7
- 20
4
votes
4 answers
Perl Program to efficiently process 500,000 small files in a directory
I am processing a large directory every night. It accumulates around 1 million files each night, half of which are .txt files that I need to move to a different directory according to their contents.
Each .txt file is pipe-delimited and contains…

DenairPete
- 61
- 1
- 4
4
votes
3 answers
How to remove certain lines of a large file (>5G) using linux commands
I have files which are very large (> 5G), and I want to remove some lines by the line numbers without moving (copy and paste) files.
I know this command works for a small size file. (my sed command do not recognize -i option)
sed "${line}d" file.txt…

Jiho Noh
- 479
- 1
- 5
- 11
4
votes
4 answers
Efficient processing of sequential files C#
I am developing a system that processes sequential files generated by Cobol systems, currently, I am doing the data processing using several substrings to get the data, but I wonder if there is a more efficient way to process the file than to make…

Alexandre
- 1,985
- 5
- 30
- 55
4
votes
2 answers
Python - reading files from directory file not found in subdirectory (which is there)
I am convinced it is something simply syntactic - I however can not figure out why my code:
import os
from collections import Counter
d = {}
for filename in os.listdir('testfilefolder'):
f = open(filename,'r')
d = (f.read()).lower()
…

Relative0
- 1,567
- 4
- 17
- 23
4
votes
1 answer
reading input from text file into array of structures in c
My structure definition is,
typedef struct {
int taxid;
int geneid;
char goid[20];
char evidence[4];
char qualifier[20];
char goterm[50];
char pubmed;
char category[20];
} gene2go;
I have tab-seperated text file…

fgokmenoglu
- 43
- 1
- 1
- 4
4
votes
2 answers
How to prevent file from being overridden when reading and processing it with Java?
I'd need to read and process somewhat large file with Java and I'd like to know, if there is some sensible way to protect the file that it wouldn't be overwritten by other processes while I'm reading & processing it?
That is, some way to make it…

Touko
- 11,359
- 16
- 75
- 105
3
votes
4 answers
Skipping lines using file_get_contents?
I'm trying to skip the first 2 lines (from reading 3 files) then save back (I already got this done, all that's left is the line skipping)
Is there any way to do this?

allenskd
- 1,795
- 2
- 21
- 33
3
votes
2 answers
SIGKILL adding column in a simple python script
I have a file that has a oneline header and a long column with values. I want to add a second column with values since 10981 (step = 1) until the end of the file (ommiting the header, of course). The problem is that the script needs a lot of memory…

JaimeG
- 35
- 5