
I have a C++ program that loads a file with a few million lines and processes it. The same operation used to be done by a PHP script, but I switched to C++ to reduce the execution time.

In the old script, I checked whether a file named after the current operation id exists in a "pause" folder. The file is empty; it only signals that a pause has been requested. Every 5 iterations the script checks for such a file, and if it exists, the script spins in an empty loop until the file is deleted (i.e. the job is resumed):

foreach ($lines as $line)
{
    $isFinished = $index >= $countData - 1;
    if ($index % 5 == 0)
    {
        // Busy-wait for as long as the pause file exists
        do
        {
            $isPaused = file_exists("/home/pauses/".$content->{'drop-id'});
        } while ($isPaused);
    }
    // Starts processing the line here
}

But since disk access is relatively slow, I don't want to follow the same approach. Instead, I was thinking of commands along these lines:

$ kill cpp_program // C++ program returns the last index checked e.g: 37710
$ ./main 37710
$ // cpp_program skips the first 37709 lines and continues its job

What do you think of this approach? Is it feasible? Is it cheap in terms of time? Is there a better approach? Thank you.

Edit: A clarification, since this seems a little ambiguous. This task runs in the background and is started by another application. I want to be able to send commands from that management app (through Linux commands) to the background task to pause/resume it.

Soufiane Touil
  • You can send a signal to your process, and upon receiving this signal, it can enter an endless sleep, and the next signal will wake it up. This is one of the relatively low-tech solutions to the problem. – SergeyA Jan 18 '19 at 20:30
  • @SergeyA Any idea how to send a signal to the process without "killing" it? All examples I've seen use the kill command – Soufiane Touil Jan 18 '19 at 20:37
  • Why do you want to pause/resume? Who created/deleted the file with your previous approach? – 463035818_is_not_an_ai Jan 18 '19 at 20:37
  • Are you trying to throttle a task from within the same process or externally? That's not clear to me. – François Andrieux Jan 18 '19 at 20:38
  • @SoufianeTouil: `kill -9 pid` kills a program (9 is `SIGKILL`). But `kill` can be used to send other signals as well. Your example `kill cpp_program` actually sends `SIGTERM` which is a polite request to exit -- it doesn't actually kill. – Ben Voigt Jan 18 '19 at 20:38
  • @user463035818 The file was created by an external app (the management application, because this one runs in the background). I might want to pause/resume because I need to see what's going on on the screen, so if I started a job, I want to be able to finish it later – Soufiane Touil Jan 18 '19 at 20:39
  • @FrançoisAndrieux Externally – Soufiane Touil Jan 18 '19 at 20:40
  • @SoufianeTouil `kill` sends a signal to a process. Normally the process responds to the signal by exiting, but if you have set a signal handler for it, it will execute that handler instead (except for signal 9, SIGKILL, which will always terminate the process). By default, `kill` sends SIGTERM. People often use SIGHUP for similar purposes. Further reading: unix signals. – SergeyA Jan 18 '19 at 20:40
  • [Checkpointing](https://en.wikipedia.org/wiki/Application_checkpointing)? Regular UNIX-y [job control](https://en.wikipedia.org/wiki/Job_control_%28Unix%29)? – genpfault Jan 18 '19 at 20:40
  • @SergeyA The command 'kill -SIGHUP 25044' terminates the execution and prints "Hangup". Should there be some handling of this signal so that it does not terminate the execution? – Soufiane Touil Jan 18 '19 at 20:45
  • Try SIGSTOP/SIGCONT. – Oliv Jan 18 '19 at 20:47
  • @genpfault I don't know about checkpointing, but jobs I believe are scheduled, if so I don't think that will achieve what I need. – Soufiane Touil Jan 18 '19 at 20:47
  • @Oliv Awesome, SIGSTOP terminated the execution but SIGCONT did not, now I need to handle the SIGCONT signal in the program, thank you – Soufiane Touil Jan 18 '19 at 20:49
  • @SoufianeTouil SIGSTOP pauses the process. And SIGCONT resumes it. – Oliv Jan 18 '19 at 20:51
  • @Oliv Oh, I didn't notice that SIGSTOP did not actually terminate the program; it is still in the process list. You can make an answer out of this so I can accept it if you want – Soufiane Touil Jan 18 '19 at 20:53
  • Documentation in `man 7 signal` – Oliv Jan 18 '19 at 20:54

1 Answer


Jumping to line 37710 of a text file sadly requires reading all 37709 lines before it on most operating systems.

On most operating systems, text files are binary files with a convention about newlines. But the OS doesn't cache where the newlines are.

So to find the newlines, you have to read every byte.

If your program saved the byte offset it had reached in the file, however, it could seek directly to that location.
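As a rough illustration of the byte-offset idea, here is a minimal sketch. The helper name `process_from`, its parameters, and the `max_lines` cutoff (a stand-in for "a pause was requested") are all hypothetical and not part of the original program. It opens the file in binary mode so the offsets are plain byte counts, and tracks the position by hand instead of relying on `tellg()`.

```cpp
#include <cstddef>
#include <fstream>
#include <functional>
#include <string>

// Hypothetical helper: processes lines starting at byte `start_offset`,
// stops after `max_lines` lines (a stand-in for "a pause was requested"),
// and returns the byte offset at which a later run should resume.
// Returns -1 if the file cannot be opened.
std::streamoff process_from(const std::string& path,
                            std::streamoff start_offset,
                            std::size_t max_lines,
                            const std::function<void(const std::string&)>& handle_line)
{
    // Binary mode: no newline translation, so offsets are exact byte counts.
    std::ifstream in(path, std::ios::binary);
    if (!in) return -1;

    // Jump straight to the saved position; no earlier line is read.
    in.seekg(start_offset, std::ios::beg);

    std::streamoff resume_at = start_offset;
    std::string line;
    std::size_t done = 0;
    while (done < max_lines && std::getline(in, line)) {
        handle_line(line);
        ++done;
        // getline consumed the line plus its '\n', unless the file ended
        // before a newline was found (eofbit is set in that case).
        resume_at += static_cast<std::streamoff>(line.size()) + (in.eof() ? 0 : 1);
    }
    return resume_at;
}
```

The returned offset is what you would write to disk on pause and pass back as `start_offset` on the next start, so resuming costs one seek rather than a scan over the lines already handled.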

You can save the state of your program to some config file as you are shutting down, and set it to resume by default when it starts up again. This will require catching the signal you use to shut down, making your main logic notice the signal flag being set, and then cleanly shutting down. It is a very C-esque operation.
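The shutdown path described above could look roughly like the sketch below. It assumes SIGTERM is the signal used to stop the program (the default for `kill`); the names `run`, `load_start`, `g_stop_requested`, the per-item callback, and the state-file format (a single integer) are hypothetical choices made for illustration only.

```cpp
#include <csignal>
#include <fstream>
#include <functional>
#include <string>

// Set by the signal handler, polled by the main loop. A handler should do
// as little as possible; writing a volatile sig_atomic_t flag is safe.
volatile std::sig_atomic_t g_stop_requested = 0;

extern "C" void on_terminate(int) { g_stop_requested = 1; }

// Hypothetical main loop: processes items [start, total) and returns the
// index the next run should start from. If it was interrupted, that index
// is also written to `state_path`.
long run(long start, long total, const std::string& state_path,
         const std::function<void(long)>& process_item)
{
    std::signal(SIGTERM, on_terminate);

    long index = start;
    for (; index < total && !g_stop_requested; ++index) {
        process_item(index);  // the real per-line work would go here
    }

    if (index < total) {
        // Interrupted: persist where to resume, then shut down cleanly.
        std::ofstream state(state_path);
        state << index << '\n';
    }
    return index;
}

// Reads the saved index, or returns 0 when there is no usable state file.
long load_start(const std::string& state_path)
{
    std::ifstream in(state_path);
    long index = 0;
    return (in >> index) ? index : 0;
}
```

On startup the program would call `load_start` and pass the result to `run`. The flag is only checked between items, so an item that is in progress when the signal arrives is finished before the state is saved.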


Now, a different traditional way to make a program controllable remotely is to have it listen on a TCP port (and/or stdin) and accept text commands there.

To go that way, you'd write a REPL component, then hook that up to whatever input and output you choose.

Either you'd do the REPL in a coroutine-like way between processing files, or you'd spawn a separate thread to do the REPL and have it communicate asynchronously with the processing thread.
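A minimal sketch of the separate-thread variant might look like the following. The `PauseGate` class, the `command_loop` function, and the command words `pause`, `resume`, and `quit` are all hypothetical; the sketch only covers the command parsing and the hand-off to the worker, not the TCP part.

```cpp
#include <condition_variable>
#include <istream>
#include <mutex>
#include <sstream>
#include <string>

// Shared state between the command (REPL) side and the processing side.
class PauseGate {
public:
    void pause() {
        std::lock_guard<std::mutex> lock(m_);
        paused_ = true;
    }
    void resume() {
        { std::lock_guard<std::mutex> lock(m_); paused_ = false; }
        cv_.notify_all();
    }
    void quit() {
        { std::lock_guard<std::mutex> lock(m_); quit_ = true; paused_ = false; }
        cv_.notify_all();
    }
    // Called by the worker between lines: blocks while paused and
    // returns false once a quit has been requested.
    bool wait_if_paused() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !paused_; });
        return !quit_;
    }
    bool is_paused() const {
        std::lock_guard<std::mutex> lock(m_);
        return paused_;
    }
private:
    mutable std::mutex m_;
    std::condition_variable cv_;
    bool paused_ = false;
    bool quit_ = false;
};

// Reads one command per line from `in` until "quit" or end of input.
// Unknown commands are ignored.
void command_loop(std::istream& in, PauseGate& gate)
{
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string cmd;
        words >> cmd;  // first word only, surrounding whitespace dropped
        if (cmd == "pause") gate.pause();
        else if (cmd == "resume") gate.resume();
        else if (cmd == "quit") { gate.quit(); break; }
    }
}
```

In a real program, `command_loop` would run on its own `std::thread` reading from `std::cin` or a socket wrapper, while the processing loop calls `gate.wait_if_paused()` between lines and stops when it returns false. Waiting on a condition variable means a paused worker sleeps instead of spinning.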

However, this could be beyond your skill. Each step of this (writing a REPL system, having it not block the main work, responding to commands, then attaching it to a TCP port) would take some effort and learning on your part.

Yakk - Adam Nevraumont