
This is the problem piece of code.

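#define CSSPDR     SPI::Mem[2]  // Data Receive / Data Transmit
#define CSSPSR     SPI::Mem[3]  // SPI Status Register
#define SPI_BUSY            ((CSSPSR & 0x10) == 0x10)  // bit 4 set: transfer in progress
#define SPI_READ_BUFF_EMPTY ((CSSPSR & 0x4) == 0)      // bit 2 clear: nothing left to read


ushort Comm(ushort value)
{
    ulong w1 = 1000000, w2 = 1000000;  // iteration-count timeouts
    ulong ret;

    CSSPDR = value;                         // writing starts the transfer
    while (SPI_BUSY && (w1 > 0)) { --w1; }  // wait for completion, or time out

    // Drain the receive buffer; the last value read is the reply.
    do {
        ret = CSSPDR;
    }
    while (!SPI_READ_BUFF_EMPTY && (--w2 > 0));

    return ret;
}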

The above is the problem code for SPI communication: send a value by writing it to a special register, wait for the transmission to finish, then read the reply until the receive buffer is empty (if some junk was buffered, it simply gets overwritten; the last value to arrive is the good one). The wait states, however, are implemented as ugly while() loops with counters.

The remote device takes well under 1ms to process the information and send it back, but there is a lot of data to transfer. It works fine as long as the communication goes without a hitch: the reply usually arrives within a couple of hundred iterations, or rarely, with some noise, within several thousand. Very rarely something is lost to a timeout, which is no big deal: the faulty readout is replaced by a good one a couple of milliseconds later, and the filtering functions deal with the glitch.

But if the flexible tape connecting the CPU to the remote device is damaged, the communication dies, and the timeout counters run down to zero on every single call. As a side effect, the entire application grinds to a halt, with the vast majority of CPU time wasted waiting in these loops. This happens very rarely, but the result is quite ugly, and I'd much prefer a solution that doesn't break the entire system when a definitely non-critical part of it fails.

If I used usleep(1), I would never approach the desired throughput, as it hands control back to the kernel for a period usually considerably longer than 1ms (it only asks not to be woken up earlier than after 1µs; the kernel is free to make the sleep longer and usually does, by quite a bit, even up to 100ms if other tasks are busy). Similarly, putting anything 'heavyweight' or otherwise time-consuming inside the loops would slow down the communication unacceptably, because the reaction to the 'BUSY' bit clearing would be delayed. A delay on the order of 100µs between readouts is about the most I could afford, optionally with an initial delay of 500µs between sending out the data and starting to poll the BUSY bit (there's zero chance it will clear any earlier).

The timeout values of 1000000 iterations were found experimentally as something that "almost never fails". I could try pushing them down, but that doesn't really solve the problem; it just reduces the pain a bit (and may have the side effect of breaking communications that just take a while to get across).
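One way to limit the damage described above, sketched here as an illustration rather than as part of the original code, is to stop paying the full timeout on every call once the link looks dead: after a few consecutive timeouts, skip most transfer attempts and only probe the link occasionally. `LinkGuard`, `tripAfter` and `skipCalls` are hypothetical names, and the sketch assumes `tripAfter >= 1`.

```cpp
// Hypothetical helper (not part of the original code): counts consecutive
// timeouts and, once the link looks dead, suppresses most transfer attempts.
class LinkGuard {
public:
    LinkGuard(unsigned tripAfter, unsigned skipCalls)
        : tripAfter_(tripAfter), skipCalls_(skipCalls) {}

    // True if a transfer should be attempted now; false while backing off.
    bool shouldTry() {
        if (skipLeft_ > 0) {
            --skipLeft_;
            return false;
        }
        return true;
    }

    // Report the outcome of an attempted transfer. `timedOut` means one of
    // the wait loops ran its counter down to zero.
    void report(bool timedOut) {
        if (!timedOut) {
            failures_ = 0;               // link is alive again
            return;
        }
        if (++failures_ >= tripAfter_) {
            skipLeft_ = skipCalls_;      // back off for the next skipCalls_ calls
            failures_ = tripAfter_ - 1;  // a single failed probe trips again
        }
    }

private:
    unsigned tripAfter_;
    unsigned skipCalls_;
    unsigned failures_ = 0;
    unsigned skipLeft_ = 0;
};
```

A caller would skip Comm() whenever `shouldTry()` returns false and call `report()` after each real attempt. That would require Comm() to expose whether `w1` or `w2` reached zero, for example through an extra out-parameter. With the link broken, the full timeout is then paid roughly once per `skipCalls` calls instead of every call.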

Lundin
SF.
  • Some ideas: can you use linux kernel realtime patch? https://rt.wiki.kernel.org/index.php/Main_Page Can you decouple the communication and test SPI_READ_BUFF_EMPTY in the main application loop and only then process the received byte(s)? Then you don't have to worry about waiting. Can you process the received bytes in an interrupt? – petrch Aug 31 '20 at 09:21
  • If the CPU usage of busy waiting when disconnected is the concern, would `pthread_yield` help? However, I'm not sure the scheduler would put you back within 1 ms. If you need to guarantee the 1ms interval, maybe you need an RTOS. – Louis Go Aug 31 '20 at 09:24
  • @petrch: Realtime in our case was causing some serious problems with kernel modules required to operate some essential hardware, so unfortunately it's out of the question. We're using 'poor man's realtime' (similar to what is described [here](https://raspberrypi.stackexchange.com/a/8855/6406) ). CSSPDR is a special register, not merely a memory cell; reading it has the side effect of acknowledging reception, and SPI_READ_BUFF_EMPTY will never change without CSSPDR being read first. As for interrupts, implementing these in Linux userspace is quite tricky... – SF. Aug 31 '20 at 09:27
  • @LouisGo: the only practical difference I noticed between `usleep(1)` and `sched_yield` is that the latter keeps CPU load at 100%. Haven't tried `pthread_yield`, does it differ in some way? – SF. Aug 31 '20 at 09:29
  • They are the same. Sorry, this suggestion is not useful. – Louis Go Aug 31 '20 at 09:34
  • According to the C and C++ tag policies - if the code is in C++ and you are using a C++ compiler, it should be tagged C++ only. – Lundin Aug 31 '20 at 09:42
  • As for the question itself, it's kind of hard to answer without knowing if this is a master or slave device. I'm assuming slave? On a microcontroller you'd simply use an interrupt. Not sure if that's an option in Embedded Linux. Also, those crude delay loops require the variables to be declared as `volatile` or the code will break when optimizations are enabled. – Lundin Aug 31 '20 at 09:45
  • Use the fast timer capabilities of the target platform, e.g. on x86 `rdtsc`. Not an expert on ARM but it also seems to have some [generic timers](https://wiki.osdev.org/ARMv7_Generic_Timers). – rustyx Aug 31 '20 at 09:56
  • @Lundin: The Linux device is the master. The remote device is the slave. The loops don't break for lack of volatile on the counters because the memory-mapped addresses are opened in O_SYNC mode, meaning access to them behaves like access to a volatile. Interrupt-like functionality in Linux userspace is a [rather complicated](https://unix.stackexchange.com/questions/136274/can-i-achieve-functionality-similar-to-interrupts-in-linux-userspace) deal, though if there is no easier option I might need to go that way. – SF. Aug 31 '20 at 10:15
  • @rustyx: That's an interesting suggestion. An attempt to use the built-in Linux timers ( `clock_gettime(CLOCK_MONOTONIC, &tp);`) gives woefully low granularity (the kernel itself pre-empts the app rarely enough that the value doesn't get updated all that often), but if there were a way to access a hardware timer directly, that would be very nice indeed. – SF. Aug 31 '20 at 10:18
  • Do the 100ms sleeps happen if you increase the process priority? – weltensturm Aug 31 '20 at 10:26
  • @weltensturm: maybe not the entire 100ms, but still several ms, enough to significantly reduce the data transfer rate. – SF. Aug 31 '20 at 10:30
  • @SF. If you are master then why wait at all? You should be receiving while you are transmitting. – Lundin Aug 31 '20 at 10:54
  • @Lundin wait, I might have misunderstood which is which in this context. The Linux device sends a command ordering the remote to perform a specific, requested type of measurement and send back the result. I can't expect the result to be available and valid once reception bit is empty. – SF. Aug 31 '20 at 11:06
  • A slave cannot send anything on a SPI bus unless it is receiving data from the master at the same time. It's a full duplex system and the master must provide clock and /SS signals as well. – Lundin Aug 31 '20 at 11:10
  • @Lundin: Definitely master then. The process of obtaining readout is actually two-step, first Comm() is called with the command, with the return value discarded, then next call is issued, sending out a 0 and retrieving the readout value - unless the command was of 'no reply required' type (in which case it's just one call and the return value is discarded.) – SF. Aug 31 '20 at 11:21
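
The timer suggestion in the comments above can be sketched as a wait bounded by elapsed time instead of by iteration count. This is only an illustration under assumptions: `pollUntil` and `checkEvery` are hypothetical names, `checkEvery` is assumed to be at least 1, and `std::chrono::steady_clock` on Linux is typically backed by `clock_gettime(CLOCK_MONOTONIC)`, so the granularity concern SF. raises would need to be measured on the actual target.

```cpp
#include <chrono>

// Hypothetical helper: spin on `ready()` until it returns true or `timeout`
// has elapsed. The clock is read only once per `checkEvery` polls, so the
// common fast path stays a tight loop on the status register.
template <typename Pred>
bool pollUntil(Pred ready, std::chrono::microseconds timeout,
               unsigned checkEvery = 64) {
    using clock = std::chrono::steady_clock;
    const auto deadline = clock::now() + timeout;
    for (;;) {
        for (unsigned i = 0; i < checkEvery; ++i) {
            if (ready()) return true;   // condition met before the deadline
        }
        if (clock::now() >= deadline) return false;  // timed out
    }
}
```

In Comm(), the first wait loop could then become something like `pollUntil([] { return !SPI_BUSY; }, std::chrono::microseconds(2000))`, where the 2000µs figure is purely illustrative. The worst case on a dead link is then a fixed amount of wall-clock time rather than a million iterations of unknown duration.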

1 Answer


Polling is normally a bad idea. This kind of stuff is best done kernel side using interrupts.
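
As a rough illustration of what "kernel side" can look like from an application (an editorial sketch, not code from this answer): if the SPI controller has a kernel driver and the device is exposed through Linux's spidev interface, a transfer is a single blocking `ioctl`. Whether that assumption holds for the hardware in the question is unknown, and `transfer16` is a hypothetical wrapper name.

```cpp
#include <cstdint>
#include <linux/spi/spidev.h>
#include <sys/ioctl.h>

// Hypothetical wrapper: one full-duplex 16-bit transfer through spidev.
// `fd` is an open descriptor for a spidev device node.
// Returns false if the kernel reports an error.
bool transfer16(int fd, std::uint16_t tx, std::uint16_t& rx) {
    spi_ioc_transfer tr{};                              // zero all fields
    tr.tx_buf = reinterpret_cast<std::uintptr_t>(&tx);  // word to send
    tr.rx_buf = reinterpret_cast<std::uintptr_t>(&rx);  // word received
    tr.len = sizeof(tx);                                // length in bytes
    tr.bits_per_word = 16;

    // Blocks until the kernel driver has completed the transfer. How the
    // driver waits (interrupts, DMA or polling) depends on the controller.
    return ioctl(fd, SPI_IOC_MESSAGE(1), &tr) >= 0;
}
```

With this approach the calling thread typically sleeps while the transfer is in flight instead of spinning on a status register, and a failed transfer surfaces as an error return.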

doron