2

I am working an embedded project which involves reading/writing a struct into EEPROM. I am using sprintf to make it easy to display some debugging information.

There are two problems with this code for some reason. The first; sprintf is printing a very strange output. When I print 'addr++' it will follow a pattern '0, 1, 2, 3, 4, 32, ...' which doesn't make sense.

void ee_read(char * buf, unsigned int addr, unsigned int len) {
    unsigned int i;

    sprintf(bf1, "Starting EEPROM read of %u bytes.\r\n", len); // Debug output
    debugr(bf1);

    IdleI2C1();
    StartI2C1();
    WriteI2C1(EE_ADDR | EE_W);
    IdleI2C1();
    WriteI2C1((unsigned char)addr>>8); // Address to start reading data from
    IdleI2C1();
    WriteI2C1((unsigned char)addr&0xFF);
    IdleI2C1();
    RestartI2C1();
    WriteI2C1(EE_ADDR | EE_R);
    IdleI2C1();
    for (i=0; i<len; i++) {
        buf[i] = ReadI2C1(); // Read a byte from EEPROM

        sprintf(bf1, "Addr: %u Byte: %c\r\n", addr, buf[i]); // Display the read byte and the address
        debugr(bf1);

        addr++; // Increment address

        IdleI2C1();
        if (i == len-1) { // This makes sure the last byte gets 'nAcked'
            NotAckI2C1();
        } else {
            AckI2C1();
        }
    }
    StopI2C1();
}

The output from the above is here: https://gist.github.com/3803316 Please note that the about output was taken with %x for the address value (so addr is hex)

The second problem, which you may have noticed with the output, is that it doesn't stop when i > len. It continues further than the output I have supplied, and doesn't stop until the microcontroller's watch dog restarts.

Edit: Calling the function

Location loc;
ee_read(&loc, 0, sizeof(Location));

Declarations:

struct location_struct {
    char lat[12]; // ddmm.mmmmmm
    char latd[2]; // n/s
    char lon[13]; // dddmm.mmmmmm
    char lond[2]; // e/w
    char utc[11]; // hhmmss.sss
    char fix[2]; // a/v
};

typedef struct location_struct Location;

char bf1[BUFFER_SIZE];

I don't think it's a race condition. I disable the interrupts which use bf1. Even then, it would corrupt the whole debug string if that happened, and it certainly wouldn't be so repeatable.

Edit The value of addr starts as zero, which can be seen here: https://gist.github.com/3803411

Edit What this is supposed to do it copy the location structure byte by byte into the EEPROM, and then recall it when it is needed.

Closure So I never did solve this problem. The project moved away from the EEPROM, and I have since changed OS, compiler and IDE. It's unlikely I will replicate this problem.

dantheman
  • 275
  • 2
  • 12
  • 3
    provide variable declaration of bf1 – Riskhan Sep 29 '12 at 06:27
  • 2
    Idea: EEPROM/I2C-related routines corrupt the contents of some registers (possibly system or I/O registers, not just general-purpose ones) or memory (possibly that of memory-mapped devices). Also, is it possible that you may have a race condition somewhere, sharing a global variable in unsafe ways between threads/ISRs/main()? – Alexey Frunze Sep 29 '12 at 06:27
  • Can you add the value of addr to your opening sprintf() ("Starting EEPROM...")? – Scooter Sep 29 '12 at 06:51
  • 3
    Comment out lines until it stops behaving strangely. Then figure out which line causes it. – jpa Sep 29 '12 at 07:11
  • 2
    I commented all of the I2C stuff, and got the expected result (42 lines of: 'addr: 0 byte: .\r\n' 'addr: 1 byte: .\r\n' ... Which suggests that @AlexeyFrunze could be on to something with the I2C corrupting registers. I renamed 'addr' to 'address' which didn't make any difference at all. – dantheman Sep 29 '12 at 07:28
  • Well, examine the I2C code then. – Alexey Frunze Sep 29 '12 at 07:37
  • Your function writes to `buf`, passed in at `&loc`. What is `loc` and how is it declared? If this parameter is incorrect, writing to `buf` will cause data corruption and undefined behaviour. Insufficient information for a definitive answer, hence the comment. Do you have a debugger to step this code and observe its behaviour? – Clifford Sep 29 '12 at 08:54
  • 1
    @Clifford unfortunately I do not have a debugger... I get a segfault. I made my description above more clear. – dantheman Sep 30 '12 at 00:17
  • 1
    I have examined the I2C code, and I don't see anything that could cause a buffer overflow resulted in corrupt memory. There is also no mention of strange behaviour in the datasheet errata. – dantheman Sep 30 '12 at 00:20
  • Are the code and data both on the same chip with the CPU or in external memory? If external, are I2C I/O ports also used for external memory control signals? Or, CPU control? I mean, one port with completely different functions on different pins/bits. If that's the case, a simple bug such as improper bit manipulation (wrong mask, overflow, etc) can affect your program execution. – Alexey Frunze Sep 30 '12 at 00:56
  • 1
    @danthemman: how large is your stack? It looks like this code might be for a very small system with limited stack size. Formatted I/O routines often use a considerable amount of stack space; you may be overrunning the stack, which can cause all sorts of problems. – Michael Burr Sep 30 '12 at 02:54
  • @KeithThompson: The `addr` variable is an `unsigned int` not a pointer. so %u is exactly correct. It refers to the EEPROM location, not a CPU addressable location. – Clifford Sep 30 '12 at 15:40
  • @MichaelBurr: Good point; I have seen formatted I/O implementations that need as much as 4KB of stack. – Clifford Sep 30 '12 at 15:43
  • @Clifford: You're right. I'll delete my earlier comment. – Keith Thompson Sep 30 '12 at 19:37
  • 1
    I am using a PIC18F25K22 chip (8bit) to interface with an EEPROM over I2C. All of the program code and RAM lies internally with the PIC controller, and I am using an EEPROM to hold extra data. I'm not really sure how to diagnose a RAM overflow in this kind of application. However, the microcontroller is told to reset on stack overflow. – dantheman Oct 01 '12 at 01:02
  • @dantheman: You earlier mentioned a segfault; on a PIC18!? Or did you mean the debugger segfaulted? If the latter I'd work on that problem first, you'd solve this problem much faster that way. – Clifford Oct 01 '12 at 18:53
  • 1
    Fortunatly the segfault is with MPLAB. I'll attempt to resolve this problem... – dantheman Oct 01 '12 at 23:53

1 Answers1

1

I'll tell you one thing wrong with your code, this line:

(unsigned char)addr>>8

doesn't do what you seem to need.

It converts the value in addr into an unsigned char which (assuming 8-bit char and either 16-bit int or only using the lower 16 bits of a wider int), will will always give you the lower eight bits.

If you then right shift that by eight bits, you'll always end up with zero.

If your intent is to get the upper eight bits of the address, you need to use:

(unsigned char)(addr>>8)

so that the shift is done first.

paxdiablo
  • 854,327
  • 234
  • 1,573
  • 1,953