1

Looking for a bit of advice if I may. I have a method in my PlayStation emulator (Java based for a university thesis which is since finished). It takes an integer memory address, and then returns the byte at that address - redirecting the read to RAM, BIOS ROM, a given I/O port, etc. depending on the address. At the moment this is implemented using a huge bundle of if-else cases which check the range of an address and read from the right place accordingly, returning the byte.

This gives a performance hit of around 9% of overall runtime for me. I figured I may be able to improve this using a dispatch table - essentially a HashMap with autoboxed Integer keys representing the memory addresses and a lambda value to handle the return of the byte depending on the address. Now bearing in mind there are approximately 2.6 million different possible addresses taking into account the memory map of the PS1, this uses a lot more memory - fine with that.

What is puzzling me is that this gives slightly worse performance than the bundle of if-else statements - around 12% of overall runtime. Is there a better way to do what I'm doing? I can't use an array solution (address as a primitive int index and lambda stored at that index) as there are gaps in the address space that this wouldn't handle without an order of magnitude too much memory usage.

I appreciate any other ideas that might get this number down a bit - I realise Java is not a great language for emulation, but part of my thesis was proving it would work (it does). Many thanks.

Regards, Phil

EDIT:

Below is the entire code of the readByte method (the address is converted to long to allow comparison of lower addresses to higher ones at values considered negative for a normal int):

/**
 * This reads from the correct area depending on the address.
 * @param address
 * @return 
 */
public byte readByte(int address) {

    long tempAddress = address & 0xFFFFFFFFL;
    byte retVal = 0;      

    if (tempAddress >= 0L && tempAddress < 0x200000L) { // RAM
        retVal = ram[(int)tempAddress];
    } else if (tempAddress >= 0x1F000000L && tempAddress < 0x1F800000L) { // Expansion Region 1
        // do nothing for now
        ;
    } else if (tempAddress >= 0x1F800000L && tempAddress < 0x1F800400L) { // Scratchpad
        // read from data cache scratchpad if enabled
        if (scratchpadEnabled()) {
            tempAddress -= 0x1F800000L;
            retVal = scratchpad[(int)tempAddress];
        }
    } else if (tempAddress >= 0x1F801000L && tempAddress < 0x1F802000L) { // I/O Ports
        if (tempAddress >= 0x1F801000L && tempAddress < 0x1F801004L) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(expansion1BaseAddress >>> 24);
                    break;
                case 1:
                    retVal = (byte)(expansion1BaseAddress >>> 16);
                    break;
                case 2:
                    retVal = (byte)(expansion1BaseAddress >>> 8);
                    break;
                case 3:
                    retVal = (byte)expansion1BaseAddress;
                    break;
            }
        }
        else if (tempAddress >= 0x1F801004L && tempAddress < 0x1F801008L) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(expansion2BaseAddress >>> 24);
                    break;
                case 1:
                    retVal = (byte)(expansion2BaseAddress >>> 16);
                    break;
                case 2:
                    retVal = (byte)(expansion2BaseAddress >>> 8);
                    break;
                case 3:
                    retVal = (byte)expansion2BaseAddress;
                    break;
            }
        } else if (tempAddress >= 0x1F801008L && tempAddress < 0x1F80100CL) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(expansion1DelaySize >>> 24);
                    break;
                case 1:
                    retVal = (byte)(expansion1DelaySize >>> 16);
                    break;
                case 2:
                    retVal = (byte)(expansion1DelaySize >>> 8);
                    break;
                case 3:
                    retVal = (byte)expansion1DelaySize;
                    break;
            }
        } else if (tempAddress >= 0x1F80100CL && tempAddress < 0x1F801010L) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(expansion3DelaySize >>> 24);
                    break;
                case 1:
                    retVal = (byte)(expansion3DelaySize >>> 16);
                    break;
                case 2:
                    retVal = (byte)(expansion3DelaySize >>> 8);
                    break;
                case 3:
                    retVal = (byte)expansion3DelaySize;
                    break;
            }
        } else if (tempAddress >= 0x1F801010L && tempAddress < 0x1F801014L) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(biosRomDelaySize >>> 24);
                    break;
                case 1:
                    retVal = (byte)(biosRomDelaySize >>> 16);
                    break;
                case 2:
                    retVal = (byte)(biosRomDelaySize >>> 8);
                    break;
                case 3:
                    retVal = (byte)biosRomDelaySize;
                    break;
            }
        } else if (tempAddress >= 0x1F801014L && tempAddress < 0x1F801018L) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(spuDelaySize >>> 24);
                    break;
                case 1:
                    retVal = (byte)(spuDelaySize >>> 16);
                    break;
                case 2:
                    retVal = (byte)(spuDelaySize >>> 8);
                    break;
                case 3:
                    retVal = (byte)spuDelaySize;
                    break;
            }
        } else if (tempAddress >= 0x1F801018L && tempAddress < 0x1F80101CL) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(cdromDelaySize >>> 24);
                    break;
                case 1:
                    retVal = (byte)(cdromDelaySize >>> 16);
                    break;
                case 2:
                    retVal = (byte)(cdromDelaySize >>> 8);
                    break;
                case 3:
                    retVal = (byte)cdromDelaySize;
                    break;
            }
        } else if (tempAddress >= 0x1F80101CL && tempAddress < 0x1F801020L) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(expansion2DelaySize >>> 24);
                    break;
                case 1:
                    retVal = (byte)(expansion2DelaySize >>> 16);
                    break;
                case 2:
                    retVal = (byte)(expansion2DelaySize >>> 8);
                    break;
                case 3:
                    retVal = (byte)expansion2DelaySize;
                    break;
            }
        } else if (tempAddress >= 0x1F801020L && tempAddress < 0x1F801024L) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(commonDelay >>> 24);
                    break;
                case 1:
                    retVal = (byte)(commonDelay >>> 16);
                    break;
                case 2:
                    retVal = (byte)(commonDelay >>> 8);
                    break;
                case 3:
                    retVal = (byte)commonDelay;
                    break;
            }
        } else if (tempAddress >= 0x1F801040L && tempAddress < 0x1F801050L) {
            // read from ControllerIO object
            retVal = cio.readByte((int)tempAddress);
        } else if (tempAddress >= 0x1F801060L && tempAddress < 0x1F801064L) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(ramSize >>> 24);
                    break;
                case 1:
                    retVal = (byte)(ramSize >>> 16);
                    break;
                case 2:
                    retVal = (byte)(ramSize >>> 8);
                    break;
                case 3:
                    retVal = (byte)ramSize;
                    break;
            }
        }
        else if (tempAddress >= 0x1F801070L && tempAddress < 0x1F801074L) { // Interrupt Status Register
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(interruptStatusReg >>> 24);
                    break;
                case 1:
                    retVal = (byte)(interruptStatusReg >>> 16);
                    break;
                case 2:
                    retVal = (byte)(interruptStatusReg >>> 8);
                    break;
                case 3:
                    retVal = (byte)interruptStatusReg;
                    break;    
            }
        }
        else if (tempAddress >= 0x1F801074L && tempAddress < 0x1F801078L) { // Interrupt Mask Register
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(interruptMaskReg >>> 24);
                    break;
                case 1:
                    retVal = (byte)(interruptMaskReg >>> 16);
                    break;
                case 2:
                    retVal = (byte)(interruptMaskReg >>> 8);
                    break;
                case 3:
                    retVal = (byte)interruptMaskReg;
                    break;    
            }
        }
        else if (tempAddress >= 0x1F801080L && tempAddress < 0x1F801100L) {
            retVal = dma.readByte(address);
        }
        else if (tempAddress >= 0x1F801100L && tempAddress < 0x1F801104L) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(timer0.counterValueRead() >>> 24);
                    break;
                case 1:
                    retVal = (byte)(timer0.counterValueRead() >>> 16);
                    break;
                case 2:
                    retVal = (byte)(timer0.counterValueRead() >>> 8);
                    break;
                case 3:
                    retVal = (byte)timer0.counterValueRead();
                    break;
            }
        }
        else if (tempAddress >= 0x1F801104L && tempAddress < 0x1F801108L) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(timer0.counterModeRead(false) >>> 24);
                    break;
                case 1:
                    retVal = (byte)(timer0.counterModeRead(false) >>> 16);
                    break;
                case 2:
                    retVal = (byte)(timer0.counterModeRead(false) >>> 8);
                    break;
                case 3:
                    retVal = (byte)timer0.counterModeRead(false);
                    break;
            }
        }
        else if (tempAddress >= 0x1F801108L && tempAddress < 0x1F80110CL) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(timer0.counterTargetRead() >>> 24);
                    break;
                case 1:
                    retVal = (byte)(timer0.counterTargetRead() >>> 16);
                    break;
                case 2:
                    retVal = (byte)(timer0.counterTargetRead() >>> 8);
                    break;
                case 3:
                    retVal = (byte)timer0.counterTargetRead();
                    break;
            }
        }
        else if (tempAddress >= 0x1F801110L && tempAddress < 0x1F801114L) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(timer1.counterValueRead() >>> 24);
                    break;
                case 1:
                    retVal = (byte)(timer1.counterValueRead() >>> 16);
                    break;
                case 2:
                    retVal = (byte)(timer1.counterValueRead() >>> 8);
                    break;
                case 3:
                    retVal = (byte)timer1.counterValueRead();
                    break;
            }
        }
        else if (tempAddress >= 0x1F801114L && tempAddress < 0x1F801118L) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(timer1.counterModeRead(false) >>> 24);
                    break;
                case 1:
                    retVal = (byte)(timer1.counterModeRead(false) >>> 16);
                    break;
                case 2:
                    retVal = (byte)(timer1.counterModeRead(false) >>> 8);
                    break;
                case 3:
                    retVal = (byte)timer1.counterModeRead(false);
                    break;
            }
        }
        else if (tempAddress >= 0x1F801118L && tempAddress < 0x1F80111CL) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(timer1.counterTargetRead() >>> 24);
                    break;
                case 1:
                    retVal = (byte)(timer1.counterTargetRead() >>> 16);
                    break;
                case 2:
                    retVal = (byte)(timer1.counterTargetRead() >>> 8);
                    break;
                case 3:
                    retVal = (byte)timer1.counterTargetRead();
                    break;
            }
        }
        else if (tempAddress >= 0x1F801120L && tempAddress < 0x1F801124L) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(timer2.counterValueRead() >>> 24);
                    break;
                case 1:
                    retVal = (byte)(timer2.counterValueRead() >>> 16);
                    break;
                case 2:
                    retVal = (byte)(timer2.counterValueRead() >>> 8);
                    break;
                case 3:
                    retVal = (byte)timer2.counterValueRead();
                    break;
            }
        }
        else if (tempAddress >= 0x1F801124L && tempAddress < 0x1F801128L) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(timer2.counterModeRead(false) >>> 24);
                    break;
                case 1:
                    retVal = (byte)(timer2.counterModeRead(false) >>> 16);
                    break;
                case 2:
                    retVal = (byte)(timer2.counterModeRead(false) >>> 8);
                    break;
                case 3:
                    retVal = (byte)timer2.counterModeRead(false);
                    break;
            }
        }
        else if (tempAddress >= 0x1F801128L && tempAddress < 0x1F80112CL) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(timer2.counterTargetRead() >>> 24);
                    break;
                case 1:
                    retVal = (byte)(timer2.counterTargetRead() >>> 16);
                    break;
                case 2:
                    retVal = (byte)(timer2.counterTargetRead() >>> 8);
                    break;
                case 3:
                    retVal = (byte)timer2.counterTargetRead();
                    break;
            }
        }
        else if (tempAddress >= 0x1F801810L && tempAddress < 0x1F801814L) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(gpu.readResponse() >>> 24);
                    break;
                case 1:
                    retVal = (byte)(gpu.readResponse() >>> 16);
                    break;
                case 2:
                    retVal = (byte)(gpu.readResponse() >>> 8);
                    break;
                case 3:
                    retVal = (byte)gpu.readResponse();
                    break;
            }
        }
        else if (tempAddress >= 0x1F801814L && tempAddress < 0x1F801818L) {
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(gpu.readStatus() >>> 24);
                    break;
                case 1:
                    retVal = (byte)(gpu.readStatus() >>> 16);
                    break;
                case 2:
                    retVal = (byte)(gpu.readStatus() >>> 8);
                    break;
                case 3:
                    retVal = (byte)gpu.readStatus();
                    break;
            }
        }
        else if (tempAddress >= 0x1F801800L && tempAddress < 0x1F801804L) { // CDROM
            switch ((int)tempAddress & 0xF) {
                case 0:
                    retVal = cdrom.read1800();
                    break;
                case 1:
                    retVal = cdrom.read1801();
                    break;
                case 2:
                    retVal = cdrom.read1802();
                    break;
                case 3:
                    retVal = cdrom.read1803();
                    break;
            }
        }
        else if (tempAddress >= 0x1F801C00L && tempAddress < 0x1F802000L) {
            // fake SPU read
            retVal = spu.readByte(address);
        }
    } else if (tempAddress >= 0x1F802000L && tempAddress < 0x1F803000L) { // Expansion Region 2 (I/O Ports)
        // read from BIOS post register
        if (tempAddress == 0x1F802041L) {
            retVal = biosPost;
        }
    } else if (tempAddress >= 0x1FA00000L && tempAddress < 0x1FC00000L) { // Expansion Region 3 (Multipurpose)
        // do nothing for now
        ;
    } else if (tempAddress >= 0x1FC00000L && tempAddress < 0x1FC80000L) { // BIOS ROM
        // read from memory mapped BIOS file
        tempAddress -= 0x1FC00000L;
        retVal = biosBuffer.get((int)tempAddress);
    } else if (tempAddress >= 0xFFFE0000L && tempAddress < 0xFFFE0200L) { // I/O Ports (Cache Control)
        if (tempAddress >= 0xFFFE0130L && tempAddress < 0xFFFE0134L) { // Cache Control Register
            int shift = (int)(tempAddress & 0x3L);
            switch (shift) {
                case 0:
                    retVal = (byte)(cacheControlReg >>> 24);
                    break;
                case 1:
                    retVal = (byte)(cacheControlReg >>> 16);
                    break;
                case 2:
                    retVal = (byte)(cacheControlReg >>> 8);
                    break;
                case 3:
                    retVal = (byte)cacheControlReg;
                    break;    
            }
        }            
    }

    return retVal;
}
PhilPotter1987
  • 1,306
  • 2
  • 12
  • 20
  • Just to be clear you need to map addresses to what? functions? If you have 2.6 millions addresses that's an array of 2.600.000 * 8 bytes for an array, that's roughly 21mb or RAM, why is this too much to handle? – Jack Feb 27 '17 at 00:24
  • @jack *it returns the byte at that address* – Bohemian Feb 27 '17 at 00:27
  • @Bohemian: no, in an emulator it will call a function that will write or read from that specific address space block, the problem is that the address space is divided into multiple parts which should be handled differently, eg: `byte memoryRead(int address) { if (address >= 0x00010000 && address < 0x0001FFFF) return bios.memoryRead(address); else if ...` – Jack Feb 27 '17 at 00:28
  • 1
    Some code would be helpful. It's impossible to tell exactly what you're doing that might be an issue. Are you setting up a `HashMap` with every possible address? – ajb Feb 27 '17 at 00:31
  • @ajb essentially the current code is exactly as written by Jack above (the if-else version essentially). The RAM and BIOS are backed by byte arrays but other areas need to be generated dynamically such as timer values, I/O ports etc. My initial tweak has been to setup a hashmap with every possibly address as a key, and a lambda (shared in cases where it is just reading from RAM or ROM address as these are contiguous). There are around 2.6 million key value pairs. – PhilPotter1987 Feb 27 '17 at 00:39
  • In general, if you have a large number of *ranges* that you need to check, a `SortedMap` (such as a `TreeMap`) could be an answer, since it gives you a way to find "the largest key <= a search value" or "the smallest key >= a search value". That would be faster than a linear search or a series of if-then-elses, if the number of ranges is greater than some threshold (I don't know what, maybe about 10 to 20). I'm still not sure this is the answer to your problem, though. I need to see your code. – ajb Feb 27 '17 at 00:40
  • @ajb I've put the code of the method up now. – PhilPotter1987 Feb 27 '17 at 02:16

4 Answers4

2

It seems the real problem with your current implementation isn't the long if-else chain, but their highly-repetitive contents. This is a classic example of a pattern that's a good fit for object oriented programming. You should define a common interface for accessing segments of memory, so that each conditional block only needs to look like (picked a random block):

...
else if (tempAddress >= 0x1F801104L && tempAddress < 0x1F801108L) {
  return timer0.read(tempAddress);
}
...

There's a number of ways you could design such a class, but I'd suggest providing an interface and an abstract class that handles the switch behavior:

interface Location {
  byte read(long address);
}

abstract class IntWordLocation implements Location {
  @Override
  final byte read(long address) {
    int value = readInt(address & 0xfffffffd); // clear lower two bits;
    int shift = 3 - (int) (address & 0x3);
    for (int i = 0; i < shift; i++) { // replaces switch statement
      value = value >>> 8;
    }
    return (byte) value;
  }

  abstract protected int readInt(long int);
}

And you can add other Location implementations as needed. Now we've decoupled reading a particular segment of memory (which Location is now responsible for) from looking up which segment an address represents (your if-else blocks), and now your your readByte() method should be much simpler.


If you want to further clean up your readByte() method, you might find Guava's RangeMap useful - it allows you to efficiently represent mappings of ranges of values, meaning you could store one entry for each memory location, rather than attempt to map each address individually. The map would consist of only a few entries, rather than having it be massively large.

For example:

RangeMap<Long, Location> addressRanges =
  new ImmutableRangeMap.Builder<Integer, Location>()
    .put(Range.closedOpen(0L, 0x200000L), ramLocation)
    .put(Range.closedOpen(0x1F000000L, 0x1F800000L), NO_OP_LOCATION)
    .put(Range.closedOpen(0x1F800000L, 0x1F800400L), scratchpadLocation)
    // ...
    .build();

Then your readByte() method simply becomes:

public byte readByte(int address) {
  return addressRanges.get(address).read(address);
}

This has an added benefit of sanity-checking your address-space, since ImmutableRangeMap.Builder will reject overlapping ranges. You can also see any gaps by constructing a RangeSet of the map's keys and calling `RangeSet.complement().

dimo414
  • 47,227
  • 18
  • 148
  • 244
  • That sounds really interesting - are you aware of its performance characteristics before I take this approach though? I guess I may have to just try it... ;-) – PhilPotter1987 Feb 27 '17 at 00:49
  • It's backed by a `TreeMap`, so it's O(log n) access, but since the contents are so much smaller it's likely going to be much faster in practice. – dimo414 Feb 27 '17 at 00:58
2

The best approach depends on your implementation under the hood. I see that the address space of the PSX is 32 bit, but as with many console, zones are mirrored. Now without seeing your actual implementation it's just guessing but here's some considerations.

I'll start considering this table

 KUSEG     KSEG0     KSEG1
  00000000h 80000000h A0000000h  2048K  Main RAM (first 64K reserved for BIOS)
  1F000000h 9F000000h BF000000h  8192K  Expansion Region 1 (ROM/RAM)
  1F800000h 9F800000h    --      1K     Scratchpad (D-Cache used as Fast RAM)
  1F801000h 9F801000h BF801000h  8K     I/O Ports
  1F802000h 9F802000h BF802000h  8K     Expansion Region 2 (I/O Ports)
  1FA00000h 9FA00000h BFA00000h  2048K  Expansion Region 3 (whatever purpose)
  1FC00000h 9FC00000h BFC00000h  512K   BIOS ROM (Kernel) (4096K max)
        FFFE0000h (KSEG2)        0.5K   I/O Ports (Cache Control)

So for I/O ports there not much you can do since they're separated and must be handled specifically we can try to study how to improve addressing of everything else.

We can see that mirrored regions differ from the 4 most relevant bits. Which means that we can do address &= 0x0FFFFFFF so that we ignore the region and consider only the significative part of the address.

So now we have 3 kinds of addresses:

  • the one starting at 0x0000000, mapped to main RAM
  • the group starting at 0xF000000 and ending at 0xFC00000 (+ bios rom)
  • the I/O ports at 0xFFFF0000

This could lead to an hybrid approach in which you use both if/else and a cache, eg:

byte readMemory(int address) 
{
  if ((address & 0xFF000000) == 0xFF000000)
    return ioPorts.read(address);

  // remove most significative nibble, we don't need it
  address &= 0x0FFFFFFF;

  // 0xF000000 zone
  // according to bios rom size you could need a different kind of comparison since it may wrap over 0xFFFFFFF
  if ((address & 0xF000000) == 0xF000000)
  {
    // now your address space is just from 0xF000000 to 0xFC00000 + size of BIOS ROM (4mb max?)
  }
  else
  {
    // we don't know if you map bios together with ram or separately
    return mainRam.readMemory(address);
  }
}

Now we have that address space between 0xF000000 and 0xFC000000 that must be split into multiple parts. As you can see from the memory map we have this:

F000000h
F800000h
F801000h
F802000h
FA00000h
FC00000h 

If you notice you can see that first 4 bits are always 0xF while last 12 bits are always 0, so we don't need them to understand where to dispatch the call. This means that the interesting part of the address has the following mask 0x0FFF000, so we could translate the address:

address = (address >>> 12) & 0xFFF;

Now these are only 4096 possible values which could fit in a tight LUT table.

Jack
  • 131,802
  • 30
  • 241
  • 343
  • This looks like it might be where I end up going. I remove the mirroring before hand using a virtual to physical mapping method. This approach is therefore doubly valid. – PhilPotter1987 Feb 27 '17 at 01:30
  • The mirroring is only part of the problem, but you really should show us some code because that's just the guess game at the moment. Let me know if performance is better and by how much because I'm curious (and already spent my time writing emulators but in C++ which is a different story :) – Jack Feb 27 '17 at 01:31
  • I've put the code of the method up now. I shall have to wait until after work to test your ideas, but thanks for replying :-) – PhilPotter1987 Feb 27 '17 at 02:17
  • I tried some of this out on my lunch break - some basic segmentation was enough to improve the performance - I profiled it to see which address ranges were most frequently used. It turns out ram is the biggest (92% avg), then bios (around 6%), then everything else. This when profiled for a few minutes, along with a tweak to do bulk transfers in the DMA engine when doing CDROM to RAM transfers, is enough to double the performance give or take :-) Many thanks. – PhilPotter1987 Feb 27 '17 at 15:18
  • @PhilPotter1987: you're welcome, glad to have helped! Nice performance improvements indeed. – Jack Feb 27 '17 at 17:11
  • Yeah, getting there - gotta rewrite the timer system next as that has a 21% overhead at the moment with all the synchronisation going on and I know I can do better than that. My thesis goal was to get it working, now it is to speed it up. With the timer rewrite and the dynamic recompiler (which is half done) it will be capable of fully realtime play :-) – PhilPotter1987 Feb 27 '17 at 21:07
1

A HashMap is O(1) look up performance, but that doesn't mean it's "instantaneous". Internally there are lots of if tests and hashcode calculations etc, more it seems than is going on with your conditions.


If there's only a couple of million bytes, I would just use a byte[] and look it up - there is no faster impl.

Bohemian
  • 412,405
  • 93
  • 575
  • 722
  • The problem with using a byte[] is many values are generated dynamically - that said the vast number of reads go to RAM and BIOS. Also, the address space is not contiguous overall. – PhilPotter1987 Feb 27 '17 at 00:40
  • @PhilPotter1987 How many different possibilities are there? Would it work to have another array of bytes where each byte is some sort of code to tell you how to get the data--maybe 0 = "not valid", 1 = "read from an array that simulates memory", 2 = "call another method that reads a I/O port", etc.? Two byte arrays of 2.6 million bytes each isn't a lot of memory. – ajb Feb 27 '17 at 00:45
  • @ajb off the top of my head around 100 different possible ranges a byte could come from. Sorry, not in front of code right now but yes this is an interesting idea :-) – PhilPotter1987 Feb 27 '17 at 00:48
  • @PhilPotter1987 then don't use byte[] - that's not really part of the answer. – Bohemian Feb 27 '17 at 00:50
  • @PhilPotter1987 I wasn't asking about different possible ranges, I was asking about different possible _actions_. I'm not concerned with ranges. If the total range is about 2.6 million, you can afford to have a byte array representing memory data for those addresses where you'd read memory. The array elements would be wasted for addresses that aren't in any range or that map to I/O ports, but that should not be an issue. You shouldn't need to worry about the non-contiguousness of memory. – ajb Feb 27 '17 at 00:55
  • @ajb sorry I've not been clear with my terminology. The total number of actions is around the 100 mark. The total range is a 32-bit address space. So the array would be 4 GB if I went with this approah., as addresses are quite disparate. That said, it could be somewhat smaller, I'd need to run the math. – PhilPotter1987 Feb 27 '17 at 01:00
  • @PhilPotter1987 that changes everything. – ajb Feb 27 '17 at 01:09
1

You mentioned auto-boxing in your question.

Keep in mind that such conversions from int to Integer do not come for free. In that sense you have to make sure to not cause too many unnecessary / unwanted boxing operations to take place.

In order to give you a more detailed answer, we would need to see some of your code.

But I agree with the other feedback you got: if your memory fits into a single array of bytes, then use that; and just make sure to provide reasonable interfaces around that.

GhostCat
  • 137,827
  • 25
  • 176
  • 248
  • Yes it autoboxes as I can't use a primitive as a key with a hashmap. I haven't tested the performance of the autoboxing as I read somewhere that autoboxed ints are cached if used as map keys? – PhilPotter1987 Feb 27 '17 at 00:51
  • Nope. Integer objects are objects and their creation is not for free. That is one of the few reasons why primitive types are sometimes the better choice - when you find that object type solutions don't give you the performance you need. But again: show more code. – GhostCat Feb 27 '17 at 00:55