
When reading a MIFARE card with Android and converting the data to UTF-8, I get strange characters like �. I'm trying to build an application that can read a kind of ID card we're using. The problem is that I get weird characters between words, and some words are split across blocks, so how can I safely extract the word I'm looking for? For instance, my readings look something like this:

43224���19032019�� at block 2 sektor 2 bindex :8

and here a value is split, with the rest of the number starting with 19 continuing in a new block:

�me Name���M���19

at block 1 sektor 1 bindex :4

930402���NO934951

at block 2 sektor 1 bindex :4

c5 42 4e 49 44 00 07 4f 4f 4f 4f 4f 4f 00 4b 42   "Åbnid" "OOOOOO" "KB"
44 44 44 20 44 44 44 44 44 00 82 4d 00 c9 31 39   "DDD DDDDD" "M" "19"
39 34 34 33 34 32 00 d0 4e 4f 39 36 36 36 35 31   "944342" "NO966651"
00 00 00 00 00 00 70 f7 88 00 00 00 00 00 00 00
30 32 32 20 20 41 53 00 d3 54 4f 54 41 4c 20 4b   "022" "AS" "Total k"
4f 4e 54 52 4f 4c 4c 20 41 53 20 00 c9 30 32 38   "ONTROLL AS" "028"
37 30 34 33 33 00 c9 32 30 32 31 30 32 31 31 00   "70433" "20210211"
00 00 00 00 00 00 70 f7 88 00 00 00 00 00 00 00

This is how I read from the card:

Tag tagFromIntent = intent.getParcelableExtra(NfcAdapter.EXTRA_TAG);
MifareClassic mfc = MifareClassic.get(tagFromIntent);

Here is the code I use for reading, inside a for loop:

 data = mfc.readBlock(bIndex + block); 

and then, to convert the data to UTF-8 for printing, I use:

public String convertByteArrayToUTF8(byte[] bytes) {
    String encoded = null;
    try {
        encoded = new String(bytes, StandardCharsets.UTF_8);
    } catch (Exception e) {
        encoded = new String(bytes, Charset.defaultCharset());
    }
    return encoded;
}

I've tried with ASCII, UTF-16 etc with no luck.

Michael Roland
Fugue
  • If you are reverse engineering a card, the strange characters may be "magic" binary values used to make your job more difficult. They could be checksums, id data, index data, encrypted data or even random data. – Morrison Chang Feb 20 '19 at 09:20
  • Well yes, I have the read keys for these specific cards, so there should not be any magic or encrypted data. I believe that a direct conversion is the problem here, but I'm new to this field. There must be a way to detect whether something is a "UTF-8" char and ignore everything else? – Fugue Feb 20 '19 at 09:42
  • Unclear what you mean by "readkeys". It would help if you had the source/documentation for the encoder, but it sounds like you don't. If the format is confirmed to be fixed by trying different cards, then it's just a matter of loading all the data into a byte array and reading out the fields based on offset. Regardless, good luck. – Morrison Chang Feb 20 '19 at 09:50
  • Which MIFARE Classic card are you scanning? Is it 2K or 4K? – Tarun Kumar Feb 20 '19 at 10:05
  • Read keys as in the keys that authorize reading. It's a MIFARE Classic with 4096 bytes – Fugue Feb 20 '19 at 10:08
  • Why would you expect that all the data on the card is (UTF-8 encoded) strings? – Michael Roland Feb 20 '19 at 10:48
  • I don't know that, but I do know that the cards I have contain an ID number, name, birthday, etc., and I need to figure out how to retrieve them as stated above. It could be ASCII, but I'm also a newbie at this – Fugue Feb 20 '19 at 11:04
  • What I'm trying to do is convert the byte array while getting rid of the �� – Fugue Feb 20 '19 at 12:13
  • Well, the � are bytes decoded into invalid/unprintable characters. Probably they are not part of the actual strings. Consequently, you will have to get the card vendor's specification and follow that specification to decode the data ... or you will have to look into the actual bits and bytes yourself and reverse engineer the actual structure yourself. In both cases blindly treating the bytes that you read from the card as strings (regardless of which encoding you use) won't help you much there. – Michael Roland Feb 20 '19 at 15:50
  • Btw. if you aim to seek help in reverse engineering here, you will certainly need to reveal the raw data (e.g. as hexadecimal representation of the bytes that you read from all blocks of the card) and the values that you actually expect to read. – Michael Roland Feb 20 '19 at 15:50
  • 3032322020415300d3544f54414c204b 4f4e54524f4c4c2041532000c9303238 373034333300c9323032313032313100 Where I expect to read: "022" "AS" "Total Kontroll" "02870433" "20210211" – Fugue Feb 21 '19 at 07:15
  • A Java `String` contains Unicode stored as UTF-16 `char`s (2 bytes each). Going from `byte[]` to a String always involves a conversion using the encoding that the bytes, as text, are in. That is not the case here, since the bytes are not all text. So stay with bytes, maybe wrapped in a `ByteBuffer` for extracting data. Anything else is a guarantee for double the memory, lower speed, and conversion errors. – Joop Eggen Feb 21 '19 at 07:38
  • @Fugue Your data seems to consist of null-terminated strings (maybe ASCII?). Each string is preceded by a single byte that has the upper two bits set to '1' and the remaining bits coding the length of the string (including the terminating null), i.e. 0xC0 + LEN. Since the first string does not start with such a character, there's probably more data in the other sectors that could reveal further information about the exact format. – Michael Roland Feb 21 '19 at 10:23
  • Yes, that is correct: the block above contains data that is completed by what you saw below, the "022". I can't post that string here as it contains some personal data. But to be clear, that string is split between the end of block 1 and the beginning of block 2 – Fugue Feb 21 '19 at 12:06
  • So my task here is really how I should implement my convertByteArrayToASCII method so that everything I get out of it is real ASCII strings without the extra raw data – Fugue Feb 21 '19 at 12:08
  • In that case, you might want to substitute those bytes that you recognized as sensitive information with some stub values and show the rest. It would particularly be interesting if the remaining strings follow the same format (preceded by 0xC0+LEN) and if there are other preceding bytes that might allow us to guess the format. – Michael Roland Feb 21 '19 at 13:23
  • c5 42 4e 49 44 00 07 4f 4f 4f 4f 4f 4f 00 4b 42 "Åbnid" "OOOOOO" "KB" 44 44 44 20 44 44 44 44 44 00 82 4d 00 c9 31 39 "DDD DDDDD" "M" "19" 39 34 34 33 34 32 00 d0 4e 4f 39 36 36 36 35 31 "944342" "NO966651" 00000000000070f78800000000000000 30 32 32 20 20 41 53 00 d3 54 4f 54 41 4c 20 4b "022" "AS" "Total k" 4f 4e 54 52 4f 4c 4c 20 41 53 20 00 c9 30 32 38 "ONTROLL AS" "028" 37 30 34 33 33 00 c9 32 30 32 31 30 32 31 31 00 "70433" "20210211" 00000000000070f78800000000000000 – Fugue Feb 21 '19 at 14:06
  • Above are the hexadecimal values from reading, and to the right the expected readings when converted to ASCII or UTF-8. – Fugue Feb 21 '19 at 14:07
  • Also, as a follow-up question: is there a way to be sure that you always read the same value? Say I want to read the name of the person. There are probably hundreds of these cards, so the name may be floating around in these blocks. Are you seeing any control bytes? – Fugue Feb 22 '19 at 07:35

2 Answers


First of all LOL for the question heading. I was in the same situation when I was a newbie. There is no tutorial online that provides you the exact code to read data from a Mifare classic card.

First understand the memory structure of the Mifare cards.

The memory of a MIFARE Classic card is divided into sectors, which are in turn divided into blocks of 16 bytes.

The MIFARE Classic 1K card has 16 sectors, each of which is divided into four blocks. Doing the math gives the memory size: 16 bytes (1 block) * 4 blocks * 16 sectors = 1024 bytes.

[Figure: MIFARE Classic 1K memory layout]

The MIFARE Classic 4K card has 40 sectors, 32 of which are divided into four blocks and the remaining 8 are divided into 16 blocks. 16 bytes (1 block) * 4 blocks * 32 sectors + 16 bytes (1 block) * 16 blocks * 8 sectors = 4096 bytes. The memory structure is as follows:

[Figure: MIFARE Classic 4K memory layout]

The number on each block is its index. Each sector is protected by keys stored in the last block of the sector, the sector trailer. With blocks counted from 0, block 3 is the trailer of the first sector and block 7 is the trailer of the second. The sector trailer also holds the access conditions, which determine whether each key may read, write, or both. The following figure shows the layout of the sector trailer:

[Figure: layout of the sector trailer block]
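The block arithmetic described above can be sketched in plain Java. This is only an illustration: the class and method names are made up for this example. On Android, `MifareClassic.sectorToBlock(int)` and `MifareClassic.getBlockCountInSector(int)` should give you the same numbers, so you would normally call those instead.

```java
// Sketch of MIFARE Classic block arithmetic (class and method names are illustrative).
final class MifareLayout {
    static final int BLOCK_SIZE = 16;

    /** Blocks per sector: 4 for sectors 0..31, 16 for sectors 32..39 (4K cards only). */
    static int blockCountInSector(int sector) {
        return sector < 32 ? 4 : 16;
    }

    /** Index of the first block of a sector, counting blocks from 0. */
    static int firstBlockOfSector(int sector) {
        return sector < 32 ? sector * 4 : 32 * 4 + (sector - 32) * 16;
    }

    /** Index of the sector trailer (keys and access bits), the last block of the sector. */
    static int trailerBlockOfSector(int sector) {
        return firstBlockOfSector(sector) + blockCountInSector(sector) - 1;
    }

    /** Total memory size in bytes for a card with the given number of sectors. */
    static int totalBytes(int sectorCount) {
        int blocks = 0;
        for (int s = 0; s < sectorCount; s++) {
            blocks += blockCountInSector(s);
        }
        return blocks * BLOCK_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(totalBytes(16));            // 1K card: 1024
        System.out.println(totalBytes(40));            // 4K card: 4096
        System.out.println(trailerBlockOfSector(0));   // 3
        System.out.println(trailerBlockOfSector(1));   // 7
        System.out.println(trailerBlockOfSector(39));  // 255
    }
}
```

Knowing the trailer index matters when you dump a card, because the trailer blocks hold keys and access bits, not user data.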

Moreover, the data written to the card is raw binary (bytes), not text.

Now, the steps you need to follow to read the data are:

step1: check whether the device supports NFC.

step2: check whether the device has an NXP NFC chip (this matters especially for reading MIFARE Classic cards).

step3: get the NFC manager and NFC adapter, and define the tech list of the cards that you want to read.

step4: declare the permission to access the device's NFC (android.permission.NFC in the manifest).

step5: create an intent to detect the card and specify the MIME type you want to read (in most cases all MIME types).

step6: enable and disable foreground dispatch of the adapter in onResume() and onPause() so that your app gets priority for reading the card while your activity is in the foreground.

step7: when the card comes into contact with the device, you can get the tag information from intent.getParcelableExtra(NfcAdapter.EXTRA_TAG);

step8: read the card information, e.g. card type, tech list, etc.

step9: to read the data on the card, connect to the card via the tag info retrieved above.

step10: iterate through all the sectors. Authenticate each sector with its key A (the default key, unless the card issuer changed it); see https://developer.android.com/reference/android/nfc/tech/MifareClassic.html#authenticateSectorWithKeyA(int,%20byte[])

step11: on successful authentication, read the binary data in the blocks of each sector.

step12: convert the binary data to string data so that we can read it.

step13: that's all, do whatever you want to do with the data.
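Steps 10 to 12 can be sketched roughly as below. To keep the example compilable without the Android SDK, it goes through a small interface, `ClassicTag`, which is hypothetical and only mirrors the `MifareClassic` methods of the same names; in an app you would call `connect()` on the `MifareClassic` object first, delegate each method to it, and `close()` it afterwards. `SectorDump` and its methods are likewise made-up names, and `MifareClassic.KEY_DEFAULT` is only the right key for cards whose keys were never changed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

/** Hypothetical abstraction; on Android, delegate each method to android.nfc.tech.MifareClassic. */
interface ClassicTag {
    int getSectorCount();
    int getBlockCountInSector(int sector);
    int sectorToBlock(int sector);
    boolean authenticateSectorWithKeyA(int sector, byte[] key) throws IOException;
    byte[] readBlock(int blockIndex) throws IOException;
}

final class SectorDump {
    /** Reads the data blocks (sector trailers skipped) of every sector that authenticates. */
    static byte[] readDataBlocks(ClassicTag tag, byte[] keyA) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int sector = 0; sector < tag.getSectorCount(); sector++) {
            if (!tag.authenticateSectorWithKeyA(sector, keyA)) {
                continue; // wrong key for this sector: nothing readable, skip it
            }
            int first = tag.sectorToBlock(sector);
            int dataBlocks = tag.getBlockCountInSector(sector) - 1; // last block is the trailer
            for (int i = 0; i < dataBlocks; i++) {
                byte[] block = tag.readBlock(first + i);
                out.write(block, 0, block.length);
            }
        }
        return out.toByteArray();
    }

    /** Hex representation of the raw bytes, which is safer to inspect than a decoded string. */
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }
}
```

Note that sectors which fail to authenticate are skipped entirely, so the offsets in the returned array only correspond to card offsets when every sector could be read.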

Surprise! Get the complete working code at my GitHub repository here: https://github.com/codes29/RFIDReader

Note: I empathise with how you feel as a newbie who got this task when there is no proper tutorial for it. So I updated the code that I wrote after struggling for days.

Here's a sample of what you'll get after successful authentication and reading of the data. The card that I scanned is empty for now, but if it held data, that data would show up instead of the 0's.

[Screenshot: app output after reading an empty card]

Cheers! Happy coding bro!

Tarun Kumar
  • How does that even try to answer the question? Since OP was able to read data from the card they already know how to access a MIFARE tag. The problem seems to be to decode the data from a rather specific card with structured data on it. – Michael Roland Feb 21 '19 at 09:38
  • @MichaelRoland please read the answer once again. The OP doesn't have the right code and isn't even following the correct steps to read the card (which I mentioned). OP also said that he's a newbie. I ran into the same situation and I am familiar with the code that OP is using (I also used the same initially), which isn't going to work. The repository link that I have shared is the result of days spent on R&D and trial and error (aggregated from many sources) and works well for MIFARE Classic cards, including Mifare 4k cards. – Tarun Kumar Feb 21 '19 at 09:55
  • For a whole lot of code please visit https://android.googlesource.com/platform/frameworks/base/+/master/core/java/android/nfc/tech/MifareClassic.java This repository has code for all type of NFC cards, read the whole, understand, extract as per your need and aggregate to make your own card if it is not Mifare classic. If you have mifare classic 1k, 2k or 4k then you can visit my repository itself. It has all you need in a single activity. – Tarun Kumar Feb 21 '19 at 09:59
  • Well, since OP received data from the card, they do have working code that is capable of reading data (though it was not revealed in the question). – Michael Roland Feb 21 '19 at 10:12
  • Well yes, I have working code that reads an NFC card. The card was written by another company and I have received the read key. While I can indeed get all the data out of each block, my problem is converting it back to text, as there seem to be both empty bytes and control bytes in between the words. What I need help with is extracting/converting "real UTF-8/ASCII" text ONLY and NOT the extra bytes. – Fugue Feb 21 '19 at 14:35
  • Are you able to get the binary data in each block? – Tarun Kumar Feb 22 '19 at 05:23
  • Yes, if you see the comment section above, I've posted the values I get out, which I need to convert to text only while ignoring the extra bytes that are not real Unicode – Fugue Feb 22 '19 at 11:35

So the data on your tag (excluding the sector trailers) looks somewhat like this:

C5 42 4E 49 44 00 07 4F 4F 4F 4F 4F 4F 00 4B 42        ÅBNID..OOOOOO.KB
44 44 44 20 44 44 44 44 44 00 82 4D 00 C9 31 39        DDD DDDDD.‚M.É19
39 34 34 33 34 32 00 D0 4E 4F 39 36 36 36 35 31        944342.ÐNO966651
30 32 32 20 20 41 53 00 D3 54 4F 54 41 4C 20 4B        022  AS.ÓTOTAL K
4F 4E 54 52 4F 4C 4C 20 41 53 20 00 C9 30 32 38        ONTROLL AS .É028
37 30 34 33 33 00 C9 32 30 32 31 30 32 31 31 00        70433.É20210211.

This seems to be some form of structured data. Simply converting the whole binary blob into a UTF-8 (or ASCII) encoded string doesn't make much sense. Instead, you will need to reverse engineer the way that the data is structured (or, even better, you try to obtain the specification from the system manufacturer).
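To see where the � characters come from, here is a small sketch (class and method names are made up for illustration). It decodes a few bytes taken from the dump above, where the header byte 0xD3 sits between two strings. In UTF-8, 0xD3 announces a two-byte sequence, but the byte after it is a plain ASCII letter, so the decoder substitutes U+FFFD (�).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class ReplacementCharDemo {
    /** Lenient decoding: malformed input is silently replaced with U+FFFD. */
    static String decodeLenient(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Strict decoding: returns false if the bytes are not valid UTF-8. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "AS", terminating 0x00, header byte 0xD3, then "TO" (bytes taken from the dump)
        byte[] sample = {0x41, 0x53, 0x00, (byte) 0xD3, 0x54, 0x4F};
        System.out.println(decodeLenient(sample).indexOf('\uFFFD') >= 0); // true
        System.out.println(isValidUtf8(sample));                          // false
    }
}
```

This also shows why the try/catch around `new String(bytes, charset)` never triggers: that constructor replaces malformed input instead of throwing.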

From what I can see, it looks as if that data consisted of multiple null-terminated strings embedded into some compact (Tag)-Length-Value format. The first byte seems to be the tag(?) + length, so we have

C5    Length = 5
    42 4E 49 44 00                                               "BNID"
07    Length = 7
    4F 4F 4F 4F 4F 4F 00                                         "OOOOOO"
4B    Length = 11
    42 44 44 44 20 44 44 44 44 44 00                             "BDDD DDDDD"
82    Length = 2
    4D 00                                                        "M"
C9    Length = 9
    31 39 39 34 34 33 34 32 00                                   "19944342"
D0    Length = 16
    4E 4F 39 36 36 36 35 31 30 32 32 20 20 41 53 00              "NO966651022  AS"
D3    Length = 19
    54 4F 54 41 4C 20 4B 4F 4E 54 52 4F 4C 4C 20 41 53 20 00     "TOTAL KONTROLL AS "
C9    Length = 9
    30 32 38 37 30 34 33 33 00                                   "02870433"
C9    Length = 9
    32 30 32 31 30 32 31 31 00                                   "20210211"

The first byte could, for instance, be split into tag and length like this: TTTL LLLL (upper 3 bits encode the tag, lower 5 bits encode the length of the following value). This would give the following tags

  • 0x6 for "BNID", "19944342", "NO966651022  AS", "TOTAL KONTROLL AS ", "02870433", and "20210211"
  • 0x0 for "OOOOOO"
  • 0x2 for "BDDD DDDDD"
  • 0x4 for "M"

Since the lowest of those three tag bits is 0 in every case, the split between tag and length might also be TTLL LLLL (upper 2 bits encode the tag, lower 6 bits encode the length of the following value), which would give tags 0x3, 0x0, 0x1, and 0x2 instead.

Unfortunately, the format doesn't resemble any of the popular formats that I'm aware of. So you could just continue your reverse engineering by comparing multiple different cards and by deriving meaning from the values.

So far, in order to decode the above, you would start by reading the first byte, extract the length from that byte, cut that amount of follow-up bytes and convert them into a string (based on the sample that you provided, ASCII encoding should do). You can then continue with the next byte, extract the length information from it, ...
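A minimal sketch of that decoding loop could look like this. It assumes the TTTL LLLL hypothesis (lower 5 bits are the length, including the terminating 0x00), which is a guess from one sample and not a confirmed format. It also expects the data blocks to be concatenated with the sector trailers removed, because fields run across block boundaries. `CardFieldParser`, `Field`, and `fromHex` are names made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class CardFieldParser {
    /** One decoded field; "tag" is only a guess at what the upper header bits mean. */
    static final class Field {
        final int tag;
        final String value;
        Field(int tag, String value) { this.tag = tag; this.value = value; }
    }

    /** Walks the data as [header byte][value], assuming header = TTTL LLLL. */
    static List<Field> parse(byte[] data) {
        List<Field> fields = new ArrayList<>();
        int pos = 0;
        while (pos < data.length) {
            int header = data[pos++] & 0xFF;
            int tag = header >>> 5;       // upper 3 bits
            int length = header & 0x1F;   // lower 5 bits, includes the terminating 0x00
            if (length == 0 || pos + length > data.length) {
                break; // padding or truncated field: stop rather than guess
            }
            int textLength = length;
            if (data[pos + length - 1] == 0) {
                textLength--; // drop the null terminator
            }
            fields.add(new Field(tag,
                    new String(data, pos, textLength, StandardCharsets.US_ASCII)));
            pos += length;
        }
        return fields;
    }

    /** Helper for the demo: turns a hex string (whitespace allowed) into bytes. */
    static byte[] fromHex(String hex) {
        String clean = hex.replaceAll("\\s+", "");
        byte[] out = new byte[clean.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(clean.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        // The six data blocks from the dump above, sector trailers already removed.
        byte[] data = fromHex(
                "C5424E494400074F4F4F4F4F4F004B42"
              + "44444420444444444400824D00C93139"
              + "39343433343200D04E4F393636363531"
              + "3032322020415300D3544F54414C204B"
              + "4F4E54524F4C4C2041532000C9303238"
              + "373034333300C9323032313032313100");
        for (Field f : parse(data)) {
            System.out.println("tag " + f.tag + ": \"" + f.value + "\"");
        }
    }
}
```

Stopping at a zero length is a design choice for this sketch: unused card memory is typically filled with 0x00, so a zero header is treated as the end of the data. If comparing more cards shows that the 6-bit length is the right reading, only the shift and the mask need to change.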

Michael Roland
  • Alright, this looks like the answer I'm looking for! I'm going to implement a method for that, thanks! – Fugue Feb 25 '19 at 12:54