
I am using Daikon (https://github.com/rii-mango/Daikon) to parse a DICOM file, but I am having trouble with Korean text: the parsed patient name contains extra symbols. When I open the same file with RadiAnt or dcm4che, the name shows no such symbols.

Actual: �$)C김귀순
Expected: 김귀순

Here is a DICOM file with a Korean patient name: https://github.com/rii-mango/Daikon/files/3696509/filenameHQ.zip

Divyanshu Rawat
Mi Phạm

1 Answer


The attribute Specific Character Set (0008,0005) defines the character set(s) used to encode string values in the DICOM dataset. In your file, it reads:

(0008,0005) CS [\ISO 2022 IR 149]                       #  16, 2 SpecificCharacterSet

This means that two character sets are in use:

  1. US ASCII (ISO_IR 6) - the default character set in DICOM, thus not explicitly specified but implicitly the first attribute value (before the backslash)

  2. Korean character set using code extension techniques (ISO 2022 IR 149).
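As a minimal sketch of how the two values above can be read out of the attribute, the following splits the raw string on the backslash and substitutes the default for an empty first value. `parseSpecificCharacterSet` is a hypothetical helper written for illustration, not part of Daikon's API.

```javascript
// Hypothetical helper (not part of Daikon): split a Specific Character Set
// (0008,0005) value into its individual terms.
function parseSpecificCharacterSet(value) {
  // Multiple values in a DICOM string attribute are separated by a backslash.
  const terms = value.split('\\').map((term) => term.trim());
  const multiValued = terms.length > 1;

  return terms.map((term, index) => {
    if (term === '' && index === 0) {
      // An empty first value means the default repertoire. With more than one
      // value, code extensions are in use, so the ISO 2022 form applies.
      return multiValued ? 'ISO 2022 IR 6' : 'ISO_IR 6';
    }
    return term;
  });
}

// The value from the dump above: an empty first value, then the Korean set.
const charsets = parseSpecificCharacterSet('\\ISO 2022 IR 149');
console.log(charsets);
```

For the value in this file, the result lists the default ASCII repertoire first and the Korean set second, matching the two items above.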

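Using two different character sets requires ISO 2022 code extension techniques. These work by inserting an escape sequence into the string that switches the character set. For ISO 2022 IR 149 the sequence is ESC $ ) C (bytes 0x1B 0x24 0x29 0x43), which is exactly the "�$)C" you see in front of the name: the ESC byte is rendered as an unprintable character, followed by the literal "$)C".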
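To illustrate the switching mechanism, here is a rough sketch that splits a raw byte string at the escape sequences and decodes each segment with the matching character set. All function names are hypothetical and none of this is Daikon's API. It only knows two escape sequences (ESC ( B for ASCII and ESC $ ) C for Korean), it does not handle the reset to the default character set that DICOM prescribes at certain delimiters, and it assumes the runtime's `TextDecoder` supports the `euc-kr` label.

```javascript
const ESC = 0x1b;

// Only two escape sequences are listed; a real decoder needs the full table.
const ESCAPE_SEQUENCES = [
  { bytes: [ESC, 0x28, 0x42], term: 'ISO 2022 IR 6' },         // ESC ( B
  { bytes: [ESC, 0x24, 0x29, 0x43], term: 'ISO 2022 IR 149' }, // ESC $ ) C
];

// Return the known escape sequence starting at `pos`, if any.
function matchEscape(bytes, pos) {
  return ESCAPE_SEQUENCES.find((seq) =>
    seq.bytes.every((b, i) => bytes[pos + i] === b));
}

// Split a byte array into segments, each tagged with the active character set.
function splitByEscapes(bytes, initialTerm = 'ISO 2022 IR 6') {
  const segments = [];
  let term = initialTerm;
  let current = [];
  let pos = 0;
  while (pos < bytes.length) {
    const seq = bytes[pos] === ESC ? matchEscape(bytes, pos) : undefined;
    if (seq) {
      // Close the running segment and switch character set; the escape
      // sequence itself is dropped from the output.
      if (current.length > 0) segments.push({ term, bytes: current });
      term = seq.term;
      current = [];
      pos += seq.bytes.length;
    } else {
      // Ordinary data byte (unknown escape sequences are kept as data).
      current.push(bytes[pos]);
      pos += 1;
    }
  }
  if (current.length > 0) segments.push({ term, bytes: current });
  return segments;
}

// Decode the segments. Assumption: 'euc-kr' is available in this runtime.
// Plain ASCII bytes decode identically under 'utf-8'.
function decodeSegments(segments) {
  return segments
    .map(({ term, bytes }) => {
      const label = term === 'ISO 2022 IR 149' ? 'euc-kr' : 'utf-8';
      return new TextDecoder(label).decode(Uint8Array.from(bytes));
    })
    .join('');
}
```

Passing the raw bytes of the patient name through `decodeSegments(splitByEscapes(bytes))` should then yield the name without the leading escape sequence, provided the library gives you access to the undecoded bytes.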

Apparently, Daikonjs (which I had never heard of before) does not support code extension techniques and therefore does not recognize the switch of character set.

EDIT: By the way, I hope that you anonymized the dataset. It looks like real information about the patient, the hospital and the doctor is still present in it. This violates privacy legislation in most countries in the world (not sure about Korea though).

Markus Sabin