
I am trying to search and replace some text (e.g. 'Smith, John') in this PDF form data file (header.fdf, which I presume should be treated as a binary file):

'%FDF-1.2\n%\xe2\xe3\xcf\xd3\n1 0 obj\n<</FDF<</Fields[<</V(M)/T(PatientSexLabel)>><</V(24-09-1956  53)/T(PatientDateOfBirth)>><</V(Fisher)/T(PatientLastNameLabel)>><</V(CNSL)/T(PatientConsultant)>><</V(28-01-2010 18:13)/T(PatientAdmission)>><</V(134 Field Street\\rBlackburn BB1 1BB)/T(PatientAddressLabel)>><</V(Smith, John)/T(PatientName)>><</V(24-09-1956)/T(PatientDobLabel)>><</V(0123456)/T(PatientRxr)>><</V(01234567891011)/T(PatientNhsLabel)>><</V(John)/T(PatientFirstNameLabel)>><</V(0123456)/T(PatientRxrLabel)>>]>>>>\nendobj\ntrailer\n<</Root 1 0 R>>\n%%EOF\n'

After running

f=open("header.fdf","rb")
s=f.read()
f.close()
s=s.replace(b'PatientName',name)

the following error occurs:

Traceback (most recent call last):
  File "/home/aj/Inkscape/Med/GAD/gad.py", line 56, in <module>
    s=s.replace(b'PatientName',name)
TypeError: expected an object with the buffer interface

How best to do this?

ajo
  • What is `b` supposed to mean in `b'PatientName'`? Lose it! ;-) – Nas Banov Jul 02 '10 at 00:51
  • @Nas Banov, `b` means bytes in Python 2.6+, much the same as `r` for raw strings or `u` for unicode – John La Rooy Jul 02 '10 at 00:58
  • @gnibbler: actually, I can't find any mention of b'' in anything before **Python 3.x**, so no wonder I never heard of this new-fangled thing :). The error looks like it comes from Python 3 too. – Nas Banov Jul 02 '10 at 02:21

2 Answers

f=open("header.fdf","rb")
s=str(f.read())
f.close()
s=s.replace(b'PatientName',name)

or

f=open("header.fdf","rb")
s=f.read()
f.close()
s=s.replace(b'PatientName',bytes(name))

Probably the latter, as I don't think you will be able to use Unicode names with this type of substitution anyway.
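
For Python 3 specifically, where `str` and `bytes` are distinct types, the second variant above needs an explicit encoding. The following is a minimal sketch rather than the answerer's original code: the `latin-1` encoding and the sample `data` and `name` values are assumptions chosen for illustration.

```python
name = "Doe, Jane"  # hypothetical replacement text; a str in Python 3

try:
    bytes(name)  # Python 3: a str cannot be converted without an encoding
except TypeError as exc:
    print("bytes(name) failed:", exc)

# Assumed encoding: latin-1 maps code points 0-255 to single bytes.
name_bytes = bytes(name, "latin-1")  # equivalent to name.encode("latin-1")

# Sample fragment standing in for the contents of header.fdf (assumed data).
data = b"<</V(Smith, John)/T(PatientName)>>"

# Both arguments to bytes.replace are bytes objects, so the call succeeds.
result = data.replace(b"Smith, John", name_bytes)
print(result)
```

Names containing characters outside latin-1 would raise `UnicodeEncodeError` here, which matches the caveat about Unicode names.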

John La Rooy
  • Thanks for your response. It worked fine on my Linux system, but not when I tried it on a Windows system: `f=open("header.fdf","rb")`, `s=f.read()`, `print(s)` (or `print(str(s))`) results in `%FDF-1.2 %âãÏÓ 1 0 obj << /FDF << /Fields [ << /V (þÿ`. I guess the encoding is wrong, but I am not sure where to go from here... – ajo Jul 06 '10 at 11:44

You must be using Python 3.X. You didn't show how `name` is defined in your example, but that is the problem. You likely defined it as a Unicode string:

name = 'blah'

It needs to be a bytes object too:

name = b'blah'

This works:

Python 3.1.2 (r312:79149, Mar 21 2010, 00:41:52) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> f = open('file.txt','rb')
>>> s = f.read()
>>> f.close()
>>> s
b'Test File\r\n'
>>> name = b'Replacement'
>>> s=s.replace(b'File',name)
>>> s
b'Test Replacement\r\n'

When calling replace on a bytes object, both arguments must be bytes objects as well.
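
Putting this together for a file on disk, the read, replace, and write steps might look like the sketch below. The helper name `replace_in_binary_file` and the sample contents are hypothetical, and the demonstration uses a temporary file instead of a real header.fdf.

```python
import os
import tempfile


def replace_in_binary_file(path, old, new):
    """Replace every occurrence of the bytes `old` with `new` in the file at `path`."""
    with open(path, "rb") as f:    # binary mode: read() returns bytes
        data = f.read()
    data = data.replace(old, new)  # both arguments are bytes objects
    with open(path, "wb") as f:    # write the modified bytes back
        f.write(data)
    return data


# Demonstration on a temporary file with assumed sample contents.
with tempfile.NamedTemporaryFile(suffix=".fdf", delete=False) as tmp:
    tmp.write(b"<</V(Smith, John)/T(PatientName)>>")

updated = replace_in_binary_file(tmp.name, b"Smith, John", b"Doe, Jane")
os.remove(tmp.name)  # clean up the temporary file
print(updated)
```

This sketch does not escape `(`, `)` or `\` in the replacement text, which a PDF literal string would require.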

Mark Tolonen