
I know mixing text and binary is awful, but I have to do this.

I want to replace the binary content, which sits between "Content-Type: image" and "-----", with the string "XXXXXXXX".

My test code is:

# coding=utf-8
import re
raw_data = open('r_img.txt').read()
#data = re.sub(r"Content-Type: image.*?-----","Content-Type: imageXXXXXXX-----", raw_data, re.S)
data = re.sub(r"Content-Type: image[^-]*-----","Content-Type: imageXXXXXXX-----", raw_data, re.S)
print data

And the file r_img.txt would be:

Content-Disposition: form-data; name="commodity_pic1"; filename="C:\Documents and Settings\tim\My Documents\My Pictures\Pic\222A8888.jpg"

Content-Type: image/pjpeg



EEE? JFIF  H H  EEE C 

EEE C       

 EEEWhfEEE[e?EEEEEEqEEEEEEEEEEEEEEEZIOEEE(r5?-iEEEEEEEEEEEEEEE?EEE?EEEEEE
-----------------------------7db27132d0198

I have tried string.replace() and re.sub(), but I still can't find the answer.

pvd
  • Why would you not use Python's multipart MIME capabilities? – Ignacio Vazquez-Abrams Jul 08 '11 at 07:34
  • For some reason, I have to extract product information from a MySQL database, then construct a SOAP request and send it to a remote server with Python's suds library. But some of the extracted information mixes binary data with text. – pvd Jul 08 '11 at 07:44
  • That doesn't really answer my question. – Ignacio Vazquez-Abrams Jul 08 '11 at 07:45
  • Sorry, I am new to Python and had never heard of its multipart MIME support before. Thanks for the advice, I will search for more details. – pvd Jul 08 '11 at 07:52

1 Answer


This works for me:

data = re.sub(r"Content-Type: image.*-----","Content-Type: imageXXXXXXX-----", 
              raw_data, 0, re.DOTALL)

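Essentially, it greedily matches every character between Content-Type: image and the last ----- in the data. The 0 is the count argument, and 0 (the default) means "replace all occurrences of this pattern". You can't simply drop it when passing the flag positionally, because the fourth positional argument of re.sub is count, not flags; that is why your re.S was being treated as a count rather than as a flag. On Python 2.7 or later you can pass flags=re.DOTALL by keyword instead. re.DOTALL changes the meaning of . from "any character except a newline" to "any character, including newlines". If the data contains more than one image part, the non-greedy .*? from your commented-out line is probably safer, since the greedy version matches through to the last -----.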
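For what it's worth, here is a minimal sketch of the same substitution written with a keyword flag and a non-greedy match. It is not the exact call above: the helper name mask_image_payload and the in-memory sample are made up for illustration, and it assumes the data is handled as bytes (for example, read with open('r_img.txt', 'rb')), which seems the safer choice when binary is mixed in.

```python
import re

# Non-greedy variant of the pattern above, compiled for bytes input.
# re.DOTALL lets "." match newlines inside the binary payload.
IMAGE_PART = re.compile(br"Content-Type: image.*?-----", re.DOTALL)


def mask_image_payload(raw):
    """Replace each span from 'Content-Type: image' up to the next '-----'.

    Hypothetical helper for illustration; `raw` is assumed to be bytes.
    """
    return IMAGE_PART.sub(b"Content-Type: imageXXXXXXX-----", raw)


# Made-up sample: one image part followed by one plain text part.
boundary = b"-" * 29 + b"7db27132d0198"
sample = (
    b"Content-Type: image/pjpeg\r\n\r\n"
    b"\xff\xd8\xff\xe0-JFIF\x00\x01binary-ish bytes\r\n"
    + boundary + b"\r\n"
    b'Content-Disposition: form-data; name="title"\r\n\r\n'
    b"hello\r\n"
    + boundary + b"\r\n"
)

masked = mask_image_payload(sample)

# Same substitution with the greedy pattern, for comparison: it runs
# through to the last "-----" and swallows the text part as well.
greedy = re.sub(br"Content-Type: image.*-----",
                b"Content-Type: imageXXXXXXX-----",
                sample, flags=re.DOTALL)
```

With the non-greedy pattern the text part after the image survives, while the greedy one removes it along with the image bytes. Note that either pattern stops early if the binary payload itself happens to contain five consecutive dashes, so matching on the full boundary string would be more robust.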

HTH!

mac