I am making the request below to YouTube using URLRequest and URLSession. Most of the response looks fine, but some of the script elements that come back contain what seem to be escaped characters: { } [ ] = ' " appear as \\x7b \\x7d \\x5b \\x5d \\x3d \\x27 \\x22
    let url = URL(string: "https://www.youtube.com/channel/UCPHWVzGcW-iozudjp8U984g/")
    guard let requestUrl = url else { fatalError() }
    var request = URLRequest(url: requestUrl)
    request.httpMethod = "GET"
    let task = URLSession.shared.dataTask(with: request) { [self] (data, response, error) in
        if let error = error {
            print("Error took place \(error)")
        }
        if let data = data, let dataString = String(data: data, encoding: .utf8) {
            print("Response data string:\n \(dataString)")
        }
    }
    task.resume()
I have made the same request in Java using okhttp3 and did not see any encoding left in these script elements there, and they also look fine when I inspect the page source in multiple browsers.
I tried to remove them using replacingOccurrences, which works, but the JSON is still malformed, so I must be missing some of the other escape sequences being returned. Is there a built-in way to remove this encoding, or to get URLSession to not leave it encoded?
Here is a sample:
<script nonce=\"koFDr1miSKW8U9aJTnGQVw\">var ytInitialData = \'\\x7b\\x22responseContext\\x22:\\x7b\\x22serviceTrackingParams\\x22:\\x5b\\x7b\\x22service\\x22:\\x22GFEEDBACK\\x22,\\x22params\\x22:\\x5b\\x7b\\x22key\\x22:\\x22browse_id\\x22,\\x22value\\x22:\\x22UCPHWVzGcW-iozudjp8U984g\\x22\\x7d,\\x7b\\x22key\\x22:\\x22logged_in\\x22,\\x22value\\x22:\\x220\\x22\\x7d,\\x7b\\x22key\\x22:\\x22e\\x22,\\x22value\\x22:\\x2224022617,24023962,24014268,24022308,23968386,24022875,24025790,24025869,23857948,24006666,24022914,23923339,23976696,23983296,23944779,23744176,23990877,24021968,24021668,23966208,24011119,23891346,24006795,24023271,24001373,23934970,23987676,23897180,23891344,23804281,23974595,24016478,24007246,24012654,24024964,1714255,24002010,23946420,23997485,23884386,24019883,23882502,23918597,24012117,23969934,24014440\\x22\\x7d\\x5d\\x7d,\\x7b\\x22service\\x22:\\x22CSI\\x22,\\x22params\\x22:\\x5b\\x7b\\x22key\\x22:\\x22c\\x22,\\x22value\\x22:\\x22MWEB\\x22\\x7d,\\x7b\\x22key\\x22:\\x22cver\\x22,\\x22value\\x22:\\x222.20210406.03.00\\x22\\x7d,\\x7b\\x22key\\x22:\\x22yt_li\\x22,\\x2"..
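For reference, here is a minimal sketch of decoding the escapes in the sample above in one pass instead of chaining replacingOccurrences calls. It assumes the raw response contains JavaScript-style \xNN escapes with a single backslash (the doubled backslashes in the sample may only be an artifact of how the string was printed). The names decodeHexEscapes and hexValue are hypothetical helpers, not Foundation APIs, and any escape other than \xNN is deliberately copied through unchanged.

```swift
import Foundation

/// Returns the numeric value of an ASCII hex digit, or nil for anything else.
/// Hypothetical helper written for this sketch.
func hexValue(_ scalar: Unicode.Scalar) -> UInt8? {
    switch scalar {
    case "0"..."9": return UInt8(scalar.value - 48)
    case "a"..."f": return UInt8(scalar.value - 87)
    case "A"..."F": return UInt8(scalar.value - 55)
    default: return nil
    }
}

/// Decodes `\xNN` escapes in a single left-to-right pass.
/// Assumption: other escapes (`\\`, `\/`, `\uXXXX`, ...) are left untouched.
func decodeHexEscapes(_ input: String) -> String {
    let scalars = Array(input.unicodeScalars)
    var output = String.UnicodeScalarView()
    var i = 0
    while i < scalars.count {
        let current = scalars[i]
        // Anything that is not a backslash with a following character is copied as-is.
        guard current == "\\", i + 1 < scalars.count else {
            output.append(current)
            i += 1
            continue
        }
        if scalars[i + 1] == "x", i + 3 < scalars.count,
           let high = hexValue(scalars[i + 2]),
           let low = hexValue(scalars[i + 3]) {
            // `\xNN` names the code point U+00NN.
            output.append(Unicode.Scalar(high * 16 + low))
            i += 4
        } else {
            // Copy the backslash and the next character together, so an
            // escaped backslash is never mistaken for the start of `\x`.
            output.append(current)
            output.append(scalars[i + 1])
            i += 2
        }
    }
    return String(output)
}

// Usage with a shortened, made-up fragment in the same style as the sample.
let sample = "\\x7b\\x22key\\x22:\\x22browse_id\\x22\\x7d"
let decoded = decodeHexEscapes(sample)
print(decoded) // {"key":"browse_id"}

if let object = try? JSONSerialization.jsonObject(with: Data(decoded.utf8)) {
    print(object)
}
```

Scanning once avoids order-dependent mistakes that can creep in when escapes are replaced one pattern at a time. Whether this alone is enough for the real ytInitialData string is not something the sketch can confirm, since the string may contain other escape forms that still need handling before JSONSerialization accepts it.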