I am working on a web crawler, so I parse HTML pages. My problem is that sometimes the page encoding is not UTF-8 (ISO-8859-x, the various Windows-125x code pages, etc.) and my parser fails.
I have tried many solutions in PHP, Java and Node.js to convert the content, but there is always some problem.
Is there a proxy module (nginx, Squid, Varnish, ...) that automatically converts the content charset to UTF-8?
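
To make the conversion step concrete, here is a minimal sketch of what in-application conversion of this kind usually involves. It is written in Python for brevity (not one of the languages listed above) and uses only the standard library. The names `detect_charset` and `to_utf8` are hypothetical, and the lookup order (BOM, then the `Content-Type` header, then a `<meta>` tag, then a Windows-1252 fallback) is an assumption, not a complete implementation of the browser sniffing rules.

```python
import re
from typing import Optional

# Hypothetical helpers for illustration; not taken from any library.
_HEADER_CHARSET = re.compile(r'charset=["\']?\s*([A-Za-z0-9_\-]+)', re.IGNORECASE)
_META_CHARSET = re.compile(
    rb'<meta[^>]+charset=["\']?\s*([A-Za-z0-9_\-]+)', re.IGNORECASE
)


def detect_charset(body: bytes, content_type: Optional[str] = None) -> str:
    """Guess the declared charset of an HTML response."""
    # 1. A UTF-8 byte order mark is the strongest signal.
    if body.startswith(b"\xef\xbb\xbf"):
        return "utf-8-sig"
    # 2. The HTTP Content-Type header, e.g. "text/html; charset=iso-8859-1".
    if content_type:
        match = _HEADER_CHARSET.search(content_type)
        if match:
            return match.group(1)
    # 3. A <meta charset=...> or <meta http-equiv=...> tag near the top of the page.
    match = _META_CHARSET.search(body[:4096])
    if match:
        return match.group(1).decode("ascii")
    # 4. Nothing declared: assume UTF-8.
    return "utf-8"


def to_utf8(body: bytes, content_type: Optional[str] = None) -> bytes:
    """Re-encode an HTML response body as UTF-8."""
    charset = detect_charset(body, content_type)
    try:
        text = body.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong label: fall back to Windows-1252 and replace
        # any bytes that still cannot be decoded.
        text = body.decode("windows-1252", errors="replace")
    return text.encode("utf-8")
```

A call such as `to_utf8(raw_bytes, response_headers.get("Content-Type"))` would return UTF-8 bytes (`raw_bytes` and `response_headers` are placeholders). Pages that declare the wrong charset, or none at all, are where this approach tends to break down, which is the kind of problem described above.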