Highest Voted 'non-ascii-characters' Questions

33

votes

2 answers

Replace accented characters in R with non-accented counterpart (UTF-8 encoding)

I have some strings in R in UTF-8 encoding that contain accents, e.g. string="Hølmer" or string="Elizalde-González". Is there a nice function in R to replace the accented characters in these strings with their unaccented counterparts? I saw some…

r non-ascii-characters

asked Dec 10 '13 at 13:15

Tom Wenseleers

7,535
7
63
103

33

votes

5 answers

R on Windows: character encoding hell

I am trying to import a CSV encoded as OEM-866 (Cyrillic charset) into R on Windows. I also have a copy that has been converted into UTF-8 w/o BOM. Both of these files are readable by all other applications on my system, once the encoding is…

r csv encoding utf-8 non-ascii-characters

asked Sep 13 '13 at 14:52

user27636

1,070
1
18
26

32

votes

3 answers

matching unicode characters in python regular expressions

I have read through the other questions on Stack Overflow, but I'm still no closer. Sorry if this is already answered, but I couldn't get anything proposed there to work. >>> import re >>> m =…

python regex unicode non-ascii-characters character-properties

asked Feb 17 '11 at 12:08

Weholt

1,889
5
22
35

32

votes

5 answers

How to account for accent characters for regex in Python?

I currently use re.findall to find and isolate words after the '#' character for hashtags in a string: hashtags = re.findall(r'#([A-Za-z0-9_]+)', str1) It searches str1 and finds all the hashtags. This works; however, it doesn't account for accented…

python regex django hashtag non-ascii-characters

asked Sep 06 '13 at 17:48

deadlock

7,048
14
67
115
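
As a rough illustration of the hashtag question above: in Python 3, `str` patterns are Unicode-aware by default, so `\w` already matches accented letters. This is a minimal sketch and is not taken from any of the answers; the helper name `find_hashtags` and the sample text are made up for illustration.

```python
import re

# On a str pattern in Python 3, \w is Unicode-aware: it matches
# accented letters as well as A-Z, a-z, 0-9 and the underscore.
HASHTAG_RE = re.compile(r"#(\w+)")


def find_hashtags(text):
    """Return the words that follow '#' in text (hypothetical helper)."""
    return HASHTAG_RE.findall(text)


print(find_hashtags("Lunch at the #caf\u00e9 with #Jos\u00e9_2013 and #plain"))
```

In Python 2, the same pattern would need unicode strings and the `re.UNICODE` flag to get this behaviour.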

32

votes

1 answer

How do I get accented letters to actually work on bash?

My bash installation on cygwin doesn't handle accented letters properly. I tried adding set input-meta on # to accept 8-bit characters set output-meta on # to show 8-bit characters set convert-meta on # to show it as character, not the octal…

bash cygwin non-ascii-characters

asked Oct 26 '12 at 20:51

Ferdinando Randisi

4,068
6
32
43

27

votes

4 answers

"UnicodeEncodeError: 'ascii' codec can't encode character"

I'm trying to pass big strings of random HTML through regular expressions, and my Python 2.6 script is choking on this: UnicodeEncodeError: 'ascii' codec can't encode character. I traced it back to a trademark superscript at the end of this word:…

regex unicode python-2.6 non-ascii-characters

asked Oct 31 '09 at 00:07

KenBurnsFan1

575
1
9
17
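
The error in the question above can be reproduced in a few lines. The sketch below uses Python 3 rather than the Python 2.6 of the question, and the sample word `Brand` is an assumption, since the excerpt is cut off before the actual word. It shows the strict failure and the standard error handlers that `str.encode` accepts.

```python
# Sample text ending in a trademark sign (U+2122), which ASCII cannot represent.
text = "Brand\u2122"

try:
    text.encode("ascii")  # strict mode is the default
except UnicodeEncodeError as exc:
    print("strict encoding failed:", exc.reason)

# Standard error handlers for str.encode:
print(text.encode("ascii", "ignore"))             # drops the character
print(text.encode("ascii", "replace"))            # substitutes '?'
print(text.encode("ascii", "xmlcharrefreplace"))  # numeric character reference
print(text.encode("utf-8"))                       # lossless alternative to ASCII
```

Which handler is appropriate depends on whether the output has to stay ASCII at all; encoding to UTF-8 avoids losing the character.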

26

votes

2 answers

PyYaml - Dump unicode with special characters ( i.e. accents )

I'm working with YAML files that have to be human-readable and editable but that will also be edited from Python code. I'm using Python 2.7.3. The file needs to handle accents (mostly to handle text in French). Here is a sample of my issue: import…

python unicode yaml non-ascii-characters pyyaml

asked Mar 30 '15 at 09:33

Hans Baldzuhn

317
1
3
9

25

votes

10 answers

Copyleft symbol

Is there any easy way to print the copyleft symbol? https://en.wikipedia.org/wiki/Copyleft For example, as simple as: &copy; © It might be: &anticopy; &anticopy;

html unicode ascii non-ascii-characters

asked May 07 '16 at 11:54

Evhz

8,852
9
51
69

24

votes

1 answer

What's the character code for exclamation mark in circle?

What's the Unicode or Segoe UI Symbols (or other font) code for exclamation mark in circle?

unicode character symbols non-ascii-characters

asked May 03 '16 at 12:52

Waldemar Gałęzinowski

1,125
1
10
18

24

votes

4 answers

Convert Hi-Ansi chars to Ascii equivalent (é -> e)

Is there a routine available in Delphi 2007 to convert the characters in the high range of the ANSI table (>127) to their equivalent ones in pure ASCII (<=127) according to a locale (codepage)? I know some chars cannot translate well but most can,…

delphi character-encoding ascii delphi-2007 non-ascii-characters

asked Dec 11 '09 at 22:10

Francesca

21,452
4
49
90

24

votes

7 answers

Remove non-ASCII non-printable characters from a String

I get user input including non-ASCII characters and non-printable characters, such as \xc2d \xa0 \xe7 \xc3\ufffdd \xc3\ufffdd \xc2\xa0 \xc3\xa7 \xa0\xa0 for example: email : abc@gmail.com\xa0\xa0 street : 123 Main St.\xc2\xa0 desired output: …

java non-ascii-characters

asked Jun 13 '12 at 18:14

daydreamer

87,243
191
450
722

23

votes

3 answers

How to replace accented characters?

My output looks like 'àéêöhello!'. I need to change my output to 'aeeohello', just replacing each accented character with its plain counterpart (for example, à with a).

python python-2.7 non-ascii-characters

asked Jun 08 '17 at 09:25

Ganesh Basuvaraj

231
1
2
3
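
One common approach to the question above is to decompose the string with Unicode normalization and drop the combining marks. This is a minimal sketch in Python 3, not the accepted answer; the helper name `strip_accents` is hypothetical.

```python
import unicodedata


def strip_accents(text):
    """Drop combining marks after canonical decomposition (hypothetical helper)."""
    # NFD splits a letter such as U+00E9 into 'e' plus a combining acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a non-zero class for combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("\u00e0\u00e9\u00ea\u00f6hello!"))
```

This sketch keeps punctuation, so the trailing '!' survives. Letters with no canonical decomposition, such as 'ø', pass through unchanged and would need an explicit mapping.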

23

votes

6 answers

How to ignore acute accent in a javascript regex match?

I need to match a word like 'César' with a regex like /^cesar/i. Is there an option like /i to configure the regex so that it ignores acute accents? Or is the only solution to use a regex like /^césar/i?

javascript regex special-characters diacritics non-ascii-characters

asked Jun 15 '12 at 16:39

sanrodari

1,602
2
13
23

22

votes

2 answers

How to make MySQL work "case insensitive" and "accent insensitive" in UTF-8

I have a schema with "utf8 -- UTF-8 Unicode" as its charset and a collation of "utf8_spanish_ci". All the tables inside it are InnoDB with the same charset and collation as mentioned. Here comes the problem: with a query like SELECT * FROM people p WHERE p.NAME…

mysql utf-8 case-insensitive non-ascii-characters

asked May 31 '12 at 09:40

Lightworker

593
1
5
18

21

votes

7 answers

Regex accent insensitive?

I need a regex in a C# program. I have to capture the name of a file with a specific structure. I used the \w character class, but the problem is that this class doesn't match any accented characters. How can I do this? I just don't want to put the most used…

c# regex diacritics non-ascii-characters

asked Jul 12 '11 at 13:03

J4N

19,480
39
187
340
