Questions tagged [character-properties]

character-properties refers to the set of attributes that the Unicode Standard assigns to each character it encodes. Processes and algorithms interpret these properties in order to implement the character's behavior.

The Unicode Standard, on top of defining the encoding of characters, also associates a rich set of semantics with each encoded character: properties that are required for interoperability and correct behavior in implementations, as well as for Unicode conformance. These semantics are cataloged in the Unicode Character Database (UCD), a collection of data files that list the Unicode code points, the character names, and the associated property values.
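As a rough illustration of what these properties look like in practice, the sketch below reads a few of them through Python's standard `unicodedata` module, which exposes data derived from the UCD. The helper names `describe` and `is_alphanumeric` are hypothetical and chosen only for this example; treating "alphanumeric" as "general category starts with L or N" is an assumption, not a definition taken from the Unicode Standard.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Collect a few UCD-derived properties for a single character.

    Hypothetical helper for illustration only.
    """
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<unnamed>"),  # fallback if no name exists
        "category": unicodedata.category(ch),       # General_Category, e.g. "Lu"
        "bidi": unicodedata.bidirectional(ch),      # Bidi_Class, e.g. "L"
        "combining": unicodedata.combining(ch),     # Canonical_Combining_Class
    }


def is_alphanumeric(text: str) -> bool:
    """True if text is non-empty and every character is a letter (L*) or number (N*).

    Assumption: "alphanumeric" means general categories L* and N* only.
    """
    return bool(text) and all(
        unicodedata.category(c)[0] in "LN" for c in text
    )


if __name__ == "__main__":
    print(describe("A"))
    print(is_alphanumeric("\u0117\u010d\u010d\u011991"))  # precomposed letters + digits
    print(is_alphanumeric("$120D"))                       # "$" is a currency symbol
```

Regex engines that support `\p{L}` and `\p{N}` consult the same General_Category property, so a check like this is roughly what a pattern such as `^[\p{L}\p{N}]+$` expresses.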

More information can be found on Wikipedia, in the official Unicode Standard as well as in this Unicode Technical Report.

92 questions
1
vote
5 answers

Regular expression in Java that takes as input alphanumeric followed by forward slash and then again alphanumeric

I need a regular expression that accepts alphanumeric characters followed by a forward slash and then more alphanumeric characters. How do I write a regular expression in Java for this? An example is as follows: adc9/fer4. I tried by using regular expression as…
Android_programmer_camera
  • 13,197
  • 21
  • 67
  • 81
1
vote
3 answers

Ruby: how to check if a UTF-8 string contains only letters and numbers?

I have a UTF-8 string, which might be in any language. How do I check that it does not contain any non-alphanumeric characters? I could not find such a method in the UnicodeUtils Ruby gem. Examples: ėččę91 - valid; $120D - invalid
krn
  • 6,715
  • 14
  • 59
  • 82
1
vote
1 answer

Unicode Regular Expressions - Fails at 343 characters

I am using the regular expression below to weed out any non-Latin characters. As a result, I found that if I use a string larger than 342 characters, the function fails, everything aborts, and the website connection is reset. I narrowed it down to…
KcYxA
  • 243
  • 1
  • 3
  • 19
1
vote
4 answers

Regex word-breaker in unicode

How do I convert the regular expression \w+ to give me whole words in Unicode, not just ASCII? I use .NET.
fm64
  • 161
  • 1
  • 4
1
vote
1 answer

Inconsistent arithmetic with characters in C++?

I was just playing with characters using a very simple C++ program. Let me explain the situation: #include <iostream> int main(){ char c; std::cin >> c; std::cout << "The integer value of character entered is : " << int(c) <<…
AnkitSablok
  • 3,021
  • 7
  • 35
  • 52
1
vote
1 answer

Unicode name regex

I found many links about this, but none of them worked for me. I used \p{Letter}, but it allowed spaces and digits. I want a Unicode regular expression for a person's name: only letters, such as English, Latin, Russian, Chinese, and those of other European countries etc.…
Jeyhun Rahimov
  • 3,769
  • 6
  • 47
  • 90
1
vote
1 answer

Is there a Unicode equivalent for `\p{Graph}` in Java / POSIX regular expressions?

According to the documentation of java.util.regex.Pattern, the POSIX character class \p{Graph} ([:graph:] in POSIX notation) matches "a visible character: [\p{Alnum}\p{Punct}]". However, this is limited to ASCII characters only. Is there an equivalent…
Hosam Aly
  • 41,555
  • 36
  • 141
  • 182
0
votes
4 answers

regex - search for pattern that starts with alphabetics and ends with alphabets or space

What is the correct regex for matching a string that contains only letters? It must start with a letter and be a continuous string of letters, but it can end with letters OR a space (just a space, not tabs or returns). I have this pattern /^\S*[a-zA-Z]\s*$/…
Jamex
  • 1,164
  • 5
  • 22
  • 34
0
votes
1 answer

RegEx doesn't accept %

What's wrong with this regex: /^[\p{L}\p{N}]+/u? When my senior entered "% openminded", the regex returned false. I need it to accept these formats: "% openminded", "100% openminded", "openminded 100%". What do I need to add to the expression? So that…
user1149244
  • 711
  • 4
  • 10
  • 27
0
votes
4 answers

Checking for specific strings with regex

I have a list of arbitrary length of type String. I need to ensure each String element in the list is alphanumeric or numeric, with no spaces or special characters such as - \ / _ etc. Example of accepted strings…
BOWS
  • 404
  • 2
  • 4
  • 17
0
votes
1 answer

Unicode in Regex and DB Reading/Writing

Good night, I am currently working on a very simple lexical analyzer for human language in C# based on Regex matching, and I am currently facing the problem of specifying a Regex that can match every possible punctuation symbol in the target…
Miguel
  • 3,466
  • 8
  • 38
  • 67
0
votes
2 answers

Java Unicode Regular Expression

I have some text like this: Every person haveue280 sumue340 ambition. I want to replace ue280 and ue340 with \ue280 and \ue340 using a regular expression. Is there a solution? Thanks in advance.
Novice
  • 981
  • 6
  • 12
  • 25
0
votes
2 answers

Character Arithmetic --- Base 8 vs Base 10

When doing character arithmetic, is it a rule that you perform the calculations in base 10 or base 8? My book says 'A' = 101 in base 8 or 65 in base 10, but when I insert the character values in base 8 into an example my book gives about illustrating…
Jessica M.
  • 1,451
  • 12
  • 39
  • 54
0
votes
2 answers

Java: Validate textfield input if it only contains alphabetic characters

How do I validate if a text only contains alphabetic characters? I think we can use Pattern.matches() but I don't know the regular expression for alphabetic characters.
user1757703
  • 2,925
  • 6
  • 41
  • 62
0
votes
7 answers

Is there special syntax to follow when comparing chars in C++?

I've been learning C++, and I tried to create a basic calculator app. The goal is to obtain two numbers from 0-9 from the user, and a mathematical operation (+, -, *, /); if some other character is typed, I want to loop the program to keep prompting…