Questions tagged [string]

A string is a finite sequence of symbols, commonly used for text, though sometimes for arbitrary data.

Most programming languages provide a dedicated string data type, or more general facilities and conventions for handling strings, along with a way to denote string literals. In some languages, such as Tcl, everything is a string. A supporting library, of varying sophistication, is usually provided as well.

String representations vary widely in the features they offer: the right string type can lower the asymptotic complexity of an algorithm, while the wrong one might not be able to accommodate your string data at all.

The following are some hand-picked representatives:

  • Zero-terminated strings (also known as C strings, ASCIZ, or sz) are arrays of non-null elements terminated by a special null element. Variants using a different terminating symbol are mostly restricted to old systems; DOS, for example, supported $ as a terminator.
  • Counted strings (also known as Pascal strings) are arrays of arbitrary bytes prefixed by a length indicator. Nowadays the size of a counted string is limited only by the available address space, though it was once common to use a single byte for the length (implying a maximum length of 255).
  • Ropes are lists of segments (for example, a length plus pointers into modifiable and non-modifiable buffers) that allow efficient insertion and deletion.
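
As a rough illustration of the first two layouts, the Python sketch below reads a zero-terminated buffer and a length-prefixed one. The helper names read_c_string and read_counted_string, and the choice of a single-byte length prefix, are assumptions made for this example and do not come from any particular library.

```python
def read_c_string(buf: bytes, start: int = 0) -> bytes:
    """Return the bytes from start up to (not including) the first null byte."""
    end = buf.index(b"\x00", start)  # raises ValueError if no terminator exists
    return buf[start:end]


def read_counted_string(buf: bytes, start: int = 0) -> bytes:
    """Return the payload of a string prefixed by a one-byte length."""
    length = buf[start]  # a single length byte caps the payload at 255 bytes
    payload = buf[start + 1 : start + 1 + length]
    if len(payload) != length:
        raise ValueError("buffer shorter than the declared length")
    return payload


c_style = b"hello\x00ignored"
pascal_style = bytes([5]) + b"he\x00lo"  # a counted string may contain null bytes

print(read_c_string(c_style))
print(read_counted_string(pascal_style))
```

Finding the length of the zero-terminated form requires scanning for the terminator, while the counted form stores it up front; that difference is one way the choice of representation affects algorithmic cost.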

Many (especially functional) languages support strings as a list of base symbols.

For Unicode support, a special string-of-strings type is becoming common, because a single user-perceived Unicode character can consist of arbitrarily many code points, so it can be of arbitrary length even in UTF-32. This enables efficient character indexing by pushing the complexities of the character set into the string type.
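
To make the point about character length concrete, here is a small Python sketch using only the standard library. The sample values are chosen for illustration; Python's len() counts code points, not user-perceived characters.

```python
import unicodedata

# "e" followed by COMBINING ACUTE ACCENT: displayed as one character.
decomposed = "e\u0301"
# NFC normalization merges the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

# Man + ZERO WIDTH JOINER + woman + ZERO WIDTH JOINER + girl: one family emoji.
family = "\U0001F468\u200d\U0001F469\u200d\U0001F467"

print(len(decomposed))                      # code points in the decomposed form
print(len(composed))                        # code points after NFC
print(len(decomposed.encode("utf-32-le")))  # bytes needed in UTF-32
print(len(family))                          # code points in the emoji sequence
```

Even with fixed-width UTF-32 code units, the decomposed "é" occupies two units and the emoji sequence five, which is why indexing by user-perceived character needs more than a flat array of code points.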

In most languages, strings can be iterated over, similar to lists/arrays. In some high-level languages (in which strings are a data type unto themselves), strings are immutable, so string operations create new strings.
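
Python is one language where both statements hold, so a minimal sketch of iteration and immutability might look like this (the variable names are arbitrary):

```python
text = "abc"

# Iterating yields one-character strings, much like iterating over a list.
chars = [ch for ch in text]

# str is immutable: item assignment raises TypeError.
try:
    text[0] = "X"
    mutated = True
except TypeError:
    mutated = False

# Operations such as upper() return a new string and leave the original intact.
upper = text.upper()

print(chars, mutated, upper, text)
```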

For text strings, many encodings are in use, though modern usage is converging on Unicode, typically encoded as UTF-8 (some early adopters of Unicode instead transitioned from UCS-2 to UTF-16 as a persistence format).
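
The practical difference between the two encodings shows up in byte counts. The sketch below compares them for a few sample strings; the helper name byte_lengths and the samples are assumptions made for this example.

```python
def byte_lengths(s: str) -> tuple[int, int, int]:
    """Return (code points, UTF-8 bytes, UTF-16 bytes) for s."""
    utf8 = s.encode("utf-8")
    utf16 = s.encode("utf-16-le")  # explicit byte order, so no BOM is added
    return len(s), len(utf8), len(utf16)


samples = ["string", "na\u00efve", "日本語"]

for s in samples:
    points, n8, n16 = byte_lengths(s)
    print(f"{s!r}: {points} code points, {n8} UTF-8 bytes, {n16} UTF-16 bytes")
```

ASCII text is half the size in UTF-8, while text made mostly of CJK characters can be smaller in UTF-16.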

Windows software often adopts the WinAPI convention of using UTF-16 internally, converting at the boundary to external data and persistence rather than at system calls.

A string literal is an occurrence of a string in source code, generally enclosed in dedicated delimiters (for example, in C, C++, and Java a string literal is surrounded by double quotes: "This is a string literal").
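
Delimiters and escape rules differ between languages. As one illustration, Python accepts both ordinary and raw literals; the path below is only a sample value.

```python
# In an ordinary literal, a backslash starts an escape sequence,
# so a literal backslash has to be doubled.
escaped = "c:\\afolder\\afile"

# In a raw literal (prefix r), backslashes are kept as written.
raw = r"c:\afolder\afile"

print(escaped == raw)
print(len(raw))
```

Both literals denote the same 16-character string; only the notation in the source code differs.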

183393 questions
108
votes
5 answers

Java equivalent of C#'s verbatim strings with @

Quick question. Is there an equivalent of @ as applied to strings in Java: For example I can do @"c:\afolder\afile" in C# and have it ignore the escape characters when processing instead of having to do "c:\\afolder\\aFile". Is there a Java…
Simon Rigby
  • 1,786
  • 4
  • 17
  • 28
108
votes
22 answers

How do I use LINQ Contains(string[]) instead of Contains(string)

I got one big question. I got a linq query to put it simply looks like this: from xx in table where xx.uid.ToString().Contains(string[]) select xx The values of the string[] array would be numbers like (1,45,20,10,etc...) the Default for .Contains…
SpoiledTechie.com
  • 10,515
  • 23
  • 77
  • 100
108
votes
5 answers

Byte Array to Hex String

I have data stored in a byte array. How can I convert this data into a hex string? Example of my byte array: array_alpha = [ 133, 53, 234, 241 ]
Jamie Wright
  • 1,259
  • 3
  • 10
  • 11
108
votes
7 answers

Finding last occurrence of substring in string, replacing that

So I have a long list of strings in the same format, and I want to find the last "." character in each one, and replace it with ". - ". I've tried using rfind, but I can't seem to utilize it properly to do this.
Adam Magyar
  • 1,423
  • 3
  • 13
  • 13
107
votes
16 answers

Stripping non printable characters from a string in python

I used to run $s =~ s/[^[:print:]]//g; on Perl to get rid of non printable characters. In Python there's no POSIX regex classes, and I can't write [:print:] having it mean what I want. I know of no way in Python to detect if a character is…
Vinko Vrsalovic
  • 330,807
  • 53
  • 334
  • 373
107
votes
19 answers

Count the number of all words in a string

Is there a function to count the number of words in a string? For example: str1 <- "How many words are in this sentence" to return a result of 7.
John
  • 41,131
  • 31
  • 82
  • 106
107
votes
3 answers

Replace first occurrence only of a string?

I have something like this: text = 'This text is very very long.' replace_words = ['very','word'] for word in replace_words: text = text.replace('very','not very') I would like to only replace the first 'very' or choose which 'very' gets…
jblu09
  • 1,071
  • 2
  • 7
  • 3
107
votes
1 answer

Does Rust's String have a method that returns the number of characters rather than the number of bytes?

Based on the Rust book, the String::len method returns the number of bytes composing the string, which may not correspond to the length in characters. For example if we consider the following string in Japanese, len() would return 30, which is the…
Salvatore Cosentino
  • 6,663
  • 6
  • 17
  • 25
107
votes
25 answers

Plurality in user messages

Many times, when generating messages to show to the user, the message will contain a number of something that I want to inform the customer about. I'll give an example: The customer has selected a number of items from 1 and up, and has clicked…
Øyvind Bråthen
  • 59,338
  • 27
  • 124
  • 151
107
votes
5 answers

Split Spark dataframe string column into multiple columns

I've seen various people suggesting that Dataframe.explode is a useful way to do this, but it results in more rows than the original dataframe, which isn't what I want at all. I simply want to do the Dataframe equivalent of the very…
Peter Gaultney
  • 3,269
  • 4
  • 16
  • 20
107
votes
4 answers

How to convert the PathBuf to String

I have to convert the PathBuf variable to a String to feed my function. My code is like this: let cwd = env::current_dir().unwrap(); let my_str: String = cwd.as_os_str().to_str().unwrap().to_string(); println!("{:?}", my_str); it works but is awful…
xiaoai
  • 1,181
  • 2
  • 7
  • 5
107
votes
2 answers

converting JSON to string in Python

I did not explain my questions clearly at beginning. Try to use str() and json.dumps() when converting JSON to string in python. >>> data = {'jsonKey': 'jsonValue',"title": "hello world"} >>> print json.dumps(data) {"jsonKey": "jsonValue", "title":…
BAE
  • 8,550
  • 22
  • 88
  • 171
107
votes
2 answers

PHP - remove all non-numeric characters from a string

What is the best way for me to do this? Should I use regex or is there another in-built PHP function I can use? For example, I'd want: 12 months to become 12. Every 6 months to become 6, 1M to become 1, etc.
b85411
  • 9,420
  • 15
  • 65
  • 119
107
votes
3 answers

How can I perform a culture-sensitive "starts-with" operation from the middle of a string?

I have a requirement which is relatively obscure, but it feels like it should be possible using the BCL. For context, I'm parsing a date/time string in Noda Time. I maintain a logical cursor for my position within the input string. So while the…
Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
107
votes
13 answers

String.Format like functionality in T-SQL?

I'm looking for a built-in function/extended function in T-SQL for string manipulation similar to the String.Format method in .NET.
unknown (yahoo)