Questions tagged [parsing]

Parsing refers to breaking an artifact into its constituent elements and capturing the relationship between those elements. This tag isn't for questions about the self-hosted Parse Platform (use the [parse-platform] tag) or parse errors in a particular programming language (use the appropriate language tag instead).

Parsing is the process by which software breaks an artifact into its constituent elements and captures the relationships between those elements.

When the artifact is a stream of arbitrary text, parsing is often used to mean breaking the stream into constituent atoms (called words, tokens or lexemes).
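
As a rough illustration of this first sense of the word, the sketch below splits a string into tokens with a regular expression. The token pattern and the `tokenize` name are assumptions made for this example, not a standard definition of what a token is.

```python
import re

# Hypothetical token pattern: a run of word characters,
# or any single character that is neither a word character nor whitespace.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")

def tokenize(text):
    """Split arbitrary text into a flat list of tokens, dropping whitespace."""
    return TOKEN_RE.findall(text)

tokens = tokenize("x1 = 42 + y;")
print(tokens)
```

Real lexers usually also record each token's type and position; this sketch keeps only the text of each token.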

When the artifact is a stream of natural language text, parsing is used to mean breaking the stream into natural language elements (words and punctuation) and discovering the structure of the text as phrases or sentences.

When the artifact is a stream of text in a computer language (or other formal language), parsing consists of applying one of a variety of parsing algorithms (ad hoc, recursive descent, LL, LR, Packrat, Earley or other) to the source text, which is often first broken into lexemes by a separate lower-level parser called a "lexer". The parser verifies that the source text is valid in the language, and often constructs a parse tree representing the grammar productions used to tile the text.
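
To make the lexer-plus-parser arrangement concrete, here is a minimal recursive descent sketch for a small arithmetic grammar. The grammar, the `lex` function, and the `Parser` class are all assumptions chosen for illustration. The result is a simplified tree of nested tuples (closer to an abstract syntax tree than a full parse tree), and invalid input raises `SyntaxError`.

```python
import re

# Assumed grammar for this sketch:
#   expr   := term (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := NUMBER | '(' expr ')'

TOKEN_RE = re.compile(r"\s*(\d+|[-+*/()])")

def lex(source):
    """Lexer: break the source text into number and operator lexemes."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    """Recursive descent parser: one method per grammar rule."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse(self):
        tree = self.expr()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token.isdigit():
            return int(token)
        if token == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {token!r}")

tree = Parser(lex("1 + 2 * (3 - 4)")).parse()
print(tree)
```

Each grammar rule maps to one method, which is the defining trait of recursive descent; LL, LR, and Earley parsers reach the same kind of result through table-driven or chart-based algorithms instead.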

The term can be applied more generally to analyzing any complex structure such as a binary data file or a graph.
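
For the binary case, a hedged sketch using Python's `struct` module is shown below. The record layout (a 4-byte magic value `DEMO`, a 16-bit version, a 32-bit payload length, then the payload) is entirely made up for this example and does not correspond to any real file format.

```python
import struct

# Hypothetical header layout, little-endian with no padding:
# 4-byte magic, unsigned 16-bit version, unsigned 32-bit payload length.
HEADER = struct.Struct("<4sHI")

def parse_record(data):
    """Parse one record: validate the header, then slice out the payload."""
    if len(data) < HEADER.size:
        raise ValueError("truncated header")
    magic, version, length = HEADER.unpack_from(data, 0)
    if magic != b"DEMO":
        raise ValueError("bad magic value")
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return {"version": version, "payload": payload}

blob = HEADER.pack(b"DEMO", 2, 5) + b"hello"
record = parse_record(blob)
print(record)
```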

57,220 questions

540 votes, 13 answers

Reading a file line by line in Go

I'm unable to find file.ReadLine function in Go. How does one read a file line by line?
g06lin • 5,725 • 3 • 18 • 13

493 votes, 37 answers

Adding a parameter to the URL with JavaScript

In a web application that makes use of AJAX calls, I need to submit a request but add a parameter to the end of the URL, for example: Original URL: http://server/myapp.php?id=10 Resulting URL: http://server/myapp.php?id=10&enabled=true Looking…
Lessan Vaezi • 6,927 • 3 • 25 • 15

485 votes, 35 answers

Remove HTML tags from a String

Is there a good way to remove HTML from a Java string? A simple regex like replaceAll("\\<.*?>", "") will work, but some things like &amp; won't be converted correctly and non-HTML between the two angle brackets will be removed (i.e. the .*? in…
Mason • 8,767 • 10 • 33 • 34

464 votes, 4 answers

What's the best practice using a settings file in Python?

I have a command line script that I run with a lot of arguments. I have now come to a point where I have too many arguments, and I want to have some arguments in dictionary form too. So in order to simplify things I would like to run the script with…
c00kiemonster • 22,241 • 34 • 95 • 133

415 votes, 11 answers

Parse query string in JavaScript

I need to parse the query string www.mysite.com/default.aspx?dest=aboutus.aspx. How do I get the dest variable in JavaScript?
sinaw • 4,195 • 2 • 16 • 5

402 votes, 24 answers

How do I check if a C++ std::string starts with a certain string, and convert a substring to an int?

How do I implement the following (Python pseudocode) in C++? if argv[1].startswith('--foo='): foo_value = int(argv[1][len('--foo='):]) (For example, if argv[1] is --foo=98, then foo_value is 98.) Update: I'm hesitant to look into Boost, since…
Daryl Spitzer • 143,156 • 76 • 154 • 173

397 votes, 8 answers

Best XML parser for Java

I need to read smallish (few MB at the most, UTF-8 encoded) XML files, rummage around looking at various elements and attributes, perhaps modify a few and write the XML back out again to disk (preferably with nice, indented formatting). What would…
Evan • 18,183 • 8 • 41 • 48

384 votes, 3 answers

Splitting on last delimiter in Python string?

What's the recommended Python idiom for splitting a string on the last occurrence of the delimiter in the string? example: # instead of regular split >> s = "a,b,c,d" >> s.split(",") >> ['a', 'b', 'c', 'd'] # ..split only on last occurrence of ','…
user248237

384 votes, 13 answers

Read and parse a Json File in C#

How does one read a very large JSON file into an array in c# to be split up for later processing? I have managed to get something working that will: Read the file Miss out headers and only read values into array. Place a certain amount of values…
Chris Devine • 3,859 • 2 • 13 • 7

372 votes, 5 answers

How can I parse (read) and use JSON in Python?

My Python program receives JSON data, and I need to get bits of information out of it. How can I parse the data and use the result? I think I need to use json.loads for this task, but I can't understand how to do it. For example, suppose that I have…
ingh.am • 25,981 • 43 • 130 • 177

360 votes, 6 answers

lexers vs parsers

Are lexers and parsers really that different in theory? It seems fashionable to hate regular expressions: coding horror, another blog post. However, popular lexing based tools: pygments, geshi, or prettify, all use regular expressions. They seem…
Naveen • 5,910 • 5 • 30 • 38

358 votes, 33 answers

Parse a URI String into Name-Value Collection

I've got the URI like this: https://google.com.ua/oauth/authorize?client_id=SS&response_type=code&scope=N_FULL&access_type=offline&redirect_uri=http://localhost/Callback I need a collection with parsed elements: NAME …
Sergey Shafiev • 4,205 • 4 • 26 • 37

339 votes, 17 answers

Get URL parameters from a string in .NET

I've got a string in .NET which is actually a URL. I want an easy way to get the value from a particular parameter. Normally, I'd just use Request.Params["theThingIWant"], but this string isn't from the request. I can create a new Uri item like…
Beska • 12,445 • 14 • 77 • 112

328 votes, 39 answers

How can I read and parse CSV files in C++?

I need to load and use CSV file data in C++. At this point it can really just be a comma-delimited parser (ie don't worry about escaping new lines and commas). The main need is a line-by-line parser that will return a vector for the next line each…
User1 • 39,458 • 69 • 187 • 265

313 votes, 9 answers

How to convert string into float in JavaScript?

I am trying to parse two values from a datagrid. The fields are numeric, and when they have a comma (ex. 554,20), I can't get the numbers after the comma. I've tried parseInt and parseFloat. How can I do this?
F40