Questions tagged [sgml]

Standard Generalized Markup Language (SGML) is the precursor to XML and HTML. It is an ISO standard (ISO 8879) that was used for electronic publishing of documents and books.


134 questions
0
votes
2 answers

How to strip SGML tags from a text file using Python?

I recently came across the Standard Generalized Markup Language. I have acquired a corpus in SGML format, the EMILLE/CIIL Corpus. This is the documentation for this corpus: EMILLE Corpus Documentation. I want to extract just the text…
ObiWan
  • 196
  • 1
  • 12
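
For a question like this one, a minimal sketch of tag stripping with Python's standard `re` module might look as follows. It assumes the markup is simple `<...>` tags with no `>` inside attribute values, and it does not resolve SGML entities; the `strip_tags` helper and the sample string are hypothetical and are not taken from the EMILLE corpus.

```python
import re

# Matches anything that looks like a start or end tag, e.g. <p> or </s>.
# Assumption: no literal ">" appears inside attribute values.
TAG_RE = re.compile(r"<[^>]+>")


def strip_tags(text: str) -> str:
    """Replace tag-like spans with spaces, then collapse runs of whitespace."""
    without_tags = TAG_RE.sub(" ", text)
    return " ".join(without_tags.split())


# Hypothetical sample, for illustration only.
sample = "<p><s>First sentence.</s> <s>Second sentence.</s></p>"
print(strip_tags(sample))
```

A regular expression is only a rough tool here: SGML features such as omitted end tags, marked sections, or entity references would need a real SGML parser.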
0
votes
1 answer

Scrapy Json Rule SgmlLink Extractor

I want to know how I can write a rule when the website sends me a JSON response instead of HTML. On the start URL, the first response is HTML, but when I navigate through the pages, it gives me a JSON response. Here is my rule: …
Rocky
  • 137
  • 4
  • 12
0
votes
0 answers

Stax does not read characters like "“"

I'm parsing SGML with StAX. The SGML contains characters like "“ ”" and many others that are not replaced when setting UTF-8. The parse breaks and throws the following exception: javax.xml.stream.XMLStreamException: ParseError at…
0
votes
1 answer

removing multiple tags in SGML

I have an SGML file like

sdlksdskdmskdmsamdakmdksam

... my question is how to remove one tag

and keep another one intact ... which regular expression would be suitable ...

shekar
  • 49
  • 1
  • 4
0
votes
1 answer

How do I implement predicates for the XML code to get access to each item, outvar and operator by using group?

This is XML converted into a Prolog file, in which I want to get access to each child using the parent group. Here is the file: :- style_check(-singleton). better('SWI-Prolog', AnyOtherProlog). group('Running Conditions', item('No Cylinders are Cut…
krishn
  • 11
  • 2
0
votes
1 answer

Convert from XML to Microsoft Word Doc

I have a batch of XML and SGML documents (about 7000 of them). I want something that'll convert them into structured Microsoft Word documents. I've been reading online for 2 days on how to do this and am more confused than when I started. I see…
Nigel
  • 15
  • 6
0
votes
1 answer

Comments inside HTML/SGML/XML/DTD declarations

In the W3C HTML 4.01 DTDs and earlier, inline comments are frequently used within declarations. For example, the HTML 2.0 Strict DTD has:
user339676
  • 151
  • 5
0
votes
1 answer

DTD +(tag1,tag2)

I'm new to DTD and I'm not sure if I understand this code correctly. Does this code allow the P tag to contain tag1, tag2 and tag3?
Viin
  • 427
  • 1
  • 4
  • 9
0
votes
2 answers

Parsing Java String with SGML

I have a Java String with SGML, something like this... I know you ducky suck and I rocky
Nitish Upreti
  • 6,312
  • 9
  • 50
  • 92
0
votes
3 answers

Are there valid cases in HTML/XML where tags would not be fully contained?

I think in XML and HTML that having cross-scoped tags is not allowed. Maybe SGML allows it. In XML/HTML though, are there any valid and allowed cases where this can occur? Something like:

This is some example text right…

CodexArcanum
  • 3,944
  • 3
  • 35
  • 40
0
votes
3 answers

quoting HTML attribute values

I know the spec allows both ' and " as delimiters for attribute values, and I also know it's a good practice to always quote. However I consider " being the cleaner way, maybe it's just me having grown up with C and C++' syntax. What is the cleanest…
Flavius
  • 13,566
  • 13
  • 80
  • 126
0
votes
1 answer

get sgml allow regex for "example.com/page/200/"

I'm trying to get the regular expression for "example.com/page/200/". Here's what I've done so far: rules = (Rule (SgmlLinkExtractor( allow=("//page/\d+",), restrict_xpaths=('xxxxx',)), callback="details", follow= True), ) Could anyone of…
Suresh
  • 123
  • 1
  • 3
  • 8
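
As a rough illustration of the pattern problem in this question, the sketch below checks candidate `allow` patterns with Python's `re` module instead of running Scrapy. The helper `is_allowed` and the sample URLs are hypothetical. The one relevant assumption about Scrapy is that `allow` patterns are searched within the absolute URL, which is how its link extractors are documented to behave.

```python
import re

# Pattern from the question: the leading "//" requires two consecutive
# slashes directly before "page", which a normal URL path does not contain.
original_pattern = re.compile(r"//page/\d+")

# Candidate replacement: a single slash, one or more digits, optional
# trailing slash at the end of the URL.
candidate_pattern = re.compile(r"/page/\d+/?$")


def is_allowed(url: str, pattern: re.Pattern = candidate_pattern) -> bool:
    """Return True if the pattern is found anywhere in the URL."""
    return pattern.search(url) is not None


# Hypothetical URLs, for illustration only.
for url in ("http://example.com/page/200/", "http://example.com/page/abc/"):
    print(url, is_allowed(url), is_allowed(url, original_pattern))
```

Using a raw string (`r"..."`) keeps `\d` from being interpreted as a string escape.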
0
votes
0 answers

compile colord [docbook2man does not process "-//OASIS//DTD DocBook V4.1//EN"] xml catalog files

GEN colormgr.1 GEN cd-create-profile.1 GEN cd-fix-profile.1 cd-fix-profile.sgml:1: cd-create-profile.sgml:1: colormgr.sgml:1: parser parser error : parser error : error : StartTag: invalid element name StartTag: invalid element…
Sean McCully
  • 1,122
  • 3
  • 12
  • 21
0
votes
1 answer

CDATA inside PCDATA

I read this text and didn't understand it: PCDATA means parsed character data, so in this case the declared element is allowed to have character data inside of it. Now, you might be wondering if there is a way to define an element that has a CDATA…
user3746280
0
votes
0 answers

How can I add

I am trying to wrap an img tag with a div tag using Sgml reader and XElement. Here is my code: currentImageElement.AddBeforeSelf("< div >"); (without spaces). The problem is that it escapes the < and > chars; the result in the source is: & lt; div & gt; Is…
Slash7GNR
  • 479
  • 4
  • 13