14

Came up with this a while ago while doing some data structure work, though it'd make a good code golf: Given a two dimensional array of characters containing ascii art rectangles, produce a list of coordinates and sizes for the rectangles.

  • Any trivially convertable input or output format is fine (eg: char**, list of strings, lines on standard input; list of four ints, struct, fixed amount +/- for the size; etc).
  • Similarly, output need not be in any particular order.
  • You dont have to anything useful for invalid input or malformed rectangles, but you shouldnt to produce valid-looking coordinates for a rectangle that isnt in the input.
  • No two valid rectangles share a + (though + may appear not only as part of rectangle)
  • You can assume that all rectangles are at least 3x3: each side has a - or | in it.

Examples:

"        "
"  +-+ | "
"  | | \-"
"  +-+   "
(2,1;3,3)

"+--+  +--+"
"|  |  |  |"
"+--+  +--+"
(0,0;4,3), (6,0;4,3)

"  +---+  "
"->|...|  "
"  +---+  "
(2,0;5,3)

"+-+ +--+  +--+"
"| | |  |  |  |"
"+-+ |  |  + -+"
"    |  |      "
"    +--+  +-+ "
"  +--+    |   "
"  +--+    +-+ "
(0,0;3,3), (4,0;4,5) # (2,5;4,2) is fine, but not needed
David X
  • 3,998
  • 3
  • 29
  • 32
  • 1
    Underspecifed. Can a '+' be part of multiple rectangles? Does the output order matter? – Brian Sep 10 '10 at 02:01
  • It would be better to say read from stdin and write to stdout (typical for code golf). – Brian Sep 10 '10 at 02:02
  • @Brian, 1a: no, fixed; 1b: no, trival changes to output format are fine; 2: I wanted to concentrate on the algorithm, rather than I/O. – David X Sep 10 '10 at 02:05
  • @Mark, adjoining rectangles would have have a (actually 2) `+` that is part of both of them, so no. Im going to say it up to the implementation whether you want to handle nested ones, but it shouldnt be too hard to allow them. – David X Sep 10 '10 at 02:28
  • 1
    Can big rectangles contain small rectangles? – Brian Sep 13 '10 at 18:41
  • @Brian, if they do, you dont have to recognize them, as its implementation defined whether you handle nested rectangles. (I doubt there's any algothithms that would benifit from this, though.) – David X Sep 14 '10 at 00:00
  • Can we use analog literals (http://www.xs4all.nl/~weegen/eelis/analogliterals.xhtml)? – dan04 Sep 15 '10 at 03:35
  • @user287586, sure, but they count toward your char count, so I doubt it's worth the trouble. – David X Sep 15 '10 at 06:50
  • 2
    Ascii art? You want us to program the computer to be an art critic?! – Loren Pechtel Sep 16 '10 at 21:12
  • @Loren, but of course! Everyone knows any computer left to it's own devices will spontaniously develop sapience, so lets make sure it annoys the liberal arts majors instead of us! – David X Sep 17 '10 at 00:03

10 Answers10

7

Perl, 167 165 159 chars

(156 chars if you don't count slurping stdin to @a, just remove the last 3 chars and assign a list of strings representing your input to @a)

Gets input from stdin. Newlines not significant, added for readability. Notice the use of the +++ operator ;P

map{$l=$i++;while($c=/\+-+\+/g){$w=$+[0]-2-($x=$-[0]);
$c++while$a[$l+$c]=~/^.{$x}\|.{$w}\|/;
print"($x,$l;",$w+2,",$c)\n"if$a[$c+++$l]=~/^.{$x}\+-{$w}\+/}}@a=<>


Be liberal in what you accept version, 170 chars

map{$l=$i++;while($c=/\+-*\+/g){pos=-1+pos;$w=$+[0]-2-($x=$-[0]);
$c++while$a[$l+$c]=~/^.{$x}\|.{$w}\|/;
print"($x,$l;",$w+2,",$c)\n"if$a[$c+++$l]=~/^.{$x}\+-{$w}\+/}}@a=<>


Be conservative in what you accept version, 177 chars

map{$l=$i++;while($c=/\+-+\+/g){$w=$+[0]-2-($x=$-[0]);
$c++while$a[$l+$c]=~/^.{$x}\|.{$w}\|/;print
"($x,$l;",$w+2,",$c)\n"if$c>1&&$a[$c+++$l]=~s/^(.{$x})\+(-{$w})\+/$1v$2v/}}@a=<>


Commented version:

@a=<>;          # slurp stdin into an array of lines
$l=0;           # start counting lines from zero
map{            # for each line
    while(/\+-+\+/g){               # match all box tops
            $c=1;                           # initialize height

            # x coordinate, width of box - sides
            $w=$+[0]-2-($x=$-[0]);

            # increment height while there are inner parts
            # of a box with x and w coinciding with last top
            # (look into next lines of array)
            $c++  while $a[$l+$c]=~/^.{$x}\|.{$w}\|/;

            # if there is a box bottom on line + height
            # with coinciding x and w, print coords
            # (after incrementing height)
            print "($x,$l;",$w+2,",$c)\n"  
                    if $a[$c+++$l]=~/^.{$x}\+-{$w}\+/
    }
    $l++    # line++
}@a


Mega test case:

+--+  +-+ +-+  +++   +---+   +-+  +-+-+  +-++-+
|SO|  | | | |  +++   |+-+|   | |  | | |  | || |
+--+  +-+-+-+  +++   ||+||   +-+  +-+-+  +-++-+
        | |          |+-+|   | |
      +-+-+-+        +---+   +-+
      | | | |
      +-+ +-+


++ +-+ ++     +-+   +- + +--+ +--+ +--+
|| +-+ ++   +-+-+   |  | |  | |    |  |
++          | |     |  | |  | |  |    |
            +-+     +--+ + -+ +--+ +--+
ninjalj
  • 42,493
  • 9
  • 106
  • 148
  • I don't have Perl interpreter to test. So, please, explain, where is the `$i` initialization? Does Perl allow to mind, that all variables are `0` at the start? – Nakilon Sep 19 '10 at 20:27
  • some_loop_construct{$l=$i++;...} is my golfed version of $l=0;some_loop_construct{...;$l++}. In Perl, variables are undef until they are assigned to, which would make coordinates like (1,0;0,2) print as (1,;,2) (undef is treated as 0 when used as a number, and as an empty string when used as a string). – ninjalj Sep 20 '10 at 19:19
6

Ruby — 306 260 245 228 168

# 228 chars
g=->(s,u='-'){o=[];s.scan(/\+#{u}+\+/){o<<[$`,$`+$&].map(&:size)};o}
b=t.map{|i|i.split''}.transpose.map{|s|g[s*'','\|']}
(1...t.size).map{|i|i.times{|j|(g[t[i]]&g[t[j]]).map{|x,y|p [x,j,y-x,i-j+1]if(b[x]&b[y-1]&[[j,i+1]])[0]}}}

produces

[0, 0, 3, 3]
[4, 1, 4, 3]
[10, 3, 3, 3]

for t=

["+-+       +--+",
"| | +--+  |  |",
"+-+ |  |  + -+",
"    +--+  +-+ ",
"  +--+    | | ",
"  +--+    +-+ "]

Explanation:

# function returns info about all inclusions of "+---+" in string
# "  +--+ +-+" -> [[2,5],[7,9]]
g=->(s,u='-'){o=[];s.scan(/\+#{u}+\+/){o<<[$`,$`+$&].map(&:size)};o}

# mapping transposed input with this function
b=t.map{|i|i.split''}.transpose.map{|s|g[s*'','\|']}
# earlier here was also mapping original input, but later was merged with "analyse"

# "analyse"
# take each pair of lines
(1...t.size).map{|i|i.times{|j|
    # find horizontal sides of the same length on the same positions
    (g[t[i]]&g[t[j]]).map{|x,y|
        # make output if there are correct vertical sides
        p [x,j,y-x,i-j+1]if(b[x]&b[y-1]&[[j,i+1]])[0]
    }
}}

# yeah, some strange +/-1 magick included ,.)

And more straight-forward 168-chars solution!

t.size.times{|i|t[0].size.times{|j|i.times{|k|j.times{|l|p [l,k,j-l+1,i-k+1]if
t[k..i].map{|m|m[j]+m[l]}*''=~/^\+\+\|+\+\+$/&&t[i][l..j]+t[k][l..j]=~/^(\+-+\+){2}$/}}}}
Nakilon
  • 34,866
  • 14
  • 107
  • 142
  • Can you explain what's happening in this code ? Also, what is the `->` syntax for ? Is it a way to declare an anonymous function ? – barjak Sep 13 '10 at 11:45
  • Can't you define a function (e.g. `map`) into a variable, in Ruby? – chelmertz Sep 13 '10 at 19:08
  • @barjak: 1) `f` (or `g` in current version) produces a list of `+---+` substrings in a string; program generates this maps for original input and transposed, and then analyse them 2) yes, it's anonymous function. `f=->{}` is just shorter then `def f end` – Nakilon Sep 19 '10 at 19:12
6

Perl - 223 222 216

Golfed version (newlines not significant):

$y=0;sub k{$s=$-[0];"($s,%i;".($+[0]-$s).",%i)"}while(<>){while(/\+-+\+/g){
if(exists$h{&k}){push@o,sprintf k,@{$h{&k}};delete$h{&k}}else{$h{&k}=[$y,2]}}
while(/\|.+?\|/g){++${$h{&k}}[1]if exists$h{&k}}++$y}print"@o\n"

Older de-golfed version:

# y starts at line zero.
$y = 0;

# Abuse Perl's dynamic scoping rules
# to get a key for the hash of current rectangles,
# which indexes rectangles by x and width,
# and is also used as a format string.
sub k {

    # The start of the current match.
    $s = $-[0];

    # $+[0] is the end of the current match,
    # so subtract the beginning to get the width.
    "($s,%i;" . ($+[0] - $s) . ",%i)"

}

# Read lines from STDIN.
while (<>) {

    # Get all rectangle tops and bottoms in this line.
    while (/\+-+\+/g) {

        # If line is a bottom:
        if (exists $h{&k}) {

            # Add to output list and remove from current.
            push @o, sprintf k, @{$h{&k}};
            delete $h{&k}

        # If line is a top:
        } else {

            # Add rectangle to current.
            $h{&k} = [$y, 2]

        }

    }

    # Get all rectangle sides in this line.
    while (/\|.+?\|/g) {

        # Increment the height of the corresponding
        # rectangle, if one exists.
        ++${$h{&k}}[1] if exists $h{&k}

    }

    # Keep track of the current line.
    ++$y

}

# Print output.
print join", ",@o

Note that this does not handle junk vertical bars to the left of the rectangles, that is:

   +--+  +--+
|  |  |  |  |
   +--+  +--+

Will incorrectly yield a height of 2 for both. This is because the /\|.+?\|/g pattern starts searching from the beginning of the line. Anyone have a suggestion for how to fix this?

Jon Purdy
  • 53,300
  • 8
  • 96
  • 166
  • Adding some `.` s at the start of the `/\|.+?\|/g` pattern should fix the junk `|` handling, but I don't know perl well enough to figure out how to do that. – David X Sep 14 '10 at 00:08
  • Saved a few characters taking advantage of lax output restrictions. – Jon Purdy Sep 14 '10 at 19:42
5

JavaScript — 156 characters*

Also at http://jsfiddle.net/eR5ee/4/ (only click the link if using Firefox or Chrome) or http://jsfiddle.net/eR5ee/5/ (adapted to Safari and Opera):

var A = [
    "+-+ +--+  +--+",
    "| | |  |  |  |",
    "+-+ |  |  + -+",
    "    |  |      ",
    "    +--+  +-+ ",
    "  +--+    |   ",
    "  +--+    +-+ "
    ]; // not counted

for(y=A.length;--y;)for(;m=/\+-*\+/g(A[y]);){
for(w=m[0].length,z=y;A[--z][x=m.index]+A[z][x+w-1]=="||";);
/^\+-*\+$/(A[z].substr(x,w))&&alert([x,z,w,y-z+1])}
  • Excluding newlines and whitespace characters, which are completely unnecessary.
  • Apparently, Firefox and Chrome retain the lastIndex of the first regex. It takes four more characters to keep Safari and Opera out of an infinite loop. To get Internet Explorer working, fourteen more characters would be needed to fix both the above and the error "Function expected." Apparently, "... a regular expression's exec method can be called... indirectly (with regexp(str))" (quoted from Mozilla documentation) does not apply to IE.

The code detects all rectangles 2x2 and larger if no rectangle's bottom touches any other rectangle's top or bottom or a plus or minus sign and none overlap.

The order of the numbers in each alert box (which corresponds to a rectangle) is left, top, width, height. The code does error out if a rectangle extends off the top, but all the coordinates needed have already been output (from specification: "You dont have to (sic) anything useful for invalid input or malformed rectangles...")

Since most major web browsers implement the canvas tag, in several more lines of code, drawing the detected rectangles on screen is possible. http://jsfiddle.net/MquqM/6/ works on all browsers except Internet Explorer and Opera.

Edit: eliminated an unnecessary variable assignment Edit 2: avoid throwing errors with completely valid input (--y instead of y--), clarify the specific cases the code handles

PleaseStand
  • 31,641
  • 6
  • 68
  • 95
4

C (204 186 Characters)

    #include<stdio.h>
    char H=7,W=14,*S =
    "+-+ +--+  +--+"
    "| | |  |  |  |"
    "+-+ |  |  + -+"
    "    |  |      "
    "    +--+  +-+ "
    "  +--+    |   "
    "  +--+    +-+ ";
    void main(){
#define F(a,r,t)if(*c==43){while(*(c+=a)==r);t}
char*c,*o,*e=S;while(*(c=e++))
F(1,45,F(W,'|',o=c;F(-1,45,F(-W,'|',c==e-1?
printf("%i,%i %i,%i\n",(c-S)%W,(c-S)/W,(o-c)%W+1,(o-c)/W+1):0;))))
    }

The character count is the body of main(). This code will walk the string with e until it reaches the top-left corner of a potential rectangle. It will then check the edges with c and using o to keep track of the bottom-right corner.

Output of the program is:

0,0 3,3
4,0 4,5
2,5 4,2
deadc0de
  • 46
  • 2
3

Scala 2.8 - 283 273 269 257

val a = Seq(
    "+-+ +--+  +--+",
    "| | |  |  |  |",
    "+-+ |  |  + -+",
    "    |  |      ",
    "    +--+  +-+ ",
    "  +--+    |   ",
    "  +--+    +-+ "
  )

// begin golf count
val (w,h) = (a(0).size-1,a.size-1)
for (
  x <- 0 to w;
  y <- 0 to h;
  u <- x+2 to w;
  v <- y+2 to h;
  if Set(a(y)(x),a(v)(x),a(y)(u),a(v)(u)) == Set(43)
  && (x+1 to u-1).forall(t => (a(y)(t)<<8|a(v)(t)) == 11565)
  && (y+1 to v-1).forall(t => (a(t)(x)<<8|a(t)(u)) == 31868)
) yield (x,y,u-x+1,v-y+1)
// end golf count

evaluates to :

Vector((0,0,3,3), (4,0,4,5))

The for expression evaluates to the answer (the Vector object), that is why I counted only this part (whitespaces removed). Let me know if this is the correct way to count.

How it works

The coordinates of all possible rectangles (actually, only >= 3x3) are generated by the for expression. These coordinates are filtered by looking for the +, - and | at the edges and corners of all rectangles (the if part of the for expression).

barjak
  • 10,842
  • 3
  • 33
  • 47
  • I think there is a way to simplify the pattern `v1 == 45 && v2 == 45`, but I can't see one. Any idea ? – barjak Sep 12 '10 at 10:22
  • Would `v1 == v2 == 45` make any sense? I barely can read Scala so I don't know how one would cram that in there. – Esko Sep 19 '10 at 19:15
  • @Esko there is no compound conditions in Scala, this won't work. I was thinking about an arithmetic trick, or an operation on bits, but cannot see one. – barjak Sep 20 '10 at 06:49
  • 1
    `v1*256 + v2 == ...` or even `v1<<8 | v2 == ...` – Nakilon Sep 20 '10 at 16:29
3

Python 2.6 - 287 263 254

a = [
    "+-+ +--+  +--+",
    "| | |  |  |  |",
    "+-+ |  |  + -+",
    "    |  |      ",
    "    +--+  +-+ ",
    "  +--+    |   ",
    "  +--+    +-+ "
    ]

l=len
r=range
w,h=l(a[0]),l(a)
[(x,y,u,v)for x in r(0,w)for y in r(0,h)for u in r(x+2,w)for v in r(y+2,h)if a[y][x]==a[v][x]==a[y][u]==a[v][u]=='+' and a[y][x+1:u]+a[v][x+1:u]=="-"*2*(u-x-1)and l([c for c in r(y+1,v-y)if a[c][x]==a[c][u]=='|'])==v-y-1]

evaluated to:

[(0, 0, 3, 3), (4, 0, 4, 5)]
Niavok
  • 1
  • 2
3

Python 2.6 (251 characters)

I come in a bit late, anyway, some fun. Python, using regular expressions. To save a print statement and stay shorter than Fredb219's this won't print anything if you run it as a script, but typing one line at a time in the interpreter it's gonna show the result. Not really solid, it won't handle nested boxes nor most cases more complex than the ones given by DavidX. Haven't done testing, but I think it's likely to show results in a wrong order if something "strange" occurs.

import re
l,a,o=s.index("\n")+1,re.finditer,sorted
o(o(set((m.start(),m.end())for m in a(r'\+-* *-*\+',s)for n in a(r'\|.+\|',s)if(m.start()%l,m.end()%l)==(n.start()%l,n.end()%l)if m.start()+l==n.start()or m.start()-l==n.start())),key=lambda x:x[0]%l)

Input is a single string, lines (all same length) separated by a newline character. Results are the string slices of the top and the bottom for every "good" box, starting top left. It also allows "broken" boxes (i.e. with some space in the middle of one side, not without one entire side). This was just a way to fix an unwanted behavior creating many brand new side effects! :-)

input:

>>>s = """+-+ +--+  +--+
| | |  |  |  |
+-+ |  |  + -+
    |  |      
    +--+  +-+ 
  +--+    |   
  +--+    +-+ """

or:

>>>s = "+-+ +--+  +--+\n| | |  |  |  |\n+-+ |  |  + -+\n    |  |      \n    +--+  +-+ \n  +--+    | \n  +--+    +-+ "

then:

>>>import re
>>>l,a,o=s.index("\n")+1,re.finditer,sorted
>>>o(o(set((m.start(),m.end())for m in a(r'\+-* *-*\+',s)for n in a(r'\|.+?\|',s)if(m.start()%l,m.end()%l)==(n.start()%l,n.end()%l)if m.start()+l==n.start()or m.start()-l==n.start())),key=lambda x:x[0]%l)

output:

[(0, 3), (30, 33), (4, 8), (64, 68), (10, 14), (40, 44)]

so (0,3) top of 1st box (30,33) bottom of same box, (4,8) top of 2nd box and so on.

MattiaG
  • 241
  • 1
  • 2
  • 9
  • Is there something weird about 2.6 (or your install of python)? I'm getting an empty list on 2.5 and 2.7. – David X Jan 21 '11 at 08:13
  • @DavidX: Nothing weird, you're right. I thought It was because I slipped a newline in at the start of the input string, still when I fixed that something went wrong anyway. Now I (re)copied it all from my source and it seems to work just fine. – MattiaG Jan 21 '11 at 12:38
2

F#, 297 chars

Kinda lame, but simple.

let F a=
 for x=0 to Array2D.length1 a-1 do
  for y=0 to Array2D.length2 a-1 do
  if a.[x,y]='+' then
   let mutable i,j=x+1,y+1
   while a.[i,y]<>'+' do i<-i+1
   while a.[x,j]<>'+' do j<-j+1
   printfn"(%d,%d;%d,%d)"x y (i-x+1)(j-y+1)
   a.[i,y]<-' '
   a.[x,j]<-' '
   a.[i,j]<-' '

Look for a plus. Find the one to the right of it. Find the one below it. Print out this rectangle's info, and 'null out' the pluses we already used. Since each plus is only a part of one valid rectangle, that's all we need to do.

Brian
  • 117,631
  • 17
  • 236
  • 300
  • 1
    This would produce output for invalid rectangles where, say, the - or | characters aren't present, or there is no left or bottom side. – Andrew Cooper Sep 10 '10 at 02:40
  • 2
    The spec disallows this. It says "'+' only appears as part of a single valid rectange". This seems to imply there are no '+' characters attached to invalid/incomplete rectangles. – Brian Sep 10 '10 at 02:56
  • The way i read it, that sentence means that there is no 2nd VALID rectangtle that includes the same `+`. It does not say there are no other appearances (as part of invalid rectangle, as part of valid trianlge, standalone etc) – Nas Banov Sep 10 '10 at 03:43
  • @Nas Banov, that is what I had meant, but it's apparantly harder to get right than I expected. – David X Sep 10 '10 at 04:08
2

XQuery (304 characters)

Here is my solution:

declare variable $i external;let$w:=string-length($i[1]),$h:=count($i)for$y in 1 to$h,$x in 1 to$w,$w in 2 to$w+1 -$x,$h in 1 to$h where min(for$r in (0,$h),$s in 1 to$h return (matches(substring($i[$y+$r],$x,$w),'^\+-*\+$'),matches(substring($i[$y+$s],$x,$w),'^|.*|$')))return ($x -1,$y -1,$w,$h+1,'')

You can run this (with XQSharp) by setting the variable $i to be the lines of the input

>XQuery boxes.xq "i=('  +-+','+-+-+','| |  ','+-+  ')" !method=text

2 0 3 2  0 1 3 3

I suppose one could argue that declare variable $i external; is just setting up the input and so doesn't add to the count, in which case 275 characters

Oliver Hallam
  • 4,242
  • 1
  • 24
  • 30
  • That's a really nice test case. Hope you don't mind my stealing it for the test case in my answer. – ninjalj Sep 20 '10 at 19:29