Outliers between 1,5 - 3 times the interquantile range is marked with an "+" and above 3 times the IQR with an "o". But due to this data set with multiple outliers the below boxplot is very hard to read since the "+" and "o" symbols are plotted on top of each other creating what appears to be a thick red line.
I need to plot all data so removing them is not an option but I would be fine to display "longer" boxes, i.e. stretch the q1 and q4 to reach the true min/max values and skip the "+" and "o" outlier symbols. I would also be fine if just the min and max outliers was displayed.
I'm totally in the dark here and the octave boxplot documentation found here did not include any helpful examples on how to handle outliers. A search here at stackoverflow didn't get me closer to a solution either. So any help or directions is very appreciated!
How can I modify the below code to create a boxplot based on the same data set that is readable (i.e. doesn't plot outliers on top of each other creating a thick red line)?
I'm using Octave 4.2.1 64-bits on a Windows 10 machine with qt as the graphics_toolkit and with GDAL_TRANSLATE called from within Octave to handle the tif-files.
It's not an option to switch graphics_toolkit to gnuplot etc. since I haven't been able to "rotate" the plot (horizontal boxes instead of vertical). And it's in the .pdf file the results must have an effect, not only in octaves viewer.
Please forgive my totally "newbie-style" coding-work-around to get a proper high resolution pdf-exported:
pkg load statistics
clear all;
fns = glob ("*.tif");
for k=1:numel (fns)
ofn = tmpnam;
cmd = sprintf ('gdal_translate -of aaigrid "%s" "%s"', fns{k}, ofn);
[s, out] = system (cmd);
if (s != 0)
error ('calling gdal_translate failed with "%s"', out);
endif
fid = fopen (ofn, "r");
# read 6 headerlines
hdr = [];
for i=1:6
s = strsplit (fgetl (fid), " ");
hdr.(s{1}) = str2double (s{2});
endfor
d = dlmread (fid);
# check size against header
assert (size (d), [hdr.nrows hdr.ncols])
# set nodata to NA
d (d == hdr.NODATA_value) = NA;
raw{k} = d;
# create copy with existing values
raw_v{k} = d(! isna (d));
fclose (fid);
endfor
## generate plot
boxplot (raw_v)
set (gca, "xtick", 1:numel(fns),
"xticklabel", strrep (fns, ".tif", ""));
ylabel ("Plats kvar (meter)");
set (gca, "ytick", 0:50:600);
set (gca, "ygrid", "on");
set (gca, "gridlinestyle", "--");
set (gcf, "paperunit", "centimeters", "papersize", [35, 60], "paperposition", [0 0 60 30], "paperorientation", "landscape")
zoom (0.95)
view ([90 90])
print ("loudden_box_dotted.pdf", "-F:14")