0

Imagine you want to check whether the left letter of each word in a range is "a", then join the words for which that condition is true. One way is with a helper column, returning "" if not true, the word if it begins with "a", and then a total row which CONCAT()s over the helper column.

Another way would be to use an array formula. {=CONCAT(IF(LEFT(range) = "a", range, ""))}. That's effectively using a helper column anyway.

But what I want is to use the conditional formatting approach: When applying conditional formatting, you reference the top left cell of your range. So for a range in A1:A10, I want a function like =CONCATIF(A1:A10, LEFT(A1) = "a"). SUMIF and COUNTIF get close to this functionality, only they use string inputs ">", "=..." which cannot reference the cell which they are testing (i.e., there is no equivalent to LEFT(A1) since there is no way of referring to the_cell_I'm currently_working_on in the string)

But the problem is, if I make a UDF which tries to accept a test like LEFT("A1") = "a", it just evaluates for that one cell, not over the whole range like Conditional formatting does.

Why not just use an array formula?

Well mainly I'm intrigued to see whether a vba function can act as an array formula in the way COUNTIF does - by evaluating some condition specified as a parameter. Only with the functionality of conditional formatting in accepting relative formulae, not string, inputs.

But also, for this specific case, I'm trying to CONCAT some long paragraphs into a single string. Each paragraph is maybe 1000 characters, and for whatever reason, {=CONCAT(IF(LEFT(A1:A10) = "a", A1:A10, ""))} won't work since IF( returns a value error for the long paragraphs. The formula works for small paragraphs. But a VBA version can concat the whole of A1:A10, I just need a way to run the test condition first!

Update


To make it explicitly clear, I'm looking for the following:

  • A generic UDF which takes a RANGE as well as a Boolean FORMULA input
    • These can be in the same parameter of the UDF if required
  • The UDF evaluates the Boolean formula on each of the cells in the range
  • The UDF returns an array of TRUE and FALSE values based on the result
  • This array can be passed to standard array handling functions like CONCAT or SUM
    • Or if an array-returning UDF is not possible, the array can be passed internally to another portion of the UDF which then handles it
    • Similar to how COUNTIF must return an array of TRUE/FALSE at some point, but then counts them internally to return a number, not an array, and can therefore be entered not as an array formula.
  • I'm interested in how to generate this array of Booleans

Now research shows there are 3 main ways: Array functions, which for whatever reason aren't able to hold long strings, CONCATIF/SUMIF type functions, which take a text parameter as the boolean test, but which are heavily limited in the types of test. The third approach is the one that Conditional formatting uses; a range is selected and the formula test runs over all of the cells in the selected range, returns TRUE or FALSE (internally), and this array of booleans is used to format the cells in the range. But in a UDF they could just as easily be passed as a result or used internally for further calculations (e.g. a CONCATIF function)

Consequently, I'm looking for one of the following:(this is instead of an array formula approach, and should ideally be entered without ctrl+shift+enter.)

  • A UDF which uses an approach like COUNTIF but better: it accepts a range parameter and a formula as a string, the formula being any Boolean-returning worksheet formula such as 'LEFT(cell)="a" (note, a text string)
  • Better a UDF which uses an approach like conditional formatting; accepts a Range and any boolean-returning formula *not formatted as text"
    • That would be nice as you can use the Excel auto-prompt and range highliting to create the formula, which you couldn't do if it were text
  • For option 1), the boolean-formula can reference the word "cell" or the range over which it's acting. For option 2) it references the top-left cell in the range, like in conditional formatting
    • I'm thinking application.caller might be useful in extracting the formula part when it's not text.
  • The UDF could end up returning a number of options, could be tailored further with worksheet functions or with custom calculations, but as I say, I'm really after the list of booleans

I didn't type that at first because I didn't want to make a code this for me question, but I think I've provided enough detail there (and proof of independant thought!) that it's now a matter of how not what, which I believe is more acceptable on SO.

Greedo
  • 4,967
  • 2
  • 30
  • 78
  • So, what exactly is the question? – z32a7ul Apr 03 '17 at 16:00
  • FYI, SUMIF and COUNTIF can both reference cells for the criteria. – Rory Apr 03 '17 at 16:04
  • @Rory I meant that when dealing with a range `A1:A10`, an array formula rattles through the cells 1 by 1, where the cell you are *currently on* can be referred to at any point by the entire range. In those two functions there is no way to refer to the cell you are *currently on*, no way to `=COUNTIF(A1:A10,"LEFT(...)=1")`, what do you put in the `...`? That's what I was talking about. – Greedo Apr 03 '17 at 16:26
  • That's not just *those* functions though - circular references are not usually a good idea. – Rory Apr 04 '17 at 06:31
  • @Rory How do you mean, I'm not sure at what point circular references are introduced - I'm not intending to use the result of this `COUNTIF` to change the values of the range it is testing. – Greedo Apr 04 '17 at 08:25
  • Then what exactly do you mean by "refer to the cell you are currently on"? – Rory Apr 04 '17 at 08:36
  • @z32a7ul Does the update help? – Greedo Apr 04 '17 at 08:51
  • @Rory If I enter the array `{=LEFT(A1:A3)}` that means `{LEFT(A1),LEFT(A2), LEFT(A3)}`. The variable is the cell reference, it is what changes between successive items in the array. It is that cell reference which I'm calling *"the cell I'm currently on"*. I'm talking about the cell in the array which you are running the test/ function on as you iterate through. For conditional formatting, the cell you are currently on is referred to in formulas by the top left cell of the range you are testing over. If you still don't follow then I explain in detail here: stackoverflow.com/q/39395957/6609896 – Greedo Apr 04 '17 at 09:11
  • @Greedo I don't get your point then. You can use a range of cells, or an array, as the criteria for countif. – Rory Apr 04 '17 at 09:43
  • @Rory, of course I accept that `=COUNTIF(A1:A10,"<"&A1)` is referencing the range as the criteria. But as Scott points out in the answer to that question I linked, you cannot alter that range. I can't `=COUNTIF(A1:A10,"LEN(A1..A10)<5")` because that involves carrying out a function on the range before counting, which `COUNTIF` can't do, array formulas can do (to an extent), a UDF could do. – Greedo Apr 04 '17 at 09:55
  • That depends on what the criteria are. It appears tangential to your issue though, so let's not pursue it. – Rory Apr 04 '17 at 09:57

1 Answers1

3

Use the newer TEXTJOIN¹ and IF in an array formula with CSE.

=TEXTJOIN(CHAR(32), TRUE, IF(LEFT(A1:INDEX(A:A, MATCH("zzz", A:A)))="a", A1:INDEX(A:A, MATCH("zzz", A:A)), ""))

enter image description here


¹ The TEXTJOIN was introduced with Excel 2016 in the following versions:Excel for Android phones, Excel Mobile, Excel 2016 with Office 365, Excel 2016 for Mac, Excel Online, Excel for iPad, Excel for iPhone and Excel for Android tablet. If your version of Excel does not support TEXTJOIN (e.g. you receive a #NAME! error) then search this site for [excel]textjoin for alternatives.

Community
  • 1
  • 1
  • The problem isn't with concatenate, it's with the length of the strings which can't be handled well by the worksheet functions. I recreated your sheet, substituting `A6` for `="a"&REPT("l",255)` (256 characters long). At that point the array formula returns `#VALUE!`. 255 characters and it's fine. Stepping into the function; `MATCH` just ignored it, so the formula pointed to `A1:A5` - I fixed that with `+1` and *then* got the value error. – Greedo Apr 04 '17 at 08:44
  • Testing also shows `IF` can't handle long strings either when used in an array formula, it's fine however when used normally:`=IF(LEFT("a"&REPT("l",255))="a","a"&REPT("l",255),"")`. So I need a UDF as described, or another method (not using a helper column), which gets around these small string issues. – Greedo Apr 04 '17 at 08:50
  • @Greedo, are you sure that you pressed Ctrl+Shift+Enter when entering the Array Formula, #VALUE! can be caused by entering an Array Formula as a normal one (Enter without Ctrl and Shift). – z32a7ul Apr 04 '17 at 09:44
  • @z32a7ul Certain, you can try it yourself if you want; `MATCH()`, `IF()` and probably more can't work *when entered in array formulas* with strings longer than 256 characters. I'm sure there's a document on it somewhere. – Greedo Apr 04 '17 at 09:57
  • @Greedo - Did you click on the supplied link in my footnote? There are several UDFs in that group of answers that do exactly what you are asking; one or two that I have written myself. –  Apr 04 '17 at 12:59
  • Thanks for the link - OK, I've looked at every single one now; upshot - they are (nearly) all concerned with either the `TEXTJOIN()` function and how to make it in VBA, or with using an array formula of `IF()` in combination with TJoin to aggregate results. That's not what I'm looking for. I'm looking for an array version of `IF` that doesn't need to be entered as an array, and that can handle long strings (see update) - so probably a UDF then. I know how to employ it, just not how to make it! – Greedo Apr 04 '17 at 14:51
  • I tried it, too, and it works. Both as array formula and as normal formula, even if I change 255 to 1000. I mean your formula, =IF(LEFT("a"&REPT("l",255))="a","a"&REPT("l",255),"‌​"). – z32a7ul Apr 04 '17 at 15:06