4

For example, TableA:

     ID1    ID2   
     123    abc
     123    def
     123    ghi
     123    jkl
     123    mno
     456    abc
     456    jkl

I want to do a string search for 123 and return all corresponding values.

    pp = Cases[#, x_List /; 
     MemberQ[x, y_String /; 
       StringMatchQ[y, ToString@p, IgnoreCase -> True]], {1}] &@TableA

    {f4@"ID2", f4@pp[[2]]}

Above, p is the input, or 123. This returns only one value for ID2. How do I get all values for ID2?

Brian Tompsett - 汤莱恩
  • 5,753
  • 72
  • 57
  • 129
Rose
  • 129
  • 6

5 Answers5

7

To complement other solutions, I would like to explore the high-performance corner of this problem, that is, the case when the table is large, and one needs to perform many queries. Obviously, some kind of preprocessing can save a lot of execution time in such a case. I would like to show a rather obscure but IMO elegant solution based on a combination of Dispatch and ReplaceList. Here is a small table for an illustration (I use strings for all the entries, to keep it close to the original question):

makeTestTable[nids_, nelems_] :=
  Flatten[Thread[{"ID" <> ToString@#, 
         ToString /@ Range[#, nelems + # - 1]}] & /@ Range[nids], 1]

In[57]:= (smallTable = makeTestTable[3,5])//InputForm
Out[57]//InputForm=
{{"ID1", "1"}, {"ID1", "2"}, {"ID1", "3"}, {"ID1", "4"}, {"ID1", "5"}, 
 {"ID2", "2"}, {"ID2", "3"}, {"ID2", "4"}, {"ID2", "5"}, {"ID2", "6"}, 
 {"ID3", "3"}, {"ID3", "4"}, {"ID3", "5"}, {"ID3", "6"}, {"ID3", "7"}}

The preprocessing step consists of making a Dispatch-ed table of rules from the original table:

smallRules = Dispatch[Rule @@@ smallTable];

The code to get (say, for "ID2") the values is then:

In[59]:= ReplaceList["ID2", smallRules]

Out[59]= {"2", "3", "4", "5", "6"}

This does not look like a big deal, but let us move to larger tables:

In[60]:= Length[table = makeTestTable[1000,1000]]
Out[60]= 1000000

Preprocessing step admittedly takes some time:

In[61]:= (rules = Dispatch[Rule @@@ table]); // Timing

Out[61]= {3.703, Null}

But we only need it once. Now, all subsequent queries (perhaps except the very first) will be near instantaneous:

In[75]:= ReplaceList["ID520",rules]//Short//Timing
Out[75]= {0.,{520,521,522,523,524,525,<<988>>,1514,1515,1516,1517,1518,1519}}

while an approach without the preprocessing takes a sizable fraction of a second for this table size:

In[76]:= Cases[table,{"ID520",_}][[All,2]]//Short//Timing
Out[76]= {0.188,{520,521,522,523,524,525,<<988>>,1514,1515,1516,1517,1518,1519}}

I realize that this may be an overkill for the original question, but tasks like this are rather common, for example when someone wants to explore some large dataset imported from a database, directly in Mathematica.

Leonid Shifrin
  • 22,449
  • 4
  • 68
  • 100
  • I don't see how this is obscure, but perhaps that is only because it is exactly how I would do it. +1 – Mr.Wizard Aug 13 '11 at 22:13
  • @Mr.Wizard For me, the interesting part was that, while `Dispatch` hashes the rules so that only the first one with identical l.h.s. fires (or at least, this is how I was thinking of it), it still works correctly with `ReplaceList`, which tries all the rules down the list, and also with no performance hit. I called it obscure because both `ReplaceList` and `Dispatch` are generally not so commonly used, and here we have their combination. – Leonid Shifrin Aug 13 '11 at 22:20
  • Perhaps I underestimated the complexity. I do not have a Computer Science education, therefore I am not usually thinking about how something (`Dispatch`) works, only what it apparently does. I have not really considered at length how `Dispatch` works, only that it apparently optimizes a list of rules. Were you expecting that, among other things, `Dispatch` would internally eliminate rules two and three in: `Dispatch @ {1 -> x, 1 -> y, 1 -> z}` ? – Mr.Wizard Aug 13 '11 at 23:21
  • @Mr.Wizard Yes, I thought this could be a possible scenario. I don't know exactly how `Dispatch` is implemented, but here is what the docs say: "... Rules such as `a[1]->a1` and `a[2]->a2`, which cannot simultaneously apply, need not both be scanned explicitly. `Dispatch` generates a dispatch table which uses hash codes to specify which sets of rules need actually be scanned for a particular input expression. ... ". Since I never before saw the combination of `ReplaceList` and `Dispatch` at work, I was not sure. B.t.w., I also don't have a CS education, something I am missing a lot presently. – Leonid Shifrin Aug 14 '11 at 08:38
4

It seems that all the answers have missed the function that is almost specifically meant for situations like this, namely Pick. Pickreturns those elements of a list for which the corresponding elements in a second are True. There is even a format (which I'll use) that has a third argument, the pattern to which the elements of the second list should be matched.

list1 = {"ID1", "123", "123", "123", "123", "123", "456", "456"};
list2 = {"ID2", "abc", "def", "ghi", "jkl", "mno", "abc", "jkl"};

Pick[list2, list1, "123"]

==> {"abc", "def", "ghi", "jkl", "mno"}
Sjoerd C. de Vries
  • 16,122
  • 3
  • 42
  • 94
3
lis = {{"ID1", "ID2"},
  {"123", "abc"},
  {"123", "def"},
  {"123", "ghi"},
  {"123", "jkl"},
  {"123", "mno"},
  {"456", "abc"},
  {"456", "jkl"}}

(result = Cases[lis, {x_, y_} /; StringMatchQ[x, "123"] :> {x,y}]) // TableForm

enter image description here

If just want the RHS, then

Cases[lis, {x_, y_} /; StringMatchQ[x, "123"] :> y] // TableForm

enter image description here

Nasser
  • 12,849
  • 6
  • 52
  • 104
  • 5
    You can simplify the expressions to `Cases[lis, {"123", _}]` and `Cases[lis, {"123", v_} :> v]`. – WReach Aug 03 '11 at 23:04
2

This?

Last@Transpose[Cases[tableA, {ToString@p, _}]]

(as I cannot just cut and paste tableA from your question the way it is formatted, I didn't try it).

acl
  • 6,490
  • 1
  • 27
  • 33
1

TableA[[#[[1]], 2]] & /@ Position[TableA, 123]

Cassini
  • 477
  • 4
  • 13