To avoid having to "discover" the graph at the outset, the key property that Tarjan's algorithm would need is that, at any point in its execution, it should only depend on the subgraph it has explored so far, and it should only ever extend this explored region by enumerating the neighbours of some already-visited vertex. (If, for example, it required knowing the total number of nodes or edges in the graph at the outset, then you would be sunk.) From looking at the Wikipedia page it seems that the algorithm does indeed have this property, so no, you don't need to perform a separate discovery phase at the start -- you can discover each vertex "lazily" at the lines for each (v, w) in E do
(enumerate all neighbours of v just as you currently do in your discovery DFS) and for each v in V do
(just pick v to be any vertex you have already discovered as a w in the previous step, but which you haven't yet visited yet with a call to strongconnect(v)
).
That said, since your initial DFS discovery phase only takes linear time anyway, I'd be surprised if eliminating it sped things up much. If your graph is so large that it doesn't fit in cache, it could halve the total time, though.