Speed-wise, you're unlikely to get a performance boost spawning more threads than there are physical threads available unless your threads are spending a lot of time sleeping (in which case it gives your other threads an opportunity to execute). Note that thread sleeps can be implicit and hidden in I/O bound processes and when contending a lock.
It really depends on whether your threads are spending most of their time waiting for something (ex: more data to come from a server, for users to do something, for a file to update, to get access to a locked resource) or just going as fast as they can in parallel. If the latter case, using more threads than physically available will tend to slow you down. The only way having more threads than tasks can ever help throughput is when those threads waste time sleeping, yielding opportunities for other threads to do more while they sleep.
However, it might make things easier for you to just spawn all these tasks and let the operating system deal with the scheduling.
With vastly more threads, you could slow things down potentially (even in terms of throughput). It depends somewhat on how your scheduling and thread pools work and whether those threads spend time sleeping, but a thread is not necessarily a cheap thing to construct, and a context switch with that many threads can become more expensive than your own scheduling process which can have a lot more information about exactly what you want to do and when it's appropriate than the operating system who just sees a boatload of threads that need to be executed.
There's a reason why efficient libraries like Intel's Thread Building Blocks matches the number of threads in the pool to the physical hardware (no more, no less). It tends to be the most efficient route, but it's the most awkward to implement given the need for manual scheduling, work stealing, etc. So sometimes it can be convenient to just spawn a boatload of threads at once, but you typically don't do that as an optimization unless you're I/O bound as pointed out in the other answer and your threads are just spending most of their time sleeping and waiting for input.
If you have needs like this, the easiest way to get the most out of it is to find a good parallel processing library (ex: PPL, TBB, OMP, etc). Then you just write a parallel loop and let the library focus on how to most efficiently deal with the threads and to balance the load between them. With those kinds of cases, you focus on what tasks should do but not necessarily when they execute.