Just to have you on the same page:
Your data is organized in indices, each made of shards and distributed across multiple nodes. If a new document needs to be indexed, a new id is being generated and the destination shard is being calculated based on this id. After that, the write is delegated to the node, which is holding the calculated destination shard. This will distribute your documents pretty well across all of your shards.
Finding documents by id is now easy, as the shard, containing the wanted document, can be calulated just based on the id. There is no need for searching all shards. BTW, that's the reason why you can't change the number of shards afterwards. The changed shard number will result in a different document distribution across your shards.
Now, just to make it clear, each shard is a separate lucene index, made of segment files located on your disk. When writing, new segments will be created. If a particular number of segment files will be reached, the segments will be merged.
So just introducing more shards without distributing them to other nodes will just introduce a higher I/O and memory consumption for your single node.
While searching, the query will be executed against each shard. Afterwards the results of all shards needs to be merged into one result - more shards, more cpu work to do...
Coming back to your question:
For your write heavy indexing case, with just one node, the optimal number of indices and shards is 1!
But for the search case (not accessing by id), the optimal number of shards per node is the number of CPUs available. In such a way, searching can be done in multiple threads, resulting in better search performance. Correction: Searching and indexing are multithreaded, a single shard can fully utilize all CPU cores from a node.
But what are the benefits of sharding?
Availability: By replicating the shards to other nodes you can still serve if some of your nodes can“t be reached anymore!
Performance: Distibuting the primary shards to different nodes, will distribute the workload too.
So if your scenario is write heavy, keep the number of shards per index low. If you need better search performance, increase the number of shards, but keep the "physics" in mind. If you need reliability, take the number of nodes/replicas into account.
Further readings:
https://www.elastic.co/guide/en/elasticsearch/reference/current/_basic_concepts.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-indexing-speed.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-search-speed.html
https://www.elastic.co/de/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster
https://thoughts.t37.net/designing-the-perfect-elasticsearch-cluster-the-almost-definitive-guide-e614eabc1a87