Reusing steps is critical for maintainability. That doesn't mean shoehorning steps in wherever they happen to fit, but finding a balance between reusability and clarity. As mentioned above, collecting them in a Common or Reusable package is a good idea. This is something you do as you go, because you don't always know in advance whether a step is going to be reused. Expect to refactor step definitions frequently. That is actually a sign that the code is alive, so don't hesitate to make whatever changes are needed to keep the test scenarios clear and the testing code clean. These are just the same well-known coding principles, applied to testing.
One thing that helped me with this task was a utility class (actually a set of classes) that told me which steps and step definitions exist, the class in which each step definition is defined, the feature files and test scenarios that make use of them, and so on. You can even implement more advanced options, such as searching for steps or step definitions that contain given keywords, or finding the step definitions that are no longer used. Think of it as a kind of dictionary.
It can be built either by processing the Java classes that belong to the 'glue' folder and gathering all the regular expressions associated with the Gherkin annotations, or by parsing the feature files with the help of a Gherkin parser. The two approaches are not mutually exclusive, though. On the contrary, they complement each other: the glue classes tell you which step definitions exist, and the feature files tell you which steps are actually used. So you may want to implement both.
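To make the idea more concrete, here is a minimal sketch in Java of how the two sides could be combined. It is only an outline under several assumptions: the class name `StepDictionary` and its methods are hypothetical, the annotations are read with a plain regular expression over the source text (not through reflection or any Cucumber API), the step definitions are assumed to be regex-style, and the feature files are scanned line by line as a stand-in for a real Gherkin parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class StepDictionary {

    // Matches e.g. @Given("^I have (\\d+) items$") in Java source text.
    // Group 2 is the body of the string literal, still Java-escaped.
    private static final Pattern ANNOTATION = Pattern.compile(
            "@(Given|When|Then|And|But)\\(\\s*\"((?:[^\"\\\\]|\\\\.)*)\"\\s*\\)");

    // Matches a step line in a feature file and captures the text after the keyword.
    // Simplification: a real Gherkin parser also handles '*', doc strings, tables, i18n.
    private static final Pattern STEP_LINE = Pattern.compile(
            "^\\s*(?:Given|When|Then|And|But)\\s+(.+?)\\s*$");

    /** Collects the step-definition regexes found in the source text of one glue class. */
    static List<String> stepDefinitions(String javaSource) {
        List<String> result = new ArrayList<>();
        Matcher m = ANNOTATION.matcher(javaSource);
        while (m.find()) {
            // Naive unescaping: drops the backslash of each escape (\\ -> \, \" -> ").
            // Escapes such as \n or \t are not handled correctly by this shortcut.
            result.add(m.group(2).replaceAll("\\\\(.)", "$1"));
        }
        return result;
    }

    /** Collects the step texts (without the keyword) used in one feature file. */
    static List<String> stepsUsed(String featureText) {
        List<String> result = new ArrayList<>();
        for (String line : featureText.split("\\R")) {
            Matcher m = STEP_LINE.matcher(line);
            if (m.matches()) {
                result.add(m.group(1));
            }
        }
        return result;
    }

    /** Returns the definitions whose regex matches none of the given steps. */
    static List<String> unusedDefinitions(List<String> definitions, List<String> steps) {
        List<String> unused = new ArrayList<>();
        for (String definition : definitions) {
            Pattern p = Pattern.compile(definition);
            // find() is an approximation of how the test framework matches steps.
            boolean used = steps.stream().anyMatch(s -> p.matcher(s).find());
            if (!used) {
                unused.add(definition);
            }
        }
        return unused;
    }

    public static void main(String[] args) {
        // Stand-in for the contents of a glue class read from disk.
        String glue = String.join("\n",
                "public class CartSteps {",
                "    @Given(\"^I have (\\\\d+) items in my cart$\")",
                "    public void iHaveItems(int n) {}",
                "    @When(\"^I empty the cart$\")",
                "    public void iEmptyTheCart() {}",
                "}");
        // Stand-in for the contents of a feature file.
        String feature = String.join("\n",
                "Feature: Cart",
                "  Scenario: Count",
                "    Given I have 3 items in my cart");

        List<String> definitions = stepDefinitions(glue);
        List<String> steps = stepsUsed(feature);
        System.out.println("Definitions: " + definitions);
        System.out.println("Steps used:  " + steps);
        System.out.println("Unused:      " + unusedDefinitions(definitions, steps));
    }
}
```

In a real project you would walk the glue and feature directories, record the class and file each entry came from, and probably replace the line-based scan with a proper Gherkin parser. Keyword search over the collected lists then becomes a simple filter.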
This is something you may not need when you have just a few test scenarios. But as their number grows, you will find such a mechanism really valuable.