500,000 files should be a pretty good stress test in itself.
Yes, additional rescans (they take place automatically when you update the Subsonic search index) do two things:
- they look for newly added artists and albums and fetch info for them
- they pick the oldest 1/30 of your current data and fetch updates for that (to keep it updated over time)
Since you have so much data, you might want to check the log to see how long the import takes, both this first time (that's an interesting figure) and on the next scan. The MusicCabinet update is asynchronous, meaning that on subsequent scans you can still use your existing index for generating playlists while it fetches new last.fm data. The Subsonic Lucene index update is not asynchronous (I think), so it may stop you from searching for a little while every night.
I could make the 1/30 figure configurable, so that each update would be smaller for you (and the full refresh cycle for your library correspondingly longer).
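To illustrate the trade-off, here is a rough sketch of the "oldest 1/30" selection with the fraction as a parameter. This is not the actual MusicCabinet code: the function name, the `fraction_denominator` parameter, and the sample library are all made up for illustration, and the rounding-up is my assumption.

```python
import math
from datetime import datetime, timedelta


def pick_stale_batch(last_updated, fraction_denominator=30):
    """Return the ids of the oldest 1/fraction_denominator of the entries.

    last_updated maps an id (hypothetical, e.g. an artist name) to the
    time its data was last fetched.
    """
    if fraction_denominator < 1:
        raise ValueError("fraction_denominator must be >= 1")
    # Assumption: round up, so a small library still refreshes
    # at least one entry per scan.
    batch_size = math.ceil(len(last_updated) / fraction_denominator)
    # Sort ids by their last-updated time, oldest first.
    oldest_first = sorted(last_updated, key=last_updated.get)
    return oldest_first[:batch_size]


# Hypothetical library: 90 artists, each last updated on a different day.
start = datetime(2012, 1, 1)
library = {f"artist-{i}": start + timedelta(days=i) for i in range(90)}

batch = pick_stale_batch(library)        # 1/30 of 90 -> 3 artists per scan
smaller = pick_stale_batch(library, 90)  # 1/90 of 90 -> 1 artist per scan
```

With a denominator of N, each scan touches about 1/N of the library, so a full refresh takes roughly N scans: a larger N means lighter nightly updates but staler data on average.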
It's also interesting that there's such a difference in progress for different imports. I wonder if I need to make the parallel execution of them more fair.
Thanks for the update!