
TaleSpire Dev Log 503 - A story of thrashed fans (my dumbest bug so far)

Baggers · 15 days ago

For nearly two years, I had a ticket in my list from Ree saying that, in the Unity editor, once he entered play-mode, the fans on his machine were spinning hard and didn’t settle down even when he returned to edit-mode.

It was odd, but because I couldn’t replicate it and because it was just affecting us, I hadn’t prioritised it.

Last week, though, I decided to try out Superluminal, a profiler I’ve been very excited to kick the tires of. It works across C# and native code and is itself fantastically performant, allowing you to manipulate multi-gigabyte captures with ease.

I was down visiting Ree at the time, so we took a capture of the Unity editor before entering play-mode and then after returning to edit-mode. We only captured 10 seconds each time, but when comparing the captures, it was incredibly clear that, in the second capture, there was a very busy thread.

Turns out it was my icon-loading manager. That seemed odd, as it was meant to go to sleep when not in use. Here is where we get into the details so we can all enjoy my idiocy. When the game calls the manager asking it to load an icon, the manager starts a thread to handle the request. The thread sits in a loop, fetching tasks[0] from a queue[1], and then loading the specific icons. When it has nothing to do, it goes to sleep until it’s needed again. If it sleeps for 20 seconds with nothing to do, it times out, and the thread shuts down.

To handle the wait, the thread and manager communicate via a ManualResetEventSlim. It has what we need: the ability to wait with a timeout, signaling, and spinning for 10 or so iterations before using a system lock. All good stuff.

Now, take a guess who used the ManualResetEvent and forgot to manually reset it after the sleep. That’s right. This idiot.

So instead of the lovely intended behaviour, the thread just spun in place, asking the OS to pause the thread, then immediately waking up as the reset-event was still triggered, only to have nothing to do and repeat the process.

If I had been asked to write a short piece of code to make the scheduler unhappy, I could hardly have done better :|

So that’s deeply embarrassing. I sheepishly added the call to Reset and watched as the thread in question effectively disappeared from the profile capture. And of course, Ree’s machine suddenly wasn’t operating in space-heater mode all day.
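To make the failure mode concrete, here is a minimal sketch of that kind of wait loop. It is not TaleSpire's actual code: the class `IconLoaderSketch`, the `resetAfterWait` flag, the `maxRunMs` cap, and the icon path are all hypothetical, and the loop runs on the calling thread rather than a dedicated one so that the result is easy to observe. The only real API used is `ManualResetEventSlim` (`Set`, `Reset`, and `Wait` with a millisecond timeout).

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;

// Hypothetical stand-in for an icon-loading manager; every name here is illustrative.
public sealed class IconLoaderSketch
{
    private readonly ConcurrentQueue<string> _requests = new ConcurrentQueue<string>();
    private readonly ManualResetEventSlim _wake = new ManualResetEventSlim(false);
    private readonly bool _resetAfterWait;

    public int WakeUps { get; private set; }  // times Wait returned because the event was set
    public int Loaded { get; private set; }   // requests taken off the queue

    public IconLoaderSketch(bool resetAfterWait) { _resetAfterWait = resetAfterWait; }

    public void Request(string iconPath)
    {
        _requests.Enqueue(iconPath);
        _wake.Set();  // a manual-reset event stays signaled until Reset is called
    }

    // Worker loop. Returns after an idle timeout, or after maxRunMs
    // (an assumed safety cap so the buggy variant terminates in this demo).
    public void Run(int idleTimeoutMs, int maxRunMs)
    {
        var clock = Stopwatch.StartNew();
        while (clock.ElapsedMilliseconds < maxRunMs)
        {
            while (_requests.TryDequeue(out string path))
                Loaded++;  // placeholder for actually loading the icon at 'path'

            if (!_wake.Wait(idleTimeoutMs))
                return;    // nothing happened for idleTimeoutMs: shut down

            WakeUps++;
            if (_resetAfterWait)
                _wake.Reset();  // the call that was missing
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        var buggy = new IconLoaderSketch(resetAfterWait: false);
        buggy.Request("icons/example.png");
        buggy.Run(idleTimeoutMs: 50, maxRunMs: 200);

        var repaired = new IconLoaderSketch(resetAfterWait: true);
        repaired.Request("icons/example.png");
        repaired.Run(idleTimeoutMs: 50, maxRunMs: 200);

        Console.WriteLine($"without Reset: {buggy.WakeUps} wake-ups");
        Console.WriteLine($"with Reset:    {repaired.WakeUps} wake-ups");
    }
}
```

With the reset in place, the loop wakes once for the single request and then times out. Without it, `Wait` keeps returning immediately and the loop only stops because of the demo's time cap. Resetting before draining the queue (rather than after) means a request that arrives mid-loop is either picked up by the drain or sets the event again, so no wake-up is lost.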

This has been affecting all of ya. The fewer cores you have, the more noticeable it’s probably been. Even if it’s not affecting frame rate, it’s still bad, and we’ll be shipping the fix this week.

This does go to show how impactful a good tool can be. I’ve been using a C# profiler on TaleSpire for years and never noticed this, as it spent so little time in managed code. It would have looked like tiny blips on a background thread, which I could have been more curious about, but there were always bigger targets in the captures for me to go after. In Superluminal, however, I could see everything that was happening, in native code as well as in C#.

Going from downloading a trial of the profiler to finding and fixing the bug took 20 minutes. Needless to say, we are now paying customers!

This sadly was not the only dumb mistake, but the next one went places I didn’t expect. I was kicking myself for screwing up the reset-event thing, berating myself for rolling a custom solution when I could have used something in C# that, while more wasteful with memory, at least wouldn’t have had that fuck-up in it. And it didn’t help when I found a different place where I did the following:

  • Used C#’s FileSystemWatcher to call us when any files in specific folders have been modified. (We use this to detect when mods are added, for example)
  • Upon receiving the change notification, I scan the folders in question, collating a bunch of data that, for simplicity, we will call the “report”.
  • I saved the report to disk
  • I saved the report to disk in the folder the file-watcher was watching.

Fuck. So yeah, yours truly had made another infinite loop. Good job. So I fixed that and reran the profiler. But the thread was still obviously re-scanning every 750 milliseconds or so. It was so weird. Operating systems provide APIs for waiting on file changes (like this one on Windows). So, at last, I went to read the mono source code… and ooooh boy. Check this out: https://github.com/mono/mono/blob/main/mcs/class/System/System.IO/DefaultWatcher.cs#L149

Yeaaaah, that’ll work, but it means that it didn’t matter that I was being a dumbass, because so was mono.
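For the self-triggering half of that problem (the report landing in the watched folder), one possible guard is to ignore change notifications for the report file itself. The sketch below is an assumption about how such a guard could look, not the fix that actually shipped: `ModFolderWatcherSketch`, `IsOwnReport`, the `rescan` callback, and the file names are hypothetical, and the case-insensitive comparison assumes a Windows-style file system. Writing the report somewhere outside the watched folder would avoid the issue without any filtering.

```csharp
using System;
using System.IO;

// Hypothetical wrapper around FileSystemWatcher; names are illustrative only.
public sealed class ModFolderWatcherSketch : IDisposable
{
    private readonly FileSystemWatcher _watcher;
    private readonly string _reportPath;
    private readonly Action _rescan;

    // watchedFolder must already exist, or FileSystemWatcher will throw.
    public ModFolderWatcherSketch(string watchedFolder, string reportPath, Action rescan)
    {
        _reportPath = reportPath;
        _rescan = rescan;
        _watcher = new FileSystemWatcher(watchedFolder);
        _watcher.Changed += OnChange;
        _watcher.Created += OnChange;
        _watcher.EnableRaisingEvents = true;
    }

    // True when the changed path is our own report file.
    // Case-insensitive comparison is an assumption (Windows-style paths).
    public static bool IsOwnReport(string changedPath, string reportPath) =>
        string.Equals(Path.GetFullPath(changedPath),
                      Path.GetFullPath(reportPath),
                      StringComparison.OrdinalIgnoreCase);

    private void OnChange(object sender, FileSystemEventArgs e)
    {
        if (IsOwnReport(e.FullPath, _reportPath))
            return;   // our own write: do not rescan, or we loop forever
        _rescan();    // something else changed: rebuild and save the report
    }

    public void Dispose() => _watcher.Dispose();
}

public static class WatcherDemo
{
    public static void Main()
    {
        // Only the path check is exercised here; no watcher is started.
        Console.WriteLine(ModFolderWatcherSketch.IsOwnReport("mods/./report.json", "mods/report.json"));
        Console.WriteLine(ModFolderWatcherSketch.IsOwnReport("mods/new-mod.zip", "mods/report.json"));
    }
}
```

A guard like this only stops the watcher from reacting to its own writes. It does nothing about a watcher implementation that re-scans on a timer regardless.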

The two lessons here are: 

  1. Use the built-in things 
  2. Don’t use the built-in things

And the real lesson: Use a good profiler. If you know what your program execution looks like normally, you’re much more likely to notice when it changes.

Superluminal lets me profile all of TaleSpire while keeping framerates high enough for real playtesting. I expect it to be a permanent part of my toolkit going forward. God, I hope they bring it to Linux!

As mentioned above, we’ll get these fixes to you asap. If your computer fans have been running harder than you’d expect in TaleSpire, I’d love to hear whether these fixes help.

Until then, have a good one.


[0] Not a C# task class, just the info of what it has to do (such as the filepath of the icon) 

[1] It’s not actually implemented with a C# queue, but conceptually, this is what is happening, and that’s good enough for this chat.




© 2005-2026 BouncyRock.com
