Dr. Strained Memory or: How I Learned to Stop Worrying and Love Compact Framework Memory Management
by Martin L. Shoemaker (The UML Guy)
OK... So why has my DODO been incalculable? Well, because we've been chasing down memory problems in our .NET Compact Framework 2.0 app that runs under Windows CE 5.0.
 
And if that just sent a chill down your spine, then you've probably been here before. You have my sympathies.
 
If you haven't been here before but you're planning a CF/CE app: Be afraid. Be very afraid.
 
The following is a brief summary of what we've learned (the hard way) about memory management in a Compact Framework application. 

"In the beginning, the Universe was created. This made a lot of people angry, and has been widely regarded as a bad idea."

My client's application comes in two versions: one that runs on a laptop, and one that runs on a handheld computer running Windows CE 5. Since both versions run .NET -- .NET Framework 2.0 on the laptop, .NET Compact Framework 2.0 on the handheld -- they figured they could share a lot of code between the two platforms, including a homebrew implementation of a Model-View-Presenter architecture. The implementation of View (and Presenter, to a lesser extent) had to change across platforms; but for the most part, what was a good architecture for the laptop was seen as a good architecture for the handheld.
 
First mistake, there...
 
And although I wasn't here when that mistake was made, I probably would've made the same mistake myself. I didn't realize that on the handheld platform, you need different coding priorities.

"Do you know what the secret of life is? This. One thing. Just one thing. You stick to that and the rest don't mean ****."

When you code, the quality of the result depends significantly on your priorities: what do you want from the code? Every developer has an idea of the "correct" coding priorities. Some value small code, or fast code, or readable code, or clever code. Steve McConnell once advised deciding and announcing your priorities for a given project before work begins, so everyone knows what's important for this project. You might make a prioritized list, such as:
  1. Correctness.
  2. Maintainability/readability.
  3. Speed.
  4. Extensibility.
  5. Efficient use of memory.
  6. Schedule.
  7. Modularity.
  8. Reusability.

And so on. Just let the team know what's expected of them, and you're more likely to get it.

But for many teams, that list is too much bother. They can get by with a simple mantra:

Make it run.

Make it right.

Make it fast.

Make it run, so users can give you feedback. Make it right from their feedback. And then optimize only when necessary to make the users happy, because optimization can be difficult to read and maintain and is prone to bugs.

Well, we've learned a new mantra for the handheld:

Make it small.

Make it run.

Make it right.

Make it fast.

Repeat, repeat, repeat...

You have to make it small first, because on CE 5, if it's not small, it will never run right, and it will be fast only in crashing. And then, when you take steps to make it run and make it right and make it fast, you'll inevitably make it larger, and risk crashing because of that.

So on CE 5, you have to design for small first. Adding small afterward is a great way to push your DODO to Infinity. I should know.

"A Man's Gotta Know His Limitations."

So here's the short summary of what we've learned about our limits. First, keep in mind that your code comes in two general types: Native and Managed. Native Code is compiled directly for the platform, and is responsible for managing and allocating its own memory and other resources. Managed Code is .NET Compact Framework code, which runs inside the Common Language Runtime, a Native Code "shell" that loads and runs and manages Managed Code. The CLR takes care of allocating memory; and if -- if -- you let it, it does its darnedest to clean up that memory when you're done with it. And Managed Code may very well load and call Native Code for certain purposes. (Native Code can load and run Managed Code as well; but for this discussion, that's irrelevant.)

Some of what follows applies only to Native Code. Some applies only to Managed Code. And some applies to both.
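To make that Managed/Native boundary concrete, here's a minimal sketch of Managed Code calling Native Code through P/Invoke. GlobalMemoryStatus and coredll.dll are real CE pieces; the NativeMemory class, the struct declaration, and the Query helper are my own illustration, not code from our project. (The helper will come in handy again in the profiling sketch later.)

    using System.Runtime.InteropServices;

    // A sketch of Managed Code calling Native Code: GlobalMemoryStatus lives in
    // coredll.dll, the CE counterpart of kernel32.dll. The CLR marshals the call
    // and the struct across the managed/native boundary for us.
    public static class NativeMemory
    {
        [StructLayout(LayoutKind.Sequential)]
        public struct MEMORYSTATUS
        {
            public uint dwLength;
            public uint dwMemoryLoad;
            public uint dwTotalPhys;
            public uint dwAvailPhys;
            public uint dwTotalPageFile;
            public uint dwAvailPageFile;
            public uint dwTotalVirtual;   // the process's virtual address space
            public uint dwAvailVirtual;   // how much of it is still free
        }

        [DllImport("coredll.dll")]
        private static extern void GlobalMemoryStatus(ref MEMORYSTATUS lpBuffer);

        public static MEMORYSTATUS Query()
        {
            MEMORYSTATUS status = new MEMORYSTATUS();
            status.dwLength = (uint)Marshal.SizeOf(typeof(MEMORYSTATUS));
            GlobalMemoryStatus(ref status);
            return status;
        }
    }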

"I want to be an explorer, like the Great Magellan." "Oh, you're too late! There's nothing left to explore!"

CE 5 uses Virtual Memory and memory mapping to give each running process a "private" memory space. I'm no memory management guru; but as best I understand it, this means that each process thinks it "owns" the machine. The Operating System (CE 5) maps into part of that memory space. The app and all its data map into another part. On handheld devices, memory is usually pretty limited (128 MB on board, in our case, and 1 GB on card), and probably less than the addressable space in a process's memory map.

The OS itself loads into a given memory location. I can't say exactly where, because I'm not a memory guru; I'm trying to give an overview here, not details. But the important part is that your process receives an address space in which it lives in the range from 0 GB to 1 GB, "global" storage lives in the range from 1 GB to 2 GB, and the OS lives beyond that.

"Game over, man! Game over!"

OK, so your process's memory map lets it address 1 GB (between addresses 0 GB and 1 GB); but here's the kicker, and here's where our troubles began. The actual memory accessible to your app -- also known as the Virtual Memory, i.e., the chunk of memory mapped into that 1 GB space -- is restricted to 32 MB on Windows CE. Period. No matter what you do. If your app needs more than 32 MB to store your currently loaded code and data, that's it. You're gonna crash. In our case, our own code didn't crash; but we made calls to SQL Mobile in a low-memory condition, and that crashes like clockwork, every time.

So that's simple, right? Stay under 32 MB, right? What kind of freakin' handheld app needs 32 MB, anyway?

Well, one that was ported from the laptop without sufficient consideration for Make it small, make it run, make it right, make it fast, that's what kind.

And oh, if only it were as simple as waving a magic wand and saying "Keep it under 32 MB"...

"Take your stinking paws off me, you damned dirty ape!"

Maybe, maybe you can find a way to do what you need within 32 MB. (Probably not: when you tell customers that they're getting portability as a tradeoff for power and features, they never hear the tradeoff part. They'll want your handheld app to be portable and as powerful as your laptop app and as fast. Learn to tell them "no" as early as possible.)

But you're not the only one who gets to use that 32 MB. No, in fact, every app in the system has its stinking paws on your 32 MB, including the OS!

See, when an app is actively running on the handheld, its Virtual Memory is mapped into a special reserved 32 MB slot in the system, slot 0. Your app itself is loaded at the bottom of slot 0, growing up. Any unmanaged DLLs it loads are loaded at the top of slot 0, growing down as each is loaded. (Each DLL is allocated space in 64K chunks, so DLLs that are close to 64K in size -- or 128K, or 192K, or... -- are loaded most efficiently.) And the space in between is available for your app's data. You can also store data in the "global" area from 1 GB to 2 GB; but that's a lot more difficult to use, so you don't want to if you don't have to.

But here's the tricky part: because handhelds have limited memory, CE 5 optimizes the memory used by DLLs. If a DLL is loaded in one app, CE doesn't load it into another app that also uses it. CE loads it once, period. But in order to do this, CE loads the DLL into the same location in every app that uses it -- and effectively, every app that doesn't use it. The unmanaged DLLs are loaded into the top of slot 0 itself, and just left there as different apps are mapped into slot 0 and back out again. So your app takes the hit for every unmanaged DLL loaded anywhere in the device. That can even include OS DLLs: if the OS runs out of room for its DLLs in its reserved memory, it loads them into slot 0. (In our case, after the vendor did some tuning, that only meant one unmanaged DLL loaded into slot 0 by the OS itself.)
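To make the 64K granularity concrete, here's a back-of-the-envelope sketch (my own illustration, not code from our project) of how much slot 0 space a set of unmanaged DLLs will consume:

    // Back-of-the-envelope estimate only: each unmanaged DLL occupies slot 0 space
    // in 64K chunks, so a 70K DLL costs 128K of your 32 MB, while a 64K DLL costs
    // exactly 64K.
    public static class Slot0Estimate
    {
        private const long Chunk = 64 * 1024;

        // Round one DLL's size up to the next 64K boundary.
        public static long RoundUpToChunk(long dllSizeInBytes)
        {
            return ((dllSizeInBytes + Chunk - 1) / Chunk) * Chunk;
        }

        // Total slot 0 footprint of a set of DLLs.
        // e.g., DLLs of 70K, 130K, and 500K cost 128K + 192K + 512K = 832K.
        public static long TotalFootprint(params long[] dllSizesInBytes)
        {
            long total = 0;
            foreach (long size in dllSizesInBytes)
            {
                total += RoundUpToChunk(size);
            }
            return total;
        }
    }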

"Invention, my dear friends, is 93% perspiration, 6% electricity, 4% evaporation, and 2% butterscotch ripple." "That's 105 percent."

Now if the 32 MB limit sounds scary, and the external DLLs chewing up your 32 MB sounds even scarier... Well, you're right, they are scary.

But before you set out to try to work around them, ask yourself one thing: are you using .NET Compact Framework 2.0 (or later)? Because if you are, then the CF team has already given you an invention where they put in the perspiration, electricity, evaporation, and butterscotch ripple. Sadly, it's not a 105% solution (more like an 80% solution in our case); but it's very likely better than anything you have time and resources to devise and implement and test and maintain yourself. So before you start jumping through hoops, let's look at the hoops they've already jumped through for you.

When you build a .NET app (Compact Framework or regular Framework), the real application is the CLR itself. It's the thing that runs, not your code; and then it loads your code, and uses your code to tell it what to do.

So what this means on the Compact Framework is that the CLR can be smart: it knows about the 32 MB limit, so it works overtime to try to get around it. The biggest technique it uses is that all of your Managed DLLs and your managed application are loaded into the global storage area between 1 GB and 2 GB. Then, as you call the code in your app and your DLLs, the CLR loads the pieces you use into an area within your 32 MB VM. This JIT (Just In Time) Heap starts at about 1 MB, and grows as needed; but if the CLR sees that memory is low, it "pitches" code out of the JIT Heap, keeping only the more recently used code. So the JIT Heap can be treated as a 1 MB block under normal circumstances.

And another way in which the CF CLR is smart: if any one object you allocate is larger than 2 MB (or it might be 1 MB, I'll need to check), it automatically allocates from the global storage, without making you jump through all the hoops involved. The CLR knows how to jump through the hoops. That's its job.

If you think you can write better memory management than the CLR provides, be my guest. Be sure to allow lots of time in your schedule for design, redesign, implementation, testing, fixing, retesting, refixing, reretesting, reredesign, nervous breakdowns... Now CLR memory management does not come without a cost, of course: the CLR itself requires another 1 MB block out of your VM. And other data used by the CLR can add up as well, in the form of different heaps:

  • The AppDomain Heap describes the assemblies, modules, and types used by your app. The CLR uses these to find and load and decipher code as you call it. It grows as the overall app+DLLs grow.
  • The Process Heap and the Short Term Heap are memory used and discarded by the CLR as it works. These are usually small, usually transitory, and pretty much always beyond your control.
  • And finally, the GC (Garbage Collected) Heap itself, where the CLR allocates all of your data (other than large allocations stuck in global storage). Anything in here can be released by your code at any time, and the CLR will Garbage Collect it to free up space when it sees a need.

So your real limit in a Managed app is: 32 MB, minus the size of Native DLLs in slot 0 (in our case, 5 MB, mostly due to SQL Mobile), minus 1 MB for the CLR, minus the sum of all the heaps (with a minimum around 2 MB). That's how much Native data you can afford. Because remember: your Native DLLs will allocate memory in your 32 MB VM as well.

And frankly, for many sorts of handheld apps, that's a lot. But for anything ambitious, anything that really makes customers excited... That's not enough.

In testing of our app (explanations below), the JIT Heap, the AppDomain Heap, and the GC Heap have climbed to nearly 13 MB. Add in 5 MB for Native DLLs, 1 MB for the CLR, and 1 MB for the Process and Short Term Heap, and we're up to 20 MB out of 32. That's enough to make me nervous. And for large SQL Mobile queries, that may be enough to crash. (SQL Mobile usually crashes when we make a large query with less than 2 MB VM free; but we once saw it crash with 3.4 MB free.)
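Stated as arithmetic, the budget looks roughly like this (a back-of-the-envelope sketch of my own; the figures are just the measurements quoted above):

    // All figures in MB; a sanity check, not an exact accounting.
    public static class MemoryBudget
    {
        public static double FreeSlotMemoryMB()
        {
            const double slotSize     = 32.0;  // hard per-process VM limit on CE 5
            const double nativeDlls   =  5.0;  // SQL Mobile and friends at the top of slot 0
            const double clr          =  1.0;  // the Compact Framework CLR itself
            const double managedHeaps = 13.0;  // JIT + AppDomain + GC heaps at their worst (so far)
            const double scratch      =  1.0;  // Process Heap + Short Term Heap

            return slotSize - (nativeDlls + clr + managedHeaps + scratch);  // about 12 MB of headroom
        }
    }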

"No. Try not. Do... or do not. There is no try."

So now you know the CF 2.0/CE 5.0 memory landscape. You know, kind of, what your limitations are.

So if your app starts crashing and memory errors seem to be the diagnosis, you need to throw out stuff that might waste memory, right?

No. Wrong. Stop. Remember this important mantra: We're not guessing, we're profiling.

With a lot -- a lot -- of experience, you can begin to guess where system performance problems are, and be right maybe half the time. (And by performance, I include memory usage, CPU usage, and other ways your app might bog down a machine.) But people with that much experience will usually tell you: don't guess, use a profiler.

Now on the laptop, there are many profiling tools available. On CE 5, not so many. One of the best we could find is the .NET Compact Framework Remote Performance Monitor, which ships in the CF 3.5 Power Toys from Microsoft; the RPM in there works fine with CF 2.0. This tool will tell you the size of your different heaps. It will also take "snapshots" of your GC heap, which will let you see how many objects of different classes are in use, and how much space they consume. (In our case, the worst offender is 14,408 Strings, requiring 0.89 MB of storage. The top 10 classes for memory usage consume over 2.6 MB out of a 3.3 MB GC heap. I won't know why until I do more research. I have my guesses, but we're not guessing, we're profiling.)

Another useful memory profiling tool is DUMPMEM. (This is targeted at Pocket PC 2002, but still works OK for CE 5.) This tool is a bit verbose, with a lot of nearly unreadable information; but near the end of the report, it spells out all of the apps running on your device, including which unmanaged DLLs each has loaded.

Beyond those, we found it easiest to write our own profiling tools. They're crude, but they helped us match memory usage to user activities in the application. They rely heavily on the MemoryManagement class from the OpenNETCF libraries.
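Our tools aren't worth publishing, but a minimal sketch of the idea looks something like this. (The MemoryLog class, the CSV format, the file path, and the use of GC.GetTotalMemory are my own choices for illustration; our real tools lean on OpenNETCF's MemoryManagement class. The NativeMemory helper is the P/Invoke sketch shown earlier.)

    using System;
    using System.IO;

    // Minimal sketch of a homegrown memory diagnostic: append a timestamped CSV row
    // whenever the user does something interesting, then graph the file afterward.
    public static class MemoryLog
    {
        // Hypothetical log location; pick whatever is convenient on your device.
        private const string LogPath = @"\Program Files\MyApp\memlog.csv";

        public static void Snapshot(string userActivity)
        {
            // Managed side: bytes the GC currently believes are allocated (no forced collection).
            long gcBytes = GC.GetTotalMemory(false);

            // Native side: how much of the 32 MB slot is still free.
            NativeMemory.MEMORYSTATUS status = NativeMemory.Query();

            string line = string.Format("{0},{1},{2},{3}",
                DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss"),
                userActivity,
                gcBytes,
                status.dwAvailVirtual);

            using (StreamWriter writer = new StreamWriter(LogPath, true))
            {
                writer.WriteLine(line);
            }
        }
    }

Call MemoryLog.Snapshot("Loaded main form") at the points you care about, pull the CSV off the device, and graph it against what the user was doing.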

"Tweaking? A project that needs tweaking?"

Finally, some general lessons we learned about memory management under CF 2.0 and CE 5.0:

  • Design for small from the start. Don't try to add it later. Yes, I repeat myself. It bears repeating.
  • We're not guessing, we're profiling. Yes, I repeat myself. It bears repeating.
  • Don't waste time trying to "tweak" your way to sufficient memory. Tweaking is usually a sign that you're guessing, not profiling. We did a lot of little tweaks that saved us a few K here, a hundred K there... Those all made some difference, but at a lot of cost; whereas finding and eliminating the 14,408 cached strings will probably make the difference we need. When you try to tweak to fit, it's like Human Tetris: you might twist yourself into exactly the right shape; but if you're wrong, you won't know it until you go splat! Instead of tweaking, profile, identify and locate the major leaks, and fix them. Repeat, repeat, repeat.
  • Build memory diagnostics in from the start. You'll curse me now. You'll thank me later.
  • Graph your diagnostics, and identify on the graphs what the user was doing at each major change. That will help you in locating the major leaks.
  • Don't try to outsmart the CLR and the GC. Maybe you are smarter, but do you have time to build and test and maintain your smarter solution?
  • Sometimes, when your code is leaking, it can appear that the GC is not collecting. It is, at least as much as it can, and I have the graphs to prove it. Trust the GC.
  • Never call GC.Collect() directly. It never helps -- never! -- and it often hurts. It never helps because the GC is optimized to not waste too much CPU time. If called, it doesn't do any real garbage collection until a trigger has happened: cumulative Managed allocations have passed a megabyte boundary; GDI+ resources (pens, fonts, brushes, etc.) are used up; or memory is used up. If none of those have happened, no garbage is collected. But it may hurt, because before testing the triggers, it suspends your whole app, including waiting for each thread to reach a "suspendable" state. So though calling GC.Collect() won't save you any memory, it can cost you time. (For a deterministic way to release those GDI resources without forcing a collection, see the dispose sketch after this list.)
  • Maybe call the unmanaged heap compact API. This seldom has had any effect for us; but on rare occasions, it has reclaimed some Virtual Memory for us. We're still kind of undecided.
  • If you can, wait for CE 6.0. It has a radically expanded memory mapping scheme, including 2 GB of VM per process. If that's not enough, you don't understand that the "C" in "CF" and "CE" stands for "Compact", as in "tiny". If 2 GB is too tiny for you, you're on the wrong platform. But as of this writing, there are no production devices running CE 6.0.
  • Consider this: is a UMPC compact enough for your customers? It's small. It's portable. And it will run the same code as your laptop does. (For us, the answer is no; but maybe you'll have a different answer...) 
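On that GC.Collect() point: the habit that actually helps (my example, not code from our project; GaugeControl is a hypothetical control) is releasing scarce drawing resources deterministically with using blocks, so you never feel tempted to force a collection to get your pens and fonts back:

    using System.Drawing;
    using System.Windows.Forms;

    // Release GDI resources deterministically instead of hoping the GC gets to them.
    // Pens, fonts, and brushes are exactly the kind of scarce resources that tempt
    // people into calling GC.Collect(); "using" disposes them on the spot instead.
    public class GaugeControl : Control
    {
        protected override void OnPaint(PaintEventArgs e)
        {
            using (Pen border = new Pen(Color.Black))
            using (SolidBrush fill = new SolidBrush(Color.LightGray))
            {
                e.Graphics.FillRectangle(fill, ClientRectangle);
                e.Graphics.DrawRectangle(border, 0, 0, Width - 1, Height - 1);
            }   // both objects are disposed here, even if an exception is thrown

            base.OnPaint(e);
        }
    }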

"Think it'll work?" "It would take a miracle."

Actually, no, you won't need a miracle to live within the memory limits of CF 2.0 and CE 5.0. You'll just need awareness, design, profiling, patience, perseverance, and maybe a little bit of luck. And maybe I've given you a few more rabbits for your hat.

Posted on Saturday, November 15, 2008 5:53 PM in CE/Mobile/Compact Framework/Whatever buzzword Microsoft has slapped on it this week, .NET


Comments on this post: Dr. Strained Memory or: How I Learned to Stop Worrying and Love Compact Framework Memory Management

# re: Dr. Strained Memory or: How I Learned to Stop Worrying and Love Compact Framework Memory Management
This article is a summary of the last month at my current primary client. One question:

In the debugger's Modules window, it looks like each system module (e.g., System.Windows.Forms.dll) is mapped into 1 MB no matter how much is actually used. Is this true, and if so, can I get these modules to use only what they need?

Thanks,
Bug
Left by sourcebug on May 07, 2009 7:55 PM

# re: Dr. Strained Memory or: How I Learned to Stop Worrying and Love Compact Framework Memory Management
Bug,

I didn't notice that. I'm now on a new handheld project, so I'll take a look for it. Thanks!
Left by Martin L. Shoemaker on May 27, 2009 7:08 PM
