docs/html/guide/practices/design/performance.jd - platform/frameworks/base - Gitiles

 page.title=Designing for Performance
 @jd:body

 <p>An Android application should be fast. Well, it's probably more accurate to
 say that it should be <em>efficient</em>. That is, it should execute as
 efficiently as possible in the mobile device environment, with its limited
 computing power and data storage, smaller screen, and constrained battery life.

 <p>As you develop your application, keep in mind that, while the application may
 perform well enough in your emulator, running on your dual-core development
 computer, it will not perform that well when run a mobile device &mdash; even
 the most powerful mobile device can't match the capabilities of a typical
 desktop system. For that reason, you should strive to write efficient code, to
 ensure the best possible performance on a variety of mobile devices.</p>

 <p>Generally speaking, writing fast or efficient code means keeping memory
 allocations to a minimum, writing tight code, and avoiding certain language and
 programming idioms that can subtly cripple performance. In object-oriented
 terms, most of this work takes place at the <em>method</em> level, on the order of
 actual lines of code, loops, and so on.</p>

 <p>This document covers these topics: </p>
 <ul>
     <li><a href="#intro">Introduction</a></li>
     <li><a href="#object_creation">Avoid Creating Objects</a></li>
     <li><a href="#native_methods">Use Native Methods</a></li>
     <li><a href="#prefer_virtual">Prefer Virtual Over Interface</a></li>
     <li><a href="#prefer_static">Prefer Static Over Virtual</a></li>
     <li><a href="#internal_get_set">Avoid Internal Getters/Setters</a></li>
     <li><a href="#cache_fields">Cache Field Lookups</a></li>
     <li><a href="#use_final">Declare Constants Final</a></li>
     <li><a href="#foreach">Use Enhanced For Loop Syntax With Caution</a></li>
     <li><a href="#avoid_enums">Avoid Enums</a></li>
     <li><a href="#package_inner">Use Package Scope with Inner Classes</a></li>
     <li><a href="#avoidfloat">Avoid Float</a> </li>
     <li><a href="#samples">Some Sample Performance Numbers</a> </li>
     <li><a href="#closing_notes">Closing Notes</a></li>
 </ul>

 <a name="intro" id="intro"></a>
 <h2>Introduction</h2>

 <p>There are two basic rules for resource-constrained systems:</p>

 <ul>
     <li>Don't do work that you don't need to do.</li>
     <li>Don't allocate memory if you can avoid it.</li>
 </ul>

 <p>All the tips below follow from these two basic tenets.</p>

 <p>Some would argue that much of the advice on this page amounts to "premature
 optimization." While it's true that micro-optimizations sometimes make it
 harder to develop efficient data structures and algorithms, on embedded
 devices like handsets you often simply have no choice.  For instance, if you
 bring your assumptions about VM performance on desktop machines to Android,
 you're quite likely to write code that exhausts system memory.  This will bring
 your application to a crawl &mdash; let alone what it will do to other programs
 running on the system!</p>

 <p>That's why these guidelines are important.  Android's success depends on
 the user experience that your applications provide, and that user experience
 depends in part on whether your code is responsive and snappy, or slow and
 aggravating.  Since all our applications will run on the same devices, we're
 all in this together, in a way.  Think of this document as like the rules of
 the road you had to learn when you got your driver's license:  things run
 smoothly when everybody follows them, but when you don't, you get your car
 smashed up.</p>

 <p>Before we get down to brass tacks, a brief observation: nearly all issues
 described below are valid whether or not the VM features a JIT compiler. If I
 have two methods that accomplish the same thing, and the interpreted execution
 of foo() is faster than bar(), then the compiled version of foo() will
 probably be as fast or faster than compiled bar(). It is unwise to rely on a
 compiler to "save" you and make your code fast enough.</p>

 <a name="object_creation"></a>
 <h2>Avoid Creating Objects</h2>

 <p>Object creation is never free. A generational GC with per-thread allocation
 pools for temporary objects can make allocation cheaper, but allocating memory
 is always more expensive than not allocating memory.</p>

 <p>If you allocate objects in a user interface loop, you will force a periodic
 garbage collection, creating little "hiccups" in the user experience.</p>

 <p>Thus, you should avoid creating object instances you don't need to.  Some
 examples of things that can help:</p>

 <ul>
     <li>When extracting strings from a set of input data, try
     to return a substring of the original data, instead of creating a copy.
     You will create a new String object, but it will share the char[]
     with the data.</li>
     <li>If you have a method returning a string, and you know that its result
     will always be appended to a StringBuffer anyway, change your signature
     and implementation so that the function does the append directly,
     instead of creating a short-lived temporary object.</li>
 </ul>

 <p>A somewhat more radical idea is to slice up multidimensional arrays into parallel
 single one-dimension arrays:</p>

 <ul>
     <li>An array of ints is a much better than an array of Integers,
     but this also generalizes to the fact that two parallel arrays of ints
     are also a <strong>lot</strong> more efficient than an array of (int,int)
     objects.  The same goes for any combination of primitive types.</li>
     <li>If you need to implement a container that stores tuples of (Foo,Bar)
     objects, try to remember that two parallel Foo[] and Bar[] arrays are
     generally much better than a single array of custom (Foo,Bar) objects.
     (The exception to this, of course, is when you're designing an API for
     other code to access;  in those cases, it's usually better to trade
     correct API design for a small hit in speed. But in your own internal
     code, you should try and be as efficient as possible.)</li>
 </ul>

 <p>Generally speaking, avoid creating short-term temporary objects if you
 can.  Fewer objects created mean less-frequent garbage collection, which has
 a direct impact on user experience.</p>

 <a name="native_methods" id="native_methods"></a>
 <h2>Use Native Methods</h2>

 <p>When processing strings, don't hesitate to use specialty methods like
 String.indexOf(), String.lastIndexOf(), and their cousins. These are typically
 implemented in C/C++ code that easily runs 10-100x faster than doing the same
 thing in a Java loop.</p>

 <p>The flip side of that advice is that punching through to a native
 method is more expensive than calling an interpreted method. Don't use native
 methods for trivial computation, if you can avoid it.</p>

 <a name="prefer_virtual" id="prefer_virtual"></a>
 <h2>Prefer Virtual Over Interface</h2>

 <p>Suppose you have a HashMap object.  You can declare it as a HashMap or as
 a generic Map:</p>

 <pre>Map myMap1 = new HashMap();
 HashMap myMap2 = new HashMap();</pre>

 <p>Which is better?</p>

 <p>Conventional wisdom says that you should prefer Map, because it
 allows you to change the underlying implementation to anything that
 implements the Map interface.  Conventional wisdom is correct for
 conventional programming, but isn't so great for embedded systems.  Calling
 through an interface reference can take 2x longer than a virtual
 method call through a concrete reference.</p>

 <p>If you have chosen a HashMap because it fits what you're doing, there
 is little value in calling it a Map.  Given the availability of
 IDEs that refactor your code for you, there's not much value in calling
 it a Map even if you're not sure where the code is headed. (Again, though,
 public APIs are an exception:  a good API usually trumps small performance
 concerns.)</p>

 <a name="prefer_static" id="prefer_static"></a>
 <h2>Prefer Static Over Virtual</h2>

 <p>If you don't need to access an object's fields, make your method static.  It can
 be called faster, because it doesn't require a virtual method table
 indirection.  It's also good practice, because you can tell from the method
 signature that calling the method can't alter the object's state.</p>

 <a name="internal_get_set" id="internal_get_set"></a>
 <h2>Avoid Internal Getters/Setters</h2>

 <p>In native languages like C++ it's common practice to use getters (e.g.
 <code>i = getCount()</code>) instead of accessing the field directly (<code>i
 = mCount</code>). This is an excellent habit for C++, because the compiler can
 usually inline the access, and if you need to restrict or debug field access
 you can add the code at any time.</p>

 <p>On Android, this is a bad idea.  Virtual method calls are expensive,
 much more so than instance field lookups.  It's reasonable to follow
 common object-oriented programming practices and have getters and setters
 in the public interface, but within a class you should always access
 fields directly.</p>

 <a name="cache_fields" id="cache_fields"></a>
 <h2>Cache Field Lookups</h2>

 <p>Accessing object fields is much slower than accessing local variables.
 Instead of writing:</p>
 <pre>for (int i = 0; i &lt; this.mCount; i++)
       dumpItem(this.mItems[i]);</pre>

 <p>You should write:</p>
 <pre>  int count = this.mCount;
   Item[] items = this.mItems;

   for (int i = 0; i &lt; count; i++)
       dumpItems(items[i]);
 </pre>

 <p>(We're using an explicit "this" to make it clear that these are
 member variables.)</p>

 <p>A similar guideline is never call a method in the second clause of a "for"
 statement.  For example, the following code will execute the getCount() method
 once per iteration, which is a huge waste when you could have simply cached
 the value as an int:</p>

 <pre>for (int i = 0; i &lt; this.getCount(); i++)
     dumpItems(this.getItem(i));
 </pre>

 <p>It's also usually a good idea to create a local variable if you're going to be
 accessing an instance field more than once.  For example:</p>

 <pre>
     protected void drawHorizontalScrollBar(Canvas canvas, int width, int height) {
         if (isHorizontalScrollBarEnabled()) {
             int size = <strong>mScrollBar</strong>.getSize(<em>false</em>);
             if (size &lt;= 0) {
                 size = mScrollBarSize;
             }
             <strong>mScrollBar</strong>.setBounds(0, <em>height</em> - size, width, height);
             <strong>mScrollBar</strong>.setParams(
                     computeHorizontalScrollRange(),
                     computeHorizontalScrollOffset(),
                     computeHorizontalScrollExtent(), <em>false</em>);
             <strong>mScrollBar</strong>.draw(canvas);
         }
     }</pre>

 <p>That's four separate lookups of the member field <code>mScrollBar</code>.
 By caching mScrollBar in a local stack variable, the four member field lookups
 become four stack variable references, which are much more efficient.</p>

 <p>Incidentally, method arguments have the same performance characteristics
 as local variables.</p>

 <a name="use_final" id="use_final"></a>
 <h2>Declare Constants Final</h2>

 <p>Consider the following declaration at the top of a class:</p>

 <pre>static int intVal = 42;
 static String strVal = "Hello, world!";</pre>

 <p>The compiler generates a class initializer method, called
 <code>&lt;clinit&gt;</code>, that is executed when the class is first used.
 The method stores the value 42 into <code>intVal</code>, and extracts a
 reference from the classfile string constant table for <code>strVal</code>.
 When these values are referenced later on, they are accessed with field
 lookups.</p>

 <p>We can improve matters with the "final" keyword:</p>

 <pre>static final int intVal = 42;
 static final String strVal = "Hello, world!";</pre>

 <p>The class no longer requires a <code>&lt;clinit&gt;</code> method,
 because the constants go into classfile static field initializers, which are
 handled directly by the VM.  Code accessing <code>intVal</code> will use
 the integer value 42 directly, and accesses to <code>strVal</code> will
 use a relatively inexpensive "string constant" instruction instead of a
 field lookup.</p>

 <p>Declaring a method or class "final" does not confer any immediate
 performance benefits, but it does allow certain optimizations. For example, if
 the compiler knows that a "getter" method can't be overridden by a sub-class,
 it can inline the method call.</p>

 <p>You can also declare local variables final. However, this has no definitive
 performance benefits. For local variables, only use "final" if it makes the
 code clearer (or you have to, e.g. for use in an anonymous inner class).</p>

 <a name="foreach" id="foreach"></a>
 <h2>Use Enhanced For Loop Syntax With Caution</h2>

 <p>The enhanced for loop (also sometimes known as "for-each" loop) can be used for collections that implement the Iterable interface.
 With these objects, an iterator is allocated to make interface calls
 to hasNext() and next(). With an ArrayList, you're better off walking through
 it directly, but for other collections the enhanced for loop syntax will be equivalent
 to explicit iterator usage.</p>

 <p>Nevertheless, the following code shows an acceptable use of the enhanced for loop:</p>

 <pre>public class Foo {
     int mSplat;
     static Foo mArray[] = new Foo[27];

     public static void zero() {
         int sum = 0;
         for (int i = 0; i &lt; mArray.length; i++) {
             sum += mArray[i].mSplat;
         }
     }

     public static void one() {
         int sum = 0;
         Foo[] localArray = mArray;
         int len = localArray.length;

         for (int i = 0; i &lt; len; i++) {
             sum += localArray[i].mSplat;
         }
     }

     public static void two() {
         int sum = 0;
         for (Foo a: mArray) {
             sum += a.mSplat;
         }
     }
 }</pre>

 <p><strong>zero()</strong> retrieves the static field twice and gets the array
 length once for every iteration through the loop.</p>

 <p><strong>one()</strong> pulls everything out into local variables, avoiding
 the lookups.</p>

 <p><strong>two()</strong> uses the enhanced for loop syntax introduced in version 1.5 of
 the Java programming language. The code generated by the compiler takes care
 of copying the array reference and the array length to local variables, making
 it a good choice for walking through all elements of an array. It does
 generate an extra local load/store in the main loop (apparently preserving
 "a"), making it a teensy bit slower and 4 bytes longer than one().</p>

 <p>To summarize all that a bit more clearly: enhanced for loop syntax performs well
 with arrays, but be cautious when using it with Iterable objects since there is
 additional object creation.</p>

 <a name="avoid_enums" id="avoid_enums"></a>
 <h2>Avoid Enums</h2>

 <p>Enums are very convenient, but unfortunately can be painful when size
 and speed matter.  For example, this:</p>

 <pre>public class Foo {
    public enum Shrubbery { GROUND, CRAWLING, HANGING }
 }</pre>

 <p>turns into a 900 byte .class file (Foo$Shrubbery.class). On first use, the
 class initializer invokes the &lt;init&gt; method on objects representing each
 of the enumerated values. Each object gets its own static field, and the full
 set is stored in an array (a static field called "$VALUES"). That's a lot of
 code and data, just for three integers.</p>

 <p>This:</p>

 <pre>Shrubbery shrub = Shrubbery.GROUND;</pre>

 <p>causes a static field lookup.  If "GROUND" were a static final int,
 the compiler would treat it as a known constant and inline it.</p>

 <p>The flip side, of course, is that with enums you get nicer APIs and
 some compile-time value checking.  So, the usual trade-off applies: you should
 by all means use enums for public APIs, but try to avoid them when performance
 matters.</p>

 <p>In some circumstances it can be helpful to get enum integer values
 through the <code>ordinal()</code> method.  For example, replace:</p>

 <pre>for (int n = 0; n &lt; list.size(); n++) {
     if (list.items[n].e == MyEnum.VAL_X)
        // do stuff 1
     else if (list.items[n].e == MyEnum.VAL_Y)
        // do stuff 2
 }</pre>

 <p>with:</p>

 <pre>   int valX = MyEnum.VAL_X.ordinal();
    int valY = MyEnum.VAL_Y.ordinal();
    int count = list.size();
    MyItem items = list.items();

    for (int  n = 0; n &lt; count; n++)
    {
         int  valItem = items[n].e.ordinal();

         if (valItem == valX)
           // do stuff 1
         else if (valItem == valY)
           // do stuff 2
    }</pre>

 <p>In some cases, this will be faster, though this is not guaranteed.</p>

 <a name="package_inner" id="package_inner"></a>
 <h2>Use Package Scope with Inner Classes</h2>

 <p>Consider the following class definition:</p>

 <pre>public class Foo {
     private int mValue;

     public void run() {
         Inner in = new Inner();
         mValue = 27;
         in.stuff();
     }

     private void doStuff(int value) {
         System.out.println("Value is " + value);
     }

     private class Inner {
         void stuff() {
             Foo.this.doStuff(Foo.this.mValue);
         }
     }
 }</pre>

 <p>The key things to note here are that we define an inner class (Foo$Inner)
 that directly accesses a private method and a private instance field
 in the outer class.  This is legal, and the code prints "Value is 27" as
 expected.</p>

 <p>The problem is that Foo$Inner is technically (behind the scenes) a totally
 separate class, which makes direct access to Foo's private
 members illegal.  To bridge that gap, the compiler generates a
 couple of synthetic methods:</p>

 <pre>/*package*/ static int Foo.access$100(Foo foo) {
     return foo.mValue;
 }
 /*package*/ static void Foo.access$200(Foo foo, int value) {
     foo.doStuff(value);
 }</pre>

 <p>The inner-class code calls these static methods whenever it needs to
 access the "mValue" field or invoke the "doStuff" method in the outer
 class. What this means is that the code above really boils down to a case
 where you're accessing member fields through accessor methods instead of
 directly.  Earlier we talked about how accessors are slower than direct field
 accesses, so this is an example of a certain language idiom resulting in an
 "invisible" performance hit.</p>

 <p>We can avoid this problem by declaring fields and methods accessed
 by inner classes to have package scope, rather than private scope.
 This runs faster and removes the overhead of the generated methods.
 (Unfortunately it also means the fields could be accessed directly by other
 classes in the same package, which runs counter to the standard OO
 practice of making all fields private. Once again, if you're
 designing a public API you might want to carefully consider using this
 optimization.)</p>

 <a name="avoidfloat" id="avoidfloat"></a>
 <h2>Avoid Float</h2>

 <p>Before the release of the Pentium CPU, it was common for game authors to do
 as much as possible with integer math. With the Pentium, the floating point
 math co-processor became a built-in feature, and by interleaving integer and
 floating-point operations your game would actually go faster than it would
 with purely integer math. The common practice on desktop systems is to use
 floating point freely.</p>

 <p>Unfortunately, embedded processors frequently do not have hardware floating
 point support, so all operations on "float" and "double" are performed in
 software. Some basic floating point operations can take on the order of a
 millisecond to complete.</p>

 <p>Also, even for integers, some chips have hardware multiply but lack
 hardware divide. In such cases, integer division and modulus operations are
 performed in software &mdash; something to think about if you're designing a
 hash table or doing lots of math.</p>

 <a name="samples" id="samples"></a>
 <h2>Some Sample Performance Numbers</h2>

 <p>To illustrate some of our ideas, here is a table listing the approximate
 run times for a few basic actions. Note that these values should NOT be taken
 as absolute numbers: they are a combination of CPU and wall clock time, and
 will change as improvements are made to the system. However, it is worth
 noting how these values apply relative to each other &mdash; for example,
 adding a member variable currently takes about four times as long as adding a
 local variable.</p>

 <table>
     <tr>
         <th>Action</th>
         <th>Time</th>
     </tr>
     <tr>
         <td>Add a local variable </td>
         <td>1</td>
     </tr>
     <tr class="alt">
         <td>Add a member variable </td>
         <td>4</td>
     </tr>
     <tr>
         <td>Call String.length()</td>
         <td>5</td>
     </tr>
     <tr class="alt">
         <td>Call empty static native method</td>
         <td>5</td>
     </tr>
     <tr>
         <td>Call empty static method </td>
         <td>12</td>
     </tr>
     <tr class="alt">
         <td>Call empty virtual method </td>
         <td>12.5</td>
     </tr>
     <tr>
         <td>Call empty interface method </td>
         <td>15</td>
     </tr>
     <tr class="alt">
         <td>Call Iterator:next() on a HashMap </td>
         <td>165</td>
     </tr>
     <tr>
         <td>Call put() on a HashMap</td>
         <td>600</td>
     </tr>
     <tr class="alt">
         <td>Inflate 1 View from XML </td>
         <td>22,000</td>
     </tr>
     <tr>
         <td>Inflate 1 LinearLayout containing 1 TextView </td>
         <td>25,000</td>
     </tr>
     <tr class="alt">
         <td>Inflate 1 LinearLayout containing 6 View objects </td>
         <td>100,000</td>
     </tr>
     <tr>
         <td>Inflate 1 LinearLayout containing 6 TextView objects </td>
         <td>135,000</td>
     </tr>
     <tr class="alt">
         <td>Launch an empty activity </td>
         <td>3,000,000</td>
     </tr>
 </table>

 <a name="closing_notes" id="closing_notes"></a>
 <h2>Closing Notes</h2>

 <p>The best way to write good, efficient code for embedded systems is to
 understand what the code you write really does. If you really want to allocate
 an iterator, by all means use enhanced for loop syntax on a List; just make it a
 deliberate choice, not an inadvertent side effect.</p>

 <p>Forewarned is forearmed!  Know what you're getting into!  Insert your
 favorite maxim here, but always think carefully about what your code is doing,
 and be on the lookout for ways to speed it up.</p>