Let's not protest protesters

Jon Foreman (of the band Switchfoot) recently wrote a post called “Why I Refuse To Protest Protestors”. In it, he talked about his experience dealing with protests at his band's concerts. A few sentences he wrote struck me the most:

All at once I had an epiphany: these puzzling creatures that are yelling at you are human souls – as unpredictable, perplexing and unpredictable as I am. Here's the shocker: this guy with the bullhorn could be my cousin! he could be a friend of mine! Better yet: this guy could be me! If our lives were swapped, who can say that I would be any different? I put nothing below me. Who can say what I would do if I had his reality? Compassion makes you realize what you have in common with the rest of humanity.

For me as a Mozillian, these past few days have been confusing and unsettling. First it was the protests against Brendan's appointment. For an organization you are a part of, for a pioneer you look up to, for them to be rallied against, it was hard to understand. Then it was Brendan's resignation. That was unexpected and felt like things falling apart. Now it is another wave of protests against Brendan's departure. In the midst of this whirlwind, I want to feel anguish. I want to shout. Why couldn't Brendan just recant? Why couldn't the protesters just understand? Why couldn't the media just set the story straight? Why couldn't Mozilla just handle everything a little better? I want to protest.

But that's not what Mozilla is about. Like Katharina said, Mozilla is great at not only shipping great products but also at shipping love. It was love, love for the web, that first brought Mozilla together. It is love that continues to drive our mission now. Like Jon wrote,

On my best days, I want to stand for love conquering a multitude of wrongs. I want to stand for forgiveness, for mercy, for beauty, for grace. I stand for you, sir and madame. Whether you are holding a megaphone or not. Even when you refuse to shake my hand I love you. Whether you insult me or not, drunk or sober; I honestly love you! I love your passion, your fervor, your dedication. I want to know you better. I want to find out what makes you tick. I want to know why you believe what you believe. I want to learn from you. I am for you, emphatically for you!

Let's do what we do best: showing love. Show love to Brendan; thank him for all that he's done and wish him the best. Show love to the activists; empathize with them. Show love to people who think Mozilla is not inclusive. Show love to the journalists, to companies protesting us. Show love to the Board, to your peers, to the community. Let's show our love to Mozilla.

Splitting a Git commit into two

Oftentimes I make a big commit that I later realize should be split into two commits/patches. I think the usual way to split a Git commit is the reset/“add -p”/commit sequence, like this,

# assuming HEAD is the commit to split,
git reset HEAD^ # first undo the commit
git add -p      # then choose bits to separate out
git commit      # commit separated changes
git commit -a   # commit other changes

But lately I've been using another way that involves a “commit --fixup”/revert/“rebase -i” sequence, like this,

# assuming HEAD is the commit to split,
# first, in Your Favorite Editor, delete everything that you want to separate out
git commit -a --fixup HEAD        # then commit the changes you removed as a fixup
git revert -e HEAD                # revert the fixup; this will become the split commit
git rebase -i --autosquash HEAD~3 # combine the original commit and the fixup

One thing I like about this second method is that it lets me pick out the changes to separate out in an editor with full context. I think this is a lot better than git add -p, which involves editing the hunks directly, and sometimes provides too little context to let me determine if a particular hunk should be separated out or not.

Fennec App Not Responding (ANR) Dashboard

Over the last few months, I've been working on an improved App Not Responding (ANR) dashboard for Fennec, which is now hosted at telemetry.mozilla.org/hang/anr. With the help of many people, I'm glad to say that the dashboard is now mature enough to be a useful tool for Fennec developers.

ANR Reporting

The idea of ANR/hang reporting is similar to crash reporting. Every time the Fennec UI becomes unresponsive for more than five seconds, Android shows an “App Not Responding” dialog; the ANR Reporter detects this condition and collects the following information about the hang:

  • Stacks for Java threads in Fennec
  • Stacks for Gecko threads (C++ stacks and profiler pseudo-stacks)
  • System information listed in about:telemetry
  • Fennec logs to help debug the hang

The ANR Reporter is enabled on Nightly and Aurora builds only, and if the user has not opted out of telemetry, the collected information is sent back to Mozilla, where the data are aggregated and presented through the ANR Dashboard. Because the debug logs may contain private information, they are not processed and are only available internally, within Mozilla.

ANR Dashboard

The ANR Dashboard presents weekly aggregated data collected through the ANR Reporter. Use the drop-down list at the top of the page to choose a week to display.

ANR Dashboard

Data for each week are then grouped by certain parameters from ANR reports. The default grouping is “appName”, and because ANR reports are specific to Fennec, you only see one column in the top hangs chart labeled “Fennec”. However, if you choose to group by, for example, “memsize”, you will see many columns in the chart, with each column representing a different device memory size seen from ANR reports.

Choosing a group parameter

Each column in the top hangs chart shows the number of hangs, and each column is further divided into blocks, each representing a different hang. Hover over the blocks to see the hang stack and the number of hangs. This example shows that 8 hangs with that signature occurred on devices with 768MB of memory over the past week.

Hovering over the top hangs chart to see hang stacks

Colors are preserved across columns, so the same colored blocks all represent the same hang. The blue blocks at the bottom represent all hangs outside of the top 10 list.
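The grouping described above can be sketched in a few lines of Python. This is only an illustration, not the dashboard's actual code: the report fields, the "hang-A"-style signatures, and the `top_hangs_by_group` helper are all made up for the example, and it assumes the top list is computed over all reports before they are split into columns.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-made ANR reports; real reports carry many more fields.
REPORTS = [
    {"appName": "Fennec", "memsize": mem, "signature": sig}
    for mem, sig in [
        ("768M", "hang-A"), ("768M", "hang-A"), ("768M", "hang-B"),
        ("1G", "hang-A"), ("1G", "hang-C"),
        ("2G", "hang-B"),
    ]
]


def top_hangs_by_group(reports, group_key, top_n=10):
    """Return {group value: Counter mapping hang signature -> count}.

    Signatures outside the overall top_n are folded into "other",
    like the blocks at the bottom of each column.
    """
    overall = Counter(r["signature"] for r in reports)
    top = {sig for sig, _ in overall.most_common(top_n)}
    columns = defaultdict(Counter)
    for r in reports:
        sig = r["signature"] if r["signature"] in top else "other"
        columns[r.get(group_key, "unknown")][sig] += 1
    return dict(columns)


if __name__ == "__main__":
    # One column per memory size, each split into per-hang blocks.
    for group, blocks in top_hangs_by_group(REPORTS, "memsize", top_n=2).items():
        print(group, dict(blocks))
```

Grouping the same reports by "appName" instead yields a single "Fennec" column, which matches the default view.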

To the right of the top hangs chart is the distributions chart. It shows how different parameters are distributed for all hangs. Hover over each block to see details. This example shows that 36% of all hangs occurred on devices running Android API level 15 (corresponding to Android 4.0.3-4.0.4 Ice Cream Sandwich) over the past week.

Hovering over the distribution chart to see details

The distributions chart can also be narrowed down to specific groups. This lets us find out, for example, what percentage of hangs on devices with 1GB of memory occurred on the Nightly update channel.

Choosing a group for the distribution chart

Clicking on a block in the top hangs chart brings up a Hang Report. The hang report is specific to the column that you clicked on. For example, if you are grouping by “memsize”, clicking on a hang in the “1G” column will give you one hang report and clicking on the same hang in the “2G” column will give you a different hang report. Switch grouping to “appName” if you want to ignore groups; in that case there is only one column, “Fennec”.

Hang Report showing its distributions chart

The hang report also contains a distributions chart specific to the hang. The example above shows that 14% of this hang occurred on Nexus 7 devices.

In addition, the hang report contains a builds chart that shows the frequency of occurrence for different builds. This example shows there was one hang from build 20140224030203 on the 30.0a1 branch over the past week. The chart can be very useful when verifying that a hang has been fixed in newer builds.

Hang Report builds chart

Last but not least, the hang report contains stacks from the hang. The stacks in the hang report are more detailed than the stack shown on the main page. You can also look at stacks from other threads — useful for finding deadlocks!

Hang Report stacks

Normalization

When comparing the volume of hangs, a higher number can mean two things: the side with the higher number is more likely to hang, or the side with the higher number has more usage. For example, suppose we are comparing hangs between devices A and B, and A has a higher number of hangs. It is possible that A is more prone to hanging; however, it is also possible that A simply has more users and therefore more chances for hangs to occur.

To provide better comparisons, the ANR Dashboard has a normalization feature that tries to account for usage. Once “Normalize” is enabled at the top of the dashboard, all hang numbers in the dashboard will be divided by usage as measured by reported uptime. Instead of displaying the raw number of hangs, the top hangs chart will display the number of hangs per one thousand user-hours. For example, 10 hangs per 1k user-hours means that, on average, 1000 users each using Fennec for one hour will experience 10 hangs combined; or equivalently, one user using Fennec for 1000 hours will experience 10 hangs total. The distributions chart is also updated to reflect usage.
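As a rough sketch of the arithmetic (not the dashboard's actual code), normalization amounts to dividing the hang count by the total uptime and scaling to 1000 user-hours. The function name and the sample numbers for devices A and B below are hypothetical, and the sketch assumes uptime is expressed in hours.

```python
def hangs_per_1k_user_hours(hang_count, uptime_hours):
    """Return hangs per 1000 user-hours, or None without usage data."""
    if uptime_hours <= 0:
        # Assumption: with no usable uptime there is nothing to divide by.
        return None
    return 1000.0 * hang_count / uptime_hours


# The example from the text: 1000 users, one hour each, 10 hangs combined.
print(hangs_per_1k_user_hours(10, 1000 * 1))

# Hypothetical devices: A has more raw hangs, but also far more usage.
device_a = hangs_per_1k_user_hours(120, 60000)
device_b = hangs_per_1k_user_hours(45, 9000)
print(device_a, device_b)
```

With these made-up numbers, A comes out at 2 hangs per 1k user-hours and B at 5, so B is the more hang-prone device even though A reported more hangs in total.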

As a demonstration, the image below shows un-normalized hangs grouped by device memory size. There is no clear trend among the different values.

Normalization turned off

The image below shows normalized hangs based on the same data. In this case, it is clear that, once usage is accounted for, a higher device memory size generally corresponds to a lower number of hangs. Note that the “unknown” column became hidden because there is not enough usage data for devices with “unknown” memory size.

Normalization turned on

At the moment, I think uptime is the best available measurement for usage. Hopefully there will be a better metric in the future to provide more accurate results. Or let me know if it already exists!

Caveat when using setvbuf/fread on Android

The standard library function setvbuf() can be used to set unbuffered mode for a stream. However, fread() on Android has the behavior that, in unbuffered mode, it can sometimes return EOF, or (size_t)(-1), instead of the number of elements actually read. This unexpected behavior was the cause of bug 935831.

I took a glance at the few other places where Gecko uses unbuffered mode, and fortunately, it appears none of them apply to a regular Android run.

Now you can debug Fennec on x86

The Android GDB (JimDB) Wiki and pre-built binaries now include instructions and support for Android on x86. This is useful not only on the growing number of x86 devices, but also on the x86 emulator included in the Android SDK (for developers without access to a device). Presently, the x86 version is entirely separate from the ARM version; you need both versions, in different directories, if you have devices on both platforms.

The updated GDB also has limited support for on-demand decompression on Android. If you've been noticing random seg faults when debugging Fennec, the new version will ignore these seg faults. See "Random segmentation faults when debugging Fennec" and "monitor set ignore-ondemand" in the Wiki for more information.

Next up, I will be working on offering a B2G version of JimDB with functionality and benefits similar to those of the Android version. See the dev.b2g posting for the discussion. Let me know what you think!

Updated JimDB

At the mobile team meet-up two weeks ago I demonstrated some of the newer features in JimDB, the GDB package used for Fennec development. This past week I pushed out these new changes, and also updated the wiki page, which includes (or will include) all instructions and documentation. You should check it out if you ever need to use GDB to debug Fennec!

JimDB main prompt

Some of the changes since my last blog post include,

  • Support for launching Fennec with environment variables and arguments
  • Support for debugging Mochitests.
    • You can now debug a single test, a whole directory, or use TEST_PATH like before (thanks :jwatt!)
    • Environment variables are supported.
    • Because XRE is needed for running Mochitests, JimDB can automatically download and manage a copy of XRE for you.
  • Experimental Java debugger (JDB) integration.
    • Now you can choose the “Debug using jdb” option to debug Fennec Java code (see the jdb doc page for a quick JDB tutorial; so far at least the print and stop commands work).
    • Or you can launch two JimDB instances to debug C++ and Java simultaneously.
  • Miscellaneous improvements
    • A dump-pseudo-stack command to print the profiler stack (thanks :kats!)
    • Detection of mismatch between objdir version and installed version on device
    • Detection of device changes - now new libraries will be downloaded when the device has a new ROM
    • Automatic update
    • Better way to change settings through gdbinit.local
    • Working tab completion on OS X
    • Fixed a bug where breakpoints with conditions can cause crashes (thanks :jwatt!)
    • Fixed a bug where calling functions from JimDB can cause crashes

With these improvements, give the new JimDB a try! Let me know if you run into issues. You can always find me as jchen in #mobile on IRC. There is also a JimDB component on Bugzilla under Firefox for Android.

Next up, I have some possible new features for the next release:

  • Robocop debugging support
  • Reftest debugging support
  • Web app debugging support
  • Support for debugging in ${your_favorite_editor}

Let me know if any of these features would help you!

Firefox logo plate

Went out to a ceramics studio tonight, and I painted a Firefox logo plate! It will take a few more days for it to fire (no pun intended), so I'll have to wait to see how it turns out. Overall a fun night with friends, and I highly recommend visiting one if there's a studio near you!

Firefox logo plate

Debugging C++ Unit Tests on Android

With help from Dan Mosedale (dmose), JimDB now has the ability to debug C++ unit tests running on your Android device [1]. To get started, first update your JimDB Python scripts by running git pull under the utils directory.

The process is fairly automatic. After launching JimDB, choose the second option to debug C++ tests,

Fennec GDB utilities
1. Debug Fennec (default)
2. Debug compiled-code unit test
Enter number from above: 2

Then after choosing the object directory, the actual test is specified along with any environment variables and arguments. Environment variables can also be preset in the gdbinit file in the utils directory.

Enter path of unit test (use tab-completion to see possibilities)
    path can be relative to $objdir/dist/bin or absolute
    environmental variables and arguments are supported
    e.g. FOO=bar TestFooBar arg1 arg2
: TestRefPtr

Next, the test can be started simply through the continue command,

Ready. Use "continue" to start execution.
(gdb) c
Continuing.

Output from the test is redirected to the terminal running GDB,

out> BEGIN unit tests for |nsRefPtr|, compiled Dec 10 2012
out> >>main()
out> sizeof(nsRefPtr<Foo>) --> 4
out> >>CreateBar() -->   new Foo@0x43f02098 [#1]
out>   new Bar@0x43f02098
out> Bar@0x43f02098
out> Foo@0x43f02098::AddRef(), refcount --> 1
out> <<CreateBar()
out> Bar@0x43f02098::QueryInterface()
out> Foo@0x43f02098::AddRef(), refcount --> 2
out> total constructions/destructions --> 1/0
out> Foo@0x43f02098::Release(), refcount --> 1
out> >>Foo@0x43f02098::Release(), refcount --> 0
out>   delete Foo@0x43f02098
out> <<Foo@0x43f02098::Release()
out> Bar@0x43f02098::~Bar()
out> Foo@0x43f02098::~Foo() [#1]
out> >>CreateBar() -->   new Foo@0x43f02098 [#2]
out>   new Bar@0x43f02098
out> Bar@0x43f02098
out> Foo@0x43f02098::AddRef(), refcount --> 1
out> <<CreateBar()
out> Bar@0x43f02098::QueryInterface()
out> Foo@0x43f02098::AddRef(), refcount --> 2
out> total constructions/destructions --> 2/1
out> Foo@0x43f02098::Release(), refcount --> 1
out> >>Foo@0x43f02098::Release(), refcount --> 0
out>   delete Foo@0x43f02098
out> <<Foo@0x43f02098::Release()
out> Bar@0x43f02098::~Bar()
out> Foo@0x43f02098::~Foo() [#2]
out> 
out> ### Test  1: will a |nsCOMPtr| call |AddRef| on a pointer assigned into it?
out>   new Foo@0x43f02098 [#3]
out> Foo@0x43f02098::QueryInterface()
out> Foo@0x43f02098::AddRef(), refcount --> 1
...

Support is limited to C++ unit tests for now, but the plan is to include more types of tests that you can debug (mochitests, reftests, etc.). Stay tuned! Feel free to ping me (jchen) on IRC #mobile if you have questions, or file bugs under the Firefox for Android > JimDB component on Bugzilla.

Tunnelling ADB through SSH

TL;DR adb start-server && ssh -R 5037:localhost:5037 remote

In my current workflow, I use a MacBook Air to ssh to a Linux machine for developing and building Fennec (mostly because the Linux box has a way beefier CPU for building than the 2-core MacBook; also it saves battery when I'm not plugged in). But since everything is on the Linux box, I needed a way to make ADB talk to my phone which is connected to my MacBook.

A quick Google search didn't reveal anything, but it turns out it's pretty easy to tunnel the ADB protocol over SSH. ADB uses a client-server model. The client is the actual adb command that you use, the server is responsible for talking to the phone, and they communicate through port 5037. So to tunnel it over SSH, you simply need to add a port forwarding rule to the .ssh/config file on your local machine (in my case the MacBook). For example,

Host super-awesome-linux-box
RemoteForward 5037 localhost:5037

This way, every time the client on the Linux box needs to talk to the server, the connection is forwarded to the MacBook. You do have to make sure that the ADB server is running on the MacBook, by running

adb start-server

Also, the two machines should have the same version of adb.

One drawback is that if you go over to your Linux box while your MacBook is still ssh'd in, ADB is still being forwarded, so you cannot use ADB with any devices connected directly to the Linux box. One way to solve this is to use a different port for forwarding ADB,

Host super-awesome-linux-box
RemoteForward 5038 localhost:5037

This forwards the Linux port 5038 to the MacBook port 5037. Then, on the Linux box, you can tell your ADB client to use port 5038 by adding the following to your shell profile script on the Linux remote host,

if [[ -n $SSH_CONNECTION ]]; then
  export ANDROID_ADB_SERVER_PORT=5038
fi

This snippet sets the ADB client port to 5038 only when you are ssh'd in. That way, ADB over SSH will use one port, and ADB used locally will use another port.
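If the tunnel does not seem to work, a small check run on the Linux box can help narrow things down. The sketch below is not part of ADB or JimDB; the helper names are hypothetical. It repeats the port choice made by the shell snippet and then tests whether anything accepts TCP connections on that port. A successful connection only shows that the port is open (for example, that SSH is forwarding it); it does not prove that an ADB server is answering at the other end.

```python
import os
import socket

DEFAULT_ADB_PORT = 5037    # ADB server's default port
FORWARDED_ADB_PORT = 5038  # Linux-side port from the RemoteForward rule


def adb_server_port(env=None):
    """Mirror the shell snippet: use the forwarded port only over SSH."""
    env = os.environ if env is None else env
    if env.get("SSH_CONNECTION"):
        return FORWARDED_ADB_PORT
    return DEFAULT_ADB_PORT


def port_accepts_connections(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    port = adb_server_port()
    state = "open" if port_accepts_connections(port) else "closed"
    print("ADB port %d is %s" % (port, state))
```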

Finally, to debug Fennec, you would need a second forwarded port for tunnelling the gdbserver protocol,

Host super-awesome-linux-box
# adb
RemoteForward 5038 localhost:5037
# gdbserver
RemoteForward 5039 localhost:5039

If you pull the latest gdbutils for JimDB, you can uncomment the following line in utils/gdbinit to use this port for gdbserver.

python feninit.default.gdbserver_port = 5039

That's it! If you have any questions, feel free to ping me (jchen) in #mobile on irc.mozilla.org.


Edit: Thanks kats for pointing out an unclear part about which .ssh/config file to edit, and that you need the same version of adb.

Updated Android gdb and gdbutils

I took some time today to update both the Fennec Android gdb (aka 'jimdb') and gdbutils. Build instructions should be the same. You can also grab the pre-built Linux tarball. Note that the pre-built binaries require libpython2.7.

gdb

The updated gdb is faster when setting up remote debugging. Sometimes it can get pretty frustrating when you have to debug a lot of sessions, and you have to wait half a minute each time just to get started. This update will hopefully make it a little better. (This is not an issue if you have an intern at your disposal ;)

gdbutils

gdbutils is a set of Python scripts that works with Android gdb. The updated gdbutils includes the following modules:

feninit

feninit automatically sets up the debug environment. The updated version adds support for B2G (see gdbinit for details; thanks ThinkerYzu!) and various bug fixes.

tracebt

tracebt is a stack unwinder that works by tracing assembly. The updated version adds a sanity check to stop unwinding before the script gets stuck somewhere and becomes depressed.

fastload

fastload is a new module that automatically pulls system libraries from the device in the background. This way you don't have to wait minutes just to download system libraries when debugging on a new device.

adblog

adblog is a new module that redirects the output of 'adb logcat' to the gdb terminal when Fennec is running. It also colorizes the logs according to the order they arrived, their priority, or the threads that generated them, like this:

adblog example

You can find additional documentation for gdbutils in the README. Feel free to ping me on IRC (jchen) or file issues on github! Thanks!


Also follow me on twitter! :D