hypnoticocelot.com: My Favorite Debug Ever

By Ryan Kennedy.

One of my favorite things is debugging problems. I don’t know why, but I genuinely enjoy it and I think I’m actually good at it. This is the story of one of my favorite debugging sessions ever.

I joined Yahoo! Mail back in late 2004. After doing mostly Java in my undergrad and then nothing but Java professionally afterwards, I was doing C++ and PHP at Yahoo!. A bunch of the PHP had to wrap underlying backend C++ libraries. C++ and I…were not friends in the least. I was used to the JVM hiding pointers and memory management from me. Nevertheless, after much bellyaching, I managed to get things working.

During internal testing we were getting sporadic complaints of file uploads failing. No HTTP errors from the server…the connection would just close itself. My more experienced coworkers told me this was typically the behavior seen when an Apache process would crash. The evidence would be found in core files on the affected machines. Sure enough, once I’d figured out where these mysterious artifacts could be found (having been a Java programmer for most of my life, core dumps were new to me) I quickly located quite a number of very large core files.

I had to figure out which of the core files (if any) were related to the problem I was investigating, which meant needing to learn enough GDB to load a core dump along with all the necessary symbols to be able to make sense of where things had gone wrong. At this point I had only a basic understanding of GDB (I had an unconventional undergrad experience having blown through a Computer Science degree in 2 years after spending 3 years as a Physics/Chemistry major), but I quickly figured out how to load the core dump and at least get a back trace. None of the back traces had anything to do with file uploads…they were segmentation faults all over the place. I started looking at the code indicated, but nothing looked out of place. I was completely unable to find any code that could be causing a segmentation fault.

About this time one of our frontend engineers caught the upload failure as it happened and called me over. He showed me again and again how the server would drop the connection. I asked for a copy of the attachment and went back to my desk. I sent the attachment to my own local development instance and watched, happily, as my process also crashed. This was the first breakthrough…a reproducible case. I put Apache into single process mode, attached GDB, and ran the request again. GDB caught the segmentation fault and dumped the stack trace. Unfortunately it was in an incredibly bizarre location. I had literally no idea what was going on.

The problem had a certain smell, however. It reminded me of something I’d seen in a previous job. I worked in Java at that job, but we had JNI wrappers for a vendor supplied library. I modified the wrapper once and it blew up in my face in a really non-obvious way (stack traces pointing to bizarre locations). A much more experienced engineer told me it sounded like I was “smashing the stack.” I had an array on the stack and I was writing off the end of it, blowing up bits of the stack along the way.

Determined I was encountering the same issue, I started wondering how on earth one finds memory corruption like this. Yahoo! Mail was an enormous codebase…I couldn’t just go spelunking for the problem. I needed help. During college, my senior project advisor had lent me a copy of Linux Application Development (I’m not sure why I never gave it back). On a whim, I flipped through it until I found a section on memory. In there, it talked about a tool called Electric Fence. Electric Fence replaced the system allocator, erecting barriers on either side of the allocated memory to detect buffer underflows and overflows.

I excitedly got back on the computer and began looking for it. I found a copy for FreeBSD, plugged it into my Apache module, restarted Apache, connected GDB, sent the doomed upload, and watched it fail exactly the same way: a SIGSEGV instead of the SIGBUS that Electric Fence ought to raise when an overflow occurred. “What the heck?”, I thought. I spent some time looking at the fine print in the documentation and noticed that by default Electric Fence would allocate a full page and set the barrier on the next page, so small overflows wouldn’t trigger it. I found the setting (EF_ALIGNMENT) that put the barrier on the very next byte after what was requested in the allocation, redid the setup, and BOOM…SIGBUS. I ran the backtrace and found myself in the portion of code that was constructing the MIME body part, copying in the contents of the attachment provided.
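
To make the failure mode concrete, here is a tiny sketch (not from the original post; the buffer and its contents are made up) of the kind of small overflow involved. Run with a normal allocator, the stray byte usually lands in padding and goes unnoticed; with Electric Fence loaded and EF_ALIGNMENT lowered so the guard page begins right after the requested size, the same write faults on the spot.

#include <cstdlib>
#include <cstring>

int main() {
    // Request 5 bytes, then copy 6: a one-byte heap overflow.
    char* buf = static_cast<char*>(std::malloc(5));
    std::memcpy(buf, "hello!", 6);  // the '!' lands one byte past the allocation
    std::free(buf);
    return 0;
}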

It turned out that the underlying library could be called in different orders to construct a MIME message. Old Yahoo! Mail called it one order and New Yahoo! Mail (the one I was building) called it in another order. The order I was calling it in caused the buffer used to hold the attachment not to be properly initialized. As a result, attachments of a certain type and size (I remember it being nuanced, which is why it didn’t happen all the time) could overflow the buffer into undetermined space. I filed a bug against the team owning the library, updated my code to work around the ordering problem, and re-ran my test successfully.

This was 5 years into what is now a 15-year career, and it is still one of the best, if not the best, bugs I’ve ever tracked down and fixed. Mostly I think I liked it because I had to learn so many new things to figure it out, so solving the problem felt like a tremendous accomplishment.

Thanks to Bruce Perens for his wonderful tool, Dr. Emilia Villareal for lending me the book (I owe you a copy of the new edition), and the inspiring Julia Evans for asking me to write this up.

Coz: Finding Code that Counts with Causal Profiling

By Charlie Cutsinger and Emery Berger

Coz is a new kind of profiler that unlocks optimization opportunities missed by traditional profilers. Coz employs a novel technique we call causal profiling that measures optimization potential. This measurement matches developers’ assumptions about profilers: that optimizing highly-ranked code will have the greatest impact on performance. Causal profiling measures optimization potential for serial, parallel, and asynchronous programs without instrumentation or special handling for library calls and concurrency primitives. Instead, a causal profiler uses performance experiments to predict the effect of optimizations. This allows the profiler to establish causality: “optimizing function X will have effect Y,” exactly the measurement developers had assumed they were getting all along.

Full details of Coz are available in our paper, Coz: Finding Code that Counts with Causal Profiling (pdf), SOSP 2015, October 2015 (recipient of a Best Paper Award).

Requirements

Coz, our prototype causal profiler, runs with unmodified Linux executables. Coz requires:

Python
Clang 3.1 or newer, or another compiler with C++11 support
Linux version 2.6.32 or newer (must support the perf_event_open system call)

Building

To build Coz, just clone this repository and run make. The build system will check out other build dependencies and install them locally in the deps directory.

Using Coz

Using coz requires a small amount of setup, but you can jump ahead to the section on the included sample applications in this repository if you want to try coz right away.

To run your program with coz, you will need to build it with debug information. You do not need to include debug symbols in the main executable: coz uses the same procedure as gdb to locate debug information for stripped binaries. If you plan to use your program with progress points (see below), you also need to link your program with the dynamic loader library by specifying the -ldl option.

Once you have your program built with debug information, you can run it with coz using the command coz run {coz options} --- {program name and arguments}. But to produce a useful profile, you need to decide which part(s) of the application you want to speed up by specifying one or more progress points.

Profiling Modes

Coz departs from conventional profiling by making it possible to view the effect of optimizations on both throughput and latency. To profile throughput, you must specify a progress point. To profile latency, you must specify a pair of progress points.

Throughput Profiling: Specifying Progress Points

To profile throughput you must indicate a line in the code that corresponds to the end of a unit of work. For example, a progress point could be the point at which a transaction concludes, when a web page finishes rendering, or when a query completes. Coz then measures the rate of visits to each progress point to determine any potential optimization’s effect on throughput.

To place a progress point, include coz.h (under the include directory in this repository) and add the COZ_PROGRESS macro to at least one line you would like to execute more frequently. Don’t forget to link your program with libdl: use the -ldl option.

By default, Coz uses the source file and line number as the name for your progress points. If you use COZ_PROGRESS_NAMED("name for progress point") instead, you can provide an informative name for your progress points. This also allows you to mark multiple source locations that correspond to the same progress point.
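
As a sketch of what this looks like in practice (the work loop below is a made-up placeholder, not part of Coz), a throughput progress point is just a macro placed where one unit of work finishes:

#include <coz.h>
#include <cstdio>

// Placeholder for a real unit of work in your application.
static long do_work(long i) {
    long sum = 0;
    for (long j = 0; j < 100000; j++)
        sum += i ^ j;
    return sum;
}

int main() {
    long total = 0;
    for (long i = 0; i < 50000; i++) {
        total += do_work(i);
        COZ_PROGRESS;  // one visit per completed unit of work
        // COZ_PROGRESS_NAMED("unit of work") would give the point a readable name
    }
    std::printf("%ld\n", total);
    return 0;
}

Built with something like g++ -g example.cpp -I<path to coz's include directory> -ldl and run under coz run --- ./example, the rate of visits to this point is what Coz measures a hypothetical optimization’s effect against.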

Latency Profiling: Specifying Progress Points

To profile latency, you must place two progress points that correspond to the start and end of an event of interest, such as when a transaction begins and completes. Simply mark the beginning of a transaction with the COZ_BEGIN("transaction name") macro, and the end with the COZ_END("transaction name") macro. Unlike regular progress points, you always need to specify a name for your latency progress points. Don’t forget to link your program with libdl: use the -ldl option.

When coz tests a hypothetical optimization it will report the effect of that optimization on the average latency between these two points. Coz can track this information without any knowledge of individual transactions thanks to Little’s Law.
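
Here is a sketch of latency progress points under the same assumptions (coz.h on the include path, -ldl at link time); the batch-sorting "transaction" is invented for illustration, and the names passed to COZ_BEGIN and COZ_END must match:

#include <coz.h>
#include <algorithm>
#include <vector>

// Hypothetical transaction: sorting one batch of numbers.
static void handle_batch(std::vector<int>& batch) {
    COZ_BEGIN("batch");  // transaction begins
    std::sort(batch.begin(), batch.end());
    COZ_END("batch");    // transaction ends
}

int main() {
    std::vector<int> batch(10000);
    for (int round = 0; round < 2000; round++) {
        // Refill the batch in descending order so each round does real work.
        for (int i = 0; i < (int)batch.size(); i++)
            batch[i] = (int)batch.size() - i + round;
        handle_batch(batch);
    }
    return 0;
}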

Specifying Progress Points on the Command Line

Coz has command-line options to specify progress points when profiling the application instead of modifying its source. This feature is currently disabled because it did not work particularly well. Better support for command-line-specified progress points is planned for the near future.

Processing Results

To plot profile results, go to http://plasma-umass.github.io/coz/ and load your profile. This page also includes several sample profiles from PARSEC benchmarks.

Sample Applications

The benchmarks directory in this repository includes several small benchmarks with progress points added at appropriate locations. To build and run one of these benchmarks with coz, just browse to benchmarks/{bench name} and type make bench (or make test for a smaller input size). These programs may require several runs before coz has enough measurements to generate a useful profile. Once you have profiled these programs for several minutes, go to http://plasma-umass.github.io/coz/ to load and plot your profile.

Build Your Own Lisp: Learn C and build your own programming language in 1000 lines of code!

By Daniel Holden

Contents • Build Your Own Lisp

Chapter 1 • Introduction
About
Who this is for
Why learn C
How to learn C
Why build a Lisp
Your own Lisp

Chapter 2 • Installation
Setup
Text Editor
Compiler
Hello World
Compilation
Errors
Documentation

Chapter 3 • Basics
Overview
Programs
Variables
Function Declarations
Structure Declarations
Pointers
Strings
Conditionals
Loops

Chapter 4 • An Interactive Prompt
Read, Evaluate, Print
An Interactive Prompt
Compilation
Editing input
The C Preprocessor

Chapter 5 • Languages
What is a Programming Language?
Parser Combinators
Coding Grammars
Natural Grammars

Chapter 6 • Parsing
Polish Notation
Regular Expressions
Installing mpc
Polish Notation Grammar
Parsing User Input

Chapter 7 • Evaluation
Trees
Recursion
Evaluation
Printing

Chapter 8 • Error Handling
Crashes
Lisp Value
Enumerations
Lisp Value Functions
Evaluating Errors
Plumbing

Chapter 9 • S-Expressions
Lists and Lisps
Types of List
Pointers
The Stack & The Heap
Parsing Expressions
Expression Structure
Constructors & Destructors
Reading Expressions
Printing Expressions
Evaluating Expressions

Chapter 10 • Q-Expressions
Adding Features
Quoted Expressions
Reading Q-Expressions
Builtin Functions
First Attempt
Macros
Builtins Lookup

Chapter 11 • Variables
Immutability
Function Pointers
Cyclic Types
Function Type
Environment
Variable Evaluation
Builtins
Define Function
Error Reporting

Chapter 12 • Functions
What is a Function?
Function Type
Lambda Function
Parent Environment
Function Calling
Variable Arguments
Interesting Functions

Chapter 13 • Conditionals
Doing it yourself
Ordering
Equality
If Function
Recursive Functions

Chapter 14 • Strings
Libraries
String Type
Reading Strings
Comments
Load Function
Command Line Arguments
Print Function
Error Function
Finishing Up

Chapter 15 • Standard Library
Minimalism
Atom
Building Blocks
Logical Operators
Miscellaneous Functions
List Functions
Conditional Functions
Fibonacci

Chapter 16 • Bonus Projects
Only the Beginning
Native Types
User Defined Types
List Literal
Operating System Interaction
Macros
Variable Hashtable
Pool Allocation
Garbage Collection
Tail Call Optimisation
Lexical Scoping
Static Typing

Conclusion

Credits

FAQ

Source

C++ reference: C++98, C++03, C++11, C++14

Our goal is to provide programmers with a complete online reference for the C and C++ languages and standard libraries, i.e. a more convenient version of the C and C++ standards.
The primary objective is to have a good specification of C and C++. That is, things that are implicitly clear to an experienced programmer should be omitted, or at least separated from the main description of a function, constant, or class. A good place to demonstrate various use cases is the “example” section of each page. Rationale, implementation notes, and domain-specific documentation belong in the “notes” section of each page.

C11 is the most recently published C Standard. This means that the C language is now defined in terms of C11, and we try to stick to it. However, the differences between C89, C99, and C11 should be marked as such.

C++14 is the most recently published C++ Standard, so that is the main focus of this site.
However, in order to provide a more complete reference, we also include documentation describing previous versions of the standard (C++98, C++03, and C++11) as well as draft documentation for future versions of the standard (C++17 and the Technical Specifications). All version-specific documentation should be labeled appropriately.

javadude.com: Java is Pass-by-Value, Dammit!

I’m a compiler guy at heart. The terms “pass-by-value” semantics and “pass-by-reference” semantics have very precise definitions, and they’re often horribly abused when folks talk about Java. I want to correct that… The following is how I’d describe these terms:

Pass-by-value
The actual parameter (or argument expression) is fully evaluated and the resulting value is copied into a location being used to hold the formal parameter’s value during method/function execution. That location is typically a chunk of memory on the runtime stack for the application (which is how Java handles it), but other languages could choose parameter storage differently.

Pass-by-reference
The formal parameter merely acts as an alias for the actual parameter. Anytime the method/function uses the formal parameter (for reading or writing), it is actually using the actual parameter.

Java is strictly pass-by-value, exactly as in C. Read the Java Language Specification (JLS). It’s spelled out, and it’s correct.

In Java,

Dog d;

is exactly like C++’s

Dog *d;

And using

d.setName("Fifi");

is exactly like C++’s

d->setName("Fifi");
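
To make the analogy concrete, here is a short C++ sketch in the same terms (Dog, reassign, and renameDog are invented for illustration). The pointer is copied into the callee, so reassigning it changes nothing for the caller, while calling a method through it mutates the one shared object; Java’s object references behave exactly the same way.

#include <iostream>
#include <string>

struct Dog {
    std::string name;
    void setName(const std::string& n) { name = n; }
};

void reassign(Dog* d) {
    static Dog rex{"Rex"};
    d = &rex;                  // changes only the callee's copy of the pointer
}

void renameDog(Dog* d) {
    d->setName("Fifi");        // mutates the one object both copies point to
}

int main() {
    Dog rover{"Rover"};
    Dog* d = &rover;
    reassign(d);
    std::cout << d->name << "\n";   // still "Rover": the caller's pointer is untouched
    renameDog(d);
    std::cout << d->name << "\n";   // now "Fifi": the shared object was mutated
    return 0;
}

The same two outcomes are what you see in Java: d = new Dog() inside a method leaves the caller’s d alone, while d.setName("Fifi") is visible to everyone holding that reference.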

TinkerForge

Building blocks with a wide range of modules
The well-matched Tinkerforge modules allow experienced programmers to concentrate on the software, so projects can be completed faster. A programming novice, on the other hand, can learn programming through exciting applications built with the Tinkerforge building blocks.

No detailed knowledge of electronics necessary
Realizing a project with Tinkerforge is straightforward: you simply pick the required modules and connect them to each other. No further electronics knowledge and no soldering are needed.
For example, if the project is to control a motor based on a measured temperature, you just choose a temperature sensor and an appropriate motor controller from the available Tinkerforge building blocks.

Intuitive API
The Tinkerforge API offers intuitive functions that simplify programming. For example, you can set the velocity of a motor in meters per second with a call to setVelocity(), or read out a temperature in degrees Celsius (°C) with getTemperature().