hypnoticocelot.com: My Favorite Debug Ever

By Ryan Kennedy.

One of my favorite things is debugging problems. I don’t know why, but I genuinely enjoy it and I think I’m actually good at it. This is the story of one of my favorite debugging sessions ever.

I joined Yahoo! Mail back in late 2004. After doing mostly Java in my undergrad and then nothing but Java professionally afterwards, I was doing C++ and PHP at Yahoo!. A bunch of the PHP had to wrap underlying backend C++ libraries. C++ and I…were not friends in the least. I was used to the JVM hiding pointers and memory management from me. Nevertheless, after much bellyaching, I managed to get things working.

During internal testing we were getting sporadic complaints of file uploads failing. No HTTP errors from the server…the connection would just close itself. My more experienced coworkers told me this was typically the behavior seen when an Apache process would crash. The evidence would be found in core files on the affected machines. Sure enough, once I’d figured out where these mysterious artifacts could be found (having been a Java programmer for most of my life, core dumps were new to me) I quickly located quite a number of very large core files.

I had to figure out which of the core files (if any) were related to the problem I was investigating, which meant needing to learn enough GDB to load a core dump along with all the necessary symbols to be able to make sense of where things had gone wrong. At this point I had only a basic understanding of GDB (I had an unconventional undergrad experience having blown through a Computer Science degree in 2 years after spending 3 years as a Physics/Chemistry major), but I quickly figured out how to load the core dump and at least get a back trace. None of the back traces had anything to do with file uploads…they were segmentation faults all over the place. I started looking at the code indicated, but nothing looked out of place. I was completely unable to find any code that could be causing a segmentation fault.

About this time one of our frontend engineers caught the upload failure as it happened and called me over. He showed me again and again how the server would drop the connection. I asked for a copy of the attachment and went back to my desk. I sent the attachment to my own local development instance and watched, happily, as my process also crashed. This was the first breakthrough…a reproducible case. I put Apache into single process mode, attached GDB, and ran the request again. GDB caught the segmentation fault and dumped the stack trace. Unfortunately it was in an incredibly bizarre location. I had literally no idea what was going on.

The problem had a certain smell, however. It reminded me of something I’d seen in a previous job. I worked in Java at that job, but we had JNI wrappers for a vendor supplied library. I modified the wrapper once and it blew up in my face in a really non-obvious way (stack traces pointing to bizarre locations). A much more experienced engineer told me it sounded like I was “smashing the stack.” I had an array on the stack and I was writing off the end of it, blowing up bits of the stack along the way.

Determined I was encountering the same issue, I started wondering how on earth one finds memory corruption like this. Yahoo! Mail was an enormous codebase…I couldn’t just go spelunking for the problem. I needed help. During college, my senior project advisor had lent me a copy of Linux Application Development (I’m not sure why I never gave it back). On a whim, I flipped through it until I found a section on memory. In there, it talked about a tool called Electric Fence. Electric Fence replaced the system allocator, erecting barriers on either side of the allocated memory to detect buffer underflows and overflows.

I excitedly got back on the computer and began looking for it. I found a copy for FreeBSD, plugged it into my Apache module, restarted Apache, connected GDB, sent the doomed upload, and watched it fail exactly the same way: SIGSEGV instead of the expected SIGBUS Electric Fence ought to throw when an overflow occurred. “What the heck?”, I thought. I spent some time looking at the fine print in the documentation and noticed that by default Electric Fence would allocate a full page and set the barrier on the next page. So small overflows wouldn’t trigger Electric Fence. I found the setting (EF_ALIGNMENT) that put the barrier on the very next byte after what was requested in allocation, re-did the setup, and BOOM…SIGBUS. I ran the backtrace and found myself in the portion of code that was constructing the MIME body part, copying in the contents of the attachment provided.

It turned out that the underlying library could be called in different orders to construct a MIME message. Old Yahoo! Mail called it one order and New Yahoo! Mail (the one I was building) called it in another order. The order I was calling it in caused the buffer used to hold the attachment not to be properly initialized. As a result, attachments of a certain type and size (I remember it being nuanced, which is why it didn’t happen all the time) could overflow the buffer into undetermined space. I filed a bug against the team owning the library, updated my code to work around the ordering problem, and re-ran my test successfully.

This was 5 years into what is now a 15-year career, and it is still one of the best, if not the best, bugs I’ve ever tracked down and fixed. Mostly I think I liked it because I had to learn so many new things to figure it out, so solving the problem felt like a tremendous accomplishment.

Thanks to Bruce Perens for his wonderful tool, Dr. Emilia Villareal for lending me the book (I owe you a copy of the new edition), and the inspiring Julia Evans for asking me to write this up.

Coz: Finding Code that Counts with Causal Profiling

By Charlie Curtsinger and Emery Berger

Coz is a new kind of profiler that unlocks optimization opportunities missed by traditional profilers. Coz employs a novel technique we call causal profiling that measures optimization potential. This measurement matches developers’ assumptions about profilers: that optimizing highly-ranked code will have the greatest impact on performance. Causal profiling measures optimization potential for serial, parallel, and asynchronous programs without instrumentation or special handling of library calls and concurrency primitives. Instead, a causal profiler uses performance experiments to predict the effect of optimizations. This allows the profiler to establish causality: “optimizing function X will have effect Y,” exactly the measurement developers had assumed they were getting all along.

Full details of Coz are available in our paper, Coz: Finding Code that Counts with Causal Profiling (pdf), SOSP 2015, October 2015 (recipient of a Best Paper Award).


Coz, our prototype causal profiler, runs with unmodified Linux executables. Coz requires:

Clang 3.1 or newer or another compiler with C++11 support
Linux version 2.6.32 or newer (must support the perf_event_open system call)

To build Coz, just clone this repository and run make. The build system will check out other build dependencies and install them locally in the deps directory.

Using Coz

Using coz requires a small amount of setup, but you can jump ahead to the section on the included sample applications in this repository if you want to try coz right away.

To run your program with coz, you will need to build it with debug information. You do not need to include debug symbols in the main executable: coz uses the same procedure as gdb to locate debug information for stripped binaries. If you plan to use your program with progress points (see below), you also need to link your program with the dynamic loader library by specifying the -ldl option.

Once you have your program built with debug information, you can run it with coz using the command coz run {coz options} --- {program name and arguments}. But to produce a useful profile, you need to decide which part(s) of the application you want to speed up by specifying one or more progress points.

Profiling Modes

Coz departs from conventional profiling by making it possible to view the effect of optimizations on both throughput and latency. To profile throughput, you must specify a progress point. To profile latency, you must specify a pair of progress points.

Throughput Profiling: Specifying Progress Points

To profile throughput you must indicate a line in the code that corresponds to the end of a unit of work. For example, a progress point could be the point at which a transaction concludes, when a web page finishes rendering, or when a query completes. Coz then measures the rate of visits to each progress point to determine any potential optimization’s effect on throughput.

To place a progress point, include coz.h (under the include directory in this repository) and add the COZ_PROGRESS macro to at least one line you would like to execute more frequently. Don’t forget to link your program with libdl: use the -ldl option.

By default, Coz uses the source file and line number as the name for your progress points. If you use COZ_PROGRESS_NAMED("name for progress point") instead, you can provide an informative name for your progress points. This also allows you to mark multiple source locations that correspond to the same progress point.

Latency Profiling: Specifying Progress Points

To profile latency, you must place two progress points that correspond to the start and end of an event of interest, such as when a transaction begins and completes. Simply mark the beginning of a transaction with the COZ_BEGIN("transaction name") macro, and the end with the COZ_END("transaction name") macro. Unlike regular progress points, you always need to specify a name for your latency progress points. Don’t forget to link your program with libdl: use the -ldl option.

When coz tests a hypothetical optimization it will report the effect of that optimization on the average latency between these two points. Coz can track this information without any knowledge of individual transactions thanks to Little’s Law.

Specifying Progress Points on the Command Line

Coz has command line options to specify progress points when profiling the application instead of modifying its source. This feature is currently disabled because it did not work particularly well. Adding support for better command line-specified progress points is planned in the near future.

Processing Results

To plot profile results, go to http://plasma-umass.github.io/coz/ and load your profile. This page also includes several sample profiles from PARSEC benchmarks.

Sample Applications

The benchmarks directory in this repository includes several small benchmarks with progress points added at appropriate locations. To build and run one of these benchmarks with coz, just browse to benchmarks/{bench name} and type make bench (or make test for a smaller input size). These programs may require several runs before coz has enough measurements to generate a useful profile. Once you have profiled these programs for several minutes, go to http://plasma-umass.github.io/coz/ to load and plot your profile.

Build Your Own Lisp: Learn C and build your own programming language in 1000 lines of code!

By Daniel Holden

Contents • Build Your Own Lisp

Chapter 1 • Introduction
Who this is for
Why learn C
How to learn C
Why build a Lisp
Your own Lisp

Chapter 2 • Installation
Text Editor
Hello World

Chapter 3 • Basics
Function Declarations
Structure Declarations

Chapter 4 • An Interactive Prompt
Read, Evaluate, Print
An Interactive Prompt
Editing input
The C Preprocessor

Chapter 5 • Languages
What is a Programming Language?
Parser Combinators
Coding Grammars
Natural Grammars

Chapter 6 • Parsing
Polish Notation
Regular Expressions
Installing mpc
Polish Notation Grammar
Parsing User Input

Chapter 7 • Evaluation

Chapter 8 • Error Handling
Lisp Value
Lisp Value Functions
Evaluating Errors

Chapter 9 • S-Expressions
Lists and Lisps
Types of List
The Stack & The Heap
Parsing Expressions
Expression Structure
Constructors & Destructors
Reading Expressions
Printing Expressions
Evaluating Expressions

Chapter 10 • Q-Expressions
Adding Features
Quoted Expressions
Reading Q-Expressions
Builtin Functions
First Attempt
Builtins Lookup

Chapter 11 • Variables
Function Pointers
Cyclic Types
Function Type
Variable Evaluation
Define Function
Error Reporting

Chapter 12 • Functions
What is a Function?
Function Type
Lambda Function
Parent Environment
Function Calling
Variable Arguments
Interesting Functions

Chapter 13 • Conditionals
Doing it yourself
If Function
Recursive Functions

Chapter 14 • Strings
String Type
Reading Strings
Load Function
Command Line Arguments
Print Function
Error Function
Finishing Up

Chapter 15 • Standard Library
Building Blocks
Logical Operators
Miscellaneous Functions
List Functions
Conditional Functions

Chapter 16 • Bonus Projects
Only the Beginning
Native Types
User Defined Types
List Literal
Operating System Interaction
Variable Hashtable
Pool Allocation
Garbage Collection
Tail Call Optimisation
Lexical Scoping
Static Typing





C++ reference C++98, C++03, C++11, C++14

Our goal is to provide programmers with a complete online reference for the C and C++ languages and standard libraries, i.e. a more convenient version of the C and C++ standards.
The primary objective is to provide a good specification of C and C++: details that are implicitly clear to an experienced programmer should be omitted, or at least separated from the main description of a function, constant, or class. The “example” section of each page is a good place to demonstrate various use cases; rationale, implementation notes, and domain-specific documentation belong in the “notes” section of each page.

C11 is the most recently published C Standard. This means that the C language is now defined in terms of C11, and we try to stick to it. However, the differences between C89, C99, and C11 should be marked as such.

C++14 is the most recently published C++ Standard, so that is the main focus of this site.
However, in order to provide a more complete reference, we also include documentation describing previous versions of the standard (C++98, C++03, and C++11) as well as draft documentation for future versions of the standard (C++17 and the Technical Specifications). All version-specific documentation should be labeled appropriately.

javadude.com: Java is Pass-by-Value, Dammit!

I’m a compiler guy at heart. The terms “pass-by-value” semantics and “pass-by-reference” semantics have very precise definitions, and they’re often horribly abused when folks talk about Java. I want to correct that. The following is how I’d describe these terms:

Pass-by-value: The actual parameter (or argument expression) is fully evaluated and the resulting value is copied into a location being used to hold the formal parameter’s value during method/function execution. That location is typically a chunk of memory on the runtime stack for the application (which is how Java handles it), but other languages could choose parameter storage differently.

Pass-by-reference: The formal parameter merely acts as an alias for the actual parameter. Anytime the method/function uses the formal parameter (for reading or writing), it is actually using the actual parameter.

Java is strictly pass-by-value, exactly as in C. Read the Java Language Specification (JLS). It’s spelled out, and it’s correct.
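A small sketch (with a hypothetical Dog class) of what pass-by-value means in practice: the method receives a copy of the reference, so reassigning the parameter never affects the caller, while mutating the object it points to does.

```java
class Dog {
    String name;
    Dog(String name) { this.name = name; }
}

public class PassByValue {
    // Reassigns the copied reference; the caller's variable is untouched.
    static void reassign(Dog d) {
        d = new Dog("Fifi");
    }

    // Mutates the object both references point to; the caller sees the change.
    static void rename(Dog d) {
        d.name = "Fifi";
    }

    public static void main(String[] args) {
        Dog d = new Dog("Rover");
        reassign(d);
        System.out.println(d.name);  // still "Rover"
        rename(d);
        System.out.println(d.name);  // now "Fifi"
    }
}
```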

In Java,

Dog d;

is exactly like C++’s

Dog *d;

And using

d = new Dog();

is exactly like C++’s

d = new Dog();


Building blocks with a wide range of modules
The well-matched Tinkerforge modules let experienced programmers concentrate on the software, so projects can be completed faster. A programming novice, on the other hand, can learn programming through exciting applications built with the Tinkerforge building blocks.

No detailed electronics knowledge necessary
Realizing a project with Tinkerforge is straightforward: you simply pick the required modules and connect them to each other. No further electronics knowledge and no soldering are needed.
For example: if the project is to control a motor depending on a measured temperature, you just choose a temperature sensor and an appropriate motor controller from the available Tinkerforge building blocks.

Intuitive API
The Tinkerforge API offers intuitive functions that simplify programming. For example, you can set the velocity of a motor in meters per second with a call to setVelocity(), or read out a temperature in degrees Celsius (°C) with getTemperature().

Boost C++ Library

  1. Accumulators
  2. Algorithm
  3. Align
  4. Any
  5. Array
  6. Asio
  7. Assert
  8. Assign
  9. Atomic
  10. Bimap
  11. Bind
  12. Call Traits
  13. Chrono
  14. Circular Buffer
  15. Compatibility
  16. Compressed Pair
  17. Concept Check
  18. Config
  19. Container
  20. Context
  21. Conversion
  22. Core
  23. Coroutine
  24. CRC
  25. Date Time
  26. Dynamic Bitset
  27. Enable If
  28. Exception
  29. Filesystem
  30. Flyweight
  31. Foreach
  32. Format
  33. Function
  34. Function Types
  35. Functional
  36. Functional/Factory
  37. Functional/Forward
  38. Functional/Hash
  39. Functional/Overloaded Function
  40. Fusion
  41. Geometry
  42. GIL
  43. Graph
  44. Heap
  45. ICL
  46. Identity Type
  47. In Place Factory, Typed In Place Factory
  48. Integer
  49. Interprocess
  50. Interval
  51. Intrusive
  52. IO State Savers
  53. Iostreams
  54. Iterator
  55. Lambda
  56. Lexical Cast
  57. Local Function
  58. Locale
  59. Lockfree
  60. Log
  61. Math
  62. Math Common Factor
  63. Math Octonion
  64. Math Quaternion
  65. Math/Special Functions
  66. Math/Statistical Distributions
  67. Member Function
  68. Meta State Machine
  69. Min-Max
  70. Move
  71. MPI
  72. MPL
  73. Multi-Array
  74. Multi-Index
  75. Multiprecision
  76. Numeric Conversion
  77. Odeint
  78. Operators
  79. Optional
  80. Parameter
  81. Phoenix
  82. Pointer Container
  83. Polygon
  84. Pool
  85. Predef
  86. Preprocessor
  87. Program Options
  88. Property Map
  89. Property Tree
  90. Proto
  91. Python
  92. Random
  93. Range
  94. Ratio
  95. Rational
  96. Ref
  97. Regex
  98. Result Of
  99. Scope Exit
  100. Serialization
  101. Signals
  102. Signals2
  103. Smart Ptr
  104. Spirit
  105. Statechart
  106. Static Assert
  107. String Algo
  108. Swap
  109. System
  110. Test
  111. Thread
  112. ThrowException
  113. Timer
  114. Tokenizer
  115. TR1
  116. TTI
  117. Tuple
  118. Type Erasure
  119. Type Index
  120. Type Traits
  121. Typeof
  122. uBLAS
  123. Units
  124. Unordered
  125. Utility
  126. Uuid
  127. Value Initialized
  128. Variant
  129. Wave
  130. Xpressive

Dabeaz LLC: Python Cookbook and SWIG Author

Dabeaz LLC is David Beazley, an independent software developer and book author living in the city of Chicago. I primarily work on programming tools, provide custom software development, and teach practical programming courses for software developers, scientists, and engineers. I am best known for my work with the Python programming language, where I have created several open-source packages (e.g., SWIG and PLY). I am also the author of the Python Essential Reference (Addison-Wesley) and Python Cookbook, 3rd Ed. (O’Reilly). Although Python is my current language of choice, I also have significant experience with systems programming in C, C++, and assembly language.

Vitesse Data | Welcome

SSE Optimization
CSV file parsing is done using SSE instructions that process the CSV data 16 bytes at a time.

Drop-in Deployment
100% binary compatibility with Postgres 9.3.5 means there is no need to modify your application or site operation to realize the speed benefits and cost savings in electricity or AWS.

Mr. Sulu, Step On It!
CSV imports run up to 2X faster. OLAP aggregates run up to 10X faster. All because Vitesse DB pushes your x86 CPU to its limits.