Efficient File Copying On Linux

Mar 22, 2017
In response to my last post about dd, a friend of mine noticed that GNU cp always uses a 128 KB buffer size when copying a regular file; this is also the buffer size used by GNU cat. If you use strace to watch what happens when copying a file, you should see a lot of 128 KB read/write sequences:

$ strace -s 8 -xx cp /dev/urandom /dev/null

read(3, "\x61\xca\xf8\xff\x1a\xd6\x83\x8b"..., 131072) = 131072
write(4, "\x61\xca\xf8\xff\x1a\xd6\x83\x8b"..., 131072) = 131072
read(3, "\xd7\x47\x8f\x09\xb2\x3d\x47\x9f"..., 131072) = 131072
write(4, "\xd7\x47\x8f\x09\xb2\x3d\x47\x9f"..., 131072) = 131072
read(3, "\x12\x67\x90\x66\xb7\xed\x0a\xf5"..., 131072) = 131072
write(4, "\x12\x67\x90\x66\xb7\xed\x0a\xf5"..., 131072) = 131072
read(3, "\x9e\x35\x34\x4f\x9d\x71\x19\x6d"..., 131072) = 131072
write(4, "\x9e\x35\x34\x4f\x9d\x71\x19\x6d"..., 131072) = 131072

As you can see, each copy is operating on buffers 131072 bytes in size, which is 128 KB. GNU cp is part of the GNU coreutils project, and if you go diving into the coreutils source code you’ll find this buffer size is defined in the file src/ioblksize.h. The comments in this file are really fascinating. The author of the code in this file (Jim Meyering) did a benchmark using dd if=/dev/zero of=/dev/null with different values of the block size parameter, bs. On a wide variety of systems, including older Intel CPUs, modern high-end Intel CPUs, and even an IBM POWER7 CPU, a 128 KB buffer size is fastest. I used gnuplot to graph these results, shown below. Higher transfer rates are better, and the different symbols represent different system configurations.

[Figure: transfer rate versus buffer size for each system configuration]

Most of the systems get faster transfer rates as the buffer size approaches 128 KB. After that, performance generally degrades slightly.

The file includes a cryptic, but interesting, explanation of why 128 KB is the best buffer size. Normally with these system calls it’s more efficient to use larger buffer sizes. This is because the larger the buffer size used, the fewer system calls need to be made. So why the drop off in performance when a buffer larger than 128 KB is used?
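To put a number on the "fewer system calls" point, here is a small back-of-the-envelope sketch in Python. The function name is my own, and it deliberately ignores the final zero-byte read(2) that signals end of file as well as short reads and writes, so treat it as an estimate rather than a model of what cp really does.

```python
def syscalls_needed(file_size: int, buffer_size: int) -> int:
    """Estimate the read(2) + write(2) calls needed to copy file_size bytes.

    Assumes every read and every write transfers a full buffer, except
    possibly the last pair. The end-of-file read is not counted.
    """
    if file_size < 0 or buffer_size <= 0:
        raise ValueError("file_size must be >= 0 and buffer_size must be > 0")
    chunks = -(-file_size // buffer_size)  # ceiling division
    return 2 * chunks  # one read and one write per chunk


if __name__ == "__main__":
    one_gib = 1024 ** 3
    for size in (4 * 1024, 128 * 1024, 1024 * 1024):
        print(f"{size // 1024:>5} KB buffer: {syscalls_needed(one_gib, size)} calls")
```

For a 1 GiB file this gives 524288 calls with a 4 KB buffer and 16384 calls with a 128 KB buffer, which is why larger buffers look attractive at first glance.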

When copying a file, GNU cp will first call posix_fadvise(2) on the source file with POSIX_FADV_SEQUENTIAL as the “advice” flag. As the name implies, this gives a hint to the kernel that cp plans to scan the source file sequentially. This causes the Linux kernel to use “readahead” for the file. On Linux you can also initiate readahead using madvise(2). There’s also a system call actually called readahead(2), but it has a slightly different use case.
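As a rough illustration of this pattern, here is a minimal Python sketch of a copy loop that issues the same hint and then reads and writes in 128 KB chunks. This is not the coreutils implementation (which is written in C); the function name copy_sequential and the constant BUFFER_SIZE are mine. os.posix_fadvise is only available on some Unix platforms, so the sketch treats the hint as optional.

```python
import os

BUFFER_SIZE = 128 * 1024  # 131072 bytes, the size seen in the strace output


def copy_sequential(src_path, dst_path, buffer_size=BUFFER_SIZE):
    """Copy src_path to dst_path with a plain read/write loop.

    Before reading, hint to the kernel that the source will be read
    sequentially. Returns the number of bytes copied.
    """
    total = 0
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        if hasattr(os, "posix_fadvise"):
            try:
                # offset 0 and length 0 mean "the whole file"
                os.posix_fadvise(src_fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
            except OSError:
                pass  # it is only a hint, so a failure is not fatal
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                chunk = os.read(src_fd, buffer_size)
                if not chunk:  # empty bytes means end of file
                    break
                view = memoryview(chunk)
                while view:  # os.write may write fewer bytes than requested
                    written = os.write(dst_fd, view)
                    view = view[written:]
                total += len(chunk)
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
    return total
```

Running this under strace against a large regular file should show the fadvise call followed by read/write pairs of 131072 bytes, much like the cp trace above.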

When you read(2) data from a regular file, if you’re lucky some or all of the data you plan to read will already be in the kernel’s page cache. The page cache is a cache of disk pages stored in kernel memory. Normally this works on an LRU basis, so when you read a page from disk the kernel first checks the page cache, and if the page isn’t in the cache it reads it from disk and copies it into the page cache (possibly evicting an older page from the cache). This means the first access to a disk page actually requires going to disk, but subsequent accesses can simply copy the data from main memory if the disk page is still in the page cache.

When the kernel initiates readahead, it makes a best effort to prefetch pages that it thinks will be needed imminently. In particular, when accessing a file sequentially, the kernel will attempt to prefetch upcoming parts of the file as the file is read. When everything is working correctly, one can get a high cache hit rate even if the file contents weren’t already in the page cache when the file was initially opened. In fact, if the file is actually accessed sequentially, there’s a good chance of getting a 100% hit rate from the page cache when the kernel is doing readahead.

There’s a trade-off here, because if the kernel prefetches pages more aggressively there will be a higher cache hit rate; but if the kernel is too aggressive, it may wastefully prefetch pages that aren’t actually going to be read. What actually happens is the kernel has a readahead buffer size configured for each block device, and the readahead kernel thread will prefetch at most that much data for files on that block device. You can see the readahead buffer size using the blockdev command:

# Get the readahead size for /dev/sda
$ blockdev --getra /dev/sda
The units returned by blockdev are in terms of 512 byte “sectors” (even though my Intel SSD doesn’t actually have true disk sectors). Thus a return value of 256 actually corresponds to a 128 KB buffer size. You can see how this is actually implemented by the kernel in the file mm/readahead.c, in particular in the function ondemand_readahead() which calls get_init_ra_size(). From my non-expert reading of the code, it appears that the code tries to look at the number of pages in the file, and for large files a maximum value of 128 KB is used. Note that this is highly specific to Linux: other Unix kernels may or may not implement readahead, and if they do there’s no guarantee that they’ll use the same readahead buffer size.
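The unit conversion is simple enough to check by hand, but here it is as a tiny Python sketch. The function names are mine, and "KB" means 1024 bytes, as elsewhere in this post.

```python
SECTOR_BYTES = 512  # blockdev reports readahead in 512-byte units


def readahead_bytes(sectors: int) -> int:
    """Convert a `blockdev --getra` value to bytes."""
    return sectors * SECTOR_BYTES


def readahead_kb(sectors: int) -> int:
    """Convert a `blockdev --getra` value to KB (1024 bytes), rounding down."""
    return readahead_bytes(sectors) // 1024


if __name__ == "__main__":
    print(readahead_bytes(256), readahead_kb(256))
```

With the value 256 from above, this gives 131072 bytes, or 128 KB, which matches the buffer size in the strace output.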

So how is this related to disk transfer rates? As noted earlier, typically one wants to minimize the number of system calls made, as each system call has overhead. In this case that means we want to use as large a buffer size as possible. On the other hand, performance will be best when the page cache hit rate is high. A buffer size of 128 KB fits both of these constraints: it’s the maximum buffer size that can be used before readahead will stop being effective. If a larger buffer size is used, read(2) calls will block while the kernel waits for the disk to actually return new data.

In the real world a lot of other things will be happening on the host, so there’s no guarantee that the stars will align perfectly. If the disk is very fast, the effect of readahead is diminished, so the penalty for using a larger buffer size might not be as bad. It’s also possible to race the kernel here: a userspace program could try to read a file faster than the kernel can prefetch pages, which will make readahead less effective. But on the whole, we expect a 128 KB buffer size to be most effective, and that’s exactly what the benchmark above demonstrates.


Haven App: Keep Watch

About Haven
Haven is for people who need a way to protect their personal spaces and possessions without compromising their own privacy. It is an Android application that leverages on-device sensors to provide monitoring and protection of physical spaces. Haven turns any Android phone into a motion, sound, vibration and light detector, watching for unexpected guests and unwanted intruders. We designed Haven for investigative journalists, human rights defenders, and people at risk of forced disappearance to create a new kind of herd immunity. By combining the array of sensors found in any smartphone, with the world’s most secure communications technologies, like Signal and Tor, Haven prevents the worst kind of people from silencing citizens without getting caught in the act.

View our full Haven App Overview presentation for more about the origins and goals of the project.

Announcement and Public Beta
We are announcing Haven today, as an open-source project, along with a public beta release of the app. We are looking for contributors who understand that physical security is as important as digital, and who have an understanding and compassion for the kind of threats faced by the users and communities we want to support. We also think it is really cool and cutting edge, and that it makes use of encrypted messaging and onion routing in whole new ways. We believe Haven points the way to a more sophisticated approach to securing communication within networks of things and home automation systems.

Learn more about the story of this project at the links below:

Haven: Building the Most Secure Baby Monitor Ever?
Snowden’s New App Uses Your Smartphone To Physically Guard Your Laptop
Snowden’s New App Turns Your Phone Into a Home Security System
Project Team
Haven was developed through a collaboration between Freedom of the Press Foundation and Guardian Project. Prototype funding was generously provided by FoPF, and donations to support continuing work can be contributed through their site: https://freedom.press/donate-support-haven-open-source-project/

Safety through Sensors
Haven only saves images and sound when triggered by motion or volume, and stores everything locally on the device. You can position the device’s camera to capture visible motion, or set your phone somewhere discreet to just listen for noises. Get secure notifications of intrusion events instantly and access the logs remotely or anytime later.

The following sensors are monitored for a measurable change, and then recorded to an event log on the device:

Accelerometer: phone’s motion and vibration
Camera: motion in the phone’s visible surroundings from front or back camera
Microphone: noises in the environment
Light: change in light from ambient light sensor
Power: detect device being unplugged or power loss
The application can be built using Android Studio and Gradle. It relies on a number of third-party dependencies, all of which are free, open-source and listed at the end of this document.

You can currently get the Haven BETA release in one of three ways:

Download Haven from Google Play
First, install F-Droid, the open-source app store, and second, add our Haven Nightly “Bleeding Edge” repository by scanning the QR Code below:

or add this repository manually in F-Droid’s Settings->Repositories: https://guardianproject.github.io/haven-nightly/fdroid/repo/

Grab the APK files from the GitHub releases page
You can, of course, build the app yourself, from source.

If you are an Android developer, you can learn more about how you can make use of F-Droid in your development workflow, for nightly builds, testing, reproducibility and more here: F-Droid Documentation

Why no iPhone Support?
While we hope to support a version of Haven that runs directly on iOS devices in the future, iPhone users can still benefit from Haven today. You can purchase an inexpensive Android phone for less than $100, and use that as your “Haven Device”, that you leave behind, while you keep your iPhone with you. If you run Signal on your iPhone, you can configure Haven on Android to send encrypted notifications, with photos and audio, directly to you. If you enable the “Tor Onion Service” feature in Haven (requires installing “Orbot” app as well), you can remotely access all Haven log data from your iPhone, using the Onion Browser app.

So, no, iPhone users, we didn’t forget about you, and we hope you’ll pick up an Android burner today for a few bucks!

Haven is meant to provide an easy onboarding experience that walks users through configuring the sensors on their device to best detect intrusions into their environment. The current implementation covers some of this, but we are looking to improve this user experience dramatically.

Main view
The application’s main view lets the user choose which sensors to use and set the corresponding level of sensitivity. A security code, needed to disable monitoring, must be provided. A phone number can also be set; if any of the sensors is triggered, a message is sent to that number.

When one of the sensors is triggered (reaches the sensitivity threshold), a notification is sent through the following channels (if enabled):

SMS: a message is sent to the number specified when monitoring started
Signal: if configured, can send end-to-end encrypted notifications via Signal
Notifications are sent through a service running in the background, defined in the class MonitorService.
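
As a rough illustration of the trigger logic described above, here is a minimal sketch in Python. It is not Haven's code, which is an Android application: the Monitor class, its fields, and the message format are all hypothetical, and the notification channels are stand-in callables rather than real SMS or Signal clients.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Monitor:
    """Hypothetical stand-in for the sensor/threshold/notify flow."""
    thresholds: Dict[str, float]            # enabled sensor -> sensitivity threshold
    channels: List[Callable[[str], None]]   # enabled notification channels
    event_log: List[Tuple[str, float]] = field(default_factory=list)

    def on_reading(self, sensor: str, value: float) -> bool:
        """Log the event and notify if the reading reaches the threshold."""
        threshold = self.thresholds.get(sensor)
        if threshold is None or value < threshold:
            return False  # sensor disabled, or change too small to count
        self.event_log.append((sensor, value))  # stored locally first
        message = f"{sensor} triggered: {value}"
        for send in self.channels:
            send(message)
        return True


if __name__ == "__main__":
    monitor = Monitor({"microphone": 0.5}, [print])
    monitor.on_reading("microphone", 0.2)  # below threshold, nothing happens
    monitor.on_reading("microphone", 0.9)  # logged and sent to every channel
```

The point of the sketch is only the ordering: a reading is compared against the per-sensor threshold, recorded in the local event log, and then fanned out to whichever channels are enabled.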

Remote Access
All event logs and captured media can be remotely accessed through a Tor Onion Service. Haven must be configured as an Onion Service, and requires the device to also have Orbot: Tor for Android installed and running.

This project contains source code or library dependencies from the following projects:

SecureIt project available at: https://github.com/mziccard/secureit Copyright (c) 2014 Marco Ziccardi (Modified BSD)
libsignal-service-java from Open Whisper Systems: https://github.com/WhisperSystems/libsignal-service-java (GPLv3)
signal-cli from AsamK: https://github.com/AsamK/signal-cli (GPLv3)
Sugar ORM from chennaione: https://github.com/chennaione/sugar/ (MIT)
Square’s Picasso: https://github.com/square/picasso (Apache 2)
JayDeep’s AudioWife: https://github.com/jaydeepw/audio-wife (MIT)
AppIntro: https://github.com/apl-devs/AppIntro (Apache 2)
Guardian Project’s NetCipher: https://guardianproject.info/code/netcipher/ (Apache 2)
NanoHttpd: https://github.com/NanoHttpd/nanohttpd (BSD)
Milosmns’ Actual Number Picker: https://github.com/milosmns/actual-number-picker (GPLv3)
Fresco Image Viewer: https://github.com/stfalcon-studio/FrescoImageViewer (Apache 2)
Facebook Fresco Image Library: https://github.com/facebook/fresco (BSD)
Audio Waveform Viewer: https://github.com/derlio/audio-waveform (Apache 2)
FireZenk’s AudioWaves: https://github.com/FireZenk/AudioWaves (MIT)
MaxYou’s SimpleWaveform: https://github.com/maxyou/SimpleWaveform (MIT)
haven is maintained by guardianproject.

Environment Modules

Welcome to the Environment Modules open source project. The Environment Modules package provides for the dynamic modification of a user’s environment via modulefiles.

What are Environment Modules?
Typically users initialize their environment when they log in by setting environment information for every application they will reference during the session. The Environment Modules package is a tool that simplifies shell initialization and lets users easily modify their environment during the session with modulefiles.

Each modulefile contains the information needed to configure the shell for an application. Once the Modules package is initialized, the environment can be modified on a per-module basis using the module command, which interprets modulefiles. Typically modulefiles instruct the module command to alter or set shell environment variables such as PATH, MANPATH, etc. Modulefiles may be shared by many users on a system, and users may have their own collection to supplement or replace the shared modulefiles.
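
To make the effect on variables such as PATH concrete, here is a minimal sketch in Python of what loading and unloading a module amounts to for one variable. It is an illustration only, not how the module command is implemented: the function names and the example directory /opt/gcc/6.1.1/bin are assumptions, and real modulefiles are scripts interpreted by the module command.

```python
SEP = ":"  # path separator used by PATH, MANPATH, etc. on Unix


def prepend_path(env: dict, var: str, directory: str) -> dict:
    """Return a copy of env with directory placed first in env[var]."""
    parts = [p for p in env.get(var, "").split(SEP) if p and p != directory]
    new_env = dict(env)
    new_env[var] = SEP.join([directory] + parts)
    return new_env


def remove_path(env: dict, var: str, directory: str) -> dict:
    """Return a copy of env with directory removed from env[var]."""
    parts = [p for p in env.get(var, "").split(SEP) if p and p != directory]
    new_env = dict(env)
    new_env[var] = SEP.join(parts)
    return new_env


if __name__ == "__main__":
    env = {"PATH": "/usr/bin:/bin"}
    loaded = prepend_path(env, "PATH", "/opt/gcc/6.1.1/bin")    # like a load
    unloaded = remove_path(loaded, "PATH", "/opt/gcc/6.1.1/bin")  # like an unload
    print(loaded["PATH"])
    print(unloaded["PATH"])
```

Because unloading undoes exactly what loading did, the environment returns to its previous state, which is what makes switching between versions of an application practical.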

Modules can be loaded and unloaded dynamically and atomically, in a clean fashion. All popular shells are supported, including bash, ksh, zsh, sh, csh, tcsh, fish, as well as some scripting languages such as perl, ruby, tcl and python.

Modules are useful in managing different versions of applications. Modules can also be bundled into metamodules that will load an entire suite of different applications.

Latest source release
Release notes, Migrating (2017-10-16)
Source repository
How to install Modules, reference manual page for the module(1) command and for modulefile(4) script, Frequently Asked Questions, …
Documentation portal
Mailing list: questions, comments or development suggestions for the Modules community can be sent to the modules-interest mailing list.
Bug report: if you spot a bug, report it to our tracker.
Quick examples
Here is an example of loading a module on a Linux machine under bash.
% module load gcc/6.1.1
% which gcc
Now we’ll switch to a different version of the module
% module switch gcc gcc/6.3.1
% which gcc
And now we’ll unload the module altogether
% module unload gcc
% which gcc
gcc not found
Now we’ll log into a different machine, using a different shell (tcsh).
tardis-> module load gcc/6.3.1
tardis-> which gcc
Note that the command line is exactly the same, but the path has been automatically configured for the correct architecture.

About Modules
John L. Furlani, Peter W. Osel, “Abstract Yourself With Modules”, Proceedings of the Tenth Large Installation Systems Administration Conference (LISA ’96), pp.193-204, Chicago, IL, September 29 – October 4, 1996.
John L. Furlani, “Modules: Providing a Flexible User Environment”, Proceedings of the Fifth Large Installation Systems Administration Conference (LISA V), pp. 141-152, San Diego, CA, September 30 – October 3, 1991.
Erich Whitney, Mark Sprague, “Drag Your Design Environment Kicking and Screaming into the 90’s With Modules!”, Synopsys Users’ Group, Boston 2001
About Modules contributed / based tools
Richard Elling, Matthew Long, “user-setup: A system for Custom Configuration of User Environments, or Helping Users Help Themselves”, from Proceedings of the Sixth Systems Administration Conference (LISA VI), pp. 215-223, Long Beach, CA, October 19-23, 1992.
Brock Palen and Jeff Squyres – Research, Computing and Engineering – “RCE 60: Modules”, September 20, 2011.
Related tools
Flavours is a wrapper built on top of Modules C-version to simplify the organization and presentation of software that requires multiple builds against different compilers, MPI libraries, processor architectures, etc. This package is written and maintained by Mark Dixon.

Env2 is a Perl script to convert environment variables between scripting languages. For example, convert a csh setup script to bash or the other way around. Supports bash, csh, ksh, modulecmd, perl, plist, sh, tclsh, tcsh, vim, yaml and zsh. This package is written and maintained by David C. Black.

Software Collections is a Red Hat project that enables you to build and concurrently install multiple RPM versions of the same components on your system, without impacting the system versions of the RPM packages installed from your distribution. Once installed a software collection is enabled with the scl command that relies on Modules for the user environment setup.

The OSCAR Cluster Project uses modules along with a tool called switcher. Read about switcher and modules in section 4.11 of the OSCAR Cluster User’s Guide.

Reference installations
NERSC, the National Energy Research Scientific Computing Center, has a great introduction and help page: Modules Approach to Environment Management.

The University of Minnesota CSE-IT manages software in Unix using Modules. They give some insight into their Modules usage and provide details on the way they have set up their Modules environment.

Modules is covered by the GNU General Public License, version 2 and the GNU Lesser General Public License, version 2.1. Copyright © 1996-1999 John L. Furlani & Peter W. Osel, © 1998-2017 R.K.Owen, © 2002-2004 Mark Lakata, © 2004-2017 Kent Mein, © 2016-2017 Xavier Delaruelle. All rights reserved. Trademarks used are the property of their respective owners.