The Archer and the Arrow

The archer took an arrow and drew it on his bow.
He shot it through the bull's-eye, not high, nor wide, nor low.
A hundred other arrows he drew and aimed and shot.
And while most arrows made their mark, alas, a few did not.
      God greets error with compassion and loving forgiveness.
      God loves us in our failure, not just in our success.

For sin is like an arrow that does not know its way.
I make two hundred efforts as I go through the day,
and, though some of my actions work out just as I aim,
the few of them with imperfections tempt me to feel shame.
      For Satan knows our weakness and takes delight in our cries;
      he wants us to despise ourselves, despair, and heed his lies.

But Jesus loves His creatures, whom He will not forsake.
Two thousand years ago, He knew every mistake
that I have made in my life, my errors great and small.
He knew me, and He loved me, and He forgave them all!
      Christ bids us seek completeness, with all heart, strength, and soul,
      but God forgives our imperfections, so we are made whole.


I wrote this in 1994 as a reminder of a lesson that I learned about my perfectionism in particular, but I have found it to be a strong counter-narrative to depression in general.

When nslookup is missing but python is not

nslookup replacement

Here’s how to look up an IP address when you don’t have nslookup but you have Python:

python -c "import socket; print(socket.gethostbyname('google-public-dns-a.google.com'))"

Of course, this all goes on one line; the text above should copy without line breaks.

Here’s how to look up a host name:

python -c "import socket; print(socket.gethostbyaddr('8.8.8.8')[0])"

In this case, gethostbyaddr returns a (hostname, aliaslist, ipaddrlist) tuple, so [0] is needed to select the host name.
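
If you need every address for a host, including IPv6, socket.getaddrinfo can also do the job. A sketch (the set removes the duplicate entries that getaddrinfo returns, one per socket type):

python -c "import socket; print(sorted(set(ai[4][0] for ai in socket.getaddrinfo('google-public-dns-a.google.com', None))))"

Each getaddrinfo result is a (family, type, proto, canonname, sockaddr) tuple, and sockaddr[0] is the address.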


netstat replacement

Unfortunately, a netstat replacement is not as concise in Python – it runs to 241 lines at
https://github.com/da667/netstat. But if you have wget or curl and an Internet connection, you can do something like this:

python -c "$( wget -O - https://raw.githubusercontent.com/da667/netstat/master/netstat.py )"

tinyurl.com to the rescue:

python -c "$( wget -O - https://tinyurl.com/netstat-py )"

updates

2018.05.01 – updated for python3 syntax


Docker – remove exited containers and dangling volumes

Here’s my dockernuke script, which does what it says it does when it says it’s doing it.

#!/bin/sh
# dockernuke: remove exited containers and dangling volumes
echo "find and destroy exited containers"
echo "---"
sudo docker rm $( sudo docker ps -a --filter="status=exited" -q )
echo "..."
echo "find and destroy orphaned volumes"
echo "---"
sudo docker volume rm $( sudo docker volume ls -q -f 'dangling=true' )
echo "..."
echo "tabula rasa!"

R equivalent to SQL select … group by … having

In principle, you can use the R ‘sqldf’ package for embedding SQL in R.  However, sqldf does not allow you to use R functions as aggregation functions.  This post is to help me remember how to translate SQL statements that include grouping into base R.  Suppose the following test data:

test_df <- data.frame(
  sample      = c(1,2,3,4,5,6)
, sample_type = c(1,1,1,2,2,2)
)

A single aggregation can be performed to produce a single data.frame.  For example, this SQL:

select sample_type, count(sample) as sample_count
  from test_df
 group by sample_type
having sample_count > 4
 order by sample_type

can be expressed in R as:

with(
  x <- with(
    # FROM test_df
    test_df
  , aggregate(
      # SELECT COUNT(sample) AS sample_count
      x = data.frame(sample_count = sample)
          # second column(s) of resulting data.frame
    , # SELECT sample_type
      # GROUP BY sample_type
      # ORDER BY sample_type
      by = list(sample_type = sample_type)
          # first column(s) of resulting data.frame
          # 'by' determines both the groups to be 
          #   aggregated and the order of the result
    , # SELECT COUNT(sample)
      FUN = length
          # R function to mimic SQL 'count' function
    )
  )
, # HAVING sample_count > 4
  x[ sample_count > 4, ]
)
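
(With the test data above, each group has a sample_count of 3, so the HAVING filter leaves zero rows; lower the threshold to see output.)

For comparison, if the dplyr package is available, the same query can be written more directly; a sketch:

library(dplyr)
test_df %>%
  group_by(sample_type) %>%              # GROUP BY sample_type
  summarise(sample_count = n()) %>%      # SELECT count(sample) AS sample_count
  filter(sample_count > 4) %>%           # HAVING sample_count > 4
  arrange(sample_type)                   # ORDER BY sample_type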

This gets rather busy when aggregating multiple columns because each aggregation produces a data frame, so you need to “merge” the data frames (analogous to a SQL join):

pseudo SQL (pretending that SQL has statistical aggregation functions)

select sample_type
     , count(sample) as sample_count
     , mean(sample)  as sample_mean
     , var(sample)   as sample_var
  from test_df
 group by sample_type
having sample_count > 4
 order by sample_type

R

with(
  x <- with(
    test_df
  , {
      by <- list(sample_type = sample_type)
      Reduce(
        f = function(dtf1, dtf2) merge(dtf1, dtf2)
      , x = list(
          aggregate(
            x = data.frame(sample_count = sample)
          , by = by, FUN = length)
        , aggregate(
            x = data.frame(sample_mean = sample)
          , by = by, FUN = mean)
        , aggregate(
            x = data.frame(sample_var = sample)
          , by = by, FUN = var)
        )
      )
    }
  )
, x[ sample_count > 4, ]
)
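
Alternatively, one aggregate call can compute all three statistics if FUN returns a named vector; a sketch, where do.call(data.frame, ...) flattens the resulting matrix column into sample.count, sample.mean, and sample.var:

x <- with(
  test_df
, do.call(
    data.frame
  , aggregate(
      x = list(sample = sample)
    , by = list(sample_type = sample_type)
    , FUN = function(v)
        c(count = length(v), mean = mean(v), var = var(v))
    )
  )
)
x[ x$sample.count > 4, ]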


R S4 objects and overloaded data.frame

Here is some code extracted from this answer http://stackoverflow.com/a/14607290 to the question “How to create a dataframe of user defined S4 classes in R”. The answer and question are well worth reading, but I wanted to have the code example in one place without the intervening comments:

# Create S4 class person(name,age)
setClass("person", 
  slots = c(
    name="character", 
    age="numeric"
  )
)

# Create subsetting operator
setMethod("[", 
  "person",
  function(x, i, j) {
    initialize(x, name=x@name[i], age=x@age[i])
  }
)

# Create overload for format()
format.person <- function(x) {
  paste0(x@name, ", ", x@age)
}

# Create overload for as.data.frame()
as.data.frame.person <-
  function(x, row.names=NULL, optional=FALSE, ...)
{
  if (is.null(row.names))
    row.names <- x@name
  value <- list(x)
  attr(value, "row.names") <- row.names
  class(value) <- "data.frame"
  return(value)
}

# Create overload for c()
c.person = function(...) {
  args = list(...)
  return(
    new("person",
      name=sapply(args, function(x) x@name),  
      age=sapply(args, function(x) x@age)
    )
  )
}

# Demonstrate the code above; writes "John, 20"
format(
    data.frame(
        c(
            new("person", name="Tom", age=30),
            new("person", name="John", age=20)
        )
    )[2,]
)
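
The overloads also compose outside of a data.frame; for example, this should write "Tom, 30":

p <- c(
    new("person", name="Tom", age=30),
    new("person", name="John", age=20)
)
format(p[1])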


Conda Install R packages Tcl Error – solved

When I try to install R packages in conda, sometimes I get the following error:

install.packages("vegan")
. . .
Error: .onLoad failed in loadNamespace() for 'tcltk', details:
  call: fun(libname, pkgname)
  error: Can't find a usable init.tcl in the following directories: 
    /opt/anaconda1anaconda2anaconda3/lib/tcl8.5 ./lib/tcl8.5 ./lib/tcl8.5 ./library ./library ./tcl8.5.18/library ./tcl8.5.18/library

This is happening because install.packages is trying to paint the Tcl/Tk repository-picker window – I don’t understand why it doesn’t fall back to the command-line repository picker.

My workaround was to set the working directory to

~/miniconda2

so that

./lib/tcl8.5/init.tcl

could be found relative to the working directory (note the relative ./lib/tcl8.5 entries in the search list above). Then it painted the repository picker just fine, and I was able to install my update.

An alternative workaround might have been to ssh to the same box without X forwarding, so that DISPLAY is unset and R falls back to the text-mode picker.
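
Another way to sidestep the picker entirely is to name a repository explicitly, so that install.packages never needs to draw a menu:

install.packages("vegan", repos = "https://cloud.r-project.org")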

bash get directory for script

First, some warnings from the BashFAQ:

  • Your script does not actually have a location! Wherever the bytes end up coming from, there is no “one canonical path” for it. Never.

  • $0 is NOT the answer to your problem. If you think it is, you can either stop reading and write more bugs, or you can accept this and read on.

The BashFAQ also describes BASH_SOURCE and the applicable caveats.  Here’s some workable, albeit fallible, code from http://stackoverflow.com/questions/59895/can-a-bash-script-tell-which-directory-it-is-stored-in:

# find the directory where this script may reside
SOURCE="${BASH_SOURCE[0]}"
# resolve $SOURCE until it is no longer a symlink
while [ -h "$SOURCE" ]; do
  DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
  SOURCE="$( readlink "$SOURCE" )"
  # a relative symlink is resolved relative to the
  # directory containing the link, not to $PWD
  [[ $SOURCE != /* ]] && SOURCE="$DIR/$SOURCE"
done
DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
echo "Script $0 resides in directory $DIR"
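
To see why BASH_SOURCE matters, suppose the snippet above is saved in a hypothetical file where.sh. When the file is sourced rather than executed, $0 names the invoking shell while BASH_SOURCE still names the file:

./where.sh     # executed: $0 and BASH_SOURCE both name the script
. ./where.sh   # sourced: $0 is the shell (e.g. bash); BASH_SOURCE is ./where.sh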

docker subverts ufw

The Problem

I found on Ubuntu that Docker modifies iptables such that ufw cannot effectively control incoming connections, as documented here:

http://blog.viktorpetersson.com/post/101707677489/the-dangers-of-ufw-docker

A Workaround

This link also points out that it is possible to override this behavior by adding the --iptables=false option to the command line that starts the Docker daemon. For a systemd example, this option is appended to the ExecStart line in
/lib/systemd/system/docker.service:

ExecStart=/usr/bin/dockerd -H fd:// --iptables=false
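
On Docker releases that support /etc/docker/daemon.json, the same setting can live there instead, which survives package upgrades better than editing the unit file:

{
  "iptables": false
}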

A Side-Effect of the Workaround

On Debian, with dockerd launched with the --iptables=false option, I tried to run “docker build” for a Dockerfile that included:

RUN npm install -g ethercalc pm2

But, this failed with:

npm info retry will retry, error on last attempt: Error: getaddrinfo ENOTFOUND registry.npmjs.org registry.npmjs.org:443

So, I had to

  • restart dockerd without the --iptables=false option
  • do the docker build
  • restart dockerd with the --iptables=false option

I would like to find a more elegant solution!!

Update 1:

On a Debian machine without any bridges, I don’t notice a difference in the output of ‘iptables -L’ with or without the option set.  So, I’m a bit stumped. Could this issue only affect Ubuntu? Could it only affect machines that have bridges?

Update 2:

I tried this again on my Ubuntu machine, which does indeed have bridges, and found that the ‘iptables -L’ output is unaffected; yet when I started dockerd without the --iptables=false option, the machine accepted connections even when ufw was set to reject all incoming connections.  So, I’m still stumped.

Percent complete R function

An R function to print percent complete.

If you use method = message, you may want to use suffix = " % complete".

pctComplete <- function(
  progress = 0, last.progress = 0,
  total = 100, increment = 10,
  suffix = " ", method = cat, ...)
{
  # sanitize the arguments
  if (increment < 1)
    increment <- 1
  if (total <= 0)
    total <- 100
  pct <- round(progress * 100 / total)
  # report only when pct crosses the next
  # increment boundary above last.progress
  if ( pct >= increment
              + last.progress
              - last.progress %% increment )
  {
    method(paste("", pct, suffix, sep = ""),
           ...)
  }
  pct
}
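
A hypothetical usage sketch, reporting every 10 % across 250 work items:

last <- 0
for (i in 1:250) {
  # ... one unit of work here ...
  last <- pctComplete(i, last, total = 250, increment = 10,
                      suffix = " % complete", method = message)
}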


aggregate free memory and swap

free -b reports the total, used, free, shared, buff/cache, and available memory, in bytes.

Here’s the awk code to extract the total of the free physical memory and the free swap.

First, observe the free memory and swap, in bytes:

# free -b | \
awk 'BEGIN{s=0}; NR<2{print $3}; NR>1{print $4}'
free
293203968
21441761280

Now, total the free memory and swap:

# free -b | \
awk 'BEGIN{s=0};NR>1{n=$4;s=s+n;};END{print s}'
21741248512

The sum is slightly different because the second command was run at a different time.
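
The same total can also be read from /proc/meminfo, assuming MemFree and SwapFree entries reported in kB:

# awk '/^(MemFree|SwapFree):/ { s += $2 * 1024 } END { print s }' /proc/meminfo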