Tuesday, February 3, 2015

Python SSH Tunnel Example

Notes for myself and hopefully others.
def createTunnel(localport, remoteport, identityfile, user, server):
    """Create SSH Tunnels for Database connections"""

    import shlex
    import subprocess
    import time

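    # Forward localport to remoteport on the remote machine.
    # Assumes the database listens on localhost on the remote server.
    sshTunnelCmd = "ssh -N -L %s:localhost:%s -i %s %s@%s" % (
        localport, remoteport, identityfile, user, server)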

    args = shlex.split(sshTunnelCmd)
    tunnel = subprocess.Popen(args)

    time.sleep(2)  # Give it a couple seconds to finish setting up

    # Return the tunnel so you can kill it before you stop the program,
    # else the connection will persist after the script ends
    return tunnel

def closeSSHTunnel(tunnels):
    """Close SSH tunnels - given the process handles"""

    for tunnel in tunnels:
        tunnel.kill()

localport = 27018  # local port for MongoDB
remoteport = 27017  # remote server port for MongoDB
identityfile = '/home//.ssh/id_rsa.pem'
user = 'ubuntu'
server = 'example.com'

# Start tunnel
tunnel = createTunnel(localport, remoteport, identityfile, user, server)

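# Example usage
import pymongo

db_server = 'localhost'
db_port = localport  # connect to the local end of the tunnel
client = pymongo.MongoClient('mongodb://{}:{}'.format(db_server, db_port))
pydb = client.agencies

# Close the tunnel when done
closeSSHTunnel([tunnel])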
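
The fixed `time.sleep(2)` above is a guess at how long ssh needs. A possible alternative is to poll the local port until it accepts connections. This is only a sketch: `wait_for_port` is a hypothetical helper name, not part of the original notes, and the timeout values are arbitrary assumptions.

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.2):
    """Return True once host:port accepts TCP connections, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or refused): wait a little and retry
            time.sleep(interval)
    return False
```

Inside `createTunnel`, the sleep could then be replaced with something like `wait_for_port('localhost', localport)`, killing the tunnel and raising an error if it returns False.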


Monday, January 19, 2015

Installation Notes for Flask App on Ubuntu 14.04 LTS using gUnicorn

Followed these excellent directions from Real Python and modified for python3 and Ubuntu 14.04.
Start with updating ubuntu and loading additional packages
sudo apt-get update
sudo apt-get install -y python3 python3-pip nginx mongodb supervisor
sudo pip3 install virtualenv
sudo mkdir /var/www
sudo chown ubuntu:ubuntu /var/www
mkdir /var/www/flask-app
mkdir /var/www/flask-app/logs
cd /var/www/flask-app

Setup virtualenv
virtualenv flask_env
source flask_env/bin/activate
pip install -r requirements.txt
pip install gunicorn

Setup nginx

sudo /etc/init.d/nginx start
sudo rm /etc/nginx/sites-enabled/default
sudo touch /etc/nginx/sites-available/flask-app
sudo ln -s /etc/nginx/sites-available/flask-app /etc/nginx/sites-enabled/flask-app
sudo vim /etc/nginx/sites-enabled/flask-app
Add the following to the nginx flask-app conf file being edited
server {
    location / {
        # proxy_pass to the gunicorn bind address goes here
    }
    location /static {
        alias  /var/www/flask-app/flask-app/static;
    }
}

sudo service nginx reload

Setup gunicorn start file

Setup bash script to run gunicorn
cd /var/www/flask-app
touch gunicorn_start
chmod a+x gunicorn_start
vim gunicorn_start
Insert the following into the gunicorn_start bash script

NUM_WORKERS=3

echo "Starting $NAME"

# activate the virtualenv
cd $VENVDIR
source bin/activate

export PYTHONPATH=$FLASKDIR:$PYTHONPATH

# Create the run directory if it doesn't exist
RUNDIR=$(dirname $SOCKFILE)
test -d $RUNDIR || mkdir -p $RUNDIR

# Start your unicorn
exec gunicorn runserver:app -b \
  --name $NAME \
  --workers $NUM_WORKERS \
  --user=$USER --group=$GROUP \
  --log-level=debug \

Setup Supervisor

Supervisor will start the Flask application and restart it if the process exits.
cd /etc/supervisor/conf.d
sudo vim flask-app.conf
Insert the following into the flask-app.conf file:
[program:flask-app]
command = /var/www/flask-app/gunicorn_start
user = ubuntu
stdout_logfile = /var/www/flask-app/logs/gunicorn_supervisor.log
redirect_stderr = true

Start flask-app gunicorn:
sudo supervisorctl update
sudo supervisorctl status
You can use the following commands as well:
sudo supervisorctl start flask-app
sudo supervisorctl start all
sudo supervisorctl help|avail|stop|restart

Test that the application is running
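
One way to do this check from Python is to request a page and look at the status code. The sketch below is an assumption-laden example, not part of the original notes: `check_app` is a hypothetical helper, and the URL you pass depends on where nginx listens on your machine.

```python
import urllib.error
import urllib.request


def check_app(url, timeout=5.0):
    """Return the HTTP status code for url, or None if nothing answers."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (for example 502
        # when nginx is up and gunicorn is not)
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, ...
        return None
```

For example, `check_app("http://localhost/")` returning 200 would suggest that nginx and gunicorn are both up, assuming nginx serves the app on port 80 of the same machine.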

Thursday, July 10, 2014

Insomnia due to back pain

I've noticed a number of people are posting on medical advice websites with the following symptoms: Sleep for 2-5 hours and wake up with a lot of back pain which goes away after getting up. Even napping during the day lying down will cause back pain in a couple of hours especially after a meal.

If this is happening to you, you may be suffering from acid-reflux erosion of the esophagus, which can irritate (activate) the nerves in the esophagus and stomach. This leads to your back pain via a process called Referred Pain. Referred pain is the process where pain induced in one part of your body is 'felt' in another part. As you can see from the chart in the Wikipedia article, stomach-based referred pain shows up in the middle of the back along the spine.

Now these pain symptoms are not going to be an exact match to the chart. Personally, I would feel like an electric charge was running along one of my ribs from my spine to my front during the earlier stages of my symptoms. Everybody's anatomy is slightly different and you'll get variations on the theme. However, it was primarily back pain that I was feeling. Now my pain almost exactly matches the chart.

After I had seen way too many specialists, including a neurologist who should have had a clue about this, my GP at the time, Dr Karkalis in King of Prussia (a fantastic doctor IMHO who likes to review challenging patient charts at night for fun), thought my symptoms might be due to acid reflux. He was right. To confirm the diagnosis, he prescribed a proton-pump inhibitor, and after a few weeks my back pain from sleep was reduced significantly.

Another way to diagnose this and manage it longer term is to tilt the head of your bed up by 15-20 degrees to let gravity keep the acid in your stomach instead of allowing it into your esophagus. If your back pain abates fairly quickly (not necessarily overnight, though it often does), then it is likely due to acid reflux.

Sadly, you will need to sleep elevated for the rest of your life if you have acid-reflux back pain. The proton-pump inhibitors are pretty safe drugs, but they are not always able to control your acid reflux enough to control the pain completely. As you can see from my hammock blog post, hammocks work really well to set the right angle for sleeping and are much less expensive than adjustable beds.

Your acid reflux will be more or less severe at different times due to exercise, stress, eating habits, and body weight, so your management of it will need to accommodate its severity. In other words, you may need to increase the angle of your bed when your acid reflux is more severe.

Also, check with your doctor and get tested for Barrett's esophagus. Basically, serious acid reflux sufferers are at higher risk for esophageal cancer.

Sunday, August 26, 2012

Acid Reflux, Camping, Hammocks and Holes

This is a personal blog article.  I generally try to keep this to professional matters, but I would like to record this for those of us suffering from acid reflux.  I've been dealing with it for over a decade now and mostly control it via mechanical means - sleeping on an incline.  After a couple of hours sleeping flat, I get referred pain from my esophagus which feels like back pain.  I try really hard to figure out how to sleep on an incline so I can get 6-8 hours of sleep instead of 2-3 hours of sleep and then 2-3 hours sitting up until the pain subsides, rinse, repeat.  

I was camping last weekend, first time since I've started noticing my acid reflux.  I generally suffer when away from home and try all sorts of tricks with setting the bed on blocks to create an incline, build a mound of pillows, or sleep in a recliner.  I wasn't looking forward to sleeping on the ground for a couple of nights.  

I tried to find a spot on a bit of a slope, but that had to be balanced with the desire to not slide down on a slippery tent floor in a slippery sleeping bag.  The first night was absolutely miserable.  I did manage 5 hours or so I think.  I woke up somewhere between 3-4am in pain and left the tent so I wouldn't disturb my campmates.  Being in pain, cold, and really tired doesn't make for a lovely camping trip.  

I was absolutely dreading the coming night and trying to find a nice spot of ground (shaped like a recliner) to set up my sleeping bag and pad and praying that it wouldn't rain.  My wife fortunately found the right spot and not something I would have thought of.  It was a hole about 2 feet deep, 2 feet wide and 3 feet long in a U shape.  It turned out to be a perfect, recliner-like, shape.  I tried it out and immediately felt comfortable.  Long story short, I had a great night sleeping in the hole.  I heartily recommend digging a hole to sleep in if you have trouble sleeping on your back or sleeping flat due to acid reflux issues.

This experience started me thinking about using a hammock when away from home as it would provide the same shape.  Looking on Google, there are a LOT of articles on using baby hammocks for babies with acid reflux and very few about adults using hammocks to manage acid reflux.  I took the plunge anyway and bought a hammock to try out.  I slept in it last night, and it worked great.  I was able to sleep for a very long time (about 10 hours - trying to catch up on missed sleep).  It was very comfortable all night, and I had no pain indicative of acid reflux this morning. 

It seems to be a successful experiment - though I do need a few more data points to fully confirm it of course.  Assuming that additional data points confirm this approach, I need to figure out how to take the hammock with me when traveling.  Hotels don't generally provide hammock hooks in the wall.  Travel-wise, hammocks don't take up much room which is good as I hate checking bags when I fly.  The hammock I tried out was the ENO Double Nest Hammock which doesn't take up much space or weight (about the size of a grapefruit and less than two pounds).

ENO Double Nest Hammock (Tomato/Khaki)

ENO Double Nest Hammock (Navy/Olive)

These straps are handy to hang the hammock with:  ENO Slap Straps

I'd recommend getting two carabiners to replace the ones that come with the hammock based on reviews I saw on Amazon:  Black Diamond Neutrino Carabiner - Grey



Tuesday, July 17, 2012

String Similarity

This post is my attempt at recording a very nice thread of posts on BioNLP.org's mailing list on string similarity measures. Harsha G at Molecular Connections asked about string similarity measures, which prompted the replies below.

Tools recommended:



From Tudor Groza:

Dear Harsha,

I would suggest you have a look at Simmetrics [1] - it is a comprehensive
package for string similarities ranging from basic ones, like Levenshtein
distance, to more advanced ones, like Smith-Waterman or Needleman-Wunsch. You
can find the Java API at [2] - for some reason the original page is
missing, hence the only way to get to it is via the Web archive.

Hope that this helps.

[1] http://sourceforge.net/projects/simmetrics/

Kind regards,


From Sampo Pyysalo:

Dear Harsha, all,

Not sure what your exact needs are, but I've found that in
approximate-matching lookup against many larger biomedical resources it's
good to do a fast, comparatively simple first-pass lookup before running
more advanced string comparison algorithms to avoid the computational costs
of full comparison for a large number of string pairs. I've found Naoaki
Okazaki's simstring (http://www.chokkan.org/software/simstring/) to be
excellent for this first task. The way I'd recommend to use this is to
first filter a large string collection to a reasonably-sized set of best
matches (in terms of a comparatively coarse similarity function like char
n-gram cosine) with simstring and then run more advanced stuff like
custom-cost edit distance for this smaller set.
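
The two-pass idea described above can be sketched in plain Python. This is an illustration only: it does not use simstring itself, the function names are hypothetical, the 0.3 threshold is an arbitrary assumption, and the second pass uses plain unit-cost Levenshtein distance as a stand-in for a custom-cost edit distance.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count character n-grams of text, padded so short strings yield grams."""
    padded = " " * (n - 1) + text.lower() + " " * (n - 1)
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def levenshtein(s, t):
    """Unit-cost edit distance, computed one row at a time."""
    previous = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        current = [i]
        for j, ct in enumerate(t, start=1):
            current.append(min(previous[j] + 1,                # deletion
                               current[j - 1] + 1,             # insertion
                               previous[j - 1] + (cs != ct)))  # substitution
        previous = current
    return previous[-1]


def lookup(query, candidates, threshold=0.3, top_k=3):
    """First pass: cheap n-gram cosine filter. Second pass: edit distance."""
    q = char_ngrams(query)
    shortlist = [c for c in candidates if cosine(q, char_ngrams(c)) >= threshold]
    ranked = sorted(shortlist, key=lambda c: levenshtein(query.lower(), c.lower()))
    return ranked[:top_k]
```

Calling `lookup("interleukin 2", names)` on a list of names would run the edit distance only on the candidates that survive the n-gram filter. A real first pass over a large resource would use an index, as simstring does, rather than scanning every candidate.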

There are a number of studies by Okazaki as well as Yoshimasa Tsuruoka and
others on the topic of string similarity metrics for domain tasks that may
also be of interest to you, e.g.



From Florian Leitner:

Dear Harsha,

A good overview is the 2003 W. Cohen paper "promoting" the SoftTFIDF measure and with a very good overview of available similarity measures:


As for libraries to do string similarity matching, there are many, many options available. As they have not been mentioned so far, most prominently, there are the Regular Expression libraries.

In terms of pure speed, some of Google's own searches are powered by re2 (developed by a Google search engineer), a deterministic RegEx ("DFA") engine that is significantly faster than the "default" engines available in most other programming languages (because they are all at least in part non-deterministic, i.e., "NFAs"). However, due to its purely deterministic nature there is quite some default functionality missing (e.g., lookaheads and -behinds, etc.), so you have to define all variants you wish to match in your patterns (no approximate matches!), while it is blazingly fast:


In terms of pure approximate matching speed, don't forget that *nix offers a pretty powerful approximate string matching implementation right at your "fingertips":


Last, another C implementation of a POSIX compliant approximate (DFA-based) regex matcher is TRE, although this library is somewhat slower than the RE2 engine, too:


These three regex libraries are probably the most noteworthy if you need raw speed. Then there are a few Java regex libraries that seem noteworthy, too:

First, there is a non-deterministic RegEx engine (FREJ) to do approximate matching, also in Java:


And yet another Java regex implementation, partially DFA and partially NFA, is the Brics Automaton:


(there are many more Java regex libraries, but let Google be your best friend if you need even more pointers...)

Apart from the regex/D- or NFA based implementations, there are distance-based measures to do approx. string matching. A very fast similarity search tool is SimString, an approximate matcher based on distance measures, and already mentioned by Sampo in his post, in C++:


Probably the most well-known package in this domain is the SecondString package from the CMU (from W. Cohen, the author cited above) for approx. string matching in Java, also based on edit distance measures:


Last, I'd mention there is a simple Python module to calculate n-gram-based similarities; while I do love Python very much, simply because it is Python-based, it will most likely be the slowest option listed here:


Hope this helps to get you up to [matching] speed!


From Aurélie Névéol:


Another measure to look into is the "PubMed distance" described in this paper:

Lu Z, Wilbur WJ. Improving accuracy for identifying related PubMed queries by an integrated approach. J Biomed Inform. 2009 Oct;42(5):831-8.

An example of use and evaluation can be found in this other paper:

Névéol A, Islamaj-Doğan R, Lu Z. Author Keywords in Biomedical Journal Articles. Proc AMIA Annu Symp. 2010:537-41.

Best regards,


From Bob Carpenter:

I'd suggest looking further than Jaccard distance in
the LingPipe matchers.  We have TF/IDF matchers based
on character n-grams that are widely used in practice (not
just by us or with our implementation;  note
that this is NOT the same as Cohen et al.'s soft TF/IDF,
which I've never fully understood).
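
To illustrate the general idea of TF/IDF matching over character n-grams, here is a minimal Python sketch. It is not LingPipe's implementation or its exact weighting: the class name `NgramTfIdf` is hypothetical, and the smoothed IDF formula is just one common variant chosen as an assumption.

```python
import math
from collections import Counter


def ngrams(text, n=3):
    """List the character n-grams of text, padded with spaces at both ends."""
    padded = " " * (n - 1) + text.lower() + " " * (n - 1)
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramTfIdf:
    """Cosine similarity over TF/IDF-weighted character n-gram vectors."""

    def __init__(self, corpus, n=3):
        self.n = n
        self.size = len(corpus)
        # Document frequency: in how many corpus strings each n-gram occurs
        self.doc_freq = Counter()
        for text in corpus:
            self.doc_freq.update(set(ngrams(text, n)))

    def vector(self, text):
        counts = Counter(ngrams(text, self.n))
        # Smoothed IDF (assumed variant): rare n-grams weigh more, and
        # n-grams unseen in the corpus still get a finite weight
        return {g: tf * (math.log((1 + self.size) / (1 + self.doc_freq[g])) + 1)
                for g, tf in counts.items()}

    def similarity(self, a, b):
        va, vb = self.vector(a), self.vector(b)
        dot = sum(w * vb.get(g, 0.0) for g, w in va.items())
        norm = (math.sqrt(sum(w * w for w in va.values()))
                * math.sqrt(sum(w * w for w in vb.values())))
        return dot / norm if norm else 0.0
```

The model is built once from a collection of terms, for example `NgramTfIdf(term_list)`, and `similarity(a, b)` then down-weights n-grams that are common across the whole collection.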

There are also the Jaro-Winkler matchers, which are
tuned for matching single-word names.
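
For reference, the Jaro-Winkler measure can be sketched in a few lines of Python. This is a plain textbook version, not LingPipe's code; the prefix scale of 0.1 and the maximum prefix length of 4 are the commonly cited defaults, taken here as assumptions.

```python
def jaro(s, t):
    """Jaro similarity in [0, 1]."""
    if s == t:
        return 1.0
    if not s or not t:
        return 0.0
    # Characters only count as matching within this distance of each other
    window = max(max(len(s), len(t)) // 2 - 1, 0)
    s_matched = [False] * len(s)
    t_matched = [False] * len(t)
    matches = 0
    for i, ch in enumerate(s):
        for j in range(max(0, i - window), min(len(t), i + window + 1)):
            if not t_matched[j] and t[j] == ch:
                s_matched[i] = t_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order
    s_seq = [c for c, m in zip(s, s_matched) if m]
    t_seq = [c for c, m in zip(t, t_matched) if m]
    transpositions = sum(a != b for a, b in zip(s_seq, t_seq)) // 2
    return (matches / len(s) + matches / len(t)
            + (matches - transpositions) / matches) / 3.0


def jaro_winkler(s, t, prefix_scale=0.1, max_prefix=4):
    """Jaro similarity boosted for strings that share a common prefix."""
    base = jaro(s, t)
    prefix = 0
    for a, b in zip(s[:max_prefix], t[:max_prefix]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1.0 - base)
```

The prefix boost is what makes the measure suited to names, where the first few characters tend to be the most reliable.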

LingPipe also has a dictionary-based matcher that will
spot approximate matches (by weighted edit distance) in
text using the Aho-Corasick algorithm for deterministic
matching and suffix arrays for speeding approximate matching.

And you can also use something like an HMM- or CRF-based
chunker to find matches in texts.  It basically then looks
like a named-entity problem.

If you want something fancier that should outperform any of
these methods, check out this paper by McCallum, Bellare and Pereira:


I'm also quite keen on this method for string comparison
by Dreyer, Eisner and Smith, though I haven't tried it, either:


And in the end, you may be wanting to do something like cluster
similar terms rather than just provide pairwise similarities.
Andrew McCallum and crew have done some great work on this problem,
and there's a huge swath of "deduplication" and "record linkage"
literature that's related.

Tuesday, February 7, 2012

CAPEX, OPEX and Cloud?

Here is a really good overview of CAPEX and OPEX as well as how it impacts Cloud Computing initiatives:


I think we all find the CAPEX/OPEX financial issues to be troubling on multiple levels.  Thanks to Matthew Dube for letting me know about this.

Fable of the porcupine

Fable of the Porcupine:

It was the coldest winter ever, and many animals died because of the cold. The porcupines, realizing the situation, decided to group together to keep warm, but the quills of each one wounded their closest companions. After a while, they decided to distance themselves from one another, but then they began to die, alone and frozen.

They had to make a choice: Either accept the quills of their companions or die. Wisely, they learned to live with the little wounds caused by their close relationships, in order to receive the heat that came from the others. This way they were able to survive.

Moral of the story:

The best relationship is not the one that brings together perfect people; it's when each individual learns to live with the others' imperfections and can admire their good qualities.


A friend of mine, Nils Onsager, Master Hapkido Instructor, shared this with me.  I found it an excellent parable for diversity and the challenges inherent in diversity.